Abstract
For managers of road infrastructure, culvert deterioration is a major concern since culvert failures can cause serious risks to the traveling public. The efficiency of the cost- and labor-intensive culvert inspection and maintenance process can be improved by properly identifying the key impact factors on culvert condition deterioration. Although the use of machine learning (ML) techniques to predict culvert conditions has proven to be a promising tool for enhancing culvert management and enabling proactive scheduling of maintenance tasks, the information provided by the developed ML models has been given little attention for further use and analysis. By utilizing the predictor importance results of an evaluated decision tree (DT) culvert condition prediction model and the Mann–Whitney U test, this study provided insights into the identification of the key variables influencing culvert deterioration. According to the findings, five impact factors, namely culvert span, pH, age, rise, and cover height, often have a significant impact on the condition ratings of culverts made of various materials. In addition, such a statistical test-assisted factor identification process offered a way of identifying and enhancing the input variable selection for predictive ML model development.
1. Introduction
Culverts allow water to flow beneath roads, railroads, and other infrastructure. Millions of culverts have been constructed beneath highways in the United States; in Ohio alone, nearly 100,000 culverts are installed [1, 2]. If culverts are not properly managed and maintained, they might obstruct the necessary and intended passage of storm water runoff [3–5]. Both driver safety and the environment can suffer significantly from culvert failures. Water can overflow from a culvert that is blocked or broken, which can cause flooding or road failures and endanger neighboring properties or motorists [6], because water that rushes onto a road can erode the subbase and pavement, leading to sinkholes, potholes, and other damage that is expensive to fix. Thus, effective culvert management is critical for ensuring the safety and integrity of these structures [7].
As part of current culvert management procedures, state departments of transportation (DOTs) dispatch trained people to inspect the culverts according to a predetermined schedule. Given the enormous number of culverts that must be maintained, this operation frequently takes a lot of time and effort. By facilitating improved predictive maintenance, anomaly detection, and decision-making, machine learning (ML) has emerged as a promising technique for enhancing culvert management [8–12]. However, most existing studies did not pay enough attention to the information provided by the models other than the prediction results and the different impact levels of the culverts’ physical and environmental factors on specific culvert damage or deterioration types.
Because the predictor importance results alone offer only limited information for comprehending the complex interactions between various factors [13], this research examines the factors influencing culvert conditions through statistical tests on the predictor importance results of an evaluated decision tree (DT) culvert condition prediction model.
This study takes into account culverts composed of the four most widely used materials: concrete, corrugated metal, corrugated plastic, and vitrified clay. Additionally, four prevalent culvert deterioration types are chosen, including material condition, culvert alignment, seams or joints, and scour, which are represented by MC, CA, SOJ, and SC, respectively, in the remainder of this paper. The Ohio Department of Transportation’s (ODOT) Transportation Information Mapping System (TIMS) culvert inventory database was used to collect the data for this study.
2. Literature Review
An extensive literature review was conducted to achieve the research objectives of this study. The covered areas include: (1) culvert inspection routine; (2) ML models for culvert condition prediction; (3) culvert deterioration types and condition impact factors; and (4) Mann–Whitney U test.
2.1. Fixed Time Interval Culvert Inspection Routine
Considering the large number of culverts installed, it often takes a substantial amount of resources to conduct all necessary inspections. The current routine for arranging culvert inspections is often based on a fixed time period. The “Culvert Inspection Manual” of the Federal Highway Administration (FHWA) in the US mandates that culverts be inspected every 2 years [14]. At the state level, state DOTs have also created related manuals and guidelines to offer suggestions for how frequently to conduct culvert inspections [15]. The National Cooperative Highway Research Program (NCHRP) conducted a survey on culvert-inspection policies and practices nationwide, and the findings showed that there is no uniform inspection cycle used by state transportation agencies; instead, many states have developed their own standards for allocating time for culvert inspections [16]. The main shortcoming of such fixed-schedule inspection is that it lacks focus and neglects the fact that culverts made of different materials may deteriorate at different rates, so culverts in poor condition can be overlooked between two inspections.
The frequency of culvert inspections utilized by different transportation authorities is given in Table 1.
2.2. ML Models for Culvert Condition Prediction
Earlier studies have established the effectiveness of ML approaches for forecasting culvert condition. Three ML algorithms, namely artificial neural network (ANN), support vector machine (SVM), and DT, were frequently employed. These models were developed for many applications, such as estimating culverts’ remaining service life [9, 10, 17], predicting specific culvert failure or deterioration types [8, 11, 12, 18–20], coupling with digital image correlation (DIC) techniques to identify and analyze structural defects [21, 22], and predicting the condition of other types of transportation assets [23–26]. For model evaluation, the commonly used classification metrics are accuracy, recall, precision, F-score, and the receiver operating characteristic (ROC) curve. The results of existing studies using ML algorithms to predict the condition of culverts and other infrastructure demonstrated the effectiveness and reliability of such applications. However, the information provided by the ML models has rarely been utilized further; for example, the predictor importance results were often mentioned but rarely analyzed.
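To make the evaluation metrics named above concrete, the following minimal sketch computes accuracy, precision, recall, and the F1 score for a binary labelling. The function name `classification_metrics` and the example labels are hypothetical and chosen only for illustration; they are not taken from the cited studies.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall, and F1 for a binary labelling."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    # Guard against empty denominators (no predicted or no actual positives).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical labels: 1 = "poor condition", 0 = "acceptable condition".
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

On heavily imbalanced data, accuracy alone can look favorable even when the minority class is poorly predicted, which is one reason precision, recall, and the ROC curve are usually reported alongside it.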
2.3. Culvert Deterioration Types and Condition Impact Factors
Culverts can deteriorate in multiple ways. The most common forms include partial structural defects or failure, material aging, joint dislocation, and misalignment [27–30]. To represent the extent and specific types of culvert deterioration, several DOTs have developed culvert rating systems. For example, ODOT rates culverts on a scale of 0–9 for 16 different culvert ratings, where 0 represents total failure and 9 represents new or like-new condition, as shown in Table 2 [31].
This provided the basis for the output variable selection of this study, including MC, CA, SOJ, and SC. Figure 1 provides images from previous studies as examples of the four forms of culvert failure and deterioration [28–30].

In terms of impact factors, both physical and environmental properties of culverts, such as material, length, shape, and the pH level of the water inside the culvert, are commonly used in existing studies [8, 9, 32, 33]. The actual selection of these factors is frequently constrained by data availability. Most of the commonly used physical and environmental factors are available in the ODOT TIMS culvert database; therefore, based on existing studies and data availability, the authors used 13 variables to develop the DT model and to determine the culvert condition impact factors.
2.4. Mann–Whitney U Test
Existing studies that used ML models to forecast culvert conditions paid little attention to the subsequent analysis of predictor importance outcomes, particularly using statistical test procedures. Statistical analysis is a systematic way to uncover underlying patterns in data and to express them in a meaningful form. In this study, the Mann–Whitney U test was selected for further statistical testing and analysis of the predictor importance results. The Mann–Whitney U test, sometimes referred to as the Wilcoxon rank sum test, examines differences between two groups based on a single variable that is at least ordinal and is not assumed to follow a specific distribution [34, 35]. It is a null hypothesis significance test intended to determine whether two groups (e.g., samples “a” and “b”) come from the same population, with the null hypothesis stipulating that both samples are drawn from the same population [36, 37]. Such a test is needed because high predictor importance does not by itself imply a distributional difference: for instance, although culvert age was identified as an important input variable for predicting culvert conditions [1, 33, 38], this does not necessarily mean that culverts in different conditions had significantly different age distributions.
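The mechanics of the test can be illustrated with a short sketch. The code below is a minimal pure-Python version of the two-sided Mann–Whitney U test using average ranks for ties and a normal approximation with tie and continuity corrections; it is an illustration under these assumptions, not the implementation used in this study. The function names and the sample ages are hypothetical, and the normal approximation is only rough for samples this small.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of positions i..j, shifted to 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(a, b):
    """Return (U_a, U_b, two-sided p) using a normal approximation."""
    n1, n2 = len(a), len(b)
    combined = list(a) + list(b)
    ranks = average_ranks(combined)
    u_a = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    u_b = n1 * n2 - u_a

    # Variance of U under H0, corrected for ties.
    n = n1 + n2
    counts = {}
    for v in combined:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    sigma = math.sqrt(n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1))))
    if sigma == 0:  # all observations identical: no evidence against H0
        return u_a, u_b, 1.0

    # Continuity-corrected z score, clamped at zero.
    z = max(0.0, (abs(u_a - n1 * n2 / 2) - 0.5) / sigma)
    p = min(1.0, math.erfc(z / math.sqrt(2)))
    return u_a, u_b, p


# Hypothetical culvert ages (years) for two condition groups.
ages_good = [1, 2, 3, 4, 5]
ages_poor = [6, 7, 8, 9, 10]
u_good, u_poor, p_value = mann_whitney_u(ages_good, ages_poor)
print(u_good, u_poor, round(p_value, 4))
```

In practice, a maintained implementation such as `scipy.stats.mannwhitneyu` would normally be preferred, since it also offers an exact method for small samples.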
3. Methodology
Figure 2 displays an overview of the study’s approach. First, the predictor importance results of the ML model’s input variables were retrieved to provide a basis for identifying the most significant impact factors. Next, significance levels were calculated by statistically testing the identified impact factors against the ML model’s output variables. Finally, the results were examined and conclusions were drawn.

3.1. Overview of Used Data
The data used in this study were obtained from ODOT’s TIMS. TIMS stores inventories of transportation assets, road data, traffic counts, transportation construction projects, and environmental data. TIMS data are publicly accessible to support better decision-making [2]. Because a sizable number of values were missing in the original data, data preprocessing was performed, after which 12,400 culverts remained for developing the culvert condition prediction models. This amount of data is sufficient for the study, considering the data used in existing studies as well as the minimal requirements for most ML algorithms [39].
3.2. ML Culvert Prediction Models
The ML prediction models handled the highly imbalanced data in the culvert inventory database by combining a DT method with the synthetic minority over-sampling technique (SMOTE). Gao and Elzarka [8] described the creation of the models in detail. Accuracy and the ROC curve were used to assess the performance of the ML models, and the results demonstrated that the developed DT models can make dependable and comprehensive predictions about culvert condition states. The models achieved satisfactory areas under the ROC curve of 0.8, with accuracy rates of more than 80% for the training set and 75% for the testing set. The ROC curves are shown in Figure 3.
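Two ideas in this subsection can be sketched briefly: how minority over-sampling creates synthetic records, and how the area under the ROC curve is computed. The code below is a simplified illustration, not the pipeline of Gao and Elzarka [8]. `smote_like` interpolates between a minority sample and one of its nearest minority neighbours, which is the core idea of SMOTE, and `roc_auc` uses the pairwise definition of the area under the curve. The function names, feature tuples, labels, and scores are all hypothetical.

```python
import random


def smote_like(minority, n_new, k=2, seed=0):
    """Create n_new synthetic minority points by neighbour interpolation."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by squared Euclidean distance.
        others = [m for j, m in enumerate(minority) if j != i]
        others.sort(key=lambda m: sum((a - b) ** 2 for a, b in zip(base, m)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(base, neighbour))
        )
    return synthetic


def roc_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need both classes to compute AUC")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))


# Hypothetical minority-class records: (age in years, pH).
minority = [(10.0, 6.5), (12.0, 7.0), (30.0, 5.5)]
new_points = smote_like(minority, n_new=4)

# Hypothetical labels and predicted scores for six culverts.
auc = roc_auc([1, 1, 0, 1, 0, 0], [0.9, 0.8, 0.7, 0.6, 0.4, 0.2])
print(new_points, auc)
```

The pairwise AUC is the Mann–Whitney U statistic of the positive scores divided by the number of positive-negative pairs, so the evaluation metric and the statistical test used later in this paper rest on the same ranking idea.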

3.3. Influential Factor and Culvert Condition Selection
As previously stated, the input and output variables employed in the DT model created by Gao and Elzarka [8] were chosen for further investigation in this research. Thirteen culvert physical and environmental features were chosen as the input variables, as indicated in Table 3. The predictor importance values of these 13 variables from the DT model were sorted to determine the most significant impact factors for culvert conditions. Four culvert condition-related ratings were chosen for the output variables: MC, CA, SOJ, and SC.
3.4. Statistical Test for Culvert Condition Factors’ Correlation and Their Impact
The most influential factors were statistically tested against the culvert conditions because the predictor importance results alone offer only a limited amount of information for studying the impact factors of culvert conditions. The Mann–Whitney U test is employed in this study with a significance threshold of 0.01 to reduce the risk of Type I error, that is, the rejection of a null hypothesis that is in fact true [40]. Accordingly, the test’s null hypothesis (H0) is not rejected if the p value returned by the test is greater than 0.01, and it is rejected otherwise. In this study, H0 is that culverts in different condition states have the same distribution for the tested variable. So, if the p value is less than or equal to 0.01, culverts in the two condition states are taken to have different distributions for the tested variable.
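The decision rule described above can be written down in a few lines. The sketch below is illustrative only: the helper names are hypothetical, and the p values are made-up placeholders rather than results from this study.

```python
ALPHA = 0.01  # significance threshold used in this study


def reject_h0(p_value, alpha=ALPHA):
    """Reject H0 (same distribution) when p is at or below the threshold."""
    return p_value <= alpha


def significant_for_all(p_by_rating, alpha=ALPHA):
    """True if H0 is rejected for every condition rating of a variable."""
    return all(reject_h0(p, alpha) for p in p_by_rating.values())


# Placeholder p values for one variable across the four ratings
# (hypothetical numbers, not taken from Table 5).
example_p = {"MC": 0.001, "CA": 0.2, "SOJ": 0.004, "SC": 0.01}

decisions = {rating: reject_h0(p) for rating, p in example_p.items()}
print(decisions)
print(significant_for_all(example_p))
```

A threshold of 0.01 is stricter than the common 0.05, so fewer variables are flagged as significant, in exchange for a lower chance of a false positive in any single test.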
Finally, detailed observation and analysis of the statistical test results were carried out for three purposes: (1) understanding which physical or environmental characteristics (factors) have a more significant influence on culvert conditions; (2) determining whether culverts in different conditions have significantly different distributions in these variables; and (3) determining which culvert condition ratings are most affected by each identified factor.
4. Results
According to Louppe et al. [41], the predictor importance results from ML models show how much each variable contributes to classifying the input data into the groups of the output variable. To determine the impact factors that have a significant impact on culvert conditions, the predictor importance results generated by the developed ML models are compared and assessed in this section. Figures 4–7 show the predictor importance results obtained in this study.




In these figures, the X-axis shows the input variables used in the ML model’s development process, while the Y-axis uses a number between 0 and 1 to represent the importance of each variable from the least important to the most important, respectively. The four lines in each figure are the predictor importance results obtained in this study for the selected ratings.
For concrete culverts, the five most important variables are culvert span, pH, cover height, age, and rise, with average importance values of 0.24, 0.19, 0.15, 0.13, and 0.10, respectively.
The five most important variables for corrugated metal culverts are culvert age, span, rise, pH, and cover height, with average importance values of 0.21, 0.14, 0.13, 0.12, and 0.10, respectively.
For corrugated plastic culverts, the five most important variables are culvert cover height, age, rise, length, and average daily traffic (ADT) with average importance values of 0.32, 0.18, 0.10, 0.08, and 0.08, respectively.
The five most important variables for vitrified clay culverts are culvert cover height, slope, span, age, and ADT, with average importance values of 0.22, 0.13, 0.13, 0.12, and 0.11, respectively.
Based on the presented figures, the five variables with the highest average predictor importance for culverts of different materials are summarized in Table 4.
It is found that condition ratings of culverts made from different materials are often highly influenced by the same five variables including span, pH, age, rise, and cover height. So, these five variables are determined to be the most influential variables in this step.
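One simple way to check this selection is to combine the per-material top-five lists reported above. The sketch below sums each variable’s average importance across the four materials and keeps the five largest totals. This aggregation rule is an assumption made for illustration and is not necessarily the procedure used by the authors; importance values outside a material’s top five are not reported in the text and are treated here as zero.

```python
# Average importance values of the top-five variables per material,
# as reported in the text above.
top5_by_material = {
    "concrete": {"span": 0.24, "pH": 0.19, "cover height": 0.15,
                 "age": 0.13, "rise": 0.10},
    "corrugated metal": {"age": 0.21, "span": 0.14, "rise": 0.13,
                         "pH": 0.12, "cover height": 0.10},
    "corrugated plastic": {"cover height": 0.32, "age": 0.18, "rise": 0.10,
                           "length": 0.08, "ADT": 0.08},
    "vitrified clay": {"cover height": 0.22, "slope": 0.13, "span": 0.13,
                       "age": 0.12, "ADT": 0.11},
}

# Assumption: unreported values count as zero, so totals are lower bounds.
totals = {}
appearances = {}
for importances in top5_by_material.values():
    for variable, value in importances.items():
        totals[variable] = totals.get(variable, 0.0) + value
        appearances[variable] = appearances.get(variable, 0) + 1

ranked = sorted(totals, key=totals.get, reverse=True)
overall_top5 = ranked[:5]

for variable in ranked:
    print(f"{variable:13s} total={totals[variable]:.2f} "
          f"appears in {appearances[variable]} of 4 top-five lists")
```

Under this rule the five retained variables are cover height, age, span, rise, and pH, which matches the set identified in the text. Counting appearances alone would not separate pH from ADT, since each appears in two top-five lists, so the summed importance is what distinguishes them here.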
Then, the Mann–Whitney U test is used to determine whether the distributions of these variables are significantly different among culverts in the four ratings. If a statistically significant distributional difference is discovered for a certain variable, it can be used to distinguish between different culvert conditions. Table 5 displays the p values for the tests that were run.
The information provided in Table 5 was analyzed from two aspects as follows:
(1) From the culvert material aspect: (a) for concrete culverts, culvert age, span, pH, and rise have significant test results for all ratings; (b) for corrugated metal culverts, culvert age and pH values have significant test results for all ratings; (c) for corrugated plastic culverts, only culvert age has significant test results for all ratings; and (d) for vitrified clay culverts, only pH value has significant test results for all ratings. This indicates that the impact of the same variable on culvert conditions can vary for different culvert materials.
(2) From the condition rating aspect: the test results for the same variable can vary for different condition ratings. For example, for concrete culverts, the test results of culvert cover height are significant for MC and SC, but insignificant for CA and SOJ. This indicates that the impact of the same variable on different condition ratings can also vary.
5. Discussion
Based on testing results and the conducted analysis, the following discussions are made.
Culverts in different condition states have statistically distinct distributions of the selected input variables, including culvert age, span, rise, cover height, and pH. This suggests that the fixed-schedule inspection criteria currently in use could be improved into a more sophisticated multifactor inspection scheduling system.
The criteria for scheduling culvert inspections should also take into consideration the age and pH of the culvert. Each of these two variables has significant test results for all four ratings in three of the four materials (age for concrete, corrugated metal, and corrugated plastic culverts; pH for concrete, corrugated metal, and vitrified clay culverts), which indicates that both have a more general and substantial impact on the condition of most culverts.
6. Conclusions
The proposed approach of calculating and assessing culvert condition impact factors makes it much easier to understand not only the extent to which culverts’ physical and environmental attributes have an impact, but also how each attribute affects specific culvert conditions. The findings revealed that culverts composed of various materials had statistically distinct distributions in both physical and environmental features such as culvert age, span, and pH value. The same holds for culverts made of the same material but in different condition states. This study contributes to practice and research in two ways: (1) The current culvert inspection routine, which uses a fixed frequency, can be improved into a more sophisticated process that takes more factors into consideration to reduce the chance of missing culverts in poor condition. This will contribute to more efficient culvert inspection and management, and eventually to travel safety. (2) As the use of ML prediction models to assist infrastructure management by predicting asset conditions continues to draw attention from researchers, this study offers a way of identifying and enhancing the input variable selection process. This matters for future studies because input variable selection is crucially important for the development of ML models, and the selections in existing studies often vary.
Data Availability
The data that support the findings of this study are available from the corresponding author, P.L., upon reasonable request.
Conflicts of Interest
All authors of this paper declare that there is no conflict of interest to disclose with respect to this paper, the related works, and the publication of this paper.
Acknowledgments
This research was funded by the Key Area Dedicated Project of Guangdong General Universities and Colleges (2023ZDZX1095), the Guangdong Key Areas R&D Program Projects (2020B0101130005), and Hunan Provincial Social Science Foundation Project (21YBA245).