Research Article

Estimation and interpretation of equilibrium scour depth around circular bridge piers by using optimized XGBoost and SHAP

Article: 2244558 | Received 23 Apr 2023, Accepted 30 Jul 2023, Published online: 21 Aug 2023

Abstract

Most bridge failures result from scouring around bridge piers, resulting in economic losses and risks to public safety. The conventional equations for predicting the depth of scour at bridge piers have several limitations: (1) They mainly use regression-based techniques that cannot robustly capture the nonlinear relationship between the scour depth and its effective variables; (2) they are applicable only to a narrow range of variability of data; and (3) they are typically calibrated using laboratory data rather than field measurements and thus cannot simulate the prototype environment. To overcome these limitations, in this study, three novel hybrid machine learning methods: particle swarm optimization - extreme gradient boosting (PSO – XGBoost), red fox optimization - XGBoost (RFO – XGBoost), and relativistic particle swarm optimization - XGBoost (RPSO – XGBoost) are applied to estimate the scour depth around circular bridge piers, and their effectiveness is validated using three statistical metrics, i.e. the mean absolute error (MAE), root mean square error (RMSE), and coefficient of determination (R2). RPSO – XGBoost generates the best results for both dimensional and dimensionless data. Moreover, the proposed approaches outperform the state-of-the-art techniques. The SHapley Additive exPlanations (SHAP) method is used to assess the relative significance of the contributing factors for predicting the scour depth.

1. Introduction

Scouring around bridge piers is a notable driving factor of bridge collapses (Wardhana & Hadipriono, Citation2003; Wang et al., Citation2017). According to the US Department of Transportation, scouring led to the damage of 17 bridges in New York and New England during the flood in 1987 (Richardson & Davis, Citation2001). The Federal Highway Administration defines scouring as the process in which the streambed sediments are eroded and removed from bridges by the water flowing over the bed (Mueller & Wagner, Citation2005). Different bed materials undergo various levels of scouring, leading to different scour rates. For example, non-cohesive materials such as sand and gravel may erode over a few hours, but more time is required for cohesive bed materials to be eroded (Richardson & Davis, Citation2001).

The scouring process around bridge piers is composed of two mechanisms: three-dimensional flow separation and sediment transport, which render it complex (Sreedhara et al., Citation2019). The water rushing around the pier experiences three-dimensional separation, which generates pressure resulting in a downward flow upstream of the pier. This downward flow creates a vortex system upstream of the pier, termed a horseshoe vortex owing to its appearance being similar to that of a horseshoe from the top view (Pandey et al., Citation2020a). The shear stress increases due to the vortex system and corresponding downflow around piers, leading to accelerated sediment transport, which finally creates a scour hole near the pier. After the scour hole is fully developed and reaches its maximum size, the flow pattern is altered, and the shear stress and sediment transport diminish (Pandey et al., Citation2020b).

The safe design of bridge foundations is critical for preventing bridge failures. The scour depth must be precisely estimated to ensure the economic and secure design of bridges. Hence, many regression-based formulas have been established to determine the depth of equilibrium scour around bridge piers (Laursen & Toch, Citation1956; Shen et al., Citation1969; Hancu, Citation1971; Jain & Fischer, Citation1979; Kothyari et al., Citation1992; Lee & Sturm, Citation2009; Melville & Coleman, Citation2000; Khan et al., Citation2017; Pandey et al., Citation2018). Notably, the existing equations suffer from the following limitations: (1) They are mainly based on limited experimental data and are thus applicable to only a narrow range of variability of inputs (e.g. flow depth, flow velocity, pier diameter, and sediment size) and output (scour depth) for which they are calibrated; (2) they cannot robustly capture the highly nonlinear relationship between scour depth and its determining factors; and (3) they are typically calibrated based on laboratory data and thus cannot accurately capture the prototype environment, resulting in overestimation of the scour depth (Bateni et al., Citation2007a).

In a departure from regression-based approaches, machine learning (ML) techniques have been used to predict scour depth. Sharafi et al. (Citation2016) showed that the support vector machine (SVM) outperforms artificial neural networks (ANN), adaptive neuro-fuzzy inference systems (ANFIS), and nonlinear regression methods in estimating the scour depth. Ebtehaj et al. (Citation2017) compared the scour depth prediction performance of the self-adaptive extreme learning machine with those of regression-based, ANN, and SVM techniques. Choi et al. (Citation2017) used ANFIS to determine the scour depth around circular bridge piers. While their approach outperformed the conventional equations, the small number of data points in their study restricts the generalizability of their proposed method. Genetic programming and ANFIS were proposed as alternatives to conventional methods owing to their promising performance in estimating the scour depth near piers (Abd El-Hady Rady, Citation2020). Shamshirband et al. (Citation2020) developed several PSO-based equations to predict the scour depth around circular bridge piers using different input variables.

Dang et al. (Citation2021) optimized ANN by PSO and firefly algorithms to enhance its efficiency in scour depth estimation. However, the performances of their proposed methods were not compared with other studies, limiting the ability to assess their effectiveness. Sreedhara et al. (Citation2021) utilized gradient tree boosting (GTB) to estimate the temporal scour depth around circular, rectangular, sharp-nosed, and round-nosed bridge piers. Their proposed ML model was calibrated based on a limited number of laboratory data points and was not evaluated against previous studies. Chou and Nguyen (Citation2022) introduced the Metaheuristic-Optimized Stacking System (MOSS) to estimate the scour depth around bridge piers. By combining a metaheuristic algorithm with multiple ML methods, they showed the superiority of MOSS over single ML models, conventional regression-based methods, and mathematical approaches. Despite the strength of their study, it used a small dataset, which hinders its generalization. Also, none of the abovementioned studies used explainable artificial intelligence (AI) to show the relative contribution of input features to scour depth estimations.

This study overcomes the drawbacks of the existing studies by (1) using 841 experimental data points from 35 field and laboratory studies in clear water conditions, thereby covering a wide range of variability of the scour depth and other relevant variables; (2) applying three novel hybrid ML techniques: particle swarm optimization–extreme gradient boosting (PSO–XGBoost), red fox optimization–XGBoost (RFO–XGBoost), and relativistic particle swarm optimization–XGBoost (RPSO–XGBoost), given that PSO, RFO, and RPSO can tune the hyperparameters of XGBoost to improve its performance; and (3) interpreting the proposed ML models using the SHapley Additive exPlanations (SHAP) method. In particular, SHAP represents an intuitive and effective tool for exploring the contribution of each input feature to the output of an ML model.

XGBoost is a robust and scalable ML method that was proposed by Chen et al. (Citation2015). This framework operates on the basis of a gradient boosting (GB) tree for gradient enrichment. XGBoost has demonstrated robust performances in regression problems, in terms of speed, memory usage, scalability, and hardware (Lucca et al., Citation2021; Qiu et al., Citation2021), in applications such as detection in water delivery systems (Wu et al., Citation2022), evaluation of flash flood risk (Ma et al., Citation2021), groundwater level estimation (Osman et al., Citation2021), and flood forecasting (Venkatesan & Mahindrakar, Citation2019). In general, XGBoost can model a variety of water resource problems, and its performance can be improved by tuning its hyperparameters (Ni et al., Citation2020; Yu et al., Citation2020; Shi et al., Citation2021; Nguyen et al., Citation2021; Demir & Sahin, Citation2023). Although this tuning can be performed through trial and error, such approaches are extremely time-consuming and cumbersome.

PSO is a popular swarm-based metaheuristic optimization technique that draws inspiration from animal foraging behaviours (Kennedy & Eberhart, Citation1995). RFO, as one of the newest metaheuristic optimization tools, was introduced by Połap and Woźniak (Citation2021). Its concept is based on the predation tricks of red foxes (Cui et al., Citation2022; Natarajan et al., Citation2022). RPSO, introduced by Roder et al. (Citation2020), is a variant of the PSO algorithm that functions based on the concept of relativity. These optimization strategies have been successfully used to adjust hyperparameters of ANN (Alizamir & Sobhanardakani, Citation2018), XGBoost (Yu et al., Citation2020; X. Zhang et al., Citation2020), and random forest platforms (Pham et al., Citation2020).

In this study, we exploit the PSO, RFO, and RPSO techniques to identify the optimal hyperparameters of XGBoost to improve its performance (Gu et al., Citation2021; Lucca et al., Citation2021). To the best of our knowledge, the proposed hybrid approaches have not been previously used to predict the scour depth. Moreover, the performances of the developed hybrid models are compared with those of state-of-the-art methods.

Furthermore, the interpretation of ML methods is crucial to explain how these algorithms obtain optimal predictions. To this end, explainable artificial intelligence (XAI) methods have attracted increasing attention in recent years. Representative XAI techniques include SHAP, accumulated local effects, and partial dependence plots (Ishfaque et al., Citation2022).

In this study, the relative significance of the different input parameters for scour depth prediction is determined using SHAP values. SHAP, proposed by Lundberg and Lee (Citation2017), quantifies the significance of each feature and explains how ML models make predictions. The predictions of a model can be expressed as the summation of a fixed base value and each feature’s corresponding SHAP values. Unlike the conventional feature importance methods such as feature importance and mean decrease impurity (Mangalathu et al., Citation2020; Demir & Sahin, Citation2023), SHAP can determine whether the contribution of each feature is positive or negative. SHAP has been used to interpret several ML methods for dam seepage problems (Ishfaque et al., Citation2022) and drought forecasting (Dikshit & Pradhan, Citation2021) owing to its notable advantages such as local and global model interpretation, scalability, robustness, and feasibility for a vast range of problems such as regression, classification, and ranking (Lundberg et al., Citation2020). To the best of the authors’ knowledge, this study represents the first attempt at performing SHAP analysis to interpret ML methods for scour depth prediction.

The rest of the paper is organized as follows: Section 2 describes the variables affecting the scour depth and data sources. Section 3 discusses the XGBoost, PSO, RFO, RPSO, and SHAP techniques. Section 4 presents the findings, and Section 5 presents the concluding remarks.

2. Data

The scour depth, influenced by flow characteristics, sediment properties, and pier dimensions, is inherently complex due to the intricate interplay of hydraulic forces, sediment dynamics, and morphological changes (Choi et al., Citation2017). This complexity accentuates the need for extensive understanding and modeling techniques to analyze and predict the scour depth accurately. However, conducting three-dimensional physical-based simulations around bridge piers to accurately represent the scour mechanism is challenging (Dang et al., Citation2021). Given the physical complexity and the limitations of physical models, the availability of a comprehensive scour dataset becomes crucial. When utilized with ML approaches, such a dataset provides valuable insights into the intricate scour phenomena and enables the rapid generation of accurate scour depth estimates (Dang et al., Citation2021). Importantly, our study stands out as the first one to employ such a comprehensive dataset to model scour depth around circular bridge piers, further highlighting its significance.

The maximum scour depth, also known as the equilibrium scour depth, is reached around bridge piers after a period of gradual development. The equilibrium scour depth near circular bridge piers is determined by the fluid flow, bed materials, and pier diameter (Melville & Coleman, Citation2000):

$$d_{se} = f(\rho, \mu, V, Y, g, d_{50}, V_c, D) \tag{1}$$

where $d_{se}$ is the equilibrium scour depth, $\rho$ is the density of water, $\mu$ is the dynamic viscosity of water, $V$ is the mean flow velocity, $Y$ is the flow depth, $g$ is the gravitational acceleration, $d_{50}$ is the median sediment size, $V_c$ is the mean critical velocity, and $D$ is the pier diameter.

Using the Buckingham Π theorem and considering ρ, D, and V as repeating parameters, the dimensionless form of Eq. (1) can be expressed as follows:

$$\frac{d_{se}}{Y} = f\left(\frac{V}{V_c}, \frac{V}{\sqrt{gY}}, \frac{D}{Y}, \frac{d_{50}}{Y}, \frac{\rho V D}{\mu}\right) \tag{2}$$

Here, $V/\sqrt{gY}$ is the Froude number (Fr), and $\rho V D/\mu$ is the Reynolds number (Re).
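As an illustration of how the dimensionless inputs of Eq. (2) follow from the dimensional variables of Eq. (1), the short sketch below computes the five groups for a single data point. The function name, the default water properties (ρ = 1000 kg/m³, μ = 0.001 Pa·s), and the sample values are assumptions chosen for illustration only; they are not taken from the dataset used in this study.

```python
import math


def dimensionless_groups(V, Y, D, d50, Vc, rho=1000.0, mu=1.0e-3, g=9.81):
    """Return the five dimensionless inputs of Eq. (2), all quantities in SI units.

    The default rho and mu are assumed values for clean water at about 20 degrees C.
    """
    return {
        "V/Vc": V / Vc,                # flow intensity
        "Fr": V / math.sqrt(g * Y),    # Froude number
        "D/Y": D / Y,                  # relative pier diameter
        "d50/Y": d50 / Y,              # relative sediment size
        "Re": rho * V * D / mu,        # pier Reynolds number
    }


# Hypothetical laboratory-scale data point (illustrative values only)
groups = dimensionless_groups(V=0.3, Y=0.2, D=0.05, d50=0.0008, Vc=0.35)
for name, value in groups.items():
    print(f"{name}: {value:.4g}")
```

Keeping the conversion in one function makes it straightforward to build the dimensional and dimensionless feature sets from the same raw records.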

The Reynolds number determines the flow regime around bridge piers, where a higher Reynolds number signifies a more turbulent condition that may enhance the erosive potential and increase scouring (Hu et al., Citation2022). Therefore, including Re in simulating scouring around bridge piers can be beneficial to account for its underlying effect and ensure a thorough understanding of the scouring process (Tavouktsoglou et al., Citation2017). However, it is worth mentioning that our results in Section 4.2 show that Re has a marginal impact on the scour depth estimates, and its effect on dse/Y is less than that of the other variables in Eq. (2).

In this study, 841 field and laboratory data points in clear water conditions for circular bridge piers are collected from 35 published and unpublished reports, which are listed in Table 1.

Table 1. Sources for scour depth data used in this study.

Table 2 summarizes the mean, minimum value, maximum value, standard deviation (SD), and coefficient of variation (CV) of the data.

Table 2. Statistical indices of variables.

3. Models and methods

3.1. XGBoost

XGBoost is a novel variant of the GB technique (Friedman, Citation2001, Citation2002). Unlike the GB technique, XGBoost can efficiently manage a considerable amount of data by generating boosted trees and implementing them in parallel (Le et al., Citation2019; Zhang et al., Citation2020). Consequently, reliable and fast simulations can be performed for engineering problems with a large number of data points. The complexity of trees is addressed by the variation of a loss function. In other words, the residuals are used to calibrate the former predictions in each iteration through the optimization of a loss function. Furthermore, XGBoost uses a regularization function in the objective function (Zhou et al., Citation2021).

The process flow of XGBoost can be mathematically illustrated as follows. Consider a dataset with m data points and n features, $S = \{(x_i, y_i)\}$ ($|S| = m$, $x_i \in \mathbb{R}^n$, $y_i \in \mathbb{R}$), where $x_i$ and $y_i$ are the input and output, respectively. The output of a tree ensemble model can be obtained by adding P functions:

$$\hat{y}_i = \psi(x_i) = \sum_{p=1}^{P} f_p(x_i), \quad f_p \in F \tag{3}$$

where F is the space of regression trees and is expressed as:

$$F = \{ f(x) = \omega_{q(x)} \} \quad (q: \mathbb{R}^n \to T,\ \omega \in \mathbb{R}^T) \tag{4}$$

where q is the tree structure that maps each data point to its associated leaf, T is the number of leaves on the tree, $f_p$ is a function corresponding to a stand-alone tree structure, and ω is the vector of output weights of the leaves.

The target function (J) for the XGBoost method is defined as in Eq. (5) to minimize the error of ensemble trees:

$$J^{(t)} = \sum_{i=1}^{m} L\left(y_i, \hat{y}_i^{(t-1)} + f_t(x_i)\right) + \Omega(f_t) \tag{5}$$

where L is the training loss function used to determine the distance between the estimated ($\hat{y}_i$) and observed ($y_i$) values, superscript t denotes the iteration number, and Ω is the regularization function that controls the model complexity:

$$\Omega(f) = \gamma T + \frac{1}{2} \lambda \lVert \omega \rVert^2 \tag{6}$$

where λ is the penalty coefficient, and γ penalizes the number of leaves and thus controls the complexity of each tree. A higher value of γ yields less complex trees. The hyperparameters of XGBoost are the learning rate, max_delta_step, max_depth, and min_child_weight. The learning rate sets the shrinkage (reduction in the size of incremental steps) used to prevent overfitting in the learning process. Shrinking the weights in each step results in a more conservative model. Max_delta_step limits the change allowed in the weight estimate of each tree leaf. A positive value results in a more conservative update step. The maximum depth of the tree, max_depth, is used to control overfitting in the learning phase. Min_child_weight is the minimum sum of instance weights required in a child node, higher values of which make the model more conservative. In this study, we apply three optimization techniques to find the optimal hyperparameters of XGBoost, as described in the following subsections.
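To make the role of the regularization term concrete, the following minimal sketch evaluates Eqs. (5) and (6) for a single candidate tree. It assumes a squared-error loss for L, which is an illustrative choice rather than a statement of the loss used in this study, and the function names and numerical values are hypothetical.

```python
def regularization(leaf_weights, gamma, lam):
    """Omega(f) = gamma * T + 0.5 * lambda * ||omega||^2, as in Eq. (6)."""
    n_leaves = len(leaf_weights)                      # T
    squared_norm = sum(w * w for w in leaf_weights)   # ||omega||^2
    return gamma * n_leaves + 0.5 * lam * squared_norm


def objective(y_obs, y_prev, tree_output, leaf_weights, gamma, lam):
    """J^(t) of Eq. (5), with squared-error loss assumed for L."""
    loss = sum(
        (y - (prev + f_t)) ** 2
        for y, prev, f_t in zip(y_obs, y_prev, tree_output)
    )
    return loss + regularization(leaf_weights, gamma, lam)


# Toy case: a two-leaf tree whose outputs exactly correct the previous residuals
y_obs = [1.0, 2.0]
y_prev = [0.5, 2.5]              # predictions after iteration t-1
leaf_weights = [0.5, -0.5]       # omega
tree_output = [0.5, -0.5]        # f_t(x_i): each sample falls in its own leaf

omega = regularization(leaf_weights, gamma=1.0, lam=2.0)
J = objective(y_obs, y_prev, tree_output, leaf_weights, gamma=1.0, lam=2.0)
print(omega, J)
```

In this toy case the training loss vanishes, so the objective equals the regularization term alone, which shows how γ and λ penalize a tree even when it fits the residuals perfectly.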

3.2. PSO

PSO, introduced by Kennedy and Eberhart (Citation1995), is a metaheuristic algorithm that draws inspiration from nature. Each particle has an assigned position (x) and velocity (v). The initial position and velocity of each particle are defined by an n-dimensional random vector and a vector of zeros, respectively. Each dimension represents a specific decision parameter. Assume that particle i moves at velocity $v_i^t$ at iteration t, as a part of a swarm of size P, where $i \in \{1, 2, \dots, P\}$. The particle velocity at iteration t+1 ($v_i^{t+1}$) can be obtained as (Le et al., Citation2019):

$$v_i^{t+1} = w v_i^t + c_1 r_1 (\hat{x}_i - x_i^t) + c_2 r_2 (g - x_i^t) \tag{7}$$

where $\hat{x}_i$ represents the best location achieved by particle i up to the current iteration, g is the present best global solution across the swarm, w is the inertial weight, $c_1$ and $c_2$ are the cognitive and social parameters, respectively, and $r_1$ and $r_2$ are random numbers that vary from 0 to 1.

In the following step, given the updated velocity of particle i, the particle's position is adjusted as follows:

$$x_i^{t+1} = x_i^t + v_i^{t+1} \tag{8}$$

where $x_i^t$ indicates the location of particle i at iteration t.
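The velocity and position updates of Eqs. (7) and (8) can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the implementation used in this study: the parameter values (w = 0.7, c1 = c2 = 1.5), the clipping of positions to the search bounds, the per-dimension random numbers, and the sphere test function are all assumptions. In a hyperparameter-tuning setting, the objective would instead return a validation error for a given hyperparameter vector.

```python
import random


def pso(objective, bounds, n_particles=20, n_iter=100, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `objective` over box `bounds` with the basic PSO of Eqs. (7)-(8)."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial positions and zero initial velocities
    x = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    v = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in x]                      # personal best positions
    pbest_val = [objective(p) for p in x]
    g_idx = min(range(n_particles), key=lambda i: pbest_val[i])
    g, g_val = pbest[g_idx][:], pbest_val[g_idx]   # global best

    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Eq. (7): inertia + cognitive term + social term
                v[i][d] = (w * v[i][d]
                           + c1 * r1 * (pbest[i][d] - x[i][d])
                           + c2 * r2 * (g[d] - x[i][d]))
                # Eq. (8), followed by clipping to the bounds (an added assumption)
                lo, hi = bounds[d]
                x[i][d] = min(max(x[i][d] + v[i][d], lo), hi)
            val = objective(x[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = x[i][:], val
                if val < g_val:
                    g, g_val = x[i][:], val
    return g, g_val


def sphere(p):
    """Simple convex test function with its minimum of 0 at the origin."""
    return sum(t * t for t in p)


best, best_val = pso(sphere, [(-5.0, 5.0)] * 2)
print(best, best_val)
```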

3.3. RFO

Red foxes are a widespread fox species that can live and survive in different climatic conditions. Based on the red foxes’ lifestyle and method of hunting, Połap and Woźniak (Citation2021) developed a metaheuristic optimization algorithm named RFO. RFO is initialized with a fixed number of foxes, with the coordinates of each fox defined as:

$$X = [x_0, x_1, \dots, x_{n-1}] \tag{9}$$

where n describes the number of coordinates.

Each fox in each iteration is identified using the notation $(x_j^i)^t$, where i indicates the number of a fox in the population, j is the coordinate based on the dimension of the searching area, and t is the number of iterations.

Assuming $f \in \mathbb{R}^n$ to be a condition function of n variables, each point in the search space $[a, b]^n$, where $a, b \in \mathbb{R}$, can be expressed as in Eq. (10):

$$(X)^i = [(x_0)^i, (x_1)^i, \dots, (x_{n-1})^i] \tag{10}$$

$(X)^i$ is considered the optimum solution once the value of $f((X)^i)$ yields the global optimum. Every fox must participate in protecting the flock from threats. If adequate prey is not available in the area, the foxes must move to farther regions to achieve a better result for the exploration term. The information collected by a fox for the best location is shared with the others if a region with adequate prey is found. The fitness value is used to sort the results of the evaluated function.

The Euclidean distance (D) is used to find $(X^{best})^t$:

$$D[(X^i)^t, (X^{best})^t] = \lVert (X^i)^t - (X^{best})^t \rVert \tag{11}$$

where $\lVert \cdot \rVert$ is the Euclidean norm. Thus, the foxes travel toward the optimum solution as follows:

$$(X^i)^t = (X^i)^t + a \times \mathrm{sign}\left((X^{best})^t - (X^i)^t\right) \tag{12}$$

where a is a randomly chosen integer number in the range $a \in (0, D[(X^i)^t, (X^{best})^t])$.

If the updated position of a candidate yields a better solution, it is kept; otherwise, the former location is retained. The probability of a fox being observed while it moves to capture the prey is modeled using a random variable λ:

$$\begin{cases} \text{move closer} & \lambda > 0.75 \\ \text{stay and camouflage} & \lambda \le 0.75 \end{cases} \tag{13}$$

The radius (r) is another important term in the RFO, which indicates the vision radius of the predator fox:

$$r = \begin{cases} \alpha \dfrac{\sin \varphi_0}{\varphi_0} & \varphi_0 \ne 0 \\ \beta & \varphi_0 = 0 \end{cases} \tag{14}$$

where α is the scaling coefficient for modeling changes in the fox’s vision radius while it approaches a prey and lies in the range [0, 0.2]. β is a random value between 0 and 1 that denotes the impact of adverse weather such as rain or fog on the foxes’ vision angle. $\varphi_0$, which represents the fox’s angle of sight, varies in the range [0, 2π]. Therefore, the fox population approaching prey can be simulated as follows:

$$\begin{cases}
x_0^{New} = \alpha r \cos(\varphi_1) + x_0^{Old} \\
x_1^{New} = \alpha r \sin(\varphi_1) + \alpha r \cos(\varphi_2) + x_1^{Old} \\
x_2^{New} = \alpha r \sin(\varphi_1) + \alpha r \sin(\varphi_2) + \alpha r \cos(\varphi_3) + x_2^{Old} \\
\quad \vdots \\
x_{n-2}^{New} = \alpha r \sum_{k=1}^{n-2} \sin(\varphi_k) + \alpha r \cos(\varphi_{n-1}) + x_{n-2}^{Old} \\
x_{n-1}^{New} = \alpha r \sin(\varphi_1) + \alpha r \sin(\varphi_2) + \dots + \alpha r \sin(\varphi_{n-1}) + x_{n-1}^{Old}
\end{cases} \tag{15}$$

where $x^{New}$ is the updated coordinate, $x^{Old}$ is the coordinate in the previous iteration, and $\varphi_1, \varphi_2, \dots, \varphi_{n-1}$ are the foxes’ angles of sight for the corresponding coordinates.
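The individual RFO moves described above can be sketched as three small helper functions: the step toward the best fox (Eqs. 11 and 12), the vision radius (Eq. 14), and the coordinate update when approaching prey (Eq. 15). This is a simplified illustration under stated assumptions, not the full algorithm: the function names are hypothetical, the step size a is drawn from a continuous uniform distribution for simplicity, and population sorting and replacement of the worst foxes are omitted.

```python
import math
import random


def move_toward_best(x, x_best, rng):
    """Eqs. (11)-(12): step of random size a in (0, D) toward the best fox.

    Assumption: a is sampled from a continuous uniform distribution.
    """
    dist = math.sqrt(sum((b - xi) ** 2 for xi, b in zip(x, x_best)))  # Eq. (11)
    a = rng.uniform(0.0, dist)

    def sign(z):
        return (z > 0) - (z < 0)

    return [xi + a * sign(b - xi) for xi, b in zip(x, x_best)]       # Eq. (12)


def vision_radius(alpha, phi0, beta):
    """Eq. (14): vision radius of the fox."""
    return alpha * math.sin(phi0) / phi0 if phi0 != 0 else beta


def approach_prey(x_old, alpha, r, phis):
    """Eq. (15): phis holds the angles phi_1 ... phi_{n-1} for an n-dimensional fox."""
    n = len(x_old)
    if len(phis) != n - 1:
        raise ValueError("expected n - 1 angles")
    x_new = []
    for j in range(n):
        # Coordinate j accumulates sin(phi_1) ... sin(phi_j)
        sin_sum = sum(math.sin(phis[k]) for k in range(j))
        # All coordinates except the last add cos(phi_{j+1})
        cos_term = math.cos(phis[j]) if j < n - 1 else 0.0
        x_new.append(alpha * r * (sin_sum + cos_term) + x_old[j])
    return x_new


rng = random.Random(1)
fox = [0.0, 0.0, 0.0]
moved = move_toward_best(fox, [1.0, -2.0, 0.0], rng)
r = vision_radius(alpha=0.2, phi0=math.pi / 2, beta=0.5)
stepped = approach_prey(fox, alpha=0.1, r=1.0, phis=[0.0, 0.0])
print(moved, r, stepped)
```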

3.4. RPSO

RPSO is a variant of the PSO, which was introduced by Roder et al. (Citation2020). We briefly review the theory of relativity to explain RPSO. The theory of relativity, proposed by Albert Einstein (Citation1916), is one of the most significant theories in physics. Conventional mechanics’ theories were adapted to numerically model the phenomena described by Einstein. One of the best-known results pertains to momentum, which can be calculated in three-dimensional coordinates as:

$$M(v) = \varsigma(v)\, m\, v \tag{16}$$

where M is the momentum, and $v = (v_x, v_y, v_z)$ and m are the velocity and mass of a body, respectively. ς is the Lorentz factor, which is defined as follows:

$$\varsigma(v) = \frac{1}{\sqrt{1 - \left(\dfrac{|v|}{c}\right)^2}} \tag{17}$$

where |v| denotes the magnitude of vector v, and c is the speed of light.

Unlike PSO, RPSO considers the effects of the speed of light, improved social behaviour of swarms, and particles’ mass on the optimum values of the solution. RPSO transforms the three-dimensional momentum formula (Eq. 16) into an n-dimensional formula to compute the velocity of each particle in an n-dimensional search space:

$$v_i^{t+1} = M(v_i^t) + c_1 r_1 (\hat{x}_i - x_i^t) + c_2 r_2 (g - x_i^t) \tag{18}$$

where M(.) is the typical relativistic momentum in three dimensions, which is extended to the n-dimensional space. Mass m is selected from a uniform distribution over [0, 1] to compute the relativistic momentum of each particle (Roder et al., Citation2020).
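A minimal sketch of the relativistic velocity update in Eqs. (16) to (18) is given below. It replaces the inertia term of standard PSO with the relativistic momentum of the current velocity. The function names, the treatment of c as a user-chosen constant of the search (here c = 1), and the error raised when the speed reaches c are assumptions made for illustration.

```python
import math
import random


def lorentz_factor(v, c):
    """Eq. (17): 1 / sqrt(1 - (|v| / c)^2) for an n-dimensional velocity v."""
    speed = math.sqrt(sum(vi * vi for vi in v))
    if speed >= c:
        raise ValueError("speed must stay below c")
    return 1.0 / math.sqrt(1.0 - (speed / c) ** 2)


def relativistic_momentum(v, mass, c):
    """Eq. (16) extended to n dimensions: M(v) = lorentz_factor * m * v."""
    factor = lorentz_factor(v, c)
    return [factor * mass * vi for vi in v]


def rpso_velocity(v, x, pbest, gbest, mass, c, c1, c2, rng):
    """Eq. (18): momentum term plus the usual cognitive and social terms."""
    r1, r2 = rng.random(), rng.random()
    momentum = relativistic_momentum(v, mass, c)
    return [
        momentum[d] + c1 * r1 * (pbest[d] - x[d]) + c2 * r2 * (gbest[d] - x[d])
        for d in range(len(x))
    ]


rng = random.Random(0)
mass = rng.uniform(0.0, 1.0)   # particle mass drawn from U[0, 1]
v_new = rpso_velocity(
    v=[0.3, -0.1], x=[1.0, 2.0], pbest=[0.5, 1.5], gbest=[0.0, 1.0],
    mass=mass, c=1.0, c1=1.5, c2=1.5, rng=rng,
)
print(v_new)
```

Because the Lorentz factor grows rapidly as the speed approaches c, fast particles receive a larger momentum term than in standard PSO, whereas slow particles behave almost as in Eq. (7) with inertia weight m.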

3.5. Model performance evaluation metrics

Three statistical indices are used to assess the efficiency of the proposed methods: root mean square error (RMSE), mean absolute error (MAE), and coefficient of determination (R2). Lower MAE and RMSE denote better results. R2, which indicates the percentage of variation in the dependent variable that can be accounted for by the independent variables, lies in the range [0, 1]. R2 values closer to 1 indicate a stronger relationship between the predictions and observations. The three metrics are defined as follows:

$$RMSE = \sqrt{\frac{\sum_{i=1}^{N} \left(d_{se,i}^{obs} - d_{se,i}^{pred}\right)^2}{N}} \tag{19}$$

$$MAE = \frac{\sum_{i=1}^{N} \left| d_{se,i}^{obs} - d_{se,i}^{pred} \right|}{N} \tag{20}$$

$$R^2 = 1 - \frac{\sum_{i=1}^{N} \left(d_{se,i}^{obs} - d_{se,i}^{pred}\right)^2}{\sum_{i=1}^{N} \left(d_{se,i}^{obs} - \bar{d}_{se}\right)^2} \tag{21}$$

where N is the number of data points, $d_{se,i}^{pred}$ and $d_{se,i}^{obs}$ are the estimated and observed scour depths for the ith data point, respectively, and $\bar{d}_{se}$ indicates the average value of the observed scour depth.
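For reference, Eqs. (19) to (21) translate directly into the following standard-library functions. The observed and predicted values in the example are made-up numbers used only to exercise the formulas.

```python
import math


def rmse(obs, pred):
    """Root mean square error, Eq. (19)."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def mae(obs, pred):
    """Mean absolute error, Eq. (20)."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def r2(obs, pred):
    """Coefficient of determination, Eq. (21)."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


# Illustrative values only (not from the study's dataset)
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 4.0]
print(rmse(observed, predicted), mae(observed, predicted), r2(observed, predicted))
```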

3.6. SHAP

Lundberg and Lee (Citation2017) developed the SHAP method based on game theory to interpret ML models. The concept was used in game theory to assess a player’s participation in a cooperative team game and distribute a fair reward among players based on their contributions (Shapley, Citation1953). SHAP determines the degree of contribution of each input to the model’s output prediction. Moreover, it evaluates whether the effect of each input on the model’s prediction is positive or negative. The use of SHAP to interpret ML models can help prevent the problem of these models operating as black boxes. The model behaviour in both global (for the entire dataset) and local (for a single prediction) aspects is explained by assigning SHAP values to each feature (Zhang et al., Citation2022). The contribution of each input feature can be determined as:

$$\kappa_i = \sum_{\zeta \subseteq G \setminus \{i\}} \frac{|\zeta|!\,(n - |\zeta| - 1)!}{n!} \left[\tau(\zeta \cup \{i\}) - \tau(\zeta)\right] \tag{22}$$

where $\kappa_i$ indicates the contribution of the i-th feature (in terms of Shapley values), G is the collection of all features, n is the number of features in G, ζ is a subset of G that excludes the i-th feature, |.| denotes the number of members in a set, and $\tau(\zeta \cup \{i\})$ and $\tau(\zeta)$ indicate the model’s output with and without the i-th feature, respectively. The model output can be written as a linear summation of a constant base value and SHAP values as (Khattak et al., Citation2022):

$$e(z) = \kappa_0 + \sum_{i=1}^{n} \kappa_i z_i \tag{23}$$

where $\kappa_0$ is a base value and $z \in \{0, 1\}^n$, with $z_i = 1$ when the i-th feature is present and $z_i = 0$ otherwise.
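To illustrate Eqs. (22) and (23), the sketch below computes exact Shapley values by enumerating all feature subsets for a three-feature toy model. The toy model, the choice of replacing absent features with a baseline of zero, and the function names are assumptions made for illustration; practical SHAP implementations use faster, model-specific algorithms rather than brute-force enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, n):
    """Exact Shapley values via Eq. (22); value_fn maps a frozenset of feature indices to tau."""
    contributions = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = frozenset(subset)
                # Marginal contribution of feature i when added to subset s
                total += weight * (value_fn(s | {i}) - value_fn(s))
        contributions.append(total)
    return contributions


# Toy model f(x) = 2*x0 + x1*x2, explained at x = (1, 2, 3).
# Assumption: features absent from the subset are replaced by a baseline of 0.
x = (1.0, 2.0, 3.0)


def toy_value(subset):
    masked = [x[j] if j in subset else 0.0 for j in range(3)]
    return 2.0 * masked[0] + masked[1] * masked[2]


kappa = shapley_values(toy_value, 3)
base = toy_value(frozenset())              # kappa_0 in Eq. (23)
full = toy_value(frozenset({0, 1, 2}))     # model output with all features present
print(kappa, base + sum(kappa), full)
```

The interaction term x1*x2 is shared equally between the two features involved, and the base value plus the sum of the contributions reproduces the full model output, which is the additivity expressed by Eq. (23) with all $z_i = 1$.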

Figure 1 illustrates the process of developing hybrid ML models for scour depth prediction.

Figure 1. Process flow of the proposed hybrid ML models and SHAP for estimating the scour depth near circular bridge piers.


4. Results and discussion

4.1. Model evaluation

The 841 experimental data points are split randomly into training (70%), validation (15%), and testing (15%) datasets. The training dataset is used to train the ML models and typically includes most of the data. The model uses the training data to minimize the differences between observed and predicted values by adjusting its internal parameters (Brownlee, Citation2020). The validation dataset is used to assess the model efficiency during training and to obtain the best set of hyperparameters. The validation dataset is distinct from the training dataset to prevent overfitting (Brownlee, Citation2020). The performance of the final (calibrated) ML model is assessed using the testing dataset, which is completely withheld from the model during the training and validation stages (Mangalathu et al., Citation2020). In this manner, testing data enable an unbiased evaluation of ML models. All three datasets should be used to ensure that the ML model is precise, robust, and not overfitted to training data.
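A random 70/15/15 split of this kind can be sketched as follows. The seed, the rounding rule, and the resulting subset sizes are assumptions for illustration; the exact number of points in each subset of this study is not stated above.

```python
import random


def split_indices(n, train=0.70, val=0.15, seed=42):
    """Shuffle indices 0..n-1 and cut them into training, validation, and testing subsets."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = round(train * n)
    n_val = round(val * n)
    # The testing subset takes whatever remains after training and validation
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]


train_idx, val_idx, test_idx = split_indices(841)
print(len(train_idx), len(val_idx), len(test_idx))
```

Fixing the seed keeps the testing subset identical across all compared models, so that differences in the metrics reflect the models rather than the partition.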

In this study, the hyperparameters of the XGBoost method, i.e. learning rate, max_delta_step, max_depth, min_child_weight, and n_estimators, are optimized using PSO, RFO, and RPSO. The default values of these parameters are used to initialize XGBoost. The merits of hybrid models include higher accuracy, flexibility, and robustness in handling high-dimensional data (Jalil et al., Citation2022). Tables 3 and 4 present the optimal values of the hyperparameters obtained by different optimization approaches for dimensional and dimensionless datasets, respectively.

Table 3. Initial hyperparameters of the XGBoost model and values optimized using the PSO, RFO, and RPSO techniques for the dimensional dataset.

Table 4. Initial hyperparameters of the XGBoost model and values optimized using the PSO, RFO, and RPSO techniques for the dimensionless dataset.

First, the original variables in Eq. (1) are used in the developed models to estimate the scour depth. Figure 2 presents the scatter plot of scour depth predictions from XGBoost, PSO–XGBoost, RFO–XGBoost, and RPSO–XGBoost versus observations for the training, validation, and testing stages. The scour depth estimations from all models are close to the 45° line, which shows that the proposed models can efficiently predict the scour depth. However, the predictions for larger scour depths deviate from the 1:1 line in the validation and testing steps, owing to the lower frequency of data in this range. Scour predictions in the training and validation phases are better than those in the testing step because the training and validation data are used in the learning process.

Figure 2. Scatter plot of predicted versus observed scour depth values for the training, validation, and testing phases using XGBoost, PSO–XGBoost, RFO–XGBoost, and RPSO–XGBoost.


Table 5 summarizes the performance metrics of the proposed models (i.e. RMSE, MAE, and R2) for the training, validation, and testing stages.

Table 5. Statistical indicators of scour depth predictions made by the proposed machine learning techniques using the dimensional variables in Eq. (1).

As expected, XGBoost is the least accurate method with MAEs of 0.0035, 0.0154, and 0.0228 m and RMSEs of 0.0102, 0.0255, and 0.0434 m in the training, validation, and testing stages, respectively. Figure 2 and Table 5 demonstrate that all three hybrid models are more accurate than the XGBoost method owing to the use of optimized hyperparameters. In the training phase, PSO–XGBoost, RFO–XGBoost, and RPSO–XGBoost each reduce the RMSE of XGBoost by 0.98%. The corresponding RMSE reductions are 2.30%, 2.53%, and 8.29% for testing and 25.09%, 32.94%, and 40.78% for validation. In terms of the MAE, in the training step, PSO–XGBoost, RFO–XGBoost, and RPSO–XGBoost enhance the accuracy of XGBoost by 5.71%, 17.14%, and 34.28%, respectively. RPSO–XGBoost exhibits the highest performance with the lowest RMSEs (0.0101, 0.0151, and 0.0398 m) and MAEs (0.0023, 0.0104, and 0.0207 m) for the training, validation, and testing phases, respectively, followed by RFO–XGBoost and PSO–XGBoost. In the training step, the MAE of RPSO–XGBoost is 30.30% and 20.68% lower than those of PSO–XGBoost and RFO–XGBoost, respectively. In the validation step, RPSO–XGBoost improves the MAE (RMSE) of PSO–XGBoost and RFO–XGBoost by 21.21% (40.78%) and 7.96% (11.69%), respectively. A similar trend of MAE (RMSE) reduction is seen in the testing stage, where the MAEs (RMSEs) of RPSO–XGBoost are 13.38% (6.13%) and 0.95% (5.91%) smaller than those of PSO–XGBoost and RFO–XGBoost, respectively. Overall, the proposed models can be ranked in the decreasing order of their ability to predict the scour depth around bridge piers as RPSO–XGBoost, RFO–XGBoost, PSO–XGBoost, and XGBoost.
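The percentage improvements quoted above follow from the relative reduction of an error metric with respect to a baseline. As a worked check, the snippet below recomputes the RPSO–XGBoost improvements over XGBoost from the RMSE and MAE values reported in this section; the function name is a hypothetical helper.

```python
def percent_reduction(baseline, improved):
    """Relative reduction (%) of an error metric with respect to a baseline model."""
    return 100.0 * (baseline - improved) / baseline


# XGBoost versus RPSO-XGBoost, dimensional dataset (values in metres, from the text)
rmse_training = percent_reduction(0.0102, 0.0101)
rmse_validation = percent_reduction(0.0255, 0.0151)
rmse_testing = percent_reduction(0.0434, 0.0398)
mae_training = percent_reduction(0.0035, 0.0023)

print(f"RMSE reduction, training:   {rmse_training:.2f}%")
print(f"RMSE reduction, validation: {rmse_validation:.2f}%")
print(f"RMSE reduction, testing:    {rmse_testing:.2f}%")
print(f"MAE reduction, training:    {mae_training:.2f}%")
```

Small differences in the last digit can arise because the reported errors are themselves rounded to four decimal places.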

Subsequently, the dimensionless parameters in Eq. (2) are used in the models to estimate dse/Y. Figure 3 compares the dse/Y estimates from XGBoost, PSO–XGBoost, RFO–XGBoost, and RPSO–XGBoost with observations for the training, validation, and testing stages.

Figure 3. Scatter plot of predicted versus observed dimensionless scour depth (dse/Y) in the training, validation, and testing phases, derived using XGBoost, PSO–XGBoost, RFO–XGBoost, and RPSO–XGBoost.


Figure 3 shows that the scour depth predictions from all proposed models are concentrated around the 45° line, indicating their satisfactory accuracies. A comparison of Figs. 2 and 3 demonstrates that all models produce more reliable results when trained on dimensional data (Eq. (1)) instead of dimensionless data (Eq. (2)). A possible explanation is that the use of raw variables rather than a combination of variables enhances the flexibility of models in simulating the highly nonlinear relationship between the inputs and output. This result is in line with those of Bateni et al. (Citation2007a, Citation2007b) and Zounemat-Kermani et al. (Citation2009), who reported that the results of ANN and ANFIS models trained on the dimensional dataset were superior to those of the models trained on dimensionless data.

Table 6 presents a comparison of the predictive performances of the proposed methods for dse/Y. The MAEs and RMSEs of the hybrid models are lower than those of XGBoost. The best dse/Y estimates are attributed to RPSO–XGBoost with RMSE = 0.1723, MAE = 0.0431, and R2 = 0.9765 for training; RMSE = 0.2065, MAE = 0.1354, and R2 = 0.9714 for validation; and RMSE = 0.3157, MAE = 0.1657, and R2 = 0.8164 for testing. For the testing phase, the RMSE of RPSO–XGBoost is 4.36% and 5.48% lower than those of RFO–XGBoost and PSO–XGBoost, respectively. XGBoost has the weakest performance among all proposed methods with RMSE = 0.1825, MAE = 0.0763, and R2 = 0.9735 in the training phase; RMSE = 0.4066, MAE = 0.1832, and R2 = 0.8889 in the validation phase; and RMSE = 0.3435, MAE = 0.1712, and R2 = 0.7824 in the testing phase. Table 6 and Figure 3 also highlight the superiority of all hybrid models over XGBoost: In the training stage, the MAEs (RMSEs) of PSO–XGBoost, RFO–XGBoost, and RPSO–XGBoost are 14.94% (0.11%), 30.27% (3.29%), and 43.51% (5.59%) lower than those of XGBoost, respectively. Similarly, for validation data, the MAEs (RMSEs) of PSO–XGBoost, RFO–XGBoost, and RPSO–XGBoost are lower than those of XGBoost by 5.40% (20.76%), 19.05% (39.69%), and 26.09% (49.21%), respectively. These results indicate that PSO, RFO, and RPSO can successfully tune the hyperparameters of XGBoost, thereby enhancing its scour depth prediction capabilities. Therefore, the proposed models can be ranked in decreasing order of their accuracy in predicting dse/Y as RPSO–XGBoost, RFO–XGBoost, PSO–XGBoost, and XGBoost. This finding is consistent with the previous ranking of methods for scour depth prediction using the dimensional variables in Eq. (1).

Table 6. Statistical indices of (dse/Y) predictions from the proposed methods using the dimensionless variables in Eq. (2).

The superiority of RPSO–XGBoost over PSO–XGBoost and RFO–XGBoost is attributed to several factors. Firstly, the RPSO technique incorporates relativistic effects, allowing particles to move at speeds that can approach the speed of light (Roder et al., 2020). These relativistic effects enable RPSO–XGBoost to explore the search space extensively, conducting a more comprehensive and thorough search for optimal hyperparameters (Roder et al., 2020). Secondly, the relativistic effects in RPSO facilitate more efficient convergence of particles towards the global optimum (Roder et al., 2020). The effective navigation of the complex search space by RPSO allows RPSO–XGBoost to reveal significant patterns and dependencies that influence scour depth, resulting in highly accurate predictions.
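For orientation, the classical (non-relativistic) PSO update of Kennedy and Eberhart (1995), on which RPSO builds, can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in this study: the objective `toy_validation_error` is a hypothetical stand-in for the validation RMSE of an XGBoost model, the two search dimensions (a learning rate and a tree depth treated as continuous) and their bounds are illustrative, and the coefficients `w`, `c1`, and `c2` are common textbook choices. The relativistic velocity correction of RPSO is not reproduced here.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iter=200,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Classical PSO with inertia weight; returns (best_position, best_value)."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial positions inside the bounds, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(n_iter):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Inertia + attraction to personal best + attraction to global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Move the particle and clamp it to the search bounds.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def toy_validation_error(params):
    """Hypothetical smooth error surface with its minimum at (0.1, 6.0)."""
    learning_rate, max_depth = params
    return (learning_rate - 0.1) ** 2 + (max_depth - 6.0) ** 2 / 100.0


# Illustrative bounds for the two hypothetical hyperparameters.
best, best_val = pso_minimize(toy_validation_error, [(0.01, 0.5), (2.0, 12.0)])
print(best, best_val)
```

In a real tuning loop, the objective would train a model with the candidate hyperparameters and return its validation error, which makes each evaluation far more expensive than in this toy setting.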

Based on testing data, Table 7 compares the performances of the proposed models and 28 existing techniques. The numbers in parentheses show the percentage of increase in the RMSE/MAE and decrease in R2 compared with those of RPSO–XGBoost, which is the highest-performing approach developed in this study. Because the testing data are withheld from the model in the training and validation steps, they can be used as an independent data source to perform a fair evaluation of the final models. The results show that the four proposed methods outperform all the existing approaches. Among the 28 existing methods, the model proposed by Shamshirband et al. (2020) achieves the best scour depth predictions with RMSE = 0.0654 m, MAE = 0.0470 m, and R2 = 0.8268.

Table 7. Comparison of scour depth estimates from XGBoost, PSO–XGBoost, RFO–XGBoost, and RPSO–XGBoost with those obtained using existing approaches.

Shamshirband et al. (2020) used the PSO algorithm to develop an explicit equation for estimating scour depth around circular bridge piers. Their equation was developed by minimizing the discrepancy between predicted and observed scour depth values via the PSO algorithm (Shamshirband et al., 2020). The PSO technique enhances the capability of their expression to capture inherent relationships among input and output data, leading to improved prediction accuracy. Furthermore, the optimization process performed by the PSO algorithm facilitates the generalizability of the equation based on the available dataset (Taormina & Chau, 2015). As a result, their derived equation has the potential to be applied to a diverse range of scenarios beyond the specific dataset utilized for its development. The ability to generalize effectively is of considerable significance for practical applications.

Other studies in Table 7 mostly used regression techniques to derive equations for predicting dse. The traditional regression-based approaches cannot robustly capture the nonlinear and complex relationship between the scour depth and its influential variables (Ebtehaj et al., 2017).

The RMSE (MAE) of the scour depth estimates from RPSO–XGBoost is 39.14% (55.95%) lower than that of Shamshirband et al. (2020). The model proposed by Blench (1969) produces the least accurate results, with RMSE = 0.7712 m, MAE = 0.7463 m, and R2 = 0.1318. On average, RPSO–XGBoost improves the RMSE and MAE of scour depth predictions by 64.91% and 70.94%, respectively, compared with those of the 28 existing approaches. Figure 4 visually compares the scour depth estimates from the four proposed models with those obtained using the model of Shamshirband et al. (2020), the highest-performing method among the 28 considered techniques. Most of the scour depth predictions obtained using the proposed models are closer to the 1:1 line, which demonstrates their superiority over those of the model proposed by Shamshirband et al. (2020).

Figure 4. Visual comparison of the predictions obtained using the proposed methods and the method proposed by Shamshirband et al. (2020), based on testing data.

The violin plot in Fig. 5a shows the distribution of scour depth estimates from RPSO–XGBoost (the best proposed model) in comparison with those obtained using two widely used models (i.e. HEC-18 and Sheppard et al. (2014)) and the best method among the 28 considered existing methods (i.e. Shamshirband et al. (2020)). A similar comparison is made for dse/Y in Fig. 5b. In general, violin plots combine a kernel density plot with a box plot. Unlike box plots, violin plots show both the summary statistics and the density of the data, thereby explaining the variability of the data. The small white dot in the violin plots in Fig. 5 represents the median of the data. The interquartile range is represented by a thick black line. The thin black line indicates the data beyond the interquartile range, excluding outliers. The shape of the data distribution is shown by the kernel density approximation on the two sides of the black line. A wider section of the violin plot around the median denotes a high concentration of data in this region. The tapered ends of the violin plot indicate a lower concentration of data in those areas.
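The elements of a violin plot described above can be computed directly from a sample. The following sketch, which uses only the Python standard library, returns the median, the quartiles, the whisker limits, and a Gaussian kernel density estimate. The sample values are hypothetical scour depths chosen for illustration, and the 1.5 × IQR whisker rule and the normal-reference bandwidth are assumptions of this sketch; plotting libraries may use different defaults.

```python
import math
import statistics


def violin_summary(values, grid_points=50):
    """Summary statistics and a Gaussian kernel density estimate for one violin."""
    data = sorted(values)
    q1, _, q3 = statistics.quantiles(data, n=4)  # quartile cut points
    median = statistics.median(data)
    iqr = q3 - q1
    # Whiskers: most extreme observations within 1.5 * IQR of the quartiles.
    lower = min(v for v in data if v >= q1 - 1.5 * iqr)
    upper = max(v for v in data if v <= q3 + 1.5 * iqr)
    # Normal-reference rule-of-thumb bandwidth (an assumption of this sketch).
    bandwidth = 1.06 * statistics.stdev(data) * len(data) ** (-0.2)
    lo, hi = data[0], data[-1]
    grid = [lo + (hi - lo) * k / (grid_points - 1) for k in range(grid_points)]
    norm = 1.0 / (len(data) * bandwidth * math.sqrt(2.0 * math.pi))
    # Density at each grid point: sum of Gaussian kernels centred on the data.
    density = [
        norm * sum(math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in data)
        for x in grid
    ]
    return {
        "median": median, "q1": q1, "q3": q3,
        "whiskers": (lower, upper), "grid": grid, "density": density,
    }


# Hypothetical scour depths in metres, for illustration only.
sample = [0.05, 0.08, 0.10, 0.11, 0.12, 0.13, 0.15, 0.18, 0.22, 0.40]
summary = violin_summary(sample)
peak = summary["grid"][summary["density"].index(max(summary["density"]))]
print(summary["median"], summary["whiskers"], peak)
```

The width of the violin at a given depth is proportional to the corresponding density value, so the widest section sits where the observations cluster, while the isolated largest value lies beyond the upper whisker.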

Figure 5 shows that the median and interquartile values of the scour depths predicted by RPSO–XGBoost are close to the observations, highlighting its high predictive accuracy. For both dimensional and dimensionless data, the scour depth estimates from all methods are clustered around the median, indicating a higher probability in this region. In the case of dimensional data, the distribution of scour depth predictions from RPSO–XGBoost is similar to that of the measurements. In addition, the interquartile range, range of data, and median of predictions from RPSO–XGBoost are closer to the observations for both dimensional and dimensionless data, compared with those of Shamshirband et al. (2020), Sheppard et al. (2014), and HEC-18. A comparison of Figure 5(a,b) confirms the findings of the previous analyses demonstrating the superiority of scour depth predictions obtained using the dimensional variables in Eq. (1) over those obtained using the dimensionless variables in Eq. (2). The upper end of the violin plots for dse/Y predictions (Figure 5(b)) indicates a higher value than that of dse/Y measurements because the models overestimate dse/Y in the case of large values.

Figure 5. Violin plots comparing the distribution of (top) dimensional dataset and (bottom) dimensionless dataset estimates from different approaches with measurements.

To further investigate the ability of the proposed methods, the Taylor diagram for scour depth measurements and estimates is derived, as shown in Figure 6. Taylor diagrams are valuable tools for visualizing results, as they can graphically summarize several evaluation metrics, i.e. the SD, correlation coefficient between the predicted and measured values, and centered RMSE, in a single plot. Thus, Taylor diagrams can help obtain reliable conclusions regarding the efficiency of the proposed models. Notably, the statistical indicators shown in the Taylor diagram in Figure 6 correspond to the testing data. When using the dimensional dataset, the proposed XGBoost, PSO–XGBoost, and RPSO–XGBoost models underestimate the range of scour depth variation, whereas RFO–XGBoost tends to overestimate the scour depth variations. When using dimensionless data, all the proposed methods overestimate the variations in the scour depth, as indicated by the higher SD compared with the observations (i.e. the values lie beyond the red dashed curve). Additionally, Figure 6 provides further evidence of the higher accuracy of proposed models using dimensional variables compared with those using dimensionless variables, as the correlation coefficient of the former is more than 0.95 but that of the latter ranges between 0.9 and 0.95. RPSO–XGBoost yields better results for both input configurations given its higher correlation with the observations and lower RMSE compared with those of the other three methods. Furthermore, the variability (represented by SD) of dse and dse/Y predicted by all proposed methods is close to that of the observations. For both dimensional and dimensionless datasets, the predictions of XGBoost, PSO–XGBoost, RFO–XGBoost, and RPSO–XGBoost are closer to the scour depth observations (black triangle in the Taylor diagram) compared with those obtained using the methods of Shamshirband et al. (2020), Sheppard et al. (2014), and HEC-18 (Arneson et al., 2012). Furthermore, the SD of dse/Y predictions from Sheppard et al. (2014) and HEC-18 differs significantly from the observations, which shows that these methods cannot accurately predict the variability in dse/Y.
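The three quantities summarized by a Taylor diagram are linked by the law of cosines, which is why they can be displayed in a single two-dimensional plot: the squared centered RMSE equals the sum of the squared SDs of the observations and predictions minus twice the product of the two SDs and the correlation coefficient. The sketch below computes these statistics and checks the identity numerically. The observed and predicted values are hypothetical, and population (1/n) statistics are assumed throughout.

```python
import math


def taylor_statistics(observed, predicted):
    """Return (sd_obs, sd_pred, correlation, centered_rmse) using 1/n statistics."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    dev_o = [o - mean_o for o in observed]
    dev_p = [p - mean_p for p in predicted]
    sd_o = math.sqrt(sum(d * d for d in dev_o) / n)
    sd_p = math.sqrt(sum(d * d for d in dev_p) / n)
    corr = sum(a * b for a, b in zip(dev_o, dev_p)) / (n * sd_o * sd_p)
    # Centered RMSE: RMSE after removing the mean of each series.
    crmse = math.sqrt(sum((b - a) ** 2 for a, b in zip(dev_o, dev_p)) / n)
    return sd_o, sd_p, corr, crmse


# Hypothetical observed and predicted scour depths (m), for illustration only.
observed = [0.10, 0.15, 0.22, 0.30, 0.41]
predicted = [0.12, 0.14, 0.25, 0.27, 0.38]

sd_o, sd_p, corr, crmse = taylor_statistics(observed, predicted)
# Law of cosines relating the three statistics shown in a Taylor diagram.
law_of_cosines = sd_o ** 2 + sd_p ** 2 - 2.0 * sd_o * sd_p * corr
print(sd_o, sd_p, corr, crmse, law_of_cosines)
```

In this toy sample the SD of the predictions is smaller than that of the observations, which corresponds to a point lying inside the reference SD arc, i.e. a model that underestimates the variability.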

Figure 6. Taylor diagrams for scour depth estimates from various approaches and observations.

Figure 7(a,b) shows the RMSEs of dse and dse/Y predictions from RPSO–XGBoost (the most accurate method) under different numbers of iterations, respectively. As the number of iterations increases, the RMSEs of dse and dse/Y predictions from RPSO–XGBoost decrease rapidly to their asymptotes. A learning curve of a model is considered a good fit (i.e. neither underfit nor overfit) if it satisfies two conditions: (1) the RMSEs in the training phase decrease to a final stable value, and (2) the RMSEs in the validation phase follow the trend of the training phase with minor differences. According to Figure 7, the learning behaviour of the proposed model (RPSO–XGBoost) can be considered a good fit.
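The two conditions above can be turned into a simple numerical check. The sketch below is one possible heuristic, not the procedure used in this study: the function name `looks_like_good_fit`, the tolerances (`plateau_tol`, `gap_tol`, `rebound_tol`), and the synthetic learning curves are all assumptions introduced for illustration.

```python
import math


def looks_like_good_fit(train_rmse, val_rmse, tail=10,
                        plateau_tol=0.01, gap_tol=0.05, rebound_tol=0.10):
    """Heuristic check of the two learning-curve conditions (tolerances are assumed)."""
    scale = train_rmse[0]  # initial training error, used to make tolerances relative

    def plateaued(curve):
        # The curve has decreased overall and its last `tail` values are nearly flat.
        last = curve[-tail:]
        return curve[-1] < curve[0] and (max(last) - min(last)) <= plateau_tol * scale

    # Condition (1): the training error decreases to a stable value.
    train_ok = plateaued(train_rmse)
    # Condition (2): the validation error follows the same trend, does not rise
    # again after its minimum, and ends close to the training error.
    no_rebound = val_rmse[-1] <= (1.0 + rebound_tol) * min(val_rmse)
    small_gap = abs(val_rmse[-1] - train_rmse[-1]) <= gap_tol * scale
    return train_ok and plateaued(val_rmse) and no_rebound and small_gap


# Synthetic curves for illustration: exponential decay towards an asymptote.
iterations = range(100)
train = [0.5 * math.exp(-0.1 * k) + 0.010 for k in iterations]
val_good = [0.5 * math.exp(-0.1 * k) + 0.015 for k in iterations]
# An overfitting pattern: the validation error rises again after iteration 40.
val_overfit = [v + 0.002 * max(0, k - 40) for k, v in zip(iterations, val_good)]

print(looks_like_good_fit(train, val_good))
print(looks_like_good_fit(train, val_overfit))
```

A check of this kind only complements the visual inspection of the learning curves; suitable tolerances depend on the scale of the target variable and on the noise in the validation data.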

Figure 7. Variations in the RMSE of (left) dse and (right) dse/Y predictions from RPSO–XGBoost (the best model in this study) versus the number of iterations in the training and validation stages.

4.2. Model interpretation

A SHAP analysis is performed to identify the variable with the most notable contribution to the scour depth predictions. As mentioned previously, SHAP values not only allow us to determine the contribution of each variable to the prediction but also help identify and visualize the significant relationships in the model. SHAP values can be visualized using various approaches such as the waterfall plot, force plot, decision plot, mean SHAP plot, and beeswarm plot. A beeswarm plot aggregates the SHAP values of all observations so that all SHAP values can be visualized at once. Figure 8(a,b) shows the beeswarm plots based on RPSO–XGBoost (the most accurate model) for dimensional and dimensionless variables, respectively.

Figure 8. SHAP summary plots based on RPSO–XGBoost to rank the significance of each feature in estimating (a) dse and (b) (dse/Y).

In this plot, the variables are ordered based on their importance on the y-axis. The x-axis denotes the SHAP value of each observation, and the colour of each point shows the actual value of that feature (i.e. not the SHAP value). Thicker parts of the beeswarm plot imply a higher density of predictions in that region. Figure 8(a) indicates that for dimensional data, the pier diameter (D) has the most notable effect on scour depth prediction. The second most influential variable is the flow depth (Y), followed by the flow velocity (V), median grain size (d50), and critical velocity of sediments (Vc). This ranking is consistent with the study of Bateni et al. (2007a), who reported that the pier diameter and critical velocity have the greatest and least impact on the estimation of scour depth using ANN, respectively. Larger values of the pier diameter correspond to larger SHAP values, which in turn yield higher scour depth estimates. A similar positive relationship is observed for the flow depth and flow velocity. In contrast, a clear inverse relationship is found between the scour depth and the median grain size and critical velocity, both of which are associated with sediment characteristics. These relationships are compatible with the physical nature of the scouring process, as larger pier diameter, flow depth, and flow velocity result in higher scour depths, and piers located in riverbeds with finer sediments and sediments with a lower critical velocity typically experience larger scour depths. As outlined in Figure 8(b), the most and least significant dimensionless variables for predicting dse/Y are D/Y and Re, respectively. The remaining dimensionless variables can be listed in decreasing order of their importance as V/Vc, d50/Y, and Fr. Higher values of D/Y and V/Vc lead to higher dse/Y predictions. Because of certain outliers in the SHAP values of d50/Y, Fr, and Re, no clear relationship can be concluded between these variables and dse/Y predictions.
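The ranking logic behind such a plot can be illustrated with exact Shapley values on a toy problem. The sketch below enumerates all feature coalitions, replaces absent features with a baseline value, and ranks features by their mean absolute Shapley value across samples. It is an illustration under stated assumptions: `toy_model` is a hypothetical function and not a scour equation, the samples are invented, and the brute-force enumeration differs from the tree-specific algorithm typically used for boosted trees (Lundberg et al., 2020).

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline)."""
    n = len(x)

    def value(subset):
        # Features in the coalition take their actual value, others the baseline.
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return model(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for s in combinations(others, k):
                coalition = set(s)
                total += weight * (value(coalition | {i}) - value(coalition))
        phi.append(total)
    return phi


def toy_model(z):
    """Hypothetical response with a linear, an interaction, and a negative term."""
    d, y, v, d50 = z
    return 1.5 * d + 0.5 * y * v - 20.0 * d50


names = ["D", "Y", "V", "d50"]
# Hypothetical samples (pier diameter, flow depth, flow velocity, grain size).
samples = [
    [0.10, 0.30, 0.40, 0.0010],
    [0.20, 0.25, 0.50, 0.0020],
    [0.05, 0.40, 0.30, 0.0005],
]
# Baseline: the mean of each feature over the samples.
baseline = [sum(col) / len(samples) for col in zip(*samples)]

all_phi = [shapley_values(toy_model, x, baseline) for x in samples]
# Global importance: mean absolute Shapley value of each feature.
importance = [sum(abs(phi[j]) for phi in all_phi) / len(all_phi) for j in range(len(names))]
ranking = [name for _, name in sorted(zip(importance, names), reverse=True)]
print(ranking)
```

For each sample, the Shapley values sum to the difference between the model output and the baseline output, which is the additivity property that allows per-feature contributions to be compared on a common axis.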

To ensure the physical consistency of scour depth estimates from ML models, the following multifaceted approach is essential (Najafzadeh & Oliveto, 2020): (1) Compare predictions from ML models with those obtained from established physical models and/or widely used empirical equations to verify the physical consistency of results (Kollet et al., 2017). In this study, we met this criterion by comparing the scour depth estimations from XGBoost, PSO–XGBoost, RFO–XGBoost, and RPSO–XGBoost with those of 28 existing equations in the literature (see Table 7); (2) Identify the importance of input features to ensure an alignment with the physical governing principles. Herein, the SHAP method is used to find the relative significance of each input feature on scour depth predictions. The results of SHAP are consistent with those of Bateni et al. (2007a, 2007b), Najafzadeh and Barani (2011), Sharafi et al. (2016), Ebtehaj et al. (2019), and Shamshirband et al. (2020); (3) Evaluate the performance of ML models using unseen data to provide additional evidence of their physical consistency and generalization capability. This study assessed the feasibility of all ML models via the testing dataset, which is kept hidden from the models during the training and validation phases. Hence, this study considered all the criteria to ensure the physical consistency of our results.

5. Conclusions

Three novel hybrid ML methods (PSO–XGBoost, RFO–XGBoost, and RPSO–XGBoost) are applied to estimate the equilibrium scour depth around circular bridge piers. A comprehensive laboratory and field dataset containing 841 data points from 35 studies is used to train, validate, and test the proposed ML methods. The hyperparameters of the XGBoost method are tuned using three optimization techniques (PSO, RFO, and RPSO). The proposed models are trained using both dimensional and dimensionless datasets. The results are evaluated using three statistical metrics, i.e. the RMSE, MAE, and R2. All three proposed methods accurately predict the scour depth. The higher accuracy of the hybrid models over XGBoost shows that the three optimization techniques can efficiently tune the hyperparameters of XGBoost and enhance its accuracy in scour depth prediction.

For models trained by dimensional data, RPSO–XGBoost yields the best scour depth predictions with RMSE = 0.0101 m, MAE = 0.0023 m, and R2 = 0.9936 in training; RMSE = 0.0151 m, MAE = 0.0104 m, and R2 = 0.9526 in validation; and RMSE = 0.0398 m, MAE = 0.0207 m, and R2 = 0.9284 in the testing phase. Similarly, for dimensionless data, RPSO–XGBoost produces the most accurate scour depth predictions with RMSE = 0.1723, MAE = 0.0431, and R2 = 0.9765 in training; RMSE = 0.2065, MAE = 0.1354, and R2 = 0.9714 in validation; and RMSE = 0.3157, MAE = 0.1657, and R2 = 0.8164 in testing. Moreover, more accurate results can be achieved when the proposed models are trained with dimensional data rather than dimensionless data.

The results of the proposed methods are compared with those of 28 existing methods, and their superiority over the existing techniques is demonstrated. The MAE (RMSE) of RPSO–XGBoost (the best method from our study) is 55.95% (39.14%) lower than that of the model proposed by Shamshirband et al. (2020) (the best method among the 28 studies). Finally, the aggregated SHAP values of each variable for both dimensional and dimensionless datasets are used to determine the parameters that most notably affect the scour depth predictions and to clarify the input–output relationships. The SHAP analysis shows that among the dimensional variables, the pier diameter (D) and critical velocity (Vc) have the most and least notable effects on the scour depth estimation, respectively. Higher values of D, flow depth (Y), and flow velocity (V) yield larger values of the scour depth, whereas higher median grain size (d50) and Vc lead to lower values of the scour depth. SHAP analysis on the dimensionless data shows that D/Y and Re have the most and least notable contributions to dse/Y, respectively.

This finding highlights the marginal impact of Re on the scour depth around circular bridge piers.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Additional information

Funding

This study has been made possible by the Hawaii Department of Transportation (HDOT) and Federal Highway Administration (FHWA) grant DOT-10-030 to the University of Hawaii at Manoa.

References

  • Abd El-Hady Rady, R. (2020). Prediction of local scour around bridge piers: Artificial-intelligence-based modeling versus conventional regression methods. Applied Water Science, 10(2), 1–11. https://doi.org/10.1007/s13201-020-1140-4
  • Aksoy, A. O., Bombar, G., Arkis, T., & Guney, M. S. (2017). Study of the time-dependent clear water scour around circular bridge piers. Journal of Hydrology and Hydromechanics, 65(1), 26–34. https://doi.org/10.1515/johh-2016-0048
  • Aksoy, A. O., & Eski, O. Y. (2016). Experimental investigation of local scour around circular bridge piers under steady state flow conditions. Journal of the South African Institution of Civil Engineering, 58(3), 21–27. https://doi.org/10.17159/2309-8775/2016/v58n3a3
  • Alizamir, M., & Sobhanardakani, S. (2018). An artificial neural network-particle swarm optimization (ANN-PSO) approach to predict heavy metals contamination in groundwater resources. Jundishapur Journal of Health Sciences, 10(2), 2. https://doi.org/10.5812/jjhs.67544
  • Ansari, S. A., & Qadar, A. (1994). Ultimate depth of scour around bridge piers. Hydraulic Engineering, 51–55.
  • Arneson, L. A., Zevenbergen, L. W., Lagasse, P. F., & Clopper, P. E. (2012). Evaluating scour at bridges. National Highway Institute (US).
  • Bateni, S. M., Borghei, S. M., & Jeng, D. S. (2007a). Neural network and neuro-fuzzy assessments for scour depth around bridge piers. Engineering Applications of Artificial Intelligence, 20(3), 401–414. https://doi.org/10.1016/j.engappai.2006.06.012
  • Bateni, S. M., Jeng, D. S., & Melville, B. W. (2007b). Bayesian neural networks for prediction of equilibrium and time-dependent scour depth around bridge piers. Advances in Engineering Software, 38(2), 102–111. https://doi.org/10.1016/j.advengsoft.2006.08.004
  • Blench, T. (1969). Mobile-bed fluviology. University of Alberta Press, Edmonton.
  • Blench, T., Bradley, J. N., Joglekar, D. V., Bauer, W. J., Tison, L. J., Chitale, S. V., Thomas, A. R., Ahmad, M., & Romita, P. L. (1962). Discussion of “scour at bridge crossings”. Transactions of the American Society of Civil Engineers, 127(1), 180–207. https://doi.org/10.1061/TACEAT.0008391
  • Breusers, H. N. C. (1965). Scouring around drilling platforms. Bulletin, Hydraulic Research, IAHR, 19, 276.
  • Breusers, H. N. C., Nicollet, G., & Shen, H. W. (1977). Local scour around cylindrical piers. Journal of Hydraulic Research, 15(3), 211–252. https://doi.org/10.1080/00221687709499645
  • Brownlee, J. (2020, August 14). Machine Learning Mastery. https://machinelearningmastery.com/.
  • Chabert, J., & Engeldinger, P. (1956). Étude des affouillements autour des piles de ponts, Laboratoire National d’Hydraulique, Chatou, France.
  • Chen, T., He, T., Benesty, M., Khotilovich, V., Tang, Y., Cho, H., & Chen, K. (2015). Xgboost: Extreme gradient boosting. R Package Version 0.4-2, 1(4), 1–4.
  • Chiew, Y. M. (1984). Local scour at bridge piers. Auckland University.
  • Chitale, S. v. (1962). Discussion of scour at bridge crossing. Transactions of the American Society of Civil Engineers, 127(1), 191–196.
  • Choi, S. U., Choi, B., & Lee, S. (2017). Prediction of local scour around bridge piers using the ANFIS method. Neural Computing and Applications, 28(2), 335–344. https://doi.org/10.1007/s00521-015-2062-1
  • Chou, J. S., & Nguyen, N. M. (2022). Scour depth prediction at bridge piers using metaheuristics-optimized stacking system. Automation in Construction, 140. https://doi.org/10.1016/j.autcon.2022.104297
  • Coleman, N. L. (1971). Analyzing laboratory measurements of scour at cylindrical piers in sand beds. Proc., 14th Congress I.A.H.R., International Association of Hydraulic Research.
  • Cui, W., Zhao, L., Xu, Y., & Mamlooki, M. (2022). The compressive strength prediction for FRP-confined concrete in circular columns by applying the normalized AlexNet-ELM and the advanced red fox optimization algorithm. Advanced Theory and Simulations, 5(4), 2100410. https://doi.org/10.1002/adts.202100410
  • Dang, N. M., Tran Anh, D., & Dang, T. D. (2021). ANN optimized by PSO and firefly algorithms for predicting scour depths around bridge piers. Engineering with Computers, 37(1), 293–303. https://doi.org/10.1007/s00366-019-00824-y
  • Demir, S., & Sahin, E. K. (2023). Predicting occurrence of liquefaction-induced lateral spreading using gradient boosting algorithms integrated with particle swarm optimization: PSO-XGBoost, PSO-LightGBM, and PSO-CatBoost. Acta Geotechnica. https://doi.org/10.1007/s11440-022-01777-1
  • Dey, S., Bose, S. K., & Sastry, G. L. N. (1995). Clear water scour at circular piers: A model. Journal of Hydraulic Engineering, 121(12), 869–876. https://doi.org/10.1061/(ASCE)0733-9429(1995)121:12(869)
  • Dikshit, A., & Pradhan, B. (2021). Explainable AI in drought forecasting. Machine Learning with Applications, 6, 100192. https://doi.org/10.1016/j.mlwa.2021.100192
  • Ebtehaj, I., Bonakdari, H., Zaji, A. H., & Sharafi, H. (2019). Sensitivity analysis of parameters affecting scour depth around bridge piers based on the non-tuned, rapid extreme learning machine method. Neural Computing and Applications, 31(12), 9145–9156. https://doi.org/10.1007/s00521-018-3696-6
  • Ebtehaj, I., Sattar Ahmed, M. A., Bonakdari, H., & Zaji, A. H. (2017). Prediction of scour depth around bridge piers using self-adaptive extreme learning machine. Journal of Hydroinformatics, 19(2), 207–224. https://doi.org/10.2166/hydro.2016.025
  • Einstein, A. (1916). Relativity: The special and general theory, 1920 translation edn. H. Holt and Company.
  • Ettema, R. (1976). Influence of bed material gradation on local scour. University of Auckland.
  • Ettema, R. (1980). Scour at bridge piers: Report No. 216. University of Auckland, School Of.
  • Ettema, R., Kirkil, G., & Muste, M. (2006). Similitude of large-scale turbulence in experiments on local scour at cylinders. Journal of Hydraulic Engineering, 132(1), 33–40. https://doi.org/10.1061/(ASCE)0733-9429(2006)132:1(33)
  • Fael, C., Lança, R., & Cardoso, A. (2016). Effect of pier shape and pier alignment on the equilibrium scour depth at single piers. International Journal of Sediment Research, 31(3), 244–250. https://doi.org/10.1016/j.ijsrc.2016.04.001
  • Friedman, J. H. (2001). Greedy function approximation: A gradient boosting machine. Annals of Statistics, 1189–1232.
  • Friedman, J. H. (2002). Stochastic gradient boosting. Computational Statistics & Data Analysis, 38(4), 367–378. https://doi.org/10.1016/S0167-9473(01)00065-2
  • Froehlich, D. C. (1988). Analysis of onsite measurements of scour at piers. Hydraulic engineering: Proceedings of the 1988 national conference on hydraulic engineering, 534–539.
  • Graf, W. H. (1995). Load scour around piers. Annual Report, Ecole Polytechnique Federal de Lausanne.
  • Gu, K., Wang, J., Qian, H., & Su, X. (2021). Study on intelligent diagnosis of rotor fault causes with the PSO-XGBoost algorithm. Mathematical Problems in Engineering, 2021. https://doi.org/10.1155/2021/9963146
  • Hancu, S. (1971). Sur le calcul des affouillements locaux dans la zone des piles des ponts. Proceedings of the 14th IAHR Congress, Paris, France, 3(1), 299–313.
  • Hu, R., Wang, X., Liu, H., Leng, H., & Lu, Y. (2022). Scour characteristics and equilibrium scour depth prediction around a submarine piggyback pipeline. Journal of Marine Science and Engineering, 10(3), 350. https://doi.org/10.3390/jmse10030350
  • Inglis, S. C. (1949). Maximum depth of scour flatheads of guide bands and groynes, pier noses, and downstream bridges-the behavior and control of rivers and canals. Indian waterways experimental station, Poona, India, 327–348.
  • Ishfaque, M., Salman, S., Jadoon, K. Z., Danish, A. A. K., Bangash, K. U., & Qianwei, D. (2022). Understanding the effect of hydro-climatological parameters on Dam seepage using shapley additive explanation (SHAP): A case study of earth-fill tarbela Dam, Pakistan. Water, 14(17), 2598. https://doi.org/10.3390/w14172598
  • Jain, S. C. (1981). Maximum clear-water scour around circular piers. Journal of the Hydraulics Division, 107(5), 611–626. https://doi.org/10.1061/JYCEAJ.0005667
  • Jain, S. C., & Fischer, E. E. (1979). Scour around bridge piers at high Froude numbers. FHWA-RD- 79-104, Federal Highway Administration, U.S. Department of Transportation, Washington, D.C., U.S.A.
  • Jalil, A., Faiz, R. b., Alyahya, S., & Maddeh, M. (2022). Impact of optimal feature selection using hybrid method for a multiclass problem in cross project defect prediction. Applied Sciences, 12(23), 12167. https://doi.org/10.3390/app122312167
  • Kennedy, J., & Eberhart, R. (1995). Particle swarm optimization. Proceedings of ICNN’95-International Conference on Neural Networks, 4, 1942–1948. https://doi.org/10.1109/ICNN.1995.488968
  • Khan, M., Tufail, M., Ajmal, M., Haq, Z. U., & Kim, T. W. (2017). Experimental analysis of the scour pattern modeling of scour depth around bridge piers. Arabian Journal for Science and Engineering, 42(9), 4111–4130. https://doi.org/10.1007/s13369-017-2599-7
  • Khattak, A., Chan, P. W., Chen, F., & Peng, H. (2022). Prediction and interpretation of low-level wind shear criticality based on its altitude above runway level: Application of Bayesian optimization–ensemble learning classifiers and SHapley additive exPlanations. Atmosphere, 13(12), 2102. https://doi.org/10.3390/atmos13122102
  • Kollet, S., Sulis, M., Maxwell, R. M., Paniconi, C., Putti, M., Bertoldi, G., Coon, E. T., Cordano, E., Endrizzi, S., & Kikinzon, E. (2017). The integrated hydrologic model intercomparison project, IH-MIP2: A second set of benchmark results to diagnose integrated hydrology and feedbacks. Water Resources Research, 53(1), 867–890. https://doi.org/10.1002/2016WR019191
  • Kothyari, U. C., Garde, R. C. J., & Ranga Raju, K. G. (1992). Temporal variation of scour around circular bridge piers. Journal of Hydraulic Engineering, 118(8), 1091–1106. https://doi.org/10.1061/(ASCE)0733-9429(1992)118:8(1091)
  • Kumar, A. (2007). Scour around circular compound bridge piers. Indian Institute of Technology Roorkee.
  • Kumar, A. (2011). Scour around circular piers founded in clay-sand-gravel sediment mixture. [Doctoral dissertation]. Indian Institute of Technology Roorkee.
  • Lança, R. M., Fael, C. S., Maia, R. J., Pêgo, J. P., & Cardoso, A. H. (2013). Clear-water scour at comparatively large cylindrical piers. Journal of Hydraulic Engineering, 139(11), 1117–1125. https://doi.org/10.1061/(asce)hy.1943-7900.0000788
  • Landers, M. N., & Mueller, D. S. (1996). Channel scour at bridges in the United States.
  • Larras, J. (1963). Profondeurs maximales d’érosion des fonds mobiles autour des piles en rivière. Annales des Ponts et Chaussées, 133, 411–424.
  • Laursen, E. M., & Toch, A. (1956). Scour around bridge piers and abutments (Vol. 4). Iowa Highway Research Board Ames.
  • Le, L. T., Nguyen, H., Zhou, J., Dou, J., & Moayedi, H. (2019). Estimating the heating load of buildings for smart city planning using a novel artificial intelligence technique PSO-XGBoost. Applied Sciences, 9(13), 2714. https://doi.org/10.3390/app9132714
  • Lee, S. O., & Sturm, T. W. (2009). Effect of sediment size scaling on physical modeling of bridge pier scour. Journal of Hydraulic Engineering, 135(10), 793–802. https://doi.org/10.1061/(ASCE)HY.1943-7900.0000091
  • Lodhi, A. S., Jain, R. K., Sharma, P. K., & Karna, N. (2014). Time evolution of clear water bridge pier scour. Proceedings of international civil engineering symposium, VIT University Vellore, India, 252–260.
  • López, G., Teixeira, L., Ortega-Sánchez, M., & Simarro, G. (2014). Estimating final scour depth under clear-water flood waves. Journal of Hydraulic Engineering, 140(3), 328–332. https://doi.org/10.1061/(ASCE)HY.1943-7900.0000804
  • Lucca, G., Junior, G. B., Cardoso, A., Paiva, D., Acatauassú, R., & Gattass, M. (2021). Automatic method for classifying COVID-19 patients based on chest X-ray images, using deep features and PSO-optimized XGBoost. Expert Systems with Applications, 183. https://doi.org/10.1016/j.eswa.2021.115452
  • Lundberg, S. M., Erion, G., Chen, H., DeGrave, A., Prutkin, J. M., Nair, B., Katz, R., Himmelfarb, J., Bansal, N., & Lee, S.-I. (2020). From local explanations to global understanding with explainable AI for trees. Nature Machine Intelligence, 2(1), 56–67. https://doi.org/10.1038/s42256-019-0138-9
  • Lundberg, S. M., & Lee, S.-I. (2017). A unified approach to interpreting model predictions. Advances in Neural Information Processing Systems, 30, 4765–4774.
  • Lyn, D. A. (2008). Pressure-flow scour: A reexamination of the HEC-18 equation. Journal of Hydraulic Engineering, 134(7), 1015–1020.
  • Ma, M., Zhao, G., He, B., Li, Q., Dong, H., Wang, S., & Wang, Z. (2021). XGBoost-based method for flash flood risk assessment. Journal of Hydrology, 598, 126382. https://doi.org/10.1016/j.jhydrol.2021.126382
  • Mangalathu, S., Hwang, S. H., & Jeon, J. S. (2020). Failure mode and effects analysis of RC members based on machine-learning-based SHapley Additive exPlanations (SHAP) approach. Engineering Structures, 219. https://doi.org/10.1016/j.engstruct.2020.110927
  • Melville, B. W. (1997). Pier and abutment scour: Integrated approach. Journal of Hydraulic Engineering, 123(2), 125–136. https://doi.org/10.1061/(ASCE)0733-9429(1997)123:2(125)
  • Melville, B. W., & Chiew, Y.-M. (1999). Time scale for local scour at bridge piers. Journal of Hydraulic Engineering, 125(1), 59–65. https://doi.org/10.1061/(ASCE)0733-9429(1999)125:1(59)
  • Melville, B. W., & Coleman, S. E. (2000). Bridge scour. Water Resources Publication.
  • Melville, B. W., & Sutherland, A. J. (1988). Design method for local scour at bridge piers. Journal of Hydraulic Engineering, 114(10), 1210–1226. https://doi.org/10.1061/(ASCE)0733-9429(1988)114:10(1210)
  • Mia, M. F., & Nago, H. (2003). Design method of time-dependent local scour at circular bridge pier. Journal of Hydraulic Engineering, 129(6), 420–427. https://doi.org/10.1061/(ASCE)0733-9429(2003)129:6(420)
  • Mueller, D. S., & Wagner, C. R. (2005). Field observations and evaluations of streambed scour at bridges. Federal Highway Administration. Office of Research.
  • Najafzadeh, M., & Barani, G. A. (2011). Comparison of group method of data handling based genetic programming and back propagation systems to predict scour depth around bridge piers. Scientia Iranica, 18(6), 1207–1213. https://doi.org/10.1016/j.scient.2011.11.017
  • Najafzadeh, M., & Oliveto, G. (2020). Riprap incipient motion for overtopping flows with machine learning models. Journal of Hydroinformatics, 22(4), 749–767. https://doi.org/10.2166/hydro.2020.129
  • Natarajan, R., Megharaj, G., Marchewka, A., Divakarachari, P. B., & Hans, M. R. (2022). Energy and distance based multi-objective red fox optimization algorithm in wireless sensor network. Sensors, 22(10), 3761. https://doi.org/10.3390/s22103761
  • Neill, C. R. (1964). River-bed scour: A review for engineers (Technical Publication No. 23). Canadian Good Roads Association, Ottawa, Canada.
  • Neill, C. R. (1973). Guide to bridge hydraulics. Roads and Transportation Association of Canada, University of Toronto Press, Toronto, Canada, 191 pp.
  • Nguyen, D. H., Le, X. H., Heo, J. Y., & Bae, D. H. (2021). Development of an extreme gradient boosting model integrated with evolutionary algorithms for hourly water level prediction. IEEE Access, 9, 125853–125867. https://doi.org/10.1109/ACCESS.2021.3111287
  • Ni, L., Wang, D., Wu, J., Wang, Y., Tao, Y., Zhang, J., & Liu, J. (2020). Streamflow forecasting using extreme gradient boosting model coupled with Gaussian mixture model. Journal of Hydrology, 586, 124901. https://doi.org/10.1016/j.jhydrol.2020.124901
  • Oliveto, G., & Hager, W. H. (2002). Temporal evolution of clear-water pier and abutment scour. Journal of Hydraulic Engineering, 128(9), 811–820. https://doi.org/10.1061/(ASCE)0733-9429(2002)128:9(811)
  • Osman, A. I. A., Ahmed, A. N., Chow, M. F., Huang, Y. F., & El-Shafie, A. (2021). Extreme gradient boosting (Xgboost) model to predict the groundwater levels in Selangor Malaysia. Ain Shams Engineering Journal, 12(2), 1545–1556. https://doi.org/10.1016/j.asej.2020.11.011
  • Pandey, M., Oliveto, G., Pu, J. H., Sharma, P. K., & Ojha, C. S. P. (2020a). Pier scour prediction in non-uniform gravel beds. Water, 12(6), 1696. https://doi.org/10.3390/w12061696
  • Pandey, M., Sharma, P. K., Ahmad, Z., & Karna, N. (2018). Maximum scour depth around bridge pier in gravel bed streams. Natural Hazards, 91(2), 819–836. https://doi.org/10.1007/s11069-017-3157-z
  • Pandey, M., Zakwan, M., Khan, M. A., & Bhave, S. (2020b). Development of scour around a circular pier and its modelling using genetic algorithm. Water Supply, 20(8), 3358–3367. https://doi.org/10.2166/ws.2020.244
  • Pham, B. T., Qi, C., Ho, L. S., Nguyen-Thoi, T., Al-Ansari, N., Nguyen, M. D., Nguyen, H. D., Ly, H.-B., Le, H. V., & Prakash, I. (2020). A novel hybrid soft computing model using random forest and particle swarm optimization for estimation of undrained shear strength of soil. Sustainability, 12(6), 2218. https://doi.org/10.3390/su12062218
  • Połap, D., & Woźniak, M. (2021). Red fox optimization algorithm. Expert Systems with Applications, 166, 114107. https://doi.org/10.1016/j.eswa.2020.114107
  • Qiu, Y., Zhou, J., Khandelwal, M., Yang, H., Yang, P., & Li, C. (2021). Performance evaluation of hybrid WOA-XGBoost, GWO-XGBoost and BO-XGBoost models to predict blast-induced ground vibration. Engineering with Computers. https://doi.org/10.1007/s00366-021-01393-9
  • Raikar, R. V., & Dey, S. (2005). Clear water scour at bridge piers in fine and medium gravel beds. Canadian Journal of Civil Engineering, 32(4), 775–781. https://doi.org/10.1139/L05-022
  • Richardson, E. V., & Davis, S. R. (2001). Evaluating scour at bridges. Federal Highway Administration. Office of Bridge Technology.
  • Roder, M., de Rosa, G. H., Passos, L. A., Papa, J. P., & Rossi, A. L. D. (2020). Harnessing particle swarm optimization through relativistic velocity. 2020 IEEE congress on evolutionary computation, CEC 2020 - conference proceedings.
  • Shamshirband, S., Mosavi, A., & Rabczuk, T. (2020). Particle swarm optimization model to predict scour depth around a bridge pier. Frontiers of Structural and Civil Engineering, 14(4), 855–866. https://doi.org/10.1007/s11709-020-0619-2
  • Shapley, L. S. (1953). A value for n-person games.
  • Sharafi, H., Ebtehaj, I., Bonakdari, H., & Zaji, A. H. (2016). Design of a support vector machine with different kernel functions to predict scour depth around bridge piers. Natural Hazards, 84(3), 2145–2162. https://doi.org/10.1007/s11069-016-2540-5
  • Shen, H. W. (1971). Scour near piers. In River mechanics (Vol. II, Chapter 23). Ft. Collins, Colo.
  • Shen, H. W., Schneider, V. R., & Karaki, S. (1969). Local scour around bridge piers. Journal of the Hydraulics Division, 95(6), 1919–1940. https://doi.org/10.1061/JYCEAJ.0002197
  • Sheppard, D. M. (2004). Overlooked local sediment scour mechanism. Transportation Research Record: Journal of the Transportation Research Board, 1890(1), 107–111. https://doi.org/10.3141/1890-13
  • Sheppard, D. M., Melville, B., & Demir, H. (2010). Scour at wide piers and long skewed piers: Final report. Prepared for NCHRP, Transportation Research Board of the National Academies.
  • Sheppard, D. M., Melville, B., & Demir, H. (2014). Evaluation of existing equations for local scour at bridge piers. Journal of Hydraulic Engineering, 140(1), 14–23. https://doi.org/10.1061/(asce)hy.1943-7900.0000800
  • Shi, R., Xu, X., Li, J., & Li, Y. (2021). Prediction and analysis of train arrival delay based on XGBoost and Bayesian optimization. Applied Soft Computing, 109. https://doi.org/10.1016/j.asoc.2021.107538
  • Sreedhara, B. M., Patil, A. P., Pushparaj, J., Kuntoji, G., & Naganna, S. R. (2021). Application of gradient tree boosting regressor for the prediction of scour depth around bridge piers. Journal of Hydroinformatics, 23(4), 849–863. https://doi.org/10.2166/hydro.2021.011
  • Sreedhara, B. M., Rao, M., & Mandal, S. (2019). Application of an evolutionary technique (PSO–SVM) and ANFIS in clear-water scour depth prediction around bridge piers. Neural Computing and Applications, 31(11), 7335–7349. https://doi.org/10.1007/s00521-018-3570-6
  • Taormina, R., & Chau, K. (2015). Neural network river forecasting with multi-objective fully informed particle swarm optimization. Journal of Hydroinformatics, 17(1), 99–113. https://doi.org/10.2166/hydro.2014.116
  • Tavouktsoglou, N. S., Harris, J. M., Simons, R. R., & Whitehouse, R. J. S. (2017). Equilibrium scour-depth prediction around cylindrical structures. Journal of Waterway, Port, Coastal, and Ocean Engineering, 143(5), 04017017.
  • Venkatesan, E., & Mahindrakar, A. B. (2019). Forecasting floods using extreme gradient boosting-a new approach. International Journal of Civil Engineering and Technology, 10(2), 1336–1346.
  • Wang, C., Yu, X., & Liang, F. (2017). A review of bridge scour: Mechanism, estimation, monitoring and countermeasures. Natural Hazards, 87(3), 1881–1906. https://doi.org/10.1007/s11069-017-2842-2
  • Wardhana, K., & Hadipriono, F. C. (2003). Analysis of recent bridge failures in the United States. Journal of Performance of Constructed Facilities, 17(3), 144–150. https://doi.org/10.1061/(ASCE)0887-3828(2003)17:3(144)
  • Williams, P., Bolisetti, T., & Balachandar, R. (2017). Evaluation of governing parameters on pier scour geometry. Canadian Journal of Civil Engineering, 44(1), 48–58. https://doi.org/10.1139/cjce-2016-0133
  • Wu, J., Ma, D., & Wang, W. (2022). Leakage identification in water distribution networks based on XGBoost algorithm. Journal of Water Resources Planning and Management, 148(3), 04021107. https://doi.org/10.1061/(ASCE)WR.1943-5452.0001523
  • Yanmaz, A. M., & Altinbilek, H. D. (1991). Study of time-dependent local scour around bridge piers. Journal of Hydraulic Engineering, 117(10), 1247–1268. https://doi.org/10.1061/(ASCE)0733-9429(1991)117:10(1247)
  • Yu, J., Zheng, W., Xu, L., Zhangzhong, L., Zhang, G., & Shan, F. (2020). A PSO-XGBoost model for estimating daily reference evapotranspiration in the solar greenhouse. Intelligent Automation and Soft Computing, 26(5), 989–1003. https://doi.org/10.32604/iasc.2020.010130
  • Zhang, X., Nguyen, H., Bui, X., Tran, Q., Nguyen, D., Bui, D. T., & Moayedi, H. (2020). Novel soft computing model for predicting blast-induced ground vibration in open-pit mines based on particle swarm optimization and XGBoost. Natural Resources Research, 29(2), 711–721. https://doi.org/10.1007/s11053-019-09492-7
  • Zhang, Y., Lin, R., Zhang, H., & Peng, Y. (2022). Vibration prediction and analysis of strip rolling mill based on XGBoost and Bayesian optimization. Complex and Intelligent Systems, 9(1), 133–145.
  • Zhou, J., Qiu, Y., Zhu, S., & Jahed, D. (2021). Estimation of the TBM advance rate under hard rock conditions using XGBoost and Bayesian optimization. Underground Space, 6(5), 506–515. https://doi.org/10.1016/j.undsp.2020.05.008
  • Zounemat-Kermani, M., Beheshti, A. A., Ataie-Ashtiani, B., & Sabbagh-Yazdi, S. R. (2009). Estimation of current-induced scour depth around pile groups using neural network and adaptive neuro-fuzzy inference system. Applied Soft Computing Journal, 9(2), 746–755. https://doi.org/10.1016/j.asoc.2008.09.006

Appendix

Table A presents several well-known equations to determine the depth of scour near bridge piers. In the equations, dse is the equilibrium scour depth, Y is the flow depth, D is the pier diameter, V is the flow velocity, Vc is the critical velocity, and Fr is the Froude number.

Table A. Well-known equations for the estimation of scour depth around bridge piers