Research Article

Fluvial bedload transport modelling: advanced ensemble tree-based models or optimized deep learning algorithms?

Article: 2346221 | Received 18 Jan 2024, Accepted 15 Apr 2024, Published online: 10 May 2024

Abstract

The potential of advanced tree-based models and optimized deep learning algorithms to predict fluvial bedload transport was explored, with the aim of identifying the most flexible and accurate algorithm and the optimum set of readily available and reliable inputs. Using 926 datasets for 20 rivers, the performance of three groups of models was tested: (1) standalone tree-based models Alternating Model Tree (AMT) and Dual Perturb and Combine Tree (DPCT); (2) ensemble tree-based models Iterative Absolute Error Regression (IAER), ensembled with AMT and DPCT; and (3) optimized deep learning models Long Short-Term Memory (LSTM) and Recurrent Neural Network (RNN) ensembled with Grey Wolf Optimizer. Comparison of the predictive performance of the models with that of commonly used empirical equations and sensitivity analysis of the driving variables revealed that: (i) the coarse grain-size percentile D90 was the most effective variable in bedload transport prediction (where Dx is the xth percentile of the bed surface grain size distribution), followed by D84, D50, flow discharge, D16, and channel slope and width; (ii) all tree-based models and optimized deep learning algorithms displayed ‘very good’ or ‘good’ performance, outperforming empirical equations; and (iii) all algorithms performed best when all input parameters were used. Thus, a range of different input variable combinations must be considered in the optimization of these models. Overall, ensemble algorithms provided more accurate predictions of bedload transport than their standalone counterparts. In particular, the ensemble tree-based model IAER-AMT performed most strongly, displaying great potential to produce robust predictions of bedload transport in coarse-grained rivers based on a few readily available flow and channel variables.

1. Introduction

Bedload transport is the key driver of morphological change in coarse-grained rivers, exacerbating flooding (e.g. Nones, 2019) and posing risks to infrastructure (e.g. Feeney et al., 2022; Li et al., 2021) and benthic habitats (e.g. Fisher et al., 1982). Predicting bedload transport rate accurately is a major challenge due to the vast number of flow and channel properties that control bedload transport, its non-linear relationship with these variables, its stochastic nature, and the high complexity of its spatio-temporal patterns. Influential variables include upstream source of sediment supply, storage, and delivery (Gao, 2011), river channel characteristics such as slope, width, riverbed structure, and roughness (e.g. Zhang et al., 2010), bed material size and its variation (e.g. Recking et al., 2023), and river flow properties such as discharge and bed shear stress (e.g. Gomez and Church, 1989).

Direct measurement of bedload is costly, time-consuming, and associated with high uncertainty, particularly during flooding (Graf, 1971). To overcome these difficulties, a vast array of laboratory flume experiments have been conducted under different flow and bed material conditions, from which many empirical equations have been developed, e.g. those reported by Meyer-Peter and Müller (1948), Einstein (1950), Bagnold (1966), Wilcock and Crowe (2003), and Recking (2013). For example, Poorhosein et al. (2014) developed two types of empirical/linear equations for bedload transport rate prediction, one based on hydraulic parameters and one based on geometric parameters, and found good predictive performance for both types. They also identified Froude number, Shields parameter, and shape factor as the three most effective hydraulic variables in bedload transport prediction, while grain size distribution and water channel slope were the most important and effective geometric variables (Poorhosein et al., 2014). Using 2600 datasets, Hinton et al. (2018) tested a number of empirical equations, including those developed by Barry et al. (2004), Parker (1990; both calibrated and uncalibrated), Meyer-Peter and Müller (1948), Wilcock (2001), Rosgen et al. (2006; ‘Pagosa good condition’), Elhakeem and Imran (2016), and Recking (2013). Their results showed that the ‘Pagosa good condition’ and Barry et al. equations outperformed the others, while the Meyer-Peter and Müller (1948) and uncalibrated Parker (1990) equations gave the lowest predictive power.

Alternatively, bedload transport can be predicted using numerical approaches, which attempt to mathematically represent the physics behind the processes of entrainment, transportation, and deposition. For example, Jilani and Hashemi (2013) developed a smoothed particle hydrodynamics (SPH) model and found it to be reliable and efficient, while Barzgaran et al. (2019) developed and implemented a second-order finite volume method and wave propagation algorithm and found it to be efficient. Both models have been successfully applied in later studies, but model implementation is difficult, they require vast amounts of data for calibration and validation, and calibration is time-consuming, limiting their wider application. Various approaches have been employed to simplify these models, including prediction of flow variables using a depth-averaged method, Manning’s (1891) equation with estimates of the Manning roughness coefficient, and the use of transport capacity equations under unlimited sediment supply conditions (Mustafa et al., 2017; Shahiri et al., 2016; Wainwright et al., 2015).

The use of machine learning (ML) models in hydrology and river science, and in many other fields of study, is now increasing. These models seek to find a robust relationship between readily available input and output parameters. The main advantages of ML models are that they are user-friendly, require only small amounts of data, are simple and fast to calibrate, are able to handle large amounts of data, and have a non-linear structure that is able to replicate complicated environmental behaviour (e.g. Asheghi & Hosseini, 2020; Hosseiny et al., 2023; Khosravi et al., 2020; Kisi & Yaseen, 2019; Latif et al., 2023; Roushangar & Koosheh, 2015).

Artificial Neural Network (ANN) is one of the oldest and most widely used ML models in hydrology and water science. Hosseiny et al. (2023) found an ANN model to be efficient in the prediction of bedload transport based on 8117 measurements from 134 rivers. However, ANN algorithms have slow convergence during the training procedure, high errors in the modelling phase, and low generalization power (Kisi et al., 2012). Thus, ANN algorithms have poor predictive power when the range of the testing dataset is outside the range of the training data (Kisi et al., 2016; Melesse et al., 2011), and they require a large dataset to achieve reasonable results. To overcome this weakness, ANN algorithms have been ensembled with fuzzy logic algorithms to create Adaptive Neural Fuzzy Inference System (ANFIS) models. Riahi-Madvar and Seifi (2018) developed an ANFIS model for bedload transport prediction and found that it outperformed an ANN model. However, in other environmental fields of study, ANFIS models have been found to be poor at finding the best weight parameters, which heavily influences the prediction accuracy (Tien Bui et al., 2016). Furthermore, ANFIS algorithms suffer from the need for a large number of model operators, each of which must be set accurately, especially the weights of the membership functions. Additionally, ANFIS algorithms lack a systematic approach in the design of fuzzy rules and in the choice of membership function variables (Tien Bui et al., 2016; Khosravi et al., 2018).

The ANFIS model is neuron-based and several other algorithms of this type, such as Support Vector Regression (SVR), have been widely used in river science. For example, Roushangar and Koosheh (2015) developed a hybridized model, SVR-GA, by combining SVR with the Genetic Algorithm (GA) approach, and found that it had better predictive power than empirical equations of bedload transport rate. However, SVR models have many hyper-parameters, making calibration time-consuming and model implementation difficult (Ahmad et al., 2018). Generally, the prediction power of neuron-based models is improved when they are combined with metaheuristic models such as GA, heap-based optimizer (HBO), political optimizer (PO), teaching-learning based optimization (TLBO), backtracking search algorithm (BSA), and jellyfish search optimization (JFSO) (Moayedi et al., 2024; Vakharia et al., 2023).

New types of neuron-based models, called deep learning (DL) algorithms, have been developed to overcome the weaknesses of conventional ML models. The two main advantages of DL models are their greater flexibility and their ability to handle large and complex data, both structured and unstructured. Thus, DL models have higher predictive performance (Ghorbanzadeh et al., 2019). Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), and Long Short-Term Memory (LSTM) networks are among the most popular and widely used DL approaches, owing to their superior performance. For example, Latif et al. (2023) found that an LSTM model achieved better performance in prediction of bedload transport rate than SVR and ANN, while Shakya et al. (2023) found that a different DL algorithm, Deep Neural Network (DNN), performed better in prediction of total sediment load in rivers than SVR, linear regression (LR), and extreme learning machine (ELM) models.

Another type of ML model that is widely used in hydrology and water resources, especially for spatial modelling of natural hazards, is the tree-based algorithm, such as random forest (RF), M5Prime (M5P), and Reduced Error Pruning Tree (REPT). Khosravi et al. (2018) applied several tree-based models, including Logistic Model Trees (LMT), REPT, Naïve Bayes Trees (NBT), and Alternating Decision Trees (ADT), in flood susceptibility mapping in Iran and found that all models achieved very good performance, although ADT outperformed the other models. Rahmati et al. (2019) applied numerous tree-based models, including Rule-Based Decision Tree (RBDT), Boosted Regression Trees (BRT), Classification And Regression Tree (CART), and an RF model in land subsidence susceptibility mapping and found that the RF model achieved the best performance. Hussain and Khan (2020) developed an RF model for monthly river flow forecasting and found that it achieved around 18% and 34% higher performance (based on root mean square error, RMSE) than MLP and SVM, respectively. However, there is a significant knowledge gap regarding the potential of DL algorithms for bedload transport prediction. Thus, the challenge lies in establishing the most flexible and accurate algorithm for this purpose, and identifying readily available, reliable, and optimum inputs.

The aim of this study was to address this challenge through comparing the performance of empirical models, standalone and ensemble tree-based models, and optimized DL models in prediction of bedload transport rate in coarse-grained rivers. Specific objectives were to establish, using 926 datasets for 20 rivers: (1) the potential of tree-based and DL algorithms to provide accurate predictions using a few readily available and measurable river properties, such as channel size (width and slope), flow discharge, and sediment size; (2) the most effective variable in bedload transport prediction; (3) the most effective input variable combination in optimizing predictive power; and (4) the effect of hybridization and ensemble-based approaches on prediction accuracy. This study is the first to apply a wide range of tree-based and DL models in prediction of bedload transport and offers new insights into the potential of these algorithms to provide simple, fast, accurate, and efficient predictions of bedload transport.

2. Methodology

2.1. Data

The data used in the analysis comprised 926 sets of bedload transport rate for 20 rivers, compiled from BedloadWeb (http://en.bedloadweb.com; Recking, 2019) and Hosseiny et al. (2023; https://doi.org/10.5281/zenodo.7641313). In addition to measured bedload sediment transport rate per unit width (qb; g/m/s), the data included river bed slope (S; m/m), river discharge (Q; m³/s), river width (w; m), and bed surface sediment sizes (D16, D50, D84, and D90, where Dx is the xth percentile of the bed surface grain size distribution in m). Summary statistics on the dataset are presented in Table 1.

Table 1. Summary statistics on the training/testing data.

The datasets were split in two in a ratio of 70:30, with 633 datasets used for model development, calibration, and training (training data), and the remaining 293 datasets used for model validation and performance comparison (testing data). There is no consensus on how best to split data for training and testing, but a 70:30 split is the most widely used approach in spatial (e.g. Khosravi et al., 2018) and time series (e.g. Kouadio et al., 2018; Samadianfard et al., 2019) modelling by ML/DL. Although the training and testing datasets were selected randomly, a manual check was performed to ensure that they were separated correctly in terms of representing a range of qb values.
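The random split described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `split_indices` and the seed are hypothetical, the records are represented by their indices, and the manual check on the range of qb values is not reproduced.

```python
import random

N_TOTAL = 926   # number of datasets reported in the text
N_TRAIN = 633   # number of training datasets reported in the text


def split_indices(n_total, n_train, seed=42):
    """Shuffle record indices and cut them into training and testing sets.

    The seed is an arbitrary choice made here for reproducibility.
    """
    rng = random.Random(seed)
    indices = list(range(n_total))
    rng.shuffle(indices)
    return sorted(indices[:n_train]), sorted(indices[n_train:])


train_idx, test_idx = split_indices(N_TOTAL, N_TRAIN)
print(len(train_idx), len(test_idx))
```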

Three main approaches were used to construct different input data scenarios: a manual approach and two feature selection ML-based models, CfsSubsetEval (CSE) and Principal Component Analysis (PCA). These are the most common approaches among feature ranking methods, such as Fisher score, ReliefF, Wilcoxon rank, Gain ratio and Memetic feature (Vakharia et al., 2016).

2.2.1. Manual approach

Eight different data input scenarios were constructed and explored to find the most effective input combination (Table 2). First, the parameter/variable with the highest correlation coefficient was selected as the first input scenario to explore whether the most correlated parameter/variable was efficient in predicting qb individually. Then other variables with the second, third, fourth, etc. highest correlation coefficient were added step-by-step to construct the eight different input combinations.
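The step-by-step construction can be sketched as below. This is a minimal illustration, not the procedure used to build Table 2: the helper names are hypothetical, the toy values are invented for the example, and ranking by the absolute Pearson correlation is an assumption. The sketch yields one nested scenario per variable, so it does not reproduce the exact eight scenarios of the study.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def build_scenarios(features, target):
    """Rank variables by |r| with the target, then add them one at a time."""
    ranked = sorted(
        features,
        key=lambda name: abs(pearson(features[name], target)),
        reverse=True,
    )
    return [ranked[:k] for k in range(1, len(ranked) + 1)]


# Toy values only; these are not measurements from the dataset.
toy_target = [1, 2, 3, 4, 5]
toy_features = {
    "a": [1, 2, 3, 4, 5],
    "b": [2, 1, 4, 3, 5],
    "c": [5, 1, 4, 2, 3],
}
scenarios = build_scenarios(toy_features, toy_target)
```

Each scenario would then be passed to a model and scored by RMSE, as described in the text.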

Table 2. Input combination scenarios.

2.2.2. CfsSubsetEval approach

CfsSubsetEval is a correlation-based feature subset selection and multivariate filter evaluator approach that evaluates the worth of a subset of attributes by considering the individual predictive ability of each feature and the degree of redundancy between features (Hall, 1999). Subsets of features that are highly correlated with the class, but have low intercorrelation, are preferred. CSE is calculated as (Qiao et al., 2022):

$$\mathrm{CSE} = \max_{S_k}\left[\frac{r_{cf_1} + r_{cf_2} + \cdots + r_{cf_k}}{\sqrt{k + 2\left(r_{f_1 f_2} + \cdots + r_{f_i f_j} + \cdots + r_{f_k f_{k-1}}\right)}}\right] \tag{1}$$

where $S_k$ is a feature subset consisting of $k$ features, $r_{cf_i}$ is the correlation between input feature $f_i$ and the output target, and $r_{f_i f_j}$ is the intercorrelation between input features $f_i$ and $f_j$. This, along with the PCA approach, was implemented in Waikato Environment for Knowledge Analysis (WEKA) 3.9 software. The CSE approach produced input No. 3 in Table 2.
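To make the merit in Equation (1) concrete, the sketch below evaluates it for candidate subsets and picks the maximum by exhaustive search. Several choices here are assumptions rather than a description of WEKA: the function names are hypothetical, Pearson correlation is used for every pair, absolute values are taken so that negative correlations count as association, and the exhaustive search stands in for whatever search strategy the software applies. The toy values are invented for the example.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def cfs_merit(subset, features, target):
    """Merit of one subset: summed feature-target correlation over redundancy."""
    k = len(subset)
    r_cf = sum(abs(pearson(features[f], target)) for f in subset)
    r_ff = sum(
        abs(pearson(features[a], features[b]))
        for a, b in combinations(subset, 2)
    )
    return r_cf / math.sqrt(k + 2 * r_ff)


def best_subset(features, target):
    """Exhaustive search; feasible for a handful of variables such as seven."""
    names = list(features)
    candidates = (
        c for k in range(1, len(names) + 1) for c in combinations(names, k)
    )
    return max(candidates, key=lambda s: cfs_merit(s, features, target))


# Toy values only; "dup" is an exact copy of "a" to show the redundancy penalty.
toy_target = [1, 2, 3, 4, 5]
toy_features = {
    "a": [1, 2, 3, 4, 5],
    "b": [2, 1, 4, 3, 5],
    "c": [5, 1, 4, 2, 3],
}
with_copy = {"a": toy_features["a"], "dup": list(toy_features["a"])}
```

In this toy case, adding a perfectly redundant copy of a feature leaves the merit unchanged, which is the behaviour the denominator is designed to produce.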

2.2.3. Principal component analysis approach

Principal Component Analysis is a popular linear feature extractor used for unsupervised feature selection based on eigenvector analysis to identify critical original features for principal components. PCA is a statistical method applied to decrease the dimensionality of a dataset through linearly transforming the data into a new coordinate system where (most of) the variation in the data can be described with fewer dimensions than the initial data. The PCA approach produced input No. 5 in Table 2. All eight input combinations were implemented, and the resulting RMSE was calculated to assess the most efficient input combination.

Metaheuristic algorithms were applied to determine the most effective and optimum values of the DL model hyperparameters, using MATLAB programming software. In this approach, the Grey Wolf Optimizer (GWO) algorithm was combined with DL algorithms to identify optimum hyperparameter values automatically. For tree-based models, which were implemented in WEKA software, a basic trial and error approach was used for tuning model hyperparameters. This approach involved calculating the RMSE for the default values, and then considering higher and lower values, to identify the most effective values (see Tables A and B in supplementary material).

2.4. Model description

2.4.1. Dual perturb and combine tree (DPCT)

A DPCT model is a tree-based model for regression and classification. Perturb and combine (PC) algorithms develop and construct different subset models from the training dataset, and all predicted values are then combined to generate the final target value (Breiman, 1998). Geurts and Wehenkel (2005) showed that the PC model is reliable and delivers high accuracy. The DPCT model is a more advanced kind of PC model that builds only one model and delays the generation of multiple predictions to the prediction stage. These multiple predictions are produced by perturbing the attribute vector corresponding to a test case.

2.4.2. Alternating model tree (AMT)

Introduced by Frank et al. (2015), AMT is a type of regression tree-based model that uses forward additive regression (AR) and a cross-validation approach to build the tree model. This type of ensemble model benefits from numerous advanced algorithms for development and growing. AMT models grow based on two types of node: splitter nodes (which divide the quantitative attributes at the median value) and predictor nodes (which forecast the system’s response through linear regression) (Gao et al., 2019).

2.4.3. Iterative absolute error regression (IAER)

IAER iteratively fits a regression model by attempting to minimize absolute error, using a base learner that minimizes weighted squared error. Weights are bounded from below by 1.0 / Utils.SMALL. The algorithm resamples the data based on the weights if the base learner is not a WeightedInstancesHandler. More information can be found in Schlossmacher (1973).
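The idea of reaching an absolute-error fit through repeated weighted squared-error fits can be illustrated with a small reweighting loop. The sketch below is not the WEKA implementation: the base learner is a one-variable weighted linear regression chosen for brevity, the function names are hypothetical, and the weight rule (the inverse of the current absolute residual, floored at a small `eps`) is the textbook iteratively reweighted least squares choice, assumed here for illustration.

```python
def weighted_linear_fit(x, y, w):
    """Weighted least-squares line; this plays the role of the base learner."""
    sw = sum(w)
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / sw
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mean_x) * (yi - mean_y) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def iterative_absolute_error_fit(x, y, iterations=100, eps=1e-6):
    """Approximate a least-absolute-error line by iterative reweighting."""
    w = [1.0] * len(x)
    for _ in range(iterations):
        slope, intercept = weighted_linear_fit(x, y, w)
        # Points with large residuals are down-weighted in the next fit.
        w = [
            1.0 / max(abs(yi - (slope * xi + intercept)), eps)
            for xi, yi in zip(x, y)
        ]
    return slope, intercept


# Toy data: nine points on y = 2x + 1 plus one gross outlier.
xs = list(range(10))
ys = [2 * v + 1 for v in xs]
ys[9] = 100.0
ols_slope, ols_intercept = weighted_linear_fit(xs, ys, [1.0] * len(xs))
lad_slope, lad_intercept = iterative_absolute_error_fit(xs, ys)
```

On the toy data, the ordinary squared-error fit is pulled strongly towards the outlier, whereas the reweighted fit stays close to the line through the other nine points.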

2.4.4. Recurrent neural network (RNN)

The RNN model is a popular and robust DL model for sequential data modelling and prediction, and is a form of advanced bi-directional ANN model (i.e. it feeds back the output from some nodes to affect subsequent input to the same nodes). This process has a significant impact on the learning ability of the model. In other words, for each new input, the output is identified and then fed back as the modified input to the modelling process. This operation is continued until a constant output has been attained. RNN uses the same weights for each element of the sequence, decreasing the number of parameters and allowing the model to generalize to sequences of varying lengths.

2.4.5. Long short-term memory (LSTM)

LSTM is a type of RNN model which is capable of learning long-term dependencies, especially in time series problems or in processing sequential data (Hochreiter & Schmidhuber, 1997). LSTM is composed of memory blocks. These blocks are memory cells that are capable of storing or remembering sequential dataset/information through units called gates (Azzouni & Pujolle, 2017). Input gates, forget gates, and output gates are the three main gates in the LSTM network, and they control the flow of incoming information, amount of information retained from the previous memory, and flow of outgoing information, respectively (Vu et al., 2021). When networks in an LSTM model forget a previous hidden state, they are capable of combining memory blocks to cause the networks to learn.

2.4.6. Grey wolf optimizer (GWO)

GWO is one of the most flexible, popular, strong, and efficient metaheuristic algorithms that can be applied for ML model optimization, mimicking the leadership hierarchy and hunting mechanism of grey wolves in nature (Mirjalili, Mirjalili, & Lewis, 2014). The model structure is similar to a pyramid with four levels of alpha (α), beta (β), delta (δ), and omega (ω) wolves. Alpha wolves are located at the top of the pyramid and represent the optimal and most efficient solutions, made by the wolf leaders. Beta and delta wolves at the second and third level are responsible for sub-optimal decisions or are subservient wolves in decision-making (Li et al., 2021). Omega wolves at the bottom of the pyramid play the role of scapegoat. GWO achieves an efficient solution by updating the positions of other wolves according to the positions of α, β, and δ wolves.
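The position update driven by the α, β, and δ wolves can be sketched as follows. This is a minimal version of the commonly published GWO update, written under assumptions: the function and parameter names are hypothetical, a simple sphere function stands in for the objective (in the study the objective would be a model error as a function of DL hyperparameters, and the work was done in MATLAB), and the population size, iteration count, and seed are arbitrary.

```python
import random


def grey_wolf_optimize(objective, bounds, n_wolves=20, n_iter=200, seed=0):
    """Minimise `objective` over box `bounds` with a basic grey wolf update."""
    rng = random.Random(seed)
    dim = len(bounds)
    wolves = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_wolves)]
    best_pos, best_val = None, float("inf")

    for it in range(n_iter):
        wolves.sort(key=objective)
        if objective(wolves[0]) < best_val:
            best_pos, best_val = list(wolves[0]), objective(wolves[0])
        # The three best wolves (alpha, beta, delta) guide all others.
        leaders = [list(w) for w in wolves[:3]]
        a = 2.0 * (1 - it / n_iter)  # decreases linearly from 2 towards 0
        new_wolves = []
        for wolf in wolves:
            new = []
            for d in range(dim):
                guided = []
                for leader in leaders:
                    big_a = 2 * a * rng.random() - a
                    big_c = 2 * rng.random()
                    dist = abs(big_c * leader[d] - wolf[d])
                    guided.append(leader[d] - big_a * dist)
                lo, hi = bounds[d]
                # Average the three guided moves and keep the wolf in bounds.
                new.append(min(max(sum(guided) / 3, lo), hi))
            new_wolves.append(new)
        wolves = new_wolves

    final = min(wolves, key=objective)
    if objective(final) < best_val:
        best_pos, best_val = list(final), objective(final)
    return best_pos, best_val


def sphere(p):
    """Stand-in objective with its minimum of 0 at the origin."""
    return sum(v * v for v in p)


gwo_pos, gwo_val = grey_wolf_optimize(sphere, [(-5.0, 5.0), (-5.0, 5.0)])
```

As the control parameter `a` shrinks, the moves shift from exploration to exploitation around the three leaders.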

2.4.7. Einstein (1950) equation

The Einstein (1950) equation considers bedload transport as a probabilistic phenomenon, relating the flow intensity to the bedload transport rate:

$$q_{Bed} = 1 - \frac{1}{\sqrt{\pi}} \int_{-(0.413/\tau^{*})-2}^{(0.413/\tau^{*})-2} e^{-t^{2}}\,dt = \frac{43.5\,q^{*}}{1 + 43.5\,q^{*}} \tag{2}$$

where $\tau^{*}$ is Shields stress, $t$ is an integral parameter, and $q^{*}$ is the Einstein bedload number. More information about the Einstein (1950) equation can be found in Hosseiny et al. (2023).

2.4.8. Recking (2013) bedload equation

Recking (2013) developed a bedload transport equation based on 6319 field observations and 1317 flume measurements:

$$q_{Bed} = \frac{14\,\tau_{84}^{2.5}}{1 + \left(\tau_{m}/\tau_{84}\right)^{4}} \tag{3}$$

where $\tau_{m}$ is the non-dimensional mobility Shields stress related to the transition from partial to full mobility, and $\tau_{84}$ is the non-dimensional Shields stress related to bed surface sediment size D84.
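Equation (3) is simple enough to evaluate directly. The sketch below only computes the right-hand side from the two Shields stresses; the function name and the example value of 0.1 are hypothetical, and the calculation of the two Shields stresses from slope, discharge, width, and grain size is not covered in this section and is therefore left out.

```python
def recking_transport(tau_84, tau_m):
    """Right-hand side of Equation (3) for given non-dimensional Shields stresses."""
    if tau_84 <= 0 or tau_m <= 0:
        raise ValueError("Shields stresses must be positive")
    return 14.0 * tau_84 ** 2.5 / (1.0 + (tau_m / tau_84) ** 4)


# At the partial-to-full mobility transition (tau_84 equal to tau_m) the
# denominator is exactly 2, so the result is half of the full-mobility term.
at_transition = recking_transport(0.1, 0.1)
full_mobility_term = 14.0 * 0.1 ** 2.5
```

Well above the transition the denominator tends to 1 and the expression approaches the power law in the numerator, while well below it the result falls off steeply.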

2.5. Model evaluation

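A number of quantitative and qualitative/visual approaches were used for model evaluation and comparison. The quantitative group included coefficient of determination (R²), RMSE, Nash-Sutcliffe efficiency (NSE), percent bias (PBIAS), and ratio of RMSE to standard deviation of measured data (RSR). These error metrics were calculated as follows:

$$R^{2} = \left(\frac{\sum_{i=1}^{n}\left(q_{BedM} - \bar{q}_{BedM}\right)\left(q_{BedP} - \bar{q}_{BedP}\right)}{\sqrt{\sum_{i=1}^{n}\left(q_{BedM} - \bar{q}_{BedM}\right)^{2} \times \sum_{i=1}^{n}\left(q_{BedP} - \bar{q}_{BedP}\right)^{2}}}\right)^{2}, \quad 0 \le R^{2} \le 1, \quad \text{Optimum} = 1 \tag{4}$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(q_{BedP} - q_{BedM}\right)^{2}}, \quad 0 \le \mathrm{RMSE} < +\infty, \quad \text{Optimum} = 0 \tag{5}$$

$$\mathrm{NSE} = 1 - \frac{\sum_{i=1}^{n}\left(q_{BedP} - q_{BedM}\right)^{2}}{\sum_{i=1}^{n}\left(q_{BedM} - \bar{q}_{BedM}\right)^{2}}, \quad -\infty < \mathrm{NSE} \le 1, \quad \text{Optimum} = 1 \tag{6}$$

$$\mathrm{PBIAS} = \left(\frac{\sum_{i=1}^{n}\left(q_{BedM} - q_{BedP}\right)}{\sum_{i=1}^{n} q_{BedM}}\right) \times 100, \quad -\infty < \mathrm{PBIAS} < +\infty, \quad \text{Optimum} = 0 \tag{7}$$

$$\mathrm{RSR} = \frac{\sqrt{\sum_{i=1}^{n}\left(q_{BedP} - q_{BedM}\right)^{2}}}{\sqrt{\sum_{i=1}^{n}\left(q_{BedM} - \bar{q}_{BedM}\right)^{2}}}, \quad 0 \le \mathrm{RSR} < +\infty, \quad \text{Optimum} = 0 \tag{8}$$

where $q_{BedM}$ and $q_{BedP}$ are the measured and predicted bedload transport rate, respectively, $\bar{q}_{BedM}$ and $\bar{q}_{BedP}$ are the mean measured and predicted qb value, respectively, and $n$ is the number of data points.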
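The five metrics can be computed with a few lines of code. The sketch below is an illustration with hypothetical function names and invented toy values. NSE is written in its standard form, with the spread of the measured values about their mean in the denominator, which is an assumption about the intended definition.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def r_squared(measured, predicted):
    """Squared Pearson correlation between measured and predicted values."""
    mm, mp = _mean(measured), _mean(predicted)
    sxy = sum((m - mm) * (p - mp) for m, p in zip(measured, predicted))
    sxx = sum((m - mm) ** 2 for m in measured)
    syy = sum((p - mp) ** 2 for p in predicted)
    return (sxy / math.sqrt(sxx * syy)) ** 2


def rmse(measured, predicted):
    """Root mean square error."""
    n = len(measured)
    return math.sqrt(sum((p - m) ** 2 for m, p in zip(measured, predicted)) / n)


def nse(measured, predicted):
    """Nash-Sutcliffe efficiency (standard form, measured mean in denominator)."""
    mm = _mean(measured)
    num = sum((p - m) ** 2 for m, p in zip(measured, predicted))
    den = sum((m - mm) ** 2 for m in measured)
    return 1 - num / den


def pbias(measured, predicted):
    """Percent bias; negative values indicate overestimation on average."""
    return 100 * sum(m - p for m, p in zip(measured, predicted)) / sum(measured)


def rsr(measured, predicted):
    """Ratio of RMSE to the standard deviation of the measured data."""
    mm = _mean(measured)
    num = math.sqrt(sum((p - m) ** 2 for m, p in zip(measured, predicted)))
    den = math.sqrt(sum((m - mm) ** 2 for m in measured))
    return num / den


# Toy values only: every prediction is one unit too high.
toy_measured = [1.0, 2.0, 3.0, 4.0, 5.0]
toy_predicted = [2.0, 3.0, 4.0, 5.0, 6.0]
```

The toy case shows why several metrics are reported together: a constant offset leaves R² at its optimum while RMSE, NSE, PBIAS, and RSR all register the error.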

The qualitative/visual approaches used in the comparison of model performance were scatter plots, line-variation graphs, Taylor diagrams, and violin plots, allowing the model fit to be seen across the full range of bedload transport values, particularly at the extreme end of the range. One distinct advantage of the Taylor diagram is that it combines two common statistics: correlation and standard deviation (SD) (Taylor, 2001). The measured data point in the Taylor diagram is considered the reference point. The closer the predicted value to this reference value in terms of R² and SD, the higher the prediction capability.

The Friedman test was applied to the different model outputs. If the test was significant, an additional post hoc Nemenyi test was carried out to check for statistically significant differences between pairs of models. The null hypothesis was that there was no statistically significant difference between the models at α = 0.05. At p-value < 0.05 the null hypothesis was rejected.
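For readers unfamiliar with the Friedman test, the sketch below computes its chi-square statistic from a table with one row per observation and one column per model. It is an illustration only: the function names are hypothetical, the choice of absolute errors as the ranked quantity is an assumption, the toy errors are invented, and no tie correction is applied. The p-value, which is not computed here, would come from a chi-square distribution with k - 1 degrees of freedom for k models.

```python
def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for t in range(i, j + 1):
            ranks[order[t]] = shared
        i = j + 1
    return ranks


def friedman_statistic(rows):
    """Friedman chi-square for rows of per-model scores (no tie correction)."""
    n = len(rows)
    k = len(rows[0])
    rank_sums = [0.0] * k
    for row in rows:
        for j, r in enumerate(average_ranks(row)):
            rank_sums[j] += r
    return 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)


# Toy absolute errors for three hypothetical models on four observations;
# the first model is always best and the third always worst.
toy_errors = [
    [0.1, 0.2, 0.3],
    [0.2, 0.4, 0.9],
    [0.1, 0.5, 0.6],
    [0.3, 0.4, 0.8],
]
toy_stat = friedman_statistic(toy_errors)
```

With a perfectly consistent ordering, the statistic reaches its maximum of n(k - 1), which is 8 for this toy table.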

3. Results

3.1. Variable importance

The effectiveness and importance of each potential input variable in qb prediction was explored through a correlation coefficient and relief attribute evaluator (RAE) approach (Figure 1). RAE evaluates the worth of an attribute by repeatedly sampling an instance and considering the value of the given attribute for the nearest instance of the same and different class.

Figure 1. Radar-chart of variable importance, determined by (a) correlation coefficient and (b) relief attribute evaluator (RAE). Variables: River bed slope (S), river width (w), river discharge (Q), bed surface sediment size (D16, D50, D84, D90).

According to the correlation coefficient, presented in terms of a radar-chart (Figure 1a), river bed slope (S) had the largest impact on qb prediction, followed by D84, D50, D90, D16, w, and Q. The results from the RAE approach broadly agreed, with D90 shown as the most effective variable, followed by D84, D50, Q, D16, S, and w (Figure 1b).

3.2. Best input combination

On adding more input variables to the input combination, the prediction accuracy of the different models increased (Figure 2). According to IAER-AMT (the most reliable model), the best input combination gave 32.9% and 39.3% higher performance (lower RMSE) during the training and testing phase, respectively, than the worst performing input combination. The best input scenario (generated manually) had around 28% and 29% higher predictive power than the scenarios proposed by the CSE and PCA ML-based methods, respectively, in terms of RMSE during the training phase. In the testing phase, this equated to 30% and 4% higher predictive power, respectively. These RMSE values were only used to explore the best input combination, and model hyperparameter tuning for tree-based models was not implemented in this step; tuning should only occur once the most efficient input scenario has been determined.

Figure 2. Change in model performance with input combination scenarios for (a) training data and (b) testing data (dashed red boxes show the best input scenario).

3.3. Model performance evaluation

The scatter plots and R² values showed that the new ensemble tree-based algorithm IAER-AMT had the highest prediction capability (R² = 0.80), with the data points being more closely distributed around the line of equality across a fuller range of qb values (Figure 3). The second best performer was also a new ensemble tree-based model, IAER-DPCT (R² = 0.76), followed by AMT (R² = 0.73), DPCT (R² = 0.72), LSTM-GWO (R² = 0.69), and RNN-GWO (R² = 0.67). The two lowest performing models by some margin were the empirical equations, Einstein (1950) (R² = 0.09) and Recking (2013) (R² = 0.08). According to the R² values, IAER-AMT, IAER-DPCT, AMT, and DPCT all achieved ‘very good’ performance (0.7 < R² ≤ 1), LSTM-GWO and RNN-GWO ‘good’ performance (0.6 < R² ≤ 0.7), and Einstein (1950) and Recking (2013) ‘unsatisfactory’ performance (R² ≤ 0.5).

Figure 3. Scatter plot of measured and predicted qb within the testing phase for different modelling approaches tested.

According to the line-variation graphs (Figure 4), all tree-based models were able to predict qb values well. In particular, the ensemble tree-based models predicted extreme values more accurately than the other models, while the empirical models overestimated the higher range of qb values.

Figure 4. Line variation graph of measured and predicted bedload sediment transport rate per unit width (qb) within the testing phase for different modelling approaches.

The Taylor diagram (Figure 5) revealed that the IAER-AMT model had the highest correlation, 0.90, with the predicted standard deviation in qb being closest to the standard deviation of the observed data, followed by IAER-DPCT. The empirical equations had the lowest performance and higher standard deviation than the measured data. Although IAER-DPCT showed lower performance than IAER-AMT, the model produced a standard deviation closer to the measured value.

Figure 5. Taylor diagram displaying statistical comparison with observations of 10 model estimates of bedload sediment transport rate per unit width.

An examination of summary statistics of predicted qb revealed that IAER-DPCT predicted the minimum, first quartile, and median qb most accurately (Table 3). The LSTM-GWO model performed most strongly in predicting the third quartile and the DPCT model in predicting the maximum value.

Table 3. Summary statistics on predicted bedload sediment transport rate per unit width (qb).

All quantitative error metrics showed that the IAER-AMT model had the highest predictive power (Table 4), followed by IAER-DPCT, AMT, DPCT, LSTM-GWO, RNN-GWO, Einstein (1950), and Recking (2013). According to the NSE values, the IAER-AMT and IAER-DPCT models had ‘very good’ performance (0.75 < NSE ≤ 1), LSTM-GWO, RNN-GWO, AMT, and DPCT had ‘good’ performance (0.65 < NSE ≤ 0.75), and the empirical equations had ‘unsatisfactory’ performance (NSE ≤ 0.5). These differences in performance were statistically significant in most comparisons under the Friedman (Chi-Square statistic = 453; p-value < 0.001) and Nemenyi tests (Tables 4 and 5).

Table 4. Comparison of performance of the different models, based on root mean square error (RMSE), Nash-Sutcliffe efficiency (NSE), percent bias (PBIAS), and ratio of RMSE to standard deviation of measured data (RSR)

Table 5. The p-values of a Nemenyi test of model performance difference (yellow cells show a statistically significant difference between models at the 0.05 significance level, while green cells show there is no statistically significant difference)

4. Discussion

4.1. Comparison of prediction performance achieved by empirical equations, tree-based models, and optimized deep learning algorithms

A large dataset of bedload transport measurements collected from various field-based studies was used to investigate model efficiency. The empirical equations performed poorly, particularly for higher rates of bedload transport, for which accurate prediction is most required for understanding morphological change and forecasting erosion hazards (Feeney et al., 2022; Li et al., 2021). This result indicates that these equations should be used with due caution when applied outside the conditions for which they were developed. The high degree of uncertainty associated with empirical equations when applied to field-based studies arises because most have been developed based on flume experiments involving simplified flow and bed conditions, such as steady and uniform flow (Mao, 2012), equilibrium sediment transport conditions (Wainwright et al., 2015), and non-water-worked gravel beds (Cooper & Tait, 2009). Problems then arise in trying to scale flow and sediment properties correctly, and the magnitude of transport that can be reproduced is limited (Kleinhans et al., 2014). Therefore, producing an estimate of bedload transport rate for a field setting that is within the same order of magnitude as a measured value is often considered a ‘reasonable’ prediction for an empirical equation, and no single empirical formula can be applied to all datasets (Gomez & Church, 1989). This limitation arises because most empirical equations are linear and unable to capture non-linearity in input and output data.

In contrast, all tree-based models and optimized DL algorithms tested displayed ‘very good’ or ‘good’ performance. Among the standalone models, the tree-based models outperformed the optimized DL models for a number of reasons: (1) tree-based models have higher accuracy on tabular data (Shwartz-Ziv & Armon, 2022), and they require less tuning and processing effort; (2) DL models are biased towards overly smooth solutions (Grinsztajn et al., 2022) and tend to fit low-frequency functions (Rahaman et al., 2019), and thus, compared with tree-based models, they struggle to fit irregular target functions such as those within the bedload datasets; (3) tree-based models can handle data that are not normally distributed and therefore do not require scaling or normalization; and (4) tree-based models require little data preparation. The best performing standalone tree-based model was AMT, because the algorithm uses step-wise forward cumulative regression (a statistical boosting version) and cross-validation techniques to reduce squared error and limit tree development (Moayedi et al., 2020).

In all cases, the ensemble algorithms outperformed their standalone counterparts. This enhancement of performance occurred because hybridization produces a coupled model with higher flexibility that is better trained and has a non-linear structure (De’ath & Fabricius, 2000). High flexibility and a non-linear structure are particularly important in the prediction of bedload transport rate because of the non-linearity between variables, the low correlation between individual variables and bedload transport rate, and the general complexity of bedload transport.
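
The performance ratings discussed in this section rest on the four metrics listed in Table 4. As an illustration only, the sketch below shows how these metrics are commonly computed from paired measured and predicted values. The function names, the example values, and the PBIAS sign convention (positive values indicating underestimation) are assumptions made for this sketch and are not taken from the study.

```python
import math

def rmse(obs, sim):
    """Root mean square error between measured (obs) and predicted (sim) values."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))

def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the mean of obs."""
    mean_obs = sum(obs) / len(obs)
    residual = sum((o - s) ** 2 for o, s in zip(obs, sim))
    variance = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - residual / variance

def pbias(obs, sim):
    """Percent bias; with this (assumed) sign convention, positive = underestimation."""
    return 100.0 * sum(o - s for o, s in zip(obs, sim)) / sum(obs)

def rsr(obs, sim):
    """Ratio of RMSE to the standard deviation of the measured data."""
    mean_obs = sum(obs) / len(obs)
    residual = math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)))
    spread = math.sqrt(sum((o - mean_obs) ** 2 for o in obs))
    return residual / spread

# Hypothetical values, for illustration only (not data from the study).
obs = [1.0, 2.0, 3.0, 4.0]
sim = [1.5, 2.0, 2.5, 4.0]
scores = {"RMSE": rmse(obs, sim), "NSE": nse(obs, sim),
          "PBIAS": pbias(obs, sim), "RSR": rsr(obs, sim)}
```

With these definitions RSR equals the square root of (1 − NSE), so the two metrics rank models identically; PBIAS adds information on the direction of the error, which the other three do not carry.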

4.2 Effect of input variables on model prediction performance

The combination of input variables used in the models had a strong effect on predictive power, confirming that determination of the optimum combination of input variables is one of the most significant steps in producing accurate ML and DL models. Manual development of input variable combinations led to a more efficient and practical input scenario than the use of intelligent approaches (CSE and PCA). This advantage largely stemmed from being able to test the efficiency of numerous input combinations and the impact of adding each parameter on model performance. Thus, through this manual approach it was possible to determine the most sensitive parameters and understand how the model responds to them. When using this approach, inclusion of all input variables resulted in the highest performance. The intelligent approaches proposed an input scenario based only on the parameters that were most highly correlated with qb (S, D50, D84, and Q), while ignoring parameters with a low degree of correlation (D16, D90, and w). As a result, the intelligent approaches produced models with RMSE values in the testing phase that were 30% (CSE) and 4% (PCA) higher than that of the optimal input combination identified in the manual approach. This aspect further highlights the complex, non-linear nature of the interaction of bedload transport with flow mechanics and channel conditions, and the requirement for multiple input parameters to represent this interaction, even when some might have a low degree of correlation.

4.3 Applying ensemble tree-based models to predict bedload transport rate in rivers

Overall, the results showed that ensemble tree-based models have great potential to produce robust predictions of bedload transport in coarse-grained rivers. Unlike empirical equations, these models performed well over a range of flow and channel conditions; unlike theoretical and numerical models, they remained simple, easy, and inexpensive to build and run. Although other parameters, such as Shields stress and turbulent kinetic energy, have a significant impact on bedload transport rates, the aim was to find a model that could produce high-accuracy estimates of bedload transport based on a few readily available and measurable river properties, such as channel size (width and slope), flow discharge, and sediment size. Given that inclusion of all input variables produced the highest performance, addition of more variables can be expected to further improve performance. However, while a model with a high degree of complexity might be able to capture more of the variation in the data (reduce the training error), it will be more difficult to train and more prone to overfitting (fitting to the noise in the data rather than the underlying pattern). Overfitting can be a significant issue for bedload prediction because measured data are noisy due to the stochastic behaviour of bedload entrainment and transport, the difficulty in obtaining representative samples, and the highly non-linear relationship of bedload with river properties. Thus, a higher-complexity model could perform poorly when applied to new and unseen data, causing loss of model generalization. With these considerations in mind, and noting the very good performance of the ensemble tree-based models using readily available parameters, the models developed in this study appear to strike the correct balance between model complexity, generalization, and performance.
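
One way to check the trade-off described above is to score every candidate input combination on data held back from training, so that added inputs are rewarded only when they improve prediction of unseen cases. The following is a minimal sketch under stated assumptions: the data are synthetic, the target function is invented, the column labels are placeholders, and a simple k-nearest-neighbour regressor stands in for the models used in this study (it is not AMT, DPCT, or IAER).

```python
import itertools
import math
import random

def knn_predict(train_X, train_y, x, k=3):
    """Average the targets of the k training rows closest to x (Euclidean distance)."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)

def holdout_rmse(cols, train, test, k=3):
    """RMSE on the held-out rows when only the input columns in `cols` are used."""
    train_X = [[row[c] for c in cols] for row, _ in train]
    train_y = [y for _, y in train]
    squared_error = 0.0
    for row, y in test:
        pred = knn_predict(train_X, train_y, [row[c] for c in cols], k)
        squared_error += (y - pred) ** 2
    return math.sqrt(squared_error / len(test))

def rank_input_combinations(names, train, test):
    """Score every non-empty subset of inputs; return (rmse, inputs) sorted best first."""
    results = []
    for size in range(1, len(names) + 1):
        for cols in itertools.combinations(range(len(names)), size):
            score = holdout_rmse(cols, train, test)
            results.append((score, tuple(names[c] for c in cols)))
    return sorted(results)

# Synthetic stand-in data: three informative columns and one irrelevant column.
random.seed(0)
names = ["Q", "S", "D50", "noise"]  # placeholder labels, not the study's dataset
data = []
for _ in range(120):
    row = [random.random() for _ in names]
    y = row[0] * row[1] + 0.5 * row[2] ** 2  # invented target, not a bedload law
    data.append((row, y))
train, test = data[:90], data[90:]

ranking = rank_input_combinations(names, train, test)
best_rmse, best_inputs = ranking[0]
```

Exhaustive search is feasible here because seven inputs give only 127 non-empty combinations; with many more candidate variables, a stepwise search or cross-validation over several splits would be a more practical choice.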

The major disadvantages of the types of model developed here are two-fold. First, like all statistical methods, they relate directly only to the rivers considered, and their application to other rivers may prove inappropriate. The range of input parameters encountered in other rivers is also likely to be wider than the range examined in this paper, despite the use of datasets compiled from a large variety of sources. Thus, future studies should develop and apply ensemble tree-based models to rivers with differing flow and channel conditions, to test their wider applicability. Second, due to their ‘black-box’ structure, these models provide poor explanatory power, and are thus unable to improve understanding of the physical processes that determine bedload entrainment and transport.

This study has shown that incorporating just seven controlling parameters (channel slope, channel width, flow discharge, and four key bed surface grain size percentiles) can produce very good predictions of bedload transport rate. Future studies should examine the potential of other tree-based models, such as Random Forest and the M5 model tree, as well as models that combine ML methods with the seasonal adjustment method (Li & Yang, 2022). Where data are available, future studies should assess how other factors affect the performance of these models, such as grain-size sorting (e.g. Recking et al., 2023) and grain shelter-exposure (armour ratio Dx/D50; Fu et al., 2023), whilst avoiding making the developed model overly complex and continuing to use readily available and easily measured data. Such an approach would help determine the most influential parameters in bedload transport and why they vary between rivers with differing flow and channel properties.

5. Conclusions

The morphodynamics of coarse-grained rivers depend predominantly on bedload transport rate. Due to the non-linear interactions between channel and flow mechanics, tree-based models and optimized deep learning algorithms have great potential to produce accurate predictions of bedload transport. Using 926 datasets from 20 rivers, this study explored this potential by examining the predictive power of (1) standalone tree-based models (Alternating Model Tree (AMT) and Dual Perturb and Combine Tree (DPCT)); (2) ensemble tree-based models, in which Iterative Absolute Error Regression (IAER) was ensembled with AMT and DPCT (IAER-AMT and IAER-DPCT); and (3) optimized deep learning models, in which Long Short-Term Memory (LSTM) and Recurrent Neural Network (RNN) were ensembled with Grey Wolf Optimizer (LSTM-GWO and RNN-GWO). Their performance was benchmarked against two commonly used empirical equations. The main findings were as follows:

  1. Sensitivity analysis identified D90 as the most effective variable in bedload transport prediction, followed by D84, D50, Q, D16, S, and w.

  2. All algorithms tested performed best when all input parameters were used in building the model. Variables with a low correlation coefficient with bedload transport rate enhanced the predictive power. Thus, a range of different input variable combinations must be considered in the optimization of tree-based and optimized deep learning models.

  3. Assessment of model performance showed that all tree-based models and optimized deep learning algorithms displayed ‘very good’ or ‘good’ performance and outperformed empirical equations, which had ‘unsatisfactory’ performance. The tree-based algorithms were more efficient and reliable than the deep learning models.

  4. In all cases, ensemble algorithms outperformed their standalone counterparts, with the ensemble tree-based model IAER-AMT being the best performing model overall.

Together, these findings reveal that ensemble tree-based models have great potential for predicting bedload transport rates based on a few readily available and easily measured flow and channel variables. These algorithms could play a particularly important role in predicting morphological change and assessing erosion hazards in coarse-grained rivers where an understanding of the physical processes may be lacking. Thus, investigating the potential of other tree-based models across a wide range of different flow and channel conditions can be an important future research direction for river scientists. In addition, the results obtained in the present study indicate that tree-based models can be a promising tool for decision makers and beneficial for stakeholders that manage the impacts of river erosion.

Data

Data related to this study are available upon request. In addition, they are publicly available from BedloadWeb.

Author contributions

KK: conceptualization, methodology, software, writing – original draft, review, and editing; AAF: conceptualization, methodology, supervision, review, funding, and editing; SMB and CJ: methodology, review, and editing; DM: conceptualization, writing – original draft; ZK and JRC: conceptualization, methodology, review, and editing.

Acknowledgements

We thank the creators of BedloadWeb for providing free access to the bedload data used in this publication, and Hosseiny et al. (2023) for providing free access to their input data through https://doi.org/10.5281/zenodo.7641313 under a GNU General Public License v2.0 or later. James Cooper was supported by a UK Natural Environment Research Council grant (NE/V008404/1).

Disclosure statement

No potential conflict of interest was reported by the author(s).

Additional information

Funding

This study was funded in part by a grant from the National Research Foundation of Korea (NRF) (grant number NRF-2022R1A4A3032838), supported by the Ministry of Science and ICT (MSIT) of the Korean government (grant number c). Additional support was provided by the Korea Environmental Industry & Technology Institute (KEITI) as part of the Wetland Ecosystem Value Evaluation and Carbon Absorption Value Promotion Technology Development Project, funded by the Korea Ministry of Environment (MOE) (grant number 2022003640001). In addition, the authors are thankful to Dr. Mahdi Panahi for his assistance in implementing the deep learning models and the Nemenyi test.

References

  • Ahmad, M. W., Reynolds, J., & Rezgui, Y. (2018). Predictive modelling for solar thermal energy systems: A comparison of support vector regression, random forest, extra trees and regression trees. Journal of Cleaner Production, 203, 810–821. https://doi.org/10.1016/j.jclepro.2018.08.207
  • Asheghi, R., & Hosseini, S. A. (2020). Prediction of bed load sediments using different artificial neural network models. Frontiers of Structural and Civil Engineering, 14(2), 374–386. https://doi.org/10.1007/s11709-019-0600-0
  • Azzouni, A., & Pujolle, G. (2017). A long short-term memory recurrent neural network framework for network traffic matrix prediction.
  • Bagnold, R. A. (1966). An approach to the sediment transport problem from general physics. US Geol. Surv. Prof. Paper, 422(1), 231–291.
  • Barry, J. J., Buffington, J. M., & King, J. G. (2004). A general power equation for predicting bed load transport rates in gravel bed rivers. Water Resources Research, 40(10), W10401. https://doi.org/10.1029/2004WR003190
  • Barzgaran, M., Mahdizadeh, H., & Sharifi, S. (2019). Numerical simulation of bedload sediment transport with the ability to model wet/dry interfaces using an augmented Riemann solver. Journal of Hydroinformatics, 21(5), 834–850. https://doi.org/10.2166/hydro.2019.046
  • Breiman, L. (1998). Arcing classifiers. The Annals of Statistics, 26(3), 801–849.
  • Bui, D., Pradhan, B., Nampak, H., Bui, Q-H., Tran, Q-A., & Nguyen, Q-P. (2016). Hybrid artificial intelligence approach based on neural fuzzy inference model and metaheuristic optimization for flood susceptibility modeling in a high-frequency tropical cyclone area using GIS. Journal of Hydrology, 540, 317–330. https://doi.org/10.1016/j.jhydrol.2016.06.027
  • Cooper, J. R., & Tait, S. J. (2009). Water-worked gravel beds in laboratory flumes - a natural analogue? Earth Surface Processes and Landforms, 34(3), 384–397. https://doi.org/10.1002/esp.1743
  • De’ath, G., & Fabricius, K. E. (2000). Classification and regression trees: A powerful yet simple technique for ecological data analysis. Ecology, 81(11), 3178–3192. https://doi.org/10.1890/0012-9658(2000)081[3178:CARTAP]2.0.CO;2
  • Einstein, H. A. (1950). The bed-load function for sediment transportation in open channel flows (No. 1026). Department of Agriculture, Washington, D.C.: US.
  • Elhakeem, M., & Imran, J. (2016). Bedload model for nonuniform sediment. Journal of Hydraulic Engineering, 142(6), 06016004. https://doi.org/10.1061/(ASCE)HY.1943-7900.0001139
  • Feeney, C. J., Godfrey, S., Cooper, J. R., Plater, A. J., & Dodds, D. (2022). Forecasting riverine erosion hazards to electricity transmission towers under increasing flow magnitudes. Climate Risk Management, 36, 100439. https://doi.org/10.1016/j.crm.2022.100439
  • Fisher, S. G., Gray, L. J., Grimm, N. B., & Busch, D. E. (1982). Temporal succession in a desert stream ecosystem following flash flooding. Ecological Monographs, 52(1), 93–110. https://doi.org/10.2307/2937346
  • Frank, E., Mayo, M., & Kramer, S. (2015). Alternating model trees. In SAC '15: Proceedings of the 30th Annual ACM Symposium on Applied Computing (pp. 871–878). ACM, New York.
  • Fu, H., Shan, Y., & Liu, C. (2023). A model for predicting the grain size distribution of an armor layer under clear water scouring. Journal of Hydrology, 623, 129842. https://doi.org/10.1016/j.jhydrol.2023.129842
  • Gao, P. (2011). An equation for bed-load transport capacities in gravel-bed rivers. Journal of Hydrology, 402(3–4), 297–305. https://doi.org/10.1016/j.jhydrol.2011.03.025
  • Gao, W., Guirao, J. L. G., Abdel-Aty, M., & Xi, W. (2019). An independent set degree condition for fractional critical deleted graphs. Discret Contin Dyn Syst S, 12(4–5), 877–886.
  • Geurts, P., & Wehenkel, L. (2005, October 3–7). Segment and combine approach for non-parametric time-series classification. Proceedings of the 9th European Conference on Principles and Practice of Knowledge Discovery in Databases, Porto, Portugal (pp. 478–485). LNCS.
  • Ghorbanzadeh, O., Meena, S. R., Blaschke, T., & Aryal, J. (2019). UAV-based slope failure detection using deep-learning convolutional neural networks. Remote Sensing, 11(17), 2046. https://doi.org/10.3390/rs11172046
  • Gomez, B., & Church, M. (1989). An assessment of bed load sediment transport formulae for gravel bed rivers. Water Resources Research, 25(6), 1161–1186.
  • Graf, W. H. (1971). Hydraulics of Sediment Transport. McGraw-Hill.
  • Grinsztajn, L., Oyallon, E., & Varoquaux, G. (2022). Why do tree-based models still outperform deep learning on tabular data? http://arxiv.org/abs/2207.08815.
  • Hall, M. A. (1999). Correlation-based feature selection for machine learning, no. April.
  • Hinton, D., Hotchkiss, R., & Cope, M. (2018). Comparison of calibrated empirical and semi-empirical methods for bedload transport rate prediction in gravel bed streams. Journal of Hydraulic Engineering, 144(7), 1–17. https://doi.org/10.1061/(ASCE)HY.1943-7900.0001474
  • Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural Computation, 9(8), 1735–1780. https://doi.org/10.1162/neco.1997.9.8.1735
  • Hosseiny, H., Masteller, C., Dale, J., & Phillips, C. (2023). Development of a machine learning model for river bed load. Earth Surface Dynamics, 11(4), 681–693. https://doi.org/10.5194/esurf-11-681-2023
  • Hussain, D., & Khan, A. A. (2020). Machine learning techniques for monthly river flow forecasting of Hunza River, Pakistan. Earth Science Informatics, 13(3), 939–949. https://doi.org/10.1007/s12145-020-00450-z
  • Jilani, A. N., & Hashemi, S. U. (2013). Numerical investigations on bed load sediment transportation using SPH method. Iranian Journal of Science, 20(2), 294–299.
  • Khosravi, K., Cooper, J. R., Daggupati, P., Pham, B., & Bui, D. (2020). Bedload transport rate prediction: Application of novel hybrid data mining techniques. Journal of Hydrology, 585, 124774. https://doi.org/10.1016/j.jhydrol.2020.124774
  • Khosravi, K., Panahi, M., & Tien Bui, D. (2018). Spatial prediction of groundwater spring potential mapping based on an adaptive neuro-fuzzy inference system and metaheuristic optimization. Hydrology and Earth System Sciences, 22(9), 4771–4792. https://doi.org/10.5194/hess-22-4771-2018
  • Kisi, O., Dailr, A. H., Cimen, M., & Shiri, J. (2012). Suspended sediment modeling using genetic programming and soft computing techniques. Journal of Hydrology, 450–451, 48–58. https://doi.org/10.1016/j.jhydrol.2012.05.031
  • Kisi, O., Genc, O., Dinc, S., & Zounemat-Kermani, M. (2016). Daily pan evaporation modeling using chi-squared automatic interaction detector, neural networks, classification and regression tree. Computers and Electronics in Agriculture, 122, 112–117. https://doi.org/10.1016/j.compag.2016.01.026
  • Kisi, O., & Yaseen, Z. M. (2019). The potential of hybrid evolutionary fuzzy intelligence model for suspended sediment concentration prediction. Catena, 174, 11–23. https://doi.org/10.1016/j.catena.2018.10.047
  • Kleinhans, M. G., van Dijk, W. M., van de Lageweg, W. I., Hoyal, D. C. J. D., Markies, H., van Maarseveen, M., Roosendaal, C., van Weesep, W., van Breemen, D., Hoendervoogt, R., & Cheshier, N. (2014). Quantifiable effectiveness of experimental scaling of river- and delta morphodynamics and stratigraphy. Earth-Science Reviews, 133, 43–61. https://doi.org/10.1016/j.earscirev.2014.03.001
  • Kouadio, L., Deo, R. C., Byrareddy, V., Adamowski, J. F., Mushtaq, S., & Phuong Nguyen, V. (2018). Artificial intelligence approach for the prediction of Robusta coffee yield using soil fertility properties. Computers and Electronics in Agriculture, 155, 324–338. https://doi.org/10.1016/j.compag.2018.10.014
  • Latif, S. D., Chong, K. L., Ahmed, A. N., Huang, Y. F., Sherif, M., & El-Shafie, A. (2023). Sediment load prediction in Johor river: Deep learning versus machine learning models. Applied Water Science, 13(3), 79. https://doi.org/10.1007/s13201-023-01874-w
  • Li, S., & Yang, J. (2022). Modelling of suspended sediment load by Bayesian optimized machine learning methods with seasonal adjustment. Engineering Applications of Computational Fluid Mechanics, 16(1), 1883–1901. https://doi.org/10.1080/19942060.2022.2121944
  • Li, X., Cooper, J. R., & Plater, A. J. (2021). Quantifying erosion hazards and economic damage to critical infrastructure in river catchments: Impact of a warming climate. Climate Risk Management, 32, 100287. https://doi.org/10.1016/j.crm.2021.100287
  • Manning, R. (1891). On the flow of water in open channels and pipes. Transactions of the Institution of Civil Engineers of Ireland, 20, 161–207.
  • Mao, L. (2012). The effect of hydrographs on bed load transport and bed sediment spatial arrangement. Journal of Geophysical Research: Earth Surface, 117(F3). http://doi.org/10.1029/2012JF002428
  • Melesse, A., Ahmad, S., McClain, M. E., Wang, X., & Lim, Y. H. (2011). Suspended sediment load prediction of river systems: An artificial neural network approach. Agricultural Water Management, 98(5), 855–866. https://doi.org/10.1016/j.agwat.2010.12.012
  • Meyer-Peter, E., & Müller, R. (1948). Formulas for bed-load transport. In Proceedings of the 2nd Meeting of the International Association for Hydraulic Structures Research, 39–64.
  • Mirjalili, S., Mirjalili, S. M., & Lewis, A. (2014). Grey wolf optimizer. Advances in Engineering Software, 69, 46–61.
  • Moayedi, H., Aghel, B., Foong, L. K., & Bui, D. T. (2020). Feature validity during machine learning paradigms for predicting biodiesel purity. Fuel, 262, 116498. https://doi.org/10.1016/j.fuel.2019.116498
  • Moayedi, H., Ahmadi Dehrashid, A., & Nguyen Le, B. (2024). A novel problem-solving method by multi-computational optimization of artificial neural network for modelling and prediction of the flow erosion processes. Engineering Applications of Computational Fluid Mechanics, 18(1), 2300456. https://doi.org/10.1080/19942060.2023.2300456
  • Mustafa, A. S., Sulaiman, S. O., & Al_Alwani, K. M. (2017). Application of HEC-RAS model to predict sediment transport for Euphrates River from Haditha to Heet. Journal of Engineering Sciences, 20(3), 570–577.
  • Nones, M. (2019). Dealing with sediment transport in flood risk management. Acta Geophysica, 67(2), 677–685. https://doi.org/10.1007/s11600-019-00273-7
  • Parker, G. (1990). Surface-based bedload transport relation for gravel rivers. Journal of Hydraulic Research, 28(4), 417–436. https://doi.org/10.1080/00221689009499058
  • Poorhosein, M., Afzalimehr, H., Sui, J., Singh, V. P., & Azareh, S. (2014). Empirical bed load transport equations. International Journal of Hydraulic Engineering, 3(3), 93–101. https://doi.org/10.5923/j.ijhe.20140303.03
  • Qiao, Q., Yunusa-Kaltungo, A., & Edwards, R. (2022). Feature selection strategy for machine learning methods in building energy consumption prediction. Energy Reports, 8, 13621–13654. https://doi.org/10.1016/j.egyr.2022.10.125
  • Rahaman, N., Baratin, A., Arpit, D., Draxler, F., Lin, M., Hamprecht, F., Bengio, Y., & Courville, A. (2019). On the spectral bias of neural networks, International Conference on Machine Learning, May 2019.
  • Rahmati, O., Falah, F., Naghibi, A., Biggs, T., Soltani, M., Deo, R. C., Cerdà, A., Mohammadi, F., & Tien Bui, D. (2019). Land subsidence modelling using tree-based machine learning algorithms. Science of The Total Environment, 672, 239–252. https://doi.org/10.1016/j.scitotenv.2019.03.496
  • Recking, A. (2013). Simple method for calculating reach-averaged bed-load transport. Journal of Hydraulic Engineering, 139(1), 70–75. https://doi.org/10.1061/(ASCE)HY.1943-7900.0000653
  • Recking, A. (2019). BedloadWeb. Retrieved April 25, 2022, from https://en.bedloadweb.com/.
  • Recking, A., Vázquez Tarrío, D., & Piton, G. (2023). The contribution of grain sorting to the dynamics of the bedload active layer. Earth Surface Processes and Landforms, 48(5), 979–996. https://doi.org/10.1002/esp.5530
  • Riahi-Madvar, H., & Seifi, A. (2018). Uncertainty analysis in bed load transport prediction of gravel bed rivers by ANN and ANFIS. Arabian Journal of Geosciences, 11(21), 688. https://doi.org/10.1007/s12517-018-3968-6
  • Rosgen, D. L., Silvey, H. L., & Frantila, D. (2006). Watershed assessment of river stability and sediment supply (WARSSS). Wildland Hydrology.
  • Roushangar, K., & Koosheh, A. (2015). Evaluation of GA-SVR method for modeling bed load transport in gravel-bed rivers. Journal of Hydrology, 527, 1142–1152. https://doi.org/10.1016/j.jhydrol.2015.06.006
  • Samadianfard, S., Jarhan, S., Salwana, E., Mosavi, A., Shamshirband, S., & Akib, S. (2019). Support vector regression integrated with fruit fly optimization algorithm for river flow forecasting in Lake Urmia Basin. Water, 11(9), 1934. https://doi.org/10.3390/w11091934
  • Schlossmacher, E. J. (1973). An iterative technique for absolute deviations curve fitting. Journal of the American Statistical Association, 68(344), 857–859. https://doi.org/10.1080/01621459.1973.10481436
  • Shahiri, P., Noori, M., Heydari, M., & Rashidi, M. (2016). Floodplain zoning simulation by using HEC-RAS and CCHE2D Models in the Sungai Maka River. Air Soil Water Res., 9(9), 55–62.
  • Shakya, D., Deshpande, V., Kumar, B., & Agarwal, M. (2023). Predicting total sediment load transport in rivers using regression techniques, extreme learning and deep learning models. Artificial Intelligence Review, 56(9), 10067–10098. https://doi.org/10.1007/s10462-023-10422-6
  • Shwartz-Ziv, R., & Armon, A. (2022). Tabular data: Deep learning is not all you need. Information Fusion, 81, 84–90. https://doi.org/10.1016/j.inffus.2021.11.011
  • Taylor, K. E. (2001). Summarizing multiple aspects of model performance in a single diagram. Journal of Geophysical Research: Atmospheres, 106(D7), 7183–7192. https://doi.org/10.1029/2000JD900719
  • Vakharia, V., Gupta, V. K., & Kankar, P. K. (2016). A comparison of feature ranking techniques for fault diagnosis of ball bearing. Soft Computing, 20(4), 1601–1619. https://doi.org/10.1007/s00500-015-1608-6
  • Vakharia, V., Shah, M., Nair, P., Borade, H., Sahlot, P., & Wankhede, V. (2023). Estimation of Lithium-ion battery discharge capacity by integrating optimized explainable-AI and stacked LSTM model. Batteries, 9(2), 125. https://doi.org/10.3390/batteries9020125
  • Vu, M. T., Jardani, A., Massei, N., & Fournier, M. (2021). Reconstruction of missing groundwater level data by using Long Short-Term Memory (LSTM) deep neural network. Journal of Hydrology, 597, 125776. https://doi.org/10.1016/j.jhydrol.2020.125776
  • Wainwright, J., Parsons, A. J., Cooper, J. R., Gao, P., Gillies, J. A., Mao, L., Orford, J., & Knight, P. G. (2015). The concept of transport capacity in geomorphology. Reviews of Geophysics, 53(4), 1155–1202. https://doi.org/10.1002/2014RG000474
  • Wilcock, P. R. (2001). Toward a practical method for estimating sediment transport rates in gravel bed rivers. Earth Surface Processes and Landforms, 26(13), 1395–1408. https://doi.org/10.1002/esp.301
  • Wilcock, P. R., & Crowe, J. C. (2003). Surface-based transport model for mixed-size sediment. Journal of Hydraulic Engineering, 129(2), 120–128. https://doi.org/10.1061/(ASCE)0733-9429(2003)129:2(120)
  • Zhang, K., Wang, Z. Y., & Liu, L. (2010). The effect of riverbed structure on bed load transport in mountain streams. In Dittrich, Koll, Aberle, & Geisenhainer (Eds.), River Flow 2010. Bundesanstalt für Wasserbau. ISBN 978-3-939230-00-7.