Flood susceptibility mapping using ANNs: a case study in model generalization and accuracy from Ontario, Canada


Abstract

Accurate flood susceptibility mapping (FSM) is critical for mitigating the environmental, social and economic consequences of floods. The generalizability of models to new watersheds, and the impact of arbitrarily selecting a small subset of flooded and nonflooded locations, are major knowledge gaps in current FSM research that affect predictive accuracy. As such, this study compares (i) an Artificial Neural Network - Synthetic Minority Oversampling Technique (ANN-SMOTE) hybrid ensemble with (ii) the knowledge-based Analytical Hierarchy Process (AHP) and (iii) the diversity-based Shannon Entropy approach. The ANN-SMOTE, AHP and Entropy models were trained and tested on the Don River watershed in Ontario, Canada, with Overall Accuracy (OA) results of 0.549, 0.404 and 0.452, respectively. ANN-SMOTE’s predictive accuracy remained high when it was tested on four independent watersheds from southern Ontario, indicating strong generalization ability. To simulate the commonly used flood point inventory approach, the number of training samples was reduced by a factor of 1000, which resulted in a 28% decrease in accuracy. The high performance and generalization potential of the ANN-SMOTE model demonstrate its utility and versatility for future FSM studies, and as a support tool in flood risk management decision making.


1. Introduction

Floods, the most frequent and severe of global natural disasters, inflict significant social, environmental and fiscal impacts, including the loss of human life, damage to natural habitats and damage to infrastructure (Rincón et al. Citation2018; Yang et al. Citation2018). Factors contributing to floods include rapid urbanization, overflowing of river channels, snow melt and the increased frequency of extreme rain events due to global climate change (Hu et al. Citation2018; Rincón et al. Citation2018; Cabrera and Lee Citation2020). Flood susceptibility mapping (FSM) is a process of identifying flood vulnerable areas based on their physical characteristics. As such, FSM facilitates risk management decision making, assists in implementation of flood mitigation and protection strategies, and provides guidance towards landcover/land use development in flood vulnerable areas (Vojtek and Vojteková Citation2019; Bui et al. Citation2020; Wang et al. Citation2020).

Existing FSM models can be broadly categorized as either physically based or empirical (Giovannettone et al. Citation2018; Mudashiru et al. Citation2021a). Physically based models predict flood susceptibility by using simplified representations of complex physical and natural processes. To do so, a large number of hydrological, meteorological and geophysical inputs are required to parametrize models that are, then, run under different storm scenarios, resulting in a time consuming and iterative process (Giovannettone et al. Citation2018; Choubin et al. Citation2019; Mudashiru et al. Citation2021a). Additionally, larger scale applications (e.g. regional) suffer from significant computational time, and validating physically based models requires high-resolution datasets which may not always be available (Hong et al. Citation2018; Liu et al. Citation2022; Lin et al. Citation2023).

Empirical models, which rely on observational and data-driven relationships between the input and output, can further be subdivided into two categories: quantitative and qualitative. Quantitative models form mathematical relationships between inputs (commonly referred to as flood conditioning factors, FCFs) and the output (e.g. flood susceptibility) (Mudashiru et al. Citation2021b). Examples of quantitative methods include: (i) statistical methods such as Logistic Regression (LR) (Al-Juaidi et al. Citation2018; Ali et al. Citation2020; Pham et al. Citation2020), Frequency Ratio (FR) (Tehrany and Kumar Citation2018; Wang et al. Citation2021), Weights of Evidence (WoE) (Rahmati et al. Citation2015; Costache, Pham, et al. Citation2020) and Shannon Entropy (Khosravi et al. Citation2016; Mahmoody Vanolya and Jelokhani-Niaraki Citation2021) amongst others; and (ii) Machine Learning (ML) models such as Artificial Neural Networks (ANN) (Zhao et al. Citation2018; Jahangir et al. Citation2019; Janizadeh et al. Citation2019; Andaryani et al. Citation2021), Convolutional Neural Networks (CNN) (Wang et al. Citation2020; Zhao et al. Citation2020), Support Vector Machines (SVM) (Liu et al. Citation2022), Random Forests (RF) (Abedi et al. Citation2021; Farhadi et al. Citation2021). Conversely, qualitative models rely on expert opinion to weigh an area’s physical characteristics (FCFs) according to the degree of perceived influence on flood susceptibility (Hong et al. Citation2018). The most commonly used qualitative model for FSM is the knowledge-based Analytical Hierarchy Process (AHP) (Rincón et al. Citation2018; Hammami et al. Citation2019; Vojtek and Vojteková Citation2019).

AHP integrates several competing criteria, such as environmental, economic and social factors, and can produce FSMs in areas with scarce flood inventory data (Gudiyangada Nachappa et al. Citation2020; Vojtek et al. Citation2021). AHP prioritizes expert knowledge in ranking FCFs, so the FCF deemed most influential can vary from study to study. This uncertainty, which stems from the subjective nature of the decision-making scheme, is a common limitation in AHP studies (Hammami et al. Citation2019; Souissi et al. Citation2019; Das Citation2020).

To minimize this uncertainty from the qualitative nature of AHP, a quantitative statistical approach such as Entropy can be used as an alternative. Entropy models calculate the degree of diversity within each FCF input, and rank more spatially diverse FCFs as more influential to flood susceptibility (Khosravi et al. Citation2016; Mahmoody Vanolya and Jelokhani-Niaraki Citation2021). One of the most significant limitations of a diversity approach is the lack of consideration given to the interdependencies between the input and output (Malekinezhad et al. Citation2021), leading to potential inaccuracies. This limitation is addressed by using data-driven objective ML approaches, where weighted connections between the inputs and output are formed, to predict flood susceptibility. In particular, ANNs have demonstrated an ability to handle complex nonlinear relationships, making them a commonly used data-driven approach in FSM.

ANNs are widely used in many hydrological forecasting applications such as streamflow predictions (Cheng et al. Citation2020; Snieder et al. Citation2020), rainfall-runoff simulations (Asadi et al. Citation2019; Kim and Han Citation2020; Vidyarthi et al. Citation2020), low impact development design optimization (Latifi et al. Citation2019; Raei et al. Citation2019) and FSM (Zhao et al. Citation2018; Jahangir et al. Citation2019; Janizadeh et al. Citation2019; Andaryani et al. Citation2021). In a comparison of ANN performance against other ML methods, including self-organizing map (SOM) and fuzzy adaptive resonance theory (FART), the ANN produced the FSM with the greatest accuracy in a river basin in Iran (Andaryani et al. Citation2021). Recently, advanced forms of ANNs, known as deep learning neural networks (DLNNs), have risen in popularity in FSM studies (Chen et al. Citation2021). Chen et al. (Citation2021) compared the performance of these methods, such as CNNs, to shallow ANNs, and their results showed that the increased complexity of CNNs led to poorer performance than the ANN. The complexity of these advanced models is partially governed by the number of interconnected layers, and may contribute to their lower performance, whereas shallow ANNs maintain a simpler architecture and frequently exhibit performance that is on par with these models (Chakrabortty et al. Citation2021; Kalantar et al. Citation2021; Zhang et al. Citation2021).

Within flood mapping studies, ANN performance is adversely impacted by data imbalances, where nonflooded training points greatly outnumber flooded ones (Batista et al. Citation2004; Wang et al. Citation2019; Priscillia et al. Citation2021). To remedy this, techniques such as the Synthetic Minority Oversampling Technique (SMOTE) are used to increase the number of flooded points within the training process. ANN performance can also be affected by the number of training points used (Tehrany and Jones Citation2017). Often FSM studies use an arbitrarily selected number of training points to form a point inventory of known flooded and nonflooded locations from within a watershed (Costache, Țîncu, et al. Citation2020; Pham et al. Citation2020; Wang et al. Citation2020; Ahmadlou et al. Citation2021; Shahabi et al. Citation2021). The impact of this approach on ANN performance, in comparison to using a larger number of training points, should be assessed as it is the most prevalent in current practice. A further limitation in existing studies using ANNs for FSM is that models are frequently trained and validated on the same watershed, with very few studies testing their performance on independent watersheds (Zhao et al. Citation2021; Seleem et al. Citation2022). Performance on independent watersheds is a key indicator of the ANN’s generalization, which facilitates flood susceptibility mapping across multiple areas and is particularly important in data sparse regions. Therefore, the impact of the number of training points on model performance and the model’s generalizability are some of the existing knowledge gaps within ANN-FSM applications.

To address the aforementioned knowledge gaps, and the current limitations of physically based models, qualitative (AHP) and quantitative (Entropy) approaches, an alternative data driven ANN-SMOTE ensemble model is proposed. These three models were selected for comparison due to their widespread use in FSM studies (Siahkamari et al. Citation2017; Zhao et al. Citation2018; Hammami et al. Citation2019; Andaryani et al. Citation2021; Malekinezhad et al. Citation2021; Costache et al. Citation2022), and ease of application (Mudashiru et al. Citation2021a, Citation2021b).

The proposed ANN-SMOTE ensemble is compared to the existing commonly used models to establish its baseline performance for the Don River watershed in Southern Ontario, Canada. As few studies assess model generalization, and none do so using ANN ensembles, we propose testing the ANN-SMOTE model’s performance on four independent test watersheds within Southern Ontario. This novel research aims to demonstrate the generalizability of ANNs in producing accurate FSMs for independent watersheds whose inputs were not part of the ANN calibration, thereby contributing to a better understanding of the predictive ability of ANNs. In doing so, the study highlights the ANN’s utility and applicability for data scarce regions and introduces the ANN as an alternative or complementary approach to physically based and qualitative models for updating outdated FSMs. Finally, given the prevalence of the flood point inventory (FPI) approach, we compare the performance of conventional ANNs developed in this manner to the baseline performance of the ANN-SMOTE model developed using a larger number of training samples. This comparison is missing in existing literature and is needed to further the understanding and application of the FPI method within FSM studies. The findings of this study highlight the superiority of FSMs produced using the ANN-SMOTE ensemble, and may be easily expanded to advancing generalization frameworks and uncertainty quantification within the FPI approach.

2. Materials and methods

2.1. Study area

Five watersheds from southern Ontario were selected for this research: the Don, Highland, Humber, Etobicoke and Mimico River watersheds, as shown in Figure 1. These watersheds lie within the jurisdiction of the Toronto and Region Conservation Authority (TRCA), the local government authority in charge of flood protection, which provided the existing modelled floodplain maps used to build the ANNs. These five watersheds were selected out of the ten within the TRCA jurisdiction, as they contain eight out of the top ten Flood Vulnerable Areas (FVAs) identified by the TRCA (IBI Group Citation2019). These FVAs consist of clusters of roads and structures that have experienced and are expected to be susceptible to riverine flooding under severe and less severe storms, presenting a risk to both human life and economic welfare (IBI Group Citation2019). Of the five selected watersheds, the Don River watershed was selected as the training watershed as it is the most densely populated, with an area of 360 km², 1.4 million residents and the commercial and financial centre of the City of Toronto within its boundaries. The independent test watersheds are the Highland, Humber River, Etobicoke and Mimico watersheds. The Highland watershed is the most urbanised of all five, with over 89% urban landcover and an area of 102 km². The Humber River watershed is the largest, with an area of 911 km² and 600 waterbodies. It is home to approximately 1 million inhabitants and has 37% urban landcover. Etobicoke has an area of 212 km² and approximately 67% urban landcover. Finally, Mimico is the smallest watershed with an area of 78 km² and 88% urban landcover.

Figure 1. The location of the selected watersheds within Southern Ontario, Canada.

2.2. Flood conditioning factors

The FCFs chosen in this study were selected based on their widespread use within FSM studies, consisting of the most commonly used factors which are considered integral to the field of FSM (Wang et al. Citation2020; Zhao et al. Citation2020; Bunmi Mudashiru et al. Citation2022; Dahri et al. Citation2022; Seleem et al. Citation2022). Rincón et al. (Citation2018) used AHP to conduct a flood hazard mapping study for the lower Don River where they found the best performing model used distance to streams (DS), slope, height above nearest drainage (HAND) and curve number (CN). This study builds on these results by introducing effective precipitation (EP) as it incorporates rainfall (commonly used in studies) (Tehrany et al. Citation2014; Hammami et al. Citation2019; Andaryani et al. Citation2021; Vilasan and Kapse Citation2022) into a surface runoff depth calculation. Topographic wetness index (TWI), which also factors flow accumulation, is frequently used in literature (Vojtek and Vojteková Citation2019; Das Citation2020; Vojtek et al. Citation2021; Vilasan and Kapse Citation2022).

Detailed information for each FCF is provided below. These FCFs were generated as geospatial layers using: a 30 m-by-30 m Digital Elevation Model (DEM), a 1 to 50,000 m resolution landcover dataset, a 1 to 50,000 m watercourse layer, and finally a precipitation layer obtained from an online IDF tool developed by Simonovic et al. (Citation2015) for gauged and ungauged locations. All the FCFs were converted to raster format with a grid size of 30 m-by-30 m using ArcGIS 3.1.2.

Slope was selected because it governs the velocity of surface runoff and its pooling potential (Choubin et al. Citation2019; Vojtek and Vojteková Citation2019; Islam et al. Citation2023). A flatter slope angle leads to slower water velocities and greater water stagnation potential; therefore, flatter regions are at greater risk of flooding. Slope was calculated using the Slope tool in ArcGIS with the DEM layer as an input.

The DS indicates the proximity to rivers and streams, since flood vulnerable areas are generally closer to rivers and streams (Rahmati et al. Citation2015; Shafizadeh-Moghadam et al. Citation2018; Wang et al. Citation2020; Andaryani et al. Citation2021). The DS was calculated using Euclidean distance from the watercourses layer.

The CN of an area, defined by the Soil Conservation Service (SCS), describes the infiltration capacity and subsequent runoff generation capability of an area, and is thus an indicator of landuse/landcover and soil permeability (Zhao et al. Citation2018; Rahmati et al. Citation2020). Values of CN can range from 30 for areas with highly permeable soils with low runoff potential to 98 for impermeable surfaces with high runoff potential. This layer was generated using the published SCS CN values (USDA Citation1986) together with the landuse and soil type.

Rainfall data is commonly included in FSM studies as it is directly related to flooding potential (Tehrany et al. Citation2014; Hammami et al. Citation2019; Das Citation2020; Bera et al. Citation2022). Using the online IDF tool, rainfall depths across the watershed were obtained and interpolated using the inverse distance weighting (IDW) method, to produce a Total Precipitation (TP) layer. The EP layer was then generated using the TP layer. EP, also known as runoff, is the precipitation depth after losses (such as infiltration). The greater the value of EP, the greater the potential for localized flooding as the runoff pools on the surface. It is calculated from TP and CN using the following equation:

$EP = \frac{\left(TP - \frac{5080}{CN} + 50.8\right)^{2}}{TP + \frac{20320}{CN} - 203.2}$ (1)

where TP represents total precipitation and CN represents the curve number. This equation is expressed in metric units, with depths in mm.
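As an illustration of Eq. (1), the following is a minimal NumPy sketch, not the authors' implementation. The function name `effective_precipitation` is hypothetical, and setting EP to zero when TP does not exceed the initial abstraction term (5080/CN − 50.8) is an assumption based on standard SCS curve number practice; the text does not state how such cells were handled.

```python
import numpy as np

def effective_precipitation(tp_mm, cn):
    """Runoff depth EP (mm) from total precipitation TP (mm) and curve number CN, per Eq. (1)."""
    tp = np.asarray(tp_mm, dtype=float)
    cn = np.asarray(cn, dtype=float)
    initial_abstraction = 5080.0 / cn - 50.8        # losses before runoff starts, in mm
    numerator = (tp - initial_abstraction) ** 2
    denominator = tp + 20320.0 / cn - 203.2
    ep = numerator / denominator
    # Assumption (not stated in the text): no runoff until TP exceeds the initial abstraction.
    return np.where(tp > initial_abstraction, ep, 0.0)

# Hypothetical usage: 100 mm of rain on a cell with CN = 80.
ep_example = effective_precipitation(100.0, 80.0)
```

For a fully impervious cell (CN = 100) both loss terms vanish and EP equals TP, which is a quick sanity check on the constants.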

The TWI is a quantitative factor incorporating slope and flow accumulation to identify areas with high water accumulation potential (Khosravi et al. Citation2016; Tehrany and Kumar Citation2018; Bui et al. Citation2020; Costache, Ngo, et al. Citation2020). Areas having a higher TWI are more susceptible to flooding than areas with a lower TWI value. It is calculated as follows:

$TWI = \ln \left(\frac{\alpha}{\tan \beta}\right)$ (2)

where $\alpha$ is the upslope catchment area draining through a point and $\beta$ is the slope angle at that point.
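Eq. (2) can be evaluated per raster cell as in the sketch below. The function name and the `min_slope_deg` floor are assumptions added for illustration: a small minimum slope is a common way to avoid dividing by zero on perfectly flat cells, and the source does not say how flat cells were treated.

```python
import numpy as np

def topographic_wetness_index(upslope_area, slope_deg, min_slope_deg=0.1):
    """TWI = ln(alpha / tan(beta)) per Eq. (2), with slope given in degrees."""
    alpha = np.asarray(upslope_area, dtype=float)
    slope = np.asarray(slope_deg, dtype=float)
    # Assumption: clamp very flat cells so tan(beta) is never zero.
    beta = np.radians(np.maximum(slope, min_slope_deg))
    return np.log(alpha / np.tan(beta))

# Hypothetical usage: the same upslope area on a gentle and a steep cell.
twi_values = topographic_wetness_index([500.0, 500.0], [1.0, 15.0])
```

The gentler cell receives the larger TWI, consistent with flatter, high-accumulation areas being more flood prone.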

Finally, HAND impacts flood susceptibility as low-lying areas adjacent to streams are more susceptible to flooding, in contrast to higher-lying ground (Rincón et al. Citation2018). This layer was generated using the HAND tool, which is part of the Riparian Topography Toolbox created by Dilts (Citation2015). Simply, HAND is the elevation difference between a grid cell and its nearest drainage cell.

The FCF layers for the Don River are provided in Figure 2, whereas the input data source for each layer is provided in Table 1.

Figure 2. FCF layers for the Don River.

Table 1. FCFs and their associated data sources.


To ensure consistency of units across these six FCFs for each of the three proposed methods, ANN-SMOTE, AHP and Entropy, each input value was normalized based on whether larger or smaller values corresponded to greater flood susceptibility (Mahmoody Vanolya and Jelokhani-Niaraki Citation2021). Datasets in which larger values represent a higher flood susceptibility potential were maximised, such that the highest value in the dataset was assigned a 1 and the lowest value a 0. Conversely, minimum normalization was performed on datasets where lower values represent a higher flood susceptibility, such that the lowest value was assigned a 1. For example, larger values of CN lead to greater flood susceptibility so maximum normalization was applied, whereas smaller values of DS correspond to greater flood susceptibility, so minimum normalization was applied. The maximum and minimum normalization are given by Eqs. (3) and (4), respectively:

$a_{ij} = \frac{S_{ij} - S_{j}^{min}}{S_{j}^{max} - S_{j}^{min}}$ (3)

$a_{ij} = \frac{S_{j}^{max} - S_{ij}}{S_{j}^{max} - S_{j}^{min}}$ (4)

where $S_{ij}$ is the value of the ith pixel in the jth dataset, $S_{j}^{max}$ is the maximum value of the jth dataset, $S_{j}^{min}$ is the minimum value of the jth dataset and $a_{ij}$ is the standardized pixel value at location i for the jth dataset. Accordingly, CN, TWI and EP were maximised whereas slope, HAND and DS were minimised.
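The two normalizations in Eqs. (3) and (4) differ only in direction, which the sketch below captures with a single flag. The function name `normalize_fcf` and the `higher_is_riskier` parameter are hypothetical, and raising an error for a constant layer is an assumption, since the equations are undefined when the maximum equals the minimum.

```python
import numpy as np

def normalize_fcf(values, higher_is_riskier):
    """Scale one FCF layer to [0, 1] so that 1 always means highest flood susceptibility."""
    s = np.asarray(values, dtype=float)
    s_min, s_max = np.nanmin(s), np.nanmax(s)
    span = s_max - s_min
    if span == 0:
        # Assumption: a constant layer cannot be normalized with Eqs. (3)-(4).
        raise ValueError("layer has no variation; normalization is undefined")
    if higher_is_riskier:
        return (s - s_min) / span      # Eq. (3): maximum normalization (CN, TWI, EP)
    return (s_max - s) / span          # Eq. (4): minimum normalization (slope, HAND, DS)

# Hypothetical usage with made-up cell values.
cn_norm = normalize_fcf([30.0, 64.0, 98.0], higher_is_riskier=True)
ds_norm = normalize_fcf([0.0, 50.0, 200.0], higher_is_riskier=False)
```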

2.2.1. Similarities of FCFs across selected watersheds

As ANN generalizability is being assessed, an investigation of the similarities and differences between the Don River FCFs and those of the independent test watersheds is necessary. To do this, the probability density functions (PDFs) of each FCF for each watershed were plotted and are presented in Subsection 3.1. These plots provide a visual indication of the shared topographic and physical characteristics across the five urban watersheds, allowing a greater understanding of where ANN generalization is applicable. For example, if the watersheds have similar slope distributions, we can surmise that the ANN will perform consistently across them. This is of vital importance as it supports the ANN as an accurate approach to FSM within these urban watersheds, one that may be used in place of its time consuming, data-intensive and subjective physical and qualitative model counterparts.

2.3. Flood inventory

Due to the nature of fluvial flooding, and the increased susceptibility of floodplain adjacent areas to flooding (Zhao et al. Citation2020), the models were trained on and validated against a floodplain layer obtained from the TRCA. This layer delineates the extent of the floodplain through both hydrological and hydraulic modelling. The hydrological model simulates the flow within rivers and streams based on the rainfall resulting from a 100-year or regional storm. The hydraulic model then shows the inundated areas and extent of the floodplain (How Does TRCA Define Flood Risk? 2023).

2.4. Analytical hierarchy process

AHP is a pairwise comparison framework based on expert knowledge contribution for multicriteria and multi-objective decision making (Gudiyangada Nachappa et al. Citation2020; Bunmi Mudashiru et al. Citation2022). In this method, developed by Saaty (Citation1988), the criteria are ranked against each other using an established scale ranging from 1 to 9, where 1 is equal relevance and 9 is extreme relevance. To calculate the AHP rankings, the six FCFs are ranked in a pairwise comparison where variables considered to have equal influence over flood susceptibility are given a ranking of 1. Where one variable is deemed to be more important than the other, a higher ranking is given to that variable. For example, if DS is considered to have moderate importance over CN, the ranking of DS would be 3 when compared to CN and the ranking of CN would be 1/3 when compared to DS. The pairwise comparison matrix is then normalized using the eigenvector technique and tested for consistency using the consistency index (CI).

The CI is calculated as follows:

$CI = \frac{\lambda_{max} - n}{n - 1}$ (5)

where $\lambda_{max}$ is the maximum eigenvalue of the pairwise comparison matrix and n is the number of criteria. For the AHP method to be consistent, the CI must be below 0.1.
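The eigenvector weighting and Eq. (5) can be sketched with NumPy as follows. The function name `ahp_weights` and the 3-by-3 matrix of judgments are hypothetical illustrations, not the pairwise comparisons used in this study.

```python
import numpy as np

def ahp_weights(pairwise):
    """Return (weights, CI) for a reciprocal pairwise comparison matrix."""
    a = np.asarray(pairwise, dtype=float)
    n = a.shape[0]
    eigvals, eigvecs = np.linalg.eig(a)
    k = np.argmax(eigvals.real)                 # principal (largest) eigenvalue
    lam_max = eigvals[k].real
    w = np.abs(eigvecs[:, k].real)
    w = w / w.sum()                             # normalize the eigenvector so weights sum to 1
    ci = (lam_max - n) / (n - 1)                # Eq. (5)
    return w, ci

# Hypothetical judgments for three criteria (e.g. DS vs CN = 3, so CN vs DS = 1/3).
example_matrix = [[1.0, 3.0, 5.0],
                  [1.0 / 3.0, 1.0, 2.0],
                  [1.0 / 5.0, 1.0 / 2.0, 1.0]]
example_weights, example_ci = ahp_weights(example_matrix)
```

A perfectly consistent matrix, in which every entry equals the ratio of the underlying weights, gives a maximum eigenvalue equal to n and therefore a CI of zero; larger CI values indicate more contradictory judgments.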

2.5. Shannon entropy

The Entropy ranking method is based on the principle of Shannon Entropy, which measures the amount of diversity within the dataset. The more diverse a dataset is across its values, the greater its Entropy-based rank will be (Mahmoody Vanolya and Jelokhani-Niaraki Citation2021). The Entropy of the jth dataset can be calculated as follows:

$E_{j} = - \frac{\sum_{i = 1}^{m} P_{ij} \ln (P_{ij})}{\ln (m)}$ (6)

where $E_{j}$ is the Entropy value, m is the dataset size and $P_{ij}$ is calculated as follows for each normalized pixel value:

$P_{ij} = \frac{a_{ij}}{\sum_{i = 1}^{m} a_{ij}}$ (7)

where $a_{ij}$ is the ith pixel value for the jth FCF. The degree of diversity, $b_{j}$, in the dataset is then calculated as:

$b_{j} = 1 - E_{j}$ (8)

The more diverse a dataset is, the higher the value of $b_{j}$. A larger $b_{j}$ corresponds to a smaller Entropy value and a larger Entropy-based ranking, $W_{E_{j}}$, which can be calculated as follows:

$W_{E_{j}} = \frac{b_{j}}{\sum_{j = 1}^{n} b_{j}}$ (9)

where n is the number of FCFs.

Therefore, this demonstrates that a dataset is considered more important if it is more diverse. If a dataset is completely homogenous, its criterion ranking will be 0 and subsequently it will be removed from the decision-making dataset.

Through this approach, the FCFs are ranked solely on the diversity within each dataset. If there is little variation in the values of a dataset, it will be considered to have little importance to flood susceptibility and its weighting will reflect that. For example, if the slope within the entire watershed is consistently shallow, it will have a smaller ranking than if there was more variation within the topography. This method prioritizes variables with greater diversity; however, diversity alone may not give a holistic depiction of a variable's importance to flood susceptibility.
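Eqs. (6) to (9) can be chained into one short routine, sketched here with NumPy. The function name `entropy_weights` is hypothetical, and treating 0 · ln(0) as 0 is an assumption (the usual convention) needed because normalized pixel values can be exactly zero.

```python
import numpy as np

def entropy_weights(a):
    """Entropy-based weights for an (m pixels x n FCFs) array of normalized values."""
    a = np.asarray(a, dtype=float)
    m = a.shape[0]
    p = a / a.sum(axis=0, keepdims=True)                  # Eq. (7)
    # Assumption: 0 * ln(0) is taken as 0, so zero-valued pixels contribute nothing.
    safe_p = np.where(p > 0, p, 1.0)
    plogp = np.where(p > 0, p * np.log(safe_p), 0.0)
    e = -plogp.sum(axis=0) / np.log(m)                    # Eq. (6)
    b = 1.0 - e                                           # Eq. (8)
    return b / b.sum()                                    # Eq. (9)

# Hypothetical usage: one spatially uniform layer and one highly concentrated layer.
toy_layers = np.array([[0.5, 1.0],
                       [0.5, 0.0],
                       [0.5, 0.0],
                       [0.5, 0.0]])
toy_weights = entropy_weights(toy_layers)
```

In this toy case the uniform layer has an Entropy of 1 and receives a weight of 0, matching the statement that a completely homogenous dataset drops out of the ranking.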

2.6. Artificial neural networks

As demonstrated, shallow ANNs exhibit comparable performance to their deep learning counterparts (Chakrabortty et al. Citation2021; Chen et al. Citation2021; Kalantar et al. Citation2021; Zhang et al. Citation2021), providing a commonly used, robust and easily implementable model, an important advantage to policymakers and modellers. The ANN framework commonly consists of a set of interconnected layers, namely the input, hidden and output layers. Information is relayed across the layers in a sequential feedforward manner. Each input-hidden node connection has an associated multiplicative weight, $w_{k}$, that is randomly initialized at the onset of training, and iteratively adjusted at each training step to minimize the cross-entropy cost function. Similarly, the hidden-output node connection has both randomly initialized weights and biases, $b_{k}$, as well as a nonlinear activation function. In developing this ANN, a trial-and-error procedure was used to determine the number of hidden nodes. A single layer with 10 hidden nodes was selected, as using a larger number of hidden nodes had negligible impact on predictive accuracy. Thus, to avoid increasing model complexity and to align with best practices, a small hidden layer was selected (Khan et al. Citation2018; Snieder et al. Citation2021). Six input nodes were initialized, one for each FCF, where each input was normalized following the procedure described in Subsection 2.2, and one output node predicted flood susceptibility in the range of [0,1]. The scaled conjugate gradient backpropagation algorithm was used along with a sigmoid output activation function. An early stopping procedure was applied to the validation dataset, whereby training was stopped if the error between the predicted and true output increased continuously for 6 epochs, to minimize the risk of overfitting.

Ensemble modelling has been shown to produce more accurate FSMs, compared to using individual ANN models (Ren et al. Citation2016; Choubin et al. Citation2019; Ahmed et al. Citation2021; Yaseen et al. Citation2022). Therefore, a 50-member ANN ensemble was created where each ANN used a 60-20-20% data split for training, validation and testing, respectively. Each new ANN in the ensemble was generated by randomly resampling the original input dataset with replacement, a process known as bootstrapping. Additionally, each model was trained, validated and tested on a different subset of the original dataset, producing distinctly initialized models and predictive capabilities. The ANNs were trained and validated against the Don River TRCA modelled floodplain layer to predict flood susceptibility. Model performance was then tested on the entire input dataset and on four independent test watersheds not used in the training process. The training process is adversely impacted by the use of imbalanced datasets (Batista et al. Citation2004; Priscillia et al. Citation2021), which is the case in this study as the nonflooded points make up approximately 90% of the input dataset.
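One way the bootstrapped 60-20-20% splits could be generated is sketched below. The exact resampling order is not given in the text, so drawing a bootstrap sample first and then partitioning it is an assumption, as are the function name `bootstrap_splits` and the fixed seed.

```python
import numpy as np

def bootstrap_splits(n_samples, n_members=50, fractions=(0.6, 0.2, 0.2), seed=0):
    """Return one (train, validation, test) tuple of row indices per ensemble member."""
    rng = np.random.default_rng(seed)
    n_train = int(round(fractions[0] * n_samples))
    n_val = int(round(fractions[1] * n_samples))
    splits = []
    for _ in range(n_members):
        # Resample row indices with replacement (bootstrapping).
        idx = rng.integers(0, n_samples, size=n_samples)
        train = idx[:n_train]
        val = idx[n_train:n_train + n_val]
        test = idx[n_train + n_val:]
        splits.append((train, val, test))
    return splits

# Hypothetical usage on a dataset of 1000 rows.
member_splits = bootstrap_splits(1000)
```

Because sampling is with replacement, the same row can appear in more than one subset of a member under this sketch; a stricter design would partition first and resample within the training subset only.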

2.6.1. Data imbalances - SMOTE

To increase the number of underrepresented flooded points, SMOTE is used in conjunction with the ANN ensemble to create a hybrid ANN-SMOTE ensemble approach. This approach increases the number of underrepresented samples in the dataset by finding the K nearest neighbours (KNN) of a minority sample, based on Euclidean distance, and calculating the difference between the sample and one of these neighbours. This difference is multiplied by a random number between 0 and 1 and added back to the minority sample, effectively creating a random point along the line segment between the minority sample and its neighbour (Chawla et al. Citation2002). To remain consistent with common practices, five nearest neighbours were used, creating an additional 100,000 input data points for the base Don River model, for a total of approximately 507,000 input points. Within FSM studies, however, it is common practice to develop an FPI consisting of a balanced set of 100–200 flooded and nonflooded data points (Costache, Țîncu, et al. Citation2020; Pham et al. Citation2020; Wang et al. Citation2020; Ahmadlou et al. Citation2021; Shahabi et al. Citation2021). The performance implications of using a much smaller training dataset in comparison to the ANN-SMOTE should be explored.
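The interpolation step at the core of SMOTE can be written in a few lines of NumPy, as sketched below. This is a simplified illustration rather than the implementation used in the study: the function name `smote_sample` is hypothetical, and the brute-force distance matrix only suits small minority sets, not the hundreds of thousands of points used here.

```python
import numpy as np

def smote_sample(minority, n_new, k=5, seed=0):
    """Generate n_new synthetic minority samples by interpolating toward nearest neighbours."""
    x = np.asarray(minority, dtype=float)          # shape (n_minority, n_features)
    rng = np.random.default_rng(seed)
    # Pairwise Euclidean distances; a sample is never its own neighbour.
    d = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)
    neighbours = np.argsort(d, axis=1)[:, :k]      # indices of the k nearest neighbours
    base = rng.integers(0, len(x), size=n_new)     # minority sample to start from
    pick = neighbours[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))                   # random factor between 0 and 1
    return x[base] + gap * (x[pick] - x[base])     # point on the segment base -> neighbour

# Hypothetical usage: 10 flooded points described by 6 normalized FCFs.
flooded_points = np.random.default_rng(1).random((10, 6))
synthetic_points = smote_sample(flooded_points, n_new=20, k=5)
```

Each synthetic point lies between two real flooded points, so it stays inside the range of values already observed for every FCF.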

2.6.2. Varying the number of training points

A second ANN ensemble approach is developed in this study to assess the impact of the number of training points on predicted FSM accuracy. To remain consistent with standard practice, 200 nonflooded and 200 flooded data points were randomly selected from the Don River watershed and floodplain layer, respectively, for a total of 400 points. This approach will be referred to as the ANN-FPI and will be compared with the ANN-SMOTE approach. To predict flood susceptibility across the entire watershed, the complete FCF dataset for each watershed was used as input to each respective ANN ensemble approach. That is, although the ANN-FPI approach was trained on 240 out of the 400 points, the input dataset to predict flood susceptibility across the Don River consisted of each 30 m-by-30 m pixel, resulting in 407,249 input points for each of the 6 FCFs (a 407,249 by 6 input matrix). Throughout this study, the ANN-SMOTE approach is used, except where the comparison to ANN-FPI is conducted.
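Building a balanced flood point inventory amounts to drawing an equal number of flooded and nonflooded pixels at random, as in the sketch below. The function name `sample_fpi`, the seed and the synthetic label raster are assumptions for illustration only.

```python
import numpy as np

def sample_fpi(flooded_mask, n_per_class=200, seed=0):
    """Return indices of n_per_class flooded and n_per_class nonflooded pixels."""
    rng = np.random.default_rng(seed)
    flooded = np.asarray(flooded_mask).astype(bool).ravel()
    flooded_idx = rng.choice(np.flatnonzero(flooded), size=n_per_class, replace=False)
    dry_idx = rng.choice(np.flatnonzero(~flooded), size=n_per_class, replace=False)
    return np.concatenate([flooded_idx, dry_idx])

# Hypothetical raster of 5000 pixels in which roughly 10% are flooded.
toy_mask = np.random.default_rng(2).random(5000) < 0.10
fpi_indices = sample_fpi(toy_mask)
```

With a 60-20-20% split, 240 of these 400 points would be used for training, as noted above.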

2.6.3. Combined neural pathway strength

Knowing which FCF has the greatest propensity for inducing flood susceptibility within a region is a critical piece of information for policymakers, affecting their ability to accurately allocate flood preventative measures. As such, this information is extracted from the ANN ensemble using the neural weights between the inputs and output, which signify each input’s predictive capability for the output. These weights can be extracted from the ensemble of nets to quantify the relative importance of each input, through a process known as Combined Neural Pathway Strength (CNPS) (Snieder et al. Citation2020).

The CNPS method calculates the neural pathway strength between the inputs and output based on their weights and ranks the inputs in order of highest pathway strength. This is first done through a matrix multiplication of the input-hidden weights and hidden-output weights. Then, the inputs are ranked based on the consistency of their positive or negative correlation to the output. This is calculated as follows:

$\alpha_{i} = \frac{\max \left(\sum ({CNPS}_{i} > 0), \sum ({CNPS}_{i} < 0)\right)}{n}$ (10)

where ${CNPS}_{i}$ represents the matrix multiplication of the weights for input i and n is the number of ensemble members. If in an ensemble of 50 ANNs, an input exhibits positive correlation with the output 50 times, it will have a consistency score of 1. If there are multiple inputs with a consistency score of 1, they are ranked on the relative range of their CNPS values as follows:

$S = \frac{\min ({CNPS}_{i})}{\max ({CNPS}_{i})}$ (11)

Inputs with a smaller CNPS range will receive a higher score, with a maximum score of 1 indicating the same stable behaviour across the entire ensemble. Using CNPS in this manner helps explain which inputs are most influential in predicting flood susceptibility.
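
The two scores in equations (10) and (11) can be illustrated with a small sketch. It assumes single-output networks with one hidden layer and uses made-up weights; the function names are illustrative and are not taken from Snieder et al.

```python
def pathway_strength(w_input_hidden, w_hidden_output):
    """Multiply input-hidden weights (n_inputs x n_hidden) by the
    hidden-output weights (n_hidden) of one single-output network."""
    return [sum(w_ih * w_ho for w_ih, w_ho in zip(row, w_hidden_output))
            for row in w_input_hidden]


def consistency_score(cnps_values):
    """Equation (10): share of ensemble members agreeing on the sign."""
    positives = sum(1 for value in cnps_values if value > 0)
    negatives = sum(1 for value in cnps_values if value < 0)
    return max(positives, negatives) / len(cnps_values)


def range_score(cnps_values):
    """Equation (11): ratio of the smallest to the largest CNPS value."""
    return min(cnps_values) / max(cnps_values)


# Toy ensemble: three networks, two inputs, two hidden neurons (made-up weights).
ensemble = [
    ([[0.5, 0.2], [-0.3, 0.1]], [1.0, 2.0]),
    ([[0.4, 0.3], [0.2, -0.1]], [1.0, 1.0]),
    ([[1.0, 0.0], [0.0, -0.5]], [0.8, 1.0]),
]
strengths = [pathway_strength(w_ih, w_ho) for w_ih, w_ho in ensemble]
per_input = list(zip(*strengths))  # one tuple of CNPS values per input
alphas = [consistency_score(values) for values in per_input]
first_input_range = range_score(per_input[0])
print(alphas, first_input_range)
```

In this toy case the first input is positive in all three networks (consistency score of 1), while the second is negative in only two of three. The example applies the range score only to an input with all-positive values; how equation (11) is applied to all-negative values is not specified in this section.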

2.7. Performance metrics

To analyze and validate predictive performance, the area under the Receiver Operating Characteristic curve (AUC-ROC) and the overall accuracy (OA) were used. The AUC-ROC is a commonly used metric; the ROC curve is constructed by plotting sensitivity on the y-axis and (1 – specificity) on the x-axis (Shafizadeh-Moghadam et al. Citation2018; Rahmati et al. Citation2020; Wang et al. Citation2020). Sensitivity measures the ratio of accurately predicted flood locations to the total number of flood locations, while specificity measures the ratio of accurately predicted nonflooded locations to the total number of nonflooded locations. The AUC-ROC is indicative of predictive capability, and is bounded between 0 and 1, where a larger value indicates better model performance. OA measures the ratio of correct predictions to the total number of data points and is calculated as follows:

$OA = \frac{TP + TN}{TP + TN + FP + FN}$ (12)

where TP (true positive) and TN (true negative) denote correctly classified flooded and nonflooded locations, respectively. FP (false positive) denotes nonflooded locations incorrectly classified as flooded, and FN (false negative) denotes flooded locations incorrectly classified as nonflooded. As both ANN-FPI and ANN-SMOTE predict flood susceptibility and not a binary classification of flood occurrence, the predictions must be translated to a binary classification for the calculation of the OA components. This is done by taking each unique flood susceptibility value as a threshold for binary reclassification. For example, if the flood susceptibility predictions range from 0 to 1, then for each threshold, all predictions greater than or equal to the threshold are classified as 1 and all predictions less than the threshold are classified as 0.
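
The thresholding procedure can be sketched in a few lines. This is an illustrative example on made-up data, not the authors' code. Averaging OA across thresholds follows the note in Section 3.3, but the use of a simple arithmetic mean and the trapezoidal integration of the ROC curve are assumptions.

```python
def confusion_counts(scores, labels, threshold):
    """Return (TP, TN, FP, FN) after reclassifying scores at a threshold."""
    tp = tn = fp = fn = 0
    for score, label in zip(scores, labels):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 0 and label == 0:
            tn += 1
        elif predicted == 1 and label == 0:
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn


def overall_accuracy(tp, tn, fp, fn):
    """Equation (12)."""
    return (tp + tn) / (tp + tn + fp + fn)


def mean_overall_accuracy(scores, labels):
    """Average OA over every unique prediction used as a threshold."""
    thresholds = sorted(set(scores))
    values = [overall_accuracy(*confusion_counts(scores, labels, t))
              for t in thresholds]
    return sum(values) / len(values)


def auc_roc(scores, labels):
    """Trapezoidal area under (1 - specificity, sensitivity) points.

    Assumes both classes are present in labels.
    """
    points = {(0.0, 0.0)}
    for t in set(scores):
        tp, tn, fp, fn = confusion_counts(scores, labels, t)
        points.add((fp / (fp + tn), tp / (tp + fn)))
    ordered = sorted(points)
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(ordered, ordered[1:]))


# Made-up susceptibility predictions and flooded (1) / nonflooded (0) labels.
scores = [0.1, 0.4, 0.35, 0.8]
labels = [0, 0, 1, 1]
print(mean_overall_accuracy(scores, labels))  # 0.625
print(auc_roc(scores, labels))                # 0.75
```

Because OA is averaged over all thresholds, thresholds that classify nearly everything as flooded or as nonflooded pull the average toward the class proportions, which is consistent with the moderate OA values reported alongside high AUC-ROC values.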

Within the ANN ensemble model, the median of all predictions was calculated and then used in the AUC-ROC and OA calculations. For the Don River model, the complete predicted median dataset was used to calculate AUC-ROC and OA; similarly, the complete dataset of each of the four independent test watersheds was used to calculate the test performance of the Don River model for both ANN approaches.
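
Taking the ensemble median per pixel can be illustrated as follows. The data layout (one list of per-pixel predictions for each ensemble member) and the values are assumptions made for this example.

```python
from statistics import median


def ensemble_median(member_predictions):
    """Median across ensemble members for each pixel.

    member_predictions holds one list of per-pixel predictions per member.
    """
    return [median(pixel_values) for pixel_values in zip(*member_predictions)]


# Three hypothetical ensemble members predicting three pixels.
member_predictions = [
    [0.1, 0.9, 0.5],
    [0.2, 0.7, 0.4],
    [0.3, 0.8, 0.6],
]
median_map = ensemble_median(member_predictions)
print(median_map)  # [0.2, 0.8, 0.5]
```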

2.8. Multicollinearity analysis

Multicollinearity occurs when highly correlated factors are introduced in the modelling process, leading to a potentially biased output (Wang et al. Citation2021). Therefore, this is tested for through the Variance Inflation Factor (VIF), which is calculated as follows:

$VIF = \frac{1}{1 - R^{2}}$ (13)

where $R^{2}$ is the coefficient of determination. VIF values below 10 are commonly taken to indicate no multicollinearity problem, whilst values above 10 are problematic and indicate that those variables should be removed from the input dataset (Rahmati et al. Citation2020).
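
A dependency-free sketch of the VIF calculation is given below. Rather than fitting one regression per factor, it uses the equivalent identity that the VIF of factor j is the j-th diagonal entry of the inverse correlation matrix; this shortcut, the function names and the data are illustrative choices, not the authors' procedure.

```python
import math


def correlation_matrix(columns):
    """Pearson correlation matrix for a list of equally long columns."""
    n = len(columns[0])
    means = [sum(column) / n for column in columns]
    centered = [[v - m for v in column] for column, m in zip(columns, means)]
    norms = [math.sqrt(sum(v * v for v in column)) for column in centered]
    k = len(columns)
    return [[sum(a * b for a, b in zip(centered[i], centered[j]))
             / (norms[i] * norms[j]) for j in range(k)] for i in range(k)]


def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting."""
    k = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(k)]
           for i, row in enumerate(matrix)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("factors are perfectly collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(k):
            if r != col:
                factor = aug[r][col]
                aug[r] = [rv - factor * cv for rv, cv in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]


def vif(columns):
    """VIF per factor: diagonal of the inverse correlation matrix."""
    inverse = invert(correlation_matrix(columns))
    return [inverse[j][j] for j in range(len(columns))]


# Two made-up factors with a correlation of 0.6, so R^2 = 0.36.
factor_a = [1.0, 2.0, 3.0, 4.0]
factor_b = [2.0, 1.0, 4.0, 3.0]
vif_values = vif([factor_a, factor_b])
print(vif_values)  # both close to 1 / (1 - 0.36) = 1.5625
```

For reference, a VIF of 10 corresponds to $R^{2} = 0.9$ in equation (13).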

3. Results

3.1. FCF distribution across all watersheds

Figure 3 provides the pdfs of the FCFs across the five watersheds. All five watersheds are relatively flat, with typical slope values below three degrees. Since slope is used in the TWI calculation, the TWI pdfs are also similar across the watersheds. The differences are more apparent in the CN: although the watersheds are all highly urbanized, the degree of imperviousness differs. The Don River and Highland watersheds both have a large diversity of DS values, while Etobicoke, Humber and Mimico have a larger concentration of regions close to streams. Finally, HAND values are typically concentrated within the lower bounds, indicating low-lying regions close to flow accumulation regions. Humber, however, has more low-lying regions close to the flow accumulation regions than the other watersheds. These results suggest that there are similarities within the topography of the watersheds, as seen by slope and TWI, and more pronounced variations in DS, HAND and CN.

Figure 3. FCF probability distribution comparison across all watersheds.

3.2. Multicollinearity

The multicollinearity test was conducted on the input dataset to ensure there is no strong correlation between the variables. The VIF values are presented in Table 2. None of the FCFs exhibits a VIF value greater than 10; therefore, there is no strong correlation between them, and they are all included in the input dataset for the three methods.

Table 2. Variance inflation factor values for each of the FCFs.

3.3. ANN, AHP and entropy comparisons

The AHP rankings were derived using expert knowledge and are shown in Table 3. The FCFs were classified into five categories of flood susceptibility: very low, low, medium, high and very high using the natural breaks method. These rankings lead to a CI value of 0.007, which is below the 0.1 threshold for consistency.

Table 3. FCF rankings for AHP.

Table 4 lists the FCF rankings obtained for all three methods: the ANN-SMOTE, the AHP and the Entropy weightings. The ANN-SMOTE rankings are obtained through CNPS, and the Entropy rankings are assigned based on their degree of diversity, as explained in Subsection 2.5. Each method assigns a different ranking to the FCFs, representative of each FCF's influence on flood susceptibility. These rankings are highly informative to policymakers as they can be used as explanatory variables, representative of the underlying physical characteristics with the greatest propensity for flooding.

Table 4. Ranking scheme for the FCFs for each FSM method.

The resulting flood susceptibility maps of the Don River from the ANN-SMOTE, AHP and Entropy are shown in Figure 4, along with the actual floodplain map. The AHP and Entropy FSMs were derived using the weighted overlay method in ArcGIS. Each FCF is overlaid according to the rankings given in Table 4. Each pixel within the watershed is multiplied by the FCF ranking, and the final map is produced by summing the FCF pixels after each has been multiplied by its associated ranking. These maps show five categories of susceptibility: very low, low, medium, high and very high. ANN-SMOTE’s predictions were classified into the same five categories using the equal interval method in ArcGIS, where pixels with a flood susceptibility value below 20% were rated very low, pixels between 20% and 40% were rated low and so forth. With 84% of pixels classified as very low flood susceptibility, ANN-SMOTE had the highest proportion of very low susceptibility pixels. The very high susceptibility pixels accounted for 3.77% of the total watershed flood susceptibility and were mainly concentrated along the floodplain. Results from AHP and Entropy did not exhibit a similar spatial pattern, with more spread-out locations of very high flood susceptibility. Both methods had most of the flood susceptibility points concentrated in the high and medium flood susceptibility categories. The AHP had 60.6% and 31.7% susceptibility in the high and medium categories, respectively, whereas Entropy yielded 39.8% and 49.8%, respectively. Very low flood susceptibility was not recorded in the AHP method, and it occupied only 0.01% in the Entropy method, a very large difference from the ANN-SMOTE results.
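
The overlay and classification steps can be illustrated outside of ArcGIS with a short sketch. The weights, class scores and function names below are hypothetical and are not the values used in this study; only the 20% class breakpoints come from the text.

```python
from bisect import bisect_right

CLASSES = ["very low", "low", "medium", "high", "very high"]
BREAKS = [0.2, 0.4, 0.6, 0.8]  # equal-interval breakpoints on a 0-1 scale


def equal_interval_class(susceptibility):
    """Map a susceptibility value in [0, 1] to one of five classes."""
    return CLASSES[bisect_right(BREAKS, susceptibility)]


def weighted_overlay(layers, weights):
    """Sum each FCF layer multiplied by its weight, pixel by pixel."""
    n_pixels = len(next(iter(layers.values())))
    return [sum(weights[name] * layers[name][p] for name in weights)
            for p in range(n_pixels)]


# Hypothetical class scores (1-5) for two pixels and made-up weights.
layers = {"DS": [5, 1], "HAND": [4, 2]}
weights = {"DS": 0.6, "HAND": 0.4}
overlay = weighted_overlay(layers, weights)
print(overlay)  # approximately [4.6, 1.4]

print(equal_interval_class(0.05))  # very low
print(equal_interval_class(0.25))  # low
print(equal_interval_class(0.95))  # very high
```

A value exactly on a breakpoint, such as 0.2, falls into the higher class here, matching the description of the low class as covering 20% to 40%.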

Figure 4. A comparison of the FSMs resultant from ANN-SMOTE (top right), Entropy (bottom left) and AHP (bottom right) along with the actual floodplain (top left).

Figure 5 provides the AUC-ROC results, with the dashed green line representing the random predictor model performance. Any lines falling below the dashed line indicate worse performance than randomly predicting whether a flood occurs or not at a given location. Table 5 shows the OA and AUC-ROC scores. AUC-ROC was the highest for ANN-SMOTE with a value of 0.902, whereas AHP and Entropy achieved values of 0.650 and 0.472, respectively. Higher AUC-ROC values indicate better model performance, so ANN-SMOTE exhibits the best performance. In OA, ANN-SMOTE has the greatest accuracy with 54.9%, whereas AHP and Entropy follow behind with 40.4% and 45.2%, respectively. It is worth noting that these values of OA are averaged over a range of threshold values and, as such, high OA values are neutralized by low OA values. Nonetheless, these results suggest that across the threshold values for all three methods, ANN-SMOTE demonstrated greater accuracy in classifying flood susceptibility across the watershed than the other two methods.

Figure 5. AUC-ROC performance comparison for the ANN, AHP and Entropy for the Don River.

Table 5. Performance metrics for the ANN-SMOTE, AHP and Entropy methods on the Don River.

Given the lower performance metrics of AHP and Entropy, the subjective nature and potential for bias characteristic of AHP, and the lack of consideration of interdependence between the inputs and output in Entropy, the ANN-SMOTE will be the sole model explored further, owing to its performance, ease of implementation and data-driven nature.

3.4. Testing ANN generalizability

To test ANN-SMOTE’s ability to generalize flood susceptibility, it was applied to four independent watersheds, which were not included in the training of the ANN. The resulting ANN-SMOTE flood susceptibility maps and each watershed’s floodplain are compared in Figure 6, and the performance metrics for each watershed are given in Table 6. The floodplain maps show the extent of flood water in the river valley and potential spill areas where the flood waters were not contained within the river valley.

Figure 6. Test watersheds predicted by the Don River ANN-SMOTE model. From top to bottom: Etobicoke (1st row). Mimico (2nd row). Humber (3rd row). Highland (4th row).

Table 6. Performance metrics for the four test watersheds.

In the Highland model, ANN-SMOTE had the highest AUC-ROC performance metric, but the lowest OA in comparison with the other modelled watersheds, with an OA of 53.2%. Generally, OA values remained in the mid-50% range, with the Humber, Mimico and Etobicoke models demonstrating OA values of 57.3%, 55.7% and 56.9%, respectively. Given the pdfs shown in Figure 3, the consistent prediction accuracy across the watersheds is expected, as the Don River and the independent test watersheds share topographic similarities. Additionally, from Figure 6, the predicted FSMs for these independent test watersheds align very closely with the modelled TRCA floodplain. The very high susceptibility regions are concentrated along the floodplain for each watershed, with most of the high susceptibility areas also centered around the floodplain. These prediction results and shared similarities demonstrate the ability of ANN-SMOTE to accurately predict flood susceptibility across urban watersheds. This is a significant finding, as these results are particularly important in data-scarce regions where hydrological/hydraulic modelling may be limited due to the large data requirement of these models. Additionally, any outdated FSMs that require updating due to shifting urban patterns could be produced using ANN-SMOTE in this manner with demonstrated prediction accuracy.

ANN-SMOTE’s ability to infer connections between the FCFs and flood susceptibility is clearly demonstrated in the Humber model. As demonstrated in Figure 6, the Humber model exhibits significantly more flood susceptible areas when compared to the floodplain. This is because certain minor tributaries were not included in the hydraulic modelling efforts undertaken by the TRCA. They are, however, included in the ANN-SMOTE’s flood susceptibility map as those streams were not removed from the input set, demonstrating that the ANN-SMOTE trained on the Don River inferred a strong connection between the proximity and existence of streams and flood susceptibility. This is shown through the concentration of predicted very high flood susceptibility around the rivers and tributaries, a finding that aligns with the tied first place ranking of DS by CNPS in Table 4. As such, even though these smaller tributaries were removed from the modelling process for the TRCA modelled floodplain map, their existence within the ANN-SMOTE’s input set means that high flood susceptibility is predicted in these areas. Additionally, the Humber watershed is uniquely characterized by a larger number of smaller streams and tributaries, a characteristic not shared with the other watersheds.

The Mimico model demonstrated the lowest AUC-ROC value; however, its OA results were on par with the other watersheds. This could be attributed to a greater proportion of FPs, as shown by the streams predicted throughout the Mimico model in Figure 6, which are not found within the TRCA floodplain layer. The FPs increase the (1 – specificity) value, i.e. the false positive rate, in AUC-ROC, which is not explicitly calculated in OA, explaining the difference in values.

3.5. A comparison of ANN-SMOTE to ANN-FPI

The accuracy of FSMs generated using a subset of flooded and nonflooded points (ANN-FPI approach) is compared against the FSM generated using the ANN-SMOTE approach. The results for the Don River are shown in Figure 7 (which may be compared with the ANN-SMOTE approach in Figure 4), and a comparison of performance metrics for both approaches is provided in Table 7.

Figure 7. FSM for the Don River produced by the ANN-FPI method.

Table 7. Performance metrics comparison for the ANN-SMOTE and ANN-FPI approach.

As demonstrated by the results, the AUC-ROC metrics are very similar for both approaches, while the OA score deteriorates from 0.549 for ANN-SMOTE to 0.265 for the ANN-FPI approach. Although both metrics are averaged, the results suggest that a greater portion of ANN-FPI predictions demonstrate lower performance than those of the ANN-SMOTE approach. Graphically, this finding is supported by Figure 8, which provides the components of a confusion matrix (namely TP, TN, FP, FN) for both approaches across each threshold within the ANN-SMOTE predictions. As shown in Figure 8(a), the proportion of TN identified by the ANN-SMOTE is greater than that identified by ANN-FPI, suggesting that ANN-FPI more frequently misclassifies nonflooded areas as flooded. This is reiterated in the plot of FP and FN predictions, where the ANN-FPI demonstrates a greater tendency to incorrectly identify nonflooded areas as flooded. The bottom two plots show the individual components of the AUC-ROC curve, alongside the AUC-ROC curve. This is provided to highlight the differences between the individual components, as those differences are lost within the AUC-ROC plot. From Figure 8(c), the false positive rate of ANN-FPI is consistently greater than that of ANN-SMOTE, which is supported by the findings for TN and FP. The sensitivity is greater for ANN-FPI than for ANN-SMOTE until a threshold value of approximately 0.7, after which the opposite becomes true. This suggests that the ANN-FPI approach identifies flooded locations better up to a threshold of 0.7, as also demonstrated by Figure 8(a), where the TP metric of ANN-SMOTE overtakes that of ANN-FPI. These findings suggest that a closer look at the individual components of AUC-ROC is necessary to obtain a more holistic view of performance, as these differences between false positive rate and sensitivity are not as visibly expressed in Figure 8(d).
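
The component-wise comparison described above can be sketched by evaluating sensitivity and false positive rate for two sets of predictions at shared thresholds. The scores, labels and function names are made up for illustration and do not reproduce the study's results.

```python
def rates_at(scores, labels, threshold):
    """Return (sensitivity, false_positive_rate) at one threshold."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    positives = sum(labels)
    negatives = len(labels) - positives
    return tp / positives, fp / negatives


def compare_models(scores_a, scores_b, labels, thresholds):
    """Tabulate both ROC components for two models at each threshold."""
    rows = []
    for t in thresholds:
        sens_a, fpr_a = rates_at(scores_a, labels, t)
        sens_b, fpr_b = rates_at(scores_b, labels, t)
        rows.append({
            "threshold": t,
            "sensitivity": (sens_a, sens_b),
            "false_positive_rate": (fpr_a, fpr_b),
        })
    return rows


# Made-up labels (1 = flooded) and predictions from two hypothetical models.
labels = [1, 1, 0, 0, 0]
model_a_scores = [0.9, 0.6, 0.1, 0.2, 0.05]
model_b_scores = [0.95, 0.8, 0.5, 0.4, 0.3]
rows = compare_models(model_a_scores, model_b_scores, labels, [0.25, 0.5, 0.75])
for row in rows:
    print(row)
```

Tabulating the two components separately makes differences visible that can cancel out in a single AUC-ROC value. In this toy case, model B has the higher false positive rate at low thresholds but the higher sensitivity at the highest threshold.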

Figure 8. Confusion matrix components for the ANN-FPI and ANN-SMOTE approach for the Don River. The plots demonstrate (a) TP and TN metrics, (b) FP and FN metrics, (c) False positive rate and sensitivity and (d) AUC-ROC plots.

Additionally, a comparison of the mean and median, provided in , of both approaches reveals that ANN-FPI overestimates flood susceptibility, indicated by the larger values of both mean and median. For the same input set of FCFs for the Don River, the ANN-FPI approach has a mean flood susceptibility prediction of 26.5%, compared to 11.8% for ANN-SMOTE. These results reiterate the previous results of Figure 8, where it was demonstrated that ANN-FPI has a higher rate of FP misclassifications. This is also graphically demonstrated by the pdf of flood susceptibility, shown in Figure 9. Both approaches display a right skew, which is expected given that there are more areas with lower flood susceptibility. The ANN-FPI approach has a greater spread of flood susceptibility predictions, indicating greater uncertainty and variability within the predictions, whereas the ANN-SMOTE approach shows a much lower spread, with a much larger concentration of predicted data under 10% flood susceptibility.
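
A comparison of this kind can be sketched with summary statistics. The prediction values below are made up, and using the population standard deviation as the measure of spread is an assumption of this example.

```python
from statistics import mean, median, pstdev


def summarize(predictions):
    """Mean, median and spread of a set of susceptibility predictions."""
    return {
        "mean": mean(predictions),
        "median": median(predictions),
        "spread": pstdev(predictions),
    }


# Hypothetical per-pixel predictions from two models for the same inputs.
concentrated = [0.02, 0.03, 0.05, 0.08, 0.30]
dispersed = [0.05, 0.15, 0.25, 0.40, 0.85]
summary_concentrated = summarize(concentrated)
summary_dispersed = summarize(dispersed)
print(summary_concentrated)
print(summary_dispersed)
```

In both toy samples the mean exceeds the median, which is the signature of a right-skewed distribution, and the second sample shows the larger mean, median and spread.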

Figure 9. Probability distribution of the Don River flood susceptibility values for both ANN-SMOTE and ANN-FPI.

A percentage breakdown of each susceptibility class and its respective area is provided in Table 8. The percent difference in Table 8 is between the ANN-FPI and ANN-SMOTE approaches. The ANN-FPI approach overestimates the low, medium and high susceptibility areas and underestimates the very low flood susceptibility areas. This demonstrates a more conservative approach to FSM due to its overestimating tendency.
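
A class breakdown of this kind can be computed from classified pixels as sketched below. The pixel classes are made up; the 30 m-by-30 m pixel size comes from the text, while expressing the difference in percentage points is an assumption, since the exact definition of the percent difference is not given in this section.

```python
from collections import Counter

CLASSES = ["very low", "low", "medium", "high", "very high"]
PIXEL_AREA_KM2 = 30 * 30 / 1_000_000  # one 30 m x 30 m pixel in km^2


def class_breakdown(class_labels):
    """Share of pixels and mapped area for each susceptibility class."""
    counts = Counter(class_labels)
    total = len(class_labels)
    return {name: {"percent": 100 * counts[name] / total,
                   "area_km2": counts[name] * PIXEL_AREA_KM2}
            for name in CLASSES}


def percentage_point_difference(breakdown_a, breakdown_b):
    """Difference in class share (a minus b), in percentage points."""
    return {name: breakdown_a[name]["percent"] - breakdown_b[name]["percent"]
            for name in CLASSES}


# Ten hypothetical classified pixels per approach.
fpi_classes = ["very low"] * 5 + ["low"] * 2 + ["medium"] * 2 + ["very high"]
smote_classes = ["very low"] * 8 + ["very high"] * 2
difference = percentage_point_difference(class_breakdown(fpi_classes),
                                          class_breakdown(smote_classes))
print(difference)
```

A negative value for the very low class together with positive values for the intermediate classes would correspond to the overestimating tendency described above.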

Table 8. A comparison of the Don River map areas designated by their various susceptibility categories for each FSM approach.

These findings demonstrate that the ANN-FPI approach more frequently misclassifies nonflooded regions as flooded, thereby resulting in a decrease in prediction accuracy. These results are presented to highlight this tendency and to bring attention to the conservative nature of these predictions, especially given the widespread use of this approach within FSM studies.

The impact of using the ANN-FPI approach on the test watersheds is evaluated next. The resulting FSMs for the four test watersheds and the performance metrics are provided in Figure 10 and Table 9, respectively.

Figure 10. FSM produced by the Don River ANN-FPI. Top left: Etobicoke. Top right: Mimico. Bottom left: Humber. Bottom right: Highland.

The differences between the ANN-SMOTE and ANN-FPI performance metrics are also provided in Table 9. A positive difference indicates that the ANN-SMOTE approach had better performance, whereas a negative difference indicates that the ANN-FPI approach performed better. Based on these results, there was no substantial change in the AUC-ROC metric. Based on this finding, as well as the findings of Figure 8, the AUC-ROC metric may not be sensitive enough to portray the differences in its individual components. This highlights the importance of multiple performance metrics and suggests AUC-ROC results should be presented along with their individual components for a more accurate depiction of performance. OA values dropped from approximately 55% to 28%. It is worth noting that OA values are averaged, meaning high performance is neutralized by lower performance, so an average of about 50% accuracy is expected. Even so, the drop to 28% signals a decrease in TN predictions, which was shown in the Don River predictions in Figure 8. This drop in OA signifies the conservative approach of ANN-FPI, extended onto the independent test watersheds.

Table 9. Performance metrics for the test watersheds based on the Don River ANN-FPI.

Flood susceptibility of each class as a proportion of the entire watershed is provided in Table 10. Similar to the findings for the Don River, the ANN-FPI approach here demonstrates a conservative estimate of flood susceptibility across the watersheds, underestimating the most extreme flood susceptibility designations (very low and very high) and overestimating the remaining three categories. This is indicative of its lower accuracy relative to ANN-SMOTE and its reduced ability to distinguish between flooded and nonflooded locations, resulting in a more unimodal probability distribution rather than the more realistic bimodal distribution.

Table 10. A comparison of the proportion of watershed areas categorized into their respective flood susceptibility designation.

For completeness, four other ANNs were trained using the inputs of each independent watershed and were used to predict flood susceptibility across the remaining watersheds, including the Don River. This was done for both ANN-FPI and ANN-SMOTE. The findings from the Don River trained ANN were consistent with the results from these ANNs, which are presented in the Appendix.

4. Discussion

One of the requirements of successful flood susceptibility mitigation and prevention is an accurate FSM (Wang et al. Citation2020). Of the various FSM approaches, shallow ANNs were used to address the limitations associated with the computational complexity of physically based models (Giovannettone et al. Citation2018; Choubin et al. Citation2019), the subjective nature of qualitative models (AHP) (Souissi et al. Citation2019; Das Citation2020) and the lack of interdependency consideration in diversity-based quantitative approaches (Entropy) (Malekinezhad et al. Citation2021). The ANN-SMOTE model developed in this study demonstrated strong potential as an accurate FSM tool with good generalization. It can be used as a complementary or standalone model in future studies where other approaches may not be feasible. Finally, using the FPI approach adversely impacts predictive accuracy when compared to using a larger training set, and given its prevalence, this uncertainty should be expressed as a limitation in future studies.

Based on a comparison of the performance metrics, the ANN-SMOTE exceeded AHP and Entropy across all metrics. These findings are supported by Andaryani et al. (Citation2021), where the ANN outperformed FART and SOM. Dahri et al. (Citation2022) compared the performance of Analytic Network Process (ANP) and ANNs and found the ANN to outperform the ANP for FSM. Finally, in landslide susceptibility mapping, Park et al. (Citation2013) found the ANN to have the highest AUC-ROC values out of AHP, RF and FR. Akay (Citation2021) compared multiple FSM approaches, amongst which were Entropy and AHP, and found them to have similar performance, which resonates with this study’s OA findings for both, provided in Table 5. Wang et al. (Citation2021) investigated ANNs against FR and Entropy, and found the ANN to outperform both other approaches, in line with the findings of this research. This research supports the view that ML approaches outperform subjective qualitative approaches (AHP) and diversity-based approaches (Entropy) due to their ability to quantitatively infer connections within the input data that best describe the output.

Ranking variables provides insight into the most critical factors contributing to flood potential, which is beneficial to policymakers (Wang et al. Citation2020). The FCFs were ranked differently by all models; however, there was some similarity between the ANN-SMOTE and AHP ranking schemes, such as DS and HAND being in the top three ranked variables. Shafizadeh-Moghadam et al. (Citation2018) found high flood susceptibility to be concentrated in areas with low DS, and consequently, selected DS as the most important factor in their ML approach to FSM, yielding similar results to our findings. Slope is frequently found to be the most influential factor in FSM studies; however, it was ranked 5th in our study. This ranking is offset by TWI’s 3rd ranking, which factors in slope, such that high flood susceptibility was found along regions with larger values of TWI, in line with findings across other studies (Choubin et al. Citation2019; Vojtek and Vojteková Citation2019; Islam et al. Citation2023). The ability to easily select the most influential FCFs highlights ANN-SMOTE’s utility, in comparison to the more time-consuming ad-hoc trial-and-error procedure of qualitative approaches. This is especially beneficial under climate change considerations, where the most influential FCFs may change under different climate patterns (An et al. Citation2022; Bera et al. Citation2022): ANNs can be easily retrained, whereas their qualitative counterparts will require a more intensive approach.

The generalizability of ANN-SMOTE was tested on four independent watersheds that were not included in the training dataset. The resulting flood susceptibility maps were similar to their respective floodplain maps (which were developed via hydrological and hydraulic modelling), indicating the ANN-SMOTE’s capability to retain the geo-physical relationships between riverine flooding and the given inputs, and to apply them to the test watersheds. As shown in Figure 3, the test watersheds displayed some geo-physical similarities with the Don River, such as slope and HAND; however, there were more pronounced differences within the levels of imperviousness (CN) and resulting runoff (EP). Despite these differences, the ANN-SMOTE maintained performance across both the AUC-ROC and OA indices consistent with that of the Don River study model, as shown in Table 6. These results suggest that ANN models may effectively replace or complement hydraulic/hydrological models in FSM applications, either in data-scarce regions where it is difficult to validate and implement a hydraulic/hydrological model, or to map certain regions within watersheds experiencing shifting urban patterns and mosaic this prediction within the broader FSM. These findings are supported by Seleem et al. (Citation2022), who used RF, ANN and CNN to produce FSMs for the entirety of Berlin using flooded points from the entire region with a high concentration in the central region. Their results demonstrated that the ANN maintained a high accuracy in regions that were not sampled during the training process (Seleem et al. Citation2022). Similarly, Guo et al. (Citation2022) used a CNN to predict flood susceptibility across four independent regions to assess model generalizability in different catchments. Their findings suggest that the CNN maintains a reasonable accuracy in identifying the extent of flooding, even under different landcover conditions such as rural and urban (Guo et al. Citation2022), which aligns with our findings.

It is standard practice in FSM studies to accumulate a flooding point inventory from a collection of historical flood events (Mudashiru et al. Citation2021a). Although common practice, this study found the proposed ANN-SMOTE approach outperforms ANN-FPI across all performance metrics. This is supported by the results of Tehrany and Jones (Citation2017), who varied the FPI values from 1000 to 50, and found an approximate 20% drop in prediction accuracy. The ANN-SMOTE method extends this by facilitating the use of all available data points in the study area in conjunction with a minority oversampling technique to address the data imbalances, and comparing it against the smaller balanced FPI sample. The random selection of flooded events and their location is also addressed by the ANN-SMOTE approach. This arbitrary choice of FPI and number of points in the FPI method is reiterated in the works of Al-Abadi and Pradhan (Citation2020). Their discussion highlights the random nature of point selection without scientific basis, rather an extraction of locations from singular floods in certain cases (Al-Abadi and Pradhan Citation2020). Similarly, Al-Abadi (Citation2018) used approximately 3500 flood polygons to ensure the inclusion of flood potential extent into their model, which is a practice inherent in our model. Therefore, the limitations discussed associated with the FPI approach have been addressed by the ANN-SMOTE procedure.

Using the FPI approach is demonstrated to overestimate areas of medium and high risk, as shown in Table 8 and Table 10. The number of FP predictions was shown to be greater, and the number of TN predictions lower, than for its ANN-SMOTE counterpart, showing that the decrease in the number of training points adversely affects the accuracy of prediction. Interestingly, the Humber model had the highest OA performance and demonstrated only an 8% drop in the ANN-FPI approach. The smaller difference in performance, when compared to the other test watersheds, is attributed to an improved TN prediction over the ANN-SMOTE approach for certain thresholds. In Figure 8, the ANN-FPI consistently underperformed in TN predictions across all thresholds, whereas in the Humber model case, ANN-FPI does outperform ANN-SMOTE in TN predictions over a short range of thresholds, likely contributing to the improved performance in comparison to the other watershed models. The visual comparisons of performance metrics for the Humber model predicted by the Don River model are presented in the Appendix.

Given the overestimation of the FPI approach, it is recommended in future studies to incorporate data from across the entire study area coupled with SMOTE to address data imbalances. Using the prevalent FPI approach results in conservative FSM estimates, inflating the degree of susceptibility within regions. This overestimation of high flood susceptibility may adversely impact flood mitigation efforts as there is no clear demarcation of regions requiring the most efforts.

5. Conclusion

Accurate flood susceptibility maps are of critical importance since they can be used as a tool to help policymakers in identifying flood vulnerable zones. Therefore, a comparison of existing methods and their accuracy is the first step to developing reliable maps. Three approaches to FSM were trained on the Don River and validated against four independent watersheds. The highly popular AHP method, Shannon’s Entropy and the proposed ANN-SMOTE model were compared against each other using the common AUC-ROC and OA metrics. The ANN-SMOTE ensemble (AUC-ROC: 0.902, OA: 0.549) outperformed the other two methods across all performance metrics. It was selected as the superior model as it addresses the limitations associated with subjective (AHP) and diversity-based (Entropy) approaches through its quantitative forming of connections in the input dataset that best align with the output. High accuracy was maintained when the ANN-SMOTE ensemble was applied onto four test watersheds with AUC-ROC results ranging from 0.919 to 0.735. These findings are valuable to policymakers requiring accurate FSM, particularly in regions where high-resolution datasets are not available for physically based models, or in regions experiencing rapid changes (e.g. urbanization) where FSMs need to be frequently regenerated. The ANN-SMOTE also substantially outperformed an ANN ensemble trained using the FPI approach. FSMs generated using FPI suffer from flood susceptibility overestimation. Therefore, it is recommended to incorporate data from the entire study area in conjunction with a minority oversampling technique, such as SMOTE, to address data imbalances. Future research should focus on the impact of data resolution, particularly the threshold of resolution at which prediction is adversely impacted. Additionally, coupling machine learning models with optimization algorithms to enhance the FCF ranking should be explored.

Disclosure statement

The authors report there are no competing interests to declare.

Data availability statement

The raw data that were used in this research are publicly available as listed in the article (). Model data may be obtained from the corresponding author, RK, upon reasonable request.

Correction Statement

This article has been corrected with minor changes. These changes do not impact the academic content of the article.

Additional information

Funding

This work was supported by the Ontario Ministry of Environment Conservation and Parks under Grant TPON Case No.: 2021-02-1-1569389059; The Natural Sciences and Research Council of Canada under Grant RGPIN-2023-05077; Rahma Khalid was supported by a Natural Sciences and Research Council CGS scholarship.

References

Abedi R, Costache R, Shafizadeh-Moghadam H, Pham QB. 2021. Flash-flood susceptibility mapping based on XGBoost, random forest and boosted regression trees. Geocarto Int. 37(19):5479–5496. doi: 10.1080/10106049.2021.1920636.
Web of Science ®Google Scholar
Ahmadlou M, Al-Fugara A, Al-Shabeeb AR, Arora A, Al-Adamat R, Pham QB, Al-Ansari N, Linh NTT, Sajedi H. 2021. Flood susceptibility mapping and assessment using a novel deep learning model combining multilayer perceptron and autoencoder neural networks. J Flood Risk Manage. 14(1):e12683. doi: 10.1111/jfr3.12683.
Web of Science ®Google Scholar
Ahmed N, Hoque MAA, Arabameri A, Pal SC, Chakrabortty R, Jui J. 2021. Flood susceptibility mapping in Brahmaputra floodplain of Bangladesh using deep boost, deep learning neural network, and artificial neural network. Geocarto Int. 37(25):8770–8791. doi: 10.1080/10106049.2021.2005698.
Web of Science ®Google Scholar
Akay H. 2021. Flood hazards susceptibility mapping using statistical, fuzzy logic, and MCDM methods. Soft Comput. 25(14):9325–9346. doi: 10.1007/s00500-021-05903-1.
Web of Science ®Google Scholar
Al-Abadi AM. 2018. Mapping flood susceptibility in an arid region of southern Iraq using ensemble machine learning classifiers: a comparative study. Arab J Geosci. 11(9):1–19. doi: 10.1007/S12517-018-3584-5/TABLES/7.
Web of Science ®Google Scholar
Al-Abadi AM, Pradhan B. 2020. In flood susceptibility assessment, is it scientifically correct to represent flood events as a point vector format and create flood inventory map? J Hydrol. 590:125475. doi: 10.1016/j.jhydrol.2020.125475.
Web of Science ®Google Scholar
Al-Juaidi AEM, Nassar AM, Al-Juaidi OEM. 2018. Evaluation of flood susceptibility mapping using logistic regression and GIS conditioning factors. Arab J Geosci. 11(24):1–10. doi: 10.1007/S12517-018-4095-0/FIGURES/7.
Web of Science ®Google Scholar
Ali SA, Parvin F, Pham QB, Vojtek M, Vojteková J, Costache R, Linh NTT, Nguyen HQ, Ahmad A, Ghorbani MA. 2020. GIS-based comparative assessment of flood susceptibility mapping using hybrid multi-criteria decision-making approach, naïve Bayes tree, bivariate statistics and logistic regression: a case of Topľa basin, Slovakia. Ecol Indic. 117:106620. doi: 10.1016/j.ecolind.2020.106620.
Web of Science ®Google Scholar
An NN, Song Nhut H, Anh Phuong T, Quang Huy V, Cao Hanh N, Thi Phuong Thao G, Chi Minh H, Pham The Trinh V, The Trinh P, Viet Hoa P, et al. 2022. Groundwater simulation in Dak Lak province based on MODFLOW model and climate change scenarios. FEBE. 2(1):55–67. doi: 10.1108/FEBE-11-2021-0055.
Google Scholar
Andaryani S, Nourani V, Haghighi AT, Keesstra S. 2021. Integration of hard and soft supervised machine learning for flood susceptibility mapping. J Environ Manage. 291:112731. doi: 10.1016/J.JENVMAN.2021.112731.
PubMed Web of Science ®Google Scholar
Asadi H, Shahedi K, Jarihani B, Sidle RC. 2019. Rainfall-runoff modelling using hydrological connectivity index and artificial neural network approach. Water. 11(2):212. doi: 10.3390/w11020212.
Web of Science ®Google Scholar
Batista G, Prati RC, Monard MC. 2004. A study of the behavior of several methods for balancing machine learning training data. SIGKDD Explor Newsl. 6(1):20–29. doi: 10.1145/1007730.1007735.
Google Scholar
Bera A, Meraj G, Kanga S, Farooq M, Singh SK, Sahu N, Kumar P. 2022. Vulnerability and risk assessment to climate change in Sagar Island, India. Water. 14(5):823. doi: 10.3390/w14050823.
Web of Science ®Google Scholar
Bui QT, Nguyen QH, Nguyen XL, Pham VD, Nguyen HD, Pham VM. 2020. Verification of novel integrations of swarm intelligence algorithms into deep learning neural network for flood susceptibility mapping. J Hydrol. 581:124379. doi: 10.1016/j.jhydrol.2019.124379.
Web of Science ®Google Scholar
Bunmi Mudashiru R, Sabtu N, Abdullah R, Saleh A, Abustan I. 2022. Optimality of flood influencing factors for flood hazard mapping: an evaluation of two multi-criteria decision-making methods. J Hydrol. 612:128055. doi: 10.1016/j.jhydrol.2022.128055.
Web of Science ®Google Scholar
Cabrera JS, Lee HS. 2020. Flood risk assessment for Davao Oriental in the Philippines using geographic information system-based multi-criteria analysis and the maximum Entropy model. J Flood Risk Manage. 13(2):e12607. doi: 10.1111/jfr3.12607.
Web of Science ®Google Scholar
Chakrabortty R, Chandra Pal S, Rezaie F, Arabameri A, Lee S, Roy P, Saha A, Chowdhuri I, Moayedi H. 2021. Flash-flood hazard susceptibility mapping in Kangsabati River Basin, India. Geocarto Int. 37(23):6713–6735. doi: 10.1080/10106049.2021.1953618.
Web of Science ®Google Scholar
Chawla NV, Bowyer KW, Hall LO, Kegelmeyer WP. 2002. SMOTE: synthetic minority over-sampling technique. arXiv:1106.1813. 16:321–357. doi: 10.1613/jair.953.
Google Scholar
Chen J, Huang G, Chen W. 2021. Towards better flood risk management: assessing flood risk and investigating the potential mechanism based on machine learning models. J Environ Manage. 293:112810. doi: 10.1016/J.JENVMAN.2021.112810.
PubMed Web of Science ®Google Scholar
Cheng M, Fang F, Kinouchi T, Navon IM, Pain CC. 2020. Long lead-time daily and monthly streamflow forecasting using machine learning methods. J Hydrol. 590:125376. doi: 10.1016/j.jhydrol.2020.125376.
Web of Science ®Google Scholar
Choubin B, Moradi E, Golshan M, Adamowski J, Sajedi-Hosseini F, Mosavi A. 2019. An ensemble prediction of flood susceptibility using multivariate discriminant analysis, classification and regression trees, and support vector machines. Sci Total Environ. 651(Pt 2):2087–2096. doi: 10.1016/J.SCITOTENV.2018.10.064.
PubMed Web of Science ®Google Scholar
Costache R, Ngo PTT, Bui DT. 2020. Novel ensembles of deep learning neural network and statistical learning for flash-flood susceptibility mapping. Water. 12(6):1549. doi: 10.3390/w12061549.
Web of Science ®Google Scholar
Costache R, Pham QB, Avand M, Thuy Linh NT, Vojtek M, Vojteková J, Lee S, Khoi DN, Thao Nhi PT, Dung TD. 2020. Novel hybrid models between bivariate statistics, artificial neural networks and boosting algorithms for flood susceptibility assessment. J Environ Manage. 265:110485. doi: 10.1016/J.JENVMAN.2020.110485.
PubMed Web of Science ®Google Scholar
Costache R, Țîncu R, Elkhrachy I, Pham QB, Popa MC, Diaconu DC, Avand M, Costache I, Arabameri A, Bui DT. 2020. New neural fuzzy-based machine learning ensemble for enhancing the prediction accuracy of flood susceptibility mapping. Hydrol Sci J. 65(16):2816–2837. doi: 10.1080/02626667.2020.1842412.
Web of Science ®Google Scholar
Costache R, Trung Tin T, Arabameri A, Crăciun A, Ajin RS, Costache I, Reza Md. Towfiqul Islam A, Abba SI, Sahana M, Avand M, et al. 2022. Flash-flood hazard using deep learning based on H2O R package and fuzzy-multicriteria decision-making analysis. J Hydrol. 609:127747. doi: 10.1016/j.jhydrol.2022.127747.
Web of Science ®Google Scholar
Dahri N, Yousfi R, Bouamrane A, Abida H, Pham QB, Derdous O. 2022. Comparison of analytic network process and artificial neural network models for flash flood susceptibility assessment. J Afr Earth Sci. 193:104576. doi: 10.1016/j.jafrearsci.2022.104576.
Web of Science ®Google Scholar
Das S. 2020. Flood susceptibility mapping of the Western Ghat coastal belt using multi-source geospatial data and analytical hierarchy process (AHP). Remote Sens Appl. 20:100379. doi: 10.1016/j.rsase.2020.100379.
Google Scholar
Dilts T. 2015. Topography Tools for ArcGIS 10.3 and earlier - Overview. [accessed 2021 Dec 19]. https://www.arcgis.com/home/item.html?id=b13b3b40fa3c43d4a23a1a09c5fe96b9.
Google Scholar
Farhadi H, Najafzadeh M, Mohammadi B, Zhang Y. 2021. Flood risk mapping by remote sensing data and random forest technique. Water. 13(21):3115. doi: 10.3390/w13213115.
Web of Science ®Google Scholar
Giovannettone J, Copenhaver T, Burns M, Choquette S. 2018. A statistical approach to mapping flood susceptibility in the lower connecticut river valley region. Water Resour Res. 54(10):7603–7618. doi: 10.1029/2018WR023018.
Web of Science ®Google Scholar
Gudiyangada Nachappa T, Tavakkoli Piralilou S, Gholamnia K, Ghorbanzadeh O, Rahmati O, Blaschke T. 2020. Flood susceptibility mapping with machine learning, multi-criteria decision analysis and ensemble using Dempster Shafer Theory. J Hydrol (Amst). 590:125275. doi: 10.1016/j.jhydrol.2020.125275.
Web of Science ®Google Scholar
Guo Z, Moosavi V, Leitão JP. 2022. Data-driven rapid flood prediction mapping with catchment generalizability. J Hydrol. 609:127726. doi: 10.1016/j.jhydrol.2022.127726.
Web of Science ®Google Scholar
Hammami S, Zouhri L, Souissi D, Souei A, Zghibi A, Marzougui A, Dlala M. 2019. Application of the GIS based multi-criteria decision analysis and analytical hierarchy process (AHP) in the flood susceptibility mapping (Tunisia). Arab J Geosci. 12(21):1–16. doi: 10.1007/S12517-019-4754-9/TABLES/7.
Web of Science ®Google Scholar
Hong H, Tsangaratos P, Ilia I, Liu J, Zhu AX, Chen W. 2018. Application of fuzzy weight of evidence and data mining techniques in construction of flood susceptibility map of Poyang County, China. Sci Total Environ. 625:575–588. doi: 10.1016/J.SCITOTENV.2017.12.256.
PubMed Web of Science ®Google Scholar
How Does TRCA Define Flood Risk? 2023. Toronto and Region Conservation Authority. [accessed 2023 Jul 10]. https://trca.ca/conservation/flood-risk-management/defining-flood-risk/.
Google Scholar
Hu P, Zhang Q, Shi P, Chen B, Fang J. 2018. Flood-induced mortality across the globe: spatiotemporal pattern and influencing factors. Sci Total Environ. 643:171–182. doi: 10.1016/J.SCITOTENV.2018.06.197.
PubMed Web of Science ®Google Scholar
IBI Group. 2019. Toronto and Region Conservation Authority Flood Risk Assessment and Ranking Project. Report No.: 114571.
Google Scholar
Islam ARMT, Bappi MMR, Alqadhi S, Bindajam AA, Mallick J, Talukdar S. 2023. Improvement of flood susceptibility mapping by introducing hybrid ensemble learning algorithms and high-resolution satellite imageries. Nat Hazards. 119(1):1–37. doi: 10.1007/s11069-023-06106-7.
Web of Science ®Google Scholar
Jahangir MH, Mousavi Reineh SM, Abolghasemi M. 2019. Spatial predication of flood zonation mapping in Kan River Basin, Iran, using artificial neural network algorithm. Weather Clim Extrem. 25:100215. doi: 10.1016/j.wace.2019.100215.
Web of Science ®Google Scholar
Janizadeh S, Avand M, Jaafari A, Van Phong T, Bayat M, Ahmadisharaf E, Prakash I, Pham BT, Lee S. 2019. Prediction success of machine learning methods for flash flood susceptibility mapping in the tafresh watershed, Iran. Sustainability. 11(19):5426. doi: 10.3390/su11195426.
Web of Science ®Google Scholar
Kalantar B, Ueda N, Saeidi V, Janizadeh S, Shabani F, Ahmadi K, Shabani F. 2021. Deep neural network utilizing remote sensing datasets for flood hazard susceptibility mapping in Brisbane, Australia. Remote Sens. 13(13):2638. doi: 10.3390/rs13132638.
Google Scholar
Khan UT, He J, Valeo C. 2018. River flood prediction using fuzzy neural networks: an investigation on automated network architecture. Water Sci Technol. 2017(1):238–247. doi: 10.2166/WST.2018.107.
Google Scholar
Khosravi K, Pourghasemi HR, Chapi K, Bahri M. 2016. Flash flood susceptibility analysis and its mapping using different bivariate models in Iran: a comparison between Shannon’s Entropy, statistical index, and weighting factor models. Environ Monit Assess. 188(12):656. doi: 10.1007/S10661-016-5665-9/FIGURES/6.
PubMed Web of Science ®Google Scholar
Kim HII, Han KY. 2020. Urban flood prediction using deep neural network with data augmentation. Water. 12(3):899. doi: 10.3390/w12030899.
Web of Science ®Google Scholar
Latifi M, Rakhshandehroo G, Nikoo MR, Sadegh M. 2019. A game theoretical low impact development optimization model for urban storm water management. J Clean Prod. 241:118323. doi: 10.1016/j.jclepro.2019.118323.
Web of Science ®Google Scholar
Lin L, Tang C, Liang Q, Wu Z, Wang X, Zhao S. 2023. Rapid urban flood risk mapping for data-scarce environments using social sensing and region-stable deep neural network. J Hydrol. 617:128758. doi: 10.1016/j.jhydrol.2022.128758.
Web of Science ®Google Scholar
Liu J, Wang J, Xiong J, Cheng W, Li Y, Cao Y, He Y, Duan Y, He W, Yang G. 2022. Assessment of flood susceptibility mapping using support vector machine, logistic regression and their ensemble techniques in the Belt and Road region. Geocarto Int. 37(25):9817–9846. doi: 10.1080/10106049.2022.2025918.
Web of Science ®Google Scholar
Mahmoody Vanolya N, Jelokhani-Niaraki M. 2021. The use of subjective–objective weights in GIS-based multi-criteria decision analysis for flood hazard assessment: a case study in Mazandaran, Iran. GeoJournal. 86(1):379–398. doi: 10.1007/S10708-019-10075-5/FIGURES/8.
Web of Science ®Google Scholar
Malekinezhad H, Sepehri M, Pham QB, Hosseini SZ, Meshram SG, Vojtek M, Vojteková J. 2021. Application of entropy weighting method for urban flood hazard mapping. Acta Geophys. 69(3):841–854. doi: 10.1007/S11600-021-00586-6/TABLES/4.
Web of Science ®Google Scholar
Mudashiru RB, Sabtu N, Abustan I. 2021a. Quantitative and semi-quantitative methods in flood hazard/susceptibility mapping: a review. Arab J Geosci. 14(11):1–24. doi: 10.1007/s12517-021-07263-4.
Web of Science ®Google Scholar
Mudashiru RB, Sabtu N, Abustan I, Balogun W. 2021b. Flood hazard mapping methods: a review. J Hydrol. 603:126846. doi: 10.1016/j.jhydrol.2021.126846.
Web of Science ®Google Scholar
Ontario GeoHub. 2019a. Provincial digital elevation model (PDEM). [accessed 2023 Aug 3]. https://geohub.lio.gov.on.ca/maps/mnrf::provincial-digital-elevation-model-pdem/explore?location=45.644126%2C-79.315587%2C4.58.
Google Scholar
Ontario GeoHub. 2019b. Soil Survey Complex | Soil Survey Complex. [accessed 2023 Aug 3]. https://geohub.lio.gov.on.ca/datasets/ontarioca11::soil-survey-complex/explore?location=43.785026%2C-79.392323%2C9.84.
Google Scholar
Ontario GeoHub. 2019c. Southern Ontario Land Resource Information System (SOLRIS) 3.0. [accessed 2023 Aug 3]. https://geohub.lio.gov.on.ca/documents/0279f65b82314121b5b5ec93d76bc6ba/about.
Google Scholar
Open Government Portal. 2017. Topographic Data of Canada - CanVec Series. [accessed 2023 Aug 3]. https://open.canada.ca/data/en/dataset/8ba2aa2a-7bb9-4448-b4d7-f164409fe056.
Google Scholar
Park S, Choi C, Kim B, Kim J. 2013. Landslide susceptibility mapping using frequency ratio, analytic hierarchy process, logistic regression, and artificial neural network methods at the Inje area, Korea. Environ Earth Sci. 68(5):1443–1464. doi: 10.1007/S12665-012-1842-5/TABLES/7.
Web of Science ®Google Scholar
Pham BT, Phong TV, Nguyen HD, Qi C, Al-Ansari N, Amini A, Ho LS, Tuyen TT, Yen HPH, Ly HB, et al. 2020. A comparative study of kernel logistic regression, radial basis function classifier, multinomial naïve bayes, and logistic model tree for flash flood susceptibility mapping. Water. 12(1):239. doi: 10.3390/w12010239.
Web of Science ®Google Scholar
Priscillia S, Schillaci C, Lipani A. 2021. Flood susceptibility assessment using artificial neural networks in Indonesia. Artif Intell Geosci. 2:215–222. doi: 10.1016/j.aiig.2022.03.002.
Google Scholar
Raei E, Reza Alizadeh M, Reza Nikoo M, Adamowski J. 2019. Multi-objective decision-making for green infrastructure planning (LID-BMPs) in urban storm water management under uncertainty. J Hydrol. 579:124091. doi: 10.1016/j.jhydrol.2019.124091.
Web of Science ®Google Scholar
Rahmati O, Darabi H, Panahi M, Kalantari Z, Naghibi SA, Ferreira CSS, Kornejady A, Karimidastenaei Z, Mohammadi F, Stefanidis S, et al. 2020. Development of novel hybridized models for urban flood susceptibility mapping. Sci Rep. 10(1):12937. doi: 10.1038/s41598-020-69703-7.
PubMed Web of Science ®Google Scholar
Rahmati O, Pourghasemi HR, Zeinivand H. 2015. Flood susceptibility mapping using frequency ratio and weights-of-evidence models in the Golastan Province, Iran. Geocarto Int. 31(1):42–70. doi: 10.1080/10106049.2015.1041559.
Web of Science ®Google Scholar
Ren Y, Zhang L, Suganthan PN. 2016. Ensemble classification and regression-recent developments, applications and future directions. IEEE Comput Intell Mag. 11(1):41–53. doi: 10.1109/MCI.2015.2471235.
Web of Science ®Google Scholar
Rincón D, Khan UT, Armenakis C. 2018. Flood risk mapping using gis and multi-criteria analysis: a greater Toronto area case study. Geosciences. 8(8):275. doi: 10.3390/geosciences8080275.
Web of Science ®Google Scholar
Saaty TL. 1988. What is the analytic herarchy process? In: Mitra G, Greenberg HJ, Lootsma FA, Rijkaert MJ, Zimmermann HJ, editors. Mathematical models for decision support. vol 48. Heidelberg, Berlin: Springer; p. 109–121.
Google Scholar
Seleem O, Ayzel G, de Souza ACT, Bronstert A, Heistermann M. 2022. Towards urban flood susceptibility mapping using data-driven models in Berlin, Germany. Geomatics Nat Haz Risk. 13(1):1640–1662. doi: 10.1080/19475705.2022.2097131.
Web of Science ®Google Scholar
Shafizadeh-Moghadam H, Valavi R, Shahabi H, Chapi K, Shirzadi A. 2018. Novel forecasting approaches using combination of machine learning and statistical models for flood susceptibility mapping. J Environ Manage. 217:1–11. doi: 10.1016/J.JENVMAN.2018.03.089.
PubMed Web of Science ®Google Scholar
Shahabi H, Shirzadi A, Ronoud S, Asadi S, Pham BT, Mansouripour F, Geertsema M, Clague JJ, Bui DT. 2021. Flash flood susceptibility mapping using a novel deep learning model based on deep belief network, back propagation and genetic algorithm. Geosci Front. 12(3):101100. doi: 10.1016/j.gsf.2020.10.007.
Web of Science ®Google Scholar
Siahkamari S, Haghizadeh A, Zeinivand H, Tahmasebipour N, Rahmati O. 2017. Spatial prediction of flood-susceptible areas using frequency ratio and maximum Entropy models. Geocarto Int. 33(9):927–941. doi: 10.1080/10106049.2017.1316780.
Web of Science ®Google Scholar
Simonovic SP, Schardong A, Srivastav R, Sandink D. 2015. IDF_CC web-based tool for updating intensity-duration-frequency curves to changing climate – ver 6.5. Western University Facility for Intelligent Decision Support and Institute for Catastrophic Loss Reduction. [accessed 2021 Nov 21]. https://www.idf-cc-uwo.ca.
Google Scholar
Snieder E, Abogadil K, Khan UT. 2021. Resampling and ensemble techniques for improving ANN-based high-flow forecast accuracy. Hydrol Earth Syst Sci. 25(5):2543–2566. doi: 10.5194/hess-25-2543-2021.
Web of Science ®Google Scholar
Snieder E, Shakir R, Khan UT. 2020. A comprehensive comparison of four input variable selection methods for artificial neural network flow forecasting models. J Hydrol. 583:124299. doi: 10.1016/j.jhydrol.2019.124299.
Web of Science ®Google Scholar
Souissi D, Zouhri L, Hammami S, Msaddek MH, Zghibi A, Dlala M. 2019. GIS-based MCDM – AHP modeling for flood susceptibility mapping of arid areas, southeastern Tunisia. Geocarto Int. 35(9):991–1017. doi: 10.1080/10106049.2019.1566405.
Web of Science ®Google Scholar
Tehrany MS, Jones S. 2017. Evaluating the variations in the flood susceptibility maps accuracies due to the alterations in the type and extend of the flood inventory. Int Arch Photogramm Remote Sens Spatial Inf Sci. XLII-4/W5(4W5):209–214. doi: 10.5194/isprs-archives-XLII-4-W5-209-2017.
Google Scholar
Tehrany MS, Kumar L. 2018. The application of a Dempster–Shafer-based evidential belief function in flood susceptibility mapping and comparison with frequency ratio and logistic regression methods. Environ Earth Sci. 77(13):1–24. doi: 10.1007/S12665-018-7667-0/TABLES/3.
Web of Science ®Google Scholar
Tehrany MS, Pradhan B, Jebur MN. 2014. Flood susceptibility mapping using a novel ensemble weights-of-evidence and support vector machine models in GIS. J Hydrol. 512:332–343. doi: 10.1016/j.jhydrol.2014.03.008.
Web of Science ®Google Scholar
TRCA Open Data Portal. 2020. Watercourses TRCA | Watercourses TRCA. [accessed 2023 Aug 3]. https://trca-camaps.opendata.arcgis.com/datasets/5a9254bf984949cca546a5ca64e2aff1/explore.
Google Scholar
United States Department of Agriculture (USDA). 1986. Urban hydrology for small watersheds. Chapter 2, Estimating runoff; p. 2–1–2-11.
Google Scholar
Vidyarthi VK, Jain A, Chourasiya S. 2020. Modeling rainfall-runoff process using artificial neural network with emphasis on parameter sensitivity. Model Earth Syst Environ. 6(4):2177–2188. doi: 10.1007/s40808-020-00833-7.
Web of Science ®Google Scholar
Vilasan RT, Kapse VS. 2022. Evaluation of the prediction capability of AHP and F-AHP methods in flood susceptibility mapping of Ernakulam district (India). Nat Hazards. 112(2):1767–1793. doi: 10.1007/s11069-022-05248-4.
Web of Science ®Google Scholar
Vojtek M, Vojteková J. 2019. Flood susceptibility mapping on a national scale in Slovakia using the analytical hierarchy process. Water. 11(2):364. doi: 10.3390/w11020364.
Web of Science ®Google Scholar
Vojtek M, Vojteková J, Costache R, Pham QB, Lee S, Arshad A, Sahoo S, Linh NTT, Anh DT. 2021. Comparison of multi-criteria-analytical hierarchy process and machine learning-boosted tree models for regional flood susceptibility mapping: a case study from Slovakia. Geomatics Nat Hazards Risk. 12(1):1153–1180. doi: 10.1080/19475705.2021.1912835.
Web of Science ®Google Scholar
Wang Y, Fang Z, Hong H, Costache R, Tang X. 2021. Flood susceptibility mapping by integrating frequency ratio and index of entropy with multilayer perceptron and classification and regression tree. J Environ Manage. 289:112449. doi: 10.1016/j.jenvman.2021.112449.
PubMed Web of Science ®Google Scholar
Wang Y, Fang Z, Hong H, Peng L. 2020. Flood susceptibility mapping using convolutional neural network frameworks. J Hydrol. 582:124482. doi: 10.1016/j.jhydol.2019.124482.
Web of Science ®Google Scholar
Wang Y, Wu X, Chen Z, Ren F, Feng L, Du Q. 2019. Optimizing the predictive ability of machine learning methods for landslide susceptibility mapping using SMOTE for Lishui City in Zhejiang Province, China. Int J Environ Res Public Health. 16(3):368. doi: 10.3390/ijerph16030368.
PubMed Web of Science ®Google Scholar
Yang W, Xu K, Lian J, Ma C, Bin L. 2018. Integrated flood vulnerability assessment approach based on TOPSIS and Shannon entropy methods. Ecol Indic. 89:269–280. doi: 10.1016/j.ecolind.2018.02.015.
Web of Science ®Google Scholar
Yaseen A, Lu J, Chen X. 2022. Flood susceptibility mapping in an arid region of Pakistan through ensemble machine learning model. Stoch Environ Res Risk Assess. 36(10):3041–3061. doi: 10.1007/s00477-022-02179-1.
Web of Science ®Google Scholar
Zhao G, Pang B, Xu Z, Cui L, Wang J, Zuo D, Peng D. 2021. Improving urban flood susceptibility mapping using transfer learning. J Hydrol. 602:126777. doi: 10.1016/j.jhydrol.2021.126777.
Web of Science ®Google Scholar
Zhao G, Pang B, Xu Z, Peng D, Zuo D. 2020. Urban flood susceptibility assessment based on convolutional neural networks. J Hydrol. 590:125235. doi: 10.1016/j.jhydrol.2020.125235.
Web of Science ®Google Scholar
Zhao G, Pang B, Xu Z, Yue J, Tu T. 2018. Mapping flood susceptibility in mountainous areas on a national scale in China. Sci Total Environ. 615:1133–1142. doi: 10.1016/j.scitotenv.2017.10.037.
PubMed Web of Science ®Google Scholar
Zhang G, Wang M, Liu K. 2021. Deep neural networks for global wildfire susceptibility modelling. Ecol Indic. 127:107735. doi: 10.1016/j.ecolind.2021.107735.
Web of Science ®Google Scholar

Appendix

Table A1. Highland model prediction metrics for both ANN-SMOTE and ANN-FPI approaches.

Table A2. Humber model prediction metrics for both ANN-SMOTE and ANN-FPI approaches.

Table A3. Etobicoke model prediction metrics for both ANN-SMOTE and ANN-FPI approaches.

Table A4. Mimico model prediction metrics for both ANN-SMOTE and ANN-FPI approaches.
