ENVIRONMENTAL ENGINEERING

A chemometrics-based approach for the chemical prediction of lead (Pb) levels in surface soil, Dammam, Saudi Arabia

Article: 2199967 | Received 06 Nov 2022, Accepted 01 Apr 2023, Published online: 12 Apr 2023

Abstract

High levels of trace metals in topsoil may pose serious health risks to humans and the environment. Thus, there is a need to assess the geochemical conditions of surface soils where human activities are intensifying. This study evaluates trace metal contamination within Dammam, Saudi Arabia, and feeds the results into a chemometrics-based approach for predicting the concentration of lead (Pb), chosen because of its toxicity. No previous work was found on Pb prediction in topsoil using a chemometrics-based approach. Surface soil samples were collected from different zones and analyzed in the laboratory for their trace metal constituents. According to the study’s findings, all trace metals were below international allowable limits, with the highest mean concentrations of Ba (309.7 mg/kg), Zn (7.9 mg/kg), and Cr (10.2 mg/kg) in industrial, agricultural and residential areas, respectively. The prediction step applies two AI-based techniques, Gaussian process regression (GPR) and least-squares boosting (L-Boost), as well as a linear model, stepwise linear regression (SWLR). The modelling was carried out in two scenarios, M1 and M2, whose input–output relationships were designated by a correlation-based feature extraction approach. The second scenario (M2) outperformed the first (M1) in all three approaches. Overall, the non-linear GPR-M2 was more accurate than all other model combinations applied in the current work, with 99% accuracy and an MSE of 0.012.

1. Introduction

Reliable hydro-environmental modelling and assessment are essential stages and an integral part of sustainable development. Modelling and optimization have been shown to enhance system efficiency, which can significantly reduce process and maintenance costs (Abdullahi et al., 2020). Environmental modelling is essential for predicting the impacts of human activities on water systems, assessing water quality, and informing management decisions to protect aquatic ecosystems (Alhaji et al., 2022; Gaya et al., 2020). Recently, researchers have coupled experimental conditions with numerical and artificial-intelligence simulation to reduce the complexity of the absorption process and find the optimum process conditions (Abel et al., 2010; Hassan et al., 2017). Among the trace elements, lead (Pb) is one of the most widely used in various industries (e.g. batteries, painting, dyeing and other sensitive manufacturing). Hence, it is not surprising that substantial amounts of Pb have been reported in the hydro-environment (Rodríguez-Salazar et al., 2011). According to the previous literature, Pb is associated with toxicity and related illnesses; as such, its prediction and removal are paramount.

Surface soil pollution research has received less attention, even though environmental impact assessment (EIA) studies have been in demand in several locations across Saudi Arabia. Related monitoring studies have been reported for Jizan city (Arif & Hashem, 1998), the Red Sea coastline (Al-Hefne et al., 2005), Jeddah City (Kadi, 2009), and Wadi Hanifah, Riyadh (Alyemeni et al., 2014). Arif and Hashem (1998) demonstrated, based on studies conducted at five different locations, that the soil in Jizan city contained high concentrations of some trace elements such as Pb, Cu and Ni. Al-Hefne et al. (2005) analyzed trace element concentrations in surface soil along the Red Sea coastline using ICP-MS; of the 28 elements detected, some, such as Sr and U, were found at high levels. Similarly, Kadi (2009) examined soil samples from Jeddah City, Saudi Arabia, and concluded that Pb and Zn were present in very high concentrations, mainly because of traffic congestion along the roads. Alyemeni et al. (2014) investigated the natural topology along Wadi Hanifah to examine trace elements in the drainage system of the capital city, Riyadh. The results showed that some trace metals, for example Pb, Cd, Ni, and Zn, exceeded the applicable limits and standards at the selected locations. Besides, various chemometrics approaches, such as classical methods, AI paradigms, GIS, and other computational techniques, play a role in environmental areas such as earth science, hydrology, and studies on the effects of trace metals.

Therefore, integrating experimental procedures with chemometrics techniques such as AI paradigms is of paramount importance, as it gives a clear picture of the relationships among various trace metals and of their effects on the environment, humans, and other aquatic and land-living creatures. Various chemometricians have reported that AI-based paradigms are data sensitive, which implies that no single universal model can solve all data-driven problems (Alas et al., 2020; Bhagat et al., 2022; Ghali et al., 2020; Kazemi & Hosseini, 2011; Mohammadi et al., 2020; Nourani et al., 2018; Nunno et al., 2022; 2020; Tawabini et al., 2022; Usman, Işik, et al., 2021; Wang et al., 2020; Yaseen, 2021). It is consequently important to apply various approaches to understand their behavior on a specific dataset. To the best of our knowledge, no study has reported the application of GPR, L-Boost and SWLR models for predicting Pb concentration in the surface soil of the Dammam region of Saudi Arabia, which is the main aim, motivation and gap addressed by this study. A Scopus literature survey, summarized in Figure 1, showed that conceptual application of and interest in the field have been recorded from 1985 to date. The database returned more than 3,000 published papers with more than 1,500 keywords, showing the popularity of this field across the globe. Several studies report the use of AI paradigms in modelling trace elements. Although there have been many studies on this topic, the sustainability of these approaches requires consideration of many factors. There is currently a debate about the limitations of conventional regression and traditional ML methods, and new optimization learning approaches, such as metaheuristic algorithms, are needed. Additionally, several factors need to be considered when using ML-based optimization for trace element modelling, such as environmental variables, data types, hyperparameters, performance metrics, and model architecture. Despite promising performance in trace element modelling, there are concerns about the reliability of these models and their applicability compared to standalone AI-based and conventional models. Therefore, further investigation is needed to address these concerns.

Figure 1. Major search terms employed in trace metals AI technique analysis.


2. Material and methods

2.1. Study area and sampling plan

Dammam is located between latitudes 26° 20’ and 26° 32’ and longitudes 49° 49’ and 50° 09’. It is the major city of the eastern region of Saudi Arabia, with over a million residents and an integrated port on the Arabian Gulf. The industrial area of Dammam and its surroundings are a globally significant location for the extraction and production of oil and petroleum-related products (Yassin et al., 2022). Besides recreational centers, the region is surrounded by farms that produce large quantities of dates and vegetables, supporting both consumption and economic growth. Dammam has undergone industrialization and urbanization over recent decades.

The study area was divided into four zones, namely rural (background) areas, industrial zones, agricultural regions, and built-up (residential) areas. The rural area was selected away from the other three zones to account for the background levels of metals. A total of 132 soil samples were taken from the zones (Figure 2) at a depth of 10–15 cm; representative soil samples of one soil type were obtained from each sampling zone using an auger. To avoid sampling overlap, the sample locations were chosen using satellite images and Google street maps. The position of every sample location was recorded using a handheld Global Positioning System (GPS) device (Garmin handheld, eTrex 20). The samples were kept in polythene bags, housed in a hard box casing, and transported to the laboratory.

Figure 2. Sampling locations in Dammam area.


2.2. Analysis of soil samples

Collected soil samples were prepared in accordance with USEPA procedure 3050B (USEPA, 1994). Samples were then analyzed for trace metals using SPECTRO inductively coupled plasma-optical emission spectroscopy (ICP-OES) in accordance with USEPA method 200.7, revision 5 (USEPA, 2001). For the ICP calibration, standard solutions of trace metals at different concentrations were employed. The suitability and precision of the apparatus were examined using six samples of the working standards and one blank. Each batch of processed samples also underwent quality control: each set consisted of 20 samples plus one duplicate sample, two spiked samples, two blank samples, and two standard samples, for a total of 27 items per set. The instrument’s accuracy and bias were checked using the blank, spiked, and duplicate samples. The ICP limits of detection (LOD) for the tested trace metals were as follows: As (2.6 μg/L); Ba (0.15 μg/L); Cd (0.15 μg/L); Cr (0.5 μg/L); Cu (1.2 μg/L); Hg (1.1 μg/L); Zn (0.15 μg/L); V (1.4 μg/L) and Pb (2.4 μg/L).

2.3. The proposed computational method

Primary experimental data on various trace metals in the surface soil of the Dammam region of Saudi Arabia were used throughout the current study (Yassin et al., 2022). The computational method predicts the concentration of lead (Pb) as the dependent variable, with Cd, As, Ba, Ti, V, Cr, Cu, Hg, Ni and Zn as the input variables. Before the prediction step, a Spearman correlation analysis was conducted to select suitable features according to the input–output relationships shown in Figure 3. Based on their correlation with the target Pb, two input combinations were developed: M1, comprising Cd, As, Ba, Ti and V, and M2, comprising Cr, Cu, Hg, Ni and Zn. Cross-validation was applied to guard against underfitting and overfitting. The data were split into 65% for training and 35% for testing.
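
The correlation-based screening and the 65/35 split described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the helper names (`average_ranks`, `spearman`, `holdout_split`), the random seed and the toy concentrations are all assumptions, and the study does not state which correlation threshold, if any, separated M1 from M2.

```python
import random

def average_ranks(values):
    """Return 1-based ranks, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman's rho is the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def holdout_split(n_samples, train_fraction=0.65, seed=0):
    """Shuffle sample indices and split them into training and testing sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    cut = round(train_fraction * n_samples)
    return indices[:cut], indices[cut:]

# Hypothetical toy values (mg/kg), not the study's measurements.
pb = [1.0, 2.0, 3.0, 4.0, 5.0]
inputs = {"Zn": [2.0, 4.1, 5.9, 8.3, 9.7], "Ti": [3.0, 1.0, 4.0, 2.0, 5.0]}
rho = {name: spearman(column, pb) for name, column in inputs.items()}
ranked = sorted(rho, key=lambda name: abs(rho[name]), reverse=True)
train_idx, test_idx = holdout_split(132)
```

Inputs at the top of `ranked` would be the natural candidates for the stronger scenario; with 132 samples the split yields 86 training and 46 testing indices.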

Figure 3. Input–output relations of the experimental data.


2.4. Gaussian Process Regression (GPR)

Gaussian process regression (GPR) is a probabilistic, nonparametric, supervised learning technique that captures the complicated, non-linear function mapping concealed in data sets (Huang et al., 2021). GPR has recently attracted interest from academics in a variety of engineering disciplines. It can deal with non-linear data because it makes use of kernel functions. Furthermore, among the benefits of a GPR model is that it responds consistently to input data (Huang et al., 2021; Kargar et al., 2020; Lal & Datta, 2020; Wiangkham et al., 2022).
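
To make the role of the kernel concrete, the sketch below computes the GPR posterior mean with a squared-exponential kernel. It is an illustration under stated assumptions, not the configuration used in the study: the kernel choice, the zero prior mean, the fixed `length_scale` and `noise` values (normally tuned by maximizing the marginal likelihood) and the toy data are all hypothetical.

```python
import math

def rbf(a, b, length_scale=1.0):
    """Squared-exponential kernel between two feature vectors."""
    sq_dist = sum((u - v) ** 2 for u, v in zip(a, b))
    return math.exp(-0.5 * sq_dist / length_scale ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x

def gpr_fit(X, y, noise=1e-6, length_scale=1.0):
    """Return the weights alpha = (K + noise * I)^-1 y."""
    K = [[rbf(a, b, length_scale) + (noise if i == j else 0.0)
          for j, b in enumerate(X)] for i, a in enumerate(X)]
    return solve(K, y)

def gpr_predict(X_train, alpha, x_new, length_scale=1.0):
    """Posterior mean at x_new: a kernel-weighted sum over the training points."""
    return sum(a * rbf(x, x_new, length_scale) for a, x in zip(alpha, X_train))

# Hypothetical one-feature toy data.
X_train = [[0.0], [1.0], [2.0], [3.0]]
y_train = [0.0, 0.8, 0.9, 0.1]
alpha = gpr_fit(X_train, y_train)
midpoint_prediction = gpr_predict(X_train, alpha, [1.5])
```

With a very small noise term the posterior mean passes almost exactly through the training targets, and far from the data it falls back to the zero prior mean.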

2.5. Least square boost (LS-Boost)

Boosting is a supervised learning method that helps reduce bias and variance and turns weak learners into a strong learner (Tran et al., 2021). A strong learner is arbitrarily well correlated with the correct classification, whereas a weak learner is only slightly correlated with it. Friedman’s gradient boosting machine (GBM) extends boosting to regression. Gradient boosting builds a prediction model as an ensemble of weak prediction models (Abba et al., 2022). Generally, decision trees serve as the weak learners. The method generalizes boosting by allowing an arbitrary differentiable loss function to be optimized. With gradient boosting, predictors are added to the ensemble sequentially, each one correcting its predecessor: every new predictor is fitted to the residual error of the current ensemble.
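
The sequential residual fitting can be illustrated with a small least-squares boosting loop. This is a sketch only: one-split decision stumps stand in for the trees, and the number of rounds, the learning rate and the toy data are assumptions rather than settings reported in the study.

```python
def fit_stump(x, residual):
    """Find the single-feature threshold split that minimises squared error.

    Assumes at least one feature takes two or more distinct values.
    """
    best = None
    for j in range(len(x[0])):
        values = sorted(set(row[j] for row in x))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [r for row, r in zip(x, residual) if row[j] <= t]
            right = [r for row, r in zip(x, residual) if row[j] > t]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, lm, rm)
    return best[1:]

def stump_predict(stump, row):
    """Return the left or right leaf mean, depending on the threshold."""
    j, t, lm, rm = stump
    return lm if row[j] <= t else rm

def ls_boost(x, y, n_rounds=100, learning_rate=0.1):
    """Start from the mean, then repeatedly fit a stump to the current residuals."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residual = [yi - pi for yi, pi in zip(y, pred)]
        stump = fit_stump(x, residual)
        stumps.append(stump)
        pred = [p + learning_rate * stump_predict(stump, row) for p, row in zip(pred, x)]
    return base, learning_rate, stumps

def boost_predict(model, row):
    """Sum the base value and the shrunken contribution of every stump."""
    base, learning_rate, stumps = model
    return base + learning_rate * sum(stump_predict(s, row) for s in stumps)

def train_mse(model, x, y):
    """Mean squared error of the model on the data it was given."""
    return sum((boost_predict(model, row) - yi) ** 2 for row, yi in zip(x, y)) / len(y)

# Hypothetical one-feature toy data.
x_toy = [[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]]
y_toy = [1.0, 1.2, 3.1, 3.0, 5.2, 5.0]
model = ls_boost(x_toy, y_toy)
```

Each round lowers the training error a little, which is why the number of rounds and the learning rate are usually chosen against held-out data rather than training data.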

2.6. Step-Wise-Linear Regression (SWLR)

In general, linear regression (LR) is one of the most widely used computational methods for modelling the relationship between input and output variables. Determining the combination of input parameters that gives the best forecast of the output variable involves both simple and complex relationships among the variables (Abba et al., 2020). Several modelers have described stepwise regression as an advanced option that arrives at the best input set by inserting or eliminating variables according to their effect on the residual sum of squares (Heddam, 2016). SWLR examines the effect of each variable in turn; any factor that does not contribute to the model can be removed, one at a time, to minimize its influence. The principle of SWLR can be illustrated with multiple linear regression (MLR): stepwise regression adds an input to, or deletes an input from, the linear model at each step (Usman, Ahmad, et al., 2021).
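
A forward-only variant of this procedure, driven by the residual sum of squares (RSS), might look like the sketch below. It is a simplification: the elimination step is omitted, the 5% relative-gain stopping rule (`min_gain`) is an assumed criterion in place of the F-test or p-value rules that stepwise tools commonly use, and the data are hypothetical.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x

def ols_rss(columns, y):
    """Fit y on an intercept plus the given columns; return the residual sum of squares."""
    design = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    p = len(design[0])
    xtx = [[sum(row[a] * row[b] for row in design) for b in range(p)] for a in range(p)]
    xty = [sum(row[a] * yi for row, yi in zip(design, y)) for a in range(p)]
    beta = solve(xtx, xty)
    return sum((yi - sum(b * v for b, v in zip(beta, row))) ** 2
               for row, yi in zip(design, y))

def forward_stepwise(inputs, y, min_gain=0.05):
    """Add the input that lowers RSS most; stop when the relative gain is small."""
    selected = []
    rss = ols_rss([], y)
    while len(selected) < len(inputs):
        trials = {name: ols_rss([inputs[s] for s in selected] + [inputs[name]], y)
                  for name in inputs if name not in selected}
        best = min(trials, key=trials.get)
        if rss == 0 or (rss - trials[best]) / rss < min_gain:
            break
        selected.append(best)
        rss = trials[best]
    return selected, rss

# Hypothetical toy data: the target follows "Zn" closely and "Ti" hardly at all.
toy_inputs = {"Zn": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
              "Ti": [3.0, 1.0, 2.0, 3.0, 1.0, 2.0]}
toy_pb = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]
selected, final_rss = forward_stepwise(toy_inputs, toy_pb)
```

The first variable admitted is the one with the largest single-variable drop in RSS; a full stepwise routine would also re-test already selected variables for removal after each addition.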

2.7. Evaluation metrics of the models

For any data-driven technique, performance is evaluated with several criteria that compare the predicted values with the measured values. In this investigation, we used the root-mean-squared error (RMSE) and mean-squared error (MSE) as statistical error measures, as well as the determination coefficient (DC) and Pearson coefficient (PC) as goodness-of-fit metrics.

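$$\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N}\left(Pb_{p,i} - Pb_{o,i}\right)^{2} \tag{1}$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(Pb_{p,i} - Pb_{o,i}\right)^{2}} \tag{2}$$

$$\mathrm{DC} = 1 - \frac{\sum_{i=1}^{N}\left(Pb_{p,i} - Pb_{o,i}\right)^{2}}{\sum_{i=1}^{N}\left(Pb_{o,i} - \overline{Pb}_{o}\right)^{2}} \tag{3}$$

$$\mathrm{PC} = \frac{\sum_{i=1}^{N}\left(Pb_{t,i} - \overline{Pb}_{t}\right)\left(\widehat{Pb}_{t,i} - \widetilde{Pb}_{t}\right)}{\sqrt{\sum_{i=1}^{N}\left(Pb_{t,i} - \overline{Pb}_{t}\right)^{2}\,\sum_{i=1}^{N}\left(\widehat{Pb}_{t,i} - \widetilde{Pb}_{t}\right)^{2}}} \tag{4}$$

where $N$ is the number of samples, $Pb_{o,i}$ and $Pb_{p,i}$ are the observed and predicted Pb concentrations of sample $i$, and $\overline{Pb}_{o}$ is the mean observed concentration. In Eq. (4), $Pb_{t,i}$ and $\widehat{Pb}_{t,i}$ denote the observed and predicted values, and $\overline{Pb}_{t}$ and $\widetilde{Pb}_{t}$ their respective means.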
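
Equations (1)–(4) translate directly into a few lines of code. The sketch below is illustrative: the function names are hypothetical, DC is taken in the usual Nash–Sutcliffe form (an assumption about the intended Eq. (3)), and the observed and predicted values are invented rather than taken from the study.

```python
import math

def mse(observed, predicted):
    """Mean squared error, Eq. (1)."""
    return sum((p - o) ** 2 for o, p in zip(observed, predicted)) / len(observed)

def rmse(observed, predicted):
    """Root-mean-squared error, Eq. (2)."""
    return math.sqrt(mse(observed, predicted))

def determination_coefficient(observed, predicted):
    """DC, Eq. (3): one minus squared error over the spread of the observations."""
    mean_obs = sum(observed) / len(observed)
    error = sum((p - o) ** 2 for o, p in zip(observed, predicted))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - error / spread

def pearson_coefficient(observed, predicted):
    """PC, Eq. (4): covariance scaled by both standard deviations."""
    mean_obs = sum(observed) / len(observed)
    mean_pred = sum(predicted) / len(predicted)
    cov = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(observed, predicted))
    var_obs = sum((o - mean_obs) ** 2 for o in observed)
    var_pred = sum((p - mean_pred) ** 2 for p in predicted)
    return cov / math.sqrt(var_obs * var_pred)

# Hypothetical Pb values (mg/kg), for illustration only.
pb_observed = [1.0, 2.0, 3.0, 4.0]
pb_predicted = [1.1, 1.9, 3.2, 3.8]
scores = {
    "MSE": mse(pb_observed, pb_predicted),
    "RMSE": rmse(pb_observed, pb_predicted),
    "DC": determination_coefficient(pb_observed, pb_predicted),
    "PC": pearson_coefficient(pb_observed, pb_predicted),
}
```

MSE and RMSE carry the units of the squared and unsquared target respectively, whereas DC and PC are dimensionless, which is why the two pairs are read together.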

2.8. Description of the data set and model validation

Validation techniques include k-fold cross-validation, holdout, leave-one-out, and others. The most significant advantage of the k-fold cross-validation procedure is that the evaluation and training datasets are independent (Cai et al., 2020). Following the k-fold cross-validation procedure, the data were divided into 65% for the training stage and 35% for the testing stage in this study. It is important to note that different calibrated models may be applied to the test data. Furthermore, the data set was collected over 5 months, with 32 instances for each of the variables.
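
For readers unfamiliar with k-fold cross-validation, the sketch below generates the train/test index sets for each fold. The value `k=5`, the seed and the helper name `k_fold_indices` are assumptions; the paper does not state the number of folds it used.

```python
import random

def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train, test) index lists; every sample is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test

# 132 is the number of soil samples reported in Section 2.1.
splits = list(k_fold_indices(132, k=5))
fold_sizes = [len(test) for _, test in splits]
```

Because the folds are disjoint, no sample appears in both the training and the testing set of the same round, which is the independence property noted above.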

3. Results and discussions

3.1. Levels of trace metals detected in the surface soil samples

Figure 4 displays the trace metal levels in surface soil samples from the background area, and Figure 5 those from the agricultural, residential, and industrial areas, along with the maximum allowable limits stated by the Canadian Soil Quality Guidelines and the USEPA maximum contaminant levels (MCL). Most of the metals showed a general trend in concentration: soils in industrial areas had the highest means, followed by agricultural and residential areas, and the background region had the lowest means.

Figure 4. Levels of trace metals in background area 50 km away from Dammam.


Figure 5. Levels of trace metals in (a) agriculture, (b) industrial and (c) residential areas in Dammam.

CESQG, Canadian Environmental Soil Quality Guidelines; MCLs, USA Maximum Contaminant Levels.

The metal concentrations reported below are given in mg/kg, with the mean value first and then the maximum value found at the sampling sites. Surface soil chemistry analysis allows us to assess the extent of trace metal contamination in the environment. Consequently, it is important to regularly check for metal pollution in soils (Han et al., 2006; Joimel et al., 2016). The Canadian Environmental Soil Quality Guidelines (CESQG) for the Protection of Environment and Human Health (PEHH) were used for assessment since the Kingdom of Saudi Arabia lacks its own guidance on limits for trace metals in soil.

According to the study’s findings, barium (Ba) concentrations were higher in industrial areas (mean: 335.5 mg/kg, maximum: 1966.5 mg/kg) than in agricultural areas (mean: 34.46 mg/kg, maximum: 100.62 mg/kg) or residential areas (mean: 34.11 mg/kg, maximum: 98.55 mg/kg). The permissible limit of 500 mg/kg was surpassed in some samples from industrial locations. The higher amounts of barium found in industrial locations can be attributed to the use of barium compounds or oxides in a variety of industrial processes (Kresse et al., 2007). Barium nitrate is used to give fireworks a green color (Russell & Svrcula, 2008), barium titanate is used in electro-ceramics (Wadhawan, 2019), and the mineral barite is used as an additive in oil well drilling muds.

Industrial soil samples had the highest levels of chromium (Cr), with mean and maximum values of 51.77 and 247.6 mg/kg, followed by residential areas with mean and maximum values of 29.64 and 120.22 mg/kg, and agricultural areas with mean and maximum values of 24.4 and 74.7 mg/kg. The permissible limit of 74 mg/kg was exceeded in certain samples from each of the three studied zones. Both natural sources and atmospheric deposition of Cr-containing compounds may be responsible for the elevated Cr in agricultural soil. Three of the residential soil samples showed elevated Cr, which can be explained by the use of Cr as a wood preservative, accidental releases, or atmospheric deposition of Cr-containing compounds. Six industrial samples also contained high levels of Cr, which is related to the use of Cr-containing chemicals in the manufacture of dyes and paints and in leather tanning, as well as to past industrial solvents (Hunger, 2007).

Zinc (Zn) concentrations were likewise highest in industrial areas, with mean and maximum values of 65.44 and 676.5 mg/kg, next in agricultural areas with mean and maximum values of 12.43 and 46.25 mg/kg, and finally in residential areas with mean and maximum values of 8.47 and 39.39 mg/kg. The permitted limit of 200 mg/kg, set forth in the accepted standards for the level of zinc in soil, was not exceeded by any of the samples. Zinc is needed in trace amounts for human health: it strengthens the immune system and is crucial for maintaining vision (Banerjee et al., 2009; Lakshmi et al., 2021). Zn is necessary for metabolic activities, but excessive consumption can be hazardous (Fosmire, 1990).

Similarly, nickel (Ni) concentrations were highest in industrial areas with mean and maximum values of 13.14 and 45.2 mg/kg, then in agricultural areas with mean and maximum values of 9.22 and 16.25 mg/kg, and finally in residential areas with mean and maximum values of 6.44 and 13.23 mg/kg. The permitted maximum of 50 mg/kg was not surpassed by any of the samples. Ni, which is naturally present in the environment in small proportions, is a crucial component of the human diet (Yeskis & Zavala, 2022). Ni is present in ambient air as a result of emissions of Ni-containing compounds from the burning of oil and gas, the incineration of sewage sludge, and manufacturing processes that use or emit such compounds. The Ni levels in the surface soil samples gathered for this investigation, however, were within the range permitted by the accepted specification.

Furthermore, the industrial area had the greatest levels of copper (Cu), with mean and maximum values of 11.0 and 95.75 mg/kg, followed by the agricultural area with mean and maximum values of 8.74 and 31.64 mg/kg, and then the residential area with mean and maximum values of 4.38 and 19.17 mg/kg. The permissible limit of 63 mg/kg was exceeded in two of the industrial area samples. The use of Cu in electrical cables, roofing, plumbing, and industrial machinery may explain the elevated Cu in 2 out of 33 industrial samples (Emsley, 2011). It might also be caused by unintentional releases of Cu-containing chemicals or by atmospheric deposition. Great care must nevertheless be taken to minimize human exposure to copper and copper-containing compounds (Gaetke & Chow, 2003).

The concentration of lead (Pb) was highest in industrial areas (mean 11.42 mg/kg, maximum 100.25 mg/kg), next in agricultural areas (mean 6.49 mg/kg, maximum 52.35 mg/kg), and finally in residential areas (mean 4.79 mg/kg, maximum 25.6 mg/kg). The permitted limit of 140 mg/kg was not exceeded by any of the samples. Only a very small quantity of lead is present in the environment naturally, and it has no known physiological function in the human body. Known applications include the manufacture of paints, plumbing supplies, and batteries. Its negative consequences range widely (Mañay et al., 2008). In its natural state, lead has no negative health impacts. The high levels of Pb in the environment are caused by human activities such as the use of leaded fuel and fossil fuels. Guidotti and Ragain (2007) expressed concern that residual soil may contain Pb that could cause its concentration in the environment to rise. Although no specification states the level of Pb expected in Dammam soil, all examined samples fall within the permissible range of the accepted standards. The ecosystem and the soil should nevertheless be continuously monitored and given adequate care.

The study findings also reveal that the surface soil samples contained low amounts of As, Cd, Hg, and V. The arsenic (As) concentration was highest in the industrial area with mean and maximum values of 1.58 and 4.56 mg/kg, followed by the agricultural area with mean and maximum values of 1.52 and 3.14 mg/kg, and it was lowest in the residential area (0.97 mg/kg on average, 2.22 mg/kg at its highest). All samples were below the 12 mg/kg limit. Cadmium (Cd) concentrations were highest in industrial areas, followed by residential areas with mean and maximum values of 1.878 and 28.69 mg/kg, and agricultural areas with mean and maximum values of 0.07 and 1.137 mg/kg. The permitted limit of 10 mg/kg was exceeded in one sample from both the industrial and agricultural regions. The concentration of mercury (Hg) was highest in industrial areas with mean and maximum values of 0.11 and 1.44 mg/kg, next in agricultural areas with mean and maximum values of 0.087 and 1.02 mg/kg, and finally in residential areas with mean and maximum values of 0.06 and 0.59 mg/kg. The permitted limit of 6.6 mg/kg was not exceeded by any of the samples. Industrial areas had the highest concentrations of vanadium (V) with mean and maximum values of 13.11 and 20.42 mg/kg, followed by agricultural areas with mean and maximum values of 11.52 and 21.89 mg/kg, and residential areas with mean and maximum values of 7.0 and 17.73 mg/kg. The permitted limit of 130 mg/kg was not exceeded by any of the samples. According to these results and the comparison with international criteria, the assessed trace metals pose no obvious hazard in the Dammam study regions. However, adequate monitoring remains crucial to keep these metal levels under control.
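
The comparison against the guideline values can be condensed into a ratio of mean concentration to permissible limit. The sketch below uses the industrial-area means and limits quoted in this section; reading the 1.58 mg/kg mean as arsenic is an assumption based on the 12 mg/kg limit, Cd is left out because its industrial mean is not clearly stated, and the variable names are hypothetical.

```python
# Industrial-area mean concentrations (mg/kg) as quoted in Section 3.1.
# "As" is assumed to be the metal behind the 1.58 mg/kg mean; Cd is omitted.
industrial_mean = {
    "Ba": 335.5, "Cr": 51.77, "Zn": 65.44, "Ni": 13.14, "Cu": 11.0,
    "Pb": 11.42, "As": 1.58, "Hg": 0.11, "V": 13.11,
}

# Permissible limits (mg/kg) as quoted in the same section.
limit = {
    "Ba": 500.0, "Cr": 74.0, "Zn": 200.0, "Ni": 50.0, "Cu": 63.0,
    "Pb": 140.0, "As": 12.0, "Hg": 6.6, "V": 130.0,
}

# A ratio above 1 would mean the zone-wide mean exceeds the guideline.
ratio = {metal: industrial_mean[metal] / limit[metal] for metal in industrial_mean}
exceeding = [metal for metal, value in ratio.items() if value > 1.0]
closest_to_limit = max(ratio, key=ratio.get)
```

The ratios describe zone-wide means only; individual samples can still exceed a limit, as reported above for Ba, Cr and Cu.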

3.2. Results of data-driven technique

This section presents the visual and quantitative performance of the data-driven approaches employed in the current study for predicting the concentration of Pb. AI techniques are currently robust, integrated computational approaches used in environmental studies, especially for estimating, predicting, and classifying heavy metals. AI-based techniques are becoming increasingly complex because several parameters must be set before and during the modelling stage. Therefore, trial and error was used to determine the best-performing structure for each of the three data-driven approaches (SWLR, L-Boost and GPR) for both M1 and M2. Table 1 gives the quantitative results for scenarios 1 and 2 in the training and testing steps. The comparison indicates that M2 is superior to M1 in both training and testing. Overall, GPR-M2 performed better than the GPR-M1, SWLR-M1, SWLR-M2, L-Boost-M1 and L-Boost-M2 models in the training and validation steps (see Table 1). The error metrics, RMSE and MSE, are always non-negative; lower values indicate higher performance and higher values indicate lower accuracy. Figure 6 visualizes the error metrics of the models in both the training and testing stages.

Figure 6. Performance error metrics in the form of RMSE and MSE in both the training and testing stages.

Table 1. Performance skills of the SWLR, L-Boost and GPR

The DC and PC are used to evaluate the fit between the experimental and predicted values (Abdullahi et al., 2020; Alas et al., 2020; Aliyu et al., 2021; Costache et al., 2022; Haruna et al., 2021; Ismail et al., 2022; Pham et al., 2021; Tao et al., 2022). Hence, scatter plots can be used to compare the models’ performance visually (see Figure 7). The scatter plots show how the predicted values are distributed against the experimental values; the model whose predictions lie closest to the experimental values is considered superior.

Figure 7. The degree of fitness of the two prediction scenarios against the experimental values.


Therefore, GPR-M2, L-Boost-M2 and SWLR-M2 capture the experimental values better than their respective first scenarios, GPR-M1, L-Boost-M1 and SWLR-M1, when estimating the Pb concentration in both the training and testing stages. Hence, the performance of the models reported in Table 1 is in line with the input–output relationship shown in Figure 3. This shows that the input combination of Cr, Cu, Hg, Ni and Zn has more predictive impact than the combination of Cd, As, Ba, Ti and V for predicting the concentration of Pb; this can be seen clearly in the response plot (see Figure 8).

Figure 8. Response plot performance of M1 and M2 input combinations for predicting the concentration of Pb (mg/kg).

4. Conclusions

The concentrations of the ten trace metals examined in the surface soil of the Dammam region showed a pattern common to practically all metals. The soil samples from industrial locations had the highest amounts, followed by samples from residential and agricultural areas. Only a small percentage of the samples had maximum values higher than the permitted limits. The average metal concentrations, however, did not exceed the permitted limits. These findings are initially reassuring: on average, the amounts of all the metals analyzed were within permissible ranges and did not pose an immediate risk to the environment. Nevertheless, the dangers and risks posed by such elements may still affect natural resources, animal health, and human health. Furthermore, three different computational methods, comprising two AI-based techniques, GPR and L-Boost, and a linear regression model, SWLR, were used to predict the concentration of Pb (mg/kg) from the industrial station in two scenarios, M1 and M2. The second scenario (M2) outperformed the first (M1) in all three approaches. Overall, the non-linear GPR-M2 was more accurate than all other model combinations applied in the current work. The performance of these models could be improved using other robust approaches such as ensemble machine learning paradigms, emotional neural networks and different metaheuristic approaches.

Disclosure statement

No potential conflict of interest was reported by the author.

References

  • Abba, S. I., Benaafi, M., Usman, A. G., & Aljundi, I. H. (2022). Sandstone groundwater salinization modelling using physicochemical variables in Southern Saudi Arabia: Application of novel data intelligent algorithms. Ain Shams Engineering Journal, 14(3), 101894. https://doi.org/10.1016/j.asej.2022.101894
  • Abba, S. I., Jasim, S., Sh, S., Salih, S. Q., Abdulkadir, R. A., Pham, Q. B., & Yaseen, Z. M. (2020). Evolutionary computational intelligence algorithm coupled with self-tuning predictive model for water quality index determination. Journal of Hydrology, 587(March), 124974. https://doi.org/10.1016/j.jhydrol.2020.124974
  • Abdullahi, H. U., Usman, A. G., & Abba, S. I. (2020). Modelling the absorbance of a bioactive compound in HPLC method using artificial neural network and multilinear regression methods. Dutse Journal of Pure and Applied Sciences (DUJOPAS), 6(2), 362–13.
  • Abel, M. T., Suedel, B., Presley, S. M., Rainwater, T. R., Austin, G. P., Cox, S. B., McDaniel, L. N., Rigdon, R., Goebel, T., Zartman, R., Leftwich, B. D., Anderson, T. A., Kendall, R. J., & Cobb, G. P. (2010). Spatial distribution of lead concentrations in urban surface soils of New Orleans, Louisiana USA. Environmental Geochemistry and Health, 32(5), 379–389. https://doi.org/10.1007/s10653-009-9282-1
  • Alas, M., Ali, S. I. A., Abdulhadi, Y., & Abba, S. I. (2020). Experimental evaluation and modeling of polymer nanocomposite modified asphalt binder using ANN and ANFIS. Journal of Materials in Civil Engineering, 32(10), 04020305. https://doi.org/10.1061/(ASCE)MT.1943-5533.0003404
  • Alhaji, U., Chinemezu, E., & Isah, S. (2022). Machine learning models for biomass energy content prediction: A correlation-based optimal feature selection approach. Bioresource Technology Reports, 19(September), 101167. https://doi.org/10.1016/j.biteb.2022.101167
  • Al-Hefne, J., Al-Dyel, O., Chowdhury, D. A., & Al-Ajayan, T. (2005). Distribution and ICP-MS determination of heavy elements in the surfacial sand along the red sea coastline of Saudi Arabia. Atomic Spectroscopy, 26(2), 51–58.
  • Aliyu, D. S., Malami, S. I., Anwar, F. H., Farouk, M. M., Labbo, M. S., & Abba, S. I. (2021). Prediction of compressive strength of lightweight concrete made with partially replaced cement by animal bone ash using artificial neural network. 2021 1st International Conference on Multidisciplinary Engineering and Applied Science, ICMEAS 2021, July. https://doi.org/10.1109/ICMEAS52683.2021.9692317
  • Alyemeni, M. N., Hayat, Q., Wijaya, L., & Hayat, S. (2014). Effect of salicylic acid on the growth, photosynthetic efficiency and enzyme activities of leguminous plant under cadmium stress. Notulae Botanicae Horti Agrobotanici Cluj-Napoca, 42(2), 440–445. https://doi.org/10.15835/nbha4229447
  • Arif, I., & Hashem, A. R. (1998). Soil analysis and mycoflora of Jizan city, Saudi Arabia. Phyton (Buenos Aires), 62(1–2), 109–113. https://eurekamag.com/research/003/274/003274329.php
  • Banerjee, P., Prasad, R. K., & Singh, V. S. (2009). Forecasting of groundwater level in hard rock region using artificial neural network. Environmental Geology, 58(6), 1239–1246. https://doi.org/10.1007/s00254-008-1619-z
  • Bhagat, S. K., Tiyasha, T., Kumar, A., Malik, T., Jawad, A. H., Khedher, K. M., Deo, R. C., & Yaseen, Z. M. (2022). Integrative artificial intelligence models for Australian coastal sediment lead prediction: An investigation of in-situ measurements and meteorological parameters effects. Journal of Environmental Management, 309(May), 114711. https://doi.org/10.1016/j.jenvman.2022.114711
  • Cai, H., Jia, X., Feng, J., Li, W., Hsu, Y. M., & Lee, J. (2020). Gaussian Process Regression for numerical wind speed prediction enhancement. Renewable Energy, 146, 2112–2123. https://doi.org/10.1016/j.renene.2019.08.018
  • Costache, R., Trung Tin, T., Arabameri, A., Crăciun, A., Ajin, R. S., Costache, I., Reza, M., Towfiqul Islam, A., Abba, S. I., Sahana, M., & Avand, M. (2022). Flash-flood hazard using Deep Learning based on H2O R package and fuzzy-multicriteria decision-making analysis. Journal of Hydrology, 609(June), 127747. https://doi.org/10.1016/j.jhydrol.2022.127747
  • Emsley, J. (2011). Nature’s building blocks: Everything you need to know about the elements. Oxford University Press.
  • Fosmire, G. J. (1990). Zinc Toxicity. The American Journal of Clinical Nutrition, 51(2), 225–227. https://doi.org/10.1093/ajcn/51.2.225
  • Gaetke, L. M., & Chow, C. K. (2003). Copper toxicity, oxidative stress, and antioxidant nutrients. Toxicology, 189(1–2), 147–163. https://doi.org/10.1016/S0300-483X(03)00159-8
  • Gaya, M. S., Abba, S. I., Abdu, A. M., Tukur, A. I., Saleh, M. A., Esmaili, P., & Wahab, N. A. (2020). Estimation of water quality index using artificial intelligence approaches and multi-linear regression. IAES International Journal of Artificial Intelligence (IJ-AI), 9(1), 126–134. https://doi.org/10.11591/ijai.v9.i1.pp126-134
  • Ghali, U. M., Usman, A. G., Alhosen, M., Degm, A., Alsharksi, A. N., Naibi, A. M., & Abba, S. I. (2020). Applications of artificial intelligence-based models and multi-linear regression for the prediction of thyroid stimulating hormone level in the human body. International Journal of Advanced Science and Technology, 29(4), 3690–3699.
  • Guidotti, T. L., & Ragain, L. (2007). Protecting children from toxic exposure: Three strategies. Pediatric Clinics of North America, 54(2), 227–235. https://doi.org/10.1016/j.pcl.2007.02.002
  • Han, F., Su, Y., Monts, D., Waggoner, C., & Plodinec, M. (2006). Binding, distribution, and plant uptake of mercury in a soil from Oak Ridge, Tennessee, USA. The Science of the Total Environment, 368(2–3), 753–768. https://doi.org/10.1016/j.scitotenv.2006.02.026
  • Haruna, S. I., Malami, S. I., Adamu, M., Usman, A. G., Farouk, A., Ali, S. I. A., & Abba, S. I. (2021). Compressive strength of self-compacting concrete modified with rice husk ash and calcium carbide waste modeling: A Feasibility of Emerging Emotional Intelligent Model (EANN) versus Traditional FFNN. Arabian Journal for Science and Engineering, 46(June), 11207–11222. https://doi.org/10.1007/s13369-021-05715-3
  • Hassan, T., Parveen, S., Bhat, B. N., & Ahmad, U. (2017). Seasonal variations in water quality parameters of river Yamuna, India. International Journal of Current Microbiology and Applied Sciences, 6(5), 694–712. https://doi.org/10.20546/ijcmas.2017.605.079
  • Heddam, S. (2016). Simultaneous modelling and forecasting of hourly dissolved oxygen concentration (DO) using radial basis function neural network (RBFNN) based approach: A case study from the Klamath River, Oregon, USA. Modeling Earth Systems and Environment, 2(3). https://doi.org/10.1007/s40808-016-0197-4
  • Huang, H., Peng, X., Li, Z., & Ding, S. (2021). Gaussian Process Regression with Maximizing the Composite Conditional Likelihood. IEEE Transactions on Instrumentation and Measurement. https://doi.org/10.1109/TIM.2021.3104376
  • Hunger, K. (2007). Industrial Dyes: Chemistry, Properties, Applications. John Wiley & Sons.
  • Ismail, S., Abdulkadir, Usman, A. G., Abdulkadir, R. A., & Abdulkadir, R. A. (2022). Development of chemometrics-based neurocomputing paradigm for simulation of manganese extraction using solid-phase tea waste. Modeling Earth Systems and Environment, 8(4), 5031–5040. https://doi.org/10.1007/s40808-022-01369-8
  • Joimel, S., Cortet, J., Jolivet, C., Saby, N., Chenot, E., Branchu, P., Consalès, J., Lefort, C., Morel, J.L., & Schwartz, C. (2016). Physico-chemical characteristics of topsoil for contrasted forest, agricultural, urban and industrial land uses in France. The Science of the Total Environment, 545, 40–47. https://doi.org/10.1016/j.scitotenv.2015.12.035
  • Kadi, M. W. (2009). “Soil Pollution Hazardous to Environment”: A case study on the chemical composition and correlation to automobile traffic of the roadside soil of Jeddah city, Saudi Arabia. Journal of Hazardous Materials, 168(2–3), 1280–1283. https://doi.org/10.1016/j.jhazmat.2009.03.015
  • Kargar, K., Samadianfard, S., Parsa, J., Nabipour, N., Shamshirband, S., Mosavi, A., & Chau, K. -W. (2020). Estimating longitudinal dispersion coefficient in natural streams using empirical models and machine learning algorithms. Engineering Applications of Computational Fluid Mechanics, 14(1), 311–322. https://doi.org/10.1080/19942060.2020.1712260
  • Kazemi, S. M., & Hosseini, S. M. (2011). Comparison of spatial interpolation methods for estimating trace metals in sediments of Caspian Sea. Expert Systems with Applications, 38(3), 1632–1649. https://doi.org/10.1016/j.eswa.2010.07.085
  • Kresse, R., Baudis, U., Jäger, P., Riechers, H., Heinz, H., Jocher, W., & Wolf, H. (2007). Barium and barium compounds. In Ullmann’s Encyclopedia of Industrial Chemistry. Wiley-VCH. https://doi.org/10.1002/14356007.a03_325.pub2
  • Lakshmi, D., Akhil, D., Kartik, A., Gopinath, K. P., Arun, J., Bhatnagar, A., Rinklebe, J., Kim, W., & Muthusamy, G. (2021). Artificial intelligence (AI) applications in adsorption of trace metals using modified biochar. The Science of the Total Environment, 801, 149623. https://doi.org/10.1016/j.scitotenv.2021.149623
  • Lal, A., & Datta, B. (2020). Performance evaluation of homogeneous and heterogeneous ensemble models for groundwater salinity predictions: A regional-scale comparison study. Water, Air, and Soil Pollution, 23(1), 1–21.
  • Mañay, M., Cousillas, A., Alvarez, C., & Heller, T. (2008). Lead contamination in Uruguay: The “La Teja” neighborhood case. Reviews of Environmental Contamination and Toxicology, 195(2008), 93–115.
  • Mohammadi, B., Linh, N. T. T., Pham, Q. B., Ahmed, A. N., Vojteková, J., Guan, Y., Abba, S. I., & El-Shafie, A. (2020). Adaptive neuro-fuzzy inference system coupled with shuffled frog leaping algorithm for predicting river streamflow time series. Hydrological Sciences Journal, 65(10), 1–14. https://doi.org/10.1080/02626667.2020.1758703
  • Nourani, V., Elkiran, G., & Abba, S. I. (2018). Wastewater treatment plant performance analysis using artificial intelligence - an ensemble approach. Water Science and Technology, 78(10), 2064–2076. https://doi.org/10.2166/wst.2018.477
  • Nunno, D., Abba, F. S. I., Quoc, B., Abu, P., Towfiqul, R., & Swapan, I. (2022). Groundwater level forecasting in Northern Bangladesh using nonlinear autoregressive exogenous (NARX) and extreme learning machine (ELM) neural networks. Arabian Journal of Geosciences, 15(647), 1–20. https://doi.org/10.1007/s12517-022-09906-6
  • Pham, Q. B., Gaya, M. S., Abba, S. I., Abdulkadir, R. A., Esmaili, P., Linh, N. T. T., Sharma, C., Malik, A., Khoi, D. N., Dung, T. D., & Linh, D. Q. (2020). Modeling of Bunus regional sewage treatment plant using machine learning approaches. Desalination and Water Treatment, 203(2020), 80–90.
  • Pham, Q. B., Sammen, S. S., Abba, S. I., Mohammadi, B., Shahid, S., & Abdulkadir, R. A. (2021). A new hybrid model based on relevance vector machine with flower pollination algorithm for phycocyanin pigment concentration estimation. Environmental Science and Pollution Research, 28(25), 32564–32579.
  • Rodríguez-Salazar, M. T., Morton-Bermea, O., Hernández-Álvarez, E., Lozano, R., & Tapia-Cruz, V. (2011). The study of metal contamination in urban surface soils of Mexico City using GIS. Environmental Earth Sciences, 62(5), 899–905.
  • Russell, M. S., & Svrcula, K. (2008). Chemistry of fireworks. Royal Society of Chemistry.
  • Tao, H., Majeed Hameed, M., Abdulameer Marhoon, H., Zounemat-Kermani, M., Salim, H., Sungwon, K., Oleiwi Sulaiman, S., Leong Tan, M., Sa’adi, Z., Danandeh Mehr, A., Falah Allawi, M., Abba, S. I., Mohamad Zain, J., Falah, M. W., Jamei, M., Dhanraj Bokde, N., Bayatvarkeshi, M., Al-Mukhtar, M., Kumar Bhagat, S., & Mundher Yaseen, Z. (2022). Groundwater Level Prediction using Machine Learning Models: A Comprehensive Review. Neurocomputing, 489, 271–308. https://doi.org/10.1016/j.neucom.2022.03.014
  • Tawabini, B., Yassin, M. A., Benaafi, M., Adetoro, J. A., Al-Shaibani, A., & Abba, S. I. (2022). Spatiotemporal variability assessment of trace metals based on subsurface water quality impact integrated with artificial intelligence-based modeling. Sustainability (Switzerland), 14(4). https://doi.org/10.3390/su14042192
  • Tran, D. A., Tsujimura, M., Ha, N. T., Nguyen, V. T., Binh, D. V., Dang, T. D., Doan, Q. V., Bui, D. T., Anh Ngoc, T., Phu, L. V., Thuc, P. T. B., & Pham, T. D. (2021). Evaluating the predictive power of different machine learning algorithms for groundwater salinity prediction of multi-layer coastal aquifers in the Mekong Delta, Vietnam. Ecological Indicators, 127, 107790. https://doi.org/10.1016/j.ecolind.2021.107790
  • USEPA. (1994). SW-846 Test Method 3050B: Acid Digestion of Sediments, Sludges, and Soils. Test Methods for Evaluating Solid Waste, Physical/Chemical Methods.
  • USEPA. (2001). Method 200.7. Trace elements in water, solids, and biosolids by inductively coupled plasma-atomic emission spectrometry. Rev. 5, Jan. 2001. EPA-821-R-01-010.
  • Usman, A. G., Ahmad, M. H., Danraka, N., & Abba, S. I. (2021). The effect of ethanolic leaves extract of Hymenodictyon floribundun on inflammatory biomarkers: A data driven approach. Bulletin of the National Research Centre, 45(128), 1–2. https://doi.org/10.1186/s42269-021-00586-y
  • Usman, A. G., Işik, S., Abba, S. I., & Meriçli, F. (2021). Chemometrics-based models hyphenated with ensemble machine learning for retention time simulation of isoquercitrin in Coriander sativum L. using high-performance liquid chromatography. Journal of Separation Science, 44(4), 843–849. https://doi.org/10.1002/jssc.202000890
  • Wadhawan. (2019). Introduction to ferroic materials. CRC Press.
  • Wang, H., Yilihamu, Q., Yuan, M., Bai, H., Xu, H., & Wu, J. (2020). Prediction models of soil heavy metal(loid)s concentration for agricultural land in Dongli: A comparison of regression and random forest. Ecological Indicators, 119(December), 106801. https://doi.org/10.1016/j.ecolind.2020.106801
  • Wiangkham, A., Ariyarit, A., & Aengchuan, P. (2022). Prediction of the influence of loading rate and sugarcane leaves concentration on fracture toughness of sugarcane leaves and epoxy composite using artificial intelligence. Theoretical and Applied Fracture Mechanics, 117(2022), 103188.
  • Yaseen, Z. M. (2021). An insight into machine learning models era in simulating soil, water bodies and adsorption heavy metals: Review, challenges and solutions. Chemosphere, 277, 130126. https://doi.org/10.1016/j.chemosphere.2021.130126
  • Yassin, M. A., Tawabini, B., Al-Shaibani, A., Adetoro, J. A., Benaafi, M., Al-Areeq, A. M., Usman, A. G., & Abba, S. I. (2022). Geochemical and spatial distribution of surface soil HMs coupled with modeling of Cr using chemometrics intelligent techniques: Case study from Dammam Area, Saudi Arabia. Molecules, 27(13), 4220. https://doi.org/10.3390/molecules27134220
  • Yeskis, D., & Zavala, B. (2002). Ground-Water Sampling Guidelines for Superfund and RCRA Project Managers. USEPA Office of Solid Waste and Emergency Response: EPA 542-S-02-001, May 2002. www.epa.gov/tio