
Hyperspectral estimation of mercury content of soil in Oasis city in arid zones of China

Article: 2299147 | Received 13 Jun 2023, Accepted 20 Dec 2023, Published online: 18 Jan 2024

Abstract

Mercury (Hg) is one of the most toxic heavy metals to the human body. Conventional methods for measuring Hg content in soil are time-consuming and expensive. To select a highly effective method for estimating soil Hg content based on hyperspectral remote sensing techniques, a total of 85 soil samples were collected from Urumqi city, northwest China, to obtain the Hg contents and related hyperspectral data. A total of 12 spectral transformation methods were applied to the original spectral data to select significant wavebands. Partial least squares regression (PLSR), random forest regression (RFR) and support vector machine regression (SVMR) were used to establish hyperspectral inversion models for soil Hg content using the selected significant wavebands. The results showed that the Hg content of soil was significantly higher than its corresponding background value, indicating that Hg is clearly enriched in soil in the study area. Spectral transformation of the original wavebands can effectively reduce the interference of background noise and improve the correlations between the spectral data and the soil Hg content. The RFR models based on the logarithmic first-order differential (LTFD–RFR) and on the reciprocal logarithmic first-order differential (ATFD–RFR) had the best inversion effects, with the highest prediction ability (R² = 0.856, RMSE = 0.002 and MAE = 0.072). The LTFD–RFR and ATFD–RFR methods can be used as a means of inverting the Hg content of soil in oasis cities. The novel contribution of this work is the construction of a hyperspectral inversion model that can accurately estimate the Hg content of urban soils in arid zones. The results of this study can provide technical support for hyperspectral estimation of soil Hg content.

1. Introduction

Mercury (Hg) is a highly toxic metal contaminant known for its concealment, persistent impact and bio-accumulation (Pirrone et al. 2010). Considering the potential ecological and health risks of Hg in soils, it is imperative to closely monitor Hg contamination in areas with high emissions. Urban soil is both a main source and a potential reservoir of contaminants, including Hg, in the urban environment. It is regarded as a carrier of various environmental contaminants from traffic emissions and other anthropogenic sources (Nazupar et al. 2022). Through circulation among different ecosystems, Hg that has accumulated in urban soils can ultimately pose a threat to urban ecosystems and human health (Zhao et al. 2018).

Quickly and efficiently estimating heavy metal contents in soil is crucial for protecting the natural environment and human health (Zhou et al. 2022). The traditional chemical approach to determining soil heavy metal contents is field sampling followed by laboratory analysis of the soil samples. This approach has high accuracy but is cost-intensive, time-consuming and inefficient (Xue et al. 2020; Ajay et al. 2020), and the detection reagents may cause secondary contamination (Tian et al. 2020). Hyperspectral remote sensing technology has the advantages of rapid monitoring and lower cost, and plays an important role in the estimation and detection of heavy metal contents in soil (Wang et al. 2018; Lin et al. 2019; Tan et al. 2021; Chang et al. 2022). A few studies (Idowu et al. 2008; Zhao et al. 2018) indicated that the spectral curves of soils contaminated by heavy metals differ from those of uncontaminated soils. In light of this, the hyperspectral estimation of soil Hg content has emerged as an important frontier in environmental research in recent years.

Establishing a hyperspectral inversion model is a basic approach to estimating heavy metal contents of soil, and has proven effective in analysing the contamination levels of soil in different regions of the world (Song et al. 2015; Dong et al. 2016). Differential transformation of soil spectral data can improve the correlations between spectral data and heavy metal contents of soil, and can thus effectively improve the prediction accuracy of hyperspectral inversion models for soil heavy metal contents (Liu et al. 2017; Yang, Xu, et al. 2022).

Statistical analysis models, especially partial least squares regression (PLSR), and machine learning models, such as support vector machine regression (SVMR) and random forest regression (RFR), are the most commonly used inversion methods for the hyperspectral estimation of soil heavy metal contents. For example, Liu et al. (2017) pointed out that the LG-PLSR model is the best model for predicting the soil Hg content in Nanjing, Jiangsu Province, China, with an R² of 0.906 and an RMSE of 0.077. Zhao et al. (2018) established a relationship between hyperspectral data and soil Hg content. Their results showed that the back-propagation neural network based on a genetic algorithm (GA–BPNN) is the best method to estimate the Hg content of soil in Guangdong Province, China, with an R² of 0.923, an RMSE of 0.042 and an MAE of 0.033. Tao et al. (2018) reported that the support vector machine (SVM) is the most effective hyperspectral inversion model for lead and zinc in soil in a mining area in Chenzhou city, China. The CWT–RBF method was also shown to be a useful means of hyperspectral inversion of heavy metals in soil, in a comparison of the PLSR and radial basis function neural network (RBFNN) models based on four spectral transformation methods: the first-order differential, the second-order differential, continuum removal and the continuous wavelet transform (Zhang et al. 2019). Liu et al. (2020) established a hyperspectral inversion model of Hg in reed leaves under different levels of soil Hg contamination. Their study indicated that the level of Hg in soil can affect the estimation accuracy of hyperspectral inversion models for Hg in reed leaves, and that the NR-CWT-SMLR model (R² = 0.859, RMSE = 0.096) and the NR-CWT-RF model (R² = 0.856, RMSE = 0.106) showed high accuracy in Hg inversion in Hengshui, Hebei Province, China. Wei et al. (2020) reported that the RBFNN had the best universality and high prediction accuracy for estimating the arsenic content in soil of different land-use types. SVM recursive feature elimination with cross-validation (SVM–RFECV) combined with the AdaBoost method was more stable and can effectively estimate the Cd, Cr, Cu and Ni contents in soil (Wang et al. 2023). Nawar et al. (2023) investigated the potential of visible near-infrared spectroscopy combined with the PLSR and variable selection methods to predict the main heavy metals in agricultural soils under arid conditions. Meanwhile, the spectrum contextual self-attention deep learning network (SCSANet) was shown to have the highest inversion accuracy for soil Hg with a similar magnitude of content (Zhang et al. 2023). Many publications have thus achieved good results in hyperspectral remote sensing inversion of soil heavy metals, and model accuracy continues to improve. However, because the natural environment, soil types and soil properties differ among regions, no single hyperspectral remote sensing inversion method is optimal everywhere.

The aforementioned studies mainly focused on the hyperspectral inversion of heavy metals in mining areas, agricultural soils or natural soils. Some other recent studies focused on the hyperspectral inversion of other soil properties. For example, Datta et al. (2023) used different machine and deep learning techniques to predict soil nutrient properties, in order to propose an optimum machine/deep learning model that can be used as a rapid soil test. Xu et al. (2023) investigated the potential of detecting microplastics in farmland soil using hyperspectral technology and machine learning methods, including the SVM, back propagation neural network (BPNN) and one-dimensional convolutional neural network (1D-CNN) models. So far, few studies have addressed the hyperspectral inversion of the Hg content of urban soils in arid zones. The results of previous studies are applicable only to limited areas, because the most appropriate hyperspectral estimation model differs among soil types and among soils with different geochemical properties in different regions. Besides, the accuracy of hyperspectral inversion models can be affected by the natural conditions of the studied areas, as well as by the quality of the input data, the spectral analysis methods and the model optimization algorithms (Shekhar et al. 2018; Zhao et al. 2022). Therefore, it is very important to analyse the feasibility of hyperspectral inversion of Hg contents in urban soils, which differ from agricultural and natural soils.

Urumqi is the economic center and the most important industrialized city in Xinjiang, northwestern China. In recent years, owing to rapid urbanization, Urumqi has been influenced by industry, transportation and other human activities, and its soils have been heavily contaminated by Hg (Nazupar et al. 2022). Therefore, special attention should be paid to soil Hg contents in order to minimize the associated threats, and rapid and dynamic monitoring of the Hg content of soil is necessary for protecting the environmental safety of this city.

The main objective of this research is to select an accurate model for the hyperspectral inversion of Hg content in urban soil by comparing the accuracy of hyperspectral inversion models based on the PLSR and two machine learning methods, the SVMR and the RFR. The results of this study are intended to help address existing problems in the hyperspectral inversion of Hg content in urban soils using statistical analysis and machine learning methods.

2. Materials and methods

2.1. Study area

The city of Urumqi is located in the arid zone of northwestern China, in the middle part of the Tianshan Mountains, with a total urban area of about 550 km² (Figure 1). Urumqi is the biggest metropolitan city in Xinjiang and one of the most important cities in the ‘Silk Road Economic Belt’, occupying a very important position in the ‘Belt and Road’ strategy. Urumqi is characterized by a typical continental desert climate, with an average annual temperature of 6.7 °C, and average annual precipitation and evaporation of 280 mm and 2730 mm, respectively. Grey desert soil and salinized soil are the main soil types in this area, and Hg is the heavy metal with the highest pollution level in the soil of Urumqi (Nazupar et al. 2022).

Figure 1. Locations of the study area and sample sites. (a), (b) and (c) represent the field condition of the sample sites.


2.2. Sample collection and analysis

A total of 85 soil samples were collected from the study area in April 2021 (Figure 1). At each sample site, five sub-samples were taken from the topsoil (0–20 cm) layer within a 100 m × 100 m area and then manually mixed in a clean polyethylene bag to form one composite soil sample. All the soil samples were returned to the laboratory, naturally dried, crushed and passed through a 20-mesh sieve. The collected samples were divided into two groups, one for determination of Hg content and the other for hyperspectral measurement.

The Hg content of the soil samples was determined by an atomic fluorescence spectrophotometer (AFS–933) after dissolution by the four-acid method, following the determination method for Hg content described in GB/T 22105.2–2008 (MAPRC 2008). Analytical data quality was checked by laboratory quality control methods, including the use of reagent blanks, duplicates and standard reference materials for each batch of soil samples. To assess the precision of the analytical procedures, a standard solution was used to compare samples to the national standards (Chinese national standard samples, GSS–12). All of the soil samples were tested repeatedly, and the determined consistency of the Hg measurements was 94.5%. The basic statistical summary of the salinity, moisture contents and pH values of the collected soil samples is given in Table 1.

Table 1. Characteristics of the collected soil samples.

2.3. Spectrometric determination and processing

Spectral data of the collected soil samples were measured using a FieldSpec®3 portable object spectrometer manufactured by Analytical Spectral Devices (ASD), USA. The data acquisition interval was 1 nm, with a spectral measurement range from 350 to 2500 nm. The spectrometer was preheated for half an hour before the test, then the probe was vertically aligned to the whiteboard for optimization, and finally the soil samples were measured to obtain the original spectral curves. During the test, the 20-mesh sieved soil sample was placed on a black cardboard of 200 cm × 200 cm, and the sensor probe of the spectrometer was positioned 15 cm above and perpendicular to the soil surface. For each soil sample, a total of 15 replicate measurements were taken, giving 15 spectral curves, and their average was taken as the final spectral curve using the ViewSpecPro software. Owing to the influence of the spectral instrument itself, atmospheric properties and stability, and the surrounding environment, considerable noise was generated in some spectral bands. Therefore, the data from 350 to 399 nm, from 1350 to 1430 nm, from 1781 to 1970 nm and from 2401 to 2500 nm were excluded. Thus, a total of 1730 spectral bands of data were obtained. The Savitzky–Golay filter algorithm was implemented in Python to smooth the spectral curves and remove noise. Figure 2 illustrates the spectral reflectance curves of the collected soil samples after the above spectral pretreatment.
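The band count stated above can be checked with a short sketch. It assumes only what the text gives: 1 nm sampling from 350 to 2500 nm and four excluded ranges with inclusive bounds. The variable and function names are illustrative and not taken from the study's code.

```python
# Wavelengths sampled at 1 nm from 350 to 2500 nm inclusive.
wavelengths = list(range(350, 2501))

# Noisy ranges excluded in the text (inclusive bounds, in nm).
excluded_ranges = [(350, 399), (1350, 1430), (1781, 1970), (2401, 2500)]


def is_kept(wavelength_nm):
    """Return True when the band lies outside every excluded range."""
    return not any(lo <= wavelength_nm <= hi for lo, hi in excluded_ranges)


kept_wavelengths = [w for w in wavelengths if is_kept(w)]
n_removed = len(wavelengths) - len(kept_wavelengths)

# Total bands, removed bands, retained bands.
print(len(wavelengths), n_removed, len(kept_wavelengths))  # 2151 421 1730
```

Of the 2151 measured bands, 421 fall in the excluded ranges, which leaves the 1730 bands reported in the text.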

Figure 2. The original and Savitzky–Golay-smoothed spectral reflectance curves of soil.


2.4. Algorithm construction

The original spectral response signals of heavy metals in soil are weak, so it is difficult to identify the important wavelengths directly from the original spectra (Michelle et al. 2011). In order to further reduce the interference of the environmental background on the soil spectral data and to enhance the spectral information related to Hg in the soil samples, the original spectral reflectance data were subjected to first-order differentiation (FD), second-order differentiation (SD), reciprocal first-order differentiation (RTFD), reciprocal second-order differentiation (RTSD), logarithmic first-order differentiation (LTFD), logarithmic second-order differentiation (LTSD), root mean first-order differentiation (RMSFD), root mean second-order differentiation (RMSSD), reciprocal logarithmic first-order differential (ATFD), reciprocal logarithmic second-order differential (ATSD), reciprocal logarithmic first-order differential (RLFD) and reciprocal logarithmic second-order differential (RLSD). To determine the spectral bands highly correlated with the soil Hg contents, the Pearson correlation coefficients between the soil Hg contents and the original spectral reflectance, and those after the 12 types of spectral transformation, were calculated. The wavebands with strong correlation coefficients were then selected for constructing the hyperspectral estimation model.
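A few of these transformations can be sketched as follows. The text does not give the exact formulas, so several points here are assumptions: differentiation is approximated by a forward difference between adjacent 1 nm bands, the logarithm is taken to base 10, and ATFD is read as the first difference of log10(1/R). The reflectance values are made up for illustration.

```python
import math


def first_difference(values):
    """Forward first-order difference between adjacent bands (assumed form)."""
    return [b - a for a, b in zip(values, values[1:])]


def second_difference(values):
    """Second-order difference, i.e. the first difference applied twice."""
    return first_difference(first_difference(values))


# Hypothetical reflectance values for five adjacent bands (not measured data).
reflectance = [0.20, 0.22, 0.25, 0.24, 0.30]

fd = first_difference(reflectance)                                   # FD
sd = second_difference(reflectance)                                  # SD
rtfd = first_difference([1.0 / r for r in reflectance])              # RTFD
ltfd = first_difference([math.log10(r) for r in reflectance])        # LTFD
atfd = first_difference([math.log10(1.0 / r) for r in reflectance])  # ATFD (assumed)

# Under the assumed definitions, ATFD is the negative of LTFD band by band.
print(all(abs(a + l) < 1e-12 for a, l in zip(atfd, ltfd)))
```

Under these assumed definitions the LTFD and ATFD spectra differ only in sign, so their correlation coefficients with Hg content have the same absolute values.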

2.5. Algorithm assessment approach

In order to consider both the soil Hg content and the spectral vector, the collected soil samples were randomly split into a calibration set and a verification set for modeling and verification, respectively. The calibration set contained 66 samples and the verification set 19 samples. The calibration set was used to construct the hyperspectral estimation models, while the verification set was used to evaluate their estimation accuracy. The original and the 12 types of transformed spectral data were selected as the independent variables, whereas the soil Hg content was selected as the dependent variable. The PLSR, RFR and SVMR models were then constructed for predicting the Hg content of soil. The calibration set was modeled in Python, and the ‘random_state’ of the three models was set to 53. Because the RFR model is stochastic, the value of ‘n_estimators’ affects its predictive performance. Considering model performance, running time, sample number and other factors, ‘n_estimators’ of the RFR model was set to 3. The determination coefficient (R²), root mean square error (RMSE) and mean absolute error (MAE) were used to evaluate the prediction accuracy of the constructed inversion models.

The R² was used to express the goodness of fit of the inversion models. When R² < 0.5, the inversion model has no prediction ability; when 0.5 ≤ R² < 0.7, it has preliminary prediction ability; and when R² ≥ 0.7, it has good prediction ability (Vohland et al. 2011). The RMSE and MAE were used to indicate the predictive capacity and robustness of the inversion models. In general, smaller values of RMSE and MAE indicate higher prediction accuracy (Wang et al. 2023).
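The three metrics and the R² thresholds can be written out as a minimal sketch. The standard textbook definitions are used here, with R² computed as one minus the ratio of residual to total sum of squares. The text does not say which definition the study used, so this is an assumption, and the measured and predicted values are hypothetical.

```python
import math


def r2_score(measured, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_measured = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean_measured) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot


def rmse(measured, predicted):
    """Root mean square error."""
    n = len(measured)
    return math.sqrt(sum((m - p) ** 2 for m, p in zip(measured, predicted)) / n)


def mae(measured, predicted):
    """Mean absolute error."""
    n = len(measured)
    return sum(abs(m - p) for m, p in zip(measured, predicted)) / n


def prediction_ability(r2):
    """Classify a model by the R² thresholds quoted in the text."""
    if r2 < 0.5:
        return "none"
    if r2 < 0.7:
        return "preliminary"
    return "good"


# Hypothetical measured and predicted Hg contents (mg/kg), for illustration only.
measured = [0.30, 0.45, 0.60, 0.75]
predicted = [0.32, 0.40, 0.63, 0.70]

print(r2_score(measured, predicted), rmse(measured, predicted), mae(measured, predicted))
print(prediction_ability(0.856))  # good
```

For any set of predictions, the RMSE is at least as large as the MAE when both are computed on the same data in the same units, which is a useful sanity check when comparing reported values.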

2.6. Statistical analysis

Statistical analysis was performed using the Origin software (Origin 2018, Origin Lab, Northampton, MA). The original data of Hg in the collected soil samples were summarized using ranges, average values, standard deviations (St.D) and coefficients of variation (CV). Pearson’s correlation analysis was used to identify the correlations between the Hg content and the spectral reflectance data of soil.

3. Results

3.1. Statistical analysis of Hg content in soil

The basic statistical summary of the Hg contents of the calibration and verification sets is given in Table 2, together with the corresponding background value (Mamattursun et al. 2018) of Hg in soils in Xinjiang. Table 2 indicates that the Hg contents of all the collected soil samples in the study area were distributed in the range of 0.21–0.87 mg·kg⁻¹, with an average value of 0.47 mg·kg⁻¹, which exceeded the corresponding background value by a factor of 28.24. This indicates that the Hg content of soil was significantly higher than its background value, and that Hg was clearly enriched in soil in the study area. The average values of salinity, moisture content and pH of the collected soil samples were 1.21%, 0.05 mg·kg⁻¹ and 7.85, respectively, indicating alkaline conditions (Table 1).

Table 2. Descriptive statistics of Hg content in each data set.

The average Hg contents of the calibration and verification sets were 0.47 mg·kg⁻¹ and 0.46 mg·kg⁻¹, respectively. The St.D values of the Hg contents of the two sets were both 0.17, while the CV values of the calibration and verification sets were 0.35 and 0.40, respectively. The average, St.D and CV values of the Hg contents in the two sets were therefore similar, which indicates that the division of the soil samples was reasonable and that the sets can be used for subsequent modeling.

3.2. Hg content in soil and spectral correlations

To identify the correlations between the Hg content and the spectral data of the collected soil samples, correlation analysis was performed between the Hg content and both the original spectral reflectance data and the spectral reflectance data after the 12 types of mathematical transformation. The degree of correlation was expressed by the Pearson correlation coefficient (R), and its significance was tested at the p < 0.01 level (two-sided).

As illustrated in Figure 3, the R of the original spectrum (red line in Figure 3) varied by a small amount and the curve was relatively smooth. The correlation coefficients between the original spectral reflectance data and the Hg content ranged from −0.187 to 0.018, which indicates that the correlations between the original spectral reflectance data and the Hg content of soil were very low. However, after the 12 types of spectral transformation, the correlation coefficients of the spectral reflectance data improved, and the numbers of effective and sensitive wavebands increased significantly (Figure 3). This shows that spectral transformation of the original spectral data can effectively reduce the interference of background noise and improve the correlation between the spectral data and the Hg content in soil in the study area.

Figure 3. Correlations between soil Hg content and spectral reflectance data.


3.3. Establishment and analysis of the hyperspectral inversion model

The collected soil samples were divided into a calibration set and a verification set for modeling and verification. The calibration set was used to construct the inversion model, whereas the verification set was used to evaluate the performance of the final model. Based on the correlation coefficients between the Hg content and the spectral reflectance data after the 12 types of transformation, wavebands whose correlation coefficients had absolute values greater than 0.276 were taken as the significant wavebands. The significant wavebands were then selected as the independent variables (x), and the Hg contents of soil as the dependent variable (y). The hyperspectral inversion models for soil Hg content under each spectral transformation were established based on the PLSR, SVMR and RFR algorithms. The basic statistics related to the stability and accuracy of the hyperspectral inversion models are given in Table 3.
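The waveband selection step can be illustrated with a small sketch that computes the Pearson coefficient for each band and keeps the bands whose absolute coefficient exceeds 0.276. The function names, the dictionary layout and the toy values are hypothetical, and the study's own implementation may differ.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sy = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sx * sy)


def select_significant_bands(spectra_by_band, hg, threshold=0.276):
    """Return the wavelengths whose |r| with Hg content exceeds the threshold."""
    return [
        wavelength
        for wavelength, values in sorted(spectra_by_band.items())
        if abs(pearson_r(values, hg)) > threshold
    ]


# Toy data: five samples, three bands of transformed spectra (not measured values).
hg = [0.21, 0.35, 0.47, 0.62, 0.87]
spectra_by_band = {
    400: [0.010, 0.014, 0.019, 0.025, 0.034],  # rises with Hg
    401: [0.020, 0.018, 0.024, 0.018, 0.020],  # nearly unrelated to Hg
    402: [0.050, 0.041, 0.036, 0.027, 0.012],  # falls with Hg
}

selected = select_significant_bands(spectra_by_band, hg)
print(selected)  # [400, 402]
```

Bands with strong negative correlation are retained as well, because the selection uses the absolute value of the coefficient.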

Table 3. Statistics of precision parameter of the inversion models.

As shown in Table 3, the prediction results of the different inversion models differed. The R² values obtained with the PLSR models ranged from 0.001 to 0.067, the RMSE values from 0.014 to 0.042, and the MAE values from 0.173 to 0.179. This indicates that the PLSR model has no prediction ability for the Hg content in soil, and that the significant wavebands selected after the different spectral transformations failed to yield an effective PLSR inversion model.

The ranges of the R², RMSE and MAE values obtained with the SVMR were 0.151–0.216, 0.051–0.056 and 0.142–0.146, respectively. Because the R² was lower than 0.5, the fitting effect and stability of the SVMR were poor, and the SVMR model essentially has no prediction ability for the Hg content in soil. Nevertheless, the SVMR performed somewhat better than the PLSR.

The R² values obtained with the RFR model ranged from 0.718 to 0.856, indicating that the constructed inversion model had good prediction ability. The ranges of the RMSE and MAE values obtained with the RFR model were 0.001–0.008 and 0.072–0.093, respectively. These relatively small RMSE and MAE values indicate that the RFR model had high prediction accuracy, stability and generalization ability. The inversion accuracy for the Hg content in soil can be ranked as R²(RFR) > R²(SVMR) > R²(PLSR). Therefore, among the three modeling methods, the inversion accuracy of the RFR was significantly better than that of the PLSR and the SVMR.

Besides, as shown in Table 3, the goodness of fit, stability and accuracy of the inversion model varied with the spectral transformation applied to the original spectral data. For example, the R² of the RFR model under the various spectral transformations can be ranked as: LTFD (0.856) = ATFD (0.856) > LTSD (0.809) = ATSD (0.809) > RTFD (0.800) > RLSD (0.779) > RMSFD (0.770) = SD (0.770) > RTSD (0.764) > FD (0.763) > RMSSD (0.742) > RLFD (0.718). The RMSE and MAE values obtained with the RFR model also differed. This indicates that spectral transformation of the original spectral data highlighted the characteristic wavebands of soil, reduced the interference of background noise and improved the correlations between the spectral data and the Hg content in soil in the study area.

Overall, the goodness of fit, prediction ability and prediction accuracy of the RFR models based on the logarithmic first-order differential (LTFD–RFR) and the reciprocal logarithmic first-order differential (ATFD–RFR) were the best (R² = 0.856, RMSE = 0.002, MAE = 0.072) among the 12 types of spectral transformation of the original spectral data. Therefore, the combination of characteristic wavebands was selected based on the best spectral transformation and the RFR model. The Hg content in soil was inverted effectively, and the scatter plot of the measured values and the values predicted by the RFR model is shown in Figure 4.

Figure 4. Measured and RFR predicted values of Hg content in soil.


4. Discussions

Trace elements in soil can affect the spectral reflectance curve of soil, which provides a theoretical basis for the indirect prediction of trace element contents in soil (Wang et al. 2018). With the development of hyperspectral remote sensing techniques, many scholars have performed hyperspectral inversion of the contents of trace elements, such as As, Cd, Cu and Hg, in soil (Shi et al. 2014; Zhou et al. 2021). Other studies have performed hyperspectral inversion of soil moisture (Levy and Johnson 2021), soil organic matter (Yang, Hu, et al. 2022), soil salinity (Jia, Zhang, He, Yuan, et al. 2022) and soil electrical conductivity (Jia, Zhang, He, Hu, et al. 2022). These studies indicated that spectral reflectance is very important for establishing hyperspectral inversion models for soil geochemistry. The results of our study are in agreement with those of these previous studies.

Estimating the Hg content in soils using hyperspectral remote sensing is cost-efficient but challenging because of the effects of natural environmental conditions and soil properties (Liu et al. 2020). The spectral reflectance of soil is affected by soil geochemical characteristics combined with natural environmental factors, such as the soil parent material and soil formation conditions (Zhang et al. 2019). Owing to the presence of contaminants such as metal elements, the spectral reflectance of contaminated soil is slightly different from that of uncontaminated soil (Wei et al. 2020). Besides, the environmental conditions of oasis soil are complex and display distinctive regional characteristics. In this study, the average Hg content of the collected soil samples exceeded its corresponding background value by a factor of 28.24, indicating high enrichment. The hyperspectral reflectance data and Hg contents of soil in the study area are therefore particularly suitable for hyperspectral modeling. The extremely arid environmental conditions and special soil types of the study area, together with the extremely high enrichment of Hg in the collected soil samples, suggest that this study has implications for both research and practice in the hyperspectral modeling of Hg in soil in arid zones.

Spectral transformation is an effective approach for identifying highly correlated spectral bands and the best hyperspectral inversion models (Wang et al. 2023). In this study, 12 types of spectral transformation were applied to the original spectral data. The results indicated that the response of the original spectral values to Hg content is weak, so the original spectral data cannot be used directly to invert the soil Hg content. However, the different spectral transformations of the original spectral data showed different enhancement effects on the correlations between the spectral reflectance data and the Hg contents in soil. This suggests that spectral transformation enhanced the spectral information and further improved the prediction accuracy of the hyperspectral inversion models of Hg content in soil. Based on differential spectral transformation and Pearson correlation analysis, this study used the PLSR, SVMR and RFR algorithms for modeling, and found that the LTFD–RFR and ATFD–RFR models had the best inversion effects, with the highest prediction ability (R² = 0.856, RMSE = 0.002 and MAE = 0.072). Both the LTFD–RFR and the ATFD–RFR methods can be used as a means of hyperspectral inversion of Hg content in soil in oasis cities.

However, the fitting effect and stability of the PLSR model after all types of spectral transformation were poor, as shown in Table 3, so the PLSR model essentially had no prediction ability. This is because the Hg content and the spectral reflectance of soil have a complex nonlinear relationship (Zhao et al. 2018), which resulted in very poor performance of the PLSR model. The RFR showed better prediction ability than the PLSR and the SVMR, partly because the RF is adept at regression of large datasets. Moreover, by randomly selecting samples and independent variables, the RFR not only probed the association between independent and dependent variables, but also emphasized the differences between samples and between independent variables, and efficiently improved the inversion accuracy (Liu et al. 2020).

A fast and convenient method for estimating the Hg content in soil was described in this work. This method can provide an effective way of predicting the Hg contents of soil in arid-zone oases. The results obtained in this study are in agreement with those of previous studies (Liu et al. 2017; Zhao et al. 2018; Liu et al. 2020), which implies that the Hg contents of soil can be assessed using hyperspectral remote sensing technology with reasonable accuracy. The results of this study support the use of a remote-sensing approach for characterizing Hg contents in soil in arid zones.

Overall, the novel contribution of this work is the construction of a hyperspectral inversion model that can accurately estimate the Hg content of urban soils in arid zones, so that subsequent remediation work can be carried out with less time and cost than with traditional chemical methods. However, only Hg was studied in the present work. The influences of other trace elements and of soil properties (such as soil salinity, moisture content and pH value) on the prediction accuracy were not considered. Therefore, the relevant physical and chemical properties of soil samples should be given attention in future work. In addition, the present work only considered the PLSR, RFR and SVMR algorithms. Other powerful machine/deep learning methods, such as geographically weighted regression (GWR), artificial neural networks (ANN), naive Bayes (NB), AdaBoost, the k-nearest neighbor method (KNN) and Cubist, should be assessed for their potential to further improve the prediction accuracy of Hg contents in urban soils.

5. Conclusion

Heavy metal contamination of soil is a major environmental issue in urban areas. Traditional measurement of soil Hg content is time-consuming and inefficient. This work demonstrated a hyperspectral remote sensing technique for predicting the Hg content of urban soil based on the relationship between the Hg contents and the spectral values of soil, thus facilitating a non-destructive approach for estimating the Hg contents of urban soils. Moreover, the methods proposed in this study contribute to the development of green analytical chemistry. The results indicated that spectral transformation can clearly reduce the interference of the environmental background and improve the correlations between soil spectral reflectance data and the Hg contents of soil. The PLSR, RFR and SVMR algorithms were used to model and analyse the relations between Hg contents and spectral variables. RFR is the most accurate of the three approaches for estimating soil Hg content in the study area, and it may have good predictive ability for hyperspectral estimation of the Hg content of soils in oasis cities in arid zones. Finally, the LTFD–RFR and ATFD–RFR models are more stable and have the best inversion effect, with the highest prediction ability (R² = 0.856, RMSE = 0.002 and MAE = 0.072) for the Hg content in soil in the study area. This study demonstrated the possibility of directly applying hyperspectral remote sensing approaches to estimating the Hg contents of urban soil in arid land. This method can provide an applicable tool and technical support for the hyperspectral estimation of the Hg content of urban soil in arid zones, and can facilitate the improvement of soil management practices that require rapid detection of Hg contamination of urban soils.

Author Contributions

Conceptualization, Z.Q., M.E., M.A., R.S. and H.M.; methodology, Z.Q. and M.E.; software, M.A.; validation, Z.Q. and M.E.; formal analysis, Z.Q. and M.E.; investigation, Z.Q.; resources, Z.Q. and M.E.; data curation, Z.Q.; writing – original draft preparation, Z.Q.; writing – review and editing, Z.Q., M.E., M.A., R.S. and H.M.; visualization, M.A. and R.S.; supervision, Z.Q. and M.E.; project administration, M.E.; funding acquisition, M.E. All authors have read and agreed to the published version of the manuscript.

Acknowledgements

The original version of this article was substantially improved thanks to the constructive comments by three anonymous reviewers.

Disclosure statement

The authors declare that they have no known competing financial interests or personal relationships that could have affected the work presented in this article.

Data availability statement

Data will be available upon request to the corresponding author.

Additional information

Funding

This research was funded by the National Natural Science Foundation of China (No. U2003301) and the Tianshan Talent Training Project of Xinjiang.

References

  • Ajay KP, Jayanta GK, Sameer SU. 2020. Fractional abundances study of macronutrients in soil using hyperspectral remote sensing. Geocarto Int. 37(2):1–20.
  • Chang RC, Chen Z, Wang DM, Guo K. 2022. Hyperspectral remote sensing inversion and monitoring of organic matter in black soil based on dynamic fitness inertia weight particle swarm optimization neural network. Remote Sens. 14(17):4316. doi: 10.3390/rs14174316.
  • Datta D, Paul M, Murshed M, Teng SW, Schmidtke L. 2023. Comparative analysis of machine and deep learning models for soil properties prediction from hyperspectral visual band. Environments. 10(5):77. doi: 10.3390/environments10050077.
  • Dong JH, Dai WT, Xu JR, Li SN. 2016. Spectral estimation model construction of heavy metals in mining reclamation areas. Int J Environ Res Public Health. 13(7):640. doi: 10.3390/ijerph13070640.
  • Idowu OJ, Van EHM, Abawi GS, Wolfe DW, Ball JI, Gugino BK, Moebius BN, Schindelbeck RR, Bilgili AV. 2008. Farmer-oriented assessment of soil quality using field, laboratory, and VNIR spectroscopy methods. Plant Soil. 307(1–2):243–253. doi: 10.1007/s11104-007-9521-0.
  • Jia P, Zhang J, He W, Yuan D, Hu Y, Zamanian K, Jia K, Zhao X. 2022. Inversion of different cultivated soil types’ salinity using hyperspectral data and machine learning. Remote Sens. 14(22):5639. doi: 10.3390/rs14225639.
  • Jia PP, Zhang JH, He W, Hu Y, Zeng R, Zamanian K, Jia KL, Zhao XN. 2022. Combination of hyperspectral and machine learning to invert soil electrical conductivity. Remote Sens. 14(11):2602. doi: 10.3390/rs14112602.
  • Levy JS, Johnson JTE. 2021. Remote soil moisture measurement from drone-borne reflectance spectroscopy: applications to hydroperiod measurement in desert playas. Remote Sens. 13(5):1035. doi: 10.3390/rs13051035.
  • Lin X, Su YC, Shang J, Sha J, Li X, Sun YY, Ji J, Jin B. 2019. Geographically weighted regression effects on soil zinc content hyperspectral modeling by applying the fractional-order differential. Remote Sens. 11(6):636. doi: 10.3390/rs11060636.
  • Liu K, Zhao D, Fang JY, Zhang X, Zhang QY, Li XK. 2017. Estimation of heavy metal contamination in soil using remote sensing spectroscopy and a statistical approach. J Indian Soc Remote Sens. 45(5):805–813. doi: 10.1007/s12524-016-0648-4.
  • Liu W, Li M, Zhang M, Long S, Guo Z, Wang H, Li W, Wang D, Hu Y, Wei Y, et al. 2020. Hyperspectral inversion of mercury in reed leaves under different levels of soil mercury contamination. Environ Sci Pollut Res Int. 27(18):22935–22945. doi: 10.1007/s11356-020-08807-z.
  • Mamattursun E, Anwar M, Ajigul M, Gulbanu H. 2018. A human health risk assessment of heavy metals in agricultural soils of Yanqi Basin, Silk Road Economic Belt, China. Hum Ecol Risk Assess. 24:1352–1366.
  • [MAPRC] Ministry of Agriculture of the People’s Republic of China. 2008. Beijing, China. GB/T 22105.2—2008. Soil quality—analysis of total mercury, arsenic, and lead contents—atomic fluorescence spectrometry—Part 2: analysis of total mercury contents in soils. Melbourne, Australia: MAPRC.
  • Michelle D, Onisimo M, Riyad I. 2011. Examining the utility of random forest and AISA Eagle hyperspectral image data to predict Pinus patula age in KwaZulu-Natal, South Africa. Geocarto Int. 26:275–289.
  • Nawar S, Mohamed ES, Essam ESS, Mohamed WS, Rebouh NY, Hammam AA. 2023. Estimation of key potentially toxic elements in arid agricultural soils using Vis-NIR spectroscopy with variable selection and PLSR algorithms. Front Environ Sci. 11:1222871. doi: 10.3389/fenvs.2023.1222871.
  • Nazupar S, Mamattursun E, Li XG, Wang YH. 2022. Spatial distribution, contamination levels, and health risks of trace elements in topsoil along an urbanization gradient in the city of Urumqi, China. Sustainability. 14(19):12646–12663. doi: 10.3390/su141912646.
  • Pirrone N, Cinnirella S, Feng X, Finkelman RB, Friedli HR, Leaner J, Mason R, Mukherjee AB, Stracher GB, Streets DG, et al. 2010. Global mercury emissions to the atmosphere from anthropogenic and natural sources. Atmos Chem Phys. 10(13):5951–5964. doi: 10.5194/acp-10-5951-2010.
  • Shekhar C, Negi HS, Srivastava S, Dwivedi M. 2018. Hyper-spectral data based investigations for snow wetness mapping. Geocarto Int. 34(6):664–687. doi: 10.1080/10106049.2018.1438528.
  • Shi TZ, Chen YY, Liu YL, Wu GF. 2014. Visible and near-infrared reflectance spectroscopy—An alternative for monitoring soil contamination by heavy metals. J Hazard Mater. 265:166–176. doi: 10.1016/j.jhazmat.2013.11.059.
  • Song L, Jian J, Tan DJ, Xie HB, Luo ZF, Gao B. 2015. Estimate of heavy metals in soil and streams using combined geochemistry and field spectroscopy in Wan-sheng mining area, Chongqing, China. Int J Appl Earth Obs Geoinf. 34:1–9. doi: 10.1016/j.jag.2014.06.013.
  • Tan K, Ma W, Chen L, Wang H, Du Q, Du P, Yan B, Liu R, Li H. 2021. Estimating the distribution trend of soil heavy metals in mining area from HyMap airborne hyperspectral imagery based on ensemble learning. J Hazard Mater. 401:123288. doi: 10.1016/j.jhazmat.2020.123288.
  • Tao C, Wang YJ, Zou B, Tu YL, Jiang XL. 2018. Assessment and analysis of heavy metal lead and zinc in soil with hyperspectral inversion model. Spectro Spectr Anal. 38:1850–1855.
  • Tian S, Wang S, Bai X, Zhou D, Lu Q, Wang M, Wang J. 2020. Hyperspectral estimation model of soil Pb content and its applicability in different soil types. Acta Geochim. 39(3):423–433. doi: 10.1007/s11631-019-00388-0.
  • Vohland M, Besold J, Hill J, Fründ HC. 2011. Comparing different multivariate calibration methods for the determination of soil organic carbon pools with visible to near infrared spectroscopy. Geoderma. 166(1):198–205. doi: 10.1016/j.geoderma.2011.08.001.
  • Wang F, Gao J, Zha Y. 2018. Hyperspectral sensing of heavy metals in soil and vegetation: feasibility and challenges. ISPRS J Photo Remote Sens. 136:73–84. doi: 10.1016/j.isprsjprs.2017.12.003.
  • Wang YY, Niu RQ, Guo L, Xiao YX, Ma HL, Zhao LR. 2023. Estimate of soil heavy metal in a mining region using PCC SVM RFECV AdaBoost combined with reflectance spectroscopy. Environ Geochem Health. 45(12):9103–9121. doi: 10.1007/s10653-023-01488-w.
  • Wei LF, Pu HC, Wang ZX, Yuan ZR, Yan XR, Cao LQ. 2020. Estimation of soil arsenic content with hyperspectral remote sensing. Sensors (Basel). 20(14):4056–4071. doi: 10.3390/s20144056.
  • Xu L, Chen Y, Feng A, Shi X, Feng Y, Yang Y, Wang Y, Wu Z, Zou Z, Ma W, et al. 2023. Study on detection method of microplastics in farmland soil based on hyperspectral imaging technology. Environ Res. 232:116389–116389. doi: 10.1016/j.envres.2023.116389.
  • Xue Y, Zou B, Wen Y, Tu Y, Xiong L. 2020. Hyperspectral inversion of chromium content in soil using support vector machine combined with lab and field spectra. Sustainability. 12(11):4441–4456. doi: 10.3390/su12114441.
  • Yang HF, Xu H, Zhong XN. 2022. Prediction of soil heavy metal concentrations in copper tailings area using hyperspectral reflectance. Environ Earth Sci. 81(6):183–193. doi: 10.1007/s12665-022-10307-x.
  • Yang PM, Hu J, Hu BF, Luo DF, Peng J. 2022. Estimating soil organic matter content in desert areas using in situ hyperspectral data and feature variable selection algorithms in southern Xinjiang, China. Remote Sens. 14(20):5221. doi: 10.3390/rs14205221.
  • Zhang SW, Shen Q, Nie CJ, Huang YF, Wang JH, Hu QQ, Ding XJ, Zhou Y, Chen YP. 2019. Hyperspectral inversion of heavy metal content in reclaimed soil from a mining wasteland based on different spectral transformation and modeling methods. Spectrochim Acta A Mol Biomol Spectrosc. 211:393–400. doi: 10.1016/j.saa.2018.12.032.
  • Zhang TY, Fu Q, Tian RQ, Zhang Y, Sun ZH. 2023. A spectrum contextual self-attention deep learning network for hyperspectral inversion of soil metals. Eco Indic. 152:110351. doi: 10.1016/j.ecolind.2023.110351.
  • Zhao L, Hu YM, Zhou W, Liu ZH, Pan YC, Shi Z, Wang L, Wang GX. 2018. Estimation methods for soil mercury content using hyperspectral remote sensing. Sustainability. 10(7):2474–2487. doi: 10.3390/su10072474.
  • Zhao LY, Tan K, Wang X, Ding JW, Liu ZX, Ma HL, Han B. 2022. Hyperspectral feature selection for SOM prediction using deep reinforcement learning and multiple subset evaluation strategies. Remote Sens. 15(1):127. doi: 10.3390/rs15010127.
  • Zhou M, Zou B, Tu Y, Feng H, He C, Ma X, Ning J. 2022. Spectral response feature bands extracted from near standard soil samples for estimating soil Pb in a mining area. Geocarto Int. 37(26):13248–13267. doi: 10.1080/10106049.2022.2076921.
  • Zhou W, Yang H, Xie L, Li H, Huang L, Zhao Y, Yue T. 2021. Hyperspectral inversion of soil heavy metals in Three-River Source Region based on random forest model. Catena. 202:105222. doi: 10.1016/j.catena.2021.105222.