Research Article

Research on downscaling method of the enhanced TROPOMI solar-induced chlorophyll fluorescence data

Article: 2354417 | Received 26 Dec 2023, Accepted 07 May 2024, Published online: 15 May 2024

Abstract

Solar-induced chlorophyll fluorescence (SIF) is energy released by plants during photosynthesis, and it characterizes vegetation growth significantly better than vegetation indices do. However, existing satellite-retrieved SIF data suffer from low spatial resolution and spatial discontinuity. To address these problems, this paper proposes a multi-parameter downscaling method that considers the structural and physiological characteristics of SIF. Multiple linear regression (MLR), random forest (RF), and convolutional neural network (CNN) models were used to construct downscaling models for the TROPOspheric Monitoring Instrument (TROPOMI) Enhanced SIF (eSIF) data. Under the assumption of spatial scale invariance, 500 m spatial resolution SIF data products for Henan Province from 2012 to 2021 were retrieved using Moderate-resolution Imaging Spectroradiometer (MODIS) data. Downscaling accuracy was assessed with the determination coefficient (R2), mean absolute error (MAE), and root mean squared error (RMSE). The experimental results show that the RF model outperforms the others, achieving R2, MAE, and RMSE values of 0.935, 0.041 mW/m2/nm2/sr, and 0.061 mW/m2/nm2/sr, respectively, which meets the downscaling requirements. The downscaled data products fit well with the eSIF and new Global 'OCO-2' SIF (GOSIF) data in both time and space. The correlation between the downscaled SIF data and winter wheat yield is significantly stronger than that of the GOSIF data products, and the downscaled data are strongly correlated with Gross Primary Productivity (GPP). By considering the structural and physiological characteristics of SIF, the RF algorithm can effectively retrieve reliable 500 m spatial resolution SIF data, which provides methodological support for applying SIF data at finer spatial scales.

1. Introduction

Solar-induced Chlorophyll Fluorescence (SIF) refers to the light signal emitted by chlorophyll molecules in plants within the 650–800 nm range after they are excited by absorbed sunlight (Baker 2008; Liu and Liu 2013; Frankenberg et al. 2014; Verrelst et al. 2015). SIF signals, which account for only 1%–5% of the solar energy reflected by vegetation, are relatively weak (Grace et al. 2007; Meroni et al. 2009; Zarco-Tejada et al. 2013). As the spectral resolution of satellite sensors has improved, it has become possible to retrieve SIF from atmospheric absorption dark lines (Plascyk 1975; Plascyk and Gabriel 1975; Zhang et al. 2009; Wang et al. 2012). SIF captures the energy emitted by vegetation and effectively describes the photosynthetic activity of the plant. Compared with standard reflectance data and vegetation index data, SIF can provide a more accurate representation of the plant's actual condition (Sun et al. 2009, 2021b; Damm et al. 2010; Wood et al. 2017; Chen et al. 2019).

Currently, it is difficult to apply satellite-retrieved SIF data at the provincial spatial scale. One challenge is the low spatial resolution of the data (Wood et al. 2017), while another is that products with high spatial resolution are spatially discontinuous (Yang et al. 2022). Downscaling research on SIF data can help solve these issues (Ji et al. 2019; Wu et al. 2022; Gensheimer et al. 2022). SIF downscaling primarily involves three elements: low spatial resolution SIF image data, parameter selection, and the establishment of a spatial relationship. It relies on the theory that spatial scale relationships are invariant (Hu et al. 2020; Wen et al. 2021; Li et al. 2021; Sun et al. 2021a) and uses high spatial resolution parameter data to retrieve SIF data (Gentine and Alemohammad 2018). Gentine and Alemohammad downscaled the 0.5° spatial resolution SIF data retrieved from the Global Ozone Monitoring Experiment 2 (GOME-2) to a higher resolution of 0.05°. The downscaled data set was used to generate a set of global RSIF data products, which were found to correlate strongly with Gross Primary Production (GPP) (Gentine and Alemohammad 2018). Ma et al. used the red, near-infrared, blue, and green band reflectance data of the Moderate-resolution Imaging Spectroradiometer (MODIS) corrected by the Bi-directional Reflectance Distribution Function (BRDF), the Normalized Difference Vegetation Index (NDVI), the cosine of the sun's altitude angle, and the air temperature as SIF parameters, and reconstructed the 2 km × 2 km strip SIF data at 0.05° spatial resolution with the random forest (RF) algorithm to resolve the spatial discontinuity of the data (Ma et al. 2020). Similarly, the GOME-2 SIF data were downscaled to a spatial resolution of 0.05° using the same approach (Ma et al. 2022).
Li and Xiao selected the Enhanced Vegetation Index (EVI), Photosynthetically Active Radiation (PAR), vapor pressure deficit (VPD), and air temperature as characteristic variables, and reconstructed the discontinuous Orbiting Carbon Observatory 2 (OCO-2) SIF data, which have a spatial resolution of 1.3 km × 2.25 km, into a new global 'OCO-2' SIF (GOSIF) data product with a spatial resolution of 0.05° (Li and Xiao 2019). Zhang et al. used the GOSIF data product for Henan Province and employed a deep learning algorithm to retrieve 1 km spatial resolution SIF data. Their findings indicated that the downscaled SIF data outperformed vegetation indices in monitoring agricultural drought (Zhang et al. 2020). Yu et al. used a downscaling approach to reconstruct spatially continuous 0.05° resolution OCO-2 SIF data from the seven-band MODIS reflectance data, and found that SIF data have significant advantages in monitoring vegetation growth conditions (Yu et al. 2019). Liu et al. studied winter wheat yield estimation using downscaled 0.05° spatial resolution SIF data from GOME-2, and found that these data showed significant advantages in various yield estimation models (Liu et al. 2022b).

The reconstructed 0.05° spatial resolution SIF data products have shown promising results at large scales (e.g. the national level). However, they struggle to meet the requirements of smaller scales (e.g. the provincial level). This study therefore takes Henan Province, the primary winter wheat production region in China, as the study area and introduces a downscaling method that considers multiple parameters. The TROPOspheric Monitoring Instrument (TROPOMI) Enhanced SIF (eSIF) data are used as the primary data source. To downscale the SIF data, structural and physiological characteristics are selected from MODIS products with a spatial resolution of 500 m. The downscaling model is constructed using multiple linear regression (MLR), RF regression, and convolutional neural network (CNN) algorithms. With this method, 500 m spatial resolution SIF data for Henan Province from 2012 to 2021 can be retrieved, providing valuable support for the application of SIF data.

2. Materials and methods

2.1. Materials

2.1.1. Study area

Henan Province is situated between 31°23′N–36°22′N and 110°21′E–116°39′E, with terrain that slopes from west to east. The study area primarily encompasses plains, basins, mountains, hills, and water bodies, as illustrated in Figure 1. With a total area of 167,000 square kilometers, Henan Province is an important agricultural hub in China. The main crops cultivated in this region are winter wheat and summer corn.

Figure 1. Study area.

2.1.2. Image data

In 2017, the European Space Agency launched the Sentinel-5p satellite, which carries TROPOMI. TROPOMI has a 17-day revisit cycle and can capture daily images of the entire global surface with a spatial resolution of 3.5 km × 7 km (Koehler et al. 2018). This enables the generation of data products with higher spatial and temporal resolution for SIF application research. Liu et al. analyzed the TROPOMI SIF data product in the 743–758 nm range and, while ensuring the quality of the SIF data, reconstructed an 8-day composite eSIF data product with 0.05° spatial resolution (Liu et al. 2022a). The reconstructed data product retains the structural and physiological characteristics of the SIF data. The eSIF data can be downloaded from the scientific repository (https://zenodo.org/record/6115416).

SIF is primarily determined by the Absorbed Photosynthetically Active Radiation (APAR) of vegetation, the fluorescence quantum yield (SIFyield), and the canopy escape probability of SIF photons (fesc) (Yang et al. 2015; Li et al. 2018; Ma et al. 2020). APAR and fesc represent structural characteristics, while SIFyield represents physiological characteristics (Zhang et al. 2019). The relationship is expressed in Equation (1).

$$\mathrm{SIF} = \mathrm{PAR} \times \mathrm{FPAR} \times \mathrm{SIF}_{yield} \times f_{esc} \tag{1}$$

In Equation (1), PAR is the photosynthetically active radiation reaching the vegetation canopy; the Fraction of Photosynthetically Active Radiation (FPAR) is the fraction of PAR absorbed by vegetation; SIFyield is the fluorescence quantum yield; and fesc is a structural factor that determines the probability that fluorescence emitted by vegetation leaves escapes from the canopy, and is mainly affected by canopy structure and chlorophyll content.

Vegetation indices and meteorological data related to chlorophyll concentration can better characterize SIF (Frankenberg et al. 2011; Guanter et al. 2012). Vegetation indices constructed from reflectance data therefore have the potential to capture vegetation information and to support the spatial downscaling of SIF data (Duveiller and Cescatti 2016; Gentine and Alemohammad 2018).

In this study, reflectance data and vegetation index data are used to characterize the structural characteristics of SIF data. We selected the red band (B1), near-infrared band (B2), blue band (B3), and green band (B4) from the BRDF-corrected MCD43A4 data and constructed four indices: the Soil Adjusted Vegetation Index (SAVI), NDVI, EVI, and Near-infrared Reflectance of Vegetation (NIRv). These reflectance data and vegetation indices are widely used in SIF retrieval (Zhang et al. 2018; Li and Xiao 2019; Yu et al. 2019; Ma et al. 2020; Gensheimer et al. 2022; Liu et al. 2022a). The indices are summarized in Table 1. In addition, MOD15A2 Leaf Area Index (LAI) and FPAR data were chosen to strengthen the representation of structural characteristics. MOD11A2 Land Surface Temperature (LST) data and MOD16A2 Evapotranspiration (ET) data were chosen to represent the physiological characteristics of SIF data. The MODIS data can be downloaded from the National Aeronautics and Space Administration (NASA) official website (https://ladsweb.modaps.eosdis.nasa.gov/).

Table 1. Constructed vegetation indices.
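The body of Table 1 is not reproduced in this text, so the following is only a minimal sketch of how the four indices could be computed from the MCD43A4 bands. It assumes the commonly used definitions of NDVI, EVI, SAVI, and NIRv; the EVI coefficients (2.5, 6, 7.5, 1), the SAVI soil factor of 0.5, and the function name `vegetation_indices` are assumptions rather than values taken from the paper. The green band (B4) does not enter any of these four formulas.

```python
import numpy as np

def vegetation_indices(red, nir, blue, soil_factor=0.5):
    """Compute NDVI, EVI, SAVI and NIRv from reflectance on a 0-1 scale.

    red, nir, blue correspond to MCD43A4 B1, B2 and B3.
    The coefficients are the commonly used defaults (an assumption here).
    """
    red, nir, blue = (np.asarray(b, dtype=float) for b in (red, nir, blue))
    ndvi = (nir - red) / (nir + red)
    evi = 2.5 * (nir - red) / (nir + 6.0 * red - 7.5 * blue + 1.0)
    savi = (1.0 + soil_factor) * (nir - red) / (nir + red + soil_factor)
    nirv = ndvi * nir  # NIRv is NDVI scaled by near-infrared reflectance
    return {"NDVI": ndvi, "EVI": evi, "SAVI": savi, "NIRv": nirv}

# Example with a single densely vegetated pixel (illustrative values).
indices = vegetation_indices(red=[0.05], nir=[0.45], blue=[0.03])
```

The function accepts whole rasters as arrays, so the same call works for a single pixel or for a full 500 m tile.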

GOSIF data have wide temporal coverage and a spatial resolution of 0.05°, and are offered as 8-day, monthly, and annual composite data products. The GOSIF data products were chosen to validate the downscaling results at the county level and to analyze the correlation with winter wheat yield. The data can be downloaded from the official website (https://globalecology.unh.edu/). The GPP data from the MOD17A2 dataset, which have a spatial resolution of 500 m, were used to assess the spatial reliability of the downscaled SIF data. The MCD12Q1 land use type data were selected to extract agricultural land.

2.1.3. Yield data

To validate the correlation between the downscaled SIF data and yield, winter wheat yield data from 2013 to 2021 were selected. The historical yield data for Henan Province can be downloaded from the official website of the Henan Provincial Bureau of Statistics (https://tjj.henan.gov.cn/).

2.2. Methods

2.2.1. Data preprocessing

Before constructing the SIF downscaling model, the selected MCD43A4 B1, B2, B3, and B4 data, the constructed SAVI, NDVI, EVI, and NIRv indices, the MOD15A2 LAI and FPAR data, the MOD11A2 LST data, and the MOD16A2 ET data must be resampled to the same spatial resolution as the eSIF data and composited into monthly means. ArcGIS 10.8 was used for preprocessing steps such as reprojection, clipping, and resampling.
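The paper performs these steps in ArcGIS; the sketch below only illustrates the two array operations involved, monthly mean compositing and nearest-neighbour resampling to a coarser grid. It assumes the composites are already co-registered NumPy arrays with NaN for missing pixels and that the coarse grid is an integer multiple of the fine grid, which is a simplification (500 m to 0.05° is not an integer ratio). The function names are hypothetical.

```python
import numpy as np

def monthly_mean(composites, month_of_composite, month):
    """Average all composites that fall in `month`, ignoring NaN pixels.

    composites: array of shape (n_composites, rows, cols)
    month_of_composite: month number assigned to each composite
    """
    composites = np.asarray(composites, dtype=float)
    selected = np.asarray(month_of_composite) == month
    return np.nanmean(composites[selected], axis=0)

def resample_nearest(fine, factor):
    """Nearest-neighbour resampling to a grid `factor` times coarser.

    Each coarse pixel takes the value of the fine pixel closest to its centre.
    """
    fine = np.asarray(fine)
    offset = factor // 2
    return fine[offset::factor, offset::factor]
```

Because nearest-neighbour resampling keeps one fine pixel per coarse cell instead of averaging, the coarse value can be unrepresentative of the cell; an area-weighted mean would be an alternative design choice.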

2.2.2. Downscaling models construction

Statistical models are currently a popular downscaling approach. The method identifies characteristic variables that correlate well with the target at the low spatial resolution scale and establishes the spatial correspondence between them. Assuming that this spatial scale relationship is invariant, the data product is then reconstructed from the available high spatial resolution data. In the experiment, three methods (MLR, RF, and CNN) were selected to construct the downscaling model.

2.2.2.1. Multiple linear regression

MLR is a statistical technique used to determine the relationship between multiple predictor variables and a dependent variable. It expresses this relationship through a mathematical formula (Wang et al. 2022).

$$Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_n x_n + \omega \quad (i = 1, 2, \ldots, n) \tag{2}$$

In Equation (2), Y represents the dependent variable, β0 is the intercept, βi (i = 1, 2, …, n) are the regression coefficients, xi are the predictor variables, ω represents the random error, and n is the number of predictor variables.
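As an illustration of Equation (2), the coefficients can be estimated by ordinary least squares. This is a minimal sketch using NumPy's `np.linalg.lstsq`; the paper does not state which solver or library it used, and the function names `fit_mlr` and `predict_mlr` are hypothetical.

```python
import numpy as np

def fit_mlr(X, y):
    """Estimate [beta_0, beta_1, ..., beta_n] by ordinary least squares.

    X: (n_samples, n_predictors) characteristic variables
    y: (n_samples,) SIF values
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(X)), X])  # column of ones for beta_0
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

def predict_mlr(coef, X):
    """Evaluate Equation (2) without the error term."""
    X = np.asarray(X, dtype=float)
    return coef[0] + X @ coef[1:]
```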

2.2.2.2. Random forest regression

RF uses the bagging technique to build multiple decision trees, and the final output is obtained by averaging the predictions of these trees (Breiman 2001). The RF model can rank characteristic variables using out-of-bag data, which provides insight into the importance of these variables (Fang et al. 2018). The RF model is illustrated in Figure 2.
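A minimal sketch of RF regression with a variable ranking, assuming scikit-learn's `RandomForestRegressor`; the paper only states that the models were built in Python, so the library, the number of trees, and the helper name `fit_rf` are assumptions. Note that `feature_importances_` in scikit-learn is impurity-based rather than computed from out-of-bag data; `sklearn.inspection.permutation_importance` on held-out data would be a closer analogue of the out-of-bag ranking described above.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def fit_rf(X, y, names, n_trees=100, seed=0):
    """Fit a bagged ensemble of regression trees and rank the variables.

    Returns the fitted model and a list of (name, importance) pairs,
    sorted from most to least important.
    """
    model = RandomForestRegressor(
        n_estimators=n_trees,
        oob_score=True,      # also report R2 on out-of-bag samples
        random_state=seed,
    )
    model.fit(X, y)
    ranking = sorted(
        zip(names, model.feature_importances_),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return model, ranking
```

After fitting, `model.oob_score_` gives an R2 estimate from the samples each tree did not see, which is a quick check before the held-out validation.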

Figure 2. Random forest model.

2.2.2.3. Convolutional neural network

CNN is a deep learning algorithm that consists of multiple convolutional layers, pooling layers, and fully connected layers (Shen and Yuan 2020). The convolutional layers extract data characteristics by applying convolutional kernels of various sizes to perform local characteristic extraction on the input data. The extracted characteristics are then mapped through activation functions, and the pooling layers reduce their dimensionality. Finally, the fully connected layers integrate the characteristics and produce the prediction.
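The paper does not describe the network architecture, so the following is only a sketch of the forward pass through the three layer types named above (convolution, pooling, fully connected), written in plain NumPy. It assumes a one-dimensional convolution over the vector of characteristic variables for one pixel; the layer sizes, the `params` dictionary layout, and all function names are hypothetical, and training (backpropagation) is not shown.

```python
import numpy as np

def conv1d(x, kernels, bias):
    """Valid 1-D convolution. x: (length,), kernels: (n_kernels, width)."""
    width = kernels.shape[1]
    windows = np.stack([x[i:i + width] for i in range(len(x) - width + 1)])
    return windows @ kernels.T + bias  # shape (positions, n_kernels)

def relu(z):
    """Activation function applied to the convolution output."""
    return np.maximum(z, 0.0)

def max_pool(z, size):
    """Non-overlapping max pooling along the position axis."""
    usable = (z.shape[0] // size) * size
    return z[:usable].reshape(-1, size, z.shape[1]).max(axis=1)

def cnn_forward(x, params):
    """Convolution -> activation -> pooling -> fully connected output."""
    hidden = relu(conv1d(x, params["kernels"], params["conv_bias"]))
    hidden = max_pool(hidden, size=2)
    return float(hidden.ravel() @ params["dense_w"] + params["dense_b"])
```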

2.2.3. Downscaling process

When establishing the SIF downscaling model, the characteristic variable data are extracted at the pixel locations of the eSIF data product. To ensure the accuracy of the model, only pixels for which all characteristic variables are available are selected as sample points. For the retrieval of high spatial resolution SIF data, the LST data need to be resampled to 500 m spatial resolution to meet the downscaling target. The experiment used 24 months of the eSIF data product, from October 2018 to September 2020, for model construction. The specific downscaling process is as follows:

  1. Sample construction. The downscaling process is shown in Figure 3. First, a mechanism analysis of SIF is carried out, and the characteristic variables are selected and resampled to the same spatial resolution as the eSIF data. Both types of data are composited into monthly means, and the characteristic variable values are extracted at the SIF data points to construct a downscaling sample data set at the 0.05° spatial scale.

  2. Model construction. Three statistical regression models (MLR, RF, and CNN) are constructed in Python and used to compare the accuracy of SIF downscaling.

  3. Comparison of model accuracy. The three models are trained on the same sample data, and their accuracy on the test data set is evaluated with the accuracy assessment indicators to determine the SIF downscaling model.

  4. Time-series SIF retrieval. The monthly 500 m spatial resolution characteristic variables from 2012 to 2021 are input into the downscaling model, and 500 m spatial resolution SIF data are retrieved based on the invariant theory of spatial scale relationship.

  5. Time-series verification. The quality of the downscaled SIF data products is evaluated using eSIF data, GOSIF data, winter wheat yield data, and GPP data.
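Steps 1, 3, and 4 above can be sketched as three small array operations: keeping only pixels with complete data, splitting the samples 70/30, and applying a model fitted at the coarse scale to the fine-resolution variables. This is a minimal sketch under the assumption that all rasters are co-registered NumPy arrays with NaN marking missing pixels; the function names and the `predict` callable (any fitted model's prediction function) are hypothetical.

```python
import numpy as np

def build_samples(features, target):
    """Keep pixels where the target and every variable are valid.

    features: (n_vars, rows, cols) coarse-scale characteristic variables
    target:   (rows, cols) coarse-scale SIF (for example eSIF)
    """
    X = features.reshape(features.shape[0], -1).T
    y = target.ravel()
    valid = np.isfinite(y) & np.isfinite(X).all(axis=1)
    return X[valid], y[valid]

def split_train_test(X, y, train_fraction=0.7, seed=0):
    """Random split into training and validation samples."""
    order = np.random.default_rng(seed).permutation(len(y))
    cut = int(round(train_fraction * len(y)))
    train, test = order[:cut], order[cut:]
    return X[train], y[train], X[test], y[test]

def apply_to_fine_grid(predict, fine_features):
    """Apply a coarse-scale model to fine-resolution variables.

    This is where the scale-invariance assumption enters: the relationship
    learned at 0.05 degrees is reused unchanged at 500 m.
    Pixels with any missing input stay NaN.
    """
    n_vars, rows, cols = fine_features.shape
    X = fine_features.reshape(n_vars, -1).T
    valid = np.isfinite(X).all(axis=1)
    out = np.full(rows * cols, np.nan)
    if valid.any():
        out[valid] = predict(X[valid])
    return out.reshape(rows, cols)
```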

Figure 3. Downscaling method flow.

2.2.4. Accuracy assessment indicators

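To evaluate the accuracy of the downscaling results for SIF, we selected the determination coefficient (R2), mean absolute error (MAE), and root mean squared error (RMSE) as the accuracy assessment indicators.

$$R^2 = 1 - \frac{\sum_{i=1}^{n} (\mathrm{SIF}_t - \mathrm{SIF}_p)^2}{\sum_{i=1}^{n} (\mathrm{SIF}_t - \overline{\mathrm{SIF}})^2} \tag{3}$$

$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| \mathrm{SIF}_t - \mathrm{SIF}_p \right| \tag{4}$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (\mathrm{SIF}_t - \mathrm{SIF}_p)^2} \tag{5}$$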

In Equations (3)–(5), SIFt represents the true value of SIF, SIFp represents the predicted value of SIF, $\overline{\mathrm{SIF}}$ represents the mean value of SIF, and n represents the total number of samples.
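The three indicators can be computed directly from paired true and predicted values. The sketch below follows Equations (3)–(5) and assumes that the mean in Equation (3) is taken over the true values, which is the usual convention; the function name `accuracy_metrics` is hypothetical.

```python
import numpy as np

def accuracy_metrics(sif_true, sif_pred):
    """Return (R2, MAE, RMSE) for paired true and predicted SIF values."""
    t = np.asarray(sif_true, dtype=float)
    p = np.asarray(sif_pred, dtype=float)
    residual = t - p
    ss_res = np.sum(residual ** 2)
    ss_tot = np.sum((t - t.mean()) ** 2)  # mean of the true values (assumed)
    r2 = 1.0 - ss_res / ss_tot
    mae = np.mean(np.abs(residual))
    rmse = np.sqrt(np.mean(residual ** 2))
    return r2, mae, rmse
```

MAE and RMSE carry the units of SIF, while R2 is dimensionless, so the three values are reported together rather than compared with each other.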

3. Results

3.1. Downscaling model accuracy analysis

The experiment used eSIF data products from October 2018 to September 2020, composited into monthly means. For consistency, the structural and physiological characteristic variables were resampled to match the spatial resolution of eSIF. A total of 127,585 sample points were selected. Of these, 70% were used to train the MLR, RF, and CNN models, and the remaining 30% were used for model validation. The most suitable downscaling model was determined after optimizing the parameters of each model. The accuracy of the MLR, RF, and CNN downscaling models is shown in Figure 4.

Figure 4. Precision comparison of three downscaling models.

Figure 4 shows that all three downscaling models, trained on the same sample data, achieve good accuracy. The MLR model achieves R2 = 0.897, MAE = 0.057 mW/m2/nm2/sr, and RMSE = 0.077 mW/m2/nm2/sr. The RF model achieves R2 = 0.935, MAE = 0.041 mW/m2/nm2/sr, and RMSE = 0.061 mW/m2/nm2/sr, and the CNN model achieves R2 = 0.913, MAE = 0.051 mW/m2/nm2/sr, and RMSE = 0.071 mW/m2/nm2/sr. Figure 4 also shows that the predictions of the MLR model are more scattered than those of the RF and CNN models. The fitting line of the RF model is closest to the 1:1 line, and the RF model also outperforms the CNN model. The RF model is therefore selected as the SIF downscaling model in this study. The RF ranking of the characteristic variables is shown in Table 2.

Table 2. Importance of random forest characteristic variables.

Table 2 shows that NIRv has the highest importance, with a score of 0.64. B1 and EVI also have relatively high scores. ET and LST, which were selected to represent the physiological characteristics of SIF, score higher than the commonly used vegetation indices, indicating a strong correlation between meteorological data and SIF data. The other structural indices also contribute to the SIF downscaling process, so this paper retains these structural characteristics.

3.2. Downscaling results

The selected 500 m spatial resolution characteristic variables were input into the downscaling model to retrieve a monthly 500 m spatial resolution SIF data product for Henan Province from 2012 to 2021, denoted SIF500. The 2021 eSIF and downscaled SIF500 data products are compared in Figures 5 and 6.

Figure 5. Monthly eSIF in 2021.

Figure 6. Monthly SIF500 in 2021.

Figures 5 and 6 show that the downscaled SIF data have a higher spatial resolution than the eSIF data. The main vegetation in Henan Province in spring is winter wheat, and high SIF values are concentrated mainly on agricultural land. Figure 6 shows that the downscaled SIF in March increases significantly compared with February, especially in the southeastern region of Henan Province. The main reason is that temperatures in southern Henan are higher, so the winter wheat regreening period begins earlier there than in the north. In the eSIF data, because of the coarse resolution and the monthly compositing, SIF in March shows little change relative to February. After April, the SIF value of mountain vegetation in southwestern Henan Province increases significantly, mainly because the mountain vegetation begins to turn green. The comparison shows that the downscaled 500 m spatial resolution SIF data better reflect small changes in the SIF value of mountain vegetation. The SIF value of agricultural land in Henan Province gradually weakens after peaking in April and reaches its minimum in June, when winter wheat is harvested and summer corn is sown, leaving the surface of agricultural land bare. After July, the overall SIF value in Henan Province is higher, and it decreases significantly in October, which is consistent with the growth cycle of summer corn. In July, Henan Province was affected by heavy precipitation, and summer corn, the main crop in northern Henan, suffered serious flooding.
The August SIF data for agricultural land show no significant difference between the northern and southern regions in the eSIF data with 0.05° spatial resolution, whereas the downscaled data show that the SIF value in northern Henan is lower than that in the south, indicating that the downscaled SIF can characterize vegetation change in detail. In winter, the downscaled SIF data also describe local changes more accurately than the eSIF data. The time series comparison of SIF data products in 2021 shows that the RF-based method, combined with structural and physiological characteristics, can retrieve SIF data with 500 m spatial resolution well and provides methodological support for SIF application research in Henan Province.

3.3. Temporal and spatial validation at the same scale

To assess the accuracy of the downscaling results, we resampled the 2021 downscaled SIF data to match the spatial resolution of the eSIF data. The spatial correlation between the two datasets is illustrated in Figure 7.

Figure 7. Validation at 0.05° spatial scale.

Figure 7 shows that at the 0.05° spatial scale, the resampled SIF data fit the eSIF data well overall, with R2 > 0.6. From July to October, the scatter plots are relatively concentrated, but some outlying data points make the fit poorer than in other months. There are three likely reasons. First, vegetation in Henan Province grows vigorously in these four months and the value range of SIF is large, so resampling the downscaled 500 m spatial resolution SIF data to 0.05° spatial resolution introduces some error. Second, the eSIF pixels at 0.05° spatial resolution may not align exactly with the pixels of the resampled downscaled SIF data, which causes a deviation when the pixel values of the two images are extracted and increases the error between the datasets. Third, natural disasters reduced the fit to some extent. July–October is a critical period for the growth of summer corn, and northern Henan Province suffered severe floods in July, which limited summer corn growth. The downscaled SIF data capture this change well, whereas the eSIF data cannot capture it in detail because of their coarse resolution, which is also an important factor behind the poorer fit from July to October.

3.4. Temporal validation of downscaling results

Li and Xiao selected EVI, PAR, VPD, and air temperature as characteristic variables to reconstruct spatially continuous OCO-2 SIF data, a dataset known as GOSIF. GOSIF has a spatial resolution of 0.05° and long temporal coverage, so it is suitable for verifying the downscaling results. Therefore, based on the monthly GOSIF data products, the mean values of the retrieved long-term SIF data from 2012 to 2021 were verified at the county level. The fitting results of the mean SIF values for each month are presented in Figure 8.

Figure 8. Verification of the mean value of county units.

Figure 8 shows that the downscaled 500 m spatial resolution SIF data and the 0.05° spatial resolution GOSIF data correlate well at the county level, with R2 greater than 0.7. The correlation remains good in winter, indicating that the downscaled 500 m spatial resolution SIF data are reliable over the time series while offering a significantly higher spatial resolution than the GOSIF data. Figure 8 also shows that the fit between the downscaled SIF data and the GOSIF data at the county level is slightly poorer from July to October than in other months. This may be because vegetation SIF changes significantly in this period, so the resampling of the downscaled SIF data introduces larger errors.
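County-level verification requires averaging each SIF raster over the pixels that belong to each county. The paper does not say how this was computed; the following is a minimal sketch of such a zonal mean using `np.bincount`, assuming a hypothetical integer raster `zones` on the same grid as the SIF values in which each pixel holds a county index (0 to n_zones - 1) and negative values mark pixels outside the province.

```python
import numpy as np

def zonal_mean(values, zones, n_zones):
    """Mean of `values` within each zone, ignoring NaN and unassigned pixels.

    Zones without any valid pixel are returned as NaN.
    """
    v = np.asarray(values, dtype=float).ravel()
    z = np.asarray(zones, dtype=int).ravel()
    ok = np.isfinite(v) & (z >= 0)
    sums = np.bincount(z[ok], weights=v[ok], minlength=n_zones)
    counts = np.bincount(z[ok], minlength=n_zones)
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.where(counts > 0, sums / counts, np.nan)
```

Applying the function to the downscaled raster and to the GOSIF raster (each with a zone raster on its own grid) yields two vectors of county means that can then be compared month by month.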

3.5. Potential of downscaling results in winter wheat yield estimation

SIF data have shown great potential for agricultural applications. Wang et al. studied winter wheat yield estimation using SIF data and found that SIF data with a spatial resolution of 0.05° performed significantly better than EVI and meteorological data with a spatial resolution of 500 m (Wang et al. 2022). Similarly, Zhang et al. downscaled GOSIF data and found that the Temperature Fluorescence Dryness Index (TFDI) constructed from the downscaled SIF data correlated strongly with corn yield (Zhang et al. 2020). Hence, analyzing the correlation between downscaled SIF and winter wheat yield is one way to evaluate SIF data quality.

Using the GOSIF data and the downscaled SIF data for the winter wheat growth period from 2013 to 2021, the correlation with winter wheat yield was verified at the county level. The correlations between SIF and winter wheat yield are shown in Figures 9 and 10.

Figure 9. Correlation between GOSIF and winter wheat yield from 2013 to 2021.

Figure 10. Correlation between downscaling SIF and winter wheat yield from 2013 to 2021.

Figures 9 and 10 show that, from 2013 to 2021, the correlation between the downscaled SIF data and winter wheat yield is notably stronger than that between the GOSIF data and winter wheat yield. This indicates that the downscaled 500 m spatial resolution SIF data have considerable potential for estimating winter wheat yield.

3.6. Evaluation of SIF downscaling results based on GPP

Based on the concept of the light use efficiency model, GPP can be represented as follows (Liu et al. 2022a).

$$\mathrm{GPP} = \mathrm{PAR} \times \mathrm{FPAR} \times \mathrm{LUE} \tag{6}$$

In Equation (6), PAR represents the photosynthetically active radiation reaching the vegetation canopy, FPAR is the fraction of photosynthetically active radiation absorbed by vegetation, and LUE is the light use efficiency.

From Equations (1) and (6), it can be seen that SIF and GPP share the same driving factors. Therefore, the relationship between SIF and GPP can be expressed as follows.

$$\mathrm{GPP} = \mathrm{SIF} \times \frac{\mathrm{LUE}}{\mathrm{SIF}_{yield} \times f_{esc}} \tag{7}$$

Equation (7) describes the coupling relationship between SIF and GPP, indicating a close connection between SIF data and vegetation photosynthesis.
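The step from Equations (1) and (6) to Equation (7) is a substitution of the shared term PAR × FPAR. It is written out here for clarity, using the same symbols as the equations above:

```latex
\begin{align*}
\mathrm{SIF} &= \mathrm{PAR} \times \mathrm{FPAR} \times \mathrm{SIF}_{yield} \times f_{esc}
  && \text{(Equation 1)} \\
\mathrm{PAR} \times \mathrm{FPAR} &= \frac{\mathrm{SIF}}{\mathrm{SIF}_{yield} \times f_{esc}}
  && \text{(solve for the shared term)} \\
\mathrm{GPP} &= \mathrm{PAR} \times \mathrm{FPAR} \times \mathrm{LUE}
  = \mathrm{SIF} \times \frac{\mathrm{LUE}}{\mathrm{SIF}_{yield} \times f_{esc}}
  && \text{(substitute into Equation 6)}
\end{align*}
```

The ratio LUE / (SIFyield × fesc) therefore acts as the conversion factor between SIF and GPP, and the two quantities are proportional only to the extent that this ratio stays stable.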

To verify the correlation between the downscaled 500 m spatial resolution SIF data and GPP, the 500 m spatial resolution GPP product from the MOD17A2 dataset was chosen for validation, using the year 2021 as an example. The validation results are shown in Figure 11.

Figure 11. Fitting of SIF and GPP.

The downscaled SIF data in Figure 11 are consistent with GPP in every month. This finding aligns with the research of Liu et al., who also found that SIF can effectively track GPP (Liu et al. 2022a). The validation further confirms the reliability of the downscaled 500 m spatial resolution SIF data. From January to June, the relationship between SIF and GPP is strongest, with all R2 values above 0.7. From July to September, however, the relationship becomes more scattered and the correlation weakens, mainly because the summer SIF signal is significantly influenced by various factors, including vegetation types, growth conditions, and climate change.

4. Discussion

4.1. Quality assessment of SIF downscaling model

At present, most research on improving the spatial resolution of SIF products uses machine learning algorithms that fuse multi-source data (Li and Xiao 2019; Yu et al. 2019; Ma et al. 2020, 2022). In this study, MLR, RF, and CNN were used for downscaling, and the verification results show that RF is superior to MLR and CNN. The MLR model cannot capture nonlinear relationships in the data, and the verification also shows that its predictions are more dispersed than those of the other two models. CNN can extract more characteristic information, but many parameters must be adjusted during training, training takes a long time, and the final prediction is not optimal (Lu et al. 2022), which may be caused by insufficient data samples. The RF model runs fast and is robust. Compared with the other two models, it can capture nonlinear relationships in the data during training without complex parameter tuning or data preprocessing. Therefore, this paper chooses RF for the downscaling research.

To verify the accuracy of the RF model in the downscaling process and the invariant theory of spatial scale relationship, the characteristic variable data were obtained at the 0.05° spatial scale, and the RF model was used to retrieve the 2021 monthly SIF data product, which was then spatially verified against eSIF. The verification results are shown in Figure 12.

Figure 12. The accuracy of the RF model in the SIF downscaling process and the theoretical verification of the invariant spatial scale relationship.

Figure 12 shows that the SIF data predicted at the 0.05° spatial scale correlate well with the eSIF data, indicating that the RF model is robust. A comparison of Figures 7 and 12 shows that the validation results of the SIF data predicted at the 0.05° spatial scale are similar to those of the 500 m downscaled SIF data resampled to 0.05°, which supports the feasibility of downscaling based on the invariant theory of spatial scale relationship.

4.2. Spatial scale difference analysis

During model construction, the nearest pixel sampling method was used to resample the selected 500 m characteristic variable data to the same spatial resolution as eSIF. This step narrows the range of the characteristic variables. Because the input data are monthly mean composites, they are also smoothed to a certain extent. This smoothing weakens the influence of abnormal data, but it also reduces the accuracy of the model. In the experiment, 500 m characteristic variable data products were selected, and the downscaling model was constructed based on the theory of spatial scale invariance. Compared with data at 0.05°, the 500 m data contain more detailed variation. A downscaling model based on this theory may therefore produce some extreme values when inverting 500 m SIF, but in this experiment the inverted 500 m SIF data products had few outliers. The main reason is that two years of data samples were used for training, so the model generalizes well, which reduces the occurrence of outliers to a certain extent.
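The effect of the resampling step on the value range can be illustrated with a small sketch. It assumes that nearest pixel sampling keeps one fine pixel (here the one nearest the centre) per coarse cell, and it contrasts this with a block mean. The grid values, the factor of 2, and both function names are hypothetical.

```python
import numpy as np


def nearest_pixel_sample(fine, factor):
    """Keep the fine pixel nearest the centre of each factor x factor cell."""
    offset = factor // 2
    return fine[offset::factor, offset::factor]


def block_mean(fine, factor):
    """Average each non-overlapping factor x factor cell."""
    rows, cols = fine.shape
    blocks = fine.reshape(rows // factor, factor, cols // factor, factor)
    return blocks.mean(axis=(1, 3))


# Hypothetical 4 x 4 fine grid with one extreme value in the top-left cell.
fine = np.arange(16, dtype=float).reshape(4, 4)
fine[0, 0] = 100.0

nearest = nearest_pixel_sample(fine, 2)  # the extreme pixel is not sampled
averaged = block_mean(fine, 2)           # the extreme is diluted, not dropped

print(nearest)
print(averaged)
```

In this toy case the extreme value disappears from the nearest pixel result and is damped in the block mean, which is one way the coarse training data can end up with a narrower range than the 500 m inputs.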

4.3. Impact of remote sensing image quality

The downscaling model was constructed mainly from MODIS data products. Where pixels are missing, either in the data products themselves or as a result of preprocessing, SIF cannot be inverted. In addition, misalignment between the pixels of different raster datasets may introduce errors during data quality verification.
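The way missing predictor pixels carry over into the output can be sketched as below. The sketch assumes that missing values are encoded as NaN and that a pixel is predicted only when every characteristic variable is available. `predict_fn` is a hypothetical stand-in for a trained model, not the RF model of this study.

```python
import numpy as np


def valid_pixel_mask(predictors):
    """True where all variables are finite; predictors has shape (n_vars, rows, cols)."""
    return np.isfinite(predictors).all(axis=0)


def predict_where_valid(predictors, predict_fn):
    """Apply predict_fn to valid pixels only; other pixels stay NaN."""
    _, rows, cols = predictors.shape
    mask = valid_pixel_mask(predictors)
    out = np.full((rows, cols), np.nan)
    samples = predictors[:, mask].T  # shape (n_valid_pixels, n_vars)
    if samples.size:
        out[mask] = predict_fn(samples)
    return out


# Two hypothetical predictor layers; one pixel is missing in the first layer.
stack = np.array([
    [[1.0, 2.0], [3.0, np.nan]],
    [[10.0, 20.0], [30.0, 40.0]],
])

# Hypothetical model: the sum of the predictors for each pixel.
sif = predict_where_valid(stack, lambda x: x.sum(axis=1))
print(sif)
```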

5. Conclusions

The existing SIF data products have insufficient spatial resolution. In this study, we used 0.05° spatial resolution eSIF data products and applied downscaling techniques, characteristic selection, and model construction to obtain 500 m spatial resolution SIF data products for Henan Province from 2012 to 2021. Based on our analysis, the following conclusions can be drawn:

  1. Based on the same sample data, the MLR, RF, and CNN models all achieve good accuracy, and the RF downscaling model performs best. The results also show that MODIS data products can characterize the structural and physiological characteristics of SIF and support the downscaling of SIF.

  2. The downscaled SIF data have a higher spatial resolution than current SIF data products and show important potential for winter wheat yield estimation and GPP evaluation.

This study uses the RF model to downscale SIF data from a spatial resolution of 0.05° to 500 m. However, the downscaled SIF data have missing values in some regions because of gaps in the characteristic variables. Acquiring high-quality characteristic data can help fill these gaps.

Data availability statement

The code used in this study is available by contacting the corresponding author.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Additional information

Funding

This research was funded by the National Key Research and Development Plan, grant number 2016YFC0803103.

References

  • Baker NR. 2008. Chlorophyll fluorescence: a probe of photosynthesis in vivo. Annu Rev Plant Biol. 59(1):89–113. doi:10.1146/annurev.arplant.59.032607.092759.
  • Breiman L. 2001. Random forests. Mach Learn. 45(1):5–32. doi:10.1023/A:1010933404324.
  • Chen SY, Jing X, Dong YY, Liu LY. 2019. Detection of wheat stripe rust using solar-induced chlorophyll fluorescence and reflectance spectral indices. Remote Sens Technol Appl. 34(3):511–520.
  • Damm A, Elbers JA, Erler A, Giolis B, Rascher U. 2010. Remote sensing of sun-induced fluorescence to improve modeling of diurnal courses of gross primary production. Glob Change Biol. 16(1):171–186.
  • Duveiller G, Cescatti A. 2016. Spatially downscaling sun-induced chlorophyll fluorescence leads to an improved temporal correlation with gross primary productivity. Remote Sens Environ. 182:72–89.
  • Fang XR, Wen ZF, Chen JL, Wu SJ, Huang YY, Ma MH. 2018. Remote sensing estimation of suspended sediment concentration based on random forest regression model. J Remote Sens. 23:756–772.
  • Frankenberg C, Fisher JB, Worden J, Badgley G, Saatchi SS, Lee J-E, Toon GC, Butz A, Jung M, Kuze A, et al. 2011. New global observations of the terrestrial carbon cycle from GOSAT: patterns of plant fluorescence with gross primary productivity. Geophys Res Lett. 38(17):L17706. doi:10.1029/2011GL048738.
  • Frankenberg C, O’Dell C, Berry J, Guanter L, Joiner J, Köhler P, Pollock R, Taylor TE. 2014. Prospects for chlorophyll fluorescence remote sensing from the orbiting carbon observatory-2. Remote Sens Environ. 147:1–12.
  • Gensheimer J, Turner AJ, Köhler P, Frankenberg C, Chen J. 2022. A convolutional neural network for spatial downscaling of satellite-based solar-induced chlorophyll fluorescence (SIFnet). Biogeosciences. 19(6):1777–1793. doi:10.5194/bg-19-1777-2022.
  • Gentine P, Alemohammad SH. 2018. Reconstructed solar-induced fluorescence: A machine learning vegetation product based on MODIS surface reflectance to reproduce GOME-2 solar-induced fluorescence. Geophys Res Lett. 45(7):3136–3146.
  • Grace J, Nichol C, Disney M, Lewis P, Quaife T, Bowyer P. 2007. Can we measure terrestrial photosynthesis from space directly, using spectral reflectance and fluorescence? Glob Change Biol. 13(7):1484–1497. doi:10.1111/j.1365-2486.2007.01352.x.
  • Guanter L, Frankenberg C, Dudhia A, Lewis PE, Gómez-Dans J, Kuze A, Suto H, Grainger RG. 2012. Retrieval and global assessment of terrestrial chlorophyll fluorescence from GOSAT space measurements. Remote Sens Environ. 121:236–251.
  • Hu FM, Wei ZS, Zhang W, Dorjee D, Meng LK. 2020. A spatial downscaling method for SMAP soil moisture through visible and shortwave-infrared remote sensing data. J Hydrol. 590:125360.
  • Ji MH, Tang BH, Li ZL. 2019. Review of solar-induced chlorophyll fluorescence retrieval methods from satellite. Remote Sens Technol Appl. 34(3):455–466.
  • Koehler P, Frankenberg C, Magney T, Guanter L, Joiner J, Landgraf J. 2018. Global retrievals of solar-induced chlorophyll fluorescence with TROPOMI: First results and inter-sensor comparison to OCO-2. Geophys Res Lett. 45(19):10456–10463.
  • Li HK, Wu GH, Wang XL. 2021. Land surface temperature downscaling method in ion-type rare earth mining area oriented to mining disturbance. Geomatics Inf Sci Wuhan Univ. 46(1):133–142.
  • Li X, Xiao JF. 2019. A global, 0.05-degree product of solar-induced chlorophyll fluorescence derived from OCO-2, MODIS, and reanalysis data. Remote Sens. 11(5):517–530.
  • Li X, Xiao JF, He BB. 2018. Chlorophyll fluorescence observed by OCO-2 is strongly related to gross primary productivity estimated from flux towers in temperate forests. Remote Sens Environ. 204:659–671.
  • Liu X, Liu L, Bacour C, Guanter L, Chen J, Ma Y, Chen R, Du S. 2022a. A simple approach to enhance the TROPOMI solar-induced chlorophyll fluorescence product by combining with canopy reflected radiation at near-infrared band. Remote Sens Environ. 284:113341.
  • Liu YY, Wang SQ, Wang XB, Chen B, Chen JH, Wang JB, Huang M, Wang ZS, Ma L, Wang PY, et al. 2022b. Exploring the superiority of solar-induced chlorophyll fluorescence data in predicting wheat yield using machine learning and deep learning methods. Comput Electron Agric. 192:106612.
  • Lu X, Zhou Y, Zhang X, Yu H, Cai G. 2022. Using time series vector features for annual cultivated land mapping: a trial in northern Henan, China. PLoS One. 17(8):e0272300.
  • Ma Y, Liu LY, Chen RN, Du SS, Liu XJ. 2020. Generation of a global spatially continuous TanSAT solar-induced chlorophyll fluorescence product by considering the impact of the solar radiation intensity. Remote Sens. 12(13):2167.
  • Ma Y, Liu LY, Liu XJ, Chen JD. 2022. An improved downscaled sun-induced chlorophyll fluorescence product of GOME-2 dataset. Eur J Remote Sens. 55(1):168–180.
  • Meroni M, Rossini M, Guanter L, Alonso L, Rascher U, Colombo R, Moreno J. 2009. Remote sensing of solar-induced chlorophyll fluorescence: Review of methods and applications. Remote Sens Environ. 113(10):2037–2051.
  • Plascyk JA, Gabriel FC. 1975. The fraunhofer line discriminator MKII-an airborne instrument for precise and standardized ecological luminescence measurement. IEEE Trans Instrum Meas. 24(4):306–313.
  • Shen ZX, Yuan SN. 2020. Regional load clustering integration forecasting based on convolutional neural network support vector regression machine. Power System Technol. 44(6):2237–2244.
  • Sun G, Liu LY, Zheng WG, Huang WJ, Yin M. 2009. Solar induced chlorophyll fluorescence instrument based on fraunhofer dark line principle. Trans Chin Soc Agric. 40:248–251.
  • Sun H, Zhou BC, Li H, Ruan L. 2021a. Study on microwave soil moisture downscaling by coupling MOD16 and SMAP. J Remote Sens. 25(3):776–790.
  • Sun ZQ, Gao XL, Du SS, Liu XJ. 2021b. Research progress and prospective of global satellite based solar-induced chlorophyll fluorescence products. Remote Sens Technol Appl. 36:1044–1056.
  • Verrelst J, Rivera JP, van der Tol C, Magnani F, Mohammed G, Moreno J, et al. 2015. Global sensitivity analysis of the SCOPE model: What drives simulated canopy-leaving sun-induced fluorescence. Remote Sens Environ. 166:8–21.
  • Wang LG, Zheng GQ, Guo Y, He J, Cheng YZ. 2022. Prediction of winter wheat yield based on fusing multi-source spatio-temporal data. Trans Chin Soc Agric. 53:198–204.
  • Wang R, Liu ZG, Yang PQ. 2012. Principle and progress in remote sensing of vegetation solar-induced chlorophyll fluorescence. Adv Earth Sci. 27(11):1221–1228.
  • Wang XX, Cai GS, Lu XP, Yang ZN, Zhang XJ, Zhang QG. 2022. Inversion of wheat leaf area index by multivariate red-edge spectral vegetation index. Sustainability. 14(23):15875. doi:10.3390/su142315875.
  • Wen FP, Zhao W, Hu L, Xu HX, Cui Q. 2021. SMAP passive microwave soil moisture spatial downscaling based on optical remote sensing data: a case study in Shandian river basin. J Remote Sens. 25(4):962–973.
  • Wood JD, Griffis TJ, Baker JM, Frankenberg C, Verma M, Yune K. 2017. Multiscale analyses of solar‐induced florescence and gross primary production. Geophys Res Lett. 44(1):533–541.
  • Wu LS, Zhang YG, Zhang ZY, Zhang XK, Wu YF. 2022. Remote sensing of solar-induced chlorophyll fluorescence and its applications in terrestrial ecosystem monitoring. Chinese J Plant Ecol. 46(10):1167–1199. doi:10.17521/cjpe.2022.0233.
  • Yang FZ, Wang ZS, Zhang Q, Sun SL, Liu YB. 2022. Consistency analysis of five global sun-induced chlorophyll fluorescence products over China. Remote Sens Technol Appl. 37(1):125–136.
  • Yang X, Tang J, Mustard JF, Lee JE, Rossini M, Joiner J, Munger JW, Kornfeld A, Richardson AD. 2015. Solar-induced chlorophyll fluorescence that correlates with canopy photosynthesis on diurnal and seasonal scales in a temperate deciduous forest. Geophys Res Lett. 42(8):2977–2987.
  • Yu L, Wen J, Chang CY, Frankenberg C, Sun Y. 2019. High-resolution global contiguous SIF of OCO-2. Geophys Res Lett. 46(3):1449–1458.
  • Zarco-Tejada PJ, CA, Gonzalez MR, Martin P. 2013. Relationships between net photosynthesis and steady-state chlorophyll fluorescence retrieved from airborne hyperspectral imagery. Remote Sens Environ. 136:247–258.
  • Zhang Y, Joiner J, Alemohammad SH, Zhou S, Gentine P. 2018. A global spatially contiguous solar-induced fluorescence (CSIF) dataset using neural networks. Biogeosciences. 15(19):5779–5800. doi:10.5194/bg-15-5779-2018.
  • Zhang YJ, Liu LY, Hou MY, Liu LT, Li CD. 2009. Progress in remote sensing of vegetation chlorophyll fluorescence. J Remote Sens. 13(5):963–978.
  • Zhang Z, Wang S, Qiu B, Song L, Zhang Y. 2019. Retrieval of sun-induced chlorophyll fluorescence and advancements in carbon cycle application. J Remote Sens. 23(1):37–52.
  • Zhang ZX, Xu W, Qin QM, Long ZH. 2020. Downscaling solar-induced chlorophyll fluorescence based on convolutional neural network method to monitor agricultural drought. IEEE Trans Geosci Remote Sens. 59(2):1012–1028.