Research Article

An improved deep learning network for AOD retrieving from remote sensing imagery focusing on sub-pixel cloud

Article: 2262836 | Received 19 Apr 2023, Accepted 20 Sep 2023, Published online: 14 Oct 2023

ABSTRACT

Following the success of MODIS, several widely used algorithms have been developed for different satellite sensors to provide global aerosol optical depth (AOD) products. Despite the progress made in improving the accuracy of satellite-derived AOD products, the presence of sub-pixel clouds and the corresponding cloud shadows still significantly degrades AOD products, because sub-pixel clouds are hardly identified, which inevitably leads to the overestimation of AOD. To overcome these conundrums, we propose an improved deep learning network for retrieving AOD from remote sensing imagery that focuses especially on sub-pixel clouds, which we call the Sub-Pixel AOD network (SPAODnet). Two specific improvements considering sub-pixel clouds have been made: a spatially adaptive bilateral filter is first applied to top-of-atmosphere (TOA) reflectance images to remove the noise induced by sub-pixel clouds and the corresponding shadows, and a channel attention mechanism is added to the convolutional neural network to further emphasize the relationship between uncontaminated pixels and the ground-measured AOD from AERONET sites. In addition, a composite loss function, the Huber loss, is used to further improve the accuracy of the retrieved AOD. The SPAODnet model is trained using ten AERONET sites within the Beijing-Tianjin-Hebei (BTH) region in China, along with their corresponding MODIS images from 2011 to 2020. Subsequently, the trained network is applied over the whole BTH region, and the AOD images over the region from 2011 to 2020 are retrieved. Based on a comprehensive validation against ground measurements, the MODIS products, and AOD retrieved by another neural network, the proposed network significantly improves the overall accuracy, spatial resolution, and spatial coverage of the AOD, especially for cases with sub-pixel clouds and cloud shadows.

1. Introduction

AOD is a crucial parameter used in meteorological observation and represents aerosol extinction effects in the vertical atmospheric column. It is a fundamental optical property of aerosol derived from satellite imagery (Guo et al. Citation2020; Li, Carlson, and Lacis Citation2015; Lin et al. Citation2015; Martins et al. Citation2002; Su et al. Citation2017, Citation2020; Van Donkelaar, Martin, and Park Citation2006). Following the success of MODIS, several algorithms have been developed to retrieve AOD from satellite imagery. The most well-known are the Dark Target (DT) algorithm (Gupta et al. Citation2016; Jackson et al. Citation2013; Kaufman et al. Citation1997; Levy, Remer, and Dubovik Citation2007) and the Deep Blue (DB) algorithm (Hsu et al. Citation2004, Citation2006, Citation2013; Lyapustin et al. Citation2018). The DT algorithm estimates the reflectance of "dark targets" such as dense vegetation and dark soils to decouple the ground and atmosphere and derive AOD. However, its AOD is generally overestimated, and the retrieval is limited to "dark target" regions. The DB algorithm uses the minimum reflectivity method to build a surface reflectivity database to derive AOD; it is stable and can derive AOD over bright surfaces. However, it is not designed to deal with the influence of surface anisotropy, which increases errors in surface reflectivity determination, particularly in urban areas with complex surfaces. Additionally, the DB algorithm employs variance statistics to further reduce the impact of cloudlets on retrievable pixels (Bian et al. Citation2018, Citation2022; Wang et al. Citation2020). Other algorithms, such as MAIAC, can retrieve aerosol properties over both dark and bright surfaces (Lyapustin et al. Citation2018). Multispectral lidar measurements are also utilized in aerosol studies (Pérez-Ramírez et al. Citation2019, Citation2020; Young et al. Citation2018).

Based on the above-mentioned algorithms, long-term AOD products with extensive coverage from polar-orbiting satellites have been released and widely used (Su et al. Citation2020). However, satellite-based AOD retrieval relies on clear skies and appropriate surface conditions (Kim et al. Citation2008, Citation2020). Cloud cover is the most significant factor affecting the retrieval of satellite-derived aerosols (Winker et al. Citation2009; Winker, Pelon, and Patrick McCormick Citation2003). It should not be overlooked that the cloud screening process may, to a certain extent, miss cirrus or cloud edges, where thin clouds exist as sub-pixel clouds in low-resolution sensors (Lin et al. Citation2016). Despite improved detection of clouds and snow using quality screening in data pre-processing, residual clouds and sub-pixel clouds can still lead to an overestimation of AOD from the MAIAC algorithm (Li, Carlson, and Lacis Citation2015).

To better illustrate the impact of sub-pixel clouds on low-resolution images and the difficulty of identifying them, we present Figure 1. This figure shows an example of sub-pixel clouds in low-resolution imagery and the corresponding high-resolution imagery at the same location and time. The clouds are easily recognizable in the high-resolution imagery, but the low-resolution cloud product cannot identify them accurately. As a result, pixels with unidentified clouds in low-resolution imagery are treated as "clear" pixels, leading to an overestimation of AOD.

Figure 1. An example of an unidentified sub-pixel cloud scene in low-resolution imagery and its corresponding appearance in high-resolution imagery. (a) and (b) display the high-resolution imagery under clear and cloudy conditions, respectively. Both images were captured in the same region at different times (2020-07-21 19:41:14 (UTC) and 2020-07-23 19:28:53 (UTC)); (c) displays the low-resolution imagery captured during cloudy conditions on 2020-07-23 from 19:20:00–19:25:00 (UTC). (d) displays the cloud mask for the same low-resolution imagery (2020-07-23 19:20:00–19:25:00 (UTC)). Regions with unidentified sub-pixel clouds are circled in red.


Sub-pixel clouds are especially difficult to identify and remove (Motohka et al. Citation2011). Noise due to irremovable sub-pixel clouds is one of the most severe problems for AOD retrievals using satellite sensors with low-to-medium spatial resolution such as MODIS. Although cloud masks have been produced and demonstrated to be effective (Platnick et al. Citation2003), they are not always able to completely offset the contamination caused by clouds (Nagai et al. Citation2010). Residual sub-pixel clouds potentially contaminated about 40% of the MODIS data even after cloud screening by the state flag of the surface reflectance product (Motohka et al. Citation2011). Despite the development of an efficient method suitable for widespread applications, which utilizes spatiotemporal optimization algorithms for gap filling in products (Zhang et al. Citation2022), the key challenge in low-resolution satellite image aerosol retrieval remains how to enhance aerosol retrieval accuracy and ensure spatiotemporal continuity, especially under sub-pixel cloud conditions. Until now, the strong influence of sub-pixel clouds on AOD retrieval has remained an unsolved problem. Data screening based on variances has improved the accuracy of AOD products through strict threshold settings, but a large number of retrievable pixels have been excluded, and the spatial coverage of retrieved AOD is greatly reduced. Consequently, increasing the spatial coverage of retrieved AOD while maintaining accuracy becomes a dilemma.

With the accumulation of satellite data and long-term ground observations, big data analytics (Chen et al. Citation2022) and machine learning algorithms (Kang et al. Citation2022) can be employed to construct a better fit between satellite data and ground observations and thereby resolve the above dilemma. We present an improved deep learning model incorporating a bilateral filter and a channel attention mechanism to robustly retrieve AOD over the BTH region. The proposed model is specially designed to focus on sub-pixel clouds, and we call it SPAODnet. Two specific improvements considering sub-pixel clouds have been made: a bilateral filter is first applied to the TOA reflectance images to remove the noise induced by sub-pixel clouds and the corresponding shadows, and a channel attention mechanism is added to the convolutional neural network to further emphasize the relationship between uncontaminated pixels and the ground-measured AOD from AERONET sites. In addition, a composite loss function, the Huber loss function, is used to further improve the accuracy of the retrieved AOD. Finally, daily AOD over the BTH region for ten years (2011 to 2020) is retrieved and used for model validation. The retrieved AODs are evaluated against AERONET AOD, the MODIS aerosol products, and AODs derived from other neural network approaches in terms of accuracy and spatial coverage.

2. Study area and datasets

2.1. Study area

The study area is the BTH region of China, located within 113.27°E–119.50°E, 36.05°N–42.4°N, which covers an area of 0.216 million square kilometers (Figure 2). As an important economic core area with heavy industry in northern China, it has experienced long-lasting heavy-pollution weather in recent decades, and its aerosol properties vary dramatically across seasons. In addition, the well-maintained AERONET stations have operated for many years, so abundant AOD samples can be collected over a long period to support the proper training of SPAODnet and the validation of AODs retrieved from the trained network. The diversity of aerosols and the data availability therefore make the BTH region an excellent area for this study. Figure 2 shows detailed land cover information near the AERONET stations; the ranges represented in panels (a) and (b) are located in the red boxes, and the blue dots indicate the ten AERONET sites within the study area.

Figure 2. Spatial distribution of selected AERONET stations over the BTH region. The background image is the Terra and Aqua combined MODIS land cover type (MCD12Q1) Version 6. The colours in the figure legend represent different land cover types: Evergreen Needleleaf Forests (1), Evergreen Broadleaf Forests (2), Deciduous Needleleaf Forests (3), Deciduous Broadleaf Forests (4), Mixed Forests (5), Closed Shrublands (6), Open Shrublands (7), Woody Savannas (8), Savannas (9), Grasslands (10), Permanent Wetlands (11), Croplands (12), Urban and Built-up Lands (13), Cropland/Natural Vegetation Mosaics (14), Permanent Snow and Ice (15), Barren (16), and Water Bodies (17), respectively.


2.2. Data

In this study, TOA reflectance and surface reflectance from MODIS with the corresponding solar-viewing geometries, including solar zenith, solar azimuth, viewing zenith, and viewing azimuth angles, and the ground-measured AOD from AERONET sites are selected as the training data and the corresponding labels, respectively. Multiple data pre-processing steps are applied to prepare the training data fed into the SPAODnet model, including variable transformation, spatial correlation correspondence, and collocation at different temporal and spatial resolutions.

2.2.1. AERONET AOD

The labels for training SPAODnet over the study area come from the ground-based observation data of AERONET, a global network that observes aerosol parameters on the ground using sun photometers and has been commonly used for validating satellite-derived AOD products (Bilal et al. Citation2013; Gao et al. Citation2021; van Donkelaar et al. Citation2013). AOD data are taken from the AERONET Version 3 Direct Sun Algorithm product (Giles et al. Citation2019), which provides observations at three data quality levels: level 1.0 (L1, unscreened), level 1.5 (L1.5, cloud-screened and quality controlled), and level 2.0 (L2, quality-assured) (Wang et al. Citation2020). To better demonstrate the data available for this study, the available period of the ten sites and their corresponding land cover types are listed in Table 1. The Beijing and Beijing-CAMS sites are located within the 5th Ring Road and have the same urban surface type (Wang et al. Citation2020; Wei et al. Citation2018). XiangHe is a typical suburban site located between the two megacities of Beijing and Tianjin. This site is surrounded by cropland and experiences both natural and anthropogenic aerosols of urban, rural, or mixed origin (Li et al. Citation2008).

Table 1. Detailed information about the used AERONET sites. Lat: latitude; Lon: longitude.

The level 1.5 AOD measurements from 2011 to 2020 are employed as the ground truth in this study. To be consistent with the satellite-retrieved AOD, ground-measured AOD at the adjacent bands, 500 nm and 675 nm, is interpolated to 550 nm using the Ångström exponent (Ångström Citation1929; Eck et al. Citation1999), as defined in Eqs. (1)–(3).

(1) τλ0 = β·λ0^(−α)
(2) α = −ln(τλ1/τλ2) / ln(λ1/λ2)
(3) β = τλ1·λ1^α

where τλ is the AOD at wavelength λ, α is the Ångström (wavelength) exponent, β is the Ångström turbidity coefficient, and λ0, λ1 and λ2 are the wavelengths 550 nm, 500 nm and 675 nm, respectively.
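The interpolation above can be condensed into a short routine (a minimal sketch of Eqs. (1)–(3); the function and argument names are ours, not from the original processing chain):

```python
import math

def aod_at_550(tau_500, tau_675, lam0=0.550, lam1=0.500, lam2=0.675):
    """Interpolate AOD to 550 nm from the 500/675 nm bands (Eqs. 1-3)."""
    # Eq. (2): Angstrom exponent alpha from the two adjacent bands
    alpha = -math.log(tau_500 / tau_675) / math.log(lam1 / lam2)
    # Eq. (3): Angstrom turbidity coefficient beta
    beta = tau_500 * lam1 ** alpha
    # Eq. (1): AOD at the target wavelength
    return beta * lam0 ** (-alpha)
```

By construction, the routine reproduces the input AODs exactly at 500 nm and 675 nm, and the interpolated 550 nm value falls between them for typical spectra.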

2.2.2. Satellite data

MODIS is a widely used sensor aboard the Terra and Aqua polar-orbiting satellites (Wang et al. Citation2020), providing rich spectral information, a wide viewing swath, and frequent global coverage. MODIS data were obtained from the Goddard Space Flight Center (GSFC) Level-1 and Atmosphere Archive and Distribution System (LAADS). Many data products convenient for scientific research have been derived from MODIS data (Chen et al. Citation2020; Levy et al. Citation2009; Liang, Sun, and Haoxin Citation2021; Wei et al. Citation2019); therefore, much similar research is available for comparison. In this research, TOA reflectance and surface reflectance as well as the viewing angles are the main feature vectors of the SPAODnet model.

The M×D04 aerosol products, which combine the dark target and deep blue algorithms, are used for comparative verification of SPAODnet. Table 2 summarizes the characteristics of the MODIS data used and the spectral measurement information of the MODIS sensor.

Table 2. Detailed information of MODIS/Terra data used in this study.

During the study period, B5 damage was observed on both MODIS sensors, and B6 damage was observed only on MODIS/Terra (Chen et al. Citation2020); therefore, only bands 1, 2, 3, 4, and 7 of MOD02HKM and MOD09 are extracted as inputs. MOD03 provides the solar-viewing geometry, including solar zenith, solar azimuth, viewing zenith, and viewing azimuth angles, which are also necessary input features for AOD retrieval. In addition, to retrieve AODs under clear-sky conditions, the cloud mask MOD35 is applied to screen out pixels with thick clouds and snow. To obtain a reasonable sub-region size of input data, a cloud-free threshold test is performed. First, the MOD35 cloud mask product is used to identify MODIS pixels that include ice, snow, ocean, and clouds, and these pixels are set to 0. Then, the ratio of cloud-free land pixels to the total number of pixels is calculated. If the ratio is less than 90%, the entire sample is removed from the dataset. To determine the cloud-free threshold in SPAODnet, samples with 80%, 90%, 95%, and 100% cloud-free pixels were tested. There was no significant difference between the results obtained with 90%, 95%, and 100% cloud-free pixels; however, when the threshold was set below 90%, the validation results showed lower accuracy. Therefore, a cloud-free threshold of at least 90% is recommended in SPAODnet for optimal results. Finally, the features of cloud-free pixels are extracted to construct the nonlinear relationship between observation information and ground observation sites.
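The cloud-free threshold test above can be sketched in a few lines (a simplified illustration; it assumes the MOD35 flags have already been rasterized so that ice/snow/ocean/cloud pixels are 0 and cloud-free land pixels are 1, and the function name is ours):

```python
import numpy as np

def passes_cloud_free_test(mask, threshold=0.90):
    """Keep a sub-region sample only if the fraction of cloud-free land
    pixels (value 1 in the mask) is at least the threshold (90% recommended)."""
    ratio = mask.sum() / mask.size
    return ratio >= threshold

# A 12 x 12 sub-region with 10 contaminated pixels (ratio ~ 0.93) is kept;
# one with 20 contaminated pixels (ratio ~ 0.86) is removed from the dataset.
clean = np.ones((12, 12), dtype=int)
clean.flat[:10] = 0
```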

2.2.3. Training set construction

Prior to performing the AOD retrieval, the MODIS data and ground-measured AOD data are combined into a training dataset for the SPAODnet model through spatial-temporal matching, which ensures spatial and temporal consistency. Assuming that aerosol is relatively homogeneous within a certain time-space boundary (Anderson et al. Citation2003), the geo-locations (geographical latitude and longitude coordinates) of the AERONET sites are used as center points to reproject and subset the corresponding TOA reflectance (MOD02), solar-viewing geometry (MOD03), surface reflectance (MOD09), and cloud mask (MOD35) images to align all the satellite data. The AERONET AOD data are averaged over a ±30 min window around the MODIS imaging time to reduce anomalous perturbations.

After being processed through the above procedure, the inputs, including the sub-regions of TOA reflectance (MOD02), solar-viewing geometry (MOD03), surface reflectance (MOD09), and cloud mask (MOD35) images, and the ground truth (AOD ground measurements) compose a cloud-free, temporally and spatially consistent dataset (Ichoku et al. Citation2002; Remer et al. Citation2005). In this study, 4176 data records are collected for SPAODnet from ten AERONET sites over a period of ten years. In deep learning, common methods for dividing a dataset include hold-out, cross-validation, and bootstrap sampling. We use the hold-out method, which randomly divides the pre-matched dataset into three sets: training, validation, and testing. The training set is used to train the model, the validation set is used to tune the model's hyperparameters during training and prevent overfitting, and the testing set is used to evaluate the final performance of the model. The size of each set can vary depending on the size of the dataset and the complexity of the model. We shuffle the dataset into training, validation, and testing subsets with a ratio of 8:1:1, and the independent test set, randomly selected across the full spatial-temporal range, is fed into SPAODnet for stringent validation.
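The 8:1:1 hold-out split can be sketched as follows (an illustrative helper with our own names; the actual shuffling seed used by the authors is unspecified):

```python
import numpy as np

def holdout_split(n_samples, ratios=(0.8, 0.1, 0.1), seed=0):
    """Randomly shuffle sample indices and split them into
    training / validation / testing subsets with an 8:1:1 ratio."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    n_train = int(n_samples * ratios[0])
    n_val = int(n_samples * ratios[1])
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

# Split the 4176 matched records described above
train_idx, val_idx, test_idx = holdout_split(4176)
```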

2.3. Determining the sub-region size of the input satellite images

Since aerosols change slowly in space (Zhong et al. Citation2017), a single pixel may be affected by cloudlets and sub-pixel clouds. Therefore, sub-region images centered on AERONET sites are extracted to obtain spatially auto-correlated observation information as input. Sub-regions smaller than the 3 × 3 pixel convolution kernel, however, are not recommended. A sub-region block size of 12 × 12 pixels (12 × 12 km²) is selected for two reasons. First, this size is large enough to capture a variety of spatial variability scales. Second, it reduces the number of model parameters, following the assumption that aerosol is uniformly distributed over a certain range. The sub-region size is a hyperparameter of the SPAODnet model that directly affects its performance; it is assessed using two metrics, coverage and accuracy, to determine the optimal value. The comparison results are listed in Table 3, and the best performance is obtained with a sub-region size of 12 × 12 km² for the satellite images. The detailed results are shown in Section 4. Overall, the 12 × 12 km² and 8 × 8 km² windows have higher AOD site coverage, and the within-EE ratio performs well when the sub-region size is set to 12.

Table 3. Performance of AOD retrievals under different sub-region sizes. In the collection 6 AOD validation, EE is defined as ±0.05 or ±0.15×AOD, and the AERONET site coverage represents effective coverage of AERONET AOD for the period 2011–2020.
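Extracting a sub-region centered on a site can be sketched as follows (an illustrative helper assuming the site's latitude/longitude has already been mapped to a pixel coordinate in the reprojected image; the function name is ours):

```python
import numpy as np

def extract_subregion(image, row, col, size=12):
    """Cut a size x size pixel window (12 x 12 km at 1 km resolution)
    centered on the pixel (row, col); return None when the window
    would fall outside the image."""
    half = size // 2
    r0, c0 = row - half, col - half
    if r0 < 0 or c0 < 0 or r0 + size > image.shape[0] or c0 + size > image.shape[1]:
        return None
    return image[r0:r0 + size, c0:c0 + size]
```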

3. Methodology

The flowchart of SPAODnet for retrieving AOD from MODIS data is shown in Figure 3. A convolutional neural network (CNN), which has excellent performance on nonlinear fitting problems, is used to map the TOA reflectance of satellite images to AOD for successful retrieval (Chen et al. Citation2020). Additionally, the proposed SPAODnet model incorporates two core components into the CNN: spatially adaptive bilateral filtering and a channel attention mechanism. These components aim to reduce the impact of sub-pixel clouds and the corresponding cloud shadows.

Figure 3. The pipeline of the SPAODnet algorithm.


3.1. The theoretic basis of AOD retrieval

The SPAODnet is theoretically based on the atmospheric radiative transfer equation (Kaufman et al. Citation1997). The radiation information received by satellites is a combination of scattering from the Earth's atmosphere and reflection from the surface. Assuming that the land surface is a uniform Lambertian surface and that the atmosphere is uniformly attenuated in the vertical direction, the radiation intensity L received by the satellite can be depicted as Eq. (4):

(4) L(μv) = L0(μv) + [ρ / (1 − ρS)]·μs·F0·T(θs, τ)·T(θv, τ)

where μsF0 is the radiation flux density at the top of the atmosphere in the direction perpendicular to the sunlight; L0(μv) represents the path radiance in the direction of observation; ρ and S denote the surface reflectance and the atmospheric single scattering albedo, respectively; μs = cos θs and μv = cos θv are the cosines of the solar zenith angle and the viewing zenith angle, respectively; T(θs, τ) and T(θv, τ) represent the downward and upward total transmittance of the atmosphere in the solar and satellite directions, respectively; and τ is the atmospheric AOD. With the above equation, any one parameter can be solved for when the other factors are known. The atmospheric radiative transfer model solves for the radiation flux density through the above equation, which is normalized to give the top-of-atmosphere reflectance ρTOA.

(5) ρTOA(μs, μv, φ) = ρ0(μs, μv, φ) + [ρs / (1 − ρs·S)]·T(θs, τ)·T(θv, τ)

where φ is the relative azimuth of the sensor; μs and μv are the cosines of the solar zenith angle and the satellite viewing angle; ρ0 is the equivalent reflectance of atmospheric radiation along its transport path; and ρs is the surface reflectance of a uniform Lambertian surface. This shows that ρTOA is a function not only of the aerosol optical depth but also of the apparent reflectance. If ρs is known, AOD can be obtained for the corresponding image element, which is the principle of AOD retrieval and also the basis for the construction of SPAODnet. It is worth mentioning that abnormal values, for which the surface reflectance ρs is greater than 0.3 in all wavelengths, are treated as cloudy or snowy conditions. The above equation describes how solar radiation is transferred through the atmosphere and land surface into the satellite's sensor; it is a nonlinear integro-differential equation, so it has no analytic solution. Due to its excellent performance on nonlinear fitting problems (Su et al. Citation2020; Wang et al. Citation2019), a CNN is selected for this research as one of the best tools.
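To make the nonlinearity concrete, Eq. (5) can be evaluated in the forward direction as a toy model. The Beer's-law transmittance T(θ, τ) = exp(−τ/cos θ) used here is our simplifying assumption for illustration only; an operational retrieval evaluates the transmittances with a full radiative transfer code:

```python
import math

def toa_reflectance(rho_s, rho_0, S, tau, mu_s, mu_v):
    """Toy forward evaluation of Eq. (5): TOA reflectance from surface
    reflectance rho_s, path reflectance rho_0, atmospheric albedo S,
    and AOD tau, with Beer's-law transmittances (illustrative only)."""
    T_s = math.exp(-tau / mu_s)   # downward transmittance, solar direction
    T_v = math.exp(-tau / mu_v)   # upward transmittance, view direction
    return rho_0 + rho_s / (1.0 - rho_s * S) * T_s * T_v
```

Even in this toy form there is no closed-form inverse giving τ from ρTOA once ρ0 and S themselves depend on τ, which is why a nonlinear fitting tool such as a CNN is attractive.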

3.2. Spatial adaptive bilateral filtering for input TOA imagery cleaning

Despite quality control and cloud screening of satellite imagery, residual sub-pixel clouds may contaminate approximately 40% of MODIS data (Motohka et al. Citation2011), which significantly reduces the accuracy of AOD retrieval from satellite imagery. The presence of sub-pixel clouds increases the TOA reflectance and leads to overestimated AODs. To eliminate the effect of sub-pixel clouds on TOA reflectance as much as possible while retaining edge features, adaptive bilateral filtering is adopted in this study. Bilateral filtering is defined in Eqs. (6)–(10); it takes both the spatial closeness and the similarity of grey-scale values into consideration while filtering the TOA image. Gaussian functions are used as the default kernels for determining the closeness ωs(i, j) in the spatial domain and the similarity ωr(i, j) in the grey-scale domain in Eqs. (7) and (8).

(6) f̂(x, y) = (1/ωp)·Σ(i,j)∈Ω ωs(i, j)·ωr(i, j)·I(i, j)
(7) ωs(i, j) = exp(−[(i − x)² + (j − y)²] / (2σs²))
(8) ωr(i, j) = exp(−[I(i, j) − I(x, y)]² / (2σr²))
(9) ωp = Σ(i,j)∈Ω ωs(i, j)·ωr(i, j)
(10) I(i, j) = f(i, j) + n(i, j)

For a pixel (x, y) in a TOA image, f̂(x, y) is the denoised value, ωs(i, j) is the weight in the spatial domain, ωr(i, j) is the weight in the grey-scale domain, ωp is the normalization parameter, I is the noisy image, and Ω is the neighborhood around pixel (x, y). As shown in Eq. (10), f denotes the noise-free image, n is the noise, assumed in this study to be caused by the presence of sub-pixel clouds, and I(i, j) denotes the pixel value of image I at (i, j).

From Eqs. (6)–(8), it can be seen that the weight coefficients of the bilateral filter are jointly determined by two parameters, the spatial variance σs and the grey-scale variance σr. Thus, to eliminate the influence of discrete sub-pixel cloud points on TOA images as much as possible while retaining image edge features, adaptively adjusting the spatial variance σs and the grey-scale variance σr becomes the key issue. In this study, a local adaptive mechanism is adopted to dynamically determine the spatial variance σs of a pixel by calculating the target scale Rxy of each pixel in the image, which depicts the extent of smoothness around that pixel through the highest similarity of the pixel with its neighbors. The target scale Rxy characterizes the maximum neighborhood radius of a connected circle within the same target region, as shown in Eq. (11).

(11) Rxy = argmax { r ∈ ℤ, r ≥ 1 : Uxy(r) ≥ Ts }

For a pixel (x, y) in the image, Uxy(r) defines the similarity between the pixel and its neighbors within the neighborhood boundary of radius r, and Ts is the similarity threshold, set to 0.75 in this study (Saha, Udupa, and Odhner Citation2000).

In edge and texture regions, a smaller target scale is derived and a smaller value of the spatial variance σs is taken to retain more detail in the TOA image; conversely, a larger value of σs is derived for better smoothing and denoising. The value of σs in the bilateral filter determines the spread of the Gaussian curve in the filter window: the larger σs is, the slower the Gaussian curve decreases and the more obvious the smoothing effect. σs is defined in Eq. (12).

(12) σs = 0.5·Rxy

From Eq. (8), it can be seen that when σr is set large, the TOA image becomes more blurred; when σr is set small, edges and details remain clearer and the result is more sensitive to σs. From Eq. (13), it can be seen that to obtain the grey-scale variance σr, the noise variance σn of the image must first be estimated; σn can be quickly estimated from the Laplacian response as shown in Eq. (14).

(13) σr = 3σn
(14) σn = √(π/2)·[1 / (6(W − 2)(H − 2))]·Σimage |I(x, y) ∗ N|

where W and H are the width and height of the image, ∗ represents the convolution operation, I(x, y) ∗ N denotes the convolution of the image with the mask N, and N is the mask obtained from the discrete Laplace operator as shown in Eq. (15).

(15) N = [[1, −2, 1], [−2, 4, −2], [1, −2, 1]]

In general, this method allows adaptive correction of TOA images and is effective in regions with sub-pixel clouds: it preserves edge information while denoising TOA images contaminated by sub-pixel clouds, thereby greatly improving the input TOA images for the network.
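The filtering scheme can be sketched as follows in NumPy. The noise estimate implements Eqs. (13)–(15); the bilateral filter implements Eqs. (6)–(9) with a fixed σs per call rather than the per-pixel target-scale adaptation of Eqs. (11)–(12), so it is a simplified illustration rather than the full adaptive method (function names are ours):

```python
import math
import numpy as np

# Eq. (15): discrete Laplacian mask
LAPLACIAN = np.array([[1.0, -2.0, 1.0],
                      [-2.0, 4.0, -2.0],
                      [1.0, -2.0, 1.0]])

def estimate_noise_sigma(img):
    """Eqs. (13)-(14): fast noise estimate from the mean absolute
    Laplacian response of the image."""
    H, W = img.shape
    acc = np.zeros((H - 2, W - 2))
    for di in range(3):
        for dj in range(3):
            acc += LAPLACIAN[di, dj] * img[di:di + H - 2, dj:dj + W - 2]
    return math.sqrt(math.pi / 2.0) / (6.0 * (W - 2) * (H - 2)) * np.abs(acc).sum()

def bilateral_filter(img, sigma_s, sigma_r, radius=2):
    """Eqs. (6)-(9): weight each neighbor by spatial closeness (ws) and
    grey-scale similarity (wr), then normalize by the total weight."""
    H, W = img.shape
    out = np.empty_like(img, dtype=float)
    for x in range(H):
        for y in range(W):
            i0, i1 = max(0, x - radius), min(H, x + radius + 1)
            j0, j1 = max(0, y - radius), min(W, y + radius + 1)
            patch = img[i0:i1, j0:j1]
            ii, jj = np.mgrid[i0:i1, j0:j1]
            ws = np.exp(-((ii - x) ** 2 + (jj - y) ** 2) / (2.0 * sigma_s ** 2))
            wr = np.exp(-((patch - img[x, y]) ** 2) / (2.0 * sigma_r ** 2))
            w = ws * wr
            out[x, y] = (w * patch).sum() / w.sum()
    return out
```

With σr = 3σn from Eq. (13), isolated bright sub-pixel cloud values receive small similarity weights wr and are smoothed toward their neighborhood, while genuine edges keep their weight.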

To better demonstrate the improvement of the bilateral filter, Sentinel-2 (1 July 2020 03:16:49) and MODIS (1 July 2020 03:15:00) images are shown in Figure 4 to illustrate the sub-pixel cloud. Figure 4a (6 July 2020 03:16:46 UTC) shows a clear weather condition compared to Figure 4b, demonstrating that the region circled in red in Figure 4b is a sub-pixel cloud that is not observable in the MODIS image (Figure 4c). Figure 4d shows the results of bilateral filtering, Gaussian filtering, and thresholding, respectively. The results show that distance-based Gaussian filtering removes a large number of edge features, while the thresholding methods hardly eliminate the noise induced by sub-pixel clouds. In contrast, the adaptive bilateral filtering smooths the sub-pixel cloud region while retaining most of the edge information. The red boxes in Figure 4 show the sub-pixel clouds and the effects after adaptive bilateral filtering, which provides more accurate TOA feature information to the following network for better prediction.

Figure 4. The visualization results of bilateral filtering in removing sub-pixel clouds. (a) and (b) show Sentinel-2 satellite images under clear and cloudy conditions, respectively. (c) shows a low-resolution MODIS image under cloudy conditions. (d) demonstrates the results of Gaussian filtering, thresholding, and bilateral filtering in removing sub-pixel clouds from MODIS images.


3.3. Convolutional neural network architecture combining channel attention

The input features of the SPAODnet model fall into two modalities: the TOA and surface reflectance are images, while the solar-viewing angles are numerals. To feed the matrices (remotely sensed images) and numerical data (observation angles) into a nonlinear regression network, the branches of input 1 and input 2 are designed for multimodal feature extraction, and their feature maps are concatenated and passed through dense layers to output the AOD value, as shown in Figure 5. To prevent overfitting during training, a dropout layer is employed, providing a degree of regularization.

Figure 5. A schematic representation of the channel-attention-based convolutional neural network architecture used to retrieve AOD. The same architecture was optimized multiple times with different loss functions to compare the sensitivity of the model. Convolution, channel attention, average pooling, up-sampling plus channel attention, global pooling, dense, and dropout layers are shown in purple, white, blue, green, blue, pink, and sky blue, respectively. "Concatenate" denotes the concatenation operation between feature maps.


The network architecture and the vector dimensions of each block are shown in Figure 5, where the input dimension is (N, H, W, C). Here, N is the number of samples, H × W is the sub-region size of the input satellite images, and C is the number of TOA and surface reflectance bands. The output dimension is (N, 1), corresponding to the single output variable, AOD. The network contains five basic blocks to extract features from the input data. Given the MODIS TOA and surface reflectance images as input feature vectors, shallow features are first extracted through convolutional layers and then passed through five consecutive blocks incorporating channel attention to extract effective feature information; the final feature map is passed to the concatenating layer through a global pooling layer. Both average pooling and global average pooling (GAP) are employed at different blocks of the network: the average pooling layer behind each convolutional layer reduces dimensionality and removes redundant information, while GAP replaces the flatten layer to regularize the whole network against overfitting, which removes the black-box character of the fully connected (FC) layer and thereby enhances the interpretability of the network.
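The two-branch fusion described above can be sketched as follows. This is a minimal NumPy illustration with hypothetical layer sizes and random weights, in which a 1 × 1 channel-mixing step stands in for the convolutional/attention blocks of the real network; it shows only how image patches and angle scalars are reduced to vectors, concatenated, and mapped to a single AOD value.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

def global_average_pool(feat):
    # (N, H, W, C) -> (N, C), replacing a flatten layer as in the text
    return feat.mean(axis=(1, 2))

def image_branch(patches, w_mix):
    # Branch 1: (N, 12, 12, C) reflectance patches -> pooled feature vector.
    mixed = relu(patches @ w_mix)        # 1x1 channel mixing stands in for conv blocks
    return global_average_pool(mixed)    # (N, C')

def angle_branch(angles, w_ang):
    # Branch 2: (N, 4) solar/viewing angles -> dense feature vector.
    return relu(angles @ w_ang)          # (N, F)

N, C, Cp, F = 8, 7, 16, 8                # hypothetical sizes
patches = rng.uniform(0, 1, (N, 12, 12, C))
angles = rng.uniform(0, 1, (N, 4))
w_mix = rng.normal(0, 0.1, (C, Cp))
w_ang = rng.normal(0, 0.1, (4, F))
w_out = rng.normal(0, 0.1, (Cp + F, 1))

fused = np.concatenate([image_branch(patches, w_mix),
                        angle_branch(angles, w_ang)], axis=1)  # (N, C'+F)
aod = fused @ w_out                       # (N, 1) regression head
```

The concatenation is the only point where the two modalities meet, which is why each branch can be tailored to its own input type.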

The core component of the SPAODnet framework is the UpSampling layer with Channel Attention module (Up-CA). This module is specifically designed to further exclude the influence of noise such as sub-pixel clouds and their corresponding shadows. Deep learning networks usually employ global average pooling or global max pooling layers to aggregate global features, which are extremely susceptible to noise points, such as sub-pixel clouds in low-resolution remote sensing imagery, so the learned weights are easily skewed by outliers. Therefore, the Up-CA module is proposed in this study to replace the global pooling layer, aggregating spatial information with a dynamic pooling layer to lower the influence of sub-pixel clouds and their shadows. It assigns higher weights to uncontaminated pixels by learning the agreement between the pixels and the label; several examples of the expected effect are presented in Section 4.3. Our implementation is shown diagrammatically in Figure 6, and the lower part of the frame is the detailed structure of the CA-for-AOD module.

Figure 6. Diagram of the Up-CA module. As illustrated, this module utilizes a 1 × 1 convolutional layer instead of global pooling. The green and yellow blocks represent the depthwise separable convolutions for conv 3 × 3 and conv 1 × 1, respectively. σ represents the sigmoid function, ⊙ is the matrix dot product, and ⊗ denotes matrix multiplication. The convolution kernel size is 3 × 3 and the stride is 1 in the convolution operation.


The feature maps F ∈ R^(C×2H×2W) serve as the input of the UpSampling layer and the CA module. In the CA module, a parallel branch with sequential operations including a squeeze operation (1 × 1 convolutional layer), an excitation operation (ReLU), and a bottleneck structure (a fully connected layer with a sigmoid function) is combined with the general convolutional operation to obtain the weights of the different channels. The CA weights Td are element-wise multiplied with the input feature map F to obtain the weighted feature map T̃d. Here, depthwise separable convolution is applied to significantly reduce the number of convolution parameters. The CA mechanism is mathematically expressed as follows:

(16) Fd = sigmoid(conv(F))
(17) Td = sigmoid(conv(ReLU(conv(Fd))))
(18) T̃d = Td ⊙ F

where conv is the convolution operation with a filter size of 1 × 1 and ⊙ denotes element-wise multiplication. The CA reduces the impact of sub-pixel clouds on the global features by learning the importance of spatial features free of sub-pixel clouds and assigning different weights to different spatial information.
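A minimal NumPy reading of Eqs. (16)–(18) follows. The weight shapes are hypothetical, a per-pixel matrix product stands in for the 1 × 1 convolution, and the depthwise separable convolutions and up-sampling of the full Up-CA module are omitted; this sketch only illustrates the squeeze–excitation–gating chain and the final re-weighting of F.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def conv1x1(x, w):
    # A 1 x 1 convolution is per-pixel channel mixing: (H, W, C) @ (C, C')
    return x @ w

def channel_attention(F, w_squeeze, w_mid, w_excite):
    """Toy reading of Eqs. (16)-(18): squeeze, ReLU excitation through a
    bottleneck, sigmoid gating, then element-wise re-weighting of F."""
    Fd = sigmoid(conv1x1(F, w_squeeze))                                   # Eq. (16)
    Td = sigmoid(conv1x1(np.maximum(conv1x1(Fd, w_mid), 0.0), w_excite))  # Eq. (17)
    return Td * F                                                         # Eq. (18)

rng = np.random.default_rng(1)
F = rng.uniform(0, 1, (8, 8, 4))     # (H, W, C) feature map, hypothetical sizes
w1 = rng.normal(0, 0.1, (4, 4))      # squeeze weights
w2 = rng.normal(0, 0.1, (4, 2))      # bottleneck: 4 -> 2 channels
w3 = rng.normal(0, 0.1, (2, 4))      # excitation back to 4 channels
weighted = channel_attention(F, w1, w2, w3)
```

Since the gate Td lies in (0, 1), every channel of the output is a damped copy of the input; contaminated features can only be suppressed, never amplified.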

3.4. Model evaluation metrics and loss functions

Based on the assumption of a uniform aerosol distribution within a certain space, the center pixel can represent the whole input image (12 × 12). Assuming a d-dimensional input x, an m-dimensional output y, a weight matrix w, and a bias vector b, the process of AOD retrieval can be defined as the following mapping:

(19) θw,b(x): R^d → R^m

θw,b is obtained by minimizing the loss function between the ground truth (y) and the predictions (ŷ) over the training data. MSE and MAE are two of the most widely used loss functions (Chicco, Warrens, and Jurman 2021), and each has its own advantages. Instead of using either one alone, the Huber loss function, which combines the advantages of both MSE and MAE, is employed in this study; it is defined in Eq. (20). Compared with MAE and MSE, the Huber loss is less sensitive to outliers while still converging quickly.

(20) Lδ(y, f(x)) = ½(y − ŷ)²  for |y − ŷ| ≤ δ;  δ|y − ŷ| − ½δ²  otherwise.

where Lδ(y, f(x)) denotes the Huber loss, δ is a hyperparameter, y is the true value, ŷ is the predicted value of the model, and y − ŷ is the prediction error. As δ → 0, the Huber loss tends to MAE; as δ → ∞, it tends to MSE. The linear branch δ|y − ŷ| − ½δ² is chosen so that the MAE-like and MSE-like branches agree at |y − ŷ| = δ, which makes the Huber loss continuously differentiable, including at 0. The choice of δ is an adaptive selection process rather than being fixed in advance; with a research window size of 12, δ converges to 0.3 after training. As shown in Figure 7, the results suggest that the Huber loss gives better overall predictive ability for AOD retrieval.
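Eq. (20) can be written directly as a vectorized function; the default δ = 0.3 below is the value the text reports for a window size of 12.

```python
import numpy as np

def huber_loss(y, y_hat, delta=0.3):
    """Mean Huber loss of Eq. (20). delta = 0.3 as selected for the 12x12 window."""
    r = np.abs(y - y_hat)
    quad = 0.5 * r**2                    # MSE-like branch, |y - y_hat| <= delta
    lin = delta * r - 0.5 * delta**2     # MAE-like branch, |y - y_hat| > delta
    return np.where(r <= delta, quad, lin).mean()
```

At r = δ both branches evaluate to ½δ², so the loss is continuous and continuously differentiable at the switch point.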

Figure 7. Comparison of the performance of AOD retrieval on SPAODnet using different loss functions.


The evaluation metrics used in this study include the root-mean-square error (RMSE) and the coefficient of determination (R²), shown in Eqs. (21–22). The RMSE is widely accepted as an evaluation metric for the difference between satellite retrievals and ground-based measurements (Che et al. 2018).

(21) RMSE = √( (1/n) Σ_{i=1}^{n} (ŷi − yi)² )
(22) R² = 1 − Σ_i (yi − ŷi)² / Σ_i (yi − ȳ)²

where ŷi and yi represent the AODs from the satellite retrievals and the ground-based sites, respectively. The network architecture of SPAODnet is flexible and can accommodate different types of variables to model complex associations in general spatial-regression applications. Given the input x and output y variables, normalization (e.g. standardization) is required for both. Network training aims to optimize the following objective function:
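Eqs. (21)–(22) translate directly into code:

```python
import numpy as np

def rmse(y, y_hat):
    """Root-mean-square error, Eq. (21)."""
    return np.sqrt(np.mean((y_hat - y) ** 2))

def r2(y, y_hat):
    """Coefficient of determination, Eq. (22)."""
    ss_res = np.sum((y - y_hat) ** 2)        # residual sum of squares
    ss_tot = np.sum((y - y.mean()) ** 2)     # total sum of squares
    return 1.0 - ss_res / ss_tot

y = np.array([0.1, 0.4, 0.8])                # toy ground-based AODs
```

A perfect retrieval gives RMSE = 0 and R² = 1; a constant bias raises RMSE by exactly that bias while leaving the shape-dependent R² to capture how much variance is explained.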

(23) θw,b^opt = arg min_{θw,b} L(f_{θw,b}(x), y)

where θw,b^opt denotes an optimal solution for the network parameters, and the total loss function L is given in Eq. (20), depending on which interval the loss value falls into.

We use Adam as the optimizer. A sensitivity analysis was conducted to find an optimal structure (the type of network structure and the number of nodes in each layer). A grid search (Li et al. 2020) is conducted to obtain the optimal hyperparameters, including the initial learning rate and mini-batch size. Normalized initialization is used for the parameters. Once the optimal model is obtained, it can be used to make predictions. All deep learning models are evaluated on a Linux server with an Intel i7 CPU, 16 GB RAM, and NVIDIA TITAN RTX (TU102) GPUs. The full pipeline is implemented with TensorFlow 2.6.0.
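The grid search over the two tuned hyperparameters amounts to evaluating every combination and keeping the best. In the sketch below, the search grid and the stand-in validation_loss are hypothetical; in practice each call would train the network once and return its validation loss.

```python
import itertools

# Hypothetical search grid for the two tuned hyperparameters.
learning_rates = [1e-2, 1e-3, 1e-4]
batch_sizes = [32, 64, 128]

def validation_loss(lr, bs):
    """Stand-in for one full training run returning the validation loss.
    Toy surface with its minimum at (1e-3, 64)."""
    return abs(lr - 1e-3) + abs(bs - 64) / 1000.0

best_lr, best_bs = min(itertools.product(learning_rates, batch_sizes),
                       key=lambda p: validation_loss(*p))
```

With a 3 × 3 grid this is nine training runs; grid search is exhaustive but trivially parallelizable across combinations.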

4. Results and discussion

To better present the results and validate our method, we compared the accuracy of AOD retrieval and the number of retrieval points. Finally, we used the Grad-CAM algorithm to visualize the effectiveness of channel attention in aerosol retrieval under sub-pixel cloud conditions, and we conducted ablation experiments on each proposed improvement.

4.1. Overall performance of the SPAODnet model

4.1.1 The spatial-temporal variations of the retrieved AOD from SPAODnet model

The SPAODnet is an end-to-end network, which implies that the network can directly output an AOD image. Consequently, once the SPAODnet has been adequately trained, it can predict daily AOD over the entire study area, covering both dark and bright surfaces. Although a 12 × 12 km² window (12 × 12 pixels) is used for training and retrieving AOD to reduce noise, the SPAODnet model retrieves AOD with a spatial resolution of 1 km by moving the window pixel by pixel over the entire image. As shown in Figure 8, AOD with 1 km spatial resolution provides much more detail than the MODIS products with resolutions of 3 or 10 km. To better represent the spatial and temporal variability of the SPAODnet-retrieved AODs, maps of AOD across the entire study area on three different dates, 2019072 (YYYYDDD), 2019267, and 2020158, are plotted in Figure 8. True color composites of MODIS TOA reflectance at 0.5 km resolution are also shown to visualize the spatial distribution of the AOD, along with the corresponding AODs from the MODIS DT and DB algorithms. The high-accuracy MAIAC algorithm is included for comparison, and the AODs retrieved by a general CNN model, NNAero, are also presented. From Figure 8, the spatial and temporal variations of the SPAODnet-retrieved AODs closely match those of the TOA reflectance. In the retrieval process, SPAODnet addresses the issue of removing isolated noise points in AOD images retrieved under extreme conditions such as snowy weather, thick clouds, and shimmering water surfaces. To achieve this, it employs a Laplacian operator (a second-derivative operator) to compute image gradients and subsequently executes an eight-connected edge detection on the gradient image. Since the presence of clouds or water in the image leads to biased-high AOD values, calculating the gradient differences around pixels helps identify the noise points, which correspond to areas occupied by clouds or water.
Following this, the eight-connected edge detection is utilized to pinpoint edge regions, effectively eliminating isolated noise points in the image. Compared to DB and DT, MAIAC has better spatial resolution and higher spatial coverage. However, when compared to panels (r) and (e), MAIAC exhibits higher AOD values in pixels near clouds and water in urban areas.
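The noise-removal step described above (Laplacian gradients followed by a check of the 8-connected neighborhood) might look like the following sketch. The threshold, the toy AOD values, and the "no flagged 8-neighbor" rule for isolation are illustrative assumptions that simplify the paper's full eight-connected edge detection.

```python
import numpy as np

def laplacian(img):
    """8-connected second-derivative operator (sum of neighbors - 8*center)."""
    p = np.pad(img, 1, mode='edge')
    return (p[:-2, :-2] + p[:-2, 1:-1] + p[:-2, 2:] +
            p[1:-1, :-2] - 8.0 * img + p[1:-1, 2:] +
            p[2:, :-2] + p[2:, 1:-1] + p[2:, 2:])

def mask_isolated_noise(aod, thresh):
    """Flag pixels with a strong Laplacian response; a flagged pixel with no
    flagged 8-neighbor is treated as isolated noise and set to NaN."""
    flagged = np.abs(laplacian(aod)) > thresh
    p = np.pad(flagged, 1, mode='constant')
    neigh = (p[:-2, :-2] + p[:-2, 1:-1] + p[:-2, 2:] + p[1:-1, :-2] +
             p[1:-1, 2:] + p[2:, :-2] + p[2:, 1:-1] + p[2:, 2:])
    cleaned = aod.astype(float).copy()
    cleaned[flagged & (neigh == 0)] = np.nan
    return cleaned

# Toy AOD field: uniform background with one biased-high pixel (e.g. cloud residue).
aod = np.full((9, 9), 0.3)
aod[4, 4] = 2.0
cleaned = mask_isolated_noise(aod, thresh=5.0)
```

The spike's Laplacian response (|−13.6| here) far exceeds that of its neighbors (1.7), so a threshold between the two isolates exactly the contaminated pixel.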

Figure 8. (a–c) the daily MODIS TOA reflectance true color maps over the BTH region in China with 0.5 km resolution on 2019072, 2019267, and 2020158, respectively. (d–f) the daily maps of AOD retrieval derived over the BTH region using the SPAODnet model applied to MODIS-Aqua data at 0.550 μm with 1 km resolution. (g–l) the daily maps of AOD retrieval using the deep blue algorithm and the dark target algorithm applied to MODIS-Aqua with resolutions of 10 km and 3 km, respectively. (m–r) the daily maps of AOD retrieval using the deep learning model NNAero and MAIAC with 1 km resolution. The spatial distribution of the AERONET sites is represented by yellow triangles; the white areas in the maps represent no data.


Figure 8. (Continued).


To demonstrate the difference quantitatively, the AODs obtained through MAIAC, SPAODnet, the MODIS DB product, and the major AERONET sites are tabulated in Table 4, which compares the prediction accuracy of the SPAODnet model with DB (QA = 3) and MAIAC (QA = 3). It shows that the SPAODnet-retrieved AODs are more numerous and closer to the AODs observed at the AERONET sites. For all three dates, SPAODnet retrieved AODs for all the sites, while the MODIS DB product missed five of them. For the AODs missed by the MODIS DB product, although the differences between the observations and the SPAODnet retrievals were larger, the SPAODnet retrievals were not strongly biased. From the perspective of accuracy, the DB product had only one retrieval, at Beijing-CAMS on 2019267, that was better than that from SPAODnet; overall, SPAODnet outperformed it. The discrepancy between SPAODnet and DB is prominent in the northeastern portion of the region, where SPAODnet showed more detail and much higher aerosol loading, which is visually obvious. The SPAODnet model had higher spatial coverage, even slightly higher than that of the DB algorithm with QA = 1; furthermore, the prediction accuracy of the SPAODnet model reached the highest value of 83.49%. Therefore, SPAODnet can retrieve AOD with better spatial and temporal coverage than the MODIS DB method, and with better accuracy. Most of the AOD predictions from MAIAC were similar to those from the SPAODnet model, but the AOD values for the Beijing-PKU and Beijing-CAMS sites were significantly higher, indicating that the MAIAC algorithm tends to overestimate AOD values in urban areas.

Table 4. Test performances for the AODs from the MAIAC, SPAODnet, MODIS DB products, and the major AERONET sites. QA = 1, 2, and 3 represent the quality assurance levels of DB products.

4.1.2 Accuracy evaluation and comparison to the MODIS AOD products

The SPAODnet model was trained and tested using the 4176 collocated data records over the ten years from 2011 to 2020. The validation results indicate that over 91% of the retrieved AOD values fall within the EE envelopes when all data are used, as shown in Figure 9 (Chen et al. 2020; Wei et al. 2019). To further demonstrate the accuracy of the SPAODnet model, Level 1.5 daily swath products with 3 km spatial resolution (MOD04_L2 for Terra) were used. The scatterplots of the MODIS AOD products from the MAIAC, DB, DT, and DBDT algorithms against the observed AOD from AERONET are plotted in Figure 9(a)–(e). Based on the major indicators, including the ratio within EE, R², and RMSE, SPAODnet exhibited the highest accuracy. In particular, the ratio above EE was reduced to 5.51%. To match the number of validation points with the MODIS AOD products, all 4176 collocated data records, including both training and testing records, were used for this validation, which makes the SPAODnet accuracy appear overly optimistic. For a more reasonable comparison, another evaluation was conducted using independent samples comprising only 1/10 of the 4176 collocated data records, with the corresponding AODs from the MODIS products selected for comparison. The results are plotted in Figure 10. In this case, over 83% of the retrieved AODs were within the EE envelopes, which is still much better than the results obtained with the MODIS products. The R² and RMSE of the DT method improved significantly, however, because its strict screening criterion for dark surfaces excluded many points with higher error, such as bright surfaces and high aerosol loadings. Nevertheless, the DT method had a ratio above EE of 19.18%, indicating that it overestimated AODs in urban areas. Figure 10(a) indicates that the R² and RMSE metrics of the MAIAC algorithm are slightly higher than those of SPAODnet, whereas the within-EE and above-EE metrics are better for SPAODnet. This suggests that the MAIAC algorithm has relatively small prediction deviations but tends to overestimate in urban areas. Conversely, the SPAODnet model demonstrates lower overestimation rates in independent testing and performs better in urban areas. Therefore, the SPAODnet model has competitive accuracy for AOD retrieval under sub-pixel cloud conditions compared to the widely used MODIS products.

Figure 9. Density scatterplots of total samples for (a) the MAIAC algorithm, (b) the deep blue algorithm, (c) the dark target algorithm, (d) the deep blue algorithm and the dark target algorithm (DBDT) and (e) SPAODnet algorithm. The colour bars represent the density of retrieved values that rely on the grid points (his2D). The black solid line is the 1:1 line, and the red dashed lines represent the within EE lines.


Figure 10. Scatter plots with independent test for the retrieval of AOD. (a) the MAIAC algorithm, (b) the deep blue algorithm, (c) the dark target algorithm, (d) the deep blue algorithm and the dark target algorithm (DBDT) and (e) the SPAODnet algorithm.


4.1.3 Comparison to the state-of-the-art method

Compared to the widely used traditional methods such as DB and DT, MAIAC and deep learning methods have been developed more recently and are considered state-of-the-art in this study. Among the deep learning methods, the Neural Network for AEROsol retrieval (NNAero) (Chen et al. 2020) is a self-designed network based on the CNN architecture that can simultaneously retrieve AOD and FMF with a spatial resolution of 1 km, providing much more detailed information than MODIS products with a resolution of 10 km. Moreover, the accuracy of NNAero-retrieved AOD is slightly higher than that of the MODIS DB and DT algorithms over northern and eastern China. The NNAero retrieval strategy inspired this research, and we compared the AOD retrievals from SPAODnet and NNAero. To ensure data consistency, we fed the SPAODnet training dataset into a self-constructed NNAero model based on Chen's work. The ratios within EE are 69.86% for our reimplementation and 68% in the literature, respectively, but Chen's model did not report R² and RMSE. The difference in the within-EE ratio of the self-constructed NNAero model may be attributable to not using a data augmentation procedure. Figure 11 shows the comparison between the NNAero model and the SPAODnet model using the testing records. Our model (Figure 11(b)) has a much lower above-EE ratio of 7.18% than NNAero, which demonstrates the effectiveness of our proposed method for sub-pixel clouds.

Figure 11. Performance comparison between the (a) NNAero model and (b) SPAODnet model.


Both the SPAODnet and the NNAero models are based on CNN architecture. The major modifications of the SPAODnet over the NNAero include 1) the preprocessing of TOA reflectance to minimize the impact of sub-pixel clouds and cloud shadows, 2) the use of a Huber loss function that combines the advantages of MSE and MAE, 3) the proper selection of backbone and window size for the SPAODnet model, and 4) a channel attention mechanism for AOD that strengthens the weight of important feature channels.

4.2. Retrievable number of aerosols

In addition to the accuracy of the retrieved AOD, we also evaluate the performance of the SPAODnet model based on the number of retrievable aerosol observations over ten years: the more AODs that can be retrieved, the better the usability and capability of SPAODnet. Figure 12(a) displays the retrievable number of AODs over the ten AERONET sites from the DB algorithm with QA = 1, 2, and 3, alongside the proposed SPAODnet, and Figure 12(b) displays the corresponding AOD site coverage. For most sites, SPAODnet's retrievable number is higher than that from the DB algorithm, especially with QA = 1, where a ~10% increase in retrievable numbers is achieved. SPAODnet performs best over the XiangHe and Beijing-CAMS sites, where the retrievable number is significantly higher than that from the DB algorithm with QA = 2 and 3; in particular, SPAODnet's retrievable number is 50% higher than that from DB with QA = 3. In terms of the number of retrievable aerosol observations within ten years, our proposed method retrieves more usable AODs while reducing the impact of sub-pixel clouds in low-resolution images, resulting in better usability and capability for applications.

Figure 12. The histograms of the retrievable number and the AOD site coverage from the DB algorithm and SPAODnet in the BTH region during 2011–2020. (a) shows the number of retrievable aerosol observations. (b) shows the corresponding AOD site coverage.


Figure 12. (Continued).


4.3. The visualization of the attention mechanism in the SPAODnet network

A channel attention mechanism is introduced to reinforce the weights of channels with important features and suppress the weights of channels with minor features. The effectiveness of the channel attention mechanism for AOD retrieval was visualized using Grad-CAM. Figure 13 displays the attention maps generated by the Grad-CAM algorithm for four scenes with different appearances of clouds or sub-pixel clouds. The feature map depicts the distribution of weights after training the model, where dark blue pixels represent lower weights and yellow regions represent higher weights, indicating the greater importance of the corresponding regions for AOD retrieval.

Figure 13. The daily of MODIS TOA reflectance true colour maps and the attention maps derived from Grad-CAM. A lighter colour indicates a higher weight and yellow indicates the highest weight.


It can be seen that the channel attention module Up-CA effectively extracts important information from these scenes. Regions with possible cloud contamination are identified as minor features and given lower weights. Additionally, the scattered bright points that may represent sub-pixel clouds at the top of Figure 13 are also located as minor features by the SPAODnet network, which reflects the effectiveness of the incorporated channel attention mechanism.
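Grad-CAM itself reduces to a small computation once a layer's activations and the gradients of the network output with respect to them are available. The sketch below assumes both arrays are precomputed (in practice they come from the deep learning framework's autodiff); it implements the standard recipe of channel weights via global average pooling of the gradients, a weighted sum of feature maps, and a ReLU.

```python
import numpy as np

def grad_cam(feature_maps, grads):
    """Grad-CAM heat map from one layer's activations and the gradients of the
    scalar output w.r.t. those activations, both shaped (H, W, K)."""
    alpha = grads.mean(axis=(0, 1))                        # channel weights via GAP
    cam = np.maximum((feature_maps * alpha).sum(axis=-1), 0.0)  # ReLU(weighted sum)
    return cam / cam.max() if cam.max() > 0 else cam       # normalize to [0, 1]

# Toy example: channel 0 is active at one location and has positive gradients,
# so the heat map should peak there.
feature_maps = np.zeros((4, 4, 2))
feature_maps[1, 1, 0] = 1.0
grads = np.zeros((4, 4, 2))
grads[..., 0] = 1.0
heat = grad_cam(feature_maps, grads)
```

Channels whose activation raises the predicted AOD get positive weights and light up; channels associated with contaminated pixels receive low or negative weights and are zeroed by the ReLU, matching the dark regions in the attention maps.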

Bilateral filtering improves the distribution of sub-pixel clouds in the TOA images, and the Up-CA attention mechanism extracts cloud-free features in the model for correspondence with the observation stations. The “above EE” evaluation metric is used to quantify “overestimated AOD,” where retrieved AOD values exceeding the confidence interval range are considered overestimated. The effect of bilateral filtering was evaluated through an ablation experiment, with the quantification of “overestimated AOD” shown in the “above EE” column of Table 5, which also presents the relative contribution of each measure to the overall improvement.

Table 5. The ablation experiment of SPAODnet. “above EE” is the metric used to quantify the overestimation caused by sub-pixel clouds.

5. Conclusions

In this study, the SPAODnet model is proposed to retrieve AOD in the BTH region by integrating satellite observations and ground-measured AOD from AERONET sites. To improve AOD retrieval accuracy, the SPAODnet model addresses three essential issues while constructing a nonlinear regression prediction network. The first issue is addressed by using a bilateral filter to pre-process the TOA reflectance in order to reduce the noise induced by sub-pixel clouds and cloud shadows, which significantly improves the quality of the model inputs. The second issue is addressed by employing the Huber loss function. The last issue is addressed by adding a channel attention mechanism to the backbone network, which slightly improves performance by reinforcing the contribution of important features and suppressing minor ones. Compared to other CNN-based models such as NNAero, the SPAODnet model achieves a 10–15% improvement in accuracy. Additionally, compared to traditional methods such as the DB and DT MODIS AOD products, the SPAODnet model achieves a within-EE ratio of over 80%, with an accuracy 5–10% higher than those of the MODIS products. Moreover, the predicted AOD from SPAODnet has a spatial resolution of 1 km, obtained by moving the window pixel by pixel over the entire image. In terms of removing sub-pixel clouds, employing bilateral filtering on the TOA reflectance reduced the above-EE metric by 4%, and the Up-CA attention mechanism improved it by a further 3%. Removing sub-pixel clouds also increased the spatial coverage of the AOD. Therefore, through the simultaneous consideration of nonlinearity and spatial correlation, the SPAODnet model achieves excellent performance in spatial resolution, spatial coverage, and AOD retrieval accuracy.

Furthermore, the SPAODnet model is trained using over ten years of collocated data from both satellite and ground-measured AOD, making it a viable option for predicting AODs and their spatial distribution within a certain region without additional work. Based on these results, we believe that the proposed SPAODnet model can be considered an effective method for retrieving AODs from satellite observations. In future work, we intend to employ the SPAODnet model to produce global AOD with higher accuracy, better spatial coverage, and higher resolution. Additionally, we plan to incorporate an additional feature extraction branch along the time dimension to capture the spatio-temporal properties of aerosols simultaneously. Coupling with a neural network that focuses on sequential data, such as a Transformer model, may further improve the accuracy of the retrieved AOD.

Acknowledgments

The authors acknowledge the MODIS and AERONET groups for the satellite and ground-based remote sensing data, and would like to thank SONET and AERSS for sharing their data and making them available to the community.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Data availability statement

The data that support the findings of this study are available upon request from the corresponding author, or can be accessed through https://ladsweb.modaps.eosdis.nasa.gov/ and https://aeronet.gsfc.nasa.gov/cgi-bin/webtool_aod_v3.

Additional information

Funding

The work was supported by the National Key Research and Development Program of China under Grants [2019YFE0197800] and [2020YFE0200700].

References

  • Anderson, T. L., R. J. Charlson, S. E. Schwartz, R. Knutti, O. Boucher, H. Rodhe, and J. Heintzenberg. 2003. “Climate Forcing by Aerosols–A Hazy Picture.” Science 300 (5622): 1103–25.
  • Ångström, A. 1929. “On the Atmospheric Transmission of Sun Radiation and on Dust in the Air.” Geografiska Annaler 11 (2): 156–166.
  • Bian, Z., B. Cao, H. Li, Y. Du, J. Lagouarde, Q. Xiao, and Q. Liu. 2018. “An Analytical Four-Component Directional Brightness Temperature Model for Crop and Forest Canopies.” Remote Sensing of Environment 209:731–746. https://doi.org/10.1016/j.rse.2018.03.010
  • Bian, Z., S. Wu, J. Roujean, B. Cao, H. Li, G. Yin, Y. Du, Q. Xiao, and Q. Liu. 2022. “A TIR Forest Reflectance and Transmittance (FRT) Model for Directional Temperatures with Structural and Thermal Stratification.” Remote Sensing of Environment 268:112749. https://doi.org/10.1016/j.rse.2021.112749
  • Bilal, M., J. E. Nichol, M. P. Bleiweiss, and D. Dubois. 2013. “A Simplified High Resolution MODIS Aerosol Retrieval Algorithm (SARA) for Use Over Mixed Surfaces.” Remote Sensing of Environment 136 (4): 135–145. https://doi.org/10.1016/j.rse.2013.04.014
  • Chen, X., G. de Leeuw, A. Arola, S. Liu, Y. Liu, L. Zhengqiang, and K. Zhang. 2020. “Joint Retrieval of the Aerosol Fine Mode Fraction and Optical Depth Using MODIS Spectral Reflectance Over Northern and Eastern China: Artificial Neural Network Method.” Remote Sensing of Environment 249:112006. https://doi.org/10.1016/j.rse.2020.112006
  • Chen, B., Y. Yang, C. Tong, J. Deng, K. Wang, and Y. Hong. 2022. “A Novel Big Data Mining Framework for Reconstructing Large-Scale Daily MAIAC AOD Data Across China from 2000 to 2020.” GIScience & Remote Sensing 59 (1): 670–685.
  • Che, Y., Y. Xue, J. Guang, L. She, and J. Guo. 2018. “Evaluation of the AVHRR DeepBlue Aerosol Optical Depth Dataset Over Mainland China.” Isprs Journal of Photogrammetry & Remote Sensing 146 (9): 74–90. https://doi.org/10.1016/j.isprsjprs.2018.09.004
  • Chicco, D., M. J. Warrens, and G. Jurman. 2021. “The Coefficient of Determination R-Squared is More Informative Than SMAPE, MAE, MAPE, MSE and RMSE in Regression Analysis Evaluation.” Peer Journal Computer Science 5 (7): e623. https://doi.org/10.7717/peerj-cs.623
  • Eck, T. F., B. N. Holben, J. S. Reid, O. Dubovik, A. Smirnov, N. T. O’neill, I. Slutsker, and S. Kinne. 1999. “Wavelength Dependence of the Optical Depth of Biomass Burning, Urban, and Desert Dust Aerosols.” Journal of Geophysical Research Atmospheres 104 (D24): 31333–31349.
  • Gao, L., L. Chen, L. Chengcai, L. Jun, H. Che, and Y. Zhang. 2021. “Evaluation and Possible Uncertainty Source Analysis of JAXA Himawari-8 Aerosol Optical Depth Product Over China.” Atmospheric Research 248 (9): 105248. https://doi.org/10.1016/j.atmosres.2020.105248
  • Giles, D. M., A. Sinyuk, M. G. Sorokin, J. S. Schafer, A. Smirnov, I. Slutsker, T. F. Eck, B. N. Holben, J. R. Lewis, and J. R. Campbell. 2019. “Advancements in the Aerosol Robotic Network (AERONET) Version 3 Database–Automated Near-Real-Time Quality Control Algorithm with Improved Cloud Screening for Sun Photometer Aerosol Optical Depth (AOD) Measurements.” Atmospheric Measurement Techniques 12 (1): 169–209.
  • Guo, J., X. Chen, S. Tianning, L. Liu, Y. Zheng, D. Chen, L. Jian, X. Hui, L. Yanmin, and H. Bingfang. 2020. “The Climatology of Lower Tropospheric Temperature Inversions in China from Radiosonde Measurements: Roles of Black Carbon, Local Meteorology, and Large-Scale Subsidence.” Journal of Climate 33 (21): 9327–9350.
  • Gupta, P., R. C. Levy, S. Mattoo, L. A. Remer, and L. A. Munchak. 2016. “A Surface Reflectance Scheme for Retrieving Aerosol Optical Depth Over Urban Surfaces in MODIS Dark Target Retrieval Algorithm.” Atmospheric Measurement Techniques 9 (7): 3293–3308.
  • Hsu, N. C., M. ‐. J. Jeong, C. Bettenhausen, A. M. Sayer, R. Hansell, C. S. Seftor, J. Huang, and S. ‐. C. Tsay. 2013. “Enhanced Deep Blue Aerosol Retrieval Algorithm: The Second Generation.” Journal of Geophysical Research Atmospheres 118 (16): 9296–9315.
  • Hsu, N. C., S.-C. Tsay, M. D. King, and J. R. Herman. 2004. “Aerosol Properties Over Bright-Reflecting Source Regions.” IEEE Transactions on Geoscience & Remote Sensing 42 (3): 557–569.
  • Hsu, N. C., S.-C. Tsay, M. D. King, and J. R. Herman. 2006. “Deep Blue Retrievals of Asian Aerosol Properties During ACE-Asia.” IEEE Transactions on Geoscience & Remote Sensing 44 (11): 3180–3195.
  • Ichoku, C., D. Allen Chu, S. Mattoo, Y. J. Kaufman, L. A. Remer, D. Tanré, I. Slutsker, and B. N. Holben. 2002. “A Spatio‐Temporal Approach for Global Validation and Analysis of MODIS Aerosol Products.” Geophysical Research Letters 29 (12): MOD1–MOD–4.
  • Jackson, J. M., H. Liu, I. Laszlo, S. Kondragunta, L. A. Remer, J. Huang, and H. Huang. 2013. “Suomi‐NPP VIIRS Aerosol Algorithms and Data Products.” Journal of Geophysical Research Atmospheres 118 (22): 12,673–12,689.
  • Kang, Y., M. Kim, E. Kang, D. Cho, and J. Im. 2022. “Improved Retrievals of Aerosol Optical Depth and Fine Mode Fraction from GOCI Geostationary Satellite Data Using Machine Learning Over East Asia.” ISPRS Journal of Photogrammetry and Remote Sensing 183: 253–268. https://doi.org/10.1016/j.isprsjprs.2021.11.016
  • Kaufman, Y. J., D. Tanré, L. A. Remer, E. F. Vermote, A. Chu, and B. N. Holben. 1997. “Operational Remote Sensing of Tropospheric Aerosol Over Land from EOS Moderate Resolution Imaging Spectroradiometer.” Journal of Geophysical Research Atmospheres 102 (D14): 17051–17067.
  • Kim, J., U. Jeong, M. H. Ahn, J. H. Kim, R. J. Park, H. Lee, C. H. Song, Y. S. Choi, K. H. Lee, and J. M. Yoo. 2020. “New Era of Air Quality Monitoring from Space: Geostationary Environment Monitoring Spectrometer (GEMS).” Bulletin of the American Meteorological Society 101 (1): E1–E22. https://doi.org/10.1175/BAMS-D-18-0013.1
  • Kim, J., J.-M. Yoon, M. H. Ahn, B. J. Sohn, and H. S. Lim. 2008. “Retrieving Aerosol Optical Depth Using Visible and Mid‐IR Channels from Geostationary Satellite MTSAT‐1R.” International Journal of Remote Sensing 29 (21): 6181–6192.
  • Levy, R. C., L. A. Remer, D. Tanré, S. Mattoo, and Y. J. Kaufman. 2009. “Algorithm for Remote Sensing of Tropospheric Aerosol Over Dark Targets from MODIS: Collections 005 and 051: Revision 2.” MODIS Algorithm Theoretical Basis Document. https://api.semanticscholar.org/CorpusID:17499618
  • Levy, R. C., L. A. Remer, and O. Dubovik. 2007. “Global Aerosol Optical Properties and Application to Moderate Resolution Imaging Spectroradiometer Aerosol Retrieval Over Land.” Journal of Geophysical Research Atmospheres 112 (D13): 210. https://doi.org/10.1029/2006JD007811
  • Liang, T., L. Sun, and L. Haoxin. 2021. “MODIS Aerosol Optical Depth Retrieval Based on Random Forest Approach.” Remote Sensing Letters 12 (2): 179–189.
  • Li, J., B. E. Carlson, and A. A. Lacis. 2015. “How Well Do Satellite AOD Observations Represent the Spatial and Temporal Variability of PM2.5 Concentration for the United States?” Atmospheric Environment 102: 260–273. https://doi.org/10.1016/j.atmosenv.2014.12.010
  • Li, L., Y. Fang, J. Wu, J. Wang, and Y. Ge. 2020. “Encoder–Decoder Full Residual Deep Networks for Robust Regression and Spatiotemporal Estimation.” IEEE Transactions on Neural Networks and Learning Systems 32 (9): 4217–4230.
  • Lin, C. Q., C. C. Li, Z. Y. Alexis Kai Hon Lau, X. C. Lu, K. Tim Tse, J. Chi Hung Fung, L. Ying, T. Yao, and S. Lin. 2016. “Assessment of Satellite-Based Aerosol Optical Depth Using Continuous Lidar Observation.” Atmospheric Environment 140:273–282. https://doi.org/10.1016/j.atmosenv.2016.06.012
  • Lin, C., L. Ying, Z. Yuan, A. K. Lau, C. Li, and J. C. Fung. 2015. “Using Satellite Remote Sensing Data to Estimate the High-Resolution Distribution of Ground-Level PM2.5.” Remote Sensing of Environment 156: 117–128. https://doi.org/10.1016/j.rse.2014.09.015
  • Li, Z. Q., X. G. Xia, M. R. Cribb, M. Wen, P. Wang, B. N. Holben, H. B. Chen, S.-C. Tsay, T. F. Eck, and F. S. Zhao. 2008. “Aerosol Optical Properties and Their Radiative Effects in Northern China.” Journal of Geophysical Research-Atmospheres 112: D22S01. https://doi.org/10.1029/2006JD007382
  • Lyapustin, A., Y. Wang, S. Korkin, and D. Huang. 2018. “MODIS Collection 6 MAIAC Algorithm.” Atmospheric Measurement Techniques 11 (10): 5741–5765.
  • Martins, J. V., D. Tanré, L. Remer, Y. Kaufman, S. Mattoo, and R. Levy. 2002. “MODIS Cloud Screening for Remote Sensing of Aerosols Over Oceans Using Spatial Variability.” Geophysical Research Letters 29 (12): MOD4-1–MOD4-4.
  • Motohka, T., K. Nishida Nasahara, K. Murakami, and S. Nagai. 2011. “Evaluation of Sub-Pixel Cloud Noises on MODIS Daily Spectral Indices Based on in situ Measurements.” Remote Sensing 3 (8): 1644–1662.
  • Nagai, S., N. Saigusa, H. Muraoka, and K. Nishida Nasahara. 2010. “What Makes the Satellite‐Based EVI–GPP Relationship Unclear in a Deciduous Broad‐Leaved Forest?” Ecological Research 25 (2): 359–365.
  • Pérez-Ramírez, D., D. N. Whiteman, I. Veselovskii, P. Colarco, M. Korenski, and A. da Silva. 2019. “Retrievals of Aerosol Single Scattering Albedo by Multiwavelength Lidar Measurements: Evaluations with NASA Langley HSRL-2 During Discover-AQ Field Campaigns.” Remote Sensing of Environment 222 (12): 144–164. https://doi.org/10.1016/j.rse.2018.12.022
  • Pérez-Ramírez, D., D. N. Whiteman, I. Veselovskii, M. Korenski, P. R. Colarco, and A. M. da Silva. 2020. “Optimized Profile Retrievals of Aerosol Microphysical Properties from Simulated Spaceborne Multiwavelength Lidar.” Journal of Quantitative Spectroscopy & Radiative Transfer 246: 106932. https://doi.org/10.1016/j.jqsrt.2020.106932
  • Platnick, S., M. D. King, S. A. Ackerman, W. Paul Menzel, B. A. Baum, J. C. Riédi, and R. A. Frey. 2003. “The MODIS Cloud Products: Algorithms and Examples from Terra.” IEEE Transactions on Geoscience & Remote Sensing 41 (2): 459–473.
  • Remer, L. A., Y. J. Kaufman, D. Tanré, S. Mattoo, D. A. Chu, J. Vanderlei Martins, R.-R. Li, C. Ichoku, R. C. Levy, and R. G. Kleidman. 2005. “The MODIS Aerosol Algorithm, Products, and Validation.” Journal of the Atmospheric Sciences 62 (4): 947–973.
  • Saha, P. K., J. K. Udupa, and D. Odhner. 2000. “Scale-Based Fuzzy Connected Image Segmentation: Theory, Algorithms, and Validation.” Computer Vision and Image Understanding 77 (2): 145–174.
  • Su, T., I. Laszlo, Z. Li, J. Wei, and S. Kalluri. 2020. “Refining Aerosol Optical Depth Retrievals Over Land by Constructing the Relationship of Spectral Surface Reflectances Through Deep Learning: Application to Himawari-8.” Remote Sensing of Environment 251: 112093. https://doi.org/10.1016/j.rse.2020.112093
  • Su, T., J. Li, C. Li, A. K. H. Lau, D. Yang, and C. Shen. 2017. “An Intercomparison of AOD-Converted PM2.5 Concentrations Using Different Approaches for Estimating Aerosol Vertical Distribution.” Atmospheric Environment 166: 531–542. https://doi.org/10.1016/j.atmosenv.2017.07.054
  • Van Donkelaar, A., R. V. Martin, and R. J. Park. 2006. “Estimating Ground‐Level PM2.5 Using Aerosol Optical Depth Determined from Satellite Remote Sensing.” Journal of Geophysical Research Atmospheres 111: D21201. https://doi.org/10.1029/2005JD006996
  • van Donkelaar, A., R. V. Martin, R. J. Spurr, E. Drury, L. A. Remer, R. C. Levy, and J. Wang. 2013. “Optimal Estimation for Global Ground‐Level Fine Particulate Matter Concentrations.” Journal of Geophysical Research Atmospheres 118 (11): 5621–5636.
  • Wang, L., Y. Chao, K. Cai, F. Zheng, and S. Li. 2020. “Retrieval of Aerosol Optical Depth from the Himawari-8 Advanced Himawari Imager Data: Application Over Beijing in the Summer of 2016.” Atmospheric Environment 241: 117788. https://doi.org/10.1016/j.atmosenv.2020.117788
  • Wang, Y., Q. Yuan, T. Li, H. Shen, L. Zheng, and L. Zhang. 2019. “Large-Scale MODIS AOD Products Recovery: Spatial-Temporal Hybrid Fusion Considering Aerosol Variation Mitigation.” ISPRS Journal of Photogrammetry and Remote Sensing 157: 1–12. https://doi.org/10.1016/j.isprsjprs.2019.08.017
  • Wei, J., L. Sun, B. Huang, M. Bilal, Z. Zhang, and L. Wang. 2018. “Verification, Improvement and Application of Aerosol Optical Depths in China Part 1: Inter-Comparison of NPP-VIIRS and Aqua-MODIS.” Atmospheric Environment 175 (11): 221–233. https://doi.org/10.1016/j.atmosenv.2017.11.048
  • Wei, J., Z. Li, Y. Peng, L. Sun, and X. Yan. 2019. “A Regionally Robust High-Spatial-Resolution Aerosol Retrieval Algorithm for MODIS Images Over Eastern China.” IEEE Transactions on Geoscience & Remote Sensing 57 (7): 4748–4757.
  • Winker, D. M., J. R. Pelon, and M. P. McCormick. 2003. “CALIPSO Mission: Spaceborne Lidar for Observation of Aerosols and Clouds.” Paper presented at Lidar Remote Sensing for Industry and Environment Monitoring III.
  • Winker, D. M., M. A. Vaughan, A. Omar, Y. Hu, K. A. Powell, Z. Liu, W. H. Hunt, and S. A. Young. 2009. “Overview of the CALIPSO Mission and CALIOP Data Processing Algorithms.” Journal of Atmospheric and Oceanic Technology 26 (11): 2310–2323.
  • Young, S. A., M. A. Vaughan, A. Garnier, J. L. Tackett, J. D. Lambeth, and K. A. Powell. 2018. “Extinction and Optical Depth Retrievals for CALIPSO’s Version 4 Data Release.” Atmospheric Measurement Techniques 11 (10): 5701–5727.
  • Zhang, T., Y. Zhou, K. Zhao, Z. Zhu, G. R. Asrar, and X. Zhao. 2022. “Gap-Filling MODIS Daily Aerosol Optical Depth Products by Developing a Spatiotemporal Fitting Algorithm.” GIScience & Remote Sensing 59 (1): 762–781.
  • Zhong, B., S. Wu, A. Yang, and Q. Liu. 2017. “An Improved Aerosol Optical Depth Retrieval Algorithm for Moderate to High Spatial Resolution Optical Remotely Sensed Imagery.” Remote Sensing 9 (6): 555.