Research Article

An innovative lightweight 1D-CNN model for efficient monitoring of large-scale forest composition: a case study of Heilongjiang Province, China

Article: 2271246 | Received 08 Jul 2023, Accepted 11 Oct 2023, Published online: 10 Nov 2023

ABSTRACT

Large-scale forest composition mapping and change monitoring are essential for regional and national forest resource management, monitoring, and carbon stock assessment. However, existing large-scale mapping methods fall short in both efficiency and accuracy. To address this limitation, this study proposes a lightweight one-dimensional convolutional neural network (LW-CNN) model for forest composition mapping. The LW-CNN model is developed using Landsat imagery covering 470,700 km2 obtained from Google Earth Engine (GEE) for two periods (2007 and 2018). The proposed LW-CNN is compared with a visual geometry group with 16 convolutional layers (VGG16), a residual network with 34 convolutional layers (Resnet34), and a residual network with 50 convolutional layers (Resnet50) in terms of model accuracy and efficiency. The factors influencing forest composition change are analyzed using the structural equation model (SEM). The results show that the proposed LW-CNN model outperforms the other three models in accuracy, achieving a mean overall accuracy (OA) of 0.75, and improves efficiency 7–22-fold. Forest composition changed over 29.6% of the total forest area between 2007 and 2018. The SEM results show that climate factors have the most significant effect on forest composition change. This study presents an innovative model for large-scale forest composition mapping that is both efficient and accurate. It also provides insights into the factors that affect forest composition change, which could be valuable for forest resource management, monitoring, and carbon stock assessment.

1. Introduction

Large-scale forest composition mapping is a key indicator for various areas of scientific research and their applications (Ghosh and Joshi Citation2014; Pu Citation2021; Xi et al. Citation2021), such as forest carbon stock and biomass estimation (Schlund, Scipal, and Davidson Citation2017), biodiversity assessment (Wallis et al. Citation2017), forest fire monitoring (Calviño-Cancela et al. Citation2017), regional and national forest resources management, and decision-making. Heilongjiang Province covers an area of approximately 470,700 km2, with forests occupying 49% of the land area (data from the Department of Natural Resources of Heilongjiang Province (http://www.hljlr.gov.cn)). Heilongjiang Province is also an important forest district and timber production base in China. In the last 20 years, urbanization, agricultural modernization, and reforestation projects have been vigorously promoted, having a great impact on the forest composition in Heilongjiang Province. Therefore, it is valuable to map and monitor changes in the large-scale forest composition in Heilongjiang Province.

Satellite remote sensing is a key and relatively low-cost technology for forest composition mapping at a large scale due to its large area coverage and repetitive data collection. Medium-spatial-resolution remote sensing data obtained by Landsat have been widely used in large-scale forest parameter estimation and classification in a very cost-effective way (Grabska, Frantz, and Ostapowicz Citation2020; Konrad Turlej, Ozdogan, and Radeloff Citation2022). However, in large-scale classification tasks, it is time-consuming and laborious to acquire and process massive image data. In recent years, the Google Earth Engine (GEE) platform has been regarded as a good tool for large-scale classification because it provides access to and download of massive datasets and can process massive geospatial data on a high-performance cloud computing platform (Gorelick et al. Citation2017).

Large-scale mapping requires an effective and accurate predictive model. Machine learning (ML)-based models can perform nonlinear classification of high-dimensional data, making them effective for large-scale mapping. Widely used ML-based models include random forest (RF) (Belgiu and Drăguţ Citation2016; Fedrigo et al. Citation2018), support vector machine (SVM) (Pham and Brabyn Citation2017), and extreme gradient boosting (XGBoost) (Georganos et al. Citation2018). Deep learning (DL)-based algorithms have stronger prediction power and better generalization performance than traditional ML-based algorithms, and thus can handle more complex tasks and be applied to machine translation, speech recognition, image recognition, and other fields (Ma et al. Citation2019). Recently, convolutional neural networks (CNNs), as a branch of DL-based models, have been used in data classification-related studies, achieving higher classification performance than the RF and SVM (Kattenborn et al. Citation2021; Mäyrä et al. Citation2021; Yoo et al. Citation2019). Different from the shallow networks (e.g. backpropagation (BP) neural network) developed over the past 20 years, the DL-based algorithms represented by the CNNs are characterized by a significant increase in the number of network layers and parameters, which provides more abstract and deeper features and can reveal more complex hierarchical relationships between data through convolutional layers (Kattenborn et al. Citation2021). In terms of the number of network layers, CNNs have grown progressively deeper. For example, the visual geometry group (VGG) network has reached a maximum depth of 19 layers and a top-5 accuracy of 90.2% on the ImageNet dataset (Simonyan and Zisserman Citation2014). The Resnet model proposed in 2015 has as many as 152 layers and a top-5 accuracy of 94.4% on the ImageNet dataset, which was achieved in the Large Scale Visual Recognition Challenge 2015 (ILSVRC2015) (He et al. Citation2016). Although the aforementioned DL-based models can achieve good predictive performance, their numbers of parameters have increased significantly, and they require large amounts of training data and impose a large computational burden, particularly in large-scale mapping. This poses a great challenge to computing capability and impedes the practical implementation of these models. In this context, developing a lightweight CNN model (i.e. a simple model structure with a small number of model parameters) for large-scale mapping could be advantageous for improving the model runtime and reducing the computer load.

For CNN models, obtaining labeled datasets is a fundamental but labor-intensive task. Typically, two- and three-dimensional CNNs use multi-channel or hyper-channel RGB images with labels as input data. In the field of computer vision, there are already many well-labeled benchmark datasets (e.g. visual object classes (VOC) (Everingham et al. Citation2010) and common objects in context (COCO) datasets (Lin et al. Citation2014)), which have been used as generic datasets in numerous studies to compare the classification performance of different models. For land cover classification, there are also a number of datasets, such as the Indian Pines, Pavia, and Salinas datasets (Mäyrä et al. Citation2021). However, for forest composition and tree species classification, it is challenging to delineate labeled image datasets by visual interpretation because the differences and boundaries between the categories are very blurred. This can lead to a large error between the visually interpreted datasets and the true labels (Xi et al. Citation2019). In addition, due to the limitation of spatial resolution, DL-based algorithms have rarely been applied to medium-resolution images in large-scale applications (Xi et al. Citation2021). Fortunately, a one-dimensional (1D) CNN model supports the input of a single spectral signal, eliminating the need for laborious and inaccurate sample production and providing an innovative approach for large-scale forest composition mapping.

Furthermore, it is crucial to examine the impact of various driving factors, including natural factors, climate conditions, atmospheric quality, and human activities, on the alteration of forest composition (Hansen et al. Citation2013). Currently, from a methodological perspective, there are three main types of methods for analyzing the driving factors (or driving forces hereinafter): correlation coefficient methods (Asuero, Sayago, and González Citation2006), geographical detectors (Song et al. Citation2020), and structural equation models (SEMs) (Bagozzi and Yi Citation2012). Among them, SEMs can estimate the relationships between multiple difficult-to-observe latent variables (e.g. human activities) and the observed variables (e.g. temperature and precipitation), and can also estimate both the structure and the relationships of driving factors (Bagozzi and Yi Citation2012). Therefore, it is worth exploring how to conduct the driving factor analysis and determine the interactions between driving factors of large-scale forest composition changes based on the SEM, which could help to make scientific decisions on forest management.

This study aims to develop an efficient CNN model to improve the accuracy and efficiency of large-scale forest composition mapping. The specific contributions of this study are as follows: (1) an innovative lightweight 1D-CNN model suitable for large-scale forest composition mapping is developed; (2) changes in the forest composition in Heilongjiang province are determined and used to analyze the factors affecting the forest composition change. This study provides a referable framework for large-scale mapping and change detection of forest composition using DL models.

2. Materials and methods

2.1. Study area

The study area was located in northeastern China, and it ranged from 121°11’ E to 135°05’ E and from 43°26’ N to 53°33’ N, as shown in Figure 1. It covered an area of 470,700 km2 and included 12 cities and one region (the Daxing’anling region). In addition, it had a temperate and sub-frigid zone continental monsoon climate. According to the third land survey of China, the land cover types of Heilongjiang Province are mainly forests and farmland, which account for approximately 49% and 34% of the total land cover, respectively. The study area had typical temperate broadleaf forest and sub-frigid zone coniferous forest, consisting of Korean pine (Pinus koraiensis Sieb. et Zucc.), larch (Larix gmelinii (Rupr.) Kuzen), Manchurian ash (Fraxinus mandshurica Rupr.), Manchurian walnut (Juglans mandshurica Maxim.), white birch (Betula platyphylla Suk.), Mongolian oak (Quercus mongolica Fisch. ex Ledeb.), Miyabe maple (Acer miyabei Maxim.), and Amur cork tree (Phellodendron amurense Rupr.).

Figure 1. Location of the study area and the distribution of sample plots in 2018. The base map is a Landsat 8 OLI true-color image in the WGS-84 geographic coordinate system. China map examination No. is GS (2019) 1822.


2.2. Data and Preprocessing

2.2.1. Field inventory data

In this study, field inventory data from two periods (2,906 plots in 2007 and 2,813 plots in 2018) based on sample plots from the National Forest Inventory (NFI) were used. Each sample plot was 0.06 ha (24.5 m × 24.5 m) in size, and its exact location was recorded using a handheld Global Positioning System (GPS) with an accuracy of ±5 m. In each sample plot, the species, height, and diameter at breast height (DBH) of every tree with a DBH of at least 5 cm were recorded. Based on the basal area of each tree species (Wensel, Levitan, and Barber Citation1980), a plot containing only one species, or one in which a single species accounted for more than 70% of the basal area, was considered pure forest, while the others were considered mixed forest.
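The pure/mixed rule can be illustrated with a minimal sketch. It assumes that the basal area of a stem is computed from its DBH as a circular cross-section and that the 70% threshold applies to the dominant species' share of the plot's basal area; the function names and the example plots are hypothetical and not taken from the NFI data.

```python
import math
from collections import defaultdict


def basal_area_m2(dbh_cm):
    """Basal area of one stem in m^2, assuming a circular cross-section."""
    radius_m = dbh_cm / 200.0  # cm diameter -> m radius
    return math.pi * radius_m ** 2


def classify_plot(trees, threshold=0.70, min_dbh_cm=5.0):
    """Label a plot as pure or mixed from a list of (species, dbh_cm) tuples.

    Assumption: a plot is 'pure' when the dominant species holds more than
    `threshold` of the total basal area of stems with DBH >= min_dbh_cm.
    """
    totals = defaultdict(float)
    for species, dbh_cm in trees:
        if dbh_cm >= min_dbh_cm:
            totals[species] += basal_area_m2(dbh_cm)
    plot_total = sum(totals.values())
    if plot_total == 0:
        return ("empty", None)
    dominant, area = max(totals.items(), key=lambda kv: kv[1])
    if area / plot_total > threshold:
        return ("pure", dominant)
    return ("mixed", None)


# Hypothetical plots for illustration only.
plot_a = [("larch", 30.0), ("larch", 28.0), ("white birch", 10.0)]
plot_b = [("larch", 20.0), ("white birch", 20.0), ("Mongolian oak", 20.0)]
label_a = classify_plot(plot_a)
label_b = classify_plot(plot_b)
```

In this toy case, larch dominates the basal area of the first plot, while the second plot has three species with equal shares and is therefore labeled mixed.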

2.2.2. Optical imagery data and preprocessing

In this study, remote sensing data collected during two periods (2007: Landsat 5 TM and 2018: Landsat 8 OLI) were downloaded using the GEE cloud computing platform, and the selected period from June to October (i.e. growing season) had less than 5% cloudiness.

A total of 86 Landsat 5 TM imagery data were acquired in 2007, including six multispectral bands with a 30-m spatial resolution. Due to the unfavorable quality of some images, eight Landsat 5 TM imagery datasets from 2008 were used as supplements for 2007. Similarly, 97 Landsat 8 OLI imagery data were acquired in 2018, including seven multispectral bands with a 30-m spatial resolution. Nine Landsat 8 OLI imagery datasets collected in 2017 were used to supplement the data collected in 2018.

The preprocessing of Landsat 5 TM and Landsat 8 OLI imagery data included four steps: atmospheric correction, topography correction, geometric correction, and mosaic. These steps were conducted to obtain surface reflectance over the study area, which was implemented on the GEE cloud computing platform. There was a total of 51,676 × 37,691 pixels for a 30-m spatial resolution.

2.2.3. Auxiliary driving factor data

In this study, a total of 23 auxiliary driving factors were used, including seven climate factors, three topography factors, six human activity factors, and seven atmospheric quality factors. Since the forest composition change represents a long-term process, the climate factors and human activity factors for a total of 12 years (from 2007 to 2018) were obtained. In contrast, the topography factors remained almost constant over several decades, and thus, their data were obtained only for the two periods in 2007 and 2018.

The meteorological data were obtained from the China Meteorological Data Service Center (CMDC) (http://data.cma.cn/). A total of 14 stations in Heilongjiang Province were included in the data collection process, and seven annual average meteorological indicators were observed, including mean temperature (0.1°C), maximum temperature (0.1°C), minimum temperature (0.1°C), precipitation (0.1 mm), humidity (0.1% rh), barometric pressure (0.1 Pa), and maximum wind speed (0.1 m/s). As the Universal Kriging method has good performance in meteorological factor interpolation (Hofstra et al. Citation2008), it was selected to generate the meteorological images with a 30-m spatial resolution within the boundary of the study area.

The digital elevation model (DEM) with a 30-m spatial resolution was obtained from the Shuttle Radar Topography Mission (SRTM) (earthexplorer.usgs.gov/). The slope (°) and aspect (°) of the study area were generated using the DEM in the WGS-84 geographic coordinate system by ArcGIS 10.2 (ESRI, Redlands, CA, USA).

The human activity factors included the gross output value of agriculture, forestry, fishery and livestock, gross output value of forestry, gross output value of food crops, nighttime light index (NLI), distance to settlements, and population density. The DMSP-OLS NLI was obtained from synthetic products with a 1-km spatial resolution (Wu et al. Citation2022); the factor of distance to the settlements was determined by the shortest path analysis tool in ArcGIS 10.2 (ESRI, Redlands, CA, USA). The rest of the data were obtained from national statistical data (http://www.stats.gov.cn).

A total of seven atmospheric quality factors were used, including the atmospheric carbon dioxide concentration (CO2), atmospheric ammonium nitrogen wet deposition (NH4+_w), atmospheric nitrate nitrogen wet deposition (NO3_w), atmospheric particulate ammonium dry deposition (NH4+_d), atmospheric particulate nitrate dry deposition (NO3_d), gaseous nitrogen dioxide dry deposition (NO2_d), and gaseous ammonia dry deposition (NH3_d). These data were obtained from the National Ecological Science Data Center (http://www.cnern.org.cn/) and had a spatial resolution of 10 km × 10 km (Jia et al. Citation2019a, Citation2019b). All human activity factors and atmospheric quality factors were resampled to a 30-m raster image within the boundary of the study area.

2.3. Method

This study implemented a three-step process, as shown in Figure 2. The three steps were as follows: (1) collect and process Landsat imagery data obtained from the GEE platform for two periods (i.e. 2007 and 2018); (2) construct the LW-CNN model and compare its accuracy and efficiency with those of the VGG16, Resnet34, and Resnet50 models; and (3) map the spatial distribution of forest composition for the two periods and conduct the driving factor analysis based on the SEM.

Figure 2. Workflow of this study.


2.3.1. Classification scheme and sample selections

The main forest composition was classified based on the sample plots of 2007 and 2018, consisting of 2,906 and 2,813 plots, respectively. The classification scheme took into account the sample size (i.e. the number of sample plots) and the separability of the forest composition categories. White birch, larch, and Mongolian oak forests, which have large distribution areas, were each treated as a separate category, whereas Korean pine, fir, spruce, and Mongolian pine forests, which cover small areas, were grouped as other coniferous forests. Hard broadleaf forests include Manchurian ash, Manchurian walnut, and Amur cork tree; these three types are of significant ecological and economic value in northeastern China. Table 1 shows the classification scheme and sample sizes in 2007 and 2018.

Table 1. Classification scheme and sample size.

To verify the separability of the collected samples, this study employed the Jeffries-Matusita (J-M) distance (Yanchen and Jinfeng Citation2004), which quantifies the difference between categories and takes values in the range [0, 2]. A J-M value greater than 1.9 indicated that the samples were well separated, whereas a J-M value less than 1.7 indicated low separability.
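As an illustration of how a J-M value in [0, 2] arises, the following sketch computes the distance for a single band. It assumes Gaussian class distributions, for which the J-M distance is 2(1 − e^(−B)) with B the Bhattacharyya distance; the study itself presumably used the multivariate form over several features, and the reflectance samples below are hypothetical.

```python
import math
from statistics import mean, variance


def bhattacharyya_1d(a, b):
    """Bhattacharyya distance between two 1D samples, assuming Gaussians."""
    m1, m2 = mean(a), mean(b)
    v1, v2 = variance(a), variance(b)
    # Term driven by the difference between class means.
    term_mean = 0.25 * (m1 - m2) ** 2 / (v1 + v2)
    # Term driven by the difference between class variances.
    term_var = 0.5 * math.log((v1 + v2) / (2.0 * math.sqrt(v1 * v2)))
    return term_mean + term_var


def jm_distance(a, b):
    """Jeffries-Matusita distance; 0 = identical classes, 2 = fully separable."""
    return 2.0 * (1.0 - math.exp(-bhattacharyya_1d(a, b)))


# Hypothetical single-band reflectance samples for two categories.
class_a = [0.10, 0.11, 0.12, 0.09]
class_b = [0.40, 0.41, 0.39, 0.42]
jm_separable = jm_distance(class_a, class_b)
jm_identical = jm_distance(class_a, class_a)
```

With these toy samples, the two classes barely overlap, so the distance approaches the upper bound of 2, whereas comparing a class with itself gives 0.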

2.3.2. Feature extraction and selection

To exploit remote sensing information for forest composition identification fully, 98 pixel-based feature variables of two periods in 2007 and 2018 were extracted from the Landsat 5 TM and Landsat 8 OLI imagery, respectively. The feature variables consisted of seven original bands, the gray-level co-occurrence matrix (GLCM) with 56 texture features, 10 band combination features, 10 image enhancement features, and 15 vegetation index features. The GLCM was extracted using a window size of 5 × 5 pixels and a stride of one (Jia, Zhou, and Li Citation2012), while the remaining features were derived based on pixels. The details of the 98 extracted features are shown in Appendix A. Since the spatial distribution of forest composition might be related to the climate and topography factors, seven meteorological features (i.e. temperature, maximum temperature, minimum temperature, precipitation, humidity, barometric pressure, and maximum wind speed) and three topographical features (i.e. the DEM, slope, and aspect) were used for forest composition classification.

Although DL-based algorithms can handle a large number of input features, some features can be redundant for classification and will increase the computational effort, especially for large-scale prediction. Therefore, the feature selection was performed to search for an optimal feature subset using the recursive feature elimination (RFE) method, which represents a greedy algorithm that removes the lowest importance features by repeatedly calculating the score of each feature subset (Ghosh and Joshi Citation2014). It also allows users to specify and find an optimal number of features to reduce the computational load during large-scale predictions. The feature selection procedure was performed using the Sklearn package in Python 3.7.

2.3.3. Classification algorithms

In the experiments, four 1D-CNN models were tested to compare their efficiency and accuracy in large-scale forest composition classification, including the VGG16 model, the Resnet34 model, the Resnet50 model, and the proposed lightweight 1D-CNN model, which is denoted by LW-CNN in the following text.

2.3.3.1. Classification models

  • VGG16

The VGG16 model performs well in image classification; it has a clean and succinct structure and consists of 13 convolutional layers with a kernel size of 3 × 3 and three fully connected layers (Simonyan and Zisserman Citation2014). Moreover, this model employs multiple convolutional layers with a small kernel size of 3 × 3 instead of one convolutional layer with a large kernel size, which allows for fewer parameters and better nonlinear mapping, which further increases the fitting capacity of the network. The structure of the VGG16 model is shown in Appendix B.

  • Resnet34 and Resnet50

The Resnet models are composed of many residual blocks, and they can handle hyper-deep convolutional networks (convolutional layer > 1,000) with residual blocks and solve the gradient disappearance or gradient explosion during model fitting (He et al. Citation2016). The Resnet34 model consists of one convolutional layer with a kernel size of 7 × 7, one Maxpooling layer with a size of 3 × 3, and 32 convolutional layers. The Resnet50 model is composed of one convolutional layer, one Maxpooling layer, and 48 convolutional layers. Detailed information on the two models is given in Appendix C.

  • LW-CNN

To reduce the number of parameters and increase the model efficiency, this study develops a lightweight 1D-CNN model named the LW-CNN model by exploiting the benefits of both the VGG16 model and the Resnet models. The structure of the LW-CNN model is shown in Figure 3.

Figure 3. LW-CNN structure. Conv1D is a one-dimensional convolutional layer, K is kernel size, and out is the number of outputs.


The LW-CNN uses seven one-dimensional convolution layers with a kernel size of 3 × 3 to extract abstract features and two maxpooling layers with a stride of two to perform downsampling and reduce redundant information. A flatten layer unrolls the feature maps into a one-dimensional array, and a dense layer maps the flattened result to the output space. To reduce the model parameters, the LW-CNN model uses fewer convolutional layers than the VGG16 and Resnet models. In addition, the residual structure (i.e. Block 1 and Block 2) is used in the LW-CNN model, which enhances the ability of the convolutional layer to extract features by adding the features from before the convolution to its output (He et al. Citation2016). The ReLU activation function is selected to improve the computational efficiency, and it is defined as follows:

$$
f(x) =
\begin{cases}
x, & x > 0 \\
0, & x \le 0
\end{cases}
\tag{1}
$$

The ReLU function sets values less than zero to zero, making the network sparse and thus improving the computational speed. Values greater than zero remain unchanged, which avoids gradient saturation and vanishing gradients.
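The behavior of Equation (1) can be sketched in a few lines. This is an illustrative scalar implementation, not the TensorFlow operation used in the study; the helper `relu_grad` is added here only to show why positive inputs do not saturate.

```python
def relu(x):
    """ReLU as in Equation (1): pass positive values, zero out the rest."""
    return x if x > 0 else 0.0


def relu_grad(x):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if x > 0 else 0.0


inputs = [-2.0, -0.5, 0.0, 0.5, 2.0]
activations = [relu(v) for v in inputs]
gradients = [relu_grad(v) for v in inputs]
```

Negative and zero inputs produce zero activations (the source of sparsity), while positive inputs keep a constant gradient of one.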

To preserve edge information and keep the input and output sizes consistent during deep network training, zero padding is applied. To accelerate training and prevent vanishing and exploding gradients, Batch Normalization (BN) (Ioffe and Szegedy Citation2015) is used to normalize the intermediate data as follows:

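$$
\begin{aligned}
\mu &= \frac{1}{m}\sum_{i=1}^{m} x_i \\
\sigma^2 &= \frac{1}{m}\sum_{i=1}^{m} (x_i - \mu)^2 \\
\hat{x}_i &= \frac{x_i - \mu}{\sqrt{\sigma^2 + \varepsilon}} \\
y_i &= \lambda \hat{x}_i + \beta \equiv \mathrm{BN}_{\lambda,\beta}(x_i)
\end{aligned}
\tag{2}
$$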

where $x_i$ denotes the input data; $m$ is the total number of input data samples; $\mu$ and $\sigma^2$ are the mean value and variance of the input data, respectively; $\hat{x}_i$ is the normalized value of $x_i$; $\varepsilon$ is a constant with a very small value used to prevent the denominator from being zero; $\lambda$ and $\beta$ are the scale and shift parameters applied to the output, which the CNN model adjusts automatically during training.
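The four steps of Equation (2) can be traced on a small batch. The sketch below is a plain-Python illustration for a single feature, not the TensorFlow layer used in the study; the default values of `lam`, `beta`, and `eps` are assumptions chosen for the example.

```python
import math


def batch_norm(xs, lam=1.0, beta=0.0, eps=1e-5):
    """Apply Equation (2) to one mini-batch of scalar activations."""
    m = len(xs)
    mu = sum(xs) / m                                   # batch mean
    var = sum((x - mu) ** 2 for x in xs) / m           # batch variance
    x_hat = [(x - mu) / math.sqrt(var + eps) for x in xs]  # normalize
    return [lam * xh + beta for xh in x_hat]           # scale and shift


batch = [1.0, 2.0, 3.0, 4.0]
normalized = batch_norm(batch)
shifted = batch_norm(batch, lam=2.0, beta=3.0)

norm_mean = sum(normalized) / len(normalized)
norm_var = sum((v - norm_mean) ** 2 for v in normalized) / len(normalized)
shifted_mean = sum(shifted) / len(shifted)
```

With the default scale and shift, the output has approximately zero mean and unit variance; setting `lam` and `beta` moves the output to the scale the network learns.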

2.3.3.2. Hyperparameter tuning

In DL-based algorithms, hyperparameters (e.g. epochs, batch size, and learning rate) can significantly affect the model performance and computational efficiency, so it is critical to specify these parameters. Two-step hyperparameter tuning was used in this study. First, an approximate range of hyperparameters was manually set, and then the optimal values of the hyperparameters were found using the grid search method (Genuer, Poggi, and Tuleau-Malot Citation2010), which exhaustively evaluated the different combinations of hyperparameters. The results of hyperparameter tuning are presented in Table 2.
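The exhaustive step of the grid search can be sketched as follows. The grid values and the scoring function `toy_validation_score` are hypothetical placeholders: in practice the score would come from training a model with the given hyperparameters and evaluating it on the validation set.

```python
from itertools import product


def grid_search(grid, score_fn):
    """Evaluate every combination in `grid` and return the best one."""
    names = list(grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score > best_score:  # ties keep the first combination found
            best_params, best_score = params, score
    return best_params, best_score


def toy_validation_score(params):
    """Placeholder score (assumption): peaks at lr=0.001 and batch size 64."""
    lr_penalty = abs(params["learning_rate"] - 0.001) * 100
    batch_penalty = abs(params["batch_size"] - 64) / 64
    return -lr_penalty - batch_penalty


# Hypothetical manually chosen ranges (the first tuning step).
grid = {
    "learning_rate": [0.01, 0.001, 0.0001],
    "batch_size": [32, 64, 128],
    "epochs": [100, 200],
}
best_params, best_score = grid_search(grid, toy_validation_score)
n_combinations = len(list(product(*grid.values())))
```

The cost grows with the product of the list lengths (18 combinations here), which is why narrowing the ranges manually first keeps the search tractable.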

Table 2. Hyperparameter tuning of the four models for data collected in 2007 and 2018.

Further, the number of parameters was calculated by Equation (3), and the running time was obtained to compare the efficiency of the four 1D-CNN models.

$$
\mathrm{Para} = \left(K \times K \times \mathrm{In\_c} + \mathrm{Bias}\right) \times \mathrm{Out\_c}
\tag{3}
$$

In Equation (3), Para represents the number of model parameters; K is the convolutional kernel size; In_c is the number of input channels; Bias is set to one by default; Out_c represents the number of output channels.
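A short worked example may help. It reads Equation (3) as (K × K × In_c + Bias) × Out_c, which is an assumption about the intended grouping of terms, and the layer sizes below are hypothetical rather than those of the LW-CNN.

```python
def conv_params(k, in_c, out_c, bias=1):
    """Parameter count of one convolutional layer, reading Equation (3) as
    (K * K * In_c + Bias) * Out_c (assumed grouping)."""
    return (k * k * in_c + bias) * out_c


def total_params(layers):
    """Sum the parameter counts of a list of (k, in_c, out_c) layers."""
    return sum(conv_params(k, in_c, out_c) for k, in_c, out_c in layers)


# Hypothetical layer sizes for illustration only.
single_layer = conv_params(3, 64, 128)
small_stack = total_params([(3, 1, 16), (3, 16, 32)])
```

For the single layer, each of the 128 output channels holds 3 × 3 × 64 weights plus one bias, giving 577 × 128 = 73,856 parameters.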

The four 1D-CNN models (i.e. VGG16, Resnet34, Resnet50, and LW-CNN models) were implemented in Python 3.7 and TensorFlow 1.14 on a computer with an AMD R7 3700X CPU, GTX 1080 8 G GPU, and 32-GB RAM.

2.3.4. Accuracy assessment

Due to the unbalanced distribution of sample sizes (refer to Table 1), a stratified scheme was used to assess the classification accuracy for samples in each stratum (i.e. category). To ensure a robust assessment of model performance, the samples of each stratum were partitioned into three subsets consisting of 60%, 20%, and 20% of the data, which were used for training, validation, and testing, respectively. The training dataset was used to train the model, the validation dataset was used for tuning the model hyperparameters, and the independent test dataset was used to evaluate the models after training (Ripley Citation2007). The user accuracy (UA), producer accuracy (PA), overall accuracy (OA), and F1-score were used to assess the classification accuracy of the models. The confusion matrix was also used to determine the numbers of correctly and incorrectly classified samples for each category. The accuracy assessment metrics were calculated as follows:

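$$
\mathrm{UA}_i = \frac{N_i}{N_{i+}}
\tag{4}
$$

$$
\mathrm{PA}_i = \frac{N_i}{N_{+i}}
\tag{5}
$$

$$
\mathrm{F1\text{-}score} = \frac{2 \times \mathrm{UA}_i \times \mathrm{PA}_i}{\mathrm{UA}_i + \mathrm{PA}_i}
\tag{6}
$$

$$
\mathrm{OA} = \frac{\sum_{i=1}^{k} N_i}{N},
\tag{7}
$$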

where $N$ is the total number of samples; $k$ is the number of categories; $N_i$ is the number of correctly classified samples in category $i$; $N_{i+}$ and $N_{+i}$ are the number of samples predicted to be in category $i$ and the actual number of samples in category $i$, respectively.
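Equations (4) to (7) can be computed directly from a confusion matrix. The sketch below assumes that rows hold the predicted category and columns the reference category, so that row totals correspond to the samples predicted to be in a category; the two-class matrix is hypothetical.

```python
def accuracy_metrics(cm):
    """UA, PA, and F1 per category plus OA from a square confusion matrix.

    Assumed layout: cm[predicted][reference].
    """
    k = len(cm)
    n = sum(sum(row) for row in cm)
    ua, pa, f1 = [], [], []
    for i in range(k):
        correct = cm[i][i]
        predicted_total = sum(cm[i])                       # row total
        reference_total = sum(cm[r][i] for r in range(k))  # column total
        ua_i = correct / predicted_total if predicted_total else 0.0
        pa_i = correct / reference_total if reference_total else 0.0
        f1_i = 2 * ua_i * pa_i / (ua_i + pa_i) if (ua_i + pa_i) else 0.0
        ua.append(ua_i)
        pa.append(pa_i)
        f1.append(f1_i)
    oa = sum(cm[i][i] for i in range(k)) / n
    return {"UA": ua, "PA": pa, "F1": f1, "OA": oa}


# Hypothetical two-category confusion matrix.
confusion = [
    [40, 10],  # predicted category 0
    [5, 45],   # predicted category 1
]
metrics = accuracy_metrics(confusion)
```

In this toy matrix, 85 of 100 samples lie on the diagonal, so the OA is 0.85, while the UA and PA of each category differ because the row and column totals differ.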

2.3.5. Forest composition mapping and driving factor analysis

2.3.5.1. Forest composition mapping and changes analysis

In this study, forest composition mapping was performed in two steps. (1) The forests in 2007 and 2018 were extracted by classifying the land cover types. The samples were selected through visual interpretation in accordance with the principles of representativeness and uniformity of distribution; the sample sizes for each category are shown in Appendix D. The land cover types were classified using the RF algorithm (Belgiu and Drăguţ Citation2016), with the number of trees (n_estimator) set to 1,000 and the random seed (random_state) set to 10. (2) Based on the forests extracted in the previous step, the forest compositions of Heilongjiang Province in 2007 and 2018 were classified using the optimal CNN model. The change in the forest composition between 2007 and 2018 was determined based on the transfer matrix. The error matrix, a straightforward cross-tabulation between the classification results of remote sensing data and the reference data, provided the groundwork for estimating the area of changed classes (Wickham et al. Citation2023). In this study, the error matrix was used to evaluate the accuracy of forest composition changes in the transfer matrix, where the diagonal elements represented the proportion of consistent areas between unchanged sample types and unchanged classification results (Olofsson et al. Citation2014).
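The transfer matrix can be illustrated by cross-tabulating the category of each pixel in the two periods. The sketch below assumes the two classified maps are given as aligned label sequences; the labels and the five-pixel example are hypothetical.

```python
def transfer_matrix(labels_t1, labels_t2, classes):
    """Count pixels moving from each category (rows, earlier period)
    to each category (columns, later period)."""
    index = {c: i for i, c in enumerate(classes)}
    k = len(classes)
    matrix = [[0] * k for _ in range(k)]
    for before, after in zip(labels_t1, labels_t2):
        matrix[index[before]][index[after]] += 1
    return matrix


def changed_fraction(matrix):
    """Share of pixels off the diagonal, i.e. whose category changed."""
    total = sum(sum(row) for row in matrix)
    unchanged = sum(matrix[i][i] for i in range(len(matrix)))
    return (total - unchanged) / total


# Hypothetical labels for five pixels in the two periods.
classes = ["LA", "WB", "MO"]
labels_2007 = ["LA", "LA", "WB", "MO", "WB"]
labels_2018 = ["LA", "WB", "WB", "MO", "MO"]
matrix = transfer_matrix(labels_2007, labels_2018, classes)
changed = changed_fraction(matrix)
```

Diagonal cells count unchanged pixels, and off-diagonal cells count transitions between categories; multiplying the counts by the pixel area would give the changed area.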

2.3.5.2. Establishment and analysis of driving factors based on SEM

In this study, the SEM was employed to quantify the effects of climate factors, human activity factors, topography factors, and atmospheric quality factors on forest composition changes (refer to Appendix E). The SEM is a multivariate statistical method that can be used for factor analysis, regression, path analysis, and simultaneous equation modeling and has great advantages in causal analysis. Moreover, it can handle both directly observed variables (i.e. observed variables) and variables that cannot be directly measured (i.e. latent variables) and is also suitable for discrete variables (Bagozzi and Yi Citation2012).

Next, 2,813 sample plots were used, and the values of changes in the observed variables were extracted for each sample point. The Theil-Sen median method was used to determine the changing trend of each factor. This method was selected because it is suitable for trend analysis of long time series, is insensitive to outliers, and is robust in trend calculation (Sayemuzzaman and Jha Citation2014). This method is defined as follows:

$$
\beta = \mathrm{Median}\left(\frac{x_j - x_i}{j - i}\right), \quad j > i, \quad i = 1, 2, 3, \ldots, N,
\tag{8}
$$

where $x_j$ and $x_i$ are the values of a factor at times $j$ and $i$ in the time series, arranged in time order; Median is the median function; $\beta > 0$ indicates an increasing trend; and $\beta < 0$ indicates a decreasing trend.
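Equation (8) can be sketched directly as the median of all pairwise slopes. The series below is hypothetical and includes one deliberate outlier to show the robustness mentioned above; equally spaced time steps are assumed.

```python
from statistics import median


def theil_sen_slope(series):
    """Theil-Sen slope of an equally spaced time series (Equation (8))."""
    n = len(series)
    slopes = [
        (series[j] - series[i]) / (j - i)
        for i in range(n)
        for j in range(i + 1, n)
    ]
    return median(slopes)


# Hypothetical annual values with one outlier at the fourth time step.
values = [1.0, 2.0, 3.0, 100.0, 5.0, 6.0]
beta = theil_sen_slope(values)
declining = theil_sen_slope([6.0, 5.0, 4.0, 3.0])
```

Ten of the fifteen pairwise slopes are unaffected by the outlier, so the median still recovers the underlying trend of one unit per step.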

To verify the validity of samples for the SEM, this study calculated the Kaiser-Meyer-Olkin (KMO) measure and the p-value of Bartlett’s sphericity test by exploratory factor analysis (EFA) (Yuan and Bentler Citation2006). The model fit was evaluated using the chi-square/df, adjusted goodness of fit index (AGFI), comparative fit index (CFI), and root-mean-square error of approximation (RMSEA) as evaluation metrics (Bagozzi and Yi Citation2012). When the chi-square/df was between 1 and 3, the AGFI and CFI values were larger than 0.9, and the RMSEA value was less than 0.05, the model was considered suitable.

3. Results

3.1. Sample separability

The J-M distance was calculated for the data obtained in 2007 and 2018 to verify the separability of the samples, as shown in Figure 4. The J-M distance values between all samples were larger than 1.7, indicating that the samples from the two periods were well separable. From the perspective of forest composition, the J-M distances between larch forests and most broadleaf forests (i.e. white birch forests, Mongolian oak forests, and mixed coniferous-broadleaf forests) were larger than 1.9, indicating that these forest compositions have good separability. However, most of the J-M distances between 1.7 and 1.8 corresponded to pairs of broadleaf forests, indicating that the separability of their samples was not as strong as that between coniferous and broadleaf forests.

Figure 4. Spectral separability analysis of the eight forest compositions based on the J-M distance. (a) separability of samples collected in 2007; (b) the separability of samples collected in 2018. Note: LA: larch forests; OC: other coniferous forests; WB: white birch forests; MO: Mongolian oak forests; SB: soft broadleaf forests; HB: hard broadleaf forests; BM: mixed broadleaf forests; CB: mixed coniferous-broadleaf forests.


The spectral variability of the eight categories is illustrated in Figure 5. Generally, the reflectance of the eight categories differed significantly while following consistent trends in each band. White birch forests (WB) and mixed coniferous-broadleaf forests (CB) had high reflectance, whereas larch forests (LA) had low reflectance. In some specific bands, such as the blue and green bands, there were small spectral variations in soft broadleaf forests (SB) and hard broadleaf forests (HB). Further, there were only slight spectral variations between SB and Mongolian oak forests (MO) in the green and SWIR bands. White birch forests had much higher spectral reflectance than the other categories.

Figure 5. Boxplots of spectral variability of the eight categories. LA: larch forests; OC: other coniferous forests; WB: white birch forests; MO: Mongolian oak forests; SB: soft broadleaf forests; HB: hard broadleaf forests; BM: mixed broadleaf forests; CB: mixed coniferous-broadleaf forests.


3.2. Feature selection

The RFE method was used to identify the optimal feature subsets for the four 1D-CNN models. Figure 6 shows how the accuracy of the four 1D-CNN models changed with the number of input features. The results indicated that the accuracy of the four 1D-CNN models initially increased with the number of input features. When the number of features reached 46 and 43 for data collected in 2007 and 2018, respectively, the accuracy gradually stabilized. This indicated that although the CNN models could support high-dimensional input data, the model accuracy could not be further increased once the number of features reached a certain value; beyond that point, additional features only caused data redundancy and extra computational effort. The optimal feature subsets and feature importance rankings are shown in Appendix F. Based on the results, the spectral features of the original bands were ranked highly, and the GLCM features were most frequently selected. Furthermore, climate factors, such as temperature and maximum wind speed, also played a significant role in forest composition classification.

Figure 6. Effect of the number of input features on the accuracy of the four 1D-CNN models: (a) data from 2007; (b) data from 2018.
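The elimination loop and the choice of a plateau point can be sketched as follows. This is a toy illustration under stated assumptions, not the study's pipeline: in practice the model is retrained at each step, whereas here the hypothetical callbacks `importance_fn` and `score_fn` stand in for training and evaluation, and the feature names, weights, and tolerance are invented.

```python
def rfe_history(features, importance_fn, score_fn):
    """Drop the least important feature one at a time, recording the score at each size."""
    remaining = list(features)
    history = []
    while remaining:
        history.append((list(remaining), score_fn(remaining)))
        importances = importance_fn(remaining)  # maps feature name -> importance
        weakest = min(remaining, key=lambda f: importances[f])
        remaining.remove(weakest)
    return history


def smallest_stable_subset(history, tolerance=0.005):
    """Smallest subset whose score is within `tolerance` of the best score seen."""
    best = max(score for _, score in history)
    candidates = [subset for subset, score in history if best - score <= tolerance]
    return min(candidates, key=len)


# Toy stand-ins: each feature contributes a fixed amount to a made-up accuracy
WEIGHTS = {"nir": 0.30, "swir1": 0.25, "glcm_mean": 0.20,
           "temperature": 0.15, "noise_a": 0.0, "noise_b": 0.0}


def toy_importance(subset):
    return {f: WEIGHTS[f] for f in subset}


def toy_score(subset):
    return sum(WEIGHTS[f] for f in subset)


history = rfe_history(list(WEIGHTS), toy_importance, toy_score)
selected = smallest_stable_subset(history)
print(selected)
```

In this toy setup the two uninformative features are removed without any loss in score, so the selected subset is the point at which adding more features stops helping.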


3.3. Classification performance

The performance of the four 1D-CNN models (i.e. LW-CNN, VGG16, Resnet34, and Resnet50 models) was tested using the optimal feature subset and hyperparameters for data collected in 2007 and 2018, as shown in Figure 7. For the data from 2007, the LW-CNN model had the highest classification accuracy among all models. For example, the LW-CNN, Resnet50, VGG16, and Resnet34 models achieved OA values of 0.74, 0.73, 0.69, and 0.60, respectively, and this result was consistent with that for data from 2018, as shown in Figure 7. Compared to the accuracy of the Resnet50, Resnet34, and VGG16 models, that of the LW-CNN model was improved by 1%, 5%, and 14% for the data from 2007 and by approximately 4%, 9%, and 24% for the data from 2018, respectively. This result indicated that the LW-CNN model could achieve favorable classification performance when using selected features and one-dimensional convolution layers.

Figure 7. Confusion matrices of the four 1D-CNN models (i.e. VGG16, Resnet34, Resnet50, and LW-CNN). (a)–(d) data from 2007; (e)–(h) data from 2018; LA: larch forests; OC: other coniferous forests; WB: white birch forests; MO: Mongolian oak forests; SB: soft broadleaf forests; HB: hard broadleaf forests; BM: mixed broadleaf forests; CB: mixed coniferous-broadleaf forests. PA: producer’s accuracy; UA: user’s accuracy. Note: the numbers in italics are overall accuracy.


From the perspective of forest composition, the four 1D-CNN models had stable classification accuracy for larch forests and mixed broadleaf forests, as shown in Figure 7. The PA of the four models for larch forests ranged from 0.59 to 0.76, while the UA ranged from 0.59 to 0.79. For the other coniferous forests (PA: 0.44–0.68; UA: 0.39–0.63) and mixed coniferous-broadleaf forests (PA: 0.29–0.63; UA: 0.51–0.82), the four models performed poorly. The low classification accuracy of the four models for these categories was mainly due to the low spectral variability, which increased the possibility of misclassifying samples, and to the small sample size, which decreased the accuracy.
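The OA, PA, and UA values discussed here are all derived from a confusion matrix. The sketch below shows the standard calculation, assuming rows hold the reference labels and columns hold the predicted labels; that orientation, the function name `accuracy_metrics`, and the two-class counts are assumptions for illustration and are not taken from Figure 7.

```python
def accuracy_metrics(matrix):
    """Return (OA, PA per class, UA per class) for a square confusion matrix.

    Assumes matrix[i][j] counts samples of reference class i predicted as class j.
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(n))
    row_totals = [sum(row) for row in matrix]  # reference totals
    col_totals = [sum(matrix[r][c] for r in range(n)) for c in range(n)]  # predicted totals
    # Producer's accuracy: share of each reference class that was correctly mapped
    pa = [matrix[i][i] / row_totals[i] if row_totals[i] else float("nan") for i in range(n)]
    # User's accuracy: share of each predicted class that is actually correct
    ua = [matrix[i][i] / col_totals[i] if col_totals[i] else float("nan") for i in range(n)]
    return correct / total, pa, ua


# Hypothetical two-class counts, not taken from the study
oa, pa, ua = accuracy_metrics([[8, 2], [1, 9]])
print(round(oa, 2), [round(v, 2) for v in pa], [round(v, 2) for v in ua])
```

A class can have a high UA and a low PA at the same time, which is why both are reported for each forest composition.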

3.4. Forest composition mapping and change detection

The RF model was used to classify forest and non-forest data samples (i.e. grassland, water, wetland, building, farmland, and others, including unused land and small cloud areas). The classification performance of the land use types using RF model is presented in Appendix G, and the area of each category is shown in Appendix H. The classification of the land cover types had high accuracy (OA of 0.90 for data from 2007; OA of 0.92 for data from 2018), and the F1-score of forest class was also high (2007: 0.92 and 2018: 0.96).

The change in the forest composition from the northwest to the southeast in the study area is presented in Figure 8. The coniferous forests (i.e. larch forests and other coniferous forests) were mainly distributed at high latitudes in the northwestern part of the study area, which belongs to the sub-frigid zone. As the latitude decreased, the area of mixed broadleaf forests gradually increased. This could be because latitude influenced the temperature and precipitation, which led to the spatial variability in the distribution of forest composition. Based on the results in Figure 8, there was a slight decline in the total forest area from 2007 to 2018. The areas of the mixed broadleaf forests and larch forests were similar, constituting approximately 54% of the total forest area.

Figure 8. Spatial distribution of forest composition in Heilongjiang Province. (a) data from 2007; (b) data from 2018; (c) histogram of the estimated area. WB: white birch forests; MO: Mongolian oak forests; LA: larch forests; OC: other coniferous forests; CB: mixed coniferous-broadleaf forests; HB: hard broadleaf forests; BM: mixed broadleaf forests; SB: soft broadleaf forests.


Table 3 lists the transfer matrix of the forest composition change in the study area. The changed forest covered 68,995.5 km2 (approximately 29.6% of the total forest area), and mixed broadleaf forests changed the most among the forest compositions. The areas that changed from mixed broadleaf forests to white birch forests and Mongolian oak forests were 8,392.5 km2 (3.6%) and 7,396.2 km2 (3.2%), respectively. The areas that changed from white birch forests and Mongolian oak forests to mixed broadleaf forests were 6,986.2 km2 (3.0%) and 9,080.5 km2 (3.9%), respectively. The changes from mixed broadleaf forests to white birch forests or Mongolian oak forests could be due to natural regeneration and succession. Since there was only a small possibility of natural succession between Mongolian oak forests and white birch forests, this change could be due to human activity factors or classification errors.

Table 3. Transfer matrix for the forest type change in the study area (unit: km2).
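The percentages quoted for the individual transitions can be cross-checked against the reported changed area. The sketch below backs out the total forest area implied by 68,995.5 km2 being 29.6% of it, then recomputes each share; because 29.6% is itself rounded, the implied total is approximate. The helper `changed_fraction` and the dictionary keys are hypothetical names, and the small matrix at the end is invented for illustration.

```python
# Figures quoted in the text
changed_km2 = 68995.5
changed_share = 0.296  # rounded in the source, so the total below is approximate
implied_total_km2 = changed_km2 / changed_share

transitions_km2 = {
    "BM_to_WB": 8392.5,  # mixed broadleaf -> white birch
    "BM_to_MO": 7396.2,  # mixed broadleaf -> Mongolian oak
    "WB_to_BM": 6986.2,  # white birch -> mixed broadleaf
    "MO_to_BM": 9080.5,  # Mongolian oak -> mixed broadleaf
}
shares_pct = {name: round(100.0 * area / implied_total_km2, 1)
              for name, area in transitions_km2.items()}
print(round(implied_total_km2), shares_pct)


def changed_fraction(transfer):
    """Share of area off the diagonal of a square transfer matrix."""
    total = sum(sum(row) for row in transfer)
    unchanged = sum(transfer[i][i] for i in range(len(transfer)))
    return (total - unchanged) / total


# Hypothetical 2x2 transfer matrix (km2), for illustration only
example = changed_fraction([[70.0, 10.0], [20.0, 100.0]])
```

The recomputed shares agree with the rounded percentages given in the text, which suggests they were all calculated relative to the same total forest area.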

To assess the detection accuracy of forest composition area changes, an error matrix was constructed for change detection based on the sample labels and classification labels of the two periods. Table 4 shows that the change detection accuracy was the highest for the unchanged forest category, with a UA value of 0.83 and a PA value of 0.87. The change detection accuracy was the lowest for the category of deciduous pine forest loss, with a UA value of 0.47 and a PA value of 0.53. The proportion of the unchanged category was the highest (38.3%), followed by that of the mixed broadleaf forest gain (8.1%).

Table 4. Error matrix for the forest composition change detection (unit: area proportion (%)). Blank cells are zeros.

3.5. Driving factor analysis of forest composition change

Next, to quantify the effects of driving factors, this study used the SEM. The EFA results showed that the KMO value was 0.707 with a p-value of less than 0.05, which indicated good structural validity of the samples for the SEM. The SEM was well fitted, with a chi-square/df of 1.21, AGFI of 0.93, CFI of 0.90, and RMSEA of 0.02. Figure 9 shows that the climate factors and atmospheric quality factors had a significant positive effect on the forest composition change, whereas the human activity factors had a significant negative effect on the forest composition change. Furthermore, the topography factors (i.e. DEM, slope, and aspect) had no significant effect on the forest composition change. Further, Table 5 indicates that climate factors had the greatest influence on the forest composition change, with a total effect coefficient of 0.52, followed by the atmospheric quality factors (0.35) and human activity factors (−0.23). Other variables showed positive significance at the 0.05 level except for the two observed variables: T-NH4+_w and aspect. This indicated that the change in forest composition had a decreasing trend with the increase in the T-NH4+_w value.

Figure 9. SEM for the affecting factor analysis of forest composition changes. CFST: change in forest composition; T-Min_tem: the changing trend of minimum temperature; T-Tem: the changing trend of temperatures; T-Preci: the changing trend of precipitation; T-Max_WS: the changing trend of maximum wind speed; T-GOV: the changing trend of the gross output value of farming, forestry, animal husbandry, and fishery; T-GOV_F: the changing trend of the gross output value of forestry; T-TSAG: the changing trend of the total sown areas of grain crops; T-NH3_d: the changing trend of gaseous ammonia dry deposition; T-NH4+_w: the changing trend of atmospheric ammonium nitrogen wet deposition; T-NO2_d: the changing trend of gaseous nitrogen dioxide dry deposition; T-NO3−_d: the changing trend of atmospheric particulate nitrate dry deposition.


Table 5. Effects of significant latent variables on the forest composition change.

4. Discussion

4.1. Application potential of LW-CNN in large-scale forest composition mapping

The parameters and training time of the four 1D-CNN models are presented in Table 6, where it can be seen that as the number of convolution layers increased, there was a notable increase in the number of parameters, which was mainly due to the increased number of channels following the convolution. However, the proposed LW-CNN model adopted an “inverted triangular” convolutional network, where the number of convolution channels first increased and then decreased so that the growth in the number of model parameters was controlled. The number of parameters of the LW-CNN model (i.e. 129,664) was nearly 183 times smaller than that of the VGG16 model, and its running time (161 s) was about 10 times shorter. Although the OA values of the LW-CNN and Resnet50 models were similar, the number of parameters of the LW-CNN model was much smaller than that of the Resnet50 model. This could be attributed to the fewer convolutional layers and the residual structure designed based on the experience of the Resnet model, which made the proposed model run faster and be more effective in large-scale forest composition mapping than the VGG16 and Resnet models.

Table 6. Number of parameters and efficiency of the four models.
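The link between channel width and parameter count can be illustrated with the standard formula for a one-dimensional convolution layer, out_channels × (in_channels × kernel_size + 1) when a bias is used. The sketch below compares a stack whose channels keep widening with one whose channels rise and then fall. The channel widths and kernel size are hypothetical and are not the LW-CNN or VGG16 configurations; only convolution weights and biases are counted.

```python
def conv1d_params(in_channels, out_channels, kernel_size, bias=True):
    """Trainable parameters of a single 1D convolution layer."""
    return out_channels * (in_channels * kernel_size + (1 if bias else 0))


def stack_params(channels, kernel_size=3):
    """Total parameters of consecutive 1D convolutions with the given channel widths."""
    return sum(conv1d_params(c_in, c_out, kernel_size)
               for c_in, c_out in zip(channels, channels[1:]))


# Hypothetical channel widths, chosen only to show the trend
widening = [1, 16, 32, 64, 128]      # channels grow with depth
rise_then_fall = [1, 16, 32, 16, 8]  # channels increase, then decrease

print(stack_params(widening), stack_params(rise_then_fall))
```

Because each layer's cost scales with the product of its input and output channels, narrowing the later layers removes most of the parameters while keeping the same depth.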

Moreover, the LW-CNN model had better classification performance than the VGG16, Resnet34, and Resnet50 models, which was mainly due to the unique residual structure used in the LW-CNN model and the pre-extracted features. According to previous results (He et al. 2016), deeper DL networks tend to have higher model accuracy, which is due to their ability to extract more abstract information that is beneficial to classification (Rohith and Kumar 2020; Wang et al. 2020). However, the proposed LW-CNN model with only seven convolution layers performed slightly better than the Resnet50 model with 50 convolution layers. In general, in CNNs, the first few convolution layers extract shallow information, which contains more detail (e.g. color, texture, location), whereas the deeper convolutional layers (i.e. layers near the output layer) enlarge the receptive field, which provides more global information but captures less detail. Therefore, pre-extracted features were used as input for the proposed LW-CNN, which allowed it to achieve classification performance similar to that of models with many convolutional layers (i.e. Resnet50) while using fewer convolutional layers. This phenomenon could be attributed to the preprocessing of the input data in the LW-CNN model, which effectively served as shallow convolution. In addition, the unique residual structure allowed the preservation of the original information, which was important for distinguishing forest compositions. The combination of these two factors gave the proposed LW-CNN model good classification performance. Another clear advantage of the 1D-CNN model over the 2D-CNN model was that its input samples were pure pixels, which avoided the problem of mixed pixels for samples obtained from images of medium spatial resolution in the 2D-CNN model (Li et al. 2020).

Furthermore, previous research has demonstrated that DL models consistently outperform ML models in terms of classification accuracy (Xi et al. 2019). To validate this finding, this study tested the performance of three ML models in classifying forest composition on the data from 2007. Table 7 shows that the performance of the ML models was comparable to that of the VGG16 model but significantly inferior to those of the other three DL models used in this study. This underscored the advantage of DL over ML in forest composition classification and reaffirmed prior research findings (Xi et al. 2020).

Table 7. Classification performance of the ML models on the data from 2007.

Furthermore, to reduce the computational load caused by the excessive number of input features for large-scale mapping, feature selection was performed using the RFE. In the proposed model, the pre-extracted features were used as input to the DL network, so the number of input features determined the number of times the convolution could be performed, which affected the classification performance. Therefore, too few pre-extracted features from Landsat 8 OLI could not fully use the convolution layers, while too many features could greatly increase the computational load for large-scale mapping. In the proposed model, the selection of the number of features played a key role in fully using the convolutional layer. Therefore, it could be promising to extend 1D-CNN to other large-scale regions by inputting the selected features into the 1D-CNN model.

4.2. SEM-based driving factor analysis of forest composition change

This study investigated four types of factors (i.e. climate, topography, human activity, and atmospheric quality factors) to analyze their effect on forest composition changes. For climate factors, the maximum wind speed was the most influential and showed a positive correlation with the forest composition change. This could be because strong wind can cause trees with shallow roots (e.g. larch) to fall, resulting in a forest composition change (Blennow and Olofsson 2008). It is worth noting that precipitation and temperature also had a significant positive correlation with the forest composition change, playing an important role in forest regeneration. This was mainly because temperature and precipitation affect the photosynthesis of vegetation, resulting in different growth rates of trees in the forest. Similarly, related studies have reported the important role of precipitation and temperature in stand growth (Shi et al. 2018). The results obtained in this study also showed that atmospheric nitrogen dry deposition, one of the atmospheric quality factors, had an important effect on forest composition changes. Existing studies have shown that nitrogen inputs through atmospheric deposition can have both beneficial and detrimental effects on forest ecosystems, such as the stimulation of carbon sequestration (De Vries, Du, and Butterbach-Bahl 2014) and the loss of biodiversity (Bobbink et al. 2010). At the regional and global scales, atmospheric nitrogen deposition is one of the important factors affecting the change in forest ecosystem types (Du et al. 2019). Among the human activity factors, the NLI, population density, and distance to settlements were removed in this study after modifying the SEM. For example, since the implementation of natural forest protection projects in northeastern China, the impact of these human activities on the forest composition change has been insignificant. Further, the RF algorithm was applied to reveal the most important factors influencing the forest composition change, as shown in Figure 10. The contribution of features based on the RF was consistent with the SEM, and climate factors were important factors influencing the forest composition change.

Figure 10. Main influencing factors of the forest composition change determined by the RF algorithm. MSE: mean square error; ** represents significance at the 0.05 level.


4.3. Limitations and future research

The use of Landsat 8 imagery pixels as samples in the proposed 1D-CNN represents a pixel-oriented classification approach in the field of remote sensing. Pixel-oriented classification is a conventional method for large-scale mapping, particularly suited for medium spatial resolution imagery. However, this method will inevitably introduce the “salt and pepper phenomenon.” Therefore, there exist numerous avenues for further improvements in the future.

First, the attention model (Vaswani et al. 2017) could be explored as an augmentation to the proposed LW-CNN model, enabling the model to capture pertinent information essential for classification. Furthermore, a semantic segmentation algorithm could be introduced to segment the forest stand to address the “salt and pepper phenomenon.” In addition, hyperparameter tuning is an extremely time-consuming task, and several hyperparameters need to be considered jointly to identify the optimal combination. This process can be enhanced using advanced algorithms, such as random search (Stuke, Rinke, and Todorović 2021), particle swarm optimization (Wang, Zhang, and Zhang 2019), and Bayesian-based optimization (Stuke, Rinke, and Todorović 2021).

Second, a refined forest composition could be classified using high-resolution imagery (e.g. Sentinel-2). The advantage of Sentinel-2 is that it has a richer spectral resolution with multiple red-edge and near-infrared bands that are useful for classification (Wang and Atkinson 2018), which offers the possibility to distinguish more forest compositions. Furthermore, multi-temporal data could be considered to improve the classification accuracy of forest composition. In the future, it would be of practical value to integrate the feature extraction and selection processes with DL models into a single framework, thus constructing end-to-end 1D-CNN models.

Third, although LW-CNN demonstrated promising efficiency and accuracy in large-scale mapping, uncertainty in change detection increases due to prediction mistakes in single-period distribution maps. This problem can be mitigated by reducing the number of classification categories, but doing so would lose some meaningful categories. Therefore, the accuracy of change detection was assessed using ground reference data. In the future, improving model accuracy and finding effective features for classification are fundamental ways to reduce errors in mapping and change detection. In addition, because of the limited sample sizes available for different periods, forest composition maps could only be produced for two time periods, which makes it difficult to discern trends in forest composition change. In the future, composition maps with five-year intervals could be produced and combined with the national forest survey data, which are collected every five years.

The last limitation concerns the distribution of the sample plots, which were surveyed using systematic sampling instead of random sampling. In particular, there were few sample plots in the western part of the study area. Although most of the western region is covered by non-forest land, there are still a few forests. Since DL models are strongly influenced by training data (Whang and Lee 2020), the results for the western region may have significant uncertainty due to the lack of training data. Future research could further explore the effects of the spatial distribution and number of sample plots on DL models.

5. Conclusion

The study proposes the LW-CNN model and compares it with the VGG16, Resnet34, and Resnet50 models in terms of model accuracy and efficiency in large-scale forest composition mapping. In addition, the driving factor analysis of the forest composition change in Heilongjiang province between 2007 and 2018 is conducted. Compared to the accuracy of the Resnet50, Resnet34, and VGG16 models, that of the proposed LW-CNN model is improved by 1%, 5%, and 14% for data from 2007 and by 4%, 9%, and 24% for data from 2018, while the efficiency is improved by approximately 7, 12, and 22 times, respectively. The analysis results indicate that the climate is the most important factor influencing forest composition changes in Heilongjiang province, followed by atmospheric quality and human activities-related factors. The proposed LW-CNN model can meet the efficiency and accuracy requirements of large-scale forest composition mapping and monitoring and can provide technical support for classification and change-detection tasks using medium spatial resolution images.

Author contributions

Conceptualization, F.F. and Y.Z.; methodology, Y.M.; software, Y.M.; validation, Y.M.; writing – original draft preparation, Y.M. and Y.Z.; writing – review and editing, Y.Z., F.F. and Z.Z.; visualization, Y.M.; supervision, F.L. and Y.Z.; project administration, F.L., F.F. and Y.Z.; funding acquisition, Y.Z. All authors have read and agreed to the published version of the manuscript.


Disclosure statement

No potential conflict of interest was reported by the author(s).

Data availability statement

The data that support the findings of this study are available on request from the corresponding author (Y.Z.). The data are not publicly available because they contain information that could compromise the privacy of research participants.

Supplementary material

Supplemental data for this article can be accessed online at https://doi.org/10.1080/15481603.2023.2271246

Additional information

Funding

This research was funded by the Science and Technology Basic Resources Investigation Program of China (No. 2019FY100500); Science and Technology Basic Resources Investigation Program of China (No. 2019FY101602-1); and National Forestry and Grassland Data Center-Heilongjiang platform (2005DKA32200-OH).

References

  • Amato, U., R. M. Cavalli, A. Palombo, S. Pignatti, and F. Santini. 2008. “Experimental Approach to the Selection of the Components in the Minimum Noise Fraction.” IEEE Transactions on Geoscience and Remote Sensing 47 (1): 153–27. https://doi.org/10.1109/TGRS.2008.2002953.
  • Asuero, A. G., A. Sayago, and A. G. González. 2006. “The Correlation Coefficient: An Overview.” Critical Reviews in Analytical Chemistry 36 (1): 41–59. https://doi.org/10.1080/10408340500526766.
  • Bagozzi, R. P., and Y. Yi. 2012. “Specification, Evaluation, and Interpretation of Structural Equation Models.” Journal of the Academy of Marketing Science 40 (1): 8–34. https://doi.org/10.1007/s11747-011-0278-x.
  • Belgiu, M., and L. Drăguţ. 2016. “Random Forest in Remote Sensing: A Review of Applications and Future Directions.” ISPRS Journal of Photogrammetry and Remote Sensing 114:24–31. https://doi.org/10.1016/j.isprsjprs.2016.01.011.
  • Blennow, K., and E. Olofsson. 2008. “The Probability of Wind Damage in Forestry Under a Changed Wind Climate.” Climatic Change 87 (3–4): 347–360. https://doi.org/10.1007/s10584-007-9290-z.
  • Bobbink, R., K. Hicks, J. Galloway, T. Spranger, R. Alkemade, M. Ashmore, M. Bustamante, et al. 2010. “Global Assessment of Nitrogen Deposition Effects on Terrestrial Plant Diversity: A Synthesis.” Ecological Applications 20 (1): 30–59. https://doi.org/10.1890/08-1140.1.
  • Calviño-Cancela, M., M. L. Chas-Amil, E. D. García-Martínez, and J. Touza. 2017. “Interacting Effects of Topography, Vegetation, Human Activities and Wildland-Urban Interfaces on Wildfire Ignition Risk.” Forest Ecology and Management 397:10–17. https://doi.org/10.1016/j.foreco.2017.04.033.
  • De Vries, W., E. Du, and K. Butterbach-Bahl. 2014. “Short and Long-Term Impacts of Nitrogen Deposition on Carbon Sequestration by Forest Ecosystems.” Current Opinion in Environmental Sustainability 9–10:90–104. https://doi.org/10.1016/j.cosust.2014.09.001.
  • Du, C., W. Fan, Y. Ma, H. Jin, and Z. Zhen. 2021. “The Effect of Synergistic Approaches of Features and Ensemble Learning Algorithms on Aboveground Biomass Estimation of Natural Secondary Forests Based on ALS and Landsat 8.” Sensors 21 (17): 5974. https://doi.org/10.3390/s21175974.
  • Du, E., M. E. Fenn, W. De Vries, and Y. S. Ok. 2019. “Atmospheric Nitrogen Deposition to Global Forests: Status, Impacts and Management Options.” Environmental Pollution 250:1044–1048. https://doi.org/10.1016/j.envpol.2019.04.014.
  • Estornell, J., J. M. Martí-Gavilá, M. T. Sebastiá, and J. Mengual. 2013. “Principal Component Analysis Applied to Remote Sensing.” Modelling in Science Education and Learning 6:83–89. https://doi.org/10.4995/msel.2013.1905.
  • Everingham, M., L. Van Gool, C. K. Williams, J. Winn, and A. Zisserman. 2010. “The Pascal Visual Object Classes (Voc) Challenge.” International Journal of Computer Vision 88:303–338. https://doi.org/10.1007/s11263-009-0275-4.
  • Fedrigo, M., G. J. Newnham, N. C. Coops, D. S. Culvenor, D. K. Bolton, and C. R. Nitschke. 2018. “Predicting Temperate Forest Stand Types Using Only Structural Profiles from Discrete Return Airborne Lidar.” ISPRS Journal of Photogrammetry and Remote Sensing 136:106–119. https://doi.org/10.1016/j.isprsjprs.2017.11.018.
  • Genuer, R., J. Poggi, and C. Tuleau-Malot. 2010. “Variable Selection Using Random Forests.” Pattern Recognition Letters 31 (14): 2225–2236. https://doi.org/10.1016/j.patrec.2010.03.014.
  • Georganos, S., T. Grippa, S. Vanhuysse, M. Lennert, E. Wolff, and E. Wolff. 2018. “Very High Resolution Object-Based Land Use–Land Cover Urban Classification Using Extreme Gradient Boosting.” IEEE Geoscience and Remote Sensing Letters 15 (4): 607–611. https://doi.org/10.1109/LGRS.2018.2803259.
  • Ghosh, A., and P. K. Joshi. 2014. “A Comparison of Selected Classification Algorithms for Mapping Bamboo Patches in Lower Gangetic Plains Using Very High Resolution WorldView 2 Imagery.” International Journal of Applied Earth Observation and Geoinformation 26:298–311. https://doi.org/10.1016/j.jag.2013.08.011.
  • Gorelick, N., M. Hancher, and M. Dixon. 2017. “Google Earth Engine: Planetary-Scale Geospatial Analysis for Everyone.” Remote Sensing of Environment 202:18–27. https://doi.org/10.1016/j.rse.2017.06.031.
  • Grabska, E., D. Frantz, and K. Ostapowicz. 2020. “Evaluation of Machine Learning Algorithms for Forest Stand Species Mapping Using Sentinel-2 Imagery and Environmental Data in the Polish Carpathians.” Remote Sensing of Environment 251:112103. https://doi.org/10.1016/j.rse.2020.112103.
  • Hansen, M. C., P. V. Potapov, R. Moore, M. Hancher, S. A. Turubanova, A. Tyukavina, et al. 2013. “High-Resolution Global Maps of 21st-Century Forest Cover Change.” Science 342 (6160): 850–853. https://doi.org/10.1126/science.1244693.
  • He, K., X. Zhang, S. Ren, and J. Sun. 2016. “Deep Residual Learning for Image Recognition.” IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 770–778. https://doi.org/10.1109/CVPR.2016.90.
  • Hofstra, N., M. Haylock, M. New, P. Jones, and C. Frei. 2008. “Comparison of Six Methods for the Interpolation of Daily, European Climate Data.” Journal of Geophysical Research Atmospheres 113 (D21). https://doi.org/10.1029/2008JD010100.
  • Ioffe, S., and C. Szegedy. 2015. “Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift.” Proceedings of the 32nd International Conference on Machine Learning, PMLR 37:448–456.
  • Jia, Y., Q. Wang, J. Zhu, Z. Chen, N. P. He, and G. R. Yu. 2019a. “Spatial and Temporal Patterns of Atmospheric Inorganic Nitrogen Dry Deposition in China, 2006–2015.” https://doi.org/10.11922/sciencedb.921.
  • Jia, Y., Q. Wang, J. Zhu, Z. Chen, N. P. He, and G. R. Yu. 2019b. “Spatial Pattern of Atmospheric Inorganic Nitrogen Wet Deposition in China from 1996 to 2015.” https://doi.org/10.11922/csdata.2018.0031.zh.
  • Jia, L., Z. Zhou, and B. Li. 2012. “Study of SAR Image Texture Feature Extraction Based on GLCM in Guizhou Karst Mountainous Region”. In 2012 2nd International Conference on Remote Sensing, Environment and Transportation Engineering, Nanjing, China, 1–4.
  • Kattenborn, T., J. Leitloff, F. Schiefer, and S. Hinz. 2021. “Review on Convolutional Neural Networks (CNN) in Vegetation Remote Sensing.” ISPRS Journal of Photogrammetry and Remote Sensing 173:24–49. https://doi.org/10.1016/j.isprsjprs.2020.12.010.
  • Turlej, K., M. Ozdogan, and V. C. Radeloff. 2022. “Mapping Forest Types Over Large Areas with Landsat Imagery Partially Affected by Clouds and SLC Gaps.” International Journal of Applied Earth Observation and Geoinformation 107:102689. https://doi.org/10.1016/j.jag.2022.102689.
  • Li, W., R. Dong, H. Fu, J. Wang, L. Yu, and P. Gong. 2020. “Integrating Google Earth Imagery with Landsat Data to Improve 30-m Resolution Land Cover Mapping.” Remote Sensing of Environment 237:111563. https://doi.org/10.1016/j.rse.2019.111563.
  • Lin, T.-Y., M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, P. Dollár, and C. L. Zitnick. 2014. “Microsoft COCO: Common Objects in Context.” European Conference on Computer Vision 8693:740–755.
  • Liu, Q., G. Liu, C. Huang, S. Liu, and J. Zhao. 2014. “A Tasseled Cap Transformation for Landsat 8 OLI TOA Reflectance Images.” IEEE Geoscience and Remote Sensing Symposium, 541–544. https://doi.org/10.1109/IGARSS.2014.6946479.
  • Ma, L., Y. Liu, X. Zhang, Y. Ye, G. Yin, and B. A. Johnson. 2019. “Deep Learning in Remote Sensing Applications: A Meta-Analysis and Review.” ISPRS Journal of Photogrammetry and Remote Sensing 152:166–177. https://doi.org/10.1016/j.isprsjprs.2019.04.015.
  • Mäyrä, J., S. Keski-Saari, S. Kivinen, T. Tanhuanpää, P. Hurskainen, P. Kullberg, L. Poikolainen, et al. 2021. “Tree Species Classification from Airborne Hyperspectral and LiDAR Data Using 3D Convolutional Neural Networks.” Remote Sensing of Environment 256:112322. https://doi.org/10.1016/j.rse.2021.112322.
  • Mohammadpour, P., D. X. Viegas, and C. Viegas. 2022. “Vegetation Mapping with Random Forest Using Sentinel 2 and GLCM Texture Feature—A Case Study for Lousã Region, Portugal.” Remote Sensing 14 (18): 4585. https://doi.org/10.3390/rs14184585.
  • Olofsson, P., G. M. Foody, M. Herold, S. V. Stehman, C. E. Woodcock, and M. A. Wulder. 2014. “Good Practices for Estimating Area and Assessing Accuracy of Land Change.” Remote Sensing of Environment 148:42–57. https://doi.org/10.1016/j.rse.2014.02.015.
  • Pham, L. T. H., and L. Brabyn. 2017. “Monitoring Mangrove Biomass Change in Vietnam Using SPOT Images and an Object-Based Approach Combined with Machine Learning Algorithms.” ISPRS Journal of Photogrammetry and Remote Sensing 128:86–97. https://doi.org/10.1016/j.isprsjprs.2017.03.013.
  • Pu, R. 2021. “Mapping Tree Species Using Advanced Remote Sensing Technologies: A State-Of-The-Art Review and Perspective.” Journal of Remote Sensing 2021:9812624. https://doi.org/10.34133/2021/9812624.
  • Ripley, B. D. 2007. Pattern Recognition and Neural Networks. Cambridge: Cambridge University Press.
  • Rohith, G., and L. S. Kumar. 2020. “Remote Sensing Signature Classification of Agriculture Detection Using Deep Convolution Network Models.” In Machine Learning, Image Processing, Network Security and Data Sciences. Singapore: Springer.
  • Sayemuzzaman, M., and M. K. Jha. 2014. “Seasonal and Annual Precipitation Time Series Trend Analysis in North Carolina, United States.” Atmospheric Research 137:183–194. https://doi.org/10.1016/j.atmosres.2013.10.012.
  • Schlund, M., K. Scipal, and M. W. J. Davidson. 2017. “Forest Classification and Impact of BIOMASS Resolution on Forest Area and Aboveground Biomass Estimation.” International Journal of Applied Earth Observation and Geoinformation 56:65–76. https://doi.org/10.1016/j.jag.2016.12.001.
  • Shi, Y., L. Xu, Y. Zhou, B. Ji, G. Zhou, H. Fang, J. Yin, et al. 2018. “Quantifying Driving Factors of Vegetation Carbon Stocks of Moso Bamboo Forests Using Machine Learning Algorithm Combined with Structural Equation Model.” Forest Ecology and Management 429:406–413. https://doi.org/10.1016/j.foreco.2018.07.035.
  • Simonyan, K., and A. Zisserman. 2014. “Very Deep Convolutional Networks for Large-Scale Image Recognition.” arXiv Preprint arXiv:1409.1556. https://doi.org/10.48550/arXiv.1409.1556.
  • Song, Y., J. Wang, Y. Ge, and C. Xu. 2020. “An Optimal Parameters-Based Geographical Detector Model Enhances Geographic Characteristics of Explanatory Variables for Spatial Heterogeneity Analysis: Cases with Different Types of Spatial Data.” GIScience & Remote Sensing 57 (5): 593–610. https://doi.org/10.1080/15481603.2020.1760434.
  • Stuke, A., P. Rinke, and M. Todorović. 2021. “Efficient Hyperparameter Tuning for Kernel Ridge Regression with Bayesian Optimization.” Machine Learning: Science and Technology 2 (3): 035022. https://doi.org/10.1088/2632-2153/abee59.
  • Vaswani, A., N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin. 2017. “Attention Is All You Need.” arXiv. https://doi.org/10.48550/arXiv.1706.03762.
  • Wallis, C. I. B., G. Brehm, D. A. Donoso, K. Fiedler, J. Homeier, D. Paulsch, D. Süßenbach, et al. 2017. “Remote Sensing Improves Prediction of Tropical Montane Species Diversity but Performance Differs Among Taxa.” Ecological Indicators 83:538–549. https://doi.org/10.1016/j.ecolind.2017.01.022.
  • Wang, Q., and P. M. Atkinson. 2018. “Spatio-Temporal Fusion for Daily Sentinel-2 Images.” Remote Sensing of Environment 204:31–42. https://doi.org/10.1016/j.rse.2017.10.046.
  • Wang, G., M. Wu, X. Wei, and H. Song. 2020. “Water Identification from High-Resolution Remote Sensing Images Based on Multidimensional Densely Connected Convolutional Neural Networks.” Remote Sensing 12 (5): 795. https://doi.org/10.3390/rs12050795.
  • Wang, Y., H. Zhang, and G. Zhang. 2019. “CPSO-CNN: An Efficient PSO-Based Algorithm for Fine-Tuning Hyper-Parameters of Convolutional Neural Networks.” Swarm and Evolutionary Computation 49:114–123. https://doi.org/10.1016/j.swevo.2019.06.002.
  • Wensel, L. C., J. Levitan, and K. Barber. 1980. “Selection of Basal Area Factor in Point Sampling.” Journal of Forestry 78 (2): 83–84. https://doi.org/10.1093/jof/78.2.83.
  • Whang, S. E., and J. G. Lee. 2020. “Data Collection and Quality Challenges for Deep Learning.” Proceedings of the VLDB Endowment 13 (12): 3429–3432. https://doi.org/10.14778/3415478.3415562.
  • Wickham, J., S. V. Stehman, D. G. Sorenson, L. Gass, and J. A. Dewitz. 2023. “Thematic Accuracy Assessment of the NLCD 2019 Land Cover for the Conterminous United States.” GIScience & Remote Sensing 60 (1): 2181143. https://doi.org/10.1080/15481603.2023.2181143.
  • Wu, Y., K. Shi, Z. Chen, S. Liu, and Z. Chang. 2022. “Developing Improved Time-Series DMSP-OLS-Like Data (1992–2019) in China by Integrating DMSP-OLS and SNPP-VIIRS.” IEEE Transactions on Geoscience and Remote Sensing 60:1–14. https://doi.org/10.1109/TGRS.2021.3135333.
  • Xi, Z., C. Hopkinson, S. B. Rood, and D. R. Peddle. 2020. “See the Forest and the Trees: Effective Machine and Deep Learning Algorithms for Wood Filtering and Tree Species Classification from Terrestrial Laser Scanning.” ISPRS Journal of Photogrammetry and Remote Sensing 168:1–16. https://doi.org/10.1016/j.isprsjprs.2020.08.001.
  • Xi, Y., C. Ren, Q. Tian, Y. Ren, X. Dong, and Z. Zhang. 2021. “Exploitation of Time Series Sentinel-2 Data and Different Machine Learning Algorithms for Detailed Tree Species Classification.” IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing 14:7589–7603. https://doi.org/10.1109/JSTARS.2021.3098817.
  • Xi, Y., C. Ren, Z. Wang, S. Wei, J. Bai, B. Zhang, H. Xiang, et al. 2019. “Mapping Tree Species Composition Using OHS-1 Hyperspectral Data and Deep Learning Algorithms in Changbai Mountains, Northeast China.” Forests 10 (9): 818. https://doi.org/10.3390/f10090818.
  • Xue, J., and B. Su. 2017. “Significant Remote Sensing Vegetation Indices: A Review of Developments and Applications.” Journal of Sensors 2017:1–17. https://doi.org/10.1155/2017/1353691.
  • Bo, Y., and J. Wang. 2004. “Exploring the Scale Effect in Thematic Classification of Remotely Sensed Data: The Statistical Separability-Based Method.” Remote Sensing Technology and Application 19 (6): 443–449. https://doi.org/10.11873/j.issn.1004-0323.2004.6.443.
  • Yoo, C., D. Han, J. Im, and B. Bechtel. 2019. “Comparison Between Convolutional Neural Networks and Random Forest for Local Climate Zone Classification in Mega Urban Areas Using Landsat Images.” ISPRS Journal of Photogrammetry and Remote Sensing 157:155–170. https://doi.org/10.1016/j.isprsjprs.2019.09.009.
  • Yuan, K., and P. M. Bentler. 2006. “Structural Equation Modeling.” Handbook of Statistics 26:297–358. https://doi.org/10.1002/9781118133880.hop202023.

Appendix A.

The 98 features extracted from Landsat 5 TM and Landsat 8 OLI imagery

K is convolution kernel size; out is the number of outputs; S is stride; BN is batch normalization; FC is fully connected layer.

Appendix B.

The structure of VGG16. Note: Conv1D is a one-dimensional convolutional layer.

Appendix C.

Model architecture for Resnet34 and Resnet50

Appendix D.

Sample size for land use type classification

Appendix E.

Description of the variables for SEM

Appendix F.

Ranking of the selected features using RFE

Appendix G.

Classification accuracy of land use type based on random forest in 2007 and 2018

Appendix H.

The classified area of land use type in Heilongjiang Province in 2007 and 2018