
GIScience & Remote Sensing Volume 60, 2023 - Issue 1


Open access


Research Article

Regionalized classification of stand tree species in mountainous forests by fusing advanced classifiers and ecological niche model

Panfei Fang, Faculty of Forestry, Southwest Forestry University, Kunming, China

Guanglong Ou, Key Laboratory of State Forestry Administration on Biodiversity Conservation in Southwest China, Southwest Forestry University, Kunming, China

Ruonan Li, Faculty of Geography, Yunnan Normal University, Kunming, China

Leiguang Wang, Institute of Big Data and Artificial Intelligence, Southwest Forestry University, Kunming, China. Correspondence: [email protected]

Weiheng Xu, Institute of Big Data and Artificial Intelligence, Southwest Forestry University, Kunming, China

Qinling Dai, Art and Design College, Southwest Forestry University, Kunming, Yunnan, China

Xin Huang, School of Remote Sensing and Information Engineering, Wuhan University, Wuhan, China

Article: 2211881 | Received 20 Dec 2022, Accepted 04 May 2023, Published online: 12 May 2023

https://doi.org/10.1080/15481603.2023.2211881


ABSTRACT

Although many new remote sensing technologies have been introduced to analyze forests, regional-scale species-level mapping products are still rare, especially in large mountainous areas. The abundance of tree species, low spectral separability among species, and huge computing demand hinder the production of accurate stand tree species maps. This study addressed these problems by combining regionalization, multiple feature fusion, and model fusion in a new machine learning workflow. First, the study area, Yunnan Province in China (approximately 390,000 km²), was divided into 8 distinct floristic regions according to the distributions and phylogenetic relationships of native tree species. Second, multiple data sets, including Sentinel-2 imagery, the SRTM DEM, and WorldClim bioclimatic variables, were collected on the Google Earth Engine (GEE) platform to construct a high-dimensional feature pool for each region. Third, the maximum entropy model (MaxEnt), generally used for predicting ecological niches, and three classifiers, i.e. Random Forest (RF), Support Vector Machines (SVM), and Extreme Gradient Boosting (XGBoost), were used to pre-classify environmental and remote sensing data, respectively. Fourth, two types of decision fusion strategies, parallel and serial ensembles, were applied to fuse the pre-classification probability maps for each sub-region. Finally, the spatial distribution of 19 forest stand species over the whole of Yunnan Province was obtained by mosaicking the best classification results from the 8 sub-regions. Our method achieves an overall accuracy of 72.18% on the entire validation dataset. The decision fusion models significantly improve the classification accuracy, with the best fusion models of the eight sub-regions improving the accuracy by 7.33%–25.39% on average compared to the base classifiers. This study demonstrates that the spatial partitioning strategy and decision fusion integrating a proper machine learning algorithm and an ecological niche model can significantly improve the classification accuracy of forest stand species in montane forests.

KEYWORDS:

Tree species classification
Vegetation regionalization
Maximum entropy model
Machine learning
Decision fusion

1. Introduction

Information on the distribution of forest stand species is essential for the development of forest management policies and the sustainable use of forest resources (Fassnacht et al. 2014; Guo et al. 2020) as well as the assessment of biodiversity, habitat quality, carbon cycling, and biomass (Kollert et al. 2021; Ørka et al. 2013; Waser et al. 2011).

The spatial distribution of tree species is traditionally obtained through labor-intensive field surveys. Inventory data are therefore expensive and time-consuming to create, especially in a large mountainous region. Remote sensing offers a cost-efficient and time-efficient way to obtain vegetation-related information (Grabska, Frantz, and Ostapowicz 2020). However, unlike other land cover types, tree species have extremely low spectral separability, so mapping them remains a major challenge. In this context, various data and advanced processing techniques are often needed to achieve the desired accuracy and level of detail in forest stand tree species mapping.

Multimodal remote sensing sensors provide key information for describing tree species. High spatial resolution multispectral images have rich spectral and textural information, and hyperspectral images with nanometer-scale spectral resolution are rich in spectral information (Wan et al. 2021); LiDAR data provide geometric information related to tree vertical structure, such as tree height, canopy diameter, and leaf area (Feng et al. 2020; Ke, Quackenbush, and Im 2010; Michałowska and Rapiński 2021). Current tree species classification research focuses on algorithm-driven or data-driven methods (Fassnacht et al. 2016). Machine learning algorithms are well suited to mapping non-normally distributed features into classes (Bhatt et al. 2022). Typical models, including Random Forest (RF), Support Vector Machines (SVM), and Extreme Gradient Boosting (XGBoost), have been widely applied to classification. With the increasing availability of earth observation data, Deep Learning (DL) models have drawn more attention. DL models can learn discriminant features from data through multi-layer neural networks, accurately capture nonlinear relationships between data and classes, and thus achieve end-to-end image classification (LeCun, Bengio, and Hinton 2015).

These data and classification models have been widely used to map tree species with high accuracies. For instance, very-high-resolution (VHR) Worldview-2 images and RF were used to identify the three most abundant pine species in Galicia, with a classification accuracy of 91% (Alonso, Picos, and Armesto 2021). Feng et al. combined aerial hyperspectral data with LiDAR data and used RF to identify 10 tree species in a forest farm with an OA of 96.10% (Feng et al. 2020). Zhang et al. applied a 3D CNN to tree species classification, achieving a high accuracy of 93.14% while improving computational efficiency (Zhang, Zhao, and Zhang 2020). Unfortunately, these approaches cannot be extended to large regions or to the globe because of high data collection costs.

Freely accessible images acquired by low- and medium-resolution satellites, such as the MODIS, Landsat, and Sentinel series, are often used for mapping forest ecosystems over large areas. Many forest-related products, such as GLAD Forest (Hansen et al. 2013), FROM_GLC (Gong et al. 2013), and iMap World 1.0 (Liu et al. 2021), have been produced. However, these products treat forests as a single class or divide forests into several forest types. These studies focus on analyzing forest distribution and dynamic changes but cannot provide the spatial distribution of tree species. How to obtain spatial distribution information of tree species over large areas economically, efficiently, and accurately is still an important and urgent problem. Currently, the Sentinel-2 (S2) satellite constellation may be the best choice for tree species classification in large regions. Several studies have demonstrated the high potential of the unique red-edge bands and dense time series of S2 images for vegetation mapping in large regions (Grabska, Frantz, and Ostapowicz 2020; Welle et al. 2022). However, new challenges arise when classifying a geographically heterogeneous region such as Yunnan Province.

Yunnan Province is in southwest China (Figure 1a) and covers an area about the size of Germany. First, mapping such a large area implies processing massive amounts of data, and because of the cloudy and rainy climate, even more images must be processed to synthesize high-quality composites. Second, natural conditions vary strongly both horizontally and vertically, and the increased geospatial heterogeneity of tree species' spectra and phenology over a large area challenges the mapping process (Shirazinejad, Zoej, and Latifi 2022). Third, Yunnan is the central part of the Indo-Burma biodiversity hotspot (Myers et al. 2000); the composition of tree species in this area is highly complex, and the local distribution of species is also highly variable. Together, these factors increase the difficulty of tree species mapping.

Figure 1. (a) Yunnan, a global biodiversity hotspot region, located in southwestern China. (b) Elevation map of Yunnan. (c) Eight distinct floristic sub-regions of Yunnan used as the basic units for classification (Li et al. 2015).

When developing and optimizing tree species classification over large areas, one aspect that deserves significant attention is the influence of the environment on the distribution of tree species. Environmental factors, such as rainfall, temperature, geomorphology, and soil, have essential effects on the distribution of tree species (Yu et al. 2020). However, the spatial resolution of current environmental data is mostly 1 km or coarser. For example, the TerraClimate dataset provides temperature data at 4 km resolution (Abatzoglou et al. 2018), and the WorldClim dataset provides bioclimatic variables at 1 km resolution (Hijmans et al. 2005). Owing to the resolution gap with remotely sensed images, environmental data are rarely incorporated in classification processes based on remotely sensed data and machine learning. Environmental data are more often used for species distribution modeling, in which sample points of a species are associated with environmental variables to map the potential distribution areas of the target species. For example, Zhou et al. used the MaxEnt model to predict the potential distribution of Cunninghamia lanceolata in China (Zhou et al. 2021). Although machine learning classification and ecological niche model prediction have each been extensively studied, their combination has rarely been explored. Researchers are therefore missing the opportunity to combine the two domains to obtain more accurate tree species distributions (Mouta et al. 2021). This raises the question of how to effectively integrate ecological niche models and machine learning for classification over such a large area.

The development of remote sensing cloud computing technology and the emergence of cloud platforms have brought new solutions to these problems. Google Earth Engine (GEE) is a cloud computing platform whose powerful computing capabilities have changed the traditional paradigm of remote sensing data processing and analysis (Tamiminia et al. 2020). Operations such as cloud masking, image composition within time windows, and time series analysis are easily implemented on GEE. The GEE platform also makes it convenient for users to chain multiple machine learning classifiers and ecological niche models into a processing pipeline. In addition, spatial partitioning has been shown to improve classification accuracy for large areas with high environmental heterogeneity (Costa et al. 2022). Introducing an eco-geographical regionalization strategy into the classification process is therefore more precise and reasonable (Moraes et al. 2021).

The rapid development of multimodal sensors and the technical revolution in processing platforms and methods open the door to classifying tree species over large areas. However, analyzing the data at the desired accuracy raises new challenges. In this context, we aim to build a classification chain for accurately mapping forest stand species across the whole of Yunnan Province by exploring floristic regionalization, feature-level and decision-level fusion strategies, and their combinations. The objectives of this study were (i) to evaluate the effectiveness of floristic regionalization for forest stand tree species classification, (ii) to explore the significant features for separating tree species, and (iii) to develop an effective method for forest stand species classification by integrating an ecological niche model and machine learning.

2. Materials

2.1. Study area

Yunnan is the most southwestern province of China (Figure 1a) and is situated in a mountainous area. It has significant environmental heterogeneity, with diverse topography, landforms, and climate. As shown in Figure 1b, the terrain of Yunnan Province is high in the northwest and low in the southeast and descends in steps, with elevation ranging from 6740 m at the highest mountain top in the northwest to 76.4 m at the lowest valley bottom in the southeast (Yang et al. 2004). The territory contains a variety of landform types, including high mountains, hills, intermountain basins, river valleys, and karsts. Yunnan has the characteristics of low latitude, monsoon, and mountain plain climates and crosses seven climate zones: northern tropical, southern subtropical, middle subtropical, northern subtropical, southern temperate, middle temperate, and highland climate.

As shown in Figure 1a, Yunnan is bordered by the southeastern edge of the Tibetan Plateau to the northwest and by Southeast Asian countries to the west and south, which makes it a major meeting place of floras and formations (Li and D 1986). Its northwestern part is influenced by the Himalayan vegetation zone and the ancient southern continental flora, while the eastern part has numerous central and southern Chinese floristic elements, and the northern part has Emei, Qinling, and northwestern floristic elements (Li and Pei 1991). The flora of southern Yunnan is more closely related to the Indo-Malaysian flora, while the flora of southeastern Yunnan is more closely associated with the East Asian flora (Hua and Heil 2017). Geological activity and diverse topography and climate have not only forged Yunnan's unique phytogeographic divergence and vegetation geography but also made it a well-known global biodiversity hotspot (Liu et al. 2021).

2.2. Multi-source data

This study used four types of data, Sentinel-2 images, WorldClim data, Digital Elevation Model (DEM) data and Forest Management Inventory data (FMI), to perform tree species classification.

2.2.1. Sentinel-2 data

High-quality images with low cloud cover are often difficult to obtain in Yunnan because of persistent cloudiness. We counted the number of cloud-free observations in three periods. As shown in Figure 2, some pixels had no valid observations at all throughout 2016. In contrast, the number of qualified observations exceeds 10 over most tiles from 7/2015 to 12/2017 and exceeds 50 from 2016 to 2020. Therefore, to obtain high-quality images and extract dense time-series features, we used 31,440 top-of-atmosphere (TOA) images from 72 MGRS tiles between 7/2015 and 12/2020 as input data. The images from 7/2015 to 12/2017 were used to extract spectral and texture features, and the images from 2016 to 2020 were used to construct the time series data. The temporal mismatch between the extracted features and the year of the FMI data may affect the reliability of the classification. This issue is discussed in Section 5.

Figure 2. Valid observations of Sentinel-2 images in different periods. (a) 2016.1 to 2016.12; (b) 2015.7 to 2017.12; (c) 2016.1 to 2020.12.

2.2.2. Bioclimatic and topographic data

The distribution of tree species on the earth's surface is neither random nor uniform, but geographically specific and characterized by specific environmental and climatic factors (Zeb et al. 2021). Topographic and climatic heterogeneity has an important impact on tree species distribution. Therefore, bioclimatic and topographic factors were introduced into the classification.

Bioclimatic data with a spatial resolution of 1 km were obtained from the WorldClim website (https://www.worldclim.org/data/worldclim21). These data consist of 19 bioclimatic variables with biological significance, such as mean annual temperature, annual precipitation, annual temperature range, and precipitation during the rainy and dry seasons (Hijmans et al. 2005). Topographic data were obtained from the Shuttle Radar Topography Mission (SRTM) Digital Elevation Model (DEM).

2.2.3. Forest Management Inventory data

The 2016 Forest Management Inventory (FMI) data of China were employed as the reference data for our study. The FMI aims to determine the attributes of each forest stand, including location, natural conditions, tree species composition, and distribution. FMI data are officially produced by local forestry and grassland administrations under the direction of the national administration of China. Because the data have been calibrated with forest sample plots to control inventory errors, they provide a reliable reference for the different tree species.

Yunnan has about 18,000 plant species (Liu and Peng 2016), and it is impossible to identify all of them using remote sensing imagery. Based on the FMI, we calculated the distribution area of each forest stand type in each sub-region and selected the top 9 species in each sub-region, which resulted in 19 species in total (Table 1).

Table 1. Observed class and its acronym.


3. Methodology

This paper proposes a classification framework combining machine learning and ecological niche models to determine the spatial distribution of the major forest stand species in Yunnan Province. Figure 3 illustrates the whole workflow, which includes seven steps: (1) Spatial division; (2) Data pre-processing; (3) Feature extraction; (4) Feature reduction; (5) Classification with component classifiers; (6) Classifier fusion ensemble; and (7) Accuracy evaluation and mapping.

Figure 3. Technical workflow of the study.

3.1. Floristic regionalization

We divided Yunnan Province into eight sub-regions based on the Yunnan flora system (Li et al. 2015). This system was produced by combining the distributions and phylogenetic relationships of 1983 genera of seed plants in Yunnan. As shown in Figure 1c, the eight sub-regions were used as the basic units for independent species mapping. Feature screening, model training, and classification were performed independently for each sub-region. The goal is to adapt the classification and post-classification analysis to local conditions and tree species, rather than using one uniform feature set across the whole study area.

3.2. Data pre-processing

3.2.1. Sentinel-2 and topographic variables

We conducted the following steps on the acquired Sentinel-2 images to obtain a cloud-free, seamless image composite. First, pixels obscured by clouds were removed based on the adjusted cloud score algorithm (Oreopoulos, Wilson, and Várnai 2011). Second, the cloud-masked images from 7/2015 to 12/2017 were composited using the median reducer in GEE. The composite image was used to extract spectral, vegetation index, and texture features. Third, we constructed the NDVI and Red-Edge Position Index (REP) time series at a 5-day interval, using Sentinel-2 images from 1/2016 to 12/2020. We adopted the median of all high-quality observations available within each 5-day window as the observation value. When no good-quality observations were available within a 5-day window, we used linear interpolation to fill the gap. Although these steps address missing pixel values, a significant amount of noise remains in the series, which hinders further analysis and use of the data. To mitigate noise, we applied a Savitzky-Golay (SG) filter to smooth the NDVI and REP time series, using a moving window of size 9 and a polynomial order of 2.
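The time-series steps above (5-day median compositing, linear gap filling, and SG smoothing with a window of 9 and order 2) can be sketched in plain Python as follows. This is a minimal illustration for a single pixel, not the GEE implementation used in the study. The function names, the pooling of observations by day of year, and the edge handling (repeating the first or last value) are assumptions made for the example.

```python
from statistics import median


def composite_5day(observations, n_windows=73):
    """Median of (day_of_year, value) observations per 5-day window.

    Windows without any valid observation are returned as None.
    """
    bins = [[] for _ in range(n_windows)]
    for doy, value in observations:
        idx = min((doy - 1) // 5, n_windows - 1)  # day 366 joins the last window
        bins[idx].append(value)
    return [median(b) if b else None for b in bins]


def fill_gaps(values):
    """Linearly interpolate None entries; hold the nearest value at the edges."""
    valid = [i for i, v in enumerate(values) if v is not None]
    if not valid:
        raise ValueError("no valid observations in the series")
    filled = list(values)
    for i, v in enumerate(values):
        if v is not None:
            continue
        left = max((j for j in valid if j < i), default=None)
        right = min((j for j in valid if j > i), default=None)
        if left is None:
            filled[i] = values[right]
        elif right is None:
            filled[i] = values[left]
        else:
            t = (i - left) / (right - left)
            filled[i] = values[left] + t * (values[right] - values[left])
    return filled


def sg_coefficients(half_width=4):
    """Closed-form SG smoothing weights for a quadratic fit (window = 2m + 1)."""
    m = half_width
    denom = (2 * m - 1) * (2 * m + 1) * (2 * m + 3)
    return [(3 * (3 * m * m + 3 * m - 1) - 15 * k * k) / denom
            for k in range(-m, m + 1)]


def sg_smooth(series, half_width=4):
    """Apply the SG filter; edges are padded by repeating the end values."""
    coeffs = sg_coefficients(half_width)
    n = len(series)
    smoothed = []
    for i in range(n):
        acc = 0.0
        for k, c in zip(range(-half_width, half_width + 1), coeffs):
            j = min(max(i + k, 0), n - 1)  # clamp index at the series edges
            acc += c * series[j]
        smoothed.append(acc)
    return smoothed
```

A typical call chain would be `sg_smooth(fill_gaps(composite_5day(obs)))`, which yields 73 smoothed values per index and per pixel.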

3.2.2. Reference data

The quality and quantity of the training samples are critical for training models. FMI data were filtered to obtain accurate and representative samples. To obtain samples for forest and non-forest classification, we conducted the following three steps. First, the plots of forest (pure forest, shrub economic forest, mixed forest, other shrub woodlands, other special shrubs, economic forest, open woodland, bamboo forest) and non-forest (construction land, watershed, cropland, grassland, and other non-forest lands) were selected from the FMI data. Second, the selected candidates were further purified according to the NDVI standard deviation calculated from the pixels in each candidate plot. Candidates with a high standard deviation were removed, and the centroids of the remaining plots were selected as sample points. Finally, the remaining samples were further purified by visually inspecting the sample points overlaid on the Sentinel-2 median composite image.

Similar steps were employed to produce tree species samples. First, plots in which the dominant species accounts for more than 65% of the total stock volume were deemed pure and retained. Second, the top 9 tree species in terms of coverage in each sub-region were determined from the FMI, and the corresponding plots were selected from the pure plots of each sub-region. Third, with each plot as a basic unit, the standard deviations of the Blue, NIR, SWIR, NDVI, and Greenness bands and the plot area were calculated, and plots with low standard deviations and large areas were retained. Fourth, the centroids of the retained sub-compartments were selected as candidate sample points and finally purified by visually inspecting the sample points overlaid on the image composite.
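The first two filtering steps can be illustrated with a short Python sketch. It keeps plots whose dominant species exceeds 65% of the total stock volume and whose dominant species belongs to a given list of target species. The plot record layout (`id`, `volumes`) and the function names are hypothetical, and the later steps (standard deviation, area, and visual checks) are not covered here.

```python
def dominant_share(stock_volume_by_species):
    """Return (species, share) for the species with the largest stock volume."""
    total = sum(stock_volume_by_species.values())
    if total <= 0:
        return None, 0.0
    species = max(stock_volume_by_species, key=stock_volume_by_species.get)
    return species, stock_volume_by_species[species] / total


def select_pure_plots(plots, target_species, threshold=0.65):
    """Keep plots whose dominant species exceeds the purity threshold.

    `plots` is a list of dicts with hypothetical keys 'id' and 'volumes'.
    Returns (plot id, dominant species) pairs for the retained plots.
    """
    selected = []
    for plot in plots:
        species, share = dominant_share(plot["volumes"])
        if species in target_species and share > threshold:
            selected.append((plot["id"], species))
    return selected
```

For example, a plot with volumes `{"pine": 70, "oak": 30}` passes the 65% rule, while one with `{"pine": 60, "oak": 40}` does not.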

The samples of each category were divided into a training set (75%) and a validation set (25%). Figure 4 shows the numbers of training and validation samples.
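A per-class 75/25 split of this kind can be sketched as below. The source does not state how samples were assigned to the two sets, so the random shuffling, the fixed seed, and the rounding rule are assumptions made for illustration.

```python
import random
from collections import defaultdict


def stratified_split(samples, train_fraction=0.75, seed=0):
    """Split (sample_id, label) pairs into training and validation sets per class."""
    by_class = defaultdict(list)
    for sample_id, label in samples:
        by_class[label].append(sample_id)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, valid = [], []
    for label in sorted(by_class):
        ids = by_class[label]
        rng.shuffle(ids)
        cut = round(len(ids) * train_fraction)
        train += [(i, label) for i in ids[:cut]]
        valid += [(i, label) for i in ids[cut:]]
    return train, valid
```

Splitting within each class keeps the class proportions of the training and validation sets close to those of the full sample set.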

Figure 4. The number of samples for each observed class.

3.3. Feature pools and feature reduction

3.3.1. Mining features from multiple data

Tree species differ in canopy shape, phenology, and habitat. To accurately distinguish forest from non-forest and identify species, this paper extracts various features selected through a comprehensive literature review, including spectral bands, texture features, vegetation indices, and other environmental features.

Ten bands (Blue, Green, Red, RedEdge1, RedEdge2, RedEdge3, NIR, Narrow NIR, SWIR1, SWIR2) were selected from the S2 image composite. Based on the 10 bands, the Greenness and Wetness components of the Tasseled Cap Transform were also calculated, as well as 16 vegetation indices, including Triangular Vegetation Index (TVI), MERIS Terrestrial Chlorophyll Index (MTCI), Normalized Burn Ratio (NBR), Normalized Difference Built-up Index (NDBI), Inverted Red-Edge Chlorophyll Index (IRECI), Modified Chlorophyll Absorption Ratio Index (MCARI), Land Surface Water Index (LSWI), Normalized Difference 750/705 Chl NDI (Chl NDI), Normalized Difference Red-Edge Index (NDRE), Red-Edge Position Index (REP), Normalized Difference Vegetation Index (NDVI), Normalized Difference Senescent Vegetation Index (NDSVI), Normalized Difference Tillage Index (NDTI), Normalized Difference Water Index (NDWI), and Red Edge Chlorophyll Index (CIrededge). The formulas for the vegetation indices are shown in Table 2.
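Several of the indices listed above share the normalized-difference form. The sketch below shows that form for NDVI and NBR using their commonly cited definitions. The band pairings are the widely used ones and are stated here as assumptions, since the study's own formula table is not reproduced in this text. Returning `None` for a zero denominator is also an illustrative choice.

```python
def normalized_difference(a, b):
    """Generic (a - b) / (a + b); returns None when the denominator is zero."""
    denom = a + b
    if denom == 0:
        return None
    return (a - b) / denom


def ndvi(nir, red):
    """Normalized Difference Vegetation Index from NIR and Red reflectance."""
    return normalized_difference(nir, red)


def nbr(nir, swir2):
    """Normalized Burn Ratio from NIR and SWIR2 reflectance."""
    return normalized_difference(nir, swir2)
```

For instance, a pixel with NIR reflectance 0.5 and Red reflectance 0.1 gives an NDVI of 0.4 / 0.6, roughly 0.67.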

Table 2. The vegetation indices and corresponding formulas used in this study.


In addition, we computed six texture features (Sum Average, Correlation, Variance, Dissimilarity, Contrast, and Cluster shade) for each of the 10 spectral bands and the NDVI, so there are 66 texture features in total. Five-day interval time series data indices were computed from both NDVI and REP indices, resulting in a feature set with 146 bands (73 × 2). We also obtained 19 dimensions of bioclimatic variables from the WorldClim data and 3 topographic factors, namely Elevation, Slope, and Aspect, from the SRTM data using the GEE built-in functions.

Based on the basic features mentioned above, two feature pools were constructed. As shown in Table 3, the first feature pool consists of 37-dimensional features for forest and non-forest classification. The second feature pool consists of 257-dimensional features (Table 4) for forest stand species classification. Among them, 22-dimensional environmental features, including 19-dimensional bioclimatic features and 3-dimensional topographic features, were used as input features for MaxEnt; 237-dimensional feature sets, named the ALL Feature Set, were selected as inputs to the three machine learning models, SVM, RF, and XGBoost.

Table 3. The feature candidates for the forest mask generation.


Table 4. The feature candidates for forest stand species classification.


3.3.2. Feature selection

The number and quality of features largely determine the algorithm’s efficiency. Selecting features with optimal separability is crucial for computational efficiency and classification performance. Thus, we optimized features before classification.

For the MaxEnt model, the factors were evaluated and selected in the following three steps. (1) Spearman correlation analysis was performed on the variables. (2) Pre-training was performed using all environmental variables, and the importance of each variable was evaluated using the Jackknife test. (3) Among highly correlated variables (absolute value of correlation ≥0.8), the variable that was most significant in the Jackknife test was selected (i.e. the model generated from all variables except this variable and the model generated using only this variable had the smallest difference in regularized training gain). Highly correlated and redundant variables were removed, and the environmental factors that dominate the prediction were finally identified. The selected features constitute the optimal feature set for the MaxEnt prediction model and are called Feature Set 1 in the following sections.
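The correlation screening in steps (1) and (3) can be sketched as follows. The example computes Spearman's rank correlation in plain Python and then keeps, among variables with |ρ| ≥ 0.8, the one with the highest importance score. The importance values stand in for the Jackknife results and are hypothetical inputs, and the greedy rule (rank by importance, keep a variable only if it is not highly correlated with one already kept) is a simplification assumed for this illustration.

```python
def ranks(values):
    """Average ranks (1-based), with ties sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation as the Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)  # undefined for constant variables


def prune_correlated(variables, importance, threshold=0.8):
    """Keep the most important variable of each highly correlated group."""
    kept = []
    for name in sorted(variables, key=lambda v: importance[v], reverse=True):
        if all(abs(spearman(variables[name], variables[k])) < threshold
               for k in kept):
            kept.append(name)
    return kept
```

Here `variables` maps each environmental variable name to its values at the sample points, and the returned list plays the role of Feature Set 1 for one sub-region.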

For the SVM, RF, and XGBoost classifiers, the relevance hierarchical clustering method (You et al. 2021) was used to select the best input features. This feature selection process is performed in four steps. First, all features are normalized to a distribution with a mean of 0 and a standard deviation of 1. Second, pre-training is performed to determine the number of feature dimensions at which the accuracy saturates; feature importance is evaluated by the Gini importance (mean decrease in impurity) calculated by the RF classifier, and features are ranked by importance score. Third, Spearman correlation analysis is performed on the most important features. Fourth, these features are grouped into clusters using the maximum depth threshold, and the feature with the highest importance score in each cluster is retained. In this study, the maximum depth threshold is set to 0.5 to preserve important features with high separability and low mutual correlation. The optimal features selected for stand tree species classification are referred to as Feature Set 2 in the following sections, and the best features for forest and non-forest classification are referred to as Feature Set 3.
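The normalization and clustering steps can be illustrated with the sketch below. It is not the method of You et al. It assumes single-linkage clustering on the distance 1 − |ρ|, cut at 0.5, which is equivalent to taking connected components of the graph linking features with |ρ| ≥ 0.5. The linkage type, the reading of the 0.5 threshold as a distance cut, and the precomputed inputs (`abs_corr`, `importance`) are all assumptions made for the example.

```python
from statistics import mean, pstdev


def zscore(values):
    """Rescale a feature to mean 0 and standard deviation 1."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def select_by_clusters(names, abs_corr, importance, distance_cut=0.5):
    """Keep the most important feature of each correlation cluster.

    `abs_corr` maps frozenset({a, b}) to |Spearman rho| for each feature pair.
    Features with distance 1 - |rho| <= distance_cut are merged (single linkage).
    """
    parent = {n: n for n in names}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path compression
            n = parent[n]
        return n

    for idx, a in enumerate(names):
        for b in names[idx + 1:]:
            if 1.0 - abs_corr[frozenset((a, b))] <= distance_cut:
                parent[find(a)] = find(b)

    best = {}
    for n in names:
        root = find(n)
        if root not in best or importance[n] > importance[best[root]]:
            best[root] = n
    return sorted(best.values(), key=lambda n: importance[n], reverse=True)
```

In this sketch, `importance` would hold the RF Gini importance scores from the pre-training step, and the returned list corresponds to the retained features of one sub-region.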

It is worth noting that, because the selection process described above was run independently in each floristic region, Feature Set 1, Feature Set 2, and Feature Set 3 may differ from region to region.

3.4. Classification with component classifiers

MaxEnt derives the constraints on species distribution and then seeks the possible distribution with maximum entropy. When the entropy reaches its maximum, the species occurrence probability distribution is closest to the true distribution of the species (Phillips, Anderson, and Schapire 2006). The prediction results of MaxEnt are expressed as the probability of species occurrence: the larger the value, the more suitable the environmental conditions are for the species. Due to their superior and stable classification performance, three machine learning classifiers, RF, SVM, and XGBoost, are among the most popular and widely used in various applications (Abdi 2020; Grabska, Frantz, and Ostapowicz 2020).

The four component models were trained and validated using the same samples, and the hyperparameters of each classifier were tuned empirically. The MaxEnt model was run on a local computer using the Maxent software (version 3.4.1), while the three machine learning classifiers were implemented on the GEE platform.

3.5. Classification by multi-classifier fusion

After the classification results of the four component classifiers are obtained, we construct two paradigms that fuse MaxEnt and the three machine learning classifiers in parallel and in series, respectively. The integration schematic of the two paradigms is shown in Figure 5.

Figure 5. Schematic diagram of two decision fusion strategies.

The first paradigm is called the parallel ensemble model. Classification is performed by the four component classifiers in parallel, followed by decision fusion based on a weighted voting strategy. The final fusion objective function is defined in Equation (1).

$$Y^{*} = \underset{Y}{\arg\max} \left( \sum_{i \in S} \sum_{j = 1}^{4} \log \left( w_{j} \, l_{i, y_{i}}^{Md_{j}} \right) \right) \tag{1}$$

where $S$ is the set of pixel locations, $l_{i, y_{i}}^{Md_{j}}$ denotes the probability predicted by model $j$ that pixel $i$ belongs to a given class $y_{i}$, and $w_{j}$ is a weight measuring the confidence of model $j$. Intuitively, the setting of $w_{j}$ is related to the training performance of the models. Therefore, the weight of each component classifier is calculated from the training accuracies as $w_{j} = \frac{Acc(C_{j})}{\sum_{k = 1}^{4} Acc(C_{k})}$, so that $\sum_{j} w_{j} = 1$.
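A per-pixel reading of Equation (1) can be sketched in Python as below. Because the objective contains no term linking neighbouring pixels, the maximization is assumed here to decompose into an independent arg max for each pixel. The function names and the small `eps` floor that guards against `log(0)` are assumptions introduced for the example.

```python
import math


def accuracy_weights(accuracies):
    """Normalize training accuracies so that the model weights sum to 1."""
    total = sum(accuracies)
    return [a / total for a in accuracies]


def fuse_pixel(probabilities_by_model, weights, eps=1e-12):
    """Return (best class index, class scores) for one pixel.

    `probabilities_by_model[j][c]` is the probability of class c from model j.
    The score of class c is the sum over models of log(w_j * p_jc).
    """
    n_classes = len(probabilities_by_model[0])
    scores = []
    for c in range(n_classes):
        score = sum(math.log(w * max(p[c], eps))
                    for w, p in zip(weights, probabilities_by_model))
        scores.append(score)
    best = max(range(n_classes), key=scores.__getitem__)
    return best, scores
```

With the objective written exactly as in Equation (1), each $\log w_{j}$ adds the same constant to every class score, so in this sketch the weights shift the scores but do not change which class attains the maximum.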

The second integration paradigm is a serial mode, in which the output of the MaxEnt classifier, in the form of class-specific probabilities, is stacked with Feature Set 2 and fed into each of the three machine learning classifiers. In this process, the probability of each class is treated as a feature.
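The stacking step of the serial paradigm amounts to appending the MaxEnt class probabilities to each sample's feature vector, as in the minimal sketch below. The function name and the list-based data layout are assumptions, and the subsequent training of RF, SVM, or XGBoost on the stacked features is not shown.

```python
def stack_features(feature_rows, maxent_probability_rows):
    """Append per-class MaxEnt probabilities to each sample's features."""
    if len(feature_rows) != len(maxent_probability_rows):
        raise ValueError("feature and probability rows must align by sample")
    return [list(features) + list(probabilities)
            for features, probabilities in zip(feature_rows,
                                               maxent_probability_rows)]
```

If Feature Set 2 has n dimensions and a sub-region has 9 target species, each stacked sample in this sketch has n + 9 dimensions.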

4. Results

We implemented the proposed classification scheme in Yunnan Province. First, we divided the province into eight sub-regions. Then, the best classification features were selected for each sub-region from the candidate features. Finally, two levels of classification, forest cover and tree species, were executed in each sub-region. This section describes the experimental setting and the mapping results for forest cover and tree species.

4.1. Experimental setting and accuracy assessment

Many studies have shown that forest-type classification can often be achieved with high accuracy (Grabska et al. 2019; Li et al. 2022). Therefore, the experiments focus on analyzing the performance of species-level classification. As shown in Table 5, a total of 11 feature-classifier combinations are compared to analyze the importance of features and the effectiveness of the proposed fusion strategies. Moreover, visual inspection and quantitative metrics, including the confusion matrix and the derived Overall Accuracy (OA) and Kappa, are employed to assess classification performance.
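Both metrics can be derived from a confusion matrix as in the short sketch below, which uses the standard definitions of overall accuracy and Cohen's Kappa. The function name and the nested-list matrix layout (rows as reference classes, columns as predicted classes) are assumptions for the example.

```python
def overall_accuracy_and_kappa(matrix):
    """Compute OA and Cohen's Kappa from a square confusion matrix."""
    size = len(matrix)
    n = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(size)) / n
    # Chance agreement from the products of row and column totals.
    expected = sum(
        sum(matrix[i]) * sum(row[i] for row in matrix) for i in range(size)
    ) / (n * n)
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa
```

For a two-class matrix `[[40, 10], [5, 45]]`, the observed agreement is 0.85 and the chance agreement is 0.5, which gives a Kappa of 0.7.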

Table 5. Feature-classifier combinations explored in this study.

4.2. Forest cover mapping

Figure 6 shows the number of selected features and the classification accuracy in each sub-region for both the forest cover and tree species levels. Depending on the spatial location, the number of features used for forest classification ranged from 8 to 11, and the number used for tree species classification ranged from 26 to 41. The OA ranged from 93.91% to 98.06% at the forest level and from 70.12% to 75.24% at the tree species level.

Figure 6. The numbers of optimal features and overall accuracies in the 8 sub-regions. Darker red tones mean higher accuracy for species classification.

The forest/non-forest classification was performed first. Because classification errors at this level propagate to the species-level classification, the accuracy should be as high as possible. In this study, the OA is 96.88% and the Kappa coefficient is 0.96, according to 27,296 reference samples. The classification results, shown in Figure 7, are highly consistent with the forest distribution in the field. Therefore, this result can be used to produce the forest mask.

Figure 7. Forest/non-forest classification result of Yunnan province.

4.3. Result at species level

In each sub-region, the accuracy of the 11 classification schemes was evaluated on the validation samples. The OAs of each sub-region are shown in Table 6.

Table 6. Accuracy assessment for different classification scenarios in each sub-region. The highest accuracies over all sub-regions are in bold.

The accuracies of the seven component classifiers do not exceed 67.40% in any of the eight sub-regions, which may be insufficient to support further analysis and decision-making. In contrast, the integrated models significantly improve the classification accuracy. Among them, the average accuracy of MXSR over the eight regions is 69.51%, and the average accuracy of the three serial integration models, MS, MX, and MR, is higher than 71.2%. Overall, the integrated classifiers perform much better than the component classifiers, and the three serial ensembles are superior to the parallel ensemble.

By mosaicking the MR classification results of the eight sub-regions, we obtained the spatial distribution map of the 19 target tree species in Yunnan Province at 10 m resolution (Figure 8). The insets detail the spatial distribution of tree species at the selected sites. The accuracy was evaluated with 53,934 validation sample points; the overall accuracy was 72.18%, with a Kappa coefficient of 0.69. The heat map of the confusion matrix is given in Figure 9, which shows that four classes, Larix gmelinii, Hevea brasiliensis, Abies fabri, and Juglans regia, achieved high accuracies, while Alnus cremastogyne Burk., Quercus L., and Sassafras tsumu were classified with low accuracies.

Figure 8. The spatial details of the forest stand species map in Yunnan Province in 2016. Site a (103.279°E, 26.851°N); Site b (103.626°E, 25.910°N); Site c (103.031°E, 25.434°N); Site d (104.568°E, 23.026°N); Site e (101.347°E, 21.946°N); Site f (99.173°E, 25.318°N); Site g (99.688°E, 27.743°N).

Figure 9. Confusion matrix heatmap of tree species classification.

5. Discussion

We divided Yunnan into eight sub-regions, extracted features from S2, SRTM, and WorldClim data, and used four component classifiers to map the spatial distribution of forest stand species. This section discusses the impacts of floristic regionalization, the multiple data sources, and the component classifiers and integrated models on classification performance.

5.1. The necessity of floristic regionalization

Yunnan Province contains a variety of landform types, including high mountains, hills, intermountain basins, river valleys, and karsts, as well as seven climate types and seven vegetation types. The physical differences in climate and biota between the north and south of the province are equivalent to the differences between Hainan Island, China and Changchun, Northeast China (Yang et al. 2004). Spatial zoning is a practical solution for large areas with different growth conditions and complex tree species composition.

In summary, floristic regionalization brings four clear strengths. First, zoning according to vegetation zones creates sub-regions with relatively uniform ecological and spectral characteristics. Compared to the whole area, a sub-region has a higher consistency of geological history, geomorphology, climate, and vegetation composition, which reduces the spatial and temporal variability in the spectral and phenological response of the tree species (Cano et al. 2017). Second, each partition has a different species composition, and feature selection was performed independently for each partition, which avoids classifying a very large area with a single feature combination. Third, the partitioning increases the proportion of samples and alleviates the spatial and quantitative imbalance of samples to some extent. Fourth, classifying the eight regions sequentially on the GEE platform also reduces the computational burden and the demand on storage.

5.2. Variable importance assessment

5.2.1. The importance of Sentinel-2 data in stand species classification

The mean spectral reflectance of the 19 tree species, extracted from median composite images from July 2015 to December 2017, is shown in Figure 10. Although S2 provides more detailed spectral and spatial information than Landsat images, it is still insufficient to distinguish between tree species. This conclusion is also supported by the OAs: the classification accuracy using the features from S2 failed to reach 70%.

Figure 10. The mean spectral reflectance of 19 tree species, calculated from the Sentinel-2 bands.

The reflectance differences among the species are larger in the Red-Edge to SWIR1 bands than in other spectral bands. In this spectral range, the reflectance of broadleaf trees was higher than that of conifers: Betula L. and Hevea brasiliensis had higher values, while Picea asperata Mast., Abies fabri, and Larix gmelinii had lower reflectance. The reflectance of Sassafras tsumu and Larix gmelinii also differed significantly from that of the other species in this range, indicating the usefulness of these bands in distinguishing tree species. However, as expected, the reflectance differences between coniferous and broadleaf trees are much smaller in the other bands, and there is a large overlap between the different tree species.

To evaluate the contribution of texture and time-series features to classification, based on three feature combinations, we assessed the separability between tree species. Three feature combinations include (a) raw bands and vegetation indices, (b) spectral features and texture features, and (c) spectral features and time series features.

The Jeffries-Matusita (JM) distance (Ma et al. 2021) between tree species is calculated and shown in Figure 11. Higher JM values indicate greater separability of two classes. In this study, the JM distances are grouped into four classes: strongly separable (1.9–2.0), better separable (1.8–1.9), weakly separable (1.7–1.8), and poorly separable (<1.7). The JM distance between two categories is expected to be greater than 1.8 for a satisfactory classification. Several species do not co-exist in the same sub-region, so the JM distances between them were not calculated.
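To make the separability measure concrete, the sketch below computes a JM distance as $JM = 2(1 - e^{-B})$, where $B$ is the Bhattacharyya distance between two classes. It rests on assumptions that are not taken from this study: each class is modeled as Gaussian with a diagonal covariance matrix (so $B$ is a sum of per-feature terms), every feature has non-zero variance, and the lower bound of each separability class is treated as inclusive. The sample values are made up.

```python
import math
from statistics import mean, pvariance


def bhattacharyya_diag(x1, x2):
    """Bhattacharyya distance for Gaussian classes with diagonal covariance.

    x1, x2: lists of samples; each sample is a list of feature values.
    """
    b = 0.0
    for k in range(len(x1[0])):
        a = [s[k] for s in x1]
        c = [s[k] for s in x2]
        m1, m2 = mean(a), mean(c)
        v1, v2 = pvariance(a), pvariance(c)
        b += (m1 - m2) ** 2 / (4 * (v1 + v2))
        b += 0.5 * math.log((v1 + v2) / (2 * math.sqrt(v1 * v2)))
    return b


def jm_distance(x1, x2):
    """Jeffries-Matusita distance, bounded between 0 and 2."""
    return 2 * (1 - math.exp(-bhattacharyya_diag(x1, x2)))


def separability_class(jm):
    """Map a JM distance to the four classes used in the text."""
    if jm >= 1.9:
        return "strongly separable"
    if jm >= 1.8:
        return "better separable"
    if jm >= 1.7:
        return "weakly separable"
    return "poorly separable"


# Hypothetical one-feature samples for two species.
species_a = [[0.0], [2.0]]
species_b = [[9.0], [11.0]]
jm = jm_distance(species_a, species_b)
label = separability_class(jm)
```

With full covariance matrices, the per-feature sum would be replaced by the general multivariate form of the Bhattacharyya distance.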

Figure 11. JM distances between species with (a) spectral bands and vegetation indices shown in Table 4, (b) spectral features and texture features, and (c) spectral features and time series features.

In Scheme a, the separability of most tree species pairs is poor due to the high similarity in the spectral reflectance of the tree species. In Scheme b, separability between species improved significantly with the inclusion of textural features, but sassafras was still difficult to discriminate from Sassafras tsumu, Pinus yunnanensis, and Alnus cremastogyne Burk. In Scheme c, the JM distances between species all exceeded 1.9, suggesting that the time series features can increase the separability of species.

5.2.2. The importance of environmental data in stand species classification

Regional arboreal species composition can be largely explained in terms of a long history of biogeographic and evolutionary events. Remote sensing-based tree species classification usually concentrates on data and methods, often ignoring the rich context that environmental data can provide.

In our study, the most important variable for stand species classification is elevation, which ranked highest in the quantitative importance scores in all eight sub-regions. By overlaying the classification results on the elevation map, we obtained the elevation percentage map of each tree species (Figure 12), which reflects the differences in the elevation distribution of the species. For example, Larix gmelinii and Abies fabri are distributed above 3200 m, while Hevea brasiliensis and Sassafras tsumu occur at lower elevations, below 1500 m. Numerous studies confirm that topographic features facilitate tree species classification, especially for species that follow a natural height gradient (Grabska, Frantz, and Ostapowicz 2020).

Figure 12. Distribution map of forest stand species in the altitude range.

Each species has its own preferred temperature range, acceptable temperature extremes, and rainfall. A large body of literature indicates that climate conditions are the key factor determining the large-scale distribution patterns of organisms (Datta, Schweiger, and Kühn 2020). Our study uses environmental factors and the MaxEnt model to assign a probability of tree species occurrence to each geographical location. Adding these probability layers, in two forms, to the remote sensing image classification process can improve classification accuracy. The constraint of environmental factors on species distribution increases classification performance. The classification accuracy of the categories with stronger environmental constraints is also higher, such as Hevea brasiliensis, Abies fabri, and Larix gmelinii. Other species may be excluded from the growing areas of these species because of their lower environmental tolerance, which makes the stands of these species more homogeneous. However, in our study, the range of some tree species was over-predicted. The climate variables used suggest that these tree species should have a wider geographical range than they do in reality, which may be because the climate factors we used are not the main limiting factors for the geographical range of these species (Wiens and Graham 2005). In addition, the environmental factors used have a resolution of 1 km, and this coarse resolution ignores the micro-scale processes that affect tree species locally. Many tree species require specific small-scale habitat attributes (Sinclair, White, and Newell 2010). We believe that future research needs to consider more ecological and geographical factors, such as those in the CHELSA dataset, which contains explicit indices for many ecological and physiological processes. Our test results in local areas show that this dataset can provide more accurate distribution ranges. Of course, to generate high-precision spatiotemporal microclimate data, it may be necessary to combine global climate datasets with in situ microclimate measurements, long-term meteorological station data, and high-resolution remote sensing data (Bobrowski, Weidinger, and Schickhoff 2021).

Table 7. Comparison of classification accuracies of different methods in each sub-region.

5.3. Species mapping performance by fusing multiple models

To evaluate the mapping performance of the MR serial model, we compare the classification performance of the M, ROPT, and MR models in this section. The MR method achieves the best accuracy in the vast majority of sub-regions (7 out of 8), which demonstrates the advantage of the MR integrated model. The accuracy difference between MR and M across the eight sub-regions ranges from 21.00% to 31.06%, with an average difference of 25.39%. The difference in accuracy between MR and ROPT is 11.63% at the maximum and 5.47% at the minimum. Although the M classifier obtained the lowest OA in each partition, adding its classification results as features to ROPT produced the MR classifier, which outperforms the component classifiers. This large advantage in classification accuracy demonstrates the effectiveness and potential of the MR integrated model for mapping tree species in forest stands. In addition, the classification accuracy of the MR serial integration method was higher than 70% in all 8 partitions, which have different ecological conditions, indicating that the method can be used for stand tree species mapping under diverse ecological conditions with good stability.

Figure 13 shows the S2 false color images (the combination of bands 3, 4, and 8), the FMI data, and the classification results of the three methods for four selected regions for visual evaluation. The classification results of the different classifiers differ greatly. The M classifier, which uses low-resolution environmental data, performed poorly: the predicted geographical extent of each tree species was larger than its actual distribution area, which is consistent with the conclusions of previous studies (Chandra et al. 2021; Pshegusov et al. 2022). The RF model, which uses higher-resolution remote sensing data, provides finer details than the M model. However, there are still many misclassifications and considerable noise due to the low separability of tree species. The MR model, which integrates the two, alleviates the problems of both: it achieves the best classification result, and its results are more consistent with the FMI data, because it takes advantage of different data and classifiers. Environmental data provide a geographical representation of the ecological niche of tree species, and remote sensing images provide macroscopic and fine-scale spectral, textural, and phenological differences between tree species. Orchestrating the interplay of various data and theories with modeling has been identified as a promising approach to obtaining species information (Maréchaux et al. 2021). Studies on species distribution and invasions (Engler et al. 2013; Kattenborn et al. 2019) also confirm that using integrated datasets and combining ecological niche models with machine learning algorithms can improve classification accuracy.

Figure 13. The spatial details of classification results in forest stand species. (a) Original image, (b) the Forest Management Inventory data, (c) M classification, (d) ROPT classification, and (e) MR classification.

In addition, we collected typical tree species classification cases from recent years and compared their results with ours. A brief overview is shown in Figure 14, and more detailed information can be found in the original publications (Bjerreskov, Nord-Larsen, and Fensholt 2021; Boschetti et al. 2007; Engler et al. 2013; Fang et al. 2020; Grabska, Frantz, and Ostapowicz 2020; Grabska et al. 2019; Hościło and Lewandowska 2019; Immitzer, Atzberger, and Koukal 2012; Ke, Quackenbush, and Im 2010; Li, Hu, and Noland 2013; Ma et al. 2021; Plakman et al. 2020; Shirazinejad, Zoej, and Latifi 2022; Sun et al. 2019; Wang and Ren 2021; Wessel, Brandmeier, and Tiede 2018).

Figure 14. Brief review on research cases in tree species classification.

Because these cases differ in area size, classification methods, and tree species, it is not reasonable to compare their accuracies directly. However, we can still compare individual study cases in the following respects. In the listed cases, the number of species ranged from 3 to 19, the test area ranged from 5.23 km² to 43,000 km², the resolution of the data used ranged from 0.1 m (finest) to 16 m (coarsest), and the overall accuracies ranged from 61.3% to 97%. Compared with the listed cases, our study has a clear advantage in the number of tree species (19) and the size of the study area (394,100 km²). The overall accuracy of our study was 72.18%, which is low compared to some study cases. However, we did not use ultra-high-resolution or additional input data (e.g. LiDAR), and this accuracy is acceptable given the extent of the study area and the number of classes. It is significantly better than the result reported for the 19-species case (61.3%) (Fang et al. 2020) and comparable to that reported for the 18-species case (73.25%) (Sun et al. 2019), although the areas of those two cases are 8.6 km² and 177 km², respectively, which are much smaller than ours. In contrast, the proposed method can achieve accurate and stable mapping of tree species distribution over large areas under different ecological conditions.

5.4. Limitations and future work

Differences among the multiple data sources in spatial resolution and in acquisition date and time may affect the classification performance and the spatial distribution patterns of the categories.

On the GEE platform, the multiple features extracted from datasets including S2, SRTM DEM, and WorldClim bioclimatic variables can be easily resampled to the same pixel size, based on their geographical reference, and stacked together. However, up-sampling low-resolution datasets cannot provide any information beyond that in the original data. The resolution of these environmental data is usually coarse, and they often do not really describe the spatial distribution of tree species but rather predict habitat suitability conditions at a given site (Engler et al. 2013). This is also the main reason that the probabilistic output of the MaxEnt model was treated as an extra feature set, rather than MaxEnt being used directly as a component classifier like RF or SVM. When the low-spatial-resolution probability map output by the MaxEnt model is integrated with the machine learning model, it significantly changes the spatial pattern of the output map, and species with fewer samples may be smoothed out by the dominant category. While this improves the overall classification accuracy, we believe that the effect of this integration is not always positive: introducing the low-spatial-resolution probability maps may reduce the classification accuracy at certain spatial locations and for specific categories, and may reduce the mapped distribution area of small-sample categories.

Besides the spatial mismatch, the temporal mismatch among the multiple data sources is another factor that may increase the uncertainty of the classification. To avoid missing key phenological stages of the tree species, we decided to construct a dense 5-day time series. However, in most of the study area, data from the two S2A/B satellites still do not allow 5-day repeat observations; until March 2017, a single satellite required 10 days, which, combined with the cloudy and rainy climate, resulted in fewer usable pixels. To construct high-spatiotemporal-resolution time series data, we used S2 data for the 5 years from 2016 to 2020, which were composited according to the acquisition time within the year (DOY, Day of Year) to construct dense 5-day NDVI and REP time series. The sample data, however, came from the 2016 FMI. This led to a mismatch in acquisition time between the sample data and the imagery used, which may introduce uncertainties.
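The DOY-based compositing described above can be sketched for a single pixel as follows. This is only an illustration under stated assumptions: the source does not specify how observations falling in the same 5-day window are combined, so the median used here is an assumption, and the function names and observation values are hypothetical. NDVI is computed with the standard definition, (NIR - Red) / (NIR + Red).

```python
from statistics import median


def ndvi(nir, red):
    """Normalized Difference Vegetation Index."""
    return (nir - red) / (nir + red)


def doy_composite(observations, step=5):
    """Merge multi-year observations of one pixel into a one-year series.

    observations: iterable of (doy, value) pairs pooled across years,
    with doy counted from 1.
    Returns {first_doy_of_bin: median value} for every `step`-day bin
    that contains at least one observation.
    """
    bins = {}
    for doy, value in observations:
        start = ((doy - 1) // step) * step + 1
        bins.setdefault(start, []).append(value)
    return {start: median(vals) for start, vals in sorted(bins.items())}


# Made-up cloud-free NDVI observations of one pixel from several years.
observations = [(1, 0.5), (4, 0.7), (5, 0.6), (6, 0.8), (12, 0.4)]
composite = doy_composite(observations)
```

Bins without any cloud-free observation are simply absent from the result; how such gaps would be filled is left open here.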

In this study, we predicted the distribution of 19 tree species in Yunnan by integrating machine learning classifiers and an ecological niche model; further experiments and validation can be conducted in other types of forests in the future. We used only four models in this experiment, and more models can be introduced in the future to increase the diversity and complementarity of the models. In addition, with growing data availability, more eco-physiological and remote sensing data can be integrated into the models. This will benefit species classification through a better understanding of the species' physiological limits and the selection of features that are more sensitive to species range limits.

6. Conclusions

Detailed and accurate mapping of forest stand species meets an important practical need. To map mountain vegetation effectively, it is necessary to consider fully both the ecological characteristics of tree species and their remote sensing characteristics, and to use multiple data sources and classification methods for integrated discrimination. In this study, the spatial distribution of 19 stand tree species in Yunnan Province was mapped on the GEE platform by combining machine learning classification of remote sensing images with ecological niche modeling, with an overall accuracy of 72.18% assessed with 53,934 validation sample points. Integrating models from both domains yielded higher classification accuracy than either supervised classification or ecological niche modeling of environmental data alone. The study shows that integrating machine learning algorithms and ecological niche models is effective in regions with high environmental heterogeneity. Spectral, spatial, and temporal features extracted from remote sensing data, together with various environmental variables, contribute to tree species classification.

Acknowledgments

We thank the anonymous reviewers and editors for their valuable comments and suggestions, which greatly improved the quality of our manuscript.

Disclosure statement

No potential conflict of interest was reported by the authors.

Supplementary material

Supplemental data for this article can be accessed online at https://doi.org/10.1080/15481603.2023.2211881.

Additional information

Funding

The work was supported by the National Natural Science Foundation of China [32160369]; Major Scientific and Technological Projects of Yunnan Province [202202AD080010]; National Natural Science Foundation of China [31860182]; National Natural Science Foundation of China [32060320]; National Natural Science Foundation of China [41961053].

References

Abatzoglou, J. T., S. Z. Dobrowski, S. A. Parks, and K. C. Hegewisch. 2018. “TerraClimate, a High-Resolution Global Dataset of Monthly Climate and Climatic Water Balance from 1958–2015.” Scientific Data 5 (1): 1–24. doi:10.1038/sdata.2017.191.
Abdi, A. M. 2020. “Land Cover and Land Use Classification Performance of Machine Learning Algorithms in a Boreal Landscape Using Sentinel-2 Data.” GIScience & Remote Sensing 57 (1): 1–20. doi:10.1080/15481603.2019.1650447.
Ahamed, T., L. Tian, Y. Zhang, and K. C. Ting. 2011. “A Review of Remote Sensing Methods for Biomass Feedstock Production.” Biomass & bioenergy 35 (7): 2455–2469. doi:10.1016/j.biombioe.2011.02.028.
Alonso, L., J. Picos, and J. Armesto. 2021. “Forest Cover Mapping and Pinus Species Classification Using Very High-Resolution Satellite Images and Random Forest.” ISPRS Annals of the Photogrammetry, Remote Sensing & Spatial Information Sciences 3: 203–210. doi:10.5194/isprs-annals-V-3-2021-203-2021.
Bhatt, P., A. Maclean, Y. Dickinson, and C. Kumar. 2022. “Fine-Scale Mapping of Natural Ecological Communities Using Machine Learning Approaches.” Remote Sensing 14 (3): 563. doi:10.3390/rs14030563.
Bjerreskov, K. S., T. Nord-Larsen, and R. Fensholt. 2021. “Classification of Nemoral Forests with Fusion of Multi-Temporal Sentinel-1 and 2 Data.” Remote Sensing 13 (5): 950. doi:10.3390/rs13050950.
Bobrowski, M., J. Weidinger, and U. Schickhoff. 2021. “Is New Always Better? Frontiers in Global Climate Datasets for Modeling Treeline Species in the Himalayas.” Atmosphere 12: 543. doi:10.3390/atmos12050543.
Boschetti, M., L. Boschetti, S. Oliveri, L. Casati, and I. Canova. 2007. “Tree Species Mapping with Airborne Hyper‐spectral MIVIS Data: The Ticino Park Study Case.” International Journal of Remote Sensing 28 (6): 1251–1261. doi:10.1080/01431160600928542.
Cano, E., J. -P. Denux, M. Bisquert, L. Hubert-Moy, and V. Chéret. 2017. “Improved Forest-Cover Mapping Based on MODIS Time Series and Landscape Stratification.” International Journal of Remote Sensing 38 (7): 1865–1888. doi:10.1080/01431161.2017.1280635.
Carlson, T. N., and D. A. Ripley. 1997. “On the Relation Between NDVI, Fractional Vegetation Cover, and Leaf Area Index.” Remote Sensing of Environment 62 (3): 241–252. doi:10.1016/S0034-4257(97)00104-1.
Chandrasekar, K., M. Sesha Sai, P. Roy, and R. S. Dwevedi. 2010. “Land Surface Water Index (LSWI) Response to Rainfall and NDVI Using the MODIS Vegetation Index Product.” International Journal of Remote Sensing 31 (15): 3987–4005. doi:10.1080/01431160802575653.
Chandra, N., G. Singh, S. Lingwal, J. S. Jalal, M. S. Bisht, V. Pal, M. P. S. Bisht, B. Rawat, and L. M. Tiwari. 2021. “Ecological Niche Modeling and Status of Threatened Alpine Medicinal Plant Dactylorhiza Hatagirea D. Don in Western Himalaya.” Journal of Sustainable Forestry 41 (10): 1–17. doi:10.1080/10549811.2021.1923530.
Costa, H., P. Benevides, F. D. Moreira, D. Moraes, and M. Caetano. 2022. “Spatially Stratified and Multi-Stage Approach for National Land Cover Mapping Based on Sentinel-2 Data and Expert Knowledge.” Remote Sensing 14 (8): 1865. doi:10.3390/rs14081865.
Dash, J., and P. Curran. 2004. “The MERIS Terrestrial Chlorophyll Index.” International Journal of Remote Sensing 25 (23): 5403–5413. doi:10.1080/0143116042000274015.
Datta, A., O. Schweiger, and I. Kühn. 2020. “Origin of Climatic Data Can Determine the Transferability of Species Distribution Models.” NeoBiota 59: 61–76. doi:10.3897/neobiota.59.36299.
Duan, B., S. Fang, Y. Gong, Y. Peng, X. Wu, and R. Zhu. 2021. “Remote Estimation of Grain Yield Based on UAV Data in Different Rice Cultivars Under Contrasting Climatic Zone.” Field Crops Research 267: 108148. doi:10.1016/j.fcr.2021.108148.
Engler, R., L. T. Waser, N. E. Zimmermann, M. Schaub, S. Berdos, C. Ginzler, and A. Psomas. 2013. “Combining Ensemble Modeling and Remote Sensing for Mapping Individual Tree Species at High Spatial Resolution.” Forest Ecology and Management 310: 64–73. doi:10.1016/j.foreco.2013.07.059.
Fang, F., B. E. McNeil, T. A. Warner, A. E. Maxwell, G. A. Dahle, E. Eutsler, and J. Li. 2020. “Discriminating Tree Species at Different Taxonomic Levels Using Multi-Temporal WorldView-3 Imagery in Washington DC, USA.” Remote Sensing of Environment 246: 111811. doi:10.1016/j.rse.2020.111811.
Fang, P., X. Zhang, P. Wei, Y. Wang, H. Zhang, F. Liu, and J. Zhao. 2020. “The Classification Performance and Mechanism of Machine Learning Algorithms in Winter Wheat Mapping Using Sentinel-2 10 M Resolution Imagery.” Applied Sciences 10 (15): 5075. doi:10.3390/app10155075.
Fassnacht, F. E., H. Latifi, K. Stereńczak, A. Modzelewska, M. Lefsky, L. T. Waser, C. Straub, and A. Ghosh. 2016. “Review of Studies on Tree Species Classification from Remotely Sensed Data.” Remote Sensing of Environment 186: 64–87. doi:10.1016/j.rse.2016.08.013.
Fassnacht, F. E., C. Neumann, M. Forster, H. Buddenbaum, A. Ghosh, A. Clasen, P. K. Joshi, and B. Koch. 2014. “Comparison of Feature Reduction Algorithms for Classifying Tree Species with Hyperspectral Data on Three Central European Test Sites.” IEEE Journal of Selected Topics in Applied Earth Observations & Remote Sensing 7 (6): 2547–2561. doi:10.1109/jstars.2014.2329390.
Feng, B., C. Zheng, W. Zhang, L. Wang, and C. Yue. 2020. “Analyzing the Role of Spatial Features When Cooperating Hyperspectral and LiDar Data for the Tree Species Classification in a Subtropical Plantation Forest Area.” Journal of Applied Remote Sensing 14 (02): 022213. doi:10.1117/1.JRS.14.022213.
Gao, B. -C. 1996. “NDWI—A Normalized Difference Water Index for Remote Sensing of Vegetation Liquid Water from Space.” Remote Sensing of Environment 58 (3): 257–266. doi:10.1016/S0034-4257(96)00067-3.
Gong, P., J. Wang, L. Yu, Y. Zhao, Y. Zhao, L. Liang, Z. Niu, et al. 2013. “Finer Resolution Observation and Monitoring of Global Land Cover: First Mapping Results with Landsat TM and ETM+ Data.” International Journal of Remote Sensing 34 (7): 2607–2654. doi:10.1080/01431161.2012.748992.
Grabska, E., D. Frantz, and K. Ostapowicz. 2020. “Evaluation of Machine Learning Algorithms for Forest Stand Species Mapping Using Sentinel-2 Imagery and Environmental Data in the Polish Carpathians.” Remote Sensing of Environment 251: 112103. doi:10.1016/j.rse.2020.112103.
Grabska, E., P. Hostert, D. Pflugmacher, and K. Ostapowicz. 2019. “Forest Stand Species Mapping Using the Sentinel-2 Time Series.” Remote Sensing 11 (10): 1197. doi:10.3390/rs11101197.
Guo, Y., Z. Li, E. Chen, X. Zhang, L. Zhao, E. Xu, Y. Hou, and R. Sun. 2020. “An End-To-End Deep Fusion Model for Mapping Forests at Tree Species Levels with High Spatial Resolution Satellite Imagery.” Remote Sensing 12 (20): 3324. doi:10.3390/rs12203324.
Hansen, M. C., P. V. Potapov, R. Moore, M. Hancher, S. A. Turubanova, A. Tyukavina, D. Thau, et al. 2013. “High-Resolution Global Maps of 21st-Century Forest Cover Change.” Science 342 (6160): 850–853. doi:10.1126/science.1244693.
Hijmans, R. J., S. E. Cameron, J. L. Parra, P. G. Jones, and A. Jarvis. 2005. “Very High Resolution Interpolated Climate Surfaces for Global Land Areas.” International Journal of Climatology: A Journal of the Royal Meteorological Society 25 (15): 1965–1978. doi:10.1002/joc.1276.
Hościło, A., and A. Lewandowska. 2019. “Mapping Forest Type and Tree Species on a Regional Scale Using Multi-Temporal Sentinel-2 Data.” Remote Sensing 11 (8): 929. doi:10.3390/rs11080929.
Hua, Z., and M. Heil. 2017. “Biogeographical Divergence of the Flora of Yunnan, Southwestern China Initiated by the Uplift of Himalaya and Extrusion of Indochina Block.” PLos One 7 (9): 7. doi:10.1371/journal.pone.0045601.
Immitzer, M., C. Atzberger, and T. Koukal. 2012. “Tree Species Classification with Random Forest Using Very High Spatial Resolution 8-Band WorldView-2 Satellite Data.” Remote Sensing 4 (9): 2661–2693. doi:10.3390/rs4092661.
Kattenborn, T., J. Lopatin, M. Förster, A. C. Braun, and F. E. Fassnacht. 2019. “UAV Data as Alternative to Field Sampling to Map Woody Invasive Species Based on Combined Sentinel-1 and Sentinel-2 Data.” Remote Sensing of Environment 227: 61–73. doi:10.1016/j.rse.2019.03.025.
Ke, Y., L. J. Quackenbush, and J. Im. 2010. “Synergistic Use of QuickBird Multispectral Imagery and LIDAR Data for Object-Based Forest Species Classification.” Remote Sensing of Environment 114 (6): 1141–1154. doi:10.1016/j.rse.2010.01.002.
Kollert, A., M. Bremer, M. Löw, and M. Rutzinger. 2021. “Exploring the Potential of Land Surface Phenology and Seasonal Cloud Free Composites of One Year of Sentinel-2 Imagery for Tree Species Mapping in a Mountainous Region.” International Journal of Applied Earth Observation and Geoinformation 94: 102208. doi:10.1016/j.jag.2020.102208.
LeCun, Y., Y. Bengio, and G. Hinton. 2015. “Deep Learning.” Nature 521 (7553): 436–444. doi:10.1038/nature14539.
Li, X., and D. Walker. 1986. “The Plant Geography of Yunnan Province, Southwest China.” Journal of Biogeography 13 (5): 367–397. doi:10.2307/2844964.
Li, R., P. Fang, W. Xu, L. Wang, G. Ou, W. Zhang, and X. Huang. 2022. “Classifying Forest Types Over a Mountainous Area in Southwest China with Landsat Data Composites and Multiple Environmental Factors.” Forests 13 (1): 135. doi:10.3390/f13010135.
Li, J., B. Hu, and T. L. Noland. 2013. “Classification of Tree Species Based on Structural Features Derived from High Density LiDAR Data.” Agricultural and Forest Meteorology 171-172: 104–114. doi:10.1016/j.agrformet.2012.11.012.
Li, R., N. J. Kraft, J. Yang, and Y. Wang. 2015. “A Phylogenetically Informed Delineation of Floristic Regions Within a Biodiversity Hotspot in Yunnan, China.” Scientific Reports 5 (1): 1–7.
Li, W., and S. Pei. 1991. “A Study of Flora of Yunnan.” Guihaia 11: 293–303.
Liu, H., P. Gong, J. Wang, X. Wang, G. Ning, and B. Xu. 2021. “Production of Global Daily Seamless Data Cubes and Quantification of Global Land Cover Change from 1985 to 2020-iMap World 1.0.” Remote Sensing of Environment 258: 112364. doi:10.1016/j.rse.2021.112364.
Liu, F., J. Hu, F. Yang, and X. Li. 2021. “Heterogeneity-Diversity Relationships in Natural Areas of Yunnan, China.” Chinese Geographical Science 31 (3): 506–521. doi:10.1007/s11769-021-1207-7.
Liu, Z., and H. Peng. 2016. “Notes on the Key Role of Stenochoric Endemic Plants in the Floristic Regionalization of Yunnan.” Plant Diversity 38 (6): 289–294.
Long, T., Z. Zhang, G. He, W. Jiao, C. Tang, B. Wu, X. Zhang, G. Wang, and R. Yin. 2019. “30 M Resolution Global Annual Burned Area Mapping Based on Landsat Images and Google Earth Engine.” Remote Sensing 11 (5): 489. doi:10.3390/rs11050489.
Main, R., M. A. Cho, R. Mathieu, M. M. O’Kennedy, A. Ramoelo, and S. Koch. 2011. “An Investigation into Robust Spectral Indices for Leaf Chlorophyll Estimation.” ISPRS Journal of Photogrammetry & Remote Sensing 66 (6): 751–761. doi:10.1016/j.isprsjprs.2011.08.001.
Ma, M., J. Liu, M. Liu, J. Zeng, and Y. Li. 2021. “Tree Species Classification Based on Sentinel-2 Imagery and Random Forest Classifier in the Eastern Regions of the Qilian Mountains.” Forests 12 (12): 1736. doi:10.3390/f12121736.
Maréchaux, I., F. Langerwisch, A. Huth, H. Bugmann, X. Morin, C. P. O. Reyer, R. Seidl, et al. 2021. “Tackling Unresolved Questions in Forest Ecology: The Past and Future Role of Simulation Models.” Ecology & Evolution 11 (9): 3746–3770. doi:10.1002/ece3.7391.
Michałowska, M., and J. Rapiński. 2021. “A Review of Tree Species Classification Based on Airborne LiDAR Data and Applied Classifiers.” Remote Sensing 13 (3): 353. doi:10.3390/rs13030353.
Moraes, D., P. Benevides, H. Costa, F. D. Moreira, and M. Caetano. 2021. “Assessment of the Introduction of Spatial Stratification and Manual Training in Automatic Supervised Image Classification.” In Earth Resources and Environmental Remote Sensing/GIS Applications XII, LNEE, 291–298. Singapore: SPIE. doi:10.1007/978-981-16-0289-4_39.
Mouta, N., R. Silva, S. Pais, J. M. Alonso, J. F. Gonçalves, J. Honrado, and J. R. Vicente. 2021. “‘The Best of Two Worlds’—Combining Classifier Fusion and Ecological Models to Map and Explain Landscape Invasion by an Alien Shrub.” Remote Sensing 13 (16): 3287. doi:10.3390/rs13163287.
Myers, N., R. A. Mittermeier, C. G. Mittermeier, G. A. B. da Fonseca, and J. Kent. 2000. “Biodiversity Hotspots for Conservation Priorities.” Nature 403 (6772): 853–858. doi:10.1038/35002501.
Oreopoulos, L., M. J. Wilson, and T. Várnai. 2011. “Implementation on Landsat Data of a Simple Cloud-Mask Algorithm Developed for MODIS Land Bands.” IEEE Geoscience & Remote Sensing Letters 8 (4): 597–601. doi:10.1109/LGRS.2010.2095409.
Ørka, H. O., M. Dalponte, T. Gobakken, E. Næsset, and L. T. Ene. 2013. “Characterizing Forest Species Composition Using Multiple Remote Sensing Data Sources and Inventory Approaches.” Scandinavian Journal of Forest Research 28 (7): 677–688. doi:10.1080/02827581.2013.793386.
Phillips, S. J., R. P. Anderson, and R. E. Schapire. 2006. “Maximum Entropy Modeling of Species Geographic Distributions.” Ecological Modelling 190 (3–4): 231–259. doi:10.1016/j.ecolmodel.2005.03.026.
Plakman, V., T. Janssen, N. Brouwer, and S. Veraverbeke. 2020. “Mapping Species at an Individual-Tree Scale in a Temperate Forest, Using Sentinel-2 Images, Airborne Laser Scanning Data, and Random Forest Classification.” Remote Sensing 12 (22): 3710. doi:10.3390/rs12223710.
Pshegusov, R., F. Tembotova, V. Chadaeva, Y. Sablirova, M. Mollaeva, and A. Akhomgotov. 2022. “Ecological Niche Modeling of the Main Forest-Forming Species in the Caucasus.” Forest Ecosystems 9: 100019. doi:10.1016/j.fecs.2022.100019.
Richardson, A. D., S. P. Duigan, and G. P. Berlyn. 2002. “An Evaluation of Noninvasive Methods to Estimate Foliar Chlorophyll Content.” The New Phytologist 153 (1): 185–194.
Rozenstein, O., N. Haymann, G. Kaplan, and J. Tanny. 2019. “Validation of the Cotton Crop Coefficient Estimation Model Based on Sentinel-2 Imagery and Eddy Covariance Measurements.” Agricultural Water Management 223: 105715. doi:10.1016/j.agwat.2019.105715.
Schlerf, M., C. Atzberger, and J. Hill. 2005. “Remote Sensing of Forest Biophysical Variables Using HyMap Imaging Spectrometer Data.” Remote Sensing of Environment 95 (2): 177–194. doi:10.1016/j.rse.2004.12.016.
Shirazinejad, G., M. Zoej, and H. Latifi. 2022. “Applying Multidate Sentinel-2 Data for Forest-Type Classification in Complex Broadleaf Forest Stands.” Forestry: An International Journal of Forest Research 95 (3): 363–379. doi:10.1093/forestry/cpac001.
Sinclair, S. J., M. D. White, and G. R. Newell. 2010. “How Useful are Species Distribution Models for Managing Biodiversity Under Future Climates?” Ecology and Society 15. doi:10.5751/ES-03089-150108.
Sun, Y., J. Huang, Z. Ao, D. Lao, and Q. Xin. 2019. “Deep Learning Approaches for the Mapping of Tree Species Diversity in a Tropical Wetland Using Airborne LiDAR and High-Spatial-Resolution Remote Sensing Images.” Forests 10 (11): 1047. doi:10.3390/f10111047.
Tamiminia, H., B. Salehi, M. Mahdianpari, L. Quackenbush, S. Adeli, and B. Brisco. 2020. “Google Earth Engine for Geo-Big Data Applications: A Meta-Analysis and Systematic Review.” ISPRS Journal of Photogrammetry & Remote Sensing 164: 152–170. doi:10.1016/j.isprsjprs.2020.04.001.
Wang, X., and H. Ren. 2021. “DBMF: A Novel Method for Tree Species Fusion Classification Based on Multi-Source Images.” Forests 13 (1): 33. doi:10.3390/f13010033.
Wan, H., Y. Tang, L. Jing, H. Li, F. Qiu, and W. Wu. 2021. “Tree Species Classification of Forest Stands Using Multisource Remote Sensing Data.” Remote Sensing 13 (1): 144. doi:10.3390/rs13010144.
Waser, L. T., C. Ginzler, M. Kuechler, E. Baltsavias, and L. Hurni. 2011. “Semi-Automatic Classification of Tree Species in Different Forest Ecosystems by Spectral and Geometric Variables Derived from Airborne Digital Sensor (ADS40) and RC30 Data.” Remote Sensing of Environment 115 (1): 76–85. doi:10.1016/j.rse.2010.08.006.
Welle, T., L. Aschenbrenner, K. Kuonath, S. Kirmaier, and J. Franke. 2022. “Mapping Dominant Tree Species of German Forests.” Remote Sensing 14 (14): 3330. doi:10.3390/rs14143330.
Wessel, M., M. Brandmeier, and D. Tiede. 2018. “Evaluation of Different Machine Learning Algorithms for Scalable Classification of Tree Types and Tree Species Based on Sentinel-2 Data.” Remote Sensing 10 (9): 1419. doi:10.3390/rs10091419.
Wiens, J. J., and C. H. Graham. 2005. “Niche Conservatism: Integrating Evolution, Ecology, and Conservation Biology.” Annual Review of Ecology, Evolution, and Systematics 36 (1): 519–539. doi:10.1146/annurev.ecolsys.36.102803.095431.
Wu, C., Z. Niu, Q. Tang, and W. Huang. 2008. “Estimating Chlorophyll Content from Hyperspectral Vegetation Indices: Modeling and Validation.” Agricultural and Forest Meteorology 148 (8–9): 1230–1241. doi:10.1016/j.agrformet.2008.03.005.
Yang, Y., K. Tian, J. Hao, S. Pei, and Y. Yang. 2004. “Biodiversity and Biodiversity Conservation in Yunnan, China.” Biodiversity & Conservation 13 (4): 813–826. doi:10.1023/B:BIOC.0000011728.46362.3c.
You, N., J. Dong, J. Huang, G. Du, G. Zhang, Y. He, T. Yang, Y. Di, and X. Xiao. 2021. “The 10-M Crop Type Maps in Northeast China During 2017–2019.” Scientific Data 8 (1): 1–11. doi:10.1038/s41597-021-00827-9.
Yu, X., D. Lu, X. Jiang, G. Li, Y. Chen, D. Li, and E. Chen. 2020. “Examining the Roles of Spectral, Spatial, and Topographic Features in Improving Land-Cover and Forest Classifications in a Subtropical Region.” Remote Sensing 12 (18): 2907. doi:10.3390/rs12182907.
Zeb, S. A., S. M. Khan, and Z. Ahmad. 2021. “Phytogeographic Elements and Vegetation Along the River Panjkora - Classification and Ordination Studies from the Hindu Kush Mountains Range.” The Botanical Review 87 (4): 1–25. doi:10.1007/s12229-021-09247-1.
Zhang, Y., I. O. Odeh, and C. Han. 2009. “Bi-Temporal Characterization of Land Surface Temperature in Relation to Impervious Surface Area, NDVI and NDBI, Using a Sub-Pixel Image Analysis.” International Journal of Applied Earth Observation and Geoinformation 11 (4): 256–264. doi:10.1016/j.jag.2009.03.001.
Zhang, B., L. Zhao, and X. Zhang. 2020. “Three-Dimensional Convolutional Neural Network Model for Tree Species Classification Using Airborne Hyperspectral Images.” Remote Sensing of Environment 247: 111938. doi:10.1016/j.rse.2020.111938.
Zheng, G., A. Bao, X. Li, L. Jiang, C. Chang, T. Chen, and Z. Gao. 2019. “The Potential of Multispectral Vegetation Indices Feature Space for Quantitatively Estimating the Photosynthetic, Non-Photosynthetic Vegetation and Bare Soil Fractions in Northern China.” Photogrammetric Engineering & Remote Sensing 85 (1): 65–76. doi:10.14358/PERS.85.1.65.
Zhong, L., L. Hu, and H. Zhou. 2019. “Deep Learning Based Multi-Temporal Crop Classification.” Remote Sensing of Environment 221: 430–443. doi:10.1016/j.rse.2018.11.032.
Zhou, Y., Z. Zhang, B. Zhu, X. Cheng, L. Yang, M. Gao, and R. Kong. 2021. “MaxEnt Modeling Based on CMIP6 Models to Project Potential Suitable Zones for Cunninghamia Lanceolata in China.” Forests 12 (6): 752. doi:10.3390/f12060752.

