Research Article

A patch filling method for thematic map refinement: a case study on forest cover mapping in the Greater Mekong Subregion and Malaysia

Article: 2252225 | Received 30 Mar 2023, Accepted 22 Aug 2023, Published online: 05 Sep 2023

ABSTRACT

Accurate forest cover mapping is essential for monitoring the status of forest extent in Southeast Asia. However, tropical areas frequently experience cloud cover, resulting in invalid or missing data in thematic maps. The initial 2005 and 2010 forest cover maps produced by the collaboration of the Greater Mekong Subregion and Malaysia (GMS+) economies contain unclassified pixels in the areas affected by cloud or cloud shadow. To enhance the usability and effectiveness of the 2005 and 2010 GMS+ forest cover maps for further analysis and applications, we present a novel method for accurately mapping forest cover in the presence of cloud cover. We employed a pixel-based algorithm to create clear view composites and automatically generated land cover training labels from the existing forest cover maps. We then reclassified the invalid areas and produced updated maps, assigning a land cover type to every previously missing pixel. The accuracy of this method was assessed at both the pixel and region level. The pixel level assessment across all reclassified patches gave an overall accuracy of 94.2% at the forest/non-forest level and 86.6% at the finer classification level; the region level assessment for the selected site gave 93.2% and 89.9%, respectively. Pixels reclassified from invalid areas account for 2.6% (forest) and 0.7% (non-forest) of the 2005 map, and 2.7% (forest) and 0.6% (non-forest) of the 2010 map. This approach provides a framework for filling invalid areas in an existing thematic map to improve its spatial continuity. The updated outputs provide more accurate and reliable information than the initial maps on the status of forest extent in the GMS+, which is critical for effective forest management and sustainable use in the region.

Introduction

Forests represent an invaluable natural resource due to their crucial role in mitigating climate change by absorbing carbon dioxide and releasing oxygen into the atmosphere (Bonan Citation2008; Prentice et al. Citation2001). Southeast Asia, renowned for its abundant and diverse forest resources, provides indispensable ecosystem services such as watershed protection, soil conservation, carbon sequestration, and local and regional climate regulation while generating significant economic benefits (Bond Citation2009; Michon et al. Citation2007; Sodhi et al. Citation2010). However, due to deforestation, land conversion, illegal logging, and climate change, Southeast Asia’s forest resources and livelihoods are facing significant risks (Hughes Citation2017; Jones et al. Citation2020; Wells et al. Citation2007; Wilcove et al. Citation2013). In this context, monitoring forest resources involves observing and assessing the status of forests to evaluate the impacts of deforestation and degradation on local livelihoods, regional economies, and global sustainability. Precise and comprehensive monitoring of forest resources is critical for developing effective forest management and implementing policies and regulations aimed at promoting sustainable use in Southeast Asia (Estoque et al. Citation2019; MacDicken et al. Citation2015; Senga Citation2004). In addition, the implementation of international agreements such as the Convention on Biological Diversity (CBD), the United Nations Framework Convention on Climate Change (UNFCCC), and the United Nations Forum on Forests (UNFF) requires accurate and reliable information on forests to support decision-making and to track progress toward its goals (Stibig et al. Citation2007).

The Greater Mekong Subregion (GMS) is a vast area drained by the Lancang-Mekong River, the largest river in Southeast Asia, which flows through six countries: Cambodia, the People’s Republic of China (specifically Yunnan Province and Guangxi Zhuang Autonomous Region), the Lao People’s Democratic Republic, Myanmar, Thailand, and Vietnam (Senga Citation2004). The GMS is characterized by its rich natural resources, including diverse landscapes, high levels of biodiversity, and significant economic potential. However, the region faces increasing pressures from land use and land cover changes, urbanization, and infrastructure development, driven by rapid social and economic development as well as population growth (Costenbader et al. Citation2015; Han and Song Citation2022; Kayiranga et al. Citation2021; Rowcroft Citation2008; Smith et al. Citation2013). Despite these challenges, the GMS remains one of the most biologically diverse areas on the planet and is home to numerous threatened species and ecosystems.

Remote sensing technology has emerged as an ideal solution for mapping and monitoring forest cover and change in the GMS. In recent decades, advancements in earth observation technologies have facilitated the availability of a large amount of medium to high spatial resolution remote sensing data at free or reasonably low costs. This has enabled researchers to gain valuable insights into the region’s forest cover and changes over time, as reported in several studies (Drusch et al. Citation2012; Woodcock et al. Citation2008; Wulder et al. Citation2012; Zhu et al. Citation2019). Continuous coverage and regular updates are necessary to achieve effective forest mapping and to understand forest resources. Various studies have utilized Landsat and Sentinel satellite imagery to capture the spatial patterns and distribution of forest cover in the GMS (Leinenkugel et al. Citation2015; Miettinen, Stibig, and Achard Citation2014; Poortinga et al. Citation2019). The use of multi-temporal data from diverse satellites has allowed researchers to identify changes in forest cover over time and track the extent and causes of deforestation and degradation (Hansen et al. Citation2008; He et al. Citation2022; Stibig et al. Citation2014). The integration of optical and radar data from multiple sensors has also led to the development of new methods for monitoring forest cover and changes (Dong et al. Citation2013; Maskell et al. Citation2021; Sarzynski et al. Citation2020).

The acquisition of clear view remote sensing data in tropical areas is a long-standing challenge (Asner Citation2001; Huete and Saleska Citation2010; Li, Feng, and Xiao Citation2018; Potapov et al. Citation2012). In response to this challenge, multiple image compositing algorithms have been designed to generate clear view remote sensing images. Pixel-based image compositing methods can identify clear-view observations even within heavily cloud-covered images. Various criteria have been proposed to discern high-quality pixels from the collection of available observations. Previous studies have commonly employed a single criterion, such as selecting the observation with the maximum or minimum spectral band/index value, or taking the mean or median value among available observations as the compositing rule (Ju et al. Citation2010; Roy et al. Citation2010). In recent years, multi-criteria compositing methods have emerged, in which multiple scores are calculated to assess the quality of pixels. These pixel scores encompass factors such as the acquisition year, sensor score, day of the year, distance of a given pixel to the nearest cloudy pixel, atmospheric opacity score, and other relevant criteria (Griffiths, Nendel, and Hostert Citation2019; White et al. Citation2014; Zhu et al. Citation2015). By evaluating these scores, the best observations are selected from the available candidates and composited into clear view images. Despite these efforts, cloud cover remains a significant impediment to accurate remote sensing mapping and monitoring in tropical areas.

To address this issue, the “Forest Cover and Carbon Mapping in the Greater Mekong Subregion and Malaysia” project (https://www.apfnet.cn/plus/view.php?aid=3967) was initiated and supported by the Asia-Pacific Network for Sustainable Forest Management and Rehabilitation (APFNet) in collaboration with scientific researchers from various institutions in the Greater Mekong Subregion and Malaysia (GMS+). The project, which began in 2010, aims to assess the impacts of forest cover, forest change, and carbon stocks on the environment and communities in this region (Li, Pang, and Huang Citation2012; Pang, Huang, and Li Citation2011; Pungkul, Suraswasdi, and Phonekeo Citation2014). To map the extent and distribution of forest cover, project members collected extensive ground truth data and Landsat TM/ETM+ images. Between 2011 and 2013, the project generated forest cover maps at 30 m resolution for 2005 and 2010 covering the entire GMS+ through cooperation among nations. However, the high frequency of cloud and cloud shadow cover in the GMS+ left invalid areas in the 2005 and 2010 forest cover maps, that is, pixels covered by cloud or cloud shadow and unclassified pixels, particularly in low-latitude areas such as Malaysia.

This study aims to develop an optimization method that fills invalid areas in the 2005 and 2010 GMS+ forest cover thematic maps to improve their usability and effectiveness for analysis and applications. First, we used a pixel-based compositing approach to generate clear view composite images for the invalid areas. Then, we automatically derived land cover training labels from the existing forest cover maps, reclassified the invalid areas, and incorporated the reclassification outcomes into the original maps to generate updated maps. We evaluated the effectiveness of this approach using pixel level and region level assessments. This method provides a conceptual framework for filling and updating invalid areas in existing forest cover maps.

Study area and materials

Study area

The GMS+ covers the Indochina Peninsula, Malaysia, and Yunnan and Guangxi of China (92.2° E-119.3° E, 0.8° N-29.2° N) (Figure 1), with a population exceeding 370 million, ranking among the most densely populated regions in the world. The GMS+ is among the world’s most irreplaceable natural areas and significant biodiversity hotspots (De Bruyn et al. Citation2014). This region comprises a variety of landscapes, including high mountain ranges, productive lowlands, and vast floodplains, and is renowned for its remarkable biodiversity, cultural heritage, and economic significance. The region has a subtropical and tropical monsoon climate, with an annual average temperature of 20–27°C, and is characterized by distinct dry (from November to April) and wet (from May to October) seasons. Abundant water and heat resources create optimal conditions for vegetation growth. Subtropical and tropical forests, comprising various types such as evergreen, semi-evergreen, deciduous, mangrove, and swamp forests, cover the GMS+.

Figure 1. Geographical location of the Greater Mekong Subregion and Malaysia (GMS+).

Forest cover map

The “Forest Cover and Carbon Mapping in the Greater Mekong Subregion and Malaysia” project generated forest cover thematic maps for the GMS+ region in 2005 and 2010. The maps consisted of three primary level classes (forest, nonforest, and no data), along with 16 second-level classes (comprising six classes for forest type, eight classes for nonforest type, and two classes for no data). Comprehensive information on the adopted classes in the GMS+ forest map is provided in Table 1.

Table 1. Land cover types used in the 2005 and 2010 forest cover maps.

Forest cover mapping for the GMS+ in 2005 and 2010 was based on Landsat TM/ETM+ data and field survey data and involved four processing steps: image pre-processing, segmentation, classification, and validation. In the pre-processing stage, the input images underwent geometric correction, radiometric correction, and cloud removal procedures. For cloud removal, a linear correlation model based on spectral feature analysis was established using two temporal Landsat TM images, specifically targeting pairs of clear view pixels, which allowed for the extraction of cloud and cloud shadow regions by comparing changes in spectral characteristics. The project utilized object-based classification methods that considered both the spectral attributes of objects and their spatial relationships. The segmentation process was performed using the eCognition software, and a set of rules based on the spectral, spatial, and contextual features of the image was defined to identify the objects. The support vector machine (SVM) classification algorithm was applied to classify the segmented images. Subsequently, the quality of the classification maps was enhanced manually with reference data such as field collection points, national forest inventory plots, forest management data, and existing land cover maps. Finally, field survey data and visually selected samples from IKONOS, RapidEye, and SPOT satellite images were utilized in the post-classification process to rectify misclassified areas. The accuracy of the forest cover map for level II categories was evaluated by comparing the map with ground truth data. The results of this assessment are listed in Table 2.

Table 2. The accuracy of forest cover map and collaboration team for each economy.

Due to constraints in data acquisition in 2010 and challenges in obtaining reliable observations in the GMS+, some pixels were classified as “cloud/cloud shadow” and “unclassified areas,” indicating invalid regions in the maps. Figure 2 shows the original GMS+ forest cover maps for 2005 and 2010, with four enlarged local areas (Figure 2, A to D) that highlight examples of invalid regions.

Figure 2. The original GMS+ forest cover maps of (a) 2005 and (b) 2010. Panels A to D depict enlarged views of specific regions exhibiting invalid patches, which were covered by cloud/cloud shadow and unclassified areas.

To assess the extent of invalid areas in the forest cover maps of the GMS+, pixels labeled as “cloud/cloud shadow” or “unclassified” were counted in the 2005 and 2010 maps, and the number of invalid pixels was calculated for each economy. Figure 3 illustrates the percentage of invalid pixels in each economy. The results reveal that Malaysia exhibits the highest proportion of invalid pixels within the GMS+ for both time periods, accounting for 79.4% of the total invalid areas in 2005 and 62.1% in 2010. Yunnan, Myanmar, and Laos followed, accounting for 11.6%, 6.2%, and 2.8% of the invalid pixels in 2005 and 31.3%, 1.9%, and 4.6% in 2010, respectively. Conversely, the remaining economies, namely Cambodia, Guangxi, Thailand, and Vietnam, exhibit minimal or negligible instances of invalid pixels within their respective maps.
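
The per-economy tally described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the input format (one `(economy, label)` pair per pixel), the label strings, and the function name `invalid_share_by_economy` are hypothetical and are not taken from the project's data.

```python
from collections import Counter

# Hypothetical label strings for the two "no data" classes.
INVALID_LABELS = {"cloud/cloud shadow", "unclassified"}

def invalid_share_by_economy(pixels):
    """Return {economy: percentage of all invalid pixels located in that economy}.

    pixels: iterable of (economy, label) pairs, one pair per map pixel.
    """
    counts = Counter(econ for econ, label in pixels if label in INVALID_LABELS)
    total = sum(counts.values())
    if total == 0:
        return {}  # the map contains no invalid pixels
    return {econ: 100.0 * n / total for econ, n in counts.items()}
```

Note that the shares are expressed relative to the total number of invalid pixels in the GMS+, which is how the percentages in this paragraph are reported, rather than relative to the area of each economy.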

Figure 3. Distribution of the cloud/cloud shadows in the forest cover map of 2005 and 2010 in GMS+ (the upper part), and the proportion of invalid pixels for each economy in GMS+ (the bottom part).

Remote sensing image

Recent studies have provided evidence indicating a higher likelihood of acquiring high-quality Landsat imagery during the dry season as opposed to the wet season in the GMS+ (Li, Feng, and Xiao Citation2018). Furthermore, dry season imagery offers improved capability for distinguishing non-forest types from forests within densely vegetated areas in the tropical region (Mayes, Mustard, and Melillo Citation2015). Consequently, this study utilized all available Landsat TM/ETM+ images with cloud coverage below 70% captured during the dry season between January and April of 2005 and 2010. For a few regions with extreme cloud cover where effective observations were not obtainable in the target year, images from the year before and the year after were included as needed to create gap-free composites. For example, images from November to December of 2004 and/or 2005 would be used to create a clear view composite for 2005.

Methods

To rectify the issue of invalid areas in the GMS+ maps, we developed an approach that utilized the Random Forest (RF) method to reclassify land cover types in the missing regions. The updated GMS+ maps were generated in three steps, as shown in Figure 4: clear view image compositing, reclassification of invalid areas, and updating of the forest cover map. To evaluate the effectiveness of this approach, we assessed the accuracy of the filled patches at the pixel level and the region level using reference samples from visual interpretation.

Figure 4. Workflow of the proposed method used to generate the updated 2005 and 2010 GMS+ forest cover maps.

Generate the clear view image composites

For the generation of clear view composites, three indicators were used to evaluate the quality of observed pixels: proximity to the target date, distance to cloud and cloud shadow, and haze impact. The day of year (DOY) score was computed from the acquisition time of the available images, with a higher score assigned to images closer in date to the target time. The cloud and cloud shadow score quantified the degree of contamination by clouds and cloud shadows in each pixel. The haze optimized transformation (HOT) score indicated the extent to which aerosols, haze, and thin clouds affected a pixel. These scores were calculated following the methodology described by Griffiths et al. (Citation2019). The candidate observation with the highest total score was taken to be of the best quality for constructing the clear view composites. However, selecting the pixel with the highest score from different temporal observations could lead to artificial discontinuities along the edges of the composited area. To address this issue, a multi-factor weighting (MFW) method was developed, which integrates all available observations by incorporating their pixel scores to determine the composited value (Meng et al. Citation2023). The MFW method was applied on the Google Earth Engine (GEE) platform to generate clear view composites that exhibit a high level of spatial continuity and radiometric consistency across the entire GMS+. The composited image consists of six spectral bands, namely the blue, green, red, NIR, SWIR1, and SWIR2 bands. In addition, several spectral indices were calculated, including the NDVI (Normalized Difference Vegetation Index), EVI (Enhanced Vegetation Index), SAVI (Soil-Adjusted Vegetation Index), NDWI (Normalized Difference Water Index), and NDSI (Normalized Difference Snow Index).
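
To make the difference between the two compositing rules concrete, the following minimal Python sketch contrasts a single-best-observation rule with a score-weighted rule for one pixel and one band. It is an illustration only: the exact MFW formulation is given in Meng et al. (Citation2023) and is not reproduced in this section, so the plain score-weighted mean, the function names, and the convention that a non-positive score marks an unusable observation are all assumptions.

```python
def best_observation(values, scores):
    """Single-best rule: return the candidate value with the highest total score."""
    best = max(range(len(values)), key=lambda i: scores[i])
    return values[best]

def score_weighted_composite(values, scores):
    """Assumed weighting rule: mean of all candidates weighted by total score.

    Observations with a non-positive score (for example, cloudy ones) are
    ignored. Returns None when no usable observation exists.
    """
    usable = [(v, s) for v, s in zip(values, scores) if s > 0]
    if not usable:
        return None  # leave a gap rather than invent a value
    total = sum(s for _, s in usable)
    return sum(v * s for v, s in usable) / total

# Toy reflectances for one pixel on three dates (not data from the study).
reflectance = [0.10, 0.12, 0.30]
total_score = [0.8, 0.7, 0.0]  # the third observation is treated as cloudy
composite = score_weighted_composite(reflectance, total_score)
```

Because every usable observation contributes in proportion to its score, neighbouring pixels that draw on different dates change value gradually, which is the intuition behind the smoother composite edges described above.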

Invalid patches reclassification

In this study, the training labels were obtained by creating an intersection map based on the forest cover products from 2005 and 2010. Given the high accuracy of the initial forest cover maps, the intersection areas between these two maps can be considered highly reliable. The intersection map indicated the areas where the land cover remained unchanged between 2005 and 2010. The analysis showed that 74.8% of the GMS+ region experienced no changes in land cover during this period. These unchanged areas were primarily characterized by broadleaf forest (30.7%), cropland (26.9%), shrubland (7.0%), and needleleaf forest (5.0%).
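
The intersection rule can be sketched as follows, assuming the two maps are co-registered grids of integer class codes. The `NO_DATA` code and the function names are hypothetical; the sketch only shows the logic of keeping a label where both years agree and neither is invalid.

```python
NO_DATA = 0  # hypothetical code for cloud/cloud shadow and unclassified pixels

def intersection_map(map_2005, map_2010, no_data=NO_DATA):
    """Keep a label only where both maps agree and the label is valid."""
    result = []
    for row_a, row_b in zip(map_2005, map_2010):
        result.append([
            a if (a == b and a != no_data) else no_data
            for a, b in zip(row_a, row_b)
        ])
    return result

def unchanged_fraction(inter, no_data=NO_DATA):
    """Fraction of all pixels that carry a stable (unchanged, valid) label."""
    cells = [c for row in inter for c in row]
    return sum(c != no_data for c in cells) / len(cells)
```

Pixels that are invalid in either year drop out of the intersection, so training labels are drawn only from areas that were classified consistently in both maps.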

To train local RF models for each invalid patch, we systematically sampled a defined number of points from the intersection map within buffer zones around the invalid areas. The extent of the buffer zone and the number of training samples were determined by the size of the invalid area: a larger invalid area had a wider buffer zone and more samples to train the classification model. The length of the buffer zone was set as half the length of the minimum bounding rectangle of the invalid area, as shown in Figure 5. We chose the following number of training samples per class based on the count of invalid pixels in a patch: 10 samples if less than 100 invalid pixels, 100 samples if between 100 and 1000 invalid pixels, 1000 samples if above 1000 but less than 10,000 invalid pixels, and 2000 samples if over 10,000 invalid pixels. Each local RF model used 70% of the samples for training and the remaining 30% for model evaluation. We built five RF models for each invalid area and selected the model with the best results to reclassify the land cover types for the invalid area. Finally, we generated updated GMS+ maps by filling the invalid areas with the reclassified land cover results.
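
The sampling rules in this paragraph translate directly into a small lookup, sketched below in Python. Two points are assumptions rather than statements from the text: how the tier boundaries are treated (a patch of exactly 10,000 invalid pixels is placed in the top tier here), and the reading of "length of the minimum bounding rectangle" as its longer side. The helper names are hypothetical.

```python
import random

def samples_per_class(n_invalid):
    """Training samples per class for a patch with n_invalid invalid pixels."""
    if n_invalid < 100:
        return 10
    if n_invalid <= 1000:
        return 100
    if n_invalid < 10_000:
        return 1000
    return 2000  # assumption: exactly 10,000 pixels falls in the top tier

def buffer_length(bbox_width, bbox_height):
    """Half the length of the minimum bounding rectangle.

    Assumption: "length" is taken as the longer side of the rectangle.
    """
    return max(bbox_width, bbox_height) / 2

def split_train_test(samples, train_fraction=0.7, seed=0):
    """Shuffle the samples and split them 70/30 into training and evaluation sets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]
```

Repeating the split with five different seeds, fitting one RF model per split, and keeping the model with the best evaluation result would mirror the "five models per invalid area" procedure; the classifier itself is omitted here to keep the sketch dependency-free.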

Figure 5. Flowchart of the rules used to select the number of training samples in the buffer area.

Evaluation for the filled areas

We assessed the performance of the proposed method using visual assessment and quantitative validation at the pixel level and region level. For the pixel level assessment, we randomly selected 5,000 samples per class from the filled areas across the entire GMS+ region, and compared them with points derived from visual interpretation. The resulting confusion matrix was used to calculate the producer’s accuracy (PA), user’s accuracy (UA), and overall accuracy (OA) for both level I forest and nonforest categories and all finer types at level II.
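
For readers less familiar with these metrics, the sketch below computes OA, PA, and UA from a confusion matrix in plain Python. The matrix orientation (rows are mapped classes, columns are reference classes) and the function name are assumptions made for illustration, and the numbers in the example are toy values, not results from this study.

```python
def accuracy_metrics(matrix, classes):
    """Compute overall, producer's, and user's accuracy.

    matrix[i][j] is the number of samples mapped as classes[i] whose
    reference label is classes[j].
    """
    n = len(classes)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(n))
    oa = correct / total
    pa, ua = {}, {}
    for i, name in enumerate(classes):
        mapped_total = sum(matrix[i])                           # row sum
        reference_total = sum(matrix[r][i] for r in range(n))   # column sum
        ua[name] = matrix[i][i] / mapped_total if mapped_total else None
        pa[name] = matrix[i][i] / reference_total if reference_total else None
    return oa, pa, ua

# Toy level I example with 200 samples (illustrative numbers only).
oa, pa, ua = accuracy_metrics([[90, 10], [5, 95]], ["forest", "nonforest"])
```

UA answers "how often is a mapped class correct?", while PA answers "how much of the reference class did the map capture?", which is why the two can diverge sharply for a single class.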

To evaluate the proposed method at a regional level, we selected a testing site with an image size of 4000 × 4000 pixels. This site was chosen as it featured the majority of land cover types and exhibited full or near-full forest cover according to the existing map. The process of generating a regional classification map and performing the validation is illustrated in Figure 6. First, a clear view composited image was generated to cover the selected site. Subsequently, the training datasets were automatically derived from the buffer zone surrounding the testing site. The proposed method was then applied to reclassify the clear view image. Finally, the reclassified map was compared with the existing forest cover map to evaluate the effectiveness of the proposed method.

Figure 6. Flowchart of the region-based approach for validation: a 4000 × 4000 image was used as the validation reference (A1), training datasets were derived from the buffer zone surrounding the testing site (B1 and B2), and the reclassified map generated by the proposed method (A2’) was compared with the validation map at the region level. The yellow squares indicate the locations of two enlarged sites where a detailed comparison of the validation map with the reclassified map is presented in Figure 11.

Results

Clear view image compositing for GMS+

As one of the cloudiest regions in the GMS+ area, Malaysia faces significant challenges when it comes to selecting appropriate optical satellite data for land cover monitoring. Figure 7 compares the original Landsat TM images used for forest cover mapping with the clear view composites used for reclassifying invalid pixels. Despite researchers’ efforts to choose data with the lowest cloud percentage for each path and row, the original Landsat TM images (Figure 7(a)) still contain a considerable amount of cloud and haze for both Peninsular Malaysia and East Malaysia. In contrast, the clear view composited image (Figure 7(b)) exhibits high radiometric consistency throughout Malaysia, displaying a visually smooth and natural appearance. The composited image is free of clouds and gaps, thereby enabling the reclassification of cloud-covered areas in the original 2005 and 2010 maps.

Figure 7. Remote sensing images used for mapping forest cover of Malaysia in 2005: (a) the original Landsat TM/ETM+ images, (b) the clear view composited images of Peninsular Malaysia (left) and East Malaysia (right).

Fill the invalid patches and update the forest cover map

To demonstrate the effectiveness of our proposed method, we selected five sites with various land cover types to illustrate the disparities between the original and updated maps. As shown in Figure 8, by utilizing clear view Landsat composites to reclassify the invalid areas, our proposed method has filled the information gaps and reclassified the invalid areas, leading to a more accurate and rational final outcome. Specifically, for sites with a small number of invalid pixels, we have successfully identified and generated missing land cover categories, including wetland, shrub land, water, and forest. For sites with more extensive invalid pixel coverage, our method has demonstrated its robustness and stability in sample selection and construction of the RF model, enabling the successful identification of diverse land cover types and accurate characterization of the corresponding missing areas (e.g. cropland and impervious surfaces, snow, and water and forest at different sites).

Figure 8. Comparison of the original maps and the updated maps over five selected sites in the GMS+. White pixels in the original maps refer to the region affected by cloud or cloud shadows, indicating the absence of available data.

Figure 9 presents the updated forest cover maps for the entire GMS+ in 2005 and 2010, illustrating the successful filling of all invalid patches with reclassified land covers. These maps are more suitable for subsequent analysis and application. Panels A to D of Figure 9 display the optimized maps corresponding to the four illustrations in Figure 2(a–d), sharing the same geographic locations for direct comparison.

Figure 9. The updated GMS+ forest cover maps of (a) 2005 and (b) 2010. Panels A to D present the optimized maps at the same locations as the four drawings in Figure 2(a–d).

Accuracy evaluation for the filled areas

Using the invalid pixels filling strategy described in the method section, we generated the revised 2005 and 2010 forest cover maps for the entire GMS+. The accuracy evaluation was conducted at both the pixel and region levels, revealing that our method effectively identified missing information and accurately reclassified invalid areas.

Pixel level assessment

To evaluate the effectiveness of the proposed method at the pixel level, a total of 60,000 samples were randomly selected from all filled areas, with 5,000 samples per class. These samples were compared with reference points obtained from visual interpretation. The location of the testing samples is shown in Figure 10. The validation results indicated that the proposed method achieved a high level of accuracy, with an overall accuracy of 94.2% for the classification of forest and nonforest at level I, and 86.6% for finer-level classification at level II. Furthermore, the forest type showed the highest PA of 97.0%, while the nonforest type had the highest UA of 97.7%, as presented in Table 3. Among the individual land cover types (Table 4), water demonstrated the highest PA of 92.5%, followed by needleleaf with 91.9%, urban with 91.2%, and wetland with 89.8%. In terms of UA, snow achieved the highest accuracy of 98.5%, followed by water with 96.7%, bare land with 96.0%, and urban with 92.6%. Conversely, shrub land had the lowest PA of 79.6%, while broadleaf had the lowest UA of 72.1%.

Figure 10. Location of the random points of each class for pixel level validation.

Table 3. Pixel level agreements of level I between the filled areas and the referenced data.

Table 4. Pixel level agreements of level II between the filled areas and the referenced data*.

Region level assessment

As illustrated in Figure 11, our results demonstrate that the proposed method effectively utilized training samples in the buffer area to reclassify the land cover features of the target testing site. The resulting reclassified map provides a more realistic representation of land cover details, such as water bodies, urban areas, and broadleaf forests, compared to the original forest cover data. Region-based confusion matrices were used to evaluate the classification performance at level I (Table 5) and level II (Table 6). The overall accuracy at level I was 93.2%, while it was slightly lower at 89.9% at level II. Most of the finer land cover types exhibited higher PA than UA. Specifically, the PAs for all types were consistently above 80%, while the UAs were relatively low. Notably, the wetland type had the lowest UA of 9.9%. This can be attributed to a large amount of wetland training samples in the buffer zone and the classification of additional wetland pixels in the study site, primarily located along the river and previously misidentified as shrub land in the original map. In terms of individual land cover types, the mixed forest had the highest PA of 96.8%, followed by broadleaf with 94.7%, needleleaf with 93.7%, and urban with 93.0%. Among the UAs, cropland had the highest accuracy of 94.6%, followed by shrubland with 93.5%, broadleaf with 90.8%, and water with 87.5%. Overall, the region-level evaluation revealed high accuracy for forest and nonforest classification at both level I and level II. It also showed that accuracy varied across land cover types, with some types showing lower accuracy than others.

Figure 11. Comparison between the validation maps and the reclassified maps. The approximate locations of these two sites are shown in Figure 6.

Table 5. Region level agreements of level I between the updated map and the referenced map on the selected site.

Table 6. Region level agreements of level II between the updated map and the referenced map on the selected site.

Difference between the original and the updated forest cover maps

After filling in the invalid pixels in the original maps, we found that these previously “no data” pixels were now entirely occupied by newly assigned land cover types. A detailed summary of the changes between the original and updated maps is presented in Table 7. The statistical results indicated that, in the 2005 map, the proposed method reclassified the invalid areas, which account for 3.3% of the GMS+ area, as 2.6% forest and 0.7% non-forest. The majority of these reclassified areas were broadleaf forest (2.0%), mixed forest (0.4%), and cropland (0.3%). For the 2010 map, the invalid areas were reclassified as 2.8% forest and 0.6% non-forest, mainly consisting of broadleaf forest (2.1%), mixed forest (0.4%), and bare land (0.2%).

Table 7. Difference of land cover between the original map and the updated map in the entire GMS+ area.

Discussion

The significance of updating the historical land cover maps

Land cover plays a crucial role in understanding global change processes, and the availability of accurate land cover maps is essential for various applications (Andrew, Wulder, and Nelson 2014; Townshend et al. 2012; Wulder et al. 2018). Classification algorithms that utilize high-quality, clear view inputs have proven effective in identifying different land cover types based on their unique characteristics, resulting in reliable land cover products. Numerous studies have been conducted to develop large-scale regional and global land cover products, aiming to provide comprehensive and up-to-date information on land cover dynamics. These products rely on extensive ground truth data and employ advanced classification techniques to achieve high accuracy (Ganguly et al. 2010; Chen et al. 2015; Gómez, White, and Wulder 2016; Gong et al. 2019; Potapov et al. 2022; Zhang et al. 2021). However, obtaining reliable land cover products can be challenging, particularly in remote or developing regions where ground truth data may be limited (Anderson and Johnson 2016; Giri et al. 2011). In such cases, historical land cover products can serve as a valuable reference for further analysis and understanding of land cover dynamics. Historical land cover products provide a baseline of land cover information, capturing past land use and changes over time. While they may have limitations, such as outdated or incomplete information, they still offer valuable insights into long-term land cover patterns and trends. By utilizing historical land cover products, researchers and decision-makers can place land cover dynamics in their historical context, identify long-term changes, and assess the impacts of various factors on land cover patterns. This information can support land management strategies, environmental assessments, and policy development, particularly in regions where current and detailed land cover data may be lacking.

Historical thematic maps often suffer from missing data due to cloud and cloud shadow cover or incomplete coverage, especially in tropical areas. These missing data can result in a loss of details and information, introducing uncertainty in forest change analysis across different years. Therefore, it becomes crucial to update and fill these thematic maps to ensure the reliability and validity of subsequent analyses. The proposed method in this study addresses this issue by updating the historical 2005 and 2010 GMS+ forest cover maps. The underlying assumption is that the accuracy of the historical land cover maps is acceptable after undergoing comprehensive evaluation. This method leverages the existing information and knowledge captured in the historical maps and enhances their utility by addressing the limitations posed by cloud cover. It is important to note that this approach does not negate the need for periodic data acquisition and the generation of new land cover maps. Instead, it complements these efforts by providing a cost-effective and efficient means of updating historical data and maximizing their usefulness. One of the key advantages of this method is that it preserves the accuracy of the non-cloud-covered regions while updating the cloud-covered portions. This ensures that the quality and reliability of the original data are maintained, allowing for consistent analysis and comparison over time. By utilizing the proposed algorithm, we were able to identify and reclassify 2.6% and 2.7% of forests, as well as 0.7% and 0.6% of non-forests, in the invalid areas of the 2005 and 2010 classification maps, respectively. This highlights the effectiveness of the method in capturing missing data and improving the accuracy of the land cover maps.

Consideration of strategies in the classification process

This study highlights the effectiveness of reclassifying land cover types to rectify previously invalid areas in land cover products. An automated approach was employed, wherein training samples were derived from the intersection results of existing 2005 and 2010 maps. Local random forest models were then constructed for each invalid patch. This approach eliminates the need for manual interaction or region-specific land cover knowledge, as it relies on intersection pixels to provide high-quality training data and calibrate machine learning algorithms in an automated workflow. In terms of sample selection, the number of classification samples was determined based on the number of invalid pixels present in the image. We employed a method in which the number of samples extracted was adjusted based on the number of individual invalid patches. This method was designed to ensure the resulting classification model would accurately classify images with varying degrees of invalid pixel content. This approach resulted in a more comprehensive and representative set of classification samples, ultimately leading to a more robust and accurate model.
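The sample-selection logic described above can be sketched in a few lines. This is a hedged illustration, not the authors' implementation: it assumes that an "intersection" pixel is one where the 2005 and 2010 maps are both valid and carry the same label, and that the number of samples grows with the size of the invalid patch between a lower and an upper bound. The scaling factor and bounds (`per_pixel`, `lower`, `upper`), the `NODATA` code, and all function names are hypothetical; the restriction of candidates to the buffer around each patch and the random forest fitting itself are omitted.

```python
import random

NODATA = 0  # hypothetical code for invalid pixels


def stable_labels(map_2005, map_2010, nodata=NODATA):
    """Return {pixel_index: label} where both maps are valid and agree."""
    return {
        i: a
        for i, (a, b) in enumerate(zip(map_2005, map_2010))
        if a == b and a != nodata
    }


def sample_size(n_invalid, per_pixel=2, lower=50, upper=5000):
    """Scale the sample count with patch size, clamped to [lower, upper] (all values assumed)."""
    return max(lower, min(upper, per_pixel * n_invalid))


def draw_training_samples(candidates, n_samples, seed=0):
    """Randomly pick up to n_samples labelled pixels from the candidate set."""
    rng = random.Random(seed)
    keys = sorted(candidates)
    picked = rng.sample(keys, min(n_samples, len(keys)))
    return {i: candidates[i] for i in picked}


# Toy maps: pixels 0, 2 and 5 are valid and identical in both years.
map_2005 = [1, 1, 2, 0, 2, 3]
map_2010 = [1, 2, 2, 2, 0, 3]

candidates = stable_labels(map_2005, map_2010)
training = draw_training_samples(candidates, n_samples=2)
print(candidates)          # {0: 1, 2: 2, 5: 3}
print(sample_size(1000))   # 2000
```

The selected pixel indices and labels would then be paired with the clear view composite values at those locations to train one local classifier per invalid patch.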

Limitations

It is important to acknowledge that the specific thresholds for the number of training samples used in this study were determined through empirical analysis and experimentation. While the results obtained from these experiments demonstrate the effectiveness of the proposed approach and validate the suitability of the selected thresholds for the given dataset, it is crucial to recognize that these thresholds may not be universally applicable. Furthermore, the diversity and complexity of land cover types within cloud-obscured areas can indeed present challenges in constructing local classification models. When constructing local classification models, the lack of comprehensive spectral information and the reliance on surrounding land cover types for inference might lead to misclassifications. The limitations and potential challenges associated with the proposed method should be carefully considered and addressed in future studies.

Conclusion

In this study, we proposed an approach to fill the invalid areas in the 2005 and 2010 GMS+ forest cover maps. We automatically derived training samples from the intersection results of the two maps and combined them with clear view composited images to reclassify the invalid areas. Our method successfully reclassified land cover types in the updated maps, filling all previously missing pixels. The accuracy of the approach was assessed at both the pixel and region level, with an overall accuracy of 94.2% at level I and 86.6% at finer classification level II by pixel level assessment across all reclassified patches, and 93.2% at level I and 89.9% at level II by region level for the selected site. We reclassified 2.6% of forest and 0.7% of non-forest areas in the 2005 map, as well as 2.7% of forest and 0.6% of non-forest in the 2010 map from invalid pixels. Our approach provides a framework for filling and updating invalid areas in forest cover maps toward improving the spatial continuity of existing land cover products over large regions.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Data availability statement

The high-resolution figures for the 2005 and 2010 GMS+ forest cover maps are available at https://doi.org/10.6084/m9.figshare.22358116.v1. The 2005 and 2010 GMS+ products are available at Science Data Bank at https://doi.org/10.57760/sciencedb.09909.

Additional information

Funding

This work was supported by the National Key Research and Development Program of China [No. 2019YFE0126700] and the Asia-Pacific Network for Sustainable Forest Management and Rehabilitation [No. 2011PA004 and No. 2018P1‐CAF]. We also thank all the anonymous reviewers for their constructive comments.

References

  • Anderson, W., and T. Johnson. 2016. “Evaluating Global Land Degradation Using Ground-Based Measurements and Remote Sensing.” In Economics of Land Degradation and Improvement-A Global Assessment for Sustainable Development, 85–21. Springer press. https://doi.org/10.1007/978-3-319-19168-3_5.
  • Andrew, M. E., M. A. Wulder, and T. A. Nelson. 2014. “Potential Contributions of Remote Sensing to Ecosystem Service Assessments.” Progress in Physical Geography 38 (3): 328–353. https://doi.org/10.1177/0309133314528942.
  • Asner, G. P. 2001. “Cloud Cover in Landsat Observations of the Brazilian Amazon.” International Journal of Remote Sensing 22 (18): 3855–3862. https://doi.org/10.1080/01431160010006926.
  • Bonan, G. B. 2008. “Forests and Climate Change: Forcings, Feedbacks, and the Climate Benefits of Forests.” Science 320 (5882): 1444–1449. https://doi.org/10.1126/science.1155121.
  • Bond, I. 2009. “Incentives to Sustain Forest Ecosystem Services: A Review and Lessons for REDD.” International Institute for Environment and Development 1–44.
  • Chen, J., J. Chen, A. Liao, X. Cao, L. Chen, X. Chen, C. He, et al. 2015. “Global Land Cover Mapping at 30 M Resolution: A POK-Based Operational Approach.” ISPRS Journal of Photogrammetry and Remote Sensing 103:7–27. https://doi.org/10.1016/j.isprsjprs.2014.09.002.
  • Costenbader, J., J. Broadhead, Y. Yasmi, and P. B. Durst. 2015. Drivers Affecting Forest Change in the Greater Mekong Subregion (GMS): An Overview. Rome: FAO.
  • De Bruyn, M., B. Stelbrink, R. J. Morley, R. Hall, G. R. Carvalho, C. H. Cannon, G. van den Bergh, et al. 2014. “Borneo and Indochina are Major Evolutionary Hotspots for Southeast Asian Biodiversity.” Systematic Biology 63 (6): 879–901. https://doi.org/10.1093/sysbio/syu047.
  • Dong, J., X. Xiao, B. Chen, N. Torbick, C. Jin, G. Zhang, and C. Biradar. 2013. “Mapping Deciduous Rubber Plantations Through Integration of PALSAR and Multi-Temporal Landsat Imagery.” Remote Sensing of Environment 134:392–402. https://doi.org/10.1016/j.rse.2013.03.014.
  • Drusch, M., U. Del Bello, S. Carlier, O. Colin, V. Fernandez, F. Gascon, B. Hoersch, et al. 2012. “Sentinel-2: ESA’s Optical High-Resolution Mission for GMES Operational Services.” Remote Sensing of Environment 120:25–36. https://doi.org/10.1016/j.rse.2011.11.026.
  • Estoque, R. C., M. Ooba, V. Avitabile, Y. Hijioka, R. DasGupta, T. Togawa, and Y. Murayama. 2019. “The Future of Southeast Asia’s Forests.” Nature Communications 10 (1): 1829. https://doi.org/10.1038/s41467-019-09646-4.
  • Ganguly, S., M. A. Friedl, B. Tan, X. Zhang, and M. Verma. 2010. “Land Surface Phenology from MODIS: Characterization of the Collection 5 Global Land Cover Dynamics Product.” Remote Sensing of Environment 114 (8): 1805–1816. https://doi.org/10.1016/j.rse.2010.04.005.
  • Giri, C., E. Ochieng, L. L. Tieszen, Z. Zhu, A. Singh, T. Loveland, J. Masek, and N. Duke. 2011. “Status and Distribution of Mangrove Forests of the World Using Earth Observation Satellite Data.” Global Ecology and Biogeography 20 (1): 154–159. https://doi.org/10.1111/j.1466-8238.2010.00584.x.
  • Gómez, C., J. C. White, and M. A. Wulder. 2016. “Optical Remotely Sensed Time Series Data for Land Cover Classification: A Review.” ISPRS Journal of Photogrammetry and Remote Sensing 116:55–72. https://doi.org/10.1016/j.isprsjprs.2016.03.008.
  • Gong, P., H. Liu, M. Zhang, C. Li, J. Wang, H. Huang, N. Clinton, et al. 2019. “Stable Classification with Limited Sample: Transferring a 30-M Resolution Sample Set Collected in 2015 to Mapping 10-M Resolution Global Land Cover in 2017.” Science Bulletin 64 (6): 370–373. https://doi.org/10.1016/j.scib.2019.03.002.
  • Griffiths, P., C. Nendel, and P. Hostert. 2019. “Intra-Annual Reflectance Composites from Sentinel-2 and Landsat for National-Scale Crop and Land Cover Mapping.” Remote Sensing of Environment 220:135–151. https://doi.org/10.1016/j.rse.2018.10.031.
  • Hansen, M. C., S. V. Stehman, P. V. Potapov, T. R. Loveland, J. R. Townshend, R. S. DeFries, K. W. Pittman, et al. 2008. “Humid Tropical Forest Clearing from 2000 to 2005 Quantified by Using Multitemporal and Multiresolution Remotely Sensed Data.” Proceedings of the National Academy of Sciences 105 (27): 9439–9444. https://doi.org/10.1073/pnas.0804042105.
  • Han, Z., and W. Song. 2022. “Interannual Trends of Vegetation and Responses to Climate Change and Human Activities in the Great Mekong Subregion.” Global Ecology and Conservation 38:e02215. https://doi.org/10.1016/j.gecco.2022.e02215.
  • He, B., X. Wu, K. Liu, Y. Yao, W. Chen, and W. Zhao. 2022. “Trends in Forest Greening and Its Spatial Correlation with Bioclimatic and Environmental Factors in the Greater Mekong Subregion from 2001 to 2020.” Remote Sensing 14 (23): 5982. https://doi.org/10.3390/rs14235982.
  • Huete, A. R., and S. R. Saleska. 2010. “Remote Sensing of Tropical Forest Phenology: Issues and Controversies.” International Archives of the Photogrammetry, Remote Sensing and Spatial Information Science, Kyoto Japan 38:539–542.
  • Hughes, A. C. 2017. “Understanding the Drivers of Southeast Asian Biodiversity Loss.” Ecosphere 8 (1): e01624. https://doi.org/10.1002/ecs2.1624.
  • Jones, I. J., A. J. MacDonald, S. R. Hopkins, A. J. Lund, Z. Y. Liu, N. I. Fawzi, M. P. Purba, et al. 2020. “Improving Rural Health Care Reduces Illegal Logging and Conserves Carbon in a Tropical Forest.” Proceedings of the National Academy of Sciences 117 (45): 28515–28524. https://doi.org/10.1073/pnas.2009240117.
  • Ju, J., D. P. Roy, Y. Shuai, and C. Schaaf. 2010. “Development of an Approach for Generation of Temporally Complete Daily Nadir MODIS Reflectance Time Series.” Remote Sensing of Environment 114 (1): 1–20. https://doi.org/10.1016/j.rse.2009.05.022.
  • Kayiranga, A., B. Chen, H. Zhang, W. Nthangeni, S. Measho, and F. Ndayisaba. 2021. “Spatially Explicit and Multiscale Ecosystem Shift Probabilities and Risk Severity Assessments in the Greater Mekong Subregion Over Three Decades.” Science of the Total Environment 798:149281. https://doi.org/10.1016/j.scitotenv.2021.149281.
  • Leinenkugel, P., M. L. Wolters, N. Oppelt, and C. Kuenzer. 2015. “Tree Cover and Forest Cover Dynamics in the Mekong Basin from 2001 to 2011.” Remote Sensing of Environment 158:376–392. https://doi.org/10.1016/j.rse.2014.10.021.
  • Li, P., Z. Feng, and C. Xiao. 2018. “Acquisition Probability Differences in Cloud Coverage of the Available Landsat Observations Over Mainland Southeast Asia from 1986 to 2015.” International Journal of Digital Earth 11 (5): 437–450. https://doi.org/10.1080/17538947.2017.1327619.
  • Li, Z. Y., Y. Pang, and C. Q. Huang. 2012. “Introduction of Forest Cover and Carbon Mapping in the Greater Mekong Subregion and Malaysia Project.” Proceedings on 33rd Asian Conference on Remote Sensing, Pattaya, Thailand, November 26-30.
  • MacDicken, K. G., P. Sola, J. E. Hall, C. Sabogal, M. Tadoum, and C. de Wasseige. 2015. “Global Progress Toward Sustainable Forest Management.” Forest Ecology and Management 352:47–56. https://doi.org/10.1016/j.foreco.2015.02.005.
  • Maskell, G., A. Chemura, H. Nguyen, C. Gornott, and P. Mondal. 2021. “Integration of Sentinel Optical and Radar Data for Mapping Smallholder Coffee Production Systems in Vietnam.” Remote Sensing of Environment 266:112709. https://doi.org/10.1016/j.rse.2021.112709.
  • Mayes, M. T., J. F. Mustard, and J. M. Melillo. 2015. “Forest Cover Change in Miombo Woodlands: Modeling Land Cover of African Dry Tropical Forests with Linear Spectral Mixture Analysis.” Remote Sensing of Environment 165:203–215. https://doi.org/10.1016/j.rse.2015.05.006.
  • Meng, S. L., Y. Pang, C. Q. Huang, and Z. Y. Li. 2023. “A Multi-Factor Weighting Method for Improved Clear View Compositing Using All Available Landsat 8 and Sentinel-2 Images in Google Earth Engine.” Journal of Remote Sensing.
  • Michon, G., H. De Foresta, P. Levang, and F. Verdeaux. 2007. “Domestic Forests: A New Paradigm for Integrating Local communities’ Forestry into Tropical Forest Science.” Ecology and Society 12 (2). https://doi.org/10.5751/ES-02058-120201.
  • Miettinen, J., H. J. Stibig, and F. Achard. 2014. “Remote Sensing of Forest Degradation in Southeast Asia-Aiming for a Regional View Through 5-30 M Satellite Data.” Global Ecology and Conservation 2:24–36. https://doi.org/10.1016/j.gecco.2014.07.007.
  • Pang, Y., K. B. Huang, and Z. Y. Li. 2011. “Forest Aboveground Biomass Analysis Using Remote Sensing in the Greater Mekong Subregion.” Resources Science 33 (10): 1863–1869.
  • Poortinga, A., K. Tenneson, A. Shapiro, Q. Nquyen, K. San Aung, F. Chishtie, and D. Saah. 2019. “Mapping Plantations in Myanmar by Fusing Landsat-8, Sentinel-2 and Sentinel-1 Data Along with Systematic Error Quantification.” Remote Sensing 11 (7): 831. https://doi.org/10.3390/rs11070831.
  • Potapov, P. V., M. C. Hansen, A. Pickens, A. Hernandez-Serna, A. Tyukavina, S. Turubanova, V. Zalles, et al. 2022. “The Global 2000-2020 Land Cover and Land Use Change Dataset Derived from the Landsat Archive: First Results.” Frontiers in Remote Sensing 3. https://doi.org/10.3389/frsen.2022.856903.
  • Potapov, P. V., S. A. Turubanova, M. C. Hansen, B. Adusei, M. Broich, A. Altstatt, L. Mane, and C. O. Justice. 2012. “Quantifying Forest Cover Loss in Democratic Republic of the Congo, 2000-2010, with Landsat ETM+ Data.” Remote Sensing of Environment 122:106–116. https://doi.org/10.1016/j.rse.2011.08.027.
  • Prentice, I. C., G. D. Farquhar, M. J. R. Fasham, M. L. Goulden, M. Heimann, V. J. Jaramillo, H. S. Kheshgi, et al. 2001. “The Carbon Cycle and Atmospheric Carbon Dioxide.” Climate change 2001: the scientific basis, Intergovernmental panel on climate change hal-03333974.
  • Pungkul, S., C. Suraswasdi, and V. Phonekeo. 2014. “Implementation of Forest Cover and Carbon Mapping in the Greater Mekong Subregion and Malaysia Project-A Case Study of Thailand.” IOP Conference Series: Earth and Environmental Science 18 (1): 012141. IOP Publishing. https://doi.org/10.1088/1755-1315/18/1/012141.
  • Rowcroft, P. 2008. “Frontiers of Change: The Reasons Behind Land-Use Change in the Mekong Basin.” AMBIO: A Journal of the Human Environment 37 (3): 213–218. https://doi.org/10.1579/0044-7447(2008)37[213:FOCTRB]2.0.CO;2.
  • Roy, D. P., J. Ju, K. Kline, P. L. Scaramuzza, V. Kovalskyy, M. Hansen, T. R. Loveland, E. Vermote, and C. Zhang. 2010. “Web-Enabled Landsat Data (WELD): Landsat ETM+ Composited Mosaics of the Conterminous United States.” Remote Sensing of Environment 114 (1): 35–49. https://doi.org/10.1016/j.rse.2009.08.011.
  • Sarzynski, T., X. Giam, L. Carrasco, and J. S. H. Lee. 2020. “Combining Radar and Optical Imagery to Map Oil Palm Plantations in Sumatra, Indonesia, Using the Google Earth Engine.” Remote Sensing 12 (7): 1220. https://doi.org/10.3390/rs12071220.
  • Senga, K. 2004. Greater Mekong Subregion Atlas of the Environment. Metro Manila, Philippines: Asian Development Bank.
  • Smith, T. F., D. C. Thomsen, S. Gould, K. Schmitt, and B. Schlegel. 2013. “Cumulative Pressures on Sustainable Livelihoods: Coastal Adaptation in the Mekong Delta.” Sustainability 5 (1): 228–241. https://doi.org/10.3390/su5010228.
  • Sodhi, N. S., L. P. Koh, R. Clements, T. C. Wanger, J. K. Hill, K. C. Hamer, Y. Clough, T. Tscharntke, M. R. C. Posa, and T. M. Lee. 2010. “Conserving Southeast Asian Forest Biodiversity in Human-Modified Landscapes.” Biological Conservation 143 (10): 2375–2384. https://doi.org/10.1016/j.biocon.2009.12.029.
  • Stibig, H. J., F. Achard, S. Carboni, R. Raši, and J. Miettinen. 2014. “Change in Tropical Forest Cover of Southeast Asia from 1990 to 2010.” Biogeosciences 11 (2): 247–258. https://doi.org/10.5194/bg-11-247-2014.
  • Stibig, H. J., F. Stolle, R. Dennis, and C. Feldkötter. 2007. “Forest Cover Change in Southeast Asia-The Regional Pattern.” JRC Scientific and Technical Reports 22896.
  • Townshend, J. R., J. G. Masek, C. Huang, E. F. Vermote, F. Gao, S. Channan, J. Sexton, et al. 2012. “Global Characterization and Monitoring of Forest Cover Using Landsat Data: Opportunities and Challenges.” International Journal of Digital Earth 5 (5): 373–397. https://doi.org/10.1080/17538947.2012.713190.
  • Wells, K., E. K. Kalko, M. B. Lakim, and M. Pfeiffer. 2007. “Effects of Rain Forest Logging on Species Richness and Assemblage Composition of Small Mammals in Southeast Asia.” Journal of Biogeography 34 (6): 1087–1099. https://doi.org/10.1111/j.1365-2699.2006.01677.x.
  • White, J. C., M. A. Wulder, G. W. Hobart, J. E. Luther, T. Hermosilla, P. Griffiths, N. C. Coops, et al. 2014. “Pixel-Based Image Compositing for Large-Area Dense Time Series Applications and Science.” Canadian Journal of Remote Sensing 40 (3): 192–212. https://doi.org/10.1080/07038992.2014.945827.
  • Wilcove, D. S., X. Giam, D. P. Edwards, B. Fisher, and L. P. Koh. 2013. “Navjot’s Nightmare Revisited: Logging, Agriculture, and Biodiversity in Southeast Asia.” Trends in Ecology and Evolution 28 (9): 531–540. https://doi.org/10.1016/j.tree.2013.04.005.
  • Woodcock, C. E., R. Allen, M. Anderson, A. Belward, R. Bindschadler, W. Cohen, F. Gao, et al. 2008. “Free Access to Landsat Imagery.” Science 320 (5879): 1011–1011. https://doi.org/10.1126/science.320.5879.1011a.
  • Wulder, M. A., N. C. Coops, D. P. Roy, J. C. White, and T. Hermosilla. 2018. “Land Cover 2.0.” International Journal of Remote Sensing 39 (12): 4254–4284. https://doi.org/10.1080/01431161.2018.1452075.
  • Wulder, M. A., J. G. Masek, W. B. Cohen, T. R. Loveland, and C. E. Woodcock. 2012. “Opening the Archive: How Free Data Has Enabled the Science and Monitoring Promise of Landsat.” Remote Sensing of Environment 122:2–10. https://doi.org/10.1016/j.rse.2012.01.010.
  • Zhang, X., L. Liu, X. Chen, Y. Gao, S. Xie, and J. Mi. 2021. “GLC_FCS30: Global Land-Cover Product with Fine Classification System at 30 M Using Time-Series Landsat Imagery.” Earth System Science Data 13 (6): 2753–2776. https://doi.org/10.5194/essd-13-2753-2021.
  • Zhu, Z., C. E. Woodcock, C. Holden, and Z. Yang. 2015. “Generating Synthetic Landsat Images Based on All Available Landsat Data: Predicting Landsat Surface Reflectance at Any Given Time.” Remote Sensing of Environment 162:67–83. https://doi.org/10.1016/j.rse.2015.02.009.
  • Zhu, Z., M. A. Wulder, D. P. Roy, C. E. Woodcock, M. C. Hansen, V. C. Radeloff, S. P. Healey, et al. 2019. “Benefits of the Free and Open Landsat Data Policy.” Remote Sensing of Environment 224:382–385. https://doi.org/10.1016/j.rse.2019.02.016.