ABSTRACT
Thermokarst lakes are a main component of the vast Arctic and subarctic landscapes. Because these lakes can serve as geoindicators of permafrost degradation, reliable methods for assessing their distribution are necessary. In this study, we compared four machine learning methods to improve existing lake detection systems. The northern part of Yakutia was selected as the study area owing to its complex environment. We used Landsat 8 data and spectral indices to account for the spectral characteristics of the lakes, and MERIT DEM data to account for the topography. The classification and regression trees (CART) method yielded the lowest accuracy (overall accuracy = 81%). In contrast, the random forests (RF) classification provided the best results (overall accuracy = 92%) and was the only method that performed well in all problematic areas, such as shaded and humid areas, areas near steep slopes, burn scars, and rivers. The most important predictors were altitude and the SWIR1 (shortwave infrared 1), SWIR2 (shortwave infrared 2), and Green bands. Spectral indices did not have a significant impact on the classification results in the specific conditions of the thermokarst lake environment. In total, 17,700 lakes were identified, with a total area of 271.43 km².
Acknowledgments
The authors thank the reviewers for their comprehensive and detailed comments, which helped to improve the paper.
Disclosure statement
No potential conflict of interest was reported by the author(s).
Data availability statement
The data and code that support the findings of this study are available from the corresponding author upon reasonable request. The data used in this study were derived from the following resources available in the public domain: https://developers.google.com/earth-engine/datasets.