Review Article

Training data in satellite image classification for land cover mapping: a review

Article: 2341414 | Received 19 Nov 2023, Accepted 07 Apr 2024, Published online: 14 Apr 2024
 

ABSTRACT

The current land cover (LC) mapping paradigm relies on automatic classification of satellite imagery, predominantly through supervised methods, which depend on training data to calibrate the classification algorithms. Training data therefore have a critical influence on classification accuracy. Although research exists on specific aspects of training data in the LC classification context, a study that organizes and synthesizes the many aspects and findings of this research is needed. In this article, we review the training data used for LC classification of satellite imagery. A protocol for identifying and selecting relevant documents was followed, resulting in the inclusion of 114 peer-reviewed studies. The main research topics were identified, and each document was characterized according to its contribution to each topic, which made it possible to uncover subtopics and categories and to synthesize the main findings on different aspects of the training dataset. The analysis found four research topics: construction of the training dataset, sample quality, sampling design, and advanced learning techniques. Subtopics included sample collection method, sample cleaning procedures, sample size, sampling method, and class balance and distribution, among others. A summary of the main findings and approaches provides an overview of the research in this area, which may serve as a starting point for new LC mapping initiatives.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Data availability statement

The data were collected from the Scopus website and are described in the data collection section.

Additional information

Funding

This research was funded by Fundação para a Ciência e a Tecnologia [FCT], grant number [PRT/BD/153517/2021], and by the Forest Research Centre and Associated Laboratory TERRA [UIDB/00239/2020]. Mário Caetano acknowledges the financial support provided by Fundação para a Ciência e a Tecnologia, Portugal [FCT] under the project [UIDB/04152/2020] - Centro de Investigação em Gestão de Informação [MagIC].