Research Article

Tree species classification on images from airborne mobile mapping using ML.NET

Article: 2271651 | Received 30 Mar 2023, Accepted 11 Oct 2023, Published online: 07 Nov 2023

ABSTRACT

Deep learning is a powerful tool for automating the process of recognizing and classifying objects in images. In this study, we used ML.NET, a popular open-source machine learning framework, to develop a model for identifying tree species in images obtained from airborne mobile mapping. These high-resolution images can be used to create detailed maps of the landscape. They can also be analyzed and processed to extract information about visual features, including tree species recognition. The deep learning model was trained using ML.NET to classify two tree species based on a combination of ortho and oblique airborne mobile mapping images. Our approach yielded impressive results, with a maximum classification accuracy of 93.9%. This demonstrates the effectiveness of combining imagery sources with deep learning tools in ML.NET for efficient and accurate tree species classification. This study highlights the potential of the ML.NET framework for automating object classification and can provide valuable insights and information for forestry management and conservation efforts. The primary objective of this research was to evaluate the effectiveness of an approach for identifying tree species through a model generated using a combination of ortho and oblique images captured by a mobile mapping system.

Introduction

The primary tree species classification method involves expensive and time-consuming in-situ surveys that measure structural attributes of individual trees, such as tree height, diameter at breast height, leaf area index, and branch angle, and then identify species by comparing these parameters to a norm. The accurate classification of tree species plays a vital role in the effective and sustainable management of forest resources, but automation of this process still remains a challenging task for scientists and land managers (Michałowska & Rapiński, Citation2021). The spatial composition of tree species is critical for a variety of reasons, including economics, ecology, and technology. In addition, a tree species map is essential for forest inventory and has been recognized as a valuable tool in the field of forest management and planning (Dalponte et al., Citation2012; Heinzel & Koch, Citation2012). Advances in technology have made it possible to classify species more efficiently using remote sensing techniques, including optical and Synthetic Aperture Radar (SAR) images (Avtar et al., Citation2023; Magnard et al., Citation2016; Udali et al., Citation2021), images captured by cameras mounted on aircraft such as helicopters or unmanned aerial vehicles (UAVs) (Gini et al., Citation2012; Natesan et al., Citation2019; Zhang et al., Citation2021), and Light Detection and Ranging (LiDAR). The use of LiDAR data in forest inventory started over 20 years ago, primarily for determining the height of individual trees and stands, as well as tree density. Recent studies have focused on extracting tree parameters such as crown size (Chamberlain et al., Citation2020; Pyysalo & Hyyppa, Citation2002; Solberg et al., Citation2006), stem volume (Bont et al., Citation2020; Ørka et al., Citation2010; Yao et al., Citation2012), and biomass (Sarrazin et al., Citation2011), as well as identifying tree species (Krüger Geb Amiri et al., Citation2018; Kukkonen et al., Citation2019; Yu et al., Citation2017) or types (Heinzel & Koch, Citation2011). Most research on tree species classification using airborne LiDAR data emphasizes extracting structural parameters. A comprehensive review of tree species recognition based on LiDAR-derived metrics can be found in the article by Michałowska and Rapiński (Citation2021). Combining LiDAR with hyperspectral sensors can improve the accuracy of tree species classification by utilizing radiometric features associated with the chemical and morphological properties of tree crowns. Studies have shown the effectiveness of both multispectral and hyperspectral spectra in identifying tree species (Shen & Cao, Citation2017; You et al., Citation2020). Over the years, advancements in remote sensing and deep learning have facilitated the creation of innovative approaches for identifying and mapping vegetation. Airborne mobile mapping systems equipped with high-resolution cameras have become increasingly popular for large-scale mapping. Mobile mapping technology is a method of collecting geospatial data using a moving platform equipped with various sensors, such as high-resolution cameras, laser scanners, GPS (Global Positioning System) receivers, and Inertial Measurement Units (IMUs) (El-Sheimy, Citation1996). The integration of these sensors allows for the accurate collection of georeferenced 3D geospatial data, as well as visual information.
The acquired data can be utilized for a range of purposes, including mapping (Wu et al., Citation2019), modeling (Huang et al., Citation2019), inspection, inventorying, and analysis (de Vries et al., Citation2020; Michałowska, Citation2020). In recent decades, airborne mobile mapping data has been captured using manned aircraft, helicopters, and small unmanned aerial vehicles. The technology of unmanned aerial vehicles (UAVs) is currently experiencing rapid development, as new sensors and methods are being introduced to the market, providing new opportunities for remote sensing tasks (Nevalainen et al., Citation2017). UAV technologies have already been utilized for tree identification and classification. While some studies have used hyperspectral sensors on UAVs for this purpose (Franklin & Ahmed, Citation2018; Franklin et al., Citation2017; Nevalainen et al., Citation2017; Sothe et al., Citation2019), others have utilized digital cameras and deep learning for identifying tree species (Onishi & Ise, Citation2021; Safonova et al., Citation2019). It is worth mentioning that digital cameras provide valuable information about the objects being inspected, including color and texture data, which can help in the identification of various materials and surface patterns. According to Kattenborn et al. (Citation2019), when utilizing very high spatial resolution data to differentiate between specific tree species, the spatial patterns derived from leaf forms, canopy shapes, and branching patterns hold greater importance than the spectral information.

The use of images for object recognition and classification is a powerful way to automate the analysis of vast amounts of visual data. Object recognition and classification technology can be applied to a range of images at various scales, levels of detail, and resolutions. It has been extensively tested using brightfield microscopy images (Hung & Carpenter, Citation2017; Wang et al., Citation2019), diagnostic radiographs (Gamdha et al., Citation2021; Mahdi et al., Citation2020; Mahdy et al., Citation2020), and satellite imagery (Tsai & Chen, Citation2017). In recent years, there has been substantial progress in image classification methods, which can be broadly divided into two categories: pixel-based (Goldblatt et al., Citation2018; Liu et al., Citation2020; Nogueira et al., Citation2019) and object-oriented (Darwish et al., Citation2003; Liu & Xia, Citation2010). Pixel-based classification approaches treat each pixel as a separate unit for analysis and primarily focus on the band spectral intensity of the pixels, ignoring spatial relationships and contextual information (Ke et al., Citation2010). In recent years, deep learning (DL), as a subset of machine learning, has gained substantial attention, resulting in the widespread availability of data and software. There has been growing interest in utilizing deep learning techniques for tree species classification using remote sensing data, a trend observed for nearly a decade (Guo et al., Citation2022; He et al., Citation2023). The advancement of deep learning technology and neural networks (LeCun et al., Citation2015) has led to a proliferation of researchers utilizing neural networks for automated feature extraction, thereby eliminating the need for the manual feature selection that was prevalent in earlier studies (Crisci et al., Citation2012; Immitzer et al., Citation2012). Typical deep learning networks encompass Convolutional Neural Networks (CNNs), stacked autoencoders, deep belief networks, and recurrent neural networks. Among these networks, CNNs demonstrate significant potential and have already achieved successful applications in various remote sensing tasks, including image classification, object detection, image registration, and segmentation (Ciregan et al., Citation2012; Mallick et al., Citation2019; MS et al., Citation2022; Yu & Zahidi, Citation2023). Representative CNN models include ResNet (He et al., Citation2016), AlexNet (Krizhevsky et al., Citation2017), GoogLeNet (Szegedy et al., Citation2015), Visual Geometry Group (Simonyan & Zisserman, Citation2014), and DenseNet (Li et al., Citation2021). CNNs have demonstrated promising outcomes when employed for the classification of tree species (Kattenborn et al., Citation2021). Song et al. (Citation2019) proposed attention branch-based convolutional neural networks (ABCNN) to identify tree species with highly similar leaves and tested them on the Leafsnap dataset of highly similar tree leaves, achieving an overall classification accuracy of 91.43%. Yan et al. (Citation2021) studied the recognition of six tree species using high-resolution satellite remote sensing imagery and achieved an overall accuracy of 82.7% based on a modified GoogLeNet. Other studies have demonstrated that the ResNet model can achieve impressive performance in image classification (Al-Haija & Adebanjo, Citation2020; Firat & Hanbay, Citation2021; Li & Lima, Citation2021; Sarwinda et al., Citation2021; Wen et al., Citation2020). He et al.
(Citation2016) have shown that ResNet50 significantly improves classification accuracy on the ImageNet dataset, a benchmark dataset for image classification, when compared to traditional image classification methods. Cao and Zhang (Citation2020) developed the Res-UNet network and achieved an overall classification accuracy of 87% for classifying six tree species based on airborne orthophotos. Li et al. (Citation2021) studied Faster R-CNN models for recognizing 10 tree species based on whole-tree images and achieved a classification accuracy of 98% using ResNet-50. There have also been extensive investigations into training CNN models using UAV-based RGB images for tree species classification. Deep learning techniques have led to exceptionally accurate species classification predictions on centimeter-level resolution imagery acquired from unmanned aerial vehicles (Schiefer et al., Citation2020). Onishi and Ise (Citation2018) achieved up to 89% accuracy in classifying seven different types of trees using high-resolution UAV images. Natesan et al. (Citation2019) obtained an 80% classification accuracy for distinguishing three types of tree species (red pine, white pine, and non-pine) in a coniferous mixed forest, utilizing Residual Neural Networks. Schiefer et al. (Citation2020) utilized a U-Net convolutional neural network to classify nine dominant tree species in temperate forests, resulting in an overall classification accuracy of 83.2%. Natesan et al. (Citation2020) employed DenseNet for the classification of five predominant species of coniferous trees using multitemporal images captured under varying acquisition parameters, encompassing seasonal, temporal, illumination, and angular variability. The results showed an overall classification accuracy of over 84%. Zhang et al. (Citation2021) achieved an overall accuracy of 92.6% when classifying 10 urban tree species using a ResNet50 model. Onishi and Ise (Citation2021) applied a ResNet model for the classification of dominant tree species in coniferous mixed forests, utilizing UAV-based RGB imagery, and improved overall classification accuracy by 5.8–13.1% compared with a support vector machine algorithm. Recently, the utilization of multispectral data has become common in tree species classification with various machine learning and deep learning techniques. Franklin (Citation2018) employed a machine learning algorithm solely on RGB images in a mixed-wood forest with different dominance proportions, achieving an overall classification accuracy of 69%. However, when combined with multispectral imagery, the accuracy improved to 80%. Brovkina et al. (Citation2018) achieved a comparable level of accuracy using multispectral images alone, albeit in a less complex stand.

The evolution of machine and deep learning software has led to increased user-friendliness, allowing individuals lacking a substantial background in computer science to independently utilize the most advanced algorithms to address their specific problems and datasets (Wäldchen et al., Citation2018). In the context of image recognition, a deep learning model is trained on a dataset of labeled images to categorize objects in new images. The model acquires the visual properties of various objects through analysis of shape, texture, and color, or through feature extraction, allowing it to accurately identify objects in new images. Image-based object recognition offers several significant benefits, including its ability to process images quickly and identify objects in real time (Redmon et al., Citation2016; Shotton et al., Citation2013). In terms of cost, image recognition technology is often more cost-effective compared to alternative methods of object identification, such as manual identification by a team of experts. Furthermore, image recognition systems are non-invasive and do not require the physical collection of samples. The scalability of these systems also allows for the identification of a large number of objects in a short period of time. However, using images for object recognition has several limitations, especially for tree species identification. These limitations include variations in appearance caused by factors such as tree age, season, and location. Images captured in low-light or harsh lighting conditions may lack the detail needed for accurate object recognition, and low-resolution images may not provide enough detail either. Additionally, objects in the foreground or background of an image may obscure the target object, making it challenging to identify accurately. Some objects may also have similar appearances, making it difficult to differentiate between them based on an image alone. Furthermore, the accuracy of an object recognition model is often limited by the quality and quantity of the available training data. If a model is trained using a limited number of images, it may not be able to accurately recognize objects in new images.

The aim of this study was twofold: to evaluate the performance of a classification model generated with a combination of ortho and oblique imagery obtained from a mobile mapping system, and to automate the process of identifying tree species. The research was conducted using the open-source ML.NET framework, which enabled the automation of recognizing and classifying the selected tree species.

The study was conducted in collaboration with a mobile mapping company operating in the Scandinavian market to support vegetation maintenance in the power grid. The objective of the study was to identify two dominant tree species in the region, pine and spruce, which are widely distributed in Sweden, Norway, and Finland.

Material and methods

Research outline

The study aimed to classify two tree species, pine and spruce, based on images acquired using an airborne mobile mapping system (MMS). The MMS used in the study included a Riegl VUX-240 laser scanner, two PhaseOne digital cameras (iXU 100 Mpx), a Trimble GPS receiver, and an Ekinox-N IMU. The data was collected in the Bielsko-Biała region of Poland using a helicopter-mounted MMS in two campaigns, during the leaf-off and leaf-on seasons, with different configurations of the sensor components. During leaf-off data collection, one digital camera was mounted below the helicopter to capture images for creating ortho images. The second camera was mounted at the front of the helicopter to capture oblique images of the area. In the leaf-on campaign, the digital cameras captured oblique images in the front and rear directions of the flying platform. The cameras were mounted at an angle of approximately 45 degrees. A cumulative scheme showing the camera positions used in both campaigns is depicted in Figure 1.

Figure 1. Scheme of camera mounting on the mobile mapping system - the green color corresponds to the camera mounted in the front direction, the red color corresponds to the vertical camera, and the blue color corresponds to the camera mounted in the rear direction.


The trajectory of the flying platform was calculated using Inertial Explorer software. PhaseOne images were converted from IIQ to JPG format using Capture One Pro software. Inertial Explorer is a software tool commonly used to process and analyze data from inertial measurement units, while Capture One Pro is a professional image editing software that allows users to convert, edit, and enhance images.

Dataset

The dataset used for deep learning included both ortho and oblique images. These images were generously provided by a partner company.

Ortho images

The ortho images for the leaf-off season were an essential resource for accurately identifying and classifying different types of trees in the study area. The high resolution of the images allowed for the precise detection and labeling of tree species, providing the necessary data for building a deep learning model for tree classification. Annotating images captured during the leaf-off season, when the trees were leafless, eliminated the possibility of misclassification between tree species. Figure 2 shows an example of an ortho image that was used to train the deep learning model.

Figure 2. Part of the ortho image prepared for the study area.


Oblique images

In addition to the ortho images, oblique images were used. During the aerial acquisition process, images of the study area were collected from both the front and rear directions of the flight path. These images were captured in leaf-on conditions, meaning that the trees were in a state of full foliage.

To prepare these images for use in tree species labeling and in building deep learning models, they were resized using IrfanView software. Images were converted from 11,608 × 8,708 to 2,000 × 2,000 pixels. The resizing process was performed in a way that preserved all of the trees' characteristics. The output images were used to label tree species and create a dataset for the deep learning model. The labeling process involved identifying and classifying the two types of trees based on their visual characteristics. Figure 3 presents samples of images captured by the PhaseOne cameras.
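For illustration, this batch resize can also be scripted rather than done interactively; the following minimal C# sketch uses System.Drawing and assumes hypothetical file names (the study itself performed this step in IrfanView):

```csharp
using System.Drawing;
using System.Drawing.Drawing2D;
using System.Drawing.Imaging;

class ResizeDemo
{
    static void Main()
    {
        // Hypothetical paths for illustration only.
        const string inputPath = "oblique_raw.jpg";   // 11,608 x 8,708 px source image
        const string outputPath = "oblique_2000.jpg"; // 2,000 x 2,000 px training image

        using var source = new Bitmap(inputPath);
        using var resized = new Bitmap(2000, 2000);
        using (var g = Graphics.FromImage(resized))
        {
            // High-quality bicubic interpolation helps preserve crown texture.
            g.InterpolationMode = InterpolationMode.HighQualityBicubic;
            g.DrawImage(source, 0, 0, 2000, 2000);
        }
        resized.Save(outputPath, ImageFormat.Jpeg);
    }
}
```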

Figure 3. Samples of the resized images from the PhaseOne cameras in the front and rear directions.


ML.NET framework

Machine learning and deep learning are two distinct branches of artificial intelligence. In essence, machine learning refers to the capability of AI systems to autonomously adapt and improve with minimal human intervention. On the other hand, deep learning constitutes a subset of machine learning that harnesses artificial neural networks to emulate the intricate learning mechanisms observed in the human brain.

In this study, we employed the ML.NET framework to facilitate the training and deployment of a model aimed at classifying two distinct species with the Deep Neural Network (DNN) functionality offered by the framework. ML.NET is a free and open-source cross-platform machine learning framework developed by Microsoft. The framework was designed with the aim of making machine and deep learning more accessible to developers, enabling them to utilize a single framework for the integration, testing, and deployment of machine learning pipelines. ML.NET supports a wide range of machine learning scenarios, including data classification, image classification, value prediction, object detection, and recommendation (Ahmed et al., Citation2019). A comprehensive presentation of the ML.NET architecture and the application demands that shaped it can be found in Ahmed et al. (Citation2019). Detailed ML.NET documentation can be found on the Microsoft website (https://learn.microsoft.com/en-us/dotnet/machine-learning/).
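As a concrete illustration of how such a pipeline looks in code, the sketch below assembles an ML.NET image classification pipeline. It is a minimal example rather than the authors' exact code; the input class, column names, image folder, and helper method are assumptions made for illustration (the trainer lives in the Microsoft.ML.Vision package):

```csharp
using System.Collections.Generic;
using Microsoft.ML;

// Hypothetical input schema: a path to an image on disk plus its class label.
public class ImageInput
{
    public string ImagePath { get; set; }
    public string Label { get; set; } // e.g. "pine" or "spruce"
}

class TrainDemo
{
    static void Main()
    {
        var mlContext = new MLContext(seed: 0);

        // Assume a list of labeled image paths collected from the dataset folder.
        IEnumerable<ImageInput> samples = LoadSamples(); // hypothetical helper
        IDataView data = mlContext.Data.LoadFromEnumerable(samples);

        var pipeline = mlContext.Transforms.Conversion
                .MapValueToKey("LabelKey", "Label")               // class names -> key type
            .Append(mlContext.Transforms.LoadRawImageBytes(
                "Image", "dataset", "ImagePath"))                 // read image bytes from disk
            .Append(mlContext.MulticlassClassification.Trainers
                .ImageClassification("LabelKey", "Image"))        // DNN trainer (transfer learning)
            .Append(mlContext.Transforms.Conversion
                .MapKeyToValue("PredictedLabel"));                // key -> readable class name

        ITransformer model = pipeline.Fit(data);
        mlContext.Model.Save(model, data.Schema, "TreeModel.zip");
    }

    static IEnumerable<ImageInput> LoadSamples() => new List<ImageInput>(); // placeholder
}
```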

The ML.NET framework includes Model Builder, an intuitive graphical Visual Studio extension that allows developers to easily generate, train, and deploy machine learning models without requiring any machine learning expertise (Yu & Zahidi, Citation2023). The main steps in the Model Builder workflow include choosing a scenario, defining the training environment (local or cloud-based), importing input data for training, training the model using the imported dataset, evaluating the performance of the generated model, and consuming the model by integrating the generated code into an application for image classification (Figure 4). In this study, training was done locally on a computer using a CPU (Intel Core i5-1135G7; 8 GB of Random Access Memory (RAM)).
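The final "consume" step corresponds to code similar to the following sketch, which loads a saved model and classifies a single image; the input/output classes and file names are hypothetical and mirror the training sketch above:

```csharp
using Microsoft.ML;

// Hypothetical schemas mirroring the training pipeline above.
public class ModelInput
{
    public string ImagePath { get; set; }
    public string Label { get; set; }
}

public class ModelOutput
{
    public string PredictedLabel { get; set; }
    public float[] Score { get; set; }
}

class ConsumeDemo
{
    static void Main()
    {
        var mlContext = new MLContext();
        ITransformer model = mlContext.Model.Load("TreeModel.zip", out _);

        // PredictionEngine is convenient for single-image, in-process predictions.
        var engine = mlContext.Model.CreatePredictionEngine<ModelInput, ModelOutput>(model);

        // The path is relative to the image folder configured during training.
        ModelOutput result = engine.Predict(new ModelInput { ImagePath = "sample.jpg" });
        System.Console.WriteLine($"Predicted species: {result.PredictedLabel}");
    }
}
```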

Figure 4. Workflow of ML.NET Model Builder.


The ML.NET framework performs tests on multiple trainer algorithms and chooses the best one found for the final implementation. For image classification, ML.NET utilizes a DNN with the Residual Network model (DNN+ResNet-50). DNNs are composed of multiple layers of artificial neurons, which learn to represent complex patterns in data, allowing them to generalize to new, unseen data (LeCun et al., Citation2015). The ResNet model was introduced by He et al. in 2015 and in the same year won the ILSVRC classification task. The ResNet architecture is known for its ability to train deep neural networks with a large number of layers without encountering gradient dissipation and degradation problems (Habibzadeh et al., Citation2018; Li & Lima, Citation2021). ResNet achieves this by using so-called residual connections, which allow gradients to flow more easily through the network during training (He et al., Citation2016). In the context of image classification, ResNet50 is a specific implementation of the ResNet architecture that contains a 50-layer convolutional neural network (48 convolutional layers, one MaxPool layer, and one average pool layer).

The model architecture is composed of six distinct stages (Figure 5), each with a specific function. The first stage serves as the input component of the network and is made up of Convolutional and Max Pooling layers. The Convolutional layers are responsible for extracting features from the input image by convolving it with a set of learnable filters, while the Max Pooling layers downsample the feature maps and retain only the most relevant information. The subsequent stages, stages 2 through 5, are the residual modules that form the majority of the network. These modules consist of Convolutional Blocks and Identity Blocks. The Convolutional Blocks consist of multiple consecutive convolutional layers that are used to extract high-level features from the input, while the Identity Blocks are designed to facilitate the flow of gradients during backpropagation and prevent vanishing gradients. The Identity Block includes a shortcut connection that bypasses the convolutional layers, allowing the gradient to flow more smoothly through the network. The final stage, stage 6, serves as the output component of the network and is typically composed of fully connected layers and a softmax activation function. The fully connected layers transform the features extracted by the convolutional layers into a form suitable for classification, while the softmax function produces a probability distribution over the classes (Yu & Zahidi, Citation2023). ResNet50 has over 23 million trainable parameters.
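The shortcut connection described above can be stated compactly. Following He et al. (Citation2016), a residual block computes:

```latex
% Residual block: the stacked convolutional layers learn the residual
% mapping F(x, {W_i}); the identity shortcut adds the input x back unchanged.
y = \mathcal{F}(x, \{W_i\}) + x
```

Because the shortcut adds x directly, the gradient always has a path around the residual mapping during backpropagation, which is what mitigates vanishing gradients in very deep networks.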

Figure 5. ResNet-50 model architecture.


The DNN+ResNet50 architecture requires significant computational resources for training and a large amount of labeled data to produce good results.

Methodology

This section outlines the methodology for tree species classification using aerial mapping data and the ML.NET framework.

Deep learning model

Deep learning involves creating a model that is trained to recognize certain types of patterns. The model is trained on a dataset by providing it with an algorithm that it can use to reason and learn from the data. Once the model has been trained, it can be used to make inferences and predictions about new data it has not seen before.

Labeling and annotations

Training a deep learning model for image classification involves the crucial process of annotating and labeling images. Annotation involves manually drawing bounding boxes around objects in an image, while labeling refers to assigning a class name or label to each annotation. These labels serve as metadata and are used to train the model in recognizing the objects. Although manual annotation and labeling are tedious tasks, they are an integral aspect of supervised learning and play a crucial role in the success of the model.

In this study, annotations for the tree species were generated utilizing LabelImg, an open-source software tool designed for deep learning applications (Tzutalin, Citation2015). LabelImg is a user-friendly program that allows object class information to be identified and saved along with the position information marked within the image. The application has already been widely adopted and utilized in several previous studies across various fields of research (Etienne et al., Citation2021; Khosravian et al., Citation2021; Roberts et al., Citation2020; Xuan et al., Citation2022; Zhou et al., Citation2022).
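By default, LabelImg writes each annotation to a Pascal VOC XML file; the snippet below illustrates that structure with hypothetical file names and box coordinates:

```xml
<annotation>
    <folder>ortho</folder>
    <filename>ortho_tile_042.jpg</filename>  <!-- hypothetical image name -->
    <size>
        <width>2000</width>
        <height>2000</height>
        <depth>3</depth>
    </size>
    <object>
        <name>pine</name>                    <!-- class label assigned in LabelImg -->
        <bndbox>                             <!-- pixel coordinates of the bounding box -->
            <xmin>812</xmin>
            <ymin>405</ymin>
            <xmax>1016</xmax>
            <ymax>638</ymax>
        </bndbox>
    </object>
</annotation>
```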

Two sets of annotations were prepared using LabelImg: one set for the ortho images and another set for the oblique images. Figure 6 presents screenshots of the annotation process using LabelImg on the ortho and oblique image datasets. On the ortho images, a total of 107 annotations for pine trees and 102 annotations for spruce trees were created. Similarly, on the oblique images, 150 annotations each were added for pine and spruce trees. These annotations were needed to prepare input images for the deep learning training.

Figure 6. Example of labeling with LabelImg software on an oblique image.


Data processing and augmentation

The annotated images were pre-processed by cropping to the border of the bounding box. This isolated the tree species from the surrounding background and ensured that the deep learning model was trained on the relevant features of the objects. The bounding box is a rectangular region that outlines the object of interest. Figures 7 and 8 present examples of ortho and oblique imagery of pine and spruce clipped to their bounding boxes, which were used as input data for data augmentation.
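A minimal C# sketch of this cropping step, using System.Drawing and hypothetical coordinates read from a LabelImg bounding box:

```csharp
using System.Drawing;
using System.Drawing.Imaging;

class CropDemo
{
    static void Main()
    {
        // Hypothetical box (x, y, width, height) derived from a LabelImg XML file:
        // xmin=812, ymin=405, xmax=1016, ymax=638 -> width=204, height=233.
        var boundingBox = new Rectangle(812, 405, 204, 233);

        using var source = new Bitmap("ortho_tile_042.jpg");
        // Clone only the pixels inside the bounding box, isolating the tree crown
        // from the surrounding background.
        using Bitmap crop = source.Clone(boundingBox, source.PixelFormat);
        crop.Save("pine_0001.jpg", ImageFormat.Jpeg);
    }
}
```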

Figure 7. Examples of ortho images of pine (above) and spruce (below) clipped to bounding boxes.


Figure 8. Examples of oblique images of pine (above) and spruce (below) clipped to bounding boxes.


To mitigate the impact of a limited annotated image dataset on the performance of a deep learning model, various data augmentation techniques have been deployed and utilized in previous studies (Barshooi & Amirkhani, Citation2022; Mikołajczyk & Grochowski, Citation2018; Perez & Wang, Citation2017). In this study, classical image transformations were performed to artificially expand the dataset and improve model performance (Fukuda et al., Citation2020). These included modifications to the brightness (B), contrast (C), saturation (S), and sharpness (Sh) of the images, as well as automatic color adjustment (Aca), flipping the images horizontally (Hf) and vertically (Vv), and rotating them left (Rl) and right (Rr). The data augmentation was carried out using IrfanView version 4.60 software, which has been used as a data augmentation tool in multiple previous studies (Fukuda et al., Citation2020; Kuwada et al., Citation2020; Murata et al., Citation2019). As a result of these techniques, two datasets were prepared for training the deep learning model: the ORTHO set, created from ortho images, and the ORTHO-OBLIQUE set, created from the combination of ortho and oblique imagery. The details of the applied data augmentation techniques are summarised in Table 1. Table 2 describes the dataset composition, indicating the number of images utilized for training the deep learning model.

Table 1. List of datasets for training with information about the applied data augmentation techniques (e.g. B+50 corresponds to a brightness change of +50). Abbreviations are as defined above.

Table 2. List of training datasets with a number of samples.
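The classical transformations listed above can be approximated programmatically as well; the following C# sketch (System.Drawing) reproduces the flips, rotations, and a brightness shift. It only approximates the IrfanView operations used in the study, and all file names are placeholders:

```csharp
using System.Drawing;
using System.Drawing.Imaging;

class AugmentDemo
{
    static void Main()
    {
        SaveFlipped("pine_0001.jpg", "pine_0001_hf.jpg", RotateFlipType.RotateNoneFlipX);   // Hf
        SaveFlipped("pine_0001.jpg", "pine_0001_vf.jpg", RotateFlipType.RotateNoneFlipY);   // Vv
        SaveFlipped("pine_0001.jpg", "pine_0001_rl.jpg", RotateFlipType.Rotate270FlipNone); // Rl
        SaveFlipped("pine_0001.jpg", "pine_0001_rr.jpg", RotateFlipType.Rotate90FlipNone);  // Rr
        SaveBrightened("pine_0001.jpg", "pine_0001_b50.jpg", 50);                           // B+50
    }

    static void SaveFlipped(string input, string output, RotateFlipType flip)
    {
        using var img = new Bitmap(input);
        img.RotateFlip(flip); // in-place flip or 90-degree rotation
        img.Save(output, ImageFormat.Jpeg);
    }

    static void SaveBrightened(string input, string output, int delta)
    {
        using var src = new Bitmap(input);
        using var dst = new Bitmap(src.Width, src.Height);
        float t = delta / 255f; // additive brightness shift expressed in [0, 1]
        var matrix = new ColorMatrix(new[]
        {
            new float[] { 1, 0, 0, 0, 0 },
            new float[] { 0, 1, 0, 0, 0 },
            new float[] { 0, 0, 1, 0, 0 },
            new float[] { 0, 0, 0, 1, 0 },
            new float[] { t, t, t, 0, 1 }, // translation row shifts R, G, B equally
        });
        using var attrs = new ImageAttributes();
        attrs.SetColorMatrix(matrix);
        using var g = Graphics.FromImage(dst);
        g.DrawImage(src, new Rectangle(0, 0, src.Width, src.Height),
            0, 0, src.Width, src.Height, GraphicsUnit.Pixel, attrs);
        dst.Save(output, ImageFormat.Jpeg);
    }
}
```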

In order to validate the image classification model, a sample of ortho images was selected for analysis. The sample comprised 40 images of spruce and 42 images of pine, representing approximately 40% of the labeled samples for each species in the LabelImg software (102 labels were assigned to spruce and 107 labels to pine). To ensure a consistent and accurate evaluation, the validation set did not include any images generated during the augmentation process. The remaining images were used to train the model to identify and classify the tree species under investigation in the study area.

Model evaluation

Evaluating a deep learning model is crucial in assessing its performance and pinpointing areas for improvement.

The Model Builder of the ML.NET framework employs a trained model to make predictions on new test data and measures the accuracy of these predictions. To accomplish this, it divides the training data into two subsets: a training set (80%) that is used to train the model and a test set (20%) that is held back to evaluate the model's performance. Predictions are considered accurate if the model assigns a probability of belonging to a particular category that is greater than 50%. Evaluation metrics in ML.NET are specific to the type of machine learning task that a model performs. For the classification task, the model is evaluated by measuring how well a predicted category matches the actual category. ML.NET's Model Builder also optimizes the training time based on the specified set of images for image classification (https://learn.microsoft.com, 29.01.2023).
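This split-and-evaluate procedure can also be invoked directly against the ML.NET API; a hedged sketch, with column names following the pipeline example above:

```csharp
using Microsoft.ML;

class EvaluateDemo
{
    static void Evaluate(MLContext mlContext, IDataView data, IEstimator<ITransformer> pipeline)
    {
        // Hold back 20% of the data for testing, as Model Builder does.
        var split = mlContext.Data.TrainTestSplit(data, testFraction: 0.2);

        ITransformer model = pipeline.Fit(split.TrainSet);
        IDataView predictions = model.Transform(split.TestSet);

        // Multiclass metrics include the micro- and macro-averaged scores
        // reported by Model Builder.
        var metrics = mlContext.MulticlassClassification.Evaluate(
            predictions, labelColumnName: "LabelKey");
        System.Console.WriteLine($"Micro accuracy: {metrics.MicroAccuracy:F4}");
        System.Console.WriteLine($"Macro accuracy: {metrics.MacroAccuracy:F4}");
    }
}
```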

To evaluate the performance of an image classification model, both micro-average and macro-average accuracy are calculated in the ML.NET framework. The micro average computes a global F1 score by summing the True Positives (TP), False Negatives (FN), and False Positives (FP) over all classes (Equation 1). It measures the aggregate performance of the classifier over all instances. The closer the F1 value is to 1.0, the better the result. An F1 score of 1.0 means perfect precision and recall.

F1 = TP / (TP + ½ (FP + FN))    (1)

where:

F1 - micro-averaged F1 score,

TP – true positives,

FP – false positives,

FN – false negatives.

The macro-average F1 score is computed as the arithmetic mean of the per-class F1 scores. Every class contributes equally to the metric, regardless of how many instances of that class the dataset contains, so minority classes are given the same weight as the larger classes. The closer the macro F1 score is to 1.0, the better the result.
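To make the difference between the two averages concrete, the sketch below computes both from per-class confusion counts; the counts themselves are placeholders, not results from this study:

```csharp
using System;
using System.Linq;

class F1Demo
{
    // Per-class confusion counts: true positives, false positives, false negatives.
    record ClassCounts(string Name, int TP, int FP, int FN);

    static void Main()
    {
        // Hypothetical counts for a two-class problem.
        var classes = new[]
        {
            new ClassCounts("pine",   TP: 90, FP: 5,  FN: 10),
            new ClassCounts("spruce", TP: 85, FP: 10, FN: 5),
        };

        // Micro average: pool all counts first, then apply Equation (1).
        int tp = classes.Sum(c => c.TP);
        int fp = classes.Sum(c => c.FP);
        int fn = classes.Sum(c => c.FN);
        double microF1 = tp / (tp + 0.5 * (fp + fn));

        // Macro average: compute F1 per class, then take the unweighted mean,
        // so minority classes count as much as majority classes.
        double macroF1 = classes.Average(c => c.TP / (c.TP + 0.5 * (c.FP + c.FN)));

        Console.WriteLine($"Micro F1: {microF1:F4}, Macro F1: {macroF1:F4}");
    }
}
```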

The authors conducted an independent evaluation of the model’s accuracy using a dedicated validation set to determine the overall accuracy of species classification. Overall accuracy is a metric that measures the proportion of correct predictions made by the model out of all predictions. It is calculated by dividing the number of correct predictions by the total number of predictions. The validation set comprised approximately 40% of the original images for each species, specifically excluded from the augmentation processes.

To compare with the results obtained using the ML.NET framework, two independent models were built using the InceptionV3 and MobileNetV2 architectures. These models were trained on the dataset of images that produced the best outcomes within the ML.NET framework, and both were constructed and validated using the same dataset.
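In the ML.NET API, the backbone is selected through ImageClassificationTrainer.Options, so the architectures compared here differ in a single enum value (ML.NET exposes ResNet-50 as the ResnetV250 architecture); a minimal sketch, with column names assumed from the earlier examples:

```csharp
using Microsoft.ML;
using Microsoft.ML.Vision;

class ArchitectureDemo
{
    static IEstimator<ITransformer> MakeTrainer(
        MLContext mlContext, ImageClassificationTrainer.Architecture arch)
    {
        var options = new ImageClassificationTrainer.Options
        {
            FeatureColumnName = "Image",  // raw image bytes column
            LabelColumnName = "LabelKey", // key-typed label column
            Arch = arch,                  // backbone selected by the caller
        };
        return mlContext.MulticlassClassification.Trainers.ImageClassification(options);
    }

    static void Demo(MLContext mlContext)
    {
        // The three backbones compared in this study.
        var resnet    = MakeTrainer(mlContext, ImageClassificationTrainer.Architecture.ResnetV250);
        var inception = MakeTrainer(mlContext, ImageClassificationTrainer.Architecture.InceptionV3);
        var mobilenet = MakeTrainer(mlContext, ImageClassificationTrainer.Architecture.MobilenetV2);
    }
}
```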

Results and discussion

In this study, deep learning algorithms were applied to the ORTHO and ORTHO-OBLIQUE datasets to train the models. The training process was carried out using ML.NET and the Model Builder tool, which automatically adjusts the training time to the size of the training dataset. The training duration ranged from 44 minutes for a dataset of 1742 images to 124 minutes for a dataset of 4384 images. The Model Builder tool employed DNN+ResNet50 to construct the models and automatically computed the performance of the best model using metrics such as micro accuracy and macro accuracy. The training process results are presented in Table 3, which summarises the achieved results along with the corresponding training times.

Table 3. Details of training and model parameters generated with ML.NET Model Builder.

It is crucial to mention that the models in this study were built using the default parameters provided by the Microsoft ML.NET framework. These default parameters were selected to ensure consistency and compatibility with the ML.NET framework and its image classification process. The purpose was to demonstrate that even without in-depth programming knowledge and expertise in machine learning and deep learning, the framework’s default parameters can yield satisfactory results. However, it is worth noting that fine-tuning the parameters of the deep learning process can further enhance the efficiency and performance of the models. While this study focused on utilizing the default parameters to showcase the accessibility of ML.NET for achieving results, future research or practical applications using the ML.NET framework may involve parameter refinement to maximize the model’s performance in specific contexts. By adjusting parameters such as learning rate, batch size, or network architecture, it is possible to optimize the models for specific datasets and improve their accuracy and generalization capabilities.

Once the models were built for each dataset, a validation process of species classification was carried out using the validation set. The classification map for the validation set is presented in Table 4, showcasing the results for each model built on the specified sets. The first row corresponds to the validation set of pine trees, while the second row corresponds to spruce. The results of this validation, together with the generated accuracy, are presented in Table 5.

Table 4. Classification map for the validation set for each DNN+ResNet50 model. Green indicates correct classification; red indicates incorrect classification.

Table 5. Details of model validation on a validation set.

After a thorough analysis, we determined that the ORTHO-OBLIQUE dataset was the best-performing, achieving an overall accuracy of 93.9%. This dataset comprised 4384 images, including both ortho and oblique images, and required 124 minutes for training. Pine and spruce classification accuracies were 92.5% and 95.2%, respectively. The model’s micro and macro accuracy were also calculated, yielding values of 0.9878 and 0.9879, respectively, indicating good generalization to the validation data.

The ORTHO dataset, which only used ortho images, generated an overall accuracy of 90.2%. The ORTHO-OBLIQUE dataset outperformed it due to the inclusion of images from different perspectives in the training dataset: it included both ortho and oblique images, providing top-down and side views of trees, whereas the ORTHO dataset only included top-down images. The diversity of image perspectives in the training dataset contributed to the improved results achieved by the ORTHO-OBLIQUE dataset. The input images used in our study were acquired under two specific conditions: leaf-off and leaf-on. We made a deliberate choice not to elaborate on the details of these data acquisition conditions because our classification task specifically targeted coniferous species. Moreover, there are no noticeable variations between the two seasons that are relevant to these particular types of trees.

To facilitate comparative analysis, two additional models were developed on the ORTHO-OBLIQUE dataset, which exhibited the most favorable outcomes within the ML.NET framework using DNN+ResNet50. The models were trained within a C# console application, employing the InceptionV3 and MobileNetV2 architectures. The performance of the InceptionV3 model was evaluated by calculating both micro and macro accuracy, resulting in values of 0.9775 for both metrics. The MobileNetV2 model achieved a micro accuracy of 0.9831 and a macro accuracy of the same value. Validation on the dedicated dataset yielded the following outcomes: the InceptionV3 model achieved a classification accuracy of 85% for pine and 80.9% for spruce, resulting in an overall accuracy of 82.9%. In contrast, the MobileNetV2 model attained an overall accuracy of 81.7%, with 77.5% accuracy for pine classification and 85.7% accuracy for spruce classification. Among these models, the ResNet50 model exhibited the best performance, with an overall accuracy of 93.9%. Table 6 presents the classification map of the validation set for all models, while Table 7 provides detailed information about each model's characteristics.

Table 6. Classification map of the validation set for the DNN+ResNet50, InceptionV3, and MobileNetV2 models. Green indicates correct classification; red indicates incorrect classification.

Table 7. Details of models built on ORTHO-OBLIQUE dataset.

There have been limited studies on the utilization of deep learning methods for the classification of two species using high-resolution RGB imagery from UAV platforms. Among these studies, Kattenborn et al. (Citation2019) tested a CNN-based segmentation approach (U-Net) in combination with training data derived directly from visual interpretation of UAV-based high-resolution RGB imagery for fine-grained mapping of two species and demonstrated that this approach had at least 84% overall accuracy. Haq et al. (Citation2021) used a deep learning-based supervised image classification algorithm and images collected using a UAV for the classification of forest areas, with an overall accuracy of 93%. Using Residual Neural Networks, Natesan et al. (Citation2019) achieved an 80% classification accuracy in distinguishing three distinct types of tree species within a coniferous mixed forest. Egli and Höpke (Citation2020) designed a computationally lightweight CNN for the classification of four tree species based on RGB images obtained from automated UAV observations. Their study demonstrated that, regardless of illumination conditions and phenological stages, average classification accuracies of 92% could be achieved. These studies collectively highlight the potential of deep learning methods for tree species classification using high-resolution RGB imagery from UAV platforms, showcasing the range of accuracies achieved and the effectiveness of various approaches.

The findings of a comprehensive review by Michałowska and Rapiński (Citation2021), focused on tree species classification based on airborne LiDAR datasets, revealed that in studies specifically targeting the classification of two tree species, the median accuracy reached 89.76%. Out of the 21 studies investigated, only five achieved overall accuracies in the highest range, from 93% to 97.1%. The results obtained in this study demonstrated an overall accuracy of 93.9%, which falls within the range of the best classification results reported in studies conducted on LiDAR data (these studies encompassed the use of various features such as geometric properties, radiometric attributes, and features derived from full-waveform decomposition) (Figure 9).

Figure 9. Relationship between the overall accuracy and the number of discriminated species in the studies reviewed in the Michałowska and Rapiński (Citation2021) article. The result of this research is indicated on the chart by a green dot.


Conclusions

Deep learning models have gained popularity in automating image analysis tasks due to their impressive results across a range of applications, including remote sensing and computer vision. The primary objective of this study was to employ the ML.NET framework to construct a deep learning model capable of accurately classifying pine and spruce trees based on ortho and oblique imagery. The authors prepared two distinct datasets of tree samples and trained models in the ML.NET framework utilizing deep neural networks in conjunction with the ResNet-50 architecture. The experiment revealed that the ORTHO-OBLIQUE dataset delivered the highest performance, achieving an overall accuracy of 93.9%. The inclusion of images from multiple perspectives in the training dataset was the key factor contributing to the improved performance of the ORTHO-OBLIQUE dataset. The results suggest that species identification at very high spatial resolutions is facilitated through spatial patterns. This study underscores the effectiveness of the ML.NET framework in constructing deep learning models. The Model Builder feature within ML.NET enables users with limited software development knowledge to handle data training, thereby saving time and effort in model development. This study demonstrates the practical value of ML.NET as an accessible tool for image classification tasks, even for domain experts with limited machine and deep learning experience.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Additional information

Data availability statement

The data that support the findings of this study are available from the corresponding author, MM, upon reasonable request.

References

  • Ahmed, Z., Amizadeh, S., Bilenko, M., Carr, R., Chin, W. S., Dekel, Y., Dupre, X., Eksarevskiy, V., Filipi, S., Finley, T., Goswami, A., Hoover, M., Inglis, S., Interlandi, M., Kazmi, N., Krivosheev, G., Luferenko, P., Matantsev, I., Matusevych, S., … Zhu, Y. (2019, July). Machine learning at Microsoft with ML.NET. Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (pp. 2448–2458).
  • Al-Haija, Q. A., & Adebanjo, A. (2020). Breast cancer diagnosis in histopathological images using ResNet-50 convolutional neural network. 2020 IEEE International IOT, Electronics and Mechatronics Conference (IEMTRONICS) (pp. 1–7). IEEE.
  • Avtar, R., Malik, R., Musthafa, M., Rathore, V. S., Kumar, P., & Singh, G. (2023). Forest plantation species classification using Full-Pol-Time-Averaged SAR scattering powers. Remote Sensing Applications: Society & Environment, 29, 100924. https://doi.org/10.1016/j.rsase.2023.100924
  • Barshooi, A. H., & Amirkhani, A. (2022). A novel data augmentation based on Gabor filter and convolutional deep learning for improving the classification of COVID-19 chest X-Ray images. Biomedical Signal Processing and Control, 72, 103326. https://doi.org/10.1016/j.bspc.2021.103326
  • Bont, L., Hill, A., Waser, L., Bürgi, A., Ginzler, C., & Blattert, C. (2020). Airborne-laser-scanning-derived auxiliary information discriminating between broadleaf and conifer trees improves the accuracy of models for predicting timber volume in mixed and heterogeneously structured forests. Forest Ecology and Management, 459, 117856. https://doi.org/10.1016/j.foreco.2019.117856
  • Brovkina, O., Cienciala, E., Surový, P., & Janata, P. (2018). Unmanned Aerial Vehicles (UAV) for assessment of qualitative classification of Norway spruce in temperate forest stands. Geo-Spatial Information Science, 21(1), 12–20. https://doi.org/10.1080/10095020.2017.1416994
  • Cao, K., & Zhang, X. (2020). An improved res-unet model for tree species classification using airborne high-resolution images. Remote Sensing, 12(7), 1128. https://doi.org/10.3390/rs12071128
  • Chamberlain, C. P., Meador, A. J. S., & Thode, A. E. (2020). Airborne lidar provides reliable estimates of canopy base height and canopy bulk density in southwestern ponderosa pine forests. Forest Ecology and Management, 481, 118695. https://doi.org/10.1016/j.foreco.2020.118695
  • Ciregan, D., Meier, U., & Schmidhuber, J. (2012). Multi-column deep neural networks for image classification. 2012 IEEE Conference on Computer Vision and Pattern Recognition (pp. 3642–3649). IEEE.
  • Crisci, C., Ghattas, B., & Perera, G. (2012). A review of supervised machine learning algorithms and their applications to ecological data. Ecological Modelling, 240, 113–122. https://doi.org/10.1016/j.ecolmodel.2012.03.001
  • Dalponte, M., Bruzzone, L., & Gianelle, D. (2012). Tree species classification in the Southern Alps based on the fusion of very high geometrical resolution multispectral/hyperspectral images and LiDAR data. Remote Sensing of Environment, 123, 258–270. https://doi.org/10.1016/j.rse.2012.03.013
  • Darwish, A., Leukert, K., & Reinhardt, W. (2003). Image segmentation for the purpose of object-based classification. IGARSS 2003. 2003 IEEE International Geoscience and Remote Sensing Symposium. Proceedings (IEEE Cat. No. 03CH37477) (Vol. 3, pp. 2039–2041). IEEE.
  • de Vries, T. N., Bronkhorst, J., Vermeer, M., Donker, J. C., Briels, S. A., Ziar, H., Zeman, M., & Isabella, O. (2020). A quick-scan method to assess photovoltaic rooftop potential based on aerial imagery and LiDAR. Solar Energy, 209, 96–107. https://doi.org/10.1016/j.solener.2020.07.035
  • Egli, S., & Höpke, M. (2020). CNN-Based tree species classification using high resolution RGB image data from automated UAV observations. Remote Sensing, 12(23), 3892. https://doi.org/10.3390/rs12233892.
  • El-Sheimy, N. (1996). The development of VISAT: A mobile survey system for GIS applications. University of Calgary.
  • Etienne, A., Ahmad, A., Aggarwal, V., & Saraswat, D. (2021). Deep learning-based object detection system for identifying weeds using UAS imagery. Remote Sensing, 13(24), 5182. https://doi.org/10.3390/rs13245182
  • Firat, H., & Hanbay, D. (2021). Classification of hyperspectral images using 3D CNN based ResNet50. 2021 29th Signal Processing and Communications Applications Conference (SIU) (pp. 1–4). IEEE.
  • Franklin, S. E. (2018). Pixel-and object-based multispectral classification of forest tree species from small unmanned aerial vehicles. Journal of Unmanned Vehicle Systems, 6(4), 195–211. https://doi.org/10.1139/juvs-2017-0022
  • Franklin, S. E., & Ahmed, O. S. (2018). Deciduous tree species classification using object-based analysis and machine learning with unmanned aerial vehicle multispectral data. International Journal of Remote Sensing, 39(15–16), 5236–5245. https://doi.org/10.1080/01431161.2017.1363442
  • Franklin, S. E., Ahmed, O. S., & Williams, G. (2017). Northern conifer forest species classification using multispectral data acquired from an unmanned aerial vehicle. PE&Rs, Photogrammetric Engineering & Remote Sensing, 83(7), 501–507. https://doi.org/10.14358/PERS.83.7.501
  • Fukuda, M., Ariji, Y., Kise, Y., Nozawa, M., Kuwada, C., Funakoshi, T., Muramatsu, C., Fujita, H., Katsumata, A., & Ariji, E. (2020). Comparison of 3 deep learning neural networks for classifying the relationship between the mandibular third molar and the mandibular canal on panoramic radiographs. Oral Surgery, Oral Medicine, Oral Pathology and Oral Radiology, 130(3), 336–343. https://doi.org/10.1016/j.oooo.2020.04.005
  • Gamdha, D., Unnikrishnakurup, S., Rose, K. J., Surekha, M., Purushothaman, P., Ghose, B., & Balasubramaniam, K. (2021). Automated defect recognition on X-ray radiographs of solid propellant using deep learning based on convolutional neural networks. Journal of Nondestructive Evaluation, 40(1), 1–13. https://doi.org/10.1007/s10921-021-00750-4
  • Gini, R., Passoni, D., Pinto, L., & Sona, G. (2012). Aerial images from an UAV system: 3D modeling and tree species classification in a park area. The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, 39(B1), 361–366. https://doi.org/10.5194/isprsarchives-XXXIX-B1-361-2012
  • Goldblatt, R., Stuhlmacher, M. F., Tellman, B., Clinton, N., Hanson, G., Georgescu, M., Wang, C., Serrano-Candela, F., Khandelwal, A., Cheng, W., & Balling, R. C., Jr. (2018). Using Landsat and nighttime lights for supervised pixel-based image classification of urban land cover. Remote Sensing of Environment, 205, 253–275. https://doi.org/10.1016/j.rse.2017.11.026
  • Guo, X., Li, H., Jing, L., & Wang, P. (2022). Individual tree species classification based on convolutional neural networks and multitemporal high-resolution remote sensing images. Sensors, 22(9), 3157. https://doi.org/10.3390/s22093157
  • Habibzadeh, M., Jannesari, M., Rezaei, Z., Baharvand, H., & Totonchi, M. (2018). Automatic white blood cell classification using pre-trained deep learning models: Resnet and inception. Tenth International Conference on Machine Vision (ICMV 2017) (Vol. 10696, pp. 274–281). SPIE.
  • Haq, M. A., Rahaman, G., Baral, P., & Ghosh, A. (2021). Deep learning based supervised image classification using UAV images for forest areas classification. J Indian Soc Remote Sens, 49(3), 601–606. https://doi.org/10.1007/s12524-020-01231-3
  • Heinzel, J., & Koch, B. (2011). Exploring full-waveform LiDAR parameters for tree species classification. International Journal of Applied Earth Observation and Geoinformation: ITC Journal, 13(1), 152–160. https://doi.org/10.1016/j.jag.2010.09.010
  • Heinzel, J., & Koch, B. (2012). Investigating multiple data sources for tree species classification in temperate forest and use for single tree delineation. International Journal of Applied Earth Observation and Geoinformation, 18, 101–110. https://doi.org/10.1016/j.jag.2012.01.025
  • He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 770–778).
  • He, T., Zhou, H., Xu, C., Hu, J., Xue, X., Xu, L., Lou, X., Zeng, K., & Wang, Q. (2023). Deep learning in forest tree species classification using sentinel-2 on google Earth engine: A case study of Qingyuan County. Sustainability, 15(3), 2741. https://doi.org/10.3390/su15032741
  • Huang, J., Zhang, X., Xin, Q., Sun, Y., & Zhang, P. (2019). Automatic building extraction from high-resolution aerial images and LiDAR data using gated residual refinement network. ISPRS Journal of Photogrammetry and Remote Sensing, 151, 91–105. https://doi.org/10.1016/j.isprsjprs.2019.02.019
  • Hung, J., & Carpenter, A. (2017). Applying faster R-CNN for object detection on malaria images. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops (pp. 56–61).
  • Immitzer, M., Atzberger, C., & Koukal, T. (2012). Tree species classification with random forest using very high spatial resolution 8-band WorldView-2 satellite data. Remote Sensing, 4(9), 2661–2693. https://doi.org/10.3390/rs4092661
  • Kattenborn, T., Eichel, J., & Fassnacht, F. E. (2019). Convolutional neural networks enable efficient, accurate and fine-grained segmentation of plant species and communities from high-resolution UAV imagery. Scientific Reports, 9(1), 17656. https://doi.org/10.1038/s41598-019-53797-9
  • Kattenborn, T., Leitloff, J., Schiefer, F., & Hinz, S. (2021). Review on Convolutional Neural Networks (CNN) in vegetation remote sensing. Isprs Journal of Photogrammetry & Remote Sensing, 173, 24–49. https://doi.org/10.1016/j.isprsjprs.2020.12.010
  • Ke, Y., Quackenbush, L. J., & Im, J. (2010). Synergistic use of QuickBird multispectral imagery and LIDAR data for object-based forest species classification. Remote Sensing of Environment, 114(6), 1141–1154. https://doi.org/10.1016/j.rse.2010.01.002
  • Khosravian, A., Amirkhani, A., Kashiani, H., & Masih-Tehrani, M. (2021). Generalizing state-of-the-art object detectors for autonomous vehicles in unseen environments. Expert Systems with Applications, 183, 115417. https://doi.org/10.1016/j.eswa.2021.115417
  • Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2017). ImageNet classification with deep convolutional neural networks. Communications of the ACM, 60(6), 84–90. https://doi.org/10.1145/3065386
  • Krüger Geb Amiri, N., Heurich, M., Krzystek, P., & Skidmore, A. (2018). Feature relevance assessment of multispectral airborne LiDAR data for tree species classification. The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, XLII-3, 31–34. https://doi.org/10.5194/isprs-archives-XLII-3-31-2018
  • Kukkonen, M., Maltamo, M., Korhonen, L., & Packalen, P. (2019). Multispectral airborne LiDAR data in the prediction of boreal tree species composition. IEEE Transactions on Geoscience and Remote Sensing: A Publication of the IEEE Geoscience and Remote Sensing Society, 57(6), 3462–3471. https://doi.org/10.1109/TGRS.2018.2885057
  • Kuwada, C., Ariji, Y., Fukuda, M., Kise, Y., Fujita, H., Katsumata, A., & Ariji, E. (2020). Deep learning systems for detecting and classifying the presence of impacted supernumerary teeth in the maxillary incisor region on panoramic radiographs. Oral Surgery, Oral Medicine, Oral Pathology and Oral Radiology, 130(4), 464–469. https://doi.org/10.1016/j.oooo.2020.04.813
  • LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436–444. https://doi.org/10.1038/nature14539
  • Li, H., Hu, B., Li, Q., & Jing, L. (2021). CNN-Based individual tree species classification using high-resolution satellite imagery and airborne LiDAR data. Forests, 12(12), 1697. https://doi.org/10.3390/f12121697
  • Li, B., & Lima, D. (2021). Facial expression recognition via ResNet-50. International Journal of Cognitive Computing in Engineering, 2, 57–64. https://doi.org/10.1016/j.ijcce.2021.02.002
  • Li, Y., Tang, B., Li, J., Sun, W., Lin, Z., & Luo, Q. (2021). Research on common tree species recognition by faster R-CNN based on whole tree image. In 2021 IEEE 6th International Conference on Signal and Image Processing (ICSIP) (pp. 28–32). IEEE.
  • Liu, D., & Xia, F. (2010). Assessing object-based classification: Advantages and limitations. Remote Sensing Letters, 1(4), 187–194. https://doi.org/10.1080/01431161003743173
  • Liu, Q., Xiao, L., Yang, J., & Wei, Z. (2020). CNN-enhanced graph convolutional network with pixel-and superpixel-level feature fusion for hyperspectral image classification. IEEE Transactions on Geoscience and Remote Sensing, 59(10), 8657–8671. https://doi.org/10.1109/TGRS.2020.3037361
  • Magnard, C., Morsdorf, F., Small, D., Stilla, U., Schaepman, M. E., & Meier, E. (2016). Single tree identification using airborne multibaseline SAR interferometry data. Remote Sensing of Environment, 186, 567–580. https://doi.org/10.1016/j.rse.2016.09.018
  • Mahdi, F. P., Motoki, K., & Kobashi, S. (2020). Optimization technique combined with deep learning method for teeth recognition in dental panoramic radiographs. Scientific Reports, 10(1), 19261. https://doi.org/10.1038/s41598-020-75887-9
  • Mahdy, L. N., Ezzat, K. A., Elmousalami, H. H., Ella, H. A., & Hassanien, A. E. (2020). Automatic x-ray COVID-19 lung image classification system based on multi-level thresholding and support vector machine. MedRxiv, 2020.03. https://doi.org/10.1101/2020.03.30.20047787
  • Mallick, P. K., Ryu, S. H., Satapathy, S. K., Mishra, S., Nguyen, G. N., & Tiwari, P. (2019). Brain MRI image classification for cancer detection using deep wavelet autoencoder-based deep neural network. IEEE Access, 7, 46278–46287. https://doi.org/10.1109/ACCESS.2019.2902252
  • Michałowska, M. (2020). Verification of building constructions surroundings based on airborne laser scanning data. XV International Conference on Durability of Building Materials and Components (DBMC 2020).
  • Michałowska, M., & Rapiński, J. (2021). A review of tree species classification based on airborne LiDAR data and applied classifiers. Remote Sensing, 13(3), 353. https://doi.org/10.3390/rs13030353
  • Mikołajczyk, A., & Grochowski, M. (2018). Data augmentation for improving deep learning in image classification problem. In 2018 International Interdisciplinary PhD Workshop (IIPhDW) (pp. 117–122). IEEE.
  • MS, M., & SS, S. R. (2022). Optimal squeeze net with deep neural network-based aerial image classification model in unmanned aerial vehicles. Traitement du Signal, 39(1), 275–281. https://doi.org/10.18280/ts.390128
  • Murata, M., Ariji, Y., Ohashi, Y., Kawai, T., Fukuda, M., Funakoshi, T., Kise, Y., Nozawa, M., Katsumata, A., Fujita, H., & Ariji, E. (2019). Deep-learning classification using convolutional neural network for evaluation of maxillary sinusitis on panoramic radiography. Oral Radiology, 35(3), 301–307. https://doi.org/10.1007/s11282-018-0363-7
  • Natesan, S., Armenakis, C., & Vepakomma, U. (2019). ResNet-based tree species classification using UAV images. The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, XLII-2/W13, 475–481. https://doi.org/10.5194/isprs-archives-XLII-2-W13-475-2019
  • Natesan, S., Armenakis, C., & Vepakomma, U. (2020). Individual tree species identification using dense convolutional network (DenseNet) on multitemporal RGB images from UAV. Journal of Unmanned Vehicle Systems, 8(4), 310–333. https://doi.org/10.1139/juvs-2020-0014
  • Nevalainen, O., Honkavaara, E., Tuominen, S., Viljanen, N., Hakala, T., Yu, X., Hyyppä, J., Saari, H., Pölönen, I., Imai, N. N., & Tommaselli, A. M. (2017). Individual tree detection and classification with UAV-based photogrammetric point clouds and hyperspectral imaging. Remote Sensing, 9(3), 185. https://doi.org/10.3390/rs9030185
  • Nogueira, K., dos Santos, J. A., Menini, N., Silva, T. S., Morellato, L. P. C., & Torres, R. D. S. (2019). Spatio-temporal vegetation pixel classification by using convolutional networks. IEEE Geoscience and Remote Sensing Letters, 16(10), 1665–1669. https://doi.org/10.1109/LGRS.2019.2903194
  • Onishi, M., & Ise, T. (2018). Automatic classification of trees using a UAV onboard camera and deep learning. arXiv preprint arXiv:1804.10390.
  • Onishi, M., & Ise, T. (2021). Explainable identification and mapping of trees using UAV RGB image and deep learning. Scientific Reports, 11(1), 903. https://doi.org/10.1038/s41598-020-79653-9
  • Ørka, H. O., Næsset, E., & Bollandsås, O. M. (2010). Effects of different sensors and leaf-on and leaf-off canopy conditions on echo distributions and individual tree properties derived from airborne laser scanning. Remote Sensing of Environment, 114(7), 1445–1461. https://doi.org/10.1016/j.rse.2010.01.024
  • Perez, L., & Wang, J. (2017). The effectiveness of data augmentation in image classification using deep learning. arXiv preprint arXiv:1712.04621.
  • Pyysalo, U., & Hyyppä, H. (2002). Reconstructing tree crowns from laser scanner data for feature extraction. International Archives of Photogrammetry, Remote Sensing and Spatial Information Sciences, 34(3/B), 218–221.
  • Redmon, J., Divvala, S., Girshick, R., & Farhadi, A. (2016). You only look once: Unified, real-time object detection. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 779–788).
  • Roberts, R., Giancontieri, G., Inzerillo, L., & Di Mino, G. (2020). Towards low-cost pavement condition health monitoring and analysis using deep learning. Applied Sciences, 10(1), 319. https://doi.org/10.3390/app10010319
  • Safonova, A., Tabik, S., Alcaraz-Segura, D., Rubtsov, A., Maglinets, Y., & Herrera, F. (2019). Detection of fir trees (Abies sibirica) damaged by the bark beetle in unmanned aerial vehicle images with deep learning. Remote Sensing, 11(6), 643. https://doi.org/10.3390/rs11060643
  • Sarrazin, D., van Aardt, J., Asner, G., Mcglinchy, J., Messinger, D., & Wu, J. (2011). Fusing small-footprint waveform LiDAR and hyperspectral data for canopy-level species classification and herbaceous biomass modeling in savanna ecosystems. Canadian Journal of Remote Sensing, 37(6), 653–665. https://doi.org/10.5589/m12-007
  • Sarwinda, D., Paradisa, R. H., Bustamam, A., & Anggia, P. (2021). Deep learning in image classification using residual network (ResNet) variants for detection of colorectal cancer. Procedia Computer Science, 179, 423–431. https://doi.org/10.1016/j.procs.2021.01.025
  • Schiefer, F., Kattenborn, T., Frick, A., Frey, J., Schall, P., Koch, B., & Schmidtlein, S. (2020). Mapping forest tree species in high resolution UAV-based RGB-imagery by means of convolutional neural networks. ISPRS Journal of Photogrammetry and Remote Sensing, 170, 205–215. https://doi.org/10.1016/j.isprsjprs.2020.10.015
  • Shen, X., & Cao, L. (2017). Tree-species classification in subtropical forests using airborne hyperspectral and LiDAR data. Remote Sensing, 9(11), 1180. https://doi.org/10.3390/rs9111180
  • Shotton, J., Sharp, T., Kipman, A., Fitzgibbon, A., Finocchio, M., Blake, A., Cook, M., & Moore, R. (2013). Real-time human pose recognition in parts from single depth images. Communications of the ACM, 56(1), 116–124. https://doi.org/10.1145/2398356.2398381
  • Simonyan, K., & Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556.
  • Solberg, S., Næsset, E., & Bollandsås, O. M. (2006). Single tree segmentation using airborne laser scanner data in a structurally heterogeneous spruce forest. Photogrammetric Engineering & Remote Sensing, 72(12), 1369–1378. https://doi.org/10.14358/PERS.72.12.1369
  • Song, Y., He, F., & Zhang, X. (2019). To identify tree species with highly similar leaves based on a novel attention mechanism for CNN. IEEE Access, 7, 163277–163286. https://doi.org/10.1109/ACCESS.2019.2951607
  • Sothe, C., Dalponte, M., Almeida, C. M. D., Schimalski, M. B., Lima, C. L., Liesenberg, V., Miyoshi, G. T., & Tommaselli, A. M. G. (2019). Tree species classification in a highly diverse subtropical forest integrating UAV-based photogrammetric point cloud and hyperspectral data. Remote Sensing, 11(11), 1338. https://doi.org/10.3390/rs11111338
  • Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., & Rabinovich, A. (2015). Going deeper with convolutions. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 1–9).
  • Tsai, D. M., & Chen, W. L. (2017). Coffee plantation area recognition in satellite images using Fourier transform. Computers and Electronics in Agriculture, 135, 115–127. https://doi.org/10.1016/j.compag.2016.12.020
  • Tzutalin. (2015). LabelImg [Git repository]. https://github.com/tzutalin/labelImg
  • Udali, A., Lingua, E., & Persson, H. J. (2021). Assessing forest type and tree species classification using Sentinel-1 C-band SAR data in southern Sweden. Remote Sensing, 13(16), 3237. https://doi.org/10.3390/rs13163237
  • Wäldchen, J., Mäder, P., & Cooper, N. (2018). Machine learning for image based species identification. Methods in Ecology and Evolution, 9(11), 2216–2225. https://doi.org/10.1111/2041-210X.13075
  • Wang, Q., Bi, S., Sun, M., Wang, Y., Wang, D., Yang, S., & Zhang, J. (2019). Deep learning approach to peripheral leukocyte recognition. PLoS ONE, 14(6), e0218808. https://doi.org/10.1371/journal.pone.0218808
  • Wen, L., Li, X., & Gao, L. (2020). A transfer convolutional neural network for fault diagnosis based on ResNet-50. Neural Computing and Applications, 32(10), 6111–6124. https://doi.org/10.1007/s00521-019-04097-w
  • Wu, Q., Lane, C. R., Li, X., Zhao, K., Zhou, Y., Clinton, N., de Vries, B., Golden, H., & Lang, M. W. (2019). Integrating LiDAR data and multi-temporal aerial imagery to map wetland inundation dynamics using Google Earth Engine. Remote Sensing of Environment, 228, 1–13. https://doi.org/10.1016/j.rse.2019.04.015
  • Xuan, J., Li, X., Du, H., Zhou, G., Mao, F., Wang, J., Zhang, B., Gong, Y., Zhu, D., Zhou, L., Huang, Z., Xu, C., Chen, J., Zhou, Y., Chen, C., Tan, C., & Sun, J. (2022). Intelligent estimating the tree height in urban forests based on deep learning combined with a smartphone and a comparison with UAV-LiDAR. Remote Sensing, 15(1), 97. https://doi.org/10.3390/rs15010097
  • Yan, S., Jing, L., & Wang, H. (2021). A new individual tree species recognition method based on a convolutional neural network and high-spatial resolution remote sensing imagery. Remote Sensing, 13(3), 479. https://doi.org/10.3390/rs13030479
  • Yao, W., Krzystek, P., & Heurich, M. (2012). Tree species classification and estimation of stem volume and DBH based on single tree extraction by exploiting airborne full-waveform LiDAR data. Remote Sensing of Environment, 123, 368–380. https://doi.org/10.1016/j.rse.2012.03.027
  • You, H. T., Lei, P., Li, M. S., & Ruan, F. Q. (2020). Forest species classification based on three-dimensional coordinate and intensity information of airborne LiDAR data with random forest method. The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, XLII-3/W10, 117–123. https://doi.org/10.5194/isprs-archives-XLII-3-W10-117-2020
  • Yu, X., Hyyppä, J., Litkey, P., Kaartinen, H., Vastaranta, M., & Holopainen, M. (2017). Single-sensor solution to tree species classification using multispectral airborne laser scanning. Remote Sensing, 9(2), 108. https://doi.org/10.3390/rs9020108
  • Yu, H., & Zahidi, I. (2023). Tailings pond classification based on satellite images and machine learning: An exploration of Microsoft ML.NET. Mathematics, 11(3), 517. https://doi.org/10.3390/math11030517
  • Zhang, C., Xia, K., Feng, H., Yang, Y., & Du, X. (2021). Tree species classification using deep learning and RGB optical images obtained by an unmanned aerial vehicle. Journal of Forestry Research, 32(5), 1879–1888. https://doi.org/10.1007/s11676-020-01245-0
  • Zhou, Y., Liu, W., Bi, H., Chen, R., Zong, S., & Luo, Y. (2022). A detection method for individual infected pine trees with pine wilt disease based on deep learning. Forests, 13(11), 1880. https://doi.org/10.3390/f13111880