Research Article

Tree species classification on images from airborne mobile mapping using ML.NET

Article: 2271651 | Received 30 Mar 2023, Accepted 11 Oct 2023, Published online: 07 Nov 2023

ABSTRACT

Deep learning is a powerful tool for automating the process of recognizing and classifying objects in images. In this study, we used ML.NET, a popular open-source machine learning framework, to develop a model for identifying tree species in images obtained from airborne mobile mapping. These high-resolution images can be used to create detailed maps of the landscape. They can also be analyzed and processed to extract information about visual features, including tree species recognition. The deep learning model was trained using ML.NET to classify two tree species based on a combination of ortho and oblique airborne mobile mapping images. Our approach yielded impressive results, with a maximum classification accuracy of 93.9%. This demonstrates the effectiveness of combining imagery sources with deep learning tools in ML.NET for efficient and accurate tree species classification. This study highlights the potential of the ML.NET framework for automating object classification and can provide valuable insights and information for forestry management and conservation efforts. The primary objective of this research was to evaluate the effectiveness of an approach for identifying tree species through a model generated using a combination of ortho and oblique images captured by a mobile mapping system.

Introduction

The primary tree species classification method involves expensive and time-consuming in-situ surveys that measure structural attributes of individual trees, such as tree height, diameter at breast height, leaf area index, and branch angle, and then identify species by comparing these parameters to a norm. The accurate classification of tree species plays a vital role in the effective and sustainable management of forest resources, but automation of this process still remains a challenging task for scientists and land managers (Michałowska & Rapiński, Citation2021). The spatial composition of tree species is critical for a variety of reasons, including economics, ecology, and technology. In addition, a tree species map is essential for forest inventory and has been recognized as a valuable tool in the field of forest management and planning (Dalponte et al., Citation2012; Heinzel & Koch, Citation2012). Advances in technology have made it possible to classify species more efficiently using remote sensing techniques, including optical and Synthetic Aperture Radar (SAR) images (Avtar et al., Citation2023; Magnard et al., Citation2016; Udali et al., Citation2021), images captured by cameras mounted on aircraft such as helicopters or unmanned aerial vehicles (UAVs) (Gini et al., Citation2012; Natesan et al., Citation2019; Zhang et al., Citation2021), and Light Detection and Ranging (LiDAR). The use of LiDAR data in forest inventory started over 20 years ago, primarily for determining the height of individual trees and stands, as well as tree density. Recent studies have focused on extracting tree parameters such as crown size (Chamberlain et al., Citation2020; Pyysalo & Hyyppa, Citation2002; Solberg et al., Citation2006), stem volume (Bont et al., Citation2020; Ørka et al., Citation2010; Yao et al., Citation2012), and biomass (Sarrazin et al., Citation2011), as well as identifying tree species (Krüger Geb Amiri et al., Citation2018; Kukkonen et al., Citation2019; Yu et al., Citation2017) or types (Heinzel & Koch, Citation2011). Most research on tree species classification using airborne LiDAR data emphasizes extracting structural parameters. A comprehensive review of tree species recognition based on LiDAR-derived metrics can be found in the article by Michałowska and Rapiński (Citation2021). Combining LiDAR with hyperspectral sensors can improve the accuracy of tree species classification by utilizing radiometric features associated with the chemical and morphological properties of tree crowns. Studies have shown the effectiveness of both multispectral and hyperspectral spectra in identifying tree species (Shen & Cao, Citation2017; You et al., Citation2020). Over the years, advancements in remote sensing and deep learning have facilitated the creation of innovative approaches for identifying and mapping vegetation. Airborne mobile mapping systems equipped with high-resolution cameras have become increasingly popular for large-scale mapping. Mobile mapping technology is a method of collecting geospatial data using a moving platform equipped with various sensors, such as high-resolution cameras, laser scanners, GPS (Global Positioning System) receivers, and Inertial Measurement Units (IMUs) (El-Sheimy, Citation1996). The integration of these sensors allows for the accurate collection of georeferenced 3D geospatial data, as well as visual information.
The acquired data can be utilized for a range of purposes, including mapping (Wu et al., Citation2019), modeling (Huang et al., Citation2019), inspection, inventorying, and analysis (de Vries et al., Citation2020; Michałowska, Citation2020). In recent decades, airborne mobile mapping data has been captured using manned aircraft, helicopters, and small unmanned aerial vehicles. The technology of unmanned aerial vehicles (UAVs) is currently experiencing rapid development, as new sensors and methods are being introduced to the market, providing new opportunities for remote sensing tasks (Nevalainen et al., Citation2017). UAV technologies have already been utilized for tree identification and classification. While some studies have used hyperspectral sensors on UAVs for this purpose (Franklin & Ahmed, Citation2018; Franklin et al., Citation2017; Nevalainen et al., Citation2017; Sothe et al., Citation2019), others have utilized digital cameras and deep learning for identifying tree species (Onishi & Ise, Citation2021; Safonova et al., Citation2019). It is worth mentioning that digital cameras provide valuable information about the objects being inspected, including color and texture data, which can help in the identification of various materials and surface patterns. According to Kattenborn et al. (Citation2019), when utilizing very high spatial resolution data to differentiate between specific tree species, the spatial patterns derived from leaf forms, canopy shapes, and branching patterns hold greater importance than the spectral information.

The use of images for object recognition and classification is a powerful way to automate the analysis of vast amounts of visual data. Object recognition and classification technology can be applied to a range of images at various scales, levels of detail, and resolutions. It has been extensively tested using brightfield microscopy images (Hung & Carpenter, Citation2017; Wang et al., Citation2019), diagnostic radiographs (Gamdha et al., Citation2021; Mahdi et al., Citation2020; Mahdy et al., Citation2020), and satellite imagery (Tsai & Chen, Citation2017). In recent years, there has been substantial progress in image classification methods, which can be broadly divided into two categories: pixel-based (Goldblatt et al., Citation2018; Liu et al., Citation2020; Nogueira et al., Citation2019) and object-oriented (Darwish et al., Citation2003; Liu & Xia, Citation2010). Pixel-based classification approaches treat each pixel as a separate unit for analysis and primarily focus on the band spectral intensity of the pixels, ignoring spatial relationships and contextual information (Ke et al., Citation2010). In recent years, deep learning (DL), as a subset of machine learning, has gained substantial attention, resulting in the widespread availability of data and software. There has been growing interest in utilizing deep learning techniques for tree species classification using remote sensing data, a trend observed for nearly a decade (Guo et al., Citation2022; He et al., Citation2023). The advancement of deep learning technology and neural networks (LeCun et al., Citation2015) has led to a proliferation of researchers utilizing neural networks for automated feature extraction, thereby eliminating the need for the manual feature selection that was prevalent in earlier studies (Crisci et al., Citation2012; Immitzer et al., Citation2012). Typical deep learning networks encompass Convolutional Neural Networks (CNNs), stacked autoencoders, deep belief networks, and recurrent neural networks. Among these networks, CNNs demonstrate significant potential and have already achieved successful applications in various remote sensing tasks, including image classification, object detection, image registration, and segmentation (Ciregan et al., Citation2012; Mallick et al., Citation2019; MS et al., Citation2022; Yu & Zahidi, Citation2023). Representative CNN models include ResNet (He et al., Citation2016), AlexNet (Krizhevsky et al., Citation2017), GoogLeNet (Szegedy et al., Citation2015), Visual Geometry Group (Simonyan & Zisserman, Citation2014), and DenseNet (Li et al., Citation2021). CNNs have demonstrated promising outcomes when employed for the classification of tree species (Kattenborn et al., Citation2021). Song et al. (Citation2019) proposed attention branch-based convolutional neural networks (ABCNN) to identify tree species with highly similar leaves and tested them on the Leafsnap dataset of highly similar tree leaves, achieving an overall classification accuracy of 91.43%. Yan et al. (Citation2021) studied the recognition of six tree species using high-resolution satellite remote sensing imagery and achieved an overall accuracy of 82.7% based on a modified GoogLeNet. Other studies have demonstrated that the ResNet model can achieve impressive performance in image classification (Al-Haija & Adebanjo, Citation2020; Firat & Hanbay, Citation2021; Li & Lima, Citation2021; Sarwinda et al., Citation2021; Wen et al., Citation2020). He et al.
(Citation2016) have shown that ResNet50 significantly improves classification accuracy on the ImageNet dataset, a benchmark dataset for image classification, when compared to traditional image classification methods. Cao and Zhang (Citation2020) developed the Res-UNet network and achieved an overall classification accuracy of 87% for classifying six tree species based on airborne orthophotos. Li et al. (Citation2021) studied Faster R-CNN models for recognizing 10 tree species based on whole-tree images and achieved a classification accuracy of 98% using ResNet-50. There have also been extensive investigations into training CNN models using UAV-based RGB images for tree species classification. Deep learning techniques have led to exceptionally accurate species classification predictions on centimeter-level resolution imagery acquired from unmanned aerial vehicles (Schiefer et al., Citation2020). Onishi and Ise (Citation2018) achieved up to 89% accuracy in classifying seven different types of trees using high-resolution UAV images. Natesan et al. (Citation2019) obtained an 80% classification accuracy for distinguishing three types of tree species (red pine, white pine, and non-pine) in a coniferous mixed forest, utilizing Residual Neural Networks. Schiefer et al. (Citation2020) utilized a U-Net convolutional neural network to classify nine dominant tree species in temperate forests, resulting in an overall classification accuracy of 83.2%. Natesan et al. (Citation2020) employed DenseNet for the classification of five predominant species of coniferous trees using multitemporal images captured under varying acquisition parameters, encompassing seasonal, temporal, illumination, and angular variability. The results showed an overall classification accuracy of over 84%. Zhang et al. (Citation2021) achieved an overall accuracy of 92.6% when classifying 10 urban tree species using a ResNet50 model. Onishi and Ise (Citation2021) applied a ResNet model for the classification of dominant tree species in coniferous mixed forests, utilizing UAV-based RGB imagery, and improved overall classification accuracy by 5.8–13.1% compared with a support vector machine algorithm. Recently, the utilization of multispectral data has become common in tree species classification with various machine learning and deep learning techniques. Franklin (Citation2018) employed a machine learning algorithm solely on RGB images in a mixed-wood forest with different dominance proportions, achieving an overall classification accuracy of 69%. However, when combined with multispectral imagery, the accuracy improved to 80%. Brovkina et al. (Citation2018) achieved a comparable level of accuracy using multispectral images alone, albeit in a less complex stand.

The evolution of machine and deep learning software has led to increased user-friendliness, allowing individuals lacking a substantial background in computer science to independently utilize the most advanced algorithms to address their specific problems and datasets (Wäldchen et al., Citation2018). In the context of image recognition, a deep learning model is trained on a dataset of labeled images to categorize objects in new images. The model acquires the visual properties of various objects through analysis of shape, texture, and color, or through feature extraction, allowing it to accurately identify objects in new images. Image-based object recognition offers several significant benefits, including its ability to process images quickly and identify objects in real time (Redmon et al., Citation2016; Shotton et al., Citation2013). In terms of cost, image recognition technology is often more cost-effective compared to alternative methods of object identification, such as manual identification by a team of experts. Furthermore, image recognition systems are non-invasive and do not require the physical collection of samples. The scalability of these systems also allows for the identification of a large number of objects in a short period of time. However, using images for object recognition has several limitations, especially for tree species identification. These limitations include variations in appearance caused by factors such as tree age, season, and location. Images captured in low-light or harsh lighting conditions may lack the detail needed for accurate object recognition, and low-resolution images may not provide enough detail either. Additionally, objects in the foreground or background of an image may obscure the target object, making it challenging to identify accurately. Some objects may also have similar appearances, making it difficult to differentiate between them based on an image alone. Furthermore, the accuracy of an object recognition model is often limited by the quality and quantity of the available training data. If a model is trained using a limited number of images, it may not be able to accurately recognize objects in new images.

The aim of this study was twofold: to evaluate the performance of a classification model generated with a combination of ortho and oblique imagery obtained from a mobile mapping system, and to automate the process of identifying tree species. The research was conducted using the open-source ML.NET framework, which enabled the automation of recognizing and classifying the selected tree species.

The study was conducted in collaboration with a mobile mapping company operating in the Scandinavian market to support vegetation maintenance in the power grid. The objective of the study was to identify two dominant tree species in the region, pine and spruce, which are widely distributed in Sweden, Norway, and Finland.

Material and methods

Research outline

The study aimed to classify two tree species, pine and spruce, based on images acquired using an airborne mobile mapping system (MMS). The MMS used in the study included a Riegl VUX-240 laser scanner, two PhaseOne digital cameras (iXU 100 Mpx), a Trimble GPS receiver, and an Ekinox-N IMU. The data was collected in the Bielsko-Biała region of Poland using a helicopter-mounted MMS in two campaigns, during the leaf-off and leaf-on seasons, with different configurations of the sensor components. During leaf-off data collection, one digital camera was mounted below the helicopter to capture images for creating ortho images. The second camera was mounted at the front of the helicopter to capture oblique images of the area. In the leaf-on campaign, the digital cameras captured oblique images in the front and rear directions of the flying platform. The cameras were mounted at an angle of approximately 45 degrees. A cumulative scheme showing the camera positions used in both campaigns is depicted in Figure 1.

Figure 1. Scheme of camera mounting on the mobile mapping system - the green color corresponds to the camera mounted in the front direction, the red color corresponds to the vertical camera, and the blue color corresponds to the camera mounted in the rear direction.


The trajectory of the flying platform was calculated using Inertial Explorer software. PhaseOne images were converted from IIQ to JPG format using Capture One Pro software. Inertial Explorer is a software tool commonly used to process and analyze data from inertial measurement units, while Capture One Pro is a professional image editing software that allows users to convert, edit, and enhance images.

Dataset

The dataset used for deep learning included both ortho and oblique images. These images were generously provided by a partner company.

Ortho images

The ortho images for the leaf-off season were an essential resource for accurately identifying and classifying different types of trees in the study area. The high resolution of the images allowed for the precise detection and labeling of tree species, providing the necessary data for building a deep learning model for tree classification. Annotating images captured during the leaf-off season, when the trees were leafless, eliminated the possibility of misclassification between tree species. Figure 2 shows an example of an ortho image that was used to train the deep learning model.

Figure 2. Part of the ortho image prepared for the study area.


Oblique images

In addition to the ortho images, oblique images were used. During the aerial acquisition process, images of the study area were collected from both the front and rear directions of the flight path. These images were captured in leaf-on conditions, meaning that the trees were in a state of full foliage.

To prepare these images for use in tree species labeling and in building deep learning models, they were resized using IrfanView software. Images were converted from 11,608 × 8,708 to 2,000 × 2,000 pixels. The resizing process was performed in a way that preserved all of the trees' characteristics. The output images were used to label tree species and create a dataset for the deep learning model. The labeling process involved identifying and classifying the two types of trees based on their visual characteristics. Figure 3 presents samples of images captured by the PhaseOne cameras.
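For illustration, this batch resize can also be scripted rather than done interactively; the following minimal C# sketch uses System.Drawing and assumes hypothetical file names (the study itself performed this step in IrfanView):

```csharp
using System.Drawing;
using System.Drawing.Drawing2D;
using System.Drawing.Imaging;

class ResizeDemo
{
    static void Main()
    {
        // Hypothetical paths for illustration only.
        const string inputPath = "oblique_raw.jpg";   // 11,608 x 8,708 px source image
        const string outputPath = "oblique_2000.jpg"; // 2,000 x 2,000 px training image

        using var source = new Bitmap(inputPath);
        using var resized = new Bitmap(2000, 2000);
        using (var g = Graphics.FromImage(resized))
        {
            // High-quality bicubic interpolation helps preserve crown texture.
            g.InterpolationMode = InterpolationMode.HighQualityBicubic;
            g.DrawImage(source, 0, 0, 2000, 2000);
        }
        resized.Save(outputPath, ImageFormat.Jpeg);
    }
}
```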

Figure 3. Samples of the resized images from the PhaseOne cameras in the front and rear directions.


ML.NET framework

Machine learning and deep learning are two distinct branches of artificial intelligence. In essence, machine learning refers to the capability of AI systems to autonomously adapt and improve with minimal human intervention. On the other hand, deep learning constitutes a subset of machine learning that harnesses artificial neural networks to emulate the intricate learning mechanisms observed in the human brain.

In this study, we employed the ML.NET framework to facilitate the training and deployment of a model aimed at classifying two distinct species with the Deep Neural Network (DNN) functionality offered by the framework. ML.NET is a free and open-source cross-platform machine learning framework developed by Microsoft. The framework was designed with the aim of making machine and deep learning more accessible to developers, enabling them to utilize a single framework for the integration, testing, and deployment of machine learning pipelines. ML.NET supports a wide range of machine learning scenarios, including data classification, image classification, value prediction, object detection, and recommendation (Ahmed et al., Citation2019). A comprehensive presentation of the ML.NET architecture and the application demands that shaped it can be found in Ahmed et al. (Citation2019). Detailed ML.NET documentation can be found on the Microsoft website (https://learn.microsoft.com/en-us/dotnet/machine-learning/).
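As a concrete illustration of how such a pipeline looks in code, the sketch below assembles an ML.NET image classification pipeline. It is a minimal example rather than the authors' exact code; the input class, column names, image folder, and helper method are assumptions made for illustration (the trainer lives in the Microsoft.ML.Vision package):

```csharp
using System.Collections.Generic;
using Microsoft.ML;

// Hypothetical input schema: a path to an image on disk plus its class label.
public class ImageInput
{
    public string ImagePath { get; set; }
    public string Label { get; set; } // e.g. "pine" or "spruce"
}

class TrainDemo
{
    static void Main()
    {
        var mlContext = new MLContext(seed: 0);

        // Assume a list of labeled image paths collected from the dataset folder.
        IEnumerable<ImageInput> samples = LoadSamples(); // hypothetical helper
        IDataView data = mlContext.Data.LoadFromEnumerable(samples);

        var pipeline = mlContext.Transforms.Conversion
                .MapValueToKey("LabelKey", "Label")               // class names -> key type
            .Append(mlContext.Transforms.LoadRawImageBytes(
                "Image", "dataset", "ImagePath"))                 // read image bytes from disk
            .Append(mlContext.MulticlassClassification.Trainers
                .ImageClassification("LabelKey", "Image"))        // DNN trainer (transfer learning)
            .Append(mlContext.Transforms.Conversion
                .MapKeyToValue("PredictedLabel"));                // key -> readable class name

        ITransformer model = pipeline.Fit(data);
        mlContext.Model.Save(model, data.Schema, "TreeModel.zip");
    }

    static IEnumerable<ImageInput> LoadSamples() => new List<ImageInput>(); // placeholder
}
```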

The ML.NET framework includes Model Builder, an intuitive graphical Visual Studio extension that allows developers to easily generate, train, and deploy machine learning models without requiring any machine learning expertise (Yu & Zahidi, Citation2023). The main steps in the Model Builder workflow include choosing a scenario, defining the training environment (local or cloud-based), importing input data for training, training the model using the imported dataset, evaluating the performance of the generated model, and consuming the model by integrating the generated code into an application for image classification (Figure 4). In this study, training was done locally on a computer using a CPU (Intel Core i5-1135G7; 8 GB of Random Access Memory (RAM)).
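The final "consume" step corresponds to code similar to the following sketch, which loads a saved model and classifies a single image; the input/output classes and file names are hypothetical and mirror the training sketch above:

```csharp
using Microsoft.ML;

// Hypothetical schemas mirroring the training pipeline above.
public class ModelInput
{
    public string ImagePath { get; set; }
    public string Label { get; set; }
}

public class ModelOutput
{
    public string PredictedLabel { get; set; }
    public float[] Score { get; set; }
}

class ConsumeDemo
{
    static void Main()
    {
        var mlContext = new MLContext();
        ITransformer model = mlContext.Model.Load("TreeModel.zip", out _);

        // PredictionEngine is convenient for single-image, in-process predictions.
        var engine = mlContext.Model.CreatePredictionEngine<ModelInput, ModelOutput>(model);

        // The path is relative to the image folder configured during training.
        ModelOutput result = engine.Predict(new ModelInput { ImagePath = "sample.jpg" });
        System.Console.WriteLine($"Predicted species: {result.PredictedLabel}");
    }
}
```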

Figure 4. Workflow of ML.NET Model Builder.


The ML.NET framework performs tests on multiple trainer algorithms and chooses the best one found for the final implementation. For image classification, ML.NET utilizes a DNN with the Residual Network model (DNN+ResNet-50). DNNs are composed of multiple layers of artificial neurons, which learn to represent complex patterns in data, allowing them to generalize to new, unseen data (LeCun et al., Citation2015). The ResNet model was introduced by He et al. in 2015 and in the same year won the ILSVRC classification task. The ResNet architecture is known for its ability to train deep neural networks with a large number of layers without encountering gradient dissipation and degradation problems (Habibzadeh et al., Citation2018; Li & Lima, Citation2021). ResNet achieves this by using so-called residual connections, which allow gradients to flow more easily through the network during training (He et al., Citation2016). In the context of image classification, ResNet50 is a specific implementation of the ResNet architecture that contains a 50-layer convolutional neural network (48 convolutional layers, one MaxPool layer, and one average pool layer).

The model architecture is composed of six distinct stages (Figure 5), each with a specific function. The first stage serves as the input component of the network and is made up of Convolutional and Max Pooling layers. The Convolutional layers are responsible for extracting features from the input image by convolving it with a set of learnable filters, while the Max Pooling layers downsample the feature maps and retain only the most relevant information. The subsequent stages, stages 2 through 5, are the residual modules that form the majority of the network. These modules consist of Convolutional Blocks and Identity Blocks. The Convolutional Blocks consist of multiple consecutive convolutional layers that are used to extract high-level features from the input, while the Identity Blocks are designed to facilitate the flow of gradients during backpropagation and prevent vanishing gradients. The Identity Block includes a shortcut connection that bypasses the convolutional layers, allowing the gradient to flow more smoothly through the network. The final stage, stage 6, serves as the output component of the network and is typically composed of fully connected layers and a softmax activation function. The fully connected layers transform the features extracted by the convolutional layers into a form suitable for classification, while the softmax function produces a probability distribution over the classes (Yu & Zahidi, Citation2023). ResNet50 has over 23 million trainable parameters.
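The shortcut connection described above can be stated compactly. Following He et al. (Citation2016), a residual block computes:

```latex
% Residual block: the stacked convolutional layers learn the residual
% mapping F(x, {W_i}); the identity shortcut adds the input x back unchanged.
y = \mathcal{F}(x, \{W_i\}) + x
```

Because the shortcut adds x directly, the gradient always has a path around the residual mapping during backpropagation, which is what mitigates vanishing gradients in very deep networks.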

Figure 5. ResNet-50 model architecture.


The DNN+ResNet50 architecture requires significant computational resources for training and a large amount of labeled data to produce good results.

Methodology

This section outlines the methodology for tree species classification using aerial mapping data and the ML.NET framework.

Deep learning model

Deep learning involves creating a model that is trained to recognize certain types of patterns. The model is trained on a dataset by providing it with an algorithm that it can use to reason and learn from the data. Once the model has been trained, it can be used to make inferences and predictions about new data it has not seen before.

Labeling and annotations

Training a deep learning model for image classification involves the crucial process of annotating and labeling images. Annotation involves manually drawing bounding boxes around objects in an image, while labeling refers to assigning a class name or label to each annotation. These labels serve as metadata and are used to train the model in recognizing the objects. Although manual annotation and labeling are tedious tasks, they are an integral aspect of supervised learning and play a crucial role in the success of the model.

In this study, annotations for the tree species were generated utilizing LabelImg, an open-source software tool designed for deep learning applications (Tzutalin, Citation2015). LabelImg is a user-friendly program that allows object class information to be identified and saved along with the position information marked within the image. The application has already been widely adopted and utilized in several previous studies across various fields of research (Etienne et al., Citation2021; Khosravian et al., Citation2021; Roberts et al., Citation2020; Xuan et al., Citation2022; Zhou et al., Citation2022).
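By default, LabelImg writes each annotation to a Pascal VOC XML file; the snippet below illustrates that structure with hypothetical file names and box coordinates:

```xml
<annotation>
    <folder>ortho</folder>
    <filename>ortho_tile_042.jpg</filename>  <!-- hypothetical image name -->
    <size>
        <width>2000</width>
        <height>2000</height>
        <depth>3</depth>
    </size>
    <object>
        <name>pine</name>                    <!-- class label assigned in LabelImg -->
        <bndbox>                             <!-- pixel coordinates of the bounding box -->
            <xmin>812</xmin>
            <ymin>405</ymin>
            <xmax>1016</xmax>
            <ymax>638</ymax>
        </bndbox>
    </object>
</annotation>
```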

Two sets of annotations were prepared using LabelImg: one set for the ortho images and another set for the oblique images. Figure 6 presents screenshots of the annotation process using LabelImg on the ortho and oblique image datasets. On the ortho images, a total of 107 annotations for pine trees and 102 annotations for spruce trees were created. Similarly, on the oblique images, 150 annotations each were added for pine and spruce trees. These annotations were needed to prepare input images for the deep learning training.

Figure 6. Example of labeling with LabelImg software on an oblique image.


Data processing and augmentation

The annotated images were pre-processed by cropping to the border of the bounding box. This isolated the tree species from the surrounding background and ensured that the deep learning model was trained on the relevant features of the objects. The bounding box is a rectangular region that outlines the object of interest. Figures 7 and 8 present examples of ortho and oblique imagery of pine and spruce clipped to their bounding boxes, which were used as input data for data augmentation.
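A minimal C# sketch of this cropping step, using System.Drawing and hypothetical coordinates read from a LabelImg bounding box:

```csharp
using System.Drawing;
using System.Drawing.Imaging;

class CropDemo
{
    static void Main()
    {
        // Hypothetical box (x, y, width, height) derived from a LabelImg XML file:
        // xmin=812, ymin=405, xmax=1016, ymax=638 -> width=204, height=233.
        var boundingBox = new Rectangle(812, 405, 204, 233);

        using var source = new Bitmap("ortho_tile_042.jpg");
        // Clone only the pixels inside the bounding box, isolating the tree crown
        // from the surrounding background.
        using Bitmap crop = source.Clone(boundingBox, source.PixelFormat);
        crop.Save("pine_0001.jpg", ImageFormat.Jpeg);
    }
}
```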

Figure 7. Examples of ortho images of pine (above) and spruce (below) clipped to bounding boxes.


Figure 8. Examples of oblique images of pine (above) and spruce (below) clipped to bounding boxes.


To mitigate the impact of a limited annotated image dataset on the performance of a deep learning model, various data augmentation techniques have been deployed and utilized in previous studies (Barshooi & Amirkhani, Citation2022; Mikołajczyk & Grochowski, Citation2018; Perez & Wang, Citation2017). In this study, classical image transformations were performed to artificially expand the dataset and improve model performance (Fukuda et al., Citation2020). These included modifications to the brightness (B), contrast (C), saturation (S), and sharpness (Sh) of the images, as well as automatic color adjustment (Aca), flipping the images horizontally (Hf) and vertically (Vv), and rotating them left (Rl) and right (Rr). The data augmentation was carried out using IrfanView version 4.60 software, which has been used as a data augmentation tool in multiple previous studies (Fukuda et al., Citation2020; Kuwada et al., Citation2020; Murata et al., Citation2019). As a result of these techniques, two datasets were prepared for training the deep learning model: the ORTHO set, created from ortho images, and the ORTHO-OBLIQUE set, created from the combination of ortho and oblique imagery. The details of the applied data augmentation techniques are summarised in Table 1. Table 2 describes the dataset composition, indicating the number of images utilized for training the deep learning model.

Table 1. List of datasets for training with information about the applied data augmentation techniques (e.g. B+50 corresponds to a brightness change of +50). Abbreviations are as defined above.

Table 2. List of training datasets with a number of samples.
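The classical transformations listed above can be approximated programmatically as well; the following C# sketch (System.Drawing) reproduces the flips, rotations, and a brightness shift. It only approximates the IrfanView operations used in the study, and all file names are placeholders:

```csharp
using System.Drawing;
using System.Drawing.Imaging;

class AugmentDemo
{
    static void Main()
    {
        SaveFlipped("pine_0001.jpg", "pine_0001_hf.jpg", RotateFlipType.RotateNoneFlipX);   // Hf
        SaveFlipped("pine_0001.jpg", "pine_0001_vf.jpg", RotateFlipType.RotateNoneFlipY);   // Vv
        SaveFlipped("pine_0001.jpg", "pine_0001_rl.jpg", RotateFlipType.Rotate270FlipNone); // Rl
        SaveFlipped("pine_0001.jpg", "pine_0001_rr.jpg", RotateFlipType.Rotate90FlipNone);  // Rr
        SaveBrightened("pine_0001.jpg", "pine_0001_b50.jpg", 50);                           // B+50
    }

    static void SaveFlipped(string input, string output, RotateFlipType flip)
    {
        using var img = new Bitmap(input);
        img.RotateFlip(flip); // in-place flip or 90-degree rotation
        img.Save(output, ImageFormat.Jpeg);
    }

    static void SaveBrightened(string input, string output, int delta)
    {
        using var src = new Bitmap(input);
        using var dst = new Bitmap(src.Width, src.Height);
        float t = delta / 255f; // additive brightness shift expressed in [0, 1]
        var matrix = new ColorMatrix(new[]
        {
            new float[] { 1, 0, 0, 0, 0 },
            new float[] { 0, 1, 0, 0, 0 },
            new float[] { 0, 0, 1, 0, 0 },
            new float[] { 0, 0, 0, 1, 0 },
            new float[] { t, t, t, 0, 1 }, // translation row shifts R, G, B equally
        });
        using var attrs = new ImageAttributes();
        attrs.SetColorMatrix(matrix);
        using var g = Graphics.FromImage(dst);
        g.DrawImage(src, new Rectangle(0, 0, src.Width, src.Height),
            0, 0, src.Width, src.Height, GraphicsUnit.Pixel, attrs);
        dst.Save(output, ImageFormat.Jpeg);
    }
}
```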

In order to validate the image classification model, a sample of ortho images was selected for analysis. The sample comprised 40 images of spruce and 42 images of pine, representing approximately 40% of the labeled samples for each species in the LabelImg software (102 labels were assigned to spruce and 107 labels to pine). To ensure a consistent and accurate evaluation, the validation set did not include any images generated during the augmentation process. The remaining images were used to train the model to identify and classify the tree species under investigation in the study area.

Model evaluation

Evaluating a deep learning model is crucial in assessing its performance and pinpointing areas for improvement.

The Model Builder of the ML.NET framework employs a trained model to make predictions on new test data and measures the accuracy of these predictions. To accomplish this, it divides the training data into two subsets: a training set (80%) that is used to train the model and a test set (20%) that is held back to evaluate the model's performance. Predictions are considered accurate if the model assigns a probability of belonging to a particular category that is greater than 50%. Evaluation metrics in ML.NET are specific to the type of machine learning task that a model performs. For the classification task, the model is evaluated by measuring how well a predicted category matches the actual category. ML.NET's Model Builder also optimizes the training time based on the specified set of images for image classification (https://learn.microsoft.com, 29.01.2023).
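This split-and-evaluate procedure can also be invoked directly against the ML.NET API; a hedged sketch, with column names following the pipeline example above:

```csharp
using Microsoft.ML;

class EvaluateDemo
{
    static void Evaluate(MLContext mlContext, IDataView data, IEstimator<ITransformer> pipeline)
    {
        // Hold back 20% of the data for testing, as Model Builder does.
        var split = mlContext.Data.TrainTestSplit(data, testFraction: 0.2);

        ITransformer model = pipeline.Fit(split.TrainSet);
        IDataView predictions = model.Transform(split.TestSet);

        // Multiclass metrics include the micro- and macro-averaged scores
        // reported by Model Builder.
        var metrics = mlContext.MulticlassClassification.Evaluate(
            predictions, labelColumnName: "LabelKey");
        System.Console.WriteLine($"Micro accuracy: {metrics.MicroAccuracy:F4}");
        System.Console.WriteLine($"Macro accuracy: {metrics.MacroAccuracy:F4}");
    }
}
```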

To evaluate the performance of an image classification model, both micro-average and macro-average accuracy are calculated in the ML.NET framework. The micro average computes a global F1 score by summing the True Positives (TP), False Negatives (FN), and False Positives (FP) over all classes (Equation 1). It measures the aggregate performance of the classifier over all instances. The closer the F1 value is to 1.0, the better the result. An F1 score of 1.0 means perfect precision and recall.

F1 = TP / (TP + ½ (FP + FN))    (1)

where:

F1 - micro-averaged F1 score,

TP – true positives,

FP – false positives,

FN – false negatives.

The macro-average F1 score is computed as the arithmetic mean of the per-class F1 scores. Every class contributes equally to the metric, regardless of how many instances of that class the dataset contains, so minority classes are given the same weight as the larger classes. The closer the macro F1 score is to 1.0, the better the result.
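To make the difference between the two averages concrete, the sketch below computes both from per-class confusion counts; the counts themselves are placeholders, not results from this study:

```csharp
using System;
using System.Linq;

class F1Demo
{
    // Per-class confusion counts: true positives, false positives, false negatives.
    record ClassCounts(string Name, int TP, int FP, int FN);

    static void Main()
    {
        // Hypothetical counts for a two-class problem.
        var classes = new[]
        {
            new ClassCounts("pine",   TP: 90, FP: 5,  FN: 10),
            new ClassCounts("spruce", TP: 85, FP: 10, FN: 5),
        };

        // Micro average: pool all counts first, then apply Equation (1).
        int tp = classes.Sum(c => c.TP);
        int fp = classes.Sum(c => c.FP);
        int fn = classes.Sum(c => c.FN);
        double microF1 = tp / (tp + 0.5 * (fp + fn));

        // Macro average: compute F1 per class, then take the unweighted mean,
        // so minority classes count as much as majority classes.
        double macroF1 = classes.Average(c => c.TP / (c.TP + 0.5 * (c.FP + c.FN)));

        Console.WriteLine($"Micro F1: {microF1:F4}, Macro F1: {macroF1:F4}");
    }
}
```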

The authors conducted an independent evaluation of the model’s accuracy using a dedicated validation set to determine the overall accuracy of species classification. Overall accuracy is a metric that measures the proportion of correct predictions made by the model out of all predictions. It is calculated by dividing the number of correct predictions by the total number of predictions. The validation set comprised approximately 40% of the original images for each species, specifically excluded from the augmentation processes.

To compare with the results obtained using the ML.NET framework, two independent models were built using the InceptionV3 and MobileNetV2 architectures. These models were trained on the dataset of images that produced the best outcomes within the ML.NET framework, and both were constructed and validated using the same dataset.
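In the ML.NET API, the backbone is selected through ImageClassificationTrainer.Options, so the architectures compared here differ in a single enum value (ML.NET exposes ResNet-50 as the ResnetV250 architecture); a minimal sketch, with column names assumed from the earlier examples:

```csharp
using Microsoft.ML;
using Microsoft.ML.Vision;

class ArchitectureDemo
{
    static IEstimator<ITransformer> MakeTrainer(
        MLContext mlContext, ImageClassificationTrainer.Architecture arch)
    {
        var options = new ImageClassificationTrainer.Options
        {
            FeatureColumnName = "Image",  // raw image bytes column
            LabelColumnName = "LabelKey", // key-typed label column
            Arch = arch,                  // backbone selected by the caller
        };
        return mlContext.MulticlassClassification.Trainers.ImageClassification(options);
    }

    static void Demo(MLContext mlContext)
    {
        // The three backbones compared in this study.
        var resnet    = MakeTrainer(mlContext, ImageClassificationTrainer.Architecture.ResnetV250);
        var inception = MakeTrainer(mlContext, ImageClassificationTrainer.Architecture.InceptionV3);
        var mobilenet = MakeTrainer(mlContext, ImageClassificationTrainer.Architecture.MobilenetV2);
    }
}
```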

Results and discussion

In this study, deep learning algorithms were applied to the ORTHO and ORTHO-OBLIQUE datasets to train the models. The training process was carried out using ML.NET and the Model Builder tool, which automatically adjusts the training time to the size of the training dataset. The training duration ranged from 44 minutes for a dataset of 1742 images to 124 minutes for a dataset of 4384 images. The Model Builder tool employed DNN+ResNet50 to construct the models and automatically computed the performance of the best model using metrics such as micro accuracy and macro accuracy. The training process results are presented in Table 3, which summarises the achieved results along with the corresponding training times.

Table 3. Details of training and model parameters generated with ML.NET Model Builder.

It is crucial to mention that the models in this study were built using the default parameters provided by the Microsoft ML.NET framework. These default parameters were selected to ensure consistency and compatibility with the ML.NET framework and its image classification process. The purpose was to demonstrate that even without in-depth programming knowledge and expertise in machine learning and deep learning, the framework’s default parameters can yield satisfactory results. However, it is worth noting that fine-tuning the parameters of the deep learning process can further enhance the efficiency and performance of the models. While this study focused on utilizing the default parameters to showcase the accessibility of ML.NET for achieving results, future research or practical applications using the ML.NET framework may involve parameter refinement to maximize the model’s performance in specific contexts. By adjusting parameters such as learning rate, batch size, or network architecture, it is possible to optimize the models for specific datasets and improve their accuracy and generalization capabilities.

Once the models were built for each dataset, a validation process of species classification was carried out using the validation set. The classification map for the validation set is presented in Table 4, showcasing the results for each model built on the specified sets. The first row corresponds to the validation set of pine trees, while the second row corresponds to spruce. The results of this validation, together with the generated accuracy, are presented in Table 5.

Table 4. Classification map for the validation set for each DNN+ResNet50 model. Green indicates correct classification; red indicates incorrect classification.

Table 5. Details of model validation on a validation set.

After a thorough analysis, we determined that the ORTHO-OBLIQUE dataset was the best-performing, achieving an overall accuracy of 93.9%. This dataset comprised 4384 images, including both ortho and oblique images, and required 124 minutes for training. Pine and spruce classification accuracies were 92.5% and 95.2%, respectively. The model’s micro and macro accuracy were also calculated, yielding values of 0.9878 and 0.9879, respectively, indicating good generalization to the validation data.

The ORTHO dataset, which only used ortho images, generated an overall accuracy of 90.2%. The ORTHO-OBLIQUE dataset outperformed it due to the inclusion of images from different perspectives in the training dataset: it included both ortho and oblique images, providing top-down and side views of trees, whereas the ORTHO dataset only included top-down images. The diversity of image perspectives in the training dataset contributed to the improved results achieved by the ORTHO-OBLIQUE dataset. The input images used in our study were acquired under two specific conditions: leaf-off and leaf-on. We made a deliberate choice not to elaborate on the details of these data acquisition conditions because our classification task specifically targeted coniferous species. Moreover, there are no noticeable variations between the two seasons that are relevant to these particular types of trees.

To facilitate comparative analysis, two additional models were developed on the ORTHO-OBLIQUE dataset, which exhibited the most favorable outcomes within the ML.NET framework using DNN+ResNet50. The models were trained within a C# console application, employing the InceptionV3 and MobileNetV2 architectures. The performance of the InceptionV3 model was evaluated by calculating both micro and macro accuracy, resulting in values of 0.9775 for both metrics. The MobileNetV2 model achieved a micro accuracy of 0.9831 and a macro accuracy of the same value. Validation on the dedicated dataset yielded the following outcomes: the InceptionV3 model achieved a classification accuracy of 85% for pine and 80.9% for spruce, resulting in an overall accuracy of 82.9%. In contrast, the MobileNetV2 model attained an overall accuracy of 81.7%, with 77.5% accuracy for pine classification and 85.7% accuracy for spruce classification. Among these models, the ResNet50 model exhibited the best performance, with an overall accuracy of 93.9%. Table 6 presents the classification map of the validation set for all models, while Table 7 provides detailed information about each model's characteristics.

Table 6. Classification map of the validation set for the DNN+ResNet50, InceptionV3, and MobileNetV2 models. Green indicates correct classification; red indicates incorrect classification.

Table 7. Details of models built on ORTHO-OBLIQUE dataset.

There have been limited studies on the utilization of deep learning methods for the classification of two species using high-resolution RGB imagery from UAV platforms. Among these studies, Kattenborn et al. (Citation2019) tested a CNN-based segmentation approach (U-Net) in combination with training data derived directly from visual interpretation of UAV-based high-resolution RGB imagery for fine-grained mapping of two species and demonstrated that this approach had at least 84% overall accuracy. Haq et al. (Citation2021) used a deep learning-based supervised image classification algorithm and images collected using a UAV for the classification of forest areas, with an overall accuracy of 93%. Using Residual Neural Networks, Natesan et al. (Citation2019) achieved an 80% classification accuracy in distinguishing three distinct types of tree species within a coniferous mixed forest. Egli and Höpke (Citation2020) designed a computationally lightweight CNN for the classification of four tree species based on RGB images obtained from automated UAV observations. Their study demonstrated that, regardless of illumination conditions and phenological stages, average classification accuracies of 92% could be achieved. These studies collectively highlight the potential of deep learning methods for tree species classification using high-resolution RGB imagery from UAV platforms, showcasing the range of accuracies achieved and the effectiveness of various approaches.

The findings of a comprehensive review by Michałowska and Rapiński (Citation2021), focused on tree species classification based on airborne LiDAR datasets, revealed that in studies specifically targeting the classification of two tree species, the median accuracy reached 89.76%. Out of the 21 studies investigated, only five achieved overall accuracies in the highest range, from 93% to 97.1%. The results obtained in this study demonstrated an overall accuracy of 93.9%, which falls within the range of the best classification results reported in studies conducted on LiDAR data (these studies encompassed the use of various features such as geometric properties, radiometric attributes, and features derived from full-waveform decomposition) (Figure 9).

Figure 9. Relationship between the overall accuracy and the number of discriminated species in the studies reviewed in the Michałowska and Rapiński (Citation2021) article. The result of this research is indicated on the chart by a green dot.


Conclusions

Deep learning models have gained popularity in automating image analysis tasks due to their impressive results across a range of applications, including remote sensing and computer vision. The primary objective of this study was to employ the ML.NET framework to construct a deep learning model capable of accurately classifying pine and spruce trees based on ortho and oblique imagery. The authors prepared two distinct datasets of tree samples and trained models in the ML.NET framework utilizing deep neural networks in conjunction with the ResNet-50 architecture. The experiment revealed that the ORTHO-OBLIQUE dataset delivered the highest performance, achieving an overall accuracy of 93.9%. The inclusion of images from multiple perspectives in the training dataset was the key factor contributing to the improved performance of the ORTHO-OBLIQUE dataset. The results suggest that species identification at very high spatial resolutions is facilitated through spatial patterns. This study underscores the effectiveness of the ML.NET framework in constructing deep learning models. The Model Builder feature within ML.NET enables users with limited software development knowledge to handle data training, thereby saving time and effort in model development. This study demonstrates the practical value of ML.NET as an accessible tool for image classification tasks, even for domain experts with limited machine and deep learning experience.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Additional information

Data availability statement

The data that support the findings of this study are available from the corresponding author, MM, upon reasonable request.

References

  • Ahmed, Z., Amizadeh, S., Bilenko, M., Carr, R., Chin, W. S., Dekel, Y., Dupre, X., Eksarevskiy, V., Filipi, S., Finley, T., Goswami, A., Hoover, M., Inglis, S., Interlandi, M., Kazmi, N., Krivosheev, G., Luferenko, P., Matantsev, I., Matusevych, S., … Zhu, Y. (2019, July). Machine learning at Microsoft with ML.NET. Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (pp. 2448–2458).
  • Al-Haija, Q. A., & Adebanjo, A. (2020). Breast cancer diagnosis in histopathological images using ResNet-50 convolutional neural network. 2020 IEEE International IOT, Electronics and Mechatronics Conference (IEMTRONICS) (pp. 1–7). IEEE.
  • Avtar, R., Malik, R., Musthafa, M., Rathore, V. S., Kumar, P., & Singh, G. (2023). Forest plantation species classification using Full-Pol-Time-Averaged SAR scattering powers. Remote Sensing Applications: Society & Environment, 29, 100924. https://doi.org/10.1016/j.rsase.2023.100924
  • Barshooi, A. H., & Amirkhani, A. (2022). A novel data augmentation based on Gabor filter and convolutional deep learning for improving the classification of COVID-19 chest X-Ray images. Biomedical Signal Processing and Control, 72, 103326. https://doi.org/10.1016/j.bspc.2021.103326
  • Bont, L., Hill, A., Waser, L., Bürgi, A., Ginzler, C., & Blattert, C. (2020). Airborne-laser-scanning-derived auxiliary information discriminating between broadleaf and conifer trees improves the accuracy of models for predicting timber volume in mixed and heterogeneously structured forests. Forest Ecology and Management, 459, 117856. https://doi.org/10.1016/j.foreco.2019.117856
  • Brovkina, O., Cienciala, E., Surový, P., & Janata, P. (2018). Unmanned Aerial Vehicles (UAV) for assessment of qualitative classification of Norway spruce in temperate forest stands. Geo-Spatial Information Science, 21(1), 12–20. https://doi.org/10.1080/10095020.2017.1416994
  • Cao, K., & Zhang, X. (2020). An improved res-unet model for tree species classification using airborne high-resolution images. Remote Sensing, 12(7), 1128. https://doi.org/10.3390/rs12071128
  • Chamberlain, C. P., Meador, A. J. S., & Thode, A. E. (2020). Airborne lidar provides reliable estimates of canopy base height and canopy bulk density in southwestern ponderosa pine forests. Forest Ecology and Management, 481, 118695. https://doi.org/10.1016/j.foreco.2020.118695
  • Ciregan, D., Meier, U., & Schmidhuber, J. (2012). Multi-column deep neural networks for image classification. 2012 IEEE Conference on Computer Vision and Pattern Recognition (pp. 3642–3649). IEEE.
  • Crisci, C., Ghattas, B., & Perera, G. (2012). A review of supervised machine learning algorithms and their applications to ecological data. Ecological Modelling, 240, 113–122. https://doi.org/10.1016/j.ecolmodel.2012.03.001
  • Dalponte, M., Bruzzone, L., & Gianelle, D. (2012). Tree species classification in the Southern Alps based on the fusion of very high geometrical resolution multispectral/hyperspectral images and LiDAR data. Remote Sensing of Environment, 123, 258–270. https://doi.org/10.1016/j.rse.2012.03.013
  • Darwish, A., Leukert, K., & Reinhardt, W. (2003). Image segmentation for the purpose of object-based classification. IGARSS 2003. 2003 IEEE International Geoscience and Remote Sensing Symposium. Proceedings (IEEE Cat. No. 03CH37477) (Vol. 3, pp. 2039–2041). IEEE.
  • de Vries, T. N., Bronkhorst, J., Vermeer, M., Donker, J. C., Briels, S. A., Ziar, H., Zeman, M., & Isabella, O. (2020). A quick-scan method to assess photovoltaic rooftop potential based on aerial imagery and LiDAR. Solar Energy, 209, 96–107. https://doi.org/10.1016/j.solener.2020.07.035
  • Egli, S., & Höpke, M. (2020). CNN-Based tree species classification using high resolution RGB image data from automated UAV observations. Remote Sensing, 12(23), 3892. https://doi.org/10.3390/rs12233892.
  • El-Sheimy, N. (1996). The development of VISAT: A mobile survey system for GIS applications. University of Calgary.
  • Etienne, A., Ahmad, A., Aggarwal, V., & Saraswat, D. (2021). Deep learning-based object detection system for identifying weeds using UAS imagery. Remote Sensing, 13(24), 5182. https://doi.org/10.3390/rs13245182
  • Firat, H., & Hanbay, D. (2021). Classification of hyperspectral images using 3D CNN based ResNet50. 2021 29th Signal Processing and Communications Applications Conference (SIU) (pp. 1–4). IEEE.
  • Franklin, S. E. (2018). Pixel-and object-based multispectral classification of forest tree species from small unmanned aerial vehicles. Journal of Unmanned Vehicle Systems, 6(4), 195–211. https://doi.org/10.1139/juvs-2017-0022
  • Franklin, S. E., & Ahmed, O. S. (2018). Deciduous tree species classification using object-based analysis and machine learning with unmanned aerial vehicle multispectral data. International Journal of Remote Sensing, 39(15–16), 5236–5245. https://doi.org/10.1080/01431161.2017.1363442
  • Franklin, S. E., Ahmed, O. S., & Williams, G. (2017). Northern conifer forest species classification using multispectral data acquired from an unmanned aerial vehicle. PE&Rs, Photogrammetric Engineering & Remote Sensing, 83(7), 501–507. https://doi.org/10.14358/PERS.83.7.501
  • Fukuda, M., Ariji, Y., Kise, Y., Nozawa, M., Kuwada, C., Funakoshi, T., Muramatsu, C., Fujita, H., Katsumata, A., & Ariji, E. (2020). Comparison of 3 deep learning neural networks for classifying the relationship between the mandibular third molar and the mandibular canal on panoramic radiographs. Oral Surgery, Oral Medicine, Oral Pathology and Oral Radiology, 130(3), 336–343. https://doi.org/10.1016/j.oooo.2020.04.005
  • Gamdha, D., Unnikrishnakurup, S., Rose, K. J., Surekha, M., Purushothaman, P., Ghose, B., & Balasubramaniam, K. (2021). Automated defect recognition on X-ray radiographs of solid propellant using deep learning based on convolutional neural networks. Journal of Nondestructive Evaluation, 40(1), 1–13. https://doi.org/10.1007/s10921-021-00750-4
  • Gini, R., Passoni, D., Pinto, L., & Sona, G. (2012). Aerial images from an UAV system: 3D modeling and tree species classification in a park area. The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, 39(B1), 361–366. https://doi.org/10.5194/isprsarchives-XXXIX-B1-361-2012
  • Goldblatt, R., Stuhlmacher, M. F., Tellman, B., Clinton, N., Hanson, G., Georgescu, M., Wang, C., Serrano-Candela, F., Khandelwal, A., Cheng, W., & Balling, R. C., Jr. (2018). Using Landsat and nighttime lights for supervised pixel-based image classification of urban land cover. Remote Sensing of Environment, 205, 253–275. https://doi.org/10.1016/j.rse.2017.11.026
  • Guo, X., Li, H., Jing, L., & Wang, P. (2022). Individual tree species classification based on convolutional neural networks and multitemporal high-resolution remote sensing images. Sensors, 22(9), 3157. https://doi.org/10.3390/s22093157
  • Habibzadeh, M., Jannesari, M., Rezaei, Z., Baharvand, H., & Totonchi, M. (2018). Automatic white blood cell classification using pre-trained deep learning models: Resnet and inception. Tenth International Conference on Machine Vision (ICMV 2017) (Vol. 10696, pp. 274–281). SPIE.
  • Haq, M. A., Rahaman, G., Baral, P., & Ghosh, A. (2021). Deep learning based supervised image classification using UAV images for forest areas classification. J Indian Soc Remote Sens, 49(3), 601–606. https://doi.org/10.1007/s12524-020-01231-3
  • Heinzel, J., & Koch, B. (2011). Exploring full-waveform LiDAR parameters for tree species classification. International Journal of Applied Earth Observation and Geoinformation: ITC Journal, 13(1), 152–160. https://doi.org/10.1016/j.jag.2010.09.010
  • Heinzel, J., & Koch, B. (2012). Investigating multiple data sources for tree species classification in temperate forest and use for single tree delineation. International Journal of Applied Earth Observation and Geoinformation, 18, 101–110. https://doi.org/10.1016/j.jag.2012.01.025
  • He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 770–778).
  • He, T., Zhou, H., Xu, C., Hu, J., Xue, X., Xu, L., Lou, X., Zeng, K., & Wang, Q. (2023). Deep learning in forest tree species classification using sentinel-2 on google Earth engine: A case study of Qingyuan County. Sustainability, 15(3), 2741. https://doi.org/10.3390/su15032741
  • Huang, J., Zhang, X., Xin, Q., Sun, Y., & Zhang, P. (2019). Automatic building extraction from high-resolution aerial images and LiDAR data using gated residual refinement network. ISPRS Journal of Photogrammetry and Remote Sensing, 151, 91–105. https://doi.org/10.1016/j.isprsjprs.2019.02.019
  • Hung, J., & Carpenter, A. (2017). Applying faster R-CNN for object detection on malaria images. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops (pp. 56–61).
  • Immitzer, M., Atzberger, C., & Koukal, T. (2012). Tree species classification with random forest using very high spatial resolution 8-band WorldView-2 satellite data. Remote Sensing, 4(9), 2661–2693. https://doi.org/10.3390/rs4092661
  • Kattenborn, T., Eichel, J., & Fassnacht, F. E. (2019). Convolutional neural networks enable efficient, accurate and fine-grained segmentation of plant species and communities from high-resolution UAV imagery. Scientific Reports, 9(1), 17656. https://doi.org/10.1038/s41598-019-53797-9
  • Kattenborn, T., Leitloff, J., Schiefer, F., & Hinz, S. (2021). Review on Convolutional Neural Networks (CNN) in vegetation remote sensing. Isprs Journal of Photogrammetry & Remote Sensing, 173, 24–49. https://doi.org/10.1016/j.isprsjprs.2020.12.010
  • Ke, Y., Quackenbush, L. J., & Im, J. (2010). Synergistic use of QuickBird multispectral imagery and LIDAR data for object-based forest species classification. Remote Sensing of Environment, 114(6), 1141–1154. https://doi.org/10.1016/j.rse.2010.01.002
  • Khosravian, A., Amirkhani, A., Kashiani, H., & Masih-Tehrani, M. (2021). Generalizing state-of-the-art object detectors for autonomous vehicles in unseen environments. Expert Systems with Applications, 183, 115417. https://doi.org/10.1016/j.eswa.2021.115417
  • Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2017). ImageNet classification with deep convolutional neural networks. Communications of the ACM, 60(6), 84–90. https://doi.org/10.1145/3065386
  • Krüger Geb Amiri, N., Heurich, M., Krzystek, P., & Skidmore, A. (2018). Feature relevance assessment of multispectral airborne LiDAR data for tree species classification. The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, XLII-3, 31–34. https://doi.org/10.5194/isprs-archives-XLII-3-31-2018
  • Kukkonen, M., Maltamo, M., Korhonen, L., & Packalen, P. (2019). Multispectral airborne LiDAR data in the prediction of boreal tree species composition. IEEE Transactions on Geoscience and Remote Sensing: A Publication of the IEEE Geoscience and Remote Sensing Society, 57(6), 3462–3471. https://doi.org/10.1109/TGRS.2018.2885057
  • Kuwada, C., Ariji, Y., Fukuda, M., Kise, Y., Fujita, H., Katsumata, A., & Ariji, E. (2020). Deep learning systems for detecting and classifying the presence of impacted supernumerary teeth in the maxillary incisor region on panoramic radiographs. Oral Surgery, Oral Medicine, Oral Pathology and Oral Radiology, 130(4), 464–469. https://doi.org/10.1016/j.oooo.2020.04.813
  • LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436–444. https://doi.org/10.1038/nature14539
  • Li, H., Hu, B., Li, Q., & Jing, L. (2021). CNN-Based individual tree species classification using high-resolution satellite imagery and airborne LiDAR data. Forests, 12(12), 1697. https://doi.org/10.3390/f12121697
  • Li, B., & Lima, D. (2021). Facial expression recognition via ResNet-50. International Journal of Cognitive Computing in Engineering, 2, 57–64. https://doi.org/10.1016/j.ijcce.2021.02.002
  • Li, Y., Tang, B., Li, J., Sun, W., Lin, Z., & Luo, Q. (2021). Research on common tree species recognition by faster R-CNN based on whole tree image. In 2021 IEEE 6th International Conference on Signal and Image Processing (ICSIP) (pp. 28–32). IEEE.
  • Liu, D., & Xia, F. (2010). Assessing object-based classification: Advantages and limitations. Remote Sensing Letters, 1(4), 187–194. https://doi.org/10.1080/01431161003743173
  • Liu, Q., Xiao, L., Yang, J., & Wei, Z. (2020). CNN-enhanced graph convolutional network with pixel-and superpixel-level feature fusion for hyperspectral image classification. IEEE Transactions on Geoscience and Remote Sensing, 59(10), 8657–8671. https://doi.org/10.1109/TGRS.2020.3037361
  • Magnard, C., Morsdorf, F., Small, D., Stilla, U., Schaepman, M. E., & Meier, E. (2016). Single tree identification using airborne multibaseline SAR interferometry data. Remote Sensing of Environment, 186, 567–580. https://doi.org/10.1016/j.rse.2016.09.018
  • Mahdi, F. P., Motoki, K., & Kobashi, S. (2020). Optimization technique combined with deep learning method for teeth recognition in dental panoramic radiographs. Scientific Reports, 10(1), 19261. https://doi.org/10.1038/s41598-020-75887-9
  • Mahdy, L. N., Ezzat, K. A., Elmousalami, H. H., Ella, H. A., & Hassanien, A. E. (2020). Automatic x-ray COVID-19 lung image classification system based on multi-level thresholding and support vector machine. MedRxiv, 2020.03. https://doi.org/10.1101/2020.03.30.20047787
  • Mallick, P. K., Ryu, S. H., Satapathy, S. K., Mishra, S., Nguyen, G. N., & Tiwari, P. (2019). Brain MRI image classification for cancer detection using deep wavelet autoencoder-based deep neural network. IEEE Access, 7, 46278–46287. https://doi.org/10.1109/ACCESS.2019.2902252
  • Michałowska, M. (2020). Verification of building constructions surroundings based on airborne laser scanning data. XV International Conference on Durability of Building Materials and Components (DBMC 2020).
  • Michałowska, M., & Rapiński, J. (2021). A review of tree species classification based on airborne LiDAR data and applied classifiers. Remote Sensing, 13(3), 353. https://doi.org/10.3390/rs13030353
  • Mikołajczyk, A., & Grochowski, M. (2018). Data augmentation for improving deep learning in image classification problem. In 2018 International Interdisciplinary PhD Workshop (IIPhDW) (pp. 117–122). IEEE.
  • MS, M., & SS, S. R. (2022). Optimal squeeze net with deep neural network-based aerial image classification model in unmanned aerial vehicles. Traitement du Signal, 39(1), 275–281. https://doi.org/10.18280/ts.390128
  • Murata, M., Ariji, Y., Ohashi, Y., Kawai, T., Fukuda, M., Funakoshi, T., Kise, Y., Nozawa, M., Katsumata, A., Fujita, H., & Ariji, E. (2019). Deep-learning classification using convolutional neural network for evaluation of maxillary sinusitis on panoramic radiography. Oral Radiology, 35(3), 301–307. https://doi.org/10.1007/s11282-018-0363-7
  • Natesan, S., Armenakis, C., & Vepakomma, U. (2019). ResNet-based tree species classification using UAV images. The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, XLII-2/W13, 475–481. https://doi.org/10.5194/isprs-archives-XLII-2-W13-475-2019
  • Natesan, S., Armenakis, C., & Vepakomma, U. (2020). Individual tree species identification using dense convolutional network (DenseNet) on multitemporal RGB images from UAV. Journal of Unmanned Vehicle Systems, 8(4), 310–333. https://doi.org/10.1139/juvs-2020-0014
  • Nevalainen, O., Honkavaara, E., Tuominen, S., Viljanen, N., Hakala, T., Yu, X., Hyyppä, J., Saari, H., Pölönen, I., Imai, N. N., & Tommaselli, A. M. (2017). Individual tree detection and classification with UAV-based photogrammetric point clouds and hyperspectral imaging. Remote Sensing, 9(3), 185. https://doi.org/10.3390/rs9030185
  • Nogueira, K., dos Santos, J. A., Menini, N., Silva, T. S., Morellato, L. P. C., & Torres, R. D. S. (2019). Spatio-temporal vegetation pixel classification by using convolutional networks. IEEE Geoscience and Remote Sensing Letters, 16(10), 1665–1669. https://doi.org/10.1109/LGRS.2019.2903194
  • Onishi, M., & Ise, T. (2018). Automatic classification of trees using a UAV onboard camera and deep learning. arXiv preprint arXiv:1804.10390.
  • Onishi, M., & Ise, T. (2021). Explainable identification and mapping of trees using UAV RGB image and deep learning. Scientific Reports, 11(1), 903. https://doi.org/10.1038/s41598-020-79653-9
  • Ørka, H. O., Næsset, E., & Bollandsås, O. M. (2010). Effects of different sensors and leaf-on and leaf-off canopy conditions on echo distributions and individual tree properties derived from airborne laser scanning. Remote Sensing of Environment, 114(7), 1445–1461. https://doi.org/10.1016/j.rse.2010.01.024
  • Perez, L., & Wang, J. (2017). The effectiveness of data augmentation in image classification using deep learning. arXiv preprint arXiv:1712.04621.
  • Pyysalo, U., & Hyyppä, H. (2002). Reconstructing tree crowns from laser scanner data for feature extraction. International Archives of Photogrammetry, Remote Sensing and Spatial Information Sciences, 34(3/B), 218–221.
  • Redmon, J., Divvala, S., Girshick, R., & Farhadi, A. (2016). You only look once: Unified, real-time object detection. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 779–788).
  • Roberts, R., Giancontieri, G., Inzerillo, L., & Di Mino, G. (2020). Towards low-cost pavement condition health monitoring and analysis using deep learning. Applied Sciences, 10(1), 319. https://doi.org/10.3390/app10010319
  • Safonova, A., Tabik, S., Alcaraz-Segura, D., Rubtsov, A., Maglinets, Y., & Herrera, F. (2019). Detection of fir trees (Abies sibirica) damaged by the bark beetle in unmanned aerial vehicle images with deep learning. Remote Sensing, 11(6), 643. https://doi.org/10.3390/rs11060643
  • Sarrazin, D., van Aardt, J., Asner, G., Mcglinchy, J., Messinger, D., & Wu, J. (2011). Fusing small-footprint waveform LiDAR and hyperspectral data for canopy-level species classification and herbaceous biomass modeling in savanna ecosystems. Canadian Journal of Remote Sensing, 37(6), 653–665. https://doi.org/10.5589/m12-007
  • Sarwinda, D., Paradisa, R. H., Bustamam, A., & Anggia, P. (2021). Deep learning in image classification using residual network (ResNet) variants for detection of colorectal cancer. Procedia Computer Science, 179, 423–431. https://doi.org/10.1016/j.procs.2021.01.025
  • Schiefer, F., Kattenborn, T., Frick, A., Frey, J., Schall, P., Koch, B., & Schmidtlein, S. (2020). Mapping forest tree species in high resolution UAV-based RGB-imagery by means of convolutional neural networks. ISPRS Journal of Photogrammetry and Remote Sensing, 170, 205–215. https://doi.org/10.1016/j.isprsjprs.2020.10.015
  • Shen, X., & Cao, L. (2017). Tree-species classification in subtropical forests using airborne hyperspectral and LiDAR data. Remote Sensing, 9(11), 1180. https://doi.org/10.3390/rs9111180
  • Shotton, J., Sharp, T., Kipman, A., Fitzgibbon, A., Finocchio, M., Blake, A., Cook, M., & Moore, R. (2013). Real-time human pose recognition in parts from single depth images. Communications of the ACM, 56(1), 116–124. https://doi.org/10.1145/2398356.2398381
  • Simonyan, K., & Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556.
  • Solberg, S., Næsset, E., & Bollandsås, O. M. (2006). Single tree segmentation using airborne laser scanner data in a structurally heterogeneous spruce forest. Photogrammetric Engineering & Remote Sensing, 72(12), 1369–1378. https://doi.org/10.14358/PERS.72.12.1369
  • Song, Y., He, F., & Zhang, X. (2019). To identify tree species with highly similar leaves based on a novel attention mechanism for CNN. IEEE Access, 7, 163277–163286. https://doi.org/10.1109/ACCESS.2019.2951607
  • Sothe, C., Dalponte, M., Almeida, C. M. D., Schimalski, M. B., Lima, C. L., Liesenberg, V., Miyoshi, G. T., & Tommaselli, A. M. G. (2019). Tree species classification in a highly diverse subtropical forest integrating UAV-based photogrammetric point cloud and hyperspectral data. Remote Sensing, 11(11), 1338. https://doi.org/10.3390/rs11111338
  • Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., & Rabinovich, A. (2015). Going deeper with convolutions. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 1–9).
  • Tsai, D. M., & Chen, W. L. (2017). Coffee plantation area recognition in satellite images using Fourier transform. Computers and Electronics in Agriculture, 135, 115–127. https://doi.org/10.1016/j.compag.2016.12.020
  • Tzutalin. (2015). LabelImg [Git repository]. https://github.com/tzutalin/labelImg
  • Udali, A., Lingua, E., & Persson, H. J. (2021). Assessing forest type and tree species classification using Sentinel-1 C-band SAR data in southern Sweden. Remote Sensing, 13(16), 3237. https://doi.org/10.3390/rs13163237
  • Wäldchen, J., Mäder, P., & Cooper, N. (2018). Machine learning for image based species identification. Methods in Ecology and Evolution, 9(11), 2216–2225. https://doi.org/10.1111/2041-210X.13075
  • Wang, Q., Bi, S., Sun, M., Wang, Y., Wang, D., Yang, S., & Zhang, J. (2019). Deep learning approach to peripheral leukocyte recognition. PLoS ONE, 14(6), e0218808. https://doi.org/10.1371/journal.pone.0218808
  • Wen, L., Li, X., & Gao, L. (2020). A transfer convolutional neural network for fault diagnosis based on ResNet-50. Neural Computing and Applications, 32(10), 6111–6124. https://doi.org/10.1007/s00521-019-04097-w
  • Wu, Q., Lane, C. R., Li, X., Zhao, K., Zhou, Y., Clinton, N., de Vries, B., Golden, H., & Lang, M. W. (2019). Integrating LiDAR data and multi-temporal aerial imagery to map wetland inundation dynamics using Google Earth Engine. Remote Sensing of Environment, 228, 1–13. https://doi.org/10.1016/j.rse.2019.04.015
  • Xuan, J., Li, X., Du, H., Zhou, G., Mao, F., Wang, J., Zhang, B., Gong, Y., Zhu, D., Zhou, L., Huang, Z., Xu, C., Chen, J., Zhou, Y., Chen, C., Tan, C., & Sun, J. (2022). Intelligent estimating the tree height in urban forests based on deep learning combined with a smartphone and a comparison with UAV-LiDAR. Remote Sensing, 15(1), 97. https://doi.org/10.3390/rs15010097
  • Yan, S., Jing, L., & Wang, H. (2021). A new individual tree species recognition method based on a convolutional neural network and high-spatial resolution remote sensing imagery. Remote Sensing, 13(3), 479. https://doi.org/10.3390/rs13030479
  • Yao, W., Krzystek, P., & Heurich, M. (2012). Tree species classification and estimation of stem volume and DBH based on single tree extraction by exploiting airborne full-waveform LiDAR data. Remote Sensing of Environment, 123, 368–380. https://doi.org/10.1016/j.rse.2012.03.027
  • You, H. T., Lei, P., Li, M. S., & Ruan, F. Q. (2020). Forest species classification based on three-dimensional coordinate and intensity information of airborne LiDAR data with random forest method. The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, XLII-3/W10, 117–123. https://doi.org/10.5194/isprs-archives-XLII-3-W10-117-2020
  • Yu, X., Hyyppä, J., Litkey, P., Kaartinen, H., Vastaranta, M., & Holopainen, M. (2017). Single-sensor solution to tree species classification using multispectral airborne laser scanning. Remote Sensing, 9(2), 108. https://doi.org/10.3390/rs9020108
  • Yu, H., & Zahidi, I. (2023). Tailings pond classification based on satellite images and machine learning: An exploration of Microsoft ML.NET. Mathematics, 11(3), 517. https://doi.org/10.3390/math11030517
  • Zhang, C., Xia, K., Feng, H., Yang, Y., & Du, X. (2021). Tree species classification using deep learning and RGB optical images obtained by an unmanned aerial vehicle. Journal of Forestry Research, 32(5), 1879–1888. https://doi.org/10.1007/s11676-020-01245-0
  • Zhou, Y., Liu, W., Bi, H., Chen, R., Zong, S., & Luo, Y. (2022). A detection method for individual infected pine trees with pine wilt disease based on deep learning. Forests, 13(11), 1880. https://doi.org/10.3390/f13111880