Research Article

Determining the optimal generalization operators for building footprints using an improved graph neural network model

Article: 2306265 | Received 10 Oct 2023, Accepted 11 Jan 2024, Published online: 30 Jan 2024

Abstract

Determining the optimal generalization operators of city buildings is a crucial step during the building generalization process and an important aspect of realizing cross-scale updating of map data. It is a decision-making behavior of the cartographer that can be learned and simulated using artificial intelligence algorithms. Multi-scale data can provide rich generalization samples to train the determination process. However, previous studies have focused primarily on the intelligent use of each generalization operator separately, neglecting the intelligent scheduling issue that arises when multiple operators are used simultaneously. Herein, we propose an improved graph neural network (GNN), called the self-neighborhood merged GNN (SNGNN), that selects the optimal generalization operators for different buildings. In SNGNN, node and edge information is passed with different weights through two modules that separately simulate the influence of a building's own characteristics and that of its neighborhood. SNGNN was experimentally validated using sample datasets for Ningbo, China, at 1:10,000 and 1:25,000. The F1-score on the testing dataset was 94.19%, and the classification precision of each operator was ≥87%. Compared with other popular intelligent algorithms, SNGNN showed better performance in determining the optimal generalization operators.

1. Introduction

As an important social and human element and one of the main components of city maps, buildings have been the focus of automatic urban map generalization (Li et al. Citation2004). Building generalization plays an important role in expressing multi-scale geographical spatial elements and facilitating their cross-scale updating; this process is primarily realized using the corresponding operators and algorithms (Shea and Master Citation1989; Zhou et al. Citation2009; Wu et al. Citation2022). The operators perform basic building generalization operations, such as selection and simplification, executed by generalization algorithms. The building generalization operators include deletion, simplification, displacement, aggregation, and typification (Rieger and Coulson Citation1993; Susetyo and Hidayat Citation2019). During generalization, the characteristics of buildings and their surrounding environmental constraints should be considered. Subsequently, the generalization operators of different buildings should be determined. The generalization algorithms and their parameters to execute these operators should then be set (Steiniger et al. Citation2010; Lyu et al. Citation2022; Li, Wu et al. Citation2022). Therefore, determining the generalization operators for buildings is a crucial step affecting subsequent generalization processes and the final generalization quality. Generally, this is controlled by a multi-agent system (Lamy et al. Citation1999; Touya et al. Citation2010; Duchêne et al. Citation2018) that schedules different generalization operators based on artificially set rules and constraints. However, due to the diversity of building shapes and spatial distributions, the constraints of multi-agent systems are generally complex and rarely transferable between regions. In early building generalization research, a corresponding process was formulated for operator selection (Ruas & Duchene Citation2007; Ware et al. Citation2003).
However, the setting of parameters is inseparable from the experience of human cartographers; systems of this type therefore cannot meet the needs of automatic map generalization.

Due to the resurgence of artificial intelligence, automatic map generalization approaches have substantially changed. Intelligent methods can extract implicit knowledge from data, and artificial intelligence can be applied to building generalization, including building shape cognition (Touya and Lokhat Citation2020; Abhilash et al. Citation2022; Hu et al. Citation2022; Yang et al. Citation2022), building group classification (Yan et al. Citation2020; Liu et al. Citation2023), and generation of generalization results (Sester et al. Citation2018; Courtial et al. Citation2022). Scholars have gradually focused on simulating decision-making processes with the help of artificial intelligence (Balboa and Ariza-López Citation2008; Courtial et al. Citation2020; Karsznia et al. Citation2022). Applying machine learning methods for automatic generalization requires an appropriate sample data set; various generalization behaviors can then be simulated using mathematical models based on the training data set. When the learning goal is set to select the optimal generalization operator, it simulates the decision-making process of cartographers, determining the generalization operators for different building types, a technique referred to as an ‘intelligent decision-making process’ (Ai Citation2021). The intelligent use of different operators has been widely investigated. For example, inspired by case-based reasoning, Xie et al. (Citation2017) studied building selection by setting appropriate feature items from existing small-scale map data. Meanwhile, using multi-standard decision-making technology, Touya (Citation2021) described the map context elements and tested the selection operation on a true large-scale cartographic dataset. By applying rough set theory, Fiedukowicz (Citation2020) compared three methods to assess the related attributes in building selection, while An et al. (Citation2023) selected buildings using convolutional neural networks.
For the simplification operator, Yan et al. (Citation2022) proposed an adaptive selection method comprising four simplification algorithms using graph convolutional self-coding networks. Meanwhile, for the typification operator, Yan et al. (Citation2021) proposed a novel building typification approach to obtain high-quality representative objects using affinity propagation exemplar-based clustering. For the aggregation operator, Cheng et al. (Citation2015) realized intelligent building outline aggregation using the backpropagation neural network. However, map generalization results from the simultaneous use of various generalization operators. Lee et al. (Citation2017) evaluated the optimal deletion, selection, and aggregation operators by inputting the building characteristics. However, the description of buildings was limited to semantic attributes, causing the classification accuracy of the aggregation operators to be too low to meet practical requirements. Feng et al. (Citation2019) and Courtial et al. (Citation2021b) addressed the joint use of different generalization operators by directly outputting the generalization results. However, their outputs are in raster form, which cannot be directly used for vector maps. Therefore, research on determining the optimal generalization operators to realize their scheduling is required.

Among the many intelligent methods, the characteristics of graph neural networks (GNNs) offer advantages for cartographic generalization (Courtial et al. Citation2021a). First, these models can accept inputs with inconsistent neighborhood structures. Mainstream neural networks, such as convolutional neural networks (Bouvrie Citation2006) and recurrent neural networks (Zaremba et al. Citation2014), only accept inputs with a fixed neighborhood structure, such as grids; therefore, when dealing with vector data that have inconsistent neighborhood topological structures and complex spatial distributions, vector-to-grid conversions are necessary, and complex processing is then required to convert the generated results back into usable vector data. In contrast, the GNN input is generally in graph structure form, onto which the natural adjacency relationships of geographical elements can be mapped with a good fit. Moreover, the powerful graph structure expression of GNNs can achieve good information retention when driven by different tasks. Traditional machine learning methods are typically based on the assumption of independent and identically distributed data; they can only quantify the data using their own attributes, potentially leading to the loss of environmental information. In contrast, GNNs can easily express graph structures (Xu et al. Citation2018) and optimize various loss functions to preserve information; different graph embedding algorithms can then embed the expression of different information on the graph. Owing to these advantages in dealing with geographic element data, GNNs have been widely applied to building group pattern recognition and other tasks (Yan et al. Citation2019; Karsznia and Sielicka Citation2020; Zhang et al. Citation2022; Li, Lu et al. Citation2022). Such intelligent algorithms are expected to perform well in solving macro-level problems, such as the optimal selection of generalization operators (Touya et al. Citation2019).

In summary, determining the optimal generalization operator faces several challenges. First, although many studies have investigated the intelligent use of generalization operators, the combined use of these operators has received little attention. Given the natural order and constraints among generalization operators, it is necessary to analyze their characteristics and different operator combinations to realize the intelligent scheduling of generalization operators. This requires human cartographers to analyze building characteristics to determine which generalization operator combination to use, thus selecting the optimal generalization operators. This behavior can be simulated using neural networks such as GNNs: by learning from existing samples, the network model can determine the optimal generalization operators for new large-scale buildings. Second, determining the optimal generalization operators is related not only to the characteristics of the building but also to the characteristics of the surrounding buildings. However, it is difficult to model the feature items of buildings and their environments simultaneously. For example, when graph theory and graph convolution algorithms are applied to construct the relationships between buildings, typically only the topological information of the graph structures is used, while rich relationship features, such as distance and similarity, are ignored. Therefore, these graph structures do not sufficiently consider the complex relationships between buildings and their surrounding geographical elements; explicit modeling of these relationships is lacking. Here, we propose an improved GNN model, called the self and neighborhood merged GNN (SNGNN), for optimal building generalization operator determination. The main steps include:

  1. Setting the sample label of the large-scale buildings: Based on the matching relationship analysis and the characteristics of buildings before and after generalization, the generalization operator combination is summarized. Furthermore, the importance of different generalization operators in the combination is assessed. The operator that plays a leading role in the combination is regarded as the optimal generalization operator of the building. Subsequently, the large-scale buildings are labeled according to their optimal generalization operator for subsequent network model learning and validation.

  2. Sample feature extraction of the large-scale buildings: Building characteristics and their relationships with the surrounding environment are analyzed before division into node and edge features and organized into a reasonable input structure as the input source of the decision-making model.

  3. Determining the optimal generalization operator of large-scale buildings: Traditional GNN algorithms cannot incorporate edge features in the modeling of building group characteristics, and the nodes and neighbors have the same weight without distinction between them. To overcome this issue, SNGNN distinguishes the node self-expression from the neighborhood characteristic expression to better simulate the building features. Moreover, to assess the model’s classification ability, a testing data set is applied to verify whether SNGNN can determine the optimal generalization operator.

2. Methods

2.1. Framework for determining optimal generalization operators

The goal of this study is to select the optimal generalization operator for different buildings. To this end, we extracted a sample data set from the multi-scale database, including label setting (the labels are the learning targets of the training data set and the evaluation criteria of the testing data set) and feature extraction (the analysis of sample features to obtain items describing the samples). A suitable learning model was built to learn high-dimensional information from the extracted feature descriptors, simulating the cartographers’ decision-making process. Finally, the importance of each feature item, as well as the effectiveness, scientific validity, and sensitivity of the model, were analyzed. The framework is presented in .

Figure 1. Framework for determining the optimal generalization operator based on the SNGNN.


2.2. Setting the sample label of the large-scale buildings

The scientific use of generalization operators is a key factor in building generalization (Li Citation2007). The prerequisite for automatically determining optimal generalization operators is to define the correct operators and corresponding labels. Typically, the generalization operators of buildings can be divided into five types: selection (deletion), simplification, aggregation, typification, and displacement (Zhang et al. Citation2018). Displacement is generally a conflict processing operator based on the need for map symbolic representation and is independent of other generalization operators. Thus, the displacement operator was not considered here.

Generalization operators are often used in combination and, thus, have a certain priority order. By observing generalization examples of multi-scale data, it is apparent that the combination of generalization operators has regularity, and different combinations lead to different matching relationships of the same target in multi-scale data. For example, if a 1:0 matching relationship is observed, the cartographers deleted the building after analyzing its characteristics. In contrast, a 1:1 matching relationship indicates that the building was retained without being merged with other buildings; in this case, the simplification operator, or combined simplification and displacement, is adopted. For the N:1 and M:N matching relationships, the characteristics of building groups are obvious; according to the characteristics of the group pattern, the aggregation and typification operators are respectively adopted, and the simplification operator can be applied for optimization. Thus, the different matching relationships of multi-scale data are related to the generalization operators used in the building generalization.

Meanwhile, the generalization operators in a combination have different importance. Taking the generalization operator combination corresponding to an N:1 matching relationship as an example, the aggregation operator is the most important in the combination. Considering that applying the aggregation operator indicates that the building has been selected, the simplification and displacement operators adopted after the aggregation only make minor adjustments to the building. Hence, the aggregation operator is the main generalization operator that the cartographer decides to use according to the characteristics of the buildings. We call this the optimal generalization operator of the building. Similarly, combinations corresponding to other matching relationships also have optimal generalization operators.

In summary, a mapping relationship exists among multi-scale data matching, the generalization operator combination, and the optimal generalization operator of large-scale buildings ().

Table 1. Mapping relationships within multi-scale data matching, usage of the generalization operator combination, and the optimal generalization operator.

Thus, by identifying the matching relationship of multi-scale data, the optimal generalization operator used by cartographers for large-scale buildings can be obtained and set as a sample label. The sample will then be used to train and test the model, simulating the decision behavior of cartographers in choosing the optimal generalization operator for buildings.
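The label-setting rule summarized above (and in Table 1) can be sketched as a simple lookup. This is an illustrative reading of the mapping described in the text, not code from the paper; the function name and integer label encoding are assumptions.

```python
# Map the matching relationship observed between scales to the optimal
# generalization operator, following the rule described in Section 2.2.
# The integer encoding of labels is an illustrative assumption.
OPERATOR_LABELS = {"deletion": 0, "simplification": 1,
                   "aggregation": 2, "typification": 3}

def optimal_operator(matching):
    """Return the optimal generalization operator for a large-scale building.

    matching: one of "1:0", "1:1", "N:1", "M:N".
    """
    if matching == "1:0":   # building disappears at the small scale
        return "deletion"
    if matching == "1:1":   # retained individually, outline simplified
        return "simplification"
    if matching == "N:1":   # several buildings merged into one
        return "aggregation"
    if matching == "M:N":   # group re-represented by fewer objects
        return "typification"
    raise ValueError(f"unknown matching relationship: {matching}")
```

Each labeled building then carries the operator implied by its matching relationship, e.g. `optimal_operator("N:1")` yields `"aggregation"`.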

2.3. Sample feature extraction of large-scale buildings

When determining the optimal generalization operators, cartographers must consider the characteristics of the target building and those of the surrounding buildings. In past studies, the sample structure was organized as one or two two-dimensional node feature matrices (Lee et al. Citation2017), which described the characteristics of the building itself and the topological relationships of the buildings. However, owing to the dimensional limitations of matrices, it is difficult to express the information-rich relationships between buildings and their neighbors (Yan et al. Citation2019). To address this issue, we analyzed the message transmission principle of spatial graph convolution. Based on the input matrices of traditional graph convolutional networks (GCNs), an edge feature matrix containing information on the relationships between a building and its neighboring buildings was added as an extra input to the GNN.

2.3.1. Structure setting of the large-scale building input

GCN in the spatial domain is an important branch of GNN. It aggregates the information of adjacent nodes directly in the spatial domain, extends the convolution operation in Euclidean space directly to graphs, and allows parameter sharing through a consistent information aggregation function. An information transmission architecture for spatial graph convolution has been proposed that covers the most popular spatial GCNs (Kipf and Welling Citation2017) and graph attention networks (GATs) (Veličković et al. Citation2018). The message propagation process in GNN is divided into two steps, i.e. message propagation and state updating, represented by the $M$ and $U$ functions, respectively, as shown in Equation 1 and Equation 2:
(1) $m_v^{t+1} = \sum_{w \in N(v)} M_t(h_v^t, h_w^t, e_{vw})$
(2) $h_v^{t+1} = U_t(h_v^t, m_v^{t+1})$
where $v$ and $w$ represent nodes in graph $G$, $h$ is a node feature of the hidden layer, $e$ is an edge feature of the hidden layer, $t$ is the running time step, $N(v)$ is the neighborhood of $v$, and $M$ and $U$ are differentiable functions that generally contain learnable parameters. $M$ should not be sensitive to node ordering.
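Equations 1 and 2 can be sketched framework-free as follows, assuming sum aggregation for $M$ (which is insensitive to node ordering, as required) and treating $M$ and $U$ as caller-supplied functions; the graph and feature values in the test are illustrative.

```python
# Generic message-passing step (Equations 1-2): aggregate messages from
# the neighborhood N(v), then update each node's state.
def message_passing_step(h, e, neighbors, M, U):
    """h: dict node -> feature vector (list of floats)
    e: dict (v, w) -> edge feature vector
    neighbors: dict node -> list of neighbor nodes
    M, U: message and update functions supplied by the caller.
    """
    h_new = {}
    for v in h:
        # Equation 1: sum the messages from all neighbors of v;
        # summation is order-insensitive, as required for M.
        m = [0.0] * len(h[v])
        for w in neighbors[v]:
            msg = M(h[v], h[w], e[(v, w)])
            m = [a + b for a, b in zip(m, msg)]
        # Equation 2: update the state of v from its old state and message.
        h_new[v] = U(h[v], m)
    return h_new
```

For instance, with `M = lambda hv, hw, evw: [hw[0] + evw[0]]` and `U = lambda hv, m: [hv[0] + m[0]]`, one step adds each neighbor's feature plus the edge feature to the node's own state.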

The equations indicate that to describe the characteristics and context of the building better, the input of the GNN model should include three matrices:

  1. Node feature matrix $N \in \mathbb{R}^{n \times d_1}$,

  2. Building adjacency matrix $A \in \mathbb{R}^{n \times n}$,

  3. Edge feature matrix $E \in \mathbb{R}^{n \times n \times d_2}$,

where $n$ is the number of nodes in the graph; $d_1$ and $d_2$ are the number of node features and the number of edge features, respectively. $N$ is used primarily to describe the attribute characteristics and macroscopic environmental characteristics of the building; $A$ describes the relationships between the buildings (message transmission only occurs between the nodes of connected buildings), in which $A_{ij}$ indicates the existence of an edge connection between $v_i$ and $v_j$; and $E$ organizes the relationships between pairs of buildings in the form of edges. Combined with the adjacency information provided by the adjacency matrix, the edge feature matrix can describe the relationships between a building and its multiple neighbors and retain substantial amounts of contextual information about the building and its neighbors.
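As a concrete, made-up illustration of the three input matrices, a toy group of n = 3 buildings with d1 = 2 node features and d2 = 1 edge feature could be laid out in plain Python lists as follows; the feature values and their meanings are assumptions, not data from the paper.

```python
# Toy instantiation of the three SNGNN input matrices for n = 3 buildings.
n, d1, d2 = 3, 2, 1

# Node feature matrix N (n x d1): e.g. [area, compactness] per building.
N = [[120.0, 0.8],
     [ 95.0, 0.7],
     [300.0, 0.6]]

# Adjacency matrix A (n x n): 1 where a Delaunay edge connects two buildings.
A = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]

# Edge feature matrix E (n x n x d2): e.g. [nearest distance] per pair;
# entries are only meaningful where A[i][j] == 1.
E = [[[ 0.0], [12.5], [ 0.0]],
     [[12.5], [ 0.0], [20.0]],
     [[ 0.0], [20.0], [ 0.0]]]
```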

2.3.2. Setting description items of input matrices

According to the above analysis, the description items of $N$, $A$, and $E$ should be set.

2.3.2.1. Setting of the building adjacency matrix

Traditional methods to construct the neighborhood topological relationships of buildings are Delaunay triangulation, minimum spanning tree networks (Zhang et al. Citation2011), and building mesh (Wang, Qian et al. Citation2015). Delaunay networks are characterized by a rich neighborhood structure and fast construction and can retain more neighborhood information than the other two methods. Thus, the Delaunay network was adopted in this study for group network construction.

A noteworthy issue is the constraint conditions of network construction, including determining the global constraint of the network construction objects (Deng et al. Citation2018) and the local constraint of pruning unreasonable connected objects after construction. Considering the linkage problem between roads and buildings in the generalization process (Liu, Qian, Wang et al. Citation2016), the buildings were divided into different block settlements using small-scale road data as a global constraint. For the local constraint problem, unreasonable neighborhood objects were avoided by pruning the building Delaunay network according to two principles: 1. Excessively long edges in the Delaunay network were deleted (blue lines in ), and the length threshold was manually set according to regional characteristics. 2. Delaunay edges that intersected with other buildings were deleted (red lines in ).

Figure 2. Flowchart for buildings adjacency matrix construction.

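The two local pruning principles can be sketched as follows. Note that the building-crossing test here is simplified to a midpoint-in-bounding-box check purely for illustration; a real implementation would use a proper segment-polygon intersection test, and all names and values are assumptions.

```python
import math

def prune_edges(points, edges, length_threshold, obstacles):
    """Prune a building Delaunay network by the two local principles.

    points: dict id -> (x, y) building centroid
    edges: list of (i, j) Delaunay edges
    length_threshold: manually set, region-specific length limit
    obstacles: building footprints approximated here by axis-aligned
               boxes (xmin, ymin, xmax, ymax) -- an illustrative shortcut.
    """
    kept = []
    for i, j in edges:
        (x1, y1), (x2, y2) = points[i], points[j]
        # Principle 1: delete excessively long edges.
        if math.hypot(x2 - x1, y2 - y1) > length_threshold:
            continue
        # Principle 2: delete edges that pass through another building
        # (crudely approximated by the edge midpoint falling inside a box).
        mx, my = (x1 + x2) / 2, (y1 + y2) / 2
        if any(xmin <= mx <= xmax and ymin <= my <= ymax
               for xmin, ymin, xmax, ymax in obstacles):
            continue
        kept.append((i, j))
    return kept
```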
2.3.2.2. Setting of the node feature and edge feature matrices

Attribute items describing buildings have been widely studied and used in cartography (Ai et al. Citation2013; Liqiang et al. Citation2013; Abbaspour et al. Citation2021; Touya Citation2021). The morphological and environmental characteristics of buildings are important factors influencing the determination of the generalization operator (Shuai et al. Citation2007). By analyzing and classifying these characteristics, we can summarize them into different description items and organize them into node and edge features, respectively.

To characterize the building itself, in cartographic generalization, we focus on the size, orientation, and shape of the building. To describe the environmental characteristics of buildings, the relationships between building pairs must be quantified. The Gestalt theory and urban morphology have been widely used in this process. Li et al. (Citation2004) summarized eight Gestalt principles: proximity, similarity, common fate, common region, closure, continuity, element connectivity, and common orientation. We reorganized these principles and set them into four categories to describe the similarity, proximity, closure, and connectivity of building pairs. Additionally, other parameters, such as density and nearest road distance, were adopted to describe the macro-environment characteristics.
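Two of the pair description items can be illustrated with common stand-in formulas; the exact definitions used in Table 2 are not reproduced here, so the area-ratio similarity and centroid-distance proximity below are hedged approximations rather than the paper's formulas.

```python
import math

def polygon_area(ring):
    """Shoelace area of a simple polygon given as [(x, y), ...]."""
    s = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        s += x1 * y2 - x2 * y1
    return abs(s) / 2.0

def size_similarity(a, b):
    """Ratio of the smaller to the larger footprint area, in (0, 1]."""
    sa, sb = polygon_area(a), polygon_area(b)
    return min(sa, sb) / max(sa, sb)

def centroid(ring):
    """Vertex-average centroid -- a rough stand-in for the true centroid."""
    xs, ys = zip(*ring)
    return sum(xs) / len(xs), sum(ys) / len(ys)

def proximity(a, b):
    """Centroid distance as a simple proximity measure for a building pair."""
    (ax, ay), (bx, by) = centroid(a), centroid(b)
    return math.hypot(bx - ax, by - ay)
```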

Based on existing research and further analysis, description items could be summarized as individual or building pair attributes. The specific definitions of the description items are shown in . The calculation and explanation of certain complex description items are detailed in .

Figure 3. Schematic diagram of complex feature item calculations.


Table 2. Formulas and descriptions of the building characteristics.

Further, the relationship between each building and its nearest neighbor is used to describe the contextual environment of the building and is included as a component of the node features. Finally, the node and edge feature matrix attributes are summarized in .

Table 3. Setting of node feature and edge feature matrices.

After designing the input structure, the specific values of different feature matrices are obtained according to the formula for each feature description term; finally, the inputs for network model learning are obtained. The flowchart of sample feature extraction is illustrated in .

Figure 4. Flowchart of sample feature extraction.


2.4. Self-neighborhood merged GNN model

To give the model this selection capability, two problems must be addressed. The first is obtaining a high-dimensional expression of the building itself and its neighborhood based on rich node feature, neighborhood, and edge feature information. The second is learning from these high-dimensional expressions and selecting the optimal generalization operator by merging them.

Regarding the first problem, for the high-dimensional expression of the building itself, the traditional GNN could be adopted. When determining the optimal generalization operator of buildings, attention must be paid to the surrounding buildings according to their different characteristics. Thus, for the high-dimensional expression of the building neighborhood, the multi-attention mechanism GAT model was adopted, which fuses node features and edge features in a neighborhood embedding module.

Concerning the second issue, during the generalization of buildings, the role and influence of the characteristics of buildings on themselves and other buildings differ. In GNN and its variants, the influence of the node attributes on themselves is often preserved by adding self-rings. Thus, the parameters expressed by the influence of nodes on themselves and neighborhood nodes are shared, making it difficult to distinguish between their effects. To address this, an improved GNN structure was designed, namely SNGNN, which integrates the expression of nodes and neighborhood information with different modules. The overall structure of the generalization operator decision model is shown in , and the primary components of each module are described as follows.

Figure 5. Workflow of the SNGNN decision model. f(⋅) and σ(⋅) are the fully connected transformation + Relu activation function and the fully connected transformation + Softmax function, respectively; EGAT represents the EGAT layer convolution + Relu activation function. EGAT: edge graph attention network; SNGNN: self and neighborhood merged graph neural network.

2.4.1. Self-feature embedding module

The main task of this module is to learn the embedded expression of the node features. It corresponds to a GNN with only self-loops, essentially equivalent to a fully connected network. The received input is $N \in \mathbb{R}^{n \times d_1}$, and the output function of a node is expressed as shown in Equation 3:
(3) $h' = \sigma(Wh + b)$
where $W \in \mathbb{R}^{E_{in} \times E_{out}}$ is a learnable parameter matrix and $\sigma$ represents the nonlinear activation function.
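Equation 3 for a single node can be written out directly; in this sketch the weights are fixed for illustration rather than learned, and ReLU stands in for the nonlinear activation.

```python
def relu(x):
    """Element-wise ReLU activation."""
    return [max(0.0, v) for v in x]

def self_embed(node_features, W, b):
    """Self-feature embedding of one node: h' = sigma(W h + b).

    node_features: list of d_in floats
    W: d_out x d_in weight matrix (illustrative, not trained)
    b: bias vector of length d_out
    """
    z = [sum(wij * hj for wij, hj in zip(row, node_features)) + bi
         for row, bi in zip(W, b)]
    return relu(z)
```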

2.4.2. Neighborhood feature embedding module

In recent years, scholars have paid increasing attention to the application of edge features in graph structures. Most GNN variants are based on improvements of the GAT network, a representative GNN. The central concept of GAT is the performance of self-attention on the nodes: a shared attention mechanism that can assign different importance values to different nodes during the convolution process. The general improvement strategies are to enhance the calculation formula (Gong and Cheng Citation2019) or the information transmission strategy (Yang and Li Citation2020). Chen and Chen (Citation2021) and Wang et al. (Citation2021) fused these two improved methods: the spliced edge features were integrated into the calculation formula of the attention coefficient in a form equivalent to node features, realizing synchronous message aggregation of the node graph and the edge dual graph at each network layer. The node module and edge module in the edge GAT (EGAT) layer may be updated in different forms depending on specific requirements. Finally, different learning tasks are performed by fusing multi-scale edge features and node features in the final merged layer. The framework is shown in .

Figure 6. Fusion edge characteristic graph convolution network framework (Chen and Chen Citation2021).


The main task of the neighborhood structure module is to learn the neighborhood structure features of buildings. The module accepts $N \in \mathbb{R}^{n \times d_1}$, $E \in \mathbb{R}^{n \times n \times d_2}$, and $A \in \mathbb{R}^{n \times n}$, primarily realized by two EGAT layers. The multi-attention mechanism adopts the calculation method proposed by Chen and Chen (Citation2021), as shown in Equation 4:
(4) $a_{ij} = \dfrac{\exp\left(\mathrm{LeakyReLU}\left(a^{T}\left[W_h h_i \,\|\, W_h h_j \,\|\, W_e e_{ij}\right]\right)\right)}{\sum_{k \in N_i} \exp\left(\mathrm{LeakyReLU}\left(a^{T}\left[W_h h_i \,\|\, W_h h_k \,\|\, W_e e_{ik}\right]\right)\right)}$
where $T$ denotes the matrix transposition operation, $\|$ denotes the matrix splicing (concatenation) operation, and $W_h \in \mathbb{R}^{F_H \times F_H'}$ and $W_e \in \mathbb{R}^{F_E \times F_E'}$ are learnable parameters, which respectively realize the mapping of node features and edge features to high dimensions; $F_H, F_H', F_E, F_E'$ denote the input and output dimensions of the nodes and edges. The dimensions are converted $(\mathbb{R}^{F_H'}, \mathbb{R}^{F_H'}, \mathbb{R}^{F_E'}) \to \mathbb{R}$ by calculating $a_{ij}$. The normalized weights are organized in the form of an adjacency matrix $A_2 \in \mathbb{R}^{N \times N}$, in which $a_{ij}$ is the attention coefficient between $v_i$ and $v_j$. The update of node features takes the $A_2$ matrix as a guide to compute the weighted average of the features of the node's neighbors, as shown in Equation 5:
(5) $h_i' = \sigma\left(\sum_{j \in N_i} a_{ij} \left[W_h' h_j \,\|\, W_e e_{ij}\right]\right)$

Notably, the learning matrix $W_h' \in \mathbb{R}^{E_{in} \times E_{out}}$ differs from $W_h$ in Equation 4; this distinguishes the node features used in the calculation of the attention coefficient from the node features used in expression learning for building neighborhood features.
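The mechanics of Equations 4 and 5 can be illustrated with scalar node and edge features (i.e. identity weight matrices and a three-element attention vector). This is a didactic sketch of the attention computation, not the trained model, and the activation in the update step is omitted for clarity.

```python
import math

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def attention_weights(i, h, e, neighbors, a):
    """Normalized attention coefficients a_ij over N(i) (Equation 4).

    h: dict node -> scalar feature; e: dict (i, j) -> scalar edge feature;
    a: attention vector of length 3 acting on [h_i, h_j, e_ij].
    """
    logits = {}
    for j in neighbors[i]:
        concat = [h[i], h[j], e[(i, j)]]   # splice node and edge features
        logits[j] = leaky_relu(sum(ak * ck for ak, ck in zip(a, concat)))
    z = sum(math.exp(v) for v in logits.values())
    return {j: math.exp(v) / z for j, v in logits.items()}   # softmax

def update_node(i, h, e, neighbors, a):
    """Weighted average of spliced [h_j, e_ij] pairs (Equation 5, no sigma)."""
    w = attention_weights(i, h, e, neighbors, a)
    return [sum(w[j] * h[j] for j in w),
            sum(w[j] * e[(i, j)] for j in w)]
```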

Next, we set different attention channels to calculate and generate $h'$ values, which are then fused. To avoid blending the outputs of the different attention channels, the concat aggregation method was adopted. Therefore, the final node features are as follows:
(6) $h'' = \big\Vert_{l=0}^{L} h'^{(l)}$

Meanwhile, the transformed edge features produced by the attention of the different channels are spliced to form the edge feature input dimensions of the next layer:
(7) $E' = \big\Vert_{l=0}^{L} W_e^{(l)} e_{ij}^{(l)}$

An example of a concrete EGAT layer is shown in .

Figure 7. Schematic diagram of the node and edge feature updates of the EGAT layer. The number of multiple attention channels in EGAT layer l = 2. EGAT: edge graph attention network.


2.4.3. Merge module

After learning the above two modules, the high-dimensional expression of the building itself and its neighborhood was obtained. The outputs of the two modules were spliced to obtain the merging module input; then, the reduced dimension expression of the fusion characteristics was learned through the fully connected network. The hidden layer also used a Relu nonlinear activation function, while the output layer used a Softmax function to calculate the probability of each category. The dimension index with the largest output probability was adopted as the determining result.
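The merge step can be sketched as follows, with an untrained, hypothetical weight matrix standing in for the fully connected network; the hidden Relu layers are omitted and only the splice, output transform, Softmax, and arg-max are shown.

```python
import math

def softmax(z):
    """Numerically stable Softmax over a list of logits."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    s = sum(exps)
    return [v / s for v in exps]

def merge_and_decide(h_self, h_neigh, W, b, labels):
    """Splice the self and neighborhood embeddings, apply a fully
    connected output layer, and return the most probable operator class.

    W, b: illustrative (untrained) output-layer parameters.
    labels: operator names, one per output dimension.
    """
    x = h_self + h_neigh                       # splice the module outputs
    z = [sum(wij * xj for wij, xj in zip(row, x)) + bi
         for row, bi in zip(W, b)]
    probs = softmax(z)
    return labels[probs.index(max(probs))], probs
```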

3. Results and analyses

The experiments were run on the Microsoft Windows 10 64-bit operating system. The proposed SNGNN was implemented in Python with PyTorch. The code is publicly available (https://doi.org/10.6084/m9.figshare.23118758), and the required libraries are listed in the attached ‘environment.txt’ file.

3.1. Data preparation

Building datasets at 1:10,000 and 1:25,000 from Ningbo, China were used in this study. The data cover part of the Beilun District of Ningbo city; the specific location is shown in Figure 8.

We first used manual inspection to identify the matching relationships of the same building objects before and after generalization. Following the steps in Sections 2.2 and 2.3, sample labels were set and description items were extracted; 3100 samples were obtained, including 313 deletion, 146 simplification, 1931 aggregation, and 710 typification samples (Figure 8).

Figure 8. Dataset overview of the Beilun District, Ningbo, China. A. Dataset location and study site depicted within the red box. B. Building and road conditions in the dataset; right: label of large-scale data samples.

3.2. Experimental results and analysis

3.2.1. Experimental results

An 8:2 ratio was used to divide the training and testing sets. Two EGAT layers were used to update the node and edge features. The Adam optimizer was used in the training iterations, with the learning rate set to 0.005 and the weight decay coefficient set to 0.0005. The specific parameter settings are shown in Table 4.

Table 4. Parameter settings in layers.

Given that the number of samples per class was unbalanced, the model was easily biased toward the simpler, majority classes. The focal loss function (Lin et al. Citation2017) counteracts this by increasing the loss contribution of error-prone and difficult samples: standard cross entropy (CE) is multiplied by $(1-p_t)^{\gamma}$, which grows with the difficulty of classification. The specific formula is shown in Equation 8. (8) $\mathrm{FL}(p_t) = -\alpha_t (1 - p_t)^{\gamma} \log(p_t)$

Therefore, the focal loss function was used to balance the learning degree of different samples. Gamma (γ) was set to 1 and alpha (αt) to the empirical values [1, 2, 1, 1]. The changes in the accuracy rate and loss function of the training and testing sets are shown in Figure 9. As the number of iterations increased, the loss values for the training and testing sets gradually decreased, and the accuracy of both datasets increased. The values were relatively smooth after 200 iterations, indicating that the model's ability to fit the training dataset gradually increased during training and that its ability to classify the testing dataset gradually improved.
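Equation 8 can be sketched directly in NumPy; the toy probabilities below are illustrative, while the alpha vector [1, 2, 1, 1] up-weighting the second (simplification) class follows the empirical setting described above.

```python
import numpy as np

def focal_loss(probs, labels, alpha, gamma=1.0):
    """Focal loss FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t),
    averaged over samples (Lin et al. 2017)."""
    p_t = probs[np.arange(len(labels)), labels]      # probability of the true class
    a_t = np.asarray(alpha, dtype=float)[labels]     # per-class weight alpha_t
    return float(np.mean(-a_t * (1.0 - p_t) ** gamma * np.log(p_t)))

alpha = [1, 2, 1, 1]                 # up-weight the simplification class
probs = np.array([[0.7, 0.1, 0.1, 0.1],
                  [0.2, 0.5, 0.2, 0.1]])
labels = np.array([0, 1])
# With gamma = 0 and alpha = [1,1,1,1], the focal loss reduces to standard CE
ce = focal_loss(probs, labels, [1, 1, 1, 1], gamma=0.0)
fl = focal_loss(probs, labels, alpha, gamma=1.0)
print(round(ce, 4), round(fl, 4))    # 0.5249 0.4001
```

Note how the confidently classified first sample (p_t = 0.7) contributes far less to the focal loss than to CE, while the harder, up-weighted second sample dominates.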

Figure 9. Loss and accuracy curves of the training and testing set in the SNGNN model learning process. SNGNN: self and neighborhood merged graph neural network.

According to the training results, the accuracy of the testing set could be obtained, and the determination results were evaluated using standard neural network metrics (recall, precision, and F1-score). The final confusion matrix and the per-operator scores are shown in Tables 5 and 6, respectively.

Table 5. Confusion matrix for dataset 1 testing set.

Table 6. Correct rate and F1 score of different operators.

As shown in Tables 5 and 6, after training was completed, the correct rates of the training and testing sets exceeded 96% and 94%, respectively. Although the proportions of the different categories differed greatly, the recall and precision rates exceeded 90% and 85%, respectively, indicating that the model could effectively distinguish buildings requiring different generalization operators and determine the optimal ones.

3.2.2. Importance analysis of features

In cartographic generalization, the interpretability and prediction accuracy of machine learning models are both important. However, neural network models are parameterized by many weights, and their input features are numerous and interrelated; compared with self-explanatory machine learning methods, it is difficult to extract reliable rule-like knowledge from them. Recently, many studies on interpretable artificial intelligence have been conducted (Smilkov et al. Citation2017; Galkin et al. Citation2018). Permutation importance (Altmann et al. Citation2010) tests feature items with high computational efficiency and flexible implementation. The trained model is kept fixed, and the influence of a feature item is analyzed by randomly shuffling that item's values in the testing dataset, feeding the shuffled data into the trained model, and measuring the percentage-point drop in output accuracy relative to the unshuffled input. Owing to the large number of attribute items, we grouped them into feature modules; each experiment shuffles the values of all attribute items in the same module to assess the importance of that module. Each experimental group was repeated five times to neutralize the randomness of shuffling, and the average accuracy change over the five replicates was calculated (Table 7). In addition to the change in total accuracy, we also determined the change in accuracy for each of the four label types.
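The module-wise permutation procedure described above can be sketched as follows; the two-column toy data and the trivial "trained model" are hypothetical stand-ins for the testing set and SNGNN.

```python
import numpy as np

def permutation_importance(model, X, y, module_cols, n_repeats=5, seed=0):
    """Drop in accuracy (percentage points) after shuffling all columns of one
    feature module together, averaged over n_repeats independent shuffles."""
    rng = np.random.default_rng(seed)
    base = np.mean(model(X) == y)              # accuracy on intact input
    drops = []
    for _ in range(n_repeats):
        Xp = X.copy()
        for c in module_cols:                  # scramble every item in the module
            Xp[:, c] = rng.permutation(Xp[:, c])
        drops.append(base - np.mean(model(Xp) == y))
    return 100.0 * float(np.mean(drops))

# Toy "trained model": predicts from column 0 only, so shuffling module {0}
# should hurt accuracy while shuffling module {1} should not.
rng = np.random.default_rng(42)
X = rng.standard_normal((200, 2))
y = (X[:, 0] > 0).astype(int)
model = lambda X: (X[:, 0] > 0).astype(int)
print(permutation_importance(model, X, y, [0]) > 0)     # True
print(permutation_importance(model, X, y, [1]) == 0.0)  # True
```

Because whole modules are shuffled together, correlated attribute items inside a module cannot mask each other, which is why the paper reports importance per feature module rather than per item.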

Table 7. Impact of different feature modules on the accuracy of different categories.

According to the results in Table 7, when the input of different feature modules was disturbed, the accuracy of the different operators was affected as follows:

  1. Owing to the large number of samples and the wide range of description items in the aggregation operator, the model easily determined the confusing samples as aggregation operators when key attributes were disturbed. Thus, its accuracy fluctuations were relatively small and positive.

  2. For the deletion operator, the size, shape, environment, and proximity to surrounding nodes were important features, as the deletion operator was suitable for buildings with small areas and those relatively far away from other buildings.

  3. For the simplification operator, the area and environment features, the various attributes of the nearest building, and the proximity to surrounding buildings were the most important features. This is primarily because the simplification operator suits buildings that are relatively far from other buildings and whose nearest buildings are small.

  4. For the typification operator, the orientation, similarity, and proximity features of the surrounding nodes were highly important, as typification is sensitive to orientation and shape similarity. Notably, the influence of orientation features in the node feature module of the self-feature embedding module was smaller than in the node feature module of the neighborhood feature module. This might be because the edge features explicitly model the orientation similarity of two buildings, suggesting that inputting inter-building relationships into the model as edge features is an effective modeling strategy.

  5. Overall, the edge feature module within the neighborhood feature module of SNGNN is a key component that substantially affects overall accuracy, demonstrating the necessity of the additional edge feature matrices introduced in this paper.

3.2.3. Model comparison analysis

To verify the scientific applicability of SNGNN, we compared it with a traditional method of determining generalization operators. Execution of a traditional generalization model relies on constraints and procedural knowledge. Referring to the work of Ruas and Duchene (Citation2007), we summarized the constraints and rules according to the characteristics of the sample building dataset and set priorities among the rules. Selecting the appropriate generalization operators for buildings with different characteristics according to the constraints in Table 8 and the rule execution process in Table 9 is regarded here as the traditional rule-based method (RB).

Table 8. Constraints of the rule-based method.

Table 9. Steps in the rule-based selection of optimal generalization operators.
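A rule cascade of the kind summarized in Tables 8 and 9 can be sketched as a prioritized if/else chain; every threshold and attribute name below is hypothetical and chosen only to illustrate the structure, not taken from the paper's rule set.

```python
# Toy sketch of a prioritized rule cascade in the spirit of the RB method.
MIN_AREA = 120.0   # m^2, hypothetical area below which a building may be deleted
NEAR_DIST = 15.0   # m, hypothetical distance defining "close" neighbors

def choose_operator(area, dist_to_nearest, n_close_neighbors, in_regular_group):
    """Pick one of the four generalization operators by prioritized rules."""
    if area < MIN_AREA and dist_to_nearest > NEAR_DIST:
        return "deletion"         # small, isolated building
    if in_regular_group:
        return "typification"     # building within a regular group pattern
    if n_close_neighbors > 0:
        return "aggregation"      # close neighbors but no clear pattern
    return "simplification"       # isolated building kept and simplified

print(choose_operator(80.0, 30.0, 0, False))    # deletion
print(choose_operator(500.0, 5.0, 3, True))     # typification
print(choose_operator(500.0, 5.0, 3, False))    # aggregation
print(choose_operator(500.0, 30.0, 0, False))   # simplification
```

The brittleness of such hand-set thresholds and fixed rule priorities is precisely what the comparison below probes: RB must encode distribution patterns explicitly, whereas SNGNN learns them from the samples.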

To further test the validity of the model, we compared it with traditional intelligent methods, namely iterative dichotomiser 3 (ID3), support vector machine (SVM), and k-nearest neighbor (KNN), and with popular neural networks, namely the multilayer perceptron (MLP), GCN, and GAT. The datasets were divided into training, testing, and validation sets at a ratio of 6:2:2. The input matrices for the comparison experiments were identical, except for the edge feature matrix added by our improvement. According to performance on the testing set, the max depth of ID3 was set to 5, the number of nearest neighbors K of KNN to 10, and the gamma of SVM to 1. As with SNGNN, the focal loss was used as the target loss function for feedback iterative training, and the training step size and weight decay parameters were kept consistent. The learning curves of the four neural networks are shown in Figure 10, and the experimental results of the seven methods are listed in Table 10.

Figure 10. Training of four neural networks. The four curves represent the changes in the accuracy of four neural network models with an increase in the number of iterations.

Table 10. Comparison of decision determining results for different methods.

According to Table 10, SNGNN has obvious advantages over the other methods:

  1. Compared with the traditional rule-based approach, SNGNN exhibited a substantial improvement in the classification correctness for the typification and aggregation operators. Particularly for the typification operator, the distribution patterns of different building groups are readily observable, while quantifying them as rules is difficult. The distribution patterns of different buildings may vary considerably and be difficult to describe using traditional rule-based methods. For building similarity metrics, it is difficult to generalize the complexities in new data by relying only on manually set threshold parameters. Therefore, SNGNN can learn the characteristics of buildings and select the optimal generalization operator suitable for different buildings.

  2. Compared with traditional machine learning algorithms and MLP (a classical multilayer perceptron for deep learning), SNGNN greatly improved the overall accuracy, even for small categories with few samples, such as the deletion and simplification operators. Moreover, SNGNN could better determine complex generalization operators, such as typification. This is attributed to the difficulty of comprehensively describing the environmental information of buildings using traditional machine learning methods. In contrast, SNGNN constructs a suitable relationship between a building and its surroundings to realize the transmission of feature information; hence, the model can perceive buildings with notable group patterns.

  3. Compared with GCN and GAT, which also use graph-structured input, SNGNN effectively maintains the accuracy of the small categories. Although the improvement in overall accuracy might not be substantial, the other methods show considerably poorer determination of the deletion and simplification operators and are thus not practical. This may be because, when SNGNN determines nodes, the edge features not only affect the attention among nodes but are also fused directly with the node features. SNGNN can therefore learn more feature information, and its correct rate on the validation set is considerably higher than that of the other three neural networks, suggesting that the learned features are more consistent with the data. As a result, SNGNN shows better practicality.

To further analyze the simulation ability of SNGNN, we selected blocks in the validation set, visualized the determination results, analyzed their characteristics, and counted the accuracy rate of each block. The visualization effects are shown in Figure 11; within each block, the results are ordered as the real label followed by the determinations of RB, KNN, ID3, SVM, MLP, GCN, GAT, and SNGNN.

Figure 11. Comparison of visualization effects for building generalization operator decision results under different learning models. RB: rule-based method, KNN: k-nearest neighbor, ID3: iterative dichotomiser 3, SVM: support vector machine, MLP: multilayer perceptron, GCN: graph convolution networks, GAT: graph attention network, and SNGNN: self and neighborhood merge graph neural network.

As shown in the determination results of blocks 1–4, the SNGNN model could detect relatively outlying buildings well and determined the simplification operator category with high accuracy. It seldom produced unreasonable results, such as assigning the typification operator to buildings without a distinguished group pattern or deleting buildings with obviously large areas. Compared with the other intelligent algorithms, SNGNN made effective use of the contextual description of the buildings provided by the node and edge feature items.

Although the algorithm presented in this study substantially improved the optimal decision-making for generalization operators, certain issues remain. For example, SNGNN is not particularly effective for small buildings. In Figure 11, the buildings with wrong determination results in blocks 3, 4, and 5 are close to other buildings with smaller areas, which may cause confusion between the aggregation and deletion operators. Similarly, for small buildings with distinguished group characteristics, confusion between the typification and aggregation operators may occur. This appears to be a reasonable outcome of simulating a cartographer's behavior: the use of generalization operators is a subjective decision, and cartographers might make different decisions (deletion or aggregation) for small buildings. This type of error is acceptable in practice and can be manually corrected.

3.2.4. Parameter sensitivity analysis

The preset SNGNN model underwent preliminary effect validation. To further validate the rationality and robustness of the model settings, we examined different SNGNN structure settings and the specific configuration of the loss function.

3.2.4.1. Influence of SNGNN structure setting

In the SNGNN model, the neighborhood expression module is the central module for transmitting neighborhood information. This module is based on the EGAT layer and is governed by two main parameters. The first is the number of stacked layers: according to the operating principle of spatial graph convolution, the number of layers determines the order of the neighbors whose features are used to update each node. GNN structures with 2–5 layers generally show good results (Zhao and Akoglu Citation2019). The second is the number of attention channels in each layer, which determines the diversity of information dissemination channels between nodes. This section focuses on these two parameters and establishes comparative experiments to obtain the optimal network structure.
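The link between the number of stacked layers and the order of reachable neighbors can be sketched on a toy graph; the path graph below is purely illustrative.

```python
import numpy as np

def k_hop_reach(adj, node, k):
    """Nodes whose features can influence `node` after k message-passing layers
    (the k-th order neighborhood, self included)."""
    n = adj.shape[0]
    reach = np.zeros(n, dtype=bool)
    reach[node] = True
    for _ in range(k):
        reach = reach | (adj @ reach.astype(int) > 0)   # expand by one hop
    return set(int(i) for i in np.flatnonzero(reach))

# Path graph 0-1-2-3-4: with 2 layers, node 0 "sees" only its 2nd-order neighbors
adj = np.zeros((5, 5), dtype=int)
for i, j in [(0, 1), (1, 2), (2, 3), (3, 4)]:
    adj[i, j] = adj[j, i] = 1
print(k_hop_reach(adj, 0, 2))  # {0, 1, 2}
```

Each additional EGAT layer widens this receptive field by one hop, which is why too many layers blend in distant, indirect neighbors and over-smooth the node features.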

In the layer-number comparison experiment, the number of attention channels was set to 4 for each layer; the node and edge feature dimensions were set to 16 for the hidden layers and 8 for the output layer. Each experimental group was trained five times, and the average value was taken. The changes in the correct rate of each category for different numbers of layers are summarized in Figure 12. In the comparison experiment on the number of attention channels, the number of layers was set to 2, the hidden-layer dimension of the node and edge features to 32, and the output-layer dimension to 8. The correct rate of each category and the total training time under the different experimental conditions are summarized in Figure 13 and Table 11, respectively.

Figure 12. Influence of different numbers of layers on the correct rate of each category. Numbers 1–6 represent layer numbers. Each category is replaced with shorthand, Del: deletion, Sim: simplification, Typ: typification, all: all samples.

Figure 13. Influence of different attention channel numbers on the correction rate of different categories. Numbers 1–6 represent channel numbers. Each category is replaced with shorthand, Del: deletion, Sim: simplification, Typ: typification, all: all samples.

Table 11. Correspondence between the number of attention channels and training time.

With an increase in the number of layers, the overall accuracy of the training and testing sets first increased and then gradually decreased. With two layers, accuracy remained relatively high, the accuracy of the different categories in the testing set was more balanced, and the overall effect of the network was optimal. Notably, as the number of layers increased, the correct rate of the simplification operator on the testing set decreased, while that of the typification operator increased. One possible explanation is that with too few layers, only first-order neighbors are considered and group feature information might be lost, reducing typification accuracy; with too many layers, too many indirect neighbors are considered, the nodes lose their particularity during information transmission, and the network becomes over-smoothed, sharply reducing the accuracy of the simplification operator, which relies on obvious individual characteristics. Overall, considering the second-order (two-layer) neighbors around each building is appropriate for determining the optimal generalization operators.

The fitting ability of the model improves as the number of channels increases, owing to the larger number of parameters and greater expressive ability. However, the overall accuracy on the testing set first increases and then decreases with the number of attention channels, demonstrating that too many channels easily lead to over-fitting and a substantial increase in training time. With four channels, the model performs best on the testing set while maintaining relatively high accuracy on the training set. Therefore, setting the number of attention channels to four is appropriate for maintaining both the simulation and generalization abilities of the network while shortening the training time.

3.2.4.2. Influence of the hyper-parameter setting of the loss function

The focal loss function was adopted in this study to balance the uneven distribution of samples. Its working principle is to improve the model's determining ability for categories with few or difficult samples by increasing the loss weight of wrongly determined samples through γ and increasing the loss weight of under-represented categories through αt. The experiment was therefore divided into two groups. In the first group, γ was fixed at 2, and αt was set to [1,1,1,1], to $(n-n_t)/n$ (where $n$ is the total number of training samples and $n_t$ the number in class $t$), and to [1,2,1,1], the last being an empirical value based on observation and analysis. In the second group, αt was fixed at [1,1,1,1], and γ was set to different values to observe the performance of the different categories on the testing set.

In the first group, each comparative experiment was repeated five times, and box plots of the classification accuracy of each category and of the total samples were drawn to show the changes in accuracy (Figure 14).

Figure 14. Influence of different αt values on the accuracy of different categories represented by different colors. Del: deletion, Sim: simplification, Typ: typification, All: all samples.

The training and testing sets showed the best results when the class proportions in the loss function were adjusted using empirically set weights; a better balance between categories was achieved at the cost of a slightly lower overall correct rate. However, inverse weighting by the per-class sample counts in the training set had a deleterious overall effect. Because the simplification and deletion categories with small sample counts were over-fitted when assigned high weights, the correct rates of the training and testing sets diverged considerably. Moreover, as the simplification operator involves more difficult samples than the deletion operator, such weighting may improve the accuracy of the small and simple classes but does not balance them well. Overall, setting αt to the default value of [1,1,1,1] might be the best choice; in the absence of human experience, the basic requirements can be met without tuning αt.

In the second group of experiments, each comparative analysis was repeated five times, and the average value was calculated. The results are shown in Figure 15.

Figure 15. Effect of γ on the accuracy of different categories. Del: deletion, Sim: simplification, Typ: typification, all: all samples.

Among the four sample types, the simplification operator was the most sensitive to the γ value. As γ increased, the accuracy of the simplification operator first increased and then gradually decreased, while the overall trend was roughly the same across categories. The best result was obtained when γ was 1, and the overall correct rate of the model exceeded 90% whenever γ was less than 5. This indicates that γ can fine-tune the correct rate of different samples but has little influence on the model overall. As for the large fluctuation of the simplification operator, this type of sample is difficult for the model, as explained previously; although an appropriate increase in γ can improve its accuracy, an overly large γ confuses the model and reduces its overall simulation ability.

4. Conclusion

An improved GNN, SNGNN, was proposed in this study to determine the optimal building generalization operators. In contrast to traditional machine learning methods and standard GNNs, SNGNN realizes intelligent decision-making over multiple generalization operators effectively and scientifically.

We proposed a new approach to determine the use of multiple generalization operators. In previous studies on building generalization, scholars often considered selection as the first step in the execution process; on this basis, the specific generalization computations applied differed with the distribution patterns of building groups, and depending on the accuracy of the determined spatial distribution patterns, the knowledge rules could become complex. This study instead treated the sequential use of multiple generalization operators as a regular combination. Four common combination types were summarized in terms of the graphical differences before and after generalization, and by analyzing the importance of the different generalization operators, the use of these four combinations was reduced to the selection of different optimal generalization operators. A cartographer's choice of the optimal generalization operator for each building after analyzing its characteristics is regarded as a decision-making behavior; we simulated this behavior using SNGNN, generated optimal decision results, and conducted experiments with data at 1:10,000 and 1:25,000. This research provides new insights for subsequent intelligent generalization and a method for adaptively selecting among multiple generalization operators, which is a key step toward automatic building generalization. Meanwhile, when large-scale data are used to update small-scale data in a cascade, downscaled updating of multiscale data can be realized by detecting the altered buildings and determining their optimal generalization operators.

This study also provided a detailed description of edge features based on GAT, which can model the relationship between two buildings from various perspectives. The experimental results showed that the proposed model can effectively describe the situation of the surrounding buildings. The SNGNN model extracts the characteristics of a building and its environment and continually optimizes the objective function during iteration to achieve good accuracy. This type of description and learning model is general and can be applied to building clustering, building group pattern recognition, and other problems in which the relationships among buildings must be described.

There are certain limitations to this study. First, the sample datasets in this study were extracted and learned only for the building data at 1:10,000 and 1:25,000. All intelligent methods rely on high-quality samples. Touya et al. (Citation2019) indicated that the knowledge of datasets should be learned through artificial intelligence, and the main responsibility of cartographers is to design reasonable sample datasets. Hence, developing rich samples is essential for reflecting the generalization changes of buildings with different distribution characteristics and scale data. Second, the proposed SNGNN only determines the optimal generalization operator that should be used for each building and does not address the problem of implementing specific generalization algorithms. In our subsequent study, we will focus on sample enrichment and implementation of the generalization algorithm after determining the optimal generalization operators.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Data availability statement

The data and codes that support the findings of this study are available with the identifier(s) at the private link: https://doi.org/10.6084/m9.figshare.23118758.

Additional information

Funding

This work was supported in part by the National Natural Science Foundation of China [number 42271463, 42101453, 42371461] and the Natural Science Foundation for Distinguished Young Scholars of Henan Province [number 212300410014].

References

  • Abbaspour RA, Chehreghan A, Chamani M. 2021. Multi-scale polygons matching using a new geographic context descriptor. Appl Geomat. 13(4):885–899. doi:10.1007/s12518-021-00396-x.
  • Abhilash B, Busari E, Syranidou C, Linssen J, Stolten D. 2022. Classification of building types in Germany: a data-driven modeling approach. Data. 7(4):45. doi:10.3390/data7040045.
  • Ai T, Cheng X, Liu P, Yang M. 2013. A shape analysis and template matching of building features by the Fourier Transform Method. Comp Environ Urban Syst. 41:219–233. doi:10.1016/j.compenvurbsys.2013.07.002.
  • Ai T. 2021. Some thoughts on deep learning enabling cartography. Acta Geodaetica et Cartographica Sinica. 50:1170–1182. doi:10.11947/j.AGCS.2021.20210091.
  • Altmann A, Toloşi L, Sander O, Lengauer T. 2010. Permutation importance: a corrected feature importance measure. Bioinformatics. 26(10):1340–1347. doi:10.1093/bioinformatics/btq134.
  • An X, Zhu Y, Yan X. 2023. Building selection method supported by convolutional neural network. Acta Geodaetica et Cartographica Sinica. 52:1574–1583. doi:10.11947/j.AGCS.2023.20220216.
  • Balboa JL, Ariza-López FJ. 2008. Generalization-oriented road line classification by means of an artificial neural network. GeoInformatica. 12:289–312. doi:10.1007/s10707-007-0026-z.
  • Basaraner M, Cetinkaya S. 2017. Performance of shape indices and classification schemes for characterising perceptual shape complexity of building footprints in GIS. International Journal of Geographical Information Science. 31(10):1952–1977. doi:10.1080/13658816.2017.1346257.
  • Bouvrie JV. 2006. Notes on convolutional neural networks. Cambridge, MA: Massachusetts Institute of Technology.
  • Chen J, Chen H. 2021. Edge-featured graph attention network. arXiv preprint arXiv:2101.07671. doi:10.48550/arXiv.2101.07671.
  • Cheng B, Liu Q, Li X. 2015. Local perception-based intelligent building outline aggregation approach with back propagation neural network. Neural Process Lett. 41(2):273–292. doi:10.1007/s11063-014-9345-x.
  • Courtial A, Ayedi AE, Touya G, Zhang X. 2020. Exploring the Potential of deep learning segmentation for mountain roads generalisation. IJGI. 9(5):338. doi:10.3390/ijgi9050338.
  • Courtial A, Touya G, Zhang X. 2021a. Can graph convolution networks learn spatial relations? Abstr Int Cartogr Assoc. 3:1–2. doi:10.5194/ica-abs-3-60-2021.
  • Courtial A, Touya G, Zhang X. 2021b. Generative adversarial networks to generalize urban areas in topographic maps. In: XXIV ISPRS Congress (2021 Edition). p. 15–22. Nice (online), France. doi:10.5194/isprs-archives-XLIII-B4-2021-15-2021. Hal-03306595
  • Courtial A, Touya G, Zhang X. 2022. Representing vector geographic information as a tensor for deep learning based map generalisation. AGILE GIScience Ser. 3:1–8. doi:10.5194/agile-giss-3-32-2022.
  • Deng M, Tang J, Liu Q, Wu F. 2018. Recognizing building groups for generalization: a comparative study. Cartogr Geogr Inform Sci. 45(3):187–204. doi:10.1080/15230406.2017.1302821.
  • Duchêne C, Touya G, Taillandier P, Gaffuri J, Ruas A, Renard J. 2018. Multi-agents systems for cartographic generalization: feedback from past and on-going research. PhD dissertation, Institut National de l’Information Géographique et Forestière doi:10.13140/RG.2.2.35489.92006.
  • Feng Y, Thiemann F, Sester M. 2019. Learning cartographic building generalization with deep convolutional neural networks. IJGI. 8(6):258. doi:10.3390/ijgi8060258.
  • Fiedukowicz A. 2020. The role of spatial context information in the generalization of geographic information: using reducts to indicate relevant attributes. IJGI. 9(1):37. doi:10.3390/ijgi9010037.
  • Galkin F, Aliper A, Kuznetsov I, Gladyshev VN, Zhavoronkov A. 2018. Human microbiome aging clocks based on deep learning and tandem of permutation feature importance and accumulated local effects. bioRxiv. doi:10.1016/j.isci.2020.101199.
  • Gong L, Cheng Q. 2019. Exploiting edge features for graph neural networks. arXiv preprint arXiv:1809.02709. doi:10.48550/arXiv.1809.02709.
  • Hu Y, Liu C, Li Z, Xu J, Han Z, Guo J. 2022. Few-shot building footprint shape classification with relation network. IJGI. 11(5):311. doi:10.3390/ijgi11050311.
  • Karsznia I, Sielicka K. 2020. When traditional selection fails: how to improve settlement selection for small-scale maps using machine learning. IJGI. 9(4):230. doi:10.3390/ijgi9040230.
  • Karsznia I, Wereszczyńska K, Weibel R. 2022. Make it simple: effective road selection for small-scale map design using decision-tree-based models. IJGI. 11(8):457. doi:10.3390/ijgi11080457.
  • Kipf T, Welling M. 2017. Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907. doi:10.48550/arXiv.1609.02907.
  • Lamy S, Ruas A, Demazeau Y, Jackson M, Mackaness WA, Weibel R. 1999. The application of agents in automated map generalization. In: Proceedings of the 19th ICA Meeting, Ottawa, Canada. p. 160–169.
  • Lee J, Jang H, Yang J, Yu K. 2017. Machine learning classification of buildings for map generalization. IJGI. 6(10):309. doi:10.3390/ijgi6100309.
  • Li C, Wu W, Yin Y, Wu P, Wu Z. 2022. A multi-scale partitioning and aggregation method for large volumes of buildings considering road networks association constraints. Transactions in GIS. 26(2):779–798. doi:10.1111/tgis.12885.
  • Li Y, Lu X, Yan H, Wang W, Li P. 2022. A skeleton-line-based graph convolutional neural network for areal settlements' shape classification. Appl Sci. 12(19):10001. doi:10.3390/app121910001.
  • Li Z, Yan HW, Ai T, Chen J. 2004. Automated building generalization based on urban morphology and gestalt theory. Int J Geogr Inf Sci. 18(5):513–534. doi:10.1080/13658810410001702021.
  • Li Z. 2007. Digital map generalization at the age of enlightenment: a review of the first forty years. The Cartographic Journal. 44(1):80–93. doi:10.1179/000870407X173913.
  • Lin T, Goyal P, Girshick R, He K, Dollár P. 2017. Focal loss for dense object detection. IEEE Trans Pattern Anal Mach Intell. 42(2):318–327. doi:10.1109/TPAMI.2018.2858826.
  • Zhang L, Deng H, Chen D, Wang Z. 2013. A spatial cognition-based urban building clustering approach and its applications. Int J Geogr Inf Sci. 27(4):721–740. doi:10.1080/13658816.2012.700518.
  • Liu C, Qian H, Wang X, He H, Xie L, Wang C. 2016. Linkage matching method of road and residential land using urban skeleton line network. Acta Geodaetica et Cartographica Sinica. 45:1485–1494. doi:10.11947/j.AGCS.2016.20160221.
  • Liu C, Qian H, He H, Wang X, Xie L. 2016. A construction method of road and residence correlation based on urban skeleton network. In: Yuan H, Geng J, Bian F, editors. Geo-Spatial Knowledge and Intelligence. GRMSE 2016. Communications in Computer and Information Science, vol 699. Singapore: Springer.
  • Liu C, Zhai R, Qian H, Gong X, Wang A, Wu F. 2023. Identification of drainage patterns using a graph convolutional neural network. Transactions in GIS. 27(3):752–776. doi:10.1111/tgis.13041.
  • Lyu Z, Sun Q, Ma J, Xu Q, Li Y, Zhang F. 2022. Road network generalization method constrained by residential areas. IJGI. 11(3):159. doi:10.3390/ijgi11030159.
  • MacEachren AM. 1985. Compactness of geographic shape: comparison and evaluation of measures. Geografiska Annaler: Series B, Human Geography. 67(1):53–67. doi:10.2307/490799.
  • Peura M, Iivarinen J. 1997. Efficiency of simple shape descriptors. In: Cordella LP, Arcelli C, Sanniti di Baja G, editors. Advances in Visual Form Analysis: proceedings of the 3rd International Workshop on Visual Form. Singapore: World Scientific. p. 443–451.
  • Rieger MK, Coulson MR. 1993. Consensus or confusion: cartographers’ knowledge of generalization. Cartographica. 30(2-3):69–80. doi:10.3138/M6H4-1006-6422-H744.
  • Rosin PL. 2000. Measuring shape: ellipticity, rectangularity, and triangularity. Machine Vision and Applications. 14(3):172–184. doi:10.1007/s00138-002-0118-6.
  • Ruas A, Duchene C. 2007. A prototype generalisation system based on the multi-agent system paradigm. In: Mackaness WA, Ruas A, Sarjakoski LT, editors. Generalisation of Geographic Information. Amsterdam: Elsevier. p. 269–284.
  • Sester M, Feng Y, Thiemann F. 2018. Building generalization using deep learning. Int Arch Photogramm Remote Sens Spatial Inf Sci. XLII-4:565–572. doi:10.5194/isprs-archives-XLII-4-565-2018.
  • Shea KS, McMaster RB. 1989. Cartographic generalization in a digital environment: when and how to generalize. In: International Symposium on Computer-Assisted Cartography. p. 56–67.
  • Shuai Y, Shuai H, Ni L. 2007. Polygon cluster pattern recognition based on new visual distance. In: Proceedings Volume 6753, Geoinformatics 2007: Geospatial Information Science, Nanjing, China; 675316. doi:10.1117/12.761778.
  • Smilkov D, Thorat N, Kim B, Viégas F, Wattenberg M. 2017. SmoothGrad: removing noise by adding noise. arXiv preprint arXiv:1706.03825. doi:10.48550/arXiv.1706.03825.
  • Steiniger S, Taillandier P, Weibel R. 2010. Utilising urban context recognition and machine learning to improve the generalisation of buildings. Int J Geogr Inf Sci. 24(2):253–282. doi:10.1080/13658810902798099.
  • Susetyo DB, Hidayat F. 2019. Specification of map generalization from large scale to small scale based on existing data. IOP Conf Ser Earth Environ Sci. 280(1):012026. doi:10.1088/1755-1315/280/1/012026.
  • Touya G, Zhang X, Lokhat I. 2019. Is deep learning the new agent for map generalization? Int J Cartogr. 5(2-3):142–157. doi:10.1080/23729333.2019.1613071.
  • Touya G. 2021. Multi-criteria geographic analysis for automated cartographic generalization. The Cartographic Journal. 59(1):18–34. doi:10.1080/00087041.2020.1858608.
  • Touya G, Lokhat I. 2020. Deep learning for enrichment of vector spatial databases. ACM Trans Spatial Algorithms Syst. 6(3):1–21. doi:10.1145/3382080.
  • Touya G, Duchêne C, Ruas A. 2010. Collaborative generalisation: formalisation of generalisation knowledge to orchestrate different cartographic generalisation processes. In: Geographic Information Science: 6th International Conference, GIScience 2010, Zurich, Switzerland, September 14–17, 2010. Proceedings 6, Springer, Berlin. p. 264–278.
  • Veličković P, Cucurull G, Casanova A, Romero A, Liò P, Bengio Y. 2018. Graph attention networks. arXiv preprint arXiv:1710.10903. doi:10.48550/arXiv.1710.10903.
  • Wang W, Du S, Guo Z, Luo L. 2015. Polygonal clustering analysis using multilevel graph‐partition. Transactions in GIS. 19(5):716–736. doi:10.1111/tgis.12124.
  • Wang X, Qian H, He H, Chen J, Hu H. 2015. Matching multi-source residential land by blank area skeleton line mesh. Acta Geodaetica et Cartographica Sinica. 44:927–935. doi:10.11947/j.AGCS.2015.20140462.
  • Wang Z, Chen J, Chen H. 2021. EGAT: edge-featured graph attention network. In: Farkaš I, Masulli P, Otte S, Wermter S, editors. Artificial Neural Networks and Machine Learning – ICANN 2021. Lecture Notes in Computer Science, vol 12891. Cham: Springer International Publishing.
  • Ware JM, Jones CB, Thomas N. 2003. Automated map generalization with multiple operators: a simulated annealing approach. Int J Geogr Inf Sci. 17(8):743–769. doi:10.1080/13658810310001596085.
  • Wu F, Du J, Qian H, Zhai R. 2022. The development and thinking of map comprehensive intelligence research. Wuhan Daxue Xuebao. 47:1675–1687.
  • Xie L, Qian H, He H, Liu C, Duan P. 2017. Settlement selection method based on case reasoning. Acta Geodaetica et Cartographica Sinica. 46:1910–1918. doi:10.11947/j.AGCS.2017.20170061.
  • Xu K, Hu W, Leskovec J, Jegelka S. 2018. How powerful are graph neural networks? arXiv preprint arXiv:1810.00826. doi:10.48550/arXiv.1810.00826.
  • Yan H, Weibel R, Yang B. 2008. A multi-parameter approach to automated building grouping and generalization. Geoinformatica. 12(1):73–89. doi:10.1007/s10707-007-0020-5.
  • Yan X, Ai T, Yang M, Yin H. 2019. A graph convolutional neural network for classification of building patterns using spatial vector data. ISPRS J Photogramm Remote Sens. 150:259–273. doi:10.1016/j.isprsjprs.2019.02.010.
  • Yan X, Ai T, Yang M, Tong X, Liu Q. 2020. A graph deep learning approach for urban building grouping. Geocarto Int. 37(10):2944–2966. doi:10.1080/10106049.2020.1856195.
  • Yan X, Chen H, Huang H, Liu Q, Yang M. 2021. Building typification in map generalization using affinity propagation clustering. IJGI. 10(11):732. doi:10.3390/ijgi10110732.
  • Yan X, Yuan T, Yang M. 2022. Building shape characterization representation and adaptive simplification methods. Acta Geodaetica et Cartographica Sinica. 51:269–278. doi:10.11947/j.AGCS.2022.20210302.
  • Yang M, Yuan T, Yan X, Ai T, Jiang C. 2022. A hybrid approach to building simplification with an evaluator from a backpropagation neural network. Int J Geogr Inf Sci. 36(2):280–309. doi:10.1080/13658816.2021.1873998.
  • Yang Y, Li D. 2020. NENN: incorporate node and edge features in graph neural networks. In: Proceedings of the 12th Asian Conference on Machine Learning, PMLR 129. p. 593–608.
  • Zaremba W, Sutskever I, Vinyals O. 2014. Recurrent neural network regularization. arXiv preprint arXiv:1409.2329. doi:10.48550/arXiv.1409.2329.
  • Zhang X, Ai T, Stoter J. 2008. The evaluation of spatial distribution density in map generalization. In: The XXI Congress of the International Society for Photogrammetry and Remote Sensing (ISPRS 2008). p. 181–188.
  • Zhang X, Ai T, Stoter J, Kraak M-J, Molenaar M. 2011. Building pattern recognition in topographic data: examples on collinear and curvilinear alignments. GeoInformatica. 17(1):1–33. doi:10.1007/s10707-011-0146-3.
  • Zhang X, Yin W, Yang M, Ai T, Stoter J. 2018. Updating authoritative spatial data from timely sources: a multiple representation approach. Int J Appl Earth Obs Geoinf. 72:42–56. doi:10.1016/j.jag.2018.05.022.
  • Zhang Z, Liu T, Du P, Yang G. 2022. DGCNN recognition method of spatial map convolutional model of typical building group pattern. Wuhan Daxue Xuebao. 16:11–13.
  • Zhao L, Akoglu L. 2019. PairNorm: tackling oversmoothing in GNNs. arXiv preprint arXiv:1909.12223. doi:10.48550/arXiv.1909.12223.
  • Zhou S, Regnauld N, Roensdorf C. 2009. Generalisation log for managing and utilising a multi-representation spatial database in map production. Comput Environ Urban Syst. 33(5):334–348. doi:10.1016/j.compenvurbsys.2009.06.004.