Research Article

Simulating inter-city population flows based on graph neural networks

Article: 2331223 | Received 08 Nov 2023, Accepted 11 Mar 2024, Published online: 25 Mar 2024

Abstract

Inter-city population mobility, a critical phenomenon in the modern urbanisation process, is closely related to urban industrial structure and socioeconomic development. To investigate the dynamics of population flows and their intricate ties to industrial structure, we employ a graph neural network (GNN) method to simulate inter-city population flows in China. The method efficiently integrates demographic and socioeconomic data with Tencent migration big data while accounting for geographical relationships between cities. The results show that the model’s predictive accuracy, measured by the CPC index, was high for road and rail traffic and moderate for air transportation. A comparison with real-world data verified the model’s effectiveness in predicting the urban hierarchy and regional aggregation of flows. Using GNNExplainer, we found that population size positively influenced population flow, while developed manufacturing reduced population mobility for road and rail traffic but increased it for air transportation. By conducting scenario simulations in Northeast China, we found that enhancing the region’s industry and consumer service industry could mitigate negative population outflows. The conclusions drawn from this study offer valuable perspectives to policymakers and urban planners, enabling them to make well-informed and judicious choices concerning urban planning, transportation, and resource allocation.

1. Introduction

Population mobility is closely associated with economic advancement, industrial configuration, and residential livelihood (Hong et al. 2019; Zhang et al. 2020; Zhao et al. 2021), and it serves to propel urbanization, optimize industrial growth, and alleviate disparities between regions (Bhagat and Mohanty 2009; He et al. 2016). Concurrently, it provides a reference for conceptualizing innovative models of regional economic progression, thereby fostering equitable growth across disparate locales (Liu Wangbao 2016; Pan and Lai 2019).

Population flow data enable the quantification of individual movements across geographical spaces during a specific time interval. Numerous studies in geography have analyzed population flow data and proposed corresponding models. The first theory of population flow patterns held that the interaction between two cities is proportional to their population sizes and inversely proportional to the intervening distance (Carey 1859). This theory was subsequently revised as the gravity model, in which population flows increase with the locations’ populations and decrease with the distance between them (Zipf 1946). The size of a location can be represented using different attributes, such as GDP (Wajdi et al. 2017) or employment data (Devi and Sudarsan 2021). In the practical application of population flow simulation, the gravity model encounters the challenge of parameter calibration, leading to a substantial underestimation of deviations between actual and simulated results (Zhao et al. 2019).
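As an illustration of the gravity model described above, the following minimal sketch computes a flow that grows with both populations and decays with distance. The function name, the scaling constant `k`, and the distance exponent `beta` are hypothetical choices for illustration and are not taken from the cited studies.

```python
def gravity_flow(pop_origin, pop_dest, distance, k=1.0, beta=2.0):
    """Classical gravity estimate of the flow between two locations.

    k (scaling constant) and beta (distance-decay exponent) are
    hypothetical parameters that would normally be calibrated on data.
    """
    if distance <= 0:
        raise ValueError("distance must be positive")
    return k * pop_origin * pop_dest / distance ** beta


# Doubling the distance with beta = 2 cuts the estimated flow to a quarter.
near = gravity_flow(1_000_000, 500_000, distance=100.0)
far = gravity_flow(1_000_000, 500_000, distance=200.0)
```

The need to calibrate `k` and `beta` for each application is the parameter calibration challenge mentioned above.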

In addition to the gravity model, the intervening opportunities model was among the first models used to describe population mobility (Stouffer 1940). It assumes that the number of travellers between an origin and a destination is proportional to the number of opportunities at that distance and inversely proportional to the number of intervening opportunities at intermediate destinations. The concept of intervening opportunities was later generalized by analogy with the process of particle radiation in physics, giving rise to the radiation model (Simini et al. 2012), which divides the choice of destination into two steps. In the first step, the attractiveness of each destination follows a specific distribution. In the second step, the traveller selects the destination nearest to the origin whose attractiveness exceeds a threshold. However, although the radiation model can predict commuting trips, it is not suitable for predicting general inter-city travel (Liu and Yan 2020). This limitation arises because travellers do not always choose the nearest city as their destination; they are also likely to consider highly attractive cities that are far from the origin (Jia et al. 2022).

Although the gravity and radiation models reveal general patterns in population mobility, their formulations are constrained by linear relationships among independent variables, making it challenging to describe complex nonlinear flow patterns. The integration of Artificial Intelligence (AI) methods into the mobility domain has overcome these limitations, enabling the prediction of flows in a nonlinear manner. Considering population mobility as a spatial interaction between origin supply and destination demand (Rodrigue 2020), advanced techniques such as the Convolutional Neural Network (CNN) (Rong et al. 2019), the Graph Attention Network (GAT) (Liu et al. 2020), and ensemble learning (Chen et al. 2023) have been employed to capture geographical contextual features and spatial relationships for predicting intra-city mobility. For inter-city mobility prediction, Robinson and Dilkina (2018) developed a general population migration estimation model based on machine learning models. Spadon et al. (2019) took into account the network structure of inter-city mobility, employing machine learning classification to predict population flow and reconstruct the inter-city commuter network. Simini et al. (2021) extended the original gravity model to the Deep Gravity model (DG), based on deep neural networks (DNNs) with 19 urban indicators as model inputs, which exhibits good generalization capability and interpretability. These studies consistently show that AI-based models achieve higher accuracy than traditional models (e.g. the gravity and radiation models) in population mobility prediction.

GNNs are a class of deep learning methods for processing graph-structured data and are widely used to model the connections between entities in various domains, such as social networks, recommendation systems, and biological networks (Zhou et al. 2020). They have also become a popular research topic in geographic studies (Zhang et al. 2022). Although GNNs have not been used to predict inter-city population flows, their effectiveness in capturing spatial correlation and forecasting spatial flows has been demonstrated in predicting origin-destination mobility flows and imputing missing spatial flows within cities (Chen 2020; Yao et al. 2021). In this study, we adopted a GNN method to simulate inter-city population flows. We chose GNNs over other deep learning methods because they can extract the attributes and connections of cities and perform computations directly on the graph, thereby enhancing computing efficiency and accuracy.

The configuration of industrial structure has a significant impact on the dynamics of inter-city population flows (Zhang et al. 2020), and its upgrading and optimization bear relevance to the patterns of population influx (Cheong and Wu 2014). This statement is supported by the fact that over the past four decades of reform and opening-up, eastern China has experienced remarkable economic growth and industrial transformation, thereby attracting a large floating population (You et al. 2018). Previous studies investigating the relationship between inter-city population movement and industrial structure have primarily used statistical analyses and empirical investigations (Wang et al. 2019; Cao et al. 2023), or have included industrial structure as a factor in the gravity model (Wang et al. 2022). However, relatively few studies have simulated population mobility while accounting for the evolution of industrial structure. Consequently, it becomes imperative to understand, through modelling and simulation, the intricate interdependence between industrial structure and inter-city population flows.

Our study aims to simulate inter-city population flows with industrial structure integrated into the model, yielding a more accurate and widely applicable model. This holistic approach not only provides an explanatory mechanism that deepens understanding of the relationship between industrial structure and inter-city population flows, but also holds the potential to furnish predictions of future population mobility under a given course of industry system development.

To investigate the effects of the development of industrial sectors on inter-city population flows, we utilized the technique of Explainable AI (Xu et al. 2019). This approach enhances the transparency, interpretability, and reliability of AI systems by providing a clear understanding of the model’s operation, the variables it considers, and the reasoning behind its predictions or decisions (Burkart and Huber 2021). In addition, we conducted simulated scenario analyses in Northeast China with varying industry systems to further explore the impacts on population mobility. The findings of this study may offer relevant government departments new insights on enhancing employment structures, promoting industrial upgrading, establishing a rational flow of labour, and achieving a positive interaction between industrialization and urbanization through industrial policies.

2. Materials and methods

2.1. Data description

Two sets of data are involved in the analysis. The first set is the Tencent migration big data, which records inter-city population flows in China. These data come from the Tencent Migration Platform (https://heat.qq.com/qianxi.php) and capture the changes in users’ geographical locations through their use of services provided by Tencent, the largest social media company in China, to present real-time and dynamic representations of daily population flows between cities. By studying and matching fluctuations in users’ geographical locations, the Tencent migration big data have cumulatively recorded the inter-city mobility patterns of hundreds of millions of people, including transportation modes. It is worth noting that the dataset represents only the movements of Tencent’s social media users and not the actual number of travellers. Nonetheless, given Tencent’s wide user base, including more than 1 billion and 800 million monthly active users for WeChat and QQ, respectively, and Tencent’s location-based services receiving about 55 billion location call requests per day, Tencent’s data cover a significant portion of smartphone users. For these reasons, the Tencent migration big data are generally considered representative for research on population mobility and are believed to adequately reflect the actual situation of inter-city travel. It should also be acknowledged that, for each city, the Tencent migration big data include the union of the top 10 cities with the highest travel volumes for each mode of transport (move-in and move-out, respectively). Even so, when aggregated across the 366 cities in the dataset, the data can still provide significant insights into inter-city travel. The dataset provides the cumulative population flows between cities over a whole year, from 1 July 2018 to 30 June 2019. The transport modes considered in the data include road, rail, and air.
In the following analysis, we used the average daily population flows after removing outlier records with insignificant mean population flows ($\bar{t} < 10$). After processing, the record counts for road, rail, and air travel were 18895, 24758, and 18484, respectively. For the 366 cities, each originating city has data for more than 50 destinations on average.

The second dataset is the city-level socioeconomic data, including population and employment, collected from the China City Statistical Yearbook, which was obtained from the National Bureau of Statistics of China (http://www.stats.gov.cn/). The city-level population data are the numbers of permanent residents in 2018. The city-level employment statistics provide the number of employed persons in 19 economic sectors of the primary, secondary, and tertiary industries. The statistics for 2016, 2017, and 2018 were collected, and the three-year average, representing stable economic conditions in each city, was used in the subsequent analysis.

2.2. Methods

This section describes the architecture of the GNN-based model used for simulating inter-city population flows. As shown in Figure 1, we developed this model in three steps: 1) We constructed a city graph with the locations of cities as nodes and the connections between them as edges. A set of socioeconomic variables (see Section 2.2.1) was used as node attributes to represent the characteristics of cities. The distance between cities was also considered to represent their spatial relationships. 2) The Graph Sample and Aggregation (GraphSAGE) algorithm was performed on the city graph to generate embeddings of city nodes by sampling and aggregating node features from their local neighbourhood. The principle of GraphSAGE can be found in Section 2.2.2. 3) On top of GraphSAGE, we built an edge regression model to predict the inter-city population flows using the high-level features. The details can be found in Section 2.2.3.

Figure 1. A model structure for simulating inter-city population flows based on GNN.


2.2.1. City graph construction

In this paper, we construct a city graph to facilitate the calculation of city attributes and inter-city relationships. Generally, a graph consists of vertices (or nodes) and edges and can be represented as $G = \{V, E, W\}$, with nodes $V = \{v_1, \ldots, v_N\}$ and edges $E = \{e_1, \ldots, e_M\}$. The connectivity of the nodes in the graph is represented by an adjacency matrix $W \in \mathbb{R}^{N \times N}$, whose element $w_{i,j}$ indicates whether nodes $v_i$ and $v_j$ are connected: $w_{i,j} = 1$ means connected, otherwise $w_{i,j} = 0$. As an alternative to the binary $w_{i,j}$, the inverse distance between nodes $i$ and $j$ can be used to represent $w_{i,j}$. In this case, higher values of $w_{i,j}$ indicate closer connections (Sandryhaila and Moura 2013).

Specifically, we conceptualized cities as nodes ($V$), and for each city node $k$, we use a vector $x_k = [x_{k1}, x_{k2}, x_{k3}, \ldots, x_{kd}]$ to represent its socioeconomic characteristics, where $d$ is the number of vector elements considered (i.e. the city-level population and employment statistics). The vectors of all city nodes formed a node attribute matrix, denoted as $X_{n \times d}$, where $n$ is the number of nodes. We further calculated the normalized inverse Euclidean distance between all pairs of nodes to form a weighted adjacency matrix $W \in \mathbb{R}^{n \times n}$ to represent connectivity. The interactions between cities (nodes) were represented as edges ($E = \{e_1, \ldots, e_m\}$), and the inter-city population flows were represented as their properties ($T = \{t_1, \ldots, t_m\}$).
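The weighted adjacency matrix described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the text does not specify the normalization scheme, so scaling by the largest entry is an assumption, and the function name and coordinates are hypothetical.

```python
import math


def inverse_distance_adjacency(coords):
    """Build W where w[i][j] is the inverse Euclidean distance between
    cities i and j, scaled so that the largest entry equals 1.

    Assumes at least two cities with distinct coordinates. Max-scaling is
    an assumed normalization; the diagonal is left at 0.
    """
    n = len(coords)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                w[i][j] = 1.0 / math.dist(coords[i], coords[j])
    w_max = max(max(row) for row in w)
    return [[value / w_max for value in row] for row in w]


# Three hypothetical cities on a line, 5 and 10 distance units apart.
coords = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
W = inverse_distance_adjacency(coords)
```

With this scaling, the closest pair of cities receives weight 1 and more distant pairs receive proportionally smaller weights.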

In this way, the problem of inter-city population flow prediction is transformed into the prediction of edge properties in a city graph. The prediction of population flows involved two procedures: (1) the GraphSAGE algorithm, which extracted features from the nodes’ attributes and connectivity, and (2) the edge regression operation, which predicted population flows between city nodes using the extracted features. Details can be found in the following sections.

2.2.2. Feature aggregation

Inter-city population flows are affected not only by the conditions at the origin and destination but also by the conditions and interactions of neighbouring cities (Choe and Laquian 2008). Therefore, the prediction of a city’s population flows should consider its local neighbourhood. To this end, GraphSAGE was applied to the city graph to aggregate node features. GraphSAGE is a sampling-based GNN that uses random walks to sample a number of subgraphs from the entire graph and then performs local aggregation of the nodes in each subgraph, generating a representation of each node. In contrast to other graph prediction methods, such as graph convolutional networks, which require information from all nodes when computing node embeddings, GraphSAGE focuses on information from key neighbouring nodes. It demonstrates notable adaptability to diverse neighbour relationships among nodes (Hamilton et al. 2017). GraphSAGE is also well suited to processing large-scale graph data owing to its scalability and sampling efficiency. In this study, we chose GraphSAGE over other GNN algorithms because of its better performance (as shown in Supplementary Table 1).

The GraphSAGE embedding generation (i.e. forward propagation) algorithm can be represented by the following procedures (Xiao et al. 2019). Firstly, each node $i$ aggregates the neighbour information and feature information of adjacent nodes $N(i)$ into the neighbourhood vector $h_{N(i)}^{l}$, which is calculated from the representations generated during the preceding iteration:

$$h_{N(i)}^{l} = \mathrm{AGGREGATE}\left(\{ w_{ji} h_{j}^{l-1},\ j \in N(i) \}\right) \tag{1}$$

where the initial representations of nodes are the input node attributes $h_i^{0} = x_i$. The output representations of nodes are the output node features $h_i^{L} = z_i$, in which $L$ refers to the total number of GraphSAGE layers. AGGREGATE refers to the aggregator function that aggregates neighbourhood information. $w_{ji}$ is the weighted adjacency on the edge from node $j$ to node $i$.

Secondly, the node’s current representation $h_i^{l}$ is obtained by combining the aggregated neighbourhood vector $h_{N(i)}^{l}$ with the central node information $h_i^{l-1}$ through the CONCAT(⋅) function, and is embedded by the following equation:

$$h_i^{l} = \sigma\left(w^{l} \cdot \mathrm{CONCAT}\left(h_i^{l-1}, h_{N(i)}^{l}\right)\right) \tag{2}$$

where $w^{l}$ is the weight matrix of the $l$th layer and $\sigma(\cdot)$ is the activation function.

Finally, each node in the graph performs the calculations described above, and then performs the next round of aggregation and embedding. After each round of calculations, the resulting features need to be standardized:

$$h_i^{l} = \mathrm{norm}\left(h_i^{l-1}\right) \tag{3}$$
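The three steps above can be sketched as a single simplified layer in plain Python. This is an illustration under stated assumptions rather than the model used in the study: it uses a weighted mean aggregator instead of the pooling aggregator, ReLU as the activation, and L2 normalization for the standardization step, and the function and variable names are hypothetical.

```python
import math


def sage_layer(h, w_adj, weight):
    """One simplified GraphSAGE-style layer.

    h:      node features from the previous layer, shape (n, d_in)
    w_adj:  weighted adjacency, w_adj[j][i] weights the edge j -> i
    weight: layer weight matrix, shape (d_out, 2 * d_in)
    """
    n, d_in = len(h), len(h[0])
    out = []
    for i in range(n):
        neighbours = [j for j in range(n) if j != i and w_adj[j][i] > 0]
        # Step 1: aggregate weighted neighbour features (mean aggregator assumed).
        agg = [0.0] * d_in
        for j in neighbours:
            for k in range(d_in):
                agg[k] += w_adj[j][i] * h[j][k] / len(neighbours)
        # Step 2: concatenate self and neighbourhood vectors, project, apply ReLU.
        concat = h[i] + agg
        z = [max(0.0, sum(row[k] * concat[k] for k in range(2 * d_in)))
             for row in weight]
        # Step 3: standardize the result (L2 normalization assumed).
        norm = math.sqrt(sum(v * v for v in z)) or 1.0
        out.append([v / norm for v in z])
    return out


# Three fully connected hypothetical nodes with one input feature each.
h0 = [[1.0], [2.0], [3.0]]
w_adj = [[0.0, 1.0, 1.0], [1.0, 0.0, 1.0], [1.0, 1.0, 0.0]]
weight = [[1.0, 0.0], [0.0, 1.0]]  # identity: keeps [self, neighbourhood]
h1 = sage_layer(h0, w_adj, weight)
```

Stacking several such layers lets each node's representation draw on cities that are more than one hop away.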

In this study, we implemented three GraphSAGE layers to process the node attribute matrix $X_{n \times d}$ and the weighted adjacency matrix $W_{n \times n}$ (Figure 2) using the pooling aggregator function. Each GraphSAGE layer is followed by a BatchNorm layer and a ReLU activation layer. The BatchNorm layer helps avoid vanishing gradients and accelerates training, while the ReLU activation layer improves the fitting ability of the network. The output of the model is a convolutional feature matrix whose elements represent the convolutional features of the nodes, where the feature of each node is one-dimensional. To avoid overfitting, the hidden feature dimensionality of the convolutional layers is set to 256 and a dropout value of 0.5 is employed.

Figure 2. A Three-layer GraphSAGE model is used to extract features of nodes.


2.2.3. Population flows prediction

The features extracted by GraphSAGE were used to predict the population flows between city nodes based on edge regression. Similar to conventional regression methods, the purpose of edge regression is to find a quantitative relationship that predicts a certain edge property from the given features. The edge property can usually be expressed as a function of the hidden representations of the edge’s end nodes and, optionally, the features on the edge itself.

In the proposed city graph, the population flow between nodes $i$ and $j$, denoted as $t_k \in (0, +\infty)$, was approximated as the product of the features of nodes $i$ and $j$ and their weighted adjacency value (i.e. their normalized inverse distance):

$$t_k = z_i \cdot w_{i,j} \cdot z_j \tag{4}$$

where $z_i$ and $z_j$ are the high-level hidden features of nodes $i$ and $j$, respectively, and $w_{i,j}$ is the weighted adjacency between them. By computing $t_k$ for all edges with Equation (4), a tensor $T = \{t_1, \ldots, t_m\}$ that represents inter-city population flows can be obtained, and it is the final output of the GNN-based population flow model.
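Equation (4) can be illustrated with a short sketch, assuming each node feature is a scalar (consistent with the one-dimensional output described in Section 2.2.2). The function names and the numeric values are hypothetical.

```python
def predict_flow(z_i, z_j, w_ij):
    """Edge regression of Equation (4): product of the two node
    features and their weighted adjacency value."""
    return z_i * w_ij * z_j


def predict_all(z, w, edges):
    """Apply Equation (4) to every edge (i, j) in the city graph."""
    return [predict_flow(z[i], z[j], w[i][j]) for i, j in edges]


# Hypothetical node features and normalized inverse distances.
z = [2.0, 3.0, 0.5]
w = [[0.0, 1.0, 0.5],
     [1.0, 0.0, 1.0],
     [0.5, 1.0, 0.0]]
edges = [(0, 1), (0, 2), (1, 2)]
flows = predict_all(z, w, edges)
```

Because the adjacency value multiplies the feature product, two cities with large features still receive a small predicted flow when they are far apart.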

2.2.4. Model evaluation and interpretation

The population flow data were randomly divided into a training set, a validation set, and a testing set (60%, 20%, and 20% of the records, respectively). We used the root mean square error (RMSE) as the loss function to back-propagate the errors, which is calculated as $\sqrt{\frac{1}{m}\sum_{i=1}^{m}(\tilde{t}_i - t_i)^2}$, where $\tilde{t}_i$ and $t_i$ are the predicted and observed population flows for edge $e_i$, respectively. During the parameter update phase, we employed the Adam algorithm to dynamically optimize the model parameters. The Adam algorithm significantly improves the speed of convergence by calculating an adaptive learning rate for each parameter, making the learning process more efficient (Kingma and Ba 2014). We set the learning rate to 0.001 and the number of iterations to 10000. We then repeated the experiment ten times and used the mean of the predictions ($\bar{t}$) as the results.
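The RMSE loss described above can be written out as a small helper. This is only a plain-Python sketch of the formula; the function name and sample values are hypothetical, and a training pipeline would compute the same quantity on tensors.

```python
import math


def rmse(predicted, observed):
    """Root mean square error between predicted and observed flows."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((p - o) ** 2 for p, o in zip(predicted, observed))
    return math.sqrt(squared / len(observed))


# Hypothetical predicted and observed flows on three edges.
loss = rmse([120.0, 80.0, 45.0], [100.0, 80.0, 60.0])
```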

The common part of commuters (CPC) index was used to evaluate the performance of the GNN-based population flow model. The CPC index measures the similarity between the predicted flows $\tilde{T}$ and the actual flows $T$:

$$\mathrm{CPC}(T, \tilde{T}) = \frac{2\sum_{i,j=1}^{n} \min(t_{ij}, \tilde{t}_{ij})}{\sum_{i,j=1}^{n} t_{ij} + \sum_{i,j=1}^{n} \tilde{t}_{ij}} \tag{5}$$

The value of CPC lies between 0 and 1, where CPC = 1 indicates a perfect match between predicted and actual flows, while CPC = 0 indicates no agreement between predictions and real-world data (Simini et al. 2021). Therefore, larger values of CPC suggest better performance of the model.
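A minimal sketch of the CPC computation in Equation (5) follows. Representing flows as dictionaries keyed by (origin, destination) pairs is an assumption made for illustration, as are the function name and the sample values.

```python
def cpc(actual, predicted):
    """Common part of commuters between two sets of flows.

    actual, predicted: dicts mapping (origin, destination) -> flow.
    Pairs missing from one dict are treated as zero flow.
    """
    pairs = set(actual) | set(predicted)
    overlap = sum(min(actual.get(p, 0.0), predicted.get(p, 0.0)) for p in pairs)
    total = sum(actual.values()) + sum(predicted.values())
    return 2.0 * overlap / total if total > 0 else 0.0


# Hypothetical flows between two cities in both directions.
actual = {("A", "B"): 100.0, ("B", "A"): 50.0}
predicted = {("A", "B"): 80.0, ("B", "A"): 70.0}
score = cpc(actual, predicted)
```

In this example the shared part of the flows is 80 + 50 = 130 out of a combined total of 300, so the index is 260/300.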

In this paper, GNNExplainer was utilized to understand the impact of the input cities’ socioeconomic characteristics on the model’s predicted output flows. GNNExplainer is a perturbation-based interpretation method that uses node attributions and graph structure to explain the predictions made by GNN models. It has a distinct advantage over other interpretation methods in that it uses the recursive neighbourhood-aggregation scheme of GNNs to highlight significant node feature information and provide explanations for input elements (Ying et al. 2019). After the model is constructed, this approach randomly initializes a set of soft masks and optimizes them by learning soft masks of edge and node features. Node masks, edge masks, or node feature masks are generated to account for the different contribution levels of nodes, edges, and node features (Zhdanov et al. 2022).

Therefore, we adopted this method to obtain the node feature masks and denoted the results as a matrix of shape $(n, d)$, where $n$ is the number of city nodes and $d$ is the number of socioeconomic characteristics. Because the mask values produced by GNNExplainer vary significantly across different features of the same node, we applied z-score normalization to each node’s feature masks on a row-wise basis to facilitate comparison of the contributions of different features of the same node:

$$x_{new} = \frac{x - \bar{x}}{\sigma} \tag{6}$$

where $x$ is the value of a particular feature mask for a node, $\bar{x}$ and $\sigma$ are the mean and standard deviation of all feature masks for this node, respectively, and $x_{new}$ is the value of that feature mask for that node after processing.
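The row-wise normalization of Equation (6) can be sketched as below. The use of the population standard deviation is an assumption, since the text does not state which estimator was used, and the function name and mask values are hypothetical.

```python
import math


def zscore_rows(masks):
    """Z-score normalize each row (one node's feature masks) independently.

    Uses the population standard deviation (an assumption). Rows with zero
    variance are mapped to zeros to avoid division by zero.
    """
    normalized = []
    for row in masks:
        mean = sum(row) / len(row)
        std = math.sqrt(sum((v - mean) ** 2 for v in row) / len(row))
        normalized.append([(v - mean) / std if std > 0 else 0.0 for v in row])
    return normalized


# Hypothetical feature masks for two city nodes with three features each.
masks = [[1.0, 2.0, 3.0], [0.2, 0.2, 0.8]]
scaled = zscore_rows(masks)
```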

The processed values followed a roughly normal distribution, with the majority falling within the range of −4 to 4. The absolute value indicates the magnitude of a feature’s contribution, while the sign indicates its direction: positive values indicate a positive influence and negative values a negative influence.

3. Results

3.1. Comparison of actual and predicted inter-city flows

This section begins with an analysis of the spatial pattern of inter-city flows within the dataset, denoted as actual flows. Specifically, we examined the urban hierarchy and regional aggregation of flows, aiming to uncover latent patterns. Subsequently, we carried out a comparative analysis between the actual flows and the predicted flows and employed the CPC index to validate the proposed model.

Since the edges of the city graph indicate the direction of population flow, the attractiveness or radiance of a given city node can be assessed quantitatively from its inflow or outflow, respectively. The total flow of a city node is the sum of its inflow and outflow. The importance of a city node $i$ in the inter-city flows can be quantified by calculating the total flow proportion, which is the ratio of the total flow volume of the city to the total flow volume among all cities (Gu et al. 2023):

$$\mathrm{TFP}_i = \frac{\sum_{j} t_{ij} + \sum_{j} t_{ji}}{2\sum_{i}\sum_{j} t_{ij}} \tag{7}$$
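The total flow proportion of Equation (7) can be computed as in the following sketch. The dictionary representation of flows, the function name, and the city labels are hypothetical choices for illustration.

```python
def total_flow_proportion(flows, city):
    """Share of all inter-city flow that starts or ends at `city`.

    flows: dict mapping (origin, destination) -> flow volume.
    """
    outflow = sum(v for (o, d), v in flows.items() if o == city)
    inflow = sum(v for (o, d), v in flows.items() if d == city)
    total = sum(flows.values())
    return (outflow + inflow) / (2.0 * total)


# Hypothetical directed flows among three cities.
flows = {("A", "B"): 30.0, ("B", "A"): 10.0, ("A", "C"): 40.0, ("C", "B"): 20.0}
tfp = {city: total_flow_proportion(flows, city) for city in ("A", "B", "C")}
```

Because every flow is counted once as an outflow and once as an inflow, the factor of 2 in the denominator makes the proportions of all cities sum to 1.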

The outcomes of the computations reveal that Beijing, Guangzhou, Shanghai, Shenzhen, Chengdu, and Chongqing are the top six cities in terms of total flow proportion. These cities act as central cities for the economic development of China’s four major urban agglomerations, namely Jing-Jin-Ji, Yangtze River Delta, Greater Bay Area, and Chengdu-Chongqing.

Figure 3 depicts the total flow proportions of cities, with red dots representing megacities, yellow dots representing provincial capitals or economically developed cities, and black dots representing normal prefecture-level cities. The figure also presents the actual flows $T$ and predicted flows $\tilde{T}$ whose values exceed one thousand. It reveals that inter-city road connections were primarily established among geographically proximate locations, whereas mobility between distant cities was markedly facilitated by the rapid development of convenient transportation infrastructure such as airlines and high-speed rail systems.

Figure 3. Comparison of actual and predicted inter-city population flows.


In addition, actual inter-city road traffic flows showed an urban agglomeration aggregation effect and air traffic flows showed a megacity effect, with rail traffic falling in between. The inter-city road connections were concentrated within urban agglomerations (Jing-Jin-Ji, Chengdu-Chongqing, Yangtze River Delta, and Greater Bay Area account for 4%, 4%, 10%, and 7% of total flows, respectively). These regions also hosted the busiest routes, such as Dongguan-Shenzhen, Guangzhou-Foshan, Shanghai-Suzhou, and Beijing-Langfang. Inter-city air traffic flows displayed a distinct diamond structure, with six megacities as vertices and strong air connections between them as edges ($\bar{t} > 60000$). Although there was good rail connectivity within urban agglomerations, rail traffic between megacities had the highest volume, especially on the routes radiating outward from Beijing, which is not only the capital but also the largest railroad hub in China.

Road traffic exhibited the highest accuracy for the predicted flows, with a CPC value of 0.6854, followed by rail traffic (0.6570) and air traffic (0.5325). The model was effective in predicting inter-city road and rail connections within urban agglomerations, as evidenced by the average CPC values of 0.7997 and 0.7420, respectively. Furthermore, the model accurately predicted high-value air traffic flows between megacities, with a CPC value of 0.8431.

However, the model was observed to underestimate the flows between cities over relatively long distances. This was primarily because there were more regular connections between major cities on road and rail traffic (as indicated by the red-red, yellow-yellow, and red-yellow dots). Besides, the actual values of flows between far-off medium-sized cities within the diamond structure were considerably higher for air traffic, with most of their connections being made through this mode of transportation.

The distance between cities was identified as a crucial factor in determining the most appropriate mode of transportation. Air travel was typically preferred when the distance exceeded a specific threshold (Li and Sheng 2016). Additionally, due to the proximity of cities and the well-developed road and rail networks, the population flows of air traffic did not increase with proximity within urban agglomerations, leading to a discrepancy between the predicted and actual values.

To evaluate the effectiveness of the proposed model, we conducted a comparative analysis with the classical gravity model and DG and examined their respective performances. For the classical gravity model, we considered only the populations of the origin and destination cities and the distance between them. We held the total flows from each origin constant, making it a singly constrained gravity model. The results show that our model outperforms the classical gravity model and DG in predicting population flows (as shown in Table 1), demonstrating the effectiveness and applicability of incorporating industrial structure and using the GNN algorithm for predicting population flows.
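The singly constrained gravity baseline can be sketched as follows. This is one common origin-constrained formulation and is an assumption, since the exact functional form used in the comparison is not given in the text; the function name, the power-law distance decay, and the exponent `beta` are hypothetical.

```python
def origin_constrained_gravity(outflows, populations, dist, beta=2.0):
    """Distribute each origin's fixed total outflow O_i over destinations:
    T_ij = O_i * P_j * d_ij^(-beta) / sum_k(P_k * d_ik^(-beta)).

    beta is a hypothetical distance-decay exponent to be calibrated.
    """
    n = len(populations)
    flows = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Destination attractiveness as seen from origin i (no self-flow).
        attract = [populations[j] * dist[i][j] ** (-beta) if j != i else 0.0
                   for j in range(n)]
        denom = sum(attract)
        for j in range(n):
            flows[i][j] = outflows[i] * attract[j] / denom if denom > 0 else 0.0
    return flows


# Three hypothetical cities with known total outflows.
populations = [100.0, 200.0, 300.0]
outflows = [60.0, 50.0, 40.0]
dist = [[0.0, 1.0, 2.0], [1.0, 0.0, 1.0], [2.0, 1.0, 0.0]]
baseline_flows = origin_constrained_gravity(outflows, populations, dist, beta=1.0)
```

The constraint guarantees that the predicted flows leaving each origin add up to its observed total, so only the allocation across destinations is modelled.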

Table 1. Model performance evaluation for models.

3.2. Model interpretation

The results of GNNExplainer’s analysis are depicted in Figure 4, which highlights the influential factors in predicting population flows for different modes of transport. The analysis revealed that population size was the most significant factor, demonstrating a positive correlation with population flows for all modes of transportation, regardless of its value. This finding is consistent with previous research suggesting that a city’s ability to attract or repel population is closely related to its total population size (Zhang et al. 2020).

Figure 4. Distributions of explanation value of the model’s input socioeconomic characteristics by GNNExplainer. The horizontal axis displays the socioeconomic characteristics, while the colour of the dots indicates their values. The position of the dot on the vertical axis represents the explanation value of the characteristics, showing whether they increase or decrease population flow.


The impact of public administration on population flows was mainly concentrated in an increase in the predicted flows of road and rail traffic and a decrease in air traffic. This observation may be attributed to the fact that public administration tends to attract people from nearby cities, who tend to travel by automobile or train, while having a negative effect on population flows from distant cities, where air travel is the predominant mode of transportation.

The analysis also revealed size thresholds in the secondary and tertiary industries, with population mobility affected only when a city’s employment level surpasses a certain threshold. Specifically, sectors with low employment levels (the blue dots in Figure 4) have minimal influence on predicted population flows.

High levels of education (the red dots in Figure 4) contributed to an increase in the predicted flow. This result suggests that adequate educational opportunities create favourable conditions for population mobility. It is consistent with evidence from an analysis of the floating population in China (Fan and Li 2020), which suggests that better educational resources are crucial in family mobility considerations.

Manufacturing was found to play a crucial role in the predicted population flows, with a positive effect on air transport but a negative effect on road and rail transport. This phenomenon can be explained by the push-pull theory, which suggests that attractive living conditions act as a pulling force for population movement, drawing populations towards certain areas, while unfavourable living conditions become a pushing force, encouraging emigration (Lee Citation1966). In some sense, manufacturing represents the level of industrialization of cities. Statistics show that the proportion of employed people in manufacturing is frequently high in developed cities, reaching more than 25%. This finding suggests that the growth of the manufacturing industry can generate increased employment opportunities and appealing living conditions, thereby creating an inward pulling force for air transport to attract labour from remote cities. Concurrently, this development triggers a weak outward pushing force acting upon road and rail transport, which may help retain labour in the region by discouraging relocation to neighbouring cities and regions.

3.3. Scenario simulation

Northeast China is an independent geographic, cultural, and economic region in northern China that includes the provinces of Liaoning, Jilin, and Heilongjiang, as well as four league cities in eastern Inner Mongolia. Since 2000, the region has experienced continuous and increasing population loss, with severe brain drain, an ageing population, and a highly polarized population spatial structure (Wei Qi and Liu, Citation2017). The primary cause of this population decrease is the pursuit of economic opportunities (Wu et al. Citation2019), which leads people to move to more economically prosperous areas (Jiang Citation2017).

To provide a reference for employment policies that could counteract population loss in Northeast China, we set up several industry system scenarios to evaluate how increased employment opportunities in different industry systems affect population flows. Compared with the baseline scenario, each scenario represents a doubling of employment in a specific industry system. The division into industry systems first follows the conventional grouping of the 19 industrial sectors into three overarching industries: agriculture, industry, and service industry. Only the latter two were considered because of their salient impacts. Based on the characteristics of the service industry, we further subdivided it into three categories: consumer service industry, producer service industry, and public service industry (Shao et al. Citation2017).

Industry Scenario 1 (S1-Ind): This scenario increases employment in the sectors of industry, representing the effects of the growth of urban industrialization.

Consumer Service Industry Scenario 2 (S2-Com): This scenario increases employment in consumer service industry (including real estate, etc.), indicating the impact of improvements in residents’ life quality and living standards.

Producer Service Industry Scenario 3 (S3-Prod): This scenario increases employment in producer service industry (including transportation, etc.), reflecting the influence of an upgrade in the service industry’s capacity to provide security for industrial production.

Public Service Industry Scenario 4 (S4-Pub): This scenario increases employment in public service industry (including public administration, etc.), denoting the impact of an increase in social foundation and security.
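The four scenarios above amount to perturbing the employment features of cities in the study region before re-running the model. The snippet below is a minimal sketch of that idea, not the authors' implementation: the feature-matrix layout, the column indices in `SCENARIO_COLUMNS`, and the helper `apply_scenario` are hypothetical, and only the doubling factor is taken from the text.

```python
import numpy as np

# Hypothetical mapping from scenario name to employment-feature columns.
# The actual assignment of the 19 sectors to industry systems is not given here.
SCENARIO_COLUMNS = {
    "S1-Ind": [0, 1],
    "S2-Com": [2],
    "S3-Prod": [3],
    "S4-Pub": [4],
}


def apply_scenario(features, in_region, scenario, factor=2.0):
    """Return a copy of `features` in which the scenario's employment
    columns are multiplied by `factor` for cities inside the region only."""
    features = np.asarray(features, dtype=float)
    in_region = np.asarray(in_region, dtype=bool)
    perturbed = features.copy()  # leave the baseline untouched
    cols = SCENARIO_COLUMNS[scenario]
    perturbed[np.ix_(in_region, cols)] *= factor
    return perturbed


# Toy data: 3 cities x 5 employment features; cities 0 and 2 are in the region.
baseline = np.array([
    [10.0, 5.0, 3.0, 2.0, 1.0],
    [20.0, 8.0, 6.0, 4.0, 2.0],
    [30.0, 9.0, 7.0, 5.0, 3.0],
])
in_region = np.array([True, False, True])
s1_features = apply_scenario(baseline, in_region, "S1-Ind")
```

The perturbed matrix would then be passed to the trained model in place of the baseline features, and the predicted flows compared with the baseline prediction.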

Total flow and net flow were chosen as indices to present comparative results under different scenarios. Net population inflow or outflow is the difference between the volume of population inflow and outflow, which is widely used to describe the direction and magnitude of population movements in specific geographical areas over a given period (Johnson and Beale Citation2010; Lan et al. Citation2020). Using the proposed model, we obtained a baseline of net population flows for cities in the Northeast (as shown in Figure 5, where net inflows are in red and net outflows are in blue, and the shades of colour reflect the absolute magnitude). It shows that most of the cities, predominantly located in the northern region, had net population outflows for road and rail transport, which would result in a net loss of population.
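The net flow index described above can be illustrated with a short calculation on an origin-destination matrix. This is only a sketch under stated assumptions: the matrix convention (`od[i, j]` is the flow from city `i` to city `j`) and the function name `net_flows` are illustrative choices, not taken from the paper.

```python
import numpy as np


def net_flows(od):
    """Net flow per city (inflow minus outflow) from an OD matrix.

    Assumes od[i, j] is the flow from city i to city j. Positive values
    indicate net inflow, negative values net outflow.
    """
    od = np.asarray(od, dtype=float)
    inflow = od.sum(axis=0)   # column sums: everything arriving at each city
    outflow = od.sum(axis=1)  # row sums: everything leaving each city
    return inflow - outflow


# Toy example with three cities.
od_example = np.array([
    [0, 5, 1],
    [2, 0, 3],
    [4, 1, 0],
])
net_example = net_flows(od_example)  # array([ 0.,  1., -1.])
```

Because every trip leaves one city and arrives at another, the net flows of a closed system sum to zero.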

Figure 5. Location of Northeast China and predicted net population flows of cities.

Scenario simulation results reveal that employment growth in industry and consumer service industry reduces the net population outflows in Northeast China. Figure 6 displays the ratios of net outflows to baseline for cities under different scenarios (where red indicates a large ratio, blue indicates a small ratio, and the darker the colour, the greater the degree). Meanwhile, Table 2 illustrates percentage changes in total net outflows for the four scenarios compared to the baseline. In S3-Prod and S4-Pub, the net outflows of road traffic increased significantly in some cities, intensifying the total net outflows by 21% and 19%, respectively. The total net outflows of rail traffic declined under all four scenarios, with the change in S1-Ind being the most pronounced. In S1-Ind and S2-Com, the total net outflows of air traffic declined by 60% and 65%, respectively, while they rose by 16% in S3-Prod.
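One way to compute a percentage change in total net outflows relative to the baseline is sketched below. The aggregation rule is an assumption made for illustration: total net outflow is taken as the sum of city-level net outflows (outflow minus inflow) over the region's cities, and the names `total_net_outflow` and `percent_change` are hypothetical. The toy numbers are not results from the study.

```python
import numpy as np


def total_net_outflow(od, in_region):
    """Sum of (outflow - inflow) over cities in the region.

    Assumes od[i, j] is the flow from city i to city j. Flows between two
    cities inside the region cancel out in the sum.
    """
    od = np.asarray(od, dtype=float)
    in_region = np.asarray(in_region, dtype=bool)
    net_out = od.sum(axis=1) - od.sum(axis=0)
    return net_out[in_region].sum()


def percent_change(scenario_value, baseline_value):
    """Percentage change of a scenario value relative to the baseline."""
    if baseline_value == 0:
        raise ValueError("baseline value is zero; percentage change is undefined")
    return 100.0 * (scenario_value - baseline_value) / abs(baseline_value)


# Toy data: cities 0 and 1 are in the region, city 2 is outside.
region = np.array([True, True, False])
od_base = np.array([[0, 10, 30], [5, 0, 20], [10, 10, 0]])
od_scen = np.array([[0, 10, 24], [5, 0, 17], [10, 10, 0]])

base_total = total_net_outflow(od_base, region)
scen_total = total_net_outflow(od_scen, region)
change = percent_change(scen_total, base_total)
```

A negative `change` means the scenario reduces the region's total net outflow relative to the baseline.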

Figure 6. Net outflow ratios of four scenarios to the baseline of cities in Northeast China.

Table 2. Percentage changes of the four scenarios to the baseline in total net outflows in Northeast China.

In addition, the development of industry and consumer service industry reduces the need to travel by road within the region for work or leisure. The overall flow of road traffic in the Northeast region (as shown in Figure 7 and Table 3) decreased by 16% and 18% in S1-Ind and S2-Com, respectively. This decrease covers movement between prefecture-level cities and transportation hubs (Shenyang, Dalian, Changchun, and Harbin, which are also sub-provincial cities) (Chen and Zhang Citation2021) as well as movement within the prefecture-level cities themselves.

Figure 7. Population flows of cities in Northeast China of road traffic.

Table 3. Total population flow changes within Northeast China of road traffic.

The development of industry and consumer service industry also reduces the need to travel to other regions by road. Road traffic connections with other regions declined by 11% and 10% in S1-Ind and S2-Com, respectively (as shown in Table 4). By contrast, the producer service industry facilitates population mobility: in the S3-Prod scenario, air links with other regions increased by 21%. It is likely that the producer service industry involves a high degree of trade and exchange (Coffey Citation2000) and often requires people to relocate, which can lead to increased population mobility.

Table 4. Total population flow changes between Northeast China and other regions.

Table 5 presents the changes in population flows between Northeast China and other regions. Due to the limited availability of data on road traffic connections to other regions, the focus was on evaluating the other two modes of transport. The findings indicated that the increasing employment opportunities in Northeast China had a positive impact on drawing populations from the Western region, thereby signifying an increased attractiveness of the area with a relatively undeveloped economy for labour. Moreover, the net inflow ratios increased in S1-Ind and S2-Com, indicating an overall increment in the attractiveness of Northeast China to other regions. In addition, S3-Prod was found to strongly promote air links between the Northeast and other regions, while the impact of S4-Pub was insignificant.

Table 5. Population flow changes between Northeast China and other regions.

4. Discussion

4.1. Contributions of this study

The study has revealed distinct spatial patterns evident in the inter-city flows of road traffic and air traffic. Notably, the former displays an urban agglomeration aggregation effect, while the latter manifests a megacity effect. The proposed GNN-based model was deemed adequate in capturing these patterns and predicting inter-city population flows of different transport modes. The accuracy of the model was ranked in the order of road, rail, and air in predicting population flows across the three modes. This ranking is probably due to the highest accessibility of road transport, as rail and air transport are subject to urban development, route planning, and facility construction.

This study uncovers a noteworthy revelation concerning the multifaceted impact of manufacturing development on diverse modes of population movement. Specifically, the expansion of manufacturing had distinct effects on road, rail, and air traffic. While this expansion caused a decline in population mobility for road and rail traffic, it simultaneously led to an increase in air traffic. This discovery challenges the existing hypothesis that the rise of industry supports employment-based mobility (Dou Citation2018), which seems inconsistent with our empirical evidence. It can be argued that developed manufacturing tends to cultivate economic concentration in specific regions, which may engender economic disparities and diminish incentives for individuals to relocate between geographically close but economically disparate regions. Consequently, the demand for inter-city mobility via road and rail transportation within close geographical proximity is prone to wane. Conversely, business travellers who are seeking improved employment opportunities and economic prospects may favour air transport for long-distance journeys (Bieger and Wittmer Citation2006), thus augmenting the demand for air travel.

4.2. Policy applications of the model

The findings of the experiment could provide an appropriate policy reference for urban development in Northeast China, a region experiencing significant population decline. The findings suggest that the development of specific industries could effectively regulate population mobility, with the expansion of all four divided industries potentially enhancing the attractiveness of Northeast China to the relatively underdeveloped regions in Western China. The consumer service industry may play a vital role in increasing the overall attractiveness of Northeast China by reducing outflows of rail and air travel. This result aligns with Williams’ research, which underscored the favourable impact of consumer service industry on local economic growth by limiting the income outflow to external regions (Williams Citation1996).

Moreover, this study suggests that the progression in industry led to a decrease in net population outflows. These results confirm the conclusion that manufacturing plays a significant role in enhancing the city’s competitiveness (Qizhi et al. Citation2016). However, the industry in Northeast China is facing the challenge of upgrading and transforming as a result of the dramatic contraction of the traditional manufacturing and heavy industry on which the region once depended, the loss of the region’s traditional economic advantages, and the massive loss of labour force (Zhang Citation2008).

From a policy formulation perspective, the results of this study have important implications that can be approached from three different angles. The first perspective pertains to identifying potential opportunities and challenges for regional development. By examining the patterns and regularities of population mobility between different regions, policymakers can gain better insights into urban planning decisions (Reia et al. Citation2022), which could be achieved through simulation methods. Specifically, when selecting urban development strategies, policymakers can leverage the study’s findings to determine which regions have development potential and which regions may be facing challenges based on the observed patterns and trends of population mobility.

The second perspective is using the study’s results to evaluate the impact of various policies or interventions on population mobility. Governments can utilize the model presented in this study to predict and simulate the effects of different policies on population mobility, enabling policymakers to assess the feasibility and impact of the policies. This can be useful in developing more targeted and practical policies to promote population mobility and urban development, thus enhancing the effectiveness of government interventions.

The third and final perspective for policy formulation involves using Explainable AI techniques to enhance the transparency and trustworthiness of AI models in urban planning decisions. Specifically, the model utilizes the GNNExplainer method to clarify the importance of node features. This can assist urban planners in better understanding and interpreting the model’s decision results, which in turn can increase trust in the model. Moreover, the interpretable approach can help urban planners identify limitations and flaws in the model, thereby improving the model’s design and decision-making process.

4.3. Limitations

However, this study has several limitations. The population flow data used in this study are subject to bias, as they only incorporate the union of the top ten destinations or origins for each mode of transportation; they should be complemented with other data sources in future work.

In addition, the application of data-driven machine learning techniques may lead to the generalization of predicted values, which may reduce the accuracy of predictions for extremely high or low population flows. These values are often outliers that deviate significantly from the typical distribution of the data and may not fit the proposed model. Therefore, accurately simulating such population flows can be challenging and requires the inclusion and improvement of parameters.

5. Conclusion

In order to explore the relationship between industrial structure and population flows in the context of China, we constructed a city graph comprising 366 cities, leveraging relevant socioeconomic indicators and spatial relationships, and subsequently simulated inter-city population flows based on GNN. This model could predict the urban hierarchy and regional aggregation of flows. Its predictive accuracy was assessed by comparing its results with actual population flow data, yielding CPC values of 0.6854, 0.6570, and 0.5325 for road, rail, and air traffic, respectively.
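For readers unfamiliar with the CPC index, the sketch below shows how it is commonly computed, assuming the paper uses the standard "common part of commuters" definition, CPC = 2 Σ min(predicted, observed) / (Σ predicted + Σ observed). The function name and the toy flows are illustrative and are not taken from the study.

```python
import numpy as np


def common_part_of_commuters(predicted, observed):
    """CPC between predicted and observed flows.

    Ranges from 0 (no overlap) to 1 (identical flows). Assumes both inputs
    are non-negative arrays of the same shape.
    """
    predicted = np.asarray(predicted, dtype=float)
    observed = np.asarray(observed, dtype=float)
    if predicted.shape != observed.shape:
        raise ValueError("predicted and observed must have the same shape")
    denominator = predicted.sum() + observed.sum()
    if denominator == 0:
        return 0.0  # no flows at all; overlap is defined as zero here
    return 2.0 * np.minimum(predicted, observed).sum() / denominator


# Toy example: two cities, flows in both directions.
observed_flows = np.array([[0, 10], [4, 0]])
predicted_flows = np.array([[0, 8], [6, 0]])
cpc_value = common_part_of_commuters(predicted_flows, observed_flows)
```

In this toy case the shared flow is 8 + 4 = 12 out of 14 predicted and 14 observed, so the CPC is 24/28.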

Using GNNExplainer to interpret the model’s output, we found that population size positively influenced predicted population flows across all three transportation modes. Moreover, the results suggest that the development of manufacturing reduced population mobility for road and rail traffic but increased it for air transportation.

To demonstrate the practical applicability of the simulation model, we conducted a scenario simulation in Northeast China, which indicates that the negative population outflows from the region could be mitigated by enhancing the region’s industry and consumer service industry. Such findings have policy implications for improving a city’s attractiveness and attracting labour forces.

Future studies may consider expanding the scope of the data set to include industrial architectures to provide a more comprehensive assessment of their influence on regional population flows. Moreover, integrating process simulation with economic theories may offer a thorough comprehension of the dynamics of population flow changes.

Acknowledgments

All authors express their gratitude to the reviewers and editors for their helpful and detailed comments and suggestions to this paper.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Data availability statement

The data presented in this study are available on request from the corresponding author.

Additional information

Funding

This work was supported by the National Natural Science Foundation of China (Grant No. 42322110 and 42271415) and the Guangdong Natural Science Funds for Distinguished Young Scholar (Grant No. 2021B1515020104).

References

  • Bhagat RB, Mohanty S. 2009. Emerging pattern of urbanization and the contribution of migration in urban growth in India. Asian Population Studies. 5(1):5–20. doi:10.1080/17441730902790024.
  • Bieger T, Wittmer A. 2006. Air transport and tourism—Perspectives and challenges for destinations, airlines and governments. J Air Transp Manag. 12(1):40–46. doi:10.1016/j.jairtraman.2005.09.007.
  • Burkart N, Huber MF. 2021. A survey on the explainability of supervised machine learning. J Artif Intell Res. 70:245–317. doi:10.1613/jair.1.12228.
  • Cao Y, Hua Z, Chen T, Li X, Li H, Tao D. 2023. Understanding population movement and the evolution of urban spatial patterns: an empirical study on social network fusion data. Land Use Policy. 125:106454. doi:10.1016/j.landusepol.2022.106454.
  • Carey HC. 1859. Principles of social science. London: Lippincott.
  • Chen X. 2020. Application of GNN in Urban Computing. In 2020 International Conference on Communications, Information System and Computer Engineering (CISCE). Presented at the 2020 International Conference on Communications, Information System and Computer Engineering (CISCE), IEEE, Kuala Lumpur, Malaysia, pp. 14–17. doi:10.1109/CISCE50729.2020.00010.
  • Chen Y, Geng M, Zeng J, Yang D, Zhang L, Chen X. 2023. A novel ensemble model with conditional intervening opportunities for ride-hailing travel mobility estimation. Physica A. 628:129167. doi:10.1016/j.physa.2023.129167.
  • Chen Y, Zhang D. 2021. Evaluation and driving factors of city sustainability in Northeast China: an analysis based on interaction among multiple indicators. Sustain Cities Soc. 67:102721. doi:10.1016/j.scs.2021.102721.
  • Cheong TS, Wu Y. 2014. The impacts of structural transformation and industrial upgrading on regional inequality in China. China Eco Rev. 31:339–350. doi:10.1016/j.chieco.2014.09.007.
  • Choe K, Laquian AA. 2008. City cluster development: toward an urban-led development strategy for Asia, Urban development series. Asian Development Bank, Mandaluyong City, Metro Manila, Philippines.
  • Coffey WJ. 2000. The geographies of producer services. Urban Geogr. 21(2):170–183. doi:10.2747/0272-3638.21.2.170.
  • Defferrard M, Bresson X, Vandergheynst P. 2016. Convolutional neural networks on graphs with fast localized spectral filtering. Adv Neural Inf Process Syst. 29:3837–3845.
  • Devi PS, Sudarsan PK. 2021. Determinants of Migration to Goa, India: a gravity model analysis. Ind J Labour Econ. 64(2):485–498. doi:10.1007/s41027-021-00323-z.
  • Dou X. 2018. Labor mobility under the background of industrial relocation.
  • Du J, Zhang S, Wu G, Moura JM, Kar S. 2017. Topology adaptive graph convolutional networks. arXiv preprint arXiv:1710.10370.
  • Fan CC, Li T. 2020. Split households, family migration and urban settlement: findings from China’s 2015 National Floating Population Survey. Soc Incl. 8(1):252–263. doi:10.17645/si.v8i1.2402.
  • Gu H, Shen J, Chu J. 2023. Understanding intercity mobility patterns in rapidly Urbanizing China, 2015–2019: evidence from longitudinal Poisson gravity modeling. Ann Am Assoc Geographers. 113(1):307–330. doi:10.1080/24694452.2022.2097050.
  • Hamilton W, Ying Z, Leskovec J. 2017. Inductive representation learning on large graphs. Adv Neural Inf Process Syst. 30:1024–1034.
  • He C, Chen T, Mao X, Zhou Y. 2016. Economic transition, urbanization and population redistribution in China. Habitat International. 51:39–47. doi:10.1016/j.habitatint.2015.10.006.
  • Hong J, Tang M, Wu Z, Miao Z, Shen GQ. 2019. The evolution of patterns within embodied energy flows in the Chinese economy: a multi-regional-based complex network approach. Sustain Cities Soc. 47:101500. doi:10.1016/j.scs.2019.101500.
  • Jia X-Y, Liu E-J, Chen C-Y, He Z, Yan X-Y. 2022. An interactive city choice model and its application for measuring the intercity interaction. Front Phys. 10:850415. doi:10.3389/fphy.2022.850415.
  • Jiang Y. 2017. Population migration and brain drain in Northeast China. China Popul Dev Stud. 1(2):71–80. doi:10.1007/BF03500925.
  • Johnson KM, Beale CL. 2010. The recent revival of widespread population growth in nonmetropolitan areas of the United States. Rural Sociol. 59(4):655–667. doi:10.1111/j.1549-0831.1994.tb00553.x.
  • Kingma DP, Ba J. 2014. Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980.
  • Kipf TN, Welling M. 2016. Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907.
  • Lan F, Gong X, Da H, Wen H. 2020. How do population inflow and social infrastructure affect urban vitality? Evidence from 35 large- and medium-sized cities in China. Cities. 100:102454. doi:10.1016/j.cities.2019.102454.
  • Lee ES. 1966. A theory of migration. Demography. 3(1):47–57. doi:10.2307/2060063.
  • Li Z-C, Sheng D. 2016. Forecasting passenger travel demand for air and high-speed rail integration service: a case study of Beijing-Guangzhou corridor, China. Trans Res Part A: Policy Pract. 94:397–410. doi:10.1016/j.tra.2016.10.002.
  • Liu E-J, Yan X-Y. 2020. A universal opportunity model for human mobility. Sci Rep. 10(1):4657. doi:10.1038/s41598-020-61613-y.
  • Liu Wangbao SE. 2016. Spatial pattern of population daily flow among cities based on ICT: a case study of “Baidu Migration”. Acta Geographica Sinica. 71:1667. doi:10.11821/dlxb201610001.
  • Liu Z, Miranda F, Xiong W, Yang J, Wang Q, Silva C. 2020. Learning geo-contextual embeddings for commuting flow prediction. Proc AAAI Conf Artif Intell. 34(01):808–816. doi:10.1609/aaai.v34i01.5425.
  • Pan J, Lai J. 2019. Spatial pattern of population mobility among cities in China: case study of the National Day plus mid-autumn festival based on Tencent migration data. Cities. 94:55–69. doi:10.1016/j.cities.2019.05.022.
  • Qizhi M, Ying L, Kang W. 2016. Spatio-temporal changes of population density and urbanization Pattern in China (2000–2010). China City Planning Review. 25:8–14.
  • Reia SM, Rao PSC, Barthelemy M, Ukkusuri SV. 2022. Spatial structure of city population growth. Nat Commun. 13(1):5931. doi:10.1038/s41467-022-33527-y.
  • Robinson C, Dilkina B. 2018. A machine learning approach to modeling human migration. In Proceedings of the 1st ACM SIGCAS Conference on Computing and Sustainable Societies. pp. 1–8. doi:10.1145/3209811.3209868.
  • Rodrigue J-P. 2020. The geography of transport systems. New York: Routledge.
  • Rong C, Feng J, Li Y. 2019. Deep learning models for population flow generation from aggregated mobility data. In Adjunct Proceedings of the 2019 ACM International Joint Conference on Pervasive and Ubiquitous Computing and Proceedings of the 2019 ACM International Symposium on Wearable Computers. Presented at the UbiComp ’19: The 2019 ACM International Joint Conference on Pervasive and Ubiquitous Computing, ACM, London United Kingdom, pp. 1008–1013. doi:10.1145/3341162.3349319.
  • Sandryhaila A, Moura JMF. 2013. Discrete signal processing on graphs: graph fourier transform. In 2013 IEEE International Conference on Acoustics, Speech and Signal Processing. Presented at the ICASSP 2013 - 2013 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, Vancouver, BC, Canada, pp. 6167–6170. doi:10.1109/ICASSP.2013.6638850.
  • Shao S, Tian Z, Yang L. 2017. High speed rail and urban service industry agglomeration: evidence from China’s Yangtze River Delta region. J Trans Geogr. 64:174–183. doi:10.1016/j.jtrangeo.2017.08.019.
  • Simini F, Barlacchi G, Luca M, Pappalardo L. 2021. A deep gravity model for mobility flows generation. Nat Commun. 12(1):6576. doi:10.1038/s41467-021-26752-4.
  • Simini F, González MC, Maritan A, Barabási A-L. 2012. A universal model for mobility and migration patterns. Nature. 484(7392):96–100. doi:10.1038/nature10856.
  • Spadon G, Carvalho A, Rodrigues JF, Jr, Alves LGA. 2019. Reconstructing commuters network using machine learning and urban indicators. Sci Rep. 9(1):11801. doi:10.1038/s41598-019-48295-x.
  • Stouffer SA. 1940. Intervening opportunities: a theory relating mobility and distance. Am Sociol Rev. 5(6):845–867. doi:10.2307/2084520.
  • Wajdi N, Adioetomo SM, Mulder CH. 2017. Gravity models of interregional migration in Indonesia. Bull Indones Econ Stud. 53(3):309–332. doi:10.1080/00074918.2017.1298719.
  • Wang Y, Dong L, Liu Y, Huang Z, Liu Y. 2019. Migration patterns in China extracted from mobile positioning data. Habitat Int. 86:71–80. doi:10.1016/j.habitatint.2019.03.002.
  • Wang Y, Li X, Yao X, Li S, Liu Y. 2022. Intercity population migration conditioned by city industry structures. Ann Am Assoc Geogr. 112(5):1441–1460. doi:10.1080/24694452.2021.1977110.
  • Wei Qi FJ, Liu S. 2017. Calculation and spatial evolution of population loss in Northeast China. Scientia Geographica Sinica. 37:1795. doi:10.13249/j.cnki.sgs.2017.12.002.
  • Williams CC. 1996. Understanding the role of consumer services in local economic development: some evidence from the Fens. Environ Plan A. 28(3):555–571. doi:10.1068/a280555.
  • Wu J, Yu Z, Wei YD, Yang L. 2019. Changing distribution of migrant population and its influencing factors in urban China: economic transition, public policy, and amenities. Habitat Int. 94:102063. doi:10.1016/j.habitatint.2019.102063.
  • Xiao L, Wu X, Wang G. 2019. Social network analysis based on Graph SAGE. In 2019 12th International Symposium on Computational Intelligence and Design (ISCID). Presented at the 2019 12th International Symposium on Computational Intelligence and Design (ISCID), IEEE, Hangzhou, China, pp. 196–199. doi:10.1109/ISCID.2019.10128.
  • Xu F, Uszkoreit H, Du Y, Fan W, Zhao D, Zhu J. 2019. Explainable AI: a brief survey on history, research areas, approaches and challenges. In: Natural Language Processing and Chinese Computing: 8th CCF International Conference, NLPCC 2019, Dunhuang, China, October 9–14, 2019, Proceedings, Part II 8. Springer, pp. 563–574.
  • Yao X, Gao Y, Zhu D, Manley E, Wang J, Liu Y. 2021. Spatial origin-destination flow imputation using graph convolutional networks. IEEE Trans Intell Transport Syst. 22(12):7474–7484. doi:10.1109/TITS.2020.3003310.
  • Ying Z, Bourgeois D, You J, Zitnik M, Leskovec J. 2019. Gnnexplainer: generating explanations for graph neural networks. Adv Neural Inf Process Syst. 32:9240–9251.
  • You Z, Yang H, Fu M. 2018. Settlement intention characteristics and determinants in floating populations in Chinese border cities. Sustain Cities Soc. 39:476–486. doi:10.1016/j.scs.2018.02.021.
  • Zhang P. 2008. Revitalizing old industrial base of Northeast China: process, policy and challenge. Chin Geogr Sci. 18(2):109–118. doi:10.1007/s11769-008-0109-2.
  • Zhang W, Chong Z, Li X, Nie G. 2020. Spatial patterns and determinant factors of population flow networks in China: analysis on tencent location Big Data. Cities. 99:102640. doi:10.1016/j.cities.2020.102640.
  • Zhang Y, Zheng X, Helbich M, Chen N, Chen Z. 2022. City2vec: urban knowledge discovery based on population mobile network. Sustain Cities Soc. 85:104000. doi:10.1016/j.scs.2022.104000.
  • Zhao Y, Zhang G, Zhao H. 2021. Spatial network structures of urban agglomeration based on the improved gravity model: a case study in China’s two urban agglomerations. Complexity. 2021:1–17. doi:10.1155/2021/6651444.
  • Zhao Z, Wei Y, Yang R, Wang S, Zhu Y. 2019. Gravity model coefficient calibration and error estimation: based on Chinese interprovincial population flow. Dili Xuebao/Acta Geographica Sinica. 74:203–221. doi:10.11821/dlxb201902001.
  • Zhdanov M, Steinmann S, Hoffmann N. 2022. Investigating Brain Connectivity with Graph Neural Networks and GNNExplainer. In 2022 26th International Conference on Pattern Recognition (ICPR). Presented at the 2022 26th International Conference on Pattern Recognition (ICPR), IEEE, Montreal, QC, Canada, pp. 5155–5161. doi:10.1109/ICPR56361.2022.9956201.
  • Zhou J, Cui G, Hu S, Zhang Z, Yang C, Liu Z, Wang L, Li C, Sun M. 2020. Graph neural networks: a review of methods and applications. AI Open. 1:57–81. doi:10.1016/j.aiopen.2021.01.001.
  • Zipf GK. 1946. The P1 P2/D Hypothesis: on the intercity movement of persons. Am Sociol Rev. 11(6):677–686. doi:10.2307/2087063.