
GCN-based weakly-supervised community detection with updated structure centres selection

Article: 2291995 | Received 28 Aug 2023, Accepted 01 Dec 2023, Published online: 03 Jan 2024

Abstract

Community detection is a classic problem in network learning. Semi-supervised network learning requires a certain number of known samples, but sample annotation is time-consuming and laborious. In particular, when only a very small number of samples are known, the learning ability of existing semi-supervised network learning models decreases sharply. In view of this, a weakly-supervised community detection method based on a graph convolutional neural network (WC-GCN) is proposed. Firstly, it introduces a genetic evolution strategy to select and update the structure centres, which keeps the updating process from getting stuck in local optima and yields structure centres closer to the global optimum, solving the problem of centre dependence. Secondly, the structural centrality index Cstruct is proposed to measure how well a subgraph represents the whole network, so that more accurate network structure centres are learned. Thirdly, a self-training method is used to expand the pseudo-labelled nodes for GCN training and further improve the model. The proposed method is evaluated on various real-world networks and outperforms state-of-the-art community detection algorithms.

1. Introduction

In recent years, data from complex systems have in many cases been modelled as networks, such as biological networks, social networks, citation networks, etc. (Girvan & Newman, Citation2002). Community is an important micro-structural feature of a complex network, as it is often associated with the functional and organisational modules of the underlying system (He et al., Citation2018; Lancichinetti et al., Citation2009). Community detection helps to reveal important hidden properties of complex networks from a microscopic perspective, and has thus become one of the most important tasks for exploring and understanding how networks work (Girvan & Newman, Citation2002). Therefore, community detection in complex networks has received increasing attention (Rostami et al., Citation2023). How to effectively describe and discover communities in a network is one of the most challenging problems in community detection.

Currently, many algorithms for community detection have been proposed. These methods explore community structure in networks from various perspectives and fall mainly into two categories: global algorithms and local algorithms. Global methods generally depend on prior knowledge of the entire network, such as the number of communities and the network size. Many classical community detection methods of this kind have been proposed, including hierarchical clustering (Girvan & Newman, Citation2002), stochastic block models (Fortunato & Hric, Citation2016), modularity optimisation (Yang et al., Citation2016), spectral clustering (Raghavan et al., Citation2007) and non-negative matrix factorisation (He et al., Citation2021a; He et al., Citation2021b; Sun & Yang, Citation2023). The global approach has certain limitations (Wang et al., Citation2017). On the one hand, viewed from the partitioning and clustering perspective, these methods require information on the entire network structure: since global methods need the whole graph to detect communities, their high time complexity makes them unsuitable for today's dynamic and large-scale complex networks. On the other hand, only the topology of the network is considered in the calculation process, while the attribute information of the nodes is ignored.

Local optimisation methods are mainly based on the local topological characteristics of the network to reveal the community structure of a local or whole network (Moradi et al., Citation2014). Compared with global community discovery methods, local methods can effectively discover communities without complete information about the network structure, and are better placed to combine node attribute information to mine local community characteristics (Wang et al., Citation2017). Many local approaches based purely on local information of nodes have been proposed. Local methods based on various optimisation strategies were surveyed in a recent study (Li et al., Citation2015), where they are empirically divided into label propagation algorithms (Raghavan et al., Citation2007), link clustering (Šubelj & Bajec, Citation2012) and local expansion optimisation methods (Huang et al., Citation2011). Among them, local expansion optimisation methods are widely used for local community detection in large networks, due to their advantages in both effectiveness and accuracy. In local methods, finding the structural centres of a network is important for community detection.

However, as the problems that network methods try to solve and the network data that need to be analysed become more and more complex, new methods are proposed and developed. Deep learning techniques have recently been employed to process high-dimensional network data and learn low-dimensional representations of network structures. For example, Convolutional Neural Networks (CNN) (Zhou et al., Citation2020) utilise convolution and pooling operations to reduce the dimensionality of network data, thereby effectively discovering communities in the network. The Graph Convolutional Network (GCN) (Kipf & Welling, Citation2017) inherits the advantages of the CNN and operates directly on network-structured data (Zhou et al., Citation2020). GCNs can not only effectively combine topology and attribute information, but also extract richer attribute network features, so they are widely used in network analysis tasks (Hoffmann et al., Citation2020; Wang et al., Citation2021).

Although some scholars have applied GCN to community detection and achieved good results (He et al., Citation2020; Jin et al., Citation2019a), great challenges remain. On one hand, GCN models are semi-supervised learning models that require a certain amount of prior knowledge (labelled nodes) for graph learning. When very few labelled nodes are known, the GCN model has limited propagation ability. On the other hand, the detected communities depend strongly on the selected structure centres. Traditional structure-based centre selection considers an expansion process guided by node centrality, but it depends heavily on the initial structure centre nodes of the network, and the expansion process considers only the network topology, leading to poor community detection. As shown in Figure 1, the first two outliers in the karate decision graph are the structure centres; however, in some networks a number of outlier points have the same or similar values, making it difficult to choose the structure centres and resulting in inaccurate ones, as in the football decision graph.

Figure 1. Decision diagram of karate network and football network. (a) karate network. (b) football network.


To overcome the limitations, we propose a weakly-supervised community detection model based on GCN and the updated structural centres. The main contributions of this paper are as follows:

  • A weakly-supervised community detection method based on a graph convolutional neural network (WC-GCN) is proposed, which can use a very small number of labelled samples to learn the network's important nodes (structure centres), and further mine the network's communities after expanding the training set.

  • The structure-centre selection method uses a genetic evolution strategy, which keeps the centre-updating process from getting stuck in local optima and yields structure centres closer to the global optimum, solving the problem of centre dependence. In particular, the genetic evolution strategy has high fault tolerance even when the extremely small set of known labelled samples contains mislabelled samples.

  • The structural centrality index Cstruct is proposed for the first time; it reflects well the ability of a subgraph to represent the whole network, so that more accurate network structure centres are learned.

  • A self-training method is used to expand some nearby high-confidence nodes and establish a balanced training set for GCN training, further improving the model.

2. Related work

The graph convolutional network (GCN) (Kipf & Welling, Citation2017), recently proposed by Kipf and Welling, is an effective graph model for semi-supervised learning. This model, however, was originally designed to be trained with both training and test data present. Over the past few years, several graph convolution network models have emerged to address applications with graph-structured data.

In recent years, more and more scholars have leveraged deep learning methods to solve the problem of community detection on graphs and improve its effectiveness (Zhou et al., Citation2020). Some scholars have applied GCN to community detection and achieved good results (He et al., Citation2020; Jin et al., Citation2019b). GCN can effectively combine topology and attribute information, so it is widely used in network analysis tasks (Wang et al., Citation2021). However, these methods still face challenges. On one hand, the detected communities are highly dependent on the selected structure centres, and exploring such structural centres of a network is important for community detection. Lancichinetti et al. proposed a local optimisation algorithm based on a local fitness measure (Lancichinetti et al., Citation2009), which generates the hierarchical community structure of a network by randomising the starting node. Lee et al. introduced a greedy clique expansion algorithm (Lee et al., Citation2010), which selects distinct cliques as starting seeds and expands them by greedy optimisation. These methods expand each community from the structural centre to the boundary with a local search procedure and accelerate convergence to optimal solutions, avoiding the randomness of seed selection and improving stability. However, inaccurate structure centres will still degrade the result of community discovery.

Local expansion methods are highly dependent on the selected structure centres, which affects the quality of community detection. Structure-centre dependence means that the quality of the initial structure centres directly affects the result: a small community may disappear if none of its nodes is identified as an initial structure centre, while a large community may be split into two or more fragments if two or more of its nodes are identified as initial structure centres. Therefore, after the initial structure centres are selected, updating them is a necessary step for effective community detection.

On the other hand, when there are very few known labelled nodes, the GCN model has limited propagation ability, thus yielding unsatisfactory results. To overcome the problem, Li et al. (Citation2018) proposed a joint training and self-training method for training GCNs with few labels. Furthermore, some researchers have designed self-supervised GCN to perform node clustering in complex networks (Kou et al., Citation2021; Xie et al., Citation2023). Sun et al. (Citation2020) proposed a multi-stage self-supervised learning algorithm for graph convolutional networks with a small number of labelled nodes on the graph. Some scholars have also applied GCN to community detection and achieved good results (He et al., Citation2020; Jin et al., Citation2019a). Jin et al. (Citation2019a) integrated GCN and MRF for semi-supervised community detection. The framework effectively incorporates local topological features and properties of nodes into the task of community detection, but does not consider mesoscopic community properties (Berahmand et al., Citation2018) such as structure centres.

3. Preliminaries

This section briefly introduces the preliminary knowledge of this work, including basic notations, problem statements and the core of graph convolutional networks. Table 1 lists the main symbols used in the paper.

Table 1. Main symbols and their definitions.

3.1. Problem formulation

We are interested in the community detection task on an undirected graph $G=(V,E)$, where $V=\{v_1,v_2,\ldots,v_N\}$ denotes the set of nodes and $E=\{e_1,e_2,\ldots,e_M\}$ denotes the set of edges. The topology of $G$ can be represented by its adjacency matrix $A\in\mathbb{R}^{N\times N}$, and the node attributes by a feature matrix $X\in\mathbb{R}^{N\times F}$, whose $i$-th row $x_i=X_i$ is the feature vector of node $v_i$. Usually only a small number of labels are available, indicating that certain nodes of the network belong to a community $k$ ($k\in[1,C]$); these labels can be indexed by $C$ label sets $S^{(0)}=\{l_1,l_2,\ldots,l_C\}$.

The community detection problem is regarded as a semi-supervised learning task in which unlabelled nodes are assigned to the corresponding communities by training on some labelled nodes.

However, often only a small amount of label information is available in complex networks, which makes uncovering the underlying communities a challenging problem. Here, we are interested in developing community detection methods that make full use of the known label information, so we require the method to accept networks with few labelled nodes, as in weakly-supervised learning. In this paper, we focus on non-overlapping community detection based on the network topology $A$ and node attributes $X$, which aims to partition the node set $V$ into a set of disjoint communities $C=\{C_1,C_2,\ldots,C_K\}$, where $C_i\subseteq V$ for $i=1,2,\ldots,K$ and $\bigcup_{i=1}^{K}C_i=V$.

3.2. Graph convolutional network

Graph Neural Networks (GNNs) are an effective framework for representation learning on graphs. Many GNN variants have been proposed and have achieved state-of-the-art results on both node and graph classification tasks, such as the Graph Isomorphism Network (GIN) (Xu et al., Citation2018) and GCN (Kipf & Welling, Citation2017). To make full use of the network topology and node attributes, GCN is a basic module in our method. GCN (Kipf & Welling, Citation2017) is a multi-layer neural network that operates directly on a homogeneous graph and induces the embedding vector of a node based on the properties of its neighbours. The layer-wise propagation rule is as follows:
(1) $H^{(l+1)}=\sigma\big(\tilde{D}^{-\frac{1}{2}}\tilde{A}\tilde{D}^{-\frac{1}{2}}H^{(l)}W^{(l)}\big)$
It is a special form of localised filter, namely a linear combination of the feature vectors of adjacent neighbours. $\tilde{A}=A+I$ adds self-connections to the original adjacency matrix. $\tilde{D}$ is the degree matrix, with $\tilde{D}_{ii}=\sum_{j}\tilde{A}_{ij}$. $W^{(l)}$ is a layer-specific trainable transformation matrix. $\sigma(\cdot)$ denotes an activation function such as ReLU. $H^{(l)}$ denotes the hidden representations of the nodes in the $l$-th layer; initially, $H^{(0)}=X$. For non-attributed networks, $X$ is initialised as one-hot representations of the nodes in the graph.
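To make the propagation rule concrete, the following minimal sketch (our own illustration, not the authors' code) implements Equation (1) in NumPy; the toy graph, one-hot features and randomly initialised weights are illustrative assumptions.

```python
import numpy as np

def normalise_adjacency(A):
    """A_hat = D~^{-1/2} (A + I) D~^{-1/2}, the filter used in Equation (1)."""
    A_tilde = A + np.eye(A.shape[0])              # add self-connections
    d_inv_sqrt = 1.0 / np.sqrt(A_tilde.sum(axis=1))
    return A_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_layer(A_hat, H, W):
    """One propagation step: H^{(l+1)} = ReLU(A_hat @ H^{(l)} @ W^{(l)})."""
    return np.maximum(A_hat @ H @ W, 0.0)

# Illustrative 4-node path graph with one-hot features (non-attributed case).
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
H0 = np.eye(4)                                    # H^(0) = X
W0 = np.random.default_rng(0).normal(size=(4, 2)) * 0.1
H1 = gcn_layer(normalise_adjacency(A), H0, W0)    # first-layer representations
```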

Most GCNs (Jin et al., Citation2019b; Sun et al., Citation2020) are built on supervised or semi-supervised settings and require some high-quality node labels for efficient model optimisation. However, in real-world applications, it is difficult to obtain high-quality node labels. Therefore, unsupervised graph representation learning remains a challenging task.

4. Methodology

4.1. Overview

The overall framework of the model is shown in Figure 2. Firstly, we consider structure-centre updating in the structure-centre selection algorithm, and use a roulette-wheel genetic evolution algorithm (Kochenderfer & Wheeler, Citation2019) so that the process of updating structure centres can escape local optima and reach structure centres closer to the global optimum, which solves the centre-dependence problem. Secondly, the structural centrality index Cstruct is proposed for the first time; it reflects well the ability of a subgraph to represent the whole network. The GCN-based learning framework effectively incorporates network topology and attribute information, which naturally allows us to tackle the above challenges. Thirdly, a self-training method is used to expand some trusted nearest-neighbour nodes and construct a balanced training set, further improving the model.

Figure 2. WC-GCN is comprised of three steps: (1) the structural centre selection; (2) constructing pseudo-labelled nodes; and (3) finally training the GCN model for community partition.


4.2. Selecting structure centres

The selection of initial structure centres is particularly important: they act as carriers for the initial labelling and to some extent influence the resulting communities. In the following, the set of known labelled nodes $S^{(0)}=\{l_1,l_2,\ldots,l_C\}$ serves as the initial set of structure centres. To reduce the adverse effect of inappropriate structure centres, we suggest using the graph topology and node attributes to select more appropriate structure centre nodes. Specifically, a two-layer GCN is trained under the set of structure centres $S^{(0)}$ (Kipf & Welling, Citation2017). The GCN takes as input the graph adjacency matrix $A$ and node attribute matrix $X$. The output is denoted as $Z\in\mathbb{R}^{N\times K}$, computed as
(2) $Z=f(X,A)=\mathrm{softmax}\big(\hat{A}\,\mathrm{ReLU}(\hat{A}XW^{(0)})W^{(1)}\big)$
where $\hat{A}=\tilde{D}^{-\frac{1}{2}}\tilde{A}\tilde{D}^{-\frac{1}{2}}$ is the normalised adjacency matrix with self-connections. $W^{(0)}\in\mathbb{R}^{F\times D}$ and $W^{(1)}\in\mathbb{R}^{D\times K}$ are the weight parameters of the two graph convolution layers. $\mathrm{ReLU}(x)=\max(0,x)$ and $\mathrm{softmax}(x_i)=\frac{\exp(x_i)}{\sum_i \exp(x_i)}$ are the activation functions of the first and second GCN layers. Let the entry $z_{i,k}$ denote the affiliation strength of node $v_i$ to the $k$-th community. Then the predicted community label of an unlabelled node $v_i$ is
(3) $\hat{y}_i=\mathop{\arg\max}_{k\in\{1,2,\ldots,K\}} z_{i,k}$
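As a sketch of how Equations (2) and (3) fit together, the snippet below reuses `normalise_adjacency` from the earlier sketch; the weights are again randomly initialised for illustration, with K = 3 assumed communities.

```python
def softmax_rows(Z):
    """Row-wise softmax used by the second GCN layer."""
    e = np.exp(Z - Z.max(axis=1, keepdims=True))   # numerically stabilised
    return e / e.sum(axis=1, keepdims=True)

def gcn_forward(A, X, W0, W1):
    """Equation (2): Z = softmax(A_hat ReLU(A_hat X W0) W1)."""
    A_hat = normalise_adjacency(A)
    H = np.maximum(A_hat @ X @ W0, 0.0)
    return softmax_rows(A_hat @ H @ W1)

rng = np.random.default_rng(0)
Z = gcn_forward(A, H0, rng.normal(size=(4, 2)) * 0.1, rng.normal(size=(2, 3)) * 0.1)
y_hat = Z.argmax(axis=1)   # Equation (3): predicted community of each node
```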

Based on the predicted community labels, we obtain a temporary partition of the nodes $\Omega=\{V_1,V_2,\ldots,V_K\}$, where $V_k=\{v_i \mid \hat{y}_i=k\}$. Then $K$ subgraphs can be induced: $G_k=(V_k,E_k)$, where $E_k=\{(v_i,v_j) \mid v_i\in V_k,\ v_j\in V_k,\ (v_i,v_j)\in E\}$, $k=1,2,\ldots,K$. Some inaccurate structure centres may emerge at this point: the $k$-th initial structure centre $c_k^{(0)}$ may lie on the periphery of subgraph $G_k$, or even in another subgraph $G_{k'}$. To find the structure centre of $G_k$, we compute the local structural importance of each node in $G_k$, calculated from the perspective of shortest paths (Zheng et al., Citation2020), to obtain the set of structure centres $S^{(1)}$. On this basis, to make the structure centres more accurate, we propose structural centrality to refine them. Formally, we introduce the following definitions.

Definition 1

Similarity based on Local Simple Paths, SLP

Given a network $G_k=(V_k,E_k)$, the SLP between nodes $v_i$ and $v_j$ is defined as:
(4) $\mathrm{SLP}(v_i,v_j \mid G_k)=\begin{cases}\alpha_1\eta_{i,j}^{(1)}+\alpha_2\eta_{i,j}^{(2)}+\alpha_3\eta_{i,j}^{(3)}, & v_i\neq v_j\\ 0, & v_i=v_j\end{cases}$
where $\eta_{i,j}^{(l)}$ is the number of simple paths (paths with no repeated nodes) of length $l$ between nodes $v_i$ and $v_j$, and $\alpha_1,\alpha_2,\alpha_3$ are non-negative weights satisfying $\alpha_1\geq\alpha_2\geq\alpha_3$ and $\alpha_1+\alpha_2+\alpha_3=1$. Intuitively, a higher value of $\mathrm{SLP}(v_i,v_j \mid G_k)$ indicates that the two nodes $v_i$ and $v_j$ are better connected in the subgraph $G_k$.
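A direct brute-force way to evaluate Equation (4) is to enumerate simple paths of length at most 3, e.g. with networkx; this is our own illustrative sketch (the default weights below are the values reported later in Section 5.4), and it is practical only on the small local subgraphs the method operates on.

```python
import networkx as nx

def slp(G, u, v, alphas=(0.6, 0.35, 0.05)):
    """SLP(u, v | G) of Equation (4): weighted counts of simple paths
    of length 1, 2 and 3 between u and v."""
    if u == v:
        return 0.0
    eta = {1: 0, 2: 0, 3: 0}
    for path in nx.all_simple_paths(G, u, v, cutoff=3):
        eta[len(path) - 1] += 1        # a path of n nodes has n-1 edges
    return sum(a * eta[l] for l, a in zip((1, 2, 3), alphas))
```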

Definition 2

Local Structural Importance

Given a network $G_k=(V_k,E_k)$, the local structural importance of a node $v_i\in V_k$ is defined as:
(5) $w(v_i \mid G_k)=\sum_{v_j\in V_k}\mathrm{SLP}(v_i,v_j \mid G_k)$
Since $\mathrm{SLP}(v_i,v_j \mid G_k)$ measures the connectivity strength between nodes $v_i$ and $v_j$ in terms of the number of simple paths of length at most 3, $w(v_i \mid G_k)$ indicates the local structural importance of node $v_i$ with respect to its surrounding nodes. A higher value of $w(v_i \mid G_k)$ means that node $v_i$ is closely connected to the other nodes in its local neighbourhood and is more likely to be the centre of $G_k$. Therefore, the node with the largest value of $w(v_i \mid G_k)$ is selected as the structure centre of the subgraph $G_k$, that is,
(6) $c_k^{(r)}=\mathop{\arg\max}_{v_i\in V_k} w(v_i \mid G_k)$
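Equations (5) and (6) then reduce to a sum of SLP values and an argmax over the subgraph; a minimal sketch building on the `slp` function above:

```python
def local_importance(G_k, u):
    """w(u | G_k) of Equation (5): total SLP from u to the other nodes."""
    return sum(slp(G_k, u, v) for v in G_k.nodes if v != u)

def structure_centre(G_k):
    """Equation (6): the node with the largest local structural importance."""
    return max(G_k.nodes, key=lambda u: local_importance(G_k, u))
```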

Definition 3

Structural centrality

Given a network $G=(V,E)$ and a set of nodes $S\subseteq V$, the importance of $S$ in $G$ is defined as follows:
(7) $C_{struct}(S)=\frac{|S|\sum_{i,j\in S} I(i)\,I(j)\,d(i,j)}{C_{|S|}^{2}\,\bar{I}^{2}\,\bar{d}\,N},\quad (|S|>1)$
where $S$ is the structural centrality set and $|S|$ denotes the number of nodes in $S$. $I(i)$ denotes the influence of node $v_i$; metrics such as degree centrality, betweenness centrality, closeness centrality, eigenvector centrality and so on can serve as the influence of a node. $d(i,j)$ denotes the shortest distance between nodes $v_i$ and $v_j$, $C_{|S|}^{2}=\frac{|S|(|S|-1)}{2}$, $\bar{I}$ denotes the average influence over all nodes of network $G$, $\bar{d}$ denotes the average shortest distance of network $G$, and $N$ denotes the number of nodes of $G$.

When the network $G$ is a complete graph, its structural centrality is $C_{struct}(V)=1$. That is, the network is best represented when the set of structure centres is the full node set $V$ of $G$. In practice, however, real networks are growing larger and larger, and we choose a subset $S$ that represents the network $G$ as well as possible; see the example in Figure 3 and Table 2. The nodes in $S$ are characterised by high local influence and are far away from each other. In weakly-supervised learning, the number of nodes in $S$ is deterministic and equal to the number of classes, assuming that all classes are included in the training sample.

Figure 3. An example of structural centrality Cstruct function calculation.


In practice, given the large size of networks, computing the influence of all nodes and the shortest distances between all node pairs is expensive; moreover, for a specific network the values of $\bar{I}$ and $\bar{d}$ are fixed, so Equation (7) can be simplified as:
(8) $C_{struct}(S)=\frac{2\sum_{i,j\in S} I(i)\,I(j)\,d(i,j)}{(|S|-1)\,N},\quad (|S|>1)$
In this paper, node degree centrality is used to represent node influence. Figure 3 shows an example of the structural centrality calculation and Table 2 shows the calculation process. In this example, $C_{struct}(S)=2.67$ (No. 4 in Figure 3) is the largest value, so the three nodes a, b, c are chosen as the set of structure centre nodes for this graph.

Table 2. Structural centrality example calculation process.

According to the above calculations, when S = {a, b, c} the structural centrality is largest, so it is an ideal set of structure centres; K = 3 in this case.
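As a concrete reading of Equation (8), the sketch below (ours) scores a candidate set S on a networkx graph, with node degree as the influence I(·) as in the paper; `candidate_sets` in the commented usage line is a hypothetical iterable of size-K node sets.

```python
from itertools import combinations
import networkx as nx

def c_struct(G, S):
    """Simplified structural centrality of Equation (8):
    2 * sum_{i<j in S} I(i) I(j) d(i, j) / ((|S| - 1) * N),
    with degree as influence and shortest-path distance d."""
    num = sum(G.degree(i) * G.degree(j) * nx.shortest_path_length(G, i, j)
              for i, j in combinations(S, 2))
    return 2.0 * num / ((len(S) - 1) * G.number_of_nodes())

# Hypothetical usage: keep the candidate centre set with the largest score.
# best_S = max(candidate_sets, key=lambda S: c_struct(G, S))
```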

The procedure for selecting the initial structure centres is listed in Algorithm 1, which selects the $K$ nodes with the largest centrality and assigns them distinct community labels $1,2,\ldots,K$, respectively. The output is denoted $S^{(0)}=\{\{c_1^{(0)}\},\{c_2^{(0)}\},\ldots,\{c_K^{(0)}\}\}$, where $c_k^{(0)}$ denotes the $k$-th structure centre, which is assigned pseudo community label $k$. The method in this paper is weakly supervised: the number of communities $K$ is determined by the number of known sample labels. The structure-centre selection algorithm considers network topology and node attribute information simultaneously and iteratively obtains better structure centres, which alleviates the problem of structure-centre dependence.

4.3. Constructing the balanced training set

Now we have obtained a set of refined structure centres, each of which is assigned a distinct pseudo community label. To realise community partition, a GCN can be trained to predict the community labels of the other nodes in the graph by integrating the graph structure and node attributes. As is known, GCN is a localised filter, hence it cannot effectively propagate label information across the entire graph when only a limited number of labelled nodes are available (Li et al., Citation2018). In order to train a better GCN for community detection, we propose to expand the pseudo-labelled set by the following process.

When the structure centres are iteratively updated in Algorithm 1, a temporary graph partition $\Omega=\{V_1,V_2,\ldots,V_K\}$ is also returned: node $v_i$ is assigned to $V_k$ if $k=\arg\max_{k'} z_{i,k'}$. For each node $v_x\in V_k$, we compute the difference between its affiliation strength to the $k$-th community and that of $c_k^{(r)}$, i.e. $|z_{v_x,k}-z_{c_k^{(r)},k}|$. Then we construct the set of nodes $S_k$ with pseudo label $k$ by selecting the $\tau$ nodes of $V_k$ whose affiliation strength difference to that of $c_k^{(r)}$ is smallest:
(9) $S_k=\mathop{\arg\min}_{v_x\in V_k}\big(|z_{v_x,k}-z_{c_k^{(r)},k}|,\ \tau\big)$
where $\arg\min(\cdot,\tau)$ is the function that selects the $\tau$ nodes with the smallest values. The lower bound of $\tau$ is estimated by solving $K\tau d^{L}\geq N$, where $d$ is the average degree of the nodes in $G$ and $L$ is the number of graph convolution layers, which is 2 in our experiments. Note that the pseudo-label expansion process takes both the graph topology and node attributes into account, since the probability $z_{v_x,k}$ predicted for node $v_x$ by an $L$-layer GCN depends on the labels and attributes of its $L$-hop neighbours, as defined in Equation (2). In this way, we expand the set of refined structure centres $S^{(r)}=\{\{c_1^{(r)}\},\{c_2^{(r)}\},\ldots,\{c_K^{(r)}\}\}$ to a larger set of pseudo-labelled nodes $S=\{S_1,S_2,\ldots,S_K\}$. An alternative strategy is to select the $\tau$ nodes with the largest $z_{i,k}$ among all nodes in $V$ (Li et al., Citation2018), i.e.
(10) $S_k=\mathop{\arg\max}_{v_x\in V}\big(z_{v_x,k},\ \tau\big)$
We did not adopt this strategy because it is sensitive to inappropriate structure centres. Besides, some nodes satisfying this criterion are low-degree nodes located on the periphery of a community and far away from other communities. Although such nodes belong to the corresponding community with high confidence, they have limited propagation ability.
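The following sketch (ours) illustrates the expansion step of Equation (9) together with the lower bound on τ; `Z` is the GCN output of Equation (2), `partition` holds the node indices of each V_k, and `centres` holds the refined centre index of each community (all assumed to be available from Algorithm 1).

```python
import math
import numpy as np

def expand_pseudo_labels(Z, partition, centres, tau):
    """Equation (9): per community k, keep the tau nodes of V_k whose
    affiliation strength to k is closest to that of the centre c_k."""
    S = {}
    for k, (nodes, c_k) in enumerate(zip(partition, centres)):
        diffs = np.abs(Z[nodes, k] - Z[c_k, k])
        keep = np.argsort(diffs)[:tau]          # tau smallest differences
        S[k] = [nodes[i] for i in keep]
    return S

def tau_lower_bound(N, K, d, L=2):
    """Smallest integer tau satisfying K * tau * d**L >= N."""
    return math.ceil(N / (K * d ** L))
```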

4.4. Training GCN with expanded pseudo-labelled set

Under the supervision of the larger set of pseudo-labelled nodes, we can train a two-layer GCN with the same structure as defined by Equation (2). We minimise the cross-entropy loss over all pseudo-labelled nodes:
(11) $\mathcal{L}:=-\sum_{k=1}^{K}\sum_{v_i\in S_k}\log Z_{ik}$
where $S_k$ denotes the set of nodes with pseudo label $k$, and $K$ is the output dimension of the softmax layer, which corresponds to the number of communities. The Adam optimizer (Kingma & Ba, Citation2015) is used to update the model parameters $W^{(0)}$ and $W^{(1)}$. Once the GCN is trained to convergence, we can predict the community labels using Equation (3) and obtain the final community partition $C=\{C_1,C_2,\ldots,C_K\}$, where
(12) $C_k=\{v_i \mid \hat{y}_i=k\}\quad \text{for } k=1,2,\ldots,K$
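Equation (11) is an ordinary masked cross-entropy; a minimal NumPy expression of the loss over the expanded pseudo-labelled sets (`S` as returned by the expansion sketch above) might look as follows, with the actual optimisation done by Adam as the paper states.

```python
def pseudo_label_loss(Z, S):
    """Equation (11): L = -sum_k sum_{v_i in S_k} log Z_{ik}."""
    return -sum(np.log(Z[i, k]) for k, nodes in S.items() for i in nodes)
```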

The whole process of our proposed method is shown in algorithm 2.

4.5. Computational cost

In this paper, we try to solve an interesting problem, community detection with limited labelled nodes, by iterating the process of training the GCN, refining the community centres and generating community pseudo-labels, so as to uncover the underlying community structures using topology and attribute information.

Given a network G = (V, E) with N nodes and M edges, the time complexity of the algorithm mainly depends on the SLP process and the GCN model. The SLP algorithm takes time O(M), linear in the number of edges, for each iteration. This process is repeated for a finite number of iterations and hence requires an overall near-linear time. In experiments, the runtime is negligible on networks with a few thousand nodes. In the GCN training phase, we follow Kipf and Welling's setup with full-batch gradient descent, which scales linearly in the number of network edges. In our experiments, it generally converges within a few epochs as it effectively propagates labels from the expanded labelled nodes to the entire network. Moreover, the computation of GCN can be further accelerated with mini-batch stochastic gradient descent. Therefore, the running time of our algorithm is comparable to GCN.

For GCNs, we follow the same hyper-parameters as Kipf and Welling (Citation2017): a learning rate of 0.01, a maximum of 200 epochs, a dropout rate of 0.5, an L2 regularisation weight of 5 × 10−4 and 16 hidden units. For each run, we randomly split the labels into a small set for training and a large set for testing. According to previous analysis and experiments (Jin et al., Citation2019a), GCN-based methods usually achieve the best performance with a 5% training size on most datasets, so we choose this labelling rate as the initial training size for easy comparison with the other baselines. We adopt the Adam optimizer to train the GCN-based models and run the experiments on TensorFlow. All experiments were conducted on a PC with a 2.6 GHz quad-core i7-5600 CPU and 12 GB of RAM.
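For reproducibility, the reported setup can be collected into a single configuration; the dictionary below merely restates the hyper-parameters listed above (the key names are our own).

```python
gcn_config = {
    "learning_rate": 0.01,   # Adam learning rate
    "max_epochs": 200,
    "dropout": 0.5,
    "weight_decay": 5e-4,    # L2 regularisation weight
    "hidden_units": 16,
    "label_rate": 0.05,      # initial training size (5%)
    "num_layers": 2,         # two graph convolution layers
}
```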

5. Experiments

In this section, we validate the performance of the proposed community detection method on various real-world networks. We conduct extensive experiments with the aim of answering the following research questions:

  • RQ-1: Does the proposed WC-GCN help to improve community detection?

  • RQ-2: Are both structure-centre updating and pseudo-labelled set expansion essential for WC-GCN?

  • RQ-3: Does the step of updating structure centres indeed yield better structure centres for later community detection?

  • RQ-4: Why can constructing a pseudo-label training set improve the division effect of GCN?

5.1. Datasets

We conduct extensive experiments on public network datasets, including Cora, CiteSeer, PubMed, Football and PolBlogs. Detailed statistics are shown in Table 3.

  • Football (Girvan & Newman, Citation2002): The Football network is a network of American football games between Division IA colleges during regular season Fall 2000. In the network nodes denote the 115 teams that are divided into 12 conferences, and the edges represent 613 games.

  • PolBlogs (Adamic & Glance, Citation2005): The PolBlogs dataset is a directed network of hyperlinks between political blogs collected during the 2004 US election. It includes 1490 nodes and 16,715 directed edges. The political orientation of each blog is either conservative or liberal.

  • Cora (Sen et al., Citation2008): Cora is a citation network consisting of 2708 documents and 5,429 links (Zhang et al., Citation2023). Each document has a 1433-dimensional sparse bag-of-words feature vector. The Cora data set contains a number of machine learning papers divided into one of seven classes, including Case Based, Genetic Algorithms, Neural Networks, Probabilistic Methods, Reinforcement Learning, Rule Learning and Theory.

  • CiteSeer (Sen et al., Citation2008): CiteSeer is a citation network consisting of 3327 scientific publications from six categories – agents, AI, databases, human–computer interaction, machine learning, and information retrieval, and a total of 4732 links.

  • PubMed (Namata et al., Citation2012): PubMed is a citation network consisting of 19,717 nodes and 44,338 links. The dataset contains sparse bag-of-words feature vectors of length 500 for each document. The PubMed data set contains documents related to biomedical, health field and life sciences.

Table 3. Specific information of five network data sets.

5.2. Evaluation metrics

To evaluate the community detection performance of the baselines and our method, we utilise four widely used performance metrics – Normalised Mutual Information (NMI) (Danon et al., Citation2005), Adjusted Rand Index (ARI) (Rand, Citation1971), F1 score (Bhattacharya et al., Citation2023; Jin et al., Citation2021a) and Accuracy (Bhattacharya et al., Citation2023; Liu et al., Citation2011) – which assess partition performance from different perspectives.

ACC assesses cluster quality by measuring the agreement between the community partition predicted by an algorithm and the ground-truth community partition of a network. Let $P=\{P_1,P_2,\ldots,P_A\}$ be the ground-truth community partition with $A$ communities, and $C=\{C_1,C_2,\ldots,C_K\}$ be the community partition detected by an algorithm. ACC is the ratio of correctly predicted samples to all samples, as defined in Equation (13):
(13) $ACC(P,C)=\frac{\sum_{i=1}^{N}\delta\big(P_i,\,map(C_i)\big)}{N}$
where $P_i$ is the actual category label of the $i$-th sample and $C_i$ is the category label the model predicts for the $i$-th sample. The $map(\cdot)$ function maps each predicted category label to its best-matching actual category label. $\delta(x,y)$ denotes the indicator function defined in Equation (14):
(14) $\delta(x,y)=\begin{cases}1, & x=y;\\ 0, & x\neq y.\end{cases}$
F1 score: the F1 score (Bhattacharya et al., Citation2023; Jin et al., Citation2021a) for detected communities is defined as the harmonic mean of Precision and Recall:
(15) $F1(P,C)=\frac{2\cdot Precision(P,C)\cdot Recall(P,C)}{Precision(P,C)+Recall(P,C)}$
where $P$ and $C$ are the ground-truth and detected partitions as above.
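The $map(\cdot)$ step of Equation (13) is commonly realised with the Hungarian algorithm on the confusion matrix; a sketch of this standard construction (ours, using scipy), given ground-truth and predicted label arrays `y_true` and `y_pred`:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def clustering_accuracy(y_true, y_pred):
    """ACC of Equation (13): find the label permutation map(.) that
    maximises agreement, then count correctly mapped samples."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    K = int(max(y_true.max(), y_pred.max())) + 1
    agree = np.zeros((K, K), dtype=int)
    for t, p in zip(y_true, y_pred):
        agree[p, t] += 1                        # predicted label p vs true label t
    rows, cols = linear_sum_assignment(-agree)  # maximise total agreement
    return agree[rows, cols].sum() / y_true.size
```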

NMI and ARI assess community quality by measuring the agreement between the community partition predicted by an algorithm and the ground-truth community partition of the network:
(16) $NMI(P,C)=\frac{-2\sum_{a=1}^{A}\sum_{b=1}^{K} N_{a,b}\log\frac{N_{a,b}\,N}{N_{a,\cdot}\,N_{\cdot,b}}}{\sum_{a=1}^{A}N_{a,\cdot}\log\frac{N_{a,\cdot}}{N}+\sum_{b=1}^{K}N_{\cdot,b}\log\frac{N_{\cdot,b}}{N}}$
where $N_{a,b}=|P_a\cap C_b|$ is the number of nodes shared by the ground-truth community $P_a$ and the detected community $C_b$, $N_{a,\cdot}=|P_a|=\sum_{b=1}^{K}N_{a,b}$, $N_{\cdot,b}=|C_b|=\sum_{a=1}^{A}N_{a,b}$, and $N$ is the total number of nodes in $G$.
(17) $ARI(P,C)=\frac{\sum_{a=1}^{A}\sum_{b=1}^{K}\binom{N_{a,b}}{2}-\Big[\sum_{a=1}^{A}\binom{N_{a,\cdot}}{2}\sum_{b=1}^{K}\binom{N_{\cdot,b}}{2}\Big]\Big/\binom{N}{2}}{\frac{1}{2}\Big[\sum_{a=1}^{A}\binom{N_{a,\cdot}}{2}+\sum_{b=1}^{K}\binom{N_{\cdot,b}}{2}\Big]-\Big[\sum_{a=1}^{A}\binom{N_{a,\cdot}}{2}\sum_{b=1}^{K}\binom{N_{\cdot,b}}{2}\Big]\Big/\binom{N}{2}}$
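In practice NMI and ARI need not be coded by hand; scikit-learn's implementations correspond to Equations (16) and (17) (its default arithmetic-mean normalisation matches the Danon form of NMI). Continuing the `y_true`/`y_pred` example above:

```python
from sklearn.metrics import normalized_mutual_info_score, adjusted_rand_score

nmi = normalized_mutual_info_score(y_true, y_pred)  # Equation (16)
ari = adjusted_rand_score(y_true, y_pred)           # Equation (17)
```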

Figure 4. The update process of the structure centres on the Football network when initialised randomly.


The range of NMI and ARI is [0, 1]. The value is equal to 1 only if the community partition detected by an algorithm is completely identical to the ground-truth community partition, and 0 for a random partition.

5.3. Baselines

We compare our method with seven state-of-the-art methods on three widely used benchmark datasets, making a fair comparison that clearly presents the commonalities and differences between the models: ChebyNet, FastGCN, GAT, GCN, SimP-GCN, PPR-GCN and Bet-GAT.

  • ChebyNet (Defferrard et al., Citation2016): ChebyNet presents a formulation of CNNs in the context of spectral graph theory, which provides the necessary mathematical background and efficient numerical schemes to design fast localised convolutional filters on graphs.

  • FastGCN (Chen et al., Citation2018): FastGCN is a fast improvement of the GCN model proposed by Kipf and Welling (Citation2017) for learning graph embeddings. It generalises transductive training to an inductive manner and also addresses the memory bottleneck of GCN caused by the recursive expansion of neighbourhoods.

  • GAT (Veličković et al., Citation2017): Graph attention networks (GATs) are novel neural network architectures that operate on graph-structured data, leveraging masked self-attentional layers to address the shortcomings of prior methods based on graph convolutions or their approximations.

  • GCN (Kipf & Welling, Citation2017): A model for semi-supervised classification on graph-structured data. The GCN model uses an efficient layer-wise propagation rule based on a first-order approximation of spectral convolutions on graphs.

  • SimP-GCN (Jin et al., Citation2021b): There are many scenarios where node similarity plays a crucial role, which motivated the framework SimP-GCN. It can effectively and adaptively balance structure and feature information, as well as capture pairwise node similarity via self-supervised learning.

  • PPR-GCN (Zhang et al., Citation2023): A graph convolutional network based on the personalised web page ranking (PageRank) algorithm, proposed for community detection in attributed networks. It uses the strength of graph convolution in integrating node topology and attribute information, and combines it with personalised PageRank to decouple the prediction and propagation processes in the model.

  • Bet-GAT (Verma et al., Citation2023): An efficient graph-attention-based semi-supervised classification method in which training nodes are selected based on betweenness centrality. The model exploits the fact that GNN-based models rely on the aggregation of neighbourhood information: structurally well-positioned nodes ensure good aggregation of information, and this key idea is used to train the model.

5.4. Experimental results and analysis (RQ-1)

We compare the performance of our method (WC-GCN) with other state-of-the-art algorithms on the Cora, CiteSeer and PubMed real-world networks, all of which are attributed networks. All experiments report the average performance of the clustering results obtained by running each algorithm 30 times on each network; the experimental results are shown in Tables 4–9.

Table 4. ACC results for Cora dataset with different labels selected in each class.

Table 5. ACC results for the Citeseer dataset with different labels selected in each class.

Table 6. ACC results for Pubmed dataset with different labels selected in each class.

The experimental parameters in this paper are consistent with the parameters α1, α2 and α3 in the literature (Zheng et al., Citation2020). Experiments show that when α1∈[0.6,0.8], α2∈[0.15,0.35] and α3 = 1−α1−α2 > 0, the algorithm WC-GCN reaches the optimal value in most cases, so we set α1 = 0.6, α2 = 0.35 and α3 = 0.05 for unlabelled networks.

Table 4 presents the experimental results on the Cora dataset. Cora has 7 communities, and for each comparison method we selected 2, 3, 4, 5, 6, 7, 8, 9 and 10 labelled nodes per community as input. From the results in the table, we can see intuitively that the fewer the known input samples, the larger the accuracy advantage of our model WC-GCN over the other methods. When the number of input points increases, WC-GCN still outperforms the other methods, but its advantage over GAT becomes less pronounced. The reason is that GAT aggregates neighbours according to their importance with respect to the node, which further improves the accuracy of the results and the quality of the learning process.

Tables 5 and 6 report the experiments on the Citeseer and Pubmed datasets; Citeseer has 6 communities and Pubmed has 3. The trends are the same as in Table 4: when the number of known samples is small, the accuracy of WC-GCN is higher than that of the other methods, and as the number of known samples increases, WC-GCN still outperforms them.

The F1 score is an indicative measure of model accuracy. A similar trend is seen across the datasets in the F1 results, as illustrated in Tables 7–9, which compare the F1 scores of WC-GCN with the other state-of-the-art methods on the three benchmark datasets. WC-GCN is particularly advantageous when the number of known labels is small. In general, the proposed WC-GCN performs effectively in the experiments, with an obvious improvement over the comparison algorithms.

Table 7. F1 results for Cora dataset with different labels selected in each class.

Table 8. F1 results for the Citeseer dataset with different labels selected in each class.

Table 9. F1 results for Pubmed dataset with different labels selected in each class.

The experimental results show that the WC-GCN model uses a very small number of known samples to learn the important network nodes (structure centres), further spreads the influence of the structure centre nodes, constructs a balanced training set, and finally achieves the goal of network learning. At the same time, the learning framework has a certain fault tolerance: even when the very small set of known samples contains mislabelled samples, it can still learn accurate network structure centres without harming the accuracy of the network learning task.

5.5. Ablation study (RQ-2)

Two key steps of WC-GCN are updating the structure centres (UpSC) and pseudo-labelled set expansion (PLSet). To answer RQ2, we compare four variants of WC-GCN. (1) Variant 1 (w/o UpSC and PLSet) directly trains a GCN for community detection under the supervision of the initial structure centres, neither updating the structure centres nor constructing an expanded pseudo-labelled set. (2) Variant 2 (w/o UpSC) does not update the initial structure centres, but constructs a larger pseudo-labelled set based on them, which is used to train the final GCN for community detection. (3) Variant 3 (w/o PLSet) updates the initial structure centres, but does not expand the pseudo-labelled set before training the final GCN. (4) Variant 4 is the full model with both UpSC and PLSet. The results are shown in Table 10.

Table 10. ACC results for the comparison of the four variants of WC-GCN.

For the ablation experiments, we averaged the ACC results over the experiments with 9 different input sizes. We can roughly rank the four variants by performance: Variant 1 < Variant 2 < Variant 3 < Variant 4. (1) Variant 1 (w/o UpSC & PLSet) performs the worst. Since it neither updates the initial structure centres nor constructs an expanded pseudo-labelled set, when some of the initial structure centres are poor, training the GCN with a limited set containing inappropriate seeds yields an unsatisfactory community partition. (2) Variant 2 (w/o UpSC) performs better than Variant 1, but still falls far behind the full model. It expands the pseudo-labelled set on the basis of the initial structure centres, without updating them first. On the one hand, training the final GCN with an expanded set of pseudo-labelled nodes improves its propagation ability when detecting communities; on the other hand, expanding the pseudo-labelled set directly from inappropriate initial seeds may mislead the model. (3) Variant 3 performs better than Variant 1 but still worse than the full model. It updates the initial structure centres, which provides high-quality seeds for local community detection; however, a GCN trained with only the refined structure centres has limited propagation ability, since GCN models are semi-supervised models that require a certain amount of prior knowledge (labelled nodes) and propagate poorly when very few labelled nodes are known. (4) The performance of Variant 2 is lower than that of Variant 3, indicating that refining the structure centres has a larger impact than expanding the pseudo-labelled set.

Based on the above analysis, we conclude that both structure-centre updating and pseudo-labelled set expansion are essential for WC-GCN to achieve its best performance. By updating the initial structure centres, the former step obtains a set of high-quality seed nodes, which lays a good foundation for local community detection. By expanding the pseudo-labelled set, the latter prepares a larger amount of supervision information for training the GCN, which helps to improve its label propagation ability.

5.6. Case study for updating structure centre (RQ-3)

To answer RQ3, we conduct case studies on the Football network, and visualise the process of structure centre refinement.

In Figure 4, we randomly select 12 initial structure centres, which are located in only 6 communities, as shown in Figure 4(b). After 11 iterations of updates, the refined 12 structure centres come from 11 communities, as shown in Figure 4(d). Therefore, even when the initial structure centres are selected at random, Algorithm 1 can refine them into a set of more representative seeds scattered across different communities. That is to say, it can overcome the sensitivity to initial structure centres to some extent and reduce the adverse effect of inappropriate structure centres on community detection.

5.7. The influence of the size of the training set on the GCN (RQ-4)

Regarding the influence of the size of the training set on the GCN results, we conducted related experiments with NMI and ARI on the PolBlogs, Cora, Citeseer and Pubmed datasets, and investigated how performance varies with the number of expanded nodes per pseudo label (i.e. τ).

The experimental results are shown in Figure 5. We observe that when the training set is very small, the NMI and ARI values are low; once the training set exceeds a certain size, the metrics barely change. On all datasets, the community detection performance of WC-GCN with the expansion strategy generally improves as the number of expanded nodes per pseudo label grows to a moderate size. The reason is that a larger set of pseudo-labelled nodes makes up for the limited label propagation ability of the local graph convolutional filter. The experiments show that constructing a pseudo-label training set to train the GCN compensates for the weak partitioning ability of a GCN trained on a small training set, and improves the partition quality of the GCN model.

Figure 5. Performance variation on each network dataset with the size of the training set.


6. Conclusion

This paper proposes a community detection method based on a weakly-supervised graph convolutional neural network (WC-GCN) with updated structure centre selection. The structure-centre selection algorithm not only considers the topological structure and attribute information, but also alleviates structure-centre dependence. Secondly, the structural centrality index Cstruct is proposed to measure the representativeness of a subgraph, so that more accurate network structure centres are learned. Thirdly, a self-training method is used to expand some nearby nodes with high credibility and build a balanced training set, further improving the model. Finally, the proposed method is evaluated experimentally on various real-world networks; the results show that it performs more effectively than current state-of-the-art community detection algorithms.

One limitation of this study is that node attributes are assumed to be homogeneous, whereas in reality they are often heterogeneous. Extending the proposed algorithm to networks with heterogeneous attributes, and handling dynamic networks, heterogeneous networks, or networks with noisy or missing data, are promising directions for future work.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Additional information

Funding

This work was supported by National Natural Science Foundation of China [grant number 11702289]; Key Core Technology and Generic Technology R&D Project of Shanxi Province [grant number 2020XXX013].

References

  • Adamic, L. A., & Glance, N. (2005, August). The political blogosphere and the 2004 U.S. Election: Divided they blog. LinkKDD '05: Proceedings of the 3rd international workshop on Link discovery, ACM, New York, NY, USA, pp. 36–43. https://doi.org/10.1145/1134271.1134277
  • Berahmand, K., Bouyer, A., & Vasighi, M. (2018). Community detection in complex networks by detecting and expanding core nodes through extended local similarity of nodes. IEEE Transactions on Computational Social Systems, 5(4), 1021–1033. https://doi.org/10.1109/TCSS.2018.2879494
  • Bhattacharya, R., Nagwani, N. K., & Tripathi, S. (2023). CommunityGCN: Community detection using node classification with graph convolution network. Data Technologies and Applications, https://doi.org/10.1108/DTA-02-2022-0056
  • Chen, J., Ma, T., & Xiao, C. (2018). Fastgcn: Fast learning with graph convolutional networks via importance sampling, CoRR abs/1801.10247. arXiv:1801.10247. http://arxiv.org/abs/1801.10247
  • Danon, L., Díaz-Guilera, A., Duch, J., & Arenas, A. (2005). Comparing community structure identification. Journal of Statistical Mechanics: Theory and Experiment, 2005(09), P09008–P09008. https://doi.org/10.1088/1742-5468/2005/09/P09008
  • Defferrard, M., Bresson, X., & Vandergheynst, P. (2016). Convolutional neural networks on graphs with fast localized spectral filtering. In D. Lee, M. Sugiyama, U. Luxburg, I. Guyon, & R. Garnett (Eds.), Advances in neural information processing systems (Vol. 29). Curran Associates, Inc. http://arxiv.org/abs/1606.09375
  • Fortunato, S., & Hric, D. (2016). Community detection in networks: A user guide. Physics Reports, 659, 1–44. https://doi.org/10.1016/j.physrep.2016.09.002
  • Girvan, M., & Newman, M. E. J. (2002). Community structure in social and biological networks. Proceedings of the National Academy of Sciences, 99(12), 7821–7826. https://doi.org/10.1073/pnas.122653799
  • He, C., Fei, X., Cheng, Q., Li, H., Hu, Z., & Tang, Y. (2021a). A survey of community detection in complex networks using nonnegative matrix factorization. IEEE Transactions on Computational Social Systems, 9(2), 440–457. https://doi.org/10.1109/TCSS.2021.3114419
  • He, C., Zheng, Y., Fei, X., Li, H., Hu, Z., & Tang, Y. (2021b). Boosting nonnegative matrix factorization based community detection with graph attention auto-encoder. IEEE Transactions on Big Data, 8(4), 968–981. https://doi.org/10.1109/TBDATA.2021.3103213
  • He, D., Song, Y., Jin, D., Feng, Z., Zhang, B., Yu, Z., & Zhang, W. (2020). Community-centric graph convolutional network for unsupervised community detection. In C. Bessiere (Ed.), Proceedings of the twenty-ninth international joint conference on artificial intelligence, IJCAI-20 (pp. 3515–3521). International Joint Conferences on Artificial Intelligence Organization, main track. https://doi.org/10.24963/ijcai.2020/486
  • He, K., Li, Y., Soundarajan, S., & Hopcroft, J. E. (2018). Hidden community detection in social networks. Information Sciences, 425, 92–106. https://doi.org/10.1016/j.ins.2017.10.019
  • Huang, J., Sun, H., Liu, Y., Song, Q., & Weninger, T. (2011). Towards online multiresolution community detection in large-scale networks. PLoS One, 6(8), e23829. https://doi.org/10.1371/journal.pone.0023829
  • Jin, D., Li, B., Jiao, P., He, D., & Shan, H. (2019a). Community detection via joint graph convolutional network embedding in attribute network. Lecture Notes in Computer Science, 11731, 594–606. https://doi.org/10.1007/978-3-030-30493-5_55
  • Jin, D., Liu, Z., Li, W., He, D., & Zhang, W. (2019b). Graph convolutional networks meet Markov random fields: Semi-supervised community detection in attribute networks. Proceedings of the AAAI Conference on Artificial Intelligence, 33(1), 152–159. https://doi.org/10.1609/aaai.v33i01.3301152
  • Jin, D., Yu, Z. Z., Jiao, P. F., Pan, S. R., He, D. X., Wu, J., Yu, P. S., & Zhang, W. X. (2021a). A survey of community detection approaches: From statistical modeling to deep learning. IEEE Transactions on Knowledge and Data Engineering, 35(2), 1149–1170. https://doi.org/10.1109/TKDE.2021.3104155
  • Jin, W., Derr, T., Wang, Y., Ma, Y., Liu, Z., & Tang, J. (2021b, March). Node similarity preserving graph convolutional networks. WSDM '21: Proceedings of the 14th ACM International Conference on Web Search and Data Mining, Association for Computing Machinery, New York, NY, USA, pp. 148-156. https://doi.org/10.1145/3437963.3441735
  • Kingma, D. P., & Ba, J. L. (2015, May 7-9). Adam: A method for stochastic optimization. Proceedings of the 3rd International Conference on Learning Representations (ICLR 2015), Conference Track Proceedings, San Diego, CA, USA. https://doi.org/10.48550/arXiv.1412.6980
  • Kipf, T. N., & Welling, M. (2016, November 21). Variational graph auto-encoders. Bayesian Deep Learning Workshop (NIPS 2016). https://doi.org/10.48550/arXiv.1611.07308
  • Kipf, T. N., & Welling, M. (2017, February). Semi-supervised classification with graph convolutional networks. International Conference on Learning Representations (ICLR). https://doi.org/10.48550/arXiv.1609.02907
  • Kochenderfer, M. J., & Wheeler, T. A. (2019). Algorithms for optimization. Mit Press. https://doi.org/10.1007/978-1-4757-3222-1_2
  • Kou, S., Xia, W., Zhang, X., Gao, Q., & Gao, X. (2021). Self-supervised graph convolutional clustering by preserving latent distribution. Neurocomputing, 437, 218–226. https://doi.org/10.1016/j.neucom.2021.01.082
  • Hoffmann, T., Peel, L., Lambiotte, R., & Jones, N. S. (2020). Community detection in networks without observing edges. Science Advances, 6(4), eaav1478. https://doi.org/10.1126/sciadv.aav1478
  • Lancichinetti, A., Fortunato, S., & Kertész, J. (2009). Detecting the overlapping and hierarchical community structure in complex networks. New Journal of Physics, 11(3), 033015. https://doi.org/10.1088/1367-2630/11/3/033015
  • Lee, C., Reid, F., McDaid, A. F., & Hurley, N. J. (2010). Detecting highly overlapping community structure by greedy clique expansion. Proceedings of the 4th SNA-KDD Workshop on Social Network Mining and Analysis, pp. 33–42.
  • Li, J., Wang, X., & Wu, P. (2015). Review on community detection methods based on local optimization. Bulletin of Chinese Academy of Sciences, 30(2), 238–247. https://doi.org/10.16418/j.issn.1000-3045.2015.02.011
  • Li, Q., Han, Z., & Wu, X.-M. (2018). Deeper insights into graph convolutional networks for semi-supervised learning. Proceedings of the 32nd AAAI Conference on Artificial Intelligence, 32(1), 3538–3545. https://doi.org/10.1609/aaai.v32i1.11604
  • Liu, H., Wu, Z., Li, X., Cai, D., & Huang, T. S. (2011). Constrained nonnegative matrix factorization for image representation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34(7), 1299–1311. https://doi.org/10.1109/TPAMI.2011.217
  • Moradi, F., Olovsson, T., & Tsigas, P. (2014). A local seed selection algorithm for overlapping community detection. 2014 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM 2014), Beijing, China, pp. 1–8. https://doi.org/10.1109/ASONAM.2014.6921552
  • Namata, G., London, B., Getoor, L., Huang, B., & Edu, U. (2012). Query-driven active surveying for collective classification. 10th International Workshop on Mining and Learning with Graphs (MLG-2012), Edinburgh, Scotland, UK, 8, 1.
  • Raghavan, U. N., Albert, R., & Kumara, S. (2007). Near linear time algorithm to detect community structures in large-scale networks. Physical Review E, 76(3), 036106. https://doi.org/10.1103/PhysRevE.76.036106
  • Rand, W. M. (1971). Objective criteria for the evaluation of clustering methods. Journal of the American Statistical Association, 66(336), 846–850. https://doi.org/10.1080/01621459.1971.10482356
  • Rostami, M., Oussalah, M., Berahmand, K., & Farrahi, V. (2023). Community detection algorithms in healthcare applications: A systematic review. IEEE Access, 11, 30247–30272. https://doi.org/10.1109/ACCESS.2023.3260652
  • Sen, P., Namata, G., Bilgic, M., Getoor, L., & Eliassi-Rad, T. (2008). Collective classification in network data. AI Magazine, 29(3), 93–106. https://doi.org/10.1609/aimag.v29i3.2157
  • Šubelj, L., & Bajec, M. (2012). Ubiquitousness of link-density and link-pattern communities in real-world networks. The European Physical Journal B, 85(1), 32. https://doi.org/10.1140/epjb/e2011-20448-7
  • Sun, H., & Yang, J. (2023). The generalization of non-negative matrix factorization based on algorithmic stability. Electronics, 12(5), 1147. https://doi.org/10.3390/electronics12051147
  • Sun, K., Lin, Z., & Zhu, Z. (2020). Multi-stage self-supervised learning for graph convolutional networks on graphs with few labeled nodes. Proceedings of the AAAI Conference on Artificial Intelligence, 34(4), 5892–5899. https://doi.org/10.1609/aaai.v34i04.6048
  • Veličković, P., Cucurull, G., Casanova, A., Romero, A., Liò, P., & Bengio, Y. (2017). Graph attention networks. International Conference on Learning Representations (ICLR), pp. 1–12. http://arxiv.org/abs/1710.10903
  • Verma, A. K., Saxena, R., Jadeja, M., Bhateja, V., & Lin, J. C. W. (2023). Bet-GAT: An efficient centrality-based graph attention model for semi-supervised node classification. Applied Sciences, 13(2), 847. https://doi.org/10.3390/app13020847
  • Wang, X., Li, J., Yang, L., & Mi, H. (2021). Unsupervised learning for community detection in attributed networks based on graph convolutional network. Neurocomputing, 456, 147–155. https://doi.org/10.1016/j.neucom.2021.05.058
  • Wang, X., Liu, G., Li, J., & Nees, J. P. (2017). Locating structural centers: A density-based clustering method for community detection. PLoS One, 12(1), e0169355. https://doi.org/10.1371/journal.pone.0169355
  • Xie, Y., Xu, Z., Zhang, J., Wang, Z., & Ji, S. (2023). Self-supervised learning of graph neural networks: A unified review. IEEE Transactions on Pattern Analysis and Machine Intelligence, 45(2) 2412–2429. https://doi.org/10.1109/TPAMI.2022.3170559
  • Xu, K., Hu, W., Leskovec, J., & Jegelka, S. (2018). How powerful are graph neural networks?, https://doi.org/10.48550/arXiv.1810.00826
  • Yang, L., Cao, X., He, D., Wang, C., Wang, X., & Zhang, W. (2016). Modularity based community detection with deep learning. Proceedings of the 25th International Joint Conference on Artificial Intelligence (IJCAI-16), 16(2016), 2252–2258. https://doi.org/10.5555/3060832.3060936
  • Zhang, W., Shang, R., Li, Z., Sun, R., & Du, J. (2023). Personalized web page ranking based graph convolutional network for community detection in attribute networks. IEEE Access, 11, 84270–84282. https://doi.org/10.1109/ACCESS.2023.3303210
  • Zheng, W.-P., Che, C.-H., Qian, Y.-H., Wang, J., & Yang, G. (2020). A graph clustering algorithm based on paths between nodes in complex networks. Chinese Journal of Computers, 43(7), 1312–1327. https://doi.org/10.11897/SP.J.1016.2020.01312
  • Zhou, J., Cui, G., Hu, S., Zhang, Z., Yang, C., Liu, Z., Wang, L., Li, C., & Sun, M. (2020). Graph neural networks: A review of methods and applications. AI Open, 1, 57–81. https://doi.org/10.1016/j.aiopen.2021.01.001