Security situational awareness of power information networks based on machine learning algorithms


Abstract

To accurately predict the security posture of power information networks, we propose a method based on machine learning algorithms for detecting their security condition. A perception model is established by abstracting the perception problem. Sample data are first pre-processed using linear discriminant analysis (LDA) to optimise the data, obtain fused features, and determine the best projection. The processed data are then fed into an RBF neural network as training data to assess the system's security posture and learn the mapping to network posture values. Finally, simulations using the KDD Cup99 dataset and attack data from power information networks demonstrate the reliability of the proposed technique for network security posture analysis, with detection rates generally exceeding 90%.


1. Introduction

Dispatching automation systems are increasingly exposed to cyber threats and attacks as smart grids are built out and information technology continues to advance (Envelope, Citation2022). Efficiency, global connectivity, data sharing, remote work, economic development, innovation, IoT, critical infrastructure, communication, digital services, education, and emergency response all depend on information networks. For instance, in 2010 the Stuxnet virus attacked Iran's uranium enrichment infrastructure by exploiting previously unknown vulnerabilities in information systems. Because traditional security techniques are no longer adequate to meet protection needs, situational awareness has emerged as a novel method of cyber security evaluation and protection (Liu & Yang, Citation2023). DoS and DDoS attacks, malware, network congestion, insider threats, ransomware infections, data breaches, APTs, botnets, IoT vulnerabilities, phishing, and software flaws all threaten the performance of a company's network, so effective security measures are essential to sustain peak performance. This has made situational awareness a crucial component of industrial control system cyber security. By examining key behavioural and state data in the network, situational awareness aims to identify whether a network is at risk from cyberattacks and to quantify the security status of the system.

Situational awareness was first introduced in disciplines such as sociology and avionics. The idea of situational awareness for cyber security was first presented by (Liang et al., Citation2021), who also built a reference model based on the concept of data fusion for the identification, assessment, and localisation of cyberattacks. Experiments in cybersecurity situational awareness cover the collection and preparation of data as well as the assessment and analysis of algorithms, with a particular focus on studying and optimising algorithms for anomaly detection and threat identification. Situational awareness in cybersecurity is not a brand-new idea in power systems. (Xu et al., Citation2021) proposed a paradigm for network security situational awareness in terms of physical security, system security, and protocol security and applied it to substation networks, but without performing any specific quantitative calculations. (Yang et al., Citation2021) start from a security situational awareness approach for computer networks, combining the probability of attack occurrence, the probability of attack success, and the threat of attack to evaluate the current state of the power grid. However, that method does not describe its feature extraction in detail, and the information extracted from the features has not been validated. For effective anomaly detection and situational comprehension, the security situational awareness framework in this paper blends Linear Discriminant Analysis (LDA) and Radial Basis Function (RBF) neural networks, aiming for thorough detection, fewer false positives, efficient resource utilisation, and flexibility. Feature extraction is a crucial stage in dimensionality reduction and can be accomplished with a variety of techniques. Principal Component Analysis (PCA), which converts high-dimensional data into a lower-dimensional space while retaining as much variance as feasible, is one of the most frequently used approaches. LDA instead projects high-dimensional data onto discriminative directions to obtain a lower-dimensional space: to determine the most discriminative directions, it computes class means and scatter matrices and then performs an eigenvalue decomposition.

Machine learning offers a range of options, including neural networks and random forests, which open new opportunities for situational awareness of network security as artificial intelligence advances quickly (Chen et al., Citation2021a). (Liu et al., Citation2021) suggested a ball vector machine classifier approach for electric power information networks, based on a quantum genetic algorithm with optimised training parameters, for precise classification of network posture. (Chen et al., Citation2021b), on the other hand, increase classification accuracy by splitting the dataset into subsets and integrating training and learning in a distribution line fault classification approach based on the random forest algorithm. The K-nearest neighbour technique is also used by (Wang et al., Citation2021) as a classifier for intrusion detection systems to find illegal attacks. In another study based on machine learning algorithms, IoT network security is predicted using preprocessed data and an RBF neural network. Since its deployment at China Mobile, the technique has identified over 65,000 suspected illicit IoT cards, improving efficiency, detection, and classification while lowering operator expenses and false alarm rates (Meng, Citation2022).

This study proposes a security situational awareness approach that combines linear discriminant analysis (LDA) and radial basis function (RBF) neural networks to integrate the features of existing situational awareness methods (Shu et al., Citation2021). Because network features are complex and diverse, the network state cannot be understood and perceived accurately when raw features are fed directly into a neural network (Liu et al., Citation2022). The samples are therefore first pre-processed using LDA, which fuses and extracts feature metrics and finds the projection that gives the data its best separability. The goal of state awareness is then achieved by training the RBF neural network model on the processed input to identify the mapping relationship with the network state values (Yu et al., Citation2021). The RBF neural network model is adaptable and well suited to non-linear data modelling and decision-making, since it can perform pattern recognition, regression, classification, anomaly detection, and time series analysis. In related work, (Wu et al., Citation2022) investigate network security situation awareness and measurement using a Scrapy web crawler architecture. Data were gathered from the China Computer Network Intrusion Prevention Centre's vulnerability database and the Zhiming network security event websites. The development of a text-based analysis tool improved data cleansing and offered complete answers. In comparison to conventional methods, the crawler algorithm boosted capacity by 12.79% and 29.33% and decreased reading time by 63.5% and 87.2% (Wu et al., Citation2022).

In summary, the development of the smart grid and improvements in information technology are making dispatching automation systems more susceptible to cyber-attacks. Situational awareness is a novel technique for assessing and safeguarding complex systems through the analysis of behavioural and state data. Machine learning techniques such as neural networks and random forests offer new options for situational awareness in network security. However, because the network state is complex, using it directly as input to a neural network is difficult. A strategy based on machine learning methods is therefore proposed to forecast the security posture of power information networks. Sample data are pre-processed with linear discriminant analysis to obtain integrated features and improve the quality of the data. Simulations using the KDD Cup99 dataset and attack data from power information networks show the technique's dependability.

2. Power information network security situational awareness methods

Network risks significantly affect grid operations in power systems because the grid is becoming more and more dependent on information networks (Huang et al., Citation2021). Performance metrics such as the operation, traffic patterns, and status tracking of devices in the power information network need to be continuously monitored, collected, and extracted (Li et al., Citation2021) to effectively evaluate and anticipate the cyber security posture of a system. With understanding and predictive capabilities, information network defense can become proactive rather than reactive, enabling the prompt implementation of efficient security measures to defend the grid against assault. Three key components of power information network security situational awareness are as follows.

2.1. Situation element extraction

In this module, the security condition and raw data of the network under attack are obtained using a variety of sensors or detection devices, and the typical indicators that have the greatest impact on the power information network are extracted to serve as the data foundation for subsequent work. Using a variety of sensors and detection tools in a network that is being attacked enables thorough monitoring, real-time analysis, and anomaly detection, and provides a full picture of the network and its security condition for forensic investigation.

2.2. Posture understanding

To map the network's security state and create a macroscopic situational awareness model, the extracted situational element information is analysed using neural networks or mathematical models to ascertain the relationship between that information and the security situation. The macroscopic situational awareness approach improves decision-making and knowledge across a variety of fields by using neural networks to extract useful information from complicated data environments.

2.3. Situation prediction

Based on the extraction of posture components from the power dispatch automation system and an understanding of the mapping model, the security risk assessment and prediction of the electric power information network makes qualitative or quantitative inferences about the values of the network security posture. The security risk assessment for the electric power information network employs a combination of qualitative and quantitative analysis to pinpoint threats, weigh consequences, and foresee weaknesses, allowing for well-informed decision-making and ongoing development.

2.4. Security posture level classification

The security posture of the dispatch automation system is divided into five assessment levels per the “Information Security Technology Information Security Risk Assessment Specification” (GB/T 20984-2007), which takes into account the risk factors of the system and the threat posed by attackers. This specification provides an organised framework, with techniques and best practices, for managing and analysing information security risks inside an organisation. Security posture values in the [0, 1] interval are then used to quantitatively describe the system behaviour and network characteristics at each level (Zhang et al., Citation2021). The evaluation of machine learning-based methods for network security situational awareness is based on these security levels. To comprehensively assess the security levels and use a rating scale to quantify the values under each level, we combine the observed phenomena of various attacks, such as the number of active ports, the severity of virus threats, and the number of open vulnerabilities (Song et al., Citation2021). The network security posture scale is shown in Table 1. Through this security level classification and quantitative description of security posture values, we can accurately assess the security posture of the dispatch automation system and provide an effective basis for evaluating and optimising machine learning-based network security posture awareness methods.

Table 1. Network Security Situation Level Table.


3. LDA-RBF-based network security situational awareness algorithm

3.1. LDA

LDA projects high-dimensional samples into the best discriminative vector space in order to extract classification information and reduce the dimensionality of the feature space (Wang et al., Citation2021). It is a supervised method for classification problems that improves data representation and classification performance by mapping high-dimensional data into a lower-dimensional space. Because information network features are varied and complex, LDA can be used to provide an optimal sample projection, extract integrated features, and eliminate redundant or overly complex information. LDA decreases dimensionality while increasing class differences, producing informative features for the subsequent RBF network. With better classification accuracy and well-defined decision boundaries, the combined method is well suited to identifying complicated, non-linear patterns and anomalies. Both PCA and LDA are dimensionality reduction methods: PCA concentrates on reducing data dimensions, whereas LDA increases class separation in classification tasks. PCA is flexible across tasks, whereas LDA emphasises clear class boundaries.

Let the number of samples collected from the power information network be $n$, the total number of features be $d$, and the number of categories (security categories) in the sample matrix $X$ be $c$. The number of samples in category $ω_{i}$ is $n_{i}$, with $\sum_{i = 1}^{c} n_{i} = n$, and $x_{i j}$ denotes the $j$th sample of category $i$. The centroids of each category and of all samples are $μ_{i} = \frac{1}{n_{i}} \sum_{j = 1}^{n_{i}} x_{i j}$ and $μ = \frac{1}{n} \sum_{i = 1}^{c} \sum_{j = 1}^{n_{i}} x_{i j}$ respectively. Let the projection matrix be $W \in R^{d \times d'}$; the projected sample matrix is then $Z = W^{T} X$, with projected samples $z_{i j} = W^{T} x_{i j}$, and the centroids of the projected class $i$ samples, $ζ_{i}$, and of all projected samples, $ζ$, are:

$ζ_{i} = \frac{1}{n_{i}} \sum_{j = 1}^{n_{i}} z_{i j} = \frac{1}{n_{i}} \sum_{j = 1}^{n_{i}} W^{T} x_{i j} = W^{T} μ_{i}$ (1)

$ζ = \frac{1}{n} \sum_{i = 1}^{c} \sum_{j = 1}^{n_{i}} z_{i j} = \frac{1}{n} \sum_{i = 1}^{c} \sum_{j = 1}^{n_{i}} W^{T} x_{i j} = W^{T} μ$ (2)

LDA finds an optimal discriminative projection with the following objective function:

$J = \arg \max_{W} \frac{J_{B}}{J_{W}} = \arg \max_{W} \frac{| W^{T} S_{B} W |}{| W^{T} S_{W} W |}$ (3)

$S_{B} = \sum_{i = 1}^{c} (μ_{i} - μ) (μ_{i} - μ)^{T}$ (4)

$S_{W} = \sum_{i = 1}^{c} \sum_{j = 1}^{n_{i}} (x_{i j} - μ_{i}) (x_{i j} - μ_{i})^{T}$ (5)

where:

$S_{B}$: inter-class (between-class) scatter matrix of the samples.

$S_{W}$: intra-class (within-class) scatter matrix of the samples.

Solving the generalised eigenvalue problem $S_{B} W = S_{W} W Λ$, where $Λ$ is the diagonal matrix of eigenvalues, yields the optimal projection matrix $W$. The steps of the algorithm for solving the LDA projection matrix are as follows.

Step 1: Input the training sample matrix $X \in R^{d \times n}$ , where $d$ denotes the number of feature indicators collected and $n$ is the number of samples collected. Solve for the centre $μ_{i}$ of each class of samples as well as the overall centre $μ$ .
Step 2: Solve for the inter-class scatter matrix $S_{B}$ and the intra-class scatter matrix $S_{W}$ .
Step 3: Find the eigenvalues and eigenvectors of $S_{B} W = S_{W} W Λ$ ; the projection matrix $W$ consists of the first $d'$ eigenvectors, i.e. those with the largest eigenvalues.
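The three steps above can be sketched in NumPy as follows. This is a minimal illustration, not the authors' implementation: the function name `lda_projection`, the use of a pseudo-inverse to guard against a singular $S_{W}$, and the synthetic two-class data are all assumptions made for the example.

```python
import numpy as np

def lda_projection(X, y, d_out):
    """Return a d x d_out projection matrix W for samples stored column-wise.

    X: array of shape (d, n), one column per sample (as in the text).
    y: array of shape (n,), class label of each column.
    """
    d, _ = X.shape
    mu = X.mean(axis=1, keepdims=True)              # overall centroid
    S_B = np.zeros((d, d))
    S_W = np.zeros((d, d))
    for label in np.unique(y):
        X_c = X[:, y == label]
        mu_c = X_c.mean(axis=1, keepdims=True)      # class centroid
        S_B += (mu_c - mu) @ (mu_c - mu).T          # Eq. (4)
        S_W += (X_c - mu_c) @ (X_c - mu_c).T        # Eq. (5)
    # Assumption: solve S_B w = lambda S_W w through pinv(S_W) @ S_B,
    # which still works when S_W is singular.
    eigvals, eigvecs = np.linalg.eig(np.linalg.pinv(S_W) @ S_B)
    order = np.argsort(eigvals.real)[::-1][:d_out]  # largest eigenvalues first
    return eigvecs[:, order].real

# Synthetic example: two classes in 3-D that differ along the first feature.
rng = np.random.default_rng(0)
class_a = rng.normal(0.0, 1.0, size=(3, 50))
class_b = rng.normal(0.0, 1.0, size=(3, 50)) + np.array([[5.0], [0.0], [0.0]])
X = np.hstack([class_a, class_b])
y = np.array([0] * 50 + [1] * 50)

W = lda_projection(X, y, d_out=1)
Z = W.T @ X                                         # projected samples
print(W.shape, Z.shape)
```

With $c$ classes, $S_{B}$ has rank at most $c - 1$, so at most $c - 1$ projection directions carry discriminative information.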

3.2. RBF neural networks

RBF neural networks do not suffer from local minima or slow learning rates. (Lin et al., Citation2021) demonstrated that RBF neural networks can approximate arbitrary nonlinear functions with any accuracy. In machine learning techniques such as SVMs and clustering, the Radial Basis Function (RBF) is a kernel function that improves classification and clustering accuracy by preserving nonlinear features. The structure of RBF neural networks is shown in Figure 1.

Figure 1. RBF neural network structure.

Common kernel types include string, custom, sigmoid, Laplacian, polynomial, and Gaussian kernels. Between them, these kernels capture complicated non-linear, polynomial, and sigmoidal relationships, localised patterns, outliers, sequence matches, and problem-specific relationships. Various kernel functions can be used in the hidden layer nodes, but the most common is the Gaussian function, with $R_{i} (z)$ expressed as:

$R_{i} (z) = G ( \| Z_{k} - C_{i} \| ) = \exp [ - \frac{\| Z_{k} - C_{i} \|^{2}}{2 σ_{i}^{2}} ]$ (6)

where:

$G$: basis function.

$Z_{k}$: $k$ th sample vector.

$C_{i}$: centre of the $i$ th hidden layer neuron.

$σ_{i}$: scale (width) parameter of the kernel function.

$\| Z_{k} - C_{i} \|$: Euclidean norm of $Z_{k} - C_{i}$ .

The scale parameter controls the width of each Gaussian, and the basis functions are the elementary components from which the network output is built. The output of the network can be obtained from Figure 1 as:

$y = \sum_{i = 1}^{n} ω_{i} R_{i} (z)$ (7)

where:

$y$: output.

$ω_{i}$: network weight between hidden layer node $i$ and the output layer.

$n$: number of hidden layer nodes in this formula.
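Equations (6) and (7) can be illustrated with a small NumPy sketch. The source does not say how the centres, widths, or output weights are chosen, so the evenly spaced centres, the fixed width, and the least-squares fit of the output weights below are assumptions made for illustration. Samples are stored one per row here.

```python
import numpy as np

def rbf_hidden(Z, centres, sigma):
    """Gaussian hidden-layer activations R_i(z) of Eq. (6).

    Z: (n_samples, d'), centres: (m, d'), sigma: (m,) widths.
    Returns an (n_samples, m) activation matrix.
    """
    sq_dist = ((Z[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / (2.0 * sigma ** 2))

def rbf_fit(Z, targets, centres, sigma):
    """Assumption: output weights are fitted by linear least squares."""
    R = rbf_hidden(Z, centres, sigma)
    weights, *_ = np.linalg.lstsq(R, targets, rcond=None)
    return weights

def rbf_predict(Z, centres, sigma, weights):
    """Network output y = sum_i w_i R_i(z), Eq. (7)."""
    return rbf_hidden(Z, centres, sigma) @ weights

# Toy regression: approximate sin(z) on [0, 2*pi] with 10 Gaussian units.
Z = np.linspace(0.0, 2.0 * np.pi, 40)[:, None]
targets = np.sin(Z[:, 0])
centres = np.linspace(0.0, 2.0 * np.pi, 10)[:, None]   # hypothetical centres
sigma = np.full(10, 0.7)                                # hypothetical widths

weights = rbf_fit(Z, targets, centres, sigma)
max_error = np.abs(rbf_predict(Z, centres, sigma, weights) - targets).max()
print(max_error)
```

Because the output layer is linear in the weights, fitting it for fixed centres and widths is a convex problem, which is one reason RBF networks are described as avoiding local minima.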

3.3. Algorithm flow

Building on the theoretical description above, the proposed overall prediction process for power information network security situational awareness is as follows.

Step 1: According to the properties of the power information network, raw network data is gathered, features are extracted to create sample sets, and security category labels are then applied. Table 2 displays the format of the data that was gathered. Each record is represented as a vector of several dimensions, each of which corresponds to a particular collected indicator, and the observed network behaviour determines the security category of that record.

Table 2. Acquisition data format.


Step 2: The sample data generated in step 1 is divided into training and test sample sets, and the training sample matrix is $X \in R^{d \times n}$ , where $d$ denotes the number of feature indicators collected and $n$ is the number of samples collected. Using the LDA optimisation process, the projection matrix $W$ and the sample matrix $Z$ in the optimal projection space are obtained. Linear discriminant analysis (LDA) is used in several disciplines to improve classification accuracy, data visualisation, pattern identification, and medical diagnosis, and it helps mitigate obstacles such as class imbalance, the curse of dimensionality, sensitivity to anomalies, and dependence on the data distribution.

Step 3: Build the RBF neural network, use the LDA pre-processed data as the training input of the RBF, and use the attack category or security index corresponding to this sample matrix as the training output. Then train the RBF neural network model, and the training is complete when a specific network error is satisfied, i.e. discover the mapping relationship with the network posture value.

Step 4: The test data is used as input and, after passing through the projection matrix $W$ and the RBF neural network model, yields the corresponding situational awareness results.

The flow chart of the algorithm is shown in Figure 2.

Figure 2. Algorithm flow chart.
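Steps 1 to 4 can be strung together as in the sketch below. It is only an illustration under several assumptions that are not in the source: scikit-learn's `LinearDiscriminantAnalysis` stands in for the LDA step, hidden-layer centres come from k-means, the width uses the common heuristic $σ = d_{max} / \sqrt{2 m}$, the output weights are fitted by least squares on one-hot labels, and synthetic blobs replace real network data. Note that scikit-learn stores samples as rows, not columns.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split

def gaussian_layer(Z, centres, sigma):
    """Gaussian hidden-layer activations for row-wise samples Z."""
    sq_dist = ((Z[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / (2.0 * sigma ** 2))

# Step 1: labelled samples (synthetic stand-in for collected network data).
X, y = make_blobs(n_samples=300, centers=3, n_features=6, random_state=0)

# Step 2: split into training and test sets, fit LDA on the training set only.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y)
lda = LinearDiscriminantAnalysis(n_components=2).fit(X_train, y_train)
Z_train, Z_test = lda.transform(X_train), lda.transform(X_test)

# Step 3: build and train the RBF network on the projected training data.
centres = KMeans(n_clusters=6, n_init=10, random_state=0).fit(Z_train).cluster_centers_
d_max = np.linalg.norm(centres[:, None] - centres[None], axis=2).max()
sigma = d_max / np.sqrt(2 * len(centres))           # heuristic width (assumption)
one_hot = np.eye(3)[y_train]
weights, *_ = np.linalg.lstsq(
    gaussian_layer(Z_train, centres, sigma), one_hot, rcond=None)

# Step 4: project the test data with the same LDA model and classify it.
y_pred = (gaussian_layer(Z_test, centres, sigma) @ weights).argmax(axis=1)
accuracy = (y_pred == y_test).mean()
print(accuracy)
```

The key design point is that the projection is learned from the training set only and then reused unchanged on the test data, so no test information leaks into the feature extraction.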

3.4. Security situational awareness framework

The security situational awareness framework combines the LDA-RBF algorithm, which serves as its core module, with the data acquisition module and the human-machine interface to form the security situational awareness structure of the power information network. The LDA-RBF technique combines Linear Discriminant Analysis (LDA) with a Radial Basis Function (RBF) neural network to handle complicated data interactions: it reduces dimensionality, makes use of non-linear transformation capabilities, and differentiates between normal and unusual behaviour, which improves security monitoring and anomaly detection. Figure 3 depicts the power information network security situational awareness structure. Another module is the “User Interaction and Feedback Module”, which improves usability and user experience by offering interactive features, feedback mechanisms, and instructions.

Figure 3. Power information network security situational awareness structure.

In the field monitoring region, smart measurement devices are typically deployed, and the data collected by each sensor in real time are transferred to a database via a concentrator (Zeng, Citation2021). The database has two primary sections: the historical behaviour database and the cyber threat database. The historical behaviour database keeps raw sample data and network data acquired in real time, logging everyday, innocuous network or system activity such as user actions, regular tasks, and communication patterns. It helps establish baselines of typical behaviour and identify deviations from the norm, which may point to security events or anomalies. The cyber threat database stores verified sample data on cyber threats, attacks, vulnerabilities, and criminal activity, which is used to assess the security posture of the network; data are compared against known threat profiles and attack patterns to help security systems find behaviours that correspond to known dangers. Together, the two databases improve an organisation's capacity to recognise and respond to cybersecurity problems. LDA enhances class discrimination and identification accuracy by increasing inter-class distance and decreasing intra-class variance, hence optimising data preprocessing and laying the groundwork for classification methods. Building a security posture awareness model with the LDA-RBF technique entails gathering historical network data, using LDA for feature extraction, training an RBF neural network, tuning hyperparameters, deploying the model for real-time anomaly detection, and retraining regularly. The LDA-RBF approach is trained on historical data to create the model (Ma & Zhang, Citation2021); once trained, the model can perform situational awareness of network behaviour on the real-time data to be measured. The human-machine interface module is divided into two sections, which display the network security status and the alarm alerts for the monitored region. The first part of the module generates the required rules for system requirements.

4. Analysis of experimental results

4.1. Experimental dataset

To conduct simulation tests of the proposed security situational awareness method, we use intrusion detection evaluation data from the KDD Cup99 dataset for training and testing (Zhao et al., Citation2021). The KDD Cup 1999 dataset, with about 4.9 million records and 41 features per network connection, is widely used for evaluating machine learning and data mining techniques for intrusion detection in cybersecurity research. It includes normal network data and four primary attack types: DoS (Denial of Service), Probe (probing and scanning), U2R (User-to-Root), and R2L (Remote-to-Local). These attacks respectively overwhelm systems with traffic, gather information about weaknesses, escalate a local user's privileges, and exploit authentication flaws to gain access from a remote machine. Effective defences and intrusion detection systems require a thorough understanding of these categories. Each sample in the dataset has 41 feature attributes as well as a label indicating whether the record is normal or the result of an attack.
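As a small illustration of the record structure described above, the sketch below splits one comma-separated record into its 41 features and label, then maps the label to one of the four attack categories. The helper name `parse_record` is hypothetical, the sample line is an illustrative record in the dataset's format, and the label-to-category table follows the grouping commonly used for the KDD Cup 1999 training labels; the source does not list it.

```python
# Commonly used grouping of KDD Cup 1999 training labels into categories.
ATTACK_CATEGORY = {
    "normal": "Normal",
    "back": "DoS", "land": "DoS", "neptune": "DoS",
    "pod": "DoS", "smurf": "DoS", "teardrop": "DoS",
    "ipsweep": "Probe", "nmap": "Probe", "portsweep": "Probe", "satan": "Probe",
    "buffer_overflow": "U2R", "loadmodule": "U2R", "perl": "U2R", "rootkit": "U2R",
    "ftp_write": "R2L", "guess_passwd": "R2L", "imap": "R2L", "multihop": "R2L",
    "phf": "R2L", "spy": "R2L", "warezclient": "R2L", "warezmaster": "R2L",
}

def parse_record(line):
    """Split one record into (features, label, category)."""
    fields = line.strip().split(",")
    features, label = fields[:-1], fields[-1].rstrip(".")
    if len(features) != 41:
        raise ValueError(f"expected 41 features, got {len(features)}")
    return features, label, ATTACK_CATEGORY.get(label, "Unknown")

# Illustrative record: 41 comma-separated features followed by the label.
record = (
    "0,tcp,http,SF,181,5450,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,8,8,"
    "0.00,0.00,0.00,0.00,1.00,0.00,0.00,9,9,1.00,0.00,0.11,0.00,"
    "0.00,0.00,0.00,0.00,normal."
)
features, label, category = parse_record(record)
attack_category = parse_record(record.replace("normal.", "smurf."))[2]
print(len(features), label, category, attack_category)
```

Three of the 41 features (protocol, service, and flag) are symbolic, so they would need to be encoded numerically before being passed to LDA.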

The experiments on situational awareness in cyber security have two components: detection of the type of cyberattack and quantitative evaluation of the cyber security posture. By training and testing on the KDD Cup99 dataset, we can assess the effectiveness of the proposed mechanism in recognising various network threats and obtain a precise quantitative assessment of the network security posture (Lai et al., Citation2020). These experimental findings serve as a crucial foundation for confirming the viability and efficiency of the proposed mechanism.

4.2. Network attack category detection

A portion of the data samples was randomly selected for training and testing; the types of attacks and the corresponding sample numbers are shown in Table 3.

Table 3. Types of Attacks and Corresponding Sample Numbers.


Figure 4 illustrates the recognition rates for the various attack types. It shows that the LDA-RBF approach generally achieves a recognition rate of over 90% (Shi et al., Citation2019). Because the dataset contains only a small number of “buffer overflow” samples, the accuracy rate for that type is 88%. In terms of accuracy, the LDA-based approach, which emphasises class separability, dimensionality reduction, and feature interpretability, surpasses the RBF and BP neural networks used alone, and the proposed method shows a clear advantage over both owing to its higher accuracy rate. Combining several methods can therefore produce better outcomes. LDA-BP is a hybrid technique that combines the feature extraction power of linear discriminant analysis (LDA) with the learning capabilities of backpropagation (BP) neural networks for classification.

Figure 4. Recognition rate by attack type.

Assuming that the records in the “Normal” class are positive samples and the other classes are negative samples, the False Negative Rate (FNR) and False Positive Rate (FPR) can be used as performance indicators for the algorithm. FPR reflects the false alarm rate, whereas FNR gauges missed anomalies. It is critical to balance the trade-off between them, which entails adjusting decision thresholds, taking real-world consequences into account, and updating models as the data change. The priorities, risk tolerance, and operational environment of the application determine the proper balance.

$F N R = \frac{N_{e}}{N} \times 100\%$ (8)

$F P R = \frac{P_{e}}{P} \times 100\%$ (9)

where:

$N_{e}$: number of misclassified negative samples.

$N$: total number of negative samples.

$P_{e}$: number of misclassified positive samples.

$P$: total number of positive samples.
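Equations (8) and (9) can be computed as in the sketch below, which follows the paper's convention that “Normal” is the positive class. One point is an assumption: a negative (attack) sample is counted as an error only when it is predicted as “Normal”, i.e. a missed attack, and a positive sample is counted as an error when it is flagged as any attack. The function name and toy labels are hypothetical.

```python
def fnr_fpr(y_true, y_pred, positive="Normal"):
    """Return (FNR, FPR) in percent, following Eqs. (8) and (9)."""
    pairs = list(zip(y_true, y_pred))
    P = sum(1 for t, _ in pairs if t == positive)       # positive samples
    N = len(pairs) - P                                  # negative samples
    # Assumption: a negative error is an attack predicted as Normal.
    N_e = sum(1 for t, p in pairs if t != positive and p == positive)
    # A positive error is a Normal record flagged as an attack.
    P_e = sum(1 for t, p in pairs if t == positive and p != positive)
    return 100.0 * N_e / N, 100.0 * P_e / P

# Toy labels: 4 normal records and 4 attack records.
y_true = ["Normal", "Normal", "Normal", "Normal", "DoS", "DoS", "DoS", "Probe"]
y_pred = ["Normal", "Normal", "Normal", "DoS", "DoS", "DoS", "Normal", "Probe"]

fnr, fpr = fnr_fpr(y_true, y_pred)
print(fnr, fpr)  # 25.0 25.0
```

In this toy case one of the four attacks is missed and one of the four normal records raises a false alarm, so both rates are 25%.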

To demonstrate the superiority of the proposed method, several other algorithms were selected for comparison. The evaluation results of each method are shown in Table 4.

Table 4. Evaluation results for each method.


Compared with the RBF neural network alone, the LDA-RBF approach shows some gain in recognition accuracy. LDA decreases dimensionality while maintaining class separability, and the subsequent RBF neural network captures intricate non-linear patterns. This combination performs especially well in settings with complex and non-linear data distributions. Because the sample count is high and the normal samples are all of the single “Normal” type, there is no difference between the two approaches for the positive samples (Wang et al., Citation2020). However, the false negative rate improved by 2.02% with the LDA-RBF approach. The approach developed in this paper also demonstrates a substantial advantage in network attack category detection when compared to the method in the literature (He et al., Citation2020).

4.3. Quantitative evaluation of security posture

The cyber attacks considered fall mainly into four categories, whose threats are quantified with reference to the relevant literature. The threat event quantification values are shown in Table 5.

Table 5. Quantitative values of threat events.


The network threat value is established from the type of attack the network is subjected to, the network security posture level to which that value belongs is then determined, and finally the median value of that level's interval is taken as the network security posture value. Figure 5 compares the output results. It shows that, in most cases, the predicted output of LDA-RBF matches the actual situation closely and outperforms the predicted output of RBF alone (Zhou et al., Citation2023a), (Zhou et al., Citation2023b).
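The mapping from threat value to posture value described above can be sketched as follows. The actual level boundaries belong to Table 1, which is not reproduced here, so the five equal-width intervals on [0, 1] are purely hypothetical placeholders, as is the function name.

```python
import bisect

# Hypothetical boundaries: five equal-width levels on [0, 1].
# The real intervals are defined in Table 1 of the paper.
LEVEL_BOUNDS = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]

def posture_value(threat_value):
    """Return (level, posture value) for a threat value in [0, 1].

    The posture value is the midpoint of the interval of the level
    that the threat value falls into.
    """
    if not 0.0 <= threat_value <= 1.0:
        raise ValueError("threat value must lie in [0, 1]")
    index = min(bisect.bisect_right(LEVEL_BOUNDS, threat_value) - 1, 4)
    low, high = LEVEL_BOUNDS[index], LEVEL_BOUNDS[index + 1]
    return index + 1, (low + high) / 2.0

level, value = posture_value(0.95)
print(level, value)
```

Under these placeholder intervals a threat value of 0.95 falls in level 5, whose interval midpoint becomes the posture value used as the training target.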

Figure 5. Comparison of output results.

The error values for each test sample are shown in Figure 6. Apart from three points, the output errors are small and lie within 0.2, so the network security posture values are predicted accurately for most samples.

Figure 6. Error values for each test sample.

Three error evaluation metrics commonly used in forecasting were selected to evaluate the forecasting results of the simulation experiments, namely Mean Absolute Error (MAE), Mean Square Error (MSE), and Root Mean Square Error (RMSE).

$MAE = \frac{1}{n} \sum_{i = 1}^{n} | y_{i} - {\hat{y}}_{i} |$ (10)

$MSE = \frac{1}{n} \sum_{i = 1}^{n} {(y_{i} - {\hat{y}}_{i})}^{2}$ (11)

$RMSE = \sqrt{\frac{1}{n} \sum_{i = 1}^{n} {(y_{i} - {\hat{y}}_{i})}^{2}}$ (12)

where:

$y_{i}$: actual network posture value.

${\hat{y}}_{i}$: predicted posture value.

$n$: number of predicted samples.
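The three metrics can be computed directly with NumPy, as in the short sketch below. The function name and the three sample values are made up for illustration and are not taken from the experiments.

```python
import numpy as np

def error_metrics(y_true, y_pred):
    """Return (MAE, MSE, RMSE) for actual and predicted posture values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    mae = float(np.abs(err).mean())     # Eq. (10)
    mse = float((err ** 2).mean())      # Eq. (11)
    rmse = float(np.sqrt(mse))          # Eq. (12): square root of the MSE
    return mae, mse, rmse

# Made-up values: absolute errors of 0.1, 0.0 and 0.2.
mae, mse, rmse = error_metrics([0.2, 0.5, 0.9], [0.1, 0.5, 0.7])
print(round(mae, 4), round(mse, 4), round(rmse, 4))
```

RMSE is expressed in the same units as the posture value, which makes it easier to compare against the 0.2 error band than MSE.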

The results of the error evaluation indicators are shown in Table 6.

Table 6. Results of error evaluation indicators.


As seen in Table 6, the LDA-RBF technique achieves a smaller mean absolute error, mean squared error, and root mean squared error than the RBF neural network alone. This is because LDA processes the data samples efficiently, leading to a larger improvement in accuracy. It demonstrates that, when used for situational awareness, the proposed LDA-RBF approach has greater stability and accuracy.

4.4. Power information network attack experiments

Figure 7 shows the electric power information network environment, which is based on a domestic electric power information system. The test equipment includes the vulnerability scanning tool (RSAS NX3 V6.0), the Colasoft Network Analysis System (CSNAS), the network performance tester (Spirent TestCenter C1), and the attack tester Avalanche. CSNAS is used here to capture and analyse network traffic, and RSAS is used to scan for vulnerabilities.

Figure 7. Power information network environment.

In this environment, operation is simulated both under normal conditions and under attacks injected by Avalanche. Attack information is regularly collected from the monitoring platform; for example, network traffic information is collected by CSNAS and vulnerability scanning information by RSAS.

4.4.1. Experiment 1

As shown in Figure 7, the test equipment was connected to the scheduling automation network environment, and traffic information was collected from the router at regular intervals. Under normal conditions, traffic statistics were collected every 10 s. The traffic statistics are shown in Table 7 and the packet statistics in Table 8.

Table 7. Network Traffic Statistics under Normal Conditions.


Table 8. Packet statistics under normal conditions.


The master server sends a composite message with a digital signature to the intelligent terminal (RTU). The network transmission delay is calculated as the time difference between sending the message and receiving the confirmation message returned by the terminal. The delay test results are shown in Table 9.

Table 9. Time delay test results.
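The delay measurement described above can be sketched as follows. This is a minimal illustration, not the test setup used in the experiments: HMAC-SHA256 stands in for the digital signature scheme, which the text does not specify, and `SHARED_KEY`, `sign`, `terminal_handle`, and `measure_delay` are hypothetical names. The terminal is an in-process stub rather than a real RTU, so the measured values only demonstrate the timing logic.

```python
import hashlib
import hmac
import time

# Hypothetical key; HMAC is only a stand-in for the unspecified signature scheme.
SHARED_KEY = b"demo-key"


def sign(payload: bytes) -> bytes:
    """Produce an authentication tag for the message."""
    return hmac.new(SHARED_KEY, payload, hashlib.sha256).digest()


def terminal_handle(payload: bytes, signature: bytes) -> bool:
    """Hypothetical RTU stub: verify the tag and return an acknowledgement."""
    return hmac.compare_digest(sign(payload), signature)


def measure_delay(payload: bytes, send=terminal_handle) -> float:
    """Seconds between sending the signed message and receiving the acknowledgement."""
    signature = sign(payload)          # signing happens before the clock starts
    start = time.perf_counter()
    acknowledged = send(payload, signature)
    elapsed = time.perf_counter() - start
    if not acknowledged:
        raise RuntimeError("terminal rejected the message")
    return elapsed


delays = [measure_delay(b"composite-message") for _ in range(5)]
print(f"mean delay over {len(delays)} runs: {sum(delays) / len(delays):.6f} s")
```

Running the same measurement under normal conditions and under attack, then comparing the mean delays, gives the kind of before-and-after comparison reported in the delay tables.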


4.4.2. Experiment 2

The attack tester Avalanche was connected to the test network through a switch, and penetration attacks were carried out against the test system. The attack types included DDoS attacks, SQL injection attacks, UDP flooding attacks, replay attacks launched on end devices, and network storms.

After the storm traffic was added to the network, the network traffic was captured again (every 10 s). The network traffic statistics under attack are shown in Table 10 and the packet statistics under attack in Table 11.

Table 10. Network traffic statistics under attack.


Table 11. Packet statistics under attack.


After the storm traffic is injected, the traffic on the network increases significantly and varies irregularly.
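One simple way to quantify both effects, the increase in volume and the irregular variation, is to compare the mean and the coefficient of variation of the 10 s traffic samples. The sketch below assumes this choice of statistics, which the text does not prescribe; `traffic_summary` is a hypothetical helper and the sample values are illustrative only, not taken from the traffic tables.

```python
import statistics


def traffic_summary(samples):
    """Mean, population standard deviation, and coefficient of variation."""
    mean = statistics.fmean(samples)
    stdev = statistics.pstdev(samples)
    cv = stdev / mean if mean else 0.0
    return {"mean": mean, "stdev": stdev, "cv": cv}


# Hypothetical traffic volumes per 10 s window (arbitrary units).
normal_traffic = [120, 118, 125, 122, 119, 121]
attack_traffic = [410, 980, 260, 1500, 730, 1210]

normal = traffic_summary(normal_traffic)
attack = traffic_summary(attack_traffic)

print(f"normal: mean={normal['mean']:.1f}, cv={normal['cv']:.3f}")
print(f"attack: mean={attack['mean']:.1f}, cv={attack['cv']:.3f}")
print(f"mean traffic ratio (attack / normal): {attack['mean'] / normal['mean']:.2f}")
```

A higher mean reflects the added volume, while a higher coefficient of variation reflects how unevenly that volume arrives from one window to the next.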

During the attack, the master emulator sends digitally signed load messages to the terminal to verify that the device still responds correctly, and the time difference between sending each message and receiving the returned acknowledgement is calculated. The results of the delay test under attack are shown in Table 12. Compared with Experiment 1, the latency has nearly doubled.

Table 12. Delay test results under attack.


Experimental data from an information network environment provided by a power company were collected for validation. Seven categories of network behaviour were collected, and randomly selected portions of the data were used as training and test data. The sample set is shown in Table 13.

Table 13. Sample Set.
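A random division into training and test data can be sketched as follows. The text only says that a random portion was selected; splitting each category separately (a stratified split), the 30% test fraction, the fixed seed, and the function name `stratified_split` are all assumptions made for this illustration.

```python
import random
from collections import defaultdict


def stratified_split(samples, labels, test_fraction=0.3, seed=0):
    """Randomly split samples into (train, test), category by category."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_label[label].append(sample)

    train, test = [], []
    for label, group in by_label.items():
        rng.shuffle(group)
        # Keep at least one test sample per category.
        n_test = max(1, round(len(group) * test_fraction))
        test.extend((s, label) for s in group[:n_test])
        train.extend((s, label) for s in group[n_test:])
    return train, test


# Hypothetical data: 70 samples spread evenly over seven behaviour categories.
samples = list(range(70))
labels = [i % 7 for i in range(70)]

train, test = stratified_split(samples, labels)
print(len(train), "training samples,", len(test), "test samples")
```

Splitting per category keeps every behaviour type represented in both sets, which matters when some categories have far fewer samples than others.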


Figure 8 displays the identification results for each attack on the information network. As Figure 8 shows, the suggested method performs well, particularly for the first three categories of samples. These are typically challenging to identify because activities such as port scanning have little effect on network traffic until the subsequent intrusion is carried out. Such samples resemble acceptable user behaviour and normal system operations, show no obvious anomalies, and fall within the wide variation of typical behaviour, so they are hard to tell apart from normal or benign samples. Effective identification therefore requires contextual knowledge and feature engineering. Separating these samples reliably lets security systems concentrate on locating potentially harmful or out-of-the-ordinary activity. The suggested approach uses LDA to pre-process the sample data so that the samples have the best possible separability.

Figure 8. Identification results for each attack in the information network.
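To make the idea of LDA improving separability concrete, the sketch below computes the Fisher discriminant direction for two classes with two features each and projects the samples onto it. It is a deliberately reduced illustration: the experiments involve more categories and features, and the helper names and sample points here are hypothetical.

```python
def mean_vector(rows):
    """Per-feature mean of a list of 2-D points."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]


def scatter(rows, mu):
    """2x2 within-class scatter matrix of one class."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - mu[0], r[1] - mu[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def fisher_direction(class_a, class_b):
    """Projection vector w = Sw^-1 (mu_a - mu_b) for two classes."""
    mu_a, mu_b = mean_vector(class_a), mean_vector(class_b)
    sa, sb = scatter(class_a, mu_a), scatter(class_b, mu_b)
    sw = [[sa[i][j] + sb[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    if abs(det) < 1e-12:
        raise ValueError("within-class scatter matrix is singular")
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    diff = [mu_a[0] - mu_b[0], mu_a[1] - mu_b[1]]
    return [inv[0][0] * diff[0] + inv[0][1] * diff[1],
            inv[1][0] * diff[0] + inv[1][1] * diff[1]]


def project(rows, w):
    """Reduce each 2-D point to a single value along w."""
    return [r[0] * w[0] + r[1] * w[1] for r in rows]


# Hypothetical feature vectors for two behaviour categories.
class_a = [(1.0, 2.0), (2.0, 3.0), (3.0, 3.0), (2.0, 1.5)]
class_b = [(6.0, 5.0), (7.0, 8.0), (8.0, 7.0), (7.5, 6.0)]

w = fisher_direction(class_a, class_b)
proj_a, proj_b = project(class_a, w), project(class_b, w)
print("direction:", w)
print("class A projections:", proj_a)
print("class B projections:", proj_b)
```

The direction is chosen to push the class means apart relative to the spread inside each class, so the one-dimensional projections of the two groups do not overlap for this data. The projected values are what a downstream classifier such as an RBF network would receive as input.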

The evaluation results of the algorithms were compared; the comparison is shown in Table 14. The proposed method improves the overall recognition rate by nearly 10%, a substantial gain in recognition accuracy.

Table 14. Comparison of methods.
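Per-category and overall recognition rates of the kind compared here can be derived from the true and predicted labels of the test samples. The following is a minimal sketch assuming the recognition rate is the fraction of samples whose predicted category matches the true one; `recognition_rates` and the example labels are hypothetical, not the experimental data.

```python
from collections import Counter


def recognition_rates(true_labels, predicted_labels):
    """Return (per-category rate, overall rate) of correctly identified samples."""
    totals = Counter(true_labels)
    hits = Counter(t for t, p in zip(true_labels, predicted_labels) if t == p)
    per_class = {label: hits[label] / totals[label] for label in totals}
    overall = sum(hits.values()) / len(true_labels)
    return per_class, overall


# Hypothetical labels, for illustration only.
true_labels = ["normal", "normal", "ddos", "ddos", "scan"]
predicted_labels = ["normal", "ddos", "ddos", "ddos", "scan"]

per_class, overall = recognition_rates(true_labels, predicted_labels)
print("per category:", per_class)
print("overall:", overall)
```

Reporting the per-category rates alongside the overall figure shows whether a gain comes from the hard-to-detect categories or only from the frequent ones.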


5. Conclusion

This paper proposes an LDA-RBF-based approach to security situational awareness for power information networks. The approach first applies LDA to reduce the dimensionality of the sample data while optimising the between-class and within-class relationships of the samples. An RBF neural network then quantifies the network security posture for situational awareness. The proposed method is compared with alternative methods in simulation experiments using the KDD Cup99 dataset and experimental data from power information networks. The results demonstrate that the approach detects network threats with high accuracy.

Author contributions

The authors confirm contribution to the paper as follows: study conception and design: Chao Wang; data collection: Jia-han Dong, Guang-xin GUO; analysis and interpretation of results: Tian-yu REN, Xiao-hu WANG; draft manuscript preparation: Ming-yu Pan. All authors reviewed the results and approved the final version of the manuscript.

Consent for publication

All authors reviewed the results, approved the final version of the manuscript, and agreed to publish it.

Acknowledgments

The authors would like to express sincere thanks to the technical staff who have contributed to this research.

Data availability

The experimental data used to support the findings of this study are available from the corresponding author upon request.

Disclosure statement

No potential conflict of interest was reported by the author(s).

References

Chen, Z., Guo, Y., Bai, D., Wang, J., Dong, Y., Qian, S., Lu, T., & Xing, H. (2021a). Research on cyber security defense and protection in power industry. Journal of Physics: Conference Series, 1769(1), 012040 (7pp). https://doi.org/10.1088/1742-6596/1769/1/012040
Chen, Z., Yan, L., Lü, Z., Zhang, Y., Guo, Y., Liu, W., & Xuan, J. (2021b). Research on zero-trust security protection technology of power iot based on blockchain. Journal of Physics: Conference Series, 1769(1), 012039 (8pp). https://doi.org/10.1088/1742-6596/1769/1/012039
Envelope, L. L. (2022). Research on status information monitoring of power equipment based on internet of things. Energy Reports, 8, 281–286.
He, J., Shang, B., Li, Y., & Yin, B. (2020). Research on automatic mining and utilization of vulnerability in power web system. Journal of Physics: Conference Series, 1550(3), 032057 (7pp). https://doi.org/10.1088/1742-6596/1550/3/032057
Huang, F., Yong, M., & Jian, Z. (2021). Genetic algorithm-based power system information security risk assessment method. Journal of Physics: Conference Series, 1852(2), 022063 (7pp). https://doi.org/10.1088/1742-6596/1852/2/022063
Lai, J., Wu, J., & Jiang, Y. (2020). Research on risk prevention measures of network information security in power systems. IOP Conference Series: Materials Science and Engineering, 768(6), 062018 (4pp). https://doi.org/10.1088/1757-899X/768/6/062018
Li, G., Peng, Z., & Yu, B. (2021). Research on self-healing mode of communication channel of regional power grid stability control system. Journal of Physics: Conference Series, 1982(1), 012157 (6pp). https://doi.org/10.1088/1742-6596/1982/1/012157
Liang, Y., Zhang, L., Wang, S., Zhang, J., Zhang, J., & Wu, M. (2021). Research on network threat and situation assessment method of electric power information system. Journal of Physics: Conference Series, 1883(1), 012105 (7pp). https://doi.org/10.1088/1742-6596/1883/1/012105
Lin, T., Zhao, Y., Zhang, H., Li, G., & Zhang, J. (2021). Research on information security system of ship platform based on cloud computing. Journal of Physics: Conference Series, 1802(4), 042032 (7pp). https://doi.org/10.1088/1742-6596/1802/4/042032
Liu, S., Yu, Y., Guo, A., Zhao, Z. Y., Wu, Z. H., & Zhou, Y. (2021). Research on security verification mechanism of perception layer terminal of power internet of things based on device operation fingerprint. IOP Conference Series: Earth and Environmental Science, 692(2), 022024 (7pp). https://doi.org/10.1088/1755-1315/692/2/022024
Liu, X., Wang, H., Sun, Q., & Guo, T. (2022). Research on fault scenario prediction and resilience enhancement strategy of active distribution network under ice disaster. International Journal of Electrical Power & Energy Systems, 135(3), 107478. https://doi.org/10.1016/j.ijepes.2021.107478
Liu, Z., & Yang, W. (2023). Research on the value positioning of university library based on customer delivered value. Asian Agricultural Research, 15(2), 4.
Ma, T., & Zhang, G. (2021). Research on self-adaptive clustering algorithms for large data sparse networks based on information entropy. Journal of Physics: Conference Series, 1941(1), 012041 (13pp). https://doi.org/10.1088/1742-6596/1941/1/012041
Meng, L. (2022). Internet of things information network security situational awareness based on machine learning algorithms. Mobile Information Systems, 2022.
Shi, S., Wang, L., Zheng, S., Lv, J., & Zhang, Q. (2019). Research on multiple security authentication schemes for mobile applications of power trading platforms. IOP Conference Series: Materials Science and Engineering, 486(1), 012108. https://doi.org/10.1088/1757-899X/486/1/012108
Shu, Z. M., Liu, Y. G., Wang, H. N., Sun, C. L., & He, S. S. (2021). Research on the technology of preventing illegal terminal access. Journal of Physics: Conference Series, 1871(1), 012144. https://doi.org/10.1088/1742-6596/1871/1/012144
Song, X., Hao, C., Zhang, X., & Wang, Y. (2021). Research on power environment monitoring system of information room. IOP Conference Series: Earth and Environmental Science, 680(1), 012024 (5pp). https://doi.org/10.1088/1755-1315/680/1/012024
Wang, L. (2021). Retracted: research on network security maintenance based on computer technology. Journal of Physics: Conference Series, 1915(2), 022001 (5pp). https://doi.org/10.1088/1742-6596/1915/2/022001
Wang, S., Zhang, L. H., Zhang, J., Tang, C., & Wang, H. (2021). Research on intelligent identification method for access equipment of grid information system. Journal of Physics: Conference Series, 1792(1), 012015 (6pp). https://doi.org/10.1088/1742-6596/1792/1/012015
Wang, Y., Gao, Y., & Luo, B. B. (2020). Research on security protection technology based on terminal information jump. Journal of Physics: Conference Series, 1651(1), 012044 (4pp). https://doi.org/10.1088/1742-6596/1651/1/012044
Wu, X., Wei, D., Vasgi, B. P., Oleiwi, A. K., Bangare, S. L., & Asenso, E. (2022). Research on Network Security Situational Awareness Based on Crawler Algorithm. Security and Communication Networks, 2022.
Xu, H., Li, H., Chen, S., & Wu, X. (2021). Research on health assessment of electric power information system based on deep belief networks and cluster analysis. IOP Conference Series: Earth and Environmental Science, 675(1), 012123 (9pp). https://doi.org/10.1088/1755-1315/675/1/012123
Yang, C. Y., Ling, Y., & Li, X. (2021). Research on information encryption algorithm under the power network communication security model. Journal of Physics: Conference Series, 1852(3), 032007 (7pp). https://doi.org/10.1088/1742-6596/1852/3/032007
Yu, T., Yin, X., Yao, M., & Liu, T. (2021). Network security monitoring method based on deep learning. Journal of Physics: Conference Series, 1955(1), 012040 (6pp). https://doi.org/10.1088/1742-6596/1955/1/012040
Zeng, H. (2021). SolarWinds supply chain breach threatens government agencies and enterprises worldwide. Network Security, 2021(1), 1–3. https://doi.org/10.1016/S1353-4858(21)00001-5
Zhang, L. H., Liang, Y., Tang, Y., Wang, S., Tang, C., & Liu, C. (2021). Research on unknown threat detection method of information system based on deep learning. Journal of Physics: Conference Series, 1883(1), 012107 (6pp). https://doi.org/10.1088/1742-6596/1883/1/012107
Zhao, Q., Cheng, Y., & Zhou, T. (2021). Research on Node localization Algorithm in WSN Based on TDOA. Journal of Physics: Conference Series, 1757(1), 012142 (7pp). https://doi.org/10.1088/1742-6596/1757/1/012142
Zhou, J., Pang, L., & Zhang, W. (2023a). Underwater image enhancement method via multi-interval subhistogram perspective equalization. IEEE Journal of Oceanic Engineering, 48(2), 474–488. https://doi.org/10.1109/JOE.2022.3223733
Zhou, J., Sun, J., Zhang, W., & Lin, Z. (2023b). Multi-view underwater image enhancement method via embedded fusion mechanism. Engineering Applications of Artificial Intelligence, 121(1), 105946.
