Enhancing Network Intrusion Recovery in SDN with machine learning: an innovative approach


Abstract

In modern network environments, swift recovery from network flow intrusions poses a substantial challenge. Particularly in the context of Software-Defined Networks (SDN), addressing this challenge requires the strategic selection of backup paths based on traffic patterns. In response to this critical issue, our paper introduces an approach known as Machine Learning-based Network Intrusion Recovery (MLBNIR) for enhancing intrusion recovery in SDN. We leverage a dedicated SDN dataset to train a flow-based Machine Learning (ML) model, enabling a deeper understanding of traffic dynamics within the SDN framework. Our study reveals that the MLBNIR approach reduces intrusion recovery time by up to 90% and concurrently increases network bandwidth consumption by up to 57% when compared to existing methods reviewed in the literature.


1. Introduction

The world is currently heading toward digital transformation. Almost all organizations are adopting digital systems that rely on the Internet to provide digital experiences to their customers. The digitization of all business processes should always be secured using state-of-the-art security technologies.

The extensive popularity and use of computer networks have transformed the prospects of network security. In view of the foregoing, there is a need for a system that can identify intrusive threats and risks on its own, in addition to benefiting from an intrusion prevention system. Identifying these vulnerabilities helps in damage evaluation and in avoiding future attacks.

In contemporary network environments, the swift recovery of network flow intrusions poses a substantial and pressing challenge. This challenge is especially pronounced in the context of Software-Defined Networks (SDN), where mitigating the impact of intrusions requires strategic selection of backup paths based on evolving traffic patterns. Our paper is motivated by the critical need to address this challenge effectively.

In modern network environments, the management of network traffic presents a fundamental challenge. Traditional distributed networks, characterized by devices like routers and switches, involve configuring traffic policies individually for each network device through network controllers. However, a transformation in network architecture, driven by Software-Defined Networks (SDN), has brought about a paradigm shift.

In SDN, the control plane, responsible for decision-making, is distinctly separated from the data plane, which is responsible for forwarding network traffic. This architectural change centralizes control logic in a remote device known as the controller, and data flow is orchestrated through a Southbound Application Programming Interface (SB-API). This shift towards centralized management has paved the way for rapid deployment of applications, services, and network operations, offering organizations greater adaptability and operational efficiency (Hu, Hao, & Bao, Citation2014; Khan, Laouini, Alshammari, Khalid, & Aamir, Citation2023; Lara, Kolasani, & Ramamurthy, Citation2014; Ndiaye, Hancke, & Abu-Mahfouz, Citation2017; Singh & Jha, Citation2017).

SDN has gained significant traction as a solution to the inherent challenges of traditional distributed networks. Its primary advantage lies in the decoupling of the control and data planes, making networks more agile and manageable. This approach enables centralized control over the entire network, and many commercial and industrial entities have embraced SDN systems for various purposes, including:

Streamlining network management by separating the control and data planes, reducing the complexity of network modifications, and minimizing human errors.
Facilitating the deployment and upgrades of network devices without vendor lock-in, providing greater flexibility to IT administrators.
Providing a holistic view of the network to SDN controllers, enhancing overall network management.
Enabling developers to deploy a multitude of applications in a virtual environment through the SDN system’s upper layer (Jahromi & Delaney, Citation2018).
Reducing operating costs significantly by eliminating the need for programming languages in network device configuration, in contrast to traditional networks.

However, the adoption of SDN technology also introduces new security challenges. SDN networks are vulnerable to evolving security risks that attackers may exploit for malicious purposes (Inayat, Zia, Mahmood, Khalid, & Benbouzid, Citation2022). If an attacker gains access to the SDN controller, the entire network becomes susceptible to severe threats. Therefore, the incorporation of an Intrusion Detection System (IDS) to detect anomalies in SDN network traffic is imperative (Ashraf et al., Citation2021).

Numerous security strategies are employed in SDN networks to thwart and detect Denial of Service (DoS) attacks, as discussed in (Ashraf et al., Citation2021; McKeown et al., Citation2008). These strategies rely on acquiring flow information from incoming packets, including metrics such as packet count, IP address frequency, and byte count, to identify DoS attacks within the SDN network. However, as the volume of monitored packets increases, the existing solutions face limitations. Processing and classifying a vast number of packets in real-time, exacerbated by the surge in generated data, pose significant challenges. Additionally, the similarity between benign communication and malicious packets makes detecting malicious traffic a daunting task (Rafique, Khalid, & Muyeen, Citation2020). Attackers can manipulate packet headers to resemble normal traffic, evading detection systems. As anomalies constantly evolve, small adjustments by attackers can introduce new attack vectors undetected by existing security measures.

In response to these challenges, the application of Machine Learning (ML) and data mining methods has gained prominence in identifying and classifying intrusion threats. While ML has been extensively explored in various contexts (Chen, Wawrzynski, & Lv, Citation2021; Laprie, Citation1992; Shafiq, Tian, Bashir, Jolfaei, & Yu, Citation2020; Singh, Jeong, & Park, Citation2020; Wickboldt et al., Citation2015), its application in the realm of SDN remains limited. Effectiveness in ML-driven intrusion detection can be influenced by factors such as feature selection methods and the choice of datasets.

Traditional network failures, including device or link issues, demand specialized strategies for swift recovery to prevent packet forwarding delays and loss. Two recovery approaches, namely reactionary and proactive, are available. Reactionary recovery involves a switch requesting an alternate route for failed traffic, which entails additional signaling to the controller, known as restoration. This process may introduce delays, depending on path computation complexity and controller computational capabilities, leading to packet loss on the disrupted connection.

Proactive recovery, also referred to as protection, minimizes such delays and potential packet loss by precomputing backup paths and installing forwarding rules in switches on these paths prior to any failures (Mohan, Truong-Huu, & Gurusamy, Citation2014). In the event of a switch failure, switches automatically reroute traffic to the backup path using pre-installed flow table rules, obviating the need for controller intervention.
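The difference between restoration and protection can be illustrated with a minimal sketch. The `Switch` class, its fields, and the flow and switch names below are hypothetical and are not taken from the paper or from any SDN controller API; the sketch only shows that, with a backup next hop installed ahead of time, the switch reroutes locally and no request to the controller is needed.

```python
from dataclasses import dataclass, field


@dataclass
class Switch:
    """Toy switch holding a pre-installed primary and backup next hop per flow."""
    name: str
    # flow id -> (primary next hop, backup next hop), installed before any failure
    rules: dict = field(default_factory=dict)
    failed_next_hops: set = field(default_factory=set)
    controller_requests: int = 0  # would grow under reactive restoration

    def install(self, flow_id, primary, backup):
        """Proactively install both forwarding choices for a flow."""
        self.rules[flow_id] = (primary, backup)

    def next_hop(self, flow_id):
        """Pick the primary next hop, or the backup if the primary has failed."""
        primary, backup = self.rules[flow_id]
        if primary not in self.failed_next_hops:
            return primary
        # Protection: fall back locally without contacting the controller.
        return backup


s1 = Switch("s1")
s1.install("h1->h2", primary="s2", backup="s3")
before = s1.next_hop("h1->h2")   # primary is healthy
s1.failed_next_hops.add("s2")    # simulate a failure of the primary next hop
after = s1.next_hop("h1->h2")    # traffic moves to the pre-installed backup
```

In a reactive design, the failure branch would instead send a request to the controller and wait for a new rule, which is where the additional signaling delay described above comes from.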

Despite the allocation of network bandwidth to back up the primary flow, the computation of backup paths typically follows a static network state, failing to account for evolving traffic patterns. In certain cases, backup paths may experience traffic bottlenecks, particularly when multiple primary paths share common backup paths or networks (Mohan, Truong-Huu, & Gurusamy, Citation2017; Mohan et al., Citation2014). The main motivation of this paper is to ensure the uninterrupted operation of critical applications, since an interruption could lead to a breach of the service-level agreement (SLA). Bandwidth allocation on backup networks is crucial for delay-sensitive traffic flows. Therefore, a novel solution is proposed to predict the suitability of a path in future scenarios based on network traffic characteristics and to dynamically select backup paths.

Several research and development initiatives are paying close attention to the introduction of machine learning methods, including deep learning. Among the greatest benefits of machine learning is its ability to deal with difficult situations (Wang, Cui, Wang, Xiao, & Jiang, Citation2018). With so much traffic moving through the networks, using a machine learning approach to assess its properties could provide meaningful information for traffic engineering and fault management. In this paper, we present a Machine Learning-based Network Intrusion Recovery (MLBNIR) approach to evaluate the goodness of the backup paths for a primary path between a source node and a destination node. The capabilities of SDN can be used in network surveillance to continually record traffic characteristics in both normal and intrusion conditions. The InSDN dataset (Elsayed, Le-Khac, & Jurcut, Citation2020) has been used to build and test our machine learning model for backup path computation.

Various traffic parameters, like flow interarrival times, size of flows measured in the number of packets, and size of flows measured in the number of bytes, are taken into account by the SDN controller to determine the goodness of a backup path in various failure and intrusion circumstances. The controller may compute and update the backup path adaptively when there is a variation in traffic patterns by reviewing and understanding traffic characteristics collected from the network. The switches are updated ahead of time with the computed backup paths, allowing for quick intrusion recovery. A breakdown in one section of the network, for example, could have an impact on traffic patterns in another area of the network. We use a variety of machine learning approaches to train our learning model, including Linear Regression (LR), Decision Trees (DT), Support Vector Machines (SVM), and Random Forest (RF). The test indicates the efficacy of the proposed MLBNIR approach compared to the approaches discussed in the literature review.

The main contributions of this paper are as follows:

To survive continuing attacks and preserve the availability of the SDN’s core network, we propose an innovative ML-based intrusion recovery technique.
We introduce a congestion-sensitive recovery process to help select the optimal backup path.
The MLBNIR approach prevents the attacker from reinfecting the system and allows the system to withstand reinfection, since the source of infection is blocked.
MLBNIR approach has been evaluated by measuring the effectiveness of the backup path applied and the ability to select an appropriate backup path.

In SDN, we tackled the issue of proactive intrusion recovery. We created a machine learning-based method that learns the traffic features gathered during a flow intrusion situation to assist us in analyzing the effect of the intrusion on the goodness of backup paths for each primary path.

The scope of this paper covers the InSDN dataset, which contains seven types of commonly known attacks, namely DoS, DDoS, probe, brute-force, exploitation, web attack, and botnet. All of these have been executed in an SDN network comprising all the SDN planes, i.e. application, control, and data. The MLBNIR approach has been quantitatively measured and validated using the InSDN dataset, first by computing the shortest path routing approach, which is called the “baseline”, and then by validating the measured data with regard to intrusion recovery time, bandwidth, and number of modifications in backup paths.

The remainder of the paper is laid out as follows. The related work is presented in Section 2. The SDN architecture is explained in Section 3. Section 4 describes the InSDN dataset. Section 5 illustrates the steps for data pre-processing. We outline our novel MLBNIR approach and the machine learning algorithms we applied in Section 6. The tests to illustrate the efficacy of the innovative MLBNIR approach are presented in Section 7. The final section concludes the paper.

2. Related work

The different techniques for network failure recovery in SDN can be separated into two classes: proactive and reactive (Akyildiz, Lee, Wang, Luo, & Chou, Citation2014). A basic understanding of the data and control planes, and how these planes communicate with one another, is required to comprehend how these two strategies function. Traditional recovery strategies in the event of network failures have been restricted by the extraordinary expansion and complexity of network traffic. According to Xu et al. (Citation2018), the number of interlinked network devices and the data from the Internet of cars are expected to reach 50 billion (Evans, Citation2011) and 300,000 exabytes, respectively. The likelihood of link failures will therefore rise in tandem with the growth in internet devices. One of the most inherent benefits of machine learning is its capability to solve complicated problems (Wang et al., Citation2018). Likewise, when dealing with vast quantities of data, typical recovery procedures may collapse. Applying ML algorithms to crucial traffic properties can thus provide valuable information for failure detection and recovery, and machine learning algorithms are being probed for big data management.

The authors of Klaine, Imran, Onireti, and Souza (Citation2017) discussed the use of machine learning algorithms for self- or autonomous configuration, healing, and optimization. The purpose of self-healing is to identify, recuperate from, and analyze network failures. It is vital to note that ML techniques are employed to discover and classify problems in this context. The data from previous transfers are used to train the machine learning model and anticipate the upcoming handover. The fast re-routing technique (FRT) using shortest path (Muthumanikandan & Valliyammai, Citation2017) takes up to 30 ms as recovery time and reaches up to 70% bandwidth utilization upon flow recovery. Another proactive recovery approach (Ali, min Lee, hee Roh, Ryu, & Park, Citation2020) takes up to 22 ms to recover an SDN flow; however, it suffers from low backup path bandwidth utilization, i.e. 17%. Srinivasan, Truong-Huu, and Gurusamy (Citation2019) explain how to use machine learning to perform link failure identification in complicated networks depending on the traffic characteristics. Nevertheless, these systems do not allow dynamic recovery from link failures, i.e. they do not reroute traffic from a failed path to a substitute path toward the endpoint while taking traffic circumstances into account.

The accessibility of data is crucial for ML algorithms. ML algorithms have an edge in this situation because they acquire network traffic data. A method for using the SDN capabilities of centralized traffic engineering and network surveillance was provided in (Truong-Huu, Prathap, Mohan, & Gurusamy, Citation2019). The SDN controller keeps track of any alterations in traffic behavior. When the controller identifies a shift in network traffic, it updates the backup path. SVM, RF, Neural Networks (NN), LR, and DT are among the algorithms used to train the ML models. Yet, as the number of nodes grows, so does the number of backup paths, resulting in a significant rise in flow-matching overhead. In (Khunteta & Chavva, Citation2017), the deep learning-based techniques adjust themselves to the network’s adaptive alterations. As a result, they are well suited to self-organizing networks that make use of SDN controllers’ programmability.

Many subsequent papers have proposed using machine learning approaches to manage network failures. The authors of (Nguyen, Ge, der Merwe, Yan, & Yates, Citation2015) described a technique for detecting failures in mobile networks depending on consumption. The researchers suggested that for a specific geographic location, device type, and service, aggregated customer usage data be monitored and a consumption profile be derived. A decrease in aggregated consumption (below expectations) is taken as an indicator of a possible service outage in that area. Nevertheless, this approach necessitates the implementation of service monitoring in parallel to network surveillance. It also necessitates proper user classification, with users in the same class having identical usage behavior. Noshad et al. (Citation2019) employ SVM to classify acquired sensor data to identify faults through anomalous data patterns. This method necessitates rerouting traffic to the server where the classifier is installed, resulting in a significant latency in data processing and added communication costs. Our method only needs traffic attributes that are significantly smaller in size (typically only a few KB) than the data traffic. Furthermore, the two works mentioned above concentrate on wireless/mobile networks, which can utilize our system to manage network recovery in wireless networks (Abujubbeh, Al-Turjman, & Fahrioglu, Citation2019).

The authors of (Duenas, Navarro, Andion, & Cuadrado, Citation2018) described an online failure prediction system developed on Apache Spark that accepts a database of network management events, trains an RF model, and utilizes that model to predict the arrival of upcoming events in near real time. Nevertheless, for some failures (silent failures) no event occurs in the network, causing the system to fail to identify them. In contrast, we suggest examining the properties of regular traffic produced by real users or applications using a machine learning approach. The authors of Truong-Huu et al. (Citation2019) proposed using a machine learning approach to analyze traffic characteristics to locate link faults in complicated networks. Nevertheless, none of these studies take into account quick and dynamic failure recovery, which entails rerouting failed traffic to an alternative path depending on the traffic patterns. In several studies, machine learning has also been employed for path allocation. The authors of Eswaradass, Sun, and Wu (Citation2006) suggested that Artificial Neural Networks (ANN) and reinforcement learning be used to forecast link bandwidth. In (Brun, Wang, & Gelenbe, Citation2016), the researchers suggested using machine learning to determine the optimal intercontinental overlay routing path. We use machine learning for proactive intrusion recovery, which follows the same patterns. The suggested method intends to understand traffic behavior not only under regular operating conditions but also in flow intrusion situations. The network can then develop effective intrusion recovery strategies with respect to bandwidth consumption, load balancing, and recovery period.

3. SDN architecture

SDN encourages innovation by introducing a centralized, programmable approach to data plane control that makes it easier to develop new routines and network services. The SDN layout is based on the concept of separating the data and control planes (see Figure 1).

Figure 1. SDN architecture.

The SDN controller issues flow-based decision logic to all OpenFlow (OF)-enabled switches. This logic is in charge of preparing the forwarding tables of every OF switch. This process is detailed in Figure 1. Moreover, a standard OF-enabled switch contains a pipeline of flow tables. Each flow entry in these tables involves three components: (a) counters responsible for maintaining matched flow data, (b) rules to match incoming packets, and (c) instructions, comprising both proactive and reactive configurations, that are put into action upon a match. Software or hardware OF-enabled switches can be used in SDN. Particular APIs are created for specific purposes, for instance interdomain routing and Voice over IP (VoIP) applications. In addition, there are several SDN programming languages, such as Procera and NetCore, which provide high-level APIs for building SDN applications with high flexibility and efficiency (Yamanaka, Kawai, & Shimojo, Citation2017).

Each packet traveling through an OF switch has a header that is compared against the flow entries in the switch’s flow table. If a flow entry matches the packet header, the statistics of that entry are updated (for example, the packet and byte counters are incremented). However, it is possible that no flow entry stored in the switch’s flow table matches the header of the packet. In that case, the switch sends a message to the controller asking it to decide how the unmatched packet should be handled. It must be noted that only the hosts and switches listed on the controller are authorized to exchange packets in SDN networks. The controller, as per policy, has the authority to add new flow entries to the flow table of every switch.
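The lookup behavior described above can be sketched as follows. This is a simplified illustration, not the OpenFlow wire protocol: the `FlowEntry` and `FlowTable` classes, the header field names, and the `"packet_in"` return value are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class FlowEntry:
    match: dict        # header fields that must match, e.g. {"dst_ip": "10.0.0.2"}
    action: str        # what to do on a match, e.g. "output:2"
    packets: int = 0   # counter: number of matched packets
    bytes: int = 0     # counter: number of matched bytes


class FlowTable:
    """Toy flow table: match a header, update counters, or report a table miss."""

    def __init__(self):
        self.entries = []
        self.packet_in_log = []  # stands in for messages sent to the controller

    def handle(self, header, size):
        for entry in self.entries:
            if all(header.get(k) == v for k, v in entry.match.items()):
                entry.packets += 1
                entry.bytes += size
                return entry.action
        # No entry matched: hand the packet header to the controller.
        self.packet_in_log.append(header)
        return "packet_in"


table = FlowTable()
table.entries.append(FlowEntry({"dst_ip": "10.0.0.2"}, "output:2"))
hit = table.handle({"src_ip": "10.0.0.1", "dst_ip": "10.0.0.2"}, 1500)
miss = table.handle({"src_ip": "10.0.0.1", "dst_ip": "10.0.0.9"}, 60)
```

The per-entry counters in this sketch correspond to the flow statistics that a controller can later read back from the switches.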

4. Description of InSDN dataset

The authors of Elsayed et al. (Citation2020) have provided a way for creating an attack-specific SDN dataset that is accessible to the public and also showed the way to operate an InSDN dataset. This allows the testing of some ML algorithms in the detection of network attack behavior. Attack types such as DoS, DDoS, Botnet, web attacks, brute force attacks, malware, probes, and exploitation which can take place in many aspects of the SDN were included in the new InSDN. Furthermore, several of the previously mentioned attacks can also be used against the SDN control plane.

The experiment was done using the InSDN pre-collected dataset. The simulated network used to collect the InSDN dataset contains four different network subnets (Elsayed et al., Citation2020), namely controller network, Metasploitable2 Server network, mininet SDN Data network, and Kali Linux server network.

The resulting flows were assessed bidirectionally. As such, the initial packet served as the deciding factor in determining whether the direction of the flow was forward or backward. The InSDN dataset has been constructed using the CICFlowMeter tool (Draper-Gil, Lashkari, Mamun, & Ghorbani, Citation2016) to generate a CSV file with over 80 statistical features. The features include Protocol, Duration, Number of Bytes, Number of Packets, etc. There are a total of 80 features in the InSDN dataset. We classified the complete set of features into eight groups.

Network identifiers attributes: This group contains features such as IP address, Port number, and protocol types. Equipped with general information, the purpose of these features is to determine the flow’s source and destination.
Packet-based attributes: All information regarding packets is included in these features. For instance, the total quantity of packets in the forward and backward flow.
Bytes-based attributes: Data about bytes can be found in these features, such as the number of bytes found in the forward and backward direction of flow.
Interarrival time attributes: Information regarding the interarrival time in both directions of flow is contained in these features.
Flow timers attributes: These features describe the time duration of each flow, for example, whether the flow is active or inactive.
Flag attributes: Features in this group contain information about flags, for example, the SYN flag, RST flag, PSH (push) flag, and more.
Flow descriptors attributes: Features containing details of traffic are grouped here. Information such as the number of both packets and bytes in the bidirectional flow.
Subflow descriptors attributes: These features display data about sub-flows: the number of packets and bytes sent and received in the bidirectional flow.
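As an illustration of the bidirectional flow assessment and of the packet-based and bytes-based groups listed above, the following sketch derives forward and backward counters from a packet list. The function name, the tuple layout, and the dictionary keys are hypothetical; this is not how CICFlowMeter is implemented, only a small model of the rule that the first packet fixes the forward direction.

```python
def summarize_flow(packets):
    """Summarize one bidirectional flow.

    packets: list of (src_ip, dst_ip, size_bytes) tuples in capture order.
    The first packet defines the forward direction; packets travelling the
    other way are counted as backward.
    """
    first_src, first_dst, _ = packets[0]
    stats = {"fwd_pkts": 0, "bwd_pkts": 0, "fwd_bytes": 0, "bwd_bytes": 0}
    for src, dst, size in packets:
        direction = "fwd" if (src, dst) == (first_src, first_dst) else "bwd"
        stats[direction + "_pkts"] += 1
        stats[direction + "_bytes"] += size
    return stats


flow = [
    ("10.0.0.1", "10.0.0.2", 60),    # first packet: sets the forward direction
    ("10.0.0.2", "10.0.0.1", 60),    # reply: backward
    ("10.0.0.1", "10.0.0.2", 1500),  # forward again
]
stats = summarize_flow(flow)
```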

Each group’s attack classes, along with their total size, are shown in Table 1. The combined dataset contains 343,939 instances of normal and attack traffic: 68,424 instances of normal traffic and 275,515 of attack traffic.

Table 1. Attack types in InSDN dataset.


5. Data pre-processing and SDN specific features

This section discusses the identification of the features that are required and that can be acquired directly from the SDN network. The statistical features (e.g. flow duration and packet and byte counts) can be collected by the SDN controller via OpenFlow communication with the SDN switches (Krishnan, Duttagupta, & Achuthan, Citation2019). These features can either be derived readily from the SDN controller using API queries or calculated manually from flow statistics data. We opted for a subset of 52 features. The source IP, destination IP, source port, and destination port have been used only to determine the simulated backup path from the benign subset (i.e. they are not considered in the machine learning model because IP addresses and ports can change from one network to another). Table 2 shows the features shortlisted for the SDN context.

Table 2. Selected features for SDN.


When studying the SDN features in the InSDN dataset, we found that many features can be removed because they are standard deviations of other features; removing them avoids redundancy. Table 3 shows the unique InSDN dataset features utilized by the MLBNIR approach.

Table 3. Selected unique features for SDN.


6. MLBNIR proactive approach for backup path recovery

The proposed MLBNIR approach is illustrated in Figure 2. Our work begins when an IDS detects an intrusion in the SDN, as we focus our study on intrusion recovery.

Figure 2. High-level overview of our intrusion recovery approach.

As shown in Figure 2, an external IDS is responsible for monitoring the SDN network by obtaining from the SDN controller the statistical features that are collected from the SDN switches via OpenFlow communication. Then, upon detecting an intrusion in any network flow, the IDS sends an alert message to the SDN controller to delete the infected flow from the SDN switches in order to isolate and withstand the infection. The SDN controller in turn activates the pre-configured backup path (as shown in Figure 3) that has been proactively calculated using MLBNIR. It is worth mentioning that this algorithm can be installed as a sub-system in the SDN controller with no changes required in the SDN network configuration (Pratama, Suwastika, & Nugroho, Citation2018).

Figure 3. Backup path in SDN.

We initiate a series of reactions when the IDS detects an intrusion. The procedure must achieve the following objectives: recover the infected flow, and withstand potential reinfection. We use machine learning to achieve these objectives, and source blocking devices such as network firewalls.

Using the InSDN dataset, we build a machine learning method that can understand traffic dynamics and evaluate backup path goodness values, which can be used to dynamically update the backup path. The SDN switches can be pre-configured with a backup path, allowing for quick intrusion recovery. We use machine learning algorithms including LR, DT, SVM, and RF to train, test, and evaluate the learning model for backup path evaluation.

The traffic features evident in the InSDN dataset were gathered not only in regular working conditions but also during intruded flow scenarios. Traffic attributes from the flow intrusion scenario assist in the estimation of intrusion impact on the goodness of backup paths. It further helps in creating a more efficiently devised plan for a backup path for each primary path. It is important to emphasize that rather than waiting for the next estimation instant to occur, switches can be used to start the estimating process in case an anomaly in traffic behavior is spotted.

6.1. Proposed MLBNIR approach

After all the possible flow-node disjoint backup paths to the intruded path have been identified (these are simulated in our experiment by using the benign traffic flows), the SDN can judge the goodness value of all the candidate backup paths. Until another estimation exercise is conducted, the backup path with the lowest flow congestion value is used. If variations in traffic behavior make the best backup path different from the one selected at the last instant, the controller can readily provide new backup forwarding rules and immediately install them on all primary path switches. The T-MLBNIR algorithm is illustrated in Equation (1). As a result, the controller can respond rapidly to alterations in traffic patterns, which allows it to adjust the backup path to alleviate traffic congestion. The proposed MLBNIR approach flow diagram is shown in Figure 4. In Figure 4, an external IDS is responsible for monitoring the SDN network by obtaining from the SDN controller the network traffic statistical features that are collected from the SDN switches via OpenFlow communication. Then, upon detecting an intrusion in any network flow, the IDS sends an alert message using northbound API commands (Salazar & Cardenas, Citation2019) to the SDN controller to delete the infected flow from the SDN switches in order to isolate and withstand the infection. Then, the SDN controller checks the source and destination IPs of the attack network flow to find the candidate backup paths using normal network traffic. The source and destination IPs have been captured in the InSDN dataset (Elsayed et al., Citation2020). After that, the MLBNIR approach is used to find the optimum backup path with minimum flow congestion. Finally, the SDN controller in turn activates the pre-configured backup path to recover the SDN network.

Figure 4. Proposed SDN MLBNIR approach flow diagram.

The following traffic congestion attributes are used to calculate the flow congestion value for each source and destination pair, as seen in Equation (1). The lower the Flow Congestion (FC) value, the higher the goodness of the backup path:

Flow interarrival times: The time between two packets sent in the flow, obtained from the mean of the packet interarrival times (in microseconds) in the routed traffic queue for a group of consecutive packets with shorter interarrival delays. The lower the time interval between two packets sent in the flow, the higher the goodness of the same network flow.
Size of flows measured in the number of packets: The number of packets per second in a related flow was used to determine the flow size. The larger the size of the aggregate flow is, the more bandwidth the flow will utilize, and this will result in more congestion in the backup path.
Size of flows measured in the number of bytes: Flow size, measured in bytes per second in a related flow, was examined similarly. The larger the number of bytes of the aggregate flow is, the more bandwidth the flow will utilize, and this will result in higher congestion in the backup path.

$$FC = \frac{\text{Flow IAT Mean}}{\text{Flow IAT Max}} + \frac{\text{Flow Pkts/s}}{\text{Fwd Pkts/s} + \text{Bwd Pkts/s}} + \frac{\text{Flow Byts/s}}{\text{Fwd Byts/s} + \text{Bwd Byts/s}} \tag{1}$$

In Equation (1), for each flow in the SDN, the flow congestion attribute ratios are summed. The three ratios give the share of the flow interarrival time in the flow, the utilization of flow packets in the flow, and the flow bandwidth.
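As a worked illustration, Equation (1) can be evaluated directly on a single flow record. The sketch below assumes the record is a dictionary keyed by the feature names that appear in Equation (1); the helper names and the choice to return 0.0 for a ratio with a zero denominator are assumptions made here, and the sample values are invented rather than taken from the InSDN dataset.

```python
def _ratio(numerator, denominator):
    """Safe division; returning 0.0 for a zero denominator is an assumption."""
    return numerator / denominator if denominator else 0.0


def flow_congestion(flow):
    """Flow congestion (FC) value of Equation (1) for one flow record."""
    iat_term = _ratio(flow["Flow IAT Mean"], flow["Flow IAT Max"])
    pkt_term = _ratio(flow["Flow Pkts/s"],
                      flow["Fwd Pkts/s"] + flow["Bwd Pkts/s"])
    byte_term = _ratio(flow["Flow Byts/s"],
                       flow["Fwd Byts/s"] + flow["Bwd Byts/s"])
    return iat_term + pkt_term + byte_term


# Illustrative values only; they are not taken from the InSDN dataset.
example = {
    "Flow IAT Mean": 250.0, "Flow IAT Max": 1000.0,
    "Flow Pkts/s": 40.0, "Fwd Pkts/s": 50.0, "Bwd Pkts/s": 30.0,
    "Flow Byts/s": 6000.0, "Fwd Byts/s": 8000.0, "Bwd Byts/s": 4000.0,
}
print(flow_congestion(example))  # 0.25 + 0.5 + 0.5 = 1.25
```

A candidate backup path with a smaller result would be preferred, since a lower FC value means a higher goodness.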

6.2. Machine learning techniques for model training

A variety of ML algorithms are used to develop the learning model recommended in this study. Considering that a data point comprises the total traffic attributes (i.e. in ) gathered from the network in a specific time instant, the learning model must produce a vector that comprises numerous absolute values (i.e. traffic congestion attributes). These values must further indicate the flow congestion values of possible backup paths for each primary path in the network.

Linear Regression (LR).
Decision Tree (DT).
Support Vector Machines (SVM).
Random Forest (RF).

Although these ML techniques can be applied to both regression and classification problems, in this study we mainly employ them to address regression problems, i.e. to estimate the traffic congestion attributes of the backup path.
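To make the regression setting concrete, the sketch below fits the simplest of the listed techniques, linear regression, with a single input feature and scores it with the mean square error. It is a toy example under loud assumptions: the data points are invented, only one feature is used, and the function names are hypothetical. The decision tree, SVM, and random forest regressors used in the study would normally come from an ML library and are not reproduced here.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (one feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy data invented for illustration: packets/s -> bytes/s on a path.
train_x, train_y = [10.0, 20.0, 30.0, 40.0], [1500.0, 3100.0, 4400.0, 6000.0]
test_x, test_y = [25.0, 35.0], [3700.0, 5300.0]

slope, intercept = fit_line(train_x, train_y)
predictions = [slope * x + intercept for x in test_x]
print(slope, intercept, mean_squared_error(test_y, predictions))
```

Repeating the same fit-and-score step for each candidate regressor and keeping the one with the lowest error on held-out data is the usual way such a comparison is made.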

7. Performance study

The MLBNIR approach has been compared to the shortest path routing approach, referred to as the “baseline” (Truong-Huu et al., 2019), and to the approaches discussed in the literature review, i.e. the “ML approach in Baseline” (Truong-Huu et al., 2019), “FRT” (Muthumanikandan & Valliyammai, 2017), and the “Proactive approach” (Ali et al., 2020). The baseline method employs a reactive recovery mechanism, which means that the backup path for a primary path is not set in advance. When a link fails, the switch alerts the controller, which uses shortest path routing to compute a backup path for the failed traffic. The backup path is then configured in the affected switches before the failed traffic is sent to the destination along it.

When comparing the MLBNIR approach to the approaches presented in the literature review, we employ multiple performance metrics, namely failure recovery time, bandwidth allocated per flow, bandwidth utilization, and the number of changes in backup paths.

7.1. Analysis of results

Mean Square Error of Machine Learning Algorithms: Since the intention is to reduce flow congestion, Figure 5 plots the mean square error for flow bytes per second obtained with the machine learning algorithms used in our research. The results reveal that DT and RF outperform SVM and LR. Based on this, we apply Random Forest in our proposed MLBNIR approach in the following experiments and compare it to the approaches discussed in the literature review.
Intrusion Recovery Time: Figure 6 shows the time it takes to recover from a flow intrusion after it has occurred. This metric is crucial because recovering from a flow intrusion faster has a lower impact on the network. According to the findings, MLBNIR took 9.8 milliseconds to select the best backup path to recover from network flow intrusions. This demonstrates the effectiveness of our proposed MLBNIR approach, since it recovers from flow intrusion quickly and correctly. Compared to the baseline approach, our solution provides the best reduction in failure recovery time, at 90%. With the ML algorithm proposed in Truong-Huu et al. (2019), the reduction is 80%. The fast re-routing technique (FRT) using the shortest path (Muthumanikandan & Valliyammai, 2017) takes up to 30 ms to recover, and another proactive recovery approach (Ali et al., 2020) takes up to 22 ms to recover an SDN flow. This illustrates the efficiency of our proposed MLBNIR approach, which uses machine learning to identify the backup path. When a flow entry is removed upon the detection of an intrusion in that flow, the switch simply uses the backup paths that were pre-installed in its flow table to route the failed traffic. This differs from the baseline technique, which needs switches to notify the controller of a failure and then wait for the controller to compute a backup path. The total time needed by the baseline method is therefore made up of the round-trip delay of the signaling packet and the path computation time at the controller.
Bandwidth: Figure 7 shows the bandwidth allocated per flow; all links in the network have a capacity of 10 Gbps. The findings demonstrate that the MLBNIR approach greatly increases the amount of bandwidth that can be assigned to a flow in the event of failure, i.e. 900 Mbps. While the baseline approach always takes the shortest path (which may become congested if a connection fails), the MLBNIR approach learns traffic patterns and selects the path with the lowest flow congestion value. If a backup path is shared by several primary paths with a similar connection, the impact of a flow intrusion is magnified, and bottleneck links on the backup path may result. Compared to the approaches discussed in the literature review and the baseline approach in Truong-Huu et al. (2019), the results demonstrate that the MLBNIR approach can boost the amount of bandwidth allocated per flow by 225% in the event of an intrusion. With the ML algorithm proposed in Truong-Huu et al. (2019), the increase in bandwidth allocation is 40%. Figure 8 shows that the proposed MLBNIR approach is also more effective in terms of bandwidth utilization, since the traffic flows are assigned extra bandwidth upon an intrusion, as mentioned above. The MLBNIR approach utilizes 90% of the bandwidth available on each flow and, in comparison to the baseline approach, enhances network bandwidth utilization by 57%. With the ML algorithm proposed in Truong-Huu et al. (2019), the increase in bandwidth utilization is 28%. Indeed, rather than always using the shortest path, our proposed MLBNIR approach uses a backup path that is potentially longer but has greater bandwidth availability. Not only does this increase network bandwidth usage, but it also improves load balancing across the network’s links. As a result, our proposed MLBNIR approach prevents the situation where one link becomes overloaded while other links go underutilized in the event of an intrusion.
Number of modifications in backup paths: This metric is MLBNIR’s average number of backup path modifications per source-destination pair (s-d pair). When an intrusion is detected in a flow, the flow switches from one backup path to another. We count the number of changes in the backup paths of flows in the network after a flow intrusion. Using the “Timestamp” feature in the InSDN dataset, we run the tests over all of the flow intrusion scenarios in the InSDN dataset and take the average number of modifications. The findings reveal that backup paths undergo few modifications. The number of modifications grows in line with the number of intrusions in the network, where several backup paths exist for an intruded flow. Despite this, the MLBNIR approach requires at most 2 backup paths per s-d pair to be altered in the event of an intrusion, compared to a maximum of 3 backup path changes in Truong-Huu et al. (2019). This does not affect the controller’s performance. Furthermore, because the controller evaluates the goodness of the backup paths separately, changes in backup paths do not affect the network during normal operation.
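The backup path modification metric can be illustrated with a short counting routine. This is a sketch under assumptions rather than the evaluation code of the study: the log format, a list of `(timestamp, source, destination, path_id)` tuples recording the backup path selected at each re-estimation, is hypothetical, as are the function names and the sample entries.

```python
from collections import defaultdict


def count_backup_changes(selections):
    """Count backup path changes per (source, destination) pair.

    `selections` holds (timestamp, src, dst, path_id) tuples, a hypothetical
    log of the backup path selected at each re-estimation.
    """
    last_path = {}
    changes = defaultdict(int)
    # Sorting the tuples orders the log by timestamp first.
    for _, src, dst, path_id in sorted(selections):
        pair = (src, dst)
        if pair in last_path and last_path[pair] != path_id:
            changes[pair] += 1
        last_path[pair] = path_id
        changes.setdefault(pair, 0)  # keep pairs that never changed
    return dict(changes)


def average_changes(changes):
    """Average number of backup path changes over all s-d pairs."""
    return sum(changes.values()) / len(changes) if changes else 0.0


# Made-up log for illustration.
log = [
    (1, "h1", "h2", "p1"), (2, "h1", "h2", "p2"),
    (3, "h1", "h2", "p2"), (4, "h1", "h2", "p1"),
    (1, "h3", "h4", "p5"), (2, "h3", "h4", "p5"),
]
per_pair = count_backup_changes(log)
print(per_pair, average_changes(per_pair))
```

In this made-up log the pair `h1`-`h2` changes its backup path twice and `h3`-`h4` never does, so the average is one change per s-d pair.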

Figure 5. Mean Square error of machine learning algorithms for flow bytes per second.

Figure 6. Recovery time.

Figure 7. Bandwidth allocated per flow.

Figure 8. Bandwidth utilization.

8. Conclusion

In this paper, we discussed the issues associated with proactive intrusion recovery in SDN. To address them, we used the InSDN dataset to dynamically identify the appropriate backup path for an intruded path. The MLBNIR machine learning-based approach has been created and adapted to study traffic dynamics during normal conditions and flow intrusion scenarios. The MLBNIR approach determines the goodness of a backup path from the flow interarrival times, the size of flows measured in the number of packets, and the size of flows measured in the number of bytes. The backup path can be pre-installed in the switches’ flow tables and is automatically updated when an intrusion is detected in the traffic. The performance of the proposed MLBNIR approach has been evaluated with four ML algorithms and then compared to the approaches discussed in the literature review. Compared to the baseline approach, our solution provides the best reduction in flow recovery time, with a recovery time of about 10 ms. Our proposed algorithm also greatly increases the amount of bandwidth that can be assigned to a flow in the event of failure, i.e. 900 Mbps, which exceeds the 400 Mbps assigned by the baseline approach. According to our findings, our solution improves on the previous approaches in the literature review by up to 90% in intrusion recovery time and by up to 57% in network bandwidth utilization.

Disclosure statement

No potential conflict of interest was reported by the author(s).

References

Abujubbeh, M., Al-Turjman, F., & Fahrioglu, M. (2019). Software-defined wireless sensor networks in smart grids: An overview. Sustainable Cities and Society, 51, 101754. doi:10.1016/j.scs.2019.101754
Akyildiz, I. F., Lee, A., Wang, P., Luo, M., & Chou, W. (2014). A roadmap for traffic engineering in SDN-OpenFlow networks. Computer Networks, 71, 1–30. doi:10.1016/j.comnet.2014.06.002
Ali, J., Min Lee, G., Hee Roh, B., Ryu, D. K., & Park, G. (2020). Software-defined networking approaches for link failure recovery: A survey. Sustainability, 12(10), 4255. doi:10.3390/su12104255
Ashraf, J., Keshk, M., Moustafa, N., Abdel-Basset, M., Khurshid, H., Bakhshi, A. D., & Mostafa, R. R. (2021). IoTBoT-IDS: A novel statistical learning-enabled botnet detection framework for protecting networks of smart cities. Sustainable Cities and Society, 72, 103041. doi:10.1016/j.scs.2021.103041
Brun, O., Wang, L., & Gelenbe, E. (2016). Big data for autonomic intercontinental overlays. IEEE Journal on Selected Areas in Communications, 34(3), 575–583. doi:10.1109/JSAC.2016.2525518
Chen, D., Wawrzynski, P., & Lv, Z. (2021). Cyber security in smart cities: A review of deep learning-based applications and case studies. Sustainable Cities and Society, 66, 102655. doi:10.1016/j.scs.2020.102655
Draper-Gil, G., Lashkari, A. H., Mamun, M. S. I., & Ghorbani, A. A. (2016). Characterization of encrypted and VPN traffic using time-related features. In Proceedings of the 2nd international conference on information systems security and privacy. SCITEPRESS – Science and Technology Publications. doi:10.5220/0005740704070414
Duenas, J. C., Navarro, J. M., G, H. A. P., Andion, J., & Cuadrado, F. (2018). Applying event stream processing to network online failure prediction. IEEE Communications Magazine, 56(1), 166–170. doi:10.1109/MCOM.2018.1601135
Elsayed, M. S., Le-Khac, N.-A., & Jurcut, A. D. (2020). InSDN: A novel SDN intrusion dataset. IEEE Access, 8, 165263–165284. doi:10.1109/ACCESS.2020.3022633
Eswaradass, A., Sun, X.-H., & Wu, M. (2006). Network bandwidth predictor (NBP): A system for online network performance forecasting. In Sixth IEEE international symposium on cluster computing and the grid (CCGRID’06). IEEE. doi:10.1109/ccgrid.2006.72
Evans, D. (2011). The internet of things: How the next evolution of the internet is changing everything. CISCO White Paper, 1(2011), 1–11.
Hu, F., Hao, Q., & Bao, K. (2014). A survey on software-defined network and OpenFlow: From concept to implementation. IEEE Communications Surveys & Tutorials, 16(4), 2181–2206. doi:10.1109/COMST.2014.2326417
Inayat, U., Zia, M. F., Mahmood, S., Khalid, H. M., & Benbouzid, M. (2022). Learning-based methods for cyber attacks detection in IoT systems: A survey on methods, analysis, and future prospects. Electronics, 11(9), 1502. doi:10.3390/electronics11091502
Jahromi, H. Z., & Delaney, D. T. (2018). An application awareness framework based on SDN and machine learning: Defining the roadmap and challenges. In 2018 10th international conference on communication software and networks (ICCSN). IEEE. doi:10.1109/ICCSN.2018.8488328
Khan, N. A., Laouini, G., Alshammari, F. S., Khalid, M., & Aamir, N. (2023). Supervised machine learning for jamming transition in traffic flow with fluctuations in acceleration and braking. Computers and Electrical Engineering, 109, 108740. https://www.sciencedirect.com/science/article/pii/S0045790623001647. doi:10.1016/j.compeleceng.2023.108740
Khunteta, S., & Chavva, A. K. R. (2017). Deep learning based link failure mitigation. In 2017 16th IEEE international conference on machine learning and applications (ICMLA). IEEE. doi:10.1109/icmla.2017.00-58
Klaine, P. V., Imran, M. A., Onireti, O., & Souza, R. D. (2017). A survey of machine learning techniques applied to self-organizing cellular networks. IEEE Communications Surveys & Tutorials, 19(4), 2392–2431. doi:10.1109/COMST.2017.2727878
Krishnan, P., Duttagupta, S., & Achuthan, K. (2019). VARMAN: Multi-plane security framework for software defined networks. Computer Communications, 148, 215–239. doi:10.1016/j.comcom.2019.09.014
Laprie, J. C. (1992). Dependability: Basic concepts and terminology. In Dependability: Basic concepts and terminology (pp. 3–245). Vienna: Springer. doi:10.1007/978-3-7091-9170-5_1
Lara, A., Kolasani, A., & Ramamurthy, B. (2014). Network innovation using OpenFlow: A survey. IEEE Communications Surveys & Tutorials, 16(1), 493–512. doi:10.1109/SURV.2013.081313.00105
McKeown, N., Anderson, T., Balakrishnan, H., Parulkar, G., Peterson, L., Rexford, J., … Turner, J. (2008). OpenFlow. ACM SIGCOMM Computer Communication Review, 38(2), 69–74. doi:10.1145/1355734.1355746
Mohan, P. M., Truong-Huu, T., & Gurusamy, M. (2014, December). TCAM-aware local rerouting for fast and efficient failure recovery in software defined networks. In 2015 IEEE global communications conference (GLOBECOM). IEEE. doi:10.1109/glocom.2014.7417309
Mohan, P. M., Truong-Huu, T., & Gurusamy, M. (2017). Fault tolerance in TCAM-limited software defined networks. Computer Networks, 116, 47–62. doi:10.1016/j.comnet.2017.02.009
Muthumanikandan, V., & Valliyammai, C. (2017). Link failure recovery using shortest path fast rerouting technique in SDN. Wireless Personal Communications, 97(2), 2475–2495. doi:10.1007/s11277-017-4618-0
Ndiaye, M., Hancke, G., & Abu-Mahfouz, A. (2017). Software defined networking for improved wireless sensor network management: A survey. Sensors, 17(5), 1031. doi:10.3390/s17051031
Nguyen, B., Ge, Z., der Merwe, J. V., Yan, H., & Yates, J. (2015, September). ABSENCE: Usage-based Failure Detection in Mobile Networks. In Proceedings of the 21st annual international conference on mobile computing and networking. ACM. doi:10.1145/2789168.2790127
Noshad, Z., Javaid, N., Saba, T., Wadud, Z., Saleem, M., Alzahrani, M., & Sheta, O. (2019). Fault detection in wireless sensor networks through the random forest classifier. Sensors, 19(7), 1568. doi:10.3390/s19071568
Pratama, R. F., Suwastika, N. A., & Nugroho, M. A. (2018). Design and implementation adaptive intrusion prevention system (ips) for attack prevention in software-defined network (sdn) architecture. In 2018 6th international conference on information and communication technology (ICOICT) (pp. 299–304). doi:10.1109/ICoICT.2018.8528735
Rafique, Z., Khalid, H. M., & Muyeen, S. M. (2020). Communication systems in distributed generation: A bibliographical review and frameworks. IEEE Access, 8, 207226–207239. doi:10.1109/ACCESS.2020.3037196
Salazar, L. E., & Cardenas, A. A. (2019). Enhancing the resiliency of cyber-physical systems with software-defined networks. In Proceedings of the acm workshop on cyber-physical systems security & privacy (pp. 15–26). New York, NY, USA: Association for Computing Machinery. doi:10.1145/3338499.3357356
Shafiq, M., Tian, Z., Bashir, A. K., Jolfaei, A., & Yu, X. (2020). Data mining and machine learning methods for sustainable smart cities traffic classification: A survey. Sustainable Cities and Society, 60, 102177. doi:10.1016/j.scs.2020.102177
Singh, S., & Jha, R. K. (2017). A survey on software defined networking: Architecture for next generation network. Journal of Network and Systems Management, 25(2), 321–374. doi:10.1007/s10922-016-9393-9
Singh, S. K., Jeong, Y.-S., & Park, J. H. (2020). A deep learning-based IoT-oriented infrastructure for secure smart city. Sustainable Cities and Society, 60, 102252. doi:10.1016/j.scs.2020.102252
Srinivasan, S. M., Truong-Huu, T., & Gurusamy, M. (2019). Machine learning-based link fault identification and localization in complex networks. IEEE Internet of Things Journal, 6(4), 6556–6566. doi:10.1109/JIOT.2019.2908019
Truong-Huu, T., Prathap, P., Mohan, P. M., & Gurusamy, M. (2019, May). Fast and adaptive failure recovery using machine learning in software defined networks. In 2019 IEEE international conference on communications workshops (ICC workshops). IEEE. doi:10.1109/iccw.2019.8757169
Wang, M., Cui, Y., Wang, X., Xiao, S., & Jiang, J. (2018, March). Machine learning for networking: Workflow, advances and opportunities. IEEE Network, 32(2), 92–99. doi:10.1109/MNET.2017.1700200
Wickboldt, J., Jesus, W. D., Isolani, P., Both, C., Rochol, J., & Granville, L. (2015, January). Software-defined networking: Management requirements and challenges. IEEE Communications Magazine, 53(1), 278–285. doi:10.1109/MCOM.2015.7010546
Xu, W., Zhou, H., Cheng, N., Lyu, F., Shi, W., Chen, J., & Shen, X. (2018). Internet of vehicles in big data era. IEEE/CAA Journal of Automatica Sinica, 5(1), 19–35. doi:10.1109/JAS.2017.7510736
Yamanaka, H., Kawai, E., & Shimojo, S. (2017). AutoVFlow: Virtualization of large-scale wide-area OpenFlow networks. Computer Communications, 102, 28–46. doi:10.1016/j.comcom.2016.12.006
