Abstract

The growth of the Internet of Things (IoT) has recently impacted our daily lives in many ways. As a result, a massive volume of data is generated and needs to be processed in a short period of time. Therefore, a combination of computing models such as cloud computing is necessary. The main disadvantage of the cloud platform is its high latency due to the centralized mainframe. Fortunately, a distributed paradigm known as fog computing has emerged to overcome this problem, offering cloud services with low latency and high-access bandwidth to support many IoT application scenarios. However, attacks against fog servers can take many forms, such as distributed denial of service (DDoS) attacks that severely affect the reliability and availability of fog services. To address these challenges, we propose mitigation of fog computing-based SYN Flood DDoS attacks using an adaptive neuro-fuzzy inference system (ANFIS) and software defined networking (SDN) assistance (FASA). The simulation results show that the FASA system outperforms other algorithms in terms of accuracy, precision, recall, and F1-score. This shows how crucial our system is for detecting and mitigating TCP-SYN flood DDoS attacks.

1. Introduction

The growing number of connected objects, from millions to billions in various fields, is leading to an explosion in the amount of data. These huge volumes of data increase latency and make real-time analysis complex and difficult. To solve these issues, the deployment of computing models such as cloud and fog computing is crucial [1]. Cloud computing technologies provide extremely powerful computing resources over the network. Nevertheless, because of concerns about data privacy, security, and network latency, connecting ever more diverse types of objects directly to the cloud is extremely difficult [2, 3]. Therefore, a new paradigm is needed to solve these problems.

Recently, fog computing has emerged to expand the cloud computing paradigm from the core to the network’s periphery. The purpose of fog computing is to bring computing capabilities closer to IoT devices, offering real-time processing with low latency [4]. Aside from this, fog computing also provides mobility support, location awareness, and decentralized infrastructure. Fog computing has a local data storage infrastructure, which makes it more secure than cloud computing. Despite this, IoT devices are limited in terms of storage capacity and battery life. Thus, they can be easily hacked, destroyed, or stolen, and fog computing may become unavailable and unable to handle normal user requests. Therefore, it is necessary to apply security mechanisms to identify and block unauthorized requests on network systems. Moreover, fog computing is still susceptible to various security and privacy gaps. It can be a point of vulnerability, and it is easily overwhelmed by a massive number of malicious requests, primarily those of distributed denial of service (DDoS) attacks [5]. DDoS attacks can be divided into two types, depending on the protocol level targeted. The first is known as “network-level flooding,” in which TCP, UDP, ICMP, and DNS packets are used to overload the victim’s network capabilities and resources. The second is referred to as “application-level DDoS flooding,” which is typically carried out against an HTTP web page to deplete server resources such as sockets, CPU, ports, memory, databases, and input/output bandwidth [6]. Given the rapid growth of DDoS attacks and the harm they cause, several studies have been conducted on these attacks, and various approaches have been presented in the literature to prevent them using fog computing [7, 8]. Most of them proposed a defensive fog layer that operates as a filtering layer between the user layer and the cloud computing layer. However, these defensive approaches lack DDoS detection mechanisms, and the detailed computation is not discussed. Also, they do not identify any infrastructure to protect fog computing itself, which is particularly susceptible to DDoS attacks that may disrupt network services.

For this purpose, solutions based on software defined networking (SDN) technology could provide an innovative framework to deal efficiently with this insidious attack. By separating the control and data planes, SDN enables us to define the control logic centrally and to instruct the forwarding plane to act accordingly. This programmability provides more control over network traffic, which was not conceivable before the development of SDN [9].

Considerable research has been done within SDN-based IoT-fog networks using task scheduling techniques together with threshold random walk with credit-based connection (TRWCB) and rate limiting. These techniques are deployed for detecting anomalies and mitigating DDoS attacks, which effectively reduces average response times [10]. However, they can result in excessive CPU and RAM consumption. This scheduling-based approach secures only the scheduling periods, leaving the network vulnerable during idle times when no tasks are scheduled [11]. Moreover, it incorporates both fuzzy logic and multiobjective particle swarm optimization, and as the number of variables and rules increases, designing and fine-tuning the fuzzy logic system can become highly complex.

Recently, machine and deep learning algorithms have gained attention for their effectiveness in detecting DDoS attacks by analyzing data patterns [12, 13]. Hence, merging fuzzy systems with neural networks combines the benefits of neural learning with the interpretability offered by fuzzy systems. An adaptive neuro-fuzzy inference system (ANFIS) empowers fuzzy systems to acquire knowledge from data. This synergy enhances fuzzy systems through neural networks. ANFIS’s hybrid approach facilitates adaptability to diverse attack patterns and network conditions.

Moreover, employing various approaches, such as the neuro-fuzzy classifier on the KDD CUP99 dataset [14], has been a common practice. This dataset includes numerous recognized attack variations and has traditionally been utilized in intrusion detection. Nevertheless, the KDD CUP99 dataset is now regarded as outdated, as it presents several unresolved issues that fail to meet the updated criteria for DDoS identification [15]. In our work, we focus exclusively on the TCP-SYN flood attack, since it is the most effective DDoS attack in fog computing. To exhaust the system’s resources or overwhelm the target server, the attackers typically infect several devices that behave as bots and synchronize suspicious traffic or requests, leading to an incomplete three-way handshake procedure [16]. Consequently, legitimate users cannot reach the desired fog server.

In this paper, we suggest a novel fog computing-based SYN Flood DDoS attack detection and mitigation using an adaptive neuro-fuzzy inference system (ANFIS) and SDN assistance (FASA). Compared to previous works, FASA utilizes the ANFIS model for network traffic classification, incorporates SDN support to enable real-time mitigation, and relies on the newly released CIC-DDoS2019 dataset. The proposed model demonstrates exceptional performance across multiple metrics, including accuracy, precision, recall, and F1-score. In addition, it exhibits a notably low rate of false positives. In brief, our significant contributions are outlined as follows:
(1) We propose a novel model, FASA, to detect and mitigate a SYN flood DDoS attack in fog computing using SDN assistance.
(2) We implement the ANFIS model to self-train the fog servers and distinguish between normal and malicious packets.
(3) The ANFIS model is implemented at the SDN controller and deployed at the fog server using a dataset captured from the SDN environment. Its main objective is to allow benign packets access while rejecting malicious ones, yielding a secure and dependable SDN controller that ensures fog service availability.
(4) The proposed evaluation method uses both the newly released dataset CIC-DDoS2019 and the SDN dataset. It is experimentally analyzed in terms of data availability and algorithm operating efficiency, and it can improve the performance.

The paper is structured in the following manner: Section 2 examines and discusses previous work to tackle the issue of DDoS attacks. Section 3 contains background knowledge. In Section 4, we formally define the proposed model, and in Section 5, we introduce our proposed framework followed by the evaluation outcomes and discussion in Section 6. Finally, the conclusion is given in Section 6.

2. Related Work

This section provides an overview of previous work on DDoS attacks, especially TCP-SYN flood attack detection. These works are grouped into three categories. The first covers statistical methods. The second and third highlight a few works based on machine learning (ML) and deep learning (DL) algorithms, respectively.

2.1. Statistical Methods

Statistical methods continuously evaluate user and network activities to identify abnormalities [17]. Because they can analyze the behavior of data packets, they are commonly used in DDoS attack detection systems. If a data flow does not match certain test statistics and measures, it is considered illegitimate. Ahalawat et al. [18] suggested a detection method for DDoS attacks based on Renyi entropy and a mitigation solution for SDN based on the packet drop approach. Using several probability distributions, their method can examine network traffic fluctuations. However, the need to set an optimal detection threshold is a typical limitation of entropy-based approaches. Hoque et al. [19] presented a novel correlation measure using the standard deviation and mean to detect DDoS attacks; the traffic is then classified as attack or normal by comparing the collected traffic to the profiled traffic. However, the usefulness of the suggested metric in identifying low-rate attacks is unclear. Jin and Yeung [20] discussed DDoS detection based on multivariate correlation analysis and provided a covariance analysis method for recognizing SYN flood attacks. The experimental results demonstrate that this technique accurately detects DDoS attack traffic in networks at varied levels of intensity. However, correlation approaches require a lot of processing, so they are unable to detect DDoS attacks in real time. Bhushan [7] suggested a novel framework that uses fog to detect DDoS attacks even before they reach the cloud, relying on an efficient resource provisioning algorithm to service cloud requests through intermediate fog servers. Furthermore, Tsai et al. [21] proposed an entropy-based DDoS detection and mitigation system designed for cloud computing environments using SDN. The entropy-based detection approach was implemented to protect the virtual machines and the controller from malicious attacks. However, the detection rate is significantly affected by the threshold value. Javanmardi et al. [10] proposed FUPE, a security-driven task scheduling algorithm for SDN-based IoT-Fog networks. FUPE uses fuzzy logic and multiobjective particle swarm optimization (PSO) to assign tasks to fog nodes, balancing security and efficiency objectives. However, managing and interpreting extensive rule sets makes the fuzzy logic framework hard to maintain and validate. In addition, multiobjective optimization with PSO requires parameter tuning and can be computationally intensive, particularly in large-scale environments. Furthermore, FUPE incorporates techniques like threshold random walk with credit-based connection (TRWCB) and rate limiting to detect malicious nodes and utilizes the SDN controller for mitigation by blocking attackers, ultimately leading to a reduction in average response times. Nevertheless, this approach may lead to elevated CPU and RAM usage. FUPE identifies and addresses anomalies only during the scheduling phase, leaving the network susceptible to threats in the absence of scheduling requests [11].
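
As a rough illustration of the entropy-based detectors discussed in this subsection, the following Python sketch computes the Shannon and Renyi entropy of one traffic window and flags the window when the entropy falls below a threshold. The choice of the destination IP address as the monitored feature, the window contents, and the threshold of 1.5 bits are assumptions made for illustration; they are not taken from the cited works.

```python
import math
from collections import Counter


def shannon_entropy(items):
    """Shannon entropy (in bits) of the empirical distribution of items."""
    counts = Counter(items)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def renyi_entropy(items, alpha=2.0):
    """Renyi entropy of order alpha (alpha > 0, alpha != 1), in bits."""
    counts = Counter(items)
    total = sum(counts.values())
    power_sum = sum((c / total) ** alpha for c in counts.values())
    return math.log2(power_sum) / (1.0 - alpha)


def is_suspicious(dst_ips, threshold):
    """Flag a window whose destination-IP entropy drops below the threshold.

    During a flood, traffic concentrates on the victim, so the entropy of
    the destination addresses seen in a window tends to fall.
    """
    return shannon_entropy(dst_ips) < threshold


# Hypothetical windows: traffic spread evenly over 8 hosts vs. one victim.
normal_window = ["10.0.0.%d" % (i % 8) for i in range(64)]
attack_window = ["10.0.0.1"] * 60 + ["10.0.0.2"] * 4

THRESHOLD = 1.5  # assumed value; choosing it well is the hard part
print(is_suspicious(normal_window, THRESHOLD), is_suspicious(attack_window, THRESHOLD))
```

The sketch also makes the limitation noted above concrete: the decision depends entirely on `THRESHOLD`, which has to be tuned for each network.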

2.2. Machine Learning Methods (ML)

Machine learning-based methods such as decision trees, deep learning, support vector machines (SVM), and K-means clustering are used to identify DDoS attacks [22]. These methods may be unsupervised (no training labels are required) or supervised (training samples are labeled as normal or malicious). A dataset containing numerous network and traffic features is used to train the model so that it learns automatically how to recognize suspicious behavior patterns. Rajagopal et al. [23] provided a meta-classification strategy that integrates many classifiers for both binary and multiclass classification. The decision jungle serves as the meta learner, combining numerous learners to obtain the best prediction performance. This proposed method has a precision of . Tuan et al. [24] proposed a novel TCP-SYN flood attack mitigation that traces back the IP sources of an attack in SDN networks using K-nearest neighbors (KNN). The testbed’s experimental findings reveal that of attack flows are identified and blocked. Priyadarshini et al. [25] demonstrated a new source-based DDoS mitigation approach to prevent these attacks in both fog and cloud computing environments. It deploys a defender module at the SDN controller that is based on machine learning (SVM, KNN, and Naive Bayes algorithms). However, classical ML techniques cannot handle large amounts of data, and their “real world” application is limited due to network attack issues. In addition, these approaches need a lot of time to learn, so they cannot be used in real time.

2.3. Deep Learning (DL) Methods

In recent studies, there has been a particular emphasis on evaluating the performance of DL models in DDoS detection. This is primarily due to their ability to effectively analyze large volumes of data and identify complex patterns within it. de Assis et al. [26] proposed a near-real-time solution that applies convolutional neural networks (CNN) to defend victims’ servers from DDoS attacks at the source end; the detection model reached a precision rate above . Novaes et al. [27] employed the generative adversarial network (GAN) architecture to mitigate the damage of DDoS attacks on SDNs. In the experimental assessment, the accuracy obtained using the published dataset CIC-DDoS2019 and the emulation was about . The authors compared the GAN framework’s findings against those of other deep learning algorithms, such as LSTM, CNN, and MLP. The authors of [28] employed a variety of machine learning (ML) algorithms to identify low-rate DDoS attacks. They found that the multilayer perceptron (MLP) performs the best among the assessed algorithms, with a detection rate of up to . Other ML models, such as random tree, random forest, and support vector machines, have proven useful in detecting and mitigating DDoS attacks. Deep learning has already been used to identify SYN flood attacks by Brun et al. [29], who built a random neural network to classify each packet as normal traffic or a SYN attack. Evmorfos et al. [30] used a random neural network to identify typical SYN attacks on Internet-connected equipment with limited processing capability, including edge devices, gateways, and fog servers. Devi et al. [31] presented an intrusion detection system (IDS) approach based on the Sugeno-type fuzzy inference system ANFIS to identify security concerns on relay nodes in a 5G wireless network. The model was trained and tested using the KDD Cup 99 dataset. Boroujerdi and Ayat [12] developed a novel ensemble of Sugeno-type adaptive neuro-fuzzy classifiers to identify DDoS attacks based on the Marliboost boosting approach. It was tested on the NSL-KDD dataset. However, the NSL-KDD and KDD Cup 99 datasets are considered unsuitable for current DDoS detection requirements because they comprise packet traces rather than flows, implying that the DDoS detection methods may become computationally expensive as the network grows. In summary, various studies have been published in recent years on how to identify DDoS attacks, particularly TCP SYN floods, using machine and deep learning. However, few of them have addressed using ANFIS to detect such attacks in fog computing based on SDN technology.
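
The works above are compared through metrics such as accuracy, precision, recall (detection rate), and F1-score. As a reminder of how these quantities relate, the following Python sketch derives them from binary predictions. The function name and the toy label vectors are hypothetical and serve only as an illustration; they do not reproduce any result from the cited studies.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute standard binary metrics; `positive` marks the attack class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    false_positive_rate = fp / (fp + tn) if fp + tn else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "fpr": false_positive_rate,
    }


# Toy example: 1 = attack flow, 0 = benign flow.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

In this toy case one attack is missed and one benign flow is misclassified, so accuracy, precision, recall, and F1-score all equal 0.75 and the false positive rate is 0.25.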

To address the limitations of the previous studies, in this paper we propose an ANFIS classifier that is implemented in the SDN controller to classify network traffic and deployed at the fog server using the recently published CIC-DDoS2019 dataset. The inclusion of various types of DDoS attacks in this dataset bridges the gaps found in previous datasets. In addition, we employ ANFIS using the SDN dataset for real-time mitigation.

3. Background Knowledge

This section highlights the required context for our proposed model. First, we give an overview of DDoS attacks and the different methods used for detection. Then, we introduce the Adaptive Network-based Fuzzy Inference System (ANFIS) detection algorithm. Finally, we present the Software Defined Networking (SDN) technology.

3.1. DDoS Attacks and Fog Computing

The DDoS attack is an advanced form of DoS attack. It differs from other attacks in that it is deployed in a “distributed” manner. A DDoS attack’s primary purpose is to inflict harm on a target for personal reasons, financial gain, or popularity [1]. It is an attack on availability that aims to make the victim system inaccessible to authorized users [32]. It is carried out by a large number of hacked and dispersed devices, known as bots or zombie devices, that have been infected with malware or compromised by an attacker [33]. The attacker centrally controls and coordinates these machines to launch an attack on the target machine [34].

3.1.1. Types of DDoS Attacks on Fog Computing

Several DDoS attack types are used to bring down the functionality or availability of network services on fog computing [32], as illustrated in Figure 1.

(1) Application-Bug Level DDoS. These sorts of attacks, like HTTP POST and HTTP PRAGMA, deplete the application system, causing it to fail or temporarily close down.

(2) Infrastructural Level DDoS. The key purpose of these threats is to exhaust network bandwidth, buffers, CPU, and storage, preventing legitimate users from using them. Thus, the only requirement for this attack is the victim’s IP address. It is categorized into two types: direct and reflector attacks.
(i) Direct Attack: This attack is carried out with the assistance of compromised devices or bots. It sends malicious queries to the target using bots in order to deplete its resources, bandwidth, and services, rendering them inaccessible to authenticated users. This attack can be further subdivided into network-layer and application-layer DDoS attacks.
(a) Network Layer DDoS: This attack type employs various network and transport layer protocols, including TCP SYN, UDP, and ICMP, among others.
(b) Application Layer DDoS: In this attack, HTTP flood traffic is adopted widely to exhaust the victim. This kind of vulnerability is difficult to detect, raising security issues.
(ii) Reflector Attack: In this attack, the IP address is spoofed and requests are delivered to a vast range of reflector hosts. Following the receipt of the requests, a response is provided in order to flood the target.

3.1.2. DDoS Defense Mechanisms

In this section, we discuss various defense mechanisms used for DDoS attack detection and mitigation for the security of fog computing [35]. Fog computing offers computing capabilities close to edge devices in the form of fog nodes, which creates a heavy load on network management. To address this issue, SDN technology can be implemented to guarantee the safety of fog computing in the following aspects:
(i) Monitoring the network: If the network is monitored permanently and continuously, any suspicious data attempting to disrupt services may be recognized and rejected. As this is performed at fog nodes, legitimate users will have no difficulty accessing the services.
(ii) Priority-based and isolated traffic: Network traffic is prioritized according to whether it is legitimate or illegitimate, since both compete for shared resources such as CPU or I/O. As a result, SDN can reject damaging traffic by separating it through a VLAN ID/tag.
(iii) Access control mechanism for resources in the network: To prevent DDoS attacks, an effective access control system should be implemented.
(iv) Shared network: A shared network is a critical condition, since anyone can access it, which puts security at risk.

In addition, two distinct assessments are used to identify DDoS defense mechanisms. The first classification divides the DDoS defense systems into the following four groups based on the activity carried out:(i)Intrusion prevention,(ii)Intrusion detection,(iii)Intrusion tolerance and mitigation,(iv)Intrusion response.

Further, the second categorization classifies DDoS defenses into the following three groups based on where they are deployed:(i)Victim network,(ii)Intermediate network,(iii)Source network.

3.1.3. TCP-SYN Flood Attack

The SYN flooding attack is a specific type of DoS attack that targets hosts running TCP server processes. It became well known in 1996 [36]. The three-way handshake that initiates a TCP connection is the basis of this attack [37]. It exploits a characteristic of the TCP connection setup process and may be used to prevent servers from responding to legitimate requests to establish new TCP connections. As a result, any service that binds to and listens on a TCP socket is potentially susceptible to TCP SYN flood attacks, although several techniques to counteract them may be found in modern operating systems and equipment.

3.2. The Adaptive Network-Based Fuzzy Inference System (ANFIS) Detection Algorithm

ANFIS is a network model that combines a Sugeno-type fuzzy system with neural learning capability [38]. Neuro-fuzzy systems learn fuzzy systems from data using learning algorithms derived from neural networks. Because of their learning capabilities, neural networks are an ideal choice for combining with fuzzy systems [39], where they are used to automate or simplify the process of developing a fuzzy system for a specific usage. The initial neuro-fuzzy techniques were primarily explored within the field of neuro-fuzzy control, although the approach is now broader and is used in a number of domains, including control, data analysis, and decision support [40]. ANFIS is based on two sets of parameters (premise and consequent parameters), which are used to connect the fuzzy rules. Moreover, ANFIS is made up of five layers in total, as illustrated in Figure 2. The square nodes have parameters, whereas the circular nodes do not.

The considered fuzzy inference system contains two inputs, x and y, and one output f. Each input variable is described by two linguistic terms: $A_1$ and $A_2$ for the variable x, and $B_1$ and $B_2$ for the variable y, respectively.

The following two IF-THEN rules construct the Sugeno fuzzy model [40]:
(i) Rule 1: If x is $A_1$ and y is $B_1$, then $f_1 = p_1 x + q_1 y + r_1$
(ii) Rule 2: If x is $A_2$ and y is $B_2$, then $f_2 = p_2 x + q_2 y + r_2$

where $p_i$, $q_i$, and $r_i$, i = 1, 2, correspond to the linear parameters of the conclusion part to be adjusted during the training.
(i) Layer 1: $O_{1,i}$ represents the membership function of a fuzzy set $A_i$ (or $B_i$). Because of their smoothness and simple syntax, Gaussian membership functions are preferred approaches for defining fuzzy sets. The advantage of these curves is that they are smooth and nonzero at all locations. The Gaussian membership function is used in this study; it is frequently used to reduce the uncertainty of real-world measurement and is represented by the equation
$$\mu_{A_i}(x) = \exp\left(-\frac{(x - c_i)^2}{2\sigma_i^2}\right), \tag{1}$$
where $c_i$ and $\sigma_i$ represent the mean and standard deviation, respectively; that is, $c_i$ is the center and $\sigma_i$ is the width of the curve. $c_i$ and $\sigma_i$ are called premise parameters (nonlinear).
(ii) Layer 2: the fuzzification layer determines the degree of membership function satisfaction of each input; the output is the product of all the entering signals, determined using the following equation:
$$w_i = \mu_{A_i}(x)\,\mu_{B_i}(y), \quad i = 1, 2. \tag{2}$$
The output of every node shows the firing strength of a rule. The node function in this layer can be any other fuzzy AND T-norm operator, such as min.
(iii) Layer 3: the normalization layer, in which the i-th node determines the proportion of the firing strength of the i-th rule to the total firing strength of all rules, as demonstrated in the equation:
$$\bar{w}_i = \frac{w_i}{w_1 + w_2}, \quad i = 1, 2. \tag{3}$$
The outputs of this layer are referred to as normalized firing strengths.
(iv) Layer 4: in the defuzzification layer, the parameters are named consequent parameters. Each node computes the function
$$O_{4,i} = \bar{w}_i f_i = \bar{w}_i \left(p_i x + q_i y + r_i\right), \quad i = 1, 2, \tag{4}$$
where $\bar{w}_i$ is a normalized firing strength from layer 3, $\{p_i, q_i, r_i\}$ is the set of linear node parameters, defined as the consequent parameters of this node, and $f_i$ denotes the output of the i-th rule.
(v) Layer 5: in this layer, the single node adds up all of the incoming signals to compute the overall output, as demonstrated in the equation:
$$f = \sum_{i} \bar{w}_i f_i = \frac{\sum_{i} w_i f_i}{\sum_{i} w_i}. \tag{5}$$
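
To make the five layers concrete, the following Python sketch evaluates one forward pass of a two-input, two-rule first-order Sugeno ANFIS with Gaussian membership functions. It is a minimal illustration under stated assumptions: the function names, the dictionary layout of the parameters, and all numeric values are hypothetical and are not the trained FASA model.

```python
import math


def gaussian(v, c, sigma):
    """Gaussian membership degree of v for a fuzzy set with center c, width sigma."""
    return math.exp(-((v - c) ** 2) / (2.0 * sigma ** 2))


def anfis_forward(x, y, premise, consequent):
    """One forward pass through the five ANFIS layers.

    premise:    {"A": [(c, sigma), (c, sigma)], "B": [(c, sigma), (c, sigma)]}
    consequent: [(p1, q1, r1), (p2, q2, r2)]
    Returns the crisp output and the normalized firing strengths.
    """
    # Layer 1: membership degrees of each input in its linguistic terms.
    mu_a = [gaussian(x, c, s) for c, s in premise["A"]]
    mu_b = [gaussian(y, c, s) for c, s in premise["B"]]
    # Layer 2: firing strength of each rule (product T-norm).
    w = [mu_a[i] * mu_b[i] for i in range(2)]
    # Layer 3: normalized firing strengths.
    total = sum(w)
    w_bar = [wi / total for wi in w]
    # Layer 4: rule outputs weighted by the normalized firing strengths.
    f = [p * x + q * y + r for p, q, r in consequent]
    o4 = [w_bar[i] * f[i] for i in range(2)]
    # Layer 5: overall output is the sum of the incoming signals.
    return sum(o4), w_bar


# Hypothetical parameters chosen so that both rules fire equally at (0.5, 0.5).
premise = {"A": [(0.0, 1.0), (1.0, 1.0)], "B": [(0.0, 1.0), (1.0, 1.0)]}
consequent = [(1.0, 1.0, 0.0), (0.0, 0.0, 3.0)]
output, strengths = anfis_forward(0.5, 0.5, premise, consequent)
print(output, strengths)
```

With these values the two rules have equal firing strength, so the output is the plain average of the rule outputs $f_1 = 1$ and $f_2 = 3$. Inputs far from every center can make all firing strengths underflow to zero, which a practical implementation would need to guard against.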

The nodes of an adaptive network are associated with parameters that affect the final output. To adapt these parameters, ANFIS typically uses a hybrid learning algorithm that combines gradient descent and the least squares approach [41]. The hybrid algorithm comprises a forward pass and a backward pass. In the forward pass, node outputs are propagated forward until Layer 4, and the least squares method determines the consequent parameters. In our work, the ADAM method [42] is employed during the backward pass to optimize the premise parameters: error signals are propagated backward, and the premise parameters are updated using ADAM. This hybrid learning approach offers faster convergence by reducing the search space dimensions compared to the original backpropagation method [31]. It has been demonstrated that this hybrid algorithm is extremely effective in training ANFIS systems [43]. The ANFIS training technique begins by defining the number of fuzzy sets for each input variable, as well as the shape of their membership functions. The primary goal of ANFIS is to use input-output data sets and a learning mechanism to tune the parameters of an equivalent fuzzy inference system. The difference between the intended and actual outputs is minimized as much as feasible during parameter optimization.
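
The forward pass of the hybrid algorithm works because, once the premise parameters are fixed, the ANFIS output is linear in the consequent parameters. The following Python sketch illustrates this for a two-input, two-rule model by building the regression matrix from the normalized firing strengths and solving the normal equations. It is a simplified sketch under assumptions: the helper names, the synthetic data, and the use of plain Gaussian elimination are illustrative choices, and the ADAM backward pass is not shown.

```python
import math


def gaussian(v, c, sigma):
    return math.exp(-((v - c) ** 2) / (2.0 * sigma ** 2))


def normalized_strengths(x, y, premise):
    """Layers 1 to 3: normalized firing strengths for fixed premise parameters."""
    w = [gaussian(x, ca, sa) * gaussian(y, cb, sb)
         for (ca, sa), (cb, sb) in zip(premise["A"], premise["B"])]
    total = sum(w)
    return [wi / total for wi in w]


def solve_linear(a, b):
    """Solve a * theta = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    theta = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][k] * theta[k] for k in range(i + 1, n))
        theta[i] = (m[i][n] - s) / m[i][i]
    return theta


def fit_consequents(samples, targets, premise):
    """Least squares estimate of [p1, q1, r1, p2, q2, r2]."""
    rows = []
    for x, y in samples:
        wb = normalized_strengths(x, y, premise)
        rows.append([wb[0] * x, wb[0] * y, wb[0], wb[1] * x, wb[1] * y, wb[1]])
    n = 6
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    atb = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(n)]
    return solve_linear(ata, atb)


def predict(x, y, premise, theta):
    wb = normalized_strengths(x, y, premise)
    return (wb[0] * (theta[0] * x + theta[1] * y + theta[2])
            + wb[1] * (theta[3] * x + theta[4] * y + theta[5]))


# Synthetic check: generate targets from known consequents, then recover them.
premise = {"A": [(0.0, 1.0), (2.0, 1.0)], "B": [(0.0, 1.0), (2.0, 1.0)]}
true_theta = [1.0, -2.0, 0.5, 0.3, 0.7, -1.0]
samples = [(0.5 * i, 0.5 * j) for i in range(5) for j in range(5)]
targets = [predict(x, y, premise, true_theta) for x, y in samples]
theta = fit_consequents(samples, targets, premise)
print([round(t, 6) for t in theta])
```

Solving the normal equations directly keeps the example dependency-free; a production implementation would more likely use a numerically robust least squares routine or a recursive estimator.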

3.3. Software-Defined Networking (SDN)

SDN is a network paradigm that enables users to directly manage network resources by orchestrating, controlling, and using software applications [44]. SDN separates the control plane from the data plane, which is why it is commonly used to improve network efficiency. While the data plane forwards packets from one location to another, the control plane determines whether or not the packets should propagate through the network. Thus, SDN is formed by the combination of a controller and switches; these switches follow the forwarding rules that are defined by the controller, which can dynamically manage network flows and implement different configurations based on network circumstances. The three fundamental layers of SDN architecture are as follows: (i) The application layer, which contains the general network functions including intrusion detection systems, firewalls, and security applications. (ii) The control layer, which is the centralized software controller that serves as the SDN’s brain. The network policies and traffic flows are managed by this controller. (iii) The infrastructure layer, which contains a variety of networking equipment, including switches and routers [45], as shown in Figure 3. The communication between the controllers and switches is defined by the OpenFlow protocol [46], which serves as the communication standard for SDN networks. It is referred to as the southbound communication of SDN networks. Using the OpenFlow protocol, the controller can manage the flow tables of OpenFlow switches (OF-switches). When a packet’s flow entry is found in the OF-switch’s table, the packet is forwarded in the usual manner; otherwise, the packet is sent to the controller for additional evaluation. Thus, SDN controllers with OpenFlow-enabled switches are widely used for SDN networking. They are especially suitable for light traffic communication and control.
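
The table lookup described above can be sketched in a few lines of Python. This is a conceptual model only: `FlowEntry`, `Switch`, `handle_packet`, and the toy policy are hypothetical names and do not correspond to the OpenFlow message format or to any controller API.

```python
from dataclasses import dataclass, field


@dataclass
class FlowEntry:
    match: dict      # header fields that must all equal the packet's fields
    action: str      # "forward" or "drop" in this simplified model
    priority: int = 0


@dataclass
class Switch:
    table: list = field(default_factory=list)

    def lookup(self, packet):
        """Return the highest-priority matching entry, or None on a table miss."""
        candidates = [e for e in self.table
                      if all(packet.get(k) == v for k, v in e.match.items())]
        if not candidates:
            return None
        return max(candidates, key=lambda e: e.priority)


def handle_packet(switch, packet, controller_policy):
    """Forward by flow table; on a miss, ask the controller and install its rule.

    Returns (action, table_hit).
    """
    entry = switch.lookup(packet)
    if entry is not None:
        return entry.action, True
    new_entry = controller_policy(packet)   # packet goes to the controller
    switch.table.append(new_entry)          # controller installs a flow entry
    return new_entry.action, False


# Toy controller policy (an assumption): drop sources on a blocklist.
BLOCKED = {"203.0.113.9"}


def policy(packet):
    action = "drop" if packet["src_ip"] in BLOCKED else "forward"
    return FlowEntry(match={"src_ip": packet["src_ip"]}, action=action, priority=10)


sw = Switch()
first = handle_packet(sw, {"src_ip": "203.0.113.9", "dst_port": 80}, policy)
second = handle_packet(sw, {"src_ip": "203.0.113.9", "dst_port": 80}, policy)
third = handle_packet(sw, {"src_ip": "192.0.2.4", "dst_port": 80}, policy)
print(first, second, third)
```

The first packet from each source misses the table and reaches the controller, while later packets are handled by the switch alone, which is the behavior that lets the controller block an attacker once and leave enforcement to the data plane.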

3.4. System Model

This study aims to develop a distributed FASA framework to mitigate SYN flood attacks in the network environment by recognizing and stopping attacks close to the attacking sources. Fog computing is well suited to deploying SDN for this purpose: it places compute power near where the traffic originates and spreads the load across the system through the FASA mitigation scheme, which enables quicker and more accurate attack detection using the ANFIS model. In this section, we first outline SYN flood DDoS attacks in fog computing. Then, we discuss the FASA network architecture.

3.5. SYN Flood DDoS Attack

As shown in Figure 4, when a standard TCP three-way handshake is initiated, the end user (EU) transmits a SYN packet to the fog server. Then, the fog server responds with a SYN/ACK packet. Next, the EU sends an ACK packet to the fog server. When all of these steps are completed, the connection is established [47]. However, the main weakness of TCP connection setup is that the server must hold state for every half-open connection. The fog server is in a half-open connection state while it waits for the EU’s ACK to complete the three-way handshake. Furthermore, IoT devices have limited computation, storage capacity, and battery life, and they can easily be compromised, damaged, or stolen. Because of these limitations, an attacker may simply hack IoT devices and use them as a botnet to generate and send excessive SYN request packets with a spoofed source IP address to fog servers. The SYN/ACK packets are then transmitted to the spoofed host, so the ACK packet never reaches the fog server, which keeps waiting for it, and the three-way handshake is never completed. Each such connection entry is kept in the connection buffer until it times out, preventing legitimate users from accessing the services [48].
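
The effect of half-open connections on a bounded connection buffer can be illustrated with a small Python simulation. This is a deliberately simplified model: the function name, the event format, the backlog size of 3, and the timeout of 10 time units are assumptions for illustration and do not describe a real TCP stack.

```python
def simulate_backlog(events, backlog_size, timeout):
    """Simulate a server's half-open connection buffer.

    events: list of (time, kind, client) tuples, kind in {"SYN", "ACK"},
            sorted by time.
    Returns (established, rejected) lists of client identifiers.
    """
    half_open = {}          # client -> time at which its SYN was accepted
    established, rejected = [], []
    for t, kind, client in events:
        # Drop half-open entries whose handshake timer has expired.
        for stale in [c for c, t0 in half_open.items() if t - t0 >= timeout]:
            del half_open[stale]
        if kind == "SYN":
            if len(half_open) >= backlog_size:
                rejected.append(client)      # buffer full: SYN is dropped
            else:
                half_open[client] = t        # server replies SYN/ACK and waits
        elif kind == "ACK" and client in half_open:
            del half_open[client]            # handshake completed
            established.append(client)
    return established, rejected


# Three spoofed SYNs fill the buffer; the legitimate user is rejected until
# the stale entries time out.
events = [
    (0, "SYN", "spoofed-1"),
    (1, "SYN", "spoofed-2"),
    (2, "SYN", "spoofed-3"),
    (3, "SYN", "user"),       # rejected: buffer is full of half-open entries
    (12, "SYN", "user"),      # accepted: spoofed entries have expired
    (13, "ACK", "user"),
]
established, rejected = simulate_backlog(events, backlog_size=3, timeout=10)
print(established, rejected)
```

A real attack sustains the SYN rate so that expired entries are replaced immediately, which keeps the buffer full for as long as the flood lasts.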

3.6. FASA Network Architecture

To effectively deal with the SYN flood DDoS attack concerns in the network systems, attack prevention must be built into fog computing based on SDN. Indeed, in this paper, we propose a novel distributed fog defensive system for SYN flood DDoS attacks using ANFIS and SDN Assistance (FASA). The FASA architecture has three layers, the cloud layer, the SDN-based fog (SDFN) layer, and the things layer, as shown in Figure 5.

3.6.1. Cloud Layer

Cloud computing, as a computing model, defines a method of managing a pool of configurable computing resources, offers elastic, on-demand services, and makes the system accessible anywhere and at any time. Therefore, users can use resources according to their demands. The salient features provided by cloud technology are immediate flexibility and measurable services [1]. SDN and cloud technology can be combined to automate cloud application provisioning, which must be completely integrated with the network. Hence, in the FASA system, cloud computing refers to the application plane, which consists of many useful applications that communicate with the logically centralized controller to make coordinated decisions.

3.6.2. SDN-Based Fog Network (SDFN) Layer

This layer combines the fog computing and SDN paradigms to identify and respond to DDoS attacks. Recent advances in SDN open up new opportunities for providing intelligence within networks. The benefits of SDN, including logically centralized control, software-based traffic analysis, an entire network view, and flexible forwarding rule updates, help to improve and facilitate machine learning applications [49]. Therefore, the SDFN layer offers a new way of handling DDoS attacks in fog computing environments using SDN. This layer is formed of two sublayers, SDFN-server and SDFN-node.
(i) SDFN-server: This sublayer refers to the control plane deployed at fog servers, where an intelligent ANFIS classifier is integrated into the control network to classify traffic flows; policies are then managed according to its decisions. Moreover, the SDFN-server communicates with the cloud layer (application plane) via the northbound interface and with the SDFN-node layer via the southbound interface.
(ii) SDFN-node: This sublayer refers to the data plane, which consists of the physical equipment in the network such as switches and routers. It forwards the network traffic to its destinations using the OpenFlow protocol.

3.6.3. Things Layer

This layer serves the purpose of sensing, collecting, and uploading data from wireless sensors and end-users to fog computing. The transmitted packet can be classified as either benign or malicious.

The following assumptions are made in order to better explain the SYN flood DDoS attack identification and defense framework:
(i) The SDN-based fog network server (SDFN-server) is susceptible to being compromised.
(ii) DDoS attacks are TCP SYN flood attacks against SDFN-servers.
(iii) The SDN controller and the switch are not compromised.
(iv) IoT devices can be hacked.

4. Proposed FASA Framework

SYN flood DDoS attacks can instantly bring down a network, and it is difficult to detect them since they can be carried out in a very short time. Therefore, detecting and mitigating such attacks is critical. A detection approach for such threats is needed in fog computing to filter and block the malicious requests before the attack produces a negative impact on the fog services. Consequently, our FASA framework can be used to identify and immediately mitigate SYN flood attacks in real-time through fog computing, as illustrated in Figure 6.

4.1. The Detection Process

FASA is based on the ANFIS model and the SDN network to guarantee service availability in the fog network. For detection, a fog layer is placed between the cloud layer and the things layer, so that the detection techniques deployed on the fog layer can handle and process malicious traffic. The SDN controller deployed on the fog layer also controls the packets arriving from every system node to enhance security and network management. In addition, the SDFN-server is trained beforehand with the ANFIS algorithm and tested using two different datasets, CIC-DDoS2019 and the SDN dataset. After a successful data preprocessing step, the most important features are extracted. These features are then divided into training data and testing data to train the SDFN-server to identify the SYN flood attack. Once that is done, the ANFIS model is able to determine whether an incoming packet is legitimate or not. The controller then bases its decision on this result, as presented in the flowchart of Figure 7.

4.2. The Mitigation Process

SDN simplifies the implementation of complex mitigation models. When an OpenFlow switch receives a packet, it looks for a matching rule in its flow table. If a rule matches, the switch forwards the packet to its destination according to that rule; otherwise, it seeks assistance from the controller. The OpenFlow switch initiates this request through the SB-API of the OpenFlow agent in the switch, as demonstrated in the flowchart in Figure 7. An attack may first be identified using a threshold value, which is the maximum serving capacity defined by the availability of computational resources. If the number of service requests exceeds this threshold, the packet is treated as malicious [50]. Otherwise, the packet is passed to the ANFIS classifier on the fog server for prediction. The real-time mitigation phase starts when the ANFIS model detects a SYN flood packet. This phase aims to perform defensive functions that limit the damage caused by an exploit. If the packet comes from a legitimate user, the updated rule in the flow table is executed through the OpenFlow protocol to allow access. Otherwise, the controller looks for the source MAC address that occurs most often with different source IP addresses and uses it to determine the infected port number. By correlating the identified MAC address with the corresponding port on the switch, the controller determines the port through which the attack traffic is entering the network. To prevent further damage, the controller instructs the OF switches to drop all packets received from the host associated with the identified MAC address. The controller also directs the switch to block traffic on the specific port associated with the infected host, effectively preventing any communication through that port. Next, the controller updates the flow table of the switch to modify the rules related to receiving or forwarding packets to the identified port. This ensures that any packets destined for that port are dropped or redirected to mitigate the attack.

As a result, TCP SYN flooding attacks may be identified and prevented by instantly blocking the switch port that is connected directly to the attacker’s host (see Algorithm 1).

input: incoming packet of traffic flow to the switch
output: response with flow classification and decision
if packet matched in the flow table then
  Apply the rule in the flow table;
else
  Forward packet to SDFN-server;
  Apply ANFIS classifier;
  if flow classified as malicious packet then
    Retrieve the MAC address of the attacker;
    Update rule table in flow table with a malicious user;
    Make decision:
      Drop the packets with this source MAC address;
      Block the infected switch port;
  else
    Update rule table in SDN with the legitimate user;
    Make decision: Forward the packet to destination;
  end if
end if
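
The branches of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration rather than the authors’ controller code: the function name `handle_packet`, the flow table keyed only by source MAC address, the `host_ports` lookup, and the toy classifier based on a hypothetical `syn_count` field are all assumptions made for the example. In the FASA system the classification is performed by the trained ANFIS model.

```python
def handle_packet(packet, flow_table, is_malicious, host_ports):
    """Process one packet following the branches of Algorithm 1.

    flow_table maps a source MAC to "forward" or "drop". This is a
    simplification: real OpenFlow rules match on many more fields.
    """
    src_mac = packet["src_mac"]

    # Flow-table hit: apply the rule that is already installed.
    if src_mac in flow_table:
        return [flow_table[src_mac]]

    # Table miss: the packet goes to the SDFN-server for classification.
    if is_malicious(packet):
        # Install a drop rule for this MAC and block the attacker's port.
        flow_table[src_mac] = "drop"
        return ["drop", f"block port {host_ports[src_mac]}"]

    # Legitimate flow: install a forwarding rule.
    flow_table[src_mac] = "forward"
    return ["forward"]


# Toy stand-in for the ANFIS classifier (hypothetical rule, for illustration only).
def toy_classifier(packet):
    return packet["syn_count"] > 100


flow_table = {}
host_ports = {"00:00:00:00:00:05": 3}

first = handle_packet({"src_mac": "00:00:00:00:00:01", "syn_count": 2},
                      flow_table, toy_classifier, host_ports)
attack = handle_packet({"src_mac": "00:00:00:00:00:05", "syn_count": 500},
                       flow_table, toy_classifier, host_ports)
repeat = handle_packet({"src_mac": "00:00:00:00:00:05", "syn_count": 1},
                       flow_table, toy_classifier, host_ports)
```

The third call shows why the installed rule matters: once the drop rule exists, later packets from the same source are handled by the switch-side table lookup without consulting the classifier again.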

5. Experiments and Results

5.1. Experimental Setup

In this part, we will go over the various tools that were used to build up the experimental setup for detecting SYN flood attacks in the simulated SDN and fog computing environments, using Wireshark [51] to capture and analyze network traffic in real-time. The entire experiment is carried out on the Windows 10 OS with an Intel i3 processor and 8 GB of RAM. To emulate the network behavior, the SDN Mininet network emulator [52] was used with the Ryu controller [53]. Ryu is an open-source platform that provides transparency and flexibility, enabling customization and extension of functionalities. Its Python-based architecture promotes accessibility and ease of development, facilitating rapid implementation of SDN applications. In addition, support for multiple protocols, including OpenFlow, ensures seamless communication with diverse network devices. Ryu’s compatibility with various networking technologies and hardware makes it suitable for heterogeneous infrastructures, rendering it particularly well-suited for this research [54].

For training and testing our ANFIS model, the Python programming language was used with deep learning libraries such as Keras [55] and TensorFlow [56]. In addition, to prevent overfitting, stratified k-fold cross-validation [57] was employed with the ANFIS algorithm, because it guarantees that each fold has a class distribution that matches the original dataset, resulting in a more accurate and reliable model assessment. Binary cross-entropy, a classic loss function for binary classification, was used as the loss function. We kept the default Keras learning rate of 0.001. Furthermore, the Adam optimizer [42] was selected as an adaptive algorithm for optimizing learning rates in neural network models. We examine the performance and efficiency of the FASA system in two different scenarios:
(i) Scenario 1: Evaluate the performance of the FASA system by employing the SDN environment.
(ii) Scenario 2: Evaluate the performance of the FASA system by using the public dataset CIC-DDoS 2019 [58].
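
To make the idea of stratified k-fold cross-validation concrete, the following pure-Python sketch assigns samples to folds so that every fold keeps the class ratio of the full dataset. It is an illustration only: the function name `stratified_k_fold`, the choice of 5 folds, and the 20/80 toy label distribution are assumptions, and the source does not state which implementation the authors used (a library routine such as scikit-learn’s `StratifiedKFold` would be the usual choice in practice).

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k, seed=0):
    """Yield (train_idx, test_idx) pairs whose class ratios track the full dataset."""
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Deal the samples of each class round-robin, so every fold gets its share.
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)

    # Each fold serves once as the test set; the rest form the training set.
    for i in range(k):
        test_idx = sorted(folds[i])
        train_idx = sorted(j for f, fold in enumerate(folds) if f != i for j in fold)
        yield train_idx, test_idx


# Toy imbalanced labels: 20 benign (0) and 80 attack (1) samples.
labels = [0] * 20 + [1] * 80
splits = list(stratified_k_fold(labels, k=5))
```

With these toy labels, each of the five test folds contains 4 benign and 16 attack samples, the same 20/80 ratio as the whole set.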

5.2. Experimental Analysis

In the following subsections, we discuss each test scenario and present its results.

5.2.1. Scenario 1

In our experiment, the Mininet network emulator [48] was used to design virtual network topologies consisting of controllers, hosts, links, and switches. To run Mininet and the Ryu controller [53], we used two virtual machines based on the Linux operating system. The Ryu controller is written in Python and supports several network management protocols such as OpenFlow. Moreover, FlowManager is a Ryu controller application that allows the user to manipulate the flow tables in an OpenFlow network manually. We used the Ryu controller for the SDN networking environment because of its ease of deployment, its extensibility, and its simple architecture. Ryu controllers with OpenFlow-enabled switches are widely used for SDN networking and are especially suitable for light traffic communication and control. In addition, the Ryu controller provides a routing link to the OpenFlow switches to ensure that the topology can perform data analysis. To emulate our network structure, a linear topology is used on Mininet, in which 8 switches are connected to the Ryu controller and each switch is connected to 8 hosts. In total, 64 hosts are linked to the OpenFlow virtual switches, as shown in Figure 8. The IP address of the Ryu controller is 192.168.162.133. Likewise, each host is assigned an IP address. For example, Host1 has the IP address “10.0.0.1/24”, and the MAC addresses start from 00 : 00 : 00 : 00 : 00 : 01 and are converted from hexadecimal to integers.
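
The addressing scheme described above can be reproduced with a few lines of Python. This is a sketch under assumptions: the host names `h1`…`h64`, the switch names `s1`…`s8`, and the sequential assignment of eight consecutive hosts to each switch are hypothetical conventions chosen for illustration; the source only specifies the first IP address, the first MAC address, and the 8 × 8 layout.

```python
def mac_to_int(mac):
    """Convert a colon-separated hexadecimal MAC address to an integer."""
    return int(mac.replace(":", ""), 16)


def int_to_mac(value):
    """Convert an integer back to a colon-separated MAC address."""
    raw = f"{value:012x}"
    return ":".join(raw[i:i + 2] for i in range(0, 12, 2))


def build_hosts(n_switches=8, hosts_per_switch=8):
    """Build the host table for a linear topology with sequential addresses."""
    hosts = {}
    for n in range(1, n_switches * hosts_per_switch + 1):
        hosts[f"h{n}"] = {
            "ip": f"10.0.0.{n}/24",
            "mac": int_to_mac(n),
            # Assumption: hosts are attached to switches in consecutive groups.
            "switch": f"s{(n - 1) // hosts_per_switch + 1}",
        }
    return hosts


hosts = build_hosts()
```

Converting MAC addresses to integers, as the text mentions, gives a numeric value that can be stored alongside the other flow features.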

In general, scenario 1 involves the following processes: the data generation and collection process, the detection process, and the mitigation process. These processes are deployed using the Mininet VM and the Ryu controller VM and are implemented in the Python programming language.
(i) Data generation and collection process: The SDN dataset is created using both the Mininet emulator and the Ryu controller. The normal traffic is collected using the “iperf” command, and we consider one host (Host1) as a simple HTTP server listening on port 80. In addition, we collect the SYN flood traffic data using the Hping3 tool with random source IP addresses. Hping3 is an open-source TCP/IP packet generation and analysis tool that can be scripted in the Tcl language, which enables programmers to create scripts for TCP/IP packet handling and analysis in a short time. MAC addresses are an important criterion for mitigating SYN flood attacks because layer 2 switches forward incoming traffic based on MAC addresses; they also help to identify the infected source port. Moreover, the layer 4 switch depends on the source and the destination ports that are essential in the flow table with the following features: datapath id, source IP, source MAC, destination IP, destination MAC, IP protocol, ICMP code, ICMP type, packet counts, and flags. Table 1 provides detailed information about the collected dataset.
(ii) The detection process: After the pre-processing of the collected data described above, we split the dataset as follows: The training set contains of the dataset, whereas the testing set contains of the dataset. Then, we use the ANFIS algorithm with cross-validation to avoid overfitting and train on the collected dataset to achieve an accuracy of . Next, once the packet-in is received in various forms of regular traffic and attack traffic, the Ryu controller collects the features and assigns their values to the predicted dataset. For the prediction process, the detection module (ANFIS algorithm) examines each flow entry.
(iii) The mitigation process: DDoS attacks are difficult to mitigate because of IP spoofing; therefore, blocking the suspected attacker’s IP address is ineffective. To obtain a list of the edge switches directly connected to each host, we store the MAC address, port number, and switch ID for each host in a Python dictionary. This dictionary serves as a data structure from which the parameters required for creating mitigation rules are retrieved.

Every flow entry passes through the detection process to check whether it is a normal packet or a malicious packet. The result is then sent to the Ryu controller, which makes a decision based on the prediction. If the flow entry’s predicted value is 1, it indicates a SYN flood attack in which the attacker transmits its real source MAC address together with a random, false source IP address. A MAC address that recurs most often with different IP addresses across the flow entries indicates that the host with this MAC address is the attacker. In this case, we use this MAC address to get the port number and switch ID from the dictionary. The Ryu controller then responds by enforcing a rule that rejects all packets originating from that attacker. This rule is sent to the affected switch, instructing it to block the specific port that is directly connected to the attacker’s host. By implementing this rule, the switch effectively prevents any communication from the attacker’s host through that particular port, helping to mitigate the impact of the attack. Both the hard timeout and the idle timeout are essential parameters that must be adjusted for the mitigation process:
(i) Idle timeout: the flow rule is deleted if no incoming packet matches it within the idle timeout value.
(ii) Hard timeout: the flow rule is deleted automatically once the hard timeout has elapsed since the rule was created.

In the case of an attack, the Ryu controller blocks the packet on the OF switch with an idle timeout of 0 sec and a hard timeout of 300 sec, using a high priority of 1000. As a result, the switch continues to block the source port for 300 seconds without notifying the controller. Otherwise, if the detection result is 0, this signifies normal traffic. In that case, the idle timeout is 200 seconds, and each flow entry has a fixed priority of 10. If no match occurs during this period, the flow rule is removed after 200 seconds. The hard timeout is 400 seconds, after which all flow entries are deleted.
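
The attacker lookup and the rule parameters described above can be sketched as follows. This is an illustration under assumptions, not the authors’ Ryu application: the function names, the layout of `host_table`, and the plain dictionary used to describe a rule are hypothetical, and a real controller would translate such a description into an OpenFlow flow-mod message. The priorities and timeouts are the values stated in the text.

```python
from collections import defaultdict


def find_attacker_mac(malicious_entries):
    """Return the source MAC seen with the most distinct source IPs."""
    ips_per_mac = defaultdict(set)
    for entry in malicious_entries:
        ips_per_mac[entry["src_mac"]].add(entry["src_ip"])
    return max(ips_per_mac, key=lambda mac: len(ips_per_mac[mac]))


def build_rule(prediction, src_mac, host_table):
    """Describe the flow rule to install for a classified flow entry."""
    location = host_table[src_mac]            # switch ID and port of the host
    rule = {"dpid": location["dpid"], "in_port": location["port"], "eth_src": src_mac}
    if prediction == 1:
        # Attack: high-priority rule with no actions (packets are dropped),
        # kept for 300 s regardless of traffic (idle timeout 0 = disabled).
        rule.update(actions=[], priority=1000, idle_timeout=0, hard_timeout=300)
    else:
        # Normal traffic: low-priority forwarding rule.
        rule.update(actions=["forward"], priority=10, idle_timeout=200, hard_timeout=400)
    return rule


# Hypothetical dictionary of hosts: MAC -> switch ID and port number.
host_table = {
    "00:00:00:00:00:05": {"dpid": 1, "port": 5},
    "00:00:00:00:00:02": {"dpid": 1, "port": 2},
}

# Flow entries predicted as malicious: one MAC appears with many spoofed IPs.
flagged = [
    {"src_mac": "00:00:00:00:00:05", "src_ip": "203.0.113.7"},
    {"src_mac": "00:00:00:00:00:05", "src_ip": "198.51.100.23"},
    {"src_mac": "00:00:00:00:00:05", "src_ip": "192.0.2.99"},
    {"src_mac": "00:00:00:00:00:02", "src_ip": "10.0.0.2"},
]

attacker = find_attacker_mac(flagged)
drop_rule = build_rule(1, attacker, host_table)
allow_rule = build_rule(0, "00:00:00:00:00:02", host_table)
```

Keying the lookup on the MAC address rather than the IP address reflects the point made in the text: spoofed IP addresses change from packet to packet, while the MAC address and switch port of the attacking host stay the same.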

During this experiment, the real-time traffic captured by Wireshark is shown in Figure 9, which plots the packets per second against time. In addition, Table 2 presents the parameters employed in this experiment.

Initially, normal traffic is sent out at time 0 seconds. Next, a SYN flood attack is initiated, and at 60 seconds, the packet rate reaches a threshold value close to 700 packets per second. The ANFIS detection module identifies the attack when Once the attack is detected, the mitigation module takes over. The controller applies appropriate flow rules to mitigate the attack by dropping packets, blocking the source ports involved in the attack, and informing the switches to update the flow table accordingly. The attack is successfully mitigated in less than 5 seconds, resulting in a significant drop in the packet rate. The graph shows that normal traffic continues to flow without any breakdown until the end of the experiment at 140 seconds. This period is crucial, as it represents the controller’s capability to receive packets effectively. Figure 10 shows that during the attack, the bandwidth decreased, reaching as low as 90 Mbits/sec. It then quickly recovered to its pre-attack state and remained relatively stable at around 100 Mbits/sec. This demonstrates the effectiveness of our model in mitigating the impact of the attack and restoring normal network performance.

5.2.2. Scenario 2

In the second scenario, we evaluate the proposed model’s capability to identify the TCP SYN flood DDoS attacks using the CIC-DDoS dataset produced by Sharafaldin et al. [58] for detecting DDoS attacks and classifying attack types. This dataset is in a CSV format. It includes both benign and current popular DDoS attacks launched in 2019. It is collected on the first and second days and reflects the actual real-world data (PCAPs). It also provides the findings of a network traffic analysis performed with CICFlowMeter-V3 that includes labeled traffic flows. This dataset originally had 88 features.

In this scenario, we use the SYN flood dataset presented in Table 3. TCP SYN flood is a type of exploitation category-based DDoS attack that exploits vulnerabilities in TCP connection protocols. It is composed of data from two days, each with a different attack category and a wide range of imbalanced class distribution.

(1) Resampling Data. Both the training and testing datasets contain a minority class, “BENIGN,” with few samples. This results in an imbalanced classification problem, which affects the model’s capacity to learn and decide and can also cause overfitting. To address this, we build a new dataset in which we take all samples labeled “BENIGN” from the training and testing datasets, forming of the total dataset and of samples labeled “SYN” as shown in Table 4.

(2) Data Pre-Processing. In this section, we go over the techniques used to prepare our dataset, which contains 88 features. The data are cleaned and prepared for use in our proposed ANFIS algorithm once certain undesirable attributes have been removed or adjusted. The data preprocessing step, shown in Figure 11, provides more reliable training and, thus, a more accurate model.
(i) First, we removed features that have a single value across the entire dataset and therefore do not affect the training phase (“Bwd PSH Flags,” “Fwd URG Flags,” “Bwd URG Flags,” “FIN Flag Count,” “Fwd Avg Bytes/Bulk,” “Fwd Avg Packets/Bulk,” “Fwd Avg Bulk Rate,” “Bwd Avg Bytes/Bulk,” “PSH Flag Count,” “ECE Flag Count,” “Bwd Avg Packets/Bulk,” “Bwd Avg Bulk Rate”).
(ii) Some values of “Init Win bytes forward” and “Init Win bytes backward” in the flow data from the Syn CSV file were set to −1. However, a window cannot be initialized with a size of −1 bytes. This problem was caused by a software issue in CICFlowMeter, and such values should be set to 0 or removed so that they do not disrupt the training phase.
(iii) Missing data disrupted the model’s training. The rows containing “infinity” and “NaN” in “Flow Bytes/s” and “Flow Packets/s” were therefore removed.
(iv) We removed categorical features that can change from one network to another (“Source Port,” “Destination Port,” “Source IP,” “Destination IP,” “Flow ID,” “SimillarHTTP,” “Unnamed: 0,” “Timestamp”).
(v) To properly distinguish important features, we deleted columns with a correlation higher than 0.8 (“Total Backward Packets,” “Total Length of Bwd Packets,” “Fwd Packet Length Std,” “Bwd Packet Length Min,” “Bwd Packet Length Mean,” “Bwd Packet Length Std,” “Flow IAT Mean,” “Flow IAT Std,” “Flow IAT Max,” “Fwd IAT Total,” “Fwd IAT Mean,” “Fwd IAT Std,” “Fwd IAT Max,” “Fwd IAT Min,” “Bwd IAT Std,” “Bwd IAT Max,” “Fwd Header Length,” “Bwd Header Length,” “Max Packet Length,” “Packet Length Mean,” “Packet Length Std,” “Packet Length Variance,” “RST Flag Count,” “Average Packet Size,” “Avg Fwd Segment Size,” “Avg Bwd Segment Size,” “Fwd Header Length.1,” “Subflow Fwd Packets,” “Subflow Fwd Bytes,” “Subflow Bwd Packets,” “Subflow Bwd Bytes,” “Active Max,” “Active Min,” “Idle Mean,” “Idle Max,” “Idle Min”).
(vi) In order to detect and classify DDoS attacks, the dataset is split into two classes. The label “BENIGN” is coded as “0” and the label “Syn” is coded as “1” in the dataset created to detect a SYN flood DDoS attack on the network traffic.
(vii) Feature selection is used to discover key data features and decrease the amount of data required for detection. We use the XGBoost technique, which assigns an importance score to each feature based on its influence in making crucial decisions using boosted decision trees [58]. Then, based on the feature ranking, we removed the features of negligible importance (“Protocol,” “Flow Duration,” “Total Fwd Packets,” “Fwd Packet Length Max,” “Bwd Packet Length Max,” “Flow IAT Mean,” “Flow IAT Min,” “Bwd IAT Total,” “Bwd IAT Mean,” “Bwd IAT Min,” “Fwd PSH Flags,” “Fwd Packets/s,” “Bwd Packets/s,” “Min Packet Length,” “SYN Flag Count,” “CWE Flag Count,” “Down/Up Ratio,” “Init Win bytes backward,” “act data pkt fwd,” “Active Mean,” “Active Std,” “Idle Std”) and selected a subset of nine ideal features, as presented in Table 5.
(a) We normalize the data by scaling all features to the range 0 to 1. As previously described, the dataset was divided into two parts, training data and testing data, using cross-validation to avoid overfitting during training.
(b) Finally, we put the ANFIS model to the test by making predictions on unseen data. The next section discusses the performances and results.
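
Several of the cleaning steps above (removing rows with infinite or NaN rates, resetting invalid −1 window sizes, encoding the labels, dropping single-valued features, and scaling to the range 0 to 1) can be sketched in pure Python. This is an illustration only: the function names and the four toy records are made up for the example, the steps are applied in a simplified order, and the correlation filter and the XGBoost ranking are not reproduced. The source does not specify the tooling used for these steps.

```python
import math

LABELS = {"BENIGN": 0, "Syn": 1}  # label encoding stated in the text


def clean(rows, window_cols, rate_cols):
    """Apply the row-level and column-level cleaning steps to flow records."""
    cleaned = []
    for row in rows:
        # Drop rows whose rate features are infinite or NaN.
        if any(not math.isfinite(row[c]) for c in rate_cols):
            continue
        row = dict(row)
        # A window size of -1 is invalid; reset it to 0.
        for c in window_cols:
            if row[c] == -1:
                row[c] = 0
        row["Label"] = LABELS[row["Label"]]
        cleaned.append(row)
    # Remove features that hold a single value across the dataset.
    constant = {c for c in cleaned[0] if len({r[c] for r in cleaned}) == 1}
    return [{c: v for c, v in r.items() if c not in constant} for r in cleaned]


def min_max_scale(rows, skip=("Label",)):
    """Scale every feature except those in `skip` to the range 0 to 1."""
    cols = [c for c in rows[0] if c not in skip]
    lo = {c: min(r[c] for r in rows) for c in cols}
    hi = {c: max(r[c] for r in rows) for c in cols}
    scaled = []
    for r in rows:
        out = dict(r)
        for c in cols:
            span = hi[c] - lo[c]
            out[c] = (r[c] - lo[c]) / span if span else 0.0
        scaled.append(out)
    return scaled


# Toy records (values invented for illustration).
raw = [
    {"Flow Bytes/s": 100.0, "Init Win bytes forward": -1, "FIN Flag Count": 0, "Label": "BENIGN"},
    {"Flow Bytes/s": float("inf"), "Init Win bytes forward": 5840, "FIN Flag Count": 0, "Label": "Syn"},
    {"Flow Bytes/s": 300.0, "Init Win bytes forward": 5840, "FIN Flag Count": 0, "Label": "Syn"},
    {"Flow Bytes/s": 200.0, "Init Win bytes forward": 2920, "FIN Flag Count": 0, "Label": "Syn"},
]

prepared = min_max_scale(clean(raw, ["Init Win bytes forward"], ["Flow Bytes/s"]))
```

In the toy data, the record with an infinite rate is discarded, the single-valued “FIN Flag Count” column disappears, and the remaining features end up between 0 and 1.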

5.3. Performance Metrics

Using the right performance metrics is the key to correctly evaluating models. Therefore, in this section, we explore the following performance metrics to evaluate the FASA framework:
(i) True Negatives (TN): Normal flow data is correctly identified as such.
(ii) True Positives (TP): Malicious flow data is correctly identified as such.
(iii) False Positives (FP): Normal flow data is mistakenly labeled as malicious traffic.
(iv) False Negatives (FN): Malicious flow data is mistakenly classified as normal flow data.

In addition, we provide the confusion matrix to describe our model’s classification performance. It summarizes the correct and incorrect predictions obtained using our proposed approach, as demonstrated in Figure 12.

Accurately distinguishing the benign class is of utmost importance in our model, as elevated false positive rates can result in unnecessary complexity and unwarranted alerts. Our main objective is to minimize the false rate. Hence, our framework achieves a rate of false positives in both CIC-DDoS2019 and SDN datasets. Otherwise, it obtains false negatives in the CIC-DDoS2019 dataset and in the SDN dataset. The receiver operating characteristic (ROC) curve is also plotted. It represents the relationship between the true positive rate and the false positive rate. The area under the ROC curve (AUC) measures how well the model can distinguish positive samples from negative ones. As illustrated in Figure 13, our model has an AUC of using the CIC-DDoS2019 dataset and using the SDN dataset; these two values are extremely similar, indicating that our proposed model correctly separates the positive and negative classes. By employing established techniques like k-fold cross-validation, the model ensures generalizability and guards against overfitting. Furthermore, the meticulous selection and optimization of impactful traffic features enhance the model’s proficiency in distinguishing between normal and attack behaviors. In addition, the fusion of fuzzy logic and neural learning components proves effective in capturing complex traffic patterns. Lastly, training on diverse attack data distributions further enhances the model’s robustness.

We have also used a variety of measures to assess our proposed model, including accuracy, precision, recall, and F-score, to conduct an in-depth comparative assessment with other relevant methods. These metrics, which are often employed in SYN flood DDoS detection systems, are described in the following:
(1) Accuracy is the ratio of the number of correctly classified samples to the overall number of samples observed. It is computed as follows:
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
(2) Precision is the ratio of correctly predicted positive samples to all samples predicted as positive. It is calculated as follows:
$$\text{Precision} = \frac{TP}{TP + FP}$$
(3) The false positive rate is the proportion of negative samples that were incorrectly classified as positive. It is determined using the following formula:
$$\text{FPR} = \frac{FP}{FP + TN}$$
(4) Recall, also called the true positive rate, is the ratio of correctly detected positive samples to all actual positive samples. It is determined using the equation:
$$\text{Recall} = \frac{TP}{TP + FN}$$
(5) Good precision may be more relevant in certain situations, whereas high recall might be more critical in others. In many cases, though, we aim to enhance both values. The F1-score combines these values, and it is commonly stated as their harmonic mean:
$$\text{F1-score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$$
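
The metric definitions above can be computed directly from the four confusion-matrix counts. The sketch below uses the standard formulas; the function name `classification_metrics` and the counts in the usage example are invented for illustration and are not results from the FASA experiments.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute the evaluation metrics from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)                 # true positive rate
    false_positive_rate = fp / (fp + tn)
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "fpr": false_positive_rate,
        "f1": f1,
    }


# Illustrative counts only (not taken from the paper's experiments).
metrics = classification_metrics(tp=90, tn=95, fp=5, fn=10)
```

With these made-up counts, the accuracy is 185/200 = 0.925 and the false positive rate is 5/100 = 0.05, which shows how a model can have high accuracy while still raising false alerts on a share of the benign traffic.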

5.4. Evaluation Results

To validate our system, we have compared the FASA framework to the FUPE [10] method and other DDoS attack detection systems that were employed on SDN and used the CIC-DDoS 2019 dataset, as illustrated in Table 6.

The first method is FUPE [10] which puts forward a fuzzy-based multiobjective particle swarm optimization approach and a security-aware task scheduler in IoT-fog networks. The second method is the convolutional neural network (CNN) [26], a low-cost based supervised classifier designed to identify suspicious events in a data center. The next approach is based on the generative adversarial network GAN [27] for identifying DDoS threats in SDN environments. Finally, the multilayer perceptron (MLP) [28] is adopted to identify and prevent low rate-DDoS attacks in SDN settings. Figure 14 depicts a comprehensive analysis of the metric findings of the comparative approaches.

As shown in Figure 14, we can observe that the performance of our model using the SDN dataset outperforms all previous techniques with accuracy, precision, recall, and F1-score in each case, and it closely resembles the outcome obtained using the CIC-DDoS2019 dataset. In addition, the accuracy of every learning algorithm is assessed. As a result, the ANFIS achieved the highest accuracy rating of across all classifiers, then the FUPE approach with followed by the CNN algorithm with . Furthermore, MLP and GAN classifiers attained an accuracy of and , respectively. It also illustrates the precision of each algorithm in identifying legal and malicious traffic. Thus, the ANFIS reached precision, and FUPE with a precision of , and the MLP attained a precision value of . Next, the GAN, and CNN algorithms with a precision of , and , respectively. Furthermore, Figure 14 displays the recall values of all methods used in the performance evaluation. The ANFIS algorithm had a recall value followed by FUPE with , whereas GAN had a recall rating. In comparison to the other algorithms tested, the CNN achieved the lowest recall value of while the MLP had a recall of . It also illustrates the F1-score of the classifying methods with , the ANFIS received the highest F1-Score. On the other hand, GAN, MLP, and CNN received F1-scores of , , and , respectively, while the FUPE’s F1-Score is not mentioned. In conclusion, our FASA framework outperforms the other evaluated approaches. The promising test results indicate that it is an effective approach for identifying SYN flood DDoS attacks.

6. Conclusion and Future Work

In this work, we proposed FASA, a fog computing-based SYN flood DDoS attack mitigation system that uses an adaptive neuro-fuzzy inference system (ANFIS) with software defined networking (SDN) assistance. Integrating SDN and the fog environment with the ANFIS machine learning algorithm brings intelligence to the SDN controller and makes our framework suitable, efficient, and more secure against SYN flood attacks. We trained and evaluated our framework on the newly released CIC-DDoS2019 dataset, which contains recent and extensive SYN flood DDoS attacks. The findings of the performance assessment indicate that the proposed model has a high detection accuracy and low false positive and false negative rates, and that it also offers the highest precision, recall, and F-score when compared with well-known machine learning algorithms. Our future work will focus on how well the proposed model performs on various datasets. In the current experiments, we employed a binary classification approach implemented on SDN to distinguish between legitimate and malicious input traffic in fog computing. In future work, we will therefore investigate the utility of the proposed approach for multi-class classification systems. Furthermore, we plan to assess the performance of ANFIS using additional regression metrics, including R-Square, RMSE (root mean square error), and MAE (mean absolute error), beyond the classification metrics presented. This expanded evaluation will offer a more comprehensive understanding of the model’s capabilities. In addition, to create a diversified dataset that truly represents actual Internet traffic, we will emulate the SDN network under various scenarios and with various attack traffic. We will also consider expanding our work to include the SoDIP6-based ISP/Telecom network, including edge computing network scenarios. This will allow us to evaluate the performance of our proposed model in a more complex and realistic environment. We will also investigate the use of our model for other network security applications, such as intrusion detection and prevention.

Data Availability

Data used in this study are available upon request to the corresponding author.

Conflicts of Interest

All authors declare that there are no conflicts of interest.

Acknowledgments

The authors conducted this research while affiliated with Abou Bekr Belkaid Tlemcen University, Paris-Saclay University, Edinburgh Napier University, and Dakahlia Mansoura University. An early version of the article appears on arXiv [59]. Open Access funding was enabled and organized by JISC.