Abstract

An eclipse attack is a common method used to attack the blockchain network layer; however, detecting eclipse attacks is challenging, and the performance of existing methods is inadequate due to uneven sample distribution, incomplete definition of discriminating features, and weak feature perception. Thus, this paper proposes an eclipse attack traffic detection method based on a custom combination of features and deep learning. To describe the behavior characteristics of attack traffic more accurately, traffic attribute features at three levels are defined in combination with the eclipse attack method. Here, the downstream traffic behavior of the eclipse attack is described by the conventional traffic features, and the frequency distribution characteristics of eclipse attack traffic are described by introducing the φ-entropy divergence algorithm. In addition, the structural characteristics of eclipse attack traffic are mapped from the rates of change in the traffic communication and load features. Then, the improved synthetic minority oversampling technique (ISMOTE) up-sampling algorithm is employed to eliminate the interference caused by the uneven distribution of eclipse attack traffic samples on the detection results. By calculating the local cluster density, the ISMOTE algorithm adjusts the sampling weight of minority class samples, supports automatic clustering and efficient up-sampling of samples, and improves the detection accuracy for eclipse attack samples. Next, deep feature mining is performed on the eclipse attack traffic from the distribution characteristics of space and time series using a CNN and Bi-LSTM. The mined features are then fused into a mixed feature using the multihead attention mechanism such that the relevance and complementarity of the two feature distributions can be utilized to enhance the model’s ability to perceive the spatiotemporal relationship of the eclipse attack traffic. Finally, binary classification is performed on the generated multihead attention items, and the results are output. Experimental results demonstrate that the proposed method can comprehensively enhance detection performance and effectively detect and classify eclipse attack traffic in the blockchain network layer.

1. Introduction

Blockchain, a distributed ledger technology based on peer-to-peer (P2P) network technology, has become a new trustless, distributed computing paradigm due to its openness, transparency, decentralization, resistance to tampering, and programmability [13]. However, due to the inherent weaknesses of the blockchain network layer, such as architectural heterogeneity, difficult dynamic management and control, and inconsistent network protocol standards, it faces many network layer security threats [2][3]. The structured graph model of Kademlia DHT [4] is used between the nodes of the blockchain network layer to exchange the underlying ledger state information; thus, attackers can target the structural security vulnerabilities of the blockchain network layer and launch network traffic attacks to obtain illegal gains [5]. Previous studies [6, 7] have suggested that blockchain security based on consensus mechanisms, e.g., proof of work, is highly dependent on the security of the underlying network. When the blockchain network layer is partitioned by network layer attacks, e.g., large-scale eclipse attacks, DDoS attacks, and Sybil and Erebus attacks, the attacked subnet and its node resources become isolated from the blockchain’s main network, and different ledger copies are generated among the blockchain nodes. As a result, the miner node group cannot reach a unique consensus on the blockchain master ledger, which wastes the hash rate, causes hard forks in the ledger data, and undermines the stability and security of the entire blockchain system.

The eclipse attack is a typical blockchain network layer attack that adds a massive number of malicious nodes to the neighbor node set of normal nodes by flooding the routing tables of the blockchain network’s nodes. Here, the attacker issues forced, malicious link construction requests to fill the routing table of the victim node before the victim node restarts such that the victim node is forced to establish routing connections with the attacker after restarting. Then, the attacker constantly establishes incoming connection requests to the victim miner node to monopolize the victim node’s channels and control its information flow. As a result, attacked miner nodes are forced to receive and forward the erroneous or malicious blockchain ledger information sent by the attackers. With this progressive attack method, an eclipse attack node can gradually assume control of the blockchain network layer channels and data flow of an increasing number of nodes. Consequently, the routing structure of the attacked subnet is destroyed, which results in wasted computing resources and provides attackers with computing power advantages (Zhang et al., 2016). Attackers can launch selfish mining and double-spending attacks based on eclipse attacks [8, 9], which will destroy the credibility and uniqueness of the generated results and lead to serious consequences [10], such as damaging data reliability and system reliability. Thus, it is imperative to study eclipse attack detection methods in order to effectively prevent attackers from destroying the hash rate balance and operational stability of the blockchain system.

We address three problems with existing eclipse attack detection in the blockchain network layer and make corresponding contributions. The three main problems are as follows: (i) the traffic characteristics of eclipse attacks are described incompletely; (ii) eclipse attack traffic is a minority class in the samples, which undermines the objectivity of the detection accuracy; and (iii) existing eclipse attack detection methods are insufficiently sensitive to the characteristics of attack traffic. Our primary contributions are summarized as follows:

(i) Eclipse attack traffic features are proposed in terms of downstream traffic behavior characteristics, traffic frequency distribution characteristics, and the structural distribution of traffic by extensively analyzing eclipse attack methods and their characteristics. In addition, behavior differences between the attack traffic and normal traffic are summarized. As a result, the deep learning model is given a thorough description of the features that distinguish normal traffic from eclipse attack traffic and can efficiently identify the essential differences of eclipse attack traffic.

(ii) Eclipse attack traffic is a minority class traffic sample with uneven distribution, so it may be ignored in the training process, which would affect the objectivity of the classification result and reduce accuracy. Thus, the ISMOTE up-sampling algorithm is designed. The ISMOTE algorithm calculates the local density of a cluster in the up-sampling process, and it adjusts the sampling weight of the minority class sample cluster according to the density change. As a result, the minority class samples can be clustered automatically to support efficient up-sampling. In addition, it can contribute to the reduction of noise in the data and the elimination of redundant features to comprehensively enhance the model’s ability to detect eclipse attack traffic samples.

(iii) A CNN/Bi-LSTM model that can comprehensively perceive the space and time series distribution features of the eclipse attack traffic is designed to address the insufficient ability of existing methods to detect dynamically changing eclipse attack traffic and their incomplete characterization of eclipse attack traffic features. In addition, the two types of generated features are fused by introducing the multihead attention mechanism to mine the time-space relationship between local features and traffic context features. As a result, a highly correlated and complementary combination of features can be formed to support effective classification.

2. Materials and Methods

Heilman et al. [11] first proposed the eclipse attack method in the P2P network of Bitcoin and highlighted the potential risks associated with such an attack. Attackers can control the access of the attacked target to other nodes by covering the route of the attacked node based on the vulnerability of the blockchain consensus mechanism in order to prevent the attacked node from checking the correct ledger view. In addition, complex attacks can be launched against the computing resources of an attacked node, which can cripple the stability of the computing power distribution by affecting the victim’s mining ability and the uplink result of the blockchain. According to [10], attackers can also launch an eclipse attack by exploiting a vulnerability in the Ethereum smart contract when the ledger information received by an Ethereum node is inconsistent. In addition, Yuval Marcus et al. [12] proved experimentally that the eclipse attack can directly affect the security of each Ethereum node and destroy the stability of the blockchain system.

Eclipse attack detection methods in the blockchain have been investigated in previous studies; however, such methods have some disadvantages. Currently, there are two types of eclipse attack detection methods. The first involves eclipse detection based on routing topology perception. Eclipse attackers can send continuous connection requests to the attacked target to occupy the routing table of the node because there is redundant storage in the flow tables of network nodes with Chord and Kademlia structures [13, 14]. In this case, detectors can analyze and model the routing structure according to various parameters, e.g., the topology structure of the blockchain network layer and the routing table forwarding state of key nodes, such that detecting changes in the routing state can help identify the occurrence of an eclipse attack [15]. Generally, such a detection method yields highly reliable eclipse attack detection results, with a high reference value for mining structural vulnerabilities in the blockchain network layer [16]. However, its detection model is complex, which leads to weak generalization and extensibility. As a result, the model falls short in real-time detection. In addition, it cannot adapt to the dynamically changing blockchain network layer traffic environment [17] because it perceives dynamic, nonfixed eclipse attack paths poorly. The second type of eclipse attack detection method is based on link traffic state analysis. Eclipse attackers must send large volumes of malicious routing traffic to the target nodes and subnets to realize routing shielding and coverage because the eclipse attack is intended to destroy the routing structure. The attack traffic behavior therefore follows specific distribution rules [18]. In that case, such a method can mine the core feature indicators of eclipse attacks by capturing the traffic in the blockchain network layer and analyzing the real-time traffic status. In addition, it can construct statistical or machine learning models to determine the occurrence of an eclipse attack [19, 20]. A detection method of this type, which features strong real-time detection and model generalizability, can monitor and identify eclipse attack traffic dynamically in line with the degree of change in the traffic features. Nonetheless, existing detection methods perceive dynamic multipath eclipse attacks poorly when a route construction request is concealed by disguising traffic to the blockchain subnet route because attackers may not attack the blockchain subnet directly. Thus, deficiencies can still be found in the ability to dynamically perceive the attack traffic features (Lu et al., 2018). In addition, eclipse attack traffic is very similar to normal traffic in its behavior features, and the core feature difference is insignificant; consequently, the performance of existing methods, such as the detection accuracy of the attack, remains to be improved. At the same time, a large gap is evident between the detection ability and the security requirements of the blockchain network environment.

Thus, considering the reliability requirements of eclipse attack detection in routing topology perception methods, we propose a classification detection method based on a custom combination of features and deep learning. The method addresses the weak feature perception and inadequate detection performance caused by, for example, uneven sample distribution, incomplete definitions of discriminating features, and difficult feature perception. By efficiently detecting and isolating eclipse attack traffic, the proposed method can protect blockchain network layer routers and miner nodes from eclipse attacks.

3. Analysis of Eclipse Attack Means and Characteristics

By targeting the systemic weaknesses of the blockchain consensus mechanism, attackers can launch eclipse attacks by controlling a large number of IP addresses to monopolize all connections with victim blockchain network nodes and isolate the accounting results of the attacked nodes from the main network, which wastes computing resources and destroys the stability of the computing resource distribution and the blockchain’s system security. In their review of theoretical research and experiments, Xu et al. [19] identified two main eclipse attack methods in the blockchain network: (i) malicious ADDR building connections and (ii) poisoning the core routing table. In the first case, i.e., malicious ADDR building connections, the attacker must continuously send connection requests for ADDR construction to the attacked node while attempting to launch an eclipse attack on a legitimate node. Here, a vast number of illegal connection requests from attackers are sent to occupy the routing table of attacked nodes. As a result, a legitimate node cannot receive the correct ledger result from other nodes or send the blockchain ledger information to other nodes, thereby destroying the routing structure of the blockchain network node group (Tran et al., 2018). In the second case, i.e., poisoning the core routing table, attackers can send a tampered node identifier set to the core routing node through the DNS seed node after forging a ping command. Here, the identifiers can fill the target’s routing table, thereby poisoning the subnet and its nodes and destroying the subnet routing structure. Then, the attacked subnet can be isolated and shielded from the main network nodes [21, 22]. The eclipse attack process is illustrated in Figure 1.

The eclipse attack in the blockchain network exhibits both the general behavior characteristics of traditional attack traffic and traffic behavior characteristics specific to the blockchain operation mechanism [23], which are summarized as follows:

(i) Multiple stages. Generally, eclipse attackers launch attacks using multistage path changes. Here, the attacker sends attack traffic to the target blockchain subnet router, and then, the router generates an erroneous ping command and forwards it to the target subnet. Finally, routing coverage can be realized by forwarding through the internal routing of the target subnet. Thus, the eclipse attack can target nodes in the blockchain network layer without connecting directly to the target subnet.

(ii) Diversity. A previous study [24] experimentally demonstrated that all attack forms that can generate large quantities of ping-pong information can be used to launch eclipse attacks on the target subnet. Thus, eclipse attackers can generate high volumes of attack traffic and force the creation of ADDR command connections without exposing the source address traffic via various attack means, e.g., NTP amplification, SSDP amplification, and network flooding, which results in a destructive attack effect on the blockchain subnet.

(iii) Many-to-one mapping. Eclipse attackers can attack the target subnet from various attack sources; thus, the source addresses in eclipse attack traffic form a many-to-one mapping to the destination address. Conversely, there are many-to-one, one-to-one, and one-to-many mapping forms in a typical network. In addition, legitimate users typically request a single service within a given period, whereas attackers typically request as many services as possible to consume the target subnet node resources quickly. Thus, a many-to-one mapping relationship can be observed between the destination port and destination address, which causes normal traffic to differ from eclipse attack traffic in the traffic structural characteristics, e.g., packet length.

The attack traffic presents statistical features of a centralized distribution and abnormal probability distribution according to the above behavior characteristics of the eclipse attack traffic. To improve detection sensitivity, a corresponding algorithm should be designed to describe and amplify the difference between the normal and eclipse traffic such that the model’s ability to detect attack traffic is improved.

4. Definition and Design of Eclipse Attack Traffic Features

Sample combination feature fields are set to describe the characteristics of attack traffic because the deep learning model learns blockchain traffic samples to detect the core features of eclipse attack traffic from blockchain network layer traffic. Here, we define eclipse traffic features on three levels. First, the conventional traffic feature fields, which can be extracted directly from the blockchain network layer traffic, are considered the input vector to perceive the downstream traffic behavior characteristics of eclipse attack traffic. Second, the traffic feature quantity based on φ-entropy divergence is introduced as a part of the input feature fields according to the eclipse attack mode and characteristics together with the consensus mechanism operation model of the blockchain such that the statistical frequency features of the eclipse traffic can be mined. Third, the traffic structural characteristic fields are constructed from the perspectives of traffic frequency distribution and traffic data load to improve the detection accuracy and avoid misclassifying normal burst traffic in the blockchain network as eclipse attack traffic. As a result, the structural characteristics of eclipse attack traffic can be expressed in a comprehensive manner, which improves the accuracy of eclipse attack detection.

4.1. General Traffic Feature Fields

Generally, eclipse attackers send downstream attack traffic to the target subnet in the blockchain network layer; thus, the statistical features of the corresponding flows are selected from multiple perspectives, e.g., the data packets and session flows of the traffic data in the blockchain network layer. Such features, which can be obtained in the detection window directly and automatically, can intuitively reflect the statistical features of data packets and session flows in the data traffic of eclipse attacks [25, 26]. The conventional traffic feature fields used by the model are listed in Table 1.

4.2. Statistical Feature Field Based on φ-Entropy Divergence Value

The attack traffic has the statistical features of centralized distribution and high probability distribution due to the typical many-to-one traffic behavior of the eclipse attack traffic. Events with high probability have a great impact on the entropy model algorithm; thus, better detection sensitivity can be obtained to effectively distinguish eclipse attack traffic from normal traffic. φ-entropy detection is more accurate than traditional Shannon entropy detection because it can effectively amplify the statistical difference between normal traffic and eclipse attack traffic, support the differentiation of various sample distributions, and effectively perceive abnormal events with high probability. In addition, the traffic status of the blockchain network layer varies in real time, and the burst traffic of legitimate nodes may cause network jitter in the blockchain network layer. A static threshold based on information entropy cannot describe the dynamic process of the eclipse attack completely, which may cause a high false-positive rate or a high false-negative rate in the detection process.

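Bhatia et al. [27] proposed an information measurement theory based on φ-entropy by improving the existing Shannon entropy algorithm. Then, Behal et al. [28] introduced the φ-entropy algorithm to the abnormal traffic detection task, which is denoted as

$$H_\alpha(x) = -\frac{1}{\sinh(\alpha)} \sum_{i=1}^{n} p(x_i) \sinh\left(\alpha \log_2 p(x_i)\right)$$

where $p(x_i)$ represents the probability of event $x_i$, and parameter $\alpha$ is used to adjust the sensitivity of the event frequency, where $\alpha > 0$. Compared to the conditional entropy of the classical Shannon entropy, φ-entropy divergence can be expressed as follows:

$$D_\alpha(x|y) = \frac{1}{\sinh(\alpha)} \sum_{i=1}^{n} p(x_i) \sinh\left(\alpha \log_2 \frac{p(x_i)}{p(y_i)}\right)$$

where $p(y_i)$ is the probability of event $y_i$.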
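The following is a minimal Python sketch of how a φ-entropy and a φ-entropy divergence could be computed over one detection window. It assumes the hyperbolic-sine form H_α(P) = -(1/sinh α) Σ p_i sinh(α log2 p_i) and D_α(P‖Q) = (1/sinh α) Σ p_i sinh(α log2(p_i/q_i)); the function names, the smoothing constant `eps`, and the example windows are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def distribution(values):
    """Empirical probability of each distinct value observed in a window."""
    counts = Counter(values)
    total = sum(counts.values())
    return {value: count / total for value, count in counts.items()}


def phi_entropy(p, alpha=1.0):
    """Assumed form: -(1/sinh(alpha)) * sum_i p_i * sinh(alpha * log2(p_i))."""
    total = sum(pi * math.sinh(alpha * math.log2(pi)) for pi in p.values() if pi > 0)
    return -total / math.sinh(alpha)


def phi_divergence(p, q, alpha=1.0, eps=1e-9):
    """Assumed form: (1/sinh(alpha)) * sum_i p_i * sinh(alpha * log2(p_i / q_i)).

    eps (hypothetical) stands in for events that never occur in q.
    """
    total = 0.0
    for event, pi in p.items():
        qi = q.get(event, eps)
        total += pi * math.sinh(alpha * math.log2(pi / qi))
    return total / math.sinh(alpha)


# Hypothetical windows of destination addresses: a spread-out baseline
# and a window in which most packets converge on a single node.
baseline = distribution(["n1", "n2", "n3", "n4"] * 5)
suspect = distribution(["n1"] * 27 + ["n2", "n3", "n4"])
score = phi_divergence(suspect, baseline, alpha=1.0)
```

As α approaches 0, sinh(αx)/sinh(α) approaches x, so this assumed form reduces to the Shannon entropy; larger α increases the weight given to how far individual event frequencies deviate.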

To strengthen the objectivity and sufficiency with which the detection model selects statistical fields, the φ-entropy divergence values are proposed as a part of the input feature vector of the deep learning model. These values extract class-distinguishing features of the eclipse attack traffic by analyzing the data packet headers in combination with the eclipse attack characteristics.

(i) D_α(src_ip|dest_ip) is the φ-entropy divergence of the source IP concerning the destination IP. When the eclipse attack traffic sends ADDR construction service requests to the target node via a mass of hosts, its host IP addresses have an identifiable many-to-one mapping relationship with the destination host IP address according to the characteristics of the eclipse attack traffic. In contrast, many-to-one, one-to-many, and one-to-one mapping relationships coexist in a typical network traffic environment, so the attack leads to a significant increase in the statistical value of D_α(src_ip|dest_ip).

(ii) D_α(dest_port|dest_ip) is the φ-entropy divergence of the destination port concerning the destination IP. Generally, legitimate miner nodes have small changes in request services within adjacent time windows in blockchain network layer traffic. However, the eclipse attack host typically sends as many service requests as possible to the target miner node. As the many-to-one mapping relationship between destination ports is more significant than that of the destination address, D_α(dest_port|dest_ip) is adopted to describe the corresponding mapping relationship.

(iii) D_α(src_ip|dest_port) is the φ-entropy divergence of the source IP concerning the destination port. D_α(src_ip|dest_port) is adopted to describe the mapping relationship because the many-to-one mapping relationship between the source IP address and the destination port is more significant than that of destination addresses when numerous hosts send ADDR construction service requests to specific ports to populate the routing table in the eclipse attack process.

(iv) D_α(pkt_size|dest_ip) is the φ-entropy divergence of the traffic packet size concerning the destination IP address. Typically, the data packets sent to the target miner node in normal traffic are irregular in size in the blockchain network layer. However, eclipse attack packets are nearly fixed in size because of the limitations of attack tools and because the data packets sent by the request command are relatively consistent. Thus, the value of D_α(pkt_size|dest_ip) for eclipse attack traffic is reduced dramatically compared to that of normal traffic in the blockchain network layer.

4.3. Structural Feature Fields of Traffic

To counter the strong concealment of eclipse attack traffic in the blockchain network layer, two structural features of traffic, i.e., the rate of change in the traffic communication features and the rate of change in the traffic load features, are defined in conjunction with the eclipse attack process and the operating characteristics of the consensus mechanism. The corresponding 10-dimensional attack features are used as part of the deep learning model’s input feature vector.

4.3.1. Rate of Change in Traffic Communication Feature

First, eclipse attackers continuously send many attack data packets to the target subnet in flooding form to populate the route of the target subnet node. In this case, the eclipse attack traffic and normal network traffic can be distinguished according to the change in the average number of data packets in a given period. As a result, the behavior characteristics are depicted by defining two statistical characteristics, i.e., the average number of data packets and the maximum number of data packets in a unit period:

$$\bar{N} = \frac{1}{n} \sum_{i=1}^{n} N_i$$

$$N_{\max} = \max_{1 \le i \le n} N_i$$

where $N_i$ expresses the number of data packets collected by the detection node from the traffic in the blockchain network layer in the $i$th collection, $i$ is the statistical index, and $n$ is the total number of collections in the statistical window.

Second, the eclipse attack traffic generated by attack tools, e.g., scripts, is periodic with small changes in adjacent periods. In contrast, typical network traffic has an irregular rate of traffic sending with large differences in the changes in packet sending rate in adjacent periods due to the various states of nodes. The eclipse attack traffic and normal traffic in the blockchain network layer can be distinguished according to the rate of packet frequency change in adjacent periods. Thus, the statistical deviation of a packet is defined to reflect the degree of the rate of change, and the average absolute value offset of the packet is proposed to reflect the degree of change of the sending rate of data packets per unit period.

The statistical deviation of a packet is expressed as follows:

The average offset of the absolute packet value is expressed as follows:

Finally, the eclipse attacker overwrites the routing table of the target nodes by sending maliciously constructed routing traffic for a long period, which may exhaust the storage space of the flow table, thereby preventing normal routes from being written to the routing table. In addition, the sender of normal traffic may cancel the connection after a short period when miner nodes in the blockchain network layer fail to give a timely pong response to the sender. Conversely, eclipse attackers prefer to wait for the reply of the attacked node rather than disconnecting the network connection. In this case, the survival time of the attack traffic is increased under the eclipse attack and exceeds both the survival time of normal blockchain network layer traffic and the idle timeout period of the subnet routers. In contrast, normal blockchain network layer traffic has a short survival period due to different traffic loads. Evidently, the eclipse attack traffic can be distinguished from normal traffic in terms of traffic survival time; thus, in the proposed method, behavior characteristics are described using the average survival time and activity degree of data traffic.

The average survival time of data traffic is expressed as follows:

The traffic activity level is expressed as follows: where is the survival duration of a single data stream in the blockchain network layer within the statistics window and is the idle timeout period of routing in the routing table.

Attackers may send aperiodic eclipse attack traffic during the attack; thus, a small aperiodic change of such traffic is assumed under the premise that the effects of the eclipse attack target and attack features remain unchanged. In this case, the features of the aperiodic attack can be considered to be similar to that of periodic eclipse attack traffic. Regarding the rate of change in the traffic communication feature, equally, the attack traffic differs significantly from traffic in the normal blockchain network layer.
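The six communication features of this subsection can be sketched in Python as follows. This is an illustrative sketch under stated assumptions rather than the exact definitions used by the model: the deviation is taken as the population standard deviation, the average absolute offset as the mean absolute deviation from the window mean, and the activity level as the share of flows that outlive the idle timeout. The function and key names are hypothetical.

```python
import math


def communication_features(packet_counts, flow_durations, idle_timeout):
    """Summarize one statistics window.

    packet_counts: packets observed in each unit period of the window
    flow_durations: survival time of each flow seen in the window (seconds)
    idle_timeout: idle timeout of routing table entries (seconds)
    """
    n = len(packet_counts)
    mean_count = sum(packet_counts) / n
    max_count = max(packet_counts)
    # Assumption: population standard deviation of the per-period counts.
    deviation = math.sqrt(sum((c - mean_count) ** 2 for c in packet_counts) / n)
    # Assumption: mean absolute deviation from the window mean.
    abs_offset = sum(abs(c - mean_count) for c in packet_counts) / n
    mean_survival = sum(flow_durations) / len(flow_durations)
    # Assumption: fraction of flows that survive longer than the idle timeout.
    activity = sum(1 for d in flow_durations if d > idle_timeout) / len(flow_durations)
    return {
        "mean_count": mean_count,
        "max_count": max_count,
        "deviation": deviation,
        "abs_offset": abs_offset,
        "mean_survival": mean_survival,
        "activity": activity,
    }


# Hypothetical windows: script-generated traffic is steady and long-lived,
# whereas ordinary traffic is bursty and short-lived.
scripted = communication_features([200, 201, 199, 200], [40.0, 42.0, 41.0], idle_timeout=30.0)
ordinary = communication_features([12, 90, 3, 45], [2.0, 35.0, 4.0], idle_timeout=30.0)
```

Under these assumptions, a scripted window yields a small deviation and a high activity level, which matches the qualitative contrast drawn in the text.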

4.3.2. Rate of Change in Traffic Load Features

The eclipse attack traffic in the blockchain network layer traffic is less affected by external intervention because it is generated by attack tools, e.g., scripts. Thus, the packet load of eclipse attack traffic in a unit period is changed slightly, particularly in terms of the packet length. In contrast, the normal blockchain network layer traffic presents no obvious regularity in the change of packet length in a unit period because it comprises various types of protocol traffic with large differences in packets and loads.

In addition, the packet load of eclipse attack traffic is dominated by business information, e.g., routing construction commands, because eclipse attackers must consider the attack cost and traffic complexity together with the equivalent operation mechanism between nodes in the blockchain system. As a result, the load information of eclipse traffic is short in length. In contrast, the data packet load in normal blockchain network layer traffic is dominated by network service information, which leads to a significant difference in the loads of various data packets and to large payloads. Consequently, the packet length of normal blockchain network layer traffic is much greater than that of the eclipse attack flow.

The average length of data packets, the maximum length of data packets, the deviation of the average length of data packets, and the mean difference of the absolute value of packet length are introduced to describe the change of the above-mentioned eclipse attack traffic load features as traffic features of the blockchain network layer.

The average length of a data packet is expressed as follows:

The maximum length of the data packet is expressed as follows:

The deviation of the average length of a data packet is expressed as follows:

Finally, the average offset of the absolute value of packet length is expressed as follows:

A 42-dimensional feature set is formed as the input feature of the deep learning detection model for model training by performing vector fusion on the conventional traffic feature fields, the statistical feature fields based on φ-entropy divergence, and the manually extracted structural feature fields. Note that the input feature dimension is uniformly denoted .
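As a hedged illustration of the load features and the vector fusion step, the Python sketch below computes four packet-length statistics and concatenates the three feature groups into one 42-dimensional vector. The split of 28 conventional, 4 φ-entropy divergence, and 10 structural fields is inferred by subtraction from the dimensions stated in the text and is an assumption; the deviation and offset definitions and all function names are likewise hypothetical.

```python
import math


def load_features(packet_lengths):
    """Four packet-length statistics for one statistics window."""
    n = len(packet_lengths)
    mean_len = sum(packet_lengths) / n
    max_len = max(packet_lengths)
    # Assumption: population standard deviation of the packet length.
    deviation = math.sqrt(sum((x - mean_len) ** 2 for x in packet_lengths) / n)
    # Assumption: mean absolute deviation from the mean packet length.
    abs_offset = sum(abs(x - mean_len) for x in packet_lengths) / n
    return [mean_len, max_len, deviation, abs_offset]


def fuse_features(conventional, entropy, communication, load):
    """Concatenate the feature groups and check the expected dimension."""
    vector = list(conventional) + list(entropy) + list(communication) + list(load)
    if len(vector) != 42:
        raise ValueError(f"expected 42 features, got {len(vector)}")
    return vector


# Placeholder values only; real values come from the capture pipeline.
conventional = [0.0] * 28   # assumed count of conventional fields
entropy = [0.0] * 4         # the four divergence values
communication = [0.0] * 6   # six communication features
load = load_features([64, 64, 64, 70])
sample = fuse_features(conventional, entropy, communication, load)
```

Checking the length at fusion time is a small design choice that surfaces a missing or duplicated field before the vector reaches the model.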

5. Eclipse Attack Traffic Detection Model Using the ISMOTE Algorithm and CNN/Bi-LSTM Multihead Attention Item

The proposed eclipse attack detection method adopts binary classification detection based on deep learning and comprises data preprocessing, sample data up-sampling, and attack detection. The classification detection model may tend to learn majority class data in the classification training process, thereby increasing the probability of erroneous classification of minority class samples. The eclipse attack traffic in the blockchain network layer is small in volume, so the data are unbalanced. The ISMOTE algorithm, which is based on the DP clustering algorithm, is utilized for up-sampling to improve the binary classification performance for minority class samples and to prevent the eclipse attack traffic from being selectively ignored by the model in the classification process. Then, features are extracted from the up-sampled samples using a CNN for spatial features and a Bi-LSTM for sequential relationships. In addition, combined features are output using a multihead attention mechanism. Finally, binary classification detection and reverse iteration are performed using the SoftMax classifier. The structure of the proposed detection model is shown in Figure 2.

5.1. Data Preprocessing

The data preprocessing involves data validation, symbolic attribute feature numericalization, and data normalization processes.

5.1.1. Data Validation

Some feature values are missing in the network traffic data collected in a real network environment; thus, features that take the same value for all samples are deleted, and missing values are filled with 0 to ensure the validity and integrity of the model’s input vector.

5.1.2. Symbolic Attributes’ Numericalization

Symbolic features are converted into numeric features using the attribute mapping method; e.g., feature values represented as hexadecimal values in the dataset are converted to decimal values. Nonnumeric address-type attribute values, which are numerous, are replaced by the frequency with which each value occurs in the entire dataset.
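The two conversions described above can be illustrated with a minimal Python sketch. It assumes that "frequency" means the relative frequency of a value in the dataset (a raw count would work the same way); the function names and the sample values are hypothetical and are not taken from the experimental dataset.

```python
from collections import Counter


def hex_to_decimal(value: str) -> int:
    """Convert a hexadecimal string such as '0x1f' or '1F' to a decimal integer."""
    return int(value, 16)


def frequency_encode(values):
    """Replace each nonnumeric value by its relative frequency in the dataset.

    Assumption: relative frequency (count / total) is used as the encoding.
    """
    counts = Counter(values)
    total = len(values)
    return [counts[v] / total for v in values]


# Hypothetical records: a hex-encoded field and address-type attribute values.
flags = ["0x1f", "0A", "ff"]
addresses = ["192.168.108.2", "192.168.108.3", "192.168.108.2", "192.168.13.7"]

print([hex_to_decimal(f) for f in flags])  # [31, 10, 255]
print(frequency_encode(addresses))         # [0.5, 0.25, 0.5, 0.25]
```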

5.1.3. Data Normalization

The value domains of the features in the dataset differ; thus, the numerical data are normalized to eliminate the effects of these differences. Here, all numerical traffic feature values are mapped to the range [0,1] so that the data attributes have the same order of magnitude. The normalization is expressed as
$$x_i' = \frac{x_i - x_{\min}}{x_{\max} - x_{\min}},$$
where $x_i'$ is the normalized result of the ith feature value $x_i$, and $x_{\min}$ and $x_{\max}$ are the minimum and maximum values of the attribute, respectively.
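As a small illustration of this min-max mapping, the following Python sketch normalizes one feature column. The function name and the sample packet lengths are hypothetical, and mapping a constant column to 0 is an assumption made here to avoid division by zero.

```python
def min_max_normalize(column):
    """Map the values of one feature column to the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # Assumption: a constant column carries no information and maps to 0.
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


# Hypothetical packet lengths in bytes.
packet_lengths = [60, 1500, 780, 60]
print(min_max_normalize(packet_lengths))  # [0.0, 1.0, 0.5, 0.0]
```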

5.2. ISMOTE Up-Sampling Algorithm

Because eclipse attack traffic is scarce, the traffic samples in the blockchain network layer environment are unevenly distributed, and the model tends to ignore low-frequency samples during training and classification, which reduces detection accuracy. Here, the ISMOTE algorithm is employed to up-sample the input traffic, which increases the number of low-frequency samples and mitigates the influence of noise on the samples. First, the ISMOTE algorithm uses the DP clustering algorithm [29] to define cluster centers and outliers according to the density peaks, which reduces the influence of noise on the clustering results for the various sample clusters. Then, the sampling weight of each minority sample cluster is increased according to its local density to prevent minority subclass samples from being assigned to a majority class cluster, thereby improving clustering accuracy. This process is detailed in Algorithm 1.

Input: dataset D to be up-sampled and sampling coefficient β.
Output: up-sampled dataset D’.
Start:
1: for each sample xi∈D do
2:     Obtain the corresponding subclass through clustering with the DP algorithm.
3:     if the sample size of the subclass is small:
4:         //Local density based on exponential kernel.
5:     else:
6:         //Local density of each sample point.
7:         //Calculate the neighbor between the sample point xi and the nearest clustering center.
8: //Separate the samples into majority-class samples DLG and minority-class samples DTN according to the clustering results.
9: //Count the number of samples with minority-class sampling flow G.
10: //Calculate the number of samples for each minority subclass; n is the number of minority subclass clusters; DTNi is the sample size of the ith subclass.
11: //Calculate the sampling weight of each sample in each minority subclass.
12: //Determine the up-sampling number of each sample for each minority class.
13: Each minority-class sample xi locates all neighbor samples in the subclass DTNi according to δi.
14: //Execute gi times of randomly selecting a neighbor sample xTNi for xi to synthesize a minority sample.
15: Generate up-sampled dataset D’.
End

In Algorithm 1, $d_{ij}$ is the Euclidean distance between samples $x_i$ and $x_j$, and $d_c$ is the cutoff distance of the samples, which is taken as the 1% quantile of the sample distances sorted in ascending order. Note that $\chi(\cdot)$ is a truncation function, which is expressed as follows:
$$\chi(x) = \begin{cases} 1, & x < 0, \\ 0, & x \ge 0. \end{cases}$$
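The density-peak quantities used in lines 4 to 7 of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration that assumes the standard DP clustering definitions: a cutoff-kernel local density that counts neighbors closer than the cutoff distance, an exponential-kernel density for small sample sets, and, for each point, the distance to the nearest point of higher density. The function names, the cutoff value, and the toy points are hypothetical.

```python
import math


def pairwise_distances(points):
    """Euclidean distance matrix for a list of equal-length tuples."""
    n = len(points)
    return [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]


def local_density(dist, d_c, kernel="cutoff"):
    """Local density of every point for a given cutoff distance d_c."""
    n = len(dist)
    rho = []
    for i in range(n):
        others = [dist[i][j] for j in range(n) if j != i]
        if kernel == "cutoff":
            # Truncation function: a neighbor counts only when d_ij < d_c.
            rho.append(sum(1 for d in others if d < d_c))
        else:
            # Exponential kernel, smoother for small sample sets.
            rho.append(sum(math.exp(-((d / d_c) ** 2)) for d in others))
    return rho


def nearest_higher_density_distance(dist, rho):
    """Distance from each point to the nearest point with higher density."""
    n = len(dist)
    delta = []
    for i in range(n):
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        # Convention for the densest points: use the largest distance.
        delta.append(min(higher) if higher else max(dist[i]))
    return delta


# Toy data: three close points and one outlier.
points = [(0, 0), (0, 1), (1, 0), (5, 5)]
dist = pairwise_distances(points)
rho = local_density(dist, d_c=1.5)
delta = nearest_higher_density_distance(dist, rho)
print(rho)  # [2, 2, 2, 0]
```

Points with both high density and a large distance to any denser point are candidates for cluster centers, whereas points with low density and a large distance behave like outliers.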

$\beta$ is the up-sampling coefficient, where $\beta \in [0, 1]$. When $\beta = 1$, the numbers of $D_{LG}$ and $D_{TN}$ samples in the up-sampled training set are balanced. In addition, the interpolation factor used to synthesize a minority sample (Algorithm 1, line 14) is a random number in the range (0,1). The number of samples synthesized for each subclass is set in inverse proportion to the number of traffic samples in that subclass (Algorithm 1, lines 9–10), which handles the uneven distribution within the sample class. A greater sampling weight is also assigned to boundary sample points to solve the boundary ambiguity and missing boundary sample problems in the up-sampling process and to support automatic clustering and efficient up-sampling of minority class samples. The ISMOTE up-sampling process is shown in Figure 3.
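The allocation and synthesis steps can be illustrated with a short Python sketch. It is an assumption-laden simplification, not the ISMOTE implementation: the total number of synthetic samples is taken as the class-size gap multiplied by the sampling coefficient, subclasses share that total in inverse proportion to their sizes, and each new sample is a SMOTE-style interpolation between a sample and one of its neighbors. The per-sample density weighting and the boundary weighting are omitted, and all names and numbers are hypothetical.

```python
import random


def allocate_by_inverse_size(subclass_sizes, total_new):
    """Split total_new synthetic samples across minority subclasses.

    Smaller subclasses receive a proportionally larger share. Rounding means
    the shares may not sum exactly to total_new in general.
    """
    inverse = [1.0 / size for size in subclass_sizes]
    norm = sum(inverse)
    return [round(total_new * w / norm) for w in inverse]


def synthesize(sample, neighbor, rng):
    """SMOTE-style interpolation between a sample and one of its neighbors."""
    r = rng.random()  # random factor in [0, 1)
    return [a + r * (b - a) for a, b in zip(sample, neighbor)]


# Hypothetical class sizes and sampling coefficient.
n_majority, n_minority, beta = 1000, 200, 1.0
total_new = int((n_majority - n_minority) * beta)

print(total_new)                                        # 800
print(allocate_by_inverse_size([150, 50], total_new))   # [200, 600]

rng = random.Random(0)
new_point = synthesize([0.0, 0.0], [1.0, 2.0], rng)
```

With a coefficient of 1, the minority class receives exactly as many synthetic samples as it is short of the majority class, so the two classes end up balanced.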

5.3. Multihead Attention Fusion Feature Based on CNN/Bi-LSTM

CNNs and LSTMs are typical deep learning algorithms that extract deep-level features at the spatial and time series levels, respectively. A CNN extracts features layer by layer in a hierarchical structure, which supports mining the deep-level, complex local features of samples and reconstructing the spatial dimensions of the sample data [30]. Bi-LSTM, which is an improved bidirectional RNN algorithm, perceives the time series features of samples by relating each element of the sample sequence to the attribute information that precedes and follows it. Eclipse attack traffic is characterized by multiple stages, diversity, and many-to-one mapping; thus, it is necessary to mine the local relationships between features at the spatial level and to measure the historical change pattern of samples at the sequential level. Therefore, a model that combines a CNN and Bi-LSTM is proposed for feature extraction, and the multihead attention mechanism is employed to fuse the features. In this way, the core features of eclipse attacks can be extracted from both the spatial and temporal domains, which improves detection accuracy and enhances feature perception. The sample vector obtained after preprocessing and ISMOTE up-sampling is taken as the model input vector.

5.3.1. CNN Structure

The CNN, which comprises multiple convolutional layers, pooling layers, and fully connected layers, performs local feature extraction and perception on the input data via convolution kernels. Each convolutional layer generates the feature matrix
$$C = f(W \otimes X + b),$$
where $X$ is the layer input, $f$ is the activation function (the ReLU function is employed here), $W$ is the convolution kernel of the layer, and $b$ is the offset of the layer.

Then, a local max down-sampling operation is performed on the feature matrix $C$ using a MaxPooling-based pooling layer. As a result, the output feature of the pooling layer is generated as follows:
$$P = \mathrm{MaxPooling}(C).$$

The feature vector of the serialized structure is generated through the fully connected operation on the output feature of the pooling layer as follows:
$$V = \mathrm{FC}(P).$$
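To make the convolution, activation, and pooling steps concrete, the following Python sketch applies a one-dimensional kernel with ReLU and then max pooling to a short feature vector. It is only an illustration: the kernel, the bias, and the input values are hypothetical, a single channel is used, and the sliding product is the cross-correlation form commonly used by deep learning frameworks.

```python
def conv1d_relu(x, kernel, bias):
    """Slide the kernel over x, add the bias, and apply ReLU."""
    k = len(kernel)
    out = []
    for i in range(len(x) - k + 1):
        s = sum(x[i + j] * kernel[j] for j in range(k)) + bias
        out.append(max(0.0, s))
    return out


def max_pool1d(x, size):
    """Non-overlapping max pooling with window and stride equal to size."""
    return [max(x[i:i + size]) for i in range(0, len(x) - size + 1, size)]


# Hypothetical feature vector and a difference-detecting kernel.
features = [1.0, 9.0, 2.0, 8.0, 3.0, 7.0]
conv = conv1d_relu(features, kernel=[1.0, -1.0], bias=0.0)
pooled = max_pool1d(conv, size=2)

print(conv)    # [0.0, 7.0, 0.0, 5.0, 0.0]
print(pooled)  # [7.0, 5.0]
```

In a full model, the pooled output would be flattened and passed through a fully connected layer to obtain the serialized feature vector.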

5.3.2. Bi-LSTM Structure

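LSTM perceives the sequence features of samples, even when information increments are limited, by applying recurrent operations to the input information and performing multilevel selection of historical information through the gating mechanism in its memory cells [31]. The structure of an LSTM memory cell is shown in Figure 4. In the standard formulation, the functions in the LSTM structure are expressed as follows:
$$i_t = \sigma(W_i [h_{t-1}, x_t] + b_i),$$
$$f_t = \sigma(W_f [h_{t-1}, x_t] + b_f),$$
$$g_t = \tanh(W_g [h_{t-1}, x_t] + b_g),$$
$$o_t = \sigma(W_o [h_{t-1}, x_t] + b_o),$$
$$c_t = f_t \odot c_{t-1} + i_t \odot g_t,$$
$$h_t = o_t \odot \tanh(c_t),$$
where the LSTM cell incorporates the input gate $i_t$, forget gate $f_t$, candidate gate $g_t$, output gate $o_t$, and cell state $c_t$. Note that $W_i$, $W_f$, $W_g$, $W_o$, $b_i$, $b_f$, $b_g$, and $b_o$ are the weight matrices and offsets of the corresponding gates, $\sigma$ is the sigmoid function, and $\odot$ denotes element-wise multiplication. For the input sample sequence $x_t$, the sequence state $h_t$ is the output.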
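A single LSTM step can be written out in a few lines of Python. The sketch below assumes the standard LSTM gate equations and, for readability, uses a single hidden unit with scalar input and state; separate input and recurrent weights are used in place of one weight matrix on the concatenated vector, which is equivalent. The parameter layout and the names are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x_t, h_prev, c_prev, params):
    """One step of a single-unit LSTM cell (scalar input and state).

    params maps each gate name to a tuple (w_x, w_h, b): input weight,
    recurrent weight, and offset.
    """
    def gate(name, activation):
        w_x, w_h, b = params[name]
        return activation(w_x * x_t + w_h * h_prev + b)

    i_t = gate("i", sigmoid)    # input gate
    f_t = gate("f", sigmoid)    # forget gate
    g_t = gate("g", math.tanh)  # candidate cell state
    o_t = gate("o", sigmoid)    # output gate

    c_t = f_t * c_prev + i_t * g_t   # updated cell state
    h_t = o_t * math.tanh(c_t)       # sequence state (output)
    return h_t, c_t


# With all-zero parameters every sigmoid gate equals 0.5 and the candidate is 0.
zero_params = {name: (0.0, 0.0, 0.0) for name in ("i", "f", "g", "o")}
h_t, c_t = lstm_step(x_t=1.0, h_prev=0.0, c_prev=1.0, params=zero_params)
print(c_t)  # 0.5
```

A bidirectional variant runs the same step over the sequence once from the start and once from the end and then combines the two state sequences.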

An eclipse attack proceeds in multiple stages, and representing an attack traffic sample with a unidirectional LSTM, which only views the sequence in one direction, may miss contextual information; thus, Bi-LSTM is used to extract contextual sequential information.

Bi-LSTM forms the output vector after performing weight sharing and bidirectional superposition of the reverse and forward sequential features of and based on the structure of LSTM. By considering the influence of the content before and after the sample on the current state information, a comprehensive sample historical feature information can be generated, which is expressed as follows:

5.3.3. Multihead Attention Mechanism

Because feature information is sparse in eclipse attack samples, a simple CNN pooling dimension-reduction strategy may lose feature information. The multihead attention mechanism addresses this problem by calculating the similarities between attribute features so that the model, which learns a variety of core feature information from the traffic samples, assigns greater weight to important feature information and thereby improves feature extraction.

The historical feature information generated by the Bi-LSTM model represents the overall information of the eclipse attack samples because it supports a high-quality expression of the sequential features of the traffic samples, whereas the CNN effectively extracts the local spatial information of the samples. Thus, both features are integrated into the attention mechanism’s structure to improve classification accuracy. Here, the cosine similarity between the sample historical feature information and the local convolution feature is first calculated via soft addressing to describe the contribution of the historical feature information of each subitem to the local convolution feature. Then, the cosine similarity is normalized to obtain the weight of the historical feature information of each subitem. Finally, the multihead attention item is generated by weighting the historical feature information of each subsequence by its similarity value.

The cosine similarity is obtained as follows: where is the offset vector. Multiple attention items are obtained by applying weighted superposition to the input data through the parallel inquiry process of multiple soft addresses, which are recorded as follows:

Then, all generated attention items are concatenated and fused so that the model classifier can fully use the output feature content, which is expressed as follows: where is the number of times the multihead attention calculation is performed. The final classification output is generated by performing binary classification on the fused feature of the multihead attention mechanism layer with the SoftMax classifier.
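The attention fusion described in this subsection can be sketched in Python as follows. This is a simplified illustration under stated assumptions: each head is represented by one CNN feature vector that acts as the query, the learned projections and the offset vector are omitted, the similarities are normalized with a softmax, and the heads are fused by concatenation. The function names and the toy vectors are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def softmax(scores):
    """Normalize similarity scores into weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_item(history, conv_feature):
    """Weight each Bi-LSTM state by its similarity to one CNN feature."""
    weights = softmax([cosine(h, conv_feature) for h in history])
    dim = len(history[0])
    return [sum(w * h[d] for w, h in zip(weights, history)) for d in range(dim)]


def multihead_fusion(history, conv_features):
    """Compute one attention item per head and concatenate them."""
    fused = []
    for conv_feature in conv_features:
        fused.extend(attention_item(history, conv_feature))
    return fused


# Toy Bi-LSTM states (two time steps) and two CNN feature vectors (two heads).
history = [[1.0, 0.0], [0.0, 1.0]]
conv_features = [[1.0, 0.0], [0.0, 1.0]]
fused = multihead_fusion(history, conv_features)
print(len(fused))  # 4
```

The fused vector would then be passed to the SoftMax classifier for the final binary decision.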

In the model backpropagation training process, the combined term loss function is expressed as follows: where and represent cross-entropy loss and weighted cross-entropy loss, respectively; is the regularization term; is the number of sample tags; and are regulatory factors for loss weight importance; is the predicted class probability of the ith class; and is the concrete class of the ith class.
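One plausible reading of such a combined loss is a weighted sum of a cross-entropy term, a class-weighted cross-entropy term, and a regularization term. The Python sketch below follows that reading for a single sample. It is an assumption, not the exact loss of the proposed model: the L2 form of the regularizer, the default factor values, and all names are hypothetical.

```python
import math


def cross_entropy(probs, label, class_weights=None):
    """Negative log-likelihood of the true class, optionally class-weighted."""
    weight = 1.0 if class_weights is None else class_weights[label]
    return -weight * math.log(probs[label])


def combined_loss(probs, label, class_weights, params,
                  lam_ce=0.5, lam_wce=0.5, reg_strength=1e-4):
    """Weighted sum of plain and class-weighted cross-entropy plus L2 penalty."""
    ce = cross_entropy(probs, label)
    wce = cross_entropy(probs, label, class_weights)
    l2 = reg_strength * sum(p * p for p in params)
    return lam_ce * ce + lam_wce * wce + l2


# Hypothetical prediction for an attack sample (class 1); the minority class
# receives a larger weight so that its errors cost more.
loss = combined_loss(probs=[0.2, 0.8], label=1,
                     class_weights=[1.0, 3.0], params=[1.0, 2.0])
```

Giving the minority class a larger weight in the weighted term increases the penalty for misclassifying eclipse attack samples, which complements the up-sampling step.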

6. Experiments and Result Analysis

Experiments were performed in a real blockchain network environment to comprehensively evaluate the performance of the proposed method. First, the proposed model was trained to verify its effectiveness; then, the importance of the different features and the rationality of the proposed eclipse features were investigated.

In addition, the advantages and disadvantages of the proposed eclipse attack detection method were analyzed in comparison with existing detection methods and classic machine learning techniques.

6.1. Experimental Configuration and Evaluation Metrics

Model training and comparison tests were performed on a Windows 10 PC with an Intel Core i7-9750 CPU, 32 GB RAM, and an NVIDIA GeForce GTX 2060 GPU with 8 GB of memory. The proposed model was implemented using the PyTorch 1.5 deep learning library in an Anaconda Python environment with the JetBrains PyCharm 2020.3 software.

The detection performance of the proposed model was evaluated according to various metrics, e.g., accuracy (ACC), F1-score, G-means, and area under the curve (AUC), in terms of detection sensitivity of minority class samples and the detection accuracy of majority class samples in consideration of unbalanced data distribution.

Accuracy is the ratio of the number of correct predictions made by a classifier to the total number of input samples, and it measures the overall sample detection performance of a classification model. The F1-score combines precision and recall to evaluate classification results comprehensively. G-means reflects the classification performance on minority class samples. The AUC (area under the curve) intuitively reflects the performance of different classification detectors: the larger the AUC, the better the detection performance.

In these metrics, true positive (TP) and false negative (FN) denote attack samples that are classified correctly and incorrectly, respectively. In addition, true negative (TN) and false positive (FP) denote normal samples that are classified correctly and incorrectly, respectively. The AUC is the area under the receiver operating characteristic (ROC) curve, whose ordinate is the true positive rate (TPR) and whose abscissa is the false positive rate (FPR).

The AUC is calculated as
$$\mathrm{AUC} = \frac{1}{MN} \sum_{x^+ \in D^+} \sum_{x^- \in D^-} I\big(P(x^+) > P(x^-)\big),$$
where $M$ and $N$ are the numbers of positive and negative examples, $D^+$ and $D^-$ are the corresponding positive and negative example sets, $P(\cdot)$ is the predicted probability that a sample is positive, and $I(\cdot)$ is the indicator function, which is 1 when its argument is true and 0 otherwise. When performing detection on datasets with uneven distribution, large ACC, F1-score, G-Means, and AUC values indicate favorable detection performance.
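The four metrics can be computed from the confusion counts and the predicted scores as in the following Python sketch. It assumes the usual definitions: G-means as the geometric mean of the true positive rate and the true negative rate, and the AUC as the fraction of positive-negative pairs ranked correctly, with ties counted as one half (a common convention). The function names and the example counts are hypothetical and are not results from the experiments.

```python
import math


def confusion_metrics(tp, fp, tn, fn):
    """ACC, F1-score, G-means, TPR, and FPR from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    tpr = tp / (tp + fn)   # recall on attack samples
    tnr = tn / (tn + fp)   # recall on normal samples
    f1 = 2 * precision * tpr / (precision + tpr)
    g_means = math.sqrt(tpr * tnr)
    return {"acc": accuracy, "f1": f1, "g_means": g_means,
            "tpr": tpr, "fpr": fp / (fp + tn)}


def auc(pos_scores, neg_scores):
    """Fraction of (positive, negative) pairs in which the positive ranks higher."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # assumption: ties count as half
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical counts and scores for illustration only.
metrics = confusion_metrics(tp=90, fp=10, tn=80, fn=20)
area = auc(pos_scores=[0.9, 0.8, 0.4], neg_scores=[0.5, 0.3])
```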

To examine the rationality and importance of the proposed eclipse attack traffic features, the ImportanceFactor, which is the feature importance index of each subitem, is designed to describe the importance of each proposed feature i in the model’s output. It is expressed as follows: where is the training model, is the number of features in all samples, is the subset, is the sample set containing only , and and are regulatory factors of feature importance, where . The ImportanceFactor indicator combines the Shapley value [32] of each subitem feature with the degree of change in accuracy to comprehensively demonstrate the contribution of each feature item to model training.

6.2. Experimental Environment and Data

The experimental data were derived from traffic collected during the operation of a real blockchain system. The network traffic environment comprised a normal blockchain network layer traffic environment and an eclipse attack traffic environment. Following the detection environment adopted by Xu et al. [19], the blockchain network layer environment was modeled under a multitype traffic protocol combined with the Hyperledger operating environment. The experimental traffic data were real UDP and TCP traffic samples captured using Wireshark. In addition, the Ethereum Devp2p protocol dissector plug-in was added to Wireshark to interpret the load information of Ethereum packets. More than 154,000 normal traffic records and over 28,000 eclipse attack traffic records were collected for the experiment, which was performed using 10-fold cross-validation.

The experiment used Ethereum 2.0 transaction traffic and Hyperledger Fabric 1.4 operational traffic to simulate mixed traffic at the blockchain network layer. Specifically, the Ethereum 2.0 environment included 12 virtual hosts, and the Geth core program, which runs the transaction process between blockchain miners, was run on Ubuntu 18.04 with an IP network segment of 192.168.108.0/24. In addition, the Hyperledger Fabric 1.4 environment comprised eight virtual hosts, and the blockchain program was executed using Docker on Ubuntu 18.04 with an IP network segment of 192.168.106.0/24.

To simulate the eclipse attack scenario under real conditions, two methods were used to generate eclipse attack traffic for the experiments. (i) Vulnerabilities were mined according to network protocols using the Yersinia attack tool, which can deceive network neighbor nodes by forging specific protocol load information and data packets to destroy the network routing topology and thereby realize an eclipse attack. (ii) An attack script based on the Scapy library was written in Python 3.9. This script was used to periodically generate eclipse attack commands, e.g., ping, pong, findnode, neighbors, and ADDR construction commands, with an attack period of 1 s. The attack script continuously executed these commands against the target subnet nodes and routers to force the construction of incorrect routes and thereby shield the routes of the victim subnet nodes. The host IP network segment was set to 192.168.13.0/24 for both eclipse attack environments.

The topological environment simulated in this experiment is shown in Figure 5.

6.3. Model Training Results and Verification of Model Effectiveness

The model was trained in mini-batch mode. Real blockchain network layer traffic and eclipse attack traffic were sampled from the blockchain network layer environment and then preprocessed. Then, multiple new datasets were partitioned into training and test sets via random and independent sampling, with a 90%:10% ratio of training to test data. The training sets were used to train the model, with backpropagation for parameter tuning to obtain the optimum parameter set, and the test sets were used to verify the model performance. In addition, a bootstrap sampling iteration was performed after every 50 training rounds during batch training. Here, 5000 traffic samples were drawn in each iteration with an even split between normal traffic and eclipse attack traffic. The experiments were repeated independently on the datasets, with cross-validation performed between the test sets, to ensure that the experimental results were unbiased. The detection results on each dataset were averaged. The experimental parameters and their associated values are presented in Table 2. The experimental process is shown in Figure 6.
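The balanced bootstrap step can be sketched in Python as follows. This is a minimal illustration that assumes sampling with replacement and a label of 0 for normal traffic and 1 for eclipse attack traffic; the function name, the seed, and the placeholder records are hypothetical.

```python
import random


def balanced_bootstrap(normal, attack, size, rng):
    """Draw `size` labeled samples with replacement, half from each class."""
    half = size // 2
    batch = [(x, 0) for x in rng.choices(normal, k=half)]
    batch += [(x, 1) for x in rng.choices(attack, k=size - half)]
    rng.shuffle(batch)
    return batch


# Placeholder records standing in for preprocessed traffic feature vectors.
normal_records = list(range(100))
attack_records = list(range(100, 120))

rng = random.Random(42)
batch = balanced_bootstrap(normal_records, attack_records, size=5000, rng=rng)
print(len(batch))                          # 5000
print(sum(label for _, label in batch))    # 2500
```

Because each class contributes the same number of draws, the sampled batch stays balanced even though the underlying pools differ greatly in size.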

Model training and experimental comparisons were also performed with standalone CNN and Bi-LSTM models to evaluate the detection performance of the proposed model under equal experimental conditions. In this experiment, multiple rounds of iterative training were first performed on the training and test sets, and the changes in the accuracy of the proposed model were recorded. Then, the convergence of the proposed model was verified, and the convergence rates of the three models were compared. Finally, the accuracy of the proposed model was compared to that of the two classical algorithms.

Here, more than 1000 training and testing rounds were performed on the training and test sets, respectively. In addition, the model parameters were adjusted according to the accuracy change recorded in each round. The results are shown in Figure 7. As can be seen, the average accuracies obtained on the training and test sets were greater than 92.6% and 95.3%, respectively, after 500 rounds of model training. These results demonstrate that the proposed model achieves satisfactory training results and can be applied to experiments in an actual blockchain network layer environment.

The loss changes of the three models over different numbers of training rounds are shown in Figure 8. The experimental results suggest that, with a small number of training iterations, the loss value of the proposed model was high and its convergence was slow because its high complexity meant that it had not yet sufficiently learned the sample features. However, as the number of training iterations increased, its convergence accelerated and its loss value fell below that of the other two algorithms, which indicates that the proposed model converges well over multiple iterations. In addition, the feature perception ability was enhanced rapidly as training deepened. Ultimately, the loss value of the proposed model stayed below 0.05, which demonstrates that the proposed model converges after fully learning the sample features and has a strong ability to detect eclipse composite traffic samples.

Figure 9 shows how the detection accuracy of the proposed model, CNN, and Bi-LSTM changes with the number of iterations under equal experimental conditions. After ten training iterations, the average detection accuracies of the proposed detection method, the Bi-LSTM algorithm, and the CNN algorithm were 98.25%, 98.11%, and 97.89%, respectively. The proposed CNN/Bi-LSTM model outperformed the existing Bi-LSTM and CNN methods in terms of detection accuracy, which increased as the number of iterations increased. Thus, by combining the distribution features of space and time, the proposed model perceives and mines deep features of eclipse attack traffic at different levels and improves detection performance.

6.4. Verification of Rationality of Feature Selection

To verify the rationality of the proposed eclipse attack traffic features, we calculated the ImportanceFactor, i.e., the feature importance value, under various machine learning algorithms and visualized the results as a heat map to evaluate the importance of all feature attributes in eclipse attack detection. Here, classic machine learning models with strong feature perception abilities, i.e., SVM, C4.5 decision tree, and DNN, were selected for horizontal comparisons without loss of generality.

In Figure 10, the rows and columns represent the different detection methods and the ImportanceFactor values of the different eclipse feature items, respectively. Blue items represent weak feature importance, while red items indicate strong feature importance. It can be seen that the six algorithms compared in this paper, including SVM, C4.5, DNN, and Bi-LSTM, produce similar importance results for the various features, and the 38th, 39th, 40th, and 41st items have higher importance. Thus, the detection of eclipse attack traffic, which is a kind of downstream traffic, is sensitive to the frequency features calculated based on φ-entropy. In addition, the change rates of the traffic communication and load features describe the structural characteristics of eclipse attack traffic well, which verifies that the proposed features are reasonable.

To compare the φ-entropy divergence values of the feature vectors in eclipse attack traffic and normal traffic, the collected traffic was analyzed in 1000 statistical time windows, the averages of the four feature items were calculated, and the corresponding plots were drawn. When an eclipse attack occurred, the φ-entropy divergence values of the feature vectors changed significantly, and their average was much greater than that of normal traffic. In contrast, the φ-entropy divergence value of normal network layer traffic was small and changed insignificantly, as shown in Figure 11. Evidently, the φ-entropy divergence value discriminates such exceptional events well.

6.5. Comparison of Detection Performance

The proposed method was compared to the random forest algorithm [8] and the computer immune model on the same dataset in terms of the detection rate, TPR, F1-score, and AUC value. The results demonstrated that the proposed detection method outperformed the compared methods. As shown in Figure 12, the AUC value of the proposed method is 0.9853, which is greater than that of the random forest algorithm and the computer immune model and indicates that the proposed method has advantages over these existing detection methods.

When the accuracy rate, F1-score, and G-means are used as performance indicators, the proposed method is superior on all three, as can be seen from the figure. Combined with the results shown in Figure 13, we find that the proposed method outperformed the existing detection methods for two reasons. First, the proposed combined CNN/Bi-LSTM algorithm better perceives the space and time series information of the eclipse attack traffic features; thus, the multihead attention items generated by the trained model support its ability to classify attack features and effectively increase its ability to perceive the correlation between eclipse attack traffic and the spatial-temporal features. Second, the proposed features describe eclipse attacks better and, in conjunction with the ISMOTE algorithm, effectively promote the proposed model’s ability to mine and detect the features of samples.

The proposed method was compared experimentally to classical machine learning algorithms (CNN, Bi-LSTM, DNN, and SVM) in terms of G-means, accuracy, and detection time. The results are shown in Table 3.

As shown in Table 3, the proposed detection method obtains high F1-score and G-means values; thus, the proposed model satisfies the security requirements of the blockchain network layer. However, the detection accuracy of the proposed method was slightly less than that of the Bi-LSTM and SVM algorithms. We believe that this is because the ISMOTE algorithm reduces the sampling weight of majority class normal samples in the up-sampling process, which weakens the model’s ability to learn normal samples and lowers the overall classification accuracy. In addition, the computational time of the proposed method was significantly greater than that of the other algorithms, which indicates that the high computational complexity of the proposed model may limit classifier performance. Even though the proposed model required more time than the compared algorithms, it exhibited better detection performance because it can effectively detect and classify eclipse attack traffic. In summary, we believe that the proposed eclipse attack detection method can comprehensively improve the security of the blockchain network layer.

7. Conclusion

This paper has proposed an eclipse attack detection method based on a custom combination of features and deep learning to address the limitations of existing detection methods for eclipse attacks in the blockchain network layer. In the proposed method, eclipse attack features are first described from the perspectives of downstream traffic behavior characteristics, frequency distribution characteristics, and the change rates of the traffic communication and load features. Then, the ISMOTE algorithm is employed to perform efficient up-sampling to prevent eclipse traffic from being ignored by the classification model to address the problems associated with the uneven distribution of eclipse attack traffic. Finally, a model that combines the CNN and Bi-LSTM techniques with the multihead attention mechanism is employed to improve the classification detection performance by perceiving eclipse attack traffic features at different levels.

The proposed method was evaluated experimentally in a real blockchain network layer environment, and the results demonstrate that the proposed method exhibits significant improvement compared to existing detection methods and classical machine learning algorithms.

In the future, we plan to reduce the computational complexity of the proposed model to improve the real-time performance of eclipse attack detection in a complex blockchain network layer environment.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare no conflicts of interest.

Acknowledgments

This work was supported by the Open Fund Project of Information Assurance Technology Key Laboratory of China (Grant no. KJ-15-109) and the National Natural Science Foundation of China (Grant no. 61471344).