Abstract

When there are many suspected loss links, existing methods either assume that the links on paths with a higher pass rate do not drop packets or assume that the link shared by the most paths is a loss link, but these assumptions lack valid proof. To overcome these shortcomings, this paper proposes a network topology-aware link loss inference algorithm. A network model is established from the historical data of network operation and the characteristics of the network topology. A weighted relative entropy ranking method is proposed to quantify the suspected packet loss links in each independent subset. The packet loss rate of each packet loss link is then obtained as the unique solution of the simplified nonsingular matrix equation. Simulation experiments verify that the proposed algorithm achieves better results in terms of congested link determination and link loss rate estimation accuracy.

1. Introduction

A communication network contains multiple subsystems and devices. These systems and devices require different data transmission rates, so the characteristics of the transmission links are uneven. To meet the real-time requirements of some systems and equipment, a large number of communication services use UDP for data transmission. Because UDP and the transmission equipment are unreliable, data packets are easily lost. In addition, UDP lacks congestion control, so link congestion, especially under high network load, easily degrades the transmission performance of the communication network and affects the quality of communication services. Therefore, obtaining the packet loss rate of communication network links in time helps network maintenance personnel quickly discover hidden dangers in the network, restore the transmission rate of network links, and ensure the quality of service of network transmission. Network tomography (NT) uses only end-to-end (E2E) measurements made by network end nodes, so the performance indicators of a link can be deduced without directly accessing the internal nodes of the network. Because it does not involve user privacy, NT has become the main research method for link loss rate inference [1–6].

At present, link loss inference algorithms can be divided into the following three categories: (1) algorithms for the multicast network environment [7–10]; (2) algorithms for the unicast environment that obtain unknown conditions through multiple probes [11–14]; and (3) algorithms for the unicast environment that make assumptions about some unknown conditions [15–19]. Although previous studies have produced many results, the existing methods rest on the following basic assumptions: when there are many suspected packet loss links, either the links on paths with a higher pass rate are assumed not to drop packets, or the link shared by the most paths is assumed to be a packet loss link. These assumptions lack valid proof and are inaccurate in most cases, especially for large communication networks.

To improve the credibility of the assumptions, this paper makes full use of network topology knowledge and historical data and proposes a network topology-aware link loss inference algorithm. Simulation experiments verify that the proposed algorithm achieves better results in terms of congested link determination and link loss rate estimation accuracy. The main contributions of this article include the following:

(1) The network characteristics related to link loss rate inference, such as the historical data of link congestion and network topology characteristics, are analyzed, and a network model is established.
(2) Based on the analysis of the network characteristics and the problems of link loss rate inference algorithms, a weighted relative entropy ranking method is proposed to quantify the suspected congested links in each independent subset.
(3) A network topology-aware link loss inference algorithm is proposed. Network topology simplification, packet loss link probability ranking, and nonsingular matrix solution are used to calculate the loss rate of the congested links.
(4) Simulation experiments verify that, compared with the existing algorithms, the algorithm in this paper achieves better results in terms of congested link determination and link loss rate estimation accuracy.

2. Related Work

Network measurement is an effective method to obtain network performance information. Network tomography [1], as an indirect network measurement technology, is regarded as an effective method and has received continuous attention. Network tomography has been widely used in fields including (but not limited to) detection path selection [2], traffic matrix estimation [3], network optimization [4], link feature inference [5], and congested link location [6]. In this paper, we mainly study the problem of inferring the link loss rate. Currently, link loss rate inference methods are divided into the following three categories: multicast measurement inference, unicast interactive inference, and unicast preselection inference.

(1) In terms of multicast measurement inference, there have been studies that use multicast detection to measure multiple end-to-end performance indicators. In [7, 8], multicast routing is used to send multicast probe packets, and the strong time correlation between packets is used to infer congested links. In [9, 10], multicast measurement is used to estimate the link loss rate from top to bottom in a tree network topology. However, this type of inference method can only be applied to networks that support multicast. If multicast must be emulated with unicast, the cost of deployment and calculation is high.

(2) In terms of unicast interactive inference, the values of unknown conditions are obtained through multiple probes in order to infer the link loss rate. Li et al. [11] address the problem that the real network environment keeps changing and propose a network monitoring system that can capture the location of packet loss and packet header information to help diagnose and mitigate these losses. Aiming at the problem that multiple probes add large overhead to the network, Yu et al. [12] proposed a method that can infer the link loss rate even when the network is uncertain, which effectively reduces the number of network probes. Aiming at the time dependence of link loss, Fan et al. [13] solved Boolean algebraic equations to calculate the packet loss threshold of each link once. To reduce the overhead of multiple detections, Qiao et al. [14] set an upper-limit threshold on the link loss rate based on the coverage relationship between a link and the paths that pass through it. This method is applied to the hidden link fault diagnosis algorithm. However, the above methods require multiple network detections and path loss rate calculations, which not only add extra load to the network but also often make the calculation process complicated.

(3) In terms of unicast preselection inference, a linear model relating path detection results, path parameters, and link parameters is constructed and used to infer the loss rate of the link. In [15], assuming that all links have the same congestion probability, a greedy heuristic link inference algorithm is proposed. This method iteratively selects the links that can explain the most failed paths as congested links. To relax the assumption that all links have the same congestion probability, Nguyen and Thiran [16] proposed an improved method for identifying congested links. This method takes the prior congestion probability of each link as a weight, which broadens the scope of application of the algorithm. Ghita et al. [17] proposed a link loss rate inference algorithm based on the correlation of congestion events on certain links in the network. Zarifzadeh et al. [18] proposed a method based on Boolean tomography and analog tomography to infer congested links. However, this inference assumes that the bottleneck link shared by the most paths is the link most likely to be congested. Qiao et al. [19] proposed an inference algorithm for the link loss rate. This algorithm divides the graph composed of links into multiple family sets from the bottom up according to the association relationship of the single connection tree formed by links and regards the link with the higher pass rate as a normal link. However, these studies infer the loss rate under the conditions that the probability of congestion on all links is equal and that the number of congested links is small.

From the existing research, it can be seen that the unicast preselection inference method places less additional load on the network and is more suitable for environments that require higher security and confidentiality. However, the existing methods rely on the assumption that the links on a path with a higher pass rate do not lose packets, or on the assumption that the link shared by the most paths is a loss link. These assumptions limit the scope of application of the algorithms. In this paper, a network topology-aware link loss inference algorithm for communication networks is proposed. By analyzing the characteristics of loss links and the correlation of loss events at different moments on a link, these characteristics are mined for loss link inference to improve the inference performance.

3. Problem Description

An undirected graph $G = (V, E)$ is used to represent the network topology, where $V$ represents the set of network terminals and routing nodes and $E$ represents the set of underlying links. The number of underlying nodes is $|V|$, and the number of underlying links is $|E|$. The set of E2E paths measured by active detection is denoted by $P$. The network topology in Figure 1 includes a total of 11 edges, 4 terminal nodes, and 6 internal nodes. In this topology, every datagram sent from host 3 traverses both e5 and e6, so the individual pass rates of e5 and e6 cannot be solved by inference algorithms. Therefore, this paper combines e5 and e6 into a virtual link e12. In the rest of this article, both virtual and physical links are referred to as links.

Figure 2 shows the routing matrix given in Table 1, which contains 6 rows and 10 columns. The number of rows equals the number of paths, and the number of columns equals the number of links. Let $m_{kj}$ be the element in row $k$ and column $j$ of the routing matrix $M$. When $m_{kj} = 1$, path $k$ includes link $j$; when $m_{kj} = 0$, path $k$ does not include link $j$.
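The routing matrix and the merging of links that always appear together can be illustrated with a minimal sketch. The link names, the three example paths, and both helper functions are hypothetical and are not taken from Figure 1 or Table 1. The sketch assumes that two links are indistinguishable exactly when their columns in the routing matrix are identical.

```python
def build_routing_matrix(paths, links):
    """Return M with M[k][j] = 1 when path k traverses link j, else 0."""
    return [[1 if link in path else 0 for link in links] for path in paths]


def merge_indistinguishable_links(matrix, links):
    """Merge links whose columns are identical into one virtual link."""
    groups = {}  # column pattern -> names of links sharing that pattern
    for j, name in enumerate(links):
        column = tuple(row[j] for row in matrix)
        groups.setdefault(column, []).append(name)
    merged_links = ["+".join(names) for names in groups.values()]
    # Transpose the distinct columns back into rows.
    merged_matrix = [list(row) for row in zip(*groups.keys())]
    return merged_matrix, merged_links


# Hypothetical example: e5 and e6 are always traversed together.
links = ["e1", "e2", "e5", "e6"]
paths = [{"e1", "e5", "e6"}, {"e2", "e5", "e6"}, {"e1", "e2"}]

M = build_routing_matrix(paths, links)
M_merged, merged_links = merge_indistinguishable_links(M, links)
```

In this toy case the columns of e5 and e6 coincide, so they collapse into a single virtual link, which mirrors the way e5 and e6 are combined into e12 in the text.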

Similar to [14, 19], the definition of loss rate given in this paper is as follows. The path loss rate is the fraction of detection packets that are not received by the target receiver of the detection. The link loss rate is the fraction of packets lost on the link over all the paths that pass through it. The sum of a link's pass rate and loss rate is 1. Let $\phi_{P_k}$ be the pass rate of path $P_k$, and let $\phi_{e_j}$ be the pass rate of link $e_j$.

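As in existing studies that use network tomography to infer link loss, the pass rate of a path is the product of the pass rates of the links on it. Writing $\phi_{P_k}$ for the pass rate of path $P_k$, $\phi_{e_j}$ for the pass rate of link $e_j$, and $m_{kj}$ for the element in row $k$ and column $j$ of the routing matrix $M$, the pass rate of a path is calculated as follows:

$$\phi_{P_k} = \prod_{j} \phi_{e_j}^{m_{kj}} \quad (1)$$

Taking the logarithm of both sides of formula (1) gives the following formula:

$$\log \phi_{P_k} = \sum_{j} m_{kj} \log \phi_{e_j} \quad (2)$$

Therefore, the set of equations formed for all paths and links can be expressed as follows:

$$\left(\log \phi_{P_1}, \log \phi_{P_2}, \ldots\right)^{T} = M \left(\log \phi_{e_1}, \log \phi_{e_2}, \ldots\right)^{T} \quad (3)$$

Let $Y$ be the vector of logarithmic path pass rates and $X$ be the vector of logarithmic link pass rates; then formula (4) is obtained. According to the existing literature [10], to reduce the impact of E2E detection on network performance, E2E paths generally contain many links, so the number of paths is less than the number of links and the matrix $M$ does not have full column rank. Therefore, formula (4) has no unique solution for $X$:

$$Y = MX \quad (4)$$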
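The log-linear model and its rank deficiency can be checked numerically with a small sketch. The two-path, three-link routing matrix and the link pass rates are hypothetical values chosen for illustration, and `matrix_rank` is a plain Gaussian elimination helper written for this example rather than part of the paper's method.

```python
import math

# Hypothetical toy network: 2 paths over 3 links.
routing = [
    [1, 1, 0],  # path 0 traverses links 0 and 1
    [1, 0, 1],  # path 1 traverses links 0 and 2
]
link_pass = [0.99, 0.90, 1.0]  # assumed link pass rates


def path_pass_rates(routing, link_pass):
    """Path pass rate = product of the pass rates of the links on the path."""
    return [
        math.prod(p for m, p in zip(row, link_pass) if m == 1)
        for row in routing
    ]


def matrix_rank(rows, eps=1e-12):
    """Rank of a small matrix via Gaussian elimination on a copy."""
    a = [list(map(float, r)) for r in rows]
    rank = 0
    for col in range(len(a[0]) if a else 0):
        pivot = next((r for r in range(rank, len(a)) if abs(a[r][col]) > eps), None)
        if pivot is None:
            continue
        a[rank], a[pivot] = a[pivot], a[rank]
        for r in range(len(a)):
            if r != rank and abs(a[r][col]) > eps:
                f = a[r][col] / a[rank][col]
                a[r] = [x - f * y for x, y in zip(a[r], a[rank])]
        rank += 1
    return rank


phi_paths = path_pass_rates(routing, link_pass)
y = [math.log(p) for p in phi_paths]   # log path pass rates
x = [math.log(p) for p in link_pass]   # log link pass rates
y_from_model = [sum(m * xj for m, xj in zip(row, x)) for row in routing]
rank = matrix_rank(routing)            # smaller than the number of links
```

Here `y` and `y_from_model` agree, while the rank (2) is smaller than the number of links (3), so the link pass rates cannot be recovered uniquely from the path measurements alone.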

4. Network Model

4.1. Historical Data of Link Congestion

The historical number of congestion events of a link in time period T can be scaled to the range [0, 1] with a simple min-max normalization method [20], as shown in the following formula:

$$x^{*} = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \quad (5)$$

This method scales the original data proportionally, where $x^{*}$ is the normalized data, $x$ is the original data, and $x_{\max}$ and $x_{\min}$ are the maximum and minimum values of the congestion times of the link in the time period T, respectively.
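A minimal sketch of min-max normalization applied to congestion counts follows. The counts are hypothetical, and mapping a constant series to 0 is an assumption made here to avoid division by zero, not something the paper specifies.

```python
def min_max_normalize(values):
    """Scale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant series carries no ranking information.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical congestion counts of four links in period T.
congestion_counts = [3, 10, 0, 7]
normalized = min_max_normalize(congestion_counts)
```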

4.2. Network Topology Characteristics

Considering that the degree of nodes in the network topology, the link relationships included in the path, and the average loss rate of the path are related to link congestion, these characteristics are described formally as follows:

(1) Degrees of nodes. Let $d_v$ be the degree of routing node $v$.

(2) The average packet loss rate of the paths containing the current link. Let $\bar{l}_j$ be the average packet loss rate of the paths containing the current link $e_j$. It is calculated using formula (6), where $n_j$ is the number of paths containing the current link, the sum runs over those paths, and $1 - \phi_{P_k}$ is the loss rate of path $P_k$:

$$\bar{l}_j = \frac{1}{n_j} \sum_{k:\, e_j \in P_k} \left(1 - \phi_{P_k}\right) \quad (6)$$

(3) Independence of the congested paths containing the current link. The independence of the paths containing the current link refers to the proportion of the current link among all the links that belong to the congested paths containing the current link. It is calculated using formula (7) from the number of links included in each path that contains the current link and the number of paths that contain the current link. The larger this value is, the fewer links belong to the congested paths containing the current link, and the greater the probability that the current link is a congested link:
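The two path-related characteristics can be sketched as follows. The routing matrix and path pass rates are hypothetical. The independence value is computed here as the number of congested paths through the link divided by the total number of links on those paths, which is only one plausible reading of the verbal definition above; treating a path as congested when its pass rate is below 1 is likewise an assumption.

```python
def paths_containing(routing, link):
    """Indices of the paths that traverse the given link."""
    return [k for k, row in enumerate(routing) if row[link] == 1]


def average_path_loss_rate(routing, path_pass, link):
    """Mean loss rate (1 - pass rate) over the paths containing the link."""
    ks = paths_containing(routing, link)
    if not ks:
        return 0.0
    return sum(1.0 - path_pass[k] for k in ks) / len(ks)


def path_independence(routing, path_pass, link):
    """Assumed form: congested paths through the link / links on those paths."""
    congested = [k for k in paths_containing(routing, link) if path_pass[k] < 1.0]
    total_links = sum(sum(routing[k]) for k in congested)
    return len(congested) / total_links if total_links else 0.0


# Hypothetical example: 3 paths over 4 links.
routing = [
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 1, 1],
]
path_pass = [0.9, 1.0, 0.8]

avg_loss_link1 = average_path_loss_rate(routing, path_pass, 1)
independence_link1 = path_independence(routing, path_pass, 1)
```

Link 1 lies on two congested paths with 2 and 3 links, so the assumed independence value is 2/5, and its average path loss rate is the mean of 0.1 and 0.2.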

5. Inference Algorithm

The network topology-aware link loss inference algorithm for communication networks includes the following three steps: (1) model simplification; (2) weighted relative entropy ranking of links; and (3) link loss rate inference. The details are introduced as follows.

5.1. Model Simplification

The routing matrix M includes k rows and j columns. The k rows represent k detection paths, and the j columns represent j network links. When the pass rate of a detection path is 1, the pass rates of all network links on that path are also 1. Based on this, the rows of the detection paths whose pass rate is 1 and the columns of the network links whose pass rate is 1 can be deleted. The simplified form of formula (4) is given by

$$Y' = M'X' \quad (8)$$

where $M'$, $Y'$, and $X'$ denote the routing matrix, the vector of logarithmic path pass rates, and the vector of logarithmic link pass rates that remain after the deletion.
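The simplification step can be sketched in a few lines. The routing matrix and pass rates are hypothetical, and the tolerance `tol` used to decide that a measured pass rate equals 1 is an assumption introduced for this example.

```python
def simplify(routing, path_pass, tol=1e-12):
    """Drop loss-free paths (rows) and every link on them (columns)."""
    good_rows = [k for k, p in enumerate(path_pass) if abs(p - 1.0) <= tol]
    good_links = {
        j for k in good_rows for j, m in enumerate(routing[k]) if m == 1
    }
    keep_rows = [k for k in range(len(routing)) if k not in good_rows]
    keep_cols = [j for j in range(len(routing[0])) if j not in good_links]
    reduced = [[routing[k][j] for j in keep_cols] for k in keep_rows]
    return reduced, keep_rows, keep_cols


# Hypothetical example: path 1 is loss-free, so links 0 and 2 are loss-free.
routing = [
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 1, 1],
]
path_pass = [0.9, 1.0, 0.8]

reduced, keep_rows, keep_cols = simplify(routing, path_pass)
```

Only paths 0 and 2 and links 1 and 3 remain, so the system that must be solved shrinks from 3 x 4 to 2 x 2.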

From [19], it is known that the modified Gauss–Jordan elimination method can transform the routing matrix into a block row echelon matrix in which, within each block, no row is part of any other row. Therefore, based on the modified Gauss–Jordan elimination method, this paper divides the routing matrix into multiple independent subsets, and each independent subset is represented by formula (9). Because no row in an independent subset is part of any other row and the pass rates of the links it contains are unknown, the link sequence of each row of each independent subset is called an independent link sequence in this paper:

5.2. Weighted Relative Entropy Ranking of Links
5.2.1. Analysis and Demonstration of Factors Related to the Link Loss Rates

It can be seen from the existing literature [18] that, when inferring link loss rates, most algorithms rely on factors such as the link packet loss rate, the path packet loss rate, and the relationship between paths and links. Based on these research results, this paper studies two further aspects. First, network operators have accumulated a large amount of network performance data and operation data during long-term operation. By fully using these data, the characteristics of links and paths can be understood more comprehensively. Second, analyzing the problem from the perspective of the network topology helps the algorithm examine the relationship between paths and links globally and thus measure the congestion status of paths and links comprehensively. Based on this, the factors related to the link loss rates used in this paper include the network operation data, the network topology data, and the relationship between paths and links. Each factor is explained as follows:

(1) The network operation data: long-term network operation data show that a link that has been congested more often has a higher probability of becoming congested again, as does a link that has been congested recently. These historical data and experience are helpful in the calculation of link congestion. Based on this, this article uses the link congestion number and the link congestion occurrence time to record the historical data of link congestion. Therefore, in terms of network operation data, this paper uses the historical data of link congestion as an indicator to measure link performance.

(2) The network topology data: a node with a higher degree is more likely to have an adjacent edge that acts as a parent link or as a bottleneck link shared by many paths. These links have a higher probability of becoming congested links. Therefore, in terms of network topology data, this paper uses the degrees of nodes as an indicator to measure link performance.

(3) The relationship between paths and links: the performance of the paths related to the current link is evaluated from the perspective of the relationship between all paths and links by analyzing the packet loss rate and the independence of the related paths. The packet loss rate of the related paths is measured by the average packet loss rate of the paths containing the current link, and the independence of the related paths is measured by the independence of the congested paths containing the current link.

5.2.2. The Rationality of Using TOPSIS for the Analysis of Multiple Factors

From the above analysis, we can see that factors related to the link loss rates include many aspects. If only one is used, it is easy to cause one-sidedness in the analysis results. In order to make full use of various related factors, this paper uses TOPSIS for the analysis of multiple factors.

TOPSIS (technique for order preference by similarity to an ideal solution) [21, 22] can use the multiattribute features of links to rank the packet loss probabilities of the links. The TOPSIS method uses the Euclidean distance between the attribute values of a link and the ideal point as the criterion and thereby ranks the link packet loss probability values. The TOPSIS theory and calculation process [21, 22] are applied to solve the problems in this paper.

5.2.3. Weighted Relative Entropy Ranking of Links

Assume that there are $n$ suspected packet loss links and that each link has $q$ attributes; then the j-th attribute value of the i-th suspected packet loss link can be represented by $x_{ij}$. Based on this, the TOPSIS decision matrix is given by

$$D = \left(x_{ij}\right)_{n \times q} \quad (10)$$

For the quantification of each link $e_i$, the attributes in the decision matrix are the sum of the degrees of the starting and ending nodes of the link, the prior probability of link congestion, the average packet loss rate of the paths through the link, and the independence of the congested paths through the link. Therefore, $x_{i1}$, $x_{i2}$, $x_{i3}$, and $x_{i4}$ hold these four values in that order, where $x_{i1} = d_u + d_v$ and $u$ and $v$ represent the nodes at both ends of link $e_i$:

In order to ensure the consistency of the attribute values, a standardized decision matrix is established by the following formula:

Based on network operation and maintenance experience, each attribute $j$ can be assigned a weight $w_j$, where $\sum_{j} w_j = 1$. Based on this, the weighted TOPSIS decision matrix is obtained as shown in the following formula:

$J^{+}$ is used to represent the benefit-type attribute set and $J^{-}$ to represent the cost-type attribute set. Then the attribute ideal point $A^{+}$ is calculated using formula (14), and the attribute negative ideal point $A^{-}$ is calculated using formula (15). The distances from each link's attribute values to $A^{+}$ and $A^{-}$ are calculated using formulas (16) and (17):

Considering that the Euclidean distance is not well suited to evaluating points that lie midway between the ideal point $A^{+}$ and the negative ideal point $A^{-}$, a relative entropy calculation based on entropy weight and TOPSIS [23–25] is adopted. It describes more conveniently the relationship between each point and $A^{+}$ and $A^{-}$ and is calculated using formulas (18) and (19). Formula (20) is used to calculate the congestion probability value of each suspected congested link. The larger the value, the greater the probability of congestion on the link:

The relative entropy of each link is solved and the links are arranged in descending order. Considering that the attributes of each suspected congested link are standardized to a value of (0, 1), for the convenience of calculation, the weight of the attribute indexes is set . The ranking method of weighted relative entropy links is shown in Table 2.
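The ranking procedure of this subsection can be sketched as follows. This is an illustration under several assumptions rather than the paper's exact formulas (12) to (20): the columns are vector-normalized, all four attributes are treated as benefit-type, the weights are equal, the relative entropy is taken as a sum of binary Kullback-Leibler terms, and the score is the usual TOPSIS closeness with entropies in place of distances. The attribute values are hypothetical.

```python
import math


def normalize_columns(matrix):
    """Vector-normalize each attribute column (one common TOPSIS choice)."""
    n_cols = len(matrix[0])
    norms = [
        math.sqrt(sum(row[j] ** 2 for row in matrix)) or 1.0
        for j in range(n_cols)
    ]
    return [[row[j] / norms[j] for j in range(n_cols)] for row in matrix]


def relative_entropy(target, row, eps=1e-9):
    """Sum of binary KL terms between a reference point and a link's values."""
    total = 0.0
    for a, v in zip(target, row):
        a = min(max(a, eps), 1.0 - eps)  # keep the logarithms finite
        v = min(max(v, eps), 1.0 - eps)
        total += a * math.log(a / v) + (1 - a) * math.log((1 - a) / (1 - v))
    return max(total, 0.0)


def rank_links(matrix, weights, benefit):
    """Return link indices in descending score order, plus the scores."""
    normalized = normalize_columns(matrix)
    weighted = [[w * x for w, x in zip(weights, row)] for row in normalized]
    columns = list(zip(*weighted))
    ideal = [max(c) if b else min(c) for c, b in zip(columns, benefit)]
    negative = [min(c) if b else max(c) for c, b in zip(columns, benefit)]
    scores = []
    for row in weighted:
        s_plus = relative_entropy(ideal, row)
        s_minus = relative_entropy(negative, row)
        denom = s_plus + s_minus
        scores.append(s_minus / denom if denom else 0.0)
    order = sorted(range(len(matrix)), key=lambda i: scores[i], reverse=True)
    return order, scores


# Hypothetical attributes per link: degree sum, prior congestion
# probability, average path loss rate, path independence.
attributes = [
    [6, 0.8, 0.12, 0.5],
    [3, 0.1, 0.02, 0.2],
    [4, 0.4, 0.06, 0.3],
]
weights = [0.25, 0.25, 0.25, 0.25]   # assumption: equal weights
benefit = [True, True, True, True]   # assumption: larger means more suspect

order, scores = rank_links(attributes, weights, benefit)
```

Link 0 dominates every attribute and therefore coincides with the ideal point, link 1 coincides with the negative ideal point, and link 2 falls in between, so the descending order is 0, 2, 1.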

5.3. Inference Link Loss Rate

The linear system obtained after model simplification is not of full rank and therefore has no unique solution. To obtain a unique solution, this paper selects an appropriate link set from the set of links so that the equations over the remaining link set form a full-rank system. When selecting the link set, the weighted relative entropy calculation method is used to compute the weighted relative entropy of the links, which are then arranged in descending order. The links are selected in turn and removed from the set of links until the system of equations is of full rank. At this time, by calculating the solution of formula (21), the pass rates of all links can be obtained:
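One possible reading of this step is sketched below, with every interpretive choice labeled as an assumption: links are visited in descending rank order and kept as unknowns while their columns stay linearly independent, all other links are assumed to be loss-free (pass rate 1), and the reduced system is solved through the normal equations. The solver is a generic stand-in written for this example, not the paper's formula (21), and the toy data are hypothetical.

```python
import math


def solve_square(a, b):
    """Gauss-Jordan elimination with partial pivoting; None if singular."""
    n = len(a)
    aug = [list(map(float, row)) + [float(bi)] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            return None
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                f = aug[r][col] / aug[col][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def least_squares(m, y):
    """Solve (M^T M) x = M^T y; None when the columns are dependent."""
    cols = list(zip(*m))
    ata = [[sum(p * q for p, q in zip(ci, cj)) for cj in cols] for ci in cols]
    aty = [sum(p * q for p, q in zip(ci, y)) for ci in cols]
    return solve_square(ata, aty)


def infer_pass_rates(routing, path_pass, ranked_links):
    """Keep top-ranked independent links as unknowns; others are set to 1."""
    y = [math.log(p) for p in path_pass]
    kept = []
    for j in ranked_links:
        trial = kept + [j]
        sub = [[row[c] for c in trial] for row in routing]
        if least_squares(sub, y) is not None:  # columns still independent
            kept = trial
        if len(kept) == len(routing):
            break
    sub = [[row[c] for c in kept] for row in routing]
    x = least_squares(sub, y)
    rates = [1.0] * len(routing[0])
    for c, xc in zip(kept, x):
        rates[c] = min(1.0, math.exp(xc))  # a pass rate cannot exceed 1
    return rates


# Hypothetical example: true link pass rates are [1.0, 0.9, 0.8].
routing = [[1, 1, 0], [1, 0, 1]]
path_pass = [0.9, 0.8]
ranked_links = [2, 1, 0]  # assumed output of the ranking step

rates = infer_pass_rates(routing, path_pass, ranked_links)
```

With this ranking the two suspicious links are kept, the shared link 0 is assumed loss-free, and the recovered rates match the values used to generate the path measurements. A different ranking would in general give a different answer, which is why the quality of the ranking matters.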

5.4. Algorithm Steps and Pseudocode

The algorithm steps and pseudocode are shown in Table 3. The algorithm includes three steps, namely, (1) model simplification, (2) sorting all suspected congested links, and (3) inferring the packet loss rate of the links. As can be seen from reference [18], links that lose packets are generally few. Therefore, the operation in substep (2) of the third step is reasonable.

6. Evaluation

6.1. Environment Setup

To analyze the influence of the network environment on the performance of the algorithm, as in existing research, the experiment uses the tool BRITE to generate Waxman and Barabasi–Albert network topologies [26]. In both networks, the number of network nodes is 500. The Waxman network topology is characterized by a small node degree, a large number of links included in each probe path, and a small number of shared links.

It can be seen from reference [27] that each algorithm has its applicable environment and that the performance of an algorithm differs under different constraints. Similar to [18], to simulate link congestion, every link in the network is congested with probability p. The value of p is varied to verify the performance of the algorithm under different levels of congestion. The value of p in the experiment ranges from 5% to 15%.

In terms of the packet loss model, based on the LLRD1 model [26], the congested link loss rate is set to follow the uniform distribution in the [0.05, 0.15] interval and the normal link loss rate to follow the uniform distribution in the [0, 0.002] interval. End-to-end detection technology is used to obtain the packet loss data of the path. In terms of the division of detection nodes, the nonleaf nodes of the network topology are used as routing nodes, and the leaf nodes of the network topology are used as terminal nodes. Each probe uses one terminal node to send 500 data packets to all other terminal nodes in the network.
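The loss model and probing described above can be sketched as a small simulation. The intervals, the congestion probability, and the 500 packets per probe come from the text; the independent per-packet, per-link Bernoulli loss, the fixed random seed, and the example path are assumptions made for illustration.

```python
import random


def draw_link_loss_rates(num_links, p, rng):
    """Mark each link congested with probability p and draw its loss rate."""
    rates, congested = [], []
    for _ in range(num_links):
        is_congested = rng.random() < p
        congested.append(is_congested)
        if is_congested:
            rates.append(rng.uniform(0.05, 0.15))   # congested link
        else:
            rates.append(rng.uniform(0.0, 0.002))   # normal link
    return rates, congested


def probe_path(link_ids, loss_rates, rng, packets=500):
    """Send probe packets along a path; return the measured pass rate."""
    received = 0
    for _ in range(packets):
        # Assumption: each link drops each packet independently.
        if all(rng.random() >= loss_rates[j] for j in link_ids):
            received += 1
    return received / packets


rng = random.Random(0)  # fixed seed so the sketch is repeatable
loss_rates, congested = draw_link_loss_rates(20, 0.10, rng)
measured_pass_rate = probe_path([0, 3, 7], loss_rates, rng)
```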

After the network congestion data are processed by the Java program, algorithm analysis is performed using Matlab. According to the analysis of the existing literature, the algorithm RangeTLA [18] and the algorithm LIABLI [19] are classic algorithms for inference about the packet loss rate of the network link. The detailed description of the comparison algorithms is shown in Table 4.

In the Waxman and Barabasi–Albert network topology environments, the proposed algorithm NTLA is compared with the algorithms RangeTLA and LIABLI. The comparison indicators include the congested link detection rate (CLDR), the congested link misjudgment rate (CLMR), the absolute error of the link pass rate (AELPR), and the inference time of the algorithm. The calculation methods of the indicators are shown in formulas (22)–(24), where $F$ represents the set of congested links, $\hat{F}$ represents the inferred set of congested links, $\phi_{e_j}$ represents the pass rate of link $e_j$, and $\hat{\phi}_{e_j}$ represents the inferred pass rate of link $e_j$:
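The three accuracy indicators can be sketched with the definitions commonly used for detection rate and false-positive rate. These are assumed forms that may differ in detail from formulas (22)–(24): CLDR is taken as the share of truly congested links that are detected, CLMR as the share of inferred links that are not actually congested, and AELPR as the per-link absolute difference between true and inferred pass rates. The example sets and rates are hypothetical.

```python
def cldr(actual, inferred):
    """Detected congested links divided by all truly congested links."""
    # Assumption: the rate is reported as 0 when no link is congested.
    return len(actual & inferred) / len(actual) if actual else 0.0


def clmr(actual, inferred):
    """Wrongly inferred links divided by all inferred congested links."""
    return len(inferred - actual) / len(inferred) if inferred else 0.0


def aelpr(true_rates, inferred_rates):
    """Absolute error of the inferred pass rate for each link."""
    return [abs(t - i) for t, i in zip(true_rates, inferred_rates)]


# Hypothetical example.
actual_congested = {1, 4, 7}
inferred_congested = {1, 4, 9}

detection_rate = cldr(actual_congested, inferred_congested)
misjudgment_rate = clmr(actual_congested, inferred_congested)
errors = aelpr([0.90, 1.00, 0.95], [0.88, 1.00, 0.97])
```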

6.2. Results
6.2.1. Analysis of CLDR, CLMR, and Inference Time

The experimental results of CLDR, CLMR, and inference time are shown in Figures 3–5. The X-axis represents the network topology environment when the congestion rate of the network links changes between 5% and 15%. The Y-axis indicates the CLDR, CLMR, and inference time, respectively. It can be seen from Figures 3–5 that as the congestion rate of the network links increases, the detection performance of the three algorithms decreases, but the performance remains relatively stable.

In terms of the congested link detection rate, the detection rate of the algorithm RangeTLA decreases more in both the Waxman and Barabasi–Albert network topology environments, while that of the algorithm NTLA remains stable in both environments and is about 5% higher than that of the existing algorithms. Therefore, the algorithm NTLA in this paper achieves a higher congested link detection rate.

In terms of the congested link misjudgment rate, the misjudgment rate of the algorithm NTLA remains relatively stable in both the Waxman and Barabasi–Albert network topology environments and is about 8% lower than that of the existing algorithms. Therefore, in terms of the false-positive rate, the algorithm NTLA in this paper achieves a lower congested link misjudgment rate.

In terms of inference time, the inference time of the algorithm NTLA is relatively stable in both the Waxman and Barabasi–Albert network topology environments, while the inference time of the algorithm RangeTLA increases rapidly in both environments.

6.2.2. Analysis of AELPR

The analysis results of the AELPR are shown in Figure 6. The X-axis represents the mean of AELPR. The Y-axis represents the ratio of the result of the AELPR (RR-AELPR), which is calculated using formula (25), where $N$ indicates the total number of AELPR values and $N_{x}$ indicates the number of AELPR values that are less than a specified value $x$:
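Read as an empirical cumulative ratio, RR-AELPR can be sketched as below. The interpretation (the share of AELPR values strictly below a threshold) follows the verbal description, and the error values are hypothetical.

```python
def rr_aelpr(errors, threshold):
    """Share of AELPR values that are strictly below the threshold."""
    if not errors:
        return 0.0  # assumption: report 0 when there are no samples
    return sum(1 for e in errors if e < threshold) / len(errors)


# Hypothetical AELPR samples.
sample_errors = [0.001, 0.004, 0.006, 0.02]
ratio = rr_aelpr(sample_errors, 0.005)
```

Two of the four sample errors fall below 0.005, so the ratio at that threshold is 0.5; sweeping the threshold produces a curve of the kind plotted against the X-axis.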

In terms of AELPR, the algorithm RangeTLA has the highest AELPR in the Waxman and Barabasi–Albert network topologies, while the NTLA algorithm in this paper has the lowest AELPR. When the mean error of AELPR is 0.005, the algorithm in this paper improves by about 5% compared with the traditional algorithms.

6.2.3. Analysis of the Influence of Network Topology on the Algorithm

From Figures 3 to 6, it can be seen that all three algorithms perform better in the Waxman network topology environment than in the Barabasi–Albert network topology environment. This is because the degree of nodes in the Waxman network topology is smaller, so each probe path contains a large number of links but fewer shared links. A probe path therefore contains fewer uncertain links, and the algorithms perform better.

Of the three algorithms, RangeTLA is the most affected by the network topology. This is because RangeTLA treats the bottleneck links shared by more paths as congested links, and the Barabasi–Albert network topology contains more shared links. Therefore, the performance of RangeTLA is poor in the Barabasi–Albert network topology environment. The algorithm LIABLI first divides the network topology into multiple families, which reduces the influence of network topology characteristics on the performance of the algorithm. Therefore, LIABLI performs better than RangeTLA. Compared with RangeTLA and LIABLI, the algorithm in this paper combines the characteristics of the network topology with the historical data related to network congestion and reduces the special requirements that the algorithm places on the network. Therefore, its performance is improved compared with the traditional algorithms.

7. Conclusion

Quickly and accurately determining the link loss rate is very important for the reliable operation of the communication network. Many research results have been obtained to address two problems of link loss rate inference algorithms: the increased network load caused by multiple detections and the need to further improve calculation accuracy. When a detection path includes multiple shared links, existing studies infer the link loss rate under assumed conditions, which reduces the accuracy.

To solve this problem, this paper builds a network model based on the historical data of network operation and the characteristics of the network topology and proposes a network topology-aware link loss inference algorithm for communication networks. Simulation experiments verify that, compared with the existing algorithms, the algorithm in this paper achieves good results in terms of the detection rate and the accuracy of the packet loss rate. The results also show that when the network topology contains more shared links, the inference performance of the packet loss rate needs to be further improved. In future work, we will conduct in-depth research on the detection path selection algorithm to better address the impact of shared links on the performance of the loss inference algorithm.

Data Availability

The raw/processed data required to reproduce these findings cannot be shared at this time as the data also form part of an ongoing study.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported by the National Key Research and Development Program of China (no. 2018YFB1800704), the Public Security Technology of 1331 Engineering Key Discipline Construction Project in Shanxi Province (no. XK201727), and the Foundation of Shanxi Police College (nos. 2019yzd001 and 2019yzb006).