Abstract

With the advent of the 5G era, the limited network resources and methods of earlier generations can no longer guarantee that all services are carried out. 5G network services are not limited to mobile phones and computers; they also support the normal operation of equipment in all walks of life. As application scenarios become more numerous and more complex, more convenient and faster methods are needed to assist network services. To offload network traffic more effectively, make services more fine-grained, and support the development of 5G network technology, this article proposes a 5G network slicing method that supports blockchain and reinforcement learning, aiming to improve the efficiency of network services. The research results show the following: (1) In the model testing stage, the delay increases with the number of slices, but the blockchain + reinforcement learning method maintains the lowest delay. When the number of slices is 3, the delay is 155 ms. (2) The comparison of the latency of different types of slices shows that the latency of 5G network slicing is lower than that of 4G, 3G, and 2G network slicing, and the minimum latency of 5G network slicing using blockchain and reinforcement learning is only 15 ms. (3) In the system reliability test, reliability decreases as the number of users increases because reliability is related to delay: the greater the transmission delay, the lower the reliability. The blockchain + reinforcement learning method has the highest reliability, 0.95. (4) The resource utilization experiment on different slices shows that the blockchain + reinforcement learning method has the highest resource utilization. The resource utilization rate of all four slices under this method is above 0.8, and the highest is 1. (5) The simulation test shows that the average receiving throughput of video stream 1 is higher than that of video stream 2, IoT devices, and mobile devices, and that the average cumulative receiving throughput is highest under the blockchain + reinforcement learning method, reaching 1450 kbps. The average QoE of video stream 1 is higher than that of video stream 2, IoT devices, and mobile devices, and the average QoE is highest under the blockchain + reinforcement learning method, reaching 0.83.

1. Introduction

Relieving users’ network congestion, reducing network latency, and offloading network traffic are the top priorities for 5G networks. As a core technology, 5G network slicing can effectively address the challenges of service creation and exclusive network access for different users, as well as the coexistence of multiple application scenarios. The 5G network is expected to meet the different needs of users [1], and 5G network slicing may be a natural solution [2]: a wide range of services required for specific vertical use cases can be accommodated simultaneously on a common network infrastructure. 5G mobile networks are expected to meet flexible demands [3], so network resources can be dynamically allocated according to demand. Network slicing technology is a core part of the 5G network [4]. The definition of 5G network slicing creates a broad field for communication service innovation [5]. The vertical markets targeted by 5G networks are supported by multiple network slices on a general and programmable infrastructure [6]. Network slicing means dividing the physical network into multiple virtual networks so that they can be flexibly applied to different network scenarios. The future 5G network will also change the mobile network ecosystem [7]. The 5G mobile network is expected to meet the diversified needs of a variety of commercial services [8] and must support a large number of different service types [9]. Network slicing allows programmable network instances to be provided to meet the different needs of users. Blockchain can establish a secure and decentralized resource sharing environment [10]. A blockchain is a distributed open ledger [11] that records transactions across multiple computers. Reinforcement learning algorithms can effectively handle problems with large state spaces [12]. Reinforcement learning is mainly used to solve simple learning tasks [13].
5G networks are designed to support many vertical industries with different performance requirements [14]. Network slicing is considered an important factor in enhancing the network and has the flexibility needed to achieve this goal. It is regarded as one of the key technologies of the 5G network [15], because it allows virtual networks to be created and customized services to be provided on demand.

2.1. 5G Network Slicing
2.1.1. Network Slicing

Network slicing refers to offloading and managing network traffic when the network is congested and complex [16]. To meet the different needs of different users, the network is divided into multiple slices, each of which provides targeted services and support.

2.1.2. Network Slice Classification

The ultimate goal of 5G network slicing is to organically combine multiple network resource systems to form a complete network that can serve different types of users. Network slices can be divided into independent slices and shared slices as shown in Table 1:

2.1.3. 5G Network Application Scenarios

The application scenarios of 5G networks are divided into three categories: mobile broadband, massive Internet of Things, and mission-critical Internet of Things [17]. The details are shown in Table 2:

2.2. Blockchain
2.2.1. Definition of Blockchain

The blockchain consists of a shared, fault-tolerant distributed database and a multi-node network [18].

2.2.2. Blockchain Structure

Each block in the blockchain is composed of a block header and a block body, and the blocks are linked into a chain structure through the hash of the parent block [19]. The structure is shown in Figure 1:

The block header contains the parent block hash, timestamp, random number (nonce), difficulty, and Merkle root [20]. Their functions are shown in Table 3:
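The chain structure described above can be illustrated with a minimal sketch. It assumes SHA-256 hashing, a JSON-serialized header, and a toy proof of work in which the header hash must start with a given number of zero hex digits. The names `BlockHeader`, `merkle_root`, and `make_block` are hypothetical and are not taken from the article.

```python
import hashlib
import json
import time
from dataclasses import dataclass, asdict


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def merkle_root(transactions: list) -> str:
    """Hash the transactions pairwise until a single root hash remains."""
    if not transactions:
        return sha256_hex(b"")
    level = [sha256_hex(tx.encode()) for tx in transactions]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last hash on odd-sized levels
        level = [sha256_hex((level[i] + level[i + 1]).encode())
                 for i in range(0, len(level), 2)]
    return level[0]


@dataclass
class BlockHeader:
    parent_hash: str   # links this block to its parent
    timestamp: float
    nonce: int         # the "random number" searched for during mining
    difficulty: int    # assumed here: required count of leading zero hex digits
    merkle_root: str   # summary hash of the block body

    def hash(self) -> str:
        return sha256_hex(json.dumps(asdict(self), sort_keys=True).encode())


def make_block(parent_hash, transactions, difficulty=1, timestamp=None):
    """Build a header for the given body and search for a valid nonce."""
    ts = time.time() if timestamp is None else timestamp
    header = BlockHeader(parent_hash, ts, 0, difficulty, merkle_root(transactions))
    while not header.hash().startswith("0" * difficulty):
        header.nonce += 1
    return header, list(transactions)
```

Calling `make_block` with the hash of a previous header as `parent_hash` chains two blocks together; changing any transaction in the body changes the Merkle root and therefore the header hash.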

2.2.3. Blockchain Properties

Blockchain technology has three attributes: distribution, security, and robustness [21], as shown in Table 4:

2.3. Reinforcement Learning
2.3.1. Definition of Reinforcement Learning

Reinforcement learning is a branch of machine learning. It addresses how an agent should take actions in an environment in order to maximize the accumulated reward it obtains.

2.3.2. Reinforcement Learning Process

In the process of reinforcement learning, the agent makes decisions based on information from the environment [22]. The environment in turn rewards the agent for the corresponding behavior, and the agent enters a new state after the behavior. The process is shown in Figure 2:
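The interaction loop described above can be sketched as follows. The environment here is a hypothetical toy in which the state is a congestion level between 0 and 4 and the agent chooses whether to allocate more resources; it is an illustration only and not the environment used in the article.

```python
import random


class ToyEnvironment:
    """Hypothetical environment: the state is a congestion level from 0 to 4."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.state = 2

    def step(self, action):
        # Action 1 (allocate more resources) lowers congestion;
        # action 0 (do nothing) lets congestion stay or drift upward.
        drift = -1 if action == 1 else self.rng.choice([0, 1])
        self.state = min(4, max(0, self.state + drift))
        reward = -self.state  # lower congestion gives a higher reward
        return self.state, reward


def run_episode(env, policy, steps=10):
    """Run the agent-environment loop and return the accumulated reward."""
    state, total_reward = env.state, 0
    for _ in range(steps):
        action = policy(state)            # the agent decides from the observed state
        state, reward = env.step(action)  # the environment returns a new state and a reward
        total_reward += reward
    return total_reward
```

For example, `run_episode(ToyEnvironment(), lambda s: 1)` runs a policy that always allocates resources, while `lambda s: 0` never does; the first accumulates a higher reward in this toy setting.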

2.4. Model Design

The 5G network slicing architecture is composed of a network slice demander, slice management (business design, instance orchestration, and operation management), a slice selection function, and virtualization management and orchestration.

The process of the 5G network slicing model is as follows: network services enter the slice manager through the network slice demander, and the slice manager performs business design, instance orchestration, and operation management. From the slice manager, a service passes to the slice selection function, which assigns it to either the shared slice function or the independent slice function; it can also pass to virtualization management and orchestration, as shown in Figure 3:

3. Formula

3.1. Blockchain
3.1.1. Scalability within Shards

In the process of verifying the block consensus, the scalability within the shard [23] is as follows:

The quantities involved are the average transaction size, the block header size, and the number of shards.

3.1.2. Scalability of the Directory Shard

Given the average transaction size and the block header size, the scalability of the directory shard is as follows:

3.1.3. Scalability of Sharded Blockchain

The scalability of the entire sharded blockchain is composed of the scalability within the shards and the scalability of the directory shard [24]. Assuming that the block packing time is the same within the ordinary shards and the directory shard, and that the block header size is also the same, the formula is as follows:

3.2. Reinforcement Learning Methods
3.2.1. Value Function Method

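The value function method estimates the value (expected return) of being in a given state. Given a policy $\pi$, the state-value function $V^{\pi}(s)$ is the expected return $R$ when starting from state $s$ and following $\pi$. The formula is as follows:

$$V^{\pi}(s) = \mathbb{E}\left[R \mid s, \pi\right]$$

The optimal strategy $\pi^{*}$ has a corresponding state-value function $V^{*}(s)$, which is expressed as follows:

$$V^{*}(s) = \max_{\pi} V^{\pi}(s), \quad \forall s \in S$$

In the RL setting, it is difficult to obtain the state transition function $P$. So, a state-action value function $Q^{\pi}(s, a)$ is constructed:

$$Q^{\pi}(s, a) = \mathbb{E}\left[R \mid s, a, \pi\right]$$

Given $Q^{\pi}(s, a)$, the optimal strategy can be adopted in each state by choosing the action $\arg\max_{a} Q^{\pi}(s, a)$. Under this strategy, $V^{\pi}(s)$ can be defined by maximizing $Q^{\pi}(s, a)$ as follows:

$$V^{\pi}(s) = \max_{a} Q^{\pi}(s, a)$$

At present, mature reinforcement learning methods such as SARSA (on-policy) and Q-learning (off-policy) can be used to solve the value function. In both updates, $\alpha$ is the learning rate, $\gamma$ is the discount factor, and $r_t$ is the reward received at step $t$.

SARSA:

$$Q^{\pi}(s_t, a_t) \leftarrow Q^{\pi}(s_t, a_t) + \alpha\left[r_t + \gamma Q^{\pi}(s_{t+1}, a_{t+1}) - Q^{\pi}(s_t, a_t)\right]$$

Off-policy Q-learning:

$$Q^{\pi}(s_t, a_t) \leftarrow Q^{\pi}(s_t, a_t) + \alpha\left[r_t + \gamma \max_{a} Q^{\pi}(s_{t+1}, a) - Q^{\pi}(s_t, a_t)\right]$$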
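As an illustration of how the two methods differ, the following minimal sketch implements the standard tabular update rules. SARSA bootstraps from the action actually taken in the next state, while Q-learning bootstraps from the best action available in the next state. The table layout, the function names, and the default learning rate and discount factor are assumptions made for this example.

```python
from collections import defaultdict


def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """On-policy update: the target uses the next action actually chosen."""
    target = r + gamma * Q[(s_next, a_next)]
    Q[(s, a)] += alpha * (target - Q[(s, a)])


def q_learning_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    """Off-policy update: the target uses the greedy action in the next state."""
    target = r + gamma * max(Q[(s_next, b)] for b in actions)
    Q[(s, a)] += alpha * (target - Q[(s, a)])


def new_table():
    """Q-table keyed by (state, action), with unseen entries defaulting to 0."""
    return defaultdict(float)
```

With `Q[("s1", "x")] = 1.0` and `Q[("s1", "y")] = 2.0`, a transition from `"s0"` with reward 1.0 moves `Q[("s0", "x")]` toward 1.9 under SARSA when the next action is `"x"`, and toward 2.8 under Q-learning, which always looks at the better action `"y"`.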

3.2.2. Strategy Method

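The strategy method outputs the action directly by searching for the optimal strategy $\pi^{*}$. For a policy $\pi_{\theta}$ with parameters $\theta$, the objective function is defined as the cumulative expected reward, where $\gamma$ is the discount factor and $r_t$ is the reward at step $t$:

$$J(\theta) = \mathbb{E}_{\pi_{\theta}}\left[\sum_{t} \gamma^{t} r_{t}\right]$$

The policy parameter $\theta$ is updated along the gradient of the discounted cumulative expected reward, with a step size given by a learning rate $\alpha$. The formula of the strategy gradient method is as follows:

$$\theta \leftarrow \theta + \alpha \nabla_{\theta} J(\theta)$$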
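A minimal sketch of a policy gradient update is given below, assuming a softmax policy over a small set of actions and a one-step (bandit) setting with fixed rewards per action. This is the classic REINFORCE estimator applied to a toy problem, chosen for illustration; the article does not specify which policy gradient variant it uses, and the function names are hypothetical.

```python
import math
import random


def softmax(theta):
    """Turn action preferences into a probability distribution."""
    m = max(theta)
    exps = [math.exp(t - m) for t in theta]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_bandit(rewards, steps=2000, alpha=0.1, seed=0):
    """Gradient ascent on expected reward for a softmax policy.

    rewards[i] is the (assumed deterministic) reward of action i.
    Returns the final action probabilities.
    """
    rng = random.Random(seed)
    theta = [0.0] * len(rewards)
    for _ in range(steps):
        probs = softmax(theta)
        action = rng.choices(range(len(rewards)), weights=probs)[0]
        r = rewards[action]
        # For a softmax policy, d/d theta_i of log pi(action) is
        # 1[i == action] - probs[i].
        for i in range(len(theta)):
            grad_log = (1.0 if i == action else 0.0) - probs[i]
            theta[i] += alpha * r * grad_log
    return softmax(theta)
```

Running `reinforce_bandit([0.0, 1.0])` shifts almost all probability onto the second action, since only that action ever produces a positive reward signal.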

3.2.3. MDP

A Markov decision process (MDP) formalizes the problem of learning from experience gained in the interaction between the agent and the environment in order to achieve a goal [25]. The state space is defined as follows:

The channel component represents the state of all wireless channels in the 5G network slice and takes values in the channel state space. It is represented as follows:

Each element represents the state of one channel, drawn from the channel state space.

The connection component represents the connection status and takes values in the connection status space. It is defined as follows:

The rate component represents the state of all data transmission rates in the slice and takes values in the data transmission rate state space. It is defined as follows:

The topology component represents the topological state of the physical network and takes values in the topological state space of the physical network. It is defined as follows:

The action space for wireless resource allocation is defined as follows:

Each element is a 5G network radio resource allocation action, drawn from the corresponding network action space, expressed as follows:

Each entry represents occupied wireless resources.
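To make the state and action spaces more concrete, the sketch below represents a state as a tuple of the four components named above and enumerates a discrete action space in which a fixed number of radio resource blocks is distributed among slices. The field names, the use of resource blocks as the unit of allocation, and the function `allocation_actions` are all assumptions made for illustration.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class SliceState:
    channel: tuple     # state of each wireless channel in the slice
    connection: tuple  # connection status entries
    rate: tuple        # data transmission rate entries
    topology: tuple    # topology state of the physical network


def allocation_actions(num_slices, num_blocks):
    """Enumerate every way to give at most num_blocks resource blocks to slices.

    Each action is a tuple whose i-th entry is the number of blocks
    occupied by slice i.
    """
    return [a for a in product(range(num_blocks + 1), repeat=num_slices)
            if sum(a) <= num_blocks]
```

Because `SliceState` is frozen and built from tuples, it is hashable and can be used directly as a key in a value table. The action space grows quickly with the number of slices and blocks, which is one reason function approximation is commonly used instead of a table.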

3.3. Model Building

The physical network is modeled as a weighted undirected graph, defined by the set of network nodes, the computing capacity of each node, and the set of links between nodes.
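A weighted undirected graph of this kind can be sketched with two dictionaries, one for node computing capacities and one for link weights. The class name `PhysicalNetwork` and the interpretation of the link weight (for example, bandwidth or delay) are assumptions; the article does not specify them here.

```python
class PhysicalNetwork:
    """Weighted undirected graph with a computing capacity on every node."""

    def __init__(self):
        self.capacity = {}  # node -> computing capacity
        self.links = {}     # frozenset({u, v}) -> link weight

    def add_node(self, node, capacity):
        self.capacity[node] = capacity

    def add_link(self, u, v, weight):
        if u == v:
            raise ValueError("self-loops are not allowed")
        if u not in self.capacity or v not in self.capacity:
            raise KeyError("both endpoints must be added as nodes first")
        # A frozenset key makes the link undirected: (u, v) and (v, u) coincide.
        self.links[frozenset((u, v))] = weight

    def weight(self, u, v):
        return self.links[frozenset((u, v))]

    def neighbors(self, node):
        return sorted(next(iter(link - {node}))
                      for link in self.links if node in link)
```

Storing each link under a `frozenset` of its endpoints means the weight can be looked up in either direction without duplicating entries.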

The first dynamic scheduling queue state transition function is as follows:

The second dynamic scheduling queue state transition function is as follows:

Combining the above analysis, the 5G network slicing model is expressed as follows:
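The idea of a queue state transition can be illustrated with a commonly used form of queue dynamics, in which the backlog at the next step equals the unserved backlog plus new arrivals. This form is an assumption chosen for illustration and is not necessarily the transition function used in the article; the sketch also models a single queue only.

```python
def queue_step(backlog, arrivals, service):
    """One transition: serve what can be served, then add the new arrivals."""
    return max(backlog - service, 0) + arrivals


def simulate_queue(arrivals, service_rates, initial=0):
    """Apply queue_step over a sequence of time slots and record the backlog."""
    backlog, history = initial, []
    for a, s in zip(arrivals, service_rates):
        backlog = queue_step(backlog, a, s)
        history.append(backlog)
    return history
```

For example, `simulate_queue([3, 1, 0, 4], [2, 2, 2, 2])` shows the backlog rising when arrivals exceed the service rate and draining to zero when they do not.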

4. Experiment

4.1. Model Test
4.1.1. Variation of Time Delay with the Number of Slices

This article mainly studies a 5G network slicing method that supports blockchain and reinforcement learning. First, we test the model and compare the blockchain + reinforcement learning method with the blockchain-only method, the reinforcement-learning-only method, and a baseline that uses neither. The results are shown in Figure 4.

The comparison results show that the delay increases with the number of slices, but the blockchain + reinforcement learning method consistently maintains the lowest delay. When the number of slices is 3, the delay is 155 ms. The overall delay of the blockchain method is lower than that of the reinforcement learning method because the blockchain method gives priority to nodes with rich resources and strong data processing capabilities when selecting node and link mappings.

4.1.2. Delay Comparison of Different Slice Types

For the different slice types, the number of users is set to 30 and the delays produced by the methods are compared. We compare 5G, 4G, 3G, and 2G network slicing under the blockchain + reinforcement learning method, the blockchain-only method, the reinforcement-learning-only method, and a baseline that uses neither. The results are shown in Figure 5.

The comparison results show that the latency of 5G network slicing is lower than that of 4G, 3G, and 2G. With the blockchain + reinforcement learning method, 5G network slicing achieves the lowest latency, only 15 ms. Latency differs between slices because the greater the number of VNFs, the more nodes the slice passes through to process the same data packet, the longer the traversed link, and the greater the delay.
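The relationship between the number of VNFs and the delay can be illustrated with a simple additive model, in which the end-to-end delay of a service chain is the sum of the processing delay at each VNF and the delay of each link between consecutive VNFs. The additive model and the function name `chain_delay` are assumptions; the numbers in the usage note are made up and are not measurements from the article.

```python
def chain_delay(processing_ms, link_ms):
    """End-to-end delay of a chain of VNFs under an additive delay model.

    processing_ms: processing delay at each VNF, in order.
    link_ms: delay of each link between consecutive VNFs.
    """
    if not processing_ms:
        raise ValueError("the chain needs at least one VNF")
    if len(link_ms) != len(processing_ms) - 1:
        raise ValueError("a chain of n VNFs has exactly n - 1 links")
    return sum(processing_ms) + sum(link_ms)
```

Under this model, `chain_delay([2, 3], [1])` gives 6 ms, and adding a third VNF with `chain_delay([2, 3, 4], [1, 1])` gives 11 ms, so every extra VNF adds both a processing term and a link term.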

4.1.3. System Reliability

Checking system reliability is an indispensable step before the experiment. We compare the system reliability of the different methods (blockchain + reinforcement learning, blockchain, and reinforcement learning) under different numbers of users. The comparison results are shown in Figure 6:

The graph shows that reliability decreases as the number of users increases, because reliability is related to delay: the greater the transmission delay, the lower the reliability. The blockchain + reinforcement learning method has the highest reliability, 0.95. This means that 5G network slicing supporting blockchain + reinforcement learning can provide services for more businesses.
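One way to see why reliability falls as delay grows is to define reliability as the fraction of packets delivered within a delay deadline. This definition is an assumption made for illustration; the article does not state how its reliability metric is computed.

```python
def reliability(delays_ms, deadline_ms):
    """Fraction of packets whose delay does not exceed the deadline."""
    if not delays_ms:
        raise ValueError("at least one delay sample is required")
    on_time = sum(1 for d in delays_ms if d <= deadline_ms)
    return on_time / len(delays_ms)
```

With a 25 ms deadline, the samples `[10, 20, 30, 40]` give a reliability of 0.5. If congestion adds 10 ms to every packet, the same computation gives 0.25, so a higher transmission delay directly lowers the reliability under this definition.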

4.2. Resource Utilization of Different Slices

This section studies the resource utilization of different slices under the methods considered in this article. Four slices are set up, and three tests are performed on each slice: blockchain + reinforcement learning, blockchain alone, and reinforcement learning alone. The resulting resource utilization is compared in Figure 7:

The experimental results show that the blockchain + reinforcement learning method has the highest resource utilization rate. The resource utilization rate of all four slices under this method is above 0.8, and the highest is 1. This shows that the blockchain + reinforcement learning method achieves the best resource utilization for the slices.

4.3. Simulation Test

A simulation test is designed for the 5G network slicing method supporting blockchain and reinforcement learning proposed in this article. With other factors excluded, the experimental subjects are video stream 1, video stream 2, IoT devices, and mobile devices.

4.3.1. Average Cumulative Receiving Throughput

The experiment compares the four types of devices under three methods: blockchain + reinforcement learning, blockchain, and reinforcement learning. The average cumulative received throughput (kbps) is used to decide which method is better. Throughput refers to the amount of data processed by the system per unit of time. The results are shown in Table 5.
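Since the metric is reported in kbps, the sketch below assumes that throughput is measured as the volume of received data per unit of time and that 1 kbps equals 1000 bits per second. The function name and the input format (a list of received payload sizes in bytes) are hypothetical.

```python
def average_throughput_kbps(received_bytes, duration_s):
    """Average received throughput in kbps over the measurement window.

    received_bytes: sizes of all payloads received during the window.
    duration_s: length of the window in seconds.
    """
    if duration_s <= 0:
        raise ValueError("duration must be positive")
    total_bits = sum(received_bytes) * 8
    return total_bits / 1000 / duration_s
```

For instance, receiving two payloads of 125000 bytes each over 2 seconds corresponds to 1000 kbps.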

The results are also plotted as a histogram in Figure 8.

According to the experimental results, the average receiving throughput of video stream 1 is higher than that of video stream 2, IoT devices, and mobile devices, and the average cumulative receiving throughput is the highest under the blockchain + reinforcement learning method, reaching 1450 kbps.

4.3.2. Average QoE

The average QoE of the different devices is compared under the three methods to determine which method is more suitable for 5G network slicing. QoE (quality of experience) refers to the user’s overall experience of the quality and performance of the network system. The results are shown in Table 6:

The results are also plotted as a histogram in Figure 9.

According to the experimental results, the average QoE of video stream 1 is higher than that of video stream 2, IoT devices, and mobile devices, and the average QoE is the highest under the blockchain + reinforcement learning method, reaching 0.83.

5. Conclusion

With the advent of the 5G era, existing technologies can no longer meet the needs of users. Network congestion and slow network speeds are the major problems currently faced. To let users access network services more smoothly and conveniently, this article designs a 5G network slicing model that supports blockchain and reinforcement learning. The model manages the distribution of network traffic more effectively, increases the transmission rate of users in the business, and reduces the transmission delay.

The research results of the article are given below:
(1) In the model testing stage, the delay increases with the number of slices, but the blockchain + reinforcement learning method maintains the lowest delay. When the number of slices is 3, the delay is 155 ms.
(2) The comparison of the delay of different slice types shows that the delay of 5G network slicing is lower than that of 4G, 3G, and 2G. With the blockchain + reinforcement learning method, 5G network slicing has the lowest delay, only 15 ms.
(3) In the system reliability test, reliability decreases as the number of users increases. This is because reliability is related to delay: the greater the transmission delay, the lower the reliability. The blockchain + reinforcement learning method has the highest reliability.
(4) The resource utilization experiment on different slices shows that the blockchain + reinforcement learning method has the highest resource utilization. The resource utilization rate of all four slices under this method is above 0.8, and the highest is 1.
(5) The simulation test shows that the average receiving throughput of video stream 1 is higher than that of video stream 2, IoT devices, and mobile devices, and that the average cumulative receiving throughput is the highest under the blockchain + reinforcement learning method, reaching 1450 kbps. The average QoE of video stream 1 is higher than that of video stream 2, IoT devices, and mobile devices, and the average QoE is the highest under the blockchain + reinforcement learning method, reaching 0.83.

Although the results of this experiment are clear, the work has certain limitations: it is restricted to 5G network slicing. Considerable further research is needed to make the approach more general and to apply it to more scenarios. In future work, the method supporting blockchain and reinforcement learning proposed in this article can be improved so that it can serve network service requirements with a wider range of objectives.

Data Availability

The experimental data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest regarding this work.