Abstract
In scenarios where multiple device-to-device (D2D) users and cellular users coexist, the large number of D2D users not only causes a shortage of spectrum resources but also interferes with the communication of cellular users. In this paper, we establish a clustering model centered on cellular users and propose a resource allocation algorithm based on this D2D clustering model. While ensuring the throughput requirements of cellular users, the algorithm reconstructs the matrices used in graph theory through probability models, sets the priority of D2D pairs, and maximizes the number of users admitted within the interference tolerance. In addition, once the number of users has been optimized, we adopt the Rubinstein game model, adjust the game order according to the priority, and optimize the bandwidth allocation so as to improve the overall throughput of the network. Simulation results show that the proposed algorithm can increase the number of users and the network throughput, while shortening the spectrum allocation time by more than half.
1. Introduction
With the proliferation of mobile devices and the rapid development of data services, the increasing number of connected devices poses challenges for future mobile networks in terms of spectrum access [1]. In the future connected society, people, processes, data, and things are combined to achieve the goal of “Internet of everything” (IoE) [2]. However, the shortage of spectrum resources will become more serious.
Faced with the explosive growth of mobile terminals and the continuous increase of emerging mobile services, existing communication systems under centralized control have exposed the disadvantages of large transmission delay and limited system capacity. As a new short-distance communication technology, device-to-device (D2D) communication does not need to relay data through base stations. By reusing cellular resources, it makes up for the defects of traditional cellular communication, improves the capacity and spectrum utilization of the whole system, and alleviates the shortage of spectrum resources [3]. When D2D communication is combined with the cellular network, there are two communication modes, overlay and underlay [4]. In underlay mode, D2D users communicate by reusing the spectrum of the cellular network. In this mode, D2D and cellular users interfere with each other, which affects their normal communication. However, the spectrum utilization and the number of admitted D2D users are significantly increased [5].
Spectrum management is the key to improving the utilization of spectrum resources, and it is often combined with classical mathematical and economic models, such as graph theory, game theory, and auction models. Among them, game theory has been widely used in D2D communication networks in recent years, because it can efficiently allocate spectrum resources by formulating the interaction between various incentive elements [6]. Recently, Kaleem et al. [7] proposed a D2D discovery maximization iterative algorithm to reduce users’ power consumption in power-limited situations; they focused on discovering more users in public safety scenarios by adopting the concept of open-loop power control. In [8], in order to improve spectral efficiency, Kaleem et al. proposed a frame structure for an in-band full-duplex (IB-FD) system that prioritizes public safety (PS) users in resource allocation, together with a time-efficient device discovery resource allocation scheme. Moreover, compared with random access mode, the discovery time of PS priority mode is about 37%. Li et al. proposed an efficient interference-aware frequency resource-sharing scheme for multiple D2D groups, which maximizes system throughput by considering the grouping method, adaptive antenna arrays, and the application of interference alignment to D2D communications [9]. A QoS-based dynamic fractional frequency reuse (FFR) scheme has also been proposed that efficiently allocates the nonoccupied center zone, optimizes cell-edge user throughput and sector throughput, and reduces cochannel interference at the same time [10].
Based on the above research, we propose a spectrum allocation algorithm that uses graph theory and game theory models. First, centering on the cellular users, the algorithm divides neighboring D2D pairs into clusters based on their geographic locations. Then, it establishes a mapping between the signal-to-interference-plus-noise ratio (SINR) of D2D pairs and the graph-theoretic matrices and discount factors. Further, the number of D2D pairs multiplexing the uplink is optimized while ensuring the throughput requirements of cellular users. Finally, the frequency band occupancy of the D2D pairs is divided through the Rubinstein game model. In addition, we analyze and derive the total network revenue and the spectrum allocation time achieved by the proposed algorithm. Simulation experiments show the superiority of this algorithm.
The remainder of this paper is organized as follows. The system model is introduced in Section 2. In Section 3, the Rubinstein game model is introduced and combined with the system model to divide the frequency band occupancy of D2D pairs. Then, by mapping the SINR of D2D pairs to the graph-theoretic matrices and discount factors, the mathematical representation under the probability model is constructed in Section 4. Simulation results are given in Section 5. Finally, we draw conclusions in Section 6.
2. System Model
In this paper, we consider uplink spectrum sharing in a single cellular network, where D2D pairs are less disturbed. We assume that cellular users (CUs) and M D2D pairs within the coverage of the central macro base station are randomly distributed in the cell, as shown in Figure 1. In the system, orthogonal spectrum resources are used among cellular users, and D2D users communicate by reusing the uplink of the cellular network. Because different cellular users have different QoS requirements, the frequency characteristics they use also differ. Therefore, D2D pairs are divided into cellular user-centered D2D clusters, and each D2D cluster reuses spectrum resources with different characteristics. To increase the number of users and the spectrum utilization, a multiplexing scenario is established in which multiple full-duplex D2D pairs multiplex the resources of multiple cellular users.

The traditional coloring problem mainly studies the minimum number of colors needed for vertex coloring, where each user can only get one color. In order to improve the spectrum efficiency, it is assumed that the frequency band of a cellular user can be reused by multiple D2D pairs, provided that the normal communication of the cellular user is guaranteed. In addition, the concept of the cluster is introduced: users with the same color form a cluster. Because there are cellular users in the system, clusters can be formed around them.
As shown in Figure 1, taking one of the cellular users as an example, a user cluster centered on the cellular users is constructed. Assuming that spectrum resources and transmit power of CUs have been allocated in advance, the set of CUs is , and the set of full-duplex D2D clusters in the network is , the th D2D user pair in the th cluster can be expressed as , where and , respectively, represent the sending end and receiving end of the th D2D user pair.
The SINR of the cellular user is where denotes the transmit power of the cellular user , is the link gain between the cellular user and the base station, and, respectively, represent the transmit power of the th D2D pair in the cluster and the link gain between transmitter and cellular user , and denotes the additive white Gaussian noise.
In the uplink communication link, the cell user in the -cluster and the D2D pairs that reuse the spectrum of this cell user cause interference to receiver . The transmit power of the D2D user is adjusted according to Ref. [11], and the SINR of the receiver of the th D2D pair is where represents the transmit power of the receiving end of the D2D pair , is the link gain from the transmitter to the receiver of the th D2D user, is the link gain from to the , and and , respectively, represent the transmit power of other D2D pairs in the cluster and link gain to .
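As a rough illustration of the two SINR definitions above, the following sketch assumes the usual form of received signal power divided by the sum of co-channel interference and noise. The function names, argument names, and numbers are hypothetical and are not taken from this paper.

```python
def cu_sinr(p_cu, g_cu_bs, p_d2d, g_d2d_cu, noise):
    """SINR of the cellular user on its uplink (assumed form).

    p_d2d[j]    : transmit power of D2D pair j in the cluster
    g_d2d_cu[j] : gain of the interference link from transmitter j
                  toward the cellular link
    """
    interference = sum(p * g for p, g in zip(p_d2d, g_d2d_cu))
    return p_cu * g_cu_bs / (interference + noise)


def d2d_sinr(k, p_d2d, g_direct, g_cross, p_cu, g_cu_rx, noise):
    """SINR at the receiver of D2D pair k (assumed form).

    g_direct[k]   : gain from transmitter k to its own receiver
    g_cross[j][k] : gain from transmitter j to receiver k
    g_cu_rx[k]    : gain from the cellular user to receiver k
    """
    interference = p_cu * g_cu_rx[k] + sum(
        p_d2d[j] * g_cross[j][k] for j in range(len(p_d2d)) if j != k
    )
    return p_d2d[k] * g_direct[k] / (interference + noise)


# Hypothetical two-pair cluster sharing one cellular uplink.
p_d2d = [0.1, 0.2]
print(cu_sinr(1.0, 0.5, p_d2d, [0.1, 0.05], 0.01))
print(d2d_sinr(0, p_d2d, [0.8, 0.9], [[0.0, 0.05], [0.02, 0.0]],
               1.0, [0.01, 0.02], 0.01))
```

Every additional D2D pair adds one term to both interference sums, which is why admitting more pairs lowers the SINR of the cellular user and of the pairs already connected.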
The utility function in [12] is used to deal with the difference of QoS requirements and the existence of the extremum of the nonnegative convex function for different D2D shared spectrum resources, and it is given by where is the utility function of the th D2D pairs users in the th cluster; according to the needs of different types of users, the threshold can be flexibly adjusted to obtain different system throughput, and and represent the impact factor of the SINR and power impact factor for D2D users, respectively.
The utility function is composed of two parts. The former is a power function with the difference of the SINR of the target as the variable, which represents cognitive users’ satisfaction with SINR. The latter part is the price function which is used to prevent the selfish behavior of the cognitive user, which blindly increases its transmit power, regardless of the other cognitive user’s income. Through the establishment of price function, cognitive users are forced to “cooperate.”
The iterative formula of optimal power under power control is obtained as [13].
The final SINR of each D2D user is obtained according to the optimal power after iteration, and the discount factor of each user in the game is obtained based on the mapping relation through the final SINR.
Under the condition that the SINR of the cellular user and D2D pair is satisfied, the total throughput of the system is optimized. According to the Shannon formula, the user throughput of single cellular network can be obtained as where represents the throughput of the cell user on the uplink, denotes the uplink throughput of D2D pairs multiplexed cellular users, represents the channel bandwidth value of the cell user on the uplink, represents the channel bandwidth value of D2D to the multiplexed cellular user uplink, and the channel bandwidth is normalized such that .
In this paper, the channel bandwidth is normalized. After satisfying the cellular users’ throughput requirements and ensuring their normal communication requirements, the Rubinstein bargaining game model is used to obtain the remaining bandwidth share from the competition and maximize the network throughput. The minimum throughput required by the cellular user to transmit its data maintaining a certain level of QoS is set to ; then, the bandwidth required by the cellular user is
The remaining bandwidth is
Then, the rest of the bandwidth is allocated by Rubinstein’s bargaining game model based on the number and link quality of the comultiplexed D2D pairs.
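As an illustrative sketch of the bandwidth split described above, the code below assumes the Shannon rate on a normalized band, so that the cellular user needs a share of r_min / log2(1 + SINR) and the rest is left for the D2D pairs. The function names and example numbers are hypothetical, and r_min is assumed to be expressed relative to the total bandwidth.

```python
import math


def cu_bandwidth_share(r_min, sinr_cu):
    """Normalized bandwidth the cellular user needs to reach rate r_min.

    Assumes rate = share * log2(1 + SINR), with SINR as a linear ratio.
    """
    return r_min / math.log2(1.0 + sinr_cu)


def remaining_share(r_min, sinr_cu):
    """Normalized bandwidth left for the D2D pairs."""
    b_cu = cu_bandwidth_share(r_min, sinr_cu)
    if b_cu > 1.0:
        raise ValueError("the cellular user's requirement exceeds the band")
    return 1.0 - b_cu


# Hypothetical numbers: SINR of 15 (linear), required rate 2 per unit band.
print(cu_bandwidth_share(2.0, 15.0))  # log2(16) = 4, so the share is 0.5
print(remaining_share(2.0, 15.0))     # 0.5 is left for the D2D pairs
```

A higher cellular SINR shrinks the share reserved for the cellular user, so more bandwidth remains to be divided in the bargaining game.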
3. Game Model
3.1. Rubinstein’s Bargaining Model
The Rubinstein game model is a kind of dynamic cooperative game in which players communicate with each other. Because of the existence of discount factors in the game model, the players follow the principle of maximizing their own profits and shorten the game process, so as to reduce the disadvantageous influence of discount factors on self-income.
As shown in Figure 2, two participants A and B in the alliance jointly divide a piece of land with a total area of “1.” First, A proposes a distribution plan, that is, a “bid.” Because information within the alliance is shared, B chooses to accept or reject A’s “bid.” If B rejects it, B proposes its own distribution plan, that is, a “bargain.” Then, A decides whether to accept it, and so on until the two participants reach a compromise.

When the bargaining game is finite, we can use backward induction to find the subgame perfect equilibrium by setting the game to end at time T.
We assume that the bargaining game is infinite, so backward induction cannot be used directly. However, we can combine the idea of backward induction with the self-similarity of the game tree (each subgame is structurally similar to the original game) to obtain the unique subgame perfect equilibrium.
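To make the link between the finite and the infinite game concrete, the sketch below runs backward induction on a finite alternating-offer game and shows the first mover's share settling as the horizon grows. It assumes that A proposes in odd rounds, that the last proposer keeps the whole pie, and that each proposer offers the responder exactly the responder's discounted continuation value. The function name and the discount factors are hypothetical.

```python
def first_mover_share(delta_a, delta_b, rounds):
    """Share A secures in round 1 of a game that ends after `rounds` rounds."""
    # In the final round the proposer keeps everything.
    share = 1.0
    # Walk backwards from the second-to-last round to round 1.
    for t in range(rounds - 1, 0, -1):
        # A proposes in odd rounds, so the responder is B there and A otherwise.
        responder_delta = delta_b if t % 2 == 1 else delta_a
        # The responder accepts anything worth at least the discounted
        # share it would secure as the proposer of round t + 1.
        share = 1.0 - responder_delta * share
    return share


for horizon in (1, 2, 3, 10, 50, 200):
    print(horizon, first_mover_share(0.9, 0.8, horizon))

# Fixed point of x = 1 - delta_b * (1 - delta_a * x), the infinite-horizon share.
print((1 - 0.8) / (1 - 0.9 * 0.8))
```

Because two consecutive backward steps shrink any deviation by the factor delta_a * delta_b, the finite-horizon shares converge to the fixed point, which is how the self-similarity argument yields a unique equilibrium.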
In an infinite-horizon bargaining game, the player who makes the first offer obtains a greater share than the one who responds, and for both parties, whoever has the greater discount factor, that is, more patience, obtains a more favorable equilibrium outcome.
Therefore, when player A makes the first bid, player A benefits most. According to [14], the subgame perfect equilibrium result, that is, the final share of A and B is where and are the discount factors for participants A and B, respectively.
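For reference, the closed form usually quoted for the two-player alternating-offer game with A moving first can be written as follows. The symbols \delta_A and \delta_B (discount factors) and x_A^*, x_B^* (equilibrium shares) are notation assumed here for illustration.

```latex
x_A^{*} = \frac{1-\delta_B}{1-\delta_A\delta_B},
\qquad
x_B^{*} = 1 - x_A^{*} = \frac{\delta_B\,(1-\delta_A)}{1-\delta_A\delta_B}
```

With equal discount factors \delta_A = \delta_B = \delta, this gives x_A^* = 1/(1+\delta) > 1/2, which is the first-mover advantage. With \delta_A = 0.9 and \delta_B = 0.8, the shares are x_A^* = 0.2/0.28 ≈ 0.714 and x_B^* ≈ 0.286.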
3.2. Multiplayer Bilateral Game Model
Since the spectrum is an indispensable and valuable resource in wireless communication system, communication users compete for spectrum resources to maximize their own revenues. To solve the problem of competition among users, the game theory model is usually used to obtain the optimal strategy of spectrum allocation by comprehensively considering individual behaviors.
We consider the residual bandwidth allocation of multiple D2D pairs under the normalized channel bandwidth model. The elements of the game model are as follows:
(1) D2D pairs are the participants in the game process, namely, the decision-making subjects, represented as . The corresponding discount factor is
(2) is the share of each network user when equilibrium is reached, namely, the strategy set of the participants
(3) The payoff function for each participant is (), which is determined by a combination of the discount factor and the bandwidth allocated
(4) The cost per participant is (), which represents the bandwidth loss of each player due to the time consumed by the game process
In the general noncooperative game, the optimal strategy set among each participant is obtained through the Nash equilibrium, where participants do not cooperate with each other and are irrational and selfish during the competition. After reaching the Nash equilibrium, in order to maximize game participants’ own interests, no individual participant is willing to change his strategy. Therefore, this behavior often fails to reach the overall optimal goal. However, the existence of the discount factor makes the multiplayer bilateral game model different from the general noncooperative game [12]. The discount factor can reduce the revenue of the participants over time. Under the action of the discount factor, in the th game and the th game (), that is, the participants at the corresponding two moments get the same share; then, the final participant revenue .
In the bargaining game model of players, players “offer” to the next player in order, and the next player chooses to accept or reject. When the next participant chooses to accept, the “offer” continues in order. When reject is chosen, the subgame ends and enters the next subgame process.
Supposing that the subgame starts from . It is suggested to that is allocated to share , then chooses to accept or reject. The two cases are described as follows. (1)If accepts, gets a share of and does not participate in all the subsequent subgame processes. Then, continues to play games with and proposes to get share for the remaining resources(2)If refuses in the current subgame, the current subgame ends and enters the next subgame process. And will make the first offer.
The mapping relationship between the discount factor and SINR is where denotes SINR of each player and denotes the adjustment factor between SINR of each player and the corresponding discount factor.
Based on the above analysis, the subgame process can be expressed as where and is the introduced intermediate variable, it can be obtained as
When the subgame perfect Nash equilibrium is reached, the relationship among the strategies of the participants is
After obtaining the strategy relationship among the participants, the bandwidth can be allocated to the D2D pairs accordingly.
4. Allocation of Resources
4.1. Graph Theory Model
In this paper, the traditional coloring algorithm is improved to increase the number of D2D users and the throughput of the network, namely, D2D pairs with interference are allowed to access the same frequency band at the same time.
The traditional spectrum allocation algorithm based on graph theory describes the interference relationships among users by establishing an undirected graph , which is composed of vertices and edges [13]. We assume the following: (1) if a secondary user is within the interference range of an authorized user, the secondary user cannot access this frequency band for communication and (2) if two users would interfere with each other when accessing the same channel, they cannot transmit on it simultaneously.
In the graph theory coloring allocation model, 0/1 decisions are often used to describe the availability of frequency bands to users, the interference between users, the revenue, and so on. The mathematical expressions are as follows.
(1) Available matrix: , represents that user can use channel ; otherwise, the user cannot use this channel. Each D2D pair has a different set of available channels depending on the channel occupied by the cell user
(2) Interference matrix: , indicates that there will be interference between user and , so they cannot use the same channel to communicate at the same time
(3) Efficiency matrix: , indicates that channel is not available for user . When , it represents the revenue obtained by users, and represents the revenue weights of different channels
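The classical 0/1 model can be sketched as a greedy assignment over an availability matrix and an interference matrix. This is a minimal illustration of the traditional scheme rather than the improved algorithm of this paper; the function name, the matrix layout, and the example values are assumptions.

```python
def greedy_channel_assignment(available, interference):
    """Assign each user at most one channel under the classical 0/1 model.

    available[n][m] == 1    : user n may use channel m
    interference[n][k] == 1 : users n and k must not share a channel
    Returns the channel index of each user, or None if no channel fits.
    """
    n_users = len(available)
    assignment = [None] * n_users
    for n in range(n_users):
        for m, usable in enumerate(available[n]):
            if not usable:
                continue
            # Channel m is blocked if an interfering user already holds it.
            conflict = any(
                interference[n][k] and assignment[k] == m
                for k in range(n_users) if k != n
            )
            if not conflict:
                assignment[n] = m
                break
    return assignment


# Hypothetical example: 3 users, 2 channels, user 1 interferes with users 0 and 2.
available = [[1, 1], [1, 0], [1, 1]]
interference = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
print(greedy_channel_assignment(available, interference))  # [0, None, 0]
```

In this example, user 1 is shut out entirely because its only available channel is held by an interfering neighbor. Hard 0/1 exclusion of this kind is what an SINR-based interference matrix is meant to relax.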
On the basis of the above model, we establish a mapping between link quality and matrix elements and define several matrices in graph theory with the goal of increasing the number of users that access the channel.
In the normalized channel model, we establish the mapping relationship between the amount of interference between adjacent users and matrix elements and allow the remaining users to access the channel under the condition that the cellular users and the connected D2D pair have normal communication. First, based on the channel gain, transmit power, and noise interference of D2D pairs and cellular users, the SINR is calculated to construct a new interference matrix. There are a large number of D2D pairs in a single clustering network, and access may cause network congestion. Furthermore, a new available matrix is established based on the link quality of D2D pairs, and the access priority of D2D pairs is determined according to the matrix element value. Finally, in order to prevent the same user from occupying the channel for a long time, the available matrix is updated in time after a D2D pair is connected to ensure the fair reuse of spectrum resources by users.
4.2. D2D Cluster
Each pair of D2D users has a list of colors, each color corresponds to its available channel for the cellular user, and initially, the D2D user can use all the colors. When the interference of D2D users to cellular users affects their normal communication, D2D users do not take the color of the cellular users.
As shown in Figure 3, we take cellular user 1 as an example to analyze the clustering process. The set in Figure 3 represents the set of available channels for D2D users, and the numbers in the circles represent D2D candidate members. For the first color, we use the color list to find candidates for the first cluster. According to the interference matrix, user 3 causes the least interference to the cellular user, so user 3 is added to the first cluster; the users that do not interfere with user 3 are added as well, and they are removed from the candidate pool at the same time. At this point, user 1 causes the least interference to the cellular user. Although there is an edge between users 3 and 1, the access of user 1 does not affect the normal communication of the existing users in the cluster, so user 1 is added to the first cluster and removed from the candidate pool. Finally, user 4 causes the least interference to the cellular user, but the access of user 4 would affect the normal communication of the connected members, so user 4 is not admitted. At this point, the clustering process centered on the first cellular user is over.

4.3. Access Control
In the system model of one cellular user and multiple D2D pairs, it is assumed that each D2D user has the cognitive function. The transmitting signal coverage of cellular users is circular, radiating from inside to outside. A series of D2D pairs are located within the coverage. Users are represented by set , and the access control process is as follows.
Within the range of a cellular user’s signal, the total number of D2D pairs is
In order to increase the access number of users in the underlay access mode, the lower limit and of SINR tolerance of cellular users and connected D2D are set. Based on the element values in the available matrix, the subnode access priority is determined, and the user set is established.
According to the interference tolerance lower limit, it is necessary to identify whether to access the D2D pair with the minimum interference, namely, the highest priority in set .
When the SINR of the cellular user and the connected D2D pair satisfies and , it means that the normal communication needs of the cellular user and the connected D2D pair can be satisfied, and the new D2D pair is allowed to access the channel. However, when or , the normal communication between the cellular user and the connected D2D pair is disturbed; then, the optimization process is ended and is changed to
The steps of the spectrum resource optimization algorithm are shown in Table 1.
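The admission procedure of this subsection can be sketched as a greedy loop. The sketch assumes that priority is given by the interference a pair causes on the cellular link (least first), that SINR takes the usual signal over interference-plus-noise form, and that the first threshold violation ends the process. All names, gains, and thresholds below are hypothetical.

```python
def admit_pairs(p_cu, g_cu_bs, noise, p, g_own, g_bs, g_cu_rx, g_cross,
                cu_min_sinr, d2d_min_sinr):
    """Return the indices of the admitted D2D pairs, in admission order.

    p[j]          : transmit power of pair j
    g_own[j]      : gain from transmitter j to its own receiver
    g_bs[j]       : gain of pair j's interference link toward the cellular link
    g_cu_rx[j]    : gain from the cellular user to receiver j
    g_cross[j][k] : gain from transmitter j to receiver k
    """
    # Highest priority first: the pair that disturbs the cellular link least.
    order = sorted(range(len(p)), key=lambda j: p[j] * g_bs[j])
    admitted = []
    for cand in order:
        trial = admitted + [cand]
        cu = p_cu * g_cu_bs / (sum(p[j] * g_bs[j] for j in trial) + noise)
        d2d_ok = all(
            p[k] * g_own[k]
            / (p_cu * g_cu_rx[k]
               + sum(p[j] * g_cross[j][k] for j in trial if j != k)
               + noise) >= d2d_min_sinr
            for k in trial
        )
        if cu < cu_min_sinr or not d2d_ok:
            break  # the first violation ends the optimization process
        admitted = trial
    return admitted


# Hypothetical cluster with three candidate pairs.
cross = [[0.0, 0.05, 0.05], [0.05, 0.0, 0.05], [0.05, 0.05, 0.0]]
admitted = admit_pairs(1.0, 1.0, 0.01, [0.1, 0.1, 0.1], [1.0, 1.0, 1.0],
                       [0.1, 0.3, 0.2], [0.01, 0.01, 0.01], cross,
                       cu_min_sinr=20.0, d2d_min_sinr=3.0)
print(admitted)  # [0, 2]
```

Here pairs 0 and 2 are admitted, while admitting pair 1 would push the cellular SINR to about 14.3, below the assumed threshold of 20, so the loop stops.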
5. Simulation Results
In the case of cellular networks, the interference properties of multiple D2D clusters are similar, so we only consider the spectrum allocation scenarios of a single D2D cluster. It is assumed that the cell users with and D2D pairs with are randomly distributed in the cell, and the simulation experiment is carried out on the MATLAB platform. The simulation parameters are set as shown in Table 2.
5.1. Access Number Optimization
Due to the difference in geographical location of each D2D pair of receivers, the SINR of each receiver is calculated under the random given transmit power of the transmitter. The priority of D2D pairs is determined according to the link quality of each receiver. Under the condition of ensuring the normal communication of the connected D2D pairs, the access number of D2D pairs in a single channel is optimized within the interference tolerance of cellular users.
As shown in Figure 4, the D2D pairs in two clusters are colored under the graph theory model. Within the SINR tolerance of the cellular user and of the D2D pairs, 4 and 5 D2D pairs, respectively, access the channel and multiplex the uplink of a cellular user.

5.2. Network Throughput Simulation Process
Under the condition of ensuring the minimum transmit rate of 100 kbit/s for cellular users, we obtain the bandwidth required by cellular users according to the Shannon equation.
As shown in Figure 5, we observe the change of network throughput after optimizing the number of D2D pairs by taking clusters 1-2 as an example. Each line in the figure represents the experiment of optimizing the number of users once, and each node represents the access of a pair of D2D users. As shown by the solid line at the top of the figure, the minimum throughput demand of cellular users is 100 kbit/s, and the total throughput of the network is increased to 127 kbit/s by the multiplexing uplink of 4 pairs of D2D users.

5.3. Bandwidth Allocation
On the premise of ensuring the minimum transmit rate of cellular users, the network bandwidth is normalized, and the bandwidth required by cellular users is calculated by the Shannon equation. The remaining network bandwidth is then divided among the admitted D2D pairs by the game model, and the game order is determined according to the priority of each D2D pair.
As shown in Figure 6, the remaining bandwidth is divided among 4 D2D pairs, and the higher the priority in the game model (the higher the game order), the larger the bandwidth share of the D2D pair.

5.4. Network Revenue Comparison Simulation Process
As shown in Figure 7, the total benefits of the network under the multiplayer bilateral game model and under the random allocation of channel bandwidth are compared in the case of normalized channel bandwidth. (1) The black line shows the network throughput when the residual bandwidth is allocated to the D2D pairs according to the multiplayer bilateral game model. As shown in the figure, the transmit rate of the network is always in a high and stable state under the game model. (2) The red line shows the change in the total transmit rate when D2D users play noncooperative games.

In Figure 7, each node represents the total revenue of the system, and each line represents the change of the total system revenue over 4 runs. Comparing the results of the 4 random spectrum allocation experiments with those of the proposed algorithm, we can clearly see that the total system return obtained by the algorithm tends to be stable and is higher than that obtained by a noncooperative game. Our algorithm therefore improves the total transmit rate of the network.
To measure the real-time performance of the proposed spectrum allocation algorithm, we compare the time consumed by the two allocation methods. Table 3 shows the time consumed in spectrum allocation under the game model and under the noncooperative game model. It can be seen from Table 3 that the spectrum allocation algorithm under the game model saves more than half of the time compared with random allocation, which benefits delay-sensitive services and means that the real-time performance of spectrum allocation under the game model is better.
6. Conclusion
In this paper, a resource allocation algorithm based on the D2D cluster is proposed to solve the problems of spectrum resource shortage and system capacity limitation caused by the explosive growth of mobile terminals and emerging services. In order to improve the reuse rate of the cellular uplink by D2D users, spectrum access is transformed into a graph coloring process, and the interference tolerance and throughput requirements of cellular users are set so that multiple D2D pairs can reuse the uplink at the same time, which alleviates the shortage of spectrum resources to some extent. Then, the multiplayer bilateral game model is introduced to establish the mapping between SINR and discount factor, and the game order is determined according to the user priority under the graph theory model. Comparing the game model with the random allocation of channel bandwidth shows that both the overall transmit rate of the network and the spectrum allocation time are significantly improved. In addition to improving user throughput, the real-time performance of the system is also improved. However, when clustering D2D users, we consider only the impact of geographical location on the cluster. In the next step, factors such as the QoS requirements of D2D users and cellular users and the real-time characteristics of different channels will be combined with the revenue matrix of users to construct different types of utility functions.
Data Availability
The data used to support the findings of this study are included within the article.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
The work of Ji-ai He, Lu Jia, Lei Xu, and Wei Chen was supported by the National Natural Science Foundation of China under grant 61561031.