Abstract
This paper studies the user selection problem for a cooperative nonorthogonal multiple access (NOMA) system consisting of a base station, a far user, and near users. The selected near user receives its own message and assists the far user by relaying the far user's message. Firstly, we propose a user selection strategy to maximize the selected near user's data rate while satisfying the quality-of-service (QoS) requirement of the far user. Considering that the channel state information (CSI) of users in actual communication is usually imperfect, we then analyze the outage probability of the NOMA system based on the user selection strategy under imperfect CSI and obtain a closed-form expression. The theoretical analysis shows that the diversity order of the NOMA system under imperfect CSI is 0, which means the multiuser diversity gain disappears. To mitigate the impact of imperfect CSI on system performance, we use a deep learning method to identify and classify channels with imperfect CSI and improve the accuracy of the CSI. The simulation results show that the theoretical analysis of outage performance is consistent with the numerical results. Compared with the strategy without the deep learning method, the proposed deep learning-based user selection scheme significantly improves the system performance. Furthermore, we verify that our scheme recovers the diversity gain.
1. Introduction
With the development of communication technology, spectrum resources are increasingly scarce. To make efficient use of the limited spectrum, nonorthogonal multiple access (NOMA) technology was proposed [1]. Existing research [2–5] shows that NOMA allows multiple users to communicate simultaneously over a single channel resource, thus achieving higher spectral efficiency than traditional orthogonal multiple access (OMA). However, there is mutual interference between users in a NOMA system. The most common way to handle this interference is successive interference cancellation (SIC) [6, 7], which successively decodes and removes the strongest user's signal from the received signal. Consequently, some NOMA users will know other users' messages and can therefore serve as relays to help those users. In this sense, the idea of cooperative NOMA was proposed. A classical strategy was designed for cooperative NOMA in [8], where users that successfully decode other users' messages help those users in turn. Note that existing research on cooperative NOMA networks [9–12] mainly considered improving system performance under perfect channel state information (CSI). However, perfect CSI is difficult to obtain in practice. In NOMA networks, imperfect CSI not only causes additional interference to the desired signal but also leads to a wrong decoding order [13–15]. Therefore, it is particularly necessary to minimize the channel estimation error as much as possible.
With the development of artificial intelligence, deep learning (DL) technology [16] has been widely used [17–19]. To obtain accurate CSI, some works used DL technology for channel estimation and achieved good performance. In [20], a DL-based channel estimation network was proposed for high-speed mobile scenarios, reducing computational complexity and improving performance. In [21], a channel learning scheme based on a deep autoencoder was developed, which learned CSI at the energy transmitter from the energy feedback harvested by the energy receiver. In [22], based on the fact that the propagation environment is almost identical, a DL-based CSI estimation method for high-mobility networks was proposed, which allowed the deep neural network to learn the nonlinear CSI relations. In [23], a five-layer deep neural network (DNN) was designed to estimate channels in orthogonal frequency-division multiplexing (OFDM) systems. In [24], the authors regarded CSI as 2D images and used DL-based image processing techniques to estimate the channel. In [25], a channel estimator using a sliding bidirectional gated recurrent unit network was designed at the receiver, which can be combined with other channel estimation techniques. In [26], a DNN was constructed for channel estimation and direction-of-arrival estimation, improving performance without increasing complexity. In [27], a deep-image-prior-based DNN was proposed to improve estimation performance without training. In [28], to reduce the overhead in multi-input multi-output (MIMO) systems, a convolutional neural network- (CNN-) based estimator was proposed. In [29], a learned denoising-based approximate message passing (LDAMP) network was exploited for beamspace channel estimation.
Motivated by the successful application of DL to channel estimation, we introduce DL technology into cooperative NOMA networks to solve the imperfect CSI estimation and user selection problems. It is worth mentioning that the above DL schemes for channel estimation are not applicable to our NOMA system even though they perform well on their respective problems. First, each transmission block in this paper is equally divided into two phases, and the CSI of the two hops is uncorrelated; to improve learning performance, it is effective to learn the CSI of each hop separately. Second, a key step in NOMA systems is SIC, which requires a dedicated network design. Details of the network design are given in Section 4. The main contributions of this paper are summarized as follows.
(i) In our work, the optimal power allocation and user selection in the proposed downlink cooperative NOMA system are studied. Based on the proposed user selection strategy, the system outage probability and diversity order under imperfect CSI are analyzed. The theoretical analysis shows that the diversity order of the NOMA system under imperfect CSI is 0, which means the multiuser diversity gain disappears.
(ii) A channel estimation network based on a CNN and long short-term memory (LSTM) is designed for our NOMA scenario, which includes offline training and online prediction. Firstly, to reduce the complexity of training and improve learning performance, the input data is pretrained by a three-layer one-dimensional CNN to extract feature vectors. Secondly, a learning network with two LSTM layers in parallel is built, which learns the CSI and reduces the estimation error so that the appropriate user can be selected.
(iii) The estimation performance of the proposed scheme is verified. Firstly, the LSTM network shows high learning accuracy, which largely guarantees the reliability of the mapping between user selection and input data. Secondly, the proposed scheme successfully improves the data rate of near users with imperfect CSI. Compared with the strategy without the DL method, our proposed scheme has obvious advantages, and its performance is close to that of the strategy with perfect CSI. Thirdly, the proposed scheme improves the outage performance and recovers the diversity gain, whereas the diversity gain under imperfect CSI is zero without the DL method.
The remainder of this paper is organized as follows. In Section 2, a downlink NOMA system with multiple near users and a far user is constructed, and the far user is paired with one of the near users. In Section 3, the optimal power allocation and user selection are analyzed, and the system outage probability and diversity order under imperfect CSI are derived. In Section 4, we propose a CNN- and LSTM-based scheme for the NOMA scenario, which pretrains the data and trains the network to reduce the channel estimation error. In Section 5, the performance of the proposed scheme is simulated and analyzed, followed by conclusions in Section 6.
2. System Model
We consider a NOMA downlink scenario consisting of a base station (BS), a far user , and near users , , …, , …, , as shown in Figure 1. The location of is fixed, and the near users are randomly distributed between the BS and . Assume the channels between the BS/ and the near users follow the Rayleigh distribution. The near users have direct links with the BS, whereas there is no direct link between the BS and . Hence, can only rely on the near users' help to communicate with the BS. Specifically, the near users employ the decode-and-forward protocol to forward the far user's information.

Each transmission block is equally divided into two phases. In the first phase, supposing the near user is selected to help , BS sends a superimposed signal containing the information for and . The signal received by can be expressed as where and denote the messages for and , respectively, is the channel coefficient from BS to , , denotes the path loss exponent, denotes the distance between BS and , represents the power allocation factor for , represents the transmit power, and represents additive white Gaussian noise (AWGN) at with mean 0 and variance .
After receiving , first tries to decode and then decodes with the estimated . Assuming the minimum mean square error (MMSE) channel estimator [30] is used, we have where is the estimated complex channel coefficient and denotes the channel estimation error following a complex Gaussian distribution, denoted by . Hence, the received signal-to-interference-plus-noise ratios (SINRs) at to detect and can be, respectively, given by where is the transmit signal-to-noise ratio (SNR).
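To make the imperfect-CSI model concrete, the following sketch computes the two first-phase SINRs for one near user. It is a minimal illustration under our own naming (h_hat, sigma_e2, a_f, rho) and the standard assumption that the MMSE estimation error acts as additional Gaussian interference; it is not the paper's original code.

```python
import numpy as np

def first_phase_sinrs(h_hat, sigma_e2, a_f, rho):
    """Hypothetical first-phase SINRs at a near user under imperfect CSI.

    h_hat    : estimated BS -> near-user channel coefficient (complex)
    sigma_e2 : variance of the MMSE channel estimation error
    a_f      : power allocation factor of the far user's message
    rho      : transmit SNR
    """
    a_n = 1.0 - a_f                      # remaining power fraction for the near user
    gain = np.abs(h_hat) ** 2
    # SINR when detecting the far user's message: the near user's own message and
    # the residual estimation error are treated as interference
    sinr_far_msg = a_f * rho * gain / (a_n * rho * gain + rho * sigma_e2 + 1.0)
    # SINR when detecting the near user's own message after SIC: only the
    # estimation error remains as residual interference
    sinr_own_msg = a_n * rho * gain / (rho * sigma_e2 + 1.0)
    return sinr_far_msg, sinr_own_msg
```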
If successfully decodes , it forwards in the second phase. The signal received by can be expressed as where is the perfect complex channel coefficient from to , , denotes the distance between and , and is the AWGN at with mean 0 and variance . Similar to , let the estimate for the channel be . By assuming MMSE estimation error, it holds that and the SINR at to decode is given by where is the variance of .
According to (3) and (7), the achievable rate of (denoted by ) can be expressed as
If exceeds the targeted data rate of (denoted by ), the quality-of-service (QoS) requirement of is satisfied. Thus, according to (4), the achievable rate of (denoted by ) can be expressed as
Note that is achievable only when , which means the signal is decoded at successfully. If , the corresponding near user will not be selected and we regard its data rate as 0.
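The rate definitions above can be summarized in a short sketch: the far user's rate is limited by the weaker of the two hops with the half-duplex factor 1/2 from the two equal phases, and the near user's rate is counted only when the far user's QoS target is met. The names and exact expressions below are our assumptions, not the paper's code.

```python
import numpy as np

def achievable_rates(sinr_far_at_near, sinr_far_at_far, sinr_own, rate_target_far):
    """Hypothetical per-block achievable rates for one candidate near user.

    sinr_far_at_near : SINR at the near user when decoding the far user's message
    sinr_far_at_far  : SINR at the far user when decoding the relayed message
    sinr_own         : SINR at the near user when decoding its own message
    rate_target_far  : targeted data rate of the far user (bits/s/Hz)
    """
    # Decode-and-forward over two equal phases: factor 1/2 and the min over both hops
    rate_far = 0.5 * np.log2(1.0 + min(sinr_far_at_near, sinr_far_at_far))
    # The near user's rate counts only if the far user's QoS requirement is satisfied
    rate_near = 0.5 * np.log2(1.0 + sinr_own) if rate_far >= rate_target_far else 0.0
    return rate_far, rate_near
```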
The aim of this paper is to maximize the achievable rate of near users while ensuring the QoS requirement of , i.e., maximizing , by jointly optimizing near-user selection and power allocation coefficient . In the next two sections, we will study this problem without and with the DL method.
3. Power Allocation, User Selection, and Outage Performance Analysis
In this section, we study the optimal power allocation coefficient and user selection and then investigate the system outage performance without the DL method.
3.1. Optimal Power Allocation Coefficient and User Selection
Firstly, we optimize the power allocation coefficient. Based on (3), (8), and (9), the supremum of can be given by where . In order to ensure that decodes successfully, should be larger than zero, in which case we can find that
Under the condition of (11), since the aim of this paper is to maximize the performance of near users on the basis of meeting the QoS requirement of , the optimal power allocation coefficient should be taken as the upper bound, which is given by
According to (7), (8), (9), and (11), the set of effective near users which can ensure the QoS requirement of can be expressed as
To maximize the achievable rate of near users, the selected near users should be contained in set . Thus, the best user selection is expressed as follows:
It can be observed from (12)–(14) that channel estimation has a significant impact on power allocation and user selection.
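The rule in (12)–(14) can be illustrated with the sketch below: for each near user, compute the minimum power fraction the far user's message needs, check whether both hops can support the far user's QoS (the effective set), allocate the remaining power to the near user, and pick the user with the largest own rate. The closed-form expressions and array names are our reconstruction under the standard imperfect-CSI SINR model, not the paper's code.

```python
import numpy as np

def select_near_user(h_hat, sigma_h2, g_hat, sigma_g2, rho, rate_target_far):
    """Hypothetical user selection sketch over N candidate near users.

    h_hat / g_hat       : estimated first-hop / second-hop channel coefficients (arrays)
    sigma_h2 / sigma_g2 : estimation-error variances of the two hops
    rho                 : transmit SNR
    rate_target_far     : targeted data rate of the far user (bits/s/Hz)
    """
    eps = 2.0 ** (2.0 * rate_target_far) - 1.0          # far user's SINR threshold
    gain_h = np.abs(np.asarray(h_hat)) ** 2
    gain_g = np.abs(np.asarray(g_hat)) ** 2

    # Minimum power fraction the far user's message needs on the first hop
    a_f_min = eps * (rho * gain_h + rho * sigma_h2 + 1.0) / (rho * gain_h * (1.0 + eps))
    # Second-hop SINR when the selected near user relays with full power
    sinr_hop2 = rho * gain_g / (rho * sigma_g2 + 1.0)

    feasible = (a_f_min < 1.0) & (sinr_hop2 >= eps)     # effective near-user set
    if not feasible.any():
        return None                                     # no user can meet the far user's QoS
    a_n = np.where(feasible, 1.0 - a_f_min, 0.0)        # remaining power goes to the near user
    sinr_own = a_n * rho * gain_h / (rho * sigma_h2 + 1.0)
    rate_near = np.where(feasible, 0.5 * np.log2(1.0 + sinr_own), 0.0)
    return int(np.argmax(rate_near))                    # index of the selected near user
```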
3.2. System Outage Probability and Diversity Order
The outage probability in this paper is defined as the probability that fails to decode or fails to decode . Let be the targeted rate of . If successfully decodes , i.e., , we can find that where . According to (12), formula (15) can be further expressed as follows:
Taking the constraints of (13) and (16) into consideration, the system outage probability is given by where denotes the outage probability for . Note that the system is completely unable to communicate only when all near users fail. Therefore, is the cumulative outage probability over all near users. According to the definition of diversity order [31], when , we can find that where , , , , , and . According to (17) and (18), we can find that which is a constant. Thus, the diversity gain of the cooperative NOMA system without the DL method can be expressed as
According to the above analysis, the accuracy of each hop's CSI has a major effect on each user's detection performance and on the user selection result. Next, we introduce our proposed DL-based channel estimation method for the considered cooperative NOMA network.
4. CNN and LSTM-Based User Selection Scheme for the Downlink Cooperative NOMA
Due to the poor system outage performance under imperfect CSI, this section considers using the DL method to obtain more accurate CSI and thereby improve system performance. Specifically, a scheme adopting an LSTM network to solve the proposed optimization problem is considered. Solving the proposed problem with DL requires a large amount of data for learning, which can lead to the vanishing gradient problem. The memory cells in the LSTM network can save previously extracted information for later use, alleviating the vanishing gradient problem [32]. Thus, LSTM is employed in this paper to learn the CSI of each hop. In addition, considering the large amount of training data and the complexity of the SIC method, a CNN is added in front of the LSTM network to extract channel features and improve the learning performance.
4.1. Design Concept of Neural Network
Our aim is to minimize the channel estimation error and obtain accurate CSI so as to select the optimal user and achieve the best performance. To learn the CSI of each hop, a neural network with multiple LSTM layers in parallel is designed, and each LSTM layer corresponds to the CSI of one hop. In this paper, the network uses two LSTM layers because each transmission block involves two-hop CSI. To fit the SIC process, multiple hidden layers are added to the network, and the number of hidden layers corresponds to the number of decoding steps in each transmission. Thus, two hidden layers are added after the LSTM layers in this paper. Meanwhile, a CNN, which pretrains the data to make it easier to learn, is added in front of the LSTM network. It is worth mentioning that, according to our design, if the NOMA system becomes more complex, the neural network can be adjusted to adapt to the new NOMA scenario.
The network design is summarized as follows. The first three layers of the network form a one-dimensional CNN, which pretrains the data to extract feature vectors and reduce the number of learning parameters. The extracted feature vectors are passed to the fourth layer. Then, the fifth layer learns the feature vectors of the imperfect CSI. Note that feature information may otherwise vanish; thus, the fifth layer is the LSTM layer. After that, two hidden layers are added. Finally, the eighth layer handles the network output. Details are given in the following subsection.
4.2. Proposed Channel Estimation Framework
The proposed channel estimation framework is shown in Figure 2, which includes offline training and online prediction. For offline training, the learning network is trained with a large amount of well-pretrained data. In the online prediction part, the feedback of the downlink NOMA system is the input of the learning network and the estimated channel coefficient is the output. The following parts introduce the data pretraining and the proposed learning network in detail, respectively.

4.2.1. Data Pretraining
To improve the generalization ability and accelerate the convergence of the proposed network, the original input data needs to be pretrained. Note that data is converted into a sequence of transmitted symbols in the communication system. Since the data is complex-valued, the real part and the imaginary part need to be separated first and then recombined into a real-number sequence for pretraining. Thus, let the transmitted signal vector be a sequence with all data of , where is the number of transmitted symbols in the sequence at . If the number is not in the process of pretraining, the data is regarded as invalid. Note that the transmitted signal vector is one-dimensional. Therefore, the CNN used for pretraining the data is also one-dimensional. One-dimensional CNNs show excellent performance in extracting feature vectors from sequences: they can compress the input data sequence into a shorter sequence of high-level features, shortening the training time of the neural network and reducing the load on each neuron. After feature extraction, a different sequence is output.
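As a small illustration of the complex-to-real conversion described above, the sketch below interleaves the real and imaginary parts of a symbol vector into the real-valued sequence fed to the one-dimensional CNN. The interleaving order and the trailing channel axis are our assumptions.

```python
import numpy as np

def to_real_sequence(symbols):
    """Split a complex symbol vector into an interleaved real-valued sequence."""
    symbols = np.asarray(symbols, dtype=complex)
    out = np.empty(2 * symbols.size)
    out[0::2] = symbols.real          # even positions: real parts
    out[1::2] = symbols.imag          # odd positions: imaginary parts
    return out.reshape(-1, 1)         # add a channel axis for the Conv1D front end
```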
The CNN constructed in this paper is a stack of convolution layers and a pooling layer. The specific structure is as follows. The first layer is a one-dimensional convolution layer, which computes the convolution sum of the input data through a sliding window and obtains the output of the convolution layer after weighting. After the input passes through the convolution layer, the original information sequence is convolved and transformed; all convolution outputs form a feature matrix, which is passed to the next layer of the network. The second layer is a one-dimensional max pooling layer, which searches for the maximum output value of the previous convolution layer through the pooling window and extracts feature vectors from the feature matrix to reduce the number of feature parameters. The third layer stacks another one-dimensional convolution layer, performs the same operation as the first layer, and completes the extraction of the feature vectors. The pretrained data can be fed into the LSTM network for channel estimation, because the data still represents the CSI. The activation function used by the CNN in this paper is the hyperbolic tangent (tanh) activation function, $\tanh(x) = (e^{x} - e^{-x})/(e^{x} + e^{-x})$.
The pretraining method is shown in Algorithm 1. Pretraining improves the performance of the network, prevents the network from failing to converge, and speeds up convergence.
4.2.2. Proposed Learning Network
The structure of offline training is shown in Figure 3. To improve the learning capacity of the network, we build the LSTM network behind the 3-layer CNN. For each layer of the network, the output is the weighted sum over the neurons that the layer is equipped with. Since the data has been pretrained and the dimension of the input data has been reduced, the number of features used for training is not very large. When the pretrained data enters the LSTM network, these features are transformed and propagated to the nodes of the next layer. The dimension of the fourth layer, which is a dense layer with 32 neurons conveying features, is set to the length of the sequence used for training. Afterwards, the fifth layer consists of multiple parallel LSTM layers; in this paper, two LSTM layers are used to learn the uncorrelated two-hop CSI. The sixth and seventh layers are dense layers with 64 and 32 neurons, respectively, corresponding to the two decoding steps in the NOMA system. If the number of hidden layers here is smaller than the number of decoding steps, the learning performance decreases, which we verify in Section 5. When the number of decoding steps in the NOMA system increases, more hidden layers should be added to maintain the learning capacity. Additionally, the output layer, processed by the softmax function, provides the estimated output signal vectors. The fourth, sixth, and seventh layers are processed by the Rectified Linear Unit (ReLU) function. The two functions are $\mathrm{ReLU}(x) = \max(0, x)$ and $\mathrm{softmax}(z_i) = e^{z_i} / \sum_{j=1}^{K} e^{z_j}$, where $z_i$ is the output value of the $i$th node and $K$ is the number of output nodes, namely, the number of classification categories.
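A Keras sketch of the eight-layer structure described above is given below. The widths of the dense layers follow the description (32 neurons in the fourth layer, 64 and 32 in the sixth and seventh, softmax output); the CNN filter counts, kernel sizes, LSTM units, and dropout rate are not specified in the paper and are our assumptions.

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_network(seq_len, num_classes):
    """Sketch of the CNN + parallel-LSTM network (hyperparameters are assumptions)."""
    inputs = keras.Input(shape=(seq_len, 1))                  # pretrained real-valued sequence

    # Layers 1-3: one-dimensional CNN front end extracting feature vectors (tanh activation)
    x = layers.Conv1D(16, kernel_size=3, padding="same", activation="tanh")(inputs)
    x = layers.MaxPooling1D(pool_size=2)(x)
    x = layers.Conv1D(16, kernel_size=3, padding="same", activation="tanh")(x)

    # Layer 4: dense layer with 32 neurons conveying the extracted features (ReLU)
    x = layers.Dense(32, activation="relu")(x)

    # Layer 5: two LSTM layers in parallel, one per hop, with recurrent dropout
    hop1 = layers.LSTM(32, recurrent_dropout=0.2)(x)
    hop2 = layers.LSTM(32, recurrent_dropout=0.2)(x)
    x = layers.concatenate([hop1, hop2])

    # Layers 6-7: hidden dense layers matching the two decoding steps of SIC (ReLU)
    x = layers.Dense(64, activation="relu")(x)
    x = layers.Dense(32, activation="relu")(x)

    # Layer 8: softmax output layer
    outputs = layers.Dense(num_classes, activation="softmax")(x)
    return keras.Model(inputs, outputs)
```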

Note that the proposed scheme attempts to minimize the channel estimation error, approximating imperfect CSI to perfect CSI. Whether the judgment of CSI is accurate or not has a great impact on user selection. Based on the successful identification of accurate CSI, the most appropriate near user can be selected.
The principle of the proposed LSTM network is shown in Algorithm 2. The goal of our model is to minimize the channel estimation error, which is equivalent to minimizing the difference between the input data and the output of the network. Therefore, the loss function we use is the categorical cross-entropy, given by $L = -\sum_{i=1}^{K} y_i \log \hat{y}_i$, where $K$ is the dimension of the output vectors, $y_i$ is the expected output, and $\hat{y}_i$ is the actual output. It measures the distance between the probability distribution of the network output and the distribution of the specified label. By minimizing the distance between the two distributions, a well-trained network can make the output as close to the desired result as possible. Meanwhile, to reduce the computational complexity and improve the learning performance of the network, we choose RMSprop as the optimizer, which adaptively scales the parameter updates to minimize the value of the loss function. Based on the LSTM network, we use recurrent dropout to improve the generalization ability of the neural network: input units of a given layer are randomly set to 0 in order to break accidental correlations in the training data of that layer.
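Putting the pieces together, a hedged training sketch with the loss and optimizer named above is shown below. It reuses build_network from the previous sketch; x_train and y_train are hypothetical pretrained input sequences and one-hot labels, and the sequence length, class count, learning rate, batch size, and validation split are our assumptions.

```python
from tensorflow import keras

# x_train / y_train: hypothetical pretrained sequences and one-hot user-selection labels
model = build_network(seq_len=128, num_classes=8)
model.compile(optimizer=keras.optimizers.RMSprop(learning_rate=1e-3),
              loss="categorical_crossentropy",   # the categorical cross-entropy loss above
              metrics=["accuracy"])
history = model.fit(x_train, y_train,
                    epochs=400,                  # Figure 5 shows convergence after about 400 epochs
                    batch_size=64,
                    validation_split=0.2)
```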
5. Simulation Results and Discussion
In this section, we evaluate the performance of the proposed deep neural network with the LSTM framework when it is used in the NOMA system. Specifically, Python 3.7 and Keras are used for programming the neural networks, and MATLAB is used for simulating the NOMA data. Keras is a DL framework that makes it easy to define and train almost any type of DL model. Simulation parameters are set as follows. Results are averaged over 100,000 simulation runs to eliminate the randomness caused by channel fading. The distance between the BS and (denoted by ) is set as 10 m or 30 m, and the near users are randomly distributed between and the BS.
Our simulation is divided into the following three parts. Firstly, we verify the performance of the proposed neural network: the learning accuracy of the neural network is analyzed, and the convergence is reflected by the loss value. Secondly, the average data rate of the proposed scheme is simulated. We take the proposed DL-based user selection as our learning strategy under imperfect CSI and apply it to the simulation of the near-user data rate. We also compare this strategy with two user selection strategies without DL, namely, a contrasted strategy with imperfect CSI [30] and a contrasted strategy with perfect CSI. The user selection with imperfect CSI utilizes the imperfect instantaneous CSI and selects the user with , and the user selection with perfect CSI selects the user with . Thirdly, the system outage probability of the proposed scheme is simulated and compared with the two strategies without DL to verify that the proposed scheme has good outage performance and diversity gain. The numerical results of the outage performance are consistent with the theoretical analysis. Details are given in the following subsections.
5.1. Performance of the Proposed Learning Network
In this subsection, we analyze the performance of our deep neural network. As shown in Figure 4, the constructed LSTM network finally achieves 91% accuracy in identifying CSI. Such accuracy largely guarantees reliable user selection. The loss value of the proposed network is shown in Figure 5, in which the network gradually converges after 400 epochs. The above simulation results show that the trained neural network can accurately judge the CSI of the input data, which means the appropriate near users can be selected.


To verify the impact of the number of dense layers behind the LSTM layers on the performance of the neural network, the learning accuracy of different networks is simulated in Figure 6. When the number of added dense layers is 2, the learning accuracy of the neural network is higher. If fewer dense layers are used, the performance drops dramatically. In addition, continuing to add dense layers does not bring further gains and increases the learning cost. Therefore, we can conclude that the neural network performs best when the number of added dense layers equals the number of decoding steps.

5.2. Average Data Rate of the Proposed User Selection Scheme
Based on the high accuracy of the learning network, we analyze in this subsection the average data rate of the proposed user selection scheme as the transmit SNR changes. Figure 7 simulates the average data rate of the selected near users when is 10 m and 30 m, respectively, and Figure 8 compares the average data rate of the selected near users at a low speed of 1 m/s and at a high speed of 20 m/s.


As shown in Figure 7, when is 10 m, our proposed strategy has a great advantage over the strategy without the DL method under imperfect CSI. The improvement in the average data rate is so significant that the proposed strategy is not far behind even the strategy with perfect CSI. When the transmit SNR increases, the average data rate of near users also increases, and the proposed strategy maintains its superiority. We change the random initial distribution of near users, and the performance advantage remains the same. To verify the performance of each strategy under a poor channel state, we add and randomly deploy near users, and simulate the average data rate under this circumstance. It is obvious from Figure 7 that the average data rate of the selected users drops dramatically. Due to the large distances between users, the channel quality is poor; thus, high power is required for successful communication and the average data rate is low. Although the channels are very bad, the proposed strategy still performs well and is even closer to the perfect-CSI strategy than in the good channel state. This benefits from the advantages of DL: the more distinct the features are, the better the fitting effect is, and it is easier to find a better choice in a poor communication environment.
The moving speeds of the near users are 1 m/s and 20 m/s in Figure 8, respectively. As shown in Figure 8, the proposed scheme still has advantages, and the curve trend remains unchanged when the speed changes, which proves that the proposed learning strategy adapts well to user mobility. Comparing the performance at low and high speed, the average data rate at low speed is higher than that at high speed, which is consistent with intuition. According to the above simulation results, we conclude that the proposed DL-based user selection maintains a good performance advantage in average data rate regardless of the speed of the near users.
5.3. Outage Performance and Diversity Gain of the Proposed User Selection Scheme
In this subsection, we simulate the system outage performance and discuss the diversity gain. The analytical result in (19) is validated in Figure 9: when , the system outage probability of the strategy without DL under imperfect CSI approaches a constant, which means the diversity gain is zero. The outage performance of our proposed DL-based user selection scheme is also shown in Figure 9. Obviously, the proposed scheme effectively reduces the system outage probability. Although full diversity is not obtained, the proposed scheme successfully recovers the diversity gain, which is no longer 0 under imperfect CSI. In addition, the system outage probability under perfect CSI is studied in [33], which achieves full diversity. The theoretical analysis results of the two contrasted strategies are consistent with the Monte Carlo simulations in Figure 9.
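For illustration, a Monte Carlo estimate of the system outage probability under imperfect CSI can be obtained with the short sketch below, which reuses the select_near_user sketch from Section 3.1 and declares an outage when no near user can meet the far user's QoS. The fixed per-hop distances, Rayleigh fading model, and parameter names are our simplifying assumptions; this is not the code behind Figure 9.

```python
import numpy as np

def outage_probability_mc(num_users, d_hop1, d_hop2, path_loss_exp,
                          sigma_e2, rho, rate_target_far,
                          trials=100_000, seed=1):
    """Monte Carlo estimate of the system outage probability under imperfect CSI."""
    rng = np.random.default_rng(seed)
    outages = 0
    for _ in range(trials):
        # Estimated Rayleigh-fading channels scaled by distance-based path loss
        h_hat = (rng.normal(size=num_users) + 1j * rng.normal(size=num_users)) \
            / np.sqrt(2.0) * d_hop1 ** (-path_loss_exp / 2.0)
        g_hat = (rng.normal(size=num_users) + 1j * rng.normal(size=num_users)) \
            / np.sqrt(2.0) * d_hop2 ** (-path_loss_exp / 2.0)
        if select_near_user(h_hat, sigma_e2, g_hat, sigma_e2, rho, rate_target_far) is None:
            outages += 1                 # outage: the effective near-user set is empty
    return outages / trials
```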

As shown in Figure 10, as the channel estimation error decreases, the system outage probability decreases and gradually approaches the system outage probability under perfect CSI. Note that a higher transmit SNR is needed for the outage probability of the user selection under imperfect CSI to tend to a constant value when the estimation error decreases. However, because the channel estimation error still exists, the user selection without the DL method under imperfect CSI never has a diversity gain, even if the error is small. For our proposed user selection scheme, a smaller estimation error helps the network obtain a higher diversity gain. Thus, our proposed scheme has great advantages over the user selection without the DL method.

6. Conclusion
In this paper, a DL-based user selection scheme for cooperative NOMA is proposed, and its performance is thoroughly investigated. Different from most studies, this paper considers the NOMA system with imperfect CSI, analyzes the outage performance, and designs a dedicated neural network to accurately identify CSI. In particular, the proposed scheme improves the outage performance and recovers the diversity gain, showing the superiority of the scheme. We believe that the DL method has more room for development in the domain of NOMA research.
Data Availability
The simulation data used to support the findings of this study are included within the article.
Conflicts of Interest
The authors declare that there are no conflicts of interest.
Acknowledgments
The work of Guoxin Li was supported in part by the National Key Research and Development Program of China under Grant 2018YFB1801103, in part by the National Science Foundation of China under Grant 62101595, and in part by the Jiangsu Province Natural Science Foundation under Grant BK20200580.