Abstract

To address the problems of severe data sparsity and the lack of negative samples in most point-of-interest (POI) recommendation methods, a POI recommendation method based on deep learning in location-based social networks is proposed. First, a bidirectional long short-term memory (Bi-LSTM) attention mechanism is designed to assign different weights to different parts of the current sequence according to users' long-term and short-term preferences. Then, the POI recommendation model is constructed: the sequence state data from the encoder are fed into Bi-LSTM-Attention to obtain the attention representation of the current POI check-in sequence, and the Top-N recommendation list is generated after decoder processing. Finally, a negative sampling method is proposed to obtain an effective negative sample set, which is used to improve the calculation of the Bayesian personalized ranking loss function. The proposed method is evaluated experimentally on the Foursquare and Gowalla datasets. The results show that it achieves better precision, recall, and F1 scores than the comparison methods.

1. Introduction

With the maturity of internet technology and the widespread application of global satellite positioning systems, location-based social networks (LBSNs) have gradually attracted increasing attention and research. Recommending potential future points of interest (POIs) to users by analyzing their historical check-ins has become a research hotspot in this field [1]. Personalized POI recommendation tasks can be divided into two types according to the recommendation basis. The first is static POI recommendation, whose basis is limited: it only collects the POIs that the user has checked in to and does not consider the context surrounding each check-in [2, 3]. The second is context-aware POI recommendation, which uses the user's check-in behavior together with the user's contextual scene to dynamically learn the user's preferences in a given scene and generate accurate, timely, and personalized recommendations [4, 5]. By contrast, static POI recommendation focuses on learning users' static preferences and cannot account for users' dynamic, real-time needs. Therefore, context-aware POI recommendation has been the core of research in recent years [6].

At present, the POI recommendation problem is more complicated than classic news, movie, or online commodity recommendation [7]. The first difficulty is severe data sparsity: among the massive number of POIs provided by an LBSN, those actually visited by any given user are an almost negligible fraction, resulting in extremely sparse datasets [8]. Second, there are three types of cold-start problems commonly encountered in recommendation tasks: a location that has never been visited is called a cold-start POI; users who have never visited any location are called cold-start users; and users who move to an unfamiliar place to live or travel also encounter cold-start problems [9, 10]. Finally, there is the problem of dynamic user preferences, that is, user preferences change with the passage of time and with the environment. Therefore, analyzing user preferences from check-in data faces great difficulties and challenges [11, 12].

In order to solve the above-mentioned problems of dynamic user preferences and data sparsity, a POI recommendation method based on deep learning in LBSNs is proposed. Compared with traditional POI recommendation methods, its innovations are summarized as follows:
(1) Traditional recommendation methods do not address temporal dynamics, and existing deep recommendation methods rarely consider attention. To this end, the proposed method (Bi-LSTM-Attention) embeds a user-location cross-attention layer in the neural network to capture effective contextual information.
(2) To better capture users' long-term and short-term preferences in POI check-in sequences, the proposed method introduces a Bi-LSTM-Attention mechanism. The user's long-term and short-term preferences are mined from the user's historical and current POI check-in sequences, which alleviates the inaccurate recommendations caused by dynamic user preferences.
(3) To address the lack of valid negative samples in the dataset, the proposed method combines a popularity sampling weight and a distance sampling weight in a weighted combination. The negative samples obtained in this way are used for model learning, which improves the effectiveness of POI recommendation.

The rest of this paper is arranged as follows. The second section introduces related research progress in this field. The third section presents the proposed POI recommendation method based on the Bi-LSTM-Attention model. The fourth section verifies the feasibility and superiority of the proposed method through experimental comparison with current POI recommendation methods. The fifth section concludes this paper.

2. Related Work

In POI recommendation, the user's check-in record contains only positive samples, negative sample information is lacking, and the data are sparse; therefore, recommending a location is challenging [13]. Early POI recommendation simply recommended the most popular Top-N points of interest to users. Although this provides some guidance for the user's choice, it lacks personalization and its accuracy is low. Current personalized POI recommendation systems are built on the early algorithms by fusing various kinds of characteristic information, including user check-in behavior, social relationships between users, location evaluation information, location description information, location latitude and longitude, and temporal context, thereby improving recommendation accuracy [14, 15].

Among existing research results, most POI recommendation methods use traditional collaborative filtering algorithms. For example, Reference [16] proposed a three-layer-network POI recommendation method based on LBSNs, in which tags and social and geographic information are modeled separately and integrated into a matrix factorization framework. This method is more accurate than other methods, but its overall recommendation performance is still slightly insufficient. Reference [17] proposed a POI recommendation model based on the semantics of users' contextual behavior. Metapaths in a heterogeneous information network are used to represent the complex semantic relationships between users and POIs, and the recommendation results are ranked by a learning-to-rank fusion method; the results show that its recommendation performance is acceptable in simple scenarios. The above POI recommendation models all fuse contextual information to improve accuracy, but how to better integrate contextual information into the model remains a difficult open problem [18]. Reference [19] proposed a new POI recommendation framework based on hierarchical category transition, in which category transitions at different levels model users' preference transitions at different granularities; it effectively alleviates the cold-start problem, but its accuracy is poor. Reference [20] recommends POIs by integrating user preferences for geographic, category, and attribute criteria with personalized weights, and merges the opinions of similar users under the three criteria through collaborative filtering. This method has great advantages in processing large-scale data, but its accuracy and efficiency need to be improved.

Owing to the powerful information processing capabilities of deep neural networks, many researchers have applied them to recommendation algorithms in recent years. Reference [21] proposed a real-time POI embedding model that mines real-time geographic information and learns latent information from the corresponding geo-tagged posts; a convolutional neural network is used to mine the textual information of POIs and learn their internal associations. Although the model generalizes well, it is less efficient on large-scale data. Reference [22] uses a recurrent neural network (RNN) for next-POI recommendation. It considers not only the location interests of similar users but also contextual information such as time, current location, and friends' preferences; its accuracy and efficiency are therefore relatively good, but the learning ability of a single RNN still needs to be improved. Reference [23] combined stacked denoising autoencoders (SDAE) and Bi-LSTM and proposed a new neural network model for context-aware citation recommendation, using the attention information in the citation context to extend SDAE and enhance its learning ability; however, the huge amount of data affects recommendation efficiency. Reference [24] proposed a latent-factor-based successive POI recommendation method built on an RNN, which integrates sequential POI visits and user preferences. Although this method is more accurate than other successive POI recommendation methods, there is still room for improvement in extracting latent information.

3. POI Recommendation Based on Bi-LSTM-Attention Model

3.1. Problem Definition

For the POI recommendation problem, define $U$ as the user set, $V$ as the POI set, and $C$ as the label set, and let $\mathbf{u}$, $\mathbf{v}$, and $\mathbf{c}$ denote the $d$-dimensional vectors of user $u$, point of interest $v$, and label $c$, respectively. Each point of interest $v$ has its longitude and latitude information and a label collection $C_v$. Each user $u$ is bound with a label set $C_u$, a friend list $F_u$, and a historical visit record $H_u = \{(v, t)\}$, where $(v, t)$ indicates that user $u$ visited point of interest $v$ at time $t$. The historical behavior sequences of all users are denoted as $H = \{H_u \mid u \in U\}$. Given the user's historical behavior records and related friend information, the task of the model is to predict the points of interest that the user is most likely to visit next.

3.2. Attention Layer Model

The attention model was originally used in machine translation and has since become an important concept in neural networks. In artificial intelligence, attention has become an important component of neural network architectures and is widely applied in natural language processing, statistical learning, speech, and computer vision [25]. For the POI recommendation problem, only certain locations in the user's historical sequence may be related to the next location to be predicted. The attention mechanism can therefore be used to capture the information that helps judge the next POI and to ignore unimportant information. The principle of attention is to calculate the matching degree between the current input sequence and the output vector: a high matching degree means the relative attention score is also high [26]. The matching-degree weights calculated by attention are limited to the current sequence, rather than being the global weights obtained by traditional neural network models. Attention is mainly used in natural language processing and is rarely applied to POI recommendation. Therefore, a user-location cross-attention layer is embedded in the neural network, and this attention mechanism is applied to next-POI recommendation. The attention layer structure is shown in Figure 1.

Among them, $t_1, \ldots, t_n$ and $l_1, \ldots, l_n$ are the time and location sequences of user $u$ from step $1$ to step $n$, and $\bar{u}$ is the output vector of user $u$ after the embedding layer and linear hidden layer. $a_1$ to $a_n$ are the attention weights of the location output vectors, indicating their importance for the model's prediction; they are obtained from the inner product of the transposed location output vectors and the user vector. The $+$ operation denotes position-wise addition of the location vectors, and $\langle \cdot \rangle$ denotes vector concatenation. The calculation expression is as follows:
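One plausible formulation of this cross-attention, assuming $h_1, \ldots, h_n$ denote the location output vectors corresponding to $l_1, \ldots, l_n$, is

$$e_j = \bar{u}^{\top} h_j, \qquad a_j = \frac{\exp(e_j)}{\sum_{k=1}^{n} \exp(e_k)}, \qquad o = \Big\langle \sum_{j=1}^{n} a_j h_j,\ \bar{u} \Big\rangle,$$

where the inner product $e_j$ measures the matching degree between the user vector and the $j$-th location vector, the softmax normalizes the matching degrees into the attention weights $a_1$ to $a_n$, and $o$ is the final output of the attention layer.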

3.3. Model Framework

In order to better capture the user's long-term and short-term preferences in the POI check-in sequence, Bi-LSTM-Attention is introduced into the decoder. It mines the user's long-term preferences from all of the user's historical POI check-in sequences while focusing on the short-term preferences in the current POI check-in sequence, and it allows the decoder to assign different weights to different parts of the current sequence according to the user's long- and short-term preferences. The structure of the POI recommendation model with this attention mechanism is shown in Figure 2.

In Figure 2, the encoder represents the encoder structure excluding the dropout layer. Assume that the user has $m$ POI check-in sequences in total, and the $m$-th is the current POI check-in sequence. The overall representations of the user's first $m-1$ historical POI check-in sequences are denoted as $s_1, \ldots, s_{m-1}$. The hidden state obtained at each moment $t$ of the user's current sequence is denoted as $h_t$. $c$ is the attention representation of the user's current sequence, and $\hat{y}$ represents the score predictions for all candidate items.

In Bi-LSTM, the output at the current moment depends not only on the preceding elements of the sequence but also on the subsequent ones. The combination of the two LSTM units fully considers the temporal information before and after each position in the sequence. The model structure is shown in Figure 3.

In Figure 3, $w$ represents the weight from one unit layer to another. $x_i$ represents the feature vector obtained after deep feature extraction in the preceding layer. $\overrightarrow{h_i}$ represents the LSTM unit that processes the feature sequence $x_i$ from front to back, and $\overleftarrow{h_i}$ represents the LSTM unit that processes the feature sequence from back to front. $y_i$ represents the corresponding output after the feature vector passes through the Bi-LSTM network. The mathematical calculation process of Bi-LSTM is as follows:
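A standard Bi-LSTM formulation consistent with the symbol descriptions above and below is

$$\overrightarrow{h_i} = \mathrm{LSTM}_{fw}\!\left(x_i, \overrightarrow{h_{i-1}}\right), \qquad \overleftarrow{h_i} = \mathrm{LSTM}_{bw}\!\left(x_i, \overleftarrow{h_{i+1}}\right), \qquad y_i = \frac{1}{2}\left(\overrightarrow{h_i} + \overleftarrow{h_i}\right) + b_i,$$

where the forward and backward LSTM units process the feature sequence in opposite directions, and their outputs at the corresponding time are added and averaged (plus a bias) to give the output feature vector $y_i$.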

In the formula, $b_i$ is the bias of the hidden unit for the $i$-th feature vector in the Bi-LSTM network. $\overrightarrow{h_i}$ and $\overleftarrow{h_i}$ are the results of the two LSTM units processing the feature vector output by the preceding layer at the corresponding time. The result of adding and averaging these two vectors at the corresponding time is used as the output feature vector $y_i$. Finally, the feature vector is fed into the attention mechanism to learn the network weights [27].

The proposed model first puts the user's historical POI check-in sequences into the encoder to encode each sequence as a whole. At the same time, the encoder is used to encode the hidden state at each moment of the user's current sequence. These data are then input into Bi-LSTM-Attention to obtain the attention representation $c$ of the current POI check-in sequence. $c$ is calculated as follows:
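One plausible form of this step, assuming the matching score between each current-sequence hidden state $h_t$ and a summary $s$ of the historical sequence representations $s_1, \ldots, s_{m-1}$ is computed as in the attention layer of Section 3.2, is

$$\alpha_t = \frac{\exp\!\left(\mathrm{score}(h_t, s)\right)}{\sum_{k} \exp\!\left(\mathrm{score}(h_k, s)\right)}, \qquad c = \sum_{t} \alpha_t h_t,$$

so that hidden states reflecting the user's long-term preferences receive larger weights in the attention representation $c$ of the current sequence.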

After calculating $c$, it is spliced with the corresponding sequence representation to obtain the combined vector $z$. To prevent overfitting, $z$ is first passed through the dropout layer, and the score of each candidate POI is then obtained through the fully connected layer and the nonlinear activation function tanh. Finally, the candidate POIs are sorted according to their scores, and a Top-N recommendation list is generated.
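A plausible form of this scoring step, writing $z$ for the spliced vector and $\hat{y}$ for the vector of candidate-POI scores, is

$$\hat{y} = \tanh\!\left(W_o \cdot \mathrm{dropout}(z) + b_o\right),$$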

where $W_o$ is the weight matrix from the splicing layer to the output layer and $b_o$ is the bias.

3.4. Loss Function and Optimization Algorithm

The basic task of a recommendation system is to rank the items in the system according to a series of related information. Although this task can be regarded as a classification task, it is essentially a ranking task [28, 29]. The proposed method treats the generation of the Top-N recommendation list as a ranking task, with the specific aim of making the POI that the user actually checks in to next rank higher in the generated recommendation list. In order to improve the ranking performance of the model, the Bayesian personalized ranking (BPR) loss is considered [30, 31]. BPR is defined as follows:
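A standard form of the BPR loss over the sampled negatives, with $\sigma$ the sigmoid function and the remaining symbols defined below, is

$$L_{\mathrm{BPR}} = -\frac{1}{N_S} \sum_{j=1}^{N_S} \log \sigma\!\left(\hat{r}_i - \hat{r}_j\right),$$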

where $\hat{r}_i$ represents the predicted score of the positive target $i$, $\hat{r}_j$ represents the predicted score of the $j$-th negative sample, and $N_S$ is the number of negative samples.

In order for the model to learn more effectively, it is desirable to use more informative negative samples, which make BPR produce larger gradients. A worthless negative sample produces a gradient close to zero. Because of the averaging operation in the BPR calculation, when there are many worthless negative samples the gradient of the loss function becomes very small; that is, the gradient vanishes and learning can no longer continue [32, 33]. To avoid this problem of BPR, the ranking loss function BPR-max is used to measure the prediction error of the model. BPR-max is calculated as follows:
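Following the BPR-max formulation from the session-based recommendation literature, with the symbols defined below, the loss can be written as

$$L_{\mathrm{BPR\text{-}max}} = -\log \sum_{j=1}^{N_S} s_j\, \sigma\!\left(\hat{r}_i - \hat{r}_j\right) + \lambda \sum_{j=1}^{N_S} s_j\, \hat{r}_j^{\,2},$$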

where $s_j$ represents the likelihood that negative sample $j$ is similar to the target; it is obtained by applying the softmax function to the prediction scores output by the model, and $\lambda$ is the regularization coefficient.

For negative samples with low prediction scores, repeated learning is considered ineffective, so they are given a lower weight in the calculation [34]. At the same time, to prevent the predicted scores from becoming too extreme, BPR-max adds a regularization term on the predicted scores. These measures effectively solve the gradient-vanishing problem of the BPR loss function and improve the performance of the model [35].
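As an illustration only (a minimal sketch, not the authors' implementation), this loss could be written in PyTorch as follows:

```python
import torch

def bpr_max_loss(pos_score, neg_scores, lam=1.0):
    """BPR-max ranking loss: softmax weights over the sampled negatives
    down-weight uninformative negatives, and a score regularizer keeps
    the predicted scores from becoming too extreme.

    pos_score  : tensor of shape (batch,)              - score of the positive POI
    neg_scores : tensor of shape (batch, n_negatives)  - scores of the sampled negatives
    lam        : regularization coefficient
    """
    s = torch.softmax(neg_scores, dim=1)                       # weight per negative sample
    diff = torch.sigmoid(pos_score.unsqueeze(1) - neg_scores)  # sigma(r_i - r_j)
    ranking = -torch.log(torch.sum(s * diff, dim=1) + 1e-24)   # weighted BPR term
    reg = lam * torch.sum(s * neg_scores ** 2, dim=1)          # score regularizer
    return torch.mean(ranking + reg)
```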

Since implicit feedback data often contain only positive feedback, and negative feedback is usually missing or hard to judge, it becomes necessary to generate a corresponding negative sample for each positive sample. If a user has not checked in at several POIs, it is impossible to determine explicitly which of those POIs the user skipped because he or she did not like them. Only POIs that users genuinely dislike allow the model to learn effective user preferences; invalid negative samples add noise to the data and harm the learning effect of the model. At the same time, the number of items in a recommendation system is usually very large, and it is unrealistic to compute a score for every item: the computation would keep expanding as the product of the number of items and the number of user-item interactions grows, making the system unusable, and the large number of calculations would also degrade the accuracy of the final result. For these reasons, effective negative sampling during model learning not only makes the training process more efficient but also reduces the amount of computation.

The proposed method also takes into account the popularity of the target POI and the geographic distance between the target POI and the POI of the user's most recent check-in, and a negative sampling method better suited to the characteristics of the POI recommendation task is proposed. The sampling method first calculates the place-popularity sampling weight from the check-in counts of the POIs. Then, the distance sampling weight of all POIs is calculated from the distance matrix between the POI the user checks in to at time $t$ and all other POIs. The specific calculation is as follows:
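One plausible instantiation of these two weights, assuming the popularity weight is a simple ratio of check-in counts and the distance weight decays exponentially (the exact decay form is an assumption), is

$$w^{pop}_{v} = \frac{n_v}{n_{\max}}, \qquad w^{dist}_{v} = \exp\!\left(-\frac{d(l_t, v)}{\rho}\right),$$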

where $n_v$ is the number of check-ins of POI $v$, $n_{\max}$ is the number of check-ins of the most popular POI, $d(l_t, v)$ is the distance between the POI $l_t$ that the user checked in to at time $t$ and the target POI $v$, and $\rho$ is the adjustment parameter.

Then, the popularity sampling weight and the distance sampling weight are combined in a weighted manner to obtain the overall sampling weight of each POI. The calculation is as follows:
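Using $\alpha$ and $\beta$ for the adjustment factors defined below, the combined sampling weight can be written as

$$w_v = \alpha\, w^{pop}_{v} + \beta\, w^{dist}_{v},$$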

where $\alpha$ and $\beta$ are the adjustment factors of the two weights, and $\alpha + \beta = 1$.

Finally, according to the sampling weights, negative sampling is performed over all POIs except the last POI that the user checks in to, yielding the set of negative samples used to calculate the loss function.
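Purely as an illustration (function and parameter names are hypothetical, and the exponential distance decay follows the assumption above), a NumPy sketch of this weighted negative sampling might look like:

```python
import numpy as np

def sample_negatives(checkin_counts, dist_to_last, positive_poi,
                     alpha=0.5, beta=0.5, rho=1.0, num_samples=10, rng=None):
    """Draw negative POIs with probability proportional to the combined
    popularity / distance sampling weight.

    checkin_counts : array of check-in counts per POI (popularity statistics)
    dist_to_last   : array of distances from the user's last check-in to each POI
    positive_poi   : index of the POI the user actually visits next (excluded)
    alpha, beta    : adjustment factors of the two weights (alpha + beta = 1)
    rho            : distance decay parameter (assumed exponential decay form)
    """
    if rng is None:
        rng = np.random.default_rng()
    w_pop = checkin_counts / checkin_counts.max()   # popularity weight, in [0, 1]
    w_dist = np.exp(-dist_to_last / rho)            # closer POIs get larger weight
    w = alpha * w_pop + beta * w_dist               # weighted combination
    w[positive_poi] = 0.0                           # exclude the positive POI
    probs = w / w.sum()
    return rng.choice(len(probs), size=num_samples, replace=False, p=probs)
```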

4. Experiment and Analysis

4.1. Dataset

The experiments use two public datasets: Foursquare and Gowalla. In order to show the spatial distribution of user check-in data more clearly, the geographic distributions of the Foursquare and Gowalla datasets are shown in Figure 4.

As can be seen from Figure 4, the check-in points of interest in the two datasets exhibit different geographic distribution characteristics. The spatial distribution of user check-ins in the Foursquare dataset is relatively scattered, whereas the check-ins in the Gowalla dataset are concentrated around several distinct centers, reflecting differences in user activity patterns across geographic regions. In addition, the check-in sequences of several users were randomly selected from the two datasets and visualized on the map, as shown in Figure 5.

It can be seen from Figure 5 that the movements of different users present different geographic distribution patterns, showing that there are individual differences in the activity patterns and spatial preferences of different users. For example, in Figure 5(a) the geographic distances between a user's consecutive check-ins are large and scattered, whereas in Figure 5(b) they are relatively small and concentrated. This further illustrates that different users weigh geographic distance differently. In addition, viewed across the two datasets as a whole, users prefer to visit points of interest close to them, and the visited points of interest show geographic clustering. Therefore, the POI recommendation task should use the geographic distance between consecutive check-ins to model each user's personalized spatial preferences.

For the two check-in datasets, in order to alleviate the sparseness of the check-in data, sparse users who visited fewer than 5 points of interest were removed, and unpopular points of interest visited by fewer than 5 users were also removed. After this preprocessing, the statistics of the Foursquare and Gowalla datasets are shown in Table 1.

It can be seen from Table 1 that the user check-in data are very sparse. The average number of points of interest visited per user is less than 1% of the total number of points of interest in each dataset, and the average number of visitors per point of interest is also less than 1% of the number of users in the respective dataset. Therefore, data sparsity is a big challenge for the POI recommendation task. With limited historical check-in data, fusing spatiotemporal context information can not only improve the accuracy of POI recommendation but also greatly alleviate the data sparsity problem.

4.2. Evaluation Index

To accurately evaluate the performance of the proposed method on the next-POI recommendation task, precision is used as a measure, calculated as follows:
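Writing $p_u$ for the positive sample of user $u$ and $S_u$ for the set of the $N$ highest-scoring candidates described below, the precision can be expressed as

$$\mathrm{Precision@}N = \frac{1}{|U|} \sum_{u \in U} \mathbb{1}\!\left(p_u \in S_u\right),$$

where $\mathbb{1}(\cdot)$ is the indicator function.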

Each user has one positive sample and multiple negative samples in the test set. Among them, $p_u$ denotes the positive sample of user $u$ in the test set. Both positive and negative samples receive a score, since the model predicts a score for the user's next destination. $S_u$ denotes the set of the $N$ POI locations with the highest scores predicted by the model over the positive and negative samples of user $u$ in the test set. If the user's positive sample is in this set, the indicator is recorded as 1, meaning that the model predicts the user's POI location correctly; otherwise, it is recorded as 0. $|U|$ denotes the total number of users. Generally, the larger the value of Precision@$N$, the higher the prediction accuracy of the model.

For the Top-$N$ ranking problem of point-of-interest recommendation, the two most commonly used evaluation indicators are adopted, namely, recall and F1-score, which are also common evaluation indicators in machine learning and data mining. The two indicators are abbreviated as Recall@$N$ and F1@$N$, respectively. For a given user set $U$ and any user $u \in U$, the set of points of interest that the user has visited in the test set is $T_u$, and the set of points of interest recommended to the user by the algorithm is $R_u$. Recall@$N$ and F1@$N$ are then calculated as follows:
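Using the sets $T_u$ and $R_u$ defined above, the two indicators take their standard forms

$$\mathrm{Recall@}N = \frac{1}{|U|} \sum_{u \in U} \frac{|R_u \cap T_u|}{|T_u|}, \qquad \mathrm{F1@}N = \frac{1}{|U|} \sum_{u \in U} \frac{2\,|R_u \cap T_u|}{|R_u| + |T_u|},$$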

where $T_u$ is the set of label (ground-truth) points of interest of user $u$ in the test set.

4.3. Parameter Discussion

In the two datasets, the optimal embedding dimension and spatial window width are kept fixed. By setting different moving sequence lengths {10, 20, …, 100}, the changes of Bi-LSTM-Attention under the Recall@5 and Recall@10 evaluation indicators are observed. The effect of different moving sequence lengths on the performance of Bi-LSTM-Attention is shown in Figure 6.

It can be seen from Figure 6 that, for both Recall@5 and Recall@10, Bi-LSTM-Attention obtains the best recommendation results on the Foursquare dataset when the sequence length is set to 80, and on the Gowalla dataset when the sequence length is set to 60. The different results on the two datasets may be caused by the difference in the number of user check-ins between them.

4.4. Performance Comparison with Other Methods

To demonstrate the recommendation performance of Bi-LSTM-Attention, multiple experiments were carried out using the Precision@2, Precision@4, Precision@6, and Precision@8 indicators, and the results were compared with those of References [16, 17, 23, 24]. Note that the length of each user's visit sequence differs, so only the part of the test data with sequence length greater than or equal to $N$ is used; therefore, the larger $N$ is, the smaller the number of test samples. The experimental results on the two datasets are shown in Figure 7.

It can be seen from Figure 7 that, on the Foursquare and Gowalla datasets, the recommendation performance of Bi-LSTM-Attention is better than that of the other comparison methods. At Precision@2, its accuracy on the two datasets is 0.045 and 0.038, respectively. This is because the attention layer is integrated into the Bi-LSTM model, so different parts of the current sequence are given different weights according to the user's long-term and short-term preferences, thereby improving the accuracy of POI prediction. Moreover, the advantage of Bi-LSTM-Attention over the other methods is more obvious on Foursquare than on Gowalla: because the Foursquare dataset has a shorter average sequence length than the Gowalla dataset, the use of additional nonsequential information is more helpful for improving accuracy. This shows that Bi-LSTM-Attention handles the cold-start problem better.

Similarly, the Recall@$N$ comparison results of the different methods on the Foursquare and Gowalla datasets are shown in Figure 8.

It can be seen from Figure 8 that the recall value continues to increase as $N$ increases, and the Recall@$N$ values of Bi-LSTM-Attention on the two datasets are better than those of the other comparison methods. When $N$ is 20, the Recall@$N$ values of Bi-LSTM-Attention are 0.33 and 0.36, respectively. This is because it uses the Bi-LSTM-Attention mechanism to increase the weight of important location information and combines a variety of context-aware information, which greatly improves recommendation performance. Reference [16] integrates additional label, social, and geographic information into a matrix factorization framework; this achieves reasonable accuracy, but its overall recommendation performance is still slightly insufficient. Reference [17] performs POI recommendation based on the semantics of users' contextual behavior, but the method is relatively traditional, so the recommendation effect is not ideal. Reference [23] combines SDAE and Bi-LSTM for context-aware location recommendation, and Reference [24] uses an improved RNN for next-POI recommendation; both outperform a plain RNN. Taking the Foursquare dataset as an example, their Recall@20 values are 0.32 and 0.34, respectively. However, they lack consideration of the user's long-term and short-term location preferences, so their overall recommendation performance is still lacking.

In addition, the F1@$N$ comparison results of the different methods on the Foursquare and Gowalla datasets are shown in Figure 9.

It can be seen from Figure 9 that Bi-LSTM-Attention achieves the best recommendation effect on both the Foursquare and Gowalla datasets. In particular, Bi-LSTM-Attention is significantly better than the traditional recommendation methods of References [16, 17]. Compared with Reference [17], it achieves average improvements of 51.58%, 47.24%, 57.12%, and 48.35% under the F1@5, F1@10, F1@15, and F1@20 indicators on the two datasets, respectively. At the same time, Bi-LSTM-Attention is also significantly better than the other two deep-neural-network-based methods. Compared with Reference [23], it achieves average improvements of 11.24%, 23.17%, 37.18%, and 33.34% under the F1@5, F1@10, F1@15, and F1@20 indicators, respectively.

In summary, Bi-LSTM-Attention is effective for the point-of-interest recommendation task, mainly because it integrates spatiotemporal context information more effectively, thereby modeling the user's sequential preferences and personalized spatiotemporal preferences more accurately and alleviating the data sparsity problem to a certain extent.

5. Conclusion

Point-of-interest recommendation, as an important branch of location-based social networks, can help users find the personalized locations that interest them most. However, POI recommendation is affected by issues such as data sparsity and dynamic changes in user preferences, and achieving high-accuracy recommendation is challenging. For this reason, a POI recommendation method using deep learning in LBSNs is proposed. The user's historical and current POI check-in sequences are fed into the encoder to encode the sequences as a whole, and these data are input into Bi-LSTM-Attention to obtain the attention representation of the current POI check-in sequence; this is then processed by the decoder to generate a Top-N recommendation list. Finally, Bi-LSTM-Attention is evaluated on the Foursquare and Gowalla datasets. The results show that its precision, recall, and F1 values are better than those of the other comparison methods, so it has better recommendation performance. In addition, the attention layer integrated into the Bi-LSTM model gives different weights to different parts of the current sequence according to the user's long-term and short-term preferences, further improving recommendation accuracy.

Since the input time series of Bi-LSTM-Attention covers only part of the user's historical POI locations, future work will increase the sequence length to improve the generality of Bi-LSTM-Attention. Moreover, although Bi-LSTM-Attention alleviates the cold-start problem to a certain extent, the improvement is limited; new ways to improve model performance will be explored in the future.

Data Availability

The data included in this paper are available without any restriction.

Conflicts of Interest

The authors declare that there is no conflict of interest regarding the publication of this paper.

Acknowledgments

This work was supported by the Natural Science Foundation of China (No. 71673220) and the Special Scientific Research Plan of Education Department of Shaanxi Provincial Government-Humanities and Social Sciences (No. 20JK0232).