Abstract
With the rapid popularization and development of location-based services (LBS), location privacy preservation has become a mainstream concern. Existing anonymity-based methods mainly select anonymous locations using geographic information alone, which neglects the semantic information of a location, and the behavior of self-interested users may introduce a privacy vulnerability. We propose a blockchain-based location privacy protection scheme to address these issues. In this scheme, the privacy level of a location is improved by combining private chain-based collaborative anonymous communication with an anonymous set construction method that considers the semantic diversity associated with the user. Evaluation on real-world datasets shows that, compared with existing methods, the proposed method prevents location privacy leakage through collaborative users and constructs a semantically diverse anonymity set, thus effectively protecting the user’s location privacy.
1. Introduction
As one of the most popular services in recent years, location-based services (LBS) have enriched people’s lives. With LBS applications, users can quickly access nearby services, such as nearby friend search [1] and nearby food recommendation [2]. To provide these conveniences, LBS requires users to submit their current precise location to the LBS provider. Since this location information reflects users’ personal information, an untrusted LBS server can easily infer who is doing what, which threatens user privacy. Protecting user privacy within LBS is therefore both necessary and challenging.
To address the privacy issue in LBS, many approaches have been proposed over the past several years. They are classified as obfuscation [3, 4], offset [5, 6], and dummy [7, 8] methods according to their means of protection. However, these approaches deviate from the original location to achieve privacy protection, which usually degrades the quality of service provided by the service provider. Because of this trade-off between privacy and usability, the k-anonymity approach has been intensively studied in recent years. The k-anonymity approach generates a set of anonymous locations to submit to the LBS server, which prevents the server from identifying the real location. Reference [9] first proposed a k-anonymity solution for location privacy, in which the real location is generalized together with selected adjacent locations. Recently, most k-anonymity research has focused on the security of anonymous servers [10–13] and on improving the quality of anonymous locations under different realistic constraints [14, 15].
However, most existing k-anonymity-based location privacy protection methods depend on third-party servers, whose performance and security bottlenecks make them unreliable and result in privacy vulnerabilities. To address this issue, k-anonymity approaches that do not need a trusted third party have been proposed [16–23]. In these methods, the requesting user negotiates directly with surrounding collaborating users and protects the real location by generating an anonymity set that contains the collaborators’ anonymous locations. Most of these studies have focused on improving the efficiency and privacy level of collaboration.
Undeniably, these schemes have further improved the theory and practice of privacy preservation in distributed architectures and provided better protection for users. However, some unsolved issues cannot be neglected.
First, these approaches always assume that collaborative users are honest and trustworthy, an assumption that rarely holds in reality. Self-interested collaborative users can reveal the real location information of other users that they obtained during the collaborative anonymization process and gain material benefits from doing so. Although the features of blockchain technology (e.g., anonymity, immutability) offer some promise for a better solution to this issue, the direct application of blockchain is still not sufficient [24, 25].
Second, semantic information brings a new challenge to the anonymity set. Exploiting semantic information, which reflects the deeper implicit content of transactions, can improve the accuracy of a model, but it also creates new challenges for privacy protection [26, 27]. Semantic information can help an attacker identify the true location in an anonymous set even though anonymity has nominally been achieved. In addition, simply applying the geographic semantic information of locations to anonymous location selection is not sufficient, because an attack based on user-related semantic information poses a new security threat to the anonymous set. For example, an anonymous set containing locations that the user would never stay at or visit can cause anonymity to fail. Overall, it is a challenge to ensure that collaborative users are honest and trustworthy and to provide user-related semantic privacy preservation for users’ locations while still providing quality services in LBS.
To cope with these challenges, we propose a location privacy protection mechanism for LBS based on blockchain-supported user collaboration and user-related semantic knowledge. With this scheme, users can protect both collaboration privacy and location privacy by establishing a private chain among users and by building location anonymity that meets user-related semantic diversity. Unlike existing schemes, our scheme considers the privacy implications of both the collaborative process and semantic information. With blockchain technology, a private chain for collaborative communication among users is established to avoid the privacy leakage caused by unrelated people in the collaboration process. Meanwhile, based on the semantic features related to users, a smart contract with a location description algorithm is established to automatically collect location semantic information. On this basis, semantic diversity entropy is used to construct anonymous sets, which addresses the problems of inactive participants and semantic privacy leakage. The major contributions of this paper are as follows:
(1) We propose a location privacy protection mechanism that integrates blockchain with a user-related semantic location model and improves the privacy level in multiuser collaboration and anonymity.
(2) To prevent the disclosure of the semantic privacy of locations, we propose an anonymization method based on user-related semantics, which resists attacks by selecting anonymous locations that satisfy user-related semantic diversity.
(3) Evaluations on real datasets demonstrate that our approach achieves semantic privacy with higher efficiency and utility than other algorithms.
The rest of this paper is organized as follows. Section 2 reviews related work. Section 3 introduces the motivation and system architecture. Section 4 describes the proposed method. Section 5 presents the security analysis. Section 6 evaluates the performance of the proposed scheme. Finally, Section 7 concludes the paper.
2. Related Work
In this section, we briefly review recent research on privacy protection in LBS. More precisely, we review the mainstream work related to our study.
2.1. Anonymity-Based Method
Gruteser and Grunwald first generated anonymity sets based on the random walk algorithm to enhance location privacy [9]. Recently, a series of cache-based anonymity schemes have been proposed to address the privacy leakage caused by frequent anonymous requests between user and server [10–12]. The basic idea of these schemes is to use the user’s local cache to prestore the anonymous sets the user needs and thus satisfy the user’s privacy needs. To overcome untrusted third-party servers, the solution in [14] adopts Order Preserving Symmetric Encryption (OPSE) in its frequent interactions with the server. To deal with the low hit rate of cached locations, Zhang et al. proposed an enhanced user privacy scheme that caches anonymous locations predicted from the user’s historical request locations [13]. To solve the problem of incomplete anonymous data, Yang et al. developed an anonymity scheme that uses compressed sensing and differential privacy techniques to achieve anonymous location selection [15]. However, all of these schemes ignore the new challenges posed by semantics.
As the threat of semantic attacks to privacy becomes more serious, methods for enhancing the protection of personal semantic privacy have also attracted academic attention. To enhance the semantic privacy level under physical constraints, Ye et al. [26] considered road-network semantic information when choosing anonymous locations. By combining semantics-aware information with background semantic knowledge of the underlying map, a novel anonymization approach is proposed that conceals the visited places. To prevent semantic attacks on published trajectory datasets, the authors of [27] proposed an algorithm that provides strong privacy protection against both semantic and reidentification attacks while preserving high data utility. To provide high-level privacy and utility, the authors of [28] model these issues as a multiobjective optimization problem and propose the Improved Multiobjective Particle Swarm Optimization (IMOPSO) to generate the optimal anonymous set.
2.2. Collaboration-Based Method
In user collaboration methods, peer-to-peer communication is used to select collaborating users, and their locations are used to construct anonymous regions that obscure the actual request location, thus protecting the users’ location privacy. The peer-to-peer (P2P) cloaking scheme [16, 17] forms a cloaking group of users via single-hop communication, so that a particular user’s location can be obscured without a trusted third party (TTP). In [18], query results obtained from collaborative users are cached for a period, which avoids interaction with servers and improves the privacy level. To solve the privacy leakage caused by ignoring the association between query and location, Zhu et al. proposed a scheme, QDER, which divides query information and exchanges it with random collaborative users over short-range communication [19]. To provide privacy protection under collusion attacks, Zhang et al. designed a location privacy-preserving scheme in which anonymity is achieved by integrating smart contracts and the Shamir encryption mechanism with collaborative users [20]. To enhance the trustworthiness of anonymous users, Hwang and Huang [21] and Hwang et al. [22] proposed that requesting users select collaborative users through social networks and use the collaborators’ real locations to construct the anonymity set. Yang et al. use a single-round sealed double auction mechanism to motivate users in the collaboration process, which allows multiple requesting users to obtain the real locations of collaborating users through the auction results and complete the anonymous location selection. An incentive method that introduces a reputation mechanism for collaborating users is based on the subquantification of user collaboration [23].
In this scheme, collaborating users only help requesting users with a high reputation value to construct anonymous zones, and each user’s reputation value is increased by helping other users construct anonymous zones.
3. Motivation and Architecture
3.1. Motivation
Although there has been much research on location privacy protection based on multiuser collaboration, a few issues have been overlooked. The most significant issues affecting anonymous location security are the following:
(1) Previous work on multiuser collaborative approaches to location privacy assumes that the collaborating users are trustworthy and proactive. However, this assumption rarely holds in reality. When a requesting user sends a collaboration request to neighboring users to build an anonymous set, the users exchange anonymous location requests and obtain each other’s locations in response. In this case, a suspicious user can masquerade as a trusted user and issue anonymous collaborative requests. Through multiple collaborative requests, the attacker can obtain the complete location information that the neighboring users hold and then derive the real requested locations of those users through further analysis of this information.
(2) Although the impact of the semantic features of anonymous locations on anonymity security has been considered in some centralized location privacy protection schemes [26–28], it is still lacking in distributed schemes. Moreover, directly adopting previous semantic anonymization methods can still allow attackers who possess user-related background knowledge to identify anonymous locations. For example, the anonymous set constructed in Figure 1 leads to a location privacy vulnerability: the bar and the supermarket can be identified as anonymous locations because students are unlikely to visit them. This is because the anonymous set contains anonymous locations that the user does not visit or where the user has no actual movement behavior.

This paper proposes a location anonymity technique that copes with these challenges by combining blockchain technology with user-related semantic features.
To solve the first problem, the privacy disclosure caused by untrustworthy users in user collaboration, we use blockchain. Blockchain technology is a decentralized bookkeeping system with tamper-proof and privacy-protecting features. It provides collaboration records that all users recognize, which prevents attackers from modifying their own collaboration records and makes attackers easier to detect. Specifically, the public chain is used to establish a channel for publishing privacy requests, and the private chain is used to select anonymous locations that satisfy the privacy requests for inclusion in the anonymous set, which improves the security of the anonymous set. First, the private chain is established between collaborating users to prevent the privacy leakage caused by the participation of unrelated people. Second, the smart contract technology in the blockchain records the number of times each user collaborates to complete anonymous requests and gives greater rewards to users with more anonymous collaborations, which provides an incentive for collaborating users to participate.
To solve the second problem, the exposure of semantic privacy under anonymization, we propose an anonymization method that combines smart contract technology in blockchain with user-related semantic diversity entropy. Smart contracts are a model of automatic contract execution that does not require human intervention. Since smart contracts are executed independently on the blockchain, user-related feature extraction and semantic-diversity-based selection can be implemented as smart contracts to build anonymous sets without relying on third-party organizations. This effectively avoids privacy leakage in the extraction and anonymization process while improving efficiency. With smart contract technology, we enhance the security of anonymous locations through two functions. First, a smart contract combines user-related temporal semantic features with geographic semantic features to build the description model of anonymous locations, which makes the description of location semantics more consistent with the real behavioral purpose of users and thus improves its accuracy. Second, a smart contract applies the semantic diversity entropy model to anonymous location selection, which improves the balance among different categories of semantic locations while ensuring semantic diversity and thus further improves the security of the anonymous set.
3.2. System Architecture
Figure 2 depicts the system architecture of our scheme. The architecture has three basic entities: the mobile user, the blockchain, and the LBS provider (LSP). Mobile users are of two kinds: the initial user and the collaborative user. The initial user initiates the anonymous request; the collaborative user responds to the anonymous request and provides the corresponding locations. The blockchain replaces the traditional centralized anonymous server: anonymous requests are sent and answered, and private chains are established, through the blockchain.

Based on the above analysis, our method includes the following steps:
(i) User registration: both requesters and collaborators need to register with the blockchain system. Each registered user is assigned a pair of keys, and their identity is stored in the user pool. The user’s registration information is recorded as a transaction in the blockchain’s public ledger.
(ii) Anonymity request: the initial user sends the collaboration request on the public chain network and declares the number of collaborative users required as well as the anonymity requirements.
(iii) Smart contract creation for the private blockchain: to improve the success rate of anonymity, requesters establish a smart contract that builds a private chain to reduce the privacy risk during collaborative anonymity. The contract executes automatically according to a predefined protocol when users on the public chain meet the initiator’s needs, and those collaborative users establish the private chain with the initial user.
(iv) Smart contract creation for the location semantic description model: after the private chain is established, the smart contract for the location semantic description model is executed. First, the user-related temporal and geographic semantic features are extracted by the smart contract; then, a personal location semantic description model is built by hierarchically clustering the locations the user has visited on these two features.
(v) Smart contract creation for anonymous set establishment: when users submit their privacy requirements on the chain, the smart contract that constructs the anonymous set using semantic diversity entropy is executed automatically. This contract has two functions: building anonymous candidate sets with equal historical query probabilities, and building anonymous sets that satisfy semantic diversity.
4. Our Method
4.1. Collaboration Based on Private Chains
Because the storage space of the user terminal is limited, it stores only a few of the user’s recently visited locations and anonymous sets, which makes it challenging to meet users’ privacy requirements. To address this problem, the participation of collaborative users is critical, because the anonymous sets stored by collaborative users contain information about the locations that users access. However, posting anonymous collaboration requests on public blockchains can easily disclose the locations provided by the collaborating users and the requested location of the requesting user, which creates a privacy and security vulnerability.
A private chain-based approach to collaborative anonymity is proposed to solve this problem. Instead of publishing exact anonymous requests directly on the public chain, requestors only need to publish ambiguous anonymous requests on the public chain and establish a private blockchain of collaborative anonymity between responding users who satisfy the demand. Exchanging location information on the private chain avoids theft of personal information by unrelated users. The specific implementation steps are as follows, and the pseudocode is shown in Algorithm 1. Since the highest complexity operation of Algorithm 1 is to select the users who meet the privacy requirements among users to join the private blockchain, the time complexity of the algorithm is .
Step 1. All requesting users and collaborating users need to register on the blockchain and use anonymous identities instead of real identities to complete anonymous requests and collaborations.
Step 2. The requesting users publish their privacy requirements on the blockchain network, including the anonymous area and the anonymous semantic category.
Step 3. The smart contract automatically executes the private chain establishment algorithm and stops once the number of collaborating users satisfying the privacy requirements exceeds k. Based on the execution result, it selects users to build a private blockchain that includes the requesting user and the collaborating users satisfying the anonymity requirement.
Step 4. The requesters publish their specific anonymous information such as anonymous location and anonymous demand on the private blockchain.
Step 5. Collaborators meeting the privacy requirements provide their stored location information, and the blockchain will create new blocks to record the collaborating users’ operations.
To obtain anonymous locations that satisfy anonymity security, the collaborating users added to the private blockchain need to satisfy the following constraints: (1) users who have collaboration records with the requesting user store its historical request locations, so they need to be added to the private blockchain, and (2) to avoid the privacy leakage caused by user identity-based attacks, owners of locations who have the same semantic role (such as student or teacher) need to be selected for the private chain.
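As an illustration of Steps 2 and 3 and the two constraints above, the following minimal Python sketch filters responding users and stops once more than k of them qualify. It is not the paper's Algorithm 1, and it does not run on a blockchain: the `Collaborator` and `PrivacyRequest` types, their field names, and the reading of constraint (1) as "has a prior collaboration record with the requester" are all assumptions made for illustration, as is the reading of constraint (2) as "same role as the requester".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Collaborator:
    pseudonym: str                # anonymous identity registered on chain (Step 1)
    role: str                     # semantic role, e.g. "student" (hypothetical field)
    categories: frozenset         # semantic categories of the locations the user stores
    collaborated_with: frozenset  # pseudonyms with which a collaboration record exists


@dataclass(frozen=True)
class PrivacyRequest:
    requester: str
    role: str
    k: int
    categories: frozenset         # anonymous semantic categories requested (Step 2)


def select_private_chain_members(request, responders):
    """Return the pseudonyms for the private chain, or None if too few qualify."""
    members = []
    for user in responders:
        has_record = request.requester in user.collaborated_with  # constraint (1), as assumed
        same_role = user.role == request.role                     # constraint (2), as assumed
        covers = bool(user.categories & request.categories)       # meets the published requirement
        if has_record and same_role and covers:
            members.append(user.pseudonym)
        if len(members) > request.k:  # Step 3: stop once the count exceeds k
            return [request.requester] + members
    return None
```

A single pass over the responders is enough here, which matches the observation that selecting qualifying users is the dominant cost of the procedure.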
4.2. Build Semantic Description Model of Location
To enhance the security and responsibility of anonymous location, semantic features of each location, temporal semantic features, and geographical semantic features should be considered simultaneously. For this reason, we analyze the semantic information of the location data in the blockchain and extract their semantic features in the following.
To avoid semantic privacy leakage of anonymous sets, the geographical semantic characteristics (GSF) of location should be first considered. The GSF of each location reflects the semantic information represented by the location related to the user’s basic mobile behavior. It is the basis for the description of location user-related semantic information. We first adopt the location mark on a map to describe its geographical semantic knowledge, such as hospitals and schools. Then, the GSF can be obtained by using the idea of [10] for reference to classify the geographic semantic location information and thus further improve the fine-grained and accurate description. Based on the different effects of location on users’ real life, geographic location can be divided into six semantic categories, and Geographic Semantic Information (GSI) of each category can be represented as . Then, the GSF described as where is the GSF of location and is the th class of GSI.
To further improve security regarding users’ semantic privacy, the Time Semantic Feature (TSF) should also be considered. The TSF of each location is described by the stay duration at that location, which reflects the semantic information associated with the real purpose of the user’s movement. Taking the TSF of a location into account reveals what the place actually serves as, as shown in Figure 3; for example, in the same school, a user stays longer when studying and for a shorter time when visiting. Therefore, choosing anonymous locations based on this feature avoids anonymous locations that do not match the user’s actual preference. Based on the arriving time and the time of user leave at each location , TSF of location can be expressed as

Indistinguishable mobile semantics achieved by the above two factors make the anonymous locations powerful in protecting the user’s location geographically and semantically. To complete the extraction of these semantic features, when the blockchain performs the location collect step, the latitude and longitude information is collected, and the dwell time information is automatically collected. In addition, when the user history upload is completed, the smart contract automatically extracts the semantic features of the location.
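The feature extraction described above can be sketched in Python as follows. This is a minimal sketch under two assumptions that are not confirmed by the text available here: the TSF is taken to be the stay duration (leave time minus arrival time, in minutes), and the GSF is looked up from a map label through a hypothetical table `GSF_BY_LABEL`, since the six geographic semantic categories are not listed in this section.

```python
from datetime import datetime

# Hypothetical mapping from map labels to geographic semantic categories.
GSF_BY_LABEL = {"school": "education", "hospital": "medical"}


def time_semantic_feature(arrive: datetime, leave: datetime) -> float:
    """Stay duration in minutes, used here as the assumed form of the TSF."""
    if leave < arrive:
        raise ValueError("leave time precedes arrival time")
    return (leave - arrive).total_seconds() / 60.0


def describe_visit(label: str, arrive: datetime, leave: datetime):
    """Return the (GSF, TSF) pair for one visit record."""
    return GSF_BY_LABEL.get(label, "other"), time_semantic_feature(arrive, leave)
```

For instance, a visit to a location labeled "school" from 09:00 to 11:30 on the same day yields the category "education" with a stay duration of 150 minutes.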
To improve the accuracy of semantic information of location, the user-related semantic model can be represented as a personal semantic hierarchy tree, described as , which is constructed by considering GSF and TSF. note a set of edges, and each of which represents the relationship between two adjacent nodes. stands for a group of nodes, and it can be divided into internal nodes and leaf nodes. The node in internal nodes describes each category’s semantic information, such as education and research. The node of a leaf in the same cluster can be regarded as locations of the same user-related mobile semantics.
Combining the TSF and GSF, this semantic tree characterizes the user-related mobile semantics of locations according to the users’ roles at locations. The deep semantic tree is used to describe the user-related mobile semantics on locations. It is built based on the semantic feature , , as shown in Figure 4. The basic idea is first to build the base layer from the semantic categories and then to extend it up and down. The personal semantic hierarchy tree is constructed by grouping the locations by GSF to build the first layer, clustering the locations in each group by TSF, and then sorting the clusters by average residence time (ART). This process is shown in Figure 4, and the steps are detailed in Algorithm 2. Since the proposed method is a special kind of hierarchical clustering method, the time complexity of Algorithm 2 is equivalent to the time complexity of the hierarchical clustering method .
(i) The user’s TSF and GSF of each location are extracted from the personal dataset by equations (1) and (2).
(ii) The locations that have the same GSF are placed in the same group, and each node in this layer is described by its category label, such as education.
(iii) The locations in each category are clustered based on TSF, and each cluster center is taken as a node of the current layer.
(iv) Finally, the clusters in each group are sorted in ascending order of their average residence time (ART, the average of all TSF values in the cluster), and each node is assigned its order number.
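Steps (ii) to (iv) can be illustrated with the following Python sketch. It is not Algorithm 2 itself: the clustering step is replaced by a simple one-dimensional gap rule (equivalent to cutting a single-linkage dendrogram at a distance threshold), and the function name, the tuple layout, and the `max_gap` threshold are assumptions chosen for illustration.

```python
from collections import defaultdict


def build_semantic_tree(locations, max_gap=30.0):
    """Group locations by GSF, cluster each group by TSF, and order clusters by ART.

    locations: iterable of (name, gsf_category, tsf_minutes) tuples.
    Returns {category: [(order, art, [names]), ...]} with order starting at 1.
    """
    groups = defaultdict(list)
    for name, category, tsf in locations:        # step (ii): group by GSF
        groups[category].append((tsf, name))

    tree = {}
    for category, items in groups.items():
        items.sort()                             # sort by TSF within the group
        clusters = [[items[0]]]
        for item in items[1:]:                   # step (iii): split where the TSF gap is large
            if item[0] - clusters[-1][-1][0] > max_gap:
                clusters.append([item])
            else:
                clusters[-1].append(item)
        nodes = []
        for cluster in clusters:
            art = sum(tsf for tsf, _ in cluster) / len(cluster)  # average residence time
            nodes.append((art, [name for _, name in cluster]))
        nodes.sort(key=lambda node: node[0])     # step (iv): ascending ART
        tree[category] = [(order, art, names)
                          for order, (art, names) in enumerate(nodes, start=1)]
    return tree
```

In this sketch, locations that fall in the same cluster share both a category and a similar stay duration, which is the sense in which leaf nodes of one cluster carry the same user-related mobile semantics.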

4.3. Construct Anonymity Set
In contrast to traditional semantic attacks, user-related semantic attacks filter anonymous locations based on semantic similarity and on background knowledge of the number of locations of each semantic type. Such an attack exploits the fact that the semantic type containing the real request location tends to have more anonymous locations, because anonymous locations are more likely to be chosen from nearby locations with similar semantic types. To resist such semantic attacks, the selection of anonymous locations needs to consider semantic variability while ensuring a balanced number of locations across different semantic categories. Moreover, two kinds of difference need to be satisfied simultaneously: the semantic difference between the real location and the anonymous locations, and the semantic difference among the anonymous locations themselves.
We introduce the semantic distance (SD) on the semantic tree to measure semantic differences between locations. The SD represents the relational distance between two nodes in an individual semantic tree, computed by the number of line segments included in the connection path between and . It is shown in equation (3). In Figure 4, two locations satisfy the semantic difference requirement (i.e., belong to different semantic categories) only if their SD is greater than 6.
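The edge-counting definition of SD can be sketched in Python as below. The tree is represented by a child-to-parent dictionary, which is an assumption of this sketch rather than the paper's data structure, and the example tree in the usage note is hypothetical and does not reproduce Figure 4.

```python
def semantic_distance(parent, a, b):
    """Number of edges on the path between nodes a and b in a rooted tree.

    parent: dict mapping each node to its parent (the root maps to None).
    """
    def path_to_root(node):
        chain = []
        while node is not None:
            chain.append(node)
            node = parent[node]
        return chain

    # Steps needed to reach each ancestor of a, starting from a itself.
    steps_from_a = {node: i for i, node in enumerate(path_to_root(a))}
    # The first ancestor of b that is also an ancestor of a is their lowest common ancestor.
    for steps_from_b, node in enumerate(path_to_root(b)):
        if node in steps_from_a:
            return steps_from_a[node] + steps_from_b
    raise ValueError("nodes are not in the same tree")
```

With a toy tree in which the root has category children, each category has cluster children, and each cluster has location leaves, two leaves in different clusters of the same category are 4 edges apart, while two leaves in different categories are 6 edges apart.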
Following the analysis above, the anonymous locations need to be balanced across categories in addition to satisfying semantic diversity. Therefore, we introduce the concept of semantic diversity entropy (SDE). The SDE builds on the property that entropy [22] measures the degree of disorder of a distribution in a set, and it is used to measure the variability of the semantic distribution of locations in the whole anonymous set. Adopting semantic entropy to measure the anonymous set has two benefits. On the one hand, it ensures that the anonymous set meets the requirements of semantic diversity. On the other hand, it improves the balance among the semantic categories of anonymous locations, because the greater the entropy, the more balanced the different semantic locations in the anonymous set. Therefore, we take the , the proportion of different semantic categories in anonymous locations, as the input parameter, and the SDE can be expressed as where is the number of all semantic categories in an anonymous set and
Neglecting the query probabilities of different classes of locations also negatively affects the security of anonymous sets, even when semantic diversity has been achieved. To further improve anonymous location security, the anonymous locations need not only to satisfy semantic diversity but also to have historical query probabilities that are as equal as possible. Therefore, anonymous location selection can be regarded as a multiobjective optimization problem with two goals: the semantic distance among the anonymous locations should be as large as possible, and the query probabilities of different locations should be as equal as possible. To quantify these two objectives, the following two concepts are introduced.
The semantic diversity distance represents the difference between the SDE of the currently selected location after joining the anonymous set and the SDE before joining , and can be expressed as
The query probability distance represents the difference between the query probabilities of any 2 locations and can be denoted as where represent historical query probability of location and represent the number of historical queries at location .
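The quantities introduced above can be made concrete with a short Python sketch. The exact formulas are not recoverable from the text here, so the sketch rests on stated assumptions: the SDE is taken to be the Shannon entropy (base 2) of the category proportions, the historical query probability of a location is taken to be its share of all historical queries, and the query probability distance is taken to be an absolute difference. All function names are hypothetical.

```python
import math
from collections import Counter


def semantic_diversity_entropy(categories):
    """Shannon entropy (base 2) of the semantic category proportions in a set."""
    total = len(categories)
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(categories).values())


def semantic_diversity_distance(current, candidate):
    """Change in SDE when `candidate` joins the category list `current`."""
    return (semantic_diversity_entropy(current + [candidate])
            - semantic_diversity_entropy(current))


def query_probabilities(query_counts):
    """Historical query probability of each location as its share of all queries."""
    total = sum(query_counts.values())
    return {loc: n / total for loc, n in query_counts.items()}


def query_probability_distance(probabilities, a, b):
    """Absolute difference between the query probabilities of two locations."""
    return abs(probabilities[a] - probabilities[b])
```

Under these assumptions, adding a location from a new category to a single-category set raises the entropy from 0 to 1 bit, whereas adding another location of the same category leaves it unchanged, which is the behavior the semantic diversity distance is meant to capture.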
Combining these two notions, anonymous location selection aims to keep the query probability distance between each location in the anonymous set and the real location as small as possible while making the semantic entropy as large as possible. Therefore, the multiobjective function for anonymous location selection is
Solving the above problem directly is difficult because it is NP-hard. Therefore, we use a stepwise optimization approach. We first select semantic locations with equal query probability to build the anonymous candidate set and (satisfying the objective function 1, as shown in lines 3-12 of Algorithm 3), and then select locations from them that meet the semantic diversity requirement to form the anonymous set (fulfilling the objective function 2, shown in lines 13-24 of Algorithm 3). Furthermore, to avoid the privacy leakage that arises when all locations come from the requesting user, some locations accessed by collaborating users are also added in this process: locations that are semantically similar to the anonymous locations (positions in their semantic tree with the same semantic type and equal ART, line 25 in Algorithm 3) are selected as replacements. We describe this process as a function in the smart contract shown in Algorithm 3. Its time complexity is due to the anonymous set construction operation which is the most complex operation in this algorithm, which has a time complexity of .
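The two-stage idea can be sketched in Python as follows. This is a greedy heuristic written for illustration, not the paper's Algorithm 3: the `tolerance` parameter used to decide when query probabilities count as "equal", the choice to include the real location in the set of size k, and the use of Shannon entropy as the SDE are assumptions, and the replacement step with collaborators' locations is omitted.

```python
import math
from collections import Counter


def entropy(categories):
    """Shannon entropy (base 2) of category proportions; assumed form of the SDE."""
    total = len(categories)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(categories).values())


def build_anonymous_set(real, candidates, k, tolerance):
    """Greedy two-stage selection of an anonymous set containing `real`.

    real and candidates are (name, category, query_probability) tuples.
    Returns a list of k locations, or None if too few candidates qualify.
    """
    # Stage 1: keep candidates whose query probability is close to the real one.
    pool = [c for c in candidates if abs(c[2] - real[2]) <= tolerance]

    # Stage 2: repeatedly add the candidate with the largest entropy gain.
    chosen = [real]
    while len(chosen) < k and pool:
        current = [c[1] for c in chosen]
        best = max(pool, key=lambda c: entropy(current + [c[1]]) - entropy(current))
        chosen.append(best)
        pool.remove(best)
    return chosen if len(chosen) == k else None
```

Because each round picks the largest entropy gain, the sketch prefers locations from categories that are not yet represented, which tends to balance the categories in the resulting set; it does not guarantee an optimal solution to the multiobjective problem.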
5. Security Analysis
According to the previous analysis, the privacy threats in the LBS process arise mainly in the following three scenarios: (1) the semantic attack against the anonymity set; (2) the attack against the anonymity server; and (3) the attack against the collaborative process. The privacy level of the proposed method in these three scenarios is analyzed next.
Theorem 1. The proposed method can ensure the security under the semantic attack against the anonymity set.
The semantic attack relies mainly on semantic background knowledge (semantic information about the requested location) to infer the real location from the anonymous set. Therefore, to confirm security, it is sufficient to show that the attacker obtains the true location from the anonymous set only with very small probability.
Suppose the attacker has obtained the anonymity set (AS) and tries to infer the real location (RL) from it using background knowledge (BK), that is, semantic and geographic knowledge. When the attacker relies on geographic knowledge, the anonymous set contains anonymous locations whose geographic features are similar to those of the real requested location, so the probability of inferring the real location is small. When the attacker has semantic background knowledge related to the user, the anonymous set includes locations from different semantic classes, and because these locations are selected by Algorithms 2 and 3 to maximize semantic diversity, the probability of identifying the real requested location from semantic knowledge is also small. Based on the above analysis, the probability of the real location being acquired by the attacker depends on both events. Because both are small-probability events, their simultaneous occurrence is also a low-probability event, and it is difficult to infer the true location.
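The last step of this argument can be written out as a short bound. The symbols $A_g$, $A_s$, $\varepsilon_g$, and $\varepsilon_s$ are introduced here only for illustration and are not the paper's notation: $A_g$ and $A_s$ denote the events that the geographic and the semantic inference succeed, and $\varepsilon_g$ and $\varepsilon_s$ are assumed upper bounds on their probabilities.

```latex
\[
P(A_g \cap A_s) \le \min\{P(A_g),\, P(A_s)\} \le \min\{\varepsilon_g,\, \varepsilon_s\},
\qquad
P(A_g \cap A_s) = P(A_g)\,P(A_s) \le \varepsilon_g\,\varepsilon_s
\quad \text{if } A_g \text{ and } A_s \text{ are independent.}
\]
```

The first inequality holds without any independence assumption, so the joint event is never more likely than the less likely of the two attacks.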
Theorem 2. The proposed method can ensure security under the attack against anonymity servers.
An attack against the anonymity server is one in which a malicious user steals the location dataset by hijacking the anonymity server. To prove that this attack is ineffective, it is enough to show that the probability of this event is very small.
First, the proposed method uses blockchain technology to choose a user who stores the anonymous locations on the chain, taking the place of the anonymous server. The trusted user (UserT) who submits the anonymous locations to the chain is selected dynamically by all collaborative users through a consensus mechanism based on a credibility factor, so the probability that this user is a suspicious attacker (Userdanger) is small. Second, the anonymous locations are stored on the blockchain in encrypted form, and only the smart contract holds the decryption key, so the probability of the data being decrypted by the attacker is small. Finally, the anonymous set is generated from the location set automatically by smart contracts and is not influenced by individual users or third parties, so the probability of an attacker obtaining it is also small. Since the security threat to the anonymous server comes mainly from these three steps, and each of them is a small-probability event, the probability of an attacker successfully attacking the anonymous server is very small.
Theorem 3. The proposed method can ensure the security of privacy in the collaborative process.
Since the proposed method includes two collaborative processes, the attack against the collaborative process consists of two parts: the theft of the requesting user's location information in the public chain-based collaborative process, and the attack on the location information of both the collaborative users and the requesting user in the private chain-based collaborative process. Therefore, showing that attacks on the requesting user and on the collaborating users are infeasible establishes the security of the proposed method under collaborative attacks.
In the public chain-based collaborative process of the proposed method, only the fuzzy location region of the requesting user, that is, AA, is published on the public chain. Because this fuzzy area replaces the real location (RL), the probability that a collaborative user can obtain the user's real requested location is small. In the private chain-based collaboration that generates the anonymous set, each collaborative user can see only their own location, so the probability of a collaborative user obtaining the real location (CL) of another collaborative user from the knowledge gained during private chain collaboration is also small. Moreover, the smart contract automatically runs the anonymous set generation method, adding the anonymous locations from collaborating users to anonymous sets that meet the privacy requirement on the blockchain, so the probability that a collaborative user obtains the requesting user's real location from the knowledge gained during smart contract execution is small as well. Based on the above analysis, privacy leakage in the collaborative process is unlikely.
6. Experiment and Analysis
6.1. Experiment Setting
The experimental data come from the TDS dataset [29], which is widely recognized in the field and includes the request locations of more than 1000 users generated by the Brinkhoff simulator, together with simulated user queries that we added for our experiment. The TDS dataset is produced by the widely used Thomas Brinkhoff Road Network Mobile Node Data Generator, based on the traffic network data of a rectangular area of the city of Oldenburg, Germany, with user-defined attributes for the dataset. The communication bandwidth between the user and the Central Server (CS) is 3 Mbit/s. Each dataset is indexed by a 1 byte R-tree structure on the server side. In addition, we use the API provided by Gaode Map to add semantic features to the TDS data.
To evaluate the performance of the proposed method, we implemented all the methods in Python and ran them on a machine with a 3.20 GHz Intel Core CPU and 16 GB of RAM. The blockchain network platform in this experiment is built on Ethereum [24, 25], which can execute self-written smart contracts. The network includes 22 nodes simulating 22 users; one of them is randomly selected as the anonymous requester, and the rest act as collaborative users for anonymous set construction. The number of collaborative users in the anonymous set is used as a reward: the higher the reward, the more easily the user's collaborative anonymity request succeeds. The main parameter settings are shown in Table 1.
Finally, the following five basic metrics are used to validate the proposed method; the first three evaluate usability, and the last two evaluate privacy.
Block generation time: the block generation time is measured as the average generation time of a single block. It can reflect the underlying performance of a blockchain-based privacy protection system.
Anonymity generation time (AGT) [30]: anonymity generation time represents the time from when the user initiates the anonymous request to when the anonymous set is generated; the smaller the value, the higher the availability of the anonymous set.
Anonymity communication cost (ACC) [31]: anonymity communication cost represents the overhead of the location information that must be passed between users during the anonymization process; the larger the value, the lower the availability of the anonymity set.
Privacy exposure probability of real location (PERL) [32]: privacy exposure probability of real location represents the probability that the real location in the anonymous set will be identified by an attacker; the smaller the value, the higher the security of the anonymous set.
Semantic diversity entropy (SDE): semantic diversity entropy represents the distribution of anonymous locations across the semantic categories in the anonymous set; the larger its value, the safer the anonymous location is under semantic attacks. It is computed by formula (4).
6.2. Availability Analysis
In this section, we first use usability and privacy metrics to analyze the effectiveness of the proposed approach. Then, based on ASR and AGT, we further analyze the advancement of the proposed method by applying blockchain technology.
Figure 5(a) shows that the average block generation time first stays flat and then increases with the number of requesting users. This is because, at the beginning, our system has spare capacity and can efficiently process requests from a small number of users. However, when the number of requesting users exceeds two, the processing time increases because the system is running at full capacity.

Figure 5: (a) validation based on block generation time; (b) validation based on anonymity cost; (c) validation based on AGT.
Figure 5(b) shows that the communication cost among the blockchain, the requester, and the collaborators grows with increasing privacy requirements. There are two reasons for this. First, as the need for privacy grows, more locations are required for anonymity, which raises the anonymity overhead. Second, collaboration-based anonymity requires more collaborative users to provide their locations and more messages to be exchanged.
Figure 5(c) demonstrates that the time required to construct an anonymous set grows as privacy requirements grow. This is because higher privacy requirements mean that more collaborative requests must be sent to more collaborative users and that more time is needed to build the semantic description models and anonymity sets. The figure also shows that the anonymous set construction time for collaborative users does not vary with the growth of user privacy requirements. This is because collaborative users communicate in a peer-to-peer manner, so their time does not change as the number of collaborative users grows.
As Figure 6(a) depicts, PERL decreases as user privacy requirements grow. This is because, as the privacy requirement increases, the extent to which the true location is obscured also grows. Further, the number of anonymous locations from different semantic categories grows, reducing the probability of real location exposure under semantic attacks.

Figure 6: (a) validation based on PERL; (b) validation based on SDE.
As Figure 6(b) depicts, the SDE grows as user privacy requirements grow. This is because, under our approach to selecting anonymous locations, both the number of different semantic categories in the anonymous set and the number of anonymous locations of each semantic type must grow as the privacy requirement increases, which raises the SDE. Also, since we consider only eight categories, this increase slows down at higher privacy requirements.
6.3. Comparison Analysis
6.3.1. Comparison Analysis with Different Privacy Protection Methods
In this section, we compare the effectiveness of the proposed approach with that of two other approaches on four evaluation metrics: AGT, ACC, PERL, and SDE. The two approaches are STA-LPPM [27], a centralized architecture approach considering simple location semantics, and QDER [19], a collaborative anonymization scheme using only simple semantics between users without blockchain technology.
Figure 7(a) shows that, as the privacy requirement increases, the AGT of all three methods increases, and the AGT of our scheme lies between those of the other two. Our approach is faster than the centralized one because it relies on multiuser collaboration. At the same time, constructing the private chain and running the semantic model add to its anonymization time. QDER achieves the lowest AGT through distributed collaboration, but at the expense of privacy and security.

Figure 7: (a) validation based on AGT; (b) validation based on ACC; (c) validation based on PERL; (d) validation based on SDE.
Figure 7(b) shows that, as the privacy requirement increases, the ACC of all three methods increases, and our scheme has the slowest growth and the lowest overhead among the compared algorithms. This is because our approach performs collaborative anonymization on a private chain, which avoids having all users send their location information and thus reduces the overhead. The ACC of QDER is greater than that of the proposed method because QDER needs more rounds of interaction with collaborating users to obtain anonymous locations. The centralized scheme, STA-LPPM, has the highest overhead because it needs to collect location information from all users.
Figure 7(c) shows that, as the privacy requirement increases, the PERL of all three methods decreases, and our scheme achieves the lowest PERL among the compared algorithms. This is because our approach uses a private blockchain to avoid privacy leakage in the collaborative anonymization process. Further, describing location semantics through semantic correlation and selecting anonymous locations with semantic diversity entropy improve the privacy level of the anonymous set under semantic context attacks. The other two approaches consider only simple semantics and do not address semantic privacy, so they rank second and third in this experiment.
As shown in Figure 7(d), as the privacy requirement increases, the SDE of all the methods presents a downward trend, and our scheme achieves the highest SDE among the three techniques. The main reasons are as follows. Compared with the other methods, our method adopts a user-related location semantic model, which increases the number of anonymous locations of different types that can be selected near the request location. Further, the anonymous set construction scheme based on semantic entropy ensures both the semantic variability of the anonymous locations and the balance among the numbers of anonymous locations belonging to different semantic types. Since STA-LPPM considers only simple location semantics and QDER performs anonymization almost without semantic information, STA-LPPM achieves better SDE results than QDER.
6.3.2. Comparison Analysis in Blockchain-Based Method and No-Blockchain Method
To further illustrate the technological advancement that blockchain technology brings to the proposed method, we use two key metrics, anonymity generation time (AGT) [33] and anonymity success rate (ASR) [15], and compare the proposed method with the traditional distributed collaboration approach. The comparison is run while the number of collaborative candidate users, a key variable affecting anonymization efficiency, varies from 3 to 10. The two approaches differ in whether privacy protection is built on blockchain. Specifically, the collaboration process of the traditional method does not rely on private chains for multiuser collaboration, and its anonymous set generation does not run in a smart contract.
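As a rough illustration of how the two metrics could be aggregated from repeated trials, the Python sketch below assumes that ASR is the fraction of anonymity requests for which an anonymous set was successfully built and that AGT is averaged over the successful requests only. Both definitions, the function name, and the sample records are assumptions made for this example; the cited definitions in [15] and [33] may differ.

```python
from statistics import mean


def summarize_trials(trials):
    """Return (ASR, mean AGT) for a list of (succeeded, seconds) records."""
    if not trials:
        raise ValueError("at least one trial is required")
    # ASR: share of requests that ended with a valid anonymous set.
    asr = sum(1 for ok, _ in trials if ok) / len(trials)
    # AGT: mean generation time over successful requests only.
    times = [seconds for ok, seconds in trials if ok]
    agt = mean(times) if times else None
    return asr, agt


# Hypothetical records: three successful requests and one failure.
trials = [(True, 0.8), (True, 1.2), (False, 3.0), (True, 1.0)]
asr, agt = summarize_trials(trials)
```

Running the same aggregation separately for the blockchain-based and the traditional variant, once per number of collaborative candidate users, would give one row per setting in tables like Tables 2 and 3.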
Table 2 shows that, as the number of collaborative candidate users grows, the anonymity time of the blockchain-based approach remains lower than that of the traditional approach. This is because the smart contract running on the blockchain reduces the time needed for collaborative user selection and ensures that anonymous sets satisfying the privacy requirement can be constructed on time. Moreover, the security of smart contracts makes collaborative users more comfortable with providing anonymous locations, so more locations are available for selection. The above analysis shows that anonymity based on blockchain technology can reduce anonymity time.
Table 3 shows that, as the number of collaborative candidate users grows, the anonymity success rate with blockchain is higher than that of the traditional methods. This is because our method can obtain more anonymous locations than the traditional methods: the security of blockchain technology makes more users willing to participate in collaboration, and collaboration on the blockchain can provide more computational power for generating anonymous sets. The above analysis indicates that blockchain technology can improve the anonymity success rate.
By conducting our experiments on real datasets, we obtain two groups of experimental results. In the usability experiments, AGT, block generation time, and anonymity cost grow as user privacy needs increase, while PERL decreases. In the comparative experiments, our method achieves the best ACC, PERL, and SDE among the compared methods. In addition, the blockchain-based scheme obtains better results in both AGT and ASR than the scheme without blockchain, which illustrates the benefit of applying blockchain technology.
7. Conclusion
To deal with the privacy vulnerability caused by self-interested users in the process of collaborative anonymity and by the neglect of semantic information, this paper proposes a blockchain-based model for location-semantic privacy protection. To resist the privacy leakage caused by the self-interested behavior of collaborative users, a private chain-based means of collaborative communication is proposed. The semantic security of the anonymity set is improved by constructing a model that uses semantic diversity entropy associated with the user. Experimental results show that our method achieves better ACC, PERL, and SDE than the compared methods.
However, this solution has some limitations. First, we do not consider the trade-off between usability and privacy in this paper, even though blockchain-based privacy protection methods add hardware and time overhead to a certain extent. Second, the efficiency and privacy level of the proposed method could be further improved by using edge computing technology. In the future, we will pursue research in these two directions.
Data Availability
The experimental data come from the TDS dataset [29], which is widely recognized in the field and includes the request locations of more than 1000 users generated by the Brinkhoff simulator, together with simulated user queries that we added for our experiment.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
This paper was supported by the National Key R&D Program of China under grant No. 2019YFC1521400 and the National Natural Science Foundation of China under grants No. 62072362, No. 61672426, No. 61572401, and No. 61902300.