Abstract
Mobile crowdsensing (MCS), a novel large-scale data acquisition method, has attracted increasing attention. Since the quality of participants directly affects how well perceptual tasks are completed in MCS, participant selection has become a research focus. However, because participants’ information is sparse and their data are private, existing solutions are limited in both the security and the accuracy of participant selection. To tackle these problems, this paper proposes a secure and accurate participant selection (SAPS) method. It employs blockchain-based cross-domain reputation sharing and labels participants with personalized reputation tags as quality references to achieve secure and accurate participant selection. In particular, SAPS uses differential privacy to protect privacy during cross-domain sharing, while the traceability and tamper resistance of the blockchain guarantee the credibility of the data sources. Comprehensive experiments on real datasets indicate that, compared with CMABA, the tasks’ completion quality in SAPS is improved by 18%, and the execution cost of SAPS is reduced by 6%.
1. Introduction
Recently, the Internet of Things (IoT), which uses ubiquitous sensors to provide services for humans, has attracted growing attention and has produced several breakthroughs, one of which is MCS. MCS is a new data acquisition mode that combines the idea of crowdsourcing with the perceptual ability of multiple mobile devices. It leverages the movement of people to perceive and collect data according to specific requirements and has been applied in air monitoring, intelligent transportation, urban management, and other areas [1–4]. Compared with traditional fixed-deployment perceptual systems, MCS has the advantages of low cost, easy maintenance, and good scalability.
Because mobile devices differ in performance and participants differ in subjective factors, the quality of the collected perceptual data varies significantly. Most existing works assume that the perceptual quality of the participants is known or calculable. Since the environment of MCS is dynamic and participants can freely join or leave, this assumption is unrealistic in practical application scenarios. For example, it is difficult to judge the reliability of participants who join MCS for the first time. Another special case is that participants who have already taken part in tasks may leave for various reasons. When they join MCS again after a period of time (by which point the MCS cache holds no history, or only an incomplete history, for them), the perceptual platforms cannot effectively use the data for high-quality selection. In addition, malicious participants can rejoin many times with new identities, leaving the MCS without effective historical data to refer to.
It is therefore necessary to integrate various factors and take more effective steps to select high-quality participants. Existing participant selection methods can usually be categorized as follows: deposit mechanisms, auction mechanisms, selection based on perceptual history records, and multi-armed bandit (MAB) methods [5–8]. In the deposit-based mechanism, participants pay a certain deposit as a precondition for perceiving a specific task, and the deposits and rewards are returned to participants upon completion of their tasks. In the auction mechanism, the participants quote for tasks, and the perceptual platforms then select a certain number of participants from all the quotes. Because participants are not fully judged and verified during selection, these two schemes cannot guarantee the security and reliability of participants. The history-based method selects the best participants according to the quality records of past completed tasks. Yet the available historical data are sparse, especially for new entrants or re-entrants. MAB uses reinforcement learning, in which agents interact with the outside world, to infer the interest characteristics of entities and then make effective choices. However, since MAB has a high learning cost and relies mainly on prediction, it cannot guarantee the performance, cost, and other properties of the perceptual platforms.
In addition, privacy and security should be taken into account in participant selection. Privacy disclosure has become a bottleneck that limits the rapid development of MCS. At present, privacy protection methods in participant selection mainly include encryption [9, 10] and blockchain technologies [11–13]. Encryption-based methods encrypt the transmitted data or establish a trusted channel between the perceptual platforms and the participants, but they carry the risk of single-point failure and place a high load on the perceptual platforms. To solve these problems, some works have attempted to decentralize the perceptual platforms with the help of blockchain. However, a design that considers only decentralization increases the cost of the participating nodes and the perceptual platforms, which indirectly lowers the efficiency of MCS.
In this paper, we propose a secure and accurate participant selection (SAPS) method by using cross-domain reputation data with privacy protection based on blockchain. It enables accurate selection of participants and ensures the security of participants’ private data. The main contributions of this paper can be summarized as follows.
First, we propose a trustworthy participant selection framework based on cross-domain data sharing. It can effectively obtain the reputation of participants by integrating the reputation data from multiple domains. The proposed framework also takes advantage of the tamper-resistant nature of the blockchain to guarantee the credibility of reputation data.
Second, we design a blockchain-based incentive mechanism in SAPS to reward blockchain nodes that act as subperceptual platforms and an adaptive matching algorithm to improve the matching accuracy of participants and tasks.
Third, in SAPS, the privacy of participants’ reputation data and the results of queries from multiple domains are protected by distributed data storage and differential privacy. Meanwhile, the anonymity mechanism of blockchain also plays an important role in protecting privacy.
Finally, we show through experiments that this method outperforms related works in participant selection. Specifically, the tasks’ completion quality is improved by 18%, and the overall execution cost is reduced by 6%.
The rest of this paper is organized as follows. In Section 2, we review the relevant works. We introduce the content of differential privacy and description of symbols in Section 3. The proposed SAPS and related algorithm are described in Section 4. In Section 5, we make a theoretical analysis of the proposed method. The experimental performance of this method is evaluated in Section 6. We make concluding remarks in Section 7.
2. Related Work
2.1. Participant Selection Strategy
Existing selection strategies for participants can be divided into two categories: those that use historical data as a reference and those that ignore historical data.
(1) Selecting participants with historical records: Pu et al. [14] proposed an online selection model to maximize data quality. It selected participants based on their task completion ability, efficiency, timeliness, and other aspects. Song et al. [15] considered the limited budget and the quality of participants and proposed a multi-task-oriented participant selection strategy that uses an information satisfaction index as the reference for selection. Yang et al. [16] utilized fine-grained analysis of users and presented a personalized task allocation system for MCS, which matched tasks according to users’ preferences and historical data so as to complete the selection of participants.
(2) Selecting participants without historical records: Gao et al. [17] measured the reliability of participants through MAB. Xiao et al. [18] presented an offline task allocation algorithm and an online task allocation algorithm from the perspective of task allocation. They used a greedy allocation strategy to indirectly solve the problem of participant selection. Jin et al. [19] studied the task pricing problem in MCS with multi-requester price competition and dynamically arriving workers. They proposed a Markov correlation equilibrium that minimizes the social cost of selecting participants.
According to the above analysis, works of the second kind carry some uncertainty. For instance, a malicious participant would pose a threat to system security and data privacy, and this type of scheme cannot accurately capture the actual behavior and characteristics of participants.
2.2. Privacy Protection Strategy
2.2.1. Making Privacy Data Invisible by Encryption Technology
Xiong et al. [9] proposed an edge-assisted privacy protection sharing framework, which used additive secret sharing to encrypt original data into two ciphertexts and constructed two security functions to realize a privacy-preserving convolutional neural network. Lu [20] studied BGN homomorphic encryption to protect the query range and the privacy of a single IoT device’s data and presented a privacy protection range query scheme with high communication efficiency in the fog-enhanced IoT. Miao et al. [21] utilized a homomorphic cryptosystem to weight and aggregate users’ encrypted data and presented a privacy protection truth discovery framework, which protected users’ perceptual data and the reliability scores obtained through the truth discovery method.
2.2.2. Making Multiple Records Indistinguishable by Clustering and Grouping
Fan et al. [22] proposed a privacy-aware and trusted data aggregation protocol that allowed the server to summarize the data submitted by mobile users without knowing personal user data. Wang et al. [23] studied the issue of publishing real-time crowdsourcing statistics with strong privacy protection under untrusted servers and presented a privacy protection framework based on distributed agents. The framework introduced a level of multiple agents between users and untrusted servers. Users randomly selected an agent and uploaded data information to the agent through anonymous communication.
2.2.3. Obscuring Sensitive Data by Disturbance
Li et al. [24] provided a location proof with privacy protection relying on existing Wi-Fi or cellular network access points (APs). Shen et al. [25] introduced edge nodes to optimize the task acceptance rate while protecting the privacy of participants and proposed a privacy protection task allocation framework for edge computing-enhanced MCS. Shahabi et al. [26] presented a framework for assigning tasks to workers online. The framework perturbed the locations of tasks and workers according to geo-indistinguishability and then quantified the probability of reachability between tasks and workers without affecting the location privacy of either.
The above works put forward some solutions for participant selection and privacy security but still have the following problems. (1) The historical data of participants are often limited, which causes data sparsity. (2) There is a complex coupling relationship between privacy protection mechanisms and participant selection methods. To select high-quality participants and protect their data privacy, these two problems need to be solved urgently.
3. Preliminaries
In this section, we review the relevant knowledge and definitions of differential privacy and explain the concepts of this paper.
Definition 1. (differential privacy). Let $D$ and $D'$ be two adjacent datasets that differ in only one record. Assuming that $\varepsilon$ is a positive real number and $A$ is a random algorithm, if the output of $A$ acting on two adjacent datasets is indistinguishable, the requirement of differential privacy is met. The formulation is defined as $\Pr[A(D) = o] \le e^{\varepsilon} \Pr[A(D') = o]$, where $o$ is the output of $A$ and $\varepsilon$ is the privacy budget, representing the level of privacy protection. The smaller $\varepsilon$ is, the higher the degree of privacy protection, the greater the noise needed, and the lower the availability of data. According to the formula, when $\varepsilon$ tends to 0, the degree of privacy protection reaches the best state, but the availability of data is almost lost.
In this paper, we use the Laplacian mechanism of differential privacy to add noise to datasets, and the noise size is determined by privacy budget ε and the global sensitivity of the query function. Global sensitivity represents the maximum change to query results caused by deletion or modification of any record in the dataset, as defined below.
Definition 2. (global sensitivity). For any query function $f: D \to \mathbb{R}^d$, the global sensitivity of $f$ is $\Delta f = \max_{D, D'} \lVert f(D) - f(D') \rVert_1$, where $D$ and $D'$ are adjacent datasets.
Definition 3. (Laplacian mechanism). An $\varepsilon$-differential privacy protection is implemented by adding random noise subject to $\mathrm{Lap}(\lambda)$ to the data, where the probability density function of the Laplacian distribution is $p(x) = \frac{1}{2\lambda} e^{-|x|/\lambda}$, in which $\lambda$ is a scale parameter. When $\lambda = \Delta f/\varepsilon$, it satisfies $\varepsilon$-differential privacy. The general form of the Laplacian distribution is expressed as $f(x \mid \mu, b) = \frac{1}{2b} \exp\left(-\frac{|x - \mu|}{b}\right)$, where $\mu$ is the position parameter and $b$ is the scale parameter; the mathematical expectation of the Laplacian distribution is then $\mu$ and the variance is $2b^2$.
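As an illustration of Definition 3, the following minimal Python sketch adds Laplacian noise with scale Δf/ε to a single numeric query result. The function names (`laplace_noise`, `laplace_mechanism`) and the example values are assumptions made for illustration and do not come from the paper; the noise is sampled as the difference of two exponential variates, which follows a zero-mean Laplacian distribution.

```python
import random


def laplace_noise(scale, rng):
    """Sample Lap(0, scale) as the difference of two Exp(1/scale) variates."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def laplace_mechanism(true_value, sensitivity, epsilon, rng=None):
    """Return true_value plus Laplacian noise with scale sensitivity/epsilon.

    Hypothetical helper: the name and signature are illustrative only.
    """
    if epsilon <= 0 or sensitivity <= 0:
        raise ValueError("epsilon and sensitivity must be positive")
    rng = rng or random.Random()
    scale = sensitivity / epsilon  # lambda = delta_f / epsilon
    return true_value + laplace_noise(scale, rng)


# Example: a reputation score of 80 queried with delta_f = 1 and epsilon = 0.5,
# so the noise scale is b = 2 and the theoretical variance is 2 * b**2 = 8.
rng = random.Random(0)
samples = [laplace_mechanism(80.0, 1.0, 0.5, rng) for _ in range(20000)]
mean = sum(samples) / len(samples)
variance = sum((x - mean) ** 2 for x in samples) / len(samples)
```

A smaller ε enlarges the scale and therefore the noise, which matches the trade-off between privacy and data availability stated in Definition 1.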
In addition, differential privacy also has the property of sequence combination, which is defined as follows.
Definition 4. (sequence combination properties). If algorithms $A_1, A_2, \ldots, A_n$ meet $\varepsilon_i$-differential privacy ($1 \le i \le n$), respectively, then the sequence combination of the algorithms provides $\left(\sum_{i=1}^{n} \varepsilon_i\right)$-differential privacy protection for a dataset $D$. This property shows that the overall privacy budget is the sum of all individual privacy budgets when multiple differential privacy algorithms are applied to a dataset.
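The bookkeeping implied by Definition 4 can be sketched as a small budget accountant that sums the budgets of successive queries and refuses a query once the total would be exceeded. The class name `PrivacyAccountant` and its methods are hypothetical and are not part of SAPS.

```python
class PrivacyAccountant:
    """Track cumulative privacy budget under sequential composition.

    Hypothetical helper for illustration; not an API defined in the paper.
    """

    def __init__(self, total_budget):
        if total_budget <= 0:
            raise ValueError("total_budget must be positive")
        self.total_budget = total_budget
        self.spent = 0.0

    def spend(self, epsilon):
        """Record one epsilon-DP query; reject it if the total is exceeded."""
        if epsilon <= 0:
            raise ValueError("epsilon must be positive")
        if self.spent + epsilon > self.total_budget:
            raise RuntimeError("privacy budget exhausted")
        self.spent += epsilon
        return self.total_budget - self.spent  # remaining budget


accountant = PrivacyAccountant(total_budget=1.0)
accountant.spend(0.2)              # first query
remaining = accountant.spend(0.3)  # second query: 0.5 spent in total
```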
For the convenience of readers, the symbols and related descriptions used in this paper are summarized in Table 1.
4. Our Method
Figure 1 describes the MCS model based on SAPS.

4.1. System Model Description
This method includes four roles: the subperceptual platforms, the task requester, the task participants, and the third-party platforms. It differs from traditional MCS in two aspects. First, the perceptual platforms of traditional MCS are replaced by a blockchain to achieve decentralization, so that each node of the blockchain can act as a subperceptual platform and be responsible for managing a certain number of participants. Second, third-party platforms, such as JD.com and Alibaba, are introduced; they help the subperceptual platforms verify the submitted registration information at the time of registration to guarantee the authenticity and reliability of the data. In this way, the subperceptual platforms realize coarse-grained distribution of personalized tasks to improve the accuracy of submitted data.
The proposed system model is shown in Figure 1, which mainly includes registration module, reputation sharing and verification module, participant selection, task assignment module, and other functional modules. Details of each module are introduced below.
4.1.1. Registration Module
When users want to join MCS, they first send a registration request to the nearest subperceptual platform according to their areas (as shown in ②). Here, the request includes personal information (ID, age, etc.), preferences (personalized tags), and reputation from certain domains with sufficient data (such as movies, commodities, and social networking).
4.1.2. Reputation Sharing and Verification Module
After receiving a registration request, the subperceptual platforms check the authenticity of the submitted information through the third-party platforms (as shown in ③ and ④). If the query results match the submitted information, the registration applications of the applicants are accepted. Otherwise, the applicants are treated as malicious requesters and are blacklisted. In this method, each subperceptual platform is equipped with its own interface and can carry out its query connections independently. This keeps querying decentralized and avoids the system-wide failure that the collapse of a centralized query interface would cause. In this paper, the query returns results in Boolean form to reduce communication overhead.
4.1.3. Participant Selection and Task Assignment Module
According to the query results, the subperceptual platforms calculate new reputations of the applicants and arrange them in descending order (as shown in ⑤ and ⑥). Then, we consider whether the personalized tags of the applicants meet the tasks’ requirements and select participants from the sorted list. After a certain number of participants have been obtained, the participant records are updated and synchronized among the subperceptual platforms to maintain consistency and facilitate querying and tracking.
Next, the subperceptual platforms divide and manage the areas using the relevant information of tasks and participants. Each area contains multiple clusters, and each cluster contains tasks of the same type. Participants can therefore join multiple clusters according to their preferences and decide which tasks to choose based on a combination of cost, distance, and other factors.
4.2. Participant Selection of Cross-Domain Data Sharing
In this section, we describe the specific implementation details of the proposed method. First, each subperceptual platform manages a designated area, such as area A or B, and receives tasks from the task requester. The tasks should contain the following necessary attributes: location of the execution area (area A or B), completion deadline, reward (set by the task requester), and other specific requirements (such as reaching a certain accuracy or pixel count, and the number of participants required for the task). These attributes are attached as tags, which make it convenient to identify suitable tasks. A subperceptual platform can receive a considerable number of tasks, and the task threshold W that constrains it can be set according to the actual situation. When subperceptual platforms have completed tasks, they broadcast to the other subperceptual platforms and send their IDs and the IDs of the completed tasks for recording.
To mobilize the enthusiasm of blockchain nodes, our method designs an incentive mechanism to motivate them to act as subperceptual platforms. The reward rules are as follows:
(1) In each round, a subperceptual platform obtains a certain proportion of commission when receiving a task, for example, 3% of the total rewards of the task. A subperceptual platform can receive up to W/2 tasks in each round if it has free capacity. Note that if a task needs to be executed in multiple areas (in areas A and B), it is received by the subperceptual platforms of the corresponding areas; if the task must be executed in sequence across these areas, it is executed in the specified sequence. Otherwise, it is executed synchronously. Considering that there may be uncompleted tasks, 5% of the total rewards (receiving rewards + completing rewards) of the subperceptual platform in this round is deducted as punishment for unfinished tasks.
(2) Moreover, a subperceptual platform obtains a certain percentage of commission for each completed task, for instance, 15%–20% of the rewards of the task. The more tasks it completes, the more commissions it receives.
(3) The subperceptual platforms process the submitted data and return results to the task requester. If the data quality of a task is higher than the task requester’s expectations, the subperceptual platform receives additional rewards.
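The reward rules above can be made concrete with a small arithmetic sketch. It assumes a 3% receiving commission, a 15% completion commission, and a single 5% deduction applied once per round whenever at least one task is unfinished; the source does not state whether the penalty is per task or per round, so that choice, like the function name `round_income`, is an assumption. Additional rewards under rule (3) are left out.

```python
def round_income(task_rewards, completed, receive_rate=0.03,
                 complete_rate=0.15, penalty_rate=0.05, capacity_w=None):
    """Estimate one node's commission for a round (illustrative only).

    task_rewards: reward of each task received in this round
    completed:    parallel list of booleans, True if the task was finished
    capacity_w:   optional task threshold W; at most W/2 tasks per round
    """
    if len(task_rewards) != len(completed):
        raise ValueError("task_rewards and completed must have equal length")
    if capacity_w is not None and len(task_rewards) > capacity_w / 2:
        raise ValueError("a node may receive at most W/2 tasks per round")

    receiving = receive_rate * sum(task_rewards)
    completing = complete_rate * sum(
        reward for reward, done in zip(task_rewards, completed) if done
    )
    total = receiving + completing
    if not all(completed):
        # Assumed: one flat 5% deduction per round if any task is unfinished.
        total -= penalty_rate * total
    return total


# Three tasks worth 400, 600, and 800; the last one is not completed.
# receiving = 54, completing = 150, total = 204, after penalty = 193.8
income = round_income([400, 600, 800], [True, True, False], capacity_w=10)
```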
Second, as independent and trusted third parties, the third-party platforms play an irreplaceable role because the information they hold about applicants is an important reference source for the subperceptual platforms. However, verifying the authenticity of the registration information carries a risk of disclosing the applicants’ privacy. Therefore, differential privacy is used to protect the privacy of the applicants.
Finally, after a certain number of tasks and participants have been gathered, the subperceptual platforms need to preprocess the tasks. For instance, the tasks are classified and tagged based on their types, execution areas, difficulties, and other factors to form clusters (for example: {environmental quality detection; reward: 400–800; area A; medium difficulty}). The participants can then match the corresponding clusters according to their conditions. In this way, the task completion quality can be improved and the profits of the participants can be maximized.
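The clustering and matching step can be sketched as grouping tasks by their tags and then filtering clusters by a participant's area and preferences. The `Task` fields, the cluster key (area, type, difficulty), and the function names are assumptions chosen for illustration; the paper does not specify a data model.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    task_id: str
    task_type: str
    area: str
    difficulty: str
    reward: int


def build_clusters(tasks):
    """Group tasks that share the same (area, type, difficulty) tags."""
    clusters = defaultdict(list)
    for task in tasks:
        clusters[(task.area, task.task_type, task.difficulty)].append(task)
    return dict(clusters)


def matching_clusters(clusters, preferences, area):
    """Return the clusters in `area` whose task type is in `preferences`."""
    return {
        key: members
        for key, members in clusters.items()
        if key[0] == area and key[1] in preferences
    }


tasks = [
    Task("t1", "environmental quality detection", "A", "medium", 400),
    Task("t2", "environmental quality detection", "A", "medium", 800),
    Task("t3", "traffic monitoring", "A", "easy", 300),
    Task("t4", "environmental quality detection", "B", "medium", 500),
]
clusters = build_clusters(tasks)
matched = matching_clusters(clusters, {"environmental quality detection"}, "A")
```

Here `matched` contains a single cluster with tasks `t1` and `t2`, from which a participant could then choose by reward, distance, or other factors.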
4.2.1. Calculation of Reputation
When obtaining the query results, a new reputation mechanism is established combining with , which is saved and managed by on the blockchain. It divides into two categories: new and with history. The reputation of with new can be calculated as follows:where is the reputation of in a certain domain, m is a coefficient set by according to the tasks’ requirements, and RE is a constant reputation. The equation calculates reputation of under the preference fa.
For with historical data, the calculation equation is
The equation represents with previous history, where is the reputation of in a certain domain and is the reputation contained from previous history, and it is a constant. In addition, considering that the preference number of is more than one, the following equation is defined for more detailed calculation:
For multiple preference characteristics, they can be calculated by equation (3). The specific proportion of each preference can be set by , and the total score of reputation = 100.
When participants have accomplished their tasks, they upload the collected data to corresponding for rewards. Also, their reputations will be updated by . The reputation of a participant is updated as follows:(1)If is successfully completed, the reputation is updated as follows:(2)Otherwise, the reputation is updated as follows: where is a parameter. The above equations indicate that the reputation of participants who complete tasks within specified time will be increased after updating. Otherwise, if the reputation of a participant is deducted for two consecutive times, it will temporarily stop receiving and executing tasks. and stopping duration are set by .
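The update-and-suspend logic described above can be sketched as follows. Because equations (4) and (5) are not reproduced here, the sketch substitutes a simple additive rule with a hypothetical parameter `step`, bounded to the range 0 to 100; this is a stand-in for illustration, not the paper's update equations. Only the rule that two consecutive deductions suspend a participant is taken from the text.

```python
from dataclasses import dataclass

MAX_REPUTATION = 100.0  # total reputation score stated in the text


@dataclass
class ParticipantState:
    reputation: float
    consecutive_deductions: int = 0
    suspended: bool = False


def update_reputation(state, completed_on_time, step=5.0):
    """Apply one update; `step` and the additive form are assumptions."""
    if completed_on_time:
        state.reputation = min(MAX_REPUTATION, state.reputation + step)
        state.consecutive_deductions = 0
    else:
        state.reputation = max(0.0, state.reputation - step)
        state.consecutive_deductions += 1
        if state.consecutive_deductions >= 2:
            # Two deductions in a row: stop receiving and executing tasks.
            state.suspended = True
    return state


state = ParticipantState(reputation=70.0)
update_reputation(state, completed_on_time=True)   # 75.0
update_reputation(state, completed_on_time=False)  # 70.0, one deduction
update_reputation(state, completed_on_time=False)  # 65.0, suspended
```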
In addition, since is performed by multiple participants, the final result is obtained from the participants. Therefore, the greater the proportion of data provided by participants is, the larger the reward they will receive. However, some participants will submit false data to get the rewards. To prevent invalid/false data attacks, we consider the tasks’ completion status and reputations of the participants to eliminate or mark detected malicious participants, and then record the data in blockchain. The non-tamper nature and traceability of blockchain make it easy to record and track these malicious attackers.
4.3. Privacy Protection Based on Differential Privacy and Distributed Storage
The privacy risk should be considered during data transmission among the applicants, the subperceptual platforms, and the third-party platforms. If the personal information of applicants is learned by the subperceptual platforms or the third-party platforms during data transmission, this is treated as privacy disclosure and is unacceptable to the applicants. Thus, we adopt differential privacy to address this privacy problem. Specifically, we add Laplacian noise to the registration information and the query results: $\tilde{A}(D) = A(D) + Y$, where $A$ represents a random algorithm (such as a query), $Y$ represents Laplacian noise with $Y \sim \mathrm{Lap}(\Delta f/\varepsilon)$, $\Delta f$ represents the sensitivity, $\varepsilon$ represents the privacy budget, and $\tilde{A}(D)$ represents the final obfuscated result after adding noise, which meets $\varepsilon$-differential privacy.
The privacy budget of differential privacy in the MCS scenario is explained as follows: part of the budget is consumed in every query of the subperceptual platforms. As the number of queries increases, the consumed privacy budget increases too. Therefore, the privacy budget must be allocated flexibly. In this paper, we use the data query method based on differential privacy budget allocation in [27] for reference to realize privacy protection during data querying. It utilizes the sequence combination property of differential privacy to obtain a privacy budget sequence by dividing the total budget unevenly into an infinite number of differential privacy budgets. The Laplacian random noise added to the data is calculated according to the allocated privacy budgets in the sequence. In this way, the privacy budget can be divided indefinitely while still meeting the conditions of differential privacy, and the growth of the added noise is slowed. The specific steps are as follows:
Step 1. Set the total differential privacy budget $\varepsilon$ according to the required degree of privacy protection.
Step 2. Generate a differential privacy budget sequence. A differential privacy budget is calculated for each data query, and the sequence of differential privacy budgets is recorded as $\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_i, \ldots$. According to the sequence combination property, $\sum_{i=1}^{\infty} \varepsilon_i \le \varepsilon$, where $\varepsilon$ is the total privacy budget, and the value of each $\varepsilon_i$ follows the allocation rule of [27].
Step 3. Use the differential privacy mechanism to calculate the random noise according to the submitted query and the sensitivity of the query.
Step 4. Add the random noise to the query result and return the noised result.
The interaction between the subperceptual platforms and the third-party platforms is shown in Figure 2. Before the subperceptual platforms query and verify the authenticity of the registration information, noise of different levels is added to it according to the requirements of the applicants. Similarly, the same means can be used for data protection when the query results are returned to the subperceptual platforms.
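Steps 1 to 4 can be sketched in Python as below. Because the allocation formula is not reproduced in this text, the sketch substitutes a geometric sequence, ε_i = ε / 2^i, simply because it is the most familiar infinite sequence whose sum never exceeds ε. This is an assumption and not the rule of [27]; the function names are likewise hypothetical.

```python
import random


def budget_for_query(total_epsilon, i):
    """Budget of the i-th query (1-based) under an ASSUMED geometric split."""
    return total_epsilon / (2 ** i)


def answer_queries(true_values, sensitivity, total_epsilon, rng):
    """Return noised answers and the cumulative budget actually spent."""
    noised, spent = [], 0.0
    for i, value in enumerate(true_values, start=1):
        eps_i = budget_for_query(total_epsilon, i)  # Step 2
        scale = sensitivity / eps_i                 # Step 3
        noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
        noised.append(value + noise)                # Step 4
        spent += eps_i
    return noised, spent


total_epsilon = 1.0                                 # Step 1
rng = random.Random(42)
answers, spent = answer_queries([80.0, 65.0, 90.0], 1.0, total_epsilon, rng)
# spent = 1/2 + 1/4 + 1/8 = 0.875, which stays below the total budget
```

Under this geometric stand-in the per-query budget halves each time, so the noise scale doubles with every query. An allocation that decays more slowly would spread the noise more evenly across queries.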

4.4. Algorithm Design
In this section, we introduce the details of designed algorithms. The pseudocode is shown in Algorithm 1. The main description is as follows.
4.4.1. The Input of Algorithm
To realize valid authentication, applicants must provide detailed personal information, such as username, ID, whether they have a history of past tasks, personal preferences, and reputation. Among these, ID and reputation must be submitted. The second input is the set of perceptual tasks T.
4.4.2. The Process and Output of Algorithm
Firstly, the subperceptual platforms set the initial reputation RE and the threshold s and initialize the sets used for applicants when performing the query operation (lines 1 and 2 of Algorithm 1).
If the query results do not match the registration information provided by applicants, the subperceptual platforms add them to BlackList (B) and make a network-wide announcement (lines 3 to 9 and 26 to 28 of Algorithm 1).
Secondly, the subperceptual platforms verify whether the submitted information is true by interacting with the third-party platforms and check the tags of the applicants. If applicants have no previous history but their information is true and their reputations are greater than the threshold s, they are pushed into the candidates’ set C. If the history records of applicants are excellent (historical records show that the tasks were completed successfully) and their information is true, they are also pushed into C (lines 10 to 13 of Algorithm 1).
Then, the total reputation is calculated under different situations: when there are historical records, it is calculated according to equations (2) and (3); if there are no historical records, it is calculated according to equations (1) and (3). After that, we compare the calculated results with the threshold s (lines 14 to 20 of Algorithm 1).
Next, the total reputations are sorted in descending order, and the top-N candidates are selected as participants according to the requirements of tasks and personalized tags. They are then added to the formal set (FS) and stored in the blockchain (lines 21 to 22 of Algorithm 1).
After that, the participants select and perform the tasks that meet their conditions. The subperceptual platforms process the submitted data and then update participants’ reputations with equations (4) and (5) according to the data quality and timely contribution. Finally, participants receive their rewards, the completion results of the perceptual tasks are obtained, and the algorithm ends (lines 23 to 25 of Algorithm 1).
The pseudocode of Algorithm 1 is as follows.
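The selection flow described in Section 4.4.2 can be approximated by the following Python sketch. It is not the authors' Algorithm 1: the Boolean `verify` callback stands in for the query to the third-party platforms, `reputation_of` stands in for equations (1) to (3), and the dictionary fields and the function name `select_participants` are all hypothetical.

```python
def select_participants(applicants, verify, reputation_of,
                        threshold, top_n, required_tag):
    """Return (formal_set, blacklist) as lists of applicant IDs.

    verify(applicant)        -> bool, assumed Boolean query result
    reputation_of(applicant) -> float, placeholder for equations (1)-(3)
    """
    blacklist, candidates = [], []
    for applicant in applicants:
        if not verify(applicant):
            # Submitted information does not match the query result.
            blacklist.append(applicant["id"])
            continue
        score = reputation_of(applicant)
        if score > threshold and required_tag in applicant["tags"]:
            candidates.append((score, applicant["id"]))

    # Rank by reputation in descending order and keep the top-N (O(n log n)).
    candidates.sort(key=lambda pair: pair[0], reverse=True)
    formal_set = [applicant_id for _, applicant_id in candidates[:top_n]]
    return formal_set, blacklist


applicants = [
    {"id": "u1", "tags": {"air"}, "claimed": 82, "actual": 82},
    {"id": "u2", "tags": {"air"}, "claimed": 95, "actual": 60},  # false claim
    {"id": "u3", "tags": {"traffic"}, "claimed": 90, "actual": 90},
    {"id": "u4", "tags": {"air"}, "claimed": 75, "actual": 75},
    {"id": "u5", "tags": {"air"}, "claimed": 65, "actual": 65},
]
formal_set, blacklist = select_participants(
    applicants,
    verify=lambda a: a["claimed"] == a["actual"],
    reputation_of=lambda a: float(a["claimed"]),
    threshold=70,
    top_n=2,
    required_tag="air",
)
```

In this toy run, `u2` is blacklisted for a mismatched claim, `u3` lacks the required tag, `u5` falls below the threshold, and `u1` and `u4` form the formal set.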
4.4.3. Time Complexity Analysis
In Algorithm 1, the time complexity of selecting and of top-rank() is O(n) and O(n log n), respectively, where n = |M|. The overall time complexity is therefore O(n log n), which is consistent with the compared algorithm CMABA [28]. Theoretically, the time complexity of the two algorithms is the same.
5. Theoretical Analysis of the Proposed Method
In this section, we present a theoretical analysis of rationality and effectiveness of SAPS.
SAPS can solve the problem of verifiability and traceability of participants’ data and effectively improve the quality of tasks’ completion.
Analysis. This method makes the following innovations in selecting participants. Firstly, applicants need to provide their registration information to the subperceptual platforms. Then, the multi-domain query operation of the subperceptual platforms ensures the authenticity and reliability of the information supplied by applicants and suppresses the intrusion of malicious applicants at the source. If the applicants’ information matches the query results, the corresponding reputations are confirmed and the applicants are eligible to join MCS. All of the above information is stored on the blockchain. Besides, the consensus mechanism makes the data in the blockchain almost impossible to tamper with, so the records in the blocks become the key basis for traceability and verification. Therefore, based on the structure of the blockchain and the multi-domain query operation, we can accurately trace the source of data and effectively manage participants, thereby improving the quality of the submitted data.
SAPS is an effective solution to protect privacy in participant selection and query verification in MCS.
Analysis. Considering the risk of privacy disclosure among the applicants, the subperceptual platforms, and the third-party platforms, we use differential privacy to protect the privacy of applicants when selecting participants, and the level of privacy protection can be set according to the applicants’ requirements. Differential privacy is also efficient to apply. In addition, the encryption mechanism of the blockchain offers a certain degree of protection for the stored information. Furthermore, owing to the computing power that consensus algorithms such as proof of work aggregate across the nodes of the distributed system, the blockchain can resist external attacks effectively. Through these protection methods, the query verification process in SAPS is well protected.
Based on the above two theoretical analyses, we can see that SAPS is reliable for participant selection, which can realize the efficient and accurate selection of participants while providing effective privacy protection.
6. Experimental Results
In this part, we verify the performance of SAPS through comparative experiments.
6.1. Baselines and Datasets
6.1.1. Random Selection Algorithm
For better comparison, we implemented a random selection algorithm by randomly selecting a certain number of participants.
6.1.2. CMABA
It uses reinforcement learning to interact with the external environment and selects the desired participants. However, CMABA ignores the privacy problem of the participants when they send their perceptual data to the perceptual platforms. Therefore, we can check whether our privacy protection mechanism will affect the quality of the task completion.
We perform comparative experiments on Taxi trajectory prediction dataset [29] and Chicago taxi trips dataset [30]. Taxi trajectory prediction dataset includes trajectory information of hundreds of taxis running in Porto for a year. In this dataset, taxis can be notified by dispatching center to request services or be directly contacted by passengers on the street. Each taxi is equipped with a mobile data terminal to receive commands from dispatching center. Also, each travel record completed by taxi contains multiple tags. In the experiment, we mainly use the following tags: unique identifier of each trip TripID, contact information of passengers OriginCall, unique identifier of taxi TaxiID, timestamp of trip, etc. In addition, dispatching center can locate taxi through its TaxiID and record corresponding tasks it has completed; in the experiment, we simulate dispatching center to allocate and manage the dataset.
Considering the timeliness and sparsity of data, we add the Chicago taxi trips dataset [30] as an auxiliary dataset. It includes the information of thousands of taxis running in Chicago, in which each travel record mainly includes the travel ID, the taxi ID, and the timestamps of the start and end of the trip. We select 160 of its records to form a sparse dataset (SparseD) to verify the performance of the proposed method under sparse conditions.
We conduct simulation experiments using Python, C++, and OMNeT++ on the above datasets. The initial reputation RE and the reputation threshold s are set as follows: RE is set to 40 and s is set to 70. These parameters can be adjusted according to actual requirements.
6.2. Performance Comparison
To evaluate the performance of our method, we carry out experimental analysis in terms of task completion quality, screening cost, execution satisfaction, and security. We treat taxis as candidate participants and select a certain number of them. Specifically, we divide them into six groups of 30, 50, 70, 90, 110, and 130 to compare the different indicators of the three algorithms. We assume that the budget of all selecting processes is fixed, and the maximum number of tasks is 160. The specific experimental results are as follows.
6.2.1. Comparison of Running Time and Completion Quality
We compare the running time and task completion quality of our method with those of CMABA and RS. Figures 3 and 4 show that SAPS outperforms CMABA in both running time and task completion quality when the same number of participants is selected from the different datasets. In Figure 3, the time cost of SAPS lies between those of RS and CMABA for every number of selected participants. Because SAPS considers the privacy security and reliability of participants, it ensures the quality of the selected participants. The number of tasks executed by the three methods in Figure 4 is the same. On SparseD in Figure 4, the task completion quality of all three methods declines to varying degrees (for example, the initial accuracy of SAPS is below 60%), but SAPS still outperforms the other two. Specifically, compared with CMABA, the task completion quality of SAPS is improved by 16% on average. This is because SAPS realizes data sharing across domains, which in turn allows authenticity and reliability to be verified. In contrast, CMABA has higher model complexity and a higher trial-and-error cost, and it usually reaches a local optimum rather than adapting to the global environment of the whole MCS. RS has the advantage of a short selection time, but it cannot guarantee the quality of participants, and the resulting data privacy and security problems are unavoidable.

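Figure 3: Comparison of running time, panels (a) and (b).

Figure 4: Comparison of task completion quality, panels (a) and (b).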
6.2.2. Comparison of Cost and Satisfaction
Figure 5 shows how the screening cost varies with the number of participants. Here, the screening cost includes the cost of running time, the cost of authenticity verification, and the cost of screening. Figure 5 shows that the screening cost of RS is relatively low, but RS inevitably brings problems of quality, safety, and reliability. CMABA needs to interact with the outside multiple times to learn the corresponding results, so its screening cost is the highest. The screening cost of SAPS lies between those of RS and CMABA and is 6% lower than that of CMABA on average. The reason is that querying and verification are performed independently, which avoids single-point failure and overload. Compared with the other two solutions, SAPS offers a balance of low cost and high efficiency.

Figure 5: Comparison of screening cost, panels (a) and (b).
Figure 6 shows the comparison of satisfaction with task completion. Satisfaction is measured through the participants’ task running time Ts, the task completion quantity N, and the task completion quality M. Among them, M has the largest weight in the satisfaction measurement, followed by running time Ts. It is calculated as SAT = a Ts + b N + c M, where a + b + c = 1 and a ≤ b < c. Since the quality of task completion accounts for the highest proportion of task satisfaction, in this experiment a is set to 0.3, b is set to 0.3, and c is set to 0.4. The maximum value of SAT is 1. Figure 6 shows that, for the same number of participants, the SAT of SAPS is the highest among the three algorithms, CMABA is second, and RS is the lowest. This is mainly because the design of SAPS is superior to that of the other two.

Figure 6: Comparison of satisfaction with task completion, panels (a) and (b).
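The satisfaction measure SAT = a Ts + b N + c M can be sketched as follows. The weights and their constraints come from the text. The assumption that Ts, N, and M are already normalized scores in [0, 1] is ours, inferred from the stated maximum SAT of 1; how the running time is mapped to such a score is not specified in the text. The function name `satisfaction` is hypothetical.

```python
def satisfaction(ts, n, m, a=0.3, b=0.3, c=0.4):
    """Compute SAT = a*Ts + b*N + c*M.

    Assumption: ts, n, and m are scores already normalized to [0, 1],
    so that SAT cannot exceed 1 when the weights sum to 1.
    """
    if abs(a + b + c - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    if not (a <= b < c):
        raise ValueError("weights must satisfy a <= b < c")
    for name, value in (("ts", ts), ("n", n), ("m", m)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1]")
    return a * ts + b * n + c * m


# Illustrative scores only; these are not values from the experiments.
print(round(satisfaction(0.5, 0.8, 0.9), 2))  # 0.15 + 0.24 + 0.36 = 0.75
```

With a = b = 0.3 and c = 0.4, a change in completion quality M moves SAT more than an equal change in either of the other two terms.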
6.2.3. Comparison of Task Matching Degree
Figure 7 shows the performance of the three methods in terms of the task matching degree. The results of SAPS are more stable than those of the other two methods. Although the initial task matching degree of SAPS is lower on SparseD, SAPS still achieves a better matching degree between tasks and participants, because it selects participants on the basis of personalized tags and takes all factors into consideration. Compared with CMABA, the task matching degree is improved by about 9%, and a similar trend can be found in the task completion quality shown in Figure 4.

Figure 7: Comparison of task matching degree, panels (a) and (b).
6.2.4. Comparison of Privacy Protection
With respect to data security, we design corresponding experiments to demonstrate the reliability and feasibility of our method. We introduce a certain number of malicious participants into each group. Malicious participants carry specific tags; in this experiment, these tags indicate a very low reputation or poor historical completion quality. To keep the total size of each group unchanged, 20, 30, 40, 50, 60, and 70 participants in the six groups, respectively, are designated as malicious.
Under this condition, we compare the selection results and privacy security of the three methods. The experimental results are shown in Figure 8, where the horizontal axis represents the number of malicious participants in each group and the vertical axis represents the number of malicious participants selected from each group. SAPS selects the smallest number of malicious participants and thus shows the best performance. This is because the cross-domain query of data and the use of blockchain make it far harder for malicious participants to be selected. Moreover, the traceability of blockchain provides a further guarantee for excluding malicious participants.

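Figure 8: Number of malicious participants selected versus number of malicious participants in each group, panels (a) and (b).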
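The measurement in this experiment, counting how many malicious participants a selection method picks, can be sketched as follows. This is a toy illustration under stated assumptions, not the SAPS algorithm: malicious tags are modeled as a low reputation score, the reputation-threshold filter is a simple stand-in for reputation-aware selection, and the reputation ranges, the number of selected participants `k`, and all function names are hypothetical.

```python
import random


def make_group(total, n_malicious, seed=0):
    """Build a group in which `n_malicious` members have low reputation."""
    rng = random.Random(seed)
    group = []
    for i in range(total):
        malicious = i < n_malicious
        # Assumption: malicious tags appear as a very low reputation score.
        reputation = rng.uniform(0, 30) if malicious else rng.uniform(50, 100)
        group.append({"id": i, "reputation": reputation, "malicious": malicious})
    rng.shuffle(group)
    return group


def select_random(group, k, seed=0):
    """Baseline: pick k participants uniformly at random."""
    return random.Random(seed).sample(group, k)


def select_by_reputation(group, k, threshold=70.0):
    """Stand-in for reputation-aware selection: keep only participants
    at or above the threshold, then take the k highest reputations."""
    eligible = [p for p in group if p["reputation"] >= threshold]
    eligible.sort(key=lambda p: p["reputation"], reverse=True)
    return eligible[:k]


def count_malicious(selected):
    """Number of malicious participants among those selected."""
    return sum(p["malicious"] for p in selected)


group = make_group(total=130, n_malicious=70)
k = 20  # hypothetical number of participants to select
print(count_malicious(select_random(group, k)))
print(count_malicious(select_by_reputation(group, k)))
```

In this toy setting the threshold filter never admits a malicious participant, because their scores are generated below the threshold by construction; the real experiment depends on how reputation is obtained and verified across domains.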
In summary, SAPS is stable on both Dataset [29] and SparseD because it achieves cross-domain data sharing with privacy protection and uses blockchain to prevent fake attacks.
7. Conclusion
In this paper, we propose a secure and accurate participant selection (SAPS) method for MCS. A trustworthy blockchain-based cross-domain data sharing framework is designed to obtain participants’ reputation data from multiple domains. We are thus able to perform high-quality participant selection based on participants’ reputations and personalized tags. Specifically, a blockchain-based incentive mechanism is realized to reward blockchain nodes, and an adaptive algorithm is designed to match participants with perceptual tasks according to their tags and requirements. Meanwhile, the privacy of data is guaranteed by distributed data storage and differential privacy. The results of our experiments confirm the superiority of our method in terms of task completion quality, screening cost, execution satisfaction, and security. Our future work will focus on the cooperation and power consumption between participants to further improve the performance of mobile crowdsensing.
Data Availability
The experimental data used to support the findings of this study are available from the corresponding author upon request.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
This study was supported in part by the National Natural Science Foundation of China (nos. U21A20474 and 62262003), the Guangxi Natural Science Foundation (no. 2020GXNSFAA297075), the Guangxi Science and Technology Project (GuikeAA22067070, GuikeAD21220114, and GuikeAD21220096), the Center for Applied Mathematics of Guangxi (Guangxi Normal University), the Guangxi “Bagui Scholar” Teams for Innovation and Research Project, the Guangxi Collaborative Innovation Center of Multi-Source Information Integration and Intelligent Processing, and the Guangxi Talent Highland Project of Big Data Intelligence and Application.