Abstract

The differential privacy mechanism maintains privacy-utility monotonicity; thus, it does not achieve a privacy-utility balance for numerical data. To this end, we provide the privacy-utility balance of the differential privacy mechanism from a collaborative perspective in this paper. First, we construct a collaborative model that achieves the privacy-utility balance of the differential privacy mechanism. Second, we present the collaborative algorithm of the differential privacy mechanism under our collaborative model. Third, our theoretical analysis shows that the collaborative algorithm of the differential privacy mechanism keeps the privacy-utility balance. Finally, our experimental results demonstrate that the collaborative differential privacy mechanism maintains the privacy-utility balance. Thus, we provide a new collaborative model to solve the privacy-utility balance problem of the differential privacy mechanism. Our collaborative algorithm is easy to apply to query processing of numerical data.

1. Introduction

Nowadays, large-scale and diverse data are being generated rapidly. Data contain both potential value and sensitive information. Data are collected, processed, and analysed to obtain valuable information and make meaningful decisions. However, data carry sensitive information about individuals, so the management and use of individual data raise privacy concerns. Therefore, individual sensitive information should be preserved to avoid the privacy leakage of data analysis. At the same time, a certain degree of data availability should not be destroyed when sensitive information is protected during data analysis.

In the big data era, it is very important to study how the use of personal data can guarantee the tradeoff between privacy and accuracy. Hence, Dwork et al. [1] proposed differential privacy, which can achieve a privacy-utility tradeoff. Since differential privacy keeps a privacy-utility tradeoff, it has privacy-utility monotonicity [2]. Thus, differential privacy has made good advances [3–6] and has wide applications such as data releasing [7–16] and data query [16–22]. However, if the privacy budget is too small, using differential privacy will lead to a utility disaster. Alternatively, if the privacy budget is too large, the use of differential privacy will easily lead to privacy leakage. Thus, the differential privacy mechanism does not keep a privacy-utility balance for numerical data. In particular, the privacy-utility tradeoff refers to the monotonic increase of one quantity and the monotonic decrease of the other, whereas the privacy-utility balance or equilibrium attains the expected or best state of both.

Due to the extreme phenomena of privacy leakage or utility disaster caused by the privacy-utility tradeoff, some literature has studied the privacy-utility equilibrium of the differential privacy mechanism based on game theory [23–26]. Compared with game theory, we can easily achieve an online privacy-utility balance in a collaborative manner. Thus, we provide the privacy-utility balance of the differential privacy mechanism from the collaborative perspective in this paper. According to the required data utility and the required privacy budget, we maintain the privacy-utility balance of the differential privacy mechanism by adding the desired noise to numerical data. We provide a new model to achieve the privacy-utility balance of the differential privacy mechanism in a collaborative manner. In addition, the collaborative algorithm is easy to use for query processing of numerical data. Our main contributions are as follows:
(1) We define the required privacy and required utility metrics of a numerical query result.
(2) We propose the collaborative model of the differential privacy mechanism for keeping the privacy-utility balance, and we present the collaborative algorithm of the differential privacy mechanism.
(3) Our theoretical and experimental results demonstrate that the collaborative algorithm of the differential privacy mechanism can ensure the privacy-utility balance of the numerical query result.

The rest of this paper is organized as follows: Section 2 summarizes related work. Section 3 describes the problem statement. Section 4 introduces the preliminaries of differential privacy. Section 5 presents the collaborative model and algorithm of the differential privacy mechanism and theoretically analyses the properties of the collaborative algorithm. We carry out an experimental evaluation of the collaborative algorithm of the differential privacy mechanism in Section 6. Section 7 concludes this paper.

2. Related Work

A valuable analysis and the preservation of sensitive information are mutually contradictory. In research and applications of differential privacy, current work can obtain the tradeoff between privacy and utility. Next, we review the related work on achieving the privacy-utility tradeoff of differential privacy, the related work on obtaining the privacy-utility tradeoff in data releasing and data query using differential privacy, and the related work on achieving the differential privacy-utility equilibrium based on game theory. Comparing the existing work with ours, we briefly analyse its advantages and disadvantages.

There have been some approaches to obtain the tradeoff between privacy and utility for differential privacy. He et al. [3] proposed Blowfish privacy, which tuned the tradeoff between privacy and utility by using a policy. The main feature of this policy was that users could specify the sensitive information to be protected and the database knowledge that had already been published to potential attackers. Lin and Kifer [4] analysed the utility axioms and showed that the amount of information preserved by a sanitizing algorithm should be measured as the expected error of a Bayesian decision maker. However, designing effective Bayesian algorithms for complex noise distributions, sanitizing algorithms that maximize information preservation, and effective algorithms for estimating the amount of information preserved were key challenges. According to the privacy requirement of an individual, Jorgensen et al. [5] proposed personalized differential privacy, and it is meaningful to extend the notion of personalized differential privacy to graphical data. Although personalized differential privacy achieved the desired privacy preservation, it could lead to a utility disaster. Soria-Comas et al. [6] proposed individual differential privacy based on local sensitivity, which allowed the data controller to adjust the distortion to the actual data set and resulted in less distortion and more analytical accuracy. However, the performance of individual differential privacy needed to be further studied in an interactive computation model of data. Although these new differential privacy approaches can achieve the tradeoff, they cannot achieve the privacy-utility balance. Our work solves this problem.

For data releasing, it is necessary to ensure the tradeoff between privacy and utility for valuable analysis and the preservation of sensitive information. Fan et al. [7] presented a real-time differentially private aggregate monitoring system with filtering and adaptive sampling. The system provided real-time and accurate publishing to facilitate data holders sharing private aggregates for data monitoring applications. Fioretto and Van Hentenryck [8] proposed OptStream, a novel algorithm for releasing differentially private data streams under the w-event model of privacy. The algorithm ensures privacy while guaranteeing a bounded error on the released data stream. By using correlation-aware search frontier construction and a nonoverlapping covering design, Su et al. [9] presented a sequential update of differential privacy based on the Bayesian network, which solved the problem of multiparty high-dimensional data publishing. This method ensured the differential privacy of any local dataset and provided high data utility. Ou et al. [10] proposed a mathematically rigorous n-body Laplace framework for releasing correlated trajectories, which efficiently prevents a social relation inference through the mutual correlation between the n-node trajectories of two users. The proposed approach achieved better privacy and data utility. Zhang et al. [11] presented a differential privacy method for releasing high-dimensional data using Bayesian networks and a proxy function of mutual information. This method greatly improved the accuracy of data publishing, but the key challenge was how to extend it to multitable databases. Wang et al. [12] proposed differentially private M-estimator algorithms to select an optimal public subset with high utility. These algorithms play a guiding role in practical applications. Gao and Li [13] designed an anonymization scheme that maintains persistent homology under differential privacy to address the utility concerns of the published graph. The scheme achieved high graph utility in both graph metrics and application metrics, but it still needed to optimize the noise in the injection phase. Considering adversaries with knowledge of the temporal correlations between continuous data releases, Cao et al. [14] quantified the risk of differential privacy and showed that the privacy loss of event-level privacy increases over time, while the privacy guarantee of user-level privacy protection remains as desired. This work opened up an interesting future research direction of investigating privacy leakage and privacy preservation under temporal correlations. Eliáš et al. [15] proposed a differential privacy mechanism for generating a synthetic graph approximating all cuts of the input graph up to an additive error. This mechanism can achieve the privacy-utility tradeoff if one seeks purely additive cut approximations. Gohari et al. [16] introduced the Dirichlet mechanism with differential privacy, which is used for privatizing data inputs that belong to the unit simplex. The Dirichlet mechanism establishes a tradeoff between the level of privacy and the accuracy of the mechanism output. Until now, data publishing with differential privacy has only maintained the tradeoff; it does not achieve the privacy-utility balance. We resolve this problem with our work in this paper.

Considering the availability of query results, it is necessary to guarantee the tradeoff between privacy and utility. Nikolov et al. [17] presented near-optimal mechanisms for any linear query on dense and sparse databases under both pure and approximate differential privacy, and the mechanisms were simple and effective. Soria-Comas et al. [18] showed that k-anonymity helps to improve the utility of differentially private responses to arbitrary queries. Specifically, k-anonymity is achieved through a specially designed microaggregation of all attributes. If noise is added to a k-anonymous version of the data set, the amount of noise needed to satisfy differential privacy can be reduced. Thus, this method improved the general analytical utility of the anonymized output. Zhang et al. [19] proposed an anonymous query log framework based on differential privacy and showed that the framework could achieve a good balance between retrieval utility and privacy. Based on hierarchical histograms and the Haar wavelet transform, Cormode et al. [20] proposed two methods to accurately answer range queries under local differential privacy. Although these methods had strong theoretical accuracy in terms of variance, directly applying them to multidimensional range queries and advanced data analysis remained a key challenge. Wang et al. [21] studied the problem of answering multidimensional analytical queries under local differential privacy. By using local differential privacy encoding and estimation algorithms, this method could answer a class of multidimensional analytical queries with tight error bounds and scaled well to a large number of dimensions. Vietri et al. [22] proposed oracle-efficient algorithms for constructing differentially private synthetic data, a sanitized version of a sensitive dataset that approximately preserves the answers to a large collection of statistical queries. These algorithms provide better accuracy in the large-workload and high-privacy regime. These works have achieved the tradeoff of data query using differential privacy but do not obtain the privacy-utility balance. To this end, we present the collaborative model and algorithm of the differential privacy mechanism to achieve the privacy-utility balance in this paper.

To the best of our knowledge, there is some existing work that can achieve the privacy-utility equilibrium. Pawlick and Zhu [23] conceptualized the conflict between machine learning and data obfuscation from a Stackelberg game perspective. Each user perturbs her data independently, which leads to a high net loss in accuracy at equilibrium. Thus, Pawlick and Zhu showed that the learner improves his utility by proactively perturbing the data himself. Zhou et al. [24] introduced an aggregative game to model spectrum sharing in large-scale, heterogeneous, and dynamic networks, and designed a mediated privacy-preserving and truthful mechanism that admits a Nash equilibrium between privacy and truthfulness. Chen et al. [25] constructed multiple games that model an interaction between an agent who has a secret type and an adversary whose goal is to discover this type. These games show that the Bayesian Nash equilibrium strategy falls into the framework of randomized response under different payments. Fioretto et al. [26] introduced the privacy-preserving Stackelberg mechanism, which enforces the notions of feasibility and fidelity of the privacy-preserving information to the original problem objective. This mechanism complies with the notion of differential privacy and ensures that the outcomes of the privacy-preserving coordination mechanism are close to optimality for each agent. Thus, the existing work can achieve the privacy-utility equilibrium from the game-theoretical perspective. Although the privacy-utility equilibrium can be achieved well based on game theory, constructing a reasonable game-theoretical model is a challenging problem because of the privacy-utility monotonicity of differential privacy.

Thus, we achieve the balance between differential privacy and data utility in a collaborative manner. First, we propose a collaborative model of the differential privacy mechanism to keep the privacy-utility balance. Next, we present the collaborative algorithm of the differential privacy mechanism to achieve the privacy-utility balance. Our theoretical and experimental results verify that the collaborative algorithm of the differential privacy mechanism can ensure the privacy-utility balance of the numerical query result. However, cloud storage and computing can easily introduce data security and privacy leakage problems. Therefore, it is necessary to construct a hybrid method based on differential privacy and cryptography to guarantee the data security and privacy of cloud storage and computing. In future work, we will achieve the security and privacy preservation of data by combining differential privacy with blockchain [27] and password-based single-sign-on authentication [28] in cloud computing.

3. Problem Statement

In this section, we present the query model, threat model, and privacy and utility metrics of numerical data and identify the problem.

3.1. Numerical Data Query

There are two application models in the model of computation for a database: the publishing model and the query model. In the publishing model of numerical data, the data curator produces a synthetic database, a collection of summary statistics, or a sanitized database by applying a related function to the database D. If a data analyst requests that the data curator release the original database, we regard this as an identity query function. In the query model of numerical data, a query function f is applied to a database D according to the requirement of the data analyst. The query model of numerical data permits the data analyst to ask queries adaptively, deciding the next query function to pose based on the observed responses to previous queries. Then, the data curator releases f(D) to the data analyst. For both the publishing and query models of numerical data, f(D) denotes the numerical query result on database D for simplicity in this paper. We do not consider the correlation between data attributes or data records, which is left as a future research direction.

3.2. Privacy Threat Model

In the above query model of numerical data, we consider the data curator to be fully trusted, while the data analyst is honest-but-curious. The data analyst is curious about sensitive information revealed through data analysis. We assume that the data analyst can obtain full background knowledge except for a single record of the database. We assume that the communication channel is reliable and that the two entities can mutually authenticate each other within the numerical data query. Furthermore, we consider both the data curator and the data analyst to be rational. More specifically, the data curator wants the maximum privacy preservation of data releasing, while the data analyst wants the maximum data utility. However, since differential privacy has privacy-utility monotonicity, using differential privacy cannot reach the maximum privacy preservation and the maximum data utility at the same time under the rational model. Therefore, maintaining the privacy-utility balance of differential privacy in a collaborative manner is a favorable outcome.

In this paper, the specific privacy threat model of the numerical data query is as follows: (1) the data analyst is honest-but-curious, (2) the data analyst can obtain full background knowledge except for a single record of the database released by the data curator, and (3) the data curator and the data analyst are rational.

3.3. Privacy and Utility Metrics

In this paper, the privacy budget ε is taken as the metric of the privacy-preserving level of the differential privacy mechanism. The data curator can set the privacy budget ε based on the required privacy-preserving level.

The relative error is suitable for continuous numerical data. Furthermore, the relative error matches the data analyst's subjective perception of data utility, and it is easy to handle mathematically in this study. Since we consider the privacy-utility balance of query results of numerical data, we define the utility metric of a numerical query result based on the absolute value of the relative error |f(D) − s(D)| / |f(D)|, where s(D) is the perturbation value of f(D). ℝ denotes the set of all real numbers in the follow-up sections.

Definition 1 (utility metric). The utility metric of a numerical query result f(D) is U when

U = 1 − |f(D) − s(D)| / |f(D)|. (1)

For any query function f about database D, s(D) = f(D) + λ is the perturbation value of f(D), where λ is known as the utility factor achieving data utility U. Depending on (1), we can compute the utility factor |λ| = (1 − U)|f(D)| of any numerical query result f(D) under the data utility U. Since 0 ≤ U ≤ 1, the range of the utility factor is 0 ≤ |λ| ≤ |f(D)|. In this paper, the absolute value of the relative error is fixed to guarantee the same data utility for every numerical query result. There is no doubt that the absolute value of the relative error can be varied to guarantee different data utilities for different numerical query results.
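To make the role of the utility factor concrete, the following is a minimal Python sketch, assuming the form U = 1 − |relative error| of the utility metric reconstructed above; the function name utility_factor and the example values are illustrative, not the authors' reference implementation.

import numpy as np

def utility_factor(f_D: float, U: float) -> float:
    """Magnitude of the perturbation that yields data utility U for query result f(D),
    assuming U = 1 - |f(D) - s(D)| / |f(D)|."""
    assert 0.0 <= U <= 1.0 and f_D != 0.0
    return (1.0 - U) * abs(f_D)

# Example: a query result of 116.39 (e.g., a longitude) with required utility U = 0.9.
lam = utility_factor(116.39, 0.9)   # |lambda| = 0.1 * 116.39 = 11.639
s_D = 116.39 + lam                  # one admissible perturbed value
print(lam, s_D)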
If we regard the utility factor λ as noise added to the numerical query result f(D), we can achieve the data utility U of the numerical query result f(D). However, since λ is not generated by a differential privacy mechanism, adding it alone does not achieve differential privacy. Thus, we define the conditional filtering noise [29] to achieve differential privacy and obtain the desired privacy preservation by combining it with the utility factor λ of the data utility U.

Definition 2 (conditional filtering noise). Conditional filtering noise η is noise generated by a differential privacy mechanism under any desired privacy budget ε′ and required, by filtering, to satisfy that the rounding value of |η| is 1.
Therefore, we can achieve differential privacy preservation by adding conditional filtering noise. A random perturbation using only conditional filtering noise can obtain better data utility; however, it can lead to worse privacy preservation. We restrict the range of |η| so that the rounding value of |η| is 1, and we can then obtain the privacy-utility balance by adding noise to the numerical query result f(D).
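The following Python sketch illustrates one way to generate conditional filtering noise by rejection sampling, under the assumption (from Definition 2 as reconstructed here) that the filtering condition is round(|η|) = 1; the function name and the use of NumPy are illustrative choices, not the authors' implementation.

import numpy as np

def conditional_filtering_noise(sensitivity: float, eps_desired: float,
                                rng: np.random.Generator) -> float:
    """Draw Laplace noise with scale sensitivity/eps_desired and keep it
    only if its rounded absolute value equals 1 (the assumed filtering condition)."""
    while True:
        eta = rng.laplace(loc=0.0, scale=sensitivity / eps_desired)
        if round(abs(eta)) == 1:
            return eta

rng = np.random.default_rng(0)
eta = conditional_filtering_noise(sensitivity=1.0, eps_desired=1.0, rng=rng)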

3.4. Problem

The differential privacy mechanism has been applied to numerical data queries to ensure the privacy preservation of sensitive data. However, since the noise is random, the differential privacy mechanism can achieve good privacy preservation but cannot achieve the required data utility of the numerical query result. In the query model of numerical data, the data curator hopes to achieve the required privacy preservation of the numerical query result, and the data analyst hopes to obtain the required data utility of the numerical query result for data analysis. Therefore, a collaborative model and algorithm of the differential privacy mechanism that ensure the privacy-utility balance of the numerical query result are extremely desirable. This problem is defined as follows.

In the query model of numerical data, we propose a collaborative model of differential privacy mechanism based on the required data utility and required privacy budget, and we present the collaborative algorithm of differential privacy mechanism for keeping privacy-utility balance.

4. Preliminaries

In this section, we introduce the preliminaries of differential privacy [30]. A database is a multiset of rows, each from a data universe X, which is the set of all possible database rows. Thus, a database D of size n is a tuple D ∈ X^n for some n ∈ ℕ, where n is the number of individuals whose data constitute the database D and ℕ denotes the set of all nonnegative integers. Databases D and D′ are adjacent if they have the same size and are identical except for a single record. Thus, the Hamming distance between adjacent databases D and D′ is 1.

Definition 3 (differential privacy). A randomized mechanism M with domain X^n satisfies (ε, δ)-differential privacy if, for all S ⊆ Range(M) and for all adjacent databases D and D′,

Pr[M(D) ∈ S] ≤ exp(ε) · Pr[M(D′) ∈ S] + δ, (2)

where the probability space is over the coin flips of the mechanism M. If δ = 0, M satisfies ε-differential privacy.
According to the definition of differential privacy, the probability that the mechanism M fails to satisfy ε-differential privacy is at most δ. For each record of every individual, the coin flips of the mechanism M mean that M inherently has only two possible and equally likely outcomes. Thus, differential privacy ensures that any sequence of outputs in response to queries is essentially equally likely to occur, where the probability space is over the coin flips of the mechanism, independent of the presence or absence of any individual.
The ℓ1-sensitivity of a query function f: X^n → ℝ^k is

Δf = max over adjacent D, D′ of ‖f(D) − f(D′)‖₁. (3)

Definition 4 (Laplace mechanism). Given any query function f: X^n → ℝ^k, the Laplace mechanism is defined as

M_L(D, f, ε) = f(D) + (Y₁, ..., Y_k), (4)

where the Y_i are independent and identically distributed random variables drawn from the Laplace distribution Lap(Δf/ε).
The discrete Laplace distribution can be viewed as a discrete approximation of the Laplace distribution [31].

Definition 5 (discrete Laplace mechanism). Given any query function f: X^n → ℤ^k, the discrete Laplace mechanism is defined as

M_DL(D, f, ε) = f(D) + (Y₁, ..., Y_k), (5)

where the Y_i are independent and identically distributed random variables drawn from the discrete Laplace distribution DL(p) with p = exp(−ε/Δf).
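For reference, here is a minimal Python sketch of the standard Laplace and discrete Laplace mechanisms for a scalar query; the discrete Laplace noise is sampled with the usual difference-of-geometrics construction, and the function names are illustrative rather than the authors' code.

import numpy as np

def laplace_mechanism(f_D: float, sensitivity: float, eps: float,
                      rng: np.random.Generator) -> float:
    """Definition 4: add Lap(sensitivity/eps) noise to the query result f(D)."""
    return f_D + rng.laplace(loc=0.0, scale=sensitivity / eps)

def discrete_laplace_mechanism(f_D: int, sensitivity: float, eps: float,
                               rng: np.random.Generator) -> int:
    """Definition 5: add discrete Laplace noise with p = exp(-eps/sensitivity),
    sampled as the difference of two i.i.d. geometric random variables."""
    p = np.exp(-eps / sensitivity)
    g1 = rng.geometric(1.0 - p) - 1   # geometric on {0, 1, 2, ...}
    g2 = rng.geometric(1.0 - p) - 1
    return f_D + int(g1 - g2)

rng = np.random.default_rng(0)
print(laplace_mechanism(100.0, 1.0, 0.5, rng))
print(discrete_laplace_mechanism(100, 1.0, 0.5, rng))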
Differential privacy mechanism has the property of parallel composition [32].

Theorem 1 (parallel composition). Let each randomized mechanism M_i satisfy ε_i-differential privacy, and let the D_i be arbitrary disjoint subsets of the input database D. Then the sequence of M_i(D ∩ D_i) satisfies (max_i ε_i)-differential privacy.

5. Collaborative Model and Algorithm of Differential Privacy Mechanism

Firstly, this section presents the collaborative model of the differential privacy mechanism. Secondly, we state the collaborative algorithm of the differential privacy mechanism. Finally, we analyse the properties of the collaborative algorithm of the differential privacy mechanism.

5.1. Collaborative Model and Collaborative Algorithm

In Figure 1, the collaborative model of the differential privacy mechanism consists of three parts: computing the utility factor, generating the conditional filtering noise, and adding the noise. Part I computes the utility factor λ of the numerical query result f(D) according to the data utility U. The utility factor of any numerical query result can be adaptively adjusted based on the data utility required by the data analyst. Part II generates the conditional filtering noise η satisfying the filtering condition of Definition 2 under the desired privacy budget ε′, given the required privacy budget for the numerical query result f(D). The required privacy budget of any numerical query result can be adaptively adjusted according to the utility factor λ of the data utility U and the desired privacy budget ε′ of generating the conditional filtering noise η. Part III achieves differential privacy by adding the resulting noise to the numerical query result f(D). In this study, the data curator achieves the expected privacy preservation by using the required privacy budget, and the data curator obtains the required privacy budget by combining the expected data utility of the data analyst with the desired privacy budget of generating the conditional filtering noise of the differential privacy mechanism.

According to the collaborative model of the differential privacy mechanism, Algorithm 1 is the collaborative algorithm of the differential privacy mechanism. Algorithm 1 shows the implementation procedure of the differential privacy mechanism in a collaborative manner. In view of the interactive process of the collaborative model, the collaborative algorithm is suitable for online data query processing.

Input: data utility U, desired privacy budget ε′, sensitivity Δf
Output: perturbed numerical query result
(1) Compute the utility factor λ from the data utility U;
(2) while the required privacy budget has not been reached do
(3)  Generate noise η using the differential privacy mechanism with the desired privacy budget ε′;
(4) end while
(5) while the filtering condition on η does not hold do
(6)  Discard η and regenerate it;
(7) end while
(8) Compute the noise to be added from λ and η;
(9) Add the noise to the numerical query result f(D);
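To make the interaction of the three parts concrete, here is a short Python sketch of one plausible reading of Algorithm 1, assuming (i) the utility metric U = 1 − |relative error| reconstructed in Definition 1, (ii) the filtering condition round(|η|) = 1 from Definition 2, and (iii) that the added noise is the utility factor scaled by the filtered noise; the function name is hypothetical and the sketch is an interpretation, not the published implementation.

import numpy as np

def collaborative_laplace(f_D: float, U: float, eps_desired: float,
                          sensitivity: float, seed: int = 0) -> float:
    """One plausible reading of Algorithm 1 for a single numerical query result."""
    rng = np.random.default_rng(seed)
    # Part I: utility factor lambda from the required data utility U (Definition 1).
    lam = (1.0 - U) * abs(f_D)
    # Part II: conditional filtering noise eta (Definition 2, assumed condition round(|eta|) == 1).
    while True:
        eta = rng.laplace(loc=0.0, scale=sensitivity / eps_desired)
        if round(abs(eta)) == 1:
            break
    # Part III: add the noise scaled by the utility factor, so the perturbation stays close to lam.
    return f_D + lam * eta

# Example: longitude 116.39 with required utility U = 0.9 and desired budget 1.0.
print(collaborative_laplace(116.39, U=0.9, eps_desired=1.0, sensitivity=1.0))
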
5.2. Theoretical Analysis of Collaborative Algorithm

Next, we prove that the collaborative algorithm satisfies differential privacy. For the collaborative algorithm of the differential privacy mechanism, we have the following theorems.

Theorem 2. In the collaborative algorithm of the Laplace mechanism, Algorithm 1 satisfies ε-differential privacy.

Proof. For a database D and a query function f, Algorithm 1 returns the perturbed result of f(D), where the conditional filtering noise η is generated by the Laplace mechanism under the desired privacy budget ε′. The probability density function of the Laplace noise with mean 0 and scale Δf/ε′ is p(x) = (ε′ / (2Δf)) exp(−ε′|x| / Δf). Since the added noise is obtained from η, the probability density function of the resulting random variable follows by a change of variables. Depending on (3) and Theorem 1, Algorithm 1 satisfies ε-differential privacy. According to Theorem 2, for any desired privacy budget ε′, the collaborative algorithm of the Laplace mechanism can achieve the required privacy preservation. Algorithm 1 can resist full background knowledge except for a single record of the data analyst under the privacy threat model within the query model of numerical data. Since the data curator is rational under the privacy threat model, the collaborative algorithm of the Laplace mechanism can achieve the required privacy preservation against the honest-but-curious data analyst.

Theorem 3. In the collaborative algorithm of the discrete Laplace mechanism, Algorithm 1 satisfies ε-differential privacy.

Proof. For a database D and a query function f, Algorithm 1 returns the perturbed result of f(D), where the conditional filtering noise η is generated by the discrete Laplace mechanism under the desired privacy budget ε′. The probability mass function of the discrete Laplace noise with parameter p = exp(−ε′/Δf) is p(x) = ((1 − p)/(1 + p)) p^|x|. Since the added noise is obtained from η, the distribution of the resulting random variable follows by a change of variables. Depending on (3) and Theorem 1, Algorithm 1 satisfies ε-differential privacy. According to Theorem 3, for any desired privacy budget ε′, the collaborative algorithm of the discrete Laplace mechanism can achieve the required privacy preservation. The collaborative algorithm of the discrete Laplace mechanism can resist full background knowledge except for a single record of the data analyst under the privacy threat model within the query model of numerical data. Since the data curator is rational in the privacy threat model, the collaborative algorithm of the discrete Laplace mechanism can achieve the required privacy preservation against the honest-but-curious data analyst.

Theorem 4. In the collaborative algorithm of the differential privacy mechanism, the approximate data utility of Algorithm 1 approximates the data utility U.

Proof. In Algorithm 1, the approximate data utility of any numerical query result f(D) is given by (12). Since the rounding operation does not obey the usual addition, subtraction, multiplication, and division rules, there is no strict mathematical relationship between the data utility U and the approximate data utility. Thus, we only consider the rounding value of the absolute value of the conditional filtering noise. Since the range of |η| is restricted so that its rounding value is 1, the approximate data utility approximates the data utility U in the collaborative algorithm of the differential privacy mechanism.
By Theorem 4, since the data analyst is rational in the privacy threat model, the collaborative algorithm of the differential privacy mechanism can achieve the approximate data utility of the numerical query result f(D).

6. Experimental Evaluation

We experimentally compare the properties of differential privacy [30], personalized differential privacy [5], individual differential privacy [6], and our collaborative differential privacy. We make a comparative experimental analysis of differential privacy, personalized differential privacy, individual differential privacy, and our collaborative differential privacy in terms of privacy preservation and data utility. We use the expected estimation error as the privacy metric. The expected estimation error is the sum over perturbed values x̂ of Pr[x̂] · |x̂ − x|, where x̂ is a random perturbation value of the original x and Pr[x̂] is the probability of the random perturbation value x̂. We analyse the data utility of differential privacy according to the utility metric of Definition 1, and the approximate data utility of collaborative differential privacy according to (12). In all experiments, we report the average result over 10 repetitions. We use the publicly available T-Drive taxi trajectory dataset [33] to evaluate privacy preservation and data utility. If we perturb the latitude and longitude of the T-Drive taxi trajectory dataset by using the collaborative differential privacy mechanism, we can achieve the expected privacy preservation and the approximate data utility of latitude and longitude. To show that collaborative differential privacy ensures the expected privacy preservation and the approximate data utility, we choose the longitude of taxi ID 1065's trajectory as the experimental data. In all experiments, we set the global sensitivity, the local sensitivity, and the personalized privacy budget to fixed values.
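As an illustration of the privacy metric used here, the following Python sketch estimates the expected estimation error of a mechanism empirically by Monte Carlo sampling, which approximates the sum of Pr[x̂] · |x̂ − x| over perturbed values; the mechanism argument, sample count, and example budget are illustrative choices rather than the experimental configuration of the paper.

import numpy as np

def expected_estimation_error(mechanism, x: float, n_samples: int = 10_000) -> float:
    """Monte Carlo estimate of E[|x_hat - x|], where x_hat = mechanism(x)."""
    perturbed = np.array([mechanism(x) for _ in range(n_samples)])
    return float(np.mean(np.abs(perturbed - x)))

# Example: expected estimation error of a Laplace mechanism with budget eps = 0.5.
rng = np.random.default_rng(0)
lap = lambda x: x + rng.laplace(scale=1.0 / 0.5)
print(expected_estimation_error(lap, x=116.39))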

6.1. Privacy Preserving

In Figure 2, we experimentally analyse the privacy preservation of the Laplace mechanism (LM), discrete Laplace mechanism (DLM), personalized Laplace mechanism (PLM), personalized discrete Laplace mechanism (PDLM), individual Laplace mechanism (ILM), and individual discrete Laplace mechanism (IDLM). We observe that the expected estimation error decreases as the privacy budget increases. Thus, differential privacy mechanisms can achieve the expected privacy preservation. The personalized differential privacy mechanisms achieve better privacy preservation than the differential privacy mechanisms and the individual differential privacy mechanisms. Since the local sensitivity is lower than the global sensitivity, the individual differential privacy mechanisms achieve better utility.

In Figure 3, we experimentally analyse the privacy preservation of using the conditional filtering noise (CFN) of LM, DLM, PLM, PDLM, ILM, and IDLM. When the conditional filtering noise of the differential privacy mechanisms is used, the expected estimation error fluctuates as the privacy budget varies, which shows the randomness of adding conditional filtering noise. Thus, the use of conditional filtering noise can achieve privacy preservation. However, because the random perturbation is small, the use of conditional filtering noise of the differential privacy mechanisms does not achieve the expected privacy preservation.

In Figure 4, the collaborative Laplace mechanism (cLM) ensures that the expected estimation error almost always decreases as the desired privacy budget increases for different required data utilities. Moreover, the expected estimation error of the collaborative discrete Laplace mechanism (cDLM) fluctuates as the desired privacy budget increases for the different required data utilities. The expected estimation error of the collaborative Laplace mechanism and the collaborative discrete Laplace mechanism increases as the required data utility decreases under the same desired privacy budget. Thus, the collaborative Laplace mechanism and the collaborative discrete Laplace mechanism can achieve the expected privacy preservation of query results.

In Table 1, we can calculate the utility factor of any query result based on different required data utilities. We can obtain the required privacy budget based on the utility factor of any query result and the desired privacy budget of generating the conditional filtering noise. Then, we can achieve collaborative differential privacy for any query result by using the collaborative Laplace mechanism under the required privacy budget. Since Figure 4 shows that the expected estimation error of the collaborative discrete Laplace mechanism fluctuates as the desired privacy budget increases for different required data utilities, we can likewise achieve collaborative differential privacy for any query result by using the collaborative discrete Laplace mechanism under the required privacy budget. Therefore, the collaborative Laplace mechanism and the collaborative discrete Laplace mechanism can achieve the required privacy budget for any query result. By Theorems 2 and 3, the collaborative Laplace mechanism and the collaborative discrete Laplace mechanism can achieve differential privacy for the whole query result under different desired data utilities. By parallel composition, if the required privacy budget for the whole query result is given, then the required privacy budget for a single query result follows. Thus, the collaborative Laplace mechanism and the collaborative discrete Laplace mechanism can achieve the expected privacy preservation for any query result from Theorems 2 and 3.

6.2. Data Utility

In Figures 5–7, we experimentally analyse the data utility of LM, DLM, PLM, PDLM, ILM, and IDLM. As the privacy budget increases, the data utility of longitude using these differential privacy mechanisms increases. However, we observe that the data utility of these differential privacy mechanisms fluctuates strongly under the same privacy budget. Thus, differential privacy mechanisms can lead to a data utility disaster.

In Figures 8–10, we experimentally analyse the data utility of using the conditional filtering noise of LM, DLM, PLM, PDLM, ILM, and IDLM. We observe that the data utility reaches almost 100%. While the use of conditional filtering noise can achieve high data utility, it can lead to privacy leakage.

In the collaborative Laplace mechanism, for the three settings of the absolute value of the relative error, the corresponding data utility is 0.5, 0.7, and 0.9. Thus, given the desired privacy budget, the corresponding required privacy budgets of any output are obtained from the utility factors in Table 1. For the required privacy budget of any output under the desired privacy budget, the approximate data utility of longitude is close to 0.5, 0.7, and 0.9 in Figures 11–13. According to Theorem 4 and Table 1, considering the required privacy budget under the utility factor of the data utility for any output, the collaborative Laplace mechanism obtains the approximate data utility. Similarly, according to Theorem 4 and Table 1, the collaborative discrete Laplace mechanism also obtains the approximate data utility in Figures 11–13.

According to the above observations, the collaborative differential privacy mechanisms obtain the approximate data utility for any query result under any required privacy budget and data utility. The larger the data utility, the closer the approximate data utility of the collaborative differential privacy mechanisms is to the data utility. Therefore, the use of collaborative differential privacy mechanisms can achieve the approximate data utility of any query result for different privacy budgets under a given data utility. In other words, collaborative differential privacy mechanisms can ensure the required privacy budget and the approximate data utility of any output.

In Table 2, our experimental results verify that differential privacy (DP), personalized differential privacy (PDP), and individual differential privacy (IDP) can achieve the expected privacy preservation but do not achieve the approximate data utility. However, collaborative differential privacy (cDP) can achieve both the expected privacy preservation and the approximate data utility.

We compare differential privacy using game theory and collaborative differential privacy in Table 3. By constructing a game-theoretical model for specific application scenarios using differential privacy, these game mechanisms can achieve the privacy-utility equilibrium, but they need a higher computational cost than collaborative differential privacy. Moreover, because of the privacy-utility monotonicity of differential privacy, it is difficult to construct a reasonable game model of the differential privacy-utility equilibrium. Thus, we construct the collaborative model and algorithm of the differential privacy mechanism in a collaborative manner. According to the expected data utility of the data analyst, the collaborative approach can directly achieve the privacy-utility balance in this study. The game model of differential privacy achieves the privacy-utility equilibrium through strategy interaction, and the privacy-utility balance of the collaborative algorithm could also be analysed as a dynamic game with incomplete information. However, this paper focuses on the collaborative model and algorithm of the differential privacy mechanism and intuitively analyses how the collaborative algorithm achieves the privacy-utility balance. Therefore, the game-theoretical analysis of the collaborative model and algorithm is left as future work.

7. Conclusions

In this paper, we defined the privacy and utility metrics and the conditional filtering noise. We proposed the collaborative model of the differential privacy mechanism, and we presented the collaborative algorithm of the differential privacy mechanism that can ensure the required privacy preservation and the approximate data utility. Our theoretical and experimental results demonstrate that the collaborative algorithm of the differential privacy mechanism can maintain the privacy-utility balance. This study provides a new approach to achieving the privacy-utility balance of the differential privacy mechanism from the collaborative perspective. Most importantly, the collaborative algorithm of the differential privacy mechanism can be used for query processing of numerical data.

In this study, the collaborative model of the differential privacy mechanism is only suitable for numerical data, and a new collaborative model is needed for other data types. The privacy budget ε as a privacy metric is generic; in future work, we will define other privacy and utility metrics for the collaborative model. We do not consider the correlation between data attributes or between data records in the collaborative model. Therefore, we may propose utility metrics that measure the correlation of data attributes or data records in future work. Furthermore, we will improve the collaborative model to achieve the privacy-utility balance of correlated data.

Data Availability

The T-Drive taxi trajectory data were used to support this study and are available at https://www.microsoft.com/en-us/research/publication/t-drive-trajectory-data-sample/. This prior dataset is cited at the relevant places within the text as reference [33].

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This research was supported by the National Natural Science Foundation of China under grant nos. 62002081, 62062020, and U1836205, the Project Funded by China Postdoctoral Science Foundation under grant no. 2019M663907XB, the Foundation of Guizhou Provincial Key Laboratory of Public Big Data under grant no. 2018BDKFJJ004, and the Major Scientific and Technological Special Project of Guizhou Province under grant no. 20183001.