Abstract
Applications of the Internet of Things assisted by deep learning, such as autonomous driving and smart furniture, have gradually penetrated people’s social life. These applications not only provide great convenience but also promote the progress and development of society. However, ensuring that important personal privacy information in the big data of the Internet of Things is not leaked when it is stored and shared on the cloud remains a challenging issue. The main challenges include (1) changes in access rights caused by the flow of manufacturers or company personnel during sharing and (2) the lack of limits on access time and frequency. We propose a data privacy protection scheme based on time and decryption frequency limitation that can be applied in the Internet of Things. Legitimate users can obtain the original data, while users without a homomorphic encryption key can perform training operations on the homomorphic ciphertext. The scheme does not affect the training of the neural network model, and it improves the confidentiality of data. In addition, the scheme introduces a secure two-party computing protocol to improve security during key generation. For revocation, a validity period is specified for each attribute in advance; once the validity period expires, the attribute is revoked. By using storage lists and setting tokens to limit the number of user accesses, the scheme addresses the data leakage that may be caused by repeated accesses over a long time. The theoretical analysis demonstrates that the proposed scheme can not only ensure safety but also improve efficiency.
1. Introduction
The development of emerging computing technologies (e.g., cloud computing) has brought opportunities for various industries and workloads, such as hyperspectral remote sensing image algorithms [1, 2], classification algorithms [3], matrix operations under linear systems [4, 5], and data generated by Internet of Things (IoT) devices. If the data in a solution is stored in the cloud or the calculation is outsourced to the cloud, the local storage and calculation pressure is greatly reduced. For IoT big data in particular, IoT devices generate huge amounts of data, and the relatively simple structure of traditional machine learning models can no longer meet the new needs of IoT applications. Thus, deep learning technology has been widely used in IoT applications [6], e.g., smart home [7], smart city [8, 9], and autonomous driving [10].
When deep learning is applied to IoT big data, large amounts of data must be obtained from IoT devices to train a neural network. For example, crowdsensing systems collect data from sensors embedded in personally owned mobile devices [11]. These data may contain sensitive user information. However, IoT networks are becoming more vulnerable to various web attacks [12]. Once data owners share these IoT data with others in the same field, they are likely to lose control of the data. If data containing private information are leaked, and there is no effective protection mechanism in the process of IoT search [13], irreversible harm may be caused to the people whose information is leaked. For example, in healthcare, human physiological data collected by wearable IoT devices are fed into deep learning models that predict the physical condition of patients [14–17]. If these data are leaked, the patient may suffer economic loss and even danger to life [18]. In autonomous driving, the deep learning prediction system may be maliciously interfered with. Once location privacy data is obtained maliciously, it may cause traffic safety problems and bring trouble to society [19]. Protecting users’ private data therefore remains a severe challenge for projects that use deep learning to assist IoT applications, and it is a problem that must be solved.
At present, many solutions have been proposed for the big data privacy protection problem in machine learning [20] or deep learning. Generally, these schemes fall into three categories: federated learning [21, 22], encryption-based technologies [23–26], and differential privacy technologies [27, 28]. Figure 1 shows the working principles of these three types of privacy protection. Encryption-based technologies mainly encrypt the data directly, for example by using homomorphic encryption algorithms or by setting access control on data uploaded to cloud servers. However, in practice, data owners want not only to share training data with others but also to guarantee data security. Although homomorphic encryption encrypts the data, it cannot meet the needs of multiuser data sharing within the same field, and it cannot achieve one-to-many fine-grained communication. In attribute-based encryption, only users who satisfy the access policy set by the owner can obtain the data, which enables more flexible access control. Therefore, to reconcile secure storage with fine-grained sharing of IoT big data in deep learning, an attribute-based encryption solution can be introduced. Ciphertext-policy attribute-based encryption (CP-ABE), in which the access policy is associated with the ciphertext, is more suitable for this scenario than key-policy attribute-based encryption, in which the access structure is associated with the key.

In actual data sharing scenarios, visitors have numerous attributes and enterprises engaged in the IoT have many departments, so attributes change frequently. Users obtain keys through their own identity attribute information. If an attribute used to represent identity has no validity period, then even after an employee resigns or a department merges, the access rights of the resigned employee or the original department staff are unaffected, and these employees can still obtain data through their identity attributes. If a resigned employee sells IoT big data for economic benefit, it will not only endanger the interests of the company but may also harm people’s personal safety. It is therefore necessary to set a validity period for each user attribute, after which the attribute is revoked. Moreover, many current solutions allow users unlimited accesses within the set time. To prevent abuse, the number of accesses within the set time must also be limited. Limiting both the access period and the access frequency reduces, to a certain extent, the data leakage caused by employees selling data or by outsiders using decryption attributes to access IoT big data.
We consider the data privacy problems of big data generated in the field of IoT for mobile computing and, using the idea of attribute revocation [29, 30], propose an IoT big data privacy protection scheme based on time and decryption frequency limitation. This scheme combines homomorphic encryption and attribute-based encryption. In summary, the main contributions of this paper are as follows:
(1) We propose a scheme that limits attribute usage time and user decryption frequency. By setting an attribute version number for each attribute as a mark and comparing the attribute’s validity period with the local time, the scheme determines whether the attribute has expired and realizes revocation. In addition, it limits the number of user accesses by establishing a user decryption frequency table and setting access tokens.
(2) We combine homomorphic encryption with ciphertext-policy attribute-based encryption, which makes this solution more effective in improving data confidentiality without affecting neural network model training.
(3) We analyse the security and efficiency of the scheme theoretically.
The remainder of the paper is organized as follows. Section 2 introduces related work, and Section 3 presents the technologies used in this paper. Section 4 describes the design of our scheme. Section 5 analyses the security and efficiency of our scheme. Finally, Section 6 concludes this study.
2. Related Work
Although deep learning has brought great convenience to human life, its application is inseparable from data. If IoT data involving a user’s private information is leaked, it can threaten property and personal safety. A growing number of solutions [31–34] address data security without processing the data directly. Privacy can also be protected by processing the data itself. Lv et al. [35] proposed a secure transaction framework based on the blockchain, which uses the encryption mechanism of the blockchain to ensure information security but does not achieve fine-grained access control. Lindell et al. [36] showed that two parties can process data sets collaboratively without revealing their private data. Agrawal et al. [37] proposed a scheme for outsourcing data to others for data mining tasks and showed that it does not reveal the data owner’s private information during the outsourcing process. Homomorphic encryption is considered to be the most effective and most direct means of protecting user privacy [38]. It allows operations to be performed directly on ciphertext, and the decrypted results are consistent with the results of the corresponding plaintext operations. In 2007, Orlandi et al. [39] combined homomorphic encryption and secure multiparty computation to feed encrypted data into a neural network model for training, which ensured the consistency of the plaintext and ciphertext calculation results while also considering security. In [40], the authors proposed a neural network model that uses encrypted data for training. They also showed that a cloud service can feed encrypted data into the neural network for prediction and return the results in ciphertext form. In [41], the authors improved the scheme in [40] and showed that encrypted data can also be used to train neural networks.
In addition to directly encrypting big data, many solutions add access control as a data protection layer. In [42], the authors proposed the first CP-ABE scheme, in which the access policy is sent to the receiver together with the ciphertext. Because of user and attribute revocation problems, research on revocable ABE has received extensive attention. Shi et al. [43] proposed a scheme under a hierarchical cryptosystem. Once attributes are revoked, the public key, private key, and ciphertext of the scheme all need to be updated, so its revocation efficiency is low. In [44, 45], the authors pointed out that the private key can be divided into two parts. If an attribute is revoked, both keys must be updated, and the ciphertext and header files must be reencrypted, so the cost of revocation is relatively large. In [46], the authors proposed a user revocation scheme based on a time limit, but it did not achieve fine-grained attribute revocation. In [47], the authors proposed a scheme that uses smart contracts to revoke attributes. In addition to these revocation schemes, revocation can also be realized by limiting the number of user accesses. In [48], the authors proposed a scheme in which the decryption frequency can be limited, but it offers only this single function. When sharing IoT big data that can be used for neural network training, users can adopt a scheme that combines homomorphic encryption and CP-ABE. The solution proposed in [49] showed that combining the two technologies in such scenarios can not only reduce the risk of data leakage but also reduce the number of key communications. However, in the field of deep learning-assisted IoT applications, very few solutions combine these technologies while limiting both user access time and the number of user accesses.
3. Preliminaries
3.1. Bilinear Maps
Suppose there is a large prime number $q$ and two cyclic groups $G_1$ and $G_2$, both of order $q$, and let $P$ be a generator of $G_1$. Then, there is a mapping $e: G_1 \times G_1 \to G_2$ with the following properties [50]:
(1) Bilinearity: for all $X, Y \in G_1$ and $a, b \in Z_q$, $e(aX, bY) = e(X, Y)^{ab}$
(2) Nondegeneracy: there exist $X, Y \in G_1$ such that $e(X, Y) \neq 1$, where 1 is the identity element of group $G_2$
(3) Computability: for all $X, Y \in G_1$, $e(X, Y)$ can be calculated by an efficient algorithm.
Then, we call the above mapping a bilinear mapping. In general, the cyclic group $G_1$ is an additive cyclic group, and the cyclic group $G_2$ is a multiplicative cyclic group.
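To make the three properties concrete, the following toy sketch checks them numerically. The construction is an assumption chosen purely for illustration: the first group is modelled as the additive group of integers modulo 11, the second group as the order-11 subgroup of the multiplicative group modulo 23, and the map as $e(x, y) = 2^{xy} \bmod 23$. It is trivially insecure, since discrete logarithms in the first group are easy, and it is not the pairing used by the scheme.

```python
# Toy bilinear map for illustration only (insecure, hypothetical parameters).
Q = 11        # prime order of both groups
P_MOD = 23    # modulus of the target group; 2 has order 11 modulo 23
G2_GEN = 2    # generator of the order-11 subgroup of Z_23^*


def pairing(x: int, y: int) -> int:
    """e(x, y) = G2_GEN^(x*y) mod P_MOD, with x and y in the additive group Z_11."""
    return pow(G2_GEN, (x * y) % Q, P_MOD)


def check_bilinearity() -> bool:
    """Verify e(a*x, b*y) == e(x, y)^(a*b) for every x, y, a, b in Z_11."""
    for x in range(Q):
        for y in range(Q):
            for a in range(Q):
                for b in range(Q):
                    lhs = pairing((a * x) % Q, (b * y) % Q)
                    rhs = pow(pairing(x, y), a * b, P_MOD)
                    if lhs != rhs:
                        return False
    return True


def check_nondegeneracy() -> bool:
    """At least one pair must map to a non-identity element of the target group."""
    return pairing(1, 1) != 1
```

Computability holds here because `pairing` is a single modular exponentiation. Practical schemes instead use pairings over elliptic curve groups, where the hardness assumptions of the next subsections are believed to hold.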
3.2. Diffie-Hellman Problem
For the additive cyclic group $G_1$ in the above bilinear map $e: G_1 \times G_1 \to G_2$, the following problems are considered hard in cryptography and discrete mathematics. Various cryptosystems based on bilinear mappings are built on these hard problems.
Definition 1 (discrete logarithm problem (DL)). Given any two elements $P$ and $Q$, $P, Q \in G_1$, that satisfy $Q = nP$, where $n \in Z_q^*$, it is difficult to calculate the value of $n$.
Definition 2 (computational Diffie-Hellman problem (CDH)). Given a triplet $(P, aP, bP)$, where $P$ is a generator of group $G_1$ and $a, b \in Z_q^*$, it is difficult to calculate the value of $abP$.
Definition 3 (decisional Diffie-Hellman problem (DDH)). Given a four-tuple $(P, aP, bP, cP)$, where $P$ is a generator and $a, b, c \in Z_q^*$, it is difficult to determine whether $c = ab \bmod q$ is true.
Because the above three types of problems are based on group $G_1$, they are all regarded as $G_1$ group problems.
3.3. DBDH Assumption
Given a five-tuple $[P, aP, bP, cP, Z]$, where $P$ is a generator of group $G_1$, $a, b, c \in Z_q^*$, and $Z \in G_2$, it is difficult to determine whether $Z = e(P, P)^{abc}$ is true.
3.4. Access Structure
The access structure is a set of judgment conditions, usually expressed as $\mathbb{A}$, which contains several attribute elements in the attribute set and threshold logic operators (such as OR and AND). If an attribute set satisfies the judgment condition, it is called an authorized set; otherwise, it is called an unauthorized set. Let $\{U_1, U_2, \ldots, U_n\}$ be the entity set of participants. A collection $\mathbb{A} \subseteq 2^{\{U_1, U_2, \ldots, U_n\}}$ is monotone if, for all $B, C$, $B \in \mathbb{A}$ and $B \subseteq C$ imply $C \in \mathbb{A}$. An access structure is a nonempty subset $\mathbb{A}$ of $2^{\{U_1, U_2, \ldots, U_n\}}$, namely, $\mathbb{A} \subseteq 2^{\{U_1, U_2, \ldots, U_n\}} \setminus \{\emptyset\}$. In this proposed solution, the identity information of each user is described by multiple attributes, such as company, department, and position.
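As an illustration of how an attribute set is checked against such a structure, the sketch below evaluates a small threshold tree. The `Node` class, the helper functions, and the example attributes are hypothetical names introduced here; AND is modelled as an n-of-n threshold and OR as a 1-of-n threshold.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Node:
    """A leaf holds an attribute; an inner node holds a threshold over its children."""
    attribute: Optional[str] = None
    threshold: int = 0
    children: List["Node"] = field(default_factory=list)


def leaf(attribute: str) -> Node:
    return Node(attribute=attribute)


def gate(threshold: int, *children: Node) -> Node:
    return Node(threshold=threshold, children=list(children))


def satisfies(node: Node, attributes: Set[str]) -> bool:
    """Return True if the attribute set is an authorized set for this tree."""
    if node.attribute is not None:
        return node.attribute in attributes
    satisfied = sum(1 for child in node.children if satisfies(child, attributes))
    return satisfied >= node.threshold


# Example policy: company:A AND (department:R&D OR position:manager)
policy = gate(2, leaf("company:A"),
              gate(1, leaf("department:R&D"), leaf("position:manager")))
```

Because the gates only count satisfied children, adding attributes to an authorized set can never make it unauthorized, which is the monotonicity property stated in the definition.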
3.5. Secure Two-Party Computing Protocol
A secure two-party computing protocol [51–53] allows two participants in an insecure network environment to jointly compute the value of a function. Each participant obtains its desired output according to the protocol but cannot derive any other information beyond its own value. This ensures that the privacy of the participants is not leaked even when they do not trust each other, which improves the security of the scheme.
3.6. Homomorphic Encryption
Definition. Let $E(K, x)$ denote encrypting $x$ with an encryption algorithm $E$ under the key $K$, and let $F$ denote a certain operation. If there is an efficient algorithm $G$ such that $E(K, F(x_1, \ldots, x_n)) = G(K, F, (E(x_1), \ldots, E(x_n)))$, then $E$ is homomorphic with respect to $F$.
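The property can be illustrated with a toy symmetric scheme over the integers in the style of DGHV, the algorithm this paper later adopts. The key value, noise bounds, and function names below are assumptions chosen for readability; they offer no security and are not the parameters of the proposed scheme.

```python
import secrets

SECRET_KEY = 1_000_003  # toy odd key; real parameters are far larger


def encrypt(bit: int, key: int = SECRET_KEY) -> int:
    """Encrypt one bit as c = m + 2r + key*q with small noise r."""
    r = secrets.randbelow(16)            # noise, kept far smaller than the key
    q = secrets.randbelow(2**20) + 1     # random positive multiplier of the key
    return bit + 2 * r + key * q


def decrypt(ciphertext: int, key: int = SECRET_KEY) -> int:
    """Recover m = (c mod key) mod 2; correct while the noise stays below the key."""
    return (ciphertext % key) % 2


def homomorphic_xor(c1: int, c2: int) -> int:
    """Adding ciphertexts adds the plaintext bits modulo 2."""
    return c1 + c2


def homomorphic_and(c1: int, c2: int) -> int:
    """Multiplying ciphertexts multiplies the plaintext bits."""
    return c1 * c2
```

For example, `decrypt(homomorphic_and(encrypt(1), encrypt(1)))` yields 1 without the evaluator ever seeing the plaintext bits. Each operation grows the noise term, so only a bounded number of operations can be applied before decryption fails; with the bounds above, a single addition or multiplication is always safe.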
4. The Proposed System
4.1. System Solution
In our proposed solution, there are six types of entities: IoT device, cloud server (CSP), data user, attribute authorization centre (AAC), key generation centre (KGC), and time server. The scheme model is shown in Figure 2.

Figure 2 shows that the data owner encrypts the data collected from IoT devices and uploads it to the CSP. A data user sends an access request to the cloud server. Legitimate users can download the document set from the cloud server and decrypt it. The CSP and KGC jointly generate keys for users through an interactive protocol. The time server is responsible for detecting whether the time sent to it by other entities has expired or has been forged or tampered with.
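The time server’s role can be sketched as follows. This is a minimal illustration under stated assumptions: an HMAC from the Python standard library stands in for the digital signature (a real deployment would use a public-key signature so that the verifier does not hold the signing key), and the function names and message layout are hypothetical.

```python
import hashlib
import hmac


def sign_validity(key: bytes, attribute: str, expires_at: int) -> bytes:
    """Produce a tag binding an attribute to its expiry time (Unix seconds)."""
    message = f"{attribute}|{expires_at}".encode()
    return hmac.new(key, message, hashlib.sha256).digest()


def check_validity(key: bytes, attribute: str, expires_at: int,
                   tag: bytes, now: int) -> str:
    """Classify a submitted validity period as 'tampered', 'expired', or 'valid'."""
    expected = sign_validity(key, attribute, expires_at)
    if not hmac.compare_digest(expected, tag):
        return "tampered"   # the validity period was forged or modified
    if now >= expires_at:
        return "expired"    # the attribute has to be revoked
    return "valid"          # the attribute can still be used
```

Checking the tag before comparing times matters: an attacker who could extend `expires_at` without detection would otherwise bypass revocation entirely.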
4.2. System Algorithms
We let group be a bilinear group, let be a generator in group . Let be a bilinear mapping. We choose three hash functions in this scheme: , so that each attribute can be mapped to the group, , and . In addition, for any , an attribute set , the Lagrangian coefficient is defined as .(1). First, the security parameter is used to generate three pairs of public and private keys, which are the key generation centre’s key pair , the cloud server’s key pair , and the public and private key pair for digital signature . KGC randomly selects and sets , so (). At the same time, KGC also selects a random number , so that the public and private key pair used for digital signature is . CSP randomly selects , it sets (, ). Second, CSP allocates initialization information other than public and private keys for users accessing IoT data, including setting the unique identity of the th user as , where . A list is stored in the cloud server, which contains the user’s unique mark , the number of user visits , and the state-related mark . Third, KGC selects a random secret value for the user, and AAC selects a mark for each attribute. Therefore, the system public key is , and the master key is . The initial value of is set to 0.(2). In this part, the digital signature private key , the user’s attribute set , and the attribute version key , and outputs the user’s decryption key. The following four parts are included:(a)Generate attribute version key. This part is executed by AAC. AAC randomly selects any value for each attribute, and is used as a parameter for subsequent use, so the attribute version key is set to , and the attribute version key is generated and sent to CSP.(b)Generate partial user keys. This part is formed by the simultaneous operation of KGC and CSP via introducing a secure two-party computing protocol. First, KGC takes the parameters () as input, and CSP takes the parameter as input. Through calculation, is obtained, and the result is output to CSP. 
CSP selects a random number , calculates , and sends the calculation result to KGC. When KGC receives the result, calculate , and finally, the result is sent to CSP. CSP calculates from the received result . KGC inputs the set attribute version key and outputs partial user’s private key (). The partial user’s decryption key is composed of a combination of the private key generated by CSP and KGC: (c)In this part of the algorithm, , , is the output value of the algorithm VRF [54], refer to the calculation and detection scheme of the algorithm VRF. Therefore, the final decryption key is , the generated decryption key is sent to the user.(d)Set the expiration time for each attribute and digitally sign .(3)HKeyGen (). This algorithm generates the key of a homomorphic encryption algorithm. This scheme uses the DGHV encryption algorithm. In this algorithm, the key is selected as follows: we choose a randomly generated positive prime number as the key , where .(4). This algorithm first inputs the system public key and access policy tree , homomorphic encryption key , and plaintext message . Then, this algorithm outputs encrypted ciphertext . First, the data owner uses the homomorphic encryption key to encrypt the plaintext . The specific operation is as follows: they choose two random numbers , where , . The ciphertext of the document set is calculated by formula , is expressed in binary, and the generated ciphertext is uploaded. Second, the data owner encrypts the homomorphic key and uploads the key with attribute access control to the cloud. The data owner regards the attributes as leaf nodes, the root node of the tree is , and the other nodes are threshold logic operators. The encryption operation performs from the root node and, from top to bottom, produces a linked order for each node, which is polynomial . If is the threshold of nonleaf nodes, then there is a relation . 
Then, they select a random value , set the polynomial on the root node to , and use the homomorphic encryption key to encrypt the plaintext, and use the encryption result to calculate . Let the polynomial of other nodes be , where represents the number associated with any node . The order of nodes is indicated from left to right. In the entire access policy tree, the information carried by each leaf node must be calculated, , . Then, the final is .(5). In this part of the algorithm, after the time server receives the validity period of the attribute, it first needs to verify it with digital signature technology to check whether it has been forged or tampered with and verify it with the following calculation method: If the verification is successful, it means that the attribute has not been forged or tampered with. The time server compares the validity period with the present time to determine whether the attribute has exceeded the validity period. If it has not expired, you need to continue to execute step 6. If it expires, the attribute needs to be revoked. On the contrary, if the verification fails, it means that the validity period has been maliciously modified, then return .(6): after verifying the attribute validity period , it also needs to verify the user’s access times, but the difference is that even if a certain attribute fails, the user still has the possibility of access rights, but if the access times exceed the set threshold, then the user does not have the right to access IoT resource data. This algorithm makes the user’s unique identity , the user’s current state , and the maximum allowed number of decryption into a token and sends the token to the cloud server.(7). In this part, the cloud server first detects and after receiving the token with information. 
If it meets the verification conditions, CSP will detect the number of decryption in the list , if it is satisfied, let , update at this time and store it in the list , and then, the user and CSP continue to perform step 8. Then, let , otherwise, . If , it means accessing users can no longer access IoT big data even if they have access rights.(8). This part of the algorithm is executed by the decryption user and is divided into the following four parts:(a)When the node in the access policy tree belongs to the leaf node in the access policy tree, let , it means that the attribute corresponding to the node computes If the attribute is not in the user’s attribute set, return .(b)When belongs to a nonleaf node in the structure tree, we let be the set of child nodes of each node of size . When exists and the user’s current decryption frequency meet the requirements, then compute If the root node in this structure tree is replaced by the node in the above formula, it can be computed as .(c)When the user’s attribute set meets the requirements, decryption is performed:(d)After the data visitor obtains the homomorphic key , users can obtain the document set by using the homomorphic key .(9). When the attribute is revoked, this algorithm is executed. In the algorithm, it consists of three parts: (a)First, KGC randomly selects a reencryption parameter , which is assigned to AAC, CSP, and users whose attributes have been revoked, so that they can update relevant component information in time. Receiving the update information, AAC updates the attribute version keys of the revoked attributes that it manages, .(b)The next step is to update the user key. CSP obtains the reencryption parameters allocated in the previous step and regenerates the user’s latest version key together with KGC. The updated user key is .(c)The third step is to update the ciphertext. 
In this part, CSP first selects a random cipher value to ensure forward security and then updates the relevant components of the ciphertext after receiving the reencryption parameters. The updated ciphertext is
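The polynomial assigned to each node of the access policy tree during encryption, and the recombination over child nodes with Lagrangian coefficients during decryption, appear to follow the usual threshold secret-sharing pattern. The sketch below shows that pattern over a small prime field rather than in the exponent of a pairing group. The field size, function names, and example values are assumptions for illustration and do not reproduce the scheme’s exact formulas.

```python
import secrets

PRIME = 2**61 - 1  # toy field modulus (a Mersenne prime)


def make_shares(secret: int, threshold: int, n_children: int) -> dict:
    """Pick a random polynomial of degree threshold-1 whose value at 0 is the
    secret, and give child i the share poly(i) for i = 1..n_children."""
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]

    def poly(x: int) -> int:
        return sum(c * pow(x, k, PRIME) for k, c in enumerate(coeffs)) % PRIME

    return {i: poly(i) for i in range(1, n_children + 1)}


def lagrange_coefficient(i: int, index_set: list, x: int = 0) -> int:
    """Product over j in index_set, j != i, of (x - j) / (i - j), modulo PRIME."""
    num, den = 1, 1
    for j in index_set:
        if j != i:
            num = num * (x - j) % PRIME
            den = den * (i - j) % PRIME
    return num * pow(den, -1, PRIME) % PRIME  # modular inverse, Python 3.8+


def recover(shares: dict) -> int:
    """Recombine the polynomial's value at 0 from at least `threshold` shares."""
    indices = list(shares)
    return sum(shares[i] * lagrange_coefficient(i, indices)
               for i in indices) % PRIME
```

A node with threshold 2 and three children would call `make_shares(s, 2, 3)`; any two of the resulting shares recover `s`, while a single share reveals nothing about it. In an attribute-based scheme the same recombination is carried out on group elements, so the user never learns the shares themselves.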
5. Safety and Efficiency Analysis
5.1. Solution Security Analysis
5.1.1. Confidentiality
The confidentiality of this scheme is achieved in two ways. On the one hand, the attributes of the user must satisfy the policy set by the data owner. If the access policy is not met, then the attributes cannot be used to calculate , so it can prevent unauthorized users from stealing sensitive data. On the other hand, while generating the user’s key, a secure two-party computing protocol is used to reduce the impact of an insecure environment and mutually distrusting parties, so that information related to each party’s private key cannot be obtained by anyone other than that party.
5.1.2. Forward Security
Since a decryption frequency limit is set for each user, users who meet the requirements of the access policy must also send a token carrying the number of decryptions to the cloud server when they access data. If the number of accesses exceeds the limit, the user can no longer decrypt, which ensures forward security.
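The counting logic on the cloud server can be sketched as below. The `Token` and `AccessList` classes, their field names, and the "active"/"blocked" state labels are hypothetical; the sketch covers only the bookkeeping and omits token authentication and the cryptographic steps.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Token:
    user_id: str          # unique identity of the user
    state: str            # state mark carried by the token
    max_decryptions: int  # maximum allowed number of decryptions


class AccessList:
    """Per-user decryption counters kept by the cloud server."""

    def __init__(self) -> None:
        self._counts: Dict[str, int] = {}
        self._states: Dict[str, str] = {}

    def register(self, user_id: str) -> None:
        self._counts[user_id] = 0          # the counter starts at 0
        self._states[user_id] = "active"

    def request_decryption(self, token: Token) -> bool:
        """Count and allow the request, or refuse once the limit is reached."""
        if self._states.get(token.user_id) != "active" or token.state != "active":
            return False
        if self._counts[token.user_id] >= token.max_decryptions:
            self._states[token.user_id] = "blocked"
            return False
        self._counts[token.user_id] += 1
        return True
```

Keeping the counter on the server rather than in the token means a user cannot reset the count by replaying an old token. Revoking a user then amounts to removing or blocking that user’s entry in the list.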
5.1.3. Collusion Resistance
Users need to use their own attributes to calculate . If users with different permissions want to create a conspiracy attack, then KGC and CSP will generate partial decryption keys through a secure two-party calculation protocol , where is a unique random value for each user, so even if the attackers collude, they cannot calculate the value of .
5.1.4. Chosen-Plaintext Attack
Proof. We consider that there exists a polynomial adversary that is able to break this solution and algorithm that can overcome the DBDH problem with the advantage of .
Initialization: adversary selects an access structure tree and sends this access strategy tree to challenger , and challenger executes the Setup () initialization algorithm. This part of the process is as follows:
Randomly select four values to calculate , where .
For each attribute , select a random value , when the attribute does not exist in the access structure tree , we set , , if the attribute exists in the access structure tree , we let and .
The public key is published, and challenger keeps the private key .
Phase 1: after challenger obtains the public key, adversary can issue a query request. Adversary selects an attribute set and and submits the information to challenger to apply for a private key. Challenger randomly selects generates the corresponding private key. The calculation process is as follows:If the number of decryptions meets the requirements, will not affect the final decryption effect.
Challenge: adversary has obtained the access control tree at this time and then submits two plaintexts of the same length to challenger . By comparing the attribute sets, if the attribute set sent in the previous step does not meet the structure tree , then the two plaintexts are set to , and the two plaintexts are sent to challenger along with the access strategy tree. Then, randomly selects , calculate:Challenger sends this information to .
Phase 2: can always ask for private key-related information, and then, guesses the ciphertext and needs to give his own guess value .
Guess: if , then DBDH is established, the advantage is , if , the ciphertext cannot be judged, and the advantage is . In summary, . It shows that this scheme can realize that no adversary can break the scheme with a nonnegligible advantage in polynomial time.
5.2. Theoretical Comparison
Our scheme is compared with other schemes in terms of revocation mechanism, time limit, number of decryption limits, and anticollusion. The comparison results are shown in Table 1.
From Table 1, it can be seen that the revocation schemes proposed in [46–49, 56] do not fully meet the revocation needs. Although the scheme in [55] supports both user revocation and attribute revocation, the scenario we consider also requires that the ciphertext can be operated on. The scheme in [49] allows users to operate on ciphertext, but it is not suitable for scenarios where attributes need to be revoked. Our scheme realizes two revocation functions, solves the basic system security problem, and supports operations on ciphertext. Moreover, we also consider two factors: time and frequency of decryption.
Our scheme is compared with other schemes in terms of key generation efficiency, decryption efficiency, and revocation efficiency. is the exponential calculation cost, and is the bilinear pair calculation cost. The comparison results are shown in Table 2.
It can be seen that in [55] only the user performs the decryption operation and in [56] only CSP performs the decryption operation, which will cause one-side pressure. Our scheme can effectively reduce the amount of user tasks by placing part of the decryption task on the cloud server. Also, in [55], while realizing user revocation, the cost is . However, in our scheme, if the user is revoked after judgment, the user only needs to be removed from the list , thus, its computational complexity is better than the schemes [55, 56]. Although the cost of generating the key is relatively high due to the use of a two-party security protocol, the security of the key is guaranteed through this multiparty cooperation method.
6. Conclusions
Since important personal privacy may be leaked while storing and sharing IoT big data on the cloud, we have proposed an IoT big data privacy protection scheme based on time and decryption frequency limitation. The solution realizes revocation both by validity period and by number of decryptions. Access control is set by combining homomorphic encryption and attribute-based encryption. In our scheme, legitimate users with a homomorphic encryption key can obtain the original data, and users without a homomorphic encryption key can perform training operations on the homomorphic ciphertext. Our scheme not only leaves the training of the neural network model unaffected but also improves the confidentiality of the data. At the same time, the security of the system is improved by introducing a secure two-party computing protocol. Through theoretical analysis, we found that our scheme realizes two revocation functions, solves the basic system security problem, and supports operations on ciphertext. For user revocation, its computational complexity is lower than that of the other schemes. In addition, our scheme can effectively reduce the user’s workload by placing part of the decryption task on the cloud server. Therefore, our scheme can not only ensure safety but also improve efficiency. In the next step, we plan to combine the advantages of decentralization and anonymity of blockchain to protect big data in the Internet of Things in a distributed storage environment.
Data Availability
The data used to support the findings of this study are included within the article.
Conflicts of Interest
There is no conflict of interest regarding the publication of this paper.
Acknowledgments
This work was partially supported by the National Natural Science Foundation of China Project (Nos. 61701170 and U1704122), the Key Scientific and Technological Project of Henan Province (Nos. 202102310340 and 202102210352), the Young Elite Scientist Sponsorship Program by Henan Association for Science and Technology (No. 2020HYTP008), the Foundation of University Young Key Teacher of Henan Province (Nos. 2019GGJS040 and 2020GGJS027), and the Key Scientific Research Project of Colleges and Universities in Henan Province (No. 21A110005).