Efficient Boolean Keywords Search over Encrypted Cloud Data in Public Key Setting

Zhang, Yu; He, Wei; Li, Yin

doi:https://doi.org/10.1155/2020/2904861

Mobile Information Systems



Volume 2020 | Article ID 2904861 | https://doi.org/10.1155/2020/2904861


Academic Editor: Laurence T. Yang

Received 05 Nov 2019

Revised 14 Jul 2020

Accepted 11 Aug 2020

Published 26 Aug 2020

Abstract

Searchable public key encryption (SPE) supporting keyword search plays an important role in protecting data confidentiality in cloud computing. Current SPE schemes mainly support conjunctive or disjunctive keywords search, which are very basic query operations. In this paper, we propose an efficient and secure SPE scheme that supports Boolean keywords search, which is more expressive than conjunctive and disjunctive keywords search. We first develop a keyword conversion method, which transforms the index and the Boolean keywords query into a group of vectors. Then, by applying a technique called dual pairing vector space (DPVS) to encrypt the obtained vectors, we propose a concrete scheme proven to be secure under chosen keyword attack. Finally, we present a detailed theoretical and experimental analysis to demonstrate the efficiency of our scheme.

1. Introduction

Currently, thousands of information retrieval systems, such as e-mail systems, database management systems, and document management systems, are operating successfully in both the government and private sectors. As the data stored in these systems grow rapidly, more and more people want to migrate these data to the cloud. To preserve data privacy, users often encrypt the data before uploading them to the cloud. Since encrypted data are difficult to retrieve, how to execute keyword search over encrypted data has attracted tremendous research attention over the past few years. Among these research studies, searchable encryption (SE) is one of the most important techniques for searching over encrypted data [1, 2].

SE enables data users to retrieve the encrypted data of interest from a cloud server without decrypting the data. Commonly, SE is divided into two categories: searchable symmetric key encryption (SSE) and searchable public key encryption (SPE). In recent years, many SSE schemes have been proposed to support keyword search over encrypted data [3–6]. In SSE, the key for encrypting data is the same as the key for generating a search trapdoor. By contrast, in SPE, the key for encrypting data is public, while the key for generating a search trapdoor is given only to the authorized data receivers. Compared with SSE, SPE is more suitable for situations in which there are many data senders and only a few data receivers, e.g., e-mail systems [7], personal health records [8], and wireless sensor networks [9]. As illustrated in Figure 1, in the scenario of an e-mail system, the security requirements can be summarized as follows: (1) any data sender can generate encrypted e-mail data; (2) only the data receiver can query and decrypt the encrypted e-mail data; (3) except for the data receiver, no other entity, including the cloud server, can learn the content of the encrypted e-mail data. Since the security characteristics of SPE satisfy all of these requirements, SPE is well suited to this application. Therefore, how to construct an efficient and secure SPE scheme supporting keyword search remains a hot topic in the field of SE.

1.1. Motivation

The very first SPE scheme supporting keyword search was introduced by Boneh et al. and is called public key encryption with keyword search (PEKS) [7]. However, their work only supports single keyword search. To support more expressive queries, many SPE schemes [10–12, 16] were proposed to realize advanced search, for example, conjunctive keywords search (PECK) and disjunctive keywords search (PEDK). In practice, most applications need a more advanced keywords search function than conjunctive and disjunctive keywords search. More precisely, many applications require Boolean keywords search. For example, in an e-mail system, a user may want to issue a query that combines four keywords with both AND and OR operators. A naive thought is that a Boolean query can be obtained by remoulding a PECK or PEDK scheme, i.e., by combining the query results of conjunctive or disjunctive keywords search. However, we argue that this simple method has many drawbacks. To better illustrate our motivation, we construct, based on a PECK or PEDK scheme, a naive scheme supporting a Boolean keywords search over three keywords. We then briefly review this simple solution and explain why it is unsatisfactory.

The approach is to first split the Boolean query into two conjunctive subqueries, execute each of them with the PECK scheme, and take the union of their results. However, this method leaks the trapdoors of the two subqueries. By utilizing these trapdoors, the search results of the individual subqueries are also leaked. Over time, the adversary may combine this information to derive the contents of the user's documents. Alternatively, we can split the Boolean query into two subqueries that the PEDK scheme can answer, execute each of them, and take the intersection of their results. However, this method has the same drawback.

1.2. Contribution

In this paper, we seek to construct a secure and efficient SPE scheme supporting Boolean keywords search that is not based on the PECK and PEDK schemes. We define a Boolean keywords search Q as a combination of conjunctive normal form (CNF) and disjunctive normal form (DNF), denoted by $Q = Q_1 \vee Q_2 \vee \cdots \vee Q_m$, where each $Q_i$ is defined as a conjunction of keywords, $Q_i = w_{i,1} \wedge w_{i,2} \wedge \cdots$. Here, each $w_{i,j}$ is a keyword. This Boolean keywords search is more expressive than conjunctive and disjunctive keywords search. The contributions of our work are summarized as follows:
(1) Inspired by the keyword conversion method introduced in [17], we create a novel keyword conversion method that transforms the index keyword set and the Boolean query into an attribute vector and a group of predicate vectors, respectively. These vectors efficiently realize Boolean keywords search by an inner product operation. Moreover, the vector dimension is much smaller than that generated by the previous method.
(2) By applying the existing technique called dual pairing vector space (DPVS) to encrypt the attribute and predicate vectors, we propose a secure and efficient SPE scheme supporting Boolean keywords search (SPE-BKS), which accomplishes Boolean keywords search over encrypted data with better search efficiency than the previous schemes.

Moreover, for security concern, we introduce a formal security definition for SPE-BKS and give a detailed proof to demonstrate that our scheme is secure against chosen keyword attack. To verify the efficiency of the proposed scheme, we conduct an experiment for comparing our scheme with some recent schemes over a real-world dataset (Enron Email Dataset).

1.3. Related Work

The first SPE scheme supporting keyword search was introduced by Boneh et al. [7]. They called it public key encryption with keyword search (PEKS), and it only supports single keyword search. To support multikeyword search, Park et al. proposed an SPE scheme supporting conjunctive keyword search, which is called public key encryption with conjunctive keywords search (PECK) [10]. In their scheme, each keyword is associated with a keyword field. The mechanism of the keyword field is based on two assumptions: one is that the keywords in a keyword field must be arranged in a preset order; the other is that the same keyword never appears in two different keyword fields of the same document. However, in many applications, the keyword field makes multikeyword search impractical. For instance, in an e-mail system, the keyword fields usually contain “From,” “To,” and “Title.” Many e-mails may have the same keyword in different keyword fields, e.g., “From: LeBron James” and “To: James Harden.” Moreover, the keywords in the keyword field “Title” may be organized in alphabetical order. To address this issue, subsequent work sought to create a PECK scheme without keyword fields. In [11], Boneh and Waters proposed a public key encryption scheme called hidden vector encryption, which can efficiently support conjunctive keywords search without keyword fields. After this, some PECK schemes with better performance were proposed in [12–15]. To support disjunctive keyword search over encrypted data without keyword fields, Katz et al. introduced a novel encryption scheme called predicate encryption supporting inner product, which is also named inner product encryption (IPE) [16]. By changing the index and query into an attribute vector and a predicate vector, respectively, a public key encryption with disjunctive keywords search (PEDK) scheme can be built on the IPE scheme.
Considering that the previous SPE schemes cannot use one trapdoor to realize conjunctive and disjunctive keywords search simultaneously, Zhang et al. proposed two public key encryption with conjunctive and disjunctive keyword search (PECDK) schemes [17, 18], which can efficiently support conjunctive and disjunctive keyword search at the same time. To support expressive queries over encrypted data, based on the Paillier cryptosystem with threshold decryption (PCTD) [19], Yang et al. proposed an SPE scheme supporting versatile search query patterns, such as range, conjunctive, disjunctive, and Boolean keywords search [20]. Miao et al. presented a hybrid keyword-field search scheme that supports both keyword search and range search simultaneously [21]. In addition, their scheme provides an efficient key management mechanism to reduce the storage cost of keys. For the issue of fuzzy keyword search, Yang et al. designed a method to segment keywords according to the position of wildcards and proposed an SPE scheme supporting wildcard keyword search by combining the segmentation method and PCTD [22]. To support keyword search over arbitrary languages, Yang et al. realized a general method which can convert a variety of languages into a uniform big integer. By utilizing this conversion method and PCTD, they obtained an SPE scheme supporting multikeyword rank search in arbitrary languages [23]. To add an access control mechanism to SE, Li et al. created an attribute-based encryption (ABE) scheme which supports not only keyword search but also update operations for users' ciphertexts and secret keys [24]. Then, they presented an outsourced ABE scheme supporting keyword search, which can partially transfer the operations of decryption and key issuing to the cloud server [25]. He et al. proposed an SPE scheme which can control a user's search permission according to an access control policy [26]. Miao et al. proposed an attribute-based keyword search scheme under a shared multiowner setting [27]. Zhang et al. proposed an SPE scheme achieving both Boolean keywords search and fine-grained search permission [28]. For the problem of tensor decomposition over encrypted data, by combining homomorphic encryption and blockchain techniques, Feng et al. designed several schemes to implement different types of tensor decomposition, such as high-order Bi-Lanczos and Tucker decomposition [29–31]. To improve the efficiency of SPE, Hwang et al. created a more efficient SPE scheme by replacing the bilinear pairing operation with the ElGamal encryption system [32]. Lu et al. proposed a certificateless encryption scheme supporting keyword search under a multirecipient setting [33]. To obtain better efficiency, their scheme avoids the costly bilinear pairing operation. Considering the scenario in which devices have limited resources, two secure and efficient energy-saving platforms were proposed to protect users' sensitive data [34, 35]. To resist DoS attacks, Li et al. gave an efficient remote user authentication and privacy-preserving scheme by adopting a technique called extended chaotic maps [36]. To improve search accuracy, Zhang et al. proposed an SPE scheme supporting semantic keywords search by adopting a method called “Word2vec” [37].

1.4. Organization

This paper is organized as follows. In Section 2, we give the framework of SPE-BKS and its security definition. Some basic tools are also provided in the section. In Section 3, the construction of SPE-BKS is given, and its security proof is also presented. The experimental and theoretical analysis is provided in Section 4. We conclude this paper in Section 5.

2. Preliminaries

In this section, we will give a formal definition of the framework and security model of SPE-BKS. In addition, we also briefly introduce some basic ingredients used in our scheme, including dual pairing vector space (DPVS), two important lemmas, and complexity assumption.

2.1. Framework of SPE-BKS

The SPE-BKS consists of three roles: data sender, data receiver, and cloud server. The responsibilities of these three roles are listed as follows:
(1) The data receiver generates the public key (pk) and secret key (sk) and publishes pk. The data receiver also generates the trapdoor for any query of his/her interest and sends the trapdoor to the cloud server.
(2) For a message M with a keyword set $W$, the data sender encrypts $W$ to create the encrypted index $I_W$ by using pk. Moreover, the data sender produces the encrypted message C for M. After this, the data sender sends $I_W$ and C to the cloud server.
(3) When the cloud server receives the trapdoor generated by the data receiver, the server tests the trapdoor against each encrypted index and returns the matched messages to the receiver.

According to the responsibilities of these three roles, we give a formal definition of the framework of SPE-BKS.

Definition 1. SPE-BKS consists of four polynomial-time algorithms (KeyGen, IndexBuild, Trapdoor, and Test) as follows:
(1) KeyGen($1^\lambda$): this algorithm is run by the data receiver. It takes a security parameter $\lambda$ as input and outputs pk and sk.
(2) IndexBuild(pk, $W$): this algorithm is executed by the data sender to encrypt the keyword set $W$. It produces a searchable encrypted index $I_W$ by using pk and $W$.
(3) Trapdoor(pk, sk, Q): this algorithm is executed by the receiver to construct a trapdoor of Q. It takes pk, sk, and Q as input and outputs a trapdoor $T_Q$.
(4) Test(pk, $T_Q$, $I_W$): for the query $Q$ and the index keyword set $W$, we define the function $F(Q, W)$ as follows: if there exists some clause $Q_i$ of $Q$ such that the keyword set in $Q_i$ is a subset of $W$, then $F(Q, W) = 1$. Otherwise, $F(Q, W) = 0$. This algorithm is run by the cloud server. It takes a trapdoor $T_Q$, a secure index $I_W$, and pk as input and outputs 1 if $F(Q, W) = 1$, or 0 otherwise.
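The matching condition that Test is meant to decide can be written down in the clear. The following is a minimal plaintext sketch, not part of the scheme itself: it assumes a query is represented as a list of keyword sets (one per clause), and the name `boolean_match` and the sample keywords are hypothetical.

```python
from typing import Iterable, List, Set


def boolean_match(query: List[Set[str]], index_keywords: Iterable[str]) -> int:
    """Plaintext reference for the matching condition in Definition 1.

    Returns 1 if at least one clause of the query has all of its
    keywords contained in the index keyword set, and 0 otherwise.
    """
    keywords = set(index_keywords)
    return int(any(clause <= keywords for clause in query))


# Hypothetical query: (urgent AND budget) OR (meeting)
query = [{"urgent", "budget"}, {"meeting"}]
print(boolean_match(query, {"urgent", "budget", "q3"}))
print(boolean_match(query, {"urgent", "q3"}))
```

An encrypted Test algorithm is correct when its output agrees with this plaintext function, except with negligible probability.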

2.1.1. Correctness

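For a query Q and a keyword set $W$, let pk, sk, the index $I_W$, and the trapdoor $T_Q$ be correctly generated by the algorithms KeyGen($1^\lambda$), IndexBuild(pk, $W$), and Trapdoor(pk, sk, Q), respectively. The correctness property requires that the following two conditions are met:
(1) If some clause of Q has its keyword set contained in $W$, i.e., $F(Q, W) = 1$, Test(pk, $T_Q$, $I_W$) outputs 1
(2) If no clause of Q has its keyword set contained in $W$, i.e., $F(Q, W) = 0$, Test(pk, $T_Q$, $I_W$) outputs 1 with negligible probability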

In practice, data senders send a message M together with a keyword set $W$. The above algorithms aim to construct a secure and searchable index for $W$. For the message M itself, we can apply a symmetric encryption scheme, e.g., AES or triple DES, to protect its confidentiality. Like the previous SPE schemes, we concentrate only on the searchable encryption part.

2.2. Security Definition of the SPE-BKS

In this section, we present a formal security definition for SPE-BKS, which considers adversaries who can adaptively query the trapdoors of chosen keyword queries and issue two challenge keyword sets. The essence of the security of SPE-BKS is that the adversaries fail to tell which challenge keyword set was encrypted, based on the given trapdoors. Following this description, and inspired by the security definitions of the previous SPE schemes, the security definition of SPE-BKS is given as follows.

Definition 2. An SPE-BKS scheme is adaptively index-hiding against chosen keyword attack if, for all probabilistic polynomial-time (PPT) adversaries $\mathcal{A}$, the advantage of $\mathcal{A}$ in the following game is negligible in the security parameter $\lambda$:
(1) Setup: the challenger runs the KeyGen algorithm to generate pk and sk and gives pk to the attacker $\mathcal{A}$.
(2) Phase 1: the attacker $\mathcal{A}$ can adaptively ask the challenger for the trapdoor of any query Q of his/her choice.
(3) Challenge: $\mathcal{A}$ first selects two keyword sets $W_0$ and $W_1$ and sends them to the challenger. Suppose that $Q_1, \ldots, Q_t$ are the keyword queries for which trapdoors were requested in Phase 1; the only restriction is that these queries cannot distinguish the two challenge keyword sets, i.e., each $Q_i$ matches $W_0$ if and only if it matches $W_1$. Then, the challenger randomly chooses a bit $\beta \in \{0, 1\}$ and generates the index $I_{W_\beta}$. Finally, $I_{W_\beta}$ is sent to $\mathcal{A}$.
(4) Phase 2: $\mathcal{A}$ continues to ask for the trapdoor of any query Q of his/her choice under the restriction mentioned in the Challenge phase.
(5) Response: the attacker $\mathcal{A}$ outputs a bit $\beta'$ and wins the game if $\beta' = \beta$.
Based on the above game, the advantage of $\mathcal{A}$ is defined as follows:
$$\mathrm{Adv}_{\mathcal{A}}(\lambda) = \left| \Pr[\beta' = \beta] - \frac{1}{2} \right|.$$

2.3. Prime Order Bilinear Group

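Let $G$ and $G_T$ be two cyclic groups of prime order p. A bilinear pairing map $e: G \times G \rightarrow G_T$ has the following three properties:
(1) Bilinear: $e(g^a, h^b) = e(g, h)^{ab}$, where $a, b \in \mathbb{Z}_p$ and $g, h \in G$
(2) Nondegenerate: if $g$ is a generator of $G$, then $e(g, g) \neq 1$
(3) Computable: for any $g, h \in G$, $e(g, h)$ can be computed efficiently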

An efficient bilinear map can be obtained by applying the Weil pairing or the Tate pairing [38].

2.4. Dual Pairing Vector Space

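Suppose that $\vec{v} = (v_1, \ldots, v_l) \in \mathbb{Z}_p^l$ and $g \in G$; we have $g^{\vec{v}} = (g^{v_1}, \ldots, g^{v_l})$. We can perform scalar multiplication and vector addition in the exponent. For any $a \in \mathbb{Z}_p$ and $\vec{v}, \vec{w} \in \mathbb{Z}_p^l$, we have $(g^{\vec{v}})^a = g^{a\vec{v}}$ and $g^{\vec{v}} g^{\vec{w}} = g^{\vec{v} + \vec{w}}$. We can also define the product of componentwise pairings, $e_l(g^{\vec{v}}, g^{\vec{w}}) = \prod_{i=1}^{l} e(g^{v_i}, g^{w_i}) = e(g, g)^{\vec{v} \cdot \vec{w}}$. Here, the dot product is taken modulo $p$.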

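We will employ the concept of DPVS, which is introduced in [39]. The notation used to describe DPVS is introduced in [40]. Suppose that $\mathbb{B} = (\vec{b}_1, \ldots, \vec{b}_l)$ and $\mathbb{B}^* = (\vec{b}_1^*, \ldots, \vec{b}_l^*)$ are two random bases of $\mathbb{Z}_p^l$, where l is a fixed dimension; if $\vec{b}_i \cdot \vec{b}_j^* = 0 \pmod p$ whenever $i \neq j$ and $\vec{b}_i \cdot \vec{b}_i^* = \psi \pmod p$ for all $i$, where $\psi$ is a random element in $\mathbb{Z}_p$, then we call $\mathbb{B}$ and $\mathbb{B}^*$ dual orthonormal bases. Obviously, for a generator $g \in G$, $e_l(g^{\vec{b}_i}, g^{\vec{b}_j^*}) = 1$ whenever $i \neq j$, where 1 can be seen as the identity element of $G_T$.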
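The defining property of dual orthonormal bases can be checked numerically on the exponent vectors alone. The sketch below is a toy illustration under stated assumptions: it works directly over $\mathbb{Z}_p$ with a small prime and no pairing group, builds the dual basis as a scaled transpose-inverse of a random invertible matrix (one common way to sample such bases, not necessarily the authors' procedure), and the helper names `mat_inv_mod`, `dual_bases`, and `dot` are hypothetical.

```python
import random

P = 2_147_483_647  # toy prime (2^31 - 1); a real scheme uses the group order


def mat_inv_mod(M, p):
    """Invert a square matrix modulo a prime p by Gauss-Jordan elimination."""
    n = len(M)
    A = [row[:] + [int(i == j) for j in range(n)] for i, row in enumerate(M)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if A[r][col] % p), None)
        if pivot is None:
            raise ValueError("matrix is singular modulo p")
        A[col], A[pivot] = A[pivot], A[col]
        inv = pow(A[col][col], -1, p)
        A[col] = [(x * inv) % p for x in A[col]]
        for r in range(n):
            if r != col and A[r][col]:
                f = A[r][col]
                A[r] = [(x - f * y) % p for x, y in zip(A[r], A[col])]
    return [row[n:] for row in A]


def dual_bases(l, p, rng):
    """Sample bases B, B* (as lists of row vectors) with b_i . b_j* = psi * [i == j]."""
    while True:
        B = [[rng.randrange(p) for _ in range(l)] for _ in range(l)]
        try:
            B_inv = mat_inv_mod(B, p)
            break
        except ValueError:
            continue  # resample in the unlikely singular case
    psi = rng.randrange(1, p)
    # Row i of B* is psi times column i of B^{-1}.
    B_star = [[(psi * B_inv[k][i]) % p for k in range(l)] for i in range(l)]
    return B, B_star, psi


def dot(u, v, p):
    return sum(a * b for a, b in zip(u, v)) % p


rng = random.Random(2020)
B, B_star, psi = dual_bases(4, P, rng)
ok = all(dot(B[i], B_star[j], P) == (psi if i == j else 0)
         for i in range(4) for j in range(4))
print("dual orthonormal:", ok)
```

In the pairing setting, a zero dot product in the exponent corresponds to the componentwise pairing product being the identity element of the target group.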

2.5. Two Important Lemmas

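We will introduce two important lemmas used in the security proof of our scheme. The first lemma is presented in [40]. To describe the lemma formally, we first give some notations and definitions, which are also introduced in [40]. Let t, l be two fixed positive integers where $t \leq l$, $A \in \mathbb{Z}_p^{t \times t}$ be an invertible matrix, and $S \subseteq \{1, \ldots, l\}$ be a subset of size t. Suppose that $\mathbb{B}$ and $\mathbb{B}^*$ are random dual orthonormal bases; a new pair of dual orthonormal bases $\mathbb{B}_A$ and $\mathbb{B}_A^*$ is defined as follows.

Let $B_S$ be an $l \times t$ matrix over $\mathbb{Z}_p$ whose columns are the vectors $\vec{b}_i \in \mathbb{B}$ such that $i \in S$. We can easily see that $B_S A$ is also an $l \times t$ matrix. By keeping all of the vectors $\vec{b}_i$ for $i \notin S$ and exchanging $\vec{b}_i$ for $i \in S$ with the columns of $B_S A$, $\mathbb{B}_A$ is constructed. Because $B_S^* (A^{-1})^{T}$ is also an $l \times t$ matrix, $\mathbb{B}_A^*$ can be constructed by the same method.

For a fixed dimension l and prime p, we denote randomly choosing a pair of dual orthonormal bases $\mathbb{B}$ and $\mathbb{B}^*$ by $(\mathbb{B}, \mathbb{B}^*) \leftarrow \mathrm{Dual}(\mathbb{Z}_p^l)$. $\mathrm{Dual}(\mathbb{Z}_p^l)$ can be viewed as the set of dual orthonormal bases.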

The first lemma is described as follows.

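Lemma 1. For any fixed positive integers $t \leq l$, any fixed invertible $A \in \mathbb{Z}_p^{t \times t}$, and any set $S \subseteq \{1, \ldots, l\}$ of size t, if $(\mathbb{B}, \mathbb{B}^*) \leftarrow \mathrm{Dual}(\mathbb{Z}_p^l)$, then $(\mathbb{B}_A, \mathbb{B}_A^*)$ is also distributed as a random sample from $\mathrm{Dual}(\mathbb{Z}_p^l)$. In particular, the distribution of $(\mathbb{B}_A, \mathbb{B}_A^*)$ is independent of $A$.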

The second lemma introduced in [39] (Lemma 23) is described as follows.

Lemma 2. Let , where is l-dimensional vector space, and and are its dual. For all , ,where and .

2.6. Complexity Assumption

To prove the security of our scheme, the subspace complexity assumption introduced in [40] is needed. The validity of this assumption is also established in [40].

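For a fixed dimension $l$ and a prime $p$, randomly choosing the dual orthonormal bases $\mathbb{B}$ and $\mathbb{B}^*$ is denoted by $(\mathbb{B}, \mathbb{B}^*) \leftarrow \mathrm{Dual}(\mathbb{Z}_p^l)$. $\mathrm{Dual}(\mathbb{Z}_p^l)$ can be seen as the set of dual orthonormal bases. For a positive integer $k \leq l/3$, the definition of this assumption is described as follows.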

Definition 3 (subspace complexity). Given a group generator , we define the following distribution:We assume that, for any PPT algorithm A with output in , the advantage of defined by is negligible in the security parameter .

3. The Proposed SPE-BKS Scheme

In this section, we first introduce a keyword conversion method which converts the index and query keywords into a group of vectors. Then, through taking advantage of DPVS to encrypt these vectors, the construction of SPE-BKS is given. Finally, the security proof of our scheme is presented.

3.1. Keyword Conversion Method

Before describing the method, some notations are introduced. Suppose that any keyword can be expressed as a string in $\{0, 1\}^*$; we define a function $H: \{0, 1\}^* \rightarrow \mathbb{Z}_p$. Since p is a large prime and is larger than the number of all words, $H$ can be collision-resistant. This means that $H(w_i) \neq H(w_j)$ whenever $w_i$ and $w_j$ are two distinct keywords.

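For the index keyword set $W = \{w_1, \ldots, w_n\}$, we construct an equation of degree n with one unknown:

$$f(x) = (x - H(w_1))(x - H(w_2)) \cdots (x - H(w_n)) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0. \quad (4)$$

According to the coefficients of $f(x)$, the vector $\vec{a} = (a_0, a_1, \ldots, a_n)$ for $W$ is obtained.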

For the query , we first split Q into a group of keyword sets. For each , we obtain a keyword set , where . For each , we can create a vector:

Note that if it exists some i such that , where , according to (4) and (5), it is not difficult to verify that .

As a result, we can test each in Q against to make a Boolean keywords search. If , there is at least an such that . Based on this property, a concrete SPE-BKS scheme will be proposed in the next section.
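The inner-product idea behind the conversion can be sketched in the clear. This is an illustration under explicit assumptions, not a reproduction of the paper's equations: the hash `H` is instantiated with SHA-256 reduced modulo a toy prime, the attribute vector holds the coefficients of the polynomial whose roots are the hashed index keywords, and each clause's predicate vector is assumed to be a random linear combination of the power vectors $(1, h, h^2, \ldots, h^n)$ of its hashed keywords. The function names are hypothetical.

```python
import hashlib
import random

P = 2**127 - 1  # toy prime modulus (a Mersenne prime), standing in for the group order


def H(word, p=P):
    """Map a keyword to Z_p (assumed instantiation: SHA-256 reduced mod p)."""
    return int.from_bytes(hashlib.sha256(word.encode("utf-8")).digest(), "big") % p


def attribute_vector(keywords, p=P):
    """Coefficients (a_0, ..., a_n) of f(x) = prod_i (x - H(w_i)) mod p."""
    coeffs = [1]
    for w in keywords:
        root = H(w, p)
        nxt = [0] * (len(coeffs) + 1)
        for k, c in enumerate(coeffs):
            nxt[k] = (nxt[k] - root * c) % p      # contribution of -root * c * x^k
            nxt[k + 1] = (nxt[k + 1] + c) % p     # contribution of c * x^(k+1)
        coeffs = nxt
    return coeffs


def predicate_vector(clause, n, rng, p=P):
    """Assumed construction: sum over clause keywords of s * (1, h, ..., h^n)."""
    vec = [0] * (n + 1)
    for w in clause:
        h, s, power = H(w, p), rng.randrange(1, p), 1
        for k in range(n + 1):
            vec[k] = (vec[k] + s * power) % p
            power = (power * h) % p
    return vec


def inner(u, v, p=P):
    return sum(a * b for a, b in zip(u, v)) % p


rng = random.Random(1)
W = ["alice", "budget", "urgent"]
a = attribute_vector(W)
hit = predicate_vector({"alice", "urgent"}, len(W), rng)
miss = predicate_vector({"alice", "meeting"}, len(W), rng)
print(inner(a, hit) == 0, inner(a, miss) == 0)
```

With this construction the inner product equals a random combination of the values $f(H(w))$ over the clause keywords, so it is zero when every clause keyword is a root of $f$ and nonzero otherwise, except with negligible probability.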

3.2. Construction

According to Definition 1, we present a concrete construction of our SPE-BKS scheme:(i)KeyGen: choosing a bilinear group G of a prime order and setting n′ = 3n + 3, the algorithm randomly selects a pair of dual orthonormal bases from the dual orthonormal bases set , where , and (mod p), where . The algorithm outputs pk and sk as follows:where .(i)IndexBuild: given a keyword set , the algorithm constructs an n-degree polynomial, where are n roots of the equation f (x) = 0. Choosing two random elements , for the vector , this algorithm creates the index as follows:(i)Trapdoor: given a query Q, this algorithm first generates a group of vectors , by using the keyword conversion method introduced in Section 3.1. Then, it randomly chooses and an invertible matrix . Suppose that and in which and , where , for each , the trapdoor generation algorithm computes(i)The trapdoor of Q is .(ii)Test: the test algorithm first computes for each . Suppose that ; it outputs where and . Based on , the test algorithm works as follows:(1)Choose a counter , and set .(2)If , then go to step (3); otherwise, the algorithm computes. If , the algorithm outputs 1 and ends. Otherwise, it sets and goes to step (2).(3)The algorithm outputs 0 and ends.

3.2.1. Correctness

Suppose that and are correctly generated by the “IndexBuild” and “Trapdoor” algorithms, respectively, then we have the following equation:where and .

Owing to , based on the equation above, we have the following equation:

If there exists some such that , it has , which makes , and, thus, the test algorithm outputs 1.

3.2.2. Application

According to the user’s identity, the proposed scheme works as follows:
(1) Data Receiver. The data receiver runs the “KeyGen” function to generate pk and sk, and pk is made public. When the data receiver wants to perform a Boolean keywords search, the “Trapdoor” function is called to generate a trapdoor by using sk and a Boolean query condition. After this, the trapdoor is sent to the cloud server.
(2) Data Sender. For a document set, the data sender builds the secure index by calling the “IndexBuild” function and sends the index to the cloud server.
(3) Cloud Server. Upon receiving a trapdoor generated by the data receiver, the cloud server runs the “Test” function and returns the documents associated with the query to the data receiver.

In the real world, any practical application that needs ciphertext retrieval can integrate our scheme to realize the function of searching on encrypted data.

3.3. Security

To prove the security of our SPE-BKS system, we adopt the dual system encryption method proposed in [41, 42]. According to this method, we give the construction of semifunctional index and trapdoor in our scheme. The semifunctional index and trapdoor will not be implemented in the real system but used in the proof:(i)Semifunctional Index. Let , where i and is introduced in “KeyGen” algorithm. A normal index is constructed by the “IndexBuild” algorithm. Choosing random values , the semifunctional index is created as follows:(i)Semifunctional Trapdoor. Let , where i. A normal trapdoor is constructed by the “Trapdoor” algorithm. Choosing random values where , the semifunctional trapdoor is created as follows:

When using the semifunctional trapdoor to test the semifunctional index, the additional factors will be generated, where .

The security proof of our SPE-BKS scheme relies on subspace complexity assumption which is presented in Section 2.6. We will prove security by using a hybrid method which consists of a sequence of games. These games are described as follows:(1): this game is the real security game.(2): for each , is similar to except that the index given to is semifunctional and the first k trapdoors are semifunctional. The remaining trapdoors are normal. In , all the trapdoors given to are normal and the index is semifunctional. In , the index and all trapdoors are semifunctional.(3): suppose that a keyword set is the challenge keyword set; we construct an n-degree polynomial by using the function , where are n roots of the equation f (x) = 0. Then, we define this game. For each , is similar to except that index is a semifunctional encryption of a vector in which the first k + 1 elements are random and the remaining elements are . is a game such that the index is a semifunctional encryption of a real challenge keyword set, which is identical to . is a game such that the index is a semifunctional encryption of a random keyword set. We will show that these games are indistinguishable in the following lemmas.

Lemma 3. Suppose that there exists a PPT algorithm such that is nonnegligible. Then, we can build a PPT algorithm with nonnegligible advantage in breaking subspace complexity assumption, with n′ = 3n + 3, k = n + 1.

Lemma 4. Suppose that there exists a PPT algorithm such that is nonnegligible. Then, we can build a PPT algorithm with nonnegligible advantage in breaking subspace complexity assumption, with n′ = 3n + 3, k = n + 1.

Lemma 5. Suppose that there exists a PPT algorithm such that is nonnegligible. Then, we can build a PPT algorithm with nonnegligible advantage in breaking subspace complexity assumption, with n′ = 6, k = 2.

Considering the length of the article and the coherence of its structure, the proofs of Lemmas 3–5 are given in the Appendix.

Theorem 1. If subspace complexity assumption holds, then our SPE-BKS scheme is secure.

Proof. If subspace complexity assumption holds, the real security game is indistinguishable from based on the previous lemmas. In , the value of is information-theoretically hidden from the attackers. Hence, we can state that the attackers can attain no advantage in breaking our SPE-BKS scheme.

4. Performance Evaluation

In this section, we present a detailed experiment to demonstrate that our scheme can efficiently perform Boolean keywords search over encrypted data. We implement our scheme in Java with the Java Pairing-Based Cryptography (JPBC) library [43]. In our implementation, the bilinear map is instantiated as a Type A pairing (base field size is 128 bits), which offers a level of security equivalent to 1024-bit DLOG [43]. Our experiment is run on a machine with an Intel® Core™ i7 CPU at 2.90 GHz and 16 GB of memory, over a real-world e-mail dataset called the Enron Email Dataset [44]. We randomly choose 1000 e-mails from the Enron Email Dataset and denote the number of documents by d (d = 1000). To show the efficiency of our scheme, we compare it with three previous SPE schemes in terms of key generation, index building, trapdoor generation, and search. For simplicity, we denote the three schemes introduced in [17, 18, 20] by PECDK-1, PECDK-2, and YY18, respectively. These three SPE schemes can perform conjunctive, disjunctive, and Boolean keywords search over encrypted data.

4.1. Key Generation

As shown in Figure 2(a), the time costs of key generation in PECDK-1 and our scheme are both linear with $O(n^2)$, while that in PECDK-2 is linear with O(n). The reason is that both our scheme and PECDK-1 adopt DPVS to generate group elements in G. Because the dimension of DPVS in our scheme is 3n while that in PECDK-1 is 4n, the time cost of key generation in our scheme is less than that in PECDK-1. In addition, since the key generation algorithm in YY18 is independent of n, its time cost is not related to n. Although the time cost of key generation in our scheme is higher than that in PECDK-2 and YY18, it has little impact in practice, since this algorithm only runs during system initialization and key pair replacement.

Figure 2: Time cost of (a) key generation, (b) index building, (c) trapdoor generation, and (d) search.

As shown in Figures 3(a) and 3(b), because both pk and sk contain group elements in G, the space costs of the key pair in our scheme and PECDK-1 are both linear with the square of n. By contrast, the space cost of the key pair in PECDK-2 is linear with O(n). For YY18, since both pk and sk contain a constant number of big integers, the space cost of the key pair is not related to n. Although the storage cost of keys in our scheme is higher than that in the other three schemes, it remains modest in practice, since only a few copies of these keys need to be stored.

Figure 3: Space cost of (a, b) the key pair (pk and sk), (c) the index, and (d) the trapdoor.

4.2. Index Building

As shown in Figure 2(b), the time costs of index building in PECDK-1, PECDK-2, and our scheme are all linear with $O(n^2)$, while that in YY18 is linear with O(n). For PECDK-2, the index building algorithm needs to convert the keywords into a matrix and then needs exponentiation computations in G to encrypt the keywords. The proposed scheme and PECDK-1 also require exponentiation computations in G owing to DPVS. Compared to PECDK-1, our scheme needs less time for index building since the dimension of DPVS in our scheme is smaller than that in PECDK-1. The time cost of index building in our scheme is slightly higher than that in PECDK-2, since our scheme needs more exponentiation computations than PECDK-2. The reason is that, compared to PECDK-2, our scheme needs more group elements to support a more complex search function. Compared with YY18, our scheme needs more index building time, since the number of exponentiation computations in our scheme grows with $n^2$ while YY18 only runs the encryption algorithm of PCTD n times.

For the storage cost of indices, the number of group elements of G in the index of our scheme is linear with n. For YY18, since each document’s index contains n ciphertexts generated by PCTD, the space cost of the index is linear with O(n). By contrast, the numbers of group elements in the indices of PECDK-1 and PECDK-2 are both linear with the square of n, since the index structures of PECDK-1 and PECDK-2 are both matrices. As shown in Figure 3(c), the storage costs of indices in our scheme and YY18 are linear with O(n), while those in PECDK-1 and PECDK-2 are both linear with $O(n^2)$.

4.3. Trapdoor Generation

As shown in Figure 2(c), the time costs of trapdoor generation in PECDK-1, PECDK-2, YY18, and the proposed scheme are linear with $n^2$, m, m, and $mn^2$, respectively. More precisely, for PECDK-1, the keywords in the query are first converted into a vector whose dimension is n. Then, this vector is encrypted by using DPVS. Since the encryption operation needs on the order of $n^2$ exponentiation computations in G, the time cost of trapdoor generation in PECDK-1 is linear with $n^2$. For PECDK-2, suppose that the number of keywords in the query is m; the query is converted into a vector whose dimension is m, and each dimension needs one exponentiation computation in G. Thus, the time cost of trapdoor generation in PECDK-2 is linear with m. For YY18, if the query contains m keywords, the trapdoor algorithm performs the encryption algorithm of PCTD m times, so the time cost of trapdoor generation is linear with O(m). For the proposed scheme, the query is converted into m vectors, each of dimension n. After this, each vector is encrypted by making use of DPVS, and thus the time cost of trapdoor generation in our scheme is linear with $mn^2$.

From Figure 3(d), the trapdoor space costs of PECDK-1, PECDK-2, YY18, and our scheme are linear in n, m, m, and mn, respectively. The reason is that the trapdoors in PECDK-1, PECDK-2, and our scheme contain n, m, and mn group elements of G, respectively, and the trapdoor in YY18 consists of m PCTD ciphertexts.

4.4. Search

As shown in Figure 2(d), the search time in PECDK-1 is linear in n², while that in PECDK-2, YY18, and our scheme is linear in mn. More precisely, in PECDK-1, the index of W contains n ciphertexts, and each ciphertext needs n pairing operations. In PECDK-2, the index is a matrix and the trapdoor is a vector of dimension m; the test algorithm performs mn pairing operations between the first m rows of the matrix and the vector. In YY18, since the index and the trapdoor hold n and m PCTD ciphertexts, respectively, the test algorithm runs the secure less-or-equal (SLE) protocol and the secure multiplication protocol across domains (SMD) mn times. In the proposed scheme, the trapdoor has m items, and the test algorithm performs n pairing operations between each item and the index, so the total number of pairing operations is on the order of mn. Since PECDK-2 and our scheme need nearly 2mn and 3mn pairing operations, respectively, whereas the pairing count of PECDK-1 grows with n², the search time of our scheme is slightly higher than that of PECDK-2 and lower than that of PECDK-1. Moreover, since a pairing operation costs less than SLE and SMD, our scheme is more efficient than YY18 in the test phase.
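The approximate pairing counts quoted above can be compared with a short Python sketch. It assumes the counts apply to testing one index against one trapdoor and takes the "nearly 2mn" and "nearly 3mn" figures at face value; the function name `approx_pairings` is hypothetical.

```python
def approx_pairings(n: int, m: int) -> dict:
    """Approximate pairing operations to test one index against one trapdoor.

    Assumption: the "nearly 2mn" and "nearly 3mn" figures from the text are
    used as exact counts; lower-order terms are ignored.
    """
    return {
        "PECDK-2": 2 * m * n,
        "ours": 3 * m * n,
    }


if __name__ == "__main__":
    counts = approx_pairings(n=5, m=5)
    print(counts)                              # {'PECDK-2': 50, 'ours': 75}
    print(counts["ours"] / counts["PECDK-2"])  # 1.5
```

Under these assumptions the ratio is 1.5 for any n and m, consistent with the search time of our scheme being only slightly higher than that of PECDK-2.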

4.5. More Comments

As shown in the experimental results, when n = 5, d = 1000, and m = 5, the index building time in our scheme is 331 s, the generation time of a single trapdoor is 1.7 s, and the search time is 142 s. According to the statistical data given in [17, 45], the number of keywords in a document (n) is usually less than 20 (e.g., only 3 to 5 keywords in a scientific paper), and the number of keywords in a query (m) is often less than 10. We therefore argue that our scheme is suitable for applications with few keywords, such as the keywords of scientific literature, e-mail titles and summaries, medical data summaries, and so on.
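As a rough reading of these figures, the sketch below converts the reported totals into average per-document costs. It assumes that d denotes the number of documents and that the reported index building and search times are totals over all d documents; neither assumption is stated in this section, and the helper name `per_document` is hypothetical.

```python
D = 1000             # d = 1000, assumed to be the number of documents
INDEX_BUILD_S = 331  # reported index building time in seconds
SEARCH_S = 142       # reported search time in seconds


def per_document(total_seconds: float, documents: int) -> float:
    """Average cost per document, assuming the total is spread evenly."""
    return total_seconds / documents


if __name__ == "__main__":
    print(per_document(INDEX_BUILD_S, D))  # 0.331
    print(per_document(SEARCH_S, D))       # 0.142
```

Under these assumptions, indexing costs about a third of a second per document and testing one document against a trapdoor costs about 0.14 s.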

Figure 2 shows that the time complexity of our scheme is comparable to that of PECDK-2, yet our scheme additionally supports Boolean keyword search, which is more expressive than conjunctive and disjunctive keyword search. Compared with YY18, which also supports Boolean keyword search, our scheme needs less search time at the price of a longer index building time. In practice, index building is usually a one-time activity, while queries are performed frequently. Thus, we consider it worthwhile to sacrifice index building time in order to reduce retrieval time. Regarding space complexity, Figure 3 shows that our scheme needs less space for index storage but more space for the trapdoor and keys. Since the trapdoor and keys usually require much less storage space than the index, we argue that our scheme is practical in the real world.
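The storage trade-off can be illustrated with a small Python sketch that compares the total index size over a collection with the size of a single trapdoor in our scheme. It uses only the orders of growth from Sections 4.2 and 4.3 (about n group elements per index and about mn per trapdoor); constant factors and key sizes are ignored, d is assumed to be the number of documents, and the function name `storage_entries` is hypothetical.

```python
def storage_entries(n: int, m: int, d: int) -> tuple:
    """Return (total index entries, entries in one trapdoor) for our scheme.

    Assumptions: one index of about n group elements per document, d
    documents in total, and one trapdoor of about m*n group elements;
    constant factors are dropped.
    """
    index_total = d * n
    trapdoor = m * n
    return index_total, trapdoor


if __name__ == "__main__":
    index_total, trapdoor = storage_entries(n=5, m=5, d=1000)
    print(index_total, trapdoor)  # 5000 25
```

With the experimental parameters n = 5, m = 5, and d = 1000, the indices together hold on the order of 200 times more entries than one trapdoor, which is why the larger trapdoor has little effect on overall storage.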

5. Conclusions

In this paper, by applying DPVS and the bilinear pairing, we proposed a searchable public key encryption scheme supporting Boolean keyword search, which is proven to be secure under chosen keyword attack. Compared to previous SPE schemes supporting conjunctive and disjunctive keyword search, the proposed scheme supports a more advanced search function. Moreover, a detailed experiment over a real-world dataset indicates that the efficiency of our scheme is suitable for practical applications with few keywords. Since the efficiency of our scheme still needs to be improved, we will construct a more efficient scheme in future work.

Appendix

A. Proof of Lemma 3

Proof. Given , needs to decide whether are , distributed as , or , .
By using , C can simulate or with . To create pk, firstly, randomly selects an invertible matrix . Then, we define a dual orthonormal bases F and by , , , , , and , , , , , .
implicitly sets and where the matrix is applied as a change of basis matrix to and is applied as a change of basis matrix to , as described in Section 2.5. Note that the first 2n + 2 basis vectors are unchanged. According to Lemma 2, and are properly distributed.
Choosing a function , computes and sends it to . Each time asks to provide a key for a keyword query , creates a normal trapdoor of . Choosing and an invertible matrix and , computes and sends to , where , , .
At some point, sends two challenge keyword sets, and . By randomly choosing and computing an n-degree polynomial , sets where implicitly sets , .
Then, gives the index to . If are equal to , , then this is a properly distributed normal index. In this case, has properly simulated . If are equal to , , then there is an additional term of in the exponent part of the index. The coefficients in the basis are the vector . In order to acquire the coefficients in the basis , we multiply the matrix by the transpose of these vectors and obtain . Since is random, these coefficients are uniformly random. Therefore, in this case, has properly simulated . So, if can distinguish from with nonnegligible advantage, then can use the output of to break the subspace assumption with nonnegligible advantage.

B. Proof of Lemma 4

Proof. Given , needs to decide whether are distributed as follows: , or , .
By using , can simulate or with . To create pk, firstly, randomly selects an invertible matrix and implicitly sets , and , where A is applied as a change of basis matrix to and is applied as a change of basis matrix to , as described in Section 2.5. Then, , , , for . According to Lemma 2, E and are properly distributed.
Choosing a function , computes and sends it to .
When requests the lth trapdoor query, generates the normal trapdoor or the semifunctional trapdoor as follows:
For l < k, choosing and implicitly setting , , where and , can produce semifunctional trapdoor by using .
For l > k, runs the normal trapdoor generation algorithm to produce the normal trapdoor.
To create the kth requested trapdoor, firstly chooses and an invertible matrix . Let and . Then, for each , computes
The above equation implicitly sets and , where . If are equal to , , then this is a properly distributed normal trapdoor. If are equal to , , then this is a properly distributed semifunctional trapdoor. For each , ’s exponent vector contains the item .
At some point, sends two challenge keyword sets, and . By randomly choosing and computing an n-degree polynomial , where , are n roots of the equation f (x) = 0, sets where implicitly sets and .
After that, sends the semifunctional index to . Obviously, contains the exponent vector . The authors observe that if attempts to test whether the kth trapdoor of Q is semifunctional by creating a semifunctional index of keyword set W which satisfies , then can find that test algorithm can still work whether the kth key is semifunctional or not, since and will be eliminated when . Therefore, we can say that the kth key is a nominally semifunctional key.
In view of this, for each , and are distributed as random vectors in the spans of and . In , the coefficients in the basis are the vector , where and , . In order to acquire the coefficients in the basis , we multiply the matrix by the transpose of these vectors and obtain ). Since is random and if , we can say that are uniformly random. In V, the coefficients in the basis are the vector . In order to acquire the coefficients in the basis , we multiply the matrix by the transpose of these vectors and obtain . Since is random and , the coefficients and mentioned above are uniformly random according to Lemma 2, where and , .
According to the above analysis, we conclude that if are distributed as , , has properly simulated . If are equal to , , has properly simulated . Thus, we argue that if can distinguish from with nonnegligible advantage, then can use the output of to break subspace complexity assumption with nonnegligible advantage.

C. Proof of Lemma 5

Proof. Given , , needs to decide whether are distributed as and or as and , respectively.
By using , for , can simulate or with . To construct pk, implicitly sets and , where . Apparently, and are properly distributed dual orthonormal bases.
Because can obtain , pk can be easily created. Each time asks to provide a key for a keyword query , creates a semifunctional trapdoor of . Choosing and an invertible matrix and , computes and sends to , where .
At some point, sends two challenge keyword sets, and . By randomly choosing , , two random vectors , , and computing an n-degree polynomial by using the function , where are n roots of the equation f (x) = 0, then sets
Then, gives the index to . If are equal to and , then this is a properly distributed semifunctional index of the vector . In this case, has properly simulated . If are equal to and , respectively, then this is a properly distributed semifunctional index of the vector , where . In this case, has properly simulated . So, if can distinguish from with nonnegligible advantage, then can use the output of to break the subspace assumption with nonnegligible advantage.

Data Availability

The data used to support the findings of this study are available from the website http://www.cs.cmu.edu/~./enron/.

Conflicts of Interest

The authors declare that they have no conflicts of interest regarding the publication of this paper.

Acknowledgments

The authors gratefully acknowledge the support of the National Natural Science Foundation of China under Grant nos. 61402393 and 61601396 and Nanhu Scholars Program for Young Scholars of XYNU.

References

S. Zerr, D. Olmedilla, W. Nejdl, and W. Siberski, “Zerber + r: top-k retrieval from a confidential index,” in Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology, pp. 439–449, Saint Petersburg, Russia, March 2009.
Z. Xia, X. Wang, X. Sun, and Q. Wang, “A secure and dynamic multi-keyword ranked search scheme over encrypted cloud data,” IEEE Transactions on Parallel and Distributed Systems, vol. 27, no. 2, pp. 340–352, 2016.
Z. Fu, K. Ren, J. Shu, X. Sun, and F. Huang, “Enabling personalized search over encrypted outsourced data with efficiency improvement,” IEEE Transactions on Parallel and Distributed Systems, vol. 27, no. 9, pp. 2546–2559, 2016.
Z. Fu, X. Wu, C. Guan, X. Sun, and K. Ren, “Toward efficient multi-keyword fuzzy search over encrypted outsourced data with accuracy improvement,” IEEE Transactions on Information Forensics and Security, vol. 11, no. 12, pp. 2706–2716, 2016.
C. Guo, R. Zhuang, C.-C. Chang, and Q. Yuan, “Dynamic multi-keyword ranked search based on bloom filter over encrypted cloud data,” IEEE Access, vol. 7, pp. 35826–35837, 2019.
Q. Jiang, Y. Qi, S. Qi, W. Zhao, and Y. Lu, “Pbsx: a practical private boolean search using Intel SGX,” Information Sciences, vol. 521, pp. 174–194, 2020.
D. Boneh, G. Di Crescenzo, R. Ostrovsky, and G. Persiano, “Public key encryption with keyword search,” in Proceedings of the International Conference on the Theory and Applications of Cryptographic Techniques, pp. 506–522, Berlin, Germany, May 2004.
Y. Zhu, D. Ma, and S. Wang, “Secure data retrieval of outsourced data with complex query support,” in Proceedings of the 2012 32nd International Conference on Distributed Computing Systems Workshops, pp. 481–490, June 2012.
P. Xu, S. He, W. Wang, W. Susilo, and H. Jin, “Lightweight searchable public-key encryption for cloud-assisted wireless sensor networks,” IEEE Transactions on Industrial Informatics, vol. 14, no. 8, pp. 3712–3723, 2017.
D. J. Park, K. Kim, and P. J. Lee, “Public key encryption with conjunctive field keyword search,” in Proceedings of the International Workshop on Information Security Applications, pp. 73–86, Berlin, Germany, August 2004.
D. Boneh and B. Waters, “Conjunctive, subset, and range queries on encrypted data,” in Proceedings of the Theory of Cryptography Conference, pp. 535–554, Berlin, Germany, February 2007.
B. Zhang and F. Zhang, “An efficient public key encryption with conjunctive-subset keywords search,” Journal of Network and Computer Applications, vol. 34, no. 1, pp. 262–267, 2011.
C. C. Lee, S. T. Hsu, and M. S. Hwang, “A study of conjunctive keyword searchable schemes,” International Journal of Network Security, vol. 15, no. 5, pp. 321–330, 2013.
M. S. Hwang, S. T. Hsu, and C. C. Lee, “A new public key encryption with conjunctive field keyword search scheme,” Information Technology and Control, vol. 43, no. 3, pp. 277–288, 2014.
Y. Zhang, Y. Li, and Y. Wang, “Efficient conjunctive keywords search over encrypted e-mail data in public key setting,” Applied Sciences, vol. 9, no. 18, p. 3655, 2019.
J. Katz, A. Sahai, and B. Waters, “Predicate encryption supporting disjunctions, polynomial equations, and inner products,” in Proceedings of the Annual International Conference on the Theory and Applications of Cryptographic Techniques, pp. 146–162, Berlin, Germany, April 2008.
Y. Zhang, Y. Li, and Y. Wang, “Conjunctive and disjunctive keyword search over encrypted mobile cloud data in public key system,” Mobile Information Systems, vol. 2018, Article ID 3839254, 11 pages, 2018.
Y. Zhang, Y. Li, and Y. Wang, “Secure and efficient searchable public key encryption for resource constrained environment based on pairings under prime order group,” Security and Communication Networks, vol. 2019, Article ID 5280806, 14 pages, 2019.
X. Liu, R. H. Deng, K. K. R. Choo et al., “An efficient privacy-preserving outsourced calculation toolkits with multiple keys,” IEEE Transactions on Information Forensics and Security, vol. 11, no. 11, pp. 2401–2414, 2016.
Y. Yang, X. Liu, and R. Deng, “Expressive query over outsourced encrypted data,” Information Sciences, vol. 442-443, pp. 33–53, 2018.
Y. Miao, X. Liu, R. H. Deng et al., “Hybrid keyword-field search with efficient key management for industrial Internet of Things,” IEEE Transactions on Industrial Informatics, vol. 15, no. 6, pp. 3206–3217, 2019.
Y. Yang, X. Liu, R. H. Deng, and J. Weng, “Flexible wildcard searchable encryption system,” IEEE Transactions on Services Computing, vol. 13, no. 3, pp. 464–477, 2020.
Y. Yang, X. Liu, and R. H. Deng, “Multi-user multi-keyword rank search over encrypted data in arbitrary language,” IEEE Transactions on Dependable and Secure Computing, vol. 17, no. 2, pp. 320–334, 2020.
J. Li, Y. Shi, and Y. Zhang, “Searchable ciphertext policy attribute based encryption with revocation in cloud storage,” International Journal of Communication Systems, vol. 30, no. 1, Article ID e2942, 2017.
J. Li, X. Lin, Y. Zhang, and J. Han, “KSF-OABE: outsourced attribute based encryption with keyword search function for cloud storage,” IEEE Transactions on Services Computing, vol. 10, no. 5, pp. 715–725, 2017.
K. He, J. Guo, J. Weng, J. Weng, J. K. Liu, and X. Yi, “Attribute-based hybrid Boolean keyword search over outsourced encrypted data,” IEEE Transactions on Dependable and Secure Computing, p. 1, 2018.
Y. Miao, X. Liu, K. K. R. Choo et al., “Privacy-preserving attribute-based keyword search in shared multi-owner setting,” IEEE Transactions on Dependable and Secure Computing, p. 1, 2019.
K. Zhang, M. Wen, R. Lu, and K. Chen, “Multi-client sub-linear boolean keyword searching for encrypted cloud storage with owner-enforced authorization,” IEEE Transactions on Dependable and Secure Computing, p. 1, 2020.
J. Feng, L. T. Yang, Q. Zhu, and K. K. R. Choo, “Privacy-preserving tensor decomposition over encrypted data in a federated cloud environment,” IEEE Transactions on Dependable and Secure Computing, vol. 17, no. 4, pp. 857–868, 2020.
J. Feng, L. T. Yang, and R. Zhang, “Practical privacy-preserving high-order Bi-lanczos in integrated edge-fog-cloud architecture for cyber-physical-social systems,” ACM Transactions on Internet Technology, vol. 19, no. 2, pp. 1–18, 2019.
J. Feng, L. T. Yang, R. Zhang, and B. S. Gavuna, “Privacy preserving tucker train decomposition over Blockchain-based encrypted industrial IoT data,” IEEE Transactions on Industrial Informatics, p. 1, 2020.
M.-S. Hwang, C.-C. Lee, and S.-T. Hsu, “An ElGamal-like secure channel free public key encryption with keyword search scheme,” International Journal of Foundations of Computer Science, vol. 30, no. 2, pp. 255–273, 2019.
Y. Lu, J. Li, and Y. Zhang, “Privacy-preserving and pairing-free multi-recipient certificateless encryption with keyword search for cloud-assisted IIoT,” IEEE Internet of Things Journal, vol. 7, no. 4, pp. 2553–2562, 2020.
S. Singh, P. K. Sharma, S. Y. Moon, and J. H. Park, “EH-GC: an efficient and secure architecture of energy harvesting Green cloud infrastructure,” Sustainability, vol. 9, no. 4, p. 673, 2017.
K.-S. Lim, J. Park, and J. Park, “An energy-efficient virtualization-based secure platform for protecting sensitive user data,” Sustainability, vol. 9, no. 7, p. 1250, 2017.
C.-T. Li, C.-C. Lee, and C.-Y. Weng, “An extended chaotic maps based user authentication and privacy preserving scheme against DoS attacks in pervasive and ubiquitous computing environments,” Nonlinear Dynamics, vol. 74, no. 4, pp. 1133–1143, 2013.
Y. Zhang, Y. Wang, and Y. Li, “Searchable public key encryption supporting semantic multi-keywords search,” IEEE Access, vol. 7, pp. 122078–122090, 2019.
A. Joux, “The Weil and Tate pairings as building blocks for public key cryptosystems,” in Proceedings of the International Algorithmic Number Theory Symposium, pp. 20–3, Berlin, Germany, July 2002.
A. Lewko, T. Okamoto, A. Sahai, K. Takashima, and B. Waters, “Fully secure functional encryption: attribute-based encryption and (hierarchical) inner product encryption,” in Proceedings of the Annual International Conference on the Theory and Applications of Cryptographic Techniques, pp. 62–91, Berlin, Germany, May 2009.
A. Lewko, “Tools for simulating features of composite order bilinear groups in the prime order setting,” in Proceedings of the Annual International Conference on the Theory and Applications of Cryptographic Techniques, pp. 318–335, Berlin, Germany, April 2012.
A. Lewko and B. Waters, “New techniques for dual system encryption and fully secure HIBE with short ciphertexts,” in Proceedings of the Theory of Cryptography Conference, pp. 455–479, Berlin, Germany, February 2010.
B. Waters, “Dual system encryption: realizing fully secure IBE and HIBE under simple assumptions,” in Proceedings of the Annual International Cryptology Conference, pp. 619–636, Berlin, Germany, August 2009.
A. D. Caro, “The java pairing based cryptography library (JPBC),” 2013, http://gas.dia.unisa.it/projects/jpbc/.
W. W. Cohen, Enron e-mail dataset, http://www.cs.cmu.edu/~./enron/.
H. Cui, Z. Wan, R. H. Deng, G. Wang, and Y. Li, “Efficient and expressive keyword search over encrypted data in cloud,” IEEE Transactions on Dependable and Secure Computing, vol. 15, no. 3, pp. 409–422, 2018.

Copyright

Copyright © 2020 Yu Zhang et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
