Abstract
With the rapid development of the Internet of Things (IoT), the need for secure two-factor authentication schemes for IoT is growing. Two-factor protocols are deployed to achieve a higher security level than single-factor protocols. Given the resource constraints of IoT devices, factors such as biometrics are ruled out as additional authentication factors because of their large overhead, and smart cards are prone to side-channel attacks. Therefore, historical big data have recently gained interest as a novel authentication factor in IoT. In this paper, we show that existing big data-based schemes fail to achieve their claimed security properties, such as perfect forward secrecy (PFS), key compromise impersonation (KCI) resilience, and server compromise impersonation (SCI) resilience. Assuming a real strong attacker rather than a weak one, we show that previous schemes not only fail to provide KCI and SCI resilience but also fail to provide real two-factor security and revocability, and they suffer from inside attack. We then propose a novel scheme that provides real two-factor security, PFS, KCI resilience, inside attack resilience, and revocability of the client. Furthermore, our performance analysis shows that our scheme requires fewer modular exponentiations and multiplications for both the client and the server than Liu et al.’s scheme, which reduces the execution time by one third for security levels of . Moreover, to cope with the potential threat of quantum computers, we suggest using lightweight XMSS signature schemes, which provide the desired security properties with bit postquantum security. Finally, we prove the security of our proposed scheme formally using both the real-or-random model and the ProVerif analysis tool.
1. Introduction
The Internet of Things (IoT) has enabled a wide range of objects in our world to send and receive data over the Internet. IoT devices have limited power, storage, and processing capabilities. In addition, they are often deployed in public and hostile environments, which exposes them to a wide range of attacks such as physical and cloning attacks. To provide security in these networks, cryptographic solutions can guarantee authentication and key agreement between IoT devices and the server.
To this end, single-factor authentication schemes such as password-based or secret key-based schemes, in which a shared secret is the only authentication factor, are no longer sufficient for addressing the security requirements. Given the advances made in physical or side-channel attacks, the adversary can obtain IoT device’s secrets and thus compromise the entire system.
Multifactor authentication schemes have been proposed to resolve the potential leakage of device’s secrets and add an extra defense layer in order to provide a more resilient way of authenticating IoT devices.
Recently, researchers have adopted the big data generated at high velocity by IoT devices as an authentication factor [1, 2]. In this paper, we focus on Liu et al.’s scheme [2], which claims to provide various security goals such as key compromise impersonation (KCI) resilience, perfect forward secrecy (PFS), and server compromise impersonation (SCI) resilience in addition to standard goals. Assuming a real strong attacker, we show that their scheme fails to provide real two-factor security. Further, the adversary can mount KCI, SCI, and inside attacks.
1.1. Related Works
So far, the existing multifactor authentication schemes have used one or a combination of the following factors (Figure 1):
(1) What you know (password): A password is the most conventional method of providing security and authentication. However, it is prone to many attacks such as loss of password, eavesdropping, and password guessing attacks [3].
(2) What you are (biometric): In this method, the user’s identity is recognized using their fingerprint, facial features, hand shape, iris structure, voice, etc. [4, 5]. The main challenge to widespread implementation of this method is its cost. Also, biometrics are unrecoverable once compromised. For instance, at the GeekPwn 2019 conference in Shanghai, it was shown how to create and use a photograph of a user’s fingerprint to unlock their smartphone in less than 20 minutes [6].
(3) What you have (smart card and PUF): Given the challenges of the above factors, researchers adopted additional factors (smart card/PUF) to enhance security. In the following, we review the existing three-factor (password, biometric, and smart card) authentication schemes. Jiang et al. [7] used ECC to construct a temporal credential-based scheme with untraceability. Li et al. [8] pointed out several security and functional weaknesses in Jiang et al.’s scheme, such as lack of PFS and vulnerability to the KSSTI attack, and proposed an improved user authentication scheme suitable for the IIoT environment. Their scheme is a three-factor authentication scheme that is compatible with WSNs in the IoT environment. Their scheme also cannot provide real three-factor security, suffers from DoS attack, and does not support user revocation. Apart from conventional cryptographic schemes such as ECC-based or RSA-based ones, chaotic hash operations using Chebyshev polynomials have been adopted by researchers. Roy et al. [9] presented a new chaotic map-based user authentication scheme which exploits user biometrics with the fuzzy extractor, password, and smart card simultaneously. Later, Srinivas et al. [10] and Wang et al. [11] used the chaotic map to propose anonymous biometric-based authentication schemes for IIoT and WSNs, respectively. In all of the above chaotic map-based schemes, the RoR model is used to verify the claimed security properties. These schemes cannot provide real three-factor security and suffer from the clock synchronization problem. In addition, Wang et al.’s scheme does not support a user revocation phase. Kumar et al. [12] used ECC to propose a secure three-factor authentication scheme for WSNs. Again, their scheme does not provide real three-factor security, suffers from the clock synchronization problem, and lacks a user revocation phase. To remedy such weaknesses, Abdi Nasib Far et al. [13] put forth an ECC-based lightweight anonymous privacy-preserving three-factor authentication scheme for WSN-based IIoT. Although their scheme supports dynamic sensor node registration, password and biometric change, and a revocation phase, in this paper we show that it cannot provide real three-factor security and suffers from the clock synchronization problem. In addition, the security level of their scheme with regard to PFS is weak. Recently, Raque et al. [14] proposed an efficient symmetric-based authentication scheme for IIoT. Their scheme uses only symmetric operations such as bitwise XOR and hash functions. Therefore, it cannot provide some security properties such as PFS. In addition, it suffers from the clock synchronization problem. Saeed et al. [15] addressed preserving the privacy of user identity based on a pseudonym variable in 5G. They put forth a solution to preserve user identity privacy in the 5G system using the variable mobile subscriber identity (VMSI), which is changed frequently. Amanlou et al. [16] suggested a lightweight and secure authentication scheme between the fog gateway and IoT devices based on the Message Queuing Telemetry Transport (MQTT) publish–subscribe protocol in a distributed fog computing architecture. Other work investigates DoS attacks in mobile ad hoc networks [17]. Given the successful application of physical unclonable functions (PUFs) in IoT systems for remote authentication, PUFs have been used in biometric-based authentication schemes to enhance the security level [18]. Bian et al. [19] and Zhao et al. [20] applied the characteristics of PUFs and fuzzy extractors to biometric authentication in single-server and multiserver environments, respectively, but neither could provide perfect forward secrecy. Neither of these schemes realized the potential of PUFs to construct a cancelable biometrics template. Recently, Zhang et al. [21] proposed a template transformation method using the response of a PUF, known as PUF-TTM, to generate protected cancelable biometrics. Their method ensures the security and privacy of biometric templates. In addition, they evaluated recent methods of constructing a biometric template in terms of performance, unlinkability, and revocability on the LFW dataset (a popular face recognition dataset used in much research on cancelable biometric (CB) schemes). The PUF-TTM method (which is employed in our proposed scheme) has superior performance in all aspects of this evaluation. The contributions and limitations of each scheme are presented in Table 1. Chan et al.’s scheme, as the first big data-based scheme, only achieves authentication of the server to the client but not vice versa. In addition, it is not truly two-factor secure, assuming a strong adversary. Other security requirements are also not satisfied. On the other hand, Liu et al.’s scheme achieves mutual authentication between the server and the IoT device, but it is not truly two-factor secure and suffers from serious vulnerabilities such as KCI attack, inside attack, and irrevocability under a strong attacker model. The same vulnerabilities are observed in their postquantum secure scheme. Our proposed scheme as well as our postquantum secure scheme can achieve all the desired properties under a strong attacker model.
(4) Fourth factor: Given the advances made in side-channel attacks, smart card-based methods are no longer promising. In addition, biometric-based solutions require a large overhead, which is not feasible in IoT environments. Further, in settings without human involvement, a fourth factor is needed as a general-purpose method. “Whom you know” [24] and “where you are” [8] methods have been suggested to fill this gap. In a parallel effort, historical big data, stored in the server over a long time, have been suggested as a new factor (“what have been discussed”). This method was pioneered by Chan et al. [1], who proposed a novel big data-based unilateral two-factor authentication scheme. The two factors are the shared long-term key and all available historical data with their corresponding tags, which are stored as data-tag tuples. They proved the security of their scheme in a bounded retrieval model, where the adversary can access only part of the data-tag tuples. Following the same method, Liu et al. [2] introduced an enhanced authentication scheme which achieves more security properties, including mutual authentication, forward secrecy, key compromise impersonation resilience, and server compromise impersonation resilience.

1.2. Motivation
A practical two-factor authentication and key agreement protocol, such as a big data-based protocol, must achieve security in the presence of real attackers. However, existing big data-based schemes (Chan et al. [1] and Liu et al. [2]) assume a weak type of adversary, called the bounded retrieval model, that can access only a small fraction of the data-tag tuples stored in the server after compromising it. Such a respectful adversary is a weak assumption and does not exist in reality. Once the server is compromised, the attacker does not differentiate between items and steals all parameters in the database. Figure 2 shows a schematic of our assumption about the adversary and that of previous schemes. We show that Liu et al.’s scheme cannot achieve its claimed security properties (including two-factor security, tag secrecy, and KCI and SCI resilience; note that the SCI attack is not a standard attack against the security of authentication protocols) in a real attacker setting. In addition, in real IoT scenarios, the server is usually in contact with different clients with different identities. However, in Liu et al.’s scheme, the client is anonymous, which makes their scheme irrevocable and also prone to inside attack. Therefore, our main motivation is to design a secure two-factor authentication and key agreement protocol based on big data which achieves real two-factor security, KCI resilience, PFS, and user anonymity under a real attacker model.

1.3. Our Contribution
Our main contributions in this paper include the following:
(1) First, we review Liu et al.’s scheme and show its vulnerabilities, such as the lack of true two-factor security, susceptibility to KCI attack, irrevocability, and privileged inside attack. Then, we propose our improved big data-based scheme, which satisfies true two-factor authentication under a real and strong attacker model. In addition, it achieves mutual authentication, perfect forward secrecy, key compromise impersonation resilience, resistance to inside attack, revocability, and unlinkability of the client. Table 1 shows the advantages of our proposed scheme compared to previous ones.
(2) Second, we propose a postquantum secure scheme which is suitable for resource-constrained IoT devices and secure against the future advent of quantum computers.
(3) Third, we prove that our proposed scheme achieves the mentioned security properties, both informally and formally, using the real-or-random (ROR) model and the Bellare and Rogaway model.
(4) Finally, we use the Raspberry Pi 3 Model B+ as the IoT device and a PC as the server to implement our proposed scheme and compare its performance with previous schemes. The results indicate that the time complexity of our scheme is reduced by one third compared to Liu et al.’s scheme. Also, the communication cost of our proposed scheme and conventional schemes (Zhao et al. [20], Zheng et al. [22], and Zhang et al. [21]) is in the same range. Our proposed scheme requires additional communication and storage overhead of 125 and 104 bytes compared to Chan et al.’s [1] and Liu et al.’s [2] schemes.
2. Preliminaries
For completeness, we present the relevant preliminaries, threat model, and security requirements of authentication protocols. In addition, a summary of notations is given in Table 2.
2.1. Signer Efficient Multiple-Time Elliptic Curve Signature (SEMECS)
Digital signatures are basic primitives which provide message authentication and non-repudiation in secure networks. However, typical digital signature schemes cannot be used directly in resource-constrained IoT devices because of their large private key and signature sizes. Recently, researchers have tried to design ultra-lightweight signature schemes for resource-constrained devices. In this paper, we use Yavuz and Ozmen’s scheme [25], the signer efficient multiple-time elliptic curve signature (SEMECS), as a state-of-the-art ultra-lightweight signature scheme. This scheme falls into the category of K-time signature schemes, wherein the signing key pair must be regenerated after K signatures. Algorithm 1 describes the key generation, signature, and verification of SEMECS.
The security of the SEMECS scheme is based on the DLP problem:
Definition 1. Discrete logarithm problem (DLP): Let $p$ be a prime and let $g$ and $h$ be nonzero integers (mod $p$). The problem of finding $x$ such that $g^{x} \equiv h$ (mod $p$) within polynomial time is a hard task.
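As an informal illustration of Definition 1, the following minimal Python sketch contrasts the two directions of the problem. The prime, base, and exponent are tiny values chosen here purely for illustration (they are assumptions, not parameters from this paper), and the helper name brute_force_dlog is hypothetical. Modular exponentiation is fast, whereas the naive inverse search below takes time linear in the modulus, which is hopeless at cryptographic sizes.

```python
# Toy illustration of the discrete logarithm problem (DLP).
# All numbers are tiny, illustrative assumptions and offer no security.

def brute_force_dlog(g: int, h: int, p: int):
    """Return the smallest x >= 0 with g**x % p == h, or None if none exists."""
    value = 1
    for x in range(p - 1):
        if value == h:
            return x
        value = (value * g) % p
    return None

p = 1019          # small prime (assumption, illustration only)
g = 2             # base (assumption)
secret_x = 777    # exponent an attacker would like to recover

h = pow(g, secret_x, p)                 # forward direction: fast
recovered = brute_force_dlog(g, h, p)   # inverse direction: exhaustive search

# The recovered exponent maps to the same h; it may differ from secret_x
# when the order of g is smaller than p - 1.
print(pow(g, recovered, p) == h)
```

With a modulus of a few thousand bits the same loop would never finish in practice, which is the hardness that SEMECS relies on.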
2.2. Threat Model and Security Factors
Previous big data-based schemes (Chan et al. [1] and Liu et al. [2]) assumed a bounded retrieval model, wherein the attacker can access only a small portion of the data items stored in the server’s database after compromising the server. In this paper, we adopt a stronger adversary who can obtain all data items after compromising the server. In addition, we assume the following capabilities for the adversary:
(1) The adversary can obtain all secrets stored in the IoT device using side-channel techniques.
(2) The adversary can block, intercept, modify, delete, and resend any message transmitted through the public channel.
(3) When carrying out an attack on perfect forward secrecy, we assume that the adversary can obtain the long-term secrets of both the client and the server.
(4) The adversary can obtain only one of the two security factors, not both of them simultaneously.
The two main security factors used in our scheme are the following:
(i) The secrets stored on the server side, i.e., .
(ii) The secrets stored on the client side, i.e., .
2.3. Security Requirements in IoT Authentication Schemes
2.3.1. Mutual Authentication
The communicating parties should authenticate and verify each other’s legitimacy before exchanging secret messages. This is the most basic requirement which prevents adversaries from impersonating legitimate parties.
2.3.2. Key Agreement with Secrecy
As IoT networks are deployed in various applications including healthcare, industry, and military purposes, sensitive data such as users’ identities, secret keys, and confidential commands need to be private and shared only between legitimate parties. Therefore, after completing the authentication, the communicating parties should establish a shared session key to safeguard their secret information from the adversary.
2.3.3. Two-Factor Security
In order to increase the security level of authentication schemes, two-factor security requires that even if one of the security factors is leaked, security properties should not be violated.
2.3.4. Revocability
According to this requirement, if one of the devices is lost or stolen, the server should be able to revoke its membership without major changes to the whole network.
2.3.5. Anonymity and Unlinkability
In most scenarios, not only the real identity of the device should be hidden (anonymity) but also the adversary should not be able to find any link or relation between the pseudoidentity of the suspected device in different sessions. Otherwise, the suspected device can be easily traced. This requirement is known as unlinkability.
2.3.6. (Perfect) Forward Secrecy
Forward secrecy is a well-known requirement of authentication protocols which ensures that if the long-term parameters of one party (such as long-term keys and the values stored in devices) are revealed in one session, the session keys of previous sessions are not disclosed to the adversary.
Perfect forward secrecy additionally requires the secrecy of previous session keys even when the long-term parameters of both parties are disclosed. This property can be achieved easily by using public-key operations such as Diffie-Hellman. However, using public-key operations in tiny IoT devices is not feasible because of their high computation overhead.
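To make the Diffie-Hellman route to perfect forward secrecy concrete, the sketch below runs a toy ephemeral key exchange in Python. It is an illustration under stated assumptions, not part of the proposed scheme: the modulus (the Mersenne prime 2^127 - 1), the base 3, and the function names are chosen here for brevity and are far too weak for real use. Each party draws a fresh secret per session and discards it afterwards, so a later leak of long-term keys reveals nothing about the session key; the price is the modular exponentiations on each side.

```python
import hashlib
import secrets

# Toy group parameters (assumptions for illustration only, not secure).
P = 2**127 - 1   # a Mersenne prime, far too small for real deployments
G = 3

def ephemeral_keypair():
    """Draw a fresh per-session secret and its public value G^x mod P."""
    x = secrets.randbelow(P - 3) + 2      # x in [2, P - 2]
    return x, pow(G, x, P)

def session_key(own_secret: int, peer_public: int) -> bytes:
    """Hash the shared Diffie-Hellman value into a 256-bit session key."""
    shared = pow(peer_public, own_secret, P)
    return hashlib.sha256(shared.to_bytes(16, "big")).digest()

# One session: both sides exchange public values and derive the same key.
a_secret, a_public = ephemeral_keypair()   # client side
b_secret, b_public = ephemeral_keypair()   # server side
k_client = session_key(a_secret, b_public)
k_server = session_key(b_secret, a_public)
print(k_client == k_server)

# Forward secrecy hinges on erasing the ephemeral secrets after the session.
del a_secret, b_secret
```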
2.3.7. Key Compromise Impersonation (KCI) Resilience
KCI resilience requires that the compromise of one party does not let the attacker impersonate other entities to that party. In other words, a key agreement protocol is KCI-resilient if compromise of the long-term parameters of a specific principal does not give the attacker any chance to establish a session key with that principal by impersonating a different principal.
2.3.8. Resistance to Privileged Inside Attack
In IoT networks, the server is usually in contact with billions of tiny IoT devices, and the chance that some of them are compromised is quite high. According to this requirement, no hostile device should be able to impersonate other devices.
2.3.9. Resistance to Well-Known Attacks
Every security protocol in the IoT network should resist well-known attacks such as Man-In-The-Middle (MITM), replay, and DoS attacks. These attacks target the basic requirements of authentication protocols.
3. Review of Liu et al.’s Authentication Scheme
3.1. Initialization Phase
In this phase, the security parameter (e.g. or ) is chosen by the server . Also the public parameters are initialized as follows: a group of prime order , a generator of of , a cryptographic hash function , and two pseudo-random functions (PRFs) . In addition, the server produces the public/private key pair , wherein . It also generates , where as a long-term shared key between the IoT device and the server and for tag generation and data processing, where and . The server holds a dataset with data items . A tag is generated for each data item by the server. The dataset is defined as a container for all data items and tag tuples .
The index parameter is also chosen by the server.
3.2. Authentication Phase
The authentication procedure between the IoT device and the server is summarized as follows. All messages are transmitted through the public channel which is denoted by the dashed line in Figure 3.
(1) The IoT device chooses a random number to compute and . Then, it chooses and a random subset of distinct indices for the tuples in . Then, it sends and to the server.
(2) After receiving the above message, the server calculates and checks if the relation holds or not. If it holds, the server chooses a subset of distinct indices disjoint from . Then, the values of and are computed, where . Here, the values of are computed in the finite field . In addition, it chooses the random value to compute . Finally, the values of and are sent to the IoT device. Here, ① represents the messages transmitted in the first round, i.e. .
(3) After the IoT device receives the message, it computes and , where . Next, it computes and verifies if holds. If this relation holds, it computes and transmits it to the server. Finally, the IoT device calculates the session key and the session identifier as and , respectively.
(4) Upon receiving the message, the server checks if the relation holds or not. If it holds, its session key and session identifier are calculated as and , respectively.
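One mechanical detail of the steps above, namely that the device picks a random subset of distinct indices and the server then picks a second subset disjoint from it, can be sketched as follows. This is only an illustration: the dataset size, the subset size, and the helper name pick_indices are hypothetical, and the sketch does not reproduce the tag or key computations of Liu et al.’s scheme.

```python
import secrets

def pick_indices(n: int, size: int, exclude=frozenset()):
    """Pick `size` distinct indices from range(n), avoiding those in `exclude`."""
    rng = secrets.SystemRandom()            # OS-backed randomness
    candidates = [i for i in range(n) if i not in exclude]
    if size > len(candidates):
        raise ValueError("not enough indices left to choose from")
    return set(rng.sample(candidates, size))

N_ITEMS = 1000   # hypothetical number of data-tag tuples held by the server
SUBSET = 8       # hypothetical number of indices per challenge

device_indices = pick_indices(N_ITEMS, SUBSET)                          # device's choice
server_indices = pick_indices(N_ITEMS, SUBSET, exclude=device_indices)  # server's disjoint choice

print(device_indices.isdisjoint(server_indices))
```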

3.3. On the Flaws of Liu et al.’s Authentication Scheme
3.3.1. Not Truly Two-Factor Security
Two-factor security requires that the security properties should not be violated in the case of compromising one factor. In Liu et al.’s scheme, the two security factors are as follows:
(i) The long-term private key shared between the client and the server.
(ii) The server’s dataset, which contains a large number of data items denoted as along with their corresponding tags, which are stored as tuples in the server’s database .
Liu et al. explicitly claim that the adversary cannot forge the tags after compromising the server. However, they assumed a weak model of adversary, called the bounded retrieval model, that can obtain only a small fraction of the data and tag tuples , when it compromises the server. In contrast, a strong adversary not only steals the whole dataset but also obtains the other factor, i.e. the long-term shared private key , after compromising the server. In other words, the security of Liu et al.’s scheme is based on a single factor, i.e. . In this strong model, other security requirements such as KCI resilience no longer hold.
3.3.2. Being Prone to KCI Attack
According to the KCI definition [26], even if the long-term secrets of one party are revealed, the attacker should not have any chance to impersonate other parties to the corrupted one. Despite Liu et al.’s claim, their scheme is not KCI resilient. Once the server is corrupted, the attacker can impersonate the client . To this end, the attacker, who is given the long-term parameters of the server, , chooses a random number to compute and . Then, it chooses and a random subset of distinct indices for the tuples in . Then, it sends and to the server. After receiving the server’s response, i.e. , it computes . In addition, using the server’s long-term parameters, it computes . Then, it computes and transmits it to the server. Finally, it calculates the session key and the session identifier as and , respectively, and makes further connections with the server using this session key. After receiving the attacker’s message, the server accepts the attacker as a legitimate client because the relation holds.
3.3.3. Inefficient Data Items Authentication
In order to authenticate data items for the IoT device in both Chan et al.’s and Liu et al.’s schemes, each data item has a corresponding tag . The server computes the value of and sends the sum of data items to the IoT device along with the hash value of in message . The IoT device needs to construct the key corresponding to each data item index , compute the value of itself, and make sure that the server owns by checking the relation . However, there is no clear rationale for using tag items and this complicated authentication method. There are simpler methods to authenticate the data items . For instance, a simple MAC can resolve the issue. In the next section, we propose our enhanced scheme, which uses a more efficient method to realize dataset authentication for the IoT device.
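As a rough illustration of the remark that a simple MAC could authenticate selected data items, the sketch below computes one HMAC-SHA256 value over an index-to-item mapping with a shared key. It is a generic sketch under assumptions made here (the key, the sample items, and the function names mac_over_items and verify_items are hypothetical) and is not the dataset authentication method of the proposed scheme.

```python
import hashlib
import hmac

def mac_over_items(key: bytes, items: dict) -> bytes:
    """HMAC-SHA256 over (index, length, data) for each item, in index order."""
    mac = hmac.new(key, digestmod=hashlib.sha256)
    for index in sorted(items):
        data = items[index]
        mac.update(index.to_bytes(4, "big"))       # bind the item to its index
        mac.update(len(data).to_bytes(4, "big"))   # length prefix avoids ambiguity
        mac.update(data)
    return mac.digest()

def verify_items(key: bytes, items: dict, tag: bytes) -> bool:
    """Recompute the MAC and compare in constant time."""
    return hmac.compare_digest(mac_over_items(key, items), tag)

shared_key = b"\x01" * 32                      # hypothetical shared key
selected = {3: b"temp=21.5", 7: b"temp=21.7"}  # hypothetical selected items
tag = mac_over_items(shared_key, selected)

print(verify_items(shared_key, selected, tag))
```

A single MAC over the selected items replaces the per-item tag reconstruction, at the cost that the verifier must receive or already hold the items being authenticated.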
3.3.4. Extensive Modular Exponentiation
Modular exponentiation is an expensive operation in discrete-logarithm-based schemes and is time-consuming for resource-constrained users to perform locally. Some researchers have adopted cloud computing to securely outsource modular exponentiation to cloud servers and thereby reduce computation overhead. Liu et al.’s scheme employs 6 modular exponentiations (3 operations by the client and 3 operations by the server) to achieve its desired security goals. Liu et al. were aware of the large computational overhead of their scheme. They suggested an enhanced version of their basic scheme which employs 4 modular exponentiations (1 operation by the client and 3 operations by the server) and computes the values of in advance and stores them in the client’s device. This solution is also unrealistic because of the memory limitation of the client’s device. In addition, they correctly acknowledge that their enhanced scheme cannot achieve perfect forward secrecy.
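To give a feeling for why each modular exponentiation is costly, the following sketch implements textbook left-to-right square-and-multiply and counts the modular multiplications it performs. The operands are arbitrary values chosen here for illustration, and the function name modexp_count is hypothetical; the count, not the timing, is the point.

```python
def modexp_count(base: int, exponent: int, modulus: int):
    """Left-to-right square-and-multiply.

    Returns (base**exponent % modulus, number of modular multiplications).
    """
    result = 1
    mults = 0
    for bit in bin(exponent)[2:]:          # most significant bit first
        result = (result * result) % modulus
        mults += 1                         # one squaring per exponent bit
        if bit == "1":
            result = (result * base) % modulus
            mults += 1                     # one extra multiply per set bit
    return result, mults

modulus = 2**255 - 19                      # a well-known prime, used here only as a sample modulus
exponent = (1 << 255) // 3                 # arbitrary 254-bit exponent (illustrative)
value, mults = modexp_count(5, exponent, modulus)

print(value == pow(5, exponent, modulus), mults)
```

For an exponent of L bits the loop performs between L and 2L big-number multiplications, so every exponentiation saved removes hundreds of multiplications at typical parameter sizes.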
3.3.5. Anonymous Client Authentication and Irrevocability
In IoT environments, the server is usually in contact with many different clients. In the authentication phase of Liu et al.’s scheme, it is not clear how the server recognizes which client it is communicating with. Consequently, no revocation mechanism is provided for the case in which a device is lost or stolen.
3.3.6. Privileged inside Attack
All clients share the same keys with the server. In addition, the server holds the same data items and tags for all clients. Therefore, a hostile client can impersonate any of the other clients.
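The consequence of sharing one key among all clients can be seen in a few lines. In the sketch below, which is an illustration and not Liu et al.’s actual message format, a challenge response is modeled as an HMAC over a server challenge; the function name respond and all keys are hypothetical. With a single shared key, a hostile client produces exactly the response an honest client would, whereas with per-client keys it cannot.

```python
import hashlib
import hmac
import secrets

def respond(key: bytes, challenge: bytes) -> bytes:
    """Model a client's authentication response as an HMAC over the challenge."""
    return hmac.new(key, challenge, hashlib.sha256).digest()

challenge = secrets.token_bytes(16)

# Case 1: every client holds the same key (the situation criticized above).
shared_key = secrets.token_bytes(32)
honest_response = respond(shared_key, challenge)   # what client A would send
forged_response = respond(shared_key, challenge)   # hostile client B, same key
same_key_forgery_works = hmac.compare_digest(honest_response, forged_response)

# Case 2: each client holds its own key, so B's key does not yield A's response.
key_a = secrets.token_bytes(32)
key_b = secrets.token_bytes(32)
per_client_forgery_works = hmac.compare_digest(
    respond(key_a, challenge), respond(key_b, challenge)
)

print(same_key_forgery_works, per_client_forgery_works)
```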
4. Our Proposed Scheme
4.1. Initialization Phase
As shown in Figure 4, when a client with identity wants to register to the server , it first generates its signing secret/public key with a specific K (the value K represents how many sessions are required to change the signing keys) using Algorithm 1 and submits to the server through a secure channel. The server initializes the hash function and chooses as an initial shared key between the IoT device and the server. Here, . Then, it generates its signing secret/public key using Algorithm 1. In addition, the server chooses a random pseudoidentity for the client and stores the client’s real and pseudoidentity along with its public key and shared key in its own database. In addition, it initializes a session counter for each client. data items are also stored in the server’s database as the second security factor. Finally, it delivers and to the client in a secure manner. The client also initializes a session counter . The long-term secrets of the server and the client are represented as and , respectively.

4.2. Authentication Phase
(1) At first, the IoT device checks its session counter to see if . If it holds, it sends its pseudoidentity to the server. Otherwise, it warns the client to register again.
(2) After receiving the client’s pseudoidentity, the server searches for the ’s corresponding record in its database. If the corresponding session counter , it chooses a random subset of distinct indices of the records in and responds to the client. Otherwise, it aborts.
(3) The IoT device chooses a random number and computes . Then, it chooses a subset of distinct indices of the records in disjoint from . Then, it sends and to the server.
(4) After receiving the above message, the server calculates and checks the equality . If it holds, then the value of is computed. In addition, it chooses the random value and computes mod and signs the parameters using Algorithm 1. Finally, it sends to the IoT device.
(5) When the IoT device receives message , it computes mod and checks if holds. If this relation holds, it constructs the session key and the session identifier as and , respectively. Then, it computes and signs the parameter as using Algorithm 1 and transmits it to the server. Finally, it updates its pseudoidentity, session counter , and shared key as .
(6) After receiving the message , the server computes the session key and the session identifier as and , respectively. Then, it checks the equality . If this equality holds, it updates the client’s pseudoidentity, session counter , and shared key in its database as .
Figure 5 depicts a schematic of our proposed scheme.
4.3. Revocation Phase
If the IoT device of a client with identity and pseudoidentity is compromised or lost, the server removes from the database, which prevents the attacker from using the network or signing any messages. Then, the client chooses a new pseudoidentity , creates a new signing secret/public key pair using Algorithm 1, and delivers the updated to the server in a secure manner. Then, it stores the updated in the new IoT device.
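The server-side bookkeeping implied by the authentication and revocation phases can be sketched as a table of client records indexed by pseudoidentity. Everything in the sketch is an assumption made for illustration: the class and field names are hypothetical, and the acceptance rule "session counter below K" is our reading of the K-time signature limit rather than a formula taken from the text.

```python
from dataclasses import dataclass

@dataclass
class ClientRecord:
    real_id: str
    public_key: bytes
    shared_key: bytes
    counter: int = 0          # sessions completed with the current signing key

class Server:
    def __init__(self, max_sessions: int):
        self.max_sessions = max_sessions   # K: signatures allowed per key pair
        self.records = {}                  # pseudoidentity -> ClientRecord

    def register(self, pseudo_id: str, record: ClientRecord) -> None:
        self.records[pseudo_id] = record

    def accepts(self, pseudo_id: str) -> bool:
        """Serve only known pseudoidentities whose counter is below K (assumed rule)."""
        record = self.records.get(pseudo_id)
        return record is not None and record.counter < self.max_sessions

    def complete_session(self, pseudo_id: str) -> None:
        self.records[pseudo_id].counter += 1

    def revoke(self, pseudo_id: str) -> None:
        """Drop the record so the lost or stolen device is no longer served."""
        self.records.pop(pseudo_id, None)

server = Server(max_sessions=2)
server.register("pid-1", ClientRecord("client-1", b"pk", b"k" * 32))
server.register("pid-2", ClientRecord("client-2", b"pk2", b"j" * 32))
server.revoke("pid-2")                     # device of client-2 reported lost
print(server.accepts("pid-1"), server.accepts("pid-2"))
```

Because each record is per client, revoking one pseudoidentity leaves every other client untouched, which is the property the revocation phase aims for.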

4.4. A Postquantum Secure Scheme
The potential advent of large-scale quantum computers has prompted concern among security researchers because of their capability to solve the (elliptic curve) discrete logarithm and integer factorization problems in polynomial time. Therefore, primitives whose security is based on such problems will no longer be secure in that setting.
In order to adapt to the different requirements and performance challenges of the IoT environment, the PQCRYPTO project [27] has been initiated to address the design of efficient and high-security postquantum systems. Previous postquantum secure authentication and key agreement schemes have used either Supersingular Isogeny Diffie–Hellman (SIDH) [28, 29] or lattice-based cryptography [30, 31] (e.g. Liu’s postquantum secure scheme [23]).
The main primitives in our scheme are XOR, hash function, and SEMECS signature operations. The first two primitives are not threatened by the computational capabilities of quantum computers, although their security level is reduced by half. However, the security of the SEMECS signature is based on the DLP, which can be broken by quantum computers. In this paper, we follow PQCRYPTO’s recommendations to design a postquantum secure scheme. In that project, the XMSS signature scheme is recommended as a standard scheme with postquantum security. XMSS belongs to the family of hash-based signature (HBS) schemes, whose security is based on the security of the hash function, which is believed to be secure against quantum computers. However, it requires thousands of hash operations for each Sign/Verify operation, which is challenging for resource-constrained IoT devices. Recently, Ghosh et al. [32] proposed a latency-area optimized XMSS Sign/Verify design with postquantum security. To this end, they designed and implemented a hybrid HW-SW architecture in which each XMSS Sign/Verify operation takes 4.8 million clock cycles. We omit the details for brevity; see [32] for more information on their design. In Section 6, we compare the running time of our postquantum secure authentication scheme using the hybrid HW-SW design of the XMSS signature with that of the classically secure scheme using the SEMECS signature.
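To illustrate what it means for a signature’s security to rest only on a hash function, the sketch below implements a Lamport one-time signature in Python. This is deliberately not XMSS: XMSS combines the WOTS+ one-time scheme with a Merkle tree so that one public key covers many signatures, whereas the simplified scheme here may sign only a single message per key pair. The function names are chosen for this sketch, and SHA-256 is an assumption.

```python
import hashlib
import secrets

def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def keygen():
    """Secret key: 256 pairs of random strings. Public key: their hashes."""
    sk = [[secrets.token_bytes(32) for _ in range(2)] for _ in range(256)]
    pk = [[H(s) for s in pair] for pair in sk]
    return sk, pk

def bits_of(message: bytes):
    """The 256 bits of SHA-256(message), most significant bit first."""
    digest = H(message)
    return [(digest[i // 8] >> (7 - i % 8)) & 1 for i in range(256)]

def sign(sk, message: bytes):
    """Reveal one secret string per digest bit. Use each key pair only once."""
    return [sk[i][bit] for i, bit in enumerate(bits_of(message))]

def verify(pk, message: bytes, signature) -> bool:
    bits = bits_of(message)
    return len(signature) == 256 and all(
        H(signature[i]) == pk[i][bits[i]] for i in range(256)
    )

sk, pk = keygen()
sig = sign(sk, b"session parameters")
print(verify(pk, b"session parameters", sig))
```

Signing and verifying cost a few hundred hash evaluations here; the chained hashing and tree traversal in XMSS raise this to the thousands of hash operations mentioned above, which motivates hardware acceleration.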
5. Security Analysis
5.1. Informal Analysis
5.1.1. Perfect Forward Secrecy (PFS)
In our proposed scheme, even if the public values and secret values of both parties are given to the attacker, based on the one-wayness property of hash function, the attacker has no way to obtain the value , compute , and calculate the previous session keys i.e. . Similarly, the nonces are protected with the hash function . Therefore, compromise of the dataset items does not help the attacker to find .
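The role of hash one-wayness in this argument can be illustrated with a generic key ratchet. The sketch is hypothetical: the update rule next = SHA-256(label || current) and the name ratchet are assumptions made here, since the exact update formula of the scheme is not reproduced in this section. Both parties can step the key forward deterministically, but stepping backward from a leaked key would require inverting the hash.

```python
import hashlib

def ratchet(key: bytes, label: bytes = b"key-update") -> bytes:
    """Derive the next shared key from the current one (hypothetical update rule)."""
    return hashlib.sha256(label + key).digest()

k0 = bytes(32)        # illustrative initial shared key (all zeros, not secure)
k1 = ratchet(k0)      # key after session 1
k2 = ratchet(k1)      # key after session 2

# Client and server compute the same chain independently.
server_k2 = ratchet(ratchet(k0))
print(k2 == server_k2)

# An attacker who steals k2 can compute later keys, but recovering k1 or k0
# would mean finding a preimage of SHA-256.
```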
5.1.2. KCI Resilience
In the following, we consider two scenarios and show how KCI resilience is achieved in both. Here, the signature is SEMECS for the classical scheme and XMSS for the postquantum secure scheme.
Scenario I. Server impersonation: In the first scenario, the attacker tries to impersonate the server to the compromised IoT device . Here, the attacker has access to the secret parameters of the client, i.e. . It needs to construct and sign the parameters using the secret signing key . However, the secret signing key of the server is not accessible to the attacker.
Scenario II. Client impersonation: In the second scenario, the attacker tries to impersonate the client to the compromised server . As a result, it holds the long-term credentials of the server, i.e. . Here, we give extra capability to the attacker and assume that it knows the client’s pseudoidentity and sends it to the server as the first message . After receiving the server’s subset, i.e. , it generates the random number and computes . Then, it chooses a random subset of indexes and constructs . Finally, it sends and to the server as the third message . The server accepts message and responds with message , i.e. . The attacker might construct but fails to sign it and respond with , as the secret signing key of the client is not available to the attacker.
5.1.3. Anonymity and Unlinkability
In our proposed scheme, not only is the client’s real identity hidden from the attacker, but the pseudoidentity and all other parameters also change in every session. Therefore, the attacker has no way to find any link between sessions or to trace the client.
5.1.4. Revocability
In our proposed scheme, each client has a unique identity, pseudoidentity, and signing secret/public key. If one of the IoT devices is stolen or lost, the server can recognize the lost device and stop serving it without changing the secrets of the whole network. The revocation mechanism in Section 4.3 addresses this issue.
5.1.5. Resistance to Privileged inside Attack
In our proposed scheme, the data items and the client’s identities are bound to each other in the value. In addition, each client holds a separate secret signing key, which prevents hostile clients from impersonating it because they cannot sign and return message to the server.
5.2. Formal Proof
In this section, we formally prove the session key secrecy, KCI, and perfect forward secrecy of our proposed scheme using the real-or-random model proposed by Abdalla et al. [33].
5.2.1. Participants
Our schemes involve two participants: client and server . The i-th instance of participant is denoted by . The i-th instance of and the j-th instance of are represented by and , respectively.
5.2.2. Queries
Oracle queries represent the interaction between an adversary and the protocol participants; the adversary’s capabilities are modelled through these queries. The following queries are used by :
(i) Execute : This query captures passive eavesdropping on the protocol and outputs all messages transmitted between .
(ii) Send (, Start): This query denotes the initialization of the protocol.
(iii) Send : This query simulates an active attacker, , who can forge a message by manipulating, blocking, and intercepting messages. Then, transmits to instance and receives the response from .
(iv) Reveal : This query models the leakage of the ’s session key to .
(v) Corrupt : This query shows that can compromise either the client if or the server if . However, it cannot compromise both of them simultaneously.
(vi) Test : This query is used to model the secrecy of the session key generated by . After receiving this query, a binary bit is chosen such that in the case of , a random key with the same size as is returned to . If , SK is given to . The attacker can issue this query at any time, but no more than once.
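The Test query can be sketched as a small oracle. This is an illustrative sketch, not part of the formal model: the class name `TestOracle`, the convention that bit 1 selects the real key and bit 0 a random key, and the helper `advantage` (which uses the usual real-or-random convention |2·Pr[Succ] − 1|) are all assumptions made for the example.

```python
import secrets


class TestOracle:
    """Toy real-or-random Test oracle for a single accepted session key."""

    def __init__(self, session_key: bytes):
        self._session_key = session_key
        self._bit = secrets.randbits(1)  # hidden challenge bit
        self._used = False

    def test(self) -> bytes:
        """Return the real key or a random key of equal length, at most once."""
        if self._used:
            raise RuntimeError("the Test query may be issued at most once")
        self._used = True
        if self._bit == 1:  # assumed convention: 1 -> real session key
            return self._session_key
        return secrets.token_bytes(len(self._session_key))

    def wins(self, guess: int) -> bool:
        """The adversary succeeds if its guess equals the hidden bit."""
        return guess == self._bit


def advantage(success_probability: float) -> float:
    """Assumed standard convention: Adv = |2 * Pr[Succ] - 1|."""
    return abs(2 * success_probability - 1)


if __name__ == "__main__":
    oracle = TestOracle(secrets.token_bytes(32))
    challenge = oracle.test()
    guess = secrets.randbits(1)  # a blind guess succeeds with probability 1/2
    print(len(challenge), oracle.wins(guess), advantage(0.5))
```

An adversary that can only guess blindly succeeds with probability 1/2, which corresponds to zero advantage under this convention.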
5.2.3. Random Oracle
All participants including can call the “cryptographic one-way hash function,” H, which is modelled as a random oracle.
5.2.4. Partnering
Two instances and are called partners if (1) and hold the same session identifier , i.e. , and (2) is the partner identifier of and vice versa.
5.2.5. Freshness of Instance
An instance is called fresh if (1) it has accepted a session key, (2) no Reveal query has been issued to it or to its partner, and (3) the adversary has issued the Corrupt query no more than once since the beginning of the games.
5.2.6. KCI-Freshness of Instance
An instance is called KCI-fresh if, at the end of the games, (1) it has accepted a session key, (2) the attacker has performed neither Reveal nor Corrupt , and (3) after issuing Corrupt , the attacker can no longer issue the query Send , wherein .
Definition 2. Semantic security of the session key as per the ROR model: An adversary can break the semantic security of the session key if it can distinguish an actual session key from a random key in a given instance. Let denote the advantage of the adversary in breaking the session key secrecy of our proposed scheme, and let refer to the event that uses a Test query for some freshly accepted instances and guesses for the bit that was chosen for the Test query such that . We have . Our proposed scheme (PS) achieves semantic security of the session key if is negligible for any PPT attacker.
Theorem 1. Assume that a polynomial time adversary attempts to violate the semantic security of our proposed scheme (PS) or postquantum secure scheme (PQS). The adversary’s advantage in breaking the semantic security is where and denote the number of hash queries and the range space of the hash function , respectively. The proof is given in Appendix A.
Definition 3. Perfect forward secrecy (PFS) in the ROR model:
Let denote the advantage of the adversary in breaking the perfect forward secrecy of our proposed scheme after it issues queries. Let refer to the event that uses a Test query for some freshly accepted instances and guesses for the bit that was chosen for the Test query such that . We have . Our proposed scheme (PS) achieves PFS if is negligible for any PPT attacker.
Theorem 2. Assume that a polynomial time adversary attempts to violate the perfect forward secrecy of our PS or PQS. The adversary’s advantage in breaking the PFS is
The proof of this theorem is given in Appendix B.
Definition 4. Provable KCI security: Let denote the advantage of the adversary in impersonating the party to the compromised participant. Also, refers to the event that finishes some KCI-freshly accepted instances. We have . Our schemes achieve KCI resilience against corruption of if is negligible for any PPT attacker.
Theorem 3. If denotes the advantage of the attacker in breaking KCI security against corruption of and represents the advantage of the attacker in breaking KCI security against corruption of , we have
Also, for our postquantum-secure scheme, we have
The proof of this theorem is given in Appendix C.
6. Security Analysis with ProVerif
In this section, we validate the security properties of our schemes using the widely used formal verification tool ProVerif. ProVerif is a standard automatic analysis tool that verifies security properties of authentication protocols, including secrecy, authentication, and anonymity. Primitives such as encryption, decryption, digital signatures, and hash functions can all be defined in this framework. Further, cryptographic hard problems such as the Diffie–Hellman problem and the elliptic curve problem can also be defined using “equations”. Here, the attacker can eavesdrop on, insert, and delete messages. After the protocol is described in a high-level language, it is converted to a low-level representation called “Horn clauses”. To check security properties, queries are posed on these Horn clauses. We use a different query for each security property. To test the secrecy of the term M, the following query is used.
query attacker(M).
We also use events to verify authentication properties. Correspondence assertions capture relationships between events and can be expressed in the form “if an event e has been executed, then event e′ has been previously executed.” Moreover, these events may contain arguments, which allow relationships between the arguments of events to be studied.
The result of a query is one of the following:
(i) The security property holds; in this case, all possible scenarios have been checked, including any number of sessions and any message size.
(ii) Otherwise, the tool outputs an attack trace that violates the required security property. However, it does not provide any solution to resolve the failure.
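As an informal illustration of what a (non-injective) correspondence assertion means, the sketch below checks it over a single concrete event trace. ProVerif proves such properties symbolically over all traces; this sketch only checks one given trace. The event names `beginClient` and `endServer`, the argument tuples, and the function `correspondence_holds` are hypothetical and are not taken from our ProVerif model.

```python
from typing import Iterable, Tuple

Event = Tuple[str, tuple]  # (event name, event arguments)


def correspondence_holds(trace: Iterable[Event], end_event: str, begin_event: str) -> bool:
    """Check: every end_event(args) is preceded by a begin_event(args).

    This is the non-injective form: one begin event may justify
    several end events with the same arguments.
    """
    seen_begin = set()
    for name, args in trace:
        if name == begin_event:
            seen_begin.add(args)
        elif name == end_event and args not in seen_begin:
            return False  # end event without a matching earlier begin event
    return True


if __name__ == "__main__":
    honest_trace = [
        ("beginClient", ("PID_1", "nonce_a")),
        ("endServer", ("PID_1", "nonce_a")),
    ]
    attack_trace = [
        ("endServer", ("PID_1", "nonce_b")),  # server accepts, client never started
    ]
    print(correspondence_holds(honest_trace, "endServer", "beginClient"))
    print(correspondence_holds(attack_trace, "endServer", "beginClient"))
```

A trace on which the check fails plays the role of the attack trace that the tool reports when an authentication property is violated.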
The main primitives and queries of our proposed BD-based scheme are shown in Figure 6. Also, the main steps of the client and the server are shown in Figures 7 and 8.



The results of queries are shown in Figure 9. The results indicate the secrecy of the session keys of the IoT device and the server. In addition, no attack trace can be found, and the two pairs of events are executed in order in all sessions.

7. Comparative Performance Summary
To demonstrate the usability of our proposed scheme, we compare the computation, communication, and storage costs of recent conventional schemes (Zhao et al. [20], Zheng et al. [22], and Zhang et al. [21]), big data-based schemes (Chan et al. [1] and Liu et al. [2]), and our proposed scheme.
7.1. Computational Comparison
To compare the computational costs of our proposed scheme with those of previous relevant schemes, we implement our scheme and report the running time of each operation. To make our comparison more accurate, we used Liu et al.’s framework, which consists of a PC with an Intel® Core™ i7-4770 CPU @ 3.4 GHz and 16 GB RAM as the server and a Raspberry Pi 3 Model B+ with an ARM Cortex-A53 @ 1.4 GHz processor and 1 GB RAM as the IoT device. In our scheme, we use SHA-256 for the hash function H ( = 256), which provides 128 bits of security against collision attacks. We also use the Koblitz curve secp256k1.
In Table 3, we compare the number of computations of each type in our scheme with those of previous schemes for the IoT device and the server. Our scheme requires fewer modular exponentiations and multiplications than Liu et al.’s scheme for both the IoT device and the server, which significantly reduces the execution time. PRF E and PRF F are also no longer required. For the IoT device, the execution time is reduced by one third (4.2 ms). The execution time of the server is also reduced to 0.5 ms. All our source code is available online [34].
For our postquantum secure scheme, each XMSS Sign/Verify operation takes 4.8 million clock cycles [32]. Since the server has a 3.4 GHz CPU and the IoT device has a 1.4 GHz processor, each operation would take about 1.4 ms on the server and 3.4 ms on the IoT device. In total, the computation overhead of our postquantum secure scheme differs negligibly from that of the previous scheme (Liu et al.’s), i.e., by less than 1 ms. Although the conventional schemes (Zhao et al. [20], Zheng et al. [22], and Zhang et al. [21]) have lower cost, they fail to achieve the desired security requirements.
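The XMSS running-time estimate above follows from dividing the cycle count by the clock frequency. The short sketch below reproduces this arithmetic; it assumes, as the estimate does, that the 4.8 million cycle count reported for the hybrid HW-SW design of [32] carries over unchanged to both platforms. The function name `xmss_time_ms` is illustrative.

```python
XMSS_CYCLES = 4.8e6  # clock cycles per XMSS Sign/Verify in the hybrid design of [32]


def xmss_time_ms(clock_hz: float, cycles: float = XMSS_CYCLES) -> float:
    """Running time in milliseconds = cycles / frequency, converted from seconds."""
    return cycles / clock_hz * 1e3


server_ms = xmss_time_ms(3.4e9)  # server CPU at 3.4 GHz
device_ms = xmss_time_ms(1.4e9)  # IoT device processor at 1.4 GHz

print(f"server: {server_ms:.2f} ms, IoT device: {device_ms:.2f} ms")
# prints "server: 1.41 ms, IoT device: 3.43 ms"
```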
7.2. Communication and Storage Cost Comparison
In Table 4, we compare the communication cost of our proposed scheme with that of related schemes in terms of the bit length of the messages transmitted during the authentication phase. Following the assumption in [11], we take the bit length of a random number and of an identity to be 160 bits. As we use the Koblitz curve secp256k1 and SHA-256, the ECC point multiplication , modular exponentiation , and hash output are 256 bits. According to Table 1 in Liu et al.’s postquantum secure scheme [23], the sizes of the KE protocol messages between the client and the server are 1584 and 1697 bytes if we choose the Threebar protocol. We neglect the size of the data item index .
In addition, Table 5 compares the storage cost of our schemes with that of previous ones. For the SEMECS signature, we set . Therefore, the sizes of the private and public keys are 256 bits (32 bytes) and 3 KB, respectively. Also, denotes the big data size, which is 3 GB. The size of each signature is 512 bits (64 bytes). For our postquantum scheme, the size of both the private and the public key is . As shown in Tables 4 and 5, the communication cost of our proposed scheme is in the same range as that of the conventional schemes (Zhao et al. [20], Zheng et al. [22], and Zhang et al. [21]), with slight differences. However, it requires 125 and 104 bytes more communication and storage overhead compared to Chan et al.’s [1] and Liu et al.’s [2] schemes.
8. Conclusion
Using big data as a novel authentication factor for IoT was first proposed by Chan et al. and later improved by Liu et al., who added novel security properties such as PFS, KCI, and SCI resilience. In this paper, we showed that, assuming a real strong attacker, KCI and SCI resilience no longer hold in their scheme. In addition, their scheme suffers from inside attacks and does not provide revocability.
We then proposed our novel authentication scheme, which provides PFS, KCI resilience, revocability, and inside attack resilience in a real attacker setting. Furthermore, our performance analysis shows that our scheme requires fewer modular exponentiations and multiplications than Liu et al.’s scheme for both the IoT device and the server. Therefore, the running time of both the IoT device and the server is reduced by one third. Given the potential threat of quantum computers, we also designed a postquantum secure scheme using the lightweight XMSS signature scheme. The computation comparison shows that the overhead of our postquantum secure scheme differs negligibly from that of the previous postquantum secure scheme.
Appendix
A. Proof of Theorem 1
Proof. Our proof is based on the following games; say .
(i) Game : This game simulates a real attacker running in the random oracle model who has access to all oracles. Thus, we have
(ii) Game : This game models a passive attack by using the Execute query. can intercept transmitted messages on the public channel. As the session key is constructed as , the attacker needs to compute to evaluate . Eavesdropping on messages does not help the attacker to this end. Therefore, has no additional advantage for winning this game. As a result,
(iii) Game : This game models an active attacker who can use Send and Hash queries. The secret parameters of the session key are stored in . Using the birthday paradox, the maximum probability of finding a collision with queries is . Therefore, the advantage of the attacker in this game compared to the game is
In order to win the game , after execution of the Test query, needs to guess the bit with maximum probability of :
Using equations (A.1)–(A.4), we have
Therefore,
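To give a sense of the magnitude of the birthday term used in the proof, the sketch below evaluates the standard birthday bound q²/(2·2ⁿ) for q hash queries against an n-bit hash. The exact constant in the bound is an assumption here (the usual textbook form), and the function name `birthday_bound` is illustrative.

```python
from fractions import Fraction


def birthday_bound(num_queries: int, hash_bits: int) -> Fraction:
    """Standard upper bound on the collision probability: q^2 / (2 * 2^n)."""
    return Fraction(num_queries * num_queries, 2 * 2**hash_bits)


if __name__ == "__main__":
    # Example: an attacker issuing 2^64 hash queries against SHA-256 (n = 256).
    bound = birthday_bound(2**64, 256)
    print(bound == Fraction(1, 2**129))  # prints True
    print(float(bound))
```

With SHA-256, even 2^64 hash queries leave the collision term at 2^-129, which is why the resulting advantage is negligible for any polynomially bounded number of queries.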
B. Proof of Theorem 2
Proof. Our proof is based on the previous games, i.e. . In order to model the corruption of the client and the server, we use the following games :
(i) Game: As an extension to , this game uses the Corrupt query to simulate corruption of the IoT device with side-channel techniques. Here, the attacker can obtain . However, the one-wayness of the hash function prevents him from computing , which is essential to compute . Therefore, games are identical except for finding a collision on using queries. Therefore, using the birthday paradox, we have
(ii) Game: This game simulates the corruption of the server using the Corrupt query. Compared to the game , the advantage of the attacker is the secret data items of the server. Therefore, the secret value of can be computed by the attacker. However, he can only obtain the hash value of the nonces with this secret, i.e. , which is not useful for obtaining the secret keys. Using the birthday paradox, the advantage of the attacker is
Winning the game requires to guess the bit with maximum probability of :
The advantage of the attacker in violating the PFS of our proposed scheme is
Using the triangle inequality on equations (A.7)–(A.10), we have
where .
C. Proof of Theorem 3
Proof. In the following, we consider two scenarios: KCI resilience against the compromised client (games ) and the compromised server (games ):
Game: This game simulates a real attacker who tries to finish the KCI-fresh instance after compromising the party by . Thus, we have
(i) Game: This game models a passive attack by using the Execute query for previous sessions . can intercept transmitted messages on the public channel for previous sessions. All secret values are either encrypted or protected by the hash value. Thus, eavesdropping on messages in the public channel of previous sessions does not help the attacker finish the KCI-fresh instance . Therefore, has no additional advantage for winning this game. As a result,
(ii) Game: This game models an active attacker who can use Send and Hash queries. For , in order to impersonate the server, it needs to sign message with the secret signing key . The attacker can only derive the server’s secret signing key from its public signing key with probability . Thus,
Based on the games , the advantage of the attacker in violating the KCI resilience of our proposed scheme in the first scenario is
(iii) Game: This game corresponds to the corruption of the server in the second scenario. Here, the attacker obtains the secrets of the server . However, in order to impersonate the client, it fails to sign message unless it solves the DLP to obtain the client’s secret signing key. Therefore,
Based on the games , the advantage of the attacker in violating the KCI resilience of our scheme in the second scenario is
Also, for our postquantum secure scheme, the hard problem underlying the signature is finding a collision in the hash function, which succeeds with probability . Therefore, is replaced with :
Data Availability
The data supporting the findings of the current study are available from the corresponding author upon request. All source code is available at https://github.com/Crypto164/AKEwithBigdata/.
Disclosure
A preprint has previously been published [35].
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
This work was supported by the National Natural Science Foundation of China (62002074).