Abstract

The block cipher is an important means of providing data confidentiality in practice, and the S-box is an essential part of most modern block cipher designs. In 1973, Feistel used a key selected S-box mechanism in his early block cipher designs, whose idea is to let each S-box have two different states and use a key bit to select which of the two states is to be used in an encryption or decryption operation. However, this key selected S-box mechanism has received little attention in modern block cipher design since the DES block cipher was published in 1977. In this paper, we revisit Feistel’s key selected S-box mechanism, give a generalised version of it, compare it with existing close notions, and design the LBC example cipher to demonstrate that the generalised key selected S-box mechanism can be advantageous over the ordinary S-box mechanism in modern block cipher design, improving security and/or performance without increasing computational effort or storage space in some application environments.

1. Introduction

The block cipher is an important primitive in secret-key cryptography. A block cipher is an algorithm that transforms a fixed-length data block, called a plaintext block, into another data block of the same length, called a ciphertext block, under the control of a secret user key. One main purpose of a block cipher is to provide confidentiality for data transmitted in insecure communication environments. A block cipher typically involves two types of operations: one provides the confusion property, which aims to make the relationship between ciphertext and plaintext/key involved, and the other provides the diffusion property, which aims to dissipate the statistical structure of plaintext over ciphertext. The diffusion operation is usually a linear permutation operation, and the confusion operation is usually made up of a nonlinear substitution box (S-box for short). An S-box takes as input a certain number of data bits and transforms them into a certain number of output bits in a nonlinear way, and it is usually implemented as a lookup table. In modern block cipher designs such as DES [1] and AES [2], the S-box is usually an essential part and plays an important role in securing the ciphers.

In 1973, Feistel (the inventor of so-called Feistel ciphers) used a key selected S-box mechanism in his early block cipher designs [3, 4]. Feistel’s key selected S-box mechanism is to let each S-box have two different states and use a key bit to select which of the two states is to be used in an encryption or decryption operation. However, this key selected S-box mechanism has received little attention since DES was published in 1977 and has not been investigated in modern block cipher design, although there is an occasional application [5] in which the S-box of AES is replaced with such key selected S-boxes.

In this paper, we revisit Feistel’s key selected S-box mechanism. First, we generalise Feistel’s key selected S-box mechanism: the generalised mechanism stores several specific S-boxes (of the same dimension sizes) in a table and uses certain key (or subkey) bits to select which of the S-boxes is used in each S-box position of the S-box layer of a round of a cipher in an encryption or decryption operation. Then, we compare the generalised key selected S-box mechanism with existing close notions and find that it can offer extra security without increasing computational effort or storage space, by producing many key-dependent choices for the round function of a cipher; in particular, it resists well not only conventional cryptanalysis methods such as differential cryptanalysis [6] and linear cryptanalysis [7] but also more recent sophisticated variants such as multiple differential cryptanalysis [8], multiple linear cryptanalysis [9], and multidimensional linear cryptanalysis [10]. The extra security gain can allow us to reduce the number of rounds for the sake of performance, as long as the overhead caused by the key selected S-box mechanism in comparison with the ordinary S-box mechanism is negligible when compared with the gain resulting from the reduced number of rounds. Finally, we design the LBC cipher as an example to demonstrate that the generalised key selected S-box mechanism can be advantageous over the ordinary S-box mechanism in some application environments, where we define the combined difference distribution table and the combined bias distribution table for a generalised key selected S-box and describe frameworks to analyse the security of a block cipher with a generalised key selected S-box against differential and linear cryptanalysis. For this example cipher, the key selected S-box mechanism offers a software speedup of around 12% on a lightweight ARM NEON processor and a software speedup of around 16% on a general-purpose Intel i3 processor and offers a hardware speedup of around 22% in a parallel hardware implementation with one cycle per round, although it requires slightly more gate equivalents (GEs) than the ordinary S-box mechanism in this particular parallel case. Note, however, that, like most block cipher designs, we only consider the algorithmic security in the black-box model and do not consider the physical security of its implementations against, for example, side-channel attacks [11], which work in the gray-box model (which assumes an attacker with more power than in the black-box model) and usually need additional countermeasures. We note that an implementation of the (generalised) key selected S-box mechanism may be more vulnerable to side-channel attacks; however, the applicability of side-channel attacks depends entirely on the application environment, and no single cipher design can be optimal in all application environments.

The remainder of the paper is organised as follows. In Section 2, we give the abbreviations and notation used throughout this paper. In Section 3, we generalise Feistel’s key selected S-box mechanism and compare it with existing similar notions. We specify the LBC block cipher in Section 4, discuss its design rationale in Section 5, and evaluate the security and performance gain of the key selected S-box mechanism over the ordinary S-box mechanism under the LBC example cipher in Sections 6 and 7, respectively. Section 8 concludes this paper.

2. Abbreviations and Notations

In all descriptions we assume that the bits of an n-bit value are numbered from 0 to n − 1 from right to left, with the most significant bit being the (n − 1)th, a number without a prefix expresses a decimal number unless stated otherwise, and a number with prefix expresses a hexadecimal number. We use the following abbreviations and notations throughout this paper:
GE: gate equivalent
SIMD: single instruction multiple data
SPN: substitution-permutation network
: bitwise logical exclusive OR (XOR) operation
: left rotation (of a bit string) by bits
: string concatenation
: functional composition; when composing functions and , denotes the function obtained by first applying and then
: in binary (base 2) notation
: the bit length of a value
: bits of an -bit value ,

3. The Generalised Key Selected S-Box Mechanism

In this section, we generalise Feistel’s key selected S-box mechanism and compare it with existing close notions by discussing their similarities and differences.

3.1. Definition

Definition 1. A two-variable function (for specific values of , , and ) is called a key selected S-box if there are ordinary (that is, one-variable) -bit S-boxes with indexes from 0 to and, for each fixed -bit value , that is, some bits of key material (e.g., a round key), refers to the th -bit S-box.
We call the selection vector and write as for any fixed or simply write .
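Definition 1 amounts to a two-level table lookup: the selection vector picks one ordinary S-box, and the data input indexes into it. The following minimal Python sketch illustrates this with four 4-bit S-boxes and a 2-bit selection vector. The S-boxes are placeholders derived from the PRESENT S-box purely for illustration (they are not the LBC S-boxes of Table 1), and the names `SBOXES`, `key_selected_sbox`, `rotl4`, and `invert` are assumptions of this sketch.

```python
# Placeholder 4-bit S-boxes for illustration only; these are NOT the LBC
# S-boxes. The first is the PRESENT S-box; the other three are bijections
# derived from it so that the table holds four distinct permutations.
PRESENT_SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
                0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]


def rotl4(x, r=1):
    """Rotate a 4-bit value left by r bits."""
    return ((x << r) | (x >> (4 - r))) & 0xF


def invert(sbox):
    """Return the inverse permutation of a bijective S-box."""
    inv = [0] * len(sbox)
    for x, y in enumerate(sbox):
        inv[y] = x
    return inv


# Table of ordinary S-boxes, indexed by the selection vector (0..3).
SBOXES = [
    PRESENT_SBOX,
    invert(PRESENT_SBOX),
    [PRESENT_SBOX[rotl4(x)] for x in range(16)],
    [rotl4(PRESENT_SBOX[x]) for x in range(16)],
]


def key_selected_sbox(selection, x):
    """Key selected S-box: `selection` (2 key bits) chooses which ordinary
    S-box is applied to the 4-bit data input `x`."""
    return SBOXES[selection & 0x3][x & 0xF]
```

For a fixed selection value the function behaves exactly like one ordinary S-box, so the only cost on top of an ordinary lookup is the extra index.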

3.2. A Comparison with Key-Dependent S-Boxes

Key-dependent S-boxes [12, 13] are a class of S-boxes whose input bits include some key (or subkey) bits.

At a high level, the key selected S-box mechanism can be treated as a simplified version or a special case of the notion of the key-dependent S-box:
(i) The key selected S-box has two input parameters: one is the key parameter, and the other is what we usually refer to as the data parameter. The same holds for the key-dependent S-box.
(ii) Like the key-dependent S-box, if designed carefully, the key selected S-box may also result in better performance with a reduced number of rounds, by providing greater security than the ordinary S-box mechanism and making the resulting cipher particularly resistant to multiple differential cryptanalysis [8], multiple linear cryptanalysis [9], and multidimensional linear cryptanalysis [10]. Because different differential characteristics or linear approximations usually require different sets of key (or subkey) bits under the key selected S-box, an attacker needs to specify the corresponding selecting key bits when establishing a differential characteristic or linear approximation, which shrinks the remaining key space that can be guessed in the key recovery phase. By contrast, under the ordinary S-box mechanism, a differential characteristic or linear approximation generally works under a random key, an attacker does not need to specify the corresponding key bits when establishing it, and different differential characteristics or linear approximations can presumably work under the same key; all these facts leave the full key space to be guessed in the key recovery phase. As a consequence, under the key selected S-box mechanism, we do not need to additionally increase the number of rounds of a cipher to account for multiple differential cryptanalysis, multiple linear cryptanalysis, and multidimensional linear cryptanalysis, which may produce a performance gain.

However, the key selected S-box is slightly different from the key-dependent S-box:
(i) The current key-dependent S-box construction methods such as [12, 13] generally involve a number of interactions between the key parameter and the data parameter (at least 2, as in the key-dependent S-box built from an ordinary S-box in [14], with one XOR at the input of the ordinary S-box and one XOR at its output), which is costly. In the key selected S-box, by contrast, the key parameter serves simply as the index to the associated ordinary S-boxes, and the output is produced after only one simple interaction with the data parameter. In other words, the key selected S-box is usually much less computationally intensive than the key-dependent S-box.
(ii) In the current key-dependent S-boxes, the key parameter usually has the same role and the same dimension size as the data parameter for good randomness, and the key-dependent S-box can usually produce a relatively large number of instantiations over the key parameter space. By comparison, in the key selected S-box, the key parameter has a different role from the data parameter and usually has a smaller dimension size than the data parameter, as in LBC below.

3.3. A Comparison with DES (-like) S-Boxes

The notion of key selected S-box is similar to the notion of a DES (or DES-like) S-box [1]. A DES S-box is an ordinary (6 × 4 bit) S-box involving only one input parameter, the data parameter, rather than a key-dependent S-box involving the data and key parameters, but it uses two bits of the data parameter as the index to the four rows of the S-box table, each of which can be treated as an ordinary (4 × 4 bit) S-box. However, the key selected S-box differs from a DES S-box in that it has another input parameter, the key parameter, which behaves differently from the two bits of a DES S-box used as the index to the four rows, although both serve as an index. For example:
(i) When applying the differential cryptanalysis method [6] at an S-box level, for a DES S-box, we can generate its difference distribution table and then use it under the general assumption that data is distributed uniformly at random, but for a key selected S-box, although we can generate the difference distribution tables of the associated ordinary S-boxes, we have to guess the specific value of the key parameter in order to determine which difference distribution table should be used, since the key parameter is fixed once a user key is provided and thus is not distributed uniformly at random for the data produced with the user key.
(ii) When applying the differential cryptanalysis method at a cipher level, the differential behaviors of the rounds of a cipher using a DES-like S-box are simply iterations of the differential behavior of a round since data is distributed uniformly at random; however, for a cipher using a key selected S-box, although we can make a guess for the values of the key parameters of a few rounds, the guessed values will shrink the space of possible user keys, and eventually the space of possible user keys will become very small or empty after a number of rounds, which would make it pointless to cryptanalyse the cipher any further.
In short, like the key-dependent S-box, the key selected S-box makes it more difficult than an ordinary S-box for an attacker to apply differential cryptanalysis. The same holds for linear cryptanalysis [7], multiple differential cryptanalysis, multiple linear cryptanalysis, multidimensional linear cryptanalysis, etc.
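The dependence of the applicable difference distribution table on the key parameter can be made concrete with a small Python sketch. It assumes a two-state key selected S-box whose states are the PRESENT S-box and its inverse; these are placeholders chosen only for illustration, not S-boxes taken from this paper, and the names `STATES`, `ddt`, and `differential_probability` are hypothetical.

```python
# Placeholder two-state key selected S-box: state 0 is the PRESENT S-box,
# state 1 is its inverse. Neither is claimed to be an LBC S-box.
PRESENT_SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
                0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]


def invert(sbox):
    """Return the inverse permutation of a bijective S-box."""
    inv = [0] * len(sbox)
    for x, y in enumerate(sbox):
        inv[y] = x
    return inv


STATES = [PRESENT_SBOX, invert(PRESENT_SBOX)]


def ddt(sbox):
    """Difference distribution table: table[dx][dy] counts the inputs x
    with sbox[x] XOR sbox[x XOR dx] == dy."""
    size = len(sbox)
    table = [[0] * size for _ in range(size)]
    for dx in range(size):
        for x in range(size):
            table[dx][sbox[x] ^ sbox[x ^ dx]] += 1
    return table


def differential_probability(selection, dx, dy):
    """Probability of dx -> dy once the selecting key bit is fixed."""
    return ddt(STATES[selection])[dx][dy] / 16


if __name__ == "__main__":
    for key_bit in (0, 1):
        print(key_bit, differential_probability(key_bit, 0x1, 0x9))
```

With these placeholder states, the differential 0x1 → 0x9 holds with probability 4/16 when the selecting key bit is 0 and with probability 0 when it is 1, so an attacker building a characteristic through this S-box has to commit to a value of the key bit.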

3.4. A Comparison with Lucifer S-Box Mechanism

The DES precursor Lucifer [15] also uses a key bit to control which of its two S-boxes is to be used as follows. Suppose and are two four-bit S-boxes, and are four-bit nibbles, and some key bit is a so-called Interchange Control Bit (ICB). When ICB is equal to 0, then will go through and will go through ; when ICB is equal to 1, then will go through and will go through .

Lucifer S-box mechanism is different from the key selected S-box mechanism, which is best illustrated by the simple example with only two S-boxes in Figure 1. In the Lucifer S-box mechanism, the outputs of and are dependent. If goes through , then must go through , and vice versa. However, in the key selected S-box mechanism, whether will go through or is independent from whether will go through or , since the two selection key bits are independent. In this simple example, the Lucifer S-box mechanism can produce two possible output patterns: and , while the key selected S-box mechanism can produce four possible output patterns: , , , and . Other small distinctions include (1) the relative positions of and are variable in the output of the Lucifer S-box mechanism, while the relative positions of and are fixed in the output of the key selected S-box mechanism and (2) the relative positions of and are fixed in the output of the Lucifer S-box mechanism, while the relative positions of and are indeterminate in the output of the key selected S-box mechanism.
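The counting argument in this comparison can be checked with a few lines of Python. The sketch below assumes two S-boxes with indexes 0 and 1 and two nibble positions, and it represents an output pattern as the pair (index of the S-box applied to the first nibble, index of the S-box applied to the second nibble); the function names are hypothetical.

```python
from itertools import product


def lucifer_patterns():
    """One interchange control bit (ICB) drives both positions: ICB = 0
    gives (S-box 0, S-box 1), and ICB = 1 gives (S-box 1, S-box 0)."""
    return {(icb, 1 - icb) for icb in (0, 1)}


def key_selected_patterns():
    """Each position has its own independent selection key bit."""
    return set(product((0, 1), repeat=2))


if __name__ == "__main__":
    print(sorted(lucifer_patterns()))
    print(sorted(key_selected_patterns()))
```

The Lucifer patterns form a strict subset of the key selected patterns, which reflects the independence of the two selection key bits described above.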

3.5. A Comparison with Key-Dependent S-Box Layers

In 1994, when discussing how to strengthen the DES block cipher, Biham and Biryukov [14] mentioned the idea of using several sets of S-boxes (for the S-box layer of the DES round function) and using additional key bits to control which set is used (in an encryption/decryption operation), by writing that ‘One can compute several different sets of S-boxes according to the design principles of DES and use additional key bits to control which set is used.’ In 1999, Harris and Adams [16] mentioned a slightly different idea, which uses several S-boxes in a key-dependent order (also for the S-box layer of the round function of a cipher), by writing ‘Another possibility is to order the s-boxes in a key-dependent way.’ However, neither Biham and Biryukov nor Harris and Adams implemented their idea, and they mentioned that the security gain is small when the number of (sets of) S-boxes is small; specifically, Biham and Biryukov mentioned that the scheme is strengthened by a factor smaller than the number of sets of S-boxes, and Harris and Adams mentioned that it is not particularly useful with only four S-boxes (since there are only 4! = 24 possible orders, adding less than five bits of entropy to the key space). In other words, Biham and Biryukov’s and Harris and Adams’s mechanisms would require a large number of S-boxes in order to produce a large security gain in practice.

Compared with Biham and Biryukov’s and Harris and Adams’s mechanisms [14, 16], the (generalised) key selected S-box mechanism can produce a much larger security gain at the expense of relatively more overhead, given a small number of S-boxes. Below, we discuss other similarities and differences between the (generalised) key selected S-box mechanism and Biham and Biryukov’s and Harris and Adams’s mechanisms. These similarities and differences are best illustrated by the typical example in Figure 2, where are four S-boxes with the same size, is a user key, and are round keys for some positive integer :
(i) Storage space required: Biham and Biryukov’s mechanism [14] requires storing a number of sets of permuted S-boxes for the S-box layer of the round function, while Harris and Adams’s mechanism and the (generalised) key selected S-box mechanism require storing a number of S-boxes for the S-box layer of the round function. Thus, Biham and Biryukov’s mechanism generally requires a larger storage space than Harris and Adams’s mechanism and the key selected S-box mechanism.
(ii) The number of choices on the S-box layer of the round function of a cipher: Biham and Biryukov’s mechanism produces the same number of choices as the sets of permuted S-boxes stored, Harris and Adams’s mechanism produces all possible permutations of the S-boxes stored, while the key selected S-box mechanism produces all possible patterns of the S-boxes stored.
(iii) The number of total choices for all the S-box layers of a cipher: under Biham and Biryukov’s and Harris and Adams’s mechanisms, it is equal to the number of choices on a single S-box layer of the cipher, while under the key selected S-box mechanism, it can in theory be as large as the key space.
(iv) Given a user key, Biham and Biryukov’s and Harris and Adams’s mechanisms use the same S-box layer for all rounds of a cipher, while the key selected S-box mechanism likely uses different S-box layers for different rounds. This significantly increases the security gain, albeit at the expense of relatively more implementation overhead if used under the same number of rounds; nevertheless, the security gain can allow for a reduced number of rounds so that a better overall performance may be possible, depending on the specific cipher design.
(v) When implemented in parallel hardware with one cycle per round, the key selected S-box mechanism generally requires slightly more hardware area (or GEs) than its counterparts using Biham and Biryukov’s and Harris and Adams’s mechanisms. By contrast, when implemented in serial hardware, the key selected S-box mechanism may produce a more compact implementation, depending on the reduced number of rounds owing to the security gain.

In particular, coming back to the typical example in Figure 2, Biham and Biryukov mentioned that the security gain is small (i.e., 2 bits of entropy) when the number of sets of S-boxes is small, and Harris and Adams mentioned that their mechanism is not particularly useful with only four S-boxes, since there are only 4! = 24 possible orders, adding less than five bits of entropy to the key space. However, the key selected S-box mechanism can produce a much larger security gain even with as few as four S-boxes, and the specific security gain depends on the specific cipher design that the key selected S-box mechanism is applied to.
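The entropy figures quoted above can be reproduced with a short calculation. The Python sketch below assumes four stored sets for Biham and Biryukov's mechanism (matching the 2 bits quoted), four S-boxes for Harris and Adams's mechanism, and, for the key selected S-box mechanism, four S-boxes with an independent 2-bit selection at each S-box position of a single layer; the position counts 4 and 8 are assumptions made for illustration, the latter matching the eight parallel S-boxes of the LBC round function.

```python
import math


def entropy_bits(choices):
    """Number of key bits needed to index `choices` equally likely options."""
    return math.log2(choices)


# Biham and Biryukov: the key picks one of four stored sets of S-boxes.
biham_biryukov = entropy_bits(4)

# Harris and Adams: the key picks one of the 4! orders of four S-boxes.
harris_adams = entropy_bits(math.factorial(4))


def key_selected_layer(num_sboxes, positions):
    """Key selected mechanism: every position independently picks one of
    `num_sboxes` S-boxes, giving num_sboxes ** positions layer patterns."""
    return entropy_bits(num_sboxes ** positions)


if __name__ == "__main__":
    print(biham_biryukov)                 # 2 bits
    print(round(harris_adams, 3))         # below 5 bits
    print(key_selected_layer(4, 4))       # one layer with 4 positions
    print(key_selected_layer(4, 8))       # one layer with 8 positions
```

These numbers concern a single S-box layer only; since the key selected S-box mechanism may use a different pattern in every round, the total over all rounds can be larger still, bounded by the key length.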

3.6. Summary

In summary, the key selected S-box is similar to, but differs in several respects from, existing close notions; it is simple to construct a key selected S-box from ordinary S-boxes, and it produces a greater security improvement. A modern block cipher can gain extra security by using the (generalised) key selected S-box mechanism and can gain better performance by reducing the number of rounds according to the extra security gain, as long as the overhead caused by the key selected S-box mechanism in comparison with the ordinary S-box mechanism is negligible when compared with the gain resulting from the reduced number of rounds.

4. The LBC Block Cipher

In this section, we specify the LBC block cipher, which employs a Feistel structure with a 64-bit block size, a variable-length user key of 96 to 128 bits, and a total of 25 rounds and takes advantage of the key selected S-box mechanism to achieve good security and performance. LBC uses two elementary operations and involves three subalgorithms, namely, a key schedule algorithm, an encryption algorithm, and a decryption algorithm.

Below, we first describe the two elementary operations used in LBC, then the round function, the key schedule algorithm, the encryption algorithm, the decryption algorithm, and finally several test vectors of LBC.

4.1. Elementary Operations

LBC mainly uses two elementary operations: a confusion operation and a diffusion operation , which are defined as follows:
(i) is a nonlinear substitution operation, constructed by applying a key selected S-box eight times in parallel to the inputs. The four general -bit S-boxes involved in the key selected S-box are , , , and , which we chose according to the most recent work on 4-bit optimal S-boxes owing to Zhang et al. [17], whose specifications are given in Table 1. If is a 32-bit block represented as four bytes which are further arranged as a -bit array: and is a 16-bit block; then, is defined to equal a 32-bit value represented as four bytes that are further arranged as a -bit array: where , for .
(ii) is a linear transformation. If is a 32-bit block represented as four bytes; then, is defined as

Note that there are several equivalent descriptions for the operation.
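The nonlinear substitution operation can be sketched in Python as eight parallel lookups, each steered by two bits of the 16-bit selection block. The formulas that fix the nibble and selection-bit ordering are not reproduced in the text above, so the ordering used here (nibble i of the data, counted from the least significant end, uses selection bits 2i + 1 and 2i) is an assumption of this sketch, and the four S-boxes are placeholders derived from the PRESENT S-box rather than those of Table 1. The names `SBOXES` and `sbox_layer` are hypothetical.

```python
# Placeholder S-boxes; NOT the LBC S-boxes of Table 1.
PRESENT_SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
                0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]


def rotl4(x, r=1):
    """Rotate a 4-bit value left by r bits."""
    return ((x << r) | (x >> (4 - r))) & 0xF


def invert(sbox):
    """Return the inverse permutation of a bijective S-box."""
    inv = [0] * len(sbox)
    for x, y in enumerate(sbox):
        inv[y] = x
    return inv


SBOXES = [
    PRESENT_SBOX,
    invert(PRESENT_SBOX),
    [PRESENT_SBOX[rotl4(x)] for x in range(16)],
    [rotl4(PRESENT_SBOX[x]) for x in range(16)],
]


def sbox_layer(x, selection):
    """Apply a key selected S-box to each of the eight nibbles of the
    32-bit block `x`; nibble i is steered by two bits of the 16-bit
    `selection` block (assumed ordering, see the lead-in)."""
    out = 0
    for i in range(8):
        nibble = (x >> (4 * i)) & 0xF
        sel = (selection >> (2 * i)) & 0x3
        out |= SBOXES[sel][nibble] << (4 * i)
    return out
```

Because every placeholder S-box is a permutation, the layer is a bijection on 32-bit blocks for each fixed selection block, although a Feistel round function does not strictly need this property.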

4.2. Round Function

The round function of LBC is built mainly on the nonlinear substitution operation and the linear operation, which takes two 32-bit blocks as inputs and outputs a 32-bit block.

If and are 32-bit blocks, then the round function of LBC is defined as follows:

4.3. Key Schedule Algorithm

The key schedule algorithm of LBC takes a -bit user key as input and outputs the required twenty-five 32-bit round subkeys, where can be a variable between 96 and 128 bits and typically . The key schedule algorithm is as follows:
(1) A -bit user key is stored in a key register ; .
(2) Output the leftmost 32 bits of the current content of the key register as the first round subkey .
(3) For to 24,
(a) Rotate the key register to the left by 29 bits, that is, .
(b) Update the leftmost 32 bits of the key register as follows: where and represent, respectively, the binary representations of and with the left side being extended by concatenating as many zeros as required to reach the required bit length.
(c) Output the leftmost 32 bits of the current content of the key register as the th round subkey .

Note that the key schedule uses the key selected S-box in a nonstandard way: the selection vector for the key selected S-box is not key material but rather the key length.
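The register handling of the key schedule (store the key, rotate left by 29 bits, output the leftmost 32 bits) can be sketched in Python as follows. The update of the leftmost 32 bits in step (b) is not reproduced in the text above, so it appears here only as a caller-supplied placeholder `update` that defaults to doing nothing; with that default this sketch is not the LBC key schedule. All function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def rotl(value, shift, width):
    """Rotate a `width`-bit value left by `shift` bits."""
    shift %= width
    mask = (1 << width) - 1
    return ((value << shift) | (value >> (width - shift))) & mask


def leftmost32(register, width):
    """Return the 32 most significant bits of a `width`-bit register."""
    return (register >> (width - 32)) & MASK32


def round_subkeys(user_key, key_bits, update=lambda reg, i, bits: reg):
    """Skeleton of the 25-subkey schedule for a `key_bits`-bit user key
    (the specification uses 96 to 128 bits).

    `update(register, round_index, key_bits)` stands in for step (b); the
    identity default is a placeholder, NOT the update used by LBC.
    """
    register = user_key & ((1 << key_bits) - 1)
    subkeys = [leftmost32(register, key_bits)]      # step (2)
    for i in range(1, 25):                          # step (3)
        register = rotl(register, 29, key_bits)     # step (a)
        register = update(register, i, key_bits)    # step (b), placeholder
        subkeys.append(leftmost32(register, key_bits))  # step (c)
    return subkeys
```

Passing the key length to `update` mirrors the remark that the selection vector in the key schedule is derived from the key length rather than from key material.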

4.4. Encryption Algorithm

The encryption algorithm of LBC transforms a 64-bit data block, called a plaintext (block), into a pseudorandom data block of the same length, called a ciphertext (block), under the control of a secret user key.

The encryption algorithm takes as input a 64-bit plaintext block and has a total of 25 rounds. The encryption procedure is as follows, where and are 32-bit variable ( () is a round subkey generated from a user key by the key schedule algorithm of LBC).(1)Let .(2)For to 25,(i).(ii).(3)Ciphertext .

Figure 3 illustrates an encryption round of LBC.

4.5. Decryption Algorithm

The decryption algorithm of LBC is the inverse of the encryption algorithm, and it decrypts a ciphertext to obtain the original plaintext, under the control of the same user key as in the encryption process. It takes a 64-bit ciphertext block as input and works as follows:(1)Let .(2)For to 1,(i).(ii).(3)Plaintext .
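The relationship between the encryption and decryption procedures can be illustrated with a textbook balanced Feistel network in Python. The exact step formulas of LBC are not reproduced in the text above, so the half ordering, the absence of a final swap, and above all the round function `toy_round_function` are assumptions made for illustration; this is a generic sketch, not an implementation of LBC.

```python
MASK32 = 0xFFFFFFFF


def toy_round_function(x, round_key):
    """Placeholder round function on 32-bit words; NOT the LBC round
    function. A Feistel round function need not be invertible."""
    x ^= round_key
    return ((x << 7) | (x >> 25)) & MASK32


def feistel_encrypt(block, subkeys, f=toy_round_function):
    """Encrypt a 64-bit block with one Feistel round per subkey."""
    left, right = (block >> 32) & MASK32, block & MASK32
    for rk in subkeys:
        left, right = right, left ^ f(right, rk)
    return (left << 32) | right


def feistel_decrypt(block, subkeys, f=toy_round_function):
    """Undo feistel_encrypt by applying the rounds in reverse order."""
    left, right = (block >> 32) & MASK32, block & MASK32
    for rk in reversed(subkeys):
        left, right = right ^ f(left, rk), left
    return (left << 32) | right


if __name__ == "__main__":
    # 25 arbitrary illustrative subkeys, not produced by any key schedule.
    subkeys = [(0x9E3779B9 * (i + 1)) & MASK32 for i in range(25)]
    plaintext = 0x0123456789ABCDEF
    ciphertext = feistel_encrypt(plaintext, subkeys)
    assert feistel_decrypt(ciphertext, subkeys) == plaintext
```

Decryption reuses the same round function with the subkeys in reverse order, which is why a Feistel cipher needs no inverse S-box layer.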

4.6. Test Vectors

Three test vectors of LBC with a 96 bit key are as follows: and .

5. Design Rationale of LBC

In this section, we give our design rationale for the structure, parameters, and components of LBC. At a high level, we feature the following distinctions when designing the LBC example cipher: (1) the novel notion of the key selected S-box is used to achieve a good performance and a sufficient security; (2) the Feistel structure is combined with simple substitution and permutation operations to achieve an efficient hardware implementation with a moderate amount of GEs and an efficient software implementation; (3) the same key schedule algorithm as well as the same encryption and decryption algorithms for different key length versions of a variable length user key is used to provide user friendliness and efficient resource utilization; and (4) the strong key schedule ensures partially that data authenticity is robustly provided when LBC is sometimes used to build or abused as a hash function in some applications.

5.1. Structure

There are mainly two types of structures for iterated block ciphers, one is the Feistel structure and the other is the Substitution-Permutation Network (SPN) structure.

LBC has a Feistel structure. Compared with an SPN structure, a Feistel structure with the same block size has the following merits: (1) there is more flexibility in designing its round function, for example, the linear or S-box operation does not need to be invertible; (2) the round function is generally lighter, partially due to the fact that the round function operates on a smaller number of bits; and (3) implementing the circuit for both encryption and decryption does not cost much more than implementing the circuit for encryption only, as decryption is (almost) identical to encryption. By contrast, for an SPN structure we need to implement the round function as well as its inverse for both encryption and decryption. Admittedly, a Feistel structure may need a larger number of rounds to be secure, but this is not always the case; for example, the Feistel block cipher LBlock [18] has a number of rounds comparable to that of the SPN block cipher PRESENT [19] and has resisted extensive cryptanalysis. Moreover, LBC uses the novel notion of the key selected S-box as well as a good diffusion operation to achieve additional security protection.

5.2. Block Size

In practice, a general-purpose block cipher typically has a block size of 64 or 128 bits. LBC uses a block size of 64 bits, in order to meet the requirements of moderate application environments on memory, space, and performance. Although a 64-bit block size may be too short in some applications because of the birthday bound, it is still acceptable in many applications when used with appropriate block cipher modes of operation.

5.3. Key Length

In 2001, Lenstra and Verheul [20] estimated that, for a symmetric cipher, an 80-bit key size can provide a security margin until (around) 2012, a 96-bit key size can provide a security margin until 2034, and a 128-bit key size can provide a security margin until 2076. In 2010, NIST recommended against using an 80-bit key, and in 2012 it disallowed 80-bit keys. In 2012, the European ECRYPT II project remarked for a symmetric cipher that an 80-bit key size provides a security level of “Very short-term protection against agencies” and “ 4 years protection,” a 96-bit key size provides a security level of “Legacy standard level” and “ 10 years protection,” and a 128-bit key size provides a security level of “Long-term protection” and “ 30 years (protection)”. As Dinur [21] noted in 2015, the Bitcoin network [22] demonstrated that a computation of (cipher encryption operations) is (marginally) practical. In short, an 80-bit key is now considered to be too short to be secure in practice.

When designing the LBC cipher, we use a minimum key length of 96 bits for short-term protection and a maximum key length of 128 bits for long-term protection. To be flexible and user friendly, LBC accepts a variable-length user key, and thus users can choose a key length, as long as it is between 96 and 128 bits (a key shorter than 96 bits may be used, which we do not recommend); for example, a 112-bit user key may be used for medium-term protection. A variable key length gives the user more flexibility to choose an appropriate key length according to the expected lifetime of the security application concerned, so as not to waste computing and hardware resources.

5.4. S-Box Layer

In practice, a general-purpose block cipher typically uses an 8 × 8-bit S-box, and a lightweight block cipher typically uses a 4 × 4-bit S-box, in order to meet the requirements of lightweight application environments on memory and space, since a 4 × 4-bit S-box is generally more compact in hardware than an 8 × 8-bit S-box.

The PRESENT block cipher uses a -bit S-box based on Leander and Poschmann’s work [23]. However, the S-box has a weak security property in the sense of linear cryptanalysis, that is, there are a number of combinations of one-bit input mask and one-bit output mask [24]. In 2015, Zhang et al. [17] studied -bit optimal S-boxes with more security criteria and presented three classes of -bit optimal S-boxes. The number of valid combinations of one-bit input difference and one-bit output difference is , and the number of valid combinations of one-bit input mask and one-bit output mask is , where .

LBC uses a key selected S-box that is based on four ordinary -bit S-boxes , , , and eight times to build the S-box layer in its round function and uses the same S-box layer in the 25 rounds.

From Zhang et al.’s -Num1-DL category of -bit optimal S-boxes, we further chose each -bit S-box () by the following additional security criterion:(i)The two valid combinations of one-bit input difference and one-bit output difference do not use the same input/output difference; the two valid combinations of (one-bit input mask and one-bit output mask) do not use the same input/output mask. Here, one-bit input difference/mask means that the binary representation of the input difference/mask has one and only one bit position with a one, that is, it has zeros everywhere except for one bit position. The same statement applies subsequently throughout the rest of this paper, although we do not explicitly make it further.

That is, we use the following security criteria in total:
(1) The S-box is bijective, that is, if .
(2) The S-box has no fixed point, that is, : .
(3) and :
(4) and :
(5) The number of valid combinations of one-bit input difference and one-bit output difference is 2, and the number of valid combinations of one-bit input mask and one-bit output mask is 2, too.
(6) Either of the valid combinations of one-bit input difference and one-bit output difference has the smallest (valid) probability, that is, for a -bit S-box.
(7) Either of the valid combinations of one-bit input mask and one-bit output mask has the smallest (valid) bias, that is, for a -bit S-box.
(8) The two valid combinations of one-bit input difference and one-bit output difference do not use the same input/output difference.
(9) The two valid combinations of one-bit input mask and one-bit output mask do not use the same input/output mask.
The four ordinary -bit S-boxes , , , and together meet the following security criteria:
(10) Ideally, any two -bit S-boxes do not involve a common valid combination of one-bit input difference and one-bit output difference or one-bit input mask and one-bit output mask.
(11) Ideally, any two -bit S-boxes do not concurrently have the largest (valid) probability (i.e., ) under any (input difference and output difference) pair and do not concurrently have the largest (valid) bias (i.e., ) under any (input mask and output mask) pair.
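Several of these criteria can be checked mechanically from the difference distribution table and the linear approximation (bias) table of a candidate S-box. The Python sketch below shows one way to do so; it is run here on the PRESENT S-box purely as a stand-in, since the LBC S-boxes of Table 1 are not reproduced in the text above, and all function names are hypothetical.

```python
PRESENT_SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
                0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]
ONE_BIT = (1, 2, 4, 8)  # 4-bit values with exactly one bit set


def is_bijective(sbox):
    return sorted(sbox) == list(range(len(sbox)))


def has_fixed_point(sbox):
    return any(y == x for x, y in enumerate(sbox))


def ddt(sbox):
    """table[dx][dy] = number of x with sbox[x] ^ sbox[x ^ dx] == dy."""
    n = len(sbox)
    table = [[0] * n for _ in range(n)]
    for dx in range(n):
        for x in range(n):
            table[dx][sbox[x] ^ sbox[x ^ dx]] += 1
    return table


def parity(v):
    return bin(v).count("1") & 1


def lat(sbox):
    """table[a][b] = #{x : a.x == b.sbox[x]} - n/2, so that the bias of
    the approximation (a, b) equals table[a][b] / n."""
    n = len(sbox)
    return [[sum(parity(a & x) == parity(b & sbox[x]) for x in range(n)) - n // 2
             for b in range(n)] for a in range(n)]


def differential_uniformity(sbox):
    return max(max(row) for row in ddt(sbox)[1:])


def linearity(sbox):
    table = lat(sbox)
    n = len(sbox)
    return max(abs(table[a][b]) for a in range(1, n) for b in range(1, n))


def one_bit_differentials(sbox):
    """Valid (one-bit input difference, one-bit output difference) pairs."""
    table = ddt(sbox)
    return [(dx, dy) for dx in ONE_BIT for dy in ONE_BIT if table[dx][dy]]


def one_bit_approximations(sbox):
    """Valid (one-bit input mask, one-bit output mask) pairs."""
    table = lat(sbox)
    return [(a, b) for a in ONE_BIT for b in ONE_BIT if table[a][b]]


if __name__ == "__main__":
    s = PRESENT_SBOX
    print(is_bijective(s), has_fixed_point(s))
    print(differential_uniformity(s), linearity(s))
    print(one_bit_differentials(s), one_bit_approximations(s))
```

Criteria (10) and (11) compare S-boxes pairwise, so they can be checked by intersecting the lists returned by `one_bit_differentials` and `one_bit_approximations` for each pair of candidate S-boxes.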

5.5. Diffusion Layer

The diffusion layer has a branch number [25] of 4, which, together with the S-box layer, provides a sufficiently large avalanche effect to make LBC secure against currently known cryptanalysis techniques such as differential and linear cryptanalysis. It performs only simple operations (namely, rotation and XOR), so it is very lightweight in hardware and is also suitable for software implementation.

5.6. Key Schedule

To achieve a good performance, many lightweight or moderate block ciphers use a simple key schedule, for example, HIGHT [26] and PRESENT; in particular, the style of the key schedule of PRESENT was followed by many subsequent lightweight block ciphers such as LBlock [18] and RECTANGLE [17]. However, the full-round HIGHT was shown in 2011 to suffer from a related-key [27, 28] attack [29], and the full-round PRESENT was shown in 2015 to suffer from a known-key distinguisher [30, 31], mainly due to their simple key schedules. In reality, a block cipher may be used to build or abused sometimes as a hash function for data authenticity to save hardware space, where the unknown key parameter under the block cipher corresponds to the known message parameter under the hash function. Thus, HIGHT and PRESENT are not suitable for this case because of the related-key and known-key cryptanalysis results, and the known-key distinguisher on the full PRESENT puts a security concern on PRESENT-based hash functions.

We aim to design a strong key schedule for LBC so that LBC can resist key schedule attacks and can be used to build (or be abused as) a hash function to provide data authenticity in some devices, considering that confidentiality without authenticity is usually not sufficient for a real-life application (note that a 64-bit digest size may be short for some applications, since the birthday bound is 2^32; however, it is practically acceptable for many real-life applications). The key schedule of LBC is based on the round function, so as to have a good nonlinearity and save some hardware area; together with the encryption or decryption procedure of LBC, it makes LBC secure against related-key cryptanalysis [27, 28, 32] as well as slide attacks [33, 34]. Once the key length parameter is fixed by the user, the ordinary S-box used for the key selected S-box in the key schedule can be easily determined (in other words, the key selected S-box becomes an ordinary S-box), and the order of the ordinary S-boxes in the S-box layer can also be easily determined, which results in a determinate S-box layer and thus a simple hardware implementation.

The key schedule of LBC is user friendly in several respects. First, a variable-length user key gives the user more flexibility in choosing an appropriate key length according to the expected lifetime of the target application, so as not to waste computing and hardware resources. Second, the key schedule uses the same algorithm for all key lengths, whereas most existing block ciphers use different key schedule algorithms for different key lengths (if several are supported); for example, this makes it easy to produce hardware implementations for different key length versions. Third, as computing power increases over time, a deployed key length may become insufficient and need to be upgraded. A variable-length key lets the LBC user upgrade to exactly the required key length, and thus use hardware resources efficiently, instead of having to upgrade to a prespecified key length that is much larger than required. For example, PRESENT, published in 2007, accepts only 80- and 128-bit user keys. Since an 80-bit key is now considered too short, as mentioned in Section 5.3, a PRESENT deployment with an 80-bit key should be upgraded now, although not much time has passed since its publication; however, a 128-bit key may be longer than needed for many lightweight security applications and thus wasteful.

The idea of using a variable length key for LBC is motivated by the general block ciphers, Serpent [35] and SHACAL-2 [36], but LBC processes a variable length key in a manner different from that used by Serpent or SHACAL-2: the latter requires extending a shorter user key to the maximum key length by concatenating as many zeros as required or a one followed by as many zeros as required (and thus does not distinguish different key length versions much), while LBC does not require extending a shorter user key to the maximum key length and it distinguishes different key length versions by involving the key length parameter in the key schedule, to avoid potential key-schedule attacks.
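The two padding styles mentioned above can be illustrated on bit strings. This is a simplified sketch under stated assumptions: keys are modelled as Python strings of '0'/'1' characters, the maximum length of 8 bits is chosen only to keep the example readable, and the function names are hypothetical; it does not reproduce the exact bit ordering of Serpent or SHACAL-2.

```python
def pad_zeros(key_bits: str, max_len: int) -> str:
    """Extend a short key to max_len bits by appending zeros."""
    if len(key_bits) > max_len:
        raise ValueError("key longer than maximum length")
    return key_bits + "0" * (max_len - len(key_bits))

def pad_one_then_zeros(key_bits: str, max_len: int) -> str:
    """Extend a short key by appending a one followed by zeros."""
    if len(key_bits) > max_len:
        raise ValueError("key longer than maximum length")
    if len(key_bits) == max_len:
        return key_bits
    return key_bits + "1" + "0" * (max_len - len(key_bits) - 1)

short_key = "1011"    # hypothetical 4-bit key
longer_key = "10110"  # hypothetical 5-bit key

# Zero padding maps both keys to the same extended key.
print(pad_zeros(short_key, 8), pad_zeros(longer_key, 8))
# One-then-zeros padding separates them, but each padded key still
# coincides with some genuine full-length key.
print(pad_one_then_zeros(short_key, 8), pad_one_then_zeros(longer_key, 8))
```

In both styles the extended key is indistinguishable from a full-length key, which is the sense in which padding does not separate key length versions; feeding the key length into the key schedule, as LBC does, avoids this.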

6. Security Gain Evaluation

In this section, we briefly give our evaluation results on the security of LBC against a list of advanced cryptanalysis techniques (under the worst-case assumption) and finally obtain the security gain of LBC over the LBC version with the ordinary S-box mechanism. Conservative frameworks are developed for analysing the security of the key selected S-box mechanism against differential and linear cryptanalysis. Recall that, like most block cipher designs, we only consider the black-box security of the algorithm and do not consider its gray-box security against threats such as side-channel attacks [11], which usually assume a more powerful attacker and need additional countermeasures. Note first that LBC uses a user key of at least 96 bits and can withstand elementary cryptanalysis methods. We start with two properties of LBC.

6.1. Properties of LBC

A simple analysis of the operation reveals the following property.

Property 1. For the operation, if the input and the output are each represented as eight 4-bit nibbles corresponding to the eight S-boxes, that is, and , then
(1) The eight 4-bit nibbles of the output can be expressed with the eight 4-bit nibbles of the input as follows:
(2) The eight 4-bit nibbles of the input can be expressed with the eight 4-bit nibbles of the output as follows:

A more detailed investigation reveals the following property.

Property 2. The propagation of a single bit:
(i) A single bit will get at least 62 subkey bits involved after 3 rounds, depending on the bit position. Detailed numbers of involved subkey bits are given in Table 2, in comparison with the numbers of involved subkey bits under the ordinary mechanism.
(ii) A single bit will get at least 92 subkey bits and about 60 output bits involved after 4 rounds, depending on the bit position. Detailed numbers of involved subkey bits are given in Table 2, in comparison with the numbers of involved subkey bits under the ordinary mechanism.
(iii) A single bit will get all 96 subkey bits and all 64 output bits involved after 5 rounds.

The numbers of disjoint subkey bits involved in the propagation of a single nibble position through 3 and 4 rounds are summarised in Table 2 and are briefly illustrated in Figures 4–11.

6.2. Differential Cryptanalysis

As mentioned in Section 3.3, a key selected S-box makes it difficult for an attacker to apply differential cryptanalysis. Nevertheless, we develop a conservative framework for the differential cryptanalysis of block ciphers using a key selected S-box. We start the framework by introducing the concept of the combined difference distribution (CDD) table for a key selected S-box as follows.

Definition 2. The combined difference distribution (CDD) table for a key selected S-box: (for specific values of , , and ) is a table with rows being the possible input differences, columns being the output differences, and the th entry being the set of the possible combinations (the number of -bit inputs satisfying the input difference and output difference pair under an ordinary -bit S-box and the number of ordinary -bit S-boxes that have the number of -bit inputs satisfying the input difference and output difference pair ), where and .
As an example, we compute the CDD table for the key selected S-box used in LBC, which is given as Table 3. Each entry except has at most three combinations, which follow the difference distribution tables of the four ordinary S-boxes (see Table 4). Note that one may enhance the combined difference distribution table by associating every combination with its corresponding probability/probabilities.
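The construction in Definition 2 can be sketched in a few lines. The sketch below makes several assumptions that are not from the paper: the four S-boxes are stand-ins obtained by rotating the output bits of the PRESENT S-box (the LBC S-boxes are not reproduced here), zero counts are included in each entry, and the names `rotl4`, `difference_table`, and `combined_table` are hypothetical.

```python
from collections import Counter

# Stand-in base S-box (PRESENT), NOT an LBC S-box.
BASE = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]

def rotl4(v, r):
    """Rotate a 4-bit value left by r positions."""
    r %= 4
    return ((v << r) | (v >> (4 - r))) & 0xF

# Four hypothetical ordinary S-boxes selected by two key bits.
SBOXES = [[rotl4(y, r) for y in BASE] for r in range(4)]

def difference_table(sbox):
    """ddt[din][dout] = number of inputs x with S(x) ^ S(x ^ din) == dout."""
    n = len(sbox)
    ddt = [[0] * n for _ in range(n)]
    for din in range(n):
        for x in range(n):
            ddt[din][sbox[x] ^ sbox[x ^ din]] += 1
    return ddt

def combined_table(tables):
    """Entry (a, b) is the set of (count, number of S-boxes having that count)."""
    n = len(tables[0])
    combined = [[set() for _ in range(n)] for _ in range(n)]
    for a in range(n):
        for b in range(n):
            tally = Counter(t[a][b] for t in tables)
            combined[a][b] = set(tally.items())
    return combined

cdd = combined_table([difference_table(s) for s in SBOXES])
print(cdd[0][0])  # trivial entry: all four S-boxes map difference 0 to 0
print(cdd[1][3])  # a non-trivial entry of the combined table
```

An (input difference, output difference) pair is possible under the key selected S-box exactly when its entry contains some combination with a nonzero count, which is why an analysis over the combined table is conservative.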
Now, by treating the eight key selected S-boxes in the S-box layer of LBC as eight identical ordinary S-boxes whose difference distribution table is the CDD table, we can check the minimum number of active S-boxes for a differential characteristic of a certain number of rounds, in a manner similar to what Matsui did for DES (under the general assumption for differential cryptanalysis) in [37]. As each ordinary S-box has a maximum (valid) probability of , we can obtain an upper bound on the probability of a differential characteristic of a certain number of rounds and thus assess its security against differential cryptanalysis. Clearly, the upper bound is overestimated, since it is based on the CDD table and each of the four ordinary difference distribution tables is only a subset of the CDD table. In this way, we can bound the security against differential cryptanalysis in the worst case from the point of view of the user of the cipher.
We made a computer program to compute the minimum numbers of active S-boxes of -round differential characteristics under the CDD table (), and the results are given in Table 5.
From Table 5, we see that the number of active S-boxes is larger than 32 for 18 or more rounds. In particular, for an 18-round differential characteristic, the number of active S-boxes is at least 33. Since 33 active S-boxes require a total of 66 selecting key bits, there are only 62 key bits left for the key recovery phase. Table 2 shows that a single nibble/bit will get at least 62 subkey bits involved after propagating through 3 rounds. A total of five rounds appended at both ends of an 18-round differential characteristic implies at least 3 rounds at one end, which would require an attacker to guess all the remaining 62 key bits in the key recovery phase. As a result, we can assume at most an 18-round differential characteristic with at most a total of five rounds appended at both ends. Recall that multiple differential cryptanalysis does not work well in the key selected S-box mechanism, because a different differential characteristic requires a different set of selecting key bits, which would further shrink the space of the key bits that can be guessed in the key recovery phase. Therefore, 25-round LBC should be secure against differential cryptanalysis.

6.3. Linear Cryptanalysis

To analyse the security of LBC against linear cryptanalysis, we first have the following property of the operation.

Property 3. For the operation, if the input mask and the output mask are represented each as eight 4-bit nibbles corresponding to the eight S-boxes, that is, and , then the following is obtained.(1)The eight 4-bit nibbles of the output mask can be expressed with the eight 4-bit nibbles of the input mask as follows:(2)The eight 4-bit nibbles of the input mask can be expressed with the eight 4-bit nibbles of the output mask as follows:

Proof. By Property 1 (1), we haveBy Property 1 (2), we haveThus, the results follow trivially.
As mentioned earlier, a key selected S-box also makes it difficult for an attacker to apply linear cryptanalysis. Here, we similarly develop a conservative framework for the linear cryptanalysis of block ciphers using a key selected S-box, which is based on the concept of the combined bias distribution (CBD) table for a key selected S-box as follows.

Definition 3. The combined bias distribution (CBD) table for a key selected S-box: (for specific values of , , and ) is a table with rows being the possible input masks, columns being the output masks, and the th entry being the set of the possible combinations (the number of -bit inputs satisfying the input mask and output mask pair under an ordinary -bit S-box and the number of ordinary -bit S-boxes that have the number of -bit inputs satisfying the input mask and output mask pair ), where and .
Likewise, we can compute the CBD table for the key selected S-box used in LBC, which is given as Table 6. Each entry except has at most five combinations, namely, , which follow the bias distribution tables of the four ordinary S-boxes (see Table 7). Note that one may enhance the combined bias distribution table by associating every combination with its corresponding bias/biases.
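The CBD table of Definition 3 can be sketched along the same lines as a combined table of per-S-box match counts. As before, this is only an illustration under assumptions not taken from the paper: the four S-boxes are output-bit rotations of the PRESENT S-box, zero-bias counts are kept in each entry, and all function names are hypothetical.

```python
from collections import Counter

# Stand-in base S-box (PRESENT), NOT an LBC S-box.
BASE = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]

def rotl4(v, r):
    """Rotate a 4-bit value left by r positions."""
    r %= 4
    return ((v << r) | (v >> (4 - r))) & 0xF

SBOXES = [[rotl4(y, r) for y in BASE] for r in range(4)]

def parity(v):
    """XOR of the bits of v."""
    return bin(v).count("1") & 1

def approximation_table(sbox):
    """lat[a][b] = number of inputs x with a.x == b.S(x) (mask inner products)."""
    n = len(sbox)
    lat = [[0] * n for _ in range(n)]
    for a in range(n):
        for b in range(n):
            lat[a][b] = sum(parity(a & x) == parity(b & sbox[x]) for x in range(n))
    return lat

def combined_table(tables):
    """Entry (a, b) is the set of (count, number of S-boxes having that count)."""
    n = len(tables[0])
    return [[set(Counter(t[a][b] for t in tables).items()) for b in range(n)]
            for a in range(n)]

lats = [approximation_table(s) for s in SBOXES]
cbd = combined_table(lats)
# A count c corresponds to a bias of (c - 8) / 16 for a 4-bit S-box.
print(cbd[1][1])
```

A count of 8 means zero bias, so a mask pair is usable under the key selected S-box only if its entry contains a combination whose count differs from 8.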
Now, by treating the eight key selected S-boxes in the S-box layer of LBC as eight identical ordinary S-boxes whose bias distribution table is the CBD table, we can check the minimum number of active S-boxes for a linear approximation of a certain number of rounds, in a manner similar to what Matsui did for DES (under the general assumption for linear cryptanalysis) in [37]. As each ordinary S-box has a maximum (valid) bias of , we can obtain an upper bound on the bias of a linear approximation of a certain number of rounds and thus assess its security against linear cryptanalysis. Clearly, the upper bound is overestimated, since it is based on the CBD table and each of the four ordinary bias distribution tables is only a subset of the CBD table. In this way, we can bound the security against linear cryptanalysis in the worst case from the point of view of the user of the cipher.
We made a computer program to compute the minimum numbers of active S-boxes of -round linear approximations under the CBD table (), and the results for 1, 2, 3, 4, and 5 rounds are 0, 1, 2, 5, and 8, respectively (the computation is rather time consuming for 6 or more rounds). Thus, a 20-round linear approximation has a minimum of 32 active S-boxes, and 32 active S-boxes have at most a bias of , which is not valid for a linear cryptanalysis attack because . As a result, we can assume at most a 20-round linear approximation with at most a total of five rounds appended at both ends, since a total of five rounds appended at both ends implies at least 3 rounds at one end, which would require an attacker to guess all the remaining 62 key bits. Recall that multiple linear cryptanalysis does not work well in the key selected S-box mechanism, because a different linear approximation requires a different set of selecting key bits, which would further shrink the space of remaining key bits that can be guessed in the key recovery phase. Therefore, 25-round LBC should be secure against linear cryptanalysis.
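The bias estimate for 32 active S-boxes can be reproduced with the piling-up lemma. The following worked calculation is a sketch under two assumptions that are not stated in the surviving text: each active S-box contributes a bias of at most $2^{-2}$ (the usual maximum for an optimal 4-bit S-box), and an approximation is considered usable only if its bias $\varepsilon$ is at least $2^{-n/2}$ for the block size $n = 64$.

```latex
\[
  |\varepsilon| \;\le\; 2^{32-1} \cdot \left(2^{-2}\right)^{32}
  \;=\; 2^{31} \cdot 2^{-64}
  \;=\; 2^{-33}
  \;<\; 2^{-32} \;=\; 2^{-n/2}.
\]
% The data requirement is on the order of \varepsilon^{-2} \ge 2^{66},
% which exceeds the 2^{64} plaintexts available for a 64-bit block.
```

Under these assumptions, a linear approximation with 32 active S-boxes would need more known plaintexts than the full codebook provides.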

6.4. Impossible Differential Cryptanalysis

Impossible differential cryptanalysis [38, 39] is a special case of differential cryptanalysis, which is based on a differential with a zero probability. Here, we analyse the security of LBC against impossible differential cryptanalysis.

Consider a plaintext pair with difference , where is a nonzero 4-bit value.

The output difference of Round 1 is of the form . The output difference of Round 2 is of the form , where . The output difference of Round 3 is of the form , where the question mark ‘?’ denotes a 4-bit indeterminate value that can be zero or nonzero, and denotes a nonzero 4-bit value. The output difference of Round 4 is of the form . The output difference of Round 5 is of the form .

In the other direction, given the output difference after Round 9, we can similarly get that the input difference just before Round 5 is of the form , where is a nonzero 4-bit value.

Observe that and . Thus, is an impossible differential, which we denote by .

This 9-round impossible differential of LBC is illustrated in Figure 12, and it can be used to attack at most 19 rounds of LBC, even assuming 5 rounds at either end. As a result, 25-round LBC should be sufficiently secure against impossible differential cryptanalysis. Note that there also exist other similar 9-round impossible differentials.

6.5. Boomerang and Rectangle Attacks

Boomerang, amplified boomerang, and rectangle attacks [40–42] are variants of differential cryptanalysis, which treat a block cipher as a cascade of two sub-ciphers and use two short differentials with larger probabilities instead of a long differential with a smaller probability. Here, we analyse the security of LBC against boomerang, amplified boomerang, and rectangle attacks.

Typically, differential cryptanalysis is based on the idea of using a long differential characteristic with a usually small probability. In contrast, the boomerang attack [42] is based on the idea of using two short differential characteristics with relatively larger probabilities. Suppose two short differential characteristics with probability and , respectively; then, and should satisfy to construct a valid boomerang distinguisher, where is the block size of the concerned cipher. Amplified boomerang and rectangle attacks refine the boomerang attack mainly by using more than two differential characteristics with the same input or output difference.

For LBC, from Table 5, we can learn that an 11-round boomerang distinguisher has a minimum of 16 active S-boxes, which means that the product of the probabilities of two differential characteristics operating on 11 rounds is at most . Thus, 25-round LBC should be sufficiently secure against boomerang attack as well as amplified boomerang and rectangle attacks.
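The boomerang bound can be written out explicitly. The calculation below is a sketch under assumptions not recoverable from the text: $p$ and $q$ denote the probabilities of the two short characteristics, the standard validity condition for a boomerang distinguisher on an $n$-bit block cipher is used, each active S-box is assumed to contribute a probability of at most $2^{-2}$ (the usual maximum for an optimal 4-bit S-box), and the 16 active S-boxes are taken to be the total over both characteristics.

```latex
\[
  (pq)^2 > 2^{-n}
  \quad\Longleftrightarrow\quad
  pq > 2^{-n/2} = 2^{-32} \qquad (n = 64),
\]
\[
  pq \;\le\; \left(2^{-2}\right)^{16} \;=\; 2^{-32}.
\]
% The upper bound on pq does not exceed the threshold 2^{-32},
% so the validity condition cannot be met over 11 rounds.
```

Under these assumptions, an 11-round boomerang distinguisher already fails the validity condition, which leaves a wide margin for the 25-round cipher.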

6.6. Integral Cryptanalysis

Here, we analyse the security of LBC against integral cryptanalysis [43]. Let denote a 4-bit nibble position which takes all the possible 16 values, denote a 4-bit nibble position which is balanced (in other words, its XOR sum is zero), denote a constant 4-bit nibble, and “?” denote a 4-bit nibble position whose status is unclear, that is, it is not known to have any of the above three statuses.

Consider a set of 16 plaintexts which takes all the possible 16 values on a certain 4-bit nibble position, say .

The output of Round 1 is of the form . The output of Round 2 is of the form . The output of Round 3 is of the form . The output of Round 4 is of the form . The output of Round 5 is of the form . Now, there is a 4-bit nibble position with symbol “C” and a 4-bit nibble position with symbol “A” in the output of Round 5. If we continue with one more round, all the 4-bit nibble positions of the output of the resulting round will have an unclear status. Thus, we get a 5-round integral distinguisher of one dimension, which is illustrated in Figure 13(a); here, “one dimension” means that there is only one active nibble position in the set of inputs.

If we would like to obtain a longer integral distinguisher by adding more rounds from the beginning, we can only add at most 4 rounds before reaching the full plaintext space, as illustrated in Figure 13(b). As a result, 25-round LBC should be sufficiently secure against integral cryptanalysis.
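The nibble statuses used in the integral analysis above can be illustrated on a single S-box. This is a minimal sketch, not the LBC round function: the S-box is the PRESENT S-box used as a stand-in, the key nibble `0x7` is an arbitrary hypothetical value, and the predicate names are hypothetical.

```python
from functools import reduce
from operator import xor

# Stand-in bijective S-box (PRESENT), NOT an LBC S-box.
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]

def is_all(values):
    """The nibble takes each of the 16 possible values exactly once."""
    return sorted(values) == list(range(16))

def is_balanced(values):
    """The XOR sum of the nibble over the whole set is zero."""
    return reduce(xor, values, 0) == 0

def is_constant(values):
    """The nibble has the same value for every text in the set."""
    return len(set(values)) == 1

inputs = list(range(16))          # one active nibble position
key_nibble = 0x7                  # arbitrary illustrative key material

# Key addition followed by a bijective S-box preserves the "all" status.
after_sbox = [SBOX[x ^ key_nibble] for x in inputs]
print(is_all(after_sbox), is_balanced(after_sbox))

# XORing two "all" nibbles (as a linear layer may do) keeps only balancedness.
mixed = [SBOX[x] ^ SBOX[x ^ 1] for x in inputs]
print(is_all(mixed), is_balanced(mixed))
```

Once a merely balanced nibble passes through another S-box, nothing can be guaranteed about it, which is why the distinguisher stops when every position reaches the unclear status.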

6.7. Slide Attack

The key schedule involves the round numbers (i.e., ) to avoid slide attacks [33, 34], and the rotation number “29” in Step 3 (a) guarantees that the three least significant bits of get involved in the generation of the subkey of the next round (if any), and the two most significant bits of get involved in the generation of another round subkey (if any).

6.8. Related-Key Cryptanalysis

The key schedule is based on the LBC round function to have a high level of nonlinearity and involves the key length parameter (i.e., ) to distinguish the different key versions, so as to avoid (potential) related-key attacks [27, 28, 32] under different key lengths. The use of the key selected S-box in the encryption/decryption algorithm makes it more difficult to apply related-key cryptanalysis, since the order of the ordinary S-boxes involved in the S-box layer of a fixed round is indeterminate if a key is unknown, and is very likely to vary when a key is changed.

6.9. Summary

We also analysed the security of LBC against other cryptanalysis methods. In summary, the potential 20-round linear approximation of Section 6.3 is the longest cryptanalysis distinguisher we have obtained, and thus 25-round LBC should be sufficiently secure.

Note that differential cryptanalysis and linear cryptanalysis require a different framework in the key selected S-box mechanism, while impossible differential cryptanalysis and integral cryptanalysis work similarly as in the ordinary mechanism, and boomerang, amplified boomerang, and rectangle attacks follow from differential cryptanalysis.

6.10. Security of LBC with the Ordinary S-Box Mechanism

In comparison, for LBC with the ordinary S-box mechanism rather than the key selected S-box mechanism (i.e., using an ordinary S-box , say , rather than a key selected S-box ), as shown in Figure 14, a single nibble/bit will get all the 128 subkey bits involved after propagating through at least 6 rounds. A total of 11 rounds appended at both ends of a linear approximation will indicate at least 6 rounds in an end, which would ensure that an attacker guesses all the 128 key bits in the key recovery phase; multiple linear cryptanalysis works well in the ordinary mechanism and thus we should take its effect into consideration. As a result, LBC with the ordinary S-box mechanism would require 32 rounds to be secure, assuming a 20-round linear approximation with a total of 11 rounds appended at both ends, plus one additional round for preventing the potential effect of multiple linear cryptanalysis.

7. Performance Gain Evaluation

In this section, we briefly give our performance gain evaluation of LBC over LBC with the ordinary S-box mechanism. Recall that as discussed in Section 6 from a design perspective, LBC requires 25 rounds to be secure, while LBC with the ordinary S-box mechanism requires 32 rounds to be secure.

We test software performance on two types of processors: one has ample storage and computing resources for general-purpose use such as servers, and the other has low or moderate storage and computing resources, as in resource-constrained devices such as smartphones. Note that various software and hardware implementation optimizations and trade-offs among metrics such as memory, cost, area, and throughput exist, so other implementations may be faster or slower than the results presented here.

7.1. Software Performance on Intel i3

The second-to-last subcolumn of Table 8 shows the encryption-only performance of the two LBC versions under the same Single Instruction Multiple Data (SIMD) implementation method on a popular Intel i3 CPU i5-4200U @ 1.6GHz processor (x64 architecture) with enough storage and computing resources for general purposes such as servers; the round keys are generated once and stored for later use, which is the usual case for a server. As a result, the key selected S-box mechanism offers speedup in the example LBC cipher.

Note that if the key schedule part were included, the speedup would be greater, since the two versions use the same process to generate round keys, but LBC with the key selected mechanism has only 25 rounds, while LBC with the ordinary mechanism has 32 rounds. Note also that since the round keys are stored after being generated, the results also hold when the server processes a large number of plaintexts at a time.

7.2. Software Performance on ARM NEON

The last subcolumn of Table 8 shows the performance of the two LBC versions under the same SIMD implementation method on a popular ARM Cortex-A9@1.4GHz processor (×64 architecture) for cost-sensitive devices such as smartphones, where the results are for both encryption and key schedule parts, and the round keys are generated on the fly, which is the usual case for a resource-constrained device. As a result, the key selected S-box mechanism offers speedup in the example LBC cipher.

7.3. Hardware Performance

When implemented in parallel hardware with one cycle per round, LBC with the key selected mechanism takes 25 cycles to process a (64-bit) plaintext block, and LBC with the ordinary mechanism takes 32 cycles. Thus, the key selected S-box mechanism offers about speed improvement under this implementation approach in the example LBC cipher. In this case, the key selected S-box mechanism requires slightly more hardware area (gate equivalents, GEs) than the ordinary S-box mechanism, which may make it unsuitable for extremely resource-constrained environments, but it remains acceptable in moderately resource-constrained environments.
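The cycle counts above translate into relative figures as follows. This is simple arithmetic on the stated round numbers, assuming one cycle per round and equal clock frequency for both versions; the variable names are hypothetical, and which of the two ratios the authors quote is not recoverable from the text.

```python
ROUNDS_KEY_SELECTED = 25  # LBC with the key selected S-box mechanism
ROUNDS_ORDINARY = 32      # LBC with the ordinary S-box mechanism

def cycles_per_block(rounds, cycles_per_round=1):
    """Cycles needed to process one 64-bit block in a round-based design."""
    return rounds * cycles_per_round

fast = cycles_per_block(ROUNDS_KEY_SELECTED)
slow = cycles_per_block(ROUNDS_ORDINARY)

# Throughput ratio at equal clock frequency (blocks per cycle).
throughput_ratio = slow / fast
# Fraction of cycles saved per block.
cycle_saving = (slow - fast) / slow

print(f"throughput ratio: {throughput_ratio:.2f}x")
print(f"cycles saved per block: {cycle_saving:.1%}")
```

The comparison ignores any difference in critical path length between the two S-box layers, which could change the achievable clock frequency.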

8. Concluding Remarks

We have presented and investigated a generalised version of Feistel’s key selected S-box mechanism in modern block cipher design and have designed the LBC example cipher to demonstrate that the generalised key selected S-box mechanism can be advantageous over the ordinary S-box mechanism for improving security and/or performance without intensifying computational effort and storage space in some application environments. Especially, we have defined the combined difference distribution table and the combined bias distribution table for the generalised key selected S-box mechanism to analyse the security of a block cipher with a generalised key selected S-box against differential and linear cryptanalysis [44, 45].

As a first attempt, LBC is designed mainly as an example for the primary purpose of investigating the relative security and performance gain of the generalised key selected S-box mechanism over the popular ordinary S-box mechanism in modern block cipher design. In our view, the main overhead of the key selected S-box mechanism is that it requires slightly more hardware area or GEs than the ordinary S-box mechanism, which, depending on the specific design, may make it unsuitable for extremely lightweight application environments; nevertheless, it can achieve better security and/or performance at least in general or moderately lightweight application environments. No single cipher design can be optimal in all application environments. This is the first detailed investigation of the key selected S-box mechanism, and we would like to see more investigations and better cipher designs based on it.

Data Availability

The data are available from the corresponding author upon request.

Disclosure

An extended abstract version of this work was published in Proceedings of the 2017 IEEE Region Ten Conference (TENCON 2017, Penang, Malaysia, 5–8 November 2017) [46]. As the full version of the work, this paper gives more design rationale and security analysis of the LBC example cipher. The authors were with Institute for Infocomm Research (Singapore) when this work was completed.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was supported by the National Research Foundation (NRF), Prime Minister’s Office, Singapore, under its National Cybersecurity R&D Programme (Award no. NRF2014NCR-NCR001-31) and administered by the National Cybersecurity R&D Directorate. The authors are grateful to Matt Henricksen for his conversations, to Zhen Li for a preliminary software performance evaluation of LBC, and to Huaqun Guo and Jia Xu for verifying a software implementation of LBC.