Abstract
We improve the steepest-descent algorithm by adding a double threshold parameter, which significantly improves the algorithm’s efficiency. We also design a new cost function so that several characteristics of Boolean functions can be taken into account simultaneously during the search. Applying our algorithm yields excellent results for 9, 10, 11, and 12 variables. We find a 9-variable Boolean function with a nonlinearity of 242 in the whole search space; previously, this result had appeared only in the rotation symmetric class. The best nonlinearities previously achieved for the classes under the permutations (6, 5, 1, 4, 7, 2, 3, 0, 8) and (0, 7, 2, 5, 4, 1, 3, 6, 8) are 238 and 239, respectively, as reported by Kavut in Information and Computation (2010). Applying our algorithm, we obtain a balanced Boolean function with a nonlinearity of 240 under the same permutations, indicating that our method is more general. For 11 variables, we obtain a Boolean function with higher nonlinearity and lower transparency order, while the absolute values of its spectrum are kept at a low level. The algorithm performs well when all of these properties are considered together. Similarly promising results are obtained for even numbers of variables.
1. Introduction
Boolean functions used in symmetric ciphers should have good cryptographic properties, such as balancedness, correlation immunity, high nonlinearity, high algebraic degree, high algebraic immunity, and low transparency order. However, these characteristics cannot all be optimal simultaneously, and trade-offs must be considered. Therefore, constructing Boolean functions that satisfy compromise criteria remains a challenging open problem [1, 2]. At the same time, many heuristic algorithms have been applied to the search for Boolean functions, and many Boolean functions with good properties have been obtained. However, most heuristic algorithms focus mainly on one cryptographic property of a Boolean function, and it is not easy to take other properties into account simultaneously.
1.1. Related Work
Hill climbing (HC) and genetic algorithms (GA) were first applied to the search for highly nonlinear Boolean functions in 1996 [3, 4] by modifying the truth table of a Boolean function. The literature also shows that cryptographically interesting Boolean functions are denser in the class of rotation symmetric Boolean functions (RSBFs) [5, 6], which makes it easier to capture the desired Boolean functions in this class. In 2007, using a steepest-descent-like algorithm, Maity and Maitra [7] found 9-variable Boolean functions with a nonlinearity of 241. In 2010, Kavut and Yücel [8] found a 9-variable Boolean function with a nonlinearity of 242 in the rotational and dihedral symmetry classes. Afterwards, Chakraborty et al. [9] redefined the transparency order in 2017, and in 2019 Wang and Stănică [10] analyzed the transparency order theoretically and constructed two infinite classes of balanced semibent Boolean functions with provably relatively good transparency order. In addition, Kavut et al. [11] applied the steepest-descent-like iterative search algorithm to build Boolean functions with lower autocorrelation.
1.2. Our Contribution
In our work, we develop an efficient new algorithm based on the steepest-descent-like iterative algorithm. As the number of iterations increases, the cost tends to get stuck in a loop. To deal with this situation, we add a double threshold parameter and a reset operation. We also design a new cost function to obtain excellent nonlinearity, absolute indicator, and transparency order.
We have found a 9-variable Boolean function with excellent properties: it has a low transparency order and low autocorrelation while maintaining a high nonlinearity of 242 in the rotation symmetric class. We also find a Boolean function with nonlinearity 242 in the whole search space, starting from a random seed. This is the first time that a 9-variable Boolean function with a nonlinearity of 242 has been found without restricting the search range. We also found an 11-variable Boolean function that improves on the results presented in [12]: it has higher nonlinearity and lower transparency order. In [8], the best nonlinearities reported for 9-variable Boolean functions under the permutations (6, 5, 1, 4, 7, 2, 3, 0, 8) and (0, 7, 2, 5, 4, 1, 3, 6, 8) are 238 and 239, respectively. In contrast, the steepest-descent dual reset algorithm of this article improves both results to 240.
2. Preliminaries
A Boolean function $f$ on $n$ variables may be viewed as a mapping from $\{0,1\}^n$ to $\{0,1\}$. The truth table of a Boolean function $f(x_1, \ldots, x_n)$ is a binary string of length $2^n$.
The Hamming weight of a binary string $S$ is the number of 1’s in $S$, denoted by $wt(S)$. An $n$-variable function $f$ is said to be balanced if its truth table contains an equal number of 0’s and 1’s, i.e., $wt(f) = 2^{n-1}$. Also, the Hamming distance between equidimensional binary strings $S_1$ and $S_2$ is defined by $d(S_1, S_2) = wt(S_1 \oplus S_2)$, where $\oplus$ denotes the addition over GF(2).
An $n$-variable Boolean function $f(x_1, \ldots, x_n)$ can be considered to be a multivariate polynomial over GF(2). This polynomial can be expressed as a sum-of-products representation of all distinct $k$th-order products $(0 \le k \le n)$ of the variables. More precisely, $f(x_1, \ldots, x_n)$ can be written as
$$f(x_1, \ldots, x_n) = a_0 \oplus \bigoplus_{1 \le i \le n} a_i x_i \oplus \bigoplus_{1 \le i < j \le n} a_{ij} x_i x_j \oplus \cdots \oplus a_{12 \ldots n} x_1 x_2 \cdots x_n,$$
where the coefficients $a_0, a_i, a_{ij}, \ldots, a_{12 \ldots n} \in \{0,1\}$. This representation of $f$ is called the algebraic normal form (ANF) of $f$. The number of variables in the highest order product term with a nonzero coefficient is called the algebraic degree, or the degree of $f$, denoted by $\deg(f)$.
Functions of degree at most one are called affine functions. An affine function with a constant term equal to zero is called a linear function. The set of all $n$-variable affine (respectively, linear) functions is denoted by $A(n)$ (respectively, $L(n)$).
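Definition 1. The nonlinearity of an $n$-variable function $f$ is
$$nl(f) = \min_{g \in A(n)} d(f, g),$$
i.e., the minimum distance from the set $A(n)$ of all $n$-variable affine functions.
Definition 2. Let $x = (x_1, \ldots, x_n)$ and $\omega = (\omega_1, \ldots, \omega_n)$, both belonging to $\{0,1\}^n$, and $x \cdot \omega = x_1\omega_1 \oplus \cdots \oplus x_n\omega_n$. Let $f(x)$ be a Boolean function on $n$ variables. Then, the Walsh transform of $f(x)$ is a real-valued function over $\{0,1\}^n$ which is defined as
$$W_f(\omega) = \sum_{x \in \{0,1\}^n} (-1)^{f(x) \oplus x \cdot \omega}.$$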
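As an illustration of Definitions 1 and 2, the following minimal Python sketch computes the Walsh spectrum with a fast Walsh-Hadamard transform and derives the nonlinearity from it through the standard identity $nl(f) = 2^{n-1} - \frac{1}{2}\max_{\omega}|W_f(\omega)|$. The function names, and the convention that the integer index of a truth-table entry encodes the input vector, are assumptions made for this example and are not taken from the paper.

```python
def walsh_spectrum(tt):
    """Walsh transform of a truth table (list of 0/1 bits, length 2**n).

    Entry w of the result is the sum over x of (-1)**(tt[x] XOR x.w),
    where x.w is the parity of the bitwise AND of the indices x and w.
    """
    w = [1 - 2 * b for b in tt]          # (-1)**f(x)
    h = 1
    while h < len(w):                    # in-place butterfly passes
        for start in range(0, len(w), 2 * h):
            for i in range(start, start + h):
                a, b = w[i], w[i + h]
                w[i], w[i + h] = a + b, a - b
        h *= 2
    return w


def nonlinearity(tt):
    """Distance to the nearest affine function, read off the Walsh spectrum."""
    return len(tt) // 2 - max(abs(v) for v in walsh_spectrum(tt)) // 2
```

For example, the 2-variable function with truth table `[0, 0, 0, 1]` is bent and has nonlinearity 1, while the linear function `[0, 1, 1, 0]` has nonlinearity 0.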
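Definition 3. The autocorrelation function of a Boolean function $f$ at a point $\alpha \in \{0,1\}^n$ is defined by
$$C_f(\alpha) = \sum_{x \in \{0,1\}^n} (-1)^{f(x) \oplus f(x \oplus \alpha)}.$$
We are interested in finding the point(s) $\alpha \neq 0$ for which the absolute value of $C_f(\alpha)$ is high.
Definition 4. The absolute indicator is defined as
$$\Delta_f = \max_{\alpha \in \{0,1\}^n,\ \alpha \neq 0} |C_f(\alpha)|.$$
For cryptographic purposes, our primary motivation is to construct Boolean function(s) with low values of $\Delta_f$.
Definition 5. For an $n$-variable Boolean function $f$, the transparency order in [10] can be viewed as
$$TO(f) = 1 - \frac{1}{2^{2n} - 2^n} \sum_{\alpha \in \{0,1\}^n,\ \alpha \neq 0} |C_f(\alpha)|.$$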
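Definitions 3 to 5 can be checked on small examples with the sketch below. It is a direct, unoptimized evaluation and rests on two assumptions that should be stated plainly: the autocorrelation is taken as the sum of $(-1)^{f(x) \oplus f(x \oplus \alpha)}$ over all inputs, and the transparency order of [10] for a single-output function is taken to be $1 - \sum_{\alpha \neq 0}|C_f(\alpha)| / (2^{2n} - 2^n)$. The function names are hypothetical.

```python
def autocorrelation(tt):
    """Autocorrelation spectrum of a truth table (list of 0/1, length 2**n)."""
    size = len(tt)
    return [
        sum(1 - 2 * (tt[x] ^ tt[x ^ a]) for x in range(size))
        for a in range(size)
    ]


def absolute_indicator(tt):
    """Largest absolute autocorrelation value over the nonzero shifts."""
    return max(abs(c) for c in autocorrelation(tt)[1:])


def transparency_order(tt):
    """Assumed form: 1 - sum_{a != 0} |C_f(a)| / (2**(2n) - 2**n)."""
    size = len(tt)
    total = sum(abs(c) for c in autocorrelation(tt)[1:])
    return 1 - total / (size * size - size)
```

Under these assumptions a bent function has absolute indicator 0 and transparency order 1, whereas a linear function has absolute indicator $2^n$ and transparency order 0, which matches the trade-off between nonlinearity and transparency order discussed in the text.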
2.1. Rotation Symmetric Boolean Functions
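Let $x_i \in \{0,1\}$ for $1 \le i \le n$. For $1 \le k \le n$ [13], we define
$$\rho_n^k(x_i) = \begin{cases} x_{i+k}, & \text{if } i + k \le n, \\ x_{i+k-n}, & \text{if } i + k > n. \end{cases}$$
Let $(x_1, \ldots, x_n) \in \{0,1\}^n$. We can extend the definition of $\rho_n^k$ to $n$-tuples as
$$\rho_n^k(x_1, \ldots, x_n) = (\rho_n^k(x_1), \ldots, \rho_n^k(x_n)).$$
Definition 6. A Boolean function $f$ is called rotation symmetric if for each input $(x_1, \ldots, x_n) \in \{0,1\}^n$,
$$f(\rho_n^k(x_1, \ldots, x_n)) = f(x_1, \ldots, x_n) \quad \text{for } 1 \le k \le n.$$
Definition 7. An orbit is identified by the representative element $\Lambda_{n,i}$, which is the lexicographically first element of the $i$-th orbit, and $0 \le i \le g_n - 1$, where $g_n$ is the number of orbits [14]. Accordingly, we can use a simplified truth table to represent an RSBF $f$. Its form is
$$[f(\Lambda_{n,0}), f(\Lambda_{n,1}), \ldots, f(\Lambda_{n,g_n-1})],$$
which is called the rotation symmetric truth table (RSTT). The length of the RSTT is expressed as $g_n$.
Definition 8. The class of dihedral symmetric Boolean functions (DSBFs) [?], a subset of the RSBF class, is invariant under the action of the dihedral group denoted by $D_n$. In addition to the (left) $k$-cyclic shift operator $\rho_n^k$ on $n$-tuples, which is defined previously, the dihedral group $D_n$ also includes the reflection operator $\tau_n(x_1, \ldots, x_n) = (x_n, \ldots, x_1)$. The $2n$ permutations of $D_n$ are then defined as
$$\{\rho_n^k,\ \rho_n^k \circ \tau_n : 1 \le k \le n\}.$$
Similar to RSBF, we use $d_n$ to represent the truth table length of DSBF.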
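To make the orbit structure behind Definitions 6 to 8 concrete, the following sketch enumerates the orbits of $n$-bit inputs under cyclic shifts (and optionally reflection) and expands a shortened truth table back to a full one. It is only an illustration: the helper names are hypothetical, inputs are encoded as integers, and the smallest integer of each orbit is used as its representative, which agrees with the lexicographically first element only if the most significant bit is read as the first coordinate.

```python
def rotate(x, n, k=1):
    """Cyclically shift the n-bit word x by k positions."""
    k %= n
    return ((x << k) | (x >> (n - k))) & ((1 << n) - 1)


def reflect(x, n):
    """Reverse the order of the n bits of x."""
    return int(format(x, f"0{n}b")[::-1], 2)


def orbits(n, dihedral=False):
    """Orbits of {0,1}^n under rotations (or the dihedral group), each sorted."""
    seen, result = set(), []
    for x in range(1 << n):
        if x in seen:
            continue
        orbit = {rotate(x, n, k) for k in range(n)}
        if dihedral:
            orbit |= {reflect(y, n) for y in orbit}
        seen |= orbit
        result.append(sorted(orbit))  # orbit[0] is the representative
    return result


def expand_rstt(rstt, n, dihedral=False):
    """Full truth table of the function taking value rstt[i] on the i-th orbit."""
    tt = [0] * (1 << n)
    for bit, orbit in zip(rstt, orbits(n, dihedral)):
        for x in orbit:
            tt[x] = bit
    return tt
```

For $n = 9$ this enumeration gives 60 rotation orbits and 46 dihedral orbits, so a search over shortened truth tables covers $2^{60}$ or $2^{46}$ candidates instead of $2^{512}$.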
2.2. Accelerated Calculation
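Furthermore, we introduce an important matrix from [7] for analyzing Walsh spectra and accelerating the calculation, which will be applied in our search. The matrix ${}_n\mathcal{A}$ is defined as
$${}_n\mathcal{A}_{i,j} = \sum_{x \in G_n(\Lambda_{n,i})} (-1)^{x \cdot \Lambda_{n,j}},$$
where $\Lambda_{n,i}$ is the representative element of the $i$-th orbit $G_n(\Lambda_{n,i})$.
Clearly, the size of ${}_n\mathcal{A}$ is $g_n \times g_n$ (for RSBFs) or $d_n \times d_n$ (for DSBFs), and the matrix element ${}_n\mathcal{A}_{i,0}$ is the size of the $i$-th orbit. Note that the Walsh spectrum of $f$ can be determined by
$$W_f(\Lambda_{n,j}) = \sum_{i=0}^{g_n - 1} (-1)^{f(\Lambda_{n,i})} \, {}_n\mathcal{A}_{i,j},$$
with $d_n$ in place of $g_n$ for DSBFs.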
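The speed-up from the matrix of [7] can be sketched as follows. The code assumes the matrix entry $(i, j)$ is the sum of $(-1)^{x \cdot \Lambda_{n,j}}$ over all $x$ in the $i$-th orbit, so that the Walsh value at each orbit representative is a signed sum of one matrix column. The names are hypothetical, and integers stand in for input vectors.

```python
def rotation_orbits(n):
    """Orbits of n-bit integers under cyclic shifts; orbit[0] is the representative."""
    mask = (1 << n) - 1
    seen, result = set(), []
    for x in range(1 << n):
        if x not in seen:
            orbit = sorted({((x << k) | (x >> (n - k))) & mask for k in range(n)})
            seen.update(orbit)
            result.append(orbit)
    return result


def orbit_matrix(orbs):
    """A[i][j] = sum over x in orbit i of (-1)**(x . representative of orbit j)."""
    reps = [o[0] for o in orbs]
    return [
        [sum(1 - 2 * (bin(x & r).count("1") & 1) for x in o) for r in reps]
        for o in orbs
    ]


def rsbf_walsh(rstt, A):
    """Walsh values at the orbit representatives, from the short truth table."""
    signs = [1 - 2 * b for b in rstt]     # (-1)**f(representative)
    return [sum(s * row[j] for s, row in zip(signs, A)) for j in range(len(A))]
```

The matrix is computed once per $n$; afterwards each candidate costs a product with a $g_n \times g_n$ matrix rather than a transform of length $2^n$. Column 0 of the matrix holds the orbit sizes, since the representative of the first orbit is the all-zero vector.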
3. Search Strategy
Our search strategy uses an improved steepest-descent iteration algorithm. Each iteration step takes a Boolean function as input and produces a Boolean function as output: the cost function is evaluated over a predefined neighborhood of the input, and the Boolean function with the lowest cost is selected as the output of the iteration. A Boolean function has many properties, and it is usually hard to take all of them into account in the algorithm design; we therefore aim for the Boolean function with the best overall combination of properties.
The 1-neighborhood of $f$ is obtained by flipping individual elements of its truth table. For an $n$-variable Boolean function, the 1-neighborhood consists of $2^n$ distinct Boolean functions, each at Hamming distance 1 from the original Boolean function.
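One iteration over the 1-neighborhood can be sketched as below. The cost used here is only a stand-in, namely the sum over $\omega$ of $(W_f(\omega)^2 - 2^n)^2$, chosen because it is zero exactly when the spectrum is flat; it is not the cost function proposed in this paper, and the function names are hypothetical.

```python
def walsh_spectrum(tt):
    """Fast Walsh-Hadamard transform of (-1)**f for a 0/1 truth table."""
    w = [1 - 2 * b for b in tt]
    h = 1
    while h < len(w):
        for start in range(0, len(w), 2 * h):
            for i in range(start, start + h):
                a, b = w[i], w[i + h]
                w[i], w[i + h] = a + b, a - b
        h *= 2
    return w


def squared_error_cost(tt):
    """Stand-in cost (an assumption): deviation of the spectrum from flatness."""
    size = len(tt)
    return sum((w * w - size) ** 2 for w in walsh_spectrum(tt))


def steepest_descent_step(tt, cost):
    """Return the lowest-cost function at Hamming distance 1 and its cost.

    The best neighbour is returned even when it is worse than the input,
    which is why repeated steps can end up cycling between a few functions.
    """
    best_tt, best_cost = None, None
    for i in range(len(tt)):
        neighbour = tt[:i] + [tt[i] ^ 1] + tt[i + 1:]   # flip one bit
        c = cost(neighbour)
        if best_cost is None or c < best_cost:
            best_tt, best_cost = neighbour, c
    return best_tt, best_cost
```

Starting from the all-zero 2-variable function, a single step already reaches a bent function with stand-in cost 0. On larger instances the repeated steps stall, which is the situation the thresholds of Section 3.2 are meant to handle.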
3.1. New Cost Function
We introduce a new cost function designed to obtain Boolean functions with high nonlinearity, low absolute indicator, and low transparency order. We modified and the cost functions. Hence, we use the sum of quartic power errors as the cost function, which is defined as
In the search process, if only 1-bit flipping is adopted and the candidate with the minimum cost is always selected, then after several rounds the cost value remains unchanged or two adjacent cost values keep repeating. This situation is known as being trapped in a locally optimal solution. If no correction is made, the function becomes stable and no new candidates are produced. The same can be said of other classes of heuristic algorithms. In [6], the authors suggest selecting the candidate with the second smallest cost value as the next iterate in this case. The disadvantage of this approach is that it involves backtracking: the space complexity of backtracking is hard to determine in different situations, and the time complexity also increases. We instead optimize the greedy steepest-descent-like iterative algorithm and propose a novel steepest-descent dual reset algorithm, which speeds up convergence and largely avoids the problem of the function staying in a suboptimal solution.
3.2. Dual Threshold
In the search process, if only the 1-neighborhood is used and the candidate with the lowest cost is always selected, then after several iterations the cost value remains unchanged or the cost values of two adjacent iterations coincide. This is known as falling into a locally optimal solution. If no correction is made, the function becomes stable and no new solution is created. The same is true for other classes of heuristic algorithms. In [5], the authors suggest choosing the candidate with the second smallest cost value as the next iterate in this case. The downside of this approach is that it involves backtracking: the space complexity of backtracking is difficult to determine in different situations, and the time complexity also increases. We instead propose a dual threshold parameter and tune its size on the basis of many experiments. In this way, the problem of the function staying in a locally optimal solution is largely avoided.
3.2.1. The First Threshold
CIVT denotes the cost-iteration threshold value. The CIVT size is defined as , where , and represents the length of the truth table, including and . is defined as its product rounded up to an integer. At the same time, we introduce a variable to record the degree of , that is, the number of times that the of the first equals the of the first , where is an integer. When is triggered, + 1. When reaches our preset value of , the first threshold is triggered: a position is selected at random and m consecutive bits are reset. m is selected as variables. After repeated experiments with different values of m, we determined that this selection method works best.
3.2.2. The Second Threshold
When the first threshold is triggered, the function is reset from its current state, but because of the limit imposed by m, the function changes only slightly and its evolution stays close to what we expect. In this way, a local optimum can be escaped effectively, but the change is confined to m consecutive bits, so a significant improvement in a property such as nonlinearity cannot be expected from this reset alone.
On top of this, we introduce a second counter, initialized to 0, which is incremented by 1 whenever the first threshold is triggered. When this counter reaches its preset value, we calculate an acceptance probability from the total number of executions and the current number of executions. With this probability, the move is accepted even if the current cost value is not the best one. The idea is similar to the simulated annealing algorithm, which also accepts weaker solutions. The advantage is that when the program has just started, a weaker solution is accepted with high probability, so that the function can jump out of the local optimum with a high chance; when the program is about to end, a weaker solution is accepted only with a small probability, and the current state of the function is largely preserved. Moreover, when the second threshold is triggered, we do not reset consecutive bits as for the first threshold but reset discrete bits, whose number depends on the length of the current truth table. The effect is to increase the degree of dispersion of the current function.
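The two reset styles and the annealing-like acceptance rule can be sketched as follows. Every specific choice here is an assumption made for illustration: the acceptance probability is taken to decay linearly as $1 - t/T$ (with $t$ the current and $T$ the total number of executions), a "reset" is interpreted as redrawing the selected bits uniformly at random, and all function names are hypothetical. The schedule and reset rule used in the paper may differ.

```python
import random


def acceptance_probability(current_run, total_runs):
    """Assumed linear schedule: 1 at the start of the search, 0 at the end."""
    return 1.0 - current_run / total_runs


def accept_weak_move(current_run, total_runs, rng):
    """Decide whether a non-improving move is accepted at this point."""
    return rng.random() < acceptance_probability(current_run, total_runs)


def reset_consecutive(tt, m, rng):
    """First-threshold style reset: redraw m consecutive bits at a random position."""
    start = rng.randrange(len(tt) - m + 1)
    out = list(tt)
    for i in range(start, start + m):
        out[i] = rng.randint(0, 1)
    return out


def reset_scattered(tt, k, rng):
    """Second-threshold style reset: redraw k bits at scattered random positions."""
    out = list(tt)
    for i in rng.sample(range(len(tt)), k):
        out[i] = rng.randint(0, 1)
    return out
```

Passing an explicit `random.Random(seed)` instance as `rng` keeps runs reproducible. The consecutive reset perturbs the function locally, while the scattered reset spreads the change across the whole truth table.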
3.3. Algorithmic Process
Our main algorithm flow is shown in Algorithm 1.
4. Experimental Results
In this section, we apply the steepest-descent dual reset algorithm and find Boolean functions with excellent nonlinearity, autocorrelation, and transparency order in 9, 10, 11, and 12 variables. The following tables compare the experimental results for each number of variables.
In Table 1, we give three representative results for 9-variable Boolean functions. We demonstrate the quality of our results and the effectiveness of our algorithm by comparing nonlinearity, transparency order, and balancedness. While maintaining a higher nonlinearity, our results have a lower transparency order. The new 9-variable Boolean function found in the dihedral symmetry class has the best transparency order among Boolean functions with a nonlinearity of 240. Compared with [15], the algorithm in this paper converges well in terms of nonlinearity. Before this paper, a 9-variable Boolean function with a nonlinearity of 242 had been obtained only by Kavut; the result appeared in [8], published in Information and Computation in 2010, and was found in the rotation symmetric class. This paper not only finds a new 9-variable Boolean function with nonlinearity 242 and excellent transparency order in the rotation symmetric class but also finds a new 9-variable Boolean function with nonlinearity 242 and lower transparency order in the whole space. No such result has been published before.
In [8], Kavut reported best nonlinearities of 238 and 239 for the two classes of 9-variable Boolean functions under the permutations (6, 5, 1, 4, 7, 2, 3, 0, 8) and (0, 7, 2, 5, 4, 1, 3, 6, 8), respectively. In this paper, our results increase the maximum nonlinearity for both classes to 240.
In Table 2, we compare our results with those in [2, 12, 15, 16]. We find two Boolean functions with high nonlinearity and low transparency order in the rotation symmetric class. Among balanced 10-variable Boolean functions, one of them has the highest nonlinearity among the known results and maintains a good transparency order. For previously known 10-variable Boolean functions with a nonlinearity of 490, the lowest transparency order is 0.9864, given in [12]; our result presented in this paper has a lower transparency order than that in [12].
The results for 11-variable Boolean functions are shown in Table 3. We present two representative results. Compared with the results given in [12, 15], the Boolean functions obtained by our search have higher nonlinearity. In addition, among previously known functions with a nonlinearity of 990, the lowest transparency order is 0.9872, whereas our result has a lower transparency order at the same nonlinearity, which is also due to the good convergence of our algorithm. According to [10], Boolean functions with high nonlinearity usually have higher transparency order. However, with our improved search algorithm, we find a balanced 11-variable Boolean function that has lower transparency order while maintaining higher nonlinearity than the result in [12]. Considering nonlinearity and transparency order together, this result is the best among general balanced 11-variable Boolean functions.
Table 4 compares our results for 12-variable Boolean functions with [14, 17]. We obtain a balanced Boolean function with a nonlinearity of 2000 and an autocorrelation of 136, whose transparency order is the lowest known.
5. Conclusion
The space of Boolean functions is vast: the number of Boolean functions of $n$ variables is $2^{2^n}$, so an exhaustive search is not feasible. A good approach is to search in a narrowed space such as the rotation symmetric class. Under this circumstance, designing an efficient and workable search method is also crucial. With our improved steepest-descent method, good results can be obtained both in the rotation symmetric class and in the entire space of Boolean functions.
For 9, 10, 11, and 12 variables, we have obtained Boolean functions with excellent properties, many of which have the best overall combination of properties known, as shown by the comparisons in Section 4.
We also ran our algorithm with the cost function used in the experiments in [14] for comparison. We found that the convergence behavior differs greatly between cost functions under the same algorithm. Therefore, designing a more rational cost function is also an urgent problem in the heuristic search of Boolean functions. Although our cost function has achieved good results in practice, we still hope to optimize it further, lowering its computational complexity and making convergence more accurate.
Appendix
Some results of our experiment are as follows:

Our result 1: 9-variable, nl(f) = 240, ac(f) = 72, To(f) = 0.9611
427f 4c88 5397 f7f6 4128 b109 9c18 cc0f
1635 2ae7 f934 77f1 e586 65a6 9696 62cd
250a 7d54 3afb ce09 cda1 7c57 1818 cc21
9a54 e70b 1b45 ee0b e41b f10b 0f7b 97d5

Our result 2: 9-variable, nl(f) = 242, ac(f) = 32, To(f) = 0.9832
3340 b6a1 1821 f196 42a8 5e2b 7e2f 3c3c
b65f a0d9 5ec9 db1e ab2b db36 6618 5ae0
087f 5fe6 e075 7106 212f c918 754c 40e8
a1bc cbfa 7140 32a8 9614 56e0 66e8 a801

Our result 3: 9-variable, nl(f) = 242, ac(f) = 48, To(f) = 0.9828
6da2 cd1d b0b7 47b7 5b3d 90b8 62fe fa29
368e 5ff6 d744 8ec1 7a98 82d0 c40c a201
7acd b4a3 cb61 995e fabc 92f2 e7cc aa40
fe9d daf0 c64c ba08 935b 22cb 5d09 f1df

Our result 4: 9-variable in the permutation of (6, 5, 1, 4, 7, 2, 3, 0, 8), nl(f) = 240, ac(f) = 160, To(f) = 0.9870
e83d 6d1b 7927 5697 3d17 1b92 2786 97a9
3d17 1b92 2786 97a9 17c2 92e4 86d8 a968
6a16 1a53 2647 56c1 1695 53e5 47d9 c1a9
1695 53e5 47d9 c1a9 95e9 e5ac d9b8 a93e

Our result 5: 9-variable in the permutation of (0, 7, 2, 5, 4, 1, 3, 6, 8), nl(f) = 240, ac(f) = 128, To(f) = 0.9845
c200 c23c 0b8d 3b8c bcff a995 f031 9556
23b1 2fb0 01d5 c0c3 cc0d 9556 fd14 566a
0216 546b 19a7 623b 5469 abab 621b acf1
259b 4a2f 6b7f bfbc 4a27 b8cd 97bc bd15

Our result 6: 10-variable, nl(f) = 490, ac(f) = 80, To(f) = 0.9846
abfc ddc7 d180 d249 d530 f326 c46b 46b1
d104 2823 9c2d 7e1b d617 1aad 420a bc75
9520 6346 3af6 7b2d e193 2f94 08df 64f9
c04b 2149 24ff ae84 462a 76ea bc96 1845
b451 7a63 5e78 030f 78ab c94f 4de9 2ac5
da30 e479 3f9c a507 23b3 8588 0a42 cce1
c763 46b9 6b31 13a1 6e07 dc99 bf9a a203
130f 3ffb 094f deab f9d7 b14b 75f3 5755

Our result 7: 10-variable, nl(f) = 492, ac(f) = 40, To(f) = 0.9886
121d 53f6 365a ff38 5b3d 26d8 eaef 4ad0
369f 1be6 4828 e2c5 f8c8 e9ab 21d9 e314
1b6c 96ba 538b a829 3480 4c90 e90d b033
ffd1 e085 b893 cc8f 5847 a282 b95f 1274
47de 6cf4 d72c 8b8c 675a 808b c891 1d96
5a64 d101 60a4 9345 fd83 45a3 cb00 5f4e
beae f212 bc01 8576 cad1 825b f1b1 90fa
62c4 613e 8d0d d54c 8a96 33fa 065c 2e20

Our result 8: 11-variable, nl(f) = 990, ac(f) = 88, To(f) = 0.9869
aaee cadf 93ab c58c b52c fba8 c601 a2d3
fc05 3bd7 c9f8 aef6 c25b 6675 ef7b c479
d883 3755 3ce8 d00c 93e5 9da3 bb9a c91a
866a 54ec 4e1b 4811 cf99 5cbc 9743 48b4
91a7 f268 790d 0544 2dc3 9ea2 c562 67c3
b03d ce44 b5d4 fe78 adbc a4ba c6a4 31ea
b20b 1fff 1007 8fd3 028a 65f8 13b7 2030
96c8 e5a0 1183 fcc2 f10d 427d 16a3 b803
a025 fb18 d87a 0ba7 49f5 3684 2705 1707
6bd5 c268 f1db fe3e 8240 0f2f 1f4c d32c
f962 7990 93ca 1213 ed00 d446 9c8e 08f2
ebd5 e997 aa16 fdae c21a fb42 7925 cbeb
a97e 66b9 35c8 99d9 3522 774c a299 d078
727e f2ef 1a40 c8a6 353d ac58 2e72 7933
b40f d2f7 da50 af66 6461 e228 dd86 867f
c965 6690 032a 5885 310e ba69 f8e3 633c

Our result 9: 11-variable, nl(f) = 992, ac(f) = 128, To(f) = 0.9856
eda2 9d5c c7f3 37f4 e02b bf1e 0e3e af61
a854 59cb 9fbf 57a8 05fd 5ea8 9cee 3c43
9d81 7265 3287 e19e d7ab 9aaa 326e c9d4
5477 eeb3 32a8 99c5 c6a0 a9fc 5ee1 701b
c7e7 8556 3a5c 7c76 4b19 c03b fc03 c3e8
f73f 88ca d29d cc8c 5e08 6ca8 e183 e720
3734 6e6a a9a9 da1b 1f58 cc80 c3c3 f032
f578 9914 c983 aae4 66e9 bd57 2a41 128b
b07e bd7e 8176 2639 0ed9 77f0 3fe0 6a69
71cb 4396 f140 4a9e efa5 405b a01e e9c5
bb3e 4eee 9585 e198 f708 c7a2 e0a0 91b0
76a8 11c5 7ca0 99c4 fc56 910a a97f 0941
4e7e 1e25 2dfc 7889 d996 d886 b38d 069b
56aa 7394 f0b0 8054 e05b e05a ee50 1f19
ae63 3ec1 d6c2 5325 e0d6 c40e 8d8c ac75
2969 e8d3 8ea2 322f 198d 2043 5309 d1de

Our result 10: 12-variable, nl(f) = 2000, ac(f) = 136, To(f) = 0.9923
2ede 819e a331 b5de 00d7 92db 06ee 2a65
99d9 7bb6 1b80 2a56 221f 9e8a 2baf 1f44
a5e5 81b4 5da8 bc4e 8b12 5c88 8044 aaa4
95d4 8a77 1b34 5c14 2ea9 bbd9 31d9 4352
eb00 cb54 e371 f842 ea7e 114d 1769 bc34
0953 8f85 af6d 0c18 f676 4307 bbef ee43
b140 9113 b2af 491c da56 97a8 af29 cae8
d474 440b 5602 3b5e 7920 81f5 426d 552f
dbad 6726 83a9 4142 2492 a69b 2648 e880
2100 e630 8fde ac6b 754c 1ee0 a996 2917
37a0 113d a6dc a344 1426 f17f cd2d 9e1c
73b4 f3b1 bcd6 d8a2 a9fd 8e99 dbdf 0779
fd25 5337 b035 7568 43c1 1462 ec4a 9e38
2a05 bef0 1ee6 5408 eb89 6fb4 d6fe 9ee7
d546 4d02 4343 73b8 aaf1 8c91 8256 ab64
f21a c4c8 598a 73fb 437b 4f85 5514 2adc
84bc bfd4 5a18 7f5e 0883 055b adca b9d0
91bc 5ac0 41b5 5f57 2b5a 03b3 8bf7 f736
3a34 3762 db0e 2c33 4866 6e71 4439 f103
f2ff e93c 9b25 64dd bbf4 b50b 7ab5 2159
7918 bb32 7170 7dd5 45f1 6b7c 50c2 bcbd
dafd 84b0 378a e633 8380 3b95 b58f 7597
597d ed52 cc3d fd31 5639 3fb5 3b1c 44c1
501a 337e 5870 5b4f 91fd 8588 3719 5ca4
8cd4 6e54 5468 6d09 039d c2aa b6bf b55c
ecd2 788f dff8 f084 9ac3 06fa f4db 6db7
7efe 7655 e8cb 9d32 9a71 25f0 bafd d95d
7516 0d5f b022 03fc 910b 998e f59a df48
8404 421b 42d5 322b f996 a997 e386 471d
4445 23de 1869 1a8a a36a 154b fbf8 5e52
882f 25eb 8342 c7e2 fe4b 1c15 b2d3 7646
a993 b212 bc32 18bf 4011 3112 2ffe 94c2
Data Availability
The data that support the findings of this study are available from the corresponding author upon request.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
This work was supported by the National Natural Science Foundation of China (nos. 62172230 and 62262018) and the National Natural Science Foundation of Jiangsu Province (no. BK20201369).