1 Introduction
Public-key cryptosystems are widely used, but they are much slower than symmetric cryptosystems. In encryption and decryption, modular exponentiation and scalar multiplication are the key factors affecting efficiency. Common public-key cryptosystems include DH, RSA, ElGamal, ECC, etc. Modular exponentiation is used in DH, RSA and ElGamal, which is
$$y = x^{e} \bmod n \tag{1}$$
In ECC, scalar multiplication is used as
$$Q = eP \tag{2}$$
Both operations are driven by the positive integer e, and there are different calculation paths to reach e. For example, when $e = 13$, there are $1 \to 2 \to 4 \to 8 \to 12 \to 13$, $1 \to 2 \to 3 \to 6 \to 12 \to 13$, …, etc. Such paths are called the addition chains of e.
Given a positive integer e, an addition chain A of e with length r is a sequence of positive integers $a_0, a_1, \dots, a_r$, where $a_0 = 1$, $a_r = e$, and $a_i = a_j + a_k$ with $j, k < i$ for all $1 \le i \le r$. When $j = k = i - 1$, this step is called a doubling step (i.e. $a_i = 2a_{i-1}$). When $j = i - 1$, this step is called a star step (i.e. $a_i = a_{i-1} + a_k$). Throughout this paper, we refer to the bit length of e, to the Hamming weight of e, which is the number of 1s in the binary form of e, and to the length of a shortest addition chain of e.
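The definition can be checked mechanically. The following Python sketch assumes the usual indexing $a_0 = 1, \dots, a_r = e$; the function names `is_addition_chain` and `step_kinds` are illustrative and not taken from the paper. A doubling step is reported as such even though it is also a special case of a star step.

```python
def is_addition_chain(chain, e):
    """Return True if `chain` starts at 1, ends at e, and every later
    element is the sum of two (not necessarily distinct) earlier elements."""
    if not chain or chain[0] != 1 or chain[-1] != e:
        return False
    for i in range(1, len(chain)):
        earlier = chain[:i]
        if not any(chain[i] - a in earlier for a in earlier):
            return False
    return True


def step_kinds(chain):
    """Label each step as 'doubling' (a_i = 2*a_{i-1}),
    'star' (a_i = a_{i-1} + a_k) or 'other'."""
    kinds = []
    for i in range(1, len(chain)):
        prev = chain[i - 1]
        if chain[i] == 2 * prev:
            kinds.append("doubling")
        elif chain[i] - prev in chain[:i]:
            kinds.append("star")
        else:
            kinds.append("other")
    return kinds


if __name__ == "__main__":
    for chain in ([1, 2, 4, 8, 12, 13], [1, 2, 3, 6, 12, 13]):
        print(chain, is_addition_chain(chain, 13), step_kinds(chain))
```

Both example chains for 13 have length 5, counted as the number of steps rather than the number of elements.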
In practical public-key cryptosystems, the operands are usually long for security. For example, RSA uses 1024 or 2048 bits, so each operation takes a long time to compute. In fact, the operation can be abstracted as an Addition Chain Problem (ACP), namely finding a shortest Addition Chain (AC). However, finding a shortest addition chain of length k is an NP-hard problem since the search space size is comparable to k!. Therefore, optimizing the result of ACP to speed up modular exponentiation or scalar multiplication is of great practical significance for the efficiency of public-key cryptosystems.
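To make the link between addition chains and exponentiation concrete, the following Python sketch evaluates a modular power along a given addition chain, using one modular multiplication per chain step. The function name `pow_along_chain` and the dictionary-based bookkeeping are illustrative assumptions, not part of the methods proposed in this paper.

```python
def pow_along_chain(x, chain, n):
    """Compute x**chain[-1] % n with one modular multiplication per chain step."""
    powers = {1: x % n}  # maps a chain element a to x**a mod n
    for i in range(1, len(chain)):
        target = chain[i]
        # Find earlier elements a, b with a + b == target; their powers are known.
        for a in chain[:i]:
            b = target - a
            if b in powers:
                powers[target] = (powers[a] * powers[b]) % n
                break
        else:
            raise ValueError(f"{target} is not a sum of two earlier elements")
    return powers[chain[-1]]


if __name__ == "__main__":
    chain = [1, 2, 4, 8, 12, 13]
    # The chain has 5 steps, so 5 modular multiplications are performed.
    print(pow_along_chain(7, chain, 1000003) == pow(7, 13, 1000003))
```

A shorter chain therefore translates directly into fewer multiplications, which is why the chain length is the quantity to minimize.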
At present, the methods to solve ACP mainly include the Binary method, the m-ary ($2^k$-ary) method [Bra39], the Window method [Knu14], the Powertree method [Knu14], the Genetic Algorithm [CRJC05], the Artificial Immune System [CRC08], and the Evolutionary Programming [DMO15], etc. The Binary method is widely used in fast modular exponentiation and scalar multiplication, and it can be further optimized. The m-ary method divides the binary form of an integer into windows, all of which are covered by a precomputation; the algorithm then proceeds by scanning the windows. The Window method achieves better results than the m-ary method by looking for windows whose head and tail are nonzero, which reduces the precomputation. A number of analyses and improvements of the Window method have been made [BC89, Koç95, KY98, KY00], and they are collectively called Window-based methods. The Powertree method, the Genetic Algorithm, the Artificial Immune System and the Evolutionary Programming need complex operations and have so far been suitable only for relatively small integers. The Window-based methods can handle large integers within a short time and have been repeatedly optimized, but they have seemingly reached a plateau.
In this paper, we propose some novel methods to generate short addition chains. Our Simplified Powertree method reduces the search space size at the cost of some increase in the addition chain length. Meanwhile, a Cross Window method and its variant are introduced by improving the Window method. Specifically, the Cross Window method handles windows that have cross correlation, and its precomputation is shortened by the Addition Sequence algorithm. It is worth mentioning that the Cross Window method with the Addition Sequence algorithm can attain a 9.5% reduction of the addition chain length, in the best case, compared to the Window method.
This paper is structured as follows. In Section 1, we give a basic introduction to the state of the art as well as the main contributions of this paper. Section 2 briefly reviews some general existing methods, including the Binary method, the Powertree method and the Window method. We then present our contributions in Section 3, where we give a detailed description of our novel methods, including the Simplified Powertree method, the Cross Window method and the Cross Window method with Addition Sequence algorithm. We report our experiments in Section 4, which show that the new methods can obtain shorter addition chains than the existing methods. Finally, we give our conclusions in Section 5.
2 Existing methods
2.1 Binary Method
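As a point of reference, the Binary method (BM) in its usual left-to-right double-and-add form can be sketched as follows. This is the textbook formulation and may differ in detail from the paper's own algorithm listing; the function name `binary_chain` is illustrative.

```python
def binary_chain(e):
    """Addition chain of e from the left-to-right binary (double-and-add) method."""
    if e < 1:
        raise ValueError("e must be a positive integer")
    chain = [1]
    for bit in bin(e)[3:]:               # skip the '0b' prefix and the leading 1
        chain.append(2 * chain[-1])      # doubling step
        if bit == "1":
            chain.append(chain[-1] + 1)  # star step: add a_0 = 1
    return chain


if __name__ == "__main__":
    print(binary_chain(13))
```

The resulting chain has one doubling step per bit after the leading one and one star step per additional 1 bit, so its length is the bit length of e plus the Hamming weight of e minus 2.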
2.2 Powertree Method (PTM)
PTM means that all nodes are represented in the form of a tree, and the nodes on the path are used as the addition chain of an integer. A complete powertree without duplicate nodes on any path is a tree that contains all possible results, as shown in Fig. 1.
The shortest addition chain of an integer can be determined by exhaustive search for all paths, which takes a long time. Note that root node 1 is layer 0, then nodes on layer k can get next layer by adding themselves and their previous nodes. The number of subnodes that can be generated is not less than , and the total number of nodes with depth of is over . In our Simplified Powertree method, we delete a large number of nodes and reduce the size from to .
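The cost of exhaustive search is easy to observe on small inputs. The sketch below is a minimal Python illustration that assumes the complete powertree is the tree of star chains, i.e. each child is the newest node plus one node on its path; it searches that tree by iterative deepening. The function name `shortest_star_chain` and the pruning rule are illustrative choices, not the paper's algorithm.

```python
def shortest_star_chain(e):
    """Exhaustively search star chains of e, shortest depth first."""
    def dfs(path, depth_left):
        last = path[-1]
        if last == e:
            return list(path)
        # Prune: even doubling at every remaining step cannot reach e.
        if depth_left == 0 or last << depth_left < e:
            return None
        for prev in reversed(path):      # try the largest addend first
            nxt = last + prev
            if nxt <= e:
                path.append(nxt)
                found = dfs(path, depth_left - 1)
                path.pop()
                if found:
                    return found
        return None

    depth = max(e.bit_length() - 1, 0)   # at least floor(log2 e) steps are needed
    while True:
        found = dfs([1], depth)
        if found:
            return found
        depth += 1


if __name__ == "__main__":
    for e in (13, 15, 16):
        chain = shortest_star_chain(e)
        print(e, chain, len(chain) - 1)
```

Even with pruning, the running time grows quickly with e, which is the motivation for discarding most of the tree.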
2.3 Window Method
The idea of the Window Method (WM) is to split the binary form of an integer into windows and then process the windows to get the addition chain in two parts: precomputation and construction. Let the window length be k. Precomputation selects 2 and all odd integers from 1 to $2^k - 1$. The result of the precomputation is {1, 2, 3, 5, 7, …, $2^k - 1$}, an addition chain of length $2^{k-1}$.
In WM, for the binary form of an integer e, we read a window w whose bit length is at most k and whose most and least significant bits are both 1. As a result, w is in the precomputation. For construction, one doubling step per bit of w and one star step are performed. For consecutive 0s, doubling steps are conducted directly. The implementation of WM is shown in Alg. 2.
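The precomputation and construction phases can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a transcription of Alg. 2: the scan runs from the most significant bit, each window is the longest run of at most k bits that starts and ends with 1, and the whole precomputed table is kept in the chain even when some entries are unused. The names `window_chain` and `push` are hypothetical.

```python
def window_chain(e, k):
    """Addition chain of e from a sliding-window scan with window length k >= 1."""
    if e < 1 or k < 1:
        raise ValueError("e and k must be positive")
    # Precomputation: 1, 2, then the odd integers 3, 5, ..., 2**k - 1.
    pre = [1, 2] + list(range(3, 2 ** k, 2))
    bits = bin(e)[2:]
    steps = []                       # elements appended after the precomputation

    def push(value):
        if value not in pre:         # avoid repeating a precomputed value
            steps.append(value)

    acc, i = 0, 0
    while i < len(bits):
        if bits[i] == "0":
            acc *= 2                 # doubling step for a 0 between windows
            push(acc)
            i += 1
            continue
        j = min(i + k, len(bits))    # window of at most k bits ...
        while bits[j - 1] == "0":    # ... whose last bit is 1
            j -= 1
        w = int(bits[i:j], 2)
        if acc == 0:
            acc = w                  # first window is taken from the precomputation
        else:
            for _ in range(j - i):   # one doubling step per window bit
                acc *= 2
                push(acc)
            acc += w                 # one star step adds the window
            push(acc)
        i = j
    if not steps:                    # e itself is a precomputed value
        return pre[: pre.index(e) + 1]
    return pre + steps


if __name__ == "__main__":
    print(window_chain(13, 2))
```

Because the full table is retained, a large k lengthens the chain for short exponents; the table only pays off once enough windows reuse it.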
3 New methods
3.1 Simplified Powertree Method
We first give another view of BM. Instead of interleaving the optional additions with the doubling steps, we perform all the doubling steps first and then perform the optional additions, with the addend 1 replaced by the corresponding numbers. This implementation of BM is Alg. 3. This view of BM shows a feasible way to construct the addition chain, which leads to the key point of our Simplified Powertree method.
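A minimal Python sketch of this doubling-first view is given below. It assumes that the "corresponding numbers" are the powers of two matching the remaining 1 bits of e, which is our reading of the description rather than a transcription of Alg. 3; the function name is illustrative.

```python
def binary_chain_doubling_first(e):
    """BM variant: all doubling steps first, then one addition per remaining 1 bit."""
    if e < 1:
        raise ValueError("e must be a positive integer")
    n = e.bit_length() - 1
    chain = [1 << i for i in range(n + 1)]       # 1, 2, 4, ..., 2**n
    for i in range(n - 1, -1, -1):               # remaining bits, high to low
        if (e >> i) & 1:
            chain.append(chain[-1] + (1 << i))   # star step: add the power 2**i
    return chain


if __name__ == "__main__":
    print(binary_chain_doubling_first(13))
```

The chain length is the same as for the interleaved form of BM; only the order of the steps changes, so every addend is already available on the run of doublings.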
The Simplified Powertree method (SPTM) is proposed by subtly deleting tree nodes, which results in relatively small time and space complexity. A simplified powertree consists of root chain, main chain and branch chains. The structure of the root chain is BM() where is a parameter, and the main chain is . For each node in the main chain, a branch chain follows as . The structure of simplified powertree is shown in Fig. 2.
Based on the simplified powertree, the steps of constructing the addition chain of e are as follows:
(1) Obtain an addition chain of e as BM(e) and record it.
(2) Search the branch chains and update the recorded addition chain whenever a shorter addition chain is found.
(3) Output the recorded addition chain.
More specifically in step (2), for the branch chain followed , the corresponding addition chain of e is directly obtained if e is on the branch chain, else form an initial chain as BM() and do the “backwardadding”: whenever the newest integer in current chain adding the node backward in the initial chain is less than e, do the adding and append the adding result in current chain. Each branch chain can get an addition chain of e, as proved in Theorem 1. Then search all chains and update the recorded addition chain if get a shorter addition chain. The implementation of SPTM is shown in Alg. 4.
SPTM is exactly efficient. For any given positive integer , because the main chain and branch chains mainly contains doubling steps, their lengths are approximately the bitlength of , as . Therefore, the time complexity of SPTM is . The recorded addition chain is constantly updated so that the space complexity is .
Theorem 1: For any given positive integer, SPTM can produce an addition chain.
Proof: For the main chain, an addition chain of e is generated by BM. For the branch chain followed , let , where , according to division with remainder. In fact, it has because . Note . The branch chain contains the construction of , which can obtain based on BM. That is, according to the addition chain BM, we obtain BM. The main chain and the root chain contain the construction of , which can construct any integer from 1 to by BM, including by BM(). As a result, using the “backwardadding” , each branch chain can generate an addition chain of e. The proof is complete.
Theorem 2: Let be the length of addition chain obtained by SPTM. The range of is
(3) 
Proof: In the worst case, all the branch chains can not get a shorter chain than the main chain. The method degenerates to BM, and the length is .
In the best case, all 1s in the binary form of are divided into identical form of (note as ), and can be factorized into . For the first window w, the length is . For the other windows, do times star step and times doubling step. Thus the addition chain length is , equality holds if and only if . The proof is complete.
3.2 Cross Window Method
The Window method (WM) only considers adjacent correlation, which means that the windows are divided sequentially. In practice, there are windows with cross correlation, which means there is a cross relationship between the windows. Using windows with cross correlation may achieve better results. For example, the integer in Fig. 3 and Fig. 4 needs only 2 star steps using cross windows, which is fewer than with adjacent windows.
In this paper, a Cross Window method (CWM) is proposed to deal with the cross correlation. CWM has two parameters: the valid window length and the interval expansion length. Like WM, CWM has two parts: precomputation and construction. In precomputation, the valid length is divided into two parts: the length of the right part and the length of the left part.
When the interval expands, as , the precomputation of CWM are constructed by inserting the interval expansion numbers between the left and right parts of the precomputation of WM. The general structure is , which is performed specifically as follows:
(1) Get all odd numbers from 1 to , and 2.
(2) Get the interval expansion numbers, which are
(3) Combine all numbers from 1 to with the interval expansion and the numbers in step (1).
Finally, the precomputation () is {}, as shown by binary form in Fig. 5.
The lengths in step (1),(2) and (3) are and . The total length is .
When the interval expansion is not carried out, as , CWM degenerates to WM. In this case, step (2) should be removed. Thus the precomputation () is {}, with length , same as WM. It is unnecessary to divide the valid length . For consistency, let .
In CWM, for the binary form of e, read bits and remove its tail 0s as window w. If , the interval expansion position of the window is set to 0s. If , reset as its first R bits with removing the tail 0s. If , do nothing. As a result, w is in the precomputation. Then a highestbitaligned subtraction (note as ) is performed. That is, is to align the highest nonzero bit of with the highest nonzero bit of and execute a subtraction. A concrete example is listed by binary form in Fig. 6.
For , the first window is processed as , then do . Repeat this for and . Finally, is zero. From the above example, it is easy to find that is embedded in as . As a result, it is impossible to slide and add continuously as in WM. To solve this problem, we record all the windows at their corresponding locations and construct the addition chain from the recorded windows. That is, do doubling steps bit by bit from the first recorded window and add the window at each recorded position. The implementation of CWM is shown in Alg. 5.
Theorem 3: Let be the length of addition chain obtained by CWM. The range of is
(4) 
Proof: In the worst case, the position of 1s can not form any window with length longer than 1, which degenerates to BM, and the length is .
In the best case, like SPTM, all 1s are divided into several identical windows (note as ), and the addition chain length is , equality holds if and only if . The proof is complete.
Theorem 4: In general, let the number of recorded windows be v, the length of precomputation be where = 0 if else , and the first window be , the addition chain length obtained by CWM is
(5) 
Proof: In CWM, we first construct the precomputation. The length of precomputation is if otherwise is , i.e. where = 0 if otherwise . Then perform times doubling step repeatedly and times star step for recorded windows except the first window. Thus the addition chain length obtained by CWM is as given in (5). The proof is complete.
3.3 Cross Window Method with Addition Sequence Algorithm
The precomputation of CWM can be optimized, since some integers in the precomputation may never be used as a window. In this paper, a new Addition Sequence algorithm (ASA) is presented to construct a short precomputation that covers only the used windows. ASA targets the addition sequence problem, i.e. finding the shortest addition chain that contains several given integers, which is NP-complete and even more difficult than ACP. Fortunately, the problem is tractable in CWM, since only the precomputation is involved and it contains small integers. When ASA yields a shorter precomputation, we can also use a larger valid window length and interval expansion length, which may produce a shorter addition chain.
Now we give a pragmatic ASA, which can find an addition chain containing all the used windows quickly. For an increasing order sequence , note the last two numbers as and let , according to division with remainder. For , we get BM and put it in by increasing order. We put in by increasing order if it is not in and is nonzero. Thus the addition chain from to is formed. Repeat the above steps for the following two numbers in reverse order until all integers in are solved. Finally, an addition chain containing is obtained. The implementation of ASA is shown in Alg. 6.
When the result of ASA is shorter than the original precomputation in CWM, the original precomputation will be replaced. CWM with ASA (CWMASA) is implemented in Alg. 7.
Theorem 5: Let the optimal addition chain length of the first window is , and the optimized length compared with BM is . Let be the length of addition chain obtained by CWMASA. The range of is
(6) 
Proof: In the worst case, the position of 1s can not form any window with length longer than 1, which degenerates to BM, and the length is .
In the best case, same as CWM, all 1s are divided into several identical windows (note as ), and addition chain length is . If ASA is used, the addition chain length is , equality holds if and only if . The proof is complete.
Theorem 6: In general, let the number of recorded windows be v, the length of precomputation be u, and the first window be , the addition chain length obtained by CWMASA is
(7) 
Proof: In CWMASA, we first construct the precomputation with length and then perform times doubling step repeatedly and times star step for recorded windows except the first window. Thus the addition chain length obtained by CWMASA is The proof is complete.
4 Numerical Results
In this section, the performance of WM, SPTM, CWM and CWMASA are compared. We firstly show the results of four methods on small integers with . Then a general case is conducted with the integers generated randomly with different Hamming weight. Moreover, the integers of effective types of SPTM are exhibited to indicate the irreplaceable advantages of SPTM in some cases. The parameters are selected as: WM: ; SPTM: and is odd; CWM: ; CWMASA: . The final result for an integer of a method is the shortest addition chain length within the parameter range.
4.1 The Integers with
For 365634 positive integers with [Cli], the results of WM, SPTM, CWM and CWMASA are shown in Table 1.

Gap   WM                  SPTM                CWM                 CWMASA
      count   proportion  count   proportion  count   proportion  count   proportion
0     46193   0.126337    68586   0.187581    86565   0.236753    228805  0.625776
1     150463  0.411513    187621  0.513139    185261  0.506684    131440  0.359485
2     127654  0.349131    100286  0.274280    83943   0.229582    5379    0.014711
3     37075   0.101399    9051    0.024754    9597    0.026248    10      0.000027
4     4231    0.011572    90      0.000246    267     0.000730    0       0.000000
5     18      0.000049    0       0.000000    1       0.000003    0       0.000000
AVG   1.460505            1.136945            1.047527            0.388988
In this range, the first row shows that WM, SPTM, CWM and CWMASA find an optimal chain for 12.6%, 18.8%, 23.7% and 62.6% of the integers respectively, and the last row shows that the average gap to the shortest chain is 1.46, 1.14, 1.05 and 0.39 respectively. The results of SPTM, CWM and CWMASA are better than those of WM and are more concentrated on the smaller gaps. CWMASA has the best results: its optimal and suboptimal (gap of 1 to the shortest) results together account for 98.5%.
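The summary figures quoted above follow directly from the counts in Table 1. The short Python check below recomputes the average gap and the share of results within a gap of 1; the counts are copied from the table, while the helper names `average_gap` and `share_within` are illustrative.

```python
# Counts per gap value (0..5), copied from Table 1.
COUNTS = {
    "WM":     [46193, 150463, 127654, 37075, 4231, 18],
    "SPTM":   [68586, 187621, 100286, 9051, 90, 0],
    "CWM":    [86565, 185261, 83943, 9597, 267, 1],
    "CWMASA": [228805, 131440, 5379, 10, 0, 0],
}


def average_gap(counts):
    """Count-weighted mean of the gap to the shortest chain."""
    return sum(gap * n for gap, n in enumerate(counts)) / sum(counts)


def share_within(counts, max_gap):
    """Fraction of integers whose gap is at most max_gap."""
    return sum(counts[: max_gap + 1]) / sum(counts)


if __name__ == "__main__":
    for method, counts in COUNTS.items():
        print(method, sum(counts),
              round(average_gap(counts), 4),
              round(share_within(counts, 1), 4))
```

Each column sums to 365634, and the recomputed averages agree with the AVG row to within 1e-5; the small residual is consistent with the AVG row having been derived from the rounded proportions.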
4.2 The Integers Generated Randomly with Different Hamming Weight
We control the Hamming weight through the probability with which the bit 1 occurs. Select the bit length as 160, 384, 512, 1024, 2048, 4096 and this probability as 0.1, 0.2, 0.4, 0.5, 0.6, 0.8, 0.9. Generate 50 integers for each combination. The performance of all methods is shown in Fig. 7.
With the increase of the bit length, the gap between the results of these methods increases. The results obtained by SPTM are worse than those of WM, while the results obtained by CWM and CWMASA are better than those of WM. We give more details in Table 2 and Fig. 8. The average results by bit length of the above-mentioned test are shown in Table 2.
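One plausible way to draw such test integers is sketched below in Python. It assumes that the most significant bit is forced to 1, so that the bit length is exact, and that the remaining bits are independent; the paper does not spell out its generator, so the function name `random_integer`, the parameter `p` and the seeding are assumptions.

```python
import random


def random_integer(bit_length, p, rng=None):
    """Integer with exactly `bit_length` bits whose lower bits are 1 with probability p."""
    rng = rng or random.Random()
    bits = ["1"]  # assumption: the top bit is always set
    bits += ["1" if rng.random() < p else "0" for _ in range(bit_length - 1)]
    return int("".join(bits), 2)


if __name__ == "__main__":
    rng = random.Random(2024)  # fixed seed only for reproducibility of the demo
    for p in (0.1, 0.5, 0.9):
        e = random_integer(160, p, rng)
        print(p, e.bit_length(), bin(e).count("1"))
```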
Len/bits  WM  SPTM  CWM  CWMASA 

160  192.37  210.15  191.93  187.89 
384  452.64  513.37  452.14  445.27 
512  600.18  688.10  599.64  590.71 
1024  1182.50  1388.86  1181.94  1166.94 
2048  2335.35  2793.60  2334.88  2307.24 
4096  4620.42  5610.92  4619.92  4567.07 
In the random case, for each bit length, the chains obtained by SPTM are longer than those of WM, which shows that SPTM is not effective in this case. The results obtained by CWM and CWMASA are better than those of WM, and the chains obtained by CWMASA are the shortest. Fig. 8 shows the chain length optimization degree of CWMASA compared with WM.
As for , when , the optimization degree declines with the increasing of , and the overall optimization degree is relatively low; when , the optimization degree is the lowest; when , the optimization degree increases with the increasing of , and the overall optimization degree is relatively high.
As the bit length increases, the optimization degree of CWMASA declines. This is because a longer integer unavoidably brings in correspondingly more doubling steps, so the overall baseline becomes larger.
In addition, the numbers with larger Hamming weight () are tested, as shown in Table 3. For , the average optimization degree of CWMASA is 7.89% compared with WM. When the length is 160 bits, the average length of the addition chain obtained by WM is 202.06, while CWMASA is 182.84, and the optimization degree reaches 9.51%.
Len/bits  WM  CWMASA  Optimization degree 

160  202.06  182.84  9.51% 
384  470.42  430.36  8.52% 
512  622.04  569.98  8.37% 
1024  1218.78  1126.4  7.58% 
2048  2394.72  2230.84  6.84% 
4096  4724.02  4414.82  6.55% 
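The last column of Table 3 can be reproduced from the two length columns. The Python check below assumes that the optimization degree is the relative reduction (WM - CWMASA) / WM, which is our reading of the term and matches the tabulated values; the data are copied from Table 3 and the helper name is illustrative.

```python
# (WM average length, CWMASA average length) per bit length, copied from Table 3.
TABLE3 = {
    160: (202.06, 182.84),
    384: (470.42, 430.36),
    512: (622.04, 569.98),
    1024: (1218.78, 1126.4),
    2048: (2394.72, 2230.84),
    4096: (4724.02, 4414.82),
}


def optimization_degree(wm, cwmasa):
    """Relative reduction of the chain length, in percent (assumed definition)."""
    return 100.0 * (wm - cwmasa) / wm


if __name__ == "__main__":
    degrees = {n: optimization_degree(*pair) for n, pair in TABLE3.items()}
    for n, d in degrees.items():
        print(n, f"{d:.2f}%")
    print("mean", f"{sum(degrees.values()) / len(degrees):.2f}%")
```

Under this definition the mean over the six bit lengths comes out at about 7.89%, in line with the average quoted in the text.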
4.3 The Integers of Effective Types for SPTM
SPTM is effective for integers whose windows consist of a highest bit followed by a long series of 0s and then the rest of the window. In this case, the window is so long that the precomputations of WM and CWM become overwhelming. Because SPTM does not use a precomputation, its result is better.
The generation rules of test integers are as follows:
(1) Randomly select bitlength and larger interval expansion length .
(2) The integers of bits is generated randomly, and then one bit 1 and 0s are set ahead to form a window, and the window is copied to k + 1 copies.
(3) The positions of these windows are randomly generated, and the position distance among the windows is not less than k. Then an integer is obtained from these windows with removing the tail 0s.
The bitlength is selected as 160, 384, 512, 1024, 2048, 4096, 50 integers in each length and a total of 300. The average test results are shown in Table 4.
Len/bits  WM  SPTM  CWM  CWMASA 

160  129.56  127.12  129.24  128.16 
384  350.08  340.64  349.00  345.88 
512  467.66  457.86  466.38  463.66 
1024  810.54  800.52  809.36  806.34 
2048  1653.44  1642.52  1651.88  1648.28 
4096  3006.00  2995.52  3004.72  3001.18 
In this case, the results obtained by SPTM are the best, and the results obtained by CWM and CWMASA are also better than those obtained by WM. This shows that although SPTM is not suitable for random integers, it achieves the best results among these methods when the windows have a highest bit followed by a considerable number of 0s.
5 Conclusion
In this paper, we proposed a Simplified Powertree method and a Cross Window method with a new Addition Sequence algorithm. The Simplified Powertree method constructs a powertree with deep deletion. It is more suitable when the windows have the highest bit followed by a considerable number of 0s. The Cross Window method considers windows with a cross relationship. The cross windows are processed by recording the window positions for recovery. Furthermore, the precomputation is optimized with the Addition Sequence algorithm. The Cross Window method is slightly better than the Window method, and the Cross Window method with Addition Sequence algorithm achieves a larger improvement, especially in the case of large Hamming weight. Roughly speaking, the average optimization degree is 7-8%, and the best case is 9-10%.
Acknowledgment
This work was supported by the Natural Science Foundation of Beijing Municipality (No. 4202037) and the NSF of China (No. 61972018).
References

[BC89] Jurjen N. Bos and Matthijs J. Coster. Addition chain heuristics. In Gilles Brassard, editor, Advances in Cryptology - CRYPTO ’89, 9th Annual International Cryptology Conference, Santa Barbara, California, USA, August 20-24, 1989, Proceedings, volume 435 of Lecture Notes in Computer Science, pages 400–407. Springer, 1989.
[Bra39] Alfred Brauer. On addition chains. Bulletin of the American Mathematical Society, 45(10):736–739, 1939.
[Cli] N. Clift. Shortest addition chains. http://wwwhomes.uni-bielefeld.de/achim/addition_chain.html.
[CRC08] Nareli Cruz Cortés, Francisco Rodríguez-Henríquez, and Carlos A. Coello Coello. An artificial immune system heuristic for generating short addition chains. IEEE Trans. Evol. Comput., 12(1):1–24, 2008.
[CRJC05] Nareli Cruz Cortés, Francisco Rodríguez-Henríquez, Raúl Juárez-Morales, and Carlos A. Coello Coello. Finding optimal addition chains using a genetic algorithm approach. In Computational Intelligence and Security, International Conference, CIS 2005, Xi’an, China, December 15-19, 2005, Proceedings, Part I, volume 3801 of Lecture Notes in Computer Science, pages 208–215. Springer, 2005.
[DMO15] Saúl Domínguez-Isidro, Efrén Mezura-Montes, and Luis Guillermo Osorio-Hernández. Evolutionary programming for the length minimization of addition chains. Eng. Appl. Artif. Intell., 37:125–134, 2015.
[Knu14] Donald E. Knuth. Art of computer programming, volume 2: Seminumerical algorithms. Addison-Wesley Professional, 2014.
[Koç95] Cetin K. Koç. Analysis of sliding window techniques for exponentiation. Computers & Mathematics with Applications, 30(10):17–24, 1995.
[KY98] Noboru Kunihiro and Hirosuke Yamamoto. Window and extended window methods for addition chain and addition-subtraction chain. IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, 81(1):72–81, 1998.
[KY00] Noboru Kunihiro and Hirosuke Yamamoto. New methods for generating short addition chains. IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, 83(1):60–67, 2000.
[NMMZ17] Adamu M. Noma, Abdullah Muhammed, Mohamad Afendee Mohamed, and Z. Ahmad Zulkarnain. A review on heuristics for addition chain problem: Towards efficient public key cryptosystems. J. Comput. Sci., 13(8):275–289, 2017.