
# Post-quantum hash functions using SL_n(𝔽_p)

We define new families of Tillich-Zémor hash functions, using higher dimensional special linear groups over finite fields as platforms. The Cayley graphs of these groups combine fast mixing properties and high girth, which together give rise to good preimage and collision resistance of the corresponding hash functions. We justify the claim that the resulting hash functions are post-quantum secure.


## 1 Introduction

Hash functions obtained from families of expander graphs were introduced by Charles-Lauter-Goren in [charles2008cryptographic], where they were in turn inspired by a scheme of Tillich-Zémor [Tillich1994Algebraic]. Charles-Lauter-Goren considered specific families of expander graphs discovered by Lubotzky-Phillips-Sarnak [Lubotzky1988Ramanujan] and Pizer [Pizer1990Ramanujan]. The Charles-Lauter-Goren construction is quite general, and can be applied to any expander family in which finding cycles is hard, and thereby furnishes collision resistant hash functions. Similar schemes have been proposed by several authors; see [Shpilrain2016Compositions, Bromberg2017Navigating, Ghaffari2018More], and [Sosnovski2018] for a survey on this topic.

Interest in hash functions based on novel platforms fits into the context of recent efforts to modernize existing hash functions, and to adapt them to the design and security of hash-based consensus mechanisms, most notably with respect to blockchains [Borthwick2020], and especially in light of the recently proved practicality of finding collisions in the SHA-1 hashing algorithm [Stevens2017First].

The general idea behind Tillich-Zémor hash functions is the following. Fixing a base vertex, the input of the hash function is interpreted as a sequence of instructions, resulting in a non-backtracking path in a $d$-regular graph. The output of the hash function is the endpoint vertex of the path. More precisely, the input is a string of numbers in

$$[d-1] := \{1, 2, \ldots, d-1\}$$

of arbitrary length, and the output is the vertex obtained by performing a simple walk starting at a base vertex, using the elements of the input string as transition data for subsequent steps in the walk. See Definition 2.3 below for details.

A well-constructed hash function is an efficiently computable function which enjoys two main features. The first is preimage resistance, which means that given a point in the hash space, it is computationally hard to find an input that maps to that hash value. The second is collision resistance, which requires the problem of finding distinct inputs with the same output to be computationally difficult.

The main goal of this paper is to propose a new scheme of Charles-Lauter-Goren type, where the hash functions use Cayley graphs of the special linear groups $\mathrm{SL}_n(\mathbb{F}_p)$ as platforms, where $p$ is prime and $n \geq 3$. A crucial observation is that, in schemes using these groups as a platform, the problem of finding a preimage or a collision corresponds to finding factorizations of the identity matrix with prescribed factors. With this observation in hand, and by taking into account recent work of Arzhantseva-Biswas [Arzhantseva2018Large] concerning the expanding properties of the Cayley graphs of these groups, we offer a detailed study of the security of our protocol. In particular, we have the following:

• Preimage resistance. In Proposition 2.5, we collect the expansion properties of the family of Cayley graphs of the groups $\mathrm{SL}_n(\mathbb{F}_p)$, where $n$ is fixed and where $p$ tends to infinity. Expansion in this family of graphs guarantees good mixing properties, because under these conditions the random walk gives a good approximation to the uniform distribution after $O(\log p)$ steps. In particular, the likelihood of success of an indistinguishability attack is strictly bounded, and tends to $1/2$ as the order of the group tends to infinity; see Proposition 3.3.

• Collision resistance. The strength of our hash function with respect to collision resistance is mainly based on the absence of small cycles in the Cayley graphs of the underlying groups. In fact, Proposition 3.1 provides a lower bound on the girth of the graphs on the order of $\log p$. It follows that finding a factorization of the identity, which is easily seen to be equivalent to finding a collision for the hash function, is in turn equivalent to solving a system of $n^2$ equations, in a number of variables that is on the order of $\log p$, over the field $\mathbb{F}_p$. In full generality, solving such systems of equations is NP-hard.

Replacing the problem of factoring the identity with the problem of factoring an arbitrary group element yields a similar system of equations, lending further evidence of resistance of the hash function to finding preimages; see Section 3.2.

For $n = 2$, certain Cayley graphs of the corresponding groups give rise to the celebrated Lubotzky–Phillips–Sarnak expander graphs [Lubotzky1988Ramanujan], which were then used to build hash functions in [charles2008cryptographic]. A successful collision attack (i.e. an efficient computation of a collision) was found in [TillichZemor08Collision], by taking coefficients in $\mathbb{Z}[i]$ and then reducing to a system of equations of degree two. The essentially different nature of higher dimensional special linear groups gives evidence of additional security, and makes it likely that these attacks are far more difficult to execute for the hash functions proposed here.

Considering symmetric generating sets enables us to employ results from the theory of simple random walks in simplicial graphs. Nevertheless, the fact that we restrict ourselves to non-backtracking random walks precludes the use of multiplicativity of the hash function, and thus complicates parallel computation. We discuss these issues in Section 3.4.

##### Structure of the paper:

In Section 2 we give the relevant group theoretic background, define the hash functions, prove the expansion property of the Cayley graphs, and exhibit concrete examples. In Section 3, we describe the various properties of our scheme: we relate free generation with a lower bound on the girth; we then describe the role of polynomial equations in preimage finding and in collision resistance, as well as the role of mixing in indistinguishability attacks. Finally, we discuss multiplicativity and parallel computing, and argue that the collision attack from Grassl et al. [grassl2011cryptanalysis] using palindromes does not break the scheme presented here. Section 4 concludes the paper.

## 2 Definition of the hash functions

This section defines our hash functions and exhibits concrete instances. We start by recalling some relevant background material which will be essential in our construction and in the sequel.

### 2.1 Background about special linear groups

For general results about special linear groups over finite fields, we refer the reader to Hall’s book [Hall2015]. In this section we concentrate on a number of properties established by Arzhantseva-Biswas in their article [Arzhantseva2018Large]. We summarize their results in the following proposition:

###### Proposition 2.1.

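Let $n \geq 3$ and let $p$ be a prime. Write $\pi_p \colon \mathrm{SL}_n(\mathbb{Z}) \to \mathrm{SL}_n(\mathbb{F}_p)$ for the canonical projection given by reduction modulo $p$. There exist matrices $A, B \in \mathrm{SL}_n(\mathbb{Z})$ such that:

1. There exists a prime $p_0$ such that for all $p > p_0$, the matrices $\pi_p(A)$ and $\pi_p(B)$ generate $\mathrm{SL}_n(\mathbb{F}_p)$.

2. If $\Gamma$ is the subgroup generated by $A$ and $B$ inside of $\mathrm{SL}_n(\mathbb{Z})$, then $\Gamma$ is isomorphic to $F_2$, the free group of rank two.

3. The diameter of the Cayley graph of $\mathrm{SL}_n(\mathbb{F}_p)$ with respect to $\{\pi_p(A)^{\pm 1}, \pi_p(B)^{\pm 1}\}$ is $O(\log p)$.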

Observe that for $n \geq 3$, items 1 and 2 reflect the fact that the subgroup of $\mathrm{SL}_n(\mathbb{Z})$ generated by $A$ and $B$ is usually a thin subgroup of $\mathrm{SL}_n(\mathbb{Z})$. The fact that $A$ and $B$ generate the corresponding finite quotients for all but finitely many values of $p$ is a reflection of strong/superstrong approximation. In turn, item 3 implies that the girth of these Cayley graphs is optimal, subject to the condition that the graphs form a family of expander graphs (which will indeed be the case for us; see Proposition 2.5 below).

###### Remark 2.2.

When using the Cayley graphs as a platform, we think of as being fixed and as modulating the level of security, with the trade-off being that the hash functions become more expensive to compute for large .

Possible choices for and are given by:

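$$\tilde{A} = \begin{pmatrix} 1 & a & 0 & 0 & \cdots & 0 \\ 0 & 1 & a & 0 & \cdots & 0 \\ 0 & 0 & 1 & a & \cdots & 0 \\ \vdots & \vdots & & \ddots & \ddots & a \\ 0 & 0 & 0 & 0 & \cdots & 1 \end{pmatrix}, \qquad \tilde{B} = \begin{pmatrix} 1 & 0 & 0 & \cdots & 0 & 0 \\ b & 1 & 0 & \cdots & 0 & 0 \\ 0 & b & 1 & \cdots & 0 & 0 \\ 0 & 0 & b & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & & \vdots \\ 0 & 0 & 0 & \cdots & b & 1 \end{pmatrix} \in \mathrm{SL}_n(\mathbb{F}_p),$$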

with . These matrices will be crucial in the description of our hash function.

### 2.2 The general construction

We now use matrices given in [Arzhantseva2018Large] to define an explicit family of hash functions.

###### Definition 2.3 (Special linear group based hash functions).

Let and let be a prime number. Let that satisfy:

• If , , and for some positive integer .

• If , there exists a prime such that and is at least and is of the form for some integer .

Consider the matrices and from the previous section. In the following we will denote and .

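We use the notation $[m]$ to denote the set of integers $\{1, \ldots, m\}$, and $[m]^*$ to denote the set of finite strings of integers in $[m]$. We now define the hash function $\varphi$. We start by choosing bijections

$$s \colon [4] \to \{A^{\pm 1}, B^{\pm 1}\}, \qquad s_\lambda \colon [3] \to \{A^{\pm 1}, B^{\pm 1}\} \setminus s(\lambda)$$

for each $\lambda \in [4]$.

Then, given

$$x = (x_i)_{1 \leq i \leq k} \in [3]^k,$$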

we have the following inductive definition:

• ,

• with , for .

Finally, we set $\varphi(x) = B_1 \cdots B_k$.

###### Remark 2.4.

Note that the Cayley graph of $\mathrm{SL}_n(\mathbb{F}_p)$ with respect to $\{A^{\pm 1}, B^{\pm 1}\}$ is $4$-regular, so that after the first bit of the input $x$, there are exactly three non-backtracking edges in the graph by which to proceed. The input can thus be viewed as encoding a reduced word in the free group $F_2$. The lack of backtracking in the resulting walk on the Cayley graph is crucial for the avoidance of collisions, as well as for the reduction of mixing time.

As stated in [Arzhantseva2018Large], the elements $A$ and $B$ generate a free subgroup of $\mathrm{SL}_n(\mathbb{Z})$ and generate $\mathrm{SL}_n(\mathbb{F}_p)$ for all but finitely many values of $p$, and these facts give rise to strong preimage and collision resistance of the resulting hash functions.
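The construction above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the transition rule below (steps ordered as $A, A^{-1}, B, B^{-1}$, with the input symbol selecting among the three steps that do not undo the previous one) is a hypothetical stand-in for the bijections $s$ and $s_\lambda$, the fictitious initial step and the prime $2^{31}-1$ are arbitrary choices, and the parameters $a = 4$, $b = 2$ with fourth powers are borrowed from the concrete example of Section 2.4.

```python
from typing import List, Sequence, Tuple

Matrix = Tuple[Tuple[int, ...], ...]


def identity(n: int) -> Matrix:
    return tuple(tuple(int(i == j) for j in range(n)) for i in range(n))


def mat_mul(X: Matrix, Y: Matrix, p: int) -> Matrix:
    n = len(X)
    return tuple(
        tuple(sum(X[i][k] * Y[k][j] for k in range(n)) % p for j in range(n))
        for i in range(n)
    )


def mat_pow(X: Matrix, e: int, p: int) -> Matrix:
    # Square-and-multiply exponentiation modulo p.
    result, base = identity(len(X)), X
    while e > 0:
        if e & 1:
            result = mat_mul(result, base, p)
        base = mat_mul(base, base, p)
        e >>= 1
    return result


def bidiagonal(n: int, value: int, upper: bool, p: int) -> Matrix:
    # Ones on the diagonal, `value` on the super- or sub-diagonal.
    off = 1 if upper else -1
    return tuple(
        tuple(1 if j == i else (value % p if j == i + off else 0) for j in range(n))
        for i in range(n)
    )


def unipotent_inverse(U: Matrix, p: int) -> Matrix:
    # A unipotent n x n matrix over F_p satisfies U^q = I for any power q of p
    # with q >= n, so its inverse is U^(q - 1).
    q = p
    while q < len(U):
        q *= p
    return mat_pow(U, q - 1, p)


def generators(n: int, p: int, a: int = 4, b: int = 2, ell: int = 4) -> List[Matrix]:
    # Steps in the (assumed) order A, A^-1, B, B^-1; index g ^ 1 is the inverse of g.
    A = mat_pow(bidiagonal(n, a, True, p), ell, p)
    B = mat_pow(bidiagonal(n, b, False, p), ell, p)
    return [A, unipotent_inverse(A, p), B, unipotent_inverse(B, p)]


def step_indices(message: Sequence[int], initial_step: int = 0) -> List[int]:
    # Hypothetical transition rule: symbol x in {1, 2, 3} picks the x-th step
    # among those that do not cancel the previous step.
    prev, out = initial_step, []
    for x in message:
        if x not in (1, 2, 3):
            raise ValueError("symbols must lie in {1, 2, 3}")
        allowed = [g for g in range(4) if g != prev ^ 1]
        prev = allowed[x - 1]
        out.append(prev)
    return out


def sl_hash(message: Sequence[int], n: int = 3, p: int = 2**31 - 1,
            initial_step: int = 0) -> Matrix:
    steps = generators(n, p)
    value = identity(n)
    for g in step_indices(message, initial_step):
        value = mat_mul(value, steps[g], p)
    return value
```

For example, `sl_hash([1, 3, 2, 2])` returns a $3 \times 3$ matrix over $\mathbb{F}_p$ as a tuple of rows. Separating `step_indices` from the matrix products keeps the non-backtracking rule visible on its own.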

### 2.3 Expansion

For us to implement the Charles-Lauter-Goren approach, we must take advantage of the good mixing properties of the expander graphs.

###### Proposition 2.5.

For fixed and , the sequence is a family of expander graphs.

###### Sketch of proof.

By item 1 of Proposition 2.1, there exists a prime such that for all , the matrices and generate . Now, since has property (T) for [bekka2008kazhdan], we have that has property with respect to the family of congruence subgroups; see [Lubotzky2005What] for more details. ∎

An immediate consequence of this proposition is that the random walk approximates the uniform distribution after $O(\log p)$ steps in the corresponding graph. As we will elaborate in Section 3.3, this result is relevant for the purpose of analyzing preimage resistance to indistinguishability attacks; see Proposition 3.3, in particular. We note that in [Bromberg2017Navigating], random walks are conducted on Cayley graphs with respect to non-symmetric generating sets, and thus their asymptotic behavior is less clear. Similar issues arise in [Tomkins2020New], since then hash values could be restricted to a proper subgroup. As stated in [Arzhantseva2018Large], we note that one can effectively compute the lower bound $p_0$. No explicit bound on $p_0$ has been given, though by combining existing results one can probably prove that $p_0$ need not be very large, likely on the order of magnitude of ; see for instance [Golsefidy2012Expansion, Appendix] and [Guralnick1999Small, Theorem D]. Note that the larger the value of the prime $p$, the more secure the hash function.

### 2.4 A concrete example

We finish this section by describing a family of concrete examples of hash functions, which are constructed for the specific values , and . We do not know what minimal value of would ensure security.

###### Definition 2.6.

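Let $p$ be a prime, and let

$$A = \begin{pmatrix} 1 & 4 & 0 & 0 & \cdots & 0 \\ 0 & 1 & 4 & 0 & \cdots & 0 \\ 0 & 0 & 1 & 4 & \cdots & 0 \\ \vdots & \vdots & & \ddots & \ddots & 4 \\ 0 & 0 & 0 & 0 & \cdots & 1 \end{pmatrix}^{4}, \qquad B = \begin{pmatrix} 1 & 0 & 0 & \cdots & 0 & 0 \\ 2 & 1 & 0 & \cdots & 0 & 0 \\ 0 & 2 & 1 & \cdots & 0 & 0 \\ 0 & 0 & 2 & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & & \vdots \\ 0 & 0 & 0 & \cdots & 2 & 1 \end{pmatrix}^{4} \in \mathrm{SL}_n(\mathbb{F}_p).$$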

Let , , , . We define the functions as follows:

• , , ,

• , , ,

• , , ,

• , , ,

Given an input sequence , we inductively define:

• , with , for each .

Then, the sequence is hashed to the matrix:

$$\varphi(x) = B_1 \cdots B_k.$$

Thus, we obtain a hash function for every .

## 3 Properties of the constructed hash functions

In this section we use graph and group-theoretic machinery to describe the security of the hash functions defined above. We center our analysis on resistance to preimage and collision breaking. The exposition is divided into three parts: first, we establish a lower bound on the girth of the Cayley graphs of the group $\mathrm{SL}_n(\mathbb{F}_p)$ with respect to the generating system $\{A^{\pm 1}, B^{\pm 1}\}$; second, we describe the consequences of girth bounds for collision resistance; third, we investigate resistance of the protocol against an indistinguishability attack.

### 3.1 Free groups and girth

The following proposition is in the spirit of [Bromberg2017Navigating].

###### Proposition 3.1.

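Let $A, B \in \mathrm{SL}_n(\mathbb{Z})$ be such that the entries of $A$ and $B$ are bounded in absolute value by a positive constant $c$. If $A$ and $B$ generate a free subgroup of $\mathrm{SL}_n(\mathbb{Z})$, then the girth of the Cayley graph of $\mathrm{SL}_n(\mathbb{F}_p)$ with respect to $\{A^{\pm 1}, B^{\pm 1}\}$ is at least

$$\left\lfloor \frac{\log(p-1)}{\log(nc)} \right\rfloor.$$

###### Proof.

For any reduced word $w$ in $A$ and $B$, we write $w_{\mathbb{Z}}$ (resp. $w_p$) for the projection of $w$ to $\mathrm{SL}_n(\mathbb{Z})$ (resp. $\mathrm{SL}_n(\mathbb{F}_p)$). It follows by a straightforward induction on $k$ that, if $w$ has length $k$, then the entries of $w_{\mathbb{Z}}$ cannot exceed $(nc)^k$ in absolute value. Now, let $g$ be the girth of the corresponding Cayley graph. Then, there exists a nontrivial reduced word $w$ of length $g$ such that $w_p$ is the identity. It follows that $w_{\mathbb{Z}}$ is of the form $I + pM$, where $M$ is an integer matrix. Since $w$ is nontrivial and since $A$ and $B$ generate a rank two free subgroup of $\mathrm{SL}_n(\mathbb{Z})$, the matrix $M$ is nonzero. We conclude that $w_{\mathbb{Z}}$ has an entry of absolute value at least $p - 1$. Since the entries of $w_{\mathbb{Z}}$ cannot exceed $(nc)^g$ in absolute value, we have $(nc)^g \geq p - 1$, so the length $g$ of $w$ is bounded below by $\log(p-1)/\log(nc)$, the desired conclusion. ∎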
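To get a feeling for the size of the bound in Proposition 3.1, it can be evaluated exactly with integer arithmetic, avoiding floating-point logarithms. The function name and the sample parameters below are illustrative assumptions and are not taken from the paper.

```python
def girth_lower_bound(p: int, n: int, c: int) -> int:
    """Return floor(log(p - 1) / log(n * c)), computed with integers only.

    This is the largest k such that (n * c) ** k <= p - 1.
    """
    base = n * c
    if p < 2 or base < 2:
        raise ValueError("need p >= 2 and n * c >= 2")
    k, power = 0, base
    while power <= p - 1:
        k += 1
        power *= base
    return k


# Illustrative values only: a 127-bit prime, 3 x 3 matrices, entries bounded by 160.
bound = girth_lower_bound(2**127 - 1, 3, 160)
```

Since the bound grows like $\log p$ divided by $\log(nc)$, doubling the bit length of $p$ roughly doubles the guaranteed girth, while larger matrix entries reduce it.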

### 3.2 Preimage and collision resistance, and post-quantum heuristics

We now analyze the resistance of our model to finding preimages and to collisions. Observe that finding a preimage of a particular hash value (resp. finding a collision of hash values) is equivalent to finding a factorization of a given group element (resp. of the identity) in $\mathrm{SL}_n(\mathbb{F}_p)$ with respect to the generating set.

The matrices involved have order $p$, so a factorization can be seen as a family of equations with variables

$$k_1, \ldots, k_m, \ell_1, \ldots, \ell_m \in \mathbb{F}_p$$

satisfying:

$$A^{k_1} B^{\ell_1} \cdots A^{k_m} B^{\ell_m} = M,$$

for a given challenge $M$. Note that there are trivial solutions to preimage and collision breaking of the hash function, since $A^p$ is the identity. Since the girth of the Cayley graph is at least on the order of $\log p$, we consider nontrivial solutions to preimage or collision breaking to be ones where

$$C_1 \log p \leq \sum_{i=1}^{m} (k_i + \ell_i) \leq C_2 \log p,$$

where $C_1$ and $C_2$ are positive constants depending on $n$ but not on $p$. Note that estimates for $C_1$ and $C_2$ can be produced, and that Proposition 3.1 furnishes an estimate for $C_1$, for instance. Sharp values for $C_1$ and $C_2$ are of relatively minor consequence for us.

Each entry of the left-hand side matrix in the equation above is polynomial in the variables $k_i$ and $\ell_i$. Thus, the equation corresponds to a system of multivariate polynomial equations over $\mathbb{F}_p$.
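The polynomial dependence on the exponents can be made concrete for the bidiagonal matrices of Section 2.1. Writing $\tilde{A} = I + aN$ with $N$ the nilpotent shift, the binomial theorem gives $(\tilde{A}^k)_{i,j} = \binom{k}{j-i} a^{j-i}$ for $j \geq i$, which is a polynomial in $k$ of degree $j - i < n$, and $\tilde{A}^p = I$ over $\mathbb{F}_p$ once $p \geq n$. The sketch below checks both statements numerically; the helper names and the parameters $n = 3$, $a = 4$, $p = 11$ are assumptions chosen for illustration.

```python
from math import comb


def mat_mul(X, Y, p):
    n = len(X)
    return tuple(
        tuple(sum(X[i][k] * Y[k][j] for k in range(n)) % p for j in range(n))
        for i in range(n)
    )


def identity(n):
    return tuple(tuple(int(i == j) for j in range(n)) for i in range(n))


def upper_bidiagonal(n, a, p):
    # I + a*N: ones on the diagonal, a on the superdiagonal.
    return tuple(
        tuple(1 if j == i else (a % p if j == i + 1 else 0) for j in range(n))
        for i in range(n)
    )


def naive_power(U, k, p):
    # k-fold product, deliberately independent of the closed form below.
    result = identity(len(U))
    for _ in range(k):
        result = mat_mul(result, U, p)
    return result


def closed_form_power(n, a, k, p):
    # Entry (i, j) of (I + a*N)^k is C(k, j - i) * a^(j - i) for j >= i.
    return tuple(
        tuple(comb(k, j - i) * pow(a, j - i, p) % p if j >= i else 0
              for j in range(n))
        for i in range(n)
    )
```

Since $A = \tilde{A}^{\ell}$, the entries of $A^{k}$ are obtained by substituting $\ell k$ for $k$, so they remain polynomial in $k$.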

Solving multivariate polynomial equations over a finite field is known to be NP-hard [GareyJ1979Computers], which suggests a good level of security. Moreover, the reduction to solving multivariate polynomial systems, a class of hard problems considered for standardization by NIST, provides a certain degree of confidence that the hash function is post-quantum. We contrast this approach with schemes based on isogeny graphs, which reduce to a more well-defined problem, albeit one not known to be NP-hard.

NP-hardness of a class of problems is a worst case complexity property, and for certain NP-hard classes of problems, relatively simple and efficient algorithms can find solutions in the vast majority of cases. Thus, NP-hardness of the underlying problem is not a guarantee of post-quantum behavior of the hash function.

A more compelling case for the hash function to be post-quantum arises from the empirical difficulty of factoring in special linear groups over finite fields. For instance, in [FaugerePPR11], subexponential factorization algorithms were found for $\mathrm{SL}_2(\mathbb{F}_{2^n})$, and these were only found in 2011. These algorithms rely essentially on the fact that the matrices are $2 \times 2$, and on the fact that the underlying field has characteristic two. Thus, the methods do not generalize in any straightforward way to larger dimensional special linear groups nor to fields with odd characteristic. In practice, factoring matrices over finite fields is quite difficult, and implemented algorithms are inefficient. Hardness appears to be optimized when the system of equations resulting from the factorization problem is neither underdetermined nor overdetermined, i.e. when the number of equations and variables is comparable. Thus, the larger the value of the more secure the hash function, at the expense of computational time and space, and the balance of degrees of freedom and constraints occurs when $n^2 \approx \log p$, or in other words when $n$ is approximately the square root of the logarithm of $p$. We may then expect the factorization problem to take exponential time in the number of variables in this case, and to be resistant to the speedup resulting from quantum attacks.

### 3.3 The mixing property and indistinguishability attacks

In this section we study how to formalize the relationship between mixing properties and indistinguishability, and how expansion weakens indistinguishability attacks and thus gives heuristic confidence in pre-image breaking resistance.

#### 3.3.1 The mixing property

By the mixing property, we mean that the output vertex of a random input (in our case a random walk) approaches the uniform distribution on the hash space. When the random walk approaches the uniform distribution quickly, mixing is observed even when the input messages have relatively small length, say polynomial in . More precisely, we have the following corollary of a result of Alon-Benjamini-Lubetzky-Sodin [Alon2007NonBacktracking], which characterizes the rate at which a random walk on a graph converges to the uniform distribution in terms of the spectral properties of its adjacency matrix:

###### Theorem 3.2.

[Alon2007NonBacktracking, Theorem 1.1, cf. proof of Theorem 1.3] Suppose $d \geq 3$. Let $(X_\ell)$ be a non-backtracking random walk on a $d$-regular connected graph $G$ with $N$ vertices. There is a constant $C$ such that whenever $\ell \geq C \log N$ we have

$$|\Pr(X_\ell = v) - 1/N| \leq 1/N^2,$$

for every vertex $v$ of $G$.

Examining the proof given in [Alon2007NonBacktracking], one finds that the rate of mixing depends not so much on the graph $G$ as on the eigenvalues of the adjacency matrix of $G$. Thus, if $G$ is a member of a sequence of expander graphs, we may take the constant $C$ in Theorem 3.2 to depend only on the expansion constant of the family.
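The mixing phenomenon can be observed exactly on a toy example. The sketch below is an assumption-laden illustration rather than the paper's setting: it uses the $24$-element group $\mathrm{SL}_2(\mathbb{F}_3)$ with the elementary unipotent generators, so that the full distribution of the non-backtracking walk can be tracked with rational arithmetic. The state of the walk is the pair (current vertex, last generator used), which is what makes the non-backtracking constraint Markovian.

```python
from fractions import Fraction

P = 3
IDENTITY = ((1, 0), (0, 1))
# Generators in the order A, A^-1, B, B^-1; index g ^ 1 is the inverse of g.
GENS = [((1, 1), (0, 1)), ((1, 2), (0, 1)), ((1, 0), (1, 1)), ((1, 0), (2, 1))]


def mul(X, Y):
    return tuple(
        tuple(sum(X[i][k] * Y[k][j] for k in range(2)) % P for j in range(2))
        for i in range(2)
    )


def group_elements():
    # Closure of the generators under right multiplication.
    seen, frontier = {IDENTITY}, [IDENTITY]
    while frontier:
        v = frontier.pop()
        for g in GENS:
            w = mul(v, g)
            if w not in seen:
                seen.add(w)
                frontier.append(w)
    return seen


def nbrw_distribution(steps):
    # Exact law of the endpoint of a non-backtracking walk started at the identity.
    states = {(IDENTITY, None): Fraction(1)}
    for _ in range(steps):
        nxt = {}
        for (v, last), pr in states.items():
            allowed = [g for g in range(4) if last is None or g != last ^ 1]
            share = pr / len(allowed)
            for g in allowed:
                key = (mul(v, GENS[g]), g)
                nxt[key] = nxt.get(key, Fraction(0)) + share
        states = nxt
    dist = {}
    for (v, _), pr in states.items():
        dist[v] = dist.get(v, Fraction(0)) + pr
    return dist


def deviation(steps):
    # Largest pointwise gap between the walk and the uniform distribution.
    group = group_elements()
    uniform = Fraction(1, len(group))
    dist = nbrw_distribution(steps)
    return max(abs(dist.get(v, Fraction(0)) - uniform) for v in group)
```

Evaluating `float(deviation(k))` for increasing `k` shows the pointwise deviation from uniform shrinking as the walk gets longer, which is the quantity bounded in Theorem 3.2.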

It is well-known that mixing properties are desirable in Tillich-Zémor hashing schemes; see [charles2008cryptographic, TillichZemor08Collision]. As explained in [TillichZemor08Collision], mixing is particularly relevant when the hash functions are used in protocols whose security relies on the random oracle model; see [BellareR1993RandomOracles] for examples of such protocols. The relevance of this approach depends on the distribution of possible messages and in particular on how they are encoded, a question we do not address in the present paper.

Surprisingly few mathematical statements addressing the relationship between mixing and attacks are present in the literature; an example can be found in [Sterner2021Commitment, Theorem 3], in the context of commitment schemes. The following proposition, which is the main goal of this section, fits in this context. We will give precise definitions of all the terms involved below.

###### Proposition 3.3.

Let the hash function of Definition 2.3, and let . There is a positive constant such that if , then a generic indistinguishability attack wins with probability at most .

In particular we have the following immediate consequence. Note that probability $1/2$ is the best we can hope for if an adversary is asked to distinguish between two situations, because such an adversary can always choose at random with success probability $1/2$.

###### Corollary 3.4.

In the previous notation, when $p$ tends to infinity, the probability of winning for a generic indistinguishability attack tends to $1/2$.

The complexity of the so-called generic indistinguishability attack mentioned here is linear in the number of vertices of the graph involved. Thus, in the specific case of Proposition 3.3 it has exponential complexity, since the size of $\mathrm{SL}_n(\mathbb{F}_p)$ is polynomial in $p$, and our reference is $\log p$. Thus, we obtain a more general notion of indistinguishability than in [Katz2021Introduction, §6.8], which is restricted to polynomial-time algorithms.

The proof of Proposition 3.3 will be obtained as a corollary of the more general Proposition 3.5, and of the mixing properties of the random walk on . We elaborate on these ideas presently.

#### 3.3.2 Generic attack of indistinguishability challenge

The goal of this subsection is to define indistinguishability challenges, investigate possible attacks, and to justify to what extent the hash functions defined in this paper are resistant to these attacks.

##### Indistinguishability challenge.

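Let $H$ be a finite set. The challenger $\mathcal{C}$ has two random variables $X$ and $Y$, which take values in $H$. These distributions are a priori distinct, and we think of them as being “close”. $\mathcal{C}$ picks $x$ (resp. $y$) at random, according to the distribution of $X$ (resp. $Y$). $\mathcal{C}$ then flips a balanced coin $P \in \{0, 1\}$ and sends to their adversary $\mathcal{A}$ the element $z$, which satisfies:

• $z = x$ if $P = 0$,

• $z = y$ if $P = 1$.

Given $z$, the goal of $\mathcal{A}$ is to guess $P$. In other words, $\mathcal{A}$ needs to decide if $\mathcal{C}$ sent an element of $H$ according to the distribution of $X$ or of $Y$.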

##### Generic indistinguishability attack.

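The adversary $\mathcal{A}$ knows the public distributions of $X$ and $Y$. For each binary relation

$$* \in \{<, =, >\},$$

we denote by $H_*$ the set of $h \in H$ such that $\Pr(Y = h) * \Pr(X = h)$. The best strategy available to $\mathcal{A}$ is the following:

• If $z \in H_<$, then $\mathcal{A}$ guesses $P = 0$,

• If $z \in H_=$, then $\mathcal{A}$ guesses a random $P$,

• If $z \in H_>$, then $\mathcal{A}$ guesses $P = 1$.

In the following proposition, we relate the $\ell^\infty$-distance between the distributions of $X$ and $Y$ on one hand, and the chances of success of $\mathcal{A}$ following the strategy above on the other hand.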

###### Proposition 3.5.

Let $\varepsilon > 0$ be such that

$$|\Pr(X = h) - \Pr(Y = h)| \leq \varepsilon,$$

for every $h \in H$. Then, $\mathcal{A}$ guesses the correct $P$ with probability at most

$$\frac{1}{2} + \frac{\varepsilon \, \# H}{4}.$$

###### Remark 3.6.

The probability that $\mathcal{A}$ guesses $P$ correctly is $1/2$ when the strategy is to guess a random $P$, so Proposition 3.5 above essentially says that $\mathcal{A}$ cannot do much better, provided the distributions are sufficiently close.

###### Proof of Proposition 3.5.

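Let $P_{\mathcal{A}}$ denote the probability that $\mathcal{A}$ is right. We have

$$P_{\mathcal{A}} = \sum_{h \in H} \Pr(\mathcal{A} \text{ is right} \mid Z = h) \Pr(Z = h).$$

For each binary relation $* \in \{<, =, >\}$ let

$$P_* = \sum_{h \in H_*} \Pr(\mathcal{A} \text{ is right} \mid Z = h) \Pr(Z = h).$$

Then, we have $P_{\mathcal{A}} = P_< + P_= + P_>$. We analyze each of these terms independently.

• If $h \in H_=$, then $\mathcal{A}$ guesses a random $P$. Thus,

$$P_= = \sum_{h \in H_=} \frac{\Pr(Z = h)}{2} = \sum_{h \in H_=} \frac{\Pr(X = h) + \Pr(Y = h)}{4}.$$

• If $h \in H_<$, then $\mathcal{A}$ guesses $P = 0$. Thus,

$$P_< = \sum_{h \in H_<} \Pr(P = 0 \mid Z = h) \Pr(Z = h) = \sum_{h \in H_<} \Pr(Z = h \mid P = 0) \Pr(P = 0) = \sum_{h \in H_<} \frac{\Pr(X = h)}{2}.$$

• If $h \in H_>$, then $\mathcal{A}$ guesses $P = 1$. Thus,

$$P_> = \sum_{h \in H_>} \Pr(P = 1 \mid Z = h) \Pr(Z = h) = \sum_{h \in H_>} \Pr(Z = h \mid P = 1) \Pr(P = 1) = \sum_{h \in H_>} \frac{\Pr(Y = h)}{2}.$$

So, we obtain

$$P_{\mathcal{A}} = \frac{1}{2} \sum_{h \in H} \max\{\Pr(X = h), \Pr(Y = h)\}.$$

We can now give an upper bound on $P_{\mathcal{A}}$. We can assume without any loss of generality that $\# H_< \leq \# H / 2$, exchanging the roles of $X$ and $Y$ if necessary. Then,

$$P_{\mathcal{A}} = \frac{1}{2} \sum_{h \notin H_<} \Pr(Y = h) + \frac{1}{2} \sum_{h \in H_<} \Pr(X = h) \leq \frac{1}{2} \sum_{h \notin H_<} \Pr(Y = h) + \frac{1}{2} \sum_{h \in H_<} \bigl(\Pr(Y = h) + \varepsilon\bigr) = \frac{1}{2} + \frac{\varepsilon \, \# H_<}{2} \leq \frac{1}{2} + \frac{\varepsilon \, \# H}{4}.$$

∎
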
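The bound of Proposition 3.5 is easy to check numerically on small distributions. The sketch below computes the success probability of the generic attack, $\frac{1}{2}\sum_h \max\{\Pr(X=h), \Pr(Y=h)\}$, and compares it with $\frac{1}{2} + \varepsilon \, \#H / 4$; the function names and the two sample distributions are assumptions made for illustration.

```python
from fractions import Fraction


def generic_attack_success(px, py):
    """Success probability of the best guessing strategy: half the sum of pointwise maxima."""
    support = set(px) | set(py)
    return sum(max(px.get(h, Fraction(0)), py.get(h, Fraction(0))) for h in support) / 2


def proposition_bound(px, py):
    """Right-hand side 1/2 + eps * #H / 4, with eps the largest pointwise gap."""
    support = set(px) | set(py)
    eps = max(abs(px.get(h, Fraction(0)) - py.get(h, Fraction(0))) for h in support)
    return Fraction(1, 2) + eps * len(support) / 4


# X is uniform on four points, Y is a small perturbation of it.
uniform = {h: Fraction(1, 4) for h in range(4)}
perturbed = {0: Fraction(3, 8), 1: Fraction(1, 8), 2: Fraction(1, 4), 3: Fraction(1, 4)}

success = generic_attack_success(uniform, perturbed)  # Fraction(9, 16)
bound = proposition_bound(uniform, perturbed)         # Fraction(5, 8)
```

Here the attack succeeds with probability $9/16$, below the bound $5/8$ obtained from $\varepsilon = 1/8$ and $\# H = 4$.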
##### Generic indistinguishability attack in our framework.

In the framework of hash functions, we consider a function , where and are finite sets. The challenger has two variables in hand, namely and , both chosen uniformly at random. The indistinguishability challenge is now adapted to the two variables and . We can now prove Proposition 3.3.

###### Proof of Proposition 3.3.

From Theorem 3.2, there are constants such that after steps, the non backtracking random walk on is at -distance at most from the uniform distribution. From Proposition 3.5, the generic indistinguishability attack wins with probability at most . ∎

### 3.4 Multiplicativity and parallel computing

The hash functions considered in this article take as input a string of integers in $[3]$, convert each integer into a matrix of the form $A^{\pm 1}$ or $B^{\pm 1}$, and finally output the product of these matrices.

The fact that we require the underlying walk to be non-backtracking implies that this attribution is not locally determined: a given bit in the string is mapped to a matrix that depends on the previous bits. This dependency can be dramatic: for example, according to Definition 2.6 a sequence of the form will be mapped to the product , while a sequence of the form will be mapped to the product . In particular, the last bit of these two strings can be mapped to different matrices, depending on the first bit in the string. The endpoints of the corresponding walks in the Cayley graph may be arbitrarily far away from each other.

As a consequence, the function $\varphi$ need not be multiplicative under concatenation of strings. This lack of multiplicativity makes it difficult to perform parallel computations with the given hash functions, as we now investigate in more detail.

#### 3.4.1 Good and bad tails

Recall that for a finite set $S$, the notation $S^*$ is used for the set of finite length strings of elements of $S$. As before, the notation $[3]$ denotes the set $\{1, 2, 3\}$.

###### Definition 3.7.

Let $G$ be a finite group, generated by two elements $A$ and $B$. Let $\tilde{\varphi} \colon [3]^* \to \{A^{\pm 1}, B^{\pm 1}\}^*$.

A string $t \in [3]^*$ is called a good tail with respect to $\tilde{\varphi}$ if there exists $M \in \{A^{\pm 1}, B^{\pm 1}\}$ such that for every $x \in [3]^*$, the last letter of $\tilde{\varphi}(xt)$ is $M$, where here $xt$ is the string obtained from the concatenation of $x$ and $t$. A string which is not a good tail is called a bad tail.

Local constraints in Definition 2.6 can be obtained from the following fact:

###### Fact 3.8.

The function

$$\tilde{\varphi} \colon \{x_i\} \in [3]^* \mapsto \{B_i\} \in \{A^{\pm 1}, B^{\pm 1}\}^*$$

constructed in Definition 2.6 has the following good tails: 11, 31, 22, 32, 13 and 23.

###### Proof.

It is straightforward to check that:

• any string ending in 11 or 31 outputs a string ending in ;

• any string ending in 22 or 32 outputs a string ending in ;

• any string ending in 13 outputs a string ending in ;

• any string ending in 23 outputs a string ending in . ∎

The following proposition shows that the attribution above is optimal.

###### Proposition 3.9.

Special linear group based hash functions (Definition 2.3) admit at most six good tails of length two.

The bound in Proposition 3.9 is sharp, as shown via the attributions of Definition 2.6. The proof of Proposition 3.9 will follow from the following lemma:

###### Lemma 3.10.

Let

$$\tilde{\varphi} \colon \{x_i\} \in [3]^* \mapsto \{B_i\} \in \{A^{\pm 1}, B^{\pm 1}\}^*$$

be a special linear group based hash function (Definition 2.3), and let . There exists such that is a bad tail with respect to .

###### Proof.

The only freedom that we have in the construction of Definition 2.3 is how we define the maps . We call elements of step matrices. Using , we say that the elements of are encoded by step matrices. We summarize the definitions of the maps in a table, with one row for each step matrix, and one column for each element of . Each cell from this tabular contains a step matrix.

To use this table, start from a string . Say that for some , we want to find the step matrix associated with . Let be the step matrix encoding . Then, the step matrix encoding is the step matrix in the cell located in the row labelled by and in the column labelled by .

It follows from the definition of the maps that for each step matrix , the row labelled by contains exactly the three matrices in the set . The attributions of Definition 2.6 can be described as in Figure 1.

Moreover, since each matrix can actually be the last step matrix used, every cell of the table can potentially be used. Fix an element , and assume for a contradiction that every integer has the property that is a good tail. The column corresponding to each has to contain at least two different step matrices. This implies that, in the row labelled by , at least two cells contain the same step matrix. Since this is true for each , this implies in particular that there is a step matrix that is contained three times in the column labelled by . The fourth cell of this column cannot be part of the row labelled by since this would give rise to another in a different column. This implies that the label of this row appears in a cell of another column. Additionally, this column contains a cell with step matrix not equal to . Then, the label of this column gives us an integer having the property that inputs ending by can have either or another matrix as a final matrix. This is a contradiction and concludes the proof of the lemma. ∎

###### Proof of Proposition 3.9.

From Lemma 3.10, to each integer of $[3]$ corresponds at least one bad tail, giving three different bad tails. Since there are nine strings of length two, at most six of them are good tails. ∎

As remarked previously, Definition 2.6 shows that the estimate in Proposition 3.9 is sharp, and so that in some sense, we have found an optimal way of defining the maps .

#### 3.4.2 Multiplicativity

It follows from the discussion of good and bad tails above that multiplicativity of the hash function can be obtained by restricting to sequences whose product ends with the matrix .

###### Fact 3.11.

In Definition 2.6, we have , provided ends with or .

#### 3.4.3 Parallel computing

Multiplicativity of the hash function under suitable conditions can be leveraged to compute its values by parallel computation. First, look for good tail substrings, namely: 11, 31, 22, 32, 13 or 23. For generic messages, one would expect such substrings to be quite common. Next, split the input immediately following one of these strings, and apply a slightly modified hash function (i.e. using the relevant instead of in the first matrix attribution). Finally, compute the product of the hash outputs.

###### Example 3.12.

Say we want to hash the string . Observe that: . We thus compute: and , where is defined analogously to in Definition 2.6, apart from the fact that is set to be instead of , since is a product ending by . Finally, .
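The splitting idea can be sketched as follows. This is a minimal illustration, not the attribution of Definition 2.6: the transition rule (steps ordered as $A, A^{-1}, B, B^{-1}$, the input symbol choosing among the three steps that do not undo the previous one), the prime $101$ and the $3 \times 3$ size are all assumptions. The point it illustrates is that the second block only needs to know the last step matrix of the first block, which a good tail provides for free and which in general can be recovered by a cheap pass over the symbols that performs no matrix multiplication.

```python
from typing import Sequence

P = 101  # small illustrative prime (assumption)
I3 = ((1, 0, 0), (0, 1, 0), (0, 0, 1))


def mul(X, Y):
    return tuple(
        tuple(sum(X[i][k] * Y[k][j] for k in range(3)) % P for j in range(3))
        for i in range(3)
    )


def power(X, e):
    result = I3
    for _ in range(e):
        result = mul(result, X)
    return result


A = power(((1, 4, 0), (0, 1, 4), (0, 0, 1)), 4)
B = power(((1, 0, 0), (2, 1, 0), (0, 2, 1)), 4)
# A and B are unipotent of order P here, so the inverse is the (P - 1)-th power.
STEPS = [A, power(A, P - 1), B, power(B, P - 1)]


def next_step(prev: int, x: int) -> int:
    # Hypothetical rule: x in {1, 2, 3} picks among the steps not cancelling prev.
    allowed = [g for g in range(4) if g != prev ^ 1]
    return allowed[x - 1]


def last_step(message: Sequence[int], initial_step: int = 0) -> int:
    # Cheap pass over the symbols: no matrix arithmetic involved.
    prev = initial_step
    for x in message:
        prev = next_step(prev, x)
    return prev


def walk(message: Sequence[int], initial_step: int = 0):
    prev, value = initial_step, I3
    for x in message:
        prev = next_step(prev, x)
        value = mul(value, STEPS[prev])
    return value


def split_hash(message: Sequence[int], cut: int, initial_step: int = 0):
    # The two calls to walk are independent and could run in parallel.
    left, right = message[:cut], message[cut:]
    carry = last_step(left, initial_step)
    return mul(walk(left, initial_step), walk(right, carry))
```

Under this rule `split_hash(message, cut)` agrees with `walk(message)` for every cut position, because the only information the right block needs from the left block is the index of its final step.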

### 3.5 Palindromic attacks

One of several proposals of hashing by walks on Cayley graphs can be found in [tillich1994hashing], wherein the Cayley graph is that of $\mathrm{SL}_2(\mathbb{F}_{2^n})$. A method for finding collisions for this hash function is presented in [grassl2011cryptanalysis] (cf. [FaugerePPR11]); we argue that the attack does not apply in our case, though our evidence for this claim is primarily empirical.

The idea of [grassl2011cryptanalysis] is to find collisions on palindromes; that is, bit-string entries that are invariant under reversing the order. To begin, one conjugates generators of $\mathrm{SL}_2(\mathbb{F}_{2^n})$ to obtain new generators which give rise to an isomorphic graph, but which are symmetric matrices. That is, if the original generators are $A$ and $B$, one finds a matrix $C$ such that the conjugates of $A$ and $B$ by $C$ are both symmetric matrices.

We first note that in our case, finding is not easy; for and , about of the elements satisfy this criterion. Moreover, there is no obvious way to compute ; attempts to calculate the entries of such a matrix directly have proved resistant to equation solving methods in standard computer algebra systems - indeed, this approach is actually less efficient than just checking all possible matrices. Therefore, we do not have much data for larger primes, since the naive method used to find a suitable matrix quickly becomes computationally infeasible.

Provided one can find a matrix $C$, it follows that collisions in the hash function with respect to $A, B$ as generators are exactly the collisions with their conjugates as generators; one can therefore rename the conjugated matrices as $A, B$. One then proceeds according to [grassl2011cryptanalysis, Lemma 1]: upon input of a palindromic string, the corresponding product of conjugated generators will always be a symmetric matrix.

Since our hash function requires avoidance of backtracking in the walk, we are not guaranteed a palindromic matrix product from a palindromic input string; however, since one could reverse-engineer the necessary input to obtain a palindromic matrix product, we proceed to discuss palindromic matrix products without reference to the hash function.

It turns out, as one may check easily by induction, that a palindromic product in symmetric generators will itself be symmetric. The ultimate goal of [grassl2011cryptanalysis] is to use this fact to demonstrate that the function

$$\rho \colon M \mapsto AMA + BMB,$$

where is a palindromic product, outputs a matrix populated with either zeroes or the square of a field element appearing as an entry in . One then employs number theoretic tricks to force the nonzero elements to in and thus to obtain . One thus builds distinct but equal palindromic matrix products.

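Consider the $3 \times 3$ generators from Definition 2.6 over $\mathbb{F}_{11}$. Transforming these generators with respect to the matrix

$$C = \begin{pmatrix} 2 & 6 & 10 \\ 5 & 3 & 10 \\ 2 & 3 & 3 \end{pmatrix},$$

one checks that the palindrome is such that

$$M = \begin{pmatrix} 7 & 4 & 2 \\ 4 & 0 & 6 \\ 2 & 6 & 6 \end{pmatrix}, \qquad \rho(M) = \begin{pmatrix} 2 & 1 & 5 \\ 1 & 1 & 7 \\ 5 & 7 & 7 \end{pmatrix}.$$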

In particular, for each , the matrix contains an entry that is not the power of any entry of . This furnishes evidence that for , there is little hope of extending [grassl2011cryptanalysis, Corollary 1] to our context; we argue that the lack of closed form of transformed generators in general, the difficulty of finding them for larger parameters, and this example with a small value of , conspire to provide strong evidence that the approach will fail in general.
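The numerical data of this example can be re-derived with a short computation. The sketch below rests on assumptions inferred from the displayed matrices rather than stated in the text: the entries of $C$ are read as elements of $\mathbb{F}_{11}$, the generators are the $3 \times 3$ matrices of Definition 2.6, the transformation is $X \mapsto C X C^{-1}$, and $\rho$ is evaluated with the transformed generators. Under these assumptions the transformed generators come out symmetric and $\rho(M)$ reproduces the matrix displayed above.

```python
P = 11


def mul(X, Y):
    return tuple(
        tuple(sum(X[i][k] * Y[k][j] for k in range(3)) % P for j in range(3))
        for i in range(3)
    )


def add(X, Y):
    return tuple(tuple((x + y) % P for x, y in zip(r, s)) for r, s in zip(X, Y))


def power(X, e):
    result = ((1, 0, 0), (0, 1, 0), (0, 0, 1))
    for _ in range(e):
        result = mul(result, X)
    return result


def inverse_3x3(X):
    # Adjugate formula; the modular inverse of the determinant uses Fermat's little theorem.
    (a, b, c), (d, e, f), (g, h, i) = X
    det = (a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)) % P
    det_inv = pow(det, P - 2, P)
    adj = (
        (e * i - f * h, c * h - b * i, b * f - c * e),
        (f * g - d * i, a * i - c * g, c * d - a * f),
        (d * h - e * g, b * g - a * h, a * e - b * d),
    )
    return tuple(tuple(det_inv * x % P for x in row) for row in adj)


def is_symmetric(X):
    return all(X[i][j] == X[j][i] for i in range(3) for j in range(3))


A = power(((1, 4, 0), (0, 1, 4), (0, 0, 1)), 4)
B = power(((1, 0, 0), (2, 1, 0), (0, 2, 1)), 4)
C = ((2, 6, 10), (5, 3, 10), (2, 3, 3))
C_INV = inverse_3x3(C)

# Transformed generators (assumed convention: X -> C X C^-1).
A_SYM = mul(mul(C, A), C_INV)
B_SYM = mul(mul(C, B), C_INV)


def rho(M):
    return add(mul(mul(A_SYM, M), A_SYM), mul(mul(B_SYM, M), B_SYM))


M = ((7, 4, 2), (4, 0, 6), (2, 6, 6))
RHO_M = rho(M)
```

The same script can be used to search for a suitable $C$ by brute force over small primes, which is the naive method whose cost is discussed earlier in this section.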

## 4 Conclusions

We have presented new Tillich-Zémor hash functions, whose platforms are Cayley graphs of $\mathrm{SL}_n(\mathbb{F}_p)$ for $n \geq 3$. We show that choosing appropriate generating matrices produces graphs without small cycles, and having a quick mixing property, both of which are highly desirable for preimage and collision resistance. Moreover, flexibility of choice of generating matrices and of the dimension gives the option of increasing the complexity of attacks. Future work includes the exact computation of the spectral gap and of the prime $p_0$ (cf. item 1 of Proposition 2.1). Moreover, simulations should be carried out in order to compare with other existing schemes and determine the optimal values of and to be taken in implementations.