    # Computation of the Similarity Class of the p-Curvature

The p-curvature of a system of linear differential equations in positive characteristic p is a matrix that measures how far the system is from having a basis of polynomial solutions. We show that the similarity class of the p-curvature can be determined without computing the p-curvature itself. More precisely, we design an algorithm that computes the invariant factors of the p-curvature in time quasi-linear in √(p). This is much less than the size of the p-curvature, which is generally linear in p. The new algorithm allows us to answer a question originating from the study of the Ising model in statistical physics.


## 1 Introduction

Differential equations in positive characteristic p are important and well-studied objects in mathematics [22, 32, 33]. The main reason is arguably one of Grothendieck’s (still unsolved) conjectures [26, 27, 1], stating that a linear differential equation with coefficients in Q(x) admits a basis of algebraic solutions if and only if its reductions modulo (almost) all primes p admit a basis of polynomial solutions modulo p. Another motivation stems from the fact that the reductions modulo prime numbers yield useful information about the factorization of differential operators in characteristic zero.

To a linear differential equation in fixed characteristic p, or more generally to a system Y′ = A·Y of such equations, is attached a simple yet very useful object, the p-curvature. Let F_q be the finite field with q elements, of characteristic p. The p-curvature of a system of linear differential equations with coefficients in F_q(x) is a matrix with entries in F_q(x) that measures the obstructions for the given system to possess a fundamental matrix of solutions with entries in F_q(x). Its definition is remarkably simple, especially at a higher level of generality: the p-curvature of a differential module (M, ∂) of dimension r over F_q(x) is the map ∂^p = ∂ ∘ ⋯ ∘ ∂ (p times). When applied to the differential module canonically attached to the system Y′ = A·Y, the p-curvature materializes into the p-th iterate of the map sending Y to Y′ − A·Y, or more concretely, into the matrix of this map with respect to the canonical basis of F_q(x)^r. It is given as the p-th term A_p of the sequence of matrices (A_i)_{i≥1} in M_r(F_q(x)) defined by

$$A_1 = A, \qquad A_{i+1} = A_i' + A_i \cdot A \quad (i \ge 1),$$

so that Y^{(i)} = A_i·Y for any solution Y of the system.
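As a concrete (and deliberately naive) illustration, the standard recurrence A_1 = A, A_{i+1} = A_i′ + A_i·A can be run directly in a computer algebra system. The sketch below uses SymPy, reducing coefficients modulo p at each step; it costs O(p) matrix operations, in contrast with the sublinear algorithms discussed in this paper.

```python
import sympy as sp

def reduce_mod_p(e, x, p):
    # Reduce a rational function of x modulo p (coefficients are
    # represented by symmetric integer residues).
    num, den = sp.fraction(sp.cancel(sp.together(e)))
    num = sp.Poly(num, x, modulus=p).as_expr()
    den = sp.Poly(den, x, modulus=p).as_expr()
    return sp.cancel(num / den)

def p_curvature(A, x, p):
    # p-curvature of the system Y' = A*Y, computed as the p-th term
    # of the recurrence A_1 = A, A_{i+1} = A_i' + A_i * A.
    Ai = A
    for _ in range(p - 1):
        Ai = (Ai.diff(x) + Ai * A).applyfunc(lambda e: reduce_mod_p(e, x, p))
    return Ai
```

For instance, the scalar system y′ = y/x has the polynomial solution y = x, and its p-curvature indeed vanishes.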
From a computer algebra perspective, many effectivity questions naturally arise. They primarily concern the algorithmic complexity of various operations and properties related to the p-curvature: How fast can one compute A_p? How fast can one decide its nullity? How fast can one determine its minimal and characteristic polynomials? Apart from the fundamental nature of these questions from the algebraic complexity theory viewpoint, there are concrete motivations for the efficient computation of the p-curvature, coming from various applications, notably in enumerative combinatorics and statistical physics [7, 8, 2].

We pursue the algorithmic study of the p-curvature, initiated in [9, 3, 4]. In those articles, several questions were answered satisfactorily, but a few other problems were left open. In summary, the current state of affairs is as follows. First, the p-curvature A_p can be computed in time O~(p). The soft-O notation indicates that polylogarithmic factors in the argument of O(·) are deliberately not displayed. This complexity matches, up to polylogarithmic factors, the generic size of A_p. Secondly, one can decide the nullity of A_p in time O~(√p) and compute its characteristic polynomial in time O~(√p). It is not known whether the exponent 1/2 is optimal for the last problem. In all these estimates, the complexity (“time”) measure is the number of arithmetic operations in the ground field F_q, and the dependence is expressed in the main parameter p only. Nevertheless, precise estimates are also available in terms of the other parameters of the input.

In the present work, we focus on the computation of all the invariant factors of the p-curvature, and show that they can also be determined in time O~(√p). Previously, such a cost was not known to be achievable even for the minimal polynomial of A_p, or for testing the nullity of its powers. The fact that a sublinear cost could in principle be achievable, although A_p itself has a total arithmetic size linear in p, comes from the observation that the coefficients of the invariant factors of A_p lie in the subfield F_q(x^p) of F_q(x); in other words, they are very sparse.

To achieve our objective, we blend the methods used in our previous works [3] and [4]. The first key ingredient is the construction, for any point a in the algebraic closure of F_q that is not a pole of A, of a matrix with entries in a finite extension of F_q which is similar to the evaluation of the p-curvature at the point a. This construction comes from [4] and ultimately relies on the existence of a well-suited ring of so-called Hurwitz series, for which an analogue of the Cauchy–Lipschitz theorem holds for the system around the (ordinary) point a. The matrix in question is the p-th coefficient of the fundamental matrix of Hurwitz series solutions of the system at a.

The second key ingredient is a baby step / giant step algorithm that computes this matrix in O~(√p) operations via fast matrix factorials. Finally, we recover the invariant factors of A_p from those of these matrices, for a suitable number of evaluation points a. The main difficulty in this interpolation process is that there exist badly behaved points a for which the invariant factors of the evaluated matrix are not the evaluations at a of the invariant factors of A_p. The remaining task is then to bound the number of unlucky evaluation points. The key feature allowing a good control on these points, independent of p, is the fact that the invariant factors of A_p have coefficients in F_q(x^p).

Relationship to previous work. There exists a large body of work concerning the computation of so-called Frobenius forms of matrices (that is, the list of their invariant factors, possibly with corresponding transformation matrices), and the related problem of Smith forms of polynomial matrices. The specificities of our problem prevent us from applying these methods directly; however, our work is related to several of these previous results.

Let ω be a feasible exponent for matrix multiplication. The best deterministic algorithm known so far for the computation of the Frobenius form of an n × n matrix over a field K is due to Storjohann. This algorithm has running time O~(n^ω) operations in K. We will use it to compute the invariant factors of the matrices above. Las Vegas algorithms were given by Giesbrecht, Eberly, and Pernet and Storjohann, the latter having expected running time O(n^ω) over sufficiently large fields.

The case of matrices with integer or rational entries has attracted a lot of attention; this situation is close to ours, with the bit size of the integers playing a role similar to the degree of the entries in the p-curvature. Early work goes back to algorithms of Kaltofen et al. [23, 24] for the Smith form of matrices over Z, which introduced techniques used in several further algorithms, such as the Las Vegas algorithm by Storjohann and Labahn. Giesbrecht’s PhD thesis gives a Las Vegas algorithm for the Frobenius normal form of an n × n matrix with integer entries of bounded bit size; Storjohann and Giesbrecht substantially improved this result, with an algorithm of smaller expected cost. The best Monte Carlo running time known to us is due to Kaltofen and Villard.

In the latter case of matrices with integer coefficients, a common technique relies on reduction modulo primes, and a main source of difficulty is to control the number of “unlucky” reductions. We pointed out above that this is the case in our algorithm as well. In general, the number of unlucky primes is shown to be polynomial in the size of the input; in our case, the degree of the entries grows linearly with p, but as we said above, we can alleviate this issue by exploiting the properties of the p-curvature. Storjohann and Giesbrecht proved that a candidate for the Frobenius form of an integer matrix can be verified using only a small number of primes; it would be most interesting to adapt this idea to our situation.

Structure of the paper. In Section 2, we recall the main theoretical properties of the invariant factors of a polynomial matrix, and study their behavior under specialization. We obtain bounds on bad evaluation points, and use them to design (deterministic and probabilistic) evaluation-interpolation algorithms for computing the invariant factors of a polynomial matrix. Section 3 is devoted to the design of our main algorithms for the similarity class of the p-curvature, with deterministic and probabilistic versions for both the system case and the scalar case. Finally, Section 4 presents an application of our algorithm, which allows us to answer a question coming from theoretical physics.

Complexity basics. We use standard complexity notation, such as ω for the exponent of matrix multiplication. The best known upper bound is ω < 2.3729. Many arithmetic operations on univariate polynomials of degree d with coefficients in a field K can be performed in O~(d) operations in K: addition, multiplication, shift, evaluation, interpolation, etc., the key to these results being fast polynomial multiplication [29, 11, 21]. A general reference for these questions is the book of von zur Gathen and Gerhard.

## 2 Computing invariant factors of special polynomial matrices

### 2.1 Definition and classical facts

We recall here some basic facts about invariant factors of matrices defined over a field. We fix for now a field K and a matrix M ∈ M_n(K). For a monic polynomial P = T^d + a_{d−1}T^{d−1} + ⋯ + a_1·T + a_0, let M_P denote its companion matrix:

$$M_P = \begin{pmatrix} 0 & & & -a_0 \\ 1 & \ddots & & -a_1 \\ & \ddots & 0 & \vdots \\ & & 1 & -a_{d-1} \end{pmatrix}.$$

A well-known theorem [16, Th. 9, Ch. VII] asserts that there exists a unique sequence of monic polynomials I_1, …, I_n for which I_j divides I_{j+1} for all j and M is similar to the block diagonal matrix with diagonal blocks M_{I_1}, …, M_{I_n}. The I_j’s are called the invariant factors of M. We emphasize that, with our convention, there are always n invariant factors, but some of them may be equal to 1, in which case the corresponding companion matrix is the empty one. Under this normalization, the j-th invariant factor can be obtained as I_j = G_j/G_{j−1}, where G_j is the gcd of the minors of size j of the matrix T·I_n − M, and I_n stands for the identity matrix of size n. The invariant factors are closely related to the characteristic polynomial; indeed, we have

$$I_1 \cdot I_2 \cdots I_n = G_n = \det(T \cdot I_n - M). \qquad (1)$$
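The gcd-of-minors characterization just recalled translates directly into (inefficient but instructive) code; the following SymPy sketch computes the invariant factors I_1 | I_2 | ⋯ | I_n of a matrix over Q literally from this definition.

```python
import sympy as sp
from itertools import combinations

def invariant_factors(M, t):
    # Invariant factors I_1 | I_2 | ... | I_n of M via the gcd-of-minors
    # characterization: I_j = G_j / G_{j-1}, where G_j is the monic gcd
    # of all j x j minors of t*I_n - M (and G_0 = 1).
    n = M.shape[0]
    A = t * sp.eye(n) - M
    G = [sp.Integer(1)]
    for j in range(1, n + 1):
        g = sp.Integer(0)
        for rows in combinations(range(n), j):
            for cols in combinations(range(n), j):
                g = sp.gcd(g, A[list(rows), list(cols)].det())
        G.append(sp.monic(g, t))
    return [sp.cancel(G[j] / G[j - 1]) for j in range(1, n + 1)]
```

For example, the 2 × 2 identity matrix has invariant factors (T − 1, T − 1), while a nilpotent Jordan block of size 2 has invariant factors (1, T²).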

Given some irreducible polynomial P ∈ K[T], we consider the sequence (of integers):

$$e \mapsto d_{P,e} = \frac{\dim_K \ker P^e(M)}{\deg P}. \qquad (2)$$

It turns out that this sequence completely determines the P-adic valuation of the invariant factors. Indeed, denoting by v_j the P-adic valuation of I_j, we have the relations:

$$d_{P,e} = \sum_{j=1}^{n} \min(e, v_j), \qquad (3)$$

$$d_{P,e} - d_{P,e-1} = \operatorname{Card}\{\, j \mid v_j \ge e \,\}, \qquad (4)$$

from which the v_j’s can be recovered without ambiguity since they form a nondecreasing sequence. It also follows from the above formulas that the sequence e ↦ d_{P,e} is concave and eventually constant. Its final value is the dimension of the characteristic subspace associated to P, divided by deg P, and it is reached as soon as e is greater than or equal to v_n.
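Relations (2)–(4) give an effective way to read off the P-adic valuations of the invariant factors from kernel dimensions alone. A small SymPy sketch (over Q, with P evaluated at M by Horner’s rule):

```python
import sympy as sp

def valuations(M, P, t):
    # P-adic valuations v_1 <= ... <= v_n of the invariant factors of M,
    # recovered from d_{P,e} = dim ker P^e(M) / deg P        (Eq. (2))
    # through d_{P,e} - d_{P,e-1} = Card{ j : v_j >= e }     (Eq. (4)).
    n = M.shape[0]
    B = sp.zeros(n)
    for c in sp.Poly(P, t).all_coeffs():   # Horner: B = P(M)
        B = B * M + c * sp.eye(n)
    degP = sp.Poly(P, t).degree()
    d, Be = [0], sp.eye(n)
    while True:
        Be = Be * B                        # Be = P(M)^e
        d.append((n - Be.rank()) // degP)
        if d[-1] == d[-2]:                 # sequence is eventually constant
            break
    counts = [d[e] - d[e - 1] for e in range(1, len(d))]
    return [sum(1 for c in counts if c >= n - j + 1) for j in range(1, n + 1)]
```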

### 2.2 Behaviour under specialization

Let k be a perfect field of characteristic p. We consider a matrix M = M(x) with coefficients in k[x]. For an element a lying in a finite extension ℓ of k, we denote by M(a) the image of M under the mapping x ↦ a. Our aim is to compare the invariant factors of M(x) and those of M(a).

We introduce some notation. Let I_1(x,T), …, I_n(x,T) be the invariant factors of M(x). It follows from the relation (1) that they all lie in k[x][T]. We can therefore evaluate them at x = a for each element a as above, and get this way univariate polynomials with coefficients in ℓ; let I_j(a,T) be these evaluations. We also consider the invariant factors of M(a) and call them I_{1,a}(T), …, I_{n,a}(T). We furthermore define

$$G_j(x,T) = I_1(x,T) \cdot I_2(x,T) \cdots I_j(x,T) \quad\text{and}\quad G_{j,a}(T) = I_{1,a}(T) \cdot I_{2,a}(T) \cdots I_{j,a}(T).$$

The characterization of the G_j’s in terms of minors yields:

###### Lemma 1

For all j and all a, the polynomial G_j(a,T) divides G_{j,a}(T) in ℓ[T].

Let P_1, …, P_r be the irreducible factors of the characteristic polynomial of M(x), and let us write P_i = P_i(x,T) for each i. For all i and j, let e_{i,j} be the multiplicity of P_i in I_j.

###### Proposition 2

We assume that the polynomials P_i(a,T) are separable and that

$$\dim_{k(x)} \ker P_i(x, M(x))^{e_{i,j}+1} = \dim_{\ell} \ker P_i(a, M(a))^{e_{i,j}+1}$$

for all i and for all j. Then I_{j,a}(T) = I_j(a,T) for all j.

The equality of dimensions also holds for the smaller exponents, as the relevant dimensions agree in total on both sides (using separability) and can only increase by specialization. Let (d_{P_i,e})_e be the sequence defined by Eq. (2) with respect to the irreducible polynomial P_i and the matrix M(x). We define similarly, for each irreducible factor P of P_i(a,T), the sequence (d_{P,e})_e corresponding to the polynomial P and the matrix M(a). We claim that it is enough to prove that d_{P,e} = d_{P_i,e} for all i, all e, and all irreducible divisors P of P_i(a,T). Indeed, by Eq. (4), such an equality would imply:

$$v_{P(T)}\big(I_{j,a}(T)\big) = e_{i,j} \qquad (5)$$

provided that P is an irreducible divisor of P_i(a,T), and where v_{P(T)} denotes the P(T)-adic valuation. On the other hand, still assuming that P is an irreducible divisor of P_i(a,T), it follows from the definition of the e_{i,j}’s that:

$$v_{P(T)}\big(I_j(a,T)\big) \ge e_{i,j} \qquad (6)$$

and that the equality holds if and only if P does not divide any of the P_{i'}(a,T) for i' ≠ i. Comparing characteristic polynomials, we know moreover that G_{n,a} = G_n(a, ·). Combining this with (5) and (6), we find that the P_i(a,T)’s are pairwise coprime, and finally get I_{j,a}(T) = I_j(a,T) for all j, as wanted.

Until the end of the proof, we fix the index i and reserve the letter P to denote an irreducible divisor of P_i(a,T). For a fixed integer e, Eq. (3) can be rewritten in terms of the greatest index j for which e_{i,j} < e. Using Lemma 1, we derive d_{P,e} ≥ d_{P_i,e} for all P and e. Eq. (4) now implies that the jumps of the sequence e ↦ d_{P,e} are governed by the e_{i,j}’s (1 ≤ j ≤ n). Using concavity, we then observe that it is enough to check the claimed equality for exponents of the form e = e_{i,j} + 1. For those e, we have by assumption:

$$\sum_P \deg P \cdot d_{P,e} = \dim_{\ell} \ker P_i(a, M(a))^e = \dim_{k(x)} \ker P_i(x, M(x))^e = \deg_T P_i \cdot d_{P_i,e} = \sum_P \deg P \cdot d_{P_i,e}$$

(the last equality because ∑_P deg P = deg_T P_i, by separability), and thus d_{P,e} = d_{P_i,e} for all P, because the inequalities d_{P,e} ≥ d_{P_i,e} are already known.

### 2.3 A bound on bad evaluation points

Let M = M(x) be a square matrix of size n with coefficients in k[x]. We assume that:

(i) the entries of M have degree at most m (for some integer m),

(ii) M is similar to a matrix with coefficients in k(x^p).

We are going to bound the number of values of a for which the invariant factors of M do not specialize correctly at a. Similar discussions appear in Section 4 of Giesbrecht’s thesis, in the (more complicated) case of integer matrices. Our treatment is nevertheless rather different in many places.

The basic bound. By assumption (ii), the characteristic polynomial of M lies in the subring k[x^p][T] of k[x][T].

###### Lemma 3

The invariant factors all belong to k[x^p][T]. Their degree with respect to x is at most mn.

By assumption (i), det(T·I_n − M) is a polynomial in x of degree at most mn. It then follows from Eq. (1) that the I_j’s are polynomials in x of degree at most mn as well. Now, the assumption (ii) ensures that the I_j’s actually lie in k[x^p][T]. This completes the proof.

###### Lemma 4

We assume that I_n is separable. There are at most mn(2n−1) points a (counted with their degree over k) such that at least one of the I_j(a,T)’s is not separable.

We have deg_T I_n ≤ n and deg_x I_n ≤ mn, since I_n divides G_n. Denote by Δ(x) the discriminant of I_n with respect to T. Its degree in x is at most mn(2n−1), and the assumption implies that Δ is not identically zero. For any a such that Δ(a) ≠ 0, the polynomial I_n(a,T) is separable, and the same holds for the I_j(a,T)’s, which divide it. Noting that k is perfect, the conclusion holds.
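The discriminant computation in the proof of Lemma 4 is easy to reproduce on a toy example. In the sketch below (over Q rather than a field of characteristic p, and with a hypothetical 2 × 2 matrix), the only bad evaluation point of χ(x, T) = T² − x is x = 0:

```python
import sympy as sp

x, T = sp.symbols('x T')
M = sp.Matrix([[0, 1], [x, 0]])              # a toy polynomial matrix
chi = (T * sp.eye(2) - M).det().expand()     # characteristic polynomial T**2 - x

# The discriminant w.r.t. T locates the points a at which chi(a, T)
# stops being separable.
disc = sp.discriminant(chi, T)               # vanishes only at x = 0
```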

###### Proposition 5

We keep the assumptions (i) and (ii), together with the separability assumption of Lemma 4. Let a_1, …, a_N be elements in a separable closure of k which are pairwise non-conjugate over k. We assume that for each i, there exists j with I_{j,a_i}(T) ≠ I_j(a_i,T). Then:

$$\sum_{i=1}^{N} \deg(a_i) \le 4mn(n-1) + mn(2n-1),$$

where deg(a_i) denotes the degree of a_i over k.

We use the criterion of Proposition 2. We start by putting away the values of a for which at least one of the I_j(a,T)’s is not separable. By Lemma 4, there are at most mn(2n−1) such values, counted with their degree. We then have to bound from above the degrees of the values of a such that the equalities:

$$\dim_{k(x)} \ker P_i(x, M(x))^e = \dim_{\ell} \ker P_i(a, M(a))^e$$

may fail for some i and some exponent of the form e = e_{i,j} + 1.

Let us fix such a pair (i, e) and set P = P_i for simplicity. By assumption (i), the entries of the matrix P(x, M(x))^e, after clearing denominators, have degree at most m_{i,e}. On the other hand, we deduce from assumption (ii) that the P_i’s all lie in k[x^p][T] and, as a consequence, that P(x, M(x))^e is similar to a matrix with coefficients in k(x^p). Set d = dim_{k(x)} ker P(x, M(x))^e. The equality above then fails if and only if the minors of P(x, M(x))^e of size n − d all vanish at a, i.e., if and only if the gcd of these minors is divisible by the minimal polynomial of a over k. Noting that k is perfect and that the relevant quantities lie in k[x^p], this gcd is (up to a factor) a p-th power, so the latter condition is also equivalent to the fact that the p-th power of the minimal polynomial of a divides the gcd in the ring k[x]. This can be possible for at most m_{i,e}/p values of a, counted with their degree.

Therefore, if a_1, …, a_N are pairwise non-conjugate “unlucky values” of a, the sum appearing in the statement of the proposition is bounded from above by:

$$(n-1)\sum_{i,e} m_{i,e} = m(n-1)\sum_{i,e} e \cdot \deg_T P_i + (n-1)\sum_{i,e} e \cdot \deg_x P_i.$$

We notice that, when i remains fixed, the exponents involved are of the form e_{i,j} + 1 (1 ≤ j ≤ n). The sum of these exponents is then at most:

$$\Big(\sum_{j=1}^{n-1} e_{i,j}\Big) + e_{i,n} + 1 = e_i + 1 \le 2 e_i,$$

where e_i denotes the multiplicity of the factor P_i in the characteristic polynomial. Our bound then becomes 2(n−1)·∑_i e_i (m·deg_T P_i + deg_x P_i). Using ∑_i e_i·deg_T P_i = n and ∑_i e_i·deg_x P_i ≤ mn yields the announced bound.

A refinement. For the applications we have in mind, we shall need a refinement of Proposition 5 under the following hypothesis, depending on a parameter μ:

(H_μ): the polynomial det(T·I_n − M) has degree at most pμ with respect to x.

We observe that (H_μ) is fulfilled when M is a companion matrix whose entries are polynomials of degree at most pμ.

###### Proposition 6

Under the assumptions of Prop. 5 and the additional hypothesis (H_μ), we have:

$$\sum_{i=1}^{N} \deg(a_i) \le 2\mu(2n-1) + \mu(2n-1).$$

Let P = P(x,T) be any bivariate polynomial with coefficients in k. Set g = P(x, M(x)) and let δ(x) denote the gcd of the minors of size s (for some integer s) of g. We claim that:

$$\deg_x \delta(x) \le p\mu \cdot \deg_T P + s \cdot \deg_x P. \qquad (7)$$

To prove the claim, we consider the Frobenius normal form F of M and set g̃ = P(x, F). Observe that any minor of F vanishes or is, up to sign, a product of coefficients of the invariant factors; noting that these coefficients have degree at most pμ in x, we derive degree bounds for all the minors of F. Now write P = ∑_i a_i(x)·T^i, where the a_i(x)’s lie in k[x]. Let f̃ denote the k(x)-linear endomorphism of E = k(x)^n attached to the matrix F, and set f̃_i = f̃^i; it clearly corresponds to F^i.

Given a vector space E and linear endomorphisms u_1, …, u_s of E, let us agree to define u_1 ∧ ⋯ ∧ u_s as the map

$$E^{\otimes s} \to \textstyle\bigwedge^{s} E, \qquad x_1 \otimes \cdots \otimes x_s \mapsto u_1(x_1) \wedge \cdots \wedge u_s(x_s),$$

where ⋀^s E is here defined as a quotient of E^{⊗s}. Expanding the exterior product ⋀^s g̃, we get:

$$\textstyle\bigwedge^{s} \tilde g = \sum_{i_1, \ldots, i_s = 0}^{\deg_T P} a_{i_1}(x) \cdots a_{i_s}(x) \cdot \tilde f_{i_1} \wedge \cdots \wedge \tilde f_{i_s}. \qquad (8)$$

Moreover, assuming for simplicity that i_1 ≤ ⋯ ≤ i_s and letting i_0 = 0 by convention, we can write f̃_{i_1} ∧ ⋯ ∧ f̃_{i_s} as the composition of pairwise commuting maps of the above type. We get that the entries of the matrix (in the canonical basis) of this composition all have bounded degree. The same argument demonstrates that the degrees of the entries of the above matrix are not greater than:

$$p\mu \cdot \max(i_1, \ldots, i_s) \le p\mu \cdot \deg_T P$$

when we no longer assume that the i_j’s are sorted in nondecreasing order. Therefore, back to Eq. (8), we find that the entries of ⋀^s g̃ have degree at most pμ·deg_T P + s·deg_x P. It is then also the case of its trace, which is the same as the trace of ⋀^s g, since g and g̃ are similar. This finally implies the claimed inequality (7), because δ(x) has to divide this trace.

The Proposition now follows by inserting the above input in the proof of Proposition 5.
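In matrix terms, the s-th exterior power manipulated in this proof is the matrix of s × s minors; its trace is the sum of the principal s-minors, which is the quantity divisible by δ(x). A small SymPy sketch:

```python
import sympy as sp
from itertools import combinations

def exterior_power(M, s):
    # Matrix of the s-th exterior power of M on the basis
    # e_{i_1} ^ ... ^ e_{i_s} (i_1 < ... < i_s): its entries are the
    # s x s minors of M, indexed by pairs of s-subsets of rows/columns.
    # Its trace is the sum of the principal s x s minors of M.
    n = M.shape[0]
    idx = list(combinations(range(n), s))
    return sp.Matrix(len(idx), len(idx),
                     lambda r, c: M[list(idx[r]), list(idx[c])].det())
```

Multiplicativity (Cauchy–Binet) and the identity ⋀ⁿM = (det M) are immediate sanity checks on this construction.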

### 2.4 Algorithms

We keep the matrix M satisfying the assumptions (i) and (ii) of §2.3. From now on, we assume that the only access we have to the matrix M passes through a black box invariant_factors_at that takes as input an element a lying in a finite extension of k and instantly outputs the invariant factors of the matrix M(a). Our aim is to compute the invariant factors of M. We will propose two possible approaches: the first one is deterministic but rather slow, while the second one is faster but probabilistic; it is a Monte-Carlo algorithm which may sometimes output wrong answers.

Throughout this section, the letter D refers to an a priori upper bound on the x-degree of the characteristic polynomial of M. One can of course always take D = mn, but better bounds might be available in particular cases. Similarly, we reserve the letter F for an upper bound on the sum of the degrees of the “unlucky evaluation points”. Proposition 5 tells us that F = 4mn(n−1) + mn(2n−1) is always an acceptable value. Remember however that this value can be lowered to F = 2μ(2n−1) + μ(2n−1) under the hypothesis (H_μ), thanks to Proposition 6. We will always assume that D ≤ F.

For simplicity of exposition, we assume from now on that k is a finite field of cardinality q (this is the most difficult case, and the one of most interest for us).

Deterministic. The discussion of §2.3 suggests the following algorithm, whose correctness follows directly from the definition of F together with the assumptions above.

Algorithm invariant_factors_deterministic

Input: M satisfying (i) and (ii), D, F with D ≤ F

Output: The invariant factors of M

1. Construct an extension ℓ of k of degree greater than F and pick an element a ∈ ℓ such that deg(a) = [ℓ : k]
   (Cost: O~(F) operations in k)

2. I_{1,a}, …, I_{n,a} ← invariant_factors_at(a)

3. for j = 1, …, n:

4.    Find I_j of x-degree at most D such that I_j(a,T) = I_{j,a}(T)

5. return I_1, …, I_n

###### Proposition 7

The algorithm above requires only one call to the black box invariant_factors_at, with an input of degree exactly [ℓ : k], which exceeds F.

Probabilistic. We now present a Monte-Carlo algorithm:

Algorithm invariant_factors_montecarlo

Input: M satisfying (i) and (ii), D, F, ε with 0 < ε < 1

Output: The invariant factors of M

1. Find the smallest integer s such that:

$$\frac{2(D+s+1)^2}{s\,(q^s - 2F)} + \frac{1}{2}\Big(\frac{4F}{q^s}\Big)^{(D-2)/s} \le \varepsilon \qquad (9)$$

   and set the parameters k and K accordingly

2. for i = 1, …, K:

3.    pick a_i at random such that deg(a_i) = s
      (Cost: O~(s) operations in k)

4.    I_{1,a_i}, …, I_{n,a_i} ← invariant_factors_at(a_i)

5. for j = 1, …, n:

6.    select S ⊆ {1, …, K} of cardinality k such that
      (i) the returned invariant factors agree for all i ∈ S,
      (ii) the a_i are pairwise non-conjugate for i ∈ S
      (Remark: if no such S exists, raise an error)

7.    compute I_j of x-degree at most D such that I_j(a_i, T) = I_{j,a_i}(T) for all i ∈ S
      (Cost: O~(kD) operations in k)

8. return I_1, …, I_n

###### Proposition 8

The integer s in Step 1 is well defined, i.e., Eq. (9) has a solution. Moreover:

Correctness: Algorithm invariant_factors_montecarlo fails or returns a wrong answer with probability at most ε.

Complexity: It performs K calls to the black box invariant_factors_at with inputs of degree s, together with additional operations in k.

The first assertion is left to the reader. Let U be the set of elements a of the extension of degree s of k such that deg(a) = s. It is an easy exercise to prove that U has at least q^s − 2q^{s/2} elements (the bound is not sharp). Let C be the number of conjugacy classes (under the Galois action) in U. Remark that each class has by definition s elements, so that C·s ≥ q^s − 2q^{s/2}. We say that a conjugacy class is bad if it contains one element a for which I_{j,a}(T) ≠ I_j(a,T) for some j; otherwise, we say that it is good. Let B (resp. G) be the number of bad (resp. good) classes. We have C = B + G, and B ≤ F/s by definition of F.

The algorithm invariant_factors_montecarlo succeeds if there exist at least k indices i for which the corresponding a_i’s lie in pairwise distinct good classes. This happens with probability at least:

$$\frac{1}{C^K} \sum_{j=k}^{K} \binom{K}{j}\, G(G-1)\cdots(G-k+1)\, G^{\,j-k}\, B^{\,K-j}.$$

(The above formula gives the probability that the first k good classes encountered are pairwise distinct, which is actually stronger than what we need.) The above quantity is at least equal to

$$\Big(1 - \frac{k}{G}\Big)^{k} \cdot \Big(1 - \sum_{j=0}^{k-1} \binom{K}{j} \Big(\frac{G}{C}\Big)^{j} \Big(\frac{B}{C}\Big)^{K-j}\Big).$$

Moreover, for j ≤ k − 1, we have:

$$\Big(\frac{G}{C}\Big)^{j}\Big(\frac{B}{C}\Big)^{K-j} \le \Big(\frac{BG}{C^2}\Big)^{j}\Big(\frac{B}{C}\Big)^{K-2j} \le \frac{1}{2^{2j}}\Big(\frac{2F}{q^s}\Big)^{K-2j} \le \frac{1}{2^{K}}\Big(\frac{4F}{q^s}\Big)^{K-2j} \le \frac{1}{2^{K}}\Big(\frac{4F}{q^s}\Big)^{(D-2)/s}.$$

Therefore the probability of success is at least:

$$\Big(1 - \frac{k}{G}\Big)^{k} \cdot \Big(1 - \frac{1}{2}\Big(\frac{4F}{q^s}\Big)^{(D-2)/s}\Big).$$

Using the inequality (1 − k/G)^k ≥ 1 − k²/G together with the lower bound on G, we find that the probability of failure is at most the left-hand side of Eq. (9). The correctness is proved. As for the complexity, the results are obvious.

## 3 Computing invariant factors of the p-curvature

Throughout this section, we fix a finite field k of cardinality q and characteristic p. We endow the field k(x) of rational functions with the natural derivation f ↦ f′.

### 3.1 The case of differential modules

We recall that a differential module over k(x) is a k(x)-vector space M endowed with an additive map ∂ : M → M satisfying the following Leibniz rule:

 ∀f∈k(x),∀m∈M,∂(fm)=f′⋅m+f⋅∂(m).

The p-curvature of a differential module (M, ∂) is the mapping ∂^p = ∂ ∘ ⋯ ∘ ∂ (p times). Using the fact that the p-th derivative of any element of k(x) vanishes, we derive from the Leibniz relation above that ∂^p is a k(x)-linear endomorphism of M. It follows moreover from [4, Remark 4.5] that ∂^p is defined over k(x^p), in the sense that there exists a basis of M in which the matrix of ∂^p has coefficients in k(x^p). In particular, all the invariant factors of the p-curvature have their coefficients in k(x^p).

Statement of the main Theorem. From now on, we fix a differential module (M, ∂). We assume that M is finite dimensional over k(x) and let r denote its dimension. We pick a basis of M and let A denote the matrix of ∂ with respect to this basis. We write A = (1/f_A)·Ã, where f_A ∈ k[x] and the entries of Ã all lie in k[x]. Let d be an upper bound on the degrees of all these polynomials. The aim of this section is to design fast deterministic and probabilistic algorithms for computing the invariant factors of the p-curvature of M. The following Theorem summarizes our results.

###### Theorem 9

We assume that p is sufficiently large with respect to d and r.

1. There exists a deterministic algorithm that computes the invariant factors of the p-curvature of M within

$$\tilde O\big(d^{\,\omega + \frac{3}{2}}\, r^{\,\omega + 2}\, \sqrt{p}\big)$$

operations in k.

2. Let ε ∈ (0, 1). There exists a Monte-Carlo algorithm computing the invariant factors of the p-curvature of M in

$$\tilde O\big(d^{\,\omega + \frac{1}{2}}\, r^{\,\omega} \cdot (dr - \log \varepsilon) \cdot \sqrt{p}\big)$$

operations in k. This algorithm returns a wrong answer with probability at most ε.

In what follows, we will use the notation A_p for the matrix of the p-curvature of M with respect to the distinguished basis. Given an element a lying in a finite extension of k, we denote by A_p(a) the matrix deduced from A_p by evaluating its entries at a.

The similarity class of A_p(a). Let E(x) be an irreducible polynomial over k. Set ℓ = k[x]/E(x) and let a denote the image of the variable x in ℓ. We assume that E does not divide f_A, i.e., f_A(a) ≠ 0. The first ingredient we need is the construction of an auxiliary matrix which is similar to A_p(a). This construction comes from our previous paper [4]. Let us recall it briefly. We define the ring ℓ[[t]]_{dp} of Hurwitz series, whose elements are formal infinite sums of the shape:

$$a_0 + a_1 \gamma_1(t) + a_2 \gamma_2(t) + \cdots + a_n \gamma_n(t) + \cdots \qquad (10)$$

and on which the addition is straightforward and the multiplication is governed by the rule γ_i(t)·γ_j(t) = binom(i+j, i)·γ_{i+j}(t). (The symbol γ_n(t) should be thought of as t^n/n!.) We moreover endow ℓ[[t]]_{dp} with the derivation defined by γ_n(t)′ = γ_{n−1}(t) (with the convention that γ_{−1} = 0) and with the projection map sending the series given by Eq. (10) to its constant coefficient a_0. We shall often use the alternative notation h(0) for the image of a Hurwitz series h under this projection. If h is given by the series (10), we then have h^{(n)}(0) = a_n for all nonnegative integers n. We have a homomorphism of rings:

$$\psi_S : k[x]\Big[\frac{1}{f_A}\Big] \to \ell[[t]]_{dp}, \qquad g(x) \mapsto \sum_{i=0}^{p-1} g^{(i)}(a)\, \gamma_i(t).$$

It is easily checked that ψ_S commutes with the derivation. We can then consider the differential module over ℓ[[t]]_{dp} obtained from M by scalar extension. By definition, it corresponds to the differential system Y′ = ψ_S(A)·Y.
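The arithmetic of truncated Hurwitz series is straightforward to sketch; the product rule γ_i·γ_j = binom(i+j, i)·γ_{i+j} is the only subtlety. Below, a Hurwitz series is simply a list of coefficients on the basis γ_0, …, γ_{n−1} with entries in F_p (a minimal sketch; the derivation is just a left shift of this list).

```python
from math import comb

def hurwitz_mul(a, b, p, n):
    # Product of two truncated Hurwitz series over F_p, given as
    # coefficient lists on the divided-power basis gamma_0..gamma_{n-1},
    # using gamma_i * gamma_j = binom(i+j, i) * gamma_{i+j}.
    c = [0] * n
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j < n:
                c[i + j] = (c[i + j] + ai * bj * comb(i + j, i)) % p
    return c

def hurwitz_deriv(a):
    # gamma_n' = gamma_{n-1}: the derivation shifts coefficients left.
    return a[1:] + [0]
```

Note that γ_1·γ_{p−1} = binom(p, 1)·γ_p vanishes in characteristic p, which is exactly why these divided powers behave better than plain power series for the Cauchy–Lipschitz analogue.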

The benefit of working over ℓ[[t]]_{dp} is the existence of an analogue of the well-known Cauchy–Lipschitz Theorem [4, Proposition 3.4]. This notably implies the existence of a fundamental matrix of solutions, i.e., an r × r matrix Y_S with entries in ℓ[[t]]_{dp} satisfying:

$$Y_S' = \psi_S(A) \cdot Y_S \quad\text{and}\quad Y_S(0) = I_r \qquad (11)$$

with I_r the identity matrix of size r. Moreover, as explained in more detail later, the construction of Y_S is effective.

For any integer n, we let Y_S^{(n)} denote the matrix obtained from Y_S by taking the n-th derivative entry-wise. The next proposition is a consequence of [4, Proposition 4.4].

###### Proposition 10

The matrices A_p(a) and Y_S^{(p)}(0) are similar over ℓ.

Fast computation of Y_S^{(p)}(0). We recall that Y_S is defined as the solution of the system (11). Remembering that we have written A = (1/f_A)·Ã, we obtain the relation:

$$\psi_S(f_A) \cdot Y_S' = \psi_S(\tilde A) \cdot Y_S. \qquad (12)$$

Write and