# Fast Computation of the Nth Term of an Algebraic Series over a Finite Prime Field

We address the question of computing one selected term of an algebraic power series. In characteristic zero, the best algorithm currently known for computing the Nth coefficient of an algebraic series uses differential equations and has arithmetic complexity quasi-linear in √(N). We show that over a prime field of positive characteristic p, the complexity can be lowered to O( N). The mathematical basis for this dramatic improvement is a classical theorem stating that a formal power series with coefficients in a finite field is algebraic if and only if the sequence of its coefficients can be generated by an automaton. We revisit and enhance two constructive proofs of this result for finite prime fields. The first proof uses Mahler equations, whose sizes appear to be prohibitively large. The second proof relies on diagonals of rational functions; we turn it into an efficient algorithm, of complexity linear in N and quasi-linear in p.

## Authors

• 16 publications
• 2 publications
• 3 publications
• ### Fast Coefficient Computation for Algebraic Power Series in Positive Characteristic

We revisit Christol's theorem on algebraic power series in positive char...
06/18/2018 ∙ by Alin Bostan, et al. ∙ 0

• ### Quasi-equivalence of heights in algebraic function fields of one variable

For points (a,b) on an algebraic curve over a field K with height 𝔥, the...
11/25/2021 ∙ by Ruyong Feng, et al. ∙ 0

• ### On the Computation of Neumann Series

This paper proposes new factorizations for computing the Neumann series....
07/18/2017 ∙ by Vassil Dimitrov, et al. ∙ 0

• ### The Complexity of Computing all Subfields of an Algebraic Number Field

For a finite separable field extension K/k, all subfields can be obtaine...
06/03/2016 ∙ by Jonas Szutkoski, et al. ∙ 0

• ### Symbolic computation of hypergeometric type and non-holonomic power series

A term a_n is m-fold hypergeometric, for a given positive integer m, if ...
02/08/2021 ∙ by Bertrand Teguia Tabuguia, et al. ∙ 0

• ### Wilf classes of non-symmetric operads

Two operads are said to belong to the same Wilf class if they have the s...
05/19/2021 ∙ by Andrey T. Cherkasov, et al. ∙ 0

• ### Algebraic Diagonals and Walks: Algorithms, Bounds, Complexity

The diagonal of a multivariate power series F is the univariate power se...
10/15/2015 ∙ by Alin Bostan, et al. ∙ 0

##### This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

## 1 Introduction

One of the most difficult questions in modular computations is the complexity of computations for a large prime  of coefficients in the expansion of an algebraic function.

D.V. Chudnovsky & G.V. Chudnovsky, 1990 [11].

Context. Algebraic functions are ubiquitous in all branches of pure and applied mathematics, notably in algebraic geometry, combinatorics and number theory. They also arise at the confluence of several fields in computer science: functional equations, automatic sequences, complexity theory. From a computer algebra perspective, a fundamental question is the efficient computation of power series expansions of algebraic functions. We focus on the particular question of computing one selected term of an algebraic power series  whose coefficients belong to a field of positive characteristic . Beyond its relevance to complexity theory, this problem is important in applications to integer factorization and point-counting [4].

Setting. More precisely, we assume in this article that the ground field is the prime field  and that the power series  is (implicitly) given as the unique solution in  of

 E(x,f(x))=0,f(0)=0,

where  is a polynomial in that satisfies

 E(0,0)=0andEy(0,0)≠0.

(Here, and hereafter, stands for the partial derivative .)

Given as input the polynomial and an integer , our algorithmic problem is to efficiently compute the th coefficient  of . The efficiency is measured in terms of number of arithmetic operations in the field , the main parameters being the index , the prime  and the bidegree of with respect to .

In the particular case , the algebraic function is actually a rational function, and thus the th coefficient of its series expansion can be computed in operations in , using standard binary powering techniques [20, 14].

Therefore, it will be assumed in all that follows that .

Previous work and contribution. The most straightforward method for computing the coefficient of the algebraic power series  proceeds by undetermined coefficients. Its arithmetic complexity is . Kung and Traub [19] showed that the formal Newton iteration can accelerate this to . (The soft-O notation indicates that polylogarithmic factors are omitted.) Both methods work in arbitrary characteristic and compute  together with all .

In characteristic zero, it is possible to compute the coefficient faster, without computing all the previous ones. This result is due to the Chudnovsky brothers [10] and is based on the classical fact that the coefficient sequence satisfies a linear recurrence with polynomial coefficients [12, 9, 3]. Combined with baby steps/giant steps techniques, this leads to an algorithm of complexity quasi-linear in . Except for the very particular case of rational functions (), no faster method is currently known.

Under restrictive assumptions, the baby step/giant step algorithm can be adapted to the case of positive characteristic [4]. The main obstacle is the fact that the linear recurrence satisfied by the sequence has a leading coefficient that may vanish at various indices. In the spirit of [4, §8], -adic lifting techniques could in principle be used, but we are not aware of any sharp analysis of the sufficient -adic precision in the general case. Anyways, the best that can be expected from this method is a cost quasi-linear in .

We attack the problem from a different angle. Our starting point is a theorem due to the second author in the late 1970s [2, Th. 12.2.5]. It states that a formal power series with coefficients in a finite field is algebraic if and only if the sequence of its coefficients can be generated by a -automaton, i.e., is the output of a finite-state machine taking as input the digits of in base . This implicitly contains the roots of a -method for computing , but the size of the -automaton is at least , see e.g. [22].

In its original version [7] the theorem was stated for , but the proof extends mutatis mutandis to any finite field. A different proof was given by Christol, Kamae, Mendès-France and Rauzy [8, §7]. Although constructive in essence, these proofs do not focus on computational complexity aspects.

Inspired by [1] and [13], we show that each of them leads to an algorithm of arithmetic complexity  for the computation of , after a precomputation that may be costly for large. On the one hand, the proof in [8] relies on the fact that the sequence satisfies a divide-and-conquer recurrence. However, we show (Sec. 2) that the size of the recurrence is polynomial in , making the algorithm uninteresting even for very moderate values of . On the other hand, the key of the proof in [7] is to represent  as the diagonal of a bivariate rational function. We turn it (Sections 34) into an efficient algorithm that has complexity after a precomputation whose cost is only. To our knowledge, the only previous explicit occurrence of a -type complexity for this problem appears in [11, p. 121], which announces the bound but without proof.

Structure of the paper. In Sec. 2, we propose an algorithm that computes the th coefficient using Mahler equations and divide-and-conquer recurrences. Sec. 3 is devoted to the study of a different algorithm, based on the concept of diagonals of rational functions. We conclude in Sec. 4 with the design of our main algorithm.

Cost measures. We use standard complexity notation. The real number denotes a feasible exponent for matrix multiplication i.e., there exists an algorithm for multiplying matrices with entries in  in operations in . The best bound currently known is from [16]. We use the fact that many arithmetic operations in , the set of polynomials of degree at most  in , can be performed in  operations: addition, multiplication, division, etc. The key to these results is a divide-and-conquer approach combined with fast polynomial multiplication [23, 5, 18]. A general reference on fast algebraic algorithms is [17].

## 2 Using Mahler Equations

We revisit, from an algorithmic point of view, the proof in [8, §7] of the fact that the coefficients of an algebraic power series can be recognized by a -automaton. Starting from an algebraic equation for , we first compute a Mahler equation satisfied by , then we derive an appropriate divide-and-conquer recurrence for its coefficient sequence , and use it to compute  efficiently. We will show that, although the complexity of this method is very good (i.e., logarithmic) with respect to the index , the computation cost of the recurrence, and actually its mere size, are extremely high (i.e., exponential) with respect to the algebraicity degree .

### 2.1 From algebraic equations to Mahler equations and DAC recurrences

Mahler equations are functional equations which use the Mahler operator , or  for short, acting on formal power series by substitution: . Their power series solutions are called Mahler series. In characteristic , there is no a priori link between algebraic equations and Mahler equations; moreover, a series which is at the same time algebraic and Mahler is a rational function [21, §1.3].

By contrast, over , a Mahler equation is nothing but an algebraic equation, because of the Frobenius endomorphism that permits writing for any . Hence a Mahler series in is an algebraic series. Conversely, an algebraic series is Mahler, since if  is algebraic of degree  then the power series , , …, are linearly dependent over .

Concretely, the successive powers  () are expressed as linear combinations of , , , over , and any dependence relation between them delivers a nontrivial Mahler equation with polynomial coefficients

 c0(x)f(x)+c1(x)f(xp)+⋯+cK(x)f(xpK)=0, (1)

whose order  is at most .

###### Example 1 (A toy example)

Consider in . Let be the unique solution in of . The expressions of , , , as linear combinations of , , with coefficients in  give the columns of the matrix

 ⎡⎢ ⎢ ⎢ ⎢ ⎢ ⎢⎣0x−2x7+x5+xx41−x37+⋯\omit\span\omit\span\omit\span\@@LTX@noalign\vskip6.0ptplus2.0ptminus2.0pt\omit11x8−x6+1x40−2x38+⋯\omit\span\omit\span\omit\span\@@LTX@noalign\vskip6.0ptplus2.0ptminus2.0pt\omit0x−2x7+x5+xx41−x37+⋯⎤⎥ ⎥ ⎥ ⎥ ⎥ ⎥⎦.

The first three columns are linearly dependent and lead to the Mahler equation

 x4(1−x2−x4)f(x)−(1+x4−2x6)f(x5)+f(x25)=0. (2)

A Mahler equation instantly translates into a recurrence of divide-and-conquer type, in short a DAC recurrence. Such a recurrence links the value of the index- coefficient  with some values for the indices , …, and some shifted indices. The translation is absolutely simple: a term in equation (1) translates into ; more generally, a term becomes , with the convention that a coefficient  is zero if  is not a nonnegative integer.

###### Example 2 (A binomial case)

Let be a prime number and let be the algebraic series in  solution of . The rewriting

 x+(1+f)p−1−1=x+1+fp1+f−1=(x−1)(1+f)+(1+fp)1+f

yields and ; hence . The situation is highly non-generic, since all powers  are expressible as linear combinations of  and  only. This delivers the second-order Mahler equation

 (xp−1−xp)f(x)−(1+xp−1−xp)f(xp)+f(xp2)=0,

which translates into a recurrence on the coefficients sequence

 fn−p+1−fn−p−fnp−fn−p+1p+fn−pp+fnp2=0. (3)

It is possible to make the recurrence more explicit by considering the  cases according to the value of the residue of  modulo .

### 2.2 Nth coefficient via a DAC recurrence

The DAC recurrence relation attached to the Mahler equation (1), together with enough initial conditions, can be used to compute the th coefficient  of the algebraic series . The complexity of the resulting algorithm is linear with respect to . The reason is that the coefficient  in (1) generally has more than one monomial, and as a consequence, all the coefficients , are needed to compute .

###### Example 3 (A binomial case, cont.)

We specialize Ex. 2 by taking and compute  with . Applying (3) to , we need to know the value for , next for , , , and also for , , , , , . In the end, it appears that all values of for the indices smaller than  are needed to compute .

It is possible to decrease dramatically the cost of the computation of , from linear in  to logarithmic in . The idea is to derive from (1) another Mahler equation with the additional feature that its trailing coefficient is 1. To do so, it suffices to perform a change of unknown series. Putting , we obtain the Mahler equation

 g(x)+c1(x)c0(x)p−2g(xp)+⋯+cK(x)c0(x)pK−2g(xpK)=0. (4)

Obviously this approach assumes that  is not zero. But, as proved in [2, Lemma 12.2.3], this is the case if (1) is assumed to be a minimal-order Mahler equation satisfied by .

The series  is no longer a power series, but a Laurent series in . We cut it into two parts, its negative part  in and its nonnegative part  in :

 g−(x)=∑n<0gnxn,h(x)=∑n≥0gnxn.

Let us rewrite Eq. (4) in the compact form , where

is a skew polynomial in the variable

and in the Mahler operator . Plugging in it yields three terms. The first is the power series . The second is the nonnegative part  of , in . The third is the negative part of , and it is zero because it is the only one which belongs to . Eq. (4) is rewritten as a new (inhomogeneous) Mahler equation , namely

 h(x)+c1(x)c0(x)p−2h(xp)+⋯+cK(x)c0(x)pK−2h(xpK)=b(x). (5)
###### Example 4 (A toy example, cont.)

The change of series in (2) gives the new equation

 h(x)−x12(1−x)(1+x)(1+x2+2x4)(1−x2−x4)3h(x5)+x92(1−x2−x4)23h(x25)=−x−x5+x7+2x9+⋯+x149−x157−2x159

and a recurrence

 hn=hn−125+2hn−145+⋯−hn−9225+⋯,

while  is given by

The right-hand side of this new inhomogeneous equation is obtained as the nonnegative part of , where , , are the coefficients of Eq. (2), and where , the negative part of , can be determined starting from .

To compute the coefficient  for , this approach only requires the computation of thirteen terms of the sequence , namely for . This number of terms behaves like , and compares well to the use of a recurrence like (3), which requires terms.

At this point, it is intuitively plausible that the th coefficient can be computed in arithmetic complexity : to obtain , we need a few values  and these ones are essentially obtained from a bounded number of , ….

### 2.3 Nth coefficient via the section operators

This plausibility can be strengthened and made clearer with the introduction of the section operators (sometimes called Cartier operators, or decimation operators) and the concept of -rational series.

###### Definition 1

The section operators , with respect to the radix , are defined by

 Sr∑n≥0fnxn=∑k≥0fpk+rxk,for0≤r

In other words, there is a section operator for each digit of the radix- numeration system and the th section operator extracts from a given series its part associated with the indices congruent to the digit  modulo . These operators are linear and well-behaved with respect to the product:

 Sr[f(x)g(x)]=∑s+t≡rmodpx⌊s+tp⌋Ssf(x)Stg(x).

In particular, for the previous formula becomes

 Sr[f(x)h(xp)]=Sr[f(x)]×h(x), (7)

because of the obvious relationships with the Mahler operator , for . The action of the section operators can be extended to the field  of formal Laurent series and the same properties apply to the extended operators, which we denote in the same way.

The section operators permit to express the coefficient  of a formal power series  using the radix- digits of .

###### Lemma 2

Let be in and let be the radix- expansion of . Then

 hN=(SNℓ⋯SN1SN0h)(0). (8)

We first apply the section operator  to the series  associated with the least significant digit of , that is we start from  and we pick every th coefficient of . This gives Iterating this process with each digit produces the series

 SNℓ⋯SN1SN0h(x)=hN+hN+pℓ+1x+⋯,

whose constant coefficient is .

### 2.4 Linear representation

Let us return to the algebraic series  and its relative . The Mahler equation (5) can be rewritten

 h(x)=b(x)+a1(x)h(xp)+⋯+aK(x)h(xpK). (9)

The coefficients  and , , are polynomials of degrees at most

, say. As a matter of fact, the vector space

of linear combinations

 s(x)=a(x)+b0(x)h(x)+b1(x)h(xp)+⋯+bK(x)h(xpK)

with  and all  in  is stable under the action of the section operators. Indeed, from

 s(x)=(a(x)+b0(x)b(x))+(b0(x)a1(x)+b1(x))h(xp)+⋯+(b0(x)aK(x)+bK(x))h(xpK)

we deduce

 Srs(x)=Sr(a(x)+b0(x)b(x))+Sr(b0(x)a1(x)+b1(x))h(x)+⋯+Sr(b0(x)aK(x)+bK(x))h(xpK−1), (10)

and the polynomials and , , have degrees not greater than .

The important consequence of this observation is that we have produced a finite dimensional -vector space  that contains  and that is stable under the sections . This will enable us to effectively compute the coefficient using formula (8), by performing matrix computations.

The vector space  is spanned over by the family . Let  be a matrix of  with respect to , let  be the column vector containing only zero entries except an entry equal to 1 at index , and  be the row matrix of the evaluations at  of the elements of .

With these notation, Eq. (8) rewrites in matrix terms:

 hN=LANℓ⋯AN1AN0C. (11)
###### Definition 3

The family of matrices , is a linear representation of the series . A power series that admits a linear representation is said to be -rational.

As it was pointed out in [1, p. 178], an immediate consequence of Eq. (11) is that the coefficient can be computed using only matrix-vector products in size , therefore in arithmetic complexity .

Actually, the matrices are structured, and their product by a vector can be done faster than quadratically. Indeed, Eq. (10) shows that one can perform a matrix-vector product by the matrix  using polynomial products in , thus in quasi-linear complexity . We deduce:

###### Proposition 4

The th coefficient of the solution  of (9) can be computed in time .

### 2.5 Nth coefficient using Mahler equations

We are ready to prove the main result of this section.

###### Theorem 5

Let be a polynomial in such that and , and let be its unique root with . Algorithm 1 computes the th coefficient  of  using operations in .

To do this, we first need a preliminary result that will also be useful in Sec. 4. It provides size and complexity bounds for the remainder of the Euclidean division of a monomial in  by a polynomial in  with coefficients in .

###### Lemma 6

Let be a polynomial in of degree  in and at most in . Then, for ,

 yDmodE=1eD−d+1d(r0(x)+⋯+rd−1(x)yd−1),

where the ’s are in of degree at most .

One can compute the ’s using operations in .

The degree bounds are proved by induction on . The computation of can be performed by binary powering. The last step dominates the cost of the whole process. It amounts to first computing the product of two polynomials in of degree at most in and of degree in , then reducing the result modulo . Both steps have complexity : the multiplication is done via Kronecker’s substitution [17, Cor. 8.28], the reduction via an adaptation of the Newton iteration used for the fast division of univariate polynomials [17, Th. 9.6].

The first step of Algorithm 1 is the computation of a Mahler equation for the algebraic series . We begin by analyzing the sizes of this equation, and the cost of its computation.

###### Theorem 7

Let be a polynomial in such that and , and let be its unique root with . Algorithm 2 computes a Mahler equation of minimal-order satisfied by using operations in . The output equation has order at most and polynomial coefficients in of degree at most .

We first prove the correctness part. Since the ’s all live in the vector space generated by over , the algorithm terminates and returns a Mahler operator of order at most .

At step we have , thus . Since , this yields , so is a Mahler operator for . It has minimal order, because otherwise would be linearly dependent over . Therefore, the algorithm is correct.

By Lemma 6, is in of degree in at most and degree in at most . Moreover, can be computed in time .

By Cramer’s rule, the degrees of the ’s are bounded by . Testing the rank condition, and solving for the ’s amounts to polynomial linear algebra in size at most and degree at most . This can be done in complexity , using Storjohann’s algorithm [24].

We are now ready to prove Theorem 5.

[of Theorem 5] The correctness of Algorithm 1 follows from the discussion in Sec. 2.4. By Theorem 7, the output of Step 1 is a Mahler equation of order and coefficients of maximum degree . In particular, . The cost of Step 1 is . Step 2 consists in computing the coefficients for , of maximum degree . It can be performed in time . Step 3 has negligible complexity using the Kung-Traub algorithm [19]. Step 4 requires operations. By Proposition 4 the last steps of the algorithm can be done using for each , thus for a total cost of . This dominates the whole computation.

Generically, the size bounds in Theorem 5 are quite tight. This is illustrated by the following example.

###### Example 5 (A cubic equation)

Let us consider the algebraic equation in . The assumption is fulfilled. We find a Mahler equation of order , whose coefficients , , , have respectively heights , , , . We compute within precision  the solution , and the negative part of . We find a new operator by computing the product in  and next a new equation for the nonnegative part  of  by computing the polynomial . It appears that the coefficients , , , and the right-hand side  have respectively heights , , , , and . We arrive at a linear representation with size for . If we increase the prime number , we successively find sizes for , and  for . We do not pursue further because it is clear that such sizes make the computation unfeasible.

## 3 Using Diagonals

To improve the arithmetic complexity with respect to the prime number , we need a different way to represent the algebraic series . It is given by Furstenberg’s theorem, and relies on the concept of diagonal of rational functions.

### 3.1 From algebraic equation to diagonal

Furstenberg’s theorem [15] tells us that every algebraic series is the diagonal of a bivariate rational function

 f(x)=Da(x,y)b(x,y),b(0,0)≠0. (12)

This means that , where .

Moreover, under the working assumption , the rational function  is explicit in terms of :

 a(x,y)=yEy(xy,y),b(x,y)=E(xy,y)/y.

(Notice that .) If the polynomial has degree  and height , then the partial degrees of and  are bounded as follows

 dx=max(degxa,degxb)≤h,dy=max(degya,degyb)≤d+h. (13)
###### Example 6 (A quartic equation)

Consider

 f(x)=x+x3+x4+3x5+5x6+2x7+4x8−x9−x10−2x11+⋯,

the unique solution in  of the algebraic equation

 E(x,y)=−x+(1+x)y−(1+x2)y2−y3+(1+x)y4=0.

Then is the diagonal of the rational function with

 a=y−(2−x)y2−3y3+(4−2x2)y4+4xy5,
 b=(1−x)−(1−x)y−y2+(1−x2)y3+xy4.

Fig. 1 shows pictorially that  is the diagonal of .

We will follow the path drawn in [6], and compute the th coefficient  using as intermediate data structure.

### 3.2 Nth coefficient via diagonals

Bivariate section operators can be defined by mimicking (6). Although they depend on two residues modulo , we will always use them with the same residue for both variables; thus we will simply denote them by  for . A nice feature is that they commute with the diagonal operator,

 SrD=DSr.

Together with (12), this property translates the action of the section operators from the algebraic series  to the rational function . Moreover, property (7) extends to bivariate section operators and justifies the following computation

 Sra(x,y)b(x,y)=Sra(x,y)b(x,y)p−1b(x,y)p=Sra(x,y)b(x,y)p−1b(xp,yp)=Sra(x,y)b(x,y)p−1b(x,y).

To put it plainly

 Srab=Srabp−1b.

This yields an action on bivariate polynomials  of what we may call pseudo-section operators, defined by

 Trv=Srvbp−1,0≤r

hence

 (Sr1⋯Srk)(ab)=(Tr1⋯Trk)(a)b,0≤ri

At this point, it is not difficult to see that the vector subspace , where  and  are the maximal partial degrees of  and  defined in (13), is stable under the pseudo-section operators. The supports of  and of polynomials  in