 # Improved Polynomial Remainder Sequences for Ore Polynomials

Polynomial remainder sequences contain the intermediate results of the Euclidean algorithm when applied to (non-)commutative polynomials. The running time of the algorithm is dependent on the size of the coefficients of the remainders. Different ways have been studied to make these as small as possible. The subresultant sequence of two polynomials is a polynomial remainder sequence in which the size of the coefficients is optimal in the generic case, but when taking the input from applications, the coefficients are often larger than necessary. We generalize two improvements of the subresultant sequence to Ore polynomials and derive a new bound for the minimal coefficient size. Our approach also yields a new proof for the results in the commutative case, providing a new point of view on the origin of the extraneous factors of the coefficients.


## 1 Introduction

When given a system of differential equations, one might be interested in finding the common solutions of these equations. In order to do so, one can compute another differential equation whose solution space is the intersection of the solution spaces of the equations in the original system. One way to do this is to translate the equations into operators and use the Euclidean algorithm to compute their greatest common right divisor. The solution space of the greatest common right divisor then consists of the desired elements.

Similarly, given a sequence of numbers that satisfies two different recurrence equations, the Euclidean algorithm is used in applications to find a reasonable candidate for the least-order equation of which the sequence is a solution.

Carrying out Euclid's algorithm on two polynomials over a domain usually requires predicting the denominators that might appear in the coefficients of the remainders, in order to bypass costly computations in the quotient field of that domain. While such a prediction is easy, the growth of the coefficients of the remainders can be tremendous, which might result in an unnecessarily high running time. This can be avoided by dividing out possible content of the remainders to make their coefficients as small as possible. For commutative polynomials as well as for non-commutative operators, different ways have been extensively studied to find factors of the content in the sequence of remainders without computing the GCD of the coefficients of each element of the sequence. Most notable in this respect are subresultant sequences, where the growth of the coefficients can be reduced from exponential to linear in the number of reduction steps in the Euclidean algorithm. For generic, randomly generated input, the coefficient size in the subresultant sequence is usually optimal, but for input from applications in, e.g., combinatorics or physics, the remainders still have non-trivial content in many cases.

For commutative polynomials, some ways are known to improve on subresultants. In this article we generalize two of these results to Ore polynomials and we also give a new proof for the commutative case that is based on the structure of subresultants as matrix determinants. Furthermore, we use these results to derive a new bound for the coefficient size of the content-free remainders.

In Section 2 the basic notions of Ore polynomial rings are stated. A precise definition and examples of polynomial remainder sequences are given in Section 3 and further details on the subresultant sequence are then presented in Section 4. The main results of this article can be found in Sections 5 and 6, where we first describe how additional content in the subresultant sequence can emerge and then use these results to improve on the Euclidean algorithm and to get a new bound for the size of the coefficients.

## 2 Preliminaries

The algebraic framework for the different kinds of operators that we consider here is that of Ore polynomial rings, which were introduced by Øystein Ore in the 1930s. We provide an overview of some basic facts that suffice for our needs and that can be found in Ore (1933) and Bronstein and Petkovšek (1996). Let be a commutative domain, the set of univariate polynomials over and let be an injective endomorphism.

1. A map is called pseudo-derivation w.r.t. , if for any

2. Suppose that is a pseudo-derivation w.r.t. . We define the Ore polynomial ring with componentwise addition and the unique distributive and associative extension of the multiplication rule

to arbitrary polynomials in . To clearly distinguish this ring from the standard polynomial ring over , we denote it by .

Elements of an Ore polynomial ring are called operators and are denoted by capital letters. We refer to the leading coefficient of an operator as , to the coefficient of in as and to the polynomial degree of in as the order of .

Commonly used Ore polynomial rings are:

1. , the ring of commutative polynomials over .

2. , the ring of linear ordinary differential operators.

3. If is the forward shift in , i.e. , then is the ring of linear ordinary recurrence operators.

4. If is the -shift in , i.e. , then is the ring of Jackson’s -derivative operators.

In this article, we consider the following situation: Let be a Euclidean domain with degree function and let be an Ore polynomial ring where is an automorphism. For any operator , we define to be the maximal coefficient degree of . The content of is the greatest common divisor of all the coefficients of and it is defined to be if is a field. It is possible to extend to an Ore polynomial ring over the quotient field  of by setting and  for (see Li (1996), Proposition 2.2.1). We will denote this ring by without making it explicit that the automorphism and the pseudo-derivation are extensions of the functions used in . It is well known that for any two operators , there exists a greatest common right divisor (GCRD) and it can be made unique (up to units in ) by setting to a nonzero - left multiple of any GCRD of and  that has coefficients in but does not have any content in .

Throughout this article, we let , be such that and  is the GCRD of and .

For and , is obtained by applying times to and , where is the inverse map of . The th -factorial of is defined as the product

$$a^{[n]} := \prod_{i=0}^{n-1} \sigma^i(a).$$
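The σ-factorial can be made concrete for the forward shift on polynomials in `n`. The following is a minimal sketch under stated assumptions: polynomials are plain integer coefficient lists (lowest degree first), and the helper names `shift`, `mul` and `sigma_factorial` are illustrative choices, not notation from the article.

```python
from math import comb

def shift(p, s=1):
    """Apply sigma^s for the forward shift, i.e. p(n) -> p(n + s).
    p is a coefficient list, lowest degree first."""
    out = [0] * len(p)
    for k, c in enumerate(p):
        # expand c * (n + s)^k with the binomial theorem
        for j in range(k + 1):
            out[j] += c * comb(k, j) * s ** (k - j)
    return out

def mul(p, q):
    """Product of two coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def sigma_factorial(a, n):
    """n-th sigma-factorial: a * sigma(a) * ... * sigma^(n-1)(a)."""
    result = [1]
    for i in range(n):
        result = mul(result, shift(a, i))
    return result

# (n + 16)^[2] = (n + 16)(n + 17) = n^2 + 33 n + 272
print(sigma_factorial([16, 1], 2))  # [272, 33, 1]
```

For the identity automorphism (the commutative case) the same product collapses to an ordinary power.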

## 3 Polynomial Remainder Sequences for Ore Polynomials

The greatest common right divisor of and  can be computed by using the Euclidean algorithm. If we multiply any intermediate result that appears during the execution of the algorithm by an element of , the final output will be a -left multiple of . This amount of freedom allows us to optimize the running time by choosing these factors appropriately. In order to be able to formulate improvements of this kind, the notion of polynomial remainder sequences has been introduced. Each element of such a sequence corresponds to a remainder computed in one iteration of the Euclidean algorithm. Let and  be sequences in , a sequence in and let and  be sequences in such that

$$R_0 = A,\quad R_1 = B,\quad d_i = d_{R_i},\qquad \alpha_i R_{i-1} = Q_i R_i + \beta_i R_{i+1},\quad d_{i+1} < d_i$$

and all are nonzero except for . We call the sequence a polynomial remainder sequence (PRS) of and .

A PRS of and  is uniquely determined by specifying the and . Whenever we talk about a PRS , we allow ourselves to refer to the related sequences , etc. as in the above definition without explicitly introducing them.

In order to efficiently compute , one wants to make sure that all the remainders are elements of rather than . This can be achieved by choosing the in a way such that the quotient of any two consecutive remainders has coefficients in . To this extent, for set and division with remainder yields and  in with:

$$\alpha_i R_{i-1} = Q_i R_i + R_{i+1},\quad d_{i+1} < d_i. \tag{1}$$

We call the pseudo-quotient of and  and  the pseudo-remainder of and .

The are used to make sure that computations can be done in and the control the coefficient growth in a PRS. We want to contain as many factors of the content of as possible without much computational overhead needed to obtain these factors.

Set and

1. . This is called the pseudo PRS of and . Here, no content will be divided out.

2. . This is called the primitive PRS of and . The coefficients of the remainders will be as small as possible, but it is necessary to compute the GCD of the coefficients of each remainder in order to get the .

3. The subresultant PRS of and  (see Section 4) is given by

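$$\beta_i = \begin{cases} -\sigma(\psi_1)^{[d_0-d_1]}, & \text{if } i = 1,\\ -\operatorname{lc}(R_{i-1})\,\sigma(\psi_i)^{[d_{i-1}-d_i]}, & \text{if } 2 \le i \le \ell, \end{cases}$$

where

$$\psi_i = \begin{cases} -1, & \text{if } i = 1,\\ \dfrac{(-\operatorname{lc}(R_{i-1}))^{[d_{i-2}-d_{i-1}]}}{\sigma(\psi_{i-1})^{[d_{i-2}-d_{i-1}-1]}}, & \text{if } 2 \le i \le \ell. \end{cases}$$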

In this PRS, the content that is generated systematically by pseudo-remaindering will be cleared from the remainders.

While in all of the above PRSs the remainders are elements of , the degrees of the coefficients differ drastically, as illustrated in the following example. It can be shown that the degrees of the coefficients in the pseudo PRS grow exponentially with , which renders this PRS practically useless. The growth in the subresultant and primitive PRS is linear in .

Assume we are given a finite sequence of rational numbers that comes from a sequence which admits a linear recurrence equation with polynomial coefficients. If the amount of data is sufficiently large, we are able to guess recurrence operators of some fixed order and maximal coefficient degree that annihilate , i.e. the operators applied to the sequence give zero. (For details on guessing and a Mathematica implementation of the method, see Kauers (2009).) For example, consider

 tn=n∑k=0(2n+4k)+(2n−k)!+k3.

Given the first 300 terms of this sequence, we can find two operators and  in with , and maximal coefficient degree , resp. Both operators annihilate the given sequence, but neither of them is of minimal order. To get an annihilating operator of minimal order, we compute the GCRD of and  in . Table 1 shows the maximal coefficient degrees of the remainders for different PRSs of and .

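| PRS | R2 | R3 | R4 | R5 | R6 | R7 | R8 |
|---|---|---|---|---|---|---|---|
| pseudo | 11 | 22 | 49 | 114 | 271 | 650 | 1565 |
| subresultant | 11 | 16 | 21 | 26 | 31 | 36 | 41 |
| primitive | 9 | 12 | 15 | 18 | 21 | 24 | 21 |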

Table 1: Maximal coefficient degrees for different PRSs.

The example confirms that the degrees in the pseudo PRS grow exponentially, whereas the subresultant PRS and the primitive PRS show linear growth. At the same time, the degrees in the subresultant PRS are not as small as possible. This behavior is typical not only for this pair  and , but in general for operators coming from applications. For randomly generated operators, the subresultant PRS and the primitive PRS usually coincide. Our goal is to understand the difference between randomly generated input and the operators and  as above and to identify the source of some (and most often all) of the additional content in the subresultant PRS. To make use of this knowledge, we will then adjust the formulas for and  from Example 3.3 so that we get a PRS with smaller degrees without having to compute the content of every remainder.
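The difference between the three PRSs of this section can be reproduced in the commutative special case (σ the identity, so σ-factorials become ordinary powers). The sketch below works over the integers and uses the classical pair of polynomials from Knuth's textbook discussion of the Euclidean algorithm, not the operators of the example above. The function names `prem` and `prs` and the coefficient-list representation (highest degree first) are assumptions made for illustration.

```python
from functools import reduce
from math import gcd

# Classical commutative test pair (coefficients, highest degree first).
A = [1, 0, 1, 0, -3, -3, 8, 2, -5]   # x^8 + x^6 - 3x^4 - 3x^3 + 8x^2 + 2x - 5
B = [3, 0, 5, 0, -4, -9, 21]         # 3x^6 + 5x^4 - 4x^2 - 9x + 21

def prem(a, b):
    """Pseudo-remainder r with lc(b)^(deg a - deg b + 1) * a = q*b + r."""
    alpha = b[0] ** (len(a) - len(b) + 1)
    r = [alpha * c for c in a]
    while len(r) >= len(b):
        q, rem = divmod(r[0], b[0])
        assert rem == 0                  # pre-scaling by alpha keeps this exact
        for k in range(len(b)):
            r[k] -= q * b[k]
        r.pop(0)                         # leading coefficient is now zero
    while r and r[0] == 0:
        r.pop(0)
    return r                             # [] represents the zero polynomial

def prs(a, b, kind):
    """PRS of a, b with beta chosen as 'pseudo', 'primitive' or 'subresultant'."""
    seq, psi = [a, b], -1                # psi_1 = -1
    while True:
        prev, cur = seq[-2], seq[-1]
        r = prem(prev, cur)
        if not r:
            return seq
        i = len(seq) - 1                 # cur is R_i, we are computing R_{i+1}
        delta = len(prev) - len(cur)     # d_{i-1} - d_i
        if kind == "pseudo":
            beta = 1
        elif kind == "primitive":
            beta = reduce(gcd, r)        # content of the pseudo-remainder
        elif i == 1:
            beta = -(psi ** delta)
        else:
            d_old = len(seq[-3]) - len(prev)          # d_{i-2} - d_{i-1}
            num, den = (-prev[0]) ** d_old, psi ** (d_old - 1)
            assert num % den == 0
            psi = num // den                          # psi_i
            beta = -prev[0] * psi ** delta
        assert all(c % beta == 0 for c in r)
        seq.append([c // beta for c in r])

for kind in ("pseudo", "subresultant", "primitive"):
    digits = [len(str(max(abs(c) for c in R))) for R in prs(A, B, kind)[2:]]
    print(kind, digits)                  # decimal digits of the largest coefficient
```

Here coefficient size is measured in decimal digits instead of polynomial degree, but the qualitative picture is the same as in Table 1: the pseudo PRS blows up, while the subresultant and primitive PRSs stay moderate.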

## 4 Subresultant Theory for Ore Polynomials

For commutative polynomials, the theory of subresultants was intensively studied by Brown (1978), Brown and Traub (1971), Collins (1967) and Loos (1982). The main idea is to translate relations between the elements of a PRS like the Bézout relation or the (pseudo-)remainder formula into linear algebra. A central tool in this context is the Sylvester matrix, which, roughly speaking, contains the coefficients of all the monomial multiples of the input polynomials that are necessary to compute remainders of any possible degree. The remainders in the subresultant sequence turn out to be polynomials whose coefficients are determinants of certain submatrices of this matrix. Li (1998) generalized these results to Ore polynomials.

Figure 1: The form of the Sylvester matrix of and . Entries outside of the gray area are zero.

The Sylvester matrix is defined to be the matrix of size with the following entries: If and , the entry in the th row and th column is the th coefficient of . If and , the entry in the th row and th column is the th coefficient of .

For with , the matrix is obtained from by removing the rows to , the rows to , the columns to and the last columns except for the column .

Figure 2: Sketch of . The lines indicate the removed rows and columns. The column under the dotted line is added again.

For , the polynomial

$$\operatorname{sres}_i(A,B) := \sum_{j=0}^{i} \det(\operatorname{Syl}_{i,j}(A,B))\, x^j$$

is called the th (polynomial) subresultant of and . If the order of is strictly less than , the th subresultant of and  is called defective, otherwise it is called regular. The subresultant sequence of and  of the first kind is the subsequence of

$$(A,\ B,\ \operatorname{sres}_{d_B-1}(A,B),\ \operatorname{sres}_{d_B-2}(A,B),\ \ldots,\ \operatorname{sres}_0(A,B),\ 0)$$

that contains , , the trailing zero and all nonzero for which is regular.

###### Theorem 1 (Li (1998))

The polynomial remainder sequence given by and  as in Example 3.3, the subresultant PRS, is equal to the subresultant sequence of and  of the first kind.
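The determinant description of subresultants can be checked numerically in the commutative special case. The sketch below assumes the classical column selection (the highest powers down to the power i+1, plus the column of the power j) as the reading of the matrices defined above, and it orders the rows as multiples of A followed by multiples of B. The names `det`, `coeff` and `sres` are illustrative, and the test pair is the classical one from Knuth's textbook, not taken from this article.

```python
from fractions import Fraction

def det(rows):
    """Determinant by Gaussian elimination over the rationals (integer input)."""
    m = [[Fraction(x) for x in row] for row in rows]
    n, d = len(m), Fraction(1)
    for c in range(n):
        p = next((r for r in range(c, n) if m[r][c] != 0), None)
        if p is None:
            return 0
        if p != c:
            m[c], m[p] = m[p], m[c]
            d = -d
        d *= m[c][c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n):
                m[r][k] -= f * m[c][k]
    return int(d)

def coeff(p, k, power):
    """Coefficient of x^power in x^k * p (p given highest degree first)."""
    idx = len(p) - 1 - (power - k)
    return p[idx] if 0 <= idx < len(p) else 0

def sres(a, b, i):
    """i-th polynomial subresultant, returned as coefficients of x^i, ..., x^0."""
    da, db = len(a) - 1, len(b) - 1
    # rows: x^(db-1-i) a, ..., a, then x^(da-1-i) b, ..., b
    rows = [(a, k) for k in range(db - 1 - i, -1, -1)]
    rows += [(b, k) for k in range(da - 1 - i, -1, -1)]
    lead = range(da + db - 1 - i, i, -1)   # powers kept in every matrix
    return [det([[coeff(p, k, e) for e in lead] + [coeff(p, k, j)]
                 for p, k in rows])
            for j in range(i, -1, -1)]

A = [1, 0, 1, 0, -3, -3, 8, 2, -5]
B = [3, 0, 5, 0, -4, -9, 21]
print(sres(A, B, 5))   # [0, 15, 0, -3, 0, 9], i.e. 15x^4 - 3x^2 + 9
```

The fifth subresultant has degree 4 and is therefore defective. It agrees with the first remainder of the subresultant PRS of this pair, which is what Theorem 1 predicts in the commutative case.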

## 5 Identifying Content of Polynomial Subresultants

The representation of subresultants in terms of determinants of the matrices makes it possible to identify content by exploiting the special form of these matrices as well as the correspondence between rows of the Sylvester matrix and monomial multiples of and . For the case of commutative polynomials, some results are known for detecting such additional content. We generalize two results to the Ore setting. The first (Theorem 2) is a generalization of an observation mentioned in Brown (1978), which carries over quite easily to the Ore case. The second (Theorem 4) usually performs better in terms of coefficient size of the remainders, but a heuristic argument is necessary to use it algorithmically (see Section 6).

###### Theorem 2

With and  for , we get:

Let be fixed. The coefficients of are the determinants of the matrices for . The first column of all of these matrices is

$$\left(\sigma^{d_B-1-i}(\operatorname{lc}(A)),\ 0,\ \ldots,\ 0,\ \sigma^{d_A-1-i}(\operatorname{lc}(B)),\ 0,\ \ldots,\ 0\right)^T.$$

Laplace expansion along this column proves the claim.∎

Not all of the subresultants of and  are in the subresultant PRS of and . To make use of Theorem 2 for a new PRS, we need a minor specialisation of the statement:

###### Corollary 1

Let be the subresultant PRS of and  (not necessarily normal). If we choose

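$$t = \gcd\left(\sigma^{d_B-1}(\operatorname{lc}(A)),\ \sigma^{d_A-1}(\operatorname{lc}(B))\right),\quad \gamma_2 = \sigma^{-d_B+1}(t) \quad\text{and}\quad \gamma_i = \sigma^{d_{i-2}-d_{i-1}}(\gamma_{i-1}) \ \text{ for } i > 2,$$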

then for .

Suppose is the th subresultant of and . Then, by the definition of the subresultant sequence of the first kind and Theorem 1, the st subresultant of and  is regular. Because of this and the subresultant block structure (see Li (1998)), is of order and so is equal to . By Theorem 2, the content of is divisible by . It is easy to see that is equal to .∎

In the commutative case, a second source of additional content was determined, although this result is not widely known. The following theorem can be found in Knuth (1981):

###### Theorem 3

Let be such that the subresultant PRS of and  is normal, i.e. for , and let be the GCD of and . Then for .

A generalization of Theorem 3 to Ore polynomials is not straightforward, as Example 3 shows.

[Example 3 cont.] If we take and  as in Example 3, then the leading coefficient of the GCRD of and  is , where is a polynomial of degree 17. The subresultant PRS of and  turns out to be normal and  is of order . By Theorem 3, if the polynomials were elements of , would be divisible by and a naive translation of the theorem to the non-commutative case suggests divisibility by a polynomial of degree at least 36. The (monic) content of , however, is only , which is contained in, but not equal to, .

Again in the commutative case, let , be such that and . Knuth (1981) proves Theorem 3 by showing that if is the subresultant PRS of  and  and  is the subresultant PRS of , , then . This approach is problematic for Ore polynomials, because there the ’s and the ’s have coefficients in and not necessarily in . This means that even after showing that a quotient  is a -left multiple of some subresultant of and , the left factor and the denominators in the coefficients of might not be coprime and thus lead to cancellation. Therefore we will not only describe why in the non-commutative case only some factors of appear as content, but we also present a new proof of Theorem 3 that makes it more explicit where the additional content comes from. Moreover, we won’t require the remainder sequence to be normal.

In , if is a multiple of the primitive polynomial , then their quotient will always have coefficients in , and therefore, the leading coefficient of contains all the factors of the leading coefficient of . For Ore polynomials, this is not necessarily true, since the quotient of and might be an element of . Still, different left multiples of in may share some common factors in their leading coefficients, as described in Lemma 1.

###### Lemma 1

Let be fixed, let be a left ideal and let be any element of of order such that, among all the operators of order in , its leading coefficient  is minimal with respect to the degree. Then is independent of the choice of (up to multiplication by units in ) and for any with we have .

Assume there are for which the claim does not hold. We let and get , thus by assumption. Division with remainder yields nonzero such that

$$\operatorname{lc}(L') = qt + r,\quad \deg(r) < \deg(t).$$

Hence the operator is an element of whose leading coefficient has degree less than . This contradicts the choice of .

For the uniqueness, let be any other operator of order with minimal leading coefficient degree. By what was just shown above, we get and , so and are associates. ∎

Consider , and from Lemma 1. The shift of the leading coefficient of  is called the essential part of at order . If there is no operator in for some order , the essential part of at order is defined to be .

Let and where is the left ideal generated by . We give an informal explanation of essential parts of in terms of solutions of , i.e. functions that are annihilated by . Any non-removable singularity of a solution of corresponds to a root of the leading coefficient of , but not for any root of there has to be a solution with a non-removable singularity at that point. Any solution of is also a solution of every operator in  and it can happen that there are nonzero -left multiples of in that have strictly smaller leading coefficient degree than . If such a desingularized operator exists, it means that some of the roots of can be removed by multiplying with another operator from the left. These removable roots are called the apparent singularities of . It is shown in Jaroschek (2013) that there exists a unique minimal (w.r.t. degree) essential part of that appears in the essential parts of at every order greater than . This minimal essential part of is a polynomial whose roots are exactly the non-apparent singularities of , and it turns out that for each root of the essential part of , there is at least one solution of that does not admit an analytic continuation at that point. A more detailed description of desingularization and apparent singularities of differential equations can be found in Ince (1926). Further references and recent results on desingularization of Ore operators can be found in Chen et al. (2013).

Note that for commutative polynomials, by Gauß’ Lemma, the essential part of a nonzero ideal at any order is equal to the leading coefficient of the primitive greatest common divisor of the ideal elements.

For the remaining part of this article, let be the left ideal generated by and . We formulate our Ore generalization of Theorem 3, where now some of the essential parts of play the role of the leading coefficient of the GCRD of and .

###### Theorem 4

Let and . If is the essential part of at order  for , then

$$\left(\prod_{k=i+1}^{\Delta+i-1} t_k\right) \mid \operatorname{cont}(\operatorname{sres}_i(A,B)).$$

For any , is of size and if the last column is removed, the resulting matrix does not depend on anymore. For , let be the set of all matrices obtained by removing the last columns and any rows from . The th coefficient of is the determinant of and Laplace expansion along the last column shows that it is a -linear combination of the elements of . By induction on  we show that the determinant of any element of is divisible by . The theorem is then proven by setting .

For , the only entry in a matrix in is either zero or the leading coefficient of a monomial left multiple of or of order , so the claim follows from Lemma 1.

Now suppose the claim is true for and let be any element of . If the determinant of is zero, then there is nothing to show. Consider the case where . Then there is a such that . By Cramer’s rule, the th component  of is of the form where is the determinant of some element of . By induction hypothesis it is divisible by . Every row in corresponds to an operator of the form or for , minus some of the lower order terms. For the th row, , we denote the corresponding operator by . By the definition of , the operator will have order and leading coefficient . So if we set

$$v' := \frac{\det(M)}{t_{\Delta+i-n}\, t_{\Delta+i-(n-1)} \cdots t_{\Delta+i-1}}\, v \in \mathbb{D}^{n+1}$$

and , then is an element in of order and its leading coefficient is . Lemma 1 yields that is divisible by , so we get in total .∎

In practice, the essential parts of will most likely be the same at every order with . In that case, Theorem 4 is equivalent to the following simplification, where only the essential part of at order needs to be known.

###### Corollary 2

Let and . If is the essential part of at order , then

$$\sigma^{i+1}(t)^{[\Delta-1]} \mid \operatorname{cont}(\operatorname{sres}_i(A,B)).$$

According to Lemma 1, divides the essential part of at order for any . If , then the th subresultant of and is zero. Otherwise, Theorem 4 yields that is divisible by

$$\sigma^{i+1}(t)\,\sigma^{i+2}(t) \cdots \sigma^{\Delta+i-1}(t) = \sigma^{i+1}(t)^{[\Delta-1]}.$$ ∎

As for Theorem 2, an adjustment of Corollary 2 to the block structure of the subresultant sequence of the first kind is needed in order to construct a new PRS.

###### Corollary 3

Let be the subresultant PRS of and  (not necessarily normal) and let be the essential part of at order . If we set and

$$\gamma_i = \sigma^{d_{i-1}}(t)^{[d_{i-2}-d_{i-1}]}\ \gamma_{i-1}\ \sigma^{d_A+d_B-d_{i-2}+1}(t)^{[d_{i-2}-d_{i-1}]} \ \text{ for } i > 2,$$

then for .

Suppose is the th subresultant of and . As in the proof of Corollary 1, we have that  is equal to . So by Corollary 2, the content of is divisible by . Simple hand calculation shows that this is equal to .∎
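The factors of Corollary 3 can be computed directly from the argument in this proof: the remainder with index i is the subresultant of index d_{i-1} - 1, so Corollary 2 gives a σ-factorial of σ^{d_{i-1}}(t). The sketch below is for the forward shift and rests on two assumptions that are not spelled out in the text above: that Δ equals d_A + d_B - 2i, and that the input data (orders 3, 2, 1, 0 and t = n + 14) are hypothetical values picked only so that the first factor is (n+16)(n+17). All function names are illustrative.

```python
from math import comb

def shift(p, s=1):
    """p(n) -> p(n + s); coefficient lists, lowest degree first."""
    out = [0] * len(p)
    for k, c in enumerate(p):
        for j in range(k + 1):
            out[j] += c * comb(k, j) * s ** (k - j)
    return out

def mul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def sigma_factorial(a, n):
    """a * sigma(a) * ... * sigma^(n-1)(a) for the forward shift."""
    result = [1]
    for i in range(n):
        result = mul(result, shift(a, i))
    return result

def gamma(t, orders, i):
    """Content factor for the remainder with index i >= 2.

    orders = [d_0, d_1, ...] are the orders of the remainders; Corollary 2
    is applied to the subresultant of index d_{i-1} - 1, assuming
    Delta = d_A + d_B - 2 * (d_{i-1} - 1)."""
    d_a, d_b, d_prev = orders[0], orders[1], orders[i - 1]
    return sigma_factorial(shift(t, d_prev), d_a + d_b - 2 * d_prev + 1)

t = [14, 1]             # hypothetical essential part n + 14
orders = [3, 2, 1, 0]   # hypothetical normal PRS with d_A = 3, d_B = 2
print(gamma(t, orders, 2))   # (n+16)(n+17)
print(gamma(t, orders, 3))   # (n+15)(n+16)(n+17)(n+18)
```

In a normal PRS each step adds one shifted copy of t at the low end and one at the high end, which matches the recursive formula of Corollary 3.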

## 6 Improved Polynomial Remainder Sequence

We now derive formulas for the and  that take into account the potential additional content characterized by Theorems 2 and 4. For this we need the following lemma:

###### Lemma 2

For :

By Lemma 2.3 in Li (1998), the pseudo-remainder of and  is the st subresultant of and  (up to sign). Consequently, its coefficients are determinants of submatrices of that contain one row corresponding to the operator and  rows corresponding to operators of the form , . Thus, by Lemma 2.2 in Li (1998), it follows that (up to sign)

$$\operatorname{prem}(\gamma_1 A, \gamma_2 B) = \gamma_1\, \gamma_2^{[d_A-d_B+1]}\, \operatorname{prem}(A,B). \tag{2}$$

The pseudo-remainder formula (1) applied to and  is

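$$\operatorname{lc}(\gamma_2 B)^{[d_A-d_B+1]}\, \gamma_1 A = \operatorname{pquo}(\gamma_1 A, \gamma_2 B)\, \gamma_2 B + \operatorname{prem}(\gamma_1 A, \gamma_2 B).$$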

Combining this with (2) and dividing the resulting equation by from the left gives the desired result. ∎

This now allows us to state and  for improved polynomial remainder sequences:

###### Theorem 5

Suppose is the subresultant PRS of and  and  is any sequence in with . Set . Then is a PRS of and  with:

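$$\tilde\alpha_i = \operatorname{lc}(\tilde R_i)^{[d_{i-1}-d_i+1]}, \qquad \tilde\beta_i = \begin{cases} -\sigma(\tilde\psi_1)^{[d_0-d_1]}\, \gamma_2, & \text{if } i = 1,\\ \dfrac{-\operatorname{lc}(\tilde R_{i-1})\, \sigma(\tilde\psi_i)^{[d_{i-1}-d_i]}}{\gamma_i^{[d_{i-1}-d_i+1]}}\, \gamma_{i+1}, & \text{if } 2 \le i \le \ell, \end{cases}$$

where

$$\tilde\psi_i = \begin{cases} -1, & \text{if } i = 1,\\ \dfrac{(-\gamma_{i-1} \operatorname{lc}(\tilde R_{i-1}))^{[d_{i-2}-d_{i-1}]}}{\sigma(\tilde\psi_{i-1})^{[d_{i-2}-d_{i-1}-1]}}, & \text{if } 2 \le i \le \ell. \end{cases}$$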

From the definition of and the equations

it follows that

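$$\gamma_i^{[d_{i-1}-d_i+1]}\, \gamma_{i-1}\, \tilde\alpha_i \tilde R_{i-1} = Q_i \gamma_i \tilde R_i + \beta_i \gamma_{i+1} \tilde R_{i+1}. \tag{3}$$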

For the first summand on the right hand side, Lemma 2 yields

$$Q_i \gamma_i = \gamma_i^{[d_{i-1}-d_i+1]}\, \gamma_{i-1}\, \tilde Q_i. \tag{4}$$

For the second summand, observe that since equals , we have that equals for all . Thus

$$\beta_i \gamma_{i+1} = \gamma_i^{[d_{i-1}-d_i+1]}\, \gamma_{i-1}\, \tilde\beta_i. \tag{5}$$

The proof is concluded by combining (3), (4) and (5) and dividing the resulting equation by from the left.∎

Two possible choices for were presented in Corollaries 1 and 3. The computation of  in Corollary 1 is straightforward, but in Corollary 3, the essential part of (the ideal generated by and ) at order  is usually not known. A simple heuristic can solve this problem in most cases: As was shown in Lemma 1, the essential part of at order  appears in a shifted version in the leading coefficient of every nonzero ideal element with order less than or equal to . In particular it is contained in and . Thus, if is the essential part of  at order , we have

$$\sigma^{d_A}(t) \mid \gcd\left(\operatorname{lc}(A),\ \sigma^{d_A-d_B}(\operatorname{lc}(B))\right) \tag{6}$$

and in most cases, we not only have divisibility but equality. In fact, in all the examples we looked at that came from combinatorics or physics, this guess for the essential part turned out to be correct.
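The guess on the right hand side of (6) only needs a shift and one polynomial GCD. The following sketch does this for the forward shift with rational coefficients. The leading coefficients, the orders d_A = 3 and d_B = 2, and all function names are hypothetical and chosen for illustration only; the result is a candidate for the shifted essential part, not a proven value.

```python
from fractions import Fraction
from math import comb

def shift(p, s=1):
    """p(n) -> p(n + s); coefficient lists, lowest degree first."""
    out = [Fraction(0)] * len(p)
    for k, c in enumerate(p):
        for j in range(k + 1):
            out[j] += c * comb(k, j) * s ** (k - j)
    return out

def trim(p):
    """Drop zero coefficients of highest degree."""
    while p and p[-1] == 0:
        p.pop()
    return p

def poly_mod(p, q):
    """Remainder of p modulo q over the rationals."""
    p = p[:]
    while len(p) >= len(q):
        f, off = p[-1] / q[-1], len(p) - len(q)
        for k in range(len(q)):
            p[off + k] -= f * q[k]
        trim(p)
    return p

def poly_gcd(p, q):
    """Monic GCD of two nonzero polynomials via the Euclidean algorithm."""
    p = trim([Fraction(c) for c in p])
    q = trim([Fraction(c) for c in q])
    while q:
        p, q = q, poly_mod(p, q)
    return [c / p[-1] for c in p]

def guess_essential_part(lc_a, lc_b, d_a, d_b):
    """Heuristic from (6): take the GCD itself as sigma^{d_A}(t), then shift back."""
    g = poly_gcd(lc_a, shift(lc_b, d_a - d_b))
    return g, shift(g, -d_a)

lc_a = [51, 20, 1]   # hypothetical lc(A) = (n + 17)(n + 3)
lc_b = [80, 21, 1]   # hypothetical lc(B) = (n + 16)(n + 5)
g, t = guess_essential_part(lc_a, lc_b, 3, 2)
print(g, t)          # n + 17 and n + 14, as lists of Fractions
```

If the divisibility in (6) is strict, this guess is too large, so in an implementation one would still check that the predicted factor actually divides each remainder.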

[Example 3 cont.] We now use Theorem 5 and Corollaries 1 and 3 to compute new PRSs of and  as in Example 3. The essential part of at order  is , so , which is also the guess given by the right hand side of (6). Applying Corollary 1 yields the factors

$$\gamma_2 = n+17,\quad \gamma_3 = n+18,\quad \ldots,\quad \gamma_i = n+16+i-1,\quad \ldots$$

whereas Corollary 3 gives

$$\gamma_2 = (n+16)^{[2]},\quad \gamma_3 = (n+15)^{[4]},\quad \ldots,\quad \gamma_i = (n+16-i+2)^{[2(i-1)]},\quad \ldots$$

The improvements from Corollary 1 are marginal, while the degrees in the improved PRS with the results from Corollary 3 are equal to the degrees in the primitive PRS, except for the very last step:

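| PRS | R2 | R3 | R4 | R5 | R6 | R7 | R8 |
|---|---|---|---|---|---|---|---|
| subresultant | 11 | 16 | 21 | 26 | 31 | 36 | 41 |
| improved (Cor. 1) | 10 | 15 | 20 | 25 | 30 | 35 | 40 |
| improved (Cor. 3) | 9 | 12 | 15 | 18 | 21 | 24 | 27 |
| primitive | 9 | 12 | 15 | 18 | 21 | 24 | 21 |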

Table 2: Maximal coefficient degrees for the subresultant, improved and primitive PRS.

Although the remainders in the PRS based on Corollary 3 are usually primitive when starting from randomly generated operators or operators that come from some applications, it is not guaranteed that this is always the case. As an example, consider

$$A, B \in \mathbb{Q}[y][x],\quad A = x^4 + yx^2 + yx + y,\quad B = x^3 + yx^2.$$

The second subresultant of and  is , so , but in the improved PRS, no content will be found.

As mentioned, it may also happen that the guess for the essential part of at order is too large, for example:

 A,B∈Q(y)[D