 # Bounds for Substituting Algebraic Functions into D-finite Functions

It is well known that the composition of a D-finite function with an algebraic function is again D-finite. We give the first estimates for the orders and the degrees of annihilating operators for the compositions. We find that the analysis of removable singularities leads to an order-degree curve which is much more accurate than the order-degree curve obtained from the usual linear algebra reasoning.


## 1 Introduction

A function is called D-finite if it satisfies an ordinary linear differential equation with polynomial coefficients,

$$p_0(x)f(x)+p_1(x)f'(x)+\cdots+p_r(x)f^{(r)}(x)=0.$$

A function is called algebraic if it satisfies a polynomial equation with polynomial coefficients,

$$p_0(x)+p_1(x)g(x)+\cdots+p_r(x)g(x)^r=0.$$

It is well known that when $f$ is D-finite and $g$ is algebraic, the composition $f\circ g$ is again D-finite. For the special case $f=\mathrm{id}$ this reduces to Abel’s theorem, which says that every algebraic function is D-finite. This particular case was investigated closely in , where a collection of bounds was given for the orders and degrees of the differential equations satisfied by a given algebraic function. It was also pointed out in  that differential equations of higher order may have significantly lower degrees, an observation that gave rise to a more efficient algorithm for transforming an algebraic equation into a differential equation. Their observation has also motivated the study of order-degree curves: for a fixed D-finite function $f$, these curves describe the boundary of the region of all pairs $(r,d)$ such that $f$ satisfies a differential equation of order $r$ and degree $d$.
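As a small illustration of the closure property (our own example, not taken from the paper), take $f=\exp$, which satisfies $f'-f=0$, and $g=\sqrt{x}$, which satisfies $g^2-x=0$. Differentiating $h=f\circ g=e^{\sqrt{x}}$ twice gives

$$h'=\frac{e^{\sqrt{x}}}{2\sqrt{x}},\qquad h''=\frac{e^{\sqrt{x}}}{4x}-\frac{e^{\sqrt{x}}}{4x\sqrt{x}},\qquad\text{hence}\qquad 4x\,h''+2\,h'-h=0.$$

The same equation is satisfied by $e^{-\sqrt{x}}$, the composition with the other root of $g^2-x$. Its order is $2$, the product of the order of the differential equation for $f$ and the degree of the polynomial equation for $g$, and its coefficients have degree $1$.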

###### Example 1.

We have fixed some randomly chosen operator $L$ of order $r_L$ and degree $d_L$ and a random polynomial $P$ of $y$-degree $r_P$ and $x$-degree $d_P$. For some prescribed orders $r$, we computed the smallest degrees $d$ such that there is an operator $M$ of order $r$ and degree $d$ that annihilates $f\circ g$ for all solutions $f$ of $L$ and all solutions $g$ of $P$. The points $(r,d)$ are shown in the figure on the right.

Experiments suggested that order-degree curves are often just simple hyperbolas. A priori knowledge of these hyperbolas can be used to design efficient algorithms. For the case of creative telescoping of hyperexponential functions and hypergeometric terms, as well as for simple D-finite closure properties (addition, multiplication, Ore-action), bounds for order-degree curves have been derived [4, 3, 8]. However, it turned out that these bounds are often not tight.

A new approach to order-degree curves has been suggested in , where a connection was established between order-degree curves and apparent singularities. Using the main result of this paper, very accurate order-degree curves for a function $f$ can be written down in terms of the number and the cost of the apparent singularities of the minimal order annihilating operator for $f$. However, when the task is to compute an annihilating operator from some other representation, e.g., a definite integral, then the information about the apparent singularities of the minimal order operator is only a posteriori knowledge. Therefore, in order to design efficient algorithms using the result of , we need to predict the singularity structure of the output operator in terms of the input data. This is the program for the present paper.

First (Section 2), we derive an order-degree bound for D-finite substitution using the classical approach of considering a suitable ansatz over the constant field, comparing coefficients, and balancing variables and equations in the resulting linear system. This leads to an order-degree curve which is not tight. Then (Section 3) we estimate the order and degree of the minimal order annihilating operator for the composition $f\circ g$ by generalizing the corresponding result of  from $f=\mathrm{id}$ to arbitrary D-finite $f$. The derivation of the bound is somewhat trickier in this more general situation, but once it is available, most of the subsequent algorithmic considerations of  generalize straightforwardly. Finally (Section 4) we turn to the analysis of the singularity structure, which indeed leads to much more accurate results. The derivation is also much more straightforward, except for the required justification of the desingularization cost. In practice, it is almost always equal to one, and although this is the value to be expected for generic input, it is surprisingly cumbersome to give a rigorous proof for this expectation.

Throughout the paper, we use the following conventions:

• $K$ is a field of characteristic zero, $K[x]$ is the usual commutative ring of univariate polynomials over $K$. We write $K[x][y]$ or $K[x,y]$ for the commutative ring of bivariate polynomials and $K[x][\partial]$ for the non-commutative ring of linear differential operators with polynomial coefficients. In this latter ring, the multiplication is governed by the commutation rule $\partial x=x\partial+1$.

• $L\in K[x][\partial]$ is an operator of order $r_L:=\deg_\partial(L)$ with polynomial coefficients of degree at most $d_L:=\deg_x(L)$.

• $P\in K[x,y]$ is a polynomial of degrees $r_P:=\deg_y(P)$ and $d_P:=\deg_x(P)$. It is assumed that $P$ is square-free as an element of $K(x)[y]$ and that it has no divisors in $\bar K[y]$, where $\bar K$ is the algebraic closure of $K$.

• $M\in K[x][\partial]$ is an operator such that for every solution $f$ of $L$ and every solution $g$ of $P$, the composition $f\circ g$ is a solution of $M$. The expression $f\circ g$ can be understood either as a composition of analytic functions in the case $K=\mathbb{C}$, or in the following sense. We define $M$ such that for every $\alpha\in\bar K$, for every solution $g\in\bar K[[x-\alpha]]$ of $P$ and every solution $f\in\bar K[[x-g(\alpha)]]$ of $L$, $M$ annihilates $f\circ g$, which is a well-defined element of $\bar K[[x-\alpha]]$. In the case $K=\mathbb{C}$ these two definitions coincide.
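To make the non-commutative multiplication concrete, here is a minimal Python sketch, assuming the standard commutation rule $\partial x = x\partial + 1$ for differential operators. The dictionary representation and the name `op_mul` are our own illustrative choices and do not come from the paper. An operator is stored as a map from exponent pairs `(i, j)` to the coefficient of $x^i\partial^j$, and products are expanded with the Leibniz rule $\partial^j x^i=\sum_k\binom{j}{k}\,i(i-1)\cdots(i-k+1)\,x^{i-k}\partial^{j-k}$.

```python
from collections import defaultdict
from math import comb, perm


def op_mul(a, b):
    """Multiply two differential operators with polynomial coefficients.

    Each operator is a dict {(i, j): c} standing for the sum of c * x^i * D^j,
    where D denotes d/dx. The representation is an illustrative assumption.
    """
    result = defaultdict(int)
    for (i1, j1), c1 in a.items():
        for (i2, j2), c2 in b.items():
            # Move D^j1 past x^i2 using the Leibniz rule.
            for k in range(min(j1, i2) + 1):
                coeff = c1 * c2 * comb(j1, k) * perm(i2, k)
                result[(i1 + i2 - k, j1 + j2 - k)] += coeff
    # Drop monomials whose coefficient cancelled to zero.
    return {mono: c for mono, c in result.items() if c != 0}


D = {(0, 1): 1}   # the operator D
X = {(1, 0): 1}   # multiplication by x

print(op_mul(D, X))   # the product D*x, expanded as x*D + 1
print(op_mul(X, D))   # the product x*D, a single monomial
```

The first product contains an extra constant term compared with the second, which is exactly the commutation rule.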

## 2 Order-Degree-Curve by Linear Algebra

Let $g$ be a solution of $P$, i.e., suppose that $P(x,g(x))=0$, and let $f$ be a solution of $L$, i.e., suppose that $L(f)=0$. Expressions involving $g$ and $f$ can be manipulated according to the following three well-known observations:

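1. (Reduction by $P$) For each polynomial $Q\in K[x,y]$ with $\deg_y(Q)\ge r_P$ there exists a polynomial $\tilde Q\in K[x,y]$ with $\deg_y(\tilde Q)\le\deg_y(Q)-1$ and $\deg_x(\tilde Q)\le\deg_x(Q)+d_P$ such that

$$Q(x,g)=\frac{1}{\operatorname{lc}_y(P)}\,\tilde Q(x,g).$$

The polynomial $\tilde Q$ is the result of the first step of computing the pseudoremainder of $Q$ by $P$ w.r.t. $y$.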

2. (Reduction by $L$) There exist polynomials $v,q_{j,k}\in K[x]$ of degree at most $d_Ld_P$ such that

$$f^{(r_L)}\circ g=\frac{1}{v}\sum_{j=0}^{r_P-1}\sum_{k=0}^{r_L-1}q_{j,k}\,g^j\cdot(f^{(k)}\circ g).$$

To see this, write $L=\sum_{k=0}^{r_L}l_k\partial^k$ for some polynomials $l_k\in K[x]$ of degree at most $d_L$. Then we have

$$f^{(r_L)}\circ g=-\frac{1}{l_{r_L}\circ g}\sum_{k=0}^{r_L-1}(l_k\circ g)\cdot(f^{(k)}\circ g).$$

By the assumptions on $P$, the denominator $l_{r_L}\circ g$ cannot be zero. In other words, $\gcd(P(x,y),l_{r_L}(y))=1$ in $K(x)[y]$. For each $k$, consider an ansatz $AP+B\,l_{r_L}=l_k$ for polynomials $A,B\in K(x)[y]$ of degrees at most $d_L-1$ and $r_P-1$, respectively, and compare coefficients with respect to $y$. This gives inhomogeneous linear systems over $K(x)$ with $r_P+d_L$ variables and equations, which only differ in the inhomogeneous part but have the same matrix for every $k$. The claim follows using Cramer’s rule, taking into account that the coefficient matrix of the system has $d_L$ many columns with polynomials of degree $d_P$ and $r_P$ many columns with polynomials of degree $0$ (which is also the degree of the inhomogeneous part). Note that $v$ does not depend on $k$.

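3. (Multiplication by $g'$) For each polynomial $Q\in K[x,y]$ with $\deg_y(Q)\le r_P-1$ there exist polynomials $q_j\in K[x]$ of degree at most $\deg_x(Q)+2r_Pd_P$ such that

$$g'\,Q(x,g)=\frac{1}{w\operatorname{lc}_y(P)}\sum_{j=0}^{r_P-1}q_j\,g^j,$$

where $w\in K[x]$ is the discriminant of $P$. To see this, first apply Observation 1 (Reduction by $P$) to rewrite $-QP_x$ as $\frac{1}{\operatorname{lc}_y(P)}V$ for some $V\in K[x,y]$ of degree $\deg_x(V)\le\deg_x(Q)+2d_P$. Then consider an ansatz $AP+BP_y=V$ with unknown polynomials $A,B\in K(x)[y]$ of degrees at most $r_P-2$ and $r_P-1$, respectively, and compare coefficients with respect to $y$. This gives an inhomogeneous linear system over $K(x)$ with $2r_P-1$ variables and equations. The claim then follows using Cramer’s rule.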
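The reduction step of Observation 1 can be sketched in a few lines of Python. This is a minimal illustration under our own assumptions: bivariate polynomials are stored as dicts `{(i, j): c}` for the monomials $c\,x^iy^j$, and the helper names (`deg`, `lc_y`, `scale`, `reduction_step`) are hypothetical. The step computes $\tilde Q=\operatorname{lc}_y(P)\,Q-\operatorname{lc}_y(Q)\,y^{\deg_y Q-\deg_y P}\,P$, which is the first step of a pseudoremainder computation with respect to $y$.

```python
def deg(poly, var):
    """Degree of a nonzero polynomial in x (var=0) or y (var=1)."""
    return max(mono[var] for mono in poly)


def lc_y(poly):
    """Leading coefficient w.r.t. y, returned as a polynomial in x: {i: c}."""
    top = deg(poly, 1)
    return {i: c for (i, j), c in poly.items() if j == top}


def scale(poly, xpoly, y_shift=0):
    """Multiply poly by the x-polynomial xpoly and by y**y_shift."""
    out = {}
    for (i, j), c in poly.items():
        for k, a in xpoly.items():
            key = (i + k, j + y_shift)
            out[key] = out.get(key, 0) + a * c
    return out


def reduction_step(Q, P):
    """One pseudoremainder step: lc_y(P)*Q - lc_y(Q)*y^(deg_y Q - deg_y P)*P."""
    shift = deg(Q, 1) - deg(P, 1)
    if shift < 0:
        raise ValueError("Q is already reduced with respect to P")
    out = scale(Q, lc_y(P))
    for mono, c in scale(P, lc_y(Q), y_shift=shift).items():
        out[mono] = out.get(mono, 0) - c
    return {mono: c for mono, c in out.items() if c != 0}


# P = y^2 - x (so g is a square root of x) and Q = y^3 + 1.
P = {(0, 2): 1, (1, 0): -1}
Q = {(0, 3): 1, (0, 0): 1}
Qt = reduction_step(Q, P)
print(Qt)   # represents x*y + 1, in line with g^3 + 1 = x*g + 1
```

In this example the $y$-degree drops from 3 to 1 and the $x$-degree grows from 0 to 1, within the bound $\deg_x(Q)+\deg_x(P)$ stated in the observation.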

###### Lemma 2.

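Let $u=v\,w\operatorname{lc}_y(P)^{r_P}$, where $v$ and $w$ are as in the Observations 2 and 3 above. Let $f$ be a solution of $L$ and $g$ be a solution of $P$. Then for every $\ell\in\mathbb{N}$ there are polynomials $e_{i,j}\in K[x]$ of degree at most $\ell\deg(u)$ such that

$$\partial^\ell(f\circ g)=\frac{1}{u^\ell}\sum_{i=0}^{r_P-1}\sum_{j=0}^{r_L-1}e_{i,j}\,g^i\cdot(f^{(j)}\circ g).$$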
###### Proof.

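This is evidently true for $\ell=0$. Suppose it is true for some $\ell$. Then

$$\partial^{\ell+1}(f\circ g)=\sum_{i=0}^{r_P-1}\sum_{j=0}^{r_L-1}\Bigl(\frac{e_{i,j}}{u^\ell}\,g^i\cdot(f^{(j)}\circ g)\Bigr)'=\sum_{i=0}^{r_P-1}\sum_{j=0}^{r_L-1}\Bigl(\frac{e_{i,j}'u-\ell e_{i,j}u'}{u^{\ell+1}}\,g^i\cdot(f^{(j)}\circ g)+\frac{e_{i,j}}{u^\ell}\bigl(i\,g^{i-1}\cdot(f^{(j)}\circ g)+g^i\cdot(f^{(j+1)}\circ g)\bigr)g'\Bigr).$$

The first term in the summand expression already matches the claimed bound. To complete the proof, we show that

$$\bigl(i\,g^{i-1}\cdot(f^{(j)}\circ g)+g^i\cdot(f^{(j+1)}\circ g)\bigr)g'=\frac{1}{u}\sum_{k=0}^{r_P-1}q_k\,g^k\qquad(1)$$

for some polynomials $q_k$ of degree at most $\deg(u)$; here and in the rest of the proof, the factors $f^{(j)}\circ g$ with $j<r_L$ are suppressed on the right-hand sides. Indeed, the only critical term is $f^{(j+1)}\circ g$ for $j=r_L-1$. According to Observation 2, $f^{(r_L)}\circ g$ can be rewritten as $\frac{1}{v}\sum_{k=0}^{r_P-1}q_kg^k$ for some $q_k$ of degree at most $d_Ld_P$. This turns the left hand side of (1) into an expression of the form $\frac{1}{v}\sum_{k=0}^{2r_P-2}q_kg^k\,g'$ for some polynomials $q_k$ of degree at most $d_Ld_P$. An $(r_P-1)$-fold application of Observation 1 brings this expression to the form $\frac{1}{v\operatorname{lc}_y(P)^{r_P-1}}\sum_{k=0}^{r_P-1}q_kg^k\,g'$ for some polynomials $q_k$ of degree at most $d_Ld_P+(r_P-1)d_P$. Now Observation 3 completes the induction argument.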

###### Theorem 3.

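Let $r,d\in\mathbb{N}$ be such that

$$r\ge r_Lr_P\quad\text{and}\quad d\ge\frac{r\,(3r_P+d_L-1)\,d_P\,r_L\,r_P}{r+1-r_Lr_P}.$$

Then there exists an operator $M\in K[x][\partial]$ of order at most $r$ and degree at most $d$ such that for every solution $g$ of $P$ and every solution $f$ of $L$ the composition $f\circ g$ is a solution of $M$. In particular, there is an operator $M$ of order $r_Lr_P$ and degree $(3r_P+d_L-1)\,d_P\,r_L^2\,r_P^2$.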

###### Proof.

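Let $g$ be a solution of $P$ and $f$ be a solution of $L$. Then we have $P(x,g)=0$ and $L(f)=0$, and we seek an operator $M\in K[x][\partial]$ such that $M(f\circ g)=0$. Let $r,d\in\mathbb{N}$ and consider an ansatz

$$M=\sum_{i=0}^{d}\sum_{j=0}^{r}c_{i,j}\,x^i\partial^j$$

with undetermined coefficients $c_{i,j}\in K$.

Let $u$ be as in Lemma 2. Then applying $M$ to $f\circ g$ and multiplying by $u^r$ gives an expression of the form

$$\sum_{i=0}^{d+r\deg(u)}\sum_{j=0}^{r_P-1}\sum_{k=0}^{r_L-1}q_{i,j,k}\,x^ig^j\cdot(f^{(k)}\circ g),$$

where the $q_{i,j,k}$ are $K$-linear combinations of the undetermined coefficients $c_{i,j}$. Equating all the $q_{i,j,k}$ to zero leads to a linear system over $K$ with at most $(1+d+r\deg(u))\,r_Lr_P$ equations and exactly $(r+1)(d+1)$ variables. This system has a nontrivial solution as soon as

$$\begin{aligned}(r+1)(d+1)&>(1+d+r\deg(u))\,r_Lr_P\\ \iff\ (r+1-r_Lr_P)(d+1)&>r\,r_Lr_P\deg(u)\\ \iff\ d&>-1+\frac{r\,r_Lr_P\deg(u)}{r+1-r_Lr_P}.\end{aligned}$$

The claim follows because $\deg(u)\le(3r_P+d_L-1)\,d_P$.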
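The counting argument can be checked numerically. The following Python sketch evaluates the degree bound $d\ge r(3r_P+d_L-1)d_Pr_Lr_P/(r+1-r_Lr_P)$ and the variables-versus-equations inequality from the proof, assuming $\deg(u)\le(3r_P+d_L-1)d_P$. The function names and the sample parameters are ours and purely illustrative.

```python
def theorem3_degree(r, rL, dL, rP, dP):
    """Smallest integer d with d >= r*(3rP+dL-1)*dP*rL*rP / (r+1-rL*rP)."""
    s = rL * rP
    if r < s:
        raise ValueError("the bound requires r >= rL * rP")
    numerator = r * (3 * rP + dL - 1) * dP * s
    denominator = r + 1 - s
    return -(-numerator // denominator)   # integer ceiling


def ansatz_is_underdetermined(r, d, rL, dL, rP, dP):
    """True if the ansatz has more unknowns than equations (worst case)."""
    deg_u = (3 * rP + dL - 1) * dP        # assumed bound on deg(u)
    variables = (r + 1) * (d + 1)
    equations = (1 + d + r * deg_u) * rL * rP
    return variables > equations


# Hypothetical input sizes: rL = 2, dL = 1, rP = 2, dP = 1.
for r in (4, 5, 7, 12):
    d = theorem3_degree(r, 2, 1, 2, 1)
    print(r, d, ansatz_is_underdetermined(r, d, 2, 1, 2, 1))
```

Raising the order beyond $r_Lr_P$ lowers the required degree quickly, which is the hyperbola-like shape of the curve.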

## 3 A Degree Bound for the Minimal Operator

According to Theorem 3, there is an operator $M$ of order $r_Lr_P$ and degree $(3r_P+d_L-1)\,d_P\,r_L^2\,r_P^2$. Usually there is no operator of order less than $r_Lr_P$, but if such an operator accidentally exists, Theorem 3 makes no statement about its degree. The result of the present section (Theorem 8 below) is a degree bound for the minimal order operator, which also applies when its order is less than $r_Lr_P$, and which is better than the bound of Theorem 3 if the minimal order operator has order $r_Lr_P$.

The following Lemma is a variant of Lemma 2 in which $g$ is allowed to appear in the denominator, and with exponents larger than $r_P-1$. This allows us to keep the $x$-degrees smaller.

###### Lemma 4.

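Let $f$ be a solution of $L$ and $g$ be a solution of $P$. For every $\ell\in\mathbb{N}$, there exist polynomials $E_{\ell,j}\in K[x,y]$ for $0\le j<r_L$ such that $\deg_xE_{\ell,j}\le\ell(2d_P-1)$ and $\deg_yE_{\ell,j}\le\ell(2r_P+d_L-1)$ for all $0\le j<r_L$, and

$$\partial^\ell(f\circ g)=\frac{1}{U(x,g)^\ell}\sum_{j=0}^{r_L-1}E_{\ell,j}(x,g)\,(f^{(j)}\circ g),$$

where $U(x,y)=P_y(x,y)^2\,l_{r_L}(y)$.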

###### Proof.

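This is true for $\ell=0$. Suppose it is true for some $\ell$. Then

$$\partial^{\ell+1}(f\circ g)=\Bigl(\frac{1}{U(x,g)^\ell}\sum_{j=0}^{r_L-1}E_{\ell,j}(x,g)\,(f^{(j)}\circ g)\Bigr)'=\sum_{j=0}^{r_L-1}\Bigl(-\frac{\ell\,(U_x+g'U_y)}{U^{\ell+1}}E_{\ell,j}\cdot(f^{(j)}\circ g)+\frac{1}{U^\ell}\bigl((E_{\ell,j})_x+g'\cdot(E_{\ell,j})_y\bigr)(f^{(j)}\circ g)+\frac{1}{U^\ell}E_{\ell,j}\,g'\cdot(f^{(j+1)}\circ g)\Bigr).$$

We consider the summands separately. In the first summand, $U_x$ is already a polynomial in $x$ and $g$ of bidegree at most $(2d_P-1,\,2r_P+d_L-1)$. Since $g'=-P_x/P_y$ and $U_y$ is divisible by $P_y$, $g'U_y$ is also a polynomial with the same bound for the bidegree.

Furthermore, we can write

$$(E_{\ell,j})_x+g'\cdot(E_{\ell,j})_y=\frac{1}{U}\Bigl(U\,(E_{\ell,j})_x-P_xP_y\,l_{r_L}(g)\,(E_{\ell,j})_y\Bigr),$$

where the expression in the parentheses satisfies the stated bound.

For $j+1<r_L$, the last summand can be written as

$$\frac{1}{U^\ell}E_{\ell,j}\,g'\cdot(f^{(j+1)}\circ g)=-\frac{P_xP_y\,l_{r_L}(g)}{U^{\ell+1}}E_{\ell,j}\cdot(f^{(j+1)}\circ g).\qquad(2)$$

For $j+1=r_L$, due to Observation 2

$$g'\cdot(f^{(r_L)}\circ g)=\frac{P_xP_y}{U}\sum_{j=0}^{r_L-1}l_j(g)\,(f^{(j)}\circ g).\qquad(3)$$

The right-hand sides of both (2) and (3) satisfy the bound.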

Let be -linearly independent solutions of , and let be distinct solutions of . By we denote the -dimension of the -linear space spanned by for all and . The order of the operator annihilating is at least . We will construct an operator of order annihilating using Wronskian-type matrices.

###### Lemma 5.

There exists a matrix such that the bidegree of every entry of the -th row of does not exceed and

if and only if the vector

lies in the column space of the matrix .

###### Proof.

With the notation of Lemma 4, let be the matrix whose -th entry is . Then meets the stated degree bound.

By we denote the Wronskian matrix for . Then if and only if the vector lies in the column space of the matrix . Hence, it is sufficient to prove that and have the same column space. The following matrix equality follows from the definition of

The latter matrix is nondegenerate since it is a Wronskian matrix for the -linearly independent power series , …, with respect to the derivation . Hence, and have the same column space.

In order to express the above condition of lying in the column space in terms of vanishing of a single determinant, we want to “square” the matrix .

###### Lemma 6.

There exists a matrix such that the degree of every entry does not exceed and the matrix

 C=(A(x,g1)⋯A(x,grP)B(g1)⋯B(grP))

has rank .

###### Proof.

Let be the Vandermonde matrix for , and let

denote the identity matrix. Then

is nondegenerate and has the form , for some with entries of degree at most . Since is nondegenerate, we can choose rows which span a complementary subspace to the row space of . Discarding all other rows from , we obtain with the desired properties.

By (, resp.) we will denote the matrix (, resp.) without the -th row.

###### Lemma 7.

For every the determinant of is divisible by

###### Proof.

We show that is divisible by for every . Without loss of generality, it is sufficient to show this for and . We have

 detCℓ=∣∣∣Aℓ(x,g1)−Aℓ(x,g2)Aℓ(x,g2)⋯Aℓ(x,grP)B(g1)−B(g2)B(g2)⋯B(grP)∣∣∣.

Since for every polynomial we have , every entry of the first columns in the above matrix is divisible by . Hence, the whole determinant is divisible by .

###### Theorem 8.

The minimal operator $M$ annihilating $f\circ g$ for every $f$ and $g$ such that $L(f)=0$ and $P(x,g)=0$ has order $r\le r_Lr_P$ and degree at most

$$2r^2d_P-\tfrac12(r-2)(r-1)+r\,d_Pr_L(2r_P+d_L-1)-d_Pr_L(r_P-1)=O\bigl(r\,d_Pr_L(d_L+r_P)\bigr).$$
###### Proof.

We construct using for . We consider some and by we denote the -dimensional vector . If , then the first rows of the matrix are linearly dependent, so it is degenerate. On the other hand, if this matrix is degenerate, then Lemma 6 implies that is a linear combination of the columns of , so Lemma 5 implies that . Hence . Due to Lemma 7, the latter condition is equivalent to , where . Thus we can take . It remains to bound the degrees of the coefficients of .

Combining Lemmas 5, 6, and 7, we obtain

$$\begin{aligned}d_X&:=\deg_xc_\ell\le\sum_{i\ne\ell}(2rd_P+1-i)\le2r^2d_P-\tfrac12(r-2)(r-1),\\ d_Y&:=\deg_{g_i}c_\ell\le r\,r_L(2r_P+d_L-1)-r_L(r_P-1).\end{aligned}$$

Since is symmetric with respect to , it can be written as an element of where is the -th elementary symmetric polynomial in , and the total degree of with respect to ’s does not exceed . Substituting with the corresponding coefficient of and clearing denominators, we obtain a polynomial in of degree at most .

Since the order of is equal to the dimension of the space of all compositions of the form , where and , is the minimal annihilating operator for this space.

###### Remark 9.

The proof of Theorem 8 is a generalization of the proof of [2, Thm. 1]. Specializing , in Theorem 8 gives a slightly larger bound than the bound in [2, Thm. 1], but with the same leading term.

Although the bound of Theorem 8 for $r=r_Lr_P$ beats the bound of Theorem 3 for $r=r_Lr_P$ by a factor of $r_P$, it is apparently still not tight. Experiments we have conducted with random operators lead us to conjecture that in fact, at least generically, the minimal order operator of order $r_Lr_P$ has degree $O\bigl(r_Lr_Pd_P(d_L+r_Lr_P)\bigr)$. By interpolating the degrees of the operators we found in our computations, we obtain the expression in the following conjecture.

###### Conjecture 10.

For every there exist and such that the corresponding minimal order operator has order and degree

$$r_L^2\bigl(2r_P(r_P-1)+1\bigr)d_P+r_Lr_P\bigl(d_P(d_L+1)+1\bigr)+d_Ld_P-r_L^2r_P^2-r_Ld_Ld_P,$$

and there do not exist and for which the corresponding minimal operator has order and larger degree.
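To get a feeling for how far apart the three degree estimates for the operator of order $r_Lr_P$ are, one can simply evaluate them. The sketch below is a plain transcription of the displayed expressions of Theorem 3 (at $r=r_Lr_P$), Theorem 8, and Conjecture 10 into Python; the function names and the sample parameters are our own and only serve as an illustration.

```python
def theorem3_bound(rL, dL, rP, dP):
    """Degree bound of Theorem 3 at order r = rL * rP."""
    return (3 * rP + dL - 1) * dP * rL**2 * rP**2


def theorem8_bound(r, rL, dL, rP, dP):
    """Degree bound of Theorem 8 for a minimal operator of order r."""
    return (2 * r**2 * dP
            - (r - 2) * (r - 1) // 2      # (r-2)(r-1) is always even
            + r * dP * rL * (2 * rP + dL - 1)
            - dP * rL * (rP - 1))


def conjecture10_degree(rL, dL, rP, dP):
    """Conjectured generic degree of the minimal operator of order rL * rP."""
    return (rL**2 * (2 * rP * (rP - 1) + 1) * dP
            + rL * rP * (dP * (dL + 1) + 1)
            + dL * dP
            - rL**2 * rP**2
            - rL * dL * dP)


# Hypothetical input sizes: rL = dL = rP = dP = 2, hence r = rL * rP = 4.
rL = dL = rP = dP = 2
print(theorem3_bound(rL, dL, rP, dP))
print(theorem8_bound(rL * rP, rL, dL, rP, dP))
print(conjecture10_degree(rL, dL, rP, dP))
```

For these sample sizes the conjectured degree is smaller than the bound of Theorem 8, which in turn is smaller than the bound of Theorem 3.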

## 4 Order-Degree-Curve by Singularities

A singularity of the minimal operator is a root of its leading coefficient polynomial . In the notation and terminology of , a factor  of this polynomial is called removable at cost  if there exists an operator of order such that and . A factor  is called removable if it is removable at some finite cost , and non-removable otherwise. The following theorem [7, Theorem 9] translates information about the removable singularities of a minimal operator into an order-degree curve.

###### Theorem 11.

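Let $M\in K[x][\partial]$, and let $p_1,\dots,p_m\in K[x]$ be pairwise coprime factors of $\operatorname{lc}_\partial(M)$ which are removable at costs $c_1,\dots,c_m$, respectively. Let $r\ge\deg_\partial(M)$ and

$$d\ge\deg_x(M)-\Bigl\lceil\sum_{i=1}^{m}\Bigl(1-\frac{c_i}{r-\deg_\partial(M)+1}\Bigr)^{+}\deg_x(p_i)\Bigr\rceil,$$

where we use the notation $(x)^+:=\max\{x,0\}$. Then there exists an operator $Q\in K(x)[\partial]$ such that $QM\in K[x][\partial]$ and $\deg_\partial(QM)=r$ and $\deg_x(QM)=d$.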

The order-degree curve of Theorem 11 is much more accurate than that of Theorem 3. However, the theorem depends on quantities that are not easily observable when only $L$ and $P$ are known. From Theorem 8 (or Conj. 10), we have a good bound for $\deg_x(M)$. In the rest of the paper, we discuss bounds and plausible hypotheses for the degree and the cost of the removable factors. The following example shows how knowledge about the degree of the operator and the degree and cost of its removable singularities influences the curve.
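The curve of Theorem 11 is easy to evaluate once the three ingredients are known. The following Python sketch assumes the reading $d\ge\deg_x(M)-\bigl\lceil\sum_i\bigl(1-c_i/(r-\deg_\partial(M)+1)\bigr)^+\deg_x(p_i)\bigr\rceil$ with $(x)^+=\max\{x,0\}$. The function name and the sample numbers are hypothetical and are not the data of the paper's examples.

```python
from fractions import Fraction
from math import ceil


def theorem11_degree(r, deg_x_M, ord_M, factors):
    """Degree predicted for order r from removable factors of lc(M).

    factors is a list of pairs (deg_x(p_i), c_i): the degree of a removable
    factor and its removal cost. Exact rationals avoid rounding issues
    inside the ceiling.
    """
    if r < ord_M:
        raise ValueError("r must be at least the order of M")
    total = Fraction(0)
    for deg_p, cost in factors:
        weight = 1 - Fraction(cost, r - ord_M + 1)
        total += max(weight, 0) * deg_p    # positive part (x)^+
    return deg_x_M - ceil(total)


# Hypothetical minimal operator: order 4, degree 100, and one removable
# factor of degree 60 with removal cost 1.
for r in range(4, 9):
    print(r, theorem11_degree(r, 100, 4, [(60, 1)]))
```

At the minimal order nothing can be removed, so the degree stays at $\deg_x(M)$. Each additional order removes a larger share of the removable factor, and a higher cost delays the point at which the curve starts to drop.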

###### Example 12.

The figure below compares the data of Example 1 with the curve obtained from Theorem 11 using , , , . This curve is labeled (a) below. Only for a few orders , the curve slightly overshoots. In contrast, the curve of Theorem 3, labeled (b) below, overshoots significantly and systematically.

The figure also illustrates how the parameters affect the accuracy of the estimate. The value is correctly predicted by Conjecture 10. If we use the more conservative estimate of Theorem 8, we get the curve (e). For curve (d) we have assumed a removability degree of , as predicted by Theorem 17 below, instead of the true value . For (c) we have assumed a removability cost instead of .

### 4.1 Degree of Removable Factors

###### Lemma 13.

Let be a polynomial with , and . Assume that is a root of of multiplicity . Then the squarefree part

 S(y)=P(α,y)/gcd(P(α,y),Py(α,y))

of has degree at least .

###### Proof.

Let be the Sylvester matrix for and with respect to . The value is of the form , where every has at least common columns with . Since , at least one of these matrices is nondegenerate. Hence, . On the other hand, is equal to the dimension of the space of pairs of polynomials such that and