1 Introduction
For a given nonsingular polynomial matrix $A$ in $\mathbb{K}[x]^{n \times n}$, one can find a unimodular matrix $U \in \mathbb{K}[x]^{n \times n}$ such that $AU = H$ is triangular. Unimodular means that there is a polynomial inverse matrix, or equivalently, that the determinant is a nonzero constant from $\mathbb{K}$. Triangularizing a matrix is useful for solving linear systems and computing matrix operations such as determinants or normal forms. In the latter case, the best-known example is the Hermite normal form, first defined by Hermite in 1851 in the context of triangularizing integer matrices Hermite1851 . Here $H$ is lower triangular,
with the added properties that each diagonal entry $h_{ii}$ is monic and $\deg h_{ij} < \deg h_{ii}$ for all $j < i$. Classical variations of this definition include specifying upper rather than lower triangular forms, and specifying row rather than column forms. In the latter case, the unimodular matrix multiplies on the left rather than the right, and the degree of the diagonal entries dominates that of their columns rather than their rows.
The goal of this paper is the fast, deterministic computation of the determinant and Hermite normal form of a nonsingular polynomial matrix. The common ingredient in both algorithms is a method for the fast computation of the diagonal entries of a matrix triangularization. The product of these entries gives, at least up to a constant, the determinant, while Hermite forms are determined from a given triangularization by reducing the remaining entries modulo the diagonal entries.
In the case of determinant computation, there have been a number of efforts directed toward obtaining algorithms whose complexities are given in terms of the exponent of matrix multiplication. Interestingly enough, in the case of matrices over a field, Bunch and Hopcroft BunchHopcroft1974 showed that if there exists an algorithm which multiplies two $n \times n$ matrices in $O(n^\omega)$ field operations for some $\omega \ge 2$, then there also exists an algorithm for computing the determinant with the same cost bound $O(n^\omega)$. In the case of an arbitrary commutative ring or of the integers, fast determinant algorithms have been given by Kaltofen kaltofen92 , Abbott et al. abbott and Kaltofen and Villard KaltofenVillard . We refer the reader to the last-named paper and the references therein for more details on efficient determinant computation of such matrices.
In the specific case of the determinant of a matrix of polynomials $A \in \mathbb{K}[x]^{n \times n}$ with $\deg A = d$, Storjohann storjohann:phd2000 gave a recursive deterministic algorithm making use of fraction-free Gaussian elimination with a cost of $O^\sim(n^{\omega+1} d)$ operations. A deterministic $O(n^3 d^2)$ algorithm was later given by Mulders and Storjohann muldersstorjohann:2003 , modifying their algorithm for weak Popov form computation. Using low-rank perturbations, Eberly et al. EberlyGiesbrechtVillard gave a randomized determinant algorithm for integer matrices which can be adapted to polynomial matrices, using $O^\sim(n^{3.5} d)$ field operations. Storjohann storjohann:2003 later used high-order lifting to give a randomized algorithm which computes the determinant using $O^\sim(n^\omega d)$ field operations. The algorithm of Giorgi et al. Giorgi2003 has a similar cost but only works on a class of generic input matrices, that is, matrices that are well behaved in the computation.
Similarly, there has been considerable progress in the efficient computation of the Hermite form of a polynomial matrix. Hafner and McCurley hafner and Iliopoulos iliopoulos give algorithms with a complexity bound of $O^\sim(n^4 d)$ operations from $\mathbb{K}$, where $d = \deg A$. They control the size of the matrices encountered during the computation by working modulo the determinant. Using matrix multiplication, the algorithms of Hafner and McCurley hafner , Storjohann and Labahn storjohannlabahn96 and Villard villard96 reduce the cost to $O^\sim(n^{\omega+1} d)$ operations, where $\omega$ is the exponent of matrix multiplication. The algorithm of Storjohann and Labahn worked with integer matrices, but the results directly carry over to polynomial matrices. Mulders and Storjohann muldersstorjohann:2003 then gave an iterative algorithm having complexity $O(n^3 d^2)$, thus reducing the exponent of $n$ but at the cost of increasing the exponent of $d$.
During the past two decades, there has been a goal to design algorithms that perform various linear algebra operations in about the time that it takes to multiply two polynomial matrices having the same dimension and degree as the input matrix, namely at a cost of $O^\sim(n^\omega d)$. Randomized algorithms with such a cost already exist for a number of polynomial matrix problems, for example for linear system solving storjohann:2003 , Smith normal form computation storjohann:2003 , row reduction Giorgi2003 and small nullspace bases computation StoVil05 . In the case of polynomial matrix inversion, the randomized algorithm in Storjohann15 costs $O^\sim(n^3 d)$, which is quasi-linear in the number of field elements used to represent the inverse. For Hermite form computation, Gupta and Storjohann GS2011 gave a randomized algorithm with expected cost $O^\sim(n^3 d)$, later improved in Gupta11 . Their algorithm was the first to be both softly cubic in $n$ and softly linear in $d$. It is worth mentioning that all the algorithms cited in this paragraph are of the Las Vegas type.
Recently, deterministic fast algorithms have been given for linear system solving and row reduction GSSV2012 , minimal nullspace bases za2012 , and matrix inversion ZhLaSt15 . Having a deterministic algorithm has advantages. As a simple but important example, this allows for use over a small finite field without the need for resorting to field extensions. The previous fastest Hermite form algorithms GS2011 ; Gupta11 do require such field extensions. In this paper, we give deterministic fast algorithms for computing Hermite forms and determinants.
Our approach relies on an efficient method for determining the diagonal elements of a triangularization of the input matrix $A$. We can do this recursively by determining, for each integer $k$, a partition
$$A U \;=\; \begin{bmatrix} A_u \\ A_d \end{bmatrix} \begin{bmatrix} U_\ell & U_r \end{bmatrix} \;=\; \begin{bmatrix} B_1 & 0 \\ {*} & B_2 \end{bmatrix},$$
where $A_u$ has $k$ rows, $U_\ell$ has $k$ columns and $B_1$ is of size $k \times k$. The subscripts for $A$ and $U$ are meant to denote up, down, left and right. As $A$ is nonsingular, $A_u$ has full rank and hence one has that $U_r$ is a basis of the kernel of $A_u$. Furthermore the matrix $B_1$ is nonsingular and is therefore a column basis of $A_u$.
However, the recursion described above requires additional properties if it is to be efficient for our applications. In the case of determinants, $AU$ being lower triangular implies that we need both the product of the diagonal entries and also the determinant of the unimodular multiplier $U$. For the case of Hermite form computation, a sensible approach would be to first determine a triangular form of $A$ and then reduce the lower triangular elements using the diagonal entries with unimodular operations. In both applications it appears that we would need to know $U$. However, the degrees in such a unimodular multiplier can be too large for efficient computation. Indeed, there are examples where the sum of the degrees in $U$ is $\Theta(n^3 d)$ (see Section 3), in which case computing $U$ is beyond our target cost $O^\sim(n^\omega d)$.
In order to achieve the desired efficiency, our triangularization computations need to be done without actually determining the entire unimodular matrix . We accomplish this by making use of shifted minimal kernel bases and column bases of polynomial matrices, whose computations can be done efficiently using algorithms from za2012 and za2013 . Shifts are weightings of column degrees which basically help us to control the computations using column degrees rather than the degree of the polynomial matrix. Using the degree becomes an issue for efficiency when the degrees in the input matrix vary considerably from column to column. We remark that shifted minimal kernel bases and column bases, used in the context of fast block elimination, have also been used for deterministic algorithms for inversion ZhLaSt15 and unimodular completion zl2014 of polynomial matrices.
Fast algorithms for computing shifted minimal kernel bases za2012 and column bases za2013 imply that we can deterministically find the diagonals in $O^\sim(n^\omega \lceil s \rceil)$ field operations, where $s$ is the average of the column degrees of $A$. We recall that the ceiling function indicates that for matrices with very low average column degree $s$, this cost is still $O^\sim(n^\omega)$. By modifying this algorithm slightly we can also compute the determinant of the unimodular multiplier, giving our first contribution. In the next theorem, $D(A)$ is the so-called generic determinant bound as defined in GSSV2012 (see also Section 2.3). It has the important property that $D(A)/n$ is bounded from above by both the average of the degrees of the columns of $A$ and that of its rows.
Theorem 1.1.
Let $A$ be a nonsingular matrix in $\mathbb{K}[x]^{n \times n}$. There is a deterministic algorithm which computes the determinant of $A$ using $O^\sim(n^\omega \lceil s \rceil)$ operations in $\mathbb{K}$, with $s$ being the minimum of the average of the degrees of the columns of $A$ and that of its rows.
Applying our fast diagonal entry algorithm to Hermite form computation presents more technical challenges. The difficulty comes from the unpredictability of the diagonal degrees of the Hermite form $H$, which coincide with its row degrees. Indeed, we know that the sum of the diagonal degrees in $H$ is $\deg \det A \le nd$, and so the sum of the degrees in $H$ is $O(n^2 d)$. Still, the best known a priori bound for the degree of the $i$th diagonal entry is $\deg \det A$, and hence the sum of these bounds is $O(n^2 d)$, a factor of $n$ larger than the actual sum. Determining the diagonal entries gives us the row degrees of $H$ and thus resolves this issue. Still, there remains a second major task: that of computing the remaining entries of $H$.
The randomized algorithm of Gupta and Storjohann GS2011 ; Gupta11 solves the Hermite form problem in two steps, which both make use of the Smith normal form $S$ of $A$ and partial information on a left multiplier $V$ for this Smith form. The matrices $S$ and $V$ can be computed with a Las Vegas randomized algorithm using an expected number of $O^\sim(n^\omega d)$ field operations GS2011 ; Gupta11 , relying in particular on high-order lifting (storjohann:2003, , Section 17). The first step of their algorithm consists of computing the diagonal entries of $H$ by triangularization of a matrix involving $S$ and $V$, a computation done in $O^\sim(n^\omega d)$ operations Gupta11 . The second step sets up a system of linear modular equations which admits $A$ as a basis of solutions: the matrix of the system is $V$ and the moduli are the diagonal entries of $S$. The degrees of the diagonal entries obtained in the first step are then used to find $H$ as another basis of solutions of this system, computed in $O^\sim(n^3 d)$ operations GS2011 using in particular fast minimal approximant basis and partial linearization techniques Storjohann06 ; ZL2012 .
The algorithm presented here for Hermite forms follows a two-step process similar to the algorithm of Gupta and Storjohann, but it avoids using the Smith form of $A$, whose deterministic computation in $O^\sim(n^\omega d)$ still remains an open problem. Instead, as explained above, we compute the diagonal entries of $H$ deterministically using $O^\sim(n^\omega \lceil s \rceil)$ field operations, where $s$ is the average of the column degrees of $A$. As for the second step, using the knowledge of the diagonal degrees of $H$ combined with partial linearization techniques from (GSSV2012, , Section 6), we show that $H$ can then be computed via a single call to fast deterministic column reduction GSSV2012 using $O^\sim(n^\omega d)$ field operations. This new problem reduction illustrates the fact that knowing in advance the degree shape of reduced or normal forms makes their computation much easier, something already observed and exploited in GS2011 ; zhou:phd2012 ; JeNeScVi16 .
This approach results in a deterministic $O^\sim(n^\omega d)$ algorithm for Hermite form computation, which is satisfactory for matrices whose entries mostly have similar degree $d$. However, inspired by other contexts such as approximant and kernel basis computations Storjohann06 ; ZL2012 ; JeNeScVi16 ; za2012 as well as polynomial matrix inversion ZhLaSt15 and the determinant algorithm in this paper, one may hope for algorithms that are even faster than $O^\sim(n^\omega d)$ when the degrees in $A$ are nonuniform, for example, if all high-degree entries are located in a few rows and columns of $A$. In the present paper we use ideas from GSSV2012 to reduce the nonuniformity of the degrees in $A$ in the context of Hermite form computation, thus obtaining Theorem 1.2.
Theorem 1.2.
Let $A$ be a nonsingular matrix in $\mathbb{K}[x]^{n \times n}$. There is a deterministic algorithm which computes the Hermite form of $A$ using $O^\sim(n^\omega \lceil s \rceil)$ operations in $\mathbb{K}$, with $s$ being the minimum of the average of the degrees of the columns of $A$ and that of its rows.
The remainder of this paper is organized as follows. In Section 2 we give preliminary information on shifted degrees as well as kernel and column bases of polynomial matrices. We also recall why it is interesting to have cost bounds involving the generic determinant bound rather than the degree of the matrix; see in particular Remark 2.6. Section 3 contains the fast algorithm for finding the diagonal entries of a triangular form. This is followed in Section 4 by our algorithm for finding the determinant. The reduction of the degrees of the off-diagonal entries in the Hermite form is then given in Section 5. It computes the remaining entries by relying in particular on fast deterministic column reduction. In Section 6 we then give the details about how to use partial linearization to decrease the nonuniformity of the degrees in the input matrix for Hermite form computation. The paper ends with a conclusion and topics for future research.
2 Preliminaries
In this section we first give the basic notations for column degrees and shifted degrees of vectors and matrices of polynomials. We then present the building blocks used in our algorithms, namely the concepts of kernel basis and column basis for a matrix of polynomials. Finally, we explain our interest in having cost bounds involving the so-called generic determinant bound.
2.1 Shifted Degrees
Our methods make use of the concept of shifted degrees of polynomial matrices BLV:1999 , basically shifting the importance of the degrees in some of the rows of a basis. For a column vector $p = [p_1, \dots, p_n]^T$ of univariate polynomials over a field $\mathbb{K}$, its column degree, denoted by $\mathrm{cdeg}\,p$, is the maximum of the degrees of the entries of $p$, that is,
$$\mathrm{cdeg}\,p = \max_{1 \le i \le n} \deg p_i.$$
The shifted column degree generalizes this standard column degree by taking the maximum after shifting the degrees by a given integer vector that is known as a shift. More specifically, the shifted column degree of $p$ with respect to a shift $s = (s_1, \dots, s_n) \in \mathbb{Z}^n$, or the $s$-column degree of $p$, is
$$\mathrm{cdeg}_s\,p = \max_{1 \le i \le n} (\deg p_i + s_i) = \deg(x^s \cdot p),$$
where
$$x^s = \mathrm{diag}\left(x^{s_1}, \dots, x^{s_n}\right).$$
For a matrix $P$, we use $\mathrm{cdeg}\,P$ and $\mathrm{cdeg}_s\,P$ to denote respectively the list of its column degrees and the list of its shifted $s$-column degrees. For the uniform shift $s = (0, \dots, 0)$, the shifted column degree specializes to the standard column degree. Similarly, $\mathrm{cdeg}_{-s}\,P \le 0$ is equivalent to $\deg p_{ij} \le s_i$ for all $i$ and $j$, that is, $s$ bounds the row degrees of $P$.
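The shifted column degree is straightforward to compute from the entry degrees. The following sketch (using sympy, with the helper name `shifted_column_degrees` chosen here for illustration) ignores zero entries, since $\deg 0 = -\infty$:

```python
from sympy import Matrix, degree, symbols

x = symbols('x')

def shifted_column_degrees(A, s):
    """Column j gets max_i (deg A[i, j] + s[i]); zero entries are skipped."""
    m, n = A.shape
    return [max(degree(A[i, j], x) + s[i]
                for i in range(m) if A[i, j] != 0)
            for j in range(n)]

A = Matrix([[x**3 + 1, x],
            [x**2,     1]])
print(shifted_column_degrees(A, [0, 0]))   # uniform shift: ordinary column degrees
print(shifted_column_degrees(A, [0, 2]))   # shift weights the second row more heavily
```

With the uniform shift this returns the standard column degrees [3, 1]; with the shift (0, 2) the second row's degrees are inflated by 2, giving [4, 2].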
The shifted row degree of a row vector $q = [q_1, \dots, q_n]$ is defined similarly as
$$\mathrm{rdeg}_s\,q = \max_{1 \le j \le n} (\deg q_j + s_j) = \deg(q \cdot x^s).$$
Shifted degrees have been used previously in polynomial matrix computations and in generalizations of some matrix normal forms BLV:jsc06 . The shifted column degree is equivalent to the notion of defect commonly used in the rational approximation literature.
Along with shifted degrees we also make use of the notion of a polynomial matrix being column reduced. A full-rank polynomial matrix $F = [f_{ij}]$ is column reduced if its leading column coefficient matrix, that is, the matrix
$$\mathrm{lm}(F) = [\ell_{ij}], \quad \text{where } \ell_{ij} \text{ is the coefficient of } x^{d_j} \text{ in } f_{ij} \text{ and } (d_1, \dots, d_n) = \mathrm{cdeg}\,F,$$
has full rank. Then, the polynomial matrix $F$ is $s$-column reduced if $x^s F$ is column reduced. The concept of being shifted row reduced is similar.
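The definition can be checked mechanically: extract, for each column, the coefficients at the column degree, and test the rank of the resulting constant matrix. A minimal sketch (helper names are illustrative; the matrix is assumed to have no zero columns):

```python
from sympy import Matrix, Poly, degree, symbols

x = symbols('x')

def leading_column_matrix(A):
    """Entry (i, j) is the coefficient of x**d_j in A[i, j],
    where d_j is the j-th column degree of A."""
    m, n = A.shape
    cdeg = [max(degree(A[i, j], x) for i in range(m) if A[i, j] != 0)
            for j in range(n)]
    return Matrix(m, n, lambda i, j: Poly(A[i, j], x).nth(cdeg[j]))

def is_column_reduced(A):
    return leading_column_matrix(A).rank() == min(A.shape)

A = Matrix([[x**2, x],
            [1,    x + 1]])   # leading matrix [[1, 1], [0, 1]]: full rank
B = Matrix([[x**2, x**2 + x],
            [x,    x]])       # leading matrix [[1, 1], [0, 0]]: rank 1
```

Here `A` is column reduced while `B` is not, even though both have the same column degrees.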
The usefulness of the shifted degrees can be seen from their applications in polynomial matrix computation problems such as Hermite-Padé and M-Padé approximations
Beckermann92 ; BarBul92 ; BeLa94 ; ZL2012 , minimal kernel bases za2012 , and shifted column reduction BLV:jsc06 ; Neiger16 . An essential fact needed in this paper, also based on the use of shifted degrees, is the efficient multiplication of matrices with unbalanced degrees (za2012, , Theorem 3.7).
Theorem 2.1.
Let $A \in \mathbb{K}[x]^{m \times n}$ with $m \le n$, let $s \in \mathbb{Z}_{\ge 0}^n$ be a shift with entries bounding the column degrees of $A$, and let $\xi$ be a bound on the sum of the entries of $s$. Let $B \in \mathbb{K}[x]^{n \times k}$ with $k \in O(m)$ and with the sum of its $s$-column degrees satisfying $|\mathrm{cdeg}_s\,B| \in O(\xi)$. Then we can multiply $A$ and $B$ with a cost of $O^\sim(n^\omega \lceil \bar{s} \rceil)$, where $\bar{s} = \xi/n$ is the average of the entries of $s$.
2.2 Shifted Kernel and Column Bases
The kernel of $F \in \mathbb{K}[x]^{m \times n}$ is the $\mathbb{K}[x]$-module $\{p \in \mathbb{K}[x]^{n \times 1} \mid Fp = 0\}$. Such a module is free and of rank $k = n - r$, where $r$ is the rank of $F$ (DumFoo04, , Chapter 12, Theorem 4); any of its bases is called a kernel basis of $F$. In other words:
Definition 2.2.
Given $F \in \mathbb{K}[x]^{m \times n}$, a polynomial matrix $N \in \mathbb{K}[x]^{n \times k}$ is a (right) kernel basis of $F$ if the following properties hold:

$N$ has full rank,

$N$ satisfies $FN = 0$,

any $q \in \mathbb{K}[x]^{n \times 1}$ satisfying $Fq = 0$ can be written as a linear combination of the columns of $N$, that is, there exists some $p \in \mathbb{K}[x]^{k \times 1}$ such that $q = Np$.
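One naive way to obtain polynomial kernel vectors, shown here only to illustrate the defining property $FN = 0$, is to compute a nullspace basis over the fraction field $\mathbb{K}(x)$ and clear denominators. The result satisfies the properties above in this small example, but in general it need not be minimal (column reduced), which is what the dedicated algorithm of za2012 guarantees:

```python
from sympy import Matrix, cancel, denom, lcm, symbols

x = symbols('x')

# F is 1 x 2, so its right kernel is a free module of rank 1.
F = Matrix([[x**2 + 1, x - 1]])

cols = []
for v in F.nullspace():                      # basis over the fractions K(x)
    d = lcm([denom(cancel(e)) for e in v])   # common denominator of the entries
    cols.append((d * v).applyfunc(cancel))   # scale to polynomial entries

N = Matrix.hstack(*cols)
assert (F * N).applyfunc(cancel) == Matrix.zeros(1, 1)
```

For this `F` the computed column is a scalar multiple of $(1 - x,\; x^2 + 1)^T$, a genuine kernel basis since its entries are coprime.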
It is easy to show that any two kernel bases $N_1$ and $N_2$ of $F$ are unimodularly equivalent. An $s$-minimal kernel basis of $F$ is a kernel basis that is $s$-column reduced.
Definition 2.3.
Given $F \in \mathbb{K}[x]^{m \times n}$, a matrix $N \in \mathbb{K}[x]^{n \times k}$ is an $s$-minimal (right) kernel basis of $F$ if $N$ is a kernel basis of $F$ and $N$ is $s$-column reduced.
A column basis of $F$ is a basis of the $\mathbb{K}[x]$-module $\{Fp \mid p \in \mathbb{K}[x]^{n \times 1}\}$, which is free of rank $r$. Such a basis can be represented as a full rank matrix $T \in \mathbb{K}[x]^{m \times r}$ whose columns are the basis elements. A column basis is not unique: indeed, any column basis right-multiplied by a unimodular matrix gives another column basis.
Example 2.4.
Let
be a matrix over having column degree . Then a column basis , and a kernel basis , of are given by
For example, if and denote the columns of then the third column of , denoted by , is given by
Here . In addition, the shifted leading coefficient matrix
has full rank, and hence we have that is an minimal kernel basis of .
Fast algorithms for kernel basis computation and column basis computation are given in za2012 and in za2013 , respectively. In both cases they make use of fast methods for order bases (often also referred to as minimal approximant bases) BeLa94 ; Giorgi2003 ; ZL2009 ; ZL2012 . In what follows, we write $|t|$ for the sum of the entries of a tuple $t$ with nonnegative entries.
Theorem 2.5.
Let $F \in \mathbb{K}[x]^{m \times n}$ with $m \le n$ and $n \in O(m)$, and let $s \in \mathbb{Z}_{\ge 0}^n$ be such that $\mathrm{cdeg}\,F \le s$ componentwise. Then, there exist deterministic algorithms which compute

an $s$-minimal kernel basis of $F$ using $O^\sim(n^\omega \lceil \bar{s} \rceil)$ field operations,

a column basis of $F$ using $O^\sim(n^\omega \lceil \bar{s} \rceil)$ field operations,

where $\bar{s} = |s|/n$ is the average column degree bound for $F$.
2.3 The generic determinant degree bound
For a nonsingular matrix $A \in \mathbb{K}[x]^{n \times n}$, the degree of the determinant of $A$ provides a good measure of the size of the output $H$ in the case of Hermite form computation. Indeed, if we denote by $\delta_1, \dots, \delta_n$ the degrees of the diagonal entries of $H$, then we have $\delta_1 + \cdots + \delta_n = \deg \det A$. Since the diagonal entries are those of largest degree in their respective rows, we directly obtain that $H$ can be represented using $n^2 + n \deg \det A$ field elements.
The size of the input $A$ can be measured by several quantities, which differ in how precisely they account for the distribution of the degrees in $A$. It is interesting to relate these quantities to the degree of the determinant of $A$, since the latter measures the size of the output $H$. A first, coarse bound is given by the maximum degree of the entries of the matrix: $A$ can be represented by $n^2(\deg A + 1)$ field elements. On the other hand, by definition of the determinant we have that $\det A$ has degree at most $n \deg A$. A second, finer bound can be obtained using the average of the row degrees and of the column degrees: the size of $A$ in terms of field elements is at most $n^2 (1 + \min(\overline{\mathrm{rdeg}}\,A,\ \overline{\mathrm{cdeg}}\,A))$, where $\overline{\mathrm{rdeg}}\,A$ and $\overline{\mathrm{cdeg}}\,A$ denote the average row degree and average column degree of $A$. Again we have the related bound
$$\deg \det A \;\le\; n \min(\overline{\mathrm{rdeg}}\,A,\ \overline{\mathrm{cdeg}}\,A).$$
An even finer bound on the size of $A$ is given by the generic determinant bound, introduced in (GSSV2012, , Section 6). For $A = [a_{ij}] \in \mathbb{K}[x]^{n \times n}$, this is defined as
$$D(A) \;=\; \max_{\pi \in S_n} \sum_{1 \le i \le n} \overline{\deg}\, a_{i,\pi(i)}, \qquad (1)$$
where $S_n$ is the set of permutations of $\{1, \dots, n\}$, and where
$$\overline{\deg}\, p \;=\; \begin{cases} 0 & \text{if } p = 0, \\ \deg p & \text{otherwise.} \end{cases}$$
By definition, we have the inequalities
$$\deg \det A \;\le\; D(A) \;\le\; n \min(\overline{\mathrm{rdeg}}\,A,\ \overline{\mathrm{cdeg}}\,A) \;\le\; n \deg A,$$
and it is easily checked that $A$ can be represented using $O(n^2 + n D(A))$ field elements.
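For small matrices, the generic determinant bound can be evaluated directly from its definition by brute force over all permutations (exponential in $n$, so meant only as an illustration; the function name is ours):

```python
from itertools import permutations
from sympy import Matrix, degree, symbols

x = symbols('x')

def generic_determinant_bound(A):
    """D(A) = max over permutations pi of sum_i dbar(A[i, pi(i)]),
    with dbar(p) = deg p for p != 0 and dbar(0) = 0."""
    n = A.rows
    dbar = lambda p: degree(p, x) if p != 0 else 0
    return max(sum(dbar(A[i, pi[i]]) for i in range(n))
               for pi in permutations(range(n)))

# High degrees confined to the first row and first column:
d = 4
A = Matrix([[x**d, x**d, x**d],
            [x**d, 1,    1],
            [x**d, 1,    1]])
# Every row and column has degree d, yet a permutation can select at
# most two high-degree entries, so the bound is 2*d rather than 3*d.
assert generic_determinant_bound(A) == 2 * d
```

This is exactly the degree pattern discussed in Remark 2.6: the average row and column degrees are $d$, while $D(A)/n$ is only $2d/3$ here.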
Thus in Hermite form computation both the input and the output have average degree in $O(D(A)/n)$ and can be represented using $O(n^2 + n D(A))$ field elements. Furthermore, $D(A)$ gives a more precise account of the degrees in $A$ than the average row and column degrees, and an algorithm with cost bound $O^\sim(n^\omega \lceil D(A)/n \rceil)$ is always faster, sometimes significantly, than an algorithm with cost bound $O^\sim(n^\omega \lceil s \rceil)$, where $s$ is the average column degree or the average row degree, let alone one with cost bound $O^\sim(n^\omega \deg A)$.
Remark 2.6.
Let us justify why this can sometimes be significantly faster. We have seen that $D(A)/n$ is bounded from above by both the average column degree and the average row degree of $A$. It turns out that, in some important cases, $D(A)/n$ may be substantially smaller than these averages. For example, consider $A$ with one row and one column of uniformly large degree $d$ and all other entries of degree $0$, that is, $\deg a_{1j} = \deg a_{i1} = d$ for all $i$ and $j$, and $\deg a_{ij} = 0$ otherwise.
Here, the average row degree and the average column degree are both exactly $d$, while the generic determinant bound is $2d$, since a permutation can select at most two entries of degree $d$. Thus, here $D(A)/n$ is much smaller than $d$. For similar examples, we refer the reader to (GSSV2012, , Example 4) and (ZhLaSt15, , equation (8)).
3 Determining the diagonal entries of a triangular form
In this section we show how to determine the diagonal entries of a triangular form of a nonsingular matrix $A \in \mathbb{K}[x]^{n \times n}$ having column degrees $s$. Our algorithm makes use of fast kernel and column bases computations.
As mentioned in the introduction, we consider unimodularly transforming $A$ to
$$B \;=\; A U \;=\; \begin{bmatrix} B_1 & 0 \\ {*} & B_2 \end{bmatrix}, \qquad (2)$$
which eliminates a top right block and gives two square diagonal blocks $B_1$ and $B_2$ in $B$. After this block triangularization step, the matrix is now closer to being in triangular form. Applying this procedure recursively to $B_1$ and $B_2$, until the matrices reach dimension $1$, gives the diagonal entries of a triangular form of $A$. These entries are unique up to multiplication by a nonzero constant from $\mathbb{K}$, and in particular making them monic yields the diagonal entries of the Hermite form of $A$.
In this procedure, a major problem is that the degrees in the unimodular multiplier $U$ can be too large for efficient computation. For example, there are $n \times n$ unimodular matrices of degree $d$ whose Hermite form is the identity, but for which the corresponding unimodular multiplier $U$ has the sum of its degrees in $\Theta(n^3 d)$, beyond our target cost $O^\sim(n^\omega \lceil s \rceil)$.
3.1 Fast block elimination
Our approach is to make use of fast kernel and column basis methods to efficiently compute the diagonal blocks and while at the same time avoiding the computation of all of .
Partition $A = \begin{bmatrix} A_u \\ A_d \end{bmatrix}$, with $A_u$ and $A_d$ consisting of the upper $\lceil n/2 \rceil$ and lower $\lfloor n/2 \rfloor$ rows of $A$, respectively. Then both the upper and lower parts have full rank since $A$ is assumed to be nonsingular. By partitioning $U = \begin{bmatrix} U_\ell & U_r \end{bmatrix}$, where the column dimension of $U_\ell$ matches the row dimension of $A_u$, then $AU$ becomes
$$A U \;=\; \begin{bmatrix} A_u \\ A_d \end{bmatrix} \begin{bmatrix} U_\ell & U_r \end{bmatrix} \;=\; \begin{bmatrix} B_1 & 0 \\ {*} & B_2 \end{bmatrix}.$$
Notice that the matrix $B_1 = A_u U_\ell$ is nonsingular and is therefore a column basis of $A_u$. As such this can be efficiently computed as mentioned in Theorem 2.5. In order to compute $B_2 = A_d U_r$, notice that the matrix $U_r$ is a right kernel basis of $A_u$, which makes the top right block of $AU$ zero.
The following lemma states that the kernel basis $U_r$ can be replaced by any other kernel basis of $A_u$, thus giving another unimodular matrix that also works.
Lemma 3.1.
Partition $A = \begin{bmatrix} A_u \\ A_d \end{bmatrix}$ and suppose $B_1$ is a column basis of $A_u$ and $N$ a kernel basis of $A_u$. Then there is a unimodular matrix $U = \begin{bmatrix} {*} & N \end{bmatrix}$ such that
$$A U = \begin{bmatrix} B_1 & 0 \\ {*} & B_2 \end{bmatrix},$$
where $B_2 = A_d N$. If $A$ is square and nonsingular, then $B_1$ and $B_2$ are also square and nonsingular.
Proof.
This follows from (za2013, , Lemma 3.1). ∎
Note that we do not compute the blocks represented by the symbol ${*}$. Thus Lemma 3.1 allows us to determine $B_1$ and $B_2$ independently, without computing the unimodular matrix. This procedure for computing the diagonal entries is presented in Algorithm 1. Formally, the cost of this algorithm is given in Proposition 3.3.
3.2 Computational cost and example
Before giving a cost bound for our algorithm, let us observe its correctness on an example.
Example 3.2.
Let
working over . Considering the matrix formed by the top two rows of , then a column basis and kernel basis of were given in Example 2.4. If denotes the bottom row of , then this gives diagonal blocks
and
Recursively computing with , we obtain a column basis and kernel basis of the top row of , as
If denote the bottom row of , we get , which gives the second diagonal block from . Thus we have the diagonal entries of a triangular form of . On the other hand, since is already a matrix we do not need to do any extra work. As a result we have that is unimodularly equivalent to
giving, up to making them monic, the diagonal entries of the Hermite form of .
Proposition 3.3.
Algorithm 1 costs $O^\sim(n^\omega \lceil \bar{s} \rceil)$ field operations to compute the diagonal entries of the Hermite normal form of a nonsingular matrix $A \in \mathbb{K}[x]^{n \times n}$, where $\bar{s}$ is the average column degree of $A$.
Proof.
The three main operations are computing a column basis of $A_u$, computing a kernel basis $N$ of $A_u$, and multiplying the matrices $A_d N$. Let $s$ denote the column degrees of $A$ and set $\xi = |s|$, an integer used to measure size for our problem.
For the column basis computation, by Theorem 2.5 (see also (za2013, , Theorem 5.6)) we know that a column basis $B_1$ of $A_u$ can be computed with a cost of $O^\sim(n^\omega \lceil \bar{s} \rceil)$, where $\bar{s} = \xi / n$. Furthermore, the sum of the column degrees of the computed $B_1$ is bounded by the sum of the column degrees of $A_u$ (see za2013 , in particular the proof of Lemma 5.5 therein). Thus, since $\mathrm{cdeg}\,A_u \le s$ componentwise, the sum of the column degrees of $B_1$ is at most $\xi$.
Similarly, according to Theorem 2.5 (see also (za2012, , Theorem 4.1)), computing an $s$-minimal kernel basis $N$ of $A_u$ costs $O^\sim(n^\omega \lceil \bar{s} \rceil)$ operations, and the sum of the $s$-column degrees of the output kernel basis is bounded by $\xi$ (za2012, , Theorem 3.4).
For the matrix multiplication $A_d N$, we have that the sum of the column degrees of $A_d$ and the sum of the $s$-column degrees of $N$ are both bounded by $\xi$. Therefore Theorem 2.1 applies and the multiplication can be done with a cost of $O^\sim(n^\omega \lceil \bar{s} \rceil)$. Furthermore, since the entries of $s$ bound the corresponding column degrees of $A_d$, according to (za2012, , Lemma 3.1), we have that the column degrees of $B_2 = A_d N$ are bounded by the $s$-column degrees of $N$. In particular, the sum of the column degrees of $B_2$ is at most $\xi$.
If we let the cost of Algorithm 1 be $g(n)$ for an input matrix of dimension $n$, then
$$g(n) \in O^\sim(n^\omega \lceil \bar{s} \rceil) + 2\, g(n/2).$$
As $\bar{s}$ depends on $n$, we use $\xi = n \bar{s}$, with $\xi$ not depending on $n$. Then we solve the recurrence relation as
$$g(n) \in O^\sim(n^{\omega - 1} \xi + n^\omega) + 2\, g(n/2) \subseteq O^\sim(n^\omega \lceil \xi / n \rceil).$$
In this cost bound, we do not detail the logarithmic factors because it is not clear to us for the moment how many logarithmic factors arise from the calls to the kernel basis and column basis algorithms of za2012 ; za2013 , where they are not reported. Yet, from the recurrence relation above, it can be observed that no extra logarithmic factor will be introduced if , while an extra factor logarithmic in will be introduced if .
4 Efficient Determinant Computation
In this section, we show how to recursively and efficiently compute the determinant of a nonsingular matrix $A \in \mathbb{K}[x]^{n \times n}$ having column degrees $s$. Our algorithm follows a strategy similar to the recursive block triangularization in Section 3, making use of fast kernel basis and column basis computation.
Indeed, after unimodularly transforming $A$ to
$$B \;=\; A U \;=\; \begin{bmatrix} B_1 & 0 \\ {*} & B_2 \end{bmatrix}$$
as in equation (2), the determinant of $A$ can be computed as
$$\det A \;=\; \frac{\det B_1 \, \det B_2}{\det U}, \qquad (3)$$
which requires us to first compute $B_1$, $B_2$, and $\det U$. The same procedure can then be applied to compute the determinant of $B_1$ and the determinant of $B_2$. However, as $U$ is unimodular we will handle its determinant differently. This can be repeated recursively until the dimension becomes $1$.
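The identity $\det A = \det B_1 \det B_2 / \det U$ can be checked on a toy $2 \times 2$ instance with a hand-chosen unimodular multiplier (the matrices below are our own illustrative example, not taken from the paper):

```python
from sympy import Matrix, simplify, symbols

x = symbols('x')

A = Matrix([[x,    1],
            [x**2, x + 1]])

# Hand-chosen unimodular U: its second column [1, -x] is a kernel basis
# of the top row A_u = [x, 1], and its first column picks out a column
# basis B1 = [1] of A_u (since x*0 + 1*1 = 1).
U = Matrix([[0, 1],
            [1, -x]])
B = (A * U).expand()         # block triangular: [[1, 0], [x + 1, -x]]

detU = U.det()               # -1, a nonzero constant: U is unimodular
B1, B2 = B[0, 0], B[1, 1]

# det A = det(B1) * det(B2) / det(U)
assert simplify(B1 * B2 / detU - A.det()) == 0
```

Here $\det A = x$, recovered as $1 \cdot (-x) / (-1)$ without ever triangularizing $A$ symbolically.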
One major obstacle to the efficiency of this approach is that we do want to compute the scalar $\det U$, and, as noted in Section 3, the degrees of the unimodular matrix $U$ can be too large for efficient computation. To sidestep this issue, we will show that $\det U$ can be computed with only partial knowledge of the matrix $U$. Combining this with the method of Section 3 to compute the matrices $B_1$ and $B_2$ without computing all of $U_\ell$ and $U_r$, we obtain an efficient recursive algorithm.
Remark 4.1.
In some cases, the computation of the determinant is easily done from the diagonal entries of a triangular form. Indeed, let $A$ be nonsingular and assume that we have computed the diagonal entries $h_{11}, \dots, h_{nn}$ of its Hermite form. Then, $\det A = c\, h_{11} \cdots h_{nn}$ for some nonzero constant $c$. If the constant coefficient of $h_{11} \cdots h_{nn}$ is nonzero, we can retrieve $c$ by computing the constant coefficient of $\det A$, which is found by linear algebra using $O(n^\omega)$ operations since it equals $\det(A(0))$. More generally, if we know $\alpha \in \mathbb{K}$ such that $h_{11}(\alpha) \cdots h_{nn}(\alpha) \ne 0$, then we can deduce $c$ efficiently. Yet, this does not lead to a fast deterministic algorithm in general, since it may happen that $h_{11}(\alpha) \cdots h_{nn}(\alpha) = 0$ for all field elements $\alpha$, or that finding $\alpha$ with $h_{11}(\alpha) \cdots h_{nn}(\alpha) \ne 0$ is a difficult task.
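The evaluation idea of this remark can be sketched on a toy example (our own, chosen so that the Hermite diagonal is obvious): knowing the monic diagonal entries, the constant $c$ is recovered from a single determinant over the field, with no symbolic determinant computation:

```python
from sympy import Matrix, symbols

x = symbols('x')

# Suppose the diagonal entries of the Hermite form have been computed:
h = [x, x + 1]                 # monic diagonals, so det(A) = c * x * (x + 1)
A = Matrix([[2*x, 0],
            [1,   3*x + 3]])

alpha = 1                      # any field element with h_1(alpha)*h_2(alpha) != 0
prod_h = 1
for hi in h:
    prod_h *= hi.subs(x, alpha)
assert prod_h != 0

# det A(alpha) over the field: plain linear algebra on a constant matrix
c = A.subs(x, alpha).det() / prod_h
print(c)                       # -> 6
```

Indeed $\det A = 6x^2 + 6x = 6 \cdot x \cdot (x+1)$, and the evaluation at $\alpha = 1$ gives $12 / 2 = 6$.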
We now focus on computing $\det U$, or equivalently, $\det V$ where $V := U^{-1}$. The column basis computation from za2013 used for computing the diagonal block $B_1$ also gives $U_r$, the matrix consisting of the right columns of $U$, which is a right kernel basis of $A_u$. In fact, this column basis computation also gives a right factor which, multiplied with the column basis, gives $A_u$. The following lemma shows that this right factor coincides with the matrix $V_u$ consisting of the top rows of $V$. The column basis computation therefore gives both $U_r$ and $V_u$ with no additional work.
Lemma 4.2.
Let $m$ be the dimension of $B_1$. The matrix $V_u \in \mathbb{K}[x]^{m \times n}$ satisfies $B_1 V_u = A_u$ if and only if $V_u$ is the submatrix of $V = U^{-1}$ formed by its top $m$ rows.
Proof.
The proof follows directly from
$$A_u \;=\; (A_u U)\, U^{-1} \;=\; \begin{bmatrix} B_1 & 0 \end{bmatrix} V \;=\; B_1 V_u. \qquad ∎$$
While the determinant of $V$ or the determinant of $U$ is needed to compute the determinant of $A$, a major problem is that we do not know $U_\ell$ or $V_d$, which may not be efficiently computed due to their possibly large degrees. This means we need to compute the determinant of $V$ or $U$ without knowing the complete matrix $V$ or $U$. The following lemma shows how this can be done using just $U_r$ and $V_u$, which are obtained from the computation of the column basis $B_1$.
Lemma 4.3.
Let $U = \begin{bmatrix} U_\ell & U_r \end{bmatrix}$ and $A = \begin{bmatrix} A_u \\ A_d \end{bmatrix}$ satisfy, as before,
$$A U = \begin{bmatrix} B_1 & 0 \\ {*} & B_2 \end{bmatrix},$$
where the row dimension of $A_u$, the column dimension of $U_\ell$, and the dimension of $B_1$ are all $m$. Let $V = \begin{bmatrix} V_u \\ V_d \end{bmatrix}$ be the inverse of $U$ with $V_u$ consisting of its top $m$ rows, in $\mathbb{K}[x]^{m \times n}$, and let $T \in \mathbb{K}[x]^{(n-m) \times n}$ be a matrix such that $\begin{bmatrix} V_u \\ T \end{bmatrix}$ is unimodular. Then $T U_r$ is unimodular and
$$\det U \;=\; \frac{\det(T U_r)}{\det \begin{bmatrix} V_u \\ T \end{bmatrix}}.$$
Proof.
Since $\det \begin{bmatrix} V_u \\ T \end{bmatrix}$ is a nonzero constant, we just need to show that $\det \begin{bmatrix} V_u \\ T \end{bmatrix} \det U = \det(T U_r)$. This follows from
$$\begin{bmatrix} V_u \\ T \end{bmatrix} U \;=\; \begin{bmatrix} V_u U_\ell & V_u U_r \\ T U_\ell & T U_r \end{bmatrix} \;=\; \begin{bmatrix} I_m & 0 \\ T U_\ell & T U_r \end{bmatrix}.$$
In particular $\det(T U_r) = \det \begin{bmatrix} V_u \\ T \end{bmatrix} \det U$ is a nonzero constant and thus $T U_r$ is unimodular. ∎
Lemma 4.3 shows that the determinant of $U$ can be computed using $U_r$, $V_u$, and a unimodular completion $T$ of $V_u$. In fact, this can be made more efficient still by noticing that, since we are looking for a constant determinant, the higher degree parts of the matrices do not affect the computation. Indeed, if $\begin{bmatrix} V_u \\ T \end{bmatrix}$ is unimodular, then one has
$$\det U \;=\; \frac{\det\big(T(0)\, U_r(0)\big)}{\det \begin{bmatrix} V_u(0) \\ T(0) \end{bmatrix}}, \qquad (4)$$
since the determinants in Lemma 4.3 are nonzero constants and are therefore equal to their constant coefficients, which are obtained by evaluating the matrices at $x = 0$.
Equation (4) allows us to use just the degree zero coefficient matrices in the computation. Hence Lemma 4.3 can be improved as follows.
Lemma 4.4.
Let $U = \begin{bmatrix} U_\ell & U_r \end{bmatrix}$, $V_u$, and $T$ be as before. Let $U_r^{0} = U_r(0)$ and $V_u^{0} = V_u(0)$ be the constant matrices of $U_r$ and $V_u$, respectively. Let $T_0 \in \mathbb{K}^{(n-m) \times n}$ be a matrix such that $\begin{bmatrix} V_u^{0} \\ T_0 \end{bmatrix}$ is nonsingular. Then
$$\det U \;=\; \frac{\det(T_0\, U_r^{0})}{\det \begin{bmatrix} V_u^{0} \\ T_0 \end{bmatrix}}.$$
Proof.
Suppose is such that and is unimodular. Using Lemma 4.3 and equation 4, we have that is unimodular with and thus
Let us now show how to construct such a matrix . Let be any matrix such that is unimodular and let denote its constant term . It is easily checked that
for some nonsingular and some . Define the matrix in . On the one hand, we have that the matrix is unimodular. On the other hand, by construction we have that . ∎
Thus Lemma 4.4 requires us to compute a matrix $T_0$ such that $\begin{bmatrix} V_u^{0} \\ T_0 \end{bmatrix}$ is nonsingular. This can be obtained from the nonsingular matrix that transforms $V_u^{0}$ to its reduced column echelon form, computed using the Gauss-Jordan transform algorithm from storjohann:phd2000 with a cost of $O(n m^{\omega-1})$ field operations.
We now have all the ingredients needed for computing the determinant of $A$. A recursive algorithm is given in Algorithm 2, which computes the determinant of $A$ as the product of the determinant of $V$ and the determinant of $B$. The determinant of $B$ is computed by recursively computing the determinants of its diagonal blocks $B_1$ and $B_2$.