 # ZpL: a p-adic precision package

We present a new package ZpL for the mathematical software system SageMath. It implements a sharp tracking of precision on p-adic numbers, following the theory of ultrametric precision introduced in (caruso-roe-vaccon:14a, ). The underlying algorithms are mostly based on automatic differentiation techniques. We introduce them, study their complexity and discuss our design choices. We illustrate the benefits of our package (in comparison with previous implementations) with a large sample of examples coming from linear algebra, commutative algebra and differential equations.


## 1. Introduction

When computing with real and p-adic fields, exact results are usually impossible, since most elements have infinite decimal or p-adic expansions. Working with these fields thus requires an analysis of how precision evolves through the sequence of steps involved in carrying out a computation. In this paper, we describe a package for computing with p-adic rings and fields, based on a series of papers by the same authors (caruso-roe-vaccon:14a, ; caruso-roe-vaccon:15, ; caruso-roe-vaccon:16, ; caruso-roe-vaccon:17, ). The core of the package is a method for tracking precision using p-adic lattices which can yield dramatically more precise results, at the cost of increased runtime and memory usage.

The standard method for handling precision when computing with real numbers is floating point arithmetic, which may also be used in p-adic computation. At a given precision level, a finite set of representable numbers is chosen, and arithmetic operations are defined to give a representable number that is close to the true result (ieee754:2008, ). Floating point arithmetic has the benefit of efficient arithmetic operations, but users are responsible for tracking the precision of the results. Numerically unstable algorithms can lead to very inaccurate answers (higham:2002, ).

If provably correct results are desired, interval arithmetic provides an alternative to floating point. Instead of just tracking an approximation to the answer, the package also tracks a radius within which the true result lies. This method is commonly used for p-adic computations since the ultrametric property of p-adic fields frequently keeps the radius small. Computations remain fairly efficient with this approach, but numerical instability can still lead to dramatic losses in precision (see §2 for many examples). When the precision of multiple variables is tracked concurrently, the set of possible true values associated to an inexact value takes the form of an ellipsoid with axes parallel to the coordinate axes.
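To make this concrete, the coordinate-wise (interval) bookkeeping can be sketched in a few lines of plain Python. This toy class is our own illustration, not SageMath's implementation: an element is a value known modulo $p^N$ only, sums take the minimum of the absolute precisions, and products combine the relative precisions.

```python
# A toy model of coordinate-wise ("interval") p-adic precision, in the
# spirit of ZpCR.  This is an illustrative sketch, not SageMath code.
P = 3

def vp(n):
    """p-adic valuation of a nonzero integer."""
    v = 0
    while n % P == 0:
        n //= P
        v += 1
    return v

class Toy:
    def __init__(self, value, absprec):
        self.absprec = absprec
        self.value = value % P**absprec  # only absprec digits are meaningful

    def __add__(self, other):
        # sum of two balls: absolute precisions combine by taking the min
        return Toy(self.value + other.value, min(self.absprec, other.absprec))

    def __mul__(self, other):
        # product: the relative precisions combine through the valuations
        N = min(self.absprec + vp(other.value), other.absprec + vp(self.value))
        return Toy(self.value * other.value, N)

x = Toy(121, 5)                  # (11111) in base 3, five known digits
print((Toy(3, 99) * x).absprec)  # 6: multiplying by p shifts the digits
print((x + x + x).absprec)       # 5: step-by-step sums never see the gain
```

The asymmetry between the last two lines is exactly the phenomenon illustrated in §2.1: interval arithmetic only sees the gain of precision when the multiplication by p is performed in one step.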

For better control of precision, we may allow ellipsoids with arbitrary axes. This change would have little utility for real numbers, since such ellipsoids are not preserved by most functions. For p-adic fields, in contrast, differentiable maps with surjective differential will send sufficiently small ellipsoids to other ellipsoids. From an algebraic perspective, these ellipsoids are just cosets of a lattice inside a p-adic vector space, and the main result of (caruso-roe-vaccon:14a, ) (see also Proposition 3.1 below) describes how the image of such a coset under a map $f$ is given exactly by applying the differential of $f$ to it.

In this paper, we describe an implementation of this idea in SageMath (sage, ). Rather than attaching a precision to each element, we store the precision of many elements together by tracking a precision module for the whole collection of variables. As variables are created and destroyed, we update a matrix whose rows represent the vectors in the module. Information about the precision of elements is extracted from the matrix as necessary.

The article is structured as follows. In §2 we provide a demonstration of the package, showing how it can provide more precise answers than the traditional methods for tracking p-adic precision. In particular, §2.1 describes elementary arithmetic and the SOMOS-4 sequence, §2.2 gives examples from linear algebra, §2.3 examples using polynomials, and §2.4 examples of differential equations.

In §3 we give more details on the implementation. §3.1 contains a brief overview of the theory of p-adic precision of (caruso-roe-vaccon:14a, ). In the next two subsections, we explain in more detail how ZpLC and ZpLF work. §3.2 is devoted to the implementation of automatic differentiation leading to the actual computation of the module that models the precision. In §3.3, we explain how the precision on any individual number can be recovered and discuss the validity of our results. The complexity overhead induced by our package is analyzed in §3.4.

Finally, §4 contains a discussion of how we see this package fitting into the existing p-adic implementations. While these methods do introduce overhead, they are well suited to exploring precision behavior when designing algorithms, and can provide hints as to when further precision analysis would be useful.

## 2. Short demonstration

The first step is to define the parents: the rings of p-adic numbers we will work with.

```
ZpIn:  Z2 = ZpXX(2, print_mode='digits')
       Q2 = QpXX(2, print_mode='digits')
       Z3 = ZpXX(3, print_mode='digits')
```

ZpXX is a generic notation for ZpCR, ZpLC and ZpLF. The first, ZpCR, is the usual constructor for p-adic parents in SageMath. It tracks precision using interval arithmetic. On the contrary, ZpLC and ZpLF are provided by our package. In the sequel, we will compare the outputs provided by each parent. Results for ZpLF are only displayed when they differ from ZpLC.

### 2.1. Elementary arithmetic

We begin our tour of the features of the ZpL package with some basic arithmetic computations. We first pick a random element $x \in \mathbb{Z}_3$. The function random_element guarantees that the picked random element is the same for each constructor ZpCR, ZpLC and ZpLF.

```
ZpIn:  x = random_element(Z3, prec=5); x
ZpCR:  ...11111
ZpLC:  ...11111
```

Multiplication by $p$ (here $p = 3$) is a shift on the digits and thus leads to a gain of one digit in absolute precision. In the example below, we observe that when this multiplication is split into several steps, ZpCR does not see the gain of precision while ZpL does.

```
ZpIn:  3*x                  ZpIn:  x + x + x
ZpCR:  ...111110            ZpCR:  ...11110
ZpLC:  ...111110            ZpLC:  ...111110
```

The same phenomenon occurs for multiplication.

```
ZpIn:  x^3                  ZpIn:  x * x * x
ZpCR:  ...010101            ZpCR:  ...10101
ZpLC:  ...010101            ZpLC:  ...010101
```

ZpL is also well suited for working with coefficients with unbalanced precision.

```
ZpIn:  x = random_element(Z2, prec=10)
       y = random_element(Z2, prec=5)
       u, v = x+y, x-y
       u, v
ZpCR:  (...10111, ...01111)
ZpLC:  (...10111, ...01111)
```

Now, let us compute $u + v$ and compare it with $2x$ (observe that they should be equal).

```
ZpIn:  u + v                ZpIn:  2*x
ZpCR:  ...00110             ZpCR:  ...00110100110
ZpLC:  ...00110100110       ZpLC:  ...00110100110
```

Again ZpCR does not output the optimal precision when the computation is split into several steps whereas ZpL does. Actually, these toy examples illustrate quite common situations which often occur during the execution of many algorithms. For this reason, interval arithmetic often overestimates the losses of precision. Roughly speaking, the aim of our package is to “fix this misfeature”. In the next subsections, we present a bunch of examples showing the benefit of ZpL in various contexts.
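The lattice point of view explains where the extra digits come from. As a toy illustration (our own code, not part of ZpL), take the joint precision on $(x, y)$ to be the diagonal lattice spanned by $2^{10} e_x$ and $2^5 e_y$: a linear expression sends this lattice onto $g\,\mathbb{Z}_2$ where $g$ is the gcd of the images of the generators, and the $2$-adic valuation of $g$ is the optimal absolute precision of the result.

```python
# Illustration (not ZpL's code) of the lattice point of view on the
# example above: the precision lost by interval arithmetic on u + v is
# recovered by pushing the whole precision lattice through the map.
from math import gcd

gens = [(2**10, 0), (0, 2**5)]  # generators of the precision lattice on (x, y)

def abs_prec(a, b):
    """Optimal absolute precision of a*x + b*y."""
    g = 0
    for gx, gy in gens:
        g = gcd(g, a*gx + b*gy)
    return (g & -g).bit_length() - 1  # 2-adic valuation of g

print(abs_prec(1, 1))   # u = x + y : 5
print(abs_prec(1, -1))  # v = x - y : 5
print(abs_prec(2, 0))   # u + v = 2x: 11, the digits interval arithmetic loses
```

Coordinate-wise tracking keeps only the diagonal part of the image lattice, which is why u and v individually carry 5 digits while their sum carries 11.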

SOMOS-4. A first example is the SOMOS-4 sequence. It is defined by the recurrence:

$$u_{n+4} = \frac{u_{n+1}\,u_{n+3} + u_{n+2}^2}{u_n}$$

and is known for its high numerical instability (see (caruso-roe-vaccon:14a, )). Nevertheless, the ZpL package saves precision even when using a generic unstable implementation of the SOMOS iteration.

```
ZpIn:  def somos4(u0, u1, u2, u3, n):
           a, b, c, d = u0, u1, u2, u3
           for _ in range(4, n+1):
               a, b, c, d = b, c, d, (b*d + c*c) / a
           return d
       u0 = u1 = u2 = Z2(1,15); u3 = Z2(3,15)
       somos4(u0, u1, u2, u3, 18)
ZpCR:  ...11
ZpLC:  ...100000000000111
ZpIn:  somos4(u0, u1, u2, u3, 100)
ZpCR:  PrecisionError: cannot divide by
       something indistinguishable from zero.
ZpLC:  ...001001001110001
```
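As a sanity check (our addition, not from the paper), one can run the same recurrence over the exact rationals: by the Laurent phenomenon, the terms of the SOMOS-4 sequence are Laurent polynomials in the initial values, so with $u_0 = u_1 = u_2 = 1$ and $u_3 = 3$ every denominator is a power of $3$. Hence $u_{100}$ lies in $\mathbb{Z}_2$ and the PrecisionError raised by ZpCR is purely a precision artifact.

```python
# Exact computation of the SOMOS-4 sequence with rational arithmetic,
# showing that u_100 is a genuine 2-adic integer (odd denominator).
from fractions import Fraction

def somos4(u0, u1, u2, u3, n):
    a, b, c, d = u0, u1, u2, u3
    for _ in range(4, n + 1):
        a, b, c, d = b, c, d, (b*d + c*c) / a
    return d

u = somos4(Fraction(1), Fraction(1), Fraction(1), Fraction(3), 100)
print(u.denominator % 2)  # 1: odd denominator, so u_100 lies in Z_2
```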

### 2.2. Linear algebra

Many generic algorithms of linear algebra lead to quite important instability when they are used with p-adic numbers. In many cases, our package ZpL removes this instability without having to change the algorithm or the implementation.

Matrix multiplication. As revealed in (caruso-roe-vaccon:15, ), a first simple example where instability appears is simply matrix multiplication. This might be surprising because no division occurs in this situation. Observe nevertheless the difference between ZpCR and ZpLC.

```
ZpIn:  MS = MatrixSpace(Z2, 2)
       M = random_element(MS, prec=5)
       for _ in range(25):
           M *= random_element(MS, prec=5)
       M
ZpCR:  [0 0]
       [0 0]
ZpLC:  [...100000000000  ...1000000000]
       [   ...010000000    ...00100000]
```

On the aforementioned example, we notice that ZpCR is unable to decide whether the product vanishes or not. Having good estimates on the precision is therefore very important in such situations.

Characteristic polynomials. Characteristic polynomials are notoriously hard to compute (caruso-roe-vaccon:15, ; caruso-roe-vaccon:17, ). We illustrate this with the following example (using the default algorithm of SageMath for the computation of the characteristic polynomial, which is a division-free algorithm in this setting):

```
ZpIn:  M = random_element(MatrixSpace(Q2,3), prec=10)
       M.determinant()
ZpCR:  ...010000010
ZpLC:  ...010000010
ZpIn:  M.charpoly()
ZpCR:  ...00000000000000000001*x^3 + ...1001011.011*x^2
         + ...0111.01*x + 0
ZpLC:  ...00000000000000000001*x^3 + ...1001011.011*x^2
         + ...11100111.01*x + ...010000010
```

We observe that ZpLC guarantees more digits on the coefficient of $x$. Moreover, it recovers the correct precision on the constant coefficient (which is the determinant) whereas ZpCR is confused and cannot even certify that it does not vanish.

### 2.3. Commutative algebra

Our package can be applied to computations with p-adic polynomials.

Euclidean algorithm. A natural example is the computation of GCDs, whose stability has been studied in (caruso:2017, ). A naive implementation of the Euclidean algorithm can produce different behaviors depending on the implementation of the field of p-adic coefficients.

```
ZpIn:  S.<x> = PolynomialRing(Z2)
       P = random_element(S, degree=10, prec=5)
       Q = random_element(S, degree=10, prec=5)
       D = x^5 + random_element(S, degree=4, prec=8); D
ZpCR:  ...00000000000000000001*x^5 + ...11111010*x^4 + ...10000000*x^3
         + ...11001111*x^2 + ...10000110*x + ...11100010
ZpLC:  ...00000000000000000001*x^5 + ...11111010*x^4 + ...10000000*x^3
         + ...11001111*x^2 + ...10000110*x + ...11100010
ZpIn:  def euclidean(A, B):
           while B != 0:
               A, B = B, A % B
           return A.monic()
       euclidean(D*P, D*Q)
ZpCR:  0*x^9 + ...1*x^8 + 0*x^7 + 0*x^6 + 0*x^5 + 0*x^4 + 0*x^3
         + ...1*x^2 + ...10*x + ...10
ZpLC:  ...00000000000000000001*x^5 + ...11111010*x^4 + ...10000000*x^3
         + ...11001111*x^2 + ...10000110*x + ...11100010
```

With high probability, P and Q are coprime, implying that the gcd of D*P and D*Q is D. However, we observe that ZpCR outputs a quite different result. The point is that, in the ZpCR case, the Euclidean algorithm stops prematurely because the test B != 0 fails too early due to the lack of precision.

Gröbner bases. Our package can be applied to complex computations like that of Gröbner bases, using generic Gröbner bases algorithms.

```
ZpIn:  R.<x,y,z> = PolynomialRing(Q2, order='invlex')
       F = [ Q2(2,10)*x   + Q2(1,10)*z,
             Q2(1,10)*x^2 + Q2(1,10)*y^2 - Q2(2,10)*z^2,
             Q2(4,10)*y^2 + Q2(1,10)*y*z + Q2(8,10)*z^2 ]
       from sage.rings.polynomial.toy_buchberger \
           import buchberger_improved
       g = buchberger_improved(ideal(F))
       g.sort(); g
ZpCR:  [x^3, x*y + ...1100010*x^2, y^2 + ...11001*x^2,
        z + ...0000000010*x]
ZpLC:  [x^3, x*y + ...111100010*x^2, y^2 + ...1111111001*x^2,
        z + ...0000000010*x]
```

As we can see, some loss in precision occurs in the Buchberger algorithm and is avoided thanks to ZpL.

### 2.4. Differential equations

In (LV16, ), the behavior of the precision when solving p-adic differential equations with separation of variables has been studied. The authors have investigated the gap that appears, when applying a Newton-method solver, between the theoretical loss in precision and the actual loss in precision for a naive implementation in Zp(p). We can reach this theoretical loss in precision using ZpL. We use a generic Newton_Iteration_Solver(g, h, N) that applies N steps of the Newton method for the equation $y' = g \cdot h(y)$ as described in (LV16, ).

```
ZpIn:  S.<t> = PowerSeriesRing(Q2, 16)
       h = 1 + t + t^3
       y = t + t^2 * random_element(S, prec=10)
       g = y.derivative() / h(y)
       u = Newton_Iteration_Solver(g, h, 4); u
ZpCR:  ...1101
ZpLC:  ...11011101
```

## 3. Behind the scenes

In this section, we explain how our package ZpL works and analyze its performance. The main theoretical result on which our package is based is the ultrametric precision theory developed in (caruso-roe-vaccon:14a, ), which suggests tracking precision via lattices and differential computations. For this reason, our approach is very much inspired by automatic differentiation techniques (rall:1981, ) and our implementation follows the usual operator overloading strategy. We will introduce two versions of our package, namely ZpLC and ZpLF: the former is safer while the latter is faster.

Remark about the naming. The letter L, which appears in the name of the package, comes from “lattices”. The letters C (in ZpLC) and F (in ZpLF) stand for “cap” and “float” respectively.

### 3.1. The precision Lemma

In (caruso-roe-vaccon:14a, ), we suggest the use of lattices to represent the precision of elements in $\mathbb{Q}_p$-vector spaces. This approach contrasts with the coordinate-wise method (of e.g. Zp(5)) that is traditionally used in SageMath, where the precision of an element is specified by giving the precision of each coordinate separately and is updated after each basic operation.

Consider a finite dimensional normed vector space $E$ defined over $\mathbb{Q}_p$. We use the notation $\|\cdot\|_E$ for the norm on $E$ and $B_E^-(r)$ (resp. $B_E(r)$) for the open (resp. closed) ball of radius $r$ centered at the origin. A lattice $H \subset E$ is a sub-$\mathbb{Z}_p$-module which generates $E$ over $\mathbb{Q}_p$. Because of ultrametricity, the balls $B_E^-(r)$ and $B_E(r)$ are examples of lattices. Lattices can be thought of as special neighborhoods of $0$, and therefore are good candidates to model precision data. Moreover, as revealed in (caruso-roe-vaccon:14a, ), they behave quite well under (strictly) differentiable maps:

###### Proposition 3.1.

Let $E$ and $F$ be two finite dimensional normed vector spaces over $\mathbb{Q}_p$ and $f$ be a function defined on an open subset $U$ of $E$ with values in $F$. We assume that $f$ is differentiable at some point $v_0 \in U$ and that the differential $df_{v_0}$ is surjective. Then, for all $\rho \in (0,1]$, there exists a positive real number $\delta$ such that, for all $r \in (0, \delta)$, any lattice $H$ such that $B_E(\rho r) \subset H \subset B_E(r)$ satisfies:

$$f(v_0 + H) = f(v_0) + df_{v_0}(H). \tag{1}$$

This proposition enables the lattice method of tracking precision, where the precision of the input is specified as a lattice and precision is tracked via the differentials of the steps within a given algorithm. The equality sign in Eq. (1) shows that this method yields the optimal possible precision. We refer to (caruso-roe-vaccon:14a, , §4.1) for a more complete exposition.
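The exact equality in Eq. (1) can be observed on a finite example. The sketch below (ours, for illustration) takes $f(x) = x^2$ over $\mathbb{Q}_3$, $v_0 = 4$ and $H = 3^N \mathbb{Z}_3$; working modulo $3^{2N}$, the image of the coset $v_0 + H$ coincides exactly with the coset $f(v_0) + df_{v_0}(H)$, where $df_{v_0}(h) = 2 v_0 h$.

```python
# Finite verification of the precision Lemma for f(x) = x^2 over Q_3:
# the image of a small lattice coset is exactly a lattice coset again.
p, N, v0 = 3, 2, 4
mod = p**(2*N)

H = range(0, mod, p**N)                    # the lattice 3^N * Z_3, mod 3^(2N)
lhs = {(v0 + h)**2 % mod for h in H}       # f(v0 + H)
rhs = {(v0**2 + 2*v0*h) % mod for h in H}  # f(v0) + df_v0(H)
print(lhs == rhs)  # True
```

The equality holds here because the quadratic error term $h^2$ already lies in $p^{2N}\mathbb{Z}_p$, which is exactly the mechanism behind Proposition 3.1.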

### 3.2. Tracking precision

We now explain in more detail the internal mechanisms ZpLC and ZpLF use for tracking precision.

In what follows, it will be convenient to use a notion of discrete time represented by the letter $t$. Rigorously, it is defined as follows: $t = 0$ when the $p$-adic ring $\mathbb{Z}_p$ or $\mathbb{Q}_p$ is created and $t$ increases by $1$ each time a variable is created, deleted (the deletion can be explicit, through a call to the del operator, or implicit, handled by the garbage collector) or updated.

Let $V_t$ be the set of alive variables at time $t$. Set $E_t = \mathbb{Q}_p^{V_t}$; it is a finite dimensional vector space over $\mathbb{Q}_p$ which should be thought of as the set of all possible values that can be taken by the variables in $V_t$. For $v \in V_t$, let $e_v$ be the vector whose coordinates all vanish except at position v which takes the value $1$. The family $(e_v)_{v \in V_t}$ is obviously a basis of $E_t$; we will refer to it as the canonical basis.

#### 3.2.1. The case of ZpLC.

Following Proposition 3.1, the package ZpLC follows the precision by keeping track of a lattice $H_t$ in $E_t$, which is a global object whose purpose is to model the precision on all the variables in $V_t$ all together. Concretely, this lattice is represented by a matrix $M_t$ in row-echelon form whose rows form a set of generators. Below, we explain how the matrices $M_t$ are updated each time $t$ increases.

Creating a variable. This happens when we encounter an instruction having one of the two following forms:

[Computation] w = f(v_1, ..., v_n)

[New value] w =R(value,prec)

In both cases, w is the newly created variable. The $v_i$'s stand for already defined variables and $f$ is some $n$-ary builtin function (in most cases, it is just addition, subtraction, multiplication or division). On the contrary, the terms "value" and "prec" refer to user-specified constants or integral values which were computed earlier.

Let us first examine the first construction [Computation]. With our conventions, if $t$ is the time just before the execution of the instruction we are interested in, the $v_i$'s lie in $V_t$ while w does not. Moreover $V_{t+1} = V_t \sqcup \{w\}$, so that $E_{t+1} = E_t \oplus \mathbb{Q}_p e_w$. The mapping taking the values of the variables at time $t$ to those at time $t+1$ is:

$$F : E_t \longrightarrow E_{t+1}, \qquad \underline{x} \mapsto \underline{x} \oplus f(x_1, \ldots, x_n)$$

where $x_i$ is the $v_i$-th coordinate of the vector $\underline{x}$. The Jacobian matrix of $F$ at $\underline{x}$ is easily computed; it is the block matrix $\begin{pmatrix} I_{d_t} & J \end{pmatrix}$ where $I_{d_t}$ is the identity matrix of size $d_t = \dim E_t$ and $J$ is the column vector whose v-th entry is $\frac{\partial f}{\partial v_i}(\underline{x})$ if v is one of the $v_i$'s and $0$ otherwise. Therefore, the image of $H_t$ under $dF$ is represented by the matrix $\begin{pmatrix} M_t & C \end{pmatrix}$ where $C$ is the column vector:

$$C = \sum_{i=1}^n \frac{\partial f}{\partial v_i}(\underline{x}) \cdot C_i \tag{2}$$

where $C_i$ is the column vector of $M_t$ corresponding to the variable $v_i$. Observe that the matrix $\begin{pmatrix} M_t & C \end{pmatrix}$ is no longer a square matrix; it has one extra column. This reflects the fact that $\dim E_{t+1} = \dim E_t + 1$. Rephrasing this in a different language, the image of $H_t$ under $dF$ is no longer a lattice in $E_{t+1}$ but is included in a hyperplane.

The package ZpLC tackles this issue by introducing a cap: we do not work with $dF(H_t)$ directly but instead define $H_{t+1} = dF(H_t) + p^{N_{t+1}} \mathbb{Z}_p e_w$ where $N_{t+1}$ is an integer, the so-called cap. Alternatively, one may introduce the map:

$$\tilde F : E_t \oplus \mathbb{Q}_p \longrightarrow E_{t+1}, \qquad \underline{x} \oplus c \mapsto \underline{x} \oplus \big(f(x_1, \ldots, x_n) + c\big). \tag{3}$$

The lattice $H_{t+1}$ is then the image of $H_t \oplus p^{N_{t+1}} \mathbb{Z}_p$ under $d\tilde F$, the differential being taken at $\underline{x} \oplus c$ for any value of $c$. The choice of the cap is of course a sensitive question. ZpLC proceeds as follows. When a ring is created, it comes with two constants (which can be specified by the user): a relative cap relcap and an absolute cap abscap. With these predefined values, the chosen cap is:

$$N_{t+1} = \min\big(\mathrm{abscap},\ \mathrm{relcap} + v_p(y)\big) \tag{4}$$

with $y = f(x_1, \ldots, x_n)$. In concrete terms, the lattice $H_{t+1}$ is represented by the block matrix:

$$\begin{pmatrix} M_t & C \\ 0 & p^{N_{t+1}} \end{pmatrix}.$$

Performing row operations, we see that the entries of $C$ can be reduced modulo $p^{N_{t+1}}$ without changing the lattice. In order to optimize the size of the objects, we perform this reduction and define $M_{t+1}$ by:

$$M_{t+1} = \begin{pmatrix} M_t & C \bmod p^{N_{t+1}} \\ 0 & p^{N_{t+1}} \end{pmatrix}.$$

We observe in particular that $M_{t+1}$ is still in row-echelon form.

Finally, we need to explain which value is assigned to the newly created variable w. We observe that it cannot be exactly $y = f(x_1, \ldots, x_n)$ because the latter is a priori a $p$-adic number which cannot be computed exactly. For this reason, we have to truncate it at some finite precision. Again we choose the precision $N_{t+1}$, i.e. we define the value of w as $y \bmod p^{N_{t+1}}$. The congruence $y \equiv y \bmod p^{N_{t+1}} \pmod{H_{t+1}}$ (which holds thanks to the extra generator we have added) justifies this choice.
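Restricted to linear operations, where the partial derivatives in Eq. (2) are just the coefficients of the combination, this bookkeeping fits in a short sketch. The class below is our own minimal illustration for $p = 2$, with names and a cap value of our choosing, not the package's code.

```python
# Minimal sketch of the ZpLC bookkeeping for linear operations: one
# global matrix, one column per living variable, one row per lattice
# generator.  Creating w appends the column C of Eq. (2) reduced modulo
# the cap, together with the extra generator p^CAP * e_w.
import math

P, CAP = 2, 40

class ZpLCSketch:
    def __init__(self):
        self.vars = []  # column labels
        self.rows = []  # lattice generators

    def new_input(self, name, prec):
        """w = R(value, prec): an independent variable known modulo P^prec."""
        for row in self.rows:
            row.append(0)
        self.vars.append(name)
        self.rows.append([0] * (len(self.vars) - 1) + [P**prec])

    def new_linear(self, name, coeffs):
        """w = sum of coeffs[v] * v: the new column is the same combination
        of the existing columns, reduced modulo P^CAP."""
        idx = {v: j for j, v in enumerate(self.vars)}
        for row in self.rows:
            row.append(sum(a * row[idx[v]] for v, a in coeffs.items()) % P**CAP)
        self.vars.append(name)
        self.rows.append([0] * (len(self.vars) - 1) + [P**CAP])

    def abs_prec(self, name):
        """Smallest valuation of an entry in the column of `name`."""
        j = self.vars.index(name)
        g = 0
        for row in self.rows:
            g = math.gcd(g, row[j])
        return (g & -g).bit_length() - 1  # 2-adic valuation

T = ZpLCSketch()
T.new_input('x', 10)
T.new_input('y', 5)
T.new_linear('u', {'x': 1, 'y': 1})   # u = x + y
T.new_linear('v', {'x': 1, 'y': -1})  # v = x - y
T.new_linear('s', {'u': 1, 'v': 1})   # s = u + v = 2x
print(T.abs_prec('u'), T.abs_prec('v'), T.abs_prec('s'))  # 5 5 11
```

Interval arithmetic would assign precision $O(2^5)$ to u, v and hence to s; combining the columns recovers $O(2^{11})$ for s, matching the behavior illustrated in §2.1.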

The second construction "w = R(value, prec)" is easier to handle since, roughly speaking, it corresponds to the case $n = 0$. In this situation, keeping in mind the cap, the lattice $H_{t+1}$ is defined by $H_{t+1} = H_t \oplus p^{\min(\mathrm{prec},\, N_{t+1})} \mathbb{Z}_p e_w$ where $N_{t+1}$ is the cap of Eq. (4) computed with $y = \mathrm{value}$. The corresponding matrix is then given by:

$$M_{t+1} = \begin{pmatrix} M_t & 0 \\ 0 & p^{\min(\mathrm{prec},\, N_{t+1})} \end{pmatrix}.$$

Deleting a variable. Let us now examine the case where a variable w is deleted (or collected by the garbage collector). Just after the deletion, at time $t+1$, we then have $V_{t+1} = V_t \setminus \{w\}$. Thus $E_t = E_{t+1} \oplus \mathbb{Q}_p e_w$. Moreover, the deletion of w is modeled by the canonical projection $\pi : E_t \to E_{t+1}$. Since $\pi$ is linear, it is its own differential (at each point) and we set $H_{t+1} = \pi(H_t)$. A matrix representing $H_{t+1}$ is deduced from $M_t$ by erasing the column corresponding to w. However, the matrix we get this way is no longer in row-echelon form; we then need to re-echelonize it.

More precisely, the obtained matrix is still echelonized in its top part: above the row whose pivot sat in the erased column, nothing is affected. We therefore only have to re-echelonize the bottom right corner, whose size is the distance from the column corresponding to the erased variable to the end. Thanks to the particular shape of the matrix, the echelonization can be performed efficiently: we combine the first rows (of the bottom right part) in order to clear the first unwanted nonzero entry and then proceed recursively.

Updating a variable. Just as for creation, this happens when the program reaches an assignment "w = ..." where the variable w is already defined. This situation reduces to the creation of a temporary variable holding the value of the right-hand side, the deletion of the old variable w and a renaming. It can then be handled using the methods discussed previously.

#### 3.2.2. The case of ZpLF.

The way the package ZpLF tracks precision is based on similar techniques but differs from ZpLC in that it does not introduce a cap but instead allows $H_t$ to be a sub-$\mathbb{Z}_p$-module of $E_t$ of arbitrary codimension. This point of view is nice because it implies smaller objects and consequently leads to faster algorithms. However, it has a huge drawback; indeed, unlike lattices, submodules of $E_t$ of arbitrary codimension are not exact objects, in the sense that they cannot be represented by integral matrices in full generality. Consequently, they cannot be encoded on a computer. We work around this drawback by replacing everywhere exact $p$-adic numbers by floating point $p$-adic numbers (at some given precision) (course-padic, ).

The fact that the lattice $H_t$ can now have arbitrary codimension translates into the fact that the matrix $M_t$ can be rectangular. Precisely, we will maintain matrices $M_t$ in row-echelon form (5): reading the rows from top to bottom, the leading nonzero entries, the so-called pivots, move strictly to the right and do not vanish. A variable whose corresponding column contains a pivot will be called a pivot variable at time $t$.

Creating a variable. We assume first that the newly created variable is defined through a statement of the form "w = f(v_1, ..., v_n)". As already explained in the case of ZpLC, this code is modeled by the mathematical mapping:

$$F : E_t \longrightarrow E_{t+1}, \qquad \underline{x} \mapsto \underline{x} \oplus f(x_1, \ldots, x_n).$$

Here $\underline{x}$ represents the state of the memory at time $t$, and $x_i$ is the coordinate of $\underline{x}$ corresponding to the variable $v_i$.

In the ZpLF framework, $H_{t+1}$ is defined as the image of $H_t$ under the differential $dF$. Accordingly, the matrix $M_{t+1}$ is defined as $\begin{pmatrix} M_t & C \end{pmatrix}$ where $C$ is the column vector defined by Eq. (2). However, we insist on the fact that all the computations now take place in the "ring" of floating point $p$-adic numbers. Therefore, we cannot guarantee that the rows of $M_{t+1}$ generate $H_{t+1}$. Nonetheless, they generate a module which is expected to be close to $H_{t+1}$.

If w is created by the code "w = R(value, prec)", we define $H_{t+1} = H_t \oplus p^{\mathrm{prec}} \mathbb{Z}_p e_w$ and consequently:

$$M_{t+1} = \begin{pmatrix} M_t & 0 \\ 0 & p^{\mathrm{prec}} \end{pmatrix}.$$

If prec is $+\infty$ (or, equivalently, not specified), we agree that $H_{t+1} = H_t$ and $M_{t+1} = \begin{pmatrix} M_t & 0 \end{pmatrix}$.

Deleting a variable. As for ZpLC, the matrix operation implied by the deletion of the variable w is the deletion of the corresponding column $C_w$ of $M_t$. If w is not a pivot variable at time $t$, the matrix keeps the form (5) after erasure; therefore no further treatment is needed in this case.

Otherwise, we re-echelonize the matrix as follows. After the deletion of the column $C_w$, we examine the first column $C$ which was located on the right of $C_w$. Two situations may occur (depending on whether $C$ was or was not a pivot column):

In the first case, the column $C$ carries two significant entries after the deletion: the entry $a$ lying on the row whose pivot was erased and its own pivot $b$. We perform row operations in order to replace the pair $(a, b)$ by $(d, 0)$ where $d$ is an element of valuation $\min(v_p(a), v_p(b))$. Observe that $b$ is necessarily nonzero in this case, so that $d$ does not vanish as well. After this operation, we move to the next column and repeat the same process.

The second case is divided into two subcases. First, if the entry $a$ of $C$ lying on the row whose pivot was erased does not vanish, it can serve as a pivot and the obtained matrix has the desired shape. When this occurs, the echelonization stops. On the contrary, if $a = 0$, we simply record that the corresponding variable is no longer a pivot variable and move to the next column without modifying the matrix.

### 3.3. Visualizing the precision

Our package implements several methods giving access to the precision structure. In this subsection, we present and discuss the most relevant features in this direction.

Absolute precision of one element. This is the simplest accessible precision datum. It is reflected in the number of digits displayed when an element is printed. For example, the (partial) session:

```
ZpIn:  v = Z2(173, 10); v
ZpLC:  ...0010101101
```

indicates that the absolute precision on v is $10$ since exactly $10$ digits are printed. The method precision_absolute provides a more direct access to the absolute precision.

```
ZpIn:  v.precision_absolute()
ZpLC:  10
```

Both ZpLC and ZpLF compute the absolute precision of v (at time $t$) as the smallest valuation of an entry of the column of $M_t$ corresponding to the variable v. Alternatively, it is the unique integer $N$ for which $\pi_v(H_t) = p^N \mathbb{Z}_p$, where $\pi_v : E_t \to \mathbb{Q}_p$ takes a vector to its v-coordinate. This definition of the absolute precision sounds relevant because, if we believe that the submodule $H_t$ is supposed to encode the precision on the variables in $V_t$, Proposition 3.1 applied with the mapping $\pi_v$ indicates that a good candidate for the precision on v is $\pi_v(H_t)$, that is $p^N \mathbb{Z}_p$.

About correctness. We emphasize that the absolute precision computed this way is not proved, either for ZpLF or ZpLC. However, in the case of ZpLC, one can be slightly more precise. Let $E^u_t$ be the vector space of user-defined variables created before time $t$ and $U_t \subset E^u_t$ be the lattice modeling the precision on them. The pair $(E^u_t, U_t)$ is defined inductively as follows: we set $E^u_{t+1} = E^u_t \oplus \mathbb{Q}_p e_w$ and $U_{t+1} = U_t \oplus p^{\mathrm{prec}} \mathbb{Z}_p e_w$ when a new variable w is created by "w = R(value, prec)"; otherwise, we put $E^u_{t+1} = E^u_t$ and $U_{t+1} = U_t$. Moreover, the values entered by the user define a vector (with integral coordinates) $\underline{u}_t \in E^u_t$.

Similarly, in order to model the caps, we define a pair $(E^c_t, K_t)$ by the recurrence $E^c_{t+1} = E^c_t \oplus \mathbb{Q}_p e_w$, $K_{t+1} = K_t \oplus p^{N_{t+1}} \mathbb{Z}_p e_w$ each time a new variable w is created. Here, the exponent $N_{t+1}$ is the cap defined by Eq. (4). In case of deletion, we put $E^c_{t+1} = E^c_t$ and $K_{t+1} = K_t$.

Taking the compositum of all the functions $\tilde F$ (cf Eq. (3)) from time $0$ to $t$, we find that the execution of the session until time $t$ is modeled by a mathematical function $\Phi_t : E^u_t \oplus E^c_t \to E_t$. From the design of ZpLC, we deduce further that there exists a vector $\underline{k}_t \in E^c_t$ such that:

$$\Phi_t(\underline{u}_t \oplus \underline{k}_t) = \underline{x}_t \quad\text{and}\quad d\Phi_t(U_t \oplus K_t) = H_t$$

where the differential of $\Phi_t$ is taken at the point $\underline{u}_t \oplus \underline{k}_t$. Set $\Phi_{t,v} = \pi_v \circ \Phi_t$; it maps $\underline{u}_t \oplus \underline{k}_t$ to the v-coordinate of $\underline{x}_t$ and satisfies $d\Phi_{t,v}(U_t \oplus K_t) = p^N \mathbb{Z}_p$ where $N$ is the value returned by precision_absolute. Thus, as soon as the assumptions of Proposition 3.1 are fulfilled, we derive $\Phi_{t,v}\big((\underline{u}_t \oplus \underline{k}_t) + (U_t \oplus K_t)\big) = x_{t,v} + p^N \mathbb{Z}_p$. Noting that $\underline{k}_t \in K_t$, we finally get:

$$\Phi_{t,v}(\underline{u}_t + U_t) \subset \Phi_{t,v}\big((\underline{u}_t + U_t) \oplus K_t\big) = x_{t,v} + p^N \mathbb{Z}_p. \tag{6}$$

The latter inclusion means that the computed value $x_{t,v}$ is accurate at precision $O(p^N)$, i.e. that the output absolute precision is correct.

Unfortunately, checking automatically the assumptions of Proposition 3.1 in full generality seems to be difficult, though it can be done by hand for many particular examples (caruso-roe-vaccon:14a, ; caruso:2017, ; LV16, ).

###### Remark 3.2.

Assuming that Proposition 3.1 applies, the absolute precision computed as above is optimal if and only if the inclusion in (6) is an equality. Applying again Proposition 3.1 with the restricted mapping $\Phi_{t,v}|_{E^u_t}$ and the lattice $U_t$, we find that this happens if and only if $d\Phi_{t,v}(U_t) = p^N \mathbb{Z}_p$.

Unfortunately, the latter condition cannot be read off the matrix $M_t$ (because of the reductions we have performed). However, it is possible (and easy) to check the weaker condition that the minimal valuation in the column of v is not achieved by one of the cap generators. This check is achieved by the method is_precision_capped (provided by our package), which returns true if the computed precision comes from a cap. As a consequence, when this method answers false, the absolute precision computed by the software is likely optimal.

Precision on a subset of elements. Our package implements the method precision_lattice through which we can have access to the joint precision on a set of variables: it outputs a matrix (in echelon form) whose rows generate a lattice representing the precision on the subset of given variables.

When the variables are “independent”, the precision lattice is split and the method precision_lattice outputs a diagonal matrix:

```
ZpIn:  x = Z2(987, 10); y = Z2(21, 5)
       # We first retrieve the precision object
       L = Z2.precision()
       L.precision_lattice([x, y])
ZpLC:  [1024    0]
       [   0   32]
```

However, after some computations, the precision matrix evolves and does not remain diagonal in general (though it is always triangular because it is displayed in row-echelon form):

```
ZpIn:  u, v = x+y, x-y
       L.precision_lattice([u, v])
ZpLC:  [  32 2016]
       [   0 2048]
```

The fact that the precision matrix is no longer diagonal indicates that some well-chosen linear combinations of u and v are known with more digits than u and v themselves. In this particular example, the sum u + v is known at precision $O(2^{11})$ while the (optimal) precision on u and v separately is only $O(2^5)$.

```
ZpIn:  u, v
ZpLC:  (...10000, ...00110)
ZpIn:  u + v
ZpLC:  ...11110110110
```

Diffused digits of precision. The phenomenon observed above is formalized by the notion of diffused digits of precision introduced in (caruso-roe-vaccon:15, ). We recall briefly its definition.

###### Definition 3.3.

Let $E$ be a $\mathbb{Q}_p$-vector space endowed with a distinguished basis $(e_1, \ldots, e_n)$ and write $\pi_i : E \to \mathbb{Q}_p e_i$ for the projections. Let $H \subset E$ be a lattice. The number of diffused digits of precision of $H$ is the length of $H_0 / H$ where $H_0 = \pi_1(H) \oplus \cdots \oplus \pi_n(H)$.

If $H$ represents the actual precision on some object, then $H_0$ is the smallest diagonal lattice containing $H$. It then corresponds to the maximal coordinate-wise precision we can reach on the set of variables corresponding to the basis $(e_1, \ldots, e_n)$.

The method number_of_diffused_digits computes the number of diffused digits of precision on a set of variables. Observe:

```
ZpIn:  L.number_of_diffused_digits([x, y])
ZpLC:  0
ZpIn:  L.number_of_diffused_digits([u, v])
ZpLC:  6
```

For the last example, we recall that the relevant precision lattice $H$ is generated by the matrix:

$$\begin{pmatrix} 2^5 & 2016 \\ 0 & 2^{11} \end{pmatrix}.$$

The minimal diagonal suplattice $H_0$ of $H$ is generated by the scalar matrix $2^5 I_2$ and contains $H$ with index $2^6$. This is where the $6$ diffused digits of precision come from. They are easily visible here: the sum u + v is known with $11$ digits, that is exactly $6$ more digits than the summands u and v.
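The count of diffused digits can be reproduced in a few lines. The sketch below is our own illustration, not the package's code: the $i$-th diagonal entry of $H_0$ generates the $i$-th projection of $H$, i.e. the gcd of column $i$, and the number of diffused digits is $v_2(\det H / \det H_0)$.

```python
# Computing the number of diffused digits of the lattice above.
from math import gcd

def v2(n):
    return (n & -n).bit_length() - 1  # 2-adic valuation

H = [[2**5, 2016],
     [0,    2**11]]  # precision lattice on (u, v), in row-echelon form

det_H = H[0][0] * H[1][1]  # upper triangular: determinant = product of diagonal
diag = [gcd(H[0][j], H[1][j]) for j in range(2)]  # diagonal of H0
det_H0 = diag[0] * diag[1]
diffused = v2(det_H // det_H0)
print(diffused)  # 6
```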

### 3.4. Complexity

We now discuss the cost of the above operations. In what follows, we shall count operations in $\mathbb{Z}/p^N\mathbb{Z}$. Although $\mathbb{Q}_p$ is an inexact field, our model of complexity makes sense because the $p$-adic numbers we manipulate will all have roughly the same size $N$: for ZpLF, it is the precision we use for floating point arithmetic while, for ZpLC, it is the absolute cap which was fixed at the beginning.

It is convenient to introduce a total order on $V_t$: for $v, w \in V_t$, we say that $v \preceq w$ if v was created before w. By construction, the columns of the matrix $M_t$ are ordered with respect to $\preceq$. We denote by $r_t$ (resp. $c_t$) the number of rows (resp. columns) of $M_t$. By construction, $c_t$ is also the cardinality of $V_t$. We have $r_t \leq c_t$ and the equality always holds in the ZpLC case.

For $v \in V_t$, we define the index of v, denoted by $\mathrm{idx}_t(v)$, as the number of elements of $V_t$ which are not greater than v. If we sort the elements of $V_t$ by increasing order, v then appears in $\mathrm{idx}_t(v)$-th position. We also define the co-index of v by $\mathrm{coidx}_t(v) = c_t - \mathrm{idx}_t(v)$.

Similarly, for any variable $v \in V_t$, we define the height (resp. the co-height) of v at time $t$ as the number of pivot variables w such that $w \preceq v$ (resp. $w \succ v$). We denote it by $\mathrm{hgt}_t(v)$ (resp. by $\mathrm{cohgt}_t(v)$). Clearly $\mathrm{hgt}_t(v) + \mathrm{cohgt}_t(v) = r_t$. The height of v is the height of the significant part of the column of $M_t$ which corresponds to v. In the case of ZpLC, all variables are pivot variables and thus $\mathrm{hgt}_t(v) = \mathrm{idx}_t(v)$ and $\mathrm{cohgt}_t(v) = \mathrm{coidx}_t(v)$ for all v.

Creating a variable. With the notations of §3.2, it is obvious that creating a new variable w requires:

$$O\Big(\sum_{i=1}^n \mathrm{hgt}_t(v_i)\Big) \subset O(n \, r_t)$$

operations in $\mathbb{Z}/p^N\mathbb{Z}$. Here, we recall that $n$ is the arity of the operation defining w. In most cases it is at most $2$; the above complexity then reduces to $O(r_t)$.

In the ZpLF context, $r_t$ counts the number of user-defined variables. It is then expected to be constant (roughly equal to the size of the input) while running a given algorithm.

On the contrary, in the ZpLC context, $r_t$ counts the number of variables which are alive at time $t$. It is no longer expected to be constant but evolves continuously while the algorithm runs.

The tables of Figure 1 show the total number of created variables (which reflects the complexity) together with the maximum number of variables alive at the same time (which reflects the memory occupation) while executing two basic computations. The first one is the computation of the characteristic polynomial of a square matrix by the default algorithm used by SageMath for p-adic fields (which is a division-free algorithm of quartic complexity) while the second one is the computation of the gcd of two polynomials using a naive Euclidean algorithm (of quadratic complexity). We can observe that, for both of them, the memory usage is roughly equal to the square root of the complexity.

Deleting a variable. The deletion of the variable w induces the deletion of the corresponding column of $M_t$, possibly followed by a partial row-echelonization. In terms of algebraic complexity, the deletion itself is free. The cost of the echelonization is within $O(\mathrm{cohgt}_t(w) \cdot \mathrm{coind}_t(w))$ operations in the base ring.

In the ZpLF case, we expect that, most of the time, the deleted variables were created after all initial variables were set by the user. This means that we expect $\mathrm{cohgt}_t(w)$ to vanish, and so the corresponding cost to be negligible.

In the ZpLC case, we always have $\mathrm{cohgt}_t(w) = \mathrm{coind}_t(w)$, so that the cost becomes $O(\mathrm{coind}_t(w)^2)$. This does not look nice a priori. However, the principle of temporal locality (denning:2005, ) asserts that $\mathrm{coind}_t(w)$ tends to be small in general: destroyed variables are often variables that were created recently. As a basic example, variables which are local to a small piece of code (e.g. a short function or a loop) do not survive for long. It turns out that this behavior is typical in many implementations!

The histogram of Figure 2 shows the distribution of $\mathrm{coind}_t(w)$ while executing the Euclidean algorithm (naive implementation) on two input polynomials of a given degree. The bias is evident: most of the time, $\mathrm{coind}_t(w)$ is very small.
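The shape of such a histogram can be mimicked with a toy simulation (entirely ours, unrelated to an actual ZpL run): variables are created in order, and deletions almost always hit one of the most recently created ones, so the co-index recorded at deletion time is almost always tiny:

```python
from collections import Counter

alive = []                           # alive variables, in creation order
histogram = Counter()                # co-index observed at each deletion
pattern = [0, 0, 0, 1, 0, 0, 1, 2]   # mostly-recent deletions

for step in range(5000):
    alive.append(step)               # create a new variable
    if len(alive) > 5:
        offset = pattern[step % len(pattern)]
        # the co-index of the deleted variable is the number of alive
        # variables created after it, i.e. exactly `offset` here
        alive.pop(len(alive) - 1 - offset)
        histogram[offset] += 1

assert set(histogram) == {0, 1, 2}
assert histogram[0] > histogram[1] > histogram[2]
```

The assertions check the expected bias: deletions with a large co-index are rare, which is precisely what keeps the echelonization cost low.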

Summary: Impact on complexity. We consider the case of an algorithm with the following characteristics: its complexity is N operations in the base ring (without any tracking of precision), its memory usage is M elements of the base ring, and its input and output have sizes $n_{\mathrm{in}}$ and $n_{\mathrm{out}}$ (in elements of the base ring), respectively.

In the case of ZpLF, creating a variable has a cost $O(n_{\mathrm{in}})$, whereas deleting a variable is free. Thus, when executed with the ZpLF mechanism, the complexity of our algorithm becomes $O(N \cdot n_{\mathrm{in}})$.

In the ZpLC framework, creating a variable has a cost $O(M)$. The case of deletion is more difficult to handle. However, by the temporal locality principle, it seems safe to assume that it is not the bottleneck (which is the case in practice). Therefore, when executed with the ZpLC mechanism, the cost of our algorithm is expected to be roughly $O(N \cdot M)$. Going further in speculation, we might estimate the magnitude of M as about $\sqrt{N}$ (in accordance with the observation of Figure 1), leading to a complexity of $O(N^{3/2})$. For quasi-optimal algorithms, this overhead dominates the cost of the computation itself by a large margin. However, as soon as the complexity is at least quadratic in the size of the input, the overhead factor is at most on the order of $\sqrt{N}$ and the impact on the complexity is then limited.
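The estimates above can be checked with back-of-the-envelope arithmetic (symbols as in the text; the concrete figures are arbitrary):

```python
# An algorithm performing N base-ring operations, each creating at most
# one variable, pays O(n_in) per creation under ZpLF and O(M) under ZpLC.
N = 10**6                  # operations, without precision tracking
n_in = 100                 # size of the input
M = round(N ** 0.5)        # speculative estimate: memory ~ sqrt(complexity)

zplf_cost = N * n_in       # total ZpLF overhead: O(N * n_in)
zplc_cost = N * M          # total ZpLC overhead: O(N * M)

assert zplc_cost == round(N ** 1.5)   # i.e. O(N^(3/2)) under the estimate
assert zplf_cost < zplc_cost          # here, since n_in < sqrt(N)
```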

## 4. Conclusion

The package ZpL provides powerful tools (based on automatic differentiation) to track precision in the p-adic setting. In many concrete situations, it greatly outperforms standard interval arithmetic, as shown in §2. The impact on complexity is controlled but nevertheless non-negligible (see §3.4). For this reason, it is unlikely that a fast algorithm will rely directly on the machinery proposed by ZpL, though it might do so for a specific part of a computation. At least for now, bringing together speed and stability still requires a substantial human contribution and a careful case-by-case study of all relevant parameters.

Nevertheless, we believe that ZpL can be extremely helpful to anyone designing a fast and stable p-adic algorithm, for a couple of reasons. First, it provides mechanisms to automatically detect which steps of a given algorithm are stable and which ones are not. In this way, it highlights the parts of the algorithm on which the researcher has to concentrate their effort. Second, recall that a classical strategy to improve stability consists in working internally at higher precision. Finding the internal increase in precision that best balances efficiency and accuracy is not an easy task in general. Understanding the number of diffused digits of precision gives very useful hints in this direction. For example, when there are no diffused digits of precision, the optimal precision completely splits over the variables and there is no need to internally increase the precision. On the contrary, when there are many diffused digits of precision, a large increment is often required. Since ZpL gives direct access to the number of diffused digits of precision, it could be very useful to the designer who is concerned with the balance between efficiency and accuracy.

## References

•  754-2008 - IEEE Std. for Floating-Point Arithmetic. IEEE, 2008.
•  Xavier Caruso. Computations with p-adic numbers. pages 1–83, 2017.
•  Xavier Caruso. Numerical stability of the Euclidean algorithm over ultrametric fields. J. Théor. Nombres Bordeaux, 29:503–534, 2017.
•  Xavier Caruso, David Roe, and Tristan Vaccon. Tracking p-adic precision. LMS Journal of Computation and Mathematics, 17(A):274–294, 2014.
•  Xavier Caruso, David Roe, and Tristan Vaccon. p-Adic Stability in Linear Algebra. In Proceedings of the 2015 ACM on International Symposium on Symbolic and Algebraic Computation, ISSAC ’15, pages 101–108, New York, NY, USA, 2015. ACM.
•  Xavier Caruso, David Roe, and Tristan Vaccon. Division and Slope Factorization of p-Adic Polynomials. In Proceedings of the ACM on International Symposium on Symbolic and Algebraic Computation, ISSAC ’16, pages 159–166, New York, NY, USA, 2016. ACM.
•  Xavier Caruso, David Roe, and Tristan Vaccon. Characteristic Polynomials of p-adic Matrices. In Proceedings of the 2017 ACM on International Symposium on Symbolic and Algebraic Computation, ISSAC ’17, pages 389–396, New York, NY, USA, 2017. ACM.
•  Peter Denning. The locality principle. Commun. ACM, 48:19–24, 2005.
•  Nicholas Higham. Accuracy and Stability of Numerical Algorithms. SIAM, Philadelphia, 2nd ed. edition, 2002.
•  Pierre Lairez and Tristan Vaccon. On p-adic differential equations with separation of variables. In Proceedings of the ACM on International Symposium on Symbolic and Algebraic Computation, ISSAC 2016, Waterloo, ON, Canada, July 19-22, 2016, pages 319–323, 2016.
•  Louis Rall. Automatic Differentiation: Techniques and Applications, volume 120 of Lecture Notes in Computer Science. Springer, Berlin, 1981.
•  The Sage Developers. SageMath, the Sage Mathematics Software System (Version 8.1), 2018.