An Integer Arithmetic-Based Sparse Linear Solver Using a GMRES Method and Iterative Refinement

09/16/2020
by   Takeshi Iwashita, et al.
Hokkaido University

In this paper, we develop a (preconditioned) GMRES solver based on integer arithmetic, and introduce an iterative refinement framework for the solver. We describe the data format for the coefficient matrix and vectors for the solver that is based on integer or fixed-point numbers. To avoid overflow in calculations, we introduce initial scaling and logical shifts (adjustments) of operands in arithmetic operations. We present the approach for operand shifts, considering the characteristics of the GMRES algorithm. Numerical tests demonstrate that the integer arithmetic-based solver with iterative refinement has convergence comparable to that of the standard solver based on floating-point arithmetic. Moreover, we show that preconditioning is important, not only for improving convergence but also for reducing the risk of overflow.

I Introduction

In recent years, it has become difficult to improve the performance of processors, particularly their energy efficiency. The main reason is the decline of lithographic scaling, which threatens the well-known techno-economic model of the IT industry, that is, Moore’s Law [1, 2]. Thus, new computing technologies and devices based on physics different from CMOS technology are being widely investigated. Although quantum computing is a typical example of these technologies, some efforts aim to develop an ultra-low-power but high-performance computer that is operated by instructions similar to those of conventional computers, for example, computing devices based on single-flux-quantum (SFQ) circuits [3, 4]. However, these new types of computers may support only integer arithmetic in the early stages of research and deployment, because circuits for floating-point (FP) arithmetic are more complex and power-consuming than those for integer arithmetic. Accordingly, we attempt to evaluate the potential of integer arithmetic computing for scientific computing. Specifically, we focus on iterative methods that are widely used in various scientific simulations, and investigate an integer arithmetic-based iterative linear solver, in which only integer arithmetic is used in the main iteration loop.

While there is a wide variety of iterative solvers, we develop a generalized minimal residual (GMRES) solver [5] using integer (fixed-point number) arithmetic, which we denote by int-GMRES. The GMRES method is a Krylov subspace method and is used as a standard solver for linear systems with unsymmetric coefficient matrices. In our solver, the iterative refinement technique is used with the GMRES solver based on integer arithmetic to obtain a solution vector with the same accuracy as the output of a standard FP arithmetic solver. Although the technique is classical, it is useful for mixed-precision computing [6]. In this paper, we introduce the iterative refinement framework for an integer arithmetic-based solver and present the details of the implementation of the int-GMRES solver.

In Sections II and III, we introduce some notation and problem definitions, including the initial scaling of the linear system to be solved. In Section IV, we describe the iterative refinement framework for the solver based on integer arithmetic. In Sections V and VI, we present the details of the implementation of (preconditioned) int-GMRES. In Section VII, we present the numerical results. In Sections VIII and IX, we describe related works and summarize the paper.

II Notation

In this paper, we discuss a linear solver in which integer arithmetic is mainly used. In the program, some variables and array elements are declared as integer numbers, and they are treated as fixed-point numbers in the analysis. We use Q notation for fixed-point numbers: Qm.f denotes a number with m integer bits and f fractional bits. The word length is WL = m + f + 1, because a sign bit is used. The entire word is a two’s complement integer.

In the following, we denote the i-th-row, j-th-column element of matrix A by a_ij or (A)_ij. We denote the i-th element of vector v by v_i. When matrices, vectors, and variables have a bar, such as Ā, this indicates that their elements or values are fixed-point or integer numbers and are stored using the intWL type in the program.

III Problem and Initial Scaling

In this paper, we consider the following n-dimensional linear system of equations:

Ax = b.   (1)

The elements of A and b are given by FP numbers; typically, they are double precision. We need to solve (1) with sufficient accuracy; that is, the relative residual norm calculated using (double-precision) FP arithmetic must be smaller than a given tolerance. The final value of each element of x is given by an FP number.

First, the linear system (1) is scaled using FP arithmetic as follows:

(D_L A D_R)(D_R^{-1} x) = D_L b,   (2)

where we write Â = D_L A D_R and b̂ = D_L b. When we intend to preserve a particular property of the coefficient matrix, such as symmetry, the scaled linear system can be written as follows:

(D A D)(D^{-1} x) = D b.   (3)

In (2) and (3), D_L, D_R, and D are diagonal matrices. In the present analysis, the i-th diagonal element of the scaling matrix is given by

(4)

When the linear system (2) is solved mainly using integer arithmetic, the setting of the scaling can be an important issue, and depends on the solver implementation. Based on our preliminary tests, we suggest a conservative (small) scaling value, whereas a larger value can be set for a preconditioned solver.

IV Iterative Refinement

We use an iterative refinement technique, which is slightly adjusted for iterative linear solvers based on integer arithmetic. In the technique, we refine the approximate solution vector by solving the residual equation. We assume that we obtain a sufficiently accurate solution vector by m refinements. In each refinement, a linear system of equations is approximately solved. Then, the solution vector (or its sufficiently accurate approximation) is written as

x̂ = Σ_{i=1}^{m} ρ_i z_i.   (5)

In our technique, the approximate solution vector z_i for the i-th refinement is obtained by (approximately) solving the linear system of equations:

Â z_i = r̄_i.   (6)

In (6), each element of z_i and r̄_i is given by an FP number.

IV-A Setting of the Right-Hand Side and Solution Vector

Before the i-th refinement process, we calculate

r_i = b̂ - Â x̂_{i-1}   (7)

using FP arithmetic, where x̂_{i-1} = Σ_{j=1}^{i-1} ρ_j z_j. We note that r_1 = b̂. Although z_i can be determined by solving Â z_i = r_i, we solve its scaled system (6) considering the use of an integer arithmetic-based solver and the representation range of a fixed-point number. Using FP arithmetic, we calculate the scaled vector r̄_i of r_i using

ρ_i = ||r_i||_∞   (8)

and

r̄_i = r_i / ρ_i.   (9)

Then, the vector x̂_i for the i-th refinement is written as

x̂_i = x̂_{i-1} + ρ_i z_i.   (10)

When the entire refinement process works, we can expect that the scaling factor ρ_i decreases as i increases.

IV-B Coefficient Matrix

In this subsection, we describe the setting of Ā. Each element of the matrices used in the iterative linear solver is given by an integer number without fractional bits. After the initial scaling of the original linear system, we cast each element of Â to an integer number and obtain Ā^(1). Next, we calculate the error matrix E^(1) = Â - Ā^(1) using FP arithmetic. Then, we determine a scaling factor μ_2 such that each scaled error element fits within the intWL range:

μ_2 max_{i,j} |e^(1)_{ij}| < 2^{WL-1}.   (11)

After each element of E^(1) is multiplied by μ_2, it is cast to an intWL number to obtain Ā^(2). After the same scaling and casting processes are performed repeatedly, the coefficient matrix can be written as

Â = Σ_{j=1}^{J} (1/μ_j) Ā^(j),  μ_1 = 1,   (12)

because each element of Â is an FP number with a finite word length. Each element of Ā^(j) is an integer number (no fractional bits). It holds that μ_j < μ_{j+1}. In the i-th refinement process, we use a limited number of terms on the right-hand side of (12); that is,

Â ≈ Σ_{j=1}^{s+1} (1/μ_j) Ā^(j),   (13)

where s is a parameter for the solver. When s = 0, we only use Ā^(1) in the refinement process; that is, Â ≈ Ā^(1).

IV-C Refinement Process

Finally, we introduce an iterative refinement framework for iterative linear solvers that mainly use integer arithmetic, as shown in Fig. 1. In Fig. 1, x̃ is the approximation of x̂ and i_max is the maximum number of refinements. We assume that no FP arithmetic is used in the main loop of the iterative solver used in the framework. Table I lists the arguments of the iterative linear solver based on integer arithmetic. The input parameter f is the number of fractional bits for the fixed-point numbers involved in the iterative solver. In the program, the input data of the coefficient matrix are represented by integer numbers. The input of z_i is an initial guess for the iterative solver. The output of z_i is the (approximate) solution vector of (6), each element of which is an FP number.

Initial scaling
Calculate Ā^(1), …, Ā^(s+1), μ_1, …, μ_{s+1}
for i = 1, 2, …, i_max
  if (converged) break
  Calculate r_i = b̂ - Â x̃
  Calculate ρ_i and r̄_i = r_i / ρ_i
  Integer_arithmetic_based_linear_solver( arguments )
  // to solve Â z_i = r̄_i
  x̃ = x̃ + ρ_i z_i
endfor
Fig. 1: Iterative refinement framework using the iterative linear solver based on integer arithmetic

V GMRES Solver Using Integer Arithmetic (int-GMRES)

V-A Overview and Data Types of int-GMRES

In this section, we introduce the GMRES solver based on integer arithmetic that is used in the iterative refinement framework. We denote the solver by int-GMRES in this paper.

In our solver, each element of the coefficient matrix is given by an integer number (no fractional bits). The elements of vectors and variables used in the main GMRES iteration loop are given by fixed-point numbers in the Qm.f format.

In the following sections, we use the term “bit shift.” In this paper, left and right shifts by β bits refer to multiplication and division by 2^β, respectively. These operations for signed integer numbers can be implemented using shift instructions when the computer in use supports a logical shift. In this paper, we assume the use of this type of computer. However, in some computational environments, the result of a shift operation for a signed integer number is “undefined.”

Figure 2 shows the algorithm for the int-GMRES(m) solver, where m is the restart period. In the figure, (FP) indicates a statement or calculation based on FP arithmetic, whereas (INT) indicates integer arithmetic. In the following subsections, we explain the basic arithmetic of fixed-point numbers and the kernels of GMRES, and then present the implementation details.

V-B Basic Arithmetic of Fixed-Point Numbers

In this subsection, we describe the implementation of the four basic arithmetic operations on fixed-point numbers in the Qm.f format.

V-B1 Addition and Subtraction

The addition and subtraction of two fixed-point numbers in the Qm.f format are straightforwardly implemented using the integer addition and subtraction instructions. The obtained integer value directly represents the result in the Qm.f format.

V-B2 Multiplication

The multiplication of fixed-point numbers is required in various parts of the GMRES program, including the calculations of inner products and norms. Let us consider the multiplication of two fixed-point numbers x and y in the Qm.f format. We denote the integer representations of x and y in the program by X and Y, respectively; that is, X = 2^f x and Y = 2^f y. The multiplication procedure for xy is given as follows. After the two integer numbers X and Y are divided by 2^{b1} and 2^{b2}, respectively, they are multiplied using the integer instruction. The obtained value corresponds to xy in the Q format with f' fractional bits, where f' = 2f - b1 - b2. When we need the result represented as a Qm.f number, the value is divided by 2^{f - b1 - b2}. Figure 3 demonstrates the multiplication procedure of fixed-point numbers. When the computer supports a logical shift operation, using a C language-like representation, the multiplication in the program, in which the result Z is represented in the Qm.f format, is written as

Z = ((X >> b1) * (Y >> b2)) >> (f - b1 - b2),   (14)

where Z is the integer representation in the program for xy.

Arrays, variables            I/O              Number type
Ā^(1), …, Ā^(s+1)            Input            Integer
μ_1, …, μ_{s+1}              Input            Integer
f                            Input            Integer
r̄_i                          Input            Floating point
z_i                          Input / Output   Floating point
TABLE I: Types of arguments
1. Compute r_0 = r̄_i - Â z_0, v_1 = r_0 / ||r_0||_2   // (FP)
2. Cast v_1 to v̄_1
3. ḡ = (||r_0||_2, 0, …, 0)^T, cast to fixed point
4. For j = 1, 2, …, m
5.   Compute w̄ = Ā v̄_j   // (INT)
6.   For k = 1, …, j
7.     h̄_{k,j} = (w̄, v̄_k)   // (INT)
8.     w̄ = w̄ - h̄_{k,j} v̄_k   // (INT)
9.   Endfor
10.   h̄_{j+1,j} = ||w̄||_2   // (INT)
11.   v̄_{j+1} = w̄ / h̄_{j+1,j}   // (INT)
12.   For k = 1, …, j-1
13.     (h̄_{k,j}, h̄_{k+1,j}) ← (c̄_k h̄_{k,j} + s̄_k h̄_{k+1,j}, -s̄_k h̄_{k,j} + c̄_k h̄_{k+1,j})
                             // (INT)
14.   Endfor
15.   d̄ = (h̄_{j,j}^2 + h̄_{j+1,j}^2)^{1/2}   // (INT)
16.   c̄_j = h̄_{j,j} / d̄, s̄_j = h̄_{j+1,j} / d̄   // (INT)
17.   ḡ_{j+1} = -s̄_j ḡ_j, ḡ_j = c̄_j ḡ_j   // (INT)
18.   h̄_{j,j} = d̄, h̄_{j+1,j} = 0
19.   Update the residual norm estimate   // (INT)
20. Endfor
21. Cast v̄_1, …, v̄_m to v_1, …, v_m, and (H̄, ḡ) to (H, g)
22. Solve the triangular system H y = g   // (FP)
23. z_i = z_0 + Σ_{k=1}^{m} y_k v_k   // (FP)
Fig. 2: Algorithm for the int-GMRES(m) method
Fig. 3: Multiplication of fixed-point numbers

V-B3 Division

The division of x by y is implemented as follows. After the first source operand X is multiplied by 2^{b1} and the second source operand Y is divided by 2^{b2}, the first operand is divided by the second operand using the integer division instruction. The resultant value is multiplied by 2^{f - b1 - b2}, and the final result in the Qm.f format is obtained.

V-B4 Square Root

The calculation of a square root is required in the GMRES algorithm. In this subsection, we describe the calculation of the square root of a fixed-point number x in the Qm.f format, where f is even. Let X denote the integer representation of x in the program. We apply the Babylonian square root algorithm to X using integer arithmetic. The obtained value is multiplied by 2^{f/2}. The final result provides the integer representation of the square root of x in the Qm.f format.

V-C Kernels of the GMRES Method

In this subsection, we describe the implementation of three computational kernels of the GMRES method.

V-C1 Inner product

We consider the inner product of two vectors, each element of which is a fixed-point number in the Qm.f format. Using the multiplication and addition operations for fixed-point numbers described in Section V-B, we obtain the result of the inner product as a number in the Q format with f' = 2f - b1 - b2 fractional bits. To obtain better accuracy in the accumulation, we typically set f' to be larger than f. Therefore, to obtain a result in the Qm.f format, the result variable is divided by 2^{f'-f}. Figure 4 shows a sample code for the inner product. In the figure, b1 and b2 correspond to the operand shifts in the multiplication procedure.

cs = 0;
for (l = 0; l < n; l++) {
    cs = cs + (v[l] >> b1) * (w[l] >> b2);
}
Fig. 4: Calculation of the inner product (only on computers that support a logical shift for a signed integer number)

V-C2 Norm

When we calculate a vector norm, we first calculate the inner product of the vector with itself. Using the procedure described above, we obtain the result of the inner product in the Q format with f' fractional bits. Then, we calculate its square root using the procedure described in Section V-B4. Finally, we obtain the norm of the vector, which is represented in the Q format.

V-C3 Matrix Vector Multiplication

Matrix vector multiplication is a main kernel of Krylov subspace methods, in which the GMRES method is classified. From (13), the kernel consists of s + 1 matrix vector multiplications:

w̄ = Σ_{j=1}^{s+1} (1/μ_j) Ā^(j) v̄,   (15)

where v̄ is an n-dimensional source vector. In our implementation, each element of the matrices is given by an integer number, which has no fractional bits. Each element of the source and resultant vectors is a fixed-point number in the Qm.f format. Consequently, each matrix vector multiplication can be performed by a simple integer matrix vector multiplication program. Each element of Ā^(j) v̄ is divided by μ_j, and then added to the corresponding element of the resultant vector.

In the above procedure, it is implied that the result of Ā^(j) v̄ does not contribute to the final result when μ_j is substantially large. Consequently, we estimate that s must be at most 3 or 4 in a practical scenario. When we require more accuracy for the matrix vector multiplication, we should use multiple words for each element of the resultant vector.

V-D Implementation Details of int-GMRES and Setting of the Operand Shifts

In this subsection, we present the details of the int-GMRES solver while paying special attention to setting the parameters in fixed-point number arithmetic.

V-D1 Cast of v_1 to v̄_1 (l. 2 in Fig. 2)

Each element of v_1 is multiplied by 2^f using FP arithmetic. Then, it is cast to an intWL number. The obtained integer array that corresponds to v̄_1 consists of fixed-point numbers in the Qm.f format.

V-D2 Arnoldi Process (l. 4-11 in Fig. 2)

Line 5 (matrix vector multiplication)

Line 5 is matrix vector multiplication, which we implement using the method described in Section V-C3.

Line 7 (inner product)

Line 7 in Fig. 2 is the calculation of an inner product, which we implement using the method described in Section V-C1. We suggest a special setting for the operand shifts in the multiplication involved in the calculation. Because v̄_k is a normalized vector, the upper bits of each element of v̄_k are always zero. Considering this feature, we only shift the first source operand, which corresponds to w̄; that is, b2 = 0.

Line 8

Line 8 involves the multiplication of a vector element by a scalar value and a subtraction between two vectors. As for the inner product in line 7, we only shift the first source operand in the multiplication, considering the profile of v̄_k.

Line 10 (norm)

Line 10 is the calculation of the norm, which we implement using the procedure described in Section V-C2. In the multiplication, we naturally set b1 = b2, because both operands are the same vector.

Line 11

Line 11 is the division of a vector by a scalar number. We use the procedure for division described in Section V-B3.

V-D3 Givens Rotation (l. 12-20 in Fig. 2)

Line 13

We regard the statement as the inner product of two vectors of two elements. Therefore, we use the procedure for the inner product. For the multiplication involved in the procedure, we also use a special setting for the operand shifts. Because the absolute values of c̄_k and s̄_k are not larger than one, we only shift the second operand, which corresponds to h̄_{k,j} or h̄_{k+1,j}; that is, b1 = 0.

Line 15

We can implement the statement as the calculation of the norm of the vector of two elements.

Line 16

We use the procedure for the division of a fixed-point number by another fixed-point number.

Line 17

Line 17 consists of multiplications of scalar values. The absolute values of c̄_j and s̄_j are bounded by one, and |ḡ_j| monotonically decreases as the iteration count increases. Thus, we do not shift the operands in the multiplication because of the low risk of overflow.

V-D4 Update of the Solution Vector (l. 21-23 in Fig. 2)

Line 21

We cast each element of the integer arrays for v̄_1, …, v̄_m, H̄, and ḡ to FP numbers, which we then divide by 2^f. In the practical implementation, we combine these casting operations with the following computations (l. 22-23) to avoid an additional array allocation.

Lines 22 and 23

We update the output of the int-GMRES solver, that is, z_i, using FP arithmetic.

V-D5 Summary of Setting the Parameters

Table II summarizes the types of fixed-point numbers, that is, the numbers of fractional bits, and the operand shifts in the arithmetic involved in the int-GMRES solver. In the table, the line number corresponds to the line of the statement in Fig. 2, and #fb represents the number of fractional bits of the fixed-point numbers used for vectors and variables.

VI Preconditioning

Preconditioning is a practically important technique for accelerating the convergence of an iterative solver. To apply a preconditioning technique to the GMRES solver, we replace two statements (lines 1 and 5) in Fig. 2 with the following statements:

Line 1'.  Compute r_0 = M^{-1}(r̄_i - Â z_0), v_1 = r_0 / ||r_0||_2   // (FP)

Line 5'.  Compute w̄ = M̄^{-1} Ā v̄_j   // (INT)

Typically, the preconditioner matrix M well approximates the coefficient matrix. In this paper, we report the application of a standard incomplete LU (ILU) factorization, precisely ILU(0) preconditioning. In our solver, each element of the preconditioner matrix is given by an integer number (no fractional bits), the same as for the coefficient matrix.

VI-A ILU Preconditioning

In ILU preconditioning, we use the incompletely factorized matrix of the coefficient matrix. Using FP arithmetic, we incompletely factorize the coefficient matrix as

Â ≈ L D U,   (16)

where L and U have ones for their diagonal elements. Next, we define two diagonal matrices as follows:

(17)
(18)

and

(19)

Then, we introduce two scaled triangular matrices:

(20)

and

(21)

We apply the type cast from float/double to int to each element of the matrices (20) and (21), and then obtain lower and upper triangular integer matrices, L̄ and Ū, respectively. Then, the preconditioner matrix is given by

M̄ = L̄ Ū.   (22)

The ILU preconditioning step corresponding to lines 1' and 5' is given by forward and backward substitutions. We can simply use a program for the substitutions in which integer arithmetic is used; that is, if we have a program for substitution based on FP arithmetic, we only need to change the data type of the matrix and vectors (float/double to int) in the program.

Input Output
Line # Kernel Arithmetic 1st (or single) operand 2nd operand Result
#fb Shift #fb Shift #fb
Line 5 Matrix vector Multiplication 0 No No
Multiplication Addition No No
Multiplication No
Line 7 Inner Product Addition No No
Shift - -
Line 8 Vector update Multiplication No
Subtraction No No
Line 10 Norm Multiplication
Addition No No
Square root No - -
Line 11 Division
Multiplication No
Line 13 Inner Product Addition No No
Shift - -
Line 15 Multiplication
Addition No No
Square root No - -
Line 16 Division
Line 17 Multiplication No No
TABLE II: Number of fractional bits of the input and output variables and operand shifts in the calculation

VII Numerical Results

VII-A Computation Environment and Test Problems

We conducted numerical tests to evaluate the developed int-GMRES solver. We evaluated the convergence of the relative residual norm of the solver in comparison with a standard GMRES solver using FP arithmetic. We performed the numerical tests on a node of a Fujitsu CX2550 (M4) at the Information Initiative Center, Hokkaido University. The node was equipped with two Intel 20-core Xeon (Gold 6148) processors and 384 GB of shared memory. We wrote the program code in C and used an Intel compiler for the analysis. On this computer, a logical shift was supported for signed integer numbers.

In the integer arithmetic-based solver, the linear system (6) for each refinement was approximately solved using m iterations (one restart cycle) of int-GMRES(m). For comparison, we also used a standard double-precision GMRES(m) solver. We set the convergence criterion as the relative residual norm being smaller than a given tolerance. The relative residual norm was calculated every m iterations using FP arithmetic in both the standard and integer arithmetic-based solvers. The comparison of the convergence properties of the solvers was thus performed every m iterations.

For the test problems, we selected ten linear systems from the SuiteSparse Matrix Collection [7]. We selected matrices of various sizes from the collection, for which the standard GMRES solver based on double-precision FP arithmetic worked. Table III lists the properties of the selected matrices. The right-hand side vector was given by a vector of ones.

Data set Problem type Dimension # nonzero
atmosmodj CFD 1,270,432 8,814,880
atmosmodl CFD 1,489,752 10,319,760
cage14 Graph 1,505,785 27,130,349
CoupCons3D Structural problem 416,800 17,277,420
epb2 Thermal problem 25,228 175,027
majorbasis Optimization problem 160,000 1,750,416
memchip Circuit simulation 2,707,524 13,343,948
stomach Electro-physical model 213,360 3,021,648
torso3 Finite difference model 259,156 4,429,042
wang3 Semiconductor analysis 26,064 177,168
TABLE III: Matrix information for the test problems

VII-B Results for the Non-Preconditioned GMRES Solver

The int-GMRES solver based on integer arithmetic requires several parameters to be set. The number of fractional bits f was given by 30. The word length WL was 64, and the 64-bit integer (int64) type was used for both the fixed-point and integer numbers in the solver. The parameter s for the coefficient matrix was given by zero; that is, we only used Ā^(1) in the test. These settings were also used in the numerical test of the preconditioned GMRES solver. Table IV lists the settings (numbers of bits) for the operand shifts involved in the calculations of int-GMRES. Moreover, the first-operand shift in the division operation was set to 16 for the non-preconditioned solver and 32 for the ILU-preconditioned solver.

Table V shows the numbers of iterations of a standard GMRES solver using double-precision FP arithmetic and of the int-GMRES solver, where “Double” denotes the standard solver. In the numerical tests, int-GMRES, as well as the standard solver, could solve every problem. When m = 10, int-GMRES unexpectedly converged faster in three test cases. In the wang3 test, which was the worst case for int-GMRES, the solver required only 20% more iterations than the standard solver.

When m = 30, the convergence rates of the standard and integer arithmetic-based solvers were comparable for the test cases, except for cage14 and wang3. However, int-GMRES required only one more restart period than the standard solver to converge for cage14. In the wang3 test, which was again the worst case for int-GMRES, the solver required only 24% more iterations than the standard solver. Figure 5 compares the convergence of the standard double-precision and integer arithmetic-based GMRES solvers. For the atmosmodj dataset, the two solvers had an identical convergence rate, which means that the loss of accuracy in int-GMRES did not have a significant influence on solver performance. In contrast, the int-GMRES solver had a lower convergence rate than the standard solver in the wang3 test.

Line # Arithmetic Setting for operand shift
Line 7 Multiplication
Line 8 Multiplication
Line 10 Multiplication
Line 11 Division ,
Line 13 Multiplication
Line 15 Multiplication
Line 16 Division ,
TABLE IV: Number of fractional bits of the input and output variables, and operand shifts for the GMRES solver (no preconditioning)
                    m = 10                  m = 30
Data set Double int-GMRES Double int-GMRES
atmosmodj 5,820 5,850 2,100 2,100
atmosmodl 880 840 420 420
cage14 20 20 30 60
CoupCons3D 430 430 360 360
epb2 820 730 540 540
majorbasis 90 100 90 90
memchip 460 380 300 300
stomach 310 310 180 180
torso3 150 150 150 150
wang3 720 860 510 630
TABLE V: Number of iterations in no preconditioning case
(a) atmosmodj test
(b) wang3 test
Fig. 5: Comparison of the convergence behaviors of the standard and int-GMRES solvers without preconditioning when m is 30

VII-C Results for the Preconditioned GMRES Solver

When preconditioning is applied to an iterative solver, the convergence rate generally improves. Thus, we can expect that the residual norm remains relatively small and the risk of overflow in the calculation is reduced. When ILU preconditioning is used, we can avoid the operand shifts that sacrifice the accuracy of arithmetic in the solver. We note that the first source operand shift in the division operation is still necessary for improving the calculation accuracy. Table VI lists the settings of the operand shifts in the preconditioned solver based on integer arithmetic, which is denoted by int-ILU-GMRES.

Table VII shows the numbers of iterations of the ILU-GMRES solver using double-precision FP arithmetic and of the int-ILU-GMRES solver. Compared with the non-preconditioned case, both solvers attained a significant improvement in convergence. Moreover, the convergence rates of the two solvers were comparable; the int-ILU-GMRES solver required only one more restart period than the standard solver in some test cases. Figure 6 shows that both solvers had identical convergence behavior of the relative residual norm in the wang3 test.

VII-D Discussion

VII-D1 Preconditioning

Preconditioning is important in the context of iterative solvers based on integer arithmetic because it reduces the risk of overflow. Consequently, we can decrease the number of bits of the operand shifts, which improves the accuracy of the arithmetic. For the non-preconditioned solver, we investigated an auto-tuning technique for the shifts; however, it proved to be unnecessary in the preconditioned case. In the implementation of int-ILU-GMRES, we do not need to use the operand shifts that sacrifice accuracy. A similar effect was also confirmed in the Gauss-Seidel preconditioning case.

VII-D2 Condition of the Problems

Because we selected test problems (matrices) for which a non-preconditioned GMRES solver using FP arithmetic attained convergence, the problems were not heavily ill-conditioned. Consequently, the int-GMRES solver also solved the problems. Problems may exist that the standard FP arithmetic solver can solve but int-GMRES cannot; however, as far as we have tested, such a problem does not seem easy to find. That is, the int-GMRES solver used with the iterative refinement technique may have solver performance comparable to that of the standard FP arithmetic solver.

Line # Arithmetic Setting for operand shift
Line 7 Multiplication
Line 8 Multiplication
Line 10 Multiplication
Line 11 Division ,
Line 13 Multiplication
Line 15 Multiplication
Line 16 Division ,
TABLE VI: Number of fractional bits of the input and output variables and operand shifts for the ILU-GMRES solver
                    m = 10                  m = 30
Data set Double int-GMRES Double int-GMRES
atmosmodj 610 610 300 300
atmosmodl 140 140 120 120
cage14 10 20 30 60
CoupCons3D 140 150 30 60
epb2 50 50 60 60
majorbasis 20 20 30 60
stomach 20 20 30 60
torso3 40 40 30 60
wang3 180 180 120 120
TABLE VII: Number of iterations in the ILU preconditioning case
Fig. 6: Comparison of the convergence behaviors of the standard and int-GMRES solvers with ILU preconditioning in the wang3 test when m is 30

VIII Related Work

In this section, we introduce several papers that discuss mixed-precision linear solvers using the iterative refinement technique. The survey paper [6] by D. Göddeke et al. provides a good introduction to the mixed-precision iterative refinement algorithm framework. The paper [8] by Anzt et al. is another early work on a mixed-precision linear solver, in which the authors reported a GPU implementation of an error correction solver using the GMRES method and showed the effectiveness of their approach in CFD applications. A. Haidar et al. reported the development of architecture-specific algorithms and highly tuned implementations of mixed-precision iterative refinement solvers for the latest GPUs in [9]. Their solver, which involved LU factorization, was targeted at linear systems with dense coefficient matrices. Carson et al. presented a general algorithm for iterative refinement with three precisions and its error analysis in [10]. Moreover, the Exascale Computing Project Multiprecision Effort Team (Lead: Hartwig Anzt) recently opened its technical report to the public, which provides a comprehensive review of mixed-precision computing [11].

Next, we briefly mention analyses based on integer arithmetic (fixed-point numbers). Currently, integer arithmetic is often used in machine learning and artificial intelligence applications; LU factorization based on integer arithmetic for these applications is given in [11]. Numerical linear algebra algorithms based on fixed-point numbers have also been investigated in the context of signal processing [12, 13]. The difference between the present research and these papers lies in the investigation and development of the GMRES method based on integer arithmetic.

IX Conclusions

In this paper, we developed a GMRES solver based on integer arithmetic, denoted by int-GMRES. The int-GMRES solver was used with an iterative refinement technique to attain a solution as accurate as that of a standard linear solver based on FP arithmetic. We also developed an ILU-preconditioned int-GMRES solver. In integer arithmetic (fixed-point number) computing, it is important to avoid overflow in calculations. We explained how the operands are adjusted (logically shifted) in the calculations, considering the characteristics of the GMRES method. We conducted numerical tests using matrices from the SuiteSparse Matrix Collection. The numerical results demonstrated that the int-GMRES solver had convergence comparable to that of the standard FP solver. Moreover, we found that preconditioning was important for the solver using integer arithmetic to avoid overflow.

In the future, we will evaluate solver performance in terms of timing on the model of new computing devices in which integer arithmetic has advantages over conventional computing devices for calculation speed or power consumption.

References

  • [1] J. S. Vetter, E. P. DeBenedictis and T. M. Conte, “Architectures for the Post-Moore Era,” IEEE Micro, vol. 37, pp. 6–8, 2017.
  • [2] J. Shalf, “The future of computing beyond Moore’s law,” Phil. Trans. R. Soc. A. , 378 (2166), 20190061, 2020.
  • [3] R. Sato, Y. Hatanaka, Y. Ando, M. Tanaka, A. Fujimaki, K. Takagi, N. Takagi, “High-speed operation of random-access-memory-embedded microprocessor with minimal instruction set architecture based on rapid single-flux-quantum logic,” IEEE Trans. Appl. Supercond., vol. 27, pp. 1–5, 2017.
  • [4] K. Ishida, M. Tanaka, T. Ono, K. Inoue, “Towards ultra-high-speed cryogenic single-flux-quantum computing,” IEICE Trans. Electron., vol. E101-C, pp. 359–369, 2018.
  • [5] Y. Saad, Iterative Methods for Sparse Linear Systems, (Second ed.). Philadelphia, PA, SIAM, 2003.
  • [6] D. Göddeke, R. Strzodka, and S. Turek, “Performance and accuracy of hardware-oriented native-, emulated- and mixed-precision solvers in FEM simulations,” Int. J. Parallel Emergent Distrib. Syst., vol. 22, pp. 221–256, 2007.
  • [7] T. A. Davis and Y. Hu, “The University of Florida sparse matrix collection,” ACM Trans. Math. Software, vol. 38, pp. 1–25, 2011.
  • [8] H. Anzt, V. Heuveline, and B. Rocker, “An error correction solver for linear systems: evaluation of mixed precision implementations,” VECPAR 2010, LNCS, vol. 6449, Springer, 2011.
  • [9] A. Haidar, S. Tomov, J. Dongarra, and N. J. Higham, “Harnessing GPU tensor cores for fast FP16 arithmetic to speed up mixed-precision iterative refinement solvers,” Proc. SC18, Intl. Conf. High Performance Comput., Networking, Storage and Analysis, pp. 603–613, 2018.

  • [10] E. Carson, and N. J. Higham, “Accelerating the solution of linear systems by iterative refinement in three precisions,” SIAM J. Sci. Comput., vol. 40, pp. A817–A847, 2018.
  • [11] A. Abdelfattah, H. Anzt, E.G. Boman, E. Carson, T. Cojean, J. Dongarra, M. Gates, T. Grützmacher, N.J. Higham, S. Li, N. Lindquist, Y. Liu, J. Loe, P. Luszczek, P. Nayak, S. Pranesh, S. Rajamanickam, T. Ribizel, B. Smith, K. Swirydowicz, S. Thomas, S. Tomov, Y.M. Tsai, I. Yamazaki, U.M. Yang, “A Survey of Numerical Methods Utilizing Mixed Precision Arithmetic,” arXiv preprint arXiv:2007.06674, 2020.
  • [12] Z. Nikolić, H. T. Nguyen, and G. Frantz, “Design and implementation of numerical linear algebra algorithms on fixed point DSPs,” EURASIP Journal on Advances in Signal Processing, 087046, 2007.
  • [13] T. Pradhan, B. Kabi, and A. Routray, “Fixed-point Hestenes algorithm for singular value decomposition of symmetric matrices,” Proc. 2013 Intl. Conf. Electronics, Signal Processing and Communication Systems, 2013.