A Survey of Numerical Methods Utilizing Mixed Precision Arithmetic

by Ahmad Abdelfattah, et al.

In recent years, hardware vendors have started designing low-precision special function units in response to the Machine Learning community's demand for high compute power in low-precision formats. Server-line products also increasingly feature low-precision special function units, such as the NVIDIA tensor cores in ORNL's Summit supercomputer, which provide more than an order of magnitude higher performance than what is available in IEEE double precision. At the same time, the gap between compute power and memory bandwidth keeps increasing, making data access and communication prohibitively expensive compared to arithmetic operations. To start the multiprecision focus effort, we survey the numerical linear algebra community and summarize all existing multiprecision knowledge, expertise, and software capabilities in this landscape analysis report. We also include current efforts and preliminary results that may not yet be considered "mature technology" but have the potential to grow into production quality within the multiprecision focus effort. As we expect the reader to be familiar with the basics of numerical linear algebra, we refrain from providing a detailed background on the algorithms themselves. Instead, we focus on how mixed- and multiprecision technology can help improve the performance of these methods, and we present highlights of applications that significantly outperform the traditional fixed-precision methods.




tcFFT: Accelerating Half-Precision FFT through Tensor Cores

Fast Fourier Transform (FFT) is an essential tool in scientific and engi...

Mixed precision matrix interpolative decompositions for model reduction

Renewed interest in mixed-precision algorithms has emerged due to growin...

A Study of Mixed Precision Strategies for GMRES on GPUs

Support for lower precision computation is becoming more common in accel...

Mixed precision in Graphics Processing Unit

Modern graphics computing units (GPUs) are designed and optimized to per...

ARCHITECT: Arbitrary-precision Hardware with Digit Elision for Efficient Iterative Compute

Many algorithms feature an iterative loop that converges to the result o...

Low-Precision Arithmetic for Fast Gaussian Processes

Low-precision arithmetic has had a transformative effect on the training...

Open-Source GEMM Hardware Kernels Generator: Toward Numerically-Tailored Computations

Many scientific computing problems can be reduced to Matrix-Matrix Multi...
