FT-BLAS: A High Performance BLAS Implementation With Online Fault Tolerance

by Yujia Zhai, et al.

Basic Linear Algebra Subprograms (BLAS) is a core library in scientific computing and machine learning. This paper presents FT-BLAS, a new implementation of BLAS routines that not only tolerates soft errors on the fly, but also provides performance comparable to modern state-of-the-art BLAS libraries on widely used processors such as Intel Skylake and Cascade Lake. To accommodate the features of BLAS, which contains both memory-bound and computing-bound routines, we propose a hybrid strategy to incorporate fault tolerance into our brand-new BLAS implementation: duplicating computing instructions for memory-bound Level-1 and Level-2 BLAS routines and incorporating an Algorithm-Based Fault Tolerance mechanism for computing-bound Level-3 BLAS routines. Our high performance and low overhead are obtained from delicate assembly-level optimization and a kernel-fusion approach to the computing kernels. Experimental results demonstrate that FT-BLAS offers high reliability and high performance – faster than Intel MKL, OpenBLAS, and BLIS by up to 3.50% [...] three levels of BLAS we benchmarked, even under hundreds of errors injected per minute.
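
The hybrid strategy described above can be illustrated with a small sketch. This is not the FT-BLAS implementation, which works at the assembly level with fused kernels; it is a plain-Python illustration under stated assumptions. The function names (`dmr_dot`, `abft_matmul`), the `fault` parameters used to simulate a soft error, and the tolerance are all hypothetical. The first function duplicates the computation of a Level-1 dot product and compares the two results; the second verifies a Level-3 matrix product against row and column checksums and corrects a single corrupted element.

```python
def dmr_dot(x, y, fault=None):
    """Dot product by duplicated computation (illustrative only).

    `fault` is a hypothetical index at which a simulated soft error
    corrupts one partial product of the first copy.
    Returns (value, error_detected).
    """
    def once(inject):
        acc = 0.0
        for i, (a, b) in enumerate(zip(x, y)):
            p = a * b
            if inject is not None and i == inject:
                p += 1.0  # simulated soft error
            acc += p
        return acc

    first = once(fault)
    second = once(None)       # duplicated computation
    if first != second:       # mismatch: recompute and report
        return once(None), True
    return first, False


def matmul(A, B):
    """Naive matrix product on lists of lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def abft_matmul(A, B, fault=None, tol=1e-9):
    """Checksum-verified C = A * B (illustrative only).

    `fault` is a hypothetical (row, col, delta) triple that corrupts one
    element of C after it is computed. Returns (C, corrected).
    """
    m, n = len(A), len(B[0])
    C = matmul(A, B)
    if fault is not None:
        i, j, delta = fault
        C[i][j] += delta      # simulated soft error

    # Reference checksums: (e^T A) B = e^T C and A (B e) = C e.
    col_ref = matmul([[sum(col) for col in zip(*A)]], B)[0]
    row_ref = [r[0] for r in matmul(A, [[sum(r)] for r in B])]

    col_err = [sum(C[i][j] for i in range(m)) - col_ref[j] for j in range(n)]
    row_err = [sum(C[i]) - row_ref[i] for i in range(m)]
    bad_rows = [i for i, e in enumerate(row_err) if abs(e) > tol]
    bad_cols = [j for j, e in enumerate(col_err) if abs(e) > tol]

    corrected = False
    if len(bad_rows) == 1 and len(bad_cols) == 1:
        # A single bad row and column locate the corrupted element;
        # the checksum difference gives the amount to subtract.
        C[bad_rows[0]][bad_cols[0]] -= row_err[bad_rows[0]]
        corrected = True
    return C, corrected
```

The split mirrors the reasoning in the abstract: duplicating arithmetic is cheap when a routine is limited by memory traffic, while checksum verification adds only lower-order work next to the cubic cost of a matrix product.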


