Large Scale Distributed Linear Algebra With Tensor Processing Units

12/16/2021
by   Adam G. M. Lewis, et al.

We have repurposed Google Tensor Processing Units (TPUs), application-specific chips developed for machine learning, into large-scale dense linear algebra supercomputers. The TPUs' fast inter-core interconnects (ICIs), physically two-dimensional network topology, and high-bandwidth memory (HBM) permit distributed matrix multiplication algorithms to rapidly become computationally bound. In this regime, the matrix-multiply units (MXUs) dominate the runtime, yielding impressive scaling, performance, and raw size: operating in float32 precision, a full 2048-core pod of third-generation TPUs can multiply two matrices with linear size N = 2^20 = 1 048 576 in about 2 minutes. Via curated algorithms emphasizing large, single-core matrix multiplications, other tasks in dense linear algebra can similarly scale. As examples, we present (i) QR decomposition; (ii) resolution of linear systems; and (iii) the computation of matrix functions by polynomial iteration, demonstrated by the matrix polar factorization.
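
To illustrate what "matrix functions by polynomial iteration" can look like, here is a minimal single-machine sketch of a polar factorization A = U H computed with the Newton-Schulz iteration, which uses only matrix multiplications. This is an assumption chosen for illustration: the abstract does not say which iteration the authors use, and the function name `polar_newton_schulz`, its parameters, and the use of NumPy on one host (rather than a distributed TPU implementation) are all hypothetical. The sketch assumes a real, square, full-rank input.

```python
import numpy as np


def polar_newton_schulz(a, max_iters=100, tol=1e-10):
    """Approximate the polar factors (u, h) of a real, square, full-rank matrix.

    Illustrative sketch only: Newton-Schulz is a standard polynomial
    iteration, not necessarily the one used in the paper.
    """
    a = np.asarray(a, dtype=np.float64)
    n = a.shape[1]
    eye = np.eye(n)

    # Dividing by the Frobenius norm puts every singular value in (0, 1],
    # inside the convergence region of the iteration.
    x = a / np.linalg.norm(a)

    for _ in range(max_iters):
        xtx = x.T @ x
        # Stop once the columns of x are orthonormal to within tol.
        if np.linalg.norm(xtx - eye) < tol:
            break
        # Newton-Schulz update: X <- (3/2) X - (1/2) X X^T X.
        x = 0.5 * x @ (3.0 * eye - xtx)

    # With u orthogonal, h = u^T a is the symmetric positive-definite factor.
    h = x.T @ a
    h = 0.5 * (h + h.T)  # remove rounding-level asymmetry
    return x, h


if __name__ == "__main__":
    a = np.array([[4.0, 1.0], [2.0, 3.0]])
    u, h = polar_newton_schulz(a)
    print("orthogonality error:", np.linalg.norm(u.T @ u - np.eye(2)))
    print("reconstruction error:", np.linalg.norm(u @ h - a))
```

Each pass through the loop costs two matrix products and nothing else, which is why iterations of this kind fit the regime the abstract describes, where large matrix multiplications dominate the runtime.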

research
08/19/2019

A Computational Model for Tensor Core Units

To respond to the need of efficient training and inference of deep neura...
research
10/29/2019

DBCSR: A Blocked Sparse Tensor Algebra Library

Advanced algorithms for large-scale electronic structure calculations ar...
research
08/25/2021

A TensorFlow Simulation Framework for Scientific Computing of Fluid Flows on Tensor Processing Units

A computational fluid dynamics (CFD) simulation framework for predicting...
research
06/22/2020

Similarity Search with Tensor Core Units

Tensor Core Units (TCUs) are hardware accelerators developed for deep ne...
research
09/06/2023

CoLA: Exploiting Compositional Structure for Automatic and Efficient Numerical Linear Algebra

Many areas of machine learning and science involve large linear algebra ...
research
11/19/2016

A Case for Malleable Thread-Level Linear Algebra Libraries: The LU Factorization with Partial Pivoting

We propose two novel techniques for overcoming load-imbalance encountere...
research
07/29/2021

ATLAS: Interactive and Educational Linear Algebra System Containing Non-Standard Methods

While there are numerous linear algebra teaching tools, they tend to be ...
