
How Low Can We Go: Trading Memory for Error in Low-Precision Training

by Chengrun Yang, et al.

Low-precision arithmetic trains deep learning models using less energy, less memory and less time. However, we pay a price for the savings: lower precision may yield larger round-off error and hence larger prediction error. As applications proliferate, users must choose which precision to use to train a new model, and chip manufacturers must decide which precisions to manufacture. We view these precision choices as a hyperparameter tuning problem, and borrow ideas from meta-learning to learn the tradeoff between memory and error. In this paper, we introduce Pareto Estimation to Pick the Perfect Precision (PEPPP). We use matrix factorization to find non-dominated configurations (the Pareto frontier) with a limited number of network evaluations. For any given memory budget, the precision that minimizes error is a point on this frontier. Practitioners can use the frontier to trade memory for error and choose the best precision for their goals.
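The idea of a Pareto frontier over (memory, error) pairs can be illustrated with a minimal sketch. This is not the PEPPP method itself: it skips the matrix-factorization step that estimates errors from a limited number of network evaluations, and simply assumes every configuration's memory and error are already known. The function names, configuration labels, and numbers below are all hypothetical and chosen only for illustration.

```python
from typing import List, Optional, Tuple

# (label, memory, error); labels and values are made up for illustration.
Config = Tuple[str, float, float]


def pareto_frontier(configs: List[Config]) -> List[Config]:
    """Return the non-dominated configurations, sorted by memory.

    A configuration is dominated if another one uses no more memory and
    has no more error, and is strictly better in at least one of the two.
    """
    # Sweep in order of increasing memory (ties broken by lower error).
    ordered = sorted(configs, key=lambda c: (c[1], c[2]))
    frontier: List[Config] = []
    best_error = float("inf")
    for label, memory, error in ordered:
        # Keep a point only if it strictly improves on the lowest error
        # achieved by anything that needs no more memory.
        if error < best_error:
            frontier.append((label, memory, error))
            best_error = error
    return frontier


def best_under_budget(frontier: List[Config], budget: float) -> Optional[Config]:
    """Pick the lowest-error frontier point whose memory fits the budget."""
    feasible = [c for c in frontier if c[1] <= budget]
    return min(feasible, key=lambda c: c[2]) if feasible else None


# Hypothetical precision configurations: (label, memory in GB, test error).
configs = [
    ("cfg_a", 1.0, 0.50),
    ("cfg_b", 2.0, 0.30),
    ("cfg_c", 2.5, 0.40),  # dominated by cfg_b: more memory, more error
    ("cfg_d", 4.0, 0.10),
    ("cfg_e", 1.0, 0.60),  # dominated by cfg_a: same memory, more error
]

frontier = pareto_frontier(configs)
choice = best_under_budget(frontier, budget=3.0)
```

With a budget of 3.0 in this toy data, the selected point is the frontier configuration with the lowest error among those that fit, which matches the abstract's statement that the best precision for a given memory budget lies on the frontier.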




Mixed precision matrix interpolative decompositions for model reduction

Renewed interest in mixed-precision algorithms has emerged due to growin...

WRPN & Apprentice: Methods for Training and Inference using Low-Precision Numerics

Today's high performance deep learning architectures involve large model...

Single-pass Nyström approximation in mixed precision

Low rank matrix approximations appear in a number of scientific computin...

Training with Mixed-Precision Floating-Point Assignments

When training deep neural networks, keeping all tensors in high precisio...

Adaptive Precision Training for Resource Constrained Devices

Learn in-situ is a growing trend for Edge AI. Training deep neural netwo...

Characteristic polynomials of p-adic matrices

We analyze the precision of the characteristic polynomial of an n× n p-a...

QPyTorch: A Low-Precision Arithmetic Simulation Framework

Low-precision training reduces computational cost and produces efficient...