XEngine: Optimal Tensor Rematerialization for Neural Networks in Heterogeneous Environments

12/19/2022
by Manuela Schuler, et al.

Memory efficiency is crucial when training deep learning networks on resource-restricted devices. During backpropagation, forward tensors are needed to calculate gradients. Instead of keeping all of these dependencies in memory until they are reused in the backward pass, some forward tensors can be discarded and later recomputed from saved tensors, so-called checkpoints. This allows, in particular, resource-constrained heterogeneous environments to make use of all available compute devices. Unfortunately, choosing these checkpoints is a non-trivial problem and poses a challenge to the programmer: improper or excessive recomputation negates the benefit of checkpointing. In this article, we present XEngine, an approach that schedules network operators onto heterogeneous devices in low-memory environments by determining checkpoints and recomputations of tensors. Our approach selects a suitable resource per timestep and operator and optimizes the end-to-end time of the neural network while taking the memory limitation of each device into account. For this, we formulate a mixed-integer quadratic program (MIQP) that schedules the operators of deep learning networks on heterogeneous systems. We compare our MIQP solver XEngine against Checkmate, a mixed-integer linear programming (MILP) approach that solves recomputation on a single device. Our solver finds solutions that are up to 22.5% faster than schedules in which the network is computed exclusively on a single device. We also find valid schedules for networks using both central processing units and graphics processing units when memory limitations do not allow scheduling exclusively to the graphics processing unit.
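The checkpoint-and-recompute idea described above can be illustrated with a toy sketch (this is not XEngine's MIQP formulation, just the underlying rematerialization trade-off): for a chain of operators, only every k-th forward activation is kept, and the discarded ones are recomputed from the nearest checkpoint during the backward pass. All function names here are hypothetical.

```python
# Illustrative sketch of tensor rematerialization for a chain of scalar ops
# y = f_n(...f_1(x)). Only every `every`-th activation is stored; the rest
# are recomputed from the nearest checkpoint when the backward pass needs them.

def forward_with_checkpoints(x, fns, every=2):
    """Run the chain, saving only the input and every `every`-th activation."""
    checkpoints = {0: x}          # step index -> saved activation
    a = x
    for i, f in enumerate(fns, start=1):
        a = f(a)
        if i % every == 0:
            checkpoints[i] = a
    return a, checkpoints

def activation_at(step, fns, checkpoints):
    """Recompute the activation after `step` ops from the nearest checkpoint."""
    base = max(i for i in checkpoints if i <= step)
    a = checkpoints[base]
    for i in range(base, step):
        a = fns[i](a)             # recomputation: trades compute for memory
    return a

def backward(fns, grads, checkpoints):
    """Chain rule over the ops, recomputing each op's input as needed."""
    g = 1.0
    for i in reversed(range(len(fns))):
        a_in = activation_at(i, fns, checkpoints)
        g *= grads[i](a_in)
    return g

# Toy chain: y = ((x^2) + 1)^2, with per-op local derivatives.
fns   = [lambda a: a * a, lambda a: a + 1.0, lambda a: a * a]
grads = [lambda a: 2 * a, lambda a: 1.0,     lambda a: 2 * a]

y, ckpts = forward_with_checkpoints(3.0, fns, every=2)
dydx = backward(fns, grads, ckpts)   # analytically: 2*((x^2)+1)*2x = 120 at x=3
```

XEngine generalizes this trade-off: instead of a fixed "keep every k-th tensor" rule, the MIQP solver decides per operator and timestep which tensors to keep, which to recompute, and on which device to run each operator, subject to per-device memory limits.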


