TFLMS: Large Model Support in TensorFlow by Graph Rewriting

07/05/2018
by Tung D. Le, et al.

While accelerators such as GPUs have limited memory, deep neural networks are becoming larger and no longer fit within the memory of accelerators for training. We propose an approach that tackles this problem by rewriting the computational graph of a neural network: swap-out and swap-in operations are inserted to temporarily store intermediate results in CPU memory. In particular, we first revise the concept of a computational graph by defining a concrete semantics for variables in a graph. We then formally show how to derive swap-out and swap-in operations from an existing graph and present rules to optimize the graph. To realize our approach, we developed a TensorFlow module named TFLMS. TFLMS has been published as a pull request to the TensorFlow repository as a contribution to the TensorFlow community. With TFLMS, we were able to train ResNet-50 and 3DUNet with 4.7x and 2x larger batch sizes, respectively. In particular, we were able to train 3DUNet on images of size 192^3 for image segmentation, which, without TFLMS, had been possible only by dividing the images into smaller sub-images, at a cost in accuracy.
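The core idea, routing a tensor through host memory between its producer and a distant consumer, can be sketched in a few lines of TensorFlow 1.x graph code. The snippet below is a minimal illustration of the swap-out/swap-in pattern only, not the actual TFLMS rewriting pass (which edits an already-built graph); the helper name swap_out_in and the device strings are illustrative assumptions.

```python
# Minimal sketch of the swap-out/swap-in pattern (illustrative, not the
# actual TFLMS graph-rewriting pass). Assumes TensorFlow 1.x graph mode;
# the helper name `swap_out_in` is hypothetical.
import tensorflow as tf

def swap_out_in(tensor, cpu_device='/cpu:0', gpu_device='/gpu:0'):
    """Route `tensor` through host memory: copy it to the CPU (swap-out)
    and copy it back to the GPU (swap-in) for a distant consumer."""
    with tf.device(cpu_device):
        # An identity op pinned to the CPU forces a device-to-host copy.
        swap_out = tf.identity(tensor, name=tensor.op.name + '_swap_out')
    with tf.device(gpu_device):
        # An identity op pinned to the GPU copies the value back to device memory.
        swap_in = tf.identity(swap_out, name=tensor.op.name + '_swap_in')
    return swap_in

# Usage sketch: an activation produced early in the forward pass but
# consumed much later (e.g., by its gradient op) travels via host memory,
# so its GPU buffer can be reclaimed in between.
# h = tf.nn.relu(conv)             # produced on the GPU
# h_for_backprop = swap_out_in(h)  # consumed later through CPU memory
```

In the paper's approach, such operations are not written by hand: they are derived automatically by rewriting the existing graph, together with rules that optimize the resulting graph.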


Related research

08/19/2020  A Computational-Graph Partitioning Method for Training Memory-Constrained DNNs
We propose ParDNN, an automatic, generic, and non-intrusive partitioning...

03/07/2021  Implementing graph neural networks with TensorFlow-Keras
Graph neural networks are a versatile machine learning architecture that...

03/27/2019  High Performance Monte Carlo Simulation of Ising Model on TPU Clusters
Large scale deep neural networks profited from an emerging class of AI a...

01/21/2021  ItNet: iterative neural networks with small graphs for accurate and efficient anytime prediction
Deep neural networks have usually to be compressed and accelerated for t...

11/05/2018  Mesh-TensorFlow: Deep Learning for Supercomputers
Batch-splitting (data-parallelism) is the dominant distributed Deep Neur...

12/04/2018  Auto-tuning TensorFlow Threading Model for CPU Backend
TensorFlow is a popular deep learning framework used by data scientists ...

08/25/2019  Extending TensorFlow's Semantics with Pipelined Execution
TensorFlow is a popular cloud computing framework that targets machine l...
