In-Place Activated BatchNorm for Memory-Optimized Training of DNNs

12/07/2017
by Samuel Rota Bulò, et al.

In this work we present In-Place Activated Batch Normalization (InPlace-ABN) - a novel approach to drastically reduce the training memory footprint of modern deep neural networks in a computationally efficient way. Our solution substitutes the conventionally used succession of BatchNorm + Activation layers with a single plugin layer, hence avoiding invasive framework surgery while providing straightforward applicability for existing deep learning frameworks. We obtain memory savings of up to 50% by dropping intermediate results and by recovering required information during the backward pass through the inversion of stored forward results, with only a minor increase (0.8-2%) in computation time. We also demonstrate how frequently used checkpointing approaches can be made computationally as efficient as InPlace-ABN. In our experiments on image classification, we demonstrate on-par results on ImageNet-1k with state-of-the-art approaches. On the memory-demanding task of semantic segmentation, we report results for COCO-Stuff, Cityscapes and Mapillary Vistas, obtaining new state-of-the-art results on the latter without additional training data, but in a single-scale and single-model scenario. Code can be found at https://github.com/mapillary/inplace_abn .
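
As a rough illustration of the memory-saving mechanism, the sketch below shows a stand-alone PyTorch autograd function that caches only the layer's output and reconstructs what the backward pass needs by inverting a leaky ReLU. The class name InvertibleLeakyReLU and the isolated treatment of the activation are ours for illustration only; the actual InPlace-ABN layer fuses this trick with batch normalization and is provided by the repository linked above.

import torch

class InvertibleLeakyReLU(torch.autograd.Function):
    """Minimal sketch (not the authors' implementation): store only the
    forward output and invert it during backward, which is the core idea
    InPlace-ABN applies to the fused BatchNorm + activation block."""

    @staticmethod
    def forward(ctx, x, slope=0.01):
        y = torch.where(x >= 0, x, slope * x)
        ctx.save_for_backward(y)  # the input x no longer needs to be kept
        ctx.slope = slope
        return y

    @staticmethod
    def backward(ctx, grad_y):
        (y,) = ctx.saved_tensors
        slope = ctx.slope
        # The sign of y identifies the branch taken in forward, and the
        # pre-activation value could be recovered as y / slope where y < 0,
        # so nothing besides the stored output is required here.
        grad_x = torch.where(y >= 0, grad_y, slope * grad_y)
        return grad_x, None

# Hypothetical drop-in usage, assuming the package from the linked
# repository is installed and exposes an InPlaceABN module as in its README:
#   from inplace_abn import InPlaceABN
#   block = torch.nn.Sequential(
#       torch.nn.Conv2d(64, 64, 3, padding=1, bias=False),
#       InPlaceABN(64),  # replaces BatchNorm2d + activation in one layer
#   )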

Related research

10/29/2021 - BitTrain: Sparse Bitmap Compression for Memory-Efficient Training on the Edge
Training on the Edge enables neural networks to learn continuously from ...

07/21/2017 - Memory-Efficient Implementation of DenseNets
The DenseNet architecture is highly computationally efficient as a resul...

11/18/2020 - A Novel Memory-Efficient Deep Learning Training Framework via Error-Bounded Lossy Compression
Deep neural networks (DNNs) are becoming increasingly deeper, wider, and...

07/27/2019 - Quadtree Generating Networks: Efficient Hierarchical Scene Parsing with Sparse Convolutions
Semantic segmentation with Convolutional Neural Networks is a memory-int...

01/19/2020 - Towards Stabilizing Batch Statistics in Backward Propagation of Batch Normalization
Batch Normalization (BN) is one of the most widely used techniques in De...

12/05/2022 - MobileTL: On-device Transfer Learning with Inverted Residual Blocks
Transfer learning on edge is challenging due to on-device limited resour...

10/19/2022 - Tempo: Accelerating Transformer-Based Model Training through Memory Footprint Reduction
Training deep learning models can be computationally expensive. Prior wo...
