ItNet: iterative neural networks with small graphs for accurate and efficient anytime prediction

01/21/2021
by Thomas Pfeil, et al.

Deep neural networks usually have to be compressed and accelerated for use in low-power devices, e.g., mobile devices. Recently, massively parallel hardware accelerators have been developed that offer high throughput and low latency at low power by utilizing in-memory computation. To exploit these benefits, however, the computational graph of a neural network has to fit into the in-computation memory of such hardware, which is usually rather limited in size. In this study, we introduce a class of network models whose computational graphs have a small memory footprint. To this end, the graph is designed to contain loops, iteratively executing a single network building block. Furthermore, the trade-off between accuracy and latency of these so-called iterative neural networks is improved by adding multiple intermediate outputs, both during training and inference. We show state-of-the-art results for semantic segmentation on the CamVid and Cityscapes datasets, tasks that are especially demanding in terms of computational resources. In ablation studies, we investigate how intermediate network outputs improve training, as well as the trade-off between weight sharing across iterations and network size.
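For illustration, below is a minimal PyTorch sketch of the idea the abstract describes: a single building block whose weights are shared across iterations, with a lightweight head emitting an intermediate output after every iteration, so that training can supervise all outputs and inference can stop early (anytime prediction). The class name ItNetSketch, the block's layers, and all hyper-parameters are illustrative assumptions, not the architecture from the paper.

import torch
import torch.nn as nn
import torch.nn.functional as F

class ItNetSketch(nn.Module):
    # One shared building block is executed n_iter times; reusing its
    # weights keeps the computational graph small (hypothetical stand-in
    # for the paper's actual block).
    def __init__(self, channels=64, n_classes=19, n_iter=4):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
        )
        # Shared 1x1 head mapping features to per-pixel class scores.
        self.head = nn.Conv2d(channels, n_classes, kernel_size=1)
        self.n_iter = n_iter

    def forward(self, x):
        outputs = []
        for _ in range(self.n_iter):
            x = self.block(x)              # same weights every iteration
            outputs.append(self.head(x))   # intermediate (anytime) output
        return outputs

# Training sketch: supervise every intermediate output; at inference time,
# stopping after fewer iterations trades accuracy for lower latency.
model = ItNetSketch()
feats = torch.randn(2, 64, 32, 32)           # stand-in encoder features
labels = torch.randint(0, 19, (2, 32, 32))   # stand-in segmentation labels
loss = sum(F.cross_entropy(out, labels) for out in model(feats))
loss.backward()

The equal weighting of the per-iteration losses is an arbitrary choice here; at inference, one would simply return the output of the last iteration that fits the latency budget.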

Related research:

Memory-Efficient Pipeline-Parallel DNN Training (06/16/2020)
Many state-of-the-art results in domains such as NLP and computer vision...

Understanding and Optimizing Packed Neural Network Training for Hyper-Parameter Tuning (02/07/2020)
As neural networks are increasingly employed in machine learning practic...

STANNIS: Low-Power Acceleration of Deep Neural Network Training Using Computational Storage (02/17/2020)
This paper proposes a framework for distributed, in-storage training of ...

MobiVSR: A Visual Speech Recognition Solution for Mobile Devices (05/10/2019)
Visual speech recognition (VSR) is the task of recognizing spoken langua...

TFLMS: Large Model Support in TensorFlow by Graph Rewriting (07/05/2018)
While accelerators such as GPUs have limited memory, deep neural network...

Deep counter networks for asynchronous event-based processing (11/02/2016)
Despite their advantages in terms of computational resources, latency, a...

Dopant Network Processing Units: Towards Efficient Neural-network Emulators with High-capacity Nanoelectronic Nodes (07/24/2020)
The rapidly growing computational demands of deep neural networks requir...
