Efficient Memory Management for Deep Neural Net Inference

01/10/2020
by Yury Pisarchyk, et al.

While deep neural net inference was long considered a task for servers only, recent advances in technology allow inference to be moved to mobile and embedded devices, a shift desired for reasons ranging from latency to privacy. These devices are limited not only in compute power and battery, but also in physical memory and cache, so an efficient memory manager becomes a crucial component for deep neural net inference at the edge. In this paper, we explore various strategies to smartly share memory buffers among intermediate tensors in deep neural networks. Employing these strategies can result in a memory footprint up to 10.5x smaller than running inference without one and up to 11…
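The sketch below illustrates the general idea of sharing buffers among intermediate tensors: each tensor is greedily mapped onto a reusable buffer whose current tenants' lifetimes do not overlap with it, processing tensors largest-first. This is only a minimal, assumed example; the Tensor fields, the assign_shared_buffers helper, and the four-tensor toy graph are illustrative and are not the strategies evaluated in the paper.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

@dataclass
class Tensor:
    name: str
    size: int        # bytes
    first_use: int   # index of the op that produces it
    last_use: int    # index of the last op that reads it

def assign_shared_buffers(tensors: List[Tensor]) -> Tuple[List[dict], Dict[str, int]]:
    """Greedily map each intermediate tensor onto a reusable buffer.

    Two tensors may share a buffer only if their [first_use, last_use]
    lifetimes do not overlap. Visiting tensors largest-first tends to
    keep the total allocated size small.
    """
    buffers: List[dict] = []   # each buffer: {"size": int, "tenants": [Tensor, ...]}
    assignment: Dict[str, int] = {}
    for t in sorted(tensors, key=lambda x: x.size, reverse=True):
        placed = False
        for i, buf in enumerate(buffers):
            # Reuse this buffer only if t's lifetime is disjoint from all tenants.
            if all(t.last_use < o.first_use or o.last_use < t.first_use
                   for o in buf["tenants"]):
                buf["size"] = max(buf["size"], t.size)
                buf["tenants"].append(t)
                assignment[t.name] = i
                placed = True
                break
        if not placed:
            buffers.append({"size": t.size, "tenants": [t]})
            assignment[t.name] = len(buffers) - 1
    return buffers, assignment

# Toy linear graph (conv1 -> relu1 -> conv2 -> relu2); sizes are made up.
tensors = [
    Tensor("conv1_out", 4 << 20, first_use=0, last_use=1),
    Tensor("relu1_out", 4 << 20, first_use=1, last_use=2),
    Tensor("conv2_out", 2 << 20, first_use=2, last_use=3),
    Tensor("relu2_out", 2 << 20, first_use=3, last_use=4),
]
buffers, assignment = assign_shared_buffers(tensors)
total = sum(b["size"] for b in buffers)
naive = sum(t.size for t in tensors)
print(f"shared buffers: {len(buffers)}, total {total} B vs naive {naive} B")
```

On this toy graph the greedy pass packs the four intermediate tensors into two shared buffers, halving the memory needed compared with giving every tensor its own allocation.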

