Nimble: Efficiently Compiling Dynamic Neural Networks for Model Inference

06/04/2020
by   Haichen Shen, et al.
0

Modern deep neural networks increasingly make use of features such as dynamic control flow, data structures and dynamic tensor shapes. Existing deep learning systems focus on optimizing and executing static neural networks which assume a pre-determined model architecture and input data shapes–assumptions which are violated by dynamic neural networks. Therefore, executing dynamic models with deep learning systems is currently both inflexible and sub-optimal, if not impossible. Optimizing dynamic neural networks is more challenging than static neural networks; optimizations must consider all possible execution paths and tensor shapes. This paper proposes Nimble, a high-performance and flexible system to optimize, compile, and execute dynamic neural networks on multiple platforms. Nimble handles model dynamism by introducing a dynamic type system, a set of dynamism-oriented optimizations, and a light-weight virtual machine runtime. Our evaluation demonstrates that Nimble outperforms state-of-the-art deep learning frameworks and runtime systems for dynamic neural networks by up to 20x on hardware platforms including Intel CPUs, ARM CPUs, and Nvidia GPUs.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
04/29/2021

Tuna: A Static Analysis Approach to Optimizing Deep Neural Networks

We introduce Tuna, a static analysis approach to optimizing deep neural ...
research
06/11/2020

Ansor : Generating High-Performance Tensor Programs for Deep Learning

High-performance tensor programs are crucial to guarantee efficient exec...
research
05/17/2023

ACRoBat: Optimizing Auto-batching of Dynamic Deep Learning at Compile Time

Dynamic control flow is an important technique often used to design expr...
research
02/16/2023

Decoupled Model Schedule for Deep Learning Training

Recent years have seen an increase in the development of large deep lear...
research
03/23/2023

Graph Tensor Networks: An Intuitive Framework for Designing Large-Scale Neural Learning Systems on Multiple Domains

Despite the omnipresence of tensors and tensor operations in modern deep...
research
01/20/2021

SparseDNN: Fast Sparse Deep Learning Inference on CPUs

The last few years have seen gigantic leaps in algorithms and systems to...
research
11/10/2019

Using Deep Neural Networks for Estimating Loop Unrolling Factor

Optimizing programs requires deep expertise. On one hand, it is a tediou...

Please sign up or login with your details

Forgot password? Click here to reset