EcoRNN: Fused LSTM RNN Implementation with Data Layout Optimization

05/22/2018
by Bojian Zheng, et al.

The Long Short-Term Memory Recurrent Neural Network (LSTM RNN) is a state-of-the-art (SOTA) model for analyzing sequential data. Current implementations of LSTM RNN in machine learning frameworks usually lack either performance or flexibility. For example, the default implementations in TensorFlow and MXNet invoke many tiny GPU kernels, which leads to excessive overhead in launching GPU threads. Although cuDNN, NVIDIA's deep learning library, can improve performance by around 2x, it is closed-source and inflexible, which hampers further research and performance improvements in frameworks, such as PyTorch, that use cuDNN as their backend. In this paper, we introduce a new RNN implementation called EcoRNN that is significantly faster than the SOTA open-source implementation in MXNet and is competitive with the closed-source cuDNN. We show that (1) fusing tiny GPU kernels and (2) applying data layout optimization can yield a speedup of up to 3x over the MXNet default implementation and up to 1.5x over the cuDNN implementation. Our optimizations also apply to other RNN cell types, such as LSTM variants and Gated Recurrent Units (GRUs). We integrate EcoRNN into the MXNet Python library and open-source it to benefit machine learning practitioners.
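
To make the kernel-fusion idea in the abstract concrete, here is a minimal CPU-only sketch in plain Python. It is not EcoRNN's code and does not reflect its GPU kernels or its data layout choices. All function names (`lstm_step_unfused`, `lstm_step_fused`, `random_problem`) are hypothetical, and the gate order (input, forget, cell, output) is an assumption made for illustration. The sketch only shows why the four per-gate matrix products of a standard LSTM cell can be replaced by one product over stacked weights, which on a GPU would mean fewer kernel launches.

```python
import math
import random


def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def apply_gates(i, f, g, o, c_prev):
    """Standard LSTM element-wise update from gate pre-activations."""
    c = [sigmoid(fk) * ck + sigmoid(ik) * math.tanh(gk)
         for ik, fk, gk, ck in zip(i, f, g, c_prev)]
    h = [sigmoid(ok) * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c


def lstm_step_unfused(Ws, bs, xh, c_prev):
    """One product per gate; on a GPU each would be its own small kernel.

    Ws and bs hold four weight matrices and four bias vectors in the
    assumed order (input, forget, cell, output). xh is the input vector
    concatenated with the previous hidden state.
    """
    pre = [[p + b for p, b in zip(matvec(W, xh), bias)]
           for W, bias in zip(Ws, bs)]
    return apply_gates(*pre, c_prev)


def lstm_step_fused(Ws, bs, xh, c_prev):
    """Same step with the four gate matrices stacked into a single product."""
    n = len(c_prev)
    W_all = [row for W in Ws for row in W]    # shape: (4 * n, len(xh))
    b_all = [b for bias in bs for b in bias]  # length: 4 * n
    z = [p + b for p, b in zip(matvec(W_all, xh), b_all)]
    pre = [z[k * n:(k + 1) * n] for k in range(4)]  # split back into gates
    return apply_gates(*pre, c_prev)


def random_problem(hidden=3, inputs=2, seed=0):
    """Build small random weights and state for a quick comparison."""
    rng = random.Random(seed)
    width = inputs + hidden
    Ws = [[[rng.uniform(-1, 1) for _ in range(width)]
           for _ in range(hidden)] for _ in range(4)]
    bs = [[rng.uniform(-1, 1) for _ in range(hidden)] for _ in range(4)]
    xh = [rng.uniform(-1, 1) for _ in range(width)]
    c_prev = [rng.uniform(-1, 1) for _ in range(hidden)]
    return Ws, bs, xh, c_prev
```

Calling both step functions on the output of `random_problem()` should give matching hidden and cell states, since stacking the weights changes only how the work is grouped, not the arithmetic. Any speedup from fusion comes from launch overhead and memory access on the GPU, which this CPU sketch does not measure.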

