Accelerating recurrent neural network training using sequence bucketing and multi-GPU data parallelization

08/18/2017
by Viacheslav Khomenko, et al.

An efficient algorithm for recurrent neural network training is presented. The approach increases training speed for tasks where the length of the input sequence may vary significantly. The proposed approach is based on optimal batch bucketing by input sequence length and on data parallelization across multiple graphics processing units. The baseline training performance without sequence bucketing is compared with the proposed solution for different numbers of buckets. An example is given for the online handwriting recognition task using an LSTM recurrent neural network. The evaluation is performed in terms of wall-clock time, number of epochs, and validation loss value.
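To make the bucketing idea concrete: sequences are assigned by length to a fixed number of buckets, and each mini-batch is drawn from a single bucket, so padding is applied only up to that bucket's boundary rather than to the longest sequence in the whole dataset. The sketch below is an illustrative assumption, not the paper's implementation; the function name bucket_batches, the quantile choice of bucket boundaries, and zero-padding are all hypothetical.

import random
from collections import defaultdict

def bucket_batches(sequences, num_buckets, batch_size):
    """Illustrative sketch: group variable-length sequences into length
    buckets, then cut each bucket into padded mini-batches."""
    lengths = sorted(len(s) for s in sequences)
    # Bucket boundaries at quantiles of the length distribution, so
    # each bucket receives roughly the same number of sequences
    # (an assumption; other boundary choices are possible).
    boundaries = [lengths[(i + 1) * len(lengths) // num_buckets - 1]
                  for i in range(num_buckets)]

    buckets = defaultdict(list)
    for seq in sequences:
        # Place the sequence in the first bucket whose boundary fits it.
        idx = next(i for i, b in enumerate(boundaries) if len(seq) <= b)
        buckets[idx].append(seq)

    batches = []
    for idx, bucket in buckets.items():
        random.shuffle(bucket)
        for start in range(0, len(bucket), batch_size):
            batch = bucket[start:start + batch_size]
            # Zero-pad only up to this bucket's boundary, not the
            # global maximum length, which is where the savings come from.
            batches.append([s + [0] * (boundaries[idx] - len(s))
                            for s in batch])
    random.shuffle(batches)  # randomize batch order across buckets
    return batches

# Example: 1000 integer sequences of length 5..200, split into 4 buckets.
data = [[1] * random.randint(5, 200) for _ in range(1000)]
for b in bucket_batches(data, num_buckets=4, batch_size=32)[:3]:
    print(len(b), "sequences padded to length", len(b[0]))

Multi-GPU data parallelism then complements bucketing: each padded mini-batch is split across devices, every GPU runs the forward and backward pass on its slice, and the gradients are averaged before the shared weight update.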


