A comprehensive study of batch construction strategies for recurrent neural networks in MXNet

05/05/2017
by Patrick Doetsch et al.

In this work we compare different batch construction methods for mini-batch training of recurrent neural networks. While popular frameworks such as TensorFlow and MXNet suggest a bucketing approach to improve the parallelization of the recurrent training process, we propose a simple ordering strategy that arranges the training sequences in a stochastically alternating sorted order. We compare our method to sequence bucketing as well as several other batch construction strategies on the CHiME-4 noisy speech recognition corpus. The experiments show that our alternated sorting approach competes in both training time and recognition performance while being conceptually simpler to implement.
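Since the abstract only outlines the idea, the following Python sketch illustrates one plausible reading of a stochastic, alternatingly sorted batch construction: shuffle the data, sort within fixed-size chunks by sequence length, and alternate the sort direction from chunk to chunk so neighbouring batches contain sequences of similar length. The function name, the chunk_size parameter, and the per-chunk sort direction are assumptions made for illustration, not the authors' exact procedure.

```python
import random

def alternated_sorted_batches(sequences, batch_size, chunk_size=1024, seed=0):
    """Illustrative batch construction (hedged sketch, not the paper's exact algorithm):
    shuffle the corpus, then sort each chunk by sequence length, alternating the
    sort direction per chunk, and cut the result into mini-batches of indices."""
    rng = random.Random(seed)
    order = list(range(len(sequences)))
    rng.shuffle(order)  # stochastic component: a new ordering every epoch

    batches = []
    for i, start in enumerate(range(0, len(order), chunk_size)):
        chunk = order[start:start + chunk_size]
        # Alternate ascending/descending length sorting between consecutive chunks.
        chunk.sort(key=lambda idx: len(sequences[idx]), reverse=(i % 2 == 1))
        for b in range(0, len(chunk), batch_size):
            batches.append(chunk[b:b + batch_size])

    return batches  # list of index lists; pad each batch to its longest sequence before training
```

Grouping sequences of similar length into the same mini-batch reduces the amount of padding per batch, which is the same motivation behind bucketing, while the shuffled chunks keep the overall ordering stochastic across epochs.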


