Scalable Bayesian Learning of Recurrent Neural Networks for Language Modeling

11/23/2016
by   Zhe Gan, et al.
0

Recurrent neural networks (RNNs) have shown promising performance for language modeling. However, traditional training of RNNs using back-propagation through time often suffers from overfitting. One reason for this is that stochastic optimization (used for large training sets) does not provide good estimates of model uncertainty. This paper leverages recent advances in stochastic gradient Markov Chain Monte Carlo (also appropriate for large training sets) to learn weight uncertainty in RNNs. It yields a principled Bayesian learning algorithm, adding gradient noise during training (enhancing exploration of the model-parameter space) and model averaging when testing. Extensive experiments on various RNN models and across a broad range of applications demonstrate the superiority of the proposed approach over stochastic optimization.

READ FULL TEXT
research
04/10/2017

Bayesian Recurrent Neural Networks

In this work we explore a straightforward variational Bayes scheme for R...
research
02/15/2017

Training Language Models Using Target-Propagation

While Truncated Back-Propagation through Time (BPTT) is the most popular...
research
08/03/2019

Ensemble Neural Networks (ENN): A gradient-free stochastic method

In this study, an efficient stochastic gradient-free method, the ensembl...
research
12/25/2015

Bridging the Gap between Stochastic Gradient MCMC and Stochastic Optimization

Stochastic gradient Markov chain Monte Carlo (SG-MCMC) methods are Bayes...
research
02/27/2019

Alternating Synthetic and Real Gradients for Neural Language Modeling

Training recurrent neural networks (RNNs) with backpropagation through t...
research
11/26/2015

Regularizing RNNs by Stabilizing Activations

We stabilize the activations of Recurrent Neural Networks (RNNs) by pena...
research
07/05/2019

A Unified Framework of Online Learning Algorithms for Training Recurrent Neural Networks

We present a framework for compactly summarizing many recent results in ...

Please sign up or login with your details

Forgot password? Click here to reset