SEARNN: Training RNNs with Global-Local Losses

06/14/2017
by   Rémi Leblond, et al.

We propose SEARNN, a novel training algorithm for recurrent neural networks (RNNs) inspired by the "learning to search" (L2S) approach to structured prediction. RNNs have been widely successful in structured prediction applications such as machine translation or parsing, and are commonly trained using maximum likelihood estimation (MLE). Unfortunately, this training loss is not always an appropriate surrogate for the test error: by only maximizing the ground truth probability, it fails to exploit the wealth of information offered by structured losses. Further, it introduces discrepancies between training and prediction (such as exposure bias) that may hurt test performance. Instead, SEARNN leverages test-like search space exploration to introduce global-local losses that are closer to the test error. We demonstrate improved performance over MLE on three different tasks: OCR, spelling correction and text chunking. Finally, we propose a subsampling strategy to enable SEARNN to scale to large vocabulary sizes.
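
To make the contrast between MLE and a cost-based loss concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation. The function names, the Hamming cost, the roll-out that copies the ground truth for the remaining positions, and the choice of a log-loss on the lowest-cost token are all assumptions made for this example.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of raw scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def mle_loss(scores, gold_token):
    # Standard MLE step loss: negative log-probability of the ground-truth token.
    return -math.log(softmax(scores)[gold_token])

def hamming(a, b):
    # Number of positions where two equal-length sequences differ.
    return sum(x != y for x, y in zip(a, b))

def rollout_costs(prefix, gold, vocab_size):
    # Hypothetical roll-out: for every candidate token at step t = len(prefix),
    # complete the sequence with the ground truth (an assumed reference policy)
    # and score the whole sequence with a structured cost (here, Hamming).
    t = len(prefix)
    return [hamming(prefix + [a] + gold[t + 1:], gold) for a in range(vocab_size)]

def cost_sensitive_log_loss(scores, costs):
    # Assumed cost-based step loss: log-loss targeting the lowest-cost token.
    target = min(range(len(costs)), key=costs.__getitem__)
    return -math.log(softmax(scores)[target])

gold = [2, 0, 1]
prefix = [1]  # the model's own (wrong) prediction at step 0, used as roll-in
costs = rollout_costs(prefix, gold, vocab_size=3)
step_loss = cost_sensitive_log_loss([0.0, 0.0, 0.0], costs)
```

In this sketch the prefix comes from the model's own predictions rather than from the ground truth, so the step is evaluated in a state the model could actually reach at test time. The cost vector assigns a sequence-level score to every candidate token, whereas the MLE loss only looks at the ground-truth token.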
