1 Introduction
Sequence to sequence models are usually trained with a simple token-level likelihood loss (Sutskever et al., 2014; Bahdanau et al., 2014). However, at test time, these models do not produce a single token but a whole sequence. In order to resolve this inconsistency and to potentially improve generation, recent work has focused on training these models at the sequence level, for instance using REINFORCE (Ranzato et al., 2015), actor-critic (Bahdanau et al., 2016), or beam search optimization (Wiseman and Rush, 2016).
Before the recent work on sequence-level training for neural networks, there has been a large body of research on training linear models at the sequence level. For example, direct loss optimization has been popularized in machine translation with the Minimum Error Rate Training algorithm (MERT; Och, 2003) and expected risk minimization has an extensive history in NLP (Smith and Eisner, 2006; Rosti et al., 2010; Green et al., 2014). This paper revisits several objective functions that have been commonly used for structured prediction tasks in NLP (Gimpel and Smith, 2010) and applies them to a neural sequence to sequence model (Gehring et al., 2017b) (§2). Specifically, we consider likelihood training at the sequence level, a margin loss, as well as expected risk training. We also investigate several combinations of global losses with token-level likelihood. This is, to our knowledge, the most comprehensive comparison of structured losses in the context of neural sequence to sequence models (§3). We experiment on the IWSLT'14 German-English translation task (Cettolo et al., 2014) as well as the Gigaword abstractive summarization task (Rush et al., 2015). We achieve the best reported accuracy to date on both tasks. We find that the sequence-level losses we survey perform similarly to one another and outperform beam search optimization (Wiseman and Rush, 2016) on a comparable setup. On WMT'14 English-French, we also illustrate the effectiveness of risk minimization on a larger translation task. Classical losses for structured prediction are still very competitive and effective for neural models (§5, §6).
2 Sequence to Sequence Learning
The general architecture of our sequence to sequence models follows the encoder-decoder approach with soft attention first introduced in Bahdanau et al. (2014). As a main difference, in most of our experiments we parameterize the encoder and the decoder as convolutional neural networks instead of recurrent networks (Gehring et al., 2017a,b). Our use of convolution is motivated by computational and accuracy considerations. However, the objective functions we present are model agnostic and equally applicable to recurrent and convolutional models. We demonstrate the applicability of our objective functions to recurrent models (LSTM) in our comparison to Wiseman and Rush (2016) in §6.6.

Notation. We denote the source sentence as x, an output sentence of our model as u, and the reference or target sentence as t. For some objectives, we choose a pseudo-reference u* instead, such as a model output with the highest BLEU or ROUGE score among a set of candidate outputs, U(x), generated by our model.
Concretely, the encoder processes a source sentence x = (x_1, …, x_m) containing m words and outputs a sequence of states z = (z_1, …, z_m). The decoder takes z and generates the output sequence u = (u_1, …, u_n) left to right, one element at a time. For each output u_i, the decoder computes a hidden state h_i based on the previous state h_{i−1}, an embedding g_i of the previous target language word u_{i−1}, as well as a conditional input c_i derived from the encoder output z. The attention context c_i is computed as a weighted sum of (z_1, …, z_m) at each time step. The weights of this sum are referred to as attention scores and allow the network to focus on the most relevant parts of the input at each generation step. Attention scores are computed by comparing each encoder state z_j to a combination of the previous decoder state and the last prediction; the result is normalized to be a distribution over input elements. At each generation step, the model scores the V possible next target words u_i by transforming the decoder output h_i via a linear layer with weights W and bias b: s_i = W h_i + b. This is turned into a distribution via a softmax: p(u_i | u_1, …, u_{i−1}, x) = softmax(s_i).
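The final scoring step can be made concrete with a short sketch. This is a minimal PyTorch illustration rather than the paper's fairseq-py code; the dimensions and the random decoder state are placeholder values.

```python
import torch
import torch.nn.functional as F

hidden_dim, vocab_size = 256, 14000                          # illustrative sizes
output_projection = torch.nn.Linear(hidden_dim, vocab_size)  # weights W and bias b

h_i = torch.randn(1, hidden_dim)   # decoder output at generation step i (placeholder)
s_i = output_projection(h_i)       # s_i = W h_i + b: one score per vocabulary word
p_i = F.softmax(s_i, dim=-1)       # distribution over the next target word u_i
```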
Our encoder and decoder use gated convolutional neural networks which enable fast and accurate generation (Gehring et al., 2017b). Fast generation is essential to efficiently train on the model output, as is done in this work, since sequence-level losses require generation at training time. Both encoder and decoder networks share a simple block structure that computes intermediate states based on a fixed number of input tokens, and we stack several blocks on top of each other. Each block contains a 1D convolution that takes as input k feature vectors and outputs another vector; subsequent layers operate over the k output elements of the previous layer. The output of the convolution is then fed into a gated linear unit (Dauphin et al., 2017). In the decoder network, we rely on causal convolutions, which use only states from previous time steps. The parameters θ of our model are all the weight matrices in the encoder and decoder networks. Further details can be found in Gehring et al. (2017b).

3 Objective Functions
We compare several objective functions for training the model architecture described in §2. The corresponding loss functions are either computed over individual tokens (§3.1), over entire sequences (§3.2), or over a combination of tokens and sequences (§3.3). An overview of these loss functions is given in Figure 1.

\mathcal{L}_{TokNLL} = -\sum_{i=1}^{|t|} \log p(t_i \mid t_1, \ldots, t_{i-1}, x)    (1)

\mathcal{L}_{TokLS} = -\sum_{i=1}^{|t|} \Big[ \log p(t_i \mid t_1, \ldots, t_{i-1}, x) - D_{KL}\big(f \,\big\|\, p(\cdot \mid t_1, \ldots, t_{i-1}, x)\big) \Big]    (2)

\mathcal{L}_{SeqNLL} = -\log p(u^* \mid x) + \log \sum_{u \in \mathcal{U}(x)} p(u \mid x)    (3)

\mathcal{L}_{Risk} = \sum_{u \in \mathcal{U}(x)} \mathrm{cost}(t, u) \, \frac{p(u \mid x)}{\sum_{u' \in \mathcal{U}(x)} p(u' \mid x)}    (4)

\mathcal{L}_{MaxMargin} = \max\Big[ 0,\ \mathrm{cost}(t, \hat{u}) - \mathrm{cost}(t, u^*) - s(u^* \mid x) + s(\hat{u} \mid x) \Big]    (5)

\mathcal{L}_{MultiMargin} = \sum_{u \in \mathcal{U}(x)} \max\Big[ 0,\ \mathrm{cost}(t, u) - \mathrm{cost}(t, u^*) - s(u^* \mid x) + s(u \mid x) \Big]    (6)

\mathcal{L}_{SoftmaxMargin} = -\log p(u^* \mid x) + \log \sum_{u \in \mathcal{U}(x)} \exp\big[ s(u \mid x) + \mathrm{cost}(t, u) \big]    (7)

Figure 1: Token-level (Equations 1-2) and sequence-level (Equations 3-7) objective functions. Here u* denotes the pseudo-reference, û the highest-scoring candidate, and s(u | x) the unnormalized model score of u.
3.1 Token-Level Objectives
Most prior work on sequence to sequence learning has focused on optimizing token-level loss functions, i.e., functions for which the loss is computed additively over individual tokens.
Token Negative Log Likelihood (TokNLL)
Token-level likelihood (TokNLL, Equation 1) minimizes the negative log likelihood of individual reference tokens t_i. It is the most common loss function optimized in related work and serves as a baseline for our comparison.
Token NLL with Label Smoothing (TokLS)
Likelihood training forces the model to make extreme zero or one predictions to distinguish between the ground truth and alternatives. This may result in a model that is too confident in its training predictions, which may hurt its generalization performance. Label smoothing addresses this by acting as a regularizer that makes the model less confident in its predictions. Specifically, we smooth the target distribution with a prior distribution f that is independent of the current input (Szegedy et al., 2015; Pereyra et al., 2017; Vaswani et al., 2017). We use a uniform prior distribution over all words in the vocabulary V. One may also use a unigram distribution, which has been shown to work better on some tasks (Pereyra et al., 2017). Label smoothing is equivalent to adding the KL divergence between f and the model prediction to the negative log likelihood (TokLS, Equation 2). In practice, we implement label smoothing by modifying the ground truth distribution to assign probability 1 − ε to the reference word and ε/|V| to every other word, instead of 1 and 0 respectively, where ε is a smoothing parameter.
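A minimal sketch of this smoothed objective, assuming a uniform prior and PyTorch tensors; the function name and the setting eps=0.1 are illustrative, not taken from the paper.

```python
import torch
import torch.nn.functional as F

def tok_ls_loss(logits, target, eps=0.1):
    """Token-level NLL with uniform label smoothing (sketch).

    logits: (n_tokens, vocab_size) unnormalized scores; target: (n_tokens,) ids.
    """
    log_probs = F.log_softmax(logits, dim=-1)
    nll = -log_probs.gather(1, target.unsqueeze(1)).squeeze(1)
    # With a uniform prior, the smoothed objective mixes the NLL of the
    # reference word with the average NLL over the whole vocabulary.
    smooth = -log_probs.mean(dim=-1)
    return ((1.0 - eps) * nll + eps * smooth).sum()
```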
3.2 Sequence-Level Objectives
We also consider a class of objective functions that are computed over entire sequences, i.e., sequence-level objectives. Training with these objectives requires generating and scoring multiple candidate output sequences for each input sequence during training, which is computationally expensive but allows us to directly optimize task-specific metrics such as BLEU or ROUGE.
Unfortunately, these objectives are also typically defined over the entire space of possible output sequences, which is intractable to enumerate or score with our models. Instead, we compute our sequence losses over a subset of the output space, U(x), generated by the model. We discuss approaches for generating this subset in §4.
Sequence Negative Log Likelihood (SeqNLL)
Similar to TokNLL, we can minimize the negative log likelihood of an entire sequence rather than individual tokens (SeqNLL, Equation 3). The log-likelihood of a sequence u is the sum of individual token log probabilities, normalized by the number of tokens to avoid a bias towards shorter sequences:

\log p(u \mid x) = \frac{1}{|u|} \sum_{i=1}^{|u|} \log p(u_i \mid u_1, \ldots, u_{i-1}, x)
As target we choose a pseudo-reference u* amongst the candidates which maximizes either BLEU or ROUGE with respect to t, the gold reference:

u^* = \arg\max_{u \in \mathcal{U}(x)} \mathrm{BLEU}(t, u)

(Another option is to use the gold reference target t, but in practice this can lead to degenerate solutions in which the model assigns low probabilities to nearly all outputs; this is discussed further in §4.)
As is common practice when computing BLEU at the sentence level, we smooth all initial counts to one (except for unigram counts) so that the geometric mean is not dominated by zero-valued n-gram match counts (Lin and Och, 2004).
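A minimal sketch of SeqNLL over a precomputed candidate set; the tensor inputs (length-normalized candidate log-probabilities and smoothed sentence BLEU scores) are assumed to be computed elsewhere and the function name is illustrative.

```python
import torch

def seq_nll_loss(cand_log_probs, cand_bleu):
    """SeqNLL over a candidate set (sketch of Equation 3).

    cand_log_probs: (k,) length-normalized log p(u|x) for k candidates;
    cand_bleu: (k,) smoothed sentence BLEU of each candidate against t.
    """
    u_star = cand_bleu.argmax()  # pseudo-reference u*: highest-BLEU candidate
    # -log p(u*|x) + log sum_u p(u|x), with the sum computed in log space.
    return -cand_log_probs[u_star] + torch.logsumexp(cand_log_probs, dim=0)
```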
Expected Risk Minimization (Risk)
This objective minimizes the expected value of a given cost function over the space of candidate sequences (Risk, Equation 4). In this work we use task-specific cost functions designed to maximize BLEU or ROUGE (Lin, 2004), e.g., cost(t, u) = 1 − BLEU(t, u), for a given candidate sequence u and target t. In contrast to SeqNLL (§3.2), this loss may increase the score of several candidates that have low cost, instead of focusing on a single sequence which may only be marginally better than the alternatives. Optimizing this loss is a particularly good strategy if the reference is not always reachable, although compared to classical phrase-based models this is less of an issue for neural sequence to sequence models, which predict individual words or even subword units.
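Under the same assumptions as the SeqNLL sketch above, the Risk objective reduces to a softmax-weighted sum of costs over the candidate set, since renormalizing p(u|x) over candidates is a softmax over their log-probabilities. This is an illustrative sketch, not the paper's implementation.

```python
import torch

def risk_loss(cand_log_probs, cand_costs):
    """Expected risk over a candidate set (sketch of Equation 4).

    cand_log_probs: (k,) length-normalized log p(u|x);
    cand_costs: (k,) costs, e.g. 1 - BLEU(t, u).
    """
    # Renormalize over the candidate set so the weights form a distribution.
    weights = torch.softmax(cand_log_probs, dim=0)
    return (weights * cand_costs).sum()
```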
The Risk objective is similar to the REINFORCE objective used in Ranzato et al. (2015), since both objectives optimize an expected cost or reward (Williams, 1992). However, there are a few important differences: (1) whereas REINFORCE typically approximates the expectation with a single sampled sequence, the Risk objective considers multiple sequences; (2) whereas REINFORCE relies on a baseline reward to determine the sign of the gradients for the current sequence (Ranzato et al. (2015) estimate this baseline with a separate linear regressor over the model's current hidden state), for the Risk objective we instead estimate the expected cost over a set of candidate output sequences (see §4); and (3) while the baseline reward is different for every word in REINFORCE, the expected cost is the same for every word in risk minimization, since it is computed at the sequence level based on the actual cost.

MaxMargin
MaxMargin (Equation 5) is a classical margin loss for structured prediction (Taskar et al., 2003; Tsochantaridis et al., 2005) which enforces a margin between the model score of the highest-scoring candidate sequence û and that of a reference sequence. We replace the human reference with a pseudo-reference u*, since this setting performed slightly better in early experiments; u* is the candidate sequence with the highest BLEU score. The size of the margin varies between samples and is given by the difference between the cost of u* and the cost of û. In practice, we scale the margin by a hyperparameter β determined on the validation set: β(cost(t, û) − cost(t, u*)). For this loss we use the unnormalized scores s(u | x) computed by the model before the final softmax.
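A minimal sketch of MaxMargin over a candidate set, again assuming precomputed unnormalized scores and costs; names and defaults are illustrative.

```python
import torch

def max_margin_loss(cand_scores, cand_costs, beta=1.0):
    """Max-margin over a candidate set (sketch of Equation 5).

    cand_scores: (k,) unnormalized model scores s(u|x);
    cand_costs: (k,) costs, e.g. 1 - BLEU(t, u); beta scales the margin.
    """
    u_star = cand_costs.argmin()   # pseudo-reference u*: lowest-cost candidate
    u_hat = cand_scores.argmax()   # û: highest-scoring candidate
    margin = beta * (cand_costs[u_hat] - cand_costs[u_star])
    return torch.clamp(margin - cand_scores[u_star] + cand_scores[u_hat], min=0.0)
```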
MultiMargin
MaxMargin only updates two elements in the candidate set. We therefore consider MultiMargin (Equation 6), which enforces a margin between every candidate sequence u ∈ U(x) and a reference sequence (Herbrich et al., 1999), hence the name MultiMargin. Similar to MaxMargin, we replace the reference with the pseudo-reference u*.
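The multi-candidate variant differs from the previous sketch only in summing the hinge term over all candidates (same assumptions as above).

```python
import torch

def multi_margin_loss(cand_scores, cand_costs, beta=1.0):
    """Multi-margin (sketch of Equation 6): margin against every candidate."""
    u_star = cand_costs.argmin()   # pseudo-reference u*
    margins = beta * (cand_costs - cand_costs[u_star])
    return torch.clamp(margins - cand_scores[u_star] + cand_scores, min=0.0).sum()
```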
SoftmaxMargin
Finally, SoftmaxMargin (Equation 7) is another classic loss, proposed by Gimpel and Smith (2010) as yet another way to optimize task-specific costs. The loss augments the scores inside the exp of SeqNLL (Equation 3) by a cost term. The intuition is that we want to penalize high-cost outputs in proportion to their cost.
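A sketch of the candidate-set approximation of this loss, under the same assumed inputs as the margin sketches above; the cost augmentation enters the log-sum-exp term directly.

```python
import torch

def softmax_margin_loss(cand_scores, cand_costs):
    """Softmax-margin (sketch of Equation 7) over a candidate set."""
    u_star = cand_costs.argmin()   # pseudo-reference u*
    # High-cost candidates contribute more to the partition term, so the
    # model is pushed away from them in proportion to their cost.
    return -cand_scores[u_star] + torch.logsumexp(cand_scores + cand_costs, dim=0)
```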
3.3 Combined Objectives
We also experiment with two variants of combining sequence-level objectives (§3.2) with token-level objectives (§3.1). First, we consider a weighted combination (Weighted) of a sequence-level and a token-level objective (Wu et al., 2016); e.g., for TokLS and Risk we have:
\mathcal{L}_{Weighted} = \alpha \, \mathcal{L}_{TokLS} + (1 - \alpha) \, \mathcal{L}_{Risk}    (8)
where α is a scaling constant that is tuned on a held-out validation set.
Second, we consider a constrained combination (Constrained), where for any given input we use either the token-level or the sequence-level loss, but not both. The motivation is to maintain good token-level accuracy while optimizing on the sequence level. In particular, a sample is processed with the sequence loss if the token loss under the current model θ is at least as good as the token loss of a baseline model θ_B. Otherwise, we update according to the token loss:
\mathcal{L}_{Constrained} = \begin{cases} \mathcal{L}_{Risk} & \text{if } \mathcal{L}_{TokLS}(\theta) \le \mathcal{L}_{TokLS}(\theta_B) \\ \mathcal{L}_{TokLS} & \text{otherwise} \end{cases}    (9)
In this work we use a fixed baseline model θ_B that was trained with a token-level loss to convergence.
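Both combination strategies are simple to express. The following sketch assumes the token-level and sequence-level losses are computed elsewhere; the function names are illustrative.

```python
def weighted_loss(tok_loss, seq_loss, alpha):
    """Weighted combination (sketch of Equation 8); alpha tuned on validation."""
    return alpha * tok_loss + (1.0 - alpha) * seq_loss

def constrained_loss(tok_loss, baseline_tok_loss, seq_loss):
    """Constrained combination (sketch of Equation 9): use the sequence-level
    loss only while the current token-level loss is at least as good as the
    fixed baseline model's token-level loss."""
    return seq_loss if tok_loss <= baseline_tok_loss else tok_loss
```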
4 Candidate Generation Strategies
The sequence-level objectives we consider (§3.2) are defined over the entire space of possible output sequences, which is intractable to enumerate or score with our models. We therefore use a subset of candidate sequences U(x), which we generate with our models.
We consider two search strategies for generating the set of candidate sequences. The first is beam search, a greedy breadth-first search that maintains a "beam" of the top scoring candidates at each generation step. Beam search is the de facto decoding strategy for achieving state-of-the-art results in machine translation. The second strategy is sampling (Chatterjee and Cancedda, 2010), which produces independent output sequences by sampling from the model's conditional distribution. Whereas beam search focuses on high-probability candidates, sampling introduces more diverse candidates (see the comparison in §6.5).
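The sampling strategy can be sketched as ancestral sampling from the model's conditional distribution. Here `step_fn`, `bos`, and `eos` are assumed placeholders (a callable returning next-token probabilities given a prefix, and special token ids), not fairseq-py API.

```python
import torch

def sample_candidates(step_fn, bos, eos, k=16, max_len=200):
    """Draw k candidate sequences by ancestral sampling (sketch)."""
    candidates = []
    for _ in range(k):
        seq = [bos]
        for _ in range(max_len):
            probs = step_fn(seq)  # assumed: (vocab_size,) next-token distribution
            token = torch.multinomial(probs, num_samples=1).item()
            seq.append(token)
            if token == eos:
                break
        candidates.append(seq)
    return candidates
```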
We also consider both online and offline candidate generation settings in §6.4. In the online setting, we regenerate the candidate set every time we encounter an input sentence during training. In the offline setting, candidates are generated before training and are never regenerated. Offline generation is also embarrassingly parallel because all samples use the same model. The disadvantage is that the candidates become stale: our model may learn to perfectly discriminate between them after only a single update, which hinders the ability of the loss to correct remaining search errors. (We could mitigate this issue by regenerating infrequently, i.e., once every k batches, but we leave this to future work.)
Finally, while some past work has added the reference target to the candidate set, i.e., t ∈ U(x), we find this can destabilize training: the model learns to assign low probabilities nearly everywhere, ruining the candidates generated by the model, while still assigning a slightly higher score to the reference (cf. Shen et al. (2016)). Accordingly, we do not add the reference translation to our candidate sets.
5 Experimental Setup
5.1 Translation
We experiment on the IWSLT'14 German to English task (Cettolo et al., 2014) using a similar setup as Ranzato et al. (2015), which allows us to compare to other recent studies that also adopted this setup, e.g., Wiseman and Rush (2016). (Different to Ranzato et al. (2015), we train on sentences of up to 175 rather than 50 tokens.) The training data consists of 160K sentence pairs and the validation set comprises 7K sentences randomly sampled and held out from the training data. We test on the concatenation of all available test and dev sets of IWSLT 2014, that is TED.tst2010, TED.tst2011, TED.tst2012, TED.dev2010 and TEDX.dev2012, which is of similar size to the validation set. (In a previous version of this paper, we erroneously quoted the use of tst2013; we are using TEDX.dev2012 instead.) All data is lowercased and tokenized with a byte-pair encoding (BPE) of 14,000 types (Sennrich et al., 2016) and we evaluate with case-insensitive BLEU.
We also experiment on the much larger WMT'14 English-French task. We remove sentences longer than 175 words as well as pairs with a source/target length ratio exceeding 1.5, resulting in 35.5M sentence pairs for training. The source and target vocabulary is based on 40K BPE types. Results are reported on both newstest2014 and a validation set held out from the training data comprising 26,658 sentence pairs.
We modify the fairseq-py toolkit to implement the objectives described in §3 (code available at https://github.com/pytorch/fairseq/tree/classic_seqlevel). Our translation models have four convolutional encoder layers and three convolutional decoder layers with a kernel width of 3 and 256-dimensional hidden states and word embeddings. We optimize these models using Nesterov's accelerated gradient method (Sutskever et al., 2013) with a learning rate of 0.25 and momentum of 0.99. Gradient vectors are renormalized to norm 0.1 (Pascanu et al., 2013).
We train our baseline token-level models for 200 epochs and then anneal the learning rate by shrinking it by a factor of 10 after each subsequent epoch until it falls below 10^{-4}. All sequence-level models are initialized with the parameters of a token-level model before annealing. We then train sequence-level models for another 10 to 20 epochs depending on the objective. Our batches contain 8K tokens and we normalize gradients by the number of non-padding tokens per mini-batch. We use weight normalization for all layers except for lookup tables (Salimans and Kingma, 2016). Besides dropout on the embeddings and the decoder output, we also apply dropout to the input of the convolutional blocks at a rate of 0.3 (Srivastava et al., 2014). We tuned the various parameters above and report accuracy on the test set by choosing the best configuration based on the validation set.

We length-normalize all scores and probabilities in the sequence-level losses by dividing by the number of tokens in the sequence so that scores are comparable between different lengths. Additionally, when generating candidate output sequences during training we limit the output sequence length to be less than 200 tokens for efficiency. We generally use 16 candidate sequences per training example, except for the ablations where we use 5 for faster experimental turnaround.
5.2 Abstractive Summarization
For summarization we use the Gigaword corpus as training data (Graff et al., 2003) and preprocess it identically to Rush et al. (2015), resulting in 3.8M training and 190K validation examples. We evaluate on a Gigaword test set of 2,000 pairs identical to the one used by Rush et al. (2015) and report F1 ROUGE, similar to prior work. Our results are in terms of three variants of ROUGE (Lin, 2004), namely ROUGE-1 (RG-1, unigrams), ROUGE-2 (RG-2, bigrams), and ROUGE-L (RG-L, longest common subsequence). Similar to Ayana et al. (2016), we use a source and target vocabulary of 30K words. Our models for this task have 12 layers in the encoder and decoder, each with 256 hidden units and kernel width 3. We train on batches of 8,000 tokens with a learning rate of 0.25 for 20 epochs and then anneal as in §5.1.
6 Results
6.1 Comparison of Sequence-Level Losses
First, we compare all objectives based on a weighted combination with token-level label smoothing (Equation 8). We also show the likelihood baseline (MLE) of Wiseman and Rush (2016), their beam search optimization method (BSO), the actor-critic result of Bahdanau et al. (2016), as well as the best reported result on this dataset to date by Huang et al. (2017). We show a like-for-like comparison to Wiseman and Rush (2016) with a similar baseline model below (§6.6).
Table 1 shows that all sequence-level losses outperform token-level losses. Our baseline token-level results are several points above other figures in the literature, and we further improve these results by up to 0.61 BLEU with Risk training.
                                  test    std
MLE (W & R, 2016) [T]            24.03
BSO (W & R, 2016) [S]            26.36
Actor-critic (B, 2016) [S]       28.53
Huang et al. (2017) [T]          28.96
Huang et al. (2017) (+LM) [T]    29.16
TokNLL [T]                       31.78    0.07
TokLS [T]                        32.23    0.10
SeqNLL [S]                       32.68    0.09
Risk [S]                         32.84    0.08
MaxMargin [S]                    32.55    0.09
MultiMargin [S]                  32.59    0.07
SoftmaxMargin [S]                32.71    0.07

Table 1: Accuracy on the IWSLT'14 German-English test set. [S] indicates sequence-level training and [T] token-level training. We report averages and standard deviations over five runs with different random initialization.
6.2 Combination with Token-Level Loss
Next, we compare various strategies to combine sequence-level and token-level objectives (cf. §3.3). For these experiments we use 5 candidate sequences per training example for faster experimental turnaround. We consider Risk as the sequence-level loss and label smoothing as the token-level loss. Table 2 shows that combined objectives perform better than pure Risk. The weighted combination (Equation 8) performs best, outperforming the constrained combination (Equation 9). We also compare to randomly choosing between token-level and sequence-level updates and find that it underperforms the more principled constrained strategy. In the remaining experiments we use the weighted strategy.
               valid    test
TokLS          33.11    32.21
Risk only      33.55    32.45
Weighted       33.91    32.85
Constrained    33.77    32.79
Random         33.70    32.61
6.3 Effect of Initialization
So far we initialized sequence-level models with parameters from a token-level model trained with label smoothing. Table 3 shows that initializing weighted Risk with token-level label smoothing achieves 0.7-0.8 better BLEU compared to initializing with parameters from token-level likelihood. The improvement of initializing with TokNLL is only 0.3 BLEU with respect to the TokNLL baseline, whereas the improvement from initializing with TokLS is 0.6-0.8 BLEU. We believe that the regularization provided by label smoothing leads to models with less sharp distributions that are a better starting point for sequence-level training.
                          valid    test
TokNLL                    32.96    31.74
Risk init with TokNLL     33.27    32.07
  Improvement             +0.31    +0.33
TokLS                     33.11    32.21
Risk init with TokLS      33.91    32.85
  Improvement             +0.80    +0.64
6.4 Online vs. Offline Candidate Generation
Next, we consider the question of whether refreshing the candidate subset at every training step (online) results in better accuracy than generating candidates before training and keeping the set static throughout training (offline). Table 4 shows that offline generation gives lower accuracy. However, the online setting is much slower, since regenerating the candidate set requires incremental (left-to-right) inference with our model, which is very slow compared to efficient forward/backward passes over large batches of pre-generated hypotheses. In our setting, offline generation has 26 times higher throughput than the online generation setting, despite the high inference speed of fairseq (Gehring et al., 2017b).
                      valid    test
Online generation     33.91    32.85
Offline generation    33.52    32.44
6.5 Beam Search vs. Sampling and Candidate Set Size
So far we generated candidates with beam search; however, we may also sample to obtain a more diverse set of candidates (Shen et al., 2016). Figure 2 compares beam search and sampling for various candidate set sizes on the validation set. Beam search performs better for all candidate set sizes considered. In other experiments, we rely on a candidate set size of 16, which strikes a good balance between efficiency and accuracy.
6.6 Comparison to Beam Search Optimization
                        BLEU
MLE                     24.03
 + BSO                  26.36    +2.33
MLE Reimplementation    23.93
 + Risk                 26.68    +2.75

Table 5: Comparison to beam search optimization (BSO; Wiseman and Rush, 2016) on IWSLT'14 German-English translation with a comparable LSTM baseline.

                 RG-1     RG-2     RG-L
ABS+ [T]         29.78    11.89    26.97
RNN MLE [T]      32.67    15.23    30.56
RNN MRT [S]      36.54    16.59    33.44
WFE [T]          36.30    17.31    33.88
SEASS [T]        36.15    17.54    33.63
DRGD [T]         36.27    17.57    33.62
TokLS            36.53    18.10    33.93
 + Risk RG-1     36.96    17.61    34.18
 + Risk RG-2     36.65    18.32    34.07
 + Risk RG-L     36.70    17.88    34.29

Table 6: F1 ROUGE on the Gigaword abstractive summarization test set (cf. §5.2 and §6.8).
Next, we compare classical sequence-level training to the recently proposed beam search optimization (Wiseman and Rush, 2016). To enable a fair comparison, we reimplement their baseline: a single-layer LSTM encoder/decoder model with 256-dimensional hidden layers and word embeddings, as well as attention and input feeding (Luong et al., 2015). This baseline is trained with Adagrad (Duchi et al., 2011) for five epochs, with batches of 64 sequences. For sequence-level training we initialize weights with the baseline parameters and train with Adam (Kingma and Ba, 2014) for another 10 epochs with 16 candidate sequences per training example. We conduct experiments with Risk, since it performed best in trial experiments.
Different from other sequence-level experiments (§5), we rescale the BLEU scores in each candidate set by the difference between the maximum and minimum scores of each sentence. This avoids short sentences dominating the sequence updates, since candidate sets for short sentences have a wider range of BLEU scores compared to longer sentences; a similar rescaling was used by Bahdanau et al. (2016).
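A minimal sketch of this rescaling, assuming the candidate BLEU scores are given as a tensor; the epsilon guard against a zero range is an added safeguard, not from the paper.

```python
import torch

def rescale_bleu(cand_bleu, eps=1e-6):
    """Rescale sentence BLEU within a candidate set by its max-min range,
    so short sentences with wide BLEU ranges do not dominate the updates."""
    span = cand_bleu.max() - cand_bleu.min()
    return cand_bleu / torch.clamp(span, min=eps)
```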
Table 5 shows the results from Wiseman and Rush (2016) for their token-level likelihood baseline (MLE) and their best beam search optimization results (BSO), as well as our reimplemented baseline. Risk significantly improves BLEU compared to our baseline, at +2.75 BLEU, which is slightly better than the +2.33 BLEU improvement reported for beam search optimization (cf. Wiseman and Rush (2016)). This shows that classical objectives for structured prediction are still very competitive.
6.7 WMT'14 English-French Results
Next, we experiment on the much larger WMT'14 English-French task using the same model setup as Gehring et al. (2017b). We train with TokLS for 15 epochs and then switch to sequence-level training for another epoch. Table 7 shows that sequence-level training can improve an already very strong model by another +0.37 BLEU. Next, we improve the baseline by adding self-attention (Paulus et al., 2017; Vaswani et al., 2017) to the decoder network (TokLS + self-att), which results in a smaller gain of +0.2 BLEU from Risk. If we train Risk only on the news-commentary portion of the training data, then we achieve a state-of-the-art result on this dataset of 41.5 BLEU (cf. Xia et al., 2017).
                       valid    test
TokLS                  34.06    40.58
 + Risk                34.20    40.95
TokLS + self-att       34.24    41.02
 + in domain           34.51    41.26
 + Risk                34.30    41.22
 + Risk in domain      34.50    41.47
6.8 Abstractive Summarization
Our final experiment evaluates sequence-level training on Gigaword headline summarization. There has been much prior work on this dataset, originally introduced by Rush et al. (2015), who experiment with a feed-forward network (ABS+). Ayana et al. (2016) report a likelihood baseline (RNN MLE) and also experiment with risk training (RNN MRT). In contrast to their setup, we did not find a softmax temperature to be beneficial, and we use beam search instead of sampling to obtain the candidate set (cf. §6.5). Suzuki and Nagata (2017) improve over an MLE RNN baseline by limiting the generation of repeated phrases. Zhou et al. (2017) also consider an MLE RNN baseline and add an additional gating mechanism for the encoder. Li et al. (2017) equip the decoder of a similar network with additional latent variables to accommodate the uncertainty of this task.
Table 6 shows that our baseline (TokLS) outperforms all prior approaches in terms of ROUGE-2 and ROUGE-L, and it is on par with the best previous result for ROUGE-1. We optimize all three ROUGE metrics separately and find that Risk can further improve our strong baseline. We also compared Risk-only training to Weighted on this dataset (cf. §6.2), but accuracy was generally lower on the validation set: RG-1 (36.59 Risk only vs. 36.67 Weighted), RG-2 (17.34 vs. 18.05), and RG-L (33.66 vs. 33.98).
7 Conclusion
We present a comprehensive comparison of classical losses for structured prediction and apply them to a strong neural sequence to sequence model. We found that combining sequence-level and token-level losses is necessary to perform best, as is training on candidates decoded with the current model.
We show that sequence-level training improves state-of-the-art baselines both for IWSLT'14 German-English translation and Gigaword abstractive sentence summarization. Structured prediction losses are very competitive with recent work on reinforcement learning and beam search optimization. Classical expected risk can slightly outperform beam search optimization (Wiseman and Rush, 2016) in a like-for-like setup. Future work may investigate better use of already-generated candidates, since invoking generation for each batch slows down training by a large factor, e.g., mixing fresh and older candidates, inspired by MERT (Och, 2003).
References
Ayana, Shiqi Shen, Yu Zhao, Zhiyuan Liu, Maosong Sun, et al. 2016. Neural headline generation with sentence-wise optimization. arXiv preprint arXiv:1604.01904.

Dzmitry Bahdanau, Philemon Brakel, Kelvin Xu, Anirudh Goyal, Ryan Lowe, Joelle Pineau, Aaron Courville, and Yoshua Bengio. 2016. An Actor-Critic Algorithm for Sequence Prediction. arXiv preprint arXiv:1607.07086.

Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. 2014. Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473.

Mauro Cettolo, Jan Niehues, Sebastian Stüker, Luisa Bentivogli, and Marcello Federico. 2014. Report on the 11th IWSLT evaluation campaign. In Proc. of IWSLT.

Samidh Chatterjee and Nicola Cancedda. 2010. Minimum error rate training by sampling the translation lattice. In Proc. of EMNLP.

Yann N. Dauphin, Angela Fan, Michael Auli, and David Grangier. 2017. Language Modeling with Gated Convolutional Networks. In Proc. of ICML.

John Duchi, Elad Hazan, and Yoram Singer. 2011. Adaptive subgradient methods for online learning and stochastic optimization. Journal of Machine Learning Research 12(Jul):2121–2159.

Jonas Gehring, Michael Auli, David Grangier, and Yann N. Dauphin. 2017a. A Convolutional Encoder Model for Neural Machine Translation. In Proc. of ACL.

Jonas Gehring, Michael Auli, David Grangier, Denis Yarats, and Yann N. Dauphin. 2017b. Convolutional Sequence to Sequence Learning. In Proc. of ICML.

Kevin Gimpel and Noah Smith. 2010. Softmax-margin CRFs: Training log-linear models with cost functions. In Proc. of ACL.

David Graff, Junbo Kong, Ke Chen, and Kazuaki Maeda. 2003. English Gigaword. Linguistic Data Consortium, Philadelphia.

Spence Green, Daniel Cer, and Christopher Manning. 2014. An Empirical Comparison of Features and Tuning for Phrase-based Machine Translation. In Proc. of WMT.

Ralf Herbrich, Thore Graepel, and Klaus Obermayer. 1999. Support vector learning for ordinal regression. In Proc. of ICANN.

Po-Sen Huang, Chong Wang, Dengyong Zhou, and Li Deng. 2017. Neural Phrase-based Machine Translation. arXiv preprint arXiv:1706.05565.

Diederik P. Kingma and Jimmy Ba. 2014. Adam: A Method for Stochastic Optimization. In Proc. of ICLR.

Piji Li, Wai Lam, Lidong Bing, and Zihao Wang. 2017. Deep recurrent generative decoder for abstractive text summarization. arXiv.

Chin-Yew Lin. 2004. ROUGE: A package for automatic evaluation of summaries. In Text Summarization Branches Out: Proceedings of the ACL-04 Workshop.

Chin-Yew Lin and Franz Josef Och. 2004. ORANGE: a method for evaluating automatic evaluation metrics for machine translation. In Proc. of COLING.

Minh-Thang Luong, Hieu Pham, and Christopher D. Manning. 2015. Effective approaches to attention-based neural machine translation. In Proc. of EMNLP.

Franz Josef Och. 2003. Minimum Error Rate Training in Statistical Machine Translation. In Proc. of ACL, pages 160–167.

Razvan Pascanu, Tomas Mikolov, and Yoshua Bengio. 2013. On the difficulty of training recurrent neural networks. In Proc. of ICML, pages 1310–1318.

Romain Paulus, Caiming Xiong, and Richard Socher. 2017. A deep reinforced model for abstractive summarization. arXiv preprint arXiv:1705.04304.

Gabriel Pereyra, George Tucker, Jan Chorowski, Lukasz Kaiser, and Geoffrey E. Hinton. 2017. Regularizing neural networks by penalizing confident output distributions. In Proc. of ICLR Workshop.

Marc'Aurelio Ranzato, Sumit Chopra, Michael Auli, and Wojciech Zaremba. 2015. Sequence Level Training with Recurrent Neural Networks. In Proc. of ICLR.

Antti-Veikko I. Rosti, Bing Zhang, Spyros Matsoukas, and Richard Schwartz. 2010. BBN System Description for WMT10 System Combination Task. In Proc. of WMT, pages 321–326.

Alexander M. Rush, Sumit Chopra, and Jason Weston. 2015. A neural attention model for abstractive sentence summarization. In Proc. of EMNLP.

Tim Salimans and Diederik P. Kingma. 2016. Weight normalization: A simple reparameterization to accelerate training of deep neural networks. arXiv preprint arXiv:1602.07868.

Rico Sennrich, Barry Haddow, and Alexandra Birch. 2016. Neural Machine Translation of Rare Words with Subword Units. In Proc. of ACL.

Shiqi Shen, Yong Cheng, Zhongjun He, Wei He, Hua Wu, Maosong Sun, and Yang Liu. 2016. Minimum Risk Training for Neural Machine Translation. In Proc. of ACL.

David A. Smith and Jason Eisner. 2006. Minimum Risk Annealing for Training Log-Linear Models. In Proc. of ACL.

Nitish Srivastava, Geoffrey E. Hinton, Alex Krizhevsky, Ilya Sutskever, and Ruslan Salakhutdinov. 2014. Dropout: a simple way to prevent Neural Networks from overfitting. JMLR 15:1929–1958.

Ilya Sutskever, James Martens, George E. Dahl, and Geoffrey E. Hinton. 2013. On the importance of initialization and momentum in deep learning. In Proc. of ICML.

Ilya Sutskever, Oriol Vinyals, and Quoc V. Le. 2014. Sequence to Sequence Learning with Neural Networks. In Proc. of NIPS, pages 3104–3112.

Jun Suzuki and Masaaki Nagata. 2017. Cutting-off redundant repeating generations for neural abstractive summarization. arXiv preprint arXiv:1701.00138.

Christian Szegedy, Vincent Vanhoucke, Sergey Ioffe, Jonathon Shlens, and Zbigniew Wojna. 2015. Rethinking the inception architecture for computer vision. arXiv.

Ben Taskar, Carlos Guestrin, and Daphne Koller. 2003. Max-margin Markov networks. In Proc. of NIPS.

Ioannis Tsochantaridis, Thorsten Joachims, Thomas Hofmann, and Yasemin Altun. 2005. Large margin methods for structured and interdependent output variables. Journal of Machine Learning Research 6:1453–1484.

Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. arXiv.

Ronald J. Williams. 1992. Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine Learning 8:229–256.

Sam Wiseman and Alexander M. Rush. 2016. Sequence-to-sequence learning as beam-search optimization. In Proc. of EMNLP.

Yonghui Wu, Mike Schuster, Zhifeng Chen, Quoc V. Le, Mohammad Norouzi, Wolfgang Macherey, Maxim Krikun, Yuan Cao, Qin Gao, Klaus Macherey, et al. 2016. Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation. arXiv preprint arXiv:1609.08144.

Yingce Xia, Fei Tian, Lijun Wu, Jianxin Lin, Tao Qin, Nenghai Yu, and Tie-Yan Liu. 2017. Deliberation networks: Sequence generation beyond one-pass decoding. In Proc. of NIPS.

Qingyu Zhou, Nan Yang, Furu Wei, and Ming Zhou. 2017. Selective encoding for abstractive sentence summarization. arXiv.