Length bias in Encoder Decoder Models and a Case for Global Conditioning

06/10/2016
by Pavel Sountsov, et al.

Encoder-decoder networks are popular for modeling sequences probabilistically in many applications. These models use the power of the Long Short-Term Memory (LSTM) architecture to capture the full dependence among variables, unlike earlier models such as CRFs, which typically assumed conditional independence among non-adjacent variables. In practice, however, encoder-decoder models exhibit a bias towards short sequences that, surprisingly, gets worse with increasing beam size. In this paper we show that this phenomenon is due to a discrepancy between the full-sequence margin and the per-element margin enforced by the locally conditioned training objective of an encoder-decoder model. The discrepancy affects long sequences more adversely, which explains the bias towards predicting short sequences. For the case where the predicted sequences come from a closed set, we show that a globally conditioned model alleviates these problems of encoder-decoder models. From a practical point of view, our proposed model also eliminates the need for beam search during inference: inference reduces to an efficient dot-product-based search in a vector space.
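The two scoring schemes contrasted in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: the function names, token probabilities, and embedding vectors below are all hypothetical, chosen only to show the arithmetic. The first part shows how a score built from per-token conditional probabilities can favor a short sequence even when every step of the longer one is more confident. The second part shows what a dot-product search over a closed candidate set might look like, assuming each candidate sequence and the input already have fixed-size embeddings.

```python
import math


def sequence_log_prob(token_probs):
    """Sum of per-token log-probabilities, as in a locally conditioned model."""
    return sum(math.log(p) for p in token_probs)


def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def best_candidate(input_vec, candidate_vecs):
    """Return the candidate whose embedding has the largest dot product with input_vec."""
    return max(candidate_vecs, key=lambda name: dot(input_vec, candidate_vecs[name]))


# Hypothetical per-token probabilities (illustrative numbers only).
short_lp = sequence_log_prob([0.6, 0.6])   # 2 tokens, product 0.36
long_lp = sequence_log_prob([0.8] * 5)     # 5 tokens, product about 0.328

# Hypothetical embeddings for a closed set of candidate responses.
candidates = {
    "ok": [1.0, 0.0],
    "sounds good, see you then": [0.7, 0.9],
}
chosen = best_candidate([0.5, 1.0], candidates)

print(short_lp > long_lp)  # the shorter sequence gets the higher local score
print(chosen)              # one pass over the closed set, no beam search
```

In this sketch the search is a single linear pass over the candidates; with a large closed set, the same dot-product scoring could be served by a maximum-inner-product index, which is a design choice outside what the abstract specifies.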

research
08/06/2016

Encoder-decoder with Focus-mechanism for Sequence Labelling Based Spoken Language Understanding

This paper investigates the framework of encoder-decoder with attention ...
research
02/18/2018

Sequence-to-Sequence Prediction of Vehicle Trajectory via LSTM Encoder-Decoder Architecture

In this paper, we propose a deep learning-based vehicle trajectory predi...
research
09/12/2017

Affective Neural Response Generation

Existing neural conversational models process natural language primarily...
research
06/01/2023

Hierarchical Attention Encoder Decoder

Recent advances in large language models have shown that autoregressive ...
research
07/30/2022

Global Attention-based Encoder-Decoder LSTM Model for Temperature Prediction of Permanent Magnet Synchronous Motors

Temperature monitoring is critical for electrical motors to determine if...
research
10/30/2015

Generating Text with Deep Reinforcement Learning

We introduce a novel schema for sequence to sequence learning with a Dee...
research
05/09/2018

A Click Sequence Model for Web Search

Getting a better understanding of user behavior is important for advanci...
