Middle-Out Decoding

10/28/2018
by Shikib Mehri, et al.

Despite being virtually ubiquitous, sequence-to-sequence models are challenged by their lack of diversity and inability to be externally controlled. In this paper, we speculate that a fundamental shortcoming of sequence generation models is that decoding is done strictly from left to right, meaning that output values generated earlier have a profound effect on those generated later. To address this issue, we propose a novel middle-out decoder architecture that begins from an initial middle word and simultaneously expands the sequence in both directions. To facilitate information flow and maintain consistent decoding, we introduce a dual self-attention mechanism that allows us to model complex dependencies between the outputs. We illustrate the performance of our model on the task of video captioning, as well as on a synthetic sequence de-noising task. Our middle-out decoder achieves significant improvements on de-noising and competitive performance on video captioning, while quantifiably improving caption diversity. Furthermore, we perform a qualitative analysis that demonstrates our ability to effectively control the generation process of our decoder.
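The decoding order described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function `middle_out_decode`, the `END` marker, and the lookup-table step functions `toy_left` and `toy_right` are all hypothetical names introduced here, and the lookup tables stand in for the learned left and right decoders (and their dual self-attention) that the abstract refers to. The sketch only shows how a sequence can grow outward from a given middle word, one token per direction per step.

```python
from collections import deque
from typing import Callable, Deque, List

END = "<end>"  # hypothetical end-of-direction marker, not taken from the paper


def middle_out_decode(
    middle: str,
    step_left: Callable[[List[str]], str],
    step_right: Callable[[List[str]], str],
    max_len: int = 20,
) -> List[str]:
    """Grow a sequence outward from `middle`, alternating right and left steps.

    Each step function sees the whole partial sequence, so a real model
    could condition on tokens generated in both directions.
    """
    seq: Deque[str] = deque([middle])
    left_done = right_done = False
    while not (left_done and right_done) and len(seq) < max_len:
        if not right_done:
            tok = step_right(list(seq))
            if tok == END:
                right_done = True
            else:
                seq.append(tok)  # extend to the right
        if not left_done and len(seq) < max_len:
            tok = step_left(list(seq))
            if tok == END:
                left_done = True
            else:
                seq.appendleft(tok)  # extend to the left
    return list(seq)


# Toy stand-ins for learned decoders: each table maps the current boundary
# token to the next token in that direction (assumption for illustration).
LEFT_TABLE = {"playing": "is", "is": "man", "man": "a"}
RIGHT_TABLE = {"playing": "the", "the": "guitar"}


def toy_left(seq: List[str]) -> str:
    return LEFT_TABLE.get(seq[0], END)


def toy_right(seq: List[str]) -> str:
    return RIGHT_TABLE.get(seq[-1], END)


caption = middle_out_decode("playing", toy_left, toy_right)
print(" ".join(caption))
```

Choosing a different middle word, with step functions that can continue from it, changes the whole output, which is the kind of external control the abstract mentions.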


