Multi-task Sequence to Sequence Learning

11/19/2015
by   Minh-Thang Luong, et al.
0

Sequence to sequence learning has recently emerged as a new paradigm in supervised learning. To date, most of its applications focused on only one task and not much work explored this framework for multiple tasks. This paper examines three multi-task learning (MTL) settings for sequence to sequence models: (a) the oneto-many setting - where the encoder is shared between several tasks such as machine translation and syntactic parsing, (b) the many-to-one setting - useful when only the decoder can be shared, as in the case of translation and image caption generation, and (c) the many-to-many setting - where multiple encoders and decoders are shared, which is the case with unsupervised objectives and translation. Our results show that training on a small amount of parsing and image caption data can improve the translation quality between English and German by up to 1.5 BLEU points over strong single-task baselines on the WMT benchmarks. Furthermore, we have established a new state-of-the-art result in constituent parsing with 93.0 F1. Lastly, we reveal interesting properties of the two unsupervised learning objectives, autoencoder and skip-thought, in the MTL context: autoencoder helps less in terms of perplexities but more on BLEU scores compared to skip-thought.

READ FULL TEXT
research
03/24/2017

Sequence-to-Sequence Models Can Directly Translate Foreign Speech

We present a recurrent encoder-decoder deep neural network architecture ...
research
05/11/2018

Neural Machine Translation for Bilingually Scarce Scenarios: A Deep Multi-task Learning Approach

Neural machine translation requires large amounts of parallel training t...
research
06/01/2018

Scaling Neural Machine Translation

Sequence to sequence learning models still require several days to reach...
research
11/08/2016

Unsupervised Pretraining for Sequence to Sequence Learning

Sequence to sequence models are successful tools for supervised sequence...
research
06/09/2017

Rethinking Skip-thought: A Neighborhood based Approach

We study the skip-thought model with neighborhood information as weak su...
research
05/18/2021

Representation Learning in Sequence to Sequence Tasks: Multi-filter Gaussian Mixture Autoencoder

Heterogeneity of sentences exists in sequence to sequence tasks such as ...
research
03/13/2017

DRAGNN: A Transition-based Framework for Dynamically Connected Neural Networks

In this work, we present a compact, modular framework for constructing n...

Please sign up or login with your details

Forgot password? Click here to reset