Generative Bridging Network in Neural Sequence Prediction

06/28/2017
by Wenhu Chen, et al.

Maximum Likelihood Estimation (MLE) suffers from the data sparsity problem in sequence prediction tasks where training resources are scarce. To alleviate this problem, we propose a novel generative bridging network (GBN) for training sequence prediction models, which consists of a generator and a bridge. Whereas MLE directly maximizes the likelihood of the ground truth, the bridge extends the point-wise ground truth to a bridge distribution (containing inexhaustible examples), and the generator is trained to minimize the KL-divergence between its output distribution and the bridge distribution. To guide the training of the generator with additional signals, the bridge distribution can be set or trained to possess specific properties by applying different constraints. More specifically, to increase output diversity, enhance language smoothness, and relieve the learning burden, three different regularization constraints are introduced to construct bridge distributions. By combining these bridges with a sequence generator, three independent GBNs are proposed, namely the uniform GBN, the language-model GBN, and the coaching GBN. Experiments conducted on two recognized sequence prediction tasks (machine translation and abstractive text summarization) show that the proposed GBNs yield significant improvements over strong baseline systems. Furthermore, analysis of samples drawn from the bridge distributions verifies their expected influence on sequence model training.
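
The contrast between a point-wise ground truth and a bridge distribution can be illustrated with a toy example. The sketch below is not the paper's formulation: the weighting by exponentiated negative Hamming distance, the temperature `tau`, the direction KL(bridge || generator), and all function names are assumptions chosen only to make the idea concrete on a two-token vocabulary.

```python
import itertools
import math


def hamming(a, b):
    """Count positions where two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def bridge_distribution(target, vocab, tau):
    """Toy bridge (assumption): spread mass over all sequences of the same
    length as `target`, weighted by exp(-hamming / tau)."""
    candidates = list(itertools.product(vocab, repeat=len(target)))
    weights = [math.exp(-hamming(c, target) / tau) for c in candidates]
    z = sum(weights)
    return {c: w / z for c, w in zip(candidates, weights)}


def kl_divergence(p, q):
    """KL(p || q) for two distributions over the same finite support."""
    return sum(pv * math.log(pv / q[c]) for c, pv in p.items() if pv > 0)


vocab = ("a", "b")
target = ("a", "b")

# A smooth bridge assigns non-zero probability to sequences near the target.
bridge = bridge_distribution(target, vocab, tau=1.0)

# A hypothetical, untrained generator that is uniform over all candidates.
generator = {c: 1.0 / len(bridge) for c in bridge}

# Training signal in this sketch: the divergence the generator would minimize.
loss = kl_divergence(bridge, generator)

# As tau shrinks, the bridge collapses toward the ground truth, and the loss
# approaches the MLE objective -log q(target).
sharp_bridge = bridge_distribution(target, vocab, tau=0.01)
mle_like_loss = kl_divergence(sharp_bridge, generator)
```

In this toy setting, a small `tau` recovers the usual negative log-likelihood of the ground truth, while a larger `tau` asks the generator to also place probability on nearby sequences.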

research
05/14/2018

Token-level and sequence-level loss smoothing for RNN language models

Despite the effectiveness of recurrent neural network language models, t...
research
11/30/2019

Modeling Fluency and Faithfulness for Diverse Neural Machine Translation

Neural machine translation models usually adopt the teacher forcing stra...
research
01/18/2019

Improving Sequence-to-Sequence Learning via Optimal Transport

Sequence-to-sequence models are commonly trained via maximum likelihood ...
research
08/24/2018

Approximate Distribution Matching for Sequence-to-Sequence Learning

Sequence-to-Sequence models were introduced to tackle many real-life pro...
research
06/10/2021

Mode recovery in neural autoregressive sequence modeling

Despite its wide use, recent studies have revealed unexpected and undesi...
research
06/21/2018

BFGAN: Backward and Forward Generative Adversarial Networks for Lexically Constrained Sentence Generation

In many natural language generation tasks, incorporating additional know...
research
05/04/2023

Scanpath Prediction in Panoramic Videos via Expected Code Length Minimization

Predicting human scanpaths when exploring panoramic videos is a challeng...