Hierarchical Phrase-based Sequence-to-Sequence Learning

11/15/2022
by   Bailin Wang, et al.
0

We describe a neural transducer that maintains the flexibility of standard sequence-to-sequence (seq2seq) models while incorporating hierarchical phrases as a source of inductive bias during training and as explicit constraints during inference. Our approach trains two models: a discriminative parser based on a bracketing transduction grammar whose derivation tree hierarchically aligns source and target phrases, and a neural seq2seq model that learns to translate the aligned phrases one-by-one. We use the same seq2seq model to translate at all phrase scales, which results in two inference modes: one mode in which the parser is discarded and only the seq2seq component is used at the sequence-level, and another in which the parser is combined with the seq2seq model. Decoding in the latter mode is done with the cube-pruned CKY algorithm, which is more involved but can make use of new translation rules during inference. We formalize our model as a source-conditioned synchronous grammar and develop an efficient variational inference algorithm for training. When applied on top of both randomly initialized and pretrained seq2seq models, we find that both inference modes performs well compared to baselines on small scale machine translation benchmarks.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
09/02/2021

Sequence-to-Sequence Learning with Latent Neural Grammars

Sequence-to-sequence learning with neural networks has become the de fac...
research
06/05/2023

Improving Grammar-based Sequence-to-Sequence Modeling with Decomposition and Constraints

Neural QCFG is a grammar-based sequence-tosequence (seq2seq) model with ...
research
04/03/2020

Learning synchronous context-free grammars with multiple specialised non-terminals for hierarchical phrase-based translation

Translation models based on hierarchical phrase-based statistical machin...
research
11/06/2018

Neural Phrase-to-Phrase Machine Translation

In this paper, we propose Neural Phrase-to-Phrase Machine Translation (N...
research
09/01/1997

Identifying Hierarchical Structure in Sequences: A linear-time algorithm

SEQUITUR is an algorithm that infers a hierarchical structure from a seq...
research
06/10/2021

Progressive Multi-Granularity Training for Non-Autoregressive Translation

Non-autoregressive translation (NAT) significantly accelerates the infer...
research
08/24/2018

Approximate Distribution Matching for Sequence-to-Sequence Learning

Sequence-to-Sequence models were introduced to tackle many real-life pro...

Please sign up or login with your details

Forgot password? Click here to reset