Speculative Beam Search for Simultaneous Translation

09/12/2019
by Renjie Zheng, et al.

Beam search is universally used in full-sentence translation, but applying it to simultaneous translation, where output words are committed on the fly, remains non-trivial. In particular, the recently proposed wait-k policy (Ma et al., 2019a) is a simple and effective method that, after an initial wait of k input words, commits one output word on receiving each input word, which makes beam search seemingly impossible. To address this challenge, we propose a speculative beam search algorithm that hallucinates several steps into the future in order to reach a more accurate decision, implicitly benefiting from a target language model. This makes beam search applicable for the first time to the generation of a single word in each step. Experiments over diverse language pairs show large improvements over previous work.
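
The idea of speculating a few steps ahead and then committing only one word can be illustrated with a minimal sketch. It is not the authors' implementation: the `step_fn` interface, the toy model, the `window` and `beam_size` parameter names, and the choice to keep the visible source prefix fixed during speculation are all assumptions made for illustration, and the handling of the sentence tail after the source ends is simplified.

```python
import math
from typing import Callable, Dict, List, Tuple

EOS = "</s>"

# Hypothetical model interface (an assumption, not from the paper):
# (visible source prefix, target prefix) -> log-probabilities of the next target word.
StepFn = Callable[[List[str], List[str]], Dict[str, float]]


def speculative_step(step_fn: StepFn, src_prefix: List[str], tgt_prefix: List[str],
                     beam_size: int = 2, window: int = 1) -> str:
    """Beam-search `window` extra steps ahead, then return only the first word
    of the best speculative continuation. window=0 reduces to greedy choice."""
    beams: List[Tuple[float, List[str]]] = [(0.0, [])]
    for _ in range(window + 1):
        candidates: List[Tuple[float, List[str]]] = []
        for score, cont in beams:
            if cont and cont[-1] == EOS:
                candidates.append((score, cont))  # finished hypothesis: keep as is
                continue
            for word, logp in step_fn(src_prefix, tgt_prefix + cont).items():
                candidates.append((score + logp, cont + [word]))
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = candidates[:beam_size]
    return beams[0][1][0]


def wait_k_decode(step_fn: StepFn, source: List[str], k: int, max_len: int = 50,
                  beam_size: int = 2, window: int = 1) -> List[str]:
    """Wait-k schedule: before writing target word t (0-based), the first
    min(k + t, len(source)) source words have been read."""
    target: List[str] = []
    while len(target) < max_len:
        visible = source[: min(len(source), k + len(target))]
        word = speculative_step(step_fn, visible, target, beam_size, window)
        if word == EOS:
            break
        target.append(word)  # committed: never revised afterwards
    return target


def toy_step(src: List[str], tgt: List[str]) -> Dict[str, float]:
    """Toy distribution (illustrative only) where the locally best first word
    leads to a worse continuation than the runner-up."""
    if not tgt:
        return {"a": math.log(0.6), "b": math.log(0.4)}
    if len(tgt) >= 2:
        return {EOS: 0.0}
    if tgt[0] == "a":
        return {"x": math.log(0.5), "y": math.log(0.5)}
    return {"z": 0.0}


if __name__ == "__main__":
    src = ["s1", "s2", "s3"]
    print(wait_k_decode(toy_step, src, k=1, window=0))  # greedy commitment
    print(wait_k_decode(toy_step, src, k=1, window=1))  # one speculative step
```

In the toy model, greedy commitment picks the locally most probable first word, whereas one step of speculation prefers the first word whose continuation scores higher overall. Only the first speculative word is committed at each step; the rest of the hallucinated continuation is discarded and recomputed once the next source word arrives.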

08/31/2018

When to Finish? Optimal Beam Search for Neural Text Generation (modulo beam size)

In neural text generation such as neural machine translation, summarizat...
07/20/2021

What Do You Get When You Cross Beam Search with Nucleus Sampling?

We combine beam search with the probabilistic pruning technique of nucle...
09/04/2019

Simpler and Faster Learning of Adaptive Policies for Simultaneous Translation

Simultaneous translation is widely useful but remains challenging. Previ...
09/26/2022

Word to Sentence Visual Semantic Similarity for Caption Generation: Lessons Learned

This paper focuses on enhancing the captions generated by image-caption ...
06/11/2018

Finding Syntax in Human Encephalography with Beam Search

Recurrent neural network grammars (RNNGs) are generative models of (tree...
08/17/2019

Leveraging sentence similarity in natural language generation: Improving beam search using range voting

We propose a novel method for generating natural language sentences from...
10/10/2020

An Empirical Investigation of Beam-Aware Training in Supertagging

Structured prediction is often approached by training a locally normaliz...