An Empirical Investigation of Beam-Aware Training in Supertagging

10/10/2020
by Renato Negrinho, et al.

Structured prediction is often approached by training a locally normalized model with maximum likelihood and decoding approximately with beam search. This approach creates a mismatch between training and test time: during training, the model is not exposed to its own mistakes and does not use beam search. Beam-aware training aims to address these problems, but it is not yet widely used because its impact on performance, the settings in which it is most useful, and its stability are poorly understood. Recently, Negrinho et al. (2018) proposed a meta-algorithm that captures existing beam-aware training algorithms and suggests new ones, but did not provide empirical results. In this paper, we begin an empirical investigation: we train the supertagging model of Vaswani et al. (2016) and a simpler model with instantiations of the meta-algorithm. We explore the influence of various design choices and make recommendations for choosing them. We observe that beam-aware training improves performance for both models, with large improvements for the simpler model, which must effectively manage uncertainty during decoding. Our results suggest that a model must be learned with search to maximize its effectiveness.
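To make the decoding setup concrete, the following is a minimal sketch of beam search over a locally normalized tagger. It is not the paper's implementation: `beam_search`, `toy_score_fn`, and the two-tag toy probabilities are hypothetical names and values chosen only for illustration. The toy model is built so that the locally best first tag leads to a worse full sequence, which a beam of size 1 cannot recover from.

```python
import math
from typing import Callable, List, Sequence, Tuple

# Hypothetical interface: given the tags chosen so far and the current
# position, return one log-probability per candidate tag.
ScoreFn = Callable[[Tuple[int, ...], int], Sequence[float]]


def beam_search(
    score_fn: ScoreFn, length: int, beam_size: int
) -> List[Tuple[float, Tuple[int, ...]]]:
    """Return up to beam_size (log_prob, tags) hypotheses, best first."""
    beam: List[Tuple[float, Tuple[int, ...]]] = [(0.0, ())]
    for pos in range(length):
        candidates = []
        for logp, prefix in beam:
            # Expand every hypothesis on the beam with every possible tag.
            for tag, tag_logp in enumerate(score_fn(prefix, pos)):
                candidates.append((logp + tag_logp, prefix + (tag,)))
        # Keep only the beam_size highest-scoring partial sequences.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beam = candidates[:beam_size]
    return beam


def toy_score_fn(prefix: Tuple[int, ...], pos: int) -> Sequence[float]:
    """Toy locally normalized model over two tags (illustrative values only)."""
    if pos == 0:
        return [math.log(0.6), math.log(0.4)]
    if prefix[-1] == 0:
        return [math.log(0.55), math.log(0.45)]
    return [math.log(0.9), math.log(0.1)]


greedy = beam_search(toy_score_fn, length=2, beam_size=1)
wide = beam_search(toy_score_fn, length=2, beam_size=2)
print(greedy[0][1], round(math.exp(greedy[0][0]), 2))
print(wide[0][1], round(math.exp(wide[0][0]), 2))
```

With a beam of size 1, the sketch commits to tag 0 at the first position and ends with a sequence of probability 0.33, while a beam of size 2 keeps tag 1 alive and finds the sequence of probability 0.36. A model trained only with maximum likelihood never sees this pruning step, which is the mismatch that beam-aware training targets.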


