An Empirical Investigation of Global and Local Normalization for Recurrent Neural Sequence Models Using a Continuous Relaxation to Beam Search

04/15/2019
by Kartik Goyal et al.

Globally normalized neural sequence models are considered superior to their locally normalized equivalents because they may ameliorate the effects of label bias. However, when considering high-capacity neural parametrizations that condition on the whole input sequence, both model classes are theoretically equivalent in terms of the distributions they can represent. Thus, the practical advantage of global normalization in the context of modern neural methods remains unclear. In this paper, we attempt to shed light on this question through an empirical study. We extend an approach for search-aware training via a continuous relaxation of beam search (Goyal et al., 2017b) to enable training of globally normalized recurrent sequence models through simple backpropagation. We then use this technique to conduct an empirical study of the interaction between global normalization, high-capacity encoders, and search-aware optimization. We observe that in the context of inexact search, globally normalized neural models remain more effective than their locally normalized counterparts. Further, since our training approach is sensitive to warm-starting with pre-trained models, we also propose a novel initialization strategy based on self-normalization for pre-training globally normalized models. We analyze our approach on two tasks, CCG supertagging and machine translation, and demonstrate the importance of global normalization under varied conditions when using search-aware training.
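As a rough sketch of the distinction the abstract draws (an illustration, not the authors' implementation; the function names and the beam-based approximation of the partition function are assumptions), a locally normalized model applies a softmax over the vocabulary at every decoding step, while a globally normalized model scores whole sequences and normalizes once over a set of candidate sequences, which in practice is an inexact set such as a beam:

```python
import numpy as np

def local_log_prob(step_scores):
    """Locally normalized model: softmax at each time step.

    step_scores: list of (scores_t, y_t) pairs, where scores_t is the
    vector of unnormalized scores over the vocabulary at step t and
    y_t is the index of the chosen token.
    """
    total = 0.0
    for scores, y in step_scores:
        log_z_t = np.log(np.sum(np.exp(scores)))  # per-step partition
        total += scores[y] - log_z_t
    return total

def global_log_prob(seq_score, candidate_scores):
    """Globally normalized model: one partition over whole sequences.

    seq_score: unnormalized score of the target sequence.
    candidate_scores: scores of a candidate set (e.g. a beam), which
    approximates the intractable sum over all sequences.
    """
    log_z = np.log(np.sum(np.exp(candidate_scores)))
    return seq_score - log_z
```

Because the global partition is computed over entire sequences, a low-scoring early step can be compensated for later, which is one way the label-bias effect mentioned above is mitigated; a locally normalized model must commit its probability mass at every step.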


