Improving Joint Training of Inference Networks and Structured Prediction Energy Networks

11/07/2019
by Lifu Tu, et al.

Deep energy-based models are powerful, but pose challenges for learning and inference (Belanger and McCallum, 2016). Tu and Gimpel (2018) developed an efficient framework for energy-based models by training "inference networks" to approximate structured inference instead of using gradient descent. However, their alternating optimization approach suffers from instabilities during training, requiring additional loss terms and careful hyperparameter tuning. In this paper, we contribute several strategies to stabilize and improve this joint training of energy functions and inference networks for structured prediction. We design a compound objective to jointly train both cost-augmented and test-time inference networks along with the energy function. We propose joint parameterizations for the inference networks that encourage them to capture complementary functionality during learning. We empirically validate our strategies on two sequence labeling tasks, showing easier paths to strong performance than prior work, as well as further improvements with global energy terms.
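
To make the idea of a compound objective concrete, here is a minimal sketch of how such a loss could be evaluated for one training example. It is an illustration under stated assumptions, not the authors' implementation: it assumes a margin-rescaled hinge term for the cost-augmented inference network plus a weighted hinge term for the test-time inference network, uses Hamming distance as the structured cost, and replaces the learned energy network with a toy function. The names `compound_loss`, `toy_energy`, `hamming_cost`, and the weight `lam` are all hypothetical.

```python
from typing import Callable, Sequence

Labels = Sequence[int]
Energy = Callable[[Sequence[int], Labels], float]


def hamming_cost(pred: Labels, gold: Labels) -> float:
    """Structured cost (assumed): number of positions where labels differ."""
    return float(sum(p != g for p, g in zip(pred, gold)))


def hinge(value: float) -> float:
    """Positive part, max(0, value)."""
    return max(0.0, value)


def compound_loss(
    energy: Energy,
    x: Sequence[int],
    gold: Labels,
    cost_aug_pred: Labels,  # output of the cost-augmented inference network
    test_pred: Labels,      # output of the test-time inference network
    lam: float = 1.0,       # hypothetical weight on the test-time term
) -> float:
    """Sum of a margin-rescaled hinge term and a weighted test-time hinge term."""
    e_gold = energy(x, gold)
    margin_term = hinge(
        hamming_cost(cost_aug_pred, gold) - energy(x, cost_aug_pred) + e_gold
    )
    test_term = hinge(-energy(x, test_pred) + e_gold)
    return margin_term + lam * test_term


def toy_energy(x: Sequence[int], y: Labels) -> float:
    """Toy stand-in for a learned energy: 0.5 per label that mismatches token parity."""
    return sum(0.0 if label == token % 2 else 0.5 for token, label in zip(x, y))


x = [3, 4, 7]         # toy input tokens
gold = [1, 0, 1]      # gold label sequence
cost_aug = [0, 0, 1]  # pretend cost-augmented prediction (one error)
test = [1, 0, 0]      # pretend test-time prediction (one error)

loss = compound_loss(toy_energy, x, gold, cost_aug, test, lam=0.5)
print(loss)
```

In a training setup of this kind, the energy function's parameters would typically be updated to decrease this quantity while the inference networks are updated to increase their respective terms; the sketch only evaluates the loss and leaves out the optimization.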

research
03/09/2018

Learning Approximate Inference Networks for Structured Prediction

Structured prediction energy networks (SPENs; Belanger & McCallum 2016) ...
research
08/27/2021

Learning Energy-Based Approximate Inference Networks for Structured Applications in NLP

Structured prediction in natural language processing (NLP) has a long hi...
research
02/28/2019

Scaling Matters in Deep Structured-Prediction Models

Deep structured-prediction energy-based models combine the expressive po...
research
12/22/2018

Search-Guided, Lightly-supervised Training of Structured Prediction Energy Networks

In structured output prediction tasks, labeling ground-truth training ou...
research
04/01/2019

Benchmarking Approximate Inference Methods for Neural Structured Prediction

Exact structured inference with neural network scoring functions is comp...
research
11/21/2022

Implicit Training of Energy Model for Structure Prediction

Most deep learning research has focused on developing new model and trai...
research
05/19/2022

Learning Energy Networks with Generalized Fenchel-Young Losses

Energy-based models, a.k.a. energy networks, perform inference by optimi...
