Joint Energy-based Model Training for Better Calibrated Natural Language Understanding Models

01/18/2021
by   Tianxing He, et al.

In this work, we explore joint energy-based model (EBM) training during the finetuning of pretrained text encoders (e.g., RoBERTa) for natural language understanding (NLU) tasks. Our experiments show that EBM training can help the model reach calibration that is competitive with strong baselines, with little or no loss in accuracy. We discuss three variants of energy functions (namely scalar, hidden, and sharp-hidden) that can be defined on top of a text encoder, and compare them in experiments. Due to the discreteness of text data, we adopt noise contrastive estimation (NCE) to train the energy-based model. To make NCE training more effective, we train an auto-regressive noise model with the masked language model (MLM) objective.
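The binary NCE objective mentioned in the abstract can be sketched as follows. This is an illustrative assumption of how such a loss is typically computed, not the paper's implementation: the model's unnormalized log-density is taken to be -E(x) (with E produced by, e.g., the "scalar" energy head on the encoder), and the noise model q is the auto-regressive LM; the function name `nce_loss` and its interface are hypothetical.

```python
import math

def nce_loss(energy_data, logq_data, energies_noise, logq_noise, k):
    """Binary NCE: discriminate one data sample from k noise samples.

    The EBM's unnormalized log-probability of x is -E(x); the noise model
    supplies log q(x). The posterior that x came from the data is
        P(data | x) = exp(-E(x)) / (exp(-E(x)) + k * q(x)),
    so the classification logit is -E(x) - log(k * q(x)).
    """
    def log_sigmoid(z):
        # Numerically stable log(sigmoid(z)).
        return -math.log1p(math.exp(-z)) if z >= 0 else z - math.log1p(math.exp(z))

    # Data sample should be classified as "data": maximize log sigmoid(logit).
    s_data = -energy_data - (math.log(k) + logq_data)
    loss = -log_sigmoid(s_data)

    # Noise samples should be classified as "noise":
    # log(1 - sigmoid(s)) = log sigmoid(-s).
    for e, lq in zip(energies_noise, logq_noise):
        s = -e - (math.log(k) + lq)
        loss -= log_sigmoid(-s)
    return loss
```

With one noise sample and all energies and noise log-probabilities at zero, the loss is 2·log 2; lowering the energy assigned to the data sample (i.e., raising its unnormalized probability) lowers the loss, which is the gradient signal that shapes the energy function during finetuning.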

