Joint Energy-based Model Training for Better Calibrated Natural Language Understanding Models

01/18/2021
by Tianxing He, et al.

In this work, we explore joint energy-based model (EBM) training during the finetuning of pretrained text encoders (e.g., RoBERTa) for natural language understanding (NLU) tasks. Our experiments show that EBM training can help the model reach better calibration, competitive with strong baselines, with little or no loss in accuracy. We discuss three variants of energy functions (namely scalar, hidden, and sharp-hidden) that can be defined on top of a text encoder, and compare them in experiments. Because text data is discrete, we adopt noise contrastive estimation (NCE) to train the energy-based model. To make NCE training more effective, we train an auto-regressive noise model with the masked language model (MLM) objective.
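To make the setup concrete, here is a minimal PyTorch sketch of how a scalar energy function can sit on top of a text encoder and be trained with binary NCE against an auto-regressive noise model. This is not the authors' code: `ScalarEnergyHead`, `nce_loss`, and all shapes and names are assumptions for illustration, corresponding only loosely to the paper's "scalar" variant and the standard NCE objective.

```python
import math

import torch
import torch.nn as nn
import torch.nn.functional as F


class ScalarEnergyHead(nn.Module):
    """Maps the encoder's pooled ([CLS]) representation to a single scalar energy E(x)."""

    def __init__(self, hidden_size: int):
        super().__init__()
        self.proj = nn.Linear(hidden_size, 1)

    def forward(self, pooled: torch.Tensor) -> torch.Tensor:
        # pooled: (batch, hidden_size) -> energies: (batch,)
        return self.proj(pooled).squeeze(-1)


def nce_loss(energy_data: torch.Tensor,   # E(x) for real sequences, shape (B,)
             energy_noise: torch.Tensor,  # E(x') for k*B sequences sampled from the noise model
             log_q_data: torch.Tensor,    # log q(x) under the noise model, shape (B,)
             log_q_noise: torch.Tensor,   # log q(x') under the noise model, shape (k*B,)
             k: int) -> torch.Tensor:
    """Binary NCE: decide whether a sequence came from the data or from the noise
    model q. The EBM's unnormalized log-density is -E(x); the partition function
    is absorbed by the objective, as usual with NCE."""
    log_k = math.log(k)
    # logit of P(data | x) = sigmoid(log p_theta(x) - log(k * q(x)))
    logits_data = -energy_data - log_q_data - log_k      # real examples, label 1
    logits_noise = -energy_noise - log_q_noise - log_k   # noise samples, label 0
    loss_data = F.binary_cross_entropy_with_logits(
        logits_data, torch.ones_like(logits_data))
    loss_noise = F.binary_cross_entropy_with_logits(
        logits_noise, torch.zeros_like(logits_noise))
    # loss_noise averages over the k*B noise samples, so scaling by k recovers the
    # usual "one data term plus k noise terms per example" form of the objective.
    return loss_data + k * loss_noise
```

In a joint-training setup of this kind, the NCE term would simply be added to the usual cross-entropy classification loss during finetuning (e.g. `loss = ce_loss + lam * nce_loss(...)`, with `lam` a hypothetical weighting hyperparameter), so that a single backward pass updates the shared encoder for both the NLU task and the energy model.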


