Ensemble Distillation for Structured Prediction: Calibrated, Accurate, Fast—Choose Three

10/13/2020
by   Steven Reich, et al.
6

Modern neural networks do not always produce well-calibrated predictions, even when trained with a proper scoring function such as cross-entropy. In classification settings, simple methods such as isotonic regression or temperature scaling may be used in conjunction with a held-out dataset to calibrate model outputs. However, extending these methods to structured prediction is not always straightforward or effective; furthermore, a held-out calibration set may not always be available. In this paper, we study ensemble distillation as a general framework for producing well-calibrated structured prediction models while avoiding the prohibitive inference-time cost of ensembles. We validate this framework on two tasks: named-entity recognition and machine translation. We find that, across both tasks, ensemble distillation produces models which retain much of, and occasionally improve upon, the performance and calibration benefits of ensembles, while only requiring a single model during test-time.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
04/09/2020

Calibrating Structured Output Predictors for Natural Language Processing

We address the problem of calibrating prediction confidence for output e...
research
05/29/2018

Distilling Knowledge for Search-based Structured Prediction

Many natural language processing tasks can be modeled into structured pr...
research
12/08/2018

Adaptive and Calibrated Ensemble Learning with Dependent Tail-free Process

Ensemble learning is a mainstay in modern data science practice. Convent...
research
07/05/2023

Set Learning for Accurate and Calibrated Models

Model overconfidence and poor calibration are common in machine learning...
research
01/16/2020

Better Boosting with Bandits for Online Learning

Probability estimates generated by boosting ensembles are poorly calibra...
research
03/29/2022

Learning Structured Gaussians to Approximate Deep Ensembles

This paper proposes using a sparse-structured multivariate Gaussian to p...
research
06/25/2020

Fast, Accurate, and Simple Models for Tabular Data via Augmented Distillation

Automated machine learning (AutoML) can produce complex model ensembles ...

Please sign up or login with your details

Forgot password? Click here to reset