Distilling Knowledge for Search-based Structured Prediction

05/29/2018
by   Yijia Liu, et al.
0

Many natural language processing tasks can be modeled into structured prediction and solved as a search problem. In this paper, we distill an ensemble of multiple models trained with different initialization into a single model. In addition to learning to match the ensemble's probability output on the reference states, we also use the ensemble to explore the search space and learn from the encountered states in the exploration. Experimental results on two typical search-based structured prediction tasks -- transition-based dependency parsing and neural machine translation show that distillation can effectively improve the single model's performance and the final model achieves improvements of 1.32 in LAS and 2.65 in BLEU score on these two tasks respectively over strong baselines and it outperforms the greedy structured prediction models in previous literatures.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
10/13/2020

Ensemble Distillation for Structured Prediction: Calibrated, Accurate, Fast—Choose Three

Modern neural networks do not always produce well-calibrated predictions...
research
04/01/2019

Learning to Stop in Structured Prediction for Neural Machine Translation

Beam search optimization resolves many issues in neural machine translat...
research
03/25/2022

Single Model Ensemble for Subword Regularized Models in Low-Resource Machine Translation

Subword regularizations use multiple subword segmentations during traini...
research
10/04/2020

Adversarial Attack and Defense of Structured Prediction Models

Building an effective adversarial attacker and elaborating on countermea...
research
12/05/2020

Reciprocal Supervised Learning Improves Neural Machine Translation

Despite the recent success on image classification, self-training has on...
research
02/07/2022

Measuring and Reducing Model Update Regression in Structured Prediction for NLP

Recent advance in deep learning has led to rapid adoption of machine lea...
research
10/11/2022

Exploring CNN-based models for image's aesthetic score prediction with using ensemble

In this paper, we proposed a framework of constructing two types of the ...

Please sign up or login with your details

Forgot password? Click here to reset