Pseudolikelihood Reranking with Masked Language Models

10/31/2019
by Julian Salazar, et al.

We rerank with scores from pretrained masked language models like BERT to improve ASR and NMT performance. These log-pseudolikelihood scores (LPLs) can outperform large, autoregressive language models (GPT-2) in out-of-the-box scoring. RoBERTa reduces WER by up to 30% relative and adds up to +1.7 BLEU on state-of-the-art baselines for TED Talks low-resource pairs, with further gains from domain adaptation. In the multilingual setting, a single XLM can be used to rerank translation outputs in multiple languages. The numerical and qualitative properties of LPL scores suggest that LPLs capture sentence fluency better than autoregressive scores. Finally, we finetune BERT to estimate sentence LPLs without masking, enabling scoring in a single, non-recurrent inference pass.
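The core idea behind an LPL is simple: mask each token in turn, score it with the masked LM conditioned on the rest of the sentence, and sum the log-probabilities. A minimal sketch of that loop is below; the `log_prob` scorer here is a hypothetical stand-in (a uniform distribution over a toy vocabulary), not the paper's BERT/RoBERTa scorer, so only the masking-and-summing structure is illustrative.

```python
import math

MASK = "[MASK]"

def pseudo_log_likelihood(tokens, log_prob):
    """Sum log P(w_t | sentence with w_t masked) over all positions t.

    `log_prob(masked_tokens, position, target)` should return the masked
    LM's log-probability of `target` at `position`; here it is a stand-in.
    """
    total = 0.0
    for t, w in enumerate(tokens):
        masked = tokens[:t] + [MASK] + tokens[t + 1:]
        total += log_prob(masked, t, w)
    return total

# Stand-in scorer: uniform over a toy vocabulary (illustration only).
VOCAB = ["the", "cat", "sat", "on", "mat"]

def uniform_log_prob(masked_tokens, position, target):
    return math.log(1.0 / len(VOCAB))

sentence = ["the", "cat", "sat"]
score = pseudo_log_likelihood(sentence, uniform_log_prob)
# With a uniform scorer this is just len(sentence) * log(1/5).
```

In reranking, each ASR or NMT hypothesis is scored this way and the LPL is interpolated with the base model's score; note the loop needs one masked forward pass per token, which is the cost the paper's final maskless finetuning removes.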


