Transcormer: Transformer for Sentence Scoring with Sliding Language Modeling

05/25/2022
by Kaitao Song, et al.

Sentence scoring aims to measure the likelihood of a sentence and is widely used in natural language processing scenarios such as reranking, which selects the best sentence from multiple candidates. Previous work on sentence scoring mainly adopted either causal language modeling (CLM), as in GPT, or masked language modeling (MLM), as in BERT. Both have limitations: 1) CLM uses only unidirectional information to estimate the probability of a sentence and ignores bidirectional context, which affects scoring quality; 2) MLM can estimate the probability of only some of the tokens at a time and thus requires multiple forward passes to estimate the probability of the whole sentence, which incurs large computation and time costs. In this paper, we propose Transcormer, a Transformer model with a novel sliding language modeling (SLM) objective for sentence scoring. Specifically, our SLM adopts a triple-stream self-attention mechanism to estimate the probability of all tokens in a sentence with bidirectional context while requiring only a single forward pass. SLM avoids the limitations of CLM (only unidirectional context) and MLM (multiple forward passes) while inheriting their advantages, and thus achieves both high effectiveness and high efficiency in scoring. Experimental results on multiple tasks demonstrate that our method achieves better performance than other language modeling approaches.
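
The cost difference between CLM-style and MLM-style scoring described above can be illustrated with a minimal sketch. It is not the authors' implementation and does not implement SLM or triple-stream attention: `ToyModel` is a hypothetical stand-in that returns a uniform distribution over a made-up vocabulary and only counts forward passes, and `clm_score` and `mlm_score` are illustrative names. The sketch assumes a CLM yields every token's left-to-right probability in one pass, while an MLM masks one position per pass.

```python
import math
from typing import Dict, List

VOCAB = ["the", "cat", "sat", "down"]  # made-up vocabulary for illustration
MASK = "[MASK]"


class ToyModel:
    """Hypothetical stand-in for a neural LM.

    It returns a uniform distribution at every position and counts how
    many forward passes were requested; it does no real modeling.
    """

    def __init__(self, vocab: List[str]) -> None:
        self.vocab = list(vocab)
        self.forward_passes = 0

    def forward(self, tokens: List[str]) -> List[Dict[str, float]]:
        self.forward_passes += 1
        p = 1.0 / len(self.vocab)
        return [{w: p for w in self.vocab} for _ in tokens]


def clm_score(model: ToyModel, tokens: List[str]) -> float:
    """CLM-style score: one pass gives log p(x_t | x_<t) for every t."""
    dists = model.forward(tokens)
    return sum(math.log(dists[i][tok]) for i, tok in enumerate(tokens))


def mlm_score(model: ToyModel, tokens: List[str]) -> float:
    """MLM-style score: mask one position per pass, so n passes in total."""
    total = 0.0
    for i, tok in enumerate(tokens):
        masked = tokens[:i] + [MASK] + tokens[i + 1:]
        dists = model.forward(masked)
        total += math.log(dists[i][tok])
    return total


if __name__ == "__main__":
    sentence = ["the", "cat", "sat", "down"]
    clm, mlm = ToyModel(VOCAB), ToyModel(VOCAB)
    print("CLM score:", clm_score(clm, sentence), "passes:", clm.forward_passes)
    print("MLM score:", mlm_score(mlm, sentence), "passes:", mlm.forward_passes)
```

With a real model the two scores would differ, since the MLM variant conditions on both sides of each token. According to the abstract, SLM aims for that bidirectional conditioning at the single-pass cost shown for `clm_score`.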

research
05/22/2023

Bidirectional Transformer Reranker for Grammatical Error Correction

Pre-trained seq2seq models have achieved state-of-the-art results in the...
research
04/20/2020

MPNet: Masked and Permuted Pre-training for Language Understanding

BERT adopts masked language modeling (MLM) for pre-training and is one o...
research
05/16/2019

Effective Sentence Scoring Method using Bidirectional Language Model for Speech Recognition

In automatic speech recognition, many studies have shown performance imp...
research
10/25/2021

Paradigm Shift in Language Modeling: Revisiting CNN for Modeling Sanskrit Originated Bengali and Hindi Language

Though there has been a large body of recent works in language modeling ...
research
11/04/2021

A text autoencoder from transformer for fast encoding language representation

In recent years BERT shows apparent advantages and great potential in na...
research
08/19/2020

Context-aware Goodness of Pronunciation for Computer-Assisted Pronunciation Training

Mispronunciation detection is an essential component of the Computer-Ass...
research
05/13/2023

Self-Supervised Sentence Compression for Meeting Summarization

The conventional summarization model often fails to capture critical inf...