Pointwise Paraphrase Appraisal is Potentially Problematic

05/25/2020
by Hannah Chen, et al.

The prevailing approach for training and evaluating paraphrase identification models frames the task as binary classification: the model is given a pair of sentences and is judged by how accurately it classifies pairs as paraphrases or non-paraphrases. This pointwise evaluation does not match the objective of most real-world applications well, so the goal of our work is to understand how models that perform well under pointwise evaluation may fail in practice, and to find better methods for evaluating paraphrase identification models. As a first step towards that goal, we show that although the standard way of fine-tuning BERT for paraphrase identification, pairing two sentences as one sequence, results in a model with state-of-the-art performance, that model may perform poorly on simple tasks such as identifying pairs of two identical sentences. Moreover, we show that these models may even assign a higher paraphrase score to a pair of randomly selected sentences than to a pair of identical ones.
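The two sanity checks described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' evaluation code: the function names, the 0.5 decision threshold, and the toy `jaccard_score` scorer are all hypothetical stand-ins. In practice `score_fn` would wrap a fine-tuned model that returns a paraphrase score for a sentence pair.

```python
import random
from typing import Callable, List

# A scorer maps a sentence pair to a paraphrase score (higher = more likely paraphrase).
ScoreFn = Callable[[str, str], float]


def jaccard_score(a: str, b: str) -> float:
    """Toy stand-in scorer (assumption): token-set overlap in [0, 1]."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def identical_pair_accuracy(
    score_fn: ScoreFn, sentences: List[str], threshold: float = 0.5
) -> float:
    """Fraction of (s, s) pairs scored at or above the threshold.

    The 0.5 threshold is an assumed default, not a value from the paper.
    """
    hits = sum(score_fn(s, s) >= threshold for s in sentences)
    return hits / len(sentences)


def random_beats_identical_rate(
    score_fn: ScoreFn, sentences: List[str], seed: int = 0
) -> float:
    """Fraction of sentences for which a randomly chosen other sentence
    receives a strictly higher score than the identical pair does.

    Requires at least two sentences.
    """
    rng = random.Random(seed)
    wins = 0
    for i, s in enumerate(sentences):
        # Pick a partner from every sentence except the one at position i.
        other = rng.choice(sentences[:i] + sentences[i + 1:])
        if score_fn(s, other) > score_fn(s, s):
            wins += 1
    return wins / len(sentences)


corpus = [
    "the cat sat on the mat",
    "stock prices fell sharply today",
    "she plays the violin every evening",
]
print(identical_pair_accuracy(jaccard_score, corpus))
print(random_beats_identical_rate(jaccard_score, corpus))
```

A well-behaved scorer should reach an identical-pair accuracy close to 1 and a random-beats-identical rate close to 0. The abstract's claim is that a model can score well on a pointwise benchmark and still fail these checks.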


