Extracting and filtering paraphrases by bridging natural language inference and paraphrasing

11/13/2021
by Matej Klemen, et al.

Paraphrasing is a useful natural language processing task that can contribute to more diverse generated or translated texts. Natural language inference (NLI) and paraphrasing share some similarities and can benefit from a joint approach. We propose a novel methodology for extracting paraphrasing datasets from NLI datasets and for cleaning existing paraphrasing datasets. Our approach is based on bidirectional entailment: if two sentences entail each other, they are paraphrases. We evaluate our approach using several large pretrained transformer language models in monolingual and cross-lingual settings. The results show that the extracted paraphrasing datasets are of high quality and that two existing paraphrasing datasets contain surprisingly high levels of noise.
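
The bidirectional entailment criterion described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the `threshold` parameter, and the scorer interface are assumptions made for illustration, and `toy_entail_prob` is a crude token-overlap stand-in for a real NLI model (for example, a fine-tuned transformer classifier), included only so the example runs on its own.

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical interface: returns an estimate of P(premise entails hypothesis) in [0, 1].
EntailmentScorer = Callable[[str, str], float]


def is_paraphrase(s1: str, s2: str, entail_prob: EntailmentScorer,
                  threshold: float = 0.5) -> bool:
    """Accept a pair only if entailment holds in BOTH directions."""
    forward = entail_prob(s1, s2)   # does s1 entail s2?
    backward = entail_prob(s2, s1)  # does s2 entail s1?
    return forward >= threshold and backward >= threshold


def extract_paraphrases(pairs: Iterable[Tuple[str, str]],
                        entail_prob: EntailmentScorer,
                        threshold: float = 0.5) -> List[Tuple[str, str]]:
    """Keep only the sentence pairs that pass the bidirectional check."""
    return [(a, b) for a, b in pairs
            if is_paraphrase(a, b, entail_prob, threshold)]


def toy_entail_prob(premise: str, hypothesis: str) -> float:
    """Toy stand-in for an NLI model (assumption, not a real scorer):
    the fraction of distinct hypothesis tokens that also occur in the premise."""
    p = set(premise.lower().split())
    h = set(hypothesis.lower().split())
    return len(h & p) / len(h) if h else 1.0


pairs = [
    # One-directional: the first sentence says more than the second.
    ("a man is eating food", "a man is eating"),
    # Same tokens in a different order: passes the toy check in both directions.
    ("the dog chased the cat quickly", "quickly the dog chased the cat"),
]
kept = extract_paraphrases(pairs, toy_entail_prob, threshold=0.9)
print(kept)
```

The same filter applies to both uses named in the abstract: run over the sentence pairs of an NLI dataset it extracts paraphrase candidates, and run over an existing paraphrasing dataset it flags pairs that fail in at least one direction as likely noise. The threshold value here is arbitrary and would need tuning for a real model.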


