Neural semi-Markov CRF for Monolingual Word Alignment

06/04/2021
by Wuwei Lan, et al.

Monolingual word alignment is important for studying fine-grained editing operations (i.e., deletion, addition, and substitution) in text-to-text generation tasks such as paraphrase generation, text simplification, and neutralizing biased language. In this paper, we present a novel neural semi-Markov CRF alignment model, which unifies word and phrase alignments through variable-length spans. We also create a new benchmark with human annotations covering four different text genres, so that monolingual word alignment models can be evaluated in more realistic settings. Experimental results show that our proposed model outperforms all previous approaches for monolingual word alignment, as well as a competitive QA-based baseline that was previously applied only to bilingual data. Our model generalizes well to three out-of-domain datasets and proves useful in two downstream applications: automatic text simplification and sentence pair classification.
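To make the idea of aligning through variable-length spans concrete, here is a minimal Python sketch of semi-Markov-style decoding. It is an illustration under stated assumptions, not the paper's implementation: the names `semi_markov_decode`, `enumerate_spans`, and `toy_score` are hypothetical, the span scores are hand-written rather than produced by a neural network, and the sketch omits any transition term between adjacent spans. It segments the source sentence into spans of at most `max_len` tokens and assigns each one either a target span or `None` (unaligned), choosing the segmentation with the highest total score.

```python
from typing import Callable, List, Optional, Tuple

Span = Tuple[int, int]  # half-open token range [start, end)
Alignment = List[Tuple[Span, Optional[Span]]]


def enumerate_spans(n: int, max_len: int) -> List[Span]:
    """All spans of length 1..max_len over n tokens."""
    return [(s, e) for s in range(n)
            for e in range(s + 1, min(n, s + max_len) + 1)]


def semi_markov_decode(
    n_src: int,
    n_tgt: int,
    max_len: int,
    score: Callable[[Span, Optional[Span]], float],
) -> Tuple[float, Alignment]:
    """Viterbi search over segmentations of the source into labelled spans.

    Assumption: the total score is a plain sum of per-span scores, and
    the same target span may be chosen more than once.
    """
    targets: List[Optional[Span]] = [None] + enumerate_spans(n_tgt, max_len)
    best = [float("-inf")] * (n_src + 1)  # best[e]: best score covering tokens [0, e)
    back: List[Optional[Tuple[int, Optional[Span]]]] = [None] * (n_src + 1)
    best[0] = 0.0

    for end in range(1, n_src + 1):
        for start in range(max(0, end - max_len), end):
            for tgt in targets:
                cand = best[start] + score((start, end), tgt)
                if cand > best[end]:
                    best[end] = cand
                    back[end] = (start, tgt)

    # Follow back-pointers from the end of the sentence to recover the spans.
    alignment: Alignment = []
    end = n_src
    while end > 0:
        start, tgt = back[end]
        alignment.append(((start, end), tgt))
        end = start
    alignment.reverse()
    return best[n_src], alignment


# Toy example: source "he passed away" vs. target "he died".
# The scores below are invented purely for illustration.
TOY = {
    ((0, 1), (0, 1)): 2.0,  # "he" <-> "he"
    ((1, 3), (1, 2)): 3.0,  # "passed away" <-> "died" (phrase-to-word)
    ((1, 2), (1, 2)): 1.0,  # "passed" <-> "died"
}


def toy_score(src: Span, tgt: Optional[Span]) -> float:
    if tgt is None:
        return 0.0  # leaving a source span unaligned is neutral
    return TOY.get((src, tgt), -1.0)


total, alignment = semi_markov_decode(3, 2, max_len=2, score=toy_score)
print(total, alignment)
# 5.0 [((0, 1), (0, 1)), ((1, 3), (1, 2))]
```

In this toy case the decoder prefers the two-token span "passed away" aligned to "died" over aligning "passed" alone, which is the kind of word-versus-phrase choice that a span-based formulation lets a single search handle.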

research
05/05/2020

Neural CRF Model for Sentence Alignment in Text Simplification

The success of a text simplification system heavily depends on the quali...
research
10/14/2019

Updating Pre-trained Word Vectors and Text Classifiers using Monolingual Alignment

In this paper, we focus on the problem of adapting word vector-based mod...
research
06/14/2022

Text Generation with Text-Editing Models

Text-editing models have recently become a prominent alternative to seq2...
research
10/09/2018

A Fast, Compact, Accurate Model for Language Identification of Codemixed Text

We address fine-grained multilingual language identification: providing ...
research
09/19/2018

Monolingual sentence matching for text simplification

This work improves monolingual sentence alignment for text simplificatio...
research
10/26/2022

arXivEdits: Understanding the Human Revision Process in Scientific Writing

Scientific publications are the primary means to communicate research di...
research
09/06/2022

Monolingual alignment of word senses and definitions in lexicographical resources

The focus of this thesis is broadly on the alignment of lexicographical ...
