Text Revision by On-the-Fly Representation Optimization

04/15/2022
by Jingjing Li, et al.

Text revision refers to a family of natural language generation tasks in which the source and target sequences share moderate surface-form resemblance but differ in attributes such as formality and simplicity. Current state-of-the-art methods formulate these tasks as sequence-to-sequence learning problems, which rely on large-scale parallel training corpora. In this paper, we present an iterative in-place editing approach for text revision that requires no parallel data. We simply fine-tune a pre-trained Transformer with masked language modeling and attribute classification. During inference, each editing iteration is realized by a two-step span replacement. In the first step, the distributed representation of the text is optimized on the fly toward an attribute function. In the second step, a text span is masked and a new one is proposed conditioned on the optimized representation. Empirical experiments on two typical and important text revision tasks, text formalization and text simplification, show the effectiveness of our approach. It achieves competitive, and in some cases better, performance than state-of-the-art supervised methods on text simplification, and outperforms strong unsupervised methods on text formalization. [Code and model are available at <https://github.com/jingjingli01/OREO>]
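The two-step span replacement described above can be illustrated with a toy sketch. Everything below is a hypothetical stand-in, not the authors' OREO implementation: random vectors play the role of Transformer hidden states, a linear dot-product scorer plays the role of the attribute classifier, and nearest-embedding lookup plays the role of masked-language-model infilling. Only the control flow (optimize the representation toward the attribute, then propose a replacement span conditioned on it, iterated) mirrors the abstract.

```python
import numpy as np

# Toy sketch of iterative in-place editing via two-step span replacement.
# All names (attribute_score, propose_token, revise, ...) are illustrative,
# not the authors' actual API; vectors stand in for Transformer states.

rng = np.random.default_rng(0)
VOCAB = ["hey", "hi", "hello", "greetings"]        # toy vocabulary
EMB = {w: rng.normal(size=4) for w in VOCAB}       # toy token "embeddings"
FORMAL_DIR = EMB["greetings"] - EMB["hey"]         # toy "formality" direction

def attribute_score(h):
    # Stand-in for the attribute classifier: a linear score of the state.
    return float(h @ FORMAL_DIR)

def optimize_representation(h, steps=10, lr=0.5):
    # Step 1: gradient ascent on the representation toward the attribute.
    # For a linear score, the gradient is simply FORMAL_DIR.
    for _ in range(steps):
        h = h + lr * FORMAL_DIR
    return h

def propose_token(h):
    # Step 2: fill the masked span with the token whose embedding best
    # matches the optimized representation (stand-in for MLM infilling).
    return max(VOCAB, key=lambda w: float(EMB[w] @ h))

def revise(tokens, position, iterations=3):
    # Iterative in-place editing: re-run the two steps on the same span.
    for _ in range(iterations):
        h = EMB[tokens[position]]          # representation of the span
        h = optimize_representation(h)     # push it toward the attribute
        tokens[position] = propose_token(h)
    return tokens

print(revise(["hey", "there"], position=0))
```

Each iteration nudges the span's representation toward the attribute function before re-decoding it, so repeated passes can move the text progressively closer to the target attribute without any parallel supervision.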


