EAT2seq: A generic framework for controlled sentence transformation without task-specific training

02/25/2019
by Tommi Gröndahl, et al.

We present EAT2seq: a novel method for constructing automatic linguistic transformations for a number of tasks, including controlled grammatical or lexical changes, style transfer, text generation, and machine translation. Our approach consists of creating an abstract representation of a sentence's meaning and grammar, which we use as input to an encoder-decoder network trained to reproduce the original sentence. Manipulating the abstract representation allows sentences to be transformed according to user-provided parameters, both grammatically and lexically, in any combination. The same architecture can further be used for controlled text generation, and shows additional promise for machine translation. This strategy enables many tasks that were hitherto outside the scope of NLP techniques for want of sufficient training data. We provide empirical evidence for the effectiveness of our approach by reproducing and transforming English sentences and evaluating the results both manually and automatically. A single model trained on monolingual data is used for all tasks without any task-specific training. For a model trained on 8.5 million sentences, we report a BLEU score of 74.45 for reproduction, and scores between 55.29 and 81.82 for back-and-forth grammatical transformations across 14 category pairs.
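The pipeline described in the abstract — encode a sentence into an abstract meaning/grammar representation, edit that representation according to user-provided parameters, then decode it back to a surface sentence — can be sketched in miniature. The following is an illustrative toy, not the paper's EAT implementation: the hand-written `encode` and `decode` functions stand in for the parser and the trained encoder-decoder network, and the voice-flip transformation and naive verb morphology are invented for demonstration.

```python
# Toy sketch of the transform-via-abstract-representation idea:
# sentence -> abstract representation -> (edit) -> regenerated sentence.
# In EAT2seq the decode step is a neural encoder-decoder trained only to
# reproduce sentences; here a hypothetical hand-written realizer stands in.

def encode(sentence):
    """Toy parse of a 'subject verb object.' sentence into an
    abstract predicate-argument representation."""
    subj, verb, obj = sentence.rstrip(".").split()
    return {"pred": verb, "agent": subj, "patient": obj, "voice": "active"}

def transform(rep, **changes):
    """Apply user-provided parameters by editing the representation,
    leaving the rest of the meaning intact."""
    new = dict(rep)
    new.update(changes)
    return new

def decode(rep):
    """Stand-in for the trained decoder: realize the representation
    as a surface sentence (with naive toy morphology)."""
    if rep["voice"] == "active":
        return f'{rep["agent"]} {rep["pred"]} {rep["patient"]}.'
    return f'{rep["patient"]} are {rep["pred"]}d by {rep["agent"]}.'

rep = encode("cats chase mice.")
print(decode(rep))                              # reproduction of the input
print(decode(transform(rep, voice="passive")))  # controlled voice transformation
```

Because the decoder only ever learns to reproduce sentences from their representations, any transformation expressible as an edit to the representation (tense, voice, number, lexical substitution) comes for free, without task-specific parallel training data.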

