Large-scale Hierarchical Alignment for Author Style Transfer

10/18/2018
by Nikola I. Nikolov, et al.

We propose a simple method for extracting pseudo-parallel monolingual sentence pairs from comparable corpora representative of two different author styles, such as scientific papers and Wikipedia articles. Our approach is hierarchical: we first search for the nearest document neighbours and then for the nearest sentence neighbours within those documents. We demonstrate the effectiveness of our method through automatic and extrinsic evaluation on two tasks: text simplification from Wikipedia to Simple Wikipedia and style transfer from scientific journal articles to press releases. We show that pseudo-parallel sentences extracted with our method not only improve existing parallel data, but can even lead to competitive performance on their own.
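
The two-level search described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the bag-of-words cosine similarity, the function names (`embed`, `cosine`, `nearest`, `align`) and both threshold values are assumptions chosen only to keep the example self-contained, and the abstract does not say which representations or thresholds the paper uses. Each document is assumed to be a non-empty list of sentence strings.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: lowercase bag-of-words counts (an assumption, not the paper's choice)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def nearest(query, candidates):
    """Return (index, similarity) of the candidate most similar to the query."""
    query_vec = embed(query)
    sims = [cosine(query_vec, embed(c)) for c in candidates]
    best = max(range(len(candidates)), key=sims.__getitem__)
    return best, sims[best]


def align(source_docs, target_docs, doc_threshold=0.2, sent_threshold=0.5):
    """Hierarchical alignment: match documents first, then sentences inside matched documents.

    Both thresholds are illustrative values, not taken from the paper.
    """
    target_texts = [" ".join(doc) for doc in target_docs]
    pairs = []
    for src_doc in source_docs:
        # Level 1: nearest target document for this source document.
        doc_idx, doc_sim = nearest(" ".join(src_doc), target_texts)
        if doc_sim < doc_threshold:
            continue  # no sufficiently similar target document
        tgt_doc = target_docs[doc_idx]
        # Level 2: nearest target sentence, searched only inside the matched document.
        for src_sent in src_doc:
            sent_idx, sent_sim = nearest(src_sent, tgt_doc)
            if sent_sim >= sent_threshold:
                pairs.append((src_sent, tgt_doc[sent_idx], sent_sim))
    return pairs
```

Restricting the sentence search to the matched document is what makes the procedure hierarchical: sentence comparisons are made only within document pairs that already look related, rather than across the whole target corpus.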

research
05/26/2017

Style Transfer from Non-Parallel Text by Cross-Alignment

This paper focuses on style transfer on the basis of non-parallel text. ...
research
09/25/2019

Semi-supervised Text Style Transfer: Cross Projection in Latent Space

Text style transfer task requires the model to transfer a sentence of on...
research
09/19/2018

Monolingual sentence matching for text simplification

This work improves monolingual sentence alignment for text simplificatio...
research
03/16/2022

Multilingual Pre-training with Language and Task Adaptation for Multilingual Text Style Transfer

We exploit the pre-trained seq2seq model mBART for multilingual text sty...
research
05/10/2023

WikiSQE: A Large-Scale Dataset for Sentence Quality Estimation in Wikipedia

Wikipedia can be edited by anyone and thus contains various quality sent...
research
08/16/2019

How Sequence-to-Sequence Models Perceive Language Styles?

Style is ubiquitous in our daily language uses, while what is language s...
research
08/31/2019

(Male, Bachelor) and (Female, Ph.D) have different connotations: Parallelly Annotated Stylistic Language Dataset with Multiple Personas

Stylistic variation in text needs to be studied with different aspects i...
