Unsupervised Sentence Compression using Denoising Auto-Encoders

09/07/2018
by   Thibault Févry, et al.
0

In sentence compression, the task of shortening sentences while retaining the original meaning, models tend to be trained on large corpora containing pairs of verbose and compressed sentences. To remove the need for paired corpora, we emulate a summarization task and add noise to extend sentences and train a denoising auto-encoder to recover the original, constructing an end-to-end training regime without the need for any examples of compressed sentences. We conduct a human evaluation of our model on a standard text summarization dataset and show that it performs comparably to a supervised baseline based on grammatical correctness and retention of meaning. Despite being exposed to no target data, our unsupervised models learn to generate imperfect but reasonably readable sentence summaries. Although we underperform supervised models based on ROUGE scores, our models are competitive with a supervised baseline based on human evaluation for grammatical correctness and retention of meaning.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
10/02/2019

SummAE: Zero-Shot Abstractive Text Summarization using Length-Agnostic Auto-Encoders

We propose an end-to-end neural model for zero-shot abstractive text sum...
research
10/05/2018

Learning to Encode Text as Human-Readable Summaries using Generative Adversarial Networks

Auto-encoders compress input data into a latent-space representation and...
research
04/21/2018

Unsupervised Natural Language Generation with Denoising Autoencoders

Generating text from structured data is important for various tasks such...
research
10/23/2017

Testing the limits of unsupervised learning for semantic similarity

Semantic Similarity between two sentences can be defined as a way to det...
research
02/16/2021

Orthogonal Features-based EEG Signal Denoising using Fractionally Compressed AutoEncoder

A fractional-based compressed auto-encoder architecture has been introdu...
research
01/03/2020

TED: A Pretrained Unsupervised Summarization Model with Theme Modeling and Denoising

Text summarization aims to extract essential information from a piece of...
research
02/09/2021

Decontextualization: Making Sentences Stand-Alone

Models for question answering, dialogue agents, and summarization often ...

Please sign up or login with your details

Forgot password? Click here to reset