Discourse-Aware Text Simplification: From Complex Sentences to Linked Propositions

08/01/2023
by Christina Niklaus, et al.

Sentences that present a complex syntax act as a major stumbling block for downstream Natural Language Processing applications whose predictive quality deteriorates with sentence length and complexity. The task of Text Simplification (TS) may remedy this situation. It aims to modify sentences in order to make them easier to process, using a set of rewriting operations, such as reordering, deletion, or splitting. State-of-the-art syntactic TS approaches suffer from two major drawbacks: first, they follow a very conservative approach in that they tend to retain the input rather than transforming it, and second, they ignore the cohesive nature of texts, where context spread across clauses or sentences is needed to infer the true meaning of a statement. To address these problems, we present a discourse-aware TS approach that splits and rephrases complex English sentences within the semantic context in which they occur. Based on a linguistically grounded transformation stage that uses clausal and phrasal disembedding mechanisms, complex sentences are transformed into shorter utterances with a simple canonical structure that can be easily analyzed by downstream applications. With sentence splitting, we thus address a TS task that has hardly been explored so far. Moreover, we introduce the notion of minimality in this context, as we aim to decompose source sentences into a set of self-contained minimal semantic units. To avoid breaking down the input into a disjointed sequence of statements that is difficult to interpret because important contextual information is missing, we incorporate the semantic context between the split propositions in the form of hierarchical structures and semantic relationships. In that way, we generate a semantic hierarchy of minimal propositions that leads to a novel representation of complex assertions that puts a semantic layer on top of the simplified sentences.
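
To make the idea of a "semantic hierarchy of minimal propositions" more concrete, the following is a minimal sketch of how such an output could be represented as a data structure. It does not perform any splitting and is not the authors' implementation: the `Proposition` class, the `render` helper, the relation labels (`Contrast`, `Elaboration`), and the example sentence are all assumptions introduced here for illustration, and the decomposition is written by hand.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Proposition:
    """One minimal, self-contained sentence (hypothetical representation)."""
    text: str
    # (relation label, subordinate proposition) pairs; labels are illustrative.
    links: List[Tuple[str, "Proposition"]] = field(default_factory=list)

    def link(self, relation: str, other: "Proposition") -> "Proposition":
        """Attach a contextual proposition under this one and return self."""
        self.links.append((relation, other))
        return self


def render(prop: Proposition, depth: int = 0, relation: str = "") -> List[str]:
    """Flatten the hierarchy into indented lines, core proposition first."""
    prefix = "  " * depth + (f"[{relation}] " if relation else "")
    lines = [prefix + prop.text]
    for rel, child in prop.links:
        lines.extend(render(child, depth + 1, rel))
    return lines


# Hand-written decomposition of an invented input sentence:
# "Although the festival was rained out, the organizers, who had sold
#  5,000 tickets, refused refunds."
core = Proposition("The organizers refused refunds.")
core.link("Contrast", Proposition("The festival was rained out."))
core.link("Elaboration", Proposition("The organizers had sold 5,000 tickets."))

hierarchy_lines = render(core)
print("\n".join(hierarchy_lines))
```

The core statement sits at the top level, and each contextual proposition hangs beneath it with a label naming how it relates to the core. A downstream application could read only the top-level sentences, or follow the links when it needs the surrounding context.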


research
05/24/2021

Context-Preserving Text Simplification

We present a context-preserving text simplification (TS) approach that r...
research
09/26/2019

MinWikiSplit: A Sentence Splitting Corpus with Minimal Propositions

We compiled a new sentence splitting corpus that is composed of 203K pai...
research
09/26/2019

DisSim: A Discourse-Aware Syntactic Text Simplification Framework for English and German

We introduce DisSim, a discourse-aware sentence splitting framework for ...
research
03/26/2016

Do You See What I Mean? Visual Resolution of Linguistic Ambiguities

Understanding language goes hand in hand with the ability to integrate c...
research
06/03/2019

Transforming Complex Sentences into a Semantic Hierarchy

We present an approach for recursively splitting and rephrasing complex ...
research
01/16/2020

Fact-aware Sentence Split and Rephrase with Permutation Invariant Training

Sentence Split and Rephrase aims to break down a complex sentence into s...
research
07/30/2018

Graphene: Semantically-Linked Propositions in Open Information Extraction

We present an Open Information Extraction (IE) approach that uses a two-...
