RST Parsing from Scratch

by Thanh-Tung Nguyen, et al.

We introduce a novel top-down end-to-end formulation of document-level discourse parsing in the Rhetorical Structure Theory (RST) framework. In this formulation, we consider discourse parsing as a sequence of splitting decisions at token boundaries and use a seq2seq network to model the splitting decisions. Our framework facilitates discourse parsing from scratch without requiring discourse segmentation as a prerequisite; rather, it yields segmentation as part of the parsing process. Our unified parsing model adopts a beam search to decode the best tree structure by searching through a space of high-scoring trees. With extensive experiments on the standard English RST discourse treebank, we demonstrate that our parser outperforms existing methods by a good margin in both end-to-end parsing and parsing with gold segmentation. More importantly, it does so without using any handcrafted features, making it faster and easily adaptable to new languages and domains.
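
The idea of parsing as a sequence of splitting decisions at token boundaries can be illustrated with a minimal sketch. This is not the authors' implementation: it decodes greedily rather than with beam search, it ignores nuclearity and relation labels, and `toy_score`, `parse_top_down`, and `edus` are hypothetical names introduced only for illustration. The sketch assumes a scorer that rates every boundary `k` inside a token span `[i, j)`, where choosing `k == j` means "do not split", so the span becomes an elementary discourse unit (EDU). Segmentation then falls out of parsing as the leaves of the tree.

```python
from typing import Callable, List, Tuple, Union

Span = Tuple[int, int]
Tree = Union[Span, Tuple["Tree", "Tree"]]


def parse_top_down(i: int, j: int, score: Callable[[int, int, int], float]) -> Tree:
    """Greedily split the token span [i, j) from the top down.

    score(i, k, j) rates boundary k for the span, with i < k <= j.
    Picking k == j means "no split": the span is kept as a leaf (an EDU).
    """
    best_k = max(range(i + 1, j + 1), key=lambda k: score(i, k, j))
    if best_k == j:
        return (i, j)
    return (parse_top_down(i, best_k, score), parse_top_down(best_k, j, score))


def edus(tree: Tree) -> List[Span]:
    """Collect the leaf spans, i.e. the segmentation implied by the tree."""
    if isinstance(tree[0], int):
        return [tree]
    left, right = tree
    return edus(left) + edus(right)


# Hypothetical stand-in for the seq2seq model: hand-set boundary strengths.
STRENGTH = {4: 2.0, 7: 1.0}


def toy_score(i: int, k: int, j: int) -> float:
    if k == j:
        return 0.0  # score of declaring the span an EDU
    return STRENGTH.get(k, -1.0)  # unlisted boundaries score below "no split"


tree = parse_top_down(0, 10, toy_score)
print(tree)        # ((0, 4), ((4, 7), (7, 10)))
print(edus(tree))  # [(0, 4), (4, 7), (7, 10)]
```

In a real parser the scorer would be a trained network, and beam search would keep several partial trees alive instead of committing to the single best split at each step.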



A Conditional Splitting Framework for Efficient Constituency Parsing

We introduce a generic seq2seq parsing framework that casts constituency...

Joint Syntacto-Discourse Parsing and the Syntacto-Discourse Treebank

Discourse parsing has long been treated as a stand-alone problem indepen...

DMRST: A Joint Framework for Document-Level Multilingual RST Discourse Segmentation and Parsing

Text discourse parsing weighs importantly in understanding information f...

Top-down Discourse Parsing via Sequence Labelling

We introduce a top-down approach to discourse parsing that is conceptual...

Fast Rhetorical Structure Theory Discourse Parsing

In recent years, there has been a variety of research on discourse parsi...

Text2Math: End-to-end Parsing Text into Math Expressions

We propose Text2Math, a model for semantically parsing text into math ex...

Cross-lingual and cross-domain discourse segmentation of entire documents

Discourse segmentation is a crucial step in building end-to-end discours...