Text Segmentation by Cross Segment Attention

04/30/2020
by Michal Lukasik, et al.

Document and discourse segmentation are two fundamental NLP tasks that break text into constituents, which are commonly used to support downstream tasks such as information retrieval or text summarization. In this work, we propose three transformer-based architectures and provide comprehensive comparisons with previously proposed approaches on three standard datasets. We establish a new state of the art, reducing error rates by a large margin in all cases. We further analyze model sizes and find that we can build models with far fewer parameters while maintaining good performance, thus facilitating real-world applications.

research
01/20/2023

Document Summarization with Text Segmentation

In this paper, we exploit the innate document segment structure for impr...
research
05/10/2020

From Standard Summarization to New Tasks and Beyond: Summarization with Manifold Information

Text summarization is the research area aiming at creating a short and c...
research
02/11/2021

Text Compression-aided Transformer Encoding

Text encoding is one of the most important steps in Natural Language Pro...
research
11/27/2022

Topic Segmentation in the Wild: Towards Segmentation of Semi-structured & Unstructured Chats

Breaking down a document or a conversation into multiple contiguous segm...
research
01/03/2020

Two-Level Transformer and Auxiliary Coherence Modeling for Improved Text Segmentation

Breaking down the structure of long texts into semantically coherent seg...
research
11/13/2018

Discourse in Multimedia: A Case Study in Information Extraction

To ensure readability, text is often written and presented with due form...
research
10/11/2021

Focus on what matters: Applying Discourse Coherence Theory to Cross Document Coreference

Performing event and entity coreference resolution across documents vast...
