Text Segmentation as a Supervised Learning Task

03/25/2018
by Omri Koshorek, et al.

Text segmentation, the task of dividing a document into contiguous segments based on its semantic structure, is a longstanding challenge in language understanding. Previous work on text segmentation has focused on unsupervised methods such as clustering or graph search, owing to the paucity of labeled data. In this work, we formulate text segmentation as a supervised learning problem and present a large new dataset for text segmentation that is automatically extracted and labeled from Wikipedia. Moreover, we develop a segmentation model based on this dataset and show that it generalizes well to unseen natural text.
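
One way to picture the supervised formulation is as per-sentence binary labeling. The sketch below is illustrative only and is not taken from the paper's code: it assumes a document already split into segments of sentences, and it assumes a labeling scheme in which 1 marks a sentence that ends a segment. The function names segments_to_examples and labels_to_segments are hypothetical.

```python
from typing import List, Tuple


def segments_to_examples(segments: List[List[str]]) -> Tuple[List[str], List[int]]:
    """Flatten a segmented document into sentences and per-sentence labels.

    Assumed labeling scheme (not confirmed by the abstract): label 1 marks a
    sentence that ends a segment, 0 otherwise. The last sentence of the
    document is labeled 0 because its boundary is trivial.
    """
    sentences: List[str] = []
    labels: List[int] = []
    for segment in segments:
        for i, sentence in enumerate(segment):
            sentences.append(sentence)
            labels.append(1 if i == len(segment) - 1 else 0)
    if labels:
        labels[-1] = 0  # the end of the document is not a learned boundary
    return sentences, labels


def labels_to_segments(sentences: List[str], labels: List[int]) -> List[List[str]]:
    """Rebuild segments from sentences and (predicted or gold) boundary labels."""
    segments: List[List[str]] = []
    current: List[str] = []
    for sentence, label in zip(sentences, labels):
        current.append(sentence)
        if label == 1:
            segments.append(current)
            current = []
    if current:  # trailing sentences form the final segment
        segments.append(current)
    return segments


# Toy document: three segments, e.g. three sections of an article.
doc = [["a", "b"], ["c"], ["d", "e"]]
sentences, labels = segments_to_examples(doc)
```

Under this scheme, any sequence classifier that emits one 0/1 decision per sentence can be trained on the (sentences, labels) pairs, and labels_to_segments turns its predictions back into a segmentation.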

