
Neural Sequence Segmentation as Determining the Leftmost Segments

by Yangming Li, et al.

Prior methods for text segmentation mostly operate at the token level. Although adequate, this design limits their ability to capture long-term dependencies among segments. In this work, we propose a novel framework that incrementally segments natural language sentences at the segment level. At every step of segmentation, it recognizes the leftmost segment of the remaining sequence. The implementation uses the LSTM-minus technique to construct phrase representations and recurrent neural networks (RNNs) to model the iterations of determining the leftmost segments. We have conducted extensive experiments on syntactic chunking and Chinese part-of-speech (POS) tagging across 3 datasets, demonstrating that our methods significantly outperform all previous baselines and achieve new state-of-the-art results. Moreover, qualitative analysis and a study on segmenting long sentences verify the framework's effectiveness in modeling long-term dependencies.
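The two ideas named in the abstract, LSTM-minus span representations and the loop that repeatedly determines the leftmost segment, can be illustrated with a minimal Python sketch. This is not the authors' implementation: `span_minus`, `segment_leftmost`, and the `pick_leftmost` callback are hypothetical names, the span feature uses only a forward direction, and a hand-written lookup table stands in for the trained model that would score candidate segments.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
Segment = Tuple[int, int, str]  # (start index, inclusive end index, label)


def span_minus(hidden: Sequence[Vector], i: int, j: int) -> Vector:
    """LSTM-minus style feature for tokens i..j: h[j] - h[i-1].

    Assumption: `hidden` holds forward hidden states, one per token,
    and the state before the first token is treated as all zeros.
    """
    prev = hidden[i - 1] if i > 0 else [0.0] * len(hidden[j])
    return [a - b for a, b in zip(hidden[j], prev)]


def segment_leftmost(
    n: int, pick_leftmost: Callable[[int], Tuple[int, str]]
) -> List[Segment]:
    """Segment a sequence of n tokens by repeatedly taking the leftmost segment.

    `pick_leftmost(start)` is a hypothetical stand-in for the model: given
    where the unsegmented remainder begins, it returns the inclusive end
    index and the label of the leftmost segment.
    """
    segments: List[Segment] = []
    start = 0
    while start < n:
        end, label = pick_leftmost(start)
        if not start <= end < n:
            raise ValueError(f"invalid segment end {end} for start {start}")
        segments.append((start, end, label))
        start = end + 1  # the remainder now begins right after this segment
    return segments


# Toy usage: a lookup table plays the role of the trained scorer.
tokens = ["The", "cat", "sat", "on", "the", "mat"]
toy_choices = {0: (1, "NP"), 2: (2, "VP"), 3: (3, "PP"), 4: (5, "NP")}
result = segment_leftmost(len(tokens), lambda start: toy_choices[start])

# Toy hidden states to show the span feature.
toy_hidden = [[1.0, 2.0], [3.0, 5.0], [6.0, 9.0]]
feature = span_minus(toy_hidden, 1, 2)
```

Because each step consumes at least one token, the loop finishes in at most `n` steps. In the framework described above, the choice at each step would also depend on the segments already produced, which the abstract says is modeled with an RNN; the sketch leaves that state out for brevity.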




Segmenting Natural Language Sentences via Lexical Unit Analysis

In this work, we present Lexical Unit Analysis (LUA), a framework for ge...

Segmental Recurrent Neural Networks

We introduce segmental recurrent neural networks (SRNNs) which define, g...

Learning Longer-term Dependencies via Grouped Distributor Unit

Learning long-term dependencies still remains difficult for recurrent ne...

Topical Segmentation of Spoken Narratives: A Test Case on Holocaust Survivor Testimonies

The task of topical segmentation is well studied, but previous work has ...

Sequence Modeling via Segmentations

Segmental structure is a common pattern in many types of sequences such ...

Twin Networks: Using the Future as a Regularizer

Being able to model long-term dependencies in sequential data, such as t...

MrGCN: Mirror Graph Convolution Network for Relation Extraction with Long-Term Dependencies

The ability to capture complex linguistic structures and long-term depen...

Code Repositories


Open-source code for our NAACL-2021 paper: "Neural Sequence Segmentation as Determining the Leftmost Segments".
