Unsupervised Neural Word Segmentation for Chinese via Segmental Language Modeling

by Zhiqing Sun, et al.
Peking University

Traditional approaches to unsupervised Chinese word segmentation (CWS) can be roughly classified into discriminative and generative models. The former use carefully designed goodness measures to score candidate segmentations, while the latter search for the segmentation with the highest generative probability. However, while discriminative models can be extended to neural versions trivially by using neural language models, extending generative models is non-trivial. In this paper, we propose segmental language models (SLMs) for CWS. Our approach explicitly models the segmental nature of Chinese while preserving several properties of language models. In SLMs, a context encoder encodes the previous context and a segment decoder generates each segment incrementally. As far as we know, we are the first to propose a neural model for unsupervised CWS, and we achieve performance competitive with state-of-the-art statistical models on four different datasets from the SIGHAN 2005 bakeoff.
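The generative view described in the abstract, choosing the segmentation with the highest probability, can be illustrated with a small dynamic program. The sketch below is not the authors' implementation: `best_segmentation`, `toy_segment_logprob`, `TOY_LEXICON`, and the `max_len` limit are hypothetical names and values introduced for illustration, and the toy lexicon scorer stands in for the neural context encoder and segment decoder. It assumes the score of a segment depends only on the preceding characters, not on how they were segmented.

```python
import math
from typing import Callable, List, Tuple

# Hypothetical toy probabilities; a real SLM would compute these with
# a context encoder and a segment decoder instead of a lookup table.
TOY_LEXICON = {"北京": 0.1, "大学": 0.1, "北京大学": 0.05, "大学生": 0.08, "生": 0.02}


def toy_segment_logprob(context: str, segment: str) -> float:
    """Log-probability of `segment` given the preceding characters.

    This stand-in ignores `context`; the argument is kept so that a
    context-aware scorer could be dropped in with the same signature.
    """
    if segment in TOY_LEXICON:
        return math.log(TOY_LEXICON[segment])
    if len(segment) == 1:
        return math.log(1e-4)  # small fallback for unknown single characters
    return -math.inf  # unknown multi-character segments are disallowed


def best_segmentation(
    sentence: str,
    segment_logprob: Callable[[str, str], float],
    max_len: int = 4,
) -> Tuple[List[str], float]:
    """Return the highest-scoring segmentation and its total log-probability."""
    n = len(sentence)
    best = [-math.inf] * (n + 1)  # best[i]: best score for sentence[:i]
    back = [0] * (n + 1)          # back[i]: start index of the last segment
    best[0] = 0.0
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            score = best[start] + segment_logprob(sentence[:start], sentence[start:end])
            if score > best[end]:
                best[end] = score
                back[end] = start
    # Follow the back-pointers from the end of the sentence to recover segments.
    segments: List[str] = []
    end = n
    while end > 0:
        start = back[end]
        segments.append(sentence[start:end])
        end = start
    segments.reverse()
    return segments, best[n]


if __name__ == "__main__":
    segments, score = best_segmentation("北京大学生", toy_segment_logprob)
    print(segments, score)
```

Capping segment length with `max_len` keeps the search at roughly `len(sentence) * max_len` scorer calls; the value 4 here is an arbitrary choice for the sketch.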




Unsupervised Word Segmentation with Bi-directional Neural Language Model

We present an unsupervised word segmentation model, in which the learnin...

Fast and Accurate Neural Word Segmentation for Chinese

Neural models with minimal feature engineering have achieved competitive...

Universal Word Segmentation: Implementation and Interpretation

Word segmentation is a low-level NLP task that is non-trivial for a cons...

Fast Neural Chinese Word Segmentation for Long Sentences

Rapidly developed neural models have achieved competitive performance in...

Unsupervised Recurrent Neural Network Grammars

Recurrent neural network grammars (RNNG) are generative models of langua...

A Generative Parser with a Discriminative Recognition Algorithm

Generative models defining joint distributions over parse trees and sent...

Neural or Statistical: An Empirical Study on Language Models for Chinese Input Recommendation on Mobile

Chinese input recommendation plays an important role in alleviating huma...
