Effective Neural Solution for Multi-Criteria Word Segmentation

12/07/2017
by   He Han, et al.
0

We present a simple yet elegant solution to train a single joint model on multi-criteria corpora for Chinese Word Segmentation (CWS). Our novel design requires no private layers in model architecture, instead, introduces two artificial tokens at the beginning and ending of input sentence to specify the required target criteria. The rest of the model including Long Short-Term Memory (LSTM) layer and Conditional Random Fields (CRFs) layer remains unchanged and is shared across all datasets, keeping the size of parameter collection minimal and constant. On Bakeoff 2005 and Bakeoff 2008 datasets, our innovative design has surpassed both single-criterion and multi-criteria state-of-the-art learning results. To the best knowledge, our design is the first one that has achieved the latest high performance on such large scale datasets. Source codes and corpora of this paper are available on GitHub.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
12/19/2018

Switch-LSTMs for Multi-Criteria Chinese Word Segmentation

Multi-criteria Chinese word segmentation is a promising but challenging ...
research
04/25/2017

Adversarial Multi-Criteria Learning for Chinese Word Segmentation

Different linguistic perspectives causes many diverse segmentation crite...
research
06/28/2019

Multi-Criteria Chinese Word Segmentation with Transformer

Different linguistic perspectives cause many diverse segmentation criter...
research
09/20/2017

De-identification of medical records using conditional random fields and long short-term memory networks

The CEGS N-GRID 2016 Shared Task 1 in Clinical Natural Language Processi...
research
03/11/2019

Toward Fast and Accurate Neural Chinese Word Segmentation with Multi-Criteria Learning

The ambiguous annotation criteria bring into the divergence of Chinese W...
research
05/21/2019

A Seq-to-Seq Transformer Premised Temporal Convolutional Network for Chinese Word Segmentation

The prevalent approaches of Chinese word segmentation task almost rely o...

Please sign up or login with your details

Forgot password? Click here to reset