
A Gap-Based Framework for Chinese Word Segmentation via Very Deep Convolutional Networks

by Zhiqing Sun, et al.
Peking University

Most previous approaches to Chinese word segmentation can be roughly classified into character-based and word-based methods. The former treat the task as a sequence-labeling problem, while the latter directly segment the character sequence into words. However, when segmenting a given sentence, the most intuitive idea is to predict, for each gap between two consecutive characters, whether to segment there, which makes previous approaches seem overly complex by comparison. Therefore, in this paper, we propose a gap-based framework to implement this intuitive idea. Moreover, very deep convolutional neural networks, namely ResNets and DenseNets, are exploited in our experiments. Results show that our approach outperforms the best character-based and word-based methods on five benchmarks, without any post-processing module (e.g., Conditional Random Fields) or beam search.
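
To make the gap-based formulation concrete, the following minimal sketch shows how a segmentation maps to one binary label per gap between consecutive characters, and how such labels map back to words. It is an illustration only, not the authors' implementation: the function names `words_to_gaps` and `gaps_to_words` are hypothetical, the 0/1 label encoding is an assumption, and the ResNet/DenseNet classifier that would predict the labels is not shown.

```python
from typing import List


def words_to_gaps(words: List[str]) -> List[int]:
    """Convert a segmented sentence into gap labels (assumed encoding: 1 = boundary).

    A sentence of n characters has n - 1 gaps. Words are assumed non-empty.
    """
    gaps: List[int] = []
    for i, word in enumerate(words):
        # Gaps inside a word are not boundaries.
        gaps.extend([0] * (len(word) - 1))
        # The gap between this word and the next one is a boundary.
        if i < len(words) - 1:
            gaps.append(1)
    return gaps


def gaps_to_words(sentence: str, gaps: List[int]) -> List[str]:
    """Rebuild the words of a sentence from its per-gap decisions."""
    if not sentence:
        return []
    if len(gaps) != len(sentence) - 1:
        raise ValueError("expected exactly one label per gap between characters")
    words: List[str] = []
    start = 0
    for i, label in enumerate(gaps):
        if label == 1:
            # Gap i sits between character i and character i + 1.
            words.append(sentence[start:i + 1])
            start = i + 1
    words.append(sentence[start:])
    return words


example_words = ["我们", "喜欢", "自然语言"]
example_gaps = words_to_gaps(example_words)  # [0, 1, 0, 1, 0, 0, 0]
restored = gaps_to_words("".join(example_words), example_gaps)
```

Under this encoding, each gap decision is an independent binary prediction, so decoding is a single left-to-right pass and needs neither a CRF layer nor beam search.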


Chinese NER Using Lattice LSTM

We investigate a lattice-structured LSTM model for Chinese NER, which en...

Multiple Character Embeddings for Chinese Word Segmentation

Chinese word segmentation (CWS) is often regarded as a character-based s...

Neural Word Segmentation Learning for Chinese

Most previous approaches to Chinese word segmentation formalize this pro...

A Seq-to-Seq Transformer Premised Temporal Convolutional Network for Chinese Word Segmentation

The prevalent approaches of Chinese word segmentation task almost rely o...

Classical Chinese Sentence Segmentation for Tomb Biographies of Tang Dynasty

Tomb biographies of the Tang dynasty provide invaluable information abou...

Hyperbolic Deep Learning for Chinese Natural Language Understanding

Recently hyperbolic geometry has proven to be effective in building embe...

Onto Word Segmentation of the Complete Tang Poems

We aim at segmenting words in the Complete Tang Poems (CTP). Although it...