Chinese Word Segmentation with Heterogeneous Graph Neural Network

01/22/2022
by Xuemei Tang, et al.

In recent years, deep learning has achieved significant success in the Chinese word segmentation (CWS) task. Most of these methods improve CWS performance by leveraging external information such as words, sub-words, and syntax. However, existing approaches fail to integrate this multi-level linguistic information effectively, and they also ignore the structural features of the external information. In this paper, we therefore propose HGNSeg, a framework for improving CWS. It exploits multi-level external information with a pre-trained language model and a heterogeneous graph neural network. Experimental results on six benchmark datasets (e.g., Bakeoff 2005, Bakeoff 2008) show that our approach effectively improves CWS performance. Importantly, in cross-domain scenarios, our method also shows a strong ability to alleviate the out-of-vocabulary (OOV) problem.
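To make the idea of a heterogeneous graph over multi-level information more concrete, the following is a minimal sketch, not the HGNSeg architecture itself. It assumes two node types (characters and lexicon words) and two edge types; the function name `build_char_word_graph`, the edge-type labels, the toy lexicon, and the `max_word_len` parameter are all hypothetical choices made for illustration.

```python
def build_char_word_graph(sentence, lexicon, max_word_len=4):
    """Build a toy heterogeneous graph with character nodes and word nodes.

    Assumed (illustrative) conventions, not taken from the paper:
      - node ids are ("char", i) for the i-th character and ("word", w)
        for a lexicon word w matched in the sentence;
      - "char-char" edges link adjacent characters;
      - "word-char" edges link a matched word to each character it covers.
    """
    edges = {"char-char": set(), "word-char": set()}
    n = len(sentence)

    # Adjacent characters are linked to preserve sequential order.
    for i in range(n - 1):
        edges["char-char"].add((("char", i), ("char", i + 1)))

    # Every lexicon word that occurs as a substring is linked to its characters.
    for start in range(n):
        for length in range(2, max_word_len + 1):
            end = start + length
            if end > n:
                break
            candidate = sentence[start:end]
            if candidate in lexicon:
                for i in range(start, end):
                    edges["word-char"].add((("word", candidate), ("char", i)))

    # Sorted lists make the result deterministic and easy to inspect.
    return {etype: sorted(pairs) for etype, pairs in edges.items()}


if __name__ == "__main__":
    toy_lexicon = {"研究", "研究生", "生命", "起源"}  # hypothetical lexicon
    graph = build_char_word_graph("研究生命起源", toy_lexicon)
    for etype, pairs in graph.items():
        print(etype, len(pairs))
```

In this toy sentence the character 生 is linked to both 研究生 and 生命, which is the kind of segmentation ambiguity that a graph-based model could, in principle, resolve by passing messages between word and character nodes. Sub-word or syntactic nodes could be added as further node types in the same way.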


