Neural Chinese Named Entity Recognition via CNN-LSTM-CRF and Joint Training with Word Segmentation

04/26/2019
by Fangzhao Wu, et al.

Chinese named entity recognition (CNER) is an important task in Chinese natural language processing. However, CNER is challenging because Chinese entity names are highly context-dependent. In addition, Chinese text lacks delimiters between words, which makes entity boundaries difficult to identify. Moreover, training data for CNER is usually insufficient in many domains, and annotating enough of it is expensive and time-consuming. In this paper, we propose a neural approach for CNER. First, we introduce a CNN-LSTM-CRF neural architecture to capture both local and long-distance contexts for CNER. Second, we propose a unified framework to jointly train CNER and word segmentation models in order to enhance the ability of the CNER model to identify entity boundaries. Third, we introduce an automatic method to generate pseudo-labeled samples from existing labeled data, which enriches the training data. Experiments on two benchmark datasets show that our approach effectively improves the performance of Chinese named entity recognition, especially when training data is insufficient.
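The abstract does not say how the pseudo-labeled samples are generated. As a minimal sketch of one plausible reading, the Python below assumes that an entity in a labeled sentence is swapped for another name of the same type, and that labels are character-level BIO tags. The function names, the `lexicon` structure, and the example sentence are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Tuple

# A sample is (characters, BIO tags), one tag per character.
# ASSUMPTION: character-level BIO tagging; the abstract does not name a scheme.
Sample = Tuple[List[str], List[str]]
Span = Tuple[int, int, str]  # (start, end_exclusive, entity_type)


def find_entities(tags: List[str]) -> List[Span]:
    """Return entity spans found in a BIO tag sequence."""
    spans: List[Span] = []
    start, etype = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing span
        inside = tag.startswith("I-") and etype == tag[2:]
        if start is not None and not inside:
            spans.append((start, i, etype))
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
    return spans


def replace_entity(sample: Sample, span: Span, new_name: str) -> Sample:
    """Swap one entity for new_name and rebuild the tags for that span."""
    chars, tags = sample
    start, end, etype = span
    new_chars = list(new_name)
    new_tags = ["B-" + etype] + ["I-" + etype] * (len(new_chars) - 1)
    return (chars[:start] + new_chars + chars[end:],
            tags[:start] + new_tags + tags[end:])


def generate_pseudo_samples(sample: Sample,
                            lexicon: Dict[str, List[str]]) -> List[Sample]:
    """Create one pseudo sample per (entity, same-type replacement) pair.

    ASSUMPTION: `lexicon` maps an entity type to names of that type,
    for example names collected from the existing labeled data.
    """
    chars, tags = sample
    out: List[Sample] = []
    for span in find_entities(tags):
        start, end, etype = span
        original = "".join(chars[start:end])
        for name in lexicon.get(etype, []):
            if name and name != original:
                out.append(replace_entity(sample, span, name))
    return out


if __name__ == "__main__":
    # Hypothetical labeled sentence: "Zhang San works in Beijing".
    sample = (list("张三在北京工作"),
              ["B-PER", "I-PER", "O", "B-LOC", "I-LOC", "O", "O"])
    lexicon = {"PER": ["张三", "李小明"], "LOC": ["北京", "上海"]}
    for chars, tags in generate_pseudo_samples(sample, lexicon):
        print("".join(chars), tags)
```

Because the replacement name can differ in length from the original, the tags are rebuilt for the swapped span rather than copied, which keeps the characters and labels aligned in every generated sample.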


