Joint Chinese Word Segmentation and Part-of-speech Tagging via Two-stage Span Labeling

12/17/2021
by Duc-Vu Nguyen, et al.

Chinese word segmentation and part-of-speech tagging are necessary tasks in computational linguistics and in applications of natural language processing. Many researchers still debate whether Chinese word segmentation and part-of-speech tagging are needed in the deep learning era. Nevertheless, resolving ambiguities and detecting unknown words remain challenging problems in this field. Previous studies on joint Chinese word segmentation and part-of-speech tagging mainly follow the character-based tagging model, focusing on modeling n-gram features. Unlike previous works, we propose a neural model named SpanSegTag for joint Chinese word segmentation and part-of-speech tagging that follows the span labeling approach, in which the main problem is estimating the probability of each n-gram being a word and carrying a given part-of-speech tag. We use the biaffine operation over the left and right boundary representations of consecutive characters to model the n-grams. Our experiments show that our BERT-based model SpanSegTag achieves competitive performance on the CTB5, CTB6, and UD benchmark datasets, and significant improvements on CTB7 and CTB9, compared with the current state-of-the-art method using BERT or ZEN encoders.
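
To make the biaffine step concrete, the following is a minimal NumPy sketch of scoring every n-gram from its left and right boundary representations. It is an illustration under stated assumptions, not the SpanSegTag implementation: the function name `biaffine_span_scores`, the `max_len` cap on n-gram length, the bias-by-appending-ones trick, and the random inputs standing in for encoder outputs are all hypothetical choices made for this example.

```python
import numpy as np


def biaffine_span_scores(left, right, U, max_len):
    """Score each span (i, j) as [left_i; 1]^T U_k [right_j; 1].

    left, right: (n + 1, d) boundary representations, one row per gap
                 between characters (assumed to come from an encoder).
    U:           (k, d + 1, d + 1) weights; k = 1 could serve a
                 word / non-word decision, k = number of tags a POS one.
    max_len:     longest n-gram considered (an assumption of this sketch).
    Returns a (k, n + 1, n + 1) array; cell [k, i, j] scores the n-gram
    covering characters i .. j-1, and invalid cells hold -inf.
    """
    ones = np.ones((left.shape[0], 1))
    l = np.concatenate([left, ones], axis=1)    # append 1 for bias terms
    r = np.concatenate([right, ones], axis=1)
    scores = np.einsum("id,kde,je->kij", l, U, r)

    i = np.arange(left.shape[0])[:, None]
    j = np.arange(left.shape[0])[None, :]
    valid = (j > i) & (j - i <= max_len)        # non-empty, bounded spans
    return np.where(valid, scores, -np.inf)


# Hypothetical usage: 4 characters, hidden size 8, 3 labels.
rng = np.random.default_rng(0)
left = rng.normal(size=(5, 8))
right = rng.normal(size=(5, 8))
U = rng.normal(size=(3, 9, 9))
scores = biaffine_span_scores(left, right, U, max_len=3)
print(scores.shape)
```

In a trained model, a sigmoid or softmax over the valid cells would turn these scores into the per-n-gram probabilities the abstract refers to; how the two stages are combined and decoded is left to the full text.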


research
10/01/2021

Span Labeling Approach for Vietnamese and Chinese Word Segmentation

In this paper, we propose a span labeling approach to model n-gram infor...
research
02/24/2021

Augmenting Part-of-speech Tagging with Syntactic Information for Vietnamese and Chinese

Word segmentation and part-of-speech tagging are two critical preliminar...
research
04/05/2017

Character-based Joint Segmentation and POS Tagging for Chinese using Bidirectional RNN-CRF

We present a character-based model for joint segmentation and POS taggin...
research
10/27/2022

Unsupervised Boundary-Aware Language Model Pretraining for Chinese Sequence Labeling

Boundary information is critical for various Chinese language processing...
research
11/16/2016

A Feature-Enriched Neural Model for Joint Chinese Word Segmentation and Part-of-Speech Tagging

Recently, neural network models for natural language processing tasks ha...
research
03/03/2023

Ancient Chinese Word Segmentation and Part-of-Speech Tagging Using Distant Supervision

Ancient Chinese word segmentation (WSG) and part-of-speech tagging (POS)...
research
11/03/2022

Joint Chinese Word Segmentation and Span-based Constituency Parsing

In constituency parsing, span-based decoding is an important direction. ...
