Unified Multi-Criteria Chinese Word Segmentation with BERT

04/13/2020
by Zhen Ke, et al.

Multi-Criteria Chinese Word Segmentation (MCCWS) aims to find word boundaries in a Chinese sentence, which is written as a continuous sequence of characters, when multiple segmentation criteria exist. The unified framework has been widely used in MCCWS and has proven effective. In addition, the pre-trained BERT language model has been introduced into the MCCWS task in a multi-task learning framework. In this paper, we combine the strengths of the unified framework and the pre-trained language model, and propose a unified MCCWS model based on BERT. Moreover, we augment the unified BERT-based MCCWS model with bigram features and an auxiliary criterion classification task. Experiments on eight datasets with diverse criteria demonstrate that our methods achieve new state-of-the-art results for MCCWS.
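
As a rough illustration of the ideas named in the abstract, the sketch below shows one way a single model could be told which segmentation criterion to follow, how word boundaries could be encoded as per-character labels, and how character bigrams could be extracted. The abstract does not specify any of these details: the bracketed criterion token, the criterion names `crit_a` and `crit_b`, the BMES tag scheme, the two example segmentations, and all function names are assumptions made for illustration only.

```python
from typing import List


def build_unified_input(chars: List[str], criterion: str) -> List[str]:
    """Prepend a (hypothetical) criterion token so one shared model
    can be asked to segment under different criteria."""
    return ["[CLS]", f"[{criterion}]"] + chars + ["[SEP]"]


def bmes_tags(words: List[str]) -> List[str]:
    """Turn a segmented sentence into per-character BMES labels
    (Begin, Middle, End, Single). Assumed labelling scheme."""
    tags: List[str] = []
    for word in words:
        if len(word) == 1:
            tags.append("S")
        else:
            tags.extend(["B"] + ["M"] * (len(word) - 2) + ["E"])
    return tags


def char_bigrams(chars: List[str], pad: str = "</s>") -> List[str]:
    """Pair each character with its right neighbour; the last
    character is paired with a padding symbol."""
    padded = chars + [pad]
    return [padded[i] + padded[i + 1] for i in range(len(chars))]


if __name__ == "__main__":
    sentence = "李娜进入半决赛"
    chars = list(sentence)

    # Two made-up criteria that disagree on the same sentence.
    segmentations = {
        "crit_a": ["李娜", "进入", "半决赛"],
        "crit_b": ["李", "娜", "进入", "半", "决赛"],
    }

    for criterion, words in segmentations.items():
        print(build_unified_input(chars, criterion))
        print(bmes_tags(words))

    print(char_bigrams(chars))
```

The same characters receive different label sequences under the two hypothetical criteria, which is the situation a multi-criteria model has to handle; the criterion token is the only part of the input that distinguishes the two cases.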


research
10/23/2020

Pre-trained Model for Chinese Word Segmentation with Meta Learning

Recent researches show that pre-trained models such as BERT (Devlin et a...
research
03/11/2019

Toward Fast and Accurate Neural Chinese Word Segmentation with Multi-Criteria Learning

The ambiguous annotation criteria bring into the divergence of Chinese W...
research
12/31/2020

Unified Mandarin TTS Front-end Based on Distilled BERT Model

The front-end module in a typical Mandarin text-to-speech system (TTS) i...
research
06/28/2019

Multi-Criteria Chinese Word Segmentation with Transformer

Different linguistic perspectives cause many diverse segmentation criter...
research
04/25/2017

Adversarial Multi-Criteria Learning for Chinese Word Segmentation

Different linguistic perspectives causes many diverse segmentation crite...
research
12/19/2018

Switch-LSTMs for Multi-Criteria Chinese Word Segmentation

Multi-criteria Chinese word segmentation is a promising but challenging ...
research
09/20/2019

BERT Meets Chinese Word Segmentation

Chinese word segmentation (CWS) is a fundamental task for Chinese langua...
