RethinkCWS: Is Chinese Word Segmentation a Solved Task?

11/13/2020
by   Jinlan Fu, et al.

The performance of Chinese Word Segmentation (CWS) systems has gradually reached a plateau with the rapid development of deep neural networks, especially the successful use of large pre-trained models. In this paper, we take stock of what we have achieved and rethink what is left in the CWS task. Methodologically, we propose a fine-grained evaluation for existing CWS systems, which not only allows us to diagnose the strengths and weaknesses of existing models (under the in-dataset setting), but also enables us to quantify the discrepancy between different criteria and alleviate the negative transfer problem in multi-criteria learning. Strategically, although we do not aim to propose a novel model in this paper, our comprehensive experiments on eight models and seven datasets, together with thorough analysis, point to some promising directions for future research. We make all code publicly available and release an interface that can quickly evaluate and diagnose users' models: https://github.com/neulab/InterpretEval.
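
To make the idea of a fine-grained evaluation concrete, the following is a minimal sketch that breaks word-level F1 down by one attribute of each word. It is not the InterpretEval interface. The function names (`spans`, `bucketed_f1`), the choice of word length as the attribute, and the cap that groups words of four or more characters into one bucket are all assumptions made for illustration.

```python
from collections import defaultdict


def spans(words):
    """Turn a segmented sentence (list of words) into a set of (start, end) character spans."""
    out, pos = set(), 0
    for w in words:
        out.add((pos, pos + len(w)))
        pos += len(w)
    return out


def bucketed_f1(gold_sents, pred_sents, bucket_of=lambda s, e: min(e - s, 4)):
    """Word-level F1 per bucket; by default the bucket is word length, capped at 4 (assumed)."""
    tp, n_gold, n_pred = defaultdict(int), defaultdict(int), defaultdict(int)
    for gold, pred in zip(gold_sents, pred_sents):
        g, p = spans(gold), spans(pred)
        for s, e in g:
            n_gold[bucket_of(s, e)] += 1
        for s, e in p:
            n_pred[bucket_of(s, e)] += 1
        for s, e in g & p:  # a word is correct only if both boundaries match
            tp[bucket_of(s, e)] += 1
    scores = {}
    for b in sorted(set(n_gold) | set(n_pred)):
        prec = tp[b] / n_pred[b] if n_pred[b] else 0.0
        rec = tp[b] / n_gold[b] if n_gold[b] else 0.0
        scores[b] = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    return scores


# Toy data, invented for this example: the prediction merges the last two gold words.
gold = [["我", "喜欢", "北京", "天安门"]]
pred = [["我", "喜欢", "北京天安门"]]
print(bucketed_f1(gold, pred))
```

On this toy pair, the one-character word is segmented correctly, while the merge hurts the two-character and three-character buckets. A single aggregate F1 would hide that kind of per-bucket difference.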


