Rethinking Masked Language Modeling for Chinese Spelling Correction

05/28/2023
by Hongqiu Wu et al.

In this paper, we study Chinese Spelling Correction (CSC) as a joint decision made by two separate models: a language model and an error model. Through empirical analysis, we find that fine-tuning BERT tends to over-fit the error model while under-fitting the language model, resulting in poor generalization to out-of-distribution error patterns. Since BERT is the backbone of most CSC models, this phenomenon has a significant negative impact. To address this issue, we release LEMON, a multi-domain benchmark with higher quality and diversity than existing benchmarks, to enable a comprehensive assessment of the open-domain generalization of CSC models. We then demonstrate that a very simple strategy, randomly masking 20% of the non-error tokens in the input sequence during fine-tuning, is sufficient to learn a much better language model without sacrificing the error model. This technique can be applied to any model architecture and achieves new state-of-the-art results on SIGHAN, ECSpell, and LEMON.
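The masking strategy described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' released code: the function name, character-level tokenization, and mask rate default are assumptions based on the abstract. The key point is that only non-error positions (where source and target agree) are candidates for masking, so every actual spelling error remains visible to the error model.

```python
import random

MASK = "[MASK]"

def mask_non_error_tokens(src_tokens, tgt_tokens, mask_rate=0.2, rng=None):
    """Randomly replace a fraction of the non-error source tokens with [MASK].

    Positions where source and target differ (the spelling errors) are
    left intact, so the error model still sees every misspelled character.
    Masking some correct tokens forces the model to reconstruct them from
    context, i.e. to keep learning a language model during fine-tuning.
    """
    rng = rng or random.Random()
    masked = []
    for s, t in zip(src_tokens, tgt_tokens):
        if s == t and rng.random() < mask_rate:
            masked.append(MASK)  # correct token, masked for LM training
        else:
            masked.append(s)     # error token (or unmasked correct token)
    return masked
```

In a fine-tuning loop, the masked sequence replaces the raw source as model input while the target (correct) sequence remains the label; with `mask_rate=0.0` this degenerates to standard CSC fine-tuning.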


Related research

- Chinese Spelling Correction as Rephrasing Language Model (08/17/2023)
- DPRK-BERT: The Supreme Language Model (12/01/2021)
- Exploring the Capacity of a Large-scale Masked Language Model to Recognize Grammatical Errors (08/27/2021)
- Spelling Error Correction with Soft-Masked BERT (05/15/2020)
- Does Prompt-Tuning Language Model Ensure Privacy? (04/07/2023)
- An Error-Guided Correction Model for Chinese Spelling Error Correction (01/16/2023)
- Diversity Measures: Domain-Independent Proxies for Failure in Language Model Queries (08/22/2023)
