Exploration and Exploitation: Two Ways to Improve Chinese Spelling Correction Models

05/31/2021
by Chong Li, et al.

Sequence-to-sequence learning with neural networks has empirically proven to be an effective framework for Chinese Spelling Correction (CSC), which takes a sentence with some spelling errors as input and outputs the corrected one. However, CSC models may fail to correct spelling errors covered by the confusion sets, and will also encounter unseen ones. We propose a method that continually identifies the weak spots of a model to generate more valuable training instances, and apply a task-specific pre-training strategy to enhance the model. The generated adversarial examples are gradually added to the training set. Experimental results show that such an adversarial training method combined with the pre-training strategy can improve both the generalization and robustness of multiple CSC models across three different datasets, achieving state-of-the-art performance on the CSC task.
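The idea of probing a model's weak spots to build new training pairs can be illustrated with a toy sketch. This is not the authors' implementation: the confusion set, the function names (`weakest_position`, `make_adversarial`, `augment`), and the assumption that the model exposes a per-character confidence score are all hypothetical, chosen only to show one plausible shape of the loop described in the abstract.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical toy confusion set: each character maps to look-alike or
# sound-alike characters. Real confusion sets are far larger.
CONFUSION_SET: Dict[str, List[str]] = {
    "天": ["夭", "无"],
    "气": ["汽", "器"],
}

Pair = Tuple[str, str]  # (noisy sentence, clean sentence)


def weakest_position(confidences: List[float], sentence: str,
                     confusion: Dict[str, List[str]]) -> int:
    """Index of the least-confident character that has substitutes, or -1."""
    candidates = [i for i, ch in enumerate(sentence) if ch in confusion]
    if not candidates:
        return -1
    return min(candidates, key=lambda i: confidences[i])


def make_adversarial(sentence: str,
                     correct_fn: Callable[[str], str],
                     confidence_fn: Callable[[str], List[float]],
                     confusion: Dict[str, List[str]]) -> Optional[Pair]:
    """Substitute at the weak spot; keep the first error the model misses."""
    pos = weakest_position(confidence_fn(sentence), sentence, confusion)
    if pos < 0:
        return None
    for sub in confusion[sentence[pos]]:
        noisy = sentence[:pos] + sub + sentence[pos + 1:]
        if correct_fn(noisy) != sentence:  # the model fails to restore it
            return noisy, sentence
    return None


def augment(train_set: List[Pair], clean_sentences: List[str],
            correct_fn: Callable[[str], str],
            confidence_fn: Callable[[str], List[float]],
            confusion: Dict[str, List[str]]) -> List[Pair]:
    """Return the training set extended with newly found adversarial pairs."""
    extended = list(train_set)
    for sentence in clean_sentences:
        pair = make_adversarial(sentence, correct_fn, confidence_fn, confusion)
        if pair is not None:
            extended.append(pair)
    return extended
```

In an actual training loop, `augment` would presumably be called between training rounds so that the adversarial pairs accumulate gradually; how positions and substitutes are really selected is specified in the full paper, not here.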

research
11/16/2022

CSCD-IME: Correcting Spelling Errors Generated by Pinyin IME

Chinese Spelling Correction (CSC) is a task to detect and correct spelli...
research
07/03/2018

Reaching Human-level Performance in Automatic Grammatical Error Correction: An Empirical Study

Neural sequence-to-sequence (seq2seq) approaches have proven to be succe...
research
06/28/2023

An Adversarial Multi-Task Learning Method for Chinese Text Correction with Semantic Detection

Text correction, especially the semantic correction of more widely used ...
research
10/23/2022

Focus Is What You Need For Chinese Grammatical Error Correction

Chinese Grammatical Error Correction (CGEC) aims to automatically detect...
research
04/15/2021

An Alignment-Agnostic Model for Chinese Text Error Correction

This paper investigates how to correct Chinese text errors with types of...
research
02/11/2020

A Non-Intrusive Correction Algorithm for Classification Problems with Corrupted Data

A novel correction algorithm is proposed for multi-class classification ...
