FCGEC: Fine-Grained Corpus for Chinese Grammatical Error Correction

10/22/2022
by Lvxiaowei Xu, et al.

Grammatical Error Correction (GEC) has recently been widely applied in automatic correction and proofreading systems. However, Chinese GEC remains immature because high-quality data from native speakers is limited in both category coverage and scale. In this paper, we present FCGEC, a fine-grained corpus for detecting, identifying, and correcting grammatical errors. FCGEC is a human-annotated corpus with multiple references, consisting of 41,340 sentences collected mainly from multiple-choice questions in public school Chinese examinations. Furthermore, we propose a Switch-Tagger-Generator (STG) baseline model to correct grammatical errors in low-resource settings. Experimental results show that STG outperforms other GEC benchmark models on FCGEC. However, a significant gap remains between benchmark models and humans, which future models are encouraged to bridge.
