Grammatical Error Correction in Low Error Density Domains: A New Benchmark and Analyses

10/15/2020
by Simon Flachs, et al.
Evaluation of grammatical error correction (GEC) systems has primarily focused on essays written by non-native learners of English, which, however, cover only part of the full spectrum of GEC applications. We aim to broaden the target domain of GEC and release CWEB, a new benchmark for GEC consisting of website text generated by English speakers of varying levels of proficiency. Website data is a common and important domain that contains far fewer grammatical errors than learner essays, which we show presents a challenge to state-of-the-art GEC systems. We demonstrate that a factor behind this is the inability of systems to rely on a strong internal language model in low error density domains. We hope this work will facilitate the development of open-domain GEC models that generalize to different topics and genres.


research
05/25/2023

NaSGEC: a Multi-Domain Chinese Grammatical Error Correction Dataset from Native Speaker Texts

We introduce NaSGEC, a new dataset to facilitate research on Chinese gra...
research
07/04/2023

A Language Model for Grammatical Error Correction in L2 Russian

Grammatical error correction is one of the fundamental tasks in Natural ...
research
02/14/2017

JFLEG: A Fluency Corpus and Benchmark for Grammatical Error Correction

We present a new parallel corpus, JHU FLuency-Extended GUG corpus (JFLEG...
research
10/21/2020

Classifying Syntactic Errors in Learner Language

We present a method for classifying syntactic errors in learner language...
research
12/15/2021

ErAConD : Error Annotated Conversational Dialog Dataset for Grammatical Error Correction

Currently available grammatical error correction (GEC) datasets are comp...
research
08/19/2022

Gender Bias and Universal Substitution Adversarial Attacks on Grammatical Error Correction Systems for Automated Assessment

Grammatical Error Correction (GEC) systems perform a sequence-to-sequenc...
research
06/04/2020

Personalizing Grammatical Error Correction: Adaptation to Proficiency Level and L1

Grammar error correction (GEC) systems have become ubiquitous in a varie...
