Do Grammatical Error Correction Models Realize Grammatical Generalization?

06/06/2021
by Masato Mita, et al.

Interest in data generation approaches to grammatical error correction (GEC) using pseudo data has grown. However, these approaches have several drawbacks that make them inconvenient for real-world deployment, including a demand for large amounts of training data. On the other hand, some errors that follow grammatical rules may not require a large amount of data if GEC models can realize grammatical generalization. This study explores to what extent GEC models generalize the grammatical knowledge required for correcting errors. We introduce an analysis method that uses synthetic and real GEC datasets with controlled vocabularies to evaluate whether models can generalize to unseen errors. We found that a current standard Transformer-based GEC model fails to realize grammatical generalization even in simple settings with limited vocabulary and syntax, suggesting that it lacks the generalization ability required to correct errors from the provided training examples.
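
The idea of evaluating on unseen errors with a controlled vocabulary can be pictured with a small sketch. Everything in it is an illustrative assumption rather than the authors' setup: the toy vocabulary, the choice of subject-verb agreement as the error type, and the hypothetical helpers `make_pairs` and `split_by_unseen_verbs`. It builds erroneous/corrected sentence pairs and holds out every error involving certain verbs, so that a model trained on the rest would be tested only on errors it has never seen.

```python
import itertools

# Hypothetical toy vocabulary (assumption; the paper's vocabularies are not given here).
# Nouns are (singular, plural); verbs are (third-person singular form, base form).
NOUNS = [("dog", "dogs"), ("cat", "cats"), ("bird", "birds")]
VERBS = [("runs", "run"), ("sleeps", "sleep"), ("eats", "eat"), ("jumps", "jump")]


def make_pairs(nouns, verbs):
    """Yield (erroneous, corrected, verb_base) subject-verb agreement pairs."""
    for (n_sg, n_pl), (v_sg, v_base) in itertools.product(nouns, verbs):
        # Singular subject wrongly paired with the base verb form.
        yield (f"the {n_sg} {v_base}", f"the {n_sg} {v_sg}", v_base)
        # Plural subject wrongly paired with the third-person singular form.
        yield (f"the {n_pl} {v_sg}", f"the {n_pl} {v_base}", v_base)


def split_by_unseen_verbs(pairs, held_out_verbs):
    """Put errors on held-out verbs in the test set and all others in training."""
    train = [(src, tgt) for src, tgt, verb in pairs if verb not in held_out_verbs]
    test = [(src, tgt) for src, tgt, verb in pairs if verb in held_out_verbs]
    return train, test


pairs = list(make_pairs(NOUNS, VERBS))
train, test = split_by_unseen_verbs(pairs, held_out_verbs={"eat", "jump"})
```

Under this split, a model that had learned the agreement rule itself, rather than memorizing corrections for particular words, would be expected to fix the held-out errors as well.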

research
05/29/2023

Byte-Level Grammatical Error Correction Using Synthetic and Curated Corpora

Grammatical error correction (GEC) is the task of correcting typos, spel...
research
05/03/2020

Correcting the Autocorrect: Context-Aware Typographical Error Correction via Training Data Augmentation

In this paper, we explore the artificial generation of typographical err...
research
01/10/2020

Towards Minimal Supervision BERT-based Grammar Error Correction

Current grammatical error correction (GEC) models typically consider the...
research
04/20/2021

Grammatical Error Generation Based on Translated Fragments

We perform neural machine translation of sentence fragments in order to ...
research
11/01/2018

Spelling Error Correction Using a Nested RNN Model and Pseudo Training Data

We propose a nested recurrent neural network (nested RNN) model for Engl...
research
06/16/2023

Improving Audio Caption Fluency with Automatic Error Correction

Automated audio captioning (AAC) is an important cross-modality translat...
research
03/17/2022

Type-Driven Multi-Turn Corrections for Grammatical Error Correction

Grammatical Error Correction (GEC) aims to automatically detect and corr...
