Artificial Error Generation with Machine Translation and Syntactic Patterns

by Marek Rei, et al.

A shortage of available training data is holding back progress in automated error detection. This paper investigates two alternative methods for artificially generating writing errors in order to create additional resources. We propose treating error generation as a machine translation task, where grammatically correct text is translated to contain errors. In addition, we explore a system for extracting textual patterns from an annotated corpus, which can then be used to insert errors into grammatically correct sentences. Our experiments show that the inclusion of artificially generated errors significantly improves error detection accuracy on both the FCE and CoNLL 2014 datasets.
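The pattern-based idea in the abstract can be illustrated with a minimal sketch. It is not the authors' system: it assumes the annotated corpus is available as (incorrect, correct) pairs of token lists, and it learns only plain token substitutions, with no context or part-of-speech information. The function names `extract_patterns` and `insert_errors`, and the `prob` parameter, are hypothetical and introduced here only for illustration.

```python
import random
from collections import Counter
from difflib import SequenceMatcher


def extract_patterns(pairs):
    """Count (correct_span, error_span) substitutions seen in the corpus.

    `pairs` is an iterable of (incorrect_tokens, correct_tokens) lists.
    This is an assumed input format, not one specified by the paper.
    """
    patterns = Counter()
    for incorrect, correct in pairs:
        matcher = SequenceMatcher(a=correct, b=incorrect, autojunk=False)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag == "replace":  # keep substitutions only, for simplicity
                key = (tuple(correct[i1:i2]), tuple(incorrect[j1:j2]))
                patterns[key] += 1
    return patterns


def insert_errors(tokens, patterns, rng, prob=0.5):
    """Corrupt a correct sentence; return (tokens, labels), label 1 = error."""
    by_source = {}
    for (src, tgt), count in patterns.items():
        by_source.setdefault(src, []).append((tgt, count))

    out, labels, i = [], [], 0
    while i < len(tokens):
        applied = False
        for src, options in by_source.items():
            if tuple(tokens[i:i + len(src)]) == src and rng.random() < prob:
                targets, weights = zip(*options)
                # Sample an erroneous span in proportion to corpus frequency.
                tgt = rng.choices(targets, weights=weights, k=1)[0]
                out.extend(tgt)
                labels.extend([1] * len(tgt))
                i += len(src)
                applied = True
                break
        if not applied:
            out.append(tokens[i])
            labels.append(0)
            i += 1
    return out, labels


corpus = [(["I", "goed", "home"], ["I", "went", "home"])]
patterns = extract_patterns(corpus)
corrupted, labels = insert_errors(
    ["She", "went", "out"], patterns, random.Random(0), prob=1.0
)
```

The token-level labels returned alongside the corrupted sentence are what would make such output usable as extra training data for an error detection model. A fuller version would also need to handle insertions and deletions and condition patterns on surrounding context.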



The Unbearable Weight of Generating Artificial Errors for Grammatical Error Correction

In recent years, sequence-to-sequence models have been very effective fo...

Neural Text Generation with Artificial Negative Examples

Neural text generation models conditioning on given input (e.g. machine ...

Synthetic Error Dataset Generation Mimicking Bengali Writing Pattern

While writing Bengali using English keyboard, users often make spelling ...

Manually Annotated Spelling Error Corpus for Amharic

This paper presents a manually annotated spelling error corpus for Amhar...

Wronging a Right: Generating Better Errors to Improve Grammatical Error Detection

Grammatical error correction, like other machine learning tasks, greatly...

How do you correct run-on sentences it's not as easy as it seems

Run-on sentences are common grammatical mistakes but little research has...

Paraphrase Generation as Unsupervised Machine Translation

In this paper, we propose a new paradigm for paraphrase generation by tr...