Towards Automated Document Revision: Grammatical Error Correction, Fluency Edits, and Beyond

05/23/2022
by   Masato Mita, et al.

Natural language processing technology has rapidly improved automated grammatical error correction, and the community has begun to explore document-level revision as one of the next challenges. Two major obstacles stand in the way of moving beyond sentence-level automated grammatical error correction to an NLP-based document-level revision assistant: (1) there are few public corpora with document-level revisions annotated by professional editors, and (2) it is not feasible to elicit all possible references and evaluate the quality of a revision against them, because there are infinitely many possible revisions. This paper tackles these challenges. First, we introduce a new document-revision corpus, TETRA, in which professional editors revised academic papers sampled from the ACL Anthology. These papers contain few trivial grammatical errors, which enables us to focus more on document- and paragraph-level edits such as coherence and consistency. Second, we explore reference-less and interpretable methods for meta-evaluation that can detect quality improvements made by document revision. We show the uniqueness of TETRA compared with existing document revision corpora and demonstrate that a fine-tuned pre-trained language model can discriminate the quality of documents after revision even when the difference is subtle. This promising result will encourage the community to further explore automated document revision models and metrics in the future.

research
03/31/2021

UA-GEC: Grammatical Error Correction and Fluency Corpus for the Ukrainian Language

We present a corpus professionally annotated for grammatical error corre...
research
07/04/2023

A Language Model for Grammatical Error Correction in L2 Russian

Grammatical error correction is one of the fundamental tasks in Natural ...
research
04/04/2023

Is ChatGPT a Highly Fluent Grammatical Error Correction System? A Comprehensive Evaluation

ChatGPT, a large-scale language model based on the advanced GPT-3.5 arch...
research
05/18/2023

CLEME: Debiasing Multi-reference Evaluation for Grammatical Error Correction

It is intractable to evaluate the performance of Grammatical Error Corre...
research
05/26/2019

Evaluation of basic modules for isolated spelling error correction in Polish texts

Spelling error correction is an important problem in natural language pr...
research
04/30/2018

Inherent Biases in Reference-based Evaluation for Grammatical Error Correction and Text Simplification

The prevalent use of too few references for evaluating text-to-text gene...
research
12/30/2018

ATHENA: Automated Tuning of Genomic Error Correction Algorithms using Language Models

The performance of most error-correction algorithms that operate on geno...
