CLEME: Debiasing Multi-reference Evaluation for Grammatical Error Correction

05/18/2023
by Jingheng Ye, et al.

Evaluating the performance of Grammatical Error Correction (GEC) systems is difficult because GEC is a highly subjective task. Designing an evaluation metric that is as objective as possible is therefore crucial to the development of the GEC task. Previous mainstream evaluation metrics, i.e., reference-based metrics, introduce bias into multi-reference evaluation because they extract edits without considering the presence of multiple references. To overcome this problem, we propose Chunk-LEvel Multi-reference Evaluation (CLEME), which is designed to evaluate GEC systems in multi-reference settings. First, CLEME builds chunk sequences with consistent boundaries for the source, the hypothesis, and all the references, thus eliminating the bias caused by inconsistent edit boundaries. Then, based on the discovery that boundaries exist between different grammatical errors, we automatically determine the grammatical error boundaries and compute F_0.5 scores in a novel way. CLEME consistently and substantially outperforms existing reference-based GEC metrics on multiple reference sets in both corpus-level and sentence-level settings. Extensive experiments and detailed analyses support the correctness of our discovery and the effectiveness of the proposed evaluation metric.
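For readers unfamiliar with the F_0.5 score mentioned in the abstract, the following is a minimal sketch of the standard F_beta formula with beta = 0.5, which weights precision more heavily than recall. It is not CLEME's chunk-level computation, which the abstract does not specify; the function name `f_beta`, its count-based arguments, and the handling of empty denominators are assumptions made for illustration.

```python
def f_beta(tp: int, fp: int, fn: int, beta: float = 0.5) -> float:
    """Standard F_beta from true-positive, false-positive and false-negative counts.

    Illustrative helper only (hypothetical name and signature); it does not
    reproduce how CLEME derives its counts from chunk sequences.
    """
    # Assumed convention: with no proposed edits precision is 1.0,
    # and with no reference edits recall is 1.0.
    precision = tp / (tp + fp) if (tp + fp) > 0 else 1.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 1.0

    denominator = beta ** 2 * precision + recall
    if denominator == 0:
        # Both precision and recall are zero.
        return 0.0
    return (1 + beta ** 2) * precision * recall / denominator


if __name__ == "__main__":
    # 6 correct edits, 2 spurious edits, 4 missed edits.
    print(round(f_beta(6, 2, 4), 4))
```

With beta below 1, a system that proposes few but accurate edits scores higher than one that recovers the same number of errors while also making many spurious changes, which is why beta = 0.5 is the usual choice in GEC evaluation.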


