Annotation Error Detection: Analyzing the Past and Present for a More Coherent Future

06/05/2022
by Jan-Christoph Klie, et al.

Annotated data is an essential ingredient in natural language processing for training and evaluating machine learning models. It is therefore very desirable for the annotations to be of high quality. Recent work, however, has shown that several popular datasets contain a surprising number of annotation errors or inconsistencies. To alleviate this issue, many methods for annotation error detection have been devised over the years. While researchers show that their approaches work well on their newly introduced datasets, they rarely compare their methods to previous work or evaluate on the same datasets. This raises strong concerns about the methods' general performance and makes it difficult to assess their strengths and weaknesses. We therefore reimplement 18 methods for detecting potential annotation errors and evaluate them on 9 English datasets for text classification as well as token and span labeling. In addition, we define a uniform evaluation setup, including a new formalization of the annotation error detection task, an evaluation protocol, and general best practices. To facilitate future research and reproducibility, we release our datasets and implementations in an easy-to-use and open-source software package.
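To make the task concrete, the sketch below illustrates one simple family of annotation error detection heuristics: variation-based detection, where identical texts that received different labels are treated as suspicious and the minority label is flagged. This is a hypothetical toy illustration of the task setting, not the paper's implementation or any of the 18 evaluated methods.

```python
from collections import Counter, defaultdict

def flag_inconsistencies(instances):
    """Toy variation-based annotation error detector (illustrative sketch).

    instances: list of (text, label) pairs from an annotated corpus.
    Returns indices of instances whose label is in the minority among
    identical texts, i.e. likely annotation errors.
    """
    # Count how often each label was assigned to each distinct text.
    by_text = defaultdict(Counter)
    for text, label in instances:
        by_text[text][label] += 1

    flagged = []
    for i, (text, label) in enumerate(instances):
        counts = by_text[text]
        # A text annotated with more than one label is "in variation";
        # flag the instances carrying a non-majority label.
        if len(counts) > 1 and counts[label] < max(counts.values()):
            flagged.append(i)
    return flagged

corpus = [
    ("great movie", "positive"),
    ("great movie", "positive"),
    ("great movie", "negative"),  # disagrees with the majority label
    ("boring plot", "negative"),
]
print(flag_inconsistencies(corpus))  # → [2]
```

Real detectors are typically evaluated as rankers: each instance receives a suspicion score, and precision/recall of the flagged set is measured against known errors, which is the kind of uniform protocol the paper formalizes.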


