Towards a Shared Rubric for Dataset Annotation

12/07/2021
by Andrew Marc Greene, et al.

When arranging for third-party data annotation, it can be hard to compare how well the competing providers apply best practices to create high-quality datasets. This leads to a "race to the bottom," where competition based solely on price makes it hard for vendors to charge for high-quality annotation. We propose a voluntary rubric which can be used (a) as a scorecard to compare vendors' offerings, (b) to communicate our expectations of the vendors more clearly and consistently than today, (c) to justify the expense of choosing someone other than the lowest bidder, and (d) to encourage annotation providers to improve their practices.
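The rubric itself is described in the full paper; as a rough illustration of use case (a), the sketch below scores two hypothetical vendors against a small set of weighted criteria. The criteria names, weights, vendor ratings, and 0-5 scale are placeholders for illustration only, not the rubric proposed by the authors.

```python
# Minimal sketch of using a rubric as a vendor scorecard.
# Criteria, weights, and ratings below are illustrative placeholders,
# not the rubric proposed in the paper.

RUBRIC = {
    "annotator training":      0.25,
    "quality-control process": 0.30,
    "documentation provided":  0.20,
    "fair labor practices":    0.25,
}

def scorecard(ratings: dict[str, float]) -> float:
    """Weighted sum of per-criterion ratings (each rated 0-5)."""
    return sum(weight * ratings.get(criterion, 0.0)
               for criterion, weight in RUBRIC.items())

vendors = {
    "Vendor A": {"annotator training": 4, "quality-control process": 5,
                 "documentation provided": 3, "fair labor practices": 4},
    "Vendor B": {"annotator training": 2, "quality-control process": 3,
                 "documentation provided": 2, "fair labor practices": 2},
}

for name, ratings in vendors.items():
    print(f"{name}: {scorecard(ratings):.2f} / 5.00")
```

A shared scorecard of this kind makes the comparison explicit: a lower bid from Vendor B can be weighed against its lower rubric score rather than being accepted on price alone.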

Related research

Best Practices for Managing Data Annotation Projects (09/24/2020)
Annotation is the labeling of data by human effort. Annotation is critic...

Whose AI Dream? In search of the aspiration in data annotation (03/21/2022)
This paper presents the practice of data annotation from the perspective ...

Annotation Error Detection: Analyzing the Past and Present for a More Coherent Future (06/05/2022)
Annotated data is an essential ingredient in natural language processing...

Analyzing Dataset Annotation Quality Management in the Wild (07/16/2023)
Data quality is crucial for training accurate, unbiased, and trustworthy...

FreeLabel: A Publicly Available Annotation Tool based on Freehand Traces (02/18/2019)
Large-scale annotation of image segmentation datasets is often prohibiti...

Towards Robust Handwritten Text Recognition with On-the-fly User Participation (12/17/2022)
Long-term OCR services aim to provide high-quality output to their users...

Can We Trust Race Prediction? (07/17/2023)
In the absence of sensitive race and ethnicity data, researchers, regula...
