Active Annotation: bootstrapping annotation lexicon and guidelines for supervised NLU learning

08/12/2019
by Federico Marinelli, et al.

Natural Language Understanding (NLU) models are typically trained in a supervised learning framework. In intent classification, the labels to be predicted are fixed in advance by the annotation schema, and labelling is laborious: annotators manually inspect each utterance and assign the corresponding label. We propose an Active Annotation (AA) approach that combines an unsupervised learning method in the embedding space, a human-in-the-loop verification process, and linguistic insights to create lexicons that can serve as open categories and be adapted over time. In particular, annotators define the y-label space on the fly during annotation, through an iterative process and without prior knowledge of the input data. We evaluate the proposed annotation paradigm in a real use-case NLU scenario. Results show that Active Annotation yields accurate, higher-quality training data, with an annotation speed an order of magnitude higher than that of the traditional human-only baseline annotation methodology.
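
The loop described in the abstract (cluster utterances in an embedding space, have a human verify each cluster, repeat on what is left) can be illustrated with a minimal sketch. This is not the authors' implementation: the bag-of-words `embed`, the greedy `cluster` routine, the `threshold` and `min_size` parameters, and the scripted `verify` callback are all assumptions chosen to keep the example self-contained. A real system would presumably use learned sentence embeddings and a proper clustering algorithm.

```python
import math
from collections import Counter

def embed(utterance):
    # Toy embedding (assumption): bag-of-words counts stand in for a learned vector.
    return Counter(utterance.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def cluster(utterances, threshold=0.5):
    # Greedy single-pass clustering (assumption): an utterance joins the first
    # cluster whose seed (first member) is at least `threshold` similar to it.
    clusters = []
    for u in utterances:
        for c in clusters:
            if cosine(embed(u), embed(c[0])) >= threshold:
                c.append(u)
                break
        else:
            clusters.append([u])
    return clusters

def active_annotation_round(utterances, verify, threshold=0.5, min_size=2):
    # One round: cluster, then ask the annotator to name or reject each cluster.
    # `verify` returns a label string to accept a cluster, or None to reject it.
    labelled, remaining = {}, []
    for c in cluster(utterances, threshold):
        label = verify(c) if len(c) >= min_size else None
        if label is None:
            remaining.extend(c)  # left for a later round
        else:
            labelled.setdefault(label, []).extend(c)
    return labelled, remaining

# Hypothetical data and a scripted stand-in for the human annotator.
utterances = [
    "play some jazz music",
    "play some rock music",
    "what is the weather today",
    "what is the weather tomorrow",
    "set an alarm",
]

def scripted_annotator(c):
    if "music" in c[0]:
        return "play_music"
    if "weather" in c[0]:
        return "get_weather"
    return None

labelled, remaining = active_annotation_round(utterances, scripted_annotator)
print(labelled)
print(remaining)
```

In this sketch the label set is not fixed beforehand: it grows as the annotator names clusters, and rejected or undersized clusters are returned in `remaining` so that a further round can revisit them, for instance with a different threshold.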
