Handling Missing Annotations in Supervised Learning Data

02/17/2020
by Alaa E. Abdel-Hakim, et al.

Data annotation is an essential stage in supervised learning. However, the annotation process is exhausting and time-consuming, especially for large datasets. Activities of Daily Living (ADL) recognition is an example of a system that relies on very large volumes of raw sensor readings. In such systems, readings are collected from activity-monitoring sensors around the clock. The resulting dataset is so large that it is almost impossible for a human annotator to assign a definite label to every single instance. This leaves annotation gaps in the input data of the supervised learning system, and these gaps degrade the performance of the recognition system. In this work, we propose and investigate three different paradigms for handling these gaps. In the first paradigm, the gaps are removed by dropping all unlabeled readings. In the second paradigm, a single "Unknown" or "Do-Nothing" label is assigned to the unlabeled readings. The third paradigm gives every gap a unique label that identifies the deterministic labels encapsulating it. We also propose a semantic preprocessing method for annotation gaps that constructs a hybrid combination of some of these paradigms for further performance improvement. The performance of the three proposed paradigms and their hybrid combination is evaluated on an ADL benchmark dataset containing more than 2.5 × 10^6 sensor readings collected over more than nine months. The evaluation results highlight the performance contrast between the paradigms and support a specific gap-handling approach for better performance.
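The three paradigms can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the use of None to mark an unlabeled reading, the "before->after" gap-label format, and the "Boundary" placeholder for gaps at the start or end of the sequence are all assumptions made here for illustration.

```python
def drop_gaps(readings, labels):
    """Paradigm 1 (sketch): discard every reading that has no label."""
    return [(r, l) for r, l in zip(readings, labels) if l is not None]


def fill_unknown(labels, unknown="Unknown"):
    """Paradigm 2 (sketch): give all unlabeled readings one shared label."""
    return [unknown if l is None else l for l in labels]


def label_by_neighbors(labels, boundary="Boundary"):
    """Paradigm 3 (sketch): label each gap by the labels that enclose it.

    The "before->after" naming and the `boundary` placeholder are
    hypothetical choices, not taken from the paper.
    """
    out = list(labels)
    n = len(labels)
    i = 0
    while i < n:
        if labels[i] is not None:
            i += 1
            continue
        # Find the end of this run of unlabeled readings.
        j = i
        while j < n and labels[j] is None:
            j += 1
        before = labels[i - 1] if i > 0 else boundary
        after = labels[j] if j < n else boundary
        for k in range(i, j):
            out[k] = f"{before}->{after}"
        i = j
    return out


# Toy sequence: None marks a reading the annotator left unlabeled.
readings = [10, 11, 12, 13, 14, 15]
labels = ["Sleep", None, None, "Toilet", None, "Sleep"]

dropped = drop_gaps(readings, labels)
unknown = fill_unknown(labels)
neighbors = label_by_neighbors(labels)
```

On this toy sequence, the first function keeps only the three labeled readings, the second turns every gap into "Unknown", and the third produces gap labels such as "Sleep->Toilet" and "Toilet->Sleep", so gaps between different activity pairs become distinct classes.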


