Satyam: Democratizing Groundtruth for Machine Vision

11/08/2018
by Hang Qiu, et al.

The democratization of machine learning (ML) has led to ML-based machine vision systems for autonomous driving, traffic monitoring, and video surveillance. However, true democratization cannot be achieved without greatly simplifying the process of collecting groundtruth for training and testing these systems. This groundtruth collection is necessary to ensure good performance under varying conditions. In this paper, we present the design and evaluation of Satyam, a first-of-its-kind system that enables a layperson to launch groundtruth collection tasks for machine vision with minimal effort. Satyam leverages a crowdtasking platform, Amazon Mechanical Turk, and automates several challenging aspects of groundtruth collection: creating and launching custom web-UI tasks for obtaining the desired groundtruth, controlling result quality in the face of spammers and untrained workers, adapting prices to match task complexity, filtering spammers and workers with poor performance, and processing worker payments. We validate Satyam using several popular benchmark vision datasets, and demonstrate that groundtruth obtained by Satyam is comparable to that obtained from trained experts and provides matching ML performance when used for training.
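To make the quality-control and worker-filtering ideas in the abstract concrete, here is a minimal sketch for the simplest case, image classification labels. It is not Satyam's actual algorithm, which the abstract does not describe; it assumes plain majority voting over redundant answers, and the names `fuse_labels`, `worker_agreement`, and the `min_agreement` threshold are hypothetical.

```python
from collections import Counter, defaultdict


def fuse_labels(answers, min_agreement=0.6):
    """Fuse redundant worker answers by majority vote (an assumed technique).

    answers: iterable of (worker_id, item_id, label) tuples.
    Returns {item_id: label}, with None where the most common label
    does not reach the min_agreement fraction of that item's answers.
    """
    by_item = defaultdict(list)
    for _worker, item, label in answers:
        by_item[item].append(label)

    fused = {}
    for item, labels in by_item.items():
        label, count = Counter(labels).most_common(1)[0]
        fused[item] = label if count / len(labels) >= min_agreement else None
    return fused


def worker_agreement(answers, fused):
    """Fraction of each worker's answers that match the fused label.

    Items without a fused label (None) are skipped, so a worker is only
    scored on items where the crowd reached agreement.
    """
    hits = defaultdict(int)
    total = defaultdict(int)
    for worker, item, label in answers:
        if fused.get(item) is None:
            continue
        total[worker] += 1
        hits[worker] += int(label == fused[item])
    return {worker: hits[worker] / total[worker] for worker in total}


if __name__ == "__main__":
    answers = [
        ("w1", "img_a", "cat"), ("w2", "img_a", "cat"), ("w3", "img_a", "dog"),
        ("w1", "img_b", "car"), ("w2", "img_b", "car"), ("w3", "img_b", "bus"),
    ]
    fused = fuse_labels(answers)
    print(fused)
    print(worker_agreement(answers, fused))
```

Under these assumptions, a worker whose agreement score stays low across many items would be a candidate for filtering, and items whose fused label is `None` would be candidates for collecting more answers.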

research
05/17/2022

A Labeling Task Design for Supporting Algorithmic Needs: Facilitating Worker Diversity and Reducing AI Bias

Studies on supervised machine learning (ML) recommend involving workers ...
research
05/17/2021

Towards Demystifying Serverless Machine Learning Training

The appeal of serverless (FaaS) has triggered a growing interest on how ...
research
08/25/2020

Towards Guidelines for Assessing Qualities of Machine Learning Systems

Nowadays, systems containing components based on machine learning (ML) m...
research
06/15/2023

In Search of netUnicorn: A Data-Collection Platform to Develop Generalizable ML Models for Network Security Problems

The remarkable success of the use of machine learning-based solutions fo...
research
11/03/2020

Ensuring Dataset Quality for Machine Learning Certification

In this paper, we address the problem of dataset quality in the context ...
research
03/05/2021

labelCloud: A Lightweight Domain-Independent Labeling Tool for 3D Object Detection in Point Clouds

Within the past decade, the rise of applications based on artificial int...
research
04/26/2021

Recurring Turking: Conducting Daily Task Studies on Mechanical Turk

In this paper, we present our system design for conducting recurring dai...
