Active Learning from Crowd in Document Screening

11/11/2020
by Evgeny Krivosheev, et al.

In this paper, we explore how to efficiently combine crowdsourcing and machine intelligence for the problem of document screening, where documents must be screened with a set of machine-learning filters. Specifically, we focus on building a set of machine learning classifiers that evaluate documents and on then screening the documents efficiently. The task is challenging because the budget is limited and there are countless ways to spend it. We propose objective-aware sampling, a screening-specific multi-label active learning sampling technique for querying unlabelled documents for annotation. Our algorithm decides which machine filter needs more training data and how to choose unlabelled items to annotate so as to minimize the risk of overall classification errors rather than the error of a single filter. We demonstrate that objective-aware sampling significantly outperforms state-of-the-art active learning sampling strategies.
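
To make the idea of sampling for the overall screening objective concrete, here is a minimal sketch. It is not the paper's algorithm: the scoring rule, the function name `select_query`, and the example probabilities are all assumptions made for illustration. It assumes a document is included only if it passes every filter, that filters are independent, and that each filter already outputs a probability that a document passes it. A candidate (document, filter) query is scored by the filter's uncertainty on that document, weighted by the probability that the remaining filters let the document through, so that labels are not spent on documents another filter already excludes with confidence.

```python
from math import prod


def select_query(pass_probs):
    """Pick the (document, filter) pair to send for annotation.

    pass_probs[i][f] is the current estimate that document i passes
    filter f. The score below is an illustrative heuristic (an
    assumption, not the formula from the paper).
    """
    best, best_score = None, -1.0
    for i, probs in enumerate(pass_probs):
        for f, p in enumerate(probs):
            # How unsure filter f is about document i (0.5 is the maximum).
            uncertainty = min(p, 1.0 - p)
            # Probability that all other filters pass the document; if it
            # is low, a label for filter f barely changes the final decision.
            others_pass = prod(q for g, q in enumerate(probs) if g != f)
            score = uncertainty * others_pass
            if score > best_score:
                best, best_score = (i, f), score
    return best


# Hypothetical estimates for three documents and two filters.
pass_probs = [
    [0.50, 0.05],  # filter 0 is maximally unsure, but filter 1 almost surely excludes
    [0.60, 0.90],  # filter 0 is fairly unsure and filter 1 likely passes
    [0.95, 0.98],  # both filters are confident
]
query = select_query(pass_probs)
```

Under these assumed numbers, plain uncertainty sampling on filter 0 would pick the first document, while the weighted score prefers the second one, where a new label is more likely to change the overall screening decision.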
