Certifying One-Phase Technology-Assisted Reviews

08/29/2021
by David D. Lewis, et al.

Technology-assisted review (TAR) workflows based on iterative active learning are widely used in document review applications. Most stopping rules for one-phase TAR workflows lack valid statistical guarantees, which has discouraged their use in some legal contexts. Drawing on the theory of quantile estimation, we provide the first broadly applicable and statistically valid sample-based stopping rules for one-phase TAR. We further show theoretically and empirically that overshooting a recall target, which has been treated as innocuous or desirable in past evaluations of stopping rules, is a major source of excess cost in one-phase TAR workflows. Counterintuitively, incurring a larger sampling cost to reduce excess recall leads to lower total cost in almost all scenarios.
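To make the idea of a sample-based stopping rule grounded in quantile estimation concrete, here is a minimal sketch. It is not taken from the paper. It assumes a hypothetical setup in which a simple random sample of `r` relevant documents is drawn before the review, and the review stops once `j` of those sampled documents have been encountered. The function name `min_hits_to_stop` and the parameters `r`, `target_recall`, and `alpha` are illustrative choices. The binomial argument treats each sampled relevant document as falling before the target-recall point with probability `target_recall`, which is only an approximation for a finite collection.

```python
from math import comb


def min_hits_to_stop(r, target_recall, alpha):
    """Smallest j such that stopping when the j-th of r sampled relevant
    documents is found gives recall >= target_recall with confidence
    at least 1 - alpha. Returns None if the sample is too small.

    Assumption (illustrative, not the paper's stated procedure): the number
    of sampled relevant documents lying before the target-recall point is
    modeled as Binomial(r, target_recall).
    """
    if r <= 0 or not (0.0 < target_recall < 1.0) or not (0.0 < alpha < 1.0):
        raise ValueError("need r > 0 and target_recall, alpha in (0, 1)")

    t = target_recall
    tail = 0.0  # running value of P(Binomial(r, t) >= j)
    best = None
    # Walk j downward from r, accumulating the upper tail probability.
    for j in range(r, 0, -1):
        tail += comb(r, j) * t**j * (1.0 - t) ** (r - j)
        if tail <= alpha:
            best = j  # this j is still safe; keep looking for a smaller one
        else:
            break  # the tail only grows as j shrinks, so stop here
    return best


if __name__ == "__main__":
    for sample_size in (13, 14, 50):
        print(sample_size, min_hits_to_stop(sample_size, 0.8, 0.05))
```

Under these assumptions, a smaller `j` relative to `r` means the review can stop earlier, which is why a larger sample can reduce the overshoot of the recall target that the abstract identifies as a source of excess cost.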


research
06/18/2021

Heuristic Stopping Rules For Technology-Assisted Review

Technology-assisted review (TAR) refers to human-in-the-loop active lear...
research
06/18/2021

On Minimizing Cost in Legal Document Review Workflows

Technology-assisted review (TAR) refers to human-in-the-loop machine lea...
research
01/08/2022

Impact of Stop Sets on Stopping Active Learning for Text Classification

Active learning is an increasingly important branch of machine learning ...
research
10/07/2021

Hitting the Target: Stopping Active Learning at the Cost-Based Optimum

Active learning allows machine learning models to be trained using fewer...
research
01/24/2018

Impact of Batch Size on Stopping Active Learning for Text Classification

When using active learning, smaller batch sizes are typically more effic...
research
12/28/2013

Stopping Rules for Bag-of-Words Image Search and Its Application in Appearance-Based Localization

We propose a technique to improve the search efficiency of the bag-of-wo...
research
07/05/2017

Early stopping for kernel boosting algorithms: A general analysis with localized complexities

Early stopping of iterative algorithms is a widely-used form of regulari...
