Empirical Evaluations of Preprocessing Parameters' Impact on Predictive Coding's Effectiveness

04/03/2019
by   Rishi Chhatwal, et al.
0

Predictive coding, once used in only a small fraction of legal and business matters, is now widely deployed to quickly cull through increasingly vast amounts of data and reduce the need for costly and inefficient human document review. Previously, the sole front-end input used to create a predictive model was the exemplar documents (training data) chosen by subject-matter experts. Many predictive coding tools require users to rely on static preprocessing parameters and a single machine learning algorithm to develop the predictive model. Little research has been published discussing the impact preprocessing parameters and learning algorithms have on the effectiveness of the technology. A deeper dive into the generation of a predictive model shows that the settings and algorithm can have a strong effect on the accuracy and efficacy of a predictive coding tool. Understanding how these input parameters affect the output will empower legal teams with the information they need to implement predictive coding as efficiently and effectively as possible. This paper outlines different preprocessing parameters and algorithms as applied to multiple real-world data sets to understand the influence of various approaches.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
03/21/2019

Empirical Evaluations of Seed Set Selection Strategies for Predictive Coding

Training documents have a significant impact on the performance of predi...
research
04/03/2019

Explainable Text Classification in Legal Document Review A Case Study of Explainable Predictive Coding

In today's legal environment, lawsuits and regulatory investigations req...
research
04/03/2019

Empirical Study of Deep Learning for Text Classification in Legal Document Review

Predictive coding has been widely used in legal matters to find relevant...
research
07/27/2021

Predictive Coding: a Theoretical and Experimental Review

Predictive coding offers a potentially unifying account of cortical func...
research
12/01/2020

The Hidden Inconsistencies Introduced by Predictive Algorithms in Judicial Decision Making

Algorithms, from simple automation to machine learning, have been introd...
research
01/03/2015

A Taxonomy of Big Data for Optimal Predictive Machine Learning and Data Mining

Big data comes in various ways, types, shapes, forms and sizes. Indeed, ...

Please sign up or login with your details

Forgot password? Click here to reset