Computer-Assisted Creation of Boolean Search Rules for Text Classification in the Legal Domain

12/10/2021
by Hannes Westermann, et al.

In this paper, we present a method of building strong, explainable classifiers in the form of Boolean search rules. We developed an interactive environment called CASE (Computer Assisted Semantic Exploration), which exploits word co-occurrence to guide human annotators in the selection of relevant search terms. The system supports iterative evaluation and improvement of the classification rules. This process lets human annotators draw on statistical information while incorporating their expert intuition into the rules they create. We evaluate classifiers created with our CASE system on four datasets and compare the results to machine learning methods, including SKOPE rules, random forest, support vector machine, and fastText classifiers. The results inform a discussion of the trade-offs between the superior compactness, simplicity, and intuitiveness of the Boolean search rules and the better performance of state-of-the-art machine learning models for text classification.
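To make the idea of a Boolean search rule acting as a classifier more concrete, the following is a minimal sketch in Python. It is not the CASE implementation. The rule representation (an OR over clauses, each with required and excluded terms), the use of raw document-level co-occurrence counts to suggest terms, the function names, and the toy sentences and labels are all assumptions made for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase a text and return its set of word tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def matches(rule, text):
    """Return True if any clause of the rule matches the text.

    A rule is an OR over clauses; each clause is a pair
    (required_terms, excluded_terms). This format is hypothetical.
    """
    tokens = tokenize(text)
    return any(
        all(t in tokens for t in required)
        and not any(t in tokens for t in excluded)
        for required, excluded in rule
    )


def cooccurring_terms(seed, texts, top_n=3):
    """Count terms that appear in the same text as `seed`, most frequent first."""
    counts = Counter()
    for text in texts:
        tokens = tokenize(text)
        if seed in tokens:
            counts.update(tokens - {seed})
    return counts.most_common(top_n)


def precision_recall(rule, texts, labels):
    """Evaluate a rule against binary gold labels."""
    predictions = [matches(rule, t) for t in texts]
    tp = sum(p and y for p, y in zip(predictions, labels))
    fp = sum(p and not y for p, y in zip(predictions, labels))
    fn = sum(y and not p for p, y in zip(predictions, labels))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy, invented sentences (not taken from the paper's datasets).
texts = [
    "The tenant failed to pay the rent",
    "The landlord shall pay the costs of repair",
    "The tenant did not pay rent for three months",
    "The court dismissed the appeal",
]
labels = [True, False, True, False]  # hypothetical class: non-payment of rent

# (tenant AND rent) AND NOT landlord
rule = [({"tenant", "rent"}, {"landlord"})]

print(cooccurring_terms("rent", texts))
print(precision_recall(rule, texts, labels))
```

An annotator could look at the suggested co-occurring terms, add or exclude a term in the rule, and re-run the evaluation, which loosely mirrors the iterative loop the abstract describes. A rule of this form stays readable as a single line of logic, which is the compactness the abstract contrasts with learned models.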
