NESTLE: a No-Code Tool for Statistical Analysis of Legal Corpus

09/08/2023
by Kyoungyeon Cho, et al.

The statistical analysis of a large-scale legal corpus can provide valuable legal insights. Such analysis requires one to (1) select a subset of the corpus using document retrieval tools, (2) structure the text using information extraction (IE) systems, and (3) visualize the data for statistical analysis. Each step demands either specialized tools or programming skills, and no comprehensive, unified "no-code" tool has been available. For IE in particular, if the target information is not predefined in the ontology of the IE system, users must build their own system. Here we present NESTLE, a no-code tool for large-scale statistical analysis of legal corpora. With NESTLE, users can search for target documents, extract information, and visualize the structured data, all through a chat interface with an accompanying auxiliary GUI for fine-grained control. NESTLE consists of three main components: a search engine, an end-to-end IE system, and a Large Language Model (LLM) that glues the components together and provides the chat interface. Powered by the LLM and the end-to-end IE system, NESTLE can extract any type of information, including information that has not been predefined in the IE system, opening up the possibility of unlimited, customizable statistical analysis of the corpus without writing a single line of code. The use of a custom end-to-end IE system also enables faster and lower-cost IE on a large-scale corpus. We validate our system on 15 Korean precedent IE tasks and 3 legal text classification tasks from LEXGLUE. Comprehensive experiments reveal that NESTLE can achieve performance comparable to GPT-4 by training the internal IE module with 4 human-labeled and 192 LLM-labeled examples. The detailed analysis provides insight into the trade-off between accuracy, time, and cost in building such a system.
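
The three steps named in the abstract (select, structure, aggregate) can be illustrated with a minimal sketch. This is not NESTLE's implementation: the toy corpus, the keyword search, the regular expression standing in for an IE model, and all function names (`search`, `extract`, `summarize`, `run_pipeline`) are hypothetical and chosen only to show how the stages could fit together.

```python
import re
from collections import Counter

# Toy corpus: hypothetical sentences, not taken from the paper or its datasets.
CORPUS = [
    "The defendant was convicted of fraud and fined 500 dollars.",
    "The defendant was convicted of fraud and fined 1200 dollars.",
    "The court dismissed the claim for breach of contract.",
    "The defendant was convicted of theft and fined 300 dollars.",
]

# Assumed pattern; a real system would use a trained IE model instead.
FINE_PATTERN = re.compile(r"convicted of (\w+) and fined (\d+) dollars")


def search(corpus, keyword):
    """Step 1: select a subset of the corpus (naive keyword match)."""
    return [doc for doc in corpus if keyword.lower() in doc.lower()]


def extract(doc):
    """Step 2: turn free text into a structured record, or None if nothing matches."""
    match = FINE_PATTERN.search(doc)
    if match is None:
        return None
    return {"offense": match.group(1), "fine": int(match.group(2))}


def summarize(records):
    """Step 3: aggregate structured records into per-offense statistics."""
    counts = Counter(r["offense"] for r in records)
    totals = Counter()
    for r in records:
        totals[r["offense"]] += r["fine"]
    return {
        offense: {"count": counts[offense], "mean_fine": totals[offense] / counts[offense]}
        for offense in counts
    }


def run_pipeline(corpus, keyword):
    """Chain the three steps: search, extract, summarize."""
    selected = search(corpus, keyword)
    records = [rec for rec in map(extract, selected) if rec is not None]
    return summarize(records)


if __name__ == "__main__":
    print(run_pipeline(CORPUS, "convicted"))
```

In the system the abstract describes, an LLM-driven chat interface would drive these stages and a trained end-to-end IE module would replace the regular expression; the sketch only makes the data flow between the stages concrete.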

research
11/03/2022

Data-efficient End-to-end Information Extraction for Statistical Legal Analysis

Legal practitioners often face a vast amount of documents. Lawyers, for ...
research
10/05/2022

Graphie: A network-based visual interface for UK's Primary Legislation

We present Graphie, a novel navigational interface to visualize Acts and...
research
06/10/2022

A Multi-Task Benchmark for Korean Legal Language Understanding and Judgement Prediction

The recent advances of deep learning have dramatically changed how machi...
research
04/02/2022

HLDC: Hindi Legal Documents Corpus

Many populous countries including India are burdened with a considerable...
research
04/13/2019

Legal Area Classification: A Comparative Study of Text Classifiers on Singapore Supreme Court Judgments

This paper conducts a comparative study on the performance of various ma...
research
11/23/2019

Corpus-Level End-to-End Exploration for Interactive Systems

A core interest in building Artificial Intelligence (AI) agents is to le...
research
05/04/2023

Analyzing Hong Kong's Legal Judgments from a Computational Linguistics point-of-view

Analysis and extraction of useful information from legal judgments using...
