The Cambridge Law Corpus: A Corpus for Legal AI Research

09/21/2023
by   Andreas Östling, et al.

We introduce the Cambridge Law Corpus (CLC), a corpus for legal AI research. It consists of over 250 000 court cases from the UK. Most cases are from the 21st century, but the corpus includes cases as old as the 16th century. This paper presents the first release of the corpus, containing the raw text and metadata. Alongside the corpus, we provide case outcome annotations for 638 cases, made by legal experts. Using this annotated data, we have trained and evaluated GPT-3, GPT-4 and RoBERTa models on case outcome extraction to provide benchmarks. We include an extensive legal and ethical discussion to address the potentially sensitive nature of this material. As a consequence, the corpus will only be released for research purposes under certain restrictions.

research
01/31/2022

Corpus for Automatic Structuring of Legal Documents

In populous countries, pending legal cases have been growing exponential...
research
05/28/2021

ILDC for CJPE: Indian Legal Documents Corpus for Court Judgment Prediction and Explanation

An automated system that could assist a judge in predicting the outcome ...
research
07/18/2022

Using attention methods to predict judicial outcomes

Legal Judgment Prediction is one of the most acclaimed fields for the co...
research
04/25/2021

What About the Precedent: An Information-Theoretic Analysis of Common Law

In common law, the outcome of a new case is determined mostly by precede...
research
09/07/2022

Algorithmic Learning Foundations for Common Law

This paper looks at a common law legal system as a learning algorithm, m...
research
10/25/2022

Deconfounding Legal Judgment Prediction for European Court of Human Rights Cases Towards Better Alignment with Experts

This work demonstrates that Legal Judgement Prediction systems without e...
research
04/20/2021

StateCensusLaws.org: A Web Application for Consuming and Annotating Legal Discourse Learning

In this work, we create a web application to highlight the output of NLP...
