LEXTREME: A Multi-Lingual and Multi-Task Benchmark for the Legal Domain

01/30/2023
by   Joel Niklaus, et al.

Lately, propelled by the phenomenal advances around the transformer architecture, the legal NLP field has enjoyed spectacular growth. To measure progress, well-curated and challenging benchmarks are crucial. However, most benchmarks are English-only, and in legal NLP specifically there is no multilingual benchmark available yet. Additionally, many benchmarks are saturated, with the best models clearly outperforming the best humans and achieving near-perfect scores. We survey the legal NLP literature and select 11 datasets covering 24 languages, creating LEXTREME. To provide a fair comparison, we propose two aggregate scores, one based on the datasets and one on the languages. The best baseline (XLM-R large) achieves both a dataset aggregate score and a language aggregate score of 61.3. This indicates that LEXTREME is still very challenging and leaves ample room for improvement. To make it easy for researchers and practitioners to use, we release LEXTREME on Hugging Face together with all the code required to evaluate models and a public Weights and Biases project with all the runs.
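The two aggregate scores can be illustrated with a minimal sketch. The abstract does not state how the per-dataset and per-language results are averaged, so the averaging function here is an assumption and is passed in as a parameter (the harmonic mean is used as one plausible default). The function name `aggregate_scores`, the dataset names, and the score values are all hypothetical and are not taken from the LEXTREME code or results.

```python
from collections import defaultdict
from statistics import harmonic_mean


def aggregate_scores(scores, mean=harmonic_mean):
    """Return (dataset_aggregate, language_aggregate).

    scores maps (dataset, language) -> a positive score such as macro-F1.
    The choice of `mean` is an assumption; the abstract does not specify it.
    """
    by_dataset = defaultdict(list)
    by_language = defaultdict(list)
    for (dataset, language), value in scores.items():
        by_dataset[dataset].append(value)
        by_language[language].append(value)

    # First average within each group, then average the group-level scores.
    dataset_aggregate = mean([mean(values) for values in by_dataset.values()])
    language_aggregate = mean([mean(values) for values in by_language.values()])
    return dataset_aggregate, language_aggregate


# Hypothetical scores for two datasets and two languages.
example_scores = {
    ("dataset_a", "de"): 60.0,
    ("dataset_a", "fr"): 60.0,
    ("dataset_b", "de"): 30.0,
}
dataset_agg, language_agg = aggregate_scores(example_scores)
print(f"dataset aggregate: {dataset_agg:.1f}")
print(f"language aggregate: {language_agg:.1f}")
```

The two aggregates differ whenever datasets cover different sets of languages, which is why reporting both gives a fuller picture than a single average.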


