MAUD: An Expert-Annotated Legal NLP Dataset for Merger Agreement Understanding

01/02/2023
by Steven H. Wang, et al.

Reading comprehension of legal text can be a particularly challenging task due to the length and complexity of legal clauses and a shortage of expert-annotated datasets. To address this challenge, we introduce the Merger Agreement Understanding Dataset (MAUD), an expert-annotated reading comprehension dataset based on the American Bar Association's 2021 Public Target Deal Points Study, with over 39,000 examples and over 47,000 total annotations. Our fine-tuned Transformer baselines show promising results, with models performing well above random on most questions. However, on a large subset of questions, there is still room for significant improvement. As the only expert-annotated merger agreement dataset, MAUD is valuable as a benchmark for both the legal profession and the NLP community.


