E-NER – An Annotated Named Entity Recognition Corpus of Legal Text

12/19/2022
by   Ting Wai Terence Au, et al.

Identifying named entities, such as persons, locations, or organizations, in documents can highlight key information for readers. Training Named Entity Recognition (NER) models requires an annotated data set, and creating one is a time-consuming, labour-intensive task. Publicly available NER data sets exist for general English, and there has recently been interest in developing NER for legal text. However, prior work and the experimental results reported here indicate that performance degrades significantly when NER methods trained on a general English data set are applied to legal text. We describe a publicly available legal NER data set, called E-NER, based on legal company filings available from the US Securities and Exchange Commission's EDGAR data set. Training a number of different NER algorithms on the general English CoNLL-2003 corpus and testing on our test collection confirmed significant degradations in accuracy, as measured by the F1-score, of between 29.4% and 60.4%, compared with training and testing on the E-NER collection.
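To make the F1-score mentioned above concrete, the following is a minimal sketch of exact-match, entity-level F1 over BIO-tagged sequences, the convention commonly used with CoNLL-2003. The abstract does not state the exact scoring scheme used for E-NER, so treat this as an assumption; the function names (`bio_to_spans`, `span_f1`) and the example tags are hypothetical and not taken from the paper.

```python
from typing import List, Set, Tuple

# (start index, end index exclusive, entity type)
Span = Tuple[int, int, str]


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Convert one BIO tag sequence into a set of entity spans."""
    spans: Set[Span] = set()
    start, label = None, None
    # A trailing "O" sentinel closes any span still open at the end.
    for i, tag in enumerate(tags + ["O"]):
        continues = tag.startswith("I-") and label == tag[2:]
        if continues:
            continue
        if label is not None:
            spans.add((start, i, label))
        if tag.startswith("B-") or tag.startswith("I-"):
            # Lenient choice: an "I-" tag with no matching open span starts a new one.
            start, label = i, tag[2:]
        else:
            start, label = None, None
    return spans


def span_f1(gold: List[List[str]], pred: List[List[str]]) -> float:
    """Exact-match entity-level F1 over a list of tagged sentences."""
    gold_spans, pred_spans = set(), set()
    for idx, (g, p) in enumerate(zip(gold, pred)):
        gold_spans |= {(idx, *s) for s in bio_to_spans(g)}
        pred_spans |= {(idx, *s) for s in bio_to_spans(p)}
    true_pos = len(gold_spans & pred_spans)
    precision = true_pos / len(pred_spans) if pred_spans else 0.0
    recall = true_pos / len(gold_spans) if gold_spans else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Illustrative tags only; not drawn from E-NER or CoNLL-2003.
gold = [["B-PER", "I-PER", "O", "B-ORG"]]
pred = [["B-PER", "I-PER", "O", "B-LOC"]]
print(span_f1(gold, pred))
```

In this toy case one of the two predicted entities matches the gold annotation in both boundaries and type, so precision and recall are each one half. A span counts as correct only if both its boundaries and its type match, which is why a type confusion such as ORG versus LOC costs both precision and recall.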


