Named Entity Recognition in the Legal Domain using a Pointer Generator Network

12/17/2020
by Stavroula Skylaki, et al.

Named Entity Recognition (NER) is the task of identifying and classifying named entities in unstructured text. In the legal domain, named entities of interest may include the case parties, judges, names of courts, case numbers, references to laws, etc. We study the problem of legal NER with noisy text extracted from PDF files of court cases filed in US courts. "Gold standard" training data for NER systems annotate each token of the text with the corresponding entity or non-entity label. We work with only partially complete training data, which differ from gold standard NER data in that the exact location of the entities in the text is unknown and the entities may contain typos and/or OCR mistakes. To overcome the challenges of our noisy training data, e.g. text extraction errors and/or typos and unknown label indices, we formulate the NER task as a text-to-text sequence generation task and train a pointer generator network to generate the entities in the document rather than label them. We show that the pointer generator can be effective for NER in the absence of gold standard data and that it outperforms the common NER neural network architectures on long legal documents.
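To illustrate why a pointer generator network suits noisy text, the sketch below shows one decoding step of the commonly used pointer-generator mixture: the output distribution blends a distribution over a fixed vocabulary with a copy distribution given by attention over the source tokens. This is a minimal illustration of the general technique, not the authors' implementation. The function name `final_distribution`, the gate `p_gen`, and all numbers in the example are assumptions chosen for illustration.

```python
from collections import defaultdict


def final_distribution(p_gen, vocab_dist, attention, source_tokens):
    """Blend generating and copying for one decoding step (illustrative sketch).

    p_gen         -- probability of generating from the vocabulary (0..1)
    vocab_dist    -- dict mapping vocabulary word -> generation probability
    attention     -- attention weight for each source position
    source_tokens -- source document tokens, aligned with `attention`
    """
    if not 0.0 <= p_gen <= 1.0:
        raise ValueError("p_gen must lie in [0, 1]")
    if len(attention) != len(source_tokens):
        raise ValueError("attention and source_tokens must have equal length")

    final = defaultdict(float)
    # Generation part: scale the vocabulary distribution by p_gen.
    for word, prob in vocab_dist.items():
        final[word] += p_gen * prob
    # Copy part: each source token receives (1 - p_gen) times its attention.
    # Tokens repeated in the source accumulate the weight of every position.
    for token, weight in zip(source_tokens, attention):
        final[token] += (1.0 - p_gen) * weight
    return dict(final)


# Hypothetical example: the source contains the OCR-damaged token "Jhon",
# which is absent from the vocabulary but can still be emitted by copying.
source = ["plaintiff", "Jhon", "Smith", "v.", "Acme"]
attn = [0.1, 0.4, 0.4, 0.05, 0.05]
vocab = {"John": 0.5, "Smith": 0.3, "<eos>": 0.2}
dist = final_distribution(0.25, vocab, attn, source)
print(sorted(dist.items(), key=lambda kv: -kv[1]))
```

Because the copy term assigns probability to tokens exactly as they appear in the source, out-of-vocabulary strings such as misspelled party names remain reachable, which is consistent with the motivation given in the abstract for generating entities rather than labeling tokens.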


