Towards De-identification of Legal Texts

10/09/2019
by Diego Garat, et al.

In many countries, the personal information that can be published or shared between organizations is regulated; documents must therefore undergo de-identification to eliminate or obfuscate confidential data. Our work focuses on the de-identification of legal texts, where the goal is to hide the names of the actors involved in a lawsuit without losing the sense of the story. We present a first evaluation of NLP tools on our corpus in tasks such as segmentation, tokenization, and named entity recognition, and we analyze several evaluation measures for our de-identification task. Results are meager: 84 that might lead to the re-identification of involved names. We conclude that tools must be strongly adapted for processing texts of this particular domain.


