Unsupervised Simplification of Legal Texts

09/01/2022
by Mert Cemri, et al.

Legal text processing is an emerging field in natural language processing (NLP). Legal texts contain unique jargon and complex linguistic attributes in vocabulary, semantics, syntax, and morphology. Text simplification (TS) methods specific to the legal domain are therefore important both for helping ordinary people comprehend legal text and for providing inputs to high-level models in mainstream legal NLP applications. While a recent study proposed a rule-based TS method for legal text, learning-based TS in the legal domain has not been considered previously. Here we introduce an unsupervised simplification method for legal texts (USLT). USLT performs domain-specific TS by replacing complex words and splitting long sentences. To this end, USLT detects complex words in a sentence, generates substitution candidates via a masked-transformer model, and selects a candidate for substitution based on a rank score. Afterward, USLT recursively decomposes long sentences into a hierarchy of shorter core and context sentences while preserving semantic meaning. We demonstrate that USLT outperforms state-of-the-art domain-general TS methods in text simplicity while keeping the semantics intact.
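The pipeline described in the abstract (detect complex words, generate candidates, pick one by a rank score, then split long sentences recursively) can be illustrated with a small sketch. This is not the authors' implementation. Everything in it is an assumption made for illustration: the frequency table, the threshold, the `rank_score` weighting, and the connective markers are invented, and `toy_generator` is a lookup table standing in for a masked-transformer model. The splitter is also a crude stand-in that cuts at connectives and does not build the core/context hierarchy the abstract mentions.

```python
from typing import Callable, Dict, List, Sequence

# Toy relative word frequencies, invented for illustration (higher = more common).
WORD_FREQ: Dict[str, float] = {
    "the": 1.0, "and": 1.0, "shall": 0.6, "pay": 0.8, "send": 0.8,
    "rent": 0.7, "tenant": 0.5, "leave": 0.8, "property": 0.6,
    "remit": 0.05, "vacate": 0.04, "premises": 0.1,
}

# Stand-in for a masked language model: candidate substitutes for a masked
# word, each with a made-up contextual fit score in [0, 1].
TOY_CANDIDATES: Dict[str, Dict[str, float]] = {
    "remit": {"pay": 0.9, "send": 0.6},
    "vacate": {"leave": 0.8},
    "premises": {"property": 0.7},
}

Generator = Callable[[List[str], int], Dict[str, float]]


def toy_generator(tokens: List[str], index: int) -> Dict[str, float]:
    """Hypothetical candidate generator; a real one would query a masked LM."""
    return TOY_CANDIDATES.get(tokens[index].lower(), {})


def rank_score(candidate: str, fit: float, alpha: float = 0.5) -> float:
    """Assumed rank score: weighted mix of word frequency and contextual fit."""
    return alpha * WORD_FREQ.get(candidate, 0.0) + (1 - alpha) * fit


def simplify_words(sentence: str, generate: Generator = toy_generator,
                   threshold: float = 0.2) -> str:
    """Replace words rarer than `threshold` with the best-ranked candidate."""
    tokens = sentence.split()
    for i, token in enumerate(tokens):
        if WORD_FREQ.get(token.lower(), 0.0) >= threshold:
            continue  # common enough, keep as is
        candidates = generate(tokens, i)
        if candidates:  # words with no candidates are left unchanged
            tokens[i] = max(candidates,
                            key=lambda c: rank_score(c, candidates[c]))
    return " ".join(tokens)


def split_sentence(sentence: str, max_words: int = 5,
                   markers: Sequence[str] = (" provided that ", " and ")) -> List[str]:
    """Recursively cut an over-long sentence at the first marker found."""
    if len(sentence.split()) <= max_words:
        return [sentence] if sentence.strip() else []
    for marker in markers:
        head, sep, tail = sentence.partition(marker)
        if sep:  # both parts are strictly shorter, so recursion terminates
            return (split_sentence(head, max_words, markers)
                    + split_sentence(tail, max_words, markers))
    return [sentence]  # no marker found: leave the sentence whole


simplified = simplify_words("the tenant shall remit rent and vacate the premises")
parts = split_sentence(simplified)
```

With these toy tables, `simplified` is "the tenant shall pay rent and leave the property" and `parts` is `["the tenant shall pay rent", "leave the property"]`. Swapping `toy_generator` for a function backed by a real masked language model would leave the rest of the sketch unchanged, which is the reason the generator is passed in as a parameter.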

