Customizing Contextualized Language Models for Legal Document Reviews

02/10/2021
by   Shohreh Shaghaghian, et al.

Inspired by inductive transfer learning in computer vision, many efforts have been made to train contextualized language models that boost the performance of natural language processing tasks. These models are mostly trained on large general-domain corpora such as news, books, or Wikipedia. Although these pre-trained generic language models capture the semantic and syntactic structure of a language well, applying them in a real-world, domain-specific scenario still requires attention to practical considerations such as token distribution shifts, inference time, memory, and simultaneous proficiency in multiple tasks. In this paper, we focus on the legal domain and present how different language models trained on general-domain corpora can be best customized for multiple legal document reviewing tasks. We compare their efficiencies with respect to task performances and present practical considerations.

