LegaLMFiT: Efficient Short Legal Text Classification with LSTM Language Model Pre-Training

09/02/2021
by Benjamin Clavié, et al.

Large Transformer-based language models such as BERT have led to broad performance improvements on many NLP tasks. Domain-specific variants of these models have demonstrated excellent performance on a variety of specialised tasks. In legal NLP, BERT-based models have led to new state-of-the-art results on multiple tasks. The exploration of these models has demonstrated the importance of capturing the specificities of legal language and its vocabulary. However, such approaches suffer from high computational costs, leading to a higher ecological impact and lower accessibility. Our findings, focusing on English-language legal text, show that lightweight LSTM-based language models are able to capture enough information from a small legal text pretraining corpus to achieve excellent performance on short legal text classification tasks. This is achieved with significantly reduced computational overhead compared to BERT-based models. However, our method also shows degraded performance on a more complex task, multi-label classification of longer documents, highlighting the limitations of this lightweight approach.
