AraLegal-BERT: A pretrained language model for Arabic Legal text

10/15/2022
by   Muhammad Al-Qurishi, et al.

The effectiveness of the BERT model on multiple linguistic tasks has been well documented. Its potential for narrow, specialized domains such as law, however, has not been fully explored. In this paper, we examine how BERT can be applied to the Arabic legal domain, customizing the model for several downstream tasks and training it from scratch on several domain-relevant training and testing datasets. We introduce AraLegal-BERT, a bidirectional encoder Transformer-based model that has been thoroughly tested and carefully optimized to amplify the impact of NLP-driven solutions in jurisprudence, legal documents, and legal practice. We fine-tuned AraLegal-BERT and evaluated it against three Arabic-language BERT variants on three natural language understanding (NLU) tasks. The results show that the base version of AraLegal-BERT achieves better accuracy than the general and original BERT on legal text.
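Training a BERT variant from scratch, as described above, relies on the masked-language-model (MLM) objective: a fraction of input tokens is corrupted and the model learns to recover the originals. As a minimal sketch of the standard BERT corruption scheme (15% of tokens selected; of those, 80% replaced with [MASK], 10% with a random vocabulary token, 10% left unchanged — the paper's own tokenizer and corpus are not shown here, so the function below is illustrative only):

```python
import random

def mlm_mask(tokens, vocab, mask_token="[MASK]", mask_prob=0.15, rng=None):
    """Apply BERT-style MLM corruption to a token sequence.

    Returns (corrupted, labels): labels[i] holds the original token
    at selected positions and None elsewhere, so the loss is computed
    only over corrupted positions.
    """
    rng = rng or random.Random()
    corrupted, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:          # select ~15% of tokens
            labels.append(tok)
            r = rng.random()
            if r < 0.8:
                corrupted.append(mask_token)  # 80% -> [MASK]
            elif r < 0.9:
                corrupted.append(rng.choice(vocab))  # 10% -> random token
            else:
                corrupted.append(tok)         # 10% -> unchanged
        else:
            labels.append(None)
            corrupted.append(tok)
    return corrupted, labels
```

In practice, libraries such as Hugging Face Transformers implement this collation step internally; the sketch only makes the objective explicit.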


Related research

10/06/2020 · LEGAL-BERT: The Muppets straight out of Law School
BERT has achieved impressive performance in several NLP tasks. However, ...

08/11/2023 · Improving Zero-Shot Text Matching for Financial Auditing with Large Language Models
Auditing financial documents is a very tedious and time-consuming proces...

03/12/2021 · Comparing the Performance of NLP Toolkits and Evaluation Measures in Legal Tech
Recent developments in Natural Language Processing have led to the intro...

12/25/2021 · CABACE: Injecting Character Sequence Information and Domain Knowledge for Enhanced Acronym and Long-Form Extraction
Acronyms and long-forms are commonly found in research documents, more s...

07/08/2022 · ABB-BERT: A BERT model for disambiguating abbreviations and contractions
Abbreviations and contractions are commonly found in text across differe...

05/04/2023 · Leveraging BERT Language Model for Arabic Long Document Classification
Given the number of Arabic speakers worldwide and the notably large amou...

02/21/2021 · Pre-Training BERT on Arabic Tweets: Practical Considerations
Pretraining Bidirectional Encoder Representations from Transformers (BER...
