ArNLI: Arabic Natural Language Inference for Entailment and Contradiction Detection

09/28/2022
by Khloud Al Jallad, et al.

Natural Language Inference (NLI) is an active research topic in natural language processing, and contradiction detection between sentences is a special case of NLI. It is considered a difficult NLP task that has a large impact when added as a component in many NLP applications, such as question answering systems and text summarization. Arabic is one of the most challenging low-resource languages for contradiction detection because of its rich lexical and semantic ambiguity. We have created a data set of more than 12k sentences, named ArNLI, that will be publicly available. Moreover, we have applied a new model inspired by the solutions for contradiction detection in English proposed at Stanford. We propose an approach to detect contradictions between pairs of Arabic sentences using a contradiction vector combined with a language model vector as input to a machine learning model. We analyzed the results of different traditional machine learning classifiers and compared them on our data set (ArNLI) and on automatic translations of the English PHEME and SICK data sets. The best results were achieved using a Random Forest classifier, with an accuracy of 99
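The feature construction described in the abstract (a contradiction vector concatenated with a language model vector) might be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not list the actual features, so the four features here (negation mismatch, number mismatch, antonym hit, token overlap), the tiny lexicons, and the function names `contradiction_vector` and `combine_features` are all hypothetical, and the language model vector is passed in as a plain list of floats from whatever embedding model is used.

```python
import re
from typing import List, Sequence

# Hypothetical, deliberately tiny lexicons for illustration only;
# they are NOT taken from the paper.
NEGATION_WORDS = {"لا", "لم", "لن", "ليس"}
ANTONYM_PAIRS = {("كبير", "صغير"), ("حار", "بارد")}


def tokenize(sentence: str) -> List[str]:
    # Simple tokenizer: \w is Unicode-aware for str patterns, so it matches Arabic letters.
    return re.findall(r"\w+", sentence)


def contradiction_vector(premise: str, hypothesis: str) -> List[float]:
    """Assumed hand-crafted features hinting at a contradiction between two sentences."""
    p_tokens, h_tokens = tokenize(premise), tokenize(hypothesis)
    p_set, h_set = set(p_tokens), set(h_tokens)

    # 1.0 if exactly one of the two sentences contains a negation word.
    neg_mismatch = float(bool(p_set & NEGATION_WORDS) != bool(h_set & NEGATION_WORDS))

    # 1.0 if both sentences mention numbers and the sets of numbers differ.
    p_nums = {t for t in p_tokens if t.isdigit()}
    h_nums = {t for t in h_tokens if t.isdigit()}
    num_mismatch = float(bool(p_nums) and bool(h_nums) and p_nums != h_nums)

    # 1.0 if a known antonym pair is split across the two sentences.
    antonym_hit = float(any(
        (a in p_set and b in h_set) or (b in p_set and a in h_set)
        for a, b in ANTONYM_PAIRS
    ))

    # Jaccard overlap of the token sets.
    union = p_set | h_set
    overlap = len(p_set & h_set) / len(union) if union else 0.0

    return [neg_mismatch, num_mismatch, antonym_hit, overlap]


def combine_features(contra: Sequence[float], lm_vector: Sequence[float]) -> List[float]:
    # Concatenate the contradiction vector with the language model vector.
    return list(contra) + list(lm_vector)


if __name__ == "__main__":
    contra = contradiction_vector("الجو حار اليوم", "الجو ليس حار اليوم")
    # The two-element embedding is a placeholder for a real language model vector.
    print(combine_features(contra, [0.1, 0.2]))
```

The combined vector would then be passed to a traditional classifier such as a Random Forest; that step is omitted here to keep the sketch free of third-party dependencies.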

research
07/27/2023

Improving Natural Language Inference in Arabic using Transformer Models and Linguistically Informed Pre-Training

This paper addresses the classification of Arabic text data in the field...
research
06/03/2022

TCE at Qur'an QA 2022: Arabic Language Question Answering Over Holy Qur'an Using a Post-Processed Ensemble of BERT-based Models

In recent years, we witnessed great progress in different tasks of natur...
research
11/25/2020

A Panoramic Survey of Natural Language Processing in the Arab World

The term natural language refers to any system of symbolic communication...
research
06/17/2021

A Deep Belief Network Classification Approach for Automatic Diacritization of Arabic Text

Deep learning has emerged as a new area of machine learning research. It...
research
02/07/2021

An open access NLP dataset for Arabic dialects : Data collection, labeling, and model construction

Natural Language Processing (NLP) is today a very active field of resear...
research
01/10/2022

A Survey of Plagiarism Detection Systems: Case of Use with English, French and Arabic Languages

In academia, plagiarism is certainly not an emerging concern, but it bec...
research
09/28/2022

Applying Machine Learning for Duplicate Detection, Throttling and Prioritization of Equipment Commissioning Audits at Fulfillment Network

VQ (Vendor Qualification) and IOQ (Installation and Operation Qualificat...
