A Benchmark Arabic Dataset for Commonsense Explanation

by Saja AL-Tawalbeh, et al.

Language comprehension and commonsense knowledge validation by machines are challenging tasks that remain under-researched and under-evaluated for Arabic text. In this paper, we present a benchmark Arabic dataset for commonsense explanation. The dataset consists of Arabic sentences that do not make sense, each paired with three candidate explanations, from which the one that explains why the sentence is false must be selected. Furthermore, this paper presents baseline results to support and encourage future research and evaluation in this field. The dataset is distributed under the Creative Commons CC-BY-SA 4.0 license and can be found on GitHub.
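To make the task format concrete, the following is a minimal sketch of how one item (a nonsensical sentence, three candidate explanations, and the index of the correct one) might be represented and scored with accuracy. The class name `ExplanationItem`, its field names, the `always_first` baseline, and the English example items are all assumptions made for illustration. They are not taken from the released dataset, whose actual file layout and column names may differ.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


@dataclass(frozen=True)
class ExplanationItem:
    """One multiple-choice item (hypothetical schema, not the dataset's own)."""
    sentence: str                   # the sentence that does not make sense
    options: Tuple[str, str, str]   # three candidate explanations
    answer: int                     # index (0, 1, or 2) of the correct explanation


def accuracy(items: Sequence[ExplanationItem],
             predict: Callable[[ExplanationItem], int]) -> float:
    """Fraction of items for which predict() returns the correct option index."""
    if not items:
        raise ValueError("cannot compute accuracy on an empty item list")
    correct = sum(1 for item in items if predict(item) == item.answer)
    return correct / len(items)


def always_first(item: ExplanationItem) -> int:
    """Trivial illustrative baseline: always choose the first option."""
    return 0


# Invented English items for illustration only; the real dataset is in Arabic.
items = [
    ExplanationItem(
        sentence="He put the elephant in his pocket.",
        options=(
            "An elephant is far too large to fit in a pocket.",
            "Pockets are usually made of fabric.",
            "Elephants live in herds.",
        ),
        answer=0,
    ),
    ExplanationItem(
        sentence="She drank a glass of sand.",
        options=(
            "Glasses can be made of plastic.",
            "Sand is found on beaches.",
            "Sand is a solid and cannot be drunk.",
        ),
        answer=2,
    ),
]

score = accuracy(items, always_first)
```

A real system would replace `always_first` with a model that scores each of the three options against the sentence and returns the index of the highest-scoring one; the `accuracy` function is unchanged by that substitution.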



