TARJAMAT: Evaluation of Bard and ChatGPT on Machine Translation of Ten Arabic Varieties

by Karima Kadaoui, et al.

Large language models (LLMs) fine-tuned to follow human instructions have recently emerged as a breakthrough in AI. Models such as Google Bard and OpenAI ChatGPT, for example, are surprisingly powerful tools for question answering, code debugging, and dialogue generation. Despite the purported multilingual proficiency of these models, their linguistic inclusivity remains insufficiently explored. To address this gap, we present a thorough assessment of Bard and ChatGPT (encompassing both GPT-3.5 and GPT-4) on machine translation across ten varieties of Arabic. Our evaluation covers diverse Arabic varieties such as Classical Arabic, Modern Standard Arabic, and several dialectal variants. Furthermore, we conduct a human-centric study to examine how well the most recent model, Bard, follows human instructions during translation tasks. Our analysis indicates that LLMs may encounter challenges with certain Arabic dialects, particularly those for which minimal public data exists, such as the Algerian and Mauritanian dialects. However, they perform satisfactorily on more prevalent dialects, albeit occasionally trailing established commercial systems like Google Translate. Additionally, our analysis reveals that Bard has only a limited ability to follow human instructions in translation contexts. Collectively, our findings underscore that prevailing LLMs remain far from inclusive, with only limited ability to cater for the linguistic and cultural intricacies of diverse communities.




