A Semi-supervised Approach for a Better Translation of Sentiment in Dialectical Arabic UGT

10/21/2022
by Hadeel Saadany, et al.

In the online world, Machine Translation (MT) systems are extensively used to translate User-Generated Text (UGT) such as reviews, tweets, and social media posts, where the main message is often the author's positive or negative attitude towards the topic of the text. However, MT systems still lack accuracy for some low-resource languages and sometimes make critical translation errors that completely flip the sentiment polarity of the target word or phrase and hence deliver the wrong affective message. This is particularly noticeable in texts that do not follow common lexico-grammatical standards, such as the dialectical Arabic (DA) used on online platforms. In this research, we aim to improve the translation of sentiment in UGT written in dialectical Arabic into English. Given the scarcity of gold-standard parallel DA-EN data in the UGT domain, we introduce a semi-supervised approach that exploits both monolingual and parallel data to train a Neural MT (NMT) system initialised by a cross-lingual language model trained with supervised and unsupervised modeling objectives. We assess the accuracy of sentiment translation by our proposed system through a numerical 'sentiment-closeness' measure as well as human evaluation. We show that our semi-supervised MT system can significantly help correct sentiment errors detected in the online translation of dialectical Arabic UGT.
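
The abstract does not define the 'sentiment-closeness' measure, so the following is only an illustrative sketch of what a score of this kind could look like, not the authors' formulation. It assumes a hypothetical toy English lexicon (`TOY_LEXICON`), a crude negation rule, and a comparison between a reference translation and an MT output; the names `polarity` and `sentiment_closeness` are invented for this example.

```python
# Illustrative sketch only: NOT the measure used in the paper.
# TOY_LEXICON and NEGATORS are hypothetical; a real system would rely on a
# trained sentiment classifier rather than a hand-written word list.
TOY_LEXICON = {
    "great": 1.0,
    "love": 1.0,
    "good": 0.5,
    "bad": -0.5,
    "terrible": -1.0,
    "hate": -1.0,
}
NEGATORS = {"not", "no", "never"}


def polarity(text: str) -> float:
    """Return a crude polarity score in [-1, 1] for an English sentence."""
    scores = []
    negate = False
    for token in text.lower().split():
        word = token.strip(".,!?\"'")
        if word in NEGATORS:
            # Simplification: a negator flips the next sentiment word found,
            # however far away it is.
            negate = True
            continue
        if word in TOY_LEXICON:
            value = TOY_LEXICON[word]
            scores.append(-value if negate else value)
            negate = False
    if not scores:
        return 0.0  # no sentiment-bearing word: treat as neutral
    return sum(scores) / len(scores)


def sentiment_closeness(reference: str, hypothesis: str) -> float:
    """Return 1.0 when polarities match and 0.0 when they are fully opposed."""
    return 1.0 - abs(polarity(reference) - polarity(hypothesis)) / 2.0
```

Under these assumptions, a translation that flips "great" to "terrible" scores 0.0 against its reference, while a translation that preserves the polarity scores 1.0. This is the kind of critical error the abstract describes as flipping the sentiment polarity.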


