Semantic Enrichment of Nigerian Pidgin English for Contextual Sentiment Classification

03/27/2020
by Wuraola Fisayo Oyewusi, et al.

Nigerian Pidgin, an adaptation of English, has evolved over the years through multi-language code switching, code mixing, and linguistic adaptation. While Pidgin preserves many words from the standard English corpus, in both spelling and pronunciation, the fundamental meanings of these words have changed significantly. For example, 'ginger' is not a plant but an expression of motivation, and 'tank' is not a container but an expression of gratitude. The implication is that the current approach of applying English sentiment analysis directly to social media text from Nigeria is sub-optimal, because it cannot capture the semantic variation and contextual evolution in the contemporary meaning of these words. In practice, although many words in Nigerian Pidgin are the same as in standard English, sentiment analysis models built for English are not designed to capture the full intent of Nigerian Pidgin, whether it is used alone or code-mixed. By augmenting scarce human-labelled code-changed text with ample synthetic code-reformatted text and meaning, we achieve significant improvements in sentiment scoring. Our research explores how to understand sentiment in an intrasentential code-mixing and code-switching context where there has been significant word localization. This work presents 300 VADER-lexicon-compatible Nigerian Pidgin sentiment tokens with their scores, and 14,000 gold-standard Nigerian Pidgin tweets with their sentiment labels.
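To illustrate why localized meanings matter for lexicon-based scoring, the following is a minimal sketch, not the authors' implementation. It assumes a toy English lexicon and a toy Pidgin lexicon with valences on a VADER-style scale from -4 to +4. Every score and both helper names (`merge_lexicons`, `score_text`) are hypothetical placeholders chosen for illustration; none are values or functions from the paper, and the scorer is a plain sum of token valences rather than VADER's full rule set.

```python
# Illustrative sketch only. All valence scores below are made-up
# placeholders, NOT the scores published with the paper.

# Tiny stand-in for an English sentiment lexicon (hypothetical values).
ENGLISH_LEXICON = {"good": 1.9, "bad": -2.5}

# Hypothetical Pidgin entries: 'ginger' (motivation) and 'tank' (gratitude)
# carry positive sentiment that an English-only lexicon does not record.
PIDGIN_LEXICON = {"ginger": 1.5, "tank": 1.8}


def merge_lexicons(base, overrides):
    """Return a new lexicon in which entries from `overrides` take precedence."""
    merged = dict(base)
    merged.update(overrides)
    return merged


def score_text(text, lexicon):
    """Sum the valence of each known token; unknown tokens contribute 0."""
    total = 0.0
    for raw_token in text.lower().split():
        token = raw_token.strip(".,!?'\"")  # drop surrounding punctuation
        total += lexicon.get(token, 0.0)
    return total


sentence = "Tank you, that ginger me!"
english_only = score_text(sentence, ENGLISH_LEXICON)
enriched = score_text(sentence, merge_lexicons(ENGLISH_LEXICON, PIDGIN_LEXICON))
print(english_only, enriched)
```

With the English-only lexicon the sentence contains no scored tokens, while the enriched lexicon assigns it a positive total. With the `vaderSentiment` package, the analogous step would presumably be to update the analyzer's `lexicon` dictionary with the Pidgin tokens before scoring.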


research
05/30/2020

A Sentiment Analysis Dataset for Code-Mixed Malayalam-English

There is an increasing demand for sentiment analysis of text from social...
research
05/30/2020

Corpus Creation for Sentiment Analysis in Code-Mixed Tamil-English Text

Understanding the sentiment of a comment from a video or an image is an ...
research
11/02/2016

Towards Sub-Word Level Compositions for Sentiment Analysis of Hindi-English Code Mixed Text

Sentiment analysis (SA) using code-mixed data from social media has seve...
research
09/07/2020

NLP-CIC at SemEval-2020 Task 9: Analysing sentiment in code-switching language using a simple deep-learning classifier

Code-switching is a phenomenon in which two or more languages are used i...
research
06/13/2019

Improved Sentiment Detection via Label Transfer from Monolingual to Synthetic Code-Switched Text

Multilingual writers and speakers often alternate between two languages ...
research
03/11/2018

Preparing Bengali-English Code-Mixed Corpus for Sentiment Analysis of Indian Languages

Analysis of informative contents and sentiments of social users has been...
research
10/06/2020

Investigating African-American Vernacular English in Transformer-Based Text Generation

The growth of social media has encouraged the written use of African Ame...
