In the current age of the internet, online social media sites have become accessible to everyone, and people tend to post their personal experiences, current events, and local and global news. As a result, daily social media usage keeps growing, producing a large volume of data that has become an important source for many types of research analysis. Moreover, social media data are generated in real time and are easy to monitor. Therefore, several research works have used social media data to perform different types of real-time predictions, such as stock movement prediction (Nguyen et al., 2015), relation extraction (Ritter et al., 2015), and natural disaster prediction (Yoo et al., 2018; Mai et al., 2020).
Twitter is one such social media site that can be accessed through people's laptops and smartphones. The rapid growth of smartphone and laptop usage enables people to share an emergency they observe in real time. For this reason, many disaster relief organizations and news agencies are interested in monitoring Twitter data programmatically. However, unlike long articles, tweets are short texts, and they pose additional challenges due to their shortness, sparsity (i.e., diverse word content) (Chen et al., 2011), velocity (the rapid growth of short texts such as SMS messages and tweets), and misspellings (Alsmadi and Gan, 2019). For these reasons, it is very challenging to determine whether a person's words are announcing a disaster or not. For example, a tweet like "!!" tells us about a person's experience at a concert, and we can tell that he enjoyed it because of the word "". Even though the tweet contains the word "", it does not signal any danger or emergency; rather, the word describes the colorful decoration of the stage. Now consider another tweet, "". Here, the word "" means disaster, and the tweet describes an emergency. The two examples show that one word can have multiple meanings depending on its context. Therefore, understanding the context of words is important for analyzing a tweet's sentiment.
Neural network-based methods such as Skip-gram (Mikolov et al., 2013) and FastText (Bojanowski et al., 2016) are popular for learning word embeddings from large corpora and have been used to solve different types of NLP tasks. These methods have also been used for sentiment analysis of Twitter data (Deho et al., 2018; Poornima and Priya, 2020). However, these embedding learning methods provide a static embedding for each word in a document. Hence, the meaning of the word "" would remain the same in the two examples above under these methods.
To handle this problem, the authors of (Devlin et al., 2018) proposed a contextual embedding learning model, Bidirectional Encoder Representations from Transformers (BERT), that provides embeddings of a word based on its context words. In different types of NLP tasks, such as text classification (Sun et al., 2019) and entity recognition (Hakala and Pyysalo, 2019), the BERT model outperformed traditional embedding learning models. However, it remains interesting to discover how contextual embeddings could help in understanding disaster-related texts. For this reason, we analyze the disaster prediction task on Twitter data using both context-free and contextual embeddings in this study. We use traditional machine learning methods and neural network models for the prediction task, with the word embeddings serving as input to the models. We show that contextual embeddings work better than the other word embeddings for predicting disaster-related tweets. Finally, we provide an extensive discussion to analyze the results.
The main contributions of this paper are summarized as follows.
We analyze a real-life natural language online social network dataset, Twitter data, to identify challenges in human sentiment analysis for disaster-type tweet prediction.
We apply both contextual and context-free embeddings in tweet representations for disaster prediction through machine learning methods and show that contextual embeddings (BERT) can improve the accuracy of disaster prediction compared with context-free embeddings.
We provide a detailed explanation of our method and results and publicly share our code, enabling researchers to run our experiments and reproduce our results for future research directions (https://github.com/ashischanda/sentiment-analysis).
The rest of the paper is organized as follows. In section 2, related works are introduced. The main methodology of this paper is elaborated in section 3. The dataset and the experiments are presented in sections 4 and 5, respectively. Finally, the conclusion is drawn in section 6.
2. Related Works
Many research works analyzed Twitter data to understand emergency situations and predict disasters (Karami et al., 2020; Zou et al., 2018; Ashktorab et al., 2014; Olteanu et al., 2014). One group of researchers used text mining and statistical approaches to understand crises (Karami et al., 2020; Zou et al., 2018), while another group focused on clustering text data to identify groups of tweets related to disasters (Ashktorab et al., 2014; Olteanu et al., 2014). Later, different traditional machine learning models were used to analyze Twitter data and predict disaster or emergency situations, with the words of a tweet represented as embeddings (Palshikar et al., 2018; Algur and Venugopal, 2021; Singh et al., 2019). For example, Palshikar et al. (Palshikar et al., 2018) proposed a weakly supervised model where words are represented with a bag-of-words (BOW) model. Moreover, frequency-based word representation was used in (Algur and Venugopal, 2021), and Singh et al. (Singh et al., 2019) used a Markov-based model to predict the location of tweets during a disaster. In a recent work (Pota et al., 2021), the authors proposed a pre-processing method for BERT-based sentiment analysis of tweets. However, it remains interesting to explore model performance with different word embeddings to observe how context words help to predict whether a tweet describes a disaster.
3. Methodology

In this section, we discuss our approach of leveraging word embeddings for disaster prediction from Twitter data using machine learning methods. We consider three types of word embeddings: 1) bag of words (BOW), 2) context-free embeddings, and 3) contextual embeddings. The word embeddings are used as input to both traditional machine learning methods and deep learning models for disaster prediction.
3.1. BOW embeddings
The bag-of-words (BOW) model is a common approach for representing the text of a document. If there are N words in the text vocabulary, then the BOW representation is a binary vector of length N, where each index of the vector corresponds to one word of the vocabulary. If a word exists in a document, then the corresponding index becomes one; otherwise, it remains zero. We use BOW embeddings of the Twitter data in three traditional machine learning methods, namely decision tree, random forest, and logistic regression, to predict the sentiment of a tweet.
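The BOW construction described above can be sketched in a few lines; the vocabulary and tweet below are illustrative toy values, not taken from the dataset.

```python
# Build a binary bag-of-words vector for a tweet (toy example).
vocabulary = ["fire", "storm", "love", "good", "earthquake"]

def bow_vector(tokens, vocabulary):
    """Return a binary vector: 1 if the vocabulary word occurs in the tweet, else 0."""
    token_set = set(tokens)
    return [1 if word in token_set else 0 for word in vocabulary]

tweet = ["forest", "fire", "near", "la", "ronge"]
print(bow_vector(tweet, vocabulary))  # -> [1, 0, 0, 0, 0]
```

Note that the vector records only word presence, which is why the ordering information discussed below is lost.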
Even though BOW is good for representing the words of a document, it loses contextual information because the order of words is not recorded in the binary structure. However, contextual information is required to understand and analyze the sentiment of a text. For this reason, we also use context-based embeddings for this sentiment analysis task.
3.2. Context-free embeddings
Many existing research works proposed learning word embeddings based on the co-occurrences of word pairs in documents. GloVe (Pennington et al., 2014) is one common method for learning word embeddings from word co-occurrences in documents. More recently, neural network-based models such as Skip-gram (Mikolov et al., 2013) and FastText (Bojanowski et al., 2016) became popular for learning word representations from documents and have been used for sentiment analysis.
In our research study, we use the pre-trained embeddings of three context-free embedding models (GloVe, Skip-gram, FastText) in a neural network-based model to analyze the sentiment of tweet data and predict disaster-related tweets. To represent a tweet with context-free embeddings, we take the average of the word embeddings of the tweet, following the same strategy as (Kenter et al., 2016). For the resulting vector of a tweet, we use a softmax function to predict the sentiment of the tweet. Suppose the vector of a tweet is x, we have a set of labels Y = {"positive", "negative"}, and W is the weight matrix of the softmax function with one row w_y per label. Then, the probability of the tweet being positive (i.e., a disaster) is calculated as follows:

P(y | x) = exp(w_y^T x) / Σ_{y' ∈ Y} exp(w_{y'}^T x)
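As a toy sketch of this softmax computation (the tweet vector, weight values, and dimensions below are illustrative, not learned parameters from the paper):

```python
import math

def softmax_probs(x, W):
    """Return P(y|x) for each label y, given one weight row W[y] per label."""
    scores = [sum(w_i * x_i for w_i, x_i in zip(w, x)) for w in W]
    m = max(scores)                       # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

# Toy 3-dimensional averaged tweet vector and weights for {"positive", "negative"}.
x = [0.5, -1.0, 2.0]
W = [[1.0, 0.0, 0.5],    # weight row for "positive" (disaster)
     [0.0, 1.0, -0.5]]   # weight row for "negative"
p_positive, p_negative = softmax_probs(x, W)
```

The two probabilities sum to one, and the label with the larger probability is taken as the prediction.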
Recently, deep neural networks have also been used for sentiment analysis. To observe how context-free embeddings work with deep neural networks, we use a bidirectional recurrent neural network with LSTM gates (Hochreiter and Schmidhuber, 1997). The Bi-LSTM model processes the input words of a tweet from left to right and in reverse. The Bi-LSTM block is followed by a fully connected layer with a sigmoid activation function to produce the output.
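The bidirectional idea can be illustrated with a minimal sketch: scan the token vectors left-to-right and right-to-left and concatenate the two final states. The recurrent step here is a toy stand-in for an LSTM cell (no gates), used only to show the data flow.

```python
# Toy bidirectional encoder: the recurrent update below is NOT a real LSTM cell,
# just a simple averaging step standing in for one.

def rnn_step(state, x):
    """Toy recurrent update mixing the previous state with the current token vector."""
    return [0.5 * s + 0.5 * v for s, v in zip(state, x)]

def bidirectional_encode(tokens):
    dim = len(tokens[0])
    fwd = [0.0] * dim
    for x in tokens:             # left-to-right pass
        fwd = rnn_step(fwd, x)
    bwd = [0.0] * dim
    for x in reversed(tokens):   # right-to-left pass
        bwd = rnn_step(bwd, x)
    return fwd + bwd             # concatenated final states

tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # toy word vectors of one tweet
encoded = bidirectional_encode(tokens)          # length 4: forward + backward state
```

In the real model, this concatenated representation is what the fully connected layer with sigmoid activation consumes.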
3.3. Contextual embeddings
Unlike the other word embeddings, BERT embeddings (Devlin et al., 2018) provide different vectors for the same word in different contexts. Recent advances in NLP have shown that the BERT model outperforms traditional embeddings in different NLP tasks, such as entity extraction and next sentence prediction. In our study, we investigate how much better contextual embeddings work than traditional embeddings in sentiment analysis. For this purpose, we use the pre-trained embeddings of BERT models in the same neural network models to predict disaster-related tweets.
Table 1. Examples of original tweets and the corresponding pre-processed tweets.
|Tweet (original)||Tweet (after preprocessing)|
|RockyFire Update = California Hwy. 20 closed in both directions due to Lake County fire - CAfire wildfires||rockyfire update california hwy 20 closed directions due lake county fire cafire wildfires|
|TheAtlantic That or they might be killed in an airplane accident in the night a car wreck||theatlantic might killed airplane accident night car wreck|
Table 2. Statistics of the training data after pre-processing.
|Total train data||7,613|
|Total positive data (or disaster tweets)||3,271|
|Total unique words||21,940|
|Total unique words with frequency > 1||6,816|
|Avg. length of tweets||12.5|
|Median length of tweets||13|
|Maximum length of tweets||29|
|Minimum length of tweets||1|
4. Dataset

For this study, we used a Twitter dataset from a recent Kaggle competition (Natural Language Processing with Disaster Tweets, https://www.kaggle.com/c/nlp-getting-started). Kaggle is a well-known platform for machine learning researchers where many research agencies share their data to pose different types of research problems. For example, many researchers have used data from Kaggle competitions to analyze real-life problems and propose models to solve them, in areas such as sentiment analysis, feature detection, and diagnosis prediction (Koumpouri et al., 2015; Tolkachev et al., 2020; Iglovikov et al., 2017; Yang et al., 2018; Yang and Ding, 2020).
In the selected Kaggle competition, a dataset of 10,876 tweets is given, and the task is to predict with a machine learning model which tweets are about real disasters and which are not. The dataset has two separate files, train (7,613 tweets) and test (3,263 tweets) data, where each row of the train data contains an id, the natural language text of the tweet, and a label. The labels are manually annotated by humans: a tweet is labeled as positive (one) if it is about a real disaster, and otherwise as negative (zero). The test data, on the other hand, has an id and the natural language text but no label. The competition site stores the labels of the test data privately, uses them to calculate test scores from users' model predictions, and maintains a leaderboard based on the test scores. This dataset was created by the Figure Eight company and was originally shared on their website (https://appen.com/open-source-datasets/).
We used the training data to train different machine learning models and predicted the test data labels using the trained models. We report both the train and test data scores in our experiments. Note that our purpose is not to achieve a high score in the competition but rather to use Twitter data to study our research goals.
4.1. Data pre-processing
Twitter data is natural language text, and it contains different types of typos, punctuation, abbreviations, and numbers. Therefore, before training machine learning models on the text, a pre-processing step is required to remove stop words and tokenize the text. We removed all stop words and punctuation from the training data and converted all words to lower-case letters. Table 1 shows some pre-processed tweets alongside the original tweets.
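The pre-processing steps above can be sketched as follows; the stop-word list here is a small illustrative subset, not the full list used in the experiments.

```python
import string

# Illustrative stop-word subset (the actual experiments use a full stop-word list).
STOPWORDS = {"in", "both", "due", "to", "the", "that", "or", "a", "an"}

def preprocess(tweet):
    """Lower-case, strip punctuation, and drop stop words from a tweet."""
    tweet = tweet.lower()
    tweet = tweet.translate(str.maketrans("", "", string.punctuation))
    return [tok for tok in tweet.split() if tok not in STOPWORDS]

original = "RockyFire Update = California Hwy. 20 closed in both directions"
print(preprocess(original))
# -> ['rockyfire', 'update', 'california', 'hwy', '20', 'closed', 'directions']
```

Applied to the first example of Table 1, this reproduces the pre-processed form shown there.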
4.2. Data analysis
Before running any machine learning methods on our data, we analyzed the dataset to obtain some insights. Table 2 shows some statistics of the training data after pre-processing the text. From the table, we find that 43% of the tweets are annotated as real disasters and 57% are not. There are a total of 21,940 unique words, of which only 6,816 occur more than once. The average length of tweets is 12.5 words. However, it is important to check the lengths of positive and negative tweets separately to verify whether they share common characteristics. Figure 1 shows the word-length distribution for both positive and negative tweets. The figure shows many negative tweets with small word lengths (below 10), but most positive and negative tweets have a word length of 10 to 20.
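The class-balance figures follow directly from the counts in Table 2 (3,271 disaster tweets out of 7,613 training tweets):

```python
# Sanity check of the class balance from the Table 2 counts.
total, positive = 7613, 3271
pos_share = positive / total
neg_share = (total - positive) / total
print(f"positive: {pos_share:.0%}, negative: {neg_share:.0%}")  # positive: 43%, negative: 57%
```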
We also analyzed the word frequencies of positive and negative tweets. Figure 2 shows the most frequent words in word clouds, where a larger font indicates a higher frequency. We can find some common words in both types of tweets (i.e., https, t, co, people). However, Figure 2(a) highlights many disaster-related words like storm, fire, bomber, death, and earthquake. On the other hand, Figure 2(b) highlights everyday words such as think, good, love, now, and time. From this figure, it is clear that the most frequent words differ between the two types of tweets, and understanding the meaning of the words is important to classify them.
5. Experiments

In our experimental study, we conduct several experiments on the real Twitter data to predict disaster-related tweets. We first describe the experimental settings and model training procedures in this section. Then, we analyze the experimental results in detail.
5.1. Experimental settings
5.1.1. Traditional ML models with BOW embeddings
From Table 2, we find that the training data has 21,940 unique words, of which 6,816 words have a frequency of more than 1. To avoid infrequent words, we considered only a vocabulary of these 6,816 words in our BOW representations. To represent a tweet with BOW embeddings, we used a binary array of length 6,816 that contains 1 if a word of the tweet is present in the vocabulary and 0 otherwise. We used the BOW embeddings to predict the sentiment of a tweet using three traditional machine learning models: 1) decision tree, 2) random forest, and 3) logistic regression. We used the Python scikit-learn package (https://scikit-learn.org/stable/) with all default parameters to train the models on our train dataset. After training the models, we predicted labels for the test data and submitted them to Kaggle to obtain the test scores.
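The vocabulary-filtering step (keeping only words that occur more than once) can be sketched with a word counter; the toy corpus below is illustrative, not the actual training tweets.

```python
from collections import Counter

# Toy corpus of tokenized tweets (illustrative only).
tweets = [["forest", "fire", "near", "town"],
          ["fire", "storm", "hits", "town"],
          ["lovely", "day"]]

# Count word frequencies across all tweets and keep words with frequency > 1.
counts = Counter(word for tweet in tweets for word in tweet)
vocabulary = sorted(w for w, c in counts.items() if c > 1)
print(vocabulary)  # -> ['fire', 'town']
```

The retained words then define the indices of the binary BOW array described above.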
5.1.2. Deep learning models with context-free embeddings
For this experiment, we chose three context-free methods: 1) Skip-gram (Mikolov et al., 2013), 2) FastText (Bojanowski et al., 2016), and 3) GloVe (Pennington et al., 2014). We used publicly available pre-trained embeddings of the Skip-gram and GloVe models that are trained on Wikipedia data (https://nlp.stanford.edu/projects/glove/). The pre-trained embeddings of FastText are collected from Mikolov et al. 2018 (Mikolov et al., 2018). All pre-trained embeddings have a dimension of 300.
5.1.3. Deep learning models with contextual embeddings
To obtain contextual embeddings, we downloaded the publicly available pre-trained BERT model (bert-base-uncased) (Devlin et al., 2018) from the official site of the authors (https://github.com/google-research/bert). We fed tweets as inputs to the BERT model and took the hidden state of the [CLS] token from the last layer of the model as the embedding of the given tweet. Then, the embedding is used in our sigmoid model to predict the sentiment of the tweet. The same setting was used in a previous paper (Ji et al., 2021) to predict patient diagnoses from medical note words using a pre-trained BERT model.
Moreover, we can obtain embeddings of each word of a tweet from the pre-trained BERT model. BERT's pre-trained word embeddings are used as input to our Bi-LSTM model. The authors of (Lu et al., 2020) used a similar setting for sentiment analysis of text data.
5.2. Evaluation metric
Three metrics are used in our experiments to evaluate the performance of the machine learning models on the disaster prediction task: 1) accuracy, 2) F1 score, and 3) Area Under the Curve (AUC). In our experiments, we considered disaster tweets as the positive class and the others as the negative class. Hence, True Positives (TP) are the actual disaster tweets that are predicted as disasters, while False Positives (FP) are the tweets that are actually negative but predicted as positive. True Negatives (TN) and False Negatives (FN) are defined analogously. Accuracy is the fraction of correctly predicted tweets among all tweets and is calculated as follows:

Accuracy = (TP + TN) / (TP + TN + FP + FN)
The F1 score is another popular metric for testing the predictive performance of a model. The F1 score is the harmonic mean of recall and precision, where recall is the number of true labels predicted by the model divided by the total number of existing true labels, and precision is the number of true labels predicted by the model divided by the total number of labels predicted by the model. The F1 score is calculated as follows:

F1 = 2 × (Precision × Recall) / (Precision + Recall)
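The accuracy, precision, recall, and F1 definitions above can be computed directly from the confusion-matrix counts; the counts below are illustrative, not results from our experiments.

```python
# Illustrative confusion-matrix counts (not actual experimental results).
tp, fp, tn, fn = 50, 10, 80, 20

accuracy = (tp + tn) / (tp + tn + fp + fn)   # fraction of correct predictions
precision = tp / (tp + fp)                   # predicted positives that are correct
recall = tp / (tp + fn)                      # actual positives that are recovered
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of the two
```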
On the other hand, AUC tells us how well a model can distinguish between classes. A higher AUC score means the model is better at predicting negative classes as zero and positive classes as one.
5.3. Experimental results
5.3.1. Quantitative results
Table 3 provides the results of all machine learning models on the disaster prediction task for all three types of embeddings. The table shows results for both the training and test data. Since the test data results are collected from the Kaggle competition, we can only report the accuracy score for the test data.
The table shows that the logistic regression model achieves the best results with BOW embeddings among the three traditional machine learning models. However, the results of the neural network models with context-free embeddings are better than those of the traditional machine learning models. Among the three context-free embeddings (Skip-gram, FastText, GloVe), GloVe with the Bi-LSTM model has the best train and test scores on all three evaluation metrics. Note that the results also show that a deep learning model like Bi-LSTM performs better than a shallow neural network model such as the softmax model.
Moreover, when we used the same shallow neural network and deep learning models with contextual embeddings (BERT), we found improvements of about 2% in AUC and accuracy over the context-free embeddings. This means that contextual embeddings are helpful and achieve the best performance on the disaster prediction task.
Table 3. Disaster prediction results of all models on the train and test data.
|Model||Train data||Test data|
5.3.2. Qualitative results
Table 3 shows quantitative results for the prediction of disaster tweets, where the neural model with contextual embeddings outperformed the other models. However, it is difficult to understand from these results when the contextual embeddings successfully predict a disaster tweet while the context-free models fail. For this purpose, we examine the predictions of the Bi-LSTM model for both the context-free (GloVe) and contextual (BERT) embeddings. Table 4 shows the model predictions along with the true labels for some sample tweets. From the table, we can see that the GloVe-based predictions for the first two tweets are positive, perhaps because of the word "accident" in the tweets, but the true labels for both are negative. Reading the tweets, we can understand that they are not related to a disaster or crisis. On the other hand, since the BERT model generates word embeddings based on the context words, it successfully predicts these tweets as negative.
The GloVe-based predictions for the third and fourth tweets are negative, while the true labels are positive. Note that no disaster-related words appear in these two tweets, yet they describe serious situations. The BERT-based predictions are correct for the third and fourth tweets. The predictions of both GloVe and BERT embeddings for the fifth and sixth tweets of Table 4 are correct. Since these tweets contain disaster-related words (i.e., suicide, bomber, bombing), both models labeled them successfully.
Analyzing the results of Table 4, we can infer that context-free embeddings are helpful for predicting a tweet as a disaster if disaster-related words (i.e., accident, bomb) appear in the tweet. In contrast, contextual embeddings help to capture the context of a tweet, which is challenging but important for the sentiment analysis task. Although every tweet is a short text, contextual embeddings work efficiently to understand its sentiment.
Table 4. Sample tweets with the Bi-LSTM model's predictions using GloVe and BERT embeddings, along with the true labels.
|No.||Tweet||GloVe||BERT||True label|
|1||I swear someone needs to take it away from me, cuase I'm just accident prone.||Yes||No||No|
|2||Dave if I say that I met her by accident this week- would you be super jelly Dave? :p||Yes||No||No|
|3||Schoolgirl attacked in Seaton Delaval park by 'pack of animals'||No||Yes||Yes|
|4||Not sure how these fire-workers rush into burning buildings but I'm grateful they do. TrueHeroes||No||Yes||Yes|
|5||A suicide bomber has blown himself up at a mosque in the south||Yes||Yes||Yes|
|6||Bombing of Hiroshima 1945||Yes||Yes||Yes|
6. Conclusion

In this paper, we presented an extensive analysis of predicting disasters from Twitter data using different types of word embeddings. Our experimental results show that contextual embeddings achieve the best results for predicting disasters from tweets. We also showed that deep neural network models outperform traditional machine learning methods on the disaster prediction task. Advanced deep neural network models, such as multi-layer convolutional models, could also be applied to this prediction task to achieve higher accuracy.
References

- Classification of disaster specific tweets - a hybrid approach. In 2021 8th International Conference on Computing for Sustainable Global Development (INDIACom), pp. 774–777. Cited by: §2.
- Review of short-text classification. International Journal of Web Information Systems. Cited by: §1.
- Tweedr: mining twitter to inform disaster response.. In ISCRAM, pp. 269–272. Cited by: §2.
- Enriching word vectors with subword information. arXiv preprint arXiv:1607.04606. Cited by: §1, §3.2, §5.1.2.
- Short text classification improved by learning multi-granularity topics. In Twenty-Second International Joint Conference on Artificial Intelligence. Cited by: §1.
- Sentiment analysis with word embedding. In 2018 IEEE 7th International Conference on Adaptive Science & Technology (ICAST), pp. 1–4. Cited by: §1.
- Bert: pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805. Cited by: §1, §3.3, §5.1.3.
- Biomedical named entity recognition with multilingual BERT. In Proceedings of The 5th Workshop on BioNLP Open Shared Tasks, pp. 56–61. Cited by: §1.
- Long short-term memory. Neural computation 9 (8), pp. 1735–1780. Cited by: §3.2.
- Satellite imagery feature detection using deep convolutional neural network: a Kaggle competition. arXiv preprint arXiv:1706.06169. Cited by: §4.
- Does the magic of bert apply to medical code assignment? a quantitative study. arXiv preprint arXiv:2103.06511. Cited by: §5.1.3.
- Twitter speaks: a case of national disaster situational awareness. Journal of Information Science 46 (3), pp. 313–324. Cited by: §2.
- Siamese cbow: optimizing word embeddings for sentence representations. arXiv preprint arXiv:1606.04640. Cited by: §3.2.
- Evaluation of four approaches for "sentiment analysis on movie reviews": the Kaggle competition. In Proceedings of the 16th International Conference on Engineering Applications of Neural Networks (INNS), pp. 1–5. Cited by: §4.
- Text summarization with pretrained encoders. arXiv preprint arXiv:1908.08345. Cited by: §1.
- VGCN-bert: augmenting bert with graph embedding for text classification. Advances in Information Retrieval 12035, pp. 369. Cited by: §5.1.3.
- Big data analytics of twitter data and its application for physician assistants: who is talking about your profession in twitter?. In Data Management and Analysis, pp. 17–32. Cited by: §1.
- Efficient estimation of word representations in vector space. CoRR abs/1301.3781. Cited by: §1, §3.2, §5.1.2.
- Advances in pre-training distributed word representations. In Proceedings of the International Conference on Language Resources and Evaluation (LREC 2018), Cited by: §5.1.2.
- Sentiment analysis on social media for stock movement prediction. Expert Systems with Applications 42 (24), pp. 9603–9611. Cited by: §1.
- CrisisLex: a lexicon for collecting and filtering microblogged communications in crises. In Eighth International AAAI Conference on Weblogs and Social Media. Cited by: §2.
- Weakly supervised and online learning of word models for classification to detect disaster reporting tweets. Information Systems Frontiers 20 (5), pp. 949–959. Cited by: §2.
- Glove: global vectors for word representation. In Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP), pp. 1532–1543. Cited by: §1, §3.2, §5.1.2.
- A comparative sentiment analysis of sentence embedding using machine learning techniques. In 2020 6th International Conference on Advanced Computing and Communication Systems (ICACCS), pp. 493–496. Cited by: §1.
- Multilingual evaluation of pre-processing for bert-based sentiment analysis of tweets. Expert Systems with Applications 181, pp. 115119. Cited by: §2.
- Weakly supervised extraction of computer security events from twitter. In Proceedings of the 24th international conference on world wide web, pp. 896–905. Cited by: §1.
- Event classification and location prediction from tweets during disasters. Annals of Operations Research 283 (1), pp. 737–757. Cited by: §2.
- How to fine-tune bert for text classification?. In China National Conference on Chinese Computational Linguistics, pp. 194–206. Cited by: §1.
- Deep learning for diagnosis and segmentation of pneumothorax: the results on the kaggle competition and validation against radiologists. IEEE Journal of Biomedical and Health Informatics. Cited by: §4.
- A computational framework for iceberg and ship discrimination: case study on kaggle competition. IEEE Access 8, pp. 82320–82327. Cited by: §4.
- Deep learning for practical image recognition: case study on kaggle competitions. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 923–931. Cited by: §4.
- Social media contents based sentiment analysis and prediction system. Expert Systems with Applications 105, pp. 102–111. Cited by: §1.
- Mining twitter data for improved understanding of disaster resilience. Annals of the American Association of Geographers 108 (5), pp. 1422–1441. Cited by: §2.