Merit-based Fusion of NLP Techniques for Instant Feedback on Water Quality from Twitter Text

02/09/2022
by Khubaib Ahmad, et al.

This paper addresses an important environmental challenge, water quality, by analyzing the potential of social media as an immediate source of feedback. The main goal of the work is to automatically analyze and retrieve social media posts relevant to water quality, with particular attention to posts describing different aspects of water quality, such as water color, smell, taste, and related illnesses. To this end, we propose a novel framework incorporating different preprocessing, data augmentation, and classification techniques. In total, three different Neural Network (NN) architectures, namely (i) Bidirectional Encoder Representations from Transformers (BERT), (ii) XLM-RoBERTa, a cross-lingual variant of the Robustly Optimized BERT Pre-training Approach (RoBERTa), and (iii) a custom Long Short-Term Memory (LSTM) model, are employed in a merit-based fusion scheme. For merit-based weight assignment to the models, several optimization and search techniques are compared, including Particle Swarm Optimization (PSO), a Genetic Algorithm (GA), Brute Force (BF) search, and the Nelder-Mead and Powell optimization methods. We also evaluate the individual models, where the highest F1-score of 0.81 is obtained with the BERT model. In merit-based fusion, the best overall results are obtained with BF, which achieves an F1-score of 0.852. We also compare against existing methods, over which our proposed solutions show a significant improvement. We believe such rigorous analysis of this relatively new topic will provide a baseline for future research.
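To make the idea of merit-based fusion concrete, the following is a minimal sketch of a brute-force weight search over three classifiers. It is not the authors' implementation: the function names (`f1_score`, `fuse`, `brute_force_weights`), the grid resolution, the 0.5 decision threshold, and the toy probabilities are all assumptions chosen for illustration. The sketch assumes each model outputs a positive-class probability per post, and that the weights are selected on held-out validation labels.

```python
from itertools import product


def f1_score(y_true, y_pred):
    """Binary F1 for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def fuse(probs, weights):
    """Weighted average of per-model positive-class probabilities."""
    total = sum(weights)
    n_samples = len(probs[0])
    return [
        sum(w * model[i] for w, model in zip(weights, probs)) / total
        for i in range(n_samples)
    ]


def brute_force_weights(probs, y_true, steps=10, threshold=0.5):
    """Try every weight combination on a regular grid; keep the best F1."""
    grid = [k / steps for k in range(steps + 1)]
    best_w, best_f1 = None, -1.0
    for w in product(grid, repeat=len(probs)):
        if sum(w) == 0:  # all-zero weights give no usable fusion
            continue
        fused = fuse(probs, w)
        y_pred = [1 if s >= threshold else 0 for s in fused]
        score = f1_score(y_true, y_pred)
        if score > best_f1:
            best_w, best_f1 = w, score
    return best_w, best_f1


# Toy validation data (made up for illustration, not from the paper).
y_val = [1, 0, 1, 1, 0, 0, 1, 0]
model_probs = [
    [0.9, 0.4, 0.3, 0.8, 0.6, 0.1, 0.7, 0.2],  # stand-in for BERT
    [0.6, 0.2, 0.7, 0.4, 0.3, 0.6, 0.8, 0.1],  # stand-in for XLM-RoBERTa
    [0.7, 0.6, 0.6, 0.3, 0.2, 0.3, 0.4, 0.5],  # stand-in for the LSTM
]

weights, fused_f1 = brute_force_weights(model_probs, y_val)
print("selected weights:", weights)
print("fused F1 on validation data:", round(fused_f1, 3))
```

Because the grid includes the weight vectors that select a single model, the fused F1 on the data used for the search can never fall below that of the best individual model. The cost grows exponentially with the number of models, which is one reason a search such as PSO, GA, Nelder-Mead, or Powell's method may be preferred when more models are fused.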


research
01/01/2023

Relevance Classification of Flood-related Twitter Posts via Multiple Transformers

In recent years, social media has been widely explored as a potential so...
research
07/11/2022

A Late Fusion Framework with Multiple Optimization Methods for Media Interestingness

The recent advancement in Multimedia Analytical, Computer Vision (CV), a...
research
11/30/2020

Floods Detection in Twitter Text and Images

In this paper, we present our methods for the MediaEval 2020 Flood Relat...
research
08/15/2023

A Trustable LSTM-Autoencoder Network for Cyberbullying Detection on Social Media Using Synthetic Data

Social media cyberbullying has a detrimental effect on human life. As on...
research
08/19/2021

How Hateful are Movies? A Study and Prediction on Movie Subtitles

In this research, we investigate techniques to detect hate speech in mov...
research
10/01/2020

Detecting White Supremacist Hate Speech using Domain Specific Word Embedding with Deep Learning and BERT

White supremacists embrace a radical ideology that considers white peopl...
research
09/14/2022

Automated Fidelity Assessment for Strategy Training in Inpatient Rehabilitation using Natural Language Processing

Strategy training is a multidisciplinary rehabilitation approach that te...
