
A Multi-input Multi-output Transformer-based Hybrid Neural Network for Multi-class Privacy Disclosure Detection

by A K M Nuhil Mehdy, et al.

Concern regarding users' data privacy has risen to its highest level due to the massive increase in communication platforms, social networking sites, and greater user participation in online public discourse. An increasing number of people exchange private information via emails, text messages, and social media without being aware of the risks and implications. Because a substantial amount of data is exchanged in textual form, researchers in the field of Natural Language Processing (NLP) have concentrated on creating tools and strategies to identify, categorize, and sanitize private information in text data. However, most detection methods rely solely on the presence of pre-identified keywords in the text and disregard the underlying meaning of the utterance in a specific context. Hence, in some situations, these tools and algorithms fail to detect disclosure, or the produced results are misclassified. In this paper, we propose a multi-input, multi-output hybrid neural network which utilizes transfer learning, linguistics, and metadata to learn the hidden patterns. Our goal is to better classify disclosure/non-disclosure content in terms of the context of situation. We trained and evaluated our model on a human-annotated ground truth dataset containing a total of 5,400 tweets. The results show that the proposed model was able to identify privacy disclosure through tweets with an accuracy of 77.4% while classifying the information type of those tweets with an impressive accuracy of 99%, by jointly learning for two separate tasks.
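To make the "multi-input, multi-output" idea concrete, the following is a minimal NumPy sketch, not the authors' architecture. It assumes three input branches (a fixed-size text vector standing in for transformer features, a linguistic feature vector, and a metadata vector) that are concatenated, passed through one shared hidden layer, and read out by two heads: a binary disclosure head and a multi-class information-type head. All dimensions, names, and the random untrained weights are hypothetical and chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes; none of these values come from the paper.
TEXT_DIM, LING_DIM, META_DIM = 16, 6, 4
HIDDEN_DIM, NUM_INFO_TYPES = 8, 5


def dense(in_dim, out_dim):
    """Return randomly initialised (weights, bias) for one dense layer."""
    return rng.normal(0.0, 0.1, (in_dim, out_dim)), np.zeros(out_dim)


def softmax(z):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


# One shared layer over the concatenated inputs, then two task-specific heads.
W_shared, b_shared = dense(TEXT_DIM + LING_DIM + META_DIM, HIDDEN_DIM)
W_disc, b_disc = dense(HIDDEN_DIM, 1)                # disclosure vs. non-disclosure
W_type, b_type = dense(HIDDEN_DIM, NUM_INFO_TYPES)   # information type


def forward(text_vec, ling_vec, meta_vec):
    """Forward pass: three inputs in, two predictions out."""
    x = np.concatenate([text_vec, ling_vec, meta_vec], axis=-1)
    h = np.maximum(0.0, x @ W_shared + b_shared)                  # ReLU
    p_disclosure = 1.0 / (1.0 + np.exp(-(h @ W_disc + b_disc)))   # sigmoid
    p_info_type = softmax(h @ W_type + b_type)
    return p_disclosure, p_info_type


# Random stand-in features for a batch of three tweets.
batch = 3
p_disc, p_type = forward(
    rng.normal(size=(batch, TEXT_DIM)),
    rng.normal(size=(batch, LING_DIM)),
    rng.normal(size=(batch, META_DIM)),
)
```

In a trained version of such a model, the two heads would typically be optimised together by summing a binary cross-entropy loss and a categorical cross-entropy loss, which is one common way to realise joint learning over two tasks; the weighting between the losses is a design choice this sketch leaves open.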


Transfer Learning for Hate Speech Detection in Social Media

In today's society more and more people are connected to the Internet, a...

Automated Hate Speech Detection and the Problem of Offensive Language

A key challenge for automatic hate-speech detection on social media is t...

Detecting weak and strong Islamophobic hate speech on social media

Islamophobic hate speech on social media inflicts considerable harm on b...

TexTrolls: Identifying Russian Trolls on Twitter from a Textual Perspective

The online new emerging suspicious users, that usually are called trolls...

ReportAGE: Automatically extracting the exact age of Twitter users based on self-reports in tweets

Advancing the utility of social media data for research applications req...

Detecting Privacy Requirements from User Stories with NLP Transfer Learning Models

To provide privacy-aware software systems, it is crucial to consider pri...

Automatic Identification and Ranking of Emergency Aids in Social Media Macro Community

Online social microblogging platforms including Twitter are increasingly...