On Detecting Messaging Abuse in Short Text Messages using Linguistic and Behavioral patterns

08/18/2014
by   Alejandro Mosquera, et al.

Short text messages in social media and instant messaging have become a popular communication channel in recent years. This rising popularity has led to an increase in messaging threats such as spam, phishing, and malware. Processing these short text message threats poses additional challenges, such as lexical variants, SMS-like contractions, and advanced obfuscations, which can degrade the performance of traditional filtering solutions. Using a real-world SMS data set from a large US telecommunications operator and a social media corpus, in this paper we analyze the effectiveness of machine learning filters based on linguistic and behavioral patterns for detecting short text spam and abusive users in the network. We also explore different ways to deal with short text message challenges such as tokenization and entity detection by using text normalization and substring clustering techniques. The results support the validity of the proposed solution, which improves on baseline approaches.

