Unleashing the Power of Hashtags in Tweet Analytics with Distributed Framework on Apache Storm

12/04/2018
by Vibhuti Gupta, et al.

Twitter is a popular social network platform where users interact and post texts of up to 280 characters, called tweets. Hashtags, hyperlinked words in tweets, have become increasingly crucial for tweet retrieval and search. Using hashtags for tweet topic classification is challenging because of context dependence among words, slang, abbreviations, and emoticons in a short tweet, along with the evolving use of hashtags. Since Twitter generates millions of tweets daily, tweet analytics is a fundamental big data stream problem that often requires real-time distributed processing. This paper proposes a distributed online approach to tweet topic classification with hashtags. Implemented on Apache Storm, a distributed real-time framework, our approach incrementally identifies and updates a set of strong predictors in the Naïve Bayes model for classifying each incoming tweet instance. Preliminary experiments show promising results with up to 97 throughput on eight processors.
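To make the idea of incrementally updating a Naïve Bayes model per incoming tweet more concrete, here is a minimal single-process sketch in Python. It is not the authors' implementation: the class name `OnlineHashtagNB`, the smoothing constant `alpha`, and the choice to use only hashtags as features are assumptions for illustration, and it omits both the selection of strong predictors and the Apache Storm distribution that the abstract describes.

```python
import math
import re
from collections import defaultdict

HASHTAG = re.compile(r"#(\w+)")


class OnlineHashtagNB:
    """Incremental multinomial Naive Bayes over hashtags (illustrative only)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # Laplace smoothing constant (assumed value)
        self.class_counts = defaultdict(int)  # tweets seen per topic
        self.tag_counts = defaultdict(lambda: defaultdict(int))  # topic -> tag -> count
        self.vocab = set()  # all hashtags seen so far

    @staticmethod
    def features(tweet):
        # Extract lowercase hashtags; other tokens are ignored in this sketch.
        return [tag.lower() for tag in HASHTAG.findall(tweet)]

    def update(self, tweet, topic):
        # Fold one labeled tweet into the running counts.
        self.class_counts[topic] += 1
        for tag in self.features(tweet):
            self.tag_counts[topic][tag] += 1
            self.vocab.add(tag)

    def predict(self, tweet):
        # Return the topic with the highest log-posterior, or None if untrained.
        tags = self.features(tweet)
        total = sum(self.class_counts.values())
        best, best_score = None, -math.inf
        for topic, n in self.class_counts.items():
            counts = self.tag_counts[topic]
            # The extra +1 reserves probability mass for unseen hashtags.
            denom = sum(counts.values()) + self.alpha * (len(self.vocab) + 1)
            score = math.log(n / total)
            for tag in tags:
                score += math.log((counts.get(tag, 0) + self.alpha) / denom)
            if score > best_score:
                best, best_score = topic, score
        return best


if __name__ == "__main__":
    model = OnlineHashtagNB()
    model.update("Great match tonight #football #goal", "sports")
    model.update("New phone launch #tech #gadgets", "technology")
    print(model.predict("Who won? #football"))
```

Because `update` only increments counters, each tweet can be processed once as it arrives. In a distributed setting the counts would presumably be partitioned or merged across workers, but how the paper does this is not stated in the abstract.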


research
04/07/2020

A Few Topical Tweets are Enough for Effective User-Level Stance Detection

Stance detection entails ascertaining the position of a user towards a t...
research
12/27/2016

Distributed Real-Time Sentiment Analysis for Big Data Social Streams

Big data trend has enforced the data-centric systems to have continuous ...
research
09/09/2023

TECVis: A Visual Analytics Tool to Compare People's Emotion Feelings

Twitter is one of the popular social media platforms where people share ...
research
03/10/2018

Discovering Users Topic of Interest from Tweet

Nowadays social media has become one of the largest gatherings of people...
research
10/05/2019

City-level Geolocation of Tweets for Real-time Visual Analytics

Real-time tweets can provide useful information on evolving events and s...
research
12/13/2017

Everything You Always Wanted to Know About TREC RTS* (*But Were Afraid to Ask)

The TREC Real-Time Summarization (RTS) track provides a framework for ev...
research
10/17/2019

Adaptive Normalization in Streaming Data

In today's digital era, data are everywhere from Internet of Things to he...
