EmTaggeR: A Word Embedding Based Novel Method for Hashtag Recommendation on Twitter

by Kuntal Dey, et al.

The hashtag recommendation problem is that of suggesting one or more hashtags to explicitly tag a post made on a given social network platform, based on the content and context of the post. In this work, we propose a novel methodology for hashtag recommendation for microblog posts, specifically on Twitter. The methodology, EmTaggeR, is a training-testing framework built on word embeddings. The training phase comprises learning the word vectors associated with each hashtag and deriving a word embedding for each hashtag. We provide two training procedures: one in which each hashtag is trained with a separate word embedding model applicable in the context of that hashtag, and another in which each hashtag obtains its embedding from a global context. The testing phase consists of computing the average word embedding of the test post and finding the similarity of this embedding with the known embeddings of the hashtags. The tweets that contain the most similar hashtag are extracted, and all the hashtags that appear in these tweets are ranked by embedding similarity score. The top-K hashtags in this ranked list are recommended for the given test post. Our system produces an F1 score of 50.83, improving over the baseline by around 6.53 times and outperforming the best-performing system known in the literature, which provides a lift of 6.42 times. EmTaggeR is a fast, scalable and lightweight system, which makes it practical to deploy in real-life applications.
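The testing phase described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy two-dimensional vectors, and the choice of cosine similarity are all assumptions made for illustration (the abstract only says "similarity"), and the hashtag embeddings are taken as already produced by some training procedure.

```python
import math

def average_vector(tokens, word_vectors):
    """Mean of the vectors of the tokens that have an embedding, or None."""
    known = [word_vectors[t] for t in tokens if t in word_vectors]
    if not known:
        return None
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]

def cosine(a, b):
    """Cosine similarity (an assumed choice of similarity measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def recommend(tokens, word_vectors, hashtag_vectors, training_tweets, k=3):
    """Return up to k hashtags for a tokenized test post.

    training_tweets is a list of hashtag sets, one set per training tweet.
    """
    post_vec = average_vector(tokens, word_vectors)
    if post_vec is None:
        return []
    # Score every known hashtag against the averaged post embedding.
    scores = {tag: cosine(post_vec, vec) for tag, vec in hashtag_vectors.items()}
    best = max(scores, key=scores.get)
    # Collect every hashtag that co-occurs in tweets containing the best match.
    candidates = set()
    for tags in training_tweets:
        if best in tags:
            candidates.update(t for t in tags if t in scores)
    # Rank the candidates by their similarity to the post and keep the top k.
    return sorted(candidates, key=lambda t: scores[t], reverse=True)[:k]

# Hypothetical toy data, not taken from the paper.
word_vectors = {"goal": [1.0, 0.0], "match": [0.8, 0.2], "vote": [0.0, 1.0]}
hashtag_vectors = {
    "#football": [0.9, 0.1],
    "#sports": [0.7, 0.3],
    "#election": [0.1, 0.9],
}
training_tweets = [{"#football", "#sports"}, {"#election"}, {"#sports"}]

result = recommend(["goal", "match"], word_vectors, hashtag_vectors, training_tweets)
print(result)  # ['#football', '#sports']
```

In this toy run the post is closest to #football, so only tweets containing #football contribute candidates; #election never enters the ranked list even though it has a similarity score.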



