Vector Space Model as Cognitive Space for Text Classification

08/21/2017
by Barathi Ganesh HB, et al.

In this era of digitization, knowing a user's sociolect aspects has become essential for building user-specific recommendation systems. These sociolect aspects can be found by mining the language that users share as text in social media and reviews. This paper describes the experiment performed in the PAN Author Profiling 2017 shared task. The objective of the task is to find the sociolect aspects of users from their tweets. The sociolect aspects considered in this experiment are the user's gender and native language. Here, users' tweets written in a language different from their native language are represented as a document-term matrix with document frequency as the constraint. Classification is then done using a Support Vector Machine, taking gender and native language as target classes. This experiment attains an average accuracy of 73.42 prediction and 76.26
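
The representation step described in the abstract can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function name `build_document_term_matrix`, the lowercase whitespace tokenization, the `min_df` threshold of 2, and the sample tweets are all assumptions made for the example. It builds a document-term matrix of raw term counts and keeps only the terms that occur in at least `min_df` documents.

```python
from collections import Counter


def build_document_term_matrix(documents, min_df=2):
    """Return (vocabulary, matrix), keeping terms found in >= min_df documents.

    Hypothetical helper for illustration; tokenization and threshold are
    assumptions, not settings taken from the paper.
    """
    # Assumption: lowercase whitespace tokenization.
    tokenized = [doc.lower().split() for doc in documents]

    # Document frequency: the number of documents that contain each term.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    # Apply the document-frequency constraint to select the vocabulary.
    vocabulary = sorted(term for term, df in doc_freq.items() if df >= min_df)
    index = {term: i for i, term in enumerate(vocabulary)}

    # One row per document, one column per retained term (raw counts).
    matrix = []
    for tokens in tokenized:
        row = [0] * len(vocabulary)
        for token in tokens:
            if token in index:
                row[index[token]] += 1
        matrix.append(row)
    return vocabulary, matrix


# Made-up sample tweets, used only to exercise the function.
tweets = [
    "good morning good people",
    "morning coffee first",
    "coffee and good music",
]
vocabulary, matrix = build_document_term_matrix(tweets, min_df=2)
```

Each row of `matrix` could then serve as the feature vector for one user's text, with gender or native language as the label for a Support Vector Machine classifier.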


research
06/14/2018

Gender Prediction in English-Hindi Code-Mixed Social Media Content : Corpus and Baseline System

The rapid expansion in the usage of social media networking sites leads ...
research
07/02/2020

Processing South Asian Languages Written in the Latin Script: the Dakshina Dataset

This paper describes the Dakshina dataset, a new resource consisting of ...
research
07/03/2017

Including Dialects and Language Varieties in Author Profiling

This paper presents a computational approach to author profiling taking ...
research
10/10/2018

Inferring User Gender from User Generated Visual Content on a Deep Semantic Space

In this paper we address the task of gender classification on picture sh...
research
03/02/2019

SemEval-2019 Task 6: Identifying and Categorizing Offensive Language in Social Media

This short paper presents the design decisions taken and challenges enco...
research
09/01/2019

Topics to Avoid: Demoting Latent Confounds in Text Classification

Despite impressive performance on many text classification tasks, deep n...
research
12/19/2021

LUC at ComMA-2021 Shared Task: Multilingual Gender Biased and Communal Language Identification without using linguistic features

This work aims to evaluate the ability that both probabilistic and state...