Inverse-Category-Frequency based supervised term weighting scheme for text categorization

12/13/2010
by Deqing Wang, et al.

Term weighting schemes often dominate the performance of many classifiers, such as kNN, centroid-based classifiers, and SVMs. The most widely used term weighting scheme in text categorization, tf.idf, originates from the information retrieval (IR) field. The intuition behind idf seems less reasonable for text categorization than for IR. In this paper, we introduce inverse category frequency (icf) into term weighting and propose two novel approaches: tf.icf and an icf-based supervised term weighting scheme. The tf.icf scheme substitutes icf for the idf factor and thus favors terms that occur in fewer categories rather than in fewer documents. The icf-based approach combines icf with relevance frequency (rf) to weight terms in a supervised way. Our cross-classifier and cross-corpus experiments show that the proposed approaches are superior or comparable to six supervised term weighting schemes and three traditional schemes in terms of macro-F1 and micro-F1.
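
To make the tf.icf idea concrete, here is a minimal Python sketch. The abstract does not state the exact formula, so this sketch assumes the plain idf-style form icf(t) = log(|C| / cf(t)), where cf(t) is the number of categories containing term t and |C| is the number of categories. The paper's actual definition may differ, for example by smoothing. The function names `category_frequency` and `tf_icf` and the toy corpus are hypothetical and serve only as illustration.

```python
import math
from collections import defaultdict


def category_frequency(docs):
    """Map each term to the number of distinct categories it occurs in."""
    cats_per_term = defaultdict(set)
    for tokens, category in docs:
        for term in tokens:
            cats_per_term[term].add(category)
    return {term: len(cats) for term, cats in cats_per_term.items()}


def tf_icf(tokens, cf, n_categories):
    """Weight the terms of one document by tf * log(|C| / cf).

    The log(|C| / cf) form is an assumption of this sketch, chosen by
    analogy with idf; it is not taken from the paper.
    Terms never seen in training (absent from cf) are skipped.
    """
    weights = {}
    for term in set(tokens):
        if term in cf:
            tf = tokens.count(term)
            weights[term] = tf * math.log(n_categories / cf[term])
    return weights


# Toy labeled corpus: (tokens, category) pairs, invented for illustration.
docs = [
    (["goal", "match", "team"], "sport"),
    (["vote", "party", "team"], "politics"),
    (["stock", "market", "team"], "finance"),
]

cf = category_frequency(docs)
n_categories = len({category for _, category in docs})
w = tf_icf(["goal", "goal", "team"], cf, n_categories)
```

In this toy corpus "team" appears in every category, so under the assumed formula its weight is zero, while "goal" appears in only one category and receives the highest weight. This is the behavior the abstract describes: terms concentrated in fewer categories are favored, regardless of how many documents contain them.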

research
03/12/2020

TF-IDFC-RF: A Novel Supervised Term Weighting Scheme

Sentiment Analysis is a branch of Affective Computing usually considered...
research
03/28/2019

Learning to Weight for Text Classification

In information retrieval (IR) and related tasks, term weighting approach...
research
05/19/2022

Why only Micro-F1? Class Weighting of Measures for Relation Classification

Relation classification models are conventionally evaluated using only a...
research
10/16/2016

Term-Class-Max-Support (TCMS): A Simple Text Document Categorization Approach Using Term-Class Relevance Measure

In this paper, a simple text categorization method using term-class rele...
research
05/03/2013

Feature Selection Based on Term Frequency and T-Test for Text Categorization

Much work has been done on feature selection. Existing methods are based...
research
07/13/2020

Assessing the behavior and performance of a supervised term-weighting technique for topic-based retrieval

This article analyses and evaluates FDDβ, a supervised term-weightin...
research
04/16/2021

Back to the Basics: A Quantitative Analysis of Statistical and Graph-Based Term Weighting Schemes for Keyword Extraction

Term weighting schemes are widely used in Natural Language Processing an...
