Orthogonal Matching Pursuit for Text Classification

07/12/2018
by   Konstantinos Skianis, et al.
0

In text classification, the problem of overfitting arises due to the high dimensionality, making regularization essential. Although classic regularizers provide sparsity, they fail to return highly accurate models. On the contrary, state-of-the-art group-lasso regularizers provide better results at the expense of low sparsity. In this paper, we apply a greedy variable selection algorithm, called Orthogonal Matching Pursuit, for the text classification task. We also extend standard group OMP by introducing overlapping group OMP to handle overlapping groups of features. Empirical analysis verifies that both OMP and overlapping GOMP constitute powerful regularizers, able to produce effective and super-sparse models. Code and data are available at: https://www.dropbox.com/sh/7w7hjns71ol0xrz/AAC_G0_0DlcGkq6tQb2zqAaca?dl=0 .

READ FULL TEXT

page 1

page 2

page 3

page 4

research
09/04/2011

Variable Selection in High Dimensions with Random Designs and Orthogonal Matching Pursuit

The performance of Orthogonal Matching Pursuit (OMP) for variable select...
research
05/26/2017

A WL-SPPIM Semantic Model for Document Classification

In this paper, we explore SPPIM-based text classification method, and th...
research
02/18/2014

Classification with Sparse Overlapping Groups

Classification with a sparsity constraint on the solution plays a centra...
research
07/27/2023

Gzip versus bag-of-words for text classification with KNN

The effectiveness of compression distance in KNN-based text classificati...
research
12/17/2010

Ultra-high Dimensional Multiple Output Learning With Simultaneous Orthogonal Matching Pursuit: A Sure Screening Approach

We propose a novel application of the Simultaneous Orthogonal Matching P...
research
08/31/2016

Analysis of a low memory implementation of the Orthogonal Matching Pursuit greedy strategy

The convergence and numerical analysis of a low memory implementation of...
research
11/24/2016

Two-Level Structural Sparsity Regularization for Identifying Lattices and Defects in Noisy Images

This paper presents a regularized regression model with a two-level stru...

Please sign up or login with your details

Forgot password? Click here to reset