Embedding Feature Selection for Large-scale Hierarchical Classification

06/06/2017
by   Azad Naik, et al.

Large-scale Hierarchical Classification (HC) involves datasets with thousands of classes and millions of training instances described by high-dimensional features, which poses several big data challenges. Feature selection, which aims to select a subset of discriminant features, is an effective strategy for dealing with the large-scale HC problem. It speeds up training, reduces prediction time and minimizes memory requirements by compressing the total size of the learned model weight vectors. The majority of studies have also shown feature selection to be competent and successful in improving classification accuracy by removing irrelevant features. In this work, we investigate various filter-based feature selection methods for dimensionality reduction to solve the large-scale HC problem. Our experimental evaluation on text and image datasets with varying distributions of features, classes and instances shows up to a 3x speed-up on massive datasets and up to 45% lower memory requirements for storing the weight vectors of the learned model, without any significant loss (and with an improvement for some datasets) in classification accuracy. Source Code: https://cs.gmu.edu/~mlbio/featureselection.
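To make the idea of filter-based selection concrete, here is a minimal sketch in plain Python. The abstract does not say which filter criteria the authors evaluate, so the chi-square score used here is an assumption chosen because it is a common filter criterion; the function names `chi_square_scores` and `select_top_k` and the toy data are hypothetical and are not taken from the authors' source code.

```python
from collections import Counter


def chi_square_scores(X, y):
    """Score each feature by the chi-square statistic between its
    presence (value > 0) and the class label. Assumed criterion."""
    n = len(X)
    class_counts = Counter(y)
    scores = []
    for j in range(len(X[0])):
        # Per-class count of instances in which feature j is present.
        present = Counter(label for row, label in zip(X, y) if row[j] > 0)
        n_present = sum(present.values())
        n_absent = n - n_present
        score = 0.0
        for label, n_class in class_counts.items():
            cells = (
                (present[label], n_present),           # feature present
                (n_class - present[label], n_absent),  # feature absent
            )
            for observed, n_state in cells:
                expected = n_class * n_state / n
                if expected > 0:
                    score += (observed - expected) ** 2 / expected
        scores.append(score)
    return scores


def select_top_k(X, y, k):
    """Keep the k highest-scoring features; return their indices
    and the reduced data matrix."""
    scores = chi_square_scores(X, y)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    keep = sorted(ranked[:k])
    return keep, [[row[j] for j in keep] for row in X]


# Toy data: features 0 and 1 separate the classes, 2 and 3 do not.
X = [
    [1, 0, 1, 0],
    [1, 0, 0, 1],
    [0, 1, 1, 0],
    [0, 1, 0, 1],
]
y = ["a", "a", "b", "b"]

keep, X_reduced = select_top_k(X, y, k=2)
print(keep)       # indices of the retained features
print(X_reduced)  # data restricted to those features
```

Because a filter method scores features independently of the classifier, the scoring pass runs once before training. A model trained on `X_reduced` then stores one weight per retained feature per class, which is where the memory reduction described in the abstract comes from.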


research
02/27/2020

High-Dimensional Feature Selection for Genomic Datasets

In the presence of large dimensional datasets that contain many irreleva...
research
10/27/2021

A Self-adaptive Weighted Differential Evolution Approach for Large-scale Feature Selection

Recently, many evolutionary computation methods have been developed to s...
research
03/07/2014

Ant Colony based Feature Selection Heuristics for Retinal Vessel Segmentation

Features selection is an essential step for successful data classificati...
research
06/05/2017

Inconsistent Node Flattening for Improving Top-down Hierarchical Classification

Large-scale classification of data where classes are structurally organi...
research
06/08/2021

Dynamic Instance-Wise Classification in Correlated Feature Spaces

In a typical supervised machine learning setting, the predictions on all...
research
10/26/2020

BEAR: Sketching BFGS Algorithm for Ultra-High Dimensional Feature Selection in Sublinear Memory

We consider feature selection for applications in machine learning where...
research
05/26/2017

Classification of Major Depressive Disorder via Multi-Site Weighted LASSO Model

Large-scale collaborative analysis of brain imaging data, in psychiatry ...
