Chromatic Learning for Sparse Datasets

06/06/2020
by Vladimir Feinberg, et al.

Learning over sparse, high-dimensional data frequently necessitates the use of specialized methods such as the hashing trick. In this work, we design a highly scalable alternative approach that leverages the low degree of feature co-occurrences present in many practical settings. This approach, which we call Chromatic Learning (CL), obtains a low-dimensional dense feature representation by performing graph coloring over the co-occurrence graph of features—an approach previously used as a runtime performance optimization for GBDT training. This color-based dense representation can be combined with additional dense categorical encoding approaches, e.g., submodular feature compression, to further reduce dimensionality. CL exhibits linear parallelizability and consumes memory linear in the size of the co-occurrence graph. By leveraging the structural properties of the co-occurrence graph, CL can compress sparse datasets such as KDD Cup 2012, which contains over 50M features, down to 1024 features, an order of magnitude fewer than frequency-based truncation or the hashing trick require to maintain the same test error for linear models. This compression further enables the use of deep networks in this wide, sparse setting, where CL similarly has favorable performance compared to existing baselines at a budgeted input dimension.
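Since the abstract compresses the method into a few sentences, a minimal sketch may help make the core idea concrete: build the co-occurrence graph over features, greedily color it so that co-occurring features never share a color, and re-encode each sparse example as a dense vector indexed by color. This is an illustration under our own assumptions (largest-degree-first greedy coloring, toy feature ids); it is not the authors' reference implementation.

```python
from collections import defaultdict

def cooccurrence_graph(rows):
    """Adjacency map: two features are neighbors if they ever
    appear together in the same example."""
    adj = defaultdict(set)
    for row in rows:                 # row: iterable of active feature ids
        feats = list(row)
        for u in feats:
            adj[u]                   # register node even if it has no neighbors
        for i, u in enumerate(feats):
            for v in feats[i + 1:]:
                adj[u].add(v)
                adj[v].add(u)
    return adj

def greedy_color(adj):
    """Give each feature the smallest color unused by its neighbors.
    A graph of maximum degree d needs at most d + 1 colors this way."""
    color = {}
    for u in sorted(adj, key=lambda u: -len(adj[u])):   # high degree first
        taken = {color[v] for v in adj[u] if v in color}
        c = 0
        while c in taken:
            c += 1
        color[u] = c
    return color

def densify(row, color, num_colors):
    """Encode a sparse row as a dense vector indexed by color. Features
    that co-occurred during training received distinct colors, so no two
    active features in such a row collide; each slot holds the feature id
    that landed there, or 0 if unused. (Features unseen at training time
    would need a fallback, omitted here.)"""
    dense = [0] * num_colors
    for u in row:
        dense[color[u]] = u
    return dense

rows = [{1, 2}, {2, 3}, {4}]         # toy sparse dataset
color = greedy_color(cooccurrence_graph(rows))
k = max(color.values()) + 1
print([densify(r, color, k) for r in rows])   # [[2, 1], [2, 3], [4, 0]]
```

On the toy rows, features 1 and 2 co-occur and therefore get different colors, so three sparse rows collapse to two dense columns; the paper's point is that real co-occurrence graphs are sparse enough for the number of colors to stay small (e.g., 1024 for KDD Cup 2012).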


Related research

12/07/2021  SHRIMP: Sparser Random Feature Models via Iterative Magnitude Pruning
02/07/2021  Additive Feature Hashing
01/07/2018  Graph Autoencoder-Based Unsupervised Feature Selection with Broad and Local Data Structure Preservation
03/15/2021  Data Discovery Using Lossless Compression-Based Sparse Representation
10/07/2016  Significance testing in non-sparse high-dimensional linear models
04/30/2019  Categorical Feature Compression via Submodular Optimization
06/24/2016  Wide & Deep Learning for Recommender Systems
