Submodular meets Spectral: Greedy Algorithms for Subset Selection, Sparse Approximation and Dictionary Selection

02/19/2011
by Abhimanyu Das, et al.

We study the problem of selecting a subset of k random variables from a large set, in order to obtain the best linear prediction of another variable of interest. This problem can be viewed in the context of both feature selection and sparse approximation. We analyze the performance of widely used greedy heuristics, using insights from the maximization of submodular functions and spectral analysis. We introduce the submodularity ratio as a key quantity to help understand why greedy algorithms perform well even when the variables are highly correlated. Using our techniques, we obtain the strongest known approximation guarantees for this problem, both in terms of the submodularity ratio and the smallest k-sparse eigenvalue of the covariance matrix. We further demonstrate the wide applicability of our techniques by analyzing greedy algorithms for the dictionary selection problem, and significantly improve the previously known guarantees. Our theoretical analysis is complemented by experiments on real-world and synthetic data sets; the experiments show that the submodularity ratio is a stronger predictor of the performance of greedy algorithms than other spectral parameters.
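The greedy heuristic the abstract refers to can be illustrated with a minimal sketch of forward selection. It is not the authors' implementation. It assumes the candidate variables and the target are normalized to unit variance, so the squared multiple correlation of a subset S is R^2(S) = b_S^T C_S^{-1} b_S, where C is the covariance matrix of the candidates and b holds their covariances with the target. The function names (`solve`, `r_squared`, `forward_selection`) and the toy numbers are hypothetical and chosen only for illustration.

```python
def solve(A, y):
    """Solve A x = y by Gaussian elimination with partial pivoting."""
    n = len(y)
    M = [list(A[i]) + [y[i]] for i in range(n)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def r_squared(C, b, S):
    """R^2 of the best linear predictor using the variables indexed by S.

    Assumes unit-variance variables and a non-singular sub-matrix C_S.
    """
    if not S:
        return 0.0
    C_S = [[C[i][j] for j in S] for i in S]
    b_S = [b[i] for i in S]
    w = solve(C_S, b_S)  # regression coefficients on S
    return sum(wi * bi for wi, bi in zip(w, b_S))


def forward_selection(C, b, k):
    """Greedily add, k times, the variable that most increases R^2."""
    S = []
    for _ in range(k):
        candidates = [i for i in range(len(b)) if i not in S]
        best = max(candidates, key=lambda i: r_squared(C, b, S + [i]))
        S.append(best)
    return S, r_squared(C, b, S)


# Toy instance (made-up numbers): variables 0 and 1 are highly correlated,
# variable 2 is uncorrelated with both.
C = [[1.0, 0.9, 0.0],
     [0.9, 1.0, 0.0],
     [0.0, 0.0, 1.0]]
b = [0.6, 0.55, 0.5]
selected, r2 = forward_selection(C, b, 2)
```

In this toy instance the second pick is variable 2 rather than variable 1: variable 1 has the larger individual covariance with the target, but it is nearly redundant once variable 0 has been chosen, so it adds almost nothing to R^2.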

research
03/08/2017

Scalable Greedy Feature Selection via Weak Submodularity

Greedy algorithms are widely used for problems in machine learning such ...
research
05/12/2023

Revisiting Matching Pursuit: Beyond Approximate Submodularity

We study the problem of selecting a subset of vectors from a large set, ...
research
03/06/2019

Fast Parallel Algorithms for Feature Selection

In this paper, we analyze a fast parallel algorithm to efficiently selec...
research
04/24/2019

Beyond Adaptive Submodularity: Approximation Guarantees of Greedy Policy with Adaptive Submodularity Ratio

We propose a new concept named adaptive submodularity ratio to study the...
research
11/21/2020

Near-Optimal Data Source Selection for Bayesian Learning

We study a fundamental problem in Bayesian learning, where the goal is t...
research
02/11/2021

SLS (Single ℓ_1 Selection): a new greedy algorithm with an ℓ_1-norm selection rule

In this paper, we propose a new greedy algorithm for sparse approximatio...
research
02/28/2022

Fast Feature Selection with Fairness Constraints

We study the fundamental problem of selecting optimal features for model...
