Frank-Wolfe Algorithm for Exemplar Selection

11/06/2018
by Gary Cheng, et al.

In this paper, we consider the problem of selecting representatives from a data set for arbitrary supervised/unsupervised learning tasks. We identify a subset S of a data set A such that 1) the size of S is much smaller than that of A and 2) S efficiently describes the entire data set, in a way formalized via auto-regression. The set S, also known as the exemplars of the data set A, is constructed by solving a convex auto-regressive version of dictionary learning in which both the dictionary and the measurements are given by the data matrix. We show that to generate |S| = k exemplars, our algorithm, Frank-Wolfe Sparse Representation (FWSR), requires only ≈ k iterations with a per-iteration cost that is quadratic in the size of A, an order of magnitude faster than state-of-the-art methods. We test our algorithm against current methods on four different data sets and outperform other exemplar-finding methods in almost all scenarios. We also test our algorithm qualitatively by selecting exemplars from a corpus of Donald Trump's and Hillary Clinton's Twitter posts.
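
To make the idea concrete, here is a minimal sketch of a Frank-Wolfe loop for an auto-regressive objective of this kind. It is not the authors' implementation. The abstract does not state the exact formulation, so the sketch assumes the objective 0.5·||A − AX||_F² with data points as the columns of A, a constraint that the sum of row-wise max-norms of X is at most `beta`, and the standard step size 2/(t+2). The function name `frank_wolfe_exemplars` and its parameters are hypothetical. Under these assumptions each iteration activates at most one new row of X, which mirrors the claim that k exemplars need about k iterations, and the rank-one update keeps the per-iteration cost quadratic in the number of points.

```python
import numpy as np

def frank_wolfe_exemplars(A, beta, n_iters):
    """Illustrative Frank-Wolfe sketch (assumed formulation, not the paper's code).

    A: (d, n) array whose columns are data points.
    Minimizes 0.5 * ||A - A X||_F^2 subject to sum_i max_j |X[i, j]| <= beta.
    Returns X and the indices of rows with non-zero entries (candidate exemplars).
    """
    n = A.shape[1]
    K = A.T @ A                  # Gram matrix, computed once
    X = np.zeros((n, n))
    KX = np.zeros((n, n))        # running value of K @ X, updated incrementally
    for t in range(n_iters):
        G = KX - K               # gradient A^T (A X - A)
        # Linear minimization step: under the assumed constraint, the best
        # vertex has a single non-zero row, the one with the largest l1 norm in G.
        i = int(np.argmax(np.abs(G).sum(axis=1)))
        s = -beta * np.sign(G[i])            # the non-zero row of that vertex
        gamma = 2.0 / (t + 2.0)              # standard Frank-Wolfe step size
        X *= 1.0 - gamma
        X[i] += gamma * s
        # Rank-one update of K @ X, O(n^2) instead of a full matrix product.
        KX = (1.0 - gamma) * KX + gamma * np.outer(K[:, i], s)
    exemplars = np.flatnonzero(np.abs(X).max(axis=1) > 0)
    return X, exemplars
```

In this sketch, the returned indices identify which columns of A act as exemplars, and at most `n_iters` of them can be selected. The value of `beta` is a tuning parameter here; how the paper chooses it is not stated in the abstract.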

research
01/30/2019

The Wilderness Area Data Set: Adapting the Covertype data set for unsupervised learning

Benchmark data sets are of vital importance in machine learning research...
research
07/17/2023

Reduced Kernel Dictionary Learning

In this paper we present new algorithms for training reduced-size nonlin...
research
09/07/2018

Fast greedy algorithms for dictionary selection with generalized sparsity constraints

In dictionary selection, several atoms are selected from finite candidat...
research
06/07/2013

Loss-Proportional Subsampling for Subsequent ERM

We propose a sampling scheme suitable for reducing a data set prior to s...
research
12/18/2014

Example Selection For Dictionary Learning

In unsupervised learning, an unbiased uniform sampling strategy is typic...
research
10/17/2014

Generalized Conditional Gradient for Sparse Estimation

Structured sparsity is an important modeling tool that expands the appli...
research
06/06/2023

Statistical inference for sketching algorithms

Sketching algorithms use random projections to generate a smaller sketch...
