ProtoDash: Fast Interpretable Prototype Selection

07/05/2017
by Karthik S. Gurumoorthy, et al.

In this paper we propose ProtoDash, an efficient algorithm for selecting prototypical examples from complex datasets. Our work builds on the learn to criticize (L2C) work by Kim et al. (2016) and generalizes it: in addition to selecting prototypes for a given sparsity level m, it associates a non-negative weight with each prototype that indicates its importance. Unlike L2C, this extension provides a single coherent framework under which both prototypes and criticisms (i.e. the lowest-weighted prototypes) can be found. Furthermore, our framework works for any symmetric positive definite kernel, thus addressing one of the open questions laid out in Kim et al. (2016). The additional requirement of learning non-negative weights introduces technical challenges, as the objective is no longer submodular as it was in the previous work. However, we show that the problem is weakly submodular and derive approximation guarantees for our fast ProtoDash algorithm. Moreover, ProtoDash can not only find prototypical examples for a dataset X, but can also find (weighted) prototypical examples from X^(2) that best represent another dataset X^(1), where X^(1) and X^(2) belong to the same feature space. We demonstrate the efficacy of our method on diverse domains, namely retail, digit recognition (MNIST), and the latest publicly available 40 health questionnaires obtained from the Center for Disease Control (CDC) website maintained by the US Dept. of Health. We validate the results quantitatively as well as qualitatively, based on expert feedback and recently published scientific studies on public health.
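
To make the idea of weighted prototype selection concrete, here is a minimal Python sketch. It is not the authors' reference implementation, and the abstract does not state the objective. The sketch assumes the objective is w^T mu - 0.5 * w^T K w with w >= 0, where K is the kernel matrix on X^(2) and mu_j is the mean kernel similarity between candidate j and X^(1). It also assumes a greedy rule that adds the candidate with the largest gradient and then refits the weights on the selected set. The Gaussian kernel, the projected-gradient solver, and all function names are illustrative choices.

```python
import numpy as np

def gaussian_kernel(A, B, sigma=1.0):
    # Symmetric positive definite kernel (illustrative choice; any SPD kernel could be used).
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def solve_nonneg_weights(K_S, mu_S, iters=500):
    # Projected gradient ascent on w^T mu - 0.5 w^T K w subject to w >= 0.
    step = 1.0 / np.linalg.eigvalsh(K_S)[-1]  # 1 / largest eigenvalue
    w = np.zeros(len(mu_S))
    for _ in range(iters):
        w = np.maximum(w + step * (mu_S - K_S @ w), 0.0)
    return w

def select_prototypes(X1, X2, m, sigma=1.0):
    # Hypothetical helper: pick up to m weighted prototypes from X2 that represent X1.
    K = gaussian_kernel(X2, X2, sigma)
    mu = gaussian_kernel(X2, X1, sigma).mean(axis=1)
    chosen = np.zeros(len(X2), dtype=bool)
    selected = []
    w_full = np.zeros(len(X2))
    for _ in range(m):
        grad = mu - K @ w_full          # gradient of the assumed objective
        grad[chosen] = -np.inf          # never pick the same point twice
        j = int(np.argmax(grad))
        if grad[j] <= 0:                # no candidate improves the objective
            break
        chosen[j] = True
        selected.append(j)
        w = solve_nonneg_weights(K[np.ix_(selected, selected)], mu[selected])
        w_full[:] = 0.0
        w_full[selected] = w
    return np.array(selected), w_full[selected]

# Toy usage: two well-separated clusters, with X1 and X2 the same dataset.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.1, size=(20, 2)),
               rng.normal(10.0, 0.1, size=(20, 2))])
idx, w = select_prototypes(X, X, m=2)
```

On this toy data the two selected indices should fall in different clusters, since once one cluster has a prototype the gradient for its remaining points is close to zero. Under the abstract's description, criticisms would correspond to the lowest-weighted entries of `w` when more prototypes are requested.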


research
07/21/2018

Streaming Methods for Restricted Strongly Convex Functions with Applications to Prototype Selection

In this paper, we show that if the optimization function is restricted-s...
research
04/19/2019

Submodular Maximization Beyond Non-negativity: Guarantees, Fast Algorithms, and Applications

It is generally believed that submodular functions -- and the more gener...
research
01/19/2023

Weighted EF1 Allocations for Indivisible Chores

We study how to fairly allocate a set of indivisible chores to a group o...
research
07/04/2022

Correlated Stochastic Knapsack with a Submodular Objective

We study the correlated stochastic knapsack problem of a submodular targ...
research
06/18/2018

Overlapping Clustering Models, and One (class) SVM to Bind Them All

People belong to multiple communities, words belong to multiple topics, ...
research
04/04/2018

Sparse non-negative super-resolution - simplified and stabilised

The convolution of a discrete measure, x = ∑_{i=1}^{k} a_i δ_{t_i}, with a local w...
research
01/14/2020

Weighted Completion Time Minimization for Unrelated Machines via Iterative Fair Contention Resolution

We give a 1.488-approximation for the classic scheduling problem of mini...
