Importance Sampling via Local Sensitivity

11/04/2019
by Anant Raj, et al.

Given a loss function F:X→R^+ that can be written as the sum of losses over a large set of inputs a_1, ..., a_n, it is often desirable to approximate F by subsampling the input points. Strong theoretical guarantees require taking into account the importance of each point, measured by how much its individual loss contributes to F(x). Maximizing this importance over all x ∈ X yields the sensitivity score of a_i. Sampling with probabilities proportional to these scores gives strong provable guarantees, allowing one to approximately minimize F using just the subsampled points. Unfortunately, sensitivity sampling is difficult to apply since 1) it is unclear how to efficiently compute the sensitivity scores and 2) the sample size required is often too large to be useful. We propose overcoming both obstacles by introducing the local sensitivity, which measures data point importance in a ball around some center x_0. We show that the local sensitivity can be efficiently estimated using the leverage scores of a quadratic approximation to F, and that the sample size required to approximate F around x_0 can be bounded. We propose employing local sensitivity sampling in an iterative optimization method and illustrate its usefulness by analyzing its convergence when F is smooth and convex.
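As a concrete illustration of the sampling idea (not the paper's algorithm), consider the least-squares loss F(x) = Σ_i (a_i^T x − b_i)^2, a case where the sensitivity scores are known to coincide with the leverage scores of the data matrix A up to constant factors. The minimal Python sketch below samples rows proportionally to their leverage scores and reweights them so the subsampled loss is an unbiased estimate of F; the helper names leverage_scores and leverage_sample are hypothetical, introduced here for illustration.

```python
import numpy as np

def leverage_scores(A):
    # Leverage score of row i is ||U[i]||^2, where A = U S V^T (thin SVD).
    U, _, _ = np.linalg.svd(A, full_matrices=False)
    return np.sum(U**2, axis=1)

def leverage_sample(A, b, m, seed=0):
    # Sample m rows with probability proportional to leverage score, and
    # reweight each sampled row by 1/(m * p_i) so the subsampled
    # least-squares loss is an unbiased estimate of the full loss.
    rng = np.random.default_rng(seed)
    scores = leverage_scores(A)
    p = scores / scores.sum()
    idx = rng.choice(A.shape[0], size=m, replace=True, p=p)
    w = np.sqrt(1.0 / (m * p[idx]))  # sqrt weights, since the loss squares rows
    return A[idx] * w[:, None], b[idx] * w

# Hypothetical usage on synthetic data: minimize F on the subsample only.
rng = np.random.default_rng(1)
A, b = rng.normal(size=(10000, 20)), rng.normal(size=10000)
As, bs = leverage_sample(A, b, m=500)
x_hat = np.linalg.lstsq(As, bs, rcond=None)[0]
```

Solving the reweighted subsampled problem approximately minimizes F; the local variant proposed in the paper would instead recompute such scores from a quadratic approximation of F in a ball around the current iterate x_0, repeating this inside an iterative optimization loop.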


