
Not All Samples Are Created Equal: Deep Learning with Importance Sampling

03/02/2018
by Angelos Katharopoulos, et al.

Deep neural network training spends most of the computation on examples that are already properly handled, and could be ignored. We propose to mitigate this phenomenon with a principled importance sampling scheme that focuses computation on "informative" examples, and reduces the variance of the stochastic gradients during training. Our contribution is twofold: first, we derive a tractable upper bound to the per-sample gradient norm, and second, we derive an estimator of the variance reduction achieved with importance sampling, which enables us to switch it on when it will result in an actual speedup. The resulting scheme can be used by changing a few lines of code in a standard SGD procedure, and we demonstrate experimentally, on image classification, CNN fine-tuning, and RNN training, that for a fixed wall-clock time budget, it provides a reduction of the train losses of up to an order of magnitude and a relative improvement of test errors between 5% and 17%.
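To make the scheme concrete, here is a minimal, self-contained sketch of the core idea on binary logistic regression, where the per-sample gradient norm has a closed form (for a deep network, the paper substitutes its tractable upper bound). The toy data, variable names, and hyperparameters are illustrative assumptions, not the authors' code:

import numpy as np

rng = np.random.default_rng(0)

# Toy data: binary logistic regression, a stand-in for a deep model
# chosen so the per-sample gradient norm is available in closed form.
N, D = 2000, 20
X = rng.normal(size=(N, D))
w_true = rng.normal(size=D)
y = (X @ w_true + 0.5 * rng.normal(size=N) > 0).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

w = np.zeros(D)
lr, B, T = 0.1, 32, 500  # step size, batch size, iterations

for t in range(T):
    # Per-sample gradient norm: |sigmoid(w.x_i) - y_i| * ||x_i||.
    # For deep nets the paper replaces this with a tractable upper bound.
    margins = sigmoid(X @ w)
    g_norm = np.abs(margins - y) * np.linalg.norm(X, axis=1)

    # Sample a minibatch with probability proportional to the norms.
    p = g_norm / g_norm.sum()
    idx = rng.choice(N, size=B, p=p)

    # Importance weights 1/(N * p_i) keep the gradient estimate unbiased.
    iw = 1.0 / (N * p[idx])
    residual = margins[idx] - y[idx]
    grad = (iw * residual) @ X[idx] / B
    w -= lr * grad

eps = 1e-12
pred = sigmoid(X @ w)
print("final log-loss:",
      -np.mean(y * np.log(pred + eps) + (1 - y) * np.log(1 - pred + eps)))

Sampling proportionally to the (bounded) gradient norms concentrates computation on hard examples, while the 1/(N p_i) weights keep the resulting gradient estimate unbiased, which is what allows it to drop into a standard SGD loop with only a few changed lines.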


Related Research

11/20/2015
Variance Reduction in SGD by Distributed Importance Sampling
Humans are able to accelerate their learning by selecting training mater...

02/06/2016
Importance Sampling for Minibatches
Minibatching is a very well studied and highly popular technique in supe...

10/25/2018
Finite-sample Guarantees for Winsorized Importance Sampling
Importance sampling is a widely used technique to estimate the propertie...

06/20/2022
Learning Optimal Flows for Non-Equilibrium Importance Sampling
Many applications in computational sciences and statistical inference re...

10/27/2021
How Important is Importance Sampling for Deep Budgeted Training?
Long iterative training processes for Deep Neural Networks (DNNs) are co...

01/10/2013
Policy Improvement for POMDPs Using Normalized Importance Sampling
We present a new method for estimating the expected return of a POMDP fr...

10/14/2019
A unified view of likelihood ratio and reparameterization gradients and an optimal importance sampling scheme
Reparameterization (RP) and likelihood ratio (LR) gradient estimators ar...

Code Repositories

importance-sampling

Code for experiments regarding importance sampling for training neural networks

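For orientation, a usage sketch is below. It follows the wrapper pattern suggested by the repository's README, but the module path, the ImportanceTraining class, and its arguments should be treated as assumptions rather than a verified API; check the repo before relying on them.

# Hypothetical usage sketch for the importance-sampling repo (Keras).
# The import path and ImportanceTraining wrapper are assumptions based
# on the repository's README; consult the repo for the actual interface.
import numpy as np
from keras.models import Sequential
from keras.layers import Dense

from importance_sampling.training import ImportanceTraining  # assumed API

# Small synthetic classification problem.
x = np.random.randn(1024, 20).astype("float32")
y = (x.sum(axis=1, keepdims=True) > 0).astype("float32")

model = Sequential([
    Dense(64, activation="relu", input_shape=(20,)),
    Dense(1, activation="sigmoid"),
])
model.compile(optimizer="sgd", loss="binary_crossentropy",
              metrics=["accuracy"])

# Wrapping the compiled model makes fit() draw importance-sampled batches.
ImportanceTraining(model).fit(x, y, batch_size=128, epochs=5)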