Survey schemes for stochastic gradient descent with applications to M-estimation

01/09/2015
by Stephan Clémençon, et al.

In certain situations, which will undoubtedly become more common in the Big Data era, the available datasets are so massive that computing statistics over the full sample is hardly feasible, if at all. A natural approach in this context consists in using survey schemes and replacing the "full data" statistics with their counterparts based on the resulting random samples of manageable size. The main purpose of this paper is to investigate the impact of survey sampling with unequal inclusion probabilities on stochastic gradient descent-based M-estimation methods in large-scale statistical and machine-learning problems. Precisely, we prove that, in the presence of some a priori information, one may significantly increase asymptotic accuracy by choosing appropriate first-order inclusion probabilities, without affecting complexity. These striking results are described here by limit theorems and are also illustrated by numerical experiments.
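
The idea of running stochastic gradient descent on survey samples drawn with unequal inclusion probabilities can be illustrated with a minimal sketch. The abstract does not say which sampling plan or estimator the paper uses, so everything here is an assumption made for illustration: Poisson sampling, Horvitz-Thompson weighting of the sampled gradients, a least-squares loss, a step size of lr0 / t, and the hypothetical function names inclusion_probabilities, ht_gradient and sgd_with_survey_sampling.

```python
import numpy as np

def inclusion_probabilities(aux, expected_size):
    """First-order inclusion probabilities proportional to auxiliary information.

    `aux` is a positive score per observation (assumed a priori information).
    """
    pi = expected_size * aux / aux.sum()
    return np.minimum(1.0, pi)  # probabilities cannot exceed 1

def ht_gradient(theta, X, y, pi, rng):
    """Horvitz-Thompson estimate of the full-sample mean least-squares gradient."""
    # Poisson sampling (assumed): keep observation i independently with prob. pi[i].
    mask = rng.random(len(y)) < pi
    Xs, ys, ps = X[mask], y[mask], pi[mask]
    residuals = Xs @ theta - ys
    # Dividing each sampled gradient by pi_i makes the estimate unbiased.
    return (Xs * (residuals / ps)[:, None]).sum(axis=0) / len(y)

def sgd_with_survey_sampling(X, y, pi, steps=2000, lr0=1.0, seed=0):
    """SGD where each step uses a fresh survey sample instead of the full data."""
    rng = np.random.default_rng(seed)
    theta = np.zeros(X.shape[1])
    for t in range(1, steps + 1):
        theta -= (lr0 / t) * ht_gradient(theta, X, y, pi, rng)
    return theta
```

In this sketch each step touches only about `expected_size` observations, so the per-step cost does not depend on how the inclusion probabilities are distributed across observations. Only the variance of the gradient estimate changes with that choice.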

research
06/28/2019

Neural ODEs as the Deep Limit of ResNets with constant weights

In this paper we prove that, in the deep limit, the stochastic gradient ...
research
09/29/2022

Computational Complexity of Sub-linear Convergent Algorithms

Optimizing machine learning algorithms that are used to solve the object...
research
02/09/2021

Berry–Esseen Bounds for Multivariate Nonlinear Statistics with Applications to M-estimators and Stochastic Gradient Descent Algorithms

We establish a Berry–Esseen bound for general multivariate nonlinear sta...
research
12/19/2019

Central limit theorems for stochastic gradient descent with averaging for stable manifolds

In this article we establish new central limit theorems for Ruppert-Poly...
research
06/21/2019

Trade-offs in Large-Scale Distributed Tuplewise Estimation and Learning

The development of cluster computing frameworks has allowed practitioner...
research
01/12/2015

Scaling-up Empirical Risk Minimization: Optimization of Incomplete U-statistics

In a wide range of statistical learning problems such as ranking, cluste...
research
03/21/2022

Joint Probabilities within Random Permutations

A celebrated analogy between prime factorizations of integers and cycle ...
