Statistical Inference for Polyak-Ruppert Averaged Zeroth-order Stochastic Gradient Algorithm

by Yanhao Jin, et al.

As machine learning models are deployed in critical applications, it becomes important not only to provide point estimators of the model parameters (or subsequent predictions), but also to quantify the uncertainty associated with estimating the model parameters via confidence sets. In the last decade, estimation or training in many machine learning models has become synonymous with running stochastic gradient algorithms. However, in several settings computing the stochastic gradients is highly expensive or even impossible. An important question that has not yet been addressed sufficiently in the statistical machine learning literature is how to equip zeroth-order stochastic gradient algorithms, which rely only on function evaluations, with practical yet rigorous inferential capabilities. Towards this, we first establish a central limit theorem for the Polyak-Ruppert averaged stochastic gradient algorithm in the zeroth-order setting. We then provide online estimators of the asymptotic covariance matrix appearing in the central limit theorem, thereby providing a practical procedure for constructing asymptotically valid confidence sets (or intervals) for parameter estimation (or prediction) in the zeroth-order setting.
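To make the setting concrete, the following is a minimal sketch of a Polyak-Ruppert averaged zeroth-order stochastic gradient iteration. It is an illustration under stated assumptions, not the authors' exact algorithm: the two-point Gaussian-direction gradient estimator, the quadratic loss `noisy_loss`, the step-size schedule `eta0 * t ** (-decay)`, and all parameter names are hypothetical choices made for the example. The online covariance estimators described in the abstract are not reproduced here.

```python
import random


def noisy_loss(x, xi, target):
    """Hypothetical stochastic objective F(x; xi) = 0.5 * ||x - target - xi||^2.

    Its expectation over xi is minimized at x = target.
    """
    return 0.5 * sum((xj - tj - ej) ** 2 for xj, tj, ej in zip(x, target, xi))


def zo_sgd_averaged(target, n_steps=20000, eta0=0.3, decay=0.75, h=1e-3, seed=0):
    """Run zeroth-order SGD and return (last iterate, Polyak-Ruppert average).

    Only function evaluations of noisy_loss are used; no gradient is computed.
    All defaults are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    d = len(target)
    x = [0.0] * d      # current iterate
    x_bar = [0.0] * d  # running average of the iterates

    for t in range(1, n_steps + 1):
        xi = [rng.gauss(0.0, 1.0) for _ in range(d)]  # data noise for this step
        u = [rng.gauss(0.0, 1.0) for _ in range(d)]   # random search direction

        # Two-point estimate of the directional derivative along u,
        # evaluated with the same noise sample xi at both points.
        x_plus = [xj + h * uj for xj, uj in zip(x, u)]
        slope = (noisy_loss(x_plus, xi, target) - noisy_loss(x, xi, target)) / h

        # Gradient surrogate is slope * u; take a step with a decaying step size.
        eta = eta0 * t ** (-decay)
        x = [xj - eta * slope * uj for xj, uj in zip(x, u)]

        # Online update of the Polyak-Ruppert average: mean of x_1, ..., x_t.
        x_bar = [bj + (xj - bj) / t for bj, xj in zip(x_bar, x)]

    return x, x_bar


if __name__ == "__main__":
    last, averaged = zo_sgd_averaged([1.0, -2.0])
    print("last iterate:    ", last)
    print("averaged iterate:", averaged)
```

The averaged iterate is the quantity for which the central limit theorem is stated; a confidence set would additionally require an estimate of the asymptotic covariance matrix, which this sketch leaves out.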




