Biased Stochastic Gradient Descent for Conditional Stochastic Optimization

02/25/2020
by   Yifan Hu, et al.
13

Conditional Stochastic Optimization (CSO) covers a variety of applications ranging from meta-learning and causal inference to invariant learning. However, constructing unbiased gradient estimates in CSO is challenging due to the composition structure. As an alternative, we propose a biased stochastic gradient descent (BSGD) algorithm and study the bias-variance tradeoff under different structural assumptions. We establish the sample complexities of BSGD for strongly convex, convex, and weakly convex objectives, under smooth and non-smooth conditions. We also provide matching lower bounds of BSGD for convex CSO objectives. Extensive numerical experiments are conducted to illustrate the performance of BSGD on robust logistic regression, model-agnostic meta-learning (MAML), and instrumental variable regression (IV).

READ FULL TEXT

page 1

page 2

page 3

page 4

research
04/20/2023

Debiasing Conditional Stochastic Optimization

In this paper, we study the conditional stochastic optimization (CSO) pr...
research
06/26/2020

Nearest Neighbour Based Estimates of Gradients: Sharp Nonasymptotic Bounds and Applications

Motivated by a wide variety of applications, ranging from stochastic opt...
research
12/05/2019

Lower Bounds for Non-Convex Stochastic Optimization

We lower bound the complexity of finding ϵ-stationary points (with gradi...
research
08/25/2020

Solving Stochastic Compositional Optimization is Nearly as Easy as Solving Stochastic Optimization

Stochastic compositional optimization generalizes classic (non-compositi...
research
03/25/2019

Learning-to-Learn Stochastic Gradient Descent with Biased Regularization

We study the problem of learning-to-learn: inferring a learning algorith...
research
05/20/2014

Convex Optimization: Algorithms and Complexity

This monograph presents the main complexity theorems in convex optimizat...
research
12/14/2020

Noisy Linear Convergence of Stochastic Gradient Descent for CV@R Statistical Learning under Polyak-Łojasiewicz Conditions

Conditional Value-at-Risk (CV@R) is one of the most popular measures of ...

Please sign up or login with your details

Forgot password? Click here to reset