
Biased Stochastic Gradient Descent for Conditional Stochastic Optimization

by Yifan Hu, et al.

Conditional Stochastic Optimization (CSO) covers a variety of applications ranging from meta-learning and causal inference to invariant learning. However, constructing unbiased gradient estimates in CSO is challenging due to the composition structure. As an alternative, we propose a biased stochastic gradient descent (BSGD) algorithm and study the bias-variance tradeoff under different structural assumptions. We establish the sample complexities of BSGD for strongly convex, convex, and weakly convex objectives, under smooth and non-smooth conditions. We also provide matching lower bounds of BSGD for convex CSO objectives. Extensive numerical experiments are conducted to illustrate the performance of BSGD on robust logistic regression, model-agnostic meta-learning (MAML), and instrumental variable regression (IV).
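To make the bias-variance tradeoff concrete, here is a minimal sketch of a biased stochastic gradient step on a toy problem. It assumes the usual nested form of a CSO objective, F(x) = E_ξ[ f_ξ( E_{η|ξ}[ g_η(x, ξ) ] ) ], which the abstract does not spell out, and it replaces the inner conditional expectation with an average over m samples. The toy functions, the function name `bsgd`, and all parameter values are illustrative assumptions, not the authors' experimental setup.

```python
import numpy as np

def bsgd(x0, x_star, steps=3000, m=20, lr=0.02, noise=0.5, seed=0):
    """Biased SGD on a toy nested objective (hypothetical example).

    Assumed toy problem:
      xi ~ N(0, I),  eta | xi ~ N(xi, noise^2 I)
      inner function  g_eta(x, xi) = eta @ x
      outer function  f_xi(u)      = (u - xi @ x_star)^2
    so F(x) = E[(xi @ (x - x_star))^2], minimized at x_star.
    """
    rng = np.random.default_rng(seed)
    x = x0.astype(float).copy()
    d = x.size
    for _ in range(steps):
        xi = rng.normal(size=d)                        # outer sample
        y = xi @ x_star                                # target for this xi
        eta = xi + noise * rng.normal(size=(m, d))     # m inner samples given xi
        eta_bar = eta.mean(axis=0)                     # empirical inner mean
        u = eta_bar @ x                                # plug-in estimate of E[g | xi]
        # Chain rule through the plug-in estimate. This is biased because
        # E[eta_bar eta_bar^T | xi] = xi xi^T + (noise^2 / m) I, not xi xi^T.
        grad = 2.0 * (u - y) * eta_bar
        x -= lr * grad
    return x

x_star = np.array([1.0, -2.0, 0.5])
x_hat = bsgd(np.zeros(3), x_star)
print(np.linalg.norm(x_hat - x_star))
```

In this toy setting the bias shrinks as the inner batch size m grows, while each step costs m inner samples, which is the kind of tradeoff the abstract refers to.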




Asynchronous decentralized accelerated stochastic gradient descent

In this work, we introduce an asynchronous decentralized accelerated sto...

Nearest Neighbour Based Estimates of Gradients: Sharp Nonasymptotic Bounds and Applications

Motivated by a wide variety of applications, ranging from stochastic opt...

Lower Bounds for Non-Convex Stochastic Optimization

We lower bound the complexity of finding ϵ-stationary points (with gradi...

Solving Stochastic Compositional Optimization is Nearly as Easy as Solving Stochastic Optimization

Stochastic compositional optimization generalizes classic (non-compositi...

Learning-to-Learn Stochastic Gradient Descent with Biased Regularization

We study the problem of learning-to-learn: inferring a learning algorith...

Convex Optimization: Algorithms and Complexity

This monograph presents the main complexity theorems in convex optimizat...