Bilevel stochastic methods for optimization and machine learning: Bilevel stochastic descent and DARTS

10/01/2021
by Tommaso Giovannelli, et al.

Two-level stochastic optimization formulations have become instrumental in a number of machine learning contexts, such as neural architecture search, continual learning, adversarial learning, and hyperparameter tuning. Practical stochastic bilevel optimization problems become challenging in optimization or learning scenarios where the number of variables is high or there are constraints. The goal of this paper is twofold. First, we aim to promote the use of bilevel optimization in large-scale learning, and we introduce a practical bilevel stochastic gradient method (BSG-1) that requires neither lower-level second-order derivatives nor linear system solves, and avoids matrix-vector products altogether. Our BSG-1 method stays close to first-order principles, which allows it to outperform methods that do not, such as DARTS. Second, we develop bilevel stochastic gradient descent for bilevel problems with lower-level constraints, and we introduce a convergence theory that covers both the unconstrained and constrained cases and abstracts as much as possible from the specifics of the bilevel gradient calculation.
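To make the algorithmic idea concrete, the loop below is a minimal sketch, not the paper's exact algorithm: the callables grad_fu_x, grad_fu_y, grad_fl_x, grad_fl_y (stochastic gradients of the upper- and lower-level objectives with respect to x and y) and the rank-one Hessian approximation are illustrative assumptions, chosen so that, as in the abstract's description of BSG-1, each step uses only first-order information.

import numpy as np

def bsg1_step(x, y, grad_fu_x, grad_fu_y, grad_fl_x, grad_fl_y,
              alpha=1e-2, beta=1e-2, inner_steps=5):
    """One outer iteration of a hypothetical bilevel stochastic gradient loop:
    approximately solve the lower level in y, then step in x."""
    # Inner loop: a few stochastic gradient steps on the lower-level problem
    # to move y toward (an approximation of) the lower-level solution.
    for _ in range(inner_steps):
        y = y - beta * grad_fl_y(x, y)

    # Sample all four stochastic gradients at the current point.
    gux, guy = grad_fu_x(x, y), grad_fu_y(x, y)
    glx, gly = grad_fl_x(x, y), grad_fl_y(x, y)

    # Adjoint (bilevel) gradient:
    #   g = grad_x f_u - H_xy(f_l) [H_yy(f_l)]^{-1} grad_y f_u.
    # Replacing both Hessian blocks by rank-one gradient outer products
    # (an illustrative assumption) collapses the second term to a scalar
    # ratio, so no Hessians, solves, or matrix-vector products are needed.
    ratio = float(np.dot(gly, guy)) / max(float(np.dot(gly, gly)), 1e-12)
    g = gux - ratio * glx

    return x - alpha * g, y

The design point the sketch illustrates: once the lower-level Hessian blocks are approximated by gradient outer products, the adjoint term reduces to a scalar ratio of inner products, which is what removes second-order derivatives, system solves, and matrix-vector products from each step.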


