A Penalty Based Method for Communication-Efficient Decentralized Bilevel Programming

by Parvin Nazari, et al.

Bilevel programming has recently received attention in the literature due to its wide range of applications, including reinforcement learning and hyper-parameter optimization. However, it is widely assumed that the underlying bilevel optimization problem is solved either by a single machine or by multiple machines connected in a star-shaped network, i.e., the federated learning setting. The latter approach suffers from a high communication cost at the central node (e.g., a parameter server) and exhibits privacy vulnerabilities. Hence, it is of interest to develop methods that solve bilevel optimization problems in a communication-efficient, decentralized manner. To that end, this paper introduces a penalty-function-based decentralized algorithm with theoretical guarantees for this class of optimization problems. Specifically, a distributed alternating gradient-type algorithm for solving consensus bilevel programming over a decentralized network is developed. A key feature of the proposed algorithm is that it estimates the hyper-gradient of the penalty function via decentralized computation of matrix-vector products and a few vector communications; this estimate is then integrated into the alternating algorithm to obtain a finite-time convergence analysis under different convexity assumptions. Owing to the generality of this complexity analysis, the result yields convergence rates for a wide variety of consensus problems, including minimax and compositional optimization. Empirical results on both synthetic and real datasets demonstrate that the proposed method works well in practice.
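To make the phrase "matrix-vector products" concrete, the sketch below shows one common way an inverse-Hessian-vector product (the costly part of a hyper-gradient) can be approximated without ever inverting a matrix: a truncated Neumann series. This is an illustrative assumption, not necessarily the estimator used in the paper. The names `matvec`, `neumann_inverse_matvec`, `step`, and `num_terms` are hypothetical, and the sketch runs on a single machine and does not model the decentralized communication.

```python
def matvec(A, v):
    """Multiply a dense matrix (list of rows) by a vector (list of floats)."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def neumann_inverse_matvec(A, v, step, num_terms):
    """Approximate A^{-1} v using only matrix-vector products.

    Uses the truncated Neumann series
        A^{-1} v  ~  step * sum_{k=0}^{num_terms-1} (I - step * A)^k v,
    which converges for symmetric positive definite A when
    0 < step < 2 / lambda_max(A).
    """
    term = list(v)    # holds (I - step*A)^k v, starting at k = 0
    total = list(v)   # running sum of the series terms
    for _ in range(num_terms - 1):
        Av = matvec(A, term)
        term = [t - step * a for t, a in zip(term, Av)]
        total = [s + t for s, t in zip(total, term)]
    return [step * s for s in total]


# Hypothetical toy data: a small positive definite "Hessian" and a vector.
A_example = [[2.0, 0.0], [0.0, 4.0]]
v_example = [2.0, 4.0]
approx = neumann_inverse_matvec(A_example, v_example, step=0.2, num_terms=200)
print(approx)
```

Each loop iteration costs one matrix-vector product, so the accuracy of the estimate can be traded against computation by choosing `num_terms`.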


A Decentralized Adaptive Momentum Method for Solving a Class of Min-Max Optimization Problems

Min-max saddle point games have recently been intensely studied, due to ...

On Penalty-based Bilevel Gradient Descent Method

Bilevel optimization enjoys a wide range of applications in hyper-parame...

DADAM: A Consensus-based Distributed Adaptive Gradient Method for Online Optimization

Adaptive gradient-based optimization methods such as ADAGRAD, RMSPROP, a...

Decentralized Stochastic Gradient Descent Ascent for Finite-Sum Minimax Problems

Minimax optimization problems have attracted significant attention in re...

FEDNEST: Federated Bilevel, Minimax, and Compositional Optimization

Standard federated optimization methods successfully apply to stochastic...

On the Convergence of Decentralized Federated Learning Under Imperfect Information Sharing

Decentralized learning and optimization is a central problem in control ...

Implicit Bilevel Optimization: Differentiating through Bilevel Optimization Programming

Bilevel Optimization Programming is used to model complex and conflictin...