A Provably Convergent Information Bottleneck Solution via ADMM

02/09/2021
by Teng-Hui Huang, et al.

The information bottleneck (IB) method optimizes the trade-off between compressing data and preserving the prediction accuracy of learned representations, and it has been applied successfully and robustly to both supervised and unsupervised representation learning problems. However, IB has several limitations. First, the IB problem is hard to optimize: the IB Lagrangian ℒ_IB := I(X;Z) - β I(Y;Z) is non-convex, and existing solutions guarantee only local convergence, so the solutions obtained depend on initialization. Second, evaluating a solution is also challenging. Conventionally, evaluation resorts to characterizing the information plane, that is, plotting I(Y;Z) versus I(X;Z) for all solutions obtained from different initial points. Furthermore, the IB Lagrangian exhibits phase transitions as the multiplier β varies. At phase transitions, both I(X;Z) and I(Y;Z) increase abruptly, and existing solutions converge significantly more slowly. Recent works on IB adopt variational surrogate bounds to the IB Lagrangian. Although these surrogates allow efficient optimization, it is not clear how close they are to the IB Lagrangian. In this work, we solve the IB Lagrangian using augmented Lagrangian methods. With augmented variables, we show that the IB objective can be solved with the alternating direction method of multipliers (ADMM). Unlike prior works, we prove that the proposed algorithm is consistently convergent, regardless of the value of β. Empirically, our gradient-descent-based method yields information plane points that are denser than, and comparable to, those obtained with conventional Blahut-Arimoto-based solvers.
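To make the objective concrete, the sketch below evaluates ℒ_IB = I(X;Z) - β I(Y;Z) for finite alphabets. It is only an illustration of the quantity being minimized, not the ADMM solver proposed in the paper. It assumes discrete X, Y, Z, the Markov chain Y - X - Z, and natural logarithms; the function names `mutual_information` and `ib_lagrangian` and the nested-list layout of the distributions are hypothetical choices made for this example.

```python
import math


def mutual_information(joint):
    """I(A;B) in nats for a joint distribution given as joint[a][b]."""
    p_a = [sum(row) for row in joint]        # marginal over b
    p_b = [sum(col) for col in zip(*joint)]  # marginal over a
    mi = 0.0
    for a, row in enumerate(joint):
        for b, p in enumerate(row):
            if p > 0:  # terms with zero probability contribute nothing
                mi += p * math.log(p / (p_a[a] * p_b[b]))
    return mi


def ib_lagrangian(p_xy, p_z_given_x, beta):
    """Return (L_IB, I(X;Z), I(Y;Z)) for an encoder p(z|x).

    Assumes the Markov chain Y - X - Z, so p(y,z) = sum_x p(x,y) p(z|x).
    p_xy[x][y] is the joint of X and Y; p_z_given_x[x][z] is the encoder.
    """
    n_x, n_y, n_z = len(p_xy), len(p_xy[0]), len(p_z_given_x[0])
    p_x = [sum(row) for row in p_xy]
    # Joint of X and Z: p(x,z) = p(x) p(z|x)
    p_xz = [[p_x[x] * p_z_given_x[x][z] for z in range(n_z)]
            for x in range(n_x)]
    # Joint of Y and Z, obtained by marginalizing X out
    p_yz = [[sum(p_xy[x][y] * p_z_given_x[x][z] for x in range(n_x))
             for z in range(n_z)]
            for y in range(n_y)]
    i_xz = mutual_information(p_xz)
    i_yz = mutual_information(p_yz)
    return i_xz - beta * i_yz, i_xz, i_yz
```

Evaluating `ib_lagrangian` for encoders obtained from different initial points, and plotting the returned I(Y;Z) against I(X;Z), gives the information plane described above. An encoder that maps every x to the same z yields zero for both mutual information terms, which is the trivial solution the trade-off parameter β moves away from.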


research
03/14/2022

A Linearly Convergent Douglas-Rachford Splitting Solver for Markovian Information-Theoretic Optimization Problems

In this work, we propose solving the Information bottleneck (IB) and Pri...
research
08/10/2023

Learning (With) Distributed Optimization

This paper provides an overview of the historical progression of distrib...
research
03/28/2023

Efficient Alternating Minimization Solvers for Wyner Multi-View Unsupervised Learning

In this work, we adopt Wyner common information framework for unsupervis...
research
12/14/2020

Disentangled Information Bottleneck

The information bottleneck (IB) method is a technique for extracting inf...
research
04/09/2018

Frank-Wolfe Splitting via Augmented Lagrangian Method

Minimizing a function over an intersection of convex sets is an importan...
research
11/25/2019

The Convex Information Bottleneck Lagrangian

The information bottleneck (IB) problem tackles the issue of obtaining r...
research
08/05/2019

Restricted Linearized Augmented Lagrangian Method for Euler's Elastica Model

Euler's elastica model has been extensively studied and applied to image...
