
Dual optimization for convex constrained objectives without the gradient-Lipschitz assumption

by Martin Bompaire, et al.

The minimization of convex objectives arising from linear supervised learning problems, such as penalized generalized linear models, can be formulated as finite sums of convex functions. For such problems, a large family of stochastic first-order solvers based on variance reduction is available, combining computational efficiency with sound theoretical guarantees (linear convergence rates). Such rates are obtained under both gradient-Lipschitz and strong convexity assumptions. Motivated by learning problems that do not meet the gradient-Lipschitz assumption, such as linear Poisson regression, we work under an alternative smoothness assumption and obtain a linear convergence rate for a shifted version of Stochastic Dual Coordinate Ascent (SDCA) that improves on the current state of the art. Our motivation for considering a solver working on the Fenchel dual problem comes from the fact that such objectives include many linear constraints, which are easier to handle in the dual. Our approach and theoretical findings are validated on several datasets, for Poisson regression and for another objective derived from the negative log-likelihood of the Hawkes process, a family of models that has proved extremely useful for modeling information propagation in social networks and for causality inference.
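To make the dual coordinate idea concrete, the sketch below implements plain SDCA on the classical smooth, strongly convex case: ridge-regularized least squares, where each dual coordinate update has a closed form. This is only an illustration of the base algorithm the paper builds on, not the shifted SDCA variant or the Poisson dual studied in the paper; all function and variable names are our own.

```python
import numpy as np

def sdca_ridge(X, y, lam=1.0, n_epochs=1000, seed=0):
    """Minimal SDCA sketch for ridge-regularized least squares.

    Solves  min_w (1/n) sum_i 0.5*(x_i.w - y_i)^2 + 0.5*lam*||w||^2
    by maximizing the Fenchel dual one coordinate alpha_i at a time,
    keeping the primal iterate w = X.T @ alpha / (lam * n) in sync.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    alpha = np.zeros(n)
    w = np.zeros(d)
    sq_norms = (X ** 2).sum(axis=1)  # ||x_i||^2, precomputed once
    for _ in range(n_epochs):
        for i in rng.permutation(n):
            # Closed-form maximizer of the dual in coordinate i
            # (exact line search is available because the loss is quadratic).
            delta = (y[i] - X[i] @ w - alpha[i]) / (1.0 + sq_norms[i] / (lam * n))
            alpha[i] += delta
            w += delta * X[i] / (lam * n)  # keep primal-dual link exact
    return w

# Tiny usage example on synthetic data.
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, -1.0]])
y = X @ np.array([0.5, -0.3])
w = sdca_ridge(X, y, lam=0.1)
```

Because each step maximizes the dual exactly in one coordinate, the iterates converge linearly to the ridge solution; for losses without a Lipschitz gradient, such as the Poisson log-likelihood, this closed form no longer exists, which is the difficulty the paper's shifted SDCA addresses.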


Related research:
- Convergence Rates for Deterministic and Stochastic Subgradient Methods Without Lipschitz Continuity
- Fast and Simple Optimization for Poisson Likelihood Models
- Decentralized Feature-Distributed Optimization for Generalized Linear Models
- SARAH: A Novel Method for Machine Learning Problems Using Stochastic Recursive Gradient
- Fast Stochastic Variance Reduced ADMM for Stochastic Composition Optimization
- Linear convergence of SDCA in statistical estimation
- Self-concordant analysis of Frank-Wolfe algorithms