Fast and Simple Optimization for Poisson Likelihood Models

08/03/2016
by Niao He, et al.

Poisson likelihood models are widely used in imaging, social networks, and time series analysis. We propose fast, simple, theoretically grounded, and versatile optimization algorithms for Poisson likelihood modeling. The Poisson log-likelihood is concave but not Lipschitz-continuous. Since almost all gradient-based optimization algorithms rely on Lipschitz continuity, optimizing Poisson likelihood models with a guarantee of convergence can be challenging, especially for large-scale problems. We present a new perspective that allows us to efficiently optimize a wide range of penalized Poisson likelihood objectives. We show that an appropriate saddle point reformulation enjoys a favorable geometry and a smooth structure. Consequently, we can design a new gradient-based optimization algorithm with an O(1/t) convergence rate, in contrast to the usual O(1/√t) rate of non-smooth minimization alternatives. Furthermore, to tackle problems with large sample sizes, we develop a randomized block-decomposition variant that enjoys the same convergence rate at a lower per-iteration cost. Experimental results on several point process applications, including social network estimation and temporal recommendation, show that the proposed algorithm and its randomized block variant outperform existing methods on both synthetic and real-world datasets.
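To make the structure concrete, here is a minimal sketch, in LaTeX, of a penalized Poisson likelihood objective and one standard saddle point device based on the conjugate identity log x = min_{v>0}(xv - log v - 1). The notation (covariates a_i, counts y_i, penalty h) is ours for illustration, and the reformulation shown is one natural instance of the idea rather than necessarily the paper's exact construction.

    % Penalized Poisson likelihood objective (illustrative notation):
    % counts y_i >= 0, covariates a_i, intensity <a_i, w>, penalty h(w).
    \[
      \min_{w \in W} \; F(w) = \sum_{i=1}^{n} \Big( \langle a_i, w \rangle
        - y_i \log \langle a_i, w \rangle \Big) + h(w).
    \]
    % F is convex, but the gradient of the log term blows up as
    % <a_i, w> -> 0, so F is not Lipschitz on its domain.
    % The conjugate identity \log x = \min_{v > 0} ( x v - \log v - 1 )
    % turns each log term into an inner maximization, giving a
    % convex-concave saddle point problem with a bilinear coupling:
    \[
      \min_{w \in W} \max_{v > 0} \; \sum_{i=1}^{n} \Big( \langle a_i, w \rangle
        - y_i v_i \langle a_i, w \rangle + y_i \log v_i + y_i \Big) + h(w).
    \]

The bilinear coupling between w and v is exactly the kind of smooth structure that gradient-based saddle point methods such as Mirror Prox exploit to attain an O(1/t) rate.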
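Below is a minimal, self-contained numpy sketch of a Euclidean extragradient (Mirror Prox style) loop on this reformulation. The function names, the l2 penalty, the nonnegativity constraint on w, and the box [v_lo, v_hi] are all our illustrative assumptions; the paper's algorithm uses its own prox geometry, and its randomized block variant would update only a sampled block of the v-coordinates per iteration rather than the full vector.

    import numpy as np

    def saddle_gradients(w, v, A, y, lam):
        """Gradients of phi(w, v) = sum_i (<a_i, w> - y_i v_i <a_i, w>
        + y_i log v_i + y_i) + 0.5 * lam * ||w||^2 (illustrative penalty)."""
        Aw = A @ w
        grad_w = A.T @ (1.0 - y * v) + lam * w   # d phi / d w (descent direction)
        grad_v = -y * Aw + y / v                 # d phi / d v (ascent direction)
        return grad_w, grad_v

    def extragradient_poisson(A, y, lam=0.1, step=0.01, iters=500,
                              v_lo=1e-3, v_hi=1e3):
        n, d = A.shape
        w, v = np.zeros(d), np.ones(n)
        w_avg, v_avg = np.zeros(d), np.zeros(n)
        for t in range(iters):
            # Extrapolation step at the current point.
            gw, gv = saddle_gradients(w, v, A, y, lam)
            w_half = np.maximum(w - step * gw, 0.0)      # project onto w >= 0
            v_half = np.clip(v + step * gv, v_lo, v_hi)  # keep v in a positive box
            # Correction step, using gradients at the extrapolated point.
            gw, gv = saddle_gradients(w_half, v_half, A, y, lam)
            w = np.maximum(w - step * gw, 0.0)
            v = np.clip(v + step * gv, v_lo, v_hi)
            # O(1/t) guarantees hold for the ergodic (averaged) iterate.
            w_avg += w_half
            v_avg += v_half
        return w_avg / iters, v_avg / iters

For example, with a nonnegative design matrix A (so the intensities <a_i, w> stay nonnegative on W = {w >= 0}), `w_hat, v_hat = extragradient_poisson(A, y)` returns the averaged primal-dual iterates; the averaging, rather than the last iterate, is what carries the O(1/t) guarantee for this class of methods.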

Related research

07/10/2018
Dual optimization for convex constrained objectives without the gradient-Lipschitz assumption
The minimization of convex objectives coming from linear supervised lear...

10/04/2016
A Generic Quasi-Newton Algorithm for Faster Gradient-Based Optimization
We propose a generic approach to accelerate gradient-based optimization ...

06/13/2023
Learning Unnormalized Statistical Models via Compositional Optimization
Learning unnormalized statistical models (e.g., energy-based models) is ...

06/18/2020
Improving the Convergence Rate of One-Point Zeroth-Order Optimization using Residual Feedback
Many existing zeroth-order optimization (ZO) algorithms adopt two-point ...

09/26/2019
Two Time-scale Off-Policy TD Learning: Non-asymptotic Analysis over Markovian Samples
Gradient-based temporal difference (GTD) algorithms are widely used in o...

06/30/2022
On the Learning and Learnability of Quasimetrics
Our world is full of asymmetries. Gravity and wind can make reaching a p...

07/17/2022
Uniform Stability for First-Order Empirical Risk Minimization
We consider the problem of designing uniformly stable first-order optimi...
