Sparse Representations of Positive Functions via Projected Pseudo-Mirror Descent

11/13/2020
by Abhishek Chakraborty, et al.

We consider the problem of expected risk minimization when the population loss is strongly convex and the target domain of the decision variable is required to be nonnegative, motivated by the settings of maximum likelihood estimation (MLE) and trajectory optimization. We restrict focus to the case where the decision variable belongs to a nonparametric Reproducing Kernel Hilbert Space (RKHS). To solve this problem, we consider stochastic mirror descent that employs (i) pseudo-gradients and (ii) projections. Compressive projections are executed via kernel orthogonal matching pursuit (KOMP), which overcomes the fact that the vanilla RKHS parameterization grows unbounded with time. Moreover, pseudo-gradients are needed, e.g., when stochastic gradients themselves define integrals over unknown quantities that must be evaluated numerically, as in estimating the intensity parameter of an inhomogeneous Poisson process, and in multi-class kernel logistic regression with latent multi-kernels. We establish tradeoffs between the accuracy of convergence in mean and the projection budget parameter under constant step-size and compression budget, as well as non-asymptotic bounds on the model complexity. Experiments demonstrate that we achieve state-of-the-art accuracy and complexity tradeoffs for inhomogeneous Poisson process intensity estimation and multi-class kernel logistic regression.
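The basic mechanism in the abstract, a functional stochastic gradient step that appends one kernel atom per sample, followed by a compressive projection that keeps the dictionary from growing unboundedly, can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's algorithm: the `SparseKernelModel` class is hypothetical, the magnitude-based pruning is a simplified stand-in for KOMP, and the plain Euclidean update omits the mirror map the paper uses to enforce nonnegativity.

```python
import numpy as np

def gaussian_kernel(x, y, bw=1.0):
    """Gaussian (RBF) kernel between two points."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.exp(-np.sum((x - y) ** 2) / (2.0 * bw ** 2)))

class SparseKernelModel:
    """f(x) = sum_i w_i k(c_i, x), kept sparse by pruning small-weight atoms.

    Hypothetical sketch: the pruning rule below is a crude stand-in for
    KOMP, which removes atoms whose deletion changes f by at most the
    compression budget in RKHS norm.
    """

    def __init__(self, bw=1.0, eps=1e-2):
        self.bw = bw        # kernel bandwidth
        self.eps = eps      # compression budget (here: weight-magnitude cutoff)
        self.centers = []   # dictionary of kernel centers
        self.weights = []   # corresponding coefficients

    def __call__(self, x):
        return sum(w * gaussian_kernel(c, x, self.bw)
                   for c, w in zip(self.centers, self.weights))

    def sgd_step(self, x, grad_scalar, step):
        # Functional stochastic gradient step: each sample adds one atom,
        # which is why the uncompressed parameterization grows with time.
        self.centers.append(np.asarray(x, dtype=float))
        self.weights.append(-step * grad_scalar)
        # "Projection": drop atoms with negligible weight to stay sparse.
        kept = [(c, w) for c, w in zip(self.centers, self.weights)
                if abs(w) > self.eps]
        self.centers = [c for c, _ in kept]
        self.weights = [w for _, w in kept]

# Toy usage: online least-squares fit of a positive target function.
rng = np.random.default_rng(0)
model = SparseKernelModel(bw=0.5, eps=1e-2)
for _ in range(200):
    x = rng.uniform(-1.0, 1.0, size=1)
    y = np.exp(-x[0] ** 2)        # positive target
    g = model(x) - y              # stochastic gradient of 0.5*(f(x)-y)^2 in f(x)
    model.sgd_step(x, g, step=0.3)
```

In this toy run the dictionary holds at most one atom per processed sample, and atoms whose coefficients fall below the budget are discarded, mirroring (in a very rough way) the accuracy-versus-model-complexity tradeoff the paper quantifies.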


