A Fast, Principled Working Set Algorithm for Exploiting Piecewise Linear Structure in Convex Problems

07/20/2018
by Tyler B. Johnson, et al.

By reducing optimization to a sequence of smaller subproblems, working set algorithms achieve fast convergence times for many machine learning problems. Despite such performance, working set implementations often resort to heuristics to determine subproblem size, makeup, and stopping criteria. We propose BlitzWS, a working set algorithm with useful theoretical guarantees. Our theory relates subproblem size and stopping criteria to the amount of progress during each iteration. This result motivates strategies for optimizing algorithmic parameters and discarding irrelevant components as BlitzWS progresses toward a solution. BlitzWS applies to many convex problems, including training L1-regularized models and support vector machines. We showcase this versatility with empirical comparisons, which demonstrate BlitzWS is indeed a fast algorithm.
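To make the working set idea concrete, below is a minimal Python sketch of a generic working set loop for the Lasso (L1-regularized least squares). This is not BlitzWS itself: the subproblem growth size (grow), the selection of violated features by KKT score, and the stopping tolerance (tol) are simple heuristics chosen only for illustration, whereas BlitzWS chooses such parameters using its theory.

import numpy as np
from sklearn.linear_model import Lasso

def working_set_lasso(X, y, lam, max_outer=20, grow=10, tol=1e-6):
    n, d = X.shape
    w = np.zeros(d)
    active = np.array([], dtype=int)  # feature indices currently in the working set
    for _ in range(max_outer):
        residual = y - X @ w
        # Lasso optimality (KKT) scores: at a solution, |X_j^T r| <= n * lam for every j
        scores = np.abs(X.T @ residual) - n * lam
        scores[active] = -np.inf  # features already in the working set are not re-added
        candidates = np.argsort(scores)[::-1][:grow]
        violators = candidates[scores[candidates] > tol]
        if violators.size == 0:
            break  # no violated conditions: current w is (approximately) optimal
        active = np.union1d(active, violators)
        # solve the smaller subproblem restricted to the working set
        sub = Lasso(alpha=lam, fit_intercept=False, max_iter=10_000)
        sub.fit(X[:, active], y)
        w = np.zeros(d)
        w[active] = sub.coef_
    return w

# Example usage on synthetic data:
# rng = np.random.default_rng(0)
# X = rng.standard_normal((200, 1000))
# y = X[:, :5] @ np.ones(5) + 0.01 * rng.standard_normal(200)
# w_hat = working_set_lasso(X, y, lam=0.1)

Each outer iteration solves a much smaller problem over the current working set, which is the source of the speedups working set methods offer; BlitzWS additionally relates subproblem size and stopping criteria to the progress guaranteed per iteration.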

Related research

07/05/2016
An Aggregate and Iterative Disaggregate Algorithm with Proven Optimality in Machine Learning
We propose a clustering-based iterative algorithm to solve certain optim...

09/24/2021
Accelerated nonlinear primal-dual hybrid gradient algorithms with applications to machine learning
The primal-dual hybrid gradient (PDHG) algorithm is a first-order method...

06/24/2020
Provably Convergent Working Set Algorithm for Non-Convex Regularized Regression
Owing to their statistical properties, non-convex sparse regularizers ha...

02/21/2018
Dual Extrapolation for Faster Lasso Solvers
Convex sparsity-inducing regularizations are ubiquitous in high-dimensio...

03/27/2013
Efficiently Using Second Order Information in Large l1 Regularization Problems
We propose a novel general algorithm LHAC that efficiently uses second-o...

04/01/2020
Stopping Criteria for, and Strong Convergence of, Stochastic Gradient Descent on Bottou-Curtis-Nocedal Functions
While Stochastic Gradient Descent (SGD) is a rather efficient algorithm ...

04/16/2022
Beyond L1: Faster and Better Sparse Models with skglm
We propose a new fast algorithm to estimate any sparse generalized linea...
