Stochastic Variance-Reduced Newton: Accelerating Finite-Sum Minimization with Large Batches
Stochastic variance reduction has proven effective at accelerating first-order algorithms for convex finite-sum optimization tasks such as empirical risk minimization, and incorporating second-order information has further improved the performance of these first-order methods. However, comparatively little is known about the benefits of using variance reduction to accelerate popular stochastic second-order methods such as Subsampled Newton. To address this, we propose Stochastic Variance-Reduced Newton (SVRN), a finite-sum minimization algorithm that retains the benefits of second-order methods (a simple unit step size, easily parallelizable large-batch operations, and fast local convergence) while also using variance reduction to achieve improved convergence rates per data pass for smooth and strongly convex problems. We show that SVRN can accelerate many stochastic second-order methods (such as Subsampled Newton) as well as iterative least squares solvers (such as Iterative Hessian Sketch), and that it compares favorably to popular first-order methods with variance reduction.
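To make the idea concrete, the following is a minimal sketch of how a variance-reduced Newton-type loop might look on a ridge-regularized least squares problem. It is not taken from the abstract: it assumes an SVRG-style gradient estimate (large-batch gradient at the current point, minus the same batch's gradient at a reference point, plus the full gradient at the reference point) preconditioned by a subsampled Hessian and applied with unit step size. The function name `svrn_least_squares_sketch`, the helper `batch_grad`, and all parameters (`lam`, `hess_batch`, `grad_batch`, `outer_iters`, `inner_iters`) are hypothetical choices made for illustration.

```python
import numpy as np


def batch_grad(A, b, x, lam):
    """Gradient of (1/2m)||Ax - b||^2 + (lam/2)||x||^2 over the rows given."""
    return A.T @ (A @ x - b) / A.shape[0] + lam * x


def svrn_least_squares_sketch(A, b, lam=1e-3, outer_iters=20, inner_iters=5,
                              hess_batch=500, grad_batch=1000, seed=0):
    """Illustrative variance-reduced Newton loop (assumed form, see lead-in)."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    for _ in range(outer_iters):
        # Reference point and its full gradient (one pass over the data).
        x_ref = x.copy()
        full_grad = batch_grad(A, b, x_ref, lam)

        # Subsampled Hessian estimate, reused for all inner steps.
        idx_h = rng.choice(n, size=min(hess_batch, n), replace=False)
        A_h = A[idx_h]
        H = A_h.T @ A_h / len(idx_h) + lam * np.eye(d)

        for _ in range(inner_iters):
            # Large-batch, variance-reduced gradient estimate at x.
            idx_g = rng.choice(n, size=min(grad_batch, n), replace=False)
            A_g, b_g = A[idx_g], b[idx_g]
            g = (batch_grad(A_g, b_g, x, lam)
                 - batch_grad(A_g, b_g, x_ref, lam)
                 + full_grad)
            # Newton-type step with unit step size.
            x = x - np.linalg.solve(H, g)
    return x
```

The batch sizes here are deliberately large relative to the dimension, in the spirit of the large-batch regime the abstract describes; how to choose them in practice is a question for the full paper rather than this sketch.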