Less than a Single Pass: Stochastically Controlled Stochastic Gradient Method
We develop and analyze a procedure for gradient-based optimization that we refer to as stochastically controlled stochastic gradient (SCSG). As a member of the SVRG family of algorithms, SCSG makes use of gradient estimates at two scales, with the number of updates at the faster scale governed by a geometric random variable. Unlike most existing algorithms in this family, neither the computational cost nor the communication cost of SCSG necessarily scales linearly with the sample size n; indeed, these costs are independent of n when the target accuracy is low. An experimental evaluation on real datasets confirms the effectiveness of SCSG.
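The abstract does not spell out the update rule, but the structure it describes (a two-scale, SVRG-style loop whose inner length is a geometric random variable with mean tied to the mini-batch size) can be sketched as follows. This is a minimal illustration, assuming a finite-sum objective f(x) = (1/n) Σᵢ fᵢ(x); the names scsg and grad_batch, the default parameter values, and the choice to draw inner indices from the outer mini-batch are illustrative assumptions, not the paper's exact specification.

```python
import numpy as np

def scsg(grad_batch, x0, n, batch_size=64, step_size=0.01,
         n_outer=100, seed=None):
    """Sketch of a stochastically controlled stochastic gradient loop.

    grad_batch(x, idx) is assumed to return the average gradient of
    f_i over the indices `idx`, evaluated at the point x.
    """
    rng = np.random.default_rng(seed)
    x_tilde = np.asarray(x0, dtype=float).copy()
    for _ in range(n_outer):
        # Slow scale: a batch gradient over a random subset of size B << n.
        batch = rng.choice(n, size=batch_size, replace=False)
        g = grad_batch(x_tilde, batch)
        # The number of fast-scale updates is a geometric random
        # variable with mean roughly equal to the batch size B.
        n_inner = rng.geometric(1.0 / batch_size)
        x = x_tilde.copy()
        for _ in range(n_inner):
            # Fast scale: variance-reduced update using a single index
            # (here drawn from the outer batch, an assumption).
            idx = rng.choice(batch, size=1)
            x = x - step_size * (grad_batch(x, idx)
                                 - grad_batch(x_tilde, idx) + g)
        x_tilde = x
    return x_tilde

# Hypothetical usage: least squares, f_i(x) = 0.5 * (a_i @ x - b_i)**2.
rng = np.random.default_rng(0)
A = rng.normal(size=(10_000, 20))
b = A @ rng.normal(size=20) + 0.1 * rng.normal(size=10_000)

def grad_ls(x, idx):
    r = A[idx] @ x - b[idx]
    return A[idx].T @ r / len(idx)

x_hat = scsg(grad_ls, np.zeros(20), n=len(b))
```

Under these assumptions, the expected cost of one outer step is O(B) gradient evaluations regardless of n, which is consistent with the abstract's claim that computation and communication costs need not scale with the sample size.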