Variance Reduction via Primal-Dual Accelerated Dual Averaging for Nonsmooth Convex Finite-Sums
We study structured nonsmooth convex finite-sum optimization that appears widely in machine learning applications, including support vector machines and least absolute deviation. For the primal-dual formulation of this problem, we propose a novel algorithm called Variance Reduction via Primal-Dual Accelerated Dual Averaging (VRPDA²). In the nonsmooth and general convex setting, VRPDA² has overall complexity O(nd log min{1/ϵ, n} + d/ϵ) in terms of the primal-dual gap, where n denotes the number of samples, d the dimension of the primal variables, and ϵ the desired accuracy. In the nonsmooth and strongly convex setting, the overall complexity of VRPDA² becomes O(nd log min{1/ϵ, n} + d/√ϵ) in terms of both the primal-dual gap and the distance between the iterate and the optimal solution. Both results for VRPDA² improve significantly on the state-of-the-art complexity estimates, which are O(nd log min{1/ϵ, n} + √n d/ϵ) for the nonsmooth and general convex setting and O(nd log min{1/ϵ, n} + √n d/√ϵ) for the nonsmooth and strongly convex setting, and they do so through a much simpler and more direct analysis. Moreover, both complexities are better than the lower bounds for general convex finite sums that lack the particular (common) structure we consider. Our theoretical results are supported by numerical experiments, which confirm the competitive performance of VRPDA² compared to state-of-the-art methods.
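To make the structure concrete, the following is a minimal sketch of the kind of primal-dual formulation referred to above, assuming the standard linearly composed finite-sum form; the symbols φ_i, ψ, a_i, and b_i are illustrative notation rather than quoted from the paper:

\[
\min_{x \in \mathbb{R}^d} \; \frac{1}{n} \sum_{i=1}^{n} \varphi_i\!\left(a_i^{\top} x\right) + \psi(x)
\quad \Longleftrightarrow \quad
\min_{x \in \mathbb{R}^d} \, \max_{y \in \mathbb{R}^n} \; \frac{1}{n} \sum_{i=1}^{n} \left( y_i \, a_i^{\top} x - \varphi_i^{*}(y_i) \right) + \psi(x),
\]

where each φ_i is convex but possibly nonsmooth (for support vector machines, the hinge loss φ_i(z) = max{0, 1 − b_i z}; for least absolute deviation, the absolute loss φ_i(z) = |z − b_i|), ψ is a convex regularizer, and φ_i^* denotes the convex conjugate of φ_i. Under this reading, the primal-dual gap appearing in the complexity bounds is measured on the saddle-point problem on the right.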