An Optimal Algorithm for Decentralized Finite Sum Optimization
Modern large-scale finite-sum optimization relies on two key aspects: distribution and stochastic updates. For smooth and strongly convex problems, existing decentralized algorithms are slower than modern accelerated variance-reduced stochastic algorithms when run on a single machine, and are therefore not efficient. Centralized algorithms are fast, but their scaling is limited by global aggregation steps that result in communication bottlenecks. In this work, we propose an efficient Accelerated Decentralized stochastic algorithm for Finite Sums named ADFS, which uses local stochastic proximal updates and decentralized communications between nodes. On n machines, ADFS minimizes the objective function with nm samples in the same time it takes optimal algorithms to optimize from m samples on one machine. This scaling holds until a critical network size is reached, which depends on communication delays, on the number of samples m, and on the network topology. We give a lower bound of complexity to show that ADFS is optimal among decentralized algorithms. To derive ADFS, we first develop an extension of the accelerated proximal coordinate gradient algorithm to arbitrary sampling. Then, we apply this coordinate descent algorithm to a well-chosen dual problem based on an augmented graph approach, leading to the general ADFS algorithm. We illustrate the improvement of ADFS over state-of-the-art decentralized approaches with experiments.
READ FULL TEXT