Provably Accelerated Randomized Gossip Algorithms

In this work we present novel provably accelerated gossip algorithms for solving the average consensus problem. The proposed protocols are inspired by the recently developed accelerated variants of the randomized Kaczmarz method, a popular method for solving linear systems. In each gossip iteration all nodes of the network update their values but only a pair of them exchange their private information. Numerical experiments on popular wireless sensor networks showing the benefits of our protocols are also presented.


1 Introduction

Distributed averaging is a fundamental problem in the area of distributed computing and multi-agent systems [1, 2]. Randomized gossip algorithms are one of the most popular classes of methods for solving it. The seminal 2006 paper of Boyd et al. [3] on randomized gossip algorithms motivated a flurry of subsequent research, and gossip algorithms now appear in many applications, including distributed data fusion in sensor networks [4], load balancing [5] and clock synchronization [6]. The development and design of efficient gossip algorithms has been studied extensively in the last decade. For selected relevant work prior to 2010, we refer the reader to the survey [7]. For more recent results on randomized gossip algorithms we suggest [8, 9, 10, 11, 12, 13]. See also [14, 15, 16, 17].

In the literature of gossip algorithms, an important task is the design of fast and efficient algorithms. Surprisingly, to the best of our knowledge, there are no variants of gossip algorithms that converge to consensus with an accelerated linear rate. In this work, our focus is precisely this. We design two provably accelerated randomized gossip protocols which converge to consensus fast.

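The average consensus problem. In the average consensus (AC) problem we are given an undirected connected network $\mathcal{G} = (\mathcal{V}, \mathcal{E})$ with node set $\mathcal{V} = \{1, 2, \dots, n\}$ and edges $\mathcal{E}$, where $|\mathcal{E}| = m$. Each node $i \in \mathcal{V}$ holds a local value $c_i \in \mathbb{R}$. The goal of AC is for every node to compute the average of these private values, $\bar{c} := \frac{1}{n}\sum_{i=1}^n c_i$, in a distributed fashion. That is, the exchange of information can only occur between connected nodes (neighbors).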

Main contributions. In this work, building upon a recent framework for the design and analysis of randomized gossip algorithms [11, 18], we present two novel and provably accelerated randomized gossip protocols where in each step all nodes of the network update their values using their own information but only a pair of them exchange messages. The accelerated convergence rates of the proposed protocols are obtained by establishing a connection with the area of accelerated randomized Kaczmarz methods for solving consistent linear systems.

To the best of our knowledge, our protocols are the first randomized gossip algorithms that converge to consensus with an accelerated linear rate. The theoretical results are validated via computational testing on typical network topologies.

Structure of the paper. Section 2 introduces important technical preliminaries and the necessary background for understanding of our methods. Two accelerated variants of the randomized Kaczmarz (RK) method for solving linear systems and their theoretical convergence results are described. In Section 3 we present the two provably accelerated gossip protocols, along with some remarks on their implementation. Numerical evaluation of the new gossip protocols is presented in Section 4. Finally, concluding remarks are given in Section 5.

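Notation. The following notational conventions are used in this paper. We write $[n] := \{1, 2, \dots, n\}$. Boldface upper-case letters denote matrices; $\mathbf{I}$ is the identity matrix. By $\mathcal{L}$ we denote the solution set of the linear system $\mathbf{A}x = b$, where $\mathbf{A} \in \mathbb{R}^{m \times n}$ and $b \in \mathbb{R}^m$. By $\mathbf{A}_{i:}$ and $\mathbf{A}_{:j}$ we indicate the $i$-th row and the $j$-th column of matrix $\mathbf{A}$, respectively. Throughout the paper, $x^*$ is the projection of $x^0$ onto $\mathcal{L}$ (that is, $x^*$ is the solution of the best approximation problem; see equation (2)). With $\lambda_{\min}^+(\mathbf{M})$ we indicate the smallest nonzero eigenvalue of a matrix $\mathbf{M}$. $\|\cdot\| = \|\cdot\|_2$ and $\|\cdot\|_F$ are used to denote the Euclidean norm and the Frobenius norm, respectively. Finally, $x^k = (x_1^k, \dots, x_n^k)^\top \in \mathbb{R}^n$ represents the vector with the local values of the $n$ nodes of the network at the $k$-th iteration. Here, $x_i^k$ denotes the value of node $i$ at the $k$-th iteration.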

2 Technical Preliminaries

In this section we present the connections between the randomized Kaczmarz methods for solving linear systems and the gossip algorithms for solving the AC problem, as discussed in more detail in [11, 18]. In particular, we focus on the presentation of the two recently proposed accelerated variants of Kaczmarz methods and on their theoretical convergence analysis.

2.1 Kaczmarz-type methods and gossip algorithms

Kaczmarz-type methods are popular algorithms for solving linear systems with many equations. The randomized Kaczmarz method (RK) for solving consistent linear systems was first proposed and proved to converge with linear rate in [19]. This work triggered much research into developing and analyzing randomized linear solvers and several improved variants of RK have been proposed [20, 21, 22, 23, 24, 25, 26, 27, 28].

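In its simplest version, RK works as follows. In each step, one row $\mathbf{A}_{i:}$ of the matrix $\mathbf{A}$ is sampled with probability $p_i = \|\mathbf{A}_{i:}\|_2^2 / \|\mathbf{A}\|_F^2$ and is then used to obtain the next iterate via the update rule

$$x^{k+1} = x^k - \frac{\mathbf{A}_{i:} x^k - b_i}{\|\mathbf{A}_{i:}\|_2^2}\,\mathbf{A}_{i:}^\top. \tag{1}$$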

For consistent linear systems, it was shown that RK and its variants solve the following problem, known as the best approximation problem [29, 30, 31]:

$$\min_{x = (x_1, \dots, x_n) \in \mathbb{R}^n} \frac{1}{2}\|x - x^0\|^2 \quad \text{subject to} \quad \mathbf{A}x = b, \tag{2}$$

where $x^0$ is the initial vector of the method.
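The RK update (1) is simple enough to sketch in a few lines. The following is a minimal illustration, not the authors' implementation: the function name `randomized_kaczmarz`, the small test system, and the choice of sampling rows with probability proportional to their squared norm are assumptions made for this example.

```python
import random

def randomized_kaczmarz(A, b, x0, iters, seed=0):
    """Apply update (1): project the iterate onto one randomly chosen equation."""
    rng = random.Random(seed)
    row_norms_sq = [sum(a * a for a in row) for row in A]
    x = list(x0)
    for _ in range(iters):
        # Assumed sampling rule: probability proportional to the squared row norm.
        i = rng.choices(range(len(A)), weights=row_norms_sq)[0]
        residual = sum(a * xj for a, xj in zip(A[i], x)) - b[i]
        step = residual / row_norms_sq[i]
        x = [xj - step * a for xj, a in zip(x, A[i])]
    return x

# Hypothetical consistent system whose unique solution is (1, 2).
A = [[1.0, 1.0], [1.0, -1.0], [2.0, 1.0]]
b = [3.0, -1.0, 4.0]
x = randomized_kaczmarz(A, b, x0=[0.0, 0.0], iters=200)
```

Because this system has a unique solution, the projection of any starting point onto the solution set is that solution, so the iterates approach (1, 2) regardless of `x0`.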

In [11] it was shown how RK works as a gossip algorithm when applied to a special linear system encoding the underlying network. The following definition is used to describe the class of linear systems considered here.

Definition 2.1 ([11])

A linear system $\mathbf{A}x = b$ is called an “average consensus (AC) system” when all solutions $x$ satisfy $x_i = x_j$ for all $(i, j) \in \mathcal{E}$.

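Many linear systems satisfy the above definition. In this work we focus on the case where $b = 0$ and $\mathbf{A} \in \mathbb{R}^{m \times n}$ is the incidence matrix of $\mathcal{G}$ (or its normalized form, where $\|\mathbf{A}_{i:}\|_2 = 1$ for all $i \in [m]$). In this case, the row of the system corresponding to edge $(i, j)$ directly encodes the constraint $x_i = x_j$.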

Since the right-hand side of the above system is zero, the update rule of equation (1) simplifies to

$$x^{k+1} = x^k - \frac{\mathbf{A}_{i:} x^k}{\|\mathbf{A}_{i:}\|_2^2}\,\mathbf{A}_{i:}^\top.$$

When the starting point is the vector of private values, $x^0 = c$, it can be shown that RK solves the average consensus problem and that the above update rule is equivalent to the pairwise randomized gossip algorithm of [3] (see [11] for more details). The convergence performance of RK for solving the best approximation problem (and, as a result, the average consensus problem) is described by the following theorem.

Theorem 2.2 ([29, 30])

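Let $\{x^k\}_{k \ge 0}$ be the iterates produced by (1). Then $\mathbb{E}[\|x^k - x^*\|^2] \le \rho^k \|x^0 - x^*\|^2$, where $\rho := 1 - \lambda_{\min}^+(\mathbf{A}^\top\mathbf{A}) / \|\mathbf{A}\|_F^2$.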
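To see why RK on an incidence system is pairwise gossip, note that for a row proportional to $e_i - e_j$ and a zero right-hand side, update (1) changes only coordinates $i$ and $j$ and moves both to their mean. The sketch below illustrates this on a ring graph; the helper names `cycle_edges` and `pairwise_gossip`, uniform edge sampling, and the initial values are assumptions made for the example.

```python
import random

def cycle_edges(n):
    """Edges of the ring graph on nodes 0..n-1."""
    return [(i, (i + 1) % n) for i in range(n)]

def pairwise_gossip(values, edges, iters, seed=0):
    """RK on the incidence system with b = 0, written as neighbor averaging."""
    rng = random.Random(seed)
    x = list(values)
    for _ in range(iters):
        i, j = rng.choice(edges)             # assumed: edges sampled uniformly
        x[i] = x[j] = 0.5 * (x[i] + x[j])    # both endpoints move to their mean
    return x

c = [float(v) for v in range(10)]            # hypothetical private values, average 4.5
x = pairwise_gossip(c, cycle_edges(10), iters=5000)
```

Each step preserves the sum of the values, which is why the consensus point is the average of the starting vector.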

2.2 Accelerated Kaczmarz methods

There are two different but closely related ways to accelerate the randomized Kaczmarz method. The first paper that proves asymptotic convergence with an accelerated linear rate is [27]. The proof technique is similar to the framework developed by Nesterov in [32] for the acceleration of coordinate descent methods. In [33, 34] a modified rule for the selection of the parameters was proposed and a non-asymptotic accelerated linear rate was established. Algorithm 1 presents pseudocode of the accelerated Kaczmarz method (AccRK); both variants can be cast as special cases of it by choosing the parameters appropriately.

There are two options for selecting the parameters, which we describe next.

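1. From [27]: Choose $\lambda \in [0, \lambda_{\min}^+(\mathbf{A}^\top\mathbf{A})]$ and set $\gamma_{-1} = 0$. Generate the sequence $\{\gamma_k\}$ by choosing $\gamma_k$ to be the largest root of

 $$\gamma_k^2 - \frac{\gamma_k}{m} = \left(1 - \frac{\gamma_k \lambda}{m}\right)\gamma_{k-1}^2,$$

and generate the sequences $\{\alpha_k\}$ and $\{\beta_k\}$ by setting

 $$\alpha_k = \frac{m - \gamma_k \lambda}{\gamma_k (m^2 - \lambda)}, \qquad \beta_k = 1 - \frac{\gamma_k \lambda}{m}.$$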
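2. From [34]: Let

 $$\nu = \max_{u \in \mathrm{Range}(\mathbf{A}^\top)} \frac{u^\top \left[\sum_{i=1}^m \mathbf{A}_{i:}^\top \mathbf{A}_{i:} (\mathbf{A}^\top\mathbf{A})^\dagger \mathbf{A}_{i:}^\top \mathbf{A}_{i:}\right] u}{u^\top \frac{\mathbf{A}^\top\mathbf{A}}{m} u}. \tag{3}$$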

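Choose the three sequences to be fixed constants as follows: $\beta_k = \beta = 1 - \sqrt{\lambda_{\min}^+(\mathbf{W})/\nu}$, $\gamma_k = \gamma = \sqrt{1/(\lambda_{\min}^+(\mathbf{W})\,\nu)}$, $\alpha_k = \alpha = 1/(1 + \gamma\nu)$, where $\mathbf{W} = \mathbf{A}^\top\mathbf{A}/m$.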
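The Option 1 recursion can be made concrete by rewriting the defining equation of $\gamma_k$ as a quadratic, $\gamma^2 + \frac{\lambda\gamma_{k-1}^2 - 1}{m}\gamma - \gamma_{k-1}^2 = 0$, and taking its larger root. The sketch below does exactly that; the function name `option1_parameters`, the starting value $\gamma_{-1} = 0$, and the sample values of `m` and `lam` are assumptions for illustration only.

```python
import math

def option1_parameters(m, lam, num_iters):
    """Yield (alpha_k, beta_k, gamma_k) for Option 1, assuming gamma_{-1} = 0."""
    gamma_prev = 0.0
    for _ in range(num_iters):
        # Largest root of g^2 + lin * g - gamma_prev^2 = 0.
        lin = (lam * gamma_prev ** 2 - 1.0) / m
        gamma = 0.5 * (-lin + math.sqrt(lin ** 2 + 4.0 * gamma_prev ** 2))
        alpha = (m - gamma * lam) / (gamma * (m ** 2 - lam))
        beta = 1.0 - gamma * lam / m
        yield alpha, beta, gamma
        gamma_prev = gamma

# Hypothetical values: m = 10 equations and lambda = 0.5.
params = list(option1_parameters(m=10, lam=0.5, num_iters=3))
```

Under these assumptions the first step gives $\gamma_0 = 1/m$ and $\alpha_0 = 1$, so the first extrapolated point coincides with the starting point.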

2.3 Theoretical guarantees of AccRK

The two variants (Option 1 and Option 2) of AccRK are closely related, however their convergence analyses are different. Below we present the theoretical guarantees of the two options as presented in [27] and [34].

Theorem 2.3 ([27])

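Let $\{x^k\}_{k \ge 0}$ be the sequence of random iterates produced by Algorithm 1 with Option 1 for the parameters. Let $\lambda \in [0, \lambda_{\min}^+(\mathbf{A}^\top\mathbf{A})]$ and define $\sigma_1 = 1 + \frac{\sqrt{\lambda}}{2m}$ and $\sigma_2 = 1 - \frac{\sqrt{\lambda}}{2m}$. Then for any $k \ge 0$ we have that

 $$\mathbb{E}[\|x^{k+1} - x^*\|^2] \le \frac{4\lambda}{\left(\sigma_1^{k+1} - \sigma_2^{k+1}\right)^2}\,\|x^0 - x^*\|^2_{(\mathbf{A}^\top\mathbf{A})^{+}}.$$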
Corollary 1 ([27])

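Note that as $k \to \infty$, we have that $\sigma_2^k \to 0$. This means that the decrease of the right-hand side is governed mainly by the behavior of the term $\sigma_1$ in the denominator and, as a result, the method converges asymptotically with a decrease factor per iteration of

$$\sigma_1^{-2} = \left(1 + \frac{\sqrt{\lambda}}{2m}\right)^{-2}.$$

Thus, by choosing $\lambda = \lambda_{\min}^+(\mathbf{A}^\top\mathbf{A})$ and for the case that $\lambda_{\min}^+(\mathbf{A}^\top\mathbf{A})$ is small, Algorithm 1 will have a significantly faster convergence rate than RK. Note that the above convergence results hold for normalized matrices $\mathbf{A}$, that is, matrices that have $\|\mathbf{A}_{i:}\|_2 = 1$ for any $i \in [m]$.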
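The comparison with RK can be made explicit by expanding the decrease factor for small $\sqrt{\lambda}/m$. The following worked step assumes, as in [27], that $\sigma_1 = 1 + \sqrt{\lambda}/(2m)$, and uses the RK factor from Theorem 2.2 for a normalized matrix with $\lambda = \lambda_{\min}^+(\mathbf{A}^\top\mathbf{A})$:

```latex
\sigma_1^{-2}
  = \Bigl(1 + \tfrac{\sqrt{\lambda}}{2m}\Bigr)^{-2}
  = 1 - \frac{\sqrt{\lambda}}{m} + O\!\Bigl(\frac{\lambda}{m^{2}}\Bigr),
\qquad \text{compared with} \qquad
\rho_{\mathrm{RK}} = 1 - \frac{\lambda}{m}.
```

For small $\lambda$ we have $\sqrt{\lambda} \gg \lambda$, so the per-iteration contraction of the accelerated method is much stronger; this gap between $\sqrt{\lambda}$ and $\lambda$ is the source of the speedup.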

Theorem 2.4 ([34])

Let and assume that . Let be the iterates of Algorithm 1 with the Option 2 for the parameters. Then

 $$\Psi^k \le \left(1 - \sqrt{\lambda_{\min}^+(\mathbf{W})/\nu}\right)^k \Psi^0$$

where

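The above result implies that Algorithm 1 converges linearly with rate $1 - \sqrt{\lambda_{\min}^+(\mathbf{W})/\nu}$, which translates to a total of $O\left(\sqrt{\nu/\lambda_{\min}^+(\mathbf{W})}\,\log(1/\epsilon)\right)$ iterations to bring the quantity $\Psi^k$ below $\epsilon > 0$. It can be shown that $1 \le \nu \le 1/\lambda_{\min}^+(\mathbf{W})$ (Lemma 2 in [34]), where $\nu$ is as defined in (3). Thus, $\sqrt{\lambda_{\min}^+(\mathbf{W})/\nu} \ge \lambda_{\min}^+(\mathbf{W})$, which means that the rate of AccRK (Option 2) is never worse than that of RK, which (see Theorem 2.2) is equal to $1 - \lambda_{\min}^+(\mathbf{W})$ for normalized matrices ($\|\mathbf{A}\|_F^2 = m$).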

3 Accelerated randomized gossip algorithms

In the previous section we presented the complexity guarantees of AccRK for solving consistent linear systems with normalized matrices. Now, let us explain how the two options of AccRK behave as gossip algorithms when they are used to solve the linear system $\mathbf{A}x = 0$, where $\mathbf{A} \in \mathbb{R}^{m \times n}$ is the normalized incidence matrix of the network. That is, each row $e = (i, j)$ of $\mathbf{A}$ can be represented as $(\mathbf{A}_{e:})^\top = \frac{1}{\sqrt{2}}(e_i - e_j)$, where $e_i$ (resp. $e_j$) is the $i$-th (resp. $j$-th) unit coordinate vector in $\mathbb{R}^n$.

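By using this particular linear system, the expression $\frac{\mathbf{A}_{i_k:} y^k - b_{i_k}}{\|\mathbf{A}_{i_k:}\|^2}\,\mathbf{A}_{i_k:}^\top$ that appears in steps 8 and 9 of AccRK takes the following form when the row $e = (i, j)$ is sampled:

$$\frac{\mathbf{A}_{e:} y^k}{\|\mathbf{A}_{e:}\|^2}\,\mathbf{A}_{e:}^\top = \frac{y_i^k - y_j^k}{2}\,(e_i - e_j).$$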

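Let $\mathbf{L}$ be the Laplacian matrix of the network. For solving the above AC system (see Definition 2.1), the simple RK requires $O\left(\frac{2m}{\lambda_{\min}^+(\mathbf{L})}\log(1/\epsilon)\right)$ iterations to achieve expected accuracy $\epsilon > 0$. To understand the acceleration in the gossip framework, this should be compared to the $O\left(m\sqrt{\frac{2}{\lambda_{\min}^+(\mathbf{L})}}\log(1/\epsilon)\right)$ of AccRK (Option 1) and the $O\left(\sqrt{\frac{2m\nu}{\lambda_{\min}^+(\mathbf{L})}}\log(1/\epsilon)\right)$ of AccRK (Option 2).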
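A quick numerical illustration of the gap between RK and AccRK (Option 1) is possible on the cycle graph, whose Laplacian eigenvalues are known in closed form, $2 - 2\cos(2\pi k/n)$. The sketch below assumes the leading terms $2m/\lambda_{\min}^+(\mathbf{L})$ and $m\sqrt{2/\lambda_{\min}^+(\mathbf{L})}$, which follow from Theorem 2.2 and Corollary 1 when $\mathbf{A}^\top\mathbf{A} = \mathbf{L}/2$; constants and the $\log(1/\epsilon)$ factor are dropped, and the function name is hypothetical.

```python
import math

def cycle_complexities(n):
    """Leading-order iteration counts for RK and AccRK (Option 1) on a ring of n nodes."""
    m = n                                                  # a cycle has as many edges as nodes
    lam_L = 2.0 - 2.0 * math.cos(2.0 * math.pi / n)        # smallest nonzero Laplacian eigenvalue
    rk = 2.0 * m / lam_L                                   # assumed leading term for RK
    acc = m * math.sqrt(2.0 / lam_L)                       # assumed leading term for AccRK, Option 1
    return rk, acc

rk_iters, acc_iters = cycle_complexities(100)
speedup = rk_iters / acc_iters                             # equals sqrt(2 / lam_L)
```

The ratio of the two counts is $\sqrt{2/\lambda_{\min}^+(\mathbf{L})}$, so the benefit of acceleration grows as the network becomes harder to mix, for instance as the ring gets longer.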

Algorithm 2 describes in a single framework how the two variants of AccRK of Section 2.2 behave as gossip algorithms when they are used to solve the above linear system. Note that each node $\ell$ of the network has two local registers to save the quantities $v_\ell^k$ and $x_\ell^k$. In each step, using these two values, every node of the network (activated or not) computes the quantity $y_\ell^k = \alpha_k v_\ell^k + (1 - \alpha_k) x_\ell^k$. Then, in the $k$-th iteration, the activated nodes $i$ and $j$ of the randomly selected edge $e = (i, j)$ exchange their values $y_i^k$ and $y_j^k$ and update the values of $x_i^{k+1}$, $x_j^{k+1}$ and $v_i^{k+1}$, $v_j^{k+1}$, as shown in Algorithm 2. The rest of the nodes use only their own $y_\ell^k$ to update the values of $v_\ell^{k+1}$ and $x_\ell^{k+1}$, without communicating with any other node.

The parameter $\lambda_{\min}^+(\mathbf{L})$ can be estimated by all nodes in a decentralized manner using the method described in [35]. In order to implement this algorithm, we assume that all nodes have synchronized clocks and that they know the rate at which gossip updates are performed, so that inactive nodes also update their local values. This may not be feasible in all applications, but when it is possible (e.g., if nodes are equipped with inexpensive GPS receivers, or have reliable clocks) the nodes can benefit from the significant speedup achieved.
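The protocol described above can be sketched as follows. This is a minimal illustration under stated assumptions rather than a reproduction of Algorithm 2: it assumes the standard accelerated Kaczmarz recursion with two registers per node (called `x` and `v` here), Option 1 parameters started from $\gamma_{-1} = 0$, uniform edge sampling, and a cycle graph for which $\lambda_{\min}^+(\mathbf{A}^\top\mathbf{A}) = 1 - \cos(2\pi/n)$. All names are hypothetical.

```python
import math
import random

def accelerated_gossip(values, edges, lam, iters, seed=0):
    """Sketch of an AccRK-style gossip protocol (Option 1 parameters assumed)."""
    rng = random.Random(seed)
    m = len(edges)
    x = list(values)          # first register of every node
    v = list(values)          # second register, initialised to the same values
    gamma_prev = 0.0          # assumed starting value gamma_{-1} = 0
    for _ in range(iters):
        # Option 1: gamma_k is the largest root of its quadratic equation.
        lin = (lam * gamma_prev ** 2 - 1.0) / m
        gamma = 0.5 * (-lin + math.sqrt(lin ** 2 + 4.0 * gamma_prev ** 2))
        alpha = (m - gamma * lam) / (gamma * (m ** 2 - lam))
        beta = 1.0 - gamma * lam / m
        gamma_prev = gamma
        # Every node, active or not, mixes its two registers locally.
        y = [alpha * vi + (1.0 - alpha) * xi for vi, xi in zip(v, x)]
        v = [beta * vi + (1.0 - beta) * yi for vi, yi in zip(v, y)]
        x = list(y)
        # Only the endpoints of the sampled edge exchange y_i and y_j.
        i, j = rng.choice(edges)
        d = 0.5 * (y[i] - y[j])
        x[i] -= d
        x[j] += d
        v[i] -= gamma * d
        v[j] += gamma * d
    return x

n = 10
edges = [(i, (i + 1) % n) for i in range(n)]      # cycle graph
lam = 1.0 - math.cos(2.0 * math.pi / n)           # lambda_min^+(L) / 2 on a cycle
c = [float(i) for i in range(n)]                  # hypothetical private values, average 4.5
x = accelerated_gossip(c, edges, lam, iters=5000)
```

Note that the local mixing step needs no communication, but it does require every node to know the current iteration index, which is where the synchronized-clock assumption enters.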

Related work on accelerated gossip algorithms: The idea of performing gossip updates in a network with two registers at each node is not new. It was first proposed in [36], and its analysis under strong conditions was presented in [9]. There, local memory is exploited by installing shift registers at each agent, where the first register stores the agent’s current value and the second stores the agent’s value before the latest update. In [18], the Stochastic Heavy Ball (SHB) method is used for solving the AC problem, and an accelerated method is proposed which was shown to be faster in practice than the algorithm of [36, 9]. [18] is the first paper that presents gossip algorithms where in each step all nodes of the network update their values but only a subset of them exchange their private values.

4 Numerical Evaluation

We devote this section to numerically evaluating the performance of the proposed accelerated gossip protocols. In all of our experiments we compare the simple RK (equivalent to the pairwise gossip algorithm of [3]), the Stochastic Heavy Ball method (SHB) proposed in [18], and AccRK (Algorithm 2) with the two options for the selection of the parameters presented in Section 2.2. In comparing the methods we use the relative error measure, where the starting vector of values is always taken to be a Gaussian vector. For all of our experiments the horizontal axis represents the number of iterations. The networks used in the experiments are the cycle (ring graph), the 2-dimensional grid and the random geometric graph (RGG) with radius . Code was written in Julia 0.6.3.

For the implementation of SHB we use the same parameters as in [18]. For the AccRK (Option 1) we use . Note that for all networks under study the two proposed protocols are faster than both the pairwise gossip algorithm of [3] and the SHB of [18].

5 Conclusion and Future Research

We proposed novel provably accelerated randomized gossip algorithms for solving the AC problem. Our approach is based on connections established between gossip algorithms and Kaczmarz methods for solving linear systems. We believe that many novel and efficient gossip protocols can be discovered using results from the literature on Kaczmarz methods, either by using different AC linear systems or by using Kaczmarz-type algorithms other than the ones presented in this manuscript. We speculate that the gossip algorithms presented in this work can be extended to the more general setting of minimizing the average of convex functions in a decentralized way [12]. While preparing this work we became aware of [37], where an accelerated gossip algorithm is developed for solving the dual of the best approximation problem (2) using the accelerated coordinate descent method of [38]. A comparison of our protocols with the algorithm of [37] is the subject of ongoing work.