Graph connectivity in log-diameter steps using label propagation
The fastest deterministic algorithms for connected components take logarithmic time and perform superlinear work on a PRAM. These algorithms require pointer-chasing operations and are limited to shared-memory systems. Another popular method is "leader contraction", where non-leader vertices are contracted to adjacent leaders. The challenge there is to select a constant fraction of leaders that are adjacent to a constant fraction of non-leaders with high probability. Instead, we investigate whether simple label propagation can be as efficient as the fastest known algorithms for graph connectivity. Label propagation exchanges representative labels within a component. This is attractive for other models because it is deterministic and does not rely on pointer-chasing, but it is inherently difficult to complete in a sublinear number of steps. We overcome this difficulty for graph connectivity. We introduce a simple framework for deterministic graph connectivity in log-diameter steps using label propagation that is easily translated to other computational models. We present new algorithms in the PRAM, Stream, and MapReduce models. Given a simple, undirected graph G=(V,E) with n=|V| vertices, m=|E| edges, and diameter D, all our algorithms complete in O(log D) steps without pointer operations. We give the first label propagation algorithms that are competitive with the fastest PRAM algorithms, achieving O(log D) time and O((m+n) log D) work with O(m+n) processors. Our main contribution is in the Stream and MapReduce models. We give an efficient Stream-Sort algorithm that takes O(log D) passes and O(log n) memory, and a MapReduce algorithm that takes O(log D) rounds and O((m+n) log D) communication overall. These are the first O(log D)-step graph connectivity algorithms in the Stream and MapReduce models that are also deterministic and simple to implement.
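To make the starting point concrete, the following is a minimal sketch of plain minimum-label propagation, the baseline whose step count grows with the diameter. It is not the O(log D) framework described in the abstract, whose details are not given here. The function name `propagate_min_labels` and the choice of the smallest vertex identifier as the representative label are assumptions made for illustration only.

```python
from typing import Dict, Iterable, List, Tuple


def propagate_min_labels(
    vertices: Iterable[int], edges: List[Tuple[int, int]]
) -> Tuple[Dict[int, int], int]:
    """Naive synchronous label propagation (illustrative baseline only).

    Every vertex starts with its own identifier as its label. In each
    round, both endpoints of every edge adopt the smaller of the two
    labels they held at the start of the round. The loop stops when a
    round changes nothing.

    Returns the final labels and the number of rounds that changed
    at least one label.
    """
    labels = {v: v for v in vertices}
    rounds = 0
    while True:
        # Read from the previous round's labels, write to a fresh copy,
        # so that a label moves at most one edge per round.
        new_labels = dict(labels)
        for u, v in edges:
            smallest = min(labels[u], labels[v])
            if smallest < new_labels[u]:
                new_labels[u] = smallest
            if smallest < new_labels[v]:
                new_labels[v] = smallest
        if new_labels == labels:
            return labels, rounds
        labels = new_labels
        rounds += 1


if __name__ == "__main__":
    # A path 0-1-2-3-4 (diameter 4), an edge 5-6, and an isolated vertex 7.
    path_edges = [(0, 1), (1, 2), (2, 3), (3, 4), (5, 6)]
    final_labels, used_rounds = propagate_min_labels(range(8), path_edges)
    print(final_labels)
    print(used_rounds)
```

On the example path, label 0 travels one edge per round, so the sketch needs 4 rounds to label a component of diameter 4. This linear dependence on D is the behavior that the abstract's O(log D) bound improves on.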