
Extending Gossip Algorithms to Distributed Estimation of U-Statistics

by Igor Colin, et al.

Efficient and robust algorithms for decentralized estimation in networks are essential to many distributed systems. Whereas distributed estimation of sample mean statistics has received a good deal of attention, computation of U-statistics, which relies on more expensive averaging over pairs of observations, is a less investigated area. Yet such data functionals are essential to describe global properties of a statistical population, with important examples including Area Under the Curve, empirical variance, Gini mean difference and within-cluster point scatter. This paper proposes new synchronous and asynchronous randomized gossip algorithms which simultaneously propagate data across the network and maintain local estimates of the U-statistic of interest. We establish convergence rate bounds of O(1/t) and O(log t / t) for the synchronous and asynchronous cases respectively, where t is the number of iterations, with explicit data and network dependent terms. Beyond favorable comparisons in terms of rate analysis, numerical experiments provide empirical evidence that the proposed algorithms surpass the previously introduced approach.
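To make "averaging over pairs of observations" concrete, here is a minimal centralized sketch in Python. It is not the gossip algorithm proposed in the paper. It only computes the target quantity that such an algorithm would estimate. The function names (`u_statistic`, `variance_kernel`, `gini_kernel`) and the sample `data` are illustrative assumptions, and so is the normalization over unordered pairs of distinct observations; the paper may use a different convention.

```python
from itertools import combinations
from statistics import variance


def u_statistic(sample, kernel):
    """Average a symmetric kernel over all unordered pairs of distinct observations.

    Assumed normalization: divide by the number of pairs, n * (n - 1) / 2.
    """
    pairs = list(combinations(sample, 2))
    if not pairs:
        raise ValueError("need at least two observations")
    return sum(kernel(x, y) for x, y in pairs) / len(pairs)


def variance_kernel(x, y):
    # With this kernel the pairwise average equals the unbiased sample variance.
    return 0.5 * (x - y) ** 2


def gini_kernel(x, y):
    # With this kernel the pairwise average is the Gini mean difference.
    return abs(x - y)


# Hypothetical observations, e.g. one value held by each node of a network.
data = [2.0, 4.0, 4.0, 5.0, 9.0]

empirical_variance = u_statistic(data, variance_kernel)
gini_mean_difference = u_statistic(data, gini_kernel)

# Sanity check against the standard library's unbiased sample variance.
assert abs(empirical_variance - variance(data)) < 1e-12
```

The cost is quadratic in the number of observations, which is why estimating such a quantity in a network, where each node holds only its own observation, is harder than estimating a sample mean.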



