Distributed Estimation, Information Loss and Exponential Families

10/09/2014
by Qiang Liu, et al.

Distributed learning of probabilistic models from multiple data repositories with minimum communication is increasingly important. We study a simple communication-efficient learning framework that first calculates the local maximum likelihood estimates (MLE) based on the data subsets, and then combines the local MLEs to achieve the best possible approximation to the global MLE given the whole dataset. We study this framework's statistical properties, showing that the efficiency loss compared to the global setting relates to how much the underlying distribution families deviate from full exponential families, drawing a connection to the theory of information loss by Fisher, Rao and Efron. We show that the "full-exponential-family-ness" represents the lower bound of the error rate of arbitrary combinations of local MLEs, and is achieved by a KL-divergence-based combination method but not by a more common linear combination method. We also study the empirical properties of both methods, showing that the KL method significantly outperforms linear combination in practical settings with issues such as model misspecification, non-convexity, and heterogeneous data partitions.

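To make the two combination rules concrete, here is a minimal Python sketch (an illustration under assumptions, not the authors' code; variable names are hypothetical) for a one-dimensional Gaussian. The linear method simply averages the local parameter estimates, while the KL-based method solves argmin_theta sum_k KL(p(x|theta_k) || p(x|theta)), which for an exponential family amounts to averaging the local models' moment parameters E[x] and E[x^2] and mapping them back to (mean, variance).

```python
# A minimal sketch (not the authors' code; all names here are illustrative)
# comparing the two combination rules for a 1-D Gaussian N(mu, sigma^2),
# which is a full exponential family.
import numpy as np

rng = np.random.default_rng(0)
K, n = 10, 500                       # number of local machines, samples each
data = rng.normal(loc=2.0, scale=3.0, size=(K, n))

# Local MLEs computed independently on each data subset.
local_mu = data.mean(axis=1)
local_var = data.var(axis=1)         # MLE variance (no Bessel correction)

# Linear combination: average the local parameter estimates directly.
theta_linear = (local_mu.mean(), local_var.mean())

# KL-based combination: argmin_theta  sum_k KL( p(.|theta_k) || p(.|theta) ).
# For an exponential family this matches the averaged moment parameters,
# here the averages of E[x] and E[x^2] under the local models.
m1 = local_mu.mean()
m2 = (local_var + local_mu ** 2).mean()
theta_kl = (m1, m2 - m1 ** 2)

print("linear combination:", theta_linear)
print("KL combination    :", theta_kl)
print("global MLE        :", (data.mean(), data.var()))
```

Because the Gaussian is a full exponential family, the KL combination in this sketch reproduces the global MLE on the pooled data exactly (with equal-sized partitions, the averaged moment parameters coincide with the pooled moments), while linearly averaging the local variances generally does not. For curved exponential families the KL combination is no longer exact, and that residual loss is what the abstract relates to the deviation from full exponential families.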
