Extreme Stochastic Variational Inference: Distributed and Asynchronous

05/31/2016
by Jiong Zhang, et al.

We propose extreme stochastic variational inference (ESVI), an asynchronous and lock-free algorithm for performing variational inference on massive real-world datasets. Stochastic variational inference (SVI), the state-of-the-art algorithm for scaling variational inference to large datasets, is inherently serial. Moreover, it requires the parameters to fit in the memory of a single processor; this is problematic when the number of parameters is in the billions. ESVI overcomes these limitations by requiring that each processor access only a subset of the data and a subset of the parameters, thus providing data and model parallelism simultaneously. We demonstrate the effectiveness of ESVI by running Latent Dirichlet Allocation (LDA) on UMBC-3B, a dataset with a vocabulary of 3 million words and 3 billion tokens. To the best of our knowledge, this is an order of magnitude larger than the largest dataset on which results using variational inference have been reported in the literature. In our experiments, we found that ESVI outperforms VI and SVI, and also achieves a better quality solution. In addition, we propose a strategy to speed up computation and save memory when fitting a large number of topics.
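
The idea that each processor touches only a subset of the data and a subset of the parameters can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a simplified, synchronous ring schedule in place of ESVI's asynchronous updates, and replaces the variational update with plain word counting. The names `block_of`, `ring_schedule`, and `sweep`, as well as the modulo partitioning rule, are hypothetical and chosen only for illustration.

```python
from collections import Counter, deque


def block_of(word_id, num_blocks):
    # Hypothetical partitioning rule: assign each word to a parameter block by modulo.
    return word_id % num_blocks


def ring_schedule(num_workers):
    """For each round, list the parameter block owned by worker 0, 1, ..., P-1."""
    ring = deque(range(num_workers))
    rounds = []
    for _ in range(num_workers):
        rounds.append(list(ring))  # a permutation: no block has two owners
        ring.rotate(1)             # pass every block to the next worker
    return rounds


def sweep(data_shards):
    """One full sweep in which every worker visits every parameter block once."""
    num_workers = len(data_shards)
    blocks = [Counter() for _ in range(num_workers)]
    for owned in ring_schedule(num_workers):
        # Within a round the workers hold disjoint blocks, so no locking is
        # needed. This serial loop stands in for parallel execution.
        for worker, block_id in enumerate(owned):
            for word_id in data_shards[worker]:  # worker reads only its own shard
                if block_of(word_id, num_workers) == block_id:
                    blocks[block_id][word_id] += 1  # stand-in for a parameter update
    return blocks


# Three workers, each with a private shard of word ids (made-up data).
shards = [[0, 1, 2, 3], [1, 1, 4], [2, 5, 5, 0]]
blocks = sweep(shards)
```

In this sketch a worker never reads another worker's shard (data parallelism) and never holds more than one parameter block at a time (model parallelism), yet after one sweep every block has seen every shard.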


research
06/29/2012

Stochastic Variational Inference

We develop stochastic variational inference, a scalable algorithm for ap...
research
01/12/2018

Asynchronous Stochastic Variational Inference

Stochastic variational inference (SVI) employs stochastic optimization t...
research
11/06/2014

Stochastic Variational Inference for Hidden Markov Models

Variational inference algorithms have proven successful for Bayesian ana...
research
04/15/2021

Variational Inference for Category Recommendation in E-Commerce platforms

Category recommendation for users on an e-Commerce platform is an import...
research
04/21/2018

Variational Inference In Pachinko Allocation Machines

The Pachinko Allocation Machine (PAM) is a deep topic model that allows ...
research
02/27/2018

ADMM-based Networked Stochastic Variational Inference

Owing to the recent advances in "Big Data" modeling and prediction tasks...
research
12/10/2015

Scalable Modeling of Conversational-role based Self-presentation Characteristics in Large Online Forums

Online discussion forums are complex webs of overlapping subcommunities ...
