1 Introduction
In recent years, deep learning models have become significantly deeper and more computationally expensive. As evident from the ImageNet competition results [24, 31, 36, 14], increasing the depth of computer vision models indeed leads to improved results. However, such expensive models are unsuitable in many settings. One approach to reducing this cost is to use only as much computation as needed for the particular input.
Adaptive Computation Time (ACT) [12] is a recently proposed mechanism that adjusts the computational depth of deep models: the harder the object is, the more iterations it is processed for. This mechanism is end-to-end trainable, problem-agnostic and does not require explicit supervision for the number of computational iterations. It has been applied to recurrent networks for the problems of text modelling [12] and reasoning [30]. Spatially Adaptive Computation Time (SACT) [9] applies the ACT mechanism to the spatial positions of Residual Networks [15], a popular convolutional neural network model. This results in computational savings and interpretable computation time maps that highlight the regions of the image that the network considers relevant to the task at hand.
In this paper, we introduce Probabilistic Adaptive Computation Time (PACT), a probabilistic model with discrete latent variables that specify the number of iterations to execute. We define a prior on the latent variables that encodes the desired trade-off between speed and accuracy. Then, we perform amortized maximum a posteriori (MAP) inference to find the proper amount of computation for a given object. The ACT mechanism can be seen as an ad-hoc relaxation of the PACT model with a specific prior distribution. A significant downside of the ACT relaxation is that it provides a discontinuous objective. Since the reparameterization trick is only valid for continuous objectives, ACT cannot be incorporated into stochastic models trained with reparameterization, such as the variational autoencoder [22]. We extend variational optimization [34, 35], a method for MAP inference, to handle intractable expectations using REINFORCE or the reparameterization trick. For discrete latent variables, we propose to apply the Concrete relaxation [26, 18] and then perform the reparameterization. We call the obtained method stochastic variational optimization and apply it to the PACT model. Evaluation on ResNets shows that training using the relaxation outperforms the REINFORCE-based method and matches the performance of the heuristic ACT. We show that the relaxation allows training models with a large number of discrete latent variables. Additionally, the models trained with the proposed relaxation can be evaluated with a simple deterministic approach that reduces the memory consumption compared to ACT. Evaluating the ACT models in the same manner decreases their performance.
2 Background
Notation. Let $\mathbb{E}_{p(x)} f(x)$ denote the expectation of a function $f(x)$ over a probability distribution $p(x)$, $\sigma(x) = \frac{1}{1 + \exp(-x)}$ the sigmoid function, $\operatorname{logit}(p) = \log \frac{p}{1 - p}$ the logit function, and $[\,\text{cond}\,]$ the step function that is equal to $1$ if cond is true and $0$ otherwise. Also, let $z_{1:k}$ be a shorthand notation for $(z_1, \dots, z_k)$.
2.1 Variational Optimization
Variational optimization [34, 35] is a method for maximizing a function $f(x)$ of an argument $x$. The argument can be either continuous or discrete. To apply variational optimization, we choose an auxiliary parametric probability distribution $q_\theta(x)$ over the argument values. The following lower bound on the optimal value holds for any distribution $q_\theta(x)$:

$\max_x f(x) \ge \mathbb{E}_{q_\theta(x)} f(x) = F(\theta)$  (1)

Suppose that the parametric family of distributions can model arbitrary delta-functions. Then the bound is tight and the optimum is achieved when $q_\theta(x) = \delta(x - x^*)$, where $x^* = \arg\max_x f(x)$.
Let us assume that the density $q_\theta(x)$ is a smooth function of $\theta$. Then $F(\theta)$ is a smooth function. Variational optimization further assumes that the expectation in $F(\theta)$ is tractable and maximizes $F(\theta)$ with a gradient-based method. However, it is not applicable when the expectation is intractable. We address this limitation in sec. 3.
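When the expectation in $F(\theta)$ is tractable, the bound (1) can be maximized by plain gradient ascent. The following numpy sketch (our own illustration, not code from the paper; the function values and learning rate are arbitrary) optimizes a softmax-parameterized $q_\theta$ over a ten-element discrete domain and recovers the argmax:

```python
import numpy as np

# Function to maximize over the discrete domain {0, ..., 9} (arbitrary values).
f = np.array([0.1, 0.5, 0.2, 3.0, 0.4, 0.3, 0.2, 0.1, 0.0, 0.6])

theta = np.zeros(10)  # logits of the auxiliary distribution q_theta (softmax)
lr = 0.5
for _ in range(500):
    q = np.exp(theta - theta.max())
    q /= q.sum()
    F = np.dot(q, f)          # F(theta) = E_{q_theta} f(x), tractable here
    grad = q * (f - F)        # exact gradient of F w.r.t. the softmax logits
    theta += lr * grad

q = np.exp(theta - theta.max())
q /= q.sum()
best = int(np.argmax(q))      # q collapses towards a delta at argmax f
```

As expected, $q_\theta$ collapses towards a delta-function at the maximizer, making the bound (1) tight.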
2.2 Variational Optimization for Probabilistic Models
Consider a discriminative probabilistic model with latent variables, $p(y, z \mid x) = p(y \mid x, z)\, p(z)$, where $x$ is the object, $y$ is the target label and $z$ is the latent variable. The prior $p(z)$ encodes our preference for the values of $z$. The maximum a posteriori (MAP) inference problem is to find $z$ that maximizes the density of the posterior distribution $p(z \mid x, y) \propto p(y \mid x, z)\, p(z)$. At training time we know both $x$ and $y$, while at test time we only have $x$ and would like to find $z$. Therefore, we search for $z$ in a parametric form that depends only on $x$, so that we can use it at test time. This can be achieved by performing variational optimization with an auxiliary distribution $q_\phi(z \mid x)$:

$\max_z \log p(y, z \mid x) \ge \mathbb{E}_{q_\phi(z \mid x)} \big[ \log p(y \mid x, z) + \log p(z) \big] = F(\phi)$  (2)

For training, we plug in the ground-truth label $y$ and optimize $F(\phi)$. During testing, we sample $z \sim q_\phi(z \mid x)$ and obtain the distribution over the labels $p(y \mid x, z)$.
Let us analyze a special case of this approach that has been extensively used in the attention models literature [32, 1, 2, 41, 25]. Consider a probabilistic model with a learnable prior $p_\phi(z \mid x)$. We can use the prior as the approximate posterior in variational inference. The corresponding evidence lower bound is

$\mathbb{E}_{p_\phi(z \mid x)} \log p(y \mid x, z) \le \log p(y \mid x)$  (3)

Renaming $p_\phi(z \mid x)$ into $q_\phi(z \mid x)$, we recognize the objective (2) with a uniform prior distribution, $p(z) = \mathrm{const}$ (for a continuous latent variable on an unbounded domain, this prior is improper). Applying the inequality (1), we have $F(\phi) \le \max_z \log p(y \mid x, z)$. Therefore, optimization of $F(\phi)$ corresponds to maximum likelihood inference of the latent variables. On the other hand, the bound (2) allows incorporating an explicit prior distribution over the latent variables and performing MAP inference. This is a crucial requirement for models, such as the one proposed in this paper, that provide an explicit prior distribution.
The objective (2) can also be seen as the evidence lower bound on the marginal likelihood minus the entropy term. Indeed, adding the entropy of $q_\phi(z \mid x)$ to eqn. (2) yields

$\mathbb{E}_{q_\phi(z \mid x)} \big[ \log p(y \mid x, z) + \log p(z) - \log q_\phi(z \mid x) \big] \le \log p(y \mid x)$  (4)
Unlike MAP inference, variational inference provides a distribution over the latent variable. In our case, this is undesirable since we are interested in the single “best” value of the latent variables at test time. To obtain a single value of the variables for evaluation, we could choose the maximum of the approximate posterior. However, this would introduce a gap between the train- and test-time behavior of the model.
2.3 Concrete Distribution and Reparameterization
Suppose that we would like to stochastically optimize the parameters $\theta$ of an intractable expectation $\mathbb{E}_{q_\theta(z)} f(z)$, where $f$ is smooth. The reparameterization trick [22, 37] allows for this, provided that the distribution $q_\theta(z)$ can be reparameterized: we can sample $z \sim q_\theta(z)$ as follows:

$\varepsilon \sim p(\varepsilon), \quad z = g(\theta, \varepsilon)$  (5)

where $g(\theta, \varepsilon)$ is smooth w.r.t. $\theta$ and $\varepsilon$. Then, applying the chain rule, we have:

$\nabla_\theta \mathbb{E}_{q_\theta(z)} f(z) = \mathbb{E}_{p(\varepsilon)} \nabla_\theta f(g(\theta, \varepsilon))$  (6)

This expectation can be approximated using Monte Carlo sampling. The reparameterization trick is most commonly used for the Normal distribution: if $z \sim \mathcal{N}(\mu, \sigma^2)$, then $\varepsilon \sim \mathcal{N}(0, 1)$ and $z = \mu + \sigma \varepsilon$.
Unfortunately, the reparameterization trick cannot be directly applied to discrete random variables, since the corresponding function $g(\theta, \varepsilon)$ is a non-smooth step function. However, it is possible to relax a discrete random variable so that the relaxation becomes reparameterizable. The Concrete distribution [26, 18] is a continuous reparameterizable relaxation of a discrete random variable. For the purposes of this paper, we only consider the relaxation of Bernoulli (binary) random variables. Consider a random variable $Z \sim \mathrm{Bernoulli}(p)$, where $p \in (0, 1)$. We introduce a temperature parameter $\lambda > 0$. The relaxed random variable $\hat{Z} \sim \mathrm{RelaxedBernoulli}(p, \lambda)$ is defined via the following sampling procedure:

$U \sim \mathrm{Uniform}(0, 1), \quad L = \operatorname{logit}(p) + \operatorname{logit}(U)$  (7)

$\hat{Z} = \sigma(L / \lambda)$  (8)
The $\mathrm{RelaxedBernoulli}(p, \lambda)$ distribution has several useful properties [26]. First, the probability of being greater than 0.5 is equal for the $\mathrm{Bernoulli}(p)$ and $\mathrm{RelaxedBernoulli}(p, \lambda)$ random variables. However, the mean value of $\hat{Z}$ is, in general, not equal to $p$. For $\lambda \to 0$, the distribution of $\hat{Z}$ approaches $\mathrm{Bernoulli}(p)$. Next, for $\lambda \le 1$ the density of $\hat{Z}$ does not have modes in the interior of the range. As a result, the samples are typically close to either zero or one, which makes the relaxation work well for our purposes. Importantly for us, when $p \to 0$ or $p \to 1$, the distribution of $\hat{Z}$ approaches a delta-function at 0 or 1, respectively. This means that for extreme values of the probability, the gap between the relaxed and non-relaxed distributions vanishes, regardless of the temperature $\lambda$.
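The sampling procedure (7)–(8) is a few lines of numpy (our own sketch, not code from the paper). It also checks the property that $P(\hat{Z} > 0.5) = p$ holds for any temperature, while the mean of $\hat{Z}$ generally differs from $p$:

```python
import numpy as np

def logit(p):
    return np.log(p) - np.log(1.0 - p)

def relaxed_bernoulli(p, lam, rng, size):
    """Sample RelaxedBernoulli(p, lam) via eqns (7)-(8)."""
    u = rng.uniform(size=size)
    l = logit(p) + logit(u)                 # (7): add logistic noise to logit(p)
    return 1.0 / (1.0 + np.exp(-l / lam))   # (8): tempered sigmoid

rng = np.random.default_rng(0)
p, lam = 0.3, 0.5
z_hat = relaxed_bernoulli(p, lam, rng, size=100_000)

frac_above_half = (z_hat > 0.5).mean()  # matches P(Z = 1) = p for any lam
mean_z_hat = z_hat.mean()               # generally differs from p
```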
3 Stochastic Variational Optimization
Consider the variational optimization objective $F(\theta) = \mathbb{E}_{q_\theta(z)} f(z)$, where $z$ is a latent variable. Stochastic variational optimization estimates the gradient $\nabla_\theta F(\theta)$ stochastically, even when the expectation is intractable. First, we consider the case of a reparameterizable distribution, and then cover the case of discrete distributions. If the distribution $q_\theta(z)$ is reparameterizable, e.g. a Normal distribution, we can perform the reparameterization trick and calculate the stochastic gradients directly. We then apply stochastic gradient optimization methods, resulting in stochastic variational optimization of the objective.
Now, we switch to the case where $q_\theta(z)$ is discrete. One popular method for this type of problem is the REINFORCE [39] training rule:

$\nabla_\theta F(\theta) = \mathbb{E}_{q_\theta(z)} \big[ (f(z) - b)\, \nabla_\theta \log q_\theta(z) \big]$  (9)

where $b$ is a scalar baseline. The expectation can be approximated by Monte Carlo sampling. Although this procedure provides unbiased gradients, the estimate often has an impractically high variance.
We propose to apply the Concrete relaxation to the proposal distribution and then use the reparameterization trick. This results in lower-variance gradients at the cost of a bias. Assume that $z = z_{1:n}$ is a vector of binary latent variables. We decompose the proposal distribution using the chain rule, $q_\theta(z) = \prod_{i=1}^{n} q_\theta(z_i \mid z_{1:i-1})$ (this sidesteps enumeration of all the configurations of $z$ during sampling). We make two assumptions: (1) $f(z)$ is defined and smooth for $z \in [0, 1]^n$; (2) each factor $q_\theta(z_i \mid z_{1:i-1})$ is defined and smooth for $z_{1:i-1} \in [0, 1]^{i-1}$. Then, we can apply the Concrete relaxation with temperature $\lambda$ to each factor (the hat denotes relaxation):

$\hat{q}_\theta(\hat{z}) = \prod_{i=1}^{n} \mathrm{RelaxedBernoulli}\big(\hat{z}_i \mid q_\theta(z_i = 1 \mid \hat{z}_{1:i-1}), \lambda\big)$  (10)

The relaxed objective has the form

$\hat{F}(\theta) = \mathbb{E}_{\hat{q}_\theta(\hat{z})} f(\hat{z})$  (11)

This objective can now be stochastically optimized using the reparameterization trick.
If all the probabilities in the relaxed distribution approach extreme values (0 or 1), the relaxed distribution approaches the non-relaxed one for any temperature $\lambda$. In this case, the value of the relaxed objective $\hat{F}(\theta)$ approaches the value of the original objective $F(\theta)$.
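The variance difference between the two estimators can be seen on a toy one-variable problem. The sketch below (our own construction, not from the paper; the linear $f$ and the temperature are arbitrary choices) estimates $\nabla_\theta \mathbb{E}_{\mathrm{Bernoulli}(\sigma(\theta))} f(z)$ both with REINFORCE (9) and with the relaxed, reparameterized objective (11):

```python
import numpy as np

rng = np.random.default_rng(0)
theta, lam, n = 0.0, 0.5, 200_000
p = 1.0 / (1.0 + np.exp(-theta))          # Bernoulli success probability

f = lambda z: 3.0 * z + 1.0               # defined and smooth on [0, 1]
true_grad = p * (1 - p) * (f(1.0) - f(0.0))

# REINFORCE estimator (9) with a zero baseline.
z = (rng.uniform(size=n) < p).astype(float)
g_reinforce = f(z) * (z - p)              # (z - p) = grad_theta log q_theta(z)

# Relaxed estimator: Concrete sample via (7)-(8), then the chain rule (6).
u = rng.uniform(size=n)
logit_u = np.log(u) - np.log(1.0 - u)
z_hat = 1.0 / (1.0 + np.exp(-(theta + logit_u) / lam))
g_relaxed = 3.0 * z_hat * (1.0 - z_hat) / lam  # f'(z_hat) * d z_hat / d theta
```

The REINFORCE estimate is unbiased but noisy; the relaxed estimate is biased but markedly lower-variance, matching the trade-off described above.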
4 Probabilistic Adaptive Computation Time
First, we introduce the adaptive computation block. It is a computation module that chooses the number of iterations depending on the input. Depending on the specific type of the latent variables, we obtain a discrete, thresholded or relaxed block. Importantly, the blocks are compatible in the sense that one can train a model with one type of block and then switch to another during evaluation. Then, we present a probabilistic model that incorporates the number of iterations as a latent variable into a discriminative model. The prior on the latent variable favors using fewer iterations. Finally, we perform MAP inference over the number of iterations via stochastic variational optimization.
Discrete adaptive computation block performs $z$ iterations of computation, where $z \in \{1, \dots, L\}$ is a discrete latent variable. Let us assume that the $l$-th iteration outputs a value $u^l$ (we use upper indices to index the iterations in a block), and that all $u^l$ have the same shape. The output of the block is $u^z$, the output of the $z$-th iteration. To perform optimization over the discrete latent variable $z$, we introduce a distribution $q_\theta(z)$ with parameters $\theta$. Denote by $z^l \in \{0, 1\}$ the halting unit of the $l$-th iteration: when it is equal to one, the computation is halted. The two desiderata for $q_\theta(z)$ are: (1) the probability of halting at the $l$-th step should depend on $u^l$; (2) it should be possible to sample $z = l$ after only executing the first $l$ iterations.
To satisfy the first property, we introduce a halting probability for every iteration:

$h^l = \sigma\big(f_\theta(u^l)\big), \ l = 1, \dots, L - 1, \qquad h^L = 1$  (12)

where $f_\theta$ maps the iteration output to a scalar halting score (its specific form is defined per application below). For the second property, we define the following sampling procedure for the distribution $q_\theta(z)$:

$z^l \sim \mathrm{Bernoulli}(h^l), \quad l = 1, \dots, L$  (13)

$\psi^l = z^l \prod_{j=1}^{l-1} (1 - z^j), \quad l = 1, \dots, L$  (14)

The vector $\psi = (\psi^1, \dots, \psi^L)$ is a one-hot representation of the discrete $L$-ary latent variable $z$. We reparameterize $z$ via the Bernoulli latent variables $z^1, \dots, z^L$. The distribution of $z$ can be obtained by taking an expectation over the independent random variables $z^l$:

$q_\theta(z = l) = h^l \prod_{j=1}^{l-1} (1 - h^j)$  (15)
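The construction (13)–(15) can be checked numerically. The sketch below (our own illustration, not code from the paper; the halting probabilities are arbitrary) computes the induced distribution $q_\theta(z)$ and verifies it against one-hot samples:

```python
import numpy as np

def halting_distribution(h):
    """q(z = l) from eqn. (15): halt at l with prob h[l], given that
    iterations 1..l-1 did not halt. Expects h[-1] == 1."""
    h = np.asarray(h, dtype=float)
    survive = np.cumprod(np.concatenate(([1.0], 1.0 - h[:-1])))
    return h * survive

def sample_one_hot(h, rng):
    """Sampling procedure (13)-(14): flip Bernoulli(h[l]) until halting."""
    psi = np.zeros(len(h))
    for l, p in enumerate(h):
        if rng.uniform() < p:
            psi[l] = 1.0
            break
    return psi

h = np.array([0.2, 0.5, 0.7, 1.0])  # h[L-1] = 1 guarantees halting
q = halting_distribution(h)

rng = np.random.default_rng(0)
samples = np.array([sample_one_hot(h, rng) for _ in range(50_000)])
empirical = samples.mean(axis=0)    # should match q
```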
Thresholded adaptive computation block is a deterministic version of the (stochastic) discrete adaptive computation block. Since we perform MAP inference over the latent variables, we expect the halting probabilities to be sufficiently close to either zero or one. Therefore, during evaluation we can replace sampling (13) with thresholding of the halting probabilities:

$z^l = [\, h^l > 0.5 \,]$  (16)

The advantage of this block is an extremely simple implementation: stop as soon as the halting probability exceeds 0.5.
Relaxed adaptive computation block is obtained from the discrete adaptive computation block by replacing the $\mathrm{Bernoulli}(h^l)$ random variables with $\mathrm{RelaxedBernoulli}(h^l, \lambda)$ variables. We denote the relaxed variables with a hat and the temperature of the relaxation $\lambda$. Sampling the vector $\hat\psi$ from the relaxed distribution proceeds as follows:

$\hat{z}^l \sim \mathrm{RelaxedBernoulli}(h^l, \lambda), \quad l = 1, \dots, L$  (17)

$\hat\psi^l = \hat{z}^l \prod_{j=1}^{l-1} (1 - \hat{z}^j), \quad l = 1, \dots, L$  (18)

The vector $\hat\psi$ is no longer one-hot. However, since it is produced by a stick-breaking procedure, it forms a discrete probability distribution over the iterations that we call the halting distribution. Finally, we define the output of the relaxed adaptive computation block as an expectation of the iteration outputs w.r.t. the halting distribution $\hat\psi$:

$\hat{u} = \sum_{l=1}^{L} \hat\psi^l u^l$  (19)
The whole procedure is illustrated on fig. 1.
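A minimal numpy sketch of the relaxed block (our own illustration, not code from the paper; the iteration outputs, halting probabilities and temperature are arbitrary placeholders):

```python
import numpy as np

def relaxed_block_output(u, h, lam, rng):
    """Relaxed adaptive computation block, eqns (17)-(19).

    u:   iteration outputs, shape (L, d)
    h:   halting probabilities, shape (L,), with h[-1] == 1
    Returns (psi_hat, output), where output = sum_l psi_hat[l] * u[l]."""
    logit = lambda p: np.log(p) - np.log(1.0 - p)
    # (17): Concrete relaxation of Bernoulli(h[l]); h[-1] == 1 stays exactly 1.
    uu = rng.uniform(size=len(h) - 1)
    z_hat = 1.0 / (1.0 + np.exp(-(logit(h[:-1]) + logit(uu)) / lam))
    z_hat = np.concatenate((z_hat, [1.0]))
    # (18): stick-breaking produces the halting distribution psi_hat.
    survive = np.cumprod(np.concatenate(([1.0], 1.0 - z_hat[:-1])))
    psi_hat = z_hat * survive
    # (19): output is the psi_hat-weighted average of the iteration outputs.
    return psi_hat, psi_hat @ u

rng = np.random.default_rng(0)
u = rng.normal(size=(4, 3))                 # placeholder iteration outputs
h = np.array([0.05, 0.9, 0.99, 1.0])        # placeholder halting probabilities
psi_hat, out = relaxed_block_output(u, h, lam=0.5, rng=rng)
```

Because the last halting probability is 1, the stick-breaking weights $\hat\psi$ always sum to one.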
Probabilistic model. Consider a discriminative model with a likelihood $p(y \mid x, z)$ of the target label $y$ given an object $x$ (for simplicity of notation, we consider just one object), parameterized by $w$. This model can be a deep network for a classification or regression problem. In many cases we prefer that the model make the prediction as quickly as possible. Assume that we have incorporated $K$ adaptive computation blocks into the likelihood, with the corresponding latent variables (numbers of computation iterations) $z = (z_1, \dots, z_K)$. Also, denote the maximum number of iterations in the $k$-th block by $L_k$.
We now discuss the prior distribution that encodes the preference for fewer iterations. For simplicity, we assume that it factorizes over the blocks, $p(z) = \prod_{k=1}^{K} p(z_k)$. The prior for each block is a discrete distribution over $L_k$ iterations. To make our model directly comparable to ACT, we choose a prior distribution that provides the same log-linear penalty as the ACT model (up to a normalization constant), a truncated Geometric distribution. We parameterize the Geometric distribution via a log-scale number-of-iterations penalty $\tau > 0$ (the canonical Geometric distribution probability of success can be recovered as $p = 1 - \exp(-\tau)$). The prior distribution for a single block is

$p(z_k = l) = \frac{\exp(-\tau l)}{\sum_{l'=1}^{L_k} \exp(-\tau l')} \propto \exp(-\tau l), \quad l = 1, \dots, L_k$  (20)
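A quick numerical check of the prior (20) (our own sketch, not code from the paper; the values of $\tau$ and $L_k$ are arbitrary):

```python
import numpy as np

def truncated_geometric(tau, L):
    """Prior (20): p(z = l) proportional to exp(-tau * l), l = 1..L."""
    log_p = -tau * np.arange(1, L + 1)
    p = np.exp(log_p - log_p.max())
    return p / p.sum()

tau, L = 0.5, 5
prior = truncated_geometric(tau, L)
# Log-linear penalty: consecutive log-probabilities differ by exactly -tau.
diffs = np.diff(np.log(prior))
# Canonical Geometric success probability recovered from the penalty.
success = 1.0 - np.exp(-tau)
```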
Using the described prior, we obtain the following probabilistic model:

$p(y, z \mid x) = p(y \mid x, z) \prod_{k=1}^{K} p(z_k)$  (21)

We perform MAP inference of the latent variables by variational optimization with an auxiliary distribution

$q_\phi(z \mid x) = \prod_{k=1}^{K} q_\phi(z_k \mid x, z_{1:k-1})$  (22)

where each factor $q_\phi(z_k \mid x, z_{1:k-1})$ is defined via eqn. (15). The dependence on the input and the previous latent variables is via the inputs of the block. We refer to this probabilistic model as discrete. The objective for maximization w.r.t. $\phi$ and $w$ is

$F(\phi, w) = \mathbb{E}_{q_\phi(z \mid x)} \Big[ \log p(y \mid x, z) + \sum_{k=1}^{K} \log p(z_k) \Big]$  (23)

To reduce the variance of the stochastic estimate of the objective, we analytically compute the expectation of the log-prior:

$\mathbb{E}_{q_\phi(z \mid x)} \log p(z_k) = -\tau\, \mathbb{E}_{q_\phi(z \mid x)} z_k + \mathrm{const} = -\tau \rho_k + \mathrm{const}$  (24)

Here $\rho_k = \mathbb{E}_{q_\phi(z \mid x)} z_k$ is the expected number of iterations in the $k$-th block. Ignoring the additive constant, we have

$F(\phi, w) = \mathbb{E}_{q_\phi(z \mid x)} \log p(y \mid x, z) - \tau \sum_{k=1}^{K} \rho_k$  (25)
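The penalty term of (25) is driven by the expected number of iterations, which follows directly from the halting probabilities via (15). A small sketch (our own illustration, not code from the paper; the probabilities are arbitrary):

```python
import numpy as np

def expected_iterations(h):
    """rho = E[z] under the halting distribution induced by the halting
    probabilities h (eqn. (15)); h[-1] must be 1 so the block always halts."""
    h = np.asarray(h, dtype=float)
    survive = np.cumprod(np.concatenate(([1.0], 1.0 - h[:-1])))
    q = h * survive                              # q(z = l), eqn. (15)
    return np.dot(np.arange(1, len(h) + 1), q)   # sum_l l * q(z = l)

rho_lazy = expected_iterations([0.99, 0.99, 0.99, 1.0])   # halts almost at once
rho_eager = expected_iterations([0.01, 0.01, 0.01, 1.0])  # runs nearly all iters
```

Pushing the halting probabilities up or down directly trades accuracy for the $-\tau \rho_k$ penalty.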
The objective in eqn. (25) is intractable for deep models consisting of several stacked adaptive computation blocks, as the complexity of direct evaluation of the expectation grows exponentially in the number of blocks. One heuristic is to replace the random variables with their expectations and optimize the probabilities directly. However, this simple approach fails for deep networks as they learn to trick the objective by increasing the halting probability for the first iterations and decreasing it for the latter iterations, while significantly boosting the magnitude of the outputs for the latter iterations [12]. The prior term value reflects that few iterations were used, while the outputs of the blocks are dominated by the last iterations.
Instead, we stochastically optimize the objective (25). In sec. 3 we proposed two approaches to do this, one using REINFORCE and another using relaxation.
In the first approach, we directly apply REINFORCE to the objective (25), obtaining the following gradients w.r.t. $\phi$:

$\nabla_\phi F(\phi, w) = \mathbb{E}_{q_\phi(z \mid x)} \big[ (\log p(y \mid x, z) - b)\, \nabla_\phi \log q_\phi(z \mid x) \big] - \tau \sum_{k=1}^{K} \nabla_\phi \rho_k$  (26)

where $b$ is a scalar baseline. The value of $\rho_k$ is defined by eqn. (15). Note that we have neglected the dependency of $\rho_k$ on $\phi$ through the sampled latent variables $z_{1:k-1}$ to reduce the variance of the gradients.
For the second approach, we replace every adaptive computation block with a relaxed counterpart, and the corresponding distribution $q_\phi(z_k \mid x, z_{1:k-1})$ with the relaxed distribution $\hat{q}_\phi(\hat{z}_k \mid x, \hat{z}_{1:k-1})$. This relaxed model has an objective that can be optimized via the reparameterization trick:

$\hat{F}(\phi, w) = \mathbb{E}_{\hat{q}_\phi(\hat{z} \mid x)} \log p(y \mid x, \hat{z}) - \tau \sum_{k=1}^{K} \hat\rho_k$  (27)
In the supplementary we present the algorithms for PACT in Discrete, Thresholded and Relaxed modes.
4.1 Application: Probabilistic Spatially Adaptive Computation Time for Residual Networks
Residual network (ResNet) [14, 15] is a deep convolutional neural network architecture that has been successfully applied to many computer vision problems [6, 5]. We describe the ResNet-32 and ResNet-110 models for the CIFAR image classification dataset [23]. They contain three stacked blocks, each consisting of several residual units (5 for ResNet-32 and 18 for ResNet-110). The computational iteration of a ResNet is a residual unit of the form $u^l = u^{l-1} + F^l(u^{l-1})$, where $F^l$ is a subnetwork consisting of two convolutional layers and $u^0$ is the output of the previous block of residual units. The outputs of the residual units in each block have the same size. The first units of the second and third blocks are applied with stride 2 to perform spatial downsampling, while also increasing the number of output channels by a factor of two. Thus, the spatial dimensions are $32 \times 32$ in the first block (the size of CIFAR-10 images), $16 \times 16$ in the second block and $8 \times 8$ in the third block. In this way, the amount of computation for every residual unit is roughly constant. The outputs of the last block are passed through global average pooling and linear layers to obtain the class probability logits.
SACT [9] applies the ACT mechanism to every spatial position of every residual network block. Likewise, we apply an adaptive computation block to every spatial position of every residual network block. We call the obtained model PSACT, probabilistic spatially adaptive computation time. The corresponding latent variables are $z_k^{ij}$, where $k$ is the index of the residual network block and $(i, j)$ is the spatial position. The halting probability map is computed as $h^l = \sigma\big(W * u^l + w^\top \operatorname{pool}(u^l) + b\big)$, where $*$ denotes convolution and $\operatorname{pool}$ is global average pooling. The computation time penalty for a block is chosen to be $\tau_k = \frac{\tau}{H_k W_k}$, where $\tau$ is a global computation time penalty and $H_k$ and $W_k$ are the height and width of the ResNet block.
In order to impute the non-computed intermediate values, we redefine the residual unit as

$u^l = u^{l-1} + m^l \odot F^l(u^{l-1})$  (28)

where $m^l$ is an active-positions mask and $\odot$ denotes elementwise multiplication. For the discrete model, we choose $m^l = \prod_{j=1}^{l-1} (1 - z^j)$, with the operation performed elementwise over the spatial positions. Thus, if a position is no longer evaluated (i.e., it has halted at one of the first $l - 1$ iterations), the mask value is zero and we simply carry the features from the previous iteration; otherwise, the value is one. For the relaxed model, we use $\hat{m}^l = \big[ \prod_{j=1}^{l-1} (1 - \hat{z}^j) > \epsilon \big]$, where $\epsilon$ is a scalar hyperparameter. By clipping the small values of the remaining halting mass, we obtain strict zeros and can skip computing the corresponding values during training. We have verified that setting $\epsilon$ to zero gives similar results, although without the possibility of computation savings during training.
4.2 Application: Probabilistic Adaptive Computation Time for Recurrent Neural Networks
We can also apply the proposed model to dynamically vary the amount of computation in Recurrent Neural Networks, such as Long Short-Term Memory networks (LSTMs) [16]. Let us denote the input sequence $x_1, \dots, x_T$, where $T$ is the number of timesteps. An adaptive computation block is associated with each timestep, so each timestep is processed for an adaptive number of iterations. We can use the same computation time penalty $\tau$ for all iterations. A computation iteration consists of applying the RNN's transition function to obtain the new state of the RNN: $u^l = \mathrm{RNN}\big(u^{l-1}, (x_t, [l = 1])\big)$. Here $u^0$ is the output state from the previous block/timestep. The binary input feature $[l = 1]$ allows the network to detect the beginning of a new timestep. The halting probability is computed as $h^l = \sigma(w^\top u^l + b)$. The output state of a block is used as the input state for the next block and as features for predicting the emission values for the timestep.
5 Related work
The Adaptive Computation Time (ACT) mechanism [12] can be seen as a heuristic deterministic relaxation of our PACT model. Specifically, ACT transforms the halting probabilities $h^l$ into the halting distribution $\psi$ as follows:

$N = \min \Big\{ l \in \{1, \dots, L\} : \sum_{j=1}^{l} h^j \ge 1 - \varepsilon \Big\}$  (29)

$\psi^l = \begin{cases} h^l & l < N \\ 1 - \sum_{j=1}^{N-1} h^j & l = N \\ 0 & l > N \end{cases}$  (30)

Since the halting distribution is not one-hot, additional memory is required to maintain the output during evaluation (an algorithm is presented in the supplementary). In the discrete and thresholded PACT models, the halting distribution is one-hot and this memory can be saved.
The stopping time $N$ has zero gradients almost everywhere. In order to optimize the stopping time, a differentiable upper bound on it, the ponder cost $N + R$ with remainder $R = 1 - \sum_{j=1}^{N-1} h^j$, is introduced. The ponder cost is linear almost everywhere, but is a discontinuous function of the halting probabilities, with discontinuities arising at the configurations where $N$ changes its value, see fig. 2. For instance, this means that ACT cannot be used with the reparameterization trick, which is only valid for continuous objectives. The objective of ACT, for several adaptive computation blocks, is $\log p(y \mid x, \psi) - \tau \sum_{k=1}^{K} (N_k + R_k)$.
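For comparison, ACT's deterministic halting distribution (29)–(30) can be sketched as follows (our own illustration, not code from the paper; `eps` plays the role of ACT's small constant $\varepsilon$, and the halting probabilities are arbitrary):

```python
import numpy as np

def act_halting_distribution(h, eps=0.01):
    """ACT's deterministic halting distribution, eqns (29)-(30):
    accumulate h until the running sum reaches 1 - eps, then assign the
    remainder. Assumes the cumulative sum of h reaches 1 - eps."""
    h = np.asarray(h, dtype=float)
    csum = np.cumsum(h)
    n = int(np.argmax(csum >= 1.0 - eps))  # 0-based stopping iteration N
    psi = np.zeros(len(h))
    psi[:n] = h[:n]
    psi[n] = 1.0 - h[:n].sum()             # remainder R
    return psi, n + 1                      # 1-based N

psi, n = act_halting_distribution([0.3, 0.5, 0.4, 1.0])
```

Unlike the one-hot $\psi$ of the discrete PACT block, the resulting vector spreads mass over several iterations, which is why ACT needs extra memory to accumulate the output.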
Let us summarize why the proposed PACT model is more principled than ACT. First, the discrete PACT model straightforwardly defines the halting time as the iteration at which the halting unit fires, whereas ACT uses the ad-hoc definition (29). Second, PACT allows directly minimizing the expected halting time, while ACT minimizes the discontinuous ponder cost.
Several papers have explored using REINFORCE to adjust the number of computation steps in neural networks with discrete latent variables: choosing the number of patches to process [25], determining the number of objects in a scene [8], dropping unnecessary subsets of neurons in a fully-connected network [3]. REINFORCE for discrete latent variables is also used in hard attention methods [28, 1]. Most of these use the same amount of computation for all inputs, although [25] explores dynamically adjusting the number of steps. As we experimentally show, using the Concrete relaxation dramatically simplifies training compared to using REINFORCE. Recently, [19] proposed to only update a dynamically chosen subset of the hidden state of a recurrent network. This can be seen as an alternative to ACT for recurrent neural networks. However, it is still a heuristic mechanism requiring several tricks to train.
Two concurrent works explore adaptive dropping of residual units in ResNet models using Actor-Critic [40] and Gumbel-Softmax [38]. This can be seen as an adaptive version of stochastic depth [17]. In this paper, we propose a probabilistic view of the ACT and SACT mechanisms. The resulting method is generally applicable to sequential models, including ResNets and RNNs.
Our work follows a trend in machine learning of interpreting methods as approximate Bayesian procedures. For example, in the field of topic modelling, Latent Dirichlet Allocation [4] is a probabilistic counterpart of Latent Semantic Indexing [7]. Recently, Dropout [33] has been interpreted as variational inference in a probabilistic model [21, 10]. This spurred the development of more innovative ways of using Dropout, in RNNs [11] and for sparsifying neural networks [29]. We hope that our paper will similarly open the way for various extensions of adaptive computation time.
6 Experiments
In the experimental evaluation we focus on the PSACT model for ResNets, since it allows adjusting the number of latent variables by grouping the spatial positions. First, we demonstrate that the relaxed model's parameters are compatible with the discrete and thresholded models. Then, we compare training of the relaxed model to training of the discrete model with REINFORCE, for a varying number of latent variables. Finally, we demonstrate that the relaxed PSACT model achieves results close to ACT. We also verify that the parameters obtained by the relaxed model can be used in a thresholded model with extremely simple test-time behavior, and that this is not the case for SACT.
We consider pre-activation ResNets [15] with 32 and 110 convolutional layers on the CIFAR-10 image classification dataset [23]. The training hyperparameters are provided in the supplementary. Unless otherwise noted, PSACT is trained using the relaxed model and evaluated using the discrete model. As a proxy for the potential time savings, we compute the number of floating point operations (FLOPs) required to evaluate the positions with non-zero values in the active positions mask, as done in [9].
In the first experiment, we train a relaxed PSACT model. The obtained parameters are continuously evaluated on the test set in three modes: relaxed (Concrete relaxation of the Bernoulli variables), discrete (discrete latent variables), and thresholded (deterministic latent variables). The results on fig. 3 show that the loss function and accuracy stay remarkably close for the three models. Since the computation in the relaxed model is stopped only when the remaining relaxed halting mass becomes negligible, and the relaxed variables might take non-extreme values, the relaxed model requires more computation.
Next, we compare training of the relaxed model to training of the discrete model using REINFORCE. We use an exponential moving average reward baseline. We do not employ an input-dependent baseline to simplify the model, since [27] finds only a small improvement from using it. Additionally, for REINFORCE, we use the Adam optimizer [20] (the learning rate decay schedule is kept the same), since the SGD with momentum used in the other experiments results in unstable training.
The PSACT model for ResNet-32 has 5-ary categorical latent variables, one variable per spatial position. To study the effect of the number of latent variables on training, we group the latent variables spatially: in every ResNet block, we group the spatial positions into non-overlapping patches. Within each patch, we average the logits of the halting probabilities and sample a single latent variable per patch. The results presented on fig. 4 show that REINFORCE has a much higher gradient variance; for the largest number of latent variables, the difference is about two orders of magnitude. REINFORCE achieves comparable results for small numbers of latent variables, but the accuracy quickly deteriorates as the number of latent units is increased.
(Figure caption: PSACT is trained using the relaxed model. The results are averaged over five runs, with error bars denoting one standard deviation. Left: ResNet-32, right: ResNet-110.)
Finally, we compare the SACT and PSACT models for ResNet-32 and ResNet-110 on fig. 5. The PSACT model is trained using the relaxation and then evaluated in the discrete and thresholded regimes. PSACT and SACT perform similarly. We find that PSACT requires a somewhat lower computation time penalty to achieve the same number of FLOPs, perhaps because the expected number of iterations penalty in PSACT is easier to optimize than the surrogate ponder cost of SACT. Relaxed PSACT successfully trains on ResNet-110, where we have 18-ary discrete latent variables. PSACT can be evaluated in the deterministic thresholded mode with very close results, indicating that the latent variable probabilities have saturated. This is not the case for SACT: evaluation in thresholded mode reduces the accuracy by at least 5% (a plot is available in the supplementary materials). We also present a comparison of the learned computation time maps on fig. 6.
7 Conclusion
We have presented Probabilistic Adaptive Computation Time, a principled latent variable model for varying the amount of computation in deep models. The proposed stochastic variational optimization allows performing approximate MAP inference in this model. Experimentally, we find that training using the Concrete relaxation of discrete latent variables outperforms REINFORCE-based training. The model achieves results similar to the heuristic method Adaptive Computation Time, while enjoying a principled formulation. It can also be used in the thresholded mode with very simple test-time behavior. In the future, we plan to explore different training techniques and modifications of the proposed latent variable model. Additionally, we expect that the proposed techniques could be useful for replacing REINFORCE in the training of hard attention models.
Acknowledgments. M. Figurnov and D. Vetrov are supported by Russian Science Foundation grant 17-71-20072 and Russian Academic Excellence Project ‘5-100’.
References
 [1] Jimmy Ba, Volodymyr Mnih, and Koray Kavukcuoglu. Multiple object recognition with visual attention. ICLR, 2015.
 [2] Jimmy Ba, Ruslan R Salakhutdinov, Roger B Grosse, and Brendan J Frey. Learning wake-sleep recurrent attention models. NIPS, 2015.
 [3] Emmanuel Bengio, Pierre-Luc Bacon, Joelle Pineau, and Doina Precup. Conditional computation in neural networks for faster models. ICLR Workshop, 2016.
 [4] David M Blei, Andrew Y Ng, and Michael I Jordan. Latent dirichlet allocation. JMLR, 2003.
 [5] Liang-Chieh Chen, George Papandreou, Iasonas Kokkinos, Kevin Murphy, and Alan L Yuille. DeepLab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs. arXiv, 2016.
 [6] Jifeng Dai, Yi Li, Kaiming He, and Jian Sun. R-FCN: Object detection via region-based fully convolutional networks. NIPS, 2016.
 [7] Scott Deerwester, Susan T Dumais, George W Furnas, Thomas K Landauer, and Richard Harshman. Indexing by latent semantic analysis. Journal of the American society for information science, 41(6), 1990.

 [8] Ali Eslami, Nicolas Heess, Theophane Weber, Yuval Tassa, David Szepesvari, Koray Kavukcuoglu, and Geoffrey E Hinton. Attend, infer, repeat: Fast scene understanding with generative models. NIPS, 2016.
 [9] Michael Figurnov, Maxwell D Collins, Yukun Zhu, Li Zhang, Jonathan Huang, Dmitry Vetrov, and Ruslan Salakhutdinov. Spatially adaptive computation time for residual networks. CVPR, 2017.
 [10] Yarin Gal and Zoubin Ghahramani. Dropout as a Bayesian approximation: Representing model uncertainty in deep learning. ICML, 2016.
 [11] Yarin Gal and Zoubin Ghahramani. A theoretically grounded application of dropout in recurrent neural networks. NIPS, 2016.
 [12] Alex Graves. Adaptive computation time for recurrent neural networks. arXiv, 2016.
 [13] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification. ICCV, 2015.
 [14] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. CVPR, 2016.
 [15] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Identity mappings in deep residual networks. ECCV, 2016.
 [16] Sepp Hochreiter and Jürgen Schmidhuber. Long short-term memory. Neural Computation, 9(8):1735–1780, 1997.
 [17] Gao Huang, Yu Sun, Zhuang Liu, Daniel Sedra, and Kilian Q Weinberger. Deep networks with stochastic depth. ECCV, 2016.
 [18] Eric Jang, Shixiang Gu, and Ben Poole. Categorical reparameterization with Gumbel-Softmax. ICLR, 2017.
 [19] Yacine Jernite, Edouard Grave, Armand Joulin, and Tomas Mikolov. Variable computation in recurrent neural networks. ICLR, 2017.
 [20] Diederik Kingma and Jimmy Ba. Adam: A method for stochastic optimization. ICLR, 2015.
 [21] Diederik P Kingma, Tim Salimans, and Max Welling. Variational dropout and the local reparameterization trick. NIPS, 2015.
 [22] Diederik P Kingma and Max Welling. Auto-encoding variational Bayes. ICLR, 2014.
 [23] Alex Krizhevsky and Geoffrey Hinton. Learning multiple layers of features from tiny images. Computer Science Department, University of Toronto, Tech. Rep, 2009.
 [24] Alex Krizhevsky, Ilya Sutskever, and Geoffrey E Hinton. Imagenet classification with deep convolutional neural networks. NIPS, 2012.
 [25] Zhichao Li, Yi Yang, Xiao Liu, Shilei Wen, and Wei Xu. Dynamic computational time for visual attention. arXiv, 2017.
 [26] Chris J Maddison, Andriy Mnih, and Yee Whye Teh. The concrete distribution: A continuous relaxation of discrete random variables. ICLR, 2017.
 [27] Andriy Mnih and Karol Gregor. Neural variational inference and learning in belief networks. ICML, 2014.
 [28] Volodymyr Mnih, Nicolas Heess, Alex Graves, and Koray Kavukcuoglu. Recurrent models of visual attention. In NIPS, 2014.
 [29] Dmitry Molchanov, Arsenii Ashukha, and Dmitry Vetrov. Variational dropout sparsifies deep neural networks. ICML, 2017.
 [30] Mark Neumann, Pontus Stenetorp, and Sebastian Riedel. Learning to reason with adaptive computation. NIPS Workshop on Interpretable Machine Learning in Complex Systems, 2016.
 [31] Karen Simonyan and Andrew Zisserman. Very deep convolutional networks for large-scale image recognition. ICLR, 2015.
 [32] Kihyuk Sohn, Honglak Lee, and Xinchen Yan. Learning structured output representation using deep conditional generative models. In NIPS, pages 3483–3491, 2015.
 [33] N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov. Dropout: A simple way to prevent neural networks from overfitting. Journal of Machine Learning Research, 2014.
 [34] Joe Staines and David Barber. Variational optimization. arXiv, 2012.
 [35] Joe Staines and David Barber. Optimization by variational bounding. ESANN, 2013.
 [36] Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, and Andrew Rabinovich. Going deeper with convolutions. CVPR, 2015.
 [37] Michalis Titsias and Miguel Lázaro-Gredilla. Doubly stochastic variational Bayes for non-conjugate inference. In ICML, 2014.
 [38] Andreas Veit and Serge Belongie. Convolutional networks with adaptive computation graphs. arXiv, 2017.
 [39] Ronald J Williams. Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine learning, 1992.
 [40] Zuxuan Wu, Tushar Nagarajan, Abhishek Kumar, Steven Rennie, Larry S Davis, Kristen Grauman, and Rogerio Feris. BlockDrop: Dynamic inference paths in residual networks. arXiv, 2017.
 [41] Kelvin Xu, Jimmy Ba, Ryan Kiros, Kyunghyun Cho, Aaron Courville, Ruslan Salakhutdinov, Richard S Zemel, and Yoshua Bengio. Show, attend and tell: Neural image caption generation with visual attention. ICML, 2015.
Appendix A Algorithms for adaptive computation blocks
We present the algorithms for the discrete adaptive computation block in alg. 1, for the thresholded block in alg. 2, and for the relaxed block in alg. 3. Additionally, the adaptive computation time relaxation for the block is presented in alg. 4. The discrete and thresholded blocks admit a more straightforward implementation than the adaptive computation time mechanism.
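As a rough illustration of the idea behind the thresholded block, the sketch below runs residual units in sequence and halts at the first unit whose (sigmoid) halting probability exceeds a threshold. The function names, the interface (`units` as callables, a separate `halting_fn`), and the threshold value are our own assumptions for illustration, not the exact procedure of alg. 2.

```python
import numpy as np

def thresholded_block(x, units, halting_fn, threshold=0.5):
    """Sketch of a thresholded adaptive computation block (assumed interface).

    Each residual unit adds its output to the running state; after each unit
    we compute a halting probability and stop early once it crosses the
    threshold, skipping the remaining units.
    """
    n_executed = 0
    for unit in units:
        x = x + unit(x)  # residual update: x <- x + F(x)
        n_executed += 1
        p_halt = 1.0 / (1.0 + np.exp(-halting_fn(x)))  # sigmoid of halting logit
        if p_halt > threshold:
            break  # halt here; remaining units are not evaluated
    return x, n_executed

# Toy usage: units double the state; halting logit becomes positive once x > 3.
x, n = thresholded_block(
    np.array([1.0]),
    units=[lambda x: x] * 5,          # each unit returns x, so x doubles
    halting_fn=lambda x: x[0] - 3.0,  # hypothetical halting score
)
```

With these toy units the state doubles each step, so the block halts after the second unit rather than running all five, which is exactly the computational saving the adaptive blocks provide.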
Appendix B Training hyperparameters and additional experimental results
The training hyperparameters are as follows. The batch size is 128 and the weight decay is 0.0002. Training is performed for 100,000 iterations. The weights are initialized with the variance scaling initializer [13]. For all experiments except training with REINFORCE, we use the SGD optimizer with momentum 0.9. The initial learning rate is 0.1, decayed by a factor of 10 after 60,000, 75,000 and 90,000 training iterations. For training the SACT and PSACT models, we use the initialization heuristics from [9] to prevent the dead residual units problem. Namely, we initialize the weights of the model with a pretrained vanilla ResNet, and initialize the biases of the logits of the halting probabilities with a constant . We train the relaxed PSACT models with temperature and clipping threshold . We have explored temperatures in the range and obtained similar results.
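The learning-rate schedule described above can be written as a small piecewise-constant function. This is a minimal sketch of the stated schedule (base rate 0.1, divided by 10 after 60k, 75k and 90k iterations); the function name and signature are ours, not from any particular framework.

```python
def learning_rate(step, base_lr=0.1, boundaries=(60_000, 75_000, 90_000), decay=10.0):
    """Piecewise-constant learning-rate schedule: divide the base rate by
    `decay` once for each boundary that `step` has passed."""
    lr = base_lr
    for b in boundaries:
        if step >= b:
            lr /= decay
    return lr
```

For example, the rate is 0.1 for the first 60,000 iterations and 0.0001 for the final 10,000; the same effect is typically achieved with a framework's built-in step scheduler.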
We demonstrate additional examples of the computation time maps of SACT and PSACT in fig. 7.
An extended version of figure 5 from the main text is shown in fig. 8. We demonstrate that when a model trained with the SACT relaxation is evaluated as a PSACT Thresholded model, the accuracy drops significantly. This indicates that training with SACT does not result in a sharp halting distribution.
The values of in this experiment are as follows. ResNet-32 PSACT: . ResNet-32 SACT: . ResNet-110 PSACT: . ResNet-110 SACT: . Higher values of correspond to fewer FLOPs.