t-Exponential Memory Networks for Question-Answering Machines

09/04/2018
by Kyriakos Tolias, et al.

Recent advances in deep learning have brought to the fore models that can make multiple computational steps in the service of completing a task; these are capable of describing long-term dependencies in sequential data. Novel recurrent attention models over possibly large external memory modules constitute the core mechanisms that enable these capabilities. Our work addresses learning subtler and more complex underlying temporal dynamics in language modeling tasks that deal with sparse sequential data. To this end, we improve upon these recent advances by adopting concepts from Bayesian statistics, namely variational inference. Our approach treats the network parameters as latent variables with a prior distribution imposed over them. Our statistical assumptions go beyond the standard practice of postulating Gaussian priors: to handle outliers, which are prevalent in long observed sequences of multivariate data, we impose multivariate t-exponential distributions. On this basis, we proceed to infer the corresponding posteriors; these can be used for inference and prediction at test time, in a way that accounts for the uncertainty in the available sparse training data. Specifically, to allow our approach to best exploit the merits of the t-exponential family, our method considers a new t-divergence measure, which generalizes the concept of the Kullback-Leibler divergence. We perform an extensive experimental evaluation of our approach on challenging language modeling benchmarks and illustrate its superiority over existing state-of-the-art techniques.
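For context, the t-divergence referred to above is conventionally built from the deformed exponential and logarithm of the t-exponential family, as standardly defined in that literature (due to Ding and Vishwanathan); the paper's exact formulation may differ in detail. For \( t \neq 1 \):

\[
\exp_t(x) = \big[\, 1 + (1-t)\,x \,\big]_+^{\frac{1}{1-t}}, \qquad
\log_t(x) = \frac{x^{1-t} - 1}{1-t},
\]

both of which recover the ordinary exponential and logarithm as \( t \to 1 \). The t-divergence then replaces the Kullback-Leibler expectation with one taken under the escort distribution of the first argument:

\[
D_t(p \,\|\, q) = \int \tilde{p}(x)\, \big( \log_t p(x) - \log_t q(x) \big)\, dx,
\qquad
\tilde{p}(x) = \frac{p(x)^t}{\int p(x')^t \, dx'} .
\]

As a companion illustration, here is a minimal sketch of the "network parameters as latent variables" idea, assuming a Student-t variational posterior over a linear layer's weights with reparameterized sampling; the class name, dimensions, and degrees-of-freedom default are illustrative assumptions, not taken from the paper.

import torch
from torch import nn

class StudentTLinear(nn.Module):
    """Linear layer whose weight matrix is a latent variable with a
    heavy-tailed Student-t variational posterior (hypothetical sketch,
    not the paper's exact parameterization)."""

    def __init__(self, d_in: int, d_out: int, df: float = 3.0):
        super().__init__()
        self.mu = nn.Parameter(torch.zeros(d_out, d_in))                # posterior location
        self.log_scale = nn.Parameter(torch.full((d_out, d_in), -3.0))  # posterior log-scale
        self.df = df  # degrees of freedom: smaller df gives heavier tails, hence outlier tolerance

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Reparameterized draw w ~ StudentT(df, mu, scale), so gradients
        # flow back into mu and log_scale during training.
        q_w = torch.distributions.StudentT(self.df, self.mu, self.log_scale.exp())
        w = q_w.rsample()
        return x @ w.t()

In an ELBO-style training objective, the usual Kullback-Leibler regularizer between such a posterior and the prior would be replaced by the t-divergence above, which is the role the abstract assigns to its new divergence measure.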


Related research

research · 06/13/2017
Recurrent Latent Variable Networks for Session-Based Recommendation
In this work, we attempt to ameliorate the impact of data sparsity in th...

research · 02/10/2018
Deep learning with t-exponential Bayesian kitchen sinks
Bayesian learning has been recently considered as an effective means of ...

research · 10/26/2017
Rotational Unit of Memory
The concepts of unitary evolution matrices and associative memory have b...

research · 05/19/2023
Extending Memory for Language Modelling
Breakthroughs in deep learning and memory networks have made major advan...

research · 03/31/2015
End-To-End Memory Networks
We introduce a neural network with a recurrent attention model over a po...

research · 05/23/2018
Amortized Context Vector Inference for Sequence-to-Sequence Networks
Neural attention (NA) is an effective mechanism for inferring complex st...

research · 09/17/2018
Quantum Statistics-Inspired Neural Attention
Sequence-to-sequence (encoder-decoder) models with attention constitute ...
