Interpretable probabilistic embeddings: bridging the gap between topic models and neural networks

11/11/2017
by Anna Potapenko, et al.

We consider probabilistic topic models and more recent word embedding techniques from the perspective of learning hidden semantic representations. Inspired by the striking similarity between the two approaches, we merge them and learn probabilistic embeddings with an online EM algorithm on word co-occurrence data. The resulting embeddings perform on par with Skip-Gram Negative Sampling (SGNS) on word similarity tasks, with the added benefit of interpretable components. Next, we learn probabilistic document embeddings that outperform paragraph2vec on a document similarity task while requiring less memory and time for training. Finally, we employ multimodal Additive Regularization of Topic Models (ARTM) to obtain high sparsity and to learn embeddings for other modalities, such as timestamps and categories. We observe a further improvement in word similarity performance and meaningful inter-modality similarities.
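The core recipe described above, learning probabilistic word embeddings by running EM over a word co-occurrence matrix, can be pictured with a short sketch. The snippet below is not the authors' implementation: it uses plain batch EM rather than the online EM of the paper, and the function name, variable names (n_wc, phi, theta), and parameter values are illustrative assumptions. It factorizes p(c | w) = sum_t p(c | t) p(t | w) over PLSA-style topics and returns the rows p(t | w) as interpretable probabilistic embeddings.

```python
# Minimal sketch (not the authors' code): PLSA-style EM on a word co-occurrence
# matrix, yielding probabilistic word embeddings p(topic | word).
# Assumption: batch EM for clarity, whereas the paper uses an online EM variant.
import numpy as np

def em_probabilistic_embeddings(n_wc, n_topics=50, n_iters=30, seed=0):
    """Factorize co-occurrence counts n_wc[w, c] as
    p(c | w) = sum_t p(c | t) * p(t | w) and return p(t | w) as embeddings."""
    rng = np.random.default_rng(seed)
    n_words, n_contexts = n_wc.shape

    # phi[t, c]   ~ p(c | t): topic-to-context distributions
    # theta[w, t] ~ p(t | w): word-to-topic distributions (the embeddings)
    phi = rng.random((n_topics, n_contexts))
    phi /= phi.sum(axis=1, keepdims=True)
    theta = rng.random((n_words, n_topics))
    theta /= theta.sum(axis=1, keepdims=True)

    for _ in range(n_iters):
        # E-step (folded into the M-step counters, the standard PLSA trick):
        # p(t | w, c) = phi[t, c] * theta[w, t] / sum_t' phi[t', c] * theta[w, t']
        denom = theta @ phi               # [w, c] model estimate of p(c | w)
        denom[denom == 0] = 1e-12
        ratio = n_wc / denom              # n(w, c) scaled by the E-step denominator

        # M-step: expected counts, then renormalize into distributions
        new_phi = phi * (theta.T @ ratio)     # [t, c] expected n(t, c)
        new_theta = theta * (ratio @ phi.T)   # [w, t] expected n(w, t)
        phi = new_phi / new_phi.sum(axis=1, keepdims=True)
        theta = new_theta / new_theta.sum(axis=1, keepdims=True)

    return theta  # rows are interpretable probabilistic embeddings p(t | w)
```

Word similarity could then be scored between rows of theta, for instance with cosine or Hellinger distance; the sparse and multimodal variants in the paper are obtained by adding ARTM regularizers and extra modalities on top of this kind of factorization.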

Related research

Multi Sense Embeddings from Topic Models (09/17/2019): Distributed word embeddings have yielded state-of-the-art performance in...

Neural Embedding Allocation: Distributed Representations of Topic Models (09/10/2019): Word embedding models such as the skip-gram learn vector representations...

Multimodal Skip-gram Using Convolutional Pseudowords (11/12/2015): This work studies the representational mapping across multimodal data...

Category Enhanced Word Embedding (11/27/2015): Distributed word representations have been demonstrated to be effective...

Streaming Word Embeddings with the Space-Saving Algorithm (04/24/2017): We develop a streaming (one-pass, bounded-memory) word embedding algorithm...

Spherical Text Embedding (11/04/2019): Unsupervised text embedding has shown great power in a wide range of NLP...

Learning and Evaluating Sparse Interpretable Sentence Embeddings (09/23/2018): Previous research on word embeddings has shown that sparse representations...
