Exponential Family Embeddings

08/02/2016
by Maja R. Rudolph, et al.

Word embeddings are a powerful approach for capturing semantic similarity among terms in a vocabulary. In this paper, we develop exponential family embeddings, a class of methods that extends the idea of word embeddings to other types of high-dimensional data. As examples, we study neural data with real-valued observations, count data from a market basket analysis, and ratings data from a movie recommendation system. The main idea is to model each observation conditioned on a set of other observations. This set is called the context, and how the context is defined is a modeling choice that depends on the problem. In language, the context is the surrounding words; in neuroscience, the context is nearby neurons; in market basket data, the context is the other items in the shopping cart. Each type of embedding model defines the context, the exponential family of conditional distributions, and how the latent embedding vectors are shared across data. We infer the embeddings with a scalable algorithm based on stochastic gradient descent. On all three applications (neural activity of zebrafish, users' shopping behavior, and movie ratings), we find exponential family embedding models to be more effective than other types of dimension reduction. They better reconstruct held-out data and find interesting qualitative structure.
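The recipe in the abstract (pick a context, pick an exponential family for each conditional, fit by stochastic gradients) can be illustrated with a small sketch for count data. This is not the authors' implementation. It assumes a Poisson conditional with a log link, a context made of all other items in the same basket, and a natural parameter equal to the inner product of an item's embedding vector with the count-weighted sum of the context vectors. None of these specifics are stated in the abstract, and the names `fit`, `context_sum`, `pseudo_log_lik`, `rho`, and `alpha` are hypothetical.

```python
import numpy as np

def context_sum(x, alpha, i):
    """Count-weighted sum of context vectors over every item except i."""
    return x @ alpha - x[i] * alpha[i]

def pseudo_log_lik(X, rho, alpha):
    """Sum of Poisson conditional log-likelihoods (log(x!) constant dropped)."""
    X = np.asarray(X, dtype=float)
    total = 0.0
    for x in X:
        for i in range(X.shape[1]):
            eta = np.clip(rho[i] @ context_sum(x, alpha, i), -10, 10)
            total += x[i] * eta - np.exp(eta)
    return total

def fit(X, dim=2, lr=0.005, epochs=100, reg=0.01, seed=0):
    """Stochastic gradient ascent on the regularized conditional likelihoods.

    X is an (n_baskets, n_items) count matrix. Returns (rho, alpha):
    one embedding vector and one context vector per item.
    """
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    n, m = X.shape
    rho = 0.1 * rng.standard_normal((m, dim))    # embedding vectors
    alpha = 0.1 * rng.standard_normal((m, dim))  # context vectors
    for _ in range(epochs):
        for b in rng.permutation(n):             # one basket at a time
            x = X[b]
            for i in range(m):
                ctx = context_sum(x, alpha, i)
                # Assumed log link; clipping is only a numerical safeguard.
                eta = np.clip(rho[i] @ ctx, -10, 10)
                err = x[i] - np.exp(eta)         # d log p / d eta for Poisson
                g_rho = err * ctx
                g_alpha = err * np.outer(x, rho[i])
                g_alpha[i] = 0.0                 # item i is not in its own context
                rho[i] += lr * (g_rho - reg * rho[i])
                alpha += lr * (g_alpha - reg * alpha)
    return rho, alpha

# Toy data: two groups of items that never share a basket.
X = np.array([[2, 2, 2, 0, 0, 0], [0, 0, 0, 2, 2, 2]] * 10, dtype=float)
rho, alpha = fit(X)
```

Swapping the Poisson terms (`x[i] * eta - np.exp(eta)` and `err`) for those of a Gaussian or Bernoulli conditional would give a sketch for real-valued or binary data under the same assumptions.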

research
09/28/2017

Structured Embedding Models for Grouped Data

Word embeddings are a powerful approach for analyzing language, and expo...
research
03/23/2017

Dynamic Bernoulli Embeddings for Language Evolution

Word embeddings are a powerful approach for unsupervised analysis of lan...
research
10/25/2018

Provable Gaussian Embedding with One Observation

The success of machine learning methods heavily relies on having an appr...
research
12/04/2019

Natural Alpha Embeddings

Learning an embedding for a large collection of items is a popular appro...
research
01/08/2019

Fusion Strategies for Learning User Embeddings with Neural Networks

Growing amounts of online user data motivate the need for automated proc...
research
07/17/2019

Analysis of Word Embeddings using Fuzzy Clustering

In data dominated systems and applications, a concept of representing wo...
