Gaussian Mixture Embeddings for Multiple Word Prototypes

11/19/2015
by Xinchi Chen et al.

Recently, word representations have attracted increasing attention for their ability to capture word semantics. Previous work suffers mainly from the problem of polysemy. To address this problem, most prior models represent each word as multiple distributed vectors. However, representing words as points in the embedding space cannot capture the rich relations between words. In this paper, we propose the Gaussian mixture skip-gram (GMSG) model, which learns Gaussian mixture embeddings for words within the skip-gram framework. Each word is represented as a Gaussian mixture distribution in the embedding space, and each Gaussian component corresponds to a word sense. Since the number of senses varies from word to word, we further propose the Dynamic GMSG (D-GMSG) model, which adaptively increases the number of senses per word during training. Experiments on four benchmarks show the effectiveness of the proposed models.
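
To make the idea concrete, the sketch below (ours, not the authors' released code) represents a word as a mixture of spherical Gaussians, one component per sense, and scores two words with an expected-likelihood-style kernel between their mixtures; D-GMSG's sense growth is mimicked by appending a fresh component. The class name GaussianMixtureEmbedding, the spherical covariances, the kernel choice, and all hyperparameters are illustrative assumptions, not details taken from the paper.

```python
import numpy as np


class GaussianMixtureEmbedding:
    """One word as a mixture of spherical Gaussian components (one per sense).

    Hypothetical container: component means, per-component log variance,
    and mixing weights. Not the paper's training code.
    """

    def __init__(self, n_senses, dim, seed=0):
        rng = np.random.default_rng(seed)
        self.means = rng.normal(scale=0.1, size=(n_senses, dim))  # sense vectors
        self.log_vars = np.zeros(n_senses)                        # log sigma^2 per sense
        self.weights = np.full(n_senses, 1.0 / n_senses)          # mixing proportions

    def add_sense(self, init_scale=0.1, seed=1):
        """D-GMSG-style step (sketch): append a fresh component for a new sense."""
        rng = np.random.default_rng(seed)
        new_mean = rng.normal(scale=init_scale, size=(1, self.means.shape[1]))
        self.means = np.vstack([self.means, new_mean])
        self.log_vars = np.append(self.log_vars, 0.0)
        self.weights = np.append(self.weights, self.weights.min())
        self.weights /= self.weights.sum()  # renormalise mixing proportions


def gaussian_inner_product(mu1, var1, mu2, var2):
    """Closed-form inner product of two spherical Gaussians:
    <N(mu1, var1 I), N(mu2, var2 I)> = N(mu1 - mu2; 0, (var1 + var2) I)."""
    dim = mu1.shape[0]
    var = var1 + var2
    diff = mu1 - mu2
    log_val = -0.5 * (dim * np.log(2.0 * np.pi * var) + diff @ diff / var)
    return np.exp(log_val)


def mixture_similarity(w1, w2):
    """Expected-likelihood-style kernel between two Gaussian mixtures:
    a weight-weighted sum of pairwise component inner products."""
    sim = 0.0
    for p_i, mu_i, lv_i in zip(w1.weights, w1.means, w1.log_vars):
        for p_j, mu_j, lv_j in zip(w2.weights, w2.means, w2.log_vars):
            sim += p_i * p_j * gaussian_inner_product(
                mu_i, np.exp(lv_i), mu_j, np.exp(lv_j)
            )
    return sim


# Toy usage: "bank" with two senses, "river" with one, in a 50-dimensional space.
bank = GaussianMixtureEmbedding(n_senses=2, dim=50, seed=0)
river = GaussianMixtureEmbedding(n_senses=1, dim=50, seed=1)
print(mixture_similarity(bank, river))
bank.add_sense()  # the dynamic variant would grow the mixture roughly like this
```

The kernel here is one common way to compare density-based embeddings; the paper trains its mixtures within the skip-gram objective, so this similarity is only a stand-in for illustration.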

Related research

Probabilistic FastText for Multi-Sense Word Embeddings (06/07/2018)
Learning Multi-Sense Word Distributions using Approximate Kullback-Leibler Divergence (11/12/2019)
GM-CTSC at SemEval-2020 Task 1: Gaussian Mixtures Cross Temporal Similarity Clustering (05/20/2020)
Distributed representation of multi-sense words: A loss-driven approach (04/14/2019)
Ordering-sensitive and Semantic-aware Topic Modeling (02/12/2015)
Embedding Words as Distributions with a Bayesian Skip-gram Model (11/29/2017)
Calculated attributes of synonym sets (03/05/2018)
