Gaussian Word Embedding with a Wasserstein Distance Loss

08/21/2018
by Chi Sun, et al.

Compared with point-based word embeddings, distribution-based word embeddings offer greater flexibility in expressing uncertainty and can therefore encode richer semantic information when representing words. The Wasserstein distance provides a natural notion of dissimilarity between probability measures and admits a closed-form solution when measuring the distance between two Gaussian distributions. With the aim of representing words efficiently, we therefore propose to train a Gaussian word embedding model with a loss function based on the Wasserstein distance. In addition, external information drawn from ConceptNet is used to semi-supervise the Gaussian word embeddings. To test our hypothesis, we evaluate on thirteen word similarity datasets, one word entailment dataset, and six downstream document classification datasets.
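To illustrate the closed-form solution mentioned in the abstract, the sketch below computes the squared 2-Wasserstein distance between two Gaussian word representations. It is not taken from the paper: it assumes diagonal covariance matrices (a common simplification in Gaussian embedding models), and the function name and toy vectors are hypothetical.

```python
import numpy as np

def wasserstein2_gaussian(mu1, sigma1, mu2, sigma2):
    """Squared 2-Wasserstein distance between two diagonal Gaussians.

    For N(mu1, diag(sigma1^2)) and N(mu2, diag(sigma2^2)), the general
    closed form
        W2^2 = ||mu1 - mu2||^2 + tr(S1 + S2 - 2 (S2^{1/2} S1 S2^{1/2})^{1/2})
    reduces to ||mu1 - mu2||^2 + ||sigma1 - sigma2||^2.
    """
    return np.sum((mu1 - mu2) ** 2) + np.sum((sigma1 - sigma2) ** 2)

# Example: two hypothetical 4-dimensional Gaussian word representations.
mu_a, sig_a = np.array([0.1, 0.3, -0.2, 0.5]), np.array([0.9, 1.1, 1.0, 0.8])
mu_b, sig_b = np.array([0.0, 0.4, -0.1, 0.6]), np.array([1.0, 1.0, 1.2, 0.7])
print(wasserstein2_gaussian(mu_a, sig_a, mu_b, sig_b))
```

Because this expression is differentiable in the means and standard deviations, it can serve directly as a training loss, which is what makes the Gaussian case attractive compared with Wasserstein distances between arbitrary distributions.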
