RPD: A Distance Function Between Word Embeddings

05/16/2020
by   Xuhui Zhou, et al.
0

It is well-understood that different algorithms, training processes, and corpora produce different word embeddings. However, less is known about the relation between different embedding spaces, i.e. how far different sets of embeddings deviate from each other. In this paper, we propose a novel metric called Relative pairwise inner Product Distance (RPD) to quantify the distance between different sets of word embeddings. This metric has a unified scale for comparing different sets of word embeddings. Based on the properties of RPD, we study the relations of word embeddings of different algorithms systematically and investigate the influence of different training processes and corpora. The results shed light on the poorly understood word embeddings and justify RPD as a measure of the distance of embedding spaces.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
06/13/2021

Shape of Elephant: Study of Macro Properties of Word Embeddings Spaces

Pre-trained word representations became a key component in many NLP task...
research
11/05/2019

Incremental Sense Weight Training for the Interpretation of Contextualized Word Embeddings

We present a novel online algorithm that learns the essence of each dime...
research
04/06/2019

Simple dynamic word embeddings for mapping perceptions in the public sphere

Word embeddings trained on large-scale historical corpora can illuminate...
research
02/17/2022

Word Embeddings for Automatic Equalization in Audio Mixing

In recent years, machine learning has been widely adopted to automate th...
research
09/18/2015

Word, graph and manifold embedding from Markov processes

Continuous vector representations of words and objects appear to carry s...
research
11/05/2019

embComp: Visual Interactive Comparison of Vector Embeddings

This work introduces embComp, a novel approach for comparing two embeddi...
research
03/03/2016

MGNC-CNN: A Simple Approach to Exploiting Multiple Word Embeddings for Sentence Classification

We introduce a novel, simple convolution neural network (CNN) architectu...

Please sign up or login with your details

Forgot password? Click here to reset