Absolute Orientation for Word Embedding Alignment

06/04/2018
by Sunipa Dev, et al.

We propose a new technique for aligning word embeddings that are derived from different source datasets or created using different mechanisms (e.g., GloVe or word2vec). We design a simple, closed-form solution that finds the optimal rotation, and optionally scaling, which minimizes the root mean squared error or maximizes the average cosine similarity between two embeddings of the same vocabulary in the same dimensional space. Our methods extend approaches known as Absolute Orientation, which are popular for aligning objects in three dimensions. We extend them to arbitrary dimensions and show that a simple scaling solution can be derived independently of the rotation, and that it also optimizes cosine similarity. We then demonstrate how to evaluate the similarity of embeddings from different sources or mechanisms, and show that certain properties, such as synonyms and analogies, are preserved across the embeddings and can be enhanced by simply aligning and averaging ensembles of embeddings.
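
To make the closed-form alignment concrete, here is a minimal NumPy sketch of an SVD-based rotation fit in arbitrary dimensions, followed by an optional least-squares scale. It is an illustration under stated assumptions, not the authors' reference implementation: the function name `absolute_orientation`, the row-vector convention (each row of `A` and `B` is the embedding of the same word), the determinant correction that forces a proper rotation, and the omission of any centering step are all choices made for this example.

```python
import numpy as np


def absolute_orientation(A, B, scale=False):
    """Fit R (and optionally s) so that s * A @ R approximates B.

    A, B: (n_words, dim) arrays; row i of each is the same word.
    Returns (R, s), where R is a dim x dim rotation and s is a float
    (s == 1.0 when scale is False). Hypothetical helper for illustration.
    """
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    if A.shape != B.shape:
        raise ValueError("A and B must have the same shape")

    # Cross-covariance between the two embeddings (dim x dim).
    H = A.T @ B
    U, _, Vt = np.linalg.svd(H)

    # The orthogonal matrix maximizing trace(R^T H) is U @ Vt.
    R = U @ Vt

    # Assumption: we want a proper rotation (det = +1), so if the
    # optimum is a reflection, flip the direction with the smallest
    # singular value.
    if np.linalg.det(R) < 0:
        U[:, -1] *= -1
        R = U @ Vt

    s = 1.0
    if scale:
        # Least-squares scale for the rotated A against B.
        s = float(np.sum((A @ R) * B) / np.sum(A * A))
    return R, s


def rmse(X, Y):
    """Root mean squared error over rows (words)."""
    return float(np.sqrt(np.mean(np.sum((X - Y) ** 2, axis=1))))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((100, 8))

    # Build a synthetic "second embedding": rotate and scale A.
    Q, _ = np.linalg.qr(rng.standard_normal((8, 8)))
    if np.linalg.det(Q) < 0:
        Q[:, 0] *= -1
    B = 1.7 * A @ Q

    R, s = absolute_orientation(A, B, scale=True)
    print("RMSE before alignment:", rmse(A, B))
    print("RMSE after alignment: ", rmse(s * A @ R, B))
```

On real embeddings the two inputs would first be restricted to a shared vocabulary and put in the same row order; after alignment, `s * A @ R` and `B` can be compared or averaged row by row.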


research
09/16/2018

Semi-Supervised Multi-Task Word Embeddings

Word embeddings have been shown to benefit from ensembling several word ...
research
09/05/2020

Bio-inspired Structure Identification in Language Embeddings

Word embeddings are a popular way to improve downstream performances in ...
research
10/07/2019

Correlations between Word Vector Sets

Similarity measures based purely on word embeddings are comfortably comp...
research
04/20/2020

Learning Geometric Word Meta-Embeddings

We propose a geometric framework for learning meta-embeddings of words f...
research
04/18/2019

Analytical Methods for Interpretable Ultradense Word Embeddings

Word embeddings are useful for a wide variety of tasks, but they lack in...
research
04/14/2018

Frustratingly Easy Meta-Embedding -- Computing Meta-Embeddings by Averaging Source Word Embeddings

Creating accurate meta-embeddings from pre-trained source embeddings has...
research
08/16/2021

IsoScore: Measuring the Uniformity of Vector Space Utilization

The recent success of distributed word representations has led to an inc...
