Hypothesis Testing based Intrinsic Evaluation of Word Embeddings

09/04/2017
by Nishant Gurnani, et al.

We introduce the cross-match test, an exact, distribution-free, high-dimensional hypothesis test, as an intrinsic evaluation metric for word embeddings. We show that cross-match is an effective means of measuring the distributional similarity between different vector representations and of evaluating the statistical significance of differences between vector embedding models. Additionally, we find that cross-match can provide a quantitative measure of linguistic similarity for selecting bridge languages for machine translation. We demonstrate that the results of the hypothesis test align with our expectations, and we note that the framework of two-sample hypothesis testing is not limited to word embeddings and can be extended to all vector representations.
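
To make the test concrete, the following is a minimal sketch of a cross-match test in its standard form (Rosenbaum, 2005), not the authors' implementation. It pools two samples, pairs the pooled points by a minimum-distance perfect matching, counts the pairs that join one point from each sample, and computes an exact p-value from the null distribution of that count. The function names, the use of Euclidean distance, and the brute-force matching (practical only for roughly 20 points or fewer) are assumptions made for illustration.

```python
from functools import lru_cache
from math import comb, dist, factorial


def min_weight_perfect_matching(points):
    """Exact minimum-distance pairing via bitmask DP (illustrative; small inputs only)."""
    n = len(points)
    if n % 2:
        raise ValueError("need an even number of pooled points")

    @lru_cache(maxsize=None)
    def best(mask):
        # mask holds the indices that are still unmatched
        if mask == 0:
            return 0.0, ()
        i = (mask & -mask).bit_length() - 1  # lowest unmatched index
        rest = mask & ~(1 << i)
        result = None
        for j in range(i + 1, n):
            if rest >> j & 1:
                cost, pairs = best(rest & ~(1 << j))
                cost += dist(points[i], points[j])  # Euclidean distance (assumption)
                if result is None or cost < result[0]:
                    result = (cost, ((i, j),) + pairs)
        return result

    return list(best((1 << n) - 1)[1])


def crossmatch_pmf(a1, n_x, n_y):
    """Null probability of exactly a1 cross-sample pairs."""
    total = n_x + n_y
    if a1 < 0 or a1 > min(n_x, n_y) or (n_x - a1) % 2:
        return 0.0
    a0 = (n_x - a1) // 2  # pairs with both points from x
    a2 = (n_y - a1) // 2  # pairs with both points from y
    numerator = 2 ** a1 * factorial(total // 2)
    denominator = comb(total, n_x) * factorial(a0) * factorial(a1) * factorial(a2)
    return numerator / denominator


def crossmatch_test(x, y):
    """Return (number of cross-matches, one-sided exact p-value)."""
    pooled = list(x) + list(y)
    pairs = min_weight_perfect_matching(pooled)
    n_x, n_y = len(x), len(y)
    a1 = sum((i < n_x) != (j < n_x) for i, j in pairs)
    # Few cross-matches are evidence that the two samples differ
    p_value = sum(crossmatch_pmf(a, n_x, n_y) for a in range(a1 + 1))
    return a1, p_value


# Toy data: two well-separated clusters standing in for two samples of vectors
x = [(0, 0), (0, 1), (1, 0), (1, 1)]
y = [(10, 10), (10, 11), (11, 10), (11, 11)]
a1, p_value = crossmatch_test(x, y)
print(a1, p_value)
```

Here `x` and `y` stand in for two samples of vectors of equal dimension, such as words drawn from two embedding models. A small p-value suggests the samples come from different distributions. For samples of realistic size, the exhaustive matching would need to be replaced by a polynomial-time matching solver such as a blossom-algorithm implementation.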
