Moving Other Way: Exploring Word Mover Distance Extensions

02/07/2022
by Ilya Smirnov, et al.

The word mover's distance (WMD) is a popular semantic similarity metric for two texts. This position paper studies several possible extensions of WMD. We experiment with two factors: the frequency of words in the corpus, used as a weighting factor, and the geometry of the word vector space. We validate the possible extensions on six document classification datasets. Some of the proposed extensions achieve a lower k-nearest neighbor classification error than WMD.
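For readers unfamiliar with the baseline, the following is a minimal sketch of standard WMD posed as an optimal transport problem between normalized bag-of-words histograms, with Euclidean distance between word vectors as the ground cost. The function name `wmd`, the `word_weight` hook, and the toy two-dimensional vectors are assumptions made for illustration; in particular, `word_weight` only shows where a corpus-frequency weighting could enter and is not the specific scheme evaluated in the paper.

```python
import numpy as np
from scipy.optimize import linprog


def wmd(doc_a, doc_b, vectors, word_weight=None):
    """Word mover's distance between two tokenized documents.

    vectors: dict mapping word -> 1-D numpy array (toy embeddings here).
    word_weight: optional dict mapping word -> positive multiplier applied
        to term counts before normalization. This is a hypothetical hook,
        not the weighting used in the paper.
    """
    def histogram(doc):
        words = sorted(set(doc))
        counts = np.array([doc.count(w) for w in words], dtype=float)
        if word_weight is not None:
            counts *= np.array([word_weight.get(w, 1.0) for w in words])
        return words, counts / counts.sum()

    words_a, p = histogram(doc_a)
    words_b, q = histogram(doc_b)
    xa = np.array([vectors[w] for w in words_a])
    xb = np.array([vectors[w] for w in words_b])

    # Ground cost: Euclidean distance between every pair of word vectors.
    cost = np.linalg.norm(xa[:, None, :] - xb[None, :, :], axis=2)

    n, m = cost.shape
    # The flow matrix T (n x m) is flattened row-major.
    # Its row sums must equal p and its column sums must equal q.
    a_eq = np.zeros((n + m, n * m))
    for i in range(n):
        a_eq[i, i * m:(i + 1) * m] = 1.0
    for j in range(m):
        a_eq[n + j, j::m] = 1.0
    b_eq = np.concatenate([p, q])

    res = linprog(cost.ravel(), A_eq=a_eq, b_eq=b_eq,
                  bounds=(0, None), method="highs")
    return res.fun


# Toy 2-D "embeddings" (hypothetical values, chosen only for illustration).
toy_vectors = {
    "a": np.array([0.0, 0.0]),
    "b": np.array([1.0, 0.0]),
    "c": np.array([0.0, 1.0]),
    "d": np.array([1.0, 1.0]),
}
print(wmd(["a", "b"], ["c", "d"], toy_vectors))
print(wmd(["a", "b"], ["c", "d"], toy_vectors, word_weight={"a": 3.0}))
```

The linear program is written out explicitly to keep the transport constraints visible. For real corpora a dedicated optimal transport solver would be a more practical choice, and the resulting pairwise distances could then feed a k-nearest neighbor classifier as in the evaluation described above.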

