Word Rotator's Distance: Decomposing Vectors Gives Better Representations

04/30/2020
by Sho Yokoi, et al.

A key principle in assessing the semantic similarity between texts is to measure the degree of semantic overlap between them by considering word-by-word alignment. However, alignment-based approaches perform worse than generic sentence vectors. We hypothesize that this is because alignment-based methods do not distinguish word importance from word meaning. To address this, we propose separating word importance and word meaning by decomposing word vectors into their norm and direction, and then computing the alignment-based similarity with the help of earth mover's distance. We call the method word rotator's distance (WRD) because direction vectors are aligned by rotation on the unit hypersphere. In addition, to incorporate advances in cutting-edge additive sentence encoders, we propose re-decomposing such sentence vectors into word vectors and using them as inputs to WRD. Empirically, the proposed method outperforms current methods that consider word-by-word alignment, including word mover's distance, by a large margin; moreover, it outperforms state-of-the-art additive sentence encoders on the most competitive dataset, STS-benchmark.
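The norm-and-direction decomposition can be illustrated with a minimal sketch. It rests on assumptions that the abstract does not spell out: each word's transport mass is taken to be its vector norm (normalized to sum to 1 within a sentence), the transport cost between two words is taken to be the cosine distance between their direction vectors, and the earth mover's distance is solved as a small linear program with SciPy's `linprog`. The function name `word_rotators_distance` is hypothetical, and the sketch assumes that no word vector has zero norm.

```python
import numpy as np
from scipy.optimize import linprog


def word_rotators_distance(X, Y):
    """Sketch of a WRD-style distance between two sentences.

    X, Y: arrays of shape (n_words, dim), one word vector per row.
    Assumption: mass = normalized norm, cost = 1 - cosine similarity.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)

    # Decompose each word vector into norm (importance) and direction (meaning).
    norms_x = np.linalg.norm(X, axis=1)
    norms_y = np.linalg.norm(Y, axis=1)
    dirs_x = X / norms_x[:, None]
    dirs_y = Y / norms_y[:, None]

    # Probability mass of each word, proportional to its norm.
    a = norms_x / norms_x.sum()
    b = norms_y / norms_y.sum()

    # Cosine distance between every pair of direction vectors.
    C = 1.0 - dirs_x @ dirs_y.T
    n, m = C.shape

    # Transport plan T (flattened row-major): rows sum to a, columns sum to b.
    A_eq = np.zeros((n + m, n * m))
    for i in range(n):
        A_eq[i, i * m:(i + 1) * m] = 1.0
    for j in range(m):
        A_eq[n + j, j::m] = 1.0
    b_eq = np.concatenate([a, b])

    # Minimize total transport cost subject to T >= 0 (linprog's default bounds).
    result = linprog(C.ravel(), A_eq=A_eq, b_eq=b_eq)
    return float(result.fun)
```

Under these assumptions, two sentences whose word vectors point in the same directions with the same relative norms get a distance near 0, while a long (important) word with no counterpart in the other sentence contributes more to the distance than a short one.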


