Learning Rare Word Representations using Semantic Bridging

07/24/2017
by Victor Prokhorov, et al.

We propose a methodology that adapts graph embedding techniques (DeepWalk (Perozzi et al., 2014) and node2vec (Grover and Leskovec, 2016)) and cross-lingual vector space mapping approaches (Least Squares and Canonical Correlation Analysis) to merge corpus-based and ontological sources of lexical knowledge. We also compare these algorithms to identify the best combination for the proposed system. We then apply the resulting system to the task of extending the vocabulary of an existing set of word embeddings with rare and unseen words. We show that our technique provides considerable extra coverage (over 99%) and a performance gain (around 10%, see Section 3.3) on the Rare Word Similarity dataset.
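The Least Squares mapping named in the abstract can be illustrated with a minimal sketch. It assumes the direction of the mapping is from a graph (ontology) embedding space into a corpus embedding space, fitted on words present in both; the abstract does not spell this out. The function name `fit_linear_map`, the dimensions, and the random toy data are all hypothetical and are not taken from the paper.

```python
import numpy as np


def fit_linear_map(source, target):
    """Return W minimising ||source @ W - target||_F (ordinary least squares)."""
    W, *_ = np.linalg.lstsq(source, target, rcond=None)
    return W


rng = np.random.default_rng(0)

# Toy stand-ins (assumption): 50 words shared by both spaces,
# 8-dimensional graph embeddings, 5-dimensional corpus embeddings.
graph_shared = rng.normal(size=(50, 8))
true_map = rng.normal(size=(8, 5))
corpus_shared = graph_shared @ true_map  # noise-free, so the map is recoverable

# Fit the mapping on the shared vocabulary.
W = fit_linear_map(graph_shared, corpus_shared)

# A word that has only a graph embedding is projected into the corpus space.
rare_graph_vec = rng.normal(size=(1, 8))
rare_corpus_vec = rare_graph_vec @ W
```

In this toy setting the shared words play the role of the bridge between the two spaces: once `W` is fitted on them, any word with only a graph embedding receives a vector in the corpus space. Real embeddings are noisy, so the fit would be approximate rather than exact.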

research
11/12/2018

Unseen Word Representation by Aligning Heterogeneous Lexical Semantic Spaces

Word embedding techniques heavily rely on the abundance of training data...
research
04/06/2016

An Ensemble Method to Produce High-Quality Word Embeddings

A currently successful approach to computational semantics is to represe...
research
08/28/2018

Card-660: Cambridge Rare Word Dataset - a Reliable Benchmark for Infrequent Word Representation Models

Rare word representation has recently enjoyed a surge of interest, owing...
research
08/30/2021

RetroGAN: A Cyclic Post-Specialization System for Improving Out-of-Knowledge and Rare Word Representations

Retrofitting is a technique used to move word vectors closer together or...
research
01/27/2018

Improving Word Vector with Prior Knowledge in Semantic Dictionary

Using low dimensional vector space to represent words has been very effe...
research
06/10/2019

Embedding Imputation with Grounded Language Information

Due to the ubiquitous use of embeddings as input representations for a w...
research
05/06/2020

Moving Down the Long Tail of Word Sense Disambiguation with Gloss-Informed Biencoders

A major obstacle in Word Sense Disambiguation (WSD) is that word senses ...
