Hash Embeddings for Efficient Word Representations

09/12/2017
by Dan Svenstrup, et al.

We present hash embeddings, an efficient method for representing words in a continuous vector form. A hash embedding may be seen as an interpolation between a standard word embedding and a word embedding created using a random hash function (the hashing trick). In hash embeddings each token is represented by k d-dimensional embedding vectors and one k-dimensional weight vector. The final d-dimensional representation of the token is the product of the two. Rather than fitting a separate embedding vector for each token, the vectors are selected by the hashing trick from a shared pool of B embedding vectors. Our experiments show that hash embeddings can easily deal with huge vocabularies consisting of millions of tokens. When using a hash embedding there is no need to create a dictionary before training, nor to perform any kind of vocabulary pruning after training. We show that models trained using hash embeddings exhibit at least the same level of performance as models trained using regular embeddings across a wide range of tasks. Furthermore, the number of parameters needed by such an embedding is only a fraction of what is required by a regular embedding. Since standard embeddings and embeddings constructed using the hashing trick are special cases of a hash embedding, hash embeddings can be considered an extension of and improvement over these existing embedding types.
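The mechanism described in the abstract can be sketched in a few lines: k hash functions map each token id into a shared pool of B component vectors, and a small trainable weight table mixes the k selected components into the final d-dimensional vector. The class and parameter names below are illustrative assumptions, not the authors' reference implementation.

```python
import numpy as np

class HashEmbedding:
    """Minimal sketch of a hash embedding lookup (illustrative, not the
    paper's exact implementation).

    Each token id is hashed by k independent hash functions into a shared
    pool of B component vectors; a weight table of k-dimensional weight
    vectors (here indexed by token id modulo K, an assumed bucketing
    scheme) mixes the k components into one d-dimensional embedding.
    """

    def __init__(self, K=10_000, B=1_000, k=2, d=20, seed=0):
        rng = np.random.default_rng(seed)
        self.K, self.B, self.k = K, B, k
        # shared pool of B trainable component vectors (B << vocabulary size)
        self.pool = rng.normal(scale=0.1, size=(B, d))
        # one k-dimensional importance-weight vector per weight bucket
        self.weights = rng.normal(scale=0.1, size=(K, k))
        # per-hash-function salts emulate k independent hash functions
        self.salts = rng.integers(0, 2**31, size=k)

    def _hash(self, token_id, j):
        # simple salted multiplicative hash; a stand-in for any uniform hash
        return (token_id * 2654435761 + int(self.salts[j])) % self.B

    def embed(self, token_id):
        w = self.weights[token_id % self.K]          # k importance weights
        comps = np.stack([self.pool[self._hash(token_id, j)]
                          for j in range(self.k)])   # k pooled vectors, (k, d)
        return w @ comps                             # weighted sum -> (d,)

emb = HashEmbedding()
v = emb.embed(123456789)   # any token id works; no dictionary is needed
```

Note the parameter count: B*d pool entries plus K*k weights, versus V*d for a regular embedding with vocabulary size V; since B and K can be far smaller than V, this is where the memory savings claimed in the abstract come from.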


Related research

- Multi hash embeddings in spaCy (12/19/2022)
- Sketching Word Vectors Through Hashing (05/11/2017)
- Learning Numeral Embeddings (12/28/2019)
- Learning to Embed Words in Context for Syntactic Tasks (06/09/2017)
- Pb-Hash: Partitioned b-bit Hashing (06/28/2023)
- Local Density Estimation in High Dimensions (09/20/2018)
- Semantically Constrained Memory Allocation (SCMA) for Embedding in Efficient Recommendation Systems (02/24/2021)
