Hash2Vec, Feature Hashing for Word Embeddings

08/31/2016
by Luis Argerich, et al.

In this paper we propose applying feature hashing to create word embeddings for natural language processing. Feature hashing has been used successfully to create document vectors in related tasks such as document classification. In this work we show that feature hashing can also produce word embeddings in time linear in the size of the data. The results show that this algorithm, which requires no training, is able to capture the semantic meaning of words. We compare the results against GloVe and find them similar. As far as we know, this is the first application of feature hashing to the word embedding problem, and the results indicate that it is a scalable technique with practical value for NLP applications.
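
To make the idea concrete, here is a minimal sketch of how feature hashing could yield word vectors in a single pass with no training. It is not the authors' exact algorithm, which the abstract does not specify: the function names (`hashed_embeddings`, `cosine`), the MD5-based hash, the symmetric context window of 2, the unweighted counts, and the 16-dimensional vectors are all assumptions chosen for illustration.

```python
import hashlib
import math
from collections import defaultdict


def _hash(token: str, seed: str) -> int:
    """Deterministic 64-bit hash of a token (MD5 is an illustrative choice)."""
    digest = hashlib.md5((seed + token).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "little")


def hashed_embeddings(tokens, dim=16, window=2):
    """Build one fixed-size vector per word by hashing its context words.

    Each context word is mapped to a bucket (index hash) and a sign
    (sign hash), so no vocabulary or training step is required and the
    work is linear in the number of tokens for a fixed window.
    """
    vectors = defaultdict(lambda: [0.0] * dim)
    for i, word in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j == i:
                continue
            context = tokens[j]
            index = _hash(context, "index") % dim
            # The sign hash lets colliding features partly cancel
            # instead of always adding up.
            sign = 1.0 if _hash(context, "sign") % 2 == 0 else -1.0
            vectors[word][index] += sign
    return dict(vectors)


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Toy corpus (hypothetical): "cat" and "dog" share most of their contexts.
corpus = "the cat sat on the mat the dog sat on the rug".split()
emb = hashed_embeddings(corpus)
print(cosine(emb["cat"], emb["dog"]))
```

Words that occur in similar contexts accumulate similar hashed counts, so their vectors point in similar directions. On a corpus this small and with only 16 buckets, collisions are frequent, so the similarity values are only indicative.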

09/30/2020

Development of Word Embeddings for Uzbek Language

In this paper, we share the process of developing word embeddings for th...
02/15/2019

Contextual Word Representations: A Contextual Introduction

This introduction aims to tell the story of how we put words into comput...
10/16/2018

Subword Semantic Hashing for Intent Classification on Small Datasets

In this paper, we introduce the use of Semantic Hashing as embedding for...
07/23/2020

Word Embeddings: Stability and Semantic Change

Word embeddings are computed by a class of techniques within natural lan...
04/18/2020

Effect of Text Color on Word Embeddings

In natural scenes and documents, we can find the correlation between a t...
01/14/2020

Balancing the composition of word embeddings across heterogenous data sets

Word embeddings capture semantic relationships based on contextual infor...
10/22/2018

Proactive Security: Embedded AI Solution for Violent and Abusive Speech Recognition

Violence is an epidemic in Brazil and a problem on the rise world-wide. ...