SubGram: Extending Skip-gram Word Representation with Substrings

06/18/2018
by Tom Kocmi, et al.

Skip-gram (word2vec) is a recent method for creating vector representations of words ("distributed word representations") using a neural network. The representation has gained popularity in various areas of natural language processing because it seems to capture syntactic and semantic information about words without any explicit supervision. We propose SubGram, a refinement of the Skip-gram model that also considers word structure during training, achieving large gains on the original Skip-gram test set.
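
To illustrate the core idea of letting word structure contribute to the representation, here is a minimal Python sketch in which a word's vector is composed from the vectors of its character substrings. The boundary markers, the substring length bounds, and the `substrings`/`word_vector` helpers are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def substrings(word, min_len=2, max_len=4):
    """Enumerate character substrings of `word`, plus the word itself.

    The '^'/'$' boundary markers and the length bounds are illustrative
    assumptions; the paper defines its own substring inventory.
    """
    marked = "^" + word + "$"
    subs = [marked]  # keep the full (marked) word as one of its own features
    for n in range(min_len, max_len + 1):
        for i in range(len(marked) - n + 1):
            subs.append(marked[i:i + n])
    return subs

def word_vector(word, sub_emb, dim=100):
    """Compose a word vector by summing its known substring vectors."""
    vecs = [sub_emb[s] for s in substrings(word) if s in sub_emb]
    return np.sum(vecs, axis=0) if vecs else np.zeros(dim)

# Usage: a toy embedding table with random vectors for each substring.
rng = np.random.default_rng(0)
emb = {s: rng.standard_normal(100) for s in substrings("running")}
print(word_vector("running", emb).shape)  # (100,)
```

One practical consequence of such a composition is that out-of-vocabulary words can still receive a representation through their substrings, which is part of the appeal of structure-aware refinements of Skip-gram.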


Related research

02/25/2015 · Breaking Sticks and Ambiguities with Adaptive Skip-gram
Recently proposed Skip-gram model is a powerful method for learning high...

12/23/2019 · Semantics- and Syntax-related Subvectors in the Skip-gram Embeddings
We show that the skip-gram embedding of any word can be decomposed into ...

03/18/2020 · An Analysis on the Learning Rules of the Skip-Gram Model
To improve the generalization of the representations for natural languag...

01/12/2015 · Combining Language and Vision with a Multimodal Skip-gram Model
We extend the SKIP-GRAM model of Mikolov et al. (2013a) by taking visual...

02/17/2021 · Contextual Skipgram: Training Word Representation Using Context Information
The skip-gram (SG) model learns word representation by predicting the wo...

03/11/2020 · Semantic Holism and Word Representations in Artificial Neural Networks
Artificial neural networks are a state-of-the-art solution for many prob...
