An Analysis on the Learning Rules of the Skip-Gram Model

03/18/2020
by Canlin Zhang et al.

To improve the generalization of representations for natural language processing tasks, words are commonly represented as vectors, where distances between the vectors reflect the similarity of the corresponding words. While word2vec, the state-of-the-art implementation of the skip-gram model, is widely used and improves the performance of many natural language processing tasks, its learning mechanism is not yet well understood. In this work, we derive the learning rules of the skip-gram model and establish their close relationship to competitive learning. In addition, we provide the constraints that a globally optimal solution of the skip-gram model must satisfy, and we validate them with experimental results.
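To make the learning-rule discussion concrete, the following is a minimal NumPy sketch of one stochastic gradient step on the full-softmax skip-gram objective; it is an illustrative assumption, not the authors' code, and the names (skip_gram_step, W_in, W_out) are hypothetical. The competitive-learning flavor is visible in the update: the observed context word's output vector is pulled toward the center word's input vector, while every other output vector is pushed away in proportion to its predicted probability.

```python
import numpy as np

def skip_gram_step(W_in, W_out, center, context, lr=0.1):
    """One SGD step on the full-softmax skip-gram loss for a (center, context) pair.

    W_in holds the input (center-word) vectors, W_out the output (context-word)
    vectors; both arrays are updated in place.
    """
    h = W_in[center]                      # hidden layer = input vector of the center word
    scores = W_out @ h                    # one score per vocabulary word
    p = np.exp(scores - scores.max())
    p /= p.sum()                          # softmax probabilities over the vocabulary

    # Gradient of -log p(context | center): the observed context word's output
    # vector is pulled toward h, every word's output vector is pushed away from h
    # in proportion to its predicted probability p[j].
    err = p.copy()
    err[context] -= 1.0                   # (p_j - y_j), with y the one-hot target
    grad_h = W_out.T @ err                # gradient w.r.t. the hidden layer (old W_out)
    W_out -= lr * np.outer(err, h)        # update all output vectors
    W_in[center] -= lr * grad_h           # update the center word's input vector

# Toy usage: vocabulary of 10 words, 5-dimensional embeddings.
rng = np.random.default_rng(0)
W_in = rng.normal(scale=0.1, size=(10, 5))
W_out = rng.normal(scale=0.1, size=(10, 5))
skip_gram_step(W_in, W_out, center=1, context=3)
```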


Related research:

06/18/2018  SubGram: Extending Skip-gram Word Representation with Substrings
12/06/2020  Align-gram: Rethinking the Skip-gram Model for Protein Sequence Analysis
05/17/2016  Word2Vec is a special case of Kernel Correspondence Analysis and Kernels for Natural Language Processing
01/11/2017  Job Detection in Twitter
04/30/2022  To Know by the Company Words Keep and What Else Lies in the Vicinity
03/11/2020  Semantic Holism and Word Representations in Artificial Neural Networks
