InfiniteWalk: Deep Network Embeddings as Laplacian Embeddings with a Nonlinearity

by Sudhanshu Chanpuriya, et al.

The skip-gram model for learning word embeddings (Mikolov et al. 2013) has been widely popular, and DeepWalk (Perozzi et al. 2014), among other methods, has extended the model to learning node representations from networks. Recent work by Qiu et al. (2018) provides a closed-form expression for the DeepWalk objective, obviating the need for sampling on small datasets and improving accuracy. In these methods, the "window size" T within which words or nodes are considered to co-occur is a key hyperparameter. We study the objective in the limit as T goes to infinity, which allows us to simplify the expression of Qiu et al. We prove that this limiting objective corresponds to factoring a simple transformation of the pseudoinverse of the graph Laplacian, linking DeepWalk to extensive prior work in spectral graph embeddings. Further, we show that by applying a simple nonlinear entrywise transformation to this pseudoinverse, we recover a good approximation of the finite-T objective and obtain embeddings that are competitive with those from DeepWalk and other skip-gram methods in multi-label classification. Surprisingly, we find that even simple binary thresholding of the Laplacian pseudoinverse is often competitive, suggesting that the core advancement of recent methods is a nonlinearity on top of the classical spectral embedding approach.
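To make the last idea concrete, here is a minimal sketch of "threshold the Laplacian pseudoinverse, then factor it" on a toy graph. It is an illustration under stated assumptions, not the authors' exact procedure: it assumes the unnormalized Laplacian L = D - A, picks the binary cutoff as a quantile of the pseudoinverse entries, and factors the result with a truncated SVD. The function names, the `quantile` parameter, and the toy graph are all hypothetical.

```python
import numpy as np

def laplacian_pinv(adj):
    """Pseudoinverse of the unnormalized graph Laplacian L = D - A (assumed form)."""
    deg = adj.sum(axis=1)
    lap = np.diag(deg) - adj
    return np.linalg.pinv(lap)

def thresholded_embedding(adj, dim, quantile=0.5):
    """Binarize the Laplacian pseudoinverse entrywise, then factor it.

    The quantile-based cutoff and the SVD factorization are illustrative
    choices for this sketch, not values taken from the paper.
    """
    lp = laplacian_pinv(adj)
    cutoff = np.quantile(lp, quantile)
    binary = (lp >= cutoff).astype(float)   # entrywise nonlinearity
    u, s, _ = np.linalg.svd(binary)         # singular values sorted descending
    return u[:, :dim] * np.sqrt(s[:dim])    # rank-`dim` node embeddings

# Toy graph: two triangles joined by a single edge (2-3).
adj = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    adj[i, j] = adj[j, i] = 1.0

emb = thresholded_embedding(adj, dim=2)
```

Each row of `emb` is a two-dimensional vector for one node. Computing the dense pseudoinverse costs cubic time in the number of nodes, so this sketch is only practical for small graphs.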




