Using k-way Co-occurrences for Learning Word Embeddings

by   Danushka Bollegala, et al.

Co-occurrences between two words provide useful insights into the semantics of those words. Consequently, much prior work on word embedding learning has used co-occurrences between two words as the training signal for learning word embeddings. However, in natural language texts it is common for multiple words to be related and to co-occur in the same context. We extend the notion of co-occurrences to cover k(≥2)-way co-occurrences among a set of k words. Specifically, we prove a theoretical relationship between the joint probability of k(≥2) words, and the sum of ℓ_2 norms of their embeddings. Next, we propose a learning objective motivated by our theoretical result that utilises k-way co-occurrences for learning word embeddings. Our experimental results show that the derived theoretical relationship does indeed hold empirically, and that despite data sparsity, for some smaller k values, k-way embeddings perform comparably to or better than 2-way embeddings on a range of tasks.
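To make the notion of a k-way co-occurrence concrete, the sketch below counts, for each set of k distinct words, how often they appear together inside a fixed-size context window. This is a minimal illustration, not the paper's actual extraction pipeline; the function name, window scheme, and the decision to count each sliding window independently (so overlapping windows contribute multiple counts) are all assumptions made for the example.

```python
from itertools import combinations
from collections import Counter


def k_way_cooccurrences(tokens, k, window):
    """Count k-way co-occurrences: each sorted tuple of k distinct words
    is incremented once per sliding context window that contains it.
    NOTE: a simplified sketch; overlapping windows are counted separately,
    which may differ from the counting scheme used in the paper."""
    counts = Counter()
    for i in range(len(tokens) - window + 1):
        context = set(tokens[i:i + window])
        # Enumerate every k-subset of the distinct words in this window.
        for combo in combinations(sorted(context), k):
            counts[combo] += 1
    return counts


tokens = "the cat sat on the mat the cat sat".split()
counts = k_way_cooccurrences(tokens, k=3, window=4)
print(counts.most_common(3))
```

For k = 2 this reduces to the familiar pairwise co-occurrence counts used by most embedding methods; larger k captures joint contexts such as ("cat", "sat", "the") above, at the cost of sparser statistics.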

