To Know by the Company Words Keep and What Else Lies in the Vicinity

04/30/2022
by Jake Ryland Williams, et al.

The development of state-of-the-art (SOTA) Natural Language Processing (NLP) systems has steadily established new techniques for absorbing the statistics of linguistic data. These techniques often trace well-known constructs from traditional theories, and we study these connections to close gaps around key NLP methods and to orient future work. To this end, we introduce an analytic model of the statistics learned by seminal algorithms (including GloVe and Word2Vec), and derive insights for systems that use these algorithms and, more generally, the statistics of co-occurrence. In this work we derive, to the best of our knowledge, the first known solution to Word2Vec's softmax-optimized, skip-gram algorithm. This result presents exciting potential for future development as a direct solution to a deep learning (DL) language model's (LM's) matrix factorization. Here, however, we use the solution to demonstrate a seemingly universal property of word vectors, one that allows for the prophylactic discernment of biases in data, prior to their absorption by DL models. To qualify our work, we conduct an analysis of independence, i.e., of the density of statistical dependencies in co-occurrence models, which in turn yields insights on the partial fulfillment of the distributional hypothesis by co-occurrence statistics.
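To make the co-occurrence statistics concrete, below is a minimal Python sketch of the quantity a softmax-optimized skip-gram model targets: with sufficient capacity, the full-softmax objective is maximized exactly when the model's softmax outputs reproduce the empirical conditional context distribution P(c | w) = #(w, c) / #(w). The toy corpus, the window size, and the helper name cooccurrence_counts are illustrative assumptions, not the paper's implementation.

    from collections import Counter

    def cooccurrence_counts(tokens, window=2):
        # Count symmetric (target, context) pairs within a fixed-size window.
        counts = Counter()
        for i, target in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[(target, tokens[j])] += 1
        return counts

    # Toy corpus (an illustrative assumption, not data from the paper).
    tokens = "the cat sat on the mat the dog sat on the rug".split()
    counts = cooccurrence_counts(tokens, window=2)

    # Empirical conditional distribution P(c | w) = #(w, c) / #(w): with enough
    # capacity, a full-softmax skip-gram model is optimal exactly when its
    # softmax outputs match this distribution for every target word w.
    totals = Counter()
    for (target, _), n in counts.items():
        totals[target] += n
    p_context_given_word = {
        (target, context): n / totals[target]
        for (target, context), n in counts.items()
    }

    for (target, context), p in sorted(p_context_given_word.items())[:8]:
        print(f"P({context} | {target}) = {p:.3f}")

Varying the window size changes which pairs enter the counts, one simple way to see how densely statistical dependencies populate a co-occurrence model.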

Related research

03/18/2020
An Analysis on the Learning Rules of the Skip-Gram Model
To improve the generalization of the representations for natural languag...

06/04/2021
Language Model Metrics and Procrustes Analysis for Improved Vector Transformation of NLP Embeddings
Artificial neural networks are mathematical models at their core. This t...

11/03/2019
Low-dimensional Semantic Space: from Text to Word Embedding
This article focuses on the study of Word Embedding, a feature-learning ...

05/16/2022
What company do words keep? Revisiting the distributional semantics of J.R. Firth & Zellig Harris
The power of word embeddings is attributed to the linguistic theory that...

03/11/2020
Semantic Holism and Word Representations in Artificial Neural Networks
Artificial neural networks are a state-of-the-art solution for many prob...

09/04/2020
Recent Trends in the Use of Deep Learning Models for Grammar Error Handling
Grammar error handling (GEH) is an important topic in natural language p...

11/12/2021
On-the-Fly Rectification for Robust Large-Vocabulary Topic Inference
Across many data domains, co-occurrence statistics about the joint appea...
