Stolen Probability: A Structural Weakness of Neural Language Models

05/05/2020
by   David Demeter, et al.

Neural Network Language Models (NNLMs) generate probability distributions by applying a softmax function to a distance metric formed by taking the dot product of a prediction vector with all word vectors in a high-dimensional embedding space. This dot-product distance metric forms part of the inductive bias of NNLMs. Although NNLMs optimize well with this inductive bias, we show that it results in a sub-optimal ordering of the embedding space that structurally impoverishes some words in favor of others when assigning probability. We present numerical, theoretical and empirical analyses showing that words on the interior of the convex hull in the embedding space have their probability bounded by the probabilities of the words on the hull.
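The geometric argument can be illustrated numerically. In the sketch below (a hypothetical toy setup, not the paper's experiments), the embedding of one word is placed at a convex combination of the others, so its dot-product logit is always that same convex combination of the hull words' logits. Regardless of the prediction vector, its softmax probability therefore never exceeds 1/V (here 1/5), while hull words can approach probability 1:

```python
import numpy as np

rng = np.random.default_rng(0)

# Four "hull" words: vertices of a square in a 2-D embedding space.
hull = np.array([[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]])
# One "interior" word: a convex combination (here the mean) of the hull words.
interior = hull.mean(axis=0, keepdims=True)  # [[0.0, 0.0]]
E = np.vstack([hull, interior])              # 5 words; index 4 is interior


def softmax(z):
    z = z - z.max()  # numerical stability
    e = np.exp(z)
    return e / e.sum()


max_interior_prob = 0.0
for _ in range(1000):
    h = rng.normal(size=2) * 10.0   # random prediction vector
    logits = E @ h
    # The interior logit is a convex combination of the hull logits,
    # so it can never exceed their maximum.
    assert logits[4] <= logits[:4].max() + 1e-9
    p = softmax(logits)
    max_interior_prob = max(max_interior_prob, p[4])

# By Jensen's inequality, exp(mean of hull logits) <= mean of exp(hull
# logits), so the interior word's probability is capped at 1/5 for every h.
print(f"max interior probability observed: {max_interior_prob:.4f}")
```

The bound follows from Jensen's inequality: if the interior logit is the mean of the hull logits, then the sum of the exponentiated hull logits is at least four times the exponentiated interior logit, capping the interior word's probability at 1/5 no matter how the prediction vector is chosen. This is the "stolen probability" effect the abstract describes: the structure of the embedding space, not the training data, bounds what probability an interior word can receive.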


