A Simple Language Model based on PMI Matrix Approximations

07/17/2017
by Oren Melamud, et al.

In this study, we introduce a new approach for learning language models by training them to estimate word-context pointwise mutual information (PMI), and then deriving the desired conditional probabilities from PMI at test time. Specifically, we show that with minor modifications to word2vec's algorithm, we get principled language models that are closely related to the well-established Noise Contrastive Estimation (NCE) based language models. A compelling aspect of our approach is that our models are trained with the same simple negative sampling objective function that is commonly used in word2vec to learn word embeddings.
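As a rough illustration of the "deriving conditional probabilities from PMI at test time" step, the sketch below relies on the standard identity PMI(w, c) = log(p(w | c) / p(w)), which gives p(w | c) = p(w) · exp(PMI(w, c)). The function name `conditional_from_pmi`, the toy vocabulary, and all probability values are hypothetical and not taken from the paper. The renormalization over the vocabulary is an assumption made here because estimated PMI scores are generally inexact. The paper's actual procedure may differ in its details.

```python
import math


def conditional_from_pmi(pmi_scores, unigram_probs):
    """Turn PMI(w, c) scores for one fixed context c into p(w | c).

    Uses p(w | c) = p(w) * exp(PMI(w, c)), then renormalizes over the
    vocabulary (an assumption of this sketch, since learned PMI
    estimates need not yield a distribution that sums exactly to 1).
    """
    unnormalized = {
        w: unigram_probs[w] * math.exp(pmi_scores[w]) for w in unigram_probs
    }
    total = sum(unnormalized.values())
    return {w: value / total for w, value in unnormalized.items()}


# Toy, made-up numbers: a three-word vocabulary and one context.
unigram = {"cat": 0.5, "dog": 0.3, "fish": 0.2}
true_conditional = {"cat": 0.2, "dog": 0.7, "fish": 0.1}

# Exact PMI scores computed from the toy distributions above; a trained
# model would supply estimates of these scores instead.
pmi = {w: math.log(true_conditional[w] / unigram[w]) for w in unigram}

recovered = conditional_from_pmi(pmi, unigram)
```

With exact PMI scores the conversion recovers the original conditional distribution. With learned estimates, the renormalization keeps the output a valid distribution.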

research
09/05/2016

PMI Matrix Approximations with Applications to Neural Language Modeling

The negative sampling (NEG) objective function, used in word2vec, is a s...
research
10/12/2016

Language Models with Pre-Trained (GloVe) Word Embeddings

In this work we implement a training of a Language Model (LM), using Rec...
research
09/06/2018

Noise Contrastive Estimation and Negative Sampling for Conditional Models: Consistency and Statistical Efficiency

Noise Contrastive Estimation (NCE) is a powerful parameter estimation me...
research
10/22/2020

UniCase – Rethinking Casing in Language Models

In this paper, we introduce a new approach to dealing with the problem o...
research
10/20/2016

Clinical Text Prediction with Numerically Grounded Conditional Language Models

Assisted text input techniques can save time and effort and improve text...
research
05/25/2023

Language Models Implement Simple Word2Vec-style Vector Arithmetic

A primary criticism towards language models (LMs) is their inscrutabilit...
research
12/22/2022

Efficient Induction of Language Models Via Probabilistic Concept Formation

This paper presents a novel approach to the acquisition of language mode...