Noise Contrastive Estimation and Negative Sampling for Conditional Models: Consistency and Statistical Efficiency

09/06/2018
by Zhuang Ma, et al.

Noise Contrastive Estimation (NCE) is a powerful parameter estimation method for log-linear models that avoids computing the partition function or its derivatives at each training step, which is computationally demanding in many cases. It is closely related to negative sampling methods, now widely used in NLP. This paper considers NCE-based estimation of conditional models. Conditional models are frequently encountered in practice; however, there has not been a rigorous theoretical analysis of NCE in this setting, and we argue that subtle but important questions arise when generalizing NCE to the conditional case. In particular, we analyze two variants of NCE for conditional models: one based on a classification objective, the other based on a ranking objective. We show that the ranking-based variant gives consistent parameter estimates under weaker assumptions than the classification-based variant. We also analyze the statistical efficiency of both variants. Finally, we describe experiments on synthetic data and language modeling that show the effectiveness and trade-offs of both methods.
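
To make the two variants concrete, here is a minimal sketch of a per-example ranking loss and a per-example classification (binary) loss. The abstract does not give formulas, so the forms below are assumptions based on how these objectives are commonly written: the true label and K noise samples are scored by a hypothetical model, the ranking loss is a softmax over noise-corrected scores, and the binary loss is a logistic loss with logit `score - log(K * p_noise)`. The function names and the array layout (true label at index 0) are illustrative only.

```python
import numpy as np


def log_sigmoid(z):
    # Numerically stable log(sigmoid(z)) = -log(1 + exp(-z)).
    return -np.logaddexp(0.0, -z)


def ranking_nce_loss(scores, log_noise):
    """Assumed ranking-style loss for one example.

    scores[0] is the model score of the true label; scores[1:] are the
    scores of K labels drawn from the noise distribution. log_noise holds
    the log noise probability of each of the K + 1 labels.
    """
    adjusted = scores - log_noise
    # Negative log-probability that index 0 wins a softmax over all K + 1.
    return np.logaddexp.reduce(adjusted) - adjusted[0]


def binary_nce_loss(scores, log_noise):
    """Assumed classification-style loss for one example (same layout)."""
    k = len(scores) - 1
    logits = scores - (np.log(k) + log_noise)
    # True label is classified as "data", the K noise samples as "noise".
    return -(log_sigmoid(logits[0]) + np.sum(log_sigmoid(-logits[1:])))


if __name__ == "__main__":
    scores = np.array([2.0, 0.5, -1.0, 0.0])   # hypothetical model scores
    log_noise = np.log(np.full(4, 0.25))       # uniform noise over 4 labels
    shift = 3.0                                # stands in for an unmodeled log-normalizer
    print(ranking_nce_loss(scores, log_noise),
          ranking_nce_loss(scores + shift, log_noise))
    print(binary_nce_loss(scores, log_noise),
          binary_nce_loss(scores + shift, log_noise))
```

Under these assumed forms, adding the same constant to every score for a given input leaves the ranking loss unchanged but changes the binary loss. This is one plausible intuition for why a ranking objective could need weaker assumptions about normalization; the abstract itself does not spell out the argument.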

research
09/05/2016

PMI Matrix Approximations with Applications to Neural Language Modeling

The negative sampling (NEG) objective function, used in word2vec, is a s...
research
07/17/2017

A Simple Language Model based on PMI Matrix Approximations

In this study, we introduce a new approach for learning language models ...
research
06/10/2018

Conditional Noise-Contrastive Estimation of Unnormalised Models

Many parametric statistical models are not properly normalised and only ...
research
11/02/2020

Noise-Contrastive Estimation for Multivariate Point Processes

The log-likelihood of a generative model often involves both positive an...
research
06/13/2023

Learning Unnormalized Statistical Models via Compositional Optimization

Learning unnormalized statistical models (e.g., energy-based models) is ...
research
12/20/2022

Likelihood-based generalization of Markov parameter estimation and multiple shooting objectives in system identification

This paper considers the problem of system identification (ID) of linear...
research
07/30/2022

Efficient estimation and inference for the signed β-model in directed signed networks

This paper proposes a novel signed β-model for directed signed network, ...
