Statistical Efficiency of Score Matching: The View from Isoperimetry

10/03/2022
by Frederic Koehler, et al.

Deep generative models parametrized up to a normalizing constant (e.g. energy-based models) are difficult to train by maximizing the likelihood of the data because the likelihood and/or gradients thereof cannot be explicitly or efficiently written down. Score matching is a training method whereby, instead of fitting the likelihood log p(x) for the training data, we fit the score function ∇_x log p(x), obviating the need to evaluate the partition function. Though this estimator is known to be consistent, it is unclear whether (and when) its statistical efficiency is comparable to that of maximum likelihood, which is known to be (asymptotically) optimal. We initiate this line of inquiry in this paper, and show a tight connection between the statistical efficiency of score matching and the isoperimetric properties of the distribution being estimated – i.e. the Poincaré, log-Sobolev and isoperimetric constants – quantities which govern the mixing time of Markov processes like Langevin dynamics. Roughly, we show that the score matching estimator is statistically comparable to maximum likelihood when the distribution has a small isoperimetric constant. Conversely, if the distribution has a large isoperimetric constant – even for simple families of distributions like exponential families with rich enough sufficient statistics – score matching will be substantially less efficient than maximum likelihood. We suitably formalize these results both in the finite sample regime and in the asymptotic regime. Finally, we identify a direct parallel in the discrete setting, where we connect the statistical properties of pseudolikelihood estimation with approximate tensorization of entropy and the Glauber dynamics.
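
To make the idea of fitting the score rather than the likelihood concrete, here is a minimal sketch for a one-dimensional Gaussian. This example is not taken from the paper. It assumes the standard integration-by-parts form of the score matching objective, E[ (1/2) (d/dx log p(x))^2 + d^2/dx^2 log p(x) ], which never touches the normalizing constant. The function names `score_matching_loss` and `fit_gaussian_score_matching` are hypothetical, chosen only for illustration.

```python
import numpy as np


def score_matching_loss(x, mu, var):
    """Empirical score matching objective for a 1-D Gaussian N(mu, var).

    Uses only derivatives of log p in x, so the normalizing constant
    of the density is never evaluated.
    """
    score = -(x - mu) / var      # d/dx log p(x)
    curvature = -1.0 / var       # d^2/dx^2 log p(x), constant in x
    return np.mean(0.5 * score**2 + curvature)


def fit_gaussian_score_matching(x):
    """Closed-form minimizer of the objective above (illustrative helper).

    Setting the gradient to zero gives the sample mean and the
    (biased) sample variance.
    """
    mu = x.mean()
    var = np.mean((x - mu) ** 2)
    return mu, var


# Synthetic data; the true parameters here are arbitrary assumptions.
rng = np.random.default_rng(0)
x = rng.normal(loc=1.5, scale=2.0, size=10_000)

mu_hat, var_hat = fit_gaussian_score_matching(x)
loss_hat = score_matching_loss(x, mu_hat, var_hat)
```

In this Gaussian case the score matching estimator coincides with the maximum likelihood estimator, so it illustrates the mechanics of the objective only, not the efficiency gap the abstract describes for distributions with a large isoperimetric constant.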


research
06/15/2023

Fit Like You Sample: Sample-Efficient Generalized Score Matching from Fast Mixing Markov Chains

Score matching is an approach to learning probability distributions para...
research
06/03/2023

Provable benefits of score matching

Score matching is an alternative to maximum likelihood (ML) for estimati...
research
11/20/2018

Learning deep kernels for exponential family densities

The kernel exponential family is a rich class of distributions, which can...
research
07/11/2021

Dual Training of Energy-Based Models with Overparametrized Shallow Neural Networks

Energy-based models (EBMs) are generative models that are usually traine...
research
05/26/2022

Tuning-parameter-free optimal propensity score matching approach for causal inference

Propensity score matching (PSM) is a pseudo-experimental method that use...
research
05/24/2023

Training Energy-Based Normalizing Flow with Score-Matching Objectives

In this paper, we establish a connection between the parameterization of...
research
03/18/2022

Generalized Score Matching for Regression

Many probabilistic models that have an intractable normalizing constant ...
