Neural-Kernelized Conditional Density Estimation

by Hiroaki Sasaki, et al.

Conditional density estimation is a general framework for solving various problems in machine learning. Among existing methods, non-parametric and/or kernel-based methods are often difficult to use on large datasets, while methods based on neural networks usually make restrictive parametric assumptions on the probability densities. Here, we propose a novel method for estimating the conditional density based on score matching. In contrast to existing methods, we employ scalable neural networks, but do not make explicit parametric assumptions on densities. The key challenge in applying score matching to neural networks is the computation of the first- and second-order derivatives of a model for the log-density. We tackle this challenge by developing a new neural-kernelized approach, which can be applied to large datasets with stochastic gradient descent, while the reproducing kernels allow for easy computation of the derivatives needed in score matching. We show that the neural-kernelized function approximator has universal approximation capability and that our method is consistent in conditional density estimation. We numerically demonstrate that our method is useful in high-dimensional conditional density estimation and compares favourably with existing methods. Finally, we prove that the proposed method has interesting connections to two probabilistically principled frameworks of representation learning: nonlinear sufficient dimension reduction and nonlinear independent component analysis.
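To make the score-matching machinery concrete, the sketch below implements Hyvärinen's score-matching objective for a plain kernel log-density model log p(x) = Σ_j α_j k(x, z_j) + const with a Gaussian kernel. This is an illustrative simplification, not the paper's neural-kernelized estimator: the basis points `Z`, coefficients `alpha`, and bandwidth `sigma` are hypothetical placeholders. It shows why kernels help here: the first- and second-order derivatives of the model are available in closed form, so the objective needs no automatic differentiation through the density.

```python
import numpy as np

def gauss_kernel(x, Z, sigma):
    # k(x, z_j) = exp(-||x - z_j||^2 / (2 sigma^2)) for each basis point z_j
    d2 = np.sum((Z - x) ** 2, axis=1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def score_matching_loss(X, Z, alpha, sigma):
    """Empirical score-matching objective (Hyvarinen, 2005) for the model
    log p(x) = sum_j alpha_j * k(x, z_j) + const:

        J = (1/n) * sum_x [ 0.5 * ||grad_x log p(x)||^2 + laplacian_x log p(x) ]

    With a Gaussian kernel both derivatives are analytic:
        grad   = sum_j alpha_j * k_j * (z_j - x) / sigma^2
        laplac = sum_j alpha_j * k_j * (||z_j - x||^2 / sigma^4 - d / sigma^2)
    """
    d = X.shape[1]
    total = 0.0
    for x in X:
        k = gauss_kernel(x, Z, sigma)            # (m,) kernel values
        diff = Z - x                             # (m, d) basis offsets
        grad = (alpha * k) @ diff / sigma ** 2   # analytic gradient, (d,)
        lap = np.sum(alpha * k * (np.sum(diff ** 2, axis=1) / sigma ** 4
                                  - d / sigma ** 2))
        total += 0.5 * grad @ grad + lap
    return total / len(X)
```

In the paper's setting the kernel features are composed with a neural network and the whole objective is minimised by stochastic gradient descent over mini-batches; the closed-form derivatives above are what keep that tractable.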


