Robust modal regression with direct log-density derivative estimation

by Hiroaki Sasaki, et al.

Modal regression aims to estimate the global mode (i.e., global maximum) of the conditional density of the output variable given input variables, and has led to regression methods that are robust against heavy-tailed or skewed noise. The conditional mode is often estimated through maximization of the modal regression risk (MRR). In order to apply a gradient method for this maximization, the fundamental challenge is accurate approximation of the gradient of the MRR, not of the MRR itself. To overcome this challenge, in this paper we take a novel approach of directly approximating the gradient of the MRR. To this end, we develop kernelized and neural-network-based versions of the least-squares log-density derivative estimator, which directly approximates the derivative of the log-density without density estimation. With direct approximation of the MRR gradient, we first propose a modal regression method with kernels and derive a new parameter update rule based on a fixed-point method. The derived update rule is then theoretically proven to possess a monotonic hill-climbing property toward the conditional mode. Furthermore, we show that our approach of directly approximating the gradient is compatible with recent sophisticated stochastic gradient methods (e.g., Adam), and propose another modal regression method based on neural networks. Finally, the superior performance of the proposed methods is demonstrated on various artificial and benchmark datasets.
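The two-stage idea behind the abstract can be illustrated with a minimal one-dimensional sketch (not the authors' implementation): first, a kernelized least-squares log-density derivative estimator is fitted using the integration-by-parts criterion (1/n)Σ g(x_j)² + (2/n)Σ g'(x_j), whose minimizer over a linear-in-kernel model is available in closed form; then, gradient-ascent hill climbing with the estimated derivative moves a point toward a mode of the density. Function names, the Gaussian bandwidth, step size, and regularization values below are illustrative choices, not values from the paper.

```python
import numpy as np

def lsldg_fit_1d(x, centers, sigma=0.5, lam=0.01):
    """Fit g(x) = sum_i theta_i * k_i(x) with Gaussian kernels by minimizing
    the empirical LSLDG criterion (1/n) sum g(x_j)^2 + (2/n) sum g'(x_j),
    which follows from integration by parts of the squared loss to
    d/dx log p(x).  Returns the fitted coefficient vector theta."""
    x = np.asarray(x, dtype=float)[:, None]          # (n, 1)
    c = np.asarray(centers, dtype=float)[None, :]    # (1, b)
    diff = x - c
    K = np.exp(-diff**2 / (2.0 * sigma**2))          # kernel values, (n, b)
    dK = -diff / sigma**2 * K                        # kernel derivatives in x
    n = x.shape[0]
    G = K.T @ K / n                                  # Gram-type matrix
    h = dK.mean(axis=0)                              # mean kernel derivative
    # Minimizer of theta' G theta + 2 h' theta + lam ||theta||^2:
    return -np.linalg.solve(G + lam * np.eye(c.shape[1]), h)

def lsldg_predict_1d(xq, centers, theta, sigma=0.5):
    """Evaluate the estimated log-density derivative at query points xq."""
    diff = np.asarray(xq, dtype=float)[:, None] - np.asarray(centers)[None, :]
    return np.exp(-diff**2 / (2.0 * sigma**2)) @ theta

def seek_mode_1d(y0, centers, theta, sigma=0.5, step=0.2, iters=100):
    """Hill-climb toward a density mode by gradient ascent on log p(x),
    using the directly estimated derivative instead of a density estimate."""
    y = float(y0)
    for _ in range(iters):
        y += step * float(lsldg_predict_1d(np.array([y]), centers, theta, sigma)[0])
    return y
```

On standard normal samples the true log-density derivative is -x, so the fitted estimator should be positive at -1, negative at +1, and near zero at the origin, and hill climbing from any start should settle near the mode at 0. Note that this sketch estimates the unconditional score for clarity; the paper's methods target the gradient of the modal regression risk for the conditional density.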


