Marginalised Normal Regression: Unbiased curve fitting in the presence of x-errors

09/02/2023
by Deaglan Bartlett, et al.

The history of the seemingly simple problem of straight line fitting in the presence of both x and y errors has been fraught with misadventure, with statistically ad hoc and poorly tested methods abounding in the literature. The problem stems from the emergence of latent variables describing the "true" values of the independent variables, the priors on which have a significant impact on the regression result. By analytic calculation of maximum a posteriori values and biases, and comprehensive numerical mock tests, we assess the quality of possible priors. In the presence of intrinsic scatter, the only prior that we find to give reliably unbiased results in general is a mixture of one or more Gaussians with means and variances determined as part of the inference. We find that a single Gaussian is typically sufficient and dub this model Marginalised Normal Regression (MNR). We illustrate the necessity for MNR by comparing it to alternative methods on an important linear relation in cosmology, and extend it to nonlinear regression and an arbitrary covariance matrix linking x and y. We publicly release a Python/JAX implementation of MNR and its Gaussian mixture model extension that is coupled to Hamiltonian Monte Carlo for efficient sampling, which we call ROXY (Regression and Optimisation with X and Y errors).
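
To make the single-Gaussian case concrete, the following is a minimal sketch of a marginalised likelihood for a straight line, written in plain Python. It is not the ROXY API: the function name `mnr_log_likelihood` and all parameter names are hypothetical. It assumes independent Gaussian x and y errors, Gaussian intrinsic scatter `sig_int` in y, and a Gaussian prior N(`mu`, `w`^2) on the true x values. Under those assumptions the latent true x can be integrated out analytically, leaving a bivariate Gaussian for each observed (x, y) pair.

```python
import math


def mnr_log_likelihood(x_obs, y_obs, sig_x, sig_y,
                       slope, intercept, sig_int, mu, w):
    """Log-likelihood for y = slope * x + intercept with the latent true x
    values marginalised analytically under a N(mu, w^2) prior.

    All names are illustrative assumptions, not taken from ROXY.
    """
    total = 0.0
    for x, y, sx, sy in zip(x_obs, y_obs, sig_x, sig_y):
        # Covariance of (x_obs, y_obs) after integrating out the true x.
        var_x = w**2 + sx**2
        var_y = slope**2 * w**2 + sy**2 + sig_int**2
        cov_xy = slope * w**2
        det = var_x * var_y - cov_xy**2

        # Residuals about the marginal mean (mu, slope * mu + intercept).
        dx = x - mu
        dy = y - (slope * mu + intercept)

        # Quadratic form using the explicit inverse of the 2x2 covariance.
        chi2 = (var_y * dx**2 - 2.0 * cov_xy * dx * dy + var_x * dy**2) / det

        # Bivariate normal log-density.
        total += -0.5 * chi2 - 0.5 * math.log(det) - math.log(2.0 * math.pi)
    return total
```

In an inference, `slope`, `intercept`, `sig_int`, `mu` and `w` would all be sampled or optimised together, so that the mean and width of the prior on the true x values are determined by the data rather than fixed in advance.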
