Robust Distributional Regression with Automatic Variable Selection

12/14/2022
by Meadhbh O'Neill, et al.

Datasets with extreme observations and/or heavy-tailed error distributions are commonly encountered and should be analyzed with careful consideration of these features from a statistical perspective. Small deviations from an assumed model, such as the presence of outliers, can cause classical regression procedures to break down, potentially leading to unreliable inferences. Other distributional deviations, such as heteroscedasticity, can be handled by going beyond the mean and modelling the scale parameter in terms of covariates. We propose a method that accounts for heavy tails and heteroscedasticity through the use of a generalized normal distribution (GND). The GND contains a kurtosis-characterizing shape parameter that moves the model smoothly between the normal distribution and the heavier-tailed Laplace distribution - thus covering both classical and robust regression. A key component of statistical inference is determining the set of covariates that influence the response variable. While correctly accounting for kurtosis and heteroscedasticity is crucial to this endeavour, a procedure for variable selection is still required. For this purpose, we use a novel penalized estimation procedure that avoids the typical computationally demanding grid search for tuning parameters. This is particularly valuable in the distributional regression setting where the location and scale parameters depend on covariates, since the standard approach would have multiple tuning parameters (one for each distributional parameter). We achieve this by using a "smooth information criterion" that can be optimized directly, where the tuning parameters are fixed at log(n) in the BIC case.
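The two ingredients described above can be illustrated with a minimal sketch. It assumes the common parametrization of the generalized normal density, f(y) = kappa / (2 s Gamma(1/kappa)) exp(-(|y - mu| / s)^kappa), which gives the Laplace distribution at kappa = 1 and a normal distribution with variance s^2 / 2 at kappa = 2; the paper's own parametrization may differ. The smooth surrogate for the count of nonzero coefficients, beta^2 / (beta^2 + eps^2), the log link for the scale, and the function names `gnd_logpdf`, `smooth_l0` and `smooth_bic` are likewise assumptions made for illustration and are not taken from the abstract.

```python
import math
import numpy as np


def gnd_logpdf(y, mu, s, kappa):
    """Log-density of the generalized normal distribution (assumed parametrization).

    kappa = 1 gives the Laplace distribution with scale s;
    kappa = 2 gives a normal distribution with variance s**2 / 2.
    """
    y = np.asarray(y, dtype=float)
    z = np.abs(y - mu) / s
    return math.log(kappa) - np.log(2.0 * s) - math.lgamma(1.0 / kappa) - z ** kappa


def smooth_l0(coefs, eps=1e-3):
    """Differentiable surrogate for the number of nonzero coefficients (assumed form)."""
    coefs = np.asarray(coefs, dtype=float)
    return float(np.sum(coefs ** 2 / (coefs ** 2 + eps ** 2)))


def smooth_bic(y, X, beta, alpha, kappa, eps=1e-3):
    """BIC-type criterion with the tuning parameter fixed at log(n).

    Location: mu = X @ beta.  Scale: s = exp(X @ alpha) (log link, an assumption).
    The first column of X is taken to be the intercept and is not penalized.
    """
    mu = X @ beta
    s = np.exp(X @ alpha)
    loglik = float(np.sum(gnd_logpdf(y, mu, s, kappa)))
    n = len(y)
    # Smoothed count of active slopes, plus 3 for the two intercepts and the
    # shape parameter (this way of counting is an assumption of the sketch).
    n_params = smooth_l0(beta[1:], eps) + smooth_l0(alpha[1:], eps) + 3
    return -2.0 * loglik + math.log(n) * n_params
```

Because the penalty term is differentiable, a criterion of this kind can in principle be passed to a general-purpose optimizer instead of being refit over a grid of tuning parameter values, which is the computational saving the abstract points to.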


