Scalable Estimation for Structured Additive Distributional Regression

01/13/2023
by   Nikolaus Umlauf, et al.
Recently, fitting probabilistic models has gained importance in many areas, but estimating such distributional models on very large data sets is a difficult task. In particular, rather complex models can easily lead to memory-related efficiency problems that make estimation infeasible even on high-performance computers. We therefore propose a novel backfitting algorithm, which is based on the ideas of stochastic gradient descent and can deal with virtually any amount of data on a conventional laptop. The algorithm performs automatic selection of variables and smoothing parameters, and its performance is in most cases superior or at least equivalent to other implementations for structured additive distributional regression, e.g., gradient boosting, while maintaining low computation time. Performance is evaluated using an extensive simulation study and an exceptionally challenging and unique example of lightning count prediction over Austria. The application uses a very large data set with over 9 million observations and 80 covariates, for which a prediction model cannot be estimated with standard distributional regression methods but can be estimated with our new approach.
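
To make the general idea of stochastic-gradient estimation for a distributional model concrete, the following is a minimal Python sketch. It is not the backfitting algorithm proposed in the paper, and it performs no variable or smoothing parameter selection. It assumes a Gaussian response whose mean and log standard deviation are both linear in the covariates, and it updates the coefficients from small random batches so that only one batch has to be held in working memory at a time. The function name, the step size, and the batch size are illustrative assumptions.

```python
import numpy as np


def sgd_gaussian_location_scale(X, y, batch_size=256, epochs=30, lr=0.05, seed=0):
    """Mini-batch SGD for y ~ N(mu, sigma^2) with mu = X @ beta_mu and
    log(sigma) = X @ beta_sigma (illustrative sketch, hypothetical helper)."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    beta_mu = np.zeros(p)
    beta_sigma = np.zeros(p)
    for _ in range(epochs):
        order = rng.permutation(n)  # new random batch order in every epoch
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            Xb, yb = X[idx], y[idx]
            mu = Xb @ beta_mu
            sigma = np.exp(Xb @ beta_sigma)
            z = (yb - mu) / sigma
            # Gradients of the mean negative log-likelihood,
            # log(sigma) + 0.5 * z**2, with respect to both coefficient vectors.
            grad_mu = -(Xb.T @ (z / sigma)) / len(idx)
            grad_sigma = (Xb.T @ (1.0 - z ** 2)) / len(idx)
            beta_mu -= lr * grad_mu
            beta_sigma -= lr * grad_sigma
    return beta_mu, beta_sigma
```

In this sketch the full arrays `X` and `y` are passed in for simplicity. For data that do not fit in memory, the inner loop could instead read each batch from disk, since every update touches only the rows in `idx`.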

research
02/25/2022

Boosting Distributional Copula Regression

Capturing complex dependence structures between outcome variables (e.g.,...
research
07/18/2022

Boosting Multivariate Structured Additive Distributional Regression Models

We develop a model-based boosting approach for multivariate distribution...
research
08/09/2022

Copulaboost: additive modeling with copula-based model components

We propose a type of generalised additive models with of model component...
research
06/16/2020

Distributional (Single) Index Models

A Distributional (Single) Index Model (DIM) is a semi-parametric model f...
research
07/10/2022

Energy Trees: Regression and Classification With Structured and Mixed-Type Covariates

The continuous growth of data complexity requires methods and models tha...
research
10/07/2021

Accelerated Componentwise Gradient Boosting using Efficient Data Representation and Momentum-based Optimization

Componentwise boosting (CWB), also known as model-based boosting, is a v...
research
02/04/2023

mixdistreg: An R Package for Fitting Mixture of Experts Distributional Regression with Adaptive First-order Methods

This paper presents a high-level description of the R software package m...
