A copula-based boosting model for time-to-event prediction with dependent censoring

10/10/2022
by Alise Danielle Midtfjord, et al.

A characteristic feature of time-to-event data analysis is possible censoring of the event time. Most statistical learning methods for handling censored data are limited by the assumption of independent censoring, even though this can lead to biased predictions when the assumption does not hold. This paper introduces Clayton-boost, a boosting approach built upon the accelerated failure time model, which uses a Clayton copula to handle the dependency between the event and censoring distributions. By taking advantage of a copula, the independent censoring assumption is no longer needed. In comparisons with commonly used methods, Clayton-boost shows a strong ability to remove prediction bias in the presence of dependent censoring, and it outperforms the compared methods when either the dependency strength or the percentage of censoring is considerable. The encouraging performance of Clayton-boost shows that there are indeed reasons to be critical of the independent censoring assumption, and that real-world data could benefit greatly from modelling the potential dependency.
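To make the notion of dependent censoring concrete, the following is a minimal Python sketch of how a Clayton copula can couple an event time and a censoring time. It is not the Clayton-boost implementation. The function names, the exponential margins, the rate parameters, and the choice to apply the copula to the survival probabilities are all assumptions made for illustration only.

```python
import math
import random


def clayton_cdf(u, v, theta):
    """Clayton copula C(u, v) = (u^-theta + v^-theta - 1)^(-1/theta), theta > 0."""
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)


def sample_clayton_pair(theta, rng):
    """Draw (u, v) from a Clayton copula by conditional inversion."""
    u = 1.0 - rng.random()  # uniform on (0, 1], avoids u == 0
    w = 1.0 - rng.random()  # independent uniform on (0, 1]
    v = ((w ** (-theta / (1.0 + theta)) - 1.0) * u ** -theta + 1.0) ** (-1.0 / theta)
    return u, v


def simulate_censored(n, theta, rate_event=1.0, rate_cens=0.5, seed=0):
    """Simulate dependent event and censoring times (hypothetical setup).

    Assumption: exponential margins, with (u, v) treated as survival
    probabilities, so t = -log(u) / rate_event and c = -log(v) / rate_cens.
    Returns rows of (t, c, observed_time, event_observed).
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        u, v = sample_clayton_pair(theta, rng)
        t = -math.log(u) / rate_event
        c = -math.log(v) / rate_cens
        rows.append((t, c, min(t, c), t <= c))
    return rows


def empirical_kendall_tau(xs, ys):
    """Plain O(n^2) Kendall's tau, assuming no ties (continuous data)."""
    n = len(xs)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (xs[i] - xs[j]) * (ys[i] - ys[j])
            score += 1 if s > 0 else -1
    return 2.0 * score / (n * (n - 1))


if __name__ == "__main__":
    theta = 2.0
    rows = simulate_censored(800, theta, seed=1)
    tau = empirical_kendall_tau([r[0] for r in rows], [r[1] for r in rows])
    censored = sum(1 for r in rows if not r[3]) / len(rows)
    print(f"theoretical tau = {theta / (theta + 2.0):.3f}, empirical tau = {tau:.3f}")
    print(f"fraction censored = {censored:.3f}")
```

For a Clayton copula, Kendall's tau equals θ/(θ+2), so larger θ means stronger dependence between event and censoring times, and θ approaching 0 recovers independent censoring. In real data only the minimum of the two times and the event indicator are observed, which is why a method that assumes independence can be biased when θ is far from 0.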
