Scaling Survival Analysis in Healthcare with Federated Survival Forests: A Comparative Study on Heart Failure and Breast Cancer Genomics

08/04/2023
by Alberto Archetti, et al.

Survival analysis is a fundamental tool in medicine, modeling the time until an event of interest occurs in a population. However, in real-world applications, survival data are often incomplete, censored, distributed, and confidential, especially in healthcare settings where privacy is critical. This scarcity of data can severely limit the scalability of survival models to distributed applications that rely on large data pools. Federated learning is a promising technique that enables machine learning models to be trained on multiple datasets without compromising user privacy, making it particularly well suited to the challenges of survival data and large-scale survival applications. Despite significant developments in federated learning for classification and regression, many directions remain unexplored in the context of survival analysis. In this work, we propose an extension of the Federated Survival Forest algorithm, called FedSurF++. This federated ensemble method constructs random survival forests in heterogeneous federations. Specifically, we investigate several new methods for sampling trees from client forests and compare the results with state-of-the-art survival models based on neural networks. The key advantage of FedSurF++ is its ability to achieve performance comparable to existing methods while requiring only a single communication round. The extensive empirical investigation yields a significant improvement from both the algorithmic and the privacy-preservation perspectives, making the original FedSurF algorithm more efficient, robust, and private. We also present results on two real-world datasets, demonstrating the success of FedSurF++ in real-world healthcare studies. Our results underscore the potential of FedSurF++ to improve the scalability and effectiveness of survival analysis in distributed settings while preserving user privacy.
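The general idea of building a federated forest from client forests in a single round can be illustrated with a minimal sketch. It is not the FedSurF++ implementation: the abstract does not specify the tree sampling methods, so the size-proportional allocation with uniform sampling used here is an assumption chosen for illustration. The names `allocate_trees`, `build_federated_forest`, and `predict_survival` are hypothetical, and a "tree" is modeled as any callable that returns a survival curve on a shared time grid.

```python
import random
from typing import Callable, List, Sequence

# Hypothetical stand-in for a trained survival tree: maps a feature vector
# to survival probabilities evaluated on a time grid shared by all clients.
Tree = Callable[[Sequence[float]], List[float]]


def allocate_trees(client_sizes: List[int], total_trees: int) -> List[int]:
    """Split total_trees among clients in proportion to dataset size.

    Assumption for illustration: proportional allocation with
    largest-remainder rounding, so the counts sum to total_trees.
    """
    n = sum(client_sizes)
    quotas = [total_trees * s / n for s in client_sizes]
    counts = [int(q) for q in quotas]
    leftovers = total_trees - sum(counts)
    # Give the remaining trees to the clients with the largest remainders.
    order = sorted(range(len(quotas)),
                   key=lambda i: quotas[i] - counts[i], reverse=True)
    for i in order[:leftovers]:
        counts[i] += 1
    return counts


def build_federated_forest(client_forests: List[List[Tree]],
                           client_sizes: List[int],
                           total_trees: int,
                           seed: int = 0) -> List[Tree]:
    """Single round: each client contributes a sample of its local trees."""
    rng = random.Random(seed)
    counts = allocate_trees(client_sizes, total_trees)
    forest: List[Tree] = []
    for trees, k in zip(client_forests, counts):
        # Uniform sampling without replacement from the local forest,
        # capped at the number of trees the client actually has.
        forest.extend(rng.sample(trees, min(k, len(trees))))
    return forest


def predict_survival(forest: List[Tree], x: Sequence[float]) -> List[float]:
    """Ensemble prediction: pointwise mean of the trees' survival curves."""
    curves = [tree(x) for tree in forest]
    return [sum(vals) / len(curves) for vals in zip(*curves)]
```

In this sketch the only communication is each client sending its sampled trees to the server once, which mirrors the single-round property described in the abstract. Any privacy mechanism applied to the trees themselves is outside its scope.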

research
02/06/2023

Federated Survival Forests

Survival analysis is a subfield of statistics concerned with modeling th...
research
01/16/2021

Deep Cox Mixtures for Survival Regression

Survival analysis is a challenging variation of regression modeling beca...
research
01/28/2023

Heterogeneous Datasets for Federated Survival Analysis Simulation

Survival analysis studies time-modeling techniques for an event of inter...
research
07/12/2022

FedPseudo: Pseudo value-based Deep Learning Models for Federated Survival Analysis

Survival analysis, time-to-event analysis, is an important problem in he...
research
10/13/2021

Metaparametric Neural Networks for Survival Analysis

Survival analysis is a critical tool for the modelling of time-to-event ...
research
05/09/2019

Beta Survival Models

This article analyzes the problem of estimating the time until an event ...
research
07/12/2022

Pseudo value-based Deep Neural Networks for Multi-state Survival Analysis

Multi-state survival analysis (MSA) uses multi-state models for the anal...
