Random forests for high-dimensional longitudinal data

01/31/2019
by Louis Capitaine et al.

Random forests are a state-of-the-art supervised machine learning method that behaves well in high-dimensional settings, although limitations may arise when p, the number of predictors, is much larger than n, the number of observations. Repeated measurements can help by providing additional information, but no approach has been proposed for high-dimensional longitudinal data. Random forests have been adapted to standard (i.e., n > p) longitudinal data through a semi-parametric mixed-effects model in which the non-parametric part is estimated using random forests. We first propose a stochastic extension of this model that allows the covariance structure to vary over time. We then develop a new method that takes intra-individual covariance into account when building the forest. In simulations, our approach outperforms existing ones. The method was applied to an HIV vaccine trial including 17 HIV-infected patients with 10 repeated measurements of 20,000 gene transcripts and the blood concentration of human immunodeficiency virus RNA at the time of antiretroviral interruption. The approach selected 21 gene transcripts whose association with HIV viral load was relevant and consistent with results observed during primary infection.
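
As a rough illustration of the kind of semi-parametric mixed-effects model described above, one possible way to write it is sketched below. The notation is an assumption introduced here, not taken from the abstract: Y_ij is the response of individual i at measurement j, X_ij the predictors, f the non-parametric mean function estimated by the forest, Z_ij the random-effects covariates, b_i the random effects, and omega_i a zero-mean stochastic process that lets the covariance change over time.

```latex
% Sketch under assumed notation; the exact formulation may differ in the full text.
\begin{align*}
  Y_{ij} &= f(X_{ij}) + Z_{ij}^{\top} b_i + \omega_i(t_{ij}) + \varepsilon_{ij},
    \qquad i = 1, \dots, n, \quad j = 1, \dots, n_i, \\
  b_i &\sim \mathcal{N}(0, B), \qquad
  \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2), \qquad
  \omega_i \ \text{a zero-mean Gaussian process.}
\end{align*}
```

Under this sketch, setting omega_i to zero gives a model whose covariance structure does not vary over time, which is one reading of the non-stochastic model that the abstract says is being extended.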


research
08/31/2021

Double Machine Learning for Partially Linear Mixed-Effects Models with Repeated Measurements

Traditionally, spline or kernel approaches in combination with parametri...
research
07/18/2021

Sparse group variable selection for gene-environment interactions in the longitudinal study

Penalized variable selection for high dimensional longitudinal data has ...
research
08/11/2022

Random survival forests for competing risks with multivariate longitudinal endogenous covariates

Predicting the individual risk of a clinical event using the complete pa...
research
06/04/2019

Fréchet random forests

Random forests are a statistical learning method widely used in many are...
research
06/17/2020

FREEtree: A Tree-based Approach for High Dimensional Longitudinal Data With Correlated Features

This paper proposes FREEtree, a tree-based method for high dimensional l...
research
08/08/2022

A review on longitudinal data analysis with random forest in precision medicine

Precision medicine provides customized treatments to patients based on t...
research
08/17/2023

Estimating Mean Viral Load Trajectory from Intermittent Longitudinal Data and Unknown Time Origins

Viral load (VL) in the respiratory tract is the leading proxy for assess...
