CausalDeepCENT: Deep Learning for Causal Prediction of Individual Event Times
Deep learning (DL) has recently drawn much attention in image analysis, natural language processing, and high-dimensional medical data analysis. Under the causal directed acyclic graph (DAG) interpretation, the input variables without incoming edges from parent nodes in the DL architecture may be assumed to be randomized and independent of each other. As in a regression setting, including the input variables in the DL algorithm would reduce the bias from potential confounders. However, failing to account for a potential latent causal structure among the input variables that affects both treatment assignment and the output variable could be an additional significant source of bias. The primary goal of this study is to develop new DL algorithms to estimate causal individual event times for time-to-event data or, equivalently, to estimate the causal time-to-event distribution with or without right censoring, accounting for the potential latent structure among the input variables. Once the causal individual event times are estimated, it is straightforward to estimate the causal average treatment effect as the difference in the averages of the estimated causal individual event times. A connection is made between the proposed method and targeted maximum likelihood estimation (TMLE). Simulation studies are performed to assess the improvement in the prediction ability of the proposed methods, using a mean squared error (MSE)-based metric and the rank-based C-index. The simulation results indicate that the improvement in prediction accuracy can be substantial, particularly when there is a collider among the input variables. The proposed method is illustrated with a publicly available and influential breast cancer data set. The proposed method has been implemented in PyTorch and is available at https://github.com/yicjia/CausalDeepCENT.
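As a rough illustration of two quantities mentioned above, the following sketch computes an average treatment effect as the difference in the averages of estimated individual event times, and a Harrell-style C-index for right-censored data. It is a minimal sketch and not the code from the CausalDeepCENT repository: the function names, the input layout, and the toy numbers are all hypothetical, and pairs with tied observed times are simply skipped, which is a simplification.

```python
from itertools import combinations


def average_treatment_effect(t_treated, t_control):
    """Difference in mean estimated event times (treated minus control).

    Both lists hold estimated event times for the same individuals under
    each treatment arm (hypothetical input layout).
    """
    if not t_treated or len(t_treated) != len(t_control):
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(t_treated)
    return sum(t_treated) / n - sum(t_control) / n


def concordance_index(times, predicted, observed):
    """Harrell-style C-index for predicted event times.

    times:     observed follow-up times
    predicted: predicted event times (larger means longer survival)
    observed:  True if the event was observed, False if right-censored
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Let `a` be the individual with the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the shorter time is an actual event;
        # tied times are skipped here for simplicity.
        if times[a] == times[b] or not observed[a]:
            continue
        comparable += 1
        if predicted[a] < predicted[b]:
            concordant += 1.0
        elif predicted[a] == predicted[b]:
            concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy numbers, invented for illustration only.
ate = average_treatment_effect([5.0, 7.0, 9.0], [4.0, 5.0, 6.0])
c_index = concordance_index(
    times=[2, 4, 6, 8],
    predicted=[1.5, 5.0, 4.5, 9.0],
    observed=[True, True, False, True],
)
print(ate, c_index)
```

In the toy data, five pairs are comparable (the pair whose shorter time is censored is excluded) and four of them are ordered correctly by the predictions, so the C-index is 0.8; the average treatment effect is 7.0 - 5.0 = 2.0.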