
Semi-Supervised Off-Policy Reinforcement Learning
Reinforcement learning (RL) has shown great success in estimating sequential treatment strategies that account for patient heterogeneity. However, health-outcome information is often not well coded but rather embedded in clinical notes. Extracting precise outcome information is a resource-intensive task, so only small, well-annotated cohorts are available. We propose a semi-supervised learning (SSL) approach that efficiently leverages a small labeled data set ℒ with the true outcome observed and a large unlabeled data set 𝒰 with outcome surrogates W. In particular, we propose a theoretically justified SSL approach to Q-learning and develop a robust and efficient SSL approach to estimating the value function of the derived optimal STR, defined as the expected counterfactual outcome under the optimal STR. Generalizing SSL to learning STR brings interesting challenges. First, the feature distribution for predicting Y_t is unknown in the Q-learning procedure, as it includes the unknown Y_{t-1} due to the sequential nature of the problem. Our methods for estimating the optimal STR and its associated value function carefully adapt to this sequentially missing data structure. Second, we modify the SSL framework to handle surrogate variables W, which are predictive of the outcome through the joint law ℙ_{Y,O,W} but are not part of the conditional distribution of interest ℙ_{Y|O}. We provide theoretical results to understand when and to what degree efficiency can be gained from W and O. Our approach is robust to misspecification of the imputation models. Further, we provide a doubly robust value function estimator for the derived STR: if either the Q-functions or the propensity score functions are correctly specified, our value function estimators are consistent for the true value function.
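The double robustness property mentioned at the end of the abstract can be illustrated with a minimal sketch. This is not the paper's estimator: it assumes a single decision stage, a binary action, fully labeled outcomes, and a standard augmented inverse-probability-weighted (AIPW) form. The function name `dr_value`, its arguments, and the simulated data are hypothetical and chosen only for illustration.

```python
import numpy as np

def dr_value(y, a, policy_a, q_obs, q_policy, propensity):
    """Single-stage AIPW value estimate for a fixed policy (illustrative only).

    y          : observed outcomes
    a          : observed actions
    policy_a   : actions the evaluated policy would take
    q_obs      : model prediction Q(x_i, a_i) at the observed action
    q_policy   : model prediction Q(x_i, policy_a_i) at the policy's action
    propensity : model for P(A = a_i | x_i), the probability of the observed action
    """
    y = np.asarray(y, dtype=float)
    q_obs = np.asarray(q_obs, dtype=float)
    q_policy = np.asarray(q_policy, dtype=float)
    propensity = np.asarray(propensity, dtype=float)
    # Indicator that the observed action agrees with the policy.
    follows = (np.asarray(a) == np.asarray(policy_a)).astype(float)
    # Outcome-model term plus an inverse-probability-weighted residual correction.
    return float(np.mean(q_policy + follows / propensity * (y - q_obs)))

# Simulated data (an assumption of this sketch, not taken from the paper).
rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=n)
p1 = 1.0 / (1.0 + np.exp(-x))                     # true P(A = 1 | x)
a = rng.binomial(1, p1)
y = x + a * (1.0 + x) + rng.normal(size=n)        # treatment effect is 1 + x
policy_a = (1.0 + x > 0).astype(int)              # treat when the effect is positive

def q_true(x, a):
    return x + a * (1.0 + x)                      # correctly specified Q-function

prop_true = np.where(a == 1, p1, 1.0 - p1)        # correctly specified propensity

# Both nuisance models correct.
v_both = dr_value(y, a, policy_a, q_true(x, a), q_true(x, policy_a), prop_true)
# Q-function misspecified (identically zero), propensity correct.
v_bad_q = dr_value(y, a, policy_a, np.zeros(n), np.zeros(n), prop_true)
# Propensity misspecified (constant 0.5), Q-function correct.
v_bad_ps = dr_value(y, a, policy_a, q_true(x, a), q_true(x, policy_a), np.full(n, 0.5))

print(v_both, v_bad_q, v_bad_ps)
```

Under this simulation the three estimates should land near the same value, because each configuration keeps at least one of the two nuisance models correct. Setting both models wrong at once would generally break this agreement.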