1 Introduction
T1D impacts more than 1,000,000 people in the US (Menke et al., 2013b). Patients are often diagnosed with T1D as children, and for the rest of their lives they are responsible for calculating and administering doses of insulin, a potent, lethal drug (Subramanian et al., 2000). The consequences of poor T1D management are severe. Patients who receive too little insulin experience hyperglycemia, a condition which leads to permanent organ damage. On the other hand, excessive doses of insulin cause hypoglycemia, a potentially fatal condition (Nathan et al., 2005). Modern treatment systems automatically monitor blood glucose levels and deliver insulin to T1D patients (Subramanian et al., 2000). Commonly known as artificial pancreata (APs), these devices combine an insulin pump with a continuous glucose monitor (CGM) operating in a closed-loop manner to control blood glucose levels.
Fundamentally, the system defined by a T1D patient controlled with an AP is underactuated. Specifically, APs can only control insulin delivery, which lowers blood glucose levels; APs do not currently have any way to increase blood glucose levels. Typical methods to increase blood glucose levels, like eating food, are unavailable when the patient is sleeping or otherwise incapacitated. As such, AP algorithms are conservative in order to prevent hypoglycemic episodes for which the system has no direct actuation authority to escape. In practice, poor performance (e.g. hyper- or hypoglycemic events occurring too frequently) is common enough at a population level that more than 1,000 patients with T1D have modified the algorithms on existing AP systems with unregulated open-source software (Lee et al., 2017). The patients in this group have anecdotally shown superior outcomes compared with the standard of care, and yet we do not accurately know how safe these systems truly are.
In this work, we aim to uncover the probability of failure for an AP under a data-driven distribution of T1D patient behavior and physiology. Specifically, we consider the scenario of overnight fasting, a dangerous time for T1D patients. Using an FDA-approved simulator for T1D patients' physiology, we investigate the last evening meal and overnight fasting period. Our data-driven distribution consists of the carbohydrate composition of the evening meal, the fasting duration, and internal physiological parameters for a T1D patient. We use adaptive importance sampling to iteratively learn an estimator that efficiently learns the (rare) probability of a hypoglycemic event. Compared to naive Monte Carlo sampling, our adaptive importance sampling method increases the frequency of sampling rare hypoglycemic events and more accurately estimates the probability of these events. Indeed, for the same number of samples, we find 2-10 times as many hypoglycemic events, and our estimates of the probability of these events have 2-4 times smaller variance.
2 Simulator & Population Modeling
Simulator
We use an implementation of the 2008 UVa/Padova simulator (Dalla Man et al., 2007) for simulating T1D patients. This simulator is composed of a system of ordinary differential equations modeling the internal dynamics of a patient. The system is composed primarily of three subsystems: glucose physiology, insulin physiology, and carbohydrate ingestion physiology. The simulator has been approved for preclinical trials by the FDA (Bergenstal et al., 2016).
Eating and Overnight Fasting Behavior
Because we are concerned with the overnight behavior of an AP, we build a distribution of evening meals and overnight fasting times of T1D patients. Unfortunately, there are few publicly available surveys of T1D behavior. Therefore, we estimate a sample using the National Health and Nutrition Examination Survey (NHANES), a representative cross-sectional survey of the US population (National Center for Health Statistics (NCHS), 2018). Nevertheless, even NHANES does not explicitly classify whether patients have type 1 or type 2 diabetes. As outlined by Menke et al. (2013a), we define T1D individuals in NHANES as those who started insulin within 1 year of diabetes diagnosis, are currently using insulin, and were diagnosed with diabetes under age 40. NHANES collects two consecutive days' worth of eating behavior for survey participants, including mealtimes as well as nutritional and caloric information for each meal. We define meals as ingestion of food that contains at least 1 calorie, and we define overnight fasting as the longest period of fasting greater than 5 hours that starts after 4pm on the first day and ends before 1pm on the second day. We fit a logit-normal distribution to these two variables since it can approximate the shape of beta and normal distributions while also having compact support. The latter property is important since we want to prevent unrealistic tail behavior. The details of the model are in Appendix A.1.
Physiological Parameters
Thirty sample T1D patients (subpopulations of 10 children, 10 adolescents, and 10 adults) are available for use with the UVa/Padova simulator (Dalla Man et al., 2007). The rarity of data for T1D necessitates a more personalized approach to analyzing risk and certifying the safety of artificial pancreata. As such, we build generative models for physiological parameters focusing on small variations in the 61 parameters per patient, since we know that each of the thirty realizations of parameters is in fact realistic. We design logit-normal distributions around each patient's parameters by setting the compact interval range to 1/10 of the interval range for the entire subpopulation, centered at the patient's parameters. The details can be found in Appendix A.2. The covariance structure we derive encodes the fact that we have no knowledge of the covariance of the small variations in the 61 parameters for an individual patient.
3 Rare-event simulation
Personalized clinical evaluation of an AP for a particular patient profile—especially one modified from its original settings—is infeasible due to the high costs and inherent dangers posed to trial subjects. Formal verification methods are challenging to apply due to the difficulty of specifying the operating domain. For example, a patient may have run a marathon the previous day, dramatically increasing her insulin sensitivity.
Instead of introducing completely ad-hoc restrictions to the set of scenarios under consideration, we take a data-driven, probabilistic approach. We posit a base distribution $P_0$ on the set of initial conditions $\mathcal{X}$, and denote by $f : \mathcal{X} \to \mathbb{R}$ a continuous measure of risk; a natural measure for an artificial pancreas is given by the minimum blood glucose level over a period of interest. Then, our goal is to evaluate the probability of an adverse event
$p_\gamma := P_0(f(X) \le \gamma)$
based on samples and rollouts from our simulator. Here, the parameter $\gamma$ is a threshold that defines an "adverse event"; if $f(X)$ denotes the minimum glucose level over a rollout, then $\gamma = 70$ mg/dL is the threshold for a clinical definition of a hypoglycemic event (Cryer, 2010).
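The adverse-event probability defined above can in principle be estimated by naive Monte Carlo: draw scenarios from the base distribution, roll out the simulator, and average the adverse-event indicator. A minimal sketch, where the one-dimensional risk function and standard-normal base distribution are toy stand-ins for the simulator and $P_0$:

```python
import numpy as np

def naive_mc_estimate(sample_scenario, risk, gamma, n):
    """Estimate P0(risk(X) <= gamma) by averaging the indicator over n rollouts."""
    hits = sum(risk(sample_scenario()) <= gamma for _ in range(n))
    return hits / n

# Toy stand-in: risk is the identity and P0 is a standard normal,
# so the true probability is Phi(-2), roughly 0.023.
rng = np.random.default_rng(0)
p_hat = naive_mc_estimate(rng.standard_normal, lambda x: x, -2.0, 100_000)
```

With a truly rare event, on the order of $1/p_\gamma$ rollouts are needed just to observe a single adverse event, which is what motivates the adaptive importance sampling machinery described in this section.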
Our probabilistic approach rests on the notion of a base distribution $P_0$, which could introduce subjective valuations of what constitutes a realistic scenario. To alleviate these issues in a systematic manner, we use data-driven models whenever population-level distributions are deemed to be meaningful at the individual level as well (e.g. amount of carbohydrates consumed per meal). To model variations in personalized measurements, we account for sensor noise and uncertainty in physiological parameters by postulating a logit-normal distribution with small amounts of variance. We combine: (1) a data-driven estimation approach for core exogenous sources of uncertainty, e.g. meals and fasting; (2) autoregressive moving average process noise for sensor measurements; (3) white noise added to estimates of unobservable physiological parameters. Through this effort we are able to represent three different sources of randomness in our base distribution $P_0$.
For reliable control algorithms, an adverse event will be rare, and the probability $p_\gamma$ close to 0. We treat this as a rare-event simulation problem (see (Asmussen and Glynn, 2007, Chapter VI) for an overview of this topic), and use adaptive importance sampling techniques to accelerate our evaluation. To address the shortcomings of the naive Monte Carlo method for estimating rare-event probabilities, we use an adaptive importance sampling approach (Asmussen and Glynn, 2007). Our goal is to find an importance sampling distribution that produces estimates with low variance. The optimal importance-sampling distribution for estimating $p_\gamma$ is given by the conditional density $p^*(x) = p_0(x) 1\{f(x) \le \gamma\} / p_\gamma$, where $p_0$ is the density function of $P_0$. Indeed, since the likelihood ratio $p_0(x)/p^*(x) = p_\gamma$ whenever $f(x) \le \gamma$, a single sample from $P^*$ yields an exact estimate. However, sampling from this distribution requires knowledge of $p_\gamma$, the quantity under estimation. Instead, we consider a family of parameterized importance sampling distributions $\{P_\theta\}_{\theta \in \Theta}$, and use a model-based optimization method that iteratively modifies $\theta$ to better approximate $P^*$.
In particular, we use the cross-entropy method (Rubinstein and Kroese, 2004), which iteratively approximates the projection of $P^*$ onto the class of parameterized distributions $\{P_\theta\}_{\theta \in \Theta}$. We use natural exponential families as our model class of importance samplers.
Since adverse events are rare, the cross-entropy method maintains a surrogate distribution $P^*_{\gamma_k}$, where $\gamma_k \ge \gamma$ is a (potentially random) sequence of alternative thresholds. Using this multilevel approach, at each iteration $k$ we use samples from our current iterate $P_{\theta_k}$ to update $\theta_{k+1}$ as an approximate projection of $P^*_{\gamma_k}$ onto $\{P_\theta\}_{\theta \in \Theta}$. Such a procedure is guided towards distributions that upweight regions of $\mathcal{X}$ with low values of $f$ (unsafe regions). To choose the level $\gamma_k$ at each iteration $k$, we use an empirical estimate of the $\rho$-quantile of $f(X)$, where $X \sim P_{\theta_k}$ (see Homem-de-Mello (2007) for other variants). Further details are provided in Appendix B. The cross-entropy method (Algorithm 1 in Appendix B) is a model-based heuristic for approximating the optimal importance sampler $P^*$. Empirically, we find that with a careful choice of hyperparameters, the cross-entropy method learns importance samplers that sample risky scenarios much more frequently. We observe significant improvements over the naive Monte Carlo method in our experiments, though we observe that the importance sampler focuses on a particular "risk mode" of overeating.
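To make the multilevel procedure concrete, the following toy sketch runs the cross-entropy method over a one-dimensional family of mean-shifted Gaussian samplers to estimate $P(X \le \gamma)$ for $X \sim N(0, 1)$. The model class, threshold, and hyperparameters ($\rho$, sample sizes) are illustrative, not the settings used in our experiments:

```python
import numpy as np

def cross_entropy_is(gamma=-4.0, rho=0.1, n=10_000, iters=20, seed=0):
    """Multilevel cross-entropy method over mean-shifted N(theta, 1) samplers."""
    rng = np.random.default_rng(seed)
    theta = 0.0  # start at the base distribution P0 = N(0, 1)
    for _ in range(iters):
        x = rng.normal(theta, 1.0, n)
        level = max(gamma, np.quantile(x, rho))         # empirical rho-quantile level
        w = np.exp(-0.5 * x**2 + 0.5 * (x - theta)**2)  # likelihood ratio p0 / p_theta
        elite = x <= level
        theta = np.average(x[elite], weights=w[elite])  # weighted-MLE projection
        if level <= gamma:                              # reached the true threshold
            break
    # final importance-sampling estimate of p_gamma
    x = rng.normal(theta, 1.0, n)
    w = np.exp(-0.5 * x**2 + 0.5 * (x - theta)**2)
    return float(np.mean(w * (x <= gamma)))

p_hat = cross_entropy_is()  # true value is Phi(-4), roughly 3.2e-5
```

The level sequence walks the sampler's mean toward the rare region, and the same kind of weighted maximum-likelihood update applies for any natural exponential family.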
4 Results and Discussion
We investigate hypoglycemic events as failure modes for AP systems. Recall that APs are underactuated systems which currently have the ability to lower, but not raise, blood glucose levels. As such, we choose to simulate an evening meal followed by an overnight fast to capture one of the most dangerous time periods for a T1D patient, a period when they are incapacitated and cannot react to hypoglycemia (Hovorka et al., 2011). There are three classes of patients: children (under age 13), adolescents (between ages 13 and 19), and adults (above age 19). As we show in the sequel, the frequency of hypoglycemic events varies considerably between children and both adolescents and adults.
We let the random variable $X$ be composed of the physiological parameters concatenated with the behavioral variables, such that a specific realization of $X$ defines a synthetic patient as well as their evening meal carbohydrate intake and overnight fasting time. We define the risk $f(X)$ as the minimum blood glucose level for a patient during their simulated overnight fast. We choose a subset of one child, one adolescent, and one adult from our population of 30 patients. For each patient, we compute both a naive Monte Carlo estimate and an estimate utilizing the framework described in Section 3.
Since the feature vector $X$ follows a logit-normal distribution, we posit the space of logit-normal distributions as the model space searched over by the cross-entropy method. Concretely, we consider logit-normal distributions with the same covariance structure. Formally, this is equivalent to taking a logistic transform of $X$ and searching over multivariate Gaussian distributions with the covariance estimated in Section 2. Specifically, we consider a search space of mean vectors centered at the mean estimated in Section 2, sized to maximize acceleration while retaining numerical stability of the likelihood ratios. We fix $\rho$, the number of iterations, and the number of samples per iteration of the cross-entropy method. We observe that $\rho$ is a crucial hyperparameter for achieving acceleration; we chose it based on the number of adverse events sampled for the adult patient, and used the same value throughout all age groups. We note that, in order to avoid overfitting to the samples drawn at each cross-entropy iteration, one has to increase the number of samples appropriately as $\rho$ approaches 0. Another crucial design choice that determines the performance of the cross-entropy method is the size of the search space. Based on a small preliminary experiment, we fix these choices for the child, adolescent, and adult patients.
To evaluate the performance of our learned importance sampling distribution $P_\theta$, we draw samples to simulate the minimum glucose level. In Figure 1, we find that the cross-entropy method learns to sample hypoglycemic events many times more frequently than the naive Monte Carlo method. We see in Figure 2 that our ability to sample more adverse events leads to estimators with smaller variance. Since both the naive Monte Carlo and importance sampling estimators are unbiased, the observed variance reduction implies that the cross-entropy method can significantly accelerate evaluation of adverse events. The relative variance reduction is especially pronounced for the patient in the child group, whose probability of the adverse event is the rarest of the three patients.
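The variance comparison can be illustrated numerically: with an identical per-estimate sample budget, repeating both estimators on a toy rare-event problem (a one-dimensional Gaussian risk with an illustrative threshold and mean-shifted sampler, not the simulator) shows the importance sampler's much smaller spread:

```python
import numpy as np

rng = np.random.default_rng(0)
gamma, theta, n, reps = -3.0, -3.0, 2_000, 200  # toy threshold and learned mean shift

naive, importance = [], []
for _ in range(reps):
    x = rng.normal(0.0, 1.0, n)                      # naive Monte Carlo under P0
    naive.append(np.mean(x <= gamma))
    x = rng.normal(theta, 1.0, n)                    # importance sampler N(theta, 1)
    w = np.exp(-0.5 * x**2 + 0.5 * (x - theta)**2)   # likelihood ratio p0 / p_theta
    importance.append(np.mean(w * (x <= gamma)))

variance_ratio = np.var(naive) / np.var(importance)
# both estimators are unbiased for P(X <= -3), roughly 1.35e-3,
# but the importance-sampling estimates are far less dispersed
```

Here the naive estimator sees only a handful of adverse events per repetition, while the shifted sampler sees them in roughly half its draws.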
References
 Asmussen and Glynn [2007] S. Asmussen and P. W. Glynn. Stochastic Simulation: Algorithms and Analysis. Springer, 2007.
 Bergenstal et al. [2016] R. M. Bergenstal, S. Garg, S. A. Weinzimer, B. A. Buckingham, B. W. Bode, W. V. Tamborlane, and F. R. Kaufman. Safety of a Hybrid Closed-Loop Insulin Delivery System in Patients With Type 1 Diabetes. JAMA, 316(13):1407–1408, Oct. 2016. ISSN 0098-7484. doi: 10.1001/jama.2016.11708. URL http://jamanetwork.com/journals/jama/fullarticle/2552454.
 Cryer [2010] P. E. Cryer. Hypoglycemia in type 1 diabetes mellitus. Endocrinology and Metabolism Clinics, 39(3):641–654, 2010.
 Dalla Man et al. [2007] C. Dalla Man, D. M. Raimondo, R. A. Rizza, and C. Cobelli. GIM, Simulation Software of Meal Glucose–Insulin Model. Journal of Diabetes Science and Technology, 1(3):323–330, May 2007. ISSN 1932-2968. URL https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2769591/.
 National Center for Health Statistics [2018] National Center for Health Statistics (NCHS). National Health and Nutrition Examination Survey Data. Centers for Disease Control and Prevention (CDC), https://wwwn.cdc.gov/nchs/nhanes/NhanesCitation.aspx, 2018.
 Friedman et al. [2008] J. Friedman, T. Hastie, and R. Tibshirani. Sparse inverse covariance estimation with the graphical lasso. Biostatistics, 9(3):432–441, 2008.
 Homem-de-Mello [2007] T. Homem-de-Mello. A study on the cross-entropy method for rare-event probability estimation. INFORMS Journal on Computing, 19(3):381–394, 2007.
 Hovorka et al. [2011] R. Hovorka, K. Kumareswaran, J. Harris, J. M. Allen, D. Elleri, D. Xing, C. Kollman, M. Nodale, H. R. Murphy, D. B. Dunger, S. A. Amiel, S. R. Heller, M. E. Wilinska, and M. L. Evans. Overnight closed loop insulin delivery (artificial pancreas) in adults with type 1 diabetes: crossover randomised controlled studies. The BMJ, 342, Apr. 2011. ISSN 09598138. doi: 10.1136/bmj.d1855. URL https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3077739/.
 Lee et al. [2017] J. M. Lee, M. W. Newman, A. Gebremariam, P. Choi, D. Lewis, W. Nordgren, J. Costik, J. Wedding, B. West, N. B. Gilby, C. Hannemann, J. Pasek, A. Garrity, and E. Hirschfeld. Real-World Use and Self-Reported Health Outcomes of a Patient-Designed Do-it-Yourself Mobile Technology System for Diabetes: Lessons for Mobile Health. Diabetes Technology & Therapeutics, 19(4):209–219, Apr. 2017. ISSN 1520-9156, 1557-8593. doi: 10.1089/dia.2016.0312. URL http://www.liebertpub.com/doi/10.1089/dia.2016.0312.
 Menke et al. [2013a] A. Menke, T. J. Orchard, G. Imperatore, K. M. Bullard, E. Mayer-Davis, and C. C. Cowie. The prevalence of type 1 diabetes in the United States. Epidemiology (Cambridge, Mass.), 24(5):773, 2013a.
 Menke et al. [2013b] A. Menke, T. J. Orchard, G. Imperatore, K. M. Bullard, E. Mayer-Davis, and C. C. Cowie. The Prevalence of Type 1 Diabetes in the United States. Epidemiology (Cambridge, Mass.), 24(5):773–774, Sept. 2013b. ISSN 1044-3983. doi: 10.1097/EDE.0b013e31829ef01a. URL https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4562437/.
 Nathan et al. [2005] D. M. Nathan, P. A. Cleary, J.-Y. C. Backlund, S. M. Genuth, J. M. Lachin, T. J. Orchard, P. Raskin, B. Zinman, and Diabetes Control and Complications Trial/Epidemiology of Diabetes Interventions and Complications (DCCT/EDIC) Study Research Group. Intensive diabetes treatment and cardiovascular disease in patients with type 1 diabetes. The New England Journal of Medicine, 353(25):2643–2653, Dec. 2005. ISSN 1533-4406. doi: 10.1056/NEJMoa052187.

 Rubinstein and Kroese [2004] R. Y. Rubinstein and D. P. Kroese. The Cross-Entropy Method: A Unified Approach to Monte Carlo Simulation, Randomized Optimization and Machine Learning. Information Science & Statistics, Springer Verlag, NY, 2004.
 Subramanian et al. [2000] S. Subramanian, D. Baidal, J. S. Skyler, and I. B. Hirsch. The Management of Type 1 Diabetes. In L. J. De Groot, G. Chrousos, K. Dungan, K. R. Feingold, A. Grossman, J. M. Hershman, C. Koch, M. Korbonits, R. McLachlan, M. New, J. Purnell, R. Rebar, F. Singer, and A. Vinik, editors, Endotext. MDText.com, Inc., South Dartmouth (MA), 2000. URL http://www.ncbi.nlm.nih.gov/books/NBK279114/.
Appendix A Generative Models
A.1 Eating and Overnight Fasting Behavior
We fit a distribution to the data in the following manner. Denoting by $Z_1$ the random variable for overnight fasting time and by $Z_2$ the carbohydrate intake of the last meal prior to fasting, we define the random variable $Z = (Z_1, Z_2)$ with parameters $(\mu, \Sigma, a, b)$ as follows:
(1a)  $Y \sim N(\mu, \Sigma)$
(1b)  $\sigma(Y) = 1/(1 + \exp(-Y))$
(1c)  $Z = a + (b - a) \odot \sigma(Y)$
where $N(\mu, \Sigma)$ is a multivariate normal distribution, and the operators $\sigma(\cdot)$, $\exp(\cdot)$, and $\odot$ (elementwise product) operate elementwise on vectors. The four parameters are fit from the empirical data distribution, with its corresponding survey weights placing weight $w_i$ on each datapoint $z_i$:
(2a)  $a = \min_i z_i$
(2b)  $b = \max_i z_i$
(2c)  $\mu = \sum_i w_i \, y_i$
(2d)  $\Sigma = \sum_i w_i (y_i - \mu)(y_i - \mu)^\top$
(2e)  $y_i = \sigma^{-1}\big((z_i - a) \oslash (b - a)\big)$
where $y_i$ in (2c) and (2d) is the logit transform (2e) of datapoint $z_i$, and $\min$, $\max$, $\sigma^{-1}(\cdot)$, and $\oslash$ (elementwise division) operate elementwise on vectors.
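The fitting and sampling procedure above can be sketched in a few lines. Here the synthetic data, uniform weights, and the use of the empirical min/max (with a small margin) for the bounds are illustrative assumptions, not the NHANES pipeline:

```python
import numpy as np

def fit_logitnormal(z, w):
    """Fit (a, b, mu, Sigma) of a logit-normal by weighted moments in logit space."""
    a, b = z.min(axis=0) - 1e-6, z.max(axis=0) + 1e-6  # keep data strictly inside (a, b)
    y = np.log((z - a) / (b - z))                      # elementwise logit transform
    w = w / w.sum()                                    # normalize survey weights
    mu = w @ y
    sigma = (y - mu).T @ ((y - mu) * w[:, None])       # weighted covariance
    return a, b, mu, sigma

def sample_logitnormal(a, b, mu, sigma, n, rng):
    y = rng.multivariate_normal(mu, sigma, n)
    return a + (b - a) / (1.0 + np.exp(-y))            # back through the sigmoid

rng = np.random.default_rng(0)
z = rng.uniform([5.0, 20.0], [12.0, 150.0], size=(500, 2))  # toy (hours, grams) data
a, b, mu, sigma = fit_logitnormal(z, np.ones(500))
samples = sample_logitnormal(a, b, mu, sigma, 1000, rng)
# every sample lies strictly inside the compact box (a, b)
```

The compact support is automatic: the sigmoid maps all of logit space into $(a, b)$, which is exactly the tail-behavior guarantee motivating the logit-normal choice.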
A.2 Physiological Parameters
We attempted to build a generative model of the 61 parameters required by the simulator by fitting a logit-normal distribution to each of the three subpopulations of parameters in a manner similar to that in Equation (2). Namely, we fit $a$, $b$, and $\mu$ using the same approach as in Equations (2a), (2b), and (2c). Due to the high dimensionality of the data and the low sample size, we replace Equation (2d) with the graphical lasso algorithm Friedman et al. [2008]. This method estimates the inverse covariance matrix $\Theta = \Sigma^{-1}$ with the following convex optimization problem Friedman et al. [2008]:
$\mathrm{maximize}_{\Theta \succ 0} \;\; \log \det \Theta - \mathrm{tr}(S\Theta) - \lambda \|\Theta\|_1,$
where $S$ is the empirical covariance matrix and $\lambda \ge 0$ is a regularization parameter.
However, roughly 40% of 1,000,000 synthetic patients drawn from this model spontaneously become dangerously hypoglycemic in short periods of time without taking insulin or eating any meals, implying that the distribution is unrealistic (at least for use with the simulator). Attempts at refining the sampling scheme using convex hulls around the 60% of realistic synthetic patients were not successful due to the high dimensionality of the space. Because the interdependencies between the 61 parameters are evidently paramount to ensuring realistic synthesis of patients, we move away from attempting to build accurate subpopulation distributions given only 10 sample patients per subpopulation.
As noted in the main text, we design logit-normal distributions around each patient's parameters by setting the compact interval range to 1/10 of the interval range for the entire subpopulation, centered at the patient's parameters. Namely, for patient $j$ with parameter vector $\theta_j$, we have:
(3a)  $a_j = \theta_j - (b - a)/20$
(3b)  $b_j = \theta_j + (b - a)/20$
where the subtraction and division operate elementwise over vectors.^{1}^{1}1For some dimensions, $a_j$ must be set as $\max(a_j, 0)$ since the parameter must be nonnegative. Then, we choose the mean and covariance as follows:
(3c)  $\mu_j = 0$
(3d)  $\Sigma_j = I$
This covariance structure encodes the fact that we have no knowledge of the covariance of the minute variations in the 61 parameters for an individual patient; it approximates a normal distribution with roughly 99% of the probability mass in the interval $(a_j, b_j)$.
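The per-patient perturbation model above, i.e. a logit-normal on a compact box whose width is 1/10 of the subpopulation range and which is centered at the patient's parameters, can be sketched as follows. The toy parameter vector and subpopulation bounds are illustrative, not the simulator's 61 parameters:

```python
import numpy as np

def perturb_patient(theta, a_pop, b_pop, n, rng):
    """Draw n logit-normal perturbations of a patient's parameter vector theta."""
    half = (b_pop - a_pop) / 20.0             # half of 1/10 the subpopulation range
    a_j = np.maximum(theta - half, 0.0)       # clamp: parameters must be nonnegative
    b_j = theta + half
    y = rng.standard_normal((n, theta.size))  # N(0, I) in logit space
    return a_j + (b_j - a_j) / (1.0 + np.exp(-y))

rng = np.random.default_rng(0)
theta = np.array([0.05, 3.2, 180.0])          # toy 3-dimensional parameter vector
a_pop = np.array([0.0, 1.0, 100.0])
b_pop = np.array([0.2, 6.0, 300.0])
variants = perturb_patient(theta, a_pop, b_pop, 1000, rng)
# each row is a synthetic patient whose parameters stay inside (a_j, b_j)
```

Because each realization stays inside a narrow box around a known-realistic patient, the spontaneous-hypoglycemia failure mode of the subpopulation-level model is avoided by construction.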
Appendix B The Cross-entropy method
Concretely, consider the following updates to the parameter vector $\theta_k$ at iteration $k$: compute projections of a mixture of $P^*_{\gamma_k}$ and $P_{\theta_k}$ onto $\{P_\theta\}_{\theta \in \Theta}$:
(4)  $\theta_{k+1} = \mathrm{argmax}_{\theta \in \Theta} \;\; \alpha \, \mathbb{E}_{P^*_{\gamma_k}}[\log p_\theta(X)] + (1 - \alpha) \, \mathbb{E}_{P_{\theta_k}}[\log p_\theta(X)]$
However, the term $\mathbb{E}_{P^*_{\gamma_k}}[\log p_\theta(X)]$ is unknown, so we use an empirical approximation. For $X_i \sim P_{\theta_k}$ drawn i.i.d., letting $\hat{\gamma}_k$ be the empirical $\rho$-quantile of $\{f(X_i)\}_{i=1}^N$ and
(5)  $D_k(\theta) = \frac{\alpha}{N} \sum_{i=1}^N \frac{p_0(X_i)}{p_{\theta_k}(X_i)} 1\{f(X_i) \le \hat{\gamma}_k\} \log p_\theta(X_i) + \frac{1 - \alpha}{N} \sum_{i=1}^N \log p_\theta(X_i),$
we use $\theta_{k+1} = \mathrm{argmax}_{\theta \in \Theta} D_k(\theta)$ in place of the idealized update (4). Summarizing this procedure, we obtain Algorithm 1; as our final importance sampler, we choose the iterate $\theta_k$ with the lowest empirical quantile $\hat{\gamma}_k$ of $f$.
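For a Gaussian model class (as arises after the logistic transform in Section 4), the maximizer of an objective of the form (5) has a closed form: a weighted mean and covariance, where each sample carries weight proportional to $\alpha \cdot (p_0/p_{\theta_k}) \cdot 1\{f \le \hat{\gamma}_k\} + (1 - \alpha)$. A hedged sketch with illustrative names:

```python
import numpy as np

def ce_gaussian_update(x, logp0, logptheta, f_vals, level, alpha=0.9):
    """Closed-form weighted-MLE update of (mu, Sigma) for a Gaussian sampler."""
    lr = np.exp(logp0 - logptheta)                   # likelihood ratios p0 / p_theta
    w = alpha * lr * (f_vals <= level) + (1.0 - alpha)
    w = w / w.sum()                                  # normalize the weights
    mu = w @ x                                       # weighted mean
    diff = x - mu
    sigma = diff.T @ (diff * w[:, None])             # weighted covariance
    return mu, sigma

# Toy usage: sampling from P0 itself (logp0 == logptheta) with alpha = 1
# projects onto the conditional distribution given {f(x) <= level}.
rng = np.random.default_rng(0)
x = rng.standard_normal((1_000, 1))
logp = -0.5 * x[:, 0] ** 2
mu, sigma = ce_gaussian_update(x, logp, logp, x[:, 0], 0.0, alpha=1.0)
# mu approximates E[X | X <= 0] for a standard normal (about -0.8)
```

The $(1 - \alpha)$ term acts as a smoothing mixture toward the current sampler, which guards against degenerate updates when few samples fall below the level.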