Deep Learning Models to Predict Pediatric Asthma Emergency Department Visits

07/25/2019 ∙ by Xiao Wang, et al. ∙ PCCI Parkland Health & Hospital System

Pediatric asthma is the most prevalent chronic childhood illness, afflicting about 6.2 million children in the United States. However, asthma can be better managed by identifying and avoiding triggers and by educating patients about medications and proper disease management strategies. This research uses deep learning to predict asthma-related emergency department (ED) visits within 3 months from Medicaid claims data. We compare the predictions against a traditional statistical classification model, a penalized Lasso logistic regression, which we trained and have deployed since 2015. The results indicate that the deep learning model, an Artificial Neural Network (ANN), slightly outperforms (AUC = 0.845) the Lasso logistic regression (AUC = 0.842). The difference may stem from the nonlinear nature of the ANN.




1 Introduction

More than million Americans have asthma, which affects of adults and of children [6]. Asthma prevalence has been increasing since the early s in all age, sex and racial groups. Currently, asthma afflicts about 6.2 million children under the age of in the United States [34], making it the leading chronic childhood illness. Asthma accounts for million doctor’s office visits, discharges from hospital inpatient care and million emergency department (ED) visits each year [7]. From to , the annual economic cost of asthma was more than billion, including billion in medical costs [23]. The median annual medical cost of asthma was per child in , with a range of in Arizona to in Michigan [22]. Asthma disproportionately affects low-income, minority and Medicaid-insured children, whose homes have been found to have increased condition issues [25]. There is no cure for asthma, but studies show that it can be managed with proper disease management and adequate medical treatment, which can effectively improve asthma-related outcomes and reduce health care costs [15, 17].

Asthma is an ambulatory care-sensitive condition, and most exacerbations that lead to ED visits or hospitalizations are avoidable [9]. Timely identification of high-risk asthma patients and proactive interventions are the key to improving asthma care in the long term. Risk factors for asthma-related adverse events have been extensively studied, and multi-factorial risk predictive models are playing an increasingly recognized role in optimizing value by focusing asthma care on those at greatest risk [1, 31]. These models are mostly trained as traditional statistical models, including logistic regressions, k-nearest neighbors, decision trees and support vector machines.


Parkland Center for Clinical Innovation (PCCI) developed a Lasso logistic regression model in 2015 to predict asthma ED visits in the following 3 months for children under years old, using clinical, health services utilization and socio-demographic variables from Medicaid claims data. Compared to the published predictive models, our model has higher clinical relevance [33], shows decent predictive accuracy [12], is derived from relatively large populations [28], and is well evaluated [2]. We have been sending monthly alert reports to providers that identify predicted high-risk asthma patients, and we have inserted a Best Practice Alert (BPA) into the Epic system. The aim is to reduce unnecessary hospital utilization and cost, increase patient adherence to medication and clinic visits, and improve the overall health care experience.

In this research, we continued to focus on Medicaid pediatric patients from Parkland Community Health Plan (PCHP), a Medicaid health maintenance organization (HMO) in north Texas, which provided the study setting. Our primary goal was to use administrative claims data for a Medicaid-enrolled pediatric patient population to train and test deep learning models that forecast the risk of asthma-related ED visits or hospitalizations within the next 3 months, and to evaluate whether these models could outperform the Lasso logistic regression already in practice. Our Lasso logistic regression model served as the baseline against which the deep learning results were compared. To the best of our knowledge, this approach has not previously been attempted for this use case with administrative claims data.

2 Background

By the Council of State and Territorial Epidemiologists (CSTE) definition, a patient has “probable” asthma if s/he had at least one ED visit, hospitalization or outpatient visit with a primary diagnosis of asthma, or at least one asthma medication prescription, in the preceding months [32]. We used this CSTE standard to identify the original cohort population in this study.
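The CSTE rule above can be read as a simple predicate over a patient's claims. The following is a minimal sketch under stated assumptions: the `Claim` record and its field names are hypothetical (they are not the PCHP claims schema), and the caller is assumed to have already restricted the claims to the look-back window.

```python
from dataclasses import dataclass


@dataclass
class Claim:
    """One claim line; field names are hypothetical, not the PCHP schema."""
    kind: str                        # "ed", "inpatient", "outpatient" or "pharmacy"
    primary_dx_asthma: bool = False  # encounter has a primary diagnosis of asthma
    asthma_medication: bool = False  # pharmacy claim is for an asthma medication


ENCOUNTER_KINDS = {"ed", "inpatient", "outpatient"}


def is_probable_asthma(claims):
    """Return True if any claim in the look-back window meets the CSTE rule."""
    for claim in claims:
        # At least one encounter with a primary diagnosis of asthma ...
        if claim.kind in ENCOUNTER_KINDS and claim.primary_dx_asthma:
            return True
        # ... or at least one asthma medication prescription.
        if claim.kind == "pharmacy" and claim.asthma_medication:
            return True
    return False
```

A single qualifying claim is enough, which is why this definition is weaker than the HEDIS persistent asthma definition discussed next.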

Health systems and clinicians have relied on traditional reporting tools for asthma case management and risk assessment to improve care quality. The Healthcare Effectiveness Data and Information Set (HEDIS) definition of persistent asthma is a commonly used set of criteria [4, 13, 21]. It combines checks of a patient’s asthma-related ED visits, hospitalizations and outpatient visits with asthma medication dispensing events. We refer to [32] for the detailed medication prescription criteria. Note that the HEDIS persistent asthma definition is a stronger condition than the CSTE “probable” asthma definition. The asthma medication ratio (AMR) is also widely used to identify high-risk asthma patients [2, 5]. It is the ratio of controller medications to total asthma medications:

$$\mathrm{AMR} = \frac{\text{controller medications}}{\text{controller medications} + \text{reliever medications}}$$

A low AMR is usually associated with a higher risk of the patient ending up in the hospital with an acute asthma exacerbation in the following several months [2, 30, 29]. Both the HEDIS and AMR criteria are typically computed from a -month time cycle. Neither of them addresses socio-demographic or comorbidity factors.
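As a minimal sketch, the AMR can be computed as the ratio of controller medications to total (controller plus reliever) asthma medications, the ratio named in the titles of [5] and [29]. The function name and the choice to return `None` when no asthma medication was dispensed are illustrative assumptions, not part of the deployed model.

```python
from typing import Optional


def asthma_medication_ratio(controller_units: int, reliever_units: int) -> Optional[float]:
    """Controller-to-total asthma medication ratio over one measurement window.

    Returns None when no asthma medication was dispensed (an assumption made
    here so that the ratio is never computed with a zero denominator).
    """
    if controller_units < 0 or reliever_units < 0:
        raise ValueError("medication counts must be non-negative")
    total = controller_units + reliever_units
    if total == 0:
        return None
    return controller_units / total


# Hypothetical patient: 3 controller and 9 reliever dispensing events.
print(asthma_medication_ratio(3, 9))  # 0.25
```

A patient who fills mostly reliever medications gets a ratio close to 0, while a patient on controller medications only gets a ratio of 1.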

PCCI’s Lasso logistic regression model is robust and clinically relevant, and it significantly outperforms both the HEDIS persistent asthma case-definition criteria and the AMR clinical criteria in classifying high-risk patients.

3 Methodology

Deep learning models identify intricate structure in large data sets [19] through multiple layers in neural network architectures that learn directly from the data without the need for manual feature extraction. Health care stands to benefit immensely from deep learning because of its data volume and the emerging unstructured, complex data types, including electronic health records (EHR), imaging and text data [11, 20]. Recently, various deep learning models have been applied extensively to different subfields of health care and have achieved great advancements [27, 26, 3, 24, 18].

3.1 Data and Features

Claims data consists of billing codes that health care providers and facilities submit to payers. It follows a consistent format and uses a standard set of pre-established codes that describe specific diagnosis, procedures, medications, as well as billed and paid amounts [16]. Additionally, claims data documents nearly all interactions a patient has across all the health care systems. Claims data captures broader information for patients and provides access to larger and more diverse patient cohort. However, claims data has, by its nature, a time lag of about to days due to the processing time before it is finally added to the database and becomes available for analysis use.

To allow comparison with our baseline model in production, the data for this study was extracted from PCHP claims data between July and June , the same time range we used to train the Lasso logistic regression model. We first filtered for children aged between months and years old at prediction time and applied the CSTE “probable” asthma criteria to identify the original cohort. To be included in the model cohort, a patient also had to be continuously enrolled in PCHP throughout the study time range. These steps reduced the final data set to unique patients. The prevalence rate of asthma ED visits in the following 3 months was about . Figure 1 summarizes the cohort selection process for our study.

Figure 1: Asthma cohort selection process
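The selection steps summarized in Figure 1 can be sketched as a sequence of filters that records how many patients remain after each one. This is an illustration only: the dictionary keys are hypothetical, the age bounds are left as parameters because this sketch does not assume the study's values, and treating the bounds as inclusive is also an assumption.

```python
def select_cohort(patients, min_age_months, max_age_months):
    """Apply the three filters in order and record how many patients remain.

    `patients` is a list of dicts with hypothetical keys:
    "age_months", "probable_asthma" (bool) and "continuously_enrolled" (bool).
    """
    steps = [
        ("age in range",
         lambda p: min_age_months <= p["age_months"] <= max_age_months),
        ("CSTE probable asthma", lambda p: p["probable_asthma"]),
        ("continuous enrollment", lambda p: p["continuously_enrolled"]),
    ]
    remaining = list(patients)
    attrition = [("all patients", len(remaining))]
    for label, keep in steps:
        remaining = [p for p in remaining if keep(p)]
        attrition.append((label, len(remaining)))
    return remaining, attrition


# Toy data with made-up values, only to show the shape of the output.
toy = [
    {"age_months": 60, "probable_asthma": True, "continuously_enrolled": True},
    {"age_months": 60, "probable_asthma": True, "continuously_enrolled": False},
    {"age_months": 60, "probable_asthma": False, "continuously_enrolled": True},
    {"age_months": 300, "probable_asthma": True, "continuously_enrolled": True},
]
cohort, attrition = select_cohort(toy, min_age_months=24, max_age_months=216)
for label, count in attrition:
    print(f"{label}: {count}")
```

Keeping the per-step counts makes it easy to reproduce a flow diagram like Figure 1 and to see which criterion removes the most patients.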

The features generated from claims data for the predictive model can be broadly classified into five categories, as shown in Table 3.1. Considering the inherent -day time lag in PCHP claims data, we excluded features that were highly likely to be incomplete or incorrectly coded at prediction time, for example the number of ED visits in the past month. After this feature pre-selection step, we had in total features.

Category | Example features
Demographics | Gender, age
Medication | AMR, controller medication dispensation events, reliever medication dispensation events
Health service utilization | Number of asthma-related ED visits in the past months
Comorbid illnesses | Obesity, sleep apnea
Insurance gap | Number of insurance gaps in the past months

Table 3.1: Model features in broader categories

3.2 Methods

Artificial Neural Networks (ANNs) are inspired by biological neural networks and are built from a family of interconnected units called artificial neurons. Each neuron receives signals from other artificial neurons as input, changes its internal state and transmits its output to the artificial neurons connected to it. ANNs can learn and model complex nonlinear relationships in data sets [8]. We use feedforward networks in our study, and the model structure is shown in Figure 2. The input layer consists of all the predictors, which were previously validated and normalized when necessary. Each hidden layer applies its activation function to the weighted sum of its inputs and passes the result forward to the next layer. In the final output layer, the sigmoid activation function is used to produce a probability for our binary outcome. We define the loss function and backpropagate the error through the hidden layers to update the weights. We iterate this process until the predefined convergence criterion is met.

Figure 2: ANN connection in our model
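The forward pass and loss described in Section 3.2 can be sketched in plain Python. This is a minimal illustration, not the trained Keras model: the layer sizes and weights are left to the caller, the leaky ReLU slope `alpha=0.01` is a placeholder rather than the setting used in the study, and the backpropagation step is omitted.

```python
import math


def leaky_relu(z, alpha=0.01):
    """Leaky ReLU; alpha is a placeholder value, not the study's setting."""
    return z if z > 0 else alpha * z


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(weighted sum of inputs + bias)."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def forward(x, layers):
    """Pass x through the hidden layers (leaky ReLU) and one sigmoid output unit.

    `layers` is a list of (weights, biases) pairs; the last pair is the output layer.
    """
    *hidden, output = layers
    for weights, biases in hidden:
        x = dense(x, weights, biases, leaky_relu)
    weights, biases = output
    return dense(x, weights, biases, sigmoid)[0]


def binary_cross_entropy(y_true, p, eps=1e-12):
    """Loss for a single example; eps guards against log(0)."""
    p = min(max(p, eps), 1.0 - eps)
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1.0 - p))
```

During training, the gradient of this loss with respect to each weight would be propagated backwards through `dense` to update the weights, which a library such as Keras handles automatically.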

3.3 Training and Test Strategies

Data was randomly divided into training and test sets in proportions. We applied various resampling methods to obtain different training samples: the original training data, oversampled data and downsampled data. We applied dropout regularization to avoid overfitting. Considering the size of the data set, we used only two hidden layers, as one hidden layer is usually sufficient for most classification problems and too many hidden layers could easily cause overfitting [14]. The model was trained using the Keras library in Python, and Table 3.2 shows our best hyperparameter selection for the ANN model.

Parameters | Selection
Loss function | Binary cross-entropy
Optimizer | Adam
Activation function | Leaky ReLU ( ) and Sigmoid
Batch size |
Dropout (recurrent dropout) |
Learning rate |

Table 3.2: Hyperparameter selection in ANN model
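The oversampling and downsampling mentioned in Section 3.3 can be sketched as follows. This is an illustration under stated assumptions: the function name `resample` is hypothetical, random resampling is only one of several possible methods, and balancing the two classes to a 1:1 ratio is an assumption that the text does not confirm.

```python
import random


def resample(rows, labels, mode, seed=0):
    """Return a class-balanced copy of a binary training set.

    mode="over": duplicate random minority rows until the classes match.
    mode="down": keep a random majority subset the size of the minority.
    """
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    if not minority:
        raise ValueError("both classes must be present")
    if mode == "over":
        extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
        keep = majority + minority + extra
    elif mode == "down":
        keep = rng.sample(majority, len(minority)) + minority
    else:
        raise ValueError("mode must be 'over' or 'down'")
    rng.shuffle(keep)
    return [rows[i] for i in keep], [labels[i] for i in keep]
```

Resampling should be applied to the training split only, so that the test set keeps the original prevalence of ED visits.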

4 Results and Discussion

We report the following metrics on the test data to evaluate and compare the classification power of the ANN model and the Lasso logistic regression model: the area under the Receiver Operating Characteristic curve (ROC AUC, C-statistic), recall, precision, F1 score and the area under the precision-recall curve (PR AUC). Here the F1 score is defined as $F_1 = 2 \cdot \frac{\text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}}$. Due to the significant class imbalance in the test data, we did not use prediction accuracy to assess model performance. Based on the prevalence rate of actual adverse events, model performance metrics and a clinical assessment of the likely capacity of a potential intervention program, we designated the patients in the top most percent of risk scores as “High”, the to percentile range as “Medium” and the rest as “Low” risk in the Lasso logistic regression model. We kept the same thresholds for the ANN model. Predicted adverse events came only from the “High” risk category. Results from the proposed models are shown in Table 4.1 and Figure 3. The larger areas under both the ROC curve and the PR curve for the ANN model indicate that it outperformed the Lasso logistic regression in classification power. However, its lower F1 score indicates that we need to re-adjust the thresholds for the ANN model.

Models | ROC AUC | Recall | Precision | F1 score | PR AUC
Lasso logistic regression | 0.842 | | | |
ANN | 0.845 | | | |

Table 4.1: Evaluation metrics for model performance
Figure 3: ROC curve and PR curve of ANN
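The threshold-based metrics and the ROC AUC used in this section can be sketched in plain Python. This is a minimal illustration, not the evaluation code used in the study: the function names are hypothetical, the flagged fraction is a parameter because this sketch does not assume the study's cutoff, and the pairwise AUC is quadratic in the number of patients, so a library routine would be preferable for large test sets.

```python
def top_fraction_flags(scores, fraction):
    """Flag the highest-scoring `fraction` of patients as predicted positives."""
    n_flag = max(1, round(len(scores) * fraction))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    flagged = set(ranked[:n_flag])
    return [1 if i in flagged else 0 for i in range(len(scores))]


def precision_recall_f1(y_true, y_pred):
    """Precision, recall and F1 = 2 * precision * recall / (precision + recall)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def roc_auc(y_true, scores):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example with made-up scores: flag the top 20% as "High" risk.
y_true = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.3, 0.2, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1]
y_pred = top_fraction_flags(scores, 0.2)
print(precision_recall_f1(y_true, y_pred))  # (0.5, 0.5, 0.5)
print(roc_auc(y_true, scores))              # 0.9375
```

Because the ROC AUC depends only on the ranking of the scores while precision, recall and F1 depend on where the cutoff is placed, a model can have a higher AUC and still score lower on F1 at a fixed percentile threshold.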

5 Conclusion

With exactly the same data set and initial feature list, the ANN model produced only slightly higher statistical classification power than the Lasso logistic regression. This is consistent with the findings of [10], which compared logistic regression and ANN models across multiple medical data classification tasks. This study further confirms that the Lasso logistic regression model developed by PCCI in 2015 delivers statistical performance that is non-inferior to that of deep learning models, which are more difficult to interpret. For our predictive models to be deployed and to effectively improve patient care, we need to work closely with clinicians and explain predictions in comprehensible and interpretable formats, so as to build trust and transparency with stakeholders.

In future studies, blending algorithms will be tested against single models to achieve better statistical performance. We will explore the temporal relationships in claims data using other deep learning models, such as Recurrent Neural Networks (RNN) and Long Short-Term Memory (LSTM) networks. To obtain more timely and accurate patient information, we could link EHR data, which is longitudinal in nature, with claims data. We could also combine Social Determinants of Health (SDoH) data with claims data to analyze how it contributes to the outcome. We will continue validating our model on the most recent available claims data and retrain it if necessary to capture the characteristics of our cohort populations more accurately.

6 Acknowledgment

We acknowledge Parkland Community Health Plan (PCHP) for providing us with the data and giving us the opportunity to work on this project. We are thankful to the leadership of Parkland Health and Hospital System (PHHS) for the support.

References


  • [1] F. Ahmadizar, S. J. Vijverberg, H. G. Arets, A. de Boer, J. E. Lang, M. Kattan, C. N. Palmer, S. Mukhopadhyay, S. Turner, and A. H. Maitland-van der Zee. Childhood obesity in relation to poor asthma control and exacerbation: a meta-analysis. European Respiratory Journal, 48(4):1063–1073, 2016.
  • [2] A. L. Andrews, A. N. Simpson, W. T. Basco Jr, R. J. Teufel, et al. Asthma medication ratio predicts emergency department visits and hospitalizations in children with asthma. Medicare & medicaid research review, 3(4), 2013.
  • [3] M. A. Badgeley, J. R. Zech, L. Oakden-Rayner, B. S. Glicksberg, M. Liu, W. Gale, M. V. McConnell, B. Percha, T. M. Snyder, and J. T. Dudley. Deep learning predicts hip fracture using confounding patient and healthcare variables. npj Digital Medicine, 2(1):31, 2019.
  • [4] W. E. Berger, A. P. Legorreta, M. S. Blaiss, E. C. Schneider, A. T. Luskin, D. A. Stempel, S. Suissa, D. C. Goodman, S. W. Stoloff, J. A. Chapman, et al. The utility of the health plan employer data and information set (hedis) asthma measure to predict asthma-related outcomes. Annals of Allergy, Asthma & Immunology, 93(6):538–545, 2004.
  • [5] M. S. Broder, B. Gutierrez, E. Chang, D. Meddis, and M. Schatz. Ratio of controller to total asthma medications: determinants of the measure. The American journal of managed care, 16(3):170–178, 2010.
  • [6] CDC - asthma - data and surveillance - asthma surveillance data, 2018.
  • [7] CDC - asthma, 2019.
  • [8] G. Daniel. Principles of artificial neural networks, volume 7. World Scientific, 2013.
  • [9] L. T. Das, E. L. Abramson, A. E. Stone, J. E. Kondrich, L. M. Kern, and Z. M. Grinspan. Predicting frequent emergency department visits among children with asthma using ehr data. Pediatric pulmonology, 52(7):880–890, 2017.
  • [10] S. Dreiseitl and L. Ohno-Machado. Logistic regression and artificial neural network classification models: a methodology review. Journal of biomedical informatics, 35(5-6):352–359, 2002.
  • [11] A. Esteva, A. Robicquet, B. Ramsundar, V. Kuleshov, M. DePristo, K. Chou, C. Cui, G. Corrado, S. Thrun, and J. Dean. A guide to deep learning in healthcare. Nature medicine, 25(1):24, 2019.
  • [12] E. Forno, A. Fuhlbrigge, M. E. Soto-Quirós, L. Avila, B. A. Raby, J. Brehm, J. M. Sylvia, S. T. Weiss, and J. C. Celedón. Risk factors and predictive clinical scores for asthma exacerbations in childhood. Chest, 138(5):1156–1165, 2010.
  • [13] E. W. Gelfand, G. L. Colice, L. Fromer, W. B. Bunn III, and T. J. Davies. Use of the health plan employer data and information set for measuring and improving the quality of asthma care. Annals of Allergy, Asthma & Immunology, 97(3):298–305, 2006.
  • [14] I. Goodfellow, Y. Bengio, and A. Courville. Deep Learning. MIT Press, 2016.
  • [15] D. K. Greineder, K. C. Loane, and P. Parks. A randomized controlled trial of a pediatric asthma outreach program. Journal of Allergy and Clinical Immunology, 103(3):436–440, 1999.
  • [16] W. J and B. A. The benefit of using both claims data and electronic medical record data in health care analysis. Technical report, Optum Insight, 2012.
  • [17] P. Karnick, H. Margellos-Anast, G. Seals, S. Whitman, G. Aljadeff, and D. Johnson. The pediatric asthma intervention: a comprehensive cost-effective approach to asthma management in a disadvantaged inner-city community. Journal of Asthma, 44(1):39–44, 2007.
  • [18] S. M. Lauritsen, M. E. Kalør, E. L. Kongsgaard, K. M. Lauritsen, M. J. Jørgensen, J. Lange, and B. Thiesson. Early detection of sepsis utilizing deep learning on electronic health record event sequences. arXiv preprint arXiv:1906.02956, 2019.
  • [19] Y. LeCun, Y. Bengio, and G. Hinton. Deep learning. Nature, 521(7553):436, 2015.
  • [20] R. Miotto, F. Wang, S. Wang, X. Jiang, and J. T. Dudley. Deep learning for healthcare: review, opportunities and challenges. Briefings in bioinformatics, 19(6):1236–1246, 2017.
  • [21] D. M. Mosen, E. Macy, M. Schatz, G. Mendoza, T. B. Stibolt, J. McGaw, J. Goldstein, and J. Bellows. How well do the hedis asthma inclusion criteria identify persistent asthma. Am J Manag Care, 11(10):650–4, 2005.
  • [22] T. Nurmagambetov, O. Khavjou, L. Murphy, and D. Orenstein. State-level medical and absenteeism cost of asthma in the united states. Journal of Asthma, 54(4):357–370, 2017.
  • [23] T. Nurmagambetov, R. Kuwahara, and P. Garbe. The economic burden of asthma in the united states, 2008–2013. Annals of the American Thoracic Society, 15(3):348–356, 2018.
  • [24] V. Osmani, L. Li, M. Danieletto, B. Glicksberg, J. Dudley, and O. Mayora. Processing of electronic health records using deep learning: A review. arXiv preprint arXiv:1804.01758, 2018.
  • [25] C. M. Pacheco, C. E. Ciaccio, N. Nazir, C. M. Daley, A. DiDonna, W. S. Choi, C. S. Barnes, and L. J. Rosenwasser. Homes of low-income minority families with asthmatic children have increased condition issues. In Allergy and asthma proceedings, volume 35, page 467. OceanSide Publications, 2014.
  • [26] T. Pham, T. Tran, D. Phung, and S. Venkatesh. Deepcare: A deep dynamic memory model for predictive medicine. In Pacific-Asia Conference on Knowledge Discovery and Data Mining, pages 30–41. Springer, 2016.
  • [27] B. K. Reddy and D. Delen. Predicting hospital readmission for lupus patients: An rnn-lstm-based deep-learning methodology. Computers in biology and medicine, 101:199–209, 2018.
  • [28] M. Schatz, E. F. Cook, A. Joshua, and D. Petitti. Risk factors for asthma hospitalizations in a managed care organization: development of a clinical prediction rule. The American journal of managed care, 9(8):538–547, 2003.
  • [29] M. Schatz, R. S. Zeiger, W. M. Vollmer, D. Mosen, G. Mendoza, A. J. Apter, T. B. Stibolt, A. Leong, M. S. Johnson, and E. F. Cook. The controller-to-total asthma medication ratio is associated with patient-centered as well as utilization outcomes. Chest, 130(1):43–50, 2006.
  • [30] R. H. Stanford, M. B. Shah, A. O. D’Souza, and M. Schatz. Predicting asthma outcomes in commercially insured and medicaid populations? The American journal of managed care, 19(1):60–67, 2013.
  • [31] C. Tolomeo, C. Savrin, M. Heinzer, and A. Bazzy-Asaad. Predictors of asthma-related pediatric emergency department visits and hospitalizations. Journal of Asthma, 46(8):829–834, 2009.
  • [32] D. B. Wakefield and M. M. Cloutier. Modifications to hedis and cste algorithms improve case recognition of pediatric asthma. Pediatric pulmonology, 41(10):962–971, 2006.
  • [33] M. Xu, K. G. Tantisira, A. Wu, A. A. Litonjua, J.-h. Chu, B. E. Himes, A. Damask, and S. T. Weiss. Genome wide association study to predict severe asthma exacerbations in children using random forests classifiers. BMC medical genetics, 12(1):90, 2011.
  • [34] H. S. Zahran, C. M. Bailey, S. A. Damon, P. L. Garbe, and P. N. Breysse. Vital signs: asthma in children—united states, 2001–2016. Morbidity and Mortality Weekly Report, 67(5):149, 2018.