A Shapley Value Solution to Game Theoretic-based Feature Reduction in False Alarm Detection

12/05/2015 ∙ by Fatemeh Afghah, et al. ∙ Northern Arizona University, University of Michigan

False alarms are one of the main concerns in intensive care units and can result in care disruption, sleep deprivation, and insensitivity of care-givers to alarms. Several methods have been proposed to reduce the false alarm rate by improving the quality of physiological signals through filtering and by developing more accurate sensors. However, significant intrinsic correlation among the extracted features limits the performance of most currently available data mining techniques, as they often discard predictors with low individual impact that may have strong discriminatory power when grouped with others. We propose a model based on coalition game theory that considers the inter-feature dependencies in determining the salient predictors with respect to false alarms, which results in improved classification accuracy. The superior performance of this method compared to current methods is shown in simulation results using PhysioNet's MIMIC II database.

1 Introduction

Several monitoring and therapeutic devices are utilized in intensive care units (ICUs) to measure vital signs, support or replace impaired or failing organs, and administer medications to patients [1]. Each of these devices might generate optic/acoustic alarms due to the patient's physiologic condition, patient movement, motion artifacts, malfunction of individual sensors, and imperfections in the patient-equipment contact [2]. Many of the alarms (80% to 99% [3]) could be false and/or clinically insignificant, that is, unrelated to the patient's condition. These alarms could compromise the quality and safety of care, resulting in problems such as “alarm fatigue” among care-givers as well as the possibility of missing a real event due to care-givers' insensitivity to these unreliable alarms, known as the “cry-wolf” effect. Dealing with false alarms is widely considered the number one hazard imposed by medical technology and an important concern in ICUs [3]. Several approaches have been utilized to decrease the number of false alarms, as reviewed in [3] and [4].

In [5], a genetic algorithm-based approach for false alarm reduction is proposed, where the features are extracted from the electrocardiogram (ECG), arterial blood pressure (ABP), and photoplethysmogram (PPG or PLETH) signals of arrhythmia patients. Using a relevance vector machine (RVM) as a classifier, false alarm suppression rates were reported for asystole, extreme bradycardia, and extreme tachycardia. An automated method for false arrhythmia alarm suppression was proposed in [6] that is based on quality assessment of normal and abnormal rhythms of ECG signals. Different approaches including k-nearest neighbors (KNN), Naive Bayes, Decision Tree, SVM, and multi-layer perceptron have been tested on a database from MIMIC II for alarm classification, where features have been extracted from age, sex, central venous pressure (CVP), SpO2, ABP, ECG, and pulmonary arterial pressure (PAP) [7]. The suppression rate for true alarms is between 2.33% and 17.73% for 5 alarms, and the false alarm suppression rate is between 71.73% and 99.23%. The aforementioned models considered a number of features/parameters extracted from multiple continuously measured physiological signals. The main challenge in these multi-parameter approaches is the presence of many parameters/features that individually have a low impact on model performance and may therefore be excluded from the model, even though, when coupled with other such parameters, they could significantly improve the accuracy and specificity of the alarm detection algorithms.

Different hybrid feature selection algorithms have been utilized in big data analysis problems to improve the prediction accuracy and reliability by reducing the feature space to a more concise and relevant set of attributes [8, 9, 10, 11]. Of the three major approaches to feature selection, namely filter-based, wrapper-based [12, 11, 13], and embedded methods [14], the last two are known to be susceptible to overfitting and are computationally intensive [8]. The majority of these conventional methods either account only for the effect of individual features on the target or consider the inter-feature mutual information to obtain higher performance. This often results in discarding features that are relevant to the target class but are highly correlated with the ones already selected. This can significantly degrade the performance of the model in scenarios where a set of features together has a considerable effect on the classifier, while each individual attribute in the set does not [15].
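The last scenario can be illustrated with a minimal sketch. The toy data below is hypothetical (a label equal to the XOR of two binary features) and is not drawn from the paper's data set; the helper `best_accuracy` is likewise an illustrative name. Each feature alone is no better than chance, while the pair classifies perfectly.

```python
from itertools import product

# Hypothetical toy data: the label is the XOR of two binary features.
samples = [((a, b), a ^ b) for a, b in product([0, 1], repeat=2)]

def best_accuracy(feature_idx):
    """Accuracy of the best lookup-table classifier that sees only feature_idx."""
    groups = {}
    for x, y in samples:
        key = tuple(x[i] for i in feature_idx)
        groups.setdefault(key, []).append(y)
    # For each observed key, predict the majority label.
    correct = sum(max(ys.count(0), ys.count(1)) for ys in groups.values())
    return correct / len(samples)

acc_f0 = best_accuracy([0])       # first feature alone
acc_f1 = best_accuracy([1])       # second feature alone
acc_both = best_accuracy([0, 1])  # both features together
```

A selection method that scores features one at a time would discard both features here, although together they determine the label.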

In this paper, we propose a coalition-based game theoretic model for feature selection that accounts for the intrinsic inter-feature correlation among the predictors across the data sets to improve the accuracy of our model. Coalition game theory has recently been utilized in data analysis problems to improve the performance of feature selection by considering the contribution of the features to classification accuracy when they are grouped with other features in the data set [16, 17, 18, 19]. In [18], we proposed a coalition-based game theoretic feature selection method to determine the salient features over a heterogeneous data set to predict hemorrhage severity, where the features are modeled as the game players. The importance of each feature in the game is measured by its Shapley value, defined as its contribution to improving the classification accuracy considering all possible coalitions of the features. In [16], we developed a network-based coalition game theoretic framework to discover the most informative gene subnetworks in predicting ovarian cancer by integrating gene expression profiling of cancer tissues with protein-protein interaction (PPI) networks. This model considers the genes as the game players and develops pathways emerging from a seed gene set in the PPI network, traversing the network to discover the most informative pathways associated with a desired outcome by computing the Shapley value of the players.

Here we describe a coalition-based game theoretic model recently proposed in [20] to suppress false alarms using three signals, ECG, PLETH, and ABP, from PhysioNet's MIMIC II database. The impact of each feature in the game, in interaction with other features when they form a coalition, is measured by the Multi-perturbation Shapley value for coalitions of size up to 4, which results in a significant accuracy improvement compared to other feature selection techniques including the Chi-square, Gain Ratio, Relief, and Info Gain methods.

2 Signal Processing and Feature Extraction

In this study, five types of life-threatening arrhythmias are considered: asystole, extreme bradycardia, extreme tachycardia, ventricular tachycardia, and ventricular flutter/fibrillation. Three main signals, ECG, ABP, and PLETH, are used as the inputs of our proposed model. In the first stage (i.e. signal analysis), wavelet coefficients of each signal at different levels of decomposition are calculated by applying a discrete wavelet transform (DWT) to the 1-D input signals. At each level of the decomposition process, the DWT decomposes the signal into approximation and detail coefficients. The approximation coefficients are obtained by applying a low-pass filter, which retains the coarse-scale (low-frequency) content, and the detail coefficients are computed by applying a high-pass filter, which retains the fine-scale (high-frequency) content. We used Daubechies 8 (db8) for the ECG signal, as there is a good match between the shape of the ECG signal and this wavelet. Daubechies 4 (db4) is used for the PLETH and ABP signals for the same reason. Each of the three aforementioned signals is decomposed into six levels by successive convolution with the high-pass and low-pass filters. Feeding all these wavelet coefficients as features into the classification algorithm is not efficient and may significantly degrade the generalization of the trained model due to over-fitting. Therefore, we reduce the number of features by extracting 20 representative statistical and information-theoretic properties of the wavelet vectors.
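The decomposition and feature extraction steps can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the Haar wavelet is swapped in for db8/db4 so that the code needs only the standard library, the input is a synthetic sine wave rather than an ECG, ABP, or PLETH record, three levels are used instead of six, and only four example statistics are computed rather than the 20 properties used in the paper (which are not listed in this text). All function names are hypothetical.

```python
import math

def haar_dwt_level(signal):
    """One Haar DWT level: low-pass gives approximation, high-pass gives detail."""
    s = 1 / math.sqrt(2)
    pairs = range(0, len(signal) - 1, 2)  # a trailing odd sample is dropped
    approx = [(signal[i] + signal[i + 1]) * s for i in pairs]
    detail = [(signal[i] - signal[i + 1]) * s for i in pairs]
    return approx, detail

def wavelet_decompose(signal, levels):
    """Return [detail_1, ..., detail_L, approx_L] for an L-level decomposition."""
    vectors, approx = [], list(signal)
    for _ in range(levels):
        approx, detail = haar_dwt_level(approx)
        vectors.append(detail)
    vectors.append(approx)
    return vectors

def summary_features(vec):
    """A few illustrative statistics of one coefficient vector."""
    n = len(vec)
    mean = sum(vec) / n
    var = sum((c - mean) ** 2 for c in vec) / n
    energy = sum(c * c for c in vec)
    # Shannon entropy of the normalized coefficient energies.
    probs = [c * c / energy for c in vec if c * c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return {"mean": mean, "variance": var, "energy": energy, "entropy": entropy}

# Synthetic stand-in for a physiological signal (assumption, not real data).
signal = [math.sin(2 * math.pi * k / 16) for k in range(64)]
vectors = wavelet_decompose(signal, levels=3)
features = [summary_features(v) for v in vectors]
```

Each coefficient vector is reduced to a handful of numbers, so the classifier sees a short feature vector per signal instead of every wavelet coefficient.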

3 Coalition-based Game-theoretic Feature Selection

Cooperative game theory has been recently utilized in feature selection algorithms [17, 18, 16, 19]. In these games, the players cooperate with each other by forming various sub-groups called coalitions. These games are defined based on exhaustive scenarios of how players may form a group and how the total shared payoff is divided among the members. A transferable utility coalition (TU-coalition) game with $n$ players can be defined by the pair $(N, v)$, where $N = \{1, 2, \dots, n\}$ denotes the set of players, and the characteristic function, $v: 2^N \to \mathbb{R}$, is a real-valued function defined on the set of all coalitions, $2^N$. For a coalition $S \subseteq N$, the characteristic function $v(S)$ represents the total payoff that can be gained by the members of this coalition, and satisfies the following conditions: i) the characteristic function of an empty coalition is zero, $v(\emptyset) = 0$, and ii) if $S_i$ and $S_j$ ($S_i, S_j \subseteq N$) are two disjoint coalitions, the characteristic function of their union has the super-additivity property, meaning that $v(S_i \cup S_j) \geq v(S_i) + v(S_j)$.
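As a small illustration, the two conditions can be checked mechanically for a toy game. The three players and the payoff table below are hypothetical (payoffs may be read as accuracy gains in percentage points over a baseline) and are not taken from the paper; `PAYOFF`, `all_coalitions`, and `is_superadditive` are illustrative names.

```python
from itertools import combinations

PLAYERS = ("a", "b", "c")

# Hypothetical payoffs for every coalition; chosen only for illustration.
PAYOFF = {
    frozenset(): 0,
    frozenset("a"): 0,
    frozenset("b"): 0,
    frozenset("c"): 10,
    frozenset("ab"): 40,
    frozenset("ac"): 20,
    frozenset("bc"): 20,
    frozenset("abc"): 50,
}

def all_coalitions(players):
    """Every subset of the player set, including the empty coalition."""
    return [frozenset(c)
            for r in range(len(players) + 1)
            for c in combinations(players, r)]

def is_superadditive(payoff, players):
    """Check that disjoint coalitions never lose payoff by merging."""
    coalitions = all_coalitions(players)
    return all(
        payoff[s | t] >= payoff[s] + payoff[t]
        for s in coalitions
        for t in coalitions
        if not (s & t)  # only disjoint pairs
    )

empty_is_zero = PAYOFF[frozenset()] == 0
superadditive = is_superadditive(PAYOFF, PLAYERS)
```

Lowering the grand-coalition payoff from 50 to 45 in this table would break condition ii), because the disjoint coalitions {a, b} and {c} already earn 40 + 10 on their own.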

Here, we model the features as the players of the game, and the characteristic function of a coalition $S$, $v(S)$, is measured by the contribution of its members (features) to the performance of the classifier (e.g. success rate in supervised learning). Different possible groupings of the features are examined to recognize the optimal coalition. The contribution of feature $i$ to classification accuracy when it joins a coalition $S$ is defined by the marginal importance as follows:

$$\Delta_i(S) = v(S \cup \{i\}) - v(S). \tag{1}$$
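A minimal sketch of the marginal importance, that is, the payoff of a coalition after a feature joins minus its payoff before, is given below. The payoff table is hypothetical and the function names are illustrative; in practice the payoff of a coalition would come from training and evaluating a classifier on that feature subset.

```python
# Hypothetical payoffs (e.g. accuracy gain in percentage points) per coalition.
PAYOFF = {
    frozenset(): 0,
    frozenset("a"): 0,
    frozenset("b"): 0,
    frozenset("c"): 10,
    frozenset("ab"): 40,
    frozenset("ac"): 20,
    frozenset("bc"): 20,
    frozenset("abc"): 50,
}

def v(coalition):
    """Characteristic function: payoff of a coalition of features."""
    return PAYOFF[frozenset(coalition)]

def marginal_importance(feature, coalition):
    """Payoff gained when `feature` joins `coalition` (which must not contain it)."""
    s = frozenset(coalition)
    if feature in s:
        raise ValueError("feature is already a member of the coalition")
    return v(s | {feature}) - v(s)

delta_alone = marginal_importance("a", [])           # joins the empty coalition
delta_with_b = marginal_importance("a", ["b"])       # joins {b}
delta_with_bc = marginal_importance("a", ["b", "c"])  # joins {b, c}
```

In this toy table, feature `a` contributes nothing on its own but a large gain once `b` is present, which is the kind of interaction an individual ranking would miss.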

A solution of a coalition game is determined by how the coalition of players can be formed and how the total payoff of a coalition is divided among the members. Let us define the value function $\phi$ that assigns an $n$-tuple of real numbers, $\phi(v) = (\phi_1(v), \phi_2(v), \dots, \phi_n(v))$, to each possible characteristic function $v$, in which $\phi_i(v)$ measures the value of player $i$ in the game with characteristic function $v$. The Shapley value can be utilized as a fair, unique solution of the coalition game [21]. The Shapley value of player $i$ is defined as the weighted mean of its marginal importance, $\Delta_i(S) = v(S \cup \{i\}) - v(S)$, over all possible subsets of the players:

$$\phi_i(v) = \frac{1}{n!} \sum_{\pi \in \Pi} \Delta_i\big(S_i(\pi)\big), \tag{2}$$

where $\Pi$ is the set of all permutations over $N$ and $S_i(\pi)$ is the set of features (players) preceding player $i$ in permutation $\pi$. Since in feature selection the order of features in a coalition does not change the value of the coalition, the calculation in (2) can be further simplified by excluding the permutations of coalitions from the average:

$$\phi_i(v) = \frac{1}{n!} \sum_{S \subseteq N \setminus \{i\}} |S|!\,(n - |S| - 1)!\,\Delta_i(S), \tag{3}$$

where $S \subseteq N \setminus \{i\}$ ranges over the coalitions that player $i$ does not belong to. This is equivalent to a weighted average over coalitions, where the weight of each coalition is the number of all its possible permutations. As shown in (2) and (3), the Shapley value solution accounts for all possible coalitions that can be formed by the players [21]. Since the data set in the false alarm detection problem includes a large number of features, calculating the Shapley value exactly would be computationally intractable. Therefore, we utilize the Multi-perturbation Shapley value measurement with coalition sizes up to $d$ rather than the original Shapley value, which is determined using an unbiased estimator based on the Shapley value [22, 23].
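The equivalence of the permutation-based and subset-based forms of the Shapley value can be checked on a toy game. The sketch below assumes a hypothetical three-feature payoff table and uses exact rational arithmetic; the function names are illustrative. It also makes the cost visible: the first form enumerates all orderings of the players and the second all subsets, which is why exact computation is impractical for large feature sets.

```python
from fractions import Fraction
from itertools import combinations, permutations
from math import factorial

PLAYERS = ("a", "b", "c")

# Hypothetical payoffs for every coalition; chosen only for illustration.
PAYOFF = {
    frozenset(): 0,
    frozenset("a"): 0,
    frozenset("b"): 0,
    frozenset("c"): 10,
    frozenset("ab"): 40,
    frozenset("ac"): 20,
    frozenset("bc"): 20,
    frozenset("abc"): 50,
}

def v(coalition):
    return PAYOFF[frozenset(coalition)]

def shapley_by_permutations(i):
    """Average marginal importance of i over all orderings of the players."""
    total = 0
    for order in permutations(PLAYERS):
        preceding = order[:order.index(i)]
        total += v(set(preceding) | {i}) - v(preceding)
    return Fraction(total, factorial(len(PLAYERS)))

def shapley_by_subsets(i):
    """Same value, summing over coalitions that exclude i with factorial weights."""
    n = len(PLAYERS)
    others = [p for p in PLAYERS if p != i]
    total = 0
    for size in range(n):
        weight = factorial(size) * factorial(n - size - 1)
        for s in combinations(others, size):
            total += weight * (v(set(s) | {i}) - v(s))
    return Fraction(total, factorial(n))

values = {p: shapley_by_subsets(p) for p in PLAYERS}
```

For this table the values of all players add up to the payoff of the grand coalition, so the total payoff is fully divided among the features.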

In our proposed algorithm, at each round, the features are randomly divided into groups of size $d$. Then, we calculate the corresponding Multi-perturbation Shapley value of feature $i$ inside its group, considering all possible coalitions of size up to $d$. This is equivalent to randomly sampling from uniformly distributed feature permutations, and the resulting estimate, $\hat{\phi}_i(v)$, is calculated as follows:

$$\hat{\phi}_i(v) = \frac{1}{|\Pi_d|} \sum_{\pi \in \Pi_d} \Delta_i\big(S_i(\pi)\big), \tag{4}$$

where $\Pi_d$ denotes the set of sampled permutations on sub-groups of features of size $d$, $S_i(\pi)$ is the set of features preceding feature $i$ in permutation $\pi$, and $\Delta_i(S) = v(S \cup \{i\}) - v(S)$ is the marginal importance of feature $i$ with respect to coalition $S$. There is an essential trade-off in setting $d$ in the proposed method. Large values of $d$ capture higher-order relations, while increasing the complexity of finding the Multi-perturbation Shapley value in each subgroup. We conjecture that the optimum value of $d$ for our data sets, taking into account various factors such as the nature of the data, the number of features, and the inter-feature dependence, lies within a limited range. This is confirmed by the simulation results in Section 4.
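One possible reading of the grouping procedure is sketched below: in every round the features are shuffled and split into groups of a fixed size, each feature's Shapley value is computed exactly inside its group, and the per-round values are averaged. This is an assumption-laden illustration rather than the authors' algorithm. The payoff function (additive weights plus one synergy bonus between features 0 and 1), the group size of 4, the number of rounds, and all names are hypothetical.

```python
import random
from fractions import Fraction
from itertools import combinations
from math import factorial

WEIGHTS = [3, 3, 1, 2, 0, 1, 2, 0]  # hypothetical individual payoffs
SYNERGY_PAIR, SYNERGY_BONUS = {0, 1}, 10  # hypothetical interaction

def v(coalition):
    """Toy payoff: sum of weights, plus a bonus if features 0 and 1 are both in."""
    s = set(coalition)
    bonus = SYNERGY_BONUS if SYNERGY_PAIR <= s else 0
    return sum(WEIGHTS[i] for i in s) + bonus

def shapley_within_group(i, group):
    """Exact Shapley value of feature i in the game restricted to `group`."""
    d = len(group)
    others = [j for j in group if j != i]
    total = 0
    for size in range(d):
        weight = factorial(size) * factorial(d - size - 1)
        for s in combinations(others, size):
            total += weight * (v(set(s) | {i}) - v(s))
    return Fraction(total, factorial(d))

def grouped_shapley_estimate(n_features, d, rounds, seed=0):
    """Average the within-group Shapley values over random partitions."""
    rng = random.Random(seed)
    sums = [Fraction(0)] * n_features
    for _ in range(rounds):
        order = list(range(n_features))
        rng.shuffle(order)
        for start in range(0, n_features, d):
            group = order[start:start + d]
            for i in group:
                sums[i] += shapley_within_group(i, group)
    return [s / rounds for s in sums]

estimates = grouped_shapley_estimate(n_features=8, d=4, rounds=200)
ranking = sorted(range(8), key=lambda i: estimates[i], reverse=True)
```

The cost per group grows exponentially with the group size, while a feature only receives credit for an interaction in rounds where its partner lands in the same group, which is the trade-off described above.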

4 Numerical Analysis

For this study, we used the publicly available PhysioNet MIMIC II database [24], where measurements of three vital signals, ECG, PLETH, and ABP, are provided for the patients and each subject is labeled as true, false, or impossible to tell. We first apply a six-level wavelet decomposition to obtain time-frequency information at different resolutions. Each sample is therefore represented by a set of wavelet coefficient vectors. Subsequently, statistical and information-theoretic features are extracted from each of the vectors. Experimental results are provided in this section for the proposed alarm validation method as well as other state-of-the-art explicit feature selection methods including the Chi-square, Gain Ratio, Relief, and Info Gain methods. The numerical results are obtained utilizing the proposed coalition game-theoretic method, where the Multi-perturbation Shapley value is calculated for coalition sizes up to 4. The alarm typing rates for all feature selection methods are evaluated in combination with Bayes Net classification as a representative classifier. In all simulations, the 30 most informative features are selected to compare the performance of the different feature selection techniques.

The comparison results in Fig. 1(a) suggest a considerable improvement for the proposed method in discarding false alarms compared to the competitor methods. The alarm typing success rate of the proposed method is higher than that of the best competitor method (Gain Ratio). The improvement is due to the potential synergy of coalitions among features, which is overlooked or not directly addressed in other methods. The proposed method also outperforms the case of incorporating all wavelet coefficients (represented by None in Fig. 1(a)) by eliminating the irrelevant features. It is notable that this promising rate is obtained using only 30 statistical features for any subject, which significantly reduces the risk of over-fitting compared to using all wavelet coefficients for each signal. The average appearance of the signals is depicted in Fig. 1(b). It is clear from these results that the first wavelet decomposition level of the ECG and PLETH signals plays a significantly greater role in alarm validation. Indeed, the collective contribution of levels 2 to 6 is less than the contribution of level 1 alone. However, all levels of the ABP signal contribute almost equally to alarm recognition.

(a) False alarm detection rate for the first 30 features using different feature selection methods with Bayes Net classification.
(b) Average appearance of the six levels of wavelet-decomposed vectors for the ECG, PLETH, and ABP signals.

References

  • [1] Michael Imhoff and Silvia Kuhls, “Alarm algorithms in critical care monitoring,” Anesthesia & Analgesia, vol. 102, no. 5, pp. 1525–1537, 2006.
  • [2] Elizbha Philip, Evaluation of Medical Alarm Sounds, Ph.D. thesis, New Jersey Institute of Technology, Department of Biomedical Engineering, 2009.
  • [3] Maria Cvach, “Monitor alarm fatigue: an integrative review,” Biomedical Instrumentation & Technology, vol. 46, no. 4, pp. 268–277, 2012.
  • [4] Michael Imhoff, Silvia Kuhls, Ursula Gather, and Roland Fried, “Smart alarms from medical devices in the OR and ICU,” Best Practice & Research Clinical Anaesthesiology, vol. 23, no. 1, pp. 39–50, 2009.
  • [5] Qiao Li and Gari D Clifford, “Signal quality and data fusion for false alarm reduction in the intensive care unit,” Journal of electrocardiology, vol. 45, no. 6, pp. 596–603, 2012.
  • [6] Joachim Behar, Julien Oster, Qiao Li, and Gari D Clifford, “ECG signal quality during arrhythmia and its application to false alarm reduction,” Biomedical Engineering, IEEE Transactions on, vol. 60, no. 6, pp. 1660–1666, 2013.
  • [7] Bernd Baumgartner, Kolja Rodel, and Aaron Knoll, “A data mining approach to reduce the false alarm rate of patient monitors,” in Engineering in Medicine and Biology Society (EMBC), 2012 Annual International Conference of the IEEE. IEEE, 2012, pp. 5935–5938.
  • [8] Yvan Saeys, Iñaki Inza, and Pedro Larrañaga, “A review of feature selection techniques in bioinformatics,” Bioinformatics, vol. 23, no. 19, pp. 2507–2517, Sep. 2007.
  • [9] R. Tibshirani, “Regression shrinkage and selection via the lasso: a retrospective,” Journal of Royal Statistical Society, vol. 73, no. 3, pp. 273–282, 2011.
  • [10] L.C. Molina, L. Belanche, and A. Nebot, “Feature selection algorithms: a survey and experimental evaluation,” in Data Mining, 2002. ICDM 2002. Proceedings. 2002 IEEE International Conference on, 2002, pp. 306–313.
  • [11] H. Peng, Fuhui Long, and C. Ding, “Feature selection based on mutual information criteria of max-dependency, max-relevance, and min-redundancy,” Pattern Analysis and Machine Intelligence, IEEE Transactions on, vol. 27, no. 8, pp. 1226–1238, 2005.
  • [12] R. Kohavi and G.H. John, “Wrappers for feature subset selection,” Artificial Intelligence, vol. 97, pp. 273–324, 1997.
  • [13] Jinjie Huang, Yunze Cai, and Xiaoming Xu, “A wrapper for feature selection based on mutual information,” in Pattern Recognition, 2006. ICPR 2006. 18th International Conference on, 2006, vol. 2, pp. 618–621.
  • [14] Ramón Díaz-Uriarte and Sara Alvarez de Andrés, “Gene selection and classification of microarray data using random forest,” BMC Bioinformatics, vol. 7, no. 1, 2006.
  • [15] J. Fan, R. Samworth, and Y. Wu, “Ultrahigh dimensional feature selection: Beyond the linear model,” Journal of Machine Learning Research, vol. 10, pp. 2013–2038, 2009.
  • [16] A. Razi, F. Afghah, and V. Varadan, “Identifying gene subnetworks associated with clinical outcome in ovarian cancer using network based coalition game,” in 37th Annual International Conference of the IEEE Engineering in Medicine and Biology Conference (EMBC’15), 2015.
  • [17] Xin Sun, Yanheng Liu, Jin Li, Jianqi Zhu, Huiling Chen, and Xuejie Liu, “Feature evaluation and selection with cooperative game theory,” Pattern Recogn., vol. 45, no. 8, pp. 2992–3002, Aug. 2012.
  • [18] A. Razi, F. Afghah, A. Belle, K. Ward, and K. Najarian, “Blood loss severity prediction using game theoretic based feature selection,” in IEEE-EMBS International Conferences on Biomedical and Health Informatics (BHI’14), 2014, pp. 776–780.
  • [19] S. Cohen, G. Dror, and E. Ruppin, “Feature selection via coalitional game theory,” Neural Computation, vol. 19, no. 7, pp. 1939–1961, 2007.
  • [20] F. Afghah, A. Razi, S.M.R. Soroushmehr, S. Molaei, H. Ghanbari, and K. Najarian, “A game theoretic predictive modeling approach to reduction of false alarm,” in 2015 International Conference for Smart Health (ICSH’15), 2015.
  • [21] L. S. Shapley, “A value for n-person games,” in Contributions to the Theory of Games, H. W. Kuhn and A. W. Tucker (Eds.), vol. 2, pp. 307–317, 1953.
  • [22] A. Keinan, B. Sandbank, C. Hilgetag, I. Meilijson, and E. Ruppin, “Axiomatic scalable neurocontroller analysis via the shapley value,” Artificial Life, vol. 12, pp. 333–352, 2006.
  • [23] A. Kaufman, M. Kupiec, and E. Ruppin, “Multi-knockout genetic network analysis: The rad6 example,” in IEEE Computational Systems Bioinformatics Conference (CSB’04), 2004, pp. 332–340.
  • [24] “Reducing false arrhythmia alarms in the ICU,” http://www.physionet.org/challenge/2015/, Accessed: 2015-09-07.