Patients with advanced congestive heart failure (CHF) do not respond to traditional therapies like medical treatment and symptom management. Heart transplantation is considered the gold standard treatment for eligible patients with advanced CHF but is severely limited by donor availability. It was performed for only 2000-2500 eligible patients per year in the US (half of the number of candidates added to the wait list every year) . Therefore, there has been an increasing need to provide advanced CHF patients with alternative treatment options,such as left ventricular assist devices (LVAD). The LVAD is a pump that augments the flow of blood from the left ventricle of the heart, as shown in Fig. 1.
In 2005, the National Institutes of Health (NIH) established INTERMACS, which is currently the nation-wide database of longitudinal data for mechanical assist devices with data from 20,000 patients and 180 hospitals over the past decade . Based on the INTERMACS 2017 annual report, overall 1-year and 2- year patient survival rates were 81 and 70, respectively . It also reported an enhanced patient quality of life in the first 3 months after LVAD implantation that lasted for at least 24 months . Yet, INTERMACS reported several risks of recurrent adverse events (AEs) following LVAD implantation including bleeding, infection, cardiac arrhythmia, stroke, and respiratory failure . For 17,632 LVAD patients between 2008- 2016, bleeding and infection were the most frequent AEs, especially in the first 3 months post-LVAD with rate (events per 100 patient-months) of 16.24 and 13.63, respectively. More importantly, subsequent survival is impacted by the incidence of major AEs during the first 3 months post-LVAD .
There have been numerous clinical studies in recent years focusing on the incidence, prevalence, and risk factors of post-LVAD AE’s [5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16]. Most studies emphasized one type of major AE like bleeding [6, 9, 13, 7], infection [15, 16, 17, 18, 10], and stroke [5, 14, 11, 12, 19, 20]. Some studies investigated AEs of one specific type of LVAD pump [8, 9, 14, 16], while others compared AEs between different types of pumps [5, 6, 17, 7]. Some studies underlined the difference between the AEs of LVAD as short-term versus long-term therapy [14, 16]. There have been also few studies investigated data-driven techniques for AEs prediction including applying Bayesian model to predict the right heart failure or survival after LVAD [21, 22] and developing decision support system for patients management after LVAD .
Although these studies provide many valuable clinical aspects of AEs, they suffer from major limitations. Some of the results from these studies cannot be generalized as they were case studies or based on one hospital with low numbers of patients, usually a few hundred patient, compared with thousands of patients in INTERMACS. Furthermore, most were clinical studies that focused on traditional statistical methods such as multivariable Cox-Regression Models to establish hazard ratios, odds ratios, and event rates. No study has implemented modern data mining techniques to modelpatterns of post-LVAD AEs. Some are cross sectional studies based on data at a specific point in time after LVAD implantation and therefore do not consider the transitions between AEs over duration after LVAD implant. In addition, these studies treated each type of AE as a separate event, neglecting the possible interrelation or bootstrapping between AEs. LVAD-associated AE rarely occur in isolation. This is due to the causes of events being closely related. For example, a single gastrointestinal (GI) bleed is often followed by more GI bleeds, as the introduction of the continuous flow LVAD can lead to von Willebrand factor dysfunction and arteriovenous malformations . Similarly, bleeding and stroke can occur one after the other in LVAD patients due to issues finding the right balance of anticoagulation . No research has studied the pattern of transition between AEs to answer questions such as, “Which AEs occur commonly in a specific temporal order after LVAD implantation?”.
A patient’s post-LVAD AEs can be represented as a temporal sequence in which each AE is considered a unique element connected to other elements in a sequence to identify the transitions between AEs. Data from a large number of patients’ histories of temporal AEs could be summarized and visualized as a set of temporal sequences of AEs and then patterns within the AE sequences could be extracted using data mining techniques. Several recent studies have applied various temporal mining techniques, such as Markov models and Heuristics models, to extract recurring patterns from longitudinal healthcare data[25, 26, 27, 28, 29, 30, 31, 32, 33]. These studies have identified possible relativity and association between clinical variables such as medical interventions, diagnoses, and treatments [25, 26, 27, 28, 29, 30, 31, 32, 33].
To the best of our knowledge, this study is the first attempt to explore and model sequential patterns of post-LVAD AEs and specifically differentiate distinct groups of patients based on their sequences of AEs. The overall approach for this study is motivated by the methodology introduced by Zhang, et al. et al. [4, 34] and has three main steps, as shown in Fig. 2. First, selected data from INTERMACS were transferred into sequences of AEs for each patient through multiple preprocessing tasks. Next, patients’ AE sequences were clustered into groups with similar sequences using hierarchical clustering. Lastly, patterns of chains of transitions between AEs for each group were extracted using Markov Modeling. The results of this study provide valuable real-world insights about post-LVAD AEs patterns, which clinicians can use to make more informed treatment and management decisions to improve clinical outcomes of LVADs.
Ii-a Data pre-processing
Ii-A1 INTERMACS data selection and pre-processing
This study included 58,575 recorded AEs of 13,192 patients (median age of 50-59; 10,333 male vs. 2,859 female;) with advanced heart failure who received a continuous flow LVAD between 2006 to 2015, extracted from INTERMACS. It should be noted that the incidence of heart failure is significantly higher in men than women at all ages . For patients with multiple device implants, AEs after the first LVAD explant are excluded as patients with multiple subsequent LVAD devices are clinically treated differently.
Final outcomes, such as death, explant, and transplant, were included as the last elements in the sequences of AEs. For the subset of patients who received a right-ventricular assist device (RVAD), the explant of that device was named to “REXP”. For the visualization of the results, each type of AE and final outcomes were color coded, as shown in Fig. 3.
Ii-A2 Forming patients’ sequences
The sequence of AEs for each patient was identified and unified as a single record. A sequence for patient () can be presented as follow:
As INTERMACS provided no information related to the order of concurrent AEs (negligible percent of AEs), they were alphabetically ordered in patients’ sequences.
A consequence of the large number of event types (15) is the possibility of a large number of sparse patterns. For example, if we only considered sequences of length = 3, there would be 2,940 (; “Death” could be considered for only the last element) possible combinations. To overcome this issue, hierarchical clustering was used to divide the space of patients’ sequences into more dense sub-spaces that represent patients with relatively similar sequences which would therefore be more amenable to pattern mining.
Ii-B Hierarchical clustering
The goal of clustering is to identify clinically meaningful groups of patients with relatively similar sequences of post-LVAD AEs.
Ii-B1 Defining the measure of dissimilarity
A distance matrix, d, comprised of distances between each pair of patient sequences was defined as:
where, and are the lengths of the sequences for patients of and , respectively; is the Longest Common Subsequence between and , as formulated below:
Here, is a set of all common subsequences between and and is the length of common subsequence. A subsequence is a secondary sequence derived from another (primary) sequence by deleting some or no elements while maintaining the same order of the remaining elements of the primary sequence. A common subsequence is a subsequence that is common to both and . As an example, the only common subsequence between the , , shown in Fig. 4, is the subsequences of (Bleeding)-(Infection). Thus, in this example, the LCS is 2 and d(,) is 3 which means by 3 movements (deletions of respiratory failure and death from ’s sequence and insertion a bleeding AE to ’s sequence) ’s sequence becomes similar to ’s sequence.
Ii-B2 Defining linkage method for hierarchical clustering
After forming dissimilarity matrix of AE sequences, they were clustered using bottom to top hierarchical clustering with Ward linkage. Hierarchical clustering merges sequences with lowest distance, d
into a single group, and updates the distance matrix for the newly merged group and remaining patients. This merging process repeats using Ward’s method in which groups of patients with lowest post-merging in-group variance (sum of squares) are merged until all patients are in one group. The Ward linkage distance between clusters is computed using Lance–Williams recurrence algorithm. Briefly, considering two clusters of and , the distance between new cluster (= ) and remaining clusters such as is formulated as follows :
Here, is the distance between new cluster and cluster . Lance–Williams coefficients , , and are different for various linkage methods and are defined for Ward linkage as follows:
Here, indicates absolute value, and , , and are numbers of patients in each cluster.
Ii-B3 Define criteria for choosing a number of clusters
The clustering algorithm was implemented to maximize similarity between sequences within a group (internal validation), and minimize similarity between groups (external validation).
Internal validation (the within-group similarity) was performed by extracting the most common subsequences and their support values. This is the proportion of sequences in a group that contain that specific subsequence and ranges from 0 to 1 (0% to 100%). The most common subsequences were extracted from AE sequences by applying a prefix-tree-based search algorithm using “TraMineR” package from R[37, 38]. It should be noted that this algorithm computes the support value for a given subsequence by including all longer sequences containing that subsequence. For instance, patients with the subsequence of (Infection)-(Bleeding) are counted among the patients with the subsequence of (Infection).
External validation (the between-groups dissimilarity) was performed by identifying the subsequences that best differentiate two groups of patients’ sequences using the Pearson Chi-square test (p-value of 0.01).
Step-wise evaluation of clustering: The clustering evaluation for choosing the number of groups (n) started with evaluating the two-cluster solution (n= 2) and evaluation was continued for bigger numbers of groups until both internal and external criteria were satisfied for all the groups. At each step of clustering evaluation, n was increased by 1; only one group was divided into two new groups (G1 & G2) and their qualities (high internal similarity and low external similarity) were evaluated. First, external validation was checked between the G1 and G2. If the external criteria were satisfied, internal validations will be checked for each of the G1 and G2. If any of G1 or G2 satisfied the internal validation, it was considered as a qualified group, otherwise it was considered as an unqualified group that needed to be split into sub-groups with more similar sequences in the later steps of clustering evaluation (bigger values of n).
Interactive visualization evaluation: To help visualize the composition of AEs within each group (or sub-group), a histogram was constructed using the same color coding from Fig. 3. in which the proportions of each category of AE was plotted for each position in the sequence. Fig. 5 provides the histogram for the aggregate of all the 13,192 sequences of AEs over their chronological positions in the sequences. The first column of the graph, AE1, presents the proportions of various types of AE that patients experienced as their first AEs (the first element of the sequences of AEs), the second column for the second AEs, and so forth, through the thirty sixth AE. The uniformity of the distribution in the first set of columns of the histogram (e.g. AE1 through AE20) is contrasted with the heterogeneity of the subsequent columns. This reflects the increasing diversity of AE’s as there is a decreasing number of patients with longer sequences. For instance, there is only one patient with a sequence of 36 AEs - the last AE being death (yellow). This type of visualization is helpful to get a general quick view of the distribution of AEs in a cluster of patients.
The final clusters were evaluated by our clinical experts to determine the reasonableness of the clusters.
Ii-C Markov Chain Models of AEs
Following clinical confirmation of the results of hierarchical clustering, patterns of AEs for each group (cluster) were analyzed using Markov modeling (MM). This has been shown to be a useful tool to model repetitive events, where the timing and the order of events are important . For instance, the transition of bleeding to infection can be considered different if bleeding occurred as the 1st AE followed immediately by infection, (Bleeding1)-(Infection2), versus bleeding occurred as the 2nd
AE, and then infection occurred, (Bleeding2)-(Infection3). Accordingly, the transitions between sequential AEs were assessed for likelihood of transitions as a function of the chronological position in the sequence of AEs. As this is the first attempt to model the sequential AEs after LVAD implant, it was preferred to start with a simple solid model such as the first-order Markov chain. MM was defined for a sequence of, , ,… for discrete points of time: was defined as follows:
was an AE from 15 different types of AE that could occur in various orders in a sequence of AEs. The above formula assumes that the probability of transitioning to the next state,, depends only on the present state, . Another assumption of MM is homogeneity, which assumes all patients in the same state have the same risk of transition.
MM considers each types of AE as unique Markov state, and transitions between states as events. As an example, a sequence of (Bleeding1)-(Infection2)-(Death3) was represented as (Bleeding1 Infection2) (Infection2 Death3) that includes 2 events presenting 2 transitions between 3 states/AEs in the sequence. It also shows the transitivity relation between 2 events as the target state in the first event, Infection2, is the source state in the second event. A transition matrix was formed by computing transition probabilities between each pair of states/AEs from all the sequences of AEs in each group. Then, chains of events were extracted from the transition matrix by connecting events that have transitivity relations. Finally, thresholding of the extracted chains was performed, based on the distributions of transition probabilities and frequency of occurrence to eliminate the “noise” of numerous infrequent or rare transitions. The thresholds were chosen subjectively as 0.1 for transition probability and between 30 to 50 for frequency of occurrences, respectively, to achieve a compromise between reducing noise versus over-simplifying the resulting collection of MMs.
This study was performed with data from INTERMACS for 13,192 patients with advanced heart failure who underwent continuous flow LVAD implant between 2006 and 2015. A total number of 58,575 AEs, including 15 various types of AE, were included in this study. Table. i@ summarizes the 15 types AEs and final outcomes, and indicates that “Bleeding”, “Infection”, and “Cardiac Arrhythmia” are the most common AEs.
The time of recorded AEs ranged between 0 (at the time of implant) and 87 months (7.25 years) after LVAD implant with mean of 9.95 months. A great proportion of AEs (81% of total AEs) occurred within the first 18 months after LVAD with the peak of AEs (27%) during the first month.
|2 Cardiac Arrhythmia||6,361||10.9|
|2 Device Malfunction||3,726||6.3|
|2 Hepatic Dysfunction||850||1.4|
|2 Neurological Dysfunction||3,873||6.6|
|2 Renal Dysfunction||2,248||3.8|
|2 Respiratory Failure||3,614||6.2|
|2 REXP (RVAD explant)||92||0.1|
|2 Right Heart Failure||747||1.3|
|2 Explant: Transplant||3,796||6.5|
|Total recorded events||58,575||100.0|
Iii-a Sequences of AEs
The length of AE sequences ranged from 1 to 36, however 94% of the sequences were less than or equal to 10. The distribution of lengths 15 are provided in Fig. 6. (Lengths 16-32 were very rare, 1%, and therefore not shown).
Iii-B Hierarchical Clustering
Fig. 7 shows the distribution of dissimilarity scores between AE sequences, which is left-skewed. The min/max of scores were 0/53 with the mean of 10 and the median of 8. The numbers of dissimilarity scores less than 2 and greater than 22 were negligible (0%).
The result of hierarchical clustering is presented as a dendrogram, shown in Fig. 8(a), which depicts the taxonomic relationship between clusters formed at each level of grouping. The bottom of the dendrogram represents all the patients’ dissimilarity scores, which is a dense dark area because of the large number of patients. Moving from bottom to top of the dendrogram, one cluster is formed at each level of hierarchical clustering by grouping two sub-clusters until, at the top of the dendrogram, all patients are in a single group. The y axis, “Height”, represents the dissimilarity between clusters (groups of patients) at that level of the tree, which is measured using Ward linkage.
The eventual partitioning of the dendrogram was based on the three criteria: the internal validation (within-group similarity), external validation (between-groups dissimilarity), and clinical interpretation. This was performed iteratively by “cutting” the dendrogram horizontally. An example is shown in Fig. 8(a) in which a horizontal cut results in two groups (G1 and G2). The first group includes 862 patients (7% of total patients), and the second group that includes 12,330 patients (93%). Their corresponding histograms (similar to Fig. 5) are provided in Fig. 8(b), in which the proportions of various types of AE are temporally ordered. Visual inspection of the histogram corresponding to G1 reveals an obvious dominance of the Bleeding AE (red color). This also implies a degree of similarity of patients in this group. By contrast, the histogram corresponding to G2 does not present any single dominant AE. This is understandable since this group is comprised of a much larger proportion of the total number of patients (93%). Thus, the patients in the second group are not similar and require further stratification.
Fig. 8(c) represents the support values of the five most common subsequences in these two groups. In the first group (G1), the unary sequence (Bleeding) and the binary sequence (Bleeding)-(Bleeding) are the most common, both with support value of 1, indicating that 100% of patients in this group had a minimum of two bleeding AEs in their sequences. The next most common sequences were found to be (Bleeding)-(Bleeding)-(Bleeding) and (Bleeding)-(Bleeding)-(Bleeding)-(Bleeding), with support values of 0.98 and 0.73. It was concluded that the sequences in the first group all shared common subsequences, indicating high within-group similarity. In contrast, the maximum support value for the second group (G2) was approximately 50% corresponding to the (Infection) subsequence. This indicated low similarity within this group, and hence does not pass the internal validation.
As a final test, Fig. 8(d) shows the external validation in which subsequences which best discriminate sequences of the two groups via Pearson Chi-square test. The table shows the first four most discriminative subsequences, in which the plus (+) and minus (-) sign indicates that the observed frequencies of the subsequences were higher (+) or lower (-) than equally distributed frequencies. Here, p-values below 0.05 were taken to indicate discrimination between groups. The subsequence of (Bleeding)-(Bleeding)-(Bleeding) was the most discriminating subsequence with p-value of 0.01 and support percentage for G1 greater than 98%. All of the remaining discriminative subsequences consisted of at least two bleeding AEs, emphasizing that multiple bleeding AEs was responsible for differentiating G1 from G2. Accordingly, it was concluded that the groupings were externally validated.
In summary, the initial two-cluster solution passed the external validation, but only G1 passed the internal validation. However, the second group which includes the remaining 93% of patients required further subdivision. This was accomplished in a similar manner, by bisecting the dendrogram at a lower level, effectively separating the second group into two sub-groups of 6,168 and 6,162 patients. This was followed by the same process to evaluate the external and internal validation. This procedure was repeated until all the resulting groups passed validation, resulting in a final number of seven groups (summarized in Table. ii@). For convenience, each of the seven groups was given a mnemonic name based on visual inspection of the histogram including: GRP1: “Recurrent bleeding”, GRP2: “ Trajectory of device malfunction & explant”, GRP3: “ Infection”, GRP4: “Trajectories to transplant”, GRP5: “ Cardiac arrhythmia”, GRP6: “Trajectory of neurological dysfunction & death”, and GRP7: “ Trajectory of respiratory failure & renal dysfunction & death”. For example, this histogram of GRP1 reveals an obvious dominance of the Bleeding AE (red color), and was therefore given the name “Recurrent Bleeding.” In a similar fashion, G2 revealed a dominance device malfunction (forest green) and explant (lime green.) Since the two are always related sequentially, this group was named “Trajectory of device malfunction & explant.”
Fig. 9 demonstrates clustering results through a regression tree. There is histogram associated with each of the groups; the ordinate of which reflecting the proportion of each color-coded AE type (from 0% to 100%) and abscissa reflecting the location in the respective sequences. Since each group has a unique maximum length, this is reflected in the varied width of these plots. Each of these groups were assigned a descriptive title that reflected the dominant AE or AEs therein. For instance, the dominant colors in the GRP2 plot are yellow green (representing device malfunction) and spurious green (representing Explant outcome). Sequence analysis for GRP2, similar to sequences analysis for GRP1 in Fig. 8(c)&(d), reveals that 1,097 out of 1,193 patients who experienced device malfunction eventually had the device explanted, indicating the temporal pattern (subsequence) of (Device Malfunction)-(Explant). The sequence analysis was performed for each of the seven groups.
|Number of AEs||AEs post-LVAD time (month)||AEs time span (month)|
|Number of patients (*)||Number of AEs (**)||Min/Max||Mean||Median||Mean||Median||Mean||Median|
|862 (7%)||8,388 (14%)||3/36||9.73||9||12.55||7||19.48||16|
|GRP2:Trajectory of device malfunction & explant|
|1,591 (12%)||7,049 (12%)||1/21||4.43||4||11.19||6||9.89||4|
|3,438 (26%)||15,771 (27%)||1/31||4.59||3||11.31||6||10.25||4|
|GRP4:Trajectories to heart transplant|
|3,302 (25%)||10,093 (17%)||1/22||3.06||3||7.98||5||7.31||4|
|1,275 (10%)||5,715 (10%)||1/22||4.48||3||8.63||3||9.48||4|
|GRP6:Neurological dysfunction & death|
|1,616 (12%)||5,911 (10%)||1/28||3.66||3||9.55||5||6.42||1|
|GRP7:Trajectory of respiratory failure & renal dysfunction & death|
|1,108 (8%)||5,648 (10%)||1/18||5.10||5||6.06||1||5.78||1|
* % of the total 13,192 patients in this study
** % of the total 58,575 recorded AEs in this study
AEs post-LVAD time is based on the month after LVAD implant (0 post-LVAD month means AE occurred at the time of LVAD implant)
AEs time span = time of the last AE (post-LVAD month) - time of first AE (post-LVAD month)
Table. ii@ provides statistics of each patient’s group. GRP3 and GRP4 had the highest number of patients and AEs by having 26% and 25% of total number of patients, respectively, and 27% and 17% of total recorded AEs, respectively, in this study. On the other hand, GRP1 had the lowest number of patients, 7% of the total number of patient (862 patients), while they had 14% of total AEs. In addition, patients in GRP1 had the greatest number of AEs with average number of 9.73 AEs, and minimum and maximum numbers of 3 and 36 AEs. The average numbers of AEs in other groups were 5 and minimum numbers of AEs were 1. Columns of 6 and 7 of Table. ii@ shows information related to the time of AEs occurrences measured by the months after the LVAD implants. The distributions of post-LVAD time (month) of AEs occurrences were skewed to the right in all the groups as the means were greater than the medians. The average time of AEs occurrences were less than 13 months in all the groups and the median time were less than 7 months. AEs of GRP7 occurred at the earliest post-LVAD time by average of 6.06 months after LVAD and median of 1 month after LVAD. The last two columns of Table. ii@ presents information related to the time span of patients’ AEs (time of last AE - time of first AE) measured in month. The distributions of time span of patients’ AEs were also skewed to right indicating higher number of patients experienced AEs in a short time span. GRP1 had the longest time span of AEs by average of 19.48 months between the first AE and last AE, while, the GRP6 and the GRP7 had the shortest time span by average of approximately 6 months and median of 1 month.
Iii-C Markov Chain Models of AEs
The following sections presents results of Markov modeling (MM) within each of the groupings of patients presented above. The chains of transitions between AEs are presented with the graphs in which the size of the circles represent the frequency of AEs at each position in the chain and the thickness of the arrows reflecting the frequency of each AE that are followed by the subsequent AE.
GRP1: Recurrent bleeding
All the 862 patients in GRP1 had at least two bleeding AEs, and among them, 98% (847 patients) had at least three bleeding AEs. The Markov chain for GRP1, shown in Fig. 10 is characterized by a long sequence of recurrent bleeding AE’s (red circles), with a limited amount of branching involving an intermediate infection AE (blue circles). The frequencies of transitions depicted by the thickness of the arrows is seen to diminish progressively along the chain over the time. For example, the transition frequency for the 1st through 5th bleeding exceeded 300; but beyond the 5th bleeding event, the frequency reduced to the range 100-250. This analysis revealed that the probability of recurrent bleeding exceeded 50% for all instances up to the 13th AE. The transitions probabilities from infection AEs to bleeding AEs were 50% to 65%, while from bleeding AEs to infection AEs were 14% to 18%.
GRP2: Trajectory of device malfunction & explant.
75% of 1,591 patients in GRP2 experienced a device malfunction and 94% had Explant as their final outcome. The Markov chain for GRP2 (Fig. 11) was found to be much more diverse than the chain for GRP1. Multiple paths involving device malfunction (dark green) were found, although the terminal event was most commonly (50%) device explant (light green). Only a small number of patients in this group (n=111, approximately 7%) experienced device explant as the initial, isolated AE (indicated by the small light green node at the bottom of the 1st column of Fig. 11). The majority of patients for which device explant was recorded was preceded by another AE, most commonly bleeding and infection (probability between 20-30%). There were also about 18% and 19% probabilities of recurrent device malfunction as the 2nd AEs or the 3th AEs, respectively. The frequency of transition from Infection as the 2nd AE to Explant was 21%, but with no reported device malfunction AE (n=211, the spurious green circle in the 1st column).
The Markov chain for GRP3 (Fig. 12) is characterized by a long sequence of infection AEs (blue circles), corresponding to a total number of 6,462 recorded infection AEs for 3,438 patients in GRP3, with a limited amount of branching involving an intermediate bleeding AE (red circles). This analysis revealed that the probability of recurrent infection ranged from 34% to 49%. The frequencies of recurrent infection AEs depicted by the thickness of the arrows is seen to diminish progressively along the chain. For example, the transition frequency for the 1st through 4th infection AEs ranged 300-420; but beyond the 4th infection event, the frequency reduced to the range of 50-300. The transition probabilities from bleeding to infection were 39% to 45%, while from infection to bleeding were less than 18%. There were also 285 patients (8%) who received a heart transplant after experiencing an infection AE (represented by the dark green circle in the 2nd column in Fig. 12). There were also transitions from cardiac arrhythmia and respiratory failures as the 1st AEs to infection as the 2nd AEs.
GRP4: Trajectories to transplant
The Markov chains for GRP4 (Fig. 13) represents AE trajectories of 3,302 patients who ended in receiving a heart transplant. The majority of patients who revived heart transplants was precede by various types of AE, most commonly bleeding, infection, and cardiac arrhythmia. Only 970 patients (29%) in this group received a heart transplant as the initial, isolated event (the dark green circle in the first column of Fig. 13). AE trajectories to heart transplants from some specific types of AE like bleeding, infection, or cardiac arrhythmia were more likely than other types of AE like neurological dysfunction AE or device malfunction AE. For instance, 520 patients who experienced only one AE and then received a heart transplant were preceded by mostly bleeding or cardiac arrhythmia (cumulative 388 patients), and minimally by neurological dysfunction or device malfunction (cumulative 132 patients).
GRP5: Cardiac arrhythmia
Fig. 14 represents a chain of recurrent cardiac arrhythmia AEs (orange circles) in GRP5 (1,275 patients), with a limited amount of branching involving an intermediate infection AE (blue circles). The frequency of transitions depicted by the thickness of the arrows is seen to diminish progressively along the chain. As an example, 919 patients in GRP5 (72%) experienced cardiac arrhythmia as 1st AE and only 242 of them experienced cardiac arrhythmia as the 2nd AE too; beyond the 2nd AE the frequency gradually decreased to 71 for cardiac arrhythmia as the 6th AE. The probability of transitions for recurrent cardiac arrhythmia AEs were between 34% to 49% over the time. There were also transitions between cardiac arrhythmia and infection with probabilities from 16% to 36%.
GRP6: Trajectory of neurological dysfunction & death
The Markov chain of GRP6 (Fig. 15) shows the trajectory of 1,616 patients who died (yellow circles) after suffering from neurological dysfunction AEs (purple circles). The rate of death for patients who suffered from neurological dysfunction AEs as the 1st AE trough 3th AE ranged from 27% to 39%. The majority of patients for which neurological dysfunction AE was recorded was preceded by other types of AE, most commonly bleeding and infection. Only a small number of patients in GRP6 (n=229, approximately 14%) died with no reported AEs (the yellow circle in the 1st column of Fig. 15). There were also a small number of patients who died after one or recurrent infection AEs with no recorded neurological dysfunction AE (thin blue arrows from blue circles to yellow circles).
GRP7: Trajectory of respiratory failure & renal dysfunction & death
The Markov chain of GRP7 (Fig. 16) illustrates two different AE trajectories to death. One main trajectory represents 929 patients who died after suffering from respiratory failure (934 recorded respiratory failure AE) and/or renal dysfunction AEs (712 recorded renal dysfunction AEs) with transition probabilities between 22% to 44%. It also revealed that patients with reported respiratory failure and renal dysfunction were preceded by other types of AE, most commonly by infection AE and bleeding AE. The transition probabilities from renal dysfunction or infection to respiratory failure ranged from 32% to 36%. Another trajectory of this group presents 179 patients who died after suffering from one or recurrent bleeding AEs with transition probabilities from 19% to 24%.
Patients who experienced recurrent bleeding AEs after LVAD. Recurrent occurrences of various types of bleeding such as gastrointestinal bleeding is commonly reported in clinical studies of patients who receive an LVAD [9, 13, 7]
|GRP2:Trajectory of device malfunction & explant|
Patients who had device malfunction and commonly had their LVADs removed (explanted). This trajectory was found to be preceded by the two types of AEs including infection and bleeding. Two others, less common, trajectories within GRP2 were 1. patients who had an explant without device malfunction and 2. patients who had a device malfunction without ending in explant. In clinical literature, device malfunction is defined as failure of one or more parts of LVAD that cause the inability to maintain adequate circulatory support. Device malfunction might be deadly or could be solved by replacing the device (explant) [40, 41]. It is important to note that within INTERMACS patients there may be some who had serious pump malfunction such as internal thrombosis, but the patient was not considered a candidate for pump exchange and the LVAD may have been simply turned off.
Patients who suffered mostly from infection AEs. The most recent INTERMACS annual report indicated infection as the most frequent AE after bleeding during the first three months and the most common AE thereafter .
|GRP4:Trajectories to heart transplant|
Patients who received a heart transplant and the pump was explanted as part of the procedure. AE trajectories to heart transplant were mostly consisted of bleeding AE, infection AE, and cardiac arrhythmia AE. In practical terms these AEs resulted in an upgraded listing for cardiac transplant resulting in a higher likelihood of achieving a heart transplant. It is not uncommon for LVAD complications to drive more urgent listing status of a candidate and this analysis conforms that strategy. The most recent INTERMACS annual reports indicated slightly more than 30% of heart transplant candidates who received continuous-flow LVAD received a heart transplant [42, 3].
Patients who experienced one or recurrent cardiac arrhythmia AEs after LVAD implant that accompanied mostly with infection AEs and bleeding AEs. Patients with ventricular arrhythmias tend to have recurrent episodes of these rhythm disturbances one they begin to manifest them.
|GRP6:Neurological dysfunction & death|
|GRP7:Trajectory of respiratory failure & renal dysfunction & death|
GRP7 trajectory is predominated by both respiratory failure AEs and renal dysfunction AEs that ended in death although some of the trajectories also were accompanied by a bleeding AE and an infection AE. The eighth INTERMACS annual report named renal dysfunction and chronic pulmonary disease among the non-cardiac system commodities that impact the LVAD survival rate .
This is the first study to explore patterns of sequential AEs and their related final outcomes in patients with advanced heart failure who received an LVAD implant. The results of our analysis revealed previously-unknown or under-appreciated patterns of adverse events in LVAD patients, and their relationships with final outcomes, as presented in Table. iii@. These trajectories can be used as the basis for personalized medical management of the LVAD patient: by identifying developing patterns of AE’s and therefore intervening with evasive treatment.
Patients were not equally distributed among the groups that resulted from clustering. As an example, GRP1 (Recurrent bleeding) include small group of patients (less than 1000) who experienced greatest numbers of AEs (with a median of 9 AEs). While, each of GRP3 (Infection) and GRP4 (Trajectories to transplant) included more than 3000 patients with the median number of 3 AEs. The imbalanced number of patients in the groups implies high incidence of specific AE patterns among the patients with LVAD that seek more clinical attention for future enhancement of LVAD outcome.
The time of occurrence following implant is very important to the analysis of AEs. It is highly valuable for physicians to know how many days or months after LVAD implant AEs are likely to occur or how quickly a series of AEs may occur for a patient. The time analysis of AE sequential patterns (Table. ii@) indicates various timing characteristics among the groups. For instance, AEs in the GRP7 with respiratory and renal failure occurred in the immediate months after LVAD implant with median time of 1 month after LVAD implant reflecting typical post-operative timing of these events. This is in contrast to AEs in the bleeding group (GRP1) with median time of 9 months after LVAD implant, more typical of onset of gastrointestinal bleeding. This analysis also revealed differences in time span over which AEs occurred between groups. For example, AEs for patients with neurological events (GRP6) occurred over a short period of time (median time span of 1 month) contrasted with GRP5 patients whose AEs spanned a median of 4 months. These different timing characteristics of AE sequential patterns may have implications in guiding post-LVAD medical interventions or preventative measures.
Challenges & Limitations & Future work
The first challenge encountered in this study was related to data preprocessing including choosing a right population of patients and AEs. This part of study was an iterative process, involving feedback from clinicians about medical interpretation of the results. As an example, it was decided to exclude AEs after the explant of the first LVAD as patterns of AEs and clinical decisions and therapies for patients with more than one device are unique and need to be studied separately. Also, the choice of criteria to evaluate clustering results and decide about the number of groups was another challenge. Zhang et al.  used Silhouette values to determine cluster number and support values for internal evaluation of clustering. However, the number of patients in this study (13,192 patients) was more than 10 times greater compared with the number of patients in  (1,576 patients). Thus, considering only support values was inadequate. It is obvious that high numbers of groups will result in increasing similarity of patients within each group; however, the number of groups should be limited to preserve the clinical utility of the results. This was the motivation to the step-wise clustering evaluation that was implemented that considers both within-group similarity and between-groups dissimilarity criteria.
Markov models in this study considered the order of occurrences for AE transitions in patient’s sequences. For instance, the transition from bleeding as the first AE to infection as the second AE was considered different from transition from bleeding as the fifth AE to infection as the sixth AE. Thus, the probabilities of transitions were evaluated by considering where in the patient’s sequence transitions occurred. This temporal constraint is very useful since AEs have a different effect on LVAD final outcome when they occurred in immediate succession as compared to a sequence with a different intermediate AE.
One main limitation of this study is related to the voluntary collection and reporting of INTERMACS data. As an example, some AEs like infection is a longitudinal AE that might last for a while, but INTERMACS only records the occurrence of AE without recording its duration. Another issue is that there is no information regarding the order of concurrent AE (events occurred at the same day). The problem was exacerbated by the fact that this registry is comprised of contributions of over 100 centers, and hence involves differences in interpretation of definitions, omissions, and data entry errors. As an example, the ongoing change in the definition of right heart failure (RHF) causes inconsistency between studies about analysis of RHF, and therefore, reduces the confidence in results. Consequently, it would be helpful to pull in expertise from field to learn more about INTERMACS definitions and workflow of data collection, to avoid bias in the future studies.
The time gaps between sequential AEs were not considered in clustering and Markov modeling to minimize the diversity of sequences. One solution that was evaluated was to segment post-LVAD time, based on the critical points based on the AEs distribution and physicians’ suggestions. This reduced the maximum number of time points in the sequences from 36 to 7. However, it was not adequate to prevent enormously increasing diversity. Further consolidating the timeline into short-term and long-term could be another solution that requires some iterative process to find an optimum. Adding time gaps will also help concurrent AEs issue by considering basket of AEs with 0 time gaps as one element in the sequence.
The existence of sequential post-LVAD AEs raises the question of what causes patients to experience different AEs patterns. Further research of post-LVAD AEs must seek to include pre and post LVAD risk factors to improve the prognostic value of this analysis. An accurate prediction of the next AE (AEn+1) by considering previous AEs (AE1 to AEn) would be another important contribution, and a valuable tool for physicians to optimize treatment to minimize the risk of future AEs.
This study, to the best knowledge of the authors, was the first exploratory to discover sequential chains of AEs following LVAD implant. Mining of the AE sequences of 13,192 patients with advanced heart failure derived from the INTERMACS registry revealed the existence of seven groups of sequential chains of AEs, each characterized by a dominant AE or multiple AEs and occurring in a unique order. The discovered chains of AEs disclose potential interdependence between AEs and provide clinicians a valuable insight into the patient oriented post-LVAD AEs evidence. It is hoped that this analysis may support post-LVAD follow-up by alerting medical providers of the likelihood of impending AEs - based on a combination of independent factors, and patterns of prior AEs.
This work was supported by an R01 grant (R01HL086918) from the National Institutes of Health/National Heart, Lung, and Blood Institute. We would also like to thank the Data Access, Analysis, and Publications Committee of INTERMACS for allowing us to use their registry for the study.
-  Thoratec, [Online; accessed 11-July-2018]. [Online]. Available: http://www.thoratec.com/about-us/media-room/library.aspx
-  K. Khush, J. Zaroff, J. Nguyen et al., “National decline in donor heart utilization with regional variability: 1995–2010,” Am. J. Transplantn., vol. 15, no. 3, pp. 642–649, 2015.
-  J. K. Kirklin, F. D. Pagani, R. L. Kormos et al., “Eighth annual INTERMACS report: Special focus on framing the impact of adverse events,” J. Heart Lung Transplant., vol. 36, no. 10, pp. 1080–1086, 2017.
-  Y. Zhang, R. Padman, and N. Patel, “Paving the cowpath: Learning and visualizing clinical pathways from electronic health record data,” J. Biomed. Inform., vol. 58, pp. 186–197, 2015.
-  M. S. Slaughter, J. G. Rogers, C. A. Milano et al., “Advanced heart failure treated with continuous-flow left ventricular assist device,” N. Engl. J. Med., vol. 361, no. 23, pp. 2241–2251, 2009.
-  P. S. Joy, G. Kumar, A. K. Guddati et al., “Risk factors and outcomes of gastrointestinal bleeding in left ventricular assist device recipients,” Am. J. Cardiol., vol. 117, no. 2, pp. 240–244, 2016.
-  K. V. Draper, R. J. Huang, and L. B. Gerson, “GI bleeding in patients with continuous-flow left ventricular assist devices: a systematic review and meta-analysis,” Gastrointest. Endosc., vol. 80, no. 3, pp. 435–446, 2014.
-  J. M. Stulak, M. E. Davis, N. Haglund et al., “Adverse events in contemporary continuous-flow left ventricular assist devices: A multi-institutional comparison shows significant differences,” J. Thorac. Cardiovasc. Surg., vol. 151, no. 1, pp. 177–189, 2016.
-  M. C. Bunte, E. H. Blackstone, L. Thuita et al., “Major bleeding during HeartMate II support,” JACC, vol. 62, no. 23, pp. 2188–2196, 2013.
-  V. K. Topkara, S. Kondareddy, F. Malik et al., “Infectious complications in patients with left ventricular assist device: etiology and outcomes in the continuous-flow era,” Ann. Thorac. Surg., vol. 90, no. 4, pp. 1270–1277, 2010.
-  L. Harvey, C. Holley, S. S. Roy et al., “Stroke after left ventricular assist device implantation: outcomes in the continuous-flow era,” Ann. Thorac. Surg., vol. 100, no. 2, pp. 535–541, 2015.
-  J. Z. Willey, M. V. Gavalas, P. N. Trinh et al., “Outcomes after stroke complicating left ventricular assist device,” J. Heart Lung Transplant., vol. 35, no. 8, pp. 1003–1009, 2016.
-  S. Sami, S. Al-Araji, and K. Ragunath, “Review article: gastrointestinal angiodysplasia–pathogenesis, diagnosis and management,” Aliment. pharmacol. & Ther., vol. 39, no. 1, pp. 15–34, 2014.
-  A. J. Boyle, U. P. Jorde, B. Sun et al., “Pre-operative risk factors of bleeding and stroke during left ventricular assist device support: an analysis of more than 900 heartmate ii outpatients,” JACC, vol. 63, no. 9, pp. 880–888, 2014.
-  K. Toda, Y. Yonemoto, T. Fujita et al., “Risk analysis of bloodstream infection during long-term left ventricular assist device support,” Ann. Thorac. Surg., vol. 94, no. 5, pp. 1387–1393, 2012.
-  D. J. Goldstein, D. Naftel, W. Holman et al., “Continuous-flow devices and percutaneous site infections: clinical outcomes,” J. Heart Lung Transplant., vol. 31, no. 11, pp. 1151–1157, 2012.
-  J. M. Schaffer, J. G. Allen, E. S. Weiss et al., “Infectious complications after pulsatile-flow and continuous-flow left ventricular assist device implantation,” J. Heart Lung Transplant., vol. 30, no. 2, pp. 164–174, 2011.
-  V. Sharma, S. V. Deo, J. M. Stulak, L. A. Durham et al., “Driveline infections in left ventricular assist devices: implications for destination therapy,” Ann. Thorac. Surg., vol. 94, no. 5, pp. 1381–1386, 2012.
-  F. Al-Mufti, A. Bauerschmidt, J. Claassen et al., “Neuroendovascular interventions for acute ischemic strokes in patients supported with left ventricular assist devices: A single-center case series and review of the literature,” World Neurosurg, vol. 88, pp. 199–204, 2016.
-  T. S. Kato, P. C. Schulze, J. Yang et al., “Pre-operative and post-operative risk factors associated with neurologic complications in patients with advanced heart failure supported by a left ventricular assist device,” J Heart Lung Transplant, vol. 31, no. 1, pp. 1–8, 2012.
-  M. Kanwar, C. Khoo, L. Lohmueller et al., “Predicting post lvad acute severe right heart failure using bayesian analysis,” J. Heart Lung Transplant, vol. 38, no. 4, p. S357, 2019.
-  M. Kanwar, L. Lohmueller, R. Kormos et al., “Bayesian model for predicting 90 day event free survival in lvad patients,” J. Heart Lung Transplant, vol. 37, no. 4, p. S193, 2018.
-  E. Karvounis, M. Tsipouras, A. Tzallas et al., “A decision support system for the treatment of patients with ventricular assist device support,” Methods Inf. Med., vol. 53, no. 02, pp. 121–136, 2014.
-  P. Shah, U. S. Tantry, K. P. Bliden, and P. A. Gurbel, “Bleeding and thrombosis associated with ventricular assist device therapy,” J. Heart Lung Transplant, vol. 36, no. 11, pp. 1164–1173, 2017.
-  E. Rojas, J. Munoz-Gama, M. Sepúlveda et al., “Process mining in healthcare: A literature review,” J. Biomed. Inform., vol. 61, pp. 224–236, 2016.
-  W. Yang and Q. Su, “Process mining for clinical pathway: Literature review and future directions,” in 11th Int Conf ICSSSM. IEEE, 2014, pp. 1–5.
-  M. Ghasemi and D. Amyot, “Process mining in healthcare: A systematised literature review,” Int J Electron Healthc, vol. 9, no. 1, pp. 60–88, 2016.
-  R. S. Mans, M. Schonenberg, M. Song et al., “Application of process mining in healthcare–a case study in a dutch hospital,” in Int.l Conf. Biomed. Eng. Syst. Technol. Springer, 2008, pp. 425–438.
-  U. Kaymak, R. Mans, T. van de Steeg et al., “On process mining in health care,” in IEEE Int. Conf. Syst. Man Cybern., 2012, pp. 1859–1864.
-  Z. Zhou, Y. Wang, and L. Li, “Process mining based modeling and analysis of workflows in clinical care-a case study in a chicago outpatient clinic,” in IEEE 11th Int. Conf. Netw. Sens. Control, 2014, pp. 590–595.
-  V. Augusto, X. Xie, M. Prodel et al., “Evaluation of discovered clinical pathways using process mining and joint agent-based discrete-event simulation,” in Proc. Winter Simulation Conf. IEEE Press, 2016, pp. 2135–2146.
-  C. Fernandez-Llatas, L. Sacchi, J. M. Benedi et al., “Temporal abstractions to enrich activity-based process mining corpus with clinical time series,” in Int. Conf. Biomed. Health Informat. IEEE, 2014, pp. 785–788.
-  T. G. Erdogan and A. Tarhan, “Systematic mapping of process mining studies in healthcare,” IEEE ACCESS, vol. 6, pp. 24 543–24 567, 2018.
-  Y. Zhang, R. Padman, L. Wasserman et al., “On clinical pathway discovery from electronic health record data,” IEEE Intell. Syst., vol. 30, no. 1, pp. 70–75, 2015.
-  P. Mehta and M. Cowie, “Gender and heart failure: a population perspective,” Heart, vol. 92, no. suppl 3, pp. iii14–iii18, 2006.
-  F. Murtagh and P. Legendre, “Ward’s hierarchical agglomerative clustering method: which algorithms implement ward’s criterion?” J. classification, vol. 31, no. 3, pp. 274–295, 2014.
-  F. Masseglia, “Algorithmes et applications pour l’extraction de motifs séquentiels dans le domaine de la fouille de données: de l’incrémental au temps réel,” These de doctorat, Université de Versailles Saint-Quentin-en-Yvelines, 2002.
-  G. Ritschard, R. Bürgin, and M. Studer, “Exploratory mining of life event histories,” in Contemporary issues in exploratory data mining in the behavioral sciences. Routledge New York, NY, 2013, pp. 221–253.
-  F. A. Sonnenberg and J. R. Beck, “Markov models in medical decision making: a practical guide,” Med. Decis. Making, vol. 13, no. 4, pp. 322–338, 1993.
-  W. P. Dembitsky, A. J. Tector, S. Park et al., “Left ventricular assist device performance with long-term circulatory support: lessons from the rematch trial,” Ann Thorac Surg, vol. 78, no. 6, pp. 2123–2130, 2004.
-  A. G. Rizzieri, J. L. Verheijde, M. Y. Rady et al., “Ethical challenges with the left ventricular assist device as a destination therapy,” Philos Ethics Humanit Med, vol. 3, no. 1, p. 20, 2008.
-  J. K. Kirklin, D. C. Naftel, F. D. Pagani, R. L. Kormos et al., “Seventh INTERMACS annual report: 15,000 patients and counting,” J. Heart Lung Transplant., vol. 34, no. 12, pp. 1495–1504, 2015.
-  J. Ayres, J. Flannick, J. Gehrke et al., “Sequential pattern mining using a bitmap representation,” in 8th ACM SIGKDD. ACM, 2002, pp. 429–435.