Preventing the development of heart disease in patients with known risk factors, such as hypertension, represents a major challenge for healthcare systems worldwide. Although much is known about how these risk factors influence development of disease, the recent availability of large scale databases such as the UK Biobank represents an excellent opportunity to extend this knowledge. Learning from a large number of highly detailed, multidimensional cardiac magnetic resonance (CMR) datasets can help further understanding of how risk factors impact cardiac function, and tailor medical interventions to individual patients.
Traditionally, machine learning techniques have relied on the use of handcrafted features to effectively perform a specific task without using explicit instructions. In some cases, the accuracy of these models was limited by the model being restricted to the use of these features. Recently, deep learning (DL) techniques have demonstrated a significant increase in performance over traditional machine learning methods. DL allows features to be learned from the data themselves, without preselection. One drawback of DL approaches is the lack of interpretability, as the learned relationships and features are often abstract and opaque to human users. Especially in medicine, interpretability and accountability are vital for two main reasons: (a) they can promote clinician trust in the learned models; and (b) the use of well-established, interpretable biomarkers allows the models to be used to better understand disease processes, translate the results to other populations and to interpret the newly learned information in the light of already existing clinical scientific evidence. For example, complex full 3D cardiac motion of the heart can be used to outperform current models in survival estimation for patients with pulmonary hypertension. While methods like this demonstrate the power of DL, it is difficult for clinicians to understand the features underlying the predictions and to use the model to better understand the disease.
A large number of biomarkers can be calculated from CMR. These are well-understood by clinicians and provide comprehensive information about underlying physiological processes. However, estimating them all is labour-intensive. In this paper, we employ a fully automated DL-based pipeline for estimating a wide range of biomarkers of cardiac function from cine CMR data. Our main contribution is to propose a framework that enables the interpretability of these automatically computed biomarkers to be combined with the power of learned features in DL. This framework is based on the use of a variational autoencoder with a latent space regression loss (R-VAE), in which the input data are the clinical biomarkers. In addition, we use a dummy variable in the regression to differentiate between population groups. We use the proposed method to investigate the impact of systolic blood pressure (SBP: a measure of hypertension) on cardiac function in the healthy population differentiated by gender.
Related Work: In the clinical literature, many groups have investigated the relationship between SBP and ventricular structure, function and geometry [2, 11]. However, most studies only investigated the influence of SBP on global parameters or only in the left ventricle. The proposed pipeline enables a more detailed investigation of the impact of SBP on cardiac function, as we demonstrate in Section 4. VAEs have previously been used for identification and visualization of features in medical image-based classification tasks [5, 4, 8], but the features were still learnt from the data and were not well-established clinical biomarkers as in our work. DL models have also previously been proposed for regression using interpretable features. For example, Xie et al.  proposed an autoencoder-based deep belief regression network to forecast daily particulate matter concentrations. Similarly, Bose et al.  proposed a stacked autoencoder based regression framework to optimize process control and productivity in intelligent manufacturing. Both techniques combined handcrafted features obtained from the image domain with autoencoders. In the medical field, Xie et al.  have proposed a deep autoencoder model for regression of gene expression profiles from genotype. Similar to these works we integrate a regression loss into the autoencoder to learn relationships between the latent space and another variable (SBP in our case). Our work is methodologically distinct from [16, 6, 17] as we employ a VAE, which enables us to sample from the distribution of the latent space and decode the clinical biomarkers for these samples. We also employ a dummy variable in the regression to enable the investigation to be stratified by gender. Our work also represents the first time that a regression-based autoencoder has been applied to investigate the impact of risk factors on cardiac function.
We evaluate our approach on subjects selected from the UK Biobank data set, which contains multiple imaging and non-imaging information from more than half a million 40-69 year-olds. We included only participants with CMR imaging data. From this group, we excluded participants with a history of cardiovascular disease, respiratory disease, renal disease, cancer, rheumatological disease, symptoms of chest pain or dyspnoea. For each subject, the following cine CMR acquisitions were used; a short-axis (SA) stack covering the full heart, and two orthogonal long-axis (LA) acquisitions (2-chamber (2Ch) and 4-chamber (4Ch) views). For each image slice 50 temporal frames were available covering a full cardiac cycle (temporal resolution 14-24 ms/frame). All CMR imaging was carried out on a 1.5 Tesla scanner (Siemens Healthcare, Erlangen, Germany). Details of the image acquisition protocol can be found in [petersen2016uk]. Blood pressure was measured using the HEM-70151T digital blood pressure monitor (Omron, Hoofddorp, The Netherlands) 
In the following sections we first describe the automated estimation of biomarkers of cardiac function from CMR images, and secondly present the R-VAE network used to learn the relationship between cardiac function and SBP. Figure 1 summarizes these steps and how they interrelate.
3.1 Biomarker Estimation From CMR
The procedure used to extract the clinical biomarkers of cardiac function is based on the work published in  and is briefly summarized below.
Automatic segmentation network: We first used a fully-convolutional network with a 17 convolutional layer VGG-like architecture for automatic segmentation of the left ventricle (LV) blood pool, LV myocardium and right ventricle (RV) blood pool from SA and LA slices in all frames through the cardiac cycle [1, 13]. After this, all segmentations were aligned to correct for breath-hold induced motion artefacts using the iterative registration algorithm proposed in .
Biomarker calculation: LV and RV blood volume curves were calculated from the obtained segmentations. From these curves, end-diastolic volume (EDV), end-systolic volume (ESV), stroke volume (SV), ejection fraction (EF), LVED mass (LVEDM), peak ejection rate (PER), peak early filling rate (PEFR), atrial contribution (AC) and peak atrial filling rate (PAFR) were obtained. Cardiac volumes were indexed to body surface area (BSA) using the Dubois and Dubois formula . The complete list of biomarkers calculated was: iLVEDV (indexed LVEDV), iLVSV, LVEDM, LVPER, LVPFR, LVPAFR, LVAC, iRVEDV, RVPER, RVPFR, RVPAFR and RVAC.
Quality control: Similar to , two quality control (QC) methods were implemented to automatically reject subjects with insufficient image quality or incorrect segmentations, ensuring the robustness of the estimated biomarkers. The first QC step (QC1) used trained DL models to reject any image with poor quality or incorrect planning, and the second QC step (QC2) detected incorrect segmentation results using an SVM model that identified physiologically unrealistic or unusual volume curves.
3.2 Deep Learning Regression Model
To combine the interpretability of handcrafted features with the power of DL we propose to use a VAE featuring a regression loss in the latent space to simultaneously learn efficient representations of cardiac function and map their change with regard to differences in SBP. As a regression model we used the multivariable regression modelling commonly used in epidemiological studies , where the effect of different independent variables are included as confounders on the regression model. As a result, the VAE linearizes the relationship between different clinical biomarkers and the variable of interest (SBP in this case), and these features are incorporated in a standard regression model.
Variational autoencoder: The encoding part of a VAE allows a number of features, , to be mapped into a lower dimensional representation (the latent space), whilst the decoder maps this representation back to the original higher dimensional space. The proposed VAE has
input units representing the clinical biomarkers, two hidden layers with 8 and 4 hidden units respectively and a latent space of dimensionality 2. To avoid over-fitting we apply dropout with probability 0.3 after each hidden layer during training.
Linear regression with indicator (dummy) variable:
We use a linear regression model in the latent space to estimate SBP, that incorporates a dummy variable (encoded as 0 or 1) to differentiate between population groups. In our experiments we used gender as a dummy variable, but the method is general and could be used to investigate a wide range of other factors. Mathematically, the linear regression model can be formulated as follows:, where are the latent space activations, the dummy variable, the ground truth label (i.e. SBP) and the regression coefficients. The regression loss is the mean squared error: .
Joint learning for regression: We denote the input data by
(i.e. a vector ofclinical biomarkers) and its corresponding latent space representation as .The decoded clinical biomarkers are denoted by , and the predicted label by
We combine the VAE loss and the regression loss by minimising the following joint loss function:
where and control the weights of the components of the loss function,
is the Kullback-Leibler divergence between the learnt latent distribution and a unit Gaussian, andis the Huber loss for the regression task. We first train the model only using the VAE loss, i.e. , and secondly train both the VAE and the regression together using . We set throughout.
4 Experiments and Results
Using our selection criteria, 3,781 subjects were included in our experiments. Of these, 54 subjects were rejected during QC1, and a further 127 during QC2. The remaining 3,600 cases were used to build the models. Of these cases, 1,321 had normotension (SBP 120 mmHg), 1,697 hypertension (SBP 140 mmHg) and 582 cases had prehypertension (a SBP between 120 and 140 mmHg).
Experiment 1 - Comparative evaluation on regression task:
We compared our proposed R-VAE model with two state-of-the-art techniques for multivariate regression: (1) Lasso regression, a linear regression model with aregularizer 
; and (2) Random Forest regression
, an ensemble method that has shown excellent performance in complex regression and classification tasks. For all methods, we used a five-fold cross validation to obtain the optimal model. We split the dataset into training, validation and test (60/20/20), and optimized the hyperparameters using a grid search strategy. We calculated the root-mean-square deviation (RMSD), the normalized root-mean-square deviation (nRMSD), and the coefficient of determination () between the ground truth SBP and the predicted SBP. Table 1 shows these results. It can be seen that all methods performed similarly with regard to regression. However, note that the next two experiments are only made possible by our use of the R-VAE architecture and would not be possible with the other regression techniques.
|Methods||RMSD (mmHg)||nRMSD (mmHg)|
|Random Forest ||12.98||0.12||0.33|
Experiment 2 - Investigating the effect of SBP on cardiac function:
We applied our R-VAE network to investigate how cardiac function changes with increasing SBP in healthy individuals, stratified by gender. We applied our R-VAE network to investigate how cardiac function changes with increasing SBP in healthy individuals, stratified by gender. We sampled the latent space of the model along the regression line for different SBP values (between 100 to 170 mmHg in steps of 10 mmHg). At each step, we took 20 random samples from a normal distribution in a perpendicular direction to the regression line and used the R-VAE to decode the clinical biomarkers. Figure2
shows the means and standard deviations for a selection of the decoded biomarkers stratified by gender. The results show that iLVEDV (indexed LVEDV) decreases with increasing SBP, while iRVEDV (indexed RVEDV) remains constant. LVPAFR and LVPER increase with increasing SBP. For both iLVEDV and LVPER, the change seems to be larger in males compared to females. Overall, the observed changes in the models’ predictions suggest that parameters associated with diastolic function of the LV are mostly affected by SBP, while the RV was less affected. These results suggest that stiffening of the LV myocardium could be an important disease process in deterioration of cardiac function in the light of increased SBP.
Experiment 3 - Identifying abnormal response:
In the normal population, some individuals with prehypertension might be predisposed to increased risk of cardiac disease, while others are not. We used our R-VAE model to identify subjects from the prehypertension group in whom predicted SBP was lower (i.e. predicted normotension) or higher (i.e. predicted hypertension) based on the latent space regression, assuming that being wrongly classified as normotensive or hypertensive identifies individuals with low versus increased risk of developing cardiac disease. Subsequently, we decoded the cardiac biomarkers for these cases using latent features at the regression line of their true SBP as well as using latent features at their predicted SBP. We calculated the percentage difference for each biomarker for the cases of under- and over-prediction respectively and investigated which factors contributed most to the lower or higher prediction in these subjects. Figure3 shows the mean difference of selected biomarkers that lead to classification as normotensive (dark) and hypertensive (light). Biomarkers related to LV diastolic function (blue bars) showed the largest changes with regard to the under- and overprediction. These results show again that diastolic function of the LV was a major contributor to the model predictions of SBP. Moreover, they suggest that biomarkers related to LV diastolic function might be effective when trying to stratify cardiac risk, in particular in subjects with ‘prehypertension’.
In this paper, we have presented an automated DL method for analysing cardiac function and predicting cardiac risk profiles from CMR. Our framework encompasses all steps from CMR segmentation through quality control to modelling of the impact of SBP, a known cardiovascular risk factor, on cardiac function.
Instead of inputting CMR images directly into the R-VAE, we chose to first automatically estimate clinical biomarkers from the images. Combined, these biomarkers give a comprehensive description of cardiac function and are well-understood by clinicians. While some of the information of the high dimensional image data is inevitably lost by this approach, it allows the model to be interpretable by clinicians directly.
The combination of the VAE with the regression loss allowed us to decode the clinical biomarkers from the latent space, while also providing a mapping of these biomarkers to another variable, SBP. As we show in Experiment 2, this design enabled us to get a clear description of the changes in cardiac function that occur with increasing SBP. Using the trained model, we showed that increases in SBP are mainly linked to changes in diastolic LV function. This suggests that SBP results in slowly progressive changes in the myocardium that increase ventricular stiffening. As shown in Experiment 3, the model also allowed us to identify key factors separating high and low risk subjects. Again, due to the interpretability of the framework, this allowed us to identify biomarkers that could be further investigated for their utility in screening patients in clinical practice. SBP is not the only factor influencing cardiac function and that explains the relatively low of the regression models. In this paper, we used SBP as an example to illustrate the potential power of our proposed method. We aim to further extend our model in the following ways: we plan to include image and segmentation data directly into the model, in combination with the clinical biomarkers to maintain interpretability; we also plan to extend the model to investigate more risk factors. In conclusion, this work represents a novel use of DL which has produced an important contribution to furthering our understanding of the influences on cardiac function.
This work was supported by the EPSRC (grants EP/R005516/1 and EP/P001009/1) and the Wellcome EPSRC Centre for Medical Engineering at the School of Biomedical Engineering and Imaging Sciences, King’s College London (WT 203148/Z/16/Z). This research has been conducted using the UK Biobank Resource under Application Number 17806.
-  (2018) Automated cardiovascular magnetic resonance image analysis with fully convolutional networks. Journal of Cardiovascular Magnetic Resonance 20 (1), pp. 65. Cited by: §3.1.
-  (2014) Impact of prehypertension on left ventricular structure, function and geometry. Journal of Clinical and Diagnostic Research: JCDR 8 (4), pp. BC07. Cited by: §1.
-  (2019) Deep-learning cardiac motion analysis for human survival prediction. Nature machine intelligence 1 (2), pp. 95. Cited by: §1.
-  (2019) Explainable shape analysis through deep hierarchical generative models: application to cardiac remodeling. arXiv preprint arXiv:1907.00058. Cited by: §1.
-  (2018) Learning interpretable anatomical features through deep generative models: application to cardiac remodeling. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 464–471. Cited by: §1.
-  (2018) Machine load estimation via stacked autoencoder regression. In 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 2126–2130. Cited by: §1.
-  (2016) Associations with intraocular pressure in a large cohort: results from the uk biobank. Ophthalmology 123 (4), pp. 771–782. Cited by: §2.
-  (2019) Global and local interpretability for cardiac mri classification. arXiv preprint arXiv:1906.06188. Cited by: §1.
-  (1989) A formula to estimate the approximate surface area if height and weight be known. Nutrition 5, pp. 303–313. Cited by: §3.1.
-  (2006) Random forests and adaptive nearest neighbors. Journal of the American Statistical Association 101 (474), pp. 578–590. Cited by: Table 1, §4.
-  (1995) The bergen blood pressure study: prehypertensive changes in cardiac structure and function in offspring of hypertensive families. Blood pressure 4 (1), pp. 16–22. Cited by: §1.
-  (2019) Fully automated, quality-controlled cardiac analysis from cmr: validation and large-scale application to characterize cardiac function. JACC: Cardiovascular Imaging. Cited by: §3.1.
-  (2017) Fully automated segmentation-based respiratory motion correction of multiplanar cardiac magnetic resonance images for large-scale datasets. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 332–340. Cited by: §3.1.
-  (2002) How should meta-regression analyses be undertaken and interpreted?. Statistics in medicine 21 (11), pp. 1559–1573. Cited by: §3.2.
-  (1996) Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58 (1), pp. 267–288. Cited by: Table 1, §4.
-  (2018) Autoencoder-based deep belief regression network for air particulate matter concentration forecasting. Journal of Intelligent & Fuzzy Systems 34 (6), pp. 3475–3486. Cited by: §1.
-  (2017) A deep auto-encoder model for gene expression prediction. BMC genomics 18 (9), pp. 845. Cited by: §1.