Radiological images and machine learning: trends, perspectives, and prospects

03/27/2019 ∙ by Zhenwei Zhang, et al. ∙ IEEE 0

The application of machine learning to radiological images is an increasingly active research area that is expected to grow in the next five to ten years. Recent advances in machine learning make it possible to recognize and classify complex patterns from different radiological imaging modalities such as x-rays, computed tomography, magnetic resonance imaging and positron emission tomography imaging. In many applications, machine learning based systems have shown comparable performance to human decision-making. The applications of machine learning are the key ingredients of future clinical decision making and monitoring systems. This review covers the fundamental concepts behind various machine learning techniques and their applications in several radiological imaging areas, such as medical image segmentation, brain function studies and neurological disease diagnosis, as well as computer-aided systems, image registration, and content-based image retrieval systems. We also briefly discuss current challenges and future directions regarding the application of machine learning in radiological imaging. By giving insight into how to take advantage of machine learning powered applications, we expect that clinicians can prevent and diagnose diseases more accurately and efficiently.




1 Introduction

Radiology is a branch of medicine that uses imaging techniques to detect, diagnose and treat diseases [1, 2, 3]. Diagnostic radiology helps radiologists image internal body structures to diagnose the cause of symptoms, screen for illnesses and detect the body’s response to treatments. The most common radiology modalities include plain X-ray, computed tomography (CT), magnetic resonance imaging (MRI), positron emission tomography (PET), and ultrasound imaging. Fig. 1 shows internal body structures viewed via these different imaging techniques, and Fig. 2 illustrates an example of CT and PET images. In MRI images, the white areas represent subcutaneous fat, while in the CT images, the white areas represent the skull. The main disadvantage of all x-ray and gamma ray imaging modalities is the risk of radiation exposure for patients [4, 5, 6, 7, 8]. Ultrasound imaging is convenient because it does not expose patients or radiologists to radiation, but it has poor penetration through bone or air, which makes images difficult to interpret [9, 10]. MRI and CT images can capture anatomical changes in tissues, while PET images detect biochemical and physiological changes, which often occur before anatomical changes [11]. A drawback of MRI is that patients with ferromagnetic orthopedic implants, materials, and devices cannot undergo the procedure. MRI also has relatively long scanning times, which imposes limitations for patients in need of urgent care [12, 13]. The broader use of radiological image analysis increases the workload for radiologists; the development of intelligent computer-aided systems for automated image analysis that can achieve faster and more accurate results for large volumes of imaging data is therefore essential.

This paper provides an overview of machine learning techniques used in radiological image analysis. We begin with a brief overview of current imaging technologies. In section 2, we review general concepts of machine learning and detail the methods most commonly used in recent years. In section 3, we provide an overview of the most current studies dealing with machine learning and radiological images. This review paper mainly focuses on the most recent contributions to different machine learning techniques (i.e., after 2014), and the reader should refer to previous review papers for older contributions related to machine learning and biomedical imaging [14, 15, 16, 17, 18], or contributions that focus solely on a single machine learning approach (e.g., deep learning [19, 20]). Lastly, we summarize these contributions by outlining current technological limitations and potential future areas of research in this field.

Contributions cited in this review were collected using various research databases such as Google Scholar, PubMed (MEDLINE), IEEE Xplore and SpringerLink. All contributions collected were published between the middle of 2014 and the middle of 2017. We used variations of keywords, including but not limited to combinations of machine learning techniques (SVM, random forest, regression, neural networks, deep learning), applications (segmentation, computer assisted system, brain studies) and imaging modalities (MRI, x-rays, ultrasound, CT). While deep learning techniques have been prevalent in the past five years, our search included traditional methods as well as these hot topics. In this review, preference was given to papers that presented real data rather than theoretical frameworks. Similarly, we did not include papers that repeated past experiments unless the data collection or data analysis procedures were different.

Figure 1: An example of CT (a), MRI (b) and ultrasound (c) images displaying brain structures. Soft tissue has a better resolution in MRI images. Each type of MRI sequence displays a different brightness for the same structures [21]. Ultrasound is more convenient than CT and MRI; however, it captures brain structures poorly because ultrasound waves do not transmit well through bone [22].

2 Machine Learning in Radiology

In recent years, machine learning algorithms have become useful tools for the analysis of medical images in many radiology applications [15, 23]. For example, machine learning algorithms can extract the useful information found within the details of medical images [24]. Thus, computer-aided systems based on machine learning help radiologists to make informed decisions while interpreting these images [15].

Figure 2: (A) axial views of a CT scan, (B) coronal PET. CT images show better resolution than PET images. However, each type of image can provide useful information about diseases. In this case, the coronal PET images show multiple foci of intense FDG uptake in the pelvic area while the CT images do not demonstrate any abnormalities [25].

2.1 Types of Learning

Depending on how labels are used in the training data, machine learning algorithms fall into three categories: supervised learning, unsupervised learning, and semi-supervised learning. Supervised learning is the most common form of machine learning, and researchers widely use it for classification and regression [26]. Data is usually collected and labeled in categories, as the purpose of supervised learning is to find an appropriate input-output function from training data that generalizes well to the testing data. We can compute an objective function to measure the error between the desired pattern and the output score. In general, many scientific contributions focus on finding a suitable objective function with adjustable parameters. In contrast, in cases where labeled data sets are relatively rare or difficult to acquire, unsupervised learning can draw inferences from data without corresponding label information; the purpose of unsupervised learning is to discover the hidden structure or distribution of data [27]. Unsupervised learning approaches include clustering and blind signal separation techniques such as principal component analysis and independent component analysis. Lastly, semi-supervised learning lies between supervised learning and unsupervised learning [28, 29]. During the training phase, semi-supervised learning begins with a small set of labeled data and augments the training data size by gradually labeling unlabeled data.
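The gradual labeling used in semi-supervised learning can be sketched as a self-training loop. This is only an illustration and not a method taken from the cited works: it assumes one-dimensional features, a nearest-centroid classifier, and distance to the nearest centroid as the confidence measure. The names `fit_centroids` and `self_train` are hypothetical.

```python
def fit_centroids(samples):
    """samples: list of (value, label). Returns {label: mean value}."""
    sums, counts = {}, {}
    for value, label in samples:
        sums[label] = sums.get(label, 0.0) + value
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def self_train(labeled, unlabeled, batch=1):
    """Repeatedly pseudo-label the most confident unlabeled points."""
    labeled = list(labeled)
    pool = list(unlabeled)
    while pool:
        centroids = fit_centroids(labeled)

        def nearest(value):
            # Returns (distance, label) of the closest class centroid.
            label = min(centroids, key=lambda l: abs(value - centroids[l]))
            return abs(value - centroids[label]), label

        # Assumption: a smaller distance means a more confident prediction.
        pool.sort(key=lambda value: nearest(value)[0])
        for value in pool[:batch]:
            labeled.append((value, nearest(value)[1]))
        pool = pool[batch:]
    return fit_centroids(labeled), labeled


# Toy data: two labeled points and six unlabeled ones.
centroids, augmented = self_train(
    labeled=[(0.0, "a"), (10.0, "b")],
    unlabeled=[1.0, 2.0, 3.0, 9.0, 8.0, 7.0],
)
```

In practice the confidence measure and the base classifier are design choices; a poor choice can reinforce early labeling mistakes.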

2.2 Feature Selection

Feature extraction and representation is a crucial step in medical image processing. With the development of modern medical techniques, higher resolution and more features have become available to feed the classifiers; however, high-dimensional features make it harder for machine learning techniques to reach an optimal solution. Significant interest exists in extracting and identifying reliable features from radiological images to improve classification performance [30, 31]. Several methods exist for the extraction of features from medical images, including region-based, shape-based, texture-based, and bag-of-words features [32, 33, 34, 35, 36, 37, 38]. The performance of most image retrieval systems is dependent on the use of these features. Table 1 summarizes image features used in radiological image analysis. Color features, including RGB values, histograms [39], color moments, and color coherence vectors, are among the essential features of images. Texture features can be calculated from groups of pixels and help characterize a wide range of images. The Gabor filter is the most common method for texture extraction. The scale invariant feature transform and the speeded-up robust features algorithm are two popular scale- and rotation-invariant feature detectors and descriptors in computer vision [41]. Different types of images have significant contrast variation, so visual features such as color, shape and texture are not enough to easily classify images. High-level features are therefore useful to overcome the intensity variations in different types of images and to extract the appropriate information from them. Selecting ideal features that reflect the most useful contents of images remains a challenging problem in machine learning.
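To make the idea of a texture feature concrete, the following sketch builds the real part of a 2D Gabor kernel (a Gaussian envelope multiplied by a cosine carrier) and uses its response on an image patch as a single feature value. The function names and parameter values are illustrative assumptions, not settings from the cited studies.

```python
import math


def gabor_kernel(size, wavelength, theta, sigma, gamma=0.5, psi=0.0):
    """Real part of a 2D Gabor filter as a size x size nested list (size odd)."""
    if size % 2 == 0:
        raise ValueError("size must be odd so the kernel has a center pixel")
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate the coordinates by the filter orientation theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2.0 * sigma ** 2))
            carrier = math.cos(2.0 * math.pi * xr / wavelength + psi)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel


def filter_response(patch, kernel):
    """Texture feature: sum of element-wise products of patch and kernel."""
    return sum(p * k for prow, krow in zip(patch, kernel) for p, k in zip(prow, krow))


# A kernel tuned to intensity variation along x with a period of 4 pixels.
kernel = gabor_kernel(size=5, wavelength=4.0, theta=0.0, sigma=2.0)

# Synthetic 5x5 patches: one varies along x (matches the kernel), one along y.
varies_along_x = [[math.cos(2 * math.pi * x / 4) for x in range(-2, 3)] for _ in range(5)]
varies_along_y = [[math.cos(2 * math.pi * y / 4)] * 5 for y in range(-2, 3)]

matched = filter_response(varies_along_x, kernel)
mismatched = filter_response(varies_along_y, kernel)
```

A bank of such kernels at several orientations and wavelengths yields a feature vector per patch; the matched pattern produces a much larger response than the mismatched one.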

Features | Description | Examples
Color | Invariant from different size and direction | Histogram [42, 43, 44]
Shape | Binary representation of images | Sphericity [44, 45]
Texture | Description of image structure, randomness, linearity, roughness, granulation, and homogeneity | Haralick’s features [46, 45]; Gabor features [47, 48, 49]; Co-occurrence [50]; Curvelet-based [51, 52]; Wavelet-based [53, 54]
Local | Description of local image information using region, object of interest, corners, or edges | Local binary pattern [44]; Scale invariant feature transform [55, 56, 57]; Speeded-up robust features [58, 57]
Other | Other methods to extract image features | CNN [59]
Table 1: A summary of image features used in ML systems

2.3 Overview of Machine Learning Methods

Machine learning has been developing rapidly in recent years, and it is impossible to cover all recently-developed techniques in one section. In this section, we review the machine learning methods most commonly used in radiology: linear models, the support vector machine, decision tree learning, ensemble classifiers, and neural networks and deep learning. This section provides a general description of these techniques to help readers understand their applications in the field of radiology, as described in subsequent sections.

Figure 3: Basic idea of linear classification and non-linear classification: (a) linear case, (b) non-linear case. The linear model uses linear functions to separate the data and is therefore not suitable for non-linear cases. SVM is one way to separate non-linearly separable data using different kernel functions.

2.3.1 Linear Models for Regression and Classification

Regression predicts a value from the given input features, whereas classification assigns the input to one of the predefined classes [60]. The simplest linear models establish a linear relationship among input variables. Commonly used linear models include linear regression, Fisher’s linear discriminant (LDA), and logistic regression. Given $\mathbf{x}$, the input feature vector, a linear model computes the output $y$ as a weighted sum of the input features. Logistic regression is the most basic classifier: it predicts the probability that an input $\mathbf{x}$ belongs to a class (class 1), versus the probability that it belongs to another class (class 0). The basic idea of logistic regression is that we learn the logistic function of the form:

$$P(y = 1 \mid \mathbf{x}) = \frac{1}{1 + e^{-\mathbf{w}^{\top}\mathbf{x}}}$$

where $\mathbf{x}$ is the input vector and $\mathbf{w}$ is a weight vector for the input. The logistic function is a continuous function that maps any input from negative infinity to positive infinity to an output that is always between zero and one [61]. Fig. 3 illustrates linearly and non-linearly separable cases for a dataset.
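As a minimal sketch of logistic regression, the code below evaluates the logistic function on a weighted sum of the inputs and fits the weights by batch gradient descent on the log-loss. The training procedure, learning rate, and toy data are assumptions made for illustration; the text above only specifies the form of the logistic function.

```python
import math


def sigmoid(z):
    """Logistic function: maps any real z into the interval (0, 1)."""
    # Branch on the sign so math.exp never overflows for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(w, x):
    """Probability that input x belongs to class 1 under weights w."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)))


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit the weight vector by batch gradient descent on the log-loss."""
    w = [0.0] * len(X[0])
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi
            for j, xij in enumerate(xi):
                grad[j] += err * xij
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w


# Toy data: the constant first feature 1.0 plays the role of a bias term.
X = [[1.0, 0.5], [1.0, 1.0], [1.0, 3.0], [1.0, 3.5]]
y = [0, 0, 1, 1]
w = fit_logistic(X, y)
p_low = predict_proba(w, [1.0, 0.5])   # expected to fall below 0.5
p_high = predict_proba(w, [1.0, 3.5])  # expected to rise above 0.5
```

Thresholding the predicted probability at 0.5 turns the model into a linear classifier, since the decision boundary is the set of inputs where the weighted sum equals zero.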

2.3.2 Support Vector Machine

Support vector machines (SVM) are kernel-based supervised learning techniques widely used for classification and regression [60, 62]. The basic idea of SVM is to find an optimal hyperplane for linearly separable patterns. It attempts to maximize the geometric margin on the training set and minimize the training error. For non-linearly separable cases, a kernel function maps the original data into a new space in which the two classes can be separated by a hyperplane.

Let $\mathbf{x}_i$, $i = 1, \dots, N$, be the feature vectors of the training set and $y_i \in \{-1, +1\}$ the corresponding class indicators. The goal of SVM is to construct a classifier in the form of:

$$f(\mathbf{x}) = \operatorname{sign}\left(\sum_{i=1}^{N} \alpha_i y_i K(\mathbf{x}_i, \mathbf{x}) + b\right)$$

where $\alpha_i$ are the learned coefficients and $b$ is a bias term. The function $K(\cdot, \cdot)$ is called the kernel function, and the mathematical properties of different kernels enable many pattern recognition and regression models. SVM with a linear kernel is computationally faster than SVM with quadratic kernel functions. SVM models using fewer but more significant features are likely to be more robust and less prone to overfitting.
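The role of the kernel function can be illustrated with a sketch of the decision rule only, i.e. the sign of a kernel-weighted sum over support vectors plus a bias. The coefficients below are hand-picked for an XOR-like toy problem rather than obtained by training, and the function names are hypothetical.

```python
import math


def linear_kernel(a, b):
    """Inner product of two feature vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))


def rbf_kernel(a, b, gamma=2.0):
    """Gaussian (RBF) kernel: similarity decays with squared distance."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-gamma * sq_dist)


def svm_decision(x, support_vectors, labels, alphas, bias, kernel):
    """Sign of sum_i alpha_i * y_i * K(x_i, x) + bias."""
    score = sum(a * y * kernel(sv, x)
                for sv, y, a in zip(support_vectors, labels, alphas))
    score += bias
    return 1 if score >= 0 else -1


# XOR-like layout: no straight line separates the two classes.
support_vectors = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0), (1.0, 0.0)]
labels = [-1, -1, 1, 1]
alphas = [1.0, 1.0, 1.0, 1.0]  # hand-picked, NOT the result of training
bias = 0.0

pred_a = svm_decision((0.0, 1.0), support_vectors, labels, alphas, bias, rbf_kernel)
pred_b = svm_decision((0.0, 0.0), support_vectors, labels, alphas, bias, rbf_kernel)
```

With the RBF kernel the four points are classified correctly, whereas the same coefficients with `linear_kernel` cannot separate them; this is the sense in which the kernel handles the non-linear case.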


Figure 4: A medical example of a decision tree. In this example, patients are classified into two classes: high risk and low risk. The features include blood pressure, age, etc. In this case, the classification tree operates similarly to a clinician’s examination process.

2.3.3 Decision Tree Learning

Decision trees are one of the most popular classification approaches in machine learning [64]. A decision tree consists of a “root”, “leaves”, and internal nodes [65, 66, 67]. The internal nodes use certain features to split the instance space into two or more subspaces. Each leaf represents one class. The leaf may represent the most appropriate target value or indicate the probability of the target having a specific value. Fig. 4 is an example of the decision tree model. Decision trees are capable of handling datasets that may have missing values and errors; however, this method may overfit training data and add unnecessary features. In radiological image analysis, decision trees are usually combined into random forests for prediction and classification.
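The way an internal node chooses its split can be sketched as an exhaustive search over features and thresholds. The sketch assumes Gini impurity as the split criterion, which is a common choice but is not specified in the text above; the patient values (systolic blood pressure, age) are invented for illustration.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 means pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_split(rows, labels):
    """Return (feature_index, threshold, weighted_gini) of the best split.

    A row goes to the left child when row[feature_index] <= threshold.
    """
    best = None
    n = len(rows)
    for j in range(len(rows[0])):
        for t in sorted({row[j] for row in rows}):
            left = [l for row, l in zip(rows, labels) if row[j] <= t]
            right = [l for row, l in zip(rows, labels) if row[j] > t]
            if not left or not right:
                continue  # a split must send samples to both children
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (j, t, score)
    return best


# Hypothetical patients: [systolic blood pressure, age].
rows = [[120, 40], [130, 65], [150, 50], [160, 70]]
labels = ["low", "low", "high", "high"]
split = best_split(rows, labels)
```

Applying this search recursively to each child until the leaves are pure (or a depth limit is reached) grows a full tree; the unrestricted version is what tends to overfit.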

2.3.4 Ensemble Learning

Ensemble learning combines multiple classifiers and applies voting algorithms to achieve a final classification. Popular ensemble approaches include boosting and bagging [68]. Fig. 5 shows the basic idea of ensemble learning. In boosting, extra weight is assigned to incorrectly predicted points, and a set of weak classifiers is applied to the data in the training phase; the outputs of the weak classifiers and the weighted inputs are used to calculate the final prediction. In bagging, each sub-classifier is independently constructed using a bootstrap sample of the data set, and a majority voting method is applied for the final prediction [69]. Random forests are an ensemble learning method that consists of a multitude of decision trees. In standard tree construction, each node is split using the best split among all features. In a random forest, each node is split using the best split among a random subset of the features. The random forest is one of the most powerful machine learning predictors used in detection, classification, and segmentation [70], particularly for brain [71, 72] and heart [73, 74] images.
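Bagging can be sketched in a few lines: each sub-classifier is trained on a bootstrap sample and the final prediction is a majority vote. The sketch assumes binary labels 0/1 and uses a one-feature threshold rule (a decision stump) as the weak learner; these choices and the function names are illustrative only.

```python
import random
from collections import Counter


def train_stump(X, y):
    """Weak learner: the single-feature threshold rule with fewest errors."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for lo, hi in ((0, 1), (1, 0)):
                errors = sum((lo if row[j] <= t else hi) != label
                             for row, label in zip(X, y))
                if best is None or errors < best[0]:
                    best = (errors, j, t, lo, hi)
    _, j, t, lo, hi = best
    return lambda row: lo if row[j] <= t else hi


def bagging_fit(X, y, n_models=25, seed=0):
    """Train each stump on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    n = len(X)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(train_stump([X[i] for i in idx], [y[i] for i in idx]))
    return models


def bagging_predict(models, row):
    """Majority vote over the sub-classifiers."""
    votes = Counter(model(row) for model in models)
    return votes.most_common(1)[0][0]


X = [[1], [2], [3], [7], [8], [9]]
y = [0, 0, 0, 1, 1, 1]
models = bagging_fit(X, y)
low_pred = bagging_predict(models, [0])
high_pred = bagging_predict(models, [10])
```

A random forest follows the same pattern with decision trees as sub-classifiers and an additional random choice of candidate features at every node.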

Figure 5: The concept of ensemble learning: an ensemble classifier is made up of several sub-classifiers, and the final output combines the outputs of these sub-classifiers according to their weights.

2.3.5 Neural Networks and Deep Learning

Deep learning techniques have become a hot topic in machine learning due to the availability of sufficient computational power and a high volume of data. These approaches can learn features directly from the data for classification and detection purposes [75, 76]. Deep learning avoids hand-designing specific features from the data, which is its main advantage in comparison with other machine learning methods. Some outstanding frameworks such as the restricted Boltzmann machine, convolutional neural networks (CNNs), and sparse autoencoders have proven to be useful tools in many applications such as Alzheimer’s disease diagnosis [79], segmentation [80], and tissue classification [81]. CNNs have a large number of parameters and therefore require vast volumes of labeled training data. This requirement makes the training of CNNs on medical images challenging due to the difficulty of acquiring a database with labeled data [82]. Nevertheless, several studies use CNNs to extract features from medical images and achieve good performance in classification [83, 84].

2.4 Evaluating Machine Learning Techniques

Physicians may rely on the prediction or classification results of machine learning algorithms. However, performing one round of training and testing on data sets may not yield a meaningful idea of the accuracy of an algorithm. Cross-validation reduces the variance of accuracy scores by ensuring that each data instance is used for both training and testing an equal number of times. The cross-validation method randomly splits the data into $k$ subsets and holds out each one in turn while training on the rest.
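The splitting procedure can be sketched as a generator of train/test index lists. The number of folds is called `k` here, and the function name and seed handling are assumptions made for the example.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_indices, test_indices) for each of k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # random assignment to folds
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


splits = list(k_fold_indices(n_samples=10, k=5))
```

Each sample appears in exactly one test fold and in the training set of the remaining folds; the accuracy reported is usually the mean over the `k` rounds.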

The Dice similarity coefficient is used in segmentation, and it measures the spatial overlap between two segmented target regions [85]. If A and B are target regions or volumes, the Dice similarity coefficient is defined as the ratio of the size of their intersection to their average size [86]:

$$DSC(A, B) = \frac{2\,|A \cap B|}{|A| + |B|}$$

The Dice similarity coefficient has a value of 0 for no overlap and 1 when complete agreement is present. Fig. 6 illustrates the Dice similarity coefficient with different overlaps.
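As a small worked example, the Dice similarity coefficient can be computed for two binary masks as twice the number of shared foreground pixels divided by the total number of foreground pixels in both masks. Returning 1.0 when both masks are empty is a convention assumed here, not one stated in the text.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice similarity of two binary masks given as nested lists."""
    flat_a = [bool(v) for row in mask_a for v in row]
    flat_b = [bool(v) for row in mask_b for v in row]
    if len(flat_a) != len(flat_b):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(a and b for a, b in zip(flat_a, flat_b))
    total = sum(flat_a) + sum(flat_b)
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree completely
    return 2.0 * intersection / total


segmentation = [[1, 1, 0],
                [0, 1, 0]]
ground_truth = [[1, 0, 0],
                [0, 1, 1]]
score = dice_coefficient(segmentation, ground_truth)  # 2 * 2 / (3 + 3)
```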

In clinical practice, subjects with a disease are labeled as positive and healthy subjects are labeled as negative. True positive (TP), false positive (FP), false negative (FN) and true negative (TN) are defined as follows:

TP: a test detects the disease when the disease is present

TN: a test does not detect the disease when the disease is absent

FP: a test detects the disease when the disease is absent

FN: a test does not detect the disease when the disease is present

The goal of a computer-aided diagnosis system is to detect as many true positives as possible while minimizing false positives and false negatives. There are several popular metrics used to assess classifier outcomes. Sensitivity is the ability of a test to correctly detect patients with a disease, while specificity is the ability of a test to correctly identify healthy subjects. They can be written as:

$$\text{Sensitivity} = \frac{TP}{TP + FN}, \qquad \text{Specificity} = \frac{TN}{TN + FP}$$
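As a worked example, sensitivity is the fraction of diseased subjects that the test detects, TP / (TP + FN), and specificity is the fraction of healthy subjects that the test clears, TN / (TN + FP). The sketch below counts the four outcomes from labels (1 = disease present, 0 = absent) and computes both metrics; the data are invented for illustration.

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, FP, FN, TN) for binary labels, 1 = disease present."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def sensitivity(tp, fn):
    """Fraction of diseased subjects that are detected."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of healthy subjects that are correctly cleared."""
    return tn / (tn + fp)


y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
tp, fp, fn, tn = confusion_counts(y_true, y_pred)
sens = sensitivity(tp, fn)  # 3 / 4
spec = specificity(tn, fp)  # 4 / 6
```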

Figure 6: The Dice similarity coefficient represents spatial overlap.
Figure 7: An ROC curve consists of the points obtained by evaluating the model at many different classification thresholds. AUC is the area beneath the ROC curve; it summarizes the curve in a single number, which makes comparing models more convenient than comparing full ROC curves.

Other popular methods used to assess models include the area under the receiver operating characteristic (ROC) curve and the top precision value. ROC curves describe the trade-off between sensitivity and specificity as the classification threshold varies. The area under the curve (AUC) measures the entire area under the ROC curve from (0,0) to (1,1) and represents the probability that the model ranks a randomly chosen positive case above a randomly chosen negative one. Figure 7 illustrates the relation between ROC and AUC. The top precision, the proportion of relevant images ranked before the first irrelevant database image [38, 87], is a popular evaluation metric in retrieval systems.
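The construction of an ROC curve and its AUC can be sketched as follows. Each distinct score is used as a threshold, giving one point with the false positive rate (1 minus specificity) on the horizontal axis and the true positive rate (sensitivity) on the vertical axis; the area is then accumulated with the trapezoidal rule. The function names and the four-sample data are illustrative assumptions.

```python
def roc_points(y_true, scores):
    """(FPR, TPR) points obtained by sweeping the decision threshold.

    Assumes binary labels (1 = positive) with both classes present.
    """
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 1)
        fp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under a piecewise-linear curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
curve = roc_points(y_true, scores)
area = auc(curve)
```

For this data, three of the four positive-negative pairs are ranked correctly, which matches the resulting area of 0.75.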

3 Application of Machine Learning in Radiology

3.1 Segmentation

Image segmentation is a necessary step in effective disease diagnosis and treatment in radiology imaging research. It helps clinicians to understand structural information and spatial anatomic relationships; however, manual segmentation depends on the experience of clinicians and is very time-consuming [24]. Automatic classification methods are essential for improving diagnostic analysis and for the reproducibility of large-scale clinical studies.

3.1.1 Brain segmentation

Tree-based methods are currently a hot topic in the brain segmentation field. For example, Yoo et al. segmented multiple sclerosis lesions in multi-3D MR images from unsupervised features [88]. Features were extracted from T2-weighted and proton density MR images using a deep belief network, and a random forest was built for the final supervised classification. In order to improve model performance on noisy training data and robustness against overfitting, Maier et al. proposed an extra tree forest to locate, segment and quantify sub-acute ischemic stroke lesions. They used voxel-wise local features such as intensity, weighted local mean, local histogram and 2D center distance. However, their method can only deal with T1-weighted and diffusion-weighted data sequences and high-quality images. Multimodal data from the same patient can provide extra useful information for diagnosis. Therefore, Mitra et al. proposed to use features from multimodal data to segment ischemic lesions, white matter and other secondary lesions. In their study, the algorithm combined expectation maximization likelihood estimation and a Bayesian-Markov random field to segment the probable lesion areas from FLAIR data, and then applied a random forest to the multimodal data.

Neural networks and deep learning techniques are powerful tools in brain segmentation tasks. Si et al. proposed a semi-automatic method to classify the pixels of brain MRI into lesioned and healthy tissues using an artificial neural network with gray levels and statistical features as inputs [91]. The segmentation of early-brain tissues is more difficult than that of adult brains due to the lower tissue contrast [92], while multiple image modalities contain complementary information that can compensate for insufficient tissue contrast [93]. Zhang et al. [94] showed that fractional anisotropy images are more effective at distinguishing gray matter from white matter, and that T2-weighted images perform better at capturing cerebrospinal fluid. They proposed a CNN method combining these multiple modality image data to improve segmentation performance. Similarly, Kleesiek segmented brain and non-brain tissues by feeding data into a neural network with seven hidden convolutional layers [95]. Their model can be applied to any single image modality or to a combination of several modalities of varying size. Deep learning methods can also automatically segment MRI images of the human brain into many anatomical regions [96, 97]. As shown in Fig. 8, Chen et al. extended ResNet to volumetric brain anatomical segmentation [98]. They integrated low-level image appearance features, implicit shape information and high-level context to further improve the volumetric segmentation performance [98].

3.1.2 Other segmentation applications

Segmentation is also applied to identify and detect other structures [99], such as organs, bones, muscles, and fractures. As in brain segmentation, tree-based methods are popular in other types of segmentation tasks. Lombaert et al. presented kidney segmentation using the Laplacian Forest [100]. They used the intensity within a randomly-shaped cuboid centered around several pixels during training. The idea of the Laplacian Forest is to use a guided bagging strategy to produce more related image information for the tree models, which substantially improves model accuracy. Conze proposed a semi-automatic liver tumor segmentation method combining a simple linear iterative clustering superpixel algorithm and a random forest, which considers the inter-dependencies among voxels [101]. The multi-phase cluster-wise features used in their approach take spatial consistency into account and are more robust for a random forest. The analysis of the knee also plays a vital role in clinical assessment and surgical planning. The cartilage is typically small, and Haar-like operators are often unreliable for extracting context features, which degrades the segmentation results. To overcome these limitations, Liu proposed a novel method using a multi-atlas context forest, which segments bones first and then cartilage [102]. They trained classifiers using appearance features and context features to align the expert segmentation of the atlases in each iteration.

Medical segmentation research also utilizes regression-based models. Chen et al. proposed an automated method to localize and segment intervertebral discs from MRI [103]. They used unified regression and classification frameworks to estimate displacements for image points from the visual features around them and achieved satisfactory results. Ventricle structure segmentation in MRI is an essential task for investigating most cardiac disorders. The primary challenge of this task is the considerable shape variation among different patients [104]. A segmentation method using cascade shape regression has been proposed for the right ventricle in cardiac MRI. The authors applied gradient boosted regression to regress multidimensional right ventricle shape landmarks from image appearance, which takes correlations between landmarks into account. Their method minimizes the shape alignment error over the training data and shows better segmentation performance than multi-atlas-label-fusion based segmentation methods.

Other traditional supervised methods applied in segmentation tasks include dictionary learning and Bayes classification. Tong et al. proposed the extraction of voxel-intensity features for multi-organ segmentation (liver, kidneys, pancreas, and spleen) using dictionary learning and a sparse coding technique (Fig. 9) [105]. The choice of atlases against which the images are segmented profoundly influences the performance of multi-atlas-based methods [106]. To deal with the high inter-subject variation in CT images, they applied a voxel-wise local atlas selection strategy to improve performance. Griffis proposed a supervised learning method that automatically delineates stroke lesions using Naïve Bayes classification on single T1-weighted MRI sequence data [107]. In order to save time and money, their approach focuses on single-scan data; it detects direct lesion effects and performs better than manual delineation.

Image quality remains a limitation for the extraction of features from radiology images. In many cases, such as brain boundary segmentation, the data is of low contrast by nature. Additionally, both resolution and partial volume effects influence the definition of boundaries [108]. Some research contributions focus on multiple modalities to obtain complementary information [109, 110, 111, 90]; however, it is difficult and inconvenient to apply various testing methods to patients. Also, the accuracy of a segmentation system is difficult to measure and compare because the “ground truth” varies with the delineation by different experts [112], and it is challenging and expensive to obtain manually labeled data from several experts for reliability tests [113].

Ref. | Image type | # images | Goal | Methods | Dice coefficient
[114] | MRI | 12 | Brain tissue | Sparse dictionary learning | 0.91 (Gray matter), 0.87 (White matter)
[90] | MRI | 36 | Stroke lesion | Random forest | 0.82
[101] | CT | 42 | Liver tumor | Random forests & supervoxels | 0.93
[115] | CT | 30 | Liver tumor | CNN | 0.84
[102] | MRI | 70 | Knee | Multi-atlas context forests | 0.97 (Bone), 0.81 (Cartilage)
[105] | CT | 150 | Multi-organ | Discriminative dictionary learning | 0.90 (Liver), 0.88 (Kidney), 0.55 (Pancreas), 0.92 (Spleen)
[94] | MRI | 10 | Brain tissue | CNN | 0.95 (Gray matter), 0.86 (White matter)
[116] | CT | 82 | Pancreas | CNN | 0.72
[107] | MRI | 30 | Stroke lesion | Gaussian Naïve Bayes classification | 0.81
[91] | MRI | 12 | Brain lesion | ANN | 0.79
[117] | MRI | 66 | Prostate | Sparse auto-encoder & sparse patch matching | 0.88
[118] | MRI | 45 | Left ventricle | CNN & stacked-auto-encoder | 0.97
[95] | MRI | 53 | Brain tumor | CNN | 0.95
[119] | CT | 73 | Lung texture | Convolutional restricted Boltzmann machines | 0.74
[97] | MRI | 57 | Brain segmentation | CNN | 0.86
[120] | 4D-CT | 22 | Brain tissue | SVM | 0.79 (Gray matter), 0.81 (White matter)
[121] | MRI | 65 | Brain lesion | CNN | 0.79
[122] | CT | 42 | Liver tumor | CNN | 0.97
[123] | MRI | 73 | Brain tumor | CNN | 0.65
Table 2: Overview of segmentation methods for different radiological images
Figure 8: Chen et al. applied their model to different imaging modalities: (a)-(c) denote T1, T1-IR and T2-FLAIR MR images; (d) represents the ground truth label; (e)-(g) illustrate the segmentation results using each single image modality, respectively; (h) is the result that combines all image modalities [98].
Figure 9: Tong et al. performed discriminative dictionary learning at multiple resolutions to generate a probabilistic atlas for each organ. The graph-cuts algorithm is implemented in native space, combining the information across resolutions to achieve the final segmentation results [105].

3.2 Computer Aided Diagnosis

Computer-aided diagnosis (CAD) systems can detect, mark, and assess potential pathologies for radiologists, helping to improve identification accuracy in the face of data overload and limited human resources. The analysis, quantification, and categorization of images with these methods is an important technique that can improve patient safety and care. CAD systems have achieved breakthroughs in the detection of lesions [124, 125], epidural masses [126], fractures [127], degenerative diseases [128], and cancer [129]. Fisher’s linear discriminant, Bayesian methods, artificial neural networks, and SVM are widely used as classifiers in CAD applications [130, 13]. Table 3 summarizes some current CAD investigations with machine learning techniques.

Ref. | Year | Image type | # cases | Disease | Measurement | Result | Keywords
[131] | 2014 | mammography | 956 | Breast cancer | AUC | 0.81 | Combination of classifiers
[132] | 2014 | mammography | 500 | Breast cancer | AUC | 0.91 | Naïve Bayes classification
[63] | 2014 | MRI | 81 | Cervical cancer | Accuracy | 0.69 | Texture features, SVM
[133] | 2015 | mammography | 340 | Breast cancer | AUC | 0.73 | Texture features, SVM
[134] | 2015 | mammography | 772 | Breast cancer | AUC | 0.89 | Feature selection method
[135] | 2015 | CT | 750 | Lung | AUC | 0.98 | Structured SVM
[136] | 2015 | X-ray | 5,440 | Lung | Accuracy | 0.92 | SVM
[137] | 2015 | MRI | 83 | Pediatric cardiomyopathy | Accuracy | 0.81 | Bayesian rule learning
[138] | 2016 | mammography | 736 | Breast cancer | AUC | 0.82 | CNN
[139] | 2016 | mammography | 2,604 | Breast cancer | AUC | 0.93 | Wavelet neural network
[140] | 2016 | ultrasound | 520 | Breast lesions | Accuracy | 0.82 | Stacked denoising auto-encoder
[141] | 2016 | ultrasound | 95 | Liver lesions | Accuracy | 0.87 | SVM
[142] | 2016 | CT | 104 | Vertebral body fractures | TP | 0.81 | SVM
[143] | 2016 | CT | 409 | Wrist, radius, ulna fractures | ROC | 0.89 | Random forest
[144] | 2017 | mammography | 45,000 | Breast cancer | AUC | 0.91 | CNN
[145] | 2017 | CT | 1,012 | Lung cancer | Sensitivity | 0.89 | ANN
[146] | 2017 | CT | 52 | Teeth | Accuracy | 0.89 | CNN
[147] | 2017 | CT | 344 | Prostate cancer | ROC | 0.80 | CNN
[148] | 2017 | X-ray | 1,391 | Bone age | MAE | 0.80 | CNN
[149] | 2017 | X-ray | 108,948 | Thorax diseases | Accuracy | 0.63 | CNN
[150] | 2017 | X-ray | 112,120 | Thorax diseases | AUC | 0.84 | CNN
[151] | 2017 | MRI | 107 | Brain tumor | Accuracy | 0.88 | Logistic Regression
Table 3: A summary of recent CAD studies.
AUC = area under curve; ROC = receiver operating characteristic curve; TP = true positive rate; MAE = mean absolute error

Breast cancer is one of the most common cancers in the world. Currently, about one in ten women suffer from it, and early diagnosis and treatment of breast cancer could increase the chance of survival significantly [152]. Among diagnostic techniques, mammography is the best approach to detect breast cancer in its early stages, and features indicating abnormalities can be extracted directly from the images [153, 154]. Distinguishing benign from malignant masses is the core principle of using mammography to diagnose breast cancer [155]. Perez et al. developed machine learning classifiers that combine suitable feature selection methods with different machine learning techniques [131]. The feature selection methods include chi-square discretization, information gain, one rule, relief, and a u-test based filter. They then improved their feature selection algorithm, called uFilter, which ranks features in descending order [134]. Their method was useful for different datasets and reduced the number of employed features without decreasing classification accuracy.

The SVM classifier is widely used in breast cancer diagnosis with various features, such as wavelet features, gray-level co-occurrence matrix features, intensity features, and texture features [156, 133]. Banaem et al. proposed a fully automatic tool that classifies mammogram data as normal or abnormal. They used the gray-level co-occurrence matrix and the maximum difference method to extract features, and applied an ensemble classifier combining SVM, KNN and Naïve Bayes to improve the diagnostic accuracy [157]. Many investigations consider not only the accuracy of the model but also its complexity. Arevalo et al. trained an SVM model that integrated a two-layer CNN for mass lesion classification [138, 158]. Similarly, Jiao et al. trained two SVM classifiers using deep learning features extracted from two different layers of a CNN [159]. An automated CAD system that incorporates content-based image retrieval to detect masses was also proposed [132]. The main idea of this approach is to use scale-invariant feature transform features to match the query mammogram against exemplar masses in the database, and then to use Naïve Bayes classification and thresholded maps to detect masses. The model complexity is low because no sliding-window scanning is needed.
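Since gray-level co-occurrence matrix (GLCM) features recur throughout these studies, a small sketch of how they are computed may help. This is a generic illustration on a hypothetical 4x4 patch with four gray levels, not the feature pipeline of any cited paper. It counts horizontally adjacent pixel pairs (a non-symmetric GLCM) and derives two common texture descriptors, contrast and energy.

```python
def glcm(image, levels, dx=1, dy=0):
    """Normalized gray-level co-occurrence matrix for the pixel offset (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]


def contrast(p):
    """Large when neighboring pixels differ strongly in gray level."""
    n = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(n) for j in range(n))


def energy(p):
    """Sum of squared entries; large for uniform textures."""
    return sum(v * v for row in p for v in row)


# Hypothetical image patch quantized to four gray levels (0..3).
patch = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
p = glcm(patch, levels=4)
print(contrast(p), energy(p))
```

In practice such descriptors are computed for several offsets and directions, and the resulting vector is passed to a classifier such as an SVM.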

The SVM method is also widely studied in other diagnostic tasks, such as lesion, injury, and fracture detection. In these tasks, the choice of features plays a significant role in model accuracy. Torheim et al. predicted cervical cancer from dynamic contrast-enhanced MRI, using textural features based on gray-level co-occurrence matrices as explanatory variables [63]. Wang improved the accuracy of lung lesion detection in CT images by using a 3D matrix patterns-based SVM with latent variables. The study focused on detecting lung lesions with irregular shape and low intensity rather than nodules, which offers a new perspective on lung lesion detection [135]. For the detection of thoracic and lumbar vertebral fractures [142], Burns extracted 28 features from the cortical shell in CT images based on the Denis ‘middle column’ method, which is specific to detecting fracture discontinuities on vertebral body cortices. Jin et al. established a prognosis model of cervical spondylotic myelopathy using a least-squares SVM [160]. They extracted fractional anisotropy, axial diffusivity, mean diffusivity and radial diffusivity values from each slice of the DTI metrics as features, which yielded 88.62% prediction accuracy.

Popular methods such as deep belief networks [161] and convolutional neural networks [162, 140] have achieved promising results in many diagnosis applications. Important diagnosis tasks based on neural networks include chest pathology identification [163], cancer detection [164, 165] and the diagnosis of lung diseases [166]. Neural network based methods rely heavily on large amounts of data. A semi-supervised algorithm has been proposed to handle large amounts of unlabeled data with CNN approaches [167]. Using the unlabeled data in addition to the labeled data increased the overall accuracy compared with using labeled data alone.

There are many advantages to using machine learning techniques in CAD systems. The first is accurate and robust performance in many radiology studies. For instance, CAD systems have reached near-perfect accuracy (over 99%) in oral cancer detection [168], which is comparable to manual diagnosis. Moreover, CAD systems are expected to perform consistently and produce robust results with large amounts of data at any time and place, while manual diagnosis may be affected by the practitioner’s fatigue, reading time, and emotional state. The second advantage is that the diagnosis can be completed quickly. Many radiology analyses are time-consuming and require experienced radiologists. For example, the software developed for breast cancer prediction [169] can review charts 30 times faster than humans can. As another example, the suggested approach in breast cancer diagnosis is double reading of mammograms by two radiologists [131]. With the help of a CAD system, only one radiologist is needed instead of two, which could help to increase the survival rate among women in a cost-effective manner [170].

Although computer-aided diagnosis systems are becoming more accurate on the most common clinical problems, current contributions still have room for improvement before they can be applied in clinical practice. First, the majority of current diagnosis contributions focus on predicting one type of disease, which may not meet clinical demands: more than one disease may be present in a single radiological image (for example, effusion and atelectasis in one chest x-ray image). Second, current model training is mainly based on one type of measurement, whereas most disease decisions in clinical practice rely on measurements from multiple domains (such as patient demographics, image screening, blood tests and drug tests). Information from multiple measurements may increase model accuracy. Third, current medical datasets mainly cover common diseases. Human clinicians encounter only a limited number of rare diseases, and many contributions may not consider these individual cases during model training. More comprehensive systems that can detect various types of diseases and report rare cases are expected in the future.
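The first limitation can be made concrete with a short sketch contrasting single-label and multi-label decisions. The class names and logits below are hypothetical stand-ins for the raw outputs of some classifier. A single-label rule keeps only the top-scoring class, whereas independent per-class sigmoids with a threshold can report several co-occurring findings.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def single_label_predict(logits, names):
    """Return only the top-scoring class (one disease per image)."""
    best = max(range(len(logits)), key=lambda i: logits[i])
    return names[best]


def multilabel_predict(logits, names, threshold=0.5):
    """Treat each class independently so several findings can be reported."""
    return [name for name, z in zip(names, logits) if sigmoid(z) >= threshold]


# Hypothetical raw classifier outputs for one chest x-ray image.
names = ["effusion", "atelectasis", "nodule"]
logits = [2.0, 1.5, -3.0]

print(single_label_predict(logits, names))
print(multilabel_predict(logits, names))
```

With these toy values the single-label rule reports only effusion, while the multi-label rule reports both effusion and atelectasis.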

3.3 Functional Brain Studies and Neurological Diseases

Brain tumors, neurological disorders such as epilepsy, and neurodegenerative diseases have attracted much attention in brain-related investigations. In brain-related image diagnosis, a large number of features can be extracted from brain regions related to the nature of pathological changes. Cortical thickness [171], the volume of brain structures [172], and voxel tissue probability maps around some regions of interest [173] are popular choices for feature extraction [174]. Different MRI modalities such as T1-weighted or fluid-attenuated inversion recovery imaging contain large amounts of both information and noise [91]. An effective feature fusion strategy is therefore necessary for neuroimaging analysis and classification [175, 176].

3.3.1 Support vector machine in brain studies

In brain studies, the SVM is a powerful tool for feature selection, which may improve model accuracy. Larroza et al. developed a model to classify brain metastasis and radiation necrosis in contrast-enhanced T1-weighted images. Features were extracted by texture analysis and reduced by using a linear SVM [31]. Bron proposed a feature selection method based on the SVM significance value [177]. The significance value (p-value) quantifies the contribution of each feature to the SVM classifier and is used to reduce voxel-based morphometry features. Neurodegenerative diseases such as Parkinson’s disease begin before the onset of symptoms, so medical treatment is more effective if the disease is detected at an early stage. Among the various forms of Parkinsonism, progressive supranuclear palsy is one of the most difficult to identify at an early stage [178]. Salvatore et al. proposed SVM models to classify control subjects, progressive supranuclear palsy patients, and Parkinson’s disease patients. Features were extracted from T1-weighted sequences by spatial transformations and principal component analysis. The accuracy in discriminating Parkinson’s disease from progressive supranuclear palsy is above 90% [24]. Fig. 10 uses a color scale to express the importance of each region during classification. To improve the diagnostic accuracy of classifying Parkinson’s disease patients, Singh proposed an unsupervised method that extracts features from a T1-weighted sequence using a Kohonen self-organizing map algorithm. With the least-squares SVM, the accuracy of identifying the affected area in Parkinson’s disease is up to 99% [179]. In [174], features were extracted using a deep network and a stacked denoising sparse autoencoder, which makes the input data points more linearly separable for the SVM. Liu proposed an inherent structure-guided multi-view learning method to classify Alzheimer’s disease and mild cognitive impairment patients [180]. They extracted 1500 features from gray matter density and applied multi-task feature selection to reduce the dimension, followed by an ensemble classification method using multiple SVM classifiers.
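The idea of using a linear SVM to reduce features can be sketched as follows. This is a simplified illustration and not the procedure of [31] or the p-value method of [177]: a linear SVM is trained by subgradient descent on the regularized hinge loss, and features are ranked by the magnitude of their learned weights. The toy data, hyperparameters, and function names are assumptions made for this example.

```python
def train_linear_svm(X, y, lam=0.01, epochs=50, lr=0.1):
    """Subgradient descent on lam/2 * ||w||^2 + max(0, 1 - y * (w.x + b))."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:
                # Sample violates the margin: hinge term contributes -yi * xi.
                w = [wj - lr * (lam * wj - yi * xj) for wj, xj in zip(w, xi)]
                b += lr * yi
            else:
                # Only the regularizer contributes.
                w = [wj - lr * lam * wj for wj in w]
    return w, b


def rank_features(w):
    """Feature indices ordered from largest to smallest absolute weight."""
    return sorted(range(len(w)), key=lambda j: abs(w[j]), reverse=True)


# Toy data (hypothetical): feature 0 separates the classes, feature 1 is noise.
X = [[1.0, 0.3], [1.2, -0.3], [0.8, 0.3], [1.1, -0.3],
     [-1.0, 0.3], [-1.2, -0.3], [-0.8, 0.3], [-1.1, -0.3]]
y = [1, 1, 1, 1, -1, -1, -1, -1]

w, b = train_linear_svm(X, y)
ranking = rank_features(w)
print(ranking)
```

Features at the bottom of the ranking are candidates for removal. Weight magnitudes are only comparable when the features are on similar scales, so standardization is normally applied first.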

Besides disease studies, some research has applied machine learning techniques to understand the brain’s functional network architecture. Smyser compared the fMRI data from 50 preterm-born and 50 term-born infants using SVM [181]. Their results show that inter- and intra-hemispheric functional connections throughout the brain are stronger in full-term infants. These findings might be helpful for developing models that define indices of brain maturation.

3.3.2 Ensemble learning in brain studies

Ensemble learning methods, which combine multiple classifiers, are popular in Alzheimer’s disease diagnosis. Alzheimer’s disease is estimated to affect around 5.4 million patients in America and is the most common form of dementia among the elderly population [182, 128]; it leads to the loss of cognitive function and death. Liu proposed a classification framework that works on different image modalities for the classification of Alzheimer’s disease patients [175]. Their method contains classifiers at several levels: low-level classifiers that use different types of low-level features from patches, high-level classifiers that combine coarse-scale imaging features in each patch with the outputs of the low-level classifiers, and a final ensemble classification that combines the decisions of the high-level classifiers with a weighted voting strategy (Fig. 11). In [183], high accuracy was obtained for Alzheimer’s disease/healthy and mild cognitive impairment/healthy classification. However, the accuracy in classifying mild cognitive impairment patients who converted to Alzheimer’s disease is very low (57.4%), although slightly higher than that of majority classification. Komlagan et al. developed an ensemble learning method that uses gray matter for weak classifiers and selects the most relevant sub-ensembles through sparse logistic regression [184]. They trained a global linear SVM classifier for the final classification. Combining high-quality biomarkers with advanced learning methods makes the results comparable to those of multi-modality methods.
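A weighted voting step of the kind used to fuse classifier decisions can be sketched in a few lines. The weights and labels below are hypothetical; in a real system the weights would typically reflect each classifier's validation performance, and how [175] sets them is not specified here.

```python
def weighted_vote(predictions, weights):
    """Return the label with the largest total weight across classifiers."""
    totals = {}
    for label, weight in zip(predictions, weights):
        totals[label] = totals.get(label, 0.0) + weight
    return max(totals, key=totals.get)


# Three hypothetical high-level classifiers voting on one subject.
votes = ["AD", "NC", "NC"]

print(weighted_vote(votes, [1.0, 1.0, 1.0]))    # plain majority vote
print(weighted_vote(votes, [0.6, 0.25, 0.25]))  # a trusted classifier outweighs two others
```

With equal weights this reduces to majority voting. With unequal weights a single reliable classifier can override several weaker ones.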

3.3.3 Other techniques in brain studies

Some researchers leverage regression and principal component analysis in classification and feature mapping. Ahmed et al. detected neocortical structural lesions with an automated approach that combined five surface-based MRI features in a logistic regression [185]. To deal with class imbalance, they used a “bagging” approach and an iteratively reweighted least squares algorithm. Each base-level classifier was trained on all the minority class instances and an equally sized random sample of majority class instances. Hong proposed a machine learning technique combined with surface-based analysis for patients with a subtype of focal cortical dysplasia [186]. Their automated approach used features of focal cortical dysplasia morphology and intensity, and Fisher’s linear discriminant was applied as the classifier to identify focal cortical dysplasia in patients. Huang proposed the use of a soft-split random forest to predict clinical scores in Alzheimer’s disease patients [187]. In their method, lasso regression is applied to map MRI features, and the features are then reduced by principal component analysis. Li combined principal component analysis, the lasso method, and a deep learning framework to extract features by fusing information from MRI and PET images for the classification of Alzheimer’s disease/mild cognitive impairment patients [183]. Zhu et al. focused on identifying Alzheimer’s disease patients using multi-view or visual features of image data, and proposed several feature selection approaches for Alzheimer’s disease classification. In 2014, they integrated subspace learning into a sparse least-squares regression framework for multi-class classification [188]. They then mapped histogram of oriented gradients features (which are diverse) onto region-of-interest features (which are robust to noise), which provided complementary information and enhanced the accuracy of disease status identification [189]. Other machine learning techniques, such as convolutional neural networks, are widely investigated in the field as well. Table 4 summarizes recent contributions related to Alzheimer’s disease classification.
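The balanced bagging scheme described for the lesion detection work [185] can be sketched as a sampling routine: every training set keeps all minority-class instances and adds an equally sized random draw from the majority class. Only the sampling step is shown; the base learner (logistic regression in the cited study) is omitted, and the function name, seed, and toy instances are assumptions of this example.

```python
import random


def balanced_bags(minority, majority, n_bags, seed=0):
    """Build n_bags training sets: all minority items plus an equal-size majority sample."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    bags = []
    for _ in range(n_bags):
        sampled = rng.sample(majority, len(minority))  # sampling without replacement
        bags.append(minority + sampled)
    return bags


# Hypothetical dataset: 5 lesional and 100 normal instances.
minority = [("lesion", i) for i in range(5)]
majority = [("normal", i) for i in range(100)]

bags = balanced_bags(minority, majority, n_bags=10)
print(len(bags), len(bags[0]))
```

One base classifier would then be trained per bag and their outputs combined, so the ensemble sees much of the majority class while each individual learner is trained on balanced data.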

Figure 10: Salvatore et al. [24] proposed a supervised learning method to identify PD and PSP using MR images. The figures show maps of voxel-based pattern distribution of brain structural differences. The color scale expresses the importance of each voxel in SVM classification.
Figure 11: Flow chart of the hierarchical classification algorithm proposed in [175]. The low-level classifiers are used to transform imaging and spatial-correlation features from the local patch, and the output of these low-level classifiers is integrated into high-level classifiers with coarse-scale imaging features. The final classification is achieved by ensembling the outputs of the high-level classifiers.
year database # images image type classification groups accuracy keywords
[111] 2014 ADNI 834 MRI AD vs. NC 89% Multiple instance learning
pMCI vs. sMCI 70%
[188] 2014 ADNI 202 MRI+PET AD vs. MCI vs. NC 73.35% Sparse discrimination feature selection
AD vs. pMCI vs. sMCI vs. NC 61.06%
[190] 2014 ADNI 1071 MRI AD vs. NC 89% Manifold and transfer learning
pMCI vs. sMCI 73%
[184] 2014 ADNI 814 MRI pMCI vs. sMCI 75.6% Gray matter grading, weak-classifier fusion
[180] 2015 ADNI 459 MRI AD vs. NC 93.83% Hierarchical fusion of features
pMCI vs. sMCI 80.9%
pMCI vs. NC 89.09%
[191] 2015 ADNI 202 PET+MRI pMCI vs. sMCI 78.7% Multimodal multi-label transfer learning
[189] 2015 ADNI 830 MRI AD vs. NC 91.31% HoG mapping
MCI vs. NC 78.07%
pMCI vs. sMCI 75.54%
[47] 2016 OASIS 416 MRI AD vs. NC 80.76% Gabor filter
[192] 2016 ADNI 416 MRI AD vs. NC vs. MCI 89.1% CNN
[193] 2016 Self-collected 67 MRI AD vs. NC 96.77% SVM
[194] 2016 Self-collected 89 fMRI AD vs. NC 97.50% SVM
MCI vs. AD 87.30%
MCI vs. NC 72.00%
[195] 2016 Dartmouth College 116 MRI AD vs. NC 97.14% Feature ranking selection
[196] 2016 Self-collected 43 fMRI AD vs. NC 96.85% Deep learning selection
[197] 2017 Self-collected 250 DTI AD vs. NC 89.60% Elastic net selection
Table 4: Recent studies on Alzheimer’s disease classification
NC: normal control; AD: Alzheimer’s disease; pMCI: progressive mild cognitive impairment; sMCI: stable mild cognitive impairment

3.4 Image Retrieval

With the increased use of modern medical diagnostic techniques, large numbers of medical images are stored in hospital archives. Manual annotation and attribution of these images is impractical [14]. Picture archiving and communication systems have been widely introduced in many hospitals [198]. These systems can retrieve images based on keywords; however, the retrieved images may not be directly useful for making clinical decisions. Unlike traditional image search systems, which match keywords and image tags, content-based image retrieval extracts rich content from images and searches for other images with similar content. Content-based image retrieval is becoming necessary for medical image databases, which may become effective sources of anatomical and functional information for diagnostic, educational, and research purposes [199]. Table 5 lists current investigations on image retrieval. Similarity or distance learning has recently become a hot topic in the image retrieval field. Traditional choices include the Euclidean distance function, square distance function, Mahalanobis distance, norm distance function [38], maximum likelihood approach [200] and Bayes ensemble [201]. As in other machine learning tasks, feature extraction is an important step in image retrieval systems. Kurtz et al. proposed the use of a hierarchical semantic-based distance to retrieve images based on 72 manually annotated semantic terms from each region of interest [202]. They then built a semantic framework that learns image descriptions of each term using Riesz wavelets and SVM. In [203], local wavelet patterns were introduced as a new feature descriptor. The descriptor exploits the relationships among neighboring pixels and performed well in CT image retrieval; the results are shown in Fig. 12. Unlike traditional similarity learning, which only maximizes the margin, Meng et al. proposed a novel similarity learning algorithm that incorporates the top precision performance measure into the loss function [38]. Their method showed advantages over other traditional similarity learning methods.
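The basic mechanics of content-based retrieval with a fixed distance function can be sketched as follows. The two-dimensional feature vectors, labels, and identifiers are hypothetical; a real system would use descriptors such as the wavelet or texture features discussed above, and learned similarity functions such as that of [38] would replace the plain Euclidean distance.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def retrieve(query, database, k, distance=euclidean):
    """Return the k database entries whose features are closest to the query."""
    ranked = sorted(database, key=lambda item: distance(query, item["features"]))
    return ranked[:k]


def precision_at_k(results, relevant_label):
    """Fraction of retrieved images that share the query's category."""
    return sum(1 for r in results if r["label"] == relevant_label) / len(results)


# Hypothetical archive with precomputed feature vectors.
database = [
    {"id": "img1", "label": "lung", "features": [0.9, 0.1]},
    {"id": "img2", "label": "lung", "features": [0.8, 0.2]},
    {"id": "img3", "label": "liver", "features": [0.2, 0.9]},
    {"id": "img4", "label": "liver", "features": [0.1, 0.8]},
    {"id": "img5", "label": "lung", "features": [0.7, 0.4]},
]

top = retrieve([0.9, 0.15], database, k=3)
print([item["id"] for item in top], precision_at_k(top, "lung"))
```

Sorting the whole archive is adequate for a sketch but scales linearly with database size; large archives generally need an index or approximate nearest-neighbor search.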

Other supervised techniques applied in retrieval systems include online dictionary learning, ensemble learning and principal component analysis. The main advantage of an online dictionary learning system is its computational time, as learned dictionaries are used to represent the dataset in a sparse model, which is a valuable tool for representing data [204]. A method that applies online dictionary learning to features extracted by multi-scale wavelet packet decomposition from different types of images is proposed in [205]. Srinivas et al. proposed a medical image classification approach using online dictionary learning with edge- and patch-based features to distinguish 18 categories [206]. Ahn et al. developed a robust method to improve X-ray image classification [207]. They proposed a fusion strategy that combines domain-transferred convolutional neural networks and sparse spatial pyramid classification; the combined method performs better than either method alone. Faria et al. proposed a retrieval method for brain MRI images. They captured anatomical features from T1-weighted images using partial least-squares discriminant analysis and principal component analysis, and searched for images among healthy controls and patients with primary progressive aphasia [208].

As large portions of the medical images in a dataset lack labels and annotations, semi-supervised and unsupervised techniques are required in retrieval systems. Unsupervised image retrieval based on clustering with K-SVD alternates between grouping similar images into clusters and generating a dictionary for each cluster until the clusters converge [42]. The advantage of this method is that it requires no training data for classification and is not restricted to a specific context. Since labeled data is limited, Herrera proposed a semi-supervised learning method for image classification that uses k-nearest neighbors to expand the training data set and a random forest for the final classification [113].

Medical image retrieval gives an opportunity for clinicians to search for similar disease cases. Accuracy and performance time are both vital aspects of the medical image retrieval system. A practical model and relevant image feature extraction are required to get better results. Furthermore, some image retrieval contributions mainly investigated small datasets and limited disease cases. With an increasing number of digital radiological images in hospital databases every year, whether systems can retrieve disease cases stably and efficiently in huge datasets is still an exciting avenue for researchers.

Figure 12: The method for retrieving images using local wavelet pattern features and similarity measurement. All retrieved images are from the same category, achieving 100% precision in this example [203]: (a) Query image. (b) Top 10 retrieved images.
year image types # images results keywords
[209] 2014 CT 72 AUC: 0.93 Riesz wavelets, hierarchical semantic-based distance
[208] 2015 MRI 30 Accuracy: 0.88 Partial least squares discriminant analysis, principal component analysis
[203] 2015 CT TCIA: 604 Local wavelet pattern
[210] 2015 Multimodality 10 thousand MAP: 0.29 Deep Boltzmann machine
[211] 2015 MRI OASIS: 421 Precision: 0.48 Local binary patterns, gray-level co-occurrence matrices
[206] 2016 X-ray & CT ImageCLEF: 5400 Accuracy: 0.98 Sparse representation, online dictionary learning
[38] 2016 Multimodality Top precision: Support top irrelevant machine
[212] 2017 CT TCIA: 604 Gabor and Schmid filters
Table 5: A summary of recent image retrieval research using machine learning techniques

3.5 Image Prediction

With the development of neuroimaging techniques, various new image modalities have been applied in daily clinical practice to make diagnosis and treatment more efficient and accurate. Image prediction methods, which combine various image modalities and provide information for diagnosis, are therefore fundamental. The main idea of image prediction is to estimate radiological images in a different modality or at a higher resolution, which can provide detailed functional information for assessment and diagnosis. PET is a molecular imaging technique widely used in clinical cancer diagnosis; it produces 3D images that reflect tissue metabolic activity in the human body [213]. However, the quality of PET images depends on the injected dose and the imaging time, so low-dose PET images can suffer in quality. A great deal of effort has therefore been made to predict high-quality PET images. Kang proposed a regression forest based approach to predict standard-dose PET images from low-dose PET and multimodal MRI images [214]. They used a regression forest as their non-linear prediction model, with features from local intensity patches of the MRI data and low-dose PET. Meanwhile, Wang used a mapping-based sparse representation approach for prediction [11]. They used a graph-based distribution mapping method to reduce the patch distribution differences between MRI and low-dose PET, and constructed a patch selection based dictionary learning method to predict standard-dose PET. Both methods performed better than a patch-based sparse model. In [215], Xiang et al. used convolutional neural networks to estimate standard-dose PET images from low-dose PET/MR images (Fig. 13). With neural network techniques, the inputs can be mapped to the output directly, without any pre- or post-processing beyond the optimization in the training stage. Huynh predicted CT images from MRI data using a structured random forest instead of a classical random forest [70]. A structured random forest is an extension of a random forest that predicts structured outputs instead of scalar outputs [216, 217]. Combining information obtained from multiple sources improves prediction accuracy.

Compared to classification and segmentation tasks, contributions involving radiological image prediction are still quite limited. The rapid development of hybrid imaging scanners (PET-CT, SPECT-CT, PET-MRI) has provided integrated images for diagnostic purposes. The primary challenge remains finding an appropriate way to match correspondences among different image modalities [218]. In addition, current contributions mainly focus on brain data, but we expect to see more non-brain-related contributions in this area soon.

Figure 13: Deep auto-context convolutional neural networks were proposed for standard-dose PET (SPET) image estimation from low-dose PET (LPET) and T1 images [215]. SPET images were estimated using LPET images alone and using a combination of LPET and T1 images. The neural networks performed better when both image modalities were included.

4 Current Challenges

Until now, machine learning has been used to help radiologists in diagnostic tasks, but it still cannot substitute for the clinician. There are some limitations regarding the application of machine learning in clinical practice [219]. One such limitation is that the majority of studies in radiology are based on supervised learning, in which the algorithms learn specific patterns from previous decisions made by radiologists. The algorithm is expected to reach an accuracy of 100% relative to the human clinician; however, in many cases an accurate diagnosis is made by several radiologists after multiple diagnostic tests. Whether a machine can perform well alone still needs to be investigated.

Facilitating data collection and sharing is crucial for further investigation in many studies. Clinically applied algorithms depend on two critical factors: robustness on large datasets and the accuracy achieved [220]. In many machine learning related studies, the chosen dataset is relatively small [94, 208, 107, 89, 103]. This is due to limited access to patients or limited diagnostic work in the research setting. For example, in some clinical practices no pathological exam is done as a follow-up procedure in cases of brain metastasis [31], which limits the acquisition of data. Applying machine learning to a limited number of cases (only around twenty patients were studied in some cases) is not always persuasive. Whether these clinical tools are robust enough to analyze immense amounts of medical data accurately remains a question. While 20-50 images were sufficient in past research, hundreds or more image sets will be required in the future to meet increasing requirements concerning robustness and accuracy. The creation of large databases and sharing centers such as ADNI for Alzheimer’s disease patients, NIH repositories for chest X-rays, and TCIA for cancer imaging helps to collect millions of images for research effectively. Furthermore, current studies trained and evaluated their models on various datasets, which makes it challenging to compare their algorithms. A systematic evaluation standard based on various diseases and various public datasets is required in medical applications.

Beyond data size, data quality and feature selection remain highly important for effective machine learning. Medical images contain rich features that are clinically important. It is challenging to capture the visual appearance of disease with low-level image features, yet high-dimensional features can be redundant in many tasks such as image retrieval and classification. The choice of high-level descriptions as input features is a prominent research topic. Choosing informative features for training can lead to robust models, whereas overfitting, underfitting, and misclassification usually occur when features are not selected well. More work remains to be done on selecting and utilizing proper features from images.

Transfer learning is an accepted approach for learning from limited datasets. There are three main transfer learning research directions in the field of radiological imaging. The first is to reduce bias among different image acquisition equipment and protocols [123, 221]. The second is to learn various abnormalities from the same data source [222, 223]. The third is to find a good feature representation from various domains and then apply it to the radiological imaging field [224]. Transfer learning allows us to deal with various scenarios by leveraging existing information from a related task or domain. For more details on transfer learning techniques and radiological imaging, readers should refer to [225].

Imbalanced data is common in medical diagnosis, where the majority of the data is normal and only a minority class is abnormal [226]. For example, brain tumors are not common, occurring in only 1% of the population; however, they remain the most fatal form of cancer [16]. This data imbalance might affect prediction accuracy and cause a bias toward the majority class [227]. Several researchers considered the imbalanced situation in their models [185, 38, 94]; however, the majority of studies have not properly addressed this issue [94, 228], or they use the same amount of data from each class. How to utilize imbalanced data to improve the accuracy of machine learning algorithms remains an open question in this field.
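The bias toward the majority class is easy to demonstrate numerically. In the sketch below, which uses a hypothetical dataset of 99 normal cases and one abnormal case, a trivial model that always predicts "normal" reaches high plain accuracy while detecting nothing. Balanced accuracy (the mean of per-class recall) exposes the problem, and inverse-frequency class weights are one common, generic way to counteract it during training.

```python
def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so every class counts equally."""
    recalls = []
    for cls in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == cls]
        recalls.append(sum(y_pred[i] == cls for i in idx) / len(idx))
    return sum(recalls) / len(recalls)


def inverse_frequency_weights(y_true):
    """Class weights proportional to 1 / class frequency."""
    n = len(y_true)
    classes = sorted(set(y_true))
    return {c: n / (len(classes) * y_true.count(c)) for c in classes}


# Hypothetical screening set: 0 = normal, 1 = abnormal.
y_true = [0] * 99 + [1]
always_normal = [0] * 100  # degenerate model that never flags disease

print(accuracy(y_true, always_normal))
print(balanced_accuracy(y_true, always_normal))
print(inverse_frequency_weights(y_true))
```

The weight for the rare class is about a hundred times larger than for the common class, which makes each missed abnormal case correspondingly more costly in a weighted loss.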

Large amounts of radiological images are produced in hospitals every year. However, most of these images are not used for further training of machine learning algorithms, because the training process is constrained by available resources. Useful information is hidden in this mass of data, and diagnostic machine learning models could be improved by using these streaming data. Online learning, in which a model is updated as streaming data arrive, is already being developed in recommender systems and other machine learning based systems. This idea can also be transferred to medical diagnosis systems to make full use of streaming image datasets.
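As a minimal illustration of online learning, the sketch below updates a logistic regression model one sample at a time with stochastic gradient descent, so that no past data needs to be stored or revisited. The class name, learning rate, and two-feature toy stream are assumptions made for this example; a diagnostic model would operate on image-derived features and would need far more careful validation.

```python
import math


class OnlineLogisticRegression:
    """Logistic regression updated incrementally, one sample per step."""

    def __init__(self, n_features, learning_rate=0.1):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = learning_rate

    def predict_proba(self, x):
        z = self.b + sum(wi * xi for wi, xi in zip(self.w, x))
        return 1.0 / (1.0 + math.exp(-z))

    def partial_fit(self, x, y):
        # Gradient of the log-loss with respect to the linear output.
        error = self.predict_proba(x) - y
        self.w = [wi - self.lr * error * xi for wi, xi in zip(self.w, x)]
        self.b -= self.lr * error


# Hypothetical stream: feature 0 indicates class 1, feature 1 indicates class 0.
stream = [([1.0, 0.0], 1), ([0.0, 1.0], 0)] * 200

model = OnlineLogisticRegression(n_features=2)
for features, label in stream:
    model.partial_fit(features, label)  # each sample is seen once, then discarded

print(model.predict_proba([1.0, 0.0]), model.predict_proba([0.0, 1.0]))
```

The same update loop could in principle run as new studies arrive, although continuously changing a deployed medical model raises the regulatory questions discussed in this section.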

Researchers have developed increasingly powerful and innovative diagnosis models for radiological imaging [229]. However, very few of them have been commercialized and deployed in the market. The main challenge is to comply with government requirements in various countries [230]. Current FDA protocols suggest that medical products should pass clinical trials, and be produced, commercialized, and used in a defined, unchanging form. If a machine learning model is used for medical diagnosis, a pre-built, frozen model must be tested in different clinical environments, assessed for various real-life medical conditions, and carefully evaluated on how these conditions affect the accuracy of the diagnosis. A recent study showed that a pre-trained model had significantly lower external performance on data obtained from another hospital system [231]. To reduce bias and improve performance, current machine learning solutions in nonmedical fields typically update the model parameters every time new data is included. However, this is not realistic for medical products, as the system must pass a new clinical trial after each update. Therefore, this remains a major issue for machine learning algorithms in medical applications.

Understanding how deep neural networks work remains an open question, and this leads clinicians and patients to distrust these models. Because the models have huge numbers of parameters, it is difficult to interpret how they arrive at a diagnostic decision from the input. This could potentially be fatal if a machine learning model leads to a wrong conclusion [232], as medical experts cannot verify these models. Deep learning researchers have recently computed heat maps using class activation maps in order to analyze more concretely how these models perform [233]. An activation map visually highlights the discriminative regions in a medical image that the model used to identify the category. However, research on network explainability and visualization is still limited in the field of radiological imaging.
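The core computation behind a class activation map is a weighted sum of the final convolutional feature maps, using the weights that connect each map (after global average pooling) to the class of interest. The sketch below shows only that step on hypothetical 2x2 feature maps; extracting the maps and weights from a trained network, and upsampling the result to the image size, are omitted.

```python
def class_activation_map(feature_maps, class_weights):
    """Weighted sum of feature maps for one class; inputs are nested lists."""
    rows, cols = len(feature_maps[0]), len(feature_maps[0][0])
    cam = [[0.0] * cols for _ in range(rows)]
    for fmap, weight in zip(feature_maps, class_weights):
        for r in range(rows):
            for c in range(cols):
                cam[r][c] += weight * fmap[r][c]
    return cam


def normalize(cam):
    """Rescale to [0, 1] so the map can be rendered as a heat map."""
    lo = min(min(row) for row in cam)
    hi = max(max(row) for row in cam)
    span = (hi - lo) or 1.0  # avoid division by zero for constant maps
    return [[(v - lo) / span for v in row] for row in cam]


# Two hypothetical feature maps and the class weights attached to them.
feature_maps = [
    [[1.0, 0.0], [0.0, 0.0]],
    [[0.0, 0.0], [0.0, 2.0]],
]
class_weights = [0.5, 1.0]

heat = normalize(class_activation_map(feature_maps, class_weights))
print(heat)
```

In this toy case the bottom-right location receives the highest value, meaning it contributed most to the class score.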

5 Conclusion and Future Work

In this paper, we reviewed five applications of machine learning techniques on radiologic images: image segmentation, computer-aided detection and diagnosis, functional brain studies and neurological disease diagnosis, image classification and retrieval, and image registration. While machine learning techniques are active in computer-aided systems to assist radiologists in daily diagnosis and studies, the use of machine learning techniques in radiology is still evolving. There are many strategies that this field could investigate in the future:

  • Previous contributions have shown that machine learning-based systems can produce accurate results comparable to those of radiologists. However, the accuracy of these systems must still be improved; that is, they must become more accurate than radiologists. Otherwise, the widespread application of machine learning techniques will be limited. A possible approach to achieve this superior performance is to design better machine learning models or to gather more representative data that can be continuously used to improve the algorithms.

  • Although the core advance of deep learning is its ability to learn useful features directly from data, its accuracy and performance are strongly limited by the amount of available data. Traditional machine learning methods still play a role when only small amounts of labeled data are available. Understanding how to choose and use features from images effectively is still a significant direction for these traditional methods.

  • Another issue concerns the translation of these techniques to clinical practice. While many machine learning algorithms have already shown good results, they still need to pass the clinical trials required by government regulators. Additionally, many people still place more trust in human decisions, as clinicians tend to decide consciously with all the relevant information in mind. This preference makes it difficult to justify the use of algorithms for clinical decision making in all possible cases, but through rigorous research contributions, we can justify the use of machine learning algorithms in the cases where patients’ outcomes can be improved.

The application and development of machine learning techniques for radiological images is currently an active research topic, and a large number of algorithms are being developed to achieve higher accuracy and lower computational complexity. We expect that machine learning techniques will become essential components of clinical tools and will be widely used to assess patients’ health in the future.


Research reported in this publication was supported by the Eunice Kennedy Shriver National Institute of Child Health & Human Development of the National Institutes of Health under Award Number R01HD092239. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

Conflicts of Interest

None declared.


  • [1] R. A. Novelline and L. F. Squire, Squire’s fundamentals of radiology.   La Editorial, UPR, 2004.
  • [2] M. Chen, T. Pope, and D. Ott, Basic radiology.   McGraw Hill Professional, 2010.
  • [3] W. Herring, Learning radiology: Recognizing the basics.   Elsevier Health Sciences, 2015.
  • [4] S. J. Swensen, J. R. Jett, T. E. Hartman, D. E. Midthun, S. J. Mandrekar, S. L. Hillman, A.-M. Sykes, G. L. Aughenbaugh, A. O. Bungum, and K. L. Allen, “Radiology CT screening for lung cancer: five-year prospective,” Cancer, pp. 259–265, 2005.
  • [5] V. R. Iyer and S. I. Lee, “MRI, CT, and PET/CT for ovarian cancer detection and adnexal lesion characterization,” American Journal of Roentgenology, vol. 194, no. 2, pp. 311–321, 2010.
  • [6] M. S. Pearce, J. A. Salotti, M. P. Little, K. McHugh, C. Lee, K. P. Kim, N. L. Howe, C. M. Ronckers, P. Rajaraman, A. W. Craft, L. Parker, and A. B. De González, “Radiation exposure from CT scans in childhood and subsequent risk of leukaemia and brain tumours: A retrospective cohort study,” The Lancet, vol. 380, no. 9840, pp. 499–505, 2012.
  • [7] R. Smith-Bindman, J. Lipson, R. Marcus, K. P. Kim, M. Mahesh, R. Gould, A. B. de Gonzalez, and D. L. Miglioretti, “Radiation dose associated with common computed tomography examinations and the associated lifetime attributable risk of cancer,” Archives of Internal Medicine, vol. 169, no. 22, pp. 2078–2086, 2009.
  • [8] D. P. Frush, L. F. Donnelly, and N. S. Rosen, “Computed tomography and radiation risks: what pediatric health care providers should know.” Pediatrics, vol. 112, no. 4, pp. 951–957, 2003.
  • [9] Y. L. Huang, D. R. Chen, and Y. K. Liu, “Breast cancer diagnosis using image retrieval for different ultrasonic systems,” in International Conference on Image Processing, 2004, pp. 2957–2960.
  • [10] J. Shan, “A fully automatic segmentation method for breast ultrasound images,” Ph.D. dissertation, Utah State University, 2011.
  • [11] Y. Wang, P. Zhang, L. An, G. Ma, J. Kang, F. Shi, X. Wu, J. Zhou, D. S. Lalush, W. Lin, and D. Shen, “Predicting standard-dose PET image from low-dose PET and multimodal MR images using mapping-based sparse representation,” Physics in Medicine and Biology, vol. 61, no. 2, p. 791, 2016.
  • [12] M. Sundaram and M. H. Mcguire, “Computed tomography or magnetic resonance for evaluating the solitary tumor or tumor-like lesion of bone?” Skeletal Radiology, vol. 17, no. 6, pp. 393–401, 1988.
  • [13] J. Yao, J. E. Burns, and R. M. Summers, “Computer aided detection of bone metastases in the thoracolumbar spine,” in Spinal Imaging and Image Analysis, 2015, pp. 97–130.
  • [14] H. S. J. Ibrahim and A. Mukhtar, “Content based image retrieval in mammograms: a survey,” International Journal of Engineering Science, vol. 4638, 2016.
  • [15] S. Wang and R. M. Summers, “Machine learning and radiology,” Medical Image Analysis, vol. 16, no. 5, pp. 933–951, 2013.
  • [16] S. Bauer, R. Wiest, L.-P. Nolte, and M. Reyes, “A survey of MRI-based medical image analysis for brain tumor studies,” Physics in Medicine and Biology, vol. 58, no. 13, pp. R97–R129, 2013.
  • [17] D. García-Lorenzo, S. Francis, S. Narayanan, D. L. Arnold, and D. L. Collins, “Review of automatic segmentation methods of multiple sclerosis white matter lesions on conventional magnetic resonance imaging,” Medical Image Analysis, vol. 17, no. 1, pp. 1–18, 2013.
  • [18] K. Kourou, T. P. Exarchos, K. P. Exarchos, M. V. Karamouzis, and D. I. Fotiadis, “Machine learning applications in cancer prognosis and prediction,” Computational and Structural Biotechnology Journal, vol. 13, pp. 8–17, 2015.
  • [19] G. Litjens, T. Kooi, B. E. Bejnordi, A. A. A. Setio, F. Ciompi, M. Ghafoorian, J. A. van der Laak, B. van Ginneken, and C. I. Sánchez, “A survey on deep learning in medical image analysis,” arXiv preprint arXiv:1702.05747, 2017.
  • [20] D. Shen, G. Wu, and H.-I. Suk, “Deep learning in medical image analysis,” Annual Review of Biomedical Engineering, no. 0, 2017.
  • [21] University of Wisconsin School of Medicine and Public Health. (2016) Neuroradiology learning module.
  • [22] C. Bailey, T. A. Huisman, R. M. de Jong, and M. Hwang, “Contrast-enhanced ultrasound and elastography imaging of the neonatal brain: A review,” Journal of Neuroimaging, vol. 27, no. 5, pp. 437–441, 2017.
  • [23] B. J. Erickson, P. Korfiatis, Z. Akkus, and T. L. Kline, “Machine learning for medical imaging,” Radiographics, vol. 37, no. 2, pp. 505–515, 2017.
  • [24] C. Salvatore, A. Cerasa, I. Castiglioni, F. Gallivanone, A. Augimeri, M. Lopez, G. Arabia, M. Morelli, M. C. Gilardi, and A. Quattrone, “Machine learning on brain MRI data for differential diagnosis of Parkinson’s disease and Progressive Supranuclear Palsy,” Journal of Neuroscience Methods, vol. 222, pp. 230–237, 2014.
  • [25] D. W. Townsend, T. Beyer, and T. M. Blodgett, “PET/CT scanners: a hardware approach to image fusion,” in Seminars in Nuclear Medicine, vol. 33, no. 3.   Elsevier, 2003, pp. 193–204.
  • [26] Y. LeCun, Y. Bengio, and G. Hinton, “Deep learning,” Nature, vol. 521, no. 7553, pp. 436–444, 2015.
  • [27] P. Dayan, “Unsupervised learning,” The MIT Encyclopedia of the Cognitive Sciences, pp. 1–7, 2009.
  • [28] T. Mitchell and A. Blum, “Combining labeled and unlabeled data with co-training,” in 11th Annual Conference on Computational Learning Theory, 1998, pp. 92–100.
  • [29] X. Zhu, “Semi-supervised learning,” in Encyclopedia of Machine Learning, 2011, pp. 892–897.
  • [30] S. T. Chao, M. S. Ahluwalia, G. H. Barnett, G. H. J. Stevens, E. S. Murphy, A. L. Stockham, K. Shiue, and J. H. Suh, “Challenges with the diagnosis and treatment of cerebral radiation necrosis,” International Journal of Radiation Oncology Biology Physics, vol. 87, no. 3, pp. 449–457, 2013.
  • [31] A. Larroza, D. Moratal, A. Paredes-Sánchez, E. Soria-Olivas, M. L. Chust, L. A. Arribas, and E. Arana, “Support vector machine classification of brain metastasis and radiation necrosis based on texture analysis in MRI,” Journal of Magnetic Resonance Imaging, vol. 42, no. 5, pp. 1362–1368, 2015.
  • [32] C. F. Tsai, “Image mining by spectral features: A case study of scenery image classification,” Expert Systems with Applications, vol. 32, no. 1, pp. 135–142, 2007.
  • [33] M. M. Islam, D. Zhang, and G. Lu, “A geometric method to compute directionality features for texture images,” in IEEE International Conference on Multimedia and Expo, no. 3, 2008, pp. 1521–1524.
  • [34] N. C. Yang, W. H. Chang, C. M. Kuo, and T. H. Li, “A fast MPEG-7 dominant color extraction with new similarity measure for image retrieval,” Journal of Visual Communication and Image Representation, vol. 19, no. 2, pp. 92–105, 2008.
  • [35] D. P. Tian, “A review on image feature extraction and representation techniques,” International Journal of Multimedia and Ubiquitous Engineering, vol. 8, no. 4, pp. 385–395, 2013.
  • [36] W. Xie, Y. Li, and Y. Ma, “Breast mass classification in digital mammography based on extreme learning machine,” Neurocomputing, vol. 173, pp. 930–941, 2016.
  • [37] R. Rastghalam and H. Pourghassem, “Breast cancer detection using MRF-based probable texture feature and decision-level fusion-based classification using HMM on thermography images,” Pattern Recognition, vol. 51, pp. 176–186, 2016.
  • [38] J. Meng, Y. Jiang, X. Xu, and I. Priananda, “Support top irrelevant machine: learning similarity measures to maximize top precision for image retrieval,” Neural Computing and Applications, pp. 1–10, 2016.
  • [39] J. Yue, Z. Li, L. Liu, and Z. Fu, “Content-based image retrieval using color and texture fused features,” Mathematical and Computer Modelling, vol. 54, no. 3, pp. 1121–1127, 2011.
  • [40] G. Pass and R. Zabih, “Histogram refinement for content-based image retrieval,” in IEEE Workshop on Applications of Computer Vision, 1996, pp. 96–102.
  • [41] L. Juan and O. Gwun, “A comparison of SIFT, PCA-SIFT and SURF,” International Journal of Image Processing, vol. 3, no. 4, pp. 143–152, 2009.
  • [42] M. Srinivas, R. R. Naidu, C. S. Sastry, and C. K. Mohan, “Content based medical image retrieval using dictionary learning,” Neurocomputing, vol. 168, pp. 880–895, 2015.
  • [43] R. R. Gundreddy, M. Tan, Y. Qiu, S. Cheng, H. Liu, and B. Zheng, “Assessment of performance and reproducibility of applying a content-based image retrieval scheme for classification of breast lesions,” Medical Physics, vol. 42, no. 7, pp. 4241–4249, 2015.
  • [44] X. Yingying, L. Lanfen, H. Hongjie, Y. Huajun, J. Chongwu, W. Jian, H. Xianhua, and C. Yen-Wei, “Combined density, texture and shape features of multi-phase contrast-enhanced CT images for CBIR of focal liver lesions: a preliminary study,” in Innovation in Medicine and Healthcare 2015.   Springer, 2016, pp. 215–224.
  • [45] A. K. Dhara, S. Mukhopadhyay, A. Dutta, M. Garg, and N. Khandelwal, “A combination of shape and texture features for classification of pulmonary nodules in lung CT images,” Journal of Digital Imaging, vol. 29, no. 4, pp. 466–475, 2016.
  • [46] L. P. Suresh, S. S. Dash, and B. K. Panigrahi, “Artificial intelligence and evolutionary algorithms in engineering systems,” Advances in Intelligent Systems and Computing, vol. 324, pp. 109–117, 2015.
  • [47] P. Keserwani, V. S. C. Pammi, O. Prakash, A. Khare, and M. Jeon, “Classification of Alzheimer disease using gabor texture feature of hippocampus region,” International Journal of Image, Graphics and Signal Processing, vol. 8, no. 6, pp. 13–20, 2016.
  • [48] X. Zhu, X. He, P. Wang, Q. He, D. Gao, J. Cheng, and B. Wu, “A method of localization and segmentation of intervertebral discs in spine MRI based on Gabor filter bank,” BioMedical Engineering OnLine, vol. 15, no. 1, p. 32, 2016.
  • [49] W. L. Lee, K. Chang, and K. S. Hsieh, “Unsupervised segmentation of lung fields in chest radiographs using multiresolution fractal feature vector and deformable models,” Medical and Biological Engineering and Computing, vol. 54, no. 9, pp. 1409–1422, 2016.
  • [50] S. Murala and Q. M. Jonathan Wu, “Local ternary co-occurrence patterns: A new feature descriptor for MRI and CT image retrieval,” Neurocomputing, vol. 119, pp. 399–412, 2013.
  • [51] S. Dhahbi, W. Barhoumi, and E. Zagrouba, “Breast cancer diagnosis in digitized mammograms using curvelet moments,” Computers in Biology and Medicine, vol. 64, pp. 79–90, 2015.
  • [52] G. Sethi and B. S. Saini, “Abdomen disease diagnosis in CT images using flexiscale curvelet transform and improved genetic algorithm,” Australasian Physical & Engineering Sciences in Medicine, vol. 38, no. 4, pp. 671–688, 2015.
  • [53] D. C. Pereira, R. P. Ramos, and M. Z. do Nascimento, “Segmentation and detection of breast cancer in mammograms combining wavelet analysis and genetic algorithm,” Computer Methods and Programs in Biomedicine, vol. 114, no. 1, pp. 88–101, 2014.
  • [54] H. Madero Orozco, O. O. Vergara Villegas, V. G. Cruz Sánchez, H. D. J. Ochoa Domínguez, and M. D. J. Nandayapa Alfaro, “Automated system for lung nodules classification based on wavelet feature descriptor and support vector machine.” BioMedical Engineering OnLine, vol. 14, no. 1, p. 9, 2015.
  • [55] J. Arias, J. Martínez-Gómez, J. A. Gámez, A. G. Seco de Herrera, and H. Müller, “Medical image modality classification using discrete Bayesian networks,” Computer Vision and Image Understanding, vol. 151, pp. 61–71, 2016.
  • [56] D.-H. Lee, D.-W. Lee, and B.-S. Han, “Possibility study of scale invariant feature transform (SIFT) algorithm application to spine magnetic resonance imaging,” Plos One, vol. 11, no. 4, p. e0153043, 2016.
  • [57] M. Alkhawlani and M. Elmogy, “Content-based image retrieval using local features descriptors and bag-of-visual words,” International Journal of Advanced Computer Science and Applications, vol. 6, no. 9, pp. 212–219, 2015.
  • [58] K. Velmurugan and S. S. Baboo, “Content-based image retrieval using SURF and colour moments,” Global Journal of Computer Science and Technology, vol. 11, no. 10, pp. 1–4, 2011.
  • [59] M. Srinivas, D. Roy, and C. K. Mohan, “Discriminative feature extraction of X-ray images using deep convolutional neural networks,” ICASSP 2016, pp. 917–921, 2016.
  • [60] C. M. Bishop, Pattern Recognition and Machine Learning.   Springer, 2006.
  • [61] K. Deng, “OMEGA: On-line memory-based general purpose system classifier,” Ph.D. dissertation, Carnegie Mellon University, 1998.
  • [62] J. A. K. Suykens and J. Vandewalle, “Least squares support vector machine classifiers,” Neural Processing Letters, vol. 9, no. 3, pp. 293–300, 1999.
  • [63] T. Torheim, E. Malinen, K. Kvaal, H. Lyng, U. G. Indahl, E. K. F. Andersen, and C. M. Futsaether, “Classification of dynamic contrast enhanced MR images of cervical cancers using texture analysis and support vector machines,” IEEE Transactions on Medical Imaging, vol. 33, no. 8, pp. 1648–1656, 2014.
  • [64] W.-Y. Loh, “Fifty years of classification and regression trees,” International Statistical Review, vol. 82, no. 3, pp. 329–348, 2014.
  • [65] L. Rokach and O. Maimon, “Classification Trees,” in Data Mining and Knowledge Discovery Handbook, 2010, pp. 149–174.
  • [66] A. T. Azar and S. M. El-Metwally, “Decision tree classifiers for automated medical diagnosis,” Neural Computing and Applications, vol. 23, no. 7-8, pp. 2387–2403, 2013.
  • [67] N. Speybroeck, “Classification and regression trees,” International Journal of Public Health, vol. 57, no. 1, pp. 243–246, 2012.
  • [68] E. Bauer, R. Kohavi, P. Chan, S. Stolfo, and D. Wolpert, “An empirical comparison of voting classification algorithms: bagging, boosting, and variants,” Machine Learning, vol. 36, no. August, pp. 105–139, 1999.
  • [69] A. Liaw and M. Wiener, “Classification and regression by randomForest,” R News, vol. 2, no. December, pp. 18–22, 2002.
  • [70] T. Huynh, Y. Gao, J. Kang, L. Wang, P. Zhang, D. Shen, and Alzheimer’s Disease Neuroimaging Initiative, “Multi-source information gain for random forest: an application to CT image prediction from MRI data,” in International Workshop on Machine Learning in Medical Imaging, 2015, pp. 321–329.
  • [71] D. Zikic, B. Glocker, E. Konukoglu, A. Criminisi, C. Demiralp, J. Shotton, O. M. Thomas, T. Das, R. Jena, and S. J. Price, “Decision forests for tissue-specific segmentation of high-grade gliomas in multi-channel MR,” Medical Image Computing and Computer-Assisted Intervention, vol. 15, no. Pt 3, pp. 369–76, 2012.
  • [72] E. Geremia, O. Clatz, B. H. Menze, E. Konukoglu, A. Criminisi, and N. Ayache, “Spatial decision forests for MS lesion segmentation in multi-channel magnetic resonance images,” NeuroImage, vol. 57, no. 2, pp. 378–390, 2011.
  • [73] R. Sammouda, R. M. Jomaa, and H. Mathkour, “Heart region extraction and segmentation from chest CT images using Hopfield Artificial Neural Networks,” in International Conference on Information Technology and e-Services, 2012, pp. 3–8.
  • [74] V. Lempitsky, M. Verhoek, J. A. Noble, and A. Blake, “Random forest classification for automatic delineation of myocardium in real-time 3D echocardiography,” in International Conference on Functional Imaging and Modeling of the Heart, 2009, pp. 447–456.
  • [75] H.-C. Shin, H. R. Roth, M. Gao, L. Lu, Z. Xu, I. Nogues, J. Yao, D. Mollura, and R. M. Summers, “Deep convolutional neural networks for computer-aided detection: CNN architectures, dataset characteristics and transfer learning.” IEEE Transactions on Medical Imaging, vol. PP, no. 99, p. 1, 2016.
  • [76] Y. Song, L. Zhang, S. Chen, D. Ni, B. Lei, and T. Wang, “Accurate segmentation of cervical cytoplasm and nuclei based on multiscale convolutional network and graph partitioning,” IEEE Transactions on Biomedical Engineering, vol. 62, no. 10, pp. 2421–2433, 2015.
  • [77] R. Salakhutdinov and G. E. Hinton, “Deep Boltzmann machines,” in 12th International Conference on Artificial Intelligence and Statistics, no. 3, 2009, pp. 448–455.
  • [78] Y. LeCun, K. Kavukcuoglu, and C. Farabet, “Convolutional networks and applications in vision,” in IEEE International Symposium on Circuits and Systems: Nano-Bio Circuit Fabrics and Systems, 2010, pp. 253–256.
  • [79] H.-I. Suk and D. Shen, “Deep learning-based feature representation for AD/MCI classification,” in International Conference on Medical Image Computing and Computer-Assisted Intervention, 2013, pp. 583–590.
  • [80] A. Prasoon, K. Petersen, C. Igel, F. Lauze, E. Dam, and M. Nielsen, “Deep feature learning for knee cartilage segmentation using a triplanar convolutional neural network,” in International Conference on Medical Image Computing and Computer-Assisted Intervention, 2013, pp. 246–253.
  • [81] A. A. Cruz-Roa, J. E. Arevalo Ovalle, A. Madabhushi, and F. A. González Osorio, “A deep learning architecture for image representation, visual interpretability and automated basal-cell carcinoma cancer detection,” in International Conference on Medical Image Computing and Computer-Assisted Intervention, 2013, pp. 403–410.
  • [82] J. Shiraishi, L. L. Pesce, C. E. Metz, and K. Doi, “Experimental design and data analysis in receiver operating characteristic studies: lessons learned from reports in radiology from 1997 to 2006,” Radiology, vol. 253, no. 3, 2009.
  • [83] B. Van Ginneken, A. A. A. Setio, C. Jacobs, and F. Ciompi, “Off-the-shelf convolutional neural network features for pulmonary nodule detection in computed tomography scans,” in 12th IEEE International Symposium on Biomedical Imaging, 2015, pp. 286–289.
  • [84] S. Choi, “X-ray image body part clustering using deep convolutional neural network,” ImageCLEF 2015 Medical Clustering Task, pp. 6–8, 2015.
  • [85] K. H. Zou, S. K. Warfield, A. Bharatha, C. M. C. Tempany, M. R. Kaus, S. J. Haker, W. M. Wells, F. A. Jolesz, and R. Kikinis, “Statistical validation of image segmentation quality based on a spatial overlap index,” Academic Radiology, vol. 11, no. 2, pp. 178–189, 2004.
  • [86] V. K. Reed, W. A. Woodward, L. Zhang, E. A. Strom, G. H. Perkins, W. Tereffe, J. L. Oh, T. K. Yu, I. Bedrosian, G. J. Whitman, T. A. Buchholz, and L. Dong, “Automatic segmentation of whole breast using atlas approach and deformable image registration,” International Journal of Radiation Oncology Biology Physics, vol. 73, no. 5, pp. 1493–1500, 2009.
  • [87] N. Li, R. Jin, and Z. Zhou, “Top rank optimization in linear time,” Advances in Neural Information Processing Systems, pp. 1–9, 2014.
  • [88] Y. Yoo, T. Brosch, A. Traboulsee, D. Li, and R. Tam, “Deep learning of image features from unlabeled data for multiple sclerosis lesion segmentation,” MLMI, pp. 117–124, 2014.
  • [89] O. Maier, M. Wilms, J. von der Gablentz, U. M. Krämer, T. F. Münte, and H. Handels, “Extra tree forests for sub-acute ischemic stroke lesion segmentation in MR sequences,” Journal of Neuroscience Methods, vol. 240, pp. 89–100, 2015.
  • [90] J. Mitra, P. Bourgeat, J. Fripp, S. Ghose, S. Rose, O. Salvado, A. Connelly, B. Campbell, S. Palmer, G. Sharma, S. Christensen, and L. Carey, “Lesion segmentation from multimodal MRI using random forest following ischemic stroke,” NeuroImage, vol. 98, pp. 324–335, 2014.
  • [91] T. Si, A. De, and A. Kumar, “Artificial neural network based lesion segmentation of brain MRI,” Communications on Applied Electronics, vol. 4, no. 5, pp. 1–5, 2016.
  • [92] G. Li, L. Wang, F. Shi, W. Lin, and D. Shen, “Multi-atlas based simultaneous labeling of longitudinal dynamic cortical surfaces in infants,” in International Conference on Medical Image Computing and Computer-Assisted Intervention, 2013, pp. 58–65.
  • [93] L. Wang, F. Shi, Y. Gao, G. Li, J. H. Gilmore, W. Lin, and D. Shen, “Integration of sparse multi-modality representation and anatomical constraint for isointense infant brain MR image segmentation,” NeuroImage, vol. 89, pp. 152–164, 2014.
  • [94] W. Zhang, R. Li, H. Deng, L. Wang, W. Lin, S. Ji, and D. Shen, “Deep convolutional neural networks for multi-modality isointense infant brain image segmentation,” NeuroImage, vol. 108, pp. 214–224, 2015.
  • [95] J. Kleesiek, G. Urban, A. Hubert, D. Schwarz, K. Maier-Hein, M. Bendszus, and A. Biller, “Deep MRI brain extraction: a 3D convolutional neural network for skull stripping,” NeuroImage, vol. 129, pp. 460–469, 2016.
  • [96] A. de Brebisson and G. Montana, “Deep neural networks for anatomical brain segmentation,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2015, pp. 20–28.
  • [97] P. Moeskops, M. A. Viergever, A. M. Mendrik, L. S. de Vries, M. J. Benders, and I. Išgum, “Automatic segmentation of MR brain images with a convolutional neural network,” IEEE Transactions on Medical Imaging, vol. 35, no. 5, pp. 1252–1261, 2016.
  • [98] H. Chen, Q. Dou, L. Yu, J. Qin, and P.-A. Heng, “VoxResNet: Deep voxelwise residual networks for brain segmentation from 3D MR images,” NeuroImage, 2017.
  • [99] C. Lindner, S. Thiagarajah, J. M. Wilkinson, T. Consortium, G. A. Wallis, and T. F. Cootes, “Fully automatic segmentation of the proximal femur using random forest regression voting,” Medical Image Analysis, vol. 32, no. 8, pp. 1462–1472, 2013.
  • [100] H. Lombaert, D. Zikic, A. Criminisi, and N. Ayache, “Laplacian forests: Semantic image segmentation by guided bagging,” in International Conference on Medical Image Computing and Computer-Assisted Intervention, 2014, pp. 496–504.
  • [101] S. R. B, A. Carass, J. L. Prince, and D. L. Pham, “Semi-automatic liver tumor segmentation in dynamic contrast-enhanced CT scans using random forests and supervoxels,” in International Workshop on Machine Learning in Medical Imaging, 2015, pp. 212–219.
  • [102] Q. Liu, Q. Wang, L. Zhang, Y. Gao, and D. Shen, “Multi-atlas context forests for knee MR image segmentation,” in International Workshop on Machine Learning in Medical Imaging, 2015, pp. 186–193.
  • [103] C. Chen, D. Belavy, and G. Zheng, “3D intervertebral disc localization and segmentation from MR images by data-driven regression and classification,” in International Workshop on Machine Learning in Medical Imaging.   Springer, 2014, pp. 50–58.
  • [104] S. Sedai, P. Roy, and R. Garnavi, “Segmentation of right ventricle in cardiac MR images using shape regression,” in International Workshop on Machine Learning in Medical Imaging, 2015, pp. 1–8.
  • [105] T. Tong, R. Wolz, Z. Wang, Q. Gao, K. Misawa, M. Fujiwara, K. Mori, J. V. Hajnal, and D. Rueckert, “Discriminative dictionary learning for abdominal multi-organ segmentation,” Medical Image Analysis, vol. 23, no. 1, pp. 92–104, 2015.
  • [106] P. Aljabar, R. A. Heckemann, A. Hammers, J. V. Hajnal, and D. Rueckert, “Multi-atlas based segmentation of brain images: atlas selection and its effect on accuracy,” NeuroImage, vol. 46, no. 3, pp. 726–738, 2009.
  • [107] J. C. Griffis, J. B. Allendorfer, and J. P. Szaflarski, “Voxel-based Gaussian naïve Bayes classification of ischemic stroke lesions in individual T1-weighted MRI scans,” Journal of Neuroscience Methods, vol. 257, pp. 97–108, 2016.
  • [108] Y. Wang, J. Nie, P. T. Yap, G. Li, F. Shi, X. Geng, L. Guo, and D. Shen, “Knowledge-guided robust MRI brain extraction for diverse large-scale neuroimaging studies on humans and non-human primates,” PLoS ONE, vol. 9, no. 1, pp. 1–23, 2014.
  • [109] Y. Jin, Y. Shi, L. Zhan, B. A. Gutman, G. I. de Zubicaray, K. L. McMahon, M. J. Wright, A. W. Toga, and P. M. Thompson, “Automatic clustering of white matter fibers in brain diffusion MRI with an application to genetics,” NeuroImage, vol. 100, pp. 75–90, 2014.
  • [110] D. Zhang and D. Shen, “Multi modal multi task learning for joint prediction of multiple regression and classification variables in Alzheimer’s disease,” NeuroImage, vol. 59, no. 2, pp. 895–907, 2013.
  • [111] T. Tong, R. Wolz, Q. Gao, R. Guerrero, J. V. Hajnal, and D. Rueckert, “Multiple instance learning for classification of dementia in brain MRI,” Medical Image Analysis, vol. 18, no. 5, pp. 808–818, 2014.
  • [112] S. F. Eskildsen, P. Coupé, V. Fonov, J. V. Manjón, K. K. Leung, N. Guizard, S. N. Wassef, L. R. Østergaard, and D. L. Collins, “BEaST: brain extraction based on nonlocal segmentation technique,” NeuroImage, vol. 59, no. 3, pp. 2362–2373, 2012.
  • [113] A. G. S. D. Herrera, D. Markonis, R. Joyseeree, R. Schaer, and A. Foncubierta-rodr, “Semi-supervised learning for image modality classification,” Multimodal Retrieval in the Medical Domain, pp. 85–98, 2015.
  • [114] S. Roy, A. Carass, J. L. Prince, and D. L. Pham, “Subject specific sparse dictionary learning for atlas based brain MRI segmentation,” in International Workshop on Machine Learning in Medical Imaging, 2014, pp. 248–255.
  • [115] W. Li, F. Jia, and Q. Hu, “Automatic segmentation of liver tumor in CT images with deep convolutional neural networks,” Journal of Computer and Communications, vol. 3, no. 11, p. 146, 2015.
  • [116] H. R. Roth, L. Lu, A. Farag, H.-C. Shin, J. Liu, E. B. Turkbey, and R. M. Summers, “DeepOrgan: Multi-level deep convolutional networks for automated pancreas segmentation,” in International Conference on Medical Image Computing and Computer-Assisted Intervention.   Springer, 2015, pp. 556–564.
  • [117] Y. Guo, Y. Gao, and D. Shen, “Deformable MR prostate segmentation via deep feature learning and sparse patch matching,” IEEE Transactions on Medical Imaging, vol. 35, no. 4, pp. 1077–1089, 2016.
  • [118] M. Avendi, A. Kheradvar, and H. Jafarkhani, “A combined deep-learning and deformable-model approach to fully automatic segmentation of the left ventricle in cardiac MRI,” Medical Image Analysis, vol. 30, pp. 108–119, 2016.
  • [119] G. van Tulder and M. de Bruijne, “Combining generative and discriminative representation learning for lung CT analysis with convolutional restricted Boltzmann machines,” IEEE Transactions on Medical Imaging, vol. 35, no. 5, pp. 1262–1272, 2016.
  • [120] R. Manniesing, T. Marcel, H. Oei, L. J. Oostveen, J. Melendez, E. J. Smit, B. Platel, C. I. Sánchez, J. Frederick, A. Meijer et al., “White matter and gray matter segmentation in 4D computed tomography,” Scientific Reports (Nature Publisher Group), vol. 7, p. 1, 2017.
  • [121] M. Havaei, A. Davy, D. Warde-Farley, A. Biard, A. Courville, Y. Bengio, C. Pal, P.-M. Jodoin, and H. Larochelle, “Brain tumor segmentation with deep neural networks,” Medical Image Analysis, vol. 35, pp. 18–31, 2017.
  • [122] P. Hu, F. Wu, J. Peng, P. Liang, and D. Kong, “Automatic 3D liver segmentation based on deep learning and globally optimized surface evolution,” Physics in Medicine and Biology, vol. 61, no. 24, p. 8676, 2016.
  • [123] D. Paredes, A. Saha, and M. A. Mazurowski, “Deep learning for segmentation of brain tumors: can we train with images from different institutions?” in Medical Imaging 2017: Computer-Aided Diagnosis, vol. 10134.   International Society for Optics and Photonics, 2017, p. 101341P.
  • [124] S. D. O’Connor, J. Yao, and R. M. Summers, “Lytic metastases in thoracolumbar spine: computer-aided detection at CT–preliminary study.” Radiology, vol. 242, no. 3, pp. 811–816, 2007.
  • [125] J. Yao, H. Munoz, J. E. Burns, and L. Lu, “Computer aided detection of spinal degenerative osteophytes on sodium fluoride PET/CT,” Computational Methods and Clinical Applications for Spine Imaging, pp. 51–60, 2014.
  • [126] J. Liu, S. Pattanaik, J. Yao, E. Turkbey, W. Zhang, X. Zhang, and R. M. Summers, “Computer aided detection of epidural masses on computed tomography scans,” Computerized Medical Imaging and Graphics, vol. 38, no. 7, pp. 606–612, 2014.
  • [127] J. Yao, J. E. Burns, H. Munoz, and R. M. Summers, “Detection of vertebral body fractures based on cortical shell unwrapping,” in International Conference on Medical Image Computing and Computer-Assisted Intervention, vol. 15, no. 3, 2012, pp. 509–516.
  • [128] K.-H. Thung, C.-Y. Wee, P.-T. Yap, D. Shen, and the Alzheimer’s Disease Neuroimaging Initiative, “Neurodegenerative disease diagnosis using incomplete multi-modality data via matrix shrinkage and completion,” NeuroImage, vol. 91, pp. 386–400, 2014.
  • [129] C. D. Lehman, R. D. Wellman, D. S. M. Buist, K. Kerlikowske, A. N. A. Tosteson, and D. L. Miglioretti, “Diagnostic accuracy of digital screening mammography with and without computer-aided detection.” JAMA Internal Medicine, vol. 175, no. 11, pp. 1–10, 2015.
  • [130] J. Yao, A. Dwyer, R. M. Summers, and D. J. Mollura, “Computer-aided diagnosis of pulmonary infections using texture analysis and support vector machine classification,” Academic Radiology, vol. 18, no. 3, pp. 306–314, 2011.
  • [131] N. Pérez, M. A. Guevara, A. Silva, I. Ramos, and J. Loureiro, “Improving the performance of machine learning classifiers for breast cancer diagnosis based on feature selection,” in Federated Conference on Computer Science and Information Systems, vol. 2, 2014, pp. 209–217.
  • [132] M. Jiang, S. Zhang, and D. Metaxas, “Detection of mammographic masses by content-based image retrieval,” in International Workshop on Machine Learning in Medical Imaging, 2014, pp. 33–41.
  • [133] W. Sun, T.-L. B. Tseng, W. Qian, J. Zhang, E. C. Saltzstein, B. Zheng, F. Lure, H. Yu, and S. Zhou, “Using multiscale texture and density features for near-term breast cancer risk analysis.” Medical Physics, vol. 42, no. 6, pp. 2853–2862, 2015.
  • [134] N. P. Pérez, M. A. Guevara López, A. Silva, and I. Ramos, “Improving the Mann-Whitney statistical test for feature selection: An approach in breast cancer diagnosis on mammography,” Artificial Intelligence in Medicine, vol. 63, no. 1, pp. 19–31, 2015.
  • [135] Q. Wang, W. Zhu, and B. Wang, “Three-dimensional SVM with latent variable: application for detection of lung lesions in CT images.” Journal of Medical Systems, vol. 39, no. 1, p. 171, 2015.
  • [136] S. Antani, “Automated detection of lung diseases in chest X-Rays,” US National Library of Medicine, 2015.
  • [137] V. Gopalakrishnan, P. G. Menon, and S. Madan, “cMRI-BED: A novel informatics framework for cardiac MRI biomarker extraction and discovery applied to pediatric cardiomyopathy classification,” BioMedical Engineering OnLine, vol. 14, no. 2, p. S7, 2015.
  • [138] J. Arevalo, F. A. González, R. Ramos-Pollán, J. L. Oliveira, and M. A. Guevara Lopez, “Representation learning for mammography mass lesion classification with convolutional neural networks,” Computer Methods and Programs in Biomedicine, vol. 127, pp. 248–257, 2016.
  • [139] S. P. Singh and S. Urooj, “An improved CAD system for breast cancer diagnosis based on generalized pseudo-zernike moment and Ada-DEWNN classifier,” Journal of Medical Systems, vol. 40, no. 4, pp. 1–13, 2016.
  • [140] J.-Z. Cheng, D. Ni, Y.-H. Chou, J. Qin, C.-M. Tiu, Y.-C. Chang, C.-S. Huang, D. Shen, and C.-M. Chen, “Computer-aided diagnosis with deep learning architecture: applications to breast lesions in US images and pulmonary nodules in CT scans,” Scientific Reports, vol. 6, 2016.
  • [141] A. Rani, D. Mittal et al., “Detection and classification of focal liver lesions using support vector machine classifiers,” Journal of Biomedical Engineering and Medical Imaging, vol. 3, no. 1, p. 21, 2016.
  • [142] J. E. Burns, J. Yao, H. Muñoz, and R. M. Summers, “Automated detection, localization, and classification of traumatic vertebral body fractures in the thoracic and lumbar spine at CT,” Radiology, vol. 278, no. 1, pp. 64–73, 2016.
  • [143] R. Ebsim, J. Naqvi, and T. Cootes, “Detection of wrist fractures in X-ray images,” in Workshop on Clinical Image-Based Procedures. Springer, 2016, pp. 1–8.
  • [144] T. Kooi, G. Litjens, B. van Ginneken, A. Gubern-Mérida, C. I. Sánchez, R. Mann, A. den Heeten, and N. Karssemeijer, “Large scale deep learning for computer aided detection of mammographic lesions,” Medical Image Analysis, vol. 35, pp. 303–312, 2017.
  • [145] X. Liu, F. Hou, H. Qin, and A. Hao, “A CADe system for nodule detection in thoracic CT images based on artificial neural network,” Science China Information Sciences, vol. 60, no. 7, p. 072106, 2017.
  • [146] Y. Miki, C. Muramatsu, T. Hayashi, X. Zhou, T. Hara, A. Katsumata, and H. Fujita, “Classification of teeth in cone-beam CT using deep convolutional neural network,” Computers in Biology and Medicine, vol. 80, pp. 24–29, 2017.
  • [147] A. Mehrtash, A. Sedghi, M. Ghafoorian, M. Taghipour, C. M. Tempany, T. Kapur, P. Mousavi, P. Abolmaesumi, and A. Fedorov, “Classification of clinical significance of MRI prostate findings using 3D convolutional neural networks,” in SPIE Medical Imaging. International Society for Optics and Photonics, 2017, p. 101342A.
  • [148] C. Spampinato, S. Palazzo, D. Giordano, M. Aldinucci, and R. Leonardi, “Deep learning for automated skeletal bone age assessment in X-ray images,” Medical Image Analysis, vol. 36, pp. 41–51, 2017.
  • [149] X. Wang, Y. Peng, L. Lu, Z. Lu, M. Bagheri, and R. M. Summers, “ChestX-ray8: Hospital-scale chest X-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases,” in Computer Vision and Pattern Recognition (CVPR), 2017 IEEE Conference on. IEEE, 2017, pp. 3462–3471.
  • [150] P. Rajpurkar, J. Irvin, K. Zhu, B. Yang, H. Mehta, T. Duan, D. Ding, A. Bagul, C. Langlotz, K. Shpanskaya et al., “CheXNet: Radiologist-level pneumonia detection on chest X-rays with deep learning,” arXiv preprint arXiv:1711.05225, 2017.
  • [151] K. L.-C. Hsieh, C.-M. Lo, and C.-J. Hsiao, “Computer-aided grading of gliomas based on local and global MRI features,” Computer Methods and Programs in Biomedicine, vol. 139, pp. 31–38, 2017.
  • [152] C. H. Lee, D. D. Dershaw, D. Kopans, P. Evans, B. Monsees, D. Monticciolo, R. J. Brenner, L. Bassett, W. Berg, S. Feig, E. Hendrick, E. Mendelson, C. D’Orsi, E. Sickles, and L. W. Burhenne, “Breast cancer screening with imaging: recommendations from the society of breast imaging and the ACR on the use of mammography, breast MRI, breast ultrasound, and other technologies for the detection of clinically occult breast cancer,” Journal of the American College of Radiology, vol. 7, no. 1, pp. 18–27, 2010.
  • [153] R. Nithya and B. Santhi, “Classification of normal and abnormal patterns in digital mammograms for diagnosis of breast cancer,” International Journal of Computer Applications, vol. 28, no. 6, pp. 21–25, 2011.
  • [154] S.-T. Luo and B.-W. Cheng, “Diagnosing breast masses in digital mammography using feature selection and ensemble methods,” Journal of Medical Systems, vol. 36, no. 2, pp. 569–577, 2012.
  • [155] Y. Jiang, R. M. Nishikawa, R. A. Schmidt, C. E. Metz, M. L. Giger, and K. Doi, “Improving breast cancer diagnosis with computer-aided diagnosis,” Academic Radiology, vol. 6, no. 1, pp. 22–33, 1999.
  • [156] W. Sun, B. Zheng, F. Lure, T. Wu, J. Zhang, B. Y. Wang, E. C. Saltzstein, and W. Qian, “Prediction of near-term risk of developing breast cancer using computerized features from bilateral mammograms,” Computerized Medical Imaging and Graphics, vol. 38, no. 5, pp. 348–357, 2014.
  • [157] H. Y. Banaem, A. M. Dehnavi, and M. Shahnazi, “Ensemble supervised classification method using the regions of interest and grey level co-occurrence matrices features for mammograms data,” Iranian Journal of Radiology, vol. 12, no. 3, 2015.
  • [158] J. Arevalo, F. A. González, R. Ramos-Pollán, J. L. Oliveira, and M. A. Guevara Lopez, “Convolutional neural networks for mammography mass lesion classification,” in 37th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2015, pp. 797–800.
  • [159] Z. Jiao, X. Gao, Y. Wang, and J. Li, “A deep feature based framework for breast masses classification,” Neurocomputing, vol. 197, pp. 1–11, 2016.
  • [160] R. Jin, K. D. Luk, J. Cheung, and Y. Hu, “A machine learning based prognostic prediction of cervical myelopathy using diffusion tensor imaging,” in Computational Intelligence and Virtual Environments for Measurement Systems and Applications (CIVEMSA), 2016 IEEE International Conference on. IEEE, 2016, pp. 1–4.
  • [161] A. M. Abdel-Zaher and A. M. Eldeib, “Breast cancer classification using deep belief networks,” Expert Systems with Applications, vol. 46, pp. 139–144, 2016.
  • [162] D. Wang, A. Khosla, R. Gargeya, H. Irshad, and A. H. Beck, “Deep learning for identifying metastatic breast cancer,” arXiv preprint arXiv:1606.05718, 2016.
  • [163] Y. Bar, I. Diamant, L. Wolf, and H. Greenspan, “Deep learning with non-medical training used for chest pathology identification,” in Medical Imaging 2015: Computer-Aided Diagnosis, vol. 9414. International Society for Optics and Photonics, 2015, p. 94140V.
  • [164] R. Rasti, M. Teshnehlab, and S. L. Phung, “Breast cancer diagnosis in DCE-MRI using mixture ensemble of convolutional neural networks,” Pattern Recognition, vol. 72, pp. 381–390, 2017.
  • [165] X. Wang, W. Yang, J. Weinreb, J. Han, Q. Li, X. Kong, Y. Yan, Z. Ke, B. Luo, T. Liu et al., “Searching for prostate cancer by fully automated magnetic resonance imaging classification: deep learning versus non-deep learning,” Scientific Reports, vol. 7, no. 1, p. 15415, 2017.
  • [166] M. Anthimopoulos, S. Christodoulidis, L. Ebner, A. Christe, and S. Mougiakakou, “Lung pattern classification for interstitial lung diseases using a deep convolutional neural network,” IEEE Transactions on Medical Imaging, vol. 35, no. 5, pp. 1207–1216, 2016.
  • [167] W. Sun, T.-L. Tseng, J. Zhang, and W. Qian, “Enhancing deep convolutional neural network scheme for breast cancer diagnosis with unlabeled data,” Computerized Medical Imaging and Graphics, 2016.
  • [168] K. P. Exarchos, Y. Goletsis, and D. I. Fotiadis, “Multiparametric decision support system for the prediction of oral cancer reoccurrence,” IEEE Transactions on Information Technology in Biomedicine, vol. 16, no. 6, pp. 1127–1134, 2012.
  • [169] T. A. Patel, M. Puppala, R. O. Ogunti, J. E. Ensor, T. He, J. B. Shewale, D. P. Ankerst, V. G. Kaklamani, A. A. Rodriguez, S. T. C. Wong, and J. C. Chang, “Correlating mammographic and pathologic findings in clinical decision support using natural language processing and data mining methods,” Cancer, pp. 1–8, 2016.
  • [170] T. Ayer, M. U. Ayvaci, Z. X. Liu, O. Alagoz, and E. S. Burnside, “Computer-aided diagnostic models in breast cancer screening,” Imaging in Medicine, vol. 2, no. 3, pp. 313–323, 2010.
  • [171] S. Kloppel, C. M. Stonnington, C. Chu, B. Draganski, R. I. Scahill, J. D. Rohrer, N. C. Fox, C. R. Jack, J. Ashburner, and R. S. J. Frackowiak, “Automatic classification of MR scans in Alzheimer’s disease,” Brain, vol. 131, no. 3, pp. 681–689, 2008.
  • [172] M. Chupin, A. Hammers, R. S. N. Liu, O. Colliot, J. Burdett, E. Bardinet, J. S. Duncan, L. Garnero, and L. Lemieux, “Automatic segmentation of the hippocampus and the amygdala driven by hybrid constraints: Method and validation,” NeuroImage, vol. 46, no. 3, pp. 749–761, 2009.
  • [173] Y. Fan, D. Shen, R. C. Gur, R. E. Gur, and C. Davatzikos, “COMPARE: classification of morphological patterns using adaptive regional elements,” IEEE Transactions on Medical Imaging, vol. 26, no. 1, pp. 93–105, 2007.
  • [174] Y. Chen, B. Shi, C. D. Smith, and J. Liu, “Nonlinear feature transformation and deep fusion for Alzheimer’s disease staging analysis,” in Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2015, pp. 304–312.
  • [175] M. Liu, D. Zhang, and D. Shen, “Hierarchical fusion of features and classifier decisions for Alzheimer’s disease diagnosis,” Human Brain Mapping, vol. 35, no. 4, pp. 1305–1319, 2014.
  • [176] C. Chu, A. L. Hsu, K. H. Chou, P. Bandettini, and C. Lin, “Does feature selection improve classification accuracy? Impact of sample size and feature selection on classification using anatomical magnetic resonance images,” NeuroImage, vol. 60, no. 1, pp. 59–70, 2012.
  • [177] E. Bron, M. Smits, J. Van Swieten, W. Niessen, and S. Klein, “Feature selection based on SVM significance maps for classification of dementia,” in International Workshop on Machine Learning in Medical Imaging, 2014, pp. 272–279.
  • [178] D. Gelb, E. Oliver, and S. Gilman, “Diagnostic criteria for Parkinson disease,” Arch Neurol, vol. 56, no. 4, pp. 368–376, 1999.
  • [179] G. Singh and L. Samavedham, “Unsupervised learning based feature extraction for differential diagnosis of neurodegenerative diseases: A case study on early-stage diagnosis of Parkinson disease,” Journal of Neuroscience Methods, vol. 256, pp. 30–40, 2015.
  • [180] M. Liu, D. Zhang, and D. Shen, “Inherent structure-guided multi-view learning for Alzheimer’s disease and mild cognitive impairment classification,” in International Workshop on Machine Learning in Medical Imaging, 2015, pp. 296–303.
  • [181] C. D. Smyser, N. U. Dosenbach, T. A. Smyser, A. Z. Snyder, C. E. Rogers, T. E. Inder, B. L. Schlaggar, and J. J. Neil, “Prediction of brain maturity in infants using machine-learning algorithms,” NeuroImage, vol. 136, pp. 1–9, 2016.
  • [182] Alzheimer’s Association, “Alzheimer’s disease facts and figures,” Alzheimer’s & Dementia, vol. 12, no. 4, p. 88, 2015.
  • [183] F. Li, L. Tran, K.-H. Thung, S. Ji, D. Shen, and J. Li, “Robust deep learning for improved classification of AD / MCI patients,” in International Workshop on Machine Learning in Medical Imaging, 2014, pp. 240–247.
  • [184] M. Komlagan, V.-T. Ta, X. Pan, J.-P. Domenger, D. Collins, and P. Coupé, “Anatomically constrained weak classifier fusion for early detection of Alzheimer’s disease,” in International Workshop on Machine Learning in Medical Imaging, 2014, pp. 141–148.
  • [185] B. Ahmed, C. E. Brodley, K. E. Blackmon, R. Kuzniecky, G. Barash, C. Carlson, B. T. Quinn, W. Doyle, J. French, O. Devinsky, and T. Thesen, “Cortical feature analysis and machine learning improves detection of “MRI-negative” focal cortical dysplasia,” Epilepsy & Behavior, vol. 48, pp. 21–28, 2015.
  • [186] S. J. Hong, H. Kim, D. Schrader, N. Bernasconi, B. C. Bernhardt, and A. Bernasconi, “Automated detection of cortical dysplasia type II in MRI-negative epilepsy,” Neurology, vol. 83, no. 1, pp. 48–55, 2014.
  • [187] L. Huang, Y. Gao, Y. Jin, K.-H. Thung, and D. Shen, “Soft-split sparse regression based random forest for predicting future clinical scores of Alzheimer’s disease,” International Workshop on Machine Learning in Medical Imaging, pp. 194–202, 2015.
  • [188] X. Zhu, H.-i. Suk, and D. Shen, “Sparse discriminative feature selection for multi-class Alzheimer’s disease classification,” in International Workshop on Machine Learning in Medical Imaging, 2014, pp. 157–164.
  • [189] X. Zhu, H.-i. Suk, Y. Zhu, and K.-h. Thung, “Multi-view classification for identification of Alzheimer’s disease,” in International Workshop on Machine Learning in Medical Imaging, 2015, pp. 255–262.
  • [190] R. Guerrero, C. Ledig, and D. Rueckert, “Manifold alignment and transfer learning for classification of Alzheimer’s disease,” in International Workshop on Machine Learning in Medical Imaging, 2014, pp. 77–84.
  • [191] B. Cheng, M. Liu, and D. Zhang, “Multimodal multi-label transfer learning for early diagnosis of Alzheimer’s disease,” in International Workshop on Machine Learning in Medical Imaging. Springer, 2015, pp. 238–245.
  • [192] S. Sarraf, J. Anderson, G. Tofighi et al., “DeepAD: Alzheimer’s disease classification via deep convolutional neural networks using MRI and fMRI,” bioRxiv, p. 070441, 2016.
  • [193] Z. Long, B. Jing, H. Yan, J. Dong, H. Liu, X. Mo, Y. Han, and H. Li, “A support vector machine based method to identify mild cognitive impairment with multi-level characteristics of magnetic resonance imaging,” Neuroscience, vol. 331, pp. 169–176, 2016.
  • [194] A. Khazaee, A. Ebrahimzadeh, and A. Babajani-Feremi, “Application of advanced machine learning methods on resting-state fMRI network for identification of mild cognitive impairment and Alzheimer’s disease,” Brain Imaging and Behavior, vol. 10, no. 3, pp. 799–817, 2016.
  • [195] R. Armananzas, M. Iglesias, D. A. Morales, and L. Alonso-Nanclares, “Voxel-based diagnosis of Alzheimer’s disease using classifier ensembles,” IEEE Journal of Biomedical and Health Informatics, vol. PP, no. 99, pp. 1–7, 2016.
  • [196] S. Sarraf and G. Tofighi, “Deep learning-based pipeline to recognize Alzheimer’s disease using fMRI data,” in Future Technologies Conference (FTC). IEEE, 2016, pp. 816–820.
  • [197] T. M. Schouten, M. Koini, F. de Vos, S. Seiler, M. de Rooij, A. Lechner, R. Schmidt, M. van den Heuvel, J. van der Grond, and S. A. Rombouts, “Individual classification of Alzheimer’s disease with diffusion magnetic resonance imaging,” NeuroImage, vol. 152, pp. 476–481, 2017.
  • [198] R. S. Kumar and M. Senthilmurugan, “Content-based image retrieval system in medical applications,” International Journal of Engineering Research and Technology, vol. 2, no. 3, 2013.
  • [199] C.-H. Wei, C.-T. Li, and R. Wilson, “A content-based approach to medical image database retrieval,” Database Modeling for Industrial Data Management: Emerging Technologies and Applications, pp. 258–291, 2005.
  • [200] J. Yu, J. Amores, N. Sebe, P. Radeva, and Q. Tian, “Distance learning for similarity estimation,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 30, no. 3, pp. 451–462, 2008.
  • [201] T. Emrich, F. Graf, H. P. Kriegel, M. Schubert, and M. Thoma, “Similarity estimation using Bayes ensembles,” in International Conference on Scientific and Statistical Database Management, 2010, pp. 537–554.
  • [202] C. Kurtz, C. F. Beaulieu, S. Napel, and D. L. Rubin, “A hierarchical knowledge-based approach for retrieving similar medical images described with semantic annotations,” Journal of Biomedical Informatics, vol. 49, pp. 227–244, 2014.
  • [203] S. R. Dubey, S. K. Singh, and R. K. Singh, “Local wavelet pattern: a new feature descriptor for image retrieval in medical CT databases,” IEEE Transactions on Image Processing, vol. 24, no. 12, pp. 5892–5903, 2015.
  • [204] I. Ramirez, P. Sprechmann, and G. Sapiro, “Classification and clustering via dictionary learning with structured incoherence and shared features,” in IEEE Computer Society Conference on Computer Vision and Pattern Recognition, no. 1, 2010, pp. 3501–3508.
  • [205] M. Srinivas and C. K. Mohan, “Medical images modality classification using multi-scale dictionary learning,” in International Conference on Digital Signal Processing, no. August, 2014, pp. 621–625.
  • [206] ——, “Classification of medical images using edge-based features and sparse representation,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2016, pp. 912–916.
  • [207] E. Ahn, A. Kumar, J. Kim, C. Li, D. Feng, and M. Fulham, “X-ray image classification using domain transferred convolutional neural networks and local sparse spatial pyramid,” in 2016 IEEE 13th International Symposium on Biomedical Imaging, 2016, pp. 855–858.
  • [208] A. V. Faria, K. Oishi, S. Yoshida, A. Hillis, M. I. Miller, and S. Mori, “Content-based image retrieval for brain MRI: an image-searching engine and population-based analysis to utilize past clinical data for future diagnosis,” NeuroImage: Clinical, vol. 7, pp. 367–376, 2015.
  • [209] C. Kurtz, A. Depeursinge, S. Napel, C. F. Beaulieu, and D. L. Rubin, “On combining image-based and ontological semantic dissimilarities for medical image retrieval applications,” Medical Image Analysis, vol. 18, no. 7, pp. 1082–1100, 2014.
  • [210] Y. Cao, S. Steffey, H. Jianbiao, D. Xiao, C. Tao, P. Chen, and H. Müller, “Medical image retrieval: a multimodal approach,” Cancer Informatics, vol. 13, pp. 125–136, 2015.
  • [211] M. Verma and B. Raman, “Center symmetric local binary co-occurrence pattern for texture, face and bio-medical image retrieval,” Journal of Visual Communication and Image Representation, vol. 32, pp. 224–236, 2015.
  • [212] R. Lan, S. Zhong, Z. Liu, Z. Shi, and X. Luo, “A simple texture feature for retrieval of medical images,” Multimedia Tools and Applications, pp. 1–14, 2018.
  • [213] E. M. Rohren, T. G. Turkington, and R. E. Coleman, “Clinical applications of PET in oncology,” Radiology, vol. 231, pp. 305–332, 2004.
  • [214] J. Kang, Y. Gao, F. Shi, D. S. Lalush, W. Lin, and D. Shen, “Prediction of standard-dose PET image by low-dose PET and MRI images,” Medical Physics, vol. 42, no. 9, pp. 5301–5309, 2015.
  • [215] L. Xiang, Y. Qiao, D. Nie, L. An, W. Lin, Q. Wang, and D. Shen, “Deep auto-context convolutional neural networks for standard-dose PET image estimation from low-dose PET/MRI,” Neurocomputing, vol. 267, pp. 406–416, 2017.
  • [216] P. Kontschieder, S. R. Bulò, H. Bischof, and M. Pelillo, “Structured class-labels in random forests for semantic image labelling,” in IEEE International Conference on Computer Vision, 2011, pp. 2190–2197.
  • [217] P. Dollar and C. L. Zitnick, “Structured forests for fast edge detection,” in IEEE International Conference on Computer Vision, 2013, pp. 1841–1848.
  • [218] X. Yang, R. Kwitt, M. Styner, and M. Niethammer, “Quicksilver: Fast predictive image registration–a deep learning approach,” NeuroImage, vol. 158, pp. 378–396, 2017.
  • [219] A. Cerasa, “Machine learning on Parkinson’s disease? Let’s translate into clinical practice,” Journal of Neuroscience Methods, vol. 266, pp. 161–162, 2015.
  • [220] J. Weese and C. Lorenz, “Four challenges in medical image analysis from an industrial perspective,” Medical Image Analysis, vol. 33, pp. 1339–1351, 2016.
  • [221] V. Cheplygina, A. van Opbroek, M. A. Ikram, M. W. Vernooij, and M. de Bruijne, “Asymmetric similarity-weighted ensembles for image segmentation,” in Biomedical Imaging (ISBI), 2016 IEEE 13th International Symposium on. IEEE, 2016, pp. 273–277.
  • [222] W. Shen, M. Zhou, F. Yang, D. Dong, C. Yang, Y. Zang, and J. Tian, “Learning from experts: developing transferable deep features for patient-level lung cancer prediction,” in International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, 2016, pp. 124–131.
  • [223] B. Cheng, M. Liu, H.-I. Suk, D. Shen, D. Zhang, and the Alzheimer’s Disease Neuroimaging Initiative, “Multimodal manifold-regularized transfer learning for MCI conversion prediction,” Brain Imaging and Behavior, vol. 9, no. 4, pp. 913–926, 2015.
  • [224] R. Paul, S. H. Hawkins, Y. Balagurunathan, M. B. Schabath, R. J. Gillies, L. O. Hall, and D. B. Goldgof, “Deep feature transfer learning in combination with traditional features predicts survival among patients with lung adenocarcinoma,” Tomography: a journal for imaging research, vol. 2, no. 4, p. 388, 2016.
  • [225] V. Cheplygina, M. de Bruijne, and J. P. Pluim, “Not-so-supervised: a survey of semi-supervised, multi-instance, and transfer learning in medical image analysis,” arXiv preprint arXiv:1804.06353, 2018.
  • [226] L. Mena and J. A. Gonzalez, “Machine learning for imbalanced datasets: Application in medical diagnostic,” Breast, pp. 574–579, 2006.
  • [227] N. Japkowicz and S. Stephen, “The class imbalance problem: A systematic study,” Intelligent Data Analysis, vol. 6, no. 5, pp. 429–449, 2002.
  • [228] J. Wang, X. Yang, H. Cai, W. Tan, C. Jin, and L. Li, “Discrimination of breast cancer with microcalcifications on mammography by deep learning,” Scientific Reports, vol. 6, p. 27327, 2016.
  • [229] W. Samek, T. Wiegand, and K.-R. Müller, “Explainable artificial intelligence: Understanding, visualizing and interpreting deep learning models,” arXiv preprint arXiv:1708.08296, 2017.
  • [230] E. Thelisson, K. Padh, and L. E. Celis, “Regulatory mechanisms and algorithms towards trust in AI/ML,” in Proceedings of the IJCAI 2017 Workshop on Explainable Artificial Intelligence (XAI), Melbourne, Australia, 2017.
  • [231] J. R. Zech, M. A. Badgeley, M. Liu, A. B. Costa, J. J. Titano, and E. K. Oermann, “Variable generalization performance of a deep learning model to detect pneumonia in chest radiographs: A cross-sectional study,” PLoS Medicine, vol. 15, no. 11, p. e1002683, 2018.
  • [232] R. Caruana, Y. Lou, J. Gehrke, P. Koch, M. Sturm, and N. Elhadad, “Intelligible models for healthcare: Predicting pneumonia risk and hospital 30-day readmission,” in Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 2015, pp. 1721–1730.
  • [233] B. Zhou, A. Khosla, A. Lapedriza, A. Oliva, and A. Torralba, “Learning deep features for discriminative localization,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 2921–2929.