Radiomics is a promising and powerful approach to the detection and prognosis of cancer. Referring to the high-throughput extraction and analysis of large numbers of quantitative imaging-based features from standardized medical imaging data, radiomics allows for quantitative tumour phenotype characterization, cancer detection, and prognosis via a high-dimensional, mineable feature space. Radiomics has previously been applied to lung, breast, prostate, and head-and-neck cancer patient cases, and these studies demonstrated its prognostic power and the potential of radiomic features for personalized medicine, risk stratification, and prediction of patient outcomes. However, these radiomics-driven methods rely on predefined, hand-crafted quantitative features based on intensity, texture, and shape, and as such may not be able to fully characterize the unique traits of specific forms of cancer. A way to uncover quantitative radiomic features tailored to characterizing unique cancer phenotypes from standardized imaging data is therefore highly desired. In this study, we introduce a novel discovery radiomics framework in which we bypass conventional predefined, hand-crafted radiomic feature models and directly discover custom radiomic feature models from the abundance of readily available medical imaging data. Discovery radiomics has the potential to find new abstract features that capture unique characteristics of cancer phenotypes beyond what predefined feature models can extract, allowing for improved personalized medicine.
The proposed discovery radiomics framework can be described as follows (see Figure 1). Given past radiology data and the corresponding pathology-verified radiologist tissue annotations from a medical imaging data archive, the radiomic sequencer discovery process learns a radiomic sequencer that extracts highly customized radiomic features tailored to characterizing the unique tissue phenotypes that differentiate cancerous tissue from healthy tissue. The discovered radiomic sequencer can then be applied to new patient data to extract the corresponding radiomic sequence for cancer screening and diagnosis.
The radiomic sequencer discovered in this study is built upon a deep convolutional StochasticNet architecture, in which a deep convolutional neural network (CNN) is represented as a random graph and the neural connections within the network are formed stochastically according to a probabilistic neural connectivity model. This leverages random graph theory to construct more efficient deep neural network architectures that retain the modeling capabilities of traditional, densely connected architectures. The sequencer consists of three stochastically formed convolutional layers containing 32, 32, and 64 receptive fields (size ), respectively. Each receptive field is part of a realization of a random graph with a uniform neural connectivity probability of 0.5; that is, the expected number of parameters in each receptive field of the proposed sequencer is only half that of a receptive field in a sequencer built using a conventional deep CNN. Fewer parameters and, consequently, more efficient training and faster test-time execution are the most important advantages of the proposed framework over conventional CNN approaches.
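The stochastic connectivity described above can be illustrated with a minimal sketch that samples one binary connectivity mask per receptive field and compares the number of retained connections with the dense count. This is not the authors' implementation: the kernel size (5) and the single input channel are assumptions made purely for illustration, since the text does not give them, and the function names are hypothetical.

```python
import random

def sample_connectivity_mask(num_weights, p, rng):
    """Return a 0/1 list; each potential connection is kept with probability p."""
    return [1 if rng.random() < p else 0 for _ in range(num_weights)]

def build_sequencer_masks(in_channels=1, fields_per_layer=(32, 32, 64),
                          kernel_size=5, p=0.5, seed=0):
    """Sample one mask per receptive field for each convolutional layer.

    in_channels and kernel_size are illustrative assumptions; only the
    field counts (32, 32, 64) and p = 0.5 come from the text.
    """
    rng = random.Random(seed)
    layers = []
    channels = in_channels
    for num_fields in fields_per_layer:
        weights_per_field = kernel_size * kernel_size * channels
        masks = [sample_connectivity_mask(weights_per_field, p, rng)
                 for _ in range(num_fields)]
        layers.append(masks)
        channels = num_fields  # the next layer sees one channel per field
    return layers

def parameter_counts(layers):
    """Return (kept, dense) connection counts summed over all layers."""
    kept = sum(sum(mask) for masks in layers for mask in masks)
    dense = sum(len(mask) for masks in layers for mask in masks)
    return kept, dense

layers = build_sequencer_masks()
kept, dense = parameter_counts(layers)
print(f"kept {kept} of {dense} connections ({kept / dense:.2%})")
```

Under these assumptions the retained fraction is close to 0.5, matching the statement that the expected parameter count per receptive field is half that of a dense field. In a real network the mask would be applied to the convolution weights and held fixed during training.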
3 Results and Discussion
In this study, we used a subset of the LIDC-IDRI [8, 9] dataset. The CT images were acquired with a broad range of CT scanner models from different manufacturers using the following tube peak potential energies: 120 (), 130 (), 135 (), and 140 (). A subset of 93 patient cases that have definitive diagnostic results was selected and, using data augmentation, an enriched dataset of 42,340 lung lesions was obtained by rotating each malignant and benign lesion in 45° and 10° increments, respectively. The proposed framework was evaluated on the enriched dataset and quantitatively compared to two state-of-the-art methods. Note that while a Multi-scale Convolutional Neural Network (MCNN) architecture was recently proposed and achieved an accuracy as high as via parameter tuning, that method was trained and tested on radiologist interpretations only and not on definitive diagnostic results, making the ground truth subject to high inter-observer variability.
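The class-dependent rotation scheme can be sketched as follows. The example only enumerates the rotation angles and the resulting sample count; it assumes the increments are in degrees, that the unrotated copy (angle 0) is counted, and it uses hypothetical lesion counts, since the per-class counts are not given in the text. It does not attempt to reproduce the 42,340 figure.

```python
# Rotation step per class, in degrees, as stated in the text.
ROTATION_STEP = {"malignant": 45, "benign": 10}

def rotation_angles(step_degrees):
    """Return the angles in [0, 360) spaced by step_degrees, starting at 0."""
    if step_degrees <= 0 or 360 % step_degrees != 0:
        raise ValueError("step_degrees must be a positive divisor of 360")
    return list(range(0, 360, step_degrees))

def augmented_size(lesion_counts):
    """Total samples after augmentation, counting the unrotated copy (angle 0)."""
    return sum(count * len(rotation_angles(ROTATION_STEP[label]))
               for label, count in lesion_counts.items())

# Hypothetical lesion counts, for illustration only (not taken from the study).
example_counts = {"malignant": 2, "benign": 3}
print(len(rotation_angles(45)), len(rotation_angles(10)))  # 8 36
print(augmented_size(example_counts))  # 2*8 + 3*36 = 124
```

Each angle would then be passed to an image-rotation routine applied to the lesion patch; that step is omitted here to keep the sketch dependency-free.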
As seen in Table 1, the StochasticNet radiomic sequencer (SNRS) achieves high sensitivity while maintaining good specificity. The overall accuracy of SNRS is noticeably higher than that of the tested state-of-the-art methods. These preliminary results illustrate the potential of the proposed discovery radiomics framework for improving cancer screening and diagnosis.
Table 1: Comparison of performance metrics for belief decision trees (BDT), a deep autoencoding radiomic sequencer (DARS), and the discovered StochasticNet radiomic sequencer (SNRS).
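For reference, sensitivity, specificity, and accuracy follow the standard confusion-matrix definitions, sketched below. The counts in the example are hypothetical and are not the values reported in Table 1; malignant lesions are assumed to be the positive class.

```python
def classification_metrics(tp, fn, tn, fp):
    """Compute sensitivity, specificity, and accuracy from confusion-matrix counts.

    tp/fn: malignant lesions classified correctly/incorrectly (assumed positive class).
    tn/fp: benign lesions classified correctly/incorrectly.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must contain at least one sample")
    return {
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }

# Hypothetical counts, for illustration only.
metrics = classification_metrics(tp=90, fn=10, tn=80, fp=20)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Reporting sensitivity and specificity alongside accuracy matters here because the augmented dataset need not be class-balanced, so accuracy alone could mask poor performance on one class.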
This research has been supported by the Ontario Institute for Cancer Research (OICR), the Canada Research Chairs program, the Natural Sciences and Engineering Research Council of Canada (NSERC), and the Ministry of Research and Innovation of Ontario. The authors also thank Nvidia for the GPU hardware used in this study, provided through the Nvidia Hardware Grant Program.
-  P. Lambin, E. Rios-Velazquez, R. Leijenaar, S. Carvalho, R. G. van Stiphout, P. Granton, C. M. Zegers, R. Gillies, R. Boellard, A. Dekker et al., “Radiomics: extracting more information from medical images using advanced feature analysis,” European Journal of Cancer, vol. 48, no. 4, pp. 441–446, 2012.
-  H. J. W. L. Aerts et al., “Decoding tumour phenotype by noninvasive imaging using a quantitative radiomics approach,” Nat Commun, vol. 45, no. 4, 2014.
-  O. Gevaert, J. Xu, C. Hoang, A. Leung, Y. Xu, A. Quon, D. Rubin, S. Napel, and S. Plevritis, “Non-small cell lung cancer: identifying prognostic imaging biomarkers by leveraging public gene expression microarray data,” Radiology, vol. 2, no. 4, pp. 387–96, 2012.
-  F. Khalvati, A. Wong, and M. A. Haider, “Automated prostate cancer detection via comprehensive multi-parametric magnetic resonance imaging texture feature models,” BMC medical imaging, vol. 15, no. 1, p. 27, 2015.
-  N. Maforo, H. Li, W. Weiss, L. Lan, and M. Giger, “Radiomics of multi-parametric breast mri in breast cancer diagnosis: A quantitative investigation of diffusion weighted imaging, dynamic contrast-enhanced, and t2-weighted magnetic resonance imaging,” Medical physics, vol. 42, no. 6, pp. 3213–3213, 2015.
-  M. J. Shafiee, P. Siva, and A. Wong, “Stochasticnet: Forming deep neural networks via stochastic connectivity,” arXiv preprint arXiv:1508.05463, 2015.
-  E. N. Gilbert, “Random graphs,” The Annals of Mathematical Statistics, pp. 1141–1144, 1959.
-  S. G. Armato III, G. McLennan, L. Bidaut, M. F. McNitt-Gray, C. R. Meyer, A. P. Reeves, B. Zhao, D. R. Aberle, C. I. Henschke, E. A. Hoffman et al., “The lung image database consortium (lidc) and image database resource initiative (idri): a completed reference database of lung nodules on ct scans,” Medical physics, vol. 38, no. 2, pp. 915–931, 2011.
-  S. G. Armato III, G. McLennan, M. F. McNitt-Gray, C. R. Meyer, D. Yankelevitz, D. R. Aberle, C. I. Henschke, E. A. Hoffman, E. A. Kazerooni, H. MacMahon et al., “Lung image database consortium: Developing a resource for the medical imaging research community,” Radiology, vol. 232, no. 3, pp. 739–748, 2004.
-  D. Zinovev, J. Feigenbaum, J. Furst, and D. Raicu, “Probabilistic lung nodule classification with belief decision trees,” in Engineering in Medicine and Biology Society, EMBC, 2011 Annual International Conference of the IEEE. IEEE, 2011, pp. 4493–4498.
-  D. Kumar, A. Wong, and D. A. Clausi, “Lung nodule classification using deep features in ct images,” in Computer and Robot Vision (CRV), 2015 12th Conference on. IEEE, 2015, pp. 133–138.
-  W. Shen, M. Zhou, F. Yang, C. Yang, and J. Tian, “Multi-scale convolutional neural networks for lung nodule classification,” in Information Processing in Medical Imaging. Springer, 2015, pp. 588–599.