Liver cancer is one of the leading causes of cancer-related deaths in many parts of the world and one of the most common cancers among males in Singapore [sgweb2020]. Hepatocellular carcinoma (HCC) is the most common type of primary liver cancer, which is the sixth most common cancer in the world and the fourth leading cause of cancer mortality globally [yang2019global]. HCC is known to be a highly heterogeneous disease [dragani2010risk], especially at the genetic level. This heterogeneity leads to inconsistent therapeutic outcomes and makes it difficult to design targeted therapy and precision medicine in the clinic. Recent studies have identified several genes that are frequently mutated in HCC. For instance, the telomerase reverse transcriptase (TERT) promoter, TP53/p53 and CTNNB1/beta-catenin have been identified as the most commonly mutated genes in HCC [rao2016frequently].
In general, current methods of genome sequencing require liver biopsy and specialized equipment, which limits their routine use. Moreover, these invasive sampling methods are destructive and cannot provide multiple snapshots over time. Many studies have demonstrated that imaging features in computed tomography (CT) scans are correlated with the global gene expression of primary human liver cancer. For example, Segal et al. used 28 features out of over 100 image features manually extracted from CT scans to reconstruct the gene expression patterns of HCC patients [segal2007decoding]. Such an integrated diagnostic approach combining radiology and genomics is termed “radiogenomics” [rutman2009radiogenomics].
However, the development of radiogenomics for HCC is critically impeded by the following challenges. First, dynamic CT images are widely used in liver CT scanning. Multi-phase CT scans are divided into four phases, i.e., the non-contrast (NC) phase, arterial (ART) phase, portal venous (PV) phase, and delay (DL) phase, as shown in Fig. 1. Objects of varying size, e.g., lesions and tumors, make it difficult to obtain comprehensive information from dynamic multi-modal images. Second, there is intra-tumour heterogeneity (ITH) in HCC [zhai2017spatial], which means that gene mutations differ across different parts of a tumor. This makes machine learning approaches, especially convolutional neural networks (CNNs), very difficult to train for gene mutation prediction in HCC, because input slices covering a whole tumor with different gene mutations are assigned the same mutation label. Third, radiogenomics requires quantitative imaging features annotated by expert radiologists, which is laborious and time-consuming and suffers from high intra- and inter-observer variance.
In this work, we propose a deep CNN framework to address the challenges of automatic gene mutation prediction in HCC. More specifically, a multi-stream CNN is applied for multi-phase cross-modal feature extraction, followed by an aggregation layer that fuses and utilizes the 4D information. In addition, some image traits (biomarkers) [ocker2018biomarkers] identified by radiologists are aggregated as auxiliary information for decision making. Extensive experiments are conducted on a dataset collected from different hospitals in Singapore, in which multi-region sampling is applied to avoid label mismatch caused by ITH. Experimental results and analysis demonstrate the effectiveness of the proposed framework for mutation prediction in the APOB, COL11A1, and ATRX genes. The framework can readily be extended to predict more gene mutations if sufficient data are provided.
II Related Work
II-A Radiogenomics and AI
As a rapidly developing field, radiogenomics has shown potential value for diagnostic and therapeutic strategies. In the fields of radiomics and radiogenomics, high-throughput extraction of qualitative and quantitative imaging features from radiographs is required to obtain diagnostic, predictive, or prognostic information [kumar2012radiomics]. To explore the relationships between gene expression and imaging, Segal et al. [segal2007decoding] defined ‘units of distinctiveness’, termed ‘traits’, from qualitative imaging features of liver cancer and reconstructed nearly 80% of the global gene expression profiles using 28 image traits. Aerts et al. [aerts2014decoding] proposed a quantitative strategy for correlating CT images with genome data, which is of great clinical significance. These studies have contributed to the adoption of emerging technologies such as computer vision and deep learning in radiogenomics [li2018novel, smedley2018using]. The use of artificial intelligence (AI) for genomics and molecular profiling of cancers could be highly beneficial because it is non-invasive and captures a comprehensive view of the tumor. However, most current radiogenomics analyses in liver cancer extract image features from single-phase images and therefore do not consider the changes in tumor shape and size across phases. Moreover, merging multiphasic information remains a critical issue.
II-B Multi-region Sampling
The genomic profiling methods behind most current radiogenomics analyses rely on single biopsy samples, which often reflect only a part of the tumor. These methods significantly underestimate intratumoral genomic heterogeneity in cancers, especially in HCC. A large proportion of HCCs display a clear geographic segregation in which spatially closer sectors are genetically more similar [zhai2017spatial]. Furthermore, this genetic heterogeneity in HCC affects the training of CNNs: slices with different mutations are fed into the network with the same mutation label, which misleads training. Zhai et al. [zhai2017spatial] first studied intra-tumour heterogeneity (ITH) in HCC using multi-region sampling. As shown in Fig. 1, the tumor and its sectors are first annotated on the CT scans. A central slice is then cut from the patient's tumor, and a linear grid of tumor sectors is harvested and examined for further multiomics analysis. The dataset used in this work adopts multi-region sampling.
The proposed method for gene mutation prediction is illustrated in Fig. 3. It consists of three main components: (i) Multi-stream Feature Extraction, (ii) Feature Aggregation, and (iii) Auxiliary Embedding. First, the processed images from the different phases are fed separately into four deep CNN blocks for feature extraction; then a lightweight CNN is applied to fuse the multi-stream information for further prediction [DLCase]. In addition, biomarkers annotated by doctors are embedded into the network as auxiliary information to improve the prediction accuracy.
In multi-stream feature extraction, a Residual Neural Network (ResNet) [resnet, zhao2019bira] [deng2009imagenet] is applied.
In feature aggregation, feature maps from four streams are combined by concatenation along the channel dimension, followed by a shallow network including one convolution layer (
In addition to the images, some image traits (biomarkers) are embedded into the final fully connected layer. These image traits are characteristics described by radiologists, which can be regarded as indicators of physiologic and pathologic processes in response to various diagnostic or therapeutic procedures. In our experiments, nine kinds of biomarkers are annotated as binary variables by doctors, such as “presence of intra-tumoral vessels”. More details can be found in Fig. 5. These biomarkers are treated as a vector of binary values, which is fed into a fully connected layer with a size of 128. The output of this fully connected layer is combined with the image features for further prediction. Binary cross-entropy is used as the loss function for the binary classification problem.
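As a minimal sketch of the auxiliary embedding described above, the following pure-Python snippet passes a nine-element binary biomarker vector through a fully connected layer of size 128, combines the result with an image feature vector, and scores the prediction with binary cross-entropy. Several details are assumptions made only for illustration and are not taken from the paper: the combination is done by concatenation, a ReLU follows the biomarker layer, the image feature vector has length 8, and all weights, biomarker values and function names are made up.

```python
import math
import random


def dense(x, weights, bias):
    """Fully connected layer: y_j = sum_i x_i * W[i][j] + b_j."""
    return [
        sum(xi * weights[i][j] for i, xi in enumerate(x)) + bias[j]
        for j in range(len(bias))
    ]


def relu(v):
    """Element-wise ReLU (assumed activation, not stated in the text)."""
    return [max(0.0, a) for a in v]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def binary_cross_entropy(p, y, eps=1e-7):
    """BCE for one sample: -[y log p + (1 - y) log(1 - p)]."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def random_matrix(rng, rows, cols, scale=0.1):
    """Small random weights standing in for trained parameters."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)

# Nine binary biomarkers for one sector (values are made up).
biomarkers = [1, 0, 0, 1, 1, 0, 0, 0, 1]
# Stand-in for the aggregated image features (length 8 is an assumption).
image_features = [rng.uniform(0.0, 1.0) for _ in range(8)]

# Biomarkers -> fully connected layer of size 128.
w_bio = random_matrix(rng, 9, 128)
bio_embedding = relu(dense(biomarkers, w_bio, [0.0] * 128))

# Combine the biomarker embedding with the image features (here: concatenation).
fused = image_features + bio_embedding

# Final classifier: one logit, squashed to a mutation probability.
w_out = random_matrix(rng, len(fused), 1)
prob = sigmoid(dense(fused, w_out, [0.0])[0])

# Binary cross-entropy against a hypothetical "mutated" label.
loss = binary_cross_entropy(prob, y=1)
```

Because a separate binary classifier is trained per gene, a single sigmoid output with binary cross-entropy is sufficient for each model.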
IV-A Dataset and implementation
The dataset was collected from multiple hospitals in Singapore with the approval of the Institutional Review Board and consists of 3D multiphasic CT scans, genomic information and biomarker sequences from patients. Because the data come from various sources, only patients with all four phases were considered in our experiments. The DNA sequencing was based on multi-region sampling, in which gene mutations differ across sectors of the tumor; therefore, our training samples are defined per sector rather than per patient. The ground-truth mutation labels were extracted from the DNA sequencing, and the corresponding sectors were extracted from the patients’ CT scans. Because the sectors vary in size, they were padded with zeros to a common length and width, and five adjacent slices centered on the central slice of the sector were cropped from the volumetric images to capture spatial information along the third dimension (see Fig. 4). Typically, HCC in different regions of the liver is clinically scored on the basis of CT image features, such as size and margin. Therefore, nine biomarkers were annotated by radiologists based on the CT scans.
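The sector preprocessing described above (zero-padding to a common size and taking five adjacent slices around the central slice) can be sketched as follows. This is an illustrative sketch only: centering the sector inside the padded canvas, shifting the five-slice window back into range near the volume boundary, and the function names `pad_to_size` and `adjacent_slice_indices` are all assumptions, not details given in the paper.

```python
def pad_to_size(slice2d, height, width):
    """Zero-pad a 2D sector crop (list of rows) to height x width.

    The sector is centered in the padded canvas (an assumption).
    """
    h, w = len(slice2d), len(slice2d[0])
    if h > height or w > width:
        raise ValueError("target size must be at least the sector size")
    top = (height - h) // 2
    left = (width - w) // 2
    padded = [[0] * width for _ in range(height)]
    for r, row in enumerate(slice2d):
        padded[top + r][left:left + w] = row
    return padded


def adjacent_slice_indices(center, depth, count=5):
    """Indices of `count` adjacent slices around `center`.

    Near the first or last slice the window is shifted so that it stays
    inside the volume (an assumed boundary rule).
    """
    half = count // 2
    start = min(max(center - half, 0), max(depth - count, 0))
    return list(range(start, min(start + count, depth)))


# Hypothetical 2x2 sector padded to 4x4.
sector = [[1, 2], [3, 4]]
padded = pad_to_size(sector, 4, 4)
# padded == [[0, 0, 0, 0], [0, 1, 2, 0], [0, 3, 4, 0], [0, 0, 0, 0]]

# Five slices around slice 10 of a 40-slice volume.
indices = adjacent_slice_indices(center=10, depth=40)
# indices == [8, 9, 10, 11, 12]
```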
Because of the data imbalance across gene mutations, only the prevalent mutated genes, i.e., APOB, COL11A1 and ATRX, are considered in this study. Given that the mutations are not mutually exclusive, a separate model is trained per mutation to alleviate the problem of data imbalance among mutations. We compare our model with a single-stream CNN using single-phase CT scans, in which ResNet-18 is trained on images from one of the four phases, e.g., Single Stream (ART phase). To explore the effectiveness of auxiliary embedding in the proposed architecture, the proposed model without biomarkers is also implemented, i.e., Proposed (without biomarkers). To validate the stability of the proposed model, leave-one-out cross validation (LOOCV) is performed, in which each patient is excluded from the training set one at a time and then classified on the basis of the predictor built from the data of all the other patients.
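Since training samples are sectors while the held-out unit is a patient, the leave-one-out protocol above amounts to a grouped split. The following sketch shows one way to build such splits; the tuple layout `(patient_id, sector_id, label)`, the sample values and the function name `leave_one_patient_out` are hypothetical and serve only to illustrate the idea.

```python
def leave_one_patient_out(samples):
    """Yield (held_out_patient, train, test) splits.

    `samples` is a list of (patient_id, sector_id, label) tuples. All sectors
    of the held-out patient go to the test set, so no patient contributes
    sectors to both sets.
    """
    patients = sorted({pid for pid, _, _ in samples})
    for held_out in patients:
        train = [s for s in samples if s[0] != held_out]
        test = [s for s in samples if s[0] == held_out]
        yield held_out, train, test


# Hypothetical sector-level samples: (patient, sector, mutation label).
samples = [
    ("P1", "s1", 1), ("P1", "s2", 0),
    ("P2", "s1", 0),
    ("P3", "s1", 1), ("P3", "s2", 1), ("P3", "s3", 0),
]

splits = list(leave_one_patient_out(samples))
for held_out, train, test in splits:
    print(held_out, len(train), len(test))
```

Splitting by patient rather than by sector keeps sectors of the same tumor, which tend to be genetically similar, from appearing on both sides of a split.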
| Single Stream (NC phase) | 58.9% | 51.3% | 60.1% |
| Single Stream (ART phase) | 70.1% | 55.5% | 57.5% |
| Single Stream (PV phase) | 64.1% | 56.4% | 68.1% |
| Single Stream (DL phase) | 68.4% | 56.9% | 64.5% |
| Proposed (without biomarkers) | 77.3% | 67.7% | 73.4% |
IV-B Results and discussion
Table I summarizes the results of all methods on the dataset. The proposed method outperforms all other methods across the different gene mutations, which demonstrates the effectiveness of feature fusion across phases. In addition, biomarkers help to improve the accuracy of mutation prediction for COL11A1 and ATRX, which supports the feasibility of using biomarkers in radiogenomics analysis.
To further explore the correlations among biomarkers and gene mutations, a correlation map is generated from the whole dataset, as shown in Fig. 5. We find that most of these biomarkers are correlated with gene mutations. Furthermore, some gene mutations are correlated with each other. We performed hierarchical clustering of the correlation map and found that some gene mutations are grouped into the same cluster. In future work, the complex relationships among gene mutations could therefore be modeled using graph neural networks for prediction [chen2019gated].
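A correlation map of this kind can be sketched as a table of pairwise correlations between binary columns. The text does not state which correlation measure was used; the snippet below assumes the Pearson coefficient (which equals the phi coefficient for 0/1 data), and the column names and values are invented for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (phi for 0/1 data)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return float("nan")  # undefined when a variable is constant
    return cov / math.sqrt(vx * vy)


def correlation_map(columns):
    """Pairwise correlations between named columns of binary values."""
    names = list(columns)
    return {
        (a, b): pearson(columns[a], columns[b])
        for a in names
        for b in names
    }


# Hypothetical per-sector annotations (one entry per sector).
columns = {
    "intra_tumoral_vessels": [1, 1, 0, 0, 1, 0],
    "APOB": [1, 1, 0, 0, 1, 0],
    "ATRX": [0, 0, 1, 1, 0, 1],
}

cmap = correlation_map(columns)
```

The resulting matrix could then be passed to a hierarchical clustering routine to group biomarkers and mutations with similar correlation profiles.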
V Conclusions and Future Work
In this work, we discuss the challenges and standard-of-care imaging technologies in HCC radiogenomics analysis. Considering the intra-tumour heterogeneity (ITH) in HCC, we propose a sector-based multi-stream cross-modal deep learning framework for gene mutation prediction. Features are extracted from multiphasic CT scans by a multi-stream CNN and then fused by feature aggregation. Moreover, the biomarkers are embedded into the final layer for further prediction. Experimental results on the dataset show the effectiveness of the proposed framework for mutation prediction in the APOB, COL11A1, and ATRX genes. Moreover, our framework can be extended to more gene mutations given sufficient training data. The correlation analysis between gene mutations and biomarkers not only supports the predictive value of biomarkers but also reveals significant correlations among different gene mutations. In future work, the relationships between gene mutations will be analyzed and considered as predictor variables.