Lesion Focused Super-Resolution

10/15/2018 ∙ by Jin Zhu, et al. ∙ University of Cambridge, Imperial College London

Super-resolution (SR) for image enhancement is of great importance in medical imaging applications. Broadly speaking, there are two types of SR: one requires multiple low-resolution (LR) images from different views of the same object to reconstruct the high-resolution (HR) output, and the other relies on learning from a large number of training examples, i.e., LR-HR pairs. In a real clinical environment, acquiring images from multiple views is expensive and sometimes infeasible. In this paper, we present a novel Generative Adversarial Network (GAN) based learning framework that produces an SR image from a single LR input. By performing simulation-based studies on the Multimodal Brain Tumor Segmentation Challenge (BraTS) datasets, we demonstrate the efficacy of our method for brain tumor MRI enhancement. Compared to bilinear interpolation and other state-of-the-art SR methods, our model is lesion focused, which not only results in better perceptual image quality without blurring, but is also more efficient and directly benefits subsequent clinical tasks, e.g., lesion detection and abnormality enhancement. Therefore, we envisage applying our SR method to boost image spatial resolution while maintaining crucial diagnostic information for further clinical tasks.




1 Introduction

High-resolution (HR) images are in great demand for many real applications [trinh2014novel]. However, the resolution and quality of images are normally limited by the imaging hardware. HR is particularly desirable for medical images, which provide useful and crucial details of the patient's anatomical and physiological information. In addition to possible hardware restrictions, medical imaging is further constrained by health limitations (e.g., the ionizing radiation dose in X-ray imaging) and acquisition time limitations (e.g., Specific Absorption Rate limits in MRI). Moreover, movement due to patient fatigue and organ pulsation further degrades image quality and results in images with a lower signal-to-noise ratio (SNR). Low-resolution (LR) medical images with a limited field of view and degraded image quality can reduce the visibility of vital pathological details and compromise diagnostic accuracy and prognosis [yang2016combined].

Research studies have shown that image super-resolution (SR) provides an alternative and relatively cheap way to improve the perceptual quality of medical images by enhancing spatial resolution rather than upgrading hardware. Compared to conventional image interpolation, SR methods can provide better HR outputs with higher SNR and less blurring. Broadly speaking, there are two different types of SR: (1) using multiple LR images acquired from different views of the same object to reconstruct the HR output, although acquiring multi-view images can be expensive and sometimes infeasible; (2) learning a particular SR model from LR-HR training pairs, and performing inference on a new input LR image to yield the HR output [yang2012coupled, trinh2014novel].

More recently, deep learning based SR methods have boosted the quality of super-resolved HR images, mainly owing to growth in computing power and the availability of big data. For example, the SRGAN method [ledig2017photo], which was developed based on a Generative Adversarial Network (GAN) model, has demonstrated fast and accurate SR results. However, SRGAN was developed for natural images, and studies on medical images remain limited.

In this study, we developed a lesion focused SR (LFSR) method that leverages the merits of GAN-based models to generate perceptually more realistic SR results while avoiding the introduction of non-existing features into the lesion area after SR. By performing simulation-based studies on the Multimodal Brain Tumor Segmentation Challenge (BraTS) datasets, we demonstrate the efficacy of our SR method for spatial resolution enhancement of brain tumor MRI images, with the potential to maintain crucial diagnostic information for further clinical tasks.

2 Methods

2.1 Lesion Focused SR

Our LFSR includes a lesion detection neural network, a super-resolution image generator, an HR/SR image discriminator, and a pre-trained 19-layer VGG [vgg19] (Fig. 1). The lesion detection network aims to detect the region of interest (ROI, e.g., brain tumors) and to extract the corresponding LR and HR ROIs from the whole-size LR and HR images before we apply the GAN.


We propose a max pooling residual block and an input-scale free residual neural network for lesion detection. Compared to the widely used residual blocks [resnet] and skip connections, a max pooling layer is added after every two residual blocks, which together include two skip connections between four convolution and batch normalization layers. This can help accelerate the training process and reduce the memory cost of the ROI detection task.
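The ROI extraction step that precedes the GAN can be illustrated with a minimal NumPy sketch. It assumes the detector outputs a rectangular box given as (top, left, height, width) in LR pixel coordinates; this box format, the function name `crop_roi_pair`, and the toy images are hypothetical and are not taken from the paper.

```python
import numpy as np

def crop_roi_pair(lr, hr, box, scale):
    """Crop the same region from an LR image and its HR counterpart.

    box is (top, left, height, width) in LR pixel coordinates
    (a hypothetical format); the HR crop uses the same box
    multiplied by the upscaling factor.
    """
    top, left, h, w = box
    lr_roi = lr[top:top + h, left:left + w]
    hr_roi = hr[top * scale:(top + h) * scale,
                left * scale:(left + w) * scale]
    return lr_roi, hr_roi

# Toy data: an HR image and an X2 LR image obtained by plain subsampling.
hr = np.arange(64 * 64, dtype=np.float32).reshape(64, 64)
lr = hr[::2, ::2]
lr_roi, hr_roi = crop_roi_pair(lr, hr, box=(4, 6, 10, 12), scale=2)
```

Cropping both images with one box keeps each LR-HR training pair spatially aligned, and the smaller inputs are what would lower the memory cost of the subsequent GAN training.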

During training, the generator and the discriminator of the GAN play a game: the generator aims to estimate SR images that are as realistic as possible from the LR input, and the discriminator aims to tell them apart from the ground truth HR images. With the lesion detection in place, training optimizes the trainable parameters of the generator and the discriminator under their respective loss functions. In our proposed LFSR, we use an SR residual network (SRResNet) as the generator, which includes 16 residual blocks followed by sub-pixel convolution layers. The discriminator and the pre-trained VGG are trained simultaneously with the generator to generate perceptually realistic image features.
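The sub-pixel convolution layers mentioned above end with a rearrangement step (often called pixel shuffle or depth-to-space) that turns r*r feature channels into an r-times larger image. The sketch below shows only this rearrangement in NumPy, assuming a channels-first layout; it is an illustration of the general technique, not the TensorLayer code used in the paper.

```python
import numpy as np

def pixel_shuffle(x, r):
    """Rearrange a (C*r*r, H, W) array into (C, H*r, W*r).

    Output pixel (h*r + i, w*r + j) of channel c is taken from
    input channel c*r*r + i*r + j at position (h, w).
    """
    c_r2, h, w = x.shape
    if c_r2 % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    c = c_r2 // (r * r)
    x = x.reshape(c, r, r, h, w)       # split channels into (c, i, j)
    x = x.transpose(0, 3, 1, 4, 2)     # reorder to (c, h, i, w, j)
    return x.reshape(c, h * r, w * r)  # merge (h, i) and (w, j)

# Four 2x2 feature maps become one 4x4 image for r = 2.
features = np.arange(4 * 2 * 2).reshape(4, 2, 2)
sr = pixel_shuffle(features, r=2)
```

Because the upscaling happens only in this final rearrangement, the preceding convolutions operate at LR resolution, which keeps the generator comparatively cheap.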

Figure 1: The schema of our proposed lesion focused super resolution (LFSR) neural network.

2.2 Data Preprocessing and Training Settings

We tested bilinear interpolation, SRResNet, SRGAN [ledig2017photo] and LFSR on the post-contrast T1-weighted (T1Gd) MRI scans from the BraTS 2018 datasets [menze2015multimodal], which were randomly divided into training and validation datasets. All slices were normalized to zero mean and unit variance. We simulated the LR images by downsampling the HR ground truth, and also tested with additive white Gaussian noise (AWGN) applied in k-space [Bao2003].
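The LR simulation can be sketched as follows. Several details here are assumptions made only for illustration: downsampling by block averaging, the noise level `sigma`, taking the magnitude image after the inverse transform, and normalizing as the last step. The paper does not specify these choices in this text.

```python
import numpy as np

def downsample(img, factor):
    """Downsample a 2D image by averaging factor x factor blocks (assumed method)."""
    h, w = img.shape
    h, w = h - h % factor, w - w % factor
    blocks = img[:h, :w].reshape(h // factor, factor, w // factor, factor)
    return blocks.mean(axis=(1, 3))

def add_kspace_noise(img, sigma, rng):
    """Add complex white Gaussian noise in k-space and return the magnitude image."""
    kspace = np.fft.fft2(img)
    noise = (rng.normal(0.0, sigma, kspace.shape)
             + 1j * rng.normal(0.0, sigma, kspace.shape))
    return np.abs(np.fft.ifft2(kspace + noise))

def normalize(img):
    """Scale an image to zero mean and unit variance."""
    return (img - img.mean()) / img.std()

def simulate_lr(hr, factor, sigma, rng):
    """Downsample, corrupt in k-space, then normalize (order is an assumption)."""
    return normalize(add_kspace_noise(downsample(hr, factor), sigma, rng))

rng = np.random.default_rng(0)
hr = rng.random((64, 64))  # stand-in for one non-negative HR slice
lr = simulate_lr(hr, factor=2, sigma=0.5, rng=rng)
```

Setting `sigma=0.0` gives the noise-free LR case, so the same function would cover both the clean and the noisy experiments.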

All experiments were performed on a Linux workstation with NVIDIA TITAN Xp GPUs. All models were implemented in Python, based on the TensorLayer [tensorlayer2017] library, and were trained with the Adam optimizer. The lesion detection network was trained independently for 100 epochs. The SRResNet was trained for 350 epochs with the pixel-wise mean square error (MSE) loss. The generator in SRGAN and LFSR was initially trained with the MSE loss for 50 epochs; the GAN was then trained under conditions expressed in terms of the percentage of images that the discriminator distinguished incorrectly and the percentages of images that it distinguished correctly.

3 Results and Discussions

3.1 Lesion Detection

Our lesion detection network achieved high accuracy on both X2 and X4 downsampled images. For evaluation, we graded each detection by how much of the tumor was covered by the predicted ROI: coverage above a first threshold counted as a perfect detection, and coverage above a second, lower threshold counted as an acceptable detection. In both the X2 and X4 cases, the reported detections fell into these two categories.

3.2 X2 and X4 SR

The X2 and X4 SR results are shown in Fig. 2. Both bilinear interpolation and SRResNet produced blurry SR results, although SRResNet achieved the highest PSNR. SRGAN and our proposed LFSR produced images whose texture features are more realistic when compared with the ground truth. Compared to SRGAN, our LFSR obtained higher (X2 cases) or equivalent (X4 cases) PSNR. More importantly, our LFSR significantly reduced the GPU memory cost, which allowed us to double the batch size and accelerated training to 266.8 s/epoch for X2 and 194.8 s/epoch for X4 (compared to SRGAN training times of 649.8 s/epoch for X2 and 370.8 s/epoch for X4).
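As a quick worked check of the training times quoted above, the per-epoch speedup of LFSR over SRGAN is the ratio of the two times. The snippet uses only the four reported numbers; the variable names are illustrative.

```python
# Reported per-epoch training times in seconds: (SRGAN, LFSR).
epoch_times = {"X2": (649.8, 266.8), "X4": (370.8, 194.8)}

# Speedup = SRGAN time / LFSR time for each scaling factor.
speedups = {scale: srgan / lfsr for scale, (srgan, lfsr) in epoch_times.items()}

for scale, speedup in speedups.items():
    print(f"{scale}: LFSR is about {speedup:.2f}x faster per epoch")
# X2: LFSR is about 2.44x faster per epoch
# X4: LFSR is about 1.90x faster per epoch
```

These ratios combine the effect of the smaller ROI inputs and the doubled batch size, so they should be read as an overall per-epoch comparison rather than a per-image cost.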

Figure 2: The ground truth ROIs and the detected ROIs with the predicted SR images (range: [-1, 1]). The PSNR/SSIM of each result is also displayed.

3.3 X2 SR with Additive Noise

We also tested LFSR for X2 SR with additive Gaussian noise. Bilinear interpolation combined with non-local means denoising [nonlocaldenoising] (B+NLD) was tested to suppress the noise and provide a fairer comparison. All three deep learning methods achieved higher PSNR and SSIM than B+NLD when noise was present (Table 1). The MSE-based SRResNet still achieved the highest PSNR and SSIM. In contrast, both SRGAN and LFSR were still able to generate perceptually more realistic textures in our qualitative studies. Furthermore, LFSR achieved higher PSNR and SSIM than SRGAN in the noisy cases (Table 1), with more efficient training.


                PSNR                            SSIM
Method      X2     X4     X2()   X2()       X2     X4     X2()   X4()
B+NLD       29.1   25.2   20.7   17.1       0.900  0.761  0.623  0.483
SRResNet    35.6   27.9   32.8   30.6       0.962  0.832  0.895  0.840
SRGAN       29.6   25.7   27.6   26.2       0.865  0.741  0.789  0.731
LFSR        32.1   25.1   29.0   27.4       0.914  0.723  0.832  0.772
Table 1: PSNR and SSIM results for simulations with and without additive Gaussian noise (bold: better than SRGAN).
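For reference, PSNR values such as those in Table 1 can be computed from the mean squared error between the ground truth and the SR output. The sketch below assumes images scaled to [-1, 1], as in Fig. 2, so the peak-to-peak data range is 2; whether the authors used this exact convention is an assumption.

```python
import numpy as np

def psnr(reference, estimate, data_range=2.0):
    """Peak signal-to-noise ratio in dB.

    data_range is the peak-to-peak intensity range; 2.0 assumes
    images scaled to [-1, 1].
    """
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(data_range ** 2 / mse)

# A constant error of 0.2 gives MSE = 0.04, i.e. 10*log10(4 / 0.04) = 20 dB.
reference = np.zeros((8, 8))
estimate = np.full((8, 8), 0.2)
value = psnr(reference, estimate)
```

Since PSNR depends only on the pixel-wise error, an MSE-trained model such as SRResNet is expected to score highest on it even when its outputs look blurrier, which matches the pattern in the table.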

4 Conclusion

In summary, we have developed and validated a lesion focused SR (LFSR) method to super-resolve tumor ROIs imaged by MRI. Compared to state-of-the-art SR methods, our proposed LFSR method is more efficient and produces perceptually more realistic SR results, which helps maintain crucial image features for further clinical tasks and decisions. In the final camera-ready version, we will include a more detailed description of our method and more comparison results.