Projection image-to-image translation in hybrid X-ray/MR imaging

by Bernhard Stimpel, et al.

The potential benefit of hybrid X-ray and MR imaging in the interventional environment is enormous. However, many existing image enhancement methods require the image information to be present in the same domain. To unlock this potential, we present a solution to image-to-image translation from MR projections to corresponding X-ray projection images. The approach is based on a state-of-the-art image generator network that is modified to fit the specific application. Furthermore, we propose the inclusion of a gradient map in the perceptual loss to emphasize high-frequency details. The proposed approach is capable of creating X-ray projection images with natural appearance. Additionally, our extensions show clear improvement compared to the baseline method.




1 Introduction

Hybrid imaging exhibits high potential in diagnostic and interventional applications. Future advances in research may bring the combination of Computed Tomography (CT) and Magnetic Resonance Imaging (MRI) to clinical applicability. Especially for interventional purposes, simultaneously acquiring soft- and dense-tissue information would open up great opportunities. Assuming the information of both modalities is present at the same time, numerous existing post-processing methods would become applicable. Image fusion techniques, e.g., image overlays, have proven useful in the past. Additionally, one can think of image enhancement techniques, e.g., image de-noising or super-resolution. To enable the latter methods, it is beneficial to have the data available in the same domain. Different solutions to generate CT images from corresponding MRI data were presented in the past [Navalpakkam2013; Nie2017]. However, all of these are applied to volumetric data. In contrast, interventional procedures rely heavily on line integral data from X-ray projection imaging. Projection images which exhibit the same perspective distortion can also be acquired directly using an MR device [Syben2017]. This avoids time-consuming volumetric acquisition and subsequent forward projection. To solve this image-to-image translation task, we investigate a deep learning-based solution to generate X-ray projections from corresponding MRI views.

2 Methods

Network architecture:

Our network's design is based on the architecture proposed by Johnson et al. [Johnson2016], which was also adopted for various other applications, e.g., [Wanga]. In the original manuscript, multiple residual blocks are introduced at the lowest resolution level of the network. However, the underlying variance in medical projection images is small compared to natural image scenes. Additionally, during interventional treatments, valuable information is largely drawn from high-frequency details such as contrast and clear edges. We therefore place the residual blocks at higher resolution levels instead. Furthermore, bilinear upscaling is used in place of the transposed convolution operation, which has been linked to checkerboard artifacts [Odena2016]. A visualization of the final architecture is shown in Fig. 1.

Figure 1: The proposed architecture for the generator network.
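The effect that motivates replacing the transposed convolution can be illustrated in one dimension. The following minimal sketch is not the network described here: the all-ones kernel of size 3, the stride of 2, and the helper names `transposed_conv1d` and `linear_upsample_conv1d` are assumptions chosen purely for illustration. It feeds a constant signal through both upsampling variants and compares how evenly each output sample is covered.

```python
import numpy as np


def transposed_conv1d(x, kernel, stride):
    """Transposed convolution: every input sample stamps a scaled kernel copy into the output."""
    out = np.zeros((len(x) - 1) * stride + len(kernel))
    for i, value in enumerate(x):
        out[i * stride : i * stride + len(kernel)] += value * kernel
    return out


def linear_upsample_conv1d(x, kernel, factor):
    """Linear interpolation to `factor` times the length, followed by a 'valid' convolution."""
    src = np.arange(len(x))
    dst = np.linspace(0, len(x) - 1, factor * len(x))
    upsampled = np.interp(dst, src, x)
    return np.convolve(upsampled, kernel, mode="valid")


# Illustrative input: a constant signal, so any variation in the output
# comes from the upsampling operator itself.
x = np.ones(6)
kernel = np.ones(3)

deconv = transposed_conv1d(x, kernel, stride=2)
resize = linear_upsample_conv1d(x, kernel, factor=2)

print("transposed convolution:", deconv)
print("interpolate, then convolve:", resize)
```

With a kernel size that is not divisible by the stride, the stamped kernel copies overlap unevenly, so the transposed convolution output alternates between 1 and 2 in its interior although the input is constant. The interpolate-then-convolve variant yields a constant output. A trained network can in principle learn kernel weights that compensate for the uneven overlap, so this sketch shows the tendency rather than an unavoidable outcome.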

Objective function:

Considering the importance of high-frequency structures, a perceptual loss as proposed by [Johnson2016] is suitable, as pixel-wise metrics tend to produce blurrier results in comparison [Dosovitskiy2016]. In prior work, we concluded that the VGG-19 network pre-trained on ImageNet is appropriate for computing this perceptual loss on medical projection images [Stimpel2017a]. To emphasize the influence of high-frequency details, we include a gradient map of the label images in the optimization process. This map is used to weight the loss such that the loss generated from edges is emphasized and that from homogeneous regions is attenuated. Mathematically, this can be formulated as


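$$\mathcal{L}(y, \hat{y}) = \left\lVert G(y) \odot \left( \phi_j(y) - \phi_j(\hat{y}) \right) \right\rVert$$

where $\phi_j(y)$ and $\phi_j(\hat{y})$ are the feature activation maps of the VGG-19 network of the label image $y$ and the generated image $\hat{y}$ at the layer $j$, respectively, $G(y)$ is the gradient map of the label image computed using the Sobel filter, and $\odot$ denotes element-wise multiplication.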
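The weighting idea can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the implementation used in this work: the feature extractor `features` is a hypothetical stand-in for a VGG-19 layer (the identity by default, so that features and weight map share one resolution), the mean absolute difference is an assumed choice of norm, and the normalization of the Sobel magnitude to [0, 1] is likewise an assumption.

```python
import numpy as np

SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T


def filter2d(image, kernel):
    """Correlate a 2-D image with a 3x3 kernel using edge-replicating padding."""
    padded = np.pad(image, 1, mode="edge")
    out = np.zeros(image.shape, dtype=float)
    rows, cols = image.shape
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * padded[dy:dy + rows, dx:dx + cols]
    return out


def sobel_gradient_map(image):
    """Sobel gradient magnitude of the label image, scaled to [0, 1] (assumed normalization)."""
    magnitude = np.hypot(filter2d(image, SOBEL_X), filter2d(image, SOBEL_Y))
    peak = magnitude.max()
    return magnitude / peak if peak > 0 else magnitude


def edge_weighted_loss(label, generated, features=lambda img: img):
    """Mean absolute feature difference, weighted by the gradient map of the label.

    `features` is a hypothetical placeholder for a VGG-19 layer; the identity
    default is an assumption made to keep the sketch self-contained.
    """
    weights = sobel_gradient_map(label)
    difference = features(label) - features(generated)
    return float(np.mean(weights * np.abs(difference)))


# Toy label image with one vertical step edge between columns 3 and 4.
label = np.zeros((8, 8))
label[:, 4:] = 1.0

# Two generated images with an error of equal size: one on the edge, one in a flat region.
generated_edge = label.copy()
generated_edge[4, 4] -= 0.5
generated_flat = label.copy()
generated_flat[4, 0] += 0.5

loss_edge = edge_weighted_loss(label, generated_edge)
loss_flat = edge_weighted_loss(label, generated_flat)
print(loss_edge, loss_flat)
```

In this toy case the error on the edge contributes to the loss while the equally large error in the flat region is suppressed completely. Since the text above speaks of attenuating rather than removing the loss from homogeneous regions, a practical variant would presumably add a constant offset to the weights; with real VGG-19 features the gradient map would also have to be resampled to the resolution of the chosen layer.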

Data and Experiments:

Four patient head scans from both modalities were provided by the Department of Neuroradiology, University Hospital Erlangen (MR: 1.5 T MAGNETOM Aera / CT: SOMATOM Definition, Siemens Healthineers, Erlangen / Forchheim, Germany). The tomographic data was registered using 3D Slicer, and forward projections were generated using the CONRAD framework [Maier2013]. The projections from three patient scans were used for training and those from the remaining scan for testing.
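The data preparation can be illustrated with a small sketch. It does not reproduce the CONRAD pipeline: the parallel-beam projection by summation along one axis is a simplifying assumption (the projections used here exhibit perspective distortion), and the random volumes, the patient identifiers, and the helper names `parallel_projection` and `split_by_patient` are hypothetical. The point shown is that the split is made per patient, so that no patient contributes projections to both sets.

```python
import numpy as np


def parallel_projection(volume, axis, spacing_mm=1.0):
    """Approximate line integrals by summing voxel values along one axis."""
    return volume.sum(axis=axis) * spacing_mm


def split_by_patient(projections, test_patient):
    """Hold out all projections of one patient; the rest form the training set."""
    train = [
        view
        for patient_id, views in projections.items()
        if patient_id != test_patient
        for view in views
    ]
    test = list(projections[test_patient])
    return train, test


# Hypothetical stand-ins for four registered head volumes.
rng = np.random.default_rng(0)
volumes = {f"patient_{i}": rng.random((4, 5, 6)) for i in range(4)}

# Three projection directions per patient, one along each volume axis.
projections = {
    patient_id: [parallel_projection(volume, axis=a) for a in range(3)]
    for patient_id, volume in volumes.items()
}

train, test = split_by_patient(projections, test_patient="patient_3")
print(len(train), len(test))
```

Splitting at the patient level rather than at the projection level avoids evaluating on views of anatomy that the network has already seen from neighbouring angles.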

3 Results and Discussion

The proposed approach was successful in generating X-ray projections with a contrast similar to that of true fluoroscopic X-ray images. Results of the proposed projection image-to-image translation pipeline are shown in Fig. 2. Fig. 2(d) to 2(f) present the influence of the modified network architecture and of the loss weighted by the edge map. Improvements can be observed in the overall increased contrast of high-frequency details. Using the originally proposed architecture [Johnson2016; Wanga], which gathers the residual blocks at the lowest resolution level, leads to overall blurrier images and missing bone structures, as seen in Fig. 2(d). In contrast, the projections generated with the edge-weighted loss resemble the label images more closely. This can especially be observed at the base of the head, which is marked in the respective figures. The projections created without the weighting also contain many high-frequency details in this region; however, these are less specific in comparison. Naturally, details that are not visible in the MRI projections cannot be transferred to the generated images either. An example would be interventional devices that are visible in X-ray but not in MR images. Regarding subsequent post-processing applications, the question arises how this missing information in the generated projection images should be dealt with, which is subject to future work.

(a) Input: MRI projection
(b) Label: X-ray projection
(c) Abs. difference: Label (b) and Ours (f)
(d) Output: Original architecture with edge weighted loss
(e) Output: Our approach without edge weighted loss
(f) Output: Our approach with edge weighted loss
Figure 2: Representative examples of the projection image-to-image translation.

4 Conclusion

We presented an approach to generate X-ray-like projection images from corresponding MRI projections. Compared to the baseline method derived from natural image synthesis, the proposed extensions of the image-to-image translation pipeline showed qualitative improvements in the generated output. With future advances in hybrid X-ray and MR imaging, especially in the interventional environment, this domain transfer can be used to apply valuable post-processing methods.


This work has been supported by the project P3-Stroke, an EIT Health innovation project. EIT Health is supported by EIT, a body of the European Union.


  • (1) B. K. Navalpakkam et al., “Magnetic resonance-based attenuation correction for PET/MR hybrid imaging using continuous valued attenuation maps,” Invest. Radiol., vol. 48, no. 5, pp. 323–332, 2013.
  • (2) D. Nie et al., “Medical Image Synthesis with Context-Aware Generative Adversarial Networks,” Med. Image Comput. Comput. Interv. - MICCAI 2017, pp. 417–425, 2017.
  • (3) C. Syben et al., “Fan-beam Projection Image Acquisition using MRI,” in 3rd Conf. Image-Guided Interv. Fokus Neuroradiol., M. Skalej et al., Eds., 2017, pp. 14–15.
  • (4) J. Johnson et al., “Perceptual Losses for Real-Time Style Transfer and Super-Resolution,” arXiv:1603.08155, 2016.
  • (5) T.-C. Wang et al., “High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs,” arXiv:1711.11585, 2017.
  • (6) A. Odena et al., “Deconvolution and Checkerboard Artifacts,” Distill, vol. 1, no. 10, 2016.
  • (7) A. Dosovitskiy et al., “Generating Images with Perceptual Similarity Metrics based on Deep Networks,” Adv. Neural Inf. Process. Syst. 29 (NIPS 2016), pp. 658–666, 2016.
  • (8) B. Stimpel et al., “MR to X-Ray Projection Image Synthesis,” in Proc. Fifth Int. Conf. Image Form. X-Ray Comput. Tomogr., 2017.
  • (9) A. Maier et al., “CONRAD - A software framework for cone-beam imaging in radiology,” Med. Phys., vol. 40, no. 11, 2013.