Progressively Volumetrized Deep Generative Models for Data-Efficient Contextual Learning of MR Image Recovery

Mahmut Yurt, et al. · 11/27/2020

Magnetic resonance imaging (MRI) offers the flexibility to image a given anatomic volume under a multitude of tissue contrasts. Yet, scan time considerations put stringent limits on the quality and diversity of MRI data. The gold-standard approach to alleviate this limitation is to recover high-quality images from data undersampled across various dimensions such as the Fourier domain or contrast sets. A central divide among recovery methods is whether the anatomy is processed per volume or per cross-section. Volumetric models offer enhanced capture of global contextual information, but they can suffer from suboptimal learning due to elevated model complexity. Cross-sectional models with lower complexity offer improved learning behavior, yet they ignore contextual information across the longitudinal dimension of the volume. Here, we introduce a novel data-efficient progressively volumetrized generative model (ProvoGAN) that decomposes complex volumetric image recovery tasks into a series of simpler cross-sectional tasks across individual rectilinear dimensions. ProvoGAN effectively captures global context and recovers fine-structural details across all dimensions, while maintaining the low model complexity and data-efficiency advantages of cross-sectional models. Comprehensive demonstrations on mainstream MRI reconstruction and synthesis tasks show that ProvoGAN yields superior performance to state-of-the-art volumetric and cross-sectional models.


1 Introduction

Magnetic resonance imaging (MRI) is a non-invasive imaging modality with pervasive diagnostic applications in the clinic [intro_ref]. Fundamental advantages of MRI over competing modalities include its ability to produce volumetric images of bodily tissues at arbitrary spatial orientations, and its ability to capture a given anatomy under a diverse set of distinct tissue contrasts. In an MRI protocol, tissue contrast is set up by tailored pulse sequences, and tissue signals are then spatially resolved via Fourier encoding. In traditional pipelines, data collected in the spatial frequency domain, known as k-space, are inverse Fourier transformed to reconstruct volumetric images of a selected anatomy. However, this intrinsically slow data-acquisition process limits the quality and diversity of the MR images collected. Therefore, there has been persistent interest in methods for accelerated MRI. To maximize diagnostic information, these methods solve inverse problems that aim to recover a comprehensive set of high-quality MR images from lower-quality acquisitions undersampled across k-space

[Pruessmann1999, Griswold2002, Lustig2007, Lustig2008, Lustig2010, Ravishankar2011a, Quan2018c, Yu2018c, Mardani2019b, lee2018deep, Hammernik2017, Yang2016, Schlemper2017a, Mardani2017, Akcakaya2019, Kwon2017, Zhu2018, Hyun2018, Wang2016, Han2018a, Cheng2018, Dartransfer] or less diverse acquisitions undersampled across contrast sets [com_sen_mr_tissue, atlas_based_intensity, patch_based_one_to_one_1, patch_based_one_to_one_2, patch_based_one_to_one_3, patch_based_one_to_one_4, simulta_super_res, loc_sens_nn_1, unsup_cross_model_synth, dictionary_one_to_one_1, dict_learning_im_synth, modality_prop, nn_one_to_one_1, nn_one_to_one_2, Jog2017b, example_based, les_seg, ex_mod_prop, Chartsias2018c, Joyce2017c, mmgan, Dar2019, collagan, 3D_cgan, mr_tra_seg, synthetic_mri, mra_synth, darsynergistic, eagan, diamondgan]. Naturally, learning-based models are gaining immense traction in this area due to their capacity to solve even the most challenging inverse problems [Ravishankar2011a, Quan2018c, Yu2018c, Mardani2019b, lee2018deep, Hammernik2017, Yang2016, Schlemper2017a, Mardani2017, Akcakaya2019, Kwon2017, Zhu2018, Hyun2018, Wang2016, Han2018a, Cheng2018, patch_based_one_to_one_1, patch_based_one_to_one_2, patch_based_one_to_one_3, patch_based_one_to_one_4, simulta_super_res, loc_sens_nn_1, unsup_cross_model_synth, dictionary_one_to_one_1, dict_learning_im_synth, modality_prop, nn_one_to_one_1, nn_one_to_one_2, Jog2017b, example_based, les_seg, ex_mod_prop, Chartsias2018c, Joyce2017c, mmgan, Dar2019, collagan, 3D_cgan, mr_tra_seg, synthetic_mri, mra_synth, darsynergistic, eagan, diamondgan, mustgan].
 
Two mainstream image recovery problems in MRI are reconstruction (the process of transforming undersampled raw MRI acquisitions into fully-sampled, high-quality MR images) [Pruessmann1999, Griswold2002, Lustig2007, Lustig2008, Lustig2010, Ravishankar2011a, Quan2018c, Yu2018c, Mardani2019b, lee2018deep, Hammernik2017, Yang2016, Schlemper2017a, Mardani2017, Akcakaya2019, Kwon2017, Zhu2018, Hyun2018, Wang2016, Han2018a, Cheng2018, Dartransfer] and synthesis (the process of recovering images of missing MR contrasts from corresponding images of available MR contrasts within the same anatomy) [Ravishankar2011a, Quan2018c, Yu2018c, Mardani2019b, lee2018deep, Hammernik2017, Yang2016, Schlemper2017a, Mardani2017, Akcakaya2019, Kwon2017, Zhu2018, Hyun2018, Wang2016, Han2018a, Cheng2018, Dartransfer, patch_based_one_to_one_1, patch_based_one_to_one_2, patch_based_one_to_one_3, patch_based_one_to_one_4, simulta_super_res, loc_sens_nn_1, unsup_cross_model_synth, dictionary_one_to_one_1, dict_learning_im_synth, modality_prop, nn_one_to_one_1, nn_one_to_one_2, Jog2017b, example_based, les_seg, ex_mod_prop, Chartsias2018c, Joyce2017c, mmgan, Dar2019, collagan, 3D_cgan, mr_tra_seg, synthetic_mri, mra_synth, darsynergistic, eagan, diamondgan, mustgan]. Their solutions require learning a transformation between three-dimensional (3D) source and target images. A naive approach would therefore perform a single-shot global mapping between the two volumes [3D_cgan, 3d_synth_2, eagan, 3d_rec_1]. Learning-based volumetric models leverage spatial correlations across all dimensions to better capture contextual information [3D_cgan, 3d_synth_2, eagan]. Introducing these contextual priors can in theory lead to more consistent and accurate recovery across the volume. However, 3D models involve substantially more parameters than their two-dimensional (2D) counterparts [3Dreview]. Furthermore, each volume constitutes a single training sample for a 3D model, whereas it would yield several tens of samples for a 2D model. Taken together, these factors create a heavier demand for data and impair the learning process for volumetric models [3Dreview].
 
A less data-intensive approach for learning-based MRI recovery is to perform a spatially-localized mapping between corresponding cross-sections of source and target volumes [Quan2018c, Yu2018c, Mardani2019b, lee2018deep, Hammernik2017, Yang2016, Schlemper2017a, Mardani2017, Akcakaya2019, Kwon2017, Zhu2018, Hyun2018, Wang2016, Han2018a, Cheng2018, Dartransfer, nn_one_to_one_2, Chartsias2018c, Joyce2017c, mmgan, Dar2019, collagan, mr_tra_seg, mra_synth, darsynergistic, diamondgan, mustgan]. Volumes are split along a specific rectilinear orientation, and cross-sectional models are then trained to learn the 2D mapping. Since a lower-dimensional mapping is to be learned, cross-sectional models are of lower complexity and have reduced demand for data. In turn, this facilitates the learning process, and often results in more detailed mappings along the transverse dimensions within cross-sections compared to 3D models. That said, 2D models do not fully utilize the contextual information across the longitudinal dimension [3D_cgan, 3d_synth_2, eagan, 3d_rec_1]. This limitation can lead to inconsistency across recovered cross-sectional images and compromise overall recovery performance [3D_cgan, 3d_synth_2, eagan, 3d_rec_1].
 
Here, we introduce a novel progressively volumetrized deep generative model, ProvoGAN, for data-efficient contextual learning of MR image recovery. To improve efficiency, ProvoGAN decomposes complex volumetric recovery tasks into a series of simpler cross-sectional subtasks. The subtasks are implemented consecutively in three rectilinear orientations (e.g., axial-coronal-sagittal), with task-specific optimization of progression order. For a given subtask with a select orientation, the source volume is split across the corresponding longitudinal dimension, and a 2D model is trained to map between cross-sections of source and target images. The predicted cross-sections are then reformatted into a volume and input to the next subtask. This progressive procedure empowers ProvoGAN to recover fine-structural details in each orientation while ensuring contextual consistency across the volume. Since ProvoGAN comprises a cascade of 2D models, these improvements in recovery performance are achieved without elevated demand for training data. In this study, the 2D models in ProvoGAN were based on conditional generative adversarial networks to ensure a high degree of realism in the recovered images [Goodfellow2014a, condgans]. Comprehensive demonstrations are provided for reconstruction and synthesis in multi-contrast MRI protocols. Our results indicate the superiority of ProvoGAN against state-of-the-art volumetric and cross-sectional models.

2 Methods

2.1 Generative Adversarial Networks

Generative adversarial networks (GANs) are generative models composed of two subnetworks. The first subnetwork is a generator (G) that aims to synthesize fake samples closely mimicking a target data distribution, while the second is a discriminator (D) that aims to detect whether a given sample has been drawn from the target distribution or not [Goodfellow2014a]. These subnetworks are trained alternately in a two-player zero-sum min-max game in an adversarial setup:

min_G max_D L_GAN(G, D) = E_y[log D(y)] + E_z[log(1 - D(G(z)))]   (1)

where L_GAN is the adversarial loss function, E denotes expectation, z denotes a random noise vector sampled from a prior distribution, and y denotes an arbitrary real sample drawn from the target domain. In practice, the log-likelihood terms are replaced with squared-loss terms to improve stability [lsgan]:

L_adv(G, D) = -E_y[(D(y) - 1)^2] - E_z[D(G(z))^2]   (2)

where D is trained to maximize L_adv, whereas G is trained to minimize it.
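For concreteness, the squared-loss objectives in Eq. (2) can be written as a minimal PyTorch sketch; the function names are illustrative, and the generator term uses the common (D(G(z)) - 1)^2 variant of the least-squares loss rather than the literal negated form:

```python
import torch

def lsgan_d_loss(D, real, fake):
    # D pushes real samples toward 1 and generated samples toward 0 (Eq. 2).
    return ((D(real) - 1) ** 2).mean() + (D(fake.detach()) ** 2).mean()

def lsgan_g_loss(D, fake):
    # Generator drives discriminator scores on fake samples toward the "real" label.
    return ((D(fake) - 1) ** 2).mean()
```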
 

While the basic GAN model synthesizes target data samples given a random noise input, recent studies in computer vision [pix2pix, cycleGAN] and medical imaging [mmgan, Dar2019, collagan, 3D_cgan, mr_tra_seg, synthetic_mri, mra_synth, darsynergistic, eagan, diamondgan] have demonstrated that conditional GAN (cGAN) models [condgans] are highly effective in image-to-image translation tasks. The central aim in these tasks is to synthesize data samples from the target image domain given data samples from a separate source image domain. The cGAN model is therefore modified to condition both G and D on the source-domain image:

L_cGAN(G, D) = -E_{x,y}[(D(x, y) - 1)^2] - E_x[D(x, G(x))^2]   (3)

where x denotes the source-domain image and y denotes the target-domain image. When paired images from the source and target domains are available, a pixel-wise loss between the ground-truth and synthesized images can also be included:

L_pix(G) = E_{x,y}[ ||y - G(x)||_1 ]   (4)

The pixel-wise loss is typically based on the mean-absolute error to reduce sensitivity to outliers and alleviate undesirable smoothing. The mapping learned by the cGAN model grows more accurate as the statistical dependence between source and target domains gets stronger.
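A minimal sketch of the combined generator objective in Eqs. (3)-(4) follows; the conditional discriminator signature D(x, y) and the weighting `lam` are assumed placeholders, not values from this work:

```python
import torch.nn.functional as F

def cgan_generator_loss(D, G, x, y, lam=100.0):
    # x: source-domain image, y: paired target image; lam is an assumed weight.
    y_hat = G(x)
    adv = ((D(x, y_hat) - 1) ** 2).mean()  # conditional least-squares adversarial term
    pix = F.l1_loss(y_hat, y)              # mean-absolute pixel-wise term (Eq. 4)
    return adv + lam * pix
```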

2.2 MR Image Recovery via Volumetric GANs

As MR images are intrinsically volumetric, a comprehensive approach for three-dimensional (3D) MR image recovery is to use volumetric GAN (vGAN) models that perform a global mapping between source and target volumes. To learn this mapping, vGAN models commonly employ complex generator and discriminator modules containing 3D convolutional kernels. The loss function is defined over the entire volume in an adversarial setup with a pixel-wise loss:

L_vGAN(G, D) = -E_{X,Y}[(D(X, Y) - 1)^2] - E_X[D(X, G(X))^2] + λ E_{X,Y}[ ||Y - G(X)||_1 ]   (5)

where X denotes the source volumetric image, Y denotes the target volumetric image, and λ weighs the pixel-wise term against the adversarial term. For MRI reconstruction, X is typically the Fourier reconstruction of undersampled acquisitions, and Y is the fully-sampled reference volume. For MRI synthesis, X is the source-contrast volume, and Y is the target-contrast volume. Note that, in MRI reconstruction, an additional constraint is introduced to enforce consistency of acquired and recovered k-space data:

Ŷ_dc = F^{-1}[ Λ ⊙ k_s + (1 - Λ) ⊙ F(G(X)) ]   (6)

where F denotes the Fourier transform, Λ denotes the binary sampling mask that defines the partial Fourier operator at the acquired k-space points, k_s denotes the acquired k-space data, and ⊙ denotes element-wise multiplication.
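This projection can be sketched as follows, assuming a binary sampling mask and the acquired k-space stored as complex tensors; the modern torch.fft API is used here for illustration, not the PyTorch 0.4 API of the original implementation:

```python
import torch

def enforce_data_consistency(pred_vol, kspace_acq, mask):
    # Keep acquired k-space samples, fill the rest from the network prediction (Eq. 6).
    k_pred = torch.fft.fftn(pred_vol, dim=(-3, -2, -1))
    k_dc = mask * kspace_acq + (1 - mask) * k_pred
    return torch.fft.ifftn(k_dc, dim=(-3, -2, -1))
```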
 
Due to their 3D nature, vGAN models can better incorporate contextual information across MRI volumes by leveraging spatial correlations across separate cross-sections [3D_cgan, 3d_synth_2, eagan]. This contextual prior can lead to elevated consistency across the volume and increased recovery accuracy. That said, learning in 3D network models is inherently more difficult since they involve substantially more parameters [3Dreview]. The learning process can be further impaired by data scarcity, as the entire volume of each subject is taken as a single training sample [3Dreview]. These limitations often cause vGAN models to settle on suboptimal parameter sets, compromising recovery performance.

2.3 MR Image Recovery via Cross-Sectional GANs

A more focused approach for 3D MRI recovery is based on cross-sectional GAN (sGAN) models that perform localized mappings between 2D cross-sectional images within source and target volumes. These 2D images are typically taken to be individual cross-sections within the volume in a specific rectilinear orientation, i.e., axial, sagittal, or coronal. To learn this 2D mapping, sGAN models employ relatively simple generator and discriminator modules containing 2D convolutional kernels. The loss function is defined for individual cross-sections in an adversarial setup with a pixel-wise loss:

L_sGAN(G, D) = -E_{x_i,y_i}[(D(x_i, y_i) - 1)^2] - E_{x_i}[D(x_i, G(x_i))^2] + λ E_{x_i,y_i}[ ||y_i - G(x_i)||_1 ]   (7)

where x_i and y_i denote the i-th cross-sections within the source and target volumes. As with vGAN, x_i-y_i are taken as cross-sectional images of undersampled and fully-sampled acquisitions in MRI reconstruction, and as cross-sectional images of source and target contrasts in MRI synthesis. Consistency between acquired and recovered data can again be enforced during reconstruction via the following procedure:

ŷ_i,dc = F^{-1}[ Λ ⊙ k_{s,i} + (1 - Λ) ⊙ F(G(x_i)) ]   (8)

where Λ denotes the binary sampling mask defining the partial Fourier operator at the acquired k-space points, and k_{s,i} denotes the acquired k-space data for the i-th cross-section. Once the mapping between the source and target cross-sections is learned, cross-sections of the target volume are independently generated, and the target volume is recovered by concatenating the generated cross-sections.
 
Due to their 2D nature, sGAN models are less complex and thus naturally have a lower demand for data. Individual cross-sections within a subject's volume are taken as separate training samples, expanding the effective size of the dataset. As a result, a more detailed cross-sectional mapping can be learned. However, this advantage comes at the expense of neglecting global contextual information across the volume [3D_cgan, 3d_synth_2, eagan]. Therefore, sGAN models might suffer from inconsistent or inaccurate recovery across cross-sections.
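The slice-and-recombine procedure can be sketched as below; `G` stands in for a trained 2D generator, and the helper names are illustrative rather than part of the original implementation:

```python
import torch

def volume_to_slices(vol, axis):
    # vol: (D, H, W) volume; returns cross-sections stacked along the first dimension.
    return torch.movedim(vol, axis, 0)

def slices_to_volume(slices, axis):
    # Inverse operation: re-stack recovered cross-sections into a volume.
    return torch.movedim(slices, 0, axis)

def recover_volume(G, source_vol, axis):
    xs = volume_to_slices(source_vol, axis)
    ys = torch.stack([G(x[None, None])[0, 0] for x in xs])  # independent 2D recoveries
    return slices_to_volume(ys, axis)
```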

2.4 Progressively Volumetrized GAN (ProvoGAN)

Here, a novel architecture is proposed to address the limitations of volumetric and cross-sectional GAN models. The proposed model, named progressively volumetrized GAN (ProvoGAN), decomposes complex volumetric image recovery tasks into a series of simpler cross-sectional tasks (Figure 1). The cross-sectional recovery tasks are defined in separate orientations and are implemented sequentially via cascaded 2D GAN models. We consider rectilinear cross-sections of volumetric MRI datasets in this study, so the selected orientations are axial, coronal, and sagittal. Given a specific order of the three orientations (o_1, o_2, o_3), ProvoGAN first learns a 2D recovery model in orientation o_1. The entire source volume is processed by this model to estimate the target volume. Afterwards, this volumetric estimate is separated into cross-sections in orientation o_2, and a separate 2D recovery model is trained. The estimated target volume for o_2 is then fed onto the final stage, where a third 2D recovery model is trained in orientation o_3.
 
The cascade of 2D models in three rectilinear orientations empowers ProvoGAN to progressively enhance recovery of fine-structural details in each individual orientation while enforcing contextual consistency across the volume. Therefore, ProvoGAN offers the ability to capture global contextual information without drastically expanding model complexity. The order of progression across the orientations is adaptively tuned to maximize recovery performance for individual tasks. We provide detailed formulations for the ProvoGAN model below.
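A conceptual sketch of the progressive cascade follows; `train_2d_gan` and `recover_volume_with_prior` are hypothetical helpers standing in for the per-progression training and volume re-stacking steps described above:

```python
# Conceptual sketch of the three-stage cascade; the two helpers below are
# hypothetical placeholders, not functions from the original implementation.
def provogan_recover(source_vol, target_vol, order=(0, 2, 1)):
    estimate = None  # no volumetric prior before the first progression
    for axis in order:
        # Later progressions condition on the previous volumetric estimate.
        G = train_2d_gan(source_vol, target_vol, prior=estimate, axis=axis)
        estimate = recover_volume_with_prior(G, source_vol, estimate, axis)
    return estimate  # final volumetric recovery after the third progression
```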
 

Figure 1: ProvoGAN decomposes complex volumetric image recovery tasks into a cascade of progressive cross-sectional tasks defined across the rectilinear orientations, i.e., axial, coronal, and sagittal. Given a specific progression sequence (axial → sagittal → coronal is shown here for demonstration), ProvoGAN first learns a cross-sectional mapping in the first orientation and processes cross-sections within the entire source volume to estimate the target volume. This volumetric estimate is then divided into cross-sections in the second orientation, and a separate cross-sectional model is learned in that orientation. The volumetric estimate from the second progression is then fed onto the final progression, in which a third cross-sectional model is learned for final recovery. The sequential implementation of the progressive cross-sectional models enables ProvoGAN to gradually improve capture of fine-structural details in each orientation while ensuring global contextual consistency within the volume.

First Progression: ProvoGAN first learns a cross-sectional mapping between the source-target volumes in o_1 via a generator (G_1) and a discriminator (D_1). The source and target cross-sections in o_1 are extracted with a division block (S_{o_1}):

{x_i^{o_1}}_{i=1}^{N_{o_1}} = S_{o_1}(X),  {y_i^{o_1}}_{i=1}^{N_{o_1}} = S_{o_1}(Y)   (9)

where X denotes the source volume, Y denotes the target volume, x_i^{o_1} denotes the i-th cross-section of the source volume in o_1, y_i^{o_1} denotes the i-th cross-section of the target volume in o_1, and N_{o_1} denotes the total number of cross-sections within the volumes in o_1. G_1 then learns to recover the cross-sections of the target volume from the corresponding cross-sections of the source volume:

ŷ_i^{o_1} = G_1(x_i^{o_1})   (10)

where ŷ_i^{o_1} denotes the i-th cross-section of the target volume in o_1 recovered via the first progression. Meanwhile, D_1 learns to distinguish between the real and fake cross-sections:

D_1(x_i^{o_1}, m) → [0, 1],  m ∈ {ŷ_i^{o_1}, y_i^{o_1}}   (11)

where m is either the generated (ŷ_i^{o_1}) or ground-truth (y_i^{o_1}) target cross-section. To simultaneously train G_1 and D_1, a loss function (L_1) consisting of adversarial and pixel-wise losses is used:

L_1(G_1, D_1) = -E[(D_1(x_i^{o_1}, y_i^{o_1}) - 1)^2] - E[D_1(x_i^{o_1}, G_1(x_i^{o_1}))^2] + λ E[ ||y_i^{o_1} - G_1(x_i^{o_1})||_1 ]   (12)

Once G_1 and D_1 are properly trained, cross-sections in o_1 of the target volume are independently generated, and then combined with a concatenation block (C_{o_1}) to recover the entire target volume:

Ŷ^{(1)} = C_{o_1}(ŷ_1^{o_1}, ŷ_2^{o_1}, ..., ŷ_{N_{o_1}}^{o_1})   (13)

where Ŷ^{(1)} denotes the target volume recovered after the first progression.
 
Second Progression: Having learned the cross-sectional mapping in o_1, ProvoGAN then learns a separate recovery model in the second orientation o_2 to gradually enhance capture of fine-structural details and spatial correlations. The prediction for the target volume generated in the first progression is also incorporated as an input to the generator G_2 to leverage global contextual priors:

ŷ_i^{o_2} = G_2(x_i^{o_2}, ŷ_i^{(1),o_2})   (14)

where x_i^{o_2} denotes the i-th cross-section of the source volume in o_2, ŷ_i^{(1),o_2} denotes the i-th cross-section in o_2 of the target volume recovered in the first progression, and ŷ_i^{o_2} denotes the i-th cross-section in o_2 of the target volume recovered in the second progression. The recovered cross-sections are concatenated via C_{o_2} to form the second volumetric estimate Ŷ^{(2)}. Meanwhile, the discriminator D_2 learns to distinguish between the generated and real cross-sections.
 
Third Progression: Lastly, ProvoGAN learns a cross-sectional mapping in the third orientation o_3. As in the second progression, the prediction from the previous progression is incorporated into the mapping as prior information. Therefore, the third generator G_3 receives as input the cross-sections in o_3 of the source volume and the previously recovered volume:

ŷ_i^{o_3} = G_3(x_i^{o_3}, ŷ_i^{(2),o_3})   (15)

where x_i^{o_3} denotes the i-th cross-section of the source volume in o_3, ŷ_i^{(2),o_3} denotes the i-th cross-section in o_3 of the target volume recovered in the second progression, and ŷ_i^{o_3} denotes the i-th cross-section in o_3 of the target volume recovered in the third progression. Meanwhile, the discriminator D_3 learns to distinguish between the generated and real cross-sections. The final output volume of the proposed method is recovered by combining the generated cross-sections in o_3 via a concatenation block (C_{o_3}):

Ŷ = C_{o_3}(ŷ_1^{o_3}, ŷ_2^{o_3}, ..., ŷ_{N_{o_3}}^{o_3})   (16)

where N_{o_3} denotes the total number of cross-sections in o_3. Note that, in MRI reconstruction, an additional constraint is introduced after each progression to enforce consistency of the acquired and recovered k-space data via the following procedure:

Ŷ_dc^{(p)} = F^{-1}[ Λ ⊙ k_s + (1 - Λ) ⊙ F(Ŷ^{(p)}) ]   (17)

where Λ denotes the binary sampling mask defining the partial Fourier operator at the acquired k-space points, and p ∈ {1, 2, 3} denotes the ongoing progression index.
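In implementation terms, conditioning the second- and third-progression generators on the previous volumetric estimate can be realized by channel-wise concatenation of matching cross-sections, as sketched below under that assumption:

```python
import torch

def generator_input(src_slice, prev_slice=None):
    # src_slice, prev_slice: (1, H, W) cross-sections from matching positions.
    if prev_slice is None:
        return src_slice.unsqueeze(0)                    # first progression: source only
    return torch.cat([src_slice, prev_slice], dim=0).unsqueeze(0)  # G_2 / G_3 input
```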

2.5 Network Architectures

To comparatively demonstrate the efficacy of ProvoGAN, we compared it with traditional volumetric and cross-sectional models. For the volumetric model, a three-dimensional (3D) GAN-based architecture referred to as volumetric GAN (vGAN) was implemented to learn a global mapping between source and target volumes (see Section 2.2 for formulations). For the cross-sectional model, a two-dimensional (2D) GAN-based architecture referred to as cross-sectional GAN (sGAN) was implemented to learn a localized mapping between cross-sections within the source and target volumes (see Section 2.3 for formulations). Separate sGAN models were trained for each individual rectilinear orientation, yielding three distinct models: sGAN-A for the axial orientation, sGAN-C for the coronal orientation, and sGAN-S for the sagittal orientation. The details of the network architectures of the proposed and competing methods are provided below.
 
ProvoGAN: The specific implementation of the generator and discriminator submodules within the proposed ProvoGAN model was inspired by a conditional GAN model recently demonstrated for multi-contrast MRI synthesis [Dar2019]. Each progression contained a generator and a discriminator built from convolution blocks (Conv) with 2D kernels, specified here by kernel size (k), number of filters (f), stride (s), and activation function (act), in sequence. Generator: Conv (k=7, f=24, s=1, act=ReLU), Conv (k=3, f=48, s=2, act=ReLU), Conv (k=3, f=96, s=2, act=ReLU), 9x ResNet Conv (k=3, f=96, s=1, act=ReLU), fractionally strided Conv (k=3, f=48, s=2, act=ReLU), fractionally strided Conv (k=3, f=24, s=2, act=ReLU), Conv (k=7, f=1, s=1, act=Tanh). Discriminator: Conv (k=4, f=24, s=2, act=leakyReLU), Conv (k=4, f=48, s=2, act=leakyReLU), Conv (k=4, f=96, s=2, act=leakyReLU), Conv (k=4, f=192, s=1, act=leakyReLU), Conv (k=4, f=2, s=1, act=none). The generator in the first progression received as input the cross-sections of the source volume, whereas the generators in the second and third progressions received as input the cross-sections of both the source volume and the previously recovered target volume. Meanwhile, the discriminators were conditional; they therefore received either the recovered or reference target cross-sections, concatenated with the inputs of the respective generators.
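A condensed PyTorch sketch of these layer stacks is given below; the padding values and the residual-block layout are assumptions made to keep spatial sizes consistent, not details specified in the text:

```python
import torch.nn as nn

class ResBlock(nn.Module):
    def __init__(self, ch=96):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 3, 1, 1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, 1, 1))
    def forward(self, x):
        return x + self.body(x)  # residual connection

def make_generator(in_ch=1):
    return nn.Sequential(
        nn.Conv2d(in_ch, 24, 7, 1, 3), nn.ReLU(inplace=True),   # k=7, f=24, s=1
        nn.Conv2d(24, 48, 3, 2, 1), nn.ReLU(inplace=True),      # k=3, f=48, s=2
        nn.Conv2d(48, 96, 3, 2, 1), nn.ReLU(inplace=True),      # k=3, f=96, s=2
        *[ResBlock(96) for _ in range(9)],                      # 9x ResNet blocks
        nn.ConvTranspose2d(96, 48, 3, stride=2, padding=1, output_padding=1),
        nn.ReLU(inplace=True),
        nn.ConvTranspose2d(48, 24, 3, stride=2, padding=1, output_padding=1),
        nn.ReLU(inplace=True),
        nn.Conv2d(24, 1, 7, 1, 3), nn.Tanh())                   # k=7, f=1, Tanh

def make_discriminator(in_ch=2):
    return nn.Sequential(
        nn.Conv2d(in_ch, 24, 4, 2, 1), nn.LeakyReLU(0.2, inplace=True),
        nn.Conv2d(24, 48, 4, 2, 1), nn.LeakyReLU(0.2, inplace=True),
        nn.Conv2d(48, 96, 4, 2, 1), nn.LeakyReLU(0.2, inplace=True),
        nn.Conv2d(96, 192, 4, 1, 1), nn.LeakyReLU(0.2, inplace=True),
        nn.Conv2d(192, 2, 4, 1, 1))                             # k=4, f=2, no activation
```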


 
vGAN: In vGAN, the generator and discriminator used in ProvoGAN were modified by replacing 2D convolutional kernels with 3D kernels. The generator received the source volume as input, whereas the conditional discriminator received either the reference or recovered target volume, concatenated with the source volume.
 
sGAN: The precise architectures of the generator and discriminator submodules within each sGAN model (i.e., sGAN-A, sGAN-C, and sGAN-S) were also adopted from the same state-of-the-art conditional GAN model [Dar2019] demonstrated for multi-contrast MRI synthesis, and are therefore identical to those used in the first progression of ProvoGAN.

2.6 Training Procedure

ProvoGAN: The training procedure of the proposed method comprised three progressive phases; within each phase, the respective pair of generator and discriminator networks was trained to learn cross-sectional recovery in the given orientation. Hyperparameter selection for all phases was adopted from [Dar2019]; the only exception was the total number of epochs, which was optimized via PSNR measurements on the validation set. The generator and discriminator were trained with the ADAM optimizer [adam]. The learning rate was held constant for the initial epochs and then linearly decayed over the final epochs. All training samples were used in every training epoch with a fixed batch size, and instance normalization was performed. The relative weighting of the pixel-wise loss against the adversarial loss followed [Dar2019].
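The constant-then-linear-decay schedule can be realized with a LambdaLR scheduler, as sketched below; all numeric values (epoch counts, base learning rate, ADAM betas) are assumed placeholders rather than the exact settings used here:

```python
import torch

G = make_generator()  # from the architecture sketch above
n_const, n_decay, base_lr = 50, 50, 2e-4  # assumed placeholder values

opt_g = torch.optim.Adam(G.parameters(), lr=base_lr, betas=(0.5, 0.999))
# Hold lr constant for n_const epochs, then decay linearly toward zero.
ramp = lambda e: 1.0 if e < n_const else max(0.0, 1.0 - (e - n_const) / n_decay)
sched_g = torch.optim.lr_scheduler.LambdaLR(opt_g, lr_lambda=ramp)

for epoch in range(n_const + n_decay):
    # ... one training epoch over all samples ...
    sched_g.step()
```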
 
vGAN: Hyperparameters of the vGAN model were optimized using PSNR measurements on the validation set. The generator and discriminator were trained with the ADAM optimizer. The learning rate was optimized separately for synthesis and reconstruction tasks, held constant over the initial epochs, and linearly decayed over the final epochs. All training samples were used in every training epoch with a fixed batch size. The relative weighting of the pixel-wise loss against the adversarial loss was likewise optimized separately for synthesis and reconstruction tasks.
 
sGAN: The training procedure of each sGAN model (sGAN-A, sGAN-C, and sGAN-S) was identical to that of the 2D models in ProvoGAN. Therefore, the hyperparameters optimized for the first progression of ProvoGAN were also used for the sGAN models.
 
In single-coil reconstruction, the models were trained to recover a magnitude image given the real and imaginary parts of the undersampled image. In multi-coil reconstruction, the networks were first trained to recover a coil-combined magnitude image given the real and imaginary parts of coil-combined Fourier reconstructions of undersampled acquisitions. A complex image was then formed by mapping the phase of the coil-combined undersampled image onto the predicted magnitude image. Coil combination was performed using sensitivity maps estimated via ESPIRiT [espirit]. A multi-coil complex image was obtained by projecting the coil-combined network prediction onto individual coils with the estimated sensitivity maps. Data consistency was enforced in the Fourier domain using the multi-coil complex images. In synthesis, networks were trained to recover the magnitude image of the target contrast given magnitude images of the source contrasts.
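The coil-combination and projection steps can be sketched as follows, assuming sensitivity maps `smaps` of shape (C, D, H, W) estimated beforehand (e.g., via ESPIRiT); the SENSE-style normalization is a standard choice adopted for illustration, not necessarily the exact one used here:

```python
import torch

def coil_combine(coil_imgs, smaps):
    # Weighted combination with conjugate sensitivities; shapes (C, D, H, W).
    return (coil_imgs * smaps.conj()).sum(0) / (smaps.abs() ** 2).sum(0).clamp_min(1e-8)

def coil_project(combined, smaps):
    # Project the coil-combined prediction back onto individual coils.
    return smaps * combined

def magnitude_to_complex(pred_mag, undersampled):
    # Map the phase of the coil-combined undersampled image onto the prediction.
    return pred_mag * torch.exp(1j * torch.angle(undersampled))
```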
 

All methods were implemented in Python 2.7 using PyTorch 0.4.0. The implementations were run on NVIDIA GeForce GTX 1080 Ti and GeForce RTX 2080 Ti GPUs. Code and data to replicate the ProvoGAN and competing models will be publicly available at http://github.com/icon-lab/mrirecon.

2.7 Datasets

We demonstrated the proposed ProvoGAN approach on two public datasets and an acquired in-vivo dataset. The first public dataset, IXI (https://brain-development.org/ixi-dataset/), consisted of coil-combined magnitude multi-contrast brain MR images of healthy subjects. The second public dataset, the knee dataset [kneedataset], consisted of multi-coil complex knee MR images of healthy subjects. The acquired in-vivo dataset contained multi-contrast brain MR images of both healthy subjects and glioma patients. Further details about each dataset are provided below:
 
The IXI Dataset: T1-, T2-, and PD-weighted brain MR images were used, with separate subjects reserved for training, validation, and testing. T1-weighted images were acquired sagittally, whereas T2- and PD-weighted images were acquired axially. Since the images of separate contrasts were spatially unregistered in this dataset, T2- and PD-weighted images were registered onto T1-weighted images using FSL [fsl_1, fsl_2] via an affine transformation. Registration was performed based on a mutual-information loss.
 
Knee Dataset: PD-weighted knee MR images were used, with separate subjects reserved for training, validation, and testing. Images were acquired sagittally.
 
In-vivo Dataset: T1-weighted, contrast-enhanced T1-weighted (T1c), T2-weighted, and FLAIR images of healthy subjects, glioma patients with homogeneous tumors, and glioma patients with heterogeneous tumors were used, with separate subjects reserved for training, validation, and testing. To prevent class imbalance, data augmentation was performed. MRI exams were performed in the Department of Radiology at Hacettepe University, Ankara, Turkey on Siemens and Philips scanners under a diverse set of protocols with varying spatial resolution and matrix size. All images were registered onto the MNI template of T1-weighted images at isotropic resolution. Registration was performed via FSL [fsl_1, fsl_2] using an affine transformation based on a mutual-information loss. Imaging protocols were approved by the local ethics committee at Hacettepe University, and all participants provided written informed consent.
 
For MRI reconstruction, volumes in the IXI and knee datasets were retrospectively undersampled with variable-density sampling patterns at acceleration factors R = 4, 8, 12, and 16. The sampling density across k-space was taken as a bivariate normal distribution with its mean at the center of k-space, and the variance of the distribution was adjusted to achieve the sampling rate expected for a given R. The in-plane orientation was axial. For MRI synthesis, all images in the IXI and in-vivo datasets were further skull stripped using FSL.
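A minimal sketch of generating such a variable-density mask follows; the calibration below simply rescales the Gaussian density to hit the target rate, a heuristic adopted for illustration since the exact variance-adjustment procedure is not detailed:

```python
import numpy as np

def vardens_mask(shape, R, sigma_frac=0.25, seed=0):
    """Binary 2D mask with ~1/R sampling rate, denser near the k-space center."""
    rng = np.random.default_rng(seed)
    ky, kx = np.meshgrid(np.linspace(-1, 1, shape[0]),
                         np.linspace(-1, 1, shape[1]), indexing="ij")
    density = np.exp(-(ky**2 + kx**2) / (2 * sigma_frac**2))  # bivariate normal density
    density *= (np.prod(shape) / R) / density.sum()           # rescale to target rate
    return rng.random(shape) < np.clip(density, 0, 1)
```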

2.8 Experiments

Task-Specific Optimization of Progression Order in ProvoGAN: Experiments were performed on ProvoGAN to optimize its progression order across the rectilinear orientations for specific tasks. To do this, multiple independent ProvoGAN models were trained while varying the progression order: 1) A → C → S, 2) A → S → C, 3) C → A → S, 4) C → S → A, 5) S → A → C, 6) S → C → A, where A denotes the axial, C the coronal, and S the sagittal orientation. Performance of these models was evaluated on the validation set with PSNR measurements (see the sketch below). The experiments were performed separately for all synthesis and reconstruction tasks, and the progression orders optimized for specific tasks were used in all evaluations thereafter.
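This task-specific order search can be sketched as follows, with `train_provogan` and `validation_psnr` as hypothetical helpers:

```python
from itertools import permutations

def best_progression_order(train_data, val_data):
    # `train_provogan` and `validation_psnr` are hypothetical helpers.
    scores = {}
    for order in permutations(("A", "C", "S")):  # all six rectilinear orders
        model = train_provogan(train_data, order)
        scores[order] = validation_psnr(model, val_data)
    return max(scores, key=scores.get)  # order with the highest validation PSNR
```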
 
MRI Reconstruction: Reconstruction experiments were performed on the IXI and knee datasets. In the IXI dataset, the proposed and competing methods were demonstrated separately for single-coil reconstruction of T1- and T2-weighted images with four distinct acceleration factors (R = 4, 8, 12, 16). Meanwhile, in the knee dataset, the proposed and competing methods were demonstrated for multi-coil reconstruction of PD-weighted images, again with R = 4, 8, 12, 16. Note that a single sGAN model was trained in the axial orientation (sGAN-A), given the axial readout direction.
 
MRI Synthesis: Synthesis experiments were performed on the IXI and in-vivo datasets. In the IXI dataset, the proposed and competing methods were demonstrated for one-to-one and many-to-one synthesis tasks. For one-to-one synthesis, six distinct tasks were considered: 1) T1 → T2, 2) T1 → PD, 3) T2 → T1, 4) T2 → PD, 5) PD → T1, 6) PD → T2. For many-to-one synthesis, three distinct tasks were considered: 1) T2, PD → T1, 2) T1, PD → T2, 3) T1, T2 → PD. Meanwhile, in the in-vivo dataset, the proposed and competing methods were demonstrated for many-to-one synthesis with four distinct tasks: 1) T2, FLAIR, T1c → T1, 2) T1, FLAIR, T1c → T2, 3) T1, T2, T1c → FLAIR, 4) T1, T2, FLAIR → T1c. Note that three independent sGAN models were implemented, each trained to recover target cross-sections in a separate orientation: sGAN-A for the axial, sGAN-C for the coronal, and sGAN-S for the sagittal orientation.
 
The same sets of subjects were used across all experiments for training and testing the proposed and competing methods. The recovered target volumes were compared against the ground-truth target volumes via PSNR and SSIM.
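The volumetric evaluation can be sketched with the standard scikit-image metrics (an implementation assumption):

```python
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate(recovered, reference):
    # Volumetric PSNR / SSIM against the ground-truth target volume.
    rng = reference.max() - reference.min()
    return (peak_signal_noise_ratio(reference, recovered, data_range=rng),
            structural_similarity(reference, recovered, data_range=rng))
```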

            ProvoGAN                  sGAN                      vGAN
            PSNR         SSIM         PSNR         SSIM         PSNR         SSIM
R=4   T1    35.64±1.65   0.97±0.005   33.44±1.27   0.93±0.008   29.86±1.27   0.86±0.010
      T2    37.22±1.94   0.97±0.005   35.21±1.54   0.94±0.007   31.57±1.28   0.90±0.016
R=8   T1    32.45±1.27   0.96±0.007   30.44±1.37   0.91±0.011   29.64±1.03   0.88±0.015
      T2    33.23±2.64   0.96±0.010   32.34±1.90   0.91±0.009   30.68±1.02   0.91±0.012
R=12  T1    30.44±1.01   0.93±0.011   27.80±1.05   0.87±0.014   26.60±0.92   0.82±0.014
      T2    31.18±1.42   0.94±0.010   30.27±1.38   0.89±0.012   28.68±0.75   0.89±0.010
R=16  T1    29.23±1.45   0.92±0.012   26.90±1.40   0.86±0.017   27.47±1.45   0.83±0.019
      T2    30.92±1.56   0.94±0.011   29.56±1.40   0.89±0.014   27.98±0.67   0.86±0.013

Volumetric PSNR and SSIM measurements between the reconstructed and ground-truth images in the IXI dataset are given as mean ± std across the test set. The measurements are reported for the proposed ProvoGAN and the competing sGAN and vGAN methods at four acceleration factors (R = 4, 8, 12, 16). Boldface indicates the best-performing method.

Table 1: Quality of Reconstruction in the IXI Dataset

3 Results

3.1 Optimizing task-specific progression order can enhance ProvoGAN’s performance

In image recovery tasks, the tissue distributions and spatial correlations of the given source and target volumes may vary uniquely across orientations for a given mapping. We therefore predicted that the progression sequence in ProvoGAN can significantly affect task-specific recovery performance. To test this prediction, we performed reconstruction and synthesis experiments separately on the IXI, in-vivo, and knee datasets (see Section 2.8 for further details). In the experiments, we comparatively evaluated the performance of multiple independent ProvoGAN models while varying the progression sequence: 1) A → C → S, 2) A → S → C, 3) C → A → S, 4) C → S → A, 5) S → A → C, 6) S → C → A, where A denotes the axial, C the coronal, and S the sagittal orientation. Here, we considered volumetric PSNR measurements between the recovered and reference target volumes within the validation set. In reconstruction, the highest- and lowest-performing ProvoGAN models yielded notable average PSNR differences for single-coil reconstruction tasks in IXI and for multi-coil reconstruction tasks in the knee dataset (see Supp. Table 1 and Supp. Table 2 for details). Meanwhile, in synthesis, clear average PSNR differences between the highest- and lowest-performing ProvoGAN models were observed for one-to-one and many-to-one synthesis tasks in IXI, and for many-to-one synthesis tasks in the in-vivo dataset (see Supp. Table 3, Supp. Table 4, and Supp. Table 5 for details). These results indicate that proper optimization of the progression order across orientations significantly enhances recovery performance. Therefore, the progression orders optimized for specific recovery tasks were used in all evaluations thereafter.

Figure 2: The proposed method is demonstrated on the multi-coil knee dataset for the reconstruction task. Representative results are displayed for all competing methods together with the undersampled source images (first column) and the ground-truth target images (second column). The first two rows display results for the axial orientation, the next two rows for the coronal orientation, and the last two rows for the sagittal orientation. Overall, the proposed method delineates tissues with higher spatial resolution compared to vGAN, and alleviates discontinuity artifacts by improving reconstruction performance in all orientations compared to sGAN.
       ProvoGAN                  sGAN                      vGAN
       PSNR         SSIM         PSNR         SSIM         PSNR         SSIM
R=4    40.75±1.35   0.96±0.009   40.34±1.43   0.96±0.009   36.80±1.69   0.93±0.014
R=8    39.45±2.15   0.95±0.011   38.73±1.01   0.94±0.010   30.83±1.44   0.88±0.028
R=12   36.99±1.29   0.94±0.010   36.76±1.18   0.92±0.010   29.03±1.49   0.88±0.018
R=16   37.86±0.50   0.92±0.015   35.11±1.06   0.89±0.016   28.99±2.21   0.86±0.023

Volumetric PSNR and SSIM measurements between the reconstructed and ground-truth images in the knee dataset are given as mean ± std across the test set. The measurements are reported for the proposed ProvoGAN and the competing sGAN and vGAN methods at four acceleration factors (R = 4, 8, 12, 16). Boldface indicates the best-performing method.

Table 2: Quality of Reconstruction in the Knee Dataset

3.2 ProvoGAN improves reconstruction of accelerated MRI acquisitions

While reconstructing fully-sampled images from undersampled acquisitions, volumetric models can enhance the capture of global spatial correlations by accumulating contextual priors across separate cross-sections. Yet, volumetric models are intrinsically difficult to train due to the increased number of network parameters they involve, and as a result they may not effectively recover fine-structural details. Cross-sectional models, on the other hand, utilize contextual priors localized to individual cross-sections in a specific orientation. They can therefore learn a more focused, detailed mapping with locally improved recovery quality. However, this can also lead to inconsistency across separate cross-sections and poor recovery of fine-structural details in the remaining orientations. Given ProvoGAN's ability to progressively leverage contextual priors without significantly expanding model complexity, we anticipated that it can alleviate the limitations of both volumetric and cross-sectional models to enhance reconstruction quality.
 
Accordingly, we performed experiments to comparatively demonstrate ProvoGAN for MRI reconstruction against state-of-the-art cross-sectional (sGAN) and volumetric (vGAN) models. We considered single-coil reconstruction tasks on IXI and multi-coil reconstruction tasks on the knee dataset across a broad range of acceleration factors (R = 4, 8, 12, 16). Performance was evaluated via volumetric PSNR and SSIM measurements between the reconstructed and reference fully-sampled images within the test set. In IXI, ProvoGAN achieves higher PSNR and SSIM than the second-best performing method in each single-coil reconstruction task, on average (see Table 1 for details). Meanwhile, in the knee dataset, ProvoGAN again yields enhanced recovery, with higher PSNR and SSIM than the second-best performing method in each multi-coil reconstruction task, on average (see Table 2 for details). The superior reconstruction performance of ProvoGAN is also clearly visible in representative results displayed in Supp. Fig. 1 for IXI and Fig. 2 for the knee dataset. Overall, sGAN suffers from discontinuity artifacts across individually recovered cross-sections and degraded capture of fine-structural details, whereas vGAN suffers from loss of spatial resolution within the reconstructed volumes due to noticeable over-smoothing. Meanwhile, ProvoGAN reconstructs the target volumes with elevated consistency across cross-sections in all orientations and offers sharper, more accurate delineation of brain and knee tissues. These results indicate that ProvoGAN can mitigate the limitations of previous volumetric and cross-sectional models, yielding enhanced performance for accelerated MRI reconstruction.

Figure 3: The proposed method is demonstrated on the IXI dataset for T1-weighted image synthesis from T2-weighted images. Representative results are displayed for all competing methods together with the ground-truth target images (first column). The first row displays results for the axial orientation, the second row for the coronal orientation, and the third row for the sagittal orientation. Overall, the proposed method delineates tissues with higher spatial resolution compared to vGAN, and alleviates discontinuity artifacts by improving synthesis performance in all orientations compared to sGAN.
               ProvoGAN     sGAN-A       sGAN-C       sGAN-S       vGAN
T2 → T1  PSNR  24.53±2.49   22.38±2.28   22.99±2.76   23.66±2.25   22.71±3.23
         SSIM  0.90±0.04    0.85±0.04    0.86±0.04    0.87±0.04    0.84±0.04
PD → T1  PSNR  23.69±3.04   22.26±2.24   22.60±2.54   23.05±2.84   22.65±2.49
         SSIM  0.89±0.04    0.86±0.04    0.87±0.04    0.87±0.04    0.83±0.06
T1 → T2  PSNR  24.26±2.59   23.46±1.99   23.62±2.16   23.97±2.54   22.95±1.74
         SSIM  0.86±0.06    0.84±0.05    0.85±0.05    0.85±0.06    0.83±0.04
PD → T2  PSNR  28.19±2.66   27.75±2.43   27.57±2.62   27.58±2.21   24.90±1.48
         SSIM  0.93±0.04    0.93±0.04    0.92±0.04    0.93±0.04    0.80±0.11
T1 → PD  PSNR  27.06±2.05   25.28±1.30   25.70±1.36   25.75±1.48   24.24±1.16
         SSIM  0.91±0.04    0.88±0.03    0.89±0.04    0.89±0.04    0.83±0.07
T2 → PD  PSNR  29.57±2.74   28.75±2.45   28.56±2.33   28.61±2.05   26.47±1.76
         SSIM  0.95±0.03    0.94±0.03    0.93±0.03    0.94±0.02    0.90±0.03

Volumetric PSNR and SSIM measurements between the synthesized and ground-truth images in the test set of the IXI dataset are given as mean ± std. The measurements are reported for the proposed ProvoGAN and the competing sGAN and vGAN methods for all one-to-one synthesis tasks: 1) T1 → T2, 2) T1 → PD, 3) T2 → T1, 4) T2 → PD, 5) PD → T1, 6) PD → T2. sGAN-A denotes the sGAN model trained in the axial orientation, sGAN-C in the coronal orientation, and sGAN-S in the sagittal orientation. Boldface indicates the highest-performing method.

Table 3: Quality of Synthesis for One-to-one Mappings in the IXI Dataset

3.3 ProvoGAN improves synthesis of multi-contrast MR images

In multi-contrast MRI synthesis, volumetric models enable improved capture of comprehensive spatial correlations across the source- and target-contrast volumes by leveraging global contextual information. However, training volumetric models can prove inherently difficult due to increased model complexity; in turn, images synthesized with volumetric models may suffer from loss of fine-structural details and deteriorated spatial resolution. Cross-sectional models, on the other hand, learn a localized mapping between cross-sections of the source and target contrasts in a specific rectilinear orientation, and can offer locally improved recovery of fine-structural details in the transverse dimensions. However, they inevitably neglect global contextual priors, which limits overall synthesis quality and introduces discontinuity artifacts across separately recovered cross-sections. We predicted that ProvoGAN can effectively alleviate the limitations of both previous approaches, and investigated this claim by comparatively demonstrating ProvoGAN against state-of-the-art cross-sectional (sGAN) and volumetric (vGAN) models.
 
We separately performed experiments on the IXI and in-vivo datasets for one-to-one and many-to-one synthesis tasks (see Section 2.8 for details). To evaluate the performance of the proposed and competing methods, we considered volumetric PSNR and SSIM measurements between the synthesized and reference target images. On average, ProvoGAN achieves higher PSNR and SSIM than the second-best performing method for both one-to-one and many-to-one synthesis tasks in the IXI dataset (see Table 3 and Table 4 for details). Similarly, in the in-vivo dataset, ProvoGAN yields higher PSNR and SSIM than the second-best performing method for many-to-one synthesis tasks, on average (see Table 5 for details). The elevated synthesis quality offered by ProvoGAN is also clearly visible in representative results displayed in Fig. 3 and Supp. Fig. 2 for the IXI dataset and Supp. Fig. 3 for the in-vivo dataset. These results indicate that sGAN suffers from suboptimal recovery in the longitudinal dimension due to separate synthesis of cross-sections, whereas vGAN suffers from poor recovery of fine-structural details and loss of spatial resolution in the target images. Meanwhile, ProvoGAN alleviates discontinuity artifacts by pooling global contextual information via progressive implementation of the cross-sectional tasks, and offers sharper depiction of tissues and tumor regions owing to its reduced model complexity. These results suggest that ProvoGAN can achieve enhanced synthesis quality for multi-contrast MRI.

                   ProvoGAN     sGAN-A       sGAN-C       sGAN-S       vGAN
T2, PD → T1  PSNR  24.15±2.80   23.20±2.08   22.58±2.11   23.65±1.98   23.35±2.89
             SSIM  0.90±0.05    0.86±0.04    0.87±0.04    0.88±0.04    0.85±0.04
T1, PD → T2  PSNR  28.97±2.91   27.64±2.59   27.74±2.67   27.93±2.19   25.97±1.81
             SSIM  0.94±0.04    0.92±0.04    0.93±0.04    0.93±0.03    0.91±0.04
T1, T2 → PD  PSNR  29.81±2.96   27.69±2.20   29.00±2.41   27.12±1.61   26.17±1.41
             SSIM  0.95±0.03    0.94±0.03    0.94±0.03    0.93±0.03    0.91±0.02

Volumetric PSNR and SSIM measurements between the synthesized and ground-truth images in the test set of the IXI dataset are given as mean ± std. The measurements are provided for the proposed and competing methods for all many-to-one synthesis tasks: 1) T2, PD → T1, 2) T1, PD → T2, 3) T1, T2 → PD. sGAN-A denotes the sGAN model trained in the axial orientation, sGAN-C in the coronal orientation, and sGAN-S in the sagittal orientation. Boldface indicates the highest-performing method.

Table 4: Quality of Synthesis for Many-to-one Mappings in the IXI Dataset

4 Discussion

In this study, a progressively volumetrized generative model (ProvoGAN) was introduced for MR image recovery that decomposes complex volumetric image recovery into a series of simpler cross-sectional tasks across rectilinear orientations. This decomposition enables ProvoGAN to efficiently learn and leverage global contextual priors while enhancing recovery of fine-structural details in each orientation. Comprehensive evaluations on three distinct MRI datasets illustrated the superior performance of the proposed method against state-of-the-art volumetric and cross-sectional models. While ProvoGAN was demonstrated here mainly for MRI synthesis and reconstruction tasks, the same framework can be adopted for other modalities and tasks where volumetric processing is vital.

                          ProvoGAN     sGAN-A       sGAN-C       sGAN-S       vGAN
T2, FLAIR, T1c → T1 PSNR  26.92±4.55   24.17±3.83   25.31±4.12   26.22±3.09   22.73±3.69
                    SSIM  0.94±0.03    0.88±0.05    0.91±0.04    0.91±0.03    0.88±0.04
T1, FLAIR, T1c → T2 PSNR  26.87±2.40   25.67±1.75   25.98±2.18   26.85±2.38   25.48±1.82
                    SSIM  0.93±0.04    0.90±0.03    0.91±0.04    0.92±0.04    0.90±0.04
T1, T2, T1c → FLAIR PSNR  26.35±2.54   24.50±1.84   24.95±2.03   24.81±2.21   22.94±1.61
                    SSIM  0.92±0.03    0.87±0.03    0.88±0.03    0.88±0.03    0.86±0.03
T1, T2, FLAIR → T1c PSNR  29.67±2.23   28.53±1.98   28.48±2.11   28.75±2.23   27.21±1.46
                    SSIM  0.94±0.02    0.92±0.02    0.90±0.03    0.92±0.02    0.89±0.02

Volumetric PSNR and SSIM measurements between the synthesized and ground-truth images in the test set of the in-vivo dataset are given as mean ± std. The measurements are provided for the proposed and competing methods for all many-to-one synthesis tasks: 1) T2, FLAIR, T1c → T1, 2) T1, FLAIR, T1c → T2, 3) T1, T2, T1c → FLAIR, 4) T1, T2, FLAIR → T1c. sGAN-A denotes the sGAN model trained in the axial orientation, sGAN-C in the coronal orientation, and sGAN-S in the sagittal orientation. Boldface indicates the highest-performing method.

Table 5: Quality of Synthesis for Many-to-one Mappings in the in-vivo Dataset

A few recent studies have proposed spatially focused 3D patch-based learning models for volumetric MR image recovery [patch_based_one_to_one_2, patch_based_one_to_one_3, patch_based_one_to_one_4, unsup_cross_model_synth, dictionary_one_to_one_1, dict_learning_im_synth, modality_prop, Jog2017b, example_based, les_seg, ex_mod_prop], which reduce overall computational complexity and dependency on large training sets. However, patch-based approaches suffer from loss of global contextual information and inconsistencies across separately recovered 3D patches. Other studies have trained parallel 2D networks across each orientation followed by fusion, for super-resolution [provolike_1] and segmentation [provolike_2]. In [provolike_1], initial training across each low-resolution orientation is performed separately via a shared network, followed by separate training of a fusion network. In [provolike_2], features extracted from each rectilinear orientation surrounding a target voxel are concatenated at the last layer prior to classification. While both approaches mitigate limitations of traditional cross-sectional and volumetric models, the initial parallel training of streams without information sharing among orientations [provolike_1, provolike_2] might render performance suboptimal. ProvoGAN, in contrast, performs image recovery in an empirically tuned progression order, where the network dedicated to each orientation uses information recovered in the previous progression. Moreover, unlike [provolike_1, provolike_2], ProvoGAN is based on GANs, which have been shown to produce realistic images of remarkable visual quality [pix2pix].
 
Several avenues can be pursued to further improve the performance and reliability of ProvoGAN. Here, ProvoGAN was trained in a fully-supervised framework, which assumes the availability of datasets containing fully-sampled reference images. However, collecting such datasets might be impractical due to constraints inherent to MR imaging, such as patient discomfort and rapid signal decay. In such cases, ProvoGAN can be trained in a self-supervised setting [selfsupervised] to reduce or remove the dependency on fully-sampled training datasets. Another avenue of development concerns generalizing ProvoGAN to nonrectilinear orientations. While ProvoGAN was demonstrated here for rectilinear acquisitions, similar decompositions may be viable for nonrectilinear sampling schemes in MRI, such as radial and spiral acquisitions; implementation of recovery models in non-Cartesian domains can be achieved via gridding algorithms for data resampling [gridding]. ProvoGAN can also be modified to learn from unpaired datasets by employing models based on a cycle-consistency loss [cycleGAN]. In addition, instead of training each progression sequentially, end-to-end training of the whole network could be performed for improved performance by leveraging advanced model-parallelism techniques [parallelism_1].

5 Acknowledgments

This study was supported in part by a TUBITAK 1001 Research Grant (118E256), an EMBO Installation Grant (3028), a TUBA GEBIP 2015 fellowship, a BAGEP 2017 fellowship, and an NVIDIA GPU grant.

References

6 Supplementary Materials

         A→C→S        A→S→C        S→A→C        S→C→A        C→S→A        C→A→S
R=4  T1  36.15±1.10   36.41±1.27   36.40±1.31   35.97±1.60   35.92±1.23   36.45±1.84
     T2  36.24±2.10   37.04±2.13   38.07±0.88   36.77±1.33   38.09±1.69   36.76±3.32
R=8  T1  33.74±0.56   33.48±0.73   33.30±1.14   33.49±0.69   33.01±1.19   33.25±0.59
     T2  34.89±0.80   33.68±2.72   34.84±1.72   34.83±0.67   34.32±2.83   33.03±2.49
R=12 T1  30.28±1.10   30.04±0.98   30.42±1.12   30.72±0.82   30.54±0.84   30.72±0.92
     T2  31.02±0.95   31.45±1.83   32.05±2.42   32.03±2.42   32.00±2.51   32.39±0.73
R=16 T1  30.51±1.20   29.48±1.12   30.45±0.55   30.39±0.68   29.45±0.79   29.35±1.37
     T2  31.53±0.62   30.84±2.05   31.12±2.01   31.61±2.12   31.32±1.99   31.24±1.36

Volumetric PSNR measurements between the reconstructed and ground-truth images in the validation set are given as mean ± std. The measurements are provided for all six progression orders (A→C→S, A→S→C, S→A→C, S→C→A, C→S→A, C→A→S) and all acceleration factors (R = 4, 8, 12, 16). Boldface indicates the highest-performing progression sequence.

Table 1: Quality of Reconstruction in the IXI Dataset
       A→C→S        A→S→C        S→A→C        S→C→A        C→S→A        C→A→S
R=4    37.56±1.09   37.91±0.49   39.72±1.63   38.73±1.60   38.18±1.99   38.27±1.52
R=8    38.80±1.39   38.25±1.13   36.38±1.09   39.95±0.64   37.44±1.18   35.59±1.19
R=12   35.20±0.04   38.36±0.42   37.22±1.29   39.14±0.44   37.49±0.81   37.61±1.07
R=16   37.26±1.19   35.05±3.30   37.38±1.10   38.01±0.27   38.28±0.25   37.49±0.94

Volumetric PSNR measurements between the reconstructed and ground-truth images in the validation set are given as mean ± std. The measurements are provided for all six progression orders (A→C→S, A→S→C, S→A→C, S→C→A, C→S→A, C→A→S) and all acceleration factors (R = 4, 8, 12, 16). Boldface indicates the highest-performing progression sequence.

Table 2: Quality of Reconstruction in the Multi-Coil Knee Dataset
          A→C→S        A→S→C        S→A→C        S→C→A        C→S→A        C→A→S
T2 → T1   24.42±2.19   24.98±2.48   25.00±2.06   24.97±2.07   24.74±1.82   24.64±1.80
PD → T1   24.74±2.25   24.77±2.30   25.63±2.47   25.56±2.45   25.11±1.82   25.66±2.11
T1 → T2   26.18±0.39   26.17±0.31   26.24±0.81   26.31±0.77   26.31±0.89   26.39±0.96
PD → T2   29.48±1.31   29.73±1.31   28.89±1.58   28.91±1.28   29.42±1.27   29.73±1.30
T1 → PD   27.17±0.75   28.10±0.90   27.37±0.61   27.31±0.57   27.53±0.58   27.69±0.66
T2 → PD   30.32±1.15   30.37±1.13   30.10±0.95   30.01±0.92   30.19±1.28   30.29±1.41

Volumetric PSNR measurements between the synthesized and ground-truth images in the validation set are given as mean ± std. The measurements are provided for all six progression orders (A→C→S, A→S→C, S→A→C, S→C→A, C→S→A, C→A→S) and all one-to-one synthesis tasks: 1) T1 → T2, 2) T1 → PD, 3) T2 → T1, 4) T2 → PD, 5) PD → T1, 6) PD → T2. Boldface indicates the highest-performing progression sequence.

Table 3: Quality of Synthesis in the IXI Dataset for One-to-one Mappings
              A→C→S        A→S→C        S→A→C        S→C→A        C→S→A        C→A→S
T2, PD → T1   23.38±2.38   23.75±2.59   24.24±3.81   24.01±3.65   25.35±2.23   24.90±2.06
T1, PD → T2   28.35±1.65   28.34±1.67   28.50±1.16   28.37±1.04   29.51±1.41   28.62±1.08
T1, T2 → PD   30.42±1.29   31.50±1.61   30.46±1.08   30.50±1.14   31.44±1.87   30.25±1.51

Volumetric PSNR measurements between the synthesized and ground-truth images in the validation set are given as mean ± std. The measurements are provided for all six progression orders (A→C→S, A→S→C, S→A→C, S→C→A, C→S→A, C→A→S) and all many-to-one synthesis tasks: 1) T2, PD → T1, 2) T1, PD → T2, 3) T1, T2 → PD. Boldface indicates the highest-performing progression sequence.

Table 4: Quality of Synthesis in the IXI Dataset for Many-to-one Mappings
                      A→C→S        A→S→C        S→A→C        S→C→A        C→S→A        C→A→S
T2, FLAIR, T1c → T1   24.48±2.91   24.49±2.85   24.48±3.32   24.45±3.27   25.63±3.53   24.81±3.14
T1, FLAIR, T1c → T2   27.37±2.96   27.14±2.86   27.66±2.91   27.19±2.76   26.96±2.81   27.68±2.99
T1, T2, T1c → FLAIR   24.93±2.40   25.58±2.47   24.90±3.36   25.35±3.45   25.51±2.93   25.58±3.09
T1, T2, FLAIR → T1c   28.83±2.08   29.88±2.44   28.75±2.54   28.83±2.34   28.94±2.26   28.43±2.14

Volumetric PSNR measurements between the synthesized and ground-truth images in the validation set are given as mean ± std. The measurements are provided for all six progression orders (A→C→S, A→S→C, S→A→C, S→C→A, C→S→A, C→A→S) and all many-to-one synthesis tasks: 1) T2, FLAIR, T1c → T1, 2) T1, FLAIR, T1c → T2, 3) T1, T2, T1c → FLAIR, 4) T1, T2, FLAIR → T1c. Boldface indicates the highest-performing progression sequence.

Table 5: Quality of Synthesis in the in-vivo Dataset for Many-to-one Mappings
Figure 1: The proposed method is demonstrated on the IXI dataset for the reconstruction task with acceleration ratio R=8. Representative results are displayed for all competing methods together with the undersampled source images (first column) and the ground-truth target images (second column). The first row displays results for the axial orientation, the second row for the coronal orientation, and the third row for the sagittal orientation. Overall, the proposed method delineates tissues with higher spatial resolution compared to vGAN, and alleviates discontinuity artifacts by improving reconstruction performance in all orientations compared to sGAN.
Figure 2: The proposed method is demonstrated on the IXI dataset for T1-weighted image synthesis from T2- and PD-weighted images. Representative results are displayed for all competing methods together with the ground-truth target images (first column). The first row displays results for the axial orientation, the second row for the coronal orientation, and the third row for the sagittal orientation. Overall, the proposed method delineates tissues with higher spatial resolution compared to vGAN, and alleviates discontinuity artifacts by improving synthesis performance in all orientations compared to sGAN.
Figure 3: The proposed method is demonstrated on the in-vivo dataset for T1-weighted image synthesis from T2-, T1c-weighted, and FLAIR images. Representative results are displayed for all competing methods together with the ground-truth target images (first column). The first row displays results for the axial orientation, the second row for the coronal orientation, and the third row for the sagittal orientation. Overall, the proposed method delineates tissues with higher spatial resolution compared to vGAN, and alleviates discontinuity artifacts by improving synthesis performance in all orientations compared to sGAN. Meanwhile, the proposed method achieves more accurate depiction of tumor regions, which are suboptimally recovered by the competing methods.