Markov-Chain Monte Carlo Approximation of the Ideal Observer using Generative Adversarial Networks

01/26/2020 ∙ by Weimin Zhou, et al. ∙ University of Illinois at Urbana-Champaign

The performance of the Ideal Observer (IO) has been advocated as a figure-of-merit when optimizing medical imaging systems for signal detection tasks. However, analytical computation of the IO test statistic is generally intractable. To approximate the IO test statistic, sampling-based methods that employ Markov-Chain Monte Carlo (MCMC) techniques have been developed. However, current applications of MCMC techniques have been limited to a few object models, such as a lumpy object model and a binary texture model, and it remains unclear how MCMC methods can be implemented with other, more sophisticated object models. Deep learning methods that employ generative adversarial networks (GANs) hold great promise for learning stochastic object models (SOMs) from image data. In this study, we describe a method to approximate the IO by applying MCMC techniques to SOMs learned by use of GANs. The proposed method can be employed with arbitrary object models that can be learned by use of GANs, thereby extending the domain of applicability of MCMC techniques for approximating the IO performance. Both signal-known-exactly (SKE) and signal-known-statistically (SKS) binary signal detection tasks are considered. The IO performance computed by the proposed method is compared to that computed by the conventional MCMC method. The advantages of the proposed method are discussed.


1 Introduction

It has been widely accepted that task-based measures of image quality (IQ) should be employed for assessing and optimizing medical imaging systems[barrett2013foundations]. Task-based measures of IQ quantify the ability of an observer to perform specific tasks[kupinski2001ideal, barrett2013foundations, zhou2018learning, zhou2019learning, zhou2019learningHO, zhou2019approximating]. When imaging systems are optimized to maximize the amount of task-specific information in the measured images, the performance of the Ideal Observer (IO) can be employed as a figure-of-merit (FOM)[barrett2013foundations, kupinski2001ideal, zhou2018learning, zhou2019learning, zhou2019approximating]. However, for binary signal detection tasks, computing the IO test statistic requires calculating the likelihood ratio, which is generally analytically intractable. To approximate the IO test statistic, sampling-based methods using Markov-Chain Monte Carlo (MCMC) techniques [kupinski2003ideal, park2003ideal] have been developed. However, current applications of MCMC techniques are limited to relatively simple stochastic object models (SOMs) such as the lumpy background model[kupinski2001ideal] and the binary texture model[abbey2008ideal], and it remains unclear how MCMC methods can be implemented with other, more sophisticated object models.

Deep-learning methods employing generative adversarial networks (GANs) [goodfellow2014generative, zhou2019learningSOM] hold great promise to learn SOMs that describe the variability in the class of objects to-be-imaged. GANs comprise a generator and a discriminator. By playing a two-player minimax game between the generator and the discriminator, the distribution learned by the generator can approximate the distribution corresponding to the training data [goodfellow2014generative]. One subsequently can generate new images by sampling latent vectors that constitute the input to the generator. A latent vector is typically a low-dimensional random vector that follows a simple distribution such as the normal distribution or uniform distribution.

In this study, inspired by the MCMC algorithm developed by Kupinski et al.[kupinski2003ideal], we propose a novel methodology for approximating the IO by applying MCMC techniques to SOMs learned by use of GANs. Specifically, a GAN is trained on a set of object images to establish a SOM, and MCMC techniques are subsequently applied to the GAN-represented SOM to compute the likelihood ratio. As a proof-of-concept, a lumpy background model and an idealized parallel-hole collimator system were considered. Both signal-known-exactly (SKE) and signal-known-statistically (SKS) binary signal detection tasks were considered. Receiver operating characteristic (ROC) curves and the area under the ROC curve (AUC) values corresponding to the proposed MCMC-GAN algorithm are compared to those corresponding to the conventional MCMC method. The potential advantages of the proposed method are discussed.

2 Background

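Consider a binary signal detection task that requires an observer to classify an image $\mathbf{g}$ as satisfying a signal-absent hypothesis ($H_0$) or a signal-present hypothesis ($H_1$). The imaging processes can be represented as:

$$H_0:\ \mathbf{g} = \mathbf{b} + \mathbf{n}, \tag{1a}$$
$$H_1:\ \mathbf{g} = \mathbf{b} + \mathbf{s} + \mathbf{n}, \tag{1b}$$

where $\mathbf{b}$ denotes an image of background, $\mathbf{s}$ denotes the signal to be detected, and $\mathbf{n}$ denotes the random measurement noise.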

2.1 Ideal Observer and Markov-Chain Monte Carlo techniques

The Ideal Observer (IO) sets an upper performance limit among all observers. The IO test statistic can be computed as any monotonic transformation of the likelihood ratio:

$$\Lambda(\mathbf{g}) = \frac{p(\mathbf{g}|H_1)}{p(\mathbf{g}|H_0)}. \tag{2}$$

However, computation of $\Lambda(\mathbf{g})$ generally is intractable analytically.

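Kupinski et al. proposed a method to numerically approximate the IO test statistic by employing MCMC techniques [kupinski2003ideal]. For a signal-known-exactly (SKE) binary signal detection task, the likelihood ratio can be written as[kupinski2003ideal]:

$$\Lambda(\mathbf{g}) = \int d\mathbf{b}\, \Lambda_{\mathrm{BKE}}(\mathbf{g}|\mathbf{b})\, p(\mathbf{b}|\mathbf{g}, H_0), \tag{3}$$

where $\Lambda_{\mathrm{BKE}}(\mathbf{g}|\mathbf{b}) \equiv \frac{p(\mathbf{g}|\mathbf{b}, H_1)}{p(\mathbf{g}|\mathbf{b}, H_0)}$ and $p(\mathbf{b}|\mathbf{g}, H_0) = \frac{p(\mathbf{g}|\mathbf{b}, H_0)\, p(\mathbf{b})}{\int d\mathbf{b}'\, p(\mathbf{g}|\mathbf{b}', H_0)\, p(\mathbf{b}')}$. The background-known-exactly (BKE) likelihood ratio $\Lambda_{\mathrm{BKE}}(\mathbf{g}|\mathbf{b})$ sometimes has an analytical form that is dependent on the type of measurement noise[kupinski2001ideal]. In cases where the background can be described by a stochastic object model (SOM) with a set of stochastic parameters $\boldsymbol{\theta}$, i.e., $\mathbf{b} = \mathbf{b}(\boldsymbol{\theta})$, the likelihood ratio described in Eq. 3 can be written as[kupinski2003ideal]:

$$\Lambda(\mathbf{g}) = \int d\boldsymbol{\theta}\, \Lambda_{\mathrm{BKE}}(\mathbf{g}|\mathbf{b}(\boldsymbol{\theta}))\, p(\boldsymbol{\theta}|\mathbf{g}, H_0).$$

Subsequently, the likelihood ratio can be approximated as:

$$\hat{\Lambda}(\mathbf{g}) = \frac{1}{J} \sum_{j=1}^{J} \Lambda_{\mathrm{BKE}}(\mathbf{g}|\mathbf{b}(\boldsymbol{\theta}^{(j)})). \tag{4}$$

Here, each $\boldsymbol{\theta}^{(j)}$ is sampled from the posterior distribution $p(\boldsymbol{\theta}|\mathbf{g}, H_0)$. To sample from the distribution $p(\boldsymbol{\theta}|\mathbf{g}, H_0)$, a Markov chain with the stationary density $p(\boldsymbol{\theta}|\mathbf{g}, H_0)$ can be generated. To do this, an initial vector $\boldsymbol{\theta}^{(0)}$ is chosen and a proposal density function $q(\tilde{\boldsymbol{\theta}}|\boldsymbol{\theta}^{(j)})$ is specified. Given $\boldsymbol{\theta}^{(j)}$, the candidate vector $\tilde{\boldsymbol{\theta}}$ is sampled from the proposal density $q(\tilde{\boldsymbol{\theta}}|\boldsymbol{\theta}^{(j)})$ and it is accepted with probability $p_a(\tilde{\boldsymbol{\theta}}|\boldsymbol{\theta}^{(j)}, \mathbf{g})$[kupinski2003ideal]:

$$p_a(\tilde{\boldsymbol{\theta}}|\boldsymbol{\theta}^{(j)}, \mathbf{g}) = \min\left[1,\ \frac{p(\mathbf{g}|\mathbf{b}(\tilde{\boldsymbol{\theta}}), H_0)\, p(\tilde{\boldsymbol{\theta}})\, q(\boldsymbol{\theta}^{(j)}|\tilde{\boldsymbol{\theta}})}{p(\mathbf{g}|\mathbf{b}(\boldsymbol{\theta}^{(j)}), H_0)\, p(\boldsymbol{\theta}^{(j)})\, q(\tilde{\boldsymbol{\theta}}|\boldsymbol{\theta}^{(j)})}\right]. \tag{5}$$

The vector $\boldsymbol{\theta}^{(j+1)} = \tilde{\boldsymbol{\theta}}$ if the candidate is accepted; otherwise $\boldsymbol{\theta}^{(j+1)} = \boldsymbol{\theta}^{(j)}$. If the proposal density is designed to be symmetric, i.e., $q(\tilde{\boldsymbol{\theta}}|\boldsymbol{\theta}^{(j)}) = q(\boldsymbol{\theta}^{(j)}|\tilde{\boldsymbol{\theta}})$, the sampling strategy described above reduces to the Metropolis algorithm, a special case of the Metropolis-Hastings approach, and the factors corresponding to the proposal density cancel.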
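The sampling procedure described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the implementation of [kupinski2003ideal]: the noise is taken to be i.i.d. Gaussian with standard deviation `sigma`, for which the BKE likelihood ratio has the closed form $\exp[\mathbf{s}^T(\mathbf{g} - \mathbf{b} - \mathbf{s}/2)/\sigma^2]$, and the SOM is a hypothetical linear toy model $\mathbf{b}(\boldsymbol{\theta}) = \mathbf{A}\boldsymbol{\theta}$ with a standard normal prior, chosen only because its exact likelihood ratio is available for comparison. The function names, the matrix `A`, and all numerical settings are assumptions made for this sketch.

```python
import numpy as np

def log_bke_ratio(g, b, s, sigma):
    """Log BKE likelihood ratio for i.i.d. Gaussian noise with std sigma."""
    return s @ (g - b - 0.5 * s) / sigma**2

def mcmc_log_likelihood_ratio(g, s, sigma, background, log_prior, theta0,
                              step, n_iter, n_burn, rng):
    """Random-walk Metropolis estimate of log Lambda(g) for an SKE task."""
    def log_target(theta):
        # Unnormalised log posterior: log p(g | b(theta), H0) + log p(theta)
        r = g - background(theta)
        return -0.5 * (r @ r) / sigma**2 + log_prior(theta)

    theta = np.array(theta0, dtype=float)
    log_t = log_target(theta)
    kept = []
    for j in range(n_iter):
        # Symmetric Gaussian proposal, so the proposal factors cancel
        cand = theta + step * rng.standard_normal(theta.shape)
        log_c = log_target(cand)
        if np.log(rng.uniform()) < log_c - log_t:  # accept with prob min(1, ratio)
            theta, log_t = cand, log_c
        if j >= n_burn:  # discard burn-in iterations
            kept.append(log_bke_ratio(g, background(theta), s, sigma))
    kept = np.array(kept)
    m = kept.max()
    return m + np.log(np.mean(np.exp(kept - m)))  # log of the sample mean, computed stably

# Hypothetical toy SOM (an assumption): b(theta) = A @ theta, theta ~ N(0, I)
rng = np.random.default_rng(0)
n_pix, n_par, sigma = 16, 2, 1.0
A = rng.standard_normal((n_pix, n_par))
s = 0.25 * np.ones(n_pix)
g = A @ rng.standard_normal(n_par) + s + sigma * rng.standard_normal(n_pix)

log_lr_mcmc = mcmc_log_likelihood_ratio(
    g, s, sigma,
    background=lambda th: A @ th,
    log_prior=lambda th: -0.5 * (th @ th),
    theta0=np.zeros(n_par), step=0.3, n_iter=20000, n_burn=1000, rng=rng)

# For this Gaussian toy model the exact log likelihood ratio is known
K = A @ A.T + sigma**2 * np.eye(n_pix)
log_lr_exact = s @ np.linalg.solve(K, g - 0.5 * s)
print(log_lr_mcmc, log_lr_exact)
```

Working with log values avoids overflow when the BKE ratio varies over several orders of magnitude along the chain. For a realistic SOM no exact reference exists, which is why the step size and chain length usually have to be tuned per object model.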

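Park et al. extended the MCMC approach to signal-known-statistically (SKS) signal detection tasks [park2003ideal] where the signal is random. If the signal can be described by a set of stochastic parameters $\boldsymbol{\alpha}$, i.e., $\mathbf{s} = \mathbf{s}(\boldsymbol{\alpha})$, the likelihood ratio can be written as[park2003ideal]:

$$\Lambda(\mathbf{g}) = \int d\boldsymbol{\theta} \int d\boldsymbol{\alpha}\, \Lambda_{\mathrm{BSKE}}(\mathbf{g}|\mathbf{b}(\boldsymbol{\theta}), \mathbf{s}(\boldsymbol{\alpha}))\, p(\boldsymbol{\theta}|\mathbf{g}, H_0)\, p(\boldsymbol{\alpha}), \tag{6}$$

where $\Lambda_{\mathrm{BSKE}}(\mathbf{g}|\mathbf{b}(\boldsymbol{\theta}), \mathbf{s}(\boldsymbol{\alpha})) \equiv \frac{p(\mathbf{g}|\mathbf{b}(\boldsymbol{\theta}), \mathbf{s}(\boldsymbol{\alpha}), H_1)}{p(\mathbf{g}|\mathbf{b}(\boldsymbol{\theta}), H_0)}$. The likelihood ratio can be subsequently approximated as:

$$\hat{\Lambda}(\mathbf{g}) = \frac{1}{J} \sum_{j=1}^{J} \Lambda_{\mathrm{BSKE}}(\mathbf{g}|\mathbf{b}(\boldsymbol{\theta}^{(j)}), \mathbf{s}(\boldsymbol{\alpha}^{(j)})). \tag{7}$$

Here, $(\boldsymbol{\theta}^{(j)}, \boldsymbol{\alpha}^{(j)})$ are sampled from the distribution $p(\boldsymbol{\theta}|\mathbf{g}, H_0)\, p(\boldsymbol{\alpha})$. The Markov chain can be constructed with acceptance probability:

$$p_a(\tilde{\boldsymbol{\theta}}, \tilde{\boldsymbol{\alpha}}|\boldsymbol{\theta}^{(j)}, \boldsymbol{\alpha}^{(j)}, \mathbf{g}) = \min\left[1,\ \frac{p(\mathbf{g}|\mathbf{b}(\tilde{\boldsymbol{\theta}}), H_0)\, p(\tilde{\boldsymbol{\theta}})\, p(\tilde{\boldsymbol{\alpha}})\, q(\boldsymbol{\theta}^{(j)}, \boldsymbol{\alpha}^{(j)}|\tilde{\boldsymbol{\theta}}, \tilde{\boldsymbol{\alpha}})}{p(\mathbf{g}|\mathbf{b}(\boldsymbol{\theta}^{(j)}), H_0)\, p(\boldsymbol{\theta}^{(j)})\, p(\boldsymbol{\alpha}^{(j)})\, q(\tilde{\boldsymbol{\theta}}, \tilde{\boldsymbol{\alpha}}|\boldsymbol{\theta}^{(j)}, \boldsymbol{\alpha}^{(j)})}\right]. \tag{8}$$

Again, if the proposal densities are designed to be symmetric, the factors corresponding to the proposal density in Eq. 8 cancel.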
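A joint chain over background and signal parameters can be sketched as follows. This is a hedged illustration, not the algorithm of [park2003ideal]: the image is one-dimensional, the background is a hypothetical linear toy model, the signal is a Gaussian bump whose only random parameter is its location (uniform on an assumed interval), and the noise is i.i.d. Gaussian. One design choice is made explicit in the code: the candidate signal location is proposed from its own prior, which is not a symmetric proposal, but the prior and proposal factors for the signal still cancel in the acceptance ratio.

```python
import numpy as np

def gaussian_signal(alpha, x, amp=0.5, width=2.0):
    """Hypothetical 1-D Gaussian signal centred at location alpha."""
    return amp * np.exp(-0.5 * ((x - alpha) / width) ** 2)

rng = np.random.default_rng(1)
n_pix, n_par, sigma = 32, 2, 1.0
x = np.arange(n_pix, dtype=float)
A = rng.standard_normal((n_pix, n_par))   # toy SOM (assumption): b(theta) = A @ theta
lo, hi = 8.0, 24.0                        # assumed prior: alpha ~ Uniform(lo, hi)
g = (A @ rng.standard_normal(n_par)
     + gaussian_signal(rng.uniform(lo, hi), x)
     + sigma * rng.standard_normal(n_pix))

def log_target_theta(theta):
    # log p(g | b(theta), H0) + log p(theta), up to an additive constant
    r = g - A @ theta
    return -0.5 * (r @ r) / sigma**2 - 0.5 * (theta @ theta)

theta, alpha = np.zeros(n_par), rng.uniform(lo, hi)
log_t = log_target_theta(theta)
kept, n_iter, n_burn = [], 30000, 1000
for j in range(n_iter):
    cand_theta = theta + 0.2 * rng.standard_normal(n_par)  # symmetric random walk
    cand_alpha = rng.uniform(lo, hi)  # drawn from p(alpha): its factors cancel in the ratio
    log_c = log_target_theta(cand_theta)
    if np.log(rng.uniform()) < log_c - log_t:
        theta, alpha, log_t = cand_theta, cand_alpha, log_c
    if j >= n_burn:
        s = gaussian_signal(alpha, x)
        # log BSKE ratio for i.i.d. Gaussian noise
        kept.append(s @ (g - A @ theta - 0.5 * s) / sigma**2)
kept = np.array(kept)
log_lr_mcmc = kept.max() + np.log(np.mean(np.exp(kept - kept.max())))

# Reference value for this toy model: Gaussian background, midpoint rule over alpha
K = A @ A.T + sigma**2 * np.eye(n_pix)
grid = lo + (np.arange(4000) + 0.5) * (hi - lo) / 4000
S = gaussian_signal(grid[:, None], x[None, :])              # shape (4000, n_pix)
expo = np.einsum('ij,ij->i', S, np.linalg.solve(K, (g[None, :] - 0.5 * S).T).T)
log_lr_exact = expo.max() + np.log(np.mean(np.exp(expo - expo.max())))
print(log_lr_mcmc, log_lr_exact)
```

Because the signal parameter is independent of the data under the signal-absent hypothesis, drawing it from its prior keeps the chain valid while avoiding a separately tuned proposal for the signal.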

However, implementation of these MCMC methods can be difficult because practical issues, such as the design of the proposal density for the considered object model, must be addressed. In addition, it remains unclear how to apply these methods in situations where the background cannot be described by the current well-established SOMs.

2.2 Generative Adversarial Networks

Generative Adversarial Networks (GANs) hold great promise to establish SOMs from training data[goodfellow2014generative]. A GAN trains a generator by competing against a discriminator through an adversarial process[goodfellow2014generative]. When a GAN is trained on an ensemble of background images, the generator maps a random latent vector $\mathbf{z}$ to a synthetic background image $\hat{\mathbf{b}} = G(\mathbf{z}; \boldsymbol{\Theta}_G)$. Here, $G(\cdot\,; \boldsymbol{\Theta}_G)$ is a mapping function represented by a deep neural network with a weight vector $\boldsymbol{\Theta}_G$, and the latent vector $\mathbf{z}$ is sampled from a simple known distribution such as the normal distribution or uniform distribution. The discriminator is represented by another deep neural network with a weight vector $\boldsymbol{\Theta}_D$ and mapping function $D(\cdot\,; \boldsymbol{\Theta}_D)$ that maps an image to a real-valued score to be used to distinguish between real and synthetic images. A GAN is trained by playing a two-player minimax game between the generator and the discriminator:

$$\min_{\boldsymbol{\Theta}_G} \max_{\boldsymbol{\Theta}_D} V(D, G), \tag{9}$$

where $V(D, G)$ is an objective function, which is dependent on specific training strategies. When $G$ and $D$ possess sufficient capacity, $p_{\hat{\mathbf{b}}} = p_{\mathbf{b}}$ when the global optimum of the minimax game is achieved[goodfellow2014generative]. Here, $p_{\mathbf{b}}$ denotes the distribution of the real background images $\mathbf{b}$, and $p_{\hat{\mathbf{b}}}$ denotes the distribution of the synthetic background images $\hat{\mathbf{b}}$. The generator can subsequently represent a SOM that describes the variability within the ensemble of background objects.
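As one concrete instance of such an objective, the original GAN formulation [goodfellow2014generative] uses the value function below. It is given for illustration only: the symbols $p_{\mathbf{b}}$ (distribution of real background images) and $p_{\mathbf{z}}$ (latent distribution) are notation assumed here, $D$ outputs a probability in $(0, 1)$ in this formulation, and other training strategies, such as Wasserstein-type losses, replace it with a different objective.

```latex
\min_{G} \max_{D} V(D, G)
  = \mathbb{E}_{\mathbf{b} \sim p_{\mathbf{b}}}\left[ \log D(\mathbf{b}) \right]
  + \mathbb{E}_{\mathbf{z} \sim p_{\mathbf{z}}}\left[ \log\left( 1 - D(G(\mathbf{z})) \right) \right]
```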

3 Markov-Chain Monte Carlo using Generative Adversarial Networks

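After a GAN has been trained on a set of background images, the distribution that describes the actual background images $\mathbf{b}$ can be approximated by the distribution of GAN-produced background images $\hat{\mathbf{b}}$: $p(\mathbf{b}) \approx p(\hat{\mathbf{b}})$. The IO test statistic for SKE signal detection tasks can subsequently be approximated as:

$$\Lambda(\mathbf{g}) \approx \int d\hat{\mathbf{b}}\, \Lambda_{\mathrm{BKE}}(\mathbf{g}|\hat{\mathbf{b}})\, p(\hat{\mathbf{b}}|\mathbf{g}, H_0), \tag{10}$$

where $\Lambda_{\mathrm{BKE}}(\mathbf{g}|\hat{\mathbf{b}}) \equiv \frac{p(\mathbf{g}|\hat{\mathbf{b}}, H_1)}{p(\mathbf{g}|\hat{\mathbf{b}}, H_0)}$ and $p(\hat{\mathbf{b}}|\mathbf{g}, H_0) = \frac{p(\mathbf{g}|\hat{\mathbf{b}}, H_0)\, p(\hat{\mathbf{b}})}{\int d\hat{\mathbf{b}}'\, p(\mathbf{g}|\hat{\mathbf{b}}', H_0)\, p(\hat{\mathbf{b}}')}$. Because $p(\hat{\mathbf{b}}|\mathbf{z}) = \delta(\hat{\mathbf{b}} - G(\mathbf{z}))$, where $\delta(\cdot)$ is a Dirac delta function and $p(\hat{\mathbf{b}}) = \int d\mathbf{z}\, p(\hat{\mathbf{b}}|\mathbf{z})\, p(\mathbf{z})$, the likelihood ratio can be rewritten as:

$$\Lambda(\mathbf{g}) \approx \int d\mathbf{z}\, \Lambda_{\mathrm{BKE}}(\mathbf{g}|G(\mathbf{z}))\, p(\mathbf{z}|\mathbf{g}, H_0), \tag{11}$$

where $\Lambda_{\mathrm{BKE}}(\mathbf{g}|G(\mathbf{z}))$ is evaluated on the synthetic background image generated by the GAN. The likelihood ratio subsequently can be approximated as:

$$\hat{\Lambda}(\mathbf{g}) = \frac{1}{J} \sum_{j=1}^{J} \Lambda_{\mathrm{BKE}}(\mathbf{g}|G(\mathbf{z}^{(j)})), \tag{12}$$

where $\mathbf{z}^{(j)}$ is sampled from the posterior distribution $p(\mathbf{z}|\mathbf{g}, H_0)$. To produce $\mathbf{z}^{(j)}$, a Markov chain with an initial latent vector $\mathbf{z}^{(0)}$ and a proposal density $q(\tilde{\mathbf{z}}|\mathbf{z}^{(j)})$ can be constructed. Given $\mathbf{z}^{(j)}$, a candidate latent vector $\tilde{\mathbf{z}}$ is drawn from the proposal density $q(\tilde{\mathbf{z}}|\mathbf{z}^{(j)})$ and is accepted to the Markov chain with the acceptance probability:

$$p_a(\tilde{\mathbf{z}}|\mathbf{z}^{(j)}, \mathbf{g}) = \min\left[1,\ \frac{p(\mathbf{g}|G(\tilde{\mathbf{z}}), H_0)\, p(\tilde{\mathbf{z}})\, q(\mathbf{z}^{(j)}|\tilde{\mathbf{z}})}{p(\mathbf{g}|G(\mathbf{z}^{(j)}), H_0)\, p(\mathbf{z}^{(j)})\, q(\tilde{\mathbf{z}}|\mathbf{z}^{(j)})}\right]. \tag{13}$$

Here, the probability density function $p(\mathbf{z})$ has a simple analytical form because the latent vector is sampled from simple distributions such as the normal distribution or uniform distribution. When a random walk Metropolis-Hastings (RWMH) algorithm[pereyra2015survey] is employed, the proposal density can simply be chosen as a Gaussian density: $q(\tilde{\mathbf{z}}|\mathbf{z}^{(j)}) = \mathcal{N}(\tilde{\mathbf{z}};\, \mathbf{z}^{(j)}, \sigma_q^2 \mathbf{I})$. Additionally, because the gradient of the function represented by the generator with respect to the latent vector can be readily computed, more advanced MH algorithms that employ gradient information, including Metropolis adjusted Langevin algorithms (MALA)[pereyra2015survey] and Hamiltonian Monte Carlo (HMC)[pereyra2015survey], can be employed.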
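The latent-space random walk described above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: the `generator` below is a small fixed network with random weights standing in for a trained GAN generator, the latent prior is standard normal, the noise is i.i.d. Gaussian, and all names and settings (`mcmc_gan_log_lr`, the step size, the chain length) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
n_lat, n_hid, n_pix, sigma = 2, 8, 16, 1.0

# Stand-in for a trained generator (assumption): fixed random two-layer network
W1 = 0.5 * rng.standard_normal((n_hid, n_lat))
b1 = 0.1 * rng.standard_normal(n_hid)
W2 = rng.standard_normal((n_pix, n_hid))

def generator(z):
    """Map latent vector(s) of shape (..., n_lat) to image(s) of shape (..., n_pix)."""
    return np.tanh(z @ W1.T + b1) @ W2.T

def log_latent_prior(z):
    """Standard normal latent density, up to an additive constant."""
    return -0.5 * np.sum(z * z, axis=-1)

def mcmc_gan_log_lr(g, s, n_iter=30000, n_burn=1000, step=0.3):
    """RWMH in latent space; returns (log of estimated Lambda, acceptance rate)."""
    def log_target(z):
        # log p(g | G(z), H0) + log p(z), up to an additive constant
        r = g - generator(z)
        return -0.5 * (r @ r) / sigma**2 + log_latent_prior(z)

    z = np.zeros(n_lat)
    log_t = log_target(z)
    kept, n_acc = [], 0
    for j in range(n_iter):
        cand = z + step * rng.standard_normal(n_lat)  # symmetric Gaussian proposal
        log_c = log_target(cand)
        if np.log(rng.uniform()) < log_c - log_t:
            z, log_t, n_acc = cand, log_c, n_acc + 1
        if j >= n_burn:
            # log BKE ratio evaluated on the generated background G(z)
            kept.append(s @ (g - generator(z) - 0.5 * s) / sigma**2)
    kept = np.array(kept)
    log_lr = kept.max() + np.log(np.mean(np.exp(kept - kept.max())))
    return log_lr, n_acc / n_iter

# Simulate one signal-present image from the stand-in SOM and estimate log Lambda
s = 0.25 * np.ones(n_pix)
g = generator(rng.standard_normal(n_lat)) + s + sigma * rng.standard_normal(n_pix)
log_lr, acc_rate = mcmc_gan_log_lr(g, s)
print(log_lr, acc_rate)
```

The only model-specific ingredients are the generator and the latent prior, so the same loop applies unchanged when the stand-in network is replaced by a trained generator. With a high-dimensional latent space, a plain random walk may mix slowly, which is where gradient-based samplers become attractive.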

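Similarly, the likelihood ratio for SKS signal detection tasks can be approximated as:

$$\hat{\Lambda}(\mathbf{g}) = \frac{1}{J} \sum_{j=1}^{J} \Lambda_{\mathrm{BSKE}}(\mathbf{g}|G(\mathbf{z}^{(j)}), \mathbf{s}(\boldsymbol{\alpha}^{(j)})), \tag{14}$$

where $(\mathbf{z}^{(j)}, \boldsymbol{\alpha}^{(j)})$ are sampled from the distribution $p(\mathbf{z}|\mathbf{g}, H_0)\, p(\boldsymbol{\alpha})$, with $\boldsymbol{\alpha}$ denoting the stochastic signal parameters. A Markov chain for producing ($\mathbf{z}^{(j)}$, $\boldsymbol{\alpha}^{(j)}$) can be constructed in a similar way as described above.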

4 Numerical studies

The imaging system considered in this preliminary study was a parallel-hole collimator system whose point response function was described by a Gaussian kernel: . Here, the width and the amplitude with the height . This imaging system recorded measured images. A lumpy object model [kupinski2001ideal] was considered with the mean number of lumps equal to 5. Each lump was represented by a Gaussian function with the width of 7 and amplitude of 1. Gaussian noise was considered with the standard deviation of 20 for the SKE detection task and 10 for the SKS detection task. A deterministic signal described by a Gaussian function with the amplitude of 0.2 and the width of 3 was considered for the SKE signal detection task. For the SKS signal detection task, a uniformly distributed elliptical Gaussian signal with a random location and random shape was considered. We employed the method of Progressive Growing of GANs (ProGAN) [karras2017progressive] and trained a ProGAN on 100,000 lumpy object images. The ProGAN was trained in TensorFlow [abadi2016tensorflow] by use of the Adam optimizer [kingma2014adam], which is a stochastic gradient algorithm.
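For orientation, training images of the kind described above could be drawn with a sketch like the following. Several details are assumptions rather than facts from this study: the 64 × 64 grid size is hypothetical, the stated lump width is interpreted as the Gaussian standard deviation in pixels, lump centres are taken to be uniformly distributed over the field of view, and the blur of the imaging system is omitted.

```python
import numpy as np

def lumpy_background(rng, size=64, mean_lumps=5, width=7.0, amplitude=1.0):
    """Draw one lumpy object: a Poisson number of Gaussian lumps at uniform positions."""
    yy, xx = np.mgrid[0:size, 0:size].astype(float)
    n_lumps = rng.poisson(mean_lumps)                  # random number of lumps
    centres = rng.uniform(0, size, size=(n_lumps, 2))  # random lump centres
    image = np.zeros((size, size))
    for cy, cx in centres:
        image += amplitude * np.exp(-((yy - cy) ** 2 + (xx - cx) ** 2) / (2 * width**2))
    return image

rng = np.random.default_rng(0)
# A small batch for illustration; the study reports training on 100,000 images
batch = np.stack([lumpy_background(rng) for _ in range(8)])
print(batch.shape)
```

Each image is a sum of non-negative lumps, so a draw with zero lumps is an all-zero image, which is a valid sample of this model.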

5 Results

Samples of real background images and synthetic background images produced by the GAN are shown in Fig. 1.

Figure 1: Top row: Samples of real background images; Bottom row: Samples of synthetic background images produced by a GAN.

The IO performance was estimated by use of 200 pairs of signal-absent and signal-present images. For each image, to generate a Markov chain, 100,000 iterations were run on the GAN-represented SOM by use of a RWMH algorithm with a simple Gaussian proposal density and the first 1000 burn-in iterations were ignored. The IO performance estimated by use of the conventional MCMC algorithm that was applied to the original lumpy model was provided to validate the proposed method. The ROC curves and AUC values corresponding to the MCMC-GAN method and the conventional MCMC method for the SKE detection task are shown in Fig. 2 (a), and those for the SKS detection task are shown in Fig. 2 (b). The ROC curves were fit by use of the Metz-ROC software[metz1998rockit] and the “proper” binormal model[pesce2007reliable].

Figure 2: (a) ROC curves and AUC values for the SKE detection task; (b) ROC curves and AUC values for the SKS detection task.
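Given test-statistic values for the signal-absent and signal-present images, an AUC value can be computed as sketched below. Note the swap: this uses the nonparametric Mann-Whitney (empirical) estimator rather than the "proper" binormal fit used in this study, so its values need not match a fitted curve exactly. The function name and the simulated test statistics are hypothetical.

```python
import numpy as np

def empirical_auc(t_absent, t_present):
    """Mann-Whitney estimate of the AUC; ties between the two classes count as 1/2."""
    t0 = np.asarray(t_absent, dtype=float)[:, None]
    t1 = np.asarray(t_present, dtype=float)[None, :]
    return float(np.mean((t1 > t0) + 0.5 * (t1 == t0)))

# Hypothetical test statistics for 200 signal-absent and 200 signal-present images
rng = np.random.default_rng(0)
t_absent = rng.normal(0.0, 1.0, 200)
t_present = rng.normal(1.0, 1.0, 200)
auc = empirical_auc(t_absent, t_present)
print(auc)
```

Because the AUC is invariant to monotonic transformations of the test statistic, the log of the estimated likelihood ratio can be passed in directly.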

In this proof-of-concept study, in which a simple lumpy object model was considered, the IO performance computed by our proposed MCMC-GAN method closely approximated that computed by the conventional MCMC method. Because the implementation of GANs is general and not limited to specific object models, our proposed method can be implemented with more sophisticated object models that can be learned by use of GANs and for which conventional MCMC methods may not be available.

6 Conclusion

This work provides a novel methodology to approximate the IO performance by applying MCMC techniques with SOMs learned by use of GANs, thereby extending the domain of applicability of MCMC methods. In this preliminary study, a lumpy background model was considered and the IO performance computed by the proposed MCMC-GAN method closely approximated that computed by the conventional MCMC method for both the considered SKE and SKS signal detection tasks. The proposed MCMC-GAN method can be potentially applied to more sophisticated object models learned by use of GANs. This will enable computation of IO performance for optimizing imaging systems.

Acknowledgment

This research was supported in part by NIH awards EB020604, EB023045, NS102213, EB028652, and NSF award DMS1614305.

References