1 Introduction
Many key challenges at the intersection of the natural sciences and the life sciences are related to solving inverse problems. Here, it is assumed that a forward process maps the (hidden) parameters of interest to observations that can be measured. In the context of computer-assisted interventions (CAI), for example, the parameters may be important physiological tissue properties, such as tissue oxygenation (cf. Figure 1), while the observations may be multispectral measurements of the tissue. The problem is usually solved by regression, which gives a point estimate for the tissue parameter(s) of interest based on the camera measurements [3, 30, 29]. However, in most inverse problems the mapping between parameters and observations is not injective: two substantially different parameter configurations can result in the same observation. To recover a unique inverse, a regularizer can be added to the objective, but this approach, although commonly used, neglects the inherent ambiguity of the solution. For our application, an explicit analysis of this ambiguity is crucial to identify the most suitable camera in terms of the number and characteristics of its bands. To our knowledge, none of the existing parameter estimation methods incorporates a sufficiently powerful uncertainty quantification to do so.
Figure 1: (b) Example of two posterior distributions as provided by our INN. The posterior of the 3-band camera (green) is multimodal, and the MAP estimate of tissue oxygenation is associated with the wrong mode, leading to a poor estimate. The posterior of the 8-band camera (orange) is unimodal, with a narrow mode and a better MAP estimate.

Current approaches to uncertainty quantification in the field of deep learning, such as dropout sampling (cf. e.g.
[7, 16, 17, 24]), probabilistic inference (cf. e.g. [6, 14, 31]), or ensembles of estimators (cf. e.g. [15, 23]), typically augment traditional point estimates with confidence intervals but do not recover unrestricted full posteriors. Consequently, these methods do not account for the possibility that the same observation may correspond to fundamentally different parameter values. In other words, these methods would always assume that the posterior follows the blue (unimodal) distribution depicted in Figure 1(a), even if it actually followed the orange (multimodal) one. The following two cases illustrate that this is a serious shortcoming when we wish to recover a physiological parameter from observations (Figure 1(b)):
1. The solution is unique but suffers from high uncertainty. This may be represented by a unimodal posterior whose single mode has a large standard deviation.

2. The problem is ill-posed in the sense that two substantially different parameter configurations yield the same observation. This must be represented by a multimodal posterior whose individual modes may have low uncertainty.
Forcing a unimodal representation onto the second case cannot work: it would either focus on one of the modes and miss the other, or cover both solutions under a single wide mode (similar to case 1) whose maximum is located at the average of the two true solutions – a highly implausible value for the given observation.
We therefore argue that an ideal method for comparative camera assessment should be able to deal with all possible types of uncertainty. We propose to move beyond point estimates by mapping an observation to a full posterior distribution over the parameters rather than to a single point estimate. To this end, we solve the resulting inverse problem using the recently proposed concept of invertible neural networks (INNs) [2]. Performance measures for a hardware setup can then be computed from the number and widths of the modes of the posteriors, as illustrated in Figure 1(b).
In the following sections, we describe our approach in detail and apply it to the comparative assessment of four different camera designs given the specific use case of physiological parameter estimation from multispectral imaging data.
2 Methods
In this section, we formalize the proposed approach to performance assessment in a generic manner and apply it to the specific use case of camera selection for multispectral image analysis.
Generally speaking, we assume that the method to be assessed involves a hardware setup (e.g. a multispectral camera) that is used to solve an inverse problem with a well-known forward process, such as the mapping of tissue oxygenation to the pixel-wise measurement of a multispectral camera. We further assume that we have access to a data set composed of tuples of parameters and corresponding observations, partitioned into training, validation and test sets. Typically, such a data set can be generated by means of Monte Carlo simulation, as in [13, 29, 30], assuming the (virtual) hardware setup. Finally, we represent the regressor as an invertible neural network, as detailed in Section 2.2.
Our approach to performance assessment involves the following steps: (1) Training the regressor on the training set, using the validation set for hyperparameter tuning. (2) Applying the trained regressor to the test set to obtain a posterior distribution for each observation in the test data. (3) Extracting the modes of each posterior. (4) Computing descriptive statistics over the number and widths of the modes to quantify the uncertainty of the regressor. Different hardware setups can then be compared using metrics that consider not only the accuracy but also the uncertainty characteristics of the regressor. The following paragraphs instantiate this approach in the specific context of camera selection for intraoperative physiological parameter estimation.
2.1 Data generation for performance assessment
We apply Monte Carlo methods to generate tuples of physiological parameters and corresponding pixel-wise camera measurements. The method is based on previous work [30] and is briefly revisited here.
Tissue is assumed to be composed of three infinitely wide layers. Each layer is defined by the following tissue parameters: blood volume fraction, reduced scattering coefficient at 500 nm, scattering power, anisotropy, refractive index, and layer thickness. Based on literature values for the hemoglobin extinction coefficients [11], the absorption and scattering coefficients were determined for use in the MC simulation framework. A Graphics Processing Unit (GPU) accelerated version [1] of the Monte Carlo Multi-Layered (MCML) simulation framework [26] was chosen to generate spectral reflectances. The spectral reflectances r(λ) determined by the MC simulation can be transformed into the reflectance measurement m_j at band j of a given camera by:
m_j = ∫ f_j(λ) ι(λ) c(λ) r(λ) dλ   (1)
Here, the camera is characterized by f_j(λ), the j-th filter response; ι(λ), the relative irradiance of the light source; and c(λ), which represents other parameters of the optical system, such as the camera quantum efficiency or the transmission of the optical elements.
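Once all spectra are discretized on a common wavelength grid, the band transformation above reduces to a numerical integral. The following sketch (function name, grid, and the Gaussian filter are illustrative assumptions, not the original setup) integrates the product of filter response, irradiance, system factor and reflectance with the trapezoidal rule:

```python
import numpy as np

def band_measurement(wavelengths, reflectance, filter_response, irradiance, system_factor):
    """Discretized band transform: integrate the product of filter response,
    light source irradiance, optical system factor and simulated reflectance
    over wavelength using the trapezoidal rule."""
    integrand = filter_response * irradiance * system_factor * reflectance
    return 0.5 * ((integrand[1:] + integrand[:-1]) * np.diff(wavelengths)).sum()

wl = np.arange(450.0, 721.0, 2.0)                    # wavelength grid in nm (illustrative)
refl = np.full_like(wl, 0.5)                         # flat toy reflectance spectrum
f_band = np.exp(-0.5 * ((wl - 560.0) / 10.0) ** 2)   # hypothetical Gaussian band filter
# irradiance and system factor set to 1, as done for the cameras in Section 3.2.1
m = band_measurement(wl, refl, f_band, np.ones_like(wl), np.ones_like(wl))
```

For a real camera, `f_band` would be replaced by the measured filter response of each band, yielding one such integral per band.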
2.2 Invertible Neural Networks (INNs) for physiological parameter estimation
Basic principle
INNs have been proposed recently as a new method to recover a posterior distribution from an observation [2]. The network takes the form of a deterministic function f with trainable parameters θ that maps the tissue parameters x to the observation y together with latent variables z:

[y, z] = f(x; θ),

where z carries the uncertainty of the reconstruction of x given y. Sampling the latent variables according to the standard normal distribution z ~ N(0, I) and evaluating the inverse x = f⁻¹(y, z; θ) yields an approximation of the posterior p(x | y).

Application to physiological parameter estimation
In the context of physiological parameter estimation, we hypothesize that observing a spectrum is generally not sufficient to recover the underlying tissue parameter(s). Intuitively speaking, the purpose of the latent variables is to capture the information necessary to recover the tissue parameters that is not already contained in the spectrum. To recover a physiological parameter from a previously unseen spectrum, we repeatedly draw samples from the latent space and pass each of them, together with the spectrum, through the inverse network. The corresponding set of physiological parameter estimates yields the posterior. Due to the invertible architecture, the network simultaneously learns (1) the forward model, i.e. how to convert tissue parameters to spectral reflectances as measured by a camera, and (2) how to recover a posterior distribution of tissue parameters corresponding to an observation.
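The sampling-based posterior recovery can be sketched as follows. Here `inn_inverse` is a hypothetical stand-in for the inverse pass of a trained INN, hard-coded to mimic an ambiguous inverse problem in which two parameter values are consistent with the same observation:

```python
import numpy as np

rng = np.random.default_rng(0)

def inn_inverse(y, z):
    """Toy stand-in for the inverse pass of a trained INN: maps an observation y
    and a latent sample z to a parameter estimate. Two branches are consistent
    with the same y, mimicking a bimodal posterior."""
    center = y - 0.3 if z > 0 else y + 0.3  # two parameter values explaining the same y
    return center + 0.01 * z                # small latent-driven spread within each mode

y_observed = 0.5
z_samples = rng.standard_normal(2000)       # z ~ N(0, I), as in the INN framework
posterior_samples = np.array([inn_inverse(y_observed, z) for z in z_samples])
# posterior_samples now approximates the posterior over the parameter: here bimodal.
```

In the real method, `inn_inverse` is the learned inverse network, and the resulting sample set is what the mode detection of Section 2.3 operates on.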
Network architecture
The network architecture applied in this work has been adapted from [2] and is shown in Figure 3. It relies on four invertible affine coupling blocks [5], each of which is followed by a permutation layer, leading to an eight-layer network in total. The purpose of the permutation layer is to improve the mixing of the different input and output channels; it adds no additional weights to the network. At initialization, a randomly chosen permutation between the input and output channels is fixed permanently. We assume that each physiological parameter has its own associated uncertainty. Hence, we choose the dimension of the latent space equal to the number of physiological parameters in this study.
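The key property of the coupling-block-plus-permutation design is that it is invertible by construction. A minimal sketch in the spirit of [2, 5] (the dimension and the linear sub-networks standing in for the learned scale/shift networks are toy assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
DIM = 4  # toy input dimension (assumption)

class AffineCoupling:
    """One affine coupling block [5]: the first half of the channels parameterizes
    a scale/shift applied to the second half, making the block trivially invertible.
    W_s and W_t stand in for small learned sub-networks."""
    def __init__(self):
        self.W_s = 0.1 * rng.standard_normal((DIM // 2, DIM // 2))
        self.W_t = 0.1 * rng.standard_normal((DIM // 2, DIM // 2))

    def forward(self, x):
        u1, u2 = x[: DIM // 2], x[DIM // 2:]
        s, t = self.W_s @ u1, self.W_t @ u1
        return np.concatenate([u1, u2 * np.exp(s) + t])

    def inverse(self, v):
        v1, v2 = v[: DIM // 2], v[DIM // 2:]
        s, t = self.W_s @ v1, self.W_t @ v1
        return np.concatenate([v1, (v2 - t) * np.exp(-s)])

# Fixed random permutation layer: mixes channels, adds no trainable weights.
perm = rng.permutation(DIM)
inv_perm = np.argsort(perm)

block = AffineCoupling()
x = rng.standard_normal(DIM)
y_out = block.forward(x)[perm]           # coupling block followed by permutation
x_rec = block.inverse(y_out[inv_perm])   # exact inversion: unpermute, then invert
```

Stacking several such pairs yields the eight-layer architecture described above; invertibility of the whole network follows from the invertibility of each pair.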
Loss functions
Four loss terms are used to train the network (cf. Figure 3):
- Forward loss: In the forward direction, we use an L2 loss between the predicted reflectances and the true reflectances to enforce good estimates of the forward process.
- MMD forward loss: We apply a Maximum Mean Discrepancy (MMD) loss to the latent space predictions. MMD losses measure the discrepancy between distributions [8]. Here, we compare the distribution of the predicted latent variables to latent variables sampled from the desired standard normal distribution.
- Backward loss: We take the observation and latent estimates from the forward pass and perturb both quantities with additive Gaussian noise. The resulting output of the backward pass, with zero padding, is compared to the original input, with zero padding, via an L2 loss. This serves as a form of regularization, smoothing the latent space and ensuring that no critical information is hidden in low-amplitude structures in the outputs.

- MMD backward loss:

We compute a reverse pass through the network with reflectances from the training set and latent variables sampled from the standard normal distribution. The output is then passed to an MMD loss, which compares it to the distribution given by the training samples. As previous work [2] indicates that only three of the tissue parameters can potentially be recovered from multispectral measurements, we decided to feed only these slices of the prediction into the loss instead of the whole prediction.
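The MMD terms above compare two sets of samples rather than point values. A minimal biased estimator of the squared MMD with a Gaussian kernel [8] (the bandwidth here is an arbitrary illustrative choice, not the value used in the study) could look like:

```python
import numpy as np

def mmd2(x, y, bandwidth=1.0):
    """Biased estimate of the squared Maximum Mean Discrepancy [8] with a
    Gaussian kernel: MMD^2 = E[k(x,x')] + E[k(y,y')] - 2 E[k(x,y)].
    x, y are (n, d) sample arrays from the two distributions being compared."""
    def k(a, b):
        d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)  # pairwise squared distances
        return np.exp(-d2 / (2 * bandwidth ** 2))
    return k(x, x).mean() + k(y, y).mean() - 2 * k(x, y).mean()

rng = np.random.default_rng(2)
z_pred = rng.standard_normal((500, 2))        # e.g. predicted latent variables
z_ref = rng.standard_normal((500, 2))         # samples from the desired N(0, I)
z_off = rng.standard_normal((500, 2)) + 3.0   # a clearly mismatched distribution
# Matching distributions give an MMD near zero; mismatched ones a large MMD.
```

Minimizing such a term over network parameters drives the predicted samples toward the reference distribution without requiring pointwise correspondences.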
Hyperparameter optimization
We use the training data set to perform the parameter optimization of the network and the validation data set to prevent overfitting and for hyperparameter tuning. In particular, we use the validation data to calibrate the width of the posterior distributions. As suggested in [20], the purpose of the calibration is that, for every confidence level q, the q-confidence interval of the posterior contains the ground truth value in a fraction q of the cases. In other words, for each value of q, exactly a fraction q of the ground truth values should be inliers of the corresponding confidence interval. We optimize the hyperparameters using the validation set to enforce this behavior as well as possible.
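The calibration criterion can be checked empirically by counting, for each confidence level, how often the ground truth falls inside the central interval of the sampled posterior. A sketch with synthetic, perfectly calibrated toy posteriors (all names and numbers are illustrative assumptions):

```python
import numpy as np

def calibration_curve(posterior_samples, ground_truth, levels):
    """For each confidence level q, compute the fraction of cases in which the
    central q-confidence interval of the sampled posterior contains the ground
    truth. A well-calibrated model yields fractions close to q itself."""
    coverage = []
    for q in levels:
        lo = np.percentile(posterior_samples, 50 * (1 - q), axis=1)
        hi = np.percentile(posterior_samples, 50 * (1 + q), axis=1)
        coverage.append(np.mean((ground_truth >= lo) & (ground_truth <= hi)))
    return np.array(coverage)

rng = np.random.default_rng(3)
mu = rng.standard_normal(400)                              # toy per-case posterior means
truth = mu + rng.standard_normal(400)                      # ground truth drawn from each posterior
samples = mu[:, None] + rng.standard_normal((400, 1000))   # 1000 posterior samples per case
levels = np.array([0.1, 0.5, 0.68, 0.9])
cov = calibration_curve(samples, truth, levels)            # should be close to levels
```

Plotting `cov` against `levels` gives the calibration curves of Figure 8; deviations from the identity indicate over- or under-confident posteriors.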
2.3 Performance assessment
We quantify the uncertainty of an inference based on two key characteristics: the presence of multiple modes and the width of the posterior. Given samples following the posterior, our approach to automatic mode detection relies on computing a kernel density estimate, which has the advantage of being easy to evaluate; this then allows us to compute the corresponding relative maxima. A posterior is classified as multimodal if its standard deviation is less than half of the prior's standard deviation, our algorithm finds more than one relative maximum, and these maxima are further apart than a parameter-specific threshold. Furthermore, maxima whose intensity is less than 80% of the main (i.e. highest) maximum are ignored. All remaining posteriors are classified as unimodal.

To assess the performance of a camera, the INN is applied to the test set and the automatic mode detection is run on each posterior. Next, the following metrics are computed:

Percentage of multiple modes (MM): The percentage of multimodal posterior distributions. We use this as a means to judge how well-posed the inversion is for the different cameras.

Root-mean-square error (RMSE): We use the maximum a posteriori (MAP) estimate as a predictor for the physiological parameters and report the root-mean-square error of these estimates against the ground truth.

68% confidence interval width (W): We report the median width of the 68% confidence interval as a measure of the width of the posterior distributions.
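The mode detection procedure of Section 2.3 can be sketched as follows; the thresholds, grid size and the toy bimodal posterior are illustrative choices, not the values used in the study:

```python
import numpy as np
from scipy.stats import gaussian_kde
from scipy.signal import argrelmax

def detect_modes(samples, prior_std, min_separation, rel_height=0.8, grid_size=512):
    """Sketch of the mode detection of Section 2.3: fit a KDE to the posterior
    samples, locate relative maxima on a regular grid, discard maxima below
    rel_height of the highest one, and merge maxima closer than min_separation.
    Posteriors wider than half the prior do not qualify for the analysis."""
    if samples.std() >= 0.5 * prior_std:
        return []  # too wide: treated as carrying no recoverable mode structure
    kde = gaussian_kde(samples)
    grid = np.linspace(samples.min(), samples.max(), grid_size)
    density = kde(grid)
    idx = argrelmax(density)[0]
    idx = idx[density[idx] >= rel_height * density.max()]
    kept = []
    for m in (grid[i] for i in idx):
        if all(abs(m - k) > min_separation for k in kept):
            kept.append(m)
    return kept

rng = np.random.default_rng(4)
# Toy bimodal posterior on [0, 1] with modes at 0.35 and 0.6 (prior std ~0.29 for U(0, 1))
bimodal = np.concatenate([rng.normal(0.35, 0.02, 500), rng.normal(0.6, 0.02, 500)])
modes = detect_modes(bimodal, prior_std=0.29, min_separation=0.1)  # two modes expected
```

As noted in the Discussion, the KDE bandwidth strongly influences the number of resolved maxima; here the default Scott bandwidth of `gaussian_kde` is used.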
3 Experiments and Results
The purpose of the experiments was to confirm the realism of our simulation pipeline (Section 3.1) and to apply our setup to the task of comparative camera assessment (Section 3.2).
3.1 Realism of Simulation Pipeline
The simulation pipeline applied for comparative camera assessment features two potential sources of error: (1) errors in the conversion of the simulated high-resolution spectrum to multispectral measurements (Section 3.1.1) and (2) wrong model assumptions in the generic tissue model and hence errors in the simulated spectra (Section 3.1.2). We address both issues in this order in the following paragraphs.
3.1.1 Virtual Camera
The realism of the simulated data relies crucially on the validity of our virtual cameras. To explore this, we measured color tiles (X-Rite ColorChecker® Classic, Grand Rapids, MI, USA), which have a well-defined spectrum, using an HR2000+ spectrometer (Ocean Optics, Largo, FL, USA) and a Pixelteq SpectroCam™, an 8-band multispectral camera. Using the filter response functions of the SpectroCam, we transformed the high-resolution spectrum into a virtual SpectroCam spectrum. For this experiment we used three color tiles (blue, green and red) and averaged five SpectroCam measurements. The measured intensities were normalized. As shown in Figure 4, the simulated data is in very close agreement with the real measurements.
3.1.2 Tissue Model
Due to the lack of a reliable gold standard method for measuring optical tissue properties in vivo, validation of the tissue model is not straightforward. Previous work has addressed this issue by comparing real measurements of tissue with simulated spectra [30]. If the accuracy of the virtual camera can be assumed to be acceptable, deviations between real and simulated data can primarily be attributed to differences in tissue composition. The tissue model applied in this study has been validated in a previous publication [30] using multispectral data from several different porcine abdominal organs. To confirm these findings, we additionally acquired measurements from a porcine brain and a human kidney. The brain was measured using the same SpectroCam as in Section 3.1.1. For the human kidney, we used a 16-band camera. We performed a principal component analysis (PCA) on the data generated by our tissue model (adapted to the appropriate camera) and a kernel density estimation (KDE) on the first two principal components. Afterwards, we projected the measured data onto those same components. The result can be found in Figure 5. Clearly, all the organ data points lie within the distribution of the simulated data of our tissue model.

3.2 Comparative Camera Assessment
The main purpose of our experiments was to evaluate our assessment framework in the specific context of multispectral camera selection for physiological parameter estimation.
3.2.1 Experimental Data
We applied the Monte Carlo-based method described in Section 2.1 to generate 20,000 data points representing spectral reflectances, using the tissue parameter distributions summarized in Table 1. If a range is given for a parameter, it is sampled uniformly from this range; if a value with a standard deviation is given, the parameter is sampled from a normal distribution with this expectation value and standard deviation. For each camera setup investigated here, these reflectances were converted into (simulated) camera measurements considering the optical properties of the setup. For each setup, we reserved 70% of the data for training, 5% for hyperparameter tuning and 25% for performance assessment. To test our assessment framework, we assessed three camera designs that have been applied in previous work [12, 19, 29] in a comparative manner. To obtain a lower bound on the achievable uncertainty, we complemented these realistic cameras with a virtual camera of nearly optimal design. The cameras are characterized by the following filter responses (cf. Figure 6):
Table 1: Tissue parameters used for the simulations (columns follow the parameter list in Section 2.1).

layer | blood volume fraction | reduced scattering | scattering power | anisotropy | refractive index | thickness
1 | 0–10 | 0–100 | 1.286 | 0.8–0.95 | 1.33 | 0.06–0.1
2 | 0–10 | 0–100 | 1.286 | 0.8–0.95 | 1.36 | 0.06–0.085
3 | 0–10 | 0–100 | 1.286 | 0.8–0.95 | 1.38 | 0.04–0.06

framework: MCML [1], photons per simulation; wavelength range: 450 (stepsize=)
- 3med: 3-band camera optimized for medical imaging use, as described in [12].

- 3nRGB: 3-band camera whose bands' centers coincide with the standard RGB bands, as described in [19].

- 8med: 8-band camera, as used in previous work [29].

- 27equi: As a close-to-optimal camera, we used a camera with a filter response featuring (unrealistically) narrow, equidistantly spaced bands. As our experimental data is based on pre-simulated data over a fixed wavelength range, this leads to a '27-band camera'.
The remaining camera parameters, i.e. the relative irradiance of the light source and the other optical system factors in Eq. (1), were set to 1 for all cameras.
3.2.2 Results
Figure 7 provides representative examples of the posteriors generated by our INN. The calibration errors for the four different cameras are presented in Figure 8 for the physiological parameters tissue oxygenation and blood volume fraction. We see that the calibration curves closely follow the identity. For the 3med camera, the calibration is 'under-confident' at larger values, which would make estimates based on the confidence intervals in this range less reliable.
Table 2: Performance of the four cameras for tissue oxygenation (left block) and blood volume fraction (right block).

Camera | MM [%] | RMSE [pp] | W [pp] | MM [%] | RMSE [pp] | W [pp]
27equi | 0.3 | 2.3 | 2.3 (0%) | 0.1 | 1.6 | 3.4 (67%)
8med | 0.5 | 2.9 | 4.0 (0%) | 0.0 | 1.7 | 3.5 (62%)
3nRGB | 9.3 | 4.8 | 5.8 (0%) | 0.0 | 1.7 | 3.1 (64%)
3med | 3.6 | 5.7 | 8.6 (0%) | 0.0 | 2.4 | 5.3 (99%)
Table 2 shows the performance of the four different cameras using the metrics presented in Section 2.3. All computations were performed on the test set. As expected, the scores generally improve with an increasing number of spectral bands. An interesting observation is that the 3-band camera designed for medical use (3med) has a higher RMSE than the camera whose design was inspired by standard RGB cameras (3nRGB), yet it features a substantially reduced number of multimodal posteriors (3.6% vs 9.3%).
For all cameras except the 3nRGB, there are only a few multimodal posteriors for the reconstruction of tissue oxygenation. Figure 8(a) shows the estimates of the 27equi and the 3med cameras, which exhibit generally good reconstructive performance and allow outlier detection via the width of the posteriors.
In contrast, our results suggest that the blood volume fraction cannot be recovered from any of the cameras with high certainty. In fact, the percentage of samples whose posterior standard deviation exceeds half the standard deviation of the corresponding prior is greater than 50% for all four cameras. The poor performance is illustrated in Figure 8(b). We see that the 27equi camera still performs better than the 3med camera, but neither shows good performance for high values. This general trend also holds for the other two cameras. Note that since most posteriors were even wider than the priors, they did not qualify as candidates for the multiple-mode detection algorithm (cf. Section 2.3), explaining the low MM.
Furthermore, although W seems reasonable in absolute terms, comparing it to twice the standard deviation of the prior distribution reveals that the median width of the posteriors goes as high as 93% for the 3med camera, indicating that the blood volume fraction is effectively unrecoverable. For tissue oxygenation, the values range from 4% in the 27equi case to 15% in the 3med case.
4 Discussion
Meaningful performance assessment and benchmarking are crucial for advancing research and practice. Several publications, however (cf. e.g. [18]), suggest that the metrics chosen are not always well-suited for a specific assessment goal. In the context of multispectral intraoperative imaging, for example, camera assessment has typically been restricted to computing descriptive statistics on error metrics that quantify the difference between the estimates of an algorithm and reference (gold standard) results [25, 29, 30]. An advantage of this approach is that the error metrics are straightforward to compute and interpret. On the other hand, such performance measures do not reveal important insights into why methods perform poorly. In particular, they do not account for the different types of uncertainty that may occur when recovering tissue parameters from camera measurements. An interesting practical example is the 3-band camera designed for medical use [12] and investigated here: while it features a higher RMSE than a 3-band camera based on the standard RGB design, recovery of tissue parameters is substantially less ambiguous, as indicated by the reduced number of multimodal posteriors.
To address the issues related to commonly applied approaches to camera design, selection and performance assessment, we present a novel approach to camera assessment which provides the following key advantages compared to previously proposed methods:

Extended scope: The topic of camera design is closely linked to that of band selection [28]. To our knowledge, however, none of the approaches proposed in this field addresses the potential inherent ambiguity associated with the recovery of physiological parameters. To overcome this bottleneck, we propose moving beyond point estimates and mapping measurements to a full posterior probability distribution. Analysis of the posteriors not only provides us with a means of quantifying the uncertainty related to a specific measurement but also allows for a fundamental theoretical analysis of which tissue properties can, in principle, be recovered with a given camera.

No need for acquisition of real data: Many approaches to band/camera selection rely on the acquisition of real data [9, 10, 21, 22, 27]. Yet, acquiring real data for a given application is often impractical due to budget constraints (no funds to purchase a whole range of cameras) or ethical issues. We address this issue by performing the comparative assessment in silico. Experiments with a range of porcine and human organs confirm the realism of our simulation framework.
The above computations show that our framework can provide the same error metrics as before (e.g. RMSE) while offering the potential for finer differentiation through additional metrics (e.g. number of modes or width of the posterior).
While it is straightforward to compute the widths of the posteriors, fully automatic multiple-mode detection is not trivial due to the many parameters involved. For example, the posteriors are only implicitly given by a number of samples generated from latent space samples. This fact alone introduces statistical fluctuations into the estimated posterior. A kernel density estimate can smooth out these effects, but at the cost of introducing a bandwidth parameter with a high impact on the number of resolved maxima. In addition, outliers must be handled in order to avoid spurious maxima at the boundary of the posterior.
The calibration of our models suggests that while the confidence of our posteriors is already good, there is still room for improvement. In particular, the calibration for the 3med camera is off for larger confidences. Future studies aiming at finer differentiation between the observed cameras would have to remedy this. We are confident that this can be achieved, keeping in mind the convincing results for the other three cameras.
Another obstacle that learned methods have to sidestep is so-called out-of-distribution samples: the performance of our algorithm can only be guaranteed on data similar to the training data. In general, this problem is difficult to tackle. In our case, the PCA projections of the organ measurements show, by way of example, that the spectra of many objects of interest, such as internal organs, are in fact within our training distribution.
The color tile experiments, together with the measured organ spectra, suggest the validity of our simulation framework. A natural next step would be to test the performance of our method on real data. Here, a key challenge remains: real data will always be subject to noise, which needs to be handled adequately by our algorithm. One approach would be to average the spectra, either by using a higher integration time or by averaging multiple measurements. However, if there are time constraints, for example induced by organ movement, there are limits to the amount of averaging possible. Another approach would be to incorporate a realistic noise model into the simulation framework to account for noise during training. This would circumvent the time constraint, as the evaluation time of the network trained with this new data set would not change.
Another interesting direction for future work is to apply the framework to additional cameras that are widely used in a clinical context (e.g. RGB or narrow-band cameras). We expect some obstacles with regard to the extension to 2-band cameras, as very little information is left for a multi-parameter reconstruction. Additionally, these cameras would require a larger range of simulated wavelengths than the data set our work is based on [29]. However, extending the framework to these ranges should be straightforward.
Additionally, while this study focused on intraoperative optical imaging, the concept of performance assessment using INNs could easily be transferred to other fields of research. Any imaging modality with pixel-wise spectral information is a prime candidate. For larger image contexts, INNs are still under active development: because of their particular structure, the hidden layer size is the same as the input and output dimension, leading to very large networks when images are processed as a whole. One imaging modality where it might be fruitful to apply our INN method is quantitative photoacoustic imaging (qPAI). It has been shown that qPAI is an ill-posed inverse problem in theory [4]. However, to the best of our knowledge, an in silico or even in vivo analysis of the practical implications of this non-uniqueness has not been conducted. The ability to detect ambiguous reconstructions of physiological parameters seems a promising candidate to close this gap.
In conclusion, we have presented a novel method for performance assessment of optical cameras that bears the potential to measure the well-posedness of the underlying inverse problem. Future work should focus on the evaluation steps necessary to fully harness the power of the computed posterior distributions. In particular, robust mode detection algorithms seem a fruitful area for further investigation in order to quantify the uniqueness of the reconstruction.
References
 [1] Alerstam, E., Lo, W.C.Y., Han, T.D., Rose, J., Andersson-Engels, S., Lilge, L.: Next-generation acceleration and code optimization for light transport in turbid media using GPUs. Biomedical optics express 1(2), 658–675 (2010)
 [2] Ardizzone, L., Kruse, J., Rother, C., Köthe, U.: Analyzing inverse problems with invertible neural networks. In: International Conference on Learning Representations (2019). URL https://openreview.net/forum?id=rJed6j0cKX
 [3] Clancy, N.T., Arya, S., Stoyanov, D., Singh, M., Hanna, G.B., Elson, D.S.: Intraoperative measurement of bowel oxygen saturation using a multispectral imaging laparoscope. Biomedical optics express 6(10), 4179–4190 (2015)
 [4] Cox, B., Laufer, J., Beard, P.: The challenges for quantitative photoacoustic imaging. In: Photons Plus Ultrasound: Imaging and Sensing 2009, vol. 7177, p. 717713. International Society for Optics and Photonics (2009)
 [5] Dinh, L., Sohl-Dickstein, J., Bengio, S.: Density estimation using Real NVP. arXiv preprint arXiv:1605.08803 (2016)
 [6] Feindt, M.: A Neural Bayesian Estimator for Conditional Probability Densities. arXiv:physics/0402093 (2004)
 [7] Gal, Y., Ghahramani, Z.: Dropout as a bayesian approximation: Representing model uncertainty in deep learning. In: international conference on machine learning, pp. 1050–1059 (2016)
 [8] Gretton, A., Borgwardt, K.M., Rasch, M.J., Schölkopf, B., Smola, A.: A kernel two-sample test. Journal of Machine Learning Research 13(Mar), 723–773 (2012)
 [9] Gu, X., Han, Z., Yao, L., Zhong, Y., Shi, Q., Fu, Y., Liu, C., Wang, X., Xie, T.: Image enhancement based on in vivo hyperspectral gastroscopic images: a case study. Journal of Biomedical Optics 21(10), 101412 (2016). DOI 10.1117/1.JBO.21.10.101412. URL http://biomedicaloptics.spiedigitallibrary.org/article.aspx?doi=10.1117/1.JBO.21.10.101412
 [10] Han, Z., Zhang, A., Wang, X., Sun, Z., Wang, M.D., Xie, T.: In vivo use of hyperspectral imaging to develop a non-contact endoscopic diagnosis support system for malignant colorectal tumors. Journal of Biomedical Optics 21(1), 016001 (2016). URL http://biomedicaloptics.spiedigitallibrary.org/article.aspx?articleid=2481122
 [11] Jacques, S.L.: Optical properties of biological tissues: a review. Physics in medicine and biology 58(11), R37 (2013)
 [12] Kaneko, K., Yamaguchi, H., Saito, T., Yano, T., Oono, Y., Ikematsu, H., Nomura, S., Sato, A., Kojima, M., Esumi, H., Ochiai, A.: Hypoxia imaging endoscopy equipped with laser light source from preclinical live animal study to first-in-human subject research. PloS one 9(6), e99055 (2014)
 [13] Kirchner, T., Gröhl, J., Maier-Hein, L.: Context encoding enables machine learning-based quantitative photoacoustics. Journal of Biomedical Optics 23(5), 056008 (2018). DOI 10.1117/1.JBO.23.5.056008
 [14] Kohl, S.A., Romera-Paredes, B., Meyer, C., De Fauw, J., Ledsam, J.R., Maier-Hein, K.H., Eslami, S., Rezende, D.J., Ronneberger, O.: A probabilistic U-Net for segmentation of ambiguous images. arXiv preprint arXiv:1806.05034 (2018)
 [15] Lakshminarayanan, B., Pritzel, A., Blundell, C.: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles. In: I. Guyon, U.V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, R. Garnett (eds.) Advances in Neural Information Processing Systems 30, pp. 6402–6413. Curran Associates, Inc. (2017)
 [16] Leibig, C., Allken, V., Ayhan, M.S., Berens, P., Wahl, S.: Leveraging uncertainty information from deep neural networks for disease detection. Scientific Reports 7(1), 17816 (2017). DOI 10.1038/s4159801717876z
 [17] Li, Y., Gal, Y.: Dropout Inference in Bayesian Neural Networks with Alphadivergences. arXiv:1703.02914 [cs, stat] (2017)
 [18] Maier-Hein*, L., Eisenmann*, M., Reinke, A., Onogur, S., Stankovic, M., Scholz, P., Arbel, T., Bogunovic, H., Bradley, A.P., Carass, A., Feldmann, C., Frangi, A.F., Full, P.M., van Ginneken, B., Hanbury, A., Honauer, K., Kozubek, M., Landman, B.A., März, K., Maier, O., Maier-Hein, K., Menze, B.H., Müller, H., Neher, P.F., Niessen, W., Rajpoot, N., Sharp, G.C., Sirinukunwattana, K., Speidel, S., Stock, C., Stoyanov, D., Taha, A.A., van der Sommen, F., Wang, C.W., Weber, M.A., Zheng, G., Jannin*, P., Kopp-Schneider*, A.: Is the winner really the best? A critical analysis of common research practice in biomedical image analysis competitions (2018)

 [19] Moccia, S., Wirkert, S.J., Kenngott, H., Vemuri, A.S., Apitz, M., Mayer, B., De Momi, E., Mattos, L.S., Maier-Hein, L.: Uncertainty-aware organ classification for surgical data science applications in laparoscopy. IEEE Transactions on Biomedical Engineering (2018)
 [20] Niculescu-Mizil, A., Caruana, R.: Predicting good probabilities with supervised learning. In: Proceedings of the 22nd International Conference on Machine Learning, pp. 625–632. ACM (2005)
 [21] Nouri, D., Lucas, Y., Treuillet, S.: Efficient tissue discrimination during surgical interventions using hyperspectral imaging. In: International Conference on Information Processing in Computer-Assisted Interventions, pp. 266–275. Springer (2014). URL http://link.springer.com/chapter/10.1007/9783319075211_28
 [22] Nouri, D., Lucas, Y., Treuillet, S.: Hyperspectral interventional imaging for enhanced tissue visualization and discrimination combining band selection methods. International Journal of Computer Assisted Radiology and Surgery 11(12), 2185–2197 (2016). DOI 10.1007/s1154801614495. URL http://link.springer.com/10.1007/s1154801614495
 [23] Smith, L., Gal, Y.: Understanding Measures of Uncertainty for Adversarial Example Detection. arXiv:1803.08533 [cs, stat] (2018). ArXiv: 1803.08533
 [24] Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., Salakhutdinov, R.: Dropout: a simple way to prevent neural networks from overfitting. The Journal of Machine Learning Research 15(1), 1929–1958 (2014)
 [25] Waibel, D., Gröhl, J., Isensee, F., Kirchner, T., Maier-Hein, K., Maier-Hein, L.: Reconstruction of initial pressure from limited view photoacoustic images using deep learning. In: Photons Plus Ultrasound: Imaging and Sensing 2018, vol. 10494, p. 104942S. International Society for Optics and Photonics (2018)
 [26] Wang, L., Jacques, S.L., Zheng, L.: MCML—Monte Carlo modeling of light transport in multi-layered tissues. Computer methods and programs in biomedicine 47(2), 131–146 (1995)
 [27] Wirkert, S.J., Clancy, N.T., Stoyanov, D., Arya, S., Hanna, G.B., Schlemmer, H.P., Sauer, P., Elson, D.S., Maier-Hein, L.: Endoscopic Sheffield Index for Unsupervised In Vivo Spectral Band Selection. In: X. Luo, T. Reichl, D. Mirota, T. Soper (eds.) Computer-Assisted and Robotic Endoscopy, vol. 8899, pp. 110–120. Springer International Publishing, Cham (2014). URL http://www.springerprofessional.de/011—endoscopicsheffieldindexforunsupervisedinvivospectralbandselection/5457688.html
 [28] Wirkert, S.J., Isensee, F., Vemuri, A.S., Maier-Hein, K., Fei, B., Maier-Hein, L.: Domain and task specific multispectral band selection (conference presentation). In: Design and Quality for Biomedical Technologies XI (2018). DOI 10.1117/12.2287824. URL https://doi.org/10.1117/12.2287824

 [29] Wirkert, S.J., Kenngott, H., Mayer, B., Mietkowski, P., Wagner, M., Sauer, P., Clancy, N.T., Elson, D.S., Maier-Hein, L.: Robust near real-time estimation of physiological parameters from megapixel multispectral images with inverse Monte Carlo and random forest regression. International journal of computer assisted radiology and surgery 11(6), 909–917 (2016)
 [30] Wirkert, S.J., Vemuri, A.S., Kenngott, H.G., Moccia, S., Götz, M., Mayer, B.F., Maier-Hein, K.H., Elson, D.S., Maier-Hein, L.: Physiological parameter estimation from multispectral images unleashed. In: International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 134–141. Springer (2017)
 [31] Zhu, Y., Zabaras, N.: Bayesian deep convolutional encoderdecoder networks for surrogate modeling and uncertainty quantification. Journal of Computational Physics 366, 415–447 (2018). DOI 10.1016/j.jcp.2018.04.018