Neural density estimation and uncertainty quantification for laser induced breakdown spectroscopy spectra

Constructing probability densities for inference in high-dimensional spectral data is often intractable. In this work, we use normalizing flows on structured spectral latent spaces to estimate such densities, enabling downstream inference tasks. In addition, we evaluate a method for uncertainty quantification when predicting unobserved state vectors associated with each spectrum. We demonstrate the capability of this approach on laser-induced breakdown spectroscopy data collected by the ChemCam instrument on the Mars rover Curiosity. Using our approach, we are able to generate realistic spectral samples and to accurately predict state vectors with associated well-calibrated uncertainties. We anticipate that this methodology will enable efficient probabilistic modeling of spectral data, leading to potential advances in several areas, including out-of-distribution detection and sensitivity analysis.

1 Introduction

The ChemCam instrument on the Mars rover Curiosity uses laser-induced breakdown spectroscopy (LIBS), a type of atomic emission spectroscopy, to remotely analyze Martian rocks (Wiens et al., 2012). Spectral information is used to extract the qualitative and quantitative chemical content (QQCC) of a material sample, where the QQCC can be seen as an unobserved state vector representing the sample. Both linear and nonlinear supervised learning techniques have been applied to map LIBS spectra to QQCC with good accuracy (Forni et al., 2013; Boucher et al., 2015; Castorena et al., 2021). In this work, we build upon these efforts by proposing a framework for constructing the probability density function (PDF) of LIBS spectra. In addition, unlike previous methods, which produce only point estimates of QQCC, we propose and evaluate a method for uncertainty quantification (UQ) on point predictions of QQCC.

Many real-world spectra, including LIBS spectra, are characterized by a large number of features, which complicates the construction of the data PDF due to the curse of dimensionality. We develop a novel framework for constructing low-dimensional PDFs suitable for downstream inference (e.g., sampling, density estimation, outlier detection, or unsupervised representation learning) using state-of-the-art neural density estimators with normalizing flows (NF) on spectral latent spaces. This framework allows us to generate realistic spectral samples on a reduced space and, using an inverse transformation, project them back to the physically interpretable space. Furthermore, we propose a bootstrapping approach for quantifying uncertainty in predictions of the unobserved state vector corresponding to each spectrum. We demonstrate the capabilities of the proposed approach by constructing the PDF of a LIBS spectral data set and learning a mapping to the known QQCC with uncertainty. The validated framework can then be employed for QQCC prediction and direct UQ of novel samples, such as artificial samples generated by the NF model as well as spectra collected directly on Mars.

To the best of our knowledge, this is the first work to construct normalizing flows on spectral latent spaces, and the approach can be readily applied to any kind of spectroscopy data. We show that the proposed framework provides a straightforward way to perform downstream inference tasks and direct UQ for high-dimensional spectral data.

2 Methods

2.1 Problem statement

Let $\mathbf{x} \in \mathbb{R}^{D}_{\ge 0}$ be a $D$-dimensional random vector with non-negative elements that represents a spectral signal, with true data distribution $p^{*}(\mathbf{x})$. Our goal is to learn an invertible, stable mapping between the approximate data distribution $p(\mathbf{x})$ and a latent distribution $p(\mathbf{u})$ (e.g., Gaussian) that will allow for fast evaluation of various inference tasks. However, estimating the full joint density of very high-dimensional spectra is a challenging and often intractable task. Therefore, we introduce a second mapping that transforms $\mathbf{x}$ to $\mathbf{z} \in \mathbb{R}^{K}$, where $K \ll D$, to discover the spectral latent representation of the signals. Next, we learn an invertible mapping between $\mathbf{z}$ (spectral latent variable) and $\mathbf{u}$ (latent variable). This framework allows us to generate novel samples on the reduced spectral latent space and to use the inverse transformation to map them back to the original space, thereby approximating the true data distribution.

We also want to estimate an unobserved state vector $\mathbf{y} \in \mathbb{R}^{M}$, where $M$ is the vector dimensionality. For the ChemCam application we consider, $\mathbf{y}$ represents the QQCC, an 8-dimensional vector with the relative weight percentages of the 8 major oxides commonly found on Mars. Given a training dataset of LIBS spectra and associated compositions (samples generated on Earth), we are interested in constructing a surrogate of the mapping $g: \mathbf{x} \mapsto \mathbf{y}$ that will allow us to predict the chemical concentrations of novel samples. To calculate the uncertainties of these predictions, we propose an approach based on bootstrapping that allows us to quantify both model and data uncertainties and thus assign measures of accuracy to sample estimates. This approach can then be employed for UQ of data generated by the normalizing flow model.

2.2 Spectral NMF latent space

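Consider $N$ observations of the random vector $\mathbf{x}$ and let the data matrix be $\mathbf{X} = [\mathbf{x}_1, \dots, \mathbf{x}_N] \in \mathbb{R}^{D \times N}_{\ge 0}$. We use non-negative matrix factorization (NMF) to decompose $\mathbf{X}$ into a product of a non-negative basis matrix $\mathbf{W} \in \mathbb{R}^{D \times K}_{\ge 0}$ and a non-negative coefficient matrix $\mathbf{H} = [\mathbf{z}_1, \dots, \mathbf{z}_N] \in \mathbb{R}^{K \times N}_{\ge 0}$, such that $\mathbf{X} \approx \mathbf{W}\mathbf{H}$ (or equivalently $\mathbf{x}_n \approx \mathbf{W}\mathbf{z}_n$) (Paatero and Tapper, 1994; Wang and Zhang, 2012). NMF thus represents each data point as a linear combination of the basis vectors. The NMF optimization problem consists of minimizing the Frobenius norm of the difference between $\mathbf{X}$ and $\mathbf{W}\mathbf{H}$, subject to the non-negativity constraints.

NMF is appropriate for non-negative data and is a powerful method for feature selection; it thus allows us to ignore the non-informative LIBS dimensions and makes the results interpretable (Rammelkamp et al., 2020).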
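
The factorization described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: it uses the classical multiplicative-update rule for the Frobenius objective, which is one common NMF solver and not necessarily the one used in this work, and the function name `nmf`, the toy data, and all sizes are hypothetical.

```python
import numpy as np


def nmf(X, k, n_iter=500, eps=1e-10, seed=0):
    """Factor a non-negative (D x N) matrix X as W @ H with W (D x k), H (k x N).

    Uses multiplicative updates, which keep W and H non-negative and do not
    increase the Frobenius reconstruction error ||X - W H||_F.
    """
    rng = np.random.default_rng(seed)
    D, N = X.shape
    W = rng.random((D, k)) + eps
    H = rng.random((k, N)) + eps
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)   # update coefficients
        W *= (X @ H.T) / (W @ H @ H.T + eps)   # update basis vectors
    return W, H


# Toy non-negative data with an exact rank-3 structure (not LIBS data).
rng = np.random.default_rng(1)
X = rng.random((50, 3)) @ rng.random((3, 200))

W, H = nmf(X, k=3)
rel_err = np.linalg.norm(X - W @ H) / np.linalg.norm(X)
print(f"relative reconstruction error: {rel_err:.4f}")

# Each column of H is the low-dimensional representation of one sample;
# W @ H[:, n] maps it back to the original space.
x0_reconstructed = W @ H[:, 0]
```

Because every factor in the updates is non-negative, the iterates stay non-negative without any explicit projection step.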

2.3 Inference via a spectral normalizing flow

We propose the construction of a normalizing flow model on the latent space of the LIBS spectra, obtained by the NMF decomposition, to learn the underlying probability distribution of the spectral latent variable $\mathbf{z}$. The inverse NMF mapping introduced in the previous section can be used to project generated samples back to the physically interpretable space (i.e., $\mathbf{x} \approx \mathbf{W}\mathbf{z}$, with $\mathbf{W}$ the NMF basis matrix).

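Normalizing flows are a powerful class of likelihood-based generative models that transform a base density into a target density through a series of deterministic and invertible transformations (Kobyzev et al., 2020). Consider the base density $p_U(\mathbf{u})$ of the latent variable $\mathbf{u}$, the more complex density $p_Z(\mathbf{z})$ of the spectral latent variable $\mathbf{z}$, and an invertible mapping $f_\theta: \mathbf{z} \mapsto \mathbf{u}$. Under the change of variables formula we can compute the log-likelihood of $\mathbf{z}$ as

$$\log p_Z(\mathbf{z}) = \log p_U\big(f_\theta(\mathbf{z})\big) + \log \left| \det \frac{\partial f_\theta(\mathbf{z})}{\partial \mathbf{z}} \right| \qquad (1)$$

where $\theta$ represents the trainable parameters of the flow. To train the NF model, the negative log-likelihood (NLL) obtained from Eq. (1) is minimized.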
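
As a small worked illustration of the change of variables formula, consider a one-dimensional flow with a standard normal base density and the affine map u = (z - mu) / sigma. The log-density it induces on z must equal the log-density of a Gaussian with mean mu and standard deviation sigma, which the sketch below checks numerically. The function names and toy values are hypothetical, and this single affine map is far simpler than the flow used in this work.

```python
import math


def std_normal_logpdf(u):
    """Log-density of the standard normal base distribution."""
    return -0.5 * (u * u + math.log(2.0 * math.pi))


def flow_logpdf(z, mu, sigma):
    """log p(z) under the flow u = f(z) = (z - mu) / sigma.

    Change of variables: log p(z) = log p_base(f(z)) + log |df/dz|,
    and here df/dz = 1 / sigma.
    """
    u = (z - mu) / sigma
    log_abs_det = -math.log(sigma)
    return std_normal_logpdf(u) + log_abs_det


def gaussian_logpdf(z, mu, sigma):
    """Closed-form Gaussian log-density, used only as a reference."""
    return (-0.5 * ((z - mu) / sigma) ** 2
            - math.log(sigma) - 0.5 * math.log(2.0 * math.pi))


def nll(data, mu, sigma):
    """Average negative log-likelihood, the quantity minimized in training."""
    return -sum(flow_logpdf(z, mu, sigma) for z in data) / len(data)


data = [0.1, 0.9, 1.4, 2.2]
print(flow_logpdf(1.3, 0.5, 2.0), gaussian_logpdf(1.3, 0.5, 2.0))
print(nll(data, mu=1.15, sigma=1.0), nll(data, mu=0.0, sigma=1.0))
```

The NLL is lower when mu matches the sample mean of the toy data (1.15) than when it is set to 0, which is the behavior a gradient-based optimizer exploits when fitting the flow parameters.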

Here, we parameterize the normalizing flow with a sequence of real-valued non-volume preserving (RealNVP) transformations (Dinh et al., 2016). The RealNVP model composes two types of invertible transformations: additive coupling layers and rescaling. The model uses so-called affine coupling layers for the coupling flows, which are simple and computationally efficient. For a $K$-dimensional input $\mathbf{z}$ and output $\mathbf{u}$, the transformation can be written as

$$\mathbf{u}_{1:d} = \mathbf{z}_{1:d}, \qquad \mathbf{u}_{d+1:K} = \mathbf{z}_{d+1:K} \odot \exp\big(s(\mathbf{z}_{1:d})\big) + t(\mathbf{z}_{1:d}) \qquad (2)$$

where $\odot$ is the Hadamard (element-wise) product and the $\exp$ is applied to each element of $s(\mathbf{z}_{1:d})$. The above transformation performs a '1-1' (identity) mapping on the first $d$ elements and scales and shifts the remaining $K - d$. When coupling layers are stacked in the flow, the elements are permuted across layers so that a different set of elements is copied each time. Here we model the scale and shift functions, $s$ and $t$, as neural networks. Once the PDF is learned, downstream inference tasks can be performed straightforwardly. In the next section, we turn to predicting the elemental composition of novel samples, with their associated uncertainty.
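
A single affine coupling layer of the kind described above can be sketched in NumPy as follows. This is a minimal, untrained illustration: the class name `AffineCoupling`, the hidden width, the split index `d`, and the tiny random networks standing in for the scale and shift functions are all assumptions made for the example, not details of the implementation used in this work.

```python
import numpy as np


class AffineCoupling:
    """One affine coupling layer: copy the first d elements, and scale and
    shift the rest using functions of the copied part."""

    def __init__(self, dim, d, hidden=16, seed=0):
        rng = np.random.default_rng(seed)
        self.d = d
        # Small random one-hidden-layer networks standing in for s and t.
        self.W1 = rng.normal(scale=0.1, size=(d, hidden))
        self.Ws = rng.normal(scale=0.1, size=(hidden, dim - d))
        self.Wt = rng.normal(scale=0.1, size=(hidden, dim - d))

    def _scale_shift(self, z1):
        h = np.tanh(z1 @ self.W1)
        return h @ self.Ws, h @ self.Wt

    def forward(self, z):
        z1, z2 = z[:, :self.d], z[:, self.d:]
        s, t = self._scale_shift(z1)
        u2 = z2 * np.exp(s) + t
        log_det = s.sum(axis=1)  # Jacobian is triangular with diagonal exp(s)
        return np.concatenate([z1, u2], axis=1), log_det

    def inverse(self, u):
        u1, u2 = u[:, :self.d], u[:, self.d:]
        s, t = self._scale_shift(u1)  # u1 equals z1, so s and t are recoverable
        z2 = (u2 - t) * np.exp(-s)
        return np.concatenate([u1, z2], axis=1)


rng = np.random.default_rng(123)
z = rng.normal(size=(4, 15))          # 4 toy samples in a 15-dimensional space
layer = AffineCoupling(dim=15, d=7)
u, log_det = layer.forward(z)
z_rec = layer.inverse(u)
print(np.allclose(z_rec, z), log_det.shape)
```

The inverse never needs to invert the scale and shift networks themselves, which is why they can be arbitrary neural networks, and the log-determinant needed for the likelihood is just the sum of the scale outputs.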

2.4 Uncertainty quantification via bootstrapping

We now aim to construct a mapping between the LIBS signal signatures and QQCC. We train $M$ shallow neural networks, one for each oxide. The models take the form

$$\hat{y}_j = g_j(\mathbf{x}) = \mathbf{a}_j^{\top} \sigma\big(\mathbf{A}_j \mathbf{x} + \mathbf{b}_j\big) + c_j, \qquad j = 1, \dots, M \qquad (3)$$

where $M$ is the total number of oxides to be determined, $\{\mathbf{A}_j, \mathbf{a}_j\}$ and $\{\mathbf{b}_j, c_j\}$ denote the trainable weights and biases, and $\sigma$ is the activation function. We learn the parameters of the models with a training set $\{(\mathbf{x}_n, \mathbf{y}_n)\}_{n=1}^{N}$, where $N$ is the total number of samples. The main advantage of such models is that they are fast to train and achieve high accuracy (see Results).

Figure 2: Random sample generated by the normalizing flow model and transformed to the original space.

To quantify uncertainties related to predictions of elemental compositions we use bootstrapping (Kumar and Srivastava, 2012), a statistical resampling method that allows us to assign measures of accuracy to a sample estimate. In general, bootstrapping performs as well as parametric prediction intervals and is straightforward to implement. For our application, its computational cost is low because the bootstrapped models are simple shallow neural networks. For more complex models, methods that leverage the last layer of the network can be employed instead, as they have shown good performance (Brosse et al., 2020). Given a new observation $\mathbf{x}^{*}$, we can write

$$\hat{y}^{*}_{b} = g_{b}(\mathbf{x}^{*}) + \hat{\epsilon}(\mathbf{x}^{*}), \qquad b = 1, \dots, B \qquad (4)$$

where $g_{b}(\mathbf{x}^{*})$ represents the model estimate at the $b$-th bootstrap iteration (model uncertainty) and $\hat{\epsilon}(\mathbf{x}^{*})$ the predicted residual between true and predicted values, which can be modeled on a training dataset with a regression model (data uncertainty). To measure the quality of the prediction intervals, we compute the coverage on validation samples (the rate at which the actual values fall within the prediction interval).
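
The bootstrap procedure and the coverage metric can be sketched as follows. To keep the example short, two simplifications are assumed and should not be read as the method used in this work: a least-squares line replaces the shallow neural network, and the data-uncertainty term is drawn from the residuals of each bootstrap fit instead of being predicted by a separate regression model. All names, sizes, and the synthetic data are hypothetical.

```python
import numpy as np


def bootstrap_intervals(x_train, y_train, x_new, n_boot=500, alpha=0.05, seed=0):
    """Percentile prediction intervals from bootstrap refits.

    Each iteration resamples the training pairs with replacement, refits the
    model (model uncertainty), and adds a resampled residual (data uncertainty).
    """
    rng = np.random.default_rng(seed)
    n = len(x_train)
    preds = np.empty((n_boot, len(x_new)))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)
        xb, yb = x_train[idx], y_train[idx]
        slope, intercept = np.polyfit(xb, yb, deg=1)   # stand-in model fit
        resid = yb - (slope * xb + intercept)
        noise = rng.choice(resid, size=len(x_new))
        preds[b] = slope * x_new + intercept + noise
    lo = np.percentile(preds, 100 * alpha / 2, axis=0)
    hi = np.percentile(preds, 100 * (1 - alpha / 2), axis=0)
    return lo, hi


def coverage(y_true, lo, hi):
    """Fraction of true values that fall inside their prediction interval."""
    return float(np.mean((y_true >= lo) & (y_true <= hi)))


# Synthetic data: a noisy linear relation, 300 training and 140 validation points.
rng = np.random.default_rng(42)
x_train = rng.uniform(0, 10, size=300)
y_train = 2.0 * x_train + 1.0 + rng.normal(scale=1.0, size=300)
x_val = rng.uniform(0, 10, size=140)
y_val = 2.0 * x_val + 1.0 + rng.normal(scale=1.0, size=140)

lo, hi = bootstrap_intervals(x_train, y_train, x_val)
print(f"empirical coverage of 95% intervals: {coverage(y_val, lo, hi):.3f}")
```

An empirical coverage close to the nominal level (here 95%) indicates well-calibrated intervals; values well below it suggest the intervals are too narrow.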

3 Results

Consider the LIBS spectra matrix $\mathbf{X}$, with $N$ spectra and $D$ spectral channels. We perform NMF using $K = 15$ components (selected by 5-fold cross-validation) to obtain the transformed data matrix $\mathbf{H}$. Next, we construct an NF model based on the RealNVP architecture with 5 coupling layers and a Gaussian distribution as the base density. This approach has a clear computational advantage: constructing an NF model on the 15-dimensional latent space is very fast, with training requiring less than 1 minute of CPU time. In Figure 2, we show a novel random sample generated by the NF model and transformed back to the original space with the inverse NMF mapping.

Figure 1: Regression results for the composition of each of the 8 major oxides for a holdout set of 140 samples. Accuracy is measured with the coefficient of determination ($R^2$ score), and the x and y axes represent the true and predicted oxide wt.%, respectively.

oxide
coverage (%)   84.89   98.56   86.33   86.33   86.33   96.40   93.53   89.93
Table 1: Coverage results for 95% confidence intervals and 140 validation samples.

We consider 8 oxides and therefore construct $M = 8$ single-hidden-layer neural network models with ReLU activation functions, trained with stochastic gradient descent (SGD). To measure the accuracy of the results we compute the coefficient of determination, calculated as $R^2 = 1 - SS_{\mathrm{res}} / SS_{\mathrm{tot}}$, where $SS_{\mathrm{res}}$ represents the sum of squares of residuals and $SS_{\mathrm{tot}}$ the total sum of squares. We show the accuracy of the models in Figure 1 for a holdout set of 140 samples. The results show that the point estimates are close to the optimal regression (1:1 line) and that overall the response is comparable to state-of-the-art deep CNN approaches (Castorena et al., 2021). For two of the oxides, the estimates show larger deviations, which are not considered significant given the small oxide wt.% values in these regions.
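
For concreteness, the coefficient of determination defined above can be computed as in the short sketch below. The function name `r2_score` and the four toy values are made up for illustration and are not results from this work.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)             # total sum of squares
    return 1.0 - ss_res / ss_tot


# Hypothetical true and predicted values for four samples.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 1.9, 3.2, 3.8]

# SS_res = 0.01 + 0.01 + 0.04 + 0.04 = 0.10 and SS_tot = 5.0, so R^2 = 0.98.
print(round(r2_score(y_true, y_pred), 4))
```

A score of 1 corresponds to predictions lying exactly on the 1:1 line, while a model that always predicts the mean of the true values scores 0.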

Figure 3: Bootstrap results for a random sample.

Finally, we perform bootstrapping for a dataset of LIBS spectra collected on Earth (with associated ground truth) and compute the prediction intervals for the same holdout dataset of 140 samples. In Figure 3, we plot the distribution of bootstrap predictions together with the ground truth for a random Earth sample; the box plots span the first to the third quartile and the green stars mark the ground truth. Table 1 shows the coverage calculated for 95% prediction intervals over all holdout samples: the intervals come close to the nominal coverage, with empirical coverage between 84.89% and 98.56% across the oxides. The validated approach can therefore be used to make predictions with uncertainty for novel LIBS spectra, either generated by the normalizing flow model or collected directly on Mars by ChemCam.

4 Conclusions

We showed that the proposed framework provides a straightforward way to perform downstream inference tasks for high-dimensional spectral data by identifying a spectral latent space, estimated as a parsimonious representation of the data, and constructing a spectral normalizing flow model on the reduced space. The proposed approach is well suited to modeling high-dimensional data and enables complex distributions to be learned quickly and efficiently. Beyond general high-dimensional inference, the proposed UQ approach allows for predictions of state vectors associated with novel out-of-distribution data or data generated by the trained normalizing flow.

5 Broader impact

This work provides a robust approach to construct low-dimensional probability densities on spectral latent spaces and quantify uncertainties for predictions related to the elemental compositions of spectral samples generated directly from neural density estimators. Our approach has immediate application for inference and direct UQ of spectral data in different fields such as astronomy, geology, audio signal processing, bioinformatics and more. We believe that this work does not have future societal or ethical consequences.

This project was supported by the Laboratory Directed Research and Development program of Los Alamos National Laboratory under project number LDRD-20210043DR. Research was performed while K.K. was an Applied Machine Learning Summer Research Fellow at LANL.

References

  • T. F. Boucher, M. V. Ozanne, M. L. Carmosino, M. D. Dyar, S. Mahadevan, E. A. Breves, K. H. Lepore, and S. M. Clegg (2015) A study of machine learning regression methods for major elemental analysis of rocks using laser-induced breakdown spectroscopy. Spectrochimica Acta Part B: Atomic Spectroscopy 107, pp. 1–10. Cited by: §1.
  • N. Brosse, C. Riquelme, A. Martin, S. Gelly, and É. Moulines (2020) On last-layer algorithms for classification: decoupling representation from uncertainty estimation. arXiv preprint arXiv:2001.08049. Cited by: §2.4.
  • J. Castorena, D. Oyen, A. Ollila, C. Legget, and N. Lanza (2021) Deep spectral CNN for laser induced breakdown spectroscopy. Spectrochimica Acta Part B: Atomic Spectroscopy 178, pp. 106125. Cited by: §1, §3.
  • L. Dinh, J. Sohl-Dickstein, and S. Bengio (2016) Density estimation using real NVP. arXiv preprint arXiv:1605.08803. Cited by: §2.3.
  • O. Forni, S. Maurice, O. Gasnault, R. C. Wiens, A. Cousin, S. M. Clegg, J. Sirven, and J. Lasue (2013) Independent component analysis classification of laser induced breakdown spectroscopy spectra. Spectrochimica Acta Part B: Atomic Spectroscopy 86, pp. 31–41. Cited by: §1.
  • I. Kobyzev, S. Prince, and M. Brubaker (2020) Normalizing flows: an introduction and review of current methods. IEEE Transactions on Pattern Analysis and Machine Intelligence. Cited by: §2.3.
  • S. Kumar and A. Srivastava (2012) Bootstrap prediction intervals in non-parametric regression with applications to anomaly detection. In Proc. 18th ACM SIGKDD Conf. Knowl. Discovery Data Mining, Cited by: §2.4.
  • P. Paatero and U. Tapper (1994) Positive matrix factorization: a non-negative factor model with optimal utilization of error estimates of data values. Environmetrics 5 (2), pp. 111–126. Cited by: §2.2.
  • K. Rammelkamp, O. Gasnault, O. Forni, J. Lasue, and S. Maurice (2020) Optimization of clustering analyses for classification of ChemCam data from Gale Crater, Mars. In European Planetary Science Congress, pp. EPSC2020–867. Cited by: §2.2.
  • Y. Wang and Y. Zhang (2012) Nonnegative matrix factorization: a comprehensive review. IEEE Transactions on knowledge and data engineering 25 (6), pp. 1336–1353. Cited by: §2.2.
  • R. C. Wiens, S. Maurice, B. Barraclough, M. Saccoccio, W. C. Barkley, J. F. Bell, S. Bender, J. Bernardin, D. Blaney, J. Blank, et al. (2012) The ChemCam instrument suite on the Mars Science Laboratory (MSL) rover: body unit and combined system tests. Space science reviews 170 (1), pp. 167–227. Cited by: §1.