
Exploring Sensitivity of ICF Outputs to Design Parameters in Experiments Using Machine Learning

Building a sustainable burn platform in inertial confinement fusion (ICF) requires an understanding of the complex coupling of physical processes and the effects that key experimental design changes have on implosion performance. While simulation codes are used to model ICF implosions, incomplete physics and the need for approximations deteriorate their predictive capability. Identification of relationships between controllable design inputs and measurable outcomes can help guide the future design of experiments and development of simulation codes, which can potentially improve the accuracy of the computational models used to simulate ICF implosions. In this paper, we leverage developments in machine learning (ML) and methods for ML feature importance/sensitivity analysis to identify complex relationships in ways that are difficult to process using expert judgment alone. We present work using random forest (RF) regression for prediction of yield, velocity, and other experimental outcomes given a suite of design parameters, along with an assessment of important relationships and uncertainties in the prediction model. We show that RF models are capable of learning and predicting on ICF experimental data with high accuracy, and we extract feature importance metrics that provide insight into the physical significance of different controllable design inputs for various ICF design configurations. These results can be used to augment expert intuition and simulation results for optimal design of future ICF experiments.



I Introduction

Inertial confinement fusion (ICF), a technique for generating nuclear fusion reactions by heating and compressing a deuterium-tritium (DT) filled capsule, has been a focus of nuclear fusion research for decades [1]. Many modern ICF experiments are designed using computer simulations that approximate the real physical processes that occur during capsule implosion. However, when attempting to model applications where the underlying physics is not well understood—as is the case in ICF, where extreme temperatures and pressures exceed K and Mbar, respectively—simulations often perform poorly, and are not always validated by experimental data [2]. Moreover, ICF experiments are expensive to run, meaning that generating large sets of experimental data to validate simulation results is not always feasible.

Machine learning (ML) offers a novel framework for analyzing data from ICF experiments and simulations. Although the use of ML algorithms in the realm of ICF is relatively new, it has demonstrated some early successes. Using supervised ML techniques trained on a multi-petabyte dataset of ICF simulations, Peterson et al. [3] identify a new class of ovoid-shaped implosions that consistently achieve high yield in simulations despite the presence of hydrodynamic instabilities. Humbird et al. [2] train a deep neural network (DNN) surrogate model for low-fidelity ICF simulations and apply transfer learning, a technique in which models already trained on one dataset are partially re-trained to solve different but related tasks, to obtain a surrogate model for high-fidelity models and experiments. Hsu et al. [4] apply ML regression methods to experimental ICF data (the same dataset analyzed in Section III) to analyze relationships between experimental outputs of interest.

In this work, we utilize a random forest (RF) predictor [5] to identify relationships between controllable design inputs and experimental outputs from ICF experiments performed at the National Ignition Facility (NIF). The prediction model is then used to assess the sensitivity of predicted outputs to the design inputs in order to identify the design features most strongly related to changes in output. This importance analysis can be used to augment the understanding of expert designers and provide insight to improve future designs. Sections II and III introduce the ML methods and data, respectively, used in this work. Section IV looks at prediction on the full set of outputs and assesses the importance of design features to prediction across output metrics. Section V presents individual analyses of low- and high-density hohlraum gas fill shots. We finish by summarizing and discussing future work in Sections VI and VII.

II Machine Learning Background

RF regression is an ensemble ML method that employs multiple decision trees to produce highly accurate predictions on medium-to-large data sets. Decision trees are popular due to their efficiency and adaptability, but they tend to over-fit the training data and therefore perform poorly on unseen data [6, 7]. RFs reduce over-fitting by averaging over multiple decision trees: each tree is fit to a random sample of the full training data, and for each split of the tree, a random subset of the full feature set is considered [5]. RFs exhibit low generalization error on large data sets, and perform better than individual decision trees on both seen and unseen data [5, 8]. We use RFs instead of DNNs for this work because RFs typically outperform DNNs on small datasets such as ours, and because they are computationally cheaper to train than other models such as DNNs and Gaussian Processes (GPs).
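The bagging and feature-subsampling steps described above can be sketched with scikit-learn. This is a minimal illustration on synthetic data, not the authors' configuration: the data, the 16-column input shape, and every hyper-parameter value (`n_estimators`, `max_features`, the random seeds) are assumptions chosen only for the example.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: 140 "shots" with 16 design inputs (hypothetical values).
rng = np.random.default_rng(0)
X = rng.uniform(size=(140, 16))
y = 3.0 * X[:, 0] + np.sin(4.0 * X[:, 1]) + 0.1 * rng.normal(size=140)

# Random 80-20 train-test split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Each tree is fit to a bootstrap sample of the training rows (bootstrap=True),
# and each split considers a random subset of the features (max_features).
rf = RandomForestRegressor(
    n_estimators=200, max_features=0.5, bootstrap=True, random_state=0
)
rf.fit(X_train, y_train)

# The forest prediction is the average of the individual tree predictions.
tree_preds = np.stack([tree.predict(X_test) for tree in rf.estimators_])
forest_pred = rf.predict(X_test)

train_r2 = rf.score(X_train, y_train)  # R^2 on data the model has seen
test_r2 = rf.score(X_test, y_test)     # R^2 on held-out data
```

Comparing `train_r2` with `test_r2` gives a quick view of how much a fitted forest over-fits on a data set of this size.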

To analyze the input-output relationships encoded in the RF model, we use Accumulated Local Effects (ALE) [9]. ALE is an ML metric for interpretability, which refers to the ease with which a human can understand why an ML model makes the decisions that it does, and, consequently, the extent to which humans can predict the model’s results [10, 11]. Interpretability is essential to the safety of many systems (such as driverless cars) and, for scientific applications, necessary in order to extract meaningful scientific knowledge from the model’s behavior [12]. For ICF analysis, feature importance rankings are a crucial component of model interpretability because they reflect input-output relationships that can augment subject matter expert understanding of the physical processes.

ALE is a model-agnostic measure that describes the extent to which each feature influences the model’s predictions. The variance of this function averaged over the other features (the main effect of a feature) can be used to compare the relative importance of the features. This parallels variance-based sensitivity analysis such as Sobol indices, but ALE estimates feature importance by analyzing how much the model’s predictions change over a small range of each feature, then averaging and accumulating these differences over the prediction space. ALE provides consistent estimates of the main effect of the features even when features are correlated [9], as they often are in the case of ICF data.
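The ALE procedure outlined above (local prediction differences within small intervals of a feature, averaged, accumulated, then summarized by their variance) can be sketched in a few lines of NumPy. This is a simplified first-order estimator written for illustration, not the implementation of [9] or the authors' code; the function name `ale_1d`, the quantile binning, the bin count, and the toy model are all assumptions.

```python
import numpy as np

def ale_1d(predict, X, j, n_bins=10):
    """Simplified first-order ALE of feature j for a model exposed via `predict`."""
    x = X[:, j]
    # Quantile-based bin edges, so each bin holds roughly the same number of points.
    edges = np.unique(np.quantile(x, np.linspace(0.0, 1.0, n_bins + 1)))
    n_intervals = len(edges) - 1
    # Assign every sample to an interval index in 0 .. n_intervals - 1.
    idx = np.clip(np.searchsorted(edges, x, side="right") - 1, 0, n_intervals - 1)

    local = np.zeros(n_intervals)
    counts = np.zeros(n_intervals)
    for k in range(n_intervals):
        rows = X[idx == k]
        if len(rows) == 0:
            continue
        lo, hi = rows.copy(), rows.copy()
        lo[:, j] = edges[k]      # move feature j to the lower edge of the interval
        hi[:, j] = edges[k + 1]  # and to the upper edge, other features as observed
        # Local effect: mean change in prediction across the interval.
        local[k] = np.mean(predict(hi) - predict(lo))
        counts[k] = len(rows)

    ale = np.cumsum(local)                      # accumulate the local effects
    ale -= np.sum(ale * counts) / counts.sum()  # center using the bin counts as weights
    # Count-weighted variance of the centered curve, used as the importance score.
    importance = np.sum(counts * ale**2) / counts.sum()
    return edges[1:], ale, importance

rng = np.random.default_rng(0)
X = rng.uniform(size=(500, 3))
# Toy model: strong dependence on feature 0, weak on feature 1, none on feature 2.
predict = lambda Z: 2.0 * Z[:, 0] + 0.2 * Z[:, 1]
importances = [ale_1d(predict, X, j)[2] for j in range(3)]
```

Any fitted regressor can be passed in the same way, for example `ale_1d(rf.predict, X_train, j)` for a scikit-learn forest named `rf`.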

III Dataset

We train our regression model on data from 140 experiments conducted at the NIF beginning in 2011. Our work utilizes 21 design parameters simultaneously in order to predict each of four experimental outputs: total yield, velocity, fuel areal density (referred to here as simply $\rho R$), and gated X-ray bang time (referred to here as BT). Total yield represents the actual measured yield for fusion neutrons, corrected for the small portion of neutrons that lose energy due to scattering as they pass through the ice layer. Velocity (in $\mu$m/ns, equivalently km/s) refers to the implosion velocity. $\rho R$ (in g/cm$^2$) is a measure of fuel thickness, calculated as the product of average fuel mass density and fuel radius (assuming a spherical shape for the fuel within the capsule). BT (in ns) is the time at which the fusion neutrons were produced in the experiment, measured according to the time at which X-rays come out of the capsule.

The recorded experiments were performed with a variety of ignition capsule designs and ablator materials. Over this time period, experimental design systematically evolved, resulting in improved performance (see Fig. 1). In particular, hohlraum design was improved by switching from high density gas fills (group I) to low density gas fills (group II). We define high gas fill density as any value greater than 0.6 mg/cm$^3$. Expert opinion and previous work [4] indicate that, due to the significant physical differences between group I and II shots, separate analysis of each group may improve model prediction and provide insight as to the effects of the switch from high to low gas fills. In Section IV, we analyze RF performance on both groups together, while Section V includes a separate analysis of model prediction quality and feature importance rankings for each group.

As with any statistical analysis, our results are only as good as the available data. Uncharacterized inputs, such as surface roughness, will not be considered by the ML algorithm. Because the data contains missing values for some of the recorded experimental variables, we pre-process the dataset using iterative imputation from the scikit-learn package in Python, which employs Bayesian ridge regression to estimate (impute) missing values using the remaining, observed features. The choice of imputation method is important, as missing data that is replaced with arbitrary values can falsely skew model results. Iterative imputation can provide a more informed estimate of missing feature values than methods such as zero-imputation or mean-imputation.
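The imputation step can be sketched with scikit-learn's `IterativeImputer`, which is still flagged as experimental and therefore needs the explicit enabling import shown below. The data, the 10% missingness rate, and the `max_iter` value are assumptions for illustration only; the paper does not state the settings that were used.

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401 (enables the next import)
from sklearn.impute import IterativeImputer
from sklearn.linear_model import BayesianRidge

# Synthetic stand-in data in which column 3 depends on columns 0 and 1.
rng = np.random.default_rng(0)
X_full = rng.normal(size=(140, 4))
X_full[:, 3] = 2.0 * X_full[:, 0] - X_full[:, 1] + 0.05 * rng.normal(size=140)

# Knock out roughly 10% of the entries at random to mimic unrecorded values.
X_missing = X_full.copy()
mask = rng.random(X_missing.shape) < 0.1
X_missing[mask] = np.nan

# Each feature with missing entries is regressed on the others in turn,
# using Bayesian ridge regression as the per-feature estimator.
imputer = IterativeImputer(estimator=BayesianRidge(), max_iter=10, random_state=0)
X_imputed = imputer.fit_transform(X_missing)
```

Only the missing entries are filled in; the observed values pass through unchanged.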

The data includes experimental uncertainties for three of the four output quantities studied in this work: total yield, $\rho R$, and BT. The physical origin of the reported uncertainties is not noted in the data; however, we treat each uncertainty measurement as one standard deviation ($1\sigma$) to be conservative. The data contains no reported errors for velocity because implosion velocity is not measured directly, but rather inferred via surrogate experiments. Following expert recommendation, we use a single fixed uncertainty (in $\mu$m/ns) for all reported values of velocity, based on analysis of the method for inferring implosion velocity (see footnote 1). Improved quantification of measurement uncertainty would improve the assessment of quality of fit. For all four outputs, we incorporate these uncertainty values into our analysis to provide an indication of whether our model’s predictions fall within experimental uncertainty bounds.

Footnote 1: Velocity error is derived from error in the original surrogate convergent ablator experiments and from error in the gated X-ray bang time measurements. The velocity of a DT layered implosion is inferred via a surrogate convergent ablator that uses X-ray radiography to observe capsule radius as a function of time. Using this information along with measured values of X-ray bang time, the velocity of the DT layered implosion can be estimated using a combination of simulations and physics arguments. Using this method, errors in the velocity measurements are principally composed of error in convergent ablator measurement and X-ray bang time measurements, and are typically between 10-15 $\mu$m/ns.

Finally, the data convolves sensitivities to physics mechanisms (such as laser-plasma instabilities, asymmetry, and hydrodynamic instability growth) with high-impact systematic design changes (such as hohlraum design, capsule fill, and laser wavelength tuning). As will be discussed later, ML and data analysis methods will identify the most dominant features, whether physics-driven or design-driven.

Fig. 1: Fusion yield and key design changes for each of the 141 shots carried out at the NIF beginning in 2011. Courtesy of Sean Finnegan and Los Alamos National Laboratory.

III-A Correlated Variables

The matrix in Fig. 2 quantifies correlation between different design parameters, confirming that there are several highly correlated input variables. Chief among these are the three time-based parameters (start final rise, start peak power, and end pulse) as well as parameters describing hohlraum dimensions (Dante 1 diameter, hohlraum length, and hohlraum diameter). Experts informed us that the time-based parameters are likely to be correlated with ablator thickness as a design choice, since thicker ablators generally require a longer push at peak power; as a result, experimenters typically select later times for start final rise and end pulse for shots with thicker ablators. Furthermore, hohlraum dimension parameters are highly correlated with one another due to the limited number of hohlraum designs used at NIF over the time period in which the experiments were performed.

Fig. 2: Matrix representing correlations between input variables. Red represents absolute positive correlation, and blue represents absolute negative correlation. There is a strong correlation between time-based variables (start final rise, start peak power, end pulse) and ablator thickness as well as between hohlraum dimensions (hohlraum length, hohlraum diameter, and Dante 1 diameter). Best viewed in color.

Correlated parameters can “share” importance in a way that falsely skews importance rankings. For example, if two variables $x_1$ and $x_2$ both have a strong effect on the model’s decisions but are highly correlated, any importance metric (including ALE) will divide the importance between them, resulting in relatively low importance rankings for both $x_1$ and $x_2$ even though the true experimental impact of both parameters may be much higher. Following expert recommendation to account for such correlations, the following five input variables were removed from our dataset: start final rise, start peak power, end pulse, Dante 1 diameter, and hohlraum diameter. Ablator thickness and hohlraum length were retained. A more rigorous assessment of correlated parameters and ML analysis to inform physical relationships is reserved for future work (see Section VI).
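A correlation screen of the kind summarized in Fig. 2 can be sketched as follows. The authors removed variables on expert recommendation rather than by an automatic rule, so the 0.9 threshold, the helper name `correlated_pairs`, the "drop the second member of each pair" rule, and the synthetic columns (named after inputs in the text, with made-up values) are all assumptions for illustration.

```python
import numpy as np

def correlated_pairs(X, names, threshold=0.9):
    """Return (name_a, name_b, r) for every feature pair with |Pearson r| >= threshold."""
    corr = np.corrcoef(X, rowvar=False)  # columns are treated as variables
    pairs = []
    for a in range(len(names)):
        for b in range(a + 1, len(names)):
            if abs(corr[a, b]) >= threshold:
                pairs.append((names[a], names[b], float(corr[a, b])))
    return pairs

rng = np.random.default_rng(0)
names = ["ablator_thickness", "end_pulse", "hohlraum_length", "picket_power"]
ablator = rng.normal(size=140)
X = np.column_stack([
    ablator,
    ablator + 0.1 * rng.normal(size=140),  # thicker ablator -> later end of pulse
    rng.normal(size=140),
    rng.normal(size=140),
])

pairs = correlated_pairs(X, names)
# Keep the first member of each flagged pair and drop the second.
drop = {b for _, b, _ in pairs}
kept = [n for n in names if n not in drop]
```

In this toy setup only the ablator thickness and end-of-pulse columns are flagged, mirroring the choice in the text to keep ablator thickness and remove the time-based parameters.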

IV Results

IV-A Prediction Quality

Like Hsu et al. [4], we use Mean Absolute Error (MAE), $R^2$, and Explained Variance (ExVar) to evaluate model performance. We substitute Root Mean Square Error (RMSE) for Mean Square Error (MSE) since RMSE has the same units as MAE. Model performance results on all four output variables are summarized in Table I. Fig. 3 shows aggregated train-test results for total yield, velocity, $\rho R$, and BT. (Note that for total yield, the model was trained and predicted on a log scale, but points are plotted here at their original scale; see footnote 2.) Prediction quality is high across the board, achieving $R^2$ values close to 1 on training data and in the 0.7-0.9 range on test data. Interestingly, the model’s predictive quality is particularly high when predicting BT. As an output, BT closely reflects a series of key design changes at the NIF (see Fig. 1). The original low-foot designs had bang times of approximately 20 ns, while the newer high-foot and high-density carbon (HDC) designs have bang times of approximately 12-14 ns and 8 ns, respectively. For each key design change, yield and implosion velocity have increased while BT has decreased. However, this correlation does not fully explain why the model is able to make such accurate predictions on BT in particular.

Footnote 2: This is done because, unlike other outputs, yield varies across multiple orders of magnitude. Since ALE is a variance-based estimate of feature importance, it overestimates the importance of features that distinguish between orders of magnitude in output. Running the RF with yield on a log scale mitigates this effect.
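The evaluation metrics and the log-scale treatment of yield can be sketched as below. The metric formulas are the standard definitions; the yield values and prediction offsets are invented for illustration, and whether the authors computed the yield metrics in log space or on the original scale is not stated, so computing them in log space here is an assumption.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """MAE, RMSE, R^2 and explained variance for one set of predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    resid = y_true - y_pred
    mae = np.mean(np.abs(resid))
    rmse = np.sqrt(np.mean(resid**2))  # same units as MAE, unlike MSE
    r2 = 1.0 - np.sum(resid**2) / np.sum((y_true - y_true.mean()) ** 2)
    # Explained variance ignores a constant bias in the residuals, so it is >= R^2.
    exvar = 1.0 - np.var(resid) / np.var(y_true)
    return {"MAE": mae, "RMSE": rmse, "R2": r2, "ExVar": exvar}

# Hypothetical yields spanning several orders of magnitude.
yield_true = np.array([1e13, 5e13, 2e14, 1e15, 8e15])
# A model trained on log10(yield) predicts in log space ...
log_pred = np.log10(yield_true) + np.array([0.1, -0.05, 0.0, 0.08, -0.1])
# ... and predictions are mapped back to the original scale for plotting.
yield_pred = 10.0 ** log_pred

metrics_log = regression_metrics(np.log10(yield_true), log_pred)
```

Working in log space keeps the few highest-yield shots from dominating the error terms, which is the same motivation the text gives for the ALE analysis.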

The model systematically under-predicts for high experimental values and over-predicts for low ones. This effect may be due to a relative lack of these low and high points in the dataset, as RFs are poor at extrapolating trends for data that they haven’t seen during training. However, the bias is visible in the training data as well as the test data, suggesting that the given feature space may lack key design features (such as capsule surface quality or mixing between the pusher and the hot and cold fuels) needed to distinguish medium values of yield, velocity, etc. from very high or low ones.

The ratio of model error to experimental uncertainty, calculated as $|\hat{y}_i - y_i| / \sigma_i$, where $\hat{y}_i$ are model predictions, $y_i$ are observed experimental values, and $\sigma_i$ are reported experimental uncertainties for $y_i$, is shown in Fig. 4. The low percentage of points that fall within experimental uncertainty for total yield and BT, despite high predictive performance on these values (BT in particular), suggests that the experimental errors reported in the data for total yield and for BT may be overly conservative. For $\rho R$ and velocity, the number of points that fall below the line is very high because the reported uncertainties for velocity and $\rho R$ are larger than those reported for total yield. (Velocity and $\rho R$ have average reported experimental percent errors of and , respectively, while the average reported experimental percent errors for total yield and BT are and .)
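A minimal sketch of the error-to-uncertainty comparison follows, assuming the ratio is the absolute prediction error divided by the reported one-sigma uncertainty and that a prediction counts as within uncertainty when the ratio is at most 1. The helper name and the bang-time numbers are hypothetical.

```python
import numpy as np

def error_to_uncertainty(y_pred, y_obs, sigma):
    """Ratio |prediction - observation| / sigma for each shot."""
    y_pred, y_obs, sigma = (np.asarray(a, dtype=float) for a in (y_pred, y_obs, sigma))
    return np.abs(y_pred - y_obs) / sigma

# Hypothetical bang times (ns) with reported one-sigma uncertainties.
bt_obs = np.array([20.1, 13.0, 12.4, 8.2])
bt_sigma = np.array([0.05, 0.10, 0.05, 0.04])
bt_pred = np.array([20.0, 13.05, 12.6, 8.21])

ratio = error_to_uncertainty(bt_pred, bt_obs, bt_sigma)
# Fraction of predictions that fall within the reported uncertainty.
frac_within = np.mean(ratio <= 1.0)
```

With tight reported uncertainties, even a small absolute error pushes the ratio above 1, which is the situation the text describes for total yield and BT.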

Fig. 3: Prediction quality for four experimental outputs: total yield, velocity, $\rho R$, and BT. Each figure displays aggregated results from ten different RF models and random 80-20 train-test splits. Perfect predictions lie along the black line. Horizontal error bars are based on the experimental uncertainty $\sigma_i$ reported for each value. Best viewed in color.
Fig. 4: Ratio of model error to experimental uncertainty for four experimental outputs: total yield, velocity, $\rho R$, and BT. Model predictions that fall within experimental uncertainty bounds fall along or below the black line. Best viewed in color.
TABLE I: Model results for four output variables.

IV-B Importance Results

Fig. 5: ALE importances for total yield, velocity, $\rho R$, and BT. The graph bars show importances aggregated by input variable. Best viewed in color.
Fig. 6: ALE importances for total yield, velocity, $\rho R$, and BT. The graph bars show importances aggregated by output variable. Best viewed in color.

We aggregate ALE importance rankings for all outputs by input variable (Fig. 5) and by output variable (Fig. 6). Fig. 6 shows that importance rankings between total yield and velocity are highly correlated. High velocity is typically the product of greater kinetic energy in the implosion piston. As the capsule implodes, this energy is deposited into the fuel, creating higher fuel temperature and greater overall total yield (where yield scales as ). In Fig. 6, we see this correlation in the importance rankings for total yield and velocity, both of which show significant effects from the wavelength difference $\Delta\lambda$, LEH laser energy, and hohlraum length, among other variables.

Likewise, $\rho R$ and BT show correlated importance rankings: both are strongly influenced by cryo layer thickness, trough power, and trough cone fraction. The correlation between $\rho R$ and BT is likely due to the fact that the original low-foot ICF designs, which used high gas fill hohlraums and long laser pulses (see Fig. 1), had the largest values of both $\rho R$ and BT. As ICF design shifted toward shorter laser pulse shapes, values of both $\rho R$ and BT decreased. We hypothesize that the high importance of trough cone fraction in predicting both $\rho R$ and BT is due to the fact that trough cone fraction is highly correlated with hohlraum gas fill density (high gas fill shots generally have a longer trough), and gas fill density is an important predictor of implosion performance.

As shown in Fig. 5, the input variables with the greatest total importance across all outputs are trough cone fraction, picket power, trough power, the wavelength difference $\Delta\lambda$, LEH laser energy, and cryo layer thickness. The high importance of picket power reflects the fact that picket power is crucial for controlling capsule stability during high-speed implosions. Increasing the velocity of implosions allows for greater energy concentration in the hot spot, thus improving performance and yield [15]. However, high-speed implosions typically experience greater instabilities at the ablator surface. Such instabilities, when large enough, reach the hot spot and interfere with neutron reactivity. Increasing the picket power helps reduce such instabilities and prevent them from reaching the hot spot, allowing implosions to be driven stably at higher velocities and thus increasing yield. It is therefore unsurprising that picket power has such high overall importance, particularly in predicting total yield.

The model does not assign high importance to picket power when predicting on velocity. This finding is consistent with the fact that implosion velocity does not directly depend on picket power. Implosion velocity is primarily determined from the rocket equation , where is implosion velocity, is the ablation pressure, is the initial mass of the ablator, and is the final mass at peak velocity. However, although velocity does not depend directly on picket power, it is indirectly correlated with picket power because of the picket’s role in controlling implosion stability. Picket power is used to set fuel adiabat by sending small shocks into the ablator, which makes the fuel less compressible but also reduces the effects of hydrodynamic instabilities that can interfere with implosion performance. The reduced fuel compressibility in high picket power/high adiabat implosions may have some impact on implosion velocity (capsules with less compressible fuel will generally implode more slowly); however, this effect can, in principle, be mitigated by increasing the laser power. Conversely, implosions that have lower values of picket power are more likely to be unstable, which can also result in higher values of and smaller values of . As such, picket power does have some physical effect on implosion velocity; however, velocity depends principally on other factors such as ablator mass.
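For orientation, one commonly quoted textbook form of the rocket equation for an ablatively driven shell is given below. It is not necessarily the exact expression the authors used. All symbol names here are assumptions, including the mass ablation rate per unit area $\dot{m}$, which the paragraph above does not mention.

```latex
v_{\mathrm{imp}} = \frac{P_a}{\dot{m}} \, \ln\!\left(\frac{m_0}{m_f}\right)
```

Here $v_{\mathrm{imp}}$ is the implosion velocity, $P_a$ the ablation pressure, $m_0$ the initial ablator mass, and $m_f$ the mass remaining at peak velocity; the ratio $P_a/\dot{m}$ plays the role of the exhaust velocity. Picket power does not appear explicitly in this form, which is in line with the low importance the model assigns to it when predicting velocity.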

The high overall importance of the wavelength difference $\Delta\lambda$ is likely due to the implosion shape of the first 70 shots (group I). The high density hohlraum gas fill present in these shots causes laser plasma instabilities that make the implosion shape hard to control, causing some of the laser light to scatter back out of the hohlraum and thus reducing implosion yield. The wavelength difference $\Delta\lambda$ is used to control the symmetry of high gas fill implosions by controlling the transfer of energy from the outer cone beams to the inner cone beams; however, it can also drive greater backscatter from the inner beams, leading to less overall coupled energy to the target and worse implosion performance overall. The variation in implosion stability, symmetry, and laser backscatter from shot to shot may therefore increase the importance of $\Delta\lambda$ when predicting on high gas fill shots, a phenomenon that is not present for the low gas fill shots. Similarly, we hypothesize that the high overall importance of trough cone fraction may be due to the correlation between trough cone fraction and hohlraum gas fill, as high gas fill shots generally have a longer trough. We further analyze the discrepancy in feature importance results between high and low gas fill shots in Section V-B.

V Individual Analysis of Group I and II Shots

V-A Prediction Quality

Fig. 7 displays aggregated train-test results for high (group I) and low (group II) density shots across all four output variables. Model performance results on all four output variables are summarized in Table II. Again, model performance is generally high, with $R^2$ values close to 1 on training data and in the range of 0.7 to 0.9 on test data. Prediction quality for total yield and $\rho R$ is slightly higher for the low gas fill shots, while prediction quality for velocity is slightly higher for the high gas fill shots; however, these differences are extremely slight, and model performance overall is near-equal on both groups. For BT, model performance is worse when predicting on groups I and II individually than when predicting on the dataset as a whole, although prediction quality is still high across the board. As when both groups are analyzed together, the model tends to over-predict low values and under-predict high values for all four outputs studied. This pattern is present across training and test data for both low and high density shot groups.

Fig. 7: RF prediction quality for high (group I) and low (group II) density shots. Each figure displays aggregated results from ten random 80-20 train-test splits. Perfect predictions lie along the black line. Horizontal error bars are based on the experimental uncertainty $\sigma_i$ reported for each value. Best viewed in color.

Fig. 8 displays model error-uncertainty ratio for total yield, $\rho R$, and BT on training and test data for low and high density shots. Again, the majority of data points for total yield lie above the black line (model error larger than the reported uncertainty) and the majority of data points for $\rho R$ lie below it. For BT, training data points are split relatively evenly above and below the line, while a majority of test data points lie above it. For total yield and $\rho R$, the low gas fill data tends to have slightly more points below the line than does the high gas fill data, while the opposite is true of velocity. This is consistent with the fact that the model is better on low gas fill data for total yield and $\rho R$ and better on high gas fill data for velocity, although the difference in performance between the two groups is very small.

Fig. 8: Ratio of model error to experimental uncertainty for high and low gas fill density shots. Model predictions that fall within experimental uncertainty bounds fall along or below the black line. Best viewed in color.
TABLE II: Model results for high and low gas fill shots.

V-B Importance Results

Fig. 9 shows importance results for high and low density shots, aggregated by input variable, while Fig. 10 shows the same results aggregated by output variable. Both figures show significant differences in variable importance rankings between the two shot groups. From Fig. 9, we see that LEH laser energy and trough power are of high importance to both groups, but that each group is otherwise principally affected by a very different set of inputs.

Fig. 9: ALE importances for group I (top) and group II (bottom). The graph bars show importances aggregated by input variable. Best viewed in color.

Apart from LEH laser energy and trough power, the most important inputs overall for the high gas fill shots are the wavelength difference $\Delta\lambda$, picket cone fraction, number of pulse steps, picket power, and toe length. In contrast, the low gas fill shots are principally affected by LEH peak power, hohlraum length, trough cone fraction, ablator thickness, and hohlraum gas fill. Notably, $\Delta\lambda$ drops from being the second-most important predictor of high gas fill shots to zero importance for the low gas fill shots. This result is consistent with the fact that for high gas fill shots, the wavelength difference varies greatly between shots due to laser plasma instabilities caused by the gas fill. When low density hohlraum gas fill is used, these instabilities are reduced, making $\Delta\lambda$ more consistent between shots and thus reducing its predictive importance.

Trough cone fraction, the most important variable when predicting on the dataset as a whole, drops in importance for group I, but remains a significant predictor of group II shot performance. This is likely because trough cone fraction has a much stronger effect on pulse shape for low gas fill shots, as the trough cone fraction determines how much power can pass by the waist of the capsule before the hohlraum expands to block the lasers from reaching that region (the principal function of the gas fill in early hohlraum designs was to prevent the hohlraum from expanding and blocking the lasers in this manner).

Picket power, the second-most important variable when making predictions on the dataset as a whole, also drops in importance for both groups individually, particularly for group II. This may be due to the fact that the shift toward low gas fill hohlraums was accompanied by a shift toward more stable big-foot implosions, potentially reducing the importance of picket power in setting the fuel adiabat. The largest change in picket power occurred between the low-foot and high-foot campaigns, which both used high gas fills; for this reason, the importance of picket power is higher for group I than for group II. Although there was also a shift in picket power between the HDC and big-foot campaigns (both of which used low gas fills), it was not as significant, resulting in a lower importance ranking for picket power among the low gas fill shots.

We note in Fig. 9 that LEH peak power is ranked as notably more important for the low gas fill experiments than for the high gas fill experiments. Potentially related, LEH laser energy is of notably higher importance for the high gas fill shots than for the low gas fill shots. One possible explanation is that LEH peak power and LEH laser energy are strongly correlated experimentally. As such, the RF predictor can use either of them to capture the same relationship with the yield. Extending the approach in this paper to account for these design-related correlations will be investigated in future work.

From Fig. 10, we see that for both high and low density shots, the importance rankings for total yield and velocity are still correlated to some extent, particularly for the low density shots. For the high gas fill shots, importance rankings between the yield outputs and velocity are similar except for the fact that the yield outputs are affected by a greater number of inputs, while velocity is dominated by LEH laser energy and $\Delta\lambda$. The strong effect of $\Delta\lambda$ on yield outputs and velocity disappears for the low gas fill shots. For low gas fill shots, yield and velocity are mainly affected by LEH peak power, LEH laser energy, hohlraum length, and trough cone fraction. Total yield also shows a lesser, but still significant, effect from picket power, while velocity is strongly affected by ablator thickness and capsule or cryo layer thickness.

Fig. 10: ALE importances for group I (left) and group II (right). The graph bars show importances aggregated by output variable. Best viewed in color.

Although the importance rankings for $\rho R$ and BT are highly correlated when high and low density shots are analyzed together, this correlation disappears when the two shot groups are analyzed separately. For the low density shots in particular, the importance rankings for $\rho R$ are more heavily correlated with total yield and velocity than they are with BT. As discussed in Section IV-B, the correlation of $\rho R$ and BT when the dataset is analyzed as a whole is likely due to the correlation of both variables with hohlraum gas fill density. When the data is pre-split by gas fill density, the model may no longer be able to detect this relationship in the data.

VI Discussion and Future Work

At the outset, we intended to identify the physics mechanisms driving ICF implosion dynamics using RFs. However, the sensitivity of the data to physics mechanisms is strongly confounded with the impact of design changes. Due to the nature of experimental design at the NIF, design evolution followed the design changes with the most significant impact. As such, data outputs followed the same design change evolution. We determined that a purely data-driven assessment of the importance of design quantities does not allow discrimination between important physical mechanisms that are confounded with design changes. As shown in the paper, assessing the importance of design changes is still beneficial. However, if it is desired to better understand the dominant physics mechanisms in an experiment, the confounding due to the impact of design changes must be considered. This can be done through analysis of the statistical design of experiments and by incorporating physical knowledge in causal inference. Other directions for future work include using ML to perform a deeper analysis of relationships between correlated design inputs, to analyze discrepancy between simulations and experiments, and to predict optimal design configurations for ICF simulations.

VII Conclusion

Although its use in the field is relatively new, ML provides a promising method of ICF analysis. In this work, we show that RFs are able to learn and predict on data from ICF experiments with high accuracy, achieving $R^2$ scores of 0.9+ on training data and 0.7+ on unseen test data. The model’s performance on data from high and low gas fill density shots does not differ significantly from its performance on the dataset analyzed as a whole. The model’s predictions show some bias toward the mean across all outputs and shot groups studied, suggesting that there may be factors missing from the input feature space that affect experimental results.

Many of the feature importance results detected by our model are consistent with known physics and are reflective of key design changes that took place between experiments. The model’s ability to detect shifts in hohlraum gas fill density, capsule design, and other significant changes that took place between the high-foot, low-foot, HDC, and big-foot campaigns indicates that it is able to accurately identify which design inputs exert the greatest influence on experimental outputs, providing importance results that are consistent with the physics of ICF implosions. Such ML-based importance results may provide greater insight into input-output relationships as well as the effects of key ICF experimental design changes on outputs of interest, potentially informing the design of future ICF experiments and simulations.

VIII Acknowledgements

This work was supported by the U.S. Department of Energy through the Los Alamos National Laboratory. Los Alamos National Laboratory is operated by Triad National Security, LLC, for the National Nuclear Security Administration of U.S. Department of Energy (Contract No. 89233218CNA000001). We would like to thank Dr. Otto Landen for the use of his NIF experimental database as the source for this work. Approved for public release under LA-UR-20-27991.