Create a surrogate neural network for regression of nuclear reactor power distribution
There is renewed interest in developing small modular reactors and micro-reactors. Innovation is necessary in both the construction and operation methods of these reactors for them to be financially attractive. For operation, an area of interest is the development of fully autonomous reactor control. Significant efforts are necessary to demonstrate an autonomous control framework for a nuclear system while adhering to established safety criteria. Our group has proposed and received support for the demonstration of an autonomous framework on a subcritical system: the MIT Graphite Exponential Pile. In order to have a fast response (on the order of milliseconds), we must extract specific capabilities of general-purpose system codes into a surrogate model. Thus, we have adopted current state-of-the-art neural network libraries to build surrogate models. This work focuses on establishing the capability of neural networks to provide an accurate and precise multi-dimensional regression of a nuclear reactor's power distribution. We assess a neural network surrogate against a previously validated model: an MCNP5 model of the MIT reactor. The results indicate that neural networks are an appropriate choice for surrogate models to implement in an autonomous reactor control framework. The MAPE across all test datasets was < 1.16 % with a corresponding standard deviation of < 0.77 %, for node-wise fission powers ranging from 7 kW to 30 kW across the core.
There is renewed interest in developing small modular reactors and micro-reactors. Innovation is necessary in both the construction and operation methods of these reactors for them to be financially attractive. For construction, methods such as additive manufacturing are under active development. For operation, an area of interest is the development of fully autonomous reactor control. Significant efforts are necessary to demonstrate an autonomous control framework for a nuclear system while adhering to established safety criteria. For the critical reactors of interest, the latter precludes implementing such a framework. Therefore, our group has proposed and received support for the demonstration of an autonomous framework on a subcritical system: the MIT Graphite Exponential Pile (MGEP).
The autonomous system under development aims to incorporate a surrogate model. Why is a surrogate model necessary? The preliminary system layout is presented in Fig. 1. There are two areas where a model is necessary: determining the extent of the system perturbation, and determining the appropriate response given a particular objective (e.g., a symmetric flux distribution). In order to have a fast response (on the order of milliseconds), we must extract specific capabilities of general-purpose system codes into a surrogate model. Thus, we have adopted current state-of-the-art neural network libraries to build surrogate models.
Previous work on the application of Neural Networks (NNs) to nuclear engineering problems is summarized here. Several authors have focused on the identification of transients [5, 6, 7] such as LOCA, control rod ejection, and total loss of off-site power. Others have determined the optimal fuel loading pattern, with the objective of flattening the flux or achieving a particular burnup. A majority of the work has focused on providing a point-parameter regression: to determine the thermal power; to predict DNBR using NNs [11, 12] and hybrid techniques; or to predict the maximum power. Only a single study was found that considered a multi-dimensional regression problem; it used a NN to predict the transient 3-D power distribution of a theoretical homogeneous cubic reactor.
The literature review indicates that there has been no work on providing a multi-dimensional regression of a realistic nuclear facility. Towards achieving our goal of demonstrating autonomous control in the MGEP, we first assess a neural network surrogate against a well-established model. Thus, this study focuses on providing a NN surrogate of the MIT reactor (MITR). The MCNP5 MITR model used in this study has been thoroughly validated. The cross-sectional geometry of the MITR is presented in Fig. 6. There are 27 total positions, which are filled with fuel elements, aluminum dummies, or experiments. In this study, a 22-element core is modeled, with the other positions occupied by dummies. The result will be a surrogate model that accepts a control rod (shim blade) position vector and provides a full-core power distribution.
Our work uses a neural network as a surrogate model. Neural networks provide several advantages over traditional machine learning algorithms (SVM, Random Forest, etc.). They are under active development, and the underlying algorithms are continuously optimized for deployment on various computing architectures. There are powerful open-source libraries that abstract the development process and allow rapid deployment. Additionally, the architecture of a neural network can be modified to address varying problems (regression of the power distribution or, inversely, regression of the control rod position).
NPSN is the package developed to support this work and is available online at github.com/a-jd/npsn. The major components of NPSN are summarized in Fig. 2. Dataset preparation involved generating multiple permutations of the shim blade heights: Latin hypercube sampling of the 6 heights was used to generate 151 permutations. For each permutation, an input deck for MCNP5 v1.60 is generated and executed. The output from MCNP5 is post-processed to generate power distributions for all 22 elements, with each element further discretized into 16 axial nodes. Therefore, the input of the NN will be a vector of size 6, and the output will be a matrix of size 22 × 16.
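The sampling step can be sketched as follows with a minimal pure-Python Latin hypercube sampler; the 0–100 % height range and the seed are illustrative assumptions, not values taken from NPSN:

```python
import random

def latin_hypercube(n_samples, n_dims, low, high, seed=0):
    """Basic Latin hypercube sampling: stratify [low, high] into
    n_samples equal-width bins per dimension, draw one point per bin,
    and shuffle the bin order independently for each dimension."""
    rng = random.Random(seed)
    columns = []
    for _ in range(n_dims):
        col = [(i + rng.random()) / n_samples for i in range(n_samples)]
        rng.shuffle(col)
        columns.append([low + c * (high - low) for c in col])
    return [[columns[d][i] for d in range(n_dims)] for i in range(n_samples)]

# 151 permutations of the 6 shim blade heights
# (a 0-100 % withdrawal range is assumed here for illustration)
heights = latin_hypercube(151, 6, 0.0, 100.0)
```

Each of the 151 rows then seeds one MCNP5 input deck; the stratification guarantees the full height range of every blade is covered without requiring a dense factorial grid.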
The NN architecture is dependent on the type of problem and the data structure – there is no precise prescription for the structure. The general structure implemented is presented in Fig. 2. The Dense layer (details on layer functions are available at https://keras.io) is used for all intermediate connections between the input and the output. There is an Intermediate Dense Layer (IDL), which consists of a variable quantity of Dense layers with a variable shape. To arrive at an optimal configuration systematically, we have implemented a meta-learning procedure: the structure and hyperparameters of the neural network are optimized based on a Tree of Parzen Estimators (TPE) approach.
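The input/output structure described above can be illustrated with a small pure-Python forward pass; this is a sketch only, not NPSN's actual architecture — the layer count of two IDLs and the width of 64 are placeholder choices:

```python
import math
import random

def dense(x, w, b, act=None):
    """One fully connected layer: y = W x + b, with optional ReLU."""
    y = [sum(wij * xj for wij, xj in zip(row, x)) + bi
         for row, bi in zip(w, b)]
    if act == "relu":
        y = [max(0.0, v) for v in y]
    return y

def init_layer(n_out, n_in, rng):
    """Random weights scaled by 1/sqrt(n_in), zero biases."""
    w = [[rng.gauss(0.0, 1.0 / math.sqrt(n_in)) for _ in range(n_in)]
         for _ in range(n_out)]
    return w, [0.0] * n_out

rng = random.Random(0)
# 6 shim blade heights in -> two IDLs (placeholder width 64) -> 22*16 powers out
layers = [init_layer(64, 6, rng), init_layer(64, 64, rng),
          init_layer(22 * 16, 64, rng)]

def forward(heights):
    x = heights
    for i, (w, b) in enumerate(layers):
        x = dense(x, w, b, act="relu" if i < len(layers) - 1 else None)
    # reshape the flat 352-vector into a 22-element x 16-axial-node map
    return [x[e * 16:(e + 1) * 16] for e in range(22)]

power_map = forward([50.0] * 6)
```

The untrained output is of course meaningless; the point is the shape of the mapping — a 6-entry blade-height vector in, a 22 × 16 power map out.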
Post-processing is an important step in determining the viability of the model. This is achieved by evaluating the error in providing a regression of “unseen” input datasets (known as test data). If the error of the test datasets matches the error in evaluating training data, the model has good generalization and can be expected to perform accurately for novel inputs. The error is defined as the mean absolute percentage error (MAPE),
\varepsilon = \frac{100\%}{N} \sum_{i,j,k} \frac{|\hat{P}_{i,j,k} - P_{i,j,k}|}{P_{i,j,k}},

where P̂_{i,j,k} is the predicted power and P_{i,j,k} is the MCNP5 power. The subscript i represents the element node, j the axial node, and k a particular permutation of the shim blade heights; N_p is the total number of permutations. The total number of summations, N, depends on the averaging mode: for element-wise error ε_i, the sums run over j and k only; for core-wise error, over i, j, and k. Additional summations can lead to, e.g., the total error ε_T.
In addition to accuracy, we also test for the precision of the model. Precision is an important consideration, as large variances in model outcome could lead to an unstable system. It is quantified by the standard deviation of the error amongst the N_p test datasets,

\sigma_\varepsilon = \sqrt{\frac{1}{N_p - 1} \sum_{k=1}^{N_p} \left(\varepsilon_k - \bar{\varepsilon}\right)^2},

where ε_k is the error before averaging over the permutations k, and ε̄ is the mean error.
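A minimal numerical sketch of these two metrics, assuming power maps stored as 22 × 16 nested lists:

```python
def mape(pred, ref):
    """MAPE (in %) over all element/axial nodes of one permutation;
    pred and ref are nested lists of node-wise powers (e.g. 22 x 16)."""
    terms = [abs(p - r) / r
             for prow, rrow in zip(pred, ref)
             for p, r in zip(prow, rrow)]
    return 100.0 * sum(terms) / len(terms)

def error_std(per_perm_errors):
    """Sample standard deviation of the per-permutation errors,
    quantifying the precision of the surrogate."""
    n = len(per_perm_errors)
    mean = sum(per_perm_errors) / n
    return (sum((e - mean) ** 2 for e in per_perm_errors) / (n - 1)) ** 0.5
```

Restricting the inner loops to a single element's 16 axial nodes yields the element-wise error; averaging the per-permutation values and taking their spread yields the accuracy and precision figures reported below.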
The optimization process provides a useful guideline for the selection of hyperparameters. In this work, the optimization process involved 500 iterations, presented in Fig. 3. There are some clear patterns that lead to a more successful model. A greater quantity or larger shape of IDLs does not translate to a better model; thus, an excessively large model is detrimental to performance. There is a benefit in using the logcosh loss function when evaluating the neural network, and in using the ReLU IDL activation function. There is a significant benefit in using the adam optimizer during training. On the opposite end of the spectrum, apart from the network loss function, there are no patterns that guarantee a poor model. Therefore, a systematic optimization process is vital in building a neural network.
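NPSN delegates this search to a TPE optimizer; as a simplified stand-in, a plain random search over a comparable space illustrates the mechanics. The hyperparameter names, ranges, and synthetic objective below are illustrative assumptions, not NPSN's actual search space:

```python
import random

rng = random.Random(0)

# Illustrative search space (names/ranges assumed, not NPSN's actual space)
space = {
    "n_idl":      [1, 2, 3, 4],              # quantity of IDLs
    "idl_width":  [32, 64, 128, 256],        # shape of each IDL
    "activation": ["relu", "tanh", "sigmoid"],
    "loss":       ["logcosh", "mse", "mae"],
    "optimizer":  ["adam", "sgd", "rmsprop"],
}

def objective(cfg):
    """Placeholder for 'train the NN and return its test-set MAPE'.
    A synthetic score that merely prefers logcosh/relu/adam stands in."""
    score = rng.random()
    for good in ("logcosh", "relu", "adam"):
        if good in cfg.values():
            score -= 0.2
    return score

best_cfg, best_score = None, float("inf")
for _ in range(500):  # 500 iterations, as in the study
    cfg = {k: rng.choice(v) for k, v in space.items()}
    s = objective(cfg)
    if s < best_score:
        best_cfg, best_score = cfg, s
```

A TPE optimizer replaces the uniform `rng.choice` draws with proposals biased toward configurations that previously scored well, which is what makes the 500-iteration budget effective.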
This section details the performance metrics of the optimized model. We are interested in arriving at a model that generalizes well, while minimizing error and noise. The model had a less than 0.1 % difference in test vs. training error, presented in Fig. 4. Therefore, we can expect the model to provide a good regression of unseen input data. However, this assumption is invalid if the input range exceeds that of the training dataset (i.e., if a shim blade is in an anomalous position).
The spatial distribution of the error and its standard deviation is discussed next. The values of the element-wise error and its standard deviation are presented in Fig. 5. There is a coherent feature in the spatial distribution of both parameters: the geometric area towards the center of the core and the C-ring elements have larger values of both. The C-ring is the outermost ring of the MITR core, and the shim blades are inserted towards the outer region of the core. Furthermore, the tips of the blades nominally lie towards the center of the core (i.e., at the centroid of the sampled heights). Therefore, the larger error and standard deviation correspond to spatial locations which experience the greatest perturbations from shim blade movement. This outcome is reasonable, as we would expect spatial regions where the power distribution is relatively static with respect to shim blade movement to be predicted with far greater accuracy. The spatial dependence of the error is demonstrated in Fig. 6.
The magnitude of the error and its standard deviation is discussed next. The error ranges from 0.10 % to 1.16 % over the 31 test datasets. The corresponding standard deviation ranges from 0.06 % to 0.77 %. Since this is a first-of-its-kind study, there is no available literature to compare against. However, the maximum error plus the maximum standard deviation (1.93 %) falls below the experimental uncertainty of the neutron detectors we will use. Therefore, our study shows that a NN is a viable surrogate to use in conjunction with experimental data.
As the training data (generated by MCNP5) is limited by computational resources, it is interesting to determine whether sufficient data has been generated. The error and standard deviation vs. the number of training sets are presented in Fig. 7. Initially, the error and standard deviation decrease with respect to the training dataset size; after the training dataset size exceeds 80, the improvement saturates. Thus, our work shows that approximately 100 datasets are necessary to appropriately train and test a surrogate neural network.
Lastly, the computational runtime to provide a regression is highlighted. During the MCNP5 data generation process, 151 datasets were generated; each dataset required substantial runtime on a 32-core processor to achieve satisfactory statistics. In contrast, the time required to provide a regression using the NN surrogate is orders of magnitude smaller (using a single NVIDIA TITAN RTX GPU). The runtime could be reduced further by optimizing the NN compilation using TensorRT. Therefore, the surrogate model will not be the limiting component in the overall system response.
This work focuses on establishing the capability of neural networks to provide an accurate and precise multi-dimensional regression of a nuclear reactor's power distribution. The results indicate that neural networks are an appropriate choice for surrogate models to implement in an autonomous reactor control framework. The MAPE across all test datasets was at most 1.16 % with a corresponding standard deviation of 0.77 %. The error is low, considering that the node-wise fission power can vary from 7 kW to 30 kW across the core. This work also provides guidance on best practices for network architecture, hyperparameter selection, and dataset size.
The code used in this work is available online as an open-source Python package, NPSN. The package is written to abstract the processes of importing and pre-conditioning data, optimizing the neural network architecture, and post-processing. An example of the code syntax for the end user:
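The example itself did not survive extraction. A hypothetical sketch of what such end-user syntax might look like is given below; the function names and arguments are illustrative only, and the repository at github.com/a-jd/npsn should be consulted for the actual API:

```python
# Hypothetical sketch -- not NPSN's documented API; see the repository
import npsn

# import and pre-condition the MCNP5-generated dataset, optimize the
# architecture, then query the trained surrogate with blade heights
npsn.train('project_name', data_dir='path/to/mcnp5/output')
model = npsn.load('project_name')
power_map = model.predict(blade_heights)  # 22 x 16 power distribution
```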
This work is supported by DOE NEUP Award Number: DE-NE0008872.
G. C. LEE and S. H. CHANG, "Radial basis function networks applied to DNBR calculation in digital core protection systems," Annals of Nuclear Energy, 30, 15, 1561–1572 (Oct 2003).
F. CHOLLET ET AL., "Keras," https://keras.io/ (2015).