Deep Surrogate Models for Multi-dimensional Regression of Reactor Power

07/10/2020 ∙ by Akshay J. Dave, et al. ∙ MIT

There is renewed interest in developing small modular reactors and micro-reactors. Innovation is necessary in both the construction and operation methods of these reactors for them to be financially attractive. For operation, an area of interest is the development of fully autonomous reactor control. Significant efforts are necessary to demonstrate an autonomous control framework for a nuclear system, while adhering to established safety criteria. Our group has proposed and received support for demonstration of an autonomous framework on a subcritical system: the MIT Graphite Exponential Pile. In order to have a fast response (on the order of milliseconds), we must extract specific capabilities of general-purpose system codes to a surrogate model. Thus, we have adopted current state-of-the-art neural network libraries to build surrogate models. This work focuses on establishing the capability of neural networks to provide an accurate and precise multi-dimensional regression of a nuclear reactor's power distribution. We assess using a neural network surrogate against a previously validated model: an MCNP5 model of the MIT reactor. The results indicate that neural networks are an appropriate choice for surrogate models to implement in an autonomous reactor control framework. The MAPE across all test datasets was < 1.16 % with a corresponding standard deviation of < 0.77 %. The error is low, considering that the node-wise fission power can vary from 7 kW to 30 kW across the core.


1 Introduction

There is renewed interest in developing small modular reactors and micro-reactors. Innovation is necessary in both the construction and operation methods of these reactors for them to be financially attractive [1]. For construction, methods such as additive manufacturing [2] are under active development. For operation, an area of interest is the development of fully autonomous reactor control [3]. Significant efforts are necessary to demonstrate an autonomous control framework for a nuclear system, while adhering to established safety criteria. For critical reactors of interest, the latter precludes implementing such a framework. Therefore, our group has proposed and received support for demonstration of an autonomous framework on a subcritical system: the MIT Graphite Exponential Pile (MGEP). Details on the MGEP are available in [4].

The autonomous system under development aims to incorporate a surrogate model. Why is a surrogate model necessary? The preliminary system layout is presented in Fig. 1. There are two areas where a model is necessary: determining the extent of the system perturbation, and determining the appropriate response given a particular objective (e.g., symmetric flux distribution). In order to have a fast response (on the order of milliseconds), we must extract specific capabilities of general-purpose system codes to a surrogate model. Thus, we have adopted current state-of-the-art neural network libraries to build surrogate models.

Figure 1: Control system layout. Blue boxes indicate processes where a neural network surrogate can be deployed. RCR: Reacting control rod (controlled); ICR: Initiating control rod (unknown perturbation).

1.1 Previous applications of Neural Networks

Previous work on applications of Neural Networks (NNs) to Nuclear Engineering problems is summarized here. Several authors have focused on identification of transients [5, 6, 7] such as LOCA, CR ejection, and total loss of off-site power. Others have determined the optimal fuel loading pattern, with the objective of flattening the flux [8] or achieving a particular burnup [9]. The majority of work has focused on providing a point parameter regression: to determine the thermal power [10]; to predict DNBR using NNs [11, 12] and hybrid techniques [13]; and to predict key safety parameters, including the maximum power [14]. Only a single study was found that considered a multi-dimensional regression problem. That work used a NN to predict the transient 3-D power distribution of a theoretical homogeneous cubic reactor [15].

1.2 Objectives

The literature review indicates that there has been no work in providing a multi-dimensional regression of a realistic nuclear facility. Towards achieving our goal of demonstrating autonomous control in the MGEP, we first assess using a neural network surrogate against a well-established model. Thus, this study focused on providing a NN surrogate of the MIT reactor (MITR). The MCNP5 MITR model used in this study has been thoroughly validated [16]. The cross-sectional geometry of the MITR is presented in Fig. 6. There are 27 total positions which are filled with fuel elements, aluminum dummies, or experiments. In this study, a 22-element core is modeled, with other positions occupied by dummies. The result will be a surrogate model that will accept a control rod (shim blade) position vector, and provide a full-core power distribution.

2 Surrogate Model

Our work uses a neural network as a surrogate model. Neural networks provide several advantages over traditional machine learning algorithms (SVM, Random Forest, etc.). Neural networks are under active development and the underlying algorithms are continuously optimized for deployment on various computing architectures. There are powerful open-source libraries that abstract the development process and allow rapid deployment. Additionally, the architectures of neural networks can be modified to address varying problems (regression of power distribution or, inversely, regression of control rod position).

NPSN is the package developed to support this work and is available online at github.com/a-jd/npsn. The major components of NPSN are summarized in Fig. 2. The preparation of datasets involved preparing multiple permutations of the shim blade heights. Latin hypercube sampling of 6 heights was used to generate 151 permutations. For each permutation, an input deck for MCNP5 v1.60 is generated and executed. The output from MCNP5 is post-processed to generate power distributions for all 22 elements, with each element further discretized into 16 axial nodes. Therefore, the input of the NN will be a vector of size 6, and the output will be a matrix of size 16 × 22.
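The sampling step can be illustrated with a minimal, standard-library sketch of Latin hypercube sampling. The function name `latin_hypercube`, the normalized travel range of 0 to 1 for each blade, and the seed are assumptions made for illustration; they are not taken from NPSN or from the MCNP5 input decks.

```python
import random


def latin_hypercube(n_samples, bounds, seed=None):
    """Return n_samples points with exactly one point per stratum in every dimension."""
    rng = random.Random(seed)
    columns = []
    for low, high in bounds:
        # Split [low, high] into n_samples equal strata and draw one point in each.
        width = (high - low) / n_samples
        column = [low + (k + rng.random()) * width for k in range(n_samples)]
        # Shuffle so that strata are paired randomly across dimensions.
        rng.shuffle(column)
        columns.append(column)
    # Transpose: one row per permutation, one entry per shim blade.
    return [list(row) for row in zip(*columns)]


# Hypothetical normalized travel range for each of the 6 shim blades.
blade_bounds = [(0.0, 1.0)] * 6
heights = latin_hypercube(151, blade_bounds, seed=0)
```

Each row of `heights` would correspond to one permutation of the 6 shim blade heights, i.e. one MCNP5 run and one input vector for the NN.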

Figure 2: NPSN package layout. The left-hand graph shows the three major components. The right-hand graph shows the neural network architecture and examples of hyperparameters that are modified during the optimization process (X: IDL layer shape, Y: number of layers).

The NN architecture is dependent on the type of problem and data structure; there is no precise prescription for the structure. The general structure implemented is presented in Fig. 2. The Dense layer (details on layer functions are available at https://keras.io [17]) is used for all intermediate connections between the input and the output. There is an Intermediate Dense Layer (IDL), which consists of a variable quantity of Dense layers and a variable shape. To arrive at an optimal configuration systematically, we have implemented a meta-learning procedure. The structure and hyperparameters of the neural network are optimized based on a Tree-structured Parzen Estimator (TPE) [18]. Several hyperparameters such as the batch size, IDL number of layers, IDL shape of layers, IDL activation function, network loss function, etc., were probed.
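To give a feel for how strongly the IDL shape and number of layers affect model size, the sketch below counts the weights and biases of a plain stack of fully connected layers. It assumes the output is a single fully connected layer with 16 × 22 = 352 units that is reshaped afterwards; that layout, the function name `dense_stack_parameters`, and the example widths and depths are illustrative assumptions rather than the NPSN configuration.

```python
def dense_stack_parameters(n_inputs, idl_width, idl_layers, output_shape):
    """Count weights and biases of: input -> idl_layers dense layers -> dense output."""
    n_outputs = 1
    for dim in output_shape:
        n_outputs *= dim
    # Layer sizes from input to output, e.g. [6, 64, 64, 352].
    sizes = [n_inputs] + [idl_width] * idl_layers + [n_outputs]
    # Each dense layer holds fan_in * fan_out weights plus fan_out biases.
    return sum(fan_in * fan_out + fan_out for fan_in, fan_out in zip(sizes, sizes[1:]))


# Hypothetical configurations: a compact IDL and a much larger one.
small = dense_stack_parameters(6, 64, 2, (16, 22))
large = dense_stack_parameters(6, 512, 8, (16, 22))
print(small, large)
```

With only 151 datasets available, the parameter count grows far faster than the amount of training data as the IDL width and depth increase, which is one reason to search these hyperparameters systematically.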

Post-processing is an important step in determining the viability of the model. This is achieved by evaluating the error in providing a regression of “unseen” input datasets (known as test data). If the error of the test datasets matches the error in evaluating training data, the model has good generalization and can be expected to perform accurately for novel inputs. The error is defined as the mean absolute percentage error (MAPE),

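$$\epsilon_{i,j} = \frac{100\,\%}{N} \sum_{n=1}^{N} \left| \frac{\hat{P}_{i,j,n} - P_{i,j,n}}{P_{i,j,n}} \right| \qquad (1)$$

where $\hat{P}$ is the predicted power and $P$ is the MCNP5 power. The subscript $i$ represents the element node, $j$ the axial node, $n$ a particular permutation of the shim blade heights, and $N$ is the total number of permutations. The number of additional summations depends on the averaging mode: the core-wise error, $\epsilon_{i,j}$, requires none, whereas the element-wise error, $\epsilon_{i}$, additionally averages over the axial nodes $j$. Additional summations can lead to, e.g., the total error, $\epsilon$, which averages over both $i$ and $j$. In addition to accuracy, we also test for precision of the model. The precision is an important consideration as large variances in model outcome could lead to an unstable system. The precision is quantified by the standard deviation of the error amongst the test datasets,

$$\sigma_{i,j} = \sqrt{\frac{1}{N} \sum_{n=1}^{N} \left( \epsilon_{i,j,n} - \epsilon_{i,j} \right)^{2}} \qquad (2)$$

where $\epsilon_{i,j,n}$ is the error before averaging over $n$.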
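The two metrics can be sketched in a few lines of standard-library Python. This is a minimal illustration, assuming the powers are stored as nested lists indexed [permutation][element][axial node] and that the standard deviation is the population form (dividing by the number of permutations). The function name `mape_and_std` and the sample numbers are hypothetical and are not NPSN output.

```python
import math


def mape_and_std(predicted, reference):
    """Node-wise MAPE (%) and its standard deviation over permutations.

    Both inputs are nested lists indexed [permutation][element][axial node].
    """
    n_perm = len(reference)
    n_elem = len(reference[0])
    n_axial = len(reference[0][0])
    mape = [[0.0] * n_axial for _ in range(n_elem)]
    std = [[0.0] * n_axial for _ in range(n_elem)]
    for i in range(n_elem):
        for j in range(n_axial):
            # Absolute percentage error of this node for every permutation.
            errors = [
                100.0 * abs((predicted[n][i][j] - reference[n][i][j]) / reference[n][i][j])
                for n in range(n_perm)
            ]
            mean = sum(errors) / n_perm
            mape[i][j] = mean
            # Population standard deviation of the per-permutation errors.
            std[i][j] = math.sqrt(sum((e - mean) ** 2 for e in errors) / n_perm)
    return mape, std


# Made-up example: 2 permutations, 1 element, 2 axial nodes (arbitrary power units).
reference = [[[10.0, 20.0]], [[10.0, 20.0]]]
predicted = [[[10.1, 20.0]], [[9.7, 20.4]]]
mape, std = mape_and_std(predicted, reference)
```

Averaging the rows of `mape` over the axial index would give an element-wise error, and averaging over both indices would give a single total error.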

2.1 Optimization

The optimization process provides a useful guideline for selection of hyperparameters. In this work, the optimization process involved 500 iterations, presented in Fig. 3. There are some clear patterns that lead to a more successful model. A larger number of IDL layers or a larger IDL layer shape does not translate to a better model. Thus, an excessively large model is detrimental to performance. There is a benefit in using the logcosh loss function when evaluating the neural network, and in using the ReLU IDL activation function. There is a significant benefit in using the Adam optimizer during training. On the opposite end of the spectrum, apart from the network loss function, there are no patterns noted that guarantee a poor model. Therefore, a systematic optimization process is vital in building a neural network.

Figure 3: Outcome of 500 iterations during optimization. Each plot represents a particular hyperparameter that was modified. The x-axis represents permutations sorted in ascending loss. The top 50 model configurations are highlighted in red.

3 Results

This section details the performance metrics of the optimized model. We are interested in arriving at a model that generalizes well, while minimizing error and noise. The model had a less than 0.1 % difference between test and training element-wise error, presented in Fig. 4. Therefore, we can expect the model to provide a good regression of unseen input data. However, this assumption is invalid if the input range exceeds that of the training dataset (i.e., if a shim blade is in an anomalous position).
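The range caveat can be enforced with a simple guard before the surrogate is queried. The sketch below is a minimal illustration; the function name `within_training_range` and the sample heights are hypothetical and are not part of NPSN. A per-blade bounding box is only a necessary condition, since a point can lie inside the box yet still be far from any sampled permutation.

```python
def within_training_range(blade_heights, training_heights):
    """True if every blade height lies inside the per-blade range seen in training."""
    # zip(*rows) transposes the training set into one column per blade.
    lows = [min(column) for column in zip(*training_heights)]
    highs = [max(column) for column in zip(*training_heights)]
    return all(low <= h <= high for h, low, high in zip(blade_heights, lows, highs))


# Made-up normalized heights for two training permutations of 6 blades.
training = [
    [0.2, 0.3, 0.4, 0.5, 0.6, 0.7],
    [0.8, 0.7, 0.6, 0.5, 0.4, 0.3],
]
inside = within_training_range([0.5] * 6, training)
outside = within_training_range([0.5, 0.5, 0.5, 0.5, 0.5, 0.9], training)
```

A controller could fall back to a slower, general-purpose model or flag the state as anomalous whenever the guard returns `False`.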

Figure 4: Difference between test & training set element-wise error. The training set contains data that the surrogate model has been exposed to and used for tuning neural network parameters. The test set contains data that the model has not been exposed to. A large discrepancy would indicate that the model is unable to provide a regression for unseen data.

The spatial distribution of the error and its standard deviation is discussed next. The values of the error and its standard deviation are presented in Fig. 5. There is a coherent feature in the spatial distribution of both parameters: the geometric area towards the center of the core and the C-ring elements have larger error and standard deviation. The C-ring is the outermost ring of the MITR core. In fact, the shim blades are inserted towards the outer region of the core. Furthermore, the tips of the blades nominally lie towards the center of the core (i.e., at the centroid of the sampled heights). Therefore, the larger values of the error and its standard deviation correspond to spatial locations which experience the greatest perturbations from shim blade movement. This outcome is reasonable as we would expect spatial regions where the power distribution is relatively static, with respect to shim blade movement, to be predicted with far greater accuracy. The spatial dependence of the error is demonstrated in Fig. 6.

Figure 5: Left: Evaluation of the error function across entire test dataset. Right: Evaluation of the error function standard deviation across the entire test dataset.
Figure 6: Element-wise error across all test sets. The red bars indicate the approximate shim blade (control rod) locations in the MITR. Gray elements with suffix (D) indicate empty fuel element positions. Gray elements with suffix (E) indicate empty in-core experiment positions. Empty positions are modeled as dummy aluminum elements.

The magnitude of the error and its standard deviation is discussed next. The error ranges from 0.10-1.16 %, over 31 test datasets. The corresponding standard deviation ranges from 0.06-0.77 %. Since this is a first-of-its-kind study, there is no available literature to compare against. However, the maximum error plus maximum standard deviation (1.93 %) falls below the experimental uncertainty of the neutron detectors we will use ( %). Therefore, our study shows that a NN is a viable surrogate to use in conjunction with experimental data.

As the training data (generated by MCNP5) is limited by computational resources, it is interesting to determine if sufficient data has been generated. The error and its standard deviation as a function of the number of training sets are presented in Fig. 7. It is apparent that, initially, the error and standard deviation decrease with increasing training dataset size. Once the training dataset size exceeds 80, the improvement saturates. Thus, our work shows that approximately 100 datasets are necessary to appropriately train and test a surrogate neural network.

Figure 7: Variation in model performance as a function of total dataset size. The composition of test sets is kept constant, while the training sets vary.

Lastly, the computational runtime to provide a regression is highlighted. During the MCNP5 data generation process, 151 datasets were generated. Each dataset took on a 32-core processor to achieve satisfactory statistics. In contrast, the time required to provide a regression using the NN surrogate is (using a single NVIDIA TITAN RTX GPU). The runtime can be reduced further if we optimize the NN compilation using TensorRT. Therefore, the surrogate model will not be the limiting component in the overall system response.
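Whether a surrogate fits a response-time budget can be checked with a small timing harness. The sketch below is only illustrative: `median_latency_ms` is a hypothetical helper, and `stand_in_surrogate` is a placeholder that returns a flat 16 × 22 map rather than the trained NPSN model, so the measured value says nothing about the actual network.

```python
import statistics
import time


def median_latency_ms(predict, x, repeats=100):
    """Median wall-clock time of predict(x) in milliseconds over several repeats."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        predict(x)
        samples.append((time.perf_counter() - start) * 1e3)
    return statistics.median(samples)


def stand_in_surrogate(blade_heights):
    # Hypothetical placeholder, not the NPSN model: returns a flat 16 x 22 map.
    mean_height = sum(blade_heights) / len(blade_heights)
    return [[mean_height] * 22 for _ in range(16)]


latency = median_latency_ms(stand_in_surrogate, [0.5] * 6)
```

Using the median rather than the mean keeps the estimate robust against occasional slow calls, such as a first call that includes one-time setup.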

4 Concluding Remarks

This work focuses on establishing the capability of neural networks to provide an accurate and precise multi-dimensional regression of a nuclear reactor’s power distribution. The results indicate that neural networks are an appropriate choice for surrogate models to implement in an autonomous reactor control framework. The MAPE across all test datasets was 1.16 % with a corresponding standard deviation of 0.77 %. The error is low, considering that the node-wise fission power can vary from 7 kW to 30 kW across the core. This work also provides guidance for best practices in network architecture, hyperparameter selection and dataset size.

The code used in this work is available online as an open-source Python package, NPSN. The package is written to abstract the process of importing and pre-conditioning data, optimizing the neural network architecture, and post-processing. An example of the code syntax for the end user:

import npsn
# Define dataset directory
data_dir = '~/some/data_location'
# Define model name (for output file label)
proj_nm = 'npsn_surrogate'
# Define number of control blades
n_x = 6
# Define nodalization of power distribution
n_y = (16, 22)  #(axial_nodes, fuel_locations)
# Train neural network without optimization
npsn.train(proj_nm, data_dir, n_x, n_y)
# Or with optimization
npsn.train(proj_nm, data_dir, n_x, n_y, max_evals=100)
# Post-process to quantify error
npsn.post(proj_nm)

5 Acknowledgments

This work is supported by DOE NEUP Award Number: DE-NE0008872.

References

  • [1] D. PETTI, P. J. BUONGIORNO, M. CORRADINI, and J. PARSONS, “The future of nuclear energy in a carbon-constrained world,” Massachusetts Institute of Technology Energy Initiative (MITEI) (2018).
  • [2] “3D-printed nuclear reactor promises faster, more economical path to nuclear energy – ORNL,” https://www.ornl.gov/news/3d-printed-nuclear-reactor-promises-faster-more-economical-path-nuclear-energy.
  • [3] R. T. WOOD, B. R. UPADHYAYA, and D. C. FLOYD, “An autonomous control framework for advanced reactors,” Nuclear Engineering and Technology, 49, 5, 896–904 (aug 2017).
  • [4] M. D. GALE, Developing Modern Graphite Exponential Pile Experiments to Augment Reactor Physics Education, Ph.D. thesis, Massachusetts Institute of Technology (2018).
  • [5] A. BASU and E. B. BARTLETT, “Detecting faults in a nuclear power plant by using dynamic node architecture artificial neural networks,” Nuclear Science and Engineering, 116, 4, 313–325 (1994).
  • [6] T. V. SANTOSH, G. VINOD, R. K. SARAF, A. K. GHOSH, and H. S. KUSHWAHA, “Application of artificial neural networks to nuclear power plant transient diagnosis,” Reliability Engineering and System Safety, 92, 10, 1468–1472 (oct 2007).
  • [7] E. B. BARTLETT and R. E. UHRIG, “Nuclear power plant status diagnostics using an artificial neural network,” Nuclear Technology, 97, 3, 272–281 (1992).
  • [8] M. SADIGHI, S. SETAYESHI, and A. A. SALEHI, “PWR fuel management optimization using neural networks,” Annals of Nuclear Energy, 29, 1, 41–51 (jan 2002).
  • [9] B. LENIAU, B. MOUGINOT, N. THIOLLIERE, X. DOLIGEZ, A. BIDAUD, F. COURTIN, M. ERNOULT, and S. DAVID, “A neural network approach for burn-up calculation and its application to the dynamic fuel cycle code CLASS,” Annals of Nuclear Energy, 81, 125–133 (2015).
  • [10] M. S. ROH, S. W. CHEON, and S. H. CHANG, “Thermal Power Prediction of Nuclear Power Plant Using Neural Network and Parity Space Model,” IEEE Transactions on Nuclear Science, 38, 2, 866–872 (1991).
  • [11] H. C. KIM and S. H. CHANG, “Development of a back propagation network for one-step transient DNBR calculations,” Annals of Nuclear Energy, 24, 17, 1437–1446 (1997).
  • [12] G. C. LEE and S. H. CHANG, “Radial basis function networks applied to DNBR calculation in digital core protection systems,” Annals of Nuclear Energy, 30, 15, 1561–1572 (oct 2003).
  • [13] X. ZHAO, K. SHIRVAN, R. K. SALKO, and F. GUO, “On the prediction of critical heat flux using a physics-informed machine learning-aided framework,” Applied Thermal Engineering, 164, 114540 (jan 2020).
  • [14] H. MAZROU, “Performance improvement of artificial neural networks designed for safety key parameters prediction in nuclear research reactors,” Nuclear Engineering and Design, 239, 10, 1901–1910 (2009).
  • [15] M. BOROUSHAKI, M. B. GHOFRANI, and C. LUCAS, “Simulation of nuclear reactor core kinetics using multilayer 3-D cellular neural networks,” in “IEEE Transactions on Nuclear Science,” (jun 2005), vol. 52, pp. 719–728.
  • [16] K. SUN, M. AMES, T. NEWTON JR, and L.-W. HU, “Validation of a fuel management code MCODE-FM against fission product poisoning and flux wire measurements of the MIT reactor,” Progress in Nuclear Energy, 75, 42–48 (2014).
  • [17] F. CHOLLET ET AL., “Keras,” https://keras.io/ (2015).
  • [18] J. BERGSTRA, D. YAMINS, and D. D. COX, “Making a science of model search: Hyperparameter optimization in hundreds of dimensions for vision architectures,” in “30th International Conference on Machine Learning, ICML 2013,” (2013), vol. 28, pp. 115–123.