To express diverse natural phenomena such as sound, heat, electrostatics, elasticity, thermodynamics, fluid dynamics, and quantum mechanics mathematically, various partial differential equations (PDEs) have been derived, and numerical methods can be applied to solve them. Representative numerical methods for solving PDEs include the finite difference method, finite element method, finite volume method, and spectral method. We focus on the finite difference method (FDM), which divides a given domain into a finite grid and finds an approximate solution by replacing derivatives with finite differences PZ2012 . This method uses each grid point and its neighboring points to predict the value at the corresponding point at the next time step. Likewise, in convolutional neural networks (CNNs) cnn , convolution operators compute each pixel of an output from the corresponding pixel and its neighboring pixels of an input. Also, the convolution operator is basically immutable. Hence, well-structured convolutional neural networks have the potential to solve partial differential equations numerically. Therefore, we propose the five-point stencil CNN (FCNN), which contains a five-point stencil kernel and a trainable approximation function, to obtain numerical solutions of PDEs. Among the various PDEs representing natural phenomena, we deal with reaction-diffusion type equations. The reaction-diffusion model has been applied in various fields such as biology NB1986 ; PBH2016 ; DYYMDJJJ2017 , chemistry BAG2009 ; ISBBDL2012 ; GHRR2013 , image segmentation HBB1995 ; SYHR2006 ; ZYMQS2020 , image inpainting MBet2000 ; YDJSJ2015 ; JJS2016 , medicine EQ2015 ; HYJ2015 ; MCYSA2021 , and so on. In this paper, we use second-order reaction-diffusion type equations: the heat, Fisher’s, and Allen–Cahn (AC) equations, and reaction-diffusion equations with trigonometric function terms.
In recent years, neural networks have been widely applied to solve PDEs. Physics-informed neural networks (PINNs) pinns1 , which are based on multi-layer perceptron (MLP) models, approximate solutions by optimizing a loss function built from the given physical laws. The biggest benefit of PINNs is that solutions can be inferred without any iterative process such as a recurrence equation with respect to time. Furthermore, PINNs are used in diverse applications such as Hidden Fluid Mechanics hfm , which extracts hidden variables of a given equation using a PINN and observations. However, it is hard to optimize the model parameters when we deal with complicated PDEs and their coefficients. To improve trainability, PINNs have been combined with numerical methods, or other neural networks such as CNNs have been used instead. Raissi et al. pinns1 added Runge–Kutta methods to a PINN model to solve the AC equation. Krishnapriyan et al. pinns2 proposed transfer learning and curriculum regularization, which start training PINNs on a specific safe domain and then transfer to a target domain. Ma et al. unetpde proposed a U-shaped CNN, the so-called U-Net unet , and showed that using target data in the loss function significantly improves optimization. Bretin et al. meancur used convolutional neural networks derived from a semi-implicit approach to learn phase field mean curvature flows of the AC equation.
We propose data-driven models that approximate the solutions of an explicit finite difference scheme in order to solve second-order reaction-diffusion type equations numerically. Our contributions are as follows:
- We propose a five-point stencil convolution operator to solve reaction-diffusion type equations.
- Our proposed model is trained using two consecutive snapshots to solve a given equation.
- We demonstrate the robustness of our method using five reaction-diffusion type equations and noisy data.
The remainder of this paper is organized as follows. In Section 2, we present how to create training data using explicit FDM and explain the FCNN concept, the training process, and the numerical solutions. In Section 3, we compare the predictions of our proposed FCNN with the PDE evolutions computed by FDM and show the robustness of our FCNN. Finally, conclusions are drawn in Section 4.
2 Methods and numerical solutions
FDM divides a given domain into a finite grid and finds an approximate solution by replacing derivatives with finite differences PZ2012 . We use explicit FDM to create training data with random initial conditions. For each equation, we use only two consecutive FDM results, the initial condition and the result at the next time step, as training data. To create training data, the computational domain $\Omega$ is discretized on a uniform grid with spacing $h$ and grid points $(x_i, y_j)$ for $1 \le i \le N_x$ and $1 \le j \le N_y$. Here, $N_x$ and $N_y$ are mesh sizes on the computational domain $\Omega$. Let $\phi_{ij}^{n}$ be approximations of $\phi(x_i, y_j, n\Delta t)$, where $\Delta t$ is the temporal step size. The boundary condition is the zero Neumann boundary condition. The Laplacian of a function is calculated using a five-point stencil and can be approximated as follows:

$$\Delta \phi_{ij} \approx \frac{\phi_{i+1,j} + \phi_{i-1,j} + \phi_{i,j+1} + \phi_{i,j-1} - 4\phi_{ij}}{h^2}. \tag{1}$$
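The five-point stencil with a zero Neumann boundary can be sketched in NumPy as follows. This is a minimal illustration rather than the authors' code: the function name `laplacian_5pt`, the cell-centered grid, and the use of edge padding to impose the zero Neumann condition are assumptions made for this example.

```python
import numpy as np

def laplacian_5pt(u, h):
    """Five-point stencil Laplacian on a uniform grid with spacing h.

    Edge padding copies the boundary values outward, so the difference
    across the boundary is zero (zero Neumann condition, assumed form).
    """
    p = np.pad(u, 1, mode="edge")
    return (p[2:, 1:-1] + p[:-2, 1:-1] + p[1:-1, 2:] + p[1:-1, :-2] - 4.0 * u) / h**2

# Check on u = x^2 + y^2, whose Laplacian is 4 everywhere; the stencil is
# exact for quadratics at interior points.
n = 16
h = 1.0 / n
x = (np.arange(n) + 0.5) * h              # cell-centered coordinates (assumption)
X, Y = np.meshgrid(x, x, indexing="ij")
lap = laplacian_5pt(X**2 + Y**2, h)
```

Only the interior values of `lap` equal 4 here, because the test function does not satisfy the zero Neumann condition at the boundary.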
In this way, the first and second derivatives of the solution at each grid point can be approximated within the $3 \times 3$ local area centered at that point. This concept is equivalent to applying $3 \times 3$ convolution kernels. The $3 \times 3$ kernels have the following properties:
1. for any (element-wise summation)
2. for any (element-wise multiplication)
3. for any (element-wise division)
4. for any and any real numbers
Second-order PDEs can be solved numerically using combinations of $3 \times 3$ kernels. Therefore, if we build a proper CNN in the form of the recurrence Eq. (4), we can solve the PDE (2) numerically. A previous study on the AC equation cnnallencahn shows that FDM can be expressed as a CNN.
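To make the link between FDM and $3 \times 3$ kernels concrete, the sketch below applies kernels with stride 1 and checks that an explicit heat-equation step written with two kernels equals the same step written with one combined kernel. The helper `conv3x3`, the grid spacing, the step size, and the diffusion coefficient are hypothetical choices for illustration only.

```python
import numpy as np

def conv3x3(u, kernel):
    """Apply a 3x3 kernel with stride 1 after zero-Neumann (edge) padding."""
    p = np.pad(u, 1, mode="edge")
    n, m = u.shape
    out = np.zeros_like(u, dtype=float)
    for a in range(3):
        for b in range(3):
            out += kernel[a, b] * p[a:a + n, b:b + m]
    return out

h = 0.1
lap_kernel = np.array([[0.0, 1.0, 0.0],
                       [1.0, -4.0, 1.0],
                       [0.0, 1.0, 0.0]]) / h**2   # five-point stencil as a kernel
identity = np.zeros((3, 3))
identity[1, 1] = 1.0

rng = np.random.default_rng(0)
u = rng.uniform(-1.0, 1.0, size=(32, 32))
dt, alpha = 1e-4, 1.0                              # illustrative values

# One explicit heat-equation step written with two kernels ...
two_kernels = conv3x3(u, identity) + dt * alpha * conv3x3(u, lap_kernel)
# ... and with one combined kernel, using linearity of the convolution.
one_kernel = conv3x3(u, identity + dt * alpha * lap_kernel)
```

Because the convolution is linear in the kernel, a whole explicit diffusion step collapses into a single five-point filter, which is what makes a small trainable filter sufficient.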
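To solve second-order reaction-diffusion type equations

$$\frac{\partial \phi(x,y,t)}{\partial t} = \alpha \Delta \phi(x,y,t) + \beta f(\phi(x,y,t)), \tag{2}$$

where $\alpha$ is a diffusion coefficient, $\beta$ is a reaction coefficient, and $f$ is a smooth function representing the reaction effect, we propose FCNN as a recurrence relation: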
As a CNN,
containing a 5-point stencil filter and a pad satisfying given boundary conditions solves. In order to approximate terms, we define a trainable polynomial function as follows:
with model parameters for any and a real value . Let be a FCNN. Then, the inference is as follows:
where is a set of model parameters. Figure 1 shows the computational graph of our explicit model FCNN containing model parameters for any in a filter. Furthermore, represents the diffusion term on the uniform grid of the $x$ and $y$ axes, so we set up and to cut down on training time. When the five-point stencil filter is used and is a -th order polynomial function, the number of model parameters is only . Thus, the set-up enables the model to learn physical patterns from few data. In Algorithm 1, an initial image and the prediction at the next time are used with training data and to train a model . The objective function is the mean square error function as follows:
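$$\mathcal{L} = \frac{1}{N} \sum_{i=1}^{N} \left( \hat{y}_i - y_i \right)^2,$$

where $N$, $\hat{y}_i$, and $y_i$ are the number of pixels in an output image, a prediction, and its target, respectively.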
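One plausible reading of the model described above is a five-point stencil filter plus a trainable polynomial, both applied to the current snapshot. The sketch below follows that reading under stated assumptions: the forward form in `fcnn_forward`, the AC reaction term $\phi - \phi^3$, and all coefficient values are illustrative, and an ordinary least-squares fit stands in for ADAM because this form is linear in its parameters.

```python
import numpy as np

def fdm_step_ac(u, h, dt, alpha, beta):
    """One explicit FDM step of u_t = alpha * Lap(u) + beta * (u - u^3)."""
    p = np.pad(u, 1, mode="edge")                  # zero Neumann boundary
    lap = (p[2:, 1:-1] + p[:-2, 1:-1] + p[1:-1, 2:] + p[1:-1, :-2] - 4.0 * u) / h**2
    return u + dt * (alpha * lap + beta * (u - u**3))

def features(u, d):
    """Columns: the five stencil entries, then the powers u^0 ... u^d."""
    p = np.pad(u, 1, mode="edge")
    stencil = [p[1:-1, 1:-1], p[2:, 1:-1], p[:-2, 1:-1], p[1:-1, 2:], p[1:-1, :-2]]
    powers = [u**k for k in range(d + 1)]
    return np.stack([c.ravel() for c in stencil + powers], axis=1)

def fcnn_forward(u, theta, d):
    """Assumed forward pass: stencil response plus trainable polynomial."""
    return (features(u, d) @ theta).reshape(u.shape)

def mse(pred, target):
    """Mean square error over all pixels."""
    return np.mean((pred - target) ** 2)

rng = np.random.default_rng(0)
n, h, dt, alpha, beta, d = 100, 0.01, 1e-5, 1.0, 100.0, 3   # hypothetical values

# Training pair: a random initial condition and its next FDM step.
u0 = rng.uniform(-1.0, 1.0, size=(n, n))
u1 = fdm_step_ac(u0, h, dt, alpha, beta)

# Least squares replaces ADAM in this sketch (a swapped-in technique).
theta, *_ = np.linalg.lstsq(features(u0, d), u1.ravel(), rcond=None)

# Evaluate on an unseen random initial condition.
v0 = rng.uniform(-1.0, 1.0, size=(n, n))
test_mse = mse(fcnn_forward(v0, theta, d), fdm_step_ac(v0, h, dt, alpha, beta))
```

With five filter weights and a cubic polynomial, `theta` has nine entries. The centre weight and the linear polynomial coefficient multiply the same quantity, so the individual values are not unique even though the prediction is.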
2.1 Reaction-diffusion type equations
To demonstrate the robustness of FCNN, we consider the following reaction-diffusion type equations: the heat, Fisher’s, and AC equations, and reaction-diffusion equations with trigonometric functions. The reaction and diffusion coefficients used in each equation are listed in Table 1.
For the AC equation, where is the thickness of the transition layer and which value is cnnallencahn . For the other equations, we select arbitrary coefficients. For all the following equations, the continuous equations and the discretized equations are described in turn, and the zero Neumann boundary condition is used.
2.1.1 Heat equation
2.1.2 Fisher’s equation
2.1.3 AC equation
2.1.4 Reaction-diffusion equation with trigonometric function: sine
2.1.5 Reaction-diffusion equation with trigonometric function: tanh
where and each discretized equation ((11), (13), (15), (17), (19)) is implemented based on the model structure proposed in cnnallencahn . When and , all the equations show almost the same evolution, so we use reaction coefficients much larger than the diffusion coefficients to check diverse evolutions, as shown in Table 1.
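The explicit schemes for the five equations differ only in the reaction term, which the following sketch makes explicit. The heat, Fisher's, and AC reaction terms are the standard textbook forms; the arguments of the sine and tanh terms, the coefficient values, and the names `REACTIONS`, `explicit_step`, and `evolve` are assumptions for illustration and are not taken from Table 1.

```python
import numpy as np

# Reaction terms f(u). The sine and tanh forms are assumed for this sketch.
REACTIONS = {
    "heat": lambda u: np.zeros_like(u),
    "fisher": lambda u: u - u**2,
    "ac": lambda u: u - u**3,
    "sine": lambda u: np.sin(np.pi * u),
    "tanh": lambda u: np.tanh(u),
}

def explicit_step(u, name, h, dt, alpha, beta):
    """One explicit Euler step of u_t = alpha * Lap(u) + beta * f(u)."""
    p = np.pad(u, 1, mode="edge")                  # zero Neumann boundary
    lap = (p[2:, 1:-1] + p[:-2, 1:-1] + p[1:-1, 2:] + p[1:-1, :-2] - 4.0 * u) / h**2
    return u + dt * (alpha * lap + beta * REACTIONS[name](u))

def evolve(u, name, steps, h, dt, alpha, beta):
    """Repeat the explicit step to advance the solution in time."""
    for _ in range(steps):
        u = explicit_step(u, name, h, dt, alpha, beta)
    return u

rng = np.random.default_rng(0)
u0 = rng.uniform(0.0, 1.0, size=(64, 64))
params = dict(h=1.0 / 64, dt=1e-5, alpha=1.0, beta=100.0)   # hypothetical values
results = {name: evolve(u0, name, steps=50, **params) for name in REACTIONS}
```

The step size is chosen so that `alpha * dt / h**2` stays below 0.25, the usual stability bound for the explicit five-point scheme in two dimensions.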
3 Simulation results
Assume that we observe a reaction-diffusion pattern and investigate the rule behind the pattern under the constraint that the observations and predictions follow the same PDE. Our proposed FCNN is trained using only two consecutive snapshots, the initial condition and the result at the next time step, for each equation. Then, we evaluate the model using diverse unseen initial values.
In the simulations, we use random initial value data with a $100 \times 100$ mesh so that the size of the input data is $102 \times 102$ containing a pad as a boundary condition. Also, the order of the trainable polynomial is fixed depending on the given equation, with one value for Heat, Fisher’s, and AC and another for Sine and Tanh, and a $3 \times 3$ convolutional filter is used with a stride of 1 in Eq. (7). Hence, the filter has 10,000 ($100 \times 100$) chances to learn the evolution of the result images, so training a model using only two consecutive images is enough to optimize nine or thirteen model parameters. As an optimizer, ADAM adam is used with a learning rate of 0.01 and without any regularization. Instead, we apply early stopping earlystopping based on validation data to avoid overfitting. To demonstrate the approximation of non-polynomial functions, we additionally consider sine and tanh functions besides the heat, Fisher’s, and AC equations.
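The optimizer and stopping rule can be sketched on a generic model that is linear in its parameters. The code below is a hand-written ADAM update with a learning rate of 0.01 and a patience-based early stopping rule; the function `adam_fit`, the patience value, the synthetic data, and the way the validation set is formed are assumptions for illustration and do not reproduce the training script of the paper.

```python
import numpy as np

def adam_fit(X, y, X_val, y_val, lr=0.01, patience=20, max_epochs=5000,
             b1=0.9, b2=0.999, eps=1e-8):
    """Minimise the MSE of X @ theta with ADAM; stop early on validation loss."""
    theta = np.zeros(X.shape[1])
    m = np.zeros_like(theta)                       # first-moment estimate
    v = np.zeros_like(theta)                       # second-moment estimate
    best_theta, best_val, wait = theta.copy(), np.inf, 0
    for t in range(1, max_epochs + 1):
        grad = 2.0 * X.T @ (X @ theta - y) / len(y)
        m = b1 * m + (1.0 - b1) * grad
        v = b2 * v + (1.0 - b2) * grad**2
        m_hat = m / (1.0 - b1**t)                  # bias correction
        v_hat = v / (1.0 - b2**t)
        theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
        val = np.mean((X_val @ theta - y_val) ** 2)
        if val < best_val:
            best_theta, best_val, wait = theta.copy(), val, 0
        else:
            wait += 1
            if wait >= patience:                   # early stopping
                break
    return best_theta, best_val

# Synthetic linear problem standing in for the training pair (assumption).
rng = np.random.default_rng(0)
theta_true = np.array([0.6, 0.1, -0.3])
X, X_val = rng.uniform(-1, 1, (500, 3)), rng.uniform(-1, 1, (200, 3))
y, y_val = X @ theta_true, X_val @ theta_true

initial_val = np.mean(y_val**2)                    # validation loss at theta = 0
theta_fit, best_val = adam_fit(X, y, X_val, y_val)
```

Keeping the parameters with the lowest validation loss, rather than the last iterate, is what replaces explicit regularization in this set-up.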
For the evaluation, we implement FCNN and FDM separately and then measure the averaged relative error with a 95% confidence interval over 100 novel random initial values, as shown in Table 2.
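The evaluation metric can be sketched as follows. The choice of the $L_2$ norm for the relative error and the normal-approximation interval (mean $\pm$ 1.96 standard errors) are assumptions; the helper names are hypothetical.

```python
import numpy as np

def relative_l2_error(pred, ref):
    """Relative L2 error of a prediction with respect to a reference field."""
    return np.linalg.norm(pred - ref) / np.linalg.norm(ref)

def mean_with_ci95(values):
    """Sample mean and half-width of a normal-approximation 95% interval."""
    values = np.asarray(values, dtype=float)
    half_width = 1.96 * values.std(ddof=1) / np.sqrt(len(values))
    return values.mean(), half_width

# Usage with synthetic fields standing in for FCNN and FDM results.
rng = np.random.default_rng(0)
errors = []
for _ in range(100):
    ref = rng.uniform(-1.0, 1.0, size=(100, 100))
    pred = ref + 1e-3 * rng.standard_normal(ref.shape)
    errors.append(relative_l2_error(pred, ref))
mean_err, ci_half_width = mean_with_ci95(errors)
```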
Furthermore, we validate the errors using different types of initial values for each equation as shown in Table 3. The initial conditions are described in the Appendix Section.
Figures 2-6 show the time evolution results when unseen initial shapes (circle, star, three circles, torus, and maze) are given after learning with two training data (random initial condition and next time step result with FDM). We compare the predicted results from pretrained models to the FDM results.
Data-driven models are sensitive to data noise. To investigate the noise effect, we inject Gaussian random noise to and then the model is trained using and for the AC equation. Table 4 shows that the model can be trained under the noise condition. Figure 7 displays the results of the inference using contaminated models.
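Noise injection of this kind can be sketched in a few lines. The noise level `sigma`, the helper name `add_gaussian_noise`, and the choice of which snapshot is contaminated are assumptions for illustration only.

```python
import numpy as np

def add_gaussian_noise(u, sigma, rng):
    """Return a copy of u contaminated with zero-mean Gaussian noise of std sigma."""
    return u + rng.normal(loc=0.0, scale=sigma, size=u.shape)

rng = np.random.default_rng(0)
clean = rng.uniform(-1.0, 1.0, size=(100, 100))
noisy = add_gaussian_noise(clean, sigma=0.05, rng=rng)   # hypothetical noise level
noise_std = float(np.std(noisy - clean))
```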
4 Conclusions
In this paper, we proposed the five-point stencil CNN (FCNN), which contains a five-point stencil kernel and a trainable approximation function. We considered reaction-diffusion type equations including the heat, Fisher’s, and Allen–Cahn equations, and reaction-diffusion equations with trigonometric functions. We showed that our proposed FCNN can be trained well using few data (only two consecutive snapshots) and can then predict reaction-diffusion evolution from diverse unseen initial conditions. Also, we demonstrated the robustness of our FCNN under the noise condition.
The corresponding author Y. Choi was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (NRF-2020R1C1C1A0101153712).
In this appendix, we describe the initial conditions used in the simulation results in Section 3. A detailed description of these initial conditions can be found in our previous paper cnnallencahn .
(1) The initial condition of a circle shape
where is the initial radius of a circle.
(2) The initial condition of a star shape
(3) The initial condition of a torus shape
where and are the radius of major (outside) and minor (inside) circles, respectively. And, for simplicity of expression, .
(4) The initial condition of a maze shape
The equation for the initial condition of a maze shape is complicated to write down, so we refer to the code, which is available from the first author’s GitHub page (https://github.com/kimy-de) and the corresponding author’s web page (https://sites.google.com/view/yh-choi/code).
(5) The initial condition of a random shape
here the function rand has a random value between and .
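Two of the initial conditions above can be sketched as follows. The tanh profile for the circle is a form commonly used for phase-field problems and is assumed here rather than taken from this appendix; the domain $(0,1)^2$, the centre, the interface parameter `eps`, and the range $[-1, 1]$ for the random values are likewise assumptions.

```python
import numpy as np

def circle_ic(n, radius, eps, center=(0.5, 0.5)):
    """Circle-shaped initial condition on an n x n cell-centered grid of (0,1)^2.

    Uses the profile tanh((radius - distance) / (sqrt(2) * eps)), an assumed form.
    """
    h = 1.0 / n
    x = (np.arange(n) + 0.5) * h
    X, Y = np.meshgrid(x, x, indexing="ij")
    dist = np.sqrt((X - center[0])**2 + (Y - center[1])**2)
    return np.tanh((radius - dist) / (np.sqrt(2.0) * eps))

def random_ic(n, rng, low=-1.0, high=1.0):
    """Random initial condition with independent uniform values in [low, high)."""
    return rng.uniform(low, high, size=(n, n))

circle = circle_ic(100, radius=0.25, eps=0.01)
random_field = random_ic(100, np.random.default_rng(0))
```

The circle field is close to 1 inside the circle and close to -1 outside, with a transition layer whose width is controlled by `eps`.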
- (1) P. Zhou, Numerical analysis of electromagnetic fields. Springer Science & Business Media, 2012.
- (2) S Kondo, T Miura, ”Reaction-diffusion model as a framework for understanding biological pattern formation.” science 329.5999 (2010): 1616-1620.
- (3) NF Britton, Reaction-diffusion equations and their applications to biology. Academic Press, 1986.
- (4) P Broadbridge, BH Bradshaw-Hajek, ”Exact solutions for logistic reaction–diffusion equations in biology.” Zeitschrift für angewandte Mathematik und Physik 67.4 (2016): 1-13.
- (5) D Jeong, Y Li, Y Choi, M Yoo, D Kang, J Park, J Choi, J Kim ”Numerical simulation of the zebra pattern formation on a three-dimensional model.” Physica A: Statistical Mechanics and its Applications 475 (2017): 106-116.
- (6) BA Grzybowski, Chemistry in motion: reaction-diffusion systems for micro-and nanotechnology. John Wiley & Sons, 2009.
- (7) I Sgura, B Bozzini, D Lacitignola, ”Numerical approximation of oscillating Turing patterns in a reaction-diffusion model for electrochemical material growth.” AIP Conference Proceedings. Vol. 1493. No. 1. American Institute of Physics, 2012.
- (8) G Hariharan, R Rajaraman, ”A new coupled wavelet-based method applied to the nonlinear reaction–diffusion equation arising in mathematical chemistry.” Journal of Mathematical Chemistry 51.9 (2013): 2386-2400.
- (9) H Tek, BB Kimia, ”Image segmentation by reaction-diffusion bubbles.” Proceedings of IEEE International Conference on Computer Vision. IEEE, 1995.
- (10) S Esedoḡlu, YHR Tsai, ”Threshold dynamics for the piecewise constant Mumford–Shah functional.” Journal of Computational Physics 211.1 (2006): 367-384.
- (11) Z Zhang, YM Xie, Q Li, S Zhou, ”A reaction–diffusion based level set method for image segmentation in three dimensions.” Engineering Applications of Artificial Intelligence 96 (2020): 103998.
- (12) M Bertalmio et al. ”Image inpainting.” Proceedings of the 27th annual conference on Computer graphics and interactive techniques. 2000.
- (13) Y Li, D Jeong, J Choi, S Lee, J Kim, ”Fast local image inpainting based on the Allen–Cahn model.” Digital Signal Processing 37 (2015): 65-74.
- (14) J Yu, J Ye, S Zhou, ”Reaction-diffusion system with additional source term applied to image restoration.” International Journal of Computer Applications 975 (2016): 8887.
- (15) E Özuğurlu, ”A note on the numerical approach for the reaction–diffusion problem to model the density of the tumor growth dynamics.” Computers & Mathematics with Applications 69.12 (2015): 1504-1517.
- (16) HG Lee, Y Kim, J Kim, ”Mathematical model and its fast numerical method for the tumor growth.” Mathematical Biosciences & Engineering 12.6 (2015): 1173.
- (17) M Yousefnezhad, CY Kao, SA Mohammadi, ”Optimal Chemotherapy for Brain Tumor Growth in a Reaction-Diffusion Model.” SIAM Journal on Applied Mathematics 81.3 (2021): 1077-1097.
- (18) S.M. Allen, J.W. Cahn, “A microscopic theory for antiphase boundary motion and its application to antiphase domain coarsening.” Acta metallurgica 27.6 (1979): 1085–1095.
- (19) Yongho Kim, Gilnam Ryu, and Yongho Choi (2021) Fast and Accurate Numerical Solution of Allen-Cahn Equation, Mathematical Problems in Engineering, vol. 2021, Article ID 5263989, 12 pages, 2021. https://doi.org/10.1155/2021/5263989
- (20) LeCun, Yann and Boser, Bernhard and Denker, John and Henderson, Donnie and Howard, R. and Hubbard, Wayne and Jackel, Lawrence, (1990), Handwritten Digit Recognition with a Back-Propagation Network, Advances in Neural Information Processing Systems, Vol 2.
- (21) Diederik P. Kingma and Jimmy Ba, (2017). Adam: A Method for Stochastic Optimization, arXiv:1412.6980.
- (22) Maziar Raissi, Paris Perdikaris, and George Em Karniadakis, (2019). Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations, Journal of Computational Physics 378 (2019): 686-707.
- (23) Aditi S. Krishnapriyan, Amir Gholami, Shandian Zhe, Robert M. Kirby, and Michael W. Mahoney, (2021). Characterizing Possible Failure Modes in Physics-Informed Neural Networks, Neural Information Processing Systems (NeurIPS) 2021, arXiv:2109.01050.
- (24) Hao Ma, Yuxuan Zhang, Nils Thuerey, Xiangyu Hu, Oskar J. Haidn, (2021). Physics-driven Learning of the Steady Navier-Stokes Equations using Deep Convolutional Neural Networks, arXiv:2106.09301.
- (25) Maziar Raissi, Alireza Yazdani, and George Em Karniadakis. (2020) Hidden fluid mechanics: Learning velocity and pressure fields from flow visualizations, Science 367.6481 (2020): 1026-1030.
- (26) Elie Bretin, Roland Denis, Simon Masnou, and Garry Terii, (2021). Learning Phase Field Mean Curvature Flows With Neural Networks, arXiv:2112.07343.
- (27) Olaf Ronneberger, Philipp Fischer and Thomas Brox, (2015). U-Net: Convolutional Networks for Biomedical Image Segmentation, Medical Image Computing and Computer-Assisted Intervention (MICCAI) 2015, Lecture Notes in Computer Science, vol 9351. Springer, Cham.
- (28) Lutz Prechelt, (1998). Early Stopping - but when?, In: Orr G.B., Müller KR. (eds) Neural Networks: Tricks of the Trade. Lecture Notes in Computer Science, vol 1524.