Data-driven discovery of Bäcklund transforms and soliton evolution equations via deep neural network learning schemes

11/18/2021
by Zijian Zhou, et al.

We introduce a deep neural network learning scheme to learn the Bäcklund transforms (BTs) of soliton evolution equations, and an enhanced deep learning scheme for data-driven soliton equation discovery based on known BTs. The first scheme takes advantage of some solution (or soliton equation) information to study the data-driven BT of the sine-Gordon equation, the complex and real Miura transforms between the defocusing (focusing) mKdV equation and the KdV equation, and the data-driven discovery of the mKdV equation via the Miura transforms. The second deep learning scheme uses explicit/implicit BTs that generate higher-order solitons to train the data-driven discovery of the mKdV and sine-Gordon equations; the higher-order solution information provides stronger constraints, so the learned soliton equations attain higher accuracy.



1 Introduction

In the fields of applied mathematics and nonlinear mathematical physics, there are many types of physically interesting nonlinear evolution partial differential equations (PDEs). Particularly, since the well-known Korteweg-de Vries (KdV) equation with solitons (or solitary waves) [2, 3] was presented by Korteweg and de Vries [4], various soliton evolution equations [5, 6] (e.g., the Boussinesq equation, mKdV equation, KP equation, sine-Gordon equation, nonlinear Schrödinger equation, and Gross-Pitaevskii equation) have played an important role in fields of nonlinear science such as fluid mechanics, nonlinear optics, quantum optics, Bose-Einstein condensates, plasma physics, ocean and atmospheric science, biology, and even finance [7, 8, 9, 10, 11, 12]. Many types of analytical, numerical, and experimental approaches have been used to deeply explore the wave structures and properties of these soliton equations (see, e.g., Refs. [5, 6, 7, 8, 9, 10, 11, 12, 13] and references therein).

Since Bäcklund [14] first found a transform (alias the auto-Bäcklund transform (aBT)) of the sine-Gordon equation in 1875, and Darboux [15] found a transform (alias the Darboux transform) of the Sturm-Liouville equation (alias the linear Schrödinger equation) in 1882, many types of well-known analytical transforms have been found between the same equation or different equations [16, 18, 17, 20, 19, 21, 22, 23, 24]. For example, Hopf [25] and Cole [26] independently established a BT between the nonlinear Burgers equation and the linear heat (or diffusion) equation in 1950-1951. In 1967, Gardner, Greene, Kruskal, and Miura (GGKM) [27] solved the initial value problem of the KdV equation starting from its coupled linear PDEs, which are just its Lax pair [6]. In 1968, strongly motivated by GGKM's idea, Lax [28] presented a general formal BT (alias the Lax pair), implying that the eigenvalues of the associated linear operator are integrals of motion of the nonlinear equation. In the same year, Miura [29] found a new BT (alias the Miura transform) between the KdV equation and the focusing (or defocusing) mKdV equation.

With the rapid development of cloud computing resources and big data, deep learning [30, 31] has been used in many fields, including cognitive science [32], image recognition [33], genomics [34], industrial areas [35, 36], etc. In particular, in the past decades, some deep neural network learning methods [37, 38, 39, 40, 41, 43, 42, 44, 45] have been developed to study partial differential equations (PDEs), which play an important role in various scientific fields. The powerful physics-informed neural network (PINN) method [44, 45] has been used to investigate PDEs [46, 47, 48, 49, 50, 51, 52], fractional PDEs [53], and stochastic PDEs [54].

In this paper, we would like to develop two kinds of deep neural network learning methods to study the discovery of BTs and soliton equations via the general system

(1a) $\mathcal{F}[u(x,t)] = 0$,
(1b) $\mathcal{G}[v(x,t)] = 0$,
(1c) $\mathcal{B}[u(x,t), v(x,t)] = 0$,

where $\mathcal{F}$, $\mathcal{G}$, $\mathcal{B}$ denote nonlinear operators acting on functions over the considered spatio-temporal region, and Eq. (1c) is called the BT between Eqs. (1a) and (1b). In particular, if Eq. (1b) is equivalent to Eq. (1a), then the transform (1c) is called an aBT. It is obvious that the information of Eq. (1a) can be shifted to Eq. (1b) with the aid of the BT (1c), and the structure of the transformed Eq. (1b) may sometimes become simpler. Therefore, BTs can be used to discover new information about Eqs. (1a) and (1b).

The rest of the paper is organized as follows. In Sec. 2, we introduce a deep neural network learning scheme to study aBTs and BTs, e.g., the aBT of the sine-Gordon equation, and the Miura transforms between the focusing/defocusing mKdV equations and the KdV equation. Moreover, this deep learning scheme can also be used to discover soliton equations with the aid of BTs. In Sec. 3, a new deep learning method for soliton equation discovery is presented based on BTs: we use the implicit and explicit BTs to exhibit the data-driven discovery of the sine-Gordon and mKdV equations with the aid of the aBT of the sine-Gordon equation and the Darboux transform of the focusing mKdV equation, respectively. Finally, some conclusions and discussions are presented in Sec. 4.

2 Data-driven discovery of BTs and soliton equations

2.1 Deep learning scheme for the BT discovery

We here would like to introduce the deep learning scheme for the discovery of BTs and soliton equations by examining system (1). The main idea of this scheme is to use some constraints on the solutions u and v to find the approximate transform between them, where u and v are represented by one or two deep neural networks. There are many kinds of neural networks, including the fully-connected neural network, the convolutional neural network, and the recurrent neural network. The constraints on the functions include the real solution data-sets and the corresponding equations. The aim of the scheme is to make the neural network solutions better approach the real data and efficiently match the underlying physical laws.

Figure 1 displays the deep learning scheme of the BT discovery, where u and v are represented by a deep neural network, which is, in general, chosen as a fully-connected neural network sharing its parameters (weights and biases) between the two outputs. In some settings, u and v can instead be represented by two different networks to eliminate mutual influence. In this diagram, we assume that u and v are real-valued functions; if the solutions of Eq. (1a) are complex, the number of output neurons is doubled. The number of input neurons equals the number of independent variables of u and v. The activation function σ, chosen here as tanh, adds a nonlinear action to the deep neural network. Notice that one can also choose other types of activation functions, such as the sigmoid (logistic) function, threshold function, piecewise linear function, ReLU function, ELU function, swish function, and softmax function.

Figure 1: The deep learning scheme for the data-driven discovery of BTs and soliton equations.
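Since the code snippets below call a helper neural_net(X, weights, biases), a minimal TF1-style sketch of one possible implementation is given here (Xavier initialization, tanh hidden layers); the authors' actual helper may differ in its details.

import numpy as np
import tensorflow as tf

def initialize_NN(layers):
    # Xavier-initialized weights/biases for a fully-connected network,
    # e.g. layers = [2, 40, 40, 40, 40, 40, 2] for inputs (x, t) and outputs (u, v)
    weights, biases = [], []
    for l in range(len(layers) - 1):
        std = np.sqrt(2.0 / (layers[l] + layers[l + 1]))
        W = tf.Variable(tf.random.truncated_normal([layers[l], layers[l + 1]],
                                                   stddev=std, dtype=tf.float32))
        b = tf.Variable(tf.zeros([1, layers[l + 1]], dtype=tf.float32))
        weights.append(W)
        biases.append(b)
    return weights, biases

def neural_net(X, weights, biases):
    # Forward pass: tanh on the hidden layers, linear output layer
    H = X
    for W, b in zip(weights[:-1], biases[:-1]):
        H = tf.tanh(tf.matmul(H, W) + b)
    return tf.matmul(H, weights[-1]) + biases[-1]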

The loss function during the training process consists of two different parts. It can be simply written as

(2) $\mathrm{TL} = \mathrm{TL}_s + \mathrm{TL}_f$

with

(3) $\mathrm{TL}_s = \dfrac{1}{N}\sum_{j=1}^{N}\left(|\hat{u}(x_j,t_j)-u_j|^2 + |\hat{v}(x_j,t_j)-v_j|^2\right), \qquad \mathrm{TL}_f = \dfrac{1}{N}\sum_{j=1}^{N}\sum_i |f_i(x_j,t_j)|^2$,

where $\{(x_j,t_j)\}_{j=1}^{N}$ denotes the set of sampling points in the spatio-temporal region, $N$ stands for the number of sampling points, which are generated by using the Latin Hypercube Sampling strategy [55], $\hat{u}$ and $\hat{v}$ are the neural-network approximations, $f_i$ are the residuals of Eqs. (1a)-(1c), and $u_j$ and $v_j$ represent the sampled solution data of Eqs. (1a) and (1b), respectively.
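In TF1-style code the loss might be assembled as follows; this is a hypothetical sketch, where u_pred/v_pred are the network outputs at the N sampling points, u_data/v_data are the sampled solutions of Eqs. (1a)-(1b), and residuals is the list of equation/BT residuals f_i.

import tensorflow as tf

# Solution misfit TL_s plus equation/BT residual loss TL_f, as in Eqs. (2)-(3)
TL_s = tf.reduce_mean(tf.square(u_pred - u_data)) + \
       tf.reduce_mean(tf.square(v_pred - v_data))
TL_f = tf.add_n([tf.reduce_mean(tf.square(f)) for f in residuals])
TL = TL_s + TL_f   # total training loss (2)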

We would like to use the above-mentioned framework to study two tasks. In the first task, we aim to learn unknown BTs: the BT residuals in the training loss contain all possible terms under the principle of homogeneity, and the constraints arising from the solution losses and equation losses make the candidate transforms approach the exact BT, so that the correct parameters are found. In the second task, we assume that Eq. (1b) is unknown. If v could be explicitly represented by u, then the data-set of u could be converted into a data-set of v; we therefore assume that the BT is implicit in v. We further assume that Eq. (1b) contains some possible terms under the principle of homogeneity. The equation can then be discovered from the constraints produced by the data and residual losses, i.e., we can learn Eq. (1b) using only the data-set of u.

In the scheme, we first use the efficient mini-batch optimization algorithm Adam [56] for the larger set of sampling points. Then the model is trained by the full-batch optimization algorithm L-BFGS [57] until the change of the loss function is less than the machine epsilon. In what follows, we use some examples to verify the validity of our deep learning scheme.
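The two-stage optimization can be sketched in TF1 style as below; train_dict (feeding x, t, and the sampled solution data) is a hypothetical placeholder name.

import numpy as np
import tensorflow as tf

# Stage 1: Adam; Stage 2: full-batch L-BFGS via the SciPy interface,
# stopping when the loss change is near machine epsilon
adam_op = tf.train.AdamOptimizer(learning_rate=1e-3).minimize(TL)
lbfgs = tf.contrib.opt.ScipyOptimizerInterface(
    TL, method='L-BFGS-B',
    options={'maxiter': 50000, 'ftol': np.finfo(float).eps})

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for it in range(20000):                      # Adam stage
        sess.run(adam_op, feed_dict=train_dict)
    lbfgs.minimize(sess, feed_dict=train_dict)   # L-BFGS stage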

2.1.1 Data-driven BT discovery of the sine-Gordon equation

The well-known sine-Gordon (s-G) equation [58]

(4) $u_{xt} = \sin u$

is a physically interesting model that can be used to describe the theory of crystal dislocations, Bloch-wall motion, splay waves in lipid membranes, magnetic flux on a Josephson line, and elementary particles [59, 60, 61]. The s-G equation (4) admits the auto-Bäcklund transform (aBT) [14]

(5) $v_x = u_x - 2k\sin\dfrac{u+v}{2}, \qquad v_t = -u_t + \dfrac{2}{k}\sin\dfrac{u-v}{2}$,

where $k$ is an arbitrary nonzero real-valued constant; that is, if $u$ is a solution of the s-G equation (4), then so is $v$ given by Eq. (5). In what follows, we would like to use the above-mentioned deep learning method to discover the parameters of the aBT (5). For convenience, we consider the generalized aBT

(6) $v_x = a\,u_x - b\sin\dfrac{u+v}{2} + h\,uu_x, \qquad v_t = -c\,u_t + d\sin\dfrac{u-v}{2} - f\,uu_t$,

where $a, b, c, d, h, f$ are real-valued parameters to be determined later, and the two quadratic nonlinear terms $h\,uu_x$ and $f\,uu_t$ are newly introduced in the unknown aBT (6). In particular, as $a = c = 1$, $b = 2k$, $d = 2/k$, $h = f = 0$, the unknown aBT (6) reduces to the known exact aBT (5).
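The validity of the aBT (5) rests on a cross-differentiation argument: differentiate the $v_x$ relation in $t$, substitute the $v_t$ relation, and use a product-to-sum identity, which shows that $v_{xt} - \sin v = u_{xt} - \sin u$. This step can be checked symbolically; the following is a minimal SymPy sketch, assuming the form of (5) written above.

import sympy as sp

x, t, k = sp.symbols('x t k')
u = sp.Function('u')(x, t)
v = sp.Function('v')(x, t)

# aBT (5): v_x and v_t expressed through u and v
v_x = sp.diff(u, x) - 2*k*sp.sin((u + v)/2)
v_t = -sp.diff(u, t) + (2/k)*sp.sin((u - v)/2)

# Cross-differentiate v_x in t, substituting v_t from the aBT itself
v_xt = sp.diff(v_x, t).subs(sp.diff(v, t), v_t)

# v_xt - sin(v) should reduce to u_xt - sin(u), so v solves the s-G
# equation whenever u does; the residual below simplifies to 0
residual = sp.simplify(v_xt - sp.sin(v) - (sp.diff(u, x, t) - sp.sin(u)))
print(residual)   # expected output: 0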

We use the above-mentioned deep learning scheme to discover the aBT of the s-G equation in two cases by considering the system

(7) $v_{xt} = \sin v, \qquad v_x = a\,u_x - b\sin\dfrac{u+v}{2} + h\,uu_x, \qquad v_t = -c\,u_t + d\sin\dfrac{u-v}{2} - f\,uu_t$

in the considered spatio-temporal region, where $a, b, c, d, h, f$ are parameters to be determined. The training data are generated from the breather solution of the s-G equation (4) [60]

(8)

Here, the hidden neural network in Python can be defined as

def uv_net(x, t):
    # One network with two outputs approximating u(x, t) and v(x, t)
    U = neural_net(tf.concat([x, t], 1), weights, biases)
    u = U[:, 0:1]
    v = U[:, 1:2]
    return u, v

such that the residual neural networks f_sG, f_BT1, and f_BT2 in Python can be obtained as

def f_sG(x, t):
    # Residuals of the s-G equation (4) for v and of the generalized aBT (6)
    u, v = uv_net(x, t)
    v_t = tf.gradients(v, t)[0]
    v_x = tf.gradients(v, x)[0]
    v_xt = tf.gradients(v_x, t)[0]
    u_x = tf.gradients(u, x)[0]
    u_t = tf.gradients(u, t)[0]
    f_sG = v_xt - tf.sin(v)
    f_BT1 = v_x - a * u_x + b * tf.sin((u + v) / 2) - h * u * u_x
    f_BT2 = v_t + c * u_t - d * tf.sin((u - v) / 2) + f * u * u_t
    return f_sG, f_BT1, f_BT2

Case A.—In this case, we suppose that a, b, c, and d are unknown parameters, and h = f = 0. It should be pointed out that the two parameters b and d are not separately fixed: since bd = 4 in the given exact aBT (5), we only consider the product value of b and d in the deep learning. We use a 6-layer neural network with 5 hidden layers and 40 neurons per layer to learn system (7). Without loss of generality, we take the initial values of all free parameters as 1. From the chosen training region, 10,000 sample points are taken by the Latin Hypercube Sampling strategy [55] (sketched below). Moreover, 20,000 steps of Adam and 50,000 steps of L-BFGS optimization are used in the deep learning. Fig. 2(a) displays the trained breather solution obtained by the deep neural network. Case A in Table 1 exhibits the learned parameters a, bd, and c and their errors for training data without noise and with a 2% noise, which imply that the used deep learning method is effective. Moreover, the errors are exhibited in Figs. 2(b1, b2) for the cases without noise and with a 2% noise, respectively. The training times are 619.92s and 637.08s, respectively.
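For reference, the Latin Hypercube sampling step can be sketched with the pyDOE package; the region bounds below are placeholders, since the paper's concrete region is not reproduced in this copy.

import numpy as np
from pyDOE import lhs   # Latin Hypercube Sampling, as in Raissi-style PINN codes

# Draw N = 10,000 collocation points in a rectangular training region
lb = np.array([-10.0, 0.0])                        # [x_min, t_min] (placeholder bounds)
ub = np.array([10.0, 1.0])                         # [x_max, t_max] (placeholder bounds)
X_train = lb + (ub - lb) * lhs(2, samples=10000)   # shape (10000, 2), inside the box
x_train, t_train = X_train[:, 0:1], X_train[:, 1:2]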

Case B.—In this case, we suppose that a, b, c, d, h, and f are all unknown parameters. We use the same deep neural network method as in Case A to study this case. Case B in Table 1 displays the learned parameters a, bd, c, h, and f and their errors for training data without noise and with a 2% noise, which imply that the used deep learning method is effective. Moreover, the errors are exhibited in Figs. 2(b3, b4) for the cases without noise and with a 2% noise, respectively. The training times are 659.96s and 697.76s, respectively.

Case           a        bd       c        h      f
Exact          1        4        1        0      0
A (no noise)   1.00007  4.00006  1.00001  0      0
A (2% noise)   0.99995  4.00057  1.00003  0      0
B (no noise)   1.00004  4.00021  1.00001  -4.24  -2.51
B (2% noise)   0.99988  3.99959  0.99985  -3.90  -3.60

Case           error of a  error of bd  error of c  error of h  error of f  time
A (no noise)   7.44        6.45         1.31        0           0           619.92s
A (2% noise)   5.29        5.74         3.24        0           0           637.08s
B (no noise)   4.17        2.12         1.37        4.24        2.51        659.96s
B (2% noise)   1.20        4.05         1.47        3.94        3.57        697.76s

Table 1: Data-driven discovery of the parameters a, bd, c, h, f in the aBT (6) and their errors, as well as the training times.
Figure 2: Data-driven aBT discovery of the sine-Gordon equation (4): (a) the breather solution (8) generating the training data; (b1-b4) relative errors between the exact and neural network solutions: (b1, b3) training data without noise; (b2, b4) training data with a 2% noise. Training times in (b1-b4) are 619.92s, 659.96s, 637.08s, and 697.76s, respectively.

2.1.2 Data-driven discovery of Miura transforms

In 1968, Miura [29] presented the well-known complex Miura transform (a special BT)

(9)

and the real Miura transform

(10)

to transform, respectively, the focusing mKdV equation [29, 6]

(11) $v_t + 6v^2v_x + v_{xxx} = 0$

and the defocusing mKdV equation [29, 6]

(12) $v_t - 6v^2v_x + v_{xxx} = 0$

into the same KdV equation [4]

(13)

which can describe shallow water waves, pressure waves, acoustic waves, magneto-sonic waves, electron plasma waves, and ion acoustic waves [62, 63].

In this subsection, the data-driven deep learning method will be used in two different cases. In the first part, the parameters of the Miura transforms are discovered, both with and without disturbance terms; the data-set is generated by an exact soliton solution of the mKdV equation and the corresponding exact solution of the KdV equation, obtained from the former via the Miura transform. In the second part, an equation is discovered through the solution of another equation: the data-set is generated by the solution of the other equation, and the real Miura transform is used in the loss function to find the unknown equation.

Case           a1       a2       a3    a4
Exact          1        1        0     0
A (no noise)   0.99931  0.99958  0     0
A (2% noise)   1.00026  0.99981  0     0
B (no noise)   1.00006  0.99990  -4.1  -1.0
B (2% noise)   1.00017  0.99994  -1.3  2.0

Case           error of a1  error of a2  error of a3  error of a4  time
A (no noise)   6.94         4.25         0            0            609.30s
A (2% noise)   2.58         1.89         0            0            645.68s
B (no noise)   6.00         1.03         4.05         5.00         635.56s
B (2% noise)   1.68         5.90         1.32         1.99         637.18s

Table 2: Data-driven discovery of the complex Miura transform (14) via system (15): the learned parameters a1-a4, their errors, and the training times.

Case 1.  Data-driven discovery of complex Miura transform

In the following, we would like to study the data-driven parameter discovery of the complex Miura transform. We consider the generalized Miura transform

(14)

containing four parameters a1, a2, a3, a4 to be determined. If these parameters take their exact values in Table 2, then the transform (14) reduces to the known complex Miura transform (9).

In what follows, we use the above-mentioned deep learning scheme to discover these parameters of the complex Miura transform (14) between the focusing mKdV equation and the KdV equation in two cases by considering the system

(15)

The training data-set is generated by the bright soliton of the focusing mKdV equation (11):

(16)

with a free parameter, together with the corresponding complex soliton of the KdV equation

(17)

obtained from (16) by the complex Miura transform (9).
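A sketch of how the transform residual might be encoded is given below. The specific candidate terms attached to a3 and a4 are assumptions for illustration, not the paper's parametrization, and vpq_net is a hypothetical 3-output network; since the KdV solution is complex, it is carried as two real outputs, doubling the output neurons as described in Sec. 2.1.

import tensorflow as tf

# Trainable transform parameters, initialized to 1 as in the text
a1 = tf.Variable(1.0, dtype=tf.float32)
a2 = tf.Variable(1.0, dtype=tf.float32)
a3 = tf.Variable(1.0, dtype=tf.float32)   # fixed to 0 in Case A
a4 = tf.Variable(1.0, dtype=tf.float32)   # fixed to 0 in Case A

def f_miura(x, t):
    # Residuals of an assumed generalized transform u = a1*v^2 + i*a2*v_x + a3*v + a4,
    # with the complex u carried as two real outputs: u = p + i*q
    v, p, q = vpq_net(x, t)               # hypothetical network: v, Re(u), Im(u)
    v_x = tf.gradients(v, x)[0]
    f_re = p - (a1 * v**2 + a3 * v + a4)  # real part of the transform residual
    f_im = q - a2 * v_x                   # imaginary part of the transform residual
    return f_re, f_im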

Case A.—We fix a3 = a4 = 0 and learn the two parameters a1, a2. The data-set is sampled in the considered spatio-temporal region, and 10,000 sampling points are used in the training process. A 6-layer neural network with 40 neurons per layer is used to learn system (15) to fit the exact solutions of the two equations. For convenience, the initial values of all free parameters are set as 1. We choose 10,000 steps of Adam and 20,000 steps of L-BFGS optimization to train the considered deep learning model. Case A in Table 2 exhibits the learned parameters a1, a2 and their errors for training data without noise and with a 2% noise, which imply that the used deep learning method is effective. The training times are 609.30s and 645.68s, respectively. Moreover, the errors are exhibited in Figs. 3(b1-b3) and 3(c1-c3) for the cases without noise and with a 2% noise, respectively.

Case B.—We learn all four parameters a1-a4. We use the same deep neural network method as in Case A to study this case. Case B in Table 2 displays the learned parameters and their errors for training data without noise and with a 2% noise, which imply that the used deep learning method is effective. The training times are 635.56s and 637.18s, respectively. Moreover, the errors are exhibited in Figs. 3(d1-d3) and 3(e1-e3) for the cases without noise and with a 2% noise, respectively.

Figure 3: Data-driven complex Miura transform discovery. (a1) soliton (16); (a2, a3) real and imaginary parts of soliton (17). (b1-e3) Relative errors between the exact and neural network solutions: (b1-b3) Case A without noise, (c1-c3) Case A with a 2% noise, (d1-d3) Case B without noise, (e1-e3) Case B with a 2% noise. The relative L2-norm errors are (b1) 2.11e-3, (b2) 2.71e-3, (b3) 2.68e-3, (c1) 1.69e-3, (c2) 2.17e-3, (c3) 2.16e-3, (d1) 1.58e-3, (d2) 2.00e-3, (d3) 2.17e-3, (e1) 9.43e-3, (e2) 1.53e-3, and (e3) 1.43e-3, respectively.
Case           b1       b2        b3       b4
Exact          1        -1        0        0
A (no noise)   0.99999  -1.00000  0        0
A (2% noise)   1.00026  -1.00012  0        0
B (no noise)   1.00000  -0.99999  0.00011  -0.00013
B (2% noise)   1.00019  -1.00023  0.00271  -0.00303

Case           error of b1  error of b2  error of b3  error of b4  time
A (no noise)   1.36         1.35         0            0            374.40s
A (2% noise)   2.55         1.22         0            0            407.44s
B (no noise)   7.39         5.66         1.14         1.25         405.32s
B (2% noise)   1.85         2.31         2.71         3.03         417.00s

Table 3: Data-driven discovery of the real Miura transform (18) via system (19): the learned parameters b1-b4, their errors, and the training times.

Case 2.  Data-driven discovery of real Miura transform

We consider the generalized form of the real Miura transform (10) as

(18)

containing four real parameters b1, b2, b3, b4 to be determined. If these parameters take their exact values in Table 3, then the transform (18) reduces to the known real Miura transform (10).

In what follows, we use the above-mentioned deep learning scheme to discover these parameters of the Miura transform (18) between the mKdV equation and the KdV equation in two cases by considering the system

(19)

The training data-set is generated by a shock wave solution of the defocusing mKdV equation (12)

(20)

with a free real parameter, and the soliton solution of the KdV equation (13)

(21)

obtained via the real Miura transform (10).

Case A.—We fix b3 = b4 = 0 and learn the two parameters b1, b2. The data-set is sampled in the considered spatio-temporal region, and 10,000 sampling points are used in the training process. A 7-layer neural network with 20 neurons per layer is used to learn system (19) to fit the exact solutions of the two equations. For convenience, the initial values of all free parameters are set as 1. We choose 5,000 steps of Adam and 5,000 steps of L-BFGS optimization to train the considered deep learning model. Case A in Table 3 exhibits the learned parameters b1, b2 and their errors for training data without noise and with a 2% noise, which imply that the used deep learning method is effective. The training times are 374.40s and 407.44s, respectively. Moreover, the errors are exhibited in Figs. 4(b1-b4) for the cases without noise and with a 2% noise, respectively.

Case B.—We learn all four parameters b1-b4. We use the same deep neural network method as in Case A to study this case. Case B in Table 3 displays the learned parameters and their errors for training data without noise and with a 2% noise, which imply that the used deep learning method is effective. The training times are 405.32s and 417.00s, respectively. Moreover, the errors are exhibited in Figs. 4(c1-c4) for the cases without noise and with a 2% noise, respectively.

Figure 4: Data-driven discovery of the real Miura transform. (a1) solution (20); (a2) soliton solution (21). (b1-c4) Relative errors between the exact and neural network solutions: (b1, b2) Case A without noise, (b3, b4) Case A with a 2% noise, (c1, c2) Case B without noise, (c3, c4) Case B with a 2% noise. The relative L2-norm errors are (b1) 5.74e-5, (b2) 1.44e-4, (b3) 2.69e-4, (b4) 4.42e-4, (c1) 5.37e-5, (c2) 9.37e-5, (c3) 4.17e-4, and (c4) 3.91e-4, respectively.

2.2 Data-driven discovery of mKdV equation via Miura transforms

2.2.1 Data-driven discovery of mKdV equation via the complex Miura transform

In this subsection, we would like to use the above-mentioned deep learning scheme to learn the mKdV equation through the complex Miura transform (9). The solution of the mKdV equation (11) cannot be explicitly expressed by the solution of the KdV equation (13), so the training data for the mKdV field cannot be calculated directly from the KdV data-set. Instead, we use a neural network to approximate the mKdV solution directly; minimizing the training loss brings this network output close to the real solution.

We would like to consider the generalized form of the original mKdV equation (11)

(22)

to test the robustness of our scheme, where the four coefficients λ1, λ2, λ3, λ4 are real-valued parameters to be determined.

We use the above-mentioned deep learning scheme to discover the parameters of the mKdV equation (22) via the complex Miura transform (9) and the KdV equation (13) in two cases by considering the system

(23)
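A minimal sketch of the unknown-equation residual with trainable coefficients follows. The concrete candidate terms attached to λ3 and λ4 are assumptions for illustration (the paper's own term library is not reproduced here), and v_net is a hypothetical network for the mKdV field.

import tensorflow as tf

# Trainable mKdV coefficients, initialized to 1 as in the text
lam1 = tf.Variable(1.0, dtype=tf.float32)
lam2 = tf.Variable(1.0, dtype=tf.float32)
lam3 = tf.Variable(1.0, dtype=tf.float32)   # fixed to 0 in Case A
lam4 = tf.Variable(1.0, dtype=tf.float32)   # fixed to 0 in Case A

def f_mkdv(x, t):
    # Residual of the candidate mKdV equation (22) with an assumed term library:
    # v_t + lam1*v^2*v_x + lam2*v_xxx + lam3*v*v_x + lam4*v_x
    v = v_net(x, t)                          # separate network for the mKdV field
    v_x = tf.gradients(v, x)[0]
    v_t = tf.gradients(v, t)[0]
    v_xx = tf.gradients(v_x, x)[0]
    v_xxx = tf.gradients(v_xx, x)[0]
    return v_t + lam1 * v**2 * v_x + lam2 * v_xxx + lam3 * v * v_x + lam4 * v_x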

Case A.—We fix λ3 = λ4 = 0 and learn the two parameters λ1, λ2. The training data-set is generated by the soliton (17) of the KdV equation and sampled in the considered spatio-temporal region; 10,000 sampling points are used in the training process. A 6-layer neural network with 40 neurons per layer is used to learn system (23) to fit the exact solutions of the two equations. For convenience, the initial values of all free parameters are set as 1. We choose 10,000 steps of Adam and 20,000 steps of L-BFGS optimization to train the considered deep learning model. Case A in Table 4 exhibits the learned parameters λ1, λ2 and their errors for training data without noise and with a 2% noise, which imply that the used deep learning method is effective. The training times are 768.54s and 764.55s, respectively. Moreover, the errors are exhibited in Figs. 5(b1-b4) for the cases without noise and with a 2% noise, respectively.

Case B.—We learn all four parameters λ1-λ4. We use the same deep neural network method as in Case A to study this case. Case B in Table 4 displays the learned parameters and their errors for training data without noise and with a 2% noise, which imply that the used deep learning method is effective. The training times are 742.11s and 750.01s, respectively. Moreover, the errors are exhibited in Figs. 5(c1-c4) for the cases without noise and with a 2% noise, respectively.

Case           λ1       λ2       λ3    λ4
Exact          6        1        0     0
A (no noise)   5.94175  0.98650  0     0
A (2% noise)   5.90499  0.97883  0     0
B (no noise)   5.96718  0.99260  1.88  1.37
B (2% noise)   5.95149  0.98922  3.3   -3.8

Case           error of λ1  error of λ2  error of λ3  error of λ4  time
A (no noise)   5.83         1.35         0            0            768.54s
A (2% noise)   9.50         2.12         0            0            764.55s
B (no noise)   3.28         7.40         1.88         1.37         742.11s
B (2% noise)   4.85         1.08         3.27         3.80         750.01s

Table 4: Data-driven discovery of the mKdV equation (22) via system (23) with the complex Miura transform (9): the learned parameters λ1-λ4, their errors, and the training times.
Figure 5: Data-driven discovery of the focusing mKdV equation via the complex Miura transform. (a1) trained soliton of the mKdV equation; (a2, a3) real and imaginary parts of the trained soliton of the KdV equation. (b1, b2) Case A without noise; (b3, b4) Case A with a 2% noise; (c1, c2) Case B without noise; (c3, c4) Case B with a 2% noise.

2.2.2 Data-driven discovery of mKdV equation via the real Miura transform

In this subsection, we would like to learn the mKdV equation through the real Miura transform (10). In general, the solution of the mKdV equation is difficult to express through the solution of the KdV equation (13), so the training data for the mKdV field cannot be directly generated from the KdV data-set. We therefore use a neural network to approximate the mKdV solution directly; a smaller training loss makes the trained network output closer to the real solution.

We would like to consider the generalized form of the original mKdV equation given by Eq. (22) to test the robustness of our scheme. We use the above-mentioned deep learning scheme to discover the parameters of the mKdV equation (22) via the real Miura transform (10) and the KdV equation (13) in two cases by considering the system

(24)

Case A.—We fix λ3 = λ4 = 0 and learn the two parameters λ1, λ2. The training data-set is generated by the soliton (21) of the KdV equation and sampled in the considered spatio-temporal region; 10,000 sampling points are used in the training process. A 7-layer neural network with 20 neurons per layer is used to learn system (24) to fit the exact solutions of the two equations. For convenience, the initial values of all free parameters are set as 1. We choose 5,000 steps of Adam and 5,000 steps of L-BFGS optimization to train the considered deep learning model. Case A in Table 5 exhibits the learned parameters λ1, λ2 and their errors for training data without noise and with a 2% noise, which imply that the used deep learning method is effective. The training times are 203.41s and 178.44s, respectively. Moreover, the errors are exhibited in Figs. 6(b1, b2) for the cases without noise and with a 2% noise, respectively.

Case B.—We learn all four parameters λ1-λ4. We use the same deep neural network method as in Case A to study this case. Case B in Table 5 displays the learned parameters and their errors for training data without noise and with a 2% noise, which imply that the used deep learning method is effective. The training times are 178.08s and 163.98s, respectively. Moreover, the errors are exhibited in Figs. 6(b3, b4) for the cases without noise and with a 2% noise, respectively.

Case           λ1        λ2       λ3        λ4
Exact          -6        1        0         0
A (no noise)   -5.98709  0.99742  0         0
A (2% noise)   -5.97933  0.99527  0         0
B (no noise)   -5.98739  0.99681  -0.00330  -0.00122
B (2% noise)   -5.97631  0.99433  -0.00204  -0.00149

Case           error of λ1  error of λ2  error of λ3  error of λ4  time
A (no noise)   1.29         2.58         0            0            203.41s
A (2% noise)   2.07         4.73         0            0            178.44s
B (no noise)   1.26         3.19         3.30         1.22         178.08s
B (2% noise)   2.37         5.67         2.04         1.50         163.98s

Table 5: Data-driven discovery of the mKdV equation (22) via system (24) with the real Miura transform (10): the learned parameters λ1-λ4, their errors, and the training times.
Figure 6: Data-driven discovery of the defocusing mKdV equation via the real Miura transform. (a1) trained soliton of the mKdV equation; (a2) trained soliton of the KdV equation. (b1) Case A without noise; (b2) Case A with a 2% noise; (b3) Case B without noise; (b4) Case B with a 2% noise.

3 The BT-enhanced scheme for data-driven PDE discovery

Some deep learning schemes for PDE discovery usually use only basic physical data (e.g., PDE solution data) and the possible forms of the presupposed PDEs (see, e.g., Ref. [45]); sometimes they are not effective. In this section, we propose an enhanced scheme for PDE discovery based on BTs. We consider the BTs in two forms, implicit and explicit, to learn the soliton equations.

3.1 The BT-enhanced deep learning scheme for PDE discovery

In this section, we provide the framework of our scheme. The main idea of PDE discovery in previous works is to use solution information, which may come from experimental data or a given data-set. But if we also know a BT of the original equation, then we can find the parameters of the original equation with higher accuracy. The basic idea of this method is to add higher-order constraints, generated by the BTs, to the training process.

We discuss our scheme in two different cases. In the first case, the BT can be written in an explicit form; in the second case, the BT can be written only in an implicit form.

Figure 7: The BT-enhanced PDE discovery scheme for the explicit BTs.

Case 1.—Fig. 7 exhibits the case in which the BT can be written in an explicit form. The solutions are represented by some neural networks, such as fully-connected networks or other kinds of networks. The data-set arises from experimental data or solution data, and the first-order data-set is calculated from it by the BT; it is worth noting that this first-order data-set should be prepared before training. The unknown equations contain the parameters to be learned.

The training loss contains three parts:

(25)
(26)
(27)

in which the first two losses make the neural network model fit the data-sets, while the third makes the unknown equation approach its exact form, composed of the above neural network solutions. If necessary, higher-order solutions can generate stronger constraints: for instance, one can additionally use the 2-order data-set to train the model and obtain a further neural network solution, in which case the corresponding losses in the TL are replaced by

(28)

and

(29)
Figure 8: The BT-enhanced PDE discovery scheme for the implicit BTs.

Case 2.—Fig. 8 displays the case in which the BT can be written only in an implicit form. Since the BT is implicit, we cannot generate the data-set of the transformed solution. We instead use the training loss to obtain the approximate neural network solution, which is still constrained by the data-set through the BT residual, and the unknown equation is learned through the equation residual.

The loss function of this scheme is

(30)

where

(31)
(32)
(33)

The exact equation can thus be found more precisely by the above scheme. If one wants to obtain higher accuracy for the unknown equations, an additional neural network solution should be added, and the losses in the TL are replaced by

(34)

and

(35)

In the two above schemes, if we only use the seed and first-order solutions to train the neural network, we call it the 1-fold BT-enhanced (BTE) scheme; if we also use the second-order solution in the training process, we call it the 2-fold BTE scheme. All schemes are trained by the efficient Adam and L-BFGS optimization algorithms.
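As an illustration, the 1-fold BTE loss for an explicit BT might be assembled as follows; this is a minimal sketch, where u0_net, u1_net, f_eq, and the data tensors are hypothetical names.

import tensorflow as tf

# u0_net/u1_net approximate the seed solution and the BT-generated first-order
# solution; f_eq builds the residual of the candidate equation (with trainable
# coefficients) for a given network.
u0_pred = u0_net(x_train, t_train)
u1_pred = u1_net(x_train, t_train)

TL_u0 = tf.reduce_mean(tf.square(u0_pred - u0_data))
TL_u1 = tf.reduce_mean(tf.square(u1_pred - u1_data))   # u1_data precomputed from u0_data via the BT
TL_F = tf.reduce_mean(tf.square(f_eq(u0_net, x_train, t_train))) + \
       tf.reduce_mean(tf.square(f_eq(u1_net, x_train, t_train)))
TL = TL_u0 + TL_u1 + TL_F   # the 2-fold scheme adds analogous second-order terms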

In what follows, we display two examples to show the validity of our schemes. The first example learns the mKdV equation based on the explicit BT (alias the Darboux transform (DT)) (36) of the mKdV equation. The second example studies the s-G equation via the implicit aBT (5).

Case              λ1       error of λ1  λ2       error of λ2  λ3        error of λ3  λ4        error of λ4
Exact             6        0            1        0            0         0            0         0
A (PINNs)         5.63392  0.366        0.90826  0.0917       0         0            0         0
A (PINNs, noise)  5.60595  0.394        0.90041  0.0996       0         0            0         0
A (PINNs, noise)  5.56262  0.437        0.89004  0.110        0         0            0         0
A (PINNs, noise)  5.58002  0.420        0.89078  0.109        0         0            0         0
A (PINNs, noise)  3.64644  2.35356      0.45905  0.541        0         0            0         0
B (PINNs)         5.58765  0.412        0.90205  0.0979       -0.02825  0.0282       -0.01493  0.0149
B (PINNs, noise)  5.80756  0.192        0.95514  0.0449       -0.02467  0.0247       -0.02322  0.0232
B (PINNs, noise)  5.74998  0.250        0.94189  0.0581       -0.01166  0.0117       -0.01847  0.0185
B (PINNs, noise)  5.63770  0.362        0.91263  0.0874       -0.00343  0.00343      -0.01174  0.0117
B (PINNs, noise)  4.86428  1.14         0.73158  0.268        -0.02057  0.0206       -0.00608  0.00608
A (BTE)           6.00973  9.73         1.00123  1.23         0         0            0         0
A (BTE, noise)    5.97970  2.03         0.99380  6.20         0         0            0         0
A (BTE, noise)    5.98296  1.70         0.99231  7.69         0         0            0         0
A (BTE, noise)    5.99462  5.38         0.99104  8.96         0         0            0         0
A (BTE, noise)    5.97702  2.30         0.97901  2.10         0         0            0         0
B (BTE)           5.99109  8.91         0.99804  1.96         0.00086   8.57         0.00110   1.10
B (BTE, noise)    5.99943  5.72         0.99818  1.82         0.00013   1.35         0.00028   2.80
B (BTE, noise)    5.98173  1.83         0.99214  7.86         0.00024   2.38         0.00016   1.62
B (BTE, noise)    5.98347  1.65         0.98876  1.12         -0.00063  6.26         -0.00142  1.42
B (BTE, noise)    5.98058  1.94         0.97879  2.12         -0.00178  1.78         -0.00305  3.05

Table 6: Comparisons of PINNs and the BT-enhanced scheme for the data-driven mKdV equation discovery.
Figure 9: Focusing mKdV equation. (a1) trained one-soliton solution via PINNs; (a2, a3) trained one- and two-soliton solutions via the BT-enhanced PDE scheme. (b1-c3) Absolute errors: (b1) PINNs learning Case A with a 5% noise; (b2, b3) BT-enhanced PDE scheme learning Case A with a 5% noise; (c1) PINNs learning Case B with a 5% noise; (c2, c3) BT-enhanced PDE scheme learning Case B with a 5% noise. The training times are (b1) 132.14s, (b2, b3) 302.67s, (c1) 135.95s, and (c2, c3) 285.60s, respectively.

3.2 Data-driven discovery of mKdV equation via the explicit BT/DT

The focusing mKdV equation (11) possesses the BT/DT [64]

(36)

with a free spectral parameter, where the vector eigenfunction is the basic solution of the Lax pair of the mKdV equation (11)

(37)

with a spectral parameter and an initial (seed) solution of the mKdV equation (11).

By using the above DT (36) with a seed solution and a chosen spectral parameter, one can obtain the one-soliton solution of the mKdV equation (11)

(38)

Further, one can use the DT (36) with the seed given by Eq. (38) and a second spectral parameter to find the 2-soliton solution of the mKdV equation (11)