 # Machine learning based non-Newtonian fluid model with molecular fidelity

We introduce a machine-learning-based framework for constructing continuum non-Newtonian fluid dynamics models directly from a micro-scale description. A polymer solution is used as an example to demonstrate the essential ideas. To faithfully retain molecular fidelity, we establish a micro-macro correspondence via a set of encoders for the micro-scale polymer configurations and their macro-scale counterparts, a set of nonlinear conformation tensors. The dynamics of these conformation tensors can be derived from the micro-scale model, and the relevant terms can be parametrized using machine learning. The final model, named the deep non-Newtonian model (DeePN^2), takes the form of conventional non-Newtonian fluid dynamics models, with a new form of the objective tensor derivative. Numerical results demonstrate the accuracy of DeePN^2.


## Appendix A Rotational symmetry of the model ansatz and the DNN representation

In this section, we show that both the modeling ansatz and the DNN representation of the DeePN^2 model satisfy the rotational invariance condition.

### A.1 Rotational invariance from the continuum and microscopic perspective

Let us consider a symmetric tensor c in two different coordinate frames. Frame 2 is a static inertial frame; we denote by x, v, c the position, velocity, and tensor in frame 2. Frame 1 is a rotated frame related to frame 2 by a time-dependent unitary matrix Q(t); we denote by \tilde{x}, \tilde{v}, \tilde{c} the position, velocity, and tensor in frame 1. Accordingly, x, v, and c follow the transformation rule

 \tilde{x} = Qx, \qquad \tilde{v} = Qv(x,t) + \dot{Q}Q^T\tilde{x}, \qquad \tilde{c} = Qc(x,t)Q^T \qquad (16)
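As a quick sanity check of the tensor part of this rule (a sketch, not part of the original derivation), the map c -> QcQ^T preserves both the symmetry and the eigenvalue spectrum of c:

```python
import numpy as np

# Sanity check of the tensor transformation rule in Eq. (16): under a
# rotation Q, a symmetric tensor c maps to Q c Q^T, which preserves
# symmetry and the eigenvalue spectrum.
rng = np.random.default_rng(0)

def random_rotation(rng):
    # QR decomposition of a random matrix yields an orthogonal Q.
    q, r = np.linalg.qr(rng.standard_normal((3, 3)))
    return q * np.sign(np.diag(r))   # fix column signs

A = rng.standard_normal((3, 3))
c = A @ A.T                          # symmetric positive semi-definite tensor
Q = random_rotation(rng)
c_tilde = Q @ c @ Q.T                # Eq. (16): c~ = Q c Q^T

assert np.allclose(c_tilde, c_tilde.T)               # symmetry preserved
assert np.allclose(np.linalg.eigvalsh(c_tilde),
                   np.linalg.eigvalsh(c))            # spectrum invariant
```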

To construct the dynamics of c, we need to choose an objective derivative \mathcal{D}c/\mathcal{D}t which retains the proper rotational symmetry, i.e.,

 \frac{\mathcal{D}\tilde{c}}{\mathcal{D}t}\Big|_{\text{frame 1}} = Q\,\frac{\mathcal{D}c}{\mathcal{D}t}\Big|_{\text{frame 2}}\,Q^T \qquad (17)

For example, if we choose \mathcal{D}c/\mathcal{D}t as the material derivative dc/dt, Eq. (17) cannot be satisfied, since

 \frac{d\tilde{c}}{dt}\Big|_{\text{frame 1}} = \dot{Q}cQ^T + Qc\dot{Q}^T + Q\,\frac{dc}{dt}\Big|_{\text{frame 2}}\,Q^T \neq Q\,\frac{dc}{dt}\Big|_{\text{frame 2}}\,Q^T, \qquad (18)

where the second identity follows from

 \frac{dc}{dt}\Big|_{\text{frame 1}} = \frac{\partial c(Q^T\tilde{x},t)}{\partial t} + \tilde{v}\cdot\nabla_{\tilde{x}}c(Q^T\tilde{x},t) = \frac{\partial c(x,t)}{\partial t} + (\dot{Q}^T\tilde{x}\cdot\nabla_x)c + (Qv(x,t)+\dot{Q}x)\cdot\nabla_{\tilde{x}}c = \frac{\partial c(x,t)}{\partial t} + v(x,t)\cdot\nabla_x c = \frac{dc}{dt}\Big|_{\text{frame 2}} \qquad (19)

Alternatively, if we choose \mathcal{D}c/\mathcal{D}t to be one of the objective tensor derivatives coupled with the velocity gradient \kappa, e.g., the upper-convected derivative, the covariant derivative, or the Jaumann derivative, Eq. (17) is satisfied. For example, for the upper-convected derivative,

 \frac{d\tilde{c}}{dt} - \tilde{\kappa}\tilde{c} - \tilde{c}\tilde{\kappa}^T\Big|_{\text{frame 1}} = Q\Big(\frac{dc}{dt} - \kappa c - c\kappa^T\Big)\Big|_{\text{frame 2}}Q^T, \quad \text{with } \tilde{\kappa} = Q\kappa Q^T + \dot{Q}Q^T. \qquad (20)
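The objectivity condition can also be verified numerically. The sketch below (an illustration under the assumptions of a spatially homogeneous c(t), so the material derivative reduces to d/dt, and the standard transformation \tilde{\kappa} = Q\kappa Q^T + \dot{Q}Q^T of the velocity gradient) checks by finite differences that the upper-convected combination dc/dt - \kappa c - c\kappa^T transforms as Eq. (17) requires:

```python
import numpy as np

# Finite-difference check of the objectivity condition, Eq. (17), for the
# upper-convected combination dc/dt - kappa c - c kappa^T.  Assumptions:
# spatially homogeneous c(t), and kappa~ = Q kappa Q^T + Qdot Q^T.
def Q_of(t, w=0.7):
    cth, sth = np.cos(w * t), np.sin(w * t)
    return np.array([[cth, -sth, 0.0], [sth, cth, 0.0], [0.0, 0.0, 1.0]])

def c_of(t):
    base = np.array([[2.0, 0.3, 0.1], [0.3, 1.5, 0.2], [0.1, 0.2, 1.0]])
    return base * (1.0 + 0.1 * np.sin(t))      # smooth symmetric history

kappa = np.array([[0.0, 1.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]])  # shear

t, h = 0.4, 1e-6
Q, c = Q_of(t), c_of(t)
Qdot = (Q_of(t + h) - Q_of(t - h)) / (2 * h)
cdot = (c_of(t + h) - c_of(t - h)) / (2 * h)

kappa_t = Q @ kappa @ Q.T + Qdot @ Q.T         # velocity gradient in frame 1
c_t = Q @ c @ Q.T
c_t_dot = (Q_of(t + h) @ c_of(t + h) @ Q_of(t + h).T
           - Q_of(t - h) @ c_of(t - h) @ Q_of(t - h).T) / (2 * h)

ucd_2 = cdot - kappa @ c - c @ kappa.T               # frame 2
ucd_1 = c_t_dot - kappa_t @ c_t - c_t @ kappa_t.T    # frame 1
assert np.allclose(ucd_1, Q @ ucd_2 @ Q.T, atol=1e-6)
```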

On the other hand, this analysis does not provide concrete guidance for constructing the objective derivative, since multiple choices, such as the upper-convected, covariant, and Jaumann derivatives, all satisfy Eq. (17). To address this issue, we turn to a micro-scale perspective based on the Fokker-Planck equation to understand the rotational invariance and to construct \mathcal{D}c/\mathcal{D}t.

Let us consider the Fokker-Planck equation of a dumbbell polymer with end-to-end vector r coupled with a flow field with velocity gradient \kappa. Ignoring external fields, the evolution of the density \rho(r,t) is governed by

 \frac{\partial\rho(r,t)}{\partial t} = -\nabla\cdot\Big[(\kappa\cdot r)\rho - \frac{2k_BT}{\gamma}\nabla\rho - \frac{2}{\gamma}\nabla V_b(r)\,\rho\Big], \qquad (21)

where \gamma is the friction coefficient of the solvent and V_b(r) is the intra-molecular potential energy.
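Eq. (21) is the Fokker-Planck equation of an overdamped Langevin dynamics for the end-to-end vector, dr = (\kappa\cdot r - (2/\gamma)\nabla V_b)\,dt + \sqrt{4k_BT/\gamma}\,dW. The following Euler-Maruyama sketch samples it; for illustration a Hookean bond V_b = k_s|r|^2/2 is substituted for the FENE potential, so that the equilibrium statistics \langle r_x^2\rangle = k_BT/k_s can be checked in closed form:

```python
import numpy as np

# Euler-Maruyama sampler for the Langevin dynamics underlying Eq. (21):
#   dr = (kappa . r - (2/gamma) grad V_b) dt + sqrt(4 kB T / gamma) dW.
# A Hookean bond V_b = ks |r|^2 / 2 is substituted for the FENE potential,
# so that at kappa = 0 the stationary density rho ~ exp(-V_b / kB T)
# gives <r_x^2> = kB T / ks in closed form.
rng = np.random.default_rng(1)
ks, gamma, kBT = 1.0, 1.0, 1.0
kappa = np.zeros((3, 3))                   # quiescent fluid for the check
n, dt, steps = 10000, 2e-3, 3000

r = np.zeros((n, 3))
noise_amp = np.sqrt(4.0 * kBT / gamma * dt)
for _ in range(steps):
    drift = r @ kappa.T - (2.0 / gamma) * ks * r   # grad V_b = ks r
    r += drift * dt + noise_amp * rng.standard_normal((n, 3))

var_x = np.mean(r[:, 0] ** 2)
assert abs(var_x - kBT / ks) < 0.1 * kBT / ks      # equilibrium variance
```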

###### Proposition A.1.

Eq. (21) retains rotational invariance under the transformation rule of Eq. (16), i.e.,

 \tilde{\rho} := \rho(\tilde{r},t)\big|_{\text{frame 1}} \equiv \rho(r,t)\big|_{\text{frame 2}} \qquad (22)
###### Proof.
###### Proof.
 \frac{\partial\tilde{\rho}}{\partial t}\Big|_{\text{frame 1}} = \frac{\partial\rho}{\partial t}\Big|_{\text{frame 2}} - (Q^T\dot{Q}\,r)\cdot\nabla_r\rho, \qquad \nabla_{\tilde{r}}\cdot\big[(\tilde{\kappa}\cdot\tilde{r})\tilde{\rho}\big]\Big|_{\text{frame 1}} = \nabla_r\cdot\big[(\kappa\cdot r)\rho\big]\Big|_{\text{frame 2}} + (Q^T\dot{Q}\,r)\cdot\nabla_r\rho, \qquad (23)

where we have used the fact that Q^T\dot{Q} is anti-symmetric. In addition, it is straightforward to show that the diffusion term and the intra-molecular force term are invariant. Therefore we have Eq. (22). ∎

Accordingly, if we define c := \langle B\rangle to be the mean value of a second-order tensor B(r), the dynamics of c follows

 \frac{d}{dt}\langle B\rangle = \kappa:\langle r\nabla_r\otimes B\rangle + \frac{2k_BT}{\gamma}\langle\Delta_r B\rangle - \frac{2}{\gamma}\langle\nabla_r V_b\cdot\nabla_r B\rangle \qquad (24)
###### Proposition A.2.

If B(r) obeys the rotational symmetry B(Qr) = QB(r)Q^T, then so does the dynamics (24).

###### Proof.

Using Eq. (18), the individual terms in frame 1 follow

 \frac{d}{dt}\langle\tilde{B}\rangle\Big|_{\text{frame 1}} = \dot{Q}\langle B\rangle Q^T + Q\langle B\rangle\dot{Q}^T + Q\,\frac{d}{dt}\langle B\rangle\Big|_{\text{frame 2}}\,Q^T. \qquad (25)

Note that

 \tilde{\kappa}:\langle\tilde{r}\nabla_{\tilde{r}}\otimes\tilde{B}\rangle\Big|_{\text{frame 1}} = Q\,\kappa:\langle r\nabla_r\otimes B\rangle\Big|_{\text{frame 2}}\,Q^T + \dot{Q}\langle B\rangle Q^T + Q\langle B\rangle\dot{Q}^T, \qquad (26)

where we have used the relation

 (Ar)\cdot\nabla_r B = AB + BA^T \qquad (27)

if B is a rotationally symmetric tensor, i.e., B(Qr) = QB(r)Q^T, and A is an anti-symmetric matrix.
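The identity (27) can be checked directly for the simplest rotationally symmetric tensor B(r) = rr^T, for which the directional derivative along Ar is evaluated by a small perturbation:

```python
import numpy as np

# Direct check of the identity (27), (A r) . grad B = A B + B A^T, for the
# rotationally symmetric tensor B(r) = r r^T and an anti-symmetric A:
#   B(r + eps A r) - B(r) = eps (A r r^T + r r^T A^T) + O(eps^2).
rng = np.random.default_rng(2)
M = rng.standard_normal((3, 3))
A = M - M.T                      # anti-symmetric matrix
r = rng.standard_normal(3)

B = np.outer(r, r)
eps = 1e-6
dB = (np.outer(r + eps * A @ r, r + eps * A @ r) - B) / eps  # (Ar).grad B

assert np.allclose(dB, A @ B + B @ A.T, atol=1e-4)
```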

By Eqs. (25) and (26), we see that

 \frac{d}{dt}\langle\tilde{B}\rangle - \tilde{\kappa}:\langle\tilde{r}\nabla_{\tilde{r}}\otimes\tilde{B}\rangle\Big|_{\text{frame 1}} = Q\Big[\frac{d}{dt}\langle B\rangle - \kappa:\langle r\nabla_r\otimes B\rangle\Big|_{\text{frame 2}}\Big]Q^T. \qquad (28)

The rotational symmetry of the other terms follows similarly. ∎

The above analysis shows that, from the perspective of the Fokker-Planck equation, the evolution dynamics retains the rotational symmetry. In particular, the convection term \kappa:\langle r\nabla_r\otimes B\rangle provides a microscopic perspective for understanding the objective tensor derivative, which we use to construct the DNN representation of the constitutive models.

### A.2 DNN representation

Next we establish a micro-macro correspondence via a set of encoders \{f_i\} (see Proposition A.4 for details) and, accordingly, a set of micro-scale tensors B_i(r) and macro-scale conformation tensors c_i, i.e.,

 B_i(r) = f_i(r)f_i(r)^T = (g_i(r)\,r)(g_i(r)\,r)^T, \qquad c_i = \langle B_i\rangle. \qquad (29)
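The correspondence of Eq. (29) amounts to a weighted second-moment computation over sampled configurations. The sketch below uses two illustrative (hypothetical) radial encoders g_i and checks the micro-macro equivariance: rotating every configuration r -> Qr rotates each conformation tensor as c_i -> Qc_iQ^T:

```python
import numpy as np

# Sketch of Eq. (29): micro-scale tensors B_i(r) = (g_i(r) r)(g_i(r) r)^T
# and conformation tensors c_i = <B_i>, for two illustrative (hypothetical)
# scalar encoders g_i that depend on |r| only.  Rotating every sampled
# configuration r -> Q r rotates each conformation tensor, c_i -> Q c_i Q^T.
rng = np.random.default_rng(3)

encoders = [lambda rn: np.ones_like(rn),        # g_1 = 1  ->  c_1 = <r r^T>
            lambda rn: 1.0 / (1.0 + rn ** 2)]   # g_2: a made-up radial encoder

def conformation_tensors(samples):
    rn = np.linalg.norm(samples, axis=1)
    return [np.einsum('i,ij,ik->jk', g(rn) ** 2, samples, samples)
            / len(samples) for g in encoders]

samples = rng.standard_normal((5000, 3))
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))

c = conformation_tensors(samples)
c_rot = conformation_tensors(samples @ Q.T)     # rotate every configuration

for ci, ci_rot in zip(c, c_rot):
    assert np.allclose(ci_rot, Q @ ci @ Q.T)    # micro-macro equivariance
```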

We will use \{c_i\} to construct the evolution dynamics (24) via a proper DNN structure that retains rotational invariance. In particular, we consider the fourth-order tensor \langle r\nabla_r\otimes B\rangle and show that the following DNN representation (see also Eq. (11) in the main text) ensures the rotational symmetry of the dynamics. For simplicity, the subscript i is omitted and we use c to denote the set of conformation tensors \{c_i\}.

###### Proposition A.3.

The following ansatz of \langle r\nabla_r\otimes B\rangle ensures that the evolution dynamics of c retains rotational invariance:

 \langle r\nabla_r\otimes B\rangle = \langle g(r)^2\, r\nabla_r\otimes rr^T\rangle + \sum_{i=1}^{6} G_1^{(i)}(c)\otimes G_2^{(i)}(c), \qquad (30)

where G_1 and G_2 satisfy

 \tilde{G}_1 := G_1(\tilde{c}) = QG_1Q^T, \qquad \tilde{G}_2 := G_2(\tilde{c}) = QG_2Q^T. \qquad (31)
###### Proof.

Without loss of generality, we represent the fourth-order tensor by the following two bases:

 F_1(c)\otimes F_2(c)\otimes F_3(c) + F_1(c)\otimes(F_2(c)\otimes F_3(c))^{T_{\{2,3\}}}, \quad F_1, F_3\in\mathbb{R}^3,\ F_2\in\mathbb{R}^{3\times 3}; \qquad G_1(c)\otimes G_2(c), \quad G_1, G_2\in\mathbb{R}^{3\times 3}, \qquad (32)

where the superscript T_{\{2,3\}} represents the transpose between the 2nd and 3rd indices; also F_1, F_2, F_3, G_1, and G_2 satisfy the symmetry conditions

 \tilde{F}_1 := F_1(\tilde{c}) = QF_1, \qquad \tilde{F}_2 := F_2(\tilde{c}) = QF_2Q^T, \qquad \tilde{F}_3 := F_3(\tilde{c}) = QF_3. \qquad (33)

For the term G_1(c)\otimes G_2(c), we have

 \kappa : G_1(c)\otimes G_2(c)\Big|_{\text{frame 2}} = \mathrm{Tr}(\kappa G_1)\,G_2 \qquad (34)

and

 \tilde{\kappa} : \tilde{G}_1\otimes\tilde{G}_2\Big|_{\text{frame 1}} = \mathrm{Tr}(\tilde{\kappa}\tilde{G}_1)\,\tilde{G}_2 = Q\,\mathrm{Tr}(\kappa G_1)\,G_2\,Q^T, \qquad (35)

where we have used \mathrm{Tr}(Q^T\dot{Q}\,G_1) = 0, which follows since Q^T\dot{Q} is anti-symmetric and G_1 is symmetric.

For the term F_1(c)\otimes F_2(c)\otimes F_3(c), we have

 \kappa : F_1(c)\otimes F_2(c)\otimes F_3(c)\Big|_{\text{frame 2}} = F_2^T\kappa F_1 F_3^T \qquad (36)

and

 \tilde{\kappa} : \tilde{F}_1(c)\otimes\tilde{F}_2(c)\otimes\tilde{F}_3(c)\Big|_{\text{frame 1}} = QF_2^T\kappa F_1F_3^TQ^T + QF_2^TQ^T\dot{Q}F_1F_3^TQ^T. \qquad (37)

On the other hand, note that

 \frac{d\tilde{B}}{dt}\Big|_{\text{frame 1}} = \dot{Q}BQ^T + QB\dot{Q}^T + Q\,\frac{dB}{dt}\Big|_{\text{frame 2}}\,Q^T. \qquad (38)

To ensure the rotational symmetry of the dynamics of c, we require

 F_2\equiv I, \qquad \sum_{i=1}^{K_1}F_1^{(i)}\otimes I\otimes F_3^{(i)} = \langle g(r)r\otimes I\otimes g(r)r\rangle. \qquad (39)

Hence, we have

 \frac{d}{dt}\tilde{c} - \tilde{\kappa}:\Big(\sum_{i=1}^{K_1}\tilde{F}_1^{(i)}\otimes\tilde{F}_2^{(i)}\otimes\tilde{F}_3^{(i)} + \tilde{F}_1^{(i)}\otimes(\tilde{F}_2^{(i)}\otimes\tilde{F}_3^{(i)})^{T_{\{2,3\}}}\Big)\Big|_{\text{frame 1}} \equiv Q\Big[\frac{d}{dt}c - \kappa:\Big(\sum_{i=1}^{K_1}F_1^{(i)}\otimes F_2^{(i)}\otimes F_3^{(i)} + F_1^{(i)}\otimes(F_2^{(i)}\otimes F_3^{(i)})^{T_{\{2,3\}}}\Big)\Big|_{\text{frame 2}}\Big]Q^T \qquad (40)

Furthermore, using Eq. (39), we obtain

 \kappa:\Big(\sum_{i=1}^{K_1}F_1^{(i)}\otimes F_2^{(i)}\otimes F_3^{(i)} + F_1^{(i)}\otimes(F_2^{(i)}\otimes F_3^{(i)})^{T_{\{2,3\}}}\Big) = \kappa:\langle g(r)^2\, r\nabla_r\otimes rr^T\rangle. \qquad (41)

Accordingly, the remaining part of \langle r\nabla_r\otimes B\rangle is expanded by

 \langle r\nabla_r g(r)^2\otimes rr^T\rangle = \sum_{i=1}^{K_2}G_1^{(i)}(c)\otimes G_2^{(i)}(c), \qquad (42)

where K_2 = 6 due to the tensor index symmetries of r\nabla_r g(r)^2 and rr^T, as well as those of G_1 and G_2.

Combining Eqs. (40), (41) and (42), we conclude that the decomposition

 \langle r\nabla_r\otimes B\rangle = \langle g(r)^2\, r\nabla_r\otimes rr^T\rangle + \sum_{i=1}^{6}G_1^{(i)}(c)\otimes G_2^{(i)}(c) \qquad (43)

ensures the rotational invariance of the evolution equation of c. ∎

Finally, we show that the encoder takes the form f(r) = g(r)r (see also Eq. (9) in the main text).

###### Proposition A.4.

If f: \mathbb{R}^3 \to \mathbb{R}^3 satisfies

 f(Qr)=Qf(r) (44)

for an arbitrary unitary matrix Q, then f must take the form f(r) = g(r)r, where g is a scalar function of r = |r|.

###### Proof.

Let r = |r|, and let e_1, e_2, e_3 be the basis vectors of the Cartesian coordinate space. In particular, we consider r = re_1 and denote the components of f(re_1) by f_1(re_1), f_2(re_1), f_3(re_1). By choosing Q to be of the form

 Q = \begin{pmatrix} 0 & \cos\theta & \sin\theta\\ 1 & 0 & 0\\ 0 & -\sin\theta & \cos\theta \end{pmatrix}, \qquad (45)

we have

 f(Qr) = \begin{pmatrix} f_1(re_2)\\ f_2(re_2)\\ f_3(re_2)\end{pmatrix} = \begin{pmatrix} f_2(re_1)\cos\theta + f_3(re_1)\sin\theta\\ f_1(re_1)\\ -f_2(re_1)\sin\theta + f_3(re_1)\cos\theta\end{pmatrix}. \qquad (46)

In particular, since the left-hand side is independent of \theta, by choosing \theta = 0 and \theta = \pi/2, respectively, we get f_2(re_1) = f_3(re_1) = 0, i.e., f(re_1) = f_1(re_1)e_1. Since the choice of the direction e_1 is arbitrary, f must take the form f(r) = g(r)r. ∎
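Conversely, any f of the form f(r) = g(|r|)r satisfies the equivariance condition (44), as a quick numerical check with a made-up radial function g illustrates:

```python
import numpy as np

# The form f(r) = g(|r|) r established in Proposition A.4 is rotationally
# equivariant: f(Q r) = Q f(r) for any rotation Q.  A quick numerical check
# with a made-up (hypothetical) radial function g.
rng = np.random.default_rng(4)

def f(r):
    g = np.tanh(np.linalg.norm(r))   # hypothetical scalar function of |r|
    return g * r

Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
r = rng.standard_normal(3)

assert np.allclose(f(Q @ r), Q @ f(r))   # Eq. (44)
```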

## Appendix B The micro-scale model

The polymer solution is modeled by a suspension of dumbbell polymer molecules in explicit solvent. The bond interaction is modeled by the FENE potential, i.e.,

 V_b(r) = -\frac{k_s}{2}r_0^2\log\Big[1 - \frac{r^2}{r_0^2}\Big], \qquad (47)

where k_s is the spring constant, r_0 is the maximum bond extension, and r is the length of the end-to-end vector between the two beads of a polymer molecule. In addition, pairwise interactions are imposed between all particles (except the intramolecular pairs bonded by V_b) under dissipative particle dynamics Hoogerbrugge and Koelman (1992); Groot and Warren (1997), i.e.,

 F_{ij} = F^C_{ij} + F^D_{ij} + F^R_{ij}, \qquad F^C_{ij} = \begin{cases} a(1 - r_{ij}/r_c)\,e_{ij}, & r_{ij} < r_c\\ 0, & r_{ij}\ge r_c\end{cases} \qquad F^D_{ij} = \begin{cases} -\gamma\, w^D(r_{ij})(v_{ij}\cdot e_{ij})\,e_{ij}, & r_{ij} < r_c\\ 0, & r_{ij}\ge r_c\end{cases} \qquad F^R_{ij} = \begin{cases} \sigma\, w^R(r_{ij})\,\xi_{ij}\,e_{ij}, & r_{ij} < r_c\\ 0, & r_{ij}\ge r_c\end{cases} \qquad (48)

where r_{ij} = |r_i - r_j|, e_{ij} = (r_i - r_j)/r_{ij}, v_{ij} = v_i - v_j, and \xi_{ij} are independent identically distributed (i.i.d.) Gaussian random variables with zero mean and unit variance. F^C_{ij}, F^D_{ij}, F^R_{ij} are the total conservative, dissipative and random forces between particles i and j, respectively. r_c is the cut-off radius beyond which all interactions vanish. The coefficients a, \gamma and \sigma represent the strength of the conservative, dissipative and random force, respectively. The last two coefficients are coupled with the temperature of the system by the fluctuation-dissipation theorem Espanol and Warren (1995) as \sigma^2 = 2\gamma k_BT. Similar to Ref. Lei et al. (2010), the weight functions w^D(r_{ij}) and w^R(r_{ij}) are defined by

 w^D(r_{ij}) = [w^R(r_{ij})]^2, \qquad w^R(r_{ij}) = (1 - r_{ij}/r_c)^k. \qquad (49)
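The micro-scale force field of Eqs. (47)-(49) can be sketched for a single particle pair as follows. The parameter values (a, \gamma, \sigma, r_c, k_s, r_0 and the weight exponent k) are placeholders rather than the values used in the paper, and the 1/\sqrt{\Delta t} scaling of the random force is the standard convention for DPD integrators:

```python
import numpy as np

# Sketch of the micro-scale force field: the FENE bond force from Eq. (47)
# and the pairwise DPD forces of Eqs. (48)-(49) for a single particle pair.
# All parameter values below are placeholders; sigma^2 = 2 gamma_c kBT is
# fixed by the fluctuation-dissipation theorem.
rng = np.random.default_rng(5)
a, gamma_c, kBT, rc, k = 25.0, 4.5, 1.0, 1.0, 0.5
ks, r0 = 30.0, 1.5
sigma = np.sqrt(2.0 * gamma_c * kBT)

def fene_bond_force(rij_vec):
    # F = -grad V_b = -ks r_vec / (1 - r^2/r0^2), from Eq. (47)
    r = np.linalg.norm(rij_vec)
    return -ks * rij_vec / (1.0 - (r / r0) ** 2)

def dpd_pair_force(rij_vec, vij, dt):
    rij = np.linalg.norm(rij_vec)
    if rij >= rc:
        return np.zeros(3)                     # interactions vanish beyond rc
    eij = rij_vec / rij
    wR = (1.0 - rij / rc) ** k
    wD = wR ** 2                               # Eq. (49)
    fC = a * (1.0 - rij / rc) * eij
    fD = -gamma_c * wD * np.dot(vij, eij) * eij
    # random force with the standard 1/sqrt(dt) Wiener scaling
    fR = sigma * wR * rng.standard_normal() * eij / np.sqrt(dt)
    return fC + fD + fR

f = dpd_pair_force(np.array([0.3, 0.0, 0.0]), np.zeros(3), dt=0.01)
assert np.allclose(dpd_pair_force(np.array([1.2, 0.0, 0.0]),
                                  np.zeros(3), 0.01), 0.0)
```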

We refer to Ref. Lei et al. (2017) for the details of the reverse Poiseuille flow simulation and the calculation of the shear rate dependent viscosity. In all the numerical experiments, the number density of the solvent particle is set to be and the number density of the polymer molecule is set to be . Other model parameters are given in Tab. LABEL:tab:polymer_model_parameter.

The training dataset is collected from micro-scale shear flow simulations of the polymer solution in a domain , with periodic boundary condition imposed in each direction. The Lees-Edwards boundary condition Lees and Edwards (1972) is used to impose the shear flow rates . The simulation is run for a production period of with time step . samples of the polymer configurations are collected with uniformly selected between . samples are used for training and the remaining ones are used for testing.

## Appendix C Training procedure

The constructed DeePN^2 model is represented by various DNNs for the encoders, the stress model, the evolution dynamics, and the fourth-order tensors of the objective tensor derivatives. In particular, by the construction of the encoders, one of these terms does not need to be trained separately. The loss function is defined by

 L = \lambda_{H_1}L_{H_1} + \lambda_{H_2}L_{H_2} + \lambda_E L_E, \qquad (50)

where \lambda_{H_1}, \lambda_{H_2} and \lambda_E are hyperparameters specified later. For each training batch, L_{H_1}, L_{H_2}, and L_E are given by

 (51)

where \|\cdot\| denotes the total sum of squares of the entries in the tensor, and the eigenvector matrix of the conformation tensor of the i-th sample enters the loss.

Furthermore, we note that the conformation tensors and the representation tensors are all symmetric. Accordingly, the DNN inputs are composed of the upper-triangular parts of the conformation tensors and the outputs are the upper-triangular parts of the representation tensors. Specifically, the encoders, the stress model, the evolution dynamics, and the objective-derivative terms are represented by fully-connected DNNs, with the number of neurons in the hidden layers set separately for each network. The activation function is taken to be the hyperbolic tangent.

The DNNs are trained by the Adam stochastic gradient descent method Kingma and Ba (2015), with the initial learning rate decayed at a fixed interval of steps. The hyper-parameters \lambda_{H_1}, \lambda_{H_2} and \lambda_E are chosen in the following two ways. In the first setup, we set them to be constant throughout the training process. In the second setup, the hyper-parameters are updated periodically during training by

 \lambda_{H_1} = \frac{\tilde{L}_{H_1}}{\tilde{L}_{H_1}+\tilde{L}_{H_2}+\tilde{L}_E}, \qquad \lambda_{H_2} = \frac{\tilde{L}_{H_2}}{\tilde{L}_{H_1}+\tilde{L}_{H_2}+\tilde{L}_E}, \qquad \lambda_E = \frac{\tilde{L}_E}{\tilde{L}_{H_1}+\tilde{L}_{H_2}+\tilde{L}_E}, \qquad (52)

where \tilde{L} denotes the mean of the corresponding loss term over the preceding epochs. For the present study, both approaches achieve a small final loss and relative error. More sophisticated choices of \lambda_{H_1}, \lambda_{H_2} and \lambda_E, as well as other formulations of L, will be investigated in future work.
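The self-normalizing update of Eq. (52) can be sketched as follows (the function name is illustrative); by construction the three weights sum to one after every update:

```python
# Sketch of the adaptive hyper-parameter update of Eq. (52): the weights
# are renormalized by the running means of the individual loss terms, so
# that lambda_H1 + lambda_H2 + lambda_E = 1 at every update.
def update_weights(mean_LH1, mean_LH2, mean_LE):
    total = mean_LH1 + mean_LH2 + mean_LE
    return mean_LH1 / total, mean_LH2 / total, mean_LE / total

lam_H1, lam_H2, lam_E = update_weights(0.2, 0.3, 0.5)
assert abs(lam_H1 + lam_H2 + lam_E - 1.0) < 1e-12
```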

## Appendix D Computational cost

We consider two dynamic processes: relaxation to quasi-equilibrium and the development of the reverse Poiseuille flow. For relaxation to quasi-equilibrium, the micro-scale simulation is conducted in a domain (in reduced units) which is mapped into a volume unit in the continuum DeePN^2, Hookean and FENE-P models; all simulations are run for a fixed production period (in reduced units). For the reverse Poiseuille flow, the micro-scale simulation is conducted in a larger domain, and the simulations of the continuum DeePN^2, Hookean, and FENE-P models are conducted by mapping the domain into volume units along the y direction; all simulations are again run for a fixed production period. The computational cost for both systems is reported in Tab. LABEL:tab:computational_cost_all_models. All simulations are performed on the Michigan State University HPCC supercomputer with Intel(R) Xeon(R) CPU E5-2670 v2 processors.

We note that the size of the volume unit in the continuum models is chosen empirically for the flow systems considered in the present work. Our sensitivity studies show that the numerical results of the DeePN^2 model agree well with the full MD when the average number of polymers within a unit volume is sufficiently large. For all the cases, the computational cost of the DeePN^2 model is a small fraction of that of the full MD simulations and less than 10 times the cost of the empirical continuum models.

## References

• Hoogerbrugge and Koelman (1992) P. J. Hoogerbrugge and J. M. V. A. Koelman, Europhys. Lett. 19, 155 (1992).
• Groot and Warren (1997) R. D. Groot and P. B. Warren, J. Chem. Phys. 107, 4423 (1997).
• Espanol and Warren (1995) P. Espanol and P. Warren, Europhys. Lett. 30, 191 (1995).
• Lei et al. (2010) H. Lei, B. Caswell,  and G. E. Karniadakis, Phys. Rev. E 81, 026704 (2010).
• Lei et al. (2017) H. Lei, X. Yang, Z. Li,  and G. E. Karniadakis, J. Comput. Phys. 330, 571 (2017).
• Lees and Edwards (1972) A. W. Lees and S. F. Edwards, J. Phys. C 5, 1921 (1972).
• Kingma and Ba (2015) D. Kingma and J. Ba, International Conference on Learning Representations (ICLR)  (2015).