Geometric algebra generation of molecular surfaces

12/25/2021
by   Azzam Alfarraj, et al.
Michigan State University

Geometric algebra is a powerful framework that unifies mathematics and physics. Since its revival in the mid-1960s by David Hestenes, it has attracted great attention and has been exploited in many fields such as physics, computer science, and engineering. This work introduces a geometric algebra method for molecular surface generation that utilizes the Clifford-Fourier transform, a generalization of the classical Fourier transform. Notably, the classical Fourier transform and the Clifford-Fourier transform differ in the derivative property in R_k for k even. This distinction is due to the noncommutativity of the geometric product of pseudoscalars with multivectors and has significant consequences in applications. We use the Clifford-Fourier transform in R_3 to benefit from the derivative property in solving partial differential equations (PDEs). The Clifford-Fourier transform is used to solve the mode decomposition process in the PDE transform. Two different cases of initial data are proposed to construct the initial shapes used in the present method. The proposed method is applied first to small molecules and then to proteins. To validate the method, the generated molecular surfaces are compared to surfaces of other definitions. Applications to protein electrostatic analysis are considered. This work opens the door for further applications of geometric algebra and the Clifford-Fourier transform in biological sciences.


1 Introduction

The structures of biomolecules, such as proteins, DNA, molecular motors, subcellular organelles, and viruses, are directly related to their interactions and functions [26, 60, 2]. Therefore, studying biomolecular structures is a major topic in molecular biology and is essential for understanding biological processes. For example, the visualization of biomolecular surfaces and their electrostatic potentials is vital in the analysis of biomolecular interactions such as protein-nucleic acid and protein-protein interactions, ligand-receptor binding, macromolecular assembly, enzymatic mechanisms, and drug discovery [60]. Also, geometric complementarity, which is essential in molecular docking, can be predicted via molecular surface shapes [26]. Hence, many theoretical methods and computational algorithms have been proposed to characterize molecular shapes. In 1953, Corey and Pauling presented molecular models of atoms and bonds that are still pivotal in molecular sciences [16]. The wide applications of surfaces raise the need for fast and reliable surface generation algorithms, because in molecular simulations molecular surfaces are rendered millions of times [60]. For large macromolecules that require excessive memory, a divide-and-conquer method was proposed to improve the efficiency of surface generation [58].

It is worth noting that a molecular surface is not necessarily a real physical surface. It is a representation of the molecular shape, and many representations are available in the literature. The most popular are the van der Waals surface (VdWS), the solvent excluded surface (SES), and the solvent accessible surface (SAS) [43]. The VdWS is defined as a surface composed of overlapping rigid spheres, each with a radius equal to the van der Waals radius of the corresponding atom. The SES is defined as the surface of the volume generated by rolling a sphere representing a solvent molecule around the molecule and tracing the positions of the exterior surface of the sphere. Likewise, the SAS traces the positions of the center of the sphere [26]. Such representations are important because they are often used as molecule-solvent interfaces. These interface models are crucial in illustrating how surfaces interact with surrounding molecules such as ions, counterions, and solvents [60]. These interactions may determine the stability and solubility of macromolecules in an aqueous environment. Such investigations matter because water makes up a large percentage of human cell mass and most biological processes occur in that aqueous part of the cell. Accordingly, the above surface models have been exploited in studies of protein folding [45], protein-protein interactions [18], drug classification [4], solvation energies [42], macromolecular docking [22], ion channel transport [59], protein pocket detection [57], and DNA binding [22].

The aforementioned surfaces, i.e., VdWS, SES, and SAS, admit geometric singularities, which result in computational difficulties [15, 24, 27, 44, 55]. To overcome this issue, the energy minimization principle has been adopted for biomolecular surface construction. Partial differential equation (PDE)-based biomolecular surfaces were proposed in 2005 [51]. Inspired by geometric flows, the minimal molecular surface (MMS) was presented in 2006. In general, minimal surfaces are widely seen in nature due to the energy minimization principle [2]. These methods, unlike other popular methods, start with atomic coordinates and radii rather than some given surfaces. In addition, Gaussian surfaces [47, 11, 19], flexibility-rigidity index surfaces [38, 39, 53], level-set surfaces [13], and skinning surfaces [12] were also proposed to avoid singularity issues.

In the past few decades, geometric flow algorithms have been exploited in image analysis and surface processing. Witkin, in 1983, proposed an image denoising algorithm using diffusion equations that were shown to be formally equivalent to Gaussian low-pass filters [52]. Perona and Malik presented an anisotropic diffusion equation for image denoising that does not smear edges [40]. A generalized Perona-Malik equation with arbitrarily high-order nonlinear PDEs was proposed for edge-preserving restoration of noisy images [49]. Mode decomposition evolution equations were proposed to generalize nonlinear PDE-based high-pass filters. These equations perform a PDE transform, which splits data, signals, and images into functional modes such as trend, edge, texture, and noise, depending on frequencies [48]. The PDE transform was used to generate biomolecular surfaces [60]. The fast Fourier transform was incorporated in the PDE transform to avoid the stability constraints of solving high-order PDEs [60].

The Fourier transform is widely applied in science and engineering. Due to its great importance and impact on experiments and computational work, many versions have been proposed to generalize or improve the Fourier transform. In the field of geometric algebra, the Clifford-Fourier transform was presented among other proposed transforms such as the quaternion Fourier transform [8, 25, 3, 32]. The Clifford-Fourier transform exploits two main notions in geometric algebra: the geometric product and the multivector. It is noteworthy that the notion of multivectors is built on the geometric product that was proposed by Clifford in 1876 [29] to unify the work of Grassmann on the outer product, also known as the wedge product, and the work of Hamilton on quaternions [29]. However, Clifford's work did not get much attention until the 1960s. In 1966, David Hestenes ignited the revival of geometric algebra and geometric calculus in his book Space-Time Algebra [3]. In the beginning, the emphasis of Hestenes was mainly on physics, before geometric algebra applications gained wide recognition in other fields such as computer science [21] and image processing [46, 1]. Hestenes suggested geometric algebra as the unifying language of mathematics and physics [30]. Geometric algebra has also been applied to protein structure analysis [41, 31, 14, 36, 7]. Nowadays, one can say that geometric algebra offers a unified framework for diverse applications in mathematics, physics, computer science, engineering, and biology [17]. Notably, other fields of mathematics that have a strong relationship with geometric algebra also have great potential in biophysical applications. Specifically, the evolutionary de Rham-Hodge method was proposed for molecular data representation and analysis [10]. The wedge product and exterior calculus used in de Rham-Hodge theory are related to geometric calculus. Also, the $k$-forms of de Rham-Hodge theory play a role very similar to that of the $k$-vectors of geometric algebra. The evolutionary de Rham-Hodge method showed success in predicting the protein B-factors of some challenging cases and outperformed the present methods in protein flexibility analysis [10].

The goal of this work is to develop a geometric algebra-based biomolecular surface generation algorithm. The Clifford-Fourier transform is used along with the PDE transform to define a new molecular representation. This work opens a new direction in geometric algebra-based biomolecular modeling and analysis. It may stimulate future applications of geometric algebra in biological sciences. This paper is organized into four sections. Section 1 is dedicated to a brief literature review on molecular surface generation methods, the PDE transform, and geometric algebra and calculus. Then, Section 2 presents the theoretical background of our method. It starts with a thorough introduction to geometric algebra, stating the definitions and main properties of the outer product, geometric product, $k$-vectors, multivectors, and Clifford algebras. After that, the Clifford-Fourier transform is presented, where the necessary definitions of multivector functions, derivatives, and integration are stated in the geometric calculus context. Then, specific cases of the Clifford-Fourier transform in two-dimensional (2D) and 3D settings are discussed, along with their similarities to and differences from the classical Fourier transform. Next, we discuss the PDE transform. Afterward, Section 3 is devoted to our biomolecular surface generation method. Two equations used in the construction of initial surfaces are given. Then, test cases are provided and investigated to explore the effects of changing the parameters, i.e., propagation time and isovalues. After investigating the parameters, surfaces of real proteins are generated and compared to those from well-known methods. Finally, Section 4 demonstrates some applications of the generated surfaces for the purposes of validation. First, the electrostatic surface potentials are calculated and mapped to surfaces generated using our Clifford-Fourier transform method and then to surfaces generated using the MSMS method [44]. The calculations are conducted using the APBS package in VMD [35]. Second, the electrostatic solvation free energies of 21 proteins are calculated and compared to three other molecular surface generation methods presented in the literature. The energy calculations are carried out using MIBPB [9].

2 Theories and methods

The Fourier transform has been used extensively in mathematics, science, and engineering. Many versions of the Fourier transform have been proposed in different fields of mathematics. In the field of geometric algebra, Clifford-Fourier transform and quaternion Fourier transform were presented. Since this work utilizes the Clifford-Fourier transform, a basic introduction to geometric algebra is given to define notions and establish notations. Then, a definition of Clifford-Fourier transform follows.

2.1 Geometric algebra

Geometric algebra provides a framework in which operations act uniformly on scalars, vectors, and multivectors irrespective of their grade. This unification and generalization is due to two main concepts in geometric algebra: the geometric product and the multivector [23, 6]. To define the geometric product, we first need to introduce the outer product of geometric algebra, also called the wedge product.

2.1.1 Outer product

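Given $u$ and $v$ in $\mathbb{R}^n$, their outer product is represented by $u \wedge v$. For any three vectors $u$, $v$, and $w$ in $\mathbb{R}^n$ and a scalar $\alpha$ in $\mathbb{R}$, the outer product has the following properties:

$$u \wedge v = -v \wedge u \tag{1}$$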
(2)
(3)
(4)
(5)
(6)

It is worth noting that the outer product is anticommutative, as given in property (1). The outer product of two vectors $u$ and $v$ is called a bivector and can be visualized as an oriented parallelogram with sides $u$ and $v$, as shown in Figure 1.

Figure 1: Visualization of $u \wedge v$ compared to $u \times v$. $u \wedge v$ is an oriented parallelogram while $u \times v$ is a vector perpendicular to both $u$ and $v$.

Furthermore, the outer product of three vectors is called a trivector and can be visualized as an oriented parallelepiped, which has six oriented parallelograms as its faces. In general, the outer product of $k$ vectors is called a $k$-vector and, for $k > 3$, it cannot be visualized in a 3D setting. Keep in mind that nonzero $k$-vectors exist only if the vectors are in a Euclidean space of dimension $n$ where $k \le n$; the outer product is zero if $k > n$. A $k$-vector is said to have grade $k$.
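The properties above can be checked numerically. The following is a minimal sketch, assuming NumPy and representing a bivector by its antisymmetric component matrix; the helper names `wedge` and `blade_magnitude` are hypothetical and chosen only for this illustration.

```python
import numpy as np

def wedge(u, v):
    """Components of the bivector u ^ v: B[i, j] = u_i v_j - u_j v_i."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    return np.outer(u, v) - np.outer(v, u)

def blade_magnitude(vectors):
    """Magnitude of v1 ^ ... ^ vk, computed from the Gram determinant."""
    V = np.asarray(vectors, dtype=float)
    gram = V @ V.T
    # Clamp tiny negative round-off before taking the square root.
    return np.sqrt(max(np.linalg.det(gram), 0.0))

u = np.array([1.0, 2.0, 0.5])
v = np.array([-1.0, 0.0, 3.0])

# Anticommutativity: u ^ v = -(v ^ u), hence u ^ u = 0.
anticommutes = np.allclose(wedge(u, v), -wedge(v, u))
self_wedge_is_zero = np.allclose(wedge(u, u), 0.0)

# In 3D the bivector magnitude equals the parallelogram area |u x v|.
area_matches_cross = np.isclose(blade_magnitude([u, v]),
                                np.linalg.norm(np.cross(u, v)))

# Three vectors in a 2D space: the outer product vanishes because k > n.
three_in_plane = blade_magnitude([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
```

The Gram determinant is used for the magnitude because it works for any number of vectors in any dimension, which makes the case of more vectors than dimensions easy to test.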

2.1.2 Geometric product

After this introduction of the outer product, the geometric product of $u$ and $v$ is represented as $uv$ and defined as:

$$uv = u \cdot v + u \wedge v \tag{7}$$

where $u \cdot v$ is the inner product.

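Note that the product $uv$ is a combination of a scalar $u \cdot v$ and a bivector $u \wedge v$. This observation leads us to the concept of multivectors. In a Euclidean space of dimension $n$, a multivector is a linear combination of $k$-vectors of different grades with $n$ being the highest grade, i.e. it is a linear combination of a scalar, vectors, bivectors, trivectors, …, $(n-1)$-vectors and an $n$-vector. Let us deduce some properties of the geometric product. The geometric product of a vector $u$ with itself is the magnitude of $u$ squared, as shown below:

$$uu = u \cdot u + u \wedge u = \|u\|^2,$$

which leads to the fact that for any nonzero vector $u$, the vector $u / \|u\|^2$ is its inverse. Also, the inner product and the outer product of any two vectors $u$ and $v$ can be expressed in terms of their geometric products only, as shown below.
Since

$$uv = u \cdot v + u \wedge v \quad \text{and} \quad vu = v \cdot u + v \wedge u = u \cdot v - u \wedge v,$$

then

$$uv + vu = 2\, u \cdot v \quad \text{and} \quad uv - vu = 2\, u \wedge v,$$

which implies

$$u \cdot v = \tfrac{1}{2}(uv + vu), \qquad u \wedge v = \tfrac{1}{2}(uv - vu).$$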
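As a concrete check that the inner and outer products can be recovered from geometric products alone, consider a short worked example. The two plane vectors and the orthonormal basis $e_1, e_2$ of $\mathbb{R}^2$ are chosen here purely for illustration.

```latex
\begin{align*}
u &= e_1 + 2e_2, \qquad v = 3e_1 + e_2,\\
u \cdot v &= (1)(3) + (2)(1) = 5,\\
u \wedge v &= \big[(1)(1) - (2)(3)\big]\, e_1 \wedge e_2 = -5\, e_1 \wedge e_2,\\
uv &= 5 - 5\, e_1 \wedge e_2, \qquad vu = 5 + 5\, e_1 \wedge e_2,\\
\tfrac{1}{2}(uv + vu) &= 5 = u \cdot v, \qquad
\tfrac{1}{2}(uv - vu) = -5\, e_1 \wedge e_2 = u \wedge v,\\
u^{-1} &= \frac{u}{\|u\|^2} = \tfrac{1}{5}(e_1 + 2e_2), \qquad
u\,u^{-1} = \frac{uu}{5} = 1.
\end{align*}
```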

2.1.3 Clifford algebras

First, we generalize the concept of basis to the geometric algebra setting. If $\{e_1, e_2, \dots, e_n\}$ is an orthonormal basis of $\mathbb{R}^n$, then

$$\{e_i \wedge e_j : 1 \le i < j \le n\}$$

is a basis for bivectors and

$$\{e_i \wedge e_j \wedge e_k : 1 \le i < j < k \le n\}$$

is a basis for trivectors, and so on for the rest of the $k$-vectors. This means that any $k$-vector can be written as a linear combination of the corresponding basis elements, which is useful when summing $k$-vectors and identifying the resulting $k$-vector.

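Notably, the geometric product of orthonormal basis vectors has some special properties that simplify the calculations of geometric products of multivectors. These properties are:

$$e_i e_i = 1 \tag{8}$$
$$e_i e_j = -e_j e_i, \quad i \neq j \tag{9}$$

since

$$e_i \cdot e_j = \delta_{ij} \quad \text{and} \quad e_i \wedge e_j = -e_j \wedge e_i.$$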

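For the Euclidean space $\mathbb{R}^n$, we get a Clifford algebra $G_n$ that has the dimension $2^n$. The basis of the Clifford algebra consists of the scalar $1$, the basis of $\mathbb{R}^n$, and all different geometric products of the basis vectors. The Clifford algebra $G_n$ contains all multivectors of grade $n$ or less, where the multivector grade is the highest grade among its constituent $k$-vectors. For the sake of simplicity, we limit the discussion to $G_2$ and $G_3$ in the rest of this section. Therefore, the basis for $G_2$ is

$$\{1, e_1, e_2, e_1 e_2\}$$

and the basis for $G_3$ is

$$\{1, e_1, e_2, e_3, e_1 e_2, e_2 e_3, e_3 e_1, e_1 e_2 e_3\}$$

and to simplify the notations, $e_i e_j$ and $e_i e_j e_k$ are denoted as $e_{ij}$ and $e_{ijk}$ from now on. The following are the general forms of multivectors in $G_2$ and $G_3$ respectively, with real coefficients:

$$A = a_0 + a_1 e_1 + a_2 e_2 + a_{12} e_{12},$$
$$B = b_0 + b_1 e_1 + b_2 e_2 + b_3 e_3 + b_{12} e_{12} + b_{23} e_{23} + b_{31} e_{31} + b_{123} e_{123}.$$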

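It is noteworthy that the basis of the Clifford algebra $G_2$ has only one element of grade 2, which is $e_{12}$; it is therefore called the pseudoscalar and denoted as $i_2$. In $G_3$, similarly, $e_{123}$ is called the pseudoscalar and denoted as $i_3$. The pseudoscalars have two important features. First, they square to $-1$ as follows

$$i_2^2 = e_1 e_2 e_1 e_2 = -e_1 e_1 e_2 e_2 = -1 \tag{10}$$
$$i_3^2 = e_1 e_2 e_3 e_1 e_2 e_3 = e_1 e_2 e_1 e_2 e_3 e_3 = -1 \tag{11}$$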

Second, any multivector in can be written as

where and , and any multivector in can be written as

where and .

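Moreover, the subalgebra spanned by $\{1, i_2\}$ is isomorphic to $\mathbb{C}$ since $i_2^2 = -1$. So, for any scalar $\theta$

$$e^{i_2 \theta} = \cos\theta + i_2 \sin\theta.$$

Likewise for $i_3$ we have

$$e^{i_3 \theta} = \cos\theta + i_3 \sin\theta.$$

Note that $i_2$ is not commutative with all multivectors in $G_2$, which makes $e^{i_2 \theta}$ not commutative as well. Indeed, $i_2$ commutes with scalars and bivectors, as shown in Equation 4 and Equation 10, and anticommutes with vectors as follows

$$i_2 e_1 = e_1 e_2 e_1 = -e_1 e_1 e_2 = -e_1 i_2.$$

The same can be said about $e_2$.

On the other hand, $i_3$ is commutative with any multivector in $G_3$, which means $e^{i_3 \theta}$ is commutative with any multivector in $G_3$, and this can be proved in the way followed with $i_2$. This observation has a noticeable impact on the properties of the Clifford-Fourier transform.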
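The commutation behaviour of the pseudoscalars can be verified with a small geometric product routine. This is a minimal sketch, assuming a Euclidean metric and a dictionary representation in which each basis blade is a bitmask (bit k set means that the basis vector number k+1 is present); the names `gp`, `reorder_sign`, and `neg` are hypothetical.

```python
def reorder_sign(a, b):
    """Sign picked up when the blade product a*b is put in canonical order."""
    a >>= 1
    swaps = 0
    while a:
        swaps += bin(a & b).count("1")
        a >>= 1
    return -1 if swaps & 1 else 1

def gp(A, B):
    """Geometric product of multivectors stored as {blade_bitmask: coeff}.

    Repeated basis vectors cancel through XOR because e_i e_i = 1.
    """
    out = {}
    for a, x in A.items():
        for b, y in B.items():
            blade = a ^ b
            out[blade] = out.get(blade, 0.0) + reorder_sign(a, b) * x * y
    return {k: c for k, c in out.items() if c != 0.0}

def neg(A):
    """Additive inverse of a multivector."""
    return {k: -c for k, c in A.items()}

E1, E2, E3 = {0b001: 1.0}, {0b010: 1.0}, {0b100: 1.0}
I2 = gp(E1, E2)            # pseudoscalar of the 2D algebra
I3 = gp(gp(E1, E2), E3)    # pseudoscalar of the 3D algebra

checks = {
    "e1 e2 = -e2 e1": gp(E1, E2) == neg(gp(E2, E1)),
    "e1 e1 = 1": gp(E1, E1) == {0: 1.0},
    "i2 squares to -1": gp(I2, I2) == {0: -1.0},
    "i3 squares to -1": gp(I3, I3) == {0: -1.0},
    "i2 anticommutes with e1": gp(I2, E1) == neg(gp(E1, I2)),
    "i2 anticommutes with e2": gp(I2, E2) == neg(gp(E2, I2)),
    "i3 commutes with every blade": all(
        gp(I3, {b: 1.0}) == gp({b: 1.0}, I3) for b in range(8)
    ),
}
```

Checking the 3D pseudoscalar against all eight basis blades is sufficient, since the geometric product is linear in both arguments.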

2.2 Clifford-Fourier transform

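A multivector function is a function whose range is a set of multivectors [28]. Now, let $f$ be a multivector function that is defined on $\mathbb{R}^n$,

$$f : \mathbb{R}^n \to G_n,$$

then its directional derivative in direction $r$ is defined as [23]

$$f_r(x) = \lim_{h \to 0} \frac{f(x + h r) - f(x)}{h},$$

where $h \in \mathbb{R}$. Also, its Riemannian integral is defined as [23]

$$\int_{\mathbb{R}^n} f(x)\, |dx| = \lim_{|\Delta x_i| \to 0} \sum_i f(x_i)\, |\Delta x_i|.$$

The Clifford-Fourier transform is presented for a 2D setting and then for a 3D setting.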

2.2.1 Clifford-Fourier transform in 2D

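The Clifford-Fourier transform of a multivector function $f : \mathbb{R}^2 \to G_2$ is defined as

$$\mathcal{F}\{f\}(u) = \int_{\mathbb{R}^2} f(x)\, e^{-2\pi i_2 \langle x, u \rangle}\, |dx|$$

and the inverse Clifford-Fourier transform is defined as

$$\mathcal{F}^{-1}\{f\}(x) = \int_{\mathbb{R}^2} f(u)\, e^{2\pi i_2 \langle x, u \rangle}\, |du|,$$

where $x, u \in \mathbb{R}^2$, provided the integrals exist. A multivector function defined as

$$f(x) = f_0(x) + f_1(x) e_1 + f_2(x) e_2 + f_{12}(x) e_{12}$$

can be written as

$$f(x) = 1\,[f_0(x) + f_{12}(x) i_2] + e_1 [f_1(x) + f_2(x) i_2],$$

which can be seen as two complex signals and interpreted as an element of $\mathbb{C}^2$. The linearity of the Clifford-Fourier transform results in the following

$$\mathcal{F}\{f\}(u) = 1\,[\mathcal{F}\{f_0 + f_{12} i_2\}(u)] + e_1 [\mathcal{F}\{f_1 + f_2 i_2\}(u)],$$

which means that the Clifford-Fourier transform in 2D can be treated as a linear combination of two classical Fourier transforms. One of the most powerful properties of the Fourier transform is the derivative property. The Clifford-Fourier transform has derivative properties that may agree or disagree with those of the classical Fourier transform. To present the derivative property for a multivector function $f$, one needs to decompose it into

$$f = f_c + f_a,$$

where $f_c$ is commutative with $i_2$ and $f_a$ is anticommutative with it. With this being said, Ebling and Scheuermann [23] showed the following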

while,

From the above, one can see that

while no rule can be written for .
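The reduction of the 2D transform to two classical Fourier transforms can be sketched in discrete form. This is a minimal illustration, assuming NumPy, a periodic sampling grid, and the grouping in which the scalar and bivector components form one complex signal and the two vector components form the other, with the pseudoscalar playing the role of the imaginary unit. The function names `cft2` and `icft2` are hypothetical.

```python
import numpy as np

def cft2(f0, f1, f2, f12):
    """Discrete 2D Clifford-Fourier transform built from two classical FFTs.

    The field f0 + f1 e1 + f2 e2 + f12 e12 is regrouped as
    (f0 + f12 i2) + e1 (f1 + f2 i2); each bracket is handled as an
    ordinary complex signal.
    """
    even = np.fft.fft2(f0 + 1j * f12)   # scalar + bivector part
    odd = np.fft.fft2(f1 + 1j * f2)     # vector part with e1 factored out
    return even, odd

def icft2(even, odd):
    """Inverse transform, returning the four real component fields."""
    a = np.fft.ifft2(even)
    b = np.fft.ifft2(odd)
    return a.real, b.real, b.imag, a.imag   # f0, f1, f2, f12

# Round trip on a random multivector field (illustrative data only).
rng = np.random.default_rng(0)
fields = rng.standard_normal((4, 16, 16))
recovered = icft2(*cft2(*fields))
round_trip_ok = all(np.allclose(r, f) for r, f in zip(recovered, fields))
```

The grouping works because a vector component multiplied on the right by the pseudoscalar stays a vector, so the second bracket never leaks into the first.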

2.2.2 Clifford-Fourier transform in 3D

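The Clifford-Fourier transform of a multivector function $f : \mathbb{R}^3 \to G_3$ is defined as

$$\mathcal{F}\{f\}(u) = \int_{\mathbb{R}^3} f(x)\, e^{-2\pi i_3 \langle x, u \rangle}\, |dx|$$

and the inverse Clifford-Fourier transform is defined as

$$\mathcal{F}^{-1}\{f\}(x) = \int_{\mathbb{R}^3} f(u)\, e^{2\pi i_3 \langle x, u \rangle}\, |du|,$$

where $x, u \in \mathbb{R}^3$, provided the integrals exist. A multivector function defined as

$$f(x) = f_0 + f_1 e_1 + f_2 e_2 + f_3 e_3 + f_{23} e_{23} + f_{31} e_{31} + f_{12} e_{12} + f_{123} e_{123}$$

can be written as

$$f(x) = [f_0 + f_{123} i_3] + e_1 [f_1 + f_{23} i_3] + e_2 [f_2 + f_{31} i_3] + e_3 [f_3 + f_{12} i_3],$$

which can be seen as four complex signals and interpreted as an element of $\mathbb{C}^4$. The linearity of the Clifford-Fourier transform results in the following

$$\mathcal{F}\{f\}(u) = [\mathcal{F}\{f_0 + f_{123} i_3\}(u)] + e_1 [\mathcal{F}\{f_1 + f_{23} i_3\}(u)] + e_2 [\mathcal{F}\{f_2 + f_{31} i_3\}(u)] + e_3 [\mathcal{F}\{f_3 + f_{12} i_3\}(u)],$$

and this makes it possible to treat the Clifford-Fourier transform in 3D as a linear combination of four classical Fourier transforms. The derivative property in 3D is similar to the one in the classical Fourier transform because $i_3$ is commutative with any multivector in $G_3$. Therefore [23],

$$\mathcal{F}\{\nabla f\}(u) = 2\pi i_3 u\, \mathcal{F}\{f\}(u), \qquad \mathcal{F}\{\Delta f\}(u) = -4\pi^2 u^2\, \mathcal{F}\{f\}(u).$$

In our applications, we use the Clifford-Fourier transform in 3D, denoted as CFT3, since it acts like the classical Fourier transform in terms of the derivative property.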
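The practical value of the derivative property is that differentiation becomes multiplication in the frequency domain. The sketch below checks this numerically for a real scalar field, for which the 3D transform reduces to the classical one with the pseudoscalar in place of the imaginary unit. It assumes NumPy and a periodic box of length 2π; the test field is arbitrary and chosen only because its derivatives are known in closed form.

```python
import numpy as np

n = 32
box = 2.0 * np.pi                        # periodic box length (assumed)
x = np.arange(n) * (box / n)
X, Y, Z = np.meshgrid(x, x, x, indexing="ij")

# Smooth periodic test field with known derivatives.
f = np.sin(X) * np.cos(2.0 * Y) + np.sin(3.0 * Z)

# Angular frequencies 2*pi*u along each axis.
k = 2.0 * np.pi * np.fft.fftfreq(n, d=box / n)
KX, KY, KZ = np.meshgrid(k, k, k, indexing="ij")

F = np.fft.fftn(f)

# First derivative in x: multiply the spectrum by i * k_x.
dfdx = np.fft.ifftn(1j * KX * F).real

# Laplacian: multiply the spectrum by -|k|^2.
lap = np.fft.ifftn(-(KX**2 + KY**2 + KZ**2) * F).real

dfdx_exact = np.cos(X) * np.cos(2.0 * Y)
lap_exact = -5.0 * np.sin(X) * np.cos(2.0 * Y) - 9.0 * np.sin(3.0 * Z)

max_err_dfdx = np.max(np.abs(dfdx - dfdx_exact))
max_err_lap = np.max(np.abs(lap - lap_exact))
```

Because the test field contains only a few low frequencies, the spectral derivatives agree with the analytic ones up to round-off.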

2.3 PDE transform

In this section, we present a brief review of the partial differential equation transform that was proposed in our earlier work [48]. This transform is used to generate the biomolecular surfaces by applying it to specific initial data driven by the coordinates of the atoms in the molecule and their van der Waals radii. The next section offers a detailed explanation of the methods used to obtain the initial data as well as the surface construction procedure.

Motivated by many physical phenomena in biological systems and pattern formation in nature, a family of high order PDEs for image processing was introduced in 1999

(12)

where is the image function, , is the edge sensitive diffusion coefficient and is the enhancement operator. Equation 12 is a generalization of the Perona-Malik equation [40] that can be recovered if the enhancement operator is set to zero and . The diffusion coefficients were defined as

(13)

where the values of depend on the noise level, and for

were defined in terms of local statistical variance of

and as

(14)

The notation represents the local average of centered at . The importance of the statistical measure based on the local statistical variance comes from its role in discriminating image features from noise. This advantage gives the ability to bypass the preprocessing done to noisy images where they get convolved with a test function or smooth mask [48].

Figure 2: Isosurfaces of the three-atom molecule generated with the Clifford-Fourier transform method using the piecewise initial data defined in Equation 23, extracted at a fixed isovalue and different propagation times. (a) The initial surface. (b)-(f) Isosurfaces at successively larger propagation times.

The well-posedness of the generalized Perona-Malik equation proposed was analyzed in terms of the existence and uniqueness of the solution [5, 34, 54]. The properties of Equation 12 were shown to be different from the properties of other high order PDEs because it is not derived from a variational formulation [34]. The stability of Equation 12 comes from appropriate choice of the coefficients [48].

As noted in our earlier work [48], the PDE transform can extract mode functions from some data given, say , which is a very important property of the PDE transform. The solution of Equation 12 can be found by the following equation

(15)

where is a low-pass PDE transform that satisfies

(16)

with being an artificial time involved in , is the th mode function and is the th residue function that is defined by

(17)
(18)

The original data can be reconstructed perfectly as [60]

(19)

Note that recursive applications of the PDE transform can generate the mode functions based on the input data, where the first mode is the trend of the data and the first residue is a general edge function. In contrast, the high-pass PDE transform, proposed in our earlier work, was constructed in such a way that the first mode is the edge type of information and the trend is the final residue [48].

For the practical applications in this work, we assume the following linearized form

(20)

where is the th residue of the data, and . This linearized equation is subject to the initial data . Solving this arbitrarily high-order PDE transform is computationally expensive. We use the fast Clifford-Fourier transform (FCFT) to make the computations more efficient. The Clifford-Fourier transform is applied to both sides of Equation 15 as follows

(21)

where is a frequency response function expressed as

(22)

with , and and are the Clifford-Fourier transforms of and .

In the present work, periodic boundary condition is used whenever needed.
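The idea of advancing a linear high-order evolution equation in the frequency domain, instead of time stepping it, can be sketched as follows. The sketch assumes a single-term equation of the form ∂u/∂t = (-1)^(q+1) d ∇^(2q) u with a constant coefficient d, which is an illustrative stand-in and not necessarily the exact form of Equation 20. For a real scalar field, NumPy's classical FFT stands in for the 3D Clifford-Fourier transform. The function name `pde_lowpass` and its parameters are hypothetical.

```python
import numpy as np

def pde_lowpass(u0, t, diffusion=1.0, q=1, spacing=1.0):
    """Evolve u0 to time t under du/dt = (-1)**(q+1) * diffusion * Lap**q u.

    In Fourier space Lap**q becomes (-|k|^2)**q, so every mode is simply
    scaled by exp(-diffusion * |k|**(2q) * t). Periodic boundaries are
    implied by the FFT.
    """
    freqs = [2.0 * np.pi * np.fft.fftfreq(m, d=spacing) for m in u0.shape]
    grids = np.meshgrid(*freqs, indexing="ij")
    k2 = sum(g**2 for g in grids)
    response = np.exp(-diffusion * k2**q * t)   # frequency response
    return np.fft.ifftn(response * np.fft.fftn(u0)).real

# Illustrative data: a sharp-edged block inside a periodic box.
u0 = np.zeros((16, 16, 16))
u0[4:12, 4:12, 4:12] = 1.0

u_short = pde_lowpass(u0, t=0.5)
u_long = pde_lowpass(u0, t=2.0)
u_fourth_order = pde_lowpass(u0, t=0.5, q=2)
```

Since the response equals one at zero frequency and is below one elsewhere, the mean of the data is preserved while its variance shrinks as the propagation time grows. No stability restriction on the time step arises because the solution is obtained in a single multiplication.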

3 Biomolecular surface generation

Figure 3: Isosurfaces of the imaginary three-atom molecule generated with the Clifford-Fourier transform method using the Gaussian initial data defined in Equation 24, extracted at a fixed isovalue and different propagation times. (a) The initial surface. (b)-(f) Isosurfaces at successively larger propagation times.

In this part, we propose a Clifford-Fourier transform-based surface generation method. First, we give a brief explanation of the method. Then, we show examples of surfaces generated using this method. In the next section, surface electrostatic potentials of some proteins are shown. The surfaces generated are compared with MSMS surfaces in terms of geometric singularities and electrostatic solvation free energy.

The first step in surface generation is to construct an initial shape driven by the coordinates and atomic radii of the atoms in the protein of interest. Then, we apply a PDE transform with specific parameters of time and order. After that, the Clifford-Fourier transform is applied to attain the final shape. From the final shape, we generate the molecular surface by extracting a specific isosurface. Two cases of initial data are used in this method: piecewise initial data and Gaussian initial data.

3.1 Initial data

The first of the two initial data cases is the piecewise initial data that we used in our earlier work [60, 51]. The piecewise initial value is defined as

$$u(\mathbf{r}, 0) = \begin{cases} 0, & \mathbf{r} \in \bigcup_{i=1}^{N} S_i, \\ 1, & \text{otherwise}, \end{cases} \tag{23}$$

where $S_i$ is the sphere centered at $\mathbf{r}_i$ with radius $r_i$, i.e. $S_i = \{\mathbf{r} : |\mathbf{r} - \mathbf{r}_i| \le r_i\}$, with $\mathbf{r}_i$ and $r_i$ being the coordinates of a specific atom in the molecule and its atomic radius respectively, and $N$ is the total number of atoms in that molecule. In our present work, we take the van der Waals radius to be the atomic radius. So, this equation means that if $\mathbf{r}$ is in the sphere of any atom in the molecule, then $u(\mathbf{r}, 0) = 0$, and if it is outside of every sphere, then $u(\mathbf{r}, 0) = 1$. As noted, this initial shape represents the van der Waals surface, which is non-smooth. Another non-smooth piecewise definition can be obtained by switching the region with value 0 to 1 and vice versa. This latter case was used extensively in our earlier work [2, 60].
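A piecewise initial field of this kind can be sampled on a regular grid as sketched below. The sketch assumes the value 0 inside the atomic spheres and 1 outside, the convention suggested by the later observation that the extracted surfaces inflate as the isovalue increases. The function name, the grid parameters, and the three atoms are hypothetical and serve only as an illustration.

```python
import numpy as np

def piecewise_initial_data(centers, radii, grid_min, grid_max, spacing):
    """Sample a 0/1 field: 0 inside any atomic sphere, 1 everywhere else."""
    axes = [np.arange(lo, hi + 0.5 * spacing, spacing)
            for lo, hi in zip(grid_min, grid_max)]
    X, Y, Z = np.meshgrid(*axes, indexing="ij")
    u0 = np.ones(X.shape)
    for (cx, cy, cz), r in zip(centers, radii):
        inside = (X - cx)**2 + (Y - cy)**2 + (Z - cz)**2 <= r**2
        u0[inside] = 0.0
    return u0

# Hypothetical three-atom molecule (coordinates and radii in angstroms).
centers = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (0.75, 1.3, 0.0)]
radii = [1.2, 1.2, 1.2]
u0 = piecewise_initial_data(centers, radii,
                            grid_min=(-3.0, -3.0, -3.0),
                            grid_max=(4.5, 4.5, 3.0),
                            spacing=0.25)
```

Swapping the two values gives the alternative piecewise definition; only the direction in which the level sets move with the isovalue changes.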

Figure 4: Isosurfaces of the imaginary three-atom molecule generated with the Clifford-Fourier transform method using the piecewise initial data defined in Equation 23, extracted at the first of two propagation times and six different isovalues, one per panel (a)-(f).

The second case of initial data is achieved using Gaussian functions. Gaussian functions have been used to generate molecular surfaces in the literature [55, 26, 56] and we exploited them in our earlier work of surface generation using PDE transform [60]. In this work, we adopt the Gaussian function proposed in our earlier work [60] which is a modified version of the one Giard and Macq [26] defined as

(24)

where is the threshold parameter and is set to Å. In this case of the initial value, the surface is not represented directly by the Gaussian function but the surface is indeed embedded within the Gaussian function. As noted, this case represents a smooth function but this does not give it any superiority over the non-smooth case in surface generation as discussed later.

Figure 5: Isosurfaces of the imaginary three-atom molecule generated with the Clifford-Fourier transform method using the piecewise initial data defined in Equation 23, extracted at the second of two propagation times and six different isovalues, one per panel (a)-(f).

3.2 Test cases

Now, we conduct different experiments on a test case of an imaginary molecule that is composed of three atoms. The coordinates of the centers of atoms are and each of them has an atomic radius of Å. First, we explore the effect of the propagation time on the extracted isosurfaces. The experiments are done with the following propagation times and to both initial values, piecewise and Gaussian. After that, the effect of changing the isovalue on the extracted isosurfaces is investigated. We carry out this first to the piecewise case with the following isovalues , and . This experiment is done twice to see the effect in two different times and . Then, this investigation is carried out on the Gaussian initial shape with the following isovalues , and and it is done twice to see the effect in two different times and .

Figure 6: Isosurfaces of the imaginary three-atom molecule generated with the Clifford-Fourier transform method using the Gaussian initial data defined in Equation 24, extracted at the first of two propagation times and six different isovalues, one per panel (a)-(f).
Figure 7: Isosurfaces of the imaginary three-atom molecule generated with the Clifford-Fourier transform method using the Gaussian initial data defined in Equation 24, extracted at the second of two propagation times and six different isovalues, one per panel (a)-(f).

3.2.1 The effect of propagation time

We start our experiments by investigating the effect of propagation time on the extracted isosurfaces. We apply our algorithm to a piecewise initial shape of the imaginary three-atom molecule. All isosurfaces are extracted at the same isovalue, while the propagation time increases geometrically. As seen in Figure 2, as time propagates the isosurfaces become smoother and geometric singularities disappear. However, a longer propagation time does not necessarily give a better surface: the isosurface in Figure 2(f) is oversmoothed and no longer informative.

Then, the same experiment is done on the Gaussian initial shape, again with all isosurfaces extracted at a single isovalue. Figure 3 shows that as time propagates the isosurfaces become smoother and the geometric singularities disappear. However, this does not mean that increasing the propagation time is always better: Figure 3(f) shows a surface that is oversmoothed and hence not very useful.

3.2.2 The effect of the isovalue

After seeing the effect of the propagation time, we now show the effect of changing the isovalue. Figure 4 shows six isosurfaces extracted at six different isovalues and a fixed propagation time using the piecewise initial data defined in Equation 23. Due to the definition of the initial data, the surfaces get inflated as the isovalue increases. In terms of geometric singularities, there are no significant changes as the isovalue changes. Likewise, Figure 5 shows isosurfaces extracted at the second propagation time and the same isovalues using the piecewise initial data. The same observations hold for this figure: the isosurfaces get larger as the isovalue increases and no noticeable changes happen to the geometric characteristics. In both figures, there are no geometric singularities.

Figure 8: The molecular surfaces of protein 1ajj generated by the Clifford-Fourier transform method using piecewise initial data. Panels (a)-(f) show different combinations of isovalue and propagation time.

Now, we conduct the same experiment on the Gaussian initial shape. Figure 6 shows six isosurfaces extracted at six different isovalues and a fixed propagation time using the Gaussian initial data defined in Equation 24. In contrast to the piecewise case, the isosurfaces of the Gaussian shape get deflated as the isovalues get larger. This is also due to the definition of the Gaussian initial function, which is monotonically decreasing away from the atoms. Regarding the geometric characteristics, the isosurfaces tend to be more meaningful as the isovalues get larger. However, very large isovalues may cause some geometric singularities. Likewise, the same experiment has been done for the same settings but with the second propagation time and is demonstrated in Figure 7. The same conclusions can be drawn about the isosurfaces in terms of surface size as well as geometric characteristics and singularities.
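The opposite trends for the two kinds of initial data follow from the direction in which each field varies away from the atoms, which can be illustrated by counting the grid points enclosed by each level set. The sketch below assumes a single hypothetical atom, a piecewise field equal to 0 inside and 1 outside, a generic Gaussian that is not the function of Equation 24, and plain second-order diffusion in Fourier space as the smoothing step. All parameter values are illustrative.

```python
import numpy as np

n, spacing = 32, 0.5
ax = (np.arange(n) - n // 2) * spacing
X, Y, Z = np.meshgrid(ax, ax, ax, indexing="ij")
r2 = X**2 + Y**2 + Z**2
radius = 2.0                                   # hypothetical atomic radius

piecewise0 = np.where(r2 <= radius**2, 0.0, 1.0)   # increases outward
gaussian0 = np.exp(-r2 / radius**2)                # decreases outward

def smooth(u0, t):
    """Heat-type smoothing applied as a multiplier in Fourier space."""
    k = 2.0 * np.pi * np.fft.fftfreq(n, d=spacing)
    KX, KY, KZ = np.meshgrid(k, k, k, indexing="ij")
    response = np.exp(-(KX**2 + KY**2 + KZ**2) * t)
    return np.fft.ifftn(response * np.fft.fftn(u0)).real

t = 0.25
u_piecewise = smooth(piecewise0, t)
u_gaussian = smooth(gaussian0, t)

isovalues = [0.2, 0.4, 0.6]
# The interior is the low side of the piecewise field and the high side
# of the Gaussian field.
vol_piecewise = [int(np.sum(u_piecewise < c)) for c in isovalues]
vol_gaussian = [int(np.sum(u_gaussian > c)) for c in isovalues]
```

Under these assumptions the enclosed voxel count grows with the isovalue for the piecewise field and shrinks for the Gaussian field, matching the inflation and deflation described above.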

3.3 Biomolecular surfaces

In this part, we show the biomolecular surfaces of real proteins generated using our algorithm, which exploits the Clifford-Fourier transform, to validate our proposed method. To gain acceptance within the molecular visualization community, our method is compared to the well-established solvent excluded surface (SES) computed with the MSMS package [44], which is available in the software Visual Molecular Dynamics (VMD) [33]. All protein structures and atomic coordinates used in our computations were obtained from the Protein Data Bank (PDB) website (https://www.rcsb.org). We then used the package PDB2PQR [20] to add the missing hydrogen atoms and assign point charges at atomic centers based on the CHARMM force field [37].
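The output of this preparation step is a PQR file that carries, for each atom, its coordinates, point charge, and radius. As a minimal sketch, assuming the whitespace-delimited PQR layout in which the last five fields of each ATOM/HETATM record are x, y, z, charge, and radius, such a file could be read as follows. The `PqrAtom` type, the `parse_pqr` helper, and the two-atom sample are hypothetical and serve only as illustration.

```python
from typing import List, NamedTuple

class PqrAtom(NamedTuple):
    name: str
    x: float
    y: float
    z: float
    charge: float
    radius: float

def parse_pqr(text: str) -> List[PqrAtom]:
    """Parse ATOM/HETATM records of a whitespace-delimited PQR file."""
    atoms = []
    for line in text.splitlines():
        if not line.startswith(("ATOM", "HETATM")):
            continue  # skip REMARK, TER, END, and similar records
        fields = line.split()
        # Assumed layout: the last five fields are x, y, z, charge, radius.
        x, y, z, charge, radius = map(float, fields[-5:])
        atoms.append(PqrAtom(fields[2], x, y, z, charge, radius))
    return atoms

# Hypothetical two-atom sample; the numbers are made up, not from a real structure.
SAMPLE_PQR = """\
REMARK   illustrative sample only
ATOM      1  N   ALA     1      -0.677  -1.230  -0.491 -0.3000 1.8500
ATOM      2  CA  ALA     1      -0.001   0.064  -0.491  0.2100 2.2750
END
"""

atoms = parse_pqr(SAMPLE_PQR)
total_charge = sum(a.charge for a in atoms)
print(len(atoms), "atoms, total charge", round(total_charge, 4))
```

Splitting on whitespace fails when adjacent numeric columns run together, which can happen with large coordinates, so a production reader would need a more careful strategy.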

Figure 9: The molecular surfaces of protein 1ajj generated by the Clifford-Fourier transform method using Gaussian initial data. Panels (a) to (f) each show a different combination of isovalue and propagation time.

Now, we start with the protein 1ajj by applying our method with both cases of initial shapes: piecewise and Gaussian. Figure 8 shows the piecewise case, where the first and second rows correspond to two different propagation times and each row shows three isovalues. As expected from the above test cases, the isosurfaces inflate slightly as the isovalues increase. Also as expected, the geometric singularities do not disappear as we change the isovalues. The second row in Figure 8 also agrees with our predictions from the test cases above. The isosurfaces at the larger propagation time are very smooth and hence not preferred for biomolecular surfaces.

Now, we carry out the same experiments on protein 1ajj with Gaussian initial data, to investigate the combinations of propagation times and isovalues. Figure 9 shows isosurfaces extracted at three isovalues, where the first row shows the results at one propagation time and the second row shows the results at another. The isosurfaces confirm our earlier predictions: geometric singularities tend to appear as the isovalues increase. Moreover, the surfaces deflate as the isovalues increase due to the definition of the Gaussian function. Picking the appropriate isovalue therefore involves two competing factors: the surface volume and the presence of geometric singularities.

Figure 10: Comparison of molecular surfaces generated by the Clifford-Fourier transform (CFT) using piecewise initial data (first column), the Clifford-Fourier transform using Gaussian initial data (second column), and MSMS (third column). The first row is for protein 1ajj. The second row is for protein 1bor. The third row is for protein 1mbg. The fourth row is for protein 1sh1. All Clifford-Fourier transform surfaces were generated with a common parameter setting.

After applying our method to the protein 1ajj and experimenting with different combinations of propagation times, isovalues, and initial values, we demonstrate more biomolecular surfaces. We show the biomolecular surfaces of the proteins 1ajj, 1bor, 1mbg, and 1sh1 using our method with two different sets of parameters and initial values. For each of the piecewise and Gaussian initial shapes, we fix an isovalue, a propagation time, and an order. The two resulting surfaces of each protein are compared with the SES generated using the MSMS package in VMD with a fixed probe radius and density. Figure 10 illustrates the biomolecular surfaces of the proteins mentioned above, where each row corresponds to a specific protein. The first row corresponds to 1ajj, the second to 1bor, the third to 1mbg, and the fourth to 1sh1. In each row, the first surface corresponds to the piecewise initial shape, the second to the Gaussian initial shape, and the third to the MSMS surface.

Figure 11: The electrostatic surface potentials of protein 1ajj mapped on two surfaces. (a) Surface generated by Clifford-Fourier transform method. (b) The MSMS surface.

4 Applications

The surface generated for a specific biomolecule represents its boundary region, which is regarded as the interface between the biomolecule region and the solvent region. The solvent-solute interface is essential in many models and applications such as electrostatic calculations [35], diffusion analysis [59], and differential geometry-based solvation models [50].

In this section, we show some applications of the biomolecular surfaces generated with our Clifford-Fourier transform method. We use the Poisson-Boltzmann model for electrostatic calculations. First, electrostatic surface potentials of some proteins are mapped to the surfaces generated with our Clifford-Fourier transform method and compared with the mapping on surfaces generated by the MSMS package. The electrostatic surface potentials are calculated using the APBS method [35] available in VMD. Then, the electrostatic solvation free energy is calculated and compared with three other methods of biomolecular surface generation. We calculate the energies using the matched interface and boundary (MIB) method [9] and compare our results with MSMS and with two different methods developed using the flexibility and rigidity index (FRI) [38].

Figure 12: The electrostatic surface potentials of protein 1bor mapped on two surfaces. (a) Surface generated by Clifford-Fourier transform method. (b) The MSMS surface.

4.1 Electrostatic surface potentials

The electrostatic surface potential is an important property for many applications in biology and biophysics [60]. It is essential in studies of drug design, protein-protein interactions, and other applications. The electrostatic potentials are calculated by solving the Poisson-Boltzmann equation using the APBS package available in VMD. The potentials are then mapped to the surface generated using our Clifford-Fourier transform method and to the SES generated by the MSMS package in VMD. We validate our method by demonstrating the potentials on two proteins: 1ajj and 1bor. First, Figure 11 shows the Clifford-Fourier transform surface on the left and the MSMS surface on the right; the two show a very good match in surface potentials. Then, Figure 12 shows another example, with the surface potentials mapped to the Clifford-Fourier transform surface on the left and to the MSMS surface on the right. Here, too, the two surfaces match very well in their surface potentials. In addition, the Clifford-Fourier transform surfaces are free of geometric singularities, as can be verified from the figures, and are smoother than the MSMS surfaces for both proteins.

Electrostatic solvation free energies (kcal/mol)
Protein ID    MSMS surface    FRI surface 1    FRI surface 2    CFT surface
1ajj          -1100.754       -1155.158        -1258.784        -1154.039
1vii           -862.865        -761.195         -846.414         -793.326
1bor           -927.310       -1021.579        -1140.463         -833.647
451c          -1003.17         -971.629        -1152.786         -946.969
1svr          -1582.131       -1530.159        -1700.400        -1518.964
1uxc          -1097.189        -975.583        -1075.070         -982.813
1mbg          -1340.086       -1329.440        -1412.048        -1226.894
1ptq           -800.130        -765.680         -866.256         -726.299
1sh1           -729.626        -648.315         -785.820         -724.799
2pde          -1234.229        -863.702         -990.056        -1192.055
1hpt           -788.626        -768.424         -873.410         -685.036
1a7m          -2173.814       -2159.535        -2492.651        -2196.988
1neq          -1683.679       -1661.077        -1811.667        -1552.394
1r69          -1115.733        -983.129        -1087.688         -983.534
1a2s          -1868.827       -2216.464        -2356.554        -1819.385
2erl           -894.960       -1127.617        -1189.821         -856.024
1bbl           -970.053        -993.386        -1052.617         -876.956
1fca          -1148.672       -1427.109        -1534.746        -1166.321
1frd          -2691.339       -2935.189        -3106.112        -2438.468
1bpi          -1267.063       -1170.959        -1269.508        -1128.051
1a63          -2291.449       -2233.139        -2495.664        -2213.839
Table 1: Comparison of electrostatic solvation free energies for surfaces generated by MSMS, the exponential FRI method (FRI surface 1), the Lorentz FRI method (FRI surface 2), and the Clifford-Fourier transform (CFT) method. The FRI and CFT surfaces each use a single fixed parameter set.
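To make the comparison in Table 1 easier to quantify, the following sketch computes the relative deviation of the CFT energies from the MSMS energies for each protein. The MSMS and CFT values are copied from Table 1; the choice of MSMS as the reference and of the absolute relative deviation as the metric is an assumption made for illustration, not a metric reported in this work.

```python
# (MSMS, CFT) electrostatic solvation free energies in kcal/mol, copied from Table 1.
ENERGIES = {
    "1ajj": (-1100.754, -1154.039),
    "1vii": (-862.865, -793.326),
    "1bor": (-927.310, -833.647),
    "451c": (-1003.17, -946.969),
    "1svr": (-1582.131, -1518.964),
    "1uxc": (-1097.189, -982.813),
    "1mbg": (-1340.086, -1226.894),
    "1ptq": (-800.130, -726.299),
    "1sh1": (-729.626, -724.799),
    "2pde": (-1234.229, -1192.055),
    "1hpt": (-788.626, -685.036),
    "1a7m": (-2173.814, -2196.988),
    "1neq": (-1683.679, -1552.394),
    "1r69": (-1115.733, -983.534),
    "1a2s": (-1868.827, -1819.385),
    "2erl": (-894.960, -856.024),
    "1bbl": (-970.053, -876.956),
    "1fca": (-1148.672, -1166.321),
    "1frd": (-2691.339, -2438.468),
    "1bpi": (-1267.063, -1128.051),
    "1a63": (-2291.449, -2213.839),
}

def relative_deviation(reference: float, value: float) -> float:
    """Absolute deviation of value from reference, relative to |reference|."""
    return abs(value - reference) / abs(reference)

# Per-protein deviation of the CFT energy from the MSMS energy.
rel_dev = {pid: relative_deviation(msms, cft) for pid, (msms, cft) in ENERGIES.items()}
mean_rel_dev = sum(rel_dev.values()) / len(rel_dev)

for pid, dev in sorted(rel_dev.items(), key=lambda item: item[1]):
    print(f"{pid}: {100 * dev:.2f}%")
print(f"mean: {100 * mean_rel_dev:.2f}%")
```

The same helper can be applied to the FRI columns to compare all four surface definitions on an equal footing.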

4.2 Electrostatic solvation free energy

Now, we calculate the electrostatic solvation free energies of 21 proteins to validate our method of surface generation. These calculations are performed using the matched interface and boundary method. To show the validity, the results are compared with three other methods discussed in the work of Mu et al. [38]. Table 1 shows the electrostatic solvation free energies of molecular surfaces generated by the MSMS package, the exponential kernel based rigidity (FRI surface 1), the Lorentz kernel based rigidity (FRI surface 2), and the Clifford-Fourier transform, each with a fixed set of parameters. The results of the three methods mentioned above were

5 Concluding remarks

Molecular surface generation is an important topic in computational biophysics and is crucial to the understanding of biological processes. A variety of computational methods have been developed for molecular surface generation, including those based on geometry, differential geometry, partial differential equation (PDE) transform, level sets, etc. Therefore, molecular surface generation has been a research topic where biology meets physics, mathematics, and computer science.

Geometric algebra has been widely applied to physics, computer vision, image analysis, molecular biophysics, and other fields. However, it has not been used for molecular surface generation. This work introduces geometric algebra for molecular surface generation. More specifically, we utilize the Clifford-Fourier transform (CFT), an important technique in geometric algebra, to define biomolecular surfaces.

We presented geometric algebra definitions of the main calculus concepts such as integration and differentiation. We also discussed in detail the two-dimensional CFT and the three-dimensional CFT. We pointed out that pseudoscalars in two dimensions do not commute with multivectors, in contrast to three dimensions, where commutativity is maintained. This affects the derivative property, so that the two-dimensional CFT does not have a general rule. On the other hand, the three-dimensional CFT parallels the classical Fourier transform in this regard. Hence, we preferred the three-dimensional CFT. This choice is due to the importance of the derivative property in solving PDEs using the CFT. The PDE transform is then implemented using the CFT to generate molecular surfaces.
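The commutativity statement above can be verified with a few lines of code. The sketch below is not part of the presented method; it assumes an orthonormal Euclidean basis, encodes each basis blade as a bitmask (bit i set means the basis vector e_{i+1} is present), and uses the standard swap-counting rule for the sign of a geometric product of basis blades. The function names are chosen for this illustration.

```python
def blade_product(a: int, b: int):
    """Geometric product of two orthonormal Euclidean basis blades.

    Blades are bitmasks; the result is (sign, blade). The sign is the parity
    of the number of transpositions needed to sort the basis vectors, and
    repeated vectors square to +1 (Euclidean metric).
    """
    swaps = 0
    shifted = a >> 1
    while shifted:
        swaps += bin(shifted & b).count("1")
        shifted >>= 1
    sign = -1 if swaps % 2 else 1
    return sign, a ^ b

def pseudoscalar_commutes(n: int) -> bool:
    """True if the unit pseudoscalar of the n-dimensional algebra commutes with every basis blade."""
    pseudo = (1 << n) - 1  # all n basis vectors present
    return all(
        blade_product(pseudo, b) == blade_product(b, pseudo)
        for b in range(1 << n)
    )

for n in (2, 3):
    print(f"n = {n}: pseudoscalar commutes with all blades -> {pseudoscalar_commutes(n)}")
```

Since every multivector is a linear combination of basis blades, checking the blades is sufficient. The same routine shows that the three-dimensional pseudoscalar squares to -1, which is why it can play the role of the imaginary unit in the transform kernel.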

After introducing the Clifford-Fourier transform and the PDE transform, we proposed our geometric algebra method of surface generation. We started by providing two cases of initial data: piecewise and Gaussian. Then, we conducted many experiments on an imaginary three-atom molecule and on the real protein 1ajj to see the effect of changing the propagation times and the effect of changing the isovalues. These experiments showed that both initial cases were valid and able to generate good molecular surfaces. After that, molecular surfaces of real proteins were generated using both initial cases and compared to MSMS surfaces. Our surfaces showed superiority over MSMS surfaces because they are free of geometric singularities. To validate our method, we calculated electrostatic surface potentials and mapped them to our surfaces and to MSMS surfaces to visualize the electrostatic consistency and the absence of singularities. Furthermore, we computed the electrostatic solvation free energies of 21 proteins on our surfaces and compared them to those from MSMS and FRI surfaces. Our surfaces showed very good results in terms of energies as well. One more feature of our method is that it reaches the terminal state in a single step, which makes it time-efficient.
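The single-step character of a Fourier-domain solution can be sketched as follows. This is an illustration under stated assumptions, not the implementation used in this work: it assumes a linear evolution equation of the form du/dt = (-1)^(q+1) * Laplacian^q u with unit coefficient and periodic boundaries, and it uses NumPy's complex FFT as a stand-in for the Clifford-Fourier transform of a scalar field. The function name, the box-shaped initial data, and all parameter values are hypothetical.

```python
import numpy as np

def single_step_evolution(u0, h, t, q=1):
    """Evolve u0 to time t in one step by damping its Fourier modes.

    Assumed model: du/dt = (-1)**(q+1) * Laplacian**q u on a periodic grid
    with spacing h, so each mode with wave vector k decays as
    exp(-|k|**(2*q) * t).
    """
    k_axes = [2.0 * np.pi * np.fft.fftfreq(n, d=h) for n in u0.shape]
    KX, KY, KZ = np.meshgrid(*k_axes, indexing="ij")
    k2 = KX ** 2 + KY ** 2 + KZ ** 2
    u_hat = np.fft.fftn(u0)
    u_hat *= np.exp(-(k2 ** q) * t)  # no time stepping: one multiplication
    return np.fft.ifftn(u_hat).real

# Hypothetical piecewise-constant initial data: a box of ones inside zeros.
u0 = np.zeros((32, 32, 32))
u0[12:20, 12:20, 12:20] = 1.0

u1 = single_step_evolution(u0, h=0.5, t=0.5, q=1)
print("mean before/after:", u0.mean(), u1.mean())
print("max before/after: ", u0.max(), u1.max())
```

A surface would then be extracted as an isosurface of the evolved field. Larger values of t damp more high-frequency content and give smoother surfaces, and a larger order q makes the damping cutoff in frequency sharper.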

Finally, the proposed geometric algebra surface generation method opens the door for new multiscale methods for surface generation where different propagation times might be applied at once, and then the resulting surfaces can be combined to get the final surface. Also, the presented Clifford-Fourier transform has great potential to be exploited in biophysical problems like protein-protein docking and protein-ligand binding.

Acknowledgment

This work was supported in part by NIH grant GM126189, NSF grants DMS-2052983, DMS-1761320, and IIS-1900473, NASA grant 80NSSC21M0023, Michigan Economic Development Corporation, MSU Foundation, Bristol-Myers Squibb 65109, and Pfizer. AA thanks Dr. Jiahui Chen for technical assistance.

References

  • [1] T. Batard, M. Berthier, and C. Saint-Jean. Clifford–Fourier transform for color image processing. In Geometric algebra computing, pages 135–162. Springer, 2010.
  • [2] P. W. Bates, G.-W. Wei, and S. Zhao. Minimal molecular surfaces and their applications. Journal of Computational Chemistry, 29(3):380–391, 2008.
  • [3] E. Bayro-Corrochano and G. Scheuermann. Geometric algebra computing: in engineering and computer science. Springer Science & Business Media, 2010.
  • [4] C. A. Bergström, M. Strafford, L. Lazorova, A. Avdeef, K. Luthman, and P. Artursson. Absorption classification of oral drugs based on molecular surface properties. Journal of medicinal chemistry, 46(4):558–570, 2003.
  • [5] A. L. Bertozzi and J. B. Greer. Low-curvature image simplifiers: Global regularity of smooth solutions and laplacian limiting schemes. Communications on Pure and Applied Mathematics: A Journal Issued by the Courant Institute of Mathematical Sciences, 57(6):764–790, 2004.
  • [6] U. A. Bhatti, Z. Yu, L. Yuan, Z. Zeeshan, S. A. Nawaz, M. Bhatti, A. Mehmood, Q. U. Ain, and L. Wen. Geometric algebra applications in geospatial artificial intelligence and remote sensing image processing. IEEE Access, 8:155783–155796, 2020.
  • [7] S. J. Billinge, P. M. Duxbury, D. S. Gonçalves, C. Lavor, and A. Mucherino. Assigned and unassigned distance geometry: applications to biological molecules and nanostructures. 4OR, 14(4):337–376, 2016.
  • [8] F. Brackx, N. De Schepper, and F. Sommen. The Clifford-Fourier transform. Journal of Fourier Analysis and Applications, 11(6):669–681, 2005.
  • [9] D. Chen, Z. Chen, C. Chen, W. Geng, and G.-W. Wei. Mibpb: a software package for electrostatic analysis. Journal of computational chemistry, 32(4):756–770, 2011.
  • [10] J. Chen, R. Zhao, Y. Tong, and G.-W. Wei. Evolutionary de Rham-Hodge method. Discrete and continuous dynamical systems. Series B, 26(7):3785, 2021.
  • [11] M. Chen and B. Lu. Tmsmesh: A robust method for molecular surface mesh generation using a trace technique. Journal of Chemical Theory and Computation, 7(1):203–212, 2011.
  • [12] H.-L. Cheng and X. Shi. Quality mesh generation for molecular skin surfaces using restricted union of balls. Computational Geometry, 42(3):196–206, 2009.
  • [13] L.-T. Cheng, Y. Xie, J. Dzubiella, J. A. McCammon, J. Che, and B. Li. Coupling the level-set method with molecular mechanics for variational implicit solvation of nonpolar molecules. Journal of chemical theory and computation, 5(2):257–266, 2009.
  • [14] P. Chys. Application of geometric algebra for the description of polymer conformations. The Journal of chemical physics, 128(10):104107, 2008.
  • [15] M. L. Connolly. Depth-buffer algorithms for molecular modelling. Journal of Molecular Graphics, 3(1):19–24, 1985.
  • [16] R. B. Corey and L. Pauling. Molecular models of amino acids, peptides, and proteins. Review of Scientific Instruments, 24(8):621–627, 1953.
  • [17] E. B. Corrochano and G. Sobczyk. Geometric algebra with applications in science and engineering. Springer Science & Business Media, 2001.
  • [18] P. B. Crowley and A. Golovin. Cation–π interactions in protein–protein interfaces. Proteins: Structure, Function, and Bioinformatics, 59(2):231–239, 2005.
  • [19] S. Decherchi and W. Rocchia. A general and robust ray-casting-based algorithm for triangulating surfaces at the nanoscale. PloS one, 8(4):e59744, 2013.
  • [20] T. J. Dolinsky, J. E. Nielsen, J. A. McCammon, and N. A. Baker. Pdb2pqr: an automated pipeline for the setup of Poisson–Boltzmann electrostatics calculations. Nucleic acids research, 32(suppl_2):W665–W667, 2004.
  • [21] L. Dorst, D. Fontijne, and S. Mann. Geometric algebra for computer science: an object-oriented approach to geometry. Elsevier, 2010.
  • [22] A. I. Dragan, C. M. Read, E. N. Makeyeva, E. I. Milgotina, M. E. Churchill, C. Crane-Robinson, and P. L. Privalov. Dna binding and bending by hmg boxes: energetic determinants of specificity. Journal of molecular biology, 343(2):371–393, 2004.
  • [23] J. Ebling and G. Scheuermann. Clifford fourier transform on vector fields. IEEE Transactions on Visualization and Computer Graphics, 11(4):469–479, 2005.
  • [24] F. Eisenhaber and P. Argos. Improved strategy in analytic surface calculation for molecular systems: Handling of singularities and computational efficiency. Journal of Computational Chemistry, 14(11):1272–1280, 1993.
  • [25] M. Felsberg, T. Bülow, and G. Sommer. Commutative hypercomplex fourier transforms of multidimensional signals. In Geometric Computing with Clifford Algebras, pages 209–229. Springer, 2001.
  • [26] J. Giard and B. Macq. Molecular surface mesh generation by filtering electron density map. International Journal of Biomedical Imaging, 2010, 2010.
  • [27] V. Gogonea and E. Ōsawa. Implementation of solvent effect in molecular mechanics part 3. the first-and second-order analytical derivatives of excluded volume. Journal of Molecular Structure: THEOCHEM, 311:305–324, 1994.
  • [28] D. Hestenes. Multivector calculus. Journal of Mathematical Analysis and Applications, 24(2):313–325, 1968.
  • [29] D. Hestenes. New foundations for classical mechanics, volume 15. Springer Science & Business Media, 2012.
  • [30] D. Hestenes and G. Sobczyk. Clifford algebra to geometric calculus: a unified language for mathematics and physics, volume 5. Springer Science & Business Media, 2012.
  • [31] E. Hitzer and C. Perwass. Interactive 3d space group visualization with clucalc and the clifford geometric algebra description of space groups. Advances in applied Clifford algebras, 20(3):631–658, 2010.
  • [32] E. Hitzer and S. J. Sangwine. Quaternion and Clifford Fourier transforms and wavelets. Springer, 2013.
  • [33] W. Humphrey, A. Dalke, and K. Schulten. VMD – Visual Molecular Dynamics. Journal of Molecular Graphics, 14:33–38, 1996.
  • [34] Z. Jin and X. Yang. Strong solutions for the generalized perona–malik equation for image restoration. Nonlinear Analysis: Theory, Methods & Applications, 73(4):1077–1084, 2010.
  • [35] E. Jurrus, D. Engel, K. Star, K. Monson, J. Brandi, L. E. Felberg, D. H. Brookes, L. Wilson, J. Chen, K. Liles, et al. Improvements to the apbs biomolecular solvation software suite. Protein Science, 27(1):112–128, 2018.
  • [36] C. Lavor and R. Alves. Oriented conformal geometric algebra and the molecular distance geometry problem. Advances in Applied Clifford Algebras, 29(1):1–15, 2019.
  • [37] A. D. MacKerell Jr, D. Bashford, M. Bellott, R. L. Dunbrack Jr, J. D. Evanseck, M. J. Field, S. Fischer, J. Gao, H. Guo, S. Ha, et al. All-atom empirical potential for molecular modeling and dynamics studies of proteins. The journal of physical chemistry B, 102(18):3586–3616, 1998.
  • [38] L. Mu, K. Xia, and G. Wei. Geometric and electrostatic modeling using molecular rigidity functions. Journal of Computational and Applied Mathematics, 313:18–37, 2017.
  • [39] K. Opron, K. Xia, and G.-W. Wei. Fast and anisotropic flexibility-rigidity index for protein flexibility and fluctuation analysis. The Journal of chemical physics, 140(23):06B617_1, 2014.
  • [40] P. Perona and J. Malik. Scale-space and edge detection using anisotropic diffusion. IEEE Transactions on pattern analysis and machine intelligence, 12(7):629–639, 1990.
  • [41] J. Quine. Helix parameters and protein structure using quaternions. Journal of Molecular Structure: THEOCHEM, 460(1-3):53–66, 1999.
  • [42] T. M. Raschke, J. Tsai, and M. Levitt. Quantification of the hydrophobic interaction by simulations of the aggregation of small hydrophobic solutes in water. Proceedings of the National Academy of Sciences, 98(11):5965–5969, 2001.
  • [43] F. M. Richards. Areas, volumes, packing, and protein structure. Annual review of biophysics and bioengineering, 6(1):151–176, 1977.
  • [44] M. F. Sanner, A. J. Olson, and J.-C. Spehner. Reduced surface: an efficient way to compute molecular surfaces. Biopolymers, 38(3):305–320, 1996.
  • [45] R. S. Spolar and M. T. Record. Coupling of local folding to site-specific binding of proteins to dna. Science, 263(5148):777–784, 1994.
  • [46] R. Wang, K. Wang, W. Cao, and X. Wang. Geometric algebra in signal and image processing: A survey. IEEE Access, 7:156315–156325, 2019.
  • [47] S. Wang, E. Alexov, and S. Zhao. On regularization of charge singularities in solving the Poisson-Boltzmann equation with a smooth solute-solvent boundary. Mathematical Biosciences and Engineering, 18(2):1370–1405, 2021.
  • [48] Y. Wang, G.-W. Wei, and S. Yang. Partial differential equation transform—variational formulation and fourier analysis. International journal for numerical methods in biomedical engineering, 27(12):1996–2020, 2011.
  • [49] G. W. Wei. Generalized Perona-Malik equation for image restoration. IEEE Signal processing letters, 6(7):165–167, 1999.
  • [50] G.-W. Wei. Differential geometry based multiscale models. Bulletin of mathematical biology, 72(6):1562–1622, 2010.
  • [51] G.-W. Wei, Y. Sun, Y. Zhou, and M. Feig. Molecular multiresolution surfaces. arXiv preprint math-ph/0511001, 2005.
  • [52] A. Witkin. Scale-space filtering: A new approach to multi-scale description. In ICASSP’84. IEEE International Conference on Acoustics, Speech, and Signal Processing, volume 9, pages 150–153. IEEE, 1984.
  • [53] K. Xia, K. Opron, and G.-W. Wei. Multiscale multiphysics and multidomain models—flexibility and rigidity. The Journal of chemical physics, 139(19):11B614_1, 2013.
  • [54] M. Xu and S. Zhou. Existence and uniqueness of weak solutions for a fourth-order nonlinear parabolic equation. Journal of Mathematical Analysis and Applications, 325(1):636–654, 2007.
  • [55] Z. Yu, M. J. Holst, Y. Cheng, and J. A. McCammon. Feature-preserving adaptive mesh generation for molecular shape modeling and simulation. Journal of Molecular Graphics and Modelling, 26(8):1370–1380, 2008.
  • [56] Y. Zhang, I. Hubner, A. Arakaki, E. Shakhnovich, and J. Skolnick. On the origin and completeness of highly likely single domain protein structures. Proc Natl Acad Sci USA, 103:2605–2610, 2006.
  • [57] R. Zhao, Z. Cang, Y. Tong, and G.-W. Wei. Protein pocket detection via convex hull surface evolution and associated reeb graph. Bioinformatics, 34(17):i830–i837, 2018.
  • [58] R. Zhao, M. Wang, Y. Tong, and G.-W. Wei. Divide-and-conquer strategy for large-scale eulerian solvent excluded surface. Communications in information and systems, 18(4):299, 2018.
  • [59] Q. Zheng and G.-W. Wei. Poisson–Boltzmann–Nernst–Planck model. The Journal of chemical physics, 134(19):194101, 2011.
  • [60] Q. Zheng, S. Yang, and G.-W. Wei. Biomolecular surface construction by pde transform. International journal for numerical methods in biomedical engineering, 28(3):291–316, 2012.