Geometric Wavelet Scattering Networks on Compact Riemannian Manifolds

05/24/2019 ∙ by Michael Perlmutter, et al. ∙ Université de Montréal Michigan State University 0

The Euclidean scattering transform was introduced nearly a decade ago to improve the mathematical understanding of convolutional neural networks. Inspired by recent interest in geometric deep learning, which aims to generalize convolutional neural networks to manifold and graph-structured domains, we define a geometric scattering transform on manifolds. Similar to the Euclidean scattering transform, the geometric scattering transform is based on a cascade of wavelet filters and pointwise nonlinearities. It is invariant to local isometries and stable to certain types of diffeomorphisms. Empirical results demonstrate its utility on several geometric learning tasks. Our results generalize the deformation stability and local translation invariance of Euclidean scattering, and demonstrate the importance of linking the used filter structures to the underlying geometry of the data.




1 Introduction

In an effort to improve our mathematical understanding of deep convolutional networks and their learned features, S. Mallat introduced the scattering transform for signals on $\mathbb{R}^d$ mallat:firstScat2010 ; mallat:scattering2012 . This transform has an architecture similar to convolutional neural networks (ConvNets), based on a cascade of convolutional filters and simple pointwise nonlinearities. However, unlike other deep learning methods, this transform uses the complex modulus as its nonlinearity and does not learn its filters from data, but instead uses designed filters. As shown in mallat:scattering2012 , with properly chosen wavelet filters, the scattering transform is provably invariant to the actions of certain Lie groups, such as the translation group, and is also provably Lipschitz stable to small diffeomorphisms, where the size of a diffeomorphism is quantified by its deviation from a translation. These notions were applied in bruna:scatClass2011 ; bruna:invariantScatConvNet2013 ; sifre:rotoScatTexture2012 ; mallat:rotoScat2013 ; mallat:rigidMotionScat2014 ; oyallon:scatObjectClass2014 using groups of translations, rotations, and scaling operations, with applications in image and texture classification. Additionally, the scattering transform and its deep filter bank approach have proven effective in several other fields, such as audio processing anden:scatAudioClass2011 ; anden:deepScatSpectrum2014 ; wolf:BSS-mlsp ; wolf:BSS ; arXiv:1807.08869 , medical signal processing talmon:scatManifoldHeart2014 , and quantum chemistry hirn:waveletScatQuantum2016 ; eickenberg:3DSolidHarmonicScat2017 ; eickenberg:scatMoleculesJCP2018 ; brumwell:steerableScatLiSi2018 .

However, many data sets of interest have an intrinsically non-Euclidean structure and are better modeled by graphs or manifolds. Indeed, manifold learning models (e.g., tenenbaum:isomap2000 ; coifman:diffusionMaps2006 ; maaten:tSNE2008 ) are commonly used for representing high-dimensional data in which unsupervised algorithms infer data-driven geometries to capture intrinsic structure in data. Furthermore, signals supported on manifolds are becoming increasingly prevalent, for example, in shape matching and computer graphics. As such, a large body of work has emerged to explore the generalization of spectral and signal processing notions to manifolds coifman:geometricHarmonics2006 and graphs (shuman:emerging2013 , and references therein). In these settings, functions are supported on the manifold or the vertices of the graph, and the eigenfunctions of the Laplace-Beltrami operator or the eigenvectors of the graph Laplacian serve as the Fourier harmonics. This increasing interest in non-Euclidean data geometries has led to a new research direction known as geometric deep learning, which aims to generalize convolutional networks to graph and manifold structured data (Bronstein:geoDeepLearn2017 , and references therein). Inspired by geometric deep learning, recent works have also proposed an extension of the scattering transform to graph domains. These mostly focused on finding features that represent a graph structure (given a fixed set of signals on it) while being stable to graph perturbations. In gama:diffScatGraphs2018 , a cascade of diffusion wavelets from coifman:diffWavelets2006 was proposed, and its Lipschitz stability was shown with respect to a global diffusion-inspired distance between graphs. A similar construction discussed in zou:graphCNNScat2018 was shown to be stable to permutations of vertex indices, and to small perturbations of edge weights. Finally, gao:graphScat2018 established the viability of scattering coefficients as universal graph features for data analysis tasks (e.g., in social networks and biochemistry data).

In this paper we consider the manifold aspect of geometric deep learning. There are two basic tasks in this setting: (1) classification of multiple signals over a single, fixed manifold; and (2) classification of multiple manifolds. Beyond these two tasks, there are additional problems of interest such as manifold alignment, partial manifold reconstruction, and generative models. Fundamentally for all of these tasks, both in the approach described here and in other papers, one needs to process signals over a manifold. Indeed, even in manifold classification tasks and related problems such as manifold alignment, one often begins with a set of universal features that can be defined on any manifold, and which are processed in such a way that allows for comparison of two or more manifolds. In order to carry out these tasks, a representation of manifold supported signals needs to be stable to orientations, noise, and deformations over the manifold geometry. Working towards these goals, we define a scattering transform on compact smooth Riemannian manifolds without boundary, which we call geometric scattering. Our construction is based on convolutional filters defined spectrally via the eigendecomposition of the Laplace-Beltrami operator over the manifold, as discussed in Section 2. We show that these convolutional operators can be used to construct a wavelet frame similar to the diffusion wavelets constructed in coifman:diffWavelets2006 . Then, in Section 3, we construct a cascade of these generalized convolutions and pointwise absolute value operations that is used to map signals on the manifold to scattering coefficients that encode approximate local invariance to isometries, which correspond to translations, rotations, and reflections in Euclidean space. We then show that our scattering coefficients are also stable to the action of diffeomorphisms with a notion of stability analogous to the Lipschitz stability considered in mallat:scattering2012 on Euclidean space. 
Our results provide a path forward for utilizing the scattering mathematical framework to analyze and understand geometric deep learning, while also shedding light on the challenges involved in such generalization to non-Euclidean domains. Numerical results in Section 4 show that geometric scattering coefficients achieve impressive results on signal classification on a single manifold, and classification of different manifolds. We demonstrate that the geometric scattering method can capture both local and global features to generate useful latent representations for various downstream tasks. Proofs of all theoretical results are provided in the appendices.

1.1 Notation

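Let $\mathcal{M}$ denote a compact, smooth, connected $d$-dimensional Riemannian manifold without boundary contained in $\mathbb{R}^n$, and let $L^2(\mathcal{M})$ denote the set of functions $f : \mathcal{M} \to \mathbb{R}$ that are square integrable with respect to the Riemannian volume $dx$. Let $r(x, x')$ denote the geodesic distance between two points $x, x' \in \mathcal{M}$, and let $\Delta$ denote the Laplace-Beltrami operator on $\mathcal{M}$. We let $\mathrm{Diff}(\mathcal{M})$ be the group of all diffeomorphisms $\zeta : \mathcal{M} \to \mathcal{M}$, and likewise let $\mathrm{Isom}(\mathcal{M})$ denote the group of all isometries on $\mathcal{M}$. For $\zeta \in \mathrm{Diff}(\mathcal{M})$, we let $\|\zeta\|_\infty := \sup_{x \in \mathcal{M}} r(x, \zeta(x))$ denote its maximum displacement.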

2 Geometric wavelet transforms on manifolds

The Euclidean scattering transform is constructed using wavelet and low-pass filters defined on $\mathbb{R}^d$. In Section 2.1, we extend the notion of convolution against a filter (wavelet, low-pass, or otherwise) to manifolds using notions from spectral geometry. Many of the notions described in this section are geometric analogues of similar constructions used in graph signal processing shuman:graphSigProc2013 . Section 2.2 utilizes these constructions to define Littlewood-Paley frames for $L^2(\mathcal{M})$, and Section 2.3 describes a specific class of Littlewood-Paley frames which we call geometric wavelets.

2.1 Convolution on manifolds

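On $\mathbb{R}^d$ the convolution of a signal $f$ with a filter $h$ is defined by translating $h$ against $f$; however, translations are not well-defined on generic manifolds. Nevertheless, convolution can also be characterized using the Fourier convolution theorem, i.e., $\widehat{f \ast h}(\omega) = \hat{f}(\omega)\,\hat{h}(\omega)$. Fourier analysis can be defined on $\mathcal{M}$ using the spectral decomposition of $-\Delta$. Since $\mathcal{M}$ is compact and connected, $-\Delta$ has countably many eigenvalues which we enumerate as

$$0 = \lambda_0 < \lambda_1 \le \lambda_2 \le \cdots$$

(repeating those with multiplicity greater than one), and there exists a sequence of eigenfunctions $\varphi_0, \varphi_1, \varphi_2, \ldots$ such that $\{\varphi_k\}_{k \ge 0}$ is an orthonormal basis for $L^2(\mathcal{M})$ and $-\Delta \varphi_k = \lambda_k \varphi_k$. One can show that $\varphi_0$ is constant, which implies, by orthogonality, that $\varphi_k$ has mean zero for $k \ge 1$. We consider the eigenfunctions $\{\varphi_k\}_{k \ge 0}$ as the Fourier modes of the manifold $\mathcal{M}$, and define the Fourier series $\hat{f}$ of $f \in L^2(\mathcal{M})$ as

$$\hat{f}(k) = \langle f, \varphi_k \rangle = \int_{\mathcal{M}} f(y)\,\varphi_k(y)\,dy.$$

The following result, which is the analogue of the Fourier inversion theorem for $\mathbb{R}^d$, will be a useful way to represent signals supported on $\mathcal{M}$:

$$f(x) = \sum_{k \ge 0} \hat{f}(k)\,\varphi_k(x). \tag{1}$$

For $f, h \in L^2(\mathcal{M})$ we define the convolution $\ast$ over $\mathcal{M}$ between $f$ and $h$ as

$$f \ast h(x) = \sum_{k \ge 0} \hat{f}(k)\,\hat{h}(k)\,\varphi_k(x) = \int_{\mathcal{M}} K_h(x, y)\,f(y)\,dy, \qquad K_h(x, y) = \sum_{k \ge 0} \hat{h}(k)\,\varphi_k(x)\,\varphi_k(y). \tag{2}$$

The last formulation, integration against the kernel $K_h$, will be used when we implement these operators numerically in Section 4.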
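The two formulations of convolution can be compared numerically. The following is a minimal sketch, not the authors' implementation: it assumes the manifold is replaced by an n-vertex cycle graph, that the graph Laplacian stands in for the negative Laplace-Beltrami operator, and that the inner product is a plain sum over vertices. The helper names (`cycle_laplacian`, `spectral_convolve`, `convolution_kernel`) and the filter `h_hat` are hypothetical.

```python
import numpy as np

def cycle_laplacian(n):
    """Graph Laplacian of an n-vertex cycle (assumed discrete stand-in for -Delta on a circle)."""
    eye = np.eye(n)
    return 2.0 * eye - np.roll(eye, 1, axis=0) - np.roll(eye, -1, axis=0)

def spectral_convolve(f, h_hat, phi):
    """Convolve f with the filter whose Fourier coefficients are h_hat."""
    f_hat = phi.T @ f               # f_hat[k] = <f, phi_k>
    return phi @ (h_hat * f_hat)    # sum_k f_hat[k] * h_hat[k] * phi_k

def convolution_kernel(h_hat, phi):
    """Kernel K[x, y] = sum_k h_hat[k] * phi_k[x] * phi_k[y]."""
    return (phi * h_hat) @ phi.T

n = 64
lam, phi = np.linalg.eigh(cycle_laplacian(n))   # ascending eigenvalues, orthonormal columns
h_hat = np.exp(-lam)                            # hypothetical filter, a function of the eigenvalue
f = np.random.default_rng(0).standard_normal(n)

via_series = spectral_convolve(f, h_hat, phi)       # Fourier-multiplier form
via_kernel = convolution_kernel(h_hat, phi) @ f     # integration against the kernel
print(np.allclose(via_series, via_kernel))
```

On a mesh, the same two functions would apply once `lam` and `phi` come from a discretized Laplace-Beltrami operator and the sums are weighted by vertex areas; that weighting is omitted here for brevity.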

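It is well known that convolution on $\mathbb{R}^d$ commutes with translations. This equivariance property is fundamental to Euclidean ConvNets, and has spurred the development of equivariant neural networks on other spaces pmlr-v48-cohenc16 ; kondor:equivarianceNNGroups2018 ; thomas:tensorFieldNetworks2018 ; kondor:clebsch-gordanNets2018 ; cohen:sphericalCNNs2018 ; kondor:covariantCompNets2018 ; NIPS2018_8239 . Since translations are not well-defined on $\mathcal{M}$, we instead seek to construct a family of operators which commute with isometries. To this end, we say a filter $h \in L^2(\mathcal{M})$ is a spectral filter if $\lambda_k = \lambda_\ell$ implies $\hat{h}(k) = \hat{h}(\ell)$, i.e., if $\hat{h}(k)$ can be written as a function of $\lambda_k$. For a diffeomorphism $\zeta \in \mathrm{Diff}(\mathcal{M})$ we define the operator $V_\zeta : L^2(\mathcal{M}) \to L^2(\mathcal{M})$ as

$$V_\zeta f(x) = f(\zeta^{-1}(x)).$$

The following theorem shows that $T_h$ and $V_\zeta$ commute if $\zeta$ is an isometry and $h$ is a spectral filter, where $T_h f := f \ast h$ denotes convolution with $h$. We note that the assumption that $h$ is a spectral filter is critical: in general, $T_h$ does not commute with isometries if $h$ is not a spectral filter. We will give a proof in Appendix A.

Theorem 1.

For every spectral filter $h \in L^2(\mathcal{M})$ and for every $f \in L^2(\mathcal{M})$,

$$T_h V_\zeta f = V_\zeta T_h f \qquad \text{for all } \zeta \in \mathrm{Isom}(\mathcal{M}).$$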
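The role of the spectral filter assumption can be checked on a toy example. The sketch below is illustrative only and rests on stated assumptions: the manifold is replaced by a cycle graph, a cyclic shift of the vertices plays the role of the isometry, and the two filters (`1 / (1 + lam)` and a random sequence) are hypothetical choices. The first is a function of the eigenvalue; the second assigns different values to eigenfunctions sharing an eigenvalue.

```python
import numpy as np

n, shift = 64, 5
eye = np.eye(n)
laplacian = 2.0 * eye - np.roll(eye, 1, axis=0) - np.roll(eye, -1, axis=0)  # cycle graph
lam, phi = np.linalg.eigh(laplacian)

def filter_operator(h_hat):
    """Matrix of the convolution operator with Fourier coefficients h_hat."""
    return (phi * h_hat) @ phi.T

rng = np.random.default_rng(0)
f = rng.standard_normal(n)
rotate = lambda g: np.roll(g, shift)   # (V f)[i] = f[i - shift], an isometry of the cycle

T_spectral = filter_operator(1.0 / (1.0 + lam))       # depends on k only through lam[k]
T_generic = filter_operator(rng.standard_normal(n))   # not a function of the eigenvalue

spectral_commutes = np.allclose(T_spectral @ rotate(f), rotate(T_spectral @ f))
generic_commutes = np.allclose(T_generic @ rotate(f), rotate(T_generic @ f))
print(spectral_commutes, generic_commutes)
```

The generic filter fails because the shift rotates each two-dimensional eigenspace of the cycle, and a filter that weights the two basis vectors of such an eigenspace differently is not preserved by that rotation.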

2.2 Littlewood-Paley frames over manifolds

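A family of spectral filters $\{h_\gamma : \gamma \in \Gamma\}$ (with $\Gamma$ countable) is called a Littlewood-Paley frame if it satisfies the following condition, which implies that the $h_\gamma$ cover the frequencies of $\mathcal{M}$ evenly:

$$\sum_{\gamma \in \Gamma} |\hat{h}_\gamma(k)|^2 = 1 \qquad \text{for all } k \ge 0. \tag{3}$$

We define the corresponding frame analysis operator, $H$, by

$$H f = \{ f \ast h_\gamma : \gamma \in \Gamma \}.$$

The following proposition shows that if (3) holds, then $H$ preserves the energy of $f$. For a proof, please see Appendix B.

Proposition 1.

If $\{h_\gamma : \gamma \in \Gamma\}$ satisfies (3), then $H : L^2(\mathcal{M}) \to \ell^2(L^2(\mathcal{M}))$ is an isometry, i.e.,

$$\|H f\|_{2,2}^2 := \sum_{\gamma \in \Gamma} \|f \ast h_\gamma\|_2^2 = \|f\|_2^2 \qquad \text{for all } f \in L^2(\mathcal{M}).$$

Since the operator $H$ is linear, Proposition 1 also shows the operator $H$ is non-expansive, i.e., $\|H f_1 - H f_2\|_{2,2} \le \|f_1 - f_2\|_2$. This property is directly related to the stability of a ConvNet of the form $\sigma_L(H_L(\cdots \sigma_1(H_1 f) \cdots))$, in which frame analysis operators $H_\ell$ alternate with nonlinear operators $\sigma_\ell$. Indeed, if all the frame analysis operators $H_\ell$ and all the nonlinear operators $\sigma_\ell$ are non-expansive, then the entire network is non-expansive as well.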
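Energy preservation and non-expansiveness can be illustrated with a very small frame. This sketch assumes the Littlewood-Paley condition means that the squared frequency responses of the filters sum to one at every frequency, and it uses a hypothetical two-filter frame (a cosine and a sine of a rescaled eigenvalue) on a cycle graph standing in for the manifold.

```python
import numpy as np

n = 48
eye = np.eye(n)
laplacian = 2.0 * eye - np.roll(eye, 1, axis=0) - np.roll(eye, -1, axis=0)  # cycle graph
lam, phi = np.linalg.eigh(laplacian)

# Hypothetical two-filter frame: cos^2 + sin^2 = 1 at every frequency.
theta = 0.5 * np.pi * lam / lam.max()
frame = np.stack([np.cos(theta), np.sin(theta)])   # rows are the filters' Fourier coefficients

def analyze(f):
    """Frame analysis operator: one filtered copy of f per filter in the frame."""
    return np.stack([phi @ (h_hat * (phi.T @ f)) for h_hat in frame])

rng = np.random.default_rng(1)
f1, f2 = rng.standard_normal(n), rng.standard_normal(n)

energy_in = np.sum(f1 ** 2)
energy_out = np.sum(analyze(f1) ** 2)
# For a linear isometry the distance between outputs equals the distance between inputs.
gap = np.linalg.norm(analyze(f1) - analyze(f2)) - np.linalg.norm(f1 - f2)
print(energy_in, energy_out, gap)
```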

2.3 Geometric wavelet transforms on manifolds

The geometric wavelet transform is a special type of Littlewood-Paley frame analysis operator in which the filters group the frequencies of into dyadic packets. A spectral filter is said to be a low-pass filter if and is non-increasing with respect to . Typically, decays rapidly as grows large. Thus, a low-pass filtering, , retains the low frequencies of while suppressing the high frequencies. A wavelet, is a spectral filter such that and . Unlike low-pass filters, wavelets have no frequency response at but are generally well localized in the frequency domain away from

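We shall define a family of low-pass and wavelet filters, using the difference between low-pass filters at consecutive dyadic scales, in a manner which mimics standard wavelet constructions (see, e.g., meyer:waveletsOperators1993 ). Let $G : [0, \infty) \to \mathbb{R}$ be a non-negative, non-increasing function with $G(0) = 1$. Define a low-pass spectral filter $\phi$ by $\hat{\phi}(k) = G(\lambda_k)$, and define its dilation at scale $2^j$ for $j \in \mathbb{Z}$, by $\hat{\phi}_j(k) = G(2^j \lambda_k)$. Given the dilated low-pass filters, we define our wavelet filters by

$$\hat{\psi}_j(k) = \left[ |\hat{\phi}_{j-1}(k)|^2 - |\hat{\phi}_j(k)|^2 \right]^{1/2}. \tag{4}$$

Letting $A_J f = f \ast \phi_J$ and $\Psi_j f = f \ast \psi_j$, we define the geometric wavelet transform as

$$W_J f = \{ A_J f,\ \Psi_j f : j \le J \}.$$

The geometric wavelet transform extracts the low frequency, slow transitions of $f$ over $\mathcal{M}$ through $A_J f$, and groups the high frequency, sharp transitions of $f$ over $\mathcal{M}$ into different dyadic frequency bands via the collection $\{\Psi_j f : j \le J\}$. The following proposition can be proved by observing that $\{\phi_J, \psi_j : j \le J\}$ forms a Littlewood-Paley frame and applying Proposition 1. We provide a proof in Appendix C.

Proposition 2.

For any $J \in \mathbb{Z}$, $W_J : L^2(\mathcal{M}) \to \ell^2(L^2(\mathcal{M}))$ is an isometry, i.e.,

$$\|W_J f\|_{2,2} = \|f\|_2 \qquad \text{for all } f \in L^2(\mathcal{M}).$$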

An important example is the frequency function $G(\lambda) = e^{-\lambda}$. In this case the low-pass kernel is the heat kernel on $\mathcal{M}$ at time $t = 2^J$, where $2^J$ is the scale of the low-pass filter, and the wavelet operators are similar to the diffusion wavelets introduced in coifman:diffWavelets2006 . Figure 1 depicts these wavelets over manifolds from the FAUST Bogo:CVPR:2014 data set.

Figure 1: Geometric wavelets on the FAUST mesh with . From left to right . Positive values are colored red, while negative values are dark blue.
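A dyadic filter bank of the kind described in this section can be sketched in a few lines. The code below is a hedged illustration, not the paper's implementation. It assumes that the low-pass response at scale 2^j is G(2^j λ) with G(λ) = exp(-λ), and that each wavelet response is the square root of the difference between squared low-pass responses at consecutive dyadic scales. Because a computer can only hold finitely many scales, a residual band is added below a hypothetical cutoff `j_min` so that the squared responses still sum to one; that residual band, the function name `geometric_wavelet_bank`, and the placeholder spectrum are assumptions made here.

```python
import numpy as np

def geometric_wavelet_bank(lam, J, j_min, G=lambda x: np.exp(-x)):
    """Rows: low-pass at scale 2^J, wavelets for j = J, ..., j_min + 1, then a residual band."""
    low = lambda j: G((2.0 ** j) * lam)
    bank = [low(J)]
    for j in range(J, j_min, -1):
        # Wavelet at scale 2^j: frequencies kept by low(j - 1) but removed by low(j).
        bank.append(np.sqrt(np.maximum(low(j - 1) ** 2 - low(j) ** 2, 0.0)))
    # Residual band (truncation choice): everything finer than scale 2^j_min.
    bank.append(np.sqrt(np.maximum(1.0 - low(j_min) ** 2, 0.0)))
    return np.stack(bank)

lam = np.linspace(0.0, 4.0, 9)     # placeholder spectrum; in practice, Laplacian eigenvalues
bank = geometric_wavelet_bank(lam, J=2, j_min=-3)
littlewood_paley_sum = np.sum(bank ** 2, axis=0)   # telescopes to one at every frequency
print(bank.shape, littlewood_paley_sum)
```

The telescoping sum is the same mechanism used in the proof of Proposition 2; the `np.maximum(..., 0.0)` guard only protects the square root against rounding error.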

3 The geometric wavelet scattering transform

The geometric wavelet scattering transform is a type of geometric ConvNet, constructed in a manner analogous to the Euclidean scattering transform mallat:scattering2012 as an alternating cascade of geometric wavelet transforms (defined in Section 2.3) and nonlinearities. As we shall show in Sections 3.3 and 3.4, this transformation enjoys several desirable properties for processing data consisting of signals defined on a fixed manifold $\mathcal{M}$, in addition to tasks in which each data point is a different manifold and one is required to compare and classify manifolds. Tasks of the latter form are approachable due to the use of geometric wavelets that are derived from a universal frequency function $G$ that is defined independent of $\mathcal{M}$. Motivation for these invariance and stability properties is given in Section 3.1, and the geometric wavelet scattering transform is defined in Section 3.2.

3.1 The role of invariance and stability

Invariance and stability play a fundamental role in many machine learning tasks, particularly in computer vision. For classification and regression, one often wants to consider two signals $f_1, f_2 \in L^2(\mathcal{M})$, or two manifolds $\mathcal{M}$ and $\mathcal{M}'$, to be equivalent if they differ by the action of a global isometry. Similarly, it is desirable that the action of small diffeomorphisms on $f$, or on the underlying manifold $\mathcal{M}$, should not have a large impact on the representation of the input signal.

Thus, we seek to construct a family of representations, , which are invariant to isometric transformations up to the scale . Such a representation should satisfy a condition of the form:


where measures the size of the isometry with , and decreases to zero as the scale grows to infinity. For diffeomorphisms, invariance is too strong of a property. Instead, we want a family of representations that is stable to diffeomorphism actions, but not invariant. Combining this requirement with the isometry invariance condition (5) leads us to seek a condition of the form:


where measures how much differs from being an isometry, with if and if . At the same time, the representations should not be trivial. Different classes or types of signals are often distinguished by their high frequency content, i.e., for large Our problem is thus to find a family of representations for data defined on a manifold that is stable to diffeomorphisms, allows one to control the scale of isometric invariance, and discriminates between different types of signals, in both high and low frequencies. The wavelet scattering transform of mallat:scattering2012 achieves goals analogous to the ones presented here, but for Euclidean supported signals. We seek to construct a geometric version of the scattering transform, using filters corresponding to the spectral geometry of and to show it has similar properties.

3.2 Defining the geometric wavelet scattering transform

The geometric scattering transform is a nonlinear operator constructed through an alternating cascade of at most geometric wavelet transforms and nonlinearities. Its construction is motivated by the desire to obtain localized isometry invariance and stability to diffeomorphisms, as formulated in Section 3.1.

A simple way to obtain a locally isometry invariant representation of a signal is to apply the low-pass averaging operator If , then one can use Theorem 1 to show that


In other words, the difference between and for a unit energy signal (i.e., ), is no more than the size of the isometry depressed by a factor of , up to some universal constant that depends only on . Thus, the parameter controls the degree of invariance.

However, by definition , and so if we see the high frequency content of is lost in the representation . The high frequencies of are recovered with the wavelet coefficients , which are guaranteed to capture the remaining frequency content of . However, the wavelet coefficients are not isometry invariant and thus do not satisfy any bound analogous to (7). If we apply the averaging operator in addition to the wavelet coefficient operator, we obtain:

but by design the sequences and have small overlapping support, particularly in their largest responses, and thus . In order to obtain a non-trivial invariant that also retains some of the high frequency information in the signal , we apply a nonlinear operator. We choose the absolute value function because it is non-expansive and commutes with isometries. This leads to the following locally invariant descriptions of which we refer to as the first-order scattering coefficients:


The collection of all such coefficients is written as , where . These coefficients also satisfy a local invariance bound similar to (7), but encode multiscale characteristics of over the manifold geometry, which are not contained in . Nevertheless, the geometric scattering representation still loses information contained in the signal . Indeed, even with the absolute value, the functions have frequency information not captured by the low-pass . Iterating the geometric wavelet transform recovers this information by computing , which contains the first order invariants (8) but also retains the high frequencies of . We then obtain second-order geometric wavelet scattering coefficients given by

the collection of which can be written as . The corresponding geometric scattering transform up to order computes , which can be thought of as a three layer geometric ConvNet that extracts invariant representations of the inputted signal at each layer. Second order coefficients, in particular, decompose the interference patterns in into dyadic frequency bands via a second wavelet transform. This second order transform has the effect of coupling two scales and over the geometry of the manifold .

The general geometric scattering transform iterates the wavelet transform and absolute value operators up to an arbitrary depth. It is defined as


where is the scale of its invariance and is the depth of the network; Figure 2 gives a diagrammatic representation of . The invariance and diffeomorphism stability properties of are described in Sections 3.3 and 3.4, respectively. The following proposition shows that is non-expansive. The proof is nearly identical to (mallat:scattering2012, , Proposition 2.5), and is thus omitted.

Proposition 3.

The geometric wavelet scattering transform is nonexpansive, i.e.,


Figure 2: The geometric wavelet scattering transform , illustrated for .
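The cascade of wavelet filtering, absolute value, and low-pass averaging can be sketched as follows. This is a minimal illustration under several assumptions that are not taken from the paper: the manifold is replaced by a cycle graph, the filters use G(λ) = exp(-λ) with square-root differences at consecutive dyadic scales, scales below a hypothetical cutoff `j_min` are merged into one residual band, and every path of wavelet indices up to the chosen depth is kept. The function names are hypothetical.

```python
import numpy as np

def cycle_spectrum(n):
    """Eigenvalues and orthonormal eigenvectors of the cycle-graph Laplacian."""
    eye = np.eye(n)
    laplacian = 2.0 * eye - np.roll(eye, 1, axis=0) - np.roll(eye, -1, axis=0)
    return np.linalg.eigh(laplacian)

def filter_bank(lam, J, j_min):
    """Low-pass response at scale 2^J plus wavelet responses (assumed construction)."""
    low = lambda j: np.exp(-(2.0 ** j) * lam)
    wavelets = [np.sqrt(np.maximum(low(j - 1) ** 2 - low(j) ** 2, 0.0))
                for j in range(J, j_min, -1)]
    wavelets.append(np.sqrt(np.maximum(1.0 - low(j_min) ** 2, 0.0)))  # residual band
    return low(J), wavelets

def scattering(f, phi, low_pass, wavelets, depth):
    """Stack the low-pass averages of all paths of length 0, 1, ..., depth."""
    conv = lambda g, h_hat: phi @ (h_hat * (phi.T @ g))
    layer, outputs = [f], []
    for order in range(depth + 1):
        outputs.extend(conv(u, low_pass) for u in layer)   # invariant coefficients
        if order < depth:
            # Propagate: wavelet filtering followed by the absolute value.
            layer = [np.abs(conv(u, w)) for u in layer for w in wavelets]
    return np.stack(outputs)

lam, phi = cycle_spectrum(32)
low_pass, wavelets = filter_bank(lam, J=2, j_min=-2)
rng = np.random.default_rng(0)
f, g = rng.standard_normal(32), rng.standard_normal(32)
S_f = scattering(f, phi, low_pass, wavelets, depth=2)
S_g = scattering(g, phi, low_pass, wavelets, depth=2)
print(S_f.shape, np.linalg.norm(S_f - S_g) <= np.linalg.norm(f - g))
```

With five wavelet bands and depth two, the output stacks 1 + 5 + 25 = 31 filtered signals. Averaging each row over the vertices gives one number per path, a discrete analogue of replacing the low-pass operator with integration over the manifold.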

3.3 Isometric invariance

The geometric wavelet scattering transform is invariant to the action of the isometry group on the inputted signal up to a factor that depends upon the frequency decay of the low-pass spectral filter . If , then the following theorem establishes isometric invariance up to the scale . We will give a proof in Appendix D.

Theorem 2.

Let and . Then there is a constant such that

For manifold classification (or any task requiring rigid invariance), we take . This limit is equivalent to replacing the low-pass operator with an integration over , since for any ,


3.4 Stability to diffeomorphisms

Analogously to the Lipschitz diffeomorphism stability in (mallat:scattering2012, , Section 2.5), we wish to show the geometric scattering coefficients are stable to diffeomorphisms that are close to being an isometry. Similarly to wiatowski:frameScat2015 ; czaja:timeFreqScat2017 , we will assume the inputted signal is - bandlimited for some That is, whenever For the proof, please see Appendix E.

Theorem 3.

Let and let . Then there is a constant such that if for some isometry and diffeomorphism


for all functions such that whenever

Theorem 3 achieves the goal set forth by (6), with the exception that we restrict to bandlimited functions. When is an isometry, it reduces to Theorem 2, since in this case we may choose , and note that . For a general diffeomorphism, taking the infimum of over all factorizations leads to a bound where the first term depends on the scale of the isometric invariance and the second term depends on the distance from to the isometry group in the uniform norm.

3.5 Isometric invariance between different manifolds

In shape matching and many other tasks, it is desirable to relax the assumption that is a diffeomorphism from to itself and instead assume that is a diffeomorphism from to another manifold The result below is an extension of Theorem 2 to this setting.

If is an isometry from to then the operator maps into

We wish to estimate how much differs from where denotes the geometric wavelet scattering transform on However, the difference is not well-defined since is a countable collection of functions defined on and is a collection of functions defined on Therefore, we let be a second isometry from to and estimate the quantity We will give a proof in Appendix F.

Theorem 4.

Let be isometries and assume the low-pass filters and satisfy and . Then there is a constant such that

For shape matching tasks in which two isometric manifolds and should be identified as the same shape, we let and use (10) to carry out the computation.

4 Numerical results

In this section, we describe two numerical experiments to illustrate the utility of the geometric wavelet scattering transform. We consider both traditional geometric learning tasks, in which we compare to other geometric deep learning methods, as well as limited training tasks in which the unsupervised nature of the transform is particularly useful. In the former set of tasks, empirical results are not state-of-the-art, but they show the geometric scattering model is a good mathematical model for geometric deep learning. Specifically, in Section 4.1 we classify signals, corresponding to digits, on a fixed manifold, the two-dimensional sphere. Then, in Section 4.2 we classify different manifolds which correspond to ten different people whose bodies are positioned in ten different ways. The back-end classifier for all experiments is an RBF kernel SVM.

4.1 Spherical MNIST

In the first experiment, we project the MNIST dataset from Euclidean space onto a two dimensional sphere using a triangle mesh with 642 vertices. During the projection, we generate two datasets consisting of not rotated (NR) and randomly rotated (R) digits. Using the NR spherical MNIST database, we first investigate in Figure 5(a) the power of the globally invariant wavelet scattering coefficients for different network depths with . We observe increasing accuracy but with diminishing returns across the range . Then on both the NR and R spherical MNIST datasets, we calculate the geometric scattering coefficients for and . Other values of are also reported in Appendix G. From Theorem 3, we know the scattering transform is stable to randomly generated rotations and Table (b) shows the scattering coefficients capture enough rotational information to correctly classify the digits.

(a) Non-rotated spherical MNIST classification using for different network depths . Depth obtains % classification accuracy.
Figure 5: Spherical MNIST classification results.

Model NR R S2CNN cohen:sphericalCNNs2018 FFS2CNN kondor:clebsch-gordanNets2018 Method from DBLP:s2cnn_ungrid N/A Haar wavelet scattering chen:scatHaar2014 N/A Geometric scattering

(b) Spherical MNIST classification with not rotated (NR) and rotated (R) datasets. Note that cohen:sphericalCNNs2018 ; kondor:clebsch-gordanNets2018 ; DBLP:s2cnn_ungrid utilize fully learned filters specifically designed for the sphere.

4.2 FAUST

The FAUST dataset Bogo:CVPR:2014 contains ten poses from ten people resulting in a total of 100 manifolds represented by triangle meshes. We first consider the problem of classifying poses. This task requires globally invariant features, and thus we compute the geometric wavelet scattering transform with . Following the common practice of other geometric deep learning methods (see e.g. Litany2017 ; Lim2018 ), we use 352 SHOT features tombari2010unique ; bshot_iros2015 as initial node features . We used 5-fold cross validation for the classification tests with nested cross validation to tune hyper-parameters, including the network depth . As indicated in Table 1, we achieve 95% overall accuracy using the geometric scattering features, compared to 92% accuracy achieved using only the integrals of SHOT features (i.e., restricting to ). We note that DBLP:MasciBBV15 also considered pose classification, but the authors used a different training/test split (50% for training and 50% for test in a leave-one-out fashion), so their result is not directly comparable to ours.

As a second task, we attempt to classify the people. This task is even more challenging than classifying the poses since some of the people are very similar to each other. We again performed 5-fold cross-validation, with each fold containing 2 poses from each person to ensure the folds are evenly distributed. As shown in Table 1, we achieved 81% accuracy on this task compared to the 61% accuracy using only integrals of SHOT features.

Task/Model SHOT only Geometric scattering Pose classification Person classification

Table 1: Manifold classification on FAUST dataset with two tasks

5 Conclusion

We have constructed a geometric version of the scattering transform on a large class of Riemannian manifolds and shown this transform is non-expansive, invariant to isometries, and stable to diffeomorphisms. Our construction uses the spectral decomposition of the Laplace-Beltrami operator to construct a class of spectral filtering operators that generalize convolution on Euclidean space. While our numerical examples demonstrate geometric scattering on two (or three) dimensional manifolds, our theory remains valid for manifolds of any dimension and can therefore be extended and applied to higher-dimensional manifolds in future work. Finally, our construction provides a mathematical framework that enables future analysis and understanding of geometric deep learning.


Appendix A Proof of Theorem 1

We will prove a result that generalizes Theorem 1 to isometries between different manifolds. This more general result will be needed in order to prove Theorem 4.

Before stating our more general result, we introduce some notation. Let and be smooth compact connected Riemannian manifolds without boundary, and let be an isometry. Since and are isometric, their Laplace-Beltrami operators and have the same eigenvalues, and we enumerate the eigenvalues of (and also of ) in increasing order (repeating those with multiplicity greater than one) as If is a spectral filter, then by definition, whenever Therefore, there exists a well-defined function (also denoted by in a slight abuse of notation) defined on the set of distinct eigenvalues of given by

Therefore, we see that we can write the kernel defined in (2), as

and we define an operator on which we consider the analogue of as integration against the kernel

where is an orthonormal basis of eigenfunctions on with With this notation, we may now state a generalized version of Theorem 1. Theorem 1 can be recovered by setting

Theorem 5.

Let be an isometry. Then for every spectral filter and every


For let be the operator which projects a function onto the corresponding eigenspace and let be the analogous operator defined on Since forms an orthonormal basis for , we may write as integration against a kernel:



As noted in the beginning of this section, since is a spectral filter there is a well-defined function (also denoted by ) defined on by whenever Therefore, recalling the definition of from (2), we have that

From this it follows that

Likewise, by the same argument, we see that

Therefore, by the linearity of it suffices to show that

for all and all Let and write

where Since is an isometry, we have and Therefore,

as desired. ∎

Appendix B Proof of Proposition 1

Proposition 1.

If satisfies (3), then , is an isometry, i.e.,



Analogously to Parseval’s theorem, it follows from the Fourier inversion formula (1) and the fact that is an orthonormal basis, that

Similarly, it follows from (2) that

Therefore, using the Littlewood-Paley condition (3), we see

Appendix C Proof of Proposition 2

Proposition 2.

For any , is an isometry, i.e.,



We will show that the frame satisfies the Littlewood-Paley condition (3), i.e. that

The result will then follow from Proposition 1. Recall that is defined by for some non-negative, non-increasing function such that . Therefore, from (4), we see that

and so,

Appendix D Proof of Theorem 2

Theorem 2.

Let and . Then there is a constant such that


In order to prove Theorem 2, we will need to introduce the -step scattering propagator, which analogously to (9) is defined by,

Note that by definition