1 Introduction
The motivation behind the organization of the ICLR 2021 Computational Geometry and Topology challenge was twofold: first, to push forward the fields of computational differential geometry and topology; and second, to foster reproducible research in mathematics by encouraging the use, development and maintenance of open-source repositories.
Reproducible Research
The reproducibility of a (computational) experiment is widely regarded as a requirement to establish a scientific claim or to demonstrate the applicability of a technology. Different authors have suggested different levels of reproducibility (Dalle, 2012; Stodden et al., 2013; Schnell, 2018). These levels range from repeatability – the ability of the same team to repeat the same experiment with the same methodological setup – to the stronger notion of reproducibility – the ability of a different team to reproduce the results with a different methodological setup. In computational and mathematical sciences, such a “methodological setup” refers to the software environment, the data, and the raw code.
Despite many valuable initiatives to improve reproducibility in science, minimum standards are rarely met – even in the mathematical sciences (David Redish et al., 2018). This so-called reproducibility crisis has led researchers, funding agencies, politicians, and the wider public to question the reliability of the scientific enterprise. Among other attempts to address the crisis, the “ICERM Workshop on Reproducibility in Computational and Experimental Mathematics (2012)” laid out several recommendations to the research community. In particular, the workshop committee claimed that “appropriate tools” should be taught as standard operating procedure in relation to computational aspects of research (Stodden et al., 2013).
Computational notebooks
What are today’s “appropriate tools” in modern computational and mathematical sciences? The most commonly used tools may not be the most appropriate anymore. The publication process has barely evolved since the 17th century, as the research paper (or its electronic form) still represents the principal means of diffusion for mathematical ideas. As James Sommers writes: “Scientific results today are as often as not found with the help of computers. […] And yet by far the most popular tool we have for communicating these results is the PDF – literally a simulation of a piece of paper. Maybe we can do better.”
Many new “appropriate tools” are available to the mathematical sciences to go beyond the research paper as the sole diffusion format. Among them, the computational notebook is an excellent candidate that can complement the traditional PDF paper (Oakes et al., 2019). Even a very theoretical mathematical paper could be complemented by a computational notebook that would: (i) use symbolic computation software to automatically check equations, (ii) leverage packages to check the veracity of a theorem on specific examples, (iii) provide interactive visualizations of special cases for the theoretical concepts exposed. This computational notebook could also be run automatically by the editorial board of the corresponding journal, hence relieving some aspects of the review process and directly fostering reproducibility.
Open-source packages
Computational notebooks would ideally rely heavily on a shared implementation of the core concepts of a given field of mathematics. This implementation would take the form of a piece of open-source software, whose computations would be constantly checked by its community. As such, well-maintained open-source software and computational notebooks represent the foundational “appropriate tools” that can foster reproducibility in mathematical research and, with it, improve the efficiency and reliability of the research enterprise. Many open-source packages have made code available to foster reproducible research in their respective fields of mathematics.
In the field of differential geometry, we find Pygeometry (Censi, 2012), Manopt (Boumal et al., 2013), Pyquaternion (Wynn, 2014), Pyriemann (Barachant, 2015), PyManopt (Townsend et al., 2016), TheanoGeometry (Kühnel and Sommer, 2017), Geoopt (Kochurov et al., 2019), Geomstats (Miolane et al., 2020), the SageManifolds project within the package SageMath (Developers et al., 2020), TensorFlow Manopt (Smirnov, 2021), and JuliaManifolds (Axen et al., 2021), among others. In the field of topology, we find Perseus (Nanda, 2012), Dipha (Kerber et al., 2014), Javaplex (Tausz et al., 2014), TDA (Fasy et al., 2015), Dionysus (Morozov, 2015), Eirene (Henselman and Ghrist, 2016), PHAT (Bauer et al., 2017), the Topology ToolKit TTK (Tierny et al., 2018), RedHom (Brendel et al., 2019), Scikit-TDA (Saul and Tralie, 2019), Giotto-TDA (Tauzin et al., 2020), HomCloud (Obayashi et al., 2020), Diamorse (Robins and Delgado-Friedrichs, 2020), Gudhi (The GUDHI Project, 2021), GDA-public (GeomData, 2021), and Ripser (Bauer, 2021), to cite a few.
Despite the existence of these packages, computational notebooks do not always accompany the submission of a mathematical research paper. Furthermore, recruiting maintainers to ensure the validity of the code on these platforms is often difficult. Both issues can be explained by a lack of incentives in the associated scientific communities.
Incentives
Computational notebooks and open-source software – such as the ones referenced in the previous paragraph – are gaining popularity in several fields of research, for example within machine learning communities. However, they might still be underused in the mathematical sciences for three main reasons. First, traditional mathematical training rarely introduces notebooks and software engineering as part of the curriculum. In differential geometry for example, textbooks may lack coding exercises or an associated interactive library. As a result, mathematicians do not necessarily master the tools available to use or write code associated with their findings. Second, many fields of mathematics lack a reference platform, such as a designated software package, where researchers can share their computations and together contribute to their field. Third, there are only a few incentives that motivate junior researchers in the mathematical sciences to learn good practices. The “publish-or-perish” pressure can make it difficult for junior researchers to consider taking additional time to (learn to) implement and share their results. As a consequence, and specifically in differential geometry and topology, it can be challenging to reproduce results, even if they were produced by the same team.
Computational Geometry and Topology Challenge
The ICLR 2021 Computational Geometry and Topology Challenge aimed to address these issues by encouraging researchers to delve into open-source implementations of differential geometry and topology. The participants were asked to create computational notebooks using the open-source software Geomstats (Miolane et al., 2020) and Giotto-TDA (Tauzin et al., 2020). The goal was to showcase some of the aforementioned “appropriate tools” for modern research in the mathematical sciences. The participants of the challenge were rewarded by the publication of the present paper and with prizes for the three winning teams.
Outline and contributions
The remainder of this paper is organized as follows. Section 2 introduces the setup and guidelines of the challenge. Section 3 summarizes the submissions to the challenge. Section 4 presents the main features used by the participants within the packages Geomstats and Giotto-TDA. Section 5 presents the limitations of the packages as reported by the participants. Section 6 provides a list of new features proposed by the participants that aim to enhance current implementations of computational geometry and topology. Section 7 describes the challenge’s evaluation process and gives the final ranking of the submissions to the challenge.
2 Setup of the challenge
The challenge was held in conjunction with the workshop “Geometric and Topological Representation Learning” of the International Conference on Learning Representations (ICLR) 2021.
Guidelines
The participants were asked to submit a Jupyter Notebook to provide “the best data analysis, computational method, or numerical experiment relying on state-of-the-art geometric and topological Python packages”: Geomstats and Giotto-TDA. The participants submitted their Jupyter Notebooks via Pull Requests (PR) to the GitHub repository of the challenge (https://github.com/geomstats/challenge-iclr-2021). Teams were accepted and there was no restriction on the number of team members. The current principal developers of Geomstats and Giotto-TDA, i.e. the coauthors of the published papers (Miolane et al., 2020; Tauzin et al., 2020), were not allowed to participate.
Each submission was requested to respect the following structure: (i) Introduction and motivation, (ii) Analysis/Experiment, (iii) Benchmark, (iv) Limitations and perspectives. The guidelines also gave examples of possible submissions:

Data analysis with geometric and topological methods,

Implementation of a research paper with Geomstats/Giotto-TDA,

Implementation of a feature to merge into Geomstats/Giotto-TDA codebases,

Implementation of visualization tools for Geomstats/Giotto-TDA,

Benchmarking/profiling on geometric and topological methods against other methods for a public dataset.
This list was complemented by the submission-example-* folders on the GitHub repository, to help participants understand the packages and design their submission.
Evaluation criterion: fostering creativity
The evaluation criterion was: “how does the submission help push forward the fields of computational geometry and topology?”. The submissions were ranked according to this evaluation criterion, through a voting procedure relying on the Condorcet method, see Section 7.
The choice of this evaluation criterion was motivated by several reasons. First, the criterion did not require participants to submit novel research. The main focus was the implementation, which could for instance be the reproduction of published research. Such a criterion can allow the participants to focus on producing clean code and to provide a hands-on explanation of the mathematical concepts at hand.
This criterion also did not bias participants towards showcasing “positive results” such as a new method beating the state of the art. “Negative results” were considered just as valuable as positive results. In particular, submissions criticizing the available packages, or showing examples where geometric and topological representations did not help the analysis were also significantly rewarded.
Lastly, this criterion encouraged participants to be creative. Most machine learning challenges are conducted by ranking the participants according to a quantitative metric on a test dataset. This can induce biases in the contributions of the participants, since methods that do not perform well on that specific metric are not rewarded. While they have many other advantages, such quantitative criteria may hide interesting research. In contrast, our evaluation criterion, relying on a voting system through the Condorcet method, was meant to also reward creative submissions.
Software engineering practices
The participants were also encouraged to use software engineering best practices. Their code had to be compatible with Python 3.8 and make an effort to respect the Python style guide PEP8. The Jupyter notebooks were automatically tested when a Pull Request was submitted and the tests were required to pass. If a dataset was used, the dataset had to be public and referenced. Participants could raise GitHub issues and/or request help or guidance at any time through the Geomstats Slack and the Giotto-TDA Slack. Help and guidance were provided subject to the availability of the maintainers.
3 Submissions to the Challenge
Sixteen teams submitted code before the deadline and participated in the challenge. This section provides a summary of their submissions.
Noise Invariant Topological Features
This submission analyzes the topological structure of data while remaining robust to various data corruptions. Examples of perturbed data are noisy point clouds, photos taken from different views, or dynamic modeling. This submission showcases the pipeline for extracting Perturbed Topological Signatures (PTS) by using Geomstats and Giotto-TDA (Som et al., 2018). The topological properties are studied by using distance metrics and kernels defined on the Stiefel and Grassmann manifolds (Edelman et al., 1999; Hamm and Lee, 2009; Som et al., 2018). Experiments are performed on three datasets: SHREC 2010 (Veltkamp et al., 2010), Princeton COS429 (COS429, 2009), and MNIST (LeCun et al., 2010).
Estimators of Means of Symmetric Positive Matrices
This submission investigates estimators of means of Symmetric Positive Definite (SPD) matrices. In the first notebook, the efficiency of the recursive estimation of the Karcher mean (Ho et al., 2013), and of the K-means algorithm relying on it, is benchmarked and improved. In the second notebook, the Shrinkage Estimator is implemented (Yang et al., 2020), and the notebook shows how it improves on the maximum likelihood estimator. Experiments rely on synthetic datasets on the manifold of SPD matrices using sampling methods on manifolds (Schwartzman, 2016).
Visualization of Kendall Shape Spaces for Triangles
This submission introduces visualization methods for Kendall shape spaces of triangles. An object’s shape can be described by locating relevant points on it, called landmarks (Dryden and Mardia, 2016). The Kendall shape space of triangles in dimension $m$ is the space of triangles in $\mathbb{R}^m$ quotiented by the group of rotations, translations and dilatations of $\mathbb{R}^m$ (Kendall, 1984; Le and Kendall, 1993). This submission presents two new visualization methods of these Kendall shape spaces, demonstrates their use and compares them with an alternative visualization method for this dataset: the non-exact visualization of multidimensional scaling (MDS). The experiments are performed on synthetic data of triangles in $\mathbb{R}^m$ for $m = 2, 3$, and on the dataset of optic nerve head shapes from Geomstats’ datasets module.
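To make the quotient construction concrete, the following is a minimal sketch for planar triangles only, and it is not the submission's code. It assumes the standard complex-number representation of 2D landmarks, in which translation is removed by centering, scale by normalizing, and rotation by taking the modulus of the Hermitian inner product. The function names `preshape` and `kendall_distance_2d` are hypothetical.

```python
import math


def preshape(triangle):
    """Center a planar triangle and scale it to unit norm.

    `triangle` is a list of three (x, y) landmarks. Degenerate triangles
    whose landmarks all coincide are not handled in this sketch.
    """
    z = [complex(x, y) for x, y in triangle]
    centroid = sum(z) / len(z)
    z = [p - centroid for p in z]          # remove translation
    norm = math.sqrt(sum(abs(p) ** 2 for p in z))
    return [p / norm for p in z]           # remove scale


def kendall_distance_2d(tri_a, tri_b):
    """Distance between the shapes of two planar triangles.

    Taking the modulus of the Hermitian inner product removes the rotation.
    """
    za, zb = preshape(tri_a), preshape(tri_b)
    inner = sum(a.conjugate() * b for a, b in zip(za, zb))
    return math.acos(min(1.0, abs(inner)))


def similarity(triangle, angle, scale, shift):
    """Rotate, scale and translate a triangle (used for the check below)."""
    c, s = math.cos(angle), math.sin(angle)
    return [(scale * (c * x - s * y) + shift[0],
             scale * (s * x + c * y) + shift[1]) for x, y in triangle]


right = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
equilateral = [(0.0, 0.0), (1.0, 0.0), (0.5, math.sqrt(3) / 2)]
moved = similarity(right, angle=0.7, scale=3.0, shift=(2.0, -1.0))

same_shape = kendall_distance_2d(right, moved)
different_shape = kendall_distance_2d(right, equilateral)
```

Here `same_shape` is numerically zero because the two triangles differ only by a rotation, a translation and a dilatation, whereas `different_shape` is strictly positive.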
Map your Topology to Different Geometries
This submission implements a method to map a set of points from one geometry of choice onto another while preserving the topology. In the context of this notebook, a “geometry” refers to a Riemannian manifold such as Euclidean space, Hyperbolic space, Hypersphere, manifold of Symmetric Positive Definite (SPD) matrices, among others. The method uses gradient descent on Riemannian manifolds with a loss function introduced in (Moor et al., 2020) that has been used in Deep Learning (Moor et al., 2020; Gabrielsson et al., 2020). Experiments are run on synthetic data generated on the Euclidean plane, the sphere and the Poincaré ball.
Naive Image Anomaly Detection on Fashion MNIST
This submission evaluates the possibility of achieving anomaly detection (AD) in image databases with naive distances to centroids and norms using Euclidean and Riemannian representations. The notebook considers simple AD setups where the objective is to discriminate between two classes of the Fashion MNIST dataset (Xiao et al., 2017). A general approach to embed images into the space of covariance matrices is introduced based on (Calvo and Oller, 1990). The best performances are achieved by the method relying on the norm of the negated geodesic principal component analysis (PCA) with the Fréchet mean as PCA base point (Rippel et al., 2020), using the Log-Euclidean Riemannian metric.
Shape Analysis of Bone Cancer Cells
This submission studies osteosarcoma (bone cancer) cells and the impact of drug treatment on their morphological shapes. The analysis uses cell images obtained from fluorescence microscopy. The corresponding dataset has been added to Geomstats’ datasets module by the participants. Cell shapes are modelled as discrete (open) curves. The submission uses the Riemannian elastic metric on discrete curves to compare cell shapes (Jermyn et al., 2011). The biological assumption is that such measures of irregularity and spreading of cells allow accurate classification and discrimination between cancer cell lines treated with different drugs (Elaheh et al., 2019). The submission studies to what extent this Riemannian metric can detect how the cell shape is associated with the response to treatment.
Repurposing Peptide Inhibitors for SARS-CoV-2 Spike Protein
This submission develops an approach combining physicochemical parameter analysis and topological featurization to train robust one-class classifiers to predict protein-protein interactions (PPIs). PPIs form the molecular basis of processes that equally sustain life and drive development of disease, such as SARS-CoV-2. Peptides have garnered therapeutic interest due to their potential to disrupt clinically-relevant PPIs, in addition to their synthetic accessibility and better targeting modalities (Tsomaia, 2015; Mohapatra et al., 2020; Schissel et al., 2020). The submission uses the top-performing model to screen the peptides in the current dataset against the SARS-CoV-2 receptor binding domain protein. The Peptide Binding DataBase (PepBDB) is used for model training (Wen et al., 2018).
Shape analysis with skeletal models and Principal Nested Spheres
This submission considers anatomical shape analysis with skeletal representations (s-reps) (Liu et al., 2021) and Principal Nested Spheres (PNS) (Jung et al., 2012; Kim et al., 2020). The s-rep of a given shape consists of the shape’s skeleton and two functions defined on the skeleton: a radial vector field and a radius function. PNS is a manifold learning method that addresses the non-Euclidean properties of shape data. PNS fits a hierarchy of submanifolds – subspheres – to some input data. The notebook applies this method to s-reps of toy data and to the classification problem of the hand skeleton shape dataset available in Geomstats’ datasets module, comparing s-reps and PNS to Euclidean and Riemannian alternatives from the literature. The best classification performance is obtained by using the Kendall Riemannian metric (Le and Kendall, 1993) on the hand skeleton shapes.
Riemannian mean-shift algorithm
This submission implements a Riemannian version of the mean-shift algorithm (Subbarao and Meer, 2009; Caseiro et al., 2012). Classic (Euclidean) mean shift works by sliding a window (a ball whose radius is called “bandwidth”) over the dataset, iteratively adjusting the center of the window until convergence to the estimated mode of the data. Mean shift is used for clustering, with several advantages over K-means. This notebook implements the method and shows its applicability on toy datasets on the sphere and hyperbolic plane.
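The window-sliding iteration described above can be sketched on the unit sphere as follows. This is an illustrative reimplementation under stated assumptions, not the submission's code: it uses a flat kernel (every point inside the window has equal weight), it replaces the Euclidean average by an average of logarithm-map vectors followed by an exponential-map step, and it assumes a bandwidth smaller than pi so that antipodal points never enter the window. All function names are hypothetical.

```python
import math


def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def _norm(u):
    return math.sqrt(_dot(u, u))


def sphere_dist(p, q):
    """Geodesic (great-circle) distance between two unit vectors."""
    return math.acos(max(-1.0, min(1.0, _dot(p, q))))


def sphere_log(base, point):
    """Tangent vector at `base` pointing to `point`, with geodesic length."""
    cos_t = max(-1.0, min(1.0, _dot(base, point)))
    ortho = [p - cos_t * b for p, b in zip(point, base)]
    n = _norm(ortho)
    if n < 1e-12:  # point coincides with base (antipodes are excluded upstream)
        return [0.0] * len(base)
    theta = math.acos(cos_t)
    return [theta * o / n for o in ortho]


def sphere_exp(base, tangent):
    """Follow the geodesic from `base` in direction `tangent`."""
    t = _norm(tangent)
    if t < 1e-12:
        return list(base)
    point = [math.cos(t) * b + math.sin(t) * v / t for b, v in zip(base, tangent)]
    n = _norm(point)
    return [c / n for c in point]  # renormalize against round-off


def mean_shift_mode(points, start, bandwidth, max_iter=100, tol=1e-10):
    """Slide a geodesic ball of radius `bandwidth` until its center stabilizes."""
    center = list(start)
    for _ in range(max_iter):
        window = [p for p in points if sphere_dist(center, p) <= bandwidth]
        if not window:
            break
        logs = [sphere_log(center, p) for p in window]
        shift = [sum(c) / len(logs) for c in zip(*logs)]  # mean-shift vector
        center = sphere_exp(center, shift)
        if _norm(shift) < tol:
            break
    return center


# Toy data: four points placed symmetrically around the north pole.
a = 0.1
points = [
    [math.sin(a), 0.0, math.cos(a)],
    [-math.sin(a), 0.0, math.cos(a)],
    [0.0, math.sin(a), math.cos(a)],
    [0.0, -math.sin(a), math.cos(a)],
]
start = [math.sin(0.3), 0.0, math.cos(0.3)]
mode = mean_shift_mode(points, start, bandwidth=0.6)
```

On this toy dataset the window center moves from `start` to the north pole, which is the mode by symmetry. Running the procedure from every data point and grouping the points by the mode they reach yields the clustering.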
Intrinsic Disease Maps Using Persistent Cohomology
This submission uses persistent cohomology to investigate and visualize two infectious disease progression datasets: physiological data on Malaria in mice (Cumnock et al., 2018) and humans (Torres et al., 2016), and data on Hepatitis C in humans (Rosenberg et al., 2018). The submission reiterates the work of (Daniel Amin, 2021) and computes circular coordinates using the methodology introduced in (de Silva and Vejdemo-Johansson, 2009). The generated circular coordinate function provides an intrinsic disease phase coordinate that maps out the disease progression in the full data space.
Neural Sequence Distance Embeddings
This submission presents Neural Sequence Distance Embeddings (NeuroSEED), a general framework to embed biological sequences in geometric vector spaces that reflect their evolutionary distance. The notebook illustrates the effectiveness of the hyperbolic space, which captures the hierarchical structure and provides an average 38% reduction in embedding RMSE against the best competing geometry. The capacity of the framework and the significance of these improvements are then demonstrated by devising supervised and unsupervised NeuroSEED approaches to multiple core tasks in bioinformatics. Benchmarked against common baselines, the proposed approaches display significant accuracy and/or runtime improvements on real-world datasets (Clemente et al., 2015; Zheng et al., 2019).
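As an illustration of what measuring distances in hyperbolic space involves, the sketch below evaluates the standard closed-form geodesic distance of the Poincaré ball model with curvature -1. It is not taken from the submission; the function name `poincare_distance` and the choice of pure-Python lists for embeddings are assumptions made for the example.

```python
import math


def poincare_distance(u, v):
    """Geodesic distance between two points of the open unit ball.

    Uses d(u, v) = arcosh(1 + 2 |u - v|^2 / ((1 - |u|^2) (1 - |v|^2))).
    """
    sq_u = sum(c * c for c in u)
    sq_v = sum(c * c for c in v)
    if sq_u >= 1.0 or sq_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.acosh(1.0 + 2.0 * sq_diff / ((1.0 - sq_u) * (1.0 - sq_v)))


origin = [0.0, 0.0]
near = [0.5, 0.0]
far = [0.9, 0.0]

d_near = poincare_distance(origin, near)   # equals log(3)
d_far = poincare_distance(origin, far)     # equals log(19)
```

Distances grow without bound as points approach the boundary of the ball, which is the property that lets tree-like, hierarchical data be embedded with low distortion.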
Analyzing Representative Cycles for Persistent Homology
This submission aims to simplify the use of cycles for the analysis of persistent homology. The persistence diagram is often the only representation that software packages for TDA provide to visualize persistent homology information. Visualizing where each homology class appeared in the domain space can be very challenging for a user. This submission provides a collection of functions to simplify the interactive visualization and analysis of the homology class by enriching the information contained in the persistence diagram with cycles. Cycles are computed with an external library (Iuricich, 2020) which uses Discrete Morse theory (Robins et al., 2011; Mischaikow and Nanda, 2013) to achieve scalability. The notebook demonstrates the use of cycles on a subset of the MNIST dataset (LeCun et al., 2010) and provides an overview of applications using the direct visualization of cycles for exploratory data analysis.
Investigating CNN weights with Giotto Vectorization
This submission provides a topological analysis of convolutional neural network (CNN) weights. Transforming each layer to a new auxiliary space predicts network properties on a non-trivial supervised classification task. The notebook uses the Small CNN Zoo dataset (Unterthiner et al., 2021), shows how to compute the persistence diagrams of the auxiliary space, vectorizes the diagrams into Silhouettes (Chazal et al., 2014), and finally runs several regression and classification experiments. The results are particularly encouraging in terms of anomaly detection.
Brain Connectomes Comparison using Geodesic Distances
This submission investigates the performance of geodesic distances on manifolds to assess brain connectome similarity between pairs of twins in terms of their structural networks at different network resolutions. The notebook uses the brain structural connectomes of 412 human subjects in five different resolutions and two edge weights (Kerepesi et al., 2016). The notebook investigates the performance of geodesic distances on manifolds and compares them with Euclidean distances within a Wilcoxon rank sum nonparametric test (Cuzick, 1985).
Fuzzy c-Means Clustering for Persistence Diagrams and Riemannian Manifolds
This submission implements Fuzzy c-Means clustering for persistence diagrams and Riemannian manifolds (Davies et al., 2020). Many real world problems are fuzzy; that is, data points can have partial membership to several clusters, rather than a single “hard” labelling to only one cluster (Campello, 2007). The notebook describes the fuzzy c-means algorithm, highlights the convergence results, and demonstrates fuzzy clustering on two simple datasets.
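The partial-membership idea can be sketched with the classical fuzzy c-means updates on one-dimensional Euclidean data. This is an assumption-laden simplification rather than the submission's implementation: on persistence diagrams or Riemannian manifolds the absolute difference would be replaced by the appropriate distance and the weighted average by a weighted Fréchet mean. The function name `fuzzy_c_means_1d` and the fuzzifier default `m=2.0` are choices made for this example.

```python
def fuzzy_c_means_1d(data, centers, m=2.0, n_iter=50):
    """Alternate membership and center updates for a fixed number of steps.

    Returns the final centers and the membership matrix (one row per data
    point, one column per cluster) computed in the last iteration.
    """
    centers = list(centers)
    power = 2.0 / (m - 1.0)
    memberships = []
    for _ in range(n_iter):
        memberships = []
        for x in data:
            dists = [abs(x - c) for c in centers]
            if 0.0 in dists:
                # The point sits exactly on a center: give it full membership there.
                hits = [d == 0.0 for d in dists]
                row = [1.0 / sum(hits) if h else 0.0 for h in hits]
            else:
                row = [1.0 / sum((di / dk) ** power for dk in dists) for di in dists]
            memberships.append(row)
        # Each center becomes the mean of the data weighted by membership ** m.
        centers = [
            sum(u[i] ** m * x for u, x in zip(memberships, data))
            / sum(u[i] ** m for u in memberships)
            for i in range(len(centers))
        ]
    return centers, memberships


data = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
centers, memberships = fuzzy_c_means_1d(data, centers=[1.0, 4.0])
```

Each row of `memberships` sums to one, so a point lying between two clusters is shared between them instead of being forced into a single label.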
Reweighting Vectors for Graph Convolutional Neural Networks via Poincaré Embedding and Persistence Images
This submission demonstrates how to incorporate local graph topological properties (e.g. connected components, cycles) into persistence enhanced graph neural networks (GNN) (Zhao et al., 2020) for graph and node classification tasks. The notebook converts unweighted graphs to weighted graphs by embedding them using the Poincaré ball model and using the resulting Riemannian distances as the weights. Then, the persistence images of the resulting weighted graphs are computed and the resulting matrix is used to reweight the GNN weights for enhanced performance.
4 Features used in the packages
This section presents the features used in both packages and the limitations outlined by the participants. The numbers in parentheses refer to the number of submissions in which a given feature has been used.
Features used in Geomstats
The differential geometric structures used in the submissions are the following: Hypersphere (), Kendall’s PreShapeSpace () with associated KendallShapeMetric (), the space of symmetric positive definite matrices SPDMatrices () with associated Bures-Wasserstein metric SPDMetricBuresWasserstein (), or alternative Riemannian metrics such as the SPDMetricAffine () and the SPDMetricLogEuclidean (), the Lie group of rotations SpecialOrthogonal (), the hyperbolic space in its PoincareBall () representation and associated PoincareBallMetric (), the space of Matrices (), the general class of RiemannianMetric (), the manifold of DiscreteCurves () with associated square-root velocity metric SRVMetric (), the Grassmannian () with the GrassmannianCanonicalMetric (), Stiefel () with the StiefelCanonicalMetric (). The algorithms of geometric statistics that have been used are: the estimation of the FrechetMean (), the tangent principal component analysis TangentPCA (), the RiemannianKMeans and the Riemannian version of the KNearestNeighborsClassifier. In terms of differential geometric datasets, the cell shapes dataset, the hand skeleton shape dataset and the optic nerve shape dataset were used. The visualization module was also used through its Sphere (), PoincareDisk (), KendallDisk (), KendallSphere (). The main features that have not been used are the mathematical structures and functions related to information geometry, such as the manifolds of Beta distributions, Dirichlet distributions and Normal distributions; and the more involved learning procedures such as the expectation_maximization or the KalmanFilter on Lie groups.
Features used in Giotto-TDA
The topological features used in the submissions are the following. VietorisRipsPersistence () was the most used class for computing persistent homology, followed by CubicalPersistence (). In the diagrams module, the computation of distance matrices between persistence diagrams via PairwiseDistance () was used, but less often than the vectorisation of persistence diagrams via PersistenceImage (), PersistenceEntropy (), NumberOfPoints (), Amplitude () and Silhouette (). Additionally, Scaler (), from the same module, was used. A few preprocessing utilities for images were explored, namely Binarizer () and the following classes for creating filtrations from 2D or 3D greyscale images: RadialFiltration (), DilationFiltration (), ErosionFiltration () and DensityFiltration (). The visualization module, plotting, was used through its functions plot_diagram () and plot_point_cloud (). Among the main modules that have not been used, we find the modules related to time series and curves.
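Two of the vectorisations named above reduce a persistence diagram to a single number, and their definitions are short enough to sketch directly. The code below is a plain-Python illustration of the usual formulas, not a call into Giotto-TDA: the function names are hypothetical, a diagram is assumed to be a list of (birth, death) pairs for one homology dimension, and the natural logarithm is used, whereas a library may choose another base or normalization.

```python
import math


def number_of_points(diagram):
    """Count the off-diagonal points, i.e. pairs with death > birth."""
    return sum(1 for birth, death in diagram if death > birth)


def persistence_entropy(diagram):
    """Shannon entropy of the normalized lifetimes of a diagram.

    Points on the diagonal have zero lifetime and are ignored.
    """
    lifetimes = [death - birth for birth, death in diagram if death > birth]
    total = sum(lifetimes)
    if total == 0:
        return 0.0
    return -sum((l / total) * math.log(l / total) for l in lifetimes)


# Two features of equal lifetime, plus one trivial point on the diagonal.
diagram = [(0.0, 1.0), (0.5, 1.5), (0.7, 0.7)]
n_points = number_of_points(diagram)
entropy = persistence_entropy(diagram)
```

With two features of equal lifetime the entropy equals log(2), its maximum for two points; a diagram dominated by one long-lived feature has an entropy close to zero.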
5 Limitations of the packages
Participants were asked to report on the limitations of the packages. This section provides a summary of their findings.
Limitations of Geomstats
First, some participants reported bugs. For example, projecting points from the Grassmannian or the Stiefel manifold to the tangent space failed, both with the default canonical metrics and with the participants’ custom metrics. Both preprocessing.ToTangentSpace and Grassmannian.to_tangent failed with the same issue. A similar problem was encountered while trying to project points from the manifold of DiscreteCurves onto its tangent space at the FrechetMean using the square-root velocity metric SRVMetric. In this case, the implementation of the FrechetMean itself was failing.
Second, participants did not find some implementations in the package or struggled to understand the existing code. For example, a participant reported that the metric for the HyperbolicSpace was missing, and another tried to use the abstract class RiemannianMetric without providing a definition of the inner product or a metric matrix. Another participant would have liked to use product manifolds and product metrics but did not realize that they were implemented in the library. Other participants wanted to use several backends but could not find a way to use both in one script:
os.environ["GEOMSTATS_BACKEND"] = "pytorch"; importlib.reload(geomstats.backend)
These issues come from a lack of completeness of the current documentation of the package, misleading error messages and possibly erroneous existing documentation. Other participants reported several problems with the current documentation, which could be improved with more detailed descriptions and an index with short summaries. Some classes, such as the PoincareBall, appear in tutorials but not on the documentation website.
Lastly, participants reported the lack of integration between the modules related to graph and hyperbolic spaces in Geomstats, and the formalism of modern libraries using graphs. In this case, a refactoring is needed to allow a better integration between geometric statistics through Geomstats and packages of geometric learning such as pytorch-geometric (Fey and Lenssen, 2019), networkx (Hagberg et al., 2008) and dgl (Wang et al., 2019).
Limitations of Giotto-TDA
First, some participants reported bugs. For example, computing persistence images on persistence diagrams formed by a single persistence pair outputs an empty persistence image.
Some participants pointed out that the rigid input requirements for PairwiseDistance – in particular, the fact that all persistence diagrams must formally have the same number of homology dimensions and of birth-death pairs in each homology dimension – can be limiting in applications. This indicates that a utility function for converting collections of persistence diagrams into a format accepted by PairwiseDistance should be added to the package. Possibly related to this, some participants surmised (incorrectly, in this case) that persistent homology transformers such as VietorisRipsPersistence cannot handle collections of point clouds of different cardinalities. This might indicate that the aforementioned tight requirements in PairwiseDistance are at odds with the more permissive character of many other components of Giotto-TDA.
The next reported limitation came from the architecture of the package itself. Some participants reported that the package’s high-level API did not allow for manipulations of attributes and usage of methods that are present in the objects that it runs underneath. For example, VietorisRipsPersistence runs using Ripser but does not allow one to access the cocycles that are otherwise accessible using Ripser’s API directly.
Lastly, in terms of performance, some participants reported that the computational runtime for PersistenceImage was very irregular on their data in comparison to the performance of e.g. Silhouette.
6 Proposed features for the packages
This section lists the features, suggested by the participants, that could be implemented in packages of computational geometry and topology such as Geomstats and Giotto-TDA. These are implementations that they would judge useful in order to push forward the fields of computational geometry and topology.
Proposed features for Geomstats
As the Stiefel and the Grassmann manifolds are becoming popular in the Machine Learning and Computer Vision community, more geometric features on these manifolds (such as various metrics) could offer powerful tools for solving a wide range of learning tasks.
In the same vein, symmetric positive definite (SPD) matrices are attracting more and more interest in the same communities. While Geomstats has a module that processes time series into SPD matrices, further preprocessing tools that map various data types to SPD matrices would be helpful. Trainable SPD representations would also be of interest. For instance, some participants specifically asked for an implementation of SPD neural networks, the so-called second-order neural networks.
The module on the geometry of discrete curves could also be improved in several ways. Geomstats only provides spaces of open curves, a restriction that is not clear from the documentation alone; implementing the space of closed curves would be a useful addition. Furthermore, only the elastic metric is implemented on these spaces of curves, whereas generalizations of this metric exist and could be of interest. We note that these last two structures have since been implemented in the library. Lastly, when studying shapes of curves, it is useful to quotient out not only the reparameterizations of the curve (done with the elastic metric), but also its rotations and translations. This is not obvious from the current documentation; mentioning this aspect, together with a recommendation to use the Kendall shape space module for this purpose, would be helpful.
Finally, the visualization module could be further improved by supporting more interactive visualizations and by adding a visualization of the space of (low-dimensional) SPD matrices that shows the differences between the various metrics on that space.
Proposed features for Giotto-TDA
First, participants suggested adding an implementation that computes the pairwise distance matrix of diagrams with different numbers of points in each homology dimension, without first requiring laborious manual padding. Some participants also suggested adding tools to compute persistent cohomology and, for example, circular coordinates.
For machine learning applications, some participants suggested using a tensor backend such as TensorFlow or PyTorch instead of NumPy.
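The manual padding mentioned above can be sketched as follows. This is an illustrative NumPy example rather than a Giotto-TDA function: the name `pad_diagrams` is hypothetical, each diagram is assumed to be an array of (birth, death, homology dimension) rows, and padding with diagonal points at birth = death = 0 is an assumption chosen because such points have zero persistence.

```python
import numpy as np


def pad_diagrams(diagrams):
    """Pad diagrams so they share the same number of points per dimension.

    Each diagram is an (n_points, 3) array of (birth, death, dimension) rows.
    Hypothetical helper for illustration only.
    """
    diagrams = [np.asarray(dgm, dtype=float).reshape(-1, 3) for dgm in diagrams]
    dims = sorted({int(q) for dgm in diagrams for q in dgm[:, 2]})
    # Largest number of points observed in each homology dimension.
    max_counts = {
        q: max(int(np.sum(dgm[:, 2] == q)) for dgm in diagrams) for q in dims
    }
    padded = []
    for dgm in diagrams:
        blocks = []
        for q in dims:
            points = dgm[dgm[:, 2] == q]
            # Fill the gap with zero-persistence points on the diagonal.
            filler = np.zeros((max_counts[q] - len(points), 3))
            filler[:, 2] = q
            blocks.append(np.vstack([points, filler]))
        padded.append(np.vstack(blocks))
    return np.stack(padded)


dgm_a = np.array([[0.0, 1.0, 0], [0.0, 2.0, 0], [0.5, 1.5, 1]])
dgm_b = np.array([[0.0, 3.0, 0]])
batch = pad_diagrams([dgm_a, dgm_b])  # shape (2, 3, 3)
```

Once all diagrams have the same shape they can be stacked into a single array, which is what a pairwise distance routine operating on batches would typically expect.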
7 Final ranking
This section provides the final ranking of the challenge’s submissions. The Condorcet method was used to rank the submissions based on the single evaluation criterion: “how does the submission help push forward the fields of computational geometry and topology?”
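The following sketch shows one way a Condorcet-style ranking can be computed from partial ballots. It is not the organizers' procedure, which is not detailed here. The assumptions are: each ballot lists a few distinct submissions in order of preference, every unranked submission is treated as tied below the ranked ones, and the final order uses Copeland scores (pairwise wins minus pairwise losses), one common Condorcet-consistent rule. All function names are hypothetical.

```python
from itertools import combinations


def pairwise_preferences(ballots, candidates):
    """Count, for each ordered pair (a, b), the ballots preferring a to b."""
    prefer = {(a, b): 0 for a in candidates for b in candidates if a != b}
    for ballot in ballots:
        position = {c: i for i, c in enumerate(ballot)}
        unranked = len(ballot)  # unranked candidates share the last position
        for a, b in combinations(candidates, 2):
            pos_a = position.get(a, unranked)
            pos_b = position.get(b, unranked)
            if pos_a < pos_b:
                prefer[(a, b)] += 1
            elif pos_b < pos_a:
                prefer[(b, a)] += 1
    return prefer


def copeland_ranking(ballots, candidates):
    """Rank candidates by pairwise wins minus losses; equal scores are ties."""
    prefer = pairwise_preferences(ballots, candidates)
    score = {c: 0 for c in candidates}
    for a, b in combinations(candidates, 2):
        if prefer[(a, b)] > prefer[(b, a)]:
            score[a] += 1
            score[b] -= 1
        elif prefer[(b, a)] > prefer[(a, b)]:
            score[b] += 1
            score[a] -= 1
    levels = sorted(set(score.values()), reverse=True)
    return [[c for c in candidates if score[c] == s] for s in levels]


candidates = ["A", "B", "C", "D"]
ballots = [["A", "B", "C"], ["A", "C", "B"], ["B", "A", "D"]]
ranking = copeland_ranking(ballots, candidates)
```

Each inner list of `ranking` is one rank, and a list with several entries corresponds to a tie.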
Each of the 16 teams had the opportunity to vote for the 3 best submissions. Each team received only one vote, even if there were several participants in the team. In addition, 8 external reviewers, chosen among Geomstats and Giotto-TDA core maintainers and all from different institutions, also voted for the 3 best submissions. The 3 preferences had to be different: for example, one could not select the same Jupyter Notebook for both first and second place. The submissions were anonymized and the votes remained secret; only the final ranking is published here. Ties are represented by bullet points in the ranking below.

1. Noise Invariant topological features
2. • Estimators of Means of Symmetric Positive Matrices
   • Visualization of Kendall Shape Spaces for Triangles
3. • Neural Sequence Distance Embeddings
   • Repurposing Peptide Inhibitors for SARS-CoV-2 Spike Protein
4. • Map your Topology to Different Geometries
   • Intrinsic Disease Maps Using Persistent Cohomology
5. Shape analysis with skeletal models and Principal Nested Spheres
6. • Riemannian mean-shift algorithm
   • Fuzzy c-Means Clustering for Persistence Diagrams and Riemannian Manifolds
7. • Naive Image Anomaly Detection on Fashion MNIST
   • Shape Analysis of Bone Cancer Cells
   • Brain Connectomes Comparison using Geodesic Distances
   • Investigating CNN weights with Giotto Vectorization
8. • Analyzing Representative Cycles for Persistent Homology
   • Reweighting Vectors for Graph Convolutional Neural Networks via Poincaré Embedding and Persistence Images

Regardless of this final ranking, we would like to stress that all the submissions were of very high quality. We warmly congratulate all the participants.
Author Contributions
Nina Miolane, Matteo Caorsi, Umberto Lupo, Marius Guerard and Nicolas Guigui led the organization of the challenge. Nina Miolane and Marius Guerard were responsible for the GitHub repository. Nina Miolane, Matteo Caorsi, Umberto Lupo, Marius Guerard, Nicolas Guigui, Johan Mathe, Yann Cabanes and Wojciech Reise were the external reviewers in the evaluation process. The remaining authors of this white paper were the participants of the challenge.
Acknowledgments
The authors would like to thank the organizers of the ICLR 2021 workshop “Geometrical and Topological Representation Learning” for their valuable support in the organization of the challenge, and specifically Bastian Rieck for his availability and help.
8 Conclusion
This white paper presented the motivations behind the organization of the “Computational Geometry and Topology Challenge” at the ICLR 2021 workshop “Geometrical and Topological Representation Learning”, and summarized the findings from the participants’ submissions.