Virgo: Scalable Unsupervised Classification of Cosmological Shock Waves

08/14/2022
by Max Lamparth, et al.

Cosmological shock waves are essential to understanding the formation of cosmological structures. To study them, scientists run computationally expensive high-resolution 3D hydrodynamic simulations. Interpreting the simulation results is challenging because the resulting data sets are enormous, and the shock wave surfaces are hard to separate and classify because of their complex morphologies and the intersection of multiple shock fronts. We introduce a novel pipeline, Virgo, that combines physical motivation, scalability, and probabilistic robustness to tackle this unsolved unsupervised classification problem. To this end, we employ kernel principal component analysis with low-rank matrix approximations to denoise data sets of shocked particles and create labeled subsets. We then perform supervised classification with stochastic variational deep kernel learning to recover full data resolution. We evaluate on three state-of-the-art data sets of varying complexity and achieve good results. The proposed pipeline runs automatically, has only a few hyperparameters, and performs well on all tested data sets. Our results are promising for large-scale applications, and we highlight the future scientific work that the pipeline now enables.
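
To make the denoising step more concrete, the following is a minimal sketch of kernel principal component analysis with a low-rank kernel approximation. It is not the authors' implementation: the abstract does not say which low-rank method Virgo uses, so the Nyström approximation, the RBF kernel, the function names, and all parameter values here are assumptions chosen for illustration, and the data are synthetic blobs rather than shocked particles.

```python
import numpy as np


def rbf_kernel(a, b, gamma):
    """RBF kernel matrix between the rows of a and the rows of b."""
    d2 = (a ** 2).sum(1)[:, None] + (b ** 2).sum(1)[None, :] - 2.0 * a @ b.T
    return np.exp(-gamma * np.maximum(d2, 0.0))


def nystroem_kpca(x, n_landmarks, n_components, gamma, rng):
    """Approximate kernel PCA using a Nystroem low-rank feature map.

    Assumption: Nystroem is one common low-rank choice; the paper's
    actual approximation may differ.
    """
    # Pick a random subset of points as landmarks (hypothetical choice).
    idx = rng.choice(len(x), size=n_landmarks, replace=False)
    landmarks = x[idx]
    k_mm = rbf_kernel(landmarks, landmarks, gamma)  # m x m
    k_nm = rbf_kernel(x, landmarks, gamma)          # n x m

    # K is approximated by K_nm K_mm^{-1} K_mn, so the explicit
    # feature map is Phi = K_nm K_mm^{-1/2}.
    evals, evecs = np.linalg.eigh(k_mm)
    keep = evals > 1e-8 * evals.max()  # drop near-singular directions
    phi = k_nm @ (evecs[:, keep] / np.sqrt(evals[keep]))

    # Centre the features, then run ordinary PCA on them via SVD.
    phi = phi - phi.mean(axis=0)
    _, _, vt = np.linalg.svd(phi, full_matrices=False)
    return phi @ vt[:n_components].T


# Synthetic stand-in data: two well-separated 3D blobs.
rng = np.random.default_rng(0)
points = np.vstack([
    rng.normal(0.0, 0.3, size=(100, 3)),
    rng.normal(5.0, 0.3, size=(100, 3)),
])
embedding = nystroem_kpca(points, n_landmarks=20, n_components=2,
                          gamma=0.5, rng=rng)
```

The cost is dominated by the n x m kernel block rather than the full n x n kernel matrix, which is the property that makes this kind of approximation attractive for very large particle sets. In a pipeline like the one described, the low-dimensional embedding would then be clustered to produce labeled subsets for the supervised stage.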


research
07/12/2023

Deep Unrolling for Nonconvex Robust Principal Component Analysis

We design algorithms for Robust Principal Component Analysis (RPCA) whic...
research
03/01/2016

Scalable Metric Learning via Weighted Approximate Rank Component Analysis

We are interested in the large-scale learning of Mahalanobis distances, ...
research
06/08/2019

Study of Compressed Randomized UTV Decompositions for Low-Rank Matrix Approximations in Data Science

In this work, a novel rank-revealing matrix decomposition algorithm term...
research
01/02/2017

Towards multiple kernel principal component analysis for integrative analysis of tumor samples

Personalized treatment of patients based on tissue-specific cancer subty...
research
10/30/2017

Communication-Avoiding Optimization Methods for Massive-Scale Graphical Model Structure Learning

Undirected graphical models compactly represent the structure of large, ...
research
07/13/2018

Generalized Simultaneous Component Analysis of Binary and Quantitative data

In the current era of systems biological research there is a need for th...
research
08/31/2020

Relationship-aware Multivariate Sampling Strategy for Scientific Simulation Data

With the increasing computational power of current supercomputers, the s...
