HLC2: a highly efficient cross-matching framework for large astronomical catalogues on heterogeneous computing environments

01/18/2023
by Yajie Zhang, et al.

Cross-matching, the operation of finding corresponding data for the same celestial object or region in multiple catalogues, is indispensable to astronomical data analysis and research. Because of the large number of astronomical catalogues generated by ongoing and next-generation large-scale sky surveys, the time cost of cross-matching is increasing dramatically. Heterogeneous computing environments offer a theoretical possibility of accelerating cross-matching, but the performance advantages of heterogeneous computing resources have not been fully exploited. To meet the challenge of cross-matching a substantially increasing amount of astronomical observation data, this paper proposes the Heterogeneous-computing-enabled Large Catalogue Cross-matcher (HLC2), a high-performance cross-matching framework based on spherical position deviation for CPU-GPU heterogeneous computing platforms. It supports scalable and flexible cross-matching and can be directly applied to the fusion of large astronomical catalogues from survey missions and astronomical data centres. A performance estimation model is proposed to locate performance bottlenecks and guide optimizations. A two-level partitioning strategy is designed to generate an optimized data placement according to the positions of celestial objects, which increases throughput. To make HLC2 a more adaptive solution, architecture-aware task splitting, thread parallelization, and concurrent scheduling strategies are designed and integrated. Moreover, a novel quad-direction strategy is proposed for the boundary problem to balance performance and completeness effectively. We have experimentally evaluated HLC2 using publicly released catalogue data. Experiments demonstrate that HLC2 scales well across catalogues of different sizes and that its cross-matching speed is significantly improved compared to state-of-the-art cross-matchers.
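
To make the idea of cross-matching on spherical position deviation concrete, the following is a minimal brute-force sketch in Python. It is not HLC2's implementation: the function names `angular_separation_deg` and `cross_match`, the `(ra, dec)` tuple layout, and the nearest-neighbour-within-radius rule are all assumptions chosen for illustration, and the sketch omits the partitioning, GPU parallelization, and boundary handling that the abstract describes.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees via the haversine formula.

    All inputs are in degrees (right ascension, declination).
    """
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    sin_ddec = math.sin((dec2 - dec1) / 2.0)
    sin_dra = math.sin((ra2 - ra1) / 2.0)
    h = sin_ddec ** 2 + math.cos(dec1) * math.cos(dec2) * sin_dra ** 2
    # Clamp to 1.0 to guard against rounding slightly above the asin domain.
    return math.degrees(2.0 * math.asin(min(1.0, math.sqrt(h))))


def cross_match(cat_a, cat_b, radius_arcsec):
    """Hypothetical brute-force matcher (illustrative only).

    cat_a and cat_b are lists of (ra_deg, dec_deg) tuples. For each source
    in cat_a, return (index_a, index_b, separation_arcsec) for the nearest
    source in cat_b that lies within radius_arcsec, if any.
    """
    matches = []
    for i, (ra_a, dec_a) in enumerate(cat_a):
        best_j, best_sep = None, None
        for j, (ra_b, dec_b) in enumerate(cat_b):
            sep = angular_separation_deg(ra_a, dec_a, ra_b, dec_b) * 3600.0
            if sep <= radius_arcsec and (best_sep is None or sep < best_sep):
                best_j, best_sep = j, sep
        if best_j is not None:
            matches.append((i, best_j, best_sep))
    return matches


if __name__ == "__main__":
    # Made-up coordinates, purely for demonstration.
    cat_a = [(10.0, 20.0), (150.0, -30.0)]
    cat_b = [(150.0001, -30.0), (10.0, 20.0002), (200.0, 5.0)]
    for i, j, sep in cross_match(cat_a, cat_b, radius_arcsec=1.0):
        print(f"A[{i}] matches B[{j}] at {sep:.3f} arcsec")
```

This sketch compares every pair of sources, so its cost grows with the product of the two catalogue sizes. That quadratic growth is why large catalogues need spatial partitioning and parallel hardware of the kind the abstract mentions.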


