Corrfunc: Blazing fast correlation functions with AVX512F SIMD Intrinsics

11/15/2019
by Manodeep Sinha, et al.

Correlation functions are widely used in extragalactic astrophysics to extract insights into how galaxies occupy dark matter halos, and in cosmology to place stringent constraints on cosmological parameters. Fundamentally, a correlation function requires computing the pair-wise separations between two sets of points and then histogramming those separations. Corrfunc is an existing open-source, high-performance software package for efficiently computing a multitude of correlation functions. In this paper, we discuss the SIMD AVX512F kernels within Corrfunc, which can process 16 floats or 8 doubles at a time. The latest manually implemented Corrfunc AVX512F kernels show a speedup of up to ∼4× relative to compiler-generated code for double-precision calculations. The AVX512F kernels show a ∼1.6× speedup relative to the AVX kernels, which compares favorably to the theoretical maximum of 2×. In addition, by pruning pairs whose minimum possible separation is too large, we achieve a ∼5-10% speedup across all the SIMD kernels. Such speedups highlight the importance of programming explicitly with SIMD vector intrinsics for complex calculations that cannot be efficiently vectorized by compilers. Corrfunc is publicly available at https://github.com/manodeep/Corrfunc/.
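To make the two ideas in the abstract concrete (histogramming pair-wise separations, and skipping pairs whose minimum possible separation is already too large), here is a minimal scalar sketch in pure Python. It is not Corrfunc's code or API: the function names, the uniform grid, and the `cell` parameter are assumptions made for illustration, and the real kernels are SIMD code that may prune at a different granularity.

```python
import math
from bisect import bisect_right


def pair_counts_brute(p1, p2, bin_edges):
    """Histogram the separations of all pairs (one point from p1, one from p2).

    bin_edges must be sorted; bin i covers [bin_edges[i], bin_edges[i+1]).
    """
    counts = [0] * (len(bin_edges) - 1)
    rmin, rmax = bin_edges[0], bin_edges[-1]
    for (x1, y1, z1) in p1:
        for (x2, y2, z2) in p2:
            dx, dy, dz = x1 - x2, y1 - y2, z1 - z2
            r = math.sqrt(dx * dx + dy * dy + dz * dz)
            if rmin <= r < rmax:
                counts[bisect_right(bin_edges, r) - 1] += 1
    return counts


def _grid(points, cell):
    """Group points by the integer index of the cubic cell containing them."""
    cells = {}
    for p in points:
        key = tuple(math.floor(c / cell) for c in p)
        cells.setdefault(key, []).append(p)
    return cells


def pair_counts_pruned(p1, p2, bin_edges, cell):
    """Same histogram, but skip cell pairs that cannot hold a pair with r < rmax.

    Hypothetical illustration of pruning on the minimum possible separation:
    two cells whose indices differ by n >= 1 along an axis are at least
    (n - 1) * cell apart along that axis.
    """
    rmax = bin_edges[-1]
    g1, g2 = _grid(p1, cell), _grid(p2, cell)
    counts = [0] * (len(bin_edges) - 1)
    for k1, pts1 in g1.items():
        for k2, pts2 in g2.items():
            gaps = [max(abs(a - b) - 1, 0) * cell for a, b in zip(k1, k2)]
            min_sep = math.sqrt(sum(g * g for g in gaps))
            if min_sep >= rmax:
                continue  # every pair in these two cells is too far apart
            sub = pair_counts_brute(pts1, pts2, bin_edges)
            counts = [c + s for c, s in zip(counts, sub)]
    return counts
```

Both functions return the same counts for the same inputs; the pruned version only avoids computing separations for cell pairs that cannot contribute. In this sketch a `cell` size that is a fraction of the largest bin edge (for example half of it) lets the pruning test reject distant cell pairs while nearby cells are still compared in full.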


