Parallelized Kendall's Tau Coefficient Computation via SIMD Vectorized Sorting On Many-Integrated-Core Processors

04/12/2017
by   Yongchao Liu, et al.
0

Pairwise association measure is an important operation in data analytics. Kendall's tau coefficient is one widely used correlation coefficient identifying non-linear relationships between ordinal variables. In this paper, we investigated a parallel algorithm accelerating all-pairs Kendall's tau coefficient computation via single instruction multiple data (SIMD) vectorized sorting on Intel Xeon Phis by taking advantage of many processing cores and 512-bit SIMD vector instructions. To facilitate workload balancing and overcome on-chip memory limitation, we proposed a generic framework for symmetric all-pairs computation by building provable bijective functions between job identifier and coordinate space. Performance evaluation demonstrated that our algorithm on one 5110P Phi achieves two orders-of-magnitude speedups over 16-threaded MATLAB and three orders-of-magnitude speedups over sequential R, both running on high-end CPUs. Besides, our algorithm exhibited rather good distributed computing scalability with respect to number of Phis. Source code and datasets are publicly available at http://lightpcc.sourceforge.net.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
11/20/2019

Parallel Implementations for Computing the Minimum Distance of a Random Linear Code on Multicomputers

The minimum distance of a linear code is a key concept in information th...
research
11/29/2020

XpulpNN: Enabling Energy Efficient and Flexible Inference of Quantized Neural Network on RISC-V based IoT End Nodes

This work introduces lightweight extensions to the RISC-V ISA to boost t...
research
12/10/2021

FLiMS: a Fast Lightweight 2-way Merger for Sorting

In this paper, we present FLiMS, a highly-efficient and simple parallel ...
research
05/22/2023

Further Decimating the Inductive Programming Search Space with Instruction Digrams

Overlapping instruction subsets derived from human originated code have ...
research
07/27/2020

HeAT – a Distributed and GPU-accelerated Tensor Framework for Data Analytics

To cope with the rapid growth in available data, the efficiency of data ...
research
09/28/2020

Engineering In-place (Shared-memory) Sorting Algorithms

We present sorting algorithms that represent the fastest known technique...
research
11/20/2022

An Algorithm for Routing Vectors in Sequences

We propose a routing algorithm that takes a sequence of vectors and comp...

Please sign up or login with your details

Forgot password? Click here to reset