High Performance Parallel Sort for Shared and Distributed Memory MIMD

by Thoria Alghamdi, et al.

We present four high-performance hybrid sorting methods developed for various parallel platforms: shared memory multiprocessors, distributed memory multiprocessors, and clusters that combine shared and distributed memory. Merge sort, known for its stability, is the basis for several of our algorithms; we improve its parallel performance by combining it with Quicksort. We present two models designed for shared memory MIMD (OpenMP): (a) a non-recursive Merge sort and (b) a hybrid Quicksort and Merge sort. The third model is designed for distributed memory MIMD (MPI) and uses a hybrid Quicksort and Merge sort. The fourth model is designed to take advantage of the shared memory within individual nodes of today's cluster systems and to eliminate all internal data transfers between different nodes. It implements a one-step MSD-Radix to distribute the data in ten packets (MPI), while the parallel cores of each node use Quicksort to sort their data partitions sequentially and then merge and sort them in parallel using OpenMP. All developed models outperform the baseline. Hybrid Quicksort and Merge sort outperformed Hybrid Memory Parallel Merge Sort Using Hybrid MSD-Radix and Quicksort in Cluster Platforms when sorting small data sets, but with larger data the speedup of the latter grows and keeps improving. The speedup of Distributed Memory Parallel Hybrid Quicksort and Merge Sort is the best.
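The fourth model can be illustrated with a minimal sequential sketch. It is not the authors' implementation: there is no MPI or OpenMP here, and every function name (`msd_bucket`, `distribute`, `sort_bucket`, `hybrid_sort`) and the `parts` parameter are hypothetical. The sketch assumes non-negative integer keys, treats the ten packets as the ten possible leading decimal digits, and uses Python's built-in `sorted` as a stand-in for Quicksort. It only shows why one MSD-Radix pass removes the need for data exchange between packets: once keys are bucketed by leading digit, each packet can be sorted independently and the results simply concatenated.

```python
from heapq import merge


def msd_bucket(value, width):
    """Leading decimal digit of `value` when zero-padded to `width` digits."""
    return value // 10 ** (width - 1)


def distribute(data, width):
    """One-step MSD-Radix: place each key in one of ten packets."""
    packets = [[] for _ in range(10)]
    for value in data:
        packets[msd_bucket(value, width)].append(value)
    return packets


def sort_bucket(packet, parts=4):
    """Sort one packet the way a single node might (assumption).

    The packet is split into up to `parts` partitions, each partition is
    sorted on its own (sorted() stands in for Quicksort on one core),
    and the sorted runs are then merged.
    """
    size = max(1, -(-len(packet) // parts))  # ceiling division, at least 1
    runs = [sorted(packet[i:i + size]) for i in range(0, len(packet), size)]
    return list(merge(*runs))


def hybrid_sort(data, parts=4):
    """Distribute by leading digit, sort each packet, then concatenate."""
    if not data:
        return []
    width = len(str(max(data)))  # assumes non-negative integers
    result = []
    for packet in distribute(data, width):
        # Packets are already ordered relative to each other,
        # so no merging across packets is needed.
        result.extend(sort_bucket(packet, parts))
    return result
```

In a real cluster setting each packet would be sent to a separate node and the per-partition sorts would run on separate cores; the sketch keeps everything in one process so the data flow is easy to follow.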




Experience with Distributed Memory Delaunay-based Image-to-Mesh Conversion Implementation

This paper presents some of our findings on the scalability of parallel ...

Approaches to the Parallelization of Merge Sort in Python

The theory of divide-and-conquer parallelization has been well-studied i...

Combinatorics and Geometry for the Many-ported, Distributed and Shared Memory Architecture

Manycore SoC architectures based on on-chip shared memory are preferred ...

TopSort: A High-Performance Two-Phase Sorting Accelerator Optimized on HBM-based FPGAs

The emergence of high-bandwidth memory (HBM) brings new opportunities to...

Parallel Algorithms for Successive Convolution

In this work, we consider alternative discretizations for PDEs which use...

Evaluation of a Simple, Scalable, Parallel Best-First Search Strategy

Large-scale, parallel clusters composed of commodity processors are incr...

A Novel Hybrid Quicksort Algorithm Vectorized using AVX-512 on Intel Skylake

The modern CPU's design, which is composed of hierarchical memory and SI...
