From Sand to Flour: The Next Leap in Granular Computing with NanoSort

04/26/2022
by Theo Jepsen, et al.

The granularity of distributed computing is limited by communication time: there is no point in farming out smaller and smaller tasks if the communication overhead outweighs the reduction in processing time gained from the added parallelism. In this work, we leverage the low communication latency of a new NIC/CPU hardware design, the nanoPU, to explore a new extreme of granularity in distributed computation, where a problem is partitioned into tens of thousands of nanosecond-scale tasks. To evaluate the feasibility and practicality of extremely fine-grained computing, we built NanoSort, a distributed sorting algorithm running on the nanoPU. Using cycle-accurate FireSim simulations of 65,536 nanoPU cores, we show that NanoSort can sort 1M keys in 68 μs, an order of magnitude faster than MilliSort, the current state of the art.

research
09/26/2022

From Task-Based GPU Work Aggregation to Stellar Mergers: Turning Fine-Grained CPU Tasks into Portable GPU Kernels

Meeting both scalability and performance portability requirements is a c...
research
01/03/2023

Fast Parallel Algorithms for Enumeration of Simple, Temporal, and Hop-Constrained Cycles

Cycles are one of the fundamental subgraph patterns and being able to en...
research
01/18/2021

DFOGraph: An I/O- and Communication-Efficient System for Distributed Fully-out-of-Core Graph Processing

With the magnitude of graph-structured data continually increasing, grap...
research
02/19/2022

Scalable Fine-Grained Parallel Cycle Enumeration Algorithms

This paper investigates scalable parallelisation of state-of-the-art cyc...
research
03/16/2022

On Distributed Gravitational N-Body Simulations

The N-body problem is a classic problem involving a system of N discrete...
research
01/20/2022

The Specialized High-Performance Network on Anton 3

Molecular dynamics (MD) simulation, a computationally intensive method t...
research
07/03/2018

Rethinking Misalignment to Raise the Bar for Heap Pointer Corruption

Heap layout randomization renders a good portion of heap vulnerabilities...
