Work-stealing prefix scan: Addressing load imbalance in large-scale image registration

10/23/2020
by Marcin Copik, et al.
Parallelism patterns (e.g., map or reduce) have proven to be effective tools for parallelizing high-performance applications. In this paper, we study the recursive registration of a series of electron microscopy images, a time-consuming and imbalanced computation necessary for nanoscale microscopy analysis. We show that by translating the image registration into a specific instance of the prefix scan, we can convert this seemingly sequential problem into a parallel computation that scales to over a thousand cores. We analyze a variety of scan algorithms that behave similarly for common low-compute operators and propose a novel work-stealing procedure for a hierarchical prefix scan. Our evaluation shows that by identifying a suitable and well-optimized prefix scan algorithm, we reduce time-to-solution on a series of 4,096 images spanning ten seconds of microscopy acquisition from over 10 hours to less than 3 minutes (using 1,024 Intel Haswell cores), enabling derivation of material properties at the nanoscale for long microscopy image series.
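To illustrate the idea of a hierarchical prefix scan over an associative but non-commutative operator, here is a minimal Python sketch. It is not the authors' implementation and does not include work stealing. The names `compose` and `blocked_scan` are hypothetical, and the operator is a stand-in (composition of one-dimensional affine maps) rather than the image registration operator, which the abstract does not specify. The three phases are run sequentially here; in a parallel setting, the per-block work in phases 1 and 3 would be independent across blocks.

```python
from itertools import accumulate


def compose(a, b):
    """Stand-in associative operator (an assumption, not the registration operator).

    Each element is an affine map x -> m * x + t stored as (m, t).
    The result applies `a` first and then `b`. Order matters, so the
    operator is associative but not commutative.
    """
    return (b[0] * a[0], b[0] * a[1] + b[1])


def blocked_scan(items, op, num_blocks):
    """Inclusive prefix scan in three phases over contiguous blocks."""
    if not items:
        return []
    size = -(-len(items) // num_blocks)  # ceiling division
    blocks = [items[i:i + size] for i in range(0, len(items), size)]

    # Phase 1: local inclusive scan inside every block (independent per block).
    local = [list(accumulate(block, op)) for block in blocks]

    # Phase 2: inclusive scan over the last element (total) of each block.
    totals = list(accumulate((scanned[-1] for scanned in local), op))

    # Phase 3: combine each later block with the total of all preceding blocks.
    # The preceding total stays on the left to preserve operand order.
    result = list(local[0])
    for prefix, scanned in zip(totals, local[1:]):
        result.extend(op(prefix, x) for x in scanned)
    return result


if __name__ == "__main__":
    maps = [(2, 1), (3, 0), (1, 5), (-1, 2), (4, -3), (1, 1), (2, 2)]
    # The blocked result should match a plain sequential scan.
    print(blocked_scan(maps, compose, 3) == list(accumulate(maps, compose)))
```

When the operator is expensive and its cost varies per pair of inputs, as the abstract describes for registration, blocks of equal length can take very different amounts of time. That imbalance is what the work-stealing procedure mentioned above is meant to address.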


