Streaming Computations with Region-Based State on SIMD Architectures

06/12/2020
by Stephen Timcheck, et al.

Streaming computations on massive data sets are an attractive candidate for parallelization, particularly when they exhibit independence (and hence data parallelism) between items in the stream. However, some streaming computations are stateful, which disrupts independence and can limit parallelism. In this work, we consider how to extract data parallelism from streaming computations with a common, limited form of statefulness. The stream is assumed to be divided into variably sized regions, and items in the same region are processed in a common context of state. In general, the computation to be performed on a stream is also irregular, with each item potentially undergoing different, data-dependent processing. This work describes mechanisms to implement such computations efficiently on a SIMD-parallel architecture such as a GPU. We first develop a low-level protocol by which a data stream can be augmented with control signals that are delivered to each stage of a computation at precise points in the stream. We then describe an abstraction, enumeration and aggregation, by which an application developer can specify the behavior of a streaming application with region-based state. Finally, we study an implementation of our ideas as part of the MERCATOR system for irregular streaming computations on GPUs, investigating how the frequency of region boundaries in a stream impacts SIMD occupancy and hence application performance.
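
To make the ideas in the abstract concrete, here is a minimal sequential sketch in Python of a stream augmented with region-boundary signals, an irregular filtering stage, and an aggregation stage that holds per-region state. It is not MERCATOR's API or the authors' implementation; every name in it (`Signal`, `enumerate_regions`, `keep_even`, `aggregate_sums`, `simd_occupancy`) is hypothetical. The occupancy estimate further assumes that no SIMD batch may span a region boundary, which is a simplification made for illustration only.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Tuple, Union

# All names here are hypothetical and chosen for illustration only.

@dataclass
class Signal:
    kind: str        # "begin" or "end" of a region
    region_id: int

Item = Union[int, Signal]

def enumerate_regions(regions: Iterable[List[int]]) -> Iterator[Item]:
    """Enumeration: flatten regions into one stream, bracketing each with signals."""
    for region_id, region in enumerate(regions):
        yield Signal("begin", region_id)
        yield from region
        yield Signal("end", region_id)

def keep_even(stream: Iterable[Item]) -> Iterator[Item]:
    """Irregular stage: drops some data items but forwards every signal in order."""
    for x in stream:
        if isinstance(x, Signal) or x % 2 == 0:
            yield x

def aggregate_sums(stream: Iterable[Item]) -> Iterator[Tuple[int, int]]:
    """Aggregation: reset state at 'begin', emit one result per region at 'end'."""
    total = 0
    for x in stream:
        if isinstance(x, Signal):
            if x.kind == "begin":
                total = 0
            else:
                yield (x.region_id, total)
        else:
            total += x

def simd_occupancy(region_sizes: List[int], width: int) -> float:
    """Fraction of busy lanes, ASSUMING no batch spans a region boundary."""
    batches = sum(-(-n // width) for n in region_sizes)  # ceiling division
    return sum(region_sizes) / (batches * width) if batches else 0.0

regions = [[1, 2, 3, 4], [5], [6, 8, 10]]
results = list(aggregate_sums(keep_even(enumerate_regions(regions))))
print(results)
print(simd_occupancy([len(r) for r in regions], width=4))
```

Because the signals travel in the same stream as the data, the aggregation stage sees each boundary at the correct position even though the filter removes an unpredictable number of items. Under the stated assumption, one region of 128 items on 32 lanes keeps every lane busy, while 128 regions of one item each leave 31 of 32 lanes idle, which is the kind of boundary-frequency effect the abstract says the paper investigates.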


