DARM: Control-Flow Melding for SIMT Thread Divergence Reduction – Extended Version

07/12/2021
by   Charitha Saumya, et al.
0

GPGPUs use the Single-Instruction-Multiple-Thread (SIMT) execution model where a group of threads-wavefront or warp-execute instructions in lockstep. When threads in a group encounter a branching instruction, not all threads in the group take the same path, a phenomenon known as control-flow divergence. The control-flow divergence causes performance degradation because both paths of the branch must be executed one after the other. Prior research has primarily addressed this issue through architectural modifications. We observe that certain GPGPU kernels with control-flow divergence have similar control-flow structures with similar instructions on both sides of a branch. This structure can be exploited to reduce control-flow divergence by melding the two sides of the branch allowing threads to reconverge early, reducing divergence. In this work, we present DARM, a compiler analysis and transformation framework that can meld divergent control-flow structures with similar instruction sequences. We show that DARM can reduce the performance degradation from control-flow divergence.

READ FULL TEXT

page 8

page 10

page 11

research
07/14/2017

Variable Instruction Fetch Rate to Reduce Control Dependent Penalties

In order to overcome the branch execution penalties of hard-to-predict i...
research
07/06/2023

Towards Efficient Control Flow Handling in Spatial Architecture via Architecting the Control Flow Plane

Spatial architecture is a high-performance architecture that uses contro...
research
06/04/2019

SPECCFI: Mitigating Spectre Attacks using CFI Informed Speculation

Spectre attacks and their many subsequent variants are a new vulnerabili...
research
04/30/2018

Holistic Management of the GPGPU Memory Hierarchy to Manage Warp-level Latency Tolerance

In a modern GPU architecture, all threads within a warp execute the same...
research
09/03/2013

Understanding Evolutionary Potential in Virtual CPU Instruction Set Architectures

We investigate fundamental decisions in the design of instruction set ar...
research
03/01/2018

The Effect of Instruction Padding on SFI Overhead

Software-based fault isolation (SFI) is a technique to isolate a potenti...
research
07/20/2020

Program algebra for random access machine programs

This paper presents an algebraic theory of instruction sequences with in...

Please sign up or login with your details

Forgot password? Click here to reset