Characterizing Optimizations to Memory Access Patterns using Architecture-Independent Program Features

03/12/2020
by Aditya Chilukuri, et al.

High-performance computing developers face the challenge of optimizing the performance of OpenCL workloads on diverse architectures. The Architecture-Independent Workload Characterization (AIWC) tool is a plugin for the Oclgrind OpenCL simulator that gathers metrics from OpenCL programs; these metrics can be used to understand and predict program performance on an arbitrary hardware architecture. However, AIWC metrics are not always easy to interpret, and they do not reflect some important memory access patterns that affect efficiency across architectures. We propose a new metric, parallel spatial locality: the closeness of memory accesses issued simultaneously by OpenCL work-items (threads). We implement this metric in the AIWC framework and analyse the results gathered on matrix multiply and the Extended OpenDwarfs OpenCL benchmarks. The differences in the observed parallel spatial locality across implementations of matrix multiply reflect the optimizations performed. The new metric can also be used to distinguish between the OpenDwarfs benchmarks based on the memory access patterns that affect their performance on various architectures. The proposed improvements to AIWC will help HPC developers better understand the memory access patterns of complex codes and guide the optimization of codes for arbitrary hardware targets.
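
To make the idea of "closeness of memory accesses issued simultaneously by work-items" concrete, the following Python sketch compares two toy access traces. It is only an illustration under stated assumptions: the abstract does not give the formula AIWC uses, so the mean-gap measure, the function names, and the synthetic traces below are all hypothetical and are not the authors' implementation.

```python
from statistics import mean


def parallel_spatial_distance(accesses):
    """Mean gap in bytes between neighbouring addresses touched in the same step.

    accesses[t][w] is the byte address read by work-item w at time step t.
    Hypothetical stand-in measure: smaller values mean that simultaneous
    accesses are closer together. This is NOT the AIWC definition.
    """
    gaps = []
    for step in accesses:
        ordered = sorted(step)
        # Distance between each pair of adjacent addresses within one step.
        gaps.extend(b - a for a, b in zip(ordered, ordered[1:]))
    return mean(gaps) if gaps else 0.0


def contiguous_trace(n_items, n_steps, elem_bytes=4):
    """At step t, work-item w reads element t * n_items + w (adjacent items, adjacent data)."""
    return [[(t * n_items + w) * elem_bytes for w in range(n_items)]
            for t in range(n_steps)]


def strided_trace(n_items, n_steps, elem_bytes=4):
    """At step t, work-item w reads element w * n_steps + t (each item walks its own row)."""
    return [[(w * n_steps + t) * elem_bytes for w in range(n_items)]
            for t in range(n_steps)]


if __name__ == "__main__":
    print(parallel_spatial_distance(contiguous_trace(4, 8)))
    print(parallel_spatial_distance(strided_trace(4, 8)))
```

In this toy setup the contiguous trace yields a much smaller mean gap than the strided one. That is the kind of difference between implementations that a parallel spatial locality metric is meant to expose.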


