Do Your Cores Play Nicely? A Portable Framework for Multi-core Interference Tuning and Analysis

by Dan Iorga, et al.
Princeton University
Imperial College London

Multi-core architectures allow independent processes to run in parallel. However, because resources such as caches are shared across cores, distinct processes may interfere with one another, for example by affecting each other's execution time. Analysing the extent of this interference is difficult due to: (1) the diversity of modern architectures, which may implement shared resources differently, and (2) the complexity of modern processors, in which interference may arise from subtle interactions. To address this, we propose a black-box auto-tuning approach that searches for processes that are effective at slowing down a program when executed in parallel with it. Such slowdowns provide lower bounds on worst-case execution time, an important metric in systems with real-time constraints. Our approach considers a set of parameterised "enemy" processes and "victim" programs, each targeting a shared resource. The autotuner searches for enemy process parameters that are effective at causing slowdowns in the victim programs. The idea is that victim programs act as a proxy for the shared resource usage of arbitrary programs. We evaluate our approach on 5 different chips and 3 shared resources (cache, memory bus, and main memory), and consider several search strategies and slowdown metrics. Using enemy processes tuned per chip, we evaluate the slowdowns on the autobench and coremark benchmark suites and show that our method achieves slowdowns in 98 combinations and provides results similar to those of manually written enemy processes.
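The search loop described above can be illustrated with a minimal sketch. It is not the authors' implementation: the parameter names (`buffer_kib`, `stride_bytes`, `writes`), their ranges, and the function `simulated_victim_time` are hypothetical stand-ins, and plain random search is used only as one simple example of a search strategy. On real hardware, the simulated function would be replaced by a harness that launches the enemy process on the other cores and times the victim program.

```python
import random

# Hypothetical search space for a cache-targeting "enemy" process.
# Names and ranges are illustrative assumptions, not taken from the paper.
SEARCH_SPACE = {
    "buffer_kib": [16, 64, 256, 1024, 4096],
    "stride_bytes": [8, 64, 256, 4096],
    "writes": [False, True],
}


def simulated_victim_time(params, rng):
    """Stand-in for timing the victim while the enemy runs on other cores.

    Returns a made-up execution time in milliseconds. A real harness would
    start the enemy with `params`, run the victim, and measure wall time.
    """
    base_ms = 100.0
    pressure = min(params["buffer_kib"] / 1024.0, 1.0)
    if params["stride_bytes"] == 64:  # pretend a cache-line stride hurts most
        pressure *= 1.5
    if params["writes"]:
        pressure *= 1.2
    noise_ms = rng.uniform(-1.0, 1.0)  # measurement noise
    return base_ms * (1.0 + 0.3 * pressure) + noise_ms


def slowdown(params, baseline_ms, rng, repeats=5):
    """Slowdown metric: median victim time with the enemy over the baseline."""
    times = sorted(simulated_victim_time(params, rng) for _ in range(repeats))
    return times[len(times) // 2] / baseline_ms


def random_search(baseline_ms, iterations=50, seed=0):
    """Black-box random search for enemy parameters that maximise slowdown."""
    rng = random.Random(seed)
    best_params, best_slowdown = None, 0.0
    for _ in range(iterations):
        candidate = {name: rng.choice(values)
                     for name, values in SEARCH_SPACE.items()}
        observed = slowdown(candidate, baseline_ms, rng)
        if observed > best_slowdown:
            best_params, best_slowdown = candidate, observed
    return best_params, best_slowdown


if __name__ == "__main__":
    params, factor = random_search(baseline_ms=100.0)
    print(f"best enemy parameters: {params}, slowdown: {factor:.2f}x")
```

Because the tuner only observes the victim's execution time, the same loop applies unchanged to any chip. The best slowdown found is a lower bound on the worst case, since an untried configuration might interfere even more.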

