Addressing Variability in Reuse Prediction for Last-Level Caches

by   Priyank Faldu, et al.

Last-Level Cache (LLC) represents the bulk of a modern CPU processor's transistor budget and is essential for application performance as LLC enables fast access to data in contrast to much slower main memory. However, applications with large working set size often exhibit streaming and/or thrashing access patterns at LLC. As a result, a large fraction of the LLC capacity is occupied by dead blocks that will not be referenced again, leading to inefficient utilization of the LLC capacity. To improve cache efficiency, the state-of-the-art cache management techniques employ prediction mechanisms that learn from the past access patterns with an aim to accurately identify as many dead blocks as possible. Once identified, dead blocks are evicted from LLC to make space for potentially high reuse cache blocks. In this thesis, we identify variability in the reuse behavior of cache blocks as the key limiting factor in maximizing cache efficiency for state-of-the-art predictive techniques. Variability in reuse prediction is inevitable due to numerous factors that are outside the control of LLC. The sources of variability include control-flow variation, speculative execution and contention from cores sharing the cache, among others. Variability in reuse prediction challenges existing techniques in reliably identifying the end of a block's useful lifetime, thus causing lower prediction accuracy, coverage, or both. To address this challenge, this thesis aims to design robust cache management mechanisms and policies for LLC in the face of variability in reuse prediction to minimize cache misses, while keeping the cost and complexity of the hardware implementation low. To that end, we propose two cache management techniques, one domain-agnostic and one domain-specialized, to improve cache efficiency by addressing variability in reuse prediction.


Domain-Specialized Cache Management for Graph Analytics

Graph analytics power a range of applications in areas as diverse as fin...

Reuse Cache for Heterogeneous CPU-GPU Systems

It is generally observed that the fraction of live lines in shared last-...

Learning Forward Reuse Distance

Caching techniques are widely used in the era of cloud computing from ap...

ETICA: Efficient Two-Level I/O Caching Architecture for Virtualized Platforms

In this paper, we propose an Efficient Two-Level I/O Caching Architectur...

Practical Data Compression for Modern Memory Hierarchies

In this thesis, we describe a new, practical approach to integrating har...

A Fast-and-Effective Early-Stage Multi-level Cache Optimization Method Based on Reuse-Distance Analysis

In this paper, we propose a practical and effective approach allowing de...

Effective Cache Apportioning for Performance Isolation Under Compiler Guidance

With a growing number of cores per socket in modern data-centers where m...

Please sign up or login with your details

Forgot password? Click here to reset