To Update or Not To Update?: Bandwidth-Efficient Intelligent Replacement Policies for DRAM Caches

07/04/2019
by Vinson Young, et al.

This paper investigates intelligent replacement policies for improving the hit-rate of gigascale DRAM caches. Cache replacement policies are commonly used to improve the hit-rate of on-chip caches. The most effective replacement policies often require the cache to track per-line reuse state to inform their decisions. A fundamental challenge for DRAM caches, however, is that stateful policies would require significant bandwidth to maintain per-line DRAM cache state. As such, DRAM cache replacement policies have primarily been stateless, such as always-install or probabilistic bypass. Unfortunately, we find that stateless policies are often too coarse-grained and become ineffective at the size and associativity of DRAM caches. Ideally, we want a replacement policy that obtains the hit-rate benefits of stateful replacement policies while keeping the bandwidth-efficiency of stateless policies. In our study, we find that tracking per-line reuse state can enable an effective replacement policy that mitigates common thrashing patterns seen in gigascale caches. We propose a stateful replacement/bypass policy called RRIP Age-On-Bypass (RRIP-AOB), which tracks reuse state for high-reuse lines, protects such lines by bypassing other lines, and Ages the state On cache Bypass. Unfortunately, such a stateful technique requires significant bandwidth to update state. To this end, we propose Efficient Tracking of Reuse (ETR). ETR makes state tracking efficient by accurately tracking the state of only one line from a region and using that line's state to guide replacement decisions for the other lines in the region. ETR reduces the bandwidth for tracking replacement state by 70% for DRAM caches. Our evaluations with a 2GB DRAM cache show that our RRIP-AOB and ETR techniques provide an 18% speedup.
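The replacement/bypass behavior the abstract describes can be sketched in simulation. The following is a minimal, illustrative model of one RRIP-AOB cache set, not the paper's implementation: the 2-bit RRPV counters, the distant-reuse insertion value, and the `RRIPAOBSet` class are assumptions chosen for clarity. A line is protected while its predicted re-reference is near (low RRPV); a miss that finds no distant-reuse victim is bypassed, and the bypass ages every resident line so a thrashing stream cannot be locked out forever.

```python
# Illustrative sketch of RRIP Age-On-Bypass (RRIP-AOB) for one cache set.
# Parameters (2-bit RRPV, distant-reuse insertion) are assumptions for the
# sketch, not the paper's exact DRAM-cache configuration.

MAX_RRPV = 3            # 2-bit Re-Reference Prediction Value: 3 = distant reuse
INSERT_RRPV = MAX_RRPV  # new lines are inserted predicting distant reuse

class RRIPAOBSet:
    """One set: hits promote a line to near-immediate reuse (RRPV 0);
    misses install only over a distant-reuse victim, otherwise they
    bypass the cache and age every resident line (Age-On-Bypass)."""

    def __init__(self, ways):
        self.ways = ways
        self.lines = {}  # tag -> RRPV

    def access(self, tag):
        """Returns True on hit, False on miss (installed or bypassed)."""
        if tag in self.lines:
            self.lines[tag] = 0          # hit: protect this high-reuse line
            return True
        if len(self.lines) < self.ways:
            self.lines[tag] = INSERT_RRPV  # free way: install
            return False
        victim = max(self.lines, key=self.lines.get)
        if self.lines[victim] == MAX_RRPV:
            del self.lines[victim]       # evict a distant-reuse line
            self.lines[tag] = INSERT_RRPV
        else:
            # No distant-reuse victim: bypass the incoming line, but age
            # all resident state so protection eventually expires.
            for t in self.lines:
                self.lines[t] = min(self.lines[t] + 1, MAX_RRPV)
        return False
```

For example, with `ways=2`, two lines that each hit once are protected (RRPV 0), so a streaming third line is bypassed three times, aging the residents, before it can finally displace one. ETR would layer on top of a policy like this by keeping accurate state for only one tracked line per region and using that state for the region's other lines, so the promotion and aging updates above would require far fewer DRAM state writes.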


