Gemini: Reducing DRAM Cache Hit Latency by Hybrid Mappings

by Ye Chi, et al.

Die-stacked DRAM caches are increasingly advocated to bridge the performance gap between on-chip caches and main memory. It is essential to improve the DRAM cache hit rate and lower the cache hit latency simultaneously. Prior DRAM cache designs fall into two categories according to their data mapping policies, set-associative and direct-mapped, and each achieves only one of these goals. In this paper, we propose Gemini, a partially direct-mapped die-stacked DRAM cache that achieves both objectives simultaneously. Gemini is motivated by the observation that applying a single mapping policy to all blocks cannot deliver both a high hit rate and a low hit latency. Gemini classifies data into leading blocks and following blocks, and places them with static mapping and dynamic mapping, respectively, in a unified set-associative structure. Gemini also introduces a replacement policy that balances the differing miss penalties of the two block types against recency, and provides strategies to mitigate cache thrashing caused by block type transitions. Experimental results demonstrate that Gemini significantly narrows the hit latency gap with a direct-mapped cache, from 1.75X to 1.22X on average, while achieving a hit rate comparable to that of a set-associative cache. Compared with the state-of-the-art baseline, i.e., the enhanced Loh-Hill cache, Gemini improves IPC by up to 20%.
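The hybrid placement described above can be illustrated with a small sketch. It is not the paper's design: the abstract does not say how blocks are classified, how the static way is chosen, or how the replacement policy weighs miss penalty, so `HybridSet`, `static_way`, `LEADING_BONUS`, and the probe counting are all hypothetical choices made for illustration. The sketch only shows the general idea that a statically mapped block can be checked in one probe, while a dynamically mapped block may need a search across the ways of the set.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

LEADING_BONUS = 4  # hypothetical weight protecting leading blocks from eviction


@dataclass
class Line:
    tag: int
    leading: bool
    last_use: int


class HybridSet:
    """One cache set holding statically and dynamically mapped blocks (illustrative only)."""

    def __init__(self, ways: int = 4) -> None:
        self.ways: List[Optional[Line]] = [None] * ways
        self.clock = 0

    def static_way(self, tag: int) -> int:
        # Assumption: low bits of the tag select the fixed way for a leading block.
        return tag % len(self.ways)

    def _score(self, way: int) -> int:
        # Lower score = better victim: older lines and following blocks go first.
        line = self.ways[way]
        return line.last_use + (LEADING_BONUS if line.leading else 0)

    def insert(self, tag: int, leading: bool) -> int:
        self.clock += 1
        if leading:
            way = self.static_way(tag)  # static mapping: exactly one candidate way
        else:
            empty = [i for i, line in enumerate(self.ways) if line is None]
            way = empty[0] if empty else min(range(len(self.ways)), key=self._score)
        self.ways[way] = Line(tag, leading, self.clock)
        return way

    def lookup(self, tag: int, leading: bool) -> Tuple[bool, int]:
        """Return (hit, number of ways probed)."""
        self.clock += 1
        if leading:
            line = self.ways[self.static_way(tag)]
            if line is not None and line.tag == tag:
                line.last_use = self.clock
                return True, 1
            return False, 1
        for probes, line in enumerate(self.ways, start=1):
            if line is not None and line.tag == tag:
                line.last_use = self.clock
                return True, probes
        return False, len(self.ways)
```

In this sketch a leading block always resolves in a single probe, whether it hits or misses, while a following block may cost up to one probe per way. Block type transitions, which the abstract identifies as a source of thrashing, are deliberately left out.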


