Data Cache Prefetching with Perceptron Learning

12/04/2017
by Haoyuan Wang, et al.

A cache prefetcher eliminates many compulsory cache misses by fetching data from slower memory into a faster cache before the processor actually requires it. Sophisticated prefetchers predict the next cache line to be used by replaying the program's historical spatial and temporal memory access patterns. However, they are error prone, and mispredictions lead to cache pollution and exert extra pressure on the memory subsystem. In this paper, a novel scheme of data cache prefetching with perceptron learning is proposed. The key idea is a two-level prefetching mechanism. A primary decision is made by an existing table-based prefetching mechanism, e.g. stride prefetching or Markov prefetching; then a neural network, the perceptron, detects and traces program memory access patterns to help reject unnecessary prefetching decisions. The perceptron can learn from both local and global history in time and space, and can be easily implemented in hardware. This mechanism boosts execution performance, ideally by mitigating cache pollution and eliminating redundant memory requests issued by the prefetcher. Detailed evaluation and analysis were conducted on the SPEC CPU 2006 benchmarks. The simulation results show that generally the proposed scheme yields a geometric mean of 60.64 [...] instructions per cycle (IPC) (floating between -2.22 [...] rate (floating between -1.67 [...] may refuse useful blocks and thus cause a minor rise in the cache miss rate, a lower memory request count can decrease average memory access latency, which compensates for the loss and, in the meantime, enhances overall performance in multi-programmed workloads.
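The two-level idea in the abstract can be illustrated with a minimal sketch: a stride prefetcher proposes a candidate address, and a perceptron then accepts or rejects it. This is not the paper's design. The feature choice, table size, weight range, training threshold, and all class and function names below are assumptions made for illustration only.

```python
from dataclasses import dataclass, field

# All constants are hypothetical; the abstract does not specify them.
TABLE_SIZE = 256        # entries per weight table
WEIGHT_MIN, WEIGHT_MAX = -16, 15   # saturating weight range
TRAIN_THRESHOLD = 8     # keep training while |score| is below this


@dataclass
class StridePrefetcher:
    """First level: a simple per-PC stride detector (table-based)."""
    last_addr: dict = field(default_factory=dict)
    last_stride: dict = field(default_factory=dict)

    def propose(self, pc, addr):
        """Return a candidate prefetch address, or None if no stable stride."""
        candidate = None
        if pc in self.last_addr:
            stride = addr - self.last_addr[pc]
            # Propose only when the same non-zero stride repeats.
            if stride != 0 and stride == self.last_stride.get(pc):
                candidate = addr + stride
            self.last_stride[pc] = stride
        self.last_addr[pc] = addr
        return candidate


class PerceptronFilter:
    """Second level: one weight table per feature, summed into a score."""

    def __init__(self, num_features=3):
        self.tables = [[0] * TABLE_SIZE for _ in range(num_features)]

    def _indices(self, features):
        # Trivial hash: fold each feature value into the table range.
        return [f % TABLE_SIZE for f in features]

    def score(self, features):
        return sum(t[i] for t, i in zip(self.tables, self._indices(features)))

    def accept(self, features):
        """Issue the prefetch only if the summed weights are non-negative."""
        return self.score(features) >= 0

    def train(self, features, useful):
        """Update weights once the outcome of a prefetch decision is known."""
        s = self.score(features)
        predicted_useful = s >= 0
        # Standard perceptron rule: train on a misprediction or low confidence.
        if predicted_useful != useful or abs(s) < TRAIN_THRESHOLD:
            step = 1 if useful else -1
            for t, i in zip(self.tables, self._indices(features)):
                t[i] = max(WEIGHT_MIN, min(WEIGHT_MAX, t[i] + step))


def make_features(pc, addr, candidate, line_bits=6):
    """Hypothetical feature vector: PC, candidate line number, address delta."""
    return (pc, candidate >> line_bits, candidate - addr)
```

In such a sketch, a simulator would call `propose` on each demand access, pass the candidate through `accept`, and call `train` with `useful=True` when a prefetched line is hit before eviction and `useful=False` when it is evicted unused. Small saturating integer weights and a plain sum are chosen here because they map naturally onto adders and small tables, in line with the abstract's claim that the perceptron is easy to implement in hardware.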

