Using ECC DRAM to Adaptively Increase Memory Capacity

06/27/2017
by Yixin Luo, et al.

Modern DRAM modules are often equipped with hardware error correction capabilities, especially when deployed in large-scale data centers, as process technology scaling has increased the susceptibility of these devices to errors. To provide fast error detection and correction, error-correcting codes (ECC) are placed on an additional DRAM chip in a DRAM module. This additional chip expands the raw capacity of a DRAM module by 12.5%, but applications are unable to use any of this extra capacity, as it is used exclusively to provide reliability for all data. In reality, a number of applications do not need such strong reliability for all of their data regions (e.g., some user batch jobs executing on a public cloud), and can instead benefit from using the additional DRAM capacity to store extra data. Our goal in this work is to provide the additional capacity within an ECC DRAM module to applications when they do not need the high reliability of error correction. In this paper, we propose Capacity- and Reliability-Adaptive Memory (CREAM), a hardware mechanism that adapts error-correcting DRAM modules to offer multiple levels of error protection, and provides the capacity saved by using weaker protection to applications. For regions of memory that do not require strong error correction, we either provide no ECC protection or provide error detection using multibit parity. We evaluate several layouts for arranging the data within ECC DRAM in these reduced-protection modes, taking into account the various trade-offs exposed by exploiting the extra chip. Our experiments show that the increased capacity provided by CREAM improves performance by 23.0% for a memory caching workload, and by 37.3% for a workload executing production query traces. In addition, CREAM can increase bank-level parallelism within DRAM, offering further performance improvements.
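The capacity arithmetic behind the abstract can be illustrated with a short sketch. It assumes the common ECC module organization of 64 data bits stored alongside 8 check bits, which is consistent with the 12.5% figure above. The parity configuration (8 parity bits per 64-byte block) and the helper names `capacity_gain` and `parity8` are illustrative assumptions, not details taken from the paper.

```python
from fractions import Fraction
from functools import reduce

# Assumed baseline: 64 data bits stored alongside 8 ECC bits (72 bits total).
BASELINE_USABLE = Fraction(64, 72)


def capacity_gain(data_bits: int, check_bits: int) -> Fraction:
    """Usable-capacity gain over the assumed ECC baseline for a given protection mode."""
    usable = Fraction(data_bits, data_bits + check_bits)
    return usable / BASELINE_USABLE - 1


def parity8(block: bytes) -> int:
    """8-bit interleaved parity: XOR of all bytes in the block (hypothetical scheme)."""
    return reduce(lambda acc, byte: acc ^ byte, block, 0)


# No protection: every stored bit holds data.
no_ecc_gain = capacity_gain(64, 0)

# Detection only: assume 8 parity bits per 64-byte (512-bit) block.
parity_gain = capacity_gain(512, 8)

# A single flipped bit changes the parity byte, so the error is detected
# (but cannot be located or corrected from the parity alone).
block = bytes(range(64))
stored_parity = parity8(block)
corrupted = bytearray(block)
corrupted[10] ^= 0x04
detected = parity8(bytes(corrupted)) != stored_parity

print(f"no ECC gain: {float(no_ecc_gain):.1%}")
print(f"parity gain: {float(parity_gain):.1%}")
print(f"single-bit error detected: {detected}")
```

Under these assumptions, dropping protection entirely recovers the full extra chip, while a detection-only mode gives back slightly less because some bits are still spent on parity.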


