RowClone: Accelerating Data Movement and Initialization Using DRAM

05/07/2018
by Vivek Seshadri, et al.

In existing systems, to perform any bulk data movement operation (copy or initialization), the data must first be read into the on-chip processor, all the way into the L1 cache, and the result of the operation must then be written back to main memory, even though these operations involve no actual computation. RowClone exploits the organization and operation of commodity DRAM to perform these operations completely inside DRAM using two mechanisms. The first mechanism, Fast Parallel Mode, copies data between two rows inside the same DRAM subarray by issuing back-to-back activate commands to the source and the destination row. The second mechanism, Pipelined Serial Mode, transfers cache lines between two banks using the shared internal bus. RowClone significantly reduces the raw latency and energy consumption of bulk data copy and initialization. This reduction translates directly into improved performance and energy efficiency for systems running copy-intensive or initialization-intensive workloads.
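As a rough illustration of how the two mechanisms divide the work, the sketch below picks a mode from the source and destination row addresses and lists the commands a memory controller might issue. It is a toy model, not the authors' implementation: the `RowAddr` type, the `TRANSFER` command name, the `FALLBACK` mode, and the figure of 128 cache lines per row are all assumptions made for the example.

```python
from dataclasses import dataclass

# Assumption: 128 cache lines per DRAM row (e.g. an 8 KB row of 64 B lines).
LINES_PER_ROW = 128


@dataclass(frozen=True)
class RowAddr:
    """Hypothetical row address: bank, subarray in the bank, row in the subarray."""
    bank: int
    subarray: int
    row: int


def choose_mode(src: RowAddr, dst: RowAddr) -> str:
    """Pick a copy mechanism based on where the two rows live."""
    if src.bank == dst.bank and src.subarray == dst.subarray:
        return "FPM"  # Fast Parallel Mode: both rows in the same subarray
    if src.bank != dst.bank:
        return "PSM"  # Pipelined Serial Mode: rows in two different banks
    # Same bank, different subarrays: not covered by the abstract, so this
    # sketch simply reports that neither mode applies directly.
    return "FALLBACK"


def plan_copy(src: RowAddr, dst: RowAddr, lines_per_row: int = LINES_PER_ROW):
    """Return (mode, command list) for copying one row from src to dst."""
    mode = choose_mode(src, dst)
    if mode == "FPM":
        # Back-to-back activates to the source row, then the destination row.
        # Closing the bank afterwards is omitted from this sketch.
        return mode, [("ACTIVATE", src), ("ACTIVATE", dst)]
    if mode == "PSM":
        # One hypothetical TRANSFER per cache line over the shared internal bus.
        return mode, [("TRANSFER", src, dst, line) for line in range(lines_per_row)]
    return mode, []


mode, cmds = plan_copy(RowAddr(0, 3, 10), RowAddr(0, 3, 42))
print(mode, len(cmds))  # FPM 2
```

In this model a same-subarray copy needs only two commands regardless of row size, while an inter-bank copy needs one transfer per cache line. That difference is the reason to prefer Fast Parallel Mode whenever both rows share a subarray.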


research
10/30/2016

The Processing Using Memory Paradigm: In-DRAM Bulk Copy, Initialization, Bitwise AND and OR

In existing systems, the off-chip memory interface allows the memory con...
research
04/21/2020

NOM: Network-On-Memory for Inter-Bank Data Transfer in Highly-Banked Memories

Data copy is a widely-used memory operation in many programs and operati...
research
05/20/2016

Simple DRAM and Virtual Memory Abstractions to Enable Highly Efficient Memory Systems

In most modern systems, the memory subsystem is managed and accessed at ...
research
05/23/2019

In-DRAM Bulk Bitwise Execution Engine

Many applications heavily use bitwise operations on large bitvectors as ...
research
10/29/2021

PiDRAM: A Holistic End-to-end FPGA-based Framework for Processing-in-DRAM

Processing-using-memory (PuM) techniques leverage the analog operation o...
research
05/08/2018

LISA: Increasing Internal Connectivity in DRAM for Fast Data Movement and Low Latency

This paper summarizes the idea of Low-Cost Interlinked Subarrays (LISA),...
research
06/01/2022

PiDRAM: An FPGA-based Framework for End-to-end Evaluation of Processing-in-DRAM Techniques

DRAM-based main memory is used in nearly all computing systems as a majo...
