Accelerating Bulk Bit-Wise X(N)OR Operation in Processing-in-DRAM Platform
With von Neumann computing architectures struggling to address today's computationally and memory-intensive big-data analytics tasks, Processing-in-Memory (PIM) platforms are gaining growing interest. Among them, processing-in-DRAM architectures have achieved remarkable success by dramatically reducing data-transfer energy and latency. However, the performance of such systems unavoidably diminishes when dealing with more complex applications that require bulk bit-wise X(N)OR or addition operations, despite utilizing the maximum internal DRAM bandwidth and in-memory parallelism. In this paper, we develop DRIM, a platform that harnesses DRAM as computational memory and transforms it into a fundamental processing unit. DRIM exploits the analog operation of DRAM sub-arrays to implement bit-wise X(N)OR operations between operands stored in the same bit-line, based on a new dual-row activation mechanism that requires only a modest change to peripheral circuits such as the sense amplifiers. Simulation results show that DRIM achieves on average 71x and 8.4x higher throughput for bulk bit-wise X(N)OR-based operations compared with CPU and GPU, respectively. Moreover, DRIM outperforms recent processing-in-DRAM platforms by up to 3.7x.
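For reference, the bulk bit-wise XNOR workload that such platforms target corresponds, in conventional software, to element-wise XNOR over long packed bit-vectors. The minimal C sketch below shows this CPU-side baseline formulation only; it is not DRIM's interface, and the function name and word-packing scheme are illustrative assumptions. DRIM performs the equivalent computation in place inside the DRAM sub-array via dual-row activation, avoiding the per-word data movement this loop implies.

```c
#include <stdint.h>
#include <stddef.h>

/* CPU-side baseline for bulk bit-wise XNOR (illustrative sketch).
 * Two operand bit-vectors are packed into 64-bit words; every bit
 * position is combined as dst = ~(a ^ b). A processing-in-DRAM design
 * computes the same result on entire rows without moving data to the CPU. */
static void bulk_xnor(uint64_t *dst, const uint64_t *a,
                      const uint64_t *b, size_t nwords)
{
    for (size_t i = 0; i < nwords; i++)
        dst[i] = ~(a[i] ^ b[i]);
}
```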