Bandana: Using Non-volatile Memory for Storing Deep Learning Models

11/14/2018
by Assaf Eisenman, et al.
Typical large-scale recommender systems use deep learning models that are stored in large amounts of DRAM. These models often rely on embeddings, which consume most of the required memory. We present Bandana, a storage system that reduces the DRAM footprint of embeddings by using Non-volatile Memory (NVM) as the primary storage medium, with a small amount of DRAM as a cache. The main challenge in storing embeddings on NVM is its limited read bandwidth compared to DRAM. Bandana uses two primary techniques to address this limitation. First, it uses hypergraph partitioning to store embedding vectors that are likely to be read together in the same physical location. Second, it decides the number of embedding vectors to cache in DRAM by simulating dozens of small caches. These techniques allow Bandana to increase the effective read bandwidth of NVM by 2-3x and thereby significantly reduce the total cost of ownership.
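
To make the notion of effective read bandwidth concrete, the following is a minimal sketch and not Bandana's implementation. It assumes NVM is read in fixed-size blocks, that each query fetches every distinct block it touches exactly once, and that there is no DRAM cache. The names `blocks_read` and `useful_fraction`, the toy workload, and both placements are hypothetical and chosen only for illustration. The hypergraph partitioning step is not shown; the "grouped" placement simply stands in for its intended outcome.

```python
from typing import Dict, Sequence


def blocks_read(queries: Sequence[Sequence[int]], block_of: Dict[int, int]) -> int:
    """Count block reads, assuming each query fetches every distinct block it touches once."""
    return sum(len({block_of[v] for v in q}) for q in queries)


def useful_fraction(
    queries: Sequence[Sequence[int]],
    block_of: Dict[int, int],
    vectors_per_block: int,
) -> float:
    """Fraction of the vectors transferred from NVM that a query actually requested."""
    requested = sum(len(set(q)) for q in queries)
    transferred = blocks_read(queries, block_of) * vectors_per_block
    return requested / transferred


# Hypothetical toy workload: vectors 0-3 are always read together, as are 4-7.
queries = [[0, 1, 2, 3], [4, 5, 6, 7], [0, 1, 2, 3], [4, 5, 6, 7]]
vectors_per_block = 4

# Placement A (assumed baseline): co-accessed vectors are spread over both blocks.
scattered = {v: v % 2 for v in range(8)}
# Placement B: co-accessed vectors share one block.
grouped = {v: v // 4 for v in range(8)}

scattered_fraction = useful_fraction(queries, scattered, vectors_per_block)
grouped_fraction = useful_fraction(queries, grouped, vectors_per_block)
print(scattered_fraction, grouped_fraction)
```

In this toy case the grouped placement needs half as many block reads for the same queries, so a larger share of every byte read from NVM is useful. Real access patterns are far less regular, which is why the abstract describes a partitioning step rather than a fixed layout.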

research
07/04/2019

TicToc: Enabling Bandwidth-Efficient DRAM Caching for both Hits and Misses in Hybrid Memory Systems

This paper investigates bandwidth-efficient DRAM caching for hybrid DRAM...
research
02/08/2017

Flashield: a Key-value Cache that Minimizes Writes to Flash

As its price per bit drops, SSD is increasingly becoming the default sto...
research
01/20/2018

Storage-Class Memory Hierarchies for Scale-Out Servers

With emerging storage-class memory (SCM) nearing commercialization, ther...
research
08/28/2023

Scalable and Configurable Tracking for Any Rowhammer Threshold

The Rowhammer vulnerability continues to get worse, with the Rowhammer T...
research
01/20/2018

Storage-Class Memory Hierarchies for Servers

With emerging storage-class memory (SCM) nearing commercialization, ther...
research
10/27/2019

Semi-Asymmetric Parallel Graph Algorithms for NVRAMs

Emerging non-volatile main memory (NVRAM) technologies provide novel fea...
research
12/22/2020

Reducing Solid-State Drive Read Latency by Optimizing Read-Retry (Extended Abstract)

3D NAND flash memory with advanced multi-level cell techniques provides ...
