We make three observations in modern processors: (1) LLC capacity is get...
The security goals of cloud providers and users include memory
confident...
High load latency that results from deep cache hierarchies and relativel...
DL inference queries play an important role in diverse internet services...
Resistive memories have limited lifetime caused by limited write enduran...
Modern recommendation systems rely on real-valued embeddings of categori...
Modern deep learning models have high memory and computation cost. To ma...
Near-data accelerators (NDAs) that are integrated with main memory have ...
Training convolutional neural networks (CNNs) requires intense compute
t...
GPUs offer orders-of-magnitude higher memory bandwidth than traditional
...
Model pruning is a popular mechanism to make a network more efficient fo...
Training convolutional neural networks (CNNs) requires intense computati...