Mattan Erez

research

∙ 01/10/2023

Harvesting L2 Caches in Server Processors

We make three observations in modern processors: (1) LLC capacity is get...

Majid Jalili, et al.

research

∙ 09/01/2022

SecDDR: Enabling Low-Cost Secure Memories by Protecting the DDR Interface

The security goals of cloud providers and users include memory confident...

Ali Fakhrzadehgan, et al.

research

∙ 03/27/2021

Reducing Load Latency with Cache Level Prediction

High load latency that results from deep cache hierarchies and relativel...

Majid Jalili, et al.

research

∙ 11/30/2020

Accelerating Bandwidth-Bound Deep Learning Inference with Main-Memory Accelerators

DL inference queries play an important role in diverse internet services...

Benjamin Y. Cho, et al.

research

∙ 10/06/2020

WoLFRaM: Enhancing Wear-Leveling and Fault Tolerance in Resistive Memories using Programmable Address Decoders

Resistive memories have limited lifetime caused by limited write enduran...

Leonid Yavits, et al.

research

∙ 06/10/2020

Training with Multi-Layer Embeddings for Model Reduction

Modern recommendation systems rely on real-valued embeddings of categori...

Benjamin Ghaemmaghami, et al.

research

∙ 04/27/2020

FlexSA: Flexible Systolic Array Architecture for Efficient Pruned DNN Model Training

Modern deep learning models have high memory and computation cost. To ma...

Sangkug Lym, et al.

research

∙ 08/18/2019

CHoNDA: Near Data Acceleration with Concurrent Host Access

Near-data accelerators (NDAs) that are integrated with main memory have ...

Benjamin Y. Cho, et al.

research

∙ 04/02/2019

DeLTA: GPU Performance Model for Deep Learning Applications with In-depth Memory System Traffic Analysis

Training convolutional neural networks (CNNs) requires intense compute t...

Sangkug Lym, et al.

research

∙ 03/06/2019

Buddy Compression: Enabling Larger Memory for Deep Learning and HPC Workloads on GPUs

GPUs offer orders-of-magnitude higher memory bandwidth than traditional ...

Esha Choukse, et al.

research

∙ 01/26/2019

PruneTrain: Gradual Structured Pruning from Scratch for Faster Neural Network Training

Model pruning is a popular mechanism to make a network more efficient fo...

Sangkug Lym, et al.

research

∙ 09/30/2018

Mini-batch Serialization: CNN Training with Inter-layer Data Reuse

Training convolutional neural networks (CNNs) requires intense computati...

Sangkug Lym, et al.
