ALPINE: Analog In-Memory Acceleration with Tight Processor Integration for Deep Learning

05/20/2022
by   Joshua Klein, et al.

Analog in-memory computing (AIMC) cores offer significant performance and energy benefits for neural network inference compared to digital logic (e.g., CPUs). AIMC accelerates matrix-vector multiplications, which dominate these applications' run-time. However, AIMC-centric platforms lack the flexibility of general-purpose systems, as they often have hard-coded data flows and can only support a limited set of processing functions. With the goal of bridging this gap in flexibility, we present a novel system architecture that tightly integrates analog in-memory computing accelerators into multi-core CPUs in general-purpose systems. We present ALPINE, a powerful full system-level simulation framework built on the gem5-X simulator, which enables an in-depth characterization of the proposed architecture. ALPINE allows the simulation of the entire computer architecture stack, from major hardware components to their interactions with the Linux OS. Within ALPINE, we define a custom ISA extension and a software library to facilitate the deployment of inference models. We showcase and analyze a variety of mappings of different neural network types, and demonstrate up to 20.5x/20.8x performance/energy gains with respect to a SIMD-enabled ARM CPU implementation for convolutional neural networks, multi-layer perceptrons, and recurrent neural networks.
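The sketch below is a minimal illustration, not the actual ALPINE library or ISA extension, of why AIMC integration pays off: a fully connected layer reduces to a matrix-vector multiplication (MVM) that can be tiled onto fixed-size analog crossbars, leaving only accumulation and the activation to the digital CPU. The routine name aimc_mvm, the crossbar dimensions, and the tiling scheme are hypothetical placeholders.

import numpy as np

# Assumed crossbar dimensions for this sketch (not taken from the paper).
CROSSBAR_ROWS, CROSSBAR_COLS = 256, 256

def aimc_mvm(weights: np.ndarray, x: np.ndarray) -> np.ndarray:
    """Stand-in for an offloaded analog MVM: tiles the weight matrix onto
    fixed-size crossbars and accumulates the partial results digitally."""
    rows, cols = weights.shape
    y = np.zeros(rows)
    for r in range(0, rows, CROSSBAR_ROWS):
        for c in range(0, cols, CROSSBAR_COLS):
            tile = weights[r:r + CROSSBAR_ROWS, c:c + CROSSBAR_COLS]
            y[r:r + CROSSBAR_ROWS] += tile @ x[c:c + CROSSBAR_COLS]
    return y

def mlp_layer(weights: np.ndarray, bias: np.ndarray, x: np.ndarray) -> np.ndarray:
    """One fully connected layer: the MVM dominates its cost, so mapping it
    onto in-memory crossbars removes most of the digital compute."""
    return np.maximum(aimc_mvm(weights, x) + bias, 0.0)  # ReLU activation

# Example: a 512->256 layer evaluated on a random input vector.
W = np.random.randn(256, 512)
b = np.zeros(256)
out = mlp_layer(W, b, np.random.randn(512))
print(out.shape)  # (256,)

In the tightly integrated architecture described above, a call such as aimc_mvm would correspond to an offload through the custom ISA extension rather than a software loop, while the surrounding control flow stays on the general-purpose CPU.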

