Octopus: A Heterogeneous In-network Computing Accelerator Enabling Deep Learning for network

08/22/2023
by Dong Wen, et al.

Deep learning (DL) models for networking have achieved excellent performance and are becoming a promising component of future intelligent network systems. Programmable in-network computing devices have great potential for deploying such models; however, existing devices lack the capacity to run a DL model. The main challenges in supporting DL-based network models on the data plane lie in computing power, task granularity, model generality, and feature extraction. To address these problems, we propose Octopus, a heterogeneous in-network computing accelerator that enables DL for network models. A dedicated feature extractor provides fast and efficient feature extraction. A vector accelerator and a systolic array work together heterogeneously, offering low-latency, high-throughput general-purpose computing for both packet-based and flow-based tasks. Octopus also contains an on-chip memory fabric for storage and interconnection, and a RISC-V core for global control. The proposed design is implemented on an FPGA. Its functionality and performance are validated in several use cases, achieving 31 Mpkt/s feature extraction, 207 ns packet-based computing latency, and 90 kflow/s flow-based computing throughput.
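To make the distinction between packet-based and flow-based tasks concrete, the following is a minimal software sketch of feature extraction at both granularities. It is not the Octopus hardware design: the `Packet` and `FlowState` types, the chosen features, and the 5-tuple flow key are all assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    """Hypothetical parsed packet header (illustrative fields only)."""
    src: str
    dst: str
    sport: int
    dport: int
    proto: int
    length: int       # packet length in bytes
    timestamp: float  # arrival time in seconds


@dataclass
class FlowState:
    """Running per-flow statistics, updated once per packet."""
    packets: int = 0
    total_bytes: int = 0
    first_seen: float = 0.0
    last_seen: float = 0.0

    def features(self):
        # Flow-level feature vector: count, bytes, mean length, duration.
        mean_len = self.total_bytes / self.packets
        duration = self.last_seen - self.first_seen
        return [self.packets, self.total_bytes, mean_len, duration]


def packet_features(pkt):
    # Packet-level feature vector, available as soon as the packet arrives.
    return [pkt.length, pkt.proto, pkt.sport, pkt.dport]


def extract(packets):
    """Return (per-packet feature list, per-flow feature dict)."""
    flows = {}
    per_packet = []
    for pkt in packets:
        per_packet.append(packet_features(pkt))
        # Assumed flow key: the classic 5-tuple.
        key = (pkt.src, pkt.dst, pkt.sport, pkt.dport, pkt.proto)
        state = flows.get(key)
        if state is None:
            state = flows[key] = FlowState(first_seen=pkt.timestamp)
        state.packets += 1
        state.total_bytes += pkt.length
        state.last_seen = pkt.timestamp
    per_flow = {key: state.features() for key, state in flows.items()}
    return per_packet, per_flow
```

In this sketch, packet-level features can be handed to a model immediately, which suits latency-sensitive tasks, while flow-level features only become meaningful after several packets have been aggregated, which suits throughput-oriented tasks.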


