DeepDive: An Integrative Algorithm/Architecture Co-Design for Deep Separable Convolutional Neural Networks

07/18/2020
by Mohammadreza Baharani, et al.

Deep Separable Convolutional Neural Networks (DSCNNs) have become an emerging paradigm: they offer modular networks with structural sparsity, achieving higher accuracy with relatively fewer operations and parameters. However, there is a lack of customized architectures flexible enough to fit the sparsity of DSCNNs. This paper introduces DeepDive, a fully functional, vertical co-design framework for power-efficient implementation of DSCNNs on edge FPGAs. DeepDive's architecture provides heterogeneous Compute Units (CUs) that fully support DSCNNs, covering the various convolutional operators interconnected with structural sparsity. It offers FPGA-aware training and online quantization combined with modular, synthesizable C++ CUs customized for DSCNNs. Execution results on Xilinx's ZCU102 FPGA board demonstrate 47.4 and 233.3 FPS/Watt for MobileNet-V2 and a compact version of EfficientNet, respectively, two state-of-the-art depthwise separable CNNs. DeepDive improves FPS/Watt by 2.2× and 1.51× over the Jetson Nano's high- and low-power modes, respectively, and by about 2.27× and 37.25× over two other FPGA implementations. The DeepDive output for MobileNet-V2 is available at https://github.com/TeCSAR-UNCC/DeepDive.
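
To make the "fewer operations and parameters" claim concrete, the following is a minimal sketch comparing a standard convolution with its depthwise separable counterpart (a depthwise k×k convolution followed by a 1×1 pointwise convolution). It is not taken from the DeepDive code base: the function names and the example layer shape are illustrative assumptions, and biases are ignored.

```python
# Illustrative cost model only; not part of DeepDive.
# Assumes square k x k kernels and no bias terms.

def standard_conv_cost(c_in, c_out, k, h_out, w_out):
    """Return (parameters, multiply-accumulates) of a standard convolution."""
    params = k * k * c_in * c_out
    macs = params * h_out * w_out  # every weight is used once per output pixel
    return params, macs


def separable_conv_cost(c_in, c_out, k, h_out, w_out):
    """Return (parameters, multiply-accumulates) of a depthwise separable convolution."""
    depthwise_params = k * k * c_in   # one k x k filter per input channel
    pointwise_params = c_in * c_out   # 1 x 1 convolution that mixes channels
    params = depthwise_params + pointwise_params
    macs = params * h_out * w_out
    return params, macs


def reduction_ratio(c_in, c_out, k):
    """Separable / standard parameter ratio; algebraically 1/c_out + 1/k**2."""
    sep_params, _ = separable_conv_cost(c_in, c_out, k, 1, 1)
    std_params, _ = standard_conv_cost(c_in, c_out, k, 1, 1)
    return sep_params / std_params


if __name__ == "__main__":
    # Hypothetical layer shape: 32 -> 64 channels, 3 x 3 kernel, 56 x 56 output.
    print(standard_conv_cost(32, 64, 3, 56, 56))
    print(separable_conv_cost(32, 64, 3, 56, 56))
    print(reduction_ratio(32, 64, 3))
```

For the hypothetical 32-to-64-channel layer with a 3×3 kernel, the separable form needs 2,336 parameters instead of 18,432, roughly an eighth of the standard convolution's cost.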


