DeepDive: An Integrative Algorithm/Architecture Co-Design for Deep Separable Convolutional Neural Networks

07/18/2020

Deep Separable Convolutional Neural Networks (DSCNNs) have emerged as a leading paradigm: they offer modular networks with structural sparsity that achieve higher accuracy with relatively fewer operations and parameters. However, customized architectures that offer flexible solutions matched to the sparsity of DSCNNs are lacking. This paper introduces DeepDive, a fully functional, vertical co-design framework for power-efficient implementation of DSCNNs on edge FPGAs. DeepDive's architecture provides heterogeneous Compute Units (CUs) to fully support DSCNNs, whose various convolutional operators are interconnected with structural sparsity. It offers FPGA-aware training and online quantization, combined with modular, synthesizable C++ CUs customized for DSCNNs. Execution results on Xilinx's ZCU102 FPGA board demonstrate 47.4 and 233.3 FPS/Watt for MobileNet-V2 and a compact version of EfficientNet, respectively, two state-of-the-art depthwise separable CNNs. DeepDive improves FPS/Watt by 2.2× and 1.51× over the Jetson Nano's high and low power modes, respectively, and by about 2.27× and 37.25× over two other FPGA implementations. The DeepDive output for MobileNet-V2 is available at https://github.com/TeCSAR-UNCC/DeepDive.
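
The claim that separable convolutions need fewer operations and parameters can be illustrated with a small calculation. The sketch below is not taken from DeepDive; it uses the standard textbook cost model for a depthwise separable layer (a k x k depthwise convolution followed by a 1 x 1 pointwise convolution) and assumes stride 1, "same" padding, and no bias terms. The function names and the layer dimensions in the example are hypothetical, chosen only for illustration.

```python
def standard_conv_cost(h, w, c_in, c_out, k):
    """Parameters and multiply-accumulates (MACs) of a k x k standard
    convolution with an h x w output map (assumed: stride 1, no bias)."""
    params = k * k * c_in * c_out
    macs = params * h * w  # every weight is used once per output position
    return params, macs


def depthwise_separable_cost(h, w, c_in, c_out, k):
    """Parameters and MACs of a k x k depthwise convolution followed by a
    1 x 1 pointwise convolution (same assumptions as above)."""
    depthwise_params = k * k * c_in   # one k x k filter per input channel
    pointwise_params = c_in * c_out   # 1 x 1 convolution that mixes channels
    params = depthwise_params + pointwise_params
    macs = params * h * w
    return params, macs


if __name__ == "__main__":
    # Hypothetical layer: 56 x 56 output, 64 -> 128 channels, 3 x 3 kernel.
    std_params, std_macs = standard_conv_cost(56, 56, 64, 128, 3)
    sep_params, sep_macs = depthwise_separable_cost(56, 56, 64, 128, 3)
    print(f"standard:  {std_params} params, {std_macs} MACs")
    print(f"separable: {sep_params} params, {sep_macs} MACs")
    print(f"reduction: {std_params / sep_params:.2f}x")
```

Under these assumptions the reduction factor is 1 / (1/c_out + 1/k²), so it is bounded by k² and approaches that bound as the number of output channels grows.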
