Hardware-Efficient Deconvolution-Based GAN for Edge Computing

by   Azzam Alhussain, et al.

Generative Adversarial Networks (GAN) are cutting-edge algorithms for generating new data samples based on the learned data distribution. However, its performance comes at a significant cost in terms of computation and memory requirements. In this paper, we proposed an HW/SW co-design approach for training quantized deconvolution GAN (QDCGAN) implemented on FPGA using a scalable streaming dataflow architecture capable of achieving higher throughput versus resource utilization trade-off. The developed accelerator is based on an efficient deconvolution engine that offers high parallelism with respect to scaling factors for GAN-based edge computing. Furthermore, various precisions, datasets, and network scalability were analyzed for low-power inference on resource-constrained platforms. Lastly, an end-to-end open-source framework is provided for training, implementation, state-space exploration, and scaling the inference using Vivado high-level synthesis for Xilinx SoC-FPGAs, and a comparison testbed with Jetson Nano.



There are no comments yet.


page 4


A Competitive Edge: Can FPGAs Beat GPUs at DCNN Inference Acceleration in Resource-Limited Edge Computing Applications?

When trained as generative models, Deep Learning algorithms have shown e...

Systimator: A Design Space Exploration Methodology for Systolic Array based CNNs Acceleration on the FPGA-based Edge Nodes

The evolution of IoT based smart applications demand porting of artifici...

FINN-L: Library Extensions and Design Trade-off Analysis for Variable Precision LSTM Networks on FPGAs

It is well known that many types of artificial neural networks, includin...

An Energy-Efficient Edge Computing Paradigm for Convolution-based Image Upsampling

A novel energy-efficient edge computing paradigm is proposed for real-ti...

Edge AIBench: Towards Comprehensive End-to-end Edge Computing Benchmarking

In edge computing scenarios, the distribution of data and collaboration ...

RED: A ReRAM-based Deconvolution Accelerator

Deconvolution has been widespread in neural networks. For example, it is...

Automated Space/Time Scaling of Streaming Task Graph

In this paper, we describe a high-level synthesis (HLS) tool that automa...
This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.