A Real Time Super Resolution Accelerator with Tilted Layer Fusion

05/09/2022
by An-Jung Huang, et al.

Deep-learning-based super-resolution achieves high-quality results, but its heavy computational workload, large buffer requirements, and high external memory bandwidth inhibit its use on mobile devices. To address these issues, this paper proposes a real-time hardware accelerator with a tilted layer fusion method that reduces external DRAM bandwidth by 92% and needs only 102KB of on-chip memory. Implemented in a 40nm CMOS process, the design achieves 1920x1080@60fps throughput with a 544.3K gate count when running at 600MHz, giving it higher throughput and lower area cost than previous designs.
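
To put the quoted throughput in perspective, the following is a rough arithmetic sketch, not taken from the paper. It assumes that 1920x1080@60fps describes output frames and asks how many clock cycles a 600MHz design has available per output pixel. The function and constant names are illustrative only.

```python
# Back-of-the-envelope cycle budget from the figures quoted in the abstract.
# Assumption (not stated in the source): 1920x1080@60fps refers to output frames.

WIDTH, HEIGHT, FPS = 1920, 1080, 60
CLOCK_HZ = 600_000_000  # 600MHz operating frequency from the abstract


def pixels_per_second(width: int, height: int, fps: int) -> int:
    """Number of pixels that must be produced each second."""
    return width * height * fps


def cycles_per_pixel(clock_hz: int, pixel_rate: int) -> float:
    """Average number of clock cycles available per pixel."""
    return clock_hz / pixel_rate


pixel_rate = pixels_per_second(WIDTH, HEIGHT, FPS)
budget = cycles_per_pixel(CLOCK_HZ, pixel_rate)
print(f"{pixel_rate} pixels/s, about {budget:.2f} cycles per pixel")
```

Under that assumption the design has fewer than five cycles on average for each output pixel, which suggests why avoiding stalls on external memory matters for real-time operation.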


research
05/02/2022

BSRA: Block-based Super Resolution Accelerator with Hardware Efficient Pixel Attention

Increasingly, convolution neural network (CNN) based super resolution mo...
research
05/02/2022

A Real Time 1280x720 Object Detection Chip With 585MB/s Memory Traffic

Memory bandwidth has become the real-time bottleneck of current deep lea...
research
08/30/2023

ACNPU: A 4.75TOPS/W 1080P@30FPS Super Resolution Accelerator with Decoupled Asymmetric Convolution

Deep learning-driven superresolution (SR) outperforms traditional techni...
research
11/08/2017

Hydra: An Accelerator for Real-Time Edge-Aware Permeability Filtering in 65nm CMOS

Many modern video processing pipelines rely on edge-aware (EA) filtering...
research
01/18/2018

On-Chip CNN Accelerator for Image Super-Resolution

To implement convolutional neural networks (CNN) in hardware, the state-...
research
05/09/2022

Row-wise Accelerator for Vision Transformer

Following the success of the natural language processing, the transforme...
research
02/23/2022

Alleviating Datapath Conflicts and Design Centralization in Graph Analytics Acceleration

Previous graph analytics accelerators have achieved great improvement on...
