Speed Is All You Need: On-Device Acceleration of Large Diffusion Models via GPU-Aware Optimizations

04/21/2023
by   Yu-Hui Chen, et al.

The rapid development and application of foundation models have revolutionized the field of artificial intelligence. Large diffusion models have gained significant attention for their ability to generate photorealistic images and support various tasks. On-device deployment of these models provides benefits such as lower server costs, offline functionality, and improved user privacy. However, common large diffusion models have over 1 billion parameters and pose challenges due to restricted computational and memory resources on devices. We present a series of implementation optimizations for large diffusion models that achieve the fastest reported inference latency to date (under 12 seconds for Stable Diffusion 1.4 without int8 quantization on a Samsung S23 Ultra for a 512x512 image with 20 iterations) on GPU-equipped mobile devices. These enhancements broaden the applicability of generative AI and improve the overall user experience across a wide range of devices.
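The headline number can be decomposed: a text-to-image diffusion pipeline runs its text encoder once, the UNet denoiser once per sampling step (20 iterations here), and the VAE decoder once, so the per-step UNet cost dominates end-to-end latency. A minimal sketch of this accounting, using hypothetical per-component timings that are not from the paper:

```python
def estimate_latency_ms(text_encoder_ms: float,
                        unet_step_ms: float,
                        vae_decoder_ms: float,
                        steps: int = 20) -> float:
    """End-to-end diffusion latency: the text encoder and VAE decoder
    each run once, while the UNet denoiser runs once per sampling step."""
    return text_encoder_ms + steps * unet_step_ms + vae_decoder_ms

# Hypothetical per-component timings (illustrative only, not measured values):
total_ms = estimate_latency_ms(text_encoder_ms=10.0,
                               unet_step_ms=550.0,
                               vae_decoder_ms=500.0)
```

With these assumed numbers the 20 UNet steps account for over 95% of the total, which is why the paper's GPU-aware optimizations target the denoiser's attention and convolution kernels.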


