Catch-Up Distillation: You Only Need to Train Once for Accelerating Sampling

05/18/2023
by Shitong Shao, et al.

Diffusion Probabilistic Models (DPMs) have made impressive advancements in various machine learning domains. However, achieving high-quality synthetic samples typically requires a large number of sampling steps, which precludes real-time sample synthesis. Traditional accelerated sampling algorithms based on knowledge distillation rely on pre-trained model weights and discrete time-step scenarios, necessitating additional training sessions to achieve their goals. To address these issues, we propose Catch-Up Distillation (CUD), which encourages the current-moment output of the velocity estimation model to "catch up" with its previous-moment output. Specifically, CUD adjusts the original Ordinary Differential Equation (ODE) training objective to align the current-moment output with both the ground-truth label and the previous-moment output, using Runge-Kutta-based multi-step alignment distillation for precise ODE estimation while preventing asynchronous updates. Furthermore, we investigate the design space for CUD under continuous time-step scenarios and analyze how to determine suitable strategies. To demonstrate CUD's effectiveness, we conduct thorough ablation and comparison experiments on CIFAR-10, MNIST, and ImageNet-64. On CIFAR-10, we obtain an FID of 2.80 by sampling in 15 steps under one-session training, and a new state-of-the-art FID of 3.37 by sampling in one step with additional training. The latter result required only 620K iterations with a batch size of 128, in contrast to Consistency Distillation, which demanded 2.1M iterations with a larger batch size of 256. Our code is released at https://anonymous.4open.science/r/Catch-Up-Distillation-E31F.
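To make the training objective described above concrete, here is a minimal sketch of a catch-up-style loss in PyTorch. The abstract only describes the method at a high level, so everything specific below is an assumption made for illustration: the function name catch_up_loss, the model signature model(x_t, t), the linear-interpolation noising path with velocity target (noise - x0), the fixed previous-moment offset delta, the Heun (second-order Runge-Kutta) step, and the use of a frozen EMA copy as the previous-moment reference. It is not the authors' exact formulation.

```python
import torch
import torch.nn.functional as F


def catch_up_loss(model, ema_model, x0, delta=0.05):
    """Illustrative catch-up-style objective for a velocity-prediction model.

    `model` and `ema_model` are assumed to be callables v(x_t, t) returning a
    velocity with the same shape as x_t; `x0` is a batch of clean images.
    """
    b = x0.shape[0]
    # Sample a continuous time step t in (delta, 1) so that t - delta stays valid.
    t = delta + (1.0 - delta) * torch.rand(b, device=x0.device)
    t_ = t.view(-1, 1, 1, 1)

    # Linear-interpolation corruption; under this parameterization the
    # ground-truth velocity is (noise - x0).
    noise = torch.randn_like(x0)
    x_t = (1.0 - t_) * x0 + t_ * noise
    v_target = noise - x0

    # (a) Current-moment prediction and the ordinary ODE-matching term.
    v_t = model(x_t, t)
    loss_gt = F.mse_loss(v_t, v_target)

    # (b) Previous-moment reference: integrate the ODE one small step back with
    # a frozen (EMA) copy using a Heun / second-order Runge-Kutta update, then
    # read off that copy's velocity at the previous moment. Gradients are
    # blocked so the two branches are not updated asynchronously.
    with torch.no_grad():
        k1 = ema_model(x_t, t)
        x_euler = x_t - delta * k1
        k2 = ema_model(x_euler, t - delta)
        x_prev = x_t - 0.5 * delta * (k1 + k2)
        v_prev = ema_model(x_prev, t - delta)

    # "Catch-up" term: the current-moment output chases the previous-moment output.
    loss_catch_up = F.mse_loss(v_t, v_prev)
    return loss_gt + loss_catch_up
```

In this sketch the frozen copy supplies the previous-moment target, so the student is pulled toward both the ground-truth velocity and its own slightly earlier prediction along the ODE trajectory; how the two terms are weighted and how the offset is scheduled are design choices the paper explores, not fixed by this example.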


Related research

11/22/2022  Accelerating Diffusion Sampling with Classifier-based Feature Distillation
02/01/2022  Progressive Distillation for Fast Sampling of Diffusion Models
07/13/2022  ProDiff: Progressive Fast Diffusion Model For High-Quality Text-to-Speech
06/08/2023  BOOT: Data-free Distillation of Denoising Diffusion Models with Bootstrapping
03/07/2023  TRACT: Denoising Diffusion Models with Transitive Closure Time-Distillation
08/15/2023  SciRE-Solver: Accelerating Diffusion Models Sampling by Score-integrand Solver with Recursive Difference
