Vector quantization loss analysis in VQGANs: a single-GPU ablation study for image-to-image synthesis

08/09/2023
by Luv Verma, et al.

This study performs an ablation analysis of Vector Quantized Generative Adversarial Networks (VQGANs), concentrating on image-to-image synthesis on a single NVIDIA A100 GPU. It explores the effects of varying critical parameters, including the number of epochs, the image count, and the attributes of the codebook vectors and latent dimensions, under the constraint of limited resources. Our focus is on the vector quantization loss; other hyperparameters and loss components (the GAN loss) are kept fixed. This was done to gain a deeper understanding of the discrete latent space and to explore how varying its size affects reconstruction. Although our results do not surpass existing benchmarks, our findings shed light on VQGAN's behaviour on a smaller dataset, particularly concerning artifacts, codebook size optimization, and a comparative analysis with Principal Component Analysis (PCA). The study also uncovers a promising direction in the introduction of 2D positional encodings, which yields a marked reduction in artifacts and offers insights into balancing clarity and overfitting.
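For readers unfamiliar with the term, the vector quantization loss can be illustrated with a minimal sketch. It assumes the commonly used VQ-VAE-style formulation (a codebook term plus a commitment term weighted by `beta`); the abstract does not confirm that this is the exact form used in the study. The function names, the `beta` default of 0.25, and the toy inputs are all illustrative assumptions.

```python
def quantize(z_e, codebook):
    """Return, for each latent vector, the index of its nearest codebook vector."""
    indices = []
    for z in z_e:
        # Squared Euclidean distance from z to every codebook entry.
        dists = [sum((a - b) ** 2 for a, b in zip(z, e)) for e in codebook]
        indices.append(dists.index(min(dists)))
    return indices


def vq_loss(z_e, codebook, beta=0.25):
    """Assumed VQ-VAE-style loss: codebook term + beta * commitment term.

    In an autograd framework the two terms differ only in where the
    stop-gradient is applied; numerically both equal the same mean
    squared error, so the value is (1 + beta) * mse.
    """
    indices = quantize(z_e, codebook)
    n_values = sum(len(z) for z in z_e)
    sq_err = sum(
        (a - b) ** 2
        for z, i in zip(z_e, indices)
        for a, b in zip(z, codebook[i])
    )
    mse = sq_err / n_values
    return indices, (1.0 + beta) * mse


# Toy example: one 2-D latent vector and a codebook with two entries.
indices, loss = vq_loss([[0.0, 0.0]], [[1.0, 0.0], [3.0, 0.0]])
```

In this sketch, the codebook size and the latent dimension varied in the ablation would correspond to the number of rows in `codebook` and the length of each vector.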

research
06/14/2020

PCAAE: Principal Component Analysis Autoencoder for organising the latent space of generative networks

Autoencoders and generative models produce some of the most spectacular ...
research
10/18/2022

Landmark Enforcement and Style Manipulation for Generative Morphing

Morph images threaten Facial Recognition Systems (FRS) by presenting as ...
research
04/12/2021

Diamond in the rough: Improving image realism by traversing the GAN latent space

In just a few years, the photo-realism of images synthesized by Generati...
research
04/13/2023

Intriguing properties of synthetic images: from generative adversarial networks to diffusion models

Detecting fake images is becoming a major goal of computer vision. This ...
research
01/20/2022

GAN-based Matrix Factorization for Recommender Systems

Proposed in 2014, Generative Adversarial Networks (GAN) initiated a fres...
research
08/11/2023

Optimizing transformer-based machine translation model for single GPU training: a hyperparameter ablation study

In machine translation tasks, the relationship between model complexity ...
