Catch Missing Details: Image Reconstruction with Frequency Augmented Variational Autoencoder

05/04/2023
by Xinmiao Lin, et al.

Popular VQ-VAE models reconstruct images by learning a discrete codebook, but their reconstruction quality degrades rapidly as the compression rate rises. One major reason is that a higher compression rate loses more of the visual signal in the high-frequency spectrum, which carries the fine details in pixel space. This paper proposes a Frequency Complement Module (FCM) architecture that captures the missing frequency information to enhance reconstruction quality. The FCM can be easily incorporated into the VQ-VAE structure, and we refer to the new model as Frequency Augmented VAE (FA-VAE). In addition, a Dynamic Spectrum Loss (DSL) is introduced to guide the FCMs to balance dynamically between frequencies for optimal reconstruction. FA-VAE is further extended to the text-to-image synthesis task, where a Cross-attention Autoregressive Transformer (CAT) is proposed to capture more precise semantic attributes from text. Extensive reconstruction experiments at different compression rates on several benchmark datasets demonstrate that FA-VAE restores details more faithfully than state-of-the-art (SOTA) methods. CAT also shows improved generation quality with better image-text semantic alignment.
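
The abstract's motivating claim, that stronger compression discards more high-frequency signal, can be illustrated with a small NumPy sketch. This is not the paper's FCM or DSL: average pooling followed by nearest-neighbour upsampling is only an assumed stand-in for a lossy bottleneck, the function names are hypothetical, and the 0.25 cycles/pixel cutoff is an arbitrary choice.

```python
import numpy as np

def high_freq_energy_fraction(img, cutoff=0.25):
    """Fraction of spectral energy at radial frequency above `cutoff` (cycles/pixel)."""
    power = np.abs(np.fft.fftshift(np.fft.fft2(img))) ** 2
    h, w = img.shape
    # Frequency coordinates in [-0.5, 0.5), shifted to match the centred spectrum.
    fy = np.fft.fftshift(np.fft.fftfreq(h))
    fx = np.fft.fftshift(np.fft.fftfreq(w))
    radius = np.sqrt(fy[:, None] ** 2 + fx[None, :] ** 2)
    return power[radius > cutoff].sum() / power.sum()

def compress_by_pooling(img, factor):
    """Assumed stand-in for lossy compression: average-pool, then repeat pixels."""
    h, w = img.shape
    pooled = img.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))
    return np.repeat(np.repeat(pooled, factor, axis=0), factor, axis=1)

# Synthetic test image: white noise has energy spread across all frequencies.
rng = np.random.default_rng(0)
img = rng.standard_normal((128, 128))

# Higher factor = stronger (toy) compression.
fractions = {
    factor: high_freq_energy_fraction(compress_by_pooling(img, factor))
    for factor in (1, 4, 16)
}
for factor, frac in fractions.items():
    print(f"downsampling factor {factor:2d}: high-frequency energy fraction {frac:.3f}")
```

In this toy setting the high-frequency share of the spectrum shrinks as the pooling factor grows, which is the kind of loss the FCM is described as compensating for.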

research
07/18/2021

ANFIC: Image Compression Using Augmented Normalizing Flows

This paper introduces an end-to-end learned image compression system, te...
research
08/09/2022

Disentangled Representation Learning Using (β-)VAE and GAN

Given a dataset of images containing different objects with different fe...
research
05/25/2021

Self-Organized Variational Autoencoders (Self-VAE) for Learned Image Compression

In end-to-end optimized learned image compression, it is standard practi...
research
10/25/2018

Towards improved lossy image compression: Human image reconstruction with public-domain images

Lossy image compression has been studied extensively in the context of t...
research
08/04/2023

Frequency Disentangled Features in Neural Image Compression

The design of a neural image compression network is governed by how well...
research
11/19/2020

Dual Contradistinctive Generative Autoencoder

We present a new generative autoencoder model with dual contradistinctiv...
research
04/26/2023

Multi-Modality Deep Network for Extreme Learned Image Compression

Image-based single-modality compression learning approaches have demonst...
