Image Reconstruction using Enhanced Vision Transformer

07/11/2023
by Nikhil Verma, et al.

Removing noise from images is a challenging and fundamental problem in computer vision. Images captured by modern cameras are inevitably degraded by noise, which limits the accuracy of any quantitative measurements made on those images. In this project, we propose a novel image reconstruction framework that can be used for tasks such as image denoising, deblurring, or inpainting. The proposed model is based on the Vision Transformer (ViT), which takes 2D images as input and outputs embeddings that can be used to reconstruct denoised images. We incorporate four additional optimization techniques into the framework to improve the model's reconstruction capability: Locality Sensitive Attention (LSA), Shifted Patch Tokenization (SPT), Rotary Position Embeddings (RoPE), and an adversarial loss function inspired by Generative Adversarial Networks (GANs). LSA, SPT, and RoPE enable the transformer to learn from the dataset more efficiently, while the adversarial loss function enhances the resolution of the reconstructed images. Based on our experiments, the proposed architecture outperforms the benchmark U-Net model by more than 3.5% in structural similarity (SSIM) on the reconstruction tasks of image denoising and inpainting. With the proposed enhancements, the model further shows an improvement of ~5% SSIM over the benchmark for both tasks.
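Because the results above are reported in SSIM, a rough sketch of how that metric compares a clean image with a degraded one may help. This is not the authors' evaluation code. The function name `global_ssim`, the random test image, and the noise level are assumptions made for illustration, and the function computes SSIM over the whole image as a single window, whereas standard SSIM averages the same expression over local windows.

```python
import numpy as np


def global_ssim(x, y, data_range=1.0):
    """Single-window SSIM over two whole images (simplified illustration).

    Standard SSIM averages this expression over local windows; here the
    statistics are taken over the entire image to keep the sketch short.
    """
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)

    # Stabilizing constants from the usual SSIM definition (K1=0.01, K2=0.03).
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2

    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()

    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


# Hypothetical data: a random "clean" image and a noisy copy of it.
rng = np.random.default_rng(0)
clean = rng.random((32, 32))
noisy = np.clip(clean + rng.normal(0.0, 0.1, clean.shape), 0.0, 1.0)

score_same = global_ssim(clean, clean)    # identical images score 1 (up to rounding)
score_noisy = global_ssim(clean, noisy)   # degraded image scores below 1
```

A reconstruction model is judged by how far it moves the score of its output back toward 1 relative to the clean reference, which is the sense in which the percentage SSIM improvements above are stated.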

