ParaColorizer: Realistic Image Colorization using Parallel Generative Networks

08/17/2022
by Himanshu Kumar et al.

Grayscale image colorization is a fascinating application of AI for information restoration. The problem is inherently ill-posed, which makes it even more challenging since the outputs can be multi-modal. Current learning-based methods produce acceptable results for straightforward cases but usually fail to restore contextual information in the absence of clear figure-ground separation. The images also suffer from color bleeding and desaturated backgrounds, since a single model trained on full-image features is insufficient for learning the diverse modes of the data. To address these issues, we present a parallel GAN-based colorization framework in which each separately tailored GAN pipeline colorizes either the foreground (using object-level features) or the background (using full-image features). The foreground pipeline employs a Residual-UNet with self-attention as its generator, trained on full-image features and the corresponding object-level features from the COCO dataset. The background pipeline relies on full-image features and additional training examples from the Places dataset. We design a DenseFuse-based fusion network that produces the final colorized image by feature-based fusion of the outputs generated in parallel. We show the shortcomings of the non-perceptual evaluation metrics commonly used to assess multi-modal problems like image colorization and perform an extensive evaluation of our framework using multiple perceptual metrics. Our approach outperforms most existing learning-based methods and produces results comparable to the state of the art. A runtime analysis further shows an average inference time of 24 ms per image.
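The abstract names DenseFuse as the basis of the fusion network that merges the foreground and background pipeline outputs. DenseFuse's published l1-norm strategy weights each pixel of the two encoder feature maps by its channel-wise L1 activity; whether ParaColorizer adopts this exact rule is an assumption, and the function name, feature shapes, and toy inputs below are hypothetical. A minimal sketch:

```python
import numpy as np

def l1_fusion(feat_fg, feat_bg, eps=1e-8):
    """Fuse two (C, H, W) feature maps with an l1-norm activity weighting,
    in the spirit of DenseFuse's fusion strategy (hypothetical shapes;
    the paper's exact layer dimensions are not given in the abstract)."""
    # Per-pixel activity: L1 norm across the channel axis -> (H, W)
    a_fg = np.abs(feat_fg).sum(axis=0)
    a_bg = np.abs(feat_bg).sum(axis=0)
    # Soft weights in [0, 1]; wherever one pipeline's features are more
    # active, its output dominates the fused map at that pixel
    w_fg = a_fg / (a_fg + a_bg + eps)
    w_bg = 1.0 - w_fg
    return w_fg[None] * feat_fg + w_bg[None] * feat_bg

# Toy example: strong "foreground" activity in the top half only
fg = np.zeros((4, 8, 8))
fg[:, :4, :] = 2.0
bg = np.full((4, 8, 8), 0.5)
fused = l1_fusion(fg, bg)
```

In the toy example the fused map follows the foreground features in the top half (weight 0.8) and falls back to the background value 0.5 in the bottom half, illustrating how object-level and full-image features can be combined per pixel rather than by a hard mask.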


