UniColor: A Unified Framework for Multi-Modal Colorization with Transformer

09/22/2022
by Zhitong Huang, et al.

We propose UniColor, the first unified framework to support colorization in multiple modalities, both unconditional and conditional, with conditions such as stroke, exemplar, text, and even a mix of them. Rather than learning a separate model for each type of condition, we introduce a two-stage colorization framework that incorporates the various conditions into a single model. In the first stage, multi-modal conditions are converted into a common representation of hint points. In particular, we propose a novel CLIP-based method to convert text to hint points. In the second stage, we propose a Transformer-based network composed of Chroma-VQGAN and Hybrid-Transformer to generate diverse and high-quality colorization results conditioned on hint points. Both qualitative and quantitative comparisons demonstrate that our method outperforms state-of-the-art methods in every control modality and further enables multi-modal colorization that was not feasible before. Moreover, we design an interactive interface that shows the effectiveness of our unified framework in practical usage, including automatic colorization, hybrid-control colorization, local recolorization, and iterative color editing. Our code and models are available at https://luckyhzt.github.io/unicolor.
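To make the idea of a common hint-point representation more concrete, the following is a minimal sketch of how one modality (user strokes) might be reduced to sparse hint points on a grid. It is not taken from the UniColor code: the `HintPoint` type, the `strokes_to_hint_points` function, the 16-pixel cell size, and the choice of averaging stroke colors per cell are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Color = Tuple[float, float, float]


@dataclass(frozen=True)
class HintPoint:
    """Hypothetical hint point: one color attached to one grid cell."""
    row: int      # grid-cell row index
    col: int      # grid-cell column index
    color: Color  # mean color of the user input inside the cell


def strokes_to_hint_points(
    stroke_pixels: Dict[Tuple[int, int], Color],
    cell_size: int = 16,  # assumed cell size, not a value from the paper
) -> List[HintPoint]:
    """Collapse stroke pixels {(y, x): (r, g, b)} into per-cell hint points."""
    sums: Dict[Tuple[int, int], List[float]] = {}
    for (y, x), (r, g, b) in stroke_pixels.items():
        key = (y // cell_size, x // cell_size)
        acc = sums.setdefault(key, [0.0, 0.0, 0.0, 0.0])
        acc[0] += r
        acc[1] += g
        acc[2] += b
        acc[3] += 1.0  # number of stroke pixels seen in this cell

    # Average the accumulated color in each touched cell; untouched cells
    # produce no hint point, so the representation stays sparse.
    return [
        HintPoint(row, col, (s[0] / s[3], s[1] / s[3], s[2] / s[3]))
        for (row, col), s in sorted(sums.items())
    ]
```

Under the same assumptions, other modalities (exemplar or text) would only need their own converter that emits the same `HintPoint` list, so that a single downstream model can consume all of them.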
