On Efficient Transformer and Image Pre-training for Low-level Vision

12/19/2021
by Wenbo Li, et al.

Pre-training has underpinned numerous state-of-the-art results in high-level computer vision, but few attempts have been made to investigate how pre-training acts in image processing systems. In this paper, we present an in-depth study of image pre-training. To conduct this study on solid ground with practical value in mind, we first propose a generic, cost-effective Transformer-based framework for image processing. It yields highly competitive performance across a range of low-level tasks despite constrained parameter counts and computational complexity. Then, based on this framework, we design a set of principled evaluation tools to comprehensively diagnose image pre-training across tasks and uncover its effects on internal network representations. We find that pre-training plays strikingly different roles in different low-level tasks. For example, pre-training introduces more local information to the higher layers in super-resolution (SR), yielding significant performance gains, whereas it barely affects the internal feature representations in denoising, yielding only marginal gains. Further, we explore different methods of pre-training, revealing that multi-task pre-training is more effective and data-efficient. All code and models will be released at https://github.com/fenglinglwb/EDT.
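To make the multi-task pre-training idea concrete, below is a minimal PyTorch sketch of the general pattern the abstract describes: a shared Transformer body with lightweight task-specific heads, trained by sampling one low-level task per step. All names (SharedBody, MultiTaskModel) and hyperparameters here are illustrative assumptions, not the authors' EDT implementation; see the repository linked above for the actual code.

```python
import torch
import torch.nn as nn

# Illustrative sketch only -- NOT the authors' EDT code. It shows the
# shared-body / per-task-head pattern behind multi-task pre-training.

class SharedBody(nn.Module):
    """Shared Transformer trunk over flattened image tokens."""
    def __init__(self, dim=64, depth=4, heads=4):
        super().__init__()
        self.embed = nn.Conv2d(3, dim, 3, padding=1)
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)
        self.dim = dim

    def forward(self, x):
        f = self.embed(x)                         # (B, C, H, W)
        b, c, h, w = f.shape
        tokens = f.flatten(2).transpose(1, 2)     # (B, H*W, C)
        tokens = self.encoder(tokens)
        return tokens.transpose(1, 2).reshape(b, c, h, w)

class MultiTaskModel(nn.Module):
    """One shared body, one small convolutional head per task.
    (A real SR head would also upsample; omitted for brevity.)"""
    def __init__(self, tasks=("sr", "denoise", "derain")):
        super().__init__()
        self.body = SharedBody()
        self.heads = nn.ModuleDict(
            {t: nn.Conv2d(self.body.dim, 3, 3, padding=1) for t in tasks})

    def forward(self, x, task):
        return self.heads[task](self.body(x))

# Toy pre-training loop: sample a task each step, minimize L1 loss.
model = MultiTaskModel()
opt = torch.optim.Adam(model.parameters(), lr=2e-4)
for step in range(3):
    task = ("sr", "denoise", "derain")[step % 3]
    lq = torch.rand(2, 3, 32, 32)                 # degraded input (dummy)
    gt = torch.rand(2, 3, 32, 32)                 # clean target (dummy)
    loss = nn.functional.l1_loss(model(lq, task), gt)
    opt.zero_grad()
    loss.backward()
    opt.step()
```

The shared trunk is what makes this data-efficient: every task's batches update the same representation, so each task benefits from the others' training signal.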
