Guided Patch-Grouping Wavelet Transformer with Spatial Congruence for Ultra-High Resolution Segmentation

07/03/2023
by Deyi Ji, et al.

Most existing ultra-high resolution (UHR) segmentation methods struggle to balance memory cost against local characterization accuracy. Our proposed Guided Patch-Grouping Wavelet Transformer (GPWFormer) takes both into account and achieves impressive performance. GPWFormer is a Transformer (𝒯)-CNN (𝒞) mutual learning framework, where 𝒯 takes the whole UHR image as input and harvests both local details and fine-grained long-range contextual dependencies, while 𝒞 takes the downsampled image as input to learn the category-wise deep context. For high inference speed and low computational complexity, 𝒯 partitions the original UHR image into patches and groups them dynamically, then learns the low-level local details with the lightweight multi-head Wavelet Transformer (WFormer) network. Fine-grained long-range contextual dependencies are captured in the same process, since patches that are far apart in the spatial domain can be assigned to the same group. In addition, masks produced by 𝒞 guide the patch grouping process, providing a heuristic decision. Moreover, congruence constraints between the two branches are exploited to maintain spatial consistency among the patches. Overall, we stack the multi-stage process in a pyramid fashion. Experiments show that GPWFormer outperforms existing methods with significant improvements on five benchmark datasets.
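The patch partitioning, mask-guided grouping, and wavelet step described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names are hypothetical, the mask is assumed to be already upsampled to the image resolution, the majority-label grouping rule is an assumption (the abstract does not state the actual rule), and a single-level Haar low-pass band stands in for whatever wavelet WFormer uses.

```python
import numpy as np


def partition_patches(image, patch):
    """Split an (H, W, C) array into non-overlapping (patch, patch, C) tiles.

    Tiles are returned in row-major order of their grid position.
    """
    h, w, c = image.shape
    assert h % patch == 0 and w % patch == 0, "image must tile evenly"
    gh, gw = h // patch, w // patch
    # (gh, patch, gw, patch, c) -> (gh, gw, patch, patch, c)
    tiles = image.reshape(gh, patch, gw, patch, c).swapaxes(1, 2)
    return tiles.reshape(gh * gw, patch, patch, c)


def group_by_mask(mask, patch, num_classes):
    """Assign each patch to a group: here, its most frequent mask label.

    Assumption: `mask` is an (H, W) integer label map at image resolution.
    """
    tiles = partition_patches(mask[..., None], patch)[..., 0]
    flat = tiles.reshape(tiles.shape[0], -1)
    counts = np.stack(
        [(flat == k).sum(axis=1) for k in range(num_classes)], axis=1
    )
    return counts.argmax(axis=1)


def haar_lowpass(tiles):
    """One-level 2D Haar approximation band; halves each spatial side."""
    a = tiles[:, 0::2, 0::2]
    b = tiles[:, 0::2, 1::2]
    c = tiles[:, 1::2, 0::2]
    d = tiles[:, 1::2, 1::2]
    return (a + b + c + d) / 2.0


# Toy example: an 8x8 single-channel image and a two-class mask whose
# left half is class 0 and right half is class 1.
image = np.ones((8, 8, 1), dtype=np.float64)
mask = np.zeros((8, 8), dtype=np.int64)
mask[:, 4:] = 1

tiles = partition_patches(image, patch=4)            # 4 tiles of 4x4x1
groups = group_by_mask(mask, patch=4, num_classes=2)
reduced = haar_lowpass(tiles)                        # 4 tiles of 2x2x1
```

In this toy case the top-left and bottom-left tiles land in one group and the two right-hand tiles in the other, so tiles in the same group need not be neighbours in the grid. The low-pass step shrinks each tile to a quarter of its area, which is where the savings in attention cost would come from under these assumptions.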


