
TransMatting: Tri-token Equipped Transformer Model for Image Matting

by Huanqia Cai, et al.

Image matting aims to predict the alpha values of intricate uncertain regions in natural images, such as hair, smoke, and spider webs. However, existing methods perform poorly on highly transparent foreground objects, because the uncertain area to predict is large and the receptive field of convolutional networks is small. To address this issue, we propose a Transformer-based network (TransMatting) that models transparent objects with long-range features, and we collect a high-resolution matting dataset of transparent objects (Transparent-460) for performance evaluation. Specifically, to use the semantic information in the trimap flexibly and effectively, we also redesign the trimap as three learnable tokens, named the tri-token. Both Transformer and convolutional matting models can benefit from the proposed tri-token design. By replacing the traditional trimap concatenation strategy with our tri-token, existing matting methods can achieve an improvement of about 10%. Equipped with the new tri-token design, our proposed TransMatting outperforms current state-of-the-art methods on several popular matting benchmarks and on our newly collected Transparent-460.
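To make the contrast between trimap concatenation and the tri-token concrete, here is a minimal sketch in plain Python. The abstract only states that the trimap is redesigned as three learnable tokens, so everything else is an assumption made for illustration: the function names (`init_tri_token`, `add_tri_token`, `concat_trimap`), the class ids for the three regions, and in particular the choice to add each region's token to the per-pixel feature. In a real model the tokens would be trainable parameters and might be injected differently.

```python
import random

# Hypothetical class ids for the three trimap regions (an assumption).
BACKGROUND, UNKNOWN, FOREGROUND = 0, 1, 2


def init_tri_token(dim, seed=0):
    """Create three token vectors, one per trimap region.

    In a real model these would be trainable parameters; here they are
    plain lists of randomly initialised floats.
    """
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 0.02) for _ in range(dim)] for _ in range(3)]


def add_tri_token(features, trimap, tri_token):
    """Add the token of each pixel's trimap region to that pixel's feature.

    features: H x W x C nested lists of floats
    trimap:   H x W nested lists of class ids in {0, 1, 2}
    The channel count C is unchanged (assumed injection by addition).
    """
    return [
        [
            [f + t for f, t in zip(feat, tri_token[region])]
            for feat, region in zip(feat_row, tri_row)
        ]
        for feat_row, tri_row in zip(features, trimap)
    ]


def concat_trimap(features, trimap):
    """Baseline for comparison: append a one-hot trimap to each feature.

    The channel count grows from C to C + 3.
    """
    return [
        [
            feat + [1.0 if region == k else 0.0 for k in range(3)]
            for feat, region in zip(feat_row, tri_row)
        ]
        for feat_row, tri_row in zip(features, trimap)
    ]


# Tiny 2 x 2 example with 4 feature channels, all zeros.
features = [[[0.0] * 4 for _ in range(2)] for _ in range(2)]
trimap = [[BACKGROUND, UNKNOWN], [FOREGROUND, UNKNOWN]]
tokens = init_tri_token(dim=4)

with_tokens = add_tri_token(features, trimap, tokens)
with_concat = concat_trimap(features, trimap)
```

In this sketch, pixels that share a trimap region receive the same token, so the region information lives in a learned vector rather than in fixed extra input channels.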



