Probabilistic Attention for Interactive Segmentation

06/23/2021
by Prasad Gabbur, et al.

We provide a probabilistic interpretation of attention and show that the standard dot-product attention in transformers is a special case of Maximum A Posteriori (MAP) inference. This interpretation suggests using Expectation Maximization algorithms to adapt key and value model parameters online. Such adaptation is useful when external agents, e.g., annotators, provide inference-time information about the correct values of some tokens, e.g., the semantic category of some pixels, and this new information needs to propagate to other tokens in a principled manner. We illustrate the approach on an interactive semantic segmentation task in which annotators and models collaborate online to improve annotation efficiency. On standard benchmarks, we observe that key adaptation boosts model performance (∼10% mIoU) in the low-feedback regime and that value propagation improves model responsiveness in the high-feedback regime. A PyTorch layer implementation of our probabilistic attention model will be made publicly available.
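The link between dot-product attention and probabilistic inference can be illustrated with a small sketch. It is not the paper's implementation, and it rests on assumptions that the abstract does not state: each key is the mean of one component of a Gaussian mixture with uniform priors and a shared isotropic variance `sigma2`, and all keys have the same norm. Under those assumptions, the posterior probability of each component given a query equals the softmax of scaled dot products. The function names and toy data below are hypothetical.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def dot_product_weights(query, keys, sigma2):
    # Standard attention weights: softmax(q . k_j / sigma2).
    return softmax([dot(query, k) / sigma2 for k in keys])

def gaussian_posterior_weights(query, keys, sigma2):
    # Posterior p(component j | query) for a Gaussian mixture with
    # means = keys, uniform priors, shared isotropic variance sigma2
    # (assumed setup, chosen for illustration).
    log_lik = [
        -sum((q - x) ** 2 for q, x in zip(query, k)) / (2.0 * sigma2)
        for k in keys
    ]
    return softmax(log_lik)

def attend(weights, values):
    # Weighted average of the value vectors.
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]

# Toy data: three unit-norm keys, one-hot values (e.g. class labels).
keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
values = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
query = [0.5, 0.2]
sigma2 = math.sqrt(len(query))  # matches the usual 1/sqrt(d) scaling

w_dot = dot_product_weights(query, keys, sigma2)
w_post = gaussian_posterior_weights(query, keys, sigma2)
output = attend(w_post, values)
```

Because the keys share a norm, the squared distance differs from the dot product only by terms that are constant across keys, and these cancel in the softmax, so `w_dot` and `w_post` agree up to floating-point error. When key norms differ, the two weightings diverge, which is one reason to treat this as a special case rather than a general identity.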


