Learning Discriminative Visual-Text Representation for Polyp Re-Identification

07/20/2023
by   Suncheng Xiang, et al.
0

Colonoscopic Polyp Re-Identification aims to match a specific polyp in a large gallery with different cameras and views, which plays a key role for the prevention and treatment of colorectal cancer in the computer-aided diagnosis. However, traditional methods mainly focus on the visual representation learning, while neglect to explore the potential of semantic features during training, which may easily leads to poor generalization capability when adapted the pretrained model into the new scenarios. To relieve this dilemma, we propose a simple but effective training method named VT-ReID, which can remarkably enrich the representation of polyp videos with the interchange of high-level semantic information. Moreover, we elaborately design a novel clustering mechanism to introduce prior knowledge from textual data, which leverages contrastive learning to promote better separation from abundant unlabeled text data. To the best of our knowledge, this is the first attempt to employ the visual-text feature with clustering mechanism for the colonoscopic polyp re-identification. Empirical results show that our method significantly outperforms current state-of-the art methods with a clear margin.

READ FULL TEXT

page 3

page 13

research
08/02/2023

Towards Discriminative Representation with Meta-learning for Colonoscopic Polyp Re-Identification

Colonoscopic Polyp Re-Identification aims to match the same polyp from a...
research
03/28/2023

Colo-SCRL: Self-Supervised Contrastive Representation Learning for Colonoscopic Video Retrieval

Colonoscopic video retrieval, which is a critical part of polyp treatmen...
research
04/19/2023

Learning Robust Visual-Semantic Embedding for Generalizable Person Re-identification

Generalizable person re-identification (Re-ID) is a very hot research to...
research
11/24/2021

Decoupling Visual-Semantic Feature Learning for Robust Scene Text Recognition

Semantic information has been proved effective in scene text recognition...
research
06/05/2022

Towards Fast Adaptation of Pretrained Contrastive Models for Multi-channel Video-Language Retrieval

Multi-channel video-language retrieval require models to understand info...
research
03/24/2021

Supporting Clustering with Contrastive Learning

Unsupervised clustering aims at discovering the semantic categories of d...

Please sign up or login with your details

Forgot password? Click here to reset