Increasing Textual Context Size Boosts Medical Image-Text Matching

03/23/2023
by Idan Glassberg, et al.

This short technical report demonstrates a simple technique that yields state-of-the-art results in medical image-text matching tasks. We analyze the use of OpenAI's CLIP, a general image-text matching model, and observe that CLIP's limited textual input size has a negative impact on downstream performance in the medical domain, where encoding longer textual contexts is often required. We therefore train and release ClipMD, which uses a simple sliding window technique to encode textual captions. ClipMD was tested on two medical image-text datasets and compared with other image-text matching models. The results show that ClipMD outperforms the other models on both datasets by a large margin. We make our code and pretrained model publicly available.
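The sliding window idea mentioned above can be illustrated with a minimal sketch. This is not the released ClipMD code: the abstract does not say how windows are built or combined, so the overlap (`stride`), the mean pooling of window embeddings, and the names `sliding_windows`, `encode_long_caption`, and `encode_window` are all assumptions made for illustration. The only external fact used is that CLIP's text encoder accepts at most 77 tokens per input.

```python
from typing import Callable, List, Sequence


def sliding_windows(
    token_ids: Sequence[int], window_size: int = 77, stride: int = 38
) -> List[List[int]]:
    """Split a token sequence into overlapping windows of at most window_size.

    window_size=77 mirrors CLIP's text context length. A real tokenizer also
    adds start/end tokens to each input, which this sketch ignores.
    The stride value is an assumption, not taken from the report.
    """
    if window_size <= 0 or not 0 < stride <= window_size:
        raise ValueError("need window_size > 0 and 0 < stride <= window_size")
    if len(token_ids) <= window_size:
        return [list(token_ids)]
    windows = []
    start = 0
    while True:
        windows.append(list(token_ids[start:start + window_size]))
        if start + window_size >= len(token_ids):
            break  # the last window reached the end of the caption
        start += stride
    return windows


def encode_long_caption(
    token_ids: Sequence[int],
    encode_window: Callable[[List[int]], List[float]],
    window_size: int = 77,
    stride: int = 38,
) -> List[float]:
    """Encode each window separately and average the embeddings.

    encode_window is a hypothetical stand-in for a text encoder; mean pooling
    is an assumed way of combining windows, not the authors' stated method.
    """
    windows = sliding_windows(token_ids, window_size, stride)
    embeddings = [encode_window(w) for w in windows]
    dim = len(embeddings[0])
    return [sum(e[i] for e in embeddings) / len(embeddings) for i in range(dim)]
```

With a real model, `encode_window` would wrap the text encoder's forward pass. A caption that fits in one window yields a single embedding, so short captions are handled the same way as with the unmodified encoder.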

research
04/26/2021

Contextualized Keyword Representations for Multi-modal Retinal Image Captioning

Medical image captioning automatically generates a medical description t...
research
10/02/2020

Contrastive Learning of Medical Visual Representations from Paired Images and Text

Learning visual representations of medical images is core to medical ima...
research
06/07/2023

Generative Text-Guided 3D Vision-Language Pretraining for Unified Medical Image Segmentation

Vision-Language Pretraining (VLP) has demonstrated remarkable capabiliti...
research
07/12/2023

Unified Medical Image-Text-Label Contrastive Learning With Continuous Prompt

Contrastive language-image Pre-training (CLIP) [13] can leverage large d...
research
05/05/2022

AdaTriplet: Adaptive Gradient Triplet Loss with Automatic Margin Learning for Forensic Medical Image Matching

This paper tackles the challenge of forensic medical image matching (FMI...
research
04/18/2021

Go Forth and Prosper: Language Modeling with Ancient Textual History

We introduce a technique for improving document-level language models (L...
research
03/18/2010

Sliding window approach based Text Binarisation from Complex Textual images

Text binarisation process classifies individual pixels as text or backgr...
