
Word to Sentence Visual Semantic Similarity for Caption Generation: Lessons Learned

This paper focuses on enhancing the captions generated by image captioning systems. We propose an approach that improves caption generation by choosing the output most closely related to the image rather than the most likely output produced by the model. Our model revises the beam-search output of the language generator from a visual-context perspective. We employ a visual semantic measure at both word and sentence level to match the proper caption to the related information in the image. The proposed approach can be applied to any captioning system as a post-processing method.



1 Introduction

Automatic caption generation is a fundamental task that incorporates vision and language. The task can be tackled in two stages: first, image-visual information extraction, and then linguistic description generation. Most models couple the relations between visual and linguistic information via a Convolutional Neural Network (CNN) to encode the input image and a Long Short-Term Memory (LSTM) network for language generation Oriol:15; anderson2018bottom. Recently, self-attention has been used to learn these relations via Transformers huang2019attention; Marcella:20 or Transformer-based models such as Vision-and-Language BERT lu202012. These systems show promising results on benchmark datasets such as Flickr young2014image and COCO Tsung-Yi:14. However, the lexical diversity of the generated caption remains a relatively unexplored research problem. Lexical diversity refers to how accurate the generated description is for a given image: an accurate caption should provide details regarding specific and relevant aspects of the image luo2018discriminability. Caption lexical diversity can be divided into three levels: word level (different words), syntactic level (word order), and semantic level (relevant concepts) wang2019describing. In this work, we approach word-level diversity by learning the semantic correlation between the caption and its visual context, as shown in Figure 1, where the visual information from the image is used to learn the semantic relation to the caption at both word and sentence level.

Modern image captioning systems focus heavily on visual grounding to capture real-world scenarios. Early work fang2015captions built a visual detector to guide and re-rank image captions with a global similarity. wang2018object investigates the informativeness of object information (e.g. object frequency) in end-to-end caption generation. cornia2019show proposes controlling caption generation through grounding on visual regions of the image. Inspired by these works, we propose an object-based re-ranker that selects the most closely related caption using both static and contextualized semantic similarity.

Figure 1: An overview of our visual semantic re-ranker. We employ the visual context of the image at word and sentence level to re-rank the caption most closely related to that visual context. An example from the caption Transformer Marcella:20 shows how the visual re-ranker (Visual Beam) uses the semantic relation to re-rank the most descriptive caption.
Model B-1 B-2 B-3 B-4 M R C BERTscore
Show and Tell Oriol:15
Tell 0.331 0.159 0.071 0.035 0.093 0.270 0.035 0.8871
Tell+VR_V1 0.330 0.158 0.069 0.035 0.095 0.273 0.036 0.8855
Tell+VR_V2 0.320 0.154 0.073 0.037 0.099 0.277 0.041 0.8850
Tell+VR_V1 (sts) 0.313 0.153 0.072 0.037 0.101 0.273 0.036 0.8839
Tell+VR_V2 (sts) 0.330 0.158 0.069 0.035 0.095 0.273 0.036 0.8869
VilBERT lu202012
Vil 0.739 0.577 0.440 0.336 0.271 0.543 1.027 0.9363
Vil+VR_V1 0.739 0.576 0.438 0.334 0.273 0.544 1.034 0.9365
Vil+VR_V2 0.740 0.578 0.439 0.334 0.273 0.545 1.034 0.9365
Vil+VR_V1 (sts) 0.738 0.576 0.440 0.335 0.273 0.544 1.036 0.9365
Vil+VR_V2 (sts) 0.740 0.579 0.442 0.338 0.272 0.545 1.040 0.9366
Transformer based caption generator Marcella:20
Trans 0.780 0.631 0.491 0.374 0.278 0.569 1.153 0.9399
Trans+VR_V1 0.780 0.629 0.487 0.371 0.278 0.567 1.149 0.9398
Trans+VR_V2 0.780 0.630 0.488 0.371 0.278 0.568 1.150 0.9399
Trans+VR_V1 (sts) 0.779 0.629 0.487 0.370 0.277 0.567 1.145 0.9395
Trans+VR_V2 (sts) 0.779 0.629 0.487 0.370 0.277 0.567 1.145 0.9395
Table 1: Performance of the compared baselines on the Karpathy test split (for the Transformer baselines) and Flickr (for the Show-and-Tell CNN-LSTM baseline) with and without visual semantic re-ranking. B-n: BLEU-n, M: METEOR, R: ROUGE, C: CIDEr. At inference, we use only the top-2 object visual contexts (Visual 1 or Visual 2), one at a time.

Our main contributions in this paper are: (1) we propose a post-processing method applicable to any caption generation system via visual semantic relatedness measures; (2) as an addendum to the main analysis of this work, we show that the visual re-ranker does not help when the beam search output is insufficiently diverse.

2 Beam search caption extraction

We employ the three most common architectures for caption generation to extract the top beam-search candidates. The first baseline is the standard shallow CNN-LSTM model Oriol:15. The second, VilBERT lu202012, is fine-tuned on 12 different vision-and-language datasets covering tasks such as caption-based image retrieval. Finally, the third baseline is a specialized Transformer-based caption generator Marcella:20.


3 Visual Beam Re-ranking for Image Captioning

3.1 Problem Formulation

Beam search is the dominant method for approximate decoding in structured prediction tasks such as machine translation, speech recognition, and image captioning. A larger beam size allows the model to perform a better exploration of the search space than greedy decoding. Our goal is to leverage the visual context information of the image to re-rank the candidate sequences obtained through beam search, moving the most visually relevant candidate up the list while moving incorrect candidates down.
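The idea can be sketched with the following toy re-ranker, which fuses each beam candidate's decoder log-probability with a visual-similarity score. The names `rerank_beam` and `visual_sims` are illustrative, and the log-linear fusion here is a simplification: the paper's actual combination is the Product of Experts described in Section 3.2.

```python
import math

def rerank_beam(candidates, visual_sims):
    # candidates: list of (caption, decoder log-probability) pairs
    # visual_sims: caption -> visual-similarity score in (0, 1]
    rescored = [
        (caption, log_prob + math.log(max(visual_sims[caption], 1e-12)))
        for caption, log_prob in candidates
    ]
    # Highest fused score first: the most visually relevant candidate rises.
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)

# The decoder prefers "dog", but the detected object supports "cat".
beam = [("a dog on a couch", -1.2), ("a cat on a couch", -1.5)]
sims = {"a dog on a couch": 0.1, "a cat on a couch": 0.9}
best_caption = rerank_beam(beam, sims)[0][0]
```

Here the visually grounded candidate overtakes the decoder's original top hypothesis, which is exactly the re-ranking behaviour the method aims for.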

Figure 2: Visualization of the top-15 beam search candidates after visual re-ranking. The colors represent the degree of change in probability after visual re-ranking. We can also observe that a less diverse beam negatively impacts the score, as in the case of the Transformer and Show-and-Tell baselines.

3.2 Beam Search Visual Re-ranking

We introduce a word and sentence level semantic relation with the visual context in the image. Inspired by peinelt2020tbert, who propose a joint BERT Jacob:19 with topic modelling for semantic similarity, we propose a joint BERT with GloVe for visual semantic similarity.

Word level similarity. To learn the semantic relation between a caption and its visual context at word level, we first employ a bidirectional-LSTM-based CopyRNN keyphrase extractor meng2017deep to extract keyphrases from the sentence as context. The model is trained on two combined pre-processed datasets: (1) a wikidump corpus (i.e. keyword, short sentence) and (2) SemEval 2017 Task 10 (keyphrases from scientific publications) augenstein2017semeval. Secondly, GloVe is used to compute the cosine similarity between the visual context and the extracted caption context. For example, from a woman in a red dress and a black skirt walks down a sidewalk, the model extracts dress and walks, the highlight keywords of the caption.
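A minimal sketch of this word-level score, assuming keyphrases have already been extracted: the toy `EMB` table stands in for the pre-trained GloVe vectors, and `word_level_similarity` is an illustrative name, not the paper's implementation.

```python
import numpy as np

# Toy 3-d embedding table standing in for pre-trained GloVe vectors.
EMB = {
    "dress": np.array([0.9, 0.1, 0.0]),
    "walks": np.array([0.2, 0.8, 0.1]),
    "sidewalk": np.array([0.3, 0.7, 0.2]),
}

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def word_level_similarity(visual_object, keyphrases):
    # Average cosine similarity between the detected object's vector
    # and the vectors of the keyphrases extracted from the caption.
    sims = [cosine(EMB[visual_object], EMB[k]) for k in keyphrases if k in EMB]
    return sum(sims) / len(sims) if sims else 0.0

score = word_level_similarity("sidewalk", ["dress", "walks"])
```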

Sentence level similarity. We fine-tune the BERT base model to learn the visual context information. The model learns a dictionary-like word-to-sentence relation: the visual information (i.e. the object label) serves as context for the sentence (i.e. the caption), scored via cosine distance.

  • BERT. BERT achieves remarkable results on many sentence-level tasks, and especially on the textual semantic similarity task (STS-B) cer2017semeval. We therefore fine-tuned BERT on the training dataset of 460k captions (373k for training and 87k for validation), each a (visual, caption, label [semantically related or not related]) example, with a binary cross-entropy classification loss [0,1] where the target is the semantic similarity between the visual context and the candidate caption, using a batch size of 16 for 2 epochs.

  • RoBERTa. RoBERTa is an improved, more robustly trained version of BERT; we therefore rely on the pre-trained Sentence-RoBERTa (sts) model reimers2019sentence, as it yields a better cosine score.
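The sentence-level score reduces to a cosine between two sentence embeddings. The sketch below uses a deliberately toy character-frequency encoder in place of the fine-tuned BERT / SRoBERTa encoder, purely to make the shape of the computation concrete; `sentence_similarity` and `toy_encode` are hypothetical names.

```python
import numpy as np

def sentence_similarity(encode, visual_object, caption):
    # Cosine similarity between the embedding of the object label
    # and the embedding of the candidate caption.
    u, v = encode(visual_object), encode(caption)
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def toy_encode(text):
    # Stand-in encoder: a 26-bin letter-frequency vector. A real system
    # would use a fine-tuned BERT or Sentence-RoBERTa model here.
    vec = np.zeros(26)
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - ord("a")] += 1
    return vec

sim = sentence_similarity(toy_encode, "pizza", "a pizza on a white plate")
```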

Fusion Similarity Expert. Inspired by the Product of Experts (PoE) hinton1999products, we combine the two experts at word and sentence level in a late fusion layer, as shown in Figure 1. The PoE is computed as follows:

p(c \mid \theta_1, \dots, \theta_n) = \frac{\prod_m p_m(c \mid \theta_m)}{\sum_i \prod_m p_m(c_i \mid \theta_m)}

where \theta_m are the parameters of each model m, p_m(c \mid \theta_m) is the probability of candidate caption c under model m, and i indexes all possible candidates in the data space. Since this approach is only interested in retrieving the most closely related caption, i.e. the candidate with the highest probability after re-ranking, the normalization step is not needed:

\hat{c} = \arg\max_c \prod_m p_m(c \mid \theta_m)

where p_m(c \mid \theta_m) are the probabilities assigned by each expert to the candidate caption c.
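The unnormalized Product-of-Experts selection can be sketched as follows; the expert dictionaries and caption strings are hypothetical stand-ins for the word-level (GloVe) and sentence-level (BERT) similarity scores.

```python
def product_of_experts(candidates, experts):
    # Multiply each expert's probability for a candidate and keep the
    # argmax; normalization is skipped because only the top-ranked
    # candidate is retrieved.
    def poe_score(caption):
        score = 1.0
        for expert in experts:
            score *= expert(caption)
        return score
    return max(candidates, key=poe_score)

# Hypothetical expert outputs for two candidate captions.
word_expert = {"a cat eats an apple": 0.8, "a cat eats a dish": 0.3}.get
sent_expert = {"a cat eats an apple": 0.7, "a cat eats a dish": 0.6}.get

best = product_of_experts(
    ["a cat eats an apple", "a cat eats a dish"],
    [word_expert, sent_expert],
)
```

Because the experts' scores multiply, a candidate must be plausible to both the word-level and the sentence-level expert to win, which is the intended "veto" behaviour of a Product of Experts.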

4 Experiments

4.1 Datasets and Evaluation Metric

We evaluate the proposed approach on two datasets of different sizes. The idea is to evaluate our approach on (1) a shallow CNN-LSTM model (i.e. a low-data scenario) and (2) a system trained on a huge amount of data (i.e. the Transformer).

Flickr 8K rashtchian2010collecting: the dataset contains 8K images, each with five human-annotated captions. We use this data to train the shallow model (6,270 train / 1,730 test).

COCO Tsung-Yi:14: it contains around 120K images, each annotated with five different human-written captions. For the Transformer baseline we use the most common split, as provided by karpathy2015deep: 5K images for testing, 5K for validation, and the rest for training.

Visual Context Dataset: although there are many public caption datasets, they contain no textual visual-context information such as the objects in the image. We therefore enrich the two datasets mentioned above with textual visual context. In particular, to automate visual-context generation and dispense with the need for human labeling, we use ResNet Kaiming:16 to extract the top-k (k=3) visual context labels for each image in the caption dataset.
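The top-k selection step amounts to keeping the k most probable class labels from the classifier's output distribution. A minimal sketch, assuming the ResNet softmax outputs have already been mapped to a hypothetical label-to-probability dictionary:

```python
def top_k_visual_context(class_probs, k=3):
    # Keep the k most probable object labels as textual visual context.
    # class_probs: label -> probability, e.g. from a ResNet classifier.
    ranked = sorted(class_probs.items(), key=lambda kv: kv[1], reverse=True)
    return [label for label, _ in ranked[:k]]

probs = {"monitor": 0.62, "keyboard": 0.21, "desk": 0.09, "mouse": 0.05}
context = top_k_visual_context(probs)  # ['monitor', 'keyboard', 'desk']
```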

Model Voc TTR Uniq WPC
Show and tell Oriol:15
Tell 304 0.79 10.4 12.7
Tell+VR 310 0.82 9.42 13.5
VilBERT lu202012
Vil 894 0.87 8.05 10.5
Vil+VR 953 0.85 8.86 10.8
Transformer Marcella:20
Trans 935 0.86 7.44 9.62
Trans+VR 936 0.86 7.48 8.68
Table 2: Lexical diversity of captions before and after re-ranking. The Uniq and WPC columns indicate the average number of unique words and of total words per caption, respectively. The Show-and-Tell results refer to the Flickr 1,730-image test set; the VilBERT and Transformer results refer to the COCO Karpathy 5K test set.

5 Results

Evaluation Metrics. We use the official COCO offline evaluation suite, producing several widely used caption-quality metrics: BLEU papineni2002bleu, METEOR banerjee2005meteor, ROUGE lin2004rouge, CIDEr vedantam2015cider, and BERTscore (B-S) bert-score.

5.1 Results and Analysis

We use visual semantic information to re-rank the candidate captions produced by out-of-the-box state-of-the-art caption generators. We extract the top-20 beam-search candidate captions from three different architectures: (1) the standard CNN+LSTM model Oriol:15, (2) the pre-trained vision-and-language model VilBERT lu202012, fine-tuned on 12 different vision-and-language datasets such as caption-based image retrieval, and (3) the specialized caption Transformer Marcella:20.

Experiments applying the different re-rankers to each base system are shown in Table 1. The tested re-rankers are: (1) VR_V1 and VR_V2, which use BERT and GloVe similarity between the candidate caption and the visual context (the top-2 objects V1 and V2, one at a time during inference) to obtain the re-ranked score; and (2) VR_V1 (sts) and VR_V2 (sts), which carry out the same procedure using similarity produced by SRoBERTa.

Our re-ranker produced mixed results, as the model struggles when the beam search is less diverse: in that case it cannot reliably select the caption most closely related to its visual context, as shown in Figure 2, which visualizes the final visual beam re-ranking.

Evaluation of Lexical Diversity. As shown in Table 2, we also evaluate the model from a lexical diversity perspective. We can conclude that re-ranking yields (1) a larger vocabulary and (2) an improvement in unique words per caption, even with a lower Type-Token Ratio (TTR) brown2005encyclopedia. TTR is the number of unique words (types) divided by the total number of tokens in a text fragment.
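The diversity statistics in Table 2 can be computed directly from the caption set. A small sketch mirroring the Voc, TTR, and WPC columns (`lexical_stats` is an illustrative helper, not the paper's evaluation code):

```python
def lexical_stats(captions):
    # Voc: vocabulary size (distinct tokens across all captions)
    # TTR: type-token ratio = distinct tokens / total tokens
    # WPC: average number of words per caption
    tokens = [word for caption in captions for word in caption.lower().split()]
    types = set(tokens)
    return {
        "Voc": len(types),
        "TTR": len(types) / len(tokens),
        "WPC": len(tokens) / len(captions),
    }

stats = lexical_stats(["a dog runs", "a dog sleeps on a mat"])
```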

Figure 3: (Top) 1K random samples from the Flickr test set with the Show-and-Tell model. Each expert contributes a different probability confidence, so the model learns the semantic relation at both word and sentence level. (Bottom) 5K random samples from COCO captions with the Transformer-based caption model. The GloVe score dominates the distribution and becomes the main expert.
Model B-4 M R C B-S
Transformer based caption generator Marcella:20
Trans 0.374 0.278 0.569 1.153 0.9399
+VR 0.370 0.277 0.567 1.145 0.9395
+VR 0.371 0.278 0.567 1.149 0.9398
+VR 0.369 0.278 0.567 1.144 0.9395
+VR_V1 0.371 0.278 0.568 1.148 0.9398
+VR_V2 0.371 0.278 0.568 1.149 0.9398
Table 3: Ablation study comparing each model against the GloVe-only visual re-ranker on the Transformer baseline. Figure 3 (bottom) shows that BERT does not contribute to the final score as much as GloVe, for two reasons: (1) short captions and (2) a less diverse beam.

Ablation Study. We performed an ablation study to investigate the effectiveness of each model. In the proposed architecture, each expert learns a different representation, at word and sentence level respectively. In this experiment, we trained each model separately, as shown in Table 3. GloVe performed better as a stand-alone model than in combination (thus, combining the experts degrades accuracy). To investigate this further, we visualized each expert before the fusion layer, as shown in Figure 3. Limitation. In contrast to the CNN-LSTM case (Figure 3, top), where each expert contributes to the final decision, we observed that a shorter caption (with less context) can influence the BERT similarity score negatively. The word-level expert, i.e. GloVe, therefore dominates as the main expert.


6 Conclusion

In this work, we have introduced an approach that overcomes a limitation of beam search and avoids re-training for better accuracy. We proposed a combined word- and sentence-level visual re-ranker for beam search. However, we discovered that word and sentence similarity disagree with each other when the beam search is less diverse.


Appendix A Hyperparameters and Setting

All training and beam search are implemented in fairseq ott2019fairseq and trained with PyTorch 1.7.1.


Visual Re-ranker. The only model we tuned is BERT. We fine-tuned it on the training dataset using the original BERT implementation, TensorFlow 1.15 with CUDA 8 abadi2016tensorflow. The textual dataset contains around 460k captions, 373k for training and 87k for validation, each a (visual, caption, label [semantically related or not related]) example. We use a batch size of 16 for three epochs, and we kept the rest of the hyperparameter settings as in the original implementation. Note that we keep GloVe as a static model, as it is trained on 840 billion tokens.

Show-and-Tell Oriol:15. We train this shallow model from scratch on the Flickr 8K dataset (6,270 train / 1,730 test) (hardware: GTX 1070 Ti GPU, 32 GB RAM, 8-core i7 CPU).

Caption Transformer Marcella:20. We train the Transformer from scratch with Bottom-Up features anderson2018bottom. However, unlike the original implementation by the authors, we use a full 12-layer Transformer. We follow the same hyperparameters as the original implementation. The model is trained with PyTorch 1.7.1 on a single K80 GPU.

VilBERT lu202012. Since VilBERT is trained on 12 datasets, we use it as an out-of-the-box model.

Appendix B Examples of Re-ranked Captions

Best Beam. In Figure 4 we show examples of the proposed re-ranker alongside the best baseline beam-search candidate (BL). The model struggles to unify information from different modalities, and therefore the word-level expert has a stronger influence on the final score. In addition, the visual classifier faces difficulties with complex background images. This could be addressed in future work by employing multiple classifiers (each with multiple labels) and then using a voting technique to filter out the most probable object in the image.

Greedy. We also experiment with the greedy output (BL), as shown in Figure 5; our model suffers from the same limitation.

Visual: monitor
BL: a computer monitor on a desk with a keyboard
VR: a desk with a computer monitor and a keyboard
Human: a computer that is on a wooden desk

Visual: ant
BL: a group of birds walking in the water
VR: a group of birds walking in the water
Human: a group of small birds walking on top of a beach

Visual: necklace
BL: a woman wearing a white dress holding a pair of scissors
VR: a woman with a pair of scissors on
Human: a silver colored necklace with a pair of mini scissors on it

Visual: food
BL: a plate of food on a table
VR: a plate of food and a drink on a table
Human: a white plate with some food on it

Visual: apple
BL: a cat is eating an apple
VR: a close up of a cat eating an apple
Human: a gray cat eating a treat from a humans hand

Visual: chainlink fence
Vil: a black and white photo of train tracks
VR: a black and white photo of a train on the tracks
Human: a long train sitting on a railroad track

Visual: cardigan
BL: a cat sitting on the floor next to a closet
VR: a cat and a dog in a room
Human: a cat and a dog on the floor in a room

Visual: bassinet
BL: a baby sitting in front of a cake
VR: a baby sitting in front of a birthday cake
Human: a woman standing over a sheet cake sitting on top of table
Figure 4: Examples of the re-ranked captions by our visual re-ranker (VR) and the original caption (Beam Search) by the baseline (BL).
Visual: cowboy hat
BL: a cat is eating a dish on the floor
VR: a black and white cat sitting in a bowl
Human: a cat on a wooden surface is looking at a wooden

Visual: pizza
BL: a pizza with cheese on a plate
VR: a pizza sitting on top of a white plate
Human: a small pizza being served on a white plate

Visual: dishwasher
BL: a man standing in a kitchen with a laptop
VR: a man standing in a kitchen preparing food
Human: a man with some drink in hand stands in front of counter

Visual: lab coat
BL: a man standing in a kitchen holding a glass of wine
VR: a man standing in a kitchen holding a wine glass
Human: a man standing in a kitchen holding a glass full of alcohol

Visual: indian elephant
BL: a group of elephants under a shelter in a field
VR: a group of elephants under a hut
Human: a young man riding a skateboard down a yellow hand rail

Visual: chain
Vil: a group of women sitting on a bench eating
VR: a group of women eating hot dogs
Human: three people are pictured while they are eating

Visual: trolleybus
BL: a green bus parked in front of a building
VR: a green double decker bus parked in front of a building
Human: a passenger bus that is parked in front of a library

Visual: racket
BL: a woman hitting a tennis ball on a tennis court
VR: a woman holding a tennis ball on a tennis court
Human: a large crowd of people are watching a lady play tennis
Figure 5: Examples of the re-ranked captions by our visual re-ranker (VR) and the original caption (greedy) by the baseline (BL).