New Image Captioning Encoder via Semantic Visual Feature Matching for Heavy Rain Images

05/28/2021
by Chang-Hwan Son, et al.

Image captioning generates text that describes the scene in an input image. It has been developed mainly for high-quality images taken in clear weather. In bad weather conditions, however, such as heavy rain, snow, and dense fog, the poor visibility caused by rain streaks, rain accumulation, and snowflakes seriously degrades image quality. This hinders the extraction of useful visual features and results in deteriorated image captioning performance. To address this practical issue, this study introduces a new encoder for captioning heavy rain images. The central idea is to transform the output features extracted from heavy rain input images into semantic visual features associated with words and sentence context. To achieve this, a target encoder is first trained in an encoder-decoder framework to associate visual features with semantic words. Subsequently, the objects in a heavy rain image are rendered visible by an initial reconstruction subnetwork (IRS) based on a heavy rain model. The IRS is then combined with a semantic visual feature matching subnetwork (SVFMS), which matches the output features of the IRS with the semantic visual features of the pretrained target encoder. The proposed encoder is based on the joint learning of the IRS and SVFMS. It is trained in an end-to-end manner and then connected to the pretrained decoder for image captioning. Experiments demonstrate that the proposed encoder can generate semantic visual features associated with words even from heavy rain images, thereby increasing the accuracy of the generated captions.
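The joint learning of the IRS and SVFMS described above can be illustrated with a small sketch. The abstract does not state the loss functions, so everything specific here is an assumption: the use of mean squared error for both terms, the `match_weight` factor, the availability of a clean counterpart for each rainy image, and the names `joint_loss`, `irs`, `svfms`, and `target_encoder`. The toy stand-ins operate on flat lists of pixel values and are not the networks used in the paper.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def mse(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean squared error between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def joint_loss(
    rainy: Vector,
    clean: Vector,
    irs: Callable[[Vector], Vector],
    svfms: Callable[[Vector], Vector],
    target_encoder: Callable[[Vector], Vector],
    match_weight: float = 1.0,  # hypothetical weighting, not from the paper
) -> float:
    """Assumed joint objective: reconstruction term plus feature matching term."""
    restored = irs(rainy)            # IRS makes objects visible again
    matched = svfms(restored)        # SVFMS maps the IRS output to features
    target = target_encoder(clean)   # pretrained target encoder, kept fixed
    reconstruction = mse(restored, clean)
    matching = mse(matched, target)
    return reconstruction + match_weight * matching


# Toy stand-ins (assumptions for illustration only).
def toy_features(image: Vector) -> Vector:
    # A two-number "feature vector": mean and maximum intensity.
    return [sum(image) / len(image), max(image)]


def toy_irs(image: Vector) -> Vector:
    # Pretend the rain layer is a constant brightness offset of 0.5.
    return [p - 0.5 for p in image]


def identity(image: Vector) -> Vector:
    # An "encoder front end" that does no restoration at all.
    return list(image)


clean_image = [0.25, 0.5, 0.75]
rainy_image = [0.75, 1.0, 1.25]

loss_with_irs = joint_loss(rainy_image, clean_image, toy_irs, toy_features, toy_features)
loss_without_irs = joint_loss(rainy_image, clean_image, identity, toy_features, toy_features)
```

In this toy setting the loss is lower when the restoration step is present, which mirrors the motivation given above: both terms push the encoder's output for a rainy image toward what the pretrained target encoder would produce for a clear one. In an end-to-end setup the two terms would be minimized together, with the target encoder left unchanged.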


