Quality-agnostic Image Captioning to Safely Assist People with Vision Impairment

04/28/2023
by Lu Yu, et al.

Automated image captioning has the potential to be a useful tool for people with vision impairments. Images taken by this user group are often noisy, which leads to incorrect and even unsafe model predictions. In this paper, we propose a quality-agnostic framework to improve the performance and robustness of image captioning models for visually impaired people. We address this problem from three angles: data, model, and evaluation. First, we show how data augmentation techniques for generating synthetic noise can address data sparsity in this domain. Second, we enhance the robustness of the model by expanding a state-of-the-art model to a dual network architecture, using the augmented data and leveraging different consistency losses. Our results demonstrate increased performance, e.g., an absolute improvement of 2.15 on CIDEr compared to state-of-the-art image captioning networks, as well as increased robustness to noise, with up to 3 points of improvement on CIDEr in noisier settings. Finally, we evaluate prediction reliability using confidence calibration on images with different difficulty and noise levels, showing that our models perform more reliably in safety-critical situations. The improved model is part of an assisted living application, which we develop in partnership with the Royal National Institute of Blind People.
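
The abstract mentions evaluating reliability through confidence calibration but does not say which metric is used. As an illustration only, the sketch below computes expected calibration error (ECE), a common calibration measure, and is an assumption rather than the authors' method. The function name `expected_calibration_error`, the choice of ten equal-width bins, and the idea of scoring each prediction as simply correct or incorrect are all hypothetical choices made for this example.

```python
from typing import List, Sequence


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,
) -> float:
    """Illustrative ECE: weighted mean gap between confidence and accuracy.

    Assumption: each prediction has a confidence in [0, 1] and a binary
    correctness label. How "correct" is defined for a caption (per token,
    per caption) is not specified by the abstract and is left to the caller.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    if not confidences:
        return 0.0

    # Equal-width bins [k/n, (k+1)/n); a confidence of exactly 1.0 is
    # placed in the last bin.
    bins: List[List[int]] = [[] for _ in range(n_bins)]
    for i, conf in enumerate(confidences):
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append(i)

    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(confidences[i] for i in members) / len(members)
        accuracy = sum(1 for i in members if correct[i]) / len(members)
        # Weight each bin's gap by the fraction of predictions it holds.
        ece += (len(members) / total) * abs(avg_conf - accuracy)
    return ece
```

Under these assumptions, a lower value means the model's stated confidence tracks its actual accuracy more closely. Computing the score separately on subsets of images grouped by noise level would give one way to compare reliability across difficulty levels.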


