Captioning Images Taken by People Who Are Blind

02/20/2020
by Danna Gurari, et al.

While designing algorithms that can automatically caption images is an important problem in the vision community, few publicly available datasets for algorithm development directly address the interests of real users. Observing that people who are blind have relied for nearly a decade on (human-based) image captioning services to learn about the images they take, we introduce the first image captioning dataset to represent this real use case. This new dataset, which we call VizWiz-Captions, consists of over 39,000 images originating from people who are blind, each paired with five captions. We analyze this dataset to (1) characterize the typical captions, (2) characterize the diversity of content found in the images, and (3) compare its content to that found in eight popular vision datasets. We also analyze modern image captioning algorithms to identify what makes this new dataset challenging for the vision community. We publicly share the dataset with captioning challenge instructions at https://vizwiz.org.

research
03/21/2021

#PraCegoVer: A Large Dataset for Image Captioning in Portuguese

Automatically describing images using natural sentences is an important ...
research
05/17/2021

Multi-Modal Image Captioning for the Visually Impaired

One of the ways blind people understand their surroundings is by clickin...
research
03/27/2020

Assessing Image Quality Issues for Real-World Problems

We introduce a new large-scale dataset that links the assessment of imag...
research
12/21/2020

Image Captioning as an Assistive Technology: Lessons Learned from VizWiz 2020 Challenge

Image captioning has recently demonstrated impressive progress largely o...
research
01/27/2021

See-Through Captions: Real-Time Captioning on Transparent Display for Deaf and Hard-of-Hearing People

Real-time captioning is a useful technique for deaf and hard-of-hearing ...
research
08/26/2023

Towards Real Time Egocentric Segment Captioning for The Blind and Visually Impaired in RGB-D Theatre Images

In recent years, image captioning and segmentation have emerged as cruci...
