Alleviating Noisy Data in Image Captioning with Cooperative Distillation

12/21/2020
by   Pierre Dognin, et al.

Image captioning systems have made substantial progress, largely due to the availability of curated datasets like Microsoft COCO or VizWiz that contain accurate descriptions of their corresponding images. Unfortunately, the scarce availability of such cleanly labeled data means trained models often produce captions that are terse and idiosyncratically specific to details in the image. We propose a new technique, cooperative distillation, that combines clean curated datasets with the web-scale, automatically extracted captions of the Google Conceptual Captions (GCC) dataset, which can have poor descriptions of images but is abundant in size and therefore provides a rich vocabulary, resulting in more expressive captions.
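The abstract does not spell out the distillation objective, but the general idea of blending a clean supervised signal with a soft signal distilled from a model exposed to noisy web-scale data can be sketched as follows. All names, weights, and the specific loss form here are illustrative assumptions, not the paper's actual formulation:

```python
# Hypothetical sketch: blend hard-label cross-entropy on clean, curated
# captions with a soft KL term toward a "teacher" trained on noisy
# web-scale captions. Alpha and temperature are illustrative knobs;
# the paper's actual cooperative-distillation objective may differ.
import numpy as np

def softmax(logits, temperature=1.0):
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, target_ids,
                      alpha=0.5, temperature=2.0):
    """Combine clean-data cross-entropy with KL(teacher || student).

    student_logits, teacher_logits: (num_tokens, vocab_size) arrays
    target_ids: ground-truth token ids from the curated dataset
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # Soft target term: KL divergence from teacher to student,
    # averaged over caption tokens.
    kl = np.mean(np.sum(p_teacher * (np.log(p_teacher + 1e-12)
                                     - np.log(p_student + 1e-12)), axis=-1))
    # Hard target term: cross-entropy against the clean caption tokens.
    p_hard = softmax(student_logits)
    ce = -np.mean(np.log(p_hard[np.arange(len(target_ids)), target_ids] + 1e-12))
    # Temperature**2 rescales the soft-term gradients, following the
    # standard distillation convention.
    return alpha * ce + (1.0 - alpha) * (temperature ** 2) * kl
```

Setting `alpha` closer to 1 leans on the clean curated labels; lowering it gives more weight to the vocabulary-rich but noisier teacher signal.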

