NOC-REK: Novel Object Captioning with Retrieved Vocabulary from External Knowledge

03/28/2022
by Duc Minh Vo, et al.

Novel object captioning aims to describe objects absent from the training data; the key ingredient is providing object vocabulary to the model. Whereas existing methods rely heavily on an object detection model, we view the detection step as vocabulary retrieval from external knowledge, which takes the form of embeddings of object definitions from Wiktionary; the retrieval uses image region features learned by a transformer model. We propose an end-to-end method, Novel Object Captioning with Retrieved vocabulary from External Knowledge (NOC-REK), which learns vocabulary retrieval and caption generation simultaneously and successfully describes novel objects outside the training dataset. Furthermore, our model requires no retraining: whenever a novel object appears, it suffices to update the external knowledge. Our comprehensive experiments on the held-out COCO and Nocaps datasets show that NOC-REK is considerably effective compared with state-of-the-art methods.
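The retrieval idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the cosine-similarity scoring, the top-k rule, the function names `cosine` and `retrieve_vocabulary`, and the three-dimensional toy embeddings are all assumptions made for illustration. In the paper the definition embeddings come from Wiktionary and the region features are learned by a transformer model; here both are hand-written vectors.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def retrieve_vocabulary(region_features, knowledge, k=1):
    """Hypothetical retrieval step: for each image region, pick the k words whose
    definition embeddings are most similar to the region feature, and return the
    union of retrieved words in order of first retrieval."""
    retrieved = []
    for feat in region_features:
        ranked = sorted(knowledge, key=lambda w: cosine(feat, knowledge[w]), reverse=True)
        for word in ranked[:k]:
            if word not in retrieved:
                retrieved.append(word)
    return retrieved


# Toy external knowledge: word -> definition embedding (assumed values).
knowledge = {"dog": [1.0, 0.0, 0.0], "bus": [0.0, 1.0, 0.0]}

# Toy region features (assumed values); the second region resembles no known word well.
regions = [[0.9, 0.1, 0.0], [0.2, 0.1, 0.9]]

before = retrieve_vocabulary(regions, knowledge)

# A novel object appears: only the external knowledge is updated, nothing is retrained.
knowledge["zebra"] = [0.0, 0.0, 1.0]
after = retrieve_vocabulary(regions, knowledge)
```

In this toy setting, `before` contains only "dog", while `after` also contains "zebra". The retrieval function itself is unchanged between the two calls, which mirrors the claim that new objects are handled by updating the external knowledge rather than the model.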
