Teaching Machines to Describe Images via Natural Language Feedback

06/01/2017
by Huan Ling, et al.

Robots will eventually be part of every household. It is thus critical to enable algorithms to learn from and be guided by non-expert users. In this paper, we bring a human in the loop, and enable a human teacher to give feedback to a learning agent in the form of natural language. We argue that a descriptive sentence can provide a much stronger learning signal than a numeric reward in that it can easily point to where the mistakes are and how to correct them. We focus on the problem of image captioning in which the quality of the output can easily be judged by non-experts. We propose a hierarchical phrase-based captioning model trained with policy gradients, and design a feedback network that provides reward to the learner by conditioning on the human-provided feedback. We show that by exploiting descriptive feedback our model learns to perform better than when given independently written human captions.
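
To make the idea of training with policy gradients from a feedback-derived reward more concrete, here is a minimal sketch of the REINFORCE update on a toy problem. It is not the authors' hierarchical phrase-based model or their feedback network: the candidate phrases in `PHRASES`, the stand-in reward `toy_feedback_reward`, and the learning rate are all hypothetical, chosen only to show how a scalar reward can shift a policy toward the outputs a teacher prefers.

```python
import math
import random


def softmax(logits):
    """Convert logits to a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_step(logits, reward_fn, lr=0.5, rng=random):
    """One REINFORCE update for a single categorical choice.

    Samples an action from the current policy, queries the reward, and moves
    the logits along reward * grad(log pi(action)).
    """
    probs = softmax(logits)
    action = rng.choices(range(len(logits)), weights=probs, k=1)[0]
    reward = reward_fn(action)
    # For a softmax policy, d log pi(action) / d logit_i = 1[i == action] - probs[i].
    return [
        theta + lr * reward * ((1.0 if i == action else 0.0) - p)
        for i, (theta, p) in enumerate(zip(logits, probs))
    ]


# Hypothetical candidate phrases; a real captioner would generate these.
PHRASES = ["a cat", "a dog", "a car"]


def toy_feedback_reward(action):
    """Stand-in for a learned feedback network (assumption, not the paper's).

    Pretends the teacher marked "a cat" as correct and everything else as wrong.
    """
    return 1.0 if PHRASES[action] == "a cat" else -1.0


def train(steps=200, seed=0):
    """Run repeated REINFORCE updates and return the final phrase probabilities."""
    rng = random.Random(seed)
    logits = [0.0] * len(PHRASES)
    for _ in range(steps):
        logits = reinforce_step(logits, toy_feedback_reward, rng=rng)
    return softmax(logits)


if __name__ == "__main__":
    for phrase, p in zip(PHRASES, train()):
        print(f"{phrase}: {p:.3f}")
```

In this toy setting the policy concentrates on the phrase the stand-in reward favors. In the setting the abstract describes, the reward would instead come from a network conditioned on the teacher's natural language feedback, and the policy would generate phrases sequentially rather than pick one from a fixed list.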

research
04/04/2021

Influencing Reinforcement Learning through Natural Language Guidance

Interactive reinforcement learning agents use human feedback or instruct...
research
12/01/2018

Lifelong Learning for Image Captioning by Asking Natural Language Questions

In order to bring artificial agents into our lives, we will need to go b...
research
07/08/2023

Improving Prototypical Part Networks with Reward Reweighing, Reselection, and Retraining

In recent years, work has gone into developing deep interpretable method...
research
12/19/2022

Continual Learning for Instruction Following from Realtime Feedback

We study the problem of continually training an instruction-following ag...
research
09/30/2020

Learning Rewards from Linguistic Feedback

We explore unconstrained natural language feedback as a learning signal ...
research
01/21/2017

Interactive Learning from Policy-Dependent Human Feedback

For agents and robots to become more useful, they must be able to quickl...
research
07/20/2023

Breadcrumbs to the Goal: Goal-Conditioned Exploration from Human-in-the-Loop Feedback

Exploration and reward specification are fundamental and intertwined cha...
