Guided Visual Attention Model Based on Interactions Between Top-down and Bottom-up Information for Robot Pose Prediction

02/21/2022
by Hyogo Hiruma, et al.

Learning to control a robot commonly requires mapping between robot states and camera images, for which conventional deep vision models require large training datasets. Existing visual attention models, such as Deep Spatial Autoencoders, have improved data efficiency by training the model to selectively extract only the task-relevant image area. However, since these models are unable to select attention targets on demand, the diversity of trainable tasks is limited. This paper proposes a novel Key-Query-Value formulated visual attention model that can be guided to a certain attention target. The model creates an attention heatmap from the Key and Query, and selectively extracts the attended data represented in the Value. This structure can incorporate external inputs to create the Query, which is trained to represent the target objects. Separating Query creation improves the model's flexibility, enabling it to simultaneously obtain and switch between multiple targets in a top-down manner. The proposed model is evaluated in a simulator and a real-world environment, showing better performance than existing end-to-end robot vision models. The results of the real-world experiments indicate the model's high scalability and extensibility on robot control tasks.
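
The Key-Query-Value mechanism described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `guided_attention`, the array shapes, the scaled dot-product scoring, and the hand-built toy Key, Value, and Query arrays are all assumptions made for illustration. In the paper the Query is produced by a trained network from external inputs, whereas here it is set by hand.

```python
import numpy as np


def guided_attention(key, query, value):
    """Hypothetical sketch of query-guided spatial attention.

    key:   (H, W, C) per-pixel Key features
    query: (C,)      Query vector describing the attention target
    value: (H, W, D) per-pixel Value features
    Returns the (H, W) attention heatmap and the (D,) attended Value.
    """
    h, w, c = key.shape
    # Similarity between the Query and the Key at each image position
    # (scaled dot product is an assumption, not taken from the paper).
    scores = key.reshape(h * w, c) @ query / np.sqrt(c)
    # Softmax over all positions, shifted by the max for numerical stability.
    scores = scores - scores.max()
    weights = np.exp(scores)
    weights = weights / weights.sum()
    heatmap = weights.reshape(h, w)
    # Heatmap-weighted sum of the Value features.
    attended = weights @ value.reshape(h * w, -1)
    return heatmap, attended


# Toy scene (hypothetical): two "objects" with distinct Key features.
key = np.zeros((4, 4, 2))
key[1, 2] = [1.0, 0.0]  # object A
key[3, 0] = [0.0, 1.0]  # object B

# Use pixel coordinates as the Value, so the output is an expected position.
rows, cols = np.meshgrid(np.arange(4), np.arange(4), indexing="ij")
value = np.stack([rows, cols], axis=-1).astype(float)

# Switching the Query switches the attention target in a top-down manner.
query_a = np.array([5.0, 0.0])
query_b = np.array([0.0, 5.0])
heat_a, pos_a = guided_attention(key, query_a, value)
heat_b, pos_b = guided_attention(key, query_b, value)
```

In this toy setup the same Key and Value maps are reused for both targets; only the Query changes, which is the property the abstract attributes to separating Query creation from the rest of the model.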


research
06/12/2017

Enriched Deep Recurrent Visual Attention Model for Multiple Object Recognition

We design an Enriched Deep Recurrent Visual Attention Model (EDRAM) - an...
research
07/27/2018

Attention-based Active Visual Search for Mobile Robots

We present an active visual search model for finding objects in unknown ...
research
06/29/2022

Deep Active Visual Attention for Real-time Robot Motion Generation: Emergence of Tool-body Assimilation and Adaptive Tool-use

Sufficiently perceiving the environment is a critical factor in robot mo...
research
10/08/2020

Improving Attention Mechanism with Query-Value Interaction

Attention mechanism has played critical roles in various state-of-the-ar...
research
10/21/2016

Modular Deep Q Networks for Sim-to-real Transfer of Visuo-motor Policies

While deep learning has had significant successes in computer vision tha...
research
12/03/2022

Policy Learning for Active Target Tracking over Continuous SE(3) Trajectories

This paper proposes a novel model-based policy gradient algorithm for tr...
research
04/12/2023

Learning to search for and detect objects in foveal images using deep learning

The human visual system processes images with varied degrees of resoluti...
