Spatial-Language Attention Policies for Efficient Robot Learning

04/21/2023
by Priyam Parashar, et al.

We investigate how to build and train spatial representations for robot decision making with Transformers. In particular, for robots to operate in a range of environments, we must be able to quickly train or fine-tune robot sensorimotor policies that are robust to clutter, data efficient, and generalize well to different circumstances. As a solution, we propose Spatial Language Attention Policies (SLAP). SLAP uses three-dimensional tokens as the input representation to train a single multi-task, language-conditioned action prediction policy. Our method shows an 80% success rate across eight tasks with a single model, and a 47.5% success rate when unseen clutter and unseen object configurations are introduced, even with only a handful of examples per task. This represents an improvement of 30% over prior work ([…] given unseen distractors and configurations).
