P^3O: Transferring Visual Representations for Reinforcement Learning via Prompting

03/22/2023
by   Guoliang You, et al.

Deep reinforcement learning (DRL) algorithms need to transfer their learned policies to new environments that have different visual inputs. In this paper, we introduce Prompt-based Proximal Policy Optimization (P^3O), a three-stage DRL algorithm that transfers visual representations from a target to a source environment by applying prompting. The three stages of P^3O are pre-training, prompting, and predicting. In particular, we specify a prompt-transformer for representation conversion and propose a two-step training process that trains the prompt-transformer for the target environment while the rest of the DRL pipeline remains unchanged. We implement P^3O and evaluate it on the OpenAI CarRacing video game. The experimental results show that P^3O outperforms the state-of-the-art visual transfer schemes. In particular, P^3O allows the learned policies to perform well in environments with different visual inputs, which is much more effective than retraining the policies in these environments.
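The three stages described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation. It assumes flat vector observations instead of images, random linear maps standing in for a policy pre-trained with PPO, a hypothetical linear "visual shift" between environments, and a single least-squares fit in place of the paper's two-step training. All names (`source_policy`, `to_target`, `W_prompt`) are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
OBS_DIM, FEAT_DIM, N_ACTIONS = 12, 8, 3

# Stage 1 (pre-training): stand-ins for an encoder and policy head that
# were already trained in the source environment. They stay frozen.
W_enc = rng.normal(size=(OBS_DIM, FEAT_DIM))
W_pi = rng.normal(size=(FEAT_DIM, N_ACTIONS))

def source_policy(obs):
    """Frozen pipeline: encode the observation, then act greedily."""
    return np.argmax(np.tanh(obs @ W_enc) @ W_pi, axis=-1)

# Hypothetical visual shift: the target environment shows the same state
# through a different (here linear and invertible) rendering.
shift = np.eye(OBS_DIM) + 0.1 * rng.normal(size=(OBS_DIM, OBS_DIM))

def to_target(obs):
    return obs @ shift

# Stage 2 (prompting): fit only the prompt module, which converts target
# observations back into source-like observations. Least squares on
# paired samples is an assumption made to keep the sketch short.
src_obs = rng.normal(size=(256, OBS_DIM))
tgt_obs = to_target(src_obs)
W_prompt, *_ = np.linalg.lstsq(tgt_obs, src_obs, rcond=None)

# Stage 3 (predicting): prompt module in front of the unchanged policy.
def prompted_policy(tgt):
    return source_policy(tgt @ W_prompt)

# Compare actions on fresh states seen through both renderings.
test_src = rng.normal(size=(100, OBS_DIM))
agreement = np.mean(prompted_policy(to_target(test_src)) == source_policy(test_src))
print(f"action agreement with source policy: {agreement:.2f}")
```

Only `W_prompt` is fitted here; `W_enc` and `W_pi` are never updated, which mirrors the abstract's point that the rest of the DRL pipeline remains unchanged.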

research
06/03/2019

Sequential Triggers for Watermarking of Deep Reinforcement Learning Policies

This paper proposes a novel scheme for the watermarking of Deep Reinforc...
research
07/15/2019

Proximal Policy Optimization with Mixed Distributed Training

Instability and slowness are two main problems in deep reinforcement lea...
research
06/04/2020

Visual Transfer for Reinforcement Learning via Wasserstein Domain Confusion

We introduce Wasserstein Adversarial Proximal Policy Optimization (WAPPO...
research
09/12/2021

Direct Random Search for Fine Tuning of Deep Reinforcement Learning Policies

Researchers have demonstrated that Deep Reinforcement Learning (DRL) is ...
research
05/30/2016

Control of Memory, Active Perception, and Action in Minecraft

In this paper, we introduce a new set of reinforcement learning (RL) tas...
research
02/24/2020

How Transferable are the Representations Learned by Deep Q Agents?

In this paper, we consider the source of Deep Reinforcement Learning (DR...
research
10/07/2020

Proximal Policy Optimization with Relative Pearson Divergence

Deep reinforcement learning (DRL) is one of the promising approaches for...
