Adversarial Imitation Learning via Random Search

08/21/2020
by MyungJae Shin, et al.

Developing agents that can perform challenging, complex tasks is the goal of reinforcement learning. Model-free reinforcement learning has been considered a feasible solution. However, state-of-the-art research has developed increasingly complicated techniques, and this growing complexity makes results difficult to reproduce. Furthermore, the problem of reward dependency still exists. As a result, research on imitation learning, which learns a policy from expert demonstrations, has begun to attract attention. Imitation learning learns a policy directly from data on expert behavior, without an explicit reward signal provided by the environment. However, imitation learning typically optimizes policies with deep reinforcement learning methods such as trust region policy optimization. As a result, imitation learning based on deep reinforcement learning also faces a reproducibility crisis. The complexity of model-free methods has received considerable critical attention. Reinforcement learning based on derivative-free optimization with simplified policies obtains competitive performance on dynamic, complex tasks. Simplified policies and derivative-free methods keep the algorithm simple and make research results easy to reproduce. In this paper, we propose an imitation learning method that takes advantage of derivative-free optimization with simple linear policies. The proposed method performs simple random search in the parameter space of policies and is computationally efficient. Experiments in this paper show that the proposed model, without a direct reward signal from the environment, obtains competitive performance on the MuJoCo locomotion tasks.
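To make the idea of random search in the parameter space of a linear policy concrete, the following is a minimal sketch and not the authors' implementation. It assumes a generic antithetic random-search update with reward normalization, and it replaces the adversarially learned reward with a hypothetical stand-in, `imitation_score`, which simply measures how closely the linear policy matches expert actions on a fixed batch of states. All function names, hyperparameters, and the toy data are assumptions made for illustration.

```python
import numpy as np


def imitation_score(theta, states, expert_actions):
    """Hypothetical surrogate reward (an assumption, not the paper's
    discriminator): negative mean squared error between the linear
    policy's actions and the expert's actions."""
    actions = states @ theta.T  # linear policy: a = theta @ s
    return -np.mean((actions - expert_actions) ** 2)


def random_search(states, expert_actions, n_iters=500, n_dirs=8,
                  step_size=0.02, noise=0.05, seed=0):
    """Derivative-free search over the weights of a linear policy."""
    rng = np.random.default_rng(seed)
    obs_dim = states.shape[1]
    act_dim = expert_actions.shape[1]
    theta = np.zeros((act_dim, obs_dim))

    for _ in range(n_iters):
        # Sample random perturbation directions in parameter space.
        deltas = rng.standard_normal((n_dirs, act_dim, obs_dim))

        # Score each direction with a positive and a negative perturbation.
        r_plus = np.array([
            imitation_score(theta + noise * d, states, expert_actions)
            for d in deltas
        ])
        r_minus = np.array([
            imitation_score(theta - noise * d, states, expert_actions)
            for d in deltas
        ])

        # Normalize the step by the spread of the collected scores.
        sigma = np.concatenate([r_plus, r_minus]).std() + 1e-8

        # Weight each direction by its score difference and average.
        update = np.tensordot(r_plus - r_minus, deltas, axes=1) / n_dirs
        theta = theta + (step_size / sigma) * update

    return theta


if __name__ == "__main__":
    # Toy data: a hypothetical "expert" that is itself a linear policy.
    theta_true = np.array([[1.0, -2.0, 0.5], [0.3, 0.8, -1.2]])
    data_rng = np.random.default_rng(1)
    states = data_rng.standard_normal((256, 3))
    expert_actions = states @ theta_true.T

    theta = random_search(states, expert_actions)
    print("final score:", imitation_score(theta, states, expert_actions))
```

No gradients of the policy or the score are computed: the update uses only scalar scores of perturbed policies, which is what keeps this family of methods simple to implement. In an adversarial imitation setting, the score would instead come from rollouts evaluated by a learned discriminator rather than from a fixed supervised error.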

research
03/31/2020

Augmented Q Imitation Learning (AQIL)

The study of unsupervised learning can be generally divided into two cat...
research
02/26/2018

Reinforcement and Imitation Learning for Diverse Visuomotor Skills

We propose a model-free deep reinforcement learning method that leverage...
research
07/24/2019

Learning Goal-Oriented Visual Dialog Agents: Imitating and Surpassing Analytic Experts

This paper tackles the problem of learning a questioner in the goal-orie...
research
06/10/2021

Differentiable Robust LQR Layers

This paper proposes a differentiable robust LQR layer for reinforcement ...
research
10/31/2022

Learning to Optimize Permutation Flow Shop Scheduling via Graph-based Imitation Learning

The permutation flow shop scheduling (PFSS), aiming at finding the optim...
research
01/31/2020

Preventing Imitation Learning with Adversarial Policy Ensembles

Imitation learning can reproduce policies by observing experts, which po...
research
09/29/2021

Mitigation of Adversarial Policy Imitation via Constrained Randomization of Policy (CRoP)

Deep reinforcement learning (DRL) policies are vulnerable to unauthorize...
