
Counterfactual Data Augmentation using Locally Factored Dynamics
Many dynamic processes, including common scenarios in robotic control and reinforcement learning (RL), involve a set of interacting subprocesses. Though the subprocesses are not independent, their interactions are often sparse, and the dynamics at any given time step can often be decomposed into locally independent causal mechanisms. Such local causal structures can be leveraged to improve the sample efficiency of sequence prediction and off-policy reinforcement learning. We formalize this by introducing local causal models (LCMs), which are induced from a global causal model by conditioning on a subset of the state space. We propose an approach to inferring these structures given an object-oriented state representation, as well as a novel algorithm for model-free Counterfactual Data Augmentation (CoDA). CoDA uses local structures and an experience replay to generate counterfactual experiences that are causally valid in the global model. We find that CoDA significantly improves the performance of RL agents in locally factored tasks, including the batch-constrained and goal-conditioned settings.
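To make the core idea concrete, here is a minimal sketch (not the paper's implementation; the function names, the two-factor state layout, and the locality test are all assumptions for illustration). When the local causal graph at two observed transitions splits the state into the same two non-interacting factors, a CoDA-style swap can stitch factor A of one transition to factor B of the other, producing a counterfactual transition that is still valid under the local causal model:

```python
def coda_swap(t1, t2, factors_independent):
    """Counterfactual data augmentation by swapping locally
    independent state factors between two real transitions.

    t1, t2: dicts with 'obs' and 'next_obs' lists, each laid out
    as [factor_a..., factor_b...] with factors of equal length.
    factors_independent(obs) -> True when the two factors do not
    interact at that state (the local causal graph is disconnected).
    Returns a counterfactual transition, or None when the swap
    would not be causally valid in the local model.
    """
    if not (factors_independent(t1["obs"]) and factors_independent(t2["obs"])):
        return None  # factors interact somewhere; swap is unsafe
    half = len(t1["obs"]) // 2
    # Take factor A from t1 and factor B from t2, consistently
    # for both the state and the next state.
    return {
        "obs": t1["obs"][:half] + t2["obs"][half:],
        "next_obs": t1["next_obs"][:half] + t2["next_obs"][half:],
    }

# Toy example: two far-apart "objects", each a 1-D position
# that evolves independently of the other.
t1 = {"obs": [0.0, 5.0], "next_obs": [1.0, 5.5]}
t2 = {"obs": [9.0, 2.0], "next_obs": [9.1, 2.2]}

# Hypothetical locality test: in this toy the factors never interact.
cf = coda_swap(t1, t2, lambda obs: True)
print(cf)  # {'obs': [0.0, 2.0], 'next_obs': [1.0, 2.2]}
```

The counterfactual transition was never observed, yet under the assumed local independence it is consistent with the global dynamics, so it can be added to the replay buffer as extra training data.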