
Behavior Regularized Offline Reinforcement Learning
In reinforcement learning (RL) research, it is common to assume access t...
The Laplacian in RL: Learning Representations with Efficient Approximations
The smallest eigenvectors of the graph Laplacian are well-known to provi...
Why Does Hierarchy (Sometimes) Work So Well in Reinforcement Learning?
Hierarchical reinforcement learning has demonstrated significant success...
Near-Optimal Representation Learning for Hierarchical Reinforcement Learning
We study the problem of representation learning in goal-conditioned hier...
BRPO: Batch Residual Policy Optimization
In batch reinforcement learning (RL), one often constrains a learned pol...
Multi-Agent Manipulation via Locomotion using Hierarchical Sim2Real
Manipulation and locomotion are closely related problems that are often ...
Lyapunov-based Safe Policy Optimization for Continuous Control
We study continuous action reinforcement learning problems in which it i...
DeepMDP: Learning Continuous Latent Space Models for Representation Learning
Many reinforcement learning (RL) tasks provide the agent with high-dimen...
MorphNet: Fast & Simple Resource-Constrained Structure Learning of Deep Networks
We present MorphNet, an approach to automate the design of neural networ...
Trust-PCL: An Off-Policy Trust Region Method for Continuous Control
Trust region methods, such as TRPO, are often used to stabilize policy o...
Improving Policy Gradient by Exploring Underappreciated Rewards
This paper presents a novel form of policy gradient for model-free reinf...
Bridging the Gap Between Value and Policy Based Reinforcement Learning
We establish a new connection between value and policy based reinforceme...
Path Consistency Learning in Tsallis Entropy Regularized MDPs
We study the sparse entropy-regularized reinforcement learning (ERL) pro...
Deep Reinforcement Learning for Vision-Based Robotic Grasping: A Simulated Comparative Evaluation of Off-Policy Methods
In this paper, we explore deep reinforcement learning algorithms for vis...
Smoothed Action Value Functions for Learning Gaussian Policies
State-action value functions (i.e., Q-values) are ubiquitous in reinforc...
Data-Efficient Hierarchical Reinforcement Learning
Hierarchical reinforcement learning (HRL) is a promising approach to ext...
A Lyapunov-based Approach to Safe Reinforcement Learning
In many real-world reinforcement learning (RL) problems, besides optimiz...
Identifying and Correcting Label Bias in Machine Learning
Datasets often contain biases which unfairly disadvantage certain groups...
DualDICE: Behavior-Agnostic Estimation of Discounted Stationary Distribution Corrections
In many real-world reinforcement learning applications, access to the en...
Group-based Fair Learning Leads to Counterintuitive Predictions
A number of machine learning (ML) methods have been proposed recently to...
Imitation Learning via Off-Policy Distribution Matching
When performing imitation learning from expert demonstrations, distribut...
AlgaeDICE: Policy Gradient from Arbitrary Experience
In many real-world applications of reinforcement learning (RL), interact...
Reinforcement Learning via Fenchel-Rockafellar Duality
We review basic concepts of convex duality, focusing on the very general...
Ofir Nachum
