
Generalization in Mean Field Games by Learning Master Policies
Mean Field Games (MFGs) can potentially scale multi-agent systems to ext...
Implicitly Regularized RL with Implicit Q-Values
The Q-function is a central quantity in many Reinforcement Learning (RL)...
A functional mirror ascent view of policy gradient methods with function approximation
We use functional mirror ascent to propose a general framework (referred...
Offline Reinforcement Learning as Anti-Exploration
Offline Reinforcement Learning (RL) aims at learning an optimal control ...
There Is No Turning Back: A Self-Supervised Approach for Reversibility-Aware Reinforcement Learning
We propose to learn to distinguish reversible from irreversible actions ...
Concave Utility Reinforcement Learning: the Mean-field Game viewpoint
Concave Utility Reinforcement Learning (CURL) extends RL from linear to ...
What Matters for Adversarial Imitation Learning?
Adversarial imitation learning has become a popular framework for imitat...
Hyperparameter Selection for Imitation Learning
We address the issue of tuning hyperparameters (HPs) for imitation learn...
Mean Field Games Flock! The Reinforcement Learning Way
We present a method enabling a large number of agents to learn how to fl...
Offline Reinforcement Learning with Pseudometric Learning
Offline Reinforcement Learning methods seek to learn a policy from logge...
Scaling up Mean Field Games with Online Mirror Descent
We address scaling up equilibrium computation in Mean Field Games (MFGs)...
How To Train Your HERON
In this paper we apply Deep Reinforcement Learning (Deep RL) and Domain ...
Adversarially Guided Actor-Critic
Despite definite success in deep reinforcement learning problems, actor...
Self-Imitation Advantage Learning
Self-imitation learning is a Reinforcement Learning (RL) method that enc...
Munchausen Reinforcement Learning
Bootstrapping is a core mechanism in Reinforcement Learning (RL). Most a...
Fictitious Play for Mean Field Games: Continuous Time Analysis and Applications
In this paper, we deepen the analysis of continuous time Fictitious Play...
Show me the Way: Intrinsic Motivation from Demonstrations
The study of exploration in Reinforcement Learning (RL) has a long histo...
What Matters In On-Policy Reinforcement Learning? A Large-Scale Empirical Study
In recent years, on-policy reinforcement learning (RL) has been successf...
Primal Wasserstein Imitation Learning
Imitation Learning (IL) methods seek to match the behavior of an agent w...
Stable and Efficient Policy Evaluation
Policy evaluation algorithms are essential to reinforcement learning due...
Leverage the Average: an Analysis of Regularization in RL
Building upon the formalism of regularized Markov decision processes, we...
Image-Based Place Recognition on Bucolic Environment Across Seasons From Semantic Edge Description
Most of the research effort on image-based place recognition is designed...
Momentum in Reinforcement Learning
We adapt the optimization's concept of momentum to reinforcement learnin...
On Connections between Constrained Optimization and Reinforcement Learning
Dynamic Programming (DP) provides standard algorithms to solve Markov De...
Learning Sensor Placement from Demonstration for UAV networks
This work demonstrates how to leverage previous network expert demonstra...
Credit Assignment as a Proxy for Transfer in Reinforcement Learning
The ability to transfer representations to novel environments and tasks ...
ELF: Embedded Localisation of Features in pre-trained CNN
This paper introduces a novel feature detector based only on information...
Approximate Fictitious Play for Mean Field Games
The theory of Mean Field Games (MFG) allows characterizing the Nash equi...
Modified Actor-Critics
Robot Learning, from a control point of view, often involves continuous ...
MULEX: Disentangling Exploitation from Exploration in Deep RL
An agent learning through interactions should balance its action selecti...
Foolproof Cooperative Learning
This paper extends the notion of equilibrium in game theory to learning ...
Deep Conservative Policy Iteration
Conservative Policy Iteration (CPI) is a founding algorithm of Approxima...
Targeted Attacks on Deep Reinforcement Learning Agents through Adversarial Observations
This paper deals with adversarial attacks on perceptions of neural netwo...
A Theory of Regularized Markov Decision Processes
Many recent successful (deep) reinforcement learning algorithms make use...
Image-based Natural Language Understanding Using 2D Convolutional Neural Networks
We propose a new approach to natural language understanding in which we ...
Anderson Acceleration for Reinforcement Learning
Anderson acceleration is an old and simple method for accelerating the c...
Deep Representation Learning for Domain Adaptation of Semantic Image Segmentation
Deep Convolutional Neural Networks have pushed the state-of-the-art for ...
Human Activity Recognition using Recurrent Neural Networks
Human activity recognition using smart home sensors is one of the bases ...
A Deep Learning Approach for Privacy Preservation in Assisted Living
In the era of Internet of Things (IoT) technologies the potential for pr...
Is the Bellman residual a bad proxy?
This paper aims at theoretically and empirically comparing two standard ...
Difference of Convex Functions Programming Applied to Control with Expert Data
This paper reports applications of Difference of Convex functions (DC) p...
Policy Search: Any Local Optimum Enjoys a Global Performance Guarantee
Local Policy Search is a popular reinforcement learning approach for han...
Off-policy Learning with Eligibility Traces: A Survey
In the framework of Markov Decision Processes, off-policy learning, that...
A Dantzig Selector Approach to Temporal Difference Learning
LSTD is a popular algorithm for value function approximation. Whenever t...
Approximate Modified Policy Iteration
Modified policy iteration (MPI) is a dynamic programming (DP) algorithm ...
