
Generalization in Mean Field Games by Learning Master Policies
Mean Field Games (MFGs) can potentially scale multi-agent systems to ext...

Implicitly Regularized RL with Implicit Q-Values
The Q-function is a central quantity in many Reinforcement Learning (RL)...

A functional mirror ascent view of policy gradient methods with function approximation
We use functional mirror ascent to propose a general framework (referred...

Offline Reinforcement Learning as Anti-Exploration
Offline Reinforcement Learning (RL) aims at learning an optimal control ...

There Is No Turning Back: A Self-Supervised Approach for Reversibility-Aware Reinforcement Learning
We propose to learn to distinguish reversible from irreversible actions ...

Concave Utility Reinforcement Learning: the Mean-field Game viewpoint
Concave Utility Reinforcement Learning (CURL) extends RL from linear to ...

What Matters for Adversarial Imitation Learning?
Adversarial imitation learning has become a popular framework for imitat...

Hyperparameter Selection for Imitation Learning
We address the issue of tuning hyperparameters (HPs) for imitation learn...

Mean Field Games Flock! The Reinforcement Learning Way
We present a method enabling a large number of agents to learn how to fl...

Offline Reinforcement Learning with Pseudometric Learning
Offline Reinforcement Learning methods seek to learn a policy from logge...

Scaling up Mean Field Games with Online Mirror Descent
We address scaling up equilibrium computation in Mean Field Games (MFGs)...

How To Train Your HERON
In this paper we apply Deep Reinforcement Learning (Deep RL) and Domain ...

Adversarially Guided Actor-Critic
Despite definite success in deep reinforcement learning problems, actor...

Self-Imitation Advantage Learning
Self-imitation learning is a Reinforcement Learning (RL) method that enc...

Munchausen Reinforcement Learning
Bootstrapping is a core mechanism in Reinforcement Learning (RL). Most a...

Fictitious Play for Mean Field Games: Continuous Time Analysis and Applications
In this paper, we deepen the analysis of continuous time Fictitious Play...

Show me the Way: Intrinsic Motivation from Demonstrations
The study of exploration in Reinforcement Learning (RL) has a long histo...

What Matters In On-Policy Reinforcement Learning? A Large-Scale Empirical Study
In recent years, on-policy reinforcement learning (RL) has been successf...

Primal Wasserstein Imitation Learning
Imitation Learning (IL) methods seek to match the behavior of an agent w...

Stable and Efficient Policy Evaluation
Policy evaluation algorithms are essential to reinforcement learning due...

Leverage the Average: an Analysis of Regularization in RL
Building upon the formalism of regularized Markov decision processes, we...

Image-Based Place Recognition on Bucolic Environment Across Seasons From Semantic Edge Description
Most of the research effort on image-based place recognition is designed...

Momentum in Reinforcement Learning
We adapt the optimization's concept of momentum to reinforcement learnin...

On Connections between Constrained Optimization and Reinforcement Learning
Dynamic Programming (DP) provides standard algorithms to solve Markov De...

Learning Sensor Placement from Demonstration for UAV networks
This work demonstrates how to leverage previous network expert demonstra...

Credit Assignment as a Proxy for Transfer in Reinforcement Learning
The ability to transfer representations to novel environments and tasks ...

ELF: Embedded Localisation of Features in pretrained CNN
This paper introduces a novel feature detector based only on information...

Approximate Fictitious Play for Mean Field Games
The theory of Mean Field Games (MFG) allows characterizing the Nash equi...

Modified Actor-Critics
Robot Learning, from a control point of view, often involves continuous ...

MULEX: Disentangling Exploitation from Exploration in Deep RL
An agent learning through interactions should balance its action selecti...

Foolproof Cooperative Learning
This paper extends the notion of equilibrium in game theory to learning ...

Deep Conservative Policy Iteration
Conservative Policy Iteration (CPI) is a founding algorithm of Approxima...

Targeted Attacks on Deep Reinforcement Learning Agents through Adversarial Observations
This paper deals with adversarial attacks on perceptions of neural netwo...

A Theory of Regularized Markov Decision Processes
Many recent successful (deep) reinforcement learning algorithms make use...

Image-based Natural Language Understanding Using 2D Convolutional Neural Networks
We propose a new approach to natural language understanding in which we ...

Anderson Acceleration for Reinforcement Learning
Anderson acceleration is an old and simple method for accelerating the c...

Deep Representation Learning for Domain Adaptation of Semantic Image Segmentation
Deep Convolutional Neural Networks have pushed the state-of-the-art for ...

Human Activity Recognition using Recurrent Neural Networks
Human activity recognition using smart home sensors is one of the bases ...

A Deep Learning Approach for Privacy Preservation in Assisted Living
In the era of Internet of Things (IoT) technologies the potential for pr...

Is the Bellman residual a bad proxy?
This paper aims at theoretically and empirically comparing two standard ...

Difference of Convex Functions Programming Applied to Control with Expert Data
This paper reports applications of Difference of Convex functions (DC) p...

Policy Search: Any Local Optimum Enjoys a Global Performance Guarantee
Local Policy Search is a popular reinforcement learning approach for han...

Off-policy Learning with Eligibility Traces: A Survey
In the framework of Markov Decision Processes, off-policy learning, that...

A Dantzig Selector Approach to Temporal Difference Learning
LSTD is a popular algorithm for value function approximation. Whenever t...

Approximate Modified Policy Iteration
Modified policy iteration (MPI) is a dynamic programming (DP) algorithm ...
Matthieu Geist