
Randomized Exploration for Reinforcement Learning with General Value Function Approximation
We propose a model-free reinforcement learning algorithm inspired by the...

A Deep Reinforcement Learning Approach to Marginalized Importance Sampling with the Successor Representation
Marginalized importance sampling (MIS), which measures the density ratio...

Preferential Temporal Difference Learning
Temporal-Difference (TD) learning is a general and very useful tool for ...

Flow Network based Generative Models for Non-Iterative Diverse Candidate Generation
This paper is about the problem of learning a stochastic policy for gene...

Correcting Momentum in Temporal Difference Learning
A common optimization tool used in deep reinforcement learning is moment...

A Consciousness-Inspired Planning Agent for Model-Based Reinforcement Learning
We present an end-to-end, model-based deep reinforcement learning agent ...

Improving Long-Term Metrics in Recommendation Systems using Short-Horizon Offline RL
We study session-based recommendation scenarios where we want to recomme...

AndroidEnv: A Reinforcement Learning Platform for Android
We introduce AndroidEnv, an open-source platform for Reinforcement Learn...

What is Going on Inside Recurrent Meta Reinforcement Learning Agents?
Recurrent meta reinforcement learning (meta-RL) agents are agents that e...

Training a First-Order Theorem Prover from Synthetic Data
A major challenge in applying machine learning to automated theorem prov...

Locally Persistent Exploration in Continuous Control Tasks with Sparse Rewards
A major challenge in reinforcement learning is the design of exploration...

Towards Continual Reinforcement Learning: A Review and Perspectives
In this article, we aim to provide a literature review of different form...

Gradient Starvation: A Learning Proclivity in Neural Networks
We identify and formalize a fundamental gradient descent phenomenon resu...

Diversity-Enriched Option-Critic
Temporal abstraction allows reinforcement learning agents to represent k...

A Study of Policy Gradient on a Class of Exactly Solvable Models
Policy gradient methods are extensively used in reinforcement learning a...

Forethought and Hindsight in Credit Assignment
We address the problem of credit assignment in reinforcement learning an...

Connecting Weighted Automata, Tensor Networks and Recurrent Neural Networks through Spectral Learning
In this paper, we present connections between three models used in diffe...

A Fully Tensorized Recurrent Neural Network
Recurrent neural networks (RNNs) are powerful tools for sequential model...

Reward Propagation Using Graph Convolutional Networks
Potential-based reward shaping provides an approach for designing good r...

Complete the Missing Half: Augmenting Aggregation Filtering with Diversification for Graph Convolutional Networks
The core operation of Graph Neural Networks (GNNs) is the aggregation en...

Training Matters: Unlocking Potentials of Deeper Graph Convolutional Neural Networks
The performance limit of Graph Convolutional Networks (GCNs) and the fac...

An Equivalence between Loss Functions and Non-Uniform Sampling in Experience Replay
Prioritized Experience Replay (PER) is a deep reinforcement learning tec...

What can I do here? A Theory of Affordances in Reinforcement Learning
Reinforcement learning algorithms usually assume that all actions are al...

Learning to Prove from Synthetic Theorems
A major challenge in applying machine learning to automated theorem prov...

A Brief Look at Generalization in Visual Meta-Reinforcement Learning
Due to the realization that deep reinforcement learning algorithms train...

Learning to cooperate: Emergent communication in multi-agent navigation
Emergent communication in artificial agents has been studied to understa...

A Distributional Analysis of Sampling-Based Reinforcement Learning Algorithms
We present a distributional approach to theoretical analyses of reinforc...

Interference and Generalization in Temporal Difference Learning
We study the link between generalization and interference in temporal-di...

Invariant Causal Prediction for Block MDPs
Generalization across environments is critical to the successful applica...

Policy Evaluation Networks
Many reinforcement learning algorithms use value functions to guide the ...

oIRL: Robust Adversarial Inverse Reinforcement Learning with Temporally Extended Actions
Explicit engineering of reward functions for given environments has been...

Value-driven Hindsight Modelling
Value estimation is a critical component of the reinforcement learning (...

Provably efficient reconstruction of policy networks
Recent research has shown that learning policies parametrized by large ...

Options of Interest: Temporal Abstraction with Interest Functions
Temporal abstraction refers to the ability of an agent to use behaviours...

Shaping representations through communication: community size effect in artificial learning systems
Motivated by theories of language and communication that explain why com...

Marginalized State Distribution Entropy Regularization in Policy Optimization
Entropy regularization is used to get improved optimization performance ...

Doubly Robust Off-Policy Actor-Critic Algorithms for Reinforcement Learning
We study the problem of off-policy critic evaluation in several variants...

Entropy Regularization with Discounted Future State Distribution in Policy Gradient Methods
The policy gradient theorem is defined based on an objective with respec...

Hindsight Credit Assignment
We consider the problem of efficient credit assignment in reinforcement ...

Option-critic in cooperative multi-agent systems
In this paper, we investigate learning temporal abstractions in cooperat...

Algorithmic Improvements for Deep Reinforcement Learning applied to Interactive Fiction
Text-based games are a natural challenge domain for deep reinforcement l...

Efficient Planning under Partial Observability with Unnormalized Q Functions and Spectral Learning
Learning and planning in partially-observable domains is one of the most...

Navigation Agents for the Visually Impaired: A Sidewalk Simulator and Experiments
Millions of blind and visually-impaired (BVI) people navigate urban envi...

Actor Critic with Differentially Private Critic
Reinforcement learning algorithms are known to be sample inefficient, an...

Augmenting learning using symmetry in a biologically-inspired domain
Invariances to translation, rotation and other spatial transformations a...

Avoidance Learning Using Observational Reinforcement Learning
Imitation learning seeks to learn an expert policy from sampled demonstr...

Revisit Policy Optimization in Matrix Form
In tabular case, when the reward and environment dynamics are known, pol...

An Empirical Study of Batch Normalization and Group Normalization in Conditional Computation
Batch normalization has been widely used to improve optimization in deep...

Self-supervised Learning of Distance Functions for Goal-Conditioned Reinforcement Learning
Goal-conditioned policies are used in order to break down complex reinfo...

Neural Transfer Learning for Cry-based Diagnosis of Perinatal Asphyxia
Despite continuing medical advances, the rate of newborn morbidity and m...
Doina Precup
Associate Professor, School of Computer Science, McGill University