
Forethought and Hindsight in Credit Assignment
We address the problem of credit assignment in reinforcement learning an...

Connecting Weighted Automata, Tensor Networks and Recurrent Neural Networks through Spectral Learning
In this paper, we present connections between three models used in diffe...

A Fully Tensorized Recurrent Neural Network
Recurrent neural networks (RNNs) are powerful tools for sequential model...

Reward Propagation Using Graph Convolutional Networks
Potential-based reward shaping provides an approach for designing good r...

Complete the Missing Half: Augmenting Aggregation Filtering with Diversification for Graph Convolutional Networks
The core operation of Graph Neural Networks (GNNs) is the aggregation en...

Training Matters: Unlocking Potentials of Deeper Graph Convolutional Neural Networks
The performance limit of Graph Convolutional Networks (GCNs) and the fac...

An Equivalence between Loss Functions and Non-Uniform Sampling in Experience Replay
Prioritized Experience Replay (PER) is a deep reinforcement learning tec...

What can I do here? A Theory of Affordances in Reinforcement Learning
Reinforcement learning algorithms usually assume that all actions are al...

Learning to Prove from Synthetic Theorems
A major challenge in applying machine learning to automated theorem prov...

A Brief Look at Generalization in Visual Meta-Reinforcement Learning
Due to the realization that deep reinforcement learning algorithms train...

Learning to cooperate: Emergent communication in multi-agent navigation
Emergent communication in artificial agents has been studied to understa...

A Distributional Analysis of Sampling-Based Reinforcement Learning Algorithms
We present a distributional approach to theoretical analyses of reinforc...

Interference and Generalization in Temporal Difference Learning
We study the link between generalization and interference in temporal-di...

Invariant Causal Prediction for Block MDPs
Generalization across environments is critical to the successful applica...

Policy Evaluation Networks
Many reinforcement learning algorithms use value functions to guide the ...

oIRL: Robust Adversarial Inverse Reinforcement Learning with Temporally Extended Actions
Explicit engineering of reward functions for given environments has been...

Value-driven Hindsight Modelling
Value estimation is a critical component of the reinforcement learning (...

Provably efficient reconstruction of policy networks
Recent research has shown that learning policies parametrized by large ...

Options of Interest: Temporal Abstraction with Interest Functions
Temporal abstraction refers to the ability of an agent to use behaviours...

Shaping representations through communication: community size effect in artificial learning systems
Motivated by theories of language and communication that explain why com...

Marginalized State Distribution Entropy Regularization in Policy Optimization
Entropy regularization is used to get improved optimization performance ...

Doubly Robust Off-Policy Actor-Critic Algorithms for Reinforcement Learning
We study the problem of off-policy critic evaluation in several variants...

Entropy Regularization with Discounted Future State Distribution in Policy Gradient Methods
The policy gradient theorem is defined based on an objective with respec...

Hindsight Credit Assignment
We consider the problem of efficient credit assignment in reinforcement ...

Option-critic in cooperative multi-agent systems
In this paper, we investigate learning temporal abstractions in cooperat...

Algorithmic Improvements for Deep Reinforcement Learning applied to Interactive Fiction
Text-based games are a natural challenge domain for deep reinforcement l...

Efficient Planning under Partial Observability with Unnormalized Q Functions and Spectral Learning
Learning and planning in partially-observable domains is one of the most...

Navigation Agents for the Visually Impaired: A Sidewalk Simulator and Experiments
Millions of blind and visually-impaired (BVI) people navigate urban envi...

Actor Critic with Differentially Private Critic
Reinforcement learning algorithms are known to be sample inefficient, an...

Augmenting learning using symmetry in a biologically-inspired domain
Invariances to translation, rotation and other spatial transformations a...

Avoidance Learning Using Observational Reinforcement Learning
Imitation learning seeks to learn an expert policy from sampled demonstr...

Revisit Policy Optimization in Matrix Form
In tabular case, when the reward and environment dynamics are known, pol...

An Empirical Study of Batch Normalization and Group Normalization in Conditional Computation
Batch normalization has been widely used to improve optimization in deep...

Self-supervised Learning of Distance Functions for Goal-Conditioned Reinforcement Learning
Goal-conditioned policies are used in order to break down complex reinfo...

Neural Transfer Learning for Cry-based Diagnosis of Perinatal Asphyxia
Despite continuing medical advances, the rate of newborn morbidity and m...

SVRG for Policy Evaluation with Fewer Gradient Evaluations
Stochastic variance-reduced gradient (SVRG) is an optimization method or...

Break the Ceiling: Stronger Multi-scale Deep Graph Convolutional Networks
Recently, neural network based approaches have achieved significant impr...

Recurrent Value Functions
Despite recent successes in Reinforcement Learning, value-based methods ...

Uncertainty Aware Learning from Demonstrations in Multiple Contexts using Bayesian Neural Networks
Diversity of environments is a key challenge that causes learned robotic...

Learning Modular Safe Policies in the Bandit Setting with Application to Adaptive Clinical Trials
The stochastic multi-armed bandit problem is a well-known model for stud...

The Termination Critic
In this work, we consider the problem of autonomously discovering behavi...

Clustering-Oriented Representation Learning with Attractive-Repulsive Loss
The standard loss function used to train neural network classifiers, cat...

Off-Policy Deep Reinforcement Learning without Exploration
Reinforcement learning traditionally considers the task of balancing exp...

Environments for Lifelong Reinforcement Learning
To achieve general artificial intelligence, reinforcement learning (RL) ...

The Barbados 2018 List of Open Issues in Continual Learning
We want to make progress toward artificial general intelligence, namely ...

Temporal Regularization in Markov Decision Process
Several applications of Reinforcement Learning suffer from instability d...

Combined Reinforcement Learning via Abstract Representations
In the quest for efficient and robust reinforcement learning methods, bo...

Undersampling and Bagging of Decision Trees in the Analysis of Cardiorespiratory Behavior for the Prediction of Extubation Readiness in Extremely Preterm Infants
Extremely preterm infants often require endotracheal intubation and mech...

Predicting Extubation Readiness in Extreme Preterm Infants based on Patterns of Breathing
Extremely preterm infants commonly require intubation and invasive mecha...

A Semi-Markov Chain Approach to Modeling Respiratory Patterns Prior to Extubation in Preterm Infants
After birth, extremely preterm infants often require specialized respira...
Doina Precup
Associate Professor, School of Computer Science, McGill University