
What can I do here? A Theory of Affordances in Reinforcement Learning
Reinforcement learning algorithms usually assume that all actions are al...
Learning to Prove from Synthetic Theorems
A major challenge in applying machine learning to automated theorem prov...
A Brief Look at Generalization in Visual Meta-Reinforcement Learning
Due to the realization that deep reinforcement learning algorithms train...
Learning to cooperate: Emergent communication in multi-agent navigation
Emergent communication in artificial agents has been studied to understa...
A Distributional Analysis of Sampling-Based Reinforcement Learning Algorithms
We present a distributional approach to theoretical analyses of reinforc...
Interference and Generalization in Temporal Difference Learning
We study the link between generalization and interference in temporal-di...
Invariant Causal Prediction for Block MDPs
Generalization across environments is critical to the successful applica...
Policy Evaluation Networks
Many reinforcement learning algorithms use value functions to guide the ...
oIRL: Robust Adversarial Inverse Reinforcement Learning with Temporally Extended Actions
Explicit engineering of reward functions for given environments has been...
Value-driven Hindsight Modelling
Value estimation is a critical component of the reinforcement learning (...
Provably efficient reconstruction of policy networks
Recent research has shown that learning policies parametrized by large ...
Options of Interest: Temporal Abstraction with Interest Functions
Temporal abstraction refers to the ability of an agent to use behaviours...
Shaping representations through communication: community size effect in artificial learning systems
Motivated by theories of language and communication that explain why com...
Marginalized State Distribution Entropy Regularization in Policy Optimization
Entropy regularization is used to get improved optimization performance ...
Doubly Robust Off-Policy Actor-Critic Algorithms for Reinforcement Learning
We study the problem of off-policy critic evaluation in several variants...
Entropy Regularization with Discounted Future State Distribution in Policy Gradient Methods
The policy gradient theorem is defined based on an objective with respec...
Hindsight Credit Assignment
We consider the problem of efficient credit assignment in reinforcement ...
Option-critic in cooperative multi-agent systems
In this paper, we investigate learning temporal abstractions in cooperat...
Algorithmic Improvements for Deep Reinforcement Learning applied to Interactive Fiction
Text-based games are a natural challenge domain for deep reinforcement l...
Efficient Planning under Partial Observability with Unnormalized Q Functions and Spectral Learning
Learning and planning in partially-observable domains is one of the most...
Navigation Agents for the Visually Impaired: A Sidewalk Simulator and Experiments
Millions of blind and visually-impaired (BVI) people navigate urban envi...
Actor Critic with Differentially Private Critic
Reinforcement learning algorithms are known to be sample inefficient, an...
Augmenting learning using symmetry in a biologically-inspired domain
Invariances to translation, rotation and other spatial transformations a...
Avoidance Learning Using Observational Reinforcement Learning
Imitation learning seeks to learn an expert policy from sampled demonstr...
Revisit Policy Optimization in Matrix Form
In the tabular case, when the reward and environment dynamics are known, pol...
An Empirical Study of Batch Normalization and Group Normalization in Conditional Computation
Batch normalization has been widely used to improve optimization in deep...
Self-supervised Learning of Distance Functions for Goal-Conditioned Reinforcement Learning
Goal-conditioned policies are used in order to break down complex reinfo...
Neural Transfer Learning for Cry-based Diagnosis of Perinatal Asphyxia
Despite continuing medical advances, the rate of newborn morbidity and m...
SVRG for Policy Evaluation with Fewer Gradient Evaluations
Stochastic variance-reduced gradient (SVRG) is an optimization method or...
Break the Ceiling: Stronger Multi-scale Deep Graph Convolutional Networks
Recently, neural network based approaches have achieved significant impr...
Recurrent Value Functions
Despite recent successes in Reinforcement Learning, value-based methods ...
Uncertainty Aware Learning from Demonstrations in Multiple Contexts using Bayesian Neural Networks
Diversity of environments is a key challenge that causes learned robotic...
Learning Modular Safe Policies in the Bandit Setting with Application to Adaptive Clinical Trials
The stochastic multi-armed bandit problem is a well-known model for stud...
The Termination Critic
In this work, we consider the problem of autonomously discovering behavi...
Clustering-Oriented Representation Learning with Attractive-Repulsive Loss
The standard loss function used to train neural network classifiers, cat...
Off-Policy Deep Reinforcement Learning without Exploration
Reinforcement learning traditionally considers the task of balancing exp...
Environments for Lifelong Reinforcement Learning
To achieve general artificial intelligence, reinforcement learning (RL) ...
The Barbados 2018 List of Open Issues in Continual Learning
We want to make progress toward artificial general intelligence, namely ...
Temporal Regularization in Markov Decision Process
Several applications of Reinforcement Learning suffer from instability d...
Combined Reinforcement Learning via Abstract Representations
In the quest for efficient and robust reinforcement learning methods, bo...
Undersampling and Bagging of Decision Trees in the Analysis of Cardiorespiratory Behavior for the Prediction of Extubation Readiness in Extremely Preterm Infants
Extremely preterm infants often require endotracheal intubation and mech...
Predicting Extubation Readiness in Extreme Preterm Infants based on Patterns of Breathing
Extremely preterm infants commonly require intubation and invasive mecha...
A Semi-Markov Chain Approach to Modeling Respiratory Patterns Prior to Extubation in Preterm Infants
After birth, extremely preterm infants often require specialized respira...
Exploring Uncertainty Measures in Deep Networks for Multiple Sclerosis Lesion Detection and Segmentation
Deep learning (DL) networks have recently been shown to outperform other...
Attend Before you Act: Leveraging human visual attention for continual learning
When humans perform a task, such as playing a game, they selectively pay...
Safe Option-Critic: Learning Safety in the Option-Critic Architecture
Designing hierarchical reinforcement learning algorithms that induce a n...
Connecting Weighted Automata and Recurrent Neural Networks through Spectral Learning
In this paper, we unravel a fundamental connection between weighted fini...
Resolving Event Coreference with Supervised Representation Learning and Clustering-Oriented Regularization
We present an approach to event coreference resolution by developing a g...
Dyna Planning using a Feature Based Generative Model
Dyna-style reinforcement learning is a powerful approach for problems wh...
Learning Safe Policies with Expert Guidance
We propose a framework for ensuring safe behavior of a reinforcement lea...
