
Bootstrapped Meta-Learning
Meta-learning empowers artificial intelligence to increase its efficienc...
Emphatic Algorithms for Deep Reinforcement Learning
Off-policy learning allows us to learn about possible policies of behavi...
Reward is enough for convex MDPs
Maximising a cumulative reward function that is Markov and stationary, i...
Online Apprenticeship Learning
In Apprenticeship Learning (AL), we are given a Markov Decision Process ...
Discovery of Options via Meta-Learned Subgoals
Temporal abstractions in the form of options have been shown to help rei...
Online Limited Memory Neural-Linear Bandits with Likelihood Matching
We study neural-linear bandits for solving problems where both explorati...
Balancing Constraints and Rewards with Meta-Gradient D4PG
Deploying Reinforcement Learning (RL) agents to solve real-world applica...
Learning to Ask Medical Questions using Reinforcement Learning
We propose a novel reinforcement learning-based approach for adaptive an...
Self-Tuning Deep Reinforcement Learning
Reinforcement learning (RL) algorithms often require expensive manual or...
Deep learning reconstruction of ultrashort pulses from 2D spatial intensity patterns recorded by an all-in-line system in a single-shot
We propose a simple all-in-line single-shot scheme for diagnostics of ul...
Apprenticeship Learning via Frank-Wolfe
We consider the applications of the Frank-Wolfe (FW) algorithm for Appre...
Inverse Reinforcement Learning in Contextual MDPs
We consider the Inverse Reinforcement Learning (IRL) problem in Contextu...
Average reward reinforcement learning with unknown mixing times
We derive and analyze learning algorithms for policy evaluation, apprent...
Action Assembly: Sparse Imitation Learning for Text Based Games with Combinatorial Action Spaces
We propose a computationally efficient algorithm that combines compresse...
Planning in Hierarchical Reinforcement Learning: Guarantees for Using Local Policies
We consider a setting of hierarchical reinforcement learning, in which ...
Deep Neural Linear Bandits: Overcoming Catastrophic Forgetting through Likelihood Matching
We study the neural-linear bandit model for solving sequential decision...
Learn What Not to Learn: Action Elimination with Deep Reinforcement Learning
Learning how to act when there are many available actions in each state ...
Deep Learning Reconstruction of Ultra-Short Pulses
Ultrashort laser pulses with femtosecond to attosecond pulse duration a...
Hierarchical Reinforcement Learning: Approximating Optimal Discounted TSP Using Local Policies
In this work, we provide theoretical guarantees for reward decomposition...
Train on Validation: Squeezing the Data Lemon
Model selection on validation data is an essential step in machine learn...
Shallow Updates for Deep Reinforcement Learning
Deep reinforcement learning (DRL) methods such as the Deep Q-Network (DQ...
Is a picture worth a thousand words? A Deep Multi-Modal Fusion Architecture for Product Classification in e-commerce
Classifying products into categories precisely and efficiently is a majo...
Visualizing Dynamics: from t-SNE to SEMI-MDPs
Deep Reinforcement Learning (DRL) is a trending field of research, showi...
Deep Reinforcement Learning Discovers Internal Models
Deep Reinforcement Learning (DRL) is a trending field of research, showi...
Graying the black box: Understanding DQNs
In recent years there has been a growing interest in using deep representation...
Ensemble Robustness and Generalization of Stochastic Deep Learning Algorithms
The question why deep learning algorithms generalize so well has attract...
Tom Zahavy
