
Learning Zero-sum Stochastic Games with Posterior Sampling
In this paper, we propose Posterior Sampling Reinforcement Learning for ...
Online Learning for Cooperative Multi-Player Multi-Armed Bandits
We introduce a framework for decentralized online learning for multi-arm...
Implicit Finite-Horizon Approximation and Efficient Optimal Algorithms for Stochastic Shortest Path
We introduce a generic template for developing regret minimization algor...
Online Learning for Stochastic Shortest Path Model via Posterior Sampling
We consider the problem of online reinforcement learning for the Stochas...
Online Learning for Unknown Partially Observable MDPs
Solving Partially Observable Markov Decision Processes (POMDPs) is hard....
Learning Infinite-horizon Average-reward MDPs with Linear Function Approximation
We develop several new algorithms for learning Markov Decision Processes...
A Model-free Learning Algorithm for Infinite-horizon Average-reward MDPs with Near-optimal Regret
Recently, model-free reinforcement learning has attracted research atten...
Model-free Reinforcement Learning in Infinite-horizon Average-reward Markov Decision Processes
Model-free reinforcement learning is known to be memory and computation ...
PPD: Permutation Phase Defense Against Adversarial Examples in Deep Learning
Deep neural networks have demonstrated cutting-edge performance on vario...
