
A Bit Better? Quantifying Information for Bandit Learning
The information ratio offers an approach to assessing the efficacy with ...
Q-learning with Uniformly Bounded Variance: Large Discounting is Not a Barrier to Fast Learning
It has been a trend in the Reinforcement Learning literature to derive s...
Explicit Mean-Square Error Bounds for Monte-Carlo and Linear Stochastic Approximation
This paper concerns error bounds for recursive equations subject to Mark...
Zap Q-Learning With Nonlinear Function Approximation
The Zap stochastic approximation (SA) algorithm was introduced recently ...
Stochastic Variance Reduced Primal Dual Algorithms for Empirical Composition Optimization
We consider a generic empirical composition optimization problem, where ...
Zap Q-Learning for Optimal Stopping Time Problems
We propose a novel reinforcement learning algorithm that approximates so...
Differential Temporal Difference Learning
Value functions derived from Markov decision processes arise as a centra...
Zap Meets Momentum: Stochastic Approximation Algorithms with Optimal Convergence Rate
There are two well known Stochastic Approximation techniques that are kn...
Adithya M. Devraj
