
A Bit Better? Quantifying Information for Bandit Learning
The information ratio offers an approach to assessing the efficacy with ...

Q-learning with Uniformly Bounded Variance: Large Discounting is Not a Barrier to Fast Learning
It has been a trend in the Reinforcement Learning literature to derive s...

Explicit Mean-Square Error Bounds for Monte-Carlo and Linear Stochastic Approximation
This paper concerns error bounds for recursive equations subject to Mark...

Zap Q-Learning With Nonlinear Function Approximation
The Zap stochastic approximation (SA) algorithm was introduced recently ...

Stochastic Variance Reduced Primal Dual Algorithms for Empirical Composition Optimization
We consider a generic empirical composition optimization problem, where ...

Zap Q-Learning for Optimal Stopping Time Problems
We propose a novel reinforcement learning algorithm that approximates so...

Differential Temporal Difference Learning
Value functions derived from Markov decision processes arise as a centra...

Zap Meets Momentum: Stochastic Approximation Algorithms with Optimal Convergence Rate
There are two well known Stochastic Approximation techniques that are kn...
Adithya M. Devraj