We study the stochastic convergence of the Cesàro mean of a sequence of
...
We consider the model selection task in the stochastic contextual bandit...
We propose the Generalized Policy Elimination (GPE) algorithm, an
oracle...
We study the problem of off-policy evaluation (OPE) in Reinforcement Lea...
Empirical risk minimization over classes of functions that are bounded for ...
Empirical risk minimization over sieves of the class F of cadlag
functio...