
A Fully Problem-Dependent Regret Lower Bound for Finite-Horizon MDPs
We derive a novel asymptotic problem-dependent lower-bound for regret mi...

A Unified Framework for Conservative Exploration
We study bandits and reinforcement learning (RL) subject to a conservati...

Stochastic Shortest Path: Minimax, Parameter-Free and Towards Horizon-Free Regret
We study the problem of learning in the stochastic shortest path (SSP) s...

Leveraging Good Representations in Linear Contextual Bandits
The linear contextual bandit literature is mostly focused on the design ...

Homomorphically Encrypted Linear Contextual Bandit
Contextual bandit is a general framework for online learning in sequenti...

Improved Sample Complexity for Incremental Autonomous Exploration in MDPs
We investigate the exploration of an unknown environment when no reward ...

An Asymptotically Optimal Primal-Dual Incremental Algorithm for Contextual Linear Bandits
In the contextual linear bandit setting, algorithms built on the optimis...

Local Differentially Private Regret Minimization in Reinforcement Learning
Reinforcement learning algorithms are widely used in domains where it is...

A Provably Efficient Sample Collection Strategy for Reinforcement Learning
A common assumption in reinforcement learning (RL) is to have access to ...

Improved Analysis of UCRL2 with Empirical Bernstein Inequality
We consider the problem of exploration-exploitation in communicating Mar...

A Kernel-Based Approach to Non-Stationary Reinforcement Learning in Metric Spaces
In this work, we propose KeRNS: an algorithm for episodic reinforcement ...

Learning Adaptive Exploration Strategies in Dynamic Environments Through Informed Policy Regularization
We study the problem of learning exploration-exploitation strategies tha...

Regret Bounds for Kernel-Based Reinforcement Learning
We consider the exploration-exploitation dilemma in finite-horizon reinf...

Active Model Estimation in Markov Decision Processes
We study the problem of efficient exploration in order to learn an accur...

Exploration-Exploitation in Constrained MDPs
In many sequential decision-making problems, the goal is to optimize a u...

Adversarial Attacks on Linear Contextual Bandits
Contextual bandit algorithms are applied in a wide range of domains, fro...

Improved Algorithms for Conservative Exploration in Bandits
In many fields such as digital marketing, healthcare, finance, and robot...

Conservative Exploration in Reinforcement Learning
While learning in an unknown Markov Decision Process (MDP), an agent sho...

Concentration Inequalities for Multinoulli Random Variables
We investigate concentration inequalities for Dirichlet and Multinomial ...

Exploiting Language Instructions for Interpretable and Compositional Reinforcement Learning
In this work, we present an alternative approach to making an agent comp...

No-Regret Exploration in Goal-Oriented Reinforcement Learning
Many popular reinforcement learning problems (e.g., navigation in a maze...

Frequentist Regret Bounds for Randomized Least-Squares Value Iteration
We consider the exploration-exploitation dilemma in finite-horizon reinf...

Smoothing Policies and Safe Policy Gradients
Policy gradient algorithms are among the best candidates for the much an...

Exploration Bonus for Regret Minimization in Undiscounted Discrete and Continuous Markov Decision Processes
We introduce and analyse two algorithms for exploration-exploitation in ...

Near Optimal Exploration-Exploitation in Non-Communicating Markov Decision Processes
While designing the state space of an MDP, it is common to include state...

Stochastic Variance-Reduced Policy Gradient
In this paper, we propose a novel reinforcement learning algorithm cons...

Importance Weighted Transfer of Samples in Reinforcement Learning
We consider the transfer of experience samples (i.e., tuples < s, a, s',...

Efficient Bias-Span-Constrained Exploration-Exploitation in Reinforcement Learning
We introduce SCAL, an algorithm designed to perform efficient exploratio...

Cost-Sensitive Approach to Batch Size Adaptation for Gradient Descent
In this paper, we propose a novel approach to automatically determine th...

Multi-objective Reinforcement Learning with Continuous Pareto Frontier Approximation Supplementary Material
This document contains supplementary material for the paper "Multi-objec...
Matteo Pirotta