Alessandro Lazaric

research

∙ 02/07/2023

Layered State Discovery for Incremental Autonomous Exploration

We study the autonomous exploration (AX) problem proposed by Lim Aue...

0 Liyu Chen, et al. ∙

research

∙ 01/05/2023

Learning Goal-Conditioned Policies Offline with Self-Supervised Reward Shaping

Developing agents that can execute multiple skills by learning from pre-...

0 Lina Mezghani, et al. ∙

research

∙ 12/19/2022

On the Complexity of Representation Learning in Contextual Linear Bandits

In contextual linear bandits, the reward function is assumed to be a lin...

0 Andrea Tirinzoni, et al. ∙

research

∙ 11/04/2022

Improved Adaptive Algorithm for Scalable Active Learning with Weak Labeler

Active learning with strong and weak labelers considers a practical sett...

0 Yifang Chen, et al. ∙

research

∙ 10/24/2022

Scalable Representation Learning in Linear Contextual Bandits with Constant Regret Guarantees

We study the problem of representation learning in stochastic contextual...

0 Andrea Tirinzoni, et al. ∙

research

∙ 10/18/2022

Contextual bandits with concave rewards, and an application to fair ranking

We consider Contextual Bandits with Concave Rewards (CBCR), a multi-obje...

0 Virginie Do, et al. ∙

research

∙ 10/10/2022

Reaching Goals is Hard: Settling the Sample Complexity of the Stochastic Shortest Path

We study the sample complexity of learning an ϵ-optimal policy in the St...

0 Liyu Chen, et al. ∙

research

∙ 10/04/2022

Linear Convergence of Natural Policy Gradient Methods with Log-Linear Policies

We consider infinite-horizon discounted Markov decision processes and st...

0 Rui Yuan, et al. ∙

research

∙ 03/21/2022

Temporal Abstractions-Augmented Temporally Contrastive Learning: An Alternative to the Laplacian in RL

In reinforcement learning, the graph Laplacian has proved to be a valuab...

0 Akram Erraqabi, et al. ∙

research

∙ 01/31/2022

Don't Change the Algorithm, Change the Data: Exploratory Data for Offline Reinforcement Learning

Recent progress in deep learning has relied on access to large and diver...

10 Denis Yarats, et al. ∙

research

∙ 01/30/2022

Scaling Gaussian Process Optimization by Evaluating a Few Unique Candidates Multiple Times

Computing a Gaussian process (GP) posterior has a computational cost cub...

0 Daniele Calandriello, et al. ∙

research

∙ 12/13/2021

Top K Ranking for Multi-Armed Bandit with Noisy Evaluations

We consider a multi-armed bandit setting where, at the beginning of each...

0 Evrard Garcelon, et al. ∙

research

∙ 12/02/2021

Differentially Private Exploration in Reinforcement Learning with Linear Representation

This paper studies privacy-preserving exploration in Markov Decision Pro...

0 Paul Luyo, et al. ∙

research

∙ 11/23/2021

Adaptive Multi-Goal Exploration

We introduce a generic strategy for provably efficient multi-goal explor...

0 Jean Tarbouriech, et al. ∙

research

∙ 10/27/2021

Reinforcement Learning in Linear MDPs: Constant Regret and Representation Selection

We study the role of the representation of state-action value functions ...

12 Matteo Papini, et al. ∙

research

∙ 10/27/2021

Direct then Diffuse: Incremental Unsupervised Skill Discovery for State Covering and Goal Reaching

Learning meaningful behaviors in the absence of reward is a difficult pr...

0 Pierre-Alexandre Kamienny, et al. ∙

research

∙ 07/23/2021

A general sample complexity analysis of vanilla policy gradient

The policy gradient (PG) is one of the most popular methods for solving ...

0 Rui Yuan, et al. ∙

research

∙ 07/20/2021

Mastering Visual Continuous Control: Improved Data-Augmented Reinforcement Learning

We present DrQ-v2, a model-free reinforcement learning (RL) algorithm fo...

15 Denis Yarats, et al. ∙

research

∙ 06/24/2021

A Fully Problem-Dependent Regret Lower Bound for Finite-Horizon MDPs

We derive a novel asymptotic problem-dependent lower-bound for regret mi...

0 Andrea Tirinzoni, et al. ∙

research

∙ 06/22/2021

A Unified Framework for Conservative Exploration

We study bandits and reinforcement learning (RL) subject to a conservati...

0 Yunchang Yang, et al. ∙

research

∙ 04/22/2021

Stochastic Shortest Path: Minimax, Parameter-Free and Towards Horizon-Free Regret

We study the problem of learning in the stochastic shortest path (SSP) s...

0 Jean Tarbouriech, et al. ∙

research

∙ 04/08/2021

Leveraging Good Representations in Linear Contextual Bandits

The linear contextual bandit literature is mostly focused on the design ...

5 Matteo Papini, et al. ∙

research

∙ 02/22/2021

Reinforcement Learning with Prototypical Representations

Learning effective representations in image-based environments is crucia...

21 Denis Yarats, et al. ∙

research

∙ 12/29/2020

Improved Sample Complexity for Incremental Autonomous Exploration in MDPs

We investigate the exploration of an unknown environment when no reward ...

0 Jean Tarbouriech, et al. ∙

research

∙ 10/23/2020

An Asymptotically Optimal Primal-Dual Incremental Algorithm for Contextual Linear Bandits

In the contextual linear bandit setting, algorithms built on the optimis...

0 Andrea Tirinzoni, et al. ∙

research

∙ 08/18/2020

Provably Efficient Reward-Agnostic Navigation with Linear Value Iteration

There has been growing progress on theoretical analyses for provably eff...

2 Andrea Zanette, et al. ∙

research

∙ 07/13/2020

Efficient Optimistic Exploration in Linear-Quadratic Regulators via Lagrangian Relaxation

We study the exploration-exploitation dilemma in the linear quadratic re...

0 Marc Abeille, et al. ∙

research

∙ 07/13/2020

A Provably Efficient Sample Collection Strategy for Reinforcement Learning

A common assumption in reinforcement learning (RL) is to have access to ...

10 Jean Tarbouriech, et al. ∙

research

∙ 07/10/2020

Improved Analysis of UCRL2 with Empirical Bernstein Inequality

We consider the problem of exploration-exploitation in communicating Mar...

0 Ronan Fruit, et al. ∙

research

∙ 06/22/2020

Sketched Newton-Raphson

We propose a new globally convergent stochastic second order method. Our...

0 Rui Yuan, et al. ∙

research

∙ 05/23/2020

A Novel Confidence-Based Algorithm for Structured Bandits

We study finite-armed stochastic bandits where the rewards of each arm m...

0 Andrea Tirinzoni, et al. ∙

research

∙ 05/18/2020

Meta-learning with Stochastic Linear Bandits

We investigate meta-learning procedures in the setting of stochastic lin...

0 Leonardo Cella, et al. ∙

research

∙ 05/06/2020

Learning Adaptive Exploration Strategies in Dynamic Environments Through Informed Policy Regularization

We study the problem of learning exploration-exploitation strategies tha...

5 Pierre-Alexandre Kamienny, et al. ∙

research

∙ 03/06/2020

Active Model Estimation in Markov Decision Processes

We study the problem of efficient exploration in order to learn an accur...

25 Jean Tarbouriech, et al. ∙

research

∙ 02/29/2020

Learning Near Optimal Policies with Low Inherent Bellman Error

We study the exploration problem with approximate linear action-value fu...

15 Andrea Zanette, et al. ∙

research

∙ 02/23/2020

Near-linear Time Gaussian Process Optimization with Adaptive Batching and Resparsification

Gaussian processes (GP) are one of the most successful frameworks to mod...

2 Daniele Calandriello, et al. ∙

research

∙ 02/10/2020

Adversarial Attacks on Linear Contextual Bandits

Contextual bandit algorithms are applied in a wide range of domains, fro...

0 Evrard Garcelon, et al. ∙

research

∙ 02/08/2020

Improved Algorithms for Conservative Exploration in Bandits

In many fields such as digital marketing, healthcare, finance, and robot...

0 Evrard Garcelon, et al. ∙

research

∙ 02/08/2020

Conservative Exploration in Reinforcement Learning

While learning in an unknown Markov Decision Process (MDP), an agent sho...

0 Evrard Garcelon, et al. ∙

research

∙ 01/30/2020

Concentration Inequalities for Multinoulli Random Variables

We investigate concentration inequalities for Dirichlet and Multinomial ...

0 Jian Qian, et al. ∙

research

∙ 12/07/2019

No-Regret Exploration in Goal-Oriented Reinforcement Learning

Many popular reinforcement learning problems (e.g., navigation in a maze...

0 Jean Tarbouriech, et al. ∙

research

∙ 11/01/2019

Frequentist Regret Bounds for Randomized Least-Squares Value Iteration

We consider the exploration-exploitation dilemma in finite-horizon reinf...

0 Andrea Zanette, et al. ∙

research

∙ 10/19/2019

A Structured Prediction Approach for Generalization in Cooperative Multi-Agent Reinforcement Learning

Effective coordination is crucial to solve multi-agent collaborative (MA...

14 Nicolas Carion, et al. ∙

research

∙ 05/29/2019

Word-order biases in deep-agent emergent communication

Sequence-processing neural networks led to remarkable progress on many N...

0 Rahma Chaabouni, et al. ∙

research

∙ 03/13/2019

Gaussian Process Optimization with Adaptive Sketching: Scalable and No Regret

Gaussian processes (GP) are a popular Bayesian approach for the optimiza...

4 Daniele Calandriello, et al. ∙

research

∙ 02/28/2019

Active Exploration in Markov Decision Processes

We introduce the active exploration problem in Markov decision processes...

0 Jean Tarbouriech, et al. ∙

research

∙ 12/11/2018

Exploration Bonus for Regret Minimization in Undiscounted Discrete and Continuous Markov Decision Processes

We introduce and analyse two algorithms for exploration-exploitation in ...

0 Jian Qian, et al. ∙

research

∙ 11/27/2018

Rotting bandits are no harder than stochastic ones

In bandits, arms' distributions are stationary. This is often violated i...

0 Julien Seznec, et al. ∙

research

∙ 07/06/2018

Near Optimal Exploration-Exploitation in Non-Communicating Markov Decision Processes

While designing the state space of an MDP, it is common to include state...

0 Ronan Fruit, et al. ∙

research

∙ 03/27/2018

Distributed Adaptive Sampling for Kernel Matrix Approximation

Most kernel-based methods, such as kernel or Gaussian process regression...

0 Daniele Calandriello, et al. ∙
