Michal Valko

research

∙ 09/01/2023

Local and adaptive mirror descents in extensive-form games

We study how to learn ϵ-optimal strategies in zero-sum imperfect informa...

0 Côme Fiegel, et al. ∙

research

∙ 08/17/2023

Half-Hop: A graph upsampling approach for slowing down message passing

Message passing neural networks have shown a lot of success on graph-str...

0 Mehdi Azabou, et al. ∙

research

∙ 05/29/2023

VA-learning as a more efficient alternative to Q-learning

In reinforcement learning, the advantage function is critical for policy...

0 Yunhao Tang, et al. ∙

research

∙ 05/29/2023

DoMo-AC: Doubly Multi-step Off-policy Actor-Critic Algorithm

Multi-step learning applies lookahead over multiple time steps and has p...

0 Yunhao Tang, et al. ∙

research

∙ 05/22/2023

Regularization and Variance-Weighted Regression Achieves Minimax Optimality in Linear MDPs: Theory and Practice

Mirror descent value iteration (MDVI), an abstraction of Kullback-Leible...

0 Toshinori Kitamura, et al. ∙

research

∙ 05/02/2023

Unlocking the Power of Representations in Long-term Novelty-based Exploration

We introduce Robust Exploration via Clustering-based Online Density Esti...

0 Alaa Saade, et al. ∙

research

∙ 04/06/2023

Sharp Deviations Bounds for Dirichlet Weighted Sums with Application to analysis of Bayesian algorithms

In this work, we derive sharp non-asymptotic deviation bounds for weight...

0 Denis Belomestny, et al. ∙

research

∙ 03/14/2023

Fast Rates for Maximum Entropy Exploration

We consider the reinforcement learning (RL) setting, in which the agent ...

0 Daniil Tiapkin, et al. ∙

research

∙ 12/23/2022

Adapting to game trees in zero-sum imperfect information games

Imperfect information games (IIG) are games in which each player only pa...

0 Côme Fiegel, et al. ∙

research

∙ 12/06/2022

Understanding Self-Predictive Learning for Reinforcement Learning

We study the learning dynamics of self-predictive learning for reinforce...

0 Yunhao Tang, et al. ∙

research

∙ 11/18/2022

Curiosity in hindsight

Consider the exploration in sparse-reward or reward-free environments, s...

0 Daniel Jarrett, et al. ∙

research

∙ 09/28/2022

Optimistic Posterior Sampling for Reinforcement Learning with Few Samples and Tight Guarantees

We consider reinforcement learning in an environment modeled by an episo...

0 Daniil Tiapkin, et al. ∙

research

∙ 06/16/2022

BYOL-Explore: Exploration by Bootstrapped Prediction

We present BYOL-Explore, a conceptually simple yet general approach for ...

0 Zhaohan Daniel Guo, et al. ∙

research

∙ 05/27/2022

KL-Entropy-Regularized RL with a Generative Model is Minimax Optimal

In this work, we consider and analyze the sample complexity of model-fre...

6 Tadashi Kozuno, et al. ∙

research

∙ 05/16/2022

From Dirichlet to Rubin: Optimistic Exploration in RL without Bonuses

We propose the Bayes-UCBVI algorithm for reinforcement learning in tabul...

0 Daniil Tiapkin, et al. ∙

research

∙ 03/30/2022

Marginalized Operators for Off-policy Reinforcement Learning

In this work, we propose marginalized operators, a new class of off-poli...

0 Yunhao Tang, et al. ∙

research

∙ 02/17/2022

Retrieval-Augmented Reinforcement Learning

Most deep reinforcement learning (RL) algorithms distill experience into...

0 Anirudh Goyal, et al. ∙

research

∙ 01/30/2022

Scaling Gaussian Process Optimization by Evaluating a Few Unique Candidates Multiple Times

Computing a Gaussian process (GP) posterior has a computational cost cub...

0 Daniele Calandriello, et al. ∙

research

∙ 11/23/2021

Adaptive Multi-Goal Exploration

We introduce a generic strategy for provably efficient multi-goal explor...

0 Jean Tarbouriech, et al. ∙

research

∙ 11/03/2021

Drop, Swap, and Generate: A Self-Supervised Approach for Generating Neural Activity

Meaningful and simplified representations of neural activity can yield i...

0 Ran Liu, et al. ∙

research

∙ 06/24/2021

Unifying Gradient Estimators for Meta-Reinforcement Learning via Off-Policy Evaluation

Model-agnostic meta-reinforcement learning requires estimating the Hessi...

0 Yunhao Tang, et al. ∙

research

∙ 06/11/2021

Model-Free Learning for Two-Player Zero-Sum Partially Observable Markov Games with Perfect Recall

We study the problem of learning a Nash equilibrium (NE) in an imperfect...

0 Tadashi Kozuno, et al. ∙

research

∙ 06/11/2021

Taylor Expansion of Discount Factors

In practical reinforcement learning (RL), the discount factor used for e...

0 Yunhao Tang, et al. ∙

research

∙ 04/22/2021

Stochastic Shortest Path: Minimax, Parameter-Free and Towards Horizon-Free Regret

We study the problem of learning in the stochastic shortest path (SSP) s...

0 Jean Tarbouriech, et al. ∙

research

∙ 03/30/2021

Broaden Your Views for Self-Supervised Video Learning

Most successful self-supervised learning methods are trained to align th...

3 Adria Recasens, et al. ∙

research

∙ 03/01/2021

UCB Momentum Q-learning: Correcting the bias without forgetting

We propose UCBMQ, Upper Confidence Bound Momentum Q-learning, a new algo...

0 Pierre Ménard, et al. ∙

research

∙ 02/27/2021

Revisiting Peng's Q(λ) for Modern Reinforcement Learning

Off-policy multi-step reinforcement learning algorithms consist of conse...

0 Tadashi Kozuno, et al. ∙

research

∙ 02/19/2021

Mine Your Own vieW: Self-Supervised Learning Through Across-Sample Prediction

State-of-the-art methods for self-supervised learning (SSL) build repres...

0 Mehdi Azabou, et al. ∙

research

∙ 02/12/2021

Bootstrapped Representation Learning on Graphs

Current state-of-the-art self-supervised learning methods for graph neur...

0 Shantanu Thakoor, et al. ∙

research

∙ 01/06/2021

Geometric Entropic Exploration

Exploration is essential for solving complex Reinforcement Learning (RL)...

0 Zhaohan Daniel Guo, et al. ∙

research

∙ 01/05/2021

On the Approximation Relationship between Optimizing Ratio of Submodular (RS) and Difference of Submodular (DS) Functions

We demonstrate that from an algorithm guaranteeing an approximation fact...

0 Pierre Perrault, et al. ∙

research

∙ 12/29/2020

Improved Sample Complexity for Incremental Autonomous Exploration in MDPs

We investigate the exploration of an unknown environment when no reward ...

0 Jean Tarbouriech, et al. ∙

research

∙ 11/18/2020

Game Plan: What AI can do for Football, and What Football can do for AI

The rapid progress in artificial intelligence (AI) and machine learning ...

11 Karl Tuyls, et al. ∙

research

∙ 10/20/2020

BYOL works even without batch statistics

Bootstrap Your Own Latent (BYOL) is a self-supervised learning approach ...

0 Pierre H. Richemond, et al. ∙

research

∙ 10/07/2020

Episodic Reinforcement Learning in Finite MDPs: Minimax Lower Bounds Revisited

In this paper, we propose new problem-independent lower bounds on the sa...

0 Omar Darwiche Domingues, et al. ∙

research

∙ 07/27/2020

Fast active learning for pure exploration in reinforcement learning

Realistic environments often provide agents with very limited feedback. ...

10 Pierre Ménard, et al. ∙

research

∙ 07/24/2020

Monte-Carlo Tree Search as Regularized Policy Optimization

The combination of Monte-Carlo tree search (MCTS) with deep reinforcemen...

5 Jean-Bastien Grill, et al. ∙

research

∙ 07/13/2020

A Provably Efficient Sample Collection Strategy for Reinforcement Learning

A common assumption in reinforcement learning (RL) is to have access to ...

10 Jean Tarbouriech, et al. ∙

research

∙ 07/09/2020

A Kernel-Based Approach to Non-Stationary Reinforcement Learning in Metric Spaces

In this work, we propose KeRNS: an algorithm for episodic reinforcement ...

51 Omar Darwiche Domingues, et al. ∙

research

∙ 07/02/2020

Gamification of Pure Exploration for Linear Bandits

We investigate an active pure-exploration setting, that includes best-ar...

0 Rémy Degenne, et al. ∙

research

∙ 06/30/2020

Sampling from a k-DPP without looking at all items

Determinantal point processes (DPPs) are a useful probabilistic model fo...

5 Daniele Calandriello, et al. ∙

research

∙ 06/18/2020

Stochastic bandits with arm-dependent delays

Significant work has been recently dedicated to the stochastic delayed b...

0 Anne Gael Manegueu, et al. ∙

research

∙ 06/13/2020

Bootstrap Your Own Latent: A New Approach to Self-Supervised Learning

We introduce Bootstrap Your Own Latent (BYOL), a new approach to self-su...

0 Jean-Bastien Grill, et al. ∙

research

∙ 06/11/2020

Statistical Efficiency of Thompson Sampling for Combinatorial Semi-Bandits

We investigate stochastic combinatorial multi-armed bandit with semi-ban...

0 Pierre Perrault, et al. ∙

research

∙ 06/11/2020

Adaptive Reward-Free Exploration

Reward-free exploration is a reinforcement learning setting recently stu...

3 Emilie Kaufmann, et al. ∙

research

∙ 06/10/2020

Planning in Markov Decision Processes with Gap-Dependent Sample Complexity

We propose MDP-GapE, a new trajectory-based Monte-Carlo Tree Search algo...

10 Anders Jonsson, et al. ∙

research

∙ 04/12/2020

Regret Bounds for Kernel-Based Reinforcement Learning

We consider the exploration-exploitation dilemma in finite-horizon reinf...

0 Omar Darwiche Domingues, et al. ∙

research

∙ 03/13/2020

Taylor Expansion Policy Optimization

In this work, we investigate the application of Taylor expansions in rei...

37 Yunhao Tang, et al. ∙

research

∙ 03/04/2020

Fast sampling from β-ensembles

We study sampling algorithms for β-ensembles with time complexity less t...

0 Guillaume Gautier, et al. ∙

research

∙ 02/23/2020

Near-linear Time Gaussian Process Optimization with Adaptive Batching and Resparsification

Gaussian processes (GP) are one of the most successful frameworks to mod...

2 Daniele Calandriello, et al. ∙

