
-
Geometric Entropic Exploration
Exploration is essential for solving complex Reinforcement Learning (RL)...
-
On the Approximation Relationship between Optimizing Ratio of Submodular (RS) and Difference of Submodular (DS) Functions
We demonstrate that from an algorithm guaranteeing an approximation fact...
-
Improved Sample Complexity for Incremental Autonomous Exploration in MDPs
We investigate the exploration of an unknown environment when no reward ...
-
Game Plan: What AI can do for Football, and What Football can do for AI
The rapid progress in artificial intelligence (AI) and machine learning ...
-
BYOL works even without batch statistics
Bootstrap Your Own Latent (BYOL) is a self-supervised learning approach ...
-
Episodic Reinforcement Learning in Finite MDPs: Minimax Lower Bounds Revisited
In this paper, we propose new problem-independent lower bounds on the sa...
-
Fast active learning for pure exploration in reinforcement learning
Realistic environments often provide agents with very limited feedback. ...
-
Monte-Carlo Tree Search as Regularized Policy Optimization
The combination of Monte-Carlo tree search (MCTS) with deep reinforcemen...
-
A Provably Efficient Sample Collection Strategy for Reinforcement Learning
A common assumption in reinforcement learning (RL) is to have access to ...
-
A Kernel-Based Approach to Non-Stationary Reinforcement Learning in Metric Spaces
In this work, we propose KeRNS: an algorithm for episodic reinforcement ...
-
Gamification of Pure Exploration for Linear Bandits
We investigate an active pure-exploration setting, that includes best-ar...
-
Sampling from a k-DPP without looking at all items
Determinantal point processes (DPPs) are a useful probabilistic model fo...
-
Stochastic bandits with arm-dependent delays
Significant work has been recently dedicated to the stochastic delayed b...
-
Bootstrap Your Own Latent: A New Approach to Self-Supervised Learning
We introduce Bootstrap Your Own Latent (BYOL), a new approach to self-su...
-
Statistical Efficiency of Thompson Sampling for Combinatorial Semi-Bandits
We investigate stochastic combinatorial multi-armed bandit with semi-ban...
-
Adaptive Reward-Free Exploration
Reward-free exploration is a reinforcement learning setting recently stu...
-
Planning in Markov Decision Processes with Gap-Dependent Sample Complexity
We propose MDP-GapE, a new trajectory-based Monte-Carlo Tree Search algo...
-
Regret Bounds for Kernel-Based Reinforcement Learning
We consider the exploration-exploitation dilemma in finite-horizon reinf...
-
Taylor Expansion Policy Optimization
In this work, we investigate the application of Taylor expansions in rei...
-
Fast sampling from β-ensembles
We study sampling algorithms for β-ensembles with time complexity less t...
-
Near-linear Time Gaussian Process Optimization with Adaptive Batching and Resparsification
Gaussian processes (GP) are one of the most successful frameworks to mod...
-
No-Regret Exploration in Goal-Oriented Reinforcement Learning
Many popular reinforcement learning problems (e.g., navigation in a maze...
-
Fixed-Confidence Guarantees for Bayesian Best-Arm Identification
We investigate and provide new insights on the sampling rule called Top-...
-
Derivative-Free Order-Robust Optimisation
In this paper, we formalise order-robust optimisation as an instance of ...
-
Multiagent Evaluation under Incomplete Information
This paper investigates the evaluation of learned multiagent strategies ...
-
Exact sampling of determinantal point processes with sublinear time preprocessing
We study the complexity of sampling from a distribution over all index s...
-
Gaussian Process Optimization with Adaptive Sketching: Scalable and No Regret
Gaussian processes (GP) are a popular Bayesian approach for the optimiza...
-
Exploiting Structure of Uncertainty for Efficient Combinatorial Semi-Bandits
We improve the efficiency of algorithms for stochastic combinatorial sem...
-
Optimistic optimization of a Brownian
We address the problem of optimizing a Brownian motion. We consider a (r...
-
Rotting bandits are no harder than stochastic ones
In bandits, arms' distributions are stationary. This is often violated i...
-
A simple parameter-free and adaptive approach to optimization under a minimal local smoothness assumption
We study the problem of optimizing a function under a budgeted number of...
-
Compressing the Input for CNNs with the First-Order Scattering Transform
We study the first-order scattering transform as a candidate for reducin...
-
DPPy: Sampling Determinantal Point Processes with Python
Determinantal point processes (DPPs) are specific probability distributi...
-
Finding the Bandit in a Graph: Sequential Search-and-Stop
We consider the problem where an agent wants to find a hidden object tha...
-
Distributed Adaptive Sampling for Kernel Matrix Approximation
Most kernel-based methods, such as kernel or Gaussian process regression...
-
Second-Order Kernel Online Convex Optimization with Adaptive Sketching
Kernel online convex optimization (KOCO) is a framework combining the ex...
-
Zonotope hit-and-run for efficient sampling from projection DPPs
Determinantal point processes (DPPs) are distributions over sets of item...
-
Analysis of Kelner and Levin graph sparsification algorithm for a streaming setting
We derive a new proof to show that the incremental resparsification algo...
-
Online Influence Maximization under Independent Cascade Model with Semi-Bandit Feedback
We study the stochastic online problem of learning to influence in a soc...
-
Incremental Spectral Sparsification for Large-Scale Graph-Based Semi-Supervised Learning
While the harmonic function solution performs well in many semi-supervis...
-
Simple regret for infinitely many armed bandits
We consider a stochastic bandit problem with infinitely many arms. In th...
-
Learning to Act Greedily: Polymatroid Semi-Bandits
Many important optimization problems, such as the minimum spanning tree ...
-
Finite-Time Analysis of Kernelised Contextual Bandits
We tackle the problem of online reward maximisation over a large finite ...
-
Online Semi-Supervised Learning on Quantized Graphs
In this paper, we tackle the problem of online semi-supervised learning ...