Simon S. Du

research

∙ 07/25/2023

Settling the Sample Complexity of Online Reinforcement Learning

A central issue lying at the heart of online reinforcement learning (RL)...

0 Zihan Zhang, et al. ∙

research

∙ 06/15/2023

Active Representation Learning for General Task Space with Applications in Robotics

Representation learning based on multi-task pretraining has become a pow...

0 Yifang Chen, et al. ∙

research

∙ 06/12/2023

A Black-box Approach for Non-stationary Multi-agent Reinforcement Learning

We investigate learning the equilibria in non-stationary multi-agent sys...

0 Haozhe Jiang, et al. ∙

research

∙ 06/05/2023

Improved Active Multi-Task Representation Learning via Lasso

To leverage the copious amount of data from source tasks and overcome th...

0 Yiping Wang, et al. ∙

research

∙ 02/20/2023

Over-Parameterization Exponentially Slows Down Gradient Descent for Learning a Single Neuron

We revisit the problem of learning a single neuron with ReLU activation ...

0 Weihang Xu, et al. ∙

research

∙ 02/07/2023

Breaking the Curse of Multiagents in a Large State Space: RL in Markov Games with Independent Linear Function Approximation

We propose a new model, independent linear Markov game, for multi-agent ...

0 Qiwen Cui, et al. ∙

research

∙ 02/03/2023

A Reduction-based Framework for Sequential Decision Making with Delayed Feedback

We study stochastic delayed feedback in general multi-agent sequential d...

0 Yunchang Yang, et al. ∙

research

∙ 01/31/2023

Sharp Variance-Dependent Bounds in Reinforcement Learning: Best of Both Worlds in Stochastic and Deterministic Environments

We study variance-dependent regret bounds for Markov decision processes ...

0 Runlong Zhou, et al. ∙

research

∙ 01/27/2023

Understanding Incremental Learning of Gradient Descent: A Fine-grained Analysis of Matrix Sensing

It is believed that Gradient Descent (GD) induces an implicit bias towar...

0 Jikai Jin, et al. ∙

research

∙ 10/24/2022

Offline congestion games: How feedback type affects data coverage requirement

This paper investigates when one can efficiently recover an approximate ...

0 Haozhe Jiang, et al. ∙

research

∙ 10/20/2022

Horizon-Free Reinforcement Learning for Latent Markov Decision Processes

We study regret minimization for reinforcement learning (RL) in Latent M...

0 Runlong Zhou, et al. ∙

research

∙ 10/19/2022

On the Power of Pre-training for Generalization in RL: Provable Benefits and Hardness

Generalization in Reinforcement Learning (RL) aims to learn an agent dur...

0 Haotian Ye, et al. ∙

research

∙ 10/04/2022

Linear Convergence of Natural Policy Gradient Methods with Log-Linear Policies

We consider infinite-horizon discounted Markov decision processes and st...

0 Rui Yuan, et al. ∙

research

∙ 10/03/2022

Faster Last-iterate Convergence of Policy Optimization in Zero-Sum Markov Games

Multi-Agent Reinforcement Learning (MARL) – where multiple agents learn ...

0 Shicong Cen, et al. ∙

research

∙ 09/07/2022

Blessing of Class Diversity in Pre-training

This paper presents a new statistical analysis aiming to explain the rec...

0 Yulai Zhao, et al. ∙

research

∙ 06/30/2022

Denoised MDPs: Learning World Models Better Than the World Itself

The ability to separate signal from noise, and reason with clean abstrac...

8 Tongzhou Wang, et al. ∙

research

∙ 06/17/2022

Optimal Extragradient-Based Bilinearly-Coupled Saddle-Point Optimization

We consider the smooth convex-concave bilinearly-coupled saddle-point pr...

0 Simon S. Du, et al. ∙

research

∙ 06/04/2022

Learning in Congestion Games with Bandit Feedback

Learning Nash equilibria is a central problem in multi-agent systems. In...

0 Qiwen Cui, et al. ∙

research

∙ 06/01/2022

On Gap-dependent Bounds for Offline Reinforcement Learning

This paper presents a systematic study on gap-dependent sample complexit...

0 Xinqi Wang, et al. ∙

research

∙ 06/01/2022

Provably Efficient Offline Multi-agent Reinforcement Learning via Strategy-wise Bonus

This paper considers offline multi-agent reinforcement learning. We prop...

0 Qiwen Cui, et al. ∙

research

∙ 05/31/2022

Provable General Function Class Representation Learning in Multitask Bandits and MDPs

While multitask representation learning has become a popular approach in...

4 Rui Lu, et al. ∙

research

∙ 05/26/2022

Variance-Aware Sparse Linear Bandits

It is well-known that the worst-case minimax regret for sparse linear ba...

0 Yan Dai, et al. ∙

research

∙ 03/29/2022

Nearly Minimax Algorithms for Linear Bandits with Shared Representation

We give novel algorithms for multi-task and lifelong linear bandits with...

0 Jiaqi Yang, et al. ∙

research

∙ 03/24/2022

Horizon-Free Reinforcement Learning in Polynomial Time: the Power of Stationary Policies

This paper gives the first polynomial-time algorithm for tabular Markov ...

0 Zihan Zhang, et al. ∙

research

∙ 02/11/2022

Understanding Curriculum Learning in Policy Optimization for Solving Combinatorial Optimization Problems

Over the recent years, reinforcement learning (RL) has shown impressive ...

9 Runlong Zhou, et al. ∙

research

∙ 02/04/2022

TransFollower: Long-Sequence Car-Following Trajectory Prediction through Transformer

Car-following refers to a control process in which the following vehicle...

0 Meixin Zhu, et al. ∙

research

∙ 02/02/2022

Active Multi-Task Representation Learning

To leverage the power of big data from source tasks and overcome the sca...

14 Yifang Chen, et al. ∙

research

∙ 01/26/2022

Reward-Free RL is No Harder Than Reward-Aware RL in Linear Markov Decision Processes

Reward-free reinforcement learning (RL) considers the setting where the ...

0 Andrew Wagenmaker, et al. ∙

research

∙ 01/10/2022

When is Offline Two-Player Zero-Sum Markov Game Solvable?

We study what dataset assumption permits solving offline two-player zero...

0 Qiwen Cui, et al. ∙

research

∙ 12/21/2021

Nearly Optimal Policy Optimization with Stable at Any Time Guarantee

Policy optimization methods are one of the most widely used classes of R...

0 Tianhao Wu, et al. ∙

research

∙ 12/07/2021

First-Order Regret in Reinforcement Learning with Linear Function Approximation: A Robust Estimation Approach

Obtaining first-order regret bounds – regret bounds scaling not as the w...

0 Andrew Wagenmaker, et al. ∙

research

∙ 10/11/2021

Towards Demystifying Representation Learning with Non-contrastive Self-supervision

Non-contrastive methods of self-supervised learning (such as BYOL and Si...

2 Xiang Wang, et al. ∙

research

∙ 07/01/2021

Gap-Dependent Bounds for Two-Player Markov Games

As one of the most popular methods in the field of reinforcement learnin...

0 Zehao Dou, et al. ∙

research

∙ 06/27/2021

Global Convergence of Gradient Descent for Asymmetric Low-Rank Matrix Factorization

We study the asymmetric low-rank factorization problem: min_𝐔∈ℝ^m ×...

16 Tian Ye, et al. ∙

research

∙ 06/22/2021

A Unified Framework for Conservative Exploration

We study bandits and reinforcement learning (RL) subject to a conservati...

0 Yunchang Yang, et al. ∙

research

∙ 06/21/2021

Corruption Robust Active Learning

We conduct theoretical studies on streaming-based active learning for bi...

0 Yifang Chen, et al. ∙

research

∙ 06/15/2021

On the Power of Multitask Representation Learning in Linear MDP

While multitask representation learning has become a popular approach in...

0 Rui Lu, et al. ∙

research

∙ 06/12/2021

Provable Adaptation across Multiway Domains via Representation Learning

This paper studies zero-shot domain adaptation where each domain is inde...

0 Zhili Feng, et al. ∙

research

∙ 04/22/2021

Stochastic Shortest Path: Minimax, Parameter-Free and Towards Horizon-Free Regret

We study the problem of learning in the stochastic shortest path (SSP) s...

0 Jean Tarbouriech, et al. ∙

research

∙ 03/25/2021

Nearly Horizon-Free Offline Reinforcement Learning

We revisit offline reinforcement learning on episodic time-homogeneous t...

0 Tongzheng Ren, et al. ∙

research

∙ 03/19/2021

Bilinear Classes: A Structural Framework for Provable Generalization in RL

This work introduces Bilinear Classes, a new structural framework, which...

52 Simon S. Du, et al. ∙

research

∙ 02/19/2021

Randomized Exploration is Near-Optimal for Tabular MDP

We study exploration using randomized value functions in Thompson Sampli...

0 Zhihan Xiong, et al. ∙

research

∙ 02/17/2021

Provably Efficient Policy Gradient Methods for Two-Player Zero-Sum Markov Games

Policy gradient methods are widely used in solving two-player zero-sum g...

11 Yulai Zhao, et al. ∙

research

∙ 02/13/2021

Improved Corruption Robust Algorithms for Episodic Reinforcement Learning

We study episodic reinforcement learning under unknown adversarial corru...

0 Yifang Chen, et al. ∙

research

∙ 02/09/2021

Fine-Grained Gap-Dependent Bounds for Tabular MDPs via Adaptive Multi-Step Bootstrap

This paper presents a new model-free algorithm for episodic finite-horiz...

0 Haike Xu, et al. ∙

research

∙ 01/29/2021

Variance-Aware Confidence Set: Variance-Dependent Bound for Linear Bandits and Horizon-Free Bound for Linear Mixture MDP

We show how to construct variance-aware confidence sets for linear bandi...

0 Zihan Zhang, et al. ∙

research

∙ 01/02/2021

A Provably Efficient Algorithm for Linear Markov Decision Process with Low Switching Cost

Many real-world applications, such as those in medical domains, recommen...

0 Minbo Gao, et al. ∙

research

∙ 10/12/2020

Nearly Minimax Optimal Reward-free Reinforcement Learning

We study the reward-free reinforcement learning framework, which is part...

0 Zihan Zhang, et al. ∙

research

∙ 09/28/2020

Is Reinforcement Learning More Difficult Than Bandits? A Near-optimal Algorithm Escaping the Curse of Horizon

Episodic reinforcement learning and contextual bandits are two widely st...

0 Zihan Zhang, et al. ∙

research

∙ 09/24/2020

How Neural Networks Extrapolate: From Feedforward to Graph Neural Networks

We study how neural networks trained by gradient descent extrapolate, i....

68 Keyulu Xu, et al. ∙

