Zhuoran Yang

research

∙ 07/26/2023

Actions Speak What You Want: Provably Sample-Efficient Reinforcement Learning of the Quantal Stackelberg Equilibrium from Strategic Feedbacks

We study reinforcement learning (RL) for learning a Quantal Stackelberg ...

0 Siyu Chen, et al. ∙

research

∙ 07/08/2023

Contextual Dynamic Pricing with Strategic Buyers

Personalized pricing, which involves tailoring prices based on individua...

0 Pangpang Liu, et al. ∙

research

∙ 06/26/2023

A General Framework for Sequential Decision-Making under Adaptivity Constraints

We take the first step in studying general sequential decision-making un...

0 Nuoya Xiong, et al. ∙

research

∙ 06/21/2023

Provably Efficient Representation Learning with Tractable Planning in Low-Rank POMDP

In this paper, we study representation learning in partially observable ...

0 Jiacheng Guo, et al. ∙

research

∙ 05/31/2023

Provably Efficient Generalized Lagrangian Policy Optimization for Safe Multi-Agent Reinforcement Learning

We examine online safe multi-agent reinforcement learning using constrai...

0 Dongsheng Ding, et al. ∙

research

∙ 05/30/2023

What and How does In-Context Learning Learn? Bayesian Model Averaging, Parameterization, and Generalization

In this paper, we conduct a comprehensive study of In-Context Learning (...

0 Yufeng Zhang, et al. ∙

research

∙ 05/29/2023

One Objective to Rule Them All: A Maximization Objective Fusing Estimation and Planning for Exploration

In online reinforcement learning (online RL), balancing exploration and ...

0 Zhihan Liu, et al. ∙

research

∙ 05/29/2023

Diffusion Model is an Effective Planner and Data Synthesizer for Multi-Task Reinforcement Learning

Diffusion models have demonstrated highly-expressive generative capabili...

0 Haoran He, et al. ∙

research

∙ 05/29/2023

Reinforcement Learning with Human Feedback: Learning Dynamic Choices via Pessimism

In this paper, we study offline Reinforcement Learning with Human Feedba...

0 Zihao Li, et al. ∙

research

∙ 05/08/2023

Local Optimization Achieves Global Optimality in Multi-Agent Reinforcement Learning

Policy optimization methods with function approximation are widely used ...

0 Yulai Zhao, et al. ∙

research

∙ 03/28/2023

Offline RL with No OOD Actions: In-Sample Learning via Implicit Value Regularization

Most offline reinforcement learning (RL) methods suffer from the trade-o...

0 Haoran Xu, et al. ∙

research

∙ 03/20/2023

A Unified Framework of Policy Learning for Contextual Bandit with Confounding Bias and Missing Observations

We study the offline contextual bandit problem, where we aim to acquire ...

0 Siyu Chen, et al. ∙

research

∙ 03/15/2023

Learning to Incentivize Information Acquisition: Proper Scoring Rules Meet Principal-Agent Model

We study the incentivized information acquisition problem, where a princ...

0 Siyu Chen, et al. ∙

research

∙ 03/03/2023

Can We Find Nash Equilibria at a Linear Rate in Markov Games?

We study decentralized learning in two-player zero-sum discounted Markov...

0 Zhuoqing Song, et al. ∙

research

∙ 12/29/2022

Offline Policy Optimization in RL with Variance Regularization

Learning policies from fixed offline datasets is a key challenge to scal...

0 Riashat Islam, et al. ∙

research

∙ 12/23/2022

Offline Reinforcement Learning for Human-Guided Human-Machine Interaction with Private Information

Motivated by the human-machine interaction such as training chatbots for...

0 Zuyue Fu, et al. ∙

research

∙ 12/19/2022

Policy learning "without" overlap: Pessimism and generalized empirical Bernstein's inequality

This paper studies offline policy learning, which aims at utilizing obse...

0 Ying Jin, et al. ∙

research

∙ 10/19/2022

A Reinforcement Learning Approach in Multi-Phase Second-Price Auction Design

We study reserve price optimization in multi-phase second price auctions...

1 Rui Ai, et al. ∙

research

∙ 09/29/2022

Enforcing Hard Constraints with Soft Barriers: Safe Reinforcement Learning in Unknown Stochastic Environments

It is quite challenging to ensure the safety of reinforcement learning (...

5 Yixuan Wang, et al. ∙

research

∙ 09/20/2022

Relational Reasoning via Set Transformers: Provable Efficiency and Applications to MARL

The cooperative Multi-Agent Reinforcement Learning (MARL) with permuta...

1 Fengzhuo Zhang, et al. ∙

research

∙ 09/18/2022

Offline Reinforcement Learning with Instrumental Variables in Confounded Markov Decision Processes

We study the offline reinforcement learning (RL) in the face of unmeasur...

5 Zuyue Fu, et al. ∙

research

∙ 08/23/2022

Strategic Decision-Making in the Presence of Information Asymmetry: Provably Efficient RL with Algorithmic Instruments

We study offline reinforcement learning under a novel model called strat...

0 Mengxin Yu, et al. ∙

research

∙ 07/29/2022

Contrastive UCB: Provably Efficient Contrastive Self-Supervised Learning in Online Reinforcement Learning

In view of its power in extracting feature representation, contrastive s...

6 Shuang Qiu, et al. ∙

research

∙ 07/25/2022

Provably Efficient Fictitious Play Policy Optimization for Zero-Sum Markov Games with Structured Transitions

While single-agent policy optimization in a fixed environment has attrac...

3 Shuang Qiu, et al. ∙

research

∙ 06/03/2022

Decentralized Optimistic Hyperpolicy Mirror Descent: Provably No-Regret Learning in Markov Games

We study decentralized policy learning in Markov games where we control ...

14 Wenhao Zhan, et al. ∙

research

∙ 05/26/2022

Pessimism in the Face of Confounders: Provably Efficient Offline Reinforcement Learning in Partially Observable Markov Decision Processes

We study offline reinforcement learning (RL) in partially observable Mar...

6 Miao Lu, et al. ∙

research

∙ 05/26/2022

Embed to Control Partially Observed Systems: Representation Learning with Provable Sample Efficiency

Reinforcement learning in partially observed Markov decision processes (...

4 Lingxiao Wang, et al. ∙

research

∙ 05/23/2022

Human-in-the-loop: Provably Efficient Preference-based Reinforcement Learning with General Function Approximation

We study human-in-the-loop reinforcement learning (RL) with trajectory p...

6 Xiaoyu Chen, et al. ∙

research

∙ 05/05/2022

Pessimism meets VCG: Learning Dynamic Mechanism Design via Offline Reinforcement Learning

Dynamic mechanism design has garnered significant attention from both co...

1 Boxiang Lyu, et al. ∙

research

∙ 04/20/2022

Sample-Efficient Reinforcement Learning for POMDPs with Linear Function Approximations

Despite the success of reinforcement learning (RL) for Markov decision p...

5 Qi Cai, et al. ∙

research

∙ 03/07/2022

Learn to Match with No Regret: Reinforcement Learning in Markov Matching Markets

We study a Markov matching market involving a planner and a set of strat...

2 Yifei Min, et al. ∙

research

∙ 03/03/2022

The Best of Both Worlds: Reinforcement Learning with Logarithmic Regret and Policy Switches

In this paper, we study the problem of regret minimization for episodic ...

3 Grigoris Velegkas, et al. ∙

research

∙ 02/25/2022

Learning Dynamic Mechanisms in Unknown Environments: A Reinforcement Learning Approach

Dynamic mechanism design studies how mechanism designers should allocate...

15 Boxiang Lyu, et al. ∙

research

∙ 02/23/2022

Pessimistic Bootstrapping for Uncertainty-Driven Offline Reinforcement Learning

Offline Reinforcement Learning (RL) aims to learn policies from previous...

5 Chenjia Bai, et al. ∙

research

∙ 02/22/2022

Sequential Information Design: Markov Persuasion Process and Its Efficient Reinforcement Learning

In today's economy, it becomes important for Internet platforms to consi...

10 Jibang Wu, et al. ∙

research

∙ 01/28/2022

Joint Differentiable Optimization and Verification for Certified Reinforcement Learning

In model-based reinforcement learning for safety-critical control system...

2 Yixuan Wang, et al. ∙

research

∙ 12/28/2021

Exponential Family Model-Based Reinforcement Learning via Score Matching

We propose an optimistic model-based algorithm, dubbed SMRL, for finite-...

12 Gene Li, et al. ∙

research

∙ 12/27/2021

Wasserstein Flow Meets Replicator Dynamics: A Mean-Field Analysis of Representation Learning in Actor-Critic

Actor-critic (AC) algorithms, empowered by neural networks, have had sig...

17 Yufeng Zhang, et al. ∙

research

∙ 12/27/2021

Can Reinforcement Learning Find Stackelberg-Nash Equilibria in General-Sum Markov Games with Myopic Followers?

We study multi-player general-sum Markov games with one of the players d...

27 Han Zhong, et al. ∙

research

∙ 11/06/2021

Exponential Bellman Equation and Improved Regret Bounds for Risk-Sensitive Reinforcement Learning

We study risk-sensitive reinforcement learning (RL) based on the entropi...

0 Yingjie Fei, et al. ∙

research

∙ 10/24/2021

SCORE: Spurious COrrelation REduction for Offline Reinforcement Learning

Offline reinforcement learning (RL) aims to learn the optimal policy fro...

0 Zhihong Deng, et al. ∙

research

∙ 10/19/2021

On Reward-Free RL with Kernel and Neural Function Approximations: Single-Agent MDP and Markov Game

To achieve sample efficiency in reinforcement learning (RL), it necessit...

0 Shuang Qiu, et al. ∙

research

∙ 10/18/2021

Optimistic Policy Optimization is Provably Efficient in Non-stationary MDPs

We study episodic reinforcement learning (RL) in non-stationary linear k...

0 Han Zhong, et al. ∙

research

∙ 10/04/2021

Inducing Equilibria via Incentives: Simultaneous Design-and-Play Finds Global Optima

To regulate a social system comprised of self-interested agents, economi...

0 Boyi Liu, et al. ∙

research

∙ 08/19/2021

Provably Efficient Generative Adversarial Imitation Learning for Online and Offline Setting with Linear Function Approximation

In generative adversarial imitation learning (GAIL), the agent aims to l...

0 Zhihan Liu, et al. ∙

research

∙ 08/08/2021

Online Bootstrap Inference For Policy Evaluation in Reinforcement Learning

The recent emergence of reinforcement learning has created a demand for ...

5 Pratik Ramprasad, et al. ∙

research

∙ 07/30/2021

Towards General Function Approximation in Zero-Sum Markov Games

This paper considers two-player zero-sum finite-horizon Markov games wit...

0 Baihe Huang, et al. ∙

research

∙ 07/06/2021

A Unified Off-Policy Evaluation Approach for General Value Function

General Value Function (GVF) is a powerful tool to represent both the pr...

0 Tengyu Xu, et al. ∙

research

∙ 07/01/2021

Gap-Dependent Bounds for Two-Player Markov Games

As one of the most popular methods in the field of reinforcement learnin...

0 Zehao Dou, et al. ∙

research

∙ 06/15/2021

Randomized Exploration for Reinforcement Learning with General Value Function Approximation

We propose a model-free reinforcement learning algorithm inspired by the...

0 Haque Ishfaq, et al. ∙
