Zhaoran Wang

research

∙ 07/08/2023

Contextual Dynamic Pricing with Strategic Buyers

Personalized pricing, which involves tailoring prices based on individua...

0 Pangpang Liu, et al. ∙

research

∙ 06/26/2023

A General Framework for Sequential Decision-Making under Adaptivity Constraints

We take the first step in studying general sequential decision-making un...

0 Nuoya Xiong, et al. ∙

research

∙ 05/31/2023

Provably Efficient Generalized Lagrangian Policy Optimization for Safe Multi-Agent Reinforcement Learning

We examine online safe multi-agent reinforcement learning using constrai...

0 Dongsheng Ding, et al. ∙

research

∙ 05/30/2023

What and How does In-Context Learning Learn? Bayesian Model Averaging, Parameterization, and Generalization

In this paper, we conduct a comprehensive study of In-Context Learning (...

0 Yufeng Zhang, et al. ∙

research

∙ 05/29/2023

One Objective to Rule Them All: A Maximization Objective Fusing Estimation and Planning for Exploration

In online reinforcement learning (online RL), balancing exploration and ...

0 Zhihan Liu, et al. ∙

research

∙ 05/08/2023

Local Optimization Achieves Global Optimality in Multi-Agent Reinforcement Learning

Policy optimization methods with function approximation are widely used ...

0 Yulai Zhao, et al. ∙

research

∙ 04/05/2023

Wardrop Equilibrium Can Be Boundedly Rational: A New Behavioral Theory of Route Choice

As one of the most fundamental concepts in transportation science, Wardr...

0 Jiayang Li, et al. ∙

research

∙ 03/28/2023

Offline RL with No OOD Actions: In-Sample Learning via Implicit Value Regularization

Most offline reinforcement learning (RL) methods suffer from the trade-o...

0 Haoran Xu, et al. ∙

research

∙ 03/20/2023

A Unified Framework of Policy Learning for Contextual Bandit with Confounding Bias and Missing Observations

We study the offline contextual bandit problem, where we aim to acquire ...

0 Siyu Chen, et al. ∙

research

∙ 02/20/2023

Differentiable Arbitrating in Zero-sum Markov Games

We initiate the study of how to perturb the reward in a zero-sum Markov ...

0 Jing Wang, et al. ∙

research

∙ 02/20/2023

Achieving Hierarchy-Free Approximation for Bilevel Programs With Equilibrium Constraints

In this paper, we develop an approximation scheme for solving bilevel pr...

0 Jiayang Li, et al. ∙

research

∙ 12/30/2022

An Analysis of Attention via the Lens of Exchangeability and Latent Variable Models

With the attention mechanism, transformers achieve significant empirical...

0 Yufeng Zhang, et al. ∙

research

∙ 12/29/2022

Offline Policy Optimization in RL with Variance Regularizaton

Learning policies from fixed offline datasets is a key challenge to scal...

0 Riashat Islam, et al. ∙

research

∙ 12/23/2022

Offline Reinforcement Learning for Human-Guided Human-Machine Interaction with Private Information

Motivated by the human-machine interaction such as training chatbots for...

0 Zuyue Fu, et al. ∙

research

∙ 12/19/2022

Policy learning "without" overlap: Pessimism and generalized empirical Bernstein's inequality

This paper studies offline policy learning, which aims at utilizing obse...

0 Ying Jin, et al. ∙

research

∙ 12/17/2022

Latent Variable Representation for Reinforcement Learning

Deep latent variable models have achieved significant empirical successe...

10 Tongzheng Ren, et al. ∙

research

∙ 10/19/2022

A Reinforcement Learning Approach in Multi-Phase Second-Price Auction Design

We study reserve price optimization in multi-phase second price auctions...

1 Rui Ai, et al. ∙

research

∙ 09/29/2022

Enforcing Hard Constraints with Soft Barriers: Safe Reinforcement Learning in Unknown Stochastic Environments

It is quite challenging to ensure the safety of reinforcement learning (...

5 Yixuan Wang, et al. ∙

research

∙ 09/20/2022

Relational Reasoning via Set Transformers: Provable Efficiency and Applications to MARL

The cooperative Multi-Agent Reinforcement Learning (MARL) with permuta...

1 Fengzhuo Zhang, et al. ∙

research

∙ 09/18/2022

Offline Reinforcement Learning with Instrumental Variables in Confounded Markov Decision Processes

We study the offline reinforcement learning (RL) in the face of unmeasur...

5 Zuyue Fu, et al. ∙

research

∙ 09/15/2022

Differentiable Bilevel Programming for Stackelberg Congestion Games

A Stackelberg congestion game (SCG) is a bilevel program in which a lead...

0 Jiayang Li, et al. ∙

research

∙ 07/29/2022

Contrastive UCB: Provably Efficient Contrastive Self-Supervised Learning in Online Reinforcement Learning

In view of its power in extracting feature representation, contrastive s...

6 Shuang Qiu, et al. ∙

research

∙ 07/25/2022

Provably Efficient Fictitious Play Policy Optimization for Zero-Sum Markov Games with Structured Transitions

While single-agent policy optimization in a fixed environment has attrac...

3 Shuang Qiu, et al. ∙

research

∙ 06/11/2022

Federated Offline Reinforcement Learning

Evidence-based or data-driven dynamic treatment regimes are essential fo...

0 Doudou Zhou, et al. ∙

research

∙ 05/26/2022

Pessimism in the Face of Confounders: Provably Efficient Offline Reinforcement Learning in Partially Observable Markov Decision Processes

We study offline reinforcement learning (RL) in partially observable Mar...

6 Miao Lu, et al. ∙

research

∙ 05/26/2022

Embed to Control Partially Observed Systems: Representation Learning with Provable Sample Efficiency

Reinforcement learning in partially observed Markov decision processes (...

4 Lingxiao Wang, et al. ∙

research

∙ 05/23/2022

Human-in-the-loop: Provably Efficient Preference-based Reinforcement Learning with General Function Approximation

We study human-in-the-loop reinforcement learning (RL) with trajectory p...

6 Xiaoyu Chen, et al. ∙

research

∙ 05/05/2022

Pessimism meets VCG: Learning Dynamic Mechanism Design via Offline Reinforcement Learning

Dynamic mechanism design has garnered significant attention from both co...

1 Boxiang Lyu, et al. ∙

research

∙ 04/20/2022

Sample-Efficient Reinforcement Learning for POMDPs with Linear Function Approximations

Despite the success of reinforcement learning (RL) for Markov decision p...

5 Qi Cai, et al. ∙

research

∙ 03/07/2022

Learn to Match with No Regret: Reinforcement Learning in Markov Matching Markets

We study a Markov matching market involving a planner and a set of strat...

2 Yifei Min, et al. ∙

research

∙ 02/25/2022

Learning Dynamic Mechanisms in Unknown Environments: A Reinforcement Learning Approach

Dynamic mechanism design studies how mechanism designers should allocate...

15 Boxiang Lyu, et al. ∙

research

∙ 02/23/2022

Pessimistic Bootstrapping for Uncertainty-Driven Offline Reinforcement Learning

Offline Reinforcement Learning (RL) aims to learn policies from previous...

5 Chenjia Bai, et al. ∙

research

∙ 02/22/2022

Sequential Information Design: Markov Persuasion Process and Its Efficient Reinforcement Learning

In today's economy, it becomes important for Internet platforms to consi...

10 Jibang Wu, et al. ∙

research

∙ 01/28/2022

Joint Differentiable Optimization and Verification for Certified Reinforcement Learning

In model-based reinforcement learning for safety-critical control system...

2 Yixuan Wang, et al. ∙

research

∙ 12/28/2021

Exponential Family Model-Based Reinforcement Learning via Score Matching

We propose an optimistic model-based algorithm, dubbed SMRL, for finite-...

12 Gene Li, et al. ∙

research

∙ 12/27/2021

Wasserstein Flow Meets Replicator Dynamics: A Mean-Field Analysis of Representation Learning in Actor-Critic

Actor-critic (AC) algorithms, empowered by neural networks, have had sig...

17 Yufeng Zhang, et al. ∙

research

∙ 12/27/2021

Can Reinforcement Learning Find Stackelberg-Nash Equilibria in General-Sum Markov Games with Myopic Followers?

We study multi-player general-sum Markov games with one of the players d...

27 Han Zhong, et al. ∙

research

∙ 11/06/2021

Exponential Bellman Equation and Improved Regret Bounds for Risk-Sensitive Reinforcement Learning

We study risk-sensitive reinforcement learning (RL) based on the entropi...

0 Yingjie Fei, et al. ∙

research

∙ 10/24/2021

SCORE: Spurious COrrelation REduction for Offline Reinforcement Learning

Offline reinforcement learning (RL) aims to learn the optimal policy fro...

0 Zhihong Deng, et al. ∙

research

∙ 10/20/2021

Dynamic Bottleneck for Robust Self-Supervised Exploration

Exploration methods based on pseudo-count of transitions or curiosity of...

9 Chenjia Bai, et al. ∙

research

∙ 10/19/2021

On Reward-Free RL with Kernel and Neural Function Approximations: Single-Agent MDP and Markov Game

To achieve sample efficiency in reinforcement learning (RL), it necessit...

0 Shuang Qiu, et al. ∙

research

∙ 10/04/2021

Inducing Equilibria via Incentives: Simultaneous Design-and-Play Finds Global Optima

To regulate a social system comprised of self-interested agents, economi...

0 Boyi Liu, et al. ∙

research

∙ 08/19/2021

Provably Efficient Generative Adversarial Imitation Learning for Online and Offline Setting with Linear Function Approximation

In generative adversarial imitation learning (GAIL), the agent aims to l...

0 Zhihan Liu, et al. ∙

research

∙ 08/08/2021

Online Bootstrap Inference For Policy Evaluation in Reinforcement Learning

The recent emergence of reinforcement learning has created a demand for ...

5 Pratik Ramprasad, et al. ∙

research

∙ 07/30/2021

Towards General Function Approximation in Zero-Sum Markov Games

This paper considers two-player zero-sum finite-horizon Markov games wit...

0 Baihe Huang, et al. ∙

research

∙ 07/06/2021

A Unified Off-Policy Evaluation Approach for General Value Function

General Value Function (GVF) is a powerful tool to represent both the pr...

0 Tengyu Xu, et al. ∙

research

∙ 07/01/2021

Gap-Dependent Bounds for Two-Player Markov Games

As one of the most popular methods in the field of reinforcement learnin...

0 Zehao Dou, et al. ∙

research

∙ 06/15/2021

Randomized Exploration for Reinforcement Learning with General Value Function Approximation

We propose a model-free reinforcement learning algorithm inspired by the...

0 Haque Ishfaq, et al. ∙

research

∙ 05/18/2021

Permutation Invariant Policy Optimization for Mean-Field Multi-Agent Reinforcement Learning: A Principled Approach

Multi-agent reinforcement learning (MARL) becomes more challenging in th...

0 Yan Li, et al. ∙

research

∙ 05/13/2021

Principled Exploration via Optimistic Bootstrapping and Backward Induction

One principled approach for provably efficient exploration is incorporat...

3 Chenjia Bai, et al. ∙
