Mengdi Wang

research

∙ 09/15/2023

Deep Reinforcement Learning for Efficient and Fair Allocation of Health Care Resources

Scarcity of health care resources could result in the unavoidable conseq...

0 Yikuan Li, et al. ∙

research

∙ 08/03/2023

Aligning Agent Policy with Externalities: Reward Design via Bilevel RL

In reinforcement learning (RL), a reward function is often assumed at th...

0 Souradip Chakraborty, et al. ∙

research

∙ 07/26/2023

Actions Speak What You Want: Provably Sample-Efficient Reinforcement Learning of the Quantal Stackelberg Equilibrium from Strategic Feedbacks

We study reinforcement learning (RL) for learning a Quantal Stackelberg ...

0 Siyu Chen, et al. ∙

research

∙ 07/24/2023

Provable Benefits of Policy Learning from Human Preferences in Contextual Bandit Problems

A crucial task in decision-making problems is reward engineering. It is ...

0 Xiang Ji, et al. ∙

research

∙ 07/13/2023

Reward-Directed Conditional Diffusion: Provable Distribution Estimation and Reward Improvement

We explore the methodology and theory of reward-directed generation via ...

0 Hui Yuan, et al. ∙

research

∙ 07/06/2023

Sample-Efficient Learning of POMDPs with Multiple Observations In Hindsight

This paper studies the sample-efficiency of learning in Partially Observ...

0 Jiacheng Guo, et al. ∙

research

∙ 07/05/2023

Scaling In-Context Demonstrations with Structured Attention

The recent surge of large language models (LLMs) highlights their abilit...

0 Tianle Cai, et al. ∙

research

∙ 07/04/2023

Nonparametric Classification on Low Dimensional Manifolds using Overparameterized Convolutional Residual Networks

Convolutional residual neural networks (ConvResNets), though overparamet...

0 Kaiqi Zhang, et al. ∙

research

∙ 06/26/2023

Effective Minkowski Dimension of Deep Nonparametric Regression: Function Approximation and Statistical Theories

Existing theories on deep nonparametric regression have shown that when ...

0 Zixuan Zhang, et al. ∙

research

∙ 06/22/2023

Visual Adversarial Examples Jailbreak Large Language Models

Recently, there has been a surge of interest in introducing vision into ...

0 Xiangyu Qi, et al. ∙

research

∙ 06/21/2023

Provably Efficient Representation Learning with Tractable Planning in Low-Rank POMDP

In this paper, we study representation learning in partially observable ...

0 Jiacheng Guo, et al. ∙

research

∙ 06/13/2023

Unified Off-Policy Learning to Rank: a Reinforcement Learning Perspective

Off-policy Learning to Rank (LTR) aims to optimize a ranker from data co...

7 Zeyu Zhang, et al. ∙

research

∙ 06/02/2023

Efficient RL with Impaired Observability: Learning to Act with Delayed and Missing State Observations

In real-world reinforcement learning (RL) systems, various forms of impa...

5 Minshuo Chen, et al. ∙

research

∙ 05/30/2023

Adversarial Attacks on Online Learning to Rank with Stochastic Click Models

We propose the first study of adversarial attacks on online learning to ...

0 Zichen Wang, et al. ∙

research

∙ 05/29/2023

Reinforcement Learning with Human Feedback: Learning Dynamic Choices via Pessimism

In this paper, we study offline Reinforcement Learning with Human Feedba...

0 Zihao Li, et al. ∙

research

∙ 05/23/2023

Eye-tracked Virtual Reality: A Comprehensive Survey on Methods and Privacy Challenges

Latest developments in computer hardware, sensor technologies, and artif...

0 Efe Bozkir, et al. ∙

research

∙ 05/23/2023

ChipGPT: How far are we from natural language hardware design

As large language models (LLMs) like ChatGPT exhibited unprecedented mac...

0 Kaiyan Chang, et al. ∙

research

∙ 02/20/2023

Deep Reinforcement Learning for Cost-Effective Medical Diagnosis

Dynamic diagnosis is desirable when medical tests are costly or time-con...

0 Zheng Yu, et al. ∙

research

∙ 02/14/2023

Score Approximation, Estimation and Distribution Recovery of Diffusion Models on Low-Dimensional Data

Diffusion models achieve state-of-the-art performance in various generat...

0 Minshuo Chen, et al. ∙

research

∙ 01/28/2023

STEERING: Stein Information Directed Exploration for Model-Based Reinforcement Learning

Directed Exploration is a crucial challenge in reinforcement learning (R...

19 Souradip Chakraborty, et al. ∙

research

∙ 12/01/2022

Near Sample-Optimal Reduction-based Policy Learning for Average Reward MDP

This work considers the sample complexity of obtaining an ε-optimal poli...

0 Jinghan Wang, et al. ∙

research

∙ 11/02/2022

Energy System Digitization in the Era of AI: A Three-Layered Approach towards Carbon Neutrality

The transition towards carbon-neutral electricity is one of the biggest ...

0 Le Xie, et al. ∙

research

∙ 10/30/2022

Representation Learning for General-sum Low-rank Markov Games

We study multi-agent general-sum Markov games with nonlinear function ap...

0 Chengzhuo Ni, et al. ∙

research

∙ 10/03/2022

Offline Reinforcement Learning with Differentiable Function Approximation is Provably Efficient

Offline reinforcement learning, which aims at optimizing sequential deci...

0 Ming Yin, et al. ∙

research

∙ 06/29/2022

Provably Efficient Reinforcement Learning for Online Adaptive Influence Maximization

Online influence maximization aims to maximize the influence spread of a...

3 Kaixuan Huang, et al. ∙

research

∙ 06/22/2022

Decentralized Gossip-Based Stochastic Bilevel Optimization over Communication Networks

Bilevel optimization have gained growing interests, with numerous applic...

0 Shuoguang Yang, et al. ∙

research

∙ 06/10/2022

Offline Stochastic Shortest Path: Learning, Evaluation and Towards Optimality

Goal-oriented Reinforcement Learning, where the agent needs to reach the...

0 Ming Yin, et al. ∙

research

∙ 06/10/2022

Communication Efficient Distributed Learning for Kernelized Contextual Bandits

We tackle the communication efficiency challenge of learning kernelized ...

0 Chuanhao Li, et al. ∙

research

∙ 06/06/2022

Sample Complexity of Nonparametric Off-Policy Evaluation on Low-Dimensional Manifolds using Deep Networks

We consider the off-policy evaluation problem of reinforcement learning ...

0 Xiang Ji, et al. ∙

research

∙ 06/05/2022

Bandit Theory and Thompson Sampling-Guided Directed Evolution for Sequence Optimization

Directed Evolution (DE), a landmark wet-lab method originated in 1960s, ...

0 Hui Yuan, et al. ∙

research

∙ 06/01/2022

Byzantine-Robust Online and Offline Distributed Reinforcement Learning

We consider a distributed reinforcement learning setting where multiple ...

0 Yiding Chen, et al. ∙

research

∙ 05/29/2022

Provable Benefits of Representational Transfer in Reinforcement Learning

We study the problem of representational transfer in RL, where an agent ...

7 Alekh Agarwal, et al. ∙

research

∙ 05/23/2022

Parameter-Efficient Sparsity for Large Language Models Fine-Tuning

With the dramatically increased number of parameters in language models,...

0 Yuchao Li, et al. ∙

research

∙ 03/11/2022

Near-optimal Offline Reinforcement Learning with Linear Representation: Leveraging Variance Information with Pessimism

Offline reinforcement learning, which seeks to utilize offline/historica...

0 Ming Yin, et al. ∙

research

∙ 02/10/2022

Off-Policy Fitted Q-Evaluation with Differentiable Function Approximators: Z-Estimation and Inference Theory

Off-Policy Evaluation (OPE) serves as one of the cornerstones in Reinfor...

0 Ruiqi Zhang, et al. ∙

research

∙ 01/31/2022

Optimal Estimation of Off-Policy Policy Gradient via Double Fitted Iteration

Policy gradient (PG) estimation becomes a challenge when we are not allo...

0 Chengzhuo Ni, et al. ∙

research

∙ 01/31/2022

Efficient Reinforcement Learning in Block MDPs: A Model-free Representation Learning Approach

We present BRIEE (Block-structured Representation learning with Interlea...

10 Xuezhou Zhang, et al. ∙

research

∙ 09/24/2021

Optimal policy evaluation using kernel-based temporal difference methods

We study methods based on reproducing kernel Hilbert spaces for estimati...

1 Yaqi Duan, et al. ∙

research

∙ 07/16/2021

Boosting the Convergence of Reinforcement Learning-based Auto-pruning Using Historical Data

Recently, neural network compression schemes like channel pruning have b...

9 Jiandong Mu, et al. ∙

research

∙ 06/15/2021

On the Sample Complexity and Metastability of Heavy-tailed Policy Search in Continuous Control

Reinforcement learning is a framework for interactive decision-making wi...

0 Amrit Singh Bedi, et al. ∙

research

∙ 06/04/2021

You Only Compress Once: Towards Effective and Elastic BERT Compression via Exploit-Explore Stochastic Nature Gradient

Despite superior performance on various natural language processing task...

7 Shaokun Zhang, et al. ∙

research

∙ 05/31/2021

1×N Block Pattern for Network Sparsity

Though network sparsity emerges as a promising direction to overcome the...

8 Mingbao Lin, et al. ∙

research

∙ 05/29/2021

MARL with General Utilities via Decentralized Shadow Reward Actor-Critic

We posit a new mechanism for cooperation in multi-agent reinforcement le...

0 Junyu Zhang, et al. ∙

research

∙ 05/24/2021

Towards Compact CNNs via Collaborative Compression

Channel pruning and tensor decomposition have received extensive attenti...

0 Yuchao Li, et al. ∙

research

∙ 05/17/2021

Thin-Film Smoothed Particle Hydrodynamics Fluid

We propose a particle-based method to simulate thin-film fluid that join...

0 Mengdi Wang, et al. ∙

research

∙ 05/03/2021

Learning Good State and Action Representations via Tensor Decomposition

The transition kernel of a continuous-state-action Markov decision proce...

8 Chengzhuo Ni, et al. ∙

research

∙ 02/17/2021

On the Convergence and Sample Efficiency of Variance-Reduced Policy Gradient Method

Policy gradient gives rise to a rich class of reinforcement learning (RL...

0 Junyu Zhang, et al. ∙

research

∙ 02/06/2021

Bootstrapping Statistical Inference for Off-Policy Evaluation

Bootstrapping provides a flexible and effective approach for assessing t...

0 Botao Hao, et al. ∙

research

∙ 11/09/2020

On Function Approximation in Reinforcement Learning: Optimism in the Face of Large State Spaces

The classical theory of reinforcement learning (RL) has focused on tabul...

9 Zhuoran Yang, et al. ∙

research

∙ 11/08/2020

High-Dimensional Sparse Linear Bandits

Stochastic linear bandits with high-dimensional sparse features are a pr...

0 Botao Hao, et al. ∙

