Chengzhuo Ni | DeepAI

Featured Co-authors

Csaba Szepesvari
116 publications
Mengdi Wang
91 publications
Lin F. Yang
65 publications
Chi Jin
61 publications
Minshuo Chen
31 publications
Xuezhou Zhang
28 publications
Xiang Ji
27 publications
Hui Yuan
25 publications
Huazheng Wang
22 publications
Anru Zhang
20 publications
Junyu Zhang
18 publications

research

∙ 07/13/2023

Reward-Directed Conditional Diffusion: Provable Distribution Estimation and Reward Improvement

We explore the methodology and theory of reward-directed generation via ...

Hui Yuan, et al.

research

∙ 10/30/2022

Representation Learning for General-sum Low-rank Markov Games

We study multi-agent general-sum Markov games with nonlinear function ap...

Chengzhuo Ni, et al.

research

∙ 06/05/2022

Bandit Theory and Thompson Sampling-Guided Directed Evolution for Sequence Optimization

Directed Evolution (DE), a landmark wet-lab method that originated in the 1960s, ...

Hui Yuan, et al.

research

∙ 02/10/2022

Off-Policy Fitted Q-Evaluation with Differentiable Function Approximators: Z-Estimation and Inference Theory

Off-Policy Evaluation (OPE) serves as one of the cornerstones in Reinfor...

Ruiqi Zhang, et al.

research

∙ 01/31/2022

Optimal Estimation of Off-Policy Policy Gradient via Double Fitted Iteration

Policy gradient (PG) estimation becomes a challenge when we are not allo...

Chengzhuo Ni, et al.

research

∙ 05/03/2021

Learning Good State and Action Representations via Tensor Decomposition

The transition kernel of a continuous-state-action Markov decision proce...

Chengzhuo Ni, et al.

research

∙ 02/17/2021

On the Convergence and Sample Efficiency of Variance-Reduced Policy Gradient Method

Policy gradient gives rise to a rich class of reinforcement learning (RL...

Junyu Zhang, et al.

research

∙ 05/05/2019

Learning to Control in Metric Space with Optimal Regret

We study online reinforcement learning for finite-horizon deterministic ...

Lin F. Yang, et al.
