Dhawal Gupta | DeepAI

Featured Co-authors

Mohammad Ghavamzadeh
73 publications
Martha White
65 publications
Craig Boutilier
59 publications
Mohit Iyyer
57 publications
Adam White
37 publications
Philip S. Thomas
36 publications
Yinlam Chow
31 publications
Yash Chandak
21 publications
Georgios Theocharous
17 publications
Simeng Sun
14 publications
Sina Ghiassian
13 publications

research

∙ 09/16/2023

Exploring the impact of low-rank adaptation on the performance, efficiency, and regularization of RLHF

During the last stage of RLHF, a large language model is aligned to huma...

Simeng Sun, et al.

research

∙ 05/16/2023

Coagent Networks: Generalized and Scaled

Coagent networks for reinforcement learning (RL) [Thomas and Barto, 2011...

James E. Kostas, et al.

research

∙ 02/21/2023

Offline Reinforcement Learning for Mixture-of-Expert Dialogue Management

Reinforcement learning (RL) has shown great promise for developing dialo...

Dhawal Gupta, et al.

research

∙ 07/01/2020

Gradient Temporal-Difference Learning with Regularized Corrections

It is still common to use Q-learning and temporal difference (TD) learni...

Sina Ghiassian, et al.
