research ∙ 10/20/2022
Global Convergence of Direct Policy Search for State-Feedback ℋ_∞ Robust Control: A Revisit of Nonsmooth Synthesis with Goldstein Subdifferential
Direct policy search has been widely applied in modern reinforcement lea...
research ∙ 04/20/2022
Exact Formulas for Finite-Time Estimation Errors of Decentralized Temporal Difference Learning with Linear Function Approximation
In this paper, we consider the policy evaluation problem in multi-agent ...
research ∙ 02/14/2022