research ∙ 10/20/2022
Global Convergence of Direct Policy Search for State-Feedback ℋ_∞ Robust Control: A Revisit of Nonsmooth Synthesis with Goldstein Subdifferential
Direct policy search has been widely applied in modern reinforcement lea...
          
research ∙ 04/20/2022
Exact Formulas for Finite-Time Estimation Errors of Decentralized Temporal Difference Learning with Linear Function Approximation
In this paper, we consider the policy evaluation problem in multi-agent ...
          
research ∙ 02/14/2022
     
             
  
  
     