- Model-Free Algorithm and Regret Analysis for MDPs with Long-Term Constraints
In the optimization of dynamical systems, the variables typically have c...
- Model-Free Algorithm and Regret Analysis for MDPs with Peak Constraints
In the optimization of dynamic systems, the variables typically have con...
- Escaping Saddle Points for Zeroth-order Nonconvex Optimization using Estimated Gradient Descent
Gradient descent and its variants are widely used in machine learning. H...

Qinbo Bai