We explore the methodology and theory of reward-directed generation via
...
We study multi-agent general-sum Markov games with nonlinear function
ap...
Directed Evolution (DE), a landmark wet-lab method originating in the 1960s,
...
Off-Policy Evaluation (OPE) serves as one of the cornerstones in
Reinfor...
Policy gradient (PG) estimation becomes a challenge when we are not allo...
The transition kernel of a continuous-state-action Markov decision proce...
Policy gradient gives rise to a rich class of reinforcement learning (RL...
We study online reinforcement learning for finite-horizon deterministic
...