We revisit the domain of off-policy policy optimization in RL from the p...
Many real-world continuous control problems are in the dilemma of weighi...
This paper presents a reinforcement learning (RL) framework that utilize...
Reward-biased maximum likelihood estimation (RBMLE) is a classic princip...
Policy optimization is a fundamental principle for designing reinforceme...
The Whittle index policy is a powerful tool to obtain asymptotically optimal...
Bayesian optimization (BO) conventionally relies on handcrafted acquisit...
Action-constrained reinforcement learning (RL) is a widely used approach...
Modifying the reward-biased maximum likelihood method originally propose...
With the explosive growth of online products and content, recommendation...
This paper presents a Brownian-approximation framework to optimize the q...
We propose BMLE, a new family of bandit algorithms, that are formulated ...
Recently, considerable research attention has been paid to network embed...
Although shown to be useful in many areas as models for solving sequenti...
We study the scheduling policies for asymptotically optimal delay in queu...