Dynamic optimization of mean and variance in Markov decision processes (...
CVaR (Conditional Value at Risk) is a risk metric widely used in finance...
Among the reasons hindering reinforcement learning (RL) applications to
...
We present a novel method to estimate the dominant eigenvalue and eigenv...
Keeping risk under control is often more crucial than maximizing expecte...
This paper studies risk-averse mean-variance optimization in
infinit...
Most reinforcement learning algorithms optimize the discounted criter...
The semi-Markov model is one of the most general models for stochastic dynam...
We study a finite-horizon two-person zero-sum risk-sensitive stochastic ...
This paper investigates the optimization problem of an infinite-stage
di...
The option framework has shown great promise by automatically extracting...
Search in social networks such as Facebook poses different challenges th...
Generative adversarial imitation learning (GAIL) has provided an
adv...
Most reinforcement learning (RL) algorithms aim at maximizing the
exp...
We study an offline multi-action policy learning algorithm based on doub...
Markov decision processes (MDPs) in queues and networks have been an
int...
In this paper, we use a Markov decision process to find optimal asynchro...
By analyzing energy-efficient management of data centers, this paper pro...