High efficiency rl agent
Now a day, model free algorithm achieve state of art performance on many RL problems, but the low efficiency of model free algorithm limited the usage. We combine model base RL, soft actor-critic framework, and curiosity. proposed an agent called RMC, giving a promise way to achieve good performance while maintain data efficiency. We suppress the performance of SAC and achieve state of the art performance, both on efficiency and stability. Meanwhile we can solving POMDP problem and achieve great generalization from MDP to POMDP.
READ FULL TEXT