Reinforcement learning from human feedback (RLHF) can improve the qualit...
StarCraft II is one of the most challenging simulated reinforcement lear...
Deep neural networks are the most commonly used function approximators i...
Bayesian neural network (BNN) priors are defined in parameter space, mak...
We introduce a novel apprenticeship learning algorithm to learn an exper...
Much attention has been devoted recently to the development of machine
l...