Automating Vehicles by Deep Reinforcement Learning using Task Separation with Hill Climbing
Within the context of autonomous vehicles, classical model-based control methods suffer from a trade-off between model complexity and the computational burden of solving expensive optimization or search problems online at every short sampling time. Such methods include sampling-based algorithms, lattice-based algorithms, and algorithms based on model predictive control (MPC). Recently, end-to-end trained deep neural networks have been proposed to map camera images directly to steering control. These approaches, however, dismiss a priori decades of vehicle dynamics modeling experience that could be leveraged for control design. In this paper, a model-based reinforcement learning (RL) method is proposed for training feedforward controllers in the context of autonomous driving. The fundamental philosophy is to train offline on arbitrarily sophisticated models while evaluating only a cheap feedforward controller online, thereby avoiding the need for online optimization. The contributions are, first, a discussion of two closed-loop control architectures and, second, a simple gradient-free algorithm for deep reinforcement learning using task separation with hill climbing (TSHC). To this end, (a) simultaneous training on separate deterministic tasks, with the purpose of encoding motion primitives in a neural network, and (b) maximally sparse rewards in combination with virtual actuator constraints on velocity in setpoint proximity are advocated. For the feedforward controller parametrization, both fully connected (FC) and recurrent neural networks (RNNs) are used.
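To make the training idea concrete, the following is a minimal sketch of gradient-free hill climbing with task separation, in the spirit of the abstract's description. It is not the paper's TSHC algorithm itself: the function names, the single-candidate perturbation scheme, and the `rollout` interface are illustrative assumptions. One network is scored on all deterministic tasks jointly, so an accepted parameter update must not sacrifice any already-solved motion primitive.

```python
import numpy as np

def hill_climb_task_separation(params, tasks, rollout, sigma=0.05, iters=1000):
    """Illustrative gradient-free hill climbing over a flat weight vector.

    params:  initial flat parameter vector of the feedforward controller
             (FC or RNN weights, flattened).
    tasks:   list of separate deterministic training tasks (motion primitives).
    rollout: user-supplied simulator, rollout(params, task) -> scalar return;
             assumed to implement a maximally sparse reward (e.g., reward only
             on reaching the setpoint) and to clamp velocity in setpoint
             proximity as a virtual actuator constraint.
    """
    best = params.copy()
    # Sum returns over ALL tasks so a single network must encode every primitive.
    best_score = sum(rollout(best, task) for task in tasks)
    for _ in range(iters):
        # Random parameter perturbation: the hill-climbing candidate.
        candidate = best + sigma * np.random.randn(best.size)
        score = sum(rollout(candidate, task) for task in tasks)
        if score > best_score:  # keep only strict improvements
            best, best_score = candidate, score
    return best
```

Because the search is gradient-free, the offline model used inside `rollout` may be arbitrarily sophisticated (nondifferentiable, hybrid, or lookup-table based); online, only the trained controller is evaluated.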