Learning Reward Models for Cooperative Trajectory Planning with Inverse Reinforcement Learning and Monte Carlo Tree Search

02/14/2022
by Karl Kurzer, et al.

Cooperative trajectory planning methods for automated vehicles are capable of solving traffic scenarios that require a high degree of cooperation between traffic participants. For cooperative systems to integrate into human-centered traffic, it is important that the automated systems behave in a human-like manner, so that humans can anticipate the systems' decisions. While Reinforcement Learning has made remarkable progress in solving the decision-making part, it is non-trivial to parameterize a reward model that yields predictable actions. This work employs feature-based Maximum Entropy Inverse Reinforcement Learning in combination with Monte Carlo Tree Search to learn reward models that maximize the likelihood of recorded multi-agent cooperative expert trajectories. The evaluation demonstrates that the approach is capable of recovering a reasonable reward model that mimics the expert and performs similarly to a manually tuned baseline reward model.
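To make the learning objective concrete, here is a minimal sketch of a feature-based Maximum Entropy IRL update. It is not the authors' implementation. It assumes a reward that is linear in trajectory features, and it assumes the planner's sampled trajectories (standing in here for Monte Carlo Tree Search samples) are drawn uniformly, so a softmax over their rewards approximates the trajectory distribution. All names (`maxent_gradient`, `fit_reward`, the feature vectors) are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def maxent_gradient(theta, expert_features, sampled_features):
    """Gradient of the expert log-likelihood under a linear reward.

    Equals the expert feature vector minus the expected feature vector
    under p(trajectory) proportional to exp(theta . features), estimated
    on the sampled trajectories (assumption: uniform sampling).
    """
    scores = [sum(t * f for t, f in zip(theta, feats))
              for feats in sampled_features]
    weights = softmax(scores)
    expected = [sum(w * feats[k] for w, feats in zip(weights, sampled_features))
                for k in range(len(theta))]
    return [e - x for e, x in zip(expert_features, expected)]


def fit_reward(expert_features, sampled_features, lr=0.1, steps=200):
    """Plain gradient ascent on the reward weights theta."""
    theta = [0.0] * len(expert_features)
    for _ in range(steps):
        grad = maxent_gradient(theta, expert_features, sampled_features)
        theta = [t + lr * g for t, g in zip(theta, grad)]
    return theta


# Hypothetical two-feature example: three sampled trajectories and the
# mean feature vector of the expert demonstrations.
samples = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
expert = [0.8, 0.2]
theta = fit_reward(expert, samples)
```

Because the expert's features lean toward the first feature, the learned weight on that feature ends up larger than the weight on the second, and the gradient shrinks as the expected features approach the expert's.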


research
09/10/2018

Decentralized Cooperative Planning for Automated Vehicles with Continuous Monte Carlo Tree Search

Urban traffic scenarios often require a high degree of cooperation betwe...
research
03/09/2022

Cooperative Trajectory Planning in Uncertain Environments with Monte Carlo Tree Search and Risk Metrics

Automated vehicles require the ability to cooperate with humans for smoo...
research
07/25/2018

Decentralized Cooperative Planning for Automated Vehicles with Hierarchical Monte Carlo Tree Search

Today's automated vehicles lack the ability to cooperate implicitly with...
research
02/02/2020

Accelerating Cooperative Planning for Automated Vehicles with Learned Heuristics and Monte Carlo Tree Search

Efficient driving in urban traffic scenarios requires foresight. The obs...
research
03/25/2021

MCTSteg: A Monte Carlo Tree Search-based Reinforcement Learning Framework for Universal Non-additive Steganography

Recent research has shown that non-additive image steganographic framewo...
research
08/24/2023

TrafficMCTS: A Closed-Loop Traffic Flow Generation Framework with Group-Based Monte Carlo Tree Search

Digital twins for intelligent transportation systems are currently attra...
research
01/18/2010

A Monte Carlo Algorithm for Universally Optimal Bayesian Sequence Prediction and Planning

The aim of this work is to address the question of whether we can in pri...
