Massively Scalable Inverse Reinforcement Learning in Google Maps

05/18/2023
by Matt Barnes, et al.

Optimizing for humans' latent preferences is a grand challenge in route recommendation, where globally scalable solutions remain an open problem. Although past work created increasingly general solutions for the application of inverse reinforcement learning (IRL), these have not been successfully scaled to world-sized MDPs, large datasets, and highly parameterized models, with hundreds of millions of states, trajectories, and parameters, respectively. In this work, we surpass previous limitations through a series of advancements focused on graph compression, parallelization, and problem initialization based on dominant eigenvectors. We introduce Receding Horizon Inverse Planning (RHIP), which generalizes existing work and enables control of key performance trade-offs via its planning horizon. Our policy achieves a 16-24% improvement in global route quality and, to our knowledge, represents the largest instance of IRL in a real-world setting to date. Our results show critical benefits to more sustainable modes of transportation (e.g. two-wheelers), where factors beyond journey time (e.g. route safety) play a substantial role. We conclude with ablations of key components, negative results on state-of-the-art eigenvalue solvers, and future opportunities to improve scalability via IRL-specific batching strategies.

research
02/04/2019

Bayesian method for evaluation an airline profitability on the base components of Airline Route Planning

Airline route planning takes into account the factors of commercial and ...
research
06/18/2022

Deep Inverse Reinforcement Learning for Route Choice Modeling

Route choice modeling, i.e., the process of estimating the likely path t...
research
07/13/2023

On the Effective Horizon of Inverse Reinforcement Learning

Inverse reinforcement learning (IRL) algorithms often rely on (forward) ...
research
06/09/2022

Receding Horizon Inverse Reinforcement Learning

Inverse reinforcement learning (IRL) seeks to infer a cost function that...
research
07/11/2020

Polestar: An Intelligent, Efficient and National-Wide Public Transportation Routing Engine

Public transportation plays a critical role in people's daily life. It h...
research
10/09/2020

Scalable Many-Objective Pathfinding Benchmark Suite

Route planning also known as pathfinding is one of the key elements in l...
