MOVI: A Model-Free Approach to Dynamic Fleet Management

by Takuma Oda, et al.
Carnegie Mellon University

Modern vehicle fleets, e.g., for ridesharing platforms and taxi companies, can reduce passengers' waiting times by proactively dispatching vehicles to locations where pickup requests are anticipated. Yet it is unclear how best to do this: optimal dispatching requires optimizing over several sources of uncertainty, including vehicles' travel times to their dispatched locations, as well as coordinating between vehicles so that they do not attempt to pick up the same passenger. While prior works have developed models for this uncertainty and used them to optimize dispatch policies, in this work we introduce a model-free approach. Specifically, we propose MOVI, a Deep Q-network (DQN)-based framework that directly learns the optimal vehicle dispatch policy. Since DQNs scale poorly with a large number of possible dispatches, we streamline our DQN training and suppose that each individual vehicle independently learns its own optimal policy, ensuring scalability at the cost of less coordination between vehicles. We then formulate a centralized receding-horizon control (RHC) policy to compare with our DQN policies. To compare these policies, we design and build MOVI as a large-scale realistic simulator based on 15 million taxi trip records that simulates policy-agnostic responses to dispatch decisions. We show that the DQN dispatch policy reduces the number of unserviced requests by 76% compared to the RHC approach, emphasizing the benefits of a model-free approach and suggesting that there is limited value to coordinating vehicle actions. This finding may help to explain the success of ridesharing platforms, for which drivers make individual decisions.
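To make the idea of each vehicle independently learning its own dispatch policy concrete, the following is a minimal sketch only. It uses tabular Q-learning as a simplified stand-in for the DQN described in the abstract, and it is not the MOVI implementation. The zone indexing, the `demand` pickup probabilities, the travel cost of 0.1 per zone of distance, and all function names are hypothetical choices made for illustration.

```python
import random


def q_update(q_values, zone, action, reward, next_zone, alpha=0.1, gamma=0.9):
    """One Q-learning step: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    target = reward + gamma * max(q_values[next_zone])
    q_values[zone][action] += alpha * (target - q_values[zone][action])
    return q_values[zone][action]


def greedy_dispatch(q_values, vehicle_zones):
    """Each vehicle independently picks the destination zone with the highest Q-value.

    No coordination happens here: two vehicles in the same zone make the same choice.
    """
    return [
        max(range(len(q_values[zone])), key=lambda a: q_values[zone][a])
        for zone in vehicle_zones
    ]


def train(demand, episodes=2000, epsilon=0.2, seed=0):
    """Learn a zone-to-zone dispatch table from simulated pickups (toy environment).

    demand[z] is an assumed probability that a vehicle arriving in zone z finds a rider.
    """
    rng = random.Random(seed)
    n = len(demand)
    q_values = [[0.0] * n for _ in range(n)]
    for _ in range(episodes):
        zone = rng.randrange(n)
        # Epsilon-greedy exploration over destination zones.
        if rng.random() < epsilon:
            action = rng.randrange(n)
        else:
            action = max(range(n), key=lambda a: q_values[zone][a])
        # Reward: 1 for a pickup, minus a hypothetical travel cost per zone of distance.
        reward = 1.0 if rng.random() < demand[action] else 0.0
        reward -= 0.1 * abs(action - zone)
        # The vehicle ends up in the zone it was dispatched to.
        q_update(q_values, zone, action, reward, next_zone=action)
    return q_values


if __name__ == "__main__":
    table = train(demand=[0.1, 0.9, 0.1])
    print(greedy_dispatch(table, vehicle_zones=[0, 1, 2]))
```

Because every vehicle reads the same table and acts alone, the number of actions per decision stays equal to the number of zones rather than growing with fleet size, which is the scalability trade-off the abstract describes. A DQN would replace the table with a neural network over a richer state.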

