A Deep Value-network Based Approach for Multi-Driver Order Dispatching

by Xiaocheng Tang, et al.

Recent work on ride-sharing order dispatching has highlighted the importance of accounting for both spatial and temporal dynamics in the dispatching process to improve transportation system efficiency. At the same time, deep reinforcement learning has advanced to the point where it achieves superhuman performance in a number of fields. In this work, we propose a deep reinforcement learning based solution for order dispatching, and we conduct large-scale online A/B tests on DiDi's ride-dispatching platform to show that the proposed method achieves significant improvement on both total driver income and user experience related metrics. In particular, we model the ride dispatching problem as a Semi-Markov Decision Process to account for the temporal aspect of the dispatching actions. To improve the stability of value iteration with nonlinear function approximators such as neural networks, we propose Cerebellar Value Networks (CVNet) with a novel distributed state representation layer. We further derive a regularized policy evaluation scheme for CVNet that penalizes a large Lipschitz constant of the value network, for additional robustness against adversarial perturbations and noise. Finally, we adapt various transfer learning methods to CVNet for increased learning adaptability and efficiency across multiple cities. We conduct extensive offline simulations based on real dispatching data as well as online A/B tests on DiDi's platform. Results show that CVNet consistently outperforms other recently proposed dispatching methods. We finally show that performance can be further improved through the efficient use of transfer learning.
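To make the Semi-Markov Decision Process idea more concrete, here is a minimal tabular sketch in Python of a temporal-difference update in which an action (for example, serving a trip) lasts several time steps. It is an illustration only and is not the CVNet method. The function names, the dictionary-based value table, and the choice to spread the reward evenly over the action's duration are all assumptions made for this example; the abstract does not specify any of them.

```python
def smdp_td_target(reward, duration, next_value, gamma=0.9):
    """TD target for an action that lasts `duration` time steps.

    Assumption (not stated in the abstract): the reward is spread evenly
    over the duration, and each slice is discounted by its own time step.
    """
    if duration < 1:
        raise ValueError("duration must be at least 1 time step")
    per_step = reward / duration
    # Discounted sum of the evenly spread reward slices.
    discounted_reward = per_step * sum(gamma ** t for t in range(duration))
    # The successor state's value is discounted by the whole duration.
    return discounted_reward + gamma ** duration * next_value


def td_update(values, state, next_state, reward, duration, alpha=0.1, gamma=0.9):
    """One tabular TD(0) step on a dict-based value table (hypothetical helper)."""
    target = smdp_td_target(reward, duration, values.get(next_state, 0.0), gamma)
    old = values.get(state, 0.0)
    values[state] = old + alpha * (target - old)
    return values[state]


if __name__ == "__main__":
    table = {}
    # A hypothetical trip from zone "A" to zone "B" paying 10 and lasting 3 steps.
    td_update(table, "A", "B", reward=10.0, duration=3)
    print(table)
```

The point of the sketch is that a longer action is discounted more heavily, both in its reward and in the value of the state it ends in, which is how a semi-Markov formulation can capture the temporal aspect of dispatching decisions.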







Value Function is All You Need: A Unified Learning Framework for Ride Hailing Platforms

Large ride-hailing platforms, such as DiDi, Uber and Lyft, connect tens ...

DRAG: Deep Reinforcement Learning Based Base Station Activation in Heterogeneous Networks

Heterogeneous Network (HetNet), where Small cell Base Stations (SBSs) ar...

Secure Your Ride: Real-time Matching Success Rate Prediction for Passenger-Driver Pairs

In recent years, online ride-hailing platforms have become an indispensa...

Two-stage Deep Reinforcement Learning for Inverter-based Volt-VAR Control in Active Distribution Networks

Model-based Volt/VAR optimization method is widely used to eliminate volt...

Knowledge Transfer in Deep Reinforcement Learning for Slice-Aware Mobility Robustness Optimization

The legacy mobility robustness optimization (MRO) in self-organizing net...

Rebalancing Dockless Bike Sharing Systems

Bike sharing provides an environment-friendly way for traveling and is b...

Pattern Transfer Learning for Reinforcement Learning in Order Dispatching

Order dispatch is one of the central problems to ride-sharing platforms....