Scalable Deep Reinforcement Learning for Ride-Hailing

09/27/2020
by Jiekun Feng, et al.

Ride-hailing services, such as Didi Chuxing, Lyft, and Uber, arrange thousands of cars to meet ride requests throughout the day. We consider a Markov decision process (MDP) model of a ride-hailing service system and frame it as a reinforcement learning (RL) problem. The simultaneous control of many agents (cars) presents a challenge for the MDP optimization because the action space grows exponentially with the number of cars. We propose a special decomposition of the MDP actions in which tasks are assigned to the drivers sequentially. The new action structure resolves the scalability problem and enables the use of deep RL algorithms for control policy optimization. We demonstrate the benefit of our proposed decomposition with a numerical experiment based on real data from Didi Chuxing.
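
To make the scalability argument concrete, here is a minimal sketch in Python. It is not the authors' implementation: the function names, the per-car task count, and the `choose_task` callback are all hypothetical, and it assumes for illustration that every car can be given one of the same `num_tasks` tasks. It contrasts the size of the joint action space with a sequential scheme that makes one small decision per car.

```python
from typing import Callable, Dict, Hashable, Iterable


def joint_action_count(num_cars: int, num_tasks: int) -> int:
    """Number of joint actions if every car independently gets one of num_tasks tasks."""
    return num_tasks ** num_cars


def sequential_assign(
    car_ids: Iterable[Hashable],
    choose_task: Callable[[Hashable, Dict[Hashable, int]], int],
) -> Dict[Hashable, int]:
    """Assign tasks one car at a time.

    choose_task is a hypothetical stand-in for a learned policy: it receives the
    current car and a copy of the assignments made so far, and returns a task index.
    Each call picks among num_tasks options, so the number of policy evaluations
    grows linearly with the number of cars.
    """
    assignment: Dict[Hashable, int] = {}
    for car in car_ids:
        assignment[car] = choose_task(car, dict(assignment))
    return assignment


if __name__ == "__main__":
    # Illustrative numbers only (not taken from the paper).
    print(joint_action_count(3, 4))  # 4 * 4 * 4 = 64 joint actions

    # A trivial placeholder policy: always pick task 0.
    print(sequential_assign(["car_a", "car_b", "car_c"], lambda car, partial: 0))
```

In this sketch the policy never enumerates the joint space: it is queried once per car, and each query can see the assignments already made.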

research
09/09/2019

Solving Continual Combinatorial Selection via Deep Reinforcement Learning

We consider the Markov Decision Process (MDP) of selecting a subset of i...
research
12/03/2015

Deep Reinforcement Learning with Attention for Slate Markov Decision Processes with High-Dimensional States and Actions

Many real-world problems come with action spaces represented as feature ...
research
09/27/2022

Design of experiments for the calibration of history-dependent models via deep reinforcement learning and an enhanced Kalman filter

Experimental data is costly to obtain, which makes it difficult to calib...
research
02/15/2021

How RL Agents Behave When Their Actions Are Modified

Reinforcement learning in complex environments may require supervision t...
research
06/09/2022

An Optimization Method-Assisted Ensemble Deep Reinforcement Learning Algorithm to Solve Unit Commitment Problems

Unit commitment (UC) is a fundamental problem in the day-ahead electrici...
research
06/30/2020

MDP Homomorphic Networks: Group Symmetries in Reinforcement Learning

This paper introduces MDP homomorphic networks for deep reinforcement le...
research
10/30/2021

Adjacency constraint for efficient hierarchical reinforcement learning

Goal-conditioned Hierarchical Reinforcement Learning (HRL) is a promisin...
