A Deep Reinforcement Learning Approach for Constrained Online Logistics Route Assignment

by Hao Zeng, et al.

As online shopping grows and e-commerce platforms emerge, a tremendous number of parcels are transported every day. It is therefore crucial for the logistics industry to assign each parcel a suitable route from its candidate logistics routes, since this choice strongly affects both the total logistics cost and the satisfaction of business constraints such as transit hub capacity and the share of deliveries assigned to each delivery provider. This online route-assignment problem can be viewed as a constrained online decision-making problem. Notably, the large number (beyond 10^5) of daily parcels, together with the variability and non-Markovian characteristics of parcel information, makes it difficult to attain a (near-)optimal solution without violating constraints excessively. In this paper, we develop a model-free deep reinforcement learning (DRL) approach named PPO-RA, in which Proximal Policy Optimization (PPO) is improved with dedicated techniques to address the challenges of route assignment (RA). The actor and critic networks use an attention mechanism and parameter sharing to accommodate incoming parcels whose candidate routes vary in number and identity. They do not model the non-Markovian parcel arrival dynamics, since we assume i.i.d. parcel arrivals. We evaluate PPO-RA on recorded delivery parcel data, comparing it with widely used baselines via simulation. The results show that the proposed approach achieves considerable cost savings while satisfying most constraints.
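To illustrate how attention with shared parameters can handle a varying number of candidate routes, the following is a minimal NumPy sketch. It is not the PPO-RA architecture; the feature dimensions, the function name `route_probabilities`, and the single scaled dot-product scoring layer are all assumptions made for illustration. The same two projection matrices are reused for every parcel and every route, so the output distribution adapts to however many candidates a parcel has.

```python
import numpy as np

def route_probabilities(parcel, routes, W_q, W_k):
    """Return a probability for each candidate route of one parcel.

    parcel: (d_p,) parcel feature vector (hypothetical features)
    routes: (n, d_r) one feature row per candidate route; n may differ per parcel
    W_q:    (d_p, d) query projection, shared across all parcels
    W_k:    (d_r, d) key projection, shared across all routes
    """
    q = parcel @ W_q                       # (d,)  query from the parcel
    k = routes @ W_k                       # (n, d) one key per candidate route
    logits = k @ q / np.sqrt(q.shape[0])   # (n,)  scaled dot-product scores
    logits = logits - logits.max()         # shift for a numerically stable softmax
    weights = np.exp(logits)
    return weights / weights.sum()

# Illustrative dimensions and random parameters (assumptions, not from the paper).
rng = np.random.default_rng(0)
d_p, d_r, d = 4, 3, 8
W_q = rng.normal(size=(d_p, d))
W_k = rng.normal(size=(d_r, d))

parcel = rng.normal(size=d_p)
routes_two = rng.normal(size=(2, d_r))     # a parcel with 2 candidate routes
routes_five = rng.normal(size=(5, d_r))    # a parcel with 5 candidate routes

probs_two = route_probabilities(parcel, routes_two, W_q, W_k)
probs_five = route_probabilities(parcel, routes_five, W_q, W_k)
```

Because each route is scored independently with the same weights, reordering the candidate routes simply reorders the probabilities. An actor built this way would not depend on a fixed route indexing.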


