A Deep Reinforcement Learning Approach for Constrained Online Logistics Route Assignment

by Hao Zeng, et al.

As online shopping prevails and e-commerce platforms emerge, a tremendous number of parcels are transported every day. It is therefore crucial for the logistics industry to assign a suitable candidate logistics route to each shipping parcel, because this assignment has a significant impact on total logistics cost and on the satisfaction of business constraints such as transit hub capacity and the delivery proportion of each delivery provider. This online route-assignment problem can be viewed as a constrained online decision-making problem. Notably, the large number of daily parcels (beyond 10^5) and the variability and non-Markovian characteristics of parcel information make it difficult to attain a (near-)optimal solution without violating constraints excessively. In this paper, we develop a model-free DRL approach named PPO-RA, in which Proximal Policy Optimization (PPO) is improved with dedicated techniques to address the challenges of route assignment (RA). The actor and critic networks use an attention mechanism and parameter sharing to accommodate incoming parcels whose candidate routes vary in number and identity. The networks do not model the non-Markovian parcel arrival dynamics, since we assume i.i.d. parcel arrivals. We evaluate PPO-RA on recorded delivery parcel data by comparing it with widely used baselines in simulation. The results show that the proposed approach can achieve considerable cost savings while satisfying most constraints.
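To illustrate how attention with shared parameters can handle a varying number of candidate routes, here is a minimal sketch. It is not the PPO-RA architecture, which the abstract does not specify. The feature sizes, the weight matrices `W_q` and `W_k`, and the function `route_probabilities` are all hypothetical names chosen for illustration. The idea shown is only that one set of weights scores every route, so the same policy head applies whether a parcel has two candidates or five.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical feature sizes (assumptions, not taken from the paper).
D_PARCEL, D_ROUTE, D_KEY = 4, 3, 8

# Shared parameters: the same two matrices are used for every candidate route,
# so the number of parameters does not depend on how many routes a parcel has.
W_q = rng.normal(scale=0.1, size=(D_PARCEL, D_KEY))
W_k = rng.normal(scale=0.1, size=(D_ROUTE, D_KEY))


def route_probabilities(parcel, routes):
    """Return a probability distribution over one parcel's candidate routes.

    parcel: array of shape (D_PARCEL,)
    routes: array of shape (n_routes, D_ROUTE); n_routes may vary per parcel.
    """
    query = parcel @ W_q                      # shape (D_KEY,)
    keys = routes @ W_k                       # shape (n_routes, D_KEY)
    scores = keys @ query / np.sqrt(D_KEY)    # scaled dot-product scores
    scores = scores - scores.max()            # shift for numerical stability
    weights = np.exp(scores)
    return weights / weights.sum()            # softmax over the candidates


parcel = rng.normal(size=D_PARCEL)
routes_two = rng.normal(size=(2, D_ROUTE))
routes_five = rng.normal(size=(5, D_ROUTE))

probs_two = route_probabilities(parcel, routes_two)
probs_five = route_probabilities(parcel, routes_five)
```

Because each route is scored independently by the same weights, reordering the candidate routes reorders the output probabilities in the same way. A constrained variant would additionally need some mechanism for the hub-capacity and provider-proportion constraints, which this sketch leaves out.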