Intelligent Warehouse Allocator for Optimal Regional Utilization

07/09/2020
by   Girish Sathyanarayana, et al.
Myntra

In this paper, we describe a novel solution to compute optimal warehouse allocations for fashion inventory. Procured inventory must be optimally allocated to warehouses in proportion to the regional demand around each warehouse. This ensures that demand is fulfilled by the nearest warehouse, thereby minimizing delivery logistics cost and delivery times. These are key metrics that drive profitability and customer experience respectively. Warehouses have capacity constraints, and allocations must minimize the inter-warehouse redistribution cost of the inventory. This leads to maximum Regional Utilization (RU). We use machine learning and optimization methods to build an efficient solution to this warehouse allocation problem. We use machine learning models to estimate the geographical split of the demand for every product. We use Integer Programming methods to compute the optimal feasible warehouse allocations considering the capacity constraints. We back-test this solution and validate the efficiency of the model by demonstrating a significant uptick in two key metrics: Regional Utilization (RU) and Percentage Two-day-delivery (2DD). We use this process to intelligently create purchase orders with warehouse assignments for Myntra, a leading online fashion retailer.


1. Introduction

Buying and replenishment is a major, year-round activity in large fashion e-commerce firms. Fast-changing trends, seasonality, and just-in-time inventory models make inventory procurement an everyday activity. Hundreds of Purchase Orders (POs) are raised every day to procure styles of different article types from vendors. Once the buying quantity is fixed, typically using a demand forecasting model, the next important task is to distribute the procured inventory optimally between the principal warehouses. Logistics cost is a significant cost component in e-commerce operations, and reducing it by a few percentage points results in cost savings of the order of millions of dollars, which drives companies towards profitability. Optimal warehouse stocking leads to lower logistics costs and faster delivery of products, and hence to improved customer experience. An intelligent solution that can optimally distribute the inventory between warehouses is of great value.

The distribution must mirror the geographical demand split. The quantity of the product (SKU) that gets located in a warehouse must be in proportion to the regional demand of the product. This is the localised demand from the areas in the vicinity of the warehouse. Every postal pincode is mapped to the nearest warehouse. Sets of pincodes mapped to the same warehouse partition the whole region into geographical clusters with an associated unique nearest warehouse. The quantity that gets located in a given warehouse must be in proportion to the demand of the associated cluster. The problem is to determine optimal stocking levels for these warehouses.

Warehouses have capacity constraints and old inventory, both of which must be accounted for while determining the new stocking quantities. The geographical distribution of demand can be different for every SKU.

2. Related Work

Optimal allocation of resources is a well-studied problem across e-commerce and supply chain management. There is literature (stadtler2005supply) for solving problems in production planning, distribution planning, and transport planning using optimization methods. Problems in logistics are modelled using costs such as inventory holding cost, transport cost, etc, and have realistic fixed limits to the variables. (almeder2009simulation) have modelled Supply Chain Networks using a general framework to support the operational decisions using a combination of an optimization model and discrete-event simulation.

Although some variables are linearly relaxed, the addition of integer constraints like knapsack constraints or capacity in volume or number makes the problem partly discrete and hence needs Mixed Integer Programming methods.

There has been work using Integer Programming methods to allocate resources optimally between brick-and-mortar and online stores (bhatnagar2014allocating). (lejeune2008integer) discuss in detail a production and distribution problem with boolean and general integer variables, formulated as an Integer Linear Program. (lee2008mixed) study an inventory planning problem where the procurement of items and their quantity is modelled using a mixed 0/1 integer program.

3. Problem Statement and Solution Outline

The objective is to determine the optimal stocking quantity of each SKU for individual warehouses which will maximise regional utilization. Regional Utilization (RU) is defined as the fraction of total demand which is fulfilled from the nearest warehouse. In cases when warehouses cannot inward the recommended quantities due to capacity constraints, the next nearest warehouse must be prioritised to minimise the cost of redistribution.
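As a minimal sketch of the RU definition above, assume each fulfilled order is recorded as a (pincode, fulfilling warehouse) pair and that a pincode-to-nearest-warehouse lookup is available. The data layout, the function name and the sample values are all hypothetical and serve only to illustrate the ratio.

```python
def regional_utilization(orders, nearest_warehouse):
    """Fraction of orders fulfilled from the warehouse nearest to the customer.

    orders: list of (pincode, fulfilling_warehouse) pairs (assumed layout).
    nearest_warehouse: dict mapping pincode -> nearest warehouse id.
    """
    if not orders:
        return 0.0
    hits = sum(1 for pincode, wh in orders if nearest_warehouse[pincode] == wh)
    return hits / len(orders)


# Hypothetical data: two warehouses and three pincodes.
nearest = {"560001": "W1", "110001": "W2", "400001": "W1"}
orders = [("560001", "W1"), ("110001", "W1"), ("400001", "W1"), ("110001", "W2")]
ru = regional_utilization(orders, nearest)  # three of the four orders are served locally
```

The second order is the only one served from a non-nearest warehouse, which is exactly the kind of fulfilment the allocation is meant to reduce.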

We formalise these requirements by defining a penalty matrix for redistribution costs and expressing the requirements as a constrained optimization problem. Table 1 lists all the variables and notation used in the paper. We can formally state the problem as follows.

3.1. Problem Statement

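Given a Purchase Order (PO) specification, which is just a set of (SKU, quantity) pairs for $M$ unique SKUs, and $K$ warehouses with capacities $C_1, \ldots, C_K$ and existing inventory of the SKUs in the warehouses, compute the optimal warehouse allocations, that is, for every SKU $i$ and warehouse $j$, the number of units of SKU $i$ allocated to warehouse $j$. An additional quantity per SKU denotes the number of units of SKU $i$ not assigned to any warehouse. This is optionally allowed to handle cases where the total SKU quantity is more than the combined capacity of all warehouses. Allocations must satisfy the following constraints.

(1) For every SKU $i$, the quantities allocated to the $K$ warehouses plus the unassigned quantity sum to the order quantity of SKU $i$.
(2) For every warehouse $j$, the total quantity allocated to it across all SKUs does not exceed its capacity $C_j$.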

3.2. Solution Outline

The solution comprises 2 steps.

  1. Ideal Splits Computation: Predicting the ideal split at the SKU level by a split prediction model and generating an ideal warehouse allocation given the order quantity from a demand model for every SKU assuming unlimited capacity in warehouses.

  2. Optimal Feasible Allocations: Finding optimal feasible allocations considering warehouse capacity constraints by defining an allocation penalty matrix and solving the constrained optimization problem.

In the following sections, we will describe the above 2 steps and results in detail.

Symbol   Description
         Purchase Order (PO) specification of size M
         A tuple of SKU i and its quantity
         Quantity of SKU i to be inwarded
M        The total number of unique SKUs in the PO
K        Number of warehouses
         Exploded Purchase Order specification of size N
N        Total quantity of all SKUs in the PO
         Existing inventory matrix
         The existing quantity of SKU i in warehouse j
C        Warehouse capacity vector
C_j      Capacity of warehouse j
         Optimal warehouse allocation matrix
         The quantity of SKU i to be inwarded to warehouse j
         The quantity of SKU i not assigned to any warehouse
         Ideal split probability matrix
         The probability that the purchase event for SKU i is located in the geographic region for which the nearest warehouse is j
I        Ideal split matrix
I_ij     The ideal quantity of SKU i to be inwarded to warehouse j
Y        Binary decision variable matrix (formulation 1)
Y_ij     1 if item i is inwarded to warehouse j, else 0 (formulation 1)
L        Redistribution cost penalty matrix
L_jm     The penalty for placing an item which is ideally assigned to warehouse j in warehouse m
L'       Truncated penalty matrix
         A vector of 1s
Table 1. Notation of Variables and Constants in the paper

4. Ideal Splits Computation

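Given a purchase order specification and $K$ warehouses, we compute ideal splits among warehouses for each SKU $i$:

$$ I_{i1}, I_{i2}, \ldots, I_{iK} \tag{3} $$

We assume unlimited capacity in warehouses. Ideal splits are computed by estimating, for each SKU $i$, the split probabilities defined as follows.

$$ \Pr(\text{the purchase event for SKU } i \text{ is located in the region whose nearest warehouse is } j), \qquad 1 \le j \le K \tag{4} $$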

We estimate these probabilities by learning a classifier that predicts the warehouse probabilities from attributes of the style such as colour, brand, fabric, size, age group, price, sleeve length and neck type. We use around 10 to 12 relevant attributes for each article type and gender group.

Every purchase event on the platform is associated with a postal pin code that is mapped to the nearest warehouse. This is used to create labelled training records on which the classifier is trained. We tried different classifiers: logistic regression, tree-based classifiers and feed-forward neural network classifiers. Table 2 compares the performance of these classifiers. The neural network model outperforms the other classifiers, and we use a 3-layer MLP for each article type and gender combination in our application. Table 3 lists the log loss of this MLP for various article types and genders.

Classifier                        Men's T-shirts   Dresses
Baseline 3-month mean estimate    1.395            1.474
Logistic Regression               1.356            1.392
Random Forest                     1.283            1.317
Neural Network                    1.040            1.062
Table 2. Ideal Split Classifier Log-loss Comparison

Article Type    Gender   Log Loss
Sweaters        Women    0.660428
Jackets         Women    0.701047
Jackets         Men      0.728329
Sweaters        Men      0.781279
Blazers         Men      0.855442
Suits           Men      0.879383
Sweatshirts     Women    0.884996
Casual Shoes    Women    0.924080
Sports Shoes    Women    0.926811
Sweatshirts     Men      0.938906
Sports Shoes    Men      0.955780
Heels           Women    0.984653
Casual Shoes    Unisex   0.986829
Jeans           Women    0.990782
Handbags        Women    0.991535
Jeggings        Women    0.993759
Trolley Bag     Unisex   0.999156
Kurta Sets      Women    1.000210
Trackpants      Men      1.005823
Formal Shoes    Men      1.014115
Table 3. Log Loss for the three-layer multi-layer perceptron for various article types and genders
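To make the log-loss figures in the tables easier to read, the following sketch shows how multiclass log loss is computed from predicted warehouse probabilities. It assumes a hypothetical four-warehouse setup with made-up predictions, and it is not the evaluation code used in the paper. As a reference point, a predictor that guesses uniformly over four classes scores ln 4, about 1.386.

```python
import math


def log_loss(true_labels, predicted_probs, eps=1e-15):
    """Mean negative log-probability assigned to the true class."""
    total = 0.0
    for label, probs in zip(true_labels, predicted_probs):
        p = min(max(probs[label], eps), 1.0)  # clip to avoid log(0)
        total -= math.log(p)
    return total / len(true_labels)


# Hypothetical labels: index of the nearest warehouse for three purchase events.
labels = [0, 2, 3]

# A predictor that spreads probability evenly over four warehouses.
uniform = [[0.25] * 4 for _ in labels]
uniform_loss = log_loss(labels, uniform)

# A sharper (made-up) predictor that puts 0.7 on the true warehouse.
sharp = [[0.7, 0.1, 0.1, 0.1], [0.1, 0.1, 0.7, 0.1], [0.1, 0.1, 0.1, 0.7]]
sharp_loss = log_loss(labels, sharp)
```

Lower values mean the predicted split puts more mass on the region where purchases actually occur.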

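Using the split probabilities, the ideal split for SKU $i$ is computed using Eqn. 5.

$$ I_{ij} = (\text{order quantity of SKU } i) \times (\text{split probability of SKU } i \text{ for warehouse } j) \tag{5} $$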

If we have existing inventory for the SKUs, that is, a known quantity of existing inventory for SKU $i$ in warehouse $j$, we compute the ideal splits by solving the following system of equations.

(6)
(7)
(8)

Solving these equations may sometimes result in negative values for ideal splits, which are not admissible. In such cases, we can compute ideal splits by solving a constrained optimization problem with constraints ensuring that ideal splits are non-negative. It suffices to say that, using the classifier output, we can always compute the ideal splits.

Ideal splits are rounded to the nearest integer, and any difference in total quantity arising due to rounding is offset using heuristic-based rules. At the end of step 1, we have the ideal warehouse allocation matrix $I$. In the next step, we optimally impose capacity constraints on ideal splits.
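The split-and-round step can be sketched as follows. The paper does not specify its rounding heuristics, so this example substitutes a largest-remainder rule as an assumption: floor every share, then hand the leftover units to the warehouses with the largest fractional parts. The function name and the sample probabilities are hypothetical.

```python
def ideal_split(quantity, probs):
    """Split an integer order quantity across warehouses in proportion to probs.

    Uses largest-remainder rounding (an assumed heuristic, not the paper's)
    so that the integer splits always sum to the order quantity.
    """
    raw = [quantity * p for p in probs]
    split = [int(x) for x in raw]  # floor, since raw shares are non-negative
    shortfall = quantity - sum(split)
    # Rank warehouses by the fractional part lost to flooring, largest first.
    order = sorted(range(len(probs)), key=lambda j: raw[j] - split[j], reverse=True)
    for j in order[:shortfall]:
        split[j] += 1
    return split


# Hypothetical SKU: 7 units and split probabilities for three warehouses.
split = ideal_split(7, [0.5, 0.3, 0.2])  # raw shares are 3.5, 2.1 and 1.4
```

Any rule that preserves the total would serve here; largest-remainder is used only because it keeps each integer split within one unit of its proportional share.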

5. Optimal Feasible Allocations

In this step, we optimally impose warehouse capacity constraints on the ideal splits computed in the previous section. Optimality is with regard to the inter-warehouse redistribution task. We formalise this by defining a redistribution cost penalty matrix $L$, where $L_{jm}$ is the penalty for placing an item which is ideally assigned to warehouse $j$ in warehouse $m$. We set these penalties to mirror the logistics cost involved in fulfilling an order from warehouse $m$ when the nearest warehouse is warehouse $j$.

Non-assignment is allowed to handle cases where the total number of items in the PO is greater than the combined capacity of all warehouses. The last, $(K+1)$-th, column of $L$ represents the non-assignment penalty $\lambda_{NA}$ (Eqn. 9).

$$ L_{j,K+1} = \lambda_{NA}, \qquad 1 \le j \le K \tag{9} $$

However, we make sure that assignments are always preferred over non-assignments by setting $\lambda_{NA}$ larger than every assignment penalty $L_{jm}$, $1 \le j, m \le K$.

Because the non-assignment penalty is greater than any assignment penalty, the optimization will choose assignment over non-assignment whenever capacity allows.
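A minimal sketch of how such a penalty matrix could be assembled is shown below. The cost values, the `margin` parameter and the function name are hypothetical. The only property taken from the text is that the non-assignment penalty sits strictly above every assignment penalty.

```python
def build_penalty_matrix(cost, margin=1.0):
    """Append a non-assignment column to a K x K redistribution cost matrix.

    cost[j][m]: penalty for placing an item ideally bound to warehouse j
    in warehouse m. Returns the K x (K + 1) penalty matrix and the
    non-assignment penalty, which exceeds every assignment penalty.
    """
    lam_na = max(max(row) for row in cost) + margin
    penalty = [list(row) + [lam_na] for row in cost]
    return penalty, lam_na


# Hypothetical redistribution costs between three warehouses.
cost = [[0, 2, 5],
        [2, 0, 3],
        [5, 3, 0]]
L, lam_na = build_penalty_matrix(cost)
```

With the last column dominating each row, leaving an item unassigned is never cheaper than placing it in any warehouse that still has room.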

The key idea behind these optimization formulations is the following. Warehouse allocation implied by ideal splits need not be always feasible because of warehouse capacity constraints. By defining a redistribution penalty matrix, we can compute an optimal re-arrangement of ideal allocations which will be capacity feasible. We can think of the final allocation as minimum distortion from the ideal splits allocation which is feasible with distortion costs defined by the penalty matrix.

Having defined the redistribution cost penalty matrix $L$, we came up with two novel formulations of the optimization problem. One is a Binary Integer Programming formulation using item-level decision variables, and the other is a standard Integer Programming formulation using SKU-level decision variables. Both are equivalent from an optimality perspective but differ in computational complexity. In the following subsections, we describe both formulations.

5.1. Binary Integer Programming Formulation

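Given a PO specification, we can define an exploded PO specification item set by repeating each SKU as many times as its order quantity, so that every unit is represented as a separate item.

(10)

All items generated from the same SKU are identical. The size of the exploded item set is $N$, the total quantity of all SKUs in the PO.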

For every item in the exploded PO item set, we have an ideal warehouse assignment from the ideal split computation in Step 1 (Sec. 4). Warehouse assignments implied by ideal splits are not unique: different permutations of assignments lead to the same allocation quantities. These permutations are all equivalent from the point of view of optimization, so we can choose any one of them.

The ideal assignment for the exploded PO item set is a list of $N$ warehouses, one per item, where the $i$-th entry denotes the warehouse assigned to item $i$. We can represent different warehouses by $K$-dimensional one-hot vectors like $(1, 0, \ldots, 0)$. If an item is not assigned to any warehouse, its warehouse vector will be a $K$-dimensional zero vector.

Using these one-hot representations, we can define an ideal warehouse assignment matrix $W$ of size $N \times K$. Row $i$ of $W$, i.e. $W_i$, is the one-hot encoding of the ideal warehouse assignment for item $i$.
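The explosion of a PO into items and the one-hot ideal assignment matrix can be sketched as follows. The function name and the sample split matrix are hypothetical. Items are emitted in SKU order, which is one of the equivalent permutations mentioned above.

```python
def explode_ideal_assignment(ideal_split):
    """Turn an M x K ideal split matrix into per-item one-hot rows.

    Returns (sku_of_item, W): sku_of_item[n] is the SKU index of item n and
    W[n] is a K-dimensional one-hot vector marking its ideal warehouse.
    """
    sku_of_item, W = [], []
    for i, row in enumerate(ideal_split):
        k = len(row)
        for j, qty in enumerate(row):
            for _ in range(qty):
                sku_of_item.append(i)
                W.append([1 if c == j else 0 for c in range(k)])
    return sku_of_item, W


# Hypothetical ideal splits for two SKUs and two warehouses.
I = [[2, 1],   # SKU 0: 2 units ideally in warehouse 0, 1 unit in warehouse 1
     [0, 2]]   # SKU 1: 2 units ideally in warehouse 1
sku_of_item, W = explode_ideal_assignment(I)
```

The number of rows of `W` equals the total quantity in the PO, which is why the item-level formulation grows with order size.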

For each item $i$ we define $K$ binary decision variables $Y_{i1}, \ldots, Y_{iK}$, which represent the optimal feasible assignment considering capacity constraints. These are determined by solving the optimization problem.

This gives us the binary decision variable matrix $Y$ of size $N \times K$, to be determined by optimization. Let us define the truncated penalty matrix $L'$ by dropping the last column of $L$, which represents the non-assignment penalties.

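Let $\mathbf{1}_K$ denote a column vector of all 1s of dimension $K$. We define the non-assigned vector $Z$ as

$$ Z = \mathbf{1}_N - Y \mathbf{1}_K \tag{11} $$

so that $Z_i = 1$ exactly when item $i$ is not assigned to any warehouse.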

We determine the optimal feasible assignment by solving the following optimization problem:

$$
\begin{aligned}
\min_{Y} \quad & \underbrace{\operatorname{trace}(Y L' W^{T})}_{\text{assignment loss}} + \underbrace{\lambda_{NA} \sum_{i=1}^{N} Z_i}_{\text{non-assignment loss}} \\
\text{subject to} \quad & \sum_{j=1}^{K} Y_{ij} \le 1, \qquad 1 \le i \le N \\
& \sum_{i=1}^{N} Y_{ij} \le C_j, \qquad 1 \le j \le K
\end{aligned}
$$

The optimization objective represents the total redistribution cost for all SKUs given an ideal allocation $W$ and a final assignment $Y$. The $i$-th diagonal element of $Y L' W^{T}$ is the redistribution cost for item $i$. There are two constraints. The first ensures that we choose at most one warehouse per item. The second ensures that the total number of items assigned to warehouse $j$ is not more than the capacity $C_j$ of that warehouse.
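For intuition, a tiny instance of this item-level problem can be solved by exhaustive enumeration. The sketch below is not a Binary Integer Programming solver and only works for a handful of items. It assumes a symmetric two-warehouse penalty matrix and made-up capacities, and all names are hypothetical.

```python
from itertools import product


def solve_bip_brute_force(ideal, Lp, capacity, lam_na):
    """Enumerate every assignment of items to a warehouse or to 'unassigned'.

    ideal[n]: ideal warehouse of item n; Lp: K x K truncated penalty matrix;
    capacity[j]: capacity of warehouse j; lam_na: non-assignment penalty.
    """
    k = len(capacity)
    best_cost, best = None, None
    for choice in product(range(k + 1), repeat=len(ideal)):  # index k = unassigned
        load = [choice.count(j) for j in range(k)]
        if any(load[j] > capacity[j] for j in range(k)):
            continue  # violates a warehouse capacity constraint
        cost = sum(lam_na if c == k else Lp[ideal[n]][c]
                   for n, c in enumerate(choice))
        if best_cost is None or cost < best_cost:
            best_cost, best = cost, choice
    return best_cost, best


# Hypothetical instance: three items ideally in warehouse 0, one in warehouse 1.
Lp = [[0, 2], [2, 0]]
ideal = [0, 0, 0, 1]
best_cost, best = solve_bip_brute_force(ideal, Lp, capacity=[2, 3], lam_na=10)
```

Warehouse 0 can hold only two of its three ideal items, so one item moves to warehouse 1 at a penalty of 2 and nothing is left unassigned.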

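Using the optimal feasible assignments $Y$, we compute the optimal feasible warehouse splits for SKUs by aggregating over the items of each SKU. For SKU $i$, the quantity allocated to warehouse $j$ is the sum of $Y_{nj}$ over all items $n$ of the exploded set that are units of SKU $i$, and the quantity left unassigned is the sum of $Z_n$ over the same items (Eqns. 12 to 14).

These quantities are the desired optimal feasible warehouse allocations.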

The Binary Integer Programming formulation works well for small orders. The number of decision variables is $N \times K$, where $N$ is the total quantity of items and $K$ is the total number of warehouses. For large orders, the number of decision variables becomes too large and the problem becomes cumbersome to solve. In the next subsection, we describe an alternative formulation which can efficiently solve large orders.

5.2. Integer Programming Formulation

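From Step 1 (Sec. 4), we have computed the ideal splits for every SKU. This gives us the ideal split matrix $I$ of size $M \times K$, where $I_{ij}$ represents the quantity of SKU $i$ ideally assigned to warehouse $j$. The penalty matrix $L$ is the same as before.

We define a final split decision variable tensor $Y$ of size $M \times K \times (K+1)$ as follows: $Y_{ijm}$ is the quantity of SKU $i$ that is ideally assigned to warehouse $j$ and finally placed in warehouse $m$, with $m = K+1$ denoting non-assignment.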
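We determine the optimal final split tensor by solving the following integer programming problem.

$$
\begin{aligned}
\min_{Y} \quad & \sum_{i=1}^{M} \operatorname{trace}(Y_i L^{T}) \\
\text{subject to} \quad & \sum_{m=1}^{K+1} Y_{ijm} = I_{ij}, \qquad 1 \le i \le M;\ 1 \le j \le K \\
& \sum_{i=1}^{M} \sum_{j=1}^{K} Y_{ijm} \le C_m, \qquad 1 \le m \le K
\end{aligned}
$$

The optimization objective represents the total redistribution cost for all SKUs given an ideal split and a final split. The term $\operatorname{trace}(Y_i L^{T})$ is the redistribution cost for SKU $i$, as depicted in Figure 1. The first and second constraints enforce the SKU order quantities and the warehouse capacities respectively.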

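Once we have $Y$ by solving the optimization problem, we compute the final optimal feasible allocations as follows. The quantity of SKU $i$ allocated to warehouse $m$ is

$$ \sum_{j=1}^{K} Y_{ijm}, \qquad 1 \le m \le K \tag{15} $$

and the quantity of SKU $i$ not assigned to any warehouse is

$$ \sum_{j=1}^{K} Y_{ij(K+1)} \tag{16} $$

These are the desired optimal feasible warehouse allocations.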
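The pieces of this SKU-level formulation can be checked on a toy instance. The sketch below evaluates the objective, tests the two constraints and derives the final allocations for a candidate tensor stored as nested lists. It does not solve the integer program, and all names and numbers are hypothetical.

```python
def objective(Y, L):
    """Sum over SKUs of trace(Y_i L^T), i.e. sum of Y[i][j][m] * L[j][m]."""
    return sum(Y[i][j][m] * L[j][m]
               for i in range(len(Y))
               for j in range(len(L))
               for m in range(len(L[0])))


def is_feasible(Y, I, C):
    """Check the order-quantity and warehouse-capacity constraints."""
    M, K = len(I), len(C)
    rows_ok = all(sum(Y[i][j]) == I[i][j] for i in range(M) for j in range(K))
    caps_ok = all(sum(Y[i][j][m] for i in range(M) for j in range(K)) <= C[m]
                  for m in range(K))
    return rows_ok and caps_ok


def final_allocations(Y):
    """Column sums of each Y_i: units of SKU i per warehouse, last entry unassigned."""
    return [[sum(col) for col in zip(*Y_i)] for Y_i in Y]


L = [[0, 2, 10], [2, 0, 10]]   # K = 2 warehouses plus the non-assignment column
I = [[3, 1]]                   # one SKU with ideal split 3 / 1
C = [2, 3]                     # warehouse capacities
Y = [[[2, 1, 0],               # one of the 3 units bound to warehouse 0 moves to warehouse 1
      [0, 1, 0]]]
Y_trivial = [[[0, 0, 3],       # nothing is assigned to any warehouse
              [0, 0, 1]]]
```

Here `Y` costs 2 and yields the allocation `[2, 2, 0]`, while `Y_trivial` is also feasible but costs 40, which illustrates why non-assignment is used only when capacity runs out.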

The second formulation (Sec. 5.2) is computationally more efficient. The number of decision variables is $M \times K \times (K+1)$, which is usually much smaller than the $N \times K$ of the binary integer programming case when $K$ is not very large. With this, we can efficiently solve the optimization problem with total SKU quantity ranging in millions, number of SKUs in the few thousands and fewer than 100 warehouses, using open-source solvers like coin-or/Cbc.
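The size difference between the two formulations is simple arithmetic. The item-level formulation has one binary variable per item and warehouse, while the SKU-level one has one integer variable per SKU, ideal warehouse and final warehouse (including non-assignment), so the SKU-level count is smaller whenever the average quantity per SKU exceeds K + 1. The order size below is hypothetical.

```python
def bip_variable_count(total_quantity, num_warehouses):
    """Item-level formulation: one binary variable per (item, warehouse)."""
    return total_quantity * num_warehouses


def ip_variable_count(num_skus, num_warehouses):
    """SKU-level formulation: one integer per (SKU, ideal warehouse, final slot)."""
    return num_skus * num_warehouses * (num_warehouses + 1)


# Hypothetical order: 1,000 SKUs, 200,000 units in total, 4 warehouses.
bip_vars = bip_variable_count(200_000, 4)
ip_vars = ip_variable_count(1_000, 4)
```

For this made-up order the SKU-level formulation needs 20,000 variables against 800,000 for the item-level one.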

Note that for this problem, we can always supply an initial feasible solution to warm start the optimization. One trivial feasible solution is the one which does not assign any SKU to a warehouse. This also has the maximum objective value. Using ideal splits and some heuristics, we can supply much better initial feasible solutions to speed up the optimization.
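The paper does not describe its warm-start heuristics, so the following is only one plausible sketch: keep units in their ideal warehouse while capacity lasts, spill the remainder to the cheapest warehouse with room, and leave whatever still does not fit unassigned. The greedy rule, the function name and the data are assumptions.

```python
def greedy_warm_start(I, L, C):
    """Build a feasible M x K x (K + 1) tensor Y from ideal splits.

    I: M x K ideal split matrix; L: K x (K + 1) penalty matrix;
    C: list of K warehouse capacities. Greedy and not guaranteed optimal.
    """
    M, K = len(I), len(C)
    remaining = list(C)
    Y = [[[0] * (K + 1) for _ in range(K)] for _ in range(M)]
    for i in range(M):
        for j in range(K):
            left = I[i][j]
            # Try warehouses in increasing order of penalty for units bound to j.
            for m in sorted(range(K), key=lambda m: L[j][m]):
                moved = min(left, remaining[m])
                Y[i][j][m] += moved
                remaining[m] -= moved
                left -= moved
            Y[i][j][K] = left  # units that fit nowhere stay unassigned
    return Y


L = [[0, 2, 10], [2, 0, 10]]
roomy = greedy_warm_start([[3, 1]], L, [2, 3])   # enough total capacity
tight = greedy_warm_start([[3, 1]], L, [1, 1])   # capacity for only two units
```

Because every unit is either placed within remaining capacity or marked unassigned, the result always satisfies both constraints and can be handed to a solver as a starting point.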

We use this second formulation for our back-testing. For optimization, we used the coin-or/Cbc solver at https://github.com/coin-or/Cbc.

Figure 1. This figure visualizes the matrices. Suppose we have 4 warehouses ($K = 4$). $Y_i$ is the decision variable matrix representing the warehouse assignment for SKU $i$. $L$ is the penalty matrix. The highlighted region represents the non-assignment region. Row $j$ of $Y_i$ represents how the ideal split $I_{ij}$ gets redistributed between the $K + 1$ warehouses. Column $j$ of $L^{T}$ represents the redistribution losses for a SKU ideally bound to warehouse $j$. Therefore, the $j$-th diagonal element of $Y_i L^{T}$ holds the total redistribution loss for items of SKU $i$ which were ideally bound to warehouse $j$.

6. Evaluation and Results

In this section, we describe the back-testing set up and results. We back-tested the model on purchase orders created during an entire month at Myntra. We tested it on major business units like Apparel, Footwear and Personal Care, covering 23 different Article types. Around 8,000 purchase orders were created for 43,000 SKUs and a total quantity of 3.4 million.

We used two warehouse constraint scenarios (Table 4 and Table 5) provided by the Supply-Chain-Management team.

Month      Type            C1        C2       C3       C4
April 19   Apparel         1005714   502857   377143   754286
April 19   Footwear        91429     45714    34286    68571
April 19   Personal Care   102857    -        17143    -
May 19     Apparel         435810    217905   163429   326857
May 19     Footwear        76191     38095    28571    57143
May 19     Personal Care   85714     -        14286    -
Table 4. Warehouse capacity constraint - Scenario 1

Month      Type            C1        C2       C3       C4
April 19   Apparel         550979    721502   844438   546241
April 19   Footwear        50089     65591    76768    49658
April 19   Personal Care   56350     -        38384    -
May 19     Apparel         238758    312650   365923   236705
May 19     Footwear        21705     28423    33265    21518
May 19     Personal Care   24418     -        16634    -
Table 5. Warehouse capacity constraint - Scenario 2

We solved the optimization problem using the constraints listed in Tables 4 and 5. After every purchase order is created, we deduct the total quantity of the order from the capacities and update the constraints.
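The sequential capacity update can be sketched as a running deduction. The per-warehouse allocation lists, the function name and the numbers are hypothetical, and the check against remaining capacity is an added safeguard rather than something the paper states.

```python
def process_orders(order_allocations, capacity):
    """Deduct each purchase order's allocations from the remaining capacities.

    order_allocations: one list per order with the quantity allocated to each
    warehouse. Returns the remaining capacities after each order, in sequence.
    """
    remaining = list(capacity)
    history = []
    for alloc in order_allocations:
        if any(a > r for a, r in zip(alloc, remaining)):
            raise ValueError("allocation exceeds remaining capacity")
        remaining = [r - a for r, a in zip(remaining, alloc)]
        history.append(list(remaining))
    return history


# Hypothetical run: two warehouses and two consecutive purchase orders.
history = process_orders([[30, 10], [20, 40]], capacity=[100, 50])
```

Each later order is therefore optimized against whatever capacity the earlier orders left behind.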

Using the optimal splits computed by our optimization algorithm and the purchase data for these SKUs in subsequent months, we estimated Regional Utilization (RU), which is the fraction of orders fulfilled by the nearest warehouse. We also estimated percentage Two-Day-Delivery (2DD), which is the fraction of orders fulfilled within two days of placing the order. Every warehouse has a set of pincodes for which delivery can be done within 2 days, and we use this set to estimate 2DD. We compared these estimates with the numbers produced by the simple business heuristics previously followed and observed a substantial uptick in both RU and 2DD. Tables 6, 7 and 8 depict the results.

Metric                      Scenario 1   Scenario 2
RU (ideal splits)           0.91         0.91
RU (constrained splits)     0.82         0.84
RU (Heuristics based)       0.64         0.64
2DD (ideal splits)          0.64         0.64
2DD (constrained splits)    0.58         0.61
2DD (Heuristics based)      0.48         0.48
Table 6. Apparel RU and 2DD Estimates

Metric                      Scenario 1   Scenario 2
RU (ideal splits)           0.9          0.9
RU (constrained splits)     0.87         0.9
RU (Heuristics based)       0.38         0.38
2DD (ideal splits)          0.69         0.69
2DD (constrained splits)    0.66         0.69
2DD (Heuristics based)      0.34         0.35
Table 7. Footwear RU and 2DD Estimates

Metric                      Scenario 1   Scenario 2
RU (ideal splits)           0.93         0.93
RU (constrained splits)     0.7          0.7
RU (Heuristics based)       0.5          0.5
2DD (ideal splits)          0.72         0.72
2DD (constrained splits)    0.53         0.53
2DD (Heuristics based)      0.35         0.35
Table 8. Personal Care RU and 2DD Estimates

  • RU (ideal splits): RU estimate using ideal splits with unlimited warehouse capacities.

  • RU (constrained splits): RU estimate using optimal feasible splits considering capacity constraints.

  • RU (Heuristics based): Actual realised RU using a heuristics based warehouse allocation policy.

  • 2DD (ideal splits): Percentage Two-Day-Delivery estimate using ideal splits with unlimited warehouse capacities.

  • 2DD (constrained splits): Percentage Two-Day-Delivery estimate using optimal feasible splits considering capacity constraints.

  • 2DD (Heuristics based): Actual realised Percentage Two-Day-Delivery using a heuristics based warehouse allocation policy.
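The 2DD estimate described above can be sketched in the same style as RU: an order counts as a two-day delivery when its pincode belongs to the two-day set of the warehouse that fulfils it. The data layout, the function name and the pincode sets are hypothetical.

```python
def two_day_delivery_rate(orders, two_day_pincodes):
    """Fraction of orders whose pincode is two-day serviceable from the
    warehouse that fulfils them.

    orders: list of (pincode, fulfilling_warehouse) pairs (assumed layout).
    two_day_pincodes: dict mapping warehouse id -> set of pincodes.
    """
    if not orders:
        return 0.0
    fast = sum(1 for pincode, wh in orders
               if pincode in two_day_pincodes.get(wh, set()))
    return fast / len(orders)


# Hypothetical serviceability sets and fulfilled orders.
two_day = {"W1": {"560001", "400001"}, "W2": {"110001"}}
orders = [("560001", "W1"), ("110001", "W1"), ("400001", "W1"),
          ("110001", "W2"), ("600001", "W2")]
rate = two_day_delivery_rate(orders, two_day)  # three of five orders qualify
```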

Results clearly show a significant improvement in both RU and 2DD. RU improves by around 27% over the naive heuristic method and 2DD improves by around 20%. The performance trend is consistent across all the 3 categories and the 2 constraint scenarios.

7. Conclusion

We have developed an efficient solution to the problem of optimal warehouse allocations from a Regional Utilization perspective using machine learning and optimization methods. We have computed ideal allocations using predictive ML models. We have formalised the requirements of the optimal warehouse allocation problem using 2 novel optimization formulations. We have demonstrated the efficacy of the solution with extensive back-testing, which shows a substantial uptick in key business metrics like Regional Utilization and Two-Day-Delivery over a heuristics-based approach.

References