1 Introduction
The Bin Packing problem is a classical combinatorial optimization problem that has been widely studied since the 1970s and can be stated as follows:
Given a collection of unit-capacity bins and a sequence of items whose sizes lie in the interval (0, 1], determine a packing of the items that uses the minimum number of bins.
This problem has a wide variety of applications [17], including cutting stock applications, packing problems in supply chain management, and resource allocation problems in distributed systems. Algorithms for bin packing attempt to pack the items using the minimum number of bins and can be broadly classified as offline and online. Offline algorithms pack items with complete knowledge of the list of items prior to packing, whereas online algorithms must pack items as they arrive, without any knowledge of future items. The bin packing problem, even in its offline version, is known to be NP-hard [9], and hence most research efforts have focused on the design of fast online and offline approximation algorithms with good performance. The performance of an approximation algorithm is defined in terms of its worst-case behavior as follows. Let A be an algorithm for bin packing, let A(L) denote the number of bins required by A to pack the items in a list L, and let OPT(L) denote the number of bins used by an optimal packing of L. The asymptotic worst-case ratio of A is R^∞_A = lim sup_{m→∞} max_L {A(L)/OPT(L) : OPT(L) = m}. This ratio is the asymptotic approximation ratio and measures the quality of the algorithm's packing in comparison to the optimal packing in the worst-case scenario. The second way of measuring the performance of an approximation algorithm is R_A = sup_L A(L)/OPT(L), and this ratio is the absolute approximation ratio of the algorithm. In the case of online algorithms these ratios are often referred to as competitive ratios. A polynomial time approximation scheme (PTAS) for bin packing is a class of algorithms that, given any instance and an ε > 0, produces an approximation algorithm whose solution quality is within (1 + ε) times the optimal solution quality and whose computation time is polynomial in the input size for each fixed ε. A PTAS essentially allows the user, by specifying the parameter ε, to choose the algorithm from this class with the desired accuracy guarantee. However, stricter performance guarantees are achieved by these algorithms at the cost of increased computation, and in practice the computational cost is very high even for moderately small ε.
The high computational cost of PTAS, the inability to provide strict theoretical guarantees for many algorithms that perform well in practice, and traditional worst-case performance measures that categorize an algorithm's performance as poor based on very few degenerate instances have together motivated the need for efficient heuristics: fast algorithms that perform well on most instances without any theoretical guarantees on their worst-case behavior. In addition, the phenomenal growth in the volume of data has driven the need for heuristics that scale computationally and are amenable to tighter empirical analysis. In this paper we employ simple combinatorial ideas in the design and analysis of a fast, scalable heuristic that partitions the given problem into identical subproblems of constant size and solves these constant-size subproblems by considering only a constant number of bin configurations with bounded unused space. We present an empirical study that provides evidence for the scalability of our heuristic and for its tighter empirical analysis on hard instances, obtained via an improved lower bound on the necessary wastage in an optimal solution.
2 Related Results
In this section, we summarize the main results in bin packing from the perspective of approximation algorithms and polynomial time approximation schemes. For a detailed survey of these and other related results, we refer the reader to Johnson's PhD thesis [14], Coffman et al. [4], and Hochbaum [13].
Online Algorithms:
Next Fit (NF), First Fit (FF), and Best Fit (BF) are the most widely studied natural and classical online algorithms for bin packing. Johnson et al. [14, 15, 17] showed that both FF and BF have an asymptotic competitive ratio of 17/10. Subsequently, Yao presented an online algorithm Revised First Fit (RFF) [21], based on FF, that achieved an asymptotic competitive ratio of 5/3. This was further improved by Lee and Lee [19] and Seiden [20], and more recently Balogh et al. [2] settled the absolute version of this online problem by presenting an online bin packing algorithm with an optimal absolute (worst-case) competitive ratio of 5/3.
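As a concrete illustration, the two classical online rules can be sketched in a few lines of Python. This is a minimal sketch for unit-capacity bins; the function names are ours, not from the literature.

```python
def first_fit(items):
    """Online First Fit: place each arriving item into the first open bin
    with enough remaining capacity, else open a new unit-capacity bin."""
    bins = []  # remaining free space of each open bin
    for size in items:
        for i, free in enumerate(bins):
            if free >= size:
                bins[i] -= size
                break
        else:
            bins.append(1.0 - size)  # open a new bin
    return len(bins)


def best_fit(items):
    """Online Best Fit: place each item into the feasible bin that would
    be left with the least free space, else open a new bin."""
    bins = []
    for size in items:
        best = min((i for i, free in enumerate(bins) if free >= size),
                   key=lambda i: bins[i], default=None)
        if best is None:
            bins.append(1.0 - size)
        else:
            bins[best] -= size
    return len(bins)
```

Both rules process each item exactly once and never reopen a decision, which is what makes them online.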
Offline Algorithms:
The most natural offline algorithms for bin packing essentially reorder the items and then employ classical online algorithms such as NF, FF, or BF to pack them. This has resulted in three simple but effective offline algorithms, denoted NFD, FFD, and BFD, with the "D" standing for "Decreasing". The sorting needs O(n log n) time, and so the total running time of each of these algorithms is O(n log n). Baker and Coffman [3] established the asymptotic approximation ratio of NFD to be approximately 1.691; Johnson et al. [17], Baker [1], and Yue [22] established FFD's and BFD's asymptotic approximation ratio to be 11/9. Subsequently, Refined First Fit Decreasing (RFFD) by Yao [21], Modified First Fit Decreasing (MFFD) by Garey and Johnson [10], and the compound algorithm of Friesen and Langston [8] achieved better asymptotic ratios (71/60 in the case of MFFD), but at a very high computational cost.
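The offline variants add only a sorting pass before the online rule. For example, FFD can be sketched as follows (a minimal illustration, not an optimized implementation):

```python
def first_fit_decreasing(items):
    """Offline FFD: sort items in nonincreasing order of size, then run
    the online First Fit rule on the sorted sequence."""
    bins = []  # remaining free space of each open bin
    for size in sorted(items, reverse=True):
        for i, free in enumerate(bins):
            if free >= size:
                bins[i] -= size
                break
        else:
            bins.append(1.0 - size)  # open a new unit-capacity bin
    return len(bins)
```

Placing large items first is what improves the guarantee from 17/10 (online FF) to 11/9: small items are left to fill the gaps the large items leave behind.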
Approximation Schemes: Fernandez de la Vega and Lueker [7] designed an approximation scheme that, for any real number ε > 0, constructs a (1 + ε)-optimal solution in time linear in n, with constants that depend on ε. Subsequently, Johnson [16], Karmarkar and Karp [18], and Hochbaum and Shmoys [11, 12] presented improved approximation schemes. These approximation schemes help in obtaining near-optimal solutions in computation time polynomial in the input size for each fixed ε. For small ε, however, the computational time even for moderately sized instances makes these PTAS practically unusable. This high computational cost of PTAS, coupled with the inability to provide strict theoretical guarantees for many algorithms that perform well in practice, has led to the study of heuristics. In this paper our focus is on heuristics based on simple combinatorial ideas (simple and effective heuristics using combinatorial ideas are mostly similar to the online and offline algorithms already described earlier in this section), and hence we do not go into heuristics based on approaches such as branch and bound, local search, simulated annealing, tabu search, genetic algorithms, and constraint optimization. However, the interested reader can consult the survey paper of Delorme et al.
[5] for these results.
2.1 Our Results
In this paper, we first present Algorithm B, an algorithm that, given a real-valued parameter ε, partitions the original problem into many identical subproblems of constant size and then uses exact algorithms or existing PTAS to solve these small bin packing problems. Then we present Heuristic C, a heuristic that, just like Algorithm B, partitions the original problem into many identical constant-size problems, but solves the subproblems by considering only a constant number of bin configurations with wastage very close to zero (i.e., an extremely small fraction of the bin configurations considered by PTAS or exact algorithms). This results in a significant reduction in computation time without any noticeable impact on the performance guarantee. Finally, we conducted an empirical study of Heuristic C involving several hundred large instances, both randomly generated and hard, to study its computational scalability under the constraint that it provides a desired approximation guarantee. For most of the instances Heuristic C was computationally scalable (i.e., the problem instances were split into identical subproblems of constant size that were then solved by considering a small number of distinct bin configurations). For some instances Heuristic C needed to consider more distinct bin configurations in order to satisfy the performance guarantee constraint. For some instances traditional analysis did not establish the desired performance guarantee, but we were able to obtain it by deriving a better lower bound on the necessary wastage in an optimal solution. The rest of this paper is organized as follows: in Section 3 we present Algorithm B, in Section 4 we present Heuristic C, and in Section 5 we present our empirical study of Heuristic C.
3 Bin Packing Based on Near Identical Partitioning
In this section, we present an algorithm that, given a real-valued parameter ε, partitions the input sequence into identical subsequences (except for the last subsequence) of length k (i.e., the sum of the sizes of the items in each of these subsequences is k) and then makes use of either exact algorithms or known PTAS to pack the items of these length-k subsequences into unit-capacity bins. We now introduce some necessary terms and definitions before presenting our algorithm and its analysis.
Definitions 3.1
A sequence L with d distinct item sizes {s_1, ..., s_d} can be viewed as a d-dimensional vector n = (n_1, ..., n_d), where for 1 ≤ i ≤ d, n_i is the number of items of type i (size s_i); we refer to n as the distribution vector corresponding to L. For a given real number k, let n_k denote a length-k segment of n (i.e., a vector that is parallel to n and contains its initial segment, such that the total size of the items it represents equals k), and let minpack(n_k) be a smallest-sized bin packing of the items corresponding to n_k.
Remark: If the number of distinct sizes in L is not bounded by a constant d, then we can still apply the above idea by partitioning the interval (0, 1] into d size classes and rounding each item size in L up to the nearest class boundary that is greater than or equal to the item size.
Key Idea: For an integer m, we partition the distribution vector n into m copies of n_k, the length-k initial segment of n (except for the last segment), where k is determined as follows:
For each candidate k, we determine the packing ratio minpack(n_k)/k. Then, we choose the k for which the packing ratio is minimum.
Example 1
Let us consider a sequence of items consisting of items of four distinct sizes, and let ε be the desired approximation ratio. For this instance the distribution vector n is a 4-dimensional vector. Our algorithm attempts to partition n into copies of a segment vector n_k for some candidate length k. The minpacking for this segment vector can be determined using any of the exact algorithms or existing PTAS for regular bin packing.
Example 2
Let us consider a sequence of items consisting of items of three distinct sizes, with the same desired approximation ratio as in Example 1. For this instance the distribution vector n is a 3-dimensional vector. Our algorithm again attempts to partition n into copies of a segment vector n_k for some candidate length k, and the minpacking for this segment vector can be determined using exact algorithms or any of the existing PTAS for regular bin packing.
ALGORITHM B(L, ε)
Input(s): (1) L, the sequence of items with their respective sizes in the interval (0, 1]; (2) ε, a user-specified parameter;
Output(s): The assignment of the items in L to bins;
Begin
(1) Let n be the distribution vector corresponding to L;
(2) For each candidate segment length k:
(2a) Let n_k be the length-k initial segment of n, with packing ratio minpack(n_k)/k;
(3) Let k* be a candidate length whose packing ratio is minimum;
(4) Let m be the number of copies of n_{k*} that n is partitioned into, and pack each copy (and the residual segment) using an exact algorithm or a PTAS;
(5) return the resulting assignment;
End
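The following sketch illustrates the partitioning idea behind Algorithm B under a simplifying assumption: the number of copies m is restricted to common divisors of the item counts, so every segment is integral, and the small segment is packed by a brute-force exact solver. All names (`min_bins_exact`, `pack_by_identical_partition`) are ours.

```python
from functools import reduce
from math import gcd


def min_bins_exact(sizes, cap=1.0, eps=1e-9):
    """Exact minimum number of unit-capacity bins for a small item list,
    by branch and bound over all placements (with symmetry pruning)."""
    sizes = sorted(sizes, reverse=True)
    best = [len(sizes)]
    def place(i, bins):
        if i == len(sizes):
            best[0] = min(best[0], len(bins))
            return
        if len(bins) >= best[0]:
            return  # cannot beat the best packing found so far
        s = sizes[i]
        tried = set()
        for j in range(len(bins)):
            load = round(bins[j], 9)
            if bins[j] + s <= cap + eps and load not in tried:
                tried.add(load)      # skip bins with identical load
                bins[j] += s
                place(i + 1, bins)
                bins[j] -= s
        bins.append(s)               # or open a new bin for item i
        place(i + 1, bins)
        bins.pop()
    place(0, [])
    return best[0]


def pack_by_identical_partition(sizes, counts):
    """Split the distribution vector `counts` into m identical segments
    (m ranges over common divisors of the counts, a simplification of
    Algorithm B's search) and pack one segment exactly."""
    g = reduce(gcd, counts)
    best_m, best_total = 1, None
    for m in (d for d in range(1, g + 1) if g % d == 0):
        segment = [c // m for c in counts]
        items = [s for s, c in zip(sizes, segment) for _ in range(c)]
        total = m * min_bins_exact(items)
        if best_total is None or total < best_total:
            best_m, best_total = m, total
    return best_m, best_total
```

The payoff is that the exact solver only ever sees one small segment, not the full instance; the full packing is m concatenated copies of the segment packing.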
Definitions 3.2
Let Algorithm B partition n into m copies of n_k (discarding the last segment), where m is an integer and n_k is a length-k initial segment of n. Let n̄_k be the segment vector obtained by truncating, for 1 ≤ i ≤ d, the i-th component of n_k to an integer. Let P be the packing determined by Algorithm B for n̄_k.
Theorem 1
If (i) for 1 ≤ i ≤ d the i-th component of n_k is an integer (so that the partition into copies of n_k is exact), or (ii) certain growth conditions on the instance hold, then Algorithm B constructs an asymptotically optimal packing for L.
4 A Fast Heuristic Based on Near Identical Partitioning
Algorithm B constructs the bin packing for a sequence L by essentially partitioning the distribution vector n into identical copies of a length-k segment n_k (except for the last segment) and then constructing a minpacking of n_k using either exact methods or known PTAS for bin packing. However, these exact algorithms and PTAS construct the minpacking by considering all possible bin configurations of unit-capacity bins, and hence are computationally expensive. We address this computational issue by designing a Heuristic C that constructs the packing by restricting the choice of vectors (bin configurations) to a small subset (i.e., vectors that correspond to bin configurations with unused space of at most δ). This restriction results in a significant improvement in the computational efficiency of Algorithm B without a significant downside to its solution quality. Also, for many hard instances of bin packing, an optimal bin packing is not compact because of large unavoidable wastage. This wastage in an optimal solution is often underestimated, resulting in weak analysis of the performance of PTAS / approximation algorithms. We address this analysis problem by using the restricted set of vectors to get a better lower bound on the wastage in an optimal bin packing. We now introduce some definitions necessary for describing Heuristic C. Heuristic C will make use of a subroutine Packing that will be defined subsequently.
Definitions 4.1
For a real number δ, the configuration of a unit-capacity bin that contains items whose sizes are from {s_1, ..., s_d} and that has a wastage of at most δ can be specified by a d-dimensional vector whose i-th component, for 1 ≤ i ≤ d, is the sum of the sizes of the items of type i (size s_i) in that bin; its length is in the interval [1 − δ, 1], where the length of a vector is defined to be the sum of its components. We refer to such a vector as a δ-vector (bin configuration) consistent with L, and we denote by V_δ the set of all δ-vectors (bin configurations) consistent with L.
Definitions 4.2
For a given sequence L and a real number δ, if V_δ is nonempty then we define

a δ-packing for L to be a minimal collection of vectors from V_δ such that, for 1 ≤ i ≤ d, the sum of the i-th components of this collection of vectors is greater than or equal to the i-th component of the demand vector corresponding to L (the vector whose i-th component is the total size of the items of type i);

minpack_δ(L) to be a δ-packing for L of the smallest size; if, for a given δ, it is not possible to pack L using vectors from V_δ, then minpack_δ(L) = ∞.
Note: For certain sequences L, the item sizes in L may be such that for certain values of δ there are no vectors consistent with L (i.e., V_δ is empty).
Example 3
Let us revisit the problem instance of Example 2, with the same desired approximation ratio. For this instance, if δ is chosen too small then there are no δ-vectors consistent with L, and consequently there are no δ-packings of L.
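To make the restricted configuration sets of Definitions 4.1 concrete, the sketch below enumerates, by brute force, all bin configurations whose wastage is at most δ. For simplicity we represent a configuration by item counts per type rather than per-type size sums; the function name is ours.

```python
from itertools import product


def near_full_configs(sizes, cap=1.0, delta=0.05, eps=1e-9):
    """Enumerate all bin configurations (item counts per type) whose used
    space lies in [cap - delta, cap], i.e. whose wastage is at most delta."""
    limits = [int((cap + eps) // s) for s in sizes]  # max copies per type
    out = []
    for c in product(*(range(l + 1) for l in limits)):
        used = sum(ci * si for ci, si in zip(c, sizes))
        if cap - delta - eps <= used <= cap + eps:
            out.append(c)
    return out
```

Note how quickly the set shrinks as δ decreases: with sizes {1/2, 1/4} and δ = 0 only the three perfectly full configurations survive, which is exactly the kind of reduction Heuristic C exploits.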
Key Idea: For an integer m, we partition the distribution vector n into m copies of n_k, the length-k segment of n (except for the last segment), where k is determined as follows:
For each candidate k, we determine δ_k to be the smallest real number for which minpack_{δ_k}(n_k) is finite. Then, we determine k to be an integer that minimizes the packing ratio minpack_{δ_k}(n_k)/k.
Heuristic C(L, ε)
Input(s):
(1) L, the sequence of items with their respective sizes in the interval (0, 1];
(2) ε, a user-specified parameter;
Output(s): The assignment of the items in L to bins;
Begin
(1) Let n be the distribution vector corresponding to L;
(2) For each candidate segment length k:
(2a) Let n_k be the length-k segment of n;
(2b) For increasing values of δ: if Packing(n_k, δ) succeeds, break;
(2c) Record δ_k and the packing ratio of n_k;
(3) Let k* be a candidate length whose packing ratio is minimum;
(4) Let m be the number of copies of n_{k*} that n is partitioned into;
(5) Let the overall packing consist of m copies of the packing of n_{k*}, together with a packing of the residual segment;
(6) return the resulting assignment;
End
Subroutine for computing minpack_δ(n_k): We first present a recurrence that determines minpack_δ(v) for residual demand vectors v; this recurrence can easily be converted into a dynamic program. We then reduce the computation time of this dynamic program by presenting a heuristic that employs the same recurrence but restricts the choice of vectors to a small subset of V_δ. We now present some essential definitions before presenting our recurrence and heuristic.
Definitions 4.3
Let n_k denote the length-k initial segment of n. Let δ be a real number, and let N_δ denote the number of bin configurations consistent with n_k. Let B_1, ..., B_{N_δ} denote the complete enumeration of the δ-vectors (bins) consistent with n_k, where B_j(i) denotes the i-th component of B_j.
From the definition, we can observe that minpack_δ(n_k), a minimum-sized δ-packing of n_k, is a smallest-sized collection of vectors from V_δ such that, for 1 ≤ i ≤ d, the sum of the i-th components of these vectors is greater than or equal to the i-th component of n_k. So, we can define minpack_δ recursively as follows:
(1) minpack_δ(v) = 0 if v ≤ 0; otherwise minpack_δ(v) = 1 + min_{1 ≤ j ≤ N_δ} minpack_δ(v − B_j).
This recurrence constructs minpack_δ(n_k) by choosing vectors from V_δ and can be converted into a dynamic program over residual demand vectors. The computation time of this dynamic program is very high, so we design a heuristic that employs the same recurrence but restricts the choice of vectors to a small subset S of size N from V_δ.
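A direct memoized implementation of recurrence (1) might look as follows. This is a sketch under our own simplifications: demands and configurations are item counts per type (rather than per-type size sums), configurations are enumerated by brute force, and a configuration is applied only if it strictly reduces the residual demand.

```python
from functools import lru_cache
from itertools import product


def min_packing(demand, sizes, cap=1.0, eps=1e-9):
    """Memoized covering recurrence: minimum number of bins needed to
    cover `demand` (item counts per type), where each bin holds any
    combination of items fitting within capacity `cap`."""
    limits = [int((cap + eps) // s) for s in sizes]
    configs = [c for c in product(*(range(l + 1) for l in limits))
               if 0 < sum(ci * si for ci, si in zip(c, sizes)) <= cap + eps]

    @lru_cache(maxsize=None)
    def opt(residual):
        if not any(residual):
            return 0                      # base case: demand covered
        best = None
        for c in configs:
            nxt = tuple(max(r - ci, 0) for r, ci in zip(residual, c))
            if nxt == residual:
                continue                  # configuration makes no progress
            sub = 1 + opt(nxt)
            if best is None or sub < best:
                best = sub
        return best

    return opt(tuple(demand))
```

The state space is the set of residual demand vectors, which is why the exact dynamic program becomes expensive; shrinking `configs` to a restricted subset is precisely the speedup the heuristic pursues.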
Key Idea: For a given δ, the packing is constructed as follows:
(i) construct V_δ efficiently and store it compactly; (ii) construct a set S consisting of N vectors randomly chosen from V_δ; and (iii) construct the packing using recurrence Equation (1) with the choice of vectors restricted to S.
Heuristic Packing(n_k, δ)
Input(s):
(1) n_k, a length-k segment of n, the distribution vector of L;
(2) δ, a user-specified parameter.
Output(s): A cover for n_k.
Begin
(1) Constructing V_δ:
(1a) Solve the following knapsack problem (KP(S)): given a collection S consisting of copies of items of sizes s_1, ..., s_d, determine the subsets of S whose size sum is in the interval [1 − δ, 1]. The standard dynamic programming solution for KP(S) results in a two-dimensional table T whose entries are indexed by the number of items and by the distinct weight classes.
(1b) Compactly store V_δ using a directed graph G constructed from the dynamic programming table of KP(S): in G there is one node for each dynamic programming entry in T. There is an edge (u, v) in G if the subproblems in KP(S) corresponding to nodes u and v are directly related (i.e., the solution to the subproblem corresponding to v can be obtained from the subproblem corresponding to u by adding a single item in S), and the weight associated with the edge is the weight of the item that relates these two subproblems.
Note: There is a one-to-one correspondence between the subproblems in KP(S) and the table entries of a dynamic programming solution for KP(S).
(2) Constructing S from G:
(2a) Construct G' by removing all useless nodes from G, where a node is useless if its weight is below 1 − δ and it has no directed edge to a node with greater weight.
(2b) Construct the set S of size N by choosing vectors as follows: set the current node to the source; while the current node has out-degree greater than zero, choose uniformly at random a directed edge leaving the current node, include the weight corresponding to that edge in the current vector, and set the current node to the head of that edge; repeat the above step.
Note: There is a correspondence between a vector constructed by following a directed path in G and a vector in V_δ. So, we construct S by sampling uniformly at random from paths in G that correspond to vectors in V_δ. Since in G there are directed paths that do not correspond to a vector in V_δ, we modify G to obtain G', in which there is a one-to-one correspondence between a directed path from the source to a node with out-degree zero, and a vector in V_δ.
(3) Construct the cover using Equation (1), modified so that the choice of vectors is restricted to S.
End
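The random sampling step (2b) can be approximated more simply by a randomized constructive walk: repeatedly add a random item type that still fits, and accept the resulting bin only if it ends up nearly full. This is our own simplification of the DAG-based path sampler above (it need not sample uniformly) and is intended only to illustrate the idea.

```python
import random


def sample_near_full_configs(sizes, n, cap=1.0, delta=0.05,
                             tries=10000, seed=1):
    """Collect up to n distinct bin configurations (item-count vectors)
    with wastage <= delta, by a randomized constructive walk."""
    rng = random.Random(seed)
    eps = 1e-9
    found = set()
    for _ in range(tries):
        config, used = [0] * len(sizes), 0.0
        while True:
            fits = [i for i, s in enumerate(sizes) if used + s <= cap + eps]
            if not fits:
                break                     # bin is maximal
            i = rng.choice(fits)
            config[i] += 1
            used += sizes[i]
        if used >= cap - delta - eps:     # accept only near-full bins
            found.add(tuple(config))
        if len(found) >= n:
            break
    return sorted(found)
```

Unlike the DAG sampler, rejected walks here waste work; the graph construction in steps (1)-(2) exists precisely to avoid generating configurations that cannot end near-full.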
5 Empirical Analysis of Heuristic C
In this section we present our empirical study of Heuristic C from two perspectives:
(i) solution quality: the nature of approximation guarantees it can provide; and (ii) computational efficiency: the scalability of the heuristic. We desire an algorithm that provides near-optimal solutions and is computationally efficient. However, there is a natural tradeoff between solution quality and computational efficiency, made worse by the hardness of the bin packing problem. Our Heuristic C, where ε is the desired error bound, attempts to obtain a near-optimal solution by essentially breaking the original problem into many identical subproblems of constant size and then solving each subproblem using at most a constant number of distinct bin configurations. For most instances Heuristic C splits the original problem instance into identical subproblems of small size, which are then solved by considering a small number of distinct bin configurations. For some instances Heuristic C needs to consider more distinct bin configurations in order to satisfy the performance guarantee constraint; for these instances Heuristic C is still scalable and provides a good guarantee on its solution quality. For a very small fraction of instances Heuristic C does not provide the approximation guarantee when analyzed using traditional means. However, for most of these instances we were able to obtain the desired performance guarantee by obtaining a better lower bound on the necessary wastage in an optimal solution. We now present our empirical study of Heuristic C by first describing our experimental setup and experiments, and then presenting the experimental results and our observations.
Experimental setup: We created two sets of sequences: (i) SequenceSetH, a set of instances obtained by randomly partitioning unit intervals into triplets, quadruplets, or quintuplets. These are combinatorially hard instances for which we know an optimal solution with wastage of almost zero, and hence for these hard instances our experimental analysis is tight. (ii) SequenceSetR, a set of instances whose item sizes are drawn randomly from a distribution parameterized by the number of item types. These are instances for which we do not necessarily know the optimum and also do not have a good lower bound on the wastage in an optimal solution; hence traditional analysis for these instances may not be tight. We now describe how we generate SequenceSetH and SequenceSetR.
SequenceSetH: For each number of pieces p ∈ {3, 4, 5}, we generate random sequences obtained by randomly partitioning unit intervals into p pieces each. Each sequence is generated as follows: create p items by randomly partitioning the unit interval into p pieces, using p − 1 cut points drawn from the standard uniform distribution and rounded to the nearest multiple of a fixed granularity, and then use the lengths of these pieces as the sizes of the items. Repeat this step once per unit interval.
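A possible generator for such hard instances is sketched below, with our own choices of granularity (three decimal places) and seeding. By construction, the pieces cut from each unit interval fit exactly into one bin, so `num_bins` bins are optimal with essentially zero wastage.

```python
import random


def hard_instance(num_bins, pieces=3, seed=0):
    """SequenceSetH-style instance: cut `num_bins` unit intervals into
    `pieces` parts each; the piece lengths are the item sizes."""
    rng = random.Random(seed)
    items = []
    for _ in range(num_bins):
        cuts = sorted(round(rng.random(), 3) for _ in range(pieces - 1))
        points = [0.0] + cuts + [1.0]
        # consecutive differences telescope, so each interval's pieces
        # sum to 1.0 (zero-length pieces are possible after rounding)
        items.extend(points[i + 1] - points[i] for i in range(pieces))
    rng.shuffle(items)
    return items
```

Shuffling at the end hides the interval structure from the solver, which is what makes recovering the zero-wastage packing combinatorially hard.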
SequenceSetR: For each target number of item types t, we generate random sequences consisting of at most t distinct item sizes as follows: first, determine the item sizes by generating a sample of size t in which each size is drawn from the standard uniform distribution and rounded to the nearest multiple of a fixed granularity; second, partition the unit interval using cut points drawn from the standard uniform distribution, and use the lengths of the pieces obtained by scanning the unit interval from left to right to specify P, the probability distribution of item sizes; and finally, generate the sequence by simulating a multinomial distribution using P.
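The second generator can be sketched similarly, again with our own choices for granularity (two decimal places) and seeding; `random.choices` performs the multinomial draws one item at a time.

```python
import random


def random_instance(length, num_types, seed=0):
    """SequenceSetR-style instance: up to `num_types` distinct sizes
    drawn uniformly, with a random probability distribution over sizes
    obtained by cutting the unit interval."""
    rng = random.Random(seed)
    # sizes rounded to two decimals; the set may deduplicate collisions
    sizes = sorted({max(0.01, round(rng.random(), 2))
                    for _ in range(num_types)})
    cuts = sorted(rng.random() for _ in range(len(sizes) - 1))
    points = [0.0] + cuts + [1.0]
    probs = [points[i + 1] - points[i] for i in range(len(sizes))]
    # one multinomial draw per item position
    return [rng.choices(sizes, weights=probs)[0] for _ in range(length)]
```

Because the piece lengths of the cut interval serve as probabilities, skewed cuts naturally produce the right-skewed size distributions discussed in the experimental results.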
Note: The sequences generated for SequenceSetH are similar to the instances used by Falkenauer (i.e., the Falkenauer triplets), except that here we generate triplets, quadruplets, and quintuplets.
We refer the reader to BPPLib [6] for an excellent and comprehensive collection of codes, benchmarks, and links for the one-dimensional Bin Packing and Cutting Stock problems.
Experiments:
We ran Heuristic C on the instances in SequenceSetH, and Heuristic C, BFD (Best Fit Decreasing), and FFD (First Fit Decreasing) on the instances in SequenceSetR. For instances in SequenceSetH, we wanted a near-optimal solution. For instances in SequenceSetR, we wanted a solution whose quality is better than the solutions obtained through either BFD or FFD. For each sequence, we observed the following: (i) k, the size of the subproblem the heuristic partitions the input instance into; (ii) N, the number of bin configurations it considers while solving the size-k subproblem in (i); and (iii) a lower bound on the necessary wastage of an optimal solution, for the instances where we are not able to guarantee optimality.
Experimental Results: For instances in SequenceSetH, the chosen subproblem size k varied over a wide range, but for the large majority of instances it fell within a narrow band. For instances with a small number of distinct item sizes, Heuristic C obtained the desired quality with small values of k and N, and for moderately many distinct sizes it still obtained the desired quality with somewhat larger k and N. However, for instances with many distinct item sizes, there are some for which we are unable to get the desired solution quality irrespective of the value of N. For these instances, the performance is very sensitive to the heuristic's choice of configurations; in order for the randomly chosen bin configurations to contain certain specific collections of configurations, our heuristic ends up picking a larger sample.
For SequenceSetR, for most instances we are able to get solutions of the desired quality with small values of k and N. However, for some instances in which the item-size distribution is skewed toward large items, Heuristic C performs as well as the better of BFD and FFD, but traditional analysis is unable to provide a guarantee of near-optimality, mostly due to the inability to get a good lower bound on the necessary wastage in any optimal solution. In most of these instances we are able to improve the lower bound and hence the performance guarantee.
Conclusions: We are able to design a simple heuristic that, as our preliminary empirical study indicates, is highly scalable and is amenable to tighter analysis due to its use of bins with wastage as close to zero as possible. For most instances it scales because it splits the given sequence of items into identical subproblems of constant length, and each of these subproblems is solved using a small number of distinct bin configurations. However, when the number of item sizes is large and the average item size is large, our heuristic is not able to guarantee near-optimality for a very few instances, partly because of the sensitivity of the instance to the choice of bin configurations and mostly due to the inability to get a good lower bound on the necessary wastage in any optimal solution for these instances.
References
[1] B. S. Baker. A new proof for the first-fit decreasing bin-packing algorithm. Journal of Algorithms, Volume 6, Issue 1, 49-70, 1985.
[2] J. Balogh, J. Bekesi, G. Dosa, J. Sgall and R. van Stee. The optimal absolute ratio for online bin packing. Proceedings of the 26th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), 1425-1438, 2015.
[3] B. S. Baker and E. G. Coffman Jr. A tight asymptotic bound for next-fit-decreasing bin-packing. SIAM Journal on Algebraic Discrete Methods, Volume 2, Issue 2, 147-152, 1981.
[4] E. G. Coffman, J. Csirik, G. Galambos, S. Martello and D. Vigo. Bin Packing Approximation Algorithms: Survey and Classification. In Handbook of Combinatorial Optimization, edited by P. M. Pardalos et al., Springer Science+Business Media, New York, 455-531, 2013.
[5] M. Delorme, M. Iori and S. Martello. Bin Packing and Cutting Stock Problems: Mathematical Models and Exact Algorithms. European Journal of Operational Research, Volume 255, Issue 1, 1-20, 2016.
[6] M. Delorme, M. Iori and S. Martello. BPPLIB: a library for bin packing and cutting stock problems. Optimization Letters, Volume 12, Issue 2, 235-250, 2018.
[7] W. Fernandez de la Vega and G. S. Lueker. Bin packing can be solved within 1 + ε in linear time. Combinatorica, Volume 1, Issue 4, 349-355, 1981.
[8] D. K. Friesen and M. A. Langston. Analysis of a compound bin-packing algorithm. SIAM Journal on Discrete Mathematics, Volume 4, Issue 1, 61-79, 1991.
[9] M. R. Garey and D. S. Johnson. Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman, New York, 1979.
[10] M. R. Garey and D. S. Johnson. A 71/60 theorem for bin packing. Journal of Complexity, Volume 1, Issue 1, 65-106, 1985.
[11] D. S. Hochbaum and D. B. Shmoys. A packing problem you can almost solve by sitting on your suitcase. SIAM Journal on Algebraic and Discrete Methods, Volume 7, Issue 2, 247-257, 1986.
[12] D. S. Hochbaum and D. B. Shmoys. Using dual approximation algorithms for scheduling problems: theoretical and practical results. Journal of the ACM, Volume 34, Issue 1, 144-162, 1987.
[13] D. S. Hochbaum. Approximation Algorithms for NP-Hard Problems. PWS Publishing Company, Boston, 1997.
[14] D. S. Johnson. Near-Optimal Bin Packing Algorithms. PhD thesis, MIT, Cambridge, MA, 1973.
[15] D. S. Johnson. Fast algorithms for bin packing. Journal of Computer and System Sciences, Volume 8, Issue 3, 272-314, 1974.
[16] D. S. Johnson. The NP-completeness column: an ongoing guide. Journal of Algorithms, Volume 3, Issue 2, 288-300, 1982.
[17] D. S. Johnson, A. Demers, J. D. Ullman, M. R. Garey and R. L. Graham. Worst-case performance bounds for simple one-dimensional packing algorithms. SIAM Journal on Computing, Volume 3, Issue 4, 256-278, 1974.
[18] N. Karmarkar and R. M. Karp. An efficient approximation scheme for the one-dimensional bin-packing problem. Proceedings of the 23rd Annual IEEE Symposium on Foundations of Computer Science (FOCS), Chicago, IL, 312-320, 1982.
[19] C. C. Lee and D. T. Lee. A simple on-line bin-packing algorithm. Journal of the ACM, Volume 32, Issue 3, 562-572, 1985.
[20] S. S. Seiden. On the online bin packing problem. Journal of the ACM, Volume 49, Issue 5, 640-671, 2002.
[21] A. C. C. Yao. New algorithms for bin packing. Journal of the ACM, Volume 27, Issue 2, 207-227, 1980.
[22] M. Yue. A simple proof of the inequality FFD(L) ≤ (11/9)OPT(L) + 1, for all L, for the FFD bin-packing algorithm. Acta Mathematicae Applicatae Sinica, Volume 7, Issue 4, 321-331, 1991.