1 Introduction
This technical note studies a stable merge sort algorithm, called AdaptiveShiversSort, which is a variant of the algorithms SS and AugmentedShiversSort, respectively introduced by Shivers [8] and by Buss and Knop [3]. Like SS, AugmentedShiversSort and the well-known TimSort algorithm, AdaptiveShiversSort is a sorting algorithm based on splitting arrays into monotonic runs, which are then merged together.
Plain greedy algorithms are already very efficient for splitting the array into a minimal number of monotonic runs, and there exists a wealth of merging algorithms, i.e., of algorithms designed for merging two sorted arrays into one sorted array. Hence, our description of the algorithm AdaptiveShiversSort itself mainly focuses on its merging policy, i.e., the order in which monotonic runs are to be merged.
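Such a greedy splitting step can be sketched as follows (a minimal, non-authoritative version; the function name is ours). To preserve stability, only strictly decreasing runs are detected, and they are reversed in place so that every produced run is nondecreasing, as in TimSort.

```python
def run_decomposition(a):
    """Greedily split `a` into maximal monotonic runs.

    Decreasing runs must be strictly decreasing (for stability);
    they are reversed so that every returned run is nondecreasing."""
    runs, i, n = [], 0, len(a)
    while i < n:
        j = i + 1
        if j < n and a[j] < a[i]:          # strictly decreasing run
            while j < n and a[j] < a[j - 1]:
                j += 1
            a[i:j] = a[i:j][::-1]          # reverse it in place
        else:                              # nondecreasing run
            while j < n and a[j] >= a[j - 1]:
                j += 1
        runs.append(a[i:j])
        i = j
    return runs
```

For instance, `run_decomposition([1, 2, 5, 4, 3, 7])` yields the three runs `[1, 2, 5]`, `[3, 4]` and `[7]`.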
The idea of starting with a decomposition of the array into monotonic runs already appears in Knuth's NaturalMergeSort [5], where increasing runs are sorted using the same mechanism as in MergeSort. Other merging strategies combined with decomposition into runs appear in the literature, such as the MinimalSort of [9] (see also [2] for other considerations on the same topic), the well-known TimSort, which was implemented in several popular programming languages, and the more recent PeekSort and PowerSort of [7]. All of them have nice properties: they run in time O(n log n) and even O(n + n log ρ), where n is the length of the array and ρ is the number of runs, which is optimal in the model of sorting by comparisons [6], using the classical counting argument for lower bounds.
Some of them even adapt to the run lengths, and not only to the number of runs: if the array consists of ρ runs of lengths r_1, …, r_ρ, then they run in time O(n + nH), where H is defined as H = H(r_1/n, …, r_ρ/n) and H(x_1, …, x_ρ) = −(x_1 log2(x_1) + … + x_ρ log2(x_ρ)) is the binary Shannon entropy. Since sorting arrays with such run decompositions requires at least nH − O(n) element comparisons in the worst case, this finer upper bound is once again optimal.
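For reference, the entropy H is straightforward to compute from the run lengths; the short helper below (its name is ours) uses the equivalent formula H = Σ (r/n) log2(n/r).

```python
from math import log2

def run_entropy(run_lengths):
    """Binary Shannon entropy H of the run-length distribution:
    H = sum((r / n) * log2(n / r)) over all run lengths r."""
    n = sum(run_lengths)
    return sum((r / n) * log2(n / r) for r in run_lengths)
```

For instance, `run_entropy([2, 2, 4])` is 1.5, which is indeed at most log2(3) ≈ 1.585, the value obtained when only the number of runs is taken into account.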
However, the question of evaluating the multiplicative constant hidden in the O notation needs to be addressed too. Therefore, we settle for the following cost model. Since naive merging algorithms require approximately m + n element comparisons and m + n element moves for merging two arrays of lengths m and n, and since Ω(m + n) element moves may be needed in the worst case (for any values of m and n), we measure below the complexity in terms of merge cost [1, 3, 4, 7]: the cost of merging two runs of lengths m and n is defined as m + n, and we identify the complexity of AdaptiveShiversSort with the sum of the costs of the merges processed during an execution of the algorithm.
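In this model, a naive merging procedure touches each element of both inputs once, so its cost is the sum of the two lengths. A sketch that returns both the merged run and the incurred cost (ties favour the left run, as stability requires):

```python
def merge(left, right):
    """Stable, naive merge of two sorted lists.

    Returns the merged list together with the merge cost,
    defined as len(left) + len(right)."""
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:            # ties taken from `left`: stable
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    out.extend(left[i:])
    out.extend(right[j:])
    return out, len(left) + len(right)
```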
In that context, the result of [6] can also be refined [2, 7]: in the model of sorting by comparisons, at least nH − O(n) comparisons are required in the worst case. The two sorting algorithms PeekSort and PowerSort designed by Munro and Wild [7] match exactly this lower bound, since their merge cost (which does not take into account the cost of identifying the monotonic runs of the array) is at most nH + O(n).
However, both algorithms require knowing beforehand the total length of the array to be sorted. Consequently, neither of them falls into the class of k-aware stable sorting algorithms described by Buss and Knop [3] for any integer k, nor are they adapted for merging streams of data on the fly.
In what follows, we introduce, describe and analyse the worst-case merge cost of the algorithm AdaptiveShiversSort. In particular, and unlike the algorithms PeekSort and PowerSort, this algorithm is 3-aware in the sense of Buss and Knop. Yet, Theorem 3 below proves that its merge cost is at most nH + O(n), which means that it also matches the lower bound of nH − O(n) comparisons. In addition, it has the advantage of having a structure that is very similar to that of TimSort, which means that switching from one algorithm to the other might be essentially costless in practice.
2 Algorithm description
Algorithm 1 presents a high-level description of the AdaptiveShiversSort algorithm.
Like its counterparts SS, AugmentedShiversSort and TimSort, the algorithm AdaptiveShiversSort is based on discovering monotonic runs and on maintaining a stack of such runs, which may be merged or pushed onto the stack according to whether the conditions of cases #1 to #4 apply. In particular, since these conditions only refer to the values of ℓ_1, ℓ_2 and ℓ_3, and since only the runs R_1, R_2 and R_3 may be merged, this algorithm falls within the class of 3-aware stable sorting algorithms such as described by Buss and Knop [3].
Note that the algorithm SS itself is obtained by deleting lines 6 and 7 (i.e., the cases #2 and #3), and that AugmentedShiversSort is obtained by just deleting line 7 (i.e., the case #3) and suitably modifying the condition of line 6.
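For concreteness, the merging policy can be simulated on run lengths alone. The Python sketch below is a reconstruction, not a verbatim transcription of Algorithm 1: it condenses the cases #1 to #4 into a single merging rule — merge the runs R_2 and R_3 while ℓ_3 ≤ max(ℓ_1, ℓ_2), pushing a new run otherwise, and collapsing the stack at the end — and returns the total merge cost.

```python
def merge_cost_of_policy(run_lengths):
    """Replay the (condensed) merging policy on a sequence of run
    lengths and return the total merge cost, i.e. the sum of the
    lengths of the two runs, for every merge performed."""
    lg = lambda r: r.bit_length() - 1   # lg(r) = floor(log2(r))
    stack, cost = [], 0                 # stack[-1] is the topmost run R1
    for r in run_lengths:
        stack.append(r)                 # push the next run
        # merge R2 and R3 while l3 <= max(l1, l2)
        while len(stack) >= 3 and lg(stack[-3]) <= max(lg(stack[-1]), lg(stack[-2])):
            top = stack.pop()
            merged = stack.pop() + stack.pop()
            cost += merged
            stack.append(merged)
            stack.append(top)
    while len(stack) >= 2:              # final collapse of the stack
        merged = stack.pop() + stack.pop()
        cost += merged
        stack.append(merged)
    return cost
```

For instance, on runs of lengths 5, 3 and 3, no merge is triggered before the final collapse, which costs 6 + 11 = 17.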
3 Worstcase merge cost analysis
On the other hand, as mentioned in Section 2, AdaptiveShiversSort is a 3-aware algorithm, with similar worst-case upper bounds in terms of merge cost, as outlined by the result below.
The merge cost of AdaptiveShiversSort is bounded above by nH + O(n).
The remainder of this section is devoted to proving Theorem 3. In what follows, we denote by r the length of a run R, and by ℓ the integer ⌊log2(r)⌋. We readily adapt these notations when the name of the run considered varies, e.g., we denote by r' the length of a run R', and by ℓ' the integer ⌊log2(r')⌋. In particular, we will commonly denote the stack by S = (R_1, …, R_h), where R_i is the i-th topmost run of the stack. The length of R_i is then denoted by r_i, and we set ℓ_i = ⌊log2(r_i)⌋.
With this notation in mind, we first prove two auxiliary results.
When two runs R_1 and R_2 are merged into a single run R, we have max(ℓ_1, ℓ_2) ≤ ℓ ≤ max(ℓ_1, ℓ_2) + 1.
Proof.
Without loss of generality, we assume that r_1 ≤ r_2. In that case, it comes that r_2 ≤ r = r_1 + r_2 ≤ 2 r_2, and therefore that ℓ_2 ≤ ℓ ≤ ℓ_2 + 1, i.e., max(ℓ_1, ℓ_2) ≤ ℓ ≤ max(ℓ_1, ℓ_2) + 1. ∎
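The two inequalities of the lemma can also be checked exhaustively for small run lengths:

```python
from math import floor, log2

# Exhaustive check of max(l1, l2) <= l <= max(l1, l2) + 1
# for all run lengths r1, r2 up to 128.
for r1 in range(1, 129):
    for r2 in range(1, 129):
        l1, l2 = floor(log2(r1)), floor(log2(r2))
        l = floor(log2(r1 + r2))
        assert max(l1, l2) <= l <= max(l1, l2) + 1
```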
At any time during the execution of the algorithm, if the stack of runs is S = (R_1, …, R_h), we have:

ℓ_{i+2} > max(ℓ_{i+1}, ℓ_i) for all i such that 2 ≤ i ≤ h − 2.  (1)
Proof.
The proof is done by induction. First, if h ≤ 3, there is nothing to prove: this case occurs, in particular, when the algorithm starts.
We prove now that, if the inequalities of (1) hold at some point, they still hold after an update on the stack, triggered by any of the cases #1 to #4. This is done by a case analysis, denoting by S the stack before the update and by S' = (R'_1, …, R'_{h'}) the stack after the update:

If Case #1 just occurred, then a new run R was just pushed onto the stack; we have h' = h + 1, R'_1 = R and R'_{i+1} = R_i for all i ≥ 1. Since (1) holds in S, we already have ℓ'_{i+2} > max(ℓ'_{i+1}, ℓ'_i) for all i ≥ 3. Moreover, since Case #1 occurred, none of the conditions for triggering the cases #2 to #4 holds in S. This means that ℓ_3 > max(ℓ_2, ℓ_1) or, equivalently, that ℓ'_4 > max(ℓ'_3, ℓ'_2), which shows that (1) holds in S'.

If one of the cases #2 to #4 just occurred, then h' = h − 1, R'_i = R_{i+1} for all i ≥ 3, and R'_2 is either equal to R_3 (in Case #4) or to the result of the merge between the runs R_2 and R_3 (in Cases #2 and #3). Thanks to Lemma 3, this means that either ℓ'_2 = ℓ_3 (in Case #4) or that ℓ'_2 ≤ max(ℓ_2, ℓ_3) + 1 (in Cases #2 and #3); in both cases, since (1) ensures that ℓ_4 > max(ℓ_2, ℓ_3), we get ℓ'_2 ≤ ℓ_4. Moreover, since (1) holds in S, we already have ℓ'_{i+2} > max(ℓ'_{i+1}, ℓ'_i) for all i ≥ 3. It follows that ℓ'_4 = ℓ_5 > max(ℓ_4, ℓ_3) ≥ max(ℓ'_2, ℓ'_3), which shows that (1) also holds in S'.
∎
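The invariant can also be tested empirically. The sketch below replays the merging policy on run lengths (under a condensed reading of the cases #2 to #4: merge R_2 and R_3 while ℓ_3 ≤ max(ℓ_1, ℓ_2), then collapse the stack; this reading is our reconstruction) and asserts, after every update, the inequalities of (1) as reconstructed here, together with the logarithmic stack-height bound discussed in Section 4.

```python
import random

def replay_and_check(run_lengths):
    """Replay the merging policy on run lengths; assert, after every
    push or merge, that l_{i+2} > max(l_{i+1}, l_i) for every i >= 2
    (runs counted from the top), and check the stack-height bound."""
    lg = lambda r: r.bit_length() - 1          # lg(r) = floor(log2(r))
    def invariant_holds(stack):
        ls = [lg(r) for r in reversed(stack)]  # ls[0] is l_1 (topmost run)
        return all(ls[j] > max(ls[j - 1], ls[j - 2]) for j in range(3, len(ls)))
    n, stack = sum(run_lengths), []
    for r in run_lengths:
        stack.append(r)                        # case #1: push the next run
        assert invariant_holds(stack)
        # condensed merging rule: merge R2 and R3 while l3 <= max(l1, l2)
        while len(stack) >= 3 and lg(stack[-3]) <= max(lg(stack[-1]), lg(stack[-2])):
            top = stack.pop()
            stack.append(stack.pop() + stack.pop())
            stack.append(top)
            assert invariant_holds(stack)
        # stack height stays logarithmic in the total length n
        assert len(stack) <= (n.bit_length() - 1) + 3
    while len(stack) >= 2:                     # final collapse
        stack.append(stack.pop() + stack.pop())
        assert invariant_holds(stack)

random.seed(0)
for _ in range(100):
    lengths = [random.randint(1, 50) for _ in range(random.randint(1, 40))]
    replay_and_check(lengths)
```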
Before going further in the proof of Theorem 3, we first need to classify run merges in several classes. When merging two runs R_1 and R_2 into one bigger run R, we say that the merging of R_1 is expanding if ℓ > ℓ_1, and is nonexpanding otherwise. Hence, we refer below to the merging of R_1 with R_2 and to the merging of R_2 with R_1 as if these were two separate objects. In particular, the merge cost is itself split in two parts: one part, for a cost of r_1, is assigned to the merging of R_1, and the other part, for a cost of r_2, is assigned to the merging of R_2. Finally, note that, if ℓ_1 = ℓ_2, then the merging of R_1 with R_2 is necessarily expanding, since r = r_1 + r_2 ≥ 2^{ℓ_1} + 2^{ℓ_2} = 2^{ℓ_1 + 1}. Hence, when two runs R_1 and R_2 are merged, either the merging of R_1 or the merging of R_2 (or both, among others when ℓ_1 = ℓ_2) is expanding.

The total cost of expanding merges is at most n(H + 1).
Proof.
While the algorithm is performed, the elements of a run of initial length r may take part in at most log2(n/r) + 1 expanding merges: indeed, the integer ℓ associated with the run that contains them never decreases, increases by at least 1 at each expanding merge, starts at ⌊log2(r)⌋ and can never exceed ⌊log2(n)⌋. Consequently, if the array is initially split into runs of lengths r_1, …, r_ρ, the total cost of expanding merges is at most the sum of the quantities r_i (log2(n/r_i) + 1), i.e., at most n(H + 1). ∎
It remains to prove that the total cost of nonexpanding merges is at most 3n. This requires further splitting sequences of merges based on the case that triggered these merges. Hence, when discussing the various merges that may arise, we also call #i-merge every merge triggered by the case #i (for i = 2, 3, 4).
Now, we define the starting sequence of a run R as the (possibly empty) maximal sequence of consecutive #2-merges that immediately follows the push of R onto the stack. We call middle sequence of R the maximal sequence of consecutive #2- and #3-merges that follows the starting sequence of R, and ending sequence of R the maximal sequence of consecutive merges that follows the middle sequence of R (and precedes the next run push). Below, we also call nonexpanding cost of a sequence of merges the total cost of the nonexpanding merges included in this sequence.
Before looking more closely at the nonexpanding cost of these sequences, we first prove invariants similar to that of Lemma 3.
Every middle sequence consists only of #3-merges, and every ending sequence consists only of #4-merges.
Proof.
Let M be a middle sequence. By construction, it can contain #2-merges and #3-merges only. Hence, we just need to prove that it cannot contain #2-merges, i.e., that ℓ_3 > ℓ_1 at any time during the sequence (including just before it starts). This statement is proved by induction.
First, the sequence starts with a #3-merge, which means precisely that ℓ_3 > ℓ_1 just before it starts. Then, we prove that, if the inequality ℓ_3 > ℓ_1 holds at some point, it still holds after a #3-merge. Indeed, let us denote by S the stack before the merge and by S' the stack after the merge.
We have ℓ'_1 = ℓ_1 and ℓ'_3 = ℓ_4. Hence, it follows from Lemma 3 that ℓ'_3 = ℓ_4 > ℓ_3 > ℓ_1 = ℓ'_1, which proves that M consists only of #3-merges.
Similarly, let E be an ending sequence. This time, we need to prove that it cannot contain #2-merges or #3-merges, i.e., that ℓ_3 > ℓ_1 and ℓ_3 > ℓ_2 at any time during the sequence. We also proceed by induction. First, the sequence starts with a #4-merge, which means that ℓ_3 > ℓ_1 and ℓ_3 > ℓ_2 just before it starts.
Then, we prove that, if the inequalities ℓ_3 > ℓ_1 and ℓ_3 > ℓ_2 hold at some point, they still hold after a #4-merge. Indeed, let us denote again by S and S' the stacks before and after the merge. We have R'_2 = R_3, R'_3 = R_4, and R'_1 is the result of the merge of R_1 and R_2. Moreover, the stack S cannot satisfy the conditions for Cases #2 or #3, which means that ℓ_3 > ℓ_1 and ℓ_3 > ℓ_2. Hence, and due to Lemmas 3 and 3, it follows that ℓ'_1 ≤ max(ℓ_1, ℓ_2) + 1 ≤ ℓ_3 = ℓ'_2 < ℓ_4 = ℓ'_3, so that ℓ'_3 > max(ℓ'_1, ℓ'_2). ∎
At any time during an ending sequence, except possibly just before the first merge of the sequence, we have ℓ_1 ≤ ℓ_2.
Proof.
Lemma 3 states that the ending sequence consists only of #4-merges. Hence, we just need to prove that the inequality ℓ_1 ≤ ℓ_2 must hold after any #4-merge.
Indeed, let us denote by S the stack before the merge and by S' the stack after the merge: we have ℓ'_2 = ℓ_3, and R'_1 is the result of the merge of R_1 and R_2. Moreover, since the conditions for Cases #2 and #3 are not satisfied by S, we have ℓ_3 > ℓ_1 and ℓ_3 > ℓ_2. It follows from Lemma 3 that ℓ'_1 ≤ max(ℓ_1, ℓ_2) + 1 ≤ ℓ_3 = ℓ'_2. ∎
With the help of these auxiliary results, we may now prove the following statement, whose proof is also the most technical one in this paper.
The total nonexpanding cost of the starting, middle and ending sequences of a run R is at most 2r.
Proof.
Let S = (R_1, …, R_h) be the stack just after the run R has been pushed, so that R_1 = R, and let k be the largest integer such that ℓ_k ≤ ℓ_1. Below, we simply speak of the starting, middle and ending sequences of R.
The starting sequence consists in merging the run R_2 with R_3, then merging the resulting run successively with R_4, …, R_k. Due to Lemma 3, the nonexpanding merges of the starting sequence involve at most once each of the runs R_2, …, R_k, when these are first merged. Let j be the largest integer, if any, such that the first merge of R_j is nonexpanding, and let R' be the run into which R_j is merged: the runs R_2, …, R_j are all contained in R', and ℓ' = ℓ_j ≤ ℓ_1. Since r' < 2^{ℓ' + 1} ≤ 2^{ℓ_1 + 1} ≤ 2r, the nonexpanding cost of the starting sequence is at most r' ≤ 2r.
Then, let us consider the first merge of the ending sequence of R, if any, and let S = (R_1, …, R_h) be the stack just before this merge occurs. Note that R_1 = R, since the merges of the starting and middle sequences never involve the topmost run of the stack. Let us further assume that this merge is nonexpanding. Then, it must be the case that the merging of R itself is the nonexpanding one, so that r + r_2 < 2^{ℓ_1 + 1} ≤ 2r, and therefore r_2 < r. Moreover, if the starting sequence had a nonzero nonexpanding cost, then the above-defined run R' turns out to be entirely contained in the run R_2. It follows that r' ≤ r_2 < r.
Let us gather the above results. First, the starting sequence has a nonexpanding cost at most 2r. Then, the middle sequence consists only of expanding merges, hence its nonexpanding cost is 0. Finally, the ending sequence may contain only one nonexpanding merge; if it does, then its nonexpanding cost is exactly r, and we also have r' ≤ r_2 < r, so that the nonexpanding cost of the starting sequence is itself smaller than r. Hence, in all cases, the total nonexpanding cost of the starting, middle and ending sequences of R is at most 2r. ∎
We may now conclude with the proof of our main result.
Proof of Theorem 3.
Lemma 3 states that the total cost of expanding merges is at most n(H + 1). Then, Lemma 3 states that the total nonexpanding cost of the starting, middle and ending sequences of a given run R is at most 2r; hence, taking all runs into account, the total nonexpanding cost of these sequences is at most 2n.
It remains to take care of the nonexpanding cost of the runs merged in line 2 of Algorithm 1. The merges performed in line 2 are the same merges as those that would occur if we had appended a (fictitious) run of length n to our array. Adding this run would increase the total nonexpanding cost of Algorithm 1 by at most 2n, including a cost of n paid because of the nonexpanding merge of the fictitious run itself. By discounting this fictitious cost of n, we observe that the nonexpanding cost of line 2 is at most n, which completes the proof of Theorem 3. ∎
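The flavour of Theorem 3's bound can be sanity-checked numerically. The sketch below replays the merging policy on random run lengths (again under the condensed reading we reconstruct: merge R_2 and R_3 while ℓ_3 ≤ max(ℓ_1, ℓ_2), then collapse the stack) and verifies that the measured merge cost stays below nH + 4n, the constant 4 being a generous slack suggested by the accounting of this section.

```python
import random
from math import log2

def merge_cost(run_lengths):
    """Total merge cost of the (condensed) merging policy: merge R2 and
    R3 while l3 <= max(l1, l2), then collapse the stack."""
    lg = lambda r: r.bit_length() - 1   # lg(r) = floor(log2(r))
    stack, cost = [], 0
    for r in run_lengths:
        stack.append(r)
        while len(stack) >= 3 and lg(stack[-3]) <= max(lg(stack[-1]), lg(stack[-2])):
            top = stack.pop()
            merged = stack.pop() + stack.pop()
            cost += merged
            stack.append(merged)
            stack.append(top)
    while len(stack) >= 2:
        merged = stack.pop() + stack.pop()
        cost += merged
        stack.append(merged)
    return cost

random.seed(1)
for _ in range(200):
    lengths = [random.randint(1, 100) for _ in range(random.randint(1, 50))]
    n = sum(lengths)
    entropy = sum((r / n) * log2(n / r) for r in lengths)
    # measured merge cost stays within n * H + 4n (generous constant)
    assert merge_cost(lengths) <= n * (entropy + 4)
```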
4 Implementation details
Here, and based on the above study, we focus on several implementation details.
Our first remark follows directly from the classification of merges in starting, middle and ending sequences and from Lemma 3. It suggests replacing the main merging loop of Algorithm 1 with three successive loops, thereby obtaining Algorithm 2, which is equivalent to Algorithm 1.
Our second remark is a side effect of Lemma 3, thanks to which we can derive the following upper bound on the size of the stack maintained throughout the algorithm. This upper bound has obvious consequences on real-world implementations of the AdaptiveShiversSort algorithm: since the stack size may be evaluated a priori from the knowledge of n, the stack can be simulated on a fixed-size array.
Proposition.
At any time during the execution of the algorithm, the stack size is at most ⌊log2(n)⌋ + 3.
Proof.
Consider a stack S = (R_1, …, R_h) with h ≥ 4. Lemma 3 states that ℓ_4 > ℓ_3 ≥ 0 and that ℓ_i > ℓ_{i−1} for all i ≥ 4, whence ℓ_h ≥ h − 3. Since r_h ≤ n, it follows that h − 3 ≤ ℓ_h ≤ ⌊log2(n)⌋, i.e., h ≤ ⌊log2(n)⌋ + 3. ∎
Our last remark is more anecdotal, and concerns a low-level implementation detail: having to store the values ℓ_i, or to recompute them on the fly whenever needed from the formula ℓ_i = ⌊log2(r_i)⌋, might be bothersome. Fortunately, given two integers x and y, checking whether ⌊log2(x)⌋ = ⌊log2(y)⌋ is made very easy by the use of boolean integer operations, such as the bitwise xor function, and can be coded as the one-line program presented in Algorithm 3.
Indeed, let us set ℓ_x = ⌊log2(x)⌋ and ℓ_y = ⌊log2(y)⌋; the test then amounts to checking whether x xor y ≤ min(x, y). If ℓ_x = ℓ_y = ℓ, then both x and y are (ℓ+1)-bit integers, hence are greater than the integer x xor y, which uses at most ℓ bits; if ℓ_x < ℓ_y, the integer x xor y is clearly greater than x; finally, if ℓ_x > ℓ_y, the integer x xor y uses exactly ℓ_x + 1 bits, since its ℓ_x-th bit must be a 1, and therefore is greater than y.
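A natural way to realise Algorithm 3 in Python is the one-liner below (our reconstruction of the test: ⌊log2(x)⌋ = ⌊log2(y)⌋ exactly when x xor y ≤ min(x, y)), which can be verified exhaustively against the direct definition:

```python
from math import floor, log2

def same_log2_floor(x, y):
    """True exactly when floor(log2(x)) == floor(log2(y)), for
    positive integers x and y, using only cheap bitwise operations."""
    return (x ^ y) <= min(x, y)

# exhaustive verification against the direct definition
for x in range(1, 513):
    for y in range(1, 513):
        assert same_log2_floor(x, y) == (floor(log2(x)) == floor(log2(y)))
```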
References
 [1] Nicolas Auger, Cyril Nicaud, and Carine Pivoteau. Merge strategies: From Merge Sort to TimSort. Research Report hal-01212839, hal, 2015. URL: https://hal-upec-upem.archives-ouvertes.fr/hal-01212839.
 [2] Jérémy Barbay and Gonzalo Navarro. On compressing permutations and adaptive sorting. Theor. Comput. Sci., 513:109–123, 2013. doi:10.1016/j.tcs.2013.10.019.
 [3] Sam Buss and Alexander Knop. Strategies for stable merge sorting. Research Report abs/1801.04641, arXiv, 2018. URL: http://arxiv.org/abs/1801.04641.
 [4] Mordecai J. Golin and Robert Sedgewick. Queue-mergesort. Information Processing Letters, 48(5):253–259, 1993.
 [5] Donald E. Knuth. The Art of Computer Programming, Volume 3: Sorting and Searching (2nd Ed.). Addison Wesley Longman Publishing Co., Redwood City, CA, USA, 1998.
 [6] Heikki Mannila. Measures of presortedness and optimal sorting algorithms. IEEE Trans. Computers, 34(4):318–325, 1985. doi:10.1109/TC.1985.5009382.
 [7] J. Ian Munro and Sebastian Wild. Nearly-optimal mergesorts: Fast, practical sorting methods that optimally adapt to existing runs. In Yossi Azar, Hannah Bast, and Grzegorz Herman, editors, 26th Annual European Symposium on Algorithms (ESA 2018), Leibniz International Proceedings in Informatics (LIPIcs), pages 63:1–63:15, 2018.
 [8] Olin Shivers. A simple and efficient natural merge sort. Technical report, Georgia Institute of Technology, 2002.
 [9] Tadao Takaoka. Partial solution and entropy. In Rastislav Královič and Damian Niwiński, editors, Mathematical Foundations of Computer Science 2009, pages 700–711, Berlin, Heidelberg, 2009. Springer Berlin Heidelberg.