
Generalized Leapfrogging Samplesort: A Class of O(n log^2 n) Worst-Case Complexity and O(n log n) Average-Case Complexity Sorting Algorithms

The original Leapfrogging Samplesort operates on a sorted sample of size s and an unsorted part of size s+1. We generalize this to a sorted sample of size s and an unsorted part of size (2^k - 1)(s+1), where k = O(1). We present a practical implementation of this class of algorithms and we show that the worst-case complexity is O(n log^2 n) and the average-case complexity is O(n log n).

Keywords: Samplesort, Quicksort, Leapfrogging Samplesort, sorting, analysis of algorithms.


1 Introduction

Samplesort was shown by Frazer and McKellar [5] to be a sorting algorithm with the potential to compete with Quicksort [6] in terms of average running time. In fact, it was shown in Frazer and McKellar [5] that Samplesort's average running time slowly approaches the information-theoretic lower bound. Apers [3], on the other hand, improved Samplesort by introducing Recursive Samplesort. The idea is to make use of Samplesort itself, instead of Quicksort, in sorting the sample. The expected number of comparisons of Recursive Samplesort was shown in Apers [3] to be close to the information-theoretic lower bound. Another implementation of Samplesort was given by Peters and Kritzinger [7]. Unfortunately, none of these implementations of Samplesort can be considered practical. The implementation of Peters and Kritzinger [7], for example, uses temporary storage locations for the sample, which are eventually used to store pointers to positions in the array bounded by the sample. The implementation of Apers [3], on the other hand, uses a stack to store pointers to positions in the array that are bounded by the sample. All of these implementations run in O(n^2) worst-case time.

In 1995, Albacea [1] reported the algorithm Leapfrogging Samplesort, which is a practical implementation of Samplesort. The algorithm has a worst-case complexity of O(n log^2 n) (all logarithms in this paper are to base 2, except when explicitly stated otherwise) and an average-case complexity of O(n log n). Albacea [2] estimated the exact average-case complexity to a value that is very near the information-theoretic lower bound. Chen [4], in 2006, proposed the algorithm Full Sample Sort, whose worst-case complexity is O(n log^2 n) and whose average-case complexity is O(n log n).

In this paper, we introduce a generalization of Leapfrogging Samplesort, where we have a sorted sample of size s and an unsorted part of size (2^k - 1)(s+1), where k = O(1). When k is allowed to grow with n, say k = log(n+1) - 1 so that the initial sample of size 1 is used to partition all of the remaining elements, the algorithm reduces to Quicksort. The generalized Leapfrogging Samplesort has a worst-case complexity of O(n log^2 n) and an average-case complexity of O(n log n). Thus, this class of algorithms extends the number of practical algorithms whose worst-case complexity is O(n log^2 n) and whose average-case complexity is O(n log n). The author is aware of only two such algorithms, Leapfrogging Samplesort by Albacea [1] and Full Sample Sort by Chen [4].

2 Generalized Leapfrogging Samplesort

The original Leapfrogging Samplesort involves, in each stage of the sorting process, the first 2s+1 elements of the sequence, where the first s elements are already sorted and the next s+1 elements are to be partitioned and sorted using the sorted s elements as the sample.

The algorithm starts with the leftmost element as a sorted sample of size 1 that is used to partition the next 2 elements, eventually producing a sorted sequence of size 3. The sorted sequence of size 3 is used as a sample to partition the next 4 elements, eventually producing a sorted sequence of size 7. The sorted sequence of size 7 is used to partition the next 8 elements, eventually producing a sorted sequence of size 15. The process is repeated until the whole sequence is sorted.

Given a sequence prefixed by a sorted sample of size s and followed by an unsorted part whose size is at most (2^k - 1)(s+1), an outline of the algorithm for partitioning the unsorted part using the sorted sample is as follows:

Step 1: Let m be the middle element of the sorted sample; the group of sample elements to the left of m is the left subsample and the group of sample elements to the right of m is the right subsample. Using m as the pivot element, we partition the unsorted part, thereby producing two partitions, namely: the left partition (elements which are less than m; without loss of generality, we assume that the elements of the sequence are distinct) and the right partition (elements which are greater than m). Then m and the right subsample are moved to the left of the right partition, and the left partition is moved to the right of the left subsample. This step produces two subsequences, each prefixed by a sorted sample (a worked illustration is given after this outline).

Step 2: Recursively apply Step 1 to the two subsequences produced in Step 1, until the size of the sorted sample is reduced to 0.

If, after the partitioning process, a nonempty partition remains whose sample has been exhausted, then that partition is sorted by Leapfrogging Samplesort itself.
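To illustrate Step 1, consider the (purely illustrative) sequence 3 5 8 9 1 7 4, where 3 5 8 is the sorted sample (s = 3) and 9 1 7 4 is the unsorted part. The middle sample element 5 is the pivot, the left subsample is {3} and the right subsample is {8}. Partitioning the unsorted part produces the left partition {1, 4} and the right partition {7, 9}. Moving 5 and the right subsample {8} to the left of the right partition, and the left partition to the right of the left subsample, gives 3 1 4 5 8 7 9. This yields the two subsequences 3 1 4 (sample {3}, unsorted {1, 4}) and 8 7 9 (sample {8}, unsorted {7, 9}), each prefixed by a sorted sample, with the pivot 5 already in its final position between them.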

Table 1 illustrates the sizes of the sorted and unsorted parts using the ratio s : (s+1).

sorted sample unsorted part
1 2
3 4
7 8
15 16
31 32
63 64
127 128
255 256
Table 1: Sizes of the sorted and unsorted parts using the ratio s : (s+1).

A generalization of this is obtained by reducing the ratio between the sizes of the sorted sample and the unsorted part. One such class of ratios is the ratio s : (2^k - 1)(s+1), where k = O(1), s is the size of the sorted sample and (2^k - 1)(s+1) is the size of the unsorted part. Of course, with k = 1, this reduces to the original Leapfrogging Samplesort. Table 2 illustrates the sizes of the sorted and unsorted parts for k = 1 to 3.

        k = 1                        k = 2                        k = 3
sorted sample  unsorted part   sorted sample  unsorted part   sorted sample  unsorted part
      1              2               1              6               1             14
      3              4               7             24              15            112
      7              8              31             96             127            896
     15             16             127            384            1023           7168
     31             32             511           1536            8191          57344
Table 2: Sizes of the sorted and unsorted parts using the ratio s : (2^k - 1)(s+1) for k = 1 to 3.
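The entries of Tables 1 and 2 follow directly from the stage recurrence s <- s + (2^k - 1)(s+1). A short program that reproduces them (a sketch only; the variable names and the number of rows printed are illustrative):

        #include <stdio.h>

        /* Print the first few (sorted sample, unsorted part) size pairs
           for the ratio s : (2^k - 1)(s+1), k = 1, 2, 3. */
        int main(void)
        {
                int k, s, r, row;
                for (k = 1; k <= 3; k++) {
                        printf("k = %d\n", k);
                        s = 1;
                        for (row = 0; row < 5; row++) {
                                r = ((1 << k) - 1) * (s + 1);   /* size of the unsorted part */
                                printf("%8d %8d\n", s, r);
                                s = s + r;                      /* size of the next sorted sample */
                        }
                }
                return 0;
        }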

A practical implementation of the generalized Leapfrogging Samplesort is given below:

        void LFSamplesort(int first, int last)
        {
                int s;          /* size of the sorted sample A[first..first+s-1] */
                int r;          /* size of the unsorted part that follows it     */
                if (last > first) {
                        s = 1;
                        r = M*(s+1);
                        /* leapfrog while a full unsorted part of size r remains */
                        while (s <= (last-first+1-r)) {
                                Leapfrog(first, first+s-1, first+s+r-1);
                                s = s+r;
                                r = M*(s+1);
                                }
                        /* final, possibly smaller, unsorted part */
                        Leapfrog(first, first+s-1, last);
                        }
        }

The constant M = 2^k - 1.

        void Leapfrog(int s1, int ss, int u)
        {
                /* A[s1..ss] is the sorted sample, A[ss+1..u] the unsorted part */
                int i,j,k, sm, v,t;
                if (s1 > ss) LFSamplesort(ss+1, u);     /* sample exhausted */
                else
                if (u > ss) {
                        sm = (s1+ss) / 2;               /* middle element of the sample */
                        /* Partition the unsorted part around the pivot A[sm] */
                        v = A[sm];
                        j = ss;
                        for(i=ss+1; i <= u; i++) {
                                if (A[i] < v) {
                                        j++;
                                        t = A[j];
                                        A[j] = A[i];
                                        A[i] = t;
                                        }
                                }
                        /* Move the pivot and the right subsample past the left partition */
                        if (j > ss) {
                                for (k=j, i=ss; i >= sm; k--, i--) {
                                        t = A[i];
                                        A[i] = A[k];
                                        A[k] = t;
                                        }
                                }
                        Leapfrog(s1, sm-1, sm+j-ss-1);  /* left subsample + left partition   */
                        Leapfrog(sm+j-ss+1, j, u);      /* right subsample + right partition */
                        }
        }

The code above of the generalized Leapfrogging Samplesort is similar to the code of the Leapfrogging Samplesort given in Albacea [2], except for a minor difference: the introduction of the constant M in the generalized code (the original corresponds to M = 1).
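For reference, the listings above rely on a global array A and the constant M, neither of which is declared in the fragments. A minimal driver along the following lines completes them; the choice k = 2 and the sample data are purely illustrative:

        #include <stdio.h>

        #define K 2                      /* k = O(1); k = 2 is chosen only for illustration */
        #define M ((1 << K) - 1)         /* M = 2^k - 1 */

        int A[] = {42, 7, 19, 3, 88, 51, 26, 14, 60, 9, 73, 31};

        void LFSamplesort(int first, int last);  /* the two mutually recursive routines above */
        void Leapfrog(int s1, int ss, int u);

        int main(void)
        {
                int n = sizeof(A) / sizeof(A[0]);
                int i;

                LFSamplesort(0, n - 1);          /* sort A[0..n-1] in place */
                for (i = 0; i < n; i++)
                        printf("%d ", A[i]);
                printf("\n");
                return 0;
        }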

3 Worst-Case Analysis

The operation that dominates the execution of the algorithm is the comparison operation. Hence, our analysis is in terms of the number of comparisons performed by the algorithm, which we refer to as the cost of the algorithm.

The worst case is exhibited when the values of the sample are all less than or all greater than the unsorted elements every time the unsorted portion is partitioned using the elements of the sample. Without loss of generality, we assume that n = s + (2^k - 1)(s+1), i.e., that the last stage operates on a full unsorted part. With this value of n, we obtain a worst-case complexity of

        W(n) = W(s) + W((2^k - 1)(s+1)) + P((2^k - 1)(s+1), s),

where W(s) is the cost of applying Leapfrogging Samplesort on the sample of size s, W((2^k - 1)(s+1)) is the cost of sorting, using Leapfrogging Samplesort, the sequence of size (2^k - 1)(s+1) which remains unsorted after the partitioning process, and P((2^k - 1)(s+1), s) is the cost of partitioning the unsorted sequence of size (2^k - 1)(s+1) using a sorted sample of size s. In the worst case every unsorted element is compared with a pivot at each of the log(s+1) levels of Step 1, so P((2^k - 1)(s+1), s) = (2^k - 1)(s+1) log(s+1).

When k = 1, given

        n = 2s + 1 and a partitioning cost of (s+1) log(s+1),

we obtain the recurrence relation

        W(n) = W((n-1)/2) + W((n+1)/2) + ((n+1)/2) log((n+1)/2), with W(1) = 0,

whose solution is W(n) = O(n log^2 n).

When k = 2, similarly, given

        n = 4s + 3 and a partitioning cost of 3(s+1) log(s+1),

we obtain the recurrence relation

        W(n) = W((n-3)/4) + W(3(n+1)/4) + (3(n+1)/4) log((n+1)/4),

which again solves to W(n) = O(n log^2 n).

For any integer k, k >= 1, given

        n = 2^k(s+1) - 1 and a partitioning cost of (2^k - 1)(s+1) log(s+1),

we produce the recurrence relation

        W(n) = W((n+1)/2^k - 1) + W((2^k - 1)(n+1)/2^k) + ((2^k - 1)(n+1)/2^k) log((n+1)/2^k),

which, for k = O(1), yields a worst-case complexity of O(n log^2 n).
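To see informally why these recurrences solve to O(n log^2 n), consider k = 1: the recurrence behaves like W(n) = 2W(n/2) + (n/2) log(n/2). The recursion tree has about log n levels, the subproblem sizes on any level sum to at most n, and so each level contributes at most (n/2) log n comparisons of partitioning work; summing over the levels gives roughly (n/4) log^2 n + O(n log n) = O(n log^2 n). The same argument applies for any constant k, since the larger subproblem shrinks by the constant factor (2^k - 1)/2^k at each level.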

4 Average-Case Analysis

Without loss of generality, we assume, as in the worst-case analysis, that n = s + (2^k - 1)(s+1) and that all orderings of the input are equally likely. The average-case complexity of the algorithm is given by the recurrence relation

        A(n) = A(s) + P((2^k - 1)(s+1), s) + S(s),

where A(s) is the average cost of sorting the sample of size s and P((2^k - 1)(s+1), s) is the cost of partitioning the unsorted part of size (2^k - 1)(s+1) using the sorted sample of size s. The idea is that the middle element of the sample will be used as a pivot element in partitioning the unsorted part of size (2^k - 1)(s+1). Using Lemma 1 of Frazer and McKellar [5], the expected size of each of the 2 partitions is half the size of the unsorted part, provided the pivot element is a random sample of size 1 from the set composed of the pivot element and the elements of the unsorted part. Then, using the first-quarter and third-quarter elements of the sorted sample as pivot elements, we split each partition into 2 more partitions. We continue doing this for log(s+1) steps. This will produce s+1 partitions where the expected size of each partition is 2^k - 1. Hence, the cost S(s) of sorting these partitions is at most c(s+1) for some constant c when k = O(1). Given

        n = 2^k(s+1) - 1 and P((2^k - 1)(s+1), s) = (2^k - 1)(s+1) log(s+1),

this will produce the recurrence relation

        A(n) = A((n+1)/2^k - 1) + ((2^k - 1)(n+1)/2^k) log((n+1)/2^k) + c(n+1)/2^k,

where c is a constant, whose solution is A(n) = O(n log n) when k = O(1).
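Unrolling this recurrence over the successive stages, the sample size shrinks by a factor of about 2^k per stage, so the partitioning costs form a geometrically decreasing series,

        (1 - 2^{-k}) n log n + (1 - 2^{-k}) (n/2^k) log(n/2^k) + ... <= n log n,

while sorting the expected-constant-size partitions contributes only O(n) in total. Hence A(n) = n log n + O(n) = O(n log n), which is consistent with the estimate of Albacea [2] that the average cost of Leapfrogging Samplesort is very near the information-theoretic lower bound.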

5 Conclusions

We have presented a practical implementation of a generalized Leapfrogging Samplesort and analyzed its worst-case and average-case complexity. It was shown that the worst-case complexity is O(n log^2 n) and the average-case complexity is O(n log n). This extends the number of practical algorithms whose worst-case complexity is O(n log^2 n) and whose average-case complexity is O(n log n). What remains open is the computation of the exact average-case complexity of the generalized Leapfrogging Samplesort.

References

  • [1] Albacea, E.A., Leapfrogging Samplesort, Proceedings of the 1st Asian Computing Science Conference, Lecture Notes in Computer Science 1023 (1995), 1-9.
  • [2] Albacea, E.A., Average-case analysis of Leapfrogging Samplesort, Philippine Science Letters, Vol. 5, No. 1 (2012), 14-16.
  • [3] Apers, P.M.G., Recursive samplesort, BIT 18 (1978), 125-132.
  • [4] Chen, J.C., Efficient Samplesort and average case analysis of PE sort, Theoretical Computer Science, Vol. 369, Issues 1-3 (2006), 44-66.
  • [5] Frazer, W.D. and McKellar, A.C., Samplesort: A sampling approach to minimal storage tree sorting, J. ACM 17 (1970), 496-507.
  • [6] Hoare, C.A.R., Quicksort, Computer Journal 5 (1962), 10-15.
  • [7] Peters, J.G. and Kritzinger, P.S., Implementation of samplesort: a minimal storage tree sort, BIT 15 (1975), 85-93.