## 1 Introduction

Frazer and McKellar [5] showed that Samplesort is a sorting algorithm with the potential to compete with Quicksort [6] in terms of average running time. In fact, they showed that the average running time of Samplesort slowly approaches the information-theoretic lower bound. Apers [3] improved Samplesort by introducing Recursive Samplesort, whose idea is to use Samplesort itself, instead of Quicksort, to sort the sample. The expected number of comparisons of Recursive Samplesort was shown in Apers [3] to be close to the information-theoretic lower bound. Another implementation of Samplesort was given by Peters and Kritzinger [7]. Unfortunately, none of these implementations can be considered practical. The implementation of Peters and Kritzinger [7], for example, uses temporary storage locations for the sample, which are eventually used to store pointers to positions in the array bounded by the sample. The implementation of Apers [3], on the other hand, uses a stack to store pointers to positions in the array that are bounded by the sample. All of these implementations run in $O(n^2)$ worst-case time.

In 1995, Albacea [1] reported the algorithm Leapfrogging Samplesort, which is a practical implementation of Samplesort. The algorithm has a worst-case complexity of $O(n \log^2 n)$ and an average-case complexity of $O(n \log n)$. (All logarithms in this paper are to base 2, unless explicitly stated otherwise.) Albacea [2] estimated the exact average-case complexity to be very near the information-theoretic lower bound. In 2006, Chen [4] proposed the algorithm Full Sample Sort, whose worst-case complexity is $O(n \log^2 n)$ and whose average-case complexity is $O(n \log n)$.

In this paper, we introduce a generalization of Leapfrogging Samplesort, where we have a sample of size $s$ and an unsorted part of size $(2^k - 1)(s+1)$, where $k = O(1)$. When $k$ is not constant, say $k = \log n$, the algorithm reduces to Quicksort. The generalized Leapfrogging Samplesort has a worst-case complexity of $O(n \log^2 n)$ and an average-case complexity of $O(n \log n)$. Thus, this class of algorithms extends the number of practical algorithms whose worst-case complexity is $O(n \log^2 n)$ and whose average-case complexity is $O(n \log n)$. The author is aware of only two other algorithms with these complexities: Leapfrogging Samplesort by Albacea [1] and Full Sample Sort by Chen [4].

## 2 Generalized Leapfrogging Samplesort

At each stage of the sorting process, the original Leapfrogging Samplesort involves the first $2s+1$ elements of the sequence, where the first $s$ elements are already sorted and the next $s+1$ elements are to be partitioned and sorted using the sorted elements as the sample.

The algorithm starts with the leftmost element as a sorted sample of size 1 that is used to partition the next 2 elements, eventually producing a sorted sequence of size 3. The sorted sequence of size 3 is used as a sample to partition the next 4 elements, eventually producing a sorted sequence of size 7. The sorted sequence of size 7 is used to partition the next 8 elements, eventually producing a sorted sequence of size 15. The process is repeated until the whole sequence is sorted.

Given a sequence prefixed by a sorted sample of size $s$ and an unsorted part whose size is at most $s+1$, an outline of the algorithm for partitioning the unsorted part using the sorted sample is as follows:

Step 1: Let $v$ be the middle element of the sorted sample; the group of sample elements to its left is the left subsample and the group of sample elements to its right is the right subsample. Using $v$ as a pivot element, we partition the unsorted part, thereby producing two partitions: the left partition (elements which are less than $v$) and the right partition (elements which are greater than $v$). Without loss of generality, we assume that the elements of the sequence are distinct. Then, $v$ and the right subsample are moved to the left of the right partition, and the left partition is moved to the right of the left subsample. This step produces two subsequences, each prefixed by a sorted sample.

Step 2: Recursively apply Step 1 on the two sequences produced in Step 1 until the size of the sorted sample is equal to 0.

If, after the partitioning process, a partition whose size is greater than 1 is produced, then that partition is sorted by Leapfrogging Samplesort itself.
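As a rough illustration of Step 1, the following C sketch performs a single partition-and-move pass on an array whose leading elements form the sorted sample. The function name `leapfrog_step`, its parameters, and the example values in the note below are assumptions made for this illustration; they are not taken from the original description.

```c
#include <assert.h>

/* Hypothetical helper: one application of Step 1.
 * a[s1..ss] is the sorted sample and a[ss+1..u] is the unsorted part.
 * Returns the final index of the pivot (the middle sample element).
 * Elements are assumed to be distinct. */
int leapfrog_step(int *a, int s1, int ss, int u)
{
    assert(s1 <= ss && ss <= u);

    int sm = (s1 + ss) / 2;   /* index of the middle sample element */
    int v = a[sm];            /* pivot value */
    int j = ss;               /* last index of the left partition so far */

    /* Partition: collect elements smaller than v right after the sample. */
    for (int i = ss + 1; i <= u; i++) {
        if (a[i] < v) {
            j++;
            int t = a[j]; a[j] = a[i]; a[i] = t;
        }
    }

    /* Move the pivot and the right subsample behind the left partition. */
    if (j > ss) {
        for (int k = j, i = ss; i >= sm; k--, i--) {
            int t = a[i]; a[i] = a[k]; a[k] = t;
        }
    }

    return sm + (j - ss);
}
```

For example, with the sample 20, 40, 60 followed by the unsorted elements 50, 10, 70, 30, the call `leapfrog_step(a, 0, 2, 6)` uses 40 as the pivot, returns 3, and leaves the array as 20, 10, 30, 40, 60, 70, 50. The left subsequence is then the sample 20 followed by 10, 30, and the right subsequence is the sample 60 followed by 70, 50.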

Table 1 illustrates the sizes of the sorted and unsorted parts using the ratio $s : (s+1)$.

| sorted sample | unsorted part |
|---|---|
| 1 | 2 |
| 3 | 4 |
| 7 | 8 |
| 15 | 16 |
| 31 | 32 |
| 63 | 64 |
| 127 | 128 |
| 255 | 256 |
| … | … |

A generalization of this is obtained by reducing the ratio between the sizes of the sorted sample and the unsorted part. One such class of ratios is defined by $s : (2^k - 1)(s+1)$, where $k = O(1)$, $s$ is the size of the sorted sample, and $(2^k - 1)(s+1)$ is the size of the unsorted part. With $k = 1$, this reduces to the original Leapfrogging Samplesort. Table 2 illustrates the sizes of the sorted and unsorted parts for three values of $k$.

| sorted sample | unsorted part | sorted sample | unsorted part | sorted sample | unsorted part |
|---|---|---|---|---|---|
| … | … | … | … | … | … |
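The growth of the sample and unsorted-part sizes can be sketched with a few lines of C. This is only an illustration under the assumption that the unsorted part is a constant multiple `m` of $s+1$ and that each stage merges the unsorted part into the sample; the function name `lf_stage_sizes` and its parameters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: write the sizes of the sorted sample and of the
 * unsorted part for the first `stages` stages, assuming the unsorted part
 * has size m * (s + 1) for a sample of size s. */
void lf_stage_sizes(long m, size_t stages, long *s_out, long *r_out)
{
    assert(m >= 1);

    long s = 1;                    /* the leftmost element is the first sample */
    for (size_t i = 0; i < stages; i++) {
        long r = m * (s + 1);      /* size of the unsorted part at this stage */
        s_out[i] = s;
        r_out[i] = r;
        s += r;                    /* the sorted result becomes the next sample */
    }
}
```

With `m = 1` the first stages are 1 and 2, 3 and 4, 7 and 8, 15 and 16, which matches the sizes of the original Leapfrogging Samplesort. Larger values of `m` make the unsorted part grow faster relative to the sample.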

A practical implementation of the generalized Leapfrogging Samplesort is given below:

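    /* A is the global array being sorted; M is the constant multiplier. */
    void Leapfrog(int s1, int ss, int u);   /* defined below */

    void LFSamplesort(int first, int last)
    {
        int s;   /* size of the sorted sample */
        int r;   /* size of the unsorted part */

        if (last > first) {
            s = 1;
            r = M*(s+1);
            /* Grow the sorted prefix while a full unsorted part still fits. */
            while (s <= (last-first+1-r)) {
                Leapfrog(first, first+s-1, first+s+r-1);
                s = s+r;
                r = M*(s+1);
            }
            /* Handle the remaining, possibly shorter, unsorted part. */
            Leapfrog(first, first+s-1, last);
        }
    }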

The constant $M = 2^k - 1$.

    /* A[s1..ss] is the sorted sample; A[ss+1..u] is the unsorted part. */
    void Leapfrog(int s1, int ss, int u)
    {
        int i, j, k, sm, v, t;

        if (s1 > ss)
            LFSamplesort(ss+1, u);          /* sample is empty */
        else if (u > ss) {
            sm = (s1+ss) / 2;               /* middle of the sample */

            /* Partition */
            v = A[sm];
            j = ss;
            for (i = ss+1; i <= u; i++) {
                if (A[i] < v) {
                    j++;
                    t = A[j]; A[j] = A[i]; A[i] = t;
                }
            }

            /* Move Sample */
            if (j > ss) {
                for (k = j, i = ss; i >= sm; k--, i--) {
                    t = A[i]; A[i] = A[k]; A[k] = t;
                }
            }

            Leapfrog(s1, sm-1, sm+j-ss-1);  /* left subsample, left partition */
            Leapfrog(sm+j-ss+1, j, u);      /* right subsample, right partition */
        }
    }

The code of the generalized Leapfrogging Samplesort above is similar to the code of Leapfrogging Samplesort given in Albacea [2], with one minor difference: the introduction of the constant `M` in the code of the generalized Leapfrogging Samplesort.
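For experimentation, the two routines can be wrapped into a self-contained form that takes the array and the multiplier as parameters instead of relying on globals. The sketch below follows the same control flow as the code above; the names `lf_samplesort`, `lf_sort_range`, `lf_leapfrog`, and the parameter `m` (playing the role of `M`) are assumptions introduced here, and the elements are assumed to be distinct, as in the text.

```c
#include <assert.h>

static void lf_sort_range(int *a, int first, int last, int m);

static void lf_swap(int *x, int *y) { int t = *x; *x = *y; *y = t; }

/* a[s1..ss] is the sorted sample, a[ss+1..u] is the unsorted part. */
static void lf_leapfrog(int *a, int s1, int ss, int u, int m)
{
    if (s1 > ss) {                 /* sample is empty: sort what is left */
        lf_sort_range(a, ss + 1, u, m);
        return;
    }
    if (u <= ss)                   /* nothing left to partition */
        return;

    int sm = (s1 + ss) / 2;        /* middle sample element is the pivot */
    int v = a[sm];
    int j = ss;

    for (int i = ss + 1; i <= u; i++)      /* partition the unsorted part */
        if (a[i] < v) {
            j++;
            lf_swap(&a[j], &a[i]);
        }

    if (j > ss)                            /* move pivot and right subsample */
        for (int k = j, i = ss; i >= sm; k--, i--)
            lf_swap(&a[i], &a[k]);

    lf_leapfrog(a, s1, sm - 1, sm + j - ss - 1, m);
    lf_leapfrog(a, sm + j - ss + 1, j, u, m);
}

static void lf_sort_range(int *a, int first, int last, int m)
{
    if (last <= first)
        return;

    int n = last - first + 1;
    int s = 1;                     /* size of the sorted sample */
    int r = m * (s + 1);           /* size of the unsorted part */

    while (s <= n - r) {
        lf_leapfrog(a, first, first + s - 1, first + s + r - 1, m);
        s += r;
        r = m * (s + 1);
    }
    lf_leapfrog(a, first, first + s - 1, last, m);
}

/* Sort a[0..n-1] in ascending order; m >= 1 is the multiplier. */
void lf_samplesort(int *a, int n, int m)
{
    assert(m >= 1);
    if (n > 1)
        lf_sort_range(a, 0, n - 1, m);
}
```

A call such as `lf_samplesort(a, n, 1)` corresponds to the original ratio, while `lf_samplesort(a, n, 3)` uses a larger unsorted part per stage. Passing the array explicitly is a design choice made here only to keep the sketch independent of global state.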

## 3 Worst-Case Analysis

The operation that dominates the execution of the algorithm is the comparison operation. Hence, our analysis will be in terms of number of comparisons involved in the algorithm. We refer to the number of comparisons involved in the algorithm as the cost of the algorithm.

The worst case is exhibited when the values of the sample are all less than or all greater than the unsorted elements every time the unsorted portion is partitioned using the elements of the sample. Without loss of generality, we assume . With this value of , we obtain a worst-case complexity of:

where is the cost of applying Leapfrogging Samplesort on the sample of size , is the cost of sorting using Leapfrogging Samplesort the unsorted sequence of size which remains unsorted after the partitioning process, and is the cost of partitioning the unsorted sequence of size using a sorted sample of size .

When , given

we obtain the recurrence relation

When , similarly, given

we obtain the recurrence relation

For any integer , , given

we produce the recurrence relation

## 4 Average-Case Analysis

Without loss of generality, we assume . The average-case complexity of the algorithm is given by the recurrence relation

where is the average cost of sorting the sample of size , is the cost of partitioning the unsorted part of size using the sorted sample of size . The idea is that the middle element of the sample will be used as a pivot element in partitioning the unsorted part of size . Using Lemma 1 of Frazer and McKellar[5], the expected size of each of the 2 partitions is the size of the unsorted part, provided the pivot element is a random sample of size 1 from the set composed of the pivot element and elements of the unsorted part. Then, using the first quarter and the third quarter elements of the sorted sample as pivot elements, we split each partition into 2 more partitions. We continue doing this for steps. This will produce partitions where the expected size of each partition is . Hence, the cost of sorting the partitions is . Given

will produce the recurrence relation

where , when

## 5 Conclusions

We have presented a practical implementation of a generalized Leapfrogging Samplesort and analyzed its worst-case and average-case complexity. It was shown that the worst-case complexity is $O(n \log^2 n)$ and the average-case complexity is $O(n \log n)$, thus extending the number of practical algorithms with these complexities. What remains open is the computation of the exact average-case complexity of the generalized Leapfrogging Samplesort.

## References

- [1] Albacea, E.A., Leapfrogging Samplesort, Proceedings of the 1st Asian Computing Science Conference, Lecture Notes in Computer Science 1023 (1995), 1-9.
- [2] Albacea, E.A., Average-case analysis of Leapfrogging Samplesort, Philippine Science Letters 5(1) (2012), 14-16.
- [3] Apers, P.M.G., Recursive samplesort, BIT 18 (1978), 125-132.
- [4] Chen, J.C., Efficient Samplesort and average case analysis of PE sort, Theoretical Computer Science 369(1-3) (2006), 44-66.
- [5] Frazer, W.D. and McKellar, A.C., Samplesort: A sampling approach to minimal storage tree sorting, J. ACM 17 (1970), 496-507.
- [6] Hoare, C.A.R., Quicksort, Computer Journal 5 (1962), 10-15.
- [7] Peters, J.G. and Kritzinger, P.S., Implementation of samplesort: a minimal storage tree sort, BIT 15 (1975), 85-93.