Motivation

Wikipedia explains it very nicely: sorting is ubiquitous in computer science. It is a simply stated problem (“how can I sort the following values most efficiently?”) that has many solutions, and it still offers open problems.

We only consider correct algorithms, i.e., algorithms whose output is such that

  • each element is larger than or equal to the previous one, according to the order picked,
  • all the elements that were in the input are present in the output (with the same multiplicity, if repetitions are allowed).

There are many ways of “comparing” sorting algorithms. A sorting algorithm…

  • has best, worst and average case time complexities (in general, measured in the number of comparisons required),
  • has a best, worst and average case space complexity (i.e., “how much additional memory is required?”),
  • can be “stable” (i.e., equal values are not permuted),
  • uses an insertion, exchange, selection, or merging method,
  • is serial or parallel,

among other properties.
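As an aside, stability can be made concrete with a small example. In .NET, Enumerable.OrderBy is documented to be a stable sort (whereas List&lt;T&gt;.Sort is not guaranteed to be stable), so records with equal keys keep their original relative order; the record contents below are made up for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var records = new List<(int Key, string Label)>
{
    (2, "first 2"), (1, "first 1"), (2, "second 2"), (1, "second 1")
};

// OrderBy is stable: among equal keys, the original order is kept.
var sorted = records.OrderBy(r => r.Key).ToList();

foreach (var r in sorted)
    Console.WriteLine($"{r.Key}: {r.Label}");
// 1: first 1
// 1: second 1
// 2: first 2
// 2: second 2
```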

Algorithm Comparison

The algorithms we are about to study compare as follows, omitting the O(·) notation for brevity, and letting n be the size of the list we have to sort:

Name             Best       Average    Worst      Memory   Stable
Insertion sort   n          n²         n²         1        Yes
Heapsort         n log n    n log n    n log n    1        No
Bubble sort      n          n²         n²         1        Yes
Shellsort        n log n    n^(3/2)    n²         1        No
Quicksort        n log n    n log n    n²         log n    No
Selection sort   n²         n²         n²         1        No
Merge sort       n log n    n log n    n log n    n        Yes

All the algorithms above are in-place except for merge sort.

Helper Methods

We start by defining two simple “helper” methods (throughout this article, the methods live in a generic class, so T denotes a type implementing IComparable&lt;T&gt;, since CompareTo is used):

    private static void Swap(List<T> listP, int lhs, int rhs)
    {
        T current = listP[lhs];
        listP[lhs] = listP[rhs];
        listP[rhs] = current;
    }
 
    public static bool IsSorted(List<T> listP)
    {
        for (int i = 0; i < listP.Count - 1; i++)
        {
            if (listP[i].CompareTo(listP[i + 1]) > 0)
            {
                return false; // one inversion is enough to answer
            }
        }
        return true;
    }

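As a standalone, hypothetical equivalent of IsSorted, written as a generic method rather than relying on an enclosing generic class:

```csharp
using System;
using System.Collections.Generic;

static bool IsSorted<T>(List<T> listP) where T : IComparable<T>
{
    for (int i = 0; i < listP.Count - 1; i++)
        if (listP[i].CompareTo(listP[i + 1]) > 0)
            return false; // one inversion is enough to answer
    return true;
}

Console.WriteLine(IsSorted(new List<int> { 1, 2, 2, 3 })); // True
Console.WriteLine(IsSorted(new List<int> { 3, 1, 2 }));    // False
```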

Insertion Sort Algorithm

Implementation

This algorithm is nicely explained and illustrated on Wikipedia, and can be implemented as follows:

    public static void InsertionSort(List<T> listP)
    {
        T current;
        int slot;
        for (int bar = 1; bar < listP.Count; bar++)
        {
            current = listP[bar];
            for (
              slot = bar;
              slot > 0 && current.CompareTo(listP[slot - 1]) < 0;
              slot--
            )
            {
                listP[slot] = listP[slot - 1];
            }
            listP[slot] = current;
        }
    }


Description

This algorithm moves the bar from the beginning of the list to the end, one step at a time. At every step, it positions a slot at the bar and looks back, moving the value at the slot earlier and earlier as long as its predecessor is larger than it.

Complexity

As explained on Wikipedia, the simplest worst-case input is an array sorted in reverse order. With such an input, every iteration of the inner loop scans and shifts the entire sorted subsection of the array (i.e., from bar back to the beginning) before inserting the next element. This gives a quadratic running time (i.e., O(n²)), since both loops are linear in n.

On the flip side, if the array is already sorted, then the algorithm is linear, since the inner loop stops after a single comparison, giving an overall best performance of O(n).

But on average, the algorithm remains in O(n²), since each element has to be compared with about half of the elements before it.
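These cases can be checked empirically. The sketch below instruments a copy of the algorithm (on int lists) with a hypothetical Less helper that counts comparisons; on 8 elements, it performs n − 1 = 7 comparisons on sorted input and n(n − 1)/2 = 28 on reversed input:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

int comparisons = 0;
bool Less(int a, int b) { comparisons++; return a < b; }

// Same algorithm as above, on int lists, with counted comparisons.
void InsertionSort(List<int> list)
{
    for (int bar = 1; bar < list.Count; bar++)
    {
        int current = list[bar];
        int slot = bar;
        for (; slot > 0 && Less(current, list[slot - 1]); slot--)
            list[slot] = list[slot - 1];
        list[slot] = current;
    }
}

var alreadySorted = Enumerable.Range(1, 8).ToList();
InsertionSort(alreadySorted);
int bestCase = comparisons;   // 7

comparisons = 0;
var reversed = Enumerable.Range(1, 8).Reverse().ToList();
InsertionSort(reversed);
int worstCase = comparisons;  // 28

Console.WriteLine($"best: {bestCase}, worst: {worstCase}");
// best: 7, worst: 28
```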

Heapsort Algorithm

Implementation

We first define a helper method:

    private static int LeftChild(int i)
    {
        return 2 * i + 1;
    }


and then leverage the heap structure to sort:

    public static void Heapsort(List<T> listP)
    {
        // Step 1: heapify, or build heap.
        for (int i = listP.Count / 2; i >= 0; i--)
            PercDown(listP, i, listP.Count);
        // Step 2: recursively extract the largest value.
        for (int i = listP.Count - 1; i > 0; i--)
        {
            Swap(listP, 0, i);
            PercDown(listP, 0, i);
        }
    }
 
    private static void PercDown(List<T> listP, int i, int N)
    {
        int Child;
        T current;
        for (current = listP[i]; LeftChild(i) < N; i = Child)
        {
            Child = LeftChild(i);
            if (
              Child != N - 1
              && listP[Child].CompareTo(listP[Child + 1]) < 0
            )
                Child++;
            if (current.CompareTo(listP[Child]) < 0) // current < listP[child]
                listP[i] = listP[Child];
            else
                // if current >= listP[child] we *do not*
                // swap: we are constructing a *max* heap!
                break;
        }
        listP[i] = current;
    }


Note that PercDown builds a max heap: once the values are “pre-sorted”, greatest value first, repeatedly removing the first one and moving it to the end of the list leaves the list sorted from smallest to greatest value once we are done.

Description

This algorithm works in two steps:

  • First, it constructs the heap “in place”, by arranging the elements from the last parent (listP.Count / 2 is a safe upper bound on the index of “the last parent”) up to the root (i >= 0): this builds a max heap, where the greatest value comes first,
  • Then, it repeatedly extracts the first element (the maximum), and places it right after the end of the heap: note that PercDown(listP, 0, i) makes it so that the heap is considered to stop at index i, as it decreases in size by 1 at every iteration.

Complexity

  • The PercDown method has complexity O(log n), since it iterates through the tree from top to bottom, i.e., its cost is bounded by the tree height, which is O(log n).
  • The first step calls PercDown about n/2 times, so overall this first step is O(n log n).
  • The second step calls PercDown n − 1 times, so it is O(n log n) overall as well.

Hence, the complexity of heapsort is O(n log n) by the sum rule. Note that the average, worst and best complexities are all the same!

Bubble Sort Algorithm

Implementation

    public static void BubbleSort(List<T> listP)
    {
        for (int i = listP.Count - 1; i >= 0; i--)
        {
            for (int j = 0; j < i; j++)
            {
                if (listP[j].CompareTo(listP[j + 1]) > 0)
                    Swap(listP, j, j + 1);
            }
        }
    }


Description

The nested loop accomplishes the following: “from the beginning of the list up to where I stopped last time, minus one, go through the elements one by one and swap adjacent elements if needed”.

Complexity

Since both loops depend on the size of the list n, the algorithm is on average O(n²): we need to perform about n × n/2 operations. An optimization that stops the outer loop when no elements were swapped during a pass brings the best-case performance of bubble sort down to linear (O(n)).
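That optimization can be sketched as follows (a standalone version on int lists, using a tuple assignment instead of the Swap helper): if a full pass performs no swap, the list is already sorted and we can stop.

```csharp
using System;
using System.Collections.Generic;

void BubbleSort(List<int> list)
{
    for (int i = list.Count - 1; i >= 0; i--)
    {
        bool swapped = false;
        for (int j = 0; j < i; j++)
        {
            if (list[j].CompareTo(list[j + 1]) > 0)
            {
                (list[j], list[j + 1]) = (list[j + 1], list[j]);
                swapped = true;
            }
        }
        if (!swapped)
            break; // no swap during this pass: the list is sorted
    }
}

var values = new List<int> { 5, 1, 4, 2, 8 };
BubbleSort(values);
Console.WriteLine(string.Join(", ", values)); // 1, 2, 4, 5, 8
```

On already-sorted input, the very first pass performs no swap, so only one linear pass is executed.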

ShellSort Algorithm

Implementation

    public static void ShellSort(List<T> listP)
    {
        int slot;
        T current;
        for (int gap = listP.Count / 3 + 1; gap > 0; gap /= 2)
        {
            for (int next = gap; next < listP.Count; next++)
            {
                // goes through the array, gap indices at a time
                current = listP[next];
                for (
                  slot = next;
                  slot >= gap
                    && current.CompareTo(listP[slot - gap]) < 0;
                  slot -= gap
                )
                // slides current until in place
                {
                    listP[slot] = listP[slot - gap];
                }
                listP[slot] = current;
            }
        }
    }


Description

Consider a list of size 30. Assuming current.CompareTo(listP[slot - gap]) < 0 is always true, the indices evolve as follows (note that next ranges from gap to listP.Count − 1 = 29):

gap   next   slot   slot - gap
11    11     11     0
11    12     12     1
11    13     13     2
…     …      …      …
11    22     22     11
"     "      11     0
11    23     23     12
"     "      12     1
…     …      …      …
11    29     29     18
"     "      18     7
5     5      5      0
5     6      6      1
5     7      7      2
…     …      …      …
5     10     10     5
"     "      5      0
5     11     11     6
"     "      6      1
…     …      …      …
2     …      …      …
1     …      …      …

The important point is to understand that we generate the sequences of pairs (slot, slot - gap) as follows:

  • Gap of 11: (11, 0), (12, 1), (13, 2), …, (22, 11), (11, 0), (23, 12), (12, 1), …, (29, 18), (18, 7),
  • Gap of 5: (5, 0), …, (11, 6), (6, 1), …

which are the sequences of indices whose values we compare. For the gap of 11, it means we do the following:

  • First, we compare the values at indices 11 and 0, and swap them if needed,
  • Then, we compare the values at indices 12 and 1, and swap them if needed,
  • …
  • Eventually, we compare the values at indices 29 and 18, and swap them if needed,
  • If we did swap those values, then we compare the values at indices 18 and 7, and swap them if needed.

After we are done going through “the gap”, we know that all values gap indices apart are sorted with respect to each other. Reducing the value of gap down to 1 makes it so that the whole array ends up sorted.
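As a sanity check, the gap schedule used by the outer loop above can be computed on its own; for a list of size 30, it matches the trace:

```csharp
using System;
using System.Collections.Generic;

var gaps = new List<int>();
// Same schedule as the outer loop of ShellSort, for listP.Count = 30.
for (int gap = 30 / 3 + 1; gap > 0; gap /= 2)
    gaps.Add(gap);
Console.WriteLine(string.Join(", ", gaps)); // 11, 5, 2, 1
```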

Complexity

The complexity of shell sort depends on the “gap sequence” that is used. We use listP.Count / 3 + 1, (listP.Count / 3 + 1) / 2, (listP.Count / 3 + 1) / 4, …, 1. This sequence is close to Shell’s original one (which repeatedly halves the length of the list), and it is of complexity O(n²) in the worst case: indeed, we may need to explore O(log n) gaps, and the final passes can still require a quadratic number of shifts. In the best case, if the array is already mostly sorted, then we still need to explore O(log n) gaps, but each pass takes only O(n) comparisons, giving an O(n log n) complexity. On average, the complexity depends a lot on the sequence, but it can be around O(n^(3/2)), which is still better than quadratic!

Playing with the gap sequence further can give a best, worst and average performance of O(n log² n)!
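One known example is Pratt’s sequence, made of the numbers of the form 2^p × 3^q, which yields a Θ(n log² n) worst case (at the price of many more passes). A sketch that generates it, here for lists of size up to 30:

```csharp
using System;
using System.Collections.Generic;

var gaps = new List<int>();
// Collect all values 2^a * 3^b below 30: p enumerates powers of 2,
// and q multiplies each of them by successive powers of 3.
for (int p = 1; p < 30; p *= 2)
    for (int q = p; q < 30; q *= 3)
        gaps.Add(q);
gaps.Sort();
gaps.Reverse(); // Shellsort consumes gaps in decreasing order
Console.WriteLine(string.Join(", ", gaps));
// 27, 24, 18, 16, 12, 9, 8, 6, 4, 3, 2, 1
```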

Quick Sort Algorithm

Implementation

    public static void QuickSort(
        List<T> listP
    )
    {
        QuickSort(listP, 0, listP.Count - 1);
    }
 
    public static void QuickSort(
      List<T> listP,
      int leftP,
      int rightP
    )
    {
        if (leftP <= rightP)
        {
            T pivot = MedianOfThree(listP, leftP, rightP);
            int left = leftP;
            int right = rightP;
            while (left <= right)
            {
                // looking for value larger
                // than the pivot
                // on the left:
                while (listP[left].CompareTo(pivot) < 0) left++;
                // looking for value smaller
                // than or equal to the pivot
                // on the right, without "crossing"
                // left and right.
                while ((left <= right) && pivot.CompareTo(listP[right]) <= 0) right--;
                if (left < right)
                    Swap(listP, left, right);
 
            }
 
            Swap(listP, left, rightP); // Move pivot back
            QuickSort(listP, leftP, left - 1); // sort left sub-list
            QuickSort(listP, left + 1, rightP); // sort right sub-list
        }
    }
 
    private static T MedianOfThree(
      List<T> listP,
      int left,
      int right
    )
    {
        int center = (left + right) / 2;
        // We sort the left, center and right 
        // elements:
        if (listP[center].CompareTo(listP[left]) < 0)
            Swap(listP, left, center);
        if (listP[right].CompareTo(listP[left]) < 0)
            Swap(listP, left, right);
        if (listP[right].CompareTo(listP[center]) < 0)
            Swap(listP, center, right);
 
        // Move the pivot to the right:
        Swap(listP, center, right);
        return listP[right];
    }


Description

At a high level, the algorithm

  • picks a “median” value (the pivot) from the list to be sorted,
  • organizes the list so that the values to the left of the pivot are smaller than the pivot, and the values greater than or equal to the pivot are to its right,
  • then recursively calls quicksort on the sub-list to the left of the pivot, and on the sub-list to the right of the pivot,
  • when the sub-lists left to be sorted are of size at most 1, we know that quicksort is done (a list of size 1 is sorted!): the whole list is sorted.

In detail, this algorithm works as follows:

  • Choose a pivot: we use the MedianOfThree method to select an element from the list as the pivot. Other ways of choosing the pivot exist; this “median-of-three” technique is a good default when no information about the ordering of the input is known. Note that this method actually sorts those three elements (at the beginning, center and end of the list to be sorted), and takes the median one as the pivot,

  • Partition the list: the while (left <= right) loop in QuickSort proceeds as follows:

    • It first looks for the smallest left index such that listP[left] >= pivot,
    • It then looks for the greatest right index such that listP[right] < pivot, without letting right cross below left,
    • At this point, we know that the values at left and right need to be swapped (if left < right, i.e., the indices did not cross), and we swap them.
  • When this loop is done, we know that left is where the pivot ought to be: it is the pivot’s final position in the sorted list.

  • We can then call Quicksort on the two sub-lists:

    • the one that goes from leftP to left - 1 (i.e., the values to the left of the pivot),
    • the one that goes from left + 1 to rightP (i.e., the values to the right of the pivot).

One subtlety is that left is where the pivot value ought to be, but we did not know that position when we started (we simply made an educated guess): hence, the pivot was parked at the end of the list by MedianOfThree, and swapped back into its final place at the end of QuickSort.

Complexity

The complexity of quick sort depends on how “lucky” we were when we picked the pivot value. The best case is when the pivot always divides the array into two equal halves. Let C(n) be the number of comparisons needed to sort a list of size n using quicksort:

  • We need to perform a linear (O(n)) number of comparisons to partition the values to the left and the right of the pivot,
  • To sort each sub-list, we need to first partition it, which takes about n/2 comparisons per sub-list,
  • Then, sorting each sub-list takes C(n/2),
  • So, we get that C(n) = 2 × C(n/2) + n,
  • Iterating this mechanism, we get that C(n) = 2^k × C(n/2^k) + k × n, for any k,
  • Then, taking k = log₂(n), we get C(n) = n × C(1) + n × log₂(n), and hence C(n) is in O(n log n).

The average case is also O(n log n), but if we are unlucky with our pivot (for instance, the pivot value always ends up at the beginning or at the end of the list), then we keep on sorting a sub-list whose size is linear in n (we partition our list into sub-lists of size 0 and n − 1), and we get a worst-case complexity of O(n²).
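The unlucky case can be made concrete: if one sub-list is always empty, the recurrence becomes C(n) = C(n − 1) + n, which sums to n(n + 1)/2 − 1, i.e., about n²/2 comparisons:

```csharp
using System;

// Worst-case recurrence: partitioning costs n, and the only
// non-empty sub-list has n - 1 elements.
long C(int n) => n <= 1 ? 0 : C(n - 1) + n;

Console.WriteLine(C(100));             // 5049
Console.WriteLine(100L * 101 / 2 - 1); // 5049
```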