Sorting I
Sorting means arranging items in order — like alphabetizing a list of names or ordering numbers from smallest to largest. It is one of the most fundamental operations in computer science because so many other tasks (searching, finding duplicates, matching data) become much easier once data is sorted. This week you will learn four sorting methods: Selection Sort (scan for the smallest, put it first, repeat), Insertion Sort (pick each item and slide it into the right place in the already-sorted portion), Merge Sort (split the list in half, sort each half, then merge them back together), and QuickSort (pick a pivot, partition around it, recurse on both sides). You will also learn the divide-and-conquer strategy, how to use the Master Theorem to calculate running times of recursive algorithms, a surprising mathematical proof that no comparison-based sorting algorithm can ever do better than Ω(n log n) in the worst case, and Counting Sort — a non-comparison sort that bypasses this limit.
| Algorithm | Best Case | Average | Worst Case | Stable? | In-place? |
|---|---|---|---|---|---|
| Selection Sort | O(n²) | O(n²) | O(n²) | No | Yes |
| Insertion Sort | O(n) | O(n²) | O(n²) | Yes | Yes |
| Merge Sort | O(n log n) | O(n log n) | O(n log n) | Yes | No (O(n) extra) |
| QuickSort | O(n log n) | O(n log n) | O(n²) | No | Yes |
| Counting Sort | O(n + k) | O(n + k) | O(n + k) | Yes | No (O(n + k) extra) |
Key Concepts
Selection Sort
Imagine you have a row of playing cards face-up on a table and you want to arrange them from smallest to largest. Selection Sort works like this: scan all the cards to find the smallest one, then swap it with the first card. Now the first position is correct. Next, scan the remaining cards (positions 2 onward) to find the next smallest, and swap it into position 2. Repeat until every card is in the right place. The downside? Even if the cards are already mostly sorted, Selection Sort still scans through everything — it always does the same number of comparisons, making it O(n²) no matter what.
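To make the scan-and-swap idea concrete, here is a minimal Python sketch (the function name and details are our own illustration, not a prescribed implementation):

```python
def selection_sort(a):
    """Sort the list a in place by repeatedly selecting the minimum of the unsorted part."""
    n = len(a)
    for i in range(n - 1):
        # Find the index of the smallest element in a[i:]
        min_idx = i
        for j in range(i + 1, n):
            if a[j] < a[min_idx]:
                min_idx = j
        # Swap it into position i; positions 0..i are now in their final places
        a[i], a[min_idx] = a[min_idx], a[i]
    return a
```

For example, selection_sort([5, 2, 4, 1]) returns [1, 2, 4, 5]; note that the inner loop runs the same number of times even if the input is already sorted.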
- Always performs exactly n(n − 1)/2 comparisons regardless of input order — there is no 'lucky' input that makes it faster
- In-place (uses no extra memory beyond the original array) but not stable (equal elements may get reordered during swaps)
- Only does O(n) swaps — this can be useful when writing data is expensive (like writing to flash memory), even though comparisons are O(n²)
Insertion Sort
Insertion Sort works the way most people sort a hand of playing cards. You hold a sorted portion in your left hand and pick up one new card at a time with your right hand. For each new card, you slide it into the correct position among the cards already in your left hand. The algorithm does the same thing with an array: it starts from the second element, compares it with elements to its left, and shifts larger elements to the right until the correct spot is found. The beauty of Insertion Sort is that if the data is already nearly sorted, very few shifts are needed, making it run in O(n) — linear time. However, in the worst case (a reverse-sorted array), every element must be shifted all the way to the front, resulting in O(n²) time.
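A minimal Python sketch of this card-by-card insertion (again our own illustrative code, not a required implementation):

```python
def insertion_sort(a):
    """Sort the list a in place by inserting each element into the sorted prefix."""
    for i in range(1, len(a)):
        key = a[i]        # the "card" being picked up
        j = i - 1
        # Shift elements larger than key one slot to the right to open a gap
        while j >= 0 and a[j] > key:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key    # drop the key into the gap
    return a
```

On an already-sorted input the while loop never runs, so the total work is linear; on a reverse-sorted input every element is shifted all the way to the front.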
- Best case O(n) when the array is already sorted — each element is compared once and stays in place, so only n − 1 comparisons are needed
- Stable sorting algorithm — if two elements are equal, they stay in their original relative order (important when sorting complex records by one field)
- Adaptive — the closer the data is to being sorted, the fewer operations it performs, making it a great choice for 'almost sorted' data
- In-place with O(1) extra space — only needs one temporary variable to hold the element being inserted
Merge Sort
Merge Sort introduces one of the most powerful ideas in computer science: divide and conquer. Instead of trying to sort the whole array at once, you split it in half, sort each half separately (by applying the same strategy recursively), and then merge the two sorted halves back together. The merging step is the clever part — since both halves are already sorted, you just compare the front elements of each half, take the smaller one, and repeat. This guarantee of O(n log n) time makes Merge Sort much faster than Selection Sort or Insertion Sort for large inputs. The trade-off is that it needs O(n) extra memory to hold temporary copies during merging, so it is not 'in-place.'
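Here is a short Python sketch of the split-sort-merge idea (our own illustration; it returns a new list rather than sorting in place):

```python
def merge_sort(a):
    """Return a new sorted copy of a using divide and conquer."""
    if len(a) <= 1:
        return a                      # base case: 0 or 1 element is already sorted
    mid = len(a) // 2
    left = merge_sort(a[:mid])        # sort each half recursively
    right = merge_sort(a[mid:])
    # Merge: repeatedly take the smaller front element of the two sorted halves
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:       # <= keeps equal elements in order (stability)
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])           # one half may have leftovers; append them
    merged.extend(right[j:])
    return merged
```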
- Recurrence relation: T(n) = 2T(n/2) + O(n) — this says the time to sort n items equals two sorts of n/2 items plus O(n) work for merging
- Solved by the Master Theorem to get T(n) = O(n log n) — much faster than O(n²) algorithms for large inputs (a worked application of the theorem follows this list)
- Stable sorting algorithm — equal elements keep their original order, which is useful when sorting records by multiple fields
- Not in-place — requires O(n) extra memory for temporary arrays during the merge step, which can be a downside for very large datasets
- ⚠ Common mistake: If an exam asks for a sort with guaranteed O(n log n) worst case, use Merge Sort or HeapSort. Do NOT say QuickSort — its worst case is O(n²).
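To see how the Master Theorem step works (a worked application in our own words, using the standard statement of the theorem): in T(n) = 2T(n/2) + O(n) there are a = 2 subproblems, each of size n/b with b = 2, and the merge cost is f(n) = O(n). Since n^(log_b a) = n^(log₂ 2) = n grows at the same rate as f(n), the theorem's balanced case applies and gives T(n) = O(n log n).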
QuickSort
QuickSort is arguably the most widely used sorting algorithm in practice. Like Merge Sort, it uses divide and conquer, but with a twist: instead of splitting the array in the middle, it picks a pivot element and partitions the array so that everything smaller than the pivot goes to its left and everything larger goes to its right. After partitioning, the pivot is in its final sorted position. Then QuickSort recursively sorts the left and right portions.
The most common partition scheme is Lomuto partition: use the last element as the pivot, maintain a boundary index i, and scan through the array with an index j. Each time you find an element ≤ pivot, advance i and swap that element to the boundary position i. At the end, swap the pivot into position i + 1.
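A minimal Python sketch of Lomuto partitioning and the QuickSort built on it (our own illustration of the scheme just described):

```python
def lomuto_partition(a, lo, hi):
    """Partition a[lo..hi] around the pivot a[hi]; return the pivot's final index."""
    pivot = a[hi]
    i = lo - 1                           # a[lo..i] holds the elements <= pivot so far
    for j in range(lo, hi):
        if a[j] <= pivot:
            i += 1
            a[i], a[j] = a[j], a[i]      # move the small element up to the boundary
    a[i + 1], a[hi] = a[hi], a[i + 1]    # place the pivot just past the boundary
    return i + 1

def quicksort(a, lo=0, hi=None):
    """Sort a[lo..hi] in place by partitioning and recursing on both sides."""
    if hi is None:
        hi = len(a) - 1
    if lo < hi:
        p = lomuto_partition(a, lo, hi)
        quicksort(a, lo, p - 1)          # everything smaller than the pivot
        quicksort(a, p + 1, hi)          # everything larger than the pivot
    return a
```

Calling lomuto_partition([7, 2, 1, 8, 6, 3, 5, 4], 0, 7) reproduces the trace below and returns index 3.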
Worked example — Lomuto partition on [7, 2, 1, 8, 6, 3, 5, 4] with pivot 4:
- Start: i = -1 (boundary), scan j from left to right over indices 0 to 6
- j = 0: 7 > 4 → skip
- j = 1: 2 ≤ 4 → i = 0, swap A[0]↔A[1] → [2, 7, 1, 8, 6, 3, 5, 4]
- j = 2: 1 ≤ 4 → i = 1, swap A[1]↔A[2] → [2, 1, 7, 8, 6, 3, 5, 4]
- j = 3: 8 > 4 → skip
- j = 4: 6 > 4 → skip
- j = 5: 3 ≤ 4 → i = 2, swap A[2]↔A[5] → [2, 1, 3, 8, 6, 7, 5, 4]
- j = 6: 5 > 4 → skip
- Final: swap pivot A[7] with A[i+1] = A[3] → [2, 1, 3, 4, 6, 7, 5, 8]
- Pivot 4 is now at index 3 — its correct sorted position!
QuickSort's average case is O(n log n) because a random pivot typically splits the array roughly in half. But if the pivot is consistently the smallest or largest element (e.g., already sorted input with last-element pivot), the split is maximally unbalanced (0 elements vs n − 1 elements), giving O(n²) worst case. Randomized QuickSort fixes this by choosing the pivot randomly, making the O(n²) case astronomically unlikely.
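One way to sketch the randomized variant in Python (reusing the lomuto_partition helper from the earlier sketch; the details are our own illustration): swap a randomly chosen element into the pivot slot before partitioning.

```python
import random

def randomized_quicksort(a, lo=0, hi=None):
    """QuickSort with a uniformly random pivot; expected O(n log n) on any input."""
    if hi is None:
        hi = len(a) - 1
    if lo < hi:
        r = random.randint(lo, hi)      # pick a random index in [lo, hi]
        a[r], a[hi] = a[hi], a[r]       # move that element into the pivot slot
        p = lomuto_partition(a, lo, hi)
        randomized_quicksort(a, lo, p - 1)
        randomized_quicksort(a, p + 1, hi)
    return a
```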
- Average case O(n log n): a random pivot splits the array roughly in half each time, giving the recurrence T(n) ≈ 2T(n/2) + O(n)
- Worst case O(n²): occurs when the pivot is always the smallest or largest element (e.g., already sorted input with last-element pivot), giving T(n) = T(n − 1) + O(n), which sums to O(n²)
- Randomized QuickSort: pick the pivot randomly (or use median-of-three) to avoid worst-case behavior — the expected time is O(n log n) regardless of input order
- In-place with O(log n) extra space for the recursion stack (average case) — much less memory than Merge Sort's O(n) extra space
- Not stable — the partition step can change the relative order of equal elements during swaps
- ⚠ Common mistake: Saying QuickSort is O(n log n) without qualification. Always specify: expected O(n log n) with random pivot, or O(n log n) average case. The worst case is O(n²). On exams, if asked for a guaranteed O(n log n) sort, use MergeSort or HeapSort — NOT QuickSort.
Lower Bound for Comparison Sorts
Here is a surprising theoretical result: no matter how clever you are, if your sorting algorithm works by comparing pairs of elements (like all the algorithms we have seen so far), it must make at least Ω(n log n) comparisons in the worst case. How do we know this? We can model every possible comparison-based algorithm as a 'decision tree' — a branching diagram where each internal node represents a comparison (is A[i] < A[j]?), each branch represents the outcome (yes or no), and each leaf represents a final sorted order. Since there are n! (n factorial) possible ways to arrange n items, the tree must have at least n! leaves. The minimum height of a binary tree with n! leaves is log₂(n!), which by Stirling's approximation equals Ω(n log n). This means algorithms like Merge Sort and Heap Sort are optimal — they hit this theoretical speed limit.
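If the Stirling step feels mysterious, a rougher but self-contained bound (our own aside) reaches the same conclusion: the product n! = n · (n−1) · ... · 2 · 1 contains at least n/2 factors that are each at least n/2, so n! ≥ (n/2)^(n/2), and therefore log₂(n!) ≥ (n/2) · log₂(n/2) = Ω(n log n).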
- Any comparison-based sort can be drawn as a decision tree where each internal node is one comparison between two elements
- The tree must have at least n! leaves because there are n! possible orderings (permutations) of the input
- The minimum height of a binary tree with n! leaves is at least log₂(n!), and using Stirling's approximation, log₂(n!) = Ω(n log n)
- This proves Merge Sort and Heap Sort are asymptotically optimal among comparison sorts — you cannot do fundamentally better
Counting Sort
If the lower bound says no comparison sort can do better, how can we sort faster? The answer is: stop comparing! Counting Sort does not compare elements at all — instead, it counts how many times each value appears. If you know all values fall in the range 0 to k, you create a count array of size k + 1, scan the input to fill the counts, compute prefix sums (so you know where each value belongs in the output), and place each element directly into its correct position. The total time is O(n + k): O(n) to count, O(k) for prefix sums, and O(n) to place. When k = O(n), this is O(n) — faster than any comparison sort! The catch: if k is huge (like sorting 32-bit integers where k ≈ 4 billion), the count array becomes impractically large.
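A compact Python sketch of the count, prefix-sum, and placement steps (our own illustration; it assumes the input holds integers in the range 0 to k):

```python
def counting_sort(a, k):
    """Return a sorted copy of a, where every element is an integer in 0..k."""
    count = [0] * (k + 1)
    for x in a:                      # O(n): tally how many times each value appears
        count[x] += 1
    for v in range(1, k + 1):        # O(k): prefix sums; count[v] = number of elements <= v
        count[v] += count[v - 1]
    out = [0] * len(a)
    for x in reversed(a):            # O(n): reverse scan keeps the sort stable
        count[x] -= 1
        out[count[x]] = x
    return out
```

For example, counting_sort([3, 1, 4, 1, 5], 5) returns [1, 1, 3, 4, 5]; no two elements are ever compared.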
- Not a comparison sort — it counts values and uses them as array indices, which is why it can beat the Ω(n log n) lower bound for comparison sorts
- Stable — the reverse scan in the placement step ensures equal elements maintain their original relative order
- Practical only when k = O(n) — if the range of values is much larger than the number of elements, the count array wastes too much space
- Often used as a subroutine in Radix Sort, which sorts numbers digit by digit using Counting Sort for each digit position
- ⚠ Common mistake: Counting Sort does NOT contradict the Ω(n log n) lower bound — that bound only applies to comparison-based sorts. Counting Sort uses a fundamentally different technique (indexing, not comparing).