Greedy & Amortized Analysis
Sometimes the simplest strategy is the best: just make the best choice available right now, at each step, without worrying about future consequences. This is the greedy approach, and surprisingly, it produces optimal solutions for many important problems — as long as the problem has the right structure. This week you will learn when greedy works (activity selection, fractional knapsack, Huffman coding) and when it does not (0/1 knapsack — where DP is needed instead). You will also learn amortized analysis, a technique for understanding the true cost of operations that occasionally spike in expense. For example, when a dynamic array doubles in size, that one expansion costs O(n), but because it happens so rarely, the average cost per insertion is still O(1). Amortized analysis makes this intuition mathematically rigorous using three methods: aggregate, accounting, and the potential method.
Key Concepts
Greedy Algorithms
Greedy algorithms build a solution step by step, always making the choice that looks best right now without looking ahead. For example, in the activity selection problem (scheduling the most activities that do not overlap), the greedy strategy is simple: sort all activities by their finish time, and always pick the earliest-finishing compatible activity. This seems almost too simple, but it provably gives the optimal answer. Greedy algorithms work when two conditions hold: (1) the greedy choice property — choosing the locally best option is always part of some globally best solution, and (2) optimal substructure — after making the greedy choice, the remaining problem is a smaller instance of the same type. The classic way to prove a greedy algorithm is correct is the 'exchange argument': assume you have any optimal solution, then show you can swap its choices for the greedy choices without making the result worse.
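To make this concrete, here is a minimal sketch of activity selection in Python. The function name and sample data are illustrative, not from the text; the greedy rule is exactly the one described above.

```python
# A minimal sketch of greedy activity selection. Each activity is a
# (start, finish) pair; the sample data below is illustrative.

def select_activities(activities):
    """Return a maximum-size set of pairwise non-overlapping activities."""
    selected = []
    last_finish = float("-inf")
    # Greedy choice: consider activities in order of finish time and
    # always take the earliest-finishing one compatible with the picks so far.
    for start, finish in sorted(activities, key=lambda a: a[1]):
        if start >= last_finish:
            selected.append((start, finish))
            last_finish = finish
    return selected

print(select_activities([(1, 4), (3, 5), (0, 6), (5, 7), (3, 9), (6, 10), (8, 11)]))
# [(1, 4), (5, 7), (8, 11)]
```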
- Greedy choice property: the locally optimal choice is always part of a globally optimal solution — if this is not true, greedy will give a wrong answer
- Exchange argument proof: take any optimal solution, replace a non-greedy choice with the greedy choice, and show the result is at least as good — this proves greedy is optimal
- Activity selection: sort activities by finish time, greedily pick the next compatible one — this maximizes the number of non-overlapping activities
- Fractional knapsack: greedily take items with the highest value-per-weight ratio — since you can take fractions of items, this always works. But for 0/1 knapsack (whole items only), greedy fails!
- Weighted interval scheduling: when activities have different values (not just a count), the simple greedy fails — you need DP. Sort by finish time, let p(j) be the last activity that finishes before activity j starts, and use the recurrence OPT(j) = max(v_j + OPT(p(j)), OPT(j-1))
- Key insight: greedy is much simpler and faster than DP, but it only works for problems with the right structure — always verify the greedy choice property before using it
- ⚠ Common mistake: Greedy works for fractional knapsack but NOT for 0/1 knapsack. Counterexample: items with weights [10, 20, 30] and values [60, 100, 120], capacity 50. Greedy by value/weight takes item 1 (ratio 6) and item 2 (ratio 5) for value 160, but optimal is items 2+3 for value 220. (A runnable version of this counterexample follows this list.)
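Here is a sketch of the fractional greedy run on exactly those numbers; the function name is illustrative. Because fractions are allowed, the greedy fills the knapsack completely and is optimal, and the contrast with the 0/1 outcome is visible in the final comment.

```python
# Greedy fractional knapsack on the counterexample above: take items in
# decreasing value/weight order, splitting the last item if needed.

def fractional_knapsack(weights, values, capacity):
    items = sorted(zip(weights, values), key=lambda wv: wv[1] / wv[0], reverse=True)
    total = 0.0
    for w, v in items:
        take = min(w, capacity)       # whole item, or the fraction that fits
        total += v * take / w
        capacity -= take
        if capacity == 0:
            break
    return total

print(fractional_knapsack([10, 20, 30], [60, 100, 120], 50))
# 240.0 -- items 1 and 2 whole, plus 2/3 of item 3. Restricted to whole
# items, the same greedy gets only 160, while the 0/1 optimum is 220.
```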
Huffman Coding
Huffman coding is a brilliant application of greedy algorithms to data compression. The problem: given a text where some characters appear much more frequently than others, assign binary codes to each character such that the total number of bits is minimized. The key constraint is that the code must be 'prefix-free' — no character's code can be the beginning of another character's code (this ensures unambiguous decoding). Huffman's greedy approach: use a priority queue, repeatedly extract the two characters (or subtrees) with the lowest frequencies, merge them into a new internal node whose frequency is their sum, and put it back. When only one tree remains, that is the Huffman tree. Characters deep in the tree get longer codes, and characters near the root get shorter codes. Since the most frequent characters end up near the root, the total encoding length is minimized.
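The merge loop is short enough to sketch in full. This version uses Python's heapq as the priority queue; the frequencies are the standard textbook example and the integer tie-breaker is an implementation detail added so the heap never compares two subtrees directly.

```python
# A minimal Huffman-coding sketch: repeatedly merge the two
# lowest-frequency subtrees, then read codes off the finished tree.
import heapq

def huffman_codes(freq):
    """Return a {char: bitstring} code table for a frequency dict."""
    # Heap entries are (frequency, tie_breaker, tree); a tree is either
    # a leaf character or a (left, right) pair of subtrees.
    heap = [(f, i, ch) for i, (ch, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        # Greedy step: merge the two lowest-frequency subtrees.
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next_id, (left, right)))
        next_id += 1
    # Read codes off the tree: left edge = '0', right edge = '1'.
    codes = {}
    def walk(tree, prefix):
        if isinstance(tree, tuple):
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:
            codes[tree] = prefix or "0"  # lone-character edge case
    walk(heap[0][2], "")
    return codes

print(huffman_codes({"a": 45, "b": 13, "c": 12, "d": 16, "e": 9, "f": 5}))
# The most frequent character 'a' gets a 1-bit code; the rarest,
# 'e' and 'f', get the longest (4-bit) codes.
```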
- Prefix-free property: no code is a prefix of another code — this means you can decode a stream of bits unambiguously without needing separators between characters
- Greedy strategy: always merge the two lowest-frequency nodes — this ensures the least common characters end up deepest in the tree (with the longest codes) while common characters get short codes
- Optimal for character-by-character encoding — Huffman coding produces the minimum possible total number of bits for any prefix-free code
- The tree is built bottom-up from the leaves (individual characters) to the root — left edges represent '0' and right edges represent '1', so a character's code is the sequence of edge labels from root to leaf
Amortized Analysis
Sometimes the worst-case cost of a single operation is misleading. Consider a dynamic array (like Python's list or Java's ArrayList): most insertions are O(1), but occasionally the array fills up and must be doubled in size, copying all n elements — an O(n) operation. If you only look at the worst case, you might think every insert is O(n), but that is far too pessimistic. Amortized analysis gives the true average cost per operation over any sequence of operations (not probabilistic — this is a worst-case guarantee over the sequence). There are three methods: (1) Aggregate method — add up the total cost of n operations and divide by n, (2) Accounting method — overcharge cheap operations by a little bit (saving 'credits') that pay for the occasional expensive operation, and (3) Potential method — define a 'potential function' that captures stored-up energy that is released during expensive operations.
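A quick empirical check of the dynamic-array claim may help before the formal methods. This sketch counts actual costs under the usual model (one unit per element written or copied, capacity starting at 1 and doubling when full); the function name is made up for illustration.

```python
# Aggregate method, empirically: count the actual cost of n appends into
# a doubling array, where each append costs 1 plus (current size) extra
# whenever the array is full and every element must be copied.

def total_append_cost(n):
    capacity, size, total = 1, 0, 0
    for _ in range(n):
        cost = 1                  # write the new element
        if size == capacity:      # full: double capacity, copy everything
            cost += size
            capacity *= 2
        total += cost
        size += 1
    return total

for n in (10, 1_000, 1_000_000):
    print(n, total_append_cost(n) / n)  # stays below 3, matching the 3n bound
```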
- Aggregate method: compute the total cost T(n) for n operations, then amortized cost = T(n)/n — simplest approach but not always easy to apply
- Accounting method: assign each operation a 'charge' that may be higher than its actual cost — the excess is stored as credit that pays for future expensive operations. The credit must never go negative.
- Potential method: define a potential function Φ(state) — the amortized cost of operation i is (actual cost) + (change in potential). Expensive operations decrease potential (using stored energy), while cheap operations increase it.
- Classic example: dynamic array doubling — most inserts cost O(1), but doubling costs O(n). Over n inserts, total cost is at most 3n, so amortized cost per insert is O(1) (the simulation above checks this bound empirically)
- Another example: stack with multipop — popping k elements at once costs O(k), but over any sequence of n operations the total number of pops cannot exceed the total number of pushes, so each operation is O(1) amortized (a sketch with the potential-method bookkeeping follows this list)
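The multipop stack is compact enough to sketch in full, with the potential-method bookkeeping written out in the comments using Φ = current stack size. The class and method names are illustrative.

```python
# A multipop stack. With potential Phi = number of elements on the
# stack, every operation has amortized cost at most 2, so any sequence
# of n operations costs O(n) in total.

class MultipopStack:
    def __init__(self):
        self.items = []

    def push(self, x):
        # Actual cost 1; Phi rises by 1, so amortized cost = 1 + 1 = 2.
        self.items.append(x)

    def multipop(self, k):
        """Pop up to k elements (fewer if the stack runs out)."""
        popped = []
        while self.items and k > 0:
            popped.append(self.items.pop())
            k -= 1
        # Actual cost = len(popped); Phi falls by the same amount, so
        # the amortized cost is len(popped) - len(popped) = 0.
        return popped

s = MultipopStack()
for i in range(5):
    s.push(i)                 # five pushes, amortized cost 2 each
print(s.multipop(3))          # [4, 3, 2] -- paid for by earlier pushes
```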