Dashboard/Block 2: Data Structures/Week 5
Week 5Data Structures

BSTs & Balanced Trees

A Binary Search Tree (BST) organizes data so that searching, inserting, and deleting items follow one simple rule: smaller values go left, larger values go right. This makes lookups fast — in a balanced tree you can find any element among millions in about 20 steps. But there is a catch: if you insert elements in sorted order, the tree becomes a tall, skinny chain (like a linked list), and operations slow down to O(n). To fix this, computer scientists invented self-balancing trees like Red-Black Trees and 2-3 Trees, which automatically reorganize themselves through rotations and node splits to guarantee O(log n) height. This week covers standard BST operations, tree traversals, and these important balanced tree variants.

Learning Objectives

Implement BST search (follow left/right based on comparison), insert (find the right leaf spot), and delete (handle the three cases: leaf node, one child, and two children)
Understand the three traversal orders: inorder (gives sorted output), preorder (root first), and postorder (root last), and when each is useful
Know the four properties of Red-Black Trees and how rotations and recoloring keep the tree balanced after insertions
Understand 2-3 Trees — where each node holds 1 or 2 keys — and how they are structurally equivalent to Red-Black Trees

Key Concepts

Binary Search Tree Operations

A Binary Search Tree (BST) is a tree-shaped data structure where every node follows one simple rule: all values in the left subtree are smaller, and all values in the right subtree are larger. This rule makes searching efficient — to find a value, start at the root, go left if your target is smaller, go right if it is larger, and repeat until you find it or reach a dead end. Inserting works the same way: walk down the tree following the rule until you find an empty spot, then place the new value there. Deleting is trickier and has three cases: (1) the node is a leaf — just remove it, (2) the node has one child — replace it with that child, (3) the node has two children — find its inorder successor (the smallest value in the right subtree), copy that value up, and delete the successor. All operations take O(h) time where h is the tree height — O(log n) if the tree is balanced, but O(n) if it degenerates into a straight line.

  • -

    Inorder traversal (visit left subtree, then root, then right subtree) gives elements in sorted order — this is a key property of BSTs

  • -

    Delete has 3 cases: (1) leaf — simply remove, (2) one child — replace the node with its child, (3) two children — replace with the inorder successor and delete that successor

  • -

    The inorder successor is the smallest element in the right subtree — it is the next value in sorted order after the deleted node

  • -

    If you insert elements in sorted order (like 1, 2, 3, 4...), the BST degenerates into a linked list with height n, making all operations O(n) — this is why balanced trees are important

Red-Black Trees

Red-Black Trees solve the 'unbalanced BST' problem by adding a simple coloring rule. Each node is painted either red or black, and the tree must follow four rules that together guarantee it stays approximately balanced. The result is that the height is always at most 2·log₂(n+1), so search, insert, and delete are guaranteed O(log n) even in the worst case. When an insertion violates the rules (for example, creating two red nodes in a row), the tree fixes itself through 'rotations' (rearranging a few nodes) and 'recoloring' (changing node colors). These fixes are local and take O(1) time each, so the total insertion cost remains O(log n). Red-Black Trees are one of the most widely used data structures in practice — they power Java's TreeMap, C++ std::map, and many database indexes.

  • -

    Property 1: Every node is colored either red or black

  • -

    Property 2: The root is always black

  • -

    Property 3: A red node's children must both be black — no two red nodes can appear in a row on any path (the 'no red-red' rule)

  • -

    Property 4: Every path from the root to a null leaf passes through exactly the same number of black nodes (called the 'black height')

  • -

    These rules ensure the longest path is at most twice the shortest path, giving height ≤ 2·log₂(n+1) and guaranteeing O(log n) operations

  • -

    After insertion, at most 2 rotations and some recolorings are needed to restore the properties — making fixes very efficient

2-3 Trees

A 2-3 Tree is another way to keep a search tree balanced, and it is actually the intuition behind Red-Black Trees. In a 2-3 Tree, each node can hold either 1 key (a '2-node' with 2 children) or 2 keys (a '3-node' with 3 children), and all leaves must be at exactly the same depth. This 'same depth' rule means the tree is always perfectly balanced. When you insert a new element, it goes into a leaf. If that leaf now has 3 keys (too many), it 'splits' — the middle key moves up to the parent, creating two new children. This splitting can propagate upward to the root, and if the root splits, the tree grows one level taller. The beautiful connection is that every 2-3 Tree can be converted into an equivalent Red-Black Tree and vice versa — they are two different representations of the same balanced structure.

  • -

    A 2-node holds one key and has two children (like a normal BST node): values less than the key go left, values greater go right

  • -

    A 3-node holds two keys and has three children: values less than the first key, values between the two keys, and values greater than the second key

  • -

    All leaves are at the same depth, which guarantees the tree is perfectly balanced with height Θ(log n)

  • -

    Insertion may cause a node to temporarily have 3 keys ('overflow'), which triggers a split — the middle key moves up to the parent, potentially causing more splits up the tree

  • -

    Every 2-3 Tree has an equivalent Red-Black Tree and vice versa — a 3-node corresponds to a red node connected to its black parent