Advanced Pipeline & Hazard Resolution
This week we learn techniques to resolve pipeline hazards without stalling: data forwarding (bypassing) and instruction scheduling. We also study the unavoidable load-use hazard that still requires one stall cycle even with full forwarding.
Learning Objectives
Key Concepts
Data Forwarding (Bypassing)
Forwarding passes a result directly from where it's produced to where it's needed, bypassing the register file. There are two main forwarding paths:
- EX/MEM → EX: the ALU result is forwarded to the ALU input of the next instruction
- MEM/WB → EX: a loaded value or ALU result is forwarded to the ALU input of the instruction two positions later
Forwarding eliminates the stalls for ALU-to-ALU RAW hazards; the load-use case is the exception.
- Forwarding from the EX/MEM pipeline register: the value is available at the end of the EX stage
- Forwarding from the MEM/WB pipeline register: the value is available at the end of the MEM stage
- Forwarding hardware adds multiplexers in front of the ALU inputs
- Without forwarding, a RAW hazard between adjacent instructions costs 2 stall cycles, assuming the register file is written in the first half of a cycle and read in the second half (3 stall cycles otherwise)
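The multiplexer selection described above can be sketched in a few lines of Python. This is a minimal illustration, not a hardware description: the function name `forward_select`, the tuple layout, and the string labels are assumptions made for this example. It also assumes MIPS-style register 0 (hardwired to zero, so never forwarded) and that a load-use hazard has already been handled by a stall.

```python
def forward_select(src_reg, ex_mem, mem_wb):
    """Pick the source for one ALU operand of the instruction in EX.

    ex_mem and mem_wb are (reg_write, dest_reg) tuples describing the
    instructions currently held in those pipeline registers.
    (Hypothetical layout, chosen for this sketch.)
    """
    ex_mem_write, ex_mem_rd = ex_mem
    mem_wb_write, mem_wb_rd = mem_wb
    # Check EX/MEM first: it holds the most recent value of the register.
    if ex_mem_write and ex_mem_rd != 0 and ex_mem_rd == src_reg:
        return "EX/MEM"
    # Otherwise fall back to the value from two instructions earlier.
    if mem_wb_write and mem_wb_rd != 0 and mem_wb_rd == src_reg:
        return "MEM/WB"
    # No hazard: use the value read from the register file in ID.
    return "REGFILE"


# add $1,$2,$3 followed by sub $4,$1,$5: operand $1 comes from EX/MEM.
print(forward_select(1, ex_mem=(True, 1), mem_wb=(False, 0)))
```

Checking EX/MEM before MEM/WB matters when both pipeline registers hold a write to the same register: the younger instruction's result is the one the program expects.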
Load-Use Hazard
A load-use hazard occurs when a lw instruction is immediately followed by an instruction that uses the loaded value. Even with full forwarding, the loaded data is not available until the end of the MEM stage, but the next instruction needs it at the start of its EX stage, which falls in the same cycle. The value cannot be forwarded backward in time, so exactly one stall cycle (bubble) is required.
- lw produces its result at the end of MEM, but the next instruction needs it at the start of EX
- One mandatory stall cycle even with full forwarding
- If at least one other instruction sits between the lw and the instruction that uses its result, no stall is needed: the MEM/WB → EX path delivers the value in time
- Pipeline scheduling can place an independent instruction in the bubble slot
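The stall condition can be written as a small check, sketched here in Python. The function name `needs_load_use_stall` and its parameters are hypothetical names for this example; the logic assumes the load sits in EX while the dependent instruction sits in ID, and that register 0 is hardwired to zero as in MIPS.

```python
def needs_load_use_stall(id_ex_mem_read, id_ex_dest, if_id_sources):
    """Return True when the instruction in ID must wait one cycle.

    id_ex_mem_read: True if the instruction currently in EX is a load.
    id_ex_dest:     register number that load will write.
    if_id_sources:  register numbers read by the instruction in ID.
    """
    return id_ex_mem_read and id_ex_dest != 0 and id_ex_dest in if_id_sources


# lw $2, 20($1) followed by and $4, $2, $5: one bubble is needed.
print(needs_load_use_stall(True, 2, (2, 5)))
# lw $2, 20($1) followed by or $4, $6, $7: no dependence, no stall.
print(needs_load_use_stall(True, 2, (6, 7)))
```

When the check fires, the pipeline holds the instruction in ID for one cycle; after that, the loaded value reaches the ALU through the MEM/WB → EX path.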
Pipeline Scheduling
Pipeline scheduling (or instruction scheduling) reorders independent instructions to fill stall slots. To avoid hazards, the compiler or the hardware rearranges instructions without changing program semantics.
- Move independent instructions into load-use bubble slots
- Must preserve data dependencies (only reorder independent instructions)
- Compiler scheduling is done at compile time, which keeps the hardware simpler
- Hardware (dynamic) scheduling is more flexible but more complex
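To see the effect of scheduling, the sketch below counts load-use bubbles for two orderings of the same instructions. It is an illustrative model only: the tuple encoding `(opcode, dest, sources)`, the register names, and the function `count_load_use_stalls` are assumptions for this example. The model assumes full forwarding and treats every source operand as needed in EX. The instruction sequence computes two sums from three loaded values; the reordering moves the third load earlier, which is safe here because it does not depend on the intervening instructions and reads a different address than the store it passes.

```python
def count_load_use_stalls(program):
    """Count bubbles in a 5-stage pipeline with full forwarding.

    Each instruction is (opcode, dest, sources); dest is None for stores.
    Only a load immediately followed by a consumer costs a cycle.
    """
    stalls = 0
    for prev, cur in zip(program, program[1:]):
        prev_op, prev_dest, _ = prev
        _, _, cur_sources = cur
        if prev_op == "lw" and prev_dest in cur_sources:
            stalls += 1
    return stalls


original = [
    ("lw",  "t1", ("t0",)),        # lw  $t1, 0($t0)
    ("lw",  "t2", ("t0",)),        # lw  $t2, 4($t0)
    ("add", "t3", ("t1", "t2")),   # add $t3, $t1, $t2   (uses $t2 right after its load)
    ("sw",  None, ("t3", "t0")),   # sw  $t3, 12($t0)
    ("lw",  "t4", ("t0",)),        # lw  $t4, 8($t0)
    ("add", "t5", ("t1", "t4")),   # add $t5, $t1, $t4   (uses $t4 right after its load)
    ("sw",  None, ("t5", "t0")),   # sw  $t5, 16($t0)
]

# Same instructions with the third load hoisted above the first add.
scheduled = [original[0], original[1], original[4],
             original[2], original[3], original[5], original[6]]

print(count_load_use_stalls(original))   # 2
print(count_load_use_stalls(scheduled))  # 0
```

Under this model the reordered sequence finishes two cycles sooner while executing the same seven instructions.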