CPU Optimization
Out-of-Order Execution (OoOE)
[Figure: CPU with separate instruction and data caches. Blocks shown: Instruction Cache, FLASH, RAM, ALU, Data Cache]
[Figure: An Ideal Pipeline]
[Figure: Pipeline with instruction & data cache. Labeled cases: Instruction NOT in cache; Operand NOT in cache]
Example: How long would this program take to execute?
R1 = R2
R2 = R1 + R3
R4 = R3 + R5
* R3, R4, and R5 are already in the data cache
Sequential Execution (Time = 12 Units)
R1=R2      F D A E W
R2=R1+R3   F D A E W
R4=R3+R5   F D A E W

Re-ordered Execution (Time = 12 Units)
R4=R3+R5   F E A E W
R1=R2      F D A E W
R2=R1+R3   F E A E W
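
Why the third instruction of the example can be moved ahead of the other two can be checked with a small register-hazard scan. The following is only a minimal sketch: the tuple representation of the program and the function names `hazards` and `independent` are hypothetical, and the pairwise check is a simplification that ignores intervening writes.

```python
# Hypothetical encoding of the example program: (destination, [sources]).
PROGRAM = [
    ("R1", ["R2"]),        # R1 = R2
    ("R2", ["R1", "R3"]),  # R2 = R1 + R3
    ("R4", ["R3", "R5"]),  # R4 = R3 + R5
]


def hazards(program):
    """List (earlier, later, kind) for each register hazard between instruction pairs."""
    found = []
    for i, (dst_i, srcs_i) in enumerate(program):
        for j in range(i + 1, len(program)):
            dst_j, srcs_j = program[j]
            if dst_i in srcs_j:   # later instruction reads what the earlier one writes
                found.append((i, j, "RAW"))
            if dst_j in srcs_i:   # later instruction overwrites what the earlier one reads
                found.append((i, j, "WAR"))
            if dst_i == dst_j:    # both instructions write the same register
                found.append((i, j, "WAW"))
    return found


def independent(program):
    """Indices of instructions that take part in no hazard and may be re-ordered freely."""
    hz = hazards(program)
    involved = {i for i, _, _ in hz} | {j for _, j, _ in hz}
    return [k for k in range(len(program)) if k not in involved]


if __name__ == "__main__":
    print(hazards(PROGRAM))      # hazards between instruction 0 and instruction 1
    print(independent(PROGRAM))  # index of the instruction that can move
```

In this sketch, R2=R1+R3 must stay behind R1=R2 because it reads R1, while R4=R3+R5 shares no written register with either of them, which is what allows it to be issued first.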
[Figure: The CDC 6600, the first computer to use OoOE (1964). Source: Wikipedia]
Reservation Stations (Buffer)
Instruction   R1    R2    R3      R4      R5
R1=R2         RAM   RAM
R2=R1+R3      RAM   RAM   cache
R4=R3+R5                  cache   cache   cache
OoOE Hardware
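
The idea behind the reservation stations can be sketched in a few lines: an instruction waits in the buffer until all of its source operands are available, and whichever instruction is ready first is issued first. This is a rough illustration, not a description of real hardware. The latencies in `LATENCY` and `EXEC_TIME` are made-up values, the names `LOCATION`, `PROGRAM`, and `issue_order` are hypothetical, and the model ignores WAR/WAW hazards and contention for a single execution unit.

```python
# Assumed access and execute times in arbitrary time units (not from the slides).
LATENCY = {"cache": 1, "RAM": 4}
EXEC_TIME = 1

# Where each register's value initially resides, following the table above.
LOCATION = {"R1": "RAM", "R2": "RAM", "R3": "cache", "R4": "cache", "R5": "cache"}

# Hypothetical encoding: (text, destination, [sources]) in program order.
PROGRAM = [
    ("R1=R2", "R1", ["R2"]),
    ("R2=R1+R3", "R2", ["R1", "R3"]),
    ("R4=R3+R5", "R4", ["R3", "R5"]),
]


def issue_order(program):
    """Return instruction texts sorted by the time their operands become ready."""
    produced_at = {}  # register -> time an earlier instruction produces its value
    schedule = []
    for text, dst, srcs in program:
        ready = 0
        for reg in srcs:
            if reg in produced_at:
                # Operand comes from an earlier instruction: wait for its result.
                ready = max(ready, produced_at[reg])
            else:
                # Operand must be fetched from wherever it currently resides.
                ready = max(ready, LATENCY[LOCATION[reg]])
        produced_at[dst] = ready + EXEC_TIME
        schedule.append((ready, text))
    return [text for _, text in sorted(schedule)]


if __name__ == "__main__":
    print(issue_order(PROGRAM))
```

Under these assumed latencies, R4=R3+R5 is ready first because all of its operands are in the cache, which matches the re-ordered sequence shown earlier.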