Post on 02-Jan-2016
VTU – IISc Workshop
(C)RG@SERC,IISc
Compiler, Architecture and HPC Research in Heterogeneous Multi-Core Era
R. Govindarajan
CSA & SERC, IISc
govind@[csa,serc].iisc.ernet.in
Moore’s Law : Transistors
Moore’s Law : Performance
Processor performance doubles every 1.5 years
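As a rough illustration of what a 1.5-year doubling period implies, the growth factor after a given number of years can be computed as below. This is a minimal sketch: the function name `performance_factor` and the sample horizons are assumptions chosen for illustration, not figures from the talk.

```python
def performance_factor(years, doubling_period=1.5):
    """Relative performance after `years`, assuming one doubling
    every `doubling_period` years (the rule of thumb on this slide)."""
    return 2 ** (years / doubling_period)


if __name__ == "__main__":
    # Hypothetical horizons, chosen only to show the exponential growth.
    for years in (1.5, 3, 15):
        print(f"{years:>4} years -> {performance_factor(years):.0f}x")
```

Under this assumption, 15 years corresponds to ten doublings, which is about a thousandfold increase.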
Moore’s Law: Processor Architecture Roadmap (Pre-2000)
[Figure: processor roadmap with milestones First µP, RISC, VLIW, Super-scalar, EPIC]
Progress in Processor Architecture
• More transistors → New architecture innovations
  – Pipelined Architecture
  – Multiple Instruction Issue processors
    • VLIW
    • Superscalar
    • EPIC
  – More on-chip caches, multiple levels of cache hierarchy, speculative execution, …
Era of Instruction Level Parallelism
Influence on Compiler Optimization
Pipelined Architecture
VLIW Architecture
Superscalar Processor
EPIC
ILP Compilation Techniques (Instruction Scheduling, Register Allocation, Software Pipelining, …)
Superscalar Architecture
[Figure: superscalar pipeline with IF, ID, Issue, and Reg. Read stages feeding parallel functional units (Int. ALU, Ld/Store Unit, and an Align/Add/Align unit), each followed by Write Back]
• Multiple instructions are fetched, decoded, issued and executed in each cycle.
• Speculation, Cache/Memory hierarchy, Prefetching, Performance, Power Efficiency, …
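The effect of issuing several instructions per cycle can be sketched with a toy model. The code below is an illustration under stated assumptions, not a description of any real processor: every instruction has unit latency, the instruction names and dependences are hypothetical, and the issue policy is a simple greedy pick in program order.

```python
def schedule(deps, issue_width):
    """Greedy cycle-by-cycle issue model.

    `deps` maps each instruction name to the instructions it depends on,
    listed in program order. Each cycle, up to `issue_width` instructions
    whose dependences completed in an earlier cycle are issued.
    Assumes unit latency for every instruction (a simplification).
    """
    done = set()
    remaining = list(deps)  # program order
    cycles = []
    while remaining:
        ready = [i for i in remaining if all(d in done for d in deps[i])]
        if not ready:
            raise ValueError("cyclic or unsatisfiable dependences")
        issued = ready[:issue_width]
        cycles.append(issued)
        done.update(issued)
        remaining = [i for i in remaining if i not in done]
    return cycles


if __name__ == "__main__":
    # Hypothetical five-instruction fragment.
    deps = {
        "ld1": [],
        "ld2": [],
        "add": ["ld1", "ld2"],
        "mul": ["ld1"],
        "st": ["add"],
    }
    for width in (1, 2):
        print(width, schedule(deps, width))
```

In this model the fragment needs five cycles on a single-issue machine and three cycles with an issue width of two; the dependence chain ld, add, st limits any further gain.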
Progress in Processor Architecture (Post-2000)
• More transistors → New architecture innovations
  – Multiple Instruction Issue processors
  – More on-chip caches
  – Multi cores
Era of Multi-Cores
Multicores : The Right Turn
[Figure: performance of single-core designs at 1 GHz, 3 GHz, and 6 GHz compared with 3 GHz designs using 2, 4, and 16 cores]
Moore’s Law: Processor Architecture Roadmap (Post-2000)
[Figure: processor roadmap with milestones First µP, RISC, VLIW, Super-scalar, EPIC, Multi-cores]
Era of Multicores (Post 2000)
• Multiple cores in a single die
• Early efforts utilized multiple cores for multiple programs
• Throughput oriented rather than speedup-oriented!
Influence on Compilation Techniques
Multi-Core Processors
• Extracting Parallelism
• Thread-Level Parallelism
• Speculative Multithreading
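Thread-level parallelism amounts to splitting independent work across threads that can run on separate cores. The sketch below shows the idea with Python's standard `concurrent.futures` module; the function names and the sum-of-squares workload are hypothetical. Note that in CPython the global interpreter lock keeps CPU-bound threads from running simultaneously, so this shows the decomposition rather than a real speedup; `ProcessPoolExecutor` could be substituted for that.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_sum(chunk):
    """Work done by one thread: sum of squares over its own chunk."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(data, num_workers=4):
    """Split `data` into roughly equal chunks, one per worker,
    and combine the per-thread partial results."""
    size = max(1, -(-len(data) // num_workers))  # ceiling division
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        return sum(pool.map(partial_sum, chunks))


if __name__ == "__main__":
    print(parallel_sum_of_squares(list(range(10))))
```

The chunks share no data, so no synchronization is needed beyond the final reduction, which is the kind of independence a parallelizing compiler has to establish before it can create threads.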
MultiCore-Based Node
[Figure: eight cores (C0 to C7), each with its own L1 cache; the core pairs (C0, C2), (C4, C6), (C1, C3), and (C5, C7) each sit under one L2 cache; all L2 caches connect to Memory]
HPC Cluster using Multi-Core Nodes
[Figure: four nodes (Node 0 to Node 3), each with its own Memory and NIC, connected through a network switch]
Progress in Processor Architecture
• More transistors → New architecture innovations
  – Multiple Instruction Issue processors
  – More on-chip caches
  – Multi cores
  – Heterogeneous cores and accelerators
    • Graphics Processing Units (GPUs)
    • Cell BE, Clearspeed
    • Larrabee
    • Reconfigurable accelerators …
Era of Heterogeneous Accelerators
Moore’s Law: Processor Architecture Roadmap (Post-2000)
[Figure: processor roadmap with milestones First µP, RISC, VLIW, Super-scalar, EPIC, Multi-cores, Accelerators]
Accelerators
Why Bother about Accelerators?
Some Top500 Systems (Nov. 2009 List)
Rank  System            Description                        # Procs.      R_max (TFLOPS)
2     Roadrunner        Opteron + CellBE                   6480 + 12960  1,105
29    LANL              Opteron + CellBE                   14400         126.50
56    TSUBAME Grid      Opteron + Xeon + Clearspeed + GPU  31024         87.0
79    IBM Poughkeepsie  Opteron + CellBE                   7200          63.25
HPC Design Using Accelerators
• High level of performance from Accelerators
• Variety of general-purpose hardware accelerators
  – GPUs: nVidia, ATI
  – Accelerators: Clearspeed, Cell BE, …
  – Plethora of Instruction Sets even for SIMD
• Programmable accelerators, e.g., FPGA-based
• HPC Design using Accelerators
  – Exploit instruction-level parallelism
  – Exploit data-level parallelism on SIMD units
  – Exploit thread-level parallelism on multiple units/multi-cores
• Challenges
  – Portability across different generations and platforms
  – Ability to exploit different types of parallelism
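Data-level parallelism on a SIMD unit means one operation is applied to several data elements at once. The sketch below models that for a SAXPY-style computation (a * x + y) by processing the input in fixed-width groups. It is plain Python, so nothing actually executes on SIMD hardware; the lane width `LANES` and the function name `simd_saxpy` are assumptions made for illustration.

```python
LANES = 4  # hypothetical SIMD width; real widths vary across platforms


def simd_saxpy(a, x, y, lanes=LANES):
    """Compute a * x[i] + y[i] in groups of `lanes` elements.

    Each loop iteration stands in for one vector operation applied
    to `lanes` elements at once. Returns the result and the number
    of modelled vector operations.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    out = []
    vector_ops = 0
    for i in range(0, len(x), lanes):
        xv, yv = x[i:i + lanes], y[i:i + lanes]
        out.extend(a * xi + yi for xi, yi in zip(xv, yv))
        vector_ops += 1
    return out, vector_ops


if __name__ == "__main__":
    result, ops = simd_saxpy(2.0, [1, 2, 3, 4, 5, 6, 7, 8], [1] * 8)
    print(result, ops)
```

Because the lane count is a parameter here, the same source adapts to a different width; with real instruction sets the width is baked into the generated code, which is one face of the portability challenge listed above.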
Summary
• Multi-cores and Heterogeneous accelerators present tremendous research opportunity in
  – Architecture
  – High Performance Computing
  – Programming Languages & Models
  – Compilers
• Proebsting’s Law: Compiler Technology Doubles CPU Power Every 18 YEARS!!
• Time to Rewrite Proebsting’s Law?
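To put the two doubling periods side by side, each can be converted into an equivalent yearly improvement rate. This is a minimal sketch: the 1.5-year and 18-year periods are the ones quoted in these slides, while the function name `annual_growth` is an assumption for illustration.

```python
def annual_growth(doubling_period_years):
    """Yearly growth rate implied by a given doubling period."""
    return 2 ** (1 / doubling_period_years) - 1


if __name__ == "__main__":
    hardware = annual_growth(1.5)  # performance doubling period from the slides
    compiler = annual_growth(18)   # Proebsting's Law doubling period
    print(f"hardware: {hardware:.1%} per year")
    print(f"compiler: {compiler:.1%} per year")
```

Under these assumptions hardware improves by roughly 59% per year while compiler technology contributes roughly 4% per year.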
Thank You !!