University of Notre Dame
Lecture 07 - Multicore Computation
Lecture based on notes from John Mellor-Crummey (Department of Computer Science, Rice University) & Jernej Barbic
This was the thinking in the mid-90s.
Circuit complexity and interconnect delay limit the practicality of support structures for larger issue width
Some important points
• Technology alone is not driving the push to multi-core
– What was state of the art (wider issue, superscalar) provides diminishing performance returns because of program properties
• Still, performance gains are possible with scaling
• If CCs/instruction performance gains are tapped out and scaling performance is inhibited (because of lower Vdd, lower clock rates), where does performance come from?
Some important points
• Performance must come from a combination of parallelism + previously ignored HW optimizations
– E.g. instead of getting 2x from technology, get 10% from A, 5% from B, etc.
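A rough sketch of the arithmetic behind "10% from A, 5% from B": assuming the individual gains are independent and simply multiply (an assumption, not something the slides state), many small gains are needed to match a single 2x from technology. The function names here are illustrative only.

```python
from math import ceil, log, prod


def combined_speedup(gains):
    """Combine fractional gains (0.10 means 10%), assuming they multiply."""
    return prod(1.0 + g for g in gains)


def gains_needed(target, gain):
    """Number of equal, independent gains needed to reach a target speedup."""
    return ceil(log(target) / log(1.0 + gain))


# 10% from A and 5% from B give roughly 1.155x overall, not 1.15x.
print(combined_speedup([0.10, 0.05]))

# How many separate 10% gains are needed to reach 2x under this assumption.
print(gains_needed(2.0, 0.10))
```

Under this assumption, eight separate 10% optimizations are needed to match one doubling.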
The cores fit on a single processor socket (also called CMP - chip multiprocessor)
Back to case study…
(standard benchmarks parallelized for comparison)
(If CPU time is constant, performance comes from parallelism)
Take Aways
Multi-core flavors
• Cores need not be the same
– (If they are, we talk about symmetric core machines)
– (If not, asymmetric)
• Imagine FPGA + GP processor?
Other issues:
(Amdahl’s Law and Parallelization)
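As a reminder of what Amdahl's Law says: if a fraction p of the work can be parallelized over n cores, speedup = 1 / ((1 - p) + p / n). A minimal sketch follows; the function name and the 90% example figure are illustrative assumptions, not taken from the slides.

```python
def amdahl_speedup(parallel_fraction, n_cores):
    """Speedup predicted by Amdahl's Law for a fixed-size workload."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be in [0, 1]")
    if n_cores < 1:
        raise ValueError("n_cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_cores)


# Assumed example: 90% of the program parallelizes perfectly.
for n in (1, 2, 4, 8, 16, 1024):
    print(n, round(amdahl_speedup(0.9, n), 2))
```

With p = 0.9 the speedup can never reach 10x no matter how many cores are added, because the serial 10% always remains.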
Other issues: Core-to-core communication
Must factor in communication costs in processing time too…
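One way to see the effect is a toy cost model, sketched below. It assumes (purely for illustration) that communication time grows linearly with the number of additional cores; real interconnects behave differently, and all names and numbers here are hypothetical.

```python
def parallel_time(t_serial, t_parallel, n_cores, t_comm_per_core):
    """Toy model: serial part + divided parallel part + communication.

    Assumption: each extra core adds a fixed communication cost.
    """
    compute = t_serial + t_parallel / n_cores
    communication = t_comm_per_core * (n_cores - 1)
    return compute + communication


def speedup(t_serial, t_parallel, n_cores, t_comm_per_core):
    """Speedup relative to running everything on one core."""
    one_core = parallel_time(t_serial, t_parallel, 1, t_comm_per_core)
    return one_core / parallel_time(t_serial, t_parallel, n_cores, t_comm_per_core)


# Hypothetical workload: 10 time units serial, 90 parallel, 1 unit of
# communication per extra core.
for n in (1, 2, 4, 9, 16, 32, 90):
    print(n, round(speedup(10, 90, n, 1), 2))
```

In this model the speedup peaks at a modest core count and then falls, since communication eventually costs more than the extra cores save.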
Back to Processor-Memory Wall (still need to feed cores)
(Peter Kogge will discuss on Monday)
(Not only a problem for multi-core)