
Dezső Sima

Date post: 04-Feb-2016
Page 1: Dezső Sima

Dezső Sima

Evolution of Intel’s Basic Microarchitectures - 2

April 2013

Vers. 3.3

Page 2: Dezső Sima

Contents

1. Introduction
2. Core 2
3. Penryn
4. Nehalem
5. Nehalem-EX
6. Westmere
7. Westmere-EX

Page 3: Dezső Sima

Contents

8. Sandy Bridge
9. Sandy Bridge Extreme Edition
10. Ivy Bridge
11. Haswell
12. Overview of the evolution

Page 4: Dezső Sima

8. Sandy Bridge

8.1 Introduction
8.2 Advanced Vector Extension (AVX)
8.3 On-die ring interconnect bus
8.4 On-die integrated graphics unit
8.5 Enhanced turbo boost technology

Page 5: Dezső Sima

8.1 Introduction (1)

• Sandy Bridge is Intel’s new microarchitecture using a 32 nm line width.
• First delivered in 1/2011.

8.1 Introduction

Page 6: Dezső Sima

8.1 Introduction (2)

Figure: Main functional units of Sandy Bridge [143] Part 4

Labels recoverable from the figure:
• Core: 32 KB L1D (3 clk), 256 KB L2 (9 clk), AVX 256-bit, 4 operands, Hyperthreading, AES instructions, VMX Unrestricted, 20 mm² / core
• L3: 8 MB, 256 b/cycle ring architecture (25 clk)
• Connected to L3: @ 1.0 to 1.4 GHz
• PCIe 2.0
• DDR3-1600, 25.6 GB/s
• 32 nm process / ~225 mm² die size / 85 W TDP

Page 7: Dezső Sima

8.1 Introduction (3)

Overview of the Sandy Bridge based processor lines (based on [62] and [63])

Sandy Bridge

Desktops
• Core i3-21xx, 2C, no HT, no vPro, 2/2011
• Core i5-23xx, 4C+G, no HT, no vPro, 1/2011
• Core i5-24xx/25xx, 4C+G, no HT, vPro, 1/2011
• Core i7-26xx, 4C+G, HT, vPro, 1/2011
• Core i7-2700K, 4C+G, HT, no vPro, 10/2011

Mobiles
• Core i3-23xxM, 2C, 2/2011
• Core i5-24xxM/25xxM, 2C, 2/2011
• Core i7-26xxQM/27xxQM/28xxQM, 4C, 1/2011
• Core i7 Extreme-29xxXM, 4C, Q1 2011

Servers
• UP-Servers: E3 12xx, Sandy Bridge-H2, 4C, 3/2011
• DP-Servers: E5 2xxx, Sandy Bridge-EP, up to 8C, Q4/2011
• MP-Servers: E5 4xxx, Sandy Bridge-EX, up to 8C, Q1/2012

Sandy Bridge-E (Section 9)

Desktops
• Core i7-3960X, 6C, HT, vPro??, 11/2011
• Core i7-3930K, 6C, HT, vPro??, 11/2011

Page 8: Dezső Sima

Key features and benefits of the Sandy Bridge line vs the 1. generation Nehalem line [61]

8.1 Introduction (4)

Page 9: Dezső Sima

8.2 Advanced Vector Extension (AVX) (1)

Figure: Evolution of the SIMD processing width [18] (from BMA)

8.2 Advanced Vector Extension (AVX)

Sandy Bridge

Introduction of AVX

Page 10: Dezső Sima

Figure: Intel’s x86 ISA extensions - the SIMD register space (based on [18]) (from BMA)

Northwood (Pentium 4)

8 MM registers (64-bit), aliased on the FP Stack registers

8 XMM registers (128-bit)

16 XMM registers (128-bit)

16 YMM registers (256-bit)

Ivy Bridge

8.2 Advanced Vector Extension (AVX) (2)
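
The register widths listed above determine how many packed elements one SIMD instruction can process. The following is a minimal sketch of that arithmetic only; the names `REGISTER_BITS`, `lanes` and `packed_add` are hypothetical helpers introduced for illustration, not part of any Intel API.

```python
# Register widths in bits, as listed in the SIMD register space figure.
REGISTER_BITS = {"MM": 64, "XMM": 128, "YMM": 256}


def lanes(register_bits: int, element_bits: int) -> int:
    """Number of packed elements of the given width that fit in one register."""
    if register_bits % element_bits != 0:
        raise ValueError("element width must divide the register width")
    return register_bits // element_bits


def packed_add(a, b):
    """Element-wise add of two lane lists, modelling one packed SIMD add."""
    if len(a) != len(b):
        raise ValueError("both operands must have the same number of lanes")
    return [x + y for x, y in zip(a, b)]


if __name__ == "__main__":
    for name, bits in REGISTER_BITS.items():
        print(f"{name}: {lanes(bits, 32)} x 32-bit or {lanes(bits, 64)} x 64-bit")
    # One 256-bit YMM operation covers 8 single-precision lanes at once.
    n = lanes(REGISTER_BITS["YMM"], 32)
    print(packed_add([1.0] * n, [2.0] * n))
```

Under this model a YMM register holds twice as many elements as an XMM register, which is the doubling of SIMD width that AVX introduced with Sandy Bridge.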

Page 11: Dezső Sima

8.3 On-die ring interconnect bus (1)

8.3 The on-die ring interconnect bus of Sandy Bridge [66]

Six bus agents.

The four cores and the L3 slices share interfaces.

Page 12: Dezső Sima

8.4 On-die integrated graphics unit (1)

8.4 Sandy Bridge’s integrated graphics unit [102] Part 4

12 EUs

Page 13: Dezső Sima

Specification data of the HD 2000 and HD 3000 graphics [125] Part 4


8.4 On-die integrated graphics unit (2)

Page 14: Dezső Sima

Figure: Performance comparison: gaming [126] Part 4
Axis: frames per sec
Compared: i5/i7 2xxx/3xxx (Sandy Bridge); i5 6xx (Arrandale); HD 5570 (400 ALUs)

8.4 On-die integrated graphics unit (3)

Page 15: Dezső Sima

8.5 Enhanced turbo boost technology (1)

Cooler

Innovative concept of the 2.0 generation Turbo Boost technology

Thermal capacitance

The concept utilizes the real temperature response of processors to power changes in order to increase the extent of overclocking [64]

8.5 Enhanced turbo boost technology [64]

Page 16: Dezső Sima

Concept: Use the thermal energy budget accumulated during idle periods to push the core beyond the TDP for short periods of time (e.g. for 20 sec).

Multiple algorithms manage current, power and die temperature in parallel. [64]

8.5 Enhanced turbo boost technology (2)
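
The energy-budget concept can be illustrated with a toy model: running below TDP banks the unused headroom, and running above TDP spends it. This is only a sketch under stated assumptions. The function `simulate_turbo`, the 120 W turbo ceiling and the 500 J budget cap are hypothetical values chosen for illustration, and the real mechanism manages current, power and die temperature with several algorithms in parallel.

```python
def simulate_turbo(demand_w, tdp_w, turbo_w, budget_cap_j, dt_s=1.0):
    """Return the power granted in each time step under a simple energy budget.

    demand_w     -- requested power per time step, in watts
    tdp_w        -- sustained power limit (TDP)
    turbo_w      -- hypothetical ceiling for short excursions above TDP
    budget_cap_j -- hypothetical cap on the banked energy, in joules
    """
    budget_j = 0.0
    granted = []
    for want in demand_w:
        if want <= tdp_w:
            power = want
            # Below TDP: bank the unused headroom, up to the cap.
            budget_j = min(budget_cap_j, budget_j + (tdp_w - power) * dt_s)
        else:
            # Above TDP: spend banked energy, limited by the turbo ceiling.
            extra = min(want, turbo_w) - tdp_w
            extra = min(extra, budget_j / dt_s)
            power = tdp_w + extra
            budget_j -= extra * dt_s
        granted.append(power)
    return granted


if __name__ == "__main__":
    # 10 s of light load, then a sustained request above TDP.
    trace = simulate_turbo([20] * 10 + [130] * 30, tdp_w=95, turbo_w=120,
                           budget_cap_j=500)
    print(trace)
```

With these illustrative numbers the banked 500 J is spent at 25 W above TDP, so the burst lasts 20 steps before the model falls back to the TDP.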

Page 17: Dezső Sima

Intelligent power sharing between the cores and the integrated graphics [64]

8.5 Enhanced turbo boost technology (3)

Page 18: Dezső Sima

Figure labels [61]: WSM/M, WSM/D, NHM/M, NHM/D

8.5 Enhanced turbo boost technology (4)

Page 19: Dezső Sima

Remark

8.5 Enhanced turbo boost technology (6)

• Individual cores may run at different frequencies, but all cores share the same power plane.
• Idle cores may be shut down individually by power gates.

Page 20: Dezső Sima

9. The Sandy Bridge-E line

Page 21: Dezső Sima

9. The Sandy Bridge-E line (1)

9. The Sandy Bridge-E line of processors (2. gen. Core i7 processors)

Introduced in 11/2011 as a “precursor” of the upcoming DP/MP server lines.

Key features vs the original Sandy Bridge line (1)

a) 6 cores (with 2 cores disabled from the original design) but no integrated graphics [76].

Page 22: Dezső Sima

9. The Sandy Bridge-E line (2)

Figure: Die comparison, Sandy Bridge (2x) vs Sandy Bridge E [61][76]
• Sandy Bridge E: 32 nm, 435 mm², 2.27 B transistors, 15 MB L3
• Sandy Bridge: 32 nm, 216 mm², 995 M transistors, 8 MB L3

Page 23: Dezső Sima

CPU Specification Comparison

CPU                             Manufacturing Process   Cores   Transistor Count   Die Size
AMD Bulldozer 8C                32 nm                   8       ~2B                315 mm²
AMD Thuban 6C                   45 nm                   6       904M               346 mm²
AMD Deneb 4C                    45 nm                   4       758M               258 mm²
Intel Gulftown 6C               32 nm                   6       1.17B              240 mm²
Intel Sandy Bridge E (6C)       32 nm                   6       2.27B              435 mm²
Intel Nehalem/Bloomfield 4C     45 nm                   4       731M               263 mm²
Intel Sandy Bridge 4C           32 nm                   4       995M               216 mm²
Intel Lynnfield 4C              45 nm                   4       774M               296 mm²
Intel Clarkdale 2C              32 nm                   2       384M               81 mm²
Intel Sandy Bridge 2C (GT1)     32 nm                   2       504M               131 mm²
Intel Sandy Bridge 2C (GT2)     32 nm                   2       624M               149 mm²

Comparison of die parameters of recent DT processors [77]

9. The Sandy Bridge-E line (3)
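
The transistor counts and die sizes in the comparison [77] can be combined into a transistor density per processor. A minimal sketch of that arithmetic follows; the names `DIES`, `density` and `ranked` are hypothetical, and the Bulldozer entry uses the approximate ~2B figure from the table, so its density is correspondingly approximate.

```python
# (name, transistors in millions, die area in mm^2), copied from the table [77].
DIES = [
    ("AMD Bulldozer 8C", 2000, 315),  # table gives ~2B, so this is approximate
    ("AMD Thuban 6C", 904, 346),
    ("AMD Deneb 4C", 758, 258),
    ("Intel Gulftown 6C", 1170, 240),
    ("Intel Sandy Bridge E (6C)", 2270, 435),
    ("Intel Nehalem/Bloomfield 4C", 731, 263),
    ("Intel Sandy Bridge 4C", 995, 216),
    ("Intel Lynnfield 4C", 774, 296),
    ("Intel Clarkdale 2C", 384, 81),
    ("Intel Sandy Bridge 2C (GT1)", 504, 131),
    ("Intel Sandy Bridge 2C (GT2)", 624, 149),
]


def density(transistors_m: float, area_mm2: float) -> float:
    """Transistor density in millions of transistors per mm^2."""
    return transistors_m / area_mm2


# Sort from densest to least dense die.
ranked = sorted(DIES, key=lambda d: density(d[1], d[2]), reverse=True)

if __name__ == "__main__":
    for name, transistors_m, area_mm2 in ranked:
        print(f"{name:30s} {density(transistors_m, area_mm2):5.2f} M/mm^2")
```

The 32 nm parts cluster well above the 45 nm parts, as expected from the smaller feature size.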

Page 24: Dezső Sima

Processor                            L1   L2   L3   Main Memory
AMD FX-8150 (3.6GHz)                 4    21   65   195
AMD Phenom II X4 975 BE (3.6GHz)     3    15   59   182
AMD Phenom II X6 1100T (3.3GHz)      3    14   55   157
Intel Core i5 2500K (3.3GHz)         4    11   25   148
Intel Core i7 3960X (3.3GHz)         4    11   30   167

Row labels in the figure: FX-8150 is Bulldozer, Core i5 2500K is Sandy Bridge, Core i7 3960X is Sandy Bridge-E.

Cache/memory latencies of recent DT processors [77]

9. The Sandy Bridge-E line (4)
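
The latency table does not state its units. Assuming the figures are clock cycles at the listed core clock (an assumption, not confirmed by the slide), they can be converted to nanoseconds as sketched below. The helper `cycles_to_ns` is hypothetical, and turbo frequencies are ignored.

```python
def cycles_to_ns(cycles: float, clock_ghz: float) -> float:
    """Convert a latency in clock cycles to nanoseconds at the given clock."""
    return cycles / clock_ghz


# (processor, clock in GHz, main memory latency), taken from the table [77].
# ASSUMPTION: the latency values are clock cycles.
MAIN_MEMORY = [
    ("Intel Core i5 2500K", 3.3, 148),
    ("Intel Core i7 3960X", 3.3, 167),
    ("AMD FX-8150", 3.6, 195),
]

if __name__ == "__main__":
    for name, ghz, cycles in MAIN_MEMORY:
        print(f"{name:22s} {cycles_to_ns(cycles, ghz):6.1f} ns")
```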

Page 25: Dezső Sima

b) 4 parallel memory channels (inherited from the server side) instead of the 2 channels of the previous lines. Supports DDR3 at up to 1600 MT/s, with either a single DDR3-1600 DIMM per channel or 2 DDR3-1333 DIMMs per channel [78].

9. The Sandy Bridge-E line (5)
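
The effect of doubling the memory channels can be checked with a short peak-bandwidth calculation. The sketch assumes the standard 64-bit (8-byte) DDR3 channel width, which the slides do not state; `peak_bandwidth_gb_s` is a hypothetical helper name.

```python
def peak_bandwidth_gb_s(channels: int, transfers_mt_s: int, bus_bits: int = 64) -> float:
    """Theoretical peak memory bandwidth in GB/s.

    channels       -- number of parallel memory channels
    transfers_mt_s -- transfer rate per channel in MT/s (1600 for DDR3-1600)
    bus_bits       -- data width per channel (assumed 64 bits for DDR3)
    """
    bytes_per_transfer = bus_bits // 8
    return channels * transfers_mt_s * bytes_per_transfer / 1000


if __name__ == "__main__":
    # 2 channels of DDR3-1600: matches the 25.6 GB/s given for Sandy Bridge.
    print(peak_bandwidth_gb_s(2, 1600))
    # 4 channels of DDR3-1600, as on Sandy Bridge-E.
    print(peak_bandwidth_gb_s(4, 1600))
```

Two channels reproduce the 25.6 GB/s quoted for Sandy Bridge, and four channels double the theoretical peak to 51.2 GB/s.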

Page 26: Dezső Sima

c) 40 PCIe 2nd-generation lanes to connect graphics cards directly to the processor, instead of the 16 to 32 lanes of the previous-generation Sandy Bridge [78].

9. The Sandy Bridge-E line (6)
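
A lane budget of 40 can be split into several slot configurations, and the usable bandwidth follows from the per-lane signalling rate. The sketch below uses the standard PCIe rates and line encodings (2.5, 5 and 8 GT/s with 8b/10b, 8b/10b and 128b/130b), which are not given in the slides; the names `PCIE`, `lane_gb_s` and `fits` are hypothetical.

```python
# Per generation: (transfer rate in GT/s, payload bits, encoded line bits).
PCIE = {
    1: (2.5, 8, 10),     # 8b/10b encoding
    2: (5.0, 8, 10),     # 8b/10b encoding
    3: (8.0, 128, 130),  # 128b/130b encoding
}


def lane_gb_s(gen: int) -> float:
    """Usable bandwidth of one lane, per direction, in GB/s."""
    rate_gt_s, payload_bits, line_bits = PCIE[gen]
    return rate_gt_s * payload_bits / line_bits / 8


def fits(config, total_lanes: int = 40) -> bool:
    """True if the slot widths in config fit within the processor's lane budget."""
    return sum(config) <= total_lanes


if __name__ == "__main__":
    for config in ([16, 16, 8], [8, 8, 8, 8], [16, 16, 16]):
        print(config, "fits" if fits(config) else "exceeds 40 lanes")
    print(f"40 lanes, PCIe 2.0: {40 * lane_gb_s(2):.1f} GB/s per direction")
```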

Page 27: Dezső Sima

Figure: Main options of providing PCIe lanes on the processor for graphics cards in DT systems

Axis: type of available PCIe lanes (PCIe 1.0, PCIe 2.0, PCIe 3.0)

PCIe lanes provided on the processor:
• Intel 2. gen. Nehalem (Lynnfield) (4C), 2 MCh with P55 (2009) and Intel Sandy Bridge (4C), 2 MCh with P67 (2011): PCIe 2.0, 1x x16 or 2x x8 lanes; peripheral controller P55/P67
• Intel Ivy Bridge (4C), 2 MCh with Z77 PCH (2012): PCIe 3.0, 1x x16 or 2x x8 lanes; peripheral controller Z77
• Intel Sandy Bridge EE (6C), 4 MCh with X79 (2011): PCIe 3.0, 40 configurable lanes (e.g. 2x x16 + 1x x8 or 4x x8); peripheral controller X79

Page 28: Dezső Sima

Lane configuration options - Sandy Bridge Extreme Edition []

Intel Sandy Bridge EE (6C), 4 MCh with X79 (2011)

Figure labels: PCIe 3.0, 40 configurable lanes, shown as x16 + x16; peripheral controller X79

Page 29: Dezső Sima

Figure: Evolution of the topology and type of available PCIe lanes for graphics cards

Axis: type of available PCIe lanes (PCIe 1.0, PCIe 2.0, PCIe 3.0)
Axis: topology of PCIe lanes provided for graphics cards (PCIe lanes on both the NB and the SB; PCIe lanes on the NB; PCIe lanes on the processor; PCIe lanes on the PCH)
Data points: 2. gen. Nehalem (Lynnfield) (2009); Sandy Bridge (2011); Sandy Bridge EE (2011); Ivy Bridge (2012)
Further labels: Trend; Intel Sandy Bridge EE (6C), 4 MCh with X79 (2011)

4.1 Introduction (6)/4

Page 30: Dezső Sima

d) LGA-2011 socket instead of the LGA-1155 used in the previous-generation Sandy Bridge, due to the increased number of memory channels connected to the processor.

9. The Sandy Bridge-E line (7)

Intel’s LGA sockets (Land Grid Array)
• LGA 2011: Sandy Bridge EE
• LGA 1366: 1. gen. Nehalem (Bloomfield)
• LGA 1155: Sandy Bridge/Ivy Bridge
• LGA 1156: 2. gen. Nehalem (Lynnfield)
• LGA 775: Pentium 4 Prescott until Nehalem

Figure labels: LGA 775; LGA 2011 [87]

Page 31: Dezső Sima

Processor             Core Clock   Cores / Threads   L3 Cache   Max Turbo   Max Overclock Multiplier   TDP     Price
Intel Core i7 3960X   3.3GHz       6 / 12            15MB       3.9GHz      57x                        130W    $990
Intel Core i7 3930K   3.2GHz       6 / 12            12MB       3.8GHz      57x                        130W    $555
Intel Core i7 3820    3.6GHz       4 / 8             10MB       3.9GHz      43x                        130W    TBD
Intel Core i7 2700K   3.5GHz       4 / 8             8MB        3.9GHz      57x                        95W     $332
Intel Core i7 2600K   3.4GHz       4 / 8             8MB        3.8GHz      57x                        95W     $317
Intel Core i7 2600    3.4GHz       4 / 8             8MB        3.8GHz      42x                        95W     $294
Intel Core i5 2500K   3.3GHz       4 / 4             6MB        3.7GHz      57x                        95W     $216
Intel Core i5 2500    3.3GHz       4 / 4             6MB        3.7GHz      41x                        95W     $205

Main features of the Sandy Bridge-E line vs the Sandy Bridge line [77]

9. The Sandy Bridge-E line (8)

Page 32: Dezső Sima

10. The Ivy Bridge line

Page 33: Dezső Sima

10. The Ivy Bridge line – 10.1 Introduction (1)

Introduced: 4/2012

Figure 10.1: Intel’s Tick-Tock development model [Based on 1]

Tick-Tock Development Model
• TOCK: Merom, new microarchitecture, 65 nm
• TICK: Penryn, new process, 45 nm
• TOCK: Nehalem, new microarchitecture, 45 nm
• TICK: Westmere, new process, 32 nm
• TOCK: Sandy Bridge, new microarchitecture, 32 nm
• TICK: Ivy Bridge, new process, 22 nm
• TOCK: Haswell, new microarchitecture, 22 nm

10. The Ivy Bridge line

10.1 Introduction

Ivy Bridge is also termed the 3rd generation of Intel Core processors.

Page 34: Dezső Sima

10.1 Introduction (2)

Figure 10.2: Contrasting the Sandy Bridge and Ivy Bridge dies [81]
• Sandy Bridge: 32 nm, 216 mm², 995 M transistors, 8 MB
• Ivy Bridge: 22 nm, 160 mm², 1480 M transistors, 8 MB (resized to 32 nm feature size)
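
The die figures above allow a rough check of how much density was gained by the move from 32 nm to 22 nm. The sketch compares the measured density ratio with an idealised scaling factor of (32/22)². The ideal-scaling formula is a textbook simplification rather than something stated in the slides, the two dies are different designs, and the variable names are hypothetical.

```python
def density(transistors_m: float, area_mm2: float) -> float:
    """Transistor density in millions of transistors per mm^2."""
    return transistors_m / area_mm2


# Figures from the Sandy Bridge / Ivy Bridge die comparison [81].
sandy_bridge = density(995, 216)
ivy_bridge = density(1480, 160)

# Measured density gain versus ideal area scaling with the feature size.
measured_gain = ivy_bridge / sandy_bridge
ideal_gain = (32 / 22) ** 2

if __name__ == "__main__":
    print(f"Sandy Bridge: {sandy_bridge:.2f} M/mm^2")
    print(f"Ivy Bridge:   {ivy_bridge:.2f} M/mm^2")
    print(f"measured gain {measured_gain:.2f}x, ideal gain {ideal_gain:.2f}x")
```

On these numbers the measured gain is close to, but slightly below, the ideal factor.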

Page 35: Dezső Sima

10.1 Introduction (3)

[84]

Page 36: Dezső Sima

10.1 Introduction (4)

Major innovations of Ivy Bridge [80]

Page 37: Dezső Sima

10.2 The new 22 nm tri-gate process technology (1)

10.2 The new 22 nm tri-gate process technology [82]

Page 38: Dezső Sima

10.2 The new 22 nm tri-gate process technology (2)

[82]

Page 39: Dezső Sima

10.2 The new 22 nm tri-gate process technology (3)

[82]

Page 40: Dezső Sima

10.2 The new 22 nm tri-gate process technology (4)

[82]

Page 41: Dezső Sima

10.2 The new 22 nm tri-gate process technology (5)

[82]

Page 42: Dezső Sima

10.2 The new 22 nm tri-gate process technology (6)

[82]

Page 43: Dezső Sima

10.2 The new 22 nm tri-gate process technology (7)

[82]

Page 44: Dezső Sima

10.2 The new 22 nm tri-gate process technology (8)

[82]

Page 45: Dezső Sima

10.2 The new 22 nm tri-gate process technology (9)

Figure: Ivy Bridge chips on a 300 mm wafer

Page 46: Dezső Sima

10.2 The new 22 nm tri-gate process technology (10)

Processor            Feature size       No. of cores     L2 + L3 size   No. of transistors   Die size
Ivy Bridge           22 nm Tri-Gate     4 (+ IGP)        9 MB           1.48 billion         160 mm²
Sandy Bridge         32 nm HKMG         4 (+ IGP)        9 MB           995 million          216 mm²
Sandy Bridge-E       32 nm HKMG         6                16.5 MB        2.27 billion         435 mm²
Gulftown             32 nm HKMG         6                13.5 MB        1.17 billion         240 mm²
Lynnfield            45 nm HKMG         4                9 MB           774 million          296 mm²
Bloomfield           45 nm HKMG         4                9 MB           731 million          263 mm²
Orochi (Bulldozer)   32 nm HKMG SOI     8 (4 modules)    16 MB          ~1.2 billion         315 mm²
Llano                32 nm HKMG SOI     4 (+ IGP)        4 MB           1.45 billion         228 mm²
Thuban               45 nm SOI          6                9 MB           904 million          346 mm²
Deneb                45 nm SOI          4                8 MB           758 million          258 mm²

Table: Main implementation parameters of recent processors [81]

Page 47: Dezső Sima

10.3 Supervisor Mode Execution Protection (SMEP)

[83]

Page 49: Dezső Sima

10.4 System architecture (2)/1

[81]

Page 50: Dezső Sima

Figure: Overview of video interfaces of computing devices to external displays

Categories: analog video interfaces to external displays; digital video interfaces to external displays
Interfaces named: MDA, CGA, EGA, VGA, DVI, HDMI, DP
Further labels: earliest video interfaces; legacy video interfaces; recently preferred video interfaces; no audio transmission; audio/video transmission; analog audio/digital video i.f.; dig. audio/dig. video i.f.s; to TVs; to displays

10.4 System architecture (2)/2

Page 51: Dezső Sima

10.5 Performance (1)

[81]

Figure labels: Sandy Bridge, Bulldozer, Ivy Bridge, Sandy Bridge EE

Page 52: Dezső Sima

10.5 Performance (2)

[81]

Page 53: Dezső Sima

11. The Haswell line

Page 54: Dezső Sima

11. The Haswell line of processors (1)

Expected date of introduction: 4/2013

Figure 1.1: Intel’s Tick-Tock development model [Based on 1]

Tick-Tock Development Model
• TOCK: Merom, new microarchitecture, 65 nm
• TICK: Penryn, new process, 45 nm
• TOCK: Nehalem, new microarchitecture, 45 nm
• TICK: Westmere, new process, 32 nm
• TOCK: Sandy Bridge, new microarchitecture, 32 nm
• TICK: Ivy Bridge, new process, 22 nm
• TOCK: Haswell, new microarchitecture, 22 nm

11. The Haswell line of processors

Page 55: Dezső Sima

11. The Haswell line of processors (2)

The Haswell die [85]

Page 56: Dezső Sima

11. The Haswell line of processors (3)

Haswell’s system architecture [86]

Page 57: Dezső Sima

11. The Haswell line of processors (4)

[80]

Page 58: Dezső Sima

11. The Haswell line of processors (5)

[80][80]

Page 59: Dezső Sima

11. The Haswell line of processors (6)/1

[80]

FMA: Fused Multiply-Add (a × b + c)
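
A fused multiply-add computes a × b + c with a single rounding at the end, whereas a separate multiply and add round twice. The sketch below emulates the fused result in software with exact rational arithmetic. It illustrates the numerical difference only and does not use the hardware instruction; the function names `fma` and `mul_then_add` are hypothetical.

```python
from fractions import Fraction


def fma(a: float, b: float, c: float) -> float:
    """Emulated fused multiply-add: exact a*b + c, rounded once to a double."""
    return float(Fraction(a) * Fraction(b) + Fraction(c))


def mul_then_add(a: float, b: float, c: float) -> float:
    """Unfused version: the product is rounded before the addition."""
    return a * b + c


if __name__ == "__main__":
    a = 1.0 + 2.0 ** -30
    b = 1.0 - 2.0 ** -30
    c = -1.0
    # The exact product is 1 - 2**-60, which rounds to 1.0 as a double,
    # so the unfused version loses the small term entirely.
    print(mul_then_add(a, b, c))
    print(fma(a, b, c))
```

Besides executing the two operations as one instruction, the single rounding can therefore preserve low-order bits that the two-step sequence discards.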

Page 60: Dezső Sima

Figure: Evolution of the SIMD processing width [18] (from BMA)

8.2 Advanced Vector Extension (AVX)

Sandy Bridge

Introduction of AVX

11. The Haswell line of processors (6)/2

Haswell

Page 61: Dezső Sima

11. The Haswell line of processors (7)

[80]

Page 62: Dezső Sima

To 12 – Additional references

Page 63: Dezső Sima

[80]: Chappell R., Toll B., Singhal R.: Intel Next Generation Microarchitecture Codename Haswell: New Processor Innovations, IDF 2012

[81]: Olivera: A régóta várt Intel Ivy Bridge tesztje (Test of the long-awaited Intel Ivy Bridge), Prohardware, 2012-04-13, http://prohardver.hu/teszt/intel_ivy_bridge_teszt/az_ivy_bridge.html

[82]: Bohr M., Mistry K.: Intel’s Revolutionary 22 nm transistor technology, May 2011, http://download.intel.com/newsroom/kits/22nm/pdfs/22nm-Details_Presentation.pdf

[83]: George V., Piazza T., Jiang H.: Technology Insight: Intel Next Generation Microarchitecture Codename Ivy Bridge, IDF 2011

[84]: 3rd Generation Intel Core Processor Family Quad Core Launch Product Information, April 23, 2012, http://download.intel.com/newsroom/kits/core/3rdgen/pdfs/3rd_Generation_Intel_Core_Product_Information.pdf

[85]: Ivy Bridge and Haswell die configurations (estimates included), Anandtech, 03-21-2012, http://forums.anandtech.com/showthread.php?t=2234017

[86]: Piazza T., Jiang H., Hammerlund P., Singhal R.: Technology Insight: Intel Next Generation Microarchitecture Codename Haswell, IDF 2012 SPCS001

[87]: Haynes D.: 2012 Socket Guide, Aug. 4 2012, http://www.ocmodshop.com/cpu-socket-guide-2012/lga2011/

