
1. 2 Components of an IA processor Upon completion of this module, you will be able to describe:...

Date post: 17-Jan-2016
Upload: jennifer-cobb
Transcript
Page 1: 1. 2 Components of an IA processor Upon completion of this module, you will be able to describe: Working flow of the instruction pipeline Notable features.


Page 2

Upon completion of this module, you will be able to describe:

- Working flow of the instruction pipeline
- Notable features

Page 4

Advanced Digital Media Boost: Single Cycle SIMD Operation

- 8 single-precision flops/cycle
- 4 double-precision flops/cycle
- Wide operations
  - 128-bit packed add
  - 128-bit packed multiply
  - 128-bit packed load
  - 128-bit packed store

[Figure: a 128-bit SIMD operation (SSE/SSE2/SSE3/SSSE3) on sources X4..X1 and Y4..Y1 (bits 127 down to 0), producing X4opY4, X3opY3, X2opY2, X1opY1. In previous implementations the results are spread over clock cycle 1 and clock cycle 2; in the Core™ architecture all four results are produced in clock cycle 1.]
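The lane-wise "X op Y" picture can be written out as a scalar model in C. This is only an illustrative sketch: the `vec128` type and the `packed_add` / `packed_mul` names are hypothetical, and real code would use compiler intrinsics or auto-vectorization instead of a loop. One packed add plus one packed multiply per cycle, each on four single-precision lanes, would account for the 8 flops/cycle figure; that reading is an inference from the port layout described later in the deck.

```c
#define LANES 4  /* four 32-bit floats fill one 128-bit register */

/* Hypothetical stand-in for a 128-bit packed single-precision register. */
typedef struct {
    float lane[LANES];
} vec128;

/* Lane-wise add: result lane i is x.lane[i] + y.lane[i]. */
vec128 packed_add(vec128 x, vec128 y) {
    vec128 r;
    for (int i = 0; i < LANES; i++) {
        r.lane[i] = x.lane[i] + y.lane[i];
    }
    return r;
}

/* Lane-wise multiply: result lane i is x.lane[i] * y.lane[i]. */
vec128 packed_mul(vec128 x, vec128 y) {
    vec128 r;
    for (int i = 0; i < LANES; i++) {
        r.lane[i] = x.lane[i] * y.lane[i];
    }
    return r;
}
```

Each function corresponds to one packed instruction in the figure: all four lanes are independent, which is what lets the hardware compute them in the same cycle.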

Page 5

- CISC vs. RISC
- Super-scalar
- Out of Order vs. In Order
- Architecture vs. Micro-Architecture
- Intel Architectures
  - IA32/x86
  - Intel64
- Historical Micro-Architectures
  - P6 (Pentium Pro, Pentium II, Pentium III)
  - NetBurst (Pentium 4)
  - Mobile (Centrino platforms)

Page 6

[Block diagram. Components: System Bus, Bus Unit, 2nd Level Cache, 1st Level Cache (Data), Instruction Fetch Unit, Decode/IQ, Branch Prediction Unit, Renamer/Allocator, Scheduler, Execution Unit, Buffers (Retirement). Regions: Front End, Execution Core.]

Page 8

Instruction preparation before execution:

- Instruction Fetch Unit
- Instruction Queue
- Instruction Decode Unit
- Branch Prediction Unit

Page 9

- Prefetches instructions that are likely to be executed
- Caches frequently-used instructions
- Predecodes and buffers instructions

Page 10

- I-Cache (Instruction Cache)
  - 32 Kbytes / 8-way / 64-byte line
  - 16 aligned bytes fetched per cycle
- ITLB (Instruction Translation Lookaside Buffer)
  - 128 4K pages, 8 2M pages
- Instruction Prefetcher
  - 16-byte aligned lookup through the ITLB into the instruction cache and instruction prefetch buffers
- Instruction Pre-decoder
  - Instruction length decode (predecode)
  - Avoid Length Changing Prefixes (LCPs); for example, avoid in a loop: MOV dx, 1234h
  - The REX (EM64T) prefix (4xH) is not an LCP

Page 11

- Buffer between the instruction pre-decode unit and the decoder
  - Up to 6 predecoded instructions written per cycle
  - 18 instructions contained in the IQ
  - Up to 5 instructions read from the IQ
- Potential loop cache
  - Loop Stream Detector (LSD) support
  - Re-use of decoded instructions
  - Potential power saving

Page 12

- Decodes the instructions into micro-ops
- Ready for execution in the OOO core

Page 13

- Instructions are converted to micro-ops (uops)
  - 1-uop instructions include load+op, stores, indirect jump, RET…
- 4 decoders: 1 "large" and 3 "small"
  - All decoders handle "simple" 1-uop instructions
  - The one large decoder handles instructions of up to 4 uops
- All decoders work in parallel
  - Four (+) instructions per cycle
- The micro-sequencer takes over for long flows (the large decoder handles instructions containing 2 to 4 uops; the microcode ROM handles more complex ones)

Page 14

- These instructions took more than one fetch since they total 22 bytes
- The IQ buffers them together
- All instructions are decodable by all decoders
- CMP and the adjacent JCC are "fused" into a single uop
- Up to 5 instructions decoded per cycle
- Resulting uops:
  - cmpjne EAX, [mem], label
  - sta_std [EAX+240], xmm0
  - mulps xmm0, xmm0, xmm0
  - load_add xmm0, xmm0, [EAX+16]

Page 15

- Roughly 15% of all instructions are conditional branches.
- Macro-fusion merges two instructions into a single micro-op, as if the two instructions were a single long instruction.
- Enhanced Arithmetic Logic Unit (ALU) for macro-fusion. Each macro-fused instruction executes with a single dispatch.
- Not supported in EM64T long mode

[Diagram labels: fused uop "cmpjae eax, [mem], label"; Scheduler; Execution; Branch Eval; flags and target to Write back.]

Page 16

- Read four instructions from the Instruction Queue.
- Each instruction gets decoded into separate uops.
- Enabling example: for (i=0; i<100000; i++) {}

Page 17

- Read five instructions from the Instruction Queue.
- Send the fusable pair to a single decoder.
- A single uop represents two instructions.
- Enabling example: for (unsigned int i=0; i<100000; i++) {}
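The two enabling examples differ only in the signedness of the loop counter. A plausible reading, stated here as an assumption since the slides do not spell it out, is that an unsigned counter leads the compiler to emit a compare followed by an unsigned-condition jump (such as JB or JAE), which this micro-architecture can macro-fuse, while a signed counter typically yields a signed-condition jump (such as JL or JGE), which it does not fuse. The sketch below uses hypothetical function names; both loops compute the same result, and whether fusion happens depends on the instructions the compiler actually generates.

```c
#include <stddef.h>

/* Signed loop counter: the loop test is usually compiled to CMP plus a
   signed-condition jump, matching the first (non-fused) example. */
long sum_signed_counter(const int *a, int n) {
    long s = 0;
    for (int i = 0; i < n; i++) {
        s += a[i];
    }
    return s;
}

/* Unsigned loop counter: the loop test is usually compiled to CMP plus an
   unsigned-condition jump, matching the second (fusable) example. */
long sum_unsigned_counter(const int *a, unsigned int n) {
    long s = 0;
    for (unsigned int i = 0; i < n; i++) {
        s += a[i];
    }
    return s;
}
```

Inspecting the generated assembly (for example with the compiler's assembly output option) is the only way to confirm which jump condition a given build uses.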

Page 18

- Frequent pairs of micro-operations derived from the same macro-instruction can be fused into a single micro-operation.
- Micro-op fusion effectively widens the pipeline.

Page 19

U-ops of a store, "mov [mem1], edx":

- Without micro-op fusion: sta mem1 (store address) and std edx, [mem1] (store data)
- With micro-op fusion: st edx, [mem1]

Page 20

- ESP is calculated by dedicated logic
- No explicit micro-ops update ESP
- Micro-op saving
- Power saving

Page 21

- Allows executing instructions long before the branch outcome is decided.
- Superset of Prescott/Pentium-M features
- One taken branch every other clock
- Branch predictions for 32 bytes at a time, twice the width of the fetch engine

Page 22

- 16-entry Return Stack Buffer (RSB)
- Front-end queuing of BPU lookups
- Types of predictions
  - Direct calls and jumps
  - Indirect calls and jumps
  - Conditional calls and jumps

Page 23

- Intel® Pentium® 4 Processor branch prediction PLUS the following two improvements
  - Indirect Branch Predictor
  - Loop Detector
- Branch mispredictions reduced by >20%

Page 24

- Accepts decoded uops, assigns resources, executes and retires uops
  - Renamer
  - Reservation Station (RS)
  - Issue ports
  - Execution Unit

Page 26

- 4 uops renamed / retired per clock
  - One taken branch, any number of untaken branches
  - One fxch per cycle
- Uops written to the RS and ROB
  - Decoded uops are renamed and allocated resources by the RAT, then sent to ROB read and the RS
  - Registers not "in flight" are read from the ROB during RS write

Page 27

- 6 dispatch ports from the RS
  - 3 execution ports (shared for integer / fp / simd)
  - Load
  - Store (address)
- 128-bit implementation
  - Port 0 has packed multiply (4 cycles SP, 5 cycles DP, pipelined)
  - Port 1 has packed add (3 cycles, all precisions)
- FP data has one additional cycle of bypass latency
  - Do not mix SSE FP and SSE integer ops on the same register

Page 28

- Each uop only takes a single RS entry
- Load + add dispatches twice (load, then add)
- Sta + std dispatches twice
  - Sta (address) can fire as early as possible
  - Std must wait for mulps to write back
- Cmpjne dispatches only once (functionality is truly fused)
  - No dependency, can fire as early as it wants

Page 30

- ReOrder Buffer (ROB)
  - Holds micro-ops in various stages of completion
  - Buffers completed micro-ops
  - Updates the architectural state in order
  - Manages ordering of exceptions

Page 31

- 32K D-cache (8-way, 64-byte line size)
- Loads and services
  - One 128-bit load and one 128-bit store per cycle to different memory locations
- Out-of-order memory operations
- Data prefetching
- Memory disambiguation
- Store forwarding
- Shared cache

Page 32

- 3-clock latency and 1-clock throughput for L1D; 14 and 2 for L2
- Miss latencies
  - L1 miss that hits L2: ~10 cycles
  - L2 miss, access to memory: ~300 cycles (server/FBD)
  - L2 miss, access to memory: ~165 cycles (desktop/DDR2)
  - C-step Broadwater is reported to have ~50 ns latency
- Cache bandwidth
  - Bandwidth to cache: ~8.5 bytes/cycle
- Memory bandwidth
  - Desktop: ~6 GB/sec/socket (Linux)
  - Server: ~3.5 GB/sec/socket
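To see how these latencies combine, a rough average load latency can be computed from hit rates. The sketch below is an assumption-laden estimate, not a measurement: it treats 3, 14, and 165 clocks as the total latency of a load served by L1D, L2, and desktop memory respectively, the hit rates are hypothetical inputs, and the function name is illustrative. It also ignores overlap between outstanding misses, which out-of-order execution and prefetching provide in practice.

```c
/* Latencies in core clocks, taken from the figures quoted above. */
#define L1D_LATENCY 3.0
#define L2_LATENCY 14.0
#define MEM_LATENCY_DESKTOP 165.0

/* Expected latency of one load, given the fraction of loads that hit L1D
   and the fraction of L1D misses that hit L2 (both in the range 0..1). */
double average_load_latency(double l1_hit_rate, double l2_hit_rate) {
    double beyond_l1 = l2_hit_rate * L2_LATENCY
                     + (1.0 - l2_hit_rate) * MEM_LATENCY_DESKTOP;
    return l1_hit_rate * L1D_LATENCY + (1.0 - l1_hit_rate) * beyond_l1;
}
```

Because the memory term is more than ten times the L2 term, even a small fraction of L2 misses dominates the average, which is the motivation for the prefetching techniques on the following pages.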

Page 33

Speculates the next needed data and loads it into cache by hardware and/or software.

[Diagram, parking analogy: Door (L1 Cache), Valet Parking Area (L2 Cache), Main Parking Lot (External Memory).]

Page 34

- L1D cache prefetching
  - Data Cache Unit (DCU) Prefetcher
    - Known as the streaming prefetcher
    - Recognizes ascending access patterns in recently loaded data
    - Prefetches the next line into the processor's cache
  - Instruction Based Stride Prefetcher
    - Prefetches based upon a load having a regular stride
    - Can prefetch forward or backward 2 Kbytes (1/2 the default page size)
- L2 cache prefetching: Data Prefetch Logic (DPL)
  - Prefetches data to the 2nd level cache before the DCU requests the data
  - Maintains 2 tables for tracking loads
    - Upstream: 16 entries
    - Downstream: 4 entries
  - Every load is either found in the DPL or generates a new entry
  - Upon recognition of the 2nd load of a "stream", the DPL will prefetch the next load
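The idea of prefetching on a regular stride can be sketched in software. The model below is a simplified illustration, not the hardware algorithm: it tracks a single load stream, proposes a prefetch address once the same non-zero stride has been seen twice in a row, and limits the stride to the 2 Kbyte reach mentioned above. The `stride_tracker` type and `stride_observe` function are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_STRIDE 2048  /* 2 Kbyte forward/backward reach from the slide */

/* State for one tracked load; zero-initialize before first use. */
typedef struct {
    uint64_t last_addr;
    int64_t  last_stride;
    bool     has_last;
} stride_tracker;

/* Record one load address. Returns true and sets *prefetch_addr when the
   current stride repeats the previous one and is within MAX_STRIDE. */
bool stride_observe(stride_tracker *t, uint64_t addr, uint64_t *prefetch_addr) {
    bool predict = false;
    if (t->has_last) {
        int64_t stride = (int64_t)(addr - t->last_addr);
        if (stride != 0 && stride == t->last_stride &&
            stride <= MAX_STRIDE && stride >= -MAX_STRIDE) {
            /* Works for negative strides via unsigned wrap-around. */
            *prefetch_addr = addr + (uint64_t)stride;
            predict = true;
        }
        t->last_stride = stride;
    }
    t->last_addr = addr;
    t->has_last = true;
    return predict;
}
```

Feeding the tracker addresses 1000, 1064, 1128 yields a prediction of 1192 on the third call; a descending sequence works the same way with a negative stride.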

Page 35

- Memory disambiguation predictor
  - Loads that are predicted NOT to forward from a preceding store are allowed to schedule as early as possible
  - Increases the performance of OOO memory pipelines
- Disambiguated loads are checked at retirement
  - Extension to the existing coherency mechanism
  - Invisible to software and system

Page 36

Load4 must WAIT until previous stores complete

Page 37

Loads can decouple from stores

Load4 can get its data WITHOUT waiting for stores

Page 38

If a load follows a store and reloads the data that the store writes to memory, the micro-architecture can forward the data directly from the store to the load.

Page 40

- Note that unaligned store forwarding does not occur when the load crosses a cache line boundary.
- Note: Unaligned 128-bit stores are issued as two 64-bit stores. This provides two alignments for store forwarding.

[Diagram legend: "ld 8" cases are marked either "Store forwarded to load" or "No forwarding". ‡: No forwarding if the load crosses a cache line boundary.]
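The cache-line condition can be checked with simple address arithmetic. The sketch below assumes the 64-byte line size quoted earlier for the D-cache. The `may_forward` helper additionally assumes that the load must lie entirely within the bytes written by the store; that containment rule is a simplification introduced here for illustration, and the full set of alignment cases from the slide's diagram is not reproduced. Both function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define CACHE_LINE_BYTES 64u  /* D-cache line size from the earlier slide */

/* True when the access [addr, addr + size) touches two cache lines. */
bool crosses_cache_line(uint64_t addr, uint32_t size) {
    if (size == 0) {
        return false;
    }
    return (addr / CACHE_LINE_BYTES) != ((addr + size - 1) / CACHE_LINE_BYTES);
}

/* Simplified forwarding check (assumption: the load must be fully
   contained in the store and must not cross a cache line boundary). */
bool may_forward(uint64_t st_addr, uint32_t st_size,
                 uint64_t ld_addr, uint32_t ld_size) {
    bool contained = ld_addr >= st_addr &&
                     ld_addr + ld_size <= st_addr + st_size;
    return contained && !crosses_cache_line(ld_addr, ld_size);
}
```

For example, an 8-byte load at address 60 spans bytes 60 to 67 and therefore crosses the boundary at 64, so the model reports no forwarding even if a preceding 16-byte store at address 56 covers those bytes.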

Page 41

[Diagram labels: CPU1, CPU2, Memory, Cache Line, Front Side Bus (FSB). Caption: shipping an L2 cache line costs roughly half an access to memory.]

Page 42

L2 is shared: no need to ship the cache line

[Diagram labels: CPU1, CPU2, Memory, Cache Line, Front Side Bus (FSB).]

Page 43

Load and store access order:

1. L1 cache of the immediate core
2. L1 cache of the other core
3. L2 cache
4. Memory

[Diagram labels: Core1, Core2, 2 MB L2 Cache, Bus.]

Page 44

- Shared second-level (L2) 2 MB 8-way or 4 MB 16-way instruction and data cache
- Cache-to-cache transfer
  - Improves producer/consumer style MP
- Wider interface to L2
  - Reduced interference
  - Processor line fill is 2 cycles
- Higher bandwidth from the L2 cache to the core
  - ~14 clock latency and 2 clock throughput

Page 45

- Avoid "Length Changing Prefixes" (LCPs)
  - Affects instructions with immediate data or offset
    - Operand Size Override (66H)
    - Address Size Override (67H) [obsolete]
  - LCPs change the length decoding algorithm, increasing the processing time from one cycle to six cycles (or eleven cycles when the instruction spans a 16-byte boundary)
- The REX (EM64T) prefix (4xH) is not an LCP
  - The REX prefix does lengthen the instruction by one byte, so use of the first eight general registers in EM64T is preferred
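The penalty figures above can be turned into a small cost model. This is a sketch under the slide's numbers only (1, 6, or 11 cycles); the function name and parameters are hypothetical, and it assumes the instruction length is at least one byte.

```c
#include <stdbool.h>
#include <stdint.h>

#define FETCH_BLOCK_BYTES 16u  /* the 16-byte boundary from the slide */

/* Length-decode cost in cycles for one instruction starting at addr.
   has_lcp marks an instruction whose length is changed by a 66H or 67H
   prefix; length is the instruction length in bytes (at least 1). */
unsigned predecode_cycles(uint64_t addr, unsigned length, bool has_lcp) {
    if (!has_lcp) {
        return 1;
    }
    bool spans_boundary = (addr / FETCH_BLOCK_BYTES) !=
                          ((addr + length - 1) / FETCH_BLOCK_BYTES);
    return spans_boundary ? 11 : 6;
}
```

Under this model, a 4-byte LCP instruction such as the 16-bit immediate move costs 6 cycles when it sits inside one 16-byte block and 11 cycles when it starts at offset 14 and spills into the next block, which is why such instructions are worth avoiding inside hot loops.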

Page 46

- Includes a "Loop Stream Detector" (LSD)
  - Potentially very high bandwidth instruction streaming
- A number of requirements to make use of the LSD
  - Maximum of 18 instructions in up to four 16-byte packets
  - No RET instructions (hence, little practical use for CALLs)
  - Up to four taken branches allowed
  - Most effective at 70+ iterations
- LSD is after PreDecode, so there is no added cost for LCPs
- Trade off LSD against conventional loop unrolling
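The LSD requirements listed above can be captured as a simple checklist. The sketch below encodes only the limits stated on the slide; the `loop_info` structure and function names are hypothetical, and a real loop's instruction and packet counts would have to come from a disassembly.

```c
#include <stdbool.h>

/* Properties of a candidate loop body (hypothetical descriptor). */
typedef struct {
    unsigned instructions;    /* instructions in the loop body */
    unsigned fetch_packets;   /* 16-byte packets the body occupies */
    unsigned taken_branches;  /* taken branches per iteration */
    bool     has_ret;         /* body contains a RET instruction */
} loop_info;

/* True when the loop meets the structural limits from the slide. */
bool lsd_eligible(const loop_info *l) {
    return l->instructions <= 18 &&
           l->fetch_packets <= 4 &&
           l->taken_branches <= 4 &&
           !l->has_ret;
}

/* True when the loop is eligible and runs long enough (70+ iterations)
   for the LSD to be most effective. */
bool lsd_worthwhile(const loop_info *l, unsigned long iterations) {
    return lsd_eligible(l) && iterations >= 70;
}
```

This also illustrates the unrolling trade-off: a body of 8 instructions qualifies, but unrolling it four times gives 32 instructions, which exceeds the 18-instruction limit and takes the loop out of the LSD.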

Page 47

- The decoder issues up to 4 uops for renaming/allocation per clock
  - This creates a trade-off between complex multi-uop instructions and multiple simple instructions
  - For example, a single four-uop instruction is all that can be renamed/allocated in a single clock
  - In some cases, multiple simple instructions may be a better choice than a single complex instruction
- Single-uop instructions allow more decoder flexibility
  - For example, 4-1-1-1 can be decoded in one clock
  - However, 2-2-2-1 takes three clocks to decode
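The 4-1-1-1 versus 2-2-2-1 behaviour follows from having one large decoder and three small ones. The sketch below is a simplified model written for illustration: it assumes instructions are decoded in program order, the first instruction of each clock goes to the large decoder (up to 4 uops), and the three small decoders accept only 1-uop instructions. It ignores macro-fusion, the microcode sequencer, and fetch limits; the function name is hypothetical.

```c
#include <stddef.h>

#define NUM_DECODERS 4
#define LARGE_DECODER_MAX_UOPS 4

/* Returns the number of clocks needed to decode the sequence, where
   uops[i] is the uop count of instruction i. Returns -1 if an instruction
   needs fewer than 1 or more than 4 uops (not handled by this model). */
int decode_clocks(const int *uops, size_t n) {
    int clocks = 0;
    size_t i = 0;
    while (i < n) {
        if (uops[i] < 1 || uops[i] > LARGE_DECODER_MAX_UOPS) {
            return -1;
        }
        i++;           /* first instruction of the group: large decoder */
        int used = 1;
        /* Fill the three small decoders with following 1-uop instructions. */
        while (i < n && used < NUM_DECODERS && uops[i] == 1) {
            i++;
            used++;
        }
        clocks++;
    }
    return clocks;
}
```

With the slide's patterns, `{4, 1, 1, 1}` decodes in one clock, while `{2, 2, 2, 1}` takes three because each 2-uop instruction has to wait for the large decoder.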

Page 48

- Up to six uops can be dispatched per clock
  - "Store Data" and "Store Address" dispatch ports are combined on the block diagram
- Up to four results can be written back per clock
- Single-clock-latency operations are best
  - Differing latency operations can create writeback conflicts
  - Separate multiple-clock uops with several single-uop instructions
  - Typical instructions here: ADC/SBB, RMW, CMOVcc
  - In some cases, separating an RMW instruction into its pieces might be faster (decode and scheduling flexibility)
- When equivalent, PS is preferred to PD (LCP)
  - For example, MOVAPS over MOVAPD, XORPS over XORPD

Page 49

- Bypass register "access" is preferred to register reads
- Partial register accesses often lead to stalls
  - Register size access that "conflicts" with a recent previous register write
  - Partial XMM updates are subject to dependency delays
  - Partial flag stalls can occur too, at much higher cost
    - Use a TEST instruction between the shift and the conditional to prevent them
  - Common zeroing instructions (e.g., XOR reg,reg) don't stall
- Avoid bypass between execution domains
  - For example: FP (ADDPS) and logical ops (PAND) on XMMn
- Vectorization: careful packing/unpacking sequence
- Use MXCSR's FZ and DAZ controls as appropriate

Page 50

- Software prefetch instructions
  - Can reach beyond a page boundary (including page walk)
  - Prefetch only when they complete without an exception
- General techniques to help these prefetchers
  - Organize data in consecutive lines
  - In general, increasing addresses are more easily prefetched

Page 51

What has been covered

- Notable features of Core® Micro-architecture
  - Wide Dynamic Execution
  - Advanced Memory Access
  - Advanced Smart Cache
  - Advanced Digital Media Boost
  - Power Efficient Support
- Core® Micro-architecture components
  - Front End
  - OOO execution core
  - Memory sub-system
- Advanced cache technology


Page 53

- Software prefetches are rarely ignored on the Merom architecture
  - On P4, if you had a DTLB miss, the prefetch could be ignored
  - On the Merom architecture they are not ignored, and the prefetch can hurt performance since it cannot retire until after the page walk
- The critical chunk is not utilized on a software prefetch
  - A prefetch can hurt performance if it is too close to the load
  - When the data comes in due to a prefetch, the entire cache line must arrive, instead of just the critical chunk, before the data can be used by the actual load
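A common way to keep a software prefetch far enough ahead of the load is to prefetch a fixed distance ahead in the loop. The sketch below uses the GCC/Clang builtin `__builtin_prefetch`; the `PREFETCH_DISTANCE` value is a hypothetical tuning parameter, not a recommendation from the slides, and the function name is illustrative. On a sequential walk like this one the hardware prefetchers may already cover the access pattern, so the benefit should be measured rather than assumed.

```c
#include <stddef.h>

/* Elements to prefetch ahead of the current load (hypothetical value;
   it needs tuning so that the line arrives before it is used). */
#define PREFETCH_DISTANCE 64

/* Sums an array while requesting data PREFETCH_DISTANCE elements ahead. */
long sum_with_prefetch(const int *data, size_t n) {
    long sum = 0;
    for (size_t i = 0; i < n; i++) {
#if defined(__GNUC__) || defined(__clang__)
        if (i + PREFETCH_DISTANCE < n) {
            /* Arguments: address, 0 = read access, 3 = high locality. */
            __builtin_prefetch(&data[i + PREFETCH_DISTANCE], 0, 3);
        }
#endif
        sum += data[i];
    }
    return sum;
}
```

A distance that is too small reproduces the problem described above, where the load has to wait for the whole prefetched line; a distance that is too large risks the line being evicted before it is used.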

Page 54

2x Compute Throughput / Clock

[Diagram: vectors A and B.]

Let's scale a vector: B[i] := A[i] * C
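The vector scaling operation can be written with SSE intrinsics so that each iteration performs one 128-bit load, one packed multiply, and one 128-bit store. This is a minimal sketch: the function name is hypothetical, unaligned loads and stores are used so that no alignment is assumed, and a scalar fallback handles the tail and builds without SSE support.

```c
#include <stddef.h>
#if defined(__SSE__)
#include <xmmintrin.h>
#endif

/* Computes b[i] = a[i] * c for i in [0, n). */
void scale_vector(float *b, const float *a, float c, size_t n) {
    size_t i = 0;
#if defined(__SSE__)
    __m128 vc = _mm_set1_ps(c);               /* broadcast c to 4 lanes */
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);      /* 128-bit packed load */
        _mm_storeu_ps(b + i, _mm_mul_ps(va, vc)); /* multiply, then store */
    }
#endif
    for (; i < n; i++) {                      /* scalar tail or fallback */
        b[i] = a[i] * c;
    }
}
```

Each loop iteration in the SSE path processes four single-precision elements, so the multiplier has to accept a full 128-bit operand per cycle to keep up with one 128-bit load per cycle.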

Page 55

Assume both microarchitectures have a 128-bit path from L1 to the processor.

Page 56

Handles all the memory data

- Multiply can't keep up with load bandwidth
- Multiplier operates on all data

Page 58

- Load eventually stalls waiting for multiplier
- Load pipe is free to advance

Page 59

Keeps pipeline free for computations

Page 60

Maintains 2x throughput compared to prior implementations

Page 61

8 single-precision flops per cycle

Page 62

4 double-precision flops per cycle

[Diagram on each of these pages: vectors A and B under the heading "2x Compute Throughput / Clock".]


Page 71

- Processor communicates power consumption to external platform components
  - Optimization of voltage regulator efficiency
  - Load line and power delivery efficiency
- PSI-2 / VID

Page 72

DTS: Digital Thermal Sensor

- Several thermal sensors are located within the processor to cover all possible hot spots.
- Dedicated logic scans the thermal sensors and measures the maximum temperature on the die at any given time.
- Accurately reporting processor temperature enables advanced thermal control schemes.

[Diagram labels: Core 1 DTS Logic, Core 2 DTS Logic, LPF (two instances), DTS control and status.]

Page 73

Processor provides its temperature reading over a multi-drop single-wire bus, allowing efficient platform thermal control.

[Diagram labels: Manager, PECI, PROC #1, PROC #2, PROC #3, Processor Fan, Auxiliary Fan, Chassis Fan 1, Chassis Fan 2.]

Page 74

- Front side bus with the following low-power improvements
  - Lower voltage
  - DPWR# and BPRI# signals
    - Must have FSB traffic to enable data and address bus input sense amplifiers and control signals (~120 pins)
  - Eliminated higher address and dual-processor-capable pins

Page 75

- Voltage-frequency switching separation
- Clock partitioning and recovery
- Event blocking
- Even during periods of high-performance execution, many parts of the chip core can be shut off.
