Date post: | 19-Jan-2016 |
Upload: | donald-norris |
Chapter 5: Computer Systems Design and Organization
Dr Mohamed Menacer
Taibah University
2007-2008
Understanding Structure and Function of Digital Computer
  Multiple levels of computer operation
    Application level
    High Level Language(s), HLL, level(s)
    Assembly/machine language level: instruction set
    System architecture level: subsystems & connections
    Digital logic level: gates, memory elements, buses
    Electronic design level
    Semiconductor physics level
Von Neumann/Turing Stored Program concept
  Main memory storing programs and data
  ALU operating on binary data
  Control unit interpreting instructions from memory and executing
  Input and output equipment operated by control unit
  Institute for Advanced Study (IAS), Princeton
  Completed 1952
Structure of von Neumann machine
CPU - details
  1000 x 40 bit words
    Binary number
    2 x 20 bit instructions
      Operation code (Opcode) (8 bits) + address (12 bits)
  Set of registers (storage in CPU)
    Memory Buffer Register (MBR)
    Memory Address Register (MAR)
    Instruction Register (IR)
    Instruction Buffer Register (IBR)
    Program Counter (PC)
    Accumulator
    Multiplier Quotient
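The word layout above (a 40-bit word holding two 20-bit instructions, each an 8-bit opcode plus a 12-bit address) can be illustrated with a few bit operations. This is only a sketch: it assumes the left instruction sits in the high-order 20 bits and that the opcode precedes the address within each instruction. The function names and the opcode values used in the example are hypothetical and chosen for illustration.

```python
# Field widths from the slide: 40-bit word = 2 x (8-bit opcode + 12-bit address).
OPCODE_BITS = 8
ADDRESS_BITS = 12
INSTRUCTION_BITS = OPCODE_BITS + ADDRESS_BITS  # 20 bits per instruction


def split_instruction(instruction):
    """Split one 20-bit instruction into (opcode, address)."""
    opcode = (instruction >> ADDRESS_BITS) & ((1 << OPCODE_BITS) - 1)
    address = instruction & ((1 << ADDRESS_BITS) - 1)
    return opcode, address


def decode_word(word):
    """Split a 40-bit word into (left, right) pairs of (opcode, address).

    Assumption: the left instruction occupies the high-order 20 bits.
    """
    mask = (1 << INSTRUCTION_BITS) - 1
    left = (word >> INSTRUCTION_BITS) & mask
    right = word & mask
    return split_instruction(left), split_instruction(right)


def encode_word(left, right):
    """Pack two (opcode, address) pairs into one 40-bit word."""
    def pack(pair):
        opcode, address = pair
        return (opcode << ADDRESS_BITS) | address
    return (pack(left) << INSTRUCTION_BITS) | pack(right)


# Arbitrary illustrative values, not a claim about the real opcode table.
word = encode_word((0x01, 0x0FA), (0x05, 0x0FB))
print(decode_word(word))  # ((1, 250), (5, 251))
```

A control unit built this way would need one memory fetch per two instructions, which is the role the slide gives to the Instruction Buffer Register (IBR): it holds the second instruction while the first is executed.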
Computer Structure
Computer Evolution
  Over 50 years, computers have evolved
    from memory size of 1 kiloword (1024 words) and clock periods of 1 millisecond (0.001 s)
    to memory size of a terabyte (2^40 bytes) and clock periods of 100 ps (10^-10 s) and shorter
  More speed and capacity are needed for many applications, such as real-time 3D animation and various simulations
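To put the figures above in proportion, the ratios can be worked out directly. This is a rough sketch: the slide gives the early memory size in words and the modern one in bytes, so the 40-bit word size is an assumption borrowed from the IAS machine described earlier, and the constant and function names are illustrative only.

```python
# Figures from the slide. The 40-bit word size is an assumption
# (taken from the IAS machine), needed to compare words with bytes.
EARLY_MEMORY_WORDS = 1024
ASSUMED_WORD_BITS = 40
EARLY_CLOCK_PERIOD_PS = 10**9      # 1 ms expressed in picoseconds
MODERN_MEMORY_BYTES = 2**40        # one terabyte, as defined on the slide
MODERN_CLOCK_PERIOD_PS = 100       # 100 ps


def clock_speedup():
    """How many times shorter the modern clock period is."""
    return EARLY_CLOCK_PERIOD_PS // MODERN_CLOCK_PERIOD_PS


def capacity_growth():
    """How many times larger the modern memory is, in bytes."""
    early_bytes = EARLY_MEMORY_WORDS * ASSUMED_WORD_BITS // 8
    return MODERN_MEMORY_BYTES / early_bytes


print(clock_speedup())  # 10000000
print(f"{capacity_growth():.3g}")
```

Under these assumptions the clock period shrank by seven orders of magnitude and memory capacity grew by roughly eight.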
Intel
  1971 - 4004
    First microprocessor
    All CPU components on a single chip
    4 bit
  Followed in 1972 by 8008
    8 bit
    Both designed for specific applications
  1974 - 8080
    Intel’s first general purpose microprocessor
Speeding it up
  Pipelining
  On board cache
  On board L1 & L2 cache
  Branch prediction
  Data flow analysis
  Speculative execution
Performance Balance
  Processor speed increased
  Memory capacity increased
  Memory speed lags behind processor speed
Logic and Memory Performance Gap
Intel Microprocessor Performance
I/O Devices
  Peripherals with intensive I/O demands
  Large data throughput demands
  Processors can handle this
  Problem moving data
  Solutions:
    Caching
    Buffering
    Higher-speed interconnection buses
    More elaborate bus structures
    Multiple-processor configurations
Typical I/O Device Data Rates
Improvements in Chip Organization and Architecture
  Increase hardware speed of processor
    Fundamentally due to shrinking logic gate size
      More gates, packed more tightly, increasing clock rate
      Propagation time for signals reduced
  Increase size and speed of caches
    Dedicating part of processor chip
      Cache access times drop significantly
  Change processor organization and architecture
    Increase effective speed of execution
    Parallelism
Problems with Clock Speed and Logic Density
  Power
    Power density increases with density of logic and clock speed
    Dissipating heat
  RC delay
    Speed at which electrons flow limited by resistance and capacitance of metal wires connecting them
    Delay increases as RC product increases
    Wire interconnects thinner, increasing resistance
    Wires closer together, increasing capacitance
  Memory latency
    Memory speeds lag processor speeds
  Solution:
    More emphasis on organizational and architectural approaches
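The RC delay point on this slide can be made concrete with a first-order scaling sketch. It assumes a deliberately simple model that is not from the slides: wire resistance is inversely proportional to the wire's cross-sectional area, capacitance to a neighbouring wire is inversely proportional to the spacing between them, and wire length is held fixed. The function name and parameters are hypothetical.

```python
def rc_delay_scale(cross_section_scale, spacing_scale):
    """Relative change in RC delay under a simple first-order model.

    Assumptions (illustrative only):
      R is proportional to 1 / cross-sectional area of the wire
      C is proportional to 1 / spacing between adjacent wires
      wire length is unchanged
    Arguments are scale factors relative to the old process,
    e.g. 0.5 means "half of what it was".
    """
    resistance_scale = 1.0 / cross_section_scale   # thinner wire, more resistance
    capacitance_scale = 1.0 / spacing_scale        # closer wires, more capacitance
    return resistance_scale * capacitance_scale    # delay tracks the RC product


# Halving both the wire cross-section and the spacing:
print(rc_delay_scale(0.5, 0.5))  # 4.0
```

In this model, shrinking both dimensions compounds: each effect alone doubles the delay, and together they quadruple it, which is why the wires rather than the gates can end up limiting the clock rate.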
Increased Cache Capacity
  Typically two or three levels of cache between processor and main memory
  Chip density increased
    More cache memory on chip
    Faster cache access
  Pentium chip devoted about 10% of chip area to cache
  Pentium 4 devotes about 50%
More Complex Execution Logic
  Enable parallel execution of instructions
  Pipeline works like assembly line
    Different stages of execution of different instructions at same time along pipeline
  Superscalar allows multiple pipelines within single processor
    Instructions that do not depend on one another can be executed in parallel
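The assembly-line behaviour described above can be sketched as a small schedule builder. It assumes an idealized pipeline in which every stage takes exactly one cycle and there are no stalls or dependencies between instructions. The stage names (FI, DI, EX, WB) and function names are hypothetical labels, not taken from the slides.

```python
def pipeline_schedule(num_instructions, stages):
    """Return rows[i][c]: the stage instruction i occupies in cycle c ('' if none).

    Idealized model: one cycle per stage, one new instruction enters per cycle,
    no stalls.
    """
    total_cycles = len(stages) + num_instructions - 1
    rows = []
    for i in range(num_instructions):
        row = [""] * total_cycles
        for s, name in enumerate(stages):
            row[i + s] = name  # instruction i enters stage s in cycle i + s
        rows.append(row)
    return rows


def sequential_cycles(num_instructions, stages):
    """Cycles needed if each instruction finishes before the next one starts."""
    return num_instructions * len(stages)


stages = ["FI", "DI", "EX", "WB"]  # hypothetical four-stage pipeline
for row in pipeline_schedule(3, stages):
    print(" ".join(f"{cell:>2}" for cell in row))
```

With three instructions and four stages, the pipelined schedule takes 4 + 3 - 1 = 6 cycles against 3 x 4 = 12 without overlap. A superscalar processor would, in this picture, run several such schedules side by side for independent instructions.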
Diminishing Returns
  Internal organization of processors complex
    Can get a great deal of parallelism
    Further significant increases likely to be relatively modest
  Benefits from cache are reaching limit
  Increasing clock rate runs into power dissipation problem
    Some fundamental physical limits are being reached
New Approach – Multiple Cores
  Multiple processors on single chip
    Large shared cache
  Within a processor, increase in performance proportional to square root of increase in complexity
  If software can use multiple processors, doubling number of processors almost doubles performance
  So, use two simpler processors on the chip rather than one more complex processor
  With two processors, larger caches are justified
    Power consumption of memory logic less than processing logic
  Example: IBM POWER4
    Two cores based on PowerPC
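The argument for two simpler cores can be checked with a small calculation. This sketch takes the slide's square-root relation at face value and models "almost doubles" with a parallel efficiency factor. The value 0.9 is an assumed figure chosen for illustration, and the function names and complexity units are hypothetical.

```python
import math


def single_core_performance(complexity):
    """Performance of one core, proportional to sqrt(complexity) per the slide."""
    return math.sqrt(complexity)


def multicore_performance(cores, complexity_per_core, parallel_efficiency=0.9):
    """Aggregate performance of several identical cores.

    parallel_efficiency < 1 models "almost doubles"; 0.9 is an assumed value.
    """
    return cores * parallel_efficiency * single_core_performance(complexity_per_core)


# Same total complexity budget of 2 units, spent two different ways.
one_complex_core = single_core_performance(2)      # about 1.41
two_simple_cores = multicore_performance(2, 1)     # about 1.8
print(f"{one_complex_core:.2f} vs {two_simple_cores:.2f}")
```

Under these assumptions the two simple cores come out ahead, but only if the software can actually keep both of them busy, which is the condition the slide attaches to the claim.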
POWER4 Chip Organization
Pentium Evolution (1)
  8080
    first general purpose microprocessor
    8 bit data path
    Used in first personal computer – Altair
  8086
    much more powerful
    16 bit
    instruction cache, prefetch few instructions
    8088 (8 bit external bus) used in first IBM PC
  80286
    16 Mbyte memory addressable
    up from 1 Mbyte
  80386
    32 bit
    Support for multitasking
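The jump from 1 Mbyte to 16 Mbyte follows from the number of address bits, since each extra bit doubles the addressable memory. The sketch below assumes the commonly cited address widths of 20 bits for the 8086 and 24 bits for the 80286, and 32 bits for the 80386. Those widths are not stated on the slides, and the function name is hypothetical.

```python
def addressable_bytes(address_bits):
    """Number of distinct byte addresses reachable with the given address width."""
    return 2 ** address_bits


MBYTE = 2 ** 20

# Address widths are assumptions, not taken from the slides.
for name, bits in [("8086", 20), ("80286", 24), ("80386", 32)]:
    print(f"{name}: {addressable_bytes(bits) // MBYTE} Mbyte")
```

Four extra address bits give a factor of 2^4 = 16, which matches the 1 Mbyte to 16 Mbyte step listed for the 80286.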
Pentium Evolution (2)
  80486
    sophisticated powerful cache and instruction pipelining
    built in maths co-processor
  Pentium
    Superscalar
    Multiple instructions executed in parallel
  Pentium Pro
    Increased superscalar organization
    Aggressive register renaming
    branch prediction
    data flow analysis
    speculative execution
Pentium Evolution (3)
  Pentium II
    MMX technology
    graphics, video & audio processing
  Pentium III
    Additional floating point instructions for 3D graphics
  Pentium 4
    Note Arabic rather than Roman numerals
    Further floating point and multimedia enhancements
  Itanium
    64 bit
    see chapter 15
  Itanium 2
    Hardware enhancements to increase speed
  See Intel web pages for detailed information on processors
Internet Resources
  http://www.intel.com/
    Search for the Intel Museum
  http://www.ibm.com
  http://www.dec.com