
Slide 1 Fundamentals of Computer Design CSCE430/830 Computer Architecture Instructor: Hong Jiang...

Date post: 15-Dec-2015
Upload: allan-bracher

Slide 1

Fundamentals of Computer Design

CSCE430/830 Computer Architecture

Instructor: Hong Jiang

Courtesy of Prof. Yifeng Zhu @ U. of Maine

Fall, 2007

Portions of these slides are derived from: Dave Patterson © UCB

Slide 2

Motivations and Introduction

• Phenomenal growth in computer industry/technology: performance doubling every 18 months over 20 years, yielding multi-GFLOPS processors, largely due to
– Microelectronics technology
– Computer design innovations

•We have come a long way in a short time of 60 years since the 1st general purpose computer in 1946:

• Instruction Set Architecture: •An Introduction

Slide 3

Motivations and Introduction

Past (Milestones):

– First electronic computer, ENIAC, in 1946: 18,000 vacuum tubes, 3,000 cubic feet, 20 2-foot 10-digit registers, 5 KIPS (thousand additions per second);

– First microprocessor (a CPU on a single IC chip) Intel 4004 in 1971: 2,300 transistors, 60 KIPs, $200;

– Virtual elimination of assembly language programming reduced the need for object-code compatibility;

– The creation of standardized, vendor-independent operating systems, such as UNIX and its clone, Linux, lowered the cost and risk of bringing out a new architecture

– RISC instruction set architecture paved the way for drastic design innovations that focused on two critical performance techniques: instruction-level parallelism and the use of caches

Slide 4

Motivations and Introduction

Present (State of the art):

– Microprocessors approaching/surpassing 10 GFLOPS;

– A high-end microprocessor (<$10K) today is easily more powerful than a supercomputer (>$10 million) of ten years ago;

– While technology advancement contributes a sustained annual growth of 35%, innovative computer design accounts for another 25% annual growth rate, amounting to a factor of 15 in performance gains!

Slide 5

Technology Trend

Big Fish Eating Little Fish

In reality:

Slide 6

Technology Trend

[Figure: 1988 Computer Food Chain. Labels: PC, Workstation, Minicomputer, Mainframe, Mini-supercomputer, Supercomputer, Massively Parallel Processors]

Slide 7

Technology Trend

[Figure: 1998 Computer Food Chain. Labels: PC, Workstation, Server, Mainframe, Supercomputer, Mini-supercomputer, Clusters, Minicomputer]

Now who is eating whom?

Slide 8

Parallel Computing Architectures in Top 500

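Source: www.top500.org, Nov. 2004

[Figure: three architecture diagrams. Symmetric Multiprocessing (SMP): several CPUs sharing one MEMORY through a BUS/CROSSBAR. Massively Parallel Processor (MPP): several CPU/memory (CPU M) nodes. Cluster: PCs connected by a network.]

[Chart: Supercomputer Trends in Top 500, by architecture class: MPP, Cluster, SMP, Constellations, SIMD, Single processor]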

Slide 9

Why Such Changes in 10 years?

• Performance
– Technology advances
» CMOS VLSI dominates older technologies (TTL, ECL) in cost AND performance
– Computer architecture advances improve the low end
» RISC, superscalar, RAID, …

• Price: lower costs due to …
– Simpler development
» CMOS VLSI: smaller systems, fewer components
– Higher volumes
» CMOS VLSI: same development cost for 10,000 vs. 10,000,000 units
– Lower margins by class of computer, due to fewer services

• Function
– Rise of networking/local interconnection technology

Slide 10

Amazing Underlying Technology Change

• In 1965, Gordon Moore sketched out his prediction of the pace of silicon technology.

• Moore's Law: The number of transistors incorporated in a chip will approximately double every 24 months.

• Decades later, Moore's Law remains true.

From Intel

Slide 11

Technology Trends: Moore’s Law

• Gordon Moore (co-founder of Intel) observed in 1965 that the number of transistors on a chip was doubling about every year; in 1975 he revised the estimate to about every 24 months.

• In fact, the number of transistors on a chip doubles about every 18 months.

From Intel

Slide 12

Technology Trends

In terms of speed, the CPU has improved dramatically, but memory and disk have improved only a little. This has led to dramatic changes in architecture, operating systems, and programming practices.

Slide 13

Technology dramatic change

• Processor
– transistor count per chip: about 55% per year
– clock rate: about 20% per year

• Memory
– DRAM capacity: about 60% per year (4x every 3 years)
– Memory speed: about 10% per year
– Cost per bit: improves about 25% per year

• Disk
– capacity: about 60% per year
– Total use of data: 100% per 9 months!

• Network Bandwidth
– 10 years: 10 Mb → 100 Mb
– 5 years: 100 Mb → 1 Gb
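The annual rates above compound, which is where figures like "4x every 3 years" come from. A minimal Python sketch of that arithmetic follows; the helper names `growth_over` and `doubling_time_years` are illustrative and not from the slides.

```python
import math


def growth_over(rate_per_year: float, years: float) -> float:
    """Total growth factor after compounding an annual rate for `years`."""
    return (1.0 + rate_per_year) ** years


def doubling_time_years(rate_per_year: float) -> float:
    """Years needed to double at a constant annual growth rate."""
    return math.log(2.0) / math.log(1.0 + rate_per_year)


# DRAM capacity: 60% per year compounds to roughly 4x in 3 years.
dram_3yr = growth_over(0.60, 3)

# Transistor count: 55% per year corresponds to doubling in about 1.6 years.
transistor_doubling = doubling_time_years(0.55)

# Clock rate (20%/yr) vs. memory speed (10%/yr): relative gap after 10 years.
gap_10yr = growth_over(0.20, 10) / growth_over(0.10, 10)

print(f"DRAM capacity after 3 years: {dram_3yr:.2f}x")
print(f"Transistor doubling time: {transistor_doubling:.2f} years")
print(f"Clock-rate vs. memory-speed gap after 10 years: {gap_10yr:.2f}x")
```

Even a 10-point difference in annual rate more than doubles the processor/memory gap within a decade, which is the imbalance the previous slide describes.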

Slide 14

Technology dramatic change

From IBM

Slide 15

Computer Architecture Is …

the attributes of a [computing] system as seen by the programmer, i.e., the conceptual structure and functional behavior, as distinct from the organization of the data flows and controls, the logic design, and the physical implementation.

Amdahl, Blaauw, and Brooks, 1964

Slide 16

Computer Architecture’s Changing Definition

• 1950s to 1960s Computer Architecture Course: Computer Arithmetic

• 1970s to mid 1980s Computer Architecture Course: Instruction Set Design, especially ISA appropriate for compilers

• 1990s Computer Architecture Course: Design of CPU, memory system, I/O system, Multiprocessors, Networks

• 2010s Computer Architecture Course: Self-adapting systems? Self-organizing structures? DNA Systems/Quantum Computing?

Slide 17

CSCE430/830 Course Focus

Understanding the design techniques, machine structures, technology factors, evaluation methods that will determine the form of computers in the 21st Century

[Figure: Computer Architecture (Instruction Set Design, Organization, Hardware/Software Boundary) at the center, surrounded by: Technology, Programming Languages, Operating Systems, History, Applications, Interface Design (ISA), Measurement & Evaluation, Parallelism, Compilers]

Slide 18

Computer Engineering Methodology

[Figure: a design cycle with three stages: Evaluate Existing Systems for Bottlenecks, Simulate New Designs and Organizations, Implement Next Generation System. Inputs shown: Technology Trends, Benchmarks, Workloads, Implementation Complexity]

Architecture design is an iterative process: Searching the space of possible designs at all levels of computer systems

Slide 19

Summary

1. Moore’s Law: The number of transistors incorporated in a chip will approximately double every 18 months.

2. CPU speed increases dramatically, but the speed of memory, disk and network increases slowly.

3. Architecture design is an iterative process. Measure performance: Benchmarks

Slide 20

Quantitative Principles

• Performance Metrics: How do we conclude that System-A is “better” than System-B?

• Amdahl’s Law: Relates total speedup of a system to the speedup of some portion of that system.

• Topics: (Sections 1.1, 1.2, 1.5, 1.6)
– Metrics for different market segments
– Benchmarks to measure performance
– Quantitative principles of computer design

Slide 21

Importance of Measurement

Architecture design is an iterative process:
• Search the possible design space
• Make selections
• Evaluate the selections made

[Figure: Cost/Performance Analysis sorts candidate designs into Good Ideas, Mediocre Ideas, and Bad Ideas]

Good measurement tools are required to accurately evaluate the selection.

Slide 22

Two notions of “performance”

Plane              Speed      DC to Paris   Passengers   Throughput (pmph)
Boeing 747         610 mph    6.5 hours     470          286,700
BAC/Sud Concorde   1350 mph   3 hours       132          178,200

• Time to do the task (Execution Time)

– execution time, response time, latency, etc.

• Tasks per day, hour, week, sec, ns. .. (Performance)

– throughput, bandwidth, etc.

Which has higher performance?
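The answer depends on which notion of performance is meant. As a rough sketch, the table can be evaluated both ways in Python; the `Plane` class and its field names are illustrative, and throughput is taken as passengers x speed, which reproduces the pmph column.

```python
from dataclasses import dataclass


@dataclass
class Plane:
    name: str
    speed_mph: float
    trip_hours: float  # DC to Paris
    passengers: int

    @property
    def throughput_pmph(self) -> float:
        """Passenger-miles per hour: passengers x speed."""
        return self.passengers * self.speed_mph


planes = [
    Plane("Boeing 747", 610, 6.5, 470),
    Plane("BAC/Sud Concorde", 1350, 3.0, 132),
]

# Execution-time view: the shortest trip wins.
fastest_trip = min(planes, key=lambda p: p.trip_hours)

# Throughput view: the most passenger-miles per hour wins.
highest_throughput = max(planes, key=lambda p: p.throughput_pmph)

print(f"Lowest latency:     {fastest_trip.name}")
print(f"Highest throughput: {highest_throughput.name}")
```

The two metrics pick different winners, which is why a performance claim needs to say which metric it uses.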

Slide 23

Performance Definitions

• Performance is in units of things-per-second.
– bigger is better

• Execution time is the reciprocal of performance.

$$\text{performance}(x) = \frac{1}{\text{execution\_time}(x)}$$

• "X is n times faster than Y" means

$$n = \frac{\text{execution\_time}(Y)}{\text{execution\_time}(X)} = \frac{\text{performance}(X)}{\text{performance}(Y)}$$

• When is throughput more important than execution time?

• When is execution time more important than throughput?

Slide 24

Performance Terminology

“X is n% faster than Y” means:

$$\frac{\text{ExTime}(Y)}{\text{ExTime}(X)} = \frac{\text{Performance}(X)}{\text{Performance}(Y)} = 1 + \frac{n}{100}$$

$$n = \frac{100\,(\text{Performance}(X) - \text{Performance}(Y))}{\text{Performance}(Y)}$$

Example: Y takes 15 seconds to complete a task, X takes 10 seconds. What % faster is X than Y?

$$n = \frac{100\,(\text{ExTime}(Y) - \text{ExTime}(X))}{\text{ExTime}(X)}$$
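The example can be checked numerically. This is a small sketch under the definition above; the function name `percent_faster` is illustrative.

```python
def percent_faster(ex_time_y: float, ex_time_x: float) -> float:
    """Return n such that X is n% faster than Y, given both execution times."""
    return 100.0 * (ex_time_y - ex_time_x) / ex_time_x


# Y takes 15 seconds, X takes 10 seconds.
n = percent_faster(ex_time_y=15.0, ex_time_x=10.0)

# Cross-check against the ratio form: ExTime(Y) / ExTime(X) = 1 + n/100.
ratio = 15.0 / 10.0

print(f"X is {n:.0f}% faster than Y (time ratio {ratio:.2f})")
```

Note that the denominator is the execution time of X, the faster system, so the result is 50% rather than the 33% that dividing by Y's time would give.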

Slide 25

Quantitative Design: Amdahl's Law

Amdahl’s Law gives a quick way to find the speedup from some enhancement.

Suppose that enhancement E accelerates a fraction F of the task by a factor S, and the remainder of the task is unaffected.

Speedup due to enhancement E:

$$\text{Speedup}(E) = \frac{\text{ExecutionTime}_{\text{without enhancement}}}{\text{ExecutionTime}_{\text{with enhancement}}} = \frac{\text{Performance}_{\text{with enhancement}}}{\text{Performance}_{\text{without enhancement}}}$$

Slide 26

Quantitative Design: Amdahl's Law

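[Figure: timelines for ExTime_old and ExTime_new, with the enhanced fraction marked]

$$\text{ExTime}_{new} = \text{ExTime}_{old} \times \left[(1 - \text{Fraction}_{enhanced}) + \frac{\text{Fraction}_{enhanced}}{\text{Speedup}_{enhanced}}\right]$$

$$\text{Speedup}_{overall} = \frac{\text{ExTime}_{old}}{\text{ExTime}_{new}} = \frac{1}{(1 - \text{Fraction}_{enhanced}) + \dfrac{\text{Fraction}_{enhanced}}{\text{Speedup}_{enhanced}}}$$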

Slide 27

Pictorial Depiction of Amdahl’s Law

Enhancement E accelerates fraction F of original execution time by a factor of S.

Before: Execution Time without enhancement E (shown normalized to 1 = (1 - F) + F):

[ Unaffected fraction: (1 - F) | Affected fraction: F ]

After: Execution Time with enhancement E:

[ Unaffected fraction: (1 - F), unchanged | F/S ]

$$\text{Speedup}(E) = \frac{\text{Execution Time without enhancement E}}{\text{Execution Time with enhancement E}} = \frac{1}{(1 - F) + F/S}$$
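The before/after picture translates directly into code. The sketch below assumes normalized time as on the slide; `amdahl_speedup` is an illustrative name, and the values F = 0.4 with S from 2 to 1000 are hypothetical, chosen only to show diminishing returns.

```python
def amdahl_speedup(fraction_enhanced: float, speedup_enhanced: float) -> float:
    """Overall speedup when a fraction F of the time is accelerated by S."""
    if not 0.0 <= fraction_enhanced <= 1.0:
        raise ValueError("fraction_enhanced must be in [0, 1]")
    if speedup_enhanced <= 0.0:
        raise ValueError("speedup_enhanced must be positive")
    # Normalized new time: unaffected part (1 - F) plus accelerated part F / S.
    new_time = (1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced
    return 1.0 / new_time


# Hypothetical sweep: 40% of the time is enhanced by larger and larger factors.
sweep = [(s, amdahl_speedup(0.4, s)) for s in (2, 10, 100, 1000)]

for s, overall in sweep:
    print(f"S = {s:>4}: overall speedup = {overall:.4f}")
```

However large S becomes, the unaffected fraction (1 - F) keeps the overall speedup in this sweep below 1 / 0.6.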

Slide 28

Quantitative Design: Amdahl's Law

• Floating point (FP) instructions are improved to run 2X faster, but only 10% of actual instructions are FP. Suppose the old execution time is ExTime_old. What are the new execution time and the speedup?

$$\text{ExTime}_{new} = \text{ExTime}_{old} \times (0.9 + 0.1/2) = 0.95 \times \text{ExTime}_{old}$$

$$\text{Speedup}_{overall} = \frac{\text{ExTime}_{old}}{\text{ExTime}_{new}} = \frac{1}{(1 - \text{Fraction}_{enhanced}) + \dfrac{\text{Fraction}_{enhanced}}{\text{Speedup}_{enhanced}}} = \frac{1}{(1 - 0.1) + 0.1/2} = \frac{1}{0.95} = 1.053$$
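The same numbers can be worked with a concrete baseline. In this sketch the 100-second baseline is a hypothetical value chosen for readability, and the 10% FP share is treated as a share of execution time, which the slide's calculation implicitly assumes.

```python
old_time = 100.0    # hypothetical baseline execution time, in seconds
fp_fraction = 0.10  # share of time spent in FP (assumed equal to the instruction share)
fp_speedup = 2.0    # FP instructions run 2X faster

# New time = unaffected part + accelerated FP part.
new_time = old_time * ((1.0 - fp_fraction) + fp_fraction / fp_speedup)
speedup = old_time / new_time

# Upper bound: even infinitely fast FP leaves the other 90% untouched.
best_case = 1.0 / (1.0 - fp_fraction)

print(f"New execution time: {new_time:.1f} s")
print(f"Overall speedup:    {speedup:.3f}")
print(f"Best possible:      {best_case:.3f}")
```

The best case of about 1.11 shows why a 2X gain on a rarely used feature moves overall performance so little.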

Slide 29

Computer Clocks

• A computer clock runs at a constant rate and determines when events take place in hardware.

[Figure: clock waveform Clk with one clock period marked]

• The clock cycle time is the amount of time for one clock period to elapse (e.g. 5 ns).

• The clock rate is the inverse of the clock cycle time.

• For example, if a computer has a clock cycle time of 5 ns, the clock rate is:

$$\frac{1}{5 \times 10^{-9}\ \text{sec}} = 200\ \text{MHz}$$

Slide 30

Computing CPU time

• The time to execute a given program is

CPU time = CPU clock cycles for a program x clock cycle time

Since clock cycle time and clock rate are reciprocals,

CPU time = CPU clock cycles for a program / clock rate

• CPI: clock cycles per instruction

$$\text{CPI} = \frac{\text{CPU clock cycles for a program}}{\text{Instruction count}}$$

Slide 31

Computing CPU time

• The time to execute a given program is

CPU time = CPU clock cycles for a program x clock cycle time

Since clock cycle time and clock rate are reciprocals,

CPU time = CPU clock cycles for a program / clock rate

• The number of CPU clock cycles can be determined by

CPU clock cycles = (instructions/program) x (clock cycles/instruction) = Instruction count x CPI

which gives

CPU time = Instruction count x CPI x clock cycle time

CPU time = Instruction count x CPI / clock rate

• The units for this are

$$\text{seconds} = \frac{\text{instructions}}{\text{program}} \times \frac{\text{clock cycles}}{\text{instruction}} \times \frac{\text{seconds}}{\text{clock cycle}}$$

Slide 32

Example of Computing CPU time

• If a computer has a clock rate of 2 GHz, how long does it take to execute a program with 1,000,000 instructions, if the CPI for the program is 3.5?

Slide 33

Example of Computing CPU time

• If a computer has a clock rate of 2 GHz, how long does it take to execute a program with 1,000,000 instructions, if the CPI for the program is 3.5?

• Using the equation

CPU time = Instruction count x CPI / clock rate

gives

CPU time = 1,000,000 x 3.5 / (2 x 10^9) = 0.00175 seconds (1.75 ms)

• If a computer’s clock rate increases from 200 MHz to 250 MHz and the other factors remain the same, how many times faster will the computer be?

$$\frac{\text{CPU time}_{old}}{\text{CPU time}_{new}} = \frac{\text{clock rate}_{new}}{\text{clock rate}_{old}} = \frac{250\ \text{MHz}}{200\ \text{MHz}} = 1.25$$

• What simplifying assumptions did we make?
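Both parts of this example follow from the CPU time equation. A minimal Python sketch follows; the function name `cpu_time_seconds` is illustrative, and the second calculation assumes, as the question does, that instruction count and CPI stay fixed when the clock rate changes.

```python
def cpu_time_seconds(instruction_count: float, cpi: float, clock_rate_hz: float) -> float:
    """CPU time = Instruction count x CPI / clock rate."""
    return instruction_count * cpi / clock_rate_hz


# Part 1: 1,000,000 instructions, CPI of 3.5, 2 GHz clock.
t = cpu_time_seconds(1_000_000, 3.5, 2e9)

# Part 2: same program, clock rate raised from 200 MHz to 250 MHz.
t_old = cpu_time_seconds(1_000_000, 3.5, 200e6)
t_new = cpu_time_seconds(1_000_000, 3.5, 250e6)
clock_speedup = t_old / t_new

print(f"CPU time at 2 GHz: {t * 1e3:.2f} ms")
print(f"Speedup from 200 MHz to 250 MHz: {clock_speedup:.2f}x")
```

In practice a faster clock can raise CPI, for example when memory latency in cycles grows, so holding the other factors constant is the main simplifying assumption here.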


Slide 34

Performance Example

• Two computers M1 and M2 with the same instruction set.

• For a given program, we have

Machine   Clock rate (MHz)   CPI
M1        50                 2.8
M2        75                 3.2

• How many times faster is M2 than M1 for this program?

$$\frac{\text{ExTime}_{M1}}{\text{ExTime}_{M2}} = \frac{IC_{M1} \times CPI_{M1} / \text{Clock Rate}_{M1}}{IC_{M2} \times CPI_{M2} / \text{Clock Rate}_{M2}} = \frac{2.8/50}{3.2/75} = 1.31$$
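Because both machines run the same program on the same instruction set, the instruction counts cancel and only CPI / clock rate matters. A short sketch of that comparison follows; the dictionary layout and helper name are illustrative.

```python
machines = {
    "M1": {"clock_mhz": 50.0, "cpi": 2.8},
    "M2": {"clock_mhz": 75.0, "cpi": 3.2},
}


def time_per_instruction_us(machine: dict) -> float:
    """Average time per instruction in microseconds: CPI / clock rate in MHz."""
    return machine["cpi"] / machine["clock_mhz"]


t_m1 = time_per_instruction_us(machines["M1"])
t_m2 = time_per_instruction_us(machines["M2"])

# Same instruction count on both machines, so the ratio of per-instruction
# times equals the ratio of execution times.
m2_over_m1 = t_m1 / t_m2

print(f"M2 is {m2_over_m1:.2f} times faster than M1")
```

M2 wins despite its higher CPI because its 50% higher clock rate outweighs the roughly 14% CPI penalty.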

Slide 35

Aspects of CPU Performance

CPU time = Seconds / Program = (Instructions / Program) x (Cycles / Instruction) x (Seconds / Cycle)

               Inst Count   CPI    Cycle Time
Program        X
Compiler       X            (X)
Inst. Set      X            X
Organization                X      X
Technology                         X

Slide 36

Performance Summary

• Two performance metrics: execution time and throughput.

• Amdahl’s Law:

$$\text{Speedup}(E) = \frac{\text{Execution Time without enhancement E}}{\text{Execution Time with enhancement E}} = \frac{1}{(1 - F) + F/S}$$

• When trying to improve performance, look at what occurs frequently => make the common case fast.

• CPU time:

CPU time = Instruction count x CPI x clock cycle time

CPU time = Instruction count x CPI / clock rate

