
Chapter 1 Fundamentals of Quantitative Design and...

Date post: 23-Oct-2020
Transcript
  • The University of Adelaide, School of Computer Science 22 November 2018

    Chapter 2 — Instructions: Language of the Computer 1

    Copyright © 2019, Elsevier Inc. All rights reserved.

    Chapter 1

    Fundamentals of Quantitative Design and Analysis

    Computer Architecture: A Quantitative Approach, Sixth Edition

    Computer Technology

    • Performance improvements:
        • Improvements in semiconductor technology
            • Feature size, clock speed
        • Improvements in computer architectures
            • Enabled by HLL compilers, UNIX
            • Lead to RISC architectures
        • Together have enabled:
            • Lightweight computers
            • Productivity-based managed/interpreted programming languages

    Single Processor Performance

    Current Trends in Architecture

    • Cannot continue to leverage instruction-level parallelism (ILP)
        • Single processor performance improvement ended in 2003
    • New models for performance:
        • Data-level parallelism (DLP)
        • Thread-level parallelism (TLP)
        • Request-level parallelism (RLP)
    • These require explicit restructuring of the application

    Classes of Computers

    • Personal Mobile Device (PMD)
        • e.g. smart phones, tablet computers
        • Emphasis on energy efficiency and real-time
    • Desktop Computing
        • Emphasis on price-performance
    • Servers
        • Emphasis on availability, scalability, throughput
    • Clusters / Warehouse Scale Computers
        • Used for “Software as a Service (SaaS)”
        • Emphasis on availability and price-performance
        • Sub-class: Supercomputers, emphasis: floating-point performance and fast internal networks
    • Internet of Things/Embedded Computers
        • Emphasis: price

    Parallelism

    • Classes of parallelism in applications:
        • Data-Level Parallelism (DLP)
        • Task-Level Parallelism (TLP)
    • Classes of architectural parallelism:
        • Instruction-Level Parallelism (ILP)
        • Vector architectures/Graphic Processor Units (GPUs)
        • Thread-Level Parallelism
        • Request-Level Parallelism

    Flynn’s Taxonomy

    • Single instruction stream, single data stream (SISD)
    • Single instruction stream, multiple data streams (SIMD)
        • Vector architectures
        • Multimedia extensions
        • Graphics processor units
    • Multiple instruction streams, single data stream (MISD)
        • No commercial implementation
    • Multiple instruction streams, multiple data streams (MIMD)
        • Tightly-coupled MIMD
        • Loosely-coupled MIMD

    Defining Computer Architecture

    • “Old” view of computer architecture:
        • Instruction Set Architecture (ISA) design
        • i.e. decisions regarding:
            • registers, memory addressing, addressing modes, instruction operands, available operations, control flow instructions, instruction encoding
    • “Real” computer architecture:
        • Specific requirements of the target machine
        • Design to maximize performance within constraints: cost, power, and availability
        • Includes ISA, microarchitecture, hardware

    Instruction Set Architecture

    • Class of ISA
        • General-purpose registers
        • Register-memory vs load-store
    • RISC-V registers
        • 32 g.p., 32 f.p.

    Register   Name      Use               Saver
    x0         zero      constant 0        n/a
    x1         ra        return addr       caller
    x2         sp        stack ptr         callee
    x3         gp        gbl ptr
    x4         tp        thread ptr
    x5-x7      t0-t2     temporaries       caller
    x8         s0/fp     saved/frame ptr   callee
    x9         s1        saved             callee
    x10-x17    a0-a7     arguments         caller
    x18-x27    s2-s11    saved             callee
    x28-x31    t3-t6     temporaries       caller
    f0-f7      ft0-ft7   FP temps          caller
    f8-f9      fs0-fs1   FP saved          callee
    f10-f17    fa0-fa7   FP arguments      caller
    f18-f27    fs2-fs11  FP saved          callee
    f28-f31    ft8-ft11  FP temps          caller

    Instruction Set Architecture

    • Memory addressing
        • RISC-V: byte addressed, aligned accesses faster
    • Addressing modes
        • RISC-V: Register, immediate, displacement (base+offset)
        • Other examples: autoincrement, indexed, PC-relative
    • Types and size of operands
        • RISC-V: 8-bit, 32-bit, 64-bit

    Instruction Set Architecture

    • Operations
        • RISC-V: data transfer, arithmetic, logical, control, floating point
        • See Fig. 1.5 in text
    • Control flow instructions
        • Use content of registers (RISC-V) vs. status bits (x86, ARMv7, ARMv8)
        • Return address in register (RISC-V, ARMv7, ARMv8) vs. on stack (x86)
    • Encoding
        • Fixed (RISC-V, ARMv7/v8 except compact instruction set) vs. variable length (x86)

    Trends in Technology

    • Integrated circuit technology (Moore’s Law)
        • Transistor density: 35%/year
        • Die size: 10-20%/year
        • Integration overall: 40-55%/year
    • DRAM capacity: 25-40%/year (slowing)
        • 8 Gb (2014), 16 Gb (2019), possibly no 32 Gb
    • Flash capacity: 50-60%/year
        • 8-10X cheaper/bit than DRAM
    • Magnetic disk capacity: recently slowed to 5%/year
        • Density increases may no longer be possible, maybe increase from 7 to 9 platters
        • 8-10X cheaper/bit than Flash
        • 200-300X cheaper/bit than DRAM
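
The annual growth rates on the Trends in Technology slide compound, so each one can be restated as a doubling time. The following is a minimal sketch, assuming perfectly steady year-over-year growth (real trends are not that smooth); the function name is illustrative and not from the slides.

```python
import math


def doubling_time_years(annual_growth: float) -> float:
    """Years needed to double at a constant compound annual growth rate."""
    return math.log(2.0) / math.log(1.0 + annual_growth)


# Rates are taken from the slide; constant growth is an assumption.
transistor_density = doubling_time_years(0.35)  # 35%/year
disk_capacity = doubling_time_years(0.05)       # 5%/year

print(f"transistor density doubles in about {transistor_density:.1f} years")
print(f"disk capacity doubles in about {disk_capacity:.1f} years")
```

Under that assumption, 35%/year doubles in a little over two years, while 5%/year takes roughly fourteen.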

    Bandwidth and Latency

    • Bandwidth or throughput
        • Total work done in a given time
        • 32,000-40,000X improvement for processors
        • 300-1200X improvement for memory and disks
    • Latency or response time
        • Time between start and completion of an event
        • 50-90X improvement for processors
        • 6-8X improvement for memory and disks

    Bandwidth and Latency

    [Figure: log-log plot of bandwidth and latency milestones]

    Transistors and Wires

    • Feature size
        • Minimum size of transistor or wire in x or y dimension
        • 10 microns in 1971 to 0.011 microns in 2017
        • Transistor performance scales linearly
            • Wire delay does not improve with feature size!
        • Integration density scales quadratically
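
The linear and quadratic scaling claims on the Transistors and Wires slide can be checked with simple arithmetic on the two feature sizes it quotes. This is a rough sketch that assumes purely geometric scaling; the function names are illustrative, and the result is an idealized bound, not a measured density.

```python
def linear_shrink(old_feature_um: float, new_feature_um: float) -> float:
    """Factor by which the feature size shrank in one dimension."""
    return old_feature_um / new_feature_um


def density_gain(old_feature_um: float, new_feature_um: float) -> float:
    """Idealized gain in transistors per unit area: shrink applies in x and y."""
    return linear_shrink(old_feature_um, new_feature_um) ** 2


# Feature sizes from the slide: 10 microns (1971) and 0.011 microns (2017).
print(round(linear_shrink(10.0, 0.011)))  # about 909x in one dimension
print(round(density_gain(10.0, 0.011)))   # about 826,000x per unit area
```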

    Power and Energy

    • Problem: Get power in, get power out
    • Thermal Design Power (TDP)
        • Characterizes sustained power consumption
        • Used as target for power supply and cooling system
        • Lower than peak power (peak is about 1.5X higher), higher than average power consumption
    • Clock rate can be reduced dynamically to limit power consumption
    • Energy per task is often a better measurement

    Dynamic Energy and Power

    • Dynamic energy
        • Transistor switch from 0 -> 1 or 1 -> 0
        • ½ × Capacitive load × Voltage²
    • Dynamic power
        • ½ × Capacitive load × Voltage² × Frequency switched
    • Reducing clock rate reduces power, not energy
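
The two formulas on the Dynamic Energy and Power slide explain why lowering the clock rate alone cuts power but not energy, while lowering the voltage cuts both. A minimal sketch follows; the capacitance, voltage, and frequency values are hypothetical placeholders, and only the ratios matter.

```python
def dynamic_energy(cap_load_f: float, voltage_v: float) -> float:
    """Energy of one 0->1 or 1->0 transition: 1/2 * C * V^2 (joules)."""
    return 0.5 * cap_load_f * voltage_v ** 2


def dynamic_power(cap_load_f: float, voltage_v: float, freq_hz: float) -> float:
    """Power per transistor: 1/2 * C * V^2 * f (watts)."""
    return dynamic_energy(cap_load_f, voltage_v) * freq_hz


# Hypothetical operating point, chosen only for illustration.
C, V, F = 1e-9, 1.0, 3.3e9

base_energy = dynamic_energy(C, V)
base_power = dynamic_power(C, V, F)

# Halving the clock rate: power halves, energy per transition is unchanged.
half_clock_power_ratio = dynamic_power(C, V, F / 2) / base_power
half_clock_energy_ratio = dynamic_energy(C, V) / base_energy

# Lowering voltage and frequency by 15% each: both energy and power drop.
dvfs_energy_ratio = dynamic_energy(C, 0.85 * V) / base_energy
dvfs_power_ratio = dynamic_power(C, 0.85 * V, 0.85 * F) / base_power

print(half_clock_power_ratio, half_clock_energy_ratio)
print(round(dvfs_energy_ratio, 4), round(dvfs_power_ratio, 4))
```

With a 15% reduction in both voltage and frequency, energy falls to about 0.72 of the original and power to about 0.61.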

    Power

    • Intel 80386 consumed ~2 W
    • 3.3 GHz Intel Core i7 consumes 130 W
    • Heat must be dissipated from 1.5 x 1.5 cm chip
    • This is the limit of what can be cooled by air

    Reducing Power

    • Techniques for reducing power:
        • Do nothing well
        • Dynamic Voltage-Frequency Scaling
        • Low power state for DRAM, disks
        • Overclocking, turning off cores

    Static Power

    • Static power consumption
        • 25-50% of total power
        • Current_static × Voltage
        • Scales with number of transistors
        • To reduce: power gating

    Trends in Cost

    • Cost driven down by learning curve
        • Yield
    • DRAM: price closely tracks cost
    • Microprocessors: price depends on volume
        • 10% less for each doubling of volume
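
The "10% less for each doubling of volume" rule of thumb from the Trends in Cost slide can be written as a small function. This is a sketch only: interpolating smoothly between doublings is an assumption, and the base cost and volumes below are made-up numbers.

```python
import math


def cost_at_volume(base_cost: float, base_volume: float, volume: float,
                   drop_per_doubling: float = 0.10) -> float:
    """Unit cost after scaling volume, losing a fixed fraction per doubling."""
    doublings = math.log2(volume / base_volume)
    return base_cost * (1.0 - drop_per_doubling) ** doublings


# Hypothetical part: 100 (arbitrary currency) per unit at 1,000 units.
print(round(cost_at_volume(100.0, 1_000, 2_000), 2))  # one doubling
print(round(cost_at_volume(100.0, 1_000, 4_000), 2))  # two doublings
```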

    Integrated Circuit Cost

    • Integrated circuit
    • Bose-Einstein formula:
        • Die yield = Wafer yield × 1 / (1 + Defects per unit area × Die area)^N
        • Defects per unit area = 0.016-0.057 defects per square cm (2010)
        • N = process-complexity factor = 11.5-15.5 (40 nm, 2010)
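
The die-yield parameters on the Integrated Circuit Cost slide plug into the Bose-Einstein model, Die yield = Wafer yield × 1 / (1 + Defects per unit area × Die area)^N. The sketch below assumes that form, plus the usual textbook approximation for dies per wafer, which does not appear in the slide text. The defect density and N are picked from inside the slide's ranges; the wafer size and die sizes are illustrative.

```python
import math


def dies_per_wafer(wafer_diameter_cm: float, die_area_cm2: float) -> float:
    """Approximate die count: wafer area / die area, minus an edge-loss term."""
    wafer_area = math.pi * (wafer_diameter_cm / 2.0) ** 2
    edge_loss = math.pi * wafer_diameter_cm / math.sqrt(2.0 * die_area_cm2)
    return wafer_area / die_area_cm2 - edge_loss


def die_yield(defects_per_cm2: float, die_area_cm2: float, n: float,
              wafer_yield: float = 1.0) -> float:
    """Bose-Einstein yield model; n is the process-complexity factor."""
    return wafer_yield / (1.0 + defects_per_cm2 * die_area_cm2) ** n


def cost_per_good_die(wafer_cost: float, wafer_diameter_cm: float,
                      die_area_cm2: float, defects_per_cm2: float,
                      n: float) -> float:
    """Wafer cost spread over the dies that are expected to work."""
    good_dies = (dies_per_wafer(wafer_diameter_cm, die_area_cm2)
                 * die_yield(defects_per_cm2, die_area_cm2, n))
    return wafer_cost / good_dies


# Assumed inputs: 30 cm wafer, 0.047 defects/cm^2, N = 12.
for side_cm in (1.5, 1.0):
    area = side_cm ** 2
    print(side_cm, round(dies_per_wafer(30.0, area)),
          round(die_yield(0.047, area, 12), 2))
```

The larger die loses on both counts: fewer dies fit on the wafer, and a smaller fraction of them work, so cost per good die rises faster than die area.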

    Dependability

    • Module reliability
        • Mean time to failure (MTTF)
        • Mean time to repair (MTTR)
        • Mean time between failures (MTBF) = MTTF + MTTR
        • Availability = MTTF / MTBF
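
The Dependability definitions translate directly into code. A minimal sketch, with hypothetical MTTF and MTTR values used only to exercise the formulas:

```python
def mtbf(mttf_hours: float, mttr_hours: float) -> float:
    """Mean time between failures = MTTF + MTTR."""
    return mttf_hours + mttr_hours


def availability(mttf_hours: float, mttr_hours: float) -> float:
    """Fraction of time the module is in service: MTTF / MTBF."""
    return mttf_hours / mtbf(mttf_hours, mttr_hours)


# Hypothetical module: fails every 1,000,000 hours, takes 24 hours to repair.
print(mtbf(1_000_000, 24))
print(f"{availability(1_000_000, 24):.6f}")
```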

    Measuring Performance

    • Typical performance metrics:
        • Response time
        • Throughput
    • Speedup of X relative to Y
        • Execution time_Y / Execution time_X
    • Execution time
        • Wall clock time: includes all system overheads
        • CPU time: only computation time
    • Benchmarks
        • Kernels (e.g. matrix multiply)
        • Toy programs (e.g. sorting)
        • Synthetic benchmarks (e.g. Dhrystone)
        • Benchmark suites (e.g. SPEC06fp, TPC-C)
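
The speedup ratio and the wall-clock versus CPU-time distinction from the Measuring Performance slide can be tried on a small kernel. This is an illustrative sketch: the matrix-multiply kernel and its size are arbitrary choices, and it uses Python's standard `time.perf_counter` (wall clock) and `time.process_time` (CPU time of the current process).

```python
import time


def speedup(exec_time_y: float, exec_time_x: float) -> float:
    """Speedup of X relative to Y: execution time of Y over that of X."""
    return exec_time_y / exec_time_x


def matmul(a, b):
    """Naive matrix multiply, used here only as a small kernel."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]


def time_kernel(n: int = 40):
    """Return (wall clock seconds, CPU seconds) for one n x n multiply."""
    a = [[float(i + j) for j in range(n)] for i in range(n)]
    wall_start, cpu_start = time.perf_counter(), time.process_time()
    matmul(a, a)
    return time.perf_counter() - wall_start, time.process_time() - cpu_start


wall_s, cpu_s = time_kernel()
print(f"wall clock: {wall_s:.6f} s, CPU: {cpu_s:.6f} s")
print(speedup(15.0, 10.0))  # X takes 10 s, Y takes 15 s
```

On a loaded system the wall-clock figure can exceed the CPU figure noticeably, which is the overhead the slide refers to.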

    Principles of Computer Design

    • Take Advantage of Parallelism
        • e.g. multiple processors, disks, memory banks, pipelining, multiple functional units
    • Principle of Locality
        • Reuse of data and instructions
    • Focus on the Common Case
        • Amdahl’s Law
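
Amdahl's Law, named on the slide under "Focus on the Common Case", bounds the overall speedup by the fraction of time the enhancement can actually be used. A minimal sketch in its standard form; the 40% fraction and 10x enhancement are example inputs, not figures from the slides.

```python
def amdahl_speedup(fraction_enhanced: float, speedup_enhanced: float) -> float:
    """Overall speedup when only part of the execution time is sped up."""
    return 1.0 / ((1.0 - fraction_enhanced)
                  + fraction_enhanced / speedup_enhanced)


# Example inputs: 40% of the time benefits from a 10x faster unit.
print(amdahl_speedup(0.4, 10.0))

# Even an arbitrarily large enhancement is capped at 1 / (1 - fraction).
print(amdahl_speedup(0.4, 1e12))
```

The first call gives 1.5625; the second approaches the cap of about 1.67, which is why the common case is the one worth optimizing.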

    Principles of Computer Design

    • The Processor Performance Equation
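
The equation itself did not survive in this transcript. In its standard textbook form, CPU time = Instruction count × Cycles per instruction (CPI) × Clock cycle time, which is the same as Instruction count × CPI / Clock rate. A minimal sketch under that assumption, with made-up inputs:

```python
def cpu_time_seconds(instruction_count: float, cpi: float,
                     clock_rate_hz: float) -> float:
    """CPU time = instruction count * CPI / clock rate."""
    return instruction_count * cpi / clock_rate_hz


# Hypothetical program: 2 billion instructions, CPI of 1.5, 3 GHz clock.
print(cpu_time_seconds(2e9, 1.5, 3e9))
```

Each factor is governed by a different layer: instruction count by the ISA and compiler, CPI by the organization and ISA, and clock rate by the hardware technology and organization.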

    Principles of Computer Design

    • Different instruction types having different CPIs
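
When instruction types have different CPIs, total clock cycles are the sum over types of instruction count × CPI, and the overall CPI is that sum divided by the total instruction count. The sketch below assumes this standard formulation; the instruction mix is invented for illustration.

```python
def total_cycles(mix: dict) -> float:
    """Sum of (instruction count * CPI) over all instruction types."""
    return sum(count * cpi for count, cpi in mix.values())


def overall_cpi(mix: dict) -> float:
    """Weighted-average CPI across the whole instruction mix."""
    return total_cycles(mix) / sum(count for count, _ in mix.values())


# Hypothetical mix: type -> (instruction count in millions, CPI).
mix = {
    "alu": (500, 1.0),
    "load_store": (300, 2.0),
    "branch": (200, 3.0),
}

print(total_cycles(mix))  # millions of cycles
print(overall_cpi(mix))
```

Here branches are only 20% of the instructions but account for more than a third of the cycles, so the rarer type can still dominate.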


    Fallacies and Pitfalls

    • All exponential laws must come to an end
        • Dennard scaling (constant power density)
            • Stopped by threshold voltage
        • Disk capacity
            • 30-100% per year to 5% per year
        • Moore’s Law
            • Most visible with DRAM capacity
            • ITRS disbanded
            • Only four foundries left producing state-of-the-art logic chips
            • 11 nm, 3 nm might be the limit

    Fallacies and Pitfalls

    • Microprocessors are a silver bullet
        • Performance is now a programmer’s burden
    • Falling prey to Amdahl’s Law
    • A single point of failure
    • Hardware enhancements that increase performance also improve energy efficiency, or are at worst energy neutral
    • Benchmarks remain valid indefinitely
        • Compiler optimizations target benchmarks

    Fallacies and Pitfalls

    • The rated mean time to failure of disks is 1,200,000 hours or almost 140 years, so disks practically never fail
        • MTTF values from manufacturers assume regular replacement
    • Peak performance tracks observed performance
    • Fault detection can lower availability
        • Not all operations are needed for correct execution
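
The disk MTTF fallacy is easier to see with numbers. A rough sketch, assuming a population of 1,000 disks (an illustrative figure) in which each failed disk is replaced by one with the same rated MTTF:

```python
HOURS_PER_YEAR = 24 * 365


def mttf_in_years(mttf_hours: float) -> float:
    """Rated MTTF converted from hours to years."""
    return mttf_hours / HOURS_PER_YEAR


def expected_failures_per_year(num_disks: int, mttf_hours: float) -> float:
    """Expected failures in a year, assuming failed disks are replaced."""
    return num_disks * HOURS_PER_YEAR / mttf_hours


print(round(mttf_in_years(1_200_000), 1))  # "almost 140 years"
print(expected_failures_per_year(1_000, 1_200_000))
```

So a 1,200,000-hour rating means roughly 7 of every 1,000 disks fail each year, which is a more useful reading than "140 years per disk".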

