Multiprocessing and Distributed Systems

CS-502 Operating Systems, Fall 2007

(Slides include materials from Operating System Concepts, 7th ed., by Silberschatz, Galvin, & Gagne; Modern Operating Systems, 2nd ed., by Tanenbaum; and Operating Systems: Internals and Design Principles, 5th ed., by Stallings)

Multiprocessing and Distributed Computing (a spectrum)

• Many independent problems at same time
  • Similar (e.g., banking & credit card; airline reservations)
  • Different (e.g., university computer center; your own PC)
• One very big problem (too big for one computer)
  • Weather modeling; finite element analysis; drug discovery; gene modeling; weapons simulation; etc.
• Computations that are physically separated
  • Client-server
  • Dispersed and/or peer-to-peer
    – Routing tables for Internet
    – Electric power distribution
    – International banking

Different kinds of computers and operating systems

Inherently Distributed Computation (example)

• Internet routing of packets
  • Network layer: send a packet to its IP address
• Problems
  • No central authority to manage routing in Internet
  • Nodes and links come online or fail rapidly
  • Need to be able to send a packet from any address to any address
• Solution
  • Distributed routing algorithm

Distributed routing algorithm (simplified example)

• Each node “knows” the networks and other nodes directly connected to it
• Each node maintains a table of distant networks
  • [network #, 1st hop, distance]
• Periodically, adjacent nodes exchange tables
• Update algorithm (for each network in table)
  • If (neighbor’s distance to network + my distance to neighbor < my distance to network), then update my table entry for that network
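The update rule above can be sketched in a few lines of Python. This is only an illustration of the simplified rule on the slide, not a real routing protocol: the table layout (a dict from network name to a `(first_hop, distance)` pair), the function name `merge_neighbor_table`, and the three-node topology are all assumptions made for the example.

```python
INF = float("inf")


def merge_neighbor_table(my_table, neighbor, link_cost, neighbor_table):
    """Apply the slide's update rule to one table received from a neighbor.

    Both tables map network -> (first_hop, distance).
    Returns True if any entry of my_table changed.
    """
    changed = False
    for network, (_hop, neighbor_dist) in neighbor_table.items():
        candidate = neighbor_dist + link_cost          # distance via this neighbor
        my_dist = my_table.get(network, (None, INF))[1]
        if candidate < my_dist:                        # strictly better: adopt it
            my_table[network] = (neighbor, candidate)
            changed = True
    return changed


# Hypothetical line topology: A --1-- B --2-- C
table_a = {"netA": ("direct", 0)}
table_b = {"netB": ("direct", 0)}
table_c = {"netC": ("direct", 0)}

merge_neighbor_table(table_b, "C", 2, table_c)   # B learns netC via C
merge_neighbor_table(table_a, "B", 1, table_b)   # A learns netB and netC via B
print(table_a)
```

After the two exchanges, A reaches netC through B at distance 3. Because the rule only ever accepts a shorter distance, this sketch does not react to a route getting worse; handling that case needs machinery beyond what the slide describes.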

Distributed routing algorithm (result)

• All nodes in Internet have reasonably up-to-date routing tables
• Rapid responses to changes in network topology, congestion, failures, etc.
• Very reliable with no central management!
• Network management software
  • Monitoring health of network (e.g., routing tables)
  • Identifying actual or incipient problems
  • Data and statistics for planning purposes

Summary

• Some algorithms are inherently distributed

• Depend upon computations at physically separate places

• Result is the cumulative results of individual computations

• Not much impact on computer or OS architecture

Taxonomy of Parallel Processing

Client-Server Computations

• Very familiar model
• Part of the computation takes place on
  • User’s PC, credit-card terminal, etc.
• … and part takes place on a central server
  • E.g., updating database, search engine, etc.
• Central issues
  • Protocols for efficient communication
  • Where to partition the problem

Very Big Problems

• Lots and lots of calculations
  • Highly repetitive
  • Easily parallelizable into separate subparts
  • (Usually) dependent upon adjacent subparts
• Arrays or clusters of computers
  • Independent memories
  • Very fast communication between elements (close coupling)
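The "dependent upon adjacent subparts" point is easiest to see in a small sketch. Assume a hypothetical one-dimensional smoothing step in which each cell becomes the mean of its two neighbors. When the array is split into chunks, as it would be across nodes with independent memories, each chunk must borrow one boundary ("halo") value from each adjacent chunk. The function names and the chunking scheme below are illustrative assumptions; the chunks run one after another here rather than on separate machines.

```python
def smooth(values):
    """One step: every interior cell becomes the mean of its two neighbors."""
    out = list(values)
    for i in range(1, len(values) - 1):
        out[i] = (values[i - 1] + values[i + 1]) / 2.0
    return out


def smooth_in_chunks(values, n_chunks):
    """Same step, computed chunk by chunk as separate nodes might do it."""
    n = len(values)
    bounds = [n * k // n_chunks for k in range(n_chunks + 1)]
    result = []
    for lo, hi in zip(bounds, bounds[1:]):
        # Halo cells: one value borrowed from each adjacent chunk.
        halo_lo, halo_hi = max(lo - 1, 0), min(hi + 1, n)
        local = smooth(values[halo_lo:halo_hi])
        # Keep only the cells this chunk owns; discard the halo.
        result.extend(local[lo - halo_lo:hi - halo_lo])
    return result


data = [0.0, 4.0, 8.0, 2.0, 6.0, 10.0, 1.0, 3.0]
print(smooth_in_chunks(data, 3) == smooth(data))
```

In a real cluster the halo values would have to cross the interconnect on every step, which is one way to read the slide's emphasis on very fast communication between elements.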

Closely-coupled Systems

• Examples
  – Beowulf clusters: 32-256 PCs
  – ASCI Red: ~1000-4000 PC-like processors
  – BlueGene/L: up to 65,536 processing elements

Taxonomy of Parallel Processing

IBM BlueGene/L

Systems for Closely-Coupled Clusters

• Independent OS for each node
  • E.g., Linux in Beowulf
• Additional middleware for
  • Distributed process space
  • Synchronization among nodes (Silberschatz, Ch 18)
  • Access to network and peripheral devices
  • Failure management
• Parallelizing compilers
  • For partitioning a problem into many processes

Cluster Computer Architecture

Interconnection Topologies

Goal

• Microsecond-level
  – messages
  – synchronization and barrier operations among processes
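For readers unfamiliar with barrier operations, here is a minimal sketch using Python's standard `threading.Barrier`. Threads on one machine stand in for processes on cluster nodes, and the function name `run_phases` and the phase log are assumptions made for the example. It shows only the semantics (no worker starts phase k+1 until every worker has finished phase k); it says nothing about reaching microsecond-level latency.

```python
import threading


def run_phases(n_workers=4, n_phases=3):
    """Each worker records the phase it is in, then waits at the barrier."""
    barrier = threading.Barrier(n_workers)
    log = []
    log_lock = threading.Lock()

    def worker():
        for phase in range(n_phases):
            with log_lock:
                log.append(phase)      # stand-in for this phase's computation
            barrier.wait()             # block until all workers reach this point

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log


print(run_phases())
```

Because every worker appends before waiting, all entries for one phase appear in the log before any entry for the next phase.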

Questions?

Multiprocessing and Distributed Computing (a spectrum)

• Many independent problems at same time
  • Similar (e.g., banking & credit card; airline reservations)
  • Different (e.g., university computer center; your own PC)
• One very big problem (or a few)
• Problems that are physically separated by their very nature

Different kinds of computers and operating systems

Many Separate Computations

• Similar: banking & credit card; airline reservation; …
  • Transaction processing
  • Thousands of small transactions per second
  • Few (if any) communications among transactions
    – Exception: locking, mutual exclusion, serialization
  • Common database
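The one exception noted above, mutual exclusion on the common database, can be illustrated with a small sketch. The `Account` class and `run_transactions` function are hypothetical stand-ins: one lock-protected record plays the role of the shared database, and threads play the role of independent transactions that never talk to each other except through that lock.

```python
import threading


class Account:
    """One record of the shared database, guarded by its own lock."""

    def __init__(self, balance=0):
        self.balance = balance
        self.lock = threading.Lock()


def deposit(account, amount):
    # The read-modify-write below is the critical section.
    with account.lock:
        account.balance = account.balance + amount


def run_transactions(n_clients=4, per_client=500):
    """Run many small, otherwise independent transactions on one record."""
    account = Account()

    def client():
        for _ in range(per_client):
            deposit(account, 1)

    threads = [threading.Thread(target=client) for _ in range(n_clients)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return account.balance


print(run_transactions())
```

With the lock in place the final balance is always `n_clients * per_client`; without it, concurrent read-modify-write sequences could lose updates.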

Requirements

• Lots of disks, enough CPU power
  • Accessible via multiple paths
• High availability
  • Fault-tolerant components & OS support
• Processor independence
  • Any transaction on any processor
• Global file system
• Load-balancing
• Scalable

Options

• Cluster (of a somewhat different sort)

• Shared-memory multi-processor

Taxonomy of Parallel Processing

Cluster Computing

• Scalable to large number of processors
• Common I/O space
  – Multiple channels for disk access
  – Hyper-channel, Fiber-channel, etc.
• Multi-headed disks
  – At least two paths from any processor to any disk
• Remote Procedure Call for communication among processors
  – DCOM, CORBA, Java RMI, etc.
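To make the Remote Procedure Call idea concrete, here is a small sketch using XML-RPC from the Python standard library. XML-RPC is not one of the systems the slide names; it is swapped in only because it is compact enough to show the pattern. The `debit` procedure is hypothetical, and both "nodes" run in one process over the loopback interface, which is an assumption of the example.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def debit(balance, amount):
    """Hypothetical remote procedure: executes on the server side."""
    return balance - amount


def start_server():
    # Port 0 asks the OS for any free port on the loopback interface.
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_function(debit, "debit")
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


server = start_server()
port = server.server_address[1]

# The caller invokes debit() as if it were local; arguments and the
# result are marshalled over the network connection.
with ServerProxy(f"http://127.0.0.1:{port}") as proxy:
    result = proxy.debit(100, 30)

server.shutdown()
server.server_close()
print(result)
```

The call site looks like an ordinary function call, which is the point of RPC: the communication between processors is hidden behind a procedure interface.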

Shared-Memory Multiprocessor

• All processors can access all memory
  • Some memory may be faster than other memory
• Any process can run on any processor
  • Multiple processors executing in same address space concurrently
• Multiple paths to disks (as with cluster)

Shared Memory Multiprocessors – Bus-based

• Bus contention limits the number of CPUs
• Lower bus contention
• Caches need to be synced (big deal)
• Compiler places data and text in private or shared memory

Multiprocessors (2) – Crossbar

• Can support a large number of CPUs
• Non-blocking network
• Cost/performance effective up to about 100 CPUs; cost grows as n²

Multiprocessors (3) – Multistage Switching Networks

• Omega network (blocking)
  – Lower cost, longer latency
  – For n CPUs and n memories: log₂ n stages of n/2 switches
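The cost difference between a crossbar and an omega network follows directly from the two formulas. The sketch below computes both and also shows destination-tag routing as described in Tanenbaum's text (each stage inspects one bit of the destination, most significant first; 0 selects the upper output, 1 the lower). The function names are assumptions for illustration, and n is assumed to be a power of two.

```python
def crossbar_crosspoints(n):
    """An n x n crossbar needs one crosspoint per (CPU, memory) pair."""
    return n * n


def omega_stages(n):
    """Number of stages, log2(n); n must be a power of two and at least 2."""
    if n < 2 or n & (n - 1) != 0:
        raise ValueError("n must be a power of two, at least 2")
    return n.bit_length() - 1


def omega_switches(n):
    """log2(n) stages of n/2 two-by-two switches each."""
    return omega_stages(n) * (n // 2)


def omega_route(dest, n):
    """Per-stage output choice for a message headed to memory `dest`.

    Stage i looks at bit i of dest, most significant bit first:
    0 means take the upper output, 1 means take the lower output.
    """
    stages = omega_stages(n)
    return [(dest >> (stages - 1 - i)) & 1 for i in range(stages)]


for n in (8, 64, 1024):
    print(n, crossbar_crosspoints(n), omega_switches(n))
print(omega_route(6, 8))
```

For 1024 CPUs and memories the crossbar needs over a million crosspoints while the omega network needs 5120 switches, at the price of more stages per access and possible blocking when two messages need the same switch output.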

Types of Multiprocessors – UMA vs. NUMA

• UMA (Uniform Memory Access)
  – All memory is equal
  – Familiar programming model
  – Number of processors is limited
    • 8-32
  – Symmetrical
    • No processor is different from others
• NUMA (Non-Uniform Memory Access)
  – Single address space visible to all CPUs
  – Access to remote memory via commands
    • LOAD & STORE
    • Remote memory access slower than local

Common SMP Implementations

• Multiple processors on same board or bus
• Dual core and multi-core processors
  • I.e., two or more processors on same chip
  • Share same L2 cache
• Hyperthreading
  • One processor shares its circuitry among two threads or processes
  • Separate registers, PSW, etc.
• Combinations of above

Hardware issue – Cache Synchronization

• Each processor cache monitors memory accesses on bus (i.e., “snoops”)
  – Memory updates are propagated to other caches
  – Complex cache management hardware
  – Order of operations may be ambiguous
• Some data must be marked as not cacheable
  – Visible to programming model
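A toy model can make the snooping idea more tangible. The sketch below assumes a write-through, write-update scheme in which every write appears on a shared bus and every other cache that already holds the address copies the new value. The `Bus` and `Cache` classes are invented for illustration; real coherence hardware is far more involved, as the slide notes.

```python
class Bus:
    """Shared bus: carries every write to memory and to all snooping caches."""

    def __init__(self):
        self.memory = {}
        self.caches = []

    def broadcast_write(self, writer, addr, value):
        self.memory[addr] = value              # write-through to main memory
        for cache in self.caches:
            if cache is not writer:
                cache.snoop(addr, value)


class Cache:
    """Per-processor cache that watches ("snoops") writes on the bus."""

    def __init__(self, bus):
        self.lines = {}
        self.bus = bus
        bus.caches.append(self)

    def read(self, addr):
        if addr not in self.lines:             # miss: fetch from main memory
            self.lines[addr] = self.bus.memory.get(addr, 0)
        return self.lines[addr]

    def write(self, addr, value):
        self.lines[addr] = value
        self.bus.broadcast_write(self, addr, value)

    def snoop(self, addr, value):
        if addr in self.lines:                 # update only lines we hold
            self.lines[addr] = value


bus = Bus()
cpu0, cpu1 = Cache(bus), Cache(bus)
first = cpu0.read(0x10)      # cpu0 caches the initial value
cpu1.write(0x10, 42)         # cpu1's write is seen on the bus by cpu0
second = cpu0.read(0x10)     # served from cpu0's cache, already updated
print(first, second)
```

Without the `snoop` step, the second read on `cpu0` would return the stale cached value, which is the inconsistency the hardware exists to prevent.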

Operating System Considerations

• Master-slave vs. full symmetry
• Explicit cache flushing
  • When page is replaced or invalidated
• Processor affinity of processes
  • Context switches across processors can be expensive
• Synchronization across processors
• Interrupt handling

Multiprocessor OS – Master-Slave

• One CPU (master) runs the OS and applies most policies
• Other CPUs
  – Run applications
  – Minimal OS to acquire and terminate processes
• Relatively simple OS
• Master processor can become a bottleneck for a large number of slave processors

Multiprocessor OS – Symmetric Multi-Processor (SMP)

• Any processor can execute the OS and any applications
• Synchronization within the OS is the issue
  1. Lock the whole OS: poor utilization, long queues waiting to use OS
  2. OS critical regions: much preferred
    – Identify independent OS critical regions that can be executed independently; protect with mutex
    – Identify independent critical OS tables; protect access with mutex
    – Design OS code to avoid deadlocks
    – The art of the OS designer
    – Maintenance requires great care
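One common way to combine per-table mutexes with deadlock avoidance is to give every lock a fixed rank and always acquire locks in rank order. The slides do not say which technique the course had in mind, so treat the sketch below as one possible illustration; `KernelTable`, `lock_in_order`, and `move_entry` are hypothetical names, and threads stand in for processors executing OS code.

```python
import threading
from contextlib import ExitStack


class KernelTable:
    """A hypothetical OS table with its own mutex and a fixed lock rank."""

    def __init__(self, name, rank):
        self.name = name
        self.rank = rank
        self.lock = threading.Lock()
        self.entries = {}


def lock_in_order(*tables):
    """Acquire the tables' locks in ascending rank, whatever order was asked."""
    stack = ExitStack()
    for table in sorted(tables, key=lambda t: t.rank):
        stack.enter_context(table.lock)
    return stack          # releasing happens when the with-block exits


def move_entry(src, dst, key):
    with lock_in_order(src, dst):
        if key in src.entries:
            dst.entries[key] = src.entries.pop(key)


def demo(n=200):
    procs = KernelTable("process table", rank=1)
    files = KernelTable("file table", rank=2)
    procs.entries = {f"p{i}": i for i in range(n)}
    files.entries = {f"f{i}": i for i in range(n)}

    def forward():
        for i in range(n):
            move_entry(procs, files, f"p{i}")

    def backward():
        for i in range(n):
            move_entry(files, procs, f"f{i}")

    threads = [threading.Thread(target=forward), threading.Thread(target=backward)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return procs, files


procs, files = demo()
print(len(procs.entries), len(files.entries))
```

The two threads name the tables in opposite orders, which is the classic recipe for deadlock; sorting by rank means both take the process-table lock first, so neither can hold one lock while waiting for the other.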

Multiprocessor OS – SMP (continued)

• Multiprocessor synchronization
  – Need special instructions: test-and-set
  – Spinlocks are common
    • Can context switch if time in critical region is greater than context switch time
    • OS designer must understand the performance of OS critical regions
    • Context switch time could be onerous
      – Data cached on one processor needs to be re-cached on another
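The relationship between test-and-set and a spinlock can be sketched in Python, with a loud caveat: Python has no test-and-set instruction, so a non-blocking `threading.Lock.acquire` is used here as a stand-in for the atomic hardware operation. The `SpinLock` class and `run_counter` function are assumptions for illustration only.

```python
import threading


class SpinLock:
    """Spinlock built on an emulated test-and-set (illustration only)."""

    def __init__(self):
        # Stand-in for one memory word operated on by an atomic instruction.
        self._flag = threading.Lock()

    def test_and_set(self):
        """Atomically set the flag and return its OLD value (True = was set)."""
        return not self._flag.acquire(blocking=False)

    def acquire(self):
        while self.test_and_set():     # busy-wait until the old value was clear
            pass

    def release(self):
        self._flag.release()           # clear the flag


def run_counter(n_threads=4, per_thread=500):
    lock = SpinLock()
    total = [0]

    def worker():
        for _ in range(per_thread):
            lock.acquire()
            total[0] += 1              # critical region: keep it short
            lock.release()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total[0]


print(run_counter())
```

The busy-wait loop burns CPU time while the lock is held elsewhere, which is why the slide ties the choice between spinning and context switching to how long the critical region takes.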

Multiprocessor Scheduling

• When processes are independent (e.g., timesharing)
  – Allocate CPU to highest priority process
  – Tweaks
    • For a process with a spinlock, let it run until it releases the lock
    • To reduce TLB and memory cache flushes, try to run a process on the same CPU each time it runs
• For groups of related processes
  – Attempt to simultaneously allocate CPUs to all related processes (space sharing)
  – Run all threads to termination or block
  – Gang schedule: apply a scheduling policy to related processes together

Linux SMP Support

• Multi-threaded kernel
  • Any processor can handle any interrupt
  • Interrupts rarely disabled, and only for shortest time possible
  • Multiple processors can be active in system calls concurrently
• Process vs. interrupt context
  • Process context: when processor is acting on behalf of a specific process
  • Interrupt context: when processor is acting on behalf of no particular process
• Extensive use of spinlocks
• Processor affinity properties in task_struct
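From user space, the affinity information the kernel keeps per task can be read and changed through the `sched_getaffinity` and `sched_setaffinity` system calls, which Python exposes in the `os` module on Linux. The helper below is a hedged sketch: `run_pinned` is a hypothetical name, and on platforms without these calls it simply runs the function unpinned.

```python
import os


def run_pinned(func, cpu=None):
    """Run func() with the caller restricted to one CPU, then restore.

    Returns (result, affinity_mask_used); the mask is None where the
    affinity calls are unavailable (for example, outside Linux).
    """
    if not hasattr(os, "sched_setaffinity"):
        return func(), None
    original = os.sched_getaffinity(0)           # 0 means the caller itself
    target = min(original) if cpu is None else cpu
    os.sched_setaffinity(0, {target})
    try:
        return func(), os.sched_getaffinity(0)
    finally:
        os.sched_setaffinity(0, original)        # put the old mask back


value, mask = run_pinned(lambda: sum(range(10)))
print(value, mask)
```

Pinning keeps the work on one processor's caches and TLB, which is the benefit the scheduling slides attribute to processor affinity; the cost is that the scheduler loses freedom to balance load.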

WPI Computing & Communications Center

• CCC cluster – 11 dual 2.4 GHz Xeon processors

• toth – 16-node 1.5 GHz Itanium2

• big16 – 8 dual-core 2.4 GHz Opterons

• toaster – NetApp 5.5 TByte Fiber Channel RAID NFS

• breadbox – NetApp 12 TByte RAID NFS

• Many others

• All connected together by very fast networks

Questions?

