Parallel Computing Models & Techniques

Post on 02-Jan-2016



About Me

• Microsoft MVP
• Intel Blogger
• TechEd Israel, TechEd Europe
• Expert C++ Book
• http://AsyncOp.com
• http://Asaf.Shelly.co.il

Parallel Computing

• Multi-Core
• Distributed Systems
• SOA & WebServices
• Transaction, Session, Queue, Event, Interrupt
• User Experience over User Interface
• Maximize performance: No Free Work Unit
• Best performance: No I/O Wait

Advantages of Multi-Core

• Low Power Consumption
• Extended battery life
• Less heating
• Smaller and lighter devices
• Software replaces custom hardware!

Signaling In Hardware

Error Detection

CPU <-> RAM

Request: Read Address
Wait: Preparing Data
Response: Data

Signaling In Hardware

Error Detection

CPU <-> RAM

Interrupt: Data Pending
Interrupt: Processing Complete
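The two signaling styles above can be sketched in software. This is only an analogy, not a hardware model: the `RAM` dictionary, the 10 ms delay, and the function names are assumptions made for illustration. The blocking read follows the Request / Wait / Response sequence, while the callback stands in for the "Data Pending" interrupt.

```python
import time
from concurrent.futures import ThreadPoolExecutor

RAM = {0x10: 57}  # hypothetical memory contents, for illustration only


def read_ram(address):
    time.sleep(0.01)  # stands in for "Wait: Preparing Data"
    return RAM[address]


def blocking_read(address):
    # Request / Wait / Response: the caller is stalled for the whole wait.
    return read_ram(address)


def interrupt_read(pool, address, on_data_pending):
    # The request is issued and the caller returns at once; the callback
    # plays the role of the "Data Pending" interrupt handler.
    future = pool.submit(read_ram, address)
    future.add_done_callback(lambda f: on_data_pending(f.result()))
    return future
```

With the callback version the caller is free to do other work between issuing the request and being notified.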

Software Locks?

• Locks Are BAD!
• By design, a lock forces serial work
• Using a resource on a single core
• Use a lock only when you want to use one core at a time and eliminate parallel work
• Locks can be used on single steps, for example the entrance to a queue

Locks Are BAD!

• Can you find the bug??

Lock( MUTEX_A )
Buffer_A [ 12 ] = 23;
// here Buffer_A [ 12 ] is 57 !!!!
Unlock( MUTEX_A )

• Would you find it with a code review?
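The slide does not give the answer. One plausible cause, assumed here, is that some other code path writes to the buffer without taking the mutex, so the lock protects nothing. The sketch below reproduces that situation; the names `careful_writer` and `rogue_writer` are hypothetical, and the two events exist only to force the unlucky interleaving so the result is repeatable.

```python
import threading


def run_demo():
    mutex_a = threading.Lock()
    buffer_a = [0] * 16
    wrote_23 = threading.Event()
    wrote_57 = threading.Event()
    observed = []

    def careful_writer():
        with mutex_a:
            buffer_a[12] = 23
            wrote_23.set()
            # Forces the unlucky interleaving so the demo is repeatable.
            wrote_57.wait(timeout=1)
            observed.append(buffer_a[12])  # read while still holding the lock

    def rogue_writer():
        wrote_23.wait(timeout=1)
        buffer_a[12] = 57  # BUG: writes without taking mutex_a
        wrote_57.set()

    threads = [threading.Thread(target=careful_writer),
               threading.Thread(target=rogue_writer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return observed[0]


print(run_demo())
```

The locked section looks correct in isolation, which is why a code review of that section alone would probably miss the problem.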

Lock = Stop

Need Lock-Free Solutions!

Protecting A Resource

• Lock as a way to share ownership
• Using a single owner
– Owner Thread
– Owner Task
– TPL Agent
– Device Driver
– Owner Service
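A minimal sketch of the "Owner Thread" option, under the assumption that all access goes through messages: only one thread ever touches the buffer, and other threads send it requests. The message layout `(op, index, value, reply)` and the function names are made up for this example.

```python
import queue
import threading


def owner_thread(requests):
    buffer = [0] * 16  # only this thread ever touches the buffer
    while True:
        message = requests.get()
        if message is None:  # shutdown signal
            break
        op, index, value, reply = message
        if op == "write":
            buffer[index] = value
        reply.put(buffer[index])  # answer both reads and writes


def demo():
    requests = queue.Queue()
    owner = threading.Thread(target=owner_thread, args=(requests,))
    owner.start()
    reply = queue.Queue()
    requests.put(("write", 12, 23, reply))
    requests.put(("read", 12, None, reply))
    results = [reply.get(timeout=1), reply.get(timeout=1)]
    requests.put(None)
    owner.join()
    return results
```

Note that `queue.Queue` still locks internally, but only around the short step of entering and leaving the queue, never around the work on the buffer itself.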

Asynchronous Work Without Locks

• Phone as Synchronous System
• Phone as Asynchronous System
• Mail and Email System
• Order Pizza

Unprotected Parallel Access To Data

• Two Writers or Writer and Reader

[Diagram: a Writer above a buffer of A values, with a second Writer and a Reader below it]

Race Condition - Location

• Two Writers or Writer and Reader

[Diagram: a Writer above a buffer of A values, with a second Writer and a Reader below it]

Race Condition – Timeframe

• Collision over the same communication line

[Diagram: two Writers and a Reader on one communication line carrying the value A]

Race Condition - Sequence

• Bugs in Parallel Pipeline

Intended order (buffer starts as 123):

Clear Buffer
Add ABCDE -> ABCDE
Add X -> ABCDEX
Add ‘1’ To ASCII -> BCDEFY

Wrong order (buffer starts as 123):

Add ABCDE -> 123ABCDE
Clear Buffer
Add ‘1’ To ASCII
Add X -> X

TCP, CJP, PF
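The sequence bug on this slide can be replayed as a small sketch: the same four pipeline steps, applied in two different orders to a buffer that starts as "123". The function names are chosen for this example; the step behavior follows the slide.

```python
def clear_buffer(buf):
    return ""


def add_abcde(buf):
    return buf + "ABCDE"


def add_x(buf):
    return buf + "X"


def add_one_to_ascii(buf):
    # Shift every character to the next ASCII code (A -> B, X -> Y).
    return "".join(chr(ord(c) + 1) for c in buf)


def run(stages, initial="123"):
    buf = initial
    for stage in stages:
        buf = stage(buf)
    return buf


intended = [clear_buffer, add_abcde, add_x, add_one_to_ascii]
reordered = [add_abcde, clear_buffer, add_one_to_ascii, add_x]

print(run(intended))   # BCDEFY
print(run(reordered))  # X
```

Every step is correct on its own; only the order in which the steps reach the buffer differs.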

Race Condition Solutions

• Wave-In Signal – Manager (e.g. USB BUS)
• Pass Ownership (Token Ring, MUTEX)
• TDM
• Burst Write, Retry Read (e.g. SeqLock, Reader-Writer Lock, Network Layer 2)
• Write and Verify
• Queue
• Transaction – A Sequence
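As an illustration of "Burst Write, Retry Read", here is a minimal SeqLock-style sketch. It is a teaching model, not a production implementation: the class layout and the two fields `x` and `y` are assumptions, and a real SeqLock in C or C++ also needs memory barriers, which this sketch does not show. The reader never blocks the writer; it simply retries if a write overlapped its read.

```python
import threading


class SeqLock:
    def __init__(self, x, y):
        self._seq = 0  # even: stable, odd: a write is in progress
        self._x, self._y = x, y
        self._writers = threading.Lock()  # serializes writers only

    def write(self, x, y):
        with self._writers:
            self._seq += 1  # now odd: readers will retry
            self._x = x
            self._y = y
            self._seq += 1  # even again: data is consistent

    def read(self):
        while True:
            start = self._seq
            if start % 2:
                continue  # a write is in progress, retry
            x, y = self._x, self._y
            if self._seq == start:
                return x, y  # no write overlapped this read
```

Readers take no lock at all, so the approach suits data that is read often and written rarely.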

Serial Problem with Communication

• Transaction based Ping-Pong

Computer <-> USB Device

Packet Request

Packet Data

Acknowledge

Packet Request

Packet Data

Acknowledge

Parallel Solution for Ping-Pong

• Collected Transaction

Request List

Packet Data A

Packet Data B

Packet Data C

Packet Data D

Retransmit B

Ack List

Computer <-> USB Device
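The difference between the ping-pong exchange and the collected transaction can be sketched by listing the messages each one produces. This is a toy model with made-up function names; it does not describe the real USB protocol, and the lost packet is simulated.

```python
def ping_pong(names):
    # One request, one data packet and one acknowledge per item, in lockstep.
    log = []
    for name in names:
        log += [f"Request {name}", f"Data {name}", f"Acknowledge {name}"]
    return log


def collected(names, lost=()):
    # One request list up front, data packets back to back, one ack list.
    log = ["Request List " + ",".join(names)]
    received = []
    for name in names:
        log.append(f"Data {name}")  # sent, but possibly lost on the way
        if name not in lost:
            received.append(name)
    for name in [n for n in names if n not in received]:
        log.append(f"Retransmit {name}")
        log.append(f"Data {name}")
        received.append(name)
    log.append("Ack List " + ",".join(sorted(received)))
    return log


print(len(ping_pong(list("ABCD"))))       # 12 messages
print(collected(list("ABCD"), lost={"B"}))
```

In the collected form the data packets do not each wait for their own request and acknowledge, so the line is not idle between them.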

Cancel Operation

• Search For File

Request List

Packet Data A

Data found in B

Packet Data B

Packet Data C

Acknowledge A

Packet Data D

Abort

Packet Data E

Acknowledge

Computer <-> USB Device
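A minimal sketch of the cancel idea, assuming a shared flag that any worker can set: once one worker finds the data, it raises the flag and the remaining work is abandoned. The block names and the `search` function are hypothetical, and the slide's in-flight packets (C and D arriving before the abort is processed) are not modeled.

```python
import threading


def search(blocks, target, cancel):
    """Scan blocks in order; stop as soon as the cancel flag is set."""
    scanned = []
    for name, data in blocks:
        if cancel.is_set():
            break  # "Abort": someone already found the data
        scanned.append(name)
        if target in data:
            cancel.set()  # tell every other worker to stop
            return name, scanned
    return None, scanned


cancel = threading.Event()
blocks = [("A", "foo"), ("B", "needle"), ("C", "bar"), ("D", "baz")]
print(search(blocks, "needle", cancel))  # ('B', ['A', 'B'])
```

A second worker that shares the same `cancel` event would stop before scanning anything once the flag is set.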

Object Oriented Design

• Definition Of Objects
• Object Relations
• Object Reusability
• Object Management
• Object Oriented Block Diagram
• Object Oriented System Design
• Avoid “Spaghetti Code”

Procedural Design

• Definition Of State
• Procedure Relations
• Procedure Reusability
• Flow Control Management
• Poor Block Diagram
• Limited System Design
• Avoid “Spaghetti Flow”

Good Application Design


Queue

• Pass Data Without Using Lock
• Full Asynchronous Operation
• Event With Data
• Event With Priority
• Event With Destination
• Structured Event vs. Stream
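The "event with data, priority and destination" idea can be sketched with a priority queue of structured events. The `Event` fields and the `dispatch_all` helper are assumptions for this example; a lower `priority` number is treated as more urgent, and a sequence counter keeps events of equal priority in arrival order.

```python
import itertools
import queue
from dataclasses import dataclass, field
from typing import Any

_arrival = itertools.count()


@dataclass(order=True)
class Event:
    priority: int  # lower number = handled first
    seq: int = field(default_factory=lambda: next(_arrival))  # tie-breaker
    destination: str = field(default="", compare=False)
    data: Any = field(default=None, compare=False)


def dispatch_all(events, handlers):
    # Deliver each event's data to the handler registered for its destination.
    while not events.empty():
        event = events.get()
        handlers[event.destination](event.data)


events = queue.PriorityQueue()
events.put(Event(5, destination="log", data="low"))
events.put(Event(1, destination="ui", data="urgent"))
events.put(Event(5, destination="log", data="low again"))

delivered = []
dispatch_all(events, {
    "ui": lambda d: delivered.append(("ui", d)),
    "log": lambda d: delivered.append(("log", d)),
})
print(delivered)
```

Producers only enqueue events and never touch the consumer's data, which is what lets the sender continue without waiting.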

Flow Control

• Keep Internal State
• Object State
• Execution Phase
• Collection of State as System State
• System State for Debug

Task Management

• Stack – Hardware Accelerated Management
• Fork
• Software Stack Management
• Session
• Task Groups

Software Dispatcher

• .Net Parallel Extensions
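The slide names the .NET Parallel Extensions. As a rough analogue only, and not that API, the same dispatcher idea can be sketched with Python's standard thread pool: the caller hands over a list of work items and the pool decides which worker runs each one. The `work` function is a placeholder.

```python
from concurrent.futures import ThreadPoolExecutor


def work(n):
    return n * n  # placeholder for a real unit of work


def dispatch(items, workers=4):
    # The pool acts as the software dispatcher: it assigns items to
    # worker threads and returns results in the original order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(work, items))


print(dispatch(range(5)))  # [0, 1, 4, 9, 16]
```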

Network Dispatcher

• Load Balancing

[Diagram: Firewalls and Load Balance Front Ends]
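The slide does not say which balancing policy is used. Assuming the simplest one, round robin, a front end can be sketched as follows; the class name and server names are hypothetical.

```python
import itertools


class RoundRobinBalancer:
    def __init__(self, servers):
        self._servers = itertools.cycle(servers)  # endless rotation

    def route(self, request):
        # Pick the next server in turn, regardless of its current load.
        return next(self._servers), request


balancer = RoundRobinBalancer(["server-1", "server-2"])
print([balancer.route(r)[0] for r in ("a", "b", "c")])
# ['server-1', 'server-2', 'server-1']
```

Real load balancers often weigh in server load or session affinity; round robin is used here only because it is the shortest policy to show.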

Hardware Dispatcher

• 10 Gbps Network Switch

10 GHz Network Dispatcher / MUX

2.67 GHz CORE | 2.67 GHz CORE | 2.67 GHz CORE | 2.67 GHz CORE

Cloud Dispatcher

• Microsoft Server 2008 HPC

Parallel Memory?

Task Oriented Design

Operation: Setting up a Tent

Task: locate items in storage

Task: carry items to build site

Task: use items to build tent

Execution Timeline

[Chart: Output over Time for the tasks Locate, Carry, Use and the items Pole, Fabric, Wires]

Horizontal Division

[Chart: time axis]

Vertical Division

[Chart: time axis]

Resource Partitioning

Force Duplication

• Entire Process
• Sharing Resources
• Flow Barriers
• Simple to implement
• Simple Affinity
• Simple Priority
• No Optimization

Pipeline

• Functional
• Resources Ownership
• Communication Barriers
• Requires Design
• Affinity Planning
• Priority Planning
• Optimization
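The two partitioning styles can be compared on the tent example from the earlier slides. This is a sketch under assumptions: the three task functions are stand-ins that only relabel a string, and the thread and queue layout is one possible arrangement. In "Force Duplication" every worker runs the entire process for one item; in "Pipeline" each worker owns one task and items flow between workers through queues.

```python
import queue
import threading
from concurrent.futures import ThreadPoolExecutor

ITEMS = ["Pole", "Fabric", "Wires"]


def locate(item):
    return f"located {item}"


def carry(item):
    return item.replace("located", "carried")


def use(item):
    return item.replace("carried", "used")


def force_duplication(items):
    # Every worker runs the entire process for one item.
    def whole_process(item):
        return use(carry(locate(item)))
    with ThreadPoolExecutor(max_workers=len(items)) as pool:
        return list(pool.map(whole_process, items))


def stage(task, inbox, outbox):
    # One worker per task; None marks the end of the stream.
    while True:
        item = inbox.get()
        if item is None:
            outbox.put(None)
            return
        outbox.put(task(item))


def pipeline(items):
    q0, q1, q2, q3 = (queue.Queue() for _ in range(4))
    workers = [threading.Thread(target=stage, args=args)
               for args in ((locate, q0, q1), (carry, q1, q2), (use, q2, q3))]
    for w in workers:
        w.start()
    for item in items:
        q0.put(item)
    q0.put(None)
    results = []
    while (result := q3.get()) is not None:
        results.append(result)
    for w in workers:
        w.join()
    return results
```

Both versions produce the same output; the pipeline needs more design up front but gives each stage clear ownership of its resources.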

Super Networks and Grids

• Multiple Reads
• Multiple Writes
• Replication Time
• Replication Overhead
• Network Consistency
• Data Snapshot
• Real Time

Super Networks

Thank You