M S 13 DD R Performance Barrier Mechanisms

Date post: 06-Apr-2018
Upload: bomezzz-enterprises
Transcript
  • 8/3/2019 M S 13 DD R Performance Barrier Mechanisms

    1/14

    Parallel and Distributed Simulation

    Deadlock Detection & Recovery: Performance

    Barrier Mechanisms

    Outline

    Deadlock Detection and Recovery Algorithm

    Empirical performance measurements

    Synchronous Algorithms

    Barrier mechanisms

    Centralized Barriers

    Tree Barrier

    Butterfly Barrier

    Performance

    T = arrival time of job

    Q = waiting time in queue

    S = service time

    Example: Tandem first-come-first-serve queues

    Classical approach: lookahead?

    [Figure: timeline across LP1 and LP2 with labels arrival event, begin service, departure event, arrival event, and times T, T+Q, T+Q+S]

    Optimized to exploit lookahead

    [Figure: timeline across LP1 and LP2 with two arrival events and time labels T, T+Q, T+Q+S]

    Maintain variable indicating departure time of previous job
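A minimal sketch of the optimized approach for one first-come-first-serve server, assuming the service time S of a job is already known when the job arrives. The function name and argument layout are hypothetical and only illustrate the bookkeeping.

```python
def departure_times(arrivals, service_times):
    """Return the departure time T + Q + S of each job at a FCFS server.

    arrivals: nondecreasing arrival times T.
    service_times: service time S of each job (assumed known on arrival).
    """
    prev_departure = 0.0  # variable holding the departure time of the previous job
    departures = []
    for t, s in zip(arrivals, service_times):
        # Waiting time Q: how long the job sits behind the previous job.
        q = max(0.0, prev_departure - t)
        # T + Q + S is known at arrival time T, so the arrival event for
        # the next queue can be scheduled immediately.
        prev_departure = t + q + s
        departures.append(prev_departure)
    return departures
```

Because the departure time is computed at time T, the sending LP has a lookahead of Q + S for that job instead of waiting for separate begin-service and departure events.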

    Efficiency of Queueing Network Simulation

    [Figure: messages per deadlock (log scale, 1 to 10000) versus number of jobs in network (1 to 256). Parallel simulation of a central server queueing network (diagram labels: merge, fork), deadlock detection and recovery algorithm, 5 processors. Series: Optimized (deterministic service time), Optimized (exponential service time), Classical (deterministic service time), Classical (exponential service time)]

    Speedup of Queueing Network Simulation

    [Figure: speedup (0 to 4) versus number of jobs in network (1 to 256). Deadlock detection and recovery algorithm, 5 processors. Series: Optimized (deterministic service time), Optimized (exponential service time), Classical (deterministic service time), Classical (exponential service time)]

    Exploiting lookahead is essential to obtain good performance

    Synchronous Execution

    Basic idea: each process cycles through the following steps:

      Determine the events that are safe to process

      Process events, exchange messages

      Global synchronization (barrier)

    Messages generated in one cycle are not eligible for processing until the next cycle

    Issues

      Barrier mechanism, transient messages

      Determining safe events
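The three-step cycle can be sketched with Python threads and `threading.Barrier`. This is only an illustration under stated assumptions: the double-buffered mailboxes, the ring-shaped message pattern, and the rule that everything delivered before the last barrier is safe are all hypothetical simplifications, and the question of how safe events are really determined is not addressed.

```python
import threading

def run_synchronous(num_procs=4, num_cycles=3):
    """Run synchronous cycles; return (process, cycle processed, cycle sent) records."""
    barrier = threading.Barrier(num_procs)
    lock = threading.Lock()
    # Double-buffered mailboxes: inbox[cycle % 2] is read in this cycle,
    # inbox[(cycle + 1) % 2] collects messages for the next cycle.
    inbox = [[[] for _ in range(num_procs)] for _ in range(2)]
    inbox[0][0].append(("job", -1))  # seed event: (payload, cycle it was sent in)
    log = []

    def worker(pid):
        for cycle in range(num_cycles):
            cur, nxt = cycle % 2, (cycle + 1) % 2
            # Step 1: determine safe events (assumed: everything delivered
            # before the last barrier is safe).
            events, inbox[cur][pid] = inbox[cur][pid], []
            # Step 2: process events and exchange messages.
            for payload, sent_in in events:
                with lock:
                    log.append((pid, cycle, sent_in))
                    inbox[nxt][(pid + 1) % num_procs].append((payload, cycle))
            # Step 3: global synchronization.
            barrier.wait()

    threads = [threading.Thread(target=worker, args=(p,)) for p in range(num_procs)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(log)
```

In every record returned, the cycle in which a message is processed is one later than the cycle in which it was sent, which is the eligibility rule stated above.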

    Barrier Synchronization

    Barrier Synchronization: when a process invokes the barrier primitive, it will block until all other processors have also invoked the barrier primitive.

    When the last process invokes the barrier, all processes can execute forward

    [Figure: wallclock-time diagram for processes 1 to 4; each process waits at its barrier call until the last process reaches the barrier]

    Barrier Implementation

    Centralized Message-Passing Approach

    Central controller used to implement barrier

    2 step process

      Determine when barrier reached

      Broadcast message to release processes from the barrier

    Barrier primitive for non-controller processes:

      Send a message to central controller

      Wait for a reply

    Barrier primitive for controller process

      Receive barrier messages from other processes

      When a message is received from each process, broadcast message to release barrier

    Performance

      Controller must send and receive N-1 messages

      Potential bottleneck
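The two barrier primitives can be sketched with threads standing in for processes and `queue.Queue` objects standing in for message channels. The choice of process 0 as controller and all names are assumptions made for the illustration.

```python
import queue
import threading

def centralized_barrier_demo(n=4):
    """Run one centralized barrier among n processes; process 0 is the controller."""
    to_controller = queue.Queue()                 # channel into the controller
    release = [queue.Queue() for _ in range(n)]   # one reply channel per process
    counts = {"received": 0, "sent": 0}           # updated by the controller only
    passed = []
    lock = threading.Lock()

    def controller():
        # Step 1: determine when the barrier is reached.
        for _ in range(n - 1):
            to_controller.get()
            counts["received"] += 1
        # Step 2: broadcast a message to release the other processes.
        for pid in range(1, n):
            release[pid].put("release")
            counts["sent"] += 1
        with lock:
            passed.append(0)

    def worker(pid):
        to_controller.put(pid)   # send a message to the central controller
        release[pid].get()       # wait for a reply
        with lock:
            passed.append(pid)

    threads = [threading.Thread(target=controller)]
    threads += [threading.Thread(target=worker, args=(p,)) for p in range(1, n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counts, sorted(passed)
```

The counters show the N-1 receives and N-1 sends that all go through the single controller, which is where the bottleneck comes from as N grows.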

    Broadcast Barrier

    1 step approach

    Each process broadcasts message when it reaches barrier

    Wait until a message is received from each other process

    N (N-1) messages

    [Figure: message exchange among processes 0 to 3]
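A small sequential sketch that counts the messages of the one-step approach. It does not model timing or concurrency, and the function name is hypothetical.

```python
def broadcast_barrier_messages(n):
    """Simulate one broadcast barrier; return (messages sent, all released)."""
    received = {p: set() for p in range(n)}
    sent = 0
    for src in range(n):
        # On reaching the barrier, src sends a message to every other process.
        for dst in range(n):
            if dst != src:
                received[dst].add(src)
                sent += 1
    # A process leaves the barrier once it has heard from each other process.
    released = all(len(received[p]) == n - 1 for p in range(n))
    return sent, released
```

For N = 4 this gives 12 messages, matching N (N-1).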

    Tree Barrier

    Organize processes into a tree

    A process sends a message to its parent process when

    The process has reached the barrier point, and

    A message has been received from each of its children processes

    Root detects completion of barrier, broadcasts message to release processes (e.g., send messages down tree)

    2 log N time if all processes reach barrier at same time

    [Figure: tree of processes numbered 0 to 13]
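A sketch of the timing argument, assuming a binary tree with heap-style numbering (the parent of process p is (p - 1) // 2), one time unit per message, and every process reaching the barrier at time 0. These assumptions and the function name are illustrative and are not taken from the slides.

```python
def tree_barrier_time(n):
    """Return (time root detects completion, time last process is released, messages)."""
    children = {p: [c for c in (2 * p + 1, 2 * p + 2) if c < n] for p in range(n)}
    # Upward phase: a process can notify its parent once it has reached the
    # barrier (time 0) and has heard from each of its children.
    ready = [0] * n
    for p in reversed(range(n)):          # children have larger indices than parents
        ready[p] = max((ready[c] + 1 for c in children[p]), default=0)
    # Downward phase: the root releases its children, which release theirs.
    released = [0] * n
    released[0] = ready[0]
    for p in range(1, n):
        released[p] = released[(p - 1) // 2] + 1
    messages = 2 * (n - 1)                # one message up and one down per tree edge
    return ready[0], max(released), messages
```

For 8 processes the tree has depth 3, so the root detects completion at time 3 and the last process is released at time 6, consistent with the 2 log N estimate.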

    Butterfly Barrier

    N processes (here, assume N is a power of 2)

    Sequence of $\log_2 N$ pairwise barriers (let $k = \log_2 N$)

    Pairwise barrier:

      Send message to partner process

      Wait until message is received from that process

    Process p: $b_k b_{k-1} \cdots b_1$ = binary representation of p

    Step i: perform barrier with process $b_k \cdots \bar{b}_i \cdots b_1$ (complement ith bit of the binary representation)

    Example: Process 3 (011)

      Step 1: pairwise barrier with process 2 (010)

      Step 2: pairwise barrier with process 1 (001)

      Step 3: pairwise barrier with process 7 (111)
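Complementing the ith bit is an exclusive-or with a one-bit mask, so the partner at each step can be computed directly. The function names below are hypothetical; bits are numbered from 1 at the least significant position, as in the example above.

```python
def butterfly_partner(p, i):
    """Partner of process p at step i: p with its ith bit complemented."""
    return p ^ (1 << (i - 1))

def butterfly_schedule(p, n):
    """Partners of process p over all log2(n) steps; n must be a power of 2."""
    k = n.bit_length() - 1
    if n < 1 or (1 << k) != n:
        raise ValueError("n must be a power of 2")
    return [butterfly_partner(p, i) for i in range(1, k + 1)]
```

For example, `butterfly_schedule(3, 8)` gives the partners 2, 1, 7 listed for process 3.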

    Butterfly Barrier Example

    [Figure: two panels for processes 0 to 7 over wallclock time. First panel: pairwise exchanges at step 1, step 2, and step 3. Second panel: groups synchronized after each step: 0,1 2,3 4,5 6,7 after step 1; 0-3 4-7 after step 2; 0-7 after step 3]

    The communication pattern forms a tree from the perspective of any process

    Butterfly: Superimpose Trees

    [Figure: trees over processes 0 to 7 superimposed, one per process]

    After $\log_2 N$ steps each process is notified that the barrier operation has completed
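The notification claim can be checked with a sequential sketch that tracks, for each process, the set of processes it knows have reached the barrier. Merging the two partners' sets stands in for the message exchange; the function name is hypothetical and timing is not modeled.

```python
def butterfly_barrier_rounds(n):
    """Return, for each step, the sorted set of arrivals known to every process."""
    k = n.bit_length() - 1
    if n < 1 or (1 << k) != n:
        raise ValueError("n must be a power of 2")
    known = [{p} for p in range(n)]       # initially each process knows only itself
    history = []
    for i in range(k):
        # Pairwise barrier with the process whose bit i differs: afterwards
        # both partners know the union of what they knew before.
        known = [known[p] | known[p ^ (1 << i)] for p in range(n)]
        history.append([sorted(s) for s in known])
    return history
```

With 8 processes, process 0 knows {0, 1} after step 1, {0, 1, 2, 3} after step 2, and all eight processes after step 3. Each process sends one message per step, so the whole barrier uses N log2 N messages.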

    Summary

    Deadlock detection and recovery algorithm

    Performance critically dependent on lookahead

    Barrier mechanisms

    Simple barriers using broadcast or central controller

      OK for small number of processors

    Tree or butterfly give more scalable performance

