
Parallel Computing—Introduction to Message Passing Interface (MPI)

Page 1: Parallel Computing—Introduction to Message Passing Interface (MPI)

1

Parallel Computing—Introduction to Message Passing Interface (MPI)

Page 2: Parallel Computing—Introduction to Message Passing Interface (MPI)

2

Two Important Concepts

• Two fundamental concepts of parallel programming are:
  • Domain decomposition
  • Functional decomposition

Page 3: Parallel Computing—Introduction to Message Passing Interface (MPI)

3

Domain Decomposition
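A rough sketch of domain decomposition in code, assuming MPJ-style Rank()/Size() calls and an illustrative array size: every process runs the same operation, but only on the block of the data it owns.

import mpi.*;

public class DomainDecomposition {
  public static void main(String[] args) throws Exception {
    MPI.Init(args);
    int rank = MPI.COMM_WORLD.Rank();
    int size = MPI.COMM_WORLD.Size();

    int n = 1000;                  // total number of elements (illustrative)
    int chunk = n / size;          // assume n divides evenly, for simplicity
    int lo = rank * chunk;         // first index owned by this process
    int hi = lo + chunk;           // one past the last index owned by this process

    double sum = 0.0;
    for (int i = lo; i < hi; i++)  // same code, different sub-domain on every process
      sum += 0.5 * i;

    System.out.println("Process " + rank + " summed indices [" + lo + ", " + hi + ")");
    MPI.Finalize();
  }
}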

Page 4: Parallel Computing—Introduction to Message Passing Interface (MPI)

4

Functional Decomposition
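A rough sketch of functional decomposition, with an illustrative split of tasks: instead of owning a block of the data, each process is responsible for a different part of the overall computation.

import mpi.*;

public class FunctionalDecomposition {
  public static void main(String[] args) throws Exception {
    MPI.Init(args);
    int rank = MPI.COMM_WORLD.Rank();

    // Different processes perform different functions of the same application
    if (rank == 0) {
      System.out.println("Process 0: reads and distributes the input");
    } else if (rank == 1) {
      System.out.println("Process 1: performs the core computation");
    } else {
      System.out.println("Process " + rank + ": collects and writes results");
    }

    MPI.Finalize();
  }
}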

Page 5: Parallel Computing—Introduction to Message Passing Interface (MPI)

5

Message Passing Interface (MPI)

• MPI is a standard (an interface or an API):
  • It defines a set of methods that application developers use to write their applications
  • MPI libraries implement these methods
  • MPI itself is not a library; it is a specification document that implementations follow
  • MPI-1.2 is the most popular specification version
• Reasons for popularity:
  • Software and hardware vendors were involved
  • Significant contributions from academia
  • MPICH served as an early reference implementation
  • MPI compilers are simply wrappers around widely used C and Fortran compilers
• History:
  • The first draft specification was produced in 1993
  • MPI-2.0, introduced in 1999, adds many new features to MPI
  • Bindings are available for C, C++, and Fortran
• MPI is a success story:
  • It is the most widely adopted programming paradigm on IBM Blue Gene systems
  • At least two production-quality MPI libraries exist:
    • MPICH2 (http://www-unix.mcs.anl.gov/mpi/mpich2/)
    • OpenMPI (http://open-mpi.org)
  • There is even a Java library: MPJ Express (http://mpj-express.org)

Page 6: Parallel Computing—Introduction to Message Passing Interface (MPI)

6

Message Passing Model

• The message passing model allows processors to communicate by passing messages:
  • Processors do not share memory
• Data transfer between processors requires cooperative operations to be performed by each processor:
  • One processor sends the message while the other receives it

Page 7: Parallel Computing—Introduction to Message Passing Interface (MPI)

7

Distributed Memory Cluster

[Figure: eight processes (Proc 0 to Proc 7), each with its own CPU and memory, exchange messages over a LAN interconnect such as Ethernet, Myrinet, or Infiniband.]

Page 8: Parallel Computing—Introduction to Message Passing Interface (MPI)

8

Writing “Hello World” MPI Program

• MPI is very simple:
  • Initialize the MPI environment:
    • MPI_Init(&argc, &argv);      // C code
    • MPI.Init(args);              // Java code
  • Send or receive a message:
    • MPI_Send(..);                // C code
    • MPI.COMM_WORLD.Send(..);     // Java code
  • Finalize the MPI environment:
    • MPI_Finalize();              // C code
    • MPI.Finalize();              // Java code

Page 9: Parallel Computing—Introduction to Message Passing Interface (MPI)

9

Hello World in C

#include <stdio.h>
#include <string.h>
#include "mpi.h"

..

// Initialize MPI
MPI_Init(&argc, &argv);

// Find out the 'id' or 'rank' of the current process
MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);

// Get the total number of processes
MPI_Comm_size(MPI_COMM_WORLD, &p);

// Print the rank of the process
printf("Hello World from process no %d\n", my_rank);

MPI_Finalize();

..

Page 10: Parallel Computing—Introduction to Message Passing Interface (MPI)

10

Hello World in Java

import java.util.*;
import mpi.*;

..

// Initialize MPI
MPI.Init(args);  // start up MPI

// Get the total number of processes and the rank
size = MPI.COMM_WORLD.Size();
rank = MPI.COMM_WORLD.Rank();

System.out.println("Hello World <" + rank + ">");

MPI.Finalize();

..

Page 11: Parallel Computing—Introduction to Message Passing Interface (MPI)

11

After Initialization

import java.util.*;
import mpi.*;

..

// Initialize MPI
MPI.Init(args);  // start up MPI

// Get the total number of processes and the rank
size = MPI.COMM_WORLD.Size();
rank = MPI.COMM_WORLD.Rank();

..

Page 12: Parallel Computing—Introduction to Message Passing Interface (MPI)

12

What is size?

• The total number of processes in a communicator:
  • The size of MPI.COMM_WORLD is 6

import java.util.*;
import mpi.*;

..

// Get the total number of processes
size = MPI.COMM_WORLD.Size();

..

Page 13: Parallel Computing—Introduction to Message Passing Interface (MPI)

13

What is rank?

• The “unique” identifier (id) of a process in a communicator:
  • Each of the six processes in MPI.COMM_WORLD has a distinct rank or id

import java.util.*;
import mpi.*;

..

// Get the rank of the current process
rank = MPI.COMM_WORLD.Rank();

..

Page 14: Parallel Computing—Introduction to Message Passing Interface (MPI)

14

Running “Hello World” in C

• Write the parallel code
• Start the MPICH2 daemon
• Write a machines file (a sample is sketched below)
• Start the parallel job
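As a rough sketch, a machines file is usually just a list of compute-node hostnames, one per line; the node names below are purely illustrative:

node01
node02
node03
node04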

Page 15: Parallel Computing—Introduction to Message Passing Interface (MPI)

15

Page 16: Parallel Computing—Introduction to Message Passing Interface (MPI)

16

Page 17: Parallel Computing—Introduction to Message Passing Interface (MPI)

17

Running “Hello World” in Java

• The code is executed on a cluster called “Starbug”:
  • One head-node “holly” and eight compute-nodes
• Steps:
  • Write the machines file
  • Bootstrap the MPJ Express (or any MPI library) runtime
  • Write the parallel application
  • Compile and execute

Page 18: Parallel Computing—Introduction to Message Passing Interface (MPI)

18


Page 19: Parallel Computing—Introduction to Message Passing Interface (MPI)

19

Write machines files


Page 20: Parallel Computing—Introduction to Message Passing Interface (MPI)

20

Bootstrap MPJ Express runtime


Page 21: Parallel Computing—Introduction to Message Passing Interface (MPI)

21

Write Parallel Program


Page 22: Parallel Computing—Introduction to Message Passing Interface (MPI)

22

Compile and Execute


Page 23: Parallel Computing—Introduction to Message Passing Interface (MPI)

23

Single Program Multiple Data (SPMD) Model

import java.util.*;
import mpi.*;

public class HelloWorld {
  public static void main(String[] args) throws Exception {
    MPI.Init(args);  // start up MPI

    int size = MPI.COMM_WORLD.Size();
    int rank = MPI.COMM_WORLD.Rank();

    if (rank == 0) {
      System.out.println("I am Process 0");
    } else if (rank == 1) {
      System.out.println("I am Process 1");
    }

    MPI.Finalize();
  }
}

Page 24: Parallel Computing—Introduction to Message Passing Interface (MPI)

24

Single Program Multiple Data (SPMD) Model

import java.util.*;
import mpi.*;

public class HelloWorld {
  public static void main(String[] args) throws Exception {
    MPI.Init(args);  // start up MPI

    int size = MPI.COMM_WORLD.Size();
    int rank = MPI.COMM_WORLD.Rank();

    if (rank % 2 == 0) {
      System.out.println("I am an even process");
    } else if (rank % 2 == 1) {
      System.out.println("I am an odd process");
    }

    MPI.Finalize();
  }
}

Page 25: Parallel Computing—Introduction to Message Passing Interface (MPI)

25

Point to Point Communication

• The most fundamental facility provided by MPI
• Basically, “exchange messages between two processes”:
  • One process (the source) sends the message
  • The other process (the destination) receives the message

Page 26: Parallel Computing—Introduction to Message Passing Interface (MPI)

26

Point to Point Communication

• It is possible to send messages of each basic datatype:
  • Floats, Integers, Doubles …
• Each message contains a “tag” (an identifier), as in the sketch below

[Figure: two messages in flight between processes, labelled Tag1 and Tag2.]
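To make the send/receive pairing concrete, here is a minimal sketch using the mpiJava-style API that MPJ Express provides; the class name, tag value, and message contents are illustrative:

import mpi.*;

public class SendRecv {
  public static void main(String[] args) throws Exception {
    MPI.Init(args);                  // assumes at least two processes
    int rank = MPI.COMM_WORLD.Rank();
    int tag = 100;                   // message tag (illustrative value)
    int[] msg = new int[1];

    if (rank == 0) {
      msg[0] = 42;
      // Source: send one int to process 1 with the given tag
      MPI.COMM_WORLD.Send(msg, 0, 1, MPI.INT, 1, tag);
    } else if (rank == 1) {
      // Destination: receive one int from process 0 with a matching tag
      MPI.COMM_WORLD.Recv(msg, 0, 1, MPI.INT, 0, tag);
      System.out.println("Process 1 received " + msg[0]);
    }

    MPI.Finalize();
  }
}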

Page 27: Parallel Computing—Introduction to Message Passing Interface (MPI)

27

Point to Point Communication

[Figure: eight processes (Process 0 to Process 7) in COMM_WORLD; one process sends a message whose envelope specifies the data (Integers), the destination (Process 4), the Tag, and the communicator (COMM_WORLD).]

Page 28: Parallel Computing—Introduction to Message Passing Interface (MPI)

28

Blocking and Non-blocking

• There are blocking and non-blocking versions of the send and receive methods
• Blocking versions:
  • A process calls send() or recv(); these methods return once the message has been physically sent or received
• Non-blocking versions:
  • A process calls isend() or irecv(); these methods return immediately
  • The user can check the status of the message by calling test() or wait()
  • Note the “i” in isend() and irecv()
• Non-blocking versions allow overlapping of computation and communication (a sketch follows below):
  • How much overlap is achieved also depends on the “quality” of the implementation
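A rough sketch of the non-blocking style in the mpiJava-style binding that MPJ Express follows (Isend/Irecv return a Request handle); the placeholder computation in the middle is illustrative:

import mpi.*;

public class NonBlocking {
  public static void main(String[] args) throws Exception {
    MPI.Init(args);                  // assumes at least two processes
    int rank = MPI.COMM_WORLD.Rank();
    int tag = 1;
    int[] buf = new int[1];

    if (rank == 0) {
      buf[0] = 7;
      // Isend returns immediately with a Request handle
      Request req = MPI.COMM_WORLD.Isend(buf, 0, 1, MPI.INT, 1, tag);
      doSomeOtherWork();             // overlap computation with the communication
      req.Wait();                    // block only when completion actually matters
    } else if (rank == 1) {
      Request req = MPI.COMM_WORLD.Irecv(buf, 0, 1, MPI.INT, 0, tag);
      doSomeOtherWork();             // overlap computation with the communication
      req.Wait();                    // block until the message has arrived
      System.out.println("Process 1 received " + buf[0]);
    }

    MPI.Finalize();
  }

  static void doSomeOtherWork() { /* placeholder for useful computation */ }
}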

Page 29: Parallel Computing—Introduction to Message Passing Interface (MPI)

29

[Figure: sender/receiver timelines. In the “Blocking” case, the CPU waits inside send() and recv() until the operation completes. In the “Non Blocking” case, isend() and irecv() return immediately, the CPU performs other tasks, and it only waits later in the completion call (labelled iwait() in the figure).]

Page 30: Parallel Computing—Introduction to Message Passing Interface (MPI)

30

Modes of Send

• The MPI standard defines four modes of send (see the sketch below):
  • Standard
  • Synchronous
  • Buffered
  • Ready
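In the mpiJava-style binding that MPJ Express follows, each mode has its own call on the communicator. The sketch below exercises the standard and synchronous modes and only names the other two, because buffered and ready sends carry extra requirements; the buffer contents and tag are illustrative:

import mpi.*;

public class SendModes {
  public static void main(String[] args) throws Exception {
    MPI.Init(args);                  // assumes at least two processes
    int rank = MPI.COMM_WORLD.Rank();
    int tag = 0;
    int[] buf = {1};

    if (rank == 0) {
      // Standard mode: the library decides whether to buffer or to rendezvous
      MPI.COMM_WORLD.Send(buf, 0, 1, MPI.INT, 1, tag);
      // Synchronous mode: completes only after the receiver has started receiving
      MPI.COMM_WORLD.Ssend(buf, 0, 1, MPI.INT, 1, tag);
      // The remaining modes are Bsend() (buffered: needs a user buffer attached first)
      // and Rsend() (ready: the matching receive must already be posted)
    } else if (rank == 1) {
      MPI.COMM_WORLD.Recv(buf, 0, 1, MPI.INT, 0, tag);  // matches the standard-mode send
      MPI.COMM_WORLD.Recv(buf, 0, 1, MPI.INT, 0, tag);  // matches the synchronous-mode send
    }

    MPI.Finalize();
  }
}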

Page 31: Parallel Computing—Introduction to Message Passing Interface (MPI)

31

Standard Mode (Eager send protocol used for small messages)

[Figure: timeline between sender and receiver. In the eager protocol, the sender sends a control message to the receiver and then sends the actual data without waiting for an acknowledgement.]

Page 32: Parallel Computing—Introduction to Message Passing Interface (MPI)

32

Synchronous Mode (Rendezvous Protocol used for large messages)

[Figure: timeline between sender and receiver. In the rendezvous protocol, the sender first sends a control message, waits for an acknowledgement from the receiver, and only then sends the actual data.]

Page 33: Parallel Computing—Introduction to Message Passing Interface (MPI)

33

Performance Evaluation of Point to Point Communication

• Normally, ping-pong benchmarks are used to calculate (a sketch follows below):
  • Latency: how long does it take to send N bytes from sender to receiver?
  • Throughput: how much bandwidth is achieved?
• Latency is a useful measure for studying the performance of “small” messages
• Throughput is a useful measure for studying the performance of “large” messages
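A minimal ping-pong sketch in the mpiJava-style API used by MPJ Express; the message size, repetition count, and the way the results are reported are illustrative choices, not part of the original slides:

import mpi.*;

public class PingPong {
  public static void main(String[] args) throws Exception {
    MPI.Init(args);                  // assumes exactly two participating processes
    int rank = MPI.COMM_WORLD.Rank();
    int n = 1024;                    // message size in bytes
    int reps = 1000, tag = 1;
    byte[] buf = new byte[n];

    long start = System.nanoTime();
    for (int i = 0; i < reps; i++) {
      if (rank == 0) {               // ping: send, then wait for the echo
        MPI.COMM_WORLD.Send(buf, 0, n, MPI.BYTE, 1, tag);
        MPI.COMM_WORLD.Recv(buf, 0, n, MPI.BYTE, 1, tag);
      } else if (rank == 1) {        // pong: receive, then echo back
        MPI.COMM_WORLD.Recv(buf, 0, n, MPI.BYTE, 0, tag);
        MPI.COMM_WORLD.Send(buf, 0, n, MPI.BYTE, 0, tag);
      }
    }
    double seconds = (System.nanoTime() - start) / 1e9;

    if (rank == 0) {
      double latencyUs = seconds / reps / 2 * 1e6;            // one-way time per n-byte message
      double throughputMB = 2.0 * n * reps / seconds / 1e6;   // bytes moved per second, in MB/s
      System.out.println("Latency: " + latencyUs + " us, Throughput: " + throughputMB + " MB/s");
    }

    MPI.Finalize();
  }
}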

Page 34: Parallel Computing—Introduction to Message Passing Interface (MPI)

34

Latency on Fast Ethernet

Page 35: Parallel Computing—Introduction to Message Passing Interface (MPI)

35

Throughput on Fast Ethernet

Page 36: Parallel Computing—Introduction to Message Passing Interface (MPI)

36

Latency on Gigabit Ethernet

Page 37: Parallel Computing—Introduction to Message Passing Interface (MPI)

37

Throughput on GigE

Page 38: Parallel Computing—Introduction to Message Passing Interface (MPI)

38

Latency on Myrinet

Page 39: Parallel Computing—Introduction to Message Passing Interface (MPI)

39

Throughput on Myrinet

