Parallel Computing & Bioinformatics Lab: Account Setup and MPI Introduction 1
Account Setup and MPI Introduction
Parallel Computing & Bioinformatics Lab
Sylvain Pitre ([email protected])
Web: http://cgmlab.carleton.ca
Overview
• CGM Cluster specs
• Account Creation
• Logging in Remotely (Putty, X-Win32)
• Account Setup for MPI
• Checking Cluster Load
• Listing Your Jobs
• MPI Introduction and Basics
CGM Lab Cluster (2)
• 8 dual-core workstations (16 cores in total)
  – Named cgm01, cgm02, …, cgm08.
  – Intel Core 2 Duo 1.6GHz, 4GB DDR2 RAM, 320GB disks.
  – The server (cgm01) has an extra 1TB of disk space.
• Connected through a dedicated gigabit switch.
• Running Fedora 8 (64-bit).
• OpenMPI (http://www.open-mpi.org/)
• cgmXX.carleton.ca (SSH, where XX=01 to 08)
• Putty (terminal): http://www.putty.nl/download.html
• WinSCP (file transfer): http://winscp.net/eng/index.php
• XWin-32 (http://www.starnet.com/)
CGM Lab Cluster (3)
• Accounts are handled by LDAP (Lightweight Directory Access Protocol) on the server.
• User files are stored on the server and accessed by every workstation using NFS (Network File System).
• Same login and password will work on any workstation.
CGM Lab Cluster (4)

[Diagram: workstations cgm01–cgm08 on the Carleton network, connected through the gigabit switch; cgm01 runs the NFS and LDAP servers.]
Account Creation
• To get an account send an email to Sylvain Pitre ([email protected])
• Include in your email
  – your full name
  – your email address (if different from the one used to send the email)
  – your supervisor's name (or course professor)
  – your preferred login name (8 characters max)
Logging In Remotely
• You can login remotely to the cluster by SSH (Secure Shell).
• Users familiar with Unix/Linux should already know how to do this.
• Windows users can use Putty, a lightweight SSH client (see link on slide 4)
• Windows users can also log in with X-Win32.
• DNS names: cgmXX.carleton.ca (XX=01 to 08)
• Log in to any node except cgm01 (the server).
Logging in with Putty
• Under Host Name, enter the cgm machine you want to log into (cgm03 in this case) then click Open.
• A terminal will open and ask you for your username then password.
• That’s it! You are logged into one of the cgm nodes.
Login with X-Win32
• You can also log in to the nodes using X-Win32.
• Open the X-Win32 configuration program (X-Config).
• Under the Sessions tab, click on Wizard.
• Enter a name for the session (e.g. cgm03) and under Type click on ssh, then click Next.
• As host, enter the name of the node you wish to connect to (e.g. cgm03.carleton.ca), then click Next.
• Enter your login name and password and click Next.
• For Command, click on Linux, then click Finish.
• The new session is now added to your Sessions window.
Login with X-Win32 (2)
• Click on the newly created session, then click on Launch.
• After launching the session, you might be asked to accept a key (click "yes").
• You should now be at a terminal.
• You can work in this terminal if you wish (as in Putty), but if you want the graphical interface, type:
  – gnome-session &
• After a few seconds the graphical interface will start up.
• You now have access to all the menus and windows of the Fedora 8 interface (using Gnome).
Login with X-Win32 (3)
Demonstration
Account Setup
• First-time login:
  – Once you have your account (send me an email to get one) and have logged in, change your password with the passwd command.
• If you are unfamiliar with Unix/Linux:
  – I strongly recommend reading some tutorials and playing around with commands (but be careful!).
  – The rest of the slides assume some basic Unix/Linux knowledge.
“Password-less” SSH
• In order to run MPI on different nodes transparently, we need to set up SSH so it doesn't constantly ask us for a password. Type:
ssh-keygen -t rsa
cd .ssh
cp id_rsa.pub authorized_keys2
chmod go-rwx authorized_keys2
ssh-agent $SHELL
ssh-add
cd ..
“Password-less” SSH (2)
• Now after your initial login you should be able to SSH into any other cgmXX machine without a password. SSH to every workstation in order to add that node to your known_hosts. Type:
ssh cgm01 date (answer “yes” when asked)
ssh cgm02 date
…
ssh cgm08 date
Ready for MPI!
• After completing the steps above your account is now ready to run MPI jobs.
• Running big jobs on multiple processors– Since there is no job scheduler jobs are
launched manually so please be considerate. Use nodes that are not in use or that have less load (I’ll show you how to check).
– If you need all the nodes for a longer period of time we’ll try to reserve them for you.
Network Vs. Local Files
• If you need to do a lot of disk I/O, it is preferable to use the local disk's /tmp directory.
  – Since your home directory is mounted over NFS, all files written to it are sent to the server (network bottleneck).
  – To reduce network transfers, place your large input/output files in /tmp on your local node.
  – Make the filename unique, so it doesn't clash with other users' files.
Checking Cluster Load
• To check the load on each workstation type the command: load
Listing Your Jobs
• To check all of your jobs (processes) across the cluster type: listjobs
MPI Introduction
• Message Passing Interface (MPI)
  – A portable message-passing standard that facilitates the development of parallel applications and libraries.
  – For parallel computers, clusters…
  – Not a language in its own right; it is used as a library with another language, such as C or Fortran.
  – Different implementations: OpenMPI, LAM/MPI, MPICH…
  – Portable = not limited to a specific architecture.
MPI Basics
• Every node (process) executes the same code.
  – Processes can follow different paths (master/slave model), but don't abuse this!
• Communication is done by message passing.
• Every process has a unique rank (ID) from 0 to p-1.
• The total number of processes is known to every process.
• Messages can be synchronous or asynchronous.
• Thread safe.
Compiling/Running MPI Programs
• Compiler: mpicc
• Command line:
mpirun -n <p> --hostfile <hostfile> <prog> <params>
Where <p> is the number of processes you want to use. It can be greater than the number of processors available (useful for overloading or simulation).
Hostfile
• For running a job on more than one node, a hostfile must be used.
• What’s in a hostfile:– Node name or IP.– How many processors on each node (1 by default).
• Example:
cgm01 slots=2
cgm02 slots=2
…
MPI Startup/Finalize
#include "mpi.h"

int main(int argc, char *argv[]) {
    int rank, wsize;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &wsize);

    /* CODE */

    MPI_Finalize();
    return 0;
}
MPI Types

MPI C Type          C Type
MPI_CHAR char
MPI_SHORT signed short int
MPI_INT signed int
MPI_LONG signed long int
MPI_UNSIGNED_CHAR unsigned char
MPI_UNSIGNED_SHORT unsigned short int
MPI_UNSIGNED unsigned int
MPI_UNSIGNED_LONG unsigned long int
MPI_FLOAT float
MPI_DOUBLE double
MPI_LONG_DOUBLE long double
MPI_BYTE -
MPI_PACKED -
MPI Functions
• Send/receive
• Broadcast
• All to all
• Gather/Scatter
• Reduce
• Barrier
• Other
MPI Send/Receive (synch)
• Communication between nodes (processors).• Blocking
int MPI_Send(void *buf, int count, MPI_Datatype datatype, int dest, int tag, MPI_Comm comm)
int MPI_Recv(void *buf, int count, MPI_Datatype datatype, int source, int tag, MPI_Comm comm, MPI_Status *status)
*buf      send/receive buffer address
count     number of entries in buffer
datatype  data type of entries
dest      destination process rank
source    source process rank
tag       message tag
comm      communicator
*status   status after operation (returned)
MPI Send/Receive (asynch)
• A buffer can be used with asynchronous messages.• Problems occur when the buffer becomes empty or full.
MPI Send/Receive (asynch)
• Non-blocking (returns immediately; delivery is not guaranteed until the operation is completed with a wait/test call)

int MPI_Isend(void *buf, int count, MPI_Datatype datatype, int dest, int tag, MPI_Comm comm, MPI_Request *request)

int MPI_Irecv(void *buf, int count, MPI_Datatype datatype, int source, int tag, MPI_Comm comm, MPI_Request *request)

Parameters are the same as MPI_Send() and MPI_Recv(), plus *request, a handle used to complete the operation later (e.g. with MPI_Wait()).
MPI Broadcast
• One to all (including itself).
MPI Broadcast (syntax)
int MPI_Bcast(void *buf, int count, MPI_Datatype datatype, int root, MPI_Comm comm)
*buf      buffer address (sent from root, received everywhere else)

count     number of entries in buffer

datatype  data type of entries

root      rank of the broadcasting (root) process

comm      communicator
MPI All to All
• Every process sends a message to every other process.

int MPI_Alltoall(void *sendbuf, int sendcount, MPI_Datatype sendtype, void *recvbuf, int recvcount, MPI_Datatype recvtype, MPI_Comm comm)

*sendbuf   send buffer address
sendcount  number of elements sent to each process
sendtype   data type of send elements
*recvbuf   receive buffer address (loaded)
recvcount  number of elements received from each process
recvtype   data type of receive elements
comm       communicator
MPI All to All (alternative)
• MPI_Alltoallv()
  – Sends data to all processes, with per-process counts and displacements.

int MPI_Alltoallv(void *sendbuf, int *sendcounts, int *sdispls, MPI_Datatype sendtype, void *recvbuf, int *recvcounts, int *rdispls, MPI_Datatype recvtype, MPI_Comm comm)
MPI Gather (Description)
• MPI_Gather()
  – Each process in comm sends the contents of sendbuf to the process with rank root. The root process concatenates the received data in process rank order in recvbuf: the data from process 0 is followed by the data from process 1, which is followed by the data from process 2, etc. The recv arguments are significant only on the process with rank root. The argument recvcount indicates the number of items received from each process, not the total number received.
MPI Scatter (Description)
• MPI_Scatter()
  – The process with rank root distributes the contents of sendbuf among the processes. The contents of sendbuf are split into p segments, each consisting of sendcount items. The first segment goes to process 0, the second to process 1, etc. The send arguments are significant only on process root.
MPI Gather/Scatter

[Diagram: gather collects one segment from each process at the root; scatter sends one segment from the root to each process.]
MPI Gather/Scatter (syntax)

int MPI_Gather(void *sendbuf, int sendcount, MPI_Datatype sendtype, void *recvbuf, int recvcount, MPI_Datatype recvtype, int root, MPI_Comm comm)

int MPI_Scatter(void *sendbuf, int sendcount, MPI_Datatype sendtype, void *recvbuf, int recvcount, MPI_Datatype recvtype, int root, MPI_Comm comm)

*sendbuf   send buffer address
sendcount  number of send buffer elements
sendtype   data type of send elements
*recvbuf   receive buffer address (loaded)
recvcount  number of elements each process receives
recvtype   data type of receive elements
root       rank of the sending (scatter) or receiving (gather) process
comm       communicator
MPI Gatherv/Scatterv
• Similar functions to gather/scatter, but they allow varying amounts of data to be sent to or from each process instead of a fixed amount.
• For example, varying parts of an array can be scattered/gathered in one step.
• See Parallel Image Processing example to see how they can be used.
MPI Gatherv/Scatterv (Syntax)
int MPI_Scatterv(void *sendbuf, int *sendcounts, int *displs, MPI_Datatype sendtype, void *recvbuf, int recvcount, MPI_Datatype recvtype, int root, MPI_Comm comm)

int MPI_Gatherv(void *sendbuf, int sendcount, MPI_Datatype sendtype, void *recvbuf, int *recvcounts, int *displs, MPI_Datatype recvtype, int root, MPI_Comm comm)

*sendcounts  number of send buffer elements for each process (scatterv)
*recvcounts  number of elements received from each process (gatherv)
*displs      displacement into the buffer for each process
Other parameters are the same as gather/scatter.
MPI Reduce
• Gather results and reduce them to one value using an operation (Max, Min, Sum, Product).
MPI Reduce (syntax)
int MPI_Reduce(void *sendbuf, void *recvbuf, int count, MPI_Datatype datatype, MPI_Op op, int root, MPI_Comm comm)

*sendbuf  send buffer address
*recvbuf  receive buffer address (significant only at root)
count     number of send buffer elements
datatype  data type of send elements
op        reduce operation:
          - MPI_MAX  maximum
          - MPI_MIN  minimum
          - MPI_SUM  sum
          - MPI_PROD product
root      root process rank for result
comm      communicator
MPI Barrier
• Blocks until all processes have called it.
int MPI_Barrier(MPI_Comm comm)
comm communicator
Other MPI Routines
• MPI_Allgather(): Gather values and distribute them to all.
• MPI_Allgatherv(): Gather values into specified locations and distribute them to all.
• MPI_Reduce_scatter(): Combine values and scatter the results.
• MPI_Wait(): Waits for a non-blocking MPI send/receive to complete, then returns.
Parallel Problem Examples
• Embarrassingly Parallel– Simple Image Processing (Brightness, Negative…)
• Pipelined computations– Sorting
• Synchronous computations
– Heat Distribution Problem– Cellular Automata
• Divide and Conquer– N-Body Problem
MPI Hello World!
#include "mpi.h"
int main(int argc, char *argv[]) {
    int rank, wsize;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &wsize);

    printf("Hello World! I am processor %d.\n", rank);

    MPI_Finalize();
    return 0;
}
Parallel Image processing
• Input: Image of size MxN.
• Output: Negative of the image.
• Each processor should have an equal share of the work, roughly (MxN)/P.
• Master/slave model
  – The master reads in the image and distributes the pixels to the slave nodes. Once done, the slaves return their results to the master, which outputs the negative image.
Parallel Image processing (2)
• Workload
  – If we have 32 pixels to process and 4 CPUs, each CPU will process 8 pixels.
  – For P0, the work will start at pixel 0 (displacement) and cover 8 pixels (count).
Parallel Image processing (3)
- Find the displacement/count for each processor.
- Master processor scatters the image.
- Execute the negative operation.
- Gather the results on the master processor.
- Displacement (displs) tells you where to start; count (counts) tells you how many to do.

MPI_Scatterv(image, counts, displs, MPI_CHAR, image, counts[myId], MPI_CHAR, 0, MPI_COMM_WORLD);

MPI_Gatherv(image, counts[myId], MPI_CHAR, image, counts, displs, MPI_CHAR, 0, MPI_COMM_WORLD);
MPI Timing
• Calculate the wall clock time of some code. Can be executed by master to find out total runtime.
double start, total;
start = MPI_Wtime();
// Do some work!
total = MPI_Wtime() - start;
printf("Total Runtime: %f\n", total);
Compiling & Running Your First MPI Program
• Download the MPI_hello.tar.gz example from the cgmlab.carleton.ca website. In the terminal type:wget http://cgmlab.carleton.ca/files/MPI_hello.tar.gz
• Uncompress the files by typing:
tar zxvf MPI_hello.tar.gz
• Compile the program by typing:
make
• Run the program on all 16 cores by typing:
mpirun -np 16 --hostfile hostfile ./hello
What To Do Next?
• There is also a prefix sums example on the cgmlab.carleton.ca website.
• Try other examples you find on the web.
• Find MPI tutorials online or in books.
• Write your own MPI programs.
• Have fun ;)
References
• Parallel Programming: Techniques and Applications Using Networked Workstations and Parallel Computers, Barry Wilkinson and Michael Allen, Prentice Hall, 1999.
• MPI Information/Tutorials:– http://www-unix.mcs.anl.gov/mpi/learning.html
• A draft of a Tutorial/User's Guide for MPI by Peter Pacheco.– ftp://math.usfca.edu/pub/MPI/mpi.guide.ps
• OpenMPI (http://www.open-mpi.org/)