Page 1: Intel ® Cluster Tools Introduction and Hands on Sessions

Software and Services Group 1

Intel® Cluster Tools: Introduction and Hands-on Sessions

MSU Summer School: Intel Cluster Software and Technologies
Software & Services Group
July 8, 2010, MSU Moscow

Agenda
– Intel Cluster Tools settings and configuration
– Intel MPI fabrics
– Message Checker
– ITAC Introduction
– ITAC practice

Setup configuration
• source /opt/intel/cc/11.0.74/bin/iccvars.sh intel64
• source /opt/intel/fc/11.0.74/bin/ifortvars.sh intel64
• source /opt/intel/impi/
• source /opt/intel/itac/


Check configuration
• which icc
• which ifort
• which mpiexec
• which traceanalyzer
• echo $LD_LIBRARY_PATH
• set | grep I_MPI
• set | grep VT_
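
The checks on this slide can be bundled into one helper. The following is a minimal sketch, not part of the original material: the function name check_tools is made up for illustration, and it only reports whether each named tool is on PATH.

```shell
# Report which of the named tools are missing from PATH.
# Returns 0 if all are found, 1 otherwise.
# (check_tools is a hypothetical helper name, not an Intel utility.)
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool -> $(command -v "$tool")"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Typical use after sourcing the environment scripts:
#   check_tools icc ifort mpiexec traceanalyzer
```

A non-zero return status means at least one of the environment scripts was probably not sourced in the current shell.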

Compile your first MPI application
• Using Intel compilers
  – mpiicc, mpiicpc, mpiifort, ...
• Using GNU compilers
  – mpicc, mpicxx, mpif77, ...
• mpiicc -o hello_c test.c
• mpiifort -o hello_f test.f
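
As an illustration of how the Intel wrapper names line up with source languages, here is a small sketch. The helper name mpi_wrapper_for and the list of file extensions are assumptions made for this example; only the wrapper names mpiicc, mpiicpc and mpiifort come from the slide.

```shell
# Pick the Intel MPI compiler wrapper for a source file by its extension.
# The extension list is an assumption for illustration only.
mpi_wrapper_for() {
    case "$1" in
        *.c)                 echo mpiicc ;;
        *.cpp|*.cxx|*.cc)    echo mpiicpc ;;
        *.f|*.f90|*.F|*.F90) echo mpiifort ;;
        *) echo "unknown source type: $1" >&2; return 1 ;;
    esac
}

# Example:
#   $(mpi_wrapper_for test.c) -o hello_c test.c
```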

Create mpd.hosts file
• Create an mpd.hosts file in the working directory with the list of available nodes

Create mpd ring
• mpdboot -r ssh -n #nodes

Check mpd ring
• mpdtrace
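
The first two steps can be scripted. This is a rough sketch under stated assumptions: the helper name write_mpd_hosts and the node names are invented, and it assumes every node that should join the ring (including the local one) is passed on the command line, one hostname per line in the file.

```shell
# Write one hostname per line to mpd.hosts in the given directory
# and print the node count, i.e. the value to pass to mpdboot -n.
# (write_mpd_hosts is a hypothetical helper name.)
write_mpd_hosts() {
    dir=$1
    shift
    : > "$dir/mpd.hosts"
    for host in "$@"; do
        echo "$host" >> "$dir/mpd.hosts"
    done
    echo "$#"
}

# Hypothetical usage with made-up node names:
#   n=$(write_mpd_hosts . node01 node02 node03 node04)
#   mpdboot -r ssh -n "$n"
#   mpdtrace
```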

Start your first application
• mpiexec -n 16 ./hello_c
• mpiexec -n 16 ./hello_f

Kill mpd ring
• mpdallexit
• mpdcleanup -a

Start your first application with mpirun (boots the mpd ring, runs the job, and shuts the ring down)
• mpirun -r ssh -n 16 ./hello_c
• mpirun -r ssh -n 16 ./hello_f

Alternative Process Manager
• Use mpiexec.hydra for better scalability
• All options are the same

OFED & DAPL
• OFED - OpenFabrics Enterprise Distribution
  http://openfabrics.org/
• DAPL - Direct Access Programming Library
  http://www.openfabrics.org/downloads/dapl/
• Check /etc/dat.conf
• Set I_MPI_DAPL_PROVIDER=OpenIB-mlx4_0-2
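
To see which provider names are available for I_MPI_DAPL_PROVIDER, one can list the entries in dat.conf. The sketch below assumes the usual layout in which the first whitespace-separated field of each non-comment line is the provider name; list_dapl_providers is a made-up helper name.

```shell
# Print the provider names (first field) from a dat.conf-style file,
# skipping blank lines and '#' comments.
# Defaults to /etc/dat.conf when no file is given.
list_dapl_providers() {
    awk '!/^[ \t]*(#|$)/ { print $1 }' "${1:-/etc/dat.conf}"
}

# Example:
#   list_dapl_providers
#   export I_MPI_DAPL_PROVIDER=<one of the printed names>
```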

Fabrics selection

I_MPI_DEVICE  I_MPI_FABRICS  Description
sock          tcp            TCP/IP-capable network fabrics, such as Ethernet and InfiniBand* (through IPoIB*)
shm           shm            Shared-memory only
ssm           shm:tcp        Shared-memory + TCP/IP
rdma          dapl           DAPL-capable network fabrics, such as InfiniBand*, iWarp*, Dolphin*, and XPMEM* (through DAPL*)
rdssm         shm:dapl       Shared-memory + DAPL + sockets
-             ofa            OFA-capable network fabric including InfiniBand* (through native OFED* verbs)
-             shm:ofa        OFA-capable network fabric with shared memory for intra-node communication
-             tmi            TMI-capable network fabrics including Qlogic*, Myrinet* (through Tag Matching Interface)
-             shm:tmi        TMI-capable network fabric with shared memory for intra-node communication
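
For scripts that still carry a legacy I_MPI_DEVICE value, the rows of the table that list both names can be written as a lookup. This is only a sketch: device_to_fabrics is a hypothetical helper, and it covers just the five pairs listed on this slide.

```shell
# Translate a legacy I_MPI_DEVICE value into the I_MPI_FABRICS value
# listed in the same row of the fabrics table.
# (device_to_fabrics is a hypothetical helper name.)
device_to_fabrics() {
    case "$1" in
        sock)  echo tcp ;;
        shm)   echo shm ;;
        ssm)   echo shm:tcp ;;
        rdma)  echo dapl ;;
        rdssm) echo shm:dapl ;;
        *) echo "no I_MPI_FABRICS equivalent listed for: $1" >&2; return 1 ;;
    esac
}

# Example:
#   export I_MPI_FABRICS=$(device_to_fabrics rdssm)
```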

Fabrics selection (cont.)
• Use I_MPI_FABRICS to set the desired fabric
  – export I_MPI_FABRICS=shm:tcp
  – mpirun -r ssh -n # -env I_MPI_FABRICS shm:tcp ./a.out
• DAPL varieties:
  – export I_MPI_FABRICS=shm:dapl
  – export I_MPI_DAPL_PROVIDER=ofa-v2-mlx4_0-1
  – export I_MPI_DAPL_UD=enable
    • Connectionless communication
    • Better scalability
    • Less memory is required

Fabrics selection (cont.)
• OFA fabric
  – export I_MPI_FABRICS=shm:ofa
• Multi-rail feature
  – export I_MPI_OFA_NUM_ADAPTERS=<n>
  – export I_MPI_OFA_NUM_PORTS=<n>
• For OFA devices, the Intel® MPI Library recognizes some hardware events: it can stop using a failed line and restore the connection once that line is OK again

How to get information from Intel MPI library
• Use the I_MPI_DEBUG environment variable
  – Use a number from 2 to 1001 for different levels of detail
  – Level 2 shows data transfer mode
  – Level 4 shows pinning information

cpuinfo utility
• Use this utility to get information about the processors used in your system

  Intel(R) Xeon(R) Processor (Intel64 Harpertown)
  ===== Processor composition =====
  Processors(CPUs)  : 8
  Packages(sockets) : 2
  Cores per package : 4
  Threads per core  : 1
  ===== Processor identification =====
  Processor  Thread Id.  Core Id.  Package Id.
  0          0           0         0
  1          0           0         1
  2          0           1         0
  3          0           1         1
  4          0           2         0
  5          0           2         1
  6          0           3         0
  7          0           3         1
  ===== Placement on packages =====
  Package Id.  Core Id.  Processors
  0            0,1,2,3   0,2,4,6
  1            0,1,2,3   1,3,5,7
  ===== Cache sharing =====
  Cache  Size   Processors
  L1     32 KB  no sharing
  L2     6 MB   (0,2)(1,3)(4,6)(5,7)

Pinning
• One can change the default pinning settings
  – export I_MPI_PIN=on (or off)
  – export I_MPI_PIN_DOMAIN=cache2 (for hybrid)
  – export I_MPI_PROCESSOR_LIST=allcores
  – export I_MPI_PROCESSOR_LIST=shift=socket

OpenMP and Hybrid applications
• Check command line for application building
  – Use the thread-safe version of the Intel® MPI Library (-mt_mpi option)
  – Use the libraries with SMP parallelization (e.g. parallel MKL)
  – Use the -openmp compiler option to enable OpenMP* directives
    $ mpiicc -openmp -o ./your_app
• Set application execution environment for hybrid applications
  – Set OMP_NUM_THREADS to the number of threads
  – Use the -perhost option to control how many processes are placed on each node
    $ export OMP_NUM_THREADS=4
    $ export I_MPI_FABRICS=shm:dapl
    $ export KMP_AFFINITY=compact
    $ mpirun -perhost 4 -n <N> ./a.out
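
A common rule of thumb for hybrid runs (an assumption here, not something the slide states) is to choose OMP_NUM_THREADS so that ranks per node times threads per rank equals the number of cores per node. The hypothetical helper threads_per_rank below sketches that arithmetic.

```shell
# Threads per MPI rank so that ranks_per_node * threads fills the node's cores.
# Usage: threads_per_rank <cores_per_node> <ranks_per_node>
# (threads_per_rank is a hypothetical helper; the sizing rule is an assumption.)
threads_per_rank() {
    cores=$1
    ranks=$2
    if [ "$ranks" -le 0 ] || [ $((cores % ranks)) -ne 0 ]; then
        echo "cores ($cores) must be a positive multiple of ranks per node ($ranks)" >&2
        return 1
    fi
    echo $((cores / ranks))
}

# Example for an 8-core node running 2 ranks per node:
#   export OMP_NUM_THREADS=$(threads_per_rank 8 2)
#   mpirun -perhost 2 -n <N> ./a.out
```

Rejecting uneven splits is a design choice of this sketch; it avoids silently leaving cores idle or oversubscribing them.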

Intel® MPI Library and MKL
• MKL creates its own threads (OpenMP, TBB, …)
• MKL from version 10.2 understands the settings of the Intel® MPI Library and doesn't create more threads than cores


How to run a debugger
• TotalView
  – mpirun -r ssh -tv -n # ./a.out
• GDB
  – mpirun -r ssh -gdb -n # ./a.out
• Allinea DDT (from GUI)
• IDB
  – mpirun -r ssh -idb -n # ./a.out
  – You need idb available in your $PATH
  – Some settings are required

Message Checker
• Local checks: isolated to single process
  – Unexpected process termination
  – Buffer handling
  – Request and data type management
  – Parameter errors found by MPI
• Global checks: all processes
  – Global checks for collectives and p2p ops
    • Data type mismatches
    • Corrupted data transmission
    • Pending messages
    • Deadlocks (hard & potential)
  – Global checks for collectives – one report per operation
    • Operation, size, reduction operation, root mismatch
    • Parameter error
    • Mismatched MPI_Comm_free()

Message Checker (cont.)
• Levels of severity:
  – Warning: application can continue
  – Error: application can continue but almost certainly not as intended
  – Fatal error: application must be aborted
• Some checks may find both warnings and errors
  – Example: CALL_FAILED check due to invalid parameter
    • Invalid parameter in MPI_Send() => msg cannot be sent => error
    • Invalid parameter in MPI_Request_free() => resource leak => warning

Message Checker (cont.)
• Usage model:
  – Recommended:
    • -check option when running an MPI job
      $ mpiexec -check -n 4 ./a.out
    • Use fail-safe version in case of crash
      $ mpiexec -check libVTfs.so -n 4 ./a.out
  – Alternatively:
    • -check_mpi option during link stage
      $ mpiicc -check_mpi -g test.c -o a.out
• Configuration
  – Each check can be enabled/disabled individually
    • set in VT_CONFIG file, e.g. to enable local checks only:
      CHECK ** OFF
      CHECK LOCAL:** ON
  – Change number of warnings and errors printed and/or tolerated before abort

See lab/poisson_ITAC_dbglibs
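
The "local checks only" configuration can be generated from a script. The sketch below writes exactly the two CHECK lines from the slide; the helper name write_local_checks_config and the file name checker.conf are made up, and it assumes the VT_CONFIG environment variable reaches all ranks of the job.

```shell
# Write a checker configuration that enables only local checks,
# using the two CHECK lines from the slide.
# (write_local_checks_config is a hypothetical helper name.)
write_local_checks_config() {
    cat > "$1" <<'EOF'
CHECK ** OFF
CHECK LOCAL:** ON
EOF
}

# Hypothetical usage (the file name is illustrative):
#   write_local_checks_config ./checker.conf
#   export VT_CONFIG=./checker.conf
#   mpiexec -check -n 4 ./a.out
```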

Trace Collector
• Link with trace library:
  – mpiicc -trace test.c -o a.out
• Run with -trace option
  – mpiexec -trace -n # ./a.out
• Using the itcpin utility
  – mpirun -r ssh -n # itcpin --run -- ./a.out
  – Binary instrumentation
• Use -tcollect link option
  – mpiicc -tcollect test.c -o a.out

Using Trace Collector for OpenMP applications
• ITA can show only those threads which call MPI functions. There is a very simple trick: e.g. before "#pragma omp barrier" add an MPI call:
  – { int size; MPI_Comm_size(MPI_COMM_WORLD, &size); }
• After this modification ITA will show information about OpenMP threads.
• Please remember that to support threads you need to use the thread-safe MPI Library. Don't forget to set the VT_MPI_DLL environment variable.
  – $ set VT_MPI_DLL=impimt.dll (for Windows)
  – $ export VT_MPI_DLL=libmpi_mt.so (for Linux)

Light weight statistics
• Use the I_MPI_STATS environment variable
  – export I_MPI_STATS=# (up to 10)
  – export I_MPI_STATS_SCOPE

~~~~ Process 0 of 256 on node C-21-23

Data Transfers
Src --> Dst   Amount(MB)     Transfers
-----------------------------------------
000 --> 000   0.000000e+00   0
000 --> 001   1.548767e-03   60
000 --> 002   1.625061e-03   60
000 --> 003   0.000000e+00   0
000 --> 004   1.777649e-03   60
…
=========================================
Totals        3.918986e+03   1209

Communication Activity
Operation     Volume(MB)     Calls
-----------------------------------------
P2P
Csend         9.147644e-02   1160
Send          3.918895e+03   49
Collectives
Barrier       0.000000e+00   12
Bcast         3.051758e-05   6
Reduce        3.433228e-05   6
Allgather     2.288818e-04   30
Allreduce     4.108429e-03   97
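
Statistics output like this is easy to post-process. As a sketch only, the hypothetical helper sum_transfers below counts the per-destination rows of a Data Transfers table and adds up their Transfers column; it assumes rows have the "SRC --> DST MB COUNT" shape shown on this slide.

```shell
# Sum the per-destination rows ("SRC --> DST MB COUNT") of a Data Transfers table.
# Reads the named files, or standard input if none are given.
# Prints: <number of rows> <total transfers>
# (sum_transfers is a hypothetical helper name.)
sum_transfers() {
    awk '$1 ~ /^[0-9]+$/ && $2 == "-->" { rows++; n += $5 }
         END { printf "%d %d\n", rows, n }' "$@"
}

# Example:
#   sum_transfers stats.txt
```

The header line also contains "-->", which is why the rule additionally requires a numeric source rank in the first field.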

Intel® Trace Analyzer
• Generate a trace file for Game of Life
• Investigate blocking Send using ITA
• Change code
• Look at difference

Ideal Interconnect Simulator (IIS)
Helps to figure out an application's imbalance by simulating its behavior in the "ideal communication environment"

[Figure: ideal trace compared with real trace]

Imbalance diagram
[Figure: timelines alternating Calculation and MPI_Allreduce phases; legend: load imbalance, interconnect; ⇒ model]

Trace Analyzer - Filtering

mpitune utility

Cluster-specific tuning
• Run it once after installation and each time after a cluster configuration change
• Best configuration is recorded for each combination of communication device, number of nodes, MPI ranks and process distribution model

# Collect configuration values:
$ mpitune
# Reuse recorded values:
$ mpiexec -tune -n 32 ./your_app

Application-specific tuning
• Tune any kind of MPI application by specifying its command line
• By default performance is measured as the inverse of execution time
• To reduce overall tuning time use the shortest representative application workload (if applicable)

# Collect configuration settings
$ mpitune --application \"mpiexec -n 32 ./my_app\" -of ./my_app.conf
Note: the backslashes and quotes are mandatory

# Reuse recorded values
$ mpiexec -tune ./my_app.conf -n 32 ./my_app

Stay tuned!
• Learn more online
  – Intel® MPI self-help pages
    http://www.intel.com/go/mpi
• Ask questions and share your knowledge
  – Intel® MPI Library support page
  – Intel® Software Network Forum
    http://software.intel.com/en-us/forums/intel-clusters-and-hpc-technology/
