High Performance Communication using MPJ Express

Date posted: 29-Jan-2016
  • High Performance Communication using MPJ Express

    Presented by Jawad Manzoor, National University of Sciences and Technology, Pakistan

  • Presentation Outline

    Introduction
    Parallel computing
    HPC platforms
    Software programming models
    MPJ Express
    Design
    Communication devices
    Performance evaluation

  • Serial vs Parallel Computing

    Serial Computing

    Parallel Computing


  • HPC Platforms

    There are three kinds of high performance computing (HPC) platforms:

    Distributed memory architecture: Massively Parallel Processor (MPP)
    Shared memory architecture: Symmetric Multiprocessor (SMP), multicore computers
    Hybrid architecture: SMP clusters

    Most modern HPC hardware is based on hybrid models.

    Figures: Distributed Memory, Shared Memory, Hybrid

  • Software Programming Models

    Shared memory models: a process has direct access to all memory. Examples: Pthreads, OpenMP.

    Distributed memory models: a process has no direct access to the memory of other processes. Example: Message Passing Interface (MPI).


  • Message Passing Interface (MPI)

    Message Passing Interface is the de facto standard for writing applications on parallel hardware. It was designed primarily for distributed memory machines, but it is also used on shared memory machines.

  • MPI Implementations

    OpenMPI: an open source, production quality implementation of MPI-2 in C. Existing high performance drivers: TCP/IP, shared memory, Myrinet, Quadrics, Infiniband.

    MPICH2: an implementation of MPI for SMPs, clusters, and massively parallel processors (POSIX shared memory, SysV shared memory, Windows shared memory, Myrinet, Quadrics, Infiniband, 10 Gigabit Ethernet).

    MPJ Express: implements the high level functionality of MPI in pure Java and provides the flexibility to update the layers or add new communication devices (TCP/IP, Myrinet, threads shared memory, SysV shared memory).

  • Java NIO Device

    Uses non-blocking I/O functionality and implements two communication protocols:

    Eager send: for small messages (< 128 Kbytes); may incur additional copying.
    Rendezvous: for long messages (≥ 128 Kbytes); control messages are exchanged before the actual transmission.
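
The choice between the two protocols can be sketched as a simple size check. This is only an illustration: the class and method names are hypothetical and are not MPJ Express internals, and treating a message of exactly 128 Kbytes as a rendezvous message is an assumption.

```java
// Hypothetical sketch: names are illustrative, not taken from MPJ Express.
final class ProtocolChooser {
    enum Protocol { EAGER, RENDEZVOUS }

    // Threshold from the slide: messages below 128 Kbytes are sent eagerly.
    static final int EAGER_LIMIT_BYTES = 128 * 1024;

    // Assumption: a message of exactly 128 Kbytes uses rendezvous.
    static Protocol choose(int messageSizeBytes) {
        if (messageSizeBytes < 0) {
            throw new IllegalArgumentException("negative message size");
        }
        return messageSizeBytes < EAGER_LIMIT_BYTES
                ? Protocol.EAGER
                : Protocol.RENDEZVOUS;
    }

    public static void main(String[] args) {
        System.out.println(choose(1024));        // small message
        System.out.println(choose(256 * 1024));  // large message
    }
}
```

Eager send avoids the handshake at the cost of possible extra copying on the receiver, which is why it is reserved for small messages.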


  • Standard mode with eager send protocol (small messages)

  • Standard mode with rendezvous protocol (large messages)

  • Shared Memory Communication Device

    Threads based: each MPJ process is represented by a Java thread, and data is communicated using shared data structures (sendQueue and recvQueue).

    SysV based: each MPJ process is represented by a Unix process, and data is communicated using shared data structures. It consists of three modules:

    Java module: the xdev API implementation for shared memory communication
    C module: Unix SysV inter-process communication methods
    JNI module: bridge between C and Java
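
The threads-based idea can be sketched with plain Java threads and one receive queue per rank. This is a minimal sketch under assumptions: the class `ThreadDeviceSketch` and its `send`/`recv` methods are hypothetical, and the real device's sendQueue and recvQueue are likely richer than a single queue of strings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch: each "process" is a thread, each rank owns one queue.
final class ThreadDeviceSketch {
    private final List<BlockingQueue<String>> recvQueues = new ArrayList<>();

    ThreadDeviceSketch(int nprocs) {
        for (int rank = 0; rank < nprocs; rank++) {
            recvQueues.add(new LinkedBlockingQueue<>());
        }
    }

    // A send is an enqueue on the destination rank's queue.
    void send(int destRank, String message) {
        recvQueues.get(destRank).add(message);
    }

    // A receive blocks until a message arrives in this rank's queue.
    String recv(int myRank) {
        try {
            return recvQueues.get(myRank).take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while receiving", e);
        }
    }

    // Runs a two-thread exchange and returns what rank 1 received.
    static String demo() {
        ThreadDeviceSketch device = new ThreadDeviceSketch(2);
        String[] received = new String[1];
        Thread rank0 = new Thread(() -> device.send(1, "hello from rank 0"));
        Thread rank1 = new Thread(() -> received[0] = device.recv(1));
        rank0.start();
        rank1.start();
        try {
            rank0.join();
            rank1.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while joining", e);
        }
        return received[0];
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Because all threads live in one JVM, no data crosses a process boundary; the SysV variant needs the C and JNI modules precisely because separate Unix processes do not share a Java heap.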

  • Figures: MPI communication using sockets; MPI communication using shared memory

  • Key Implementation Aspects

    Critical operations include: initialize, point-to-point send and receive, and finalize.

  • Initialization

    Figure: the shared memory segments of process 0, process 1, process 2, and process 3.

  • Point-to-point communication

    Communication between two processes.

    The source process sends a message to the destination process.

    Source and destination processes are identified by their rank.

  • Send Modes

    Blocking send: returns from the subroutine call only when the operation has completed.

    Non-blocking send: returns straight away and allows the subprogram to continue with other work. At some later time, the subprogram checks for completion of the send.
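
The difference between the two modes can be sketched in plain Java. The names `send` and `isend` below are hypothetical and only echo common MPI naming; the sketch uses `CompletableFuture` as a stand-in for a request handle and does not reflect how MPJ Express implements either mode.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch of blocking versus non-blocking send semantics.
final class SendModesSketch {
    private final BlockingQueue<String> channel = new LinkedBlockingQueue<>();

    // Blocking send: returns only after the message has been handed over.
    void send(String message) {
        deliver(message);
    }

    // Non-blocking send: returns at once with a handle to check later.
    CompletableFuture<Void> isend(String message) {
        return CompletableFuture.runAsync(() -> deliver(message));
    }

    private void deliver(String message) {
        channel.add(message);
    }

    // Returns the number of messages delivered by the end of the demo.
    static int demo() {
        SendModesSketch device = new SendModesSketch();
        device.send("blocking");                        // complete on return
        CompletableFuture<Void> request = device.isend("non-blocking");
        // The caller is free to do other work here while the send proceeds.
        request.join();                                 // later: wait for completion
        return device.channel.size();
    }

    public static void main(String[] args) {
        System.out.println("delivered: " + demo());
    }
}
```

The non-blocking form pays off only when there is useful work to overlap with the transfer; otherwise it behaves like a blocking send followed immediately by a wait.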

  • Sending a message

    The memory space of each process is divided into as many sub-sections as there are processes. Each sub-section is used for communication with one process.

  • Receiving a message

    The destination process attaches itself to the shared memory segment of the source process and reads messages from the sub-section allocated to it, which it locates using an offset.
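
The send and receive layout described above can be sketched with one `ByteBuffer` per rank standing in for that rank's shared memory segment. Everything here is an assumption made for illustration: the class name, the fixed slot size, the offset formula `destRank * slotBytes`, and the length prefix are hypothetical, and the sketch is single-threaded, so it omits the synchronization a real device needs.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: segments[r] stands in for rank r's shared memory segment.
final class SegmentLayoutSketch {
    private final int slotBytes;
    private final ByteBuffer[] segments;

    SegmentLayoutSketch(int nprocs, int slotBytes) {
        this.slotBytes = slotBytes;
        this.segments = new ByteBuffer[nprocs];
        for (int rank = 0; rank < nprocs; rank++) {
            // One sub-section (slot) per process in every segment.
            segments[rank] = ByteBuffer.allocate(nprocs * slotBytes);
        }
    }

    // Assumed layout: the slot for destRank starts at destRank * slotBytes.
    int offsetFor(int destRank) {
        return destRank * slotBytes;
    }

    // The sender writes into its own segment, in the slot reserved for destRank.
    void send(int srcRank, int destRank, byte[] payload) {
        if (payload.length > slotBytes - Integer.BYTES) {
            throw new IllegalArgumentException("payload too large for slot");
        }
        ByteBuffer slot = segments[srcRank].duplicate();
        slot.position(offsetFor(destRank));
        slot.putInt(payload.length).put(payload);  // length prefix, then data
    }

    // The receiver reads from the source's segment, at its own slot offset.
    byte[] recv(int destRank, int srcRank) {
        ByteBuffer slot = segments[srcRank].duplicate();
        slot.position(offsetFor(destRank));
        byte[] payload = new byte[slot.getInt()];
        slot.get(payload);
        return payload;
    }

    public static void main(String[] args) {
        SegmentLayoutSketch shm = new SegmentLayoutSketch(4, 64);
        shm.send(0, 2, "ping".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(shm.recv(2, 0), StandardCharsets.UTF_8));
    }
}
```

Giving each sender and receiver pair its own slot means two senders never write to the same region, so no lock is needed between different pairs.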


  • Finalization

    When communication is complete, the barrier method is called to synchronize all processes. The finalize method is then called, which destroys the shared memory allocated to the processes.
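
The ordering of barrier and finalize can be sketched with threads and a `CyclicBarrier`. This is an assumption-laden illustration: the class name is hypothetical, threads stand in for processes, and decrementing a counter stands in for destroying a shared memory segment.

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: no segment is released until every process reaches the barrier.
final class FinalizeSketch {
    // Returns the number of segments still allocated after all processes finalize.
    static int run(int nprocs) {
        AtomicInteger liveSegments = new AtomicInteger(nprocs);
        CyclicBarrier barrier = new CyclicBarrier(nprocs);
        Thread[] procs = new Thread[nprocs];
        for (int rank = 0; rank < nprocs; rank++) {
            procs[rank] = new Thread(() -> {
                try {
                    barrier.await();  // wait until all processes are done communicating
                } catch (InterruptedException | BrokenBarrierException e) {
                    throw new IllegalStateException("barrier failed", e);
                }
                liveSegments.decrementAndGet();  // stand-in for destroying the segment
            });
            procs[rank].start();
        }
        for (Thread proc : procs) {
            try {
                proc.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while joining", e);
            }
        }
        return liveSegments.get();
    }

    public static void main(String[] args) {
        System.out.println("segments left: " + run(4));
    }
}
```

Without the barrier, a fast process could destroy its segment while a slower process is still reading from it.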


  • Performance Evaluation

    A ping-pong program was written in which two processes repeatedly pass a message back and forth, with timing calls to measure the time taken for one message. We used a warm-up loop of 10K iterations, and the average time was calculated over 20K iterations after warm-up.

    We present latency and throughput graphs. Latency is the delay between the initiation of a network transmission by a sender and the receipt of that transmission by a receiver. Throughput is the amount of data that passes through a network connection over time, measured in bits per second.

    The latency graphs cover message sizes from 1 byte to 2 KB, and the throughput (bandwidth) graphs cover message sizes from 2 KB to 16 MB.
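
The structure of such a benchmark can be sketched with two Java threads that echo a message through a pair of queues. This is not the benchmark used in the presentation: the class name is hypothetical, the queues pass a reference rather than copying data over a network, and taking half the round-trip time as the one-way latency is an assumed convention.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical ping-pong sketch: warm-up iterations first, then timed iterations.
final class PingPongSketch {
    // Returns the average one-way time per message in nanoseconds.
    static double measure(int messageBytes, int warmup, int timed) {
        BlockingQueue<byte[]> toPong = new ArrayBlockingQueue<>(1);
        BlockingQueue<byte[]> toPing = new ArrayBlockingQueue<>(1);
        int total = warmup + timed;
        Thread pong = new Thread(() -> {
            try {
                for (int i = 0; i < total; i++) {
                    toPing.put(toPong.take());  // echo each message back
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        pong.setDaemon(true);
        pong.start();
        byte[] message = new byte[messageBytes];
        try {
            for (int i = 0; i < warmup; i++) {   // untimed warm-up loop
                toPong.put(message);
                toPing.take();
            }
            long start = System.nanoTime();
            for (int i = 0; i < timed; i++) {    // timed loop
                toPong.put(message);
                toPing.take();
            }
            long elapsed = System.nanoTime() - start;
            pong.join();
            return elapsed / (2.0 * timed);      // assumed: half of the round trip
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted during measurement", e);
        }
    }

    // Throughput in Mbit/s for one message of the given size and one-way time.
    static double throughputMbps(int messageBytes, double oneWayNanos) {
        return (messageBytes * 8.0) / (oneWayNanos / 1e9) / 1e6;
    }

    public static void main(String[] args) {
        double latency = measure(1024, 10_000, 20_000);
        System.out.printf("one-way time: %.1f ns, throughput: %.1f Mbit/s%n",
                latency, throughputMbps(1024, latency));
    }
}
```

The warm-up loop matters in Java in particular, since it gives the just-in-time compiler time to optimize the hot path before timing starts.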

  • Latency on Fast Ethernet

  • Throughput on Fast Ethernet

  • Latency on Gigabit Ethernet

  • Throughput on Gigabit Ethernet

  • Latency on Myrinet

  • Throughput on Myrinet

  • Questions?

  • Further Reading

    Parallel Computing: https://computing.llnl.gov/tutorials/parallel_comp/
    MPI: www.mcs.anl.gov/mpi
    MPJ Express: http://mpj-express.org/
    MPICH2: http://www.mcs.anl.gov/research/projects/mpich2/
    OpenMPI: http://www.open-mpi.org/


