Program Systems Institute Russian Academy of Sciences
Open TS dynamic parallelization system
Alexander Moskovsky, Program Systems Institute RAS
Pereslavl-Zalessky, 09/27/05
SKIF Supercomputing Project
- Joint project of the Russian Federation and the Republic of Belarus
- 2000-2004
- 10 + 10 organizations
- PSI RAS is the lead organization from the Russian Federation
- Hardware and software
Comparison: T-System and MPI

              Sequential    Parallel
High-level    C/Fortran     T-System (a few keywords)
Low-level     Assembler     MPI (hundred(s) of primitives)
Open TS: an Outline
- High-performance computing
- "Automatic dynamic parallelization"
- Combining functional and imperative approaches, high-level parallel programming
- T++ language: "parallel dialect" of C++, an approach popular in the 1990s
T-Approach
- "Pure" function (tfunction) invocations produce grains of parallelism
- A T-program is:
  - Functional at the higher level
  - Imperative at the low level (optimization)
- C-compatible execution model
- Non-ready variables, multiple assignment
- "Seamless" C extension (or Fortran extension)
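The idea of a non-ready variable can be illustrated in standard C++ with `std::future`. This is only an analogy under stated assumptions: it does not use the Open TS runtime or T++ syntax, and the function names `square` and `sum_of_squares` are hypothetical. Each call to a pure function is started as an independent task (a "grain"), and its result can be held before the value exists.

```cpp
#include <cassert>
#include <future>

// A "pure" function: the result depends only on the argument.
int square(int x) { return x * x; }

// Start two independent computations and combine their results.
// Each std::future stands in for a non-ready variable: it can be
// stored before the value exists, and get() blocks until it is ready.
// Standard C++ analogy only; this is not the Open TS API.
int sum_of_squares(int a, int b) {
    std::future<int> fa = std::async(std::launch::async, square, a);
    std::future<int> fb = std::async(std::launch::async, square, b);
    assert(fa.valid() && fb.valid());  // both tasks were launched
    return fa.get() + fb.get();        // wait for readiness, then add
}
```

In this sketch the waiting is explicit (`get()`), whereas the slide describes a model in which readiness is handled by the language and runtime.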
T++ Keywords
- tfun: T-function
- tval: T-variable
- tptr: T-pointer
- tout: output parameter (like &)
- tdrop: make ready
- twait: wait for readiness
- tct: T-context
Sample Program

#include <stdio.h>
#include <stdlib.h>  /* atoi */

tfun int fib (int n) {
  return n < 2 ? n : fib(n-1) + fib(n-2);
}

tfun int main (int argc, char **argv) {
  if (argc != 2) { printf("Usage: fib <n>\n"); return 1; }
  int n = atoi(argv[1]);
  printf("fib(%d) = %d\n", n, (int)fib(n));
  return 0;
}
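For comparison, here is a rough sketch of the same recursion written by hand in standard C++ with `std::async`. It is not how Open TS works internally. The names `fib_seq` and `fib_par` and the `depth` cutoff are assumptions introduced for illustration: the cutoff limits how many levels of the recursion spawn tasks, a decision the T++ version leaves to the runtime.

```cpp
#include <cassert>
#include <future>

// Sequential Fibonacci, used once the spawning depth is exhausted.
long fib_seq(int n) { return n < 2 ? n : fib_seq(n - 1) + fib_seq(n - 2); }

// Hand-written parallel Fibonacci (standard C++ analogy, not Open TS).
// One branch runs as an asynchronous task while the calling thread
// computes the other; `depth` is a hypothetical tuning parameter that
// bounds the number of levels at which new tasks are spawned.
long fib_par(int n, int depth) {
    assert(n >= 0);
    if (n < 2 || depth <= 0) return fib_seq(n);
    std::future<long> left =
        std::async(std::launch::async, fib_par, n - 1, depth - 1);
    long right = fib_par(n - 2, depth - 1);
    return left.get() + right;  // wait for the spawned branch
}
```

A call such as `fib_par(20, 3)` spawns a handful of tasks near the top of the recursion tree and runs everything below that sequentially.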
Open TS: Environment
Supports 1,000,000 threads per CPU
NPB, Test EP, Rewritten @OpenTS
- EP: Embarrassingly Parallel
- NAS Parallel Benchmarks (NPB) suite from NASA
- Speedup = 96% of theoretical maximum (on 10 nodes)
[Chart axes: Time, % of sequential; Efficiency, % of theoretical]
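The relationship between speedup and efficiency can be sketched as follows, assuming the theoretical maximum speedup equals the number of nodes. The function names and the sample timings in the usage note are hypothetical, not measurements from the slide.

```cpp
#include <cassert>

// Speedup: sequential run time divided by parallel run time.
double speedup(double t_seq, double t_par) {
    assert(t_seq > 0.0 && t_par > 0.0);
    return t_seq / t_par;
}

// Efficiency: achieved speedup as a fraction of the ideal speedup,
// which is assumed here to equal the node count.
double efficiency(double t_seq, double t_par, int nodes) {
    assert(nodes > 0);
    return speedup(t_seq, t_par) / nodes;
}
```

Under that assumption, 96% efficiency on 10 nodes corresponds to a speedup of about 9.6. For instance, hypothetical timings of 96 s sequential and 10 s parallel give `efficiency(96.0, 10.0, 10)` of roughly 0.96.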
Open TS vs MPI case study: Applications
- Popular and widely used
- Developed by independent teams (MPI experts)
- PovRay: Persistence of Vision Ray-tracer, enabled for parallel run by a patch
- ALCMD/MP_Lite: molecular dynamics package (Ames Lab)
T-PovRay vs MPI PovRay: code complexity

Program                                          Source code volume
MPI modules for PovRay 3.10g                     1,500 lines
MPI patch for PovRay 3.50c                       3,000 lines
T++ modules (for both versions 3.10g & 3.50c)    200 lines
T-PovRay vs MPI PovRay: performance
[Chart: Time MPI / Time OpenTS (vertical axis, 90% to 210%) versus number of processors (horizontal axis, 1 to 16)]
2 CPUs AMD Opteron 248 2.2 GHz, RAM 4 GB, GigE, LAM 7.1.1
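The quantity on the chart's vertical axis, Time MPI / Time OpenTS, can be read with a small helper like the one below. The function names and sample timings are hypothetical. A value above 100% means the Open TS run took less time than the MPI run.

```cpp
#include <cassert>

// Ratio plotted on the vertical axis, expressed in percent.
double mpi_to_opents_percent(double t_mpi, double t_opents) {
    assert(t_mpi > 0.0 && t_opents > 0.0);
    return 100.0 * t_mpi / t_opents;
}

// True when the Open TS run finished sooner than the MPI run,
// i.e. when the plotted ratio is above 100%.
bool opents_is_faster(double t_mpi, double t_opents) {
    return mpi_to_opents_percent(t_mpi, t_opents) > 100.0;
}
```

For example, hypothetical timings of 12 s for MPI and 10 s for Open TS give a ratio of about 120%.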
ALCMD/MPI vs ALCMD/OpenTS
- MP_Lite component of ALCMD rewritten in T++
- Fortran code is left intact
[Diagram: two software stacks. MPI version: MPI, MP_Lite, ALCMD. OpenTS version: OpenTS, MP_Lite, ALCMD.]
ALCMD/MPI vs ALCMD/OpenTS: code complexity

Program                          Source code volume
MP_Lite total/MPI                ~20,000 lines
MP_Lite, ALCMD-related/MPI       ~3,500 lines
MP_Lite, ALCMD-related/OpenTS    500 lines
ALCMD/MPI vs ALCMD/OpenTS: performance
[Chart: Time MPI / Time OpenTS (vertical axis, 80% to 110%) versus number of processors (horizontal axis, 1 to 16)]
2 CPUs AMD Opteron 248 2.2 GHz, RAM 4 GB, GigE, LAM 7.1.1, Lennard-Jones MD, 512,000 atoms
T-Applications
- MultiGen: biological activity estimation
- Remote sensing applications
- Plasma modeling
- Protein simulation
- Aeromechanics
- Query engine for XML
- AI applications
- etc.
ACKNOWLEDGEMENTS
- "SKIF" supercomputing project
- Russian Academy of Sciences grants
  - Program "High-performance computing systems on new principles of computational process organization"
  - Program of the Presidium of the Russian Academy of Sciences "Development of basics for implementation of distributed scientific informational-computational environment on GRID technologies"
- Russian Foundation for Basic Research, grant "05-07-08005-офи_а"
- Microsoft: contract for "Open TS vs MPI" case study
THANKS
ANY QUESTIONS?