Date post: | 18-Nov-2014 |
Category: | Technology |
Upload: | larkc |
Parallelisation in LarKC
Parallelization and Distribution - Motivation
• Distribution
– Make use of all (distributed) resources available
– Use data that cannot be shipped (either because of size or because of security restrictions) => move computation to the data instead of moving data to the computation
• Parallelization
– Make use of all resources available (e.g. if we have 17 machines, we would like them to work at the same time, not one after the other)
– Either within one site (e.g. HPC cluster) or distributed (e.g. thinking@home)
– Improve efficiency of computation
Parallelization and Distribution strategies in LarKC
Scalability (general concept)
• size N => time T
• size 2*N => time ≤ 2T (same resources) OR time ~T (double resources)
Scalability at plug-in level (plug-in scope)
• “Within a plug-in” parallelization: MPI, OpenMP, hybrid, …
Scalability at pipeline level (platform scope)
• “Across plug-ins” or “across instances of the same plug-in” parallelization: IBIS/JavaGAT, …, Grid middleware solutions
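The scalability criterion above can be written down as a simple check on measured run times. This is only a sketch: the function names and the timings are made up for illustration, and since the slide says no more than "~T", the 10% tolerance is an assumption.

```python
def scales_on_same_resources(t_n, t_2n):
    """True if doubling the input at most doubles the run time (time <= 2T)."""
    return t_2n <= 2 * t_n


def scales_with_doubled_resources(t_n, t_2n, tolerance=0.1):
    """True if doubling both input and resources keeps the run time near T.

    The tolerance (10% of T by default) is an assumption, not part of the slides.
    """
    return abs(t_2n - t_n) <= tolerance * t_n


# Hypothetical timings in seconds.
print(scales_on_same_resources(10.0, 19.0))       # True
print(scales_with_doubled_resources(10.0, 10.8))  # True
print(scales_with_doubled_resources(10.0, 14.0))  # False
```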
Parallelization and Distribution in the LarKC Platform – Local execution
Current Prototype
• Modular
• Pluggable
• Loose coupling between platform and plug-ins, and between plug-ins
• Support for coarse-grained parallelization (across plug-ins)
[Figure: Decider, Plug-in Registry and Pipeline Support System; each plug-in (Query Transformer, Identifier, Info. Set Transformer, Selecter, Reasoner) sits behind the Plug-in API and has its own Local Plug-in Manager]
Parallelization and Distribution in the LarKC Platform – Remote execution
Implementation in progress
+ Support for distributed remote execution
• Parallelization across plug-ins
[Figure: Decider, Plug-in Registry and Pipeline Support System; five Stub Plug-in Managers; each plug-in (Query Transformer, Identifier, Info. Set Transformer, Selecter, Reasoner) sits behind the Plug-in API and has its own Remote Plug-in Manager]
Application of Parallelization and Distribution - Example
[Figure: example pipeline with Query Transformer, Identifier, Selecter 1, Selecter 2, Reasoner and Decider; annotations: "Parallelization within plug-in", "Distribution"]
• The strategy for parallelization and distribution must be customized for every use case => optimization of performance
• Automating this in the Decider: possibly another research programme
• LarKC offers the necessary support for its deployment and execution
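The two levels of parallelization named in this example can be sketched in a few lines of Python. Everything here is hypothetical: the function names, the toy data and the way results are merged are illustrative stand-ins, not the LarKC plug-in API, and `run_pipeline` simply plays the role of the Decider with a hand-written strategy. Threads are used only to keep the sketch short; a CPU-bound plug-in would more likely use processes, MPI or OpenMP.

```python
from concurrent.futures import ThreadPoolExecutor


# Hypothetical stand-ins for plug-ins; none of these names come from LarKC.
def query_transformer(query):
    """Turn a raw query string into a list of lowercase keywords."""
    return query.lower().split()


def identifier(keywords, corpus):
    """Keep the statements that mention at least one keyword."""
    return [s for s in corpus if any(k in s for k in keywords)]


def selecter_short(statements):
    """Selecter 1: keep statements with at most three words."""
    return {s for s in statements if len(s.split()) <= 3}


def selecter_typed(statements):
    """Selecter 2: keep statements that contain the word 'is'."""
    return {s for s in statements if " is " in f" {s} "}


def reason_chunk(chunk):
    """Toy 'reasoning' step: read the subject off each statement."""
    return [s.split()[0] for s in chunk]


def reasoner(statements, workers=2):
    """Parallelization within a plug-in: split the input, process chunks concurrently."""
    items = sorted(statements)
    chunks = [items[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = list(pool.map(reason_chunk, chunks))
    return sorted(x for part in parts for x in part)


def run_pipeline(query, corpus):
    """Hand-written strategy standing in for the Decider."""
    keywords = query_transformer(query)
    candidates = identifier(keywords, corpus)
    # Parallelization across plug-ins: both selecters run on the same input at once.
    with ThreadPoolExecutor(max_workers=2) as pool:
        first = pool.submit(selecter_short, candidates)
        second = pool.submit(selecter_typed, candidates)
        selected = first.result() | second.result()
    return reasoner(selected)


corpus = [
    "socrates is human",
    "plato is a philosopher",
    "zeno walks",
    "athens is a city in greece",
]
print(run_pipeline("Socrates Plato Zeno", corpus))
```

The strategy is fixed by hand in `run_pipeline`: which plug-ins run side by side and how many workers the reasoner uses. That is the per-use-case customization the slide refers to.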
High Performance and Distributed Computing support in LarKC (1/2)
LarKC supports large-scale HPC and distributed computing environments for executing plug-ins/pipelines
[Figure: three-layer architecture (plug-in layer, platform layer, resource layer). Labels: Decider, Identifier, Reasoner, …; LarKC platform; LarKC Data Layer; developer extensions; LarKC middleware adapters/extensions; native middleware solutions; user environment (user desktop machine); high-performance and Grid (Cloud) environment (high-performance and cluster systems, Public Desktop Grid, volunteer resources, Cloud resources); data storage (RDFStore, RDFDoc)]
High Performance and Distributed Computing support in LarKC (2/2)
Computing environments potentially supported by LarKC:
• High-performance computing systems (clusters of SMP nodes)
• High Performance Computing Grid infrastructure (e.g. EGEE, DEISA, etc.)
• Cloud computing environments
• Service Grid (e.g. EDGeS)
• Desktop Grids: Public Desktop Grid (BOINC based), Public Desktop Grid (XtremWeb based), Local Desktop Grid; volunteer resources
Implementation in progress