1
Introduction to gUSE and WS-PGRADE portal
Gergely Sipos [email protected]
MTA SZTAKI
www.guse.hu
www.wspgrade.hu
2
Outline
• History, family of P-GRADE products
– P-GRADE Portal, WS-PGRADE, gUSE
• WS-PGRADE features
– Scalable architecture
– Seamless access to various types of resources
• Comfort features
– Separated views for end users and developers
• Advanced data-flows
• Users and applications
• Summary and next steps
3
Family of P-GRADE Portal products
• P-GRADE portal
– Creating (basic) workflows and parameter sweeps for gLite and Globus middleware based utility grids
– www.portal.p-grade.hu
• P-GRADE/GEMLCA portal (University of Westminster)
– To wrap legacy applications into Grid Services
– To add legacy code services to P-GRADE workflows
– http://www.cpc.wmin.ac.uk/cpcsite/gemlca
– (No parameter sweep support!)
• WS-PGRADE
– Creating complex workflows and parameter sweeps for local clusters, utility grids and desktop grids
– Creating complex applications using embedded workflows, legacy codes and community components from repository
– www.wspgrade.hu
  • Apply for an account on Beta release
  • Browse the User manual
4
P-GRADE Grid Portal
• Pros.
– Easy-to-use workflow system with graphical editor
– Easy-to-use parameter sweep concept at workflow level
– Multi-grid / multi-VO access mechanism: job submission to LCG (~old gLite), gLite and Globus Toolkit 2
– Intelligent handling of grid errors
– Open source community on Sourceforge
– Reliable, production installations for several grids and EGEE VOs
– Part of EGEE RESPECT programme
• Cons.
– Considered too simple for some IT end users, while too complicated for some non-IT end users
– Workflow features found limited for some applications
– Internal structure is monolithic: hard to build a developer community around it
5
Motivations of creating gUSE
• To overcome (most of) the limitations of P-GRADE Portal:
• To provide better modularity
• To improve scalability
• To enable advanced dataflow patterns
• To interface with a wider range of resources
• To separate Application Developer view from Application End User view
• New products: WS-PGRADE (Web Services Parallel Grid Runtime and Developer Environment) and gUSE (Grid User Support Environment) architecture
6
WS-PGRADE – gUSE architecture
[Figure: three-layer architecture diagram]
• Graphical User Interface: WS-PGRADE (Gridsphere portlets)
• Autonomous Services: high level middleware service layer (gUSE): Workflow engine, Workflow storage, File storage, Application repository, Logging, gUSE information system, Meta-broker, Submitters
• Resources: middleware service layer: Local resources, Service grid resources, Desktop Grid resources, Web services, Databases
7
gUSE application: Acyclic dataflow
• Job to run on dedicated machine
• Job to run in a gLite VO
• Job to run in a Globus 2 VO
• Job to run in a Globus 4 VO
• Task to run in a BOINC Grid
• Web service invocation
• Database operation (R / W)
• File from the client host
• File from a GridFTP site
• File from an LFC catalog (content from gLite SE)
• Input string for a task or service
• Result of a Database query
8
Dataflow programming with gUSE
• Separate application logic from data
• Cross & dot product data-pairing
– All-to-all vs. one-to-one pairing of data items
– (Concept from Taverna)
• Generator components: to produce many output files from 1 input file
• Collector components: to produce 1 output file from many input files
• Any component can be generator or collector
• Conditional execution based on equality of data
• Nesting, cycle, recursion
[Figure: example dataflow graph annotated with per-port data item counts; 7042 tasks in total]
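The two pairing modes named on this slide can be illustrated with a minimal sketch. It is not gUSE code: the function names and the sample data are made up for illustration, and the dot product here is restricted to ports of equal length, since the slide does not say how unequal ports are handled.

```python
from itertools import product


def cross_product(port_a, port_b):
    """All-to-all pairing: every item of port_a meets every item of port_b."""
    return list(product(port_a, port_b))


def dot_product(port_a, port_b):
    """One-to-one pairing: items that share the same index are paired."""
    if len(port_a) != len(port_b):
        # Assumption of this sketch only; not a documented gUSE rule.
        raise ValueError("this sketch only handles ports of equal length")
    return list(zip(port_a, port_b))


if __name__ == "__main__":
    molecules = ["mol0", "mol1", "mol2"]   # hypothetical input files
    settings = ["cfg0", "cfg1", "cfg2"]    # hypothetical input files
    print(cross_product(molecules, settings))  # one task per combination
    print(dot_product(molecules, settings))    # one task per shared index
```

Each returned pair stands for one job submission, so the pairing mode directly determines how many tasks a component produces.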
9
Task execution process
[Figure: components involved in task execution. A user action, external event or time trigger starts the process in WS-PGRADE; the Workflow Engine works with Workflow storage, File storage, the gUSE Web Services and the Meta-broker]
• EGEE Submitter → gLite WMS
• Desktop Grid Submitter → Desktop Grid server
• Local Submitter → Dedicated cluster
• Web Service Client → Web Service
• Database Client → DBMS
• …
10
gUSE Generic Grid-Grid bridge
[Figure: EDGeS bridge diagram. Labels: WS-PGRADE/gUSE on a UI machine; EGEE VO 1 with UI machine, WMS machine, Comp. Element, worker nodes (WN) and other EGEE services; EGEE VO 2; Desktop Grid with DG server; 3G Bridge with DC-API plugin; 3G Bridge with EGEE plugin]
11
Ergonomics
• Users can be grid application developers or end-users.
• Application developers design sophisticated dataflow graphs in gUSE
– Embedding into any depth, recursive invocations, conditional structures, generators and collectors at any position
– Use personal grid certificate for test executions
– Publish applications in the repository at certain stages of work
  • Graphs
  • Templates
  • Concrete workflows
• End-users see gUSE as a science gateway
– List ready-to-use applications from repository
– Import and execute applications without knowledge of programming, dataflow, grid, internal structure of application
– Use personal grid certificate for production execution
12
End users’ view
13
Import an application from repository
To avoid overwriting any of your existing applications: choose a new name!
14
Email notification
Request an email notification to learn when the execution of your application has finished!
15
Application list
Provide input for the application then submit!
16
Set auto-submission
Define when WS-PGRADE should submit your application!
17
Submission triggered by an external event
Your application will be submitted when a request arrives with the key that you just set. (WS invocation)
18
Write your own submission trigger!
• URL of the WS-PGRADE portal server
• pUser: Owner of the application (Portal User)
• pID: Key to identify the call
• pText: Freeform string to identify the application instance
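A trigger call could be composed along the following lines. This is only a sketch: the slides name the parameters pUser, pID and pText but do not give the service path or the transport, so the endpoint URL below is hypothetical and passing the parameters as a query string is an assumption. The function only builds the URL and sends nothing.

```python
from urllib.parse import urlencode


def build_trigger_url(portal_url, p_user, p_id, p_text):
    """Compose a trigger request URL from the parameters named on the slide.

    portal_url is a placeholder: the real service path on the portal
    server is not given in the slides.
    """
    query = urlencode({"pUser": p_user, "pID": p_id, "pText": p_text})
    return f"{portal_url}?{query}"


if __name__ == "__main__":
    url = build_trigger_url(
        "https://portal.example.org/trigger",  # hypothetical endpoint
        "alice",                               # hypothetical portal user
        "my-key",                              # key set on the portal
        "nightly run",                         # freeform instance label
    )
    print(url)
```

Using `urlencode` keeps freeform text in pText safe to transmit, since spaces and special characters are escaped.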
19
Set custom input files / input strings / SQL queries for components
20
Monitoring application execution
Step 1: The application is selected by button “Submit”
Step 2: Define a freeform description to identify this particular executable instance
21
States of a Workflow Instance
[Figure: state diagram. States: Origin, Submitted, Running, Suspended, Finished, Aborted, Error. Transitions: Submit, First Job Starts, Last Job terminates, Suspend, Resume, Abort, Internal Error]
1. If all state counters are 0, then there is no Instance of the given Workflow
2. In column “Error”, the number of instances in states “Error” and “Aborted” is summed
3. Instances in state “Suspended” are displayed according to their preceding states
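The states and transitions on this slide can be read as a small state machine. The sketch below is one assumed reading of the diagram: the slide lists the state and transition names, but which edge connects which states is inferred here, and the class and method names are made up for illustration.

```python
# Transition table: (current state, event) -> next state.
# The edges are an assumed reading of the diagram, not confirmed by the slides.
TRANSITIONS = {
    ("Origin", "Submit"): "Submitted",
    ("Submitted", "First Job Starts"): "Running",
    ("Running", "Last Job terminates"): "Finished",
    ("Submitted", "Abort"): "Aborted",
    ("Running", "Abort"): "Aborted",
    ("Submitted", "Internal Error"): "Error",
    ("Running", "Internal Error"): "Error",
}


class WorkflowInstance:
    def __init__(self):
        self.state = "Origin"
        self._before_suspend = None  # state to return to on Resume

    def fire(self, event):
        """Apply an event and return the new state."""
        if event == "Suspend" and self.state in ("Submitted", "Running"):
            self._before_suspend = self.state
            self.state = "Suspended"
        elif event == "Resume" and self.state == "Suspended":
            self.state = self._before_suspend
            self._before_suspend = None
        elif (self.state, event) in TRANSITIONS:
            self.state = TRANSITIONS[(self.state, event)]
        else:
            raise ValueError(f"{event!r} is not allowed in state {self.state!r}")
        return self.state

    def displayed_state(self):
        """Column shown in the instance list, following notes 2 and 3."""
        state = self._before_suspend if self.state == "Suspended" else self.state
        return "Error" if state in ("Error", "Aborted") else state
```

`displayed_state` mirrors the two display rules: a suspended instance is counted under its preceding state, and aborted instances are counted in the “Error” column.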
22
Downloading results
• Columns of individual instances: please note that outputs can be downloaded separately
• Inner slider to browse and access each instance
• Columns of bulk download: all or selected parts of all instances of a given WF can be downloaded
• Information about the user's storage quota on the Portal server
23
Application developers’ view
24
Development cycle of a gUSE dataflow application
[Figure: development cycle involving the end user’s account and the gUSE Application repository service]
• Graph: component layout, I/O ports, edges
• Concrete Workflow: algorithms/services, resources, inputs
• Template: constraints, comments (which parts and parameters can be modified, which cannot)
• Workflow Instance: running state, outputs
• Repository Item
• Actions: Define content, Test execution, Prepare for re-usage, Publish, Import, Submit
25
Graph editor (P-GRADE Portal look-and-feel)
Components remain empty at this stage!
26
Workflow Configuration: Creating Concrete Workflow
Select Configure
Select a job by mouse click
Fill in the job property characteristics. Details have been discussed previously.
Select Port Property Configuration
Fill in the port property characteristics. Details have been discussed previously.
Select JDL/RSL Configuration
Select one of the JDL/RSL Configuration Parameters from the list box
Insert a definition
Confirm the settings
Close the configuration of this job
Save & Upload the Workflow configuration. Watch for any error messages!
Return to main view
Note the inner slider: by moving it you can reach, and make visible, any Port of the current job
27
Explaining Generator, Normal and Collector job types
[Figure: a Generator job (1 run) produces 3 output files (0, 1, 2); a Normal job runs once per file (1., 2., 3. run); a Collector job (1 run) gathers the results]
• Generator: typically for generating input parameters of a simulation (often a database query). The number of results generated by a single query can be unknown!
• Normal: typically for processing data in “parameter sweep” fashion
• Collector: typically for statistical analysis
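The three job types can be mimicked with plain functions. This is a toy sketch, not gUSE code: integers stand in for files, the function names are invented, and the "simulation" and the statistic are arbitrary placeholders.

```python
from statistics import mean


def generator(seed):
    """Generator job: one run, several outputs (here three stand-in values)."""
    return [seed + offset for offset in (0, 1, 2)]


def normal(value):
    """Normal job: runs once for every input it receives."""
    return value * 10  # placeholder for a simulation step


def collector(results):
    """Collector job: one run that starts only when all inputs have arrived."""
    return mean(results)


def run_pipeline(seed):
    generated = generator(seed)                 # 1 run
    processed = [normal(v) for v in generated]  # one run per generated item
    return collector(processed)                 # 1 run


if __name__ == "__main__":
    print(run_pipeline(0))
```

The list comprehension is where the parameter sweep happens: the number of Normal runs follows from however many items the Generator produced, which need not be known in advance.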
28
Configuring the Workflow: Data processing pattern
[Figure: workflow graph with input data counts h, m and n, a generator multiplier *K and a single collected output]
• Define input data elements for every open input Port
• Determine Job to be Generator by defining a Multiple output port. In this case the job may produce more than one output file on the multiple output port within one job submission step
• Determine Dot or Cross product relation of Input ports to define the number of job submissions
• Determine Job to be Collector by defining a Gathering Input Port. The Job execution will be postponed until all input files to that Port have arrived and can be processed in a single job submission step
Legend: Cross Product, Dot Product, G (Generator), C (Collector), Number of data elements
29
Animating the number of generated output data elements
[Figure: animated workflow graph annotated with data element counts: inputs h, m, n; generator multiplier *K; intermediate counts m*n and h*K; cross product m*n*h*K; S = max(m*n, h*K); G marks a Generator, C a Collector]
• In case of a Generator job the number of job submissions may differ from the number of files on the Output Ports
• In case of cross product an individual Job submission is generated for each possible input file combination
• In case of dot product the Job is submitted with input files having a common index number in each input Port
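The counts annotated on this slide can be reproduced with a few lines of arithmetic. The sketch takes the slide's formulas at face value (cross product multiplies the port sizes, dot product yields S = max of the port sizes); the function names and the sample values for m, n, h and K are made up, and how a shorter port is matched against a longer one is not stated in the slides.

```python
from math import prod


def cross_count(*port_sizes):
    """Job submissions under cross product: one per input combination."""
    return prod(port_sizes)


def dot_count(*port_sizes):
    """Job submissions under dot product, using the slide's S = max(...)."""
    return max(port_sizes)


if __name__ == "__main__":
    m, n, h, K = 2, 3, 4, 5          # hypothetical sizes
    mn = cross_count(m, n)           # m*n files after the first pairing
    hK = h * K                       # h inputs, each expanded K times by a generator
    print(cross_count(mn, hK))       # m*n*h*K
    print(dot_count(mn, hK))         # S = max(m*n, h*K)
```

With these sample sizes the cross product yields 120 submissions and the dot product 20, which shows how quickly the choice of pairing changes the size of a run.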
30
Data processing complexity: P-GRADE or WS-PGRADE
P-GRADE Portal:
• Generators only at the first level of the graph
• Collectors always at the last level of the graph
• Always CROSS product between parameter input channels
WS-PGRADE Portal:
• Generators at any level of the graph
• Collectors at any level of the graph
• Freedom to use CROSS or DOT product
[Figure: example graphs contrasting generator (G) and collector (C) placement in the two portals]
31
Other advanced features for developers
• Workflow embedding
– Simplify workflows by hierarchical development
– Special case: iterative embedding (iterate computation towards the final result)
• Conditional execution of workflow branches
– Execute a branch only if it receives the expected input
32
Example: Need for complexity (CancerGrid workflow)
[Figure: CancerGrid workflow graph with two generator jobs; ports annotated with counts and multipliers such as 1, N, xN and NxM]
N=20e-30e, M=100 ~2.7 billion tasks !!!
33
Current users of gUSE
• CancerGrid project
– Predicting various properties of molecules to find anti-cancer leads
– Creating science gateway for chemists
• EDGeS project (Enabling Desktop Grids for e-Science)
– Integrating EGEE with BOINC and XtremWeb technologies
– User interfaces and tools
• ProSim project
– In silico simulation of intermolecular recognition
– JISC ENGAGE program
• University of Westminster Desktop Grid
– Using AutoDock on institutional PCs
34
Conclusions
• P-GRADE Portal remains supported
– Features can serve most grid scenarios
– Open source project on Sourceforge
• WS-PGRADE
– Implemented on top of scalable, WS based gUSE architecture
– More expressive dataflow patterns
– Transparent access to
  • Local resources
  • Service Grids
  • Desktop Grids
  • Databases
  • Web services
– Application repository
  • Service for collaboration of developers and end-users
35
Next steps with gUSE: www.guse.hu
User manual
Request an account