Post on 16-Jan-2016
transcript
WS-PGRADE
Akos BalaskoLPDS
MTA SZTAKI
Content
General OverviewParts of SystemActivities (of advanced and of common user)Simplified Graph of the hierarchy of ActivitiesPrinciples of Workflow SubmissionPrinciples of Job SubmissionsPrinciples of Job InputPrinciples of Job Output
Portlet DetailsGraphCreate ConcreteConcreteTemplateStorageUploadRepository
General Overview
Inside of the System gUSE Tires
WFGraph editor
WEB-UI(HTML)
Liferay
WS-PGRADE portal
Information System
WF Storage
File Storage
Application Repository
WF Interpreter
Backend
local submitter
GT4 submitter
Glite submitter
LCG submitter
GT2 submitter
GMLCA submitter
WS (Axis) submitter
Glite GridGlite
Grid
GMLCA GridGMLCA
Grid
GT2 GridGT2 Grid
GT4 GridGT4 Grid
LCG GridLCG
Grid
WSWS
DC
I-B
ridge
Concrete Workflow
Algorithms,Resource references,Inputs
Graph
Jobs,Edges,Ports
Template
Constraints,Comments,Form Generators
Workflow Instance
Running state,Outputs
Repository Item
Application ORProject OR,Workflow part(G,T,CW)
Static References
Legend:a b a must reference ba b a may reference b
Concrete Workflow
Algorithms,Resource references,Inputs
Graph
Jobs,Edges,Ports
Template
Constraints,Comments,Form Generators
Workflow Instance
Running state,Outputs
Repository Item
Application ORProject OR,Workflow part(G,T,CW)
Static References
Legend:a b a must reference ba b a may reference b
DAG graph describes the
skeleton of workflow
Jobs are containers of insulated
computations
Edges refers to channels connecting
input/output files
Ports connect channel endpoints or files with Job interiors
Concrete workflow defines the semantic of the workflow execution
Algorithms describe job interiors and may be
defined by binaries, by service calls or by references to other
workflows
Resource references may define the places where
the jobs run and/or the way to find these places
Inputs define the input files elaborated by the Jobs.
Inputs may be extended by job running conditions and by multiplication factor indicating
a set of file to elaborate in subsequent “PS” Job
submissions
Workflow Instance is a submitted object of a Concrete Workflow
Runtime state composes information
generated during job submission in order to control and observe the
run
Outputs compose the result of the
whole computation
Template is a standardization making a Concrete Workflow
reusable
Constraints fix certain properties of Concrete
Workflows subsequently defined by the Template
Comments help the user to set the non fix
parameters of Concrete Workflows subsequently defined by the Template
Form Generators define a question form the end user must fill to use the workflow
in a simplified way
Application is a tested semantically defined Concrete workflow together with the definitions of its eventual
embedded Workflows.Only input files, command line
arguments and the destination of the submission may be left to the end user
to define.
Repository stores developed items in a compressed form
Project is a Concrete Workflow together with the definitions of
each referenced classes.
Workflow part is either a Graph, a Concrete Workflow, a Workflow
Instance or a Template
Concrete Workflow
Algorithms,Resource references,Inputs
Graph
Jobs,Edges,Ports
Template
Constraints,Comments,Form Generators
Workflow Instance
Running state,Outputs
Repository Item
Application ORProject OR,Workflow part(G,T,CW)
NewEdit, CopyDelete
New
New
Configure,Copy, Delete
New
Submit
New
User Activities
Export
Import
Observe,Download,Suspend,Delete
Edit
Legend:a c c is created usingb a and b as argument
Concrete Workflow
Algorithms,Resource references,Inputs
Graph
Jobs,Edges,Ports
Template
Constraints,Comments,Form Generators
Workflow Instance
Running state,Outputs
Repository Item
Application ORProject OR,Workflow part(G,T,CWI)
NewEdit, CopyDelete
New
New
Configure,Copy, Delete
New
Submit
New
User Activities - Developer
Export
Import
Observe,Download,Suspend,Delete
Edit
Legend:a c c is created usingb a and b as argument
1.Create and edit a Graph of a workflow
2. Create the WF, and define the semantics, file association and destination by Configure
3. Submit the Concrete Workflow to observe its state and fetch its result
4. For reusability Template can be made from a Workflow by fixing some of its features
5. Template can be used as an alternative way to define a Concrete Workflow
6 A new CW can be defined by matching a Graph and a CW
7 Tested WF can be exported to end user
Concrete Workflow
Algorithms,Resource references,Inputs
Graph
Jobs,Edges,Ports
Template
Constraints,Comments,Form Generators
Workflow Instance
Running state,Outputs
Repository Item
Application ORProject OR,Workflow part(G,T,CW,WI)
Configure,Delete
Submit
User Activities - End User
Import
Observe,Download,Suspend,Delete
1. Import of an Application including the Template (Eventual name collisions are handled)
2. Simplified setting of missing parameters with forms rendered by the Form Generators of the Template
3. Submit the Concrete Workflow to observe its state and fetch its result
Portlet
Concrete WF List
Workflow
History File
Selected Job
Job Exec Conf Port Configurations JDL/RSL file
Workflow instance List
Workflow Instance
Job List
Details Configure
Details
Std Output
Job
Job Instance List
View all Contents
Job Instance
Output FileSTD Error File STD Out File Log File
Std ErrorOutputLog
Job List
Info,
Submit,
Delete/Abort All
Suspend/Resume/Delete,
Visualize
Job Executable
WF visualization
Job Inputs & Outputs JDL/RSL
Job Config. History
Locate Item in graph
Locate Item
Locate Item
Locate Item
Locate Item
Map of PortletWorkflow/Concrete
Concrete WF List
Workflow
History File
Selected Job
Job Exec Conf Port Configurations JDL/RSL file
Workflow instance List
Workflow Instance
Job List
Details Configure
Details
Std. Output
Job
Job Instance List
View all contents
Job Instance
Output FileSTD Error File STD Out File Log File
Std. ErrorOutputLog
Job List
Info,
Submit,
Delete/Abort All
Suspend/Resume/Delete,
Visualize
Job Executable
WF visualization
Job Inputs & Outputs JDL/RSL
Job Config. History
Locate Item in graph
Locate Item
Locate Item
Locate Item
Locate ItemMap of Portlet Workflow/Concrete
In case of PS Workflows the list Job Instance may contain more than one elements with
different “PID” -s
Configuring the Workflow: Overview
hm n
*K
1
Determine number of accepted files on free input Ports
Determine Job to be Generator by defining Multiple output port.In this case the job may be able to produce more than 1 jobs associated to the multiple output port within one job submission step
Determine Dot or Cross product relation of Input ports
to define the number of job submissions
Determine Job to be Collector by defining a Gathering Input Port.
The Job execution will be postponed until all input files to
that Port have arrived and can be elaborated in a single job
submission step
Legend:
Cross Product
Dot Product
Submitting the Workflow: Overview(Animation the Number of Generated Output Files)
hm n
m*n
m*n h*K
S
m*n h*K
m*n*h*K
S S
S
S
S
h*K
1
S=max(m*n,h*k)
1
Sm*n*h*K
m*n h
S
S
In case of Generator job the number of job submissions may differ from the number of files on
Output Ports
In case of cross product individual Job submission is generated for each possible
input file combination
In case of dot product the Job is submitted with input
files having a common index number in each input
Ports
Workflow Configuration Hierarchy
Concrete WF List
Selected WF
Selected Job
Job Executable Job Inputs & Outputs JDL/RSL Job Config. History
Job is Workflow Job is Service Job is Binary
Job Execution Model
Job Execution Model
Insulated activity
Must have an algorithm defining the semantics.
Must have resources to implement the algorithm.
May have access right to the resources.
May have input Files.
May have output Files.
Concrete WF List
Selected WF
Selected Job
Job Executable Job Inputs & Outputs JDL/RSL Job Config. History
Job is Workflow Job is Service Job is Binary
Job Execution Model (Algorithm)
Job algorithm is defined during the Configuration and may be:
Binary codedefined by the user and delivered –similar to input files – to a resource to be executed on.
Service callwhere the Web Service has been previously implemented on a Resource
Call of an embedded Workflow
Concrete WF List
Selected WF
Selected Job
Job Executable Job Inputs & Outputs JDL/RSL Job Config. History
Job is Workflow Job is Service Job is Binary
intArithmetic_64bit.c
…..FILE x;
x=fopen(“INPUT”,”r”);
COMPILE
intArithmetic_64bit.exe
Concrete WF List
Selected WF
Selected Job
Job Executable Job Inputs & Outputs JDL/RSL Job Config. History
Job Execution Model (Algorithm: Binary Code – Input /Output)
Make a Source FileCompile itOn the panel Job Executable define
the code to be executed
On the panel Job Inputs & Outputs associate File Open reference with
Internal File Name
Job Execution Model (Algorithm: Binary Code - Configuration)
Step 4Define the Description of the code to sent on the defined Destination via the selected Submitter
Step 1Select Binary as Job interpretation class
Step 2Select one of the Submitters
Step 3Define the Destination hierarchy
Job Execution Model (Algorithm: Service Call)
In this case the user wants reach an existing remote service with the following attributes:
1. Type:Base standard type of the Web Service. The administrator of the portal server sets the list of standards the portal can understand. The default value is Axis.
2. Services:Selection list of services of the given type having been explored by the Portal Server.
3. Methods:Selection list functions the selected service implements.
Concrete WF List
Selected WF
Selected Job
Job Executable Job Inputs & Outputs JDL/RSL Job Config. History
Job is Workflow Job is Service Job is Binary
Job Execution Model (Algorithm: Service Call Configuration)
Step 1 Select Service as Job interpretation class
Step 2 Select Type of Service to be understood
Step 4 Type a Method as an interface routine of Service
Step 3 Type a Service URL
Job Execution Model (Algorithm: Service Call – Parameter I/O)
Rule: The sequence of port identifiers of a calling Job must correspond to the parameter sequence of a published Service.The parameter list of the service must be known by the user (WSDL tag “parameterOrder” defines order of parameters)Ports ordered by “Port Number” are associated to parameters ordered by “parameterOrder”.
Example:
0
1
2
Job
The Job “job” may be the invocation of the Service S with the following parameter list:
• First parameter: INPUT (corresponding to Port 0)
• Second parameter: OUTPUT (corresponding to Port 1)
• Third parameter: INPUT (corresponding to Port 2)
Job Execution Model (Algorithm: Call of an Embedded Workflow)
Principle:
Original Workflow
Embedded WorkflowTo ensure the compatibility of interfaces the embedded workflow must be defined by a Template
The dummy job whose execution will be substituted by the call of the embedded one
Job Execution Model (Algorithm: Call of an Embedded Workflow – Configuration, Selecting Embedded Workflow)
Step 1 Select (embedded) Workflow as Job interpretation class
Step 2 Embedded (called) Workflow is selected by the check list showing the possible templated Workflows
Focus on the caller Job
Job Execution Model (Algorithm: Call of an Embedded Workflow – Passing Input Parameter)
1
0
The blue line indicates that the focus is on the given port of the caller
The checklist permits the selection one of the permitted
ports of the embedded Workflow
0
1
The radio button should be set to “Yes” if we want to connect the given port of the caller to a port of the called workflow.In the other case the input file will be directed to “/dev/null”
Job Execution Model (Algorithm: Call of an Embedded Workflow – Passing Output Parameter)
0
1
The blue line indicates that the focus is on the given port
of the caller
0 1
2
File Associations
Input File Association to a Job
Step 1 Select the source of each Input Port and calculate the number of available filesStep 2 (in case of PS ports)
Match available files with the “Input Number” of each PortStep 3 (in case of PS ports)
Find Cross Product Groups among PortsStep 4 (in case of PS ports)
Compose the Dot Product Set from Cross Product Groups
General Rule:
If “Input Number” is not filled, the given Port will be regarded as simple Port in the other case PS Port.The reference to PS Port members will be coded by little subsequent integers starting from zero:0,1…n
JobPort 0
Port 1
Port 2
Concrete WF List
Selected WF
Selected Job
Job Executable Job Inputs & Outputs JDL/RSL Job Config. History
Input File Association to a Job
Step 1 Select the source of each Input Port and calculate the number of available files.
Step 2 (in case of PS ports)Match available files with the “Input Number” of each Port.
Step 3 (in case of PS ports)Find Cross Product Groups among Ports.
Step 4 (in case of PS ports)Compose the Dot Product Set from Cross Product Groups.
JobPort 0
Port 1
Port 2
File Association to Simple Input Port(Source: Local)
External File name
SQL
upload
Remote
Value
/<path>/<any>
..
<any>
<path>
Step 1 (offline)Create any file on the desktop
Step 2 (online)Port configuration:Select file to be upload
Input File Association to PS Input Port(Source: Local)
..012
0
1
2
..<any>.zip
0
1
2
paramInputs.zip External File name
SQL
upload
Remote
Value
/<path>/paramInputs.zip
<path>
Step 1 (offline)Create Files with special name convention
Step 2 (offline)Make zip files
Step 3 (online)Port configuration:Select file to be upload
0
1
2
Files available for the subsequent Job operations
Warning:To distinguish a PS input file
from any other input file in zip format the name must be fixed as paramInputs.zip
Input File Association to PS Input Port(Source: Local - Configuration)Focus is on the Port
0 of Job “Separator”
Local input file has been selected, whose content must be uploaded from the client.
Browse starts the pop up File selector
“Choose file”
Input File Association to a Simple Input Port(Source: Remote –LFC Catalogue)
..
<any>
<path>
LFC_HOST
<any>
SE
External File name
SQL
upload
Remote
Value
lfn:/<path/<any>
Step 1 (offline)Create Grid file on a remote storage
Step 2 (online)Port configuration:Reference the file by the logical file name
Input File Association to a Simple Input Port(Source: Remote – LFC Catalogue Configuration)
Remote input file has been selected
The flag “Copy to WN” is always set that means the content of file will be copied in the local working directory of the Worker Node (WN) where the job runs. In the alternative case it is the responsibility of job executable to reach the input file by an the independent reference using the GFAL API.
File Association to PS Input Port(Source: Remote – LFC Catalogue)
..<any>_0<any>_1<any>_2
<path>
LFC_HOST
<any>_1
SE
<any>_0
<any>_2
SE
External File name
SQL
upload
Remote
Value
lfn:/<path/<any>
Step 1 (offline)Create Grid files with common prefix and with common catalogue
Step 2 (online)Port configuration:Select the prefix of the logical file name
Files available for the subsequent Job operations
<any>_0<any>_1
<any>_2
File Association to Simple Input Port(Source: Remote – Globus GridMap)
External File name
SQL
upload
Remote
Value
gsiftp:/<path/<any>
Step 1 (offline)Create the file in a of a remote storage
Step 2(online)Port configuration:Reference the file with whole URL
<any>
..
<any>
<path>
RemoteSt.
File Association to PS Input Port(Source: Remote– Globus GridMap)
Step 1 (offline)Create files mapped in a common subdirectory of a remote storage
Step 2 (online)Port configuration:Select the prefix of the remote file URL name
External File name
SQL
upload
Remote
Value
gsiftp:/<path/<any>
<any>_2
<any>_1
<any>_0
..<any>_0<any>_1<any>_2
<path>
RemoteSt.
Files available for the subsequent Job operations
<any>_0<any>_1
<any>_2
File Association to a Simple Input Port(Source: Value)
External File name
SQL
upload
Remote
Value 16
16
Port configuration:Value will be forwarded as a text file content
File Association to PS Input Port(Source: SQL Database)
<any>.mdbDepartments
24ML
26FK
15MA
AgeGenderName
Person
<Db_URL>
External File name
SQL
Upload
Remote
Value
SQL URL (UDBC)
USER
PASSWORD
SELECT Age from Person where Gender =“M”
<DB_URL>
<any_user>
015 1
24
Step 1 (offline)Create Database on a remote site
Step 2 (online)Port configuration:set Query
Step 3 (online)Port configuration:File generation from result set
Input File Association to a Job Step 1 Select the source of each Input Port and calculate the number of available files.Step 2 (in case of PS ports)
Match available files with the “Input Number” of each Port.Step 3 (in case of PS ports)
Find Cross Product Groups among Ports.Step 4 (in case of PS ports)
Compose the Dot Product Set from Cross Product Groups.
JobPort 0
Port 1
Port 2
Configuration of PS-Input Port
Step 1 To access to the settings the Parametric Input must be in “View” state
Step 2 Own or Foreign Port identifier can be selected to build a
CPG or to joint to one.
Step 3 The input field “Input Numbers” appears and can be set only in case of free ports but not in case of channels, to define the number of files actually feed to the current job
Input File Association to a Job
Step 1 Select the source of each Input Port and calculate the number of available files
Step 2 (in case of PS ports) Match available files with the “Input Number” of each Port
Step 3 (in case of PS ports)Find Cross Product Groups among Ports
Step 4 (in case of PS ports)Compose the Dot Product Set from Cross Product Groups
JobPort 0
Port 1
Port 2
Introduction of free Cross Product Group Identifiers. Default values are the Port Numbers i.e. initially they are different
Find Cross Product Groups among Ports
Rule: Any Port referencing self composes a separate base Cross Product Group. Any Port references a foreign Port belongs to the CPG of the referenced.
CPG
1
CPG
2
Port id: 0
Dot & Cross PID 0
Port id: 1
Dot & Cross PID 0
Port id: 2
Dot & Cross PID 2
0X
1Y
2Z
0A
1B B 5
Z 5
B 3
X 3
B 4
Y 4
A 0
X 0
A 1
Y 1 Z
A 2
2
Showing a foreign port includes the given port to be a
member of the CPG
The next member set will be forwarded to each subsequent
Job submission
Input File Association to a Job
Step 1 Select the source of each Input Port and calculate the number of available files
Step 2 (in case of PS ports) Match available files with the “Input Number” of each Port
Step 3 (in case of PS ports)Find Cross Product Groups among Ports
Step 4 (in case of PS ports)Compose the Dot Product Set from Cross Product Groups
Job
Port 0
Port 1
Port 2
Summary Example of PS Input
Port id: 0
Dot & Cross PID 0
Dot & Cross PID 0
Port id: 2
Dot & Cross PID 2
0X
1Y
2Z
0A
1B
0a1
b2
c3
d4
e
A 0
X 0
0a
A 1
Y 1
1b
Z
A 2
2
2c
B 3
X 3
3d
B 4
Y 4
4a
B 5
Z 5
5a
Input Numbers 3
Input Numbers 4
Input Numbers 2
Port 1 is associated to Port 0 to compose a common CPG resulting
2*3= 6 combinations
This CPG containing Port 2 constrained to 4 elements, and the 5 element long list is exhausted. In
this case the user selected the “Use First” rule
Each column indicates the File Input Set of subsequent
job submission
If have not input F
If have not input F
If have not input F
Port id: 1
Conditions of Job Submission Definition of algorithmDefinition and availability of resourcesAvailability of access rightsAvailability of input files on each defined input
Ports:
The defined input files are associated to the Job.Generally one –in the case of PS ports, the next- file must be available.
Existence of all input files in case of Collector PS Input ports
True value of eventual conditions investigating the values of individual input files
Optional Conditions to Submit a Job: Collector Port
Rule:
In case of a channel port (there is an other Job feeding that port) the setting of flag “All” of radio button “Waiting” indicates that all file instances produced by the other job must be present to start the Job.In this case the executor of this job must be able to handle input files prefixed by the Internal File Name of the Port.
In case of calling of an embedded Workflow the associated input ports must be both of the same type collector or not collector i.e. the setting of Waiting must be together in the caller and called ports “All” or “One”.
Detailed Animation of Generator, Normal and Collector Jobs
*K
0
1
2
0
1
2
0Generator Collector
1. run
2. run
3. run
1 run 1 run
Optional Conditions to Submit a Job: Collector Port-example
Step 1 To set this option the input port must be “channel”
Step 2 To access to the setting of this option the features of the Parametric Input must be in “View” state
Step 3 To make the Port to be a Collector one the setting of “Waiting” must be “All”
Conditions of Job Submission Definition of algorithmDefinition and availability of resourcesAvailability of access rightsAvailability of input files on each defined input
Ports:
The defined input files are associated to the Job.Generally one –in the case of PS ports, the next- file must be available.
Existence of all input files in case of Collector PS Input ports True value of eventual conditions investigating the
values of individual input files
Optional Conditions to Submit a Job
Step 1View must be selected to manipulate the Port dependent Conditions
Step 2Operator must be selected from the checkbox list
Step 3Either Value or a port of a foreign Input port must be selected
Step 4Depending on Step 3 an a foreign input Port can be selected by its Internal Name
Output File Association to a Job
Step 1 Select the optional destination of each Output Port Step 2 (in case of PS Generator ports)
Determine the number of output files
General Rule:
As a result of each submission of a job a single output file is expected to be generated on each output Port of the Job.The single exception is the Generator Job which may produce more than 1 files on its PS Generator Output Port(s)The user can instruct the system to forget an output file after it has been used as input in subsequent Job submissions. In this case the Storage Type of the output file must be set as “Volatile” instead of the default setting “Permanent”
JobPort 0
Port 1
Port 2
Concrete WF List
Selected WF
Selected Job
Job Executable Job Inputs & Outputs JDL/RSL Job Config. History
Output File Association to a Job(Select the Optional Destination of Each Output Port-Local Output )
Upon the termination of the job the output file –referenced by its Internal File Name- will be copied from the working directory of the Worker Node to the Portal server and will remain there until the user downloads it to his/her client machine or deletes it.
<any>
..
<any>
<work.dir>
WN.
<any>
..
<any>
<job.dir>
PortalServ.
<new.zip>
..
<new.zip>
<usr.dir>
Client
Internal File Name: <any>
Upon termination of the Job on the Worker Node. The name of file produced by the executable of the job must correspond of that declared as Internal File Name. The correspondence is user responsibility
Triggered by User CommandDetails/Short Details/View Content(s)/Output applied on Workflow Instance. The compressed file (with user defined name) will contain files for additional information as well.
Output File Association to a Job(Local Output Configuration)
Concrete WF List
Selected WF
Selected Job
Job Executable Job Inputs & Outputs JDL/RSL Job Config. History
If the output is local (remains on the Portal Server ) only the Internal File Name must be defined
Identifying the Port by its index
Output File Association to a Job(Select the optional destination of each Output Port-Remote File + LFC Catalog)
Upon the termination of the job the output file - referenced by its Internal File Name - will be copied from the working directory of the Worker Node to the Remote Storage. To support PS the defined Remote File Name will be extended by and index.
<any>
..
<any>
<work.dir>
WN.
The name of file produced by the executable of the job must correspond of that declared as Internal File Name. The correspondence is user responsibility
Internal File Name: <any>
Remote File name: lfn:/<path>/<anya>
SE: <SX>
..<anya>_0
<path>
LFC_HOST <anya>_0
SX
The name oft the generated remote File will be extended by the postfix _<i> where i is integer number starting from 0. This rule is applied in any case even if the job is not part of a PS production
Output File Association to a Job(Remote Output LFC Configuration)
Concrete WF List
Selected WF
Selected Job
Job Executable Job Inputs & Outputs JDL/RSL Job Config. History
Grid File Catalogue referenced logical file name is defined
URL of Storage Element (SE) to contain the file to bee created. (Optional)
In case of remote files the Storage Type must be Permanent
Identifying the Port by its index
Output File Association to a Job(Select the Optional Destination of each Output Port-Remote File + LFC Catalog – after a PS Sequence)Shows the case when the job has participated in a PS production:
<any>
..
<any>
<work.dir>
WN.
The name of file produced by the executable of the job must correspond of that declared as Internal File Name. The correspondence is user responsibility
Internal File Name: <any>
Remote File name: lfn:/<path>/<anya>
SE:
..<anya>_0<anya>_1<anya>_2
<path>
LFC_HOST
<anya>_1
SE_X
<anya>_0
<anya>_2
SE_Y
The name oft the generated remote File will be extended by the postfix _<i> where i is integer number starting from 0.
If no SE is defined, remote files can be scattered randomly on available Storage Elements
Output File Association to a Job(Select the optional destination of each Output Port-Remote File + GLOBUS Grid Map)Upon the termination of the job the output file - referenced by its Internal
File Name - will be copied from the working directory of the Worker Node to the Remote Storage.
<any>
..
<any>
<work.dir>
WN.
The name of file produced by the executable of the job must correspond of that declared as Internal File Name. The correspondence is user responsibility
Internal File Name: <any>
Remote File name: gsiftp:/<path>/<anya>
<anya>
..
<anya>
<path>
RemoteSt.
URL to Remote Storage must be defined
The name of destination file remains unchanged if the executed Job was not part of a PS sequence
<anya>_2
<anya>_1
<anya>_0
..<anya>_0<anya>_1<anya>_2
<path>
RemoteSt.
Output File Association to a Job(Select the Optional Destination of Each Output Port - Remote File + GLOBUS Grid Map in Case of PS) Upon the termination of the job the output file –referenced by its
Internal File Name- will be copied from the working directory of the Worker Node to the Remote Storage.
<any>
..
<any>
<work.dir>
WN.
The name of file produced by the executable of the job must correspond of that declared as Internal File Name. The correspondence is user responsibility
Internal File Name: <any>
Remote File name: gsiftp:/<path>/<anya>
URL to Remote Storage must be defined
If the Job execution is part of a PS sequence the name oft the generated remote File will be extended by the postfix _<i> where i is integer number starting from 0.
Define a Generator Job
A job is of a Generator type if it has at least one multiple output port.
Syntactically a output port is multiple output port if the associated Output Number is bigger than one.If the number of files produced by a single run of the Generator job is less than the value of Output Number then the generated files will be encountered cyclically in further jobs.
If the number of output files exceed the Output Number the exceeding files will be not used
Detailed Animation of Generator (Case1: Output Number is equal with the number of Generated Jobs)
*K=3
0
1
2
0
1
2
Generator
1. run2. run
3. run
1 run
Detailed Animation of Generator (Case2: Output Number is Less than the Number of Generated Jobs)
*K=2
0
1
2
0
1
Generator
1. run2. run
1 run
Detailed Animation of Generator (Case3: Output Number is Bigger than the Number of Generated Jobs)
*K=4
0
1
2
0
1
2
Generator
1. run2. run3. run
1 run
3 0 3
4. run
Workflow
Workflow Activation
A workflow can be activated by the following way:
Interactively started by the user hitting the button Submit belonging to the given concrete workflow on the portlet Workflow/Concrete.
Workflow Activation(Interactively by the User)
Step 1The workflow is selected by button “Submit”
Step 2The submission can be confirmed or refused after the optional filling of a free description field identifying Workflow Instance for the user.
Workflow (Instance) States
Origin
Submitted
Running Suspended
AbortedFinishedError
Suspended
Submit
Suspend
Suspend
Resume
Resume
Abort
AbortFirst Job Starts
Last Job terminates
Internal Error
Internal Error
1.If all state counters are 0 then there is no Instance of the given Workflow.2. In Column “Error” the number instances being in states “Error” and “Aborted” are summed.3. Instances in state “Suspended” are displayed according their preceding states.
Workflow Creation and Configuration - Detailed
Based on the skeleton of the Workflow Graph the workflow receives all of its characteristics (parameters) in the configuration process in order to be composed a full, error free program which is able to run in a Grid environment (not taking account the needed proxy certificate, which must be provided by the user separately ).
The configuration may happen in several steps, even the way of the one time Creation of the workflow to be configured later may occur in a form of Pre-configuration, when the basis of the creation is not just a Workflow Graph but a previous Concrete Workflow (whose all parameters can be changed without its Graph) or a Template which imposes even further restrictions on the later configuration.
The configuration of a Concrete Workflow can be continued even if there are Workflow Instances have been Submitted in the Grid. However the Configuration is prohibited during the time when there is an existing Instance of that Workflow in states “SUBMITTED” or “RUNNING”.
Workflow Creation - DetailedThe one time creation of a concrete workflow may be based on:A Workflow Graph
(no predefined characteristics, without the modifiable Internal Port Name-s )
An existing Concrete Workflow(predefined characteristics serve as modifiable defaults)
An existing Template(Some predefined characteristics remain immutable)
An existing Concrete Workflow AND an existing Workflow Graph different from the Graph of the base Concrete Workflow(This is the special case when we want to “modify” the Graph of the base Workflow, but as it is not possible, during an intelligent matching process the two Graphs will be compared and the new Graph outfitted with as many Characteristics of the base Workflow as possible )
J1
J2
J3
J4J1
J2
J3
J4J1
J2
J3
+
Base Workflow (Red color indicates characteristics ) Base Graph
Created new WF, where only J4 should be configured
Workflow Creation - Implementation (Part 1) Create an “empty”
WF using only a Graph selectable from list
Create a WF inheriting parameters fixed in the selected Template
Create a WF copying the parameters of an existing Workflow selectable from list
Name of the new Workflow to be
configured
Free text filed for the creator of the new Concrete Workflow
Confirmation button
Workflow Creation - Implementation (Part 2: Subsequent Replacement by a New Graph)
1
2
New extension of existing graph
Saving the graph with a new name
4
Replacement of graph
3
5
Workflow Configuration
1. The name of a created Concrete Workflow appears in Worflow/Concrete list.
2. The configuration can be activated by the button “Configure” associated to each WF name.
3. In the appearing Graph Image a Job (with its Ports) can be selected.
4. After the configuration of Jobs the operation “Save and Upload” copies the eventual local input (and executable) files to the Portal Server and updates the definition of the Workflow. Concrete WF List
Selected WF
Selected Job
Job Executable Job Inputs & Outputs JDL/RSL Job Config. History
Job is Workflow Job is Service Job is Binary
1
2
3
4
Storage - OverviewA Concrete Workflow and its eventual instances can be downloaded from the
Portal Server to the client Machine.
The storage can servea subsequent Upload operation, recording the work done, oraccess to the results of the calculations.
The actual file transfer is prepared by compressing the needed data If only a Concrete Workflow is needed you should delete the Instances
previously. The Instances and the Outputs of Instances can be downloaded separately. The download of Instances includes
the download of the generator Concrete Workflowthe outputsthe messages resulting from the eventual user jobs and the messages
resulting from the
Note: In the case of Instances all produced output is booked – and can be downloaded. However – at present - only the actual Input is stored, therefore – if the user has changed the input between two successful Workflow Submissions it is not automatically assured, that the user can fetch the input of the former run.
Columns of individual instances, please note, that outputs can be downloaded separately
Storage - Implementation
To access instance
Information about the quota of the user allotted storage capacity in the Portal server
Upload Overview
The user can upload a previously downloaded Concrete Workflow from the Client Machine to the Portal Server.
To avoid name collisions the user has the possibility to rename:the workflowthe graph of the workflow the eventual Template belonging to the workflow
The operation collaborates with the Upload operations of Portlet Workflow/Storage accepting the same way of encoding of Workflows
Upload ImplementationStep 1Select the compressed file in the client machine containing the requested Workflow
Step 2 (option)Check the kind of name(s) you want to redefine.
Step 3 (If Step 2 performed):Enter the new name(s) which will not collide
Step 4Confirm the operation
Template: Make a Workflow Reusable
Template is basically an additional database to a workflow. It serves an experienced user:
To guarantee the immutability of certain tested settings of a workflow.
To inform the new users of the workflow about the proper settings of free characteristics (which have not been set to be immutable by the experienced user)
To define syntax constraints for certain free types used by the Form Generators, where Form Generators are system tools to generate simple question forms to be filled by non experienced users to define a concrete workflow in an easy way.
Lifecycle of Templates
1. A named template T will be created on the base of an existing – probably – tested workflow TW. The selecting of immutable characteristics (parameters) is performed during the creation process and can not be modified afterwards.
2. In an optional Editing process the –experienced user may add comments and type descriptions to the non immutable (free) parameters.
3. A user referencing T may create a new, so called “Templated” workflow inheriting all the properties of TW where the user during the subsequent configurations may not change the immutable parameters.
Creating a Template
Phase 1 Selecting the base Workflow of the Template and determine the main rule of the creation.
Phase 2Decide individually about the free/immutable status of each characteristics which is not already immutable at the moment.
Creating a Template(Phase 1)
There are 3 different cases:1. The base WF is already a “Templated Workflow”
and we want to restrict further the setting of free parameters. This case is called “Derived_from_TemplatedWorkflow”
2. The base WF is already a Templated Workflow but we want to permit the user to set an individual parameter free in the new Template even if it has been “immutable” in the base WF. In that case just a default setting will remind the decision maker that the given characteristic previously was “immutable”
3. The base workflow is a “pure” one, not has been restricted by any Template, i.e. we are free to decide about each characteristics in Phase 2
Creating a Template(Phase 1 - Implementation)
Case 2 and 3 of previous slide:The selected concrete WF can be even “templated”, but the restrictions defined there will not inherited. The checklist opens the selection from all existing concrete Workflows as base
Case 1 of previous slide:The selected concrete WF must be a templated one, and the restrictions defined there will be inherited in the new Template. The checklist opens the selection from all existing templated concrete Workflows as base.
Obligatory name for the new Template
List of existing Templates
The hitting of button “Configure” opens the Phase 2 of the Template creation process
Creating a Template(Phase 2 - Implementation - Overview)
A characteristic whose state can be changed in the current phase:“close” means immutable
The identifier of the current Job: the reserved names of the characteristics have a Job based namespace
Editing a Template
Expansion of a characteristic of “free” type to add informal information to the user, and parameters for the Form Generator
Shorthand:This optional selector indicates - if defined – that the next two parameters of the given characteristic has been defined in the referenced Job.
Step 1 Parameter used by Form Generator:Verbose name appears in the generated form to identify the value which will be requested from the inexperienced user
Step 2 Parameter informing the user of the Template:The creator of the Template informs the user about the proper setting of the given characteristic. This text will appear hitting as a pop up hitting an information box associated to each characteristic in the Configure windows
Repository
Repository OverviewRepository is a Library for the Publication of:
ApplicationsSemantically tested, trusted Workflows containing the definitions of the eventually referenced Embedded Workflows (including their transitive closures)
ProjectsApplications under construction
Concrete WorkflowsTemplatesGraphs
Repository is an interface between: an advanced User (who has the right to insert an element in the Repository by Export) and the common User who can only read a Repository member by Import.
During the inserting of an Application in the Repository the system checks the formal correctness of the Application:
Only the input files,the number of them in case of PS,command line arguments and destinations of job submissions
may not be filled.
During the reading operation the System checks the name collisions and notifies the user about them.
Repository Export
The object to be exported can be selected from the proper Portlet of the Workflow Group (Graph, Template, Concrete).
The button “Export” starts the export process.
Repository Export – Implementation(Concrete)
Step 1The button Export of the requested WF is selected
Step 2Decide about the way of exporting (in case of Application and Project the Embedded Workflows will be exported with)
Step 3Enter free input text which appearing in the Import List will describe and identify the object for the user.
Step 4Execute the Export
Way of working is similar in case of Graph and Template just “Export type” can not be select there
Repository Import - Implementation
Step 1Select a Type to be Imported
Step 3 (option):Check the kind of name(s) you want to redefine
Step 4 (If Step 3 performed)Enter the new name(s) which will not collide
Step 5Execute the Import operation
Step 2Select Object to be Imported
Thank you for your attention!
Questions?
balasko@sztaki.hu
http://www.lpds.sztaki.hu/products/guseSpecial thanks to
Gabor Hermann and Tibor Gottdank (MTA SZTAKI) for the slides