Using Provenance to Support Real-Time Collaborative Design of Workflows
Workflow evolution provenance and OPM
Tommy Ellkvist and Juliana Freire
2008
Workflow Evolution
Version Tree
Workflows Data Products
Action based representation of workflows
Nodes represent workflows
Edges represent actions
Actions are transformations on workflows
Actions are performed by users
[Figure: version tree with nodes 0, 1, 2 and 3; edges labeled Add Module(0), Add Module(1) and Add Connection(0,1)]
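
The action-based representation can be sketched in a few lines of Python. This is a minimal illustration rather than VisTrails code: the `Action` and `VersionTree` names, the `kind` strings, and the dictionary layout of a workflow are all assumptions made for the example. The chain of versions 0 to 3 and the user name follow the VisTrails XML example shown later in this deck.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Action:
    """An edge in the version tree: one user-performed change to a workflow."""
    user: str
    kind: str    # "add_module" or "add_connection" (hypothetical names)
    args: tuple


@dataclass
class VersionTree:
    """Nodes are workflow versions; each non-root node stores its parent and action."""
    parent: dict = field(default_factory=dict)  # version id -> parent version id
    action: dict = field(default_factory=dict)  # version id -> Action on the incoming edge

    def add_version(self, version, parent, action):
        self.parent[version] = parent
        self.action[version] = action

    def materialize(self, version):
        """Rebuild a workflow by replaying the actions from the root to `version`."""
        path = []
        while version in self.parent:  # the root (version 0) has no parent entry
            path.append(self.action[version])
            version = self.parent[version]
        workflow = {"modules": set(), "connections": set()}
        for act in reversed(path):
            if act.kind == "add_module":
                workflow["modules"].add(act.args[0])
            elif act.kind == "add_connection":
                workflow["connections"].add(act.args)
        return workflow


tree = VersionTree()
tree.add_version(1, 0, Action("g-tomel", "add_module", (0,)))
tree.add_version(2, 1, Action("g-tomel", "add_module", (1,)))
tree.add_version(3, 2, Action("g-tomel", "add_connection", (0, 1)))
```

Calling `tree.materialize(3)` replays all three actions, while `tree.materialize(0)` yields the empty workflow. Only the actions are stored; every workflow version is derived on demand.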
OPM XML schema: Example of OPM
(The OPM, 2007)
OPM XML schema: Translated OPM Example
<OPMGraph ...>
  <Artifact> <ArtifactId>1</ArtifactId> <Account>G</Account> <Account>O</Account> </Artifact>
  <Artifact> <ArtifactId>1</ArtifactId> <Account>G</Account> <Account>O</Account> </Artifact>
  …
  <Process> <ProcessId>1</ProcessId> <Account>G</Account> </Process>
  <Process> <ProcessId>2</ProcessId> <Account>O</Account> </Process>
  <Process> <ProcessId>3</ProcessId> <Account>O</Account> </Process>
  …
  <Used ProcessId="1" Role="in" ArtifactId="1"> <Account>G</Account> </Used>
  <Used ProcessId="2" Role="pair" ArtifactId="1"> <Account>O</Account> </Used>
  <Used ProcessId="3" Role="in" ArtifactId="3"> <Account>O</Account> </Used>
  <Used ProcessId="4" Role="in" ArtifactId="4"> <Account>O</Account> </Used>
  <Used ProcessId="5" Role="left" ArtifactId="5"> <Account>O</Account> </Used>
  <Used ProcessId="5" Role="right" ArtifactId="6"> <Account>O</Account> </Used>
  <WasGeneratedBy ArtifactId="2" Role="out" ProcessId="1"> <Account>G</Account> </WasGeneratedBy>
  …
  <Alternate Account1="O" Account2="G"/>
</OPMGraph>
Vistrails XML Model
<vistrail dbHost="" dbName="" dbPort="" id="" name="" version="0.9.0" xmlns:xsi="http://www.w3.org/...">
  <action date="2008-05-27 17:35:39" id="1" prevId="0" prune="" session="" user="g-tomel">
    <add id="0" objectId="0" parentObjId="" parentObjType="" what="module">
      <module cache="1" id="0" name="String" package="edu.utah.sci.vistrails.basic" tag="" version="" />
    </add>
    <add id="1" objectId="0" parentObjId="0" parentObjType="module" what="location">
      <location id="0" x="-89.0" y="62.0" />
    </add>
  </action>
  <action date="2008-05-27 17:35:43" id="2" prevId="1" prune="" session="" user="g-tomel">
    <add id="2" objectId="1" parentObjId="" parentObjType="" what="module">
      <module cache="1" id="1" name="ConcatenateString" package="edu.utah.sci.vistrails.basic" tag="" version="" />
    </add>
    <add id="3" objectId="1" parentObjId="1" parentObjType="module" what="location">
      <location id="1" x="-20.0" y="-67.0" />
    </add>
  </action>
  <action date="2008-05-27 17:35:46" id="3" prevId="2" prune="" session="" user="g-tomel">
    <add id="4" objectId="0" parentObjId="" parentObjType="" what="connection">
      <connection id="0" />
    </add>
    <add id="5" objectId="1" parentObjId="0" parentObjType="connection" what="port">
      <port id="1" moduleId="1" moduleName="ConcatenateString" name="str1" spec="(edu.utah.sci.vistrails.basic:String)" type="destination" />
    </add>
    <add id="6" objectId="0" parentObjId="0" parentObjType="connection" what="port">
      <port id="0" moduleId="0" moduleName="String" name="value" spec="(edu.utah.sci.vistrails.basic:String)" type="source" />
    </add>
  </action>
</vistrail>
Vistrails XML Model: Translated to OPM
<Used ProcessId="1" Role="in" ArtifactId="0" stopTimeBegin="2008-05-27 17:35:39" stopTimeEnd="2008-05-27 17:35:39"> <Account>G</Account> </Used>
<Used ProcessId="2" Role="in" ArtifactId="1" stopTimeBegin="2008-05-27 17:35:43" stopTimeEnd="2008-05-27 17:35:43"> <Account>G</Account> </Used>
<Used ProcessId="3" Role="in" ArtifactId="2" stopTimeBegin="2008-05-27 17:35:46" stopTimeEnd="2008-05-27 17:35:46"> <Account>G</Account> </Used>
<WasGeneratedBy ArtifactId="1" Role="out" ProcessId="1" stopTimeBegin="2008-05-27 17:35:39" stopTimeEnd="2008-05-27 17:35:39"> <Account>G</Account> </WasGeneratedBy>
<WasGeneratedBy ArtifactId="2" Role="out" ProcessId="2" stopTimeBegin="2008-05-27 17:35:43" stopTimeEnd="2008-05-27 17:35:43"> <Account>G</Account> </WasGeneratedBy>
<WasGeneratedBy ArtifactId="3" Role="out" ProcessId="3" stopTimeBegin="2008-05-27 17:35:46" stopTimeEnd="2008-05-27 17:35:46"> <Account>G</Account> </WasGeneratedBy>
<WasControlledBy ProcessId="1" AgentId="concat.xml" startTimeBegin="2008-05-27 17:35:39" startTimeEnd="2008-05-27 17:35:39" stopTimeBegin="2008-05-27 17:35:39" stopTimeEnd="2008-05-27 17:35:39"> <Account>G</Account> </WasControlledBy>
<WasControlledBy ProcessId="2" AgentId="concat.xml" startTimeBegin="2008-05-27 17:35:43" startTimeEnd="2008-05-27 17:35:43" stopTimeBegin="2008-05-27 17:35:43" stopTimeEnd="2008-05-27 17:35:43"> <Account>G</Account> </WasControlledBy>
<WasControlledBy ProcessId="3" AgentId="concat.xml" startTimeBegin="2008-05-27 17:35:46" startTimeEnd="2008-05-27 17:35:46" stopTimeBegin="2008-05-27 17:35:46" stopTimeEnd="2008-05-27 17:35:46"> <Account>G</Account> </WasControlledBy>
<Agent> <AgentId>concat.xml</AgentId> <Account>G</Account> </Agent>
<Artifact> <ArtifactId>0</ArtifactId> <Account>G</Account> </Artifact>
<Artifact> <ArtifactId>1</ArtifactId> <Account>G</Account> </Artifact>
<Artifact> <ArtifactId>2</ArtifactId> <Account>G</Account> </Artifact>
<Artifact> <ArtifactId>3</ArtifactId> <Account>G</Account> </Artifact>
<Process> <ProcessId>1</ProcessId> <Account>G</Account> </Process>
<Process> <ProcessId>2</ProcessId> <Account>G</Account> </Process>
<Process> <ProcessId>3</ProcessId> <Account>G</Account> </Process>
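
The translation shown above can be approximated with a short Python sketch. The mapping is inferred from the example rather than taken from a published specification: each `action` becomes a Process whose id is the action id, the Process uses the artifact numbered `prevId` and generates the artifact numbered `id`, and the action date fills the time attributes. The `translate` function, its parameters, and the trimmed-down input are assumptions for illustration; for brevity the sketch writes only `stopTimeBegin` and `stopTimeEnd` on every edge.

```python
import xml.etree.ElementTree as ET

# Trimmed-down version of the VisTrails example: only the action attributes
# needed for the translation are kept.
VISTRAIL = """
<vistrail version="0.9.0">
  <action date="2008-05-27 17:35:39" id="1" prevId="0" user="g-tomel"/>
  <action date="2008-05-27 17:35:43" id="2" prevId="1" user="g-tomel"/>
  <action date="2008-05-27 17:35:46" id="3" prevId="2" user="g-tomel"/>
</vistrail>
"""


def translate(vistrail_xml, account="G", agent="concat.xml"):
    """Map each action to a Process that used version prevId and generated version id."""
    edges = []
    for action in ET.fromstring(vistrail_xml).iter("action"):
        process, when = action.get("id"), action.get("date")
        for tag, attrs in (
            ("Used", {"ProcessId": process, "Role": "in", "ArtifactId": action.get("prevId")}),
            ("WasGeneratedBy", {"ArtifactId": action.get("id"), "Role": "out", "ProcessId": process}),
            ("WasControlledBy", {"ProcessId": process, "AgentId": agent}),
        ):
            # An action is instantaneous here, so both interval ends get the same time.
            edge = ET.Element(tag, attrs, stopTimeBegin=when, stopTimeEnd=when)
            ET.SubElement(edge, "Account").text = account
            edges.append(edge)
    return edges


edges = translate(VISTRAIL)
for edge in edges:
    print(ET.tostring(edge, encoding="unicode"))
```

The sketch emits three edges per action, in the order Used, WasGeneratedBy, WasControlledBy. Artifact, Process and Agent declarations are omitted; they could be derived from the same ids.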
Observations
General model
– Only contains enough information to traverse the provenance graph
– No additional information stored
Different ways of representing workflow design provenance
– Edges as actions
– Edges as version differences
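
To make the second option concrete, an edge can carry the difference between the two workflow versions it connects instead of the action that produced one from the other. The sketch below is an assumption-laden illustration: the slide only names the two options, and the set-based workflow layout and the `diff` function are hypothetical.

```python
def diff(old, new):
    """Return what must be added to and removed from `old` to obtain `new`."""
    keys = set(old) | set(new)
    return {
        "added": {k: new.get(k, set()) - old.get(k, set()) for k in keys},
        "removed": {k: old.get(k, set()) - new.get(k, set()) for k in keys},
    }


# Two workflow versions, each described by its module ids and connections.
v1 = {"modules": {0}, "connections": set()}
v3 = {"modules": {0, 1}, "connections": {(0, 1)}}

delta = diff(v1, v3)
```

A difference edge like `delta` can connect any two versions directly, whereas an action edge records one step and who performed it.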
Observations
What is the time?
– How to interpret a time T of a process?
– Does the interpretation affect querying?
– Semantics of intervals
Who is the agent?
– Users
– Workflow system
– The session
– Workflow specification
"OPM Level 2"?
– Are there workflow specifics we want to express?
Interoperability