Ilkay ALTINTAS - July 24th, 2007
Ilkay ALTINTAS
Director, Scientific Workflow Automation Technologies Laboratory
San Diego Supercomputer Center, UCSD
Working with Kepler Workflows to Automate Steps in the Scientific Method
The Big Picture: Supporting the Scientist
Conceptual SWF
Executable SWF
From “Napkin Drawings” …
… to Executable Workflows
Source: GuideToENM.pdf at the workshop web link. http://reap.ecoinformatics.org/Wiki.jsp?page=RequirementsWorkshop2007
Anatomy of a Kepler Workflow
Search Panel
Atomic Actor
Link
Director
Toolbar
Drag and Drop
Annotation
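The roles labelled above can be illustrated with a toy dataflow model. This sketch is not Kepler's API; the `Actor` and `SequentialDirector` names, and the three sample steps, are hypothetical and only show how a director decides when linked actors fire.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Actor:
    """Hypothetical atomic actor: one processing step with one input and one output."""
    name: str
    func: Callable[[object], object]
    downstream: List["Actor"] = field(default_factory=list)

    def link(self, other: "Actor") -> "Actor":
        """Connect this actor's output to another actor's input (a 'link')."""
        self.downstream.append(other)
        return other


class SequentialDirector:
    """Hypothetical director: fires actors in breadth-first order along links."""

    def __init__(self) -> None:
        self.trace: List[str] = []  # order in which actors fired

    def run(self, source: Actor, token: object) -> List[object]:
        results: List[object] = []
        pending = [(source, token)]
        while pending:
            actor, value = pending.pop(0)
            output = actor.func(value)
            self.trace.append(actor.name)
            if not actor.downstream:
                results.append(output)  # sink actor: collect its output
            for nxt in actor.downstream:
                pending.append((nxt, output))
        return results


def demo_workflow() -> List[object]:
    """Wire three toy actors together and let the director execute them."""
    read = Actor("read", lambda path: [3.0, 1.0, 2.0])  # stands in for a data source
    sort = Actor("sort", sorted)
    top = Actor("max", lambda values: values[-1])
    read.link(sort).link(top)
    return SequentialDirector().run(read, "points.txt")
```

Keeping the firing order in a separate director object means the same actors and links could be run under a different execution policy without being rewritten.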
Some False Myths about Scientific Workflows
• Scientific workflows are…
  – … scripts/visual programs
    • Can be similar, but there’s value added
  – … just workflow engines
    • There are other components making the design and execution of workflows more effective
    • Search features, provenance tracking, semantic guidance, etc.
  – … the user interface
    • The user interface is there to make workflow design and manipulation easier
Efrat JEAGER-FRANK (1), Christopher J. CROSBY (2), Ashraf MEMON (1), Viswanath NANDIGAM (2), J. Ramon ARROWSMITH (2), Jeffrey CORNER (2), Ilkay ALTINTAS (1)*, Chaitan BARU (1)
1 San Diego Supercomputer Center, University of California, San Diego
2 Department of Geological Sciences, Arizona State University
A Three Tier Architecture for LiDAR Interpolation and Analysis
*Presenting author
LiDAR Introduction
R. Haugerud, U.S.G.S.
D. Harding, NASA
Survey
Point Cloud: x, y, zn, …
Process & Classify
Interpolate / Grid
Analyze / “Do Science”
LiDAR Difficulties
• Massive volumes of data
  – 1000s of ASCII files
  – Hard to subset
  – Hard to distribute and interpolate
• Analysis requires high performance computing
• Traditionally: Popularity > Resources
A Three-Tier Architecture
• GOAL: Efficient LiDAR interpolation and analysis using GEON infrastructure and tools
  – GEON Portal
  – Kepler Scientific Workflow System
  – GEON Grid
• Use scientific workflows to glue/combine different tools and the infrastructure
Portal
Grid
Lidar Workflow Process
• Configuration phase
• Subset: DB2 query on DataStar
• Interpolate: Grass RST, Grass IDW, GMT…
• Visualize: Global Mapper, FlederMaus, ArcIMS
[Diagram labels: Portal; Grid; Subset; Analyze (move, process); Visualize (move, render, display); Scheduling/Output Processing; Monitoring/Translation]
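Of the interpolation options listed, inverse distance weighting (IDW) is simple enough to sketch directly. The following is a minimal pure-Python illustration of the general IDW technique, not the GRASS implementation; the function names, the `power` default, and the grid layout are assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]  # (x, y, z)


def idw(points: Sequence[Point], x: float, y: float, power: float = 2.0) -> float:
    """Estimate z at (x, y) as a distance-weighted average of the known points."""
    weighted_sum = 0.0
    weight_total = 0.0
    for px, py, pz in points:
        dist = math.hypot(px - x, py - y)
        if dist == 0.0:
            return pz  # query coincides with a sample: return it exactly
        weight = 1.0 / dist ** power
        weighted_sum += weight * pz
        weight_total += weight
    if weight_total == 0.0:
        raise ValueError("at least one point is required")
    return weighted_sum / weight_total


def idw_grid(points: Sequence[Point], x0: float, y0: float, cell: float,
             ncols: int, nrows: int, power: float = 2.0) -> List[List[float]]:
    """Interpolate onto a regular grid of cell centres, row by row."""
    return [
        [idw(points, x0 + (c + 0.5) * cell, y0 + (r + 0.5) * cell, power)
         for c in range(ncols)]
        for r in range(nrows)
    ]
```

This version visits every point for every grid cell, so its cost grows with points × cells. That is the kind of load that makes subsetting first, and running the interpolation on a cluster, attractive for point clouds of this size.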
Lidar Processing Workflow (using Fledermaus)
[Diagram: Subset; Analyze (move, process); Visualize (move, render, display). Components: IBM DB2; Datastar; Arizona Cluster; NFS mounted disks; Fledermaus (create scene file); iView3D/Browser. Data labels: d1, d2 (grid file), sd.]
Lidar Processing Workflow (using Global Mapper)
[Diagram: Subset; Analyze (move, process); Visualize (move, render, display). Components: IBM DB2; Datastar; Arizona Cluster; NFS mounted disks; Global Mapper (get image for grid file); Browser. Data labels: d1, d2 (grid file).]
Lidar Processing Workflow (using ArcIMS)
[Diagram: Subset; Analyze (move, process); Visualize (move, render, display). Components: IBM DB2; Datastar; Arizona Cluster; NFS mounted disks; ArcSDE; ArcIMS; ArcInfo. Data labels: d1, d2 (grid file).]
Lidar Workflow Portlet
• User selections from GUI
  – Translated into a query and a parameter file
  – Uploaded to remote machine
• Workflow description created on the fly
• Workflow response redirected back to portlet
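The translation step above can be sketched as follows. This is an illustrative assumption, not the portlet's actual code: the `Selection` fields, the `lidar_points` table, the column names, and the XML element names are all hypothetical, and a real DB2 spatial query would likely use spatial functions rather than plain `BETWEEN` predicates.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Selection:
    """Hypothetical GUI selections: a bounding box plus processing choices."""
    min_x: float
    min_y: float
    max_x: float
    max_y: float
    algorithm: str     # e.g. "idw" or "spline"
    resolution: float  # grid cell size


def build_query(sel: Selection, table: str = "lidar_points") -> Tuple[str, List[float]]:
    """Return a parameterized bounding-box query; table and columns are assumed."""
    if not table.isidentifier():
        raise ValueError("invalid table name")
    sql = (f"SELECT x, y, z FROM {table} "
           "WHERE x BETWEEN ? AND ? AND y BETWEEN ? AND ?")
    return sql, [sel.min_x, sel.max_x, sel.min_y, sel.max_y]


def build_parameter_xml(sel: Selection) -> str:
    """Serialize the processing choices as a small XML parameter document."""
    root = ET.Element("parameters")
    ET.SubElement(root, "algorithm").text = sel.algorithm
    ET.SubElement(root, "resolution").text = str(sel.resolution)
    return ET.tostring(root, encoding="unicode")
```

Keeping the query values as bound parameters, rather than formatting them into the SQL string, avoids injection problems when the values come from user input.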
LIDAR POST-PROCESSING WORKFLOW PORTLET
[Diagram labels]
Components: Client/GEON Portal; KEPLER WORKFLOW; DB2 (spatial query); NFS mounted disk; compute cluster; ArcInfo; ArcSDE; ArcIMS (render map)
Steps: submit map parameters and Grass functions (parameter xml); create workflow description; map onto the grid (Pegasus); download data
Data: x, y, z and attribute raw data; process output
Grass surfacing algorithms: spline, IDW, block mean, …
Output formats: binary grid, ASCII grid, text file, Tiff/Jpeg/Gif
ORB Sources: Real Time Sensor Data Source.
ANTELOPE: Orb Data Recorded.
KEPLER
CLIENT WEB SERVER
• Start Kepler Engine via a Browser
• Specify Contour Parameters
• Look at Contour Plots with Google Maps Applications
To Sum Up
• The next three days:
  – Discussions on use cases with an eye on the workflow requirements
  – Discuss other possible workflows
  – Developers taking notes on Kepler implementation details of possible workflow steps
• Latest release: http://kepler-project.org
Ilkay ALTINTAS
[email protected]
+1 (858) 822-5453
http://www.sdsc.edu
Thanks! & Questions…