NCSU-1 V1/26-Mar-02 1
Context-sensitive Service Composition for Support of
Scientific Workflows
Mladen A. Vouk
North Carolina State University, Raleigh, NC
NCSU-2 V1/26-Mar-02 2
Team
System Engineering and Software
• Prof. Mladen Vouk (scientific workflows, middleware, networks)
• Sandeep Chandra, MS candidate (system, middleware)
• Zhengang Cheng, PhD candidate (agents, protocols, services, workflows)
• Prof. Munindar Singh (agents, workflows, databases)
Domain-Specific Workflows
• Prof. Donald Bitzer (signal analysis, coding theory)
• E. Eni-May, PhD candidate (bioinformatics, coding theory)
• Dr. David Rosnick (bioinformatics, entropy analysis)
• Prof. Anne Stomp (genetic engineering)
Coordination (Dr. T. Critchlow + others)
• Dr. Tom Potok (project coordination, software)
• Other ORNL project scientists
NCSU-3 V1/26-Mar-02 3
Philosophy?
• Human-centric workflow support (appliance-like) – case study: Bioinformatics
• Service-oriented (distributed and diverse data access, storage, manipulation, analysis, and display; grid-based computing; end-user profile services; quality of service)
• Context-sensitive (end-user presence, location, expertise, access and interaction permissions, domain translation, p2p communications, etc.)
NCSU-4 V1/26-Mar-02 4
[Figure: stack diagram titled "(In)efficient"; layers labeled End-User, Apps, Data and Compute, Communications, OS]
NCSU-5 V1/26-Mar-02 5
Example Workflow – Top Level
Input (select, slice and dice data):
– Obtain 3' end 16/18S ribosome for selected organism
– Obtain sequences of mature mRNA for organism (or DNA if unavailable)
Process (model, compare, display, etc.):
– Compute free energy bindings between 3' end of 16/18S rRNA and mRNA
– Train decision mechanism based on subset of mRNA sequences
– Perform signal analysis on remaining (or newly requested) binding sequences to determine efficiency
Output and analyze efficiency/signal model/data:
– Review results and compare to published efficiency/frameshift data (e.g., Nucleic Acids Research, J. Molecular Biology)
– Evaluate theory in light of information theory (Shannon, Schneider
NCSU-6 V1/26-Mar-02 6
How?
• Domain-adequate computer-human interfaces
– Personalizing context/knowledge gateways
– Domain-aware workflow construction (service discovery, composition, invocation, agents, protocols)
• Adequate and seamless services, service registration and exchange gateways (move away from bring/cook-your-own service approach).
• Adaptive (policy-based) quality of service control and management all along the “service stack.”
NCSU-7 V1/26-Mar-02 7
Architecture
[Figure: architecture diagram; labels recovered from the slide]
• Participant A, Participant B, Participant C
• GENES, PHY’S, BIO
• UDDI
• Human client
• Workflow composition & interfaces
• Agent / Service (three pairs)
• All participants register their services
• Directly connect to UDDI registry
• Service & context gateways and multiplexers
• e.g., Iflow, JavaAgent
• e.g., UDDI, NCBioGrid, WLS, IP phones, H323 video
• Vipar GenBank, BioNews
NCSU-10 V1/26-Mar-02 10
Service Agent - Example
[Figure: service agent diagram; labels recovered from the slide]
• Decision module
• Messaging system
• Policy
• Remote app’s object
• Incoming messages
• SOAP interface
• WSDL and/or ASDL: method, access, behavior
• Publishing (e.g., in UDDI)
• Site specific
Humans and agents should be able to consume published services. Workflows and pipelines are ways to consume services.
NCSU-11 V1/26-Mar-02 11
What is a Service?
• A service is an entity that can receive a service request and respond/deliver within a time, cost, reliability, security, etc., frame acceptable to the end-user. The service may be presented in the form of an intelligent agent, or simply as a servlet.
• The service provides access to its data, methods, tools, etc., which are usually the property of a particular organization.
• In the original "Data Integration Architecture", the CM Wrapper and XML Wrapper each represent a service and provide services to other services. Here they are viewed as independent services that possess intelligence.
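The definition above can be illustrated with a minimal sketch, assuming a service advertises an estimated response time and cost and the end-user states an acceptable frame. All names here (`QoSFrame`, `Service`, `reverse_complement`, `example-lab`) are hypothetical and chosen only for illustration; reliability and security are left out to keep the example short.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class QoSFrame:
    """Limits the end-user is willing to accept (hypothetical fields)."""
    max_seconds: float
    max_cost: float


@dataclass(frozen=True)
class Service:
    """An entity that receives a request and responds within an acceptable frame."""
    name: str
    owner: str                      # organization that owns the data, methods, tools
    est_seconds: float              # advertised response time (assumed known)
    est_cost: float                 # advertised cost per request (assumed known)
    handler: Callable[[str], str]   # the data access or method being exposed

    def accepts(self, frame: QoSFrame) -> bool:
        """True when the advertised time and cost fit the end-user's frame."""
        return (self.est_seconds <= frame.max_seconds
                and self.est_cost <= frame.max_cost)

    def request(self, payload: str, frame: QoSFrame) -> Optional[str]:
        """Return the response, or None when the frame cannot be met."""
        if not self.accepts(frame):
            return None
        return self.handler(payload)


def reverse_complement(seq: str) -> str:
    """Toy method exposed by the service: reverse complement of a DNA string."""
    pairs = {"A": "T", "T": "A", "C": "G", "G": "C"}
    return "".join(pairs[base] for base in reversed(seq))


svc = Service("revcomp", "example-lab", est_seconds=0.5, est_cost=0.0,
              handler=reverse_complement)
```

A caller would pass its own frame with each request, for example `svc.request("ATGC", QoSFrame(max_seconds=1.0, max_cost=0.0))`; a frame the service cannot meet yields `None` rather than a late or over-budget answer.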
NCSU-12 V1/26-Mar-02 12
Composition
[Figure: composition diagram; labels recovered from the slide]
• AS1:S1, AS1:S2, AS2:S1, AS3:S2
• Workflow
• AS1:S1, AS2:S1, AS2:S2, AS3:S2
• AS1, AS2, AS3, each with S1 and S2
• Pipeline
• Pipeline and workflow
• Synch, asynch
• Differing timescales
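The two composition styles named on this slide can be sketched as follows. This is an illustration under stated assumptions, not the project's implementation: a label such as `AS1:S1` is read here as "service S1 offered by AS1", a pipeline is taken to be a linear chain, a workflow is taken to be a dependency graph, and the particular dependencies in the usage note are invented. Everything runs synchronously; the asynchronous case and differing timescales are not modeled.

```python
from functools import reduce
from graphlib import TopologicalSorter  # standard library, Python 3.9+
from typing import Callable, Dict, List

Step = Callable[[str], str]


def run_pipeline(steps: List[Step], data: str) -> str:
    """Pipeline: each step consumes the previous step's output, in order."""
    return reduce(lambda acc, step: step(acc), steps, data)


def run_workflow(steps: Dict[str, Step],
                 depends_on: Dict[str, List[str]],
                 data: str) -> Dict[str, str]:
    """Workflow: run steps in dependency order; a step may have several inputs.

    Steps without predecessors receive the initial data; other steps receive
    their predecessors' results joined with '+'.
    """
    results: Dict[str, str] = {}
    for name in TopologicalSorter(depends_on).static_order():
        parents = depends_on.get(name, [])
        inputs = [results[p] for p in parents] or [data]
        results[name] = steps[name]("+".join(inputs))
    return results


def tag(label: str) -> Step:
    """Toy service: wraps its input in the service label so data flow is visible."""
    return lambda s: f"{label}({s})"


steps = {name: tag(name) for name in ("AS1:S1", "AS1:S2", "AS2:S1", "AS3:S2")}
```

For example, `run_pipeline([steps["AS1:S1"], steps["AS2:S1"]], "x")` chains two services, while `run_workflow(steps, {"AS1:S1": [], "AS1:S2": [], "AS2:S1": ["AS1:S1", "AS1:S2"], "AS3:S2": ["AS2:S1"]}, "x")` lets `AS2:S1` merge two upstream results before `AS3:S2` runs.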
NCSU-13 V1/26-Mar-02 13
Current Framework
[Figure: current framework diagram; labels recovered from the slide]
• UDDI at SDM4
• SDM interface to construct user workflows (prot: iFlow, in progress)
• Services registered with UDDI at sdm4:
  99 - sdm category
  991 - vipar news
  992 - vipar genes
  993 - Data Serv
  994 - Analysis Serv
• Menu to select data services
• Menu to select analysis services
• Menu to select Vipar services
• Browser, Browser
• DB2 database on sdm4
• Db2xml wrapper
• Vipar server for news
• Vipar server for bioinfo
• WWW
• Getting details; details returned
• Selected options are queried to the UDDI
• Populate menus with the service details
• Connect to vipar using RMI
• Service 1, Service n, Service m
• Description of available services at sdm4 UDDI using WSDL, XML or HTML
• Invoke any of the registered services
• Download to local system
• Invoke services
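The register, query, and populate-menu flow in this framework can be sketched with an in-memory stand-in for the registry. This is not the UDDI API: `Registry`, `ServiceEntry`, `build_menu`, the service names, and the `example.invalid` endpoints are all hypothetical. Only the category codes (993 for data services, 994 for analysis services) come from the slide.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ServiceEntry:
    name: str
    category: str      # category code, e.g. "993" (Data Serv) on the slide
    access_point: str  # hypothetical endpoint string


class Registry:
    """In-memory stand-in for a service registry, keyed by category code."""

    def __init__(self) -> None:
        self._by_category: Dict[str, List[ServiceEntry]] = defaultdict(list)

    def register(self, entry: ServiceEntry) -> None:
        """A participant registers one of its services under a category."""
        self._by_category[entry.category].append(entry)

    def find(self, category: str) -> List[ServiceEntry]:
        """Return a copy of all entries registered under the category."""
        return list(self._by_category.get(category, []))


def build_menu(registry: Registry, category: str) -> List[str]:
    """Query the registry and format the results as menu lines, sorted by name."""
    entries = sorted(registry.find(category), key=lambda e: e.name)
    return [f"{e.name} [{e.access_point}]" for e in entries]


registry = Registry()
registry.register(ServiceEntry("sequence-fetch", "993", "http://example.invalid/fetch"))
registry.register(ServiceEntry("db2xml", "993", "http://example.invalid/db2xml"))
registry.register(ServiceEntry("signal-analysis", "994", "http://example.invalid/signal"))
```

Calling `build_menu(registry, "993")` returns the data-service menu lines; an unknown category simply produces an empty menu.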
NCSU-14 V1/26-Mar-02 14
Support
• CVS is the version control system for our development work. It is used to share data and software.
• Eclipse is a Java IDE from IBM, available from www.eclipse.org. It integrates seamlessly with the CVS repository and provides an integrated debugging environment.
NCSU-15 V1/26-Mar-02 15
Initial Project Architecture and Prototypes
[Figure: initial architecture diagram; labels recovered from the slide]
• PDB, VIPAR, :Bio
• XML Wrapper (four instances)
• CM Wrapper (three instances)
• API
• Integration component / KB-Mediator (KBM)
• Query Dispatch and Collection (QDaC)
• Source / Agent MetaData Registry
• XWRAP Wrapper Generator
• XQuery (subsets, e.g., Sel/Proj)
• XQuery interface: select/project only
• External program: if invoked, pre-processes query parameters and post-processes results
NCSU-16 V1/26-Mar-02 16
Things to Do?
• Cast the amazing array of tools and software that scientists use as services, catalogue them, and define/improve their interfaces and ease of use: focus on What rather than How (from the user's perspective).
• Create context gateways that will coordinate domain-specific interactions and services and help in the creation of efficient workflows.
• NCSU-specific: we plan to have a fully working prototype in place within the next six months.
• Suggestions?
NCSU-18 V1/26-Mar-02 18
Prototypes
[Figure: prototype architecture diagram; labels recovered from the slide]
• VIPAR - GenBANK
• VIPAR - GenNEWS
• Other
• GENES, News, BIO
• UDDI
• David, ChiChi
• Workflow composition & interfaces
• Agent / Service (two pairs)
• All participants register their services
• Directly connect to UDDI registry
• Service & context gateways and multiplexers
• UDDI, WLS, IP-phones, H323 Video, NC BioGrid
• Iflow, JavaAgentEm