Post on 17-Jan-2016
Predictable Workflow Deployment Service
Stephen McGough, Ali Afzal, Anthony Mayer, Steven Newhouse, Laurie Young
London e-Science Centre
Department of Computing, Imperial College London
2
ICENI
The Iceni, under Queen Boudicca, united the tribes of South-East England in a revolt against the occupying Roman forces in AD60.
• IC e-Science Networked Infrastructure
• Developed by LeSC Grid Middleware Group
• Collect and provide relevant Grid meta-data
• Use to define and develop higher-level services
• Interaction with other frameworks: OGSA, JXTA etc.
3
The Architecture: Showing the Trinity
[Architecture diagram. Labels: Scheduler, Reservation Service, Performance Store, Launcher, Application Service, Reservation Engine]
4
Scheduling Service
Scheduling Framework
• Application Mapper – Generates the possible mappings of components to resources
• Scheduling Algorithm – Algorithm to select where to deploy components
• Listens out for services
  – Launcher Services
  – Reservation Services
  – Performance Services
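
The split between the Application Mapper and the Scheduling Algorithm can be sketched as below. This is a minimal illustration in Python (ICENI itself is Java-based), assuming an exhaustive mapper and a "lowest total estimated time" selection rule. The component names, resource names and timings are hypothetical, and the slides do not say which algorithm ICENI uses.

```python
from itertools import product

def generate_mappings(components, resources):
    """Application Mapper: yield every assignment of components to resources."""
    for choice in product(resources, repeat=len(components)):
        yield dict(zip(components, choice))

def select_mapping(mappings, estimate):
    """Scheduling Algorithm (assumed rule): lowest total estimated time wins."""
    return min(mappings, key=lambda m: sum(estimate(c, r) for c, r in m.items()))

# Hypothetical estimates in seconds; in ICENI these would come from the
# Performance Service rather than a hard-coded table.
ESTIMATES = {
    ("source", "node-a"): 4, ("source", "node-b"): 6,
    ("solver", "node-a"): 30, ("solver", "node-b"): 12,
}

def estimate(component, resource):
    return ESTIMATES[(component, resource)]

mappings = list(generate_mappings(["source", "solver"], ["node-a", "node-b"]))
best = select_mapping(mappings, estimate)
```

Enumerating every mapping grows exponentially with the number of components, so a real scheduler would need to prune or search heuristically.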
5
Scheduling Ports
• launchJob
  – Takes an EP (workflow) or JDML (job description)
  – Works out where to deploy on the grid
    • Uses the Performance, Reservation and Launching services to help determine this
  – Deploys work to appropriate resources (as JDML)
  – Returns an EP indicating what has been done
• generateQuote – same as launchJob, but doesn’t deploy
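
The relationship between the two ports can be sketched as follows: both share the planning step, and only launchJob has a side effect. This is a hedged Python sketch, not ICENI code. The `Scheduler` class, its `estimates` table and the dictionary standing in for the returned EP are all assumptions made for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Scheduler:
    # Hypothetical stand-in: resource name -> estimated run time in seconds.
    estimates: dict
    deployed: list = field(default_factory=list)

    def _plan(self, job):
        # Assumed rule: choose the resource with the lowest estimate.
        resource = min(self.estimates, key=self.estimates.get)
        return {"job": job, "resource": resource,
                "estimate": self.estimates[resource]}

    def generate_quote(self, job):
        """Work out where the job would go, without deploying it."""
        return self._plan(job)

    def launch_job(self, job):
        """Same planning step, then deploy and report what was done."""
        plan = self._plan(job)
        self.deployed.append((plan["job"], plan["resource"]))
        return plan

scheduler = Scheduler({"node-a": 30, "node-b": 12})
quote = scheduler.generate_quote("solve")
deployed_before = list(scheduler.deployed)
plan = scheduler.launch_job("solve")
```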
6
The Performance Repository Framework:
Performance Framework
• Performance Processing – Conversion of raw event times into performance data
• Data Collector – Collecting data on currently running applications (event times)
• Performance Store – Persistent performance storage
7
Performance Ports
• register – Inform the Performance Service (PS) of a new application to monitor
  – Provides a unique id that is used for further calls to the PS
• addEP – Provide a workflow for an application the PS is monitoring
  – Requires the Execution Plan (a workflow)
  – Requires the unique id provided above
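
The register/addEP handshake might look like the sketch below. It is an illustrative Python model only: the class name, the use of a UUID as the unique id, and the list used as an Execution Plan are assumptions, since the slides do not specify these details.

```python
import uuid

class PerformanceService:
    def __init__(self):
        # application id -> {"name": ..., "plan": ...}; plan is None until addEP
        self._applications = {}

    def register(self, application_name):
        """Start monitoring an application and hand back its unique id."""
        app_id = uuid.uuid4().hex  # assumption: any unique token would do
        self._applications[app_id] = {"name": application_name, "plan": None}
        return app_id

    def add_ep(self, app_id, execution_plan):
        """Attach the Execution Plan (workflow) to a registered application."""
        if app_id not in self._applications:
            raise KeyError(f"unknown application id: {app_id}")
        self._applications[app_id]["plan"] = execution_plan

    def plan_for(self, app_id):
        return self._applications[app_id]["plan"]

ps = PerformanceService()
first = ps.register("linear-solver")
second = ps.register("linear-solver")
ps.add_ep(first, ["source", "solver", "display"])
```

Registering the same application name twice yields two distinct ids, so later calls always refer to one specific run.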
8
Performance Ports (2)
• getActivityTime – Get an estimated execution time for part of a workflow
  – Compulsory data
    • Component type – identifies the component we are interested in
    • Resource – the resource it will be run on
    • Activity – which part of the component
  – Optional data
    • Share count – the number of other components that will be running on the resource
    • Problem space characteristics – a set of parameters specified by the component designer (e.g. number of unknowns for a set of linear equations)
9
Performance Ports (3)
• getProblemCharacteristics – Find out the set of parameters, and their types, that can be used when querying the performance service for a given component
  – Requires component type, resource, activity
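
How getActivityTime and getProblemCharacteristics fit together can be sketched as below. This Python sketch assumes a store keyed by (component type, resource, activity) that holds a designer-supplied model. The `add_model` helper, the example model and the linear slowdown for the share count are all assumptions for illustration, not documented ICENI behaviour.

```python
class PerformanceStore:
    def __init__(self):
        # (component type, resource, activity) -> (characteristics, model)
        self._models = {}

    def add_model(self, component_type, resource, activity, characteristics, model):
        """Hypothetical helper: record the parameters and a timing model."""
        self._models[(component_type, resource, activity)] = (characteristics, model)

    def get_problem_characteristics(self, component_type, resource, activity):
        """Return parameter names and types usable in get_activity_time."""
        characteristics, _ = self._models[(component_type, resource, activity)]
        return dict(characteristics)

    def get_activity_time(self, component_type, resource, activity,
                          share_count=0, **problem):
        """Compulsory: component type, resource, activity. The rest is optional."""
        _, model = self._models[(component_type, resource, activity)]
        # Assumption: each co-located component slows the activity linearly.
        return model(**problem) * (1 + share_count)

store = PerformanceStore()
store.add_model("LinearEquationSolver", "node-a", "solve",
                {"unknowns": int},
                lambda unknowns=100: 2 * unknowns)  # made-up model, in seconds

params = store.get_problem_characteristics("LinearEquationSolver", "node-a", "solve")
alone = store.get_activity_time("LinearEquationSolver", "node-a", "solve", unknowns=50)
shared = store.get_activity_time("LinearEquationSolver", "node-a", "solve",
                                 share_count=1, unknowns=50)
default = store.get_activity_time("LinearEquationSolver", "node-a", "solve")
```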
10
Performance Events
• When ICENI components start, or component ports are accessed, events are fired
  – Used to gather performance information about the currently running application
• Events contain data about
  – Time, component where the event happened, resource, type of event (start or port), application to which the event refers
• Events are serialised objects – can be XML documents
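
An event carrying the fields listed above, serialised to and from XML, could be sketched as follows. The element names and the `PerformanceEvent` class are hypothetical; the slides do not give ICENI's actual event schema.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, asdict

@dataclass
class PerformanceEvent:
    time: str
    component: str
    resource: str
    kind: str          # "start" or "port"
    application: str

    def to_xml(self):
        """Serialise the event as a flat XML document (assumed layout)."""
        root = ET.Element("event")
        for name, value in asdict(self).items():
            ET.SubElement(root, name).text = value
        return ET.tostring(root, encoding="unicode")

    @classmethod
    def from_xml(cls, text):
        """Rebuild an event from the XML produced by to_xml."""
        root = ET.fromstring(text)
        return cls(**{child.tag: child.text for child in root})

event = PerformanceEvent("12:00", "LinearEquationSource", "node-a", "start", "app-1")
document = event.to_xml()
restored = PerformanceEvent.from_xml(document)
```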
11
Collection of Performance Results
[Diagram. Labels: Workflow (Linear Equation Source, Linear Equation Solver, Display Vector Results); Event: Start Linear Equation Source; Data Collector; Performance Processing; Performance Store]

Time   Event
12:00  Linear Equation Source Start
12:04  Send out Equations
12:03  Linear Equation Solver Start
12:05  Receive Equations
12:12  ………..
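
The Performance Processing step, turning raw event times into performance data, can be sketched using the event times from this slide. Attributing "Send out Equations" to the Source and "Receive Equations" to the Solver is an assumption, as is the choice to measure each port event relative to its component's start.

```python
from datetime import datetime

# (time, component, event) taken from the slide's event log.
events = [
    ("12:00", "Linear Equation Source", "start"),
    ("12:04", "Linear Equation Source", "send out equations"),
    ("12:03", "Linear Equation Solver", "start"),
    ("12:05", "Linear Equation Solver", "receive equations"),
]

def parse(stamp):
    return datetime.strptime(stamp, "%H:%M")

def minutes_from_start(events):
    """Return {(component, event): minutes since that component started}."""
    started = {}
    elapsed = {}
    # Events may arrive out of order, so sort by timestamp first.
    for stamp, component, kind in sorted(events, key=lambda e: parse(e[0])):
        when = parse(stamp)
        if kind == "start":
            started[component] = when
        elif component in started:
            delta = when - started[component]
            elapsed[(component, kind)] = delta.total_seconds() / 60
    return elapsed

durations = minutes_from_start(events)
```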
12
Launching Service
Launching Framework
• Reservation – Provides a mechanism for reservations to be made
• Advertiser – Generates a document for each resource available from this Launcher
• Launcher – Converts a JDML document into a platform-specific job
• Launcher Factory – Generates a Launcher for each job submitted to the Launching Service
13
Launching Service
• launchJob – Takes an XML description of the job to deploy, written in JDML, and enacts this job on the appropriate resource
  – JDML is translated to the local DRM specifics
• getResources – Returns the set of ids of the resources available from this launcher
  – If a set of user credentials is provided then the list only contains those resources that the user may use
14
Launching Service (2)
• getResourceDescription – Get the resource description for a named resource as an XML document
  – If credentials are provided, only return the document if the user can use the resource
• getResourceAttribute – Query a specific attribute of a resource. Given the name of a resource and the name of one of its attributes, return the value of this attribute
• getLocations – Get a list of the names of the resources
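
The resource query ports of the Launching Service can be sketched as below. This is an illustrative Python model: credentials are reduced to a plain user name checked against an access list, whereas ICENI authenticates with X.509 certificates, and the resource attributes shown are made up.

```python
class LaunchingService:
    def __init__(self, resources, allowed_users):
        self._resources = resources          # resource id -> attribute dict
        self._allowed_users = allowed_users  # resource id -> set of user names

    def _may_use(self, user, resource):
        return user in self._allowed_users.get(resource, set())

    def get_resources(self, user=None):
        """All resource ids, or only those the given user may use."""
        if user is None:
            return sorted(self._resources)
        return sorted(r for r in self._resources if self._may_use(user, r))

    def get_resource_description(self, resource, user=None):
        """Return the description, or None if the user may not use it."""
        if user is not None and not self._may_use(user, resource):
            return None
        return dict(self._resources[resource])

    def get_resource_attribute(self, resource, attribute):
        return self._resources[resource][attribute]

service = LaunchingService(
    resources={"node-a": {"cpus": 8}, "node-b": {"cpus": 64}},
    allowed_users={"node-a": {"alice", "bob"}, "node-b": {"alice"}},
)
```

With this data, `service.get_resources("bob")` lists only `node-a`, while a call without credentials lists both resources.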
15
Launcher With Reservations
• createReservation – Given an agreement document, requests a reservation for a resource
  – Returns an agreement document and an agreement identity
• renegotiateAgreement – Takes an agreement document returned previously and attempts to modify it
  – If successful, a new document is returned
  – If unsuccessful, an alternative proposal is returned
• cancelReservation – Takes a reservation identity and cancels the associated reservation
16
Launcher With Reservations (2)
• createHold – Given an agreement document and timeout value, makes a hold on a resource
  – Arguments may be negotiated
  – Returns an Agreement Document with the Hold Identity
  – Hold is not permanent (time limited)
    • may need to cancel if can’t hold all other components in the application
• confirmHold – Takes a hold identity and makes the hold permanent
• cancelHold – Takes a hold identity and cancels that hold on the resource
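
The hold life cycle (create with a timeout, then confirm or cancel) can be sketched as follows. This Python sketch makes several simplifying assumptions: a resource carries at most one hold or reservation at a time, the agreement document is reduced to a resource name, and the current time is passed in explicitly so the behaviour is deterministic.

```python
import itertools

class ReservingLauncher:
    def __init__(self):
        self._ids = itertools.count(1)
        self._holds = {}         # hold id -> (resource, expiry time)
        self._reservations = {}  # hold id -> resource (permanent)

    def _expire(self, now):
        """Drop holds whose time limit has passed."""
        self._holds = {h: (r, t) for h, (r, t) in self._holds.items() if t > now}

    def create_hold(self, resource, timeout, now):
        """Return a hold id, or None if the resource is already taken."""
        self._expire(now)
        taken = {r for r, _ in self._holds.values()} | set(self._reservations.values())
        if resource in taken:
            return None
        hold_id = next(self._ids)
        self._holds[hold_id] = (resource, now + timeout)
        return hold_id

    def confirm_hold(self, hold_id, now):
        """Make the hold permanent; raises KeyError if it expired or is unknown."""
        self._expire(now)
        resource, _ = self._holds.pop(hold_id)
        self._reservations[hold_id] = resource

    def cancel_hold(self, hold_id):
        self._holds.pop(hold_id, None)

launcher = ReservingLauncher()
first = launcher.create_hold("node-a", timeout=10, now=0)
clash = launcher.create_hold("node-a", timeout=10, now=5)      # still held
second = launcher.create_hold("node-a", timeout=10, now=10)    # first has expired
launcher.confirm_hold(second, now=12)
after_confirm = launcher.create_hold("node-a", timeout=10, now=100)
```

The time limit matters because a caller that fails to hold every resource it needs can simply walk away; unconfirmed holds then lapse on their own.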
17
Reservation Service
• makeReservations – Takes a set of EPs (workflows) and tries to see if any of them can be fully reserved for the given user credentials
  – Returns an EP that can be fully reserved (if one exists)
  – Does this by making holds with the Launching Services and confirming them
• cancelReservation – Takes the Resource Identity and Reservation Identity and cancels that reservation
  – These are found in the EP returned from creating a reservation
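
The all-or-nothing behaviour of makeReservations can be sketched as a hold-then-confirm loop over candidate EPs. This is a hedged Python illustration: the `Launcher` class is a hypothetical stand-in for the Launching Services, an EP is reduced to a component-to-resource dictionary, and user credentials are omitted.

```python
class Launcher:
    """Hypothetical launcher offering holds on a fixed set of free resources."""
    def __init__(self, free):
        self.free = set(free)
        self.held = set()
        self.reserved = set()

    def create_hold(self, resource):
        if resource not in self.free:
            return False
        self.free.remove(resource)
        self.held.add(resource)
        return True

    def confirm_hold(self, resource):
        self.held.remove(resource)
        self.reserved.add(resource)

    def cancel_hold(self, resource):
        self.held.remove(resource)
        self.free.add(resource)

def make_reservations(execution_plans, launcher):
    """Return the first plan whose resources can all be held, else None."""
    for plan in execution_plans:
        held = []
        for resource in sorted(set(plan.values())):
            if not launcher.create_hold(resource):
                break
            held.append(resource)
        else:
            # Every hold succeeded: make them permanent.
            for resource in held:
                launcher.confirm_hold(resource)
            return plan
        # At least one hold failed: release what was held and try the next plan.
        for resource in held:
            launcher.cancel_hold(resource)
    return None

launcher = Launcher({"node-a", "node-b"})
plans = [
    {"source": "node-a", "solver": "node-c"},  # node-c is not available
    {"source": "node-a", "solver": "node-b"},
]
chosen = make_reservations(plans, launcher)
nothing = make_reservations([{"source": "node-z"}], Launcher({"node-a"}))
```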
18
Reservation Engine
• Exposes the underlying reservation features of the DRM
• makeReservation – Takes a reservation including time interval and user credentials
  – Either confirms the reservation is accepted or offers an alternative
• cancelReservation – Takes a reservation identity and cancels it
• makeHold – Takes a reservation request and duration
  – Returns the time interval that the hold will be held for
• cancelHold – Cancel a Hold request given its id
• confirmHold – Make a Hold into a reservation – requires id
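
The "accept or offer an alternative" behaviour of makeReservation can be sketched for a single resource as below. This Python sketch assumes half-open time intervals in arbitrary units and proposes the earliest free slot of the same length at or after the requested start. That proposal rule is an assumption, and user credentials are left out.

```python
class ReservationEngine:
    def __init__(self):
        self._booked = []  # accepted (start, end) intervals, half-open

    def make_reservation(self, start, end):
        """Return ("accepted", interval) or ("alternative", interval)."""
        if all(end <= s or start >= e for s, e in self._booked):
            self._booked.append((start, end))
            return "accepted", (start, end)
        # Assumed rule: slide the request later until it fits in a gap.
        duration = end - start
        candidate = start
        for s, e in sorted(self._booked):
            if candidate + duration <= s:
                break
            candidate = max(candidate, e)
        return "alternative", (candidate, candidate + duration)

engine = ReservationEngine()
first = engine.make_reservation(10, 20)
second = engine.make_reservation(30, 40)
clash = engine.make_reservation(15, 25)   # overlaps the first booking
retry = engine.make_reservation(*clash[1])
```

An alternative is only a proposal: nothing is booked until the caller submits it again, as the `retry` call does.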
19
Example Execution
[Sequence diagram. Participants: Actor, Performance Service, Scheduling Service, Reservations Service, Launcher Service, Reservations Engine]
– Advertise (×4)
– Submit workflow
– Get performance information / Performance data
– Get resource information / Resource information
– Evaluate Performance Models
– Schedule workflow
– Create Reservations
– Create Hold (×2) / Hold Created (×2)
– Confirm Hold (×2)
– Reservation Created
– Reservation Confirmed
– Execution Plan
– Application Started
– Deploy Jobs onto Resources
20
Service: ICENI
• End-to-end Grid middleware, providing launching, scheduling, reservation and inter-application communication
  – URL: www.lesc.doc.ic.ac.uk/iceni
  – Licence: ICENI, based on Sun open source licence
  – Support: Web site / mailing list
• SOA Model: Jini
21
What do you use to build your service? (i.e. How ‘standard’ is your service?)
• Widely Implemented Standard Specification (1pt)
  – JINI
• Implemented draft specification (2pt)
• Implemented draft specification (3pt)
• Implemented proposal (4pt)
  – ICENI Architecture
• Non-implemented proposal (5pt)
• Concept (6pt)
• TOTAL: JINI 1pt + Implemented proposal 4pt = 5pt
22
Service Dependencies
• What else does your service depend on (i.e. external dependencies)?
  – Logging: Java Logging
• What does your implementation depend on?
  – Languages: Java
  – JINI based.
23
AAA & Security
• What authentication mechanism do you use?
  – X509 certificate based.
• What authorisation mechanism do you use?
  – From ICENI infrastructure.
• What accounting mechanism do you use?
  – None at present.
• Does service interaction need to be encrypted?
• If these are not used now, will they be in the future?
24
Exploiting the Service Architecture
• What features from your ‘plumbing’ do you use in your service?
  – Event notification
  – Meta-data
  – Registry discovery/advertisement
25
Service Activity
• Multiple interaction or single user?
  – Multiple interaction
• Throughput (1 per day or 100 per second?)
  – ~10 per min.
• Typical data volume moved in
• Typical data volume moved out
  – Depends on job.
26
Service Failure
• Required Reliability
  – Failure semantics?
    • Positive ack
• Required Persistence
  – No current persistence.
• Required Availability
  – One of many.
27
Required Service Management
• Remote access to:
  – Performance
  – Progress (limited at present).
28
The Future
• How will ICENI develop?
• Want to re-engineer services as web services
• Already have this for launcher (WS-JDML)
• Bring ICENI back into mainstream services
  – More reliable and useful to others
  – Fragment ICENI into separate interoperating services
  – Explore different service discovery mechanisms
29
Acknowledgements
• Director: Professor John Darlington
• Research Staff:
  – Anthony Mayer, Nathalie Furmento
  – Stephen McGough, William Lee, Jeremy Cohen
  – Marko Krznaric, Murtaza Gulamali
  – Asif Saleem, Laurie Young, Jeffery Hau
• Others:
  – Steven Newhouse, Yong Xie, Gary Kong, James Stanton
• Contact:
  – http://www.lesc.ic.ac.uk/iceni
  – e-mail: lesc@ic.ac.uk