EGEE-III INFSO-RI-222667
Enabling Grids for E-sciencE
www.eu-egee.org
EGEE and gLite are registered trademarks
Experiences with using the EGEE grid infrastructure and lessons for the future
Bob Jones (CERN)
EGEE Project Director
Contents
• EGEE in one slide
• What EGEE does today (one more slide)
• Our understanding of what CLARIN wants
  – Based on Peter Wittenburg’s presentation at EGEE’09 last week and the “CLARIN short-guides” (very useful!): Centres, Trust Domain, Metadata, Virtual Collections, etc.
  – Mapped on to what exists today. I don’t pretend that we have a turn-key solution for CLARIN; rather, these are examples of what is possible
• How CLARIN could interface with EGI
• Suggested Next Steps
Lots of material contributed by EGEE & WLCG colleagues
Bob Jones - NEERI 09 2
EGEE-III
Main Objectives
– Expand/optimise existing EGEE infrastructure, include more resources and user communities
– Prepare migration from a project-based model to a sustainable federated infrastructure based on National Grid Initiatives
Flagship Grid infrastructure project co-funded by the European Commission
Duration: 2 years
Consortium: ~140 organisations across 33 countries
EC co-funding: €32 million
EGEE – What do we deliver?
• Infrastructure operation
  – Sites distributed across many countries
  – Large quantity of CPUs and storage
  – Continuous monitoring of grid services & automated site configuration/management
  – Support multiple Virtual Organisations from diverse research disciplines
• Middleware
  – Production quality software distributed under business friendly open source licence
  – Implements a service-oriented architecture that virtualises resources
  – Adheres to recommendations on web service inter-operability and evolving towards emerging standards
• User Support
  – Managed process from first contact through to production usage
  – Training
  – Expertise in grid-enabling applications
  – Online helpdesk
  – Dedicated support for specific disciplines
  – Networking events (User Forum, Conferences etc.) for cross-discipline interaction
CLARIN Centres
• Centres classification
  – Recognized R
  – Metadata C
  – Service B
  – Infrastructure A (roughly equivalent to EGEE Regional Operations Centres)
  – External E
• Need to monitor quality of services provided by centres
  – Need more details on the service definitions for each type
  – Probably need a Service Level Agreement for each type
• Example from EGEE: EGEE Service level agreement between Regional Operations Centres and Sites
• EGEE/EGI has an extendable monitoring infrastructure
  – Based on NAGIOS, a widely used and extendable open source monitoring toolkit
  – See the Service Availability Monitoring in EGEE and Beyond video demo @ EGEE’09 on YouTube
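To make the monitoring point concrete, here is a minimal sketch of a Nagios-style probe in Python. It relies only on the standard Nagios plugin convention of exit codes 0 to 3 (OK, WARNING, CRITICAL, UNKNOWN). The function names, the thresholds and the `check` callable are assumptions made for illustration; this is not EGEE's actual probe code.

```python
import time
from typing import Callable, Tuple

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
STATUS_NAMES = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def classify(elapsed_s: float, warn_s: float, crit_s: float) -> int:
    """Map a measured response time to a Nagios status code."""
    if elapsed_s >= crit_s:
        return CRITICAL
    if elapsed_s >= warn_s:
        return WARNING
    return OK


def run_probe(check: Callable[[], None],
              warn_s: float = 2.0,
              crit_s: float = 10.0) -> Tuple[int, str]:
    """Run a (hypothetical) service check and return (status, one-line message).

    `check` stands in for whatever a centre's service test would do,
    e.g. contacting a metadata service. The thresholds are illustrative.
    """
    start = time.monotonic()
    try:
        check()
    except Exception as exc:  # a failing check is reported as CRITICAL
        return CRITICAL, f"CRITICAL - check failed: {exc}"
    elapsed = time.monotonic() - start
    status = classify(elapsed, warn_s, crit_s)
    return status, f"{STATUS_NAMES[status]} - responded in {elapsed:.2f}s"
```

A real plugin would print the message on one line and call `sys.exit(status)` so that Nagios can pick up the result.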
WLCG depends on two major science grid infrastructures ….
EGEE - Enabling Grids for E-sciencE
OSG - US Open Science Grid
Interoperability & interoperation are vital; significant effort goes into building the procedures to support them
Tier 0 – Tier 1 – Tier 2
Tier-0 (CERN):
• Data recording
• Initial data reconstruction
• Data distribution
Tier-1 (11 centres):
• Permanent storage
• Re-processing
• Analysis
Tier-2 (~130 centres):
• Simulation
• End-user analysis
The WLCG MoU: http://lcg.web.cern.ch/lcg/mou.htm
An example: WLCG
Monitoring Centres
http://gstat-dev/gstat/summary/grid/WLCG/
Trust Domain
https://www.eugridpma.org/members/worldmap/
• The choices made by CLARIN appear to be very sensible
  – Not exactly the same as EGEE/EGI but interoperation is possible
• Pilot project between BiG Grid, SURFnet and MPI already built an integrated online “SLCS” Certificate Authority service with an example use case of the IMDI browser (a linguistic corpus access browser)
• Talk to AAI community
  – GEANT and IGTF/EUGridPMA have a lot of useful experience
  – Europe should avoid separate sets of CAs
  – [email protected]
Centre availability/reliability reporting
See VO Specific Service Monitor using Service Level Status video demo @ EGEE’09 on YouTube
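As a rough illustration of what such a report computes, the sketch below derives availability and reliability from a series of equally spaced status samples. The definitions are assumptions in the style commonly used for grid site reporting, not figures or formulas from this talk: availability is time up divided by time with a known status, and reliability additionally excludes scheduled downtime from the denominator. The status labels are hypothetical.

```python
from collections import Counter
from typing import Iterable, Tuple

# Hypothetical status labels for equally spaced monitoring samples.
UP, DOWN, SCHEDULED_DOWN, UNKNOWN = "UP", "DOWN", "SCHED", "UNKNOWN"


def availability_reliability(samples: Iterable[str]) -> Tuple[float, float]:
    """Return (availability, reliability) for a sequence of status samples.

    Assumed definitions (labelled as assumptions, see lead-in):
      availability = up / (up + down + scheduled_down)
      reliability  = up / (up + down)
    Samples with an UNKNOWN or unrecognised status are ignored.
    """
    counts = Counter(samples)
    known = counts[UP] + counts[DOWN] + counts[SCHEDULED_DOWN]
    if known == 0:
        raise ValueError("no samples with a known status")
    availability = counts[UP] / known
    unscheduled = counts[UP] + counts[DOWN]
    # If the centre was in scheduled downtime for the whole period,
    # reliability is undefined; report it as NaN rather than guessing.
    reliability = counts[UP] / unscheduled if unscheduled else float("nan")
    return availability, reliability
```

For example, 18 samples up, 2 down and 4 in scheduled downtime give an availability of 0.75 and a reliability of 0.9 under these definitions.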
Component Metadata
• AMGA – the ARDA Metadata Grid Application
• Metadata Catalogue of EGEE’s gLite Middleware
  – Millions of files, 6000+ users, 200+ computing centres
  – Mainly (read-only) file metadata
  – Main concerns: scalability, performance, fault-tolerance, support for hierarchical collections, security
      – Replicates metadata between different AMGA instances, allowing the federation of metadata
      – Supports different authentication methods via (Grid-Proxy-) Certificates as well as very flexible access control mechanisms for individual data items based on ACLs
  – Does not yet support Persistent Identifiers
      – AMGA uses grid file LFNs (Logical File Names), as does the rest of gLite
      – Would require some development
http://amga.web.cern.ch/amga/
AMGA 2.0 presentation at EGEE’09
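The combination of hierarchical collections and per-item ACLs can be pictured with a small in-memory model. The sketch below is purely illustrative: the `Catalogue` class, its methods and the paths are hypothetical and are not the AMGA API. It only shows the idea of an entry inheriting its reader list from the nearest enclosing collection unless it defines its own.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Optional, Set


@dataclass
class Entry:
    """One catalogue entry: key/value metadata plus an optional reader ACL."""
    attributes: Dict[str, str] = field(default_factory=dict)
    readers: Optional[Set[str]] = None  # None means: inherit from parent


class Catalogue:
    """Toy hierarchical metadata catalogue (hypothetical, not AMGA)."""

    def __init__(self, root_readers: Iterable[str]):
        self._entries: Dict[str, Entry] = {"/": Entry(readers=set(root_readers))}

    def add(self, path: str,
            attributes: Optional[Dict[str, str]] = None,
            readers: Optional[Iterable[str]] = None) -> None:
        acl = set(readers) if readers is not None else None
        self._entries[path] = Entry(dict(attributes or {}), acl)

    def _effective_readers(self, path: str) -> Set[str]:
        # Walk up the hierarchy until an entry with its own ACL is found.
        current = path
        while True:
            entry = self._entries.get(current)
            if entry is not None and entry.readers is not None:
                return entry.readers
            if current == "/":
                return set()
            current = current.rsplit("/", 1)[0] or "/"

    def read(self, path: str, user: str) -> Dict[str, str]:
        if path not in self._entries:
            raise KeyError(path)
        if user not in self._effective_readers(path):
            raise PermissionError(f"{user} may not read {path}")
        return dict(self._entries[path].attributes)
```

Setting an ACL on a collection such as `/corpora` then covers everything beneath it, while a single record can still be restricted further by giving it its own reader list.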
Same campus as KAIST (possible ISOcat mirror for CLARIN)
Workflows
• Many workflow managers supported
  – WMS (part of gLite)
  – GridWay (part of RESPECT)
  – Kepler, Taverna etc.
• Example - WISDOM
Virtual Collections
VOMS: Virtual Organization Membership Service
VOMS is a system for managing authorization data within multi-institutional collaborations. VOMS provides a database of user roles and capabilities, together with a set of tools for accessing and manipulating the database and for using its contents to generate Grid credentials for users when needed.
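The idea of a membership database that is turned into credential attributes can be sketched as follows. This is a toy model, not the VOMS client or server API: the table layout, the `clarin` VO name and the user DN are hypothetical, and the output strings merely follow the general `/group/Role=role` style of VOMS attribute names.

```python
from typing import Dict, List, Tuple

# Hypothetical membership table: user DN -> list of (group, role) pairs.
# An empty role means plain group membership.
Membership = Dict[str, List[Tuple[str, str]]]


def attributes_for(db: Membership, user_dn: str) -> List[str]:
    """Return the attribute strings to embed in a user's credential.

    Unknown users get an empty list, i.e. no authorization attributes.
    """
    attributes = []
    for group, role in db.get(user_dn, []):
        attributes.append(f"{group}/Role={role}" if role else group)
    return attributes


# Illustrative data only; names are invented for this sketch.
example_db: Membership = {
    "/DC=org/DC=example/CN=Alice": [
        ("/clarin", ""),
        ("/clarin/corpora", "curator"),
    ],
}
```

A service receiving the credential can then base its access decision on these attributes rather than on the individual user's identity, which is what lets a collaboration manage membership in one place.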
http://www.gcube-system.org/
gCube offers a feature-rich platform for distributed hosting, management and retrieval of data and information
See EGEE09 demo on YouTube: A Virtual Research Environment for Species Distribution Map Generation and Management
Goal: Long-term sustainability of grid infrastructures in Europe
Approach: Establish a federated model bringing together National Grid Infrastructures (NGIs) to build the European Grid Infrastructure (EGI)
EGI Organisation: Coordination and operation of a common multi-national, multi-disciplinary Grid infrastructure
– To enable and support international Grid-based collaboration
– To provide support and added value to NGIs
– To liaise with corresponding infrastructures outside Europe
CLARIN and EGI
• The creation of National Grid Infrastructures and their overall coordination can provide an ICT context for the research infrastructures
  – An operational framework for centres involved in CLARIN
• In the EGI context, Specialised Support Centres (SSCs) are the means of interaction with user communities
  – The EGI SSCs are established and governed by the user communities
  – Humanities SSC foreseen in ROSCOE project proposal
ESFRI @ EGEE’09
Cherenkov Telescope Array
How to be future proof
• Consider ALL (production grids, supercomputers, commercial cloud systems, volunteer grids, network etc.) as a combined e-Infrastructure ecosystem
  – Aim for interoperability and combine the resources into a consistent whole
  – Work closely with EGEE/EGI, DEISA/PRACE and GEANT; they are ready to help and have links around the world
• Keep the applications agile
  – Don’t make the code so specialised that it can only use one specific installation; things will change!
• Make it easy for the users
  – Consider a community gateway/portal
  – Simplify authorisation/authentication
  – Easy access to common codes (handle license issues)
  – Relevant tutorials & documentation
Grids, clouds, supercomputers, etc.
Grids
• Collaborative environment
• Distributed resources (political/sociological)
• Commodity hardware (also supercomputers)
• (HEP) data management
• Complex interfaces (bug, not feature)
Supercomputers
• Expensive
• Low latency interconnects
• Applications peer reviewed
• Parallel/coupled applications
• Traditional interfaces (login)
• Also SC grids (DEISA, Teragrid)
Clouds
• Proprietary (implementation)
• Economies of scale in management
• Commodity hardware
• Virtualisation for service provision and encapsulating application environment
• Details of physical resources hidden
• Simple interfaces (too simple?)
Volunteer computing
• Simple mechanism to access millions of CPUs
• Difficult if (much) data involved
• Control of environment check
• Community building – people involved in Science
• Potential for huge amounts of real work
Many different problems: amenable to different solutions
No right answer
European E-Infrastructure Forum
• Forum for the discussion of principles and practices to create synergies for distributed Infrastructures
• Goal: seamless interoperation of leading e-Infrastructures serving the European Research Area
• Focus: needs of the user communities that require services which can only be achieved by collaborating Infrastructures
• Initial membership:
  – EGEE & EGI
  – DEISA & PRACE
  – Terena & GEANT
• Offers a way of interacting as a whole with user communities of a multi-national nature that are interested in making use of the Infrastructures
Proposed next steps (1)
• Identify clear contact points between the ESFRI projects and e-Infrastructures
  – E-Infrastructure projects have been talking to individuals (users or partners)
  – Can we make contacts more official and identify contact points in specific areas:
      • Security
      • Data management
      • Network
      • Etc.
  – These will be useful for establishing links between different ESFRI projects, between ESFRI projects and e-Infrastructures etc.
Proposed next steps (2)
• Use these contacts to build matrix for technical requirements & organisational aspects
Requirement | CLARIN | DARIAH/CESSDA | EISCAT3D | EPOS | LIFEWATCH | ELIXIR | XFEL | CTA | FAIR | SKA
Single sign-on
Persistent storage
Global workflows
Virt Org
stds
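One way to hold such a matrix while it is being filled in is a mapping from each requirement to the set of projects that need it. The sketch below is only an illustration under stated assumptions: the row and column names are taken from the slide as best they can be read, the helper names (`empty_matrix`, `mark`, `shared_requirements`) are hypothetical, and the matrix starts empty because the slide gives no cell values.

```python
from typing import Dict, List, Set

# Column and row labels as read from the slide; cell values are NOT given there.
PROJECTS = ["CLARIN", "DARIAH/CESSDA", "EISCAT3D", "EPOS", "LIFEWATCH",
            "ELIXIR", "XFEL", "CTA", "FAIR", "SKA"]
REQUIREMENTS = ["Single sign-on", "Persistent storage", "Global workflows",
                "Virt Org", "stds"]

Matrix = Dict[str, Set[str]]


def empty_matrix() -> Matrix:
    """Start with no project marked against any requirement."""
    return {req: set() for req in REQUIREMENTS}


def mark(matrix: Matrix, requirement: str, project: str) -> None:
    """Record that a project has stated a requirement."""
    if requirement not in matrix:
        raise KeyError(f"unknown requirement: {requirement}")
    if project not in PROJECTS:
        raise ValueError(f"unknown project: {project}")
    matrix[requirement].add(project)


def shared_requirements(matrix: Matrix, a: str, b: str) -> List[str]:
    """List the requirements two projects have in common, in row order."""
    return [req for req in REQUIREMENTS
            if a in matrix[req] and b in matrix[req]]
```

Once contact points have supplied real entries via `mark`, a query such as `shared_requirements(m, "CLARIN", "CTA")` would point to areas where two projects could collaborate.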
Proposed next steps (3)
• Once the matrix has been built it can be used to focus:
– Collaboration between ESFRI projects
– Collaboration between ESFRI projects and e-Infrastructures
– Provide input to roadmaps for e-Infrastructures of the future
– Provide input to national funding agencies and European Commission on their future funding programmes
Summary
The key added value of grid infrastructures is a framework for collaboration
• Global secure access to computing resources, data, software and results
  – CPU power for computing-intensive tasks
  – Data management capabilities
      – Metadata and annotation
      – Security
      – Replication
      – High-speed data transfers
      – Facilitate creation of distributed data repositories, data mining, indexing and search
  – Software services
      – Availability of open source software
      – Integration with commercial software packages
• Scalable and dynamic architecture which can be extended with additional services as required
• All organisations can participate AND contribute
The EGI operational model and SSCs are a candidate mechanism for CLARIN to interact with EGI