M.Kunze, ACAT2003, Tsukuba
ACAT2003
Session 1 Summary Talk: Computing Technology
and Environment for Physics Research
Marcel Kunze
Institute for Scientific Computing (IWR)
Forschungszentrum Karlsruhe GmbH
Topics (33 Talks and Round Table Discussion)
Parallel Computing Technologies and Applications (7 Talks)
Data Fabric and Data Management (5 Talks)
Online Monitoring and Control (3 Talks)
Advanced Analysis Environments (8 Talks)
Innovations in Software Engineering (5 Talks)
Graphic User Interfaces, Common Libraries (5 Talks)
It is almost impossible to treat everything in this talk: It will be necessary to focus on specific topics.
Computing Platforms Evolution
At the Oberammergau workshop (1993), I gave a summary talk titled "Computing in the 90's". We saw the transition from mainframes to workstations, and I emphasized the importance of the client-server model.
Today, platforms and applications are far more
Advanced
Powerful
Dynamic
Complex
This talk is about distributed computing and the importance of Grid Services.
A little bit of History
Grid Computing
Grid Computing has emerged as an important new field, distinguished from conventional distributed computing by its focus on large-scale resource sharing, innovative applications, and, in some cases, high-performance orientation.
Foster, Kesselman, Tuecke, The Anatomy of the Grid: Enabling Scalable Virtual Organizations, 2001
For us this means:
Provision of common tools, frameworks, environment, data persistency
Exploiting the resources available to experiments in computing centres, physics institutes and universities around the world
Presenting this as a reliable, coherent environment for the experiments
The goal is to enable the physicist to concentrate on science, unaware of the details and complexity of the environment they are exploiting
The Golden Rule
Keep it simple
As simple as possible
Not any simpler
- Einstein
Middleware
The tools that provide functions that are of general application ...
... not HEP-special or experiment-special
and that we can reasonably expect to come in the long term from public or commercial sources (cf. internet protocols, Unix, HTML)
Grid Services
Open Grid Services Infrastructure (OGSI)
Distributed applications are made of software components
Grid Services are an extension of Web Services:
Discovery
Dynamic service creation
Lifetime management
Notification
OGSI defines a set of standardized interfaces and protocols
Currently available OGSI implementations:
Unix: Globus Toolkit 3, OGSI::Lite (Perl), pyGridWare (Python)
Windows: OGSI.NET (Virginia Univ.); MS.NETGrid (EPCC)
GT3 has been evaluated by LCG (Talk: M. Lamanna):
Generally impressed with GT3 and the overall concept
GT3 IndexService: totally new, looks well designed
Information system and GRAM (critical parts of the GLOBUS kit) have problems of scalability and reliability
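The four Grid Service capabilities listed above (discovery, dynamic service creation, lifetime management, notification) can be sketched as a small in-memory toy. This is only an illustration under stated assumptions: it is not the OGSI interface set and not the Globus Toolkit 3 API, and the names `Factory`, `ServiceInstance`, `create`, `find`, `sweep`, `subscribe`, `set_data` and `extend_lifetime` are hypothetical.

```python
import itertools


class ServiceInstance:
    """A transient service instance (hypothetical toy, not an OGSI port type)."""

    def __init__(self, handle, kind, termination_time):
        self.handle = handle
        self.kind = kind
        self.termination_time = termination_time
        self.data = {}
        self._subscribers = []

    def subscribe(self, callback):
        # Notification: remember who wants to hear about data changes.
        self._subscribers.append(callback)

    def set_data(self, name, value):
        # Update a data element and notify every subscriber.
        self.data[name] = value
        for callback in self._subscribers:
            callback(self.handle, name, value)

    def extend_lifetime(self, new_termination_time):
        # Lifetime management: a keep-alive can only push the deadline out.
        self.termination_time = max(self.termination_time, new_termination_time)


class Factory:
    """Creates instances on demand and doubles as a tiny registry."""

    def __init__(self):
        self._instances = {}
        self._counter = itertools.count(1)

    def create(self, kind, now, lifetime):
        # Dynamic service creation: every call yields a fresh handle.
        handle = f"{kind}-{next(self._counter)}"
        instance = ServiceInstance(handle, kind, now + lifetime)
        self._instances[handle] = instance
        return instance

    def find(self, kind):
        # Discovery: look up live instances by kind.
        return sorted(h for h, s in self._instances.items() if s.kind == kind)

    def sweep(self, now):
        # Drop instances whose termination time has passed.
        expired = sorted(h for h, s in self._instances.items()
                         if s.termination_time <= now)
        for handle in expired:
            del self._instances[handle]
        return expired


factory = Factory()
job = factory.create("job", now=0, lifetime=10)
seen = []
job.subscribe(lambda handle, name, value: seen.append((handle, name, value)))
job.set_data("status", "running")   # the subscriber is notified
job.extend_lifetime(25)             # the client keeps the instance alive
print(factory.find("job"), seen, factory.sweep(now=30))
```

Time is passed in explicitly (`now`) so that the expiry behaviour is deterministic; a real service container would use wall-clock time and remote calls instead.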
Data Grids: Architecture
Replica Consistency Service in a Data Grid (Talk: G.Pucciani)
Replication of data increases system performance
Problem: Consistency management is an issue in applications where users can modify replicas
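The consistency problem can be illustrated with a version counter per logical file. This is a minimal sketch under assumptions, not the Replica Consistency Service presented in the talk: the classes `Replica` and `ReplicaCatalogue`, the site names, and the policy (reject writes on stale replicas, then propagate the newest content) are all hypothetical.

```python
class Replica:
    """One physical copy of a logical file at a given site."""

    def __init__(self, site, version, content):
        self.site = site
        self.version = version
        self.content = content


class ReplicaCatalogue:
    """Tracks the newest version of one logical file and all its replicas."""

    def __init__(self):
        self.replicas = {}
        self.latest_version = 0
        self.latest_content = b""

    def add_replica(self, site):
        # A new replica starts as a copy of the newest known content.
        self.replicas[site] = Replica(site, self.latest_version, self.latest_content)
        return self.replicas[site]

    def write(self, site, content):
        # Only an up-to-date replica may be modified; this avoids lost updates.
        replica = self.replicas[site]
        if replica.version != self.latest_version:
            raise RuntimeError(f"replica at {site} is stale, synchronise first")
        self.latest_version += 1
        self.latest_content = content
        replica.version = self.latest_version
        replica.content = content

    def stale_sites(self):
        # Replicas that have not yet seen the newest version.
        return sorted(s for s, r in self.replicas.items()
                      if r.version < self.latest_version)

    def synchronise(self):
        # Propagate the newest content to every stale replica.
        updated = self.stale_sites()
        for site in updated:
            self.replicas[site].version = self.latest_version
            self.replicas[site].content = self.latest_content
        return updated


catalogue = ReplicaCatalogue()
catalogue.add_replica("site-a")
catalogue.add_replica("site-b")
catalogue.write("site-a", b"calibration v1")
print(catalogue.stale_sites())   # site-b has not seen the update yet
print(catalogue.synchronise())   # propagate the update
```

Read-only replicas never hit this path, which is why replication is cheap for immutable data and only becomes a consistency issue once users can modify copies.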
Integrating SRB with the GIGGLE/EDG framework (Talk: S.Metson)
Active collaboration between members of CMS, BaBar and the SDSC SRB Group
Problem: Could files stored in SRB be accessed by LCG tools?
Data discovery component is well understood
Full interoperation requires further development effort
Implementation of corresponding Grid services is planned
Grid Tools: Monitoring
Configuration Monitoring Tool for Large Scale Distributed Computing (Talk: Y.Wu)
Tracks and queries site configuration information for large-scale distributed CMS applications
Plans to rework the tool as a Grid service
Grid Tools: Monitoring
MonALISA: Monitoring Agents using a Large Integrated Services Architecture (Talk: I.Legrand)
Dynamic registration, discovery and subscription mechanism
Adaptability and self-organization
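The registration, discovery and subscription pattern named above can be sketched in a few lines. This is an illustrative toy only, not the MonALISA API: the class `MonitoringRegistry`, its methods, and the service and metric names are hypothetical.

```python
from collections import defaultdict


class MonitoringRegistry:
    """In-memory stand-in for a lookup service with metric subscriptions."""

    def __init__(self):
        self._services = {}
        self._subscribers = defaultdict(list)

    def register(self, name, **attributes):
        # Dynamic registration: a monitoring agent announces itself.
        self._services[name] = dict(attributes)

    def unregister(self, name):
        # Agents may disappear at any time.
        self._services.pop(name, None)

    def discover(self, **wanted):
        # Discovery: find agents whose attributes match all given values.
        return sorted(name for name, attrs in self._services.items()
                      if all(attrs.get(k) == v for k, v in wanted.items()))

    def subscribe(self, metric, callback):
        # Subscription: clients name the metric they are interested in.
        self._subscribers[metric].append(callback)

    def publish(self, service, metric, value):
        # Only registered agents may publish; each subscriber is called once.
        if service not in self._services:
            raise KeyError(f"unknown service: {service}")
        for callback in self._subscribers[metric]:
            callback(service, metric, value)


registry = MonitoringRegistry()
registry.register("farm-01", site="siteA", kind="cpu-farm")
registry.register("disk-01", site="siteA", kind="storage")

readings = []
registry.subscribe("load", lambda svc, metric, value: readings.append((svc, value)))
registry.publish("farm-01", "load", 0.75)
registry.publish("disk-01", "free_tb", 12)   # nobody subscribed to this metric

print(registry.discover(site="siteA"), readings)
```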
Distributed Systems: Simulation
MONARC simulation framework (Talk: I.Legrand)
Modelling of large scale distributed computing systems
Design tool for large distributed systems
Performance evaluation
Distributed Physics Data Analysis
Most HEP experiments are developing frameworks for distributed computing (M.Burgon-Lyon, G.Garzoglio: CDF, D0; P.Elmer, A.Hasan: BaBar; I.Adachi, G.Moloney: Belle; L.Taylor: CMS; A.J.Peters: ALICE)
Various workable solutions exist
Sometimes parallel and incompatible efforts
Importance of standardization
ARDA: Architectural Roadmap towards Distributed Analysis
RTAG11: http://www.uscms.org/s&c/lcg/ARDA/
Common Grid analysis architecture for all LHC experiments
OGSI compliant
Concern: Analysis activities require chaotic access to resources by a large number of potentially inexperienced users (Professors)
Component-by-component deployment and avoiding big-bang releases are critical parts of the implementation strategy
Recommendation: Prototype based on AliEn (Talk: A.J.Peters)
Advanced Analysis Environments
General Re-Use of Components and Services (95%)
Interactive Physics Data Analysis
Issues:
Typical interactive requests will run on O(TB) of distributed data
Transfer/replication time for the whole data set is about one hour
Data transfers once and in advance of the interactive session
Allocation, installation and set-up of corresponding database servers before the interactive session
Integration of user-friendly interactive access
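A back-of-the-envelope check of the transfer figures above, as a small sketch. The assumptions are mine: "O(TB)" is taken as exactly 1 TB (10^12 bytes) and "about one hour" as exactly 3600 s. The 800 MB/s link rate used in the second calculation is the Infiniband figure quoted on the Data Fabric slide of this talk; whether such a rate is reachable for wide-area replication is not claimed here.

```python
TB = 1e12   # bytes, decimal terabyte (assumption)
MB = 1e6    # bytes, decimal megabyte


def required_rate_mb_per_s(data_bytes, seconds):
    """Sustained throughput needed to move data_bytes within seconds."""
    return data_bytes / seconds / MB


def transfer_minutes(data_bytes, rate_mb_per_s):
    """Time in minutes to move data_bytes at a sustained rate in MB/s."""
    return data_bytes / (rate_mb_per_s * MB) / 60.0


# 1 TB in one hour needs roughly 278 MB/s sustained.
rate = required_rate_mb_per_s(1 * TB, 3600)
# At 800 MB/s the same terabyte takes roughly 21 minutes.
minutes = transfer_minutes(1 * TB, 800)

print(f"{rate:.0f} MB/s, {minutes:.1f} min")
```

Sustained rates of this size explain why the slide asks for data to be transferred once and in advance of the interactive session rather than on demand.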
Parallel ROOT Facility: PROOF
[Diagram: Local and Remote sides; Selection Parameters; Procedure (Proc.C); PROOF; CPUs; TagDB; RDB; DB1 to DB6]
Talk: Fons Rademakers
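The pattern behind the diagram (one selection procedure shipped to several workers, each processing part of the data, results merged locally) can be sketched as follows. This is not the PROOF or ROOT API: the functions `make_packets`, `select` and `run_parallel` are hypothetical, the events are plain dictionaries, and local threads stand in for the remote CPUs.

```python
from concurrent.futures import ThreadPoolExecutor


def make_packets(events, packet_size):
    """Split the event list into fixed-size packets, one unit of work each."""
    return [events[i:i + packet_size] for i in range(0, len(events), packet_size)]


def select(packet, threshold):
    """The procedure every worker runs: keep events above an energy threshold."""
    return [event for event in packet if event["energy"] > threshold]


def run_parallel(events, threshold, workers=4, packet_size=3):
    """Send the same procedure to all workers and merge the partial results."""
    packets = make_packets(events, packet_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in packet order, so the merge is deterministic.
        partial_results = pool.map(lambda packet: select(packet, threshold), packets)
        return [event for part in partial_results for event in part]


events = [{"id": i, "energy": float(i)} for i in range(10)]
selected = run_parallel(events, threshold=5.0)
print([event["id"] for event in selected])
```

Small packets let faster workers take more of the load; the packet size here is an arbitrary choice for the example.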
PROOF: Actual Development
PEAC System Overview
Common Libraries: SEAL
SEAL Project Overview (Talk: L.Moneta)
SEAL has delivered basic foundation, utility libraries and object dictionary
The first version of the Component Model and Framework services is available
Scripting based on Python
Common Libraries: PI
Physics Interface Project, Status and Plan (Talk: A.Pfeiffer)
Analysis Services components written in Python
Prototypes available to implement AIDA interface for HippoDraw and ROOT
Graphic User Interfaces: QtRoot
Cross-platform approach to creating interactive applications based on the ROOT and Qt GUI libraries (Talk: V.Fine)
The Qt package from TrollTech AS is a multi-platform C++ application framework that developers can use to write single-source applications that run natively on Windows, Linux, Unix, Mac OS X and embedded Linux.
A lot of Qt widgets available for re-use
Qt is the basis for the KDE desktop
Consolidation of ROOT Graphics (TGQt vs. TGWin32, TGX11, TGWin32GDK)
Example: A fragment of STAR “Event Display” QtGLViewer class based viewer
(see: http://www.rhic.bnl.gov/~fine/EventDisplay)
Fabric Area
Data Fabric and Data Management
Need for powerful, high throughput systems
Storage Area Networks
GridKa scalable IO design based on fibre channel technique (Talk: J.van Wezel)
Infiniband yields 800 MB/s (Talk: U. Schwickerath)
Need for powerful trigger systems to reduce data
Realtime analysis for the ALICE HLT (Talk: C.Loizides)
Need for powerful clusters and networks for online event reconstruction and distributed analysis
Realtime event reconstruction farm for Belle (Talk: R.Itoh)
A basic R&D for an analysis framework distributed on wide area network (Talk: H. Sakamoto)
New methods for data integration and management
Grid portal based data management for lattice QCD data (GENIUS Talk: G.Andronico)
Round Table Discussion
D. Laforenza
It took 200 Years to develop electrical Grids
Open Questions
Is the far-reaching vision offered by Grid Computing obscured by the lack of interoperability standards among grid computing technologies?
Should the next few years be considered as a transition period with multiple prototypes in competition to speed up the development?
How to design Grid-aware Applications?
Make developers and users aware of network based applications
Need to think about new abstract programming models
Development of new programming techniques and tools that specifically address the Grid and encompass
Heterogeneity
Distributed computing aspects of Grid programming
CrossGrid: Tools for easy Use of the Grid
CrossGrid: Migrating Desktop
Idea:
Save and resume a user grid session
Look and feel of a windows desktop
Implementation:
Roaming Access Server and Clients
Java Web Services (Portability)
Integration of Tools:
Job submission wizard
Job monitoring dialog
GridExplorer dialog
GridCommander dialog
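The "save and resume a user grid session" idea amounts to serialising the desktop state on a server so that any client can restore it later. A minimal sketch, assuming the state fits in a JSON document; this is not the CrossGrid Migrating Desktop or Roaming Access Server interface, and the functions `save_session` and `resume_session` as well as the field names (`user`, `open_dialogs`, `jobs`) are hypothetical.

```python
import json

SESSION_FORMAT = 1   # hypothetical format version stored with each session


def save_session(session):
    """Serialise the session state to text that a server could store."""
    document = {"format": SESSION_FORMAT, "session": session}
    return json.dumps(document, sort_keys=True)


def resume_session(text):
    """Restore a session from stored text, rejecting unknown formats."""
    document = json.loads(text)
    if document.get("format") != SESSION_FORMAT:
        raise ValueError("unsupported session format")
    return document["session"]


session = {
    "user": "alice",
    "open_dialogs": ["job-monitoring", "grid-explorer"],
    "jobs": [{"id": 17, "state": "running"}],
}

stored = save_session(session)       # done when the user leaves one client
restored = resume_session(stored)    # done when the user logs in elsewhere
print(restored == session)
```

Keeping the stored form plain text and versioned is one way to let clients on different platforms read the same session, in the spirit of the portability argument on the slide.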
Outlook
Scaling of Fabric Infrastructure
Cheap commodity components vs. high-tech solutions (e.g. SAN)
Note: Each service needs an operator
Total cost of ownership has to take into account infrastructure & manpower
What will be the business model for the Grid market place of resources?
Unlimited access? Credit points? Cash?
ARDA prototype will push development of Physics applications in a distributed environment
What will the production environment look like?
Components will be based on Grid Services
Open Grid Service Infrastructure is the common denominator
Rapid prototyping and user feedback are essential!
Concern: Users only change their paradigm of working if they see added value (better results, faster turn-around, additional resources etc.)!