4th December 2003 Wolfgang von Rüden, IT Update for Focus 1
Update on ongoing IT activities
FOCUS December 2003
Thanks to all contributors from IT groups
Wolfgang von Rüden
IT Department Structure 2004
Infrastructure and General Services
Administrative Information Services
Communication Systems
Databases
Internet Services
Product Support
User and Document Services
Physics Services
GRID Deployment
GRID Middleware
Fabric Infrastructure and Operations
Architectures and Data Challenges
Control Systems
Projects
LHC Computing GRID
EGEE
Joint Controls Project (JCOP)
LHC Communications Infrastructure (ComIn)
Department Head’s Office: Departmental Planning, Admin Services, Computer Security, major projects (LCG, EGEE, openlab)
DG’s office
Computer Security Update
• See Linux slide for URGENT kernel patch
• Off-site FTP closure: 20 January 2004
  – Details at http://cern.ch/security/ftp
  – Announced in cern.computing, Weekly Bulletin and emails sent to the registered responsible person of ftp servers visible in the firewall
• Hardware address registration enforcement for portables
  – Implemented for the whole site since 2 Dec 2003
• Pre-Xmas security campaigns are in progress
  – Systems placing the site at risk will be disconnected from the network before the Xmas shutdown (as for previous years)
• AFS password expiry enforcement
  – Being tested within IT Division now
  – Planned for rest of CERN early next year
  – 60 days warning (no grace period)
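As an illustration of the AFS password expiry policy above, here is a minimal Python sketch of how a 60-day warning window with no grace period could be evaluated. The function name, the status labels, and the reading of "60 days warning" as "the 60 days before expiry" are assumptions; the slides do not describe the actual implementation.

```python
from datetime import date, timedelta

WARNING_DAYS = 60  # from the slide: 60 days warning


def password_status(expiry: date, today: date) -> str:
    """Classify a password as 'ok', 'warning' or 'expired' (hypothetical labels)."""
    if today >= expiry:
        # No grace period: the password is unusable from the expiry date on.
        return "expired"
    if today >= expiry - timedelta(days=WARNING_DAYS):
        # Inside the warning window: the user should be told to change it.
        return "warning"
    return "ok"
```

With a hypothetical expiry date of 1 March 2004, a check on 4 December 2003 falls outside the warning window, while a check on 1 January 2004 falls inside it.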
Architecture and Data Challenges Group (ADC)
• AFS:
  – ~10% of total AFS capacity is now being served from standard CERN Linux diskservers (mainly for scratch space). No stability or performance problems observed.
• OpenLab:
  – New: ORACLE joined OpenLab (see Press Release)
  – IBM StorageTank is undergoing stability & scaling tests, to serve as CASTOR disk layer in a mock Data Challenge in Spring 2004
  – LCG software is being ported to Itanium/Linux
  – In collaboration with CS:
    • Internet2 landspeed record broken (to Starlight/Chicago, 5.4 Gb/s)
    • With ATLAS: First ever Ethernet over SONET (to Ottawa)
Architecture & Data Challenges, cont.
• CASTOR
  – New developments
    • New stager prototype with LSF scheduling demonstrated in October. Progress report available at http://cern.ch/castor/DOCUMENTATION/ARCHITECTURE/NEW/Welcome.html
    • Gridification: CASTOR SRM (Storage Resource Manager) is in production.
      – Includes the “copy” method (third-party SRM-to-SRM).
      – Interoperability with FNAL has been demonstrated
  – Support
    • Linux tape driver support is being handed over to the Linux team with a lot of help from Jean-Philippe Baud
    • Significant progress in HEPCCC discussion for support to external institutes. Formal agreement expected soon
Architecture & Data Challenges
• Linux:
  – An urgent security upgrade is required for *all* Linux kernels. Please ensure this information reaches the relevant people in your community. See http://cern.ch/security
  – Certification of Red Hat 10 (now “Fedora”) stopped
  – CERN Linux 7.3 will be supported until end of 2004
  – HEP-wide (Fermi, SLAC, CERN) negotiations with RedHat over RHE-3 are difficult
  – Now evaluating collaboration with Fermi on a HEP-supported recompiled version of RHE-3.
Communication Systems (CS)
• GSM Mobile Telephone upgrade:
  – Operator change on 5th January 2004
  – New SIM cards and copiers are being distributed; the process is going smoothly
  – There will be NO recorded message to external callers if they dial the old number. Therefore, users must inform their correspondents from now on that the prefix will change:
    • Current prefix: +41 79 201 XX XX
    • New prefix: +41 76 487 XX XX
• Internet Landspeed Record broken during Telecom 2003 (5.4 Gb/s CERN-Chicago)
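For anyone updating an address book in bulk, the GSM prefix change above can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, and it assumes that the last four digits of each number stay the same after the operator change, as the "XX XX" pattern on the slide suggests.

```python
OLD_PREFIX = "+4179201"  # current prefix: +41 79 201 XX XX
NEW_PREFIX = "+4176487"  # new prefix:     +41 76 487 XX XX


def migrate_number(number: str) -> str:
    """Rewrite an old-prefix number to the new prefix.

    Numbers that do not match the old prefix pattern are returned unchanged.
    Assumes the four-digit suffix is preserved across the change.
    """
    digits = number.replace(" ", "")
    suffix = digits[len(OLD_PREFIX):]
    if not digits.startswith(OLD_PREFIX) or len(suffix) != 4 or not suffix.isdigit():
        return number
    return f"+41 76 487 {suffix[:2]} {suffix[2:]}"
```

For example, `migrate_number("+41 79 201 12 34")` yields `"+41 76 487 12 34"`, while a number with a different prefix is left untouched.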
Database Group (IT-DB)
LCG Persistency Framework Project
• POOL
  – Internal LCG and LHCC review of the project
  – Successful validation of POOL on LCG-1 platform
  – Latest release: POOL 1.4.0
    • Significant improvements in handling of Castor requests
  – Next POOL release (this year) will support LCG-2
• ConditionsDB
  – Project has started to pick up existing implementations
  – First round of requirement discussions with the experiments
  – Kickoff Workshop Dec 8-9, 593-R-010
• 2004 Workplans for POOL & ConditionsDB currently being defined
  – RDBMS vendor independence and ARDA integration
Database Group (cont)
Physics Services
• RLS service for LCG
  – Ongoing work on enhancing the service availability
    • Semiautomatic Application Server failover achieved (via DNS alias switching)
      – No service interruption required for planned interventions
      – Studying solutions for fully automated service fail-over
    • Hot standby database server in place
      – Service interruption down to a few minutes for back-end interventions
  – Planning RLS service migration to LCG-2 with IT-GD and IT-GM
    • New EDG-RLS 2.2 version will require data migration and POOL upgrade
• Distribution kits for RLS
  – Working with CNAF on test installation of RLS outside CERN
    • Other Tier1 centres interested (Karlsruhe and RAL)
  – Request (CMS) for replication of RLS data between a few Tier1 sites
    • Proof-of-concept prototype underway using Oracle replication
    • Stop-gap until RLS middleware supports replication
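The failover via DNS alias switching mentioned above can be pictured with a small Python sketch. It only shows the decision of which application server the service alias should point to; the host names, the health-check callback, and the function name are hypothetical, and the actual update of the DNS record (done by an operator in a semiautomatic setup) is left out.

```python
from typing import Callable, Sequence


def pick_alias_target(current: str,
                      candidates: Sequence[str],
                      is_healthy: Callable[[str], bool]) -> str:
    """Return the host that the service alias should point to."""
    if is_healthy(current):
        # Keep the alias where it is to avoid needless switches.
        return current
    for host in candidates:
        if host != current and is_healthy(host):
            # First healthy standby takes over.
            return host
    # No healthy alternative: leave the alias unchanged.
    return current
```

Because clients address the service through the alias rather than a fixed host, moving the alias before a planned intervention lets the work proceed without a visible service interruption.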
Database Group (cont)
Physics Services
• Harp Migration
  – All events and conditions data migrated
  – Oracle based Software ready to be integrated into Harp software release
  – Condition data stored using current version of Oracle implementation
• COMPASS
  – First tests of local server in Trieste using Oracle distribution as prepared for RLS
  – Next steps will be agreed between IT and COMPASS after careful consideration of implications, particularly on support and data import/export
• Sun cluster for Physics
  – All production physics applications migrated from the CERNDB1 cluster to the dedicated physics cluster
Fabric Infrastructure and Operations (FIO)
• More intelligent batch scheduling
  – Scheduling of 3rd job per host now conditional on load
• Successful startup of insourced System Administration team
  – Training during October
  – Linux batch services managed from November
  – Solaris services from December 1st; disk & tape servers from the new year.
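The load-conditional scheduling of a third job per host can be sketched as a simple admission check. This is a minimal Python illustration, not the batch system's configuration: the load threshold value, the use of a load average as the metric, and all names are assumptions.

```python
UNCONDITIONAL_SLOTS = 2  # first two jobs per host are always admitted (assumption)
MAX_SLOTS = 3            # a third job is the most a host will take
LOAD_THRESHOLD = 1.5     # hypothetical load limit for admitting the third job


def may_dispatch(running_jobs: int, load_avg: float) -> bool:
    """Decide whether one more job may be sent to a host."""
    if running_jobs < UNCONDITIONAL_SLOTS:
        return True
    if running_jobs < MAX_SLOTS:
        # Third slot: only if the host is not already heavily loaded.
        return load_avg < LOAD_THRESHOLD
    return False
```

The point of such a rule is that hosts running two light jobs can absorb a third, while hosts already saturated by two heavy jobs are left alone.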
Grid Deployment (GD)
• LCG-1 is now deployed to 24 sites including many Tier 2s
  – Relatively few CPUs until experiments request the regional centres to add them
• Experiments all testing LCG-1 now
  – Middleware stability much improved
  – Working on site configuration issues
• Preparations for LCG-2 in hand
  – Middleware release Dec 5, deployment during December
  – Addresses Mass Storage access, gcc3.2.x, bug fixes
  – Preparation for the 2004 data challenges, very tight schedule
• Due to compressed timescale will be addressing operational issues in parallel with running the service
  – Idea is to run LCG-2 as a stable service, no major (disruptive) upgrades, but focus on bug fixes, stability, and essential functionality
  – Foresee a lifetime of 1 year for LCG-2
Sites in LCG-1 – Nov 25
Internet Services Group (IS) – October/November 2003
• Mail Migration finished
• Two additional SMTP gateways added to the legacy infrastructure to absorb increased traffic (now 2 million msg/week)
• Listbox delivery delay problem solved, one additional machine added, monitoring improved
• Web access to DFS file system now available
  – Using normal browser or WebDAV
  – https://dfs.cern.ch
• Backlog of security patch deployment reabsorbed
• DFS home directory server hardware renewed
Product Support (PS)
• CVS Service
  – 54 CVS repositories successfully running on the public CVS service.
    • Problem with external access solved.
      – Port 2000 was used for Kerberos-authenticated access for security
      – Firewall assumed that port 2000 was running the Skinny protocol (IP telephony related)
    • We have requested users to change to the standard CVS ports.
  – CVS service for the LCG being validated with help of Andreas Pfeiffer and Alberto Aimar.
    • It should be ready for production soon.
• Solaris
  – Sun Fire V210 dual 1GHz UltraSPARC-IIIi machines for the technology refresh of SUNDEV in production since 26-Nov.
    • No major problem found.
    • Performance at least double that of the current machines.
European DataGrid Project
• Provides software components and support for LCG-1 and LCG-2
  – Final release (EDG 2.1) deployed on application testbed
  – Components of EDG 2.1 included in LCG-2
• Final deliverables (documents) currently under review
  – Include assessments by LHC experiments, bio-informatics, earth observation groups and security experts
  – Will provide important input for EGEE
• Final EU review scheduled
  – 19th and 20th February 2004 @ CERN
• Project extended until March 2004
  – Allows smooth transition to EGEE
EGEE: Enabling Grids for E-Science in Europe
• Create a European-wide Grid production quality infrastructure for multiple sciences
• Leverage national resources in a more effective way for broader European benefit
• 70 leading institutions in 27 countries organised into regional federations
EGEE Implementation Plans
• From day 1 (1st April 2004)
  – Initial service will be based on the LCG infrastructure running the LCG-2 grid middleware suite
• In parallel develop a “next generation” grid facility
  – Produce a new set of grid services according to evolving standard (OGSI)
  – Run on a development service providing early access to the applications for evaluation purposes
  – Will replace LCG-2 on production facility by early 2005
EGEE Activities
24% Joint Research
  JRA1: Middleware Engineering and Integration
  JRA2: Quality Assurance
  JRA3: Security
  JRA4: Network Services Development
28% Networking
  NA1: Management
  NA2: Dissemination and Outreach
  NA3: User Training and Education
  NA4: Application Identification and Support
  NA5: Policy and International Cooperation
48% Services
  SA1: Grid Operations, Support and Management
  SA2: Network Resource Provision
Emphasis in EGEE is on operating a production grid and supporting the end-users
Starts 1st April 2004 for 2 years (1st phase) with EU funding of ~32M€
EGEE Status
• Negotiations have been successfully completed with the EU and planning for the transition to EGEE is now underway
• Rapid start-up so EGEE can “hit the ground running” is essential to ensure it can meet milestones
  – Management team is in place
    • Currently working to define execution plan for first few months
    • Prototyping work will start in December
  – Hiring process is underway
    • CERN has announced posts and is organising recruitment boards
    • Other major partners are also opening posts
• Starts April 2004 for 2 years with EU funding of ~32M€
  – Possibility of 2nd phase if successful
Some additional remarks
• 2003 was another very busy year
• Many services have been improved by migrating them to more modern hardware
• Successful completion of CERN School of Computing and the first Grid School
• IT found a lot of additional funding, from Member States, non-Member States, European Union and Industry
• Still need to improve communication with users
• 2003: Thanks for all the feedback and the collaborative spirit of most users
• 2004: Continue the good relations with users, improve the bad ones.