HEPiX Spring 2008 @ CERN -Summary Report

HEPSysMan @ RAL, 19-20 June 2008

Martin Bly

20 June 2008 HEPiX Spring 2008 Report - HEPSysMan @ RAL

Overview

• Venue/Format/Themes
• CPU Benchmarking Working Group
• Storage and File Systems Working Group
• Scientific Linux
• Selected topics


Spring HEPiX 2008

• Venue: CERN, 5th to 9th May
  – Council Chamber
    • Very comfortable, good wireless network access
• Format
  – Sessions based on themes with a morning ‘plenary’ by an invited speaker
  – ½ to 1 day per theme
• Agenda: http://indico.cern.ch/conferenceTimeTable.py?confId=27391


Themes

• LHC and Data Readiness
• LHC overview
• Trigger farms of LHC experiments
• LCG overview and status
• CCRC
• Site Reports
• Storage technology
• CPU technology
• Data centre management, availability, and reliability
• Problem resolution, problem tracking, alarm systems
• System management
• Networking infrastructure and computer security
• Applications and Operating systems
• HEPiX ‘bazaar and think-tank’
• General Virtualisation
• Grid stuff (Monitoring etc.)
• Miscellaneous


Benchmarking Working Group

• WLCG MoUs based on SI2K
  – SPEC2000 deprecated in favour of SPEC2006
    • no longer available or maintained
• Remit
  – Find a benchmark accepted by HEP and others, as many sites serve different communities
• Review of existing benchmarking practices (CERN, FZK, INFN, …)
• Last 6 months: setup of benchmarking test-bed with dedicated HW at CERN and others
  – Covering a wide range of processors with typical HEP configuration (2GB/core)
  – Run SPEC benchmarks with agreed flags
    • SL4/64-bit OS with benchmarks at 32-bit/gcc 3.4
    • Look at SL5, 64-bit, gcc4
  – Run a variety of ‘standard candles’ from the LHC experiments’ code to compare with SPEC
    • Provides scaling and recalibration of computing requirements
• Looking at understanding the statistical treatment of experiment results
  – Recently uncovered different methodologies for random numbers!
• No major scaling problem with either SI2K or SI2K6
  – Should allow a smooth transition
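
The comparison of a ‘standard candle’ against a SPEC score can be sketched as a straight-line fit through the origin. This is only an illustration of the idea, not the working group's method: the function names and any numbers fed to them are hypothetical, and the slides do not say how the scaling was actually computed.

```python
def scaling_factor(spec_scores, candle_rates):
    """Least-squares slope k for candle_rate ~ k * spec_score (fit through origin).

    Both arguments are per-machine measurements in the same order.
    The model and the names are assumptions made for illustration.
    """
    if not spec_scores or len(spec_scores) != len(candle_rates):
        raise ValueError("need two non-empty sequences of equal length")
    num = sum(s * c for s, c in zip(spec_scores, candle_rates))
    den = sum(s * s for s in spec_scores)
    return num / den


def max_relative_deviation(spec_scores, candle_rates):
    """Largest relative gap between a measured rate and the fitted line.

    A small value suggests the benchmark scales like the application.
    """
    k = scaling_factor(spec_scores, candle_rates)
    return max(abs(c - k * s) / (k * s)
               for s, c in zip(spec_scores, candle_rates))
```

Running the same two functions once with SI2K scores and once with SPEC2006 scores for the same machines would give one rough way to check whether both benchmarks track the experiment code equally well.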


File Systems Working Group

• Started with a questionnaire about storage at T1s
• Followed up with a technology review and selection
  – Posix FS (TFA): LUSTRE, GPFS, AFS
  – SRM: CASTOR, dCache, DPM
  – Xrootd
• Performance comparison between selected technologies
• Testbed setup at CERN with 10 servers and 60 8-core clients with 1 Gb/s connection, 4-5 6TB
  – 480 simultaneous client tasks (60 clients × 8 cores)
  – 3 tests: writing, sequential read, pseudo-random read
  – Most implementations able to sustain wire-speed in writes and sequential reads
  – Significant performance advantage for LUSTRE in pseudo-random reads, but test conditions must be clarified
    • The use case may favour LUSTRE’s client-side caching
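
The three access patterns named above (writing, sequential read, pseudo-random read) can be sketched on a single local file. This is a toy illustration and not the working group's test harness: block size, block count, seed and function names are all assumptions, and a real test would run many such tasks in parallel against the file system under test and time them.

```python
import os
import random
import tempfile

BLOCK = 64 * 1024   # assumed block size in bytes
NBLOCKS = 32        # assumed number of blocks per file


def write_file(path, nblocks=NBLOCKS, block=BLOCK):
    """Write nblocks blocks of random bytes; return bytes written."""
    total = 0
    with open(path, "wb") as f:
        for _ in range(nblocks):
            total += f.write(os.urandom(block))
    return total


def sequential_read(path, block=BLOCK):
    """Read the file front to back in fixed-size blocks; return bytes read."""
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(block)
            if not chunk:
                break
            total += len(chunk)
    return total


def pseudo_random_read(path, nblocks=NBLOCKS, block=BLOCK, seed=42):
    """Read every block once, in a seeded shuffled order; return bytes read."""
    offsets = [i * block for i in range(nblocks)]
    random.Random(seed).shuffle(offsets)
    total = 0
    with open(path, "rb") as f:
        for off in offsets:
            f.seek(off)
            total += len(f.read(block))
    return total


def run_all():
    """Run the three patterns on a temporary file and return the byte counts."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "testfile.bin")
        return write_file(path), sequential_read(path), pseudo_random_read(path)
```

Because the pseudo-random pass re-reads blocks that were just written and read, a client-side cache can serve much of it, which is one reason the test conditions matter when comparing results.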


Scientific Linux

• Review of recent releases: SL5.1, SL4.6
  – Trying to put the 64-bit versions out at the same time as the 32-bit versions
• Obsolete 3.0.1 to 3.0.8
• Description of an issue with ‘new’ tags in version numbers, which make newer versions appear older to yum
• Working on automating ‘fastbugs’ repositories
• Clarifying policy on security errata
• Future:
  – SL3.0.9 to continue till October 2010.
  – Planning on doing SL4.7, SL5.2, SL6.
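
How an extra tag can make a newer package look older can be illustrated with a simplified, RPM-style segment comparison. This is a sketch under stated assumptions: it is not the actual rpm or yum code (it ignores epochs and special characters such as `~`), and the version strings in the usage note are hypothetical, since the slides do not show the real tags.

```python
import re


def split_segments(version):
    """Split into runs of digits or runs of letters; separators are dropped."""
    return re.findall(r"\d+|[A-Za-z]+", version)


def compare_versions(a, b):
    """Return 1 if a is newer, -1 if older, 0 if equal (simplified rules).

    Numeric segments compare as integers, alphabetic segments compare as
    strings, and a numeric segment is treated as newer than an alphabetic one.
    """
    sa, sb = split_segments(a), split_segments(b)
    for x, y in zip(sa, sb):
        x_num, y_num = x.isdigit(), y.isdigit()
        if x_num and y_num:
            xi, yi = int(x), int(y)
            if xi != yi:
                return 1 if xi > yi else -1
        elif x_num != y_num:
            return 1 if x_num else -1
        elif x != y:
            return 1 if x > y else -1
    if len(sa) == len(sb):
        return 0
    return 1 if len(sa) > len(sb) else -1
```

With these rules, `compare_versions("1.new.sl4", "1.sl4")` returns -1: the segment `new` sorts before `sl`, so the package carrying the extra tag is considered older even though it was built later.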


SL discussions

• Support for SL4?
  – RHEL4: full support 3 years, deployment support 3-5 years, maintenance support for 5-7 years.
  – RHEL4 released Feb 2005, so it is now in deployment support
• Critical that Grid middleware is available
  – DESY need SL4 to Autumn 2011
  – CERN intending to introduce batch and UIs for SL5 in Autumn 2008, so WN gLite payload should be available
• Some concern over experiment readiness
  – Compiler is the important factor rather than the actual version of SL
  – Encourage shorter deadlines with more flexibility on extending deadlines: likely to get better buy-in from users
  – So suggest July 2010? Suggestion of October 2010, same as SL3, to stop short-term migration.
• XFS in SL?
  – In or out? Consensus is to have it in, using the usual kernel module system. Jan Iven hears from an unreliable source that back-ports of the latest version are coming. SL4 or SL5? SL4 contrib, SL5 standard. Does it work with 32-bit? Yes, kernel now less hostile.
• Scientific Linux 6: Should it be based on CentOS?
  – Still do installer changes
  – Still add RPMs we usually do
  – Use precompiled RPMs
  – Change/recompile RPMs we feel the need to (SL graphics).
• Kernel modules: Adding the security repo during the install gets the correct kernel but incorrect modules. Can fix installer, fix up afterwards with a script, or use dkms. Add dkms to release, do it instead of kernel modules?
• Stop Press: RHEL 4 lifetime extended: ‘full support’ for 4 years…
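
The support-phase arithmetic can be checked with a few lines of code. This is a minimal sketch: the phase boundaries are the ones quoted on the slide, the day of the month for the Feb 2005 release is an assumption, and the function name is made up for illustration.

```python
from datetime import date


def support_phase(release, today, full_years=3, deployment_end=5,
                  maintenance_end=7):
    """Return the support phase for a release at a given date.

    Boundaries are in years since release, as quoted on the slide:
    full support up to full_years, deployment support up to deployment_end,
    maintenance support up to maintenance_end.
    """
    age_years = (today - release).days / 365.25
    if age_years < full_years:
        return "full support"
    if age_years < deployment_end:
        return "deployment support"
    if age_years < maintenance_end:
        return "maintenance support"
    return "end of life"


# RHEL4 release: Feb 2005 per the slide; the day (1st) is assumed.
RHEL4_RELEASE = date(2005, 2, 1)
```

At the time of the meeting, `support_phase(RHEL4_RELEASE, date(2008, 5, 5))` gives "deployment support"; with the extension mentioned in the Stop Press item, `full_years=4` moves the same date back into "full support".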


Selected Topics I

• Well attended talk by Sascha Brawer from Google, describing their technology and methods for handling very large datasets over distributed geographical locations
• Based on truckloads of low cost systems
  – Care about performance per $ not raw performance
  – In house rack design, chassis-less PC-class motherboards, low end storage
  – Many data centres around the world
  – Need to design software to cope with failures


Selected Topics II

• Several talks on experiences with Lustre
  – DESY: good description of setting it up
  – GSI: talk about production use
  – Lustre appears stable and reliable as a production distributed file system
    • Proof against various failure modes
• Sverre Jarp gave a review of the CERN OpenLab and what they are working on
  – Collaboration with HP, Intel, Oracle…


Experience with Windows Vista at CERN

• Update on Vista activities at CERN
  – Status, plans etc.
  – Using readiness check to determine suitability; Vista is not the default (XP is).
  – Now 300 machines (~5%) running Vista.
  – Notes on introduction of SP1
  – Feb 2008: still preparing for the upgrade rollout. RFM removed in favour of popup nagging.
  – Vista SP1 improved performance over XP or standard Vista, but not by much in most cases.


Virtualisation with Windows at CERN

• Review of virtualisation in IT services at CERN

• 17 physical servers with 45 ‘clients’ ranging through Windows server variants and SLC4/5
  – Using Virtual Server 2005
• New Hyper-V: part of Windows Server 2008, needs 64-bit CPU
  – Supports 32/64-bit guests, large RAM (>32GB) in VMs


Remote Administration via Service Modules

• Work at GSI on using IPMI modules to administer remotely located server hardware
  – Disadvantages of remote access using standard tools, not the least of which is that you need a running OS.
• Discussion of advantages of using IPMI modules for remote control
  – Changing BIOS settings, resets, installing…
  – Detailed description of capabilities.
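
As an illustration of out-of-band control, the sketch below builds command lines for the widely used `ipmitool` client. The slides do not say which tools GSI used, so the choice of `ipmitool`, the host and user names, and the helper functions are all assumptions. The `-E` option makes `ipmitool` read the password from the `IPMI_PASSWORD` environment variable, so it does not appear on the command line.

```python
import subprocess

POWER_ACTIONS = {"status", "on", "off", "cycle", "reset"}


def ipmi_command(host, user, *args):
    """Build an ipmitool argument list for a remote BMC over IPMI v2 (lanplus)."""
    return ["ipmitool", "-I", "lanplus", "-H", host, "-U", user, "-E", *args]


def power_command(host, user, action):
    """Command to query or change chassis power; works without a running OS."""
    if action not in POWER_ACTIONS:
        raise ValueError(f"unsupported power action: {action}")
    return ipmi_command(host, user, "chassis", "power", action)


def boot_to_bios_command(host, user):
    """Command asking the machine to enter BIOS setup on its next boot."""
    return ipmi_command(host, user, "chassis", "bootdev", "bios")


def run(cmd):
    """Execute a built command and return its standard output.

    Needs ipmitool installed and a reachable BMC, so it is not called here.
    """
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout
```

For example, `run(power_command("bmc01.example.org", "admin", "reset"))` would reset a hung server through its service module, which is not possible with tools that depend on the host OS being up.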