
TRIUMF Site Report for HEPiX, Edinburgh, 24-28 May 2004 – Corrie Kost

TRIUMF SITE REPORT

Corrie Kost


LINUX at TRIUMF

RH9
  Use: Servers, Desktop
  ISO CDs: Yes
  Kickstart available: Yes
  Auto updates: Yes
  Notes: only errata / no new hardware

Fedora Core 1
  Use: Leading Desktop, Special Needs Servers
  ISO CDs: Yes
  Kickstart available: Yes
  Auto updates: Yes
  Notes: errata for 18 months

Scientific Linux
  Use: Desktop. Future Servers & Desktops – Support!
  ISO CDs: Yes
  Kickstart available: Yes
  Auto updates: Yes
  Notes: 36 months for hardware, 60 months for errata by RH

TRIUMF urges proper support for Scientific Linux


WAN

Replacement MRV units (10 Gb/sec capable)

Third Passport Router


WestGrid – UBC/TRIUMF Site

• 504 dual 3.06 GHz Xeon IBM blades
• Red Hat Linux 9 to allow GPFS (NFS nixed)
• OPENPBS scheduling with (MOAB) Maui
• 10 TB disk storage
• 70 TB tape storage
• Direct Gigabit connection between sites
• Possible 10GB in future
• February 2004: opened for general use


WestGrid – UBC/TRIUMF Site (www.westgrid.ca)

• From a cold start:
  • GPFS servers load in 5-10 min
  • All nodes up in 60-90 min
• Bring up single nodes: 10 min
• Rebuild (disk) for node: 2 hrs
• Single node failure rate ~ 1/day
• Node disk failures dominate
• Utilization about 87%
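As a rough reading of the figures above, the cluster-wide failure rate can be turned into a per-node interval between failures. The sketch below is a minimal calculation, assuming the 504 blades quoted on the previous slide, failures spread evenly across nodes, and (pessimistically) that every failure needs the full 2 hr disk rebuild. The function names are illustrative only.

```python
# Figures quoted on the WestGrid slides.
NODES = 504              # dual-Xeon blades
FAILURES_PER_DAY = 1.0   # "single node failure rate ~ 1/day"
REBUILD_HOURS = 2.0      # disk rebuild time for one node


def per_node_days_between_failures(nodes, failures_per_day):
    """Average days between failures of any one given node."""
    return nodes / failures_per_day


def fraction_of_nodes_rebuilding(nodes, failures_per_day, rebuild_hours):
    """Average share of the cluster being rebuilt at any moment.

    Assumption: every failure costs one full rebuild (worst case).
    Average nodes down = failure rate * repair time.
    """
    nodes_down = failures_per_day * rebuild_hours / 24.0
    return nodes_down / nodes


days = per_node_days_between_failures(NODES, FAILURES_PER_DAY)
share = fraction_of_nodes_rebuilding(NODES, FAILURES_PER_DAY, REBUILD_HOURS)
print(f"one failure per node every {days:.0f} days on average")
print(f"share of nodes rebuilding at any time: {share:.4%}")
```

Under these assumptions, node failures by themselves account for only a very small part of the gap between the quoted 87% utilization and 100%.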


Network / Servers


Servers Upgrade Program


LCG Grid Participant


High I/O Testbed

Hardware nice but…
• 40-pin IDE cable is a problem with 2.6 kernel
• Mounting bracket screws can short audio & halt boot


STORM1 & STORM2

• Dual 3.2 GHz Xeons
• 4 GB memory
• 4 × 3WARE 8506-4LP
• 16 SATA150 120GB DRIVES
• 20GB ST92011A DRIVE
• INTEL 10GBE PXLA8590LR


High Speed I/O – Part 1

Used ext2 for highest speeds (no journaling, but 2TB file size limit)

RH 9, writes:
• One to four disks, software RAID 0, on a 3-Ware controller: 50.6, 98, 124, 141 MB/sec respectively
• Four disks split over two 3-Ware controllers: 162 MB/sec
• Four disks on 1 hardware RAID 0 and software RAID 0: 138 MB/sec
• Adding 4 more disks on second 3-Ware: 250 MB/sec (slots 2,5), 247 MB/sec (slots 2,3)
• Adding 4 more disks on third 3-Ware: 273 MB/sec (slots 2,3,5), 265 MB/sec (slots 2,3,4)
• Adding 4 more disks on fourth 3-Ware: 283 MB/sec (slots 2,3,4,5)
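The write figures above show throughput growing much more slowly than the disk count. Below is a minimal sketch of that comparison, assuming the single-disk 50.6 MB/sec figure as the ideal per-disk rate and taking the best slot arrangement quoted for 8, 12 and 16 disks. The names are illustrative, and the configurations are not strictly like-for-like (one controller for 1-4 disks, several controllers beyond that).

```python
# (number of disks, write throughput in MB/sec) as quoted on the slide.
MEASURED = [
    (1, 50.6), (2, 98.0), (3, 124.0), (4, 141.0),  # single 3-Ware controller
    (8, 250.0), (12, 273.0), (16, 283.0),          # 2, 3 and 4 controllers
]


def scaling_efficiency(measured):
    """Return {disks: throughput / (disks * single-disk throughput)}."""
    single_disk = measured[0][1]
    return {n: mbps / (n * single_disk) for n, mbps in measured}


efficiency = scaling_efficiency(MEASURED)
for disks, mbps in MEASURED:
    print(f"{disks:2d} disks: {mbps:6.1f} MB/sec, "
          f"{efficiency[disks]:.0%} of ideal linear scaling")
```

With these inputs the efficiency falls from 100% for one disk to roughly 35% for sixteen, consistent with the controller and CPU limits discussed on the later slides.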


High Speed I/O – Part 2

• Using 4 3-ware controllers in hardware RAID 0 mode, software-RAIDed together by Linux

• dd if=/dev/zero of=/raid/8GB bs=81920 count=104857

• Fedora1, non-SMP, 2.4.22-1.2188.nptl, HT, ext2 -T news: write 370 MB/sec

• Fedora1, non-SMP, 2.4.22-1.2188.nptl, HT, reiserfs: write 227 MB/sec

• Loaded e2fs module 1.35-7.1 to fix largefile and largefile4 creation with mkfs -T largefile /dev/md0

• Fedora1, non-SMP, 2.4.22-1.2188.nptl, HT, largefile ext2: write 349 MB/sec

• Fedora1, non-SMP, 2.4.22-1.2188.nptl, noHT, largefile ext2: write 300 MB/sec

• Fedora1, non-SMP, 2.6.6#1, HT, largefile ext2: write 375 MB/sec

• Replacing the 40-pin IDE cable to the main disk with an 80-pin cable allowed SMP to boot

• Fedora1, SMP, 2.6.6#1, noHT, largefile ext2: write 309 MB/sec

• echo 262144 > /proc/sys/net/core/rmem_default

• echo 8388608 > /proc/sys/net/core/rmem_max

• echo 262144 > /proc/sys/net/core/wmem_default

• echo 8388608 > /proc/sys/net/core/wmem_max

• echo 300000 > /proc/sys/net/core/netdev_max_backlog

• echo 8388608 > /proc/sys/net/core/optmem_max

• sysctl -w net.ipv4.tcp_rmem="10000000 10000000 10000000"

• sysctl -w net.ipv4.tcp_wmem="10000000 10000000 10000000"

• sysctl -w net.ipv4.tcp_mem="10000000 10000000 10000000"

• Iperf maxed out at 2.3Gbits/sec with recompiled 2.6.6 kernel for WEB100
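One way to read the buffer settings above is through the bandwidth-delay product: a TCP window of a given size can only keep a link full up to a certain round-trip time. The sketch below is a minimal calculation, assuming the whole socket buffer is usable as window (in practice the kernel reserves part of it for bookkeeping). The function name and the choice of rates are illustrative.

```python
def max_rtt_ms(buffer_bytes, rate_bits_per_sec):
    """Longest round-trip time (ms) that a window of buffer_bytes keeps full."""
    return buffer_bytes * 8 / rate_bits_per_sec * 1000.0


# Buffer sizes taken from the echo/sysctl lines on the slide.
BUFFERS = {
    "rmem_max / wmem_max": 8388608,
    "tcp_rmem / tcp_wmem": 10000000,
}
# Rates: the Iperf result quoted above and the nominal 10GbE line rate.
RATES = {
    "2.3 Gb/sec": 2.3e9,
    "10 Gb/sec": 10e9,
}

for name, size in BUFFERS.items():
    for label, rate in RATES.items():
        print(f"{name} = {size} bytes at {label}: "
              f"full up to about {max_rtt_ms(size, rate):.1f} ms RTT")
```

On a back-to-back or campus link the round-trip time is far below these values, so under this assumption the buffers are unlikely to be what holds Iperf at 2.3 Gbits/sec.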


High Speed I/O – Part 3

[root@storm2 root]# time ttcp -t -b 6000000 -l 102400 storm1-10g </raid/8gb-a
ttcp-t: buflen=102400, nbuf=2048, align=16384/0, port=5001, sockbufsize=6000000 tcp -> storm1-10g
ttcp-t: socket
ttcp-t: sndbuf
ttcp-t: connect
ttcp-t: 8589934592 bytes in 42.80 real seconds = 195978.14 KB/sec +++
ttcp-t: 83887 I/O calls, msec/call = 0.52, calls/sec = 1959.80
ttcp-t: 0.0user 22.2sys 0:42real 52% 0i+0d 0maxrss 0+25pf 17854+622csw

Ttcp disk to disk 191 Mbytes/sec
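The 191 Mbytes/sec figure follows from the ttcp output. A minimal sketch of the arithmetic, assuming ttcp's KB means 1024 bytes and that Mbytes here means 1024 KB. The variable names are illustrative.

```python
import math

# Values copied from the ttcp output.
TOTAL_BYTES = 8589934592    # 8 GiB read from the file under /raid
BUFLEN = 102400             # -l 102400
REPORTED_KB_PER_SEC = 195978.14

# Assumption: 1 KB = 1024 bytes, 1 Mbyte = 1024 KB.
mbytes_per_sec = REPORTED_KB_PER_SEC / 1024
elapsed_sec = TOTAL_BYTES / 1024 / REPORTED_KB_PER_SEC

# Each write sends at most BUFLEN bytes, so the last call is a partial buffer.
io_calls = math.ceil(TOTAL_BYTES / BUFLEN)

print(f"throughput: {mbytes_per_sec:.1f} Mbytes/sec")
print(f"elapsed:    {elapsed_sec:.2f} s")
print(f"I/O calls:  {io_calls}")
```

These reproduce the reported 42.80 s, 83887 I/O calls and about 191 Mbytes/sec.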

Three Walls:
• CPU: 100% seen
• 3Ware I/O Controller (140 MB/sec instead of 4*50, 375 MB/sec instead of 4*140)
• 10Gbit Intel Card using ixgb-1.0.65 driver (2.3 Gb/sec)

Ongoing: Tuning
• Process Affinity (using /usr/bin/run)
• Interrupt Affinity (IRQs of 3-ware and 10GbE set to specific CPUs, e.g. /proc/irq/24/smp_affinity)
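The smp_affinity file takes a hexadecimal bitmask in which bit N selects CPU N. The sketch below only builds the mask and the corresponding command line; it does not write to /proc. The helper name and the CPU chosen for IRQ 24 are hypothetical, not the settings used on STORM1/STORM2.

```python
def smp_affinity_mask(cpus):
    """Return the hex bitmask string selecting the given CPU numbers."""
    mask = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError("CPU numbers must be non-negative")
        mask |= 1 << cpu  # bit N set means CPU N may service the interrupt
    return format(mask, "x")


def affinity_command(irq, cpus):
    """Build (but do not run) the shell command that pins an IRQ to CPUs."""
    return f"echo {smp_affinity_mask(cpus)} > /proc/irq/{irq}/smp_affinity"


# Hypothetical example: pin IRQ 24 to CPU 1 only.
print(affinity_command(24, [1]))
```

Running the generated command requires root, and the IRQ numbers of the 3-ware and 10GbE cards would first have to be looked up in /proc/interrupts.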


Misc. Developments

Build a cheap hot-swap Serial ATA drives RAID 5 system
• 1 Promise FastTrak S150 SX4 controller, $233 Can
• 3 Promise SuperSwap 1100 drive enclosures for SATA/150, $112 Can
• 3 Maxtor 120GB S-ATA drives (6Y120M0), $145 Can

Test on cheap 1.7 GHz Celeron, Intel D845GVSLR, 256 MB memory
Red Hat 9.0 base (won’t work on updated kernels)
• Read large file: 46.8 Mbytes/sec
• Write large file: 46.5 Mbytes/sec
• Able to pull disk while active; auto rebuilds in 75 min when replaced.
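Two derived figures may help put this build in context: the usable RAID 5 capacity and the implied rebuild rate. The sketch is a minimal calculation, assuming decimal gigabytes (120 GB = 120,000 MB) and that a rebuild rewrites the whole replacement drive. The function names are illustrative.

```python
def raid5_usable_gb(drives, drive_gb):
    """RAID 5 stores one drive's worth of parity, leaving (n - 1) drives of data."""
    if drives < 3:
        raise ValueError("RAID 5 needs at least three drives")
    return (drives - 1) * drive_gb


def rebuild_rate_mb_per_sec(drive_gb, rebuild_minutes):
    """Average rate if the whole replacement drive is rewritten (assumption)."""
    return drive_gb * 1000 / (rebuild_minutes * 60)


usable = raid5_usable_gb(3, 120)
rate = rebuild_rate_mb_per_sec(120, 75)
print(f"usable capacity: {usable} GB")
print(f"implied rebuild rate: {rate:.1f} MB/sec")
```

Under these assumptions the array offers 240 GB of usable space, and the 75 min rebuild corresponds to a rate somewhat below the measured large-file read and write speeds.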


Misc. Developments

•Remote power on/off using networked power bars

www.servertech.com


Mail at TRIUMF


IMP Webmail


Squirrel Webmail
