Joe Kaiser – Fermilab:CD/CSS-SCS:US-CMS 5/16/03 Disclaimer
Transcript
Page 1: Disclaimer

Page 2: Who the Heck Are You?

● Joe Kaiser
● Fermilab employee
● Computing Division/Core Services Support
● Scientific Computing Support – CD/CSS-SCS
● Lead sys admin for US-CMS
● Sisyphus
● Socrates – gadfly

Page 3: US-CMS Linux Grid Installations with Rocks

Page 4: Installation, Configuration, Monitoring, and Maintenance of Multiple Grid Subclusters

Page 5: Managing Heterogeneous-Use Grid Clusters with a Minimum of Fuss, Muss, and Cuss for US-CMS

Page 6: Managing Multiple Grid Environments within the US-CMS Grid

Page 7: Managing Grid Clusters So People Will Go Away and Let You Get Some Real Work Done

Page 8: TPM: Total Provisioning and Management

Page 9: TPM

Page 10: Total Pajama Management

Page 11: THIS COULD BE YOU!!!!

Page 12: OR YOU!!!!

Page 13: Deceptively Simple Idea

● Remotely manage:
  ● Power up/power down
  ● Interact with BIOS
  ● Install – interact with the install
  ● Configuration
● Never touch another floppy or CD-ROM
● Minimize human-machine interaction.
● Anytime – anywhere (we already do... sorta.)

Page 14: Why TPM?

● Commodity hardware creates more machines but no sysadmins to take care of them.
● Ratio of machines to sysadmins is increasing.
● Sys admin = "it's up and functioning"
  – Low-level skill.
● Sys admin = manager of large Grid-enabled resources
  – Interesting systems problems.

Page 15: TPM and the US-CMS Computing Facility Story

● Where we were.
● Where we are and how we got here.
● Where we are going.

Page 16: Data Grid Hierarchy (CMS)

[Diagram: the CMS data grid hierarchy. The online system feeds the Tier 0 offline farm at the CERN computer center (~100 MBytes/sec; raw detector output ~PBytes/sec). Tier 1 regional centers – Fermilab, France, Italy, Germany – connect to CERN at ~2.4 Gbits/sec. Tier 2 centers – Caltech, U. of Florida, UC San Diego – connect at ~622 Mbits/sec. Tier 3 institute servers hold physics data caches, and Tier 4 workstations (~0.25 TIPS) attach at 100-1000 Mbits/sec. Annotations: one bunch crossing per 25 nsec; 100 triggers per second; each event is ~1 MByte in size. Physicists work on analysis "channels"; each institute has ~10 physicists working on one or more channels, and data for those channels should be cached by the institute server.]

Page 17: Where we were

● 1998-99ish?
  ● One Sun E4500 and a couple of Hitecs
  ● Solaris, FRHL 5.2?
● 1999-2000ish
  ● 1 Sun, 3 Dell 6450s + PowerVaults
  ● Solaris, RH with Fermi overlay
● 2000ish
  ● 1 Sun, 3 Dells, 40 Atipa farm nodes
  ● FRHL 6.1 NFS installs

Page 18: Where we were (cont'd.)

● 2000-2001ish
  ● 1 Sun, 3 Dells, 40 farm nodes, 3 IBM eSeries w/disk.
  ● 2.6: RH6.1, AFS, and bigmem don't mix.
  ● Mixed-mode machines start.
● 2001-2002ish
  ● See above, plus 3ware terabyte servers, raidzone, 16 farm nodes.
  ● RH6.2 because of Objectivity
  ● Rise of the subcluster

Page 19: Where we are

● Hardware
  ● 65 – dual Athlon 1900+
  ● 16 – dual 1 GHz PIII
  ● 40 – 750 MHz PIII
  ● 4 – quad Dells
  ● 7 – 3ware servers
  ● 3 – quad IBMs
  ● 2 – Suns
● Total is 137 machines
● 2 admins

Page 20: Where we are (cont'd.)

● Software
  – 6 subclusters, three more requested:
    ● DGT, IGT, PG, UAF, dCache, LCG-0
    ● Requested: LCG0-CMS, LCG-1, LCFGng
    ● Combination of RH6.2 and RH7.3.
  – Install/configuration mechanisms
    ● Rocks, Fermi network installs, voldemort, YUM
  – Fermi software – dfarm, FBSNG, dCache, NGOP
  – Grid software – VDT, soon to be 1.1.8

Page 21: Current configuration of computing resources

[Diagram: a Cisco 6509 switch ties together the production cluster (>80 dual nodes), user analysis machines, dCache (>7 TB), R&D systems (1 TB), 250 GB of additional storage, the US-CMS testbed, and ENSTORE (17 drives); WAN connectivity is via ESNET (OC12) and MREN (OC3).]

Page 22: The Impetus

● Frustrations:
  ● FRHL didn't work on CMS resources
    – Dells, farm nodes, mixed-mode clusters, Mosix
  ● Couldn't FRHL-install to Tier 2.
  ● !^$%@# floppies!
● Desires:
  ● One BIOS2LOGIN install, config, and management path
  ● Dynamic partitioning of worker nodes.
  ● Work with my small cluster and the future large one.
  ● No more floppies!

Page 23: The Four Pillars of TPM

● Power control
● Serial console
● Network installation and configuration
● Monitoring

Page 24: Power Control

● APCs
● Power on, off, or reboot individual nodes, or all nodes on a controller.
● Scriptable
● On a private subnet
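"Scriptable" here means driving the PDU over the network; a minimal sketch of building the SNMP commands (hostnames and the community string are invented, and the outlet-control OID is taken from APC's PowerNet MIB – verify it against your unit before trusting it):

```python
# Sketch of scripted power control for an APC-style switched PDU.
# Assumption: the PDU answers SNMP v1 on the private subnet and uses
# the PowerNet-MIB sPDUOutletCtl OID below (check your unit's MIB).
APC_OUTLET_CTL_OID = ".1.3.6.1.4.1.318.1.1.4.4.2.1.3"
ACTIONS = {"on": 1, "off": 2, "reboot": 3}

def apc_snmp_command(pdu_host, outlet, action, community="private"):
    """Build the snmpset command line that switches a single outlet."""
    return (
        f"snmpset -v1 -c {community} {pdu_host} "
        f"{APC_OUTLET_CTL_OID}.{outlet} i {ACTIONS[action]}"
    )

if __name__ == "__main__":
    # Reboot the node plugged into outlet 5 of the first controller.
    print(apc_snmp_command("apc-pdu1.cluster.priv", 5, "reboot"))
```

Wrapping this in a loop over a node-to-outlet map is what makes "reboot the whole subcluster" a one-liner.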

Page 25: Serial Console

● BIOS redirection – 115200
● SCS Lantronix – port set to 115200
● Boot loader/kernel append set to 115200
● Conserver software for central management
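The pieces above meet in conserver's configuration; a hypothetical conserver.cf fragment (hostnames and port numbers are made up) showing one node's console reached through a Lantronix SCS port:

```
# Hypothetical conserver.cf entry: node01's serial line is on TCP
# port 3001 of a Lantronix SCS whose port speed is set to 115200.
default * { rw *; master conserver.cluster.priv; }
console node01 {
    type host;                 # connect over TCP to the terminal server
    host scs1.cluster.priv;    # Lantronix SCS
    port 3001;                 # TCP port mapped to node01's serial line
}
```

With one such entry per node, `console node01` from any admin desktop attaches to the BIOS, boot loader, and login prompt alike.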

Page 26: Network Install Mechanisms

● FRHL network installs – used at work
  – Farm Team and CMS difficulties.
● OSCAR – used at conference
  – Too little, too late, gotta know too much.
● SystemImager – used for mixed mode
● Rocks
  – Tier 2s were using it and looking to us for a distro.
  – Needed to do more than what I had.
  – Had to apply to future problems – not a stopgap.

Page 27: Rocks Is:

● Reliable, fast enough, cheap, simple, scalable.
● Allows for remote installation as part of B2L admin.
● Portable – can build your own distributions.
● Open source, active community.
● Complete cluster management, or part of a suite.
● Allows for middleware to be built on top.

Page 28: Rocks Philosophy

● Make clusters easy – when in doubt, reinstall.
● Sys admins aren't cheap and don't scale.
● Enable non-experts to build clusters.
● Use installation as the common mechanism to manage the cluster.
● All nodes are 100% automatically installed.
  – Zero "hand" configuration
  – Everything you need for a cluster

Page 29: Cluster Software Management

Software packages
● RPMs
  – Standard Red Hat (desktop) packaged software
  – Or your own add-ons
● rocks-dist
  – Manages the RPM repository
  – This is the distribution
● Red Hat Workgroup

Software configuration
● Tuning RPMs
  – For clusters
  – For your site
  – Other customization
● XML Kickstart
  – Programmatic system building
  – Scalable
● Appliance creation
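The XML Kickstart piece is what makes the build programmatic: appliances are assembled from small node files. A sketch of what one such node file can look like (the package name and post-install step are invented; check the Rocks documentation for the exact schema):

```xml
<kickstart>
  <!-- Hypothetical Rocks node file: pull in a site package and run
       one post-install step on every appliance that includes it. -->
  <description>Fermi site customizations (sketch)</description>
  <package>fermi-kerberos-config</package>
  <post>
# hypothetical post-install step
/usr/sbin/useradd -m cmsprod
  </post>
</kickstart>
```

Rocks stitches these node files into a full Red Hat kickstart per machine, which is why "reinstall" can replace "hand-configure."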

Page 30: Basic Installation

● Fully automated cluster deployment
  1. Get and burn the ISO CD (DVD for IA64) image from http://www.rocksclusters.org
  2. Boot the frontend with the CD/DVD
  3. Fill out 7 configuration screens (mostly Red Hat)
  4. Reboot the frontend machine
  5. Integrate compute nodes with insert-ethers
  6. Ready to go!
● Complete out-of-the-box solution with rational default settings
● Identical environment for x86 or IA64

Page 31: Fermi Extends It

● Two words: Kerberos and networking
● Create our own distribution without disturbing Rocks; pass it down to Tier 2.
● XFS Fermi Rocks for 3ware servers
● Add appliances, RPMs, workgroups

Page 32: Dynamic partitioning and configuring the farm

[Diagram: two front-end nodes – one for production, one for user interactive/batch – each heading its own set of compute nodes behind a Cisco 6509, with network-attached storage, dCache, ENSTORE, web servers, and DB servers shared across the partitions.]

Page 33: Dynamic Partitioning

● Move hardware resources to a different subcluster.
● Crudely possible with Rocks and voldemort.
● VOLDEMORT – rsync utility by Steve Timm.
● Install as far as possible with Rocks; the critical differences between worker nodes can then be pushed/pulled, and the node rebooted behind another server with a different disk.
● YUM for updating.
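The push-then-reboot step can itself be scripted; a minimal sketch of generating the rsync and reboot commands that would move a set of worker nodes to another subcluster (the paths, hostnames, and per-subcluster file lists are all invented for illustration):

```python
# Sketch: emit the commands that would repartition worker nodes by
# pushing a subcluster's "critical differences" and rebooting them.
# All paths and hostnames below are hypothetical.
SUBCLUSTER_FILES = {
    "uaf":    "/export/profiles/uaf/",     # user analysis facility
    "dcache": "/export/profiles/dcache/",  # dCache pool nodes
}

def repartition_commands(nodes, subcluster):
    """Return the shell commands to push config deltas and reboot."""
    src = SUBCLUSTER_FILES[subcluster]
    cmds = []
    for node in nodes:
        cmds.append(f"rsync -a --delete {src} {node}:/etc/subcluster/")
        cmds.append(f"ssh {node} /sbin/reboot")
    return cmds

if __name__ == "__main__":
    for cmd in repartition_commands(["node21", "node22"], "uaf"):
        print(cmd)
```

In practice you would pipe the generated commands through the power-control and console layers so a failed push is visible before the node comes back.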

Page 34: The Fourth Pillar – Monitoring

● What is happening on my cluster?
  ● NGOP
  ● Ganglia
● Grid monitoring
  ● Open protocols, open interfaces
● Information must be presented in ways that are conducive to answering the above quickly.
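Ganglia is a good example of the open-interface point: gmond publishes cluster state as plain XML over TCP. A sketch of pulling one metric out of it (the sample XML is hand-written and much smaller than real gmond output, which you would read from gmond's TCP port, commonly 8649):

```python
# Sketch: parse gmond-style XML and report each host's one-minute load.
# The SAMPLE document below is hand-written for illustration.
import xml.etree.ElementTree as ET

SAMPLE = """
<GANGLIA_XML VERSION="2.5.x" SOURCE="gmond">
 <CLUSTER NAME="uscms-prod" OWNER="fermilab">
  <HOST NAME="node21" IP="10.0.0.21">
   <METRIC NAME="load_one" VAL="1.85"/>
  </HOST>
  <HOST NAME="node22" IP="10.0.0.22">
   <METRIC NAME="load_one" VAL="0.12"/>
  </HOST>
 </CLUSTER>
</GANGLIA_XML>
"""

def load_by_host(xml_text, metric="load_one"):
    """Map host name -> float value of the requested metric."""
    root = ET.fromstring(xml_text)
    return {
        host.get("NAME"): float(m.get("VAL"))
        for host in root.iter("HOST")
        for m in host.iter("METRIC")
        if m.get("NAME") == metric
    }

if __name__ == "__main__":
    for host, load in sorted(load_by_host(SAMPLE).items()):
        print(f"{host}: load_one = {load}")
```

Because the format is open, the same few lines answer "what is happening" whether the consumer is a web page, an alarm script, or a grid information service.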

Page 35: Where are we going?

● Farm Management Project
  ● Produce multiuse clusters (grid, interactive, and batch) with common infrastructure.
● Tools needed:
  ● TPM tools – install, configure, monitor, guarantee
  ● Partitioning meta-tools.
  ● Resource and process management (FBSNG is current)
● Any tool to be adopted must be "better" than what I've got.

Page 36: Remember – This Could be YOU!

Page 37: Links

● http://www.rocksclusters.org
● http://ganglia.sourceforge.net
● http://www-isd.fnal.gov/ngop
● http://linux.duke.edu/projects/yum

