
Overview of Wisconsin Campus Grid

Page 1: Overview of Wisconsin Campus Grid

Overview of Wisconsin Campus Grid

Dan Bradley
[email protected]
Center for High-Throughput Computing

Page 2: Overview of Wisconsin Campus Grid


Technology

Page 3: Overview of Wisconsin Campus Grid


HTCondor

[Diagram: submit machines flocking between Condor pools, with firewall, NAT, shared_port, and CCB annotations]

Open one port and use shared_port on submit machine (configuration sketch below).
If execute nodes are behind NAT but have outgoing net, use CCB.

• pools: 5
• submit nodes: 50
• user groups: 106
• execute nodes: 1,600
• cores: 10,000

executable = a.out
RequestMemory = 1000
output = stdout
error = stderr
queue 1000
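
To make the firewall and NAT notes above concrete, here is a minimal configuration sketch using standard HTCondor knobs (FLOCK_TO, USE_SHARED_PORT, CCB_ADDRESS); the pool hostname is hypothetical, not from the slide:

# Submit machine: flock jobs to another campus pool when local slots are busy
FLOCK_TO = cm.other-pool.wisc.edu   # hypothetical hostname

# Submit machine: route all daemon traffic through one shared port,
# so only a single firewall port needs to be opened
USE_SHARED_PORT = TRUE

# Execute nodes behind NAT (outgoing connectivity only): register with a
# CCB broker, typically the pool collector, so the submit machine can reach them
CCB_ADDRESS = $(COLLECTOR_HOST)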

Page 4: Overview of Wisconsin Campus Grid


Accessing Files

• No campus-wide shared FS
• HTCondor file transfer for most cases (sketch after this list):
  – Send software + input files to job
  – Grind, grind, …
  – Send output files back to submit node
• Some other cases:
  – AFS: works on most of campus, but not across OSG
  – httpd + SQUID(s): when xfer from submit node doesn’t scale
  – CVMFS: read-only http FS (see talk tomorrow)
  – HDFS: big datasets on lots of disks
  – Xrootd: good for access from anywhere
    • Used on top of HDFS and local FS
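
As a concrete illustration of the HTCondor file-transfer case, a minimal submit-file sketch (executable and file names are hypothetical):

# Send software + input files to the job; files created in the job's
# scratch directory are transferred back to the submit node on exit
executable = run_analysis.sh
transfer_input_files = software.tar.gz, input.dat
should_transfer_files = YES
when_to_transfer_output = ON_EXIT
output = stdout
error = stderr
queue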

Page 5: Overview of Wisconsin Campus Grid


Managing Workflows

• A simple submit file works for many users
  – We provide an example job wrapper script to help download and set up common software packages: MATLAB, python, R
• DAGMan is used by many others
  – Common pattern (sketch after this list):
    • User drops files into a directory structure
    • Script generates DAG from that
    • Rinse, lather, repeat
• Some application portals are also used
  – e.g. NEOS Online Optimization Service
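
A minimal sketch of the kind of DAG such a script might generate, assuming one analysis node per dropped input file (all node, file, and submit-file names are hypothetical):

# workflow.dag -- hypothetical names; run with: condor_submit_dag workflow.dag
JOB analyze_001 analyze.sub
VARS analyze_001 infile="inputs/001.dat"
JOB analyze_002 analyze.sub
VARS analyze_002 infile="inputs/002.dat"
JOB collect collect.sub
PARENT analyze_001 analyze_002 CHILD collect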

Page 6: Overview of Wisconsin Campus Grid


Overflowing to OSG

• glideinWMS
  – We run a glideinWMS “frontend”
  – Uses OSG glidein factories
  – Appears to users as just another pool to flock to
• But jobs must opt in: +WantGlidein = True (snippet below)

[Chart: million hours used]

• We customize glideins to make them look more like other nodes on campus:
  – publish OS version, glibc version, CVMFS availability
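
A hedged sketch of the submit-file lines a user might add; +WantGlidein = True is from the slide, while the Requirements attribute name (HasCVMFS) is hypothetical, standing in for whatever attributes the glideins actually publish:

# Allow this job to overflow onto OSG glideins
+WantGlidein = True
# Match only machines advertising CVMFS (attribute name is hypothetical)
Requirements = (HasCVMFS =?= True)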

Page 7: Overview of Wisconsin Campus Grid


A Clinical Health Application

• Tyler Churchill: modeling cochlear implants to improve signal processing.
• Used OSG + campus resources to run simulations that include important acoustic temporal fine structure, which is typically ignored due to difficulty.

“We can't do much about sound resolution given hardware limitations, but we can improve the integrated software. OSG and distributed high-throughput computing are helping us rapidly produce results that directly benefit CI wearers.”

Page 8: Overview of Wisconsin Campus Grid


Engaging Users

Page 9: Overview of Wisconsin Campus Grid


Engaging Users

• Meet with individuals (PI + techs)
  – Diagram workflow
  – How much input, output, memory, time?
  – Suitable for exporting to OSG?
  – Where will the output go?
  – What software is needed? Licenses?
• Tech support as needed
• Periodic reviews

Page 10: Overview of Wisconsin Campus Grid


Training Users

• Workshops on campus
  – New users can learn about HTCondor, OSG, etc.
  – Existing groups can send new students
  – Show examples of what others have done
• Classes
  – Scripting for scientific users: python, perl, submitting batch jobs, DAGMan

Page 11: Overview of Wisconsin Campus Grid


User Resources

• Many bring only their (big) brains
  – Use central or local department submit nodes
  – Use only modest scratch space
• Some have their own submit node
  – Can attach their own storage
  – Control user access
  – Install system software packages

Page 12: Overview of Wisconsin Campus Grid


Submitting Big

• Kick-started the work with a big run in EC2, now continuing on campus.

• Building a database to quickly classify stem cells and identify important genes active in cell states useful for clinical applications.

Victor Ruotti, winner of Cycle Computing’s Big Science Challenge

Page 13: Overview of Wisconsin Campus Grid


Users with Clusters

• Three flavors:
  – condominium
    • User provides cash, we do the rest
  – neighborhood association
    • User provides space, power, cooling, machines
    • Configuration is standardized
  – sister cities
    • Independent pools that people want to share
    • e.g. student computer labs

Page 14: Overview of Wisconsin Campus Grid


Laboratory for Molecular and Computational Genomics

• Cluster integrated into campus grid
• Combined resources can map data representing the equivalent of one human genome in 90 minutes.
• Tackling challenging cases such as the important maize genome, which is difficult for traditional sequence assembly approaches.
• Using a whole-genome, single-molecule optical mapping technique.

Page 15: Overview of Wisconsin Campus Grid


Reaching Further

[Chart: Research Groups by Discipline]

