Page 1: Open Science Grid: Linking Universities and Laboratories in National CyberInfrastructure

Paul Avery, University of Florida
[email protected]

SURA Infrastructure Workshop, Austin, TX
December 7, 2005

Page 2: Bottom-up Collaboration: "Trillium"

Trillium = PPDG + GriPhyN + iVDGL
• PPDG: $12M (DOE) (1999 – 2006)
• GriPhyN: $12M (NSF) (2000 – 2005)
• iVDGL: $14M (NSF) (2001 – 2006)

~150 people, with large overlaps between projects
• Universities, labs, foreign partners

Strong driver for funding agency collaborations
• Inter-agency: NSF – DOE
• Intra-agency: Directorate – Directorate, Division – Division

Coordinated internally to meet broad goals
• CS research, developing/supporting the Virtual Data Toolkit (VDT)
• Grid deployment, using VDT-based middleware
• Unified entity when collaborating internationally

Page 3: Common Middleware: Virtual Data Toolkit

[Diagram: VDT build and test pipeline: sources (CVS) → patching → GPT source bundles → NMI Build & Test on a Condor pool spanning 22+ operating systems (build, test, package) → VDT, distributed as Pacman caches, RPMs, and binaries; many contributors.]

A unique laboratory for testing, supporting, deploying, packaging, upgrading, & troubleshooting complex sets of software!

Page 4: VDT Growth Over 3 Years (1.3.8 now)

[Chart: number of VDT components vs. time, Jan 2002 – Apr 2005, spanning the VDT 1.1.x, 1.2.x and 1.3.x series. Milestones: VDT 1.0 (Globus 2.0b, Condor 6.3.1); VDT 1.1.7 (switch to Globus 2.2); VDT 1.1.8 (first real use by LCG); VDT 1.1.11 (Grid3).]

www.griphyn.org/vdt/

Page 5: Components of VDT 1.3.5

Globus 3.2.1, Condor 6.7.6, RLS 3.0, ClassAds 0.9.7, Replica 2.2.4, DOE/EDG CA certs, ftsh 2.0.5, EDG mkgridmap, EDG CRL Update, GLUE Schema 1.0, VDS 1.3.5b, Java, Netlogger 3.2.4, Gatekeeper-Authz, MyProxy 1.11, KX509, System Profiler, GSI OpenSSH 3.4, MonALISA 1.2.32, PyGlobus 1.0.6, MySQL, UberFTP 1.11, DRM 1.2.6a, VOMS 1.4.0, VOMS Admin 0.7.5, Tomcat, PRIMA 0.2, Certificate Scripts, Apache, jClarens 0.5.3, New GridFTP Server, GUMS 1.0.1

Page 6: VDT Collaborative Relationships

[Diagram: the Virtual Data Toolkit sits between computer science research (techniques & software; requirements, prototyping & experiments) and science, engineering and education communities (deployment & feedback; tech transfer). Partner science, networking and outreach projects: Globus, Condor, NMI, TeraGrid, OSG, EGEE, WLCG, Asia, South America, QuarkNet, CHEPREO, Digital Divide. Other linkages: work force, CS researchers, industry, U.S. grids, international, outreach.]

Page 7: Major Science Driver: Large Hadron Collider (LHC) @ CERN

Search for origin of mass, new fundamental forces, supersymmetry, other new particles (2007 – ?)

[Diagram: the 27 km LHC tunnel spanning Switzerland & France, with the ATLAS, CMS, ALICE, LHCb and TOTEM experiments.]

Page 8: LHC: Petascale Global Science

• Complexity: millions of individual detector channels
• Scale: PetaOps (CPU), 100s of Petabytes (data)
• Distribution: global distribution of people & resources

BaBar/D0 example (2004): 700+ physicists, 100+ institutes, 35+ countries
CMS example (2007): 5000+ physicists, 250+ institutes, 60+ countries

Page 9: LHC Global Data Grid (2007+)

[Diagram: CMS experiment data flow. The online system writes to the CERN computer center (Tier 0) at 150 – 1500 MB/s; data flows over >10 Gb/s and 10 – 40 Gb/s links to national Tier 1 centers (USA, Korea, Russia, UK), and on to Tier 2 centers (UCSD, Caltech, U Florida, Iowa) at 2.5 – 10 Gb/s; Tier 3 sites (e.g. Maryland, FIU) and Tier 4 PCs hold physics caches.]

5000 physicists, 60 countries
10s of Petabytes/yr by 2008; 1000 Petabytes in < 10 yrs?
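As a rough sanity check on the "10s of Petabytes/yr" figure, a minimal sketch converts the diagram's sustained 150 – 1500 MB/s rate into an annual volume. The ~10^7 seconds of live data-taking per year is an assumption (a common rule of thumb), not a number from the slide:

```python
# Rough estimate of annual data volume from the sustained rates on the slide.
# Assumption (not from the slide): ~1e7 seconds of live data-taking per year.

LIVE_SECONDS_PER_YEAR = 1e7                 # assumed accelerator live time
RATE_LOW_MB_S, RATE_HIGH_MB_S = 150, 1500   # MB/s, from the diagram

def petabytes_per_year(rate_mb_per_s: float) -> float:
    """Convert a sustained rate in MB/s into PB per year of live running."""
    bytes_per_year = rate_mb_per_s * 1e6 * LIVE_SECONDS_PER_YEAR
    return bytes_per_year / 1e15

print(f"{petabytes_per_year(RATE_LOW_MB_S):.1f} - "
      f"{petabytes_per_year(RATE_HIGH_MB_S):.1f} PB/yr")
# ~1.5 - 15 PB/yr at Tier 0 alone, consistent with "10s of PB/yr"
# once derived data and replicas across the tiers are included.
```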

Page 10: Grid3 and Open Science Grid

Page 11: Grid3: A National Grid Infrastructure

• October 2003 – July 2005
• 32 sites, 3,500 CPUs: universities + 4 national labs
• Sites in US, Korea, Brazil, Taiwan
• Applications in HEP, LIGO, SDSS, genomics, fMRI, CS

www.ivdgl.org/grid3

Page 12: Grid3 Lessons Learned

How to operate a Grid as a facility
• Tools, services, error recovery, procedures, docs, organization
• Delegation of responsibilities (project, VO, service, site, …)
• Crucial role of the Grid Operations Center (GOC)

How to support people-people relations
• Face-face meetings, phone cons, 1-1 interactions, mail lists, etc.

How to test and validate Grid tools and applications
• Vital role of testbeds

How to scale algorithms, software, process
• Some successes, but "interesting" failure modes still occur

How to apply distributed cyberinfrastructure
• Successful production runs for several applications

Page 13: http://www.opensciencegrid.org

Page 14: Open Science Grid: July 20, 2005

[Map of OSG sites, including São Paulo, Taiwan, S. Korea.]

• Production Grid: 50+ sites, 15,000 CPUs "present" (available, but not all at one time)
• Sites in US, Korea, Brazil, Taiwan
• Integration Grid: 10 – 12 sites

Page 15: OSG Operations Snapshot

November 7: 30 days

Page 16: OSG Participating Disciplines

Computer Science: Condor, Globus, SRM, SRB; test and validate innovations: new services & technologies
Physics: LIGO, Nuclear Physics, Tevatron, LHC; global Grid: computing & data access
Astrophysics: Sloan Digital Sky Survey; CoAdd (multiply-scanned objects), spectral fitting analysis
Bioinformatics: Argonne GADU project; BLAST, BLOCKS, gene sequences, etc.
Dartmouth Psychological & Brain Sciences: functional MRI
University campus: resources, portals, apps; CCR (U Buffalo), GLOW (U Wisconsin), TACC (Texas Advanced Computing Center), MGRID (U Michigan), UFGRID (U Florida), Crimson Grid (Harvard), FermiGrid (FermiLab Grid)

Page 17: OSG Grid Partners

TeraGrid
• "DAC2005": run LHC apps on TeraGrid resources
• TG Science Portals for other applications
• Discussions on joint activities: security, accounting, operations, portals

EGEE
• Joint Operations Workshops, defining mechanisms to exchange support tickets
• Joint Security working group
• US middleware federation contributions to core middleware gLite

Worldwide LHC Computing Grid
• OSG contributes to LHC global data handling and analysis systems

Other partners
• SURA, GRASE, LONI, TACC
• Representatives of VOs provide portals and interfaces to their user groups

Page 18: Example of Partnership: WLCG and EGEE

Page 19: OSG Technical Groups & Activities

Technical Groups address and coordinate technical areas
• Propose and carry out activities related to their given areas
• Liaise & collaborate with other peer projects (U.S. & international)
• Participate in relevant standards organizations
• Chairs participate in Blueprint, Integration and Deployment activities

Activities are well-defined, scoped tasks contributing to OSG
• Each Activity has deliverables and a plan
• … is self-organized and operated
• … is overseen & sponsored by one or more Technical Groups

TGs and Activities are where the real work gets done

Page 20: OSG Technical Groups (deprecated!)

Governance: charter, organization, by-laws, agreements, formal processes
Policy: VO & site policy, authorization, priorities, privilege & access rights
Security: common security principles, security infrastructure
Monitoring and Information Services: resource monitoring, information services, auditing, troubleshooting
Storage: storage services at remote sites, interfaces, interoperability
Support Centers: infrastructure and services for user support, helpdesk, trouble tickets
Education / Outreach: training, interface with various E/O projects
Networks (new): including interfacing with various networking projects

Page 21: OSG Activities

Blueprint: defining principles and best practices for OSG
Deployment: deployment of resources & services
Provisioning: connected to deployment
Incident response: plans and procedures for responding to security incidents
Integration: testing & validating & integrating new services and technologies
Data Resource Management (DRM): deployment of specific Storage Resource Management technology
Documentation: organizing the documentation infrastructure
Accounting: accounting and auditing use of OSG resources
Interoperability: primarily interoperability between …
Operations: operating Grid-wide services

Page 22: OSG Integration Testbed: Testing & Validating Middleware

[Map of Integration Testbed sites, including Brazil, Taiwan and Korea.]

Page 23: Networks

Page 24: Evolving Science Requirements for Networks (DOE High Performance Network Workshop)

Science Area | End2End Throughput Today | End2End Throughput in 5 Years | End2End Throughput in 5-10 Years | Remarks
High Energy Physics | 0.5 Gb/s | 100 Gb/s | 1000 Gb/s | High bulk throughput
Climate (Data & Computation) | 0.5 Gb/s | 160-200 Gb/s | N x 1000 Gb/s | High bulk throughput
SNS NanoScience | Not yet started | 1 Gb/s | 1000 Gb/s + QoS for control channel | Remote control and time-critical throughput
Fusion Energy | 0.066 Gb/s (500 MB/s burst) | 0.2 Gb/s (500 MB / 20 sec burst) | N x 1000 Gb/s | Time-critical throughput
Astrophysics | 0.013 Gb/s (1 TB/week) | N*N multicast | 1000 Gb/s | Computational steering and collaborations
Genomics Data & Computation | 0.091 Gb/s (1 TB/day) | 100s of users | 1000 Gb/s + QoS for control channel | High throughput and steering

See http://www.doecollaboratory.org/meetings/hpnpw/
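The "Today" column mixes units (Gb/s alongside TB/week and TB/day); a minimal conversion sketch shows how the parenthetical volumes map onto the quoted sustained rates:

```python
# Convert a data volume per time interval into a sustained rate in Gb/s,
# to cross-check the "Today" column of the table above.

def sustained_gbps(terabytes: float, seconds: float) -> float:
    """Sustained rate in Gb/s for `terabytes` (10^12 bytes) moved in `seconds`."""
    bits = terabytes * 1e12 * 8
    return bits / seconds / 1e9

WEEK = 7 * 24 * 3600
DAY = 24 * 3600

print(f"1 TB/week ~ {sustained_gbps(1, WEEK):.3f} Gb/s")  # ~0.013 Gb/s (Astrophysics)
print(f"1 TB/day  ~ {sustained_gbps(1, DAY):.3f} Gb/s")   # ~0.093 Gb/s (Genomics, quoted as 0.091)
```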

Page 25: UltraLight: Integrating Advanced Networking in Applications

10 Gb/s+ network
• Caltech, UF, FIU, UM, MIT
• SLAC, FNAL
• Int'l partners
• Level(3), Cisco, NLR

http://www.ultralight.org

Page 26: Education, Training, Communications

Page 27: iVDGL, GriPhyN Education/Outreach

Basics
• $200K/yr
• Led by UT Brownsville
• Workshops, portals, tutorials
• Partnerships with QuarkNet, CHEPREO, LIGO E/O, …

Page 28: Grid Training Activities

June 2004: First US Grid Tutorial (South Padre Island, TX)
• 36 students, diverse origins and types

July 2005: Second Grid Tutorial (South Padre Island, TX)
• 42 students, simpler physical setup (laptops)

Reaching a wider audience
• Lectures, exercises, video, on web
• Students, postdocs, scientists
• Coordination of training activities
• "Grid Cookbook" (Trauner & Yafchak)
• More tutorials, 3 – 4/year
• CHEPREO tutorial in 2006?

Page 29: QuarkNet/GriPhyN e-Lab Project

http://quarknet.uchicago.edu/elab/cosmic/home.jsp

Page 30: CHEPREO: Center for High Energy Physics Research and Educational Outreach

Florida International University
• Physics Learning Center
• CMS Research
• iVDGL Grid Activities
• AMPATH network (S. America)

Funded September 2003
• $4M initially (3 years)
• MPS, CISE, EHR, INT

Page 31: Grids and the Digital Divide

Background
• World Summit on Information Society
• HEP Standing Committee on Inter-regional Connectivity (SCIC)

Themes
• Global collaborations, Grids and addressing the Digital Divide
• Focus on poorly connected regions
• Brazil (2004), Korea (2005)

Page 32: Science Grid Communications

Broad set of activities (Katie Yurkewicz)
• News releases, PR, etc.
• Science Grid This Week
• OSG Newsletter
• Not restricted to OSG

www.interactions.org/sgtw

Page 33: Grid Timeline

[Timeline, 2000 – 2007: GriPhyN ($12M), PPDG ($9.5M), iVDGL ($14M), UltraLight ($2M), CHEPREO ($4M), DISUN ($10M); first US-LHC Grid testbeds, VDT 1.0, Grid3 operations, OSG operations, Grid Communications, Grid Summer Schools, Digital Divide Workshops, LIGO Grid, start of LHC.]

Page 34: Future of OSG CyberInfrastructure

OSG is a unique national infrastructure for science
• Large CPU, storage and network capability crucial for science
• Supporting advanced middleware
• Long-term support of the Virtual Data Toolkit (new disciplines & international collaborations)

OSG currently supported by a "patchwork" of projects
• Collaborating projects, separately funded

Developing workplan for long-term support
• Maturing, hardening facility
• Extending facility to lower barriers to participation
• Oct. 27 presentation to DOE and NSF

Page 35: OSG Consortium Meeting: Jan 23-25

University of Florida (Gainesville)
• About 100 – 120 people expected
• Funding agency invitees

Schedule
• Monday morning: Applications plenary (rapporteurs)
• Monday afternoon: Partner Grid projects plenary
• Tuesday morning: parallel
• Tuesday afternoon: plenary
• Wednesday morning: parallel
• Wednesday afternoon: plenary
• Thursday: OSG Council meeting

Page 36: Disaster Planning / Emergency Response

Page 37: Grids and Disaster Planning / Emergency Response

Inspired by recent events
• Dec. 2004 tsunami in Indonesia
• Aug. 2005 Katrina hurricane and subsequent flooding
• (Quite different time scales!)

Connection of DP/ER to Grids
• Resources to simulate detailed physical & human consequences of disasters
• Priority pooling of resources for a societal good
• In principle, a resilient distributed resource

Ensemble approach well suited to Grid/cluster computing (see the sketch below)
• E.g., given a storm's parameters & errors, bracket likely outcomes
• Huge number of jobs required
• Embarrassingly parallel
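As a minimal illustration of that ensemble idea (all names, parameter values and the "surge_model" command here are hypothetical, not from the talk), the sketch below perturbs a storm's input parameters within their quoted errors and emits one independent job per ensemble member, the kind of embarrassingly parallel workload a Grid scheduler can fan out across sites:

```python
import random
from dataclasses import dataclass

# Hypothetical storm parameters with 1-sigma errors (illustrative values only).
@dataclass
class StormParams:
    central_pressure_hpa: float
    max_wind_m_s: float
    track_heading_deg: float

NOMINAL = StormParams(950.0, 55.0, 310.0)
SIGMA   = StormParams(8.0, 5.0, 12.0)

def perturbed(nominal: StormParams, sigma: StormParams) -> StormParams:
    """Draw one ensemble member by sampling each parameter around its nominal value."""
    return StormParams(
        random.gauss(nominal.central_pressure_hpa, sigma.central_pressure_hpa),
        random.gauss(nominal.max_wind_m_s, sigma.max_wind_m_s),
        random.gauss(nominal.track_heading_deg, sigma.track_heading_deg),
    )

# Each member becomes an independent simulation job: no communication between
# jobs, so the ensemble is embarrassingly parallel and maps naturally onto
# opportunistic Grid resources.
jobs = [perturbed(NOMINAL, SIGMA) for _ in range(1000)]
for i, p in enumerate(jobs[:3]):
    print(f"job {i:04d}: surge_model --pressure {p.central_pressure_hpa:.1f} "
          f"--wind {p.max_wind_m_s:.1f} --heading {p.track_heading_deg:.1f}")
```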

Page 38: DP/ER Scenarios

Simulating physical scenarios
• Hurricanes, storm surges, floods, forest fires
• Pollutant dispersal: chemical, oil, biological and nuclear spills
• Disease epidemics
• Earthquakes, tsunamis
• Nuclear attacks
• Loss of network nexus points (deliberate or side effect)
• Astronomical impacts

Simulating human responses to these situations
• Roadways, evacuations, availability of resources
• Detailed models (geography, transportation, cities, institutions)
• Coupling human response models to specific physical scenarios

Other possibilities
• "Evacuation" of important data to safe storage

Page 39: DP/ER and Grids: Some Implications

DP/ER scenarios are not equally amenable to a Grid approach
• E.g., tsunami vs. hurricane-induced flooding
• Specialized Grids can be envisioned for very short response times
• But all can be simulated "offline" by researchers
• Other "longer term" scenarios

ER is an extreme example of priority computing
• Priority use of IT resources is common (conferences, etc.)
• Is ER priority computing different in principle?

Other implications
• Requires long-term engagement with DP/ER research communities (atmospheric, ocean, coastal ocean, social/behavioral, economic)
• Specific communities with specific applications to execute
• Digital Divide: resources to solve problems of interest to the 3rd World
• Forcing function for Grid standards?
• Legal liabilities?

Page 40: Grid Project References

Open Science Grid: www.opensciencegrid.org
Grid3: www.ivdgl.org/grid3
Virtual Data Toolkit: www.griphyn.org/vdt
GriPhyN: www.griphyn.org
iVDGL: www.ivdgl.org
PPDG: www.ppdg.net
CHEPREO: www.chepreo.org
UltraLight: www.ultralight.org
Globus: www.globus.org
Condor: www.cs.wisc.edu/condor
WLCG: www.cern.ch/lcg
EGEE: www.eu-egee.org

Page 41: Extra Slides

Page 42: Grid3 Use by VOs Over 13 Months

Page 43: CMS: "Compact" Muon Solenoid

Inconsequential humans

Page 44: LHC: Beyond Moore's Law

Intel CPU (2 GHz) = 0.1 K SI95

[Chart: estimated CPU capacity at CERN vs. year (1998 – 2010), in K SI95; projected LHC CPU requirements outpace the Moore's Law (2000) extrapolation.]
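For context on the Moore's Law curve in that chart, a minimal sketch (assuming the conventional doubling of capacity every ~18 months, normalized to an arbitrary baseline of 1 in 2000; the chart's actual K SI95 values are not reproduced here) shows how quickly a fixed-budget capacity would be expected to grow:

```python
# Moore's-law style growth: capacity doubling every ~18 months,
# normalized to 1.0 in the year 2000 (illustrative, not the chart's data).

DOUBLING_TIME_YEARS = 1.5

def moores_law_factor(year: int, base_year: int = 2000) -> float:
    """Capacity multiple relative to base_year under an 18-month doubling time."""
    return 2 ** ((year - base_year) / DOUBLING_TIME_YEARS)

for year in range(2000, 2011, 2):
    print(f"{year}: x{moores_law_factor(year):6.1f}")
# By 2007 this gives roughly a 25x increase over 2000; the slide's point is
# that projected LHC CPU requirements rise even faster than this curve.
```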

Page 45: Grids and Globally Distributed Teams

• Non-hierarchical: chaotic analyses + productions
• Superimpose significant random data flows

Page 46: Sloan Digital Sky Survey (SDSS): Using Virtual Data in GriPhyN

[Plot: galaxy cluster size distribution from Sloan data; number of clusters (1 – 100,000, log scale) vs. number of galaxies per cluster (1 – 100, log scale).]
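The kind of derived data product behind that plot can be sketched in a few lines: given a catalog that assigns each galaxy to a cluster (the input here is toy data, purely for illustration), count cluster sizes and histogram them on logarithmic bins to match the plot's log-log axes:

```python
from collections import Counter
import math

# Hypothetical input: one cluster ID per galaxy, e.g. parsed from a catalog file.
galaxy_cluster_ids = [0, 0, 1, 2, 2, 2, 3, 3, 0, 4, 4, 4, 4]   # toy data

# Cluster size = number of member galaxies.
cluster_sizes = Counter(galaxy_cluster_ids).values()

# Histogram the sizes on logarithmic bins (one bin per decade of cluster size).
size_histogram = Counter(int(math.floor(math.log10(s))) for s in cluster_sizes)
for decade in sorted(size_histogram):
    low, high = 10 ** decade, 10 ** (decade + 1)
    print(f"{low:>4d} - {high:<5d} galaxies: {size_histogram[decade]} clusters")
```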

Page 47: The LIGO Scientific Collaboration (LSC) and the LIGO Grid

LIGO Grid: 6 US sites + 3 EU sites (Cardiff/UK, AEI/Germany)
* LHO, LLO: LIGO observatory sites
* LSC: LIGO Scientific Collaboration

[Map of LIGO Grid sites, including Cardiff, AEI/Golm and Birmingham.]
