Page 1: physics on grids in the U.S.

Ruth Pordes, Fermilab

physics on grids in the U.S.

Page 2: physics on grids in the U.S.

Grids in the U.S. span projects, domains, countries, disciplines, oceans: they are a cultural phenomenon.

Mainly funded by the US Department of Energy and the National Science Foundation, as well as NASA and NIH.

Teams work together across a set of funded projects, research collaborations, long-term funded institutions, and ad-hoc efforts.

Particle Physics Data Grid.

Page 3: physics on grids in the U.S.

Physicists are using these grids for data distribution and processing.

V. White, CHEP 2000

DZero reprocessing, summer 2005: single-site submission, reprocessing on 9 sites in 7 countries; initial sharing of resources with LCG, OSG.

D0 distributed computing (Grid) was proposed from the get-go: automated data requesting, multi-hop transport, and caching.

US ATLAS DC2 on Grid3: raw number of jobs across ~20 sites; 30% non-ATLAS resources.

CMS simulations taking cycles “now and then”.

Efficiency increased from 60% to 75% over the year through instrumentation and analysis.

Page 4: physics on grids in the U.S.

Science results are emerging that rely on using grids:

TeraGrid Simulation of 14 Billion Years of Universal Evolution

http://www.interactions.org/sgtw/

Using the optimal (super)computer for each stage of the jobs.

…Simulated how a cube of the universe 250 million light years on each side developed over 14 billion years, taking a snapshot of the universe every seven million years…

Page 5: physics on grids in the U.S.

Individual researchers do take advantage of spare compute cycles on the grid.

The SDSS Southern Coadd project combined images from a ~300 square degree region along the southern celestial equator, imaged an average of about ten times, to allow scientists to detect very faint and distant astronomical objects.

~15 sites were used in an “opportunistic” mode to coadd all available data.

More than 44,000 computing jobs were processed, requiring more than two terabytes of input data to be transferred to remote grid nodes.

The elongated blue object shows the strong-lensing arc system discovered: a background galaxy whose image has been distorted by strong gravitational lensing due to the foreground cluster of galaxies (orange-yellow objects).

Page 6: physics on grids in the U.S.

Cyberinfrastructure and (LHC) Physics on Grids

• Physics distributed computing, in particular LHC data analysis, needs more capability, scaling and performance than grids in the US offer today: x2-3 in end-to-end data transfer; ~x10 in job complexity x throughput; improvement from 65% to 99% efficiency; added data, job, and error management services.

• NSF Cyberinfrastructure (S. Kim, Oct 2004): science frontiers as the drivers; balance capability and capacity; massive data generated at the periphery.

• E-Science (D. Atkins, Oct 2004): accelerate cycles of discovery; reduce constraints of time and space; enhance sharing, re-use, and multi-use of resources; enable rapid response to the unexpected; capture of process, not just end results; new “and-and” organizational forms; data curation and administration.

Timeline (S. Kim, NSF, 2004): PACI, Terascale, NMI, ITR and Cyberinfrastructure programs, 2000-2010.

Page 7: physics on grids in the U.S.


E-labs: experiential, continuous, and open forums for scholarly collaboration.

Page 8: physics on grids in the U.S.


All sciences and middleware groups are focused on these issues.

Data Movement and Administration

• Network provisioning. Management of data transport for performance and efficiency.

• Data administration, history, location transparency, replication: (non-conflicting) namespaces and identifiers; directory and registry services, both enterprise and inter-realm.

• Curation and access: multi-petabyte storage needs are becoming “commonplace”; data lifecycles require technology transitions; intelligent common replication and placement services.
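To make location transparency and replication concrete, here is a minimal sketch of a replica catalog in Python: one logical file name maps to several physical copies, and a trivial placement rule prefers a copy at the requesting site. The logical names, site names, and URLs are illustrative placeholders, not real PPDG or OSG services.

    # Minimal sketch of a replica catalog: one logical file name, several
    # physical copies, and a trivial placement rule (prefer a local replica).
    # All names and URLs below are illustrative placeholders.
    REPLICAS = {
        "lfn:/dzero/raw/run12345.dat": [
            ("FNAL",  "srm://fnal.example.gov/dzero/raw/run12345.dat"),
            ("IN2P3", "srm://in2p3.example.fr/dzero/raw/run12345.dat"),
        ],
    }

    def locate(lfn, local_site):
        """Return a physical URL for lfn, preferring a replica at local_site."""
        replicas = REPLICAS.get(lfn, [])
        for site, url in replicas:
            if site == local_site:
                return url                            # local copy: no wide-area transfer
        return replicas[0][1] if replicas else None   # otherwise fall back to any remote copy

    # Example: locate("lfn:/dzero/raw/run12345.dat", "IN2P3")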

F. Berman, Internet2, 2004

Page 9: physics on grids in the U.S.


Security, Policy and Trust

Beyond Authentication: accurately revoke, control, audit.

Inter-domain negotiation: agreements and acceptance of “identities”, authorizations, policies. Roles, authorities, and responsibilities of Virtual Organizations.

Permissions and access controls: role definition, rights and delegations, privacy management, etc. Technology: extended X.509 certificate attributes; SAML policy descriptions; callouts from services to map attributes to rights. Business model in support of roles, e.g. CMSproductionsubmitter, SDSSserendipityworkinggroup, etc.

Research groups administer and prioritize access for their organizations.
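As an illustration of the callout from attributes to rights described above, the following is a minimal sketch in Python; the role strings and rights are placeholders modeled loosely on the role names mentioned on this slide, not an actual VOMS or OSG policy interface.

    # Minimal sketch: map VO role attributes (as might be carried in extended
    # X.509 proxy certificates) to locally granted rights. The roles and the
    # rights are placeholders, not real experiment policy.
    ROLE_RIGHTS = {
        "/cms/Role=productionsubmitter":      {"submit", "write-storage", "production-queue"},
        "/sdss/Role=serendipityworkinggroup": {"submit", "read-coadd-data"},
    }

    def rights_for(attributes):
        """Union of the rights granted by each recognized role attribute."""
        granted = set()
        for attr in attributes:
            granted |= ROLE_RIGHTS.get(attr, set())
        return granted

    # Example: rights_for(["/cms/Role=productionsubmitter"])
    # -> {"submit", "write-storage", "production-queue"}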

Technologies from many CS & physics groups:

Globus

Virginia Tech

DESY/FNAL

UCSD/FNAL

INFN/EGEE

BNL

Page 10: physics on grids in the U.S.


Errors, and Knowing what’s going on

Reporting errors such that they are intelligible and can be intelligently and incrementally addressed.

End-to-end auditing to provide parallel, global diagnosis of what went wrong in the complex sequence of events across many sites.

e.g. the Fault Tolerant Shell, a small language that makes failures a first-class concept. It aims to combine the ease of scripting with very precise error semantics, providing for timeouts, retry, and alternation; you might think of this as exception handling for scripts. (D. Thain, University of Notre Dame.)

    try for 30 minutes
        cd /tmp
        rm -f data
        forany host in xxx yyy zzz
            wget http://${host}/fresh.data data
        end
    end
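The same pattern (an overall time limit, retries, and alternation over equivalent sources) can be sketched in ordinary Python; the host names, file name, and back-off interval below are placeholders, not part of the ftsh example above.

    # Sketch of the ftsh pattern in plain Python: keep trying a set of
    # equivalent mirrors until one succeeds or an overall deadline passes.
    # Host names, paths, and intervals are illustrative only.
    import time
    import urllib.request

    def fetch_from_any(hosts, path, dest, timeout_minutes=30):
        deadline = time.time() + timeout_minutes * 60
        while time.time() < deadline:
            for host in hosts:                       # "forany host in ..."
                try:
                    urllib.request.urlretrieve(f"http://{host}/{path}", dest)
                    return host                      # first success wins
                except OSError:
                    continue                         # failure: try the next mirror
            time.sleep(10)                           # back off, then retry them all
        raise TimeoutError("no mirror delivered the file before the deadline")

    # Example: fetch_from_any(["xxx", "yyy", "zzz"], "fresh.data", "/tmp/data")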

EGEE Operations Workshop, S. Belforte: “Experiments need to control the progress of their application to take proper actions … helping the Grid to work by having it expose much of its status to the users.”

Page 11: physics on grids in the U.S.


Capabilities and User Environment

Users and organizations need:
- easy, complete interfaces to the grid resources;
- access to experiment databases, software releases, dataset merging and management, job splitting, etc. at the grid sites.
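As one small example of the job-splitting capability listed above, here is a minimal sketch in Python that partitions a dataset's file list into roughly equal per-job inputs; the file names and the job count are illustrative, not an experiment's actual tooling.

    # Minimal sketch: split a dataset's file list into n_jobs roughly equal
    # groups, one group per grid job. File names are placeholders.
    def split_dataset(files, n_jobs):
        """Return n_jobs lists of files whose sizes differ by at most one."""
        jobs = [[] for _ in range(n_jobs)]
        for i, name in enumerate(files):
            jobs[i % n_jobs].append(name)
        return jobs

    # Example: split_dataset([f"events_{i}.root" for i in range(10)], 3)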

Virtual Machine technologies are an active area of research as wrappers to dynamically glide in and isolate extra-site resident services.

Page 12: physics on grids in the U.S.


Grids in the U.S.

Laboratories, universities and experiments allow their farms and storage systems to be accessed across local and wide area networks and shared through the use of grid middleware (as well as supporting direct use locally). Many times they make these resources accessible to multiple grid domains.

Users and Organizations expect transparent execution and storage environments that span across any boundaries between the resources they are using.

There are thus many “Grids” that intersect and overlap. Each infrastructure is defined by the interfaces and administrative services that it provides.

TeraGrid and Open Science Grid are two national grid infrastructures in the U.S. that are providing and augmenting their capabilities in support of physics.

Page 13: physics on grids in the U.S.


TeraGrid

NSF-funded large facilities at 10 sites, where the core set are the US national supercomputers, with the TeraGrid wide area network for high-speed data serving between them.

Diverse hardware and operating systems result in most applications running at only 1 or 2 sites. Deep software stack (CTSS) includes compilers and libraries.

Allocations for use through agreements: “exchangeable computing units” which can be spent at any site.

Science Gateways: portals between a community and TeraGrid. The definition is pretty broad: whatever each research group needs.

US LHC simulations run on TeraGrid - issues of data handling and validation remain.

Page 14: physics on grids in the U.S.


Open Science Grid

Community: built by application, middleware, site, research, university and laboratory contributors; funded by DOE, NSF, and (NIH) funded projects and institutions. ~50 sites from Grid3, OSG, and the OSG Integration Grid.

Grid of Grids made coherent by a set of Interfaces, core Services and Policies.

Autonomous sites: minimal impact on facilities' existing resources and procedures; heterogeneous model, but in practice all sites today are Linux.

Integration Grid: an important structure for testing new services, new types of application, and integration.

Extensible middleware: core components packaged and integrated by the Virtual Data Toolkit (Condor, Globus, VOMS, etc.) with a flexible process for additions and support.

Interoperating: Core services interoperate with LCG-2, TeraGrid etc.

Plus a site in Brazil.

Page 15: physics on grids in the U.S.


The working model – governance – of OSG is somewhere between the open source model for software and that of a particle physics experiment collaboration.

Open Science Grid Consortium

“Open Science” is like the “Open Source” in “Open Source” software: open to the sciences to contribute.

Provides a common infrastructure for many different sciences.

US LHC S&C, lab facilities, and Trillium are lead contributors in an experiment-style collaboration: US LHC is building the systems for LHC data analysis as part of the Open Science Grid. The DOE labs are making their resources available.

Middleware teams are lead contributors: the Condor and Globus teams are energetically engaged.

Marketplace resource allocation allows for opportunistic use of resources, and internal management across sub-groups.

Page 16: physics on grids in the U.S.


physics & grids - a two-way street

Physicist - computer scientist teams optimally result in seamless merging of results into the mainstream of grid middleware to benefit others, and adiabatic integration into experiment infrastructures on their required schedules.

You can get bogged down without the right tools

It’s the overall effectiveness that counts, not the secs to go from 0 to 100

Manage the Resources.

On an open road seeing a rainbow does not necessarily mean there is a pot of gold at the end.

