GridPP Collaboration meeting - R.P.Middleton (RAL/PPD), 23-25th May 2001

UK Particle Physics

Grid Monitoring Services

Robin Middleton

RAL/PPD

24-May-01

Overview

• What is Monitoring ?
• GGF Perf-WG
• DataGrid WP3
• Example : Netlogger
• Summary

Introduction

• Information Services part dealt with separately today
• DataGrid WorkPackage 3 (WP3)
    • UK leadership / responsibility
    • WP3 = Grid Monitoring AND Information Services
• Global Grid Forum - Perf Mon Workgroup
    • http://www-didc.lbl.gov/GridPerf/

What is Monitoring ?

• Application performance
• Fabric availability
• Network availability / performance
• Event / Alert
• Archives
• Forecasting (e.g. NWS)
• Issues
    • update/read frequency
    • information streaming
    • hierarchical vs. relational
    • relaxed coherence; timestamps
    • scalable; non-invasive
    • non-repeatable
• Monitoring vs. Monitoring & Information ?

Boundaries

[Diagram: Monitoring and its boundaries with Mass Storage, Computing Fabric, Network, Application, Workload Mgt, Data Man, End-Users and Sys/Grid-Admin]

GGF : Perf-WG

“The Grid Performance working group is focused on defining standards and best practices for the gathering, representation, storage, distribution, and query of performance information about Grid resources and applications.”

Four Projects (!)

1. Define a schema for data formats for performance monitoring. This would be a common interchange format that tools could use to interoperate.
2. Taxonomy / classification of performance monitoring and analysis tools.
3. Survey of existing tools classified by the above taxonomy.
4. Recommendations on the aspects of grid applications, services and resources that should be monitored.
5. The development of performance monitoring tools based upon the survey of tools.
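
As an illustration of project 1, a common interchange format could be as simple as a timestamped record with a few mandatory fields. The sketch below is hypothetical: the field names (`timestamp`, `host`, `program`, `event`), the `KEY=value` line encoding and the example host name are assumptions made for illustration, not the schema defined by the working group.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class PerfEvent:
    """Hypothetical performance event; field names are illustrative only."""
    timestamp: datetime
    host: str
    program: str
    event: str
    values: dict = field(default_factory=dict)

    def to_line(self) -> str:
        """Encode the event as one line of space-separated KEY=value pairs.

        Values must not contain spaces in this simplified encoding.
        """
        parts = [
            f"DATE={self.timestamp.isoformat()}",
            f"HOST={self.host}",
            f"PROG={self.program}",
            f"EVNT={self.event}",
        ]
        parts += [f"{key.upper()}={value}" for key, value in sorted(self.values.items())]
        return " ".join(parts)


def parse_line(line: str) -> dict:
    """Decode a KEY=value line back into a plain dictionary of strings."""
    return dict(item.split("=", 1) for item in line.split())


example = PerfEvent(
    timestamp=datetime(2001, 5, 24, 12, 0, 0, tzinfo=timezone.utc),
    host="host-a.example.org",
    program="reader",
    event="READ_START",
    values={"bytes": 4096},
)
encoded = example.to_line()
decoded = parse_line(encoded)
```

Any tool that can read and write such lines could exchange events with any other, which is the interoperability goal the project describes.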

GGF Perf-WG : Use Cases

1: Instrumented library for performance measurement (e.g. I/O system)
2: Netlogger/DPSS monitoring streams to log file
3: JAMM (Java) sensors stream data to a GUI
4: JAMM/Port Monitor
5: Fault detection & analysis
6: Job progress monitoring
7: Distributed system performance analysis
8: Network-aware, self-tuning applications
9: Data replication (choice of “best” location)
10: Scheduling & prediction services
11: Auditing systems
12: Configuration monitoring
13: User application monitoring
14: Application self-tuning
15: Real-time adaptive simulation & presentation

DataGrid : WorkPackage 3

The aim of this workpackage is to specify, develop, integrate and test tools and infrastructure to enable end-user and administrator access to status and error information in a Grid environment and to provide an environment in which application monitoring can be carried out. This will permit both job performance optimisation as well as allowing for problem tracing and is crucial to facilitating high performance Grid computing.

Architecture (GGF : Perf-WG)

[Diagram: Sensors on Host - A and Host - B feed Producers; Producers, a Consumer and a Directory Service are linked by Publish, Subscribe and Discovery interactions]
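
The roles named in this diagram can be sketched in a few lines of code. This is a minimal illustration, not the GGF specification: the class names, method names (`publish`, `discover`, `subscribe`, `poll`) and sensor values are all hypothetical, and a real system would place these components on different hosts and talk over the network.

```python
from typing import Callable, Dict, List

Event = Dict[str, object]


class Producer:
    """Collects readings from local sensors and pushes them to subscribers."""

    def __init__(self, name: str, sensors: Dict[str, Callable[[], float]]):
        self.name = name
        self.sensors = sensors
        self.subscribers: List[Callable[[Event], None]] = []

    def subscribe(self, callback: Callable[[Event], None]) -> None:
        """Register a consumer callback; data then flows producer -> consumer."""
        self.subscribers.append(callback)

    def poll(self) -> None:
        """Read every sensor once and deliver one event per reading."""
        for sensor_name, read in self.sensors.items():
            event = {"producer": self.name, "sensor": sensor_name, "value": read()}
            for callback in self.subscribers:
                callback(event)


class DirectoryService:
    """Holds only registrations; it never carries the monitoring data itself."""

    def __init__(self) -> None:
        self._producers: Dict[str, Producer] = {}

    def publish(self, producer: Producer) -> None:
        self._producers[producer.name] = producer

    def discover(self, sensor_name: str) -> List[Producer]:
        return [p for p in self._producers.values() if sensor_name in p.sensors]


# Two hosts publish their producers (sensor values are made up).
directory = DirectoryService()
directory.publish(Producer("host-a", {"cpu_load": lambda: 0.25}))
directory.publish(Producer("host-b", {"cpu_load": lambda: 0.75, "disk_free": lambda: 12.0}))

# A consumer discovers producers offering "cpu_load" and subscribes to them.
received: List[Event] = []
for producer in directory.discover("cpu_load"):
    producer.subscribe(received.append)
    producer.poll()
```

In this sketch the directory is consulted only for discovery, so its load does not grow with the volume of monitoring data.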

WP3 : Tasks

Umbrellas

Task 3.1: Requirements & Design (month 1-12)

Task 3.2: Current Technology (month 1-12)

Task 3.3: Infrastructure (month 7-24)

Task 3.4: Analysis & Presentation (month 7-24)

Task 3.5: Test & Refinement (month 19-36)

WP3 : Deliverables (as in the TA)

D3.1 (Report) Month 12: Evaluation Report of current technology

D3.2 (Report) Month 9: Detailed architectural design report and evaluation criteria (also input to WP12 architecture deliverable)

D3.3 (Prototype) Month 9: Components and documentation for the First Project Release (see WP 6)

D3.4 (Prototype) Month 21: Components and documentation for the Second Project Release (see WP 6)

D3.5 (Prototype) Month 33: Components and documentation for the Final Project Release (see WP 6)

D3.6 (Report) Month 36: Final evaluation report

WP3 : Milestones (as in the TA)

M3.1 Month 6: Decide baseline architecture & technologies.

M3.2 Month 9: Provide requirements for collation by Project Architect

M3.3 Month 9: Prototype components integrated into First Project Release (see WP 6)

M3.4 Month 21: Interim components integrated into Second Project Release (see WP 6)

M3.5 Month 33: Final components integrated into Final Project Release (see WP 6)

WP3 : First Release (PM9)

• Information services based on a new version of the Globus MDS (soon to be in alpha release).

• Rudimentary implementation of a relational approach to information services.

• A set of APIs in support of both MDS and GMA approaches.

• Basic presentation of performance monitoring data based around Netlogger

WP3 : Effort

              Funded   Unfunded   Total
PPARC          3.0      1.83       4.83
SZTAKI (HU)    2.08     0.92       3.0
INFN (IT)      0.0      1.16       1.16
IBM-UK         1.0      0.0        1.0
Total          6.08     3.91      10.0

+ Trinity College Dublin

(NB : for both Monitoring and Information Services)
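
The column totals in the effort table can be cross-checked with a few lines of arithmetic. The sketch below only re-adds the figures quoted on the slide; the variable names are illustrative. The funded and unfunded columns add up exactly, while the grand total comes to 9.99, so the quoted 10.0 is presumably rounded.

```python
from decimal import Decimal

# (funded, unfunded) effort per partner, copied from the slide.
effort = {
    "PPARC": (Decimal("3.0"), Decimal("1.83")),
    "SZTAKI (HU)": (Decimal("2.08"), Decimal("0.92")),
    "INFN (IT)": (Decimal("0.0"), Decimal("1.16")),
    "IBM-UK": (Decimal("1.0"), Decimal("0.0")),
}

# Decimal avoids binary floating-point noise in the sums.
funded_total = sum(funded for funded, _ in effort.values())
unfunded_total = sum(unfunded for _, unfunded in effort.values())
grand_total = funded_total + unfunded_total
row_totals = {name: funded + unfunded for name, (funded, unfunded) in effort.items()}
```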

WP3 : Use Cases

• Fault Detection & Analysis, Heartbeats [5]
• Job Status & Progress Monitoring [6]
• Application Performance Monitoring [1,13]
• Performance Analysis of Distributed Systems [7]
• Scheduling Services and Self Tuning Applications [8,10,14,(15)]
• Data Replication Services [9]
• Accounting & Auditing [11]
• Configuration monitoring [12]

WP3 : Decisions (end 2000)

• Try to track standards & best practice from Global Grid Forum
    • evaluate, steer, adopt, …
• Other WPs should provide the majority of sensors
    • network, fabric, mass-storage
• WP3 will provide the instrumentation API
• Key deliverables will be
    • Performance Services
    • Error / Alert Services
    • Status / Parameter Services
    • Logging / Archival Services
    • (forecasting) - information to enable other WPs to do this
• WP3 subcontracts archival services (in terms of the data management aspects) ?

Netlogger

[Diagram: Netlogger example with Supervisor, Processing Node and Readout Buffer components]

Acknowledgement : Weidong Li

Sequence Diagram

[Sequence diagram: Supervisor, Readout Buffer and Processing Node lifelines against TIME; messages Request, Fetch Data, Return data and Result; event points numbered 1 to 7]
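
The seven numbered points in the sequence diagram suggest the kind of analysis that timestamped logging allows: join the events belonging to one request and measure the time spent between consecutive points. The sketch below assumes a hypothetical record layout of (request id, event point, timestamp in microseconds) and uses made-up timestamps; it does not use the Netlogger API or log format.

```python
from collections import defaultdict

# Made-up log records, deliberately out of order as they might arrive
# from different components: (request id, event point, timestamp in us).
records = [
    (1, 3, 3000), (1, 1, 0), (1, 2, 2000), (1, 4, 10000),
    (1, 7, 21000), (1, 5, 11000), (1, 6, 20000),
]


def lifelines(log):
    """Group records by request id and order each group by event point."""
    grouped = defaultdict(list)
    for request_id, point, timestamp in log:
        grouped[request_id].append((point, timestamp))
    return {request_id: sorted(points) for request_id, points in grouped.items()}


def stage_durations(lifeline):
    """Return (from point, to point, elapsed us) for consecutive event points."""
    return [
        (start_point, end_point, end_time - start_time)
        for (start_point, start_time), (end_point, end_time)
        in zip(lifeline, lifeline[1:])
    ]


durations = stage_durations(lifelines(records)[1])
slowest = max(durations, key=lambda stage: stage[2])
```

Comparing timestamps taken on different hosts is only meaningful if their clocks are synchronised to better than the intervals being measured.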

Results

[Plot: event points 1 to 7; X axis : secs; Y axis : “count”]

Acknowledgment : Weidong Li

Netlogger Summary

• Example deployment
• Time resolution
    • NTP (~5ms)
    • Custom h/w (~50s)
• Thread safety ?
• Variety of visualisation methods
• “non-invasive” ?
• Moving towards the GMA
    • e.g. integration of directory service

Summary

• Information Service is KEY to Monitoring
    • …and nature of service to be determined !
• Unified Information Architecture is important
    • …otherwise duplication and inconsistencies
• Align with Global Grid Forum for “standards”, etc.
• Starting point is Netlogger
• DataGrid deliverable details are testbed “driven”
• Cross-DataGrid WP - service to many areas

