
Network activity in EGEE-III SA2 - TERENA · SA2 Global view. SA2: Network activity in EGEE-III. 7....

Date post: 25-May-2020
EGEE-III INFSO-RI-222667 Enabling Grids for E-sciencE www.eu-egee.org EGEE and gLite are registered trademarks Network activity in EGEE-III SA2 Xavier Jeannin (CNRS/UREC) SA2 Activity Manager 7th NRENs and Grids Workshop (Dublin) 1/2 September 2008
Transcript
Page 1: Network activity in EGEE-III SA2 - TERENA · SA2 Global view. SA2: Network activity in EGEE-III. 7. Support for the ENOC. IPv6 (GARR, CNRS) Operational procedures (CNRS) LCG Support

EGEE-III INFSO-RI-222667

Enabling Grids for E-sciencE

www.eu-egee.org

EGEE and gLite are registered trademarks

Network activity in EGEE-III SA2

Xavier Jeannin (CNRS/UREC), SA2 Activity Manager

7th NRENs and Grids Workshop (Dublin) 1/2 September 2008

Page 2

Agenda

• EGEE size and statistics
• SA2 Network activity
  – Technical Network Liaison Committee (TNLC)
  – EGEE Network Operations Center (ENOC)
  – EGEE-III Projects
    LHCOPN support / operational model
    Trouble matching and correlation
    Tools for troubleshooting
    Grid site networking needs
    Advanced network services
    IPv6
    Trouble ticket standardization
• European Grid Initiative, National Grid Initiative
  – Lessons learnt from EGEE
  – Network activity in EGI/NGI
• Conclusion

Page 3

EGEE: the largest multi-disciplinary research Grid infrastructure in the world

[Figure: two growth charts with quarterly ticks from April 2004 to April 2008. First chart: No. Sites (axis 0 to 300). Second chart: No. Cores (axis 0 to 80,000).]

Page 4

Users and resources distribution

Feb’08

Courtesy of Bob Jones

Page 5

Highlights of EGEE-II - Applications

• >270 VOs from several scientific domains
  – Astronomy & Astrophysics
  – Civil Protection
  – Computational Chemistry
  – Comp. Fluid Dynamics
  – Computer Science/Tools
  – Condensed Matter Physics
  – Earth Sciences
  – Fusion
  – High Energy Physics
  – Life Sciences
• Further applications under evaluation

Applications are moving from testing to routine and daily usage

Courtesy of Erwin Laure

Presenter notes:
At Feb review: 100 sites, 10K CPUs; 1st gLite release foreseen for March’05; 6 domains and
Page 6

SA2 in EGEE-III

• Total of 375 FTEs in EGEE-III
  – 9010 person months (vs. 11165 PMs in EGEE-II; ~20% less)
  – Grand total combining funded and unfunded contributions
    No difference for execution of program of work!
• Network activity SA2 = 14 persons + TNLC, 159 PMs

[Pie chart, share per activity: NA1 2%, NA2 5%, NA3 8%, NA4 19%, NA5 1%, SA1 49%, SA2 2%, SA3 9%, JRA1 5%]
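
The figures quoted on this slide can be cross-checked with a few lines of arithmetic. The sketch below only reuses the numbers from the slide; the helper name `relative_change` is illustrative and not part of the original material.

```python
# Figures quoted on the slide.
EGEE3_PM = 9010    # person months in EGEE-III
EGEE2_PM = 11165   # person months in EGEE-II
SA2_PM = 159       # person months for the SA2 network activity
EGEE3_FTE = 375    # full-time equivalents in EGEE-III


def relative_change(new, old):
    """Relative change from old to new (negative means a decrease)."""
    return (new - old) / old


reduction = -relative_change(EGEE3_PM, EGEE2_PM)
sa2_share = SA2_PM / EGEE3_PM
months_per_fte = EGEE3_PM / EGEE3_FTE

print(f"Effort reduction vs. EGEE-II: {reduction:.1%}")      # 19.3%
print(f"SA2 share of EGEE-III effort: {sa2_share:.1%}")      # 1.8%
print(f"Person months per FTE:        {months_per_fte:.1f}")  # 24.0
```

The results agree with the "~20% less" statement and with the 2% SA2 share shown in the chart. About 24 person months per FTE would be consistent with a project lasting roughly two years.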

Page 7

SA2 – EGEE-III: SA2 Global view

[Diagram labels; partners in parentheses]
• Support for the ENOC
• IPv6 (GARR, CNRS)
• Operational procedures (CNRS)
• LCG Support (CNRS)
• Operational tools and maintenance (RRC-KI, CNRS)
• Overall Networking coordination
• ENOC running
• TT exchange standard (GRNET)
• Advanced network services (GRNET)
• TNLC
• Monitoring (DFN)
• Site networking needs (RedIRIS)
• Troubleshooting (DFN)

Page 8

Technical Network Liaison Committee

• Technical Network Liaison Committee – TNLC
  – Facilitate cooperation between EGEE on the one hand and GÉANT2 and the NRENs on the other hand
  – CERN; CNRS, France; DANTE, UK - the GÉANT2 operator; RRC KI, Russia; DFN-Verein, Germany; GARR, Italy; GRNET, Greece; RedIRIS, Spain...
• Main themes
  – Monitoring (E2ECU, monitoring LHCOPN/EGI)
  – Standardization of network trouble tickets (assessment of the impact on the grid of a trouble ticket)
  – Advanced network services (AMPS/SLA, new advanced network services)

Page 9

June 2008

EGEE’08 conference

• NRENs are invited to take part in the TNLC

Page 10

Role of the ENOC

• ENOC ensuring E2E connectivity for Grid sites
• Assess the impact on the Grid of network trouble
• Troubleshoot problems
  – Provide support to users
  – Identify the faulty domain
• Assess the network connectivity of the Grid sites

[Diagram: Grid site 1 and Grid site 2 connected through RC 1, NREN A, GÉANT2, NREN B and RC 2. GÉANT2 is operated by DANTE, NREN A by the NOC of NREN A, NREN B by the NOC of NREN B, RC1 by the NOC of RC1 and RC2 by the NOC of RC2. Caption: ENOC ensuring E2E connectivity for Grid sites on the whole path]

Page 11

The ENOC

– A single point of contact between EGEE and the NRENs where EGEE and the network can exchange operational information
– A Network support unit in GGUS (trouble ticket system of EGEE)

• Interface with network providers:
  – Collect tickets from NRENs
  – Assess impact on the grid infrastructure
  – Forward to GGUS tickets that seem relevant
• Interface with the EGEE user support:
  – Receive tickets assigned to ENOC by the GGUS 1st level support
  – Troubleshoot them provided that the ENOC has access to suitable monitoring tools
  – Contact identified faulty domains or reassign ticket to the associated site if this is a local network issue

[Diagram labels: Sites, GGUS, Users, Support Units, NRENs, GÉANT2, EGEE Network, ENOC]

Presenter notes:
GGUS = global grid user support
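
The "collect, assess impact, forward" loop on the network-provider side could be sketched as follows. This is only an illustration under stated assumptions: the `NrenTicket` structure, its field names and the site names are hypothetical, and the real ENOC and GGUS interfaces are not described in the slides.

```python
from dataclasses import dataclass, field


@dataclass
class NrenTicket:
    """Hypothetical, simplified view of a ticket received from an NREN."""
    ticket_id: str
    nren: str
    affected_sites: list = field(default_factory=list)


def assess_impact(ticket, grid_sites):
    """Return the grid sites mentioned by the ticket (empty list if none)."""
    return sorted(set(ticket.affected_sites) & set(grid_sites))


def triage(tickets, grid_sites):
    """Split tickets into those worth forwarding to GGUS and the rest."""
    forward, ignore = [], []
    for ticket in tickets:
        impacted = assess_impact(ticket, grid_sites)
        if impacted:
            forward.append((ticket.ticket_id, impacted))
        else:
            ignore.append(ticket.ticket_id)
    return forward, ignore


# Hypothetical data for illustration only.
GRID_SITES = {"site-a", "site-b"}
tickets = [
    NrenTicket("T1", "nren-x", ["site-a", "campus-z"]),
    NrenTicket("T2", "nren-y", ["campus-q"]),
]
forward, ignore = triage(tickets, GRID_SITES)
print(forward)  # [('T1', ['site-a'])]
print(ignore)   # ['T2']
```

Only tickets that touch at least one grid site are forwarded, which mirrors the "forward to GGUS tickets that seem relevant" step.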
Page 12

Assess the network connectivity of the Grid sites

• Specific tools developed: Downcollector, see https://ccenoc.in2p3.fr/

[Bar chart: number of connectivity troubles detected on EGEE Grid certified sites, sorted per supposed location, per month from August 2007 to March 2008 (axis 0 to 1000). Series: WAN/MAN; LAN / Non network (power…); Unknown; Number of sites with at least one network trouble. 282 Certified Grid Sites.]

Page 13

Support of LHCOPN

http://ccenoc.in2p3.fr/ASPDrawer/

The LHC Optical Private Network

15 PB of data per year generated by the LHC
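
To give a feel for what 15 PB per year means for a network, the volume can be converted into an average sustained rate. This is a rough sketch that assumes decimal petabytes (10^15 bytes) and a perfectly even spread over the year; it says nothing about peaks or about how the data is replicated between sites.

```python
PETABYTE = 10**15                   # bytes, decimal convention (assumption)
SECONDS_PER_YEAR = 365 * 24 * 3600  # non-leap year


def average_rate_gbps(petabytes_per_year):
    """Average rate in Gbit/s needed to move the given yearly volume."""
    bits_per_year = petabytes_per_year * PETABYTE * 8
    return bits_per_year / SECONDS_PER_YEAR / 1e9


print(f"{average_rate_gbps(15):.1f} Gbit/s")  # 3.8 Gbit/s
```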

Page 14

Support of LHCOPN

• SA2 objectives in LHCOPN context are:
  – Define the operational model
    Define accurately the responsibilities of each actor
    Ensure that problem resolution is not delayed by an unsuitable operational model
    Ensure the LHCOPN is well monitored
  – Set up communication channels between this network and the EGEE Grid (scheduled downtimes, incidents etc.)
• LHCOPN operational model:
  – Federative model, responsibility shared by Tier 1s and Tier 0
  – Approach: define the actors and their relationships, where to find the information, and the procedure
    Every actor agrees on the operational model and is aware of their role and the procedure they should apply
  – Draft: Operational model WIKI

Page 15

LHCOPN Operational model

Page 16

LHCOPN Operational model

Page 17

Trouble matching and correlation (RRC-KI)

• Trouble matching and correlation for the ENOC
  – From a discovered incident, find the related network trouble ticket
  – Better trouble localisation
  – Different methods will be tested
• First method
  – Another monitoring tool (SmokePing) has been set up, located in Russia
  – The results of this tool and those from the ENOC (Downcollector, Lyon) are matched up
  – The two tools are located in two different places in order to improve the knowledge of the network topology
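
One simple way to match up results from two vantage points is to compare the sets of sites each probe sees as unreachable. The sketch below is an illustrative heuristic only, not the method actually used by RRC-KI or the ENOC; the function name and site names are hypothetical.

```python
def correlate(down_from_lyon, down_from_russia):
    """Group unreachable sites by which vantage point(s) observed the outage."""
    lyon, russia = set(down_from_lyon), set(down_from_russia)
    return {
        "both": sorted(lyon & russia),         # trouble likely close to the site
        "lyon_only": sorted(lyon - russia),    # trouble likely on the path from Lyon
        "russia_only": sorted(russia - lyon),  # trouble likely on the path from Russia
    }


# Hypothetical observations for one measurement round.
result = correlate({"site-a", "site-b"}, {"site-b", "site-c"})
print(result)
# {'both': ['site-b'], 'lyon_only': ['site-a'], 'russia_only': ['site-c']}
```

A site seen down from both places points to a problem near the site itself, while a site seen down from only one place suggests a problem somewhere on that probe's path, which is how a second vantage point can help localise trouble.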

Page 18

Network Operational Database

Page 19

Tools for troubleshooting (DFN)

• Tools for efficient troubleshooting
  – Launch tests on demand from the Grid site under central server control: ping, traceroute, DNS lookup, nmap and bandwidth measurements.

[Diagram labels: ENOC; local site light perfSONAR sensor; central ENOC monitoring server; ENOC supervisor; site administrator; Grid site A; Grid site B; steps numbered 1 to 5]

Presenter notes:
Constraints from Grid sites:
- A lightweight product
- A well-known product
- No active measurement
- Obtain an access to measurement
Constraints from the project:
- Sustainability of the software
- A minimum of software development
Proposal:
Deployment on the site of a tool written by the perfSONAR team (DFN) that enables active measurement on demand. Add to the already developed core module of perfSONAR a plugin to match the requirements of SA2 (ping, traceroute, DNS lookup, nmap and bandwidth measurements)
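
The idea of launching a small, fixed set of tests on demand can be illustrated with a dispatcher that only builds command lines for whitelisted tests. This is a sketch under stated assumptions and not the perfSONAR plug-in interface: the names `ALLOWED_TESTS` and `build_command`, the chosen options and the host name are all hypothetical.

```python
import re

# Whitelist of tests the sensor agrees to run (illustrative subset).
ALLOWED_TESTS = {
    "ping": ["ping", "-c", "4"],
    "traceroute": ["traceroute"],
    "dns": ["nslookup"],
}

# Conservative host name pattern: letters, digits, dots and hyphens,
# never starting or ending with a hyphen or a dot.
HOSTNAME_RE = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9.-]{0,251}[A-Za-z0-9])?")


def build_command(test, target):
    """Return the argument list for a whitelisted test against target."""
    if test not in ALLOWED_TESTS:
        raise ValueError(f"test not allowed: {test}")
    if not HOSTNAME_RE.fullmatch(target):
        raise ValueError(f"invalid target: {target!r}")
    return ALLOWED_TESTS[test] + [target]


print(build_command("ping", "ce01.example.org"))
# ['ping', '-c', '4', 'ce01.example.org']
```

The returned list can be passed to `subprocess.run(...)` without a shell, so a request coming from the central server cannot inject extra commands or options. Keeping the whitelist small is in line with the "lightweight product" constraint from the sites.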
Page 20

Tools for troubleshooting (DFN)

• Active measurement on demand: lightweight perfSONAR version with a specific plug-in
• Looking for beta-tester sites
• NRENs can take advantage of the deployment of this software
  – To troubleshoot their own grid nodes

Page 21

Grid site networking needs (RedIRIS)

• Establish empirically the network needs of a site according to the type of
  – Site (Tier 0, 1, 2, 3)
  – Experiment computed at the site
• Working plan
  – Review of the status of Tier2 / Tier3 in Spain
  – Translate the requirements and needs into network parameters to be measured.
  – Brief review of the different network performance and monitoring tools that tiers agree to deploy
  – Pilot / service definition for deploying perfSONAR
  – Performance and monitoring tests definition
  – Test phase, results and conclusions.

Page 22

Advanced network services (GRNET)

• Enable access for applications to the advanced services provided by the NRENs
• SLA automation in multi-domain environment through AMPS (Advance Multi-domain Provisioning System)
  – Overcome the lack of automated mechanisms
• SLA monitoring in EGEE
  – Automate the monitoring procedure and generate alarms.
  – perfSONAR
• Investigate the new advanced network services soon available
  – Dynamic lightpath?
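
Automated SLA monitoring with alarm generation boils down to comparing measurements against agreed thresholds. The sketch below is a minimal illustration; the `Sla` fields and the threshold values are hypothetical and do not come from any actual EGEE or NREN agreement, and the measurements would in practice come from a monitoring tool such as perfSONAR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sla:
    """Hypothetical SLA thresholds for one network path."""
    max_rtt_ms: float
    max_loss_pct: float
    min_bandwidth_mbps: float


def check_sla(sla, rtt_ms, loss_pct, bandwidth_mbps):
    """Return one alarm message per violated threshold (empty if compliant)."""
    alarms = []
    if rtt_ms > sla.max_rtt_ms:
        alarms.append(f"RTT {rtt_ms} ms exceeds {sla.max_rtt_ms} ms")
    if loss_pct > sla.max_loss_pct:
        alarms.append(f"loss {loss_pct}% exceeds {sla.max_loss_pct}%")
    if bandwidth_mbps < sla.min_bandwidth_mbps:
        alarms.append(
            f"bandwidth {bandwidth_mbps} Mbit/s below {sla.min_bandwidth_mbps} Mbit/s"
        )
    return alarms


sla = Sla(max_rtt_ms=50.0, max_loss_pct=0.1, min_bandwidth_mbps=900.0)
alarms = check_sla(sla, rtt_ms=72.0, loss_pct=0.0, bandwidth_mbps=950.0)
print(alarms)  # ['RTT 72.0 ms exceeds 50.0 ms']
```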

Page 23

IPv6 follow-up (GARR/CNRS)

• Set up all elements needed to handle IPv6 in EGEE
  – Middleware, testbed
    gLite internal dependencies, IPv6 compliance
      • DPM-LFC, BDII
    External dependencies
      • Assessment of IPv6 compliance of external modules
      • Deep test for important external modules: GridFTP …
  – Validation process of EGEE (SA3)
  – IPv6 knowledge dissemination
    Training course, presentation
• Assess and make available an operational EGEE IPv6 site - according to which IPv6 gLite modules are available
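
A common ingredient of IPv6 compliance is address-family-agnostic name resolution: code asks for every address of a host instead of assuming IPv4. The sketch below shows that pattern with Python's standard `socket.getaddrinfo`; it is a generic illustration, not gLite code, and the function name and port number are hypothetical.

```python
import socket


def candidate_endpoints(host, port):
    """List (family name, address) pairs for host, covering IPv6 and IPv4."""
    infos = socket.getaddrinfo(
        host,
        port,
        family=socket.AF_UNSPEC,   # do not restrict to IPv4 or IPv6
        type=socket.SOCK_STREAM,
        proto=socket.IPPROTO_TCP,
    )
    endpoints = []
    for family, _type, _proto, _canonname, sockaddr in infos:
        name = "IPv6" if family == socket.AF_INET6 else "IPv4"
        endpoints.append((name, sockaddr[0]))
    return endpoints


# Literal addresses resolve without any DNS lookup.
print(candidate_endpoints("::1", 8080))
print(candidate_endpoints("127.0.0.1", 8080))
```

A client that tries each returned endpoint in turn works on IPv4-only, IPv6-only and dual-stack sites; `socket.create_connection((host, port))` performs the same loop internally.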

Page 24

Trouble ticket exchange (CNRS/GRNET)

• Defined by the TNLC (GARR, GRNET, RRC-KI, SRCE)
• Standard trouble tickets allow a better
  – Location of the problem
  – Assessment of the impact of trouble on the grid
• The translation can be done in
  – The ENOC, a central server translating the NREN’s ticket into a standard ticket
  – The NREN domain
• Software will be available soon
• The translator can easily be adapted to the requirements of NRENs willing to deliver standard tickets directly
• Standard trouble tickets will benefit both the NRENs and the Grid project
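
Translating an NREN ticket into a standard ticket can be pictured as a per-NREN field mapping. The sketch below is purely illustrative: the standard field names, the NREN name and its ticket fields are hypothetical and are not taken from the format defined by the TNLC.

```python
# Hypothetical standard ticket fields (not the actual TNLC definition).
STANDARD_FIELDS = ("id", "origin", "status", "start", "end", "location", "description")

# One mapping per NREN: native field name -> standard field name.
FIELD_MAPS = {
    "nren-x": {
        "ticket_no": "id",
        "state": "status",
        "begin": "start",
        "finish": "end",
        "pop": "location",
        "text": "description",
    },
}


def translate(nren, raw_ticket):
    """Translate a native NREN ticket (a dict) into the standard layout."""
    mapping = FIELD_MAPS[nren]
    standard = dict.fromkeys(STANDARD_FIELDS)  # missing fields stay None
    standard["origin"] = nren
    for native_name, standard_name in mapping.items():
        if native_name in raw_ticket:
            standard[standard_name] = raw_ticket[native_name]
    return standard


raw = {"ticket_no": "X-42", "state": "open", "pop": "pop-1", "text": "Fibre cut"}
std = translate("nren-x", raw)
print(std["id"], std["status"], std["location"])  # X-42 open pop-1
```

Because the mapping is plain data, the same translator can run centrally at the ENOC or inside the NREN domain, and adapting it to a new NREN only means adding one entry to `FIELD_MAPS`.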

Page 25

The European Grid Initiative

• There must be no gap in the support of the production grid
• Need to prepare a permanent Grid infrastructure
• Coordinate the integration and interaction between National Grid Infrastructures (NGIs)
• Experimental/research tasks should switch to production phases
• Establish at EGI level a sustainable collaboration between Grid and Network people
• A major stake for NRENs

Page 26

Lessons learnt from EGEE

• Future European Grid Initiative network activity:
• Troubleshooting activity should be reduced to a minimum (only big issues)
• Interaction (process, trouble sharing) and integration (operation design, monitoring…) with the Grid are essential at project level
• Trouble ticket handling should be turned into a knowledge database and used as a part of network quality monitoring
• Network monitoring is an open subject in EGI-NGI
• The NGI/EGI will federate several grid projects and therefore handle more sites and more networks
• Future possibilities offered by networks to the Grid should not be missed: dynamic lightpath provisioning (Internet2, Phosphorus…), IPv6 compliance
• Network quality control should be fostered (statistics, MoU checking, feedback to network providers…)

Page 27

Network activity in EGI/NGI

• Network activity key objectives in EGI/NGI
• Interface between the European Grid Infrastructure and network providers
• Monitor the quality of networks used by Grid projects:
  • Public: educational and research networks.
  • Private: non-educational network providers (commercial…)
  • Dedicated: LHCOPN (LHC Optical Private Network)…
• Ensure that applications’ network requirements are fulfilled / monitoring
• Put new network technologies forward in the Grid process.

Page 28

Conclusion

• Trouble ticket standardization
• Tools for troubleshooting
  – Lightweight perfSONAR deployed on grid sites
• Network monitoring for EGI
• Collaboration with NRENs around
  – Specific topics (network monitoring of grid sites, trouble tickets, assessment of the impact of trouble on the grid)
  – through TNLC
• Establish a future collaboration between NRENs and NGI/EGI

Page 29

Thank you.