Transcript
Page 1: GGUS summary (2 weeks)

GGUS summary (2 weeks)

VO       User   Team   Alarm   Total
ALICE       1      0       1       2
ATLAS      14    116       6     136
CMS         4      1       1       6
LHCb        1     20       1      22
Totals     20    137       9     166
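The totals in the table can be cross-checked with a short script. This is only an illustrative sketch: the counts are copied from the table, while the dictionary layout and the function names `row_totals` and `column_totals` are hypothetical and not part of GGUS or any reporting tool.

```python
# Ticket counts per VO, copied from the GGUS summary table.
# The data layout and helper names are illustrative assumptions.
tickets = {
    "ALICE": {"user": 1, "team": 0, "alarm": 1},
    "ATLAS": {"user": 14, "team": 116, "alarm": 6},
    "CMS": {"user": 4, "team": 1, "alarm": 1},
    "LHCb": {"user": 1, "team": 20, "alarm": 1},
}


def row_totals(counts):
    """Sum all ticket types for each VO (the 'Total' column)."""
    return {vo: sum(kinds.values()) for vo, kinds in counts.items()}


def column_totals(counts):
    """Sum each ticket type over all VOs (the 'Totals' row)."""
    totals = {}
    for kinds in counts.values():
        for kind, n in kinds.items():
            totals[kind] = totals.get(kind, 0) + n
    return totals


if __name__ == "__main__":
    print(row_totals(tickets))
    print(column_totals(tickets))
    print(sum(row_totals(tickets).values()))
```

Both directions of summation should arrive at the same grand total, which is a quick way to catch a transcription error in either the rows or the totals line.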

Page 2: GGUS summary (2 weeks)

04/21/23 WLCG MB Report WLCG Service Report 2

Support-related events since last MB

• We need WLCG shifters, alarmers and management to enter meaningful values in the GGUS ‘Problem Type’ field, so that periodic reports can better expose weak areas in support.

• There were 9 ALARM tickets since the last MB (2 weeks), 5 of which were real, all submitted by ATLAS. Details follow…

Page 3: GGUS summary (2 weeks)

ATLAS ALARM->CERN-CNAF TRANSFERS

• https://gus.fzk.de/ws/ticket_info.php?ticket=62761

What time UTC What happened

2010/10/05 9:13 GGUS ALARM ticket opened, automatic email notification to [email protected] AND automatic assignment to ROC_Italy.

2010/10/05 10:23 Site acknowledges ticket and finds a StoRM backend problem.

2010/10/05 12:03 Service restored. Site puts the ticket to ‘solved’ and refers to GGUS:62745 for details.

2010/10/11 Submitter ‘verifies’ ticket GGUS:62745. Not sure how ‘symptomatic’ the solution was…
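The response times implied by a timeline like this one can be worked out mechanically. The sketch below is illustrative only: the timestamps are copied from the ticket 62761 timeline, while the event labels and the function name `minutes_since_open` are hypothetical and not part of any GGUS interface.

```python
from datetime import datetime

# Timestamp layout used in the timelines (UTC, no seconds).
FMT = "%Y/%m/%d %H:%M"

# Timestamps copied from the timeline; the short labels are illustrative.
events = [
    ("2010/10/05 9:13", "opened"),
    ("2010/10/05 10:23", "acknowledged"),
    ("2010/10/05 12:03", "restored"),
]


def minutes_since_open(timeline):
    """Return the minutes elapsed between the first event and each later one."""
    opened = datetime.strptime(timeline[0][0], FMT)
    elapsed = {}
    for stamp, label in timeline[1:]:
        delta = datetime.strptime(stamp, FMT) - opened
        elapsed[label] = int(delta.total_seconds() // 60)
    return elapsed


if __name__ == "__main__":
    print(minutes_since_open(events))
```

For this ticket the acknowledgement came 70 minutes after the alarm was opened and the service was restored after 170 minutes. The same function can be applied to the other timelines by swapping in their timestamps.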

Page 4: GGUS summary (2 weeks)

ATLAS ALARM->TRANSFERS TO .FR CLOUD

• https://gus.fzk.de/ws/ticket_info.php?ticket=62871

What time UTC What happened

2010/10/08 5:56 GGUS ALARM ticket opened, automatic email notification to [email protected] AND automatic assignment to NGI_France.

2010/10/08 6:31 Site acknowledges ticket and finds a network problem preventing all DB server access.

2010/10/08 7:29 Service restored.

2010/10/08 10:41 Site puts ticket to status ‘solved’.

Page 5: GGUS summary (2 weeks)

ATLAS ALARM-> CERN SLOW LSF

• https://gus.fzk.de/ws/ticket_info.php?ticket=62467

What time UTC What happened

2010/09/27 15:34 GGUS ALARM ticket opened, automatic email notification to [email protected] AND automatic assignment to ROC_CERN.

2010/09/27 16:01 Operator acknowledges ticket and contacts the expert.

2010/09/27 16:37 Expert’s 1st diagnosis. Too many queries.

2010/09/27 20:10 Service manager kills a home-made robot, run by another experiment, that was launching a very large number of bjob queries, and puts the ticket to status ‘solved’.

Page 6: GGUS summary (2 weeks)

ATLAS ALARM-> CERN SLOW AFS

• https://gus.fzk.de/ws/ticket_info.php?ticket=62662

What time UTC What happened

2010/10/01 7:13 GGUS ALARM ticket opened, automatic email notification to [email protected] AND automatic assignment to ROC_CERN.

2010/10/01 7:33 Operator acknowledges ticket and contacts the expert.

2010/10/01 9:37 IT Service manager re-classifies in CERN Remedy PRMS.

2010/10/11 15:33 Still ‘in progress’. Reminder sent during this drill.

Page 7: GGUS summary (2 weeks)

ATLAS ALARM-> CERN CASTOR

• https://gus.fzk.de/ws/ticket_info.php?ticket=62688

What time UTC What happened

2010/10/01 16:24 GGUS ALARM ticket opened, automatic email notification to [email protected] AND automatic assignment to ROC_CERN.

2010/10/01 16:41 Operator acknowledges ticket and contacts the expert.

2010/10/01 16:42 Expert starts investigation.

2010/10/01 17:23 Solved. PutDONE in SRM not propagated to CASTOR. Done by hand.

