
MINERVA USER GROUP MEETING, 3 July 2012 (MUG - Mid April - June Period)



Minerva Operational Statistics

Maintenance Stats: Mid-April through June

  Number of Planned PMs         6
  Number of Cancelled PMs       4
  Number of Unplanned Outages   1

  Total Time in Period          1,800 Hours
  Total Planned Downtime        40 Hours
  Total Unplanned Downtime      12 Hours
  Total Time Available          97%

  Time in Period                14M CPU Hours
  Total Compute Time Used       5.3M CPU Hours
  Average Utilization           37%

Accounting Statistics

  Number of Users               91
  Number of Completed Jobs      313,000
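The "Total Time Available" figure follows from the downtime numbers on this slide. A minimal sketch of that arithmetic, assuming availability is defined as the hours in the period minus planned and unplanned downtime, divided by the hours in the period (the function name is illustrative, not from the slides):

```python
def availability(total_hours: float, planned_down: float, unplanned_down: float) -> float:
    """Fraction of the period during which the system was available."""
    if total_hours <= 0:
        raise ValueError("total_hours must be positive")
    return (total_hours - planned_down - unplanned_down) / total_hours


# Figures from the slide: 1,800 hours in the period, 40 planned, 12 unplanned.
uptime = availability(1800, 40, 12)
print(f"{uptime:.1%}")  # prints "97.1%"
```

Under this definition the result is consistent with the 97% shown on the slide.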


Minerva Operational Statistics

                            Mid-April          May                June
Number of Completed Jobs    17,726             69,677             230,795
Time in Period              ~2.5M CPU Hours    5.3M CPU Hours     5.3M CPU Hours
Total Compute Time Used     160K CPU Hours     1.75M CPU Hours    3.3M CPU Hours
Average Utilization         6.4%               33%                62%
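The monthly utilization percentages can be reproduced from the two CPU-hour rows. A minimal sketch, assuming utilization means compute time used divided by CPU hours available in the period (the helper and dictionary names are illustrative):

```python
def utilization(used_cpu_hours: float, available_cpu_hours: float) -> float:
    """Fraction of the available CPU hours that jobs actually consumed."""
    if available_cpu_hours <= 0:
        raise ValueError("available_cpu_hours must be positive")
    return used_cpu_hours / available_cpu_hours


# (used, available) CPU hours per period, as quoted on the slide.
periods = {
    "Mid-April": (160_000, 2_500_000),
    "May": (1_750_000, 5_300_000),
    "June": (3_300_000, 5_300_000),
}

for name, (used, available) in periods.items():
    print(f"{name}: {utilization(used, available):.1%}")
```

With the slide's inputs this gives 6.4%, 33.0% and 62.3%, matching the quoted 6.4%, 33% and 62%.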


Minerva Usage By User

User                CPU Hours   N Jobs   Avg CPU H/Job   Min Size   Avg Size   Max Size
Mihaly Mezei          816,880  132,035            6.18          1        1.5      2,048
Menachem Fromer       773,833   57,731              13          1       15.3         25
Ernesto Borrero       768,744      274           2,805          1      693.4      4,096
Yacob Gomez           688,090    2,107             326          1       92.1      1,024
Vladimir Makarov      397,200   24,709              16          1        7.1         64
Ana Negri             266,359      277             961         32      175.6        256
Michael Linderman     196,460   11,670              17          1       17.2         64
Hardik Shah           190,146      470             404         16       47.1         64
Elena Parkhomenko     168,407      946             178          1        3.9          4
Bojan Losic           142,900    3,643              39          1       26.1         64
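The "Avg CPU H / Job" column is the ratio of the two columns before it. A minimal sketch, assuming it is simply CPU hours divided by the number of jobs (the helper name is illustrative, and only three rows from the table are used):

```python
def avg_cpu_hours_per_job(cpu_hours: float, n_jobs: int) -> float:
    """Average CPU hours consumed per completed job."""
    if n_jobs <= 0:
        raise ValueError("n_jobs must be positive")
    return cpu_hours / n_jobs


# (CPU hours, N jobs) for three rows of the table above.
rows = {
    "Mihaly Mezei": (816_880, 132_035),
    "Ernesto Borrero": (768_744, 274),
    "Bojan Losic": (142_900, 3_643),
}

for user, (cpu_hours, n_jobs) in rows.items():
    print(f"{user}: {avg_cpu_hours_per_job(cpu_hours, n_jobs):.2f}")
```

The quotients land within about one unit of the last digit shown on the slide; whether the slide rounds or truncates is not stated.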


Minerva Hours By User

Mihaly Mezei        816,880
Menachem Fromer     773,834
Ernesto Borrero     768,744
Yacob Gomez         688,091
Vladimir Makarov    397,201
Ana Negri           266,359
Michael Linderman   196,461
Hardik Shah         190,146
Elena Parkhomenko   168,407
Bojan Losic         142,901
Zachary Giles       129,745
Hyung min Cho       110,500
Ariella Cohain      110,210
Harm Van bakel      110,172
Roberto Sanchez      87,468
Sonali Arora         86,950
Dalila Pinto         82,309
Other               213,183


Remaining Users CPU Hours

User       Core Hours   N Jobs      User       Core Hours   N Jobs
gilesz01      129,744   45,250      changr04       13,385    1,024
choh07        110,499      497      pendlm02        8,081       49
cohaia01      110,210   13,758      bongeg01        6,446       83
vanbah01      110,172    1,264      kouy01          5,094       66
sanchr05       87,468      280      yoos01          3,911       56
aroras03       86,949    2,726      jabado01        2,540    4,545
pintod02       82,309    4,079      ruderd02        1,500       72
zhuj05         35,151      380      bashia02        1,404       30
fludee01       27,196       42      purces04        1,401       70
osmanr01       24,737       29      holeha01          949        5
johnsj12       21,694       22      yangy10           405      313
gargp01        21,618    6,852      brandt02          341       11
schade01       18,654      190      ludtka01          187        6
goldba06       18,464    1,064      provad01           14       12
caig01 2


Minerva Usage By Group

P.I.                CPU Hours   N Jobs   Avg CPU H/Job   Min Size   Avg Size   Max Size
Marta Filizola      1,056,812      586           1,803          1        409      4,096
Mihaly Mezei          816,876  132,031            6.18          1        1.5      2,048
Pamela Sklar          776,361   62,164              12          1         14         64
Ivan Ubarrechena      688,090    2,107             326          1         92      1,024
Joseph Buxbaum        584,072   25,845              22          1          7         64
HPC Staff             267,440   46,127            5.79     17,680
Rui Chang             210,545   17,548              12          1          6         64
Michael Linderman     197,597   11,751            16.8     1617.2                    64
Milind Mahajan        189,941      502             378          1       47.9          4
Bojan Losic           142,900    3,643              39          1       26.1         64


Minerva Hours by Group

Marta Filizola      1,056,812
Mihaly Mezei          816,876
Pamela Sklar          776,362
Ivan Ubarrechena      688,091
Joseph Buxbaum        584,073
HPC Staff             267,441
Rui Chang             210,546
Michael Linderman     197,598
Milind Mahajan        189,942
Bojan Losic           142,901
Harm Van bakel        110,172
Roberto Sanchez        87,468
Dalila Pinto           82,309
Jun Zhu                39,062


Minerva Job Mix (chart: CPU Hours by Job Size and Wall Time)


Minerva Utilization: Mid-April - June


Minerva Utilization: May - June


Minerva Scratch Usage

/projects

Group               Size
Joseph Buxbaum      134T
Dalila Pinto        27T
Genomic Core        22T
Shaun Purcell       12T
Next Gen Seq        3.9T
Genomic Core II     2.0T
Jun Zhu             1.8T
Harm Van bakel      907G
Milind Mahajan      126G
Zhidong Tu          75G

/scratch

User                Size
Ernesto Borrero     12T
Ana Negri           4.5T
Bojan Losic         4.1T
Hyung min Cho       1.3T
Temp Folders        1.3T
Harm Van bakel      1.2T
Yan Kou             979G
Mihaly Mezei        941G
Yacob Gomez         927G
Zhidong Tu          755G


Other Plans/Projects

• Archival Storage
  – Ordered: Tape Library with 4 Tape transports
    • 350TB tape capacity
  – Anticipated 1 Sep 2012 start of service
• GPGPU
  – Chassis w/2 Fermi-based Tesla cards ordered
  – Target availability date is 1 Aug 2012
• Checkpoint/Restart (BLCR)
  – Partially installed; needs reboot of systems and testing.
• Monthly Training Meetings
  – Third Tuesday of Month
  – Alternate between basic and advanced


Hiccups

Scheduler Failure:
Problem: June tripled the previous job count. Scheduler database table overflowed.
Resolution: We put limits on the number of jobs per user in Torque and Moab.
Long Term: Newer version of Torque and Moab. Move to a SQL database.

Infiniband / MPI Issues:
Problem: Mellanox driver buffer overflowing because of 64-core systems.
Resolution: We built a custom version of the Mellanox driver.
Long Term: Working with Mellanox to add changes to mainline code.

AMD 64-core understanding + performance:
Problem: Misunderstanding of the number of FPUs in a system: 32, not 64. Also, the ACML library is not tuned for the FFTW library.
Resolution: Changed scheduling to allow blocks of 32 and job-exclusive nodes.
Long Term: AMD is creating a new ACML library with tuned FFT sizes.
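The 32-versus-64 point can be illustrated with a small calculation. This is a sketch under two assumptions that the slides do not spell out: that each pair of integer cores shares one FPU (which would give 32 FPUs on a 64-core node), and that "blocks of 32" means core requests are rounded up to whole 32-core blocks. Both helper names are hypothetical.

```python
CORES_PER_FPU = 2  # assumption: two integer cores share one floating-point unit
BLOCK_SIZE = 32    # assumption: scheduling granularity, read from "blocks of 32"


def fpu_count(cores: int, cores_per_fpu: int = CORES_PER_FPU) -> int:
    """Number of FPUs on a node with the given core count."""
    if cores <= 0 or cores % cores_per_fpu:
        raise ValueError("cores must be a positive multiple of cores_per_fpu")
    return cores // cores_per_fpu


def blocks_needed(requested_cores: int, block_size: int = BLOCK_SIZE) -> int:
    """Whole blocks needed to cover a core request (ceiling division)."""
    if requested_cores <= 0:
        raise ValueError("requested_cores must be positive")
    return -(-requested_cores // block_size)


print(fpu_count(64))      # prints 32
print(blocks_needed(40))  # prints 2
```

Under these assumptions, a floating-point-heavy job that fills all 64 cores of a node has two threads competing for each FPU, which is one way to read the performance surprise described above.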


Open Forum

Requested/Suggested Topics

• Bioconductor R site-library
  – Should we put all Bioconductor R packages in one library? (module load bioconductor)
• Epilogue report
  – Report job resource usage to syserr?
• PM Schedule
  – Can we reduce PMs to monthly?
• Fairshare
  – Comments? Feedback?
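For the epilogue report topic, a rough sketch of what such a script might look like. Several things here are assumptions rather than anything stated in the slides: that the epilogue receives the job ID as argument 1 and a comma-separated "resources used" string as argument 7 (the order described in the TORQUE prologue/epilogue documentation; check it against the installed version), that "syserr" means the job's standard error stream, and all function names.

```python
import sys


def parse_resources_used(resources: str) -> dict[str, str]:
    """Parse a 'key=value,key=value' resource string into a dict."""
    usage = {}
    for item in resources.split(","):
        key, sep, value = item.partition("=")
        if sep and key.strip():
            usage[key.strip()] = value.strip()
    return usage


def format_report(job_id: str, usage: dict[str, str]) -> str:
    """Build a short, human-readable usage summary for one job."""
    lines = [f"Resource usage for job {job_id}:"]
    lines += [f"  {name:<10} {value}" for name, value in sorted(usage.items())]
    return "\n".join(lines)


if __name__ == "__main__":
    # Assumed argument positions: argv[1] = job ID, argv[7] = resources used.
    if len(sys.argv) > 7:
        report = format_report(sys.argv[1], parse_resources_used(sys.argv[7]))
        print(report, file=sys.stderr)
```

Called with a string such as `cput=00:00:10,mem=2048kb,walltime=00:00:12`, the report lists one resource per line, sorted by name.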

