Super Computing with PowerWorld Simulator
Tracy Rolstad
February 2016
PowerWorld Client Conference
Austin, TX
Education
• Tracy Rolstad
  – Diploma, Naval War College, College of Naval Command and Staff
  – BSEE, University of Idaho
  – Nuclear Navy
    • Nuclear Operational Prototype (S1C)
    • Nuclear Power School (Reactor Operator)
    • Electronics Technician School
      – Radar, Communications, etc.
Resume…
– Avista Corporation
  • Senior Pwr Sys Consultant, System Planning
  • WECC TSS Chair, Vice Chair, Secretary (all former)
– Utility System Efficiencies
  • Senior Power Systems Analyst
– The Bonneville Power Administration
  • Senior Engineer, System Operations
– The Joint Warfare Analysis Center
  • EP Senior Analyst, PACOM Chief of Targets
  • Special Technical Operations Action Officer
– Nuclear Navy (Attack Submarines)
  • Chief Petty Officer (ETC/SS)
  • Engineering Watch Supervisor
The Cray Y-MP: eight 32-bit processors capable of 333 MegaFlops each. Combined, the Cray Y-MP could sustain a speed of over 2 GFlops. The CPUs ran at a blazing 167 MHz and could process both 24-bit and 32-bit instructions. It cost ~$20 million in 1995.
The fastest processor for a desktop (~1995) was a 486DX/2 66, with a system bus of 33 MHz and an internal clock rate of 66 MHz. It was a 32-bit processor capable of 2.67 MegaFlops. It cost $4,000.
Lenovo T440 (i7-4600, 2 cores @ 2.10 GHz): ~3 GFlops range (laptop with 1 CPU)
• The i7 has ~1.3 billion transistors
• Running a 64-bit OS (with far too much x86 software)
• Costs about $700
Avista is edging towards “teraFlopping”…but practically, Flops are a meaningless measure. We use a double Palo Verde generation trip as our measure (i.e., 2PV…seconds per 2PV simulation).
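As a rough check on the numbers above, the sketch below combines the quoted per-processor ratings and prices into a combined rating and a cost per MegaFlops. It uses only the figures on the slide; the laptop rating is taken as 3000 MegaFlops, which is an assumption based on the "~3 GFlops range" wording.

```python
# Ratings and prices as quoted on the slide (laptop rating is approximate).
MACHINES = {
    "Cray Y-MP": {"mflops": 8 * 333, "cost_usd": 20_000_000},
    "486DX/2 66": {"mflops": 2.67, "cost_usd": 4_000},
    "Lenovo T440": {"mflops": 3000, "cost_usd": 700},  # assumed from "~3 GFlops"
}

def cost_per_mflops(name):
    """Return quoted dollars per MegaFlops for one machine."""
    m = MACHINES[name]
    return m["cost_usd"] / m["mflops"]

cray_total_mflops = MACHINES["Cray Y-MP"]["mflops"]
print(f"Cray Y-MP combined: {cray_total_mflops / 1000:.2f} GFlops")
for name in MACHINES:
    print(f"{name}: ${cost_per_mflops(name):,.2f} per MegaFlops")
```

The ratio is only as good as the quoted ratings, which is one reason a seconds-per-2PV measure is more useful in practice.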
PowerWorld’s Distributed Computing
[Diagram: one Host PC connected to many Remote PCs]
• Uses all CPU cores on your PC for analysis
• Can use all CPU cores on networked PCs as well
• The helpful folks in Avista’s technology group got us 14 dedicated PCs
  – 176 cores of analysis power
    • It rocks!
  – We have our own LAN
    • DCOM on corporate LANs is not desired
    • Virus 101 says no DCOM
    • IT is happy with us being separate
Avatar: Avista Advanced Transmission & Reliability
The Avista “backbone”
A “Supercomputer” is Boring…
It is also inexpensive and easy to build
What is this?
Avista’s “Super Computer”
• 5 Lenovo P700 (x2 CPU) w/32 GB RAM
  – Total of 10 CPUs (2 CPUs per machine, 12 cores each) for 120 cores
• Server class machines (#6 on Intel speed list, ~$10K each)
• 8 HP Z400 (x1 CPU) w/16 GB RAM
  – Total of 8 CPUs (6 cores each) for 48 cores
• 1 HP Z800 (x1 CPU) w/16 GB RAM
  – Total of 1 CPU (8 cores)
• Total core count is ~176
  – Count varies as we refresh machines or take them out of the cluster
  – More is better/fast is better. Go big or go home!
  – Don’t be cheap. Don’t use junk!
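The core count above can be tallied from the inventory. This is a minimal sketch using the machine counts on the slide; the `MachineGroup` class is a hypothetical helper for illustration, not part of PowerWorld Simulator.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class MachineGroup:
    """One line of the cluster inventory (hypothetical helper type)."""
    model: str
    machines: int
    cpus_per_machine: int
    cores_per_cpu: int

    @property
    def cores(self) -> int:
        return self.machines * self.cpus_per_machine * self.cores_per_cpu

# Inventory as listed on the slide.
CLUSTER = [
    MachineGroup("Lenovo P700", machines=5, cpus_per_machine=2, cores_per_cpu=12),
    MachineGroup("HP Z400", machines=8, cpus_per_machine=1, cores_per_cpu=6),
    MachineGroup("HP Z800", machines=1, cpus_per_machine=1, cores_per_cpu=8),
]

total_cores = sum(group.cores for group in CLUSTER)
for group in CLUSTER:
    print(f"{group.model}: {group.cores} cores")
print(f"Total: {total_cores} cores")
```

Editing the list is enough to get the new total when machines are refreshed or pulled from the cluster.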
Passmark CPU Benchmark
[Chart: Passmark CPU benchmark comparison of the P700, Z400, Z800, and T440; plotted values 6, 112, 311, 505]
Buy fast computers. Patches matter. Fast computers hide the need for patches!
• Laptop (T440) does a 2 PV 10 second sim in 5 min, 6 sec (have seen 14 min)
• Desktop (P700) does a 2 PV 10 second sim in 3 min, 56 sec
• Jamie can talk about the patches and their possible effects
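For comparison, the two quoted 2 PV timings can be put on a common footing. The sketch below uses only the times on the slide and works out the desktop-to-laptop ratio and how much slower than real time a single 10 second simulation runs.

```python
SIM_SECONDS = 10  # length of the simulated 2 PV event

def to_seconds(minutes, seconds=0):
    """Convert a 'min, sec' wall-clock time to seconds."""
    return minutes * 60 + seconds

t440_seconds = to_seconds(5, 6)     # laptop, typical run
t440_worst_seconds = to_seconds(14) # laptop, worst run seen
p700_seconds = to_seconds(3, 56)    # desktop

laptop_vs_desktop = t440_seconds / p700_seconds
p700_slowdown = p700_seconds / SIM_SECONDS  # wall seconds per simulated second

print(f"T440 takes {laptop_vs_desktop:.2f}x as long as the P700")
print(f"Worst T440 run takes {t440_worst_seconds / p700_seconds:.2f}x as long")
print(f"P700 runs {p700_slowdown:.1f}x slower than real time")
```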
Double Palo Verde (x 1000)
Double Palo Verde Trip
• The double Palo Verde generator trip
  – Well known WECC outage and test
    • Loss of ~1376 MW x2 (total loss is 2752 MW)
    • Excellent case checking contingency
  – It HAS happened in the past
    » 14 June 2004 all 3 Palo Verde units were lost
Note the Data Issues (it never ends)
Double Palo Verde
• Is computationally “busy”
• Is an excellent benchmark for WECC
  – Stability is a one at a time process
  – Power flow has queuing that you set up (later slides)
• Timing tests (almost irrelevant at this point)
  – Single ten sec 2 PV sim on P700 = 4 minutes
  – 1000 ten sec 2 PV sims on cluster = 60 minutes
    • We observe overhead when “assembling results”
  – 1000 * 10 = 10,000 sec or 167 minutes of sim time
    • “Faster than real time” in aggregate
  – Saved an FTE…went from days to minutes
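The timing figures above can be worked through as follows. This sketch uses only the quoted numbers (4 minutes per single run, 60 minutes for 1000 runs on the cluster) and assumes the serial alternative is one P700 running the list back to back.

```python
N_SIMS = 1000
SIM_SECONDS = 10            # simulated event length per run
SINGLE_RUN_MINUTES = 4      # one 2 PV sim on a P700
CLUSTER_WALL_MINUTES = 60   # 1000 sims on the cluster

simulated_minutes = N_SIMS * SIM_SECONDS / 60   # total simulated event time
serial_minutes = N_SIMS * SINGLE_RUN_MINUTES    # one machine, back to back
serial_days = serial_minutes / (60 * 24)
realtime_ratio = simulated_minutes / CLUSTER_WALL_MINUTES
speedup = serial_minutes / CLUSTER_WALL_MINUTES

print(f"Simulated time: {simulated_minutes:.0f} minutes")
print(f"Serial run time: {serial_minutes} minutes ({serial_days:.2f} days)")
print(f"Cluster is {realtime_ratio:.2f}x real time in aggregate")
print(f"Cluster vs one machine: {speedup:.1f}x")
```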
Google: 4000 minutes is how many days?
Avista’s present transient stability list is ~1000 contingencies.
• TPL-001-4 is moving this number much higher.
We look at as many as 20 cases in a year for compliance purposes.
• Our studies took months before distributed computing (4 min per TS sim).
• Software begat hardware.
• Software & hardware equals good!
• More is better…to a point which we have not found yet
Engineering judgment for N-1-1…
• Simulation = good…judgment = not as good as you think!
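To put the yearly workload in perspective, here is a small sketch under stated assumptions: every one of the 20 cases runs the full ~1000 contingency list, a single machine takes 4 minutes per simulation, and the cluster handles 1000 simulations in about 60 minutes. The function name is hypothetical.

```python
def annual_runtime(cases, contingencies, min_per_sim=4, cluster_min_per_1000=60):
    """Return (serial_days, cluster_hours) for a year of transient stability runs."""
    sims = cases * contingencies
    serial_days = sims * min_per_sim / (60 * 24)
    cluster_hours = sims / 1000 * cluster_min_per_1000 / 60
    return serial_days, cluster_hours

serial_days, cluster_hours = annual_runtime(cases=20, contingencies=1000)
print(f"One machine, back to back: {serial_days:.1f} days")
print(f"Cluster: {cluster_hours:.0f} hours")
```

Under these assumptions the serial figure is close to two months of continuous computing, which is consistent with studies taking months before distributed computing.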
Things Don’t Always Work (4168 hung)
Reboot and Patch Software
Distributed Computing Setup
Watching it assign 176 problems…
Assigns 176 ctgs to 176 cores…Then you wait
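One simple way to picture the assignment is in waves of one contingency per core. The sketch below is an idealized model only: the slides do not say whether the remaining contingencies are dispatched in full waves or as individual cores free up, and the function name is hypothetical.

```python
def wave_sizes(n_jobs, n_workers):
    """Split n_jobs into waves of at most n_workers jobs (idealized model)."""
    full_waves, remainder = divmod(n_jobs, n_workers)
    sizes = [n_workers] * full_waves
    if remainder:
        sizes.append(remainder)
    return sizes

waves = wave_sizes(n_jobs=1000, n_workers=176)
print(f"{len(waves)} waves: {waves}")

# If every core matched the 4 minute P700 time (an assumption; the cluster
# is mixed hardware), the waves alone would take:
ideal_minutes = len(waves) * 4
print(f"Idealized wall time: {ideal_minutes} minutes")
```

The gap between this idealized figure and the observed 60 minutes reflects slower machines in the mix and the overhead of assembling results.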
Processing the results (from 1000 2 PV)
Results come back to a temp directory
Watching a machine work
Looking at machine adequacy
Look at speed, memory, cores, etc.
Final Results (months turn into hours)
• 1000 double Palo Verdes in 60 minutes
  – 60 minutes of actual computing for 167 minutes of simulated event time.
    • “Faster” than real time in aggregate
• 90,000 power flow contingencies in 18 minutes
  – Yes, when being rigorous and complete you run this many
  – Our experience is that engineering judgment is poor judgment…simulation removes doubt
• Cost $100K…Benefit of at least one FTE ongoing
  – We see 2 FTE emerging with more PCs
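The results above translate into throughput figures as sketched below. The calculation assumes all 176 cores are busy for the whole run, which is an upper bound on the work each job receives; the function name is hypothetical.

```python
CORES = 176

def core_seconds_per_job(jobs, wall_minutes, cores=CORES):
    """Core-seconds spent per job, assuming every core is busy throughout."""
    return wall_minutes * 60 * cores / jobs

pf_rate_per_second = 90_000 / (18 * 60)                  # power flow ctgs per second
pf_core_seconds = core_seconds_per_job(90_000, 18)       # per power flow ctg
ts_core_minutes = core_seconds_per_job(1_000, 60) / 60   # per 2 PV stability run

print(f"Power flow: {pf_rate_per_second:.1f} contingencies per second")
print(f"Power flow: {pf_core_seconds:.2f} core-seconds per contingency")
print(f"Transient stability: {ts_core_minutes:.2f} core-minutes per run")
```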
Why such volume of simulation?
Future Steps
• Dashboard with computer management tools
  – c:\Windows\System32\taskkill.exe /F /IM pwrworld.exe
• Fault tolerance for computer failure or hanging
• Speed increase opportunities
  – Optimization, code grooming, elegance, efficiency, etc.
• Cloud computing (new PWC service perhaps)
• More people doing this!
  – Avista is out in front here and would welcome others
• Thousands of sims debug PWS quickly
• Bye bye x86…hello 64 bit PWS
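A dashboard action for clearing hung Simulator processes could wrap the taskkill command listed above. This is a minimal sketch, not an existing Avista or PowerWorld tool: the function names and the `PC-07` host name are hypothetical, and `/S` is the standard Windows taskkill option for targeting a remote system. It defaults to a dry run so that nothing is killed by accident.

```python
import subprocess

TASKKILL = r"c:\Windows\System32\taskkill.exe"

def build_taskkill_command(image_name="pwrworld.exe", host=None):
    """Build the taskkill argument list; host is an optional remote PC name."""
    cmd = [TASKKILL]
    if host is not None:
        cmd += ["/S", host]            # target a remote system
    cmd += ["/F", "/IM", image_name]   # force-kill by image name
    return cmd

def kill_hung_simulator(host=None, dry_run=True):
    """Print the command (dry run) or run it and return the CompletedProcess."""
    cmd = build_taskkill_command(host=host)
    if dry_run:
        print(" ".join(cmd))
        return None
    return subprocess.run(cmd, capture_output=True, text=True, check=False)

kill_hung_simulator()                # local machine, dry run
kill_hung_simulator(host="PC-07")    # hypothetical remote PC, dry run
```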
Questions
• Questions, thoughts, ideas…
• Food for thought
  – How fast would ERCOT studies take?
  – Eastern Interconnection?
  – State estimated case work for RCs
  – 64 bit OS and the future of simulation
    • Calculate the RAM limit
      – exabytes
      – What does that even look like?
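The RAM limit question can be answered for a flat address space with a short calculation. This sketch gives the theoretical ceiling only; real operating systems and CPUs expose far less than the full 64 bits.

```python
GIB = 2**30  # gibibyte
EIB = 2**60  # exbibyte

def address_space_bytes(bits):
    """Bytes addressable with a flat address of the given width."""
    return 2**bits

limit_32 = address_space_bytes(32)
limit_64 = address_space_bytes(64)

print(f"32-bit: {limit_32 // GIB} GiB")
print(f"64-bit: {limit_64 // EIB} EiB")
print(f"64-bit is {limit_64 // limit_32:,} times the 32-bit limit")
```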