Improving Earthquake Forecasts using USC HPCC
Scott Callaghan
Southern California Earthquake Center
University of Southern California
SC12
What are earthquake forecasts?
• Want to describe possible earthquakes in a region
  – Seismic hazard maps
  – Insurance rates
  – Building codes
• Determine faults, magnitudes, earthquake rates
• Forecast produced every 5 years for California
Components of earthquake forecasts
• Integrate data from many sources
  – Magnitude-frequency distribution
  – Paleoseismicity
  – Slip rates
  – Sanity checks
• Try to satisfy constraints as closely as possible
The Grand Inversion
• Divide faults into small segments
• Consider 1 or more segments together
• Solve for rates, given constraints
  – 234k rates
  – 30k constraints
• Minimize error
• Need to run thousands of times
  – Underdetermined
  – Multiple branches
Simulated Annealing (SA)
• Iterative approach for solving optimization problems
• Works by reducing ‘energy’ heuristic
  – Calculate energy of current state
  – Calculate energy of neighboring state
  – Probability of moving to a neighbor state increases with the temperature and decreases with the energy difference
    • If energy is less, always move
    • If energy is greater, occasionally move
  – Over time, reduce temperature to converge on a minimum energy (not necessarily global minimum)
• Serial version too slow for Grand Inversion
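The serial loop above can be sketched in a few lines of Python. This is only an illustration, not the OpenSHA code: the Metropolis acceptance rule exp(-delta / T), the geometric cooling schedule, and the one-dimensional toy energy are all assumptions chosen for brevity.

```python
import math
import random


def simulated_annealing(energy, neighbor, state, t_start=1.0, cooling=0.995,
                        n_iterations=5000, seed=0):
    """Serial simulated annealing; acceptance rule and cooling are assumed."""
    rng = random.Random(seed)
    current, current_e = state, energy(state)
    best, best_e = current, current_e
    temperature = t_start
    for _ in range(n_iterations):
        candidate = neighbor(current, rng)
        candidate_e = energy(candidate)
        delta = candidate_e - current_e
        # Lower energy: always move. Higher energy: move with probability
        # exp(-delta / T), which shrinks as T falls or as delta grows.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_e = candidate, candidate_e
            if current_e < best_e:
                best, best_e = current, current_e
        temperature *= cooling  # geometric cooling schedule (assumed)
    return best, best_e


def toy_energy(x):
    """Hypothetical stand-in for the misfit: minimum at x = 3."""
    return (x - 3.0) ** 2


def toy_neighbor(x, rng):
    """Hypothetical neighbor move: a small random perturbation."""
    return x + rng.uniform(-0.5, 0.5)


best_x, best_energy = simulated_annealing(toy_energy, toy_neighbor, state=10.0)
print(best_x, best_energy)
```

Every iteration depends on the state left by the previous one, which is why a single chain cannot simply be split across cores.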
Parallel Simulated Annealing
• Needed to parallelize algorithm
• Have each core perform serial SA for some number of iterations (nSubIterations)
• Share best answer with all cores
• Continue until a stopping criterion is met
• Able to cover more search space quickly
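A minimal sketch of this scheme follows, emulating the workers one after another in a single Python process so that it stays deterministic. It is not the OpenSHA MPI or threaded code. The function names, the fixed number of rounds used as the stopping criterion, the per-round cooling, and the toy energy are all assumptions; the worker count of 5 and the 200 sub-iterations are the nNodes and nSubIterations values from the example slides.

```python
import math
import random


def anneal_chunk(energy, state, temperature, n_sub_iterations, rng):
    """Run serial SA for n_sub_iterations steps starting from `state`."""
    current, current_e = state, energy(state)
    best, best_e = current, current_e
    for _ in range(n_sub_iterations):
        candidate = current + rng.uniform(-0.5, 0.5)
        candidate_e = energy(candidate)
        delta = candidate_e - current_e
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_e = candidate, candidate_e
            if current_e < best_e:
                best, best_e = current, current_e
    return best, best_e


def parallel_annealing(energy, start, n_workers=5, n_sub_iterations=200,
                       n_rounds=20, t_start=1.0, cooling=0.7):
    """Emulate parallel SA: independent chunks, then share the best answer."""
    rngs = [random.Random(seed) for seed in range(n_workers)]
    shared, shared_e = start, energy(start)
    temperature = t_start
    history = [shared_e]
    for _ in range(n_rounds):  # stopping criterion: fixed round count (assumed)
        # In a real run these chunks would execute concurrently, one per core.
        results = [anneal_chunk(energy, shared, temperature, n_sub_iterations, rng)
                   for rng in rngs]
        round_best, round_best_e = min(results, key=lambda r: r[1])
        if round_best_e < shared_e:
            # Share the best answer: every worker restarts from it next round.
            shared, shared_e = round_best, round_best_e
        history.append(shared_e)
        temperature *= cooling  # cool once per round (assumed)
    return shared, shared_e, history


def toy_energy(x):
    """Hypothetical stand-in for the misfit: minimum at x = 3."""
    return (x - 3.0) ** 2


best_x, best_e, history = parallel_annealing(toy_energy, start=10.0)
print(best_x, best_e)
```

Because each worker uses its own random stream, the workers explore different neighbors from the same starting point, and only the best result survives each synchronization.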
(Diagram slides: parallel simulated annealing example with nNodes = 5 and nSubIterations = 200)
Implementation
• Seismology code
  – OpenSHA - http://www.opensha.org, open source
  – Java-based, object-oriented
• Both Java MPI and threaded versions
• Why Java?
  – Codebase in Java, avoid cost of porting
  – Scientists already comfortable with Java
  – Avoid maintaining separate serial and parallel codebases
• OpenSHA is used in many other applications
Performance
Legend
• Single node (thin lines): 1, 2, 4, and 8 threads
• Multiple nodes (4 threads each): 2 nodes (8 threads), 5 nodes (20 threads), 10 nodes (40 threads), 20 nodes (80 threads), 50 nodes (200 threads), 100 nodes (400 threads)
• Plot labels: Optimal, Actual, Sqrt(threads)
Optimization
• Improve performance of serial section
• Energy calculation:
  – [1x234000] x [234000x30000] = [1x30000]
  – Calculate misfits
• Switched to Parallel Colt
• Stopped performing entire matrix multiplication each iteration
  – Only compute differences, not whole matrix
  – 100x speedup
  – Additional 10x with caching
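One way to read "only compute differences" is sketched below on a tiny example. When a single rate changes, the predicted vector can be corrected with one row of the matrix instead of repeating the full product. This is an interpretation, not the Parallel Colt code: the function names, the sum-of-squares misfit, and the 3x2 toy matrix are assumptions.

```python
def full_product(rates, matrix):
    """Full [1xN] x [NxM] product: cost grows with N * M."""
    n_cols = len(matrix[0])
    return [sum(rates[i] * matrix[i][j] for i in range(len(rates)))
            for j in range(n_cols)]


def misfit_energy(predicted, targets):
    """Sum of squared misfits (assumed form of the energy)."""
    return sum((p - t) ** 2 for p, t in zip(predicted, targets))


def apply_rate_change(rates, predicted, matrix, index, new_rate):
    """Update `predicted` in place after changing one rate: cost grows with M."""
    delta = new_rate - rates[index]
    rates[index] = new_rate
    row = matrix[index]
    for j in range(len(predicted)):
        predicted[j] += delta * row[j]


# Toy problem: 3 rates and 2 constraints (the real one is 234k by 30k).
matrix = [[1.0, 0.0],
          [0.5, 2.0],
          [4.0, 1.0]]
rates = [1.0, 2.0, 0.5]
targets = [4.0, 6.0]

predicted = full_product(rates, matrix)           # computed once up front
apply_rate_change(rates, predicted, matrix, 1, 3.0)
energy = misfit_energy(predicted, targets)
print(predicted, energy)
```

In this sketch a perturbation of one rate touches a single row of the matrix, so the work per iteration no longer depends on the number of rates.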
HPCC runs
• 7682 inversions
• Best use of SUs is 1 node per inversion
  – 5 hours on 8 cores/node per inversion
• Continuing to run inversions with new data and models
• Quick way to take new inputs and determine impact on rates
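For a rough sense of scale, the figures above can be multiplied out. This assumes every inversion used one full node for the quoted 5 hours, which is a simplification, and it makes no attempt to convert the result into SUs.

```python
N_INVERSIONS = 7682        # from the slide
HOURS_PER_INVERSION = 5    # from the slide; assumed identical for every run
CORES_PER_NODE = 8         # from the slide

node_hours = N_INVERSIONS * HOURS_PER_INVERSION
core_hours = node_hours * CORES_PER_NODE
print(node_hours, core_hours)  # prints "38410 307280"
```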
Questions?