Computation Time Analysis - Climate Reanalysis Data Dipanwita Dasgupta University of Notre Dame...

Post on 06-Jan-2018

Computation Time Analysis - Climate Reanalysis Data

Dipanwita Dasgupta
University of Notre Dame
Graduate Operating Systems

Motivation

• Climate analysis: why is it important?
  - Increase in the occurrence of climate hazards

• Climate reanalysis data
  - Data-centric approach
  - Climate network

Slide 2

Dataset

• National Centers for Environmental Prediction / National Center for Atmospheric Research (NCEP/NCAR) Reanalysis Dataset
  - Composed of data at 17 pressure levels
  - Total of approximately 10000 grid points
  - Factors affecting climate

Slide 3

Background

• Climate Network Model
  - Limited to 7 factors affecting climate
  - Affects the predictive modeling
  - Computation time

Slide 4

Problem

• Computation has 3 steps
  1. Reading the data from file
  2. Calculation at each level
  3. Combining the results

• Step 2 is highly computation-intensive

• The present code can only handle 20 units of data at a time
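The three steps can be sketched as a small sequential pipeline. This is only an illustration: the slides do not show the actual code, so `store` (an in-memory stand-in for the data file), the function names, and the mean used as the per-level calculation are all assumptions.

```python
def read_level(store, level):
    """Step 1: read one level's data.

    `store` is a hypothetical in-memory stand-in for the data file.
    """
    return store[level]


def calculate(values):
    """Step 2: placeholder per-level calculation (here, simply the mean)."""
    return sum(values) / len(values)


def combine(per_level):
    """Step 3: combine the per-level results into one mapping."""
    return dict(per_level)


def run_sequential(store):
    """Run steps 1 and 2 level by level, then step 3 once at the end."""
    per_level = []
    for level in sorted(store):
        values = read_level(store, level)
        per_level.append((level, calculate(values)))
    return combine(per_level)
```

In this shape, step 2 runs once per level inside the loop, which is where most of the time would go if that calculation is expensive.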

Slide 5

Slide 6

Actual Work

• Analyzed time taken to run on a single machine

• Distributed framework
  - Steps 1 and 2 from the previous slide are independent for each level
  - Ran in a distributed fashion
  - Used the CRC SGE machine
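Because the per-level work is independent, it can be dispatched concurrently and combined afterwards. The sketch below uses a local thread pool purely as a stand-in for the cluster; the talk submitted jobs to SGE instead, and `process_level` with its placeholder data is hypothetical. For CPU-bound Python code, threads would not give a real speed-up, so this shows only the structure.

```python
from concurrent.futures import ThreadPoolExecutor


def process_level(level):
    """Hypothetical stand-in for steps 1 and 2 at one pressure level."""
    data = [level * k for k in range(1, 5)]  # placeholder for reading the file
    return level, sum(data) / len(data)      # placeholder for the calculation


def run_levels(levels, max_workers=4):
    """Run the independent per-level work concurrently, then combine (step 3)."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(process_level, levels))
    return dict(results)
```

No level needs another level's output, so the only synchronization point is the final combine step.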

Slide 7

Assumptions

• Used only one parameter: geopotential height

• Used only one measure of dispersion: Euclidean distance

• Processing is similar for other parameters and for other measures of dispersion
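As a reminder of the measure used, the Euclidean distance between two equal-length series can be computed as below. Treating each grid point as a series of geopotential height values is an assumption about how the measure is applied; the slides do not spell this out.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length numeric sequences."""
    if len(a) != len(b):
        raise ValueError("series must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
```

For example, `euclidean_distance([0, 0], [3, 4])` gives `5.0`.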

Slide 8

Experimental Set-up

• NCEP Reanalysis Dataset

• 20 units of longitude

• Sequential execution
  - Used the school workstation desktop

• Distributed framework
  - Used opteron.crc.nd.edu

Slide 9

Distributed Framework: Setup

• opteron.crc.nd.edu

• Submitted a Bash script

• Ran 10 simulations per level

• Took the average
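Averaging over repeated runs can be sketched as follows. The helper name `average_runtime` and the use of `time.perf_counter` are assumptions for illustration; the slides do not say how the timings were actually collected.

```python
import time


def average_runtime(func, runs=10):
    """Call `func` `runs` times and return the mean wall-clock time in seconds."""
    total = 0.0
    for _ in range(runs):
        start = time.perf_counter()
        func()
        total += time.perf_counter() - start
    return total / runs
```

Averaging smooths out run-to-run noise, such as other jobs sharing the same machine.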

Slide 10

Slide 11

Slide 12

Speedup

Slide 13

Results Analysis

• Distributed Framework works better than Sequential Execution

• Expected speed-up not achieved
  - Reading data from the file took more time than expected
  - Reduced time for the other steps
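One way to see why a slow read step caps the speed-up is Amdahl's law. The sketch below assumes the read time does not shrink as more machines are added (for example, if all jobs read from a shared file system), so it behaves like a serial fraction. That is an assumption, not a finding from the talk, and any numbers passed in are hypothetical.

```python
def speedup(t_sequential, t_distributed):
    """Observed speed-up: sequential time divided by distributed time."""
    return t_sequential / t_distributed


def amdahl_speedup(serial_fraction, workers):
    """Upper bound on speed-up when `serial_fraction` of the work cannot be parallelized."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / workers)
```

Under this model, if half of the run time were spent in a step that does not scale, 17 workers (one per pressure level) could give at most a speed-up of about 1.9 rather than 17.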

Slide 14

Future Work

• Optimize reading data from file

• Try various file systems (NFS/AFS)

• Include more measures of dispersion

• Increase the number of parameters

Slide 15

Questions?

Slide 16