Florida State University Libraries
Electronic Theses, Treatises and Dissertations The Graduate School
2011
Parallel Grid Generation and Multi-Resolution Methods for Climate Modeling Applications
Douglas W. (Douglas William) Jacobsen
THE FLORIDA STATE UNIVERSITY
COLLEGE OF ARTS AND SCIENCES
PARALLEL GRID GENERATION AND MULTI-RESOLUTION METHODS FOR
CLIMATE MODELING APPLICATIONS
By
DOUGLAS W. JACOBSEN
A Dissertation submitted to the Department of Scientific Computing
in partial fulfillment of the requirements for the degree of
Doctor of Philosophy
Degree Awarded: Summer Semester, 2011
The members of the committee approve the dissertation of Douglas W. Jacobsen
defended on June 14th, 2011.
Max Gunzburger, Professor Directing Thesis
Doron Nof, University Representative
Janet Peterson, Committee Member
Gordon Erlebacher, Committee Member
Michael Navon, Committee Member
John Burkardt, Committee Member
Todd Ringler, Committee Member
Approved:
Max Gunzburger, Chair, Department of Scientific Computing
Joseph Travis, Dean, College of Arts and Sciences
The Graduate School has verified and approved the above-named committee members.
I would like to dedicate this dissertation to my loving and supportive wife who helped me significantly through all of my school work. Also, I would like to thank my parents and brothers for their continued support.
ACKNOWLEDGMENTS
I would like to thank Dan Voss, Geoff Womeldorff, Mark Peterson, Michael Duda,
and Phil Jones for many useful discussions. The work contained in this dissertation
was supported by the US Department of Energy under grant numbers DE-SC0002624
and DE-FG02-07ER64432.
TABLE OF CONTENTS
List of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vii
List of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . viii
Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiii
1 Introduction 1
1.1 Personal Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2 Parallel SCVT Generator Background 9
2.1 Delaunay Triangulations . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.2 Voronoi Tessellations . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.3 Stereographic Projections . . . . . . . . . . . . . . . . . . . . . . . . 11
2.4 Parallel Algorithm Details . . . . . . . . . . . . . . . . . . . . . . . . 14
2.4.1 Convergence Criteria . . . . . . . . . . . . . . . . . . . . . . . 21
2.4.2 Initial Conditions . . . . . . . . . . . . . . . . . . . . . . . . . 21
3 Parallel SCVT Generator Results 23
3.1 Quasi-Uniform Results . . . . . . . . . . . . . . . . . . . . . . . . . . 23
3.2 Variable Resolution Results . . . . . . . . . . . . . . . . . . . . . . . 29
3.3 Grid Generator Performance . . . . . . . . . . . . . . . . . . . . . . . 34
4 Numerical Model Background 36
4.1 Shallow-Water Equations and Numerical Method . . . . . . . . . . . 37
4.2 Shallow-Water Test Cases . . . . . . . . . . . . . . . . . . . . . . . . 42
4.2.1 Non-linear Geostrophic Flow (TC2) . . . . . . . . . . . . . . . 42
4.2.2 Zonal Flow Over an Isolated Mountain (TC5) . . . . . . . . . 43
4.2.3 Barotropic Instability (BTI) . . . . . . . . . . . . . . . . . . . 44
5 Numerical Model Results 45
5.1 Shallow-Water Model Setup . . . . . . . . . . . . . . . . . . . . . . . 45
5.2 Shallow Water Test Case Results . . . . . . . . . . . . . . . . . . . . 47
5.2.1 Shallow Water Test Case 5 . . . . . . . . . . . . . . . . . . . . 47
5.2.2 Shallow Water Test Case 2 . . . . . . . . . . . . . . . . . . . . 57
5.2.3 Barotropic Instability Test Case . . . . . . . . . . . . . . . . . 60
6 Adaptive Mesh Refinement Background 63
6.1 AMR Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
6.2 SCVT-AMR Framework . . . . . . . . . . . . . . . . . . . . . . . . . 64
7 Adaptive Mesh Refinement Results 73
7.1 642 Point Suite . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
7.2 2562 Point Suite . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
8 Discussion 87
8.1 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
Biographical Sketch . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
LIST OF TABLES
3.1 Timing results for MPI-SCVT with bisection and Monte Carlo initial conditions and the speedup of bisection relative to Monte Carlo initial conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
3.2 Comparison of STRIPACK with Serial and Parallel versions of MPI-SCVT using final triangulations . . . . . . . . . . . . . . . . . . . . . . 25
3.3 Comparison of STRIPACK with serial and parallel versions of MPI-SCVT using per iteration triangulations . . . . . . . . . . . . . . . . . 26
3.4 Timings based on the domain decomposition used. Uniform uses a coarse quasi-uniform SCVT to define region centers and their associated radii, and sorts using a simple dot product. x16 uses a coarse x16 SCVT to define region centers and their associated radii, and sorts using a simple dot product. Voronoi uses a coarse x16 SCVT to define region centers and their associated radii, and sorts using a Voronoi cell based sort. . . 31
5.1 Table of grid sizes and spacings for quasi-uniform grids used in shallow-water exploration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
5.2 Minimum values and grid spacing factors . . . . . . . . . . . . . . . . . 46
5.3 Approximate mesh resolutions (km) of the fine-mesh (dx_f) and coarse-mesh (dx_c) regions of the global domain for the x1 through x16 meshes as a function of the number of grid points. . . . . . . . . . . . . . . . 46
7.1 Error norms associated with the suite of AMR meshes based on the 642 grid point reference mesh. Presented are L2 and L∞ norms of the error in the thickness field, compared to a T511 reference simulation . . . . . 77
7.2 Error norms for AMR grids based on 2562 grid point reference mesh. L2 and L∞ norms are computed with the thickness field relative to a T511 simulation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
LIST OF FIGURES
2.1 Cross-sectional illustration of a stereographic projection from a sphereinto a tangent plane. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.2 Domain Decomposition Example. Figure 2.2(a) is an SCVT used for a 12 processor domain decomposition, where Figure 2.2(b) is a 10242 generator Delaunay triangulation computed using the 12 generator SCVT for parallelization. Each colored ring represents a region's radius Rk, where region centers Tk are the Voronoi cell centers, at the center of each pentagonal structure in Figure 2.2(a). . . . . . . . . . . . . . . . . . . . 18
2.3 Triangulations in a plane after stereographic projection. 2.3(a) is the triangulation before (2.8) is applied, and 2.3(b) is after it is applied . . 19
2.4 Triangle division used for integrating Voronoi cells using only the Delaunay triangulation without any adjacency information. Kite sections contribute to the Voronoi cell centered at the vertex that is part of the kite. A, B, C vertices are generators in the point set, where the point at the center of the triangle is the circumcenter of this triangle. Triangular regions that are colored similarly contribute to the same vertex. . . . . 20
3.1 Timings for a STRIPACK based SCVT Generator at 162, 642, 10242, 40962 and 163842 generators. The red solid line represents the time spent in STRIPACK computing a triangulation, where the green dashed line represents the time spent integrating the Voronoi cells outside of STRIPACK in one iteration of Lloyd's algorithm. Timings in this figure were computed using an Intel Core 2 Duo T8100 CPU with 3GB of RAM. 24
3.2 Timings for various portions of MPI-SCVT using 2 processors and 2 regions. As the problem size increases, the slope of both the triangulation (Red-Solid) and the integration (Green-Dashed) remains constant. The triangulation does not become more expensive than the integration until after roughly 163842 generators, as compared to Figure 3.1 where triangulation was more expensive after only 2562 generators. Also, a triangulation using 2621442 generators costs roughly the same using MPI-SCVT and 2 processors as a triangulation using 163842 generators in STRIPACK. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
3.3 Timing results from MPI-SCVT vs. number of processors. Problem size is held constant as parallelization is increased. Red solid lines represent the cost of computing a triangulation, where green dashed lines represent the cost of integrating all Voronoi cells, and blue dotted lines represent the cost of communicating each region's updated point set to its neighbors. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
3.4 Density function that creates a grid with resolutions that differ by a factor of 16 between the coarse and the fine region. The maximum value of the density function is 1, where the minimum value is (1/16)^4. . . . 29
3.5 Figures show a variable resolution grid created using a density function with the format defined in (3.1). All three figures are of the same grid; only the viewing perspective is changed. Figure 3.5(a) shows the coarse region of the grid, 3.5(b) shows the transition region of the grid, and 3.5(c) shows the fine region of the grid. . . . . . . . . . . . . . . . . . . 32
3.6 Number of points each processor has to triangulate. 3.6(a) uses a quasi-uniform SCVT for its decomposition, with a simple dot product. 3.6(b) uses an x16 SCVT for its decomposition, with a simple dot product. 3.6(c) uses an x16 SCVT for its decomposition, with a more complicated sort based on the region's Voronoi diagram. . . . . . . . . . . . . . . . 33
3.7 Scalability results based on number of generators. Green is a linear reference, where Red is the speedup computed using the parallel version of MPI-SCVT against a serial version . . . . . . . . . . . . . . . . . . 35
4.1 Four members of a family of meshes constructed from (3.1). Each mesh uses 2562 grid points; the meshes differ only in the setting of the parameter γ. x1, x2, x4 and x16 are shown in the top-left, top-right, bottom-left and bottom-right, respectively. . . . . . . . . . . . . . . . . 40
4.2 C-grid staggering of variables for the finite-volume scheme used in MPAS. Fluid thickness, topography, and kinetic energy are stored at Voronoi cell centers. The normal component of the velocity field is defined at the mid-point of line segments connecting cell centers. Vorticity related fields such as relative, absolute, and potential vorticity are stored at Voronoi cell vertices. Derived fields h_e, q_e, and F⊥_e must be reconstructed at each velocity point. . . . . . . . . . . . . . . . . . . . 41
5.1 The fluid height, h_i + b_i, at day 15 for TC5. Starting at the upper left and moving clockwise shows results from the X1, X2, X16 and X4 meshes using 40962 cells. The black oval denotes the location of the mountain. The figures are generated by filling each Voronoi cell with a single color, i.e. there is no interpolation due to rendering. This allows the coarse-mesh grid cells to be seen in the X4 and X16 simulations. All results are plotted with an identical color scheme with a maximum of 5975 m and a minimum of 5025 m. . . . . . . . . . . . . . . . . . . . . 52
5.2 Log10 of the relative change in available total energy for TC5 as a function of time for the x1, x2, x4, x8 and x16 meshes with 40962 grid points. 53
5.3 Globally averaged potential enstrophy as a function of time for x1, x2, x4, x8, and x16 meshes with 40962 grid points. Simulations are run for 15 days. Figures show decreasing potential enstrophy for x1 and x2 meshes, and increasing potential enstrophy for x4, x8, and x16 meshes. 54
5.4 Log10 of the relative change in available potential enstrophy for TC5 as a function of time for the x1, x2, x4, x8 and x16 meshes with 40962 grid points. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
5.5 The L2 error of the thickness field at day 15 for TC5 shown for the x1, x2, x4, x8 and x16 meshes. Figure 5.5(a) shows errors as a function of number of generators, and Figure 5.5(b) shows errors as a function of coarse-mesh grid spacing. Error norms are computed against a T511 reference solution. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
5.6 The L2 error of the thickness field at day 12 for TC2 for the x1, x2, x4, x8 and x16 meshes. Figure 5.6(a) shows errors as a function of number of generators, and Figure 5.6(b) shows errors as a function of coarse-mesh grid spacing. Error norms are computed against the analytic initial conditions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
5.7 Each panel depicts the relative vorticity field at day 6 for a barotropically-unstable jet using 655362 cells. The panels differ only in the mesh used in the simulation. The vertical extent of each panel covers the northern hemisphere. The horizontal extent covers all longitudes starting at -90 degrees such that the fine-mesh region is approximately centered on each panel. The color scales are identical for every panel and saturate at ±1.0 × 10^-4. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
6.1 Density field obtained after one simulation day using the relative vorticity field from shallow-water test case 5, on an x1 2562 generator grid, corresponding to the first four steps in Algorithm 3. Figure 6.1(a) has no smoothings applied, Figure 6.1(b) has 16 smoothings applied, Figure 6.1(c) has 64 smoothings applied, and Figure 6.1(d) has 128 smoothings applied. The smoothing operator is defined in (6.2). Red represents the minimum, where blue represents the maximum. To show transitions, color represents log2(ρ^(1/4)) . . . . . . . . . . . . . . . . . . . . . . . 71
6.2 Three triangles with subdivision based on density values. Figure 6.2(a) shows a triangle whose density value is 1^4, providing no divisions. Figure 6.2(b) shows a triangle whose density value is 2^4, providing one division. Figure 6.2(c) shows a triangle whose density value is 4^4, providing two divisions . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
7.1 AMR grids based on a 642 grid cell quasi-uniform grid. Color represents cell area, where Red is the minimum area and Purple is the maximum area. Presented are grids with 0, 16, 64, and 128 iterations of Laplacian smoothing applied. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
7.2 Reference data fields for 642 quasi-uniform mesh. Shallow-water test case 5 was simulated for 1 day; plotted in Figure 7.2(a) is the fluid thickness field, Figure 7.2(b) is the potential vorticity field, and Figure 7.2(c) is the relative vorticity field. . . . . . . . . . . . . . . . . . . . . 75
7.3 Thickness fields from the 642 suite of AMR meshes. Figure 7.3(a) shows the thickness field from an unsmoothed AMR mesh. Figure 7.3(b) shows the thickness field from a mesh with 16 smoothings. Figure 7.3(c) shows the thickness field from a mesh with 64 smoothings. Figure 7.3(d) shows the thickness field from a mesh with 128 smoothings. . . . . . . . . . . 76
7.4 Potential vorticity fields from the 642 suite of AMR meshes. Figure 7.4(a) shows the potential vorticity field from an unsmoothed AMR mesh. Figure 7.4(b) shows the potential vorticity field from a mesh with 16 smoothings. Figure 7.4(c) shows the potential vorticity field from a mesh with 64 smoothings. Figure 7.4(d) shows the potential vorticity field from a mesh with 128 smoothings. . . . . . . . . . . . . 78
7.5 Relative vorticity fields from the 642 suite of AMR meshes. Figure 7.5(a) shows the relative vorticity field from an unsmoothed AMR mesh. Figure 7.5(b) shows the relative vorticity field from a mesh with 16 smoothings. Figure 7.5(c) shows the relative vorticity field from a mesh with 64 smoothings. Figure 7.5(d) shows the relative vorticity field from a mesh with 128 smoothings. . . . . . . . . . . . . . . . . . . . . . . . 79
7.6 AMR grids based on a 2562 grid cell quasi-uniform grid. Color represents cell area, where Red is the minimum area and Purple is the maximum area. Presented are grids with 0, 16, 64, and 128 iterations of Laplacian smoothing applied. . . . . . . . . . . . . . . . . . . . . . . . 82
7.7 Reference data fields for 2562 quasi-uniform mesh. Shallow-water test case 5 was simulated for 1 day; plotted in Figure 7.7(a) is the fluid thickness field, Figure 7.7(b) is the potential vorticity field, and Figure 7.7(c) is the relative vorticity field. . . . . . . . . . . . . . . . . . . . . 83
7.8 Thickness fields from the 2562 suite of AMR meshes. Figure 7.8(a) shows the thickness field from an unsmoothed AMR mesh. Figure 7.8(b) shows the thickness field from a mesh with 16 smoothings. Figure 7.8(c) shows the thickness field from a mesh with 64 smoothings. Figure 7.8(d) shows the thickness field from a mesh with 128 smoothings. . . . . . . 84
7.9 Potential vorticity fields from the 2562 suite of AMR meshes. Figure 7.9(a) shows the potential vorticity field from an unsmoothed AMR mesh. Figure 7.9(b) shows the potential vorticity field from a mesh with 16 smoothings. Figure 7.9(c) shows the potential vorticity field from a mesh with 64 smoothings. Figure 7.9(d) shows the potential vorticity field from a mesh with 128 smoothings. . . . . . . . . . . . . 85
7.10 Relative vorticity fields from the 2562 suite of AMR meshes. Figure 7.10(a) shows the relative vorticity field from an unsmoothed AMR mesh. Figure 7.10(b) shows the relative vorticity field from a mesh with 16 smoothings. Figure 7.10(c) shows the relative vorticity field from a mesh with 64 smoothings. Figure 7.10(d) shows the relative vorticity field from a mesh with 128 smoothings. . . . . . . . . . . . . . . . . . . 86
ABSTRACT
Spherical centroidal Voronoi tessellations (SCVT) are used in many applications in a
variety of fields, one being climate modeling. They are a natural choice for spatial dis-
cretizations on the surface of the Earth. New modeling techniques have recently been
developed that allow the simulation of ocean and atmosphere dynamics on arbitrarily
unstructured meshes, including SCVTs. Creating ultra-high resolution SCVTs can
be computationally expensive. A newly developed algorithm couples current algo-
rithms for the generation of SCVTs with existing computational geometry techniques
to provide the parallel computation of SCVTs and spherical Delaunay triangulations.
Using this new algorithm, computing spherical Delaunay triangulations achieves a
speedup on the order of 4000 over other well-known algorithms when using 42 processors.
As mentioned previously, newly developed numerical models allow the simulation
of ocean and atmosphere systems on arbitrary Voronoi meshes, providing a multi-
resolution modeling framework. A multi-resolution grid allows modelers to give
areas of interest higher resolution in the hope of increasing accuracy. However,
one method of providing higher resolution lowers the resolution in other areas
of the mesh, which could potentially increase error. To determine the effect of multi-
resolution meshes on numerical simulations in the shallow-water context, a standard
set of shallow-water test cases is explored using the Model for Prediction Across
Scales (MPAS), a new modeling framework jointly developed by the Los Alamos
National Laboratory and the National Center for Atmospheric Research.
An alternative approach to multi-resolution modeling is Adaptive Mesh Refine-
ment (AMR). AMR typically uses information about the simulation to determine
optimal locations for degrees of freedom; however, standard AMR techniques are not
well suited for SCVT meshes. In an effort to solve this issue, a framework is developed
to allow AMR simulations on SCVT meshes within MPAS.
The research contained in this dissertation ties together a newly developed
parallel SCVT generator with a numerical method for use on arbitrary Voronoi
meshes. Simulations are performed within the shallow-water context. New algorithms
and frameworks are described and benchmarked.
CHAPTER 1
INTRODUCTION
Modeling the Earth’s climate has been considered a grand-challenge problem due to
the broad range of spatial and temporal scales required for robust simulation of its
subcomponents. For example, the climate of the ocean is controlled by both basin
scales of motion, O(10^4) km, and sub-mesoscale processes with O(10^-1) km scales
[2]. These scales are highly interacting, as is typical of nonlinear systems, in that the
O(10^4) km global scales modify and are modified by the O(10^-1) km local scales. For
robust simulation of the climate, an accurate representation of the smallest scales is a
requirement based on this strong inter-scale dependence. This broad scale interaction
is present in both the atmosphere and the ocean creating a difficulty in accurately
simulating the full climate system.
One major deficiency in climate modeling today is resolving small-scale processes.
These processes are typically resolved in one of two ways: parameterization or
direct simulation. Direct simulation is computationally expensive because it requires a
high enough spatial resolution to resolve even the smallest-scale processes. Currently
the computational resources available are not sufficient to directly simulate all scales
associated with the fundamental processes in the atmosphere and ocean, such as
clouds and ocean eddies [22]. As an alternative to direct simulation many models use
parameterizations of processes. However, parameterizing a process can be extremely
difficult because it requires an a priori knowledge of the cross-scale interaction of the
process. This requires developers to have a greater understanding of the underlying
physics associated with the physical process than those trying to perform direct sim-
ulations of the same process. Although parameterizations are indispensable tools, the
underlying difficulty in developing accurate parameterizations leads climate modelers
to increase model resolution, therefore allowing more direct simulations of small-scale
processes.
A novel technique in climate modeling is explored as part of this dissertation. This
new technique, referred to as a multi-resolution method, is complementary to three
existing branches of research that are active in the climate modeling community to-
day. The first is global ultra high-resolution climate system modeling [18]. Global
ultra high-resolution climate modeling attempts to pair ultra high-resolution climate
systems with state-of-the art high performance computing systems to achieve simula-
tions at unprecedented resolution. However, this approach has a disadvantage in that
reducing horizontal grid spacing by a factor of two typically requires a factor of 2^3
increase in computing resources, as longitude, latitude, and time each account for
a factor of 2 individually. This example ignores any extra expense from increases
in vertical resolution. Based on this significant increase in computational expense,
it is clear that global ultra-high resolution simulations are only able to represent a
small portion of all the simulations performed.
The second approach, intended to circumvent global high-resolution climate mod-
eling, is called limited-area climate modeling. Limited-area climate modeling has
been explored over the last two decades [12, 19, 34]. Typically this approach uses a
high-resolution mesh only over an area of interest, thus only spanning a portion of the
sphere. Utilizing a limited-area mesh reduces the computational requirements signif-
icantly; however one-way, non-interactive lateral boundary conditions are required.
Typically these lateral boundary conditions are obtained from either reanalysis data
or coarse-resolution global climate simulations.
The third approach currently being explored is referred to as multi-scale modeling.
Multi-scale modeling couples models at different scales to create a full simulation.
Previously, multi-scale modeling has been investigated with respect to atmospheric
modeling [13]; however, a preliminary exploration of this method with regard to ocean
modeling is in progress [4]. Multi-scale methods are built under the assumption that a
scale separation exists that can be exploited in modeling the physical system, meaning
the fine-scale and coarse-scale processes act on temporal and spatial scales sufficiently
far away from each other. However, this assumption remains unvalidated.
As mentioned previously, the work contained in this dissertation, and in [25],
is intended as a fourth approach; it attempts to address some of the existing compu-
tational challenges in modeling the climate system. This new method is informally
referred to as a multi-resolution approach, and essentially merges traditional global
climate modeling approaches with regional limited-area approaches. A global mod-
eling framework is maintained in multi-resolution simulations in the sense that the
entire spatial extent of the atmosphere and/or ocean is simulated within a single
model; however, arbitrary regions of local mesh refinement are allowed, similar to
limited area or multi-scale methods. A global, conforming mesh is employed simi-
lar to stretched-grid or conformal mapping approaches previously explored [9, 10].
Stretched-grid approaches require a deformation of the mesh through a continuous
mapping, e.g. an increase of resolution in one region requires a decrease of resolution
in other regions. Also, stretched-grid approaches are limited in their ability to place
enhanced resolution in multiple regions. The multi-resolution approach developed in
[25] and explored as part of this dissertation alleviates several of the disadvantages
of stretched-grid methods. However, as with stretched-grid approaches, scale aware
parameterizations need to be developed for use with multi-resolution methods.
Multi-resolution approaches allow one or more regions with significantly higher
grid-resolution than the remainder of the mesh, as can be seen in Figures 3.5 and 4.1.
These meshes can be used to directly simulate processes in high resolution areas, while
parameterizing those same processes in low resolution regions, similar to multi-scale
methods. Following the motivation and requirements in [25], this multi-resolution
method requires two key components: first, a finite-volume method capable of main-
taining conservation properties when implemented on highly non-uniform grids, and
second, a conforming variable-resolution mesh with exceptional mesh-quality charac-
teristics.
Before describing the spatial meshes that are used, the finite-volume scheme capa-
ble of conservative simulations on highly varying meshes is introduced. As described
in [24, 31], a new finite-volume method has been developed which allows the use of
Voronoi meshes to produce robust simulations of rotationally-dominated geophysical
flows. Robust finite-volume techniques used in global atmosphere and ocean models
often showcase their ability to constrain the spurious growth of nonlinear quanti-
ties, such as potential enstrophy and total energy [1]. This challenge is particularly
difficult when implemented on non-uniform meshes. Combining the recent works of
[24, 31] provides a finite-volume approach that allows for the conservation of nonlinear
quantities, even when the underlying mesh is highly variable.
Although results presented in [24, 31] only showcase quasi-uniform meshes, the
numerical method described allows the use of arbitrary Voronoi meshes. As part of
this dissertation the numerical method’s ability to simulate on highly varying meshes
is explored. Jointly developed by the Los Alamos National Laboratory (LANL) and
the National Center for Atmospheric Research (NCAR), the Model for Prediction
Across Scales (MPAS) provides a framework suitable for the rapid prototyping and
development of dynamical cores. LANL has developed a shallow-water and a full
three-dimensional ocean dynamical core for use in MPAS, while NCAR has developed
an atmospheric model. MPAS implements the numerical method described in [24,
31] allowing the simulation on arbitrary Voronoi meshes, and will be used for the
exploration in this dissertation. In order to explore the model’s ability to simulate on
variable resolution meshes, a standard suite of test cases is used in the shallow-water
system as described in [39].
Before describing their use in multi-resolution modeling, a brief history of Voronoi
diagrams is provided. Voronoi diagrams have gone by many different names in the
past, such as Thiessen polygons, Wigner-Seitz unit cells, and Brillouin zones [21].
The use of Voronoi diagrams spans a wide range of applications, from condensed
matter physics to measuring spatially distributed geophysical and meteorological
data. Although their use today is broad, their past use can be traced back to Descartes
in 1644. Dirichlet first derived modern Voronoi diagrams, though only in two-
and three-dimensional spaces. Georgy Fedoseevich Voronoi generalized this work in 1908
to arbitrary dimensions, providing the definition of what we call Voronoi diagrams
today [33].
One version of these Voronoi diagrams, called a Spherical Centroidal Voronoi Tes-
sellation (SCVT), fulfills the requirements of a conforming, variable-resolution mesh.
Recently in climate modeling, Voronoi-like meshing of the sphere has found success
in global atmosphere modeling [14, 32, 35]. Each of these examples motivates the use
of Voronoi-like meshing through the ability to produce high-quality meshes of uni-
form resolution. In addition to the high-quality of Voronoi-like meshes, problematic
grid singularities associated with other meshing approaches are eliminated. Recent
work suggests that even though Voronoi meshes are well suited for uniform spheri-
cal meshes, they are perhaps even more valuable with respect to variable resolution
meshes.
As discussed in Chapter 2, the generation of variable-resolution SCVTs requires
two key components. First, a point-density function must be defined over the sur-
face of the sphere, providing high density in areas of interest. This density function
will help to enforce the variable-resolution nature of the grid. Second, a centroid
constraint must be iteratively enforced in every Voronoi cell. Coupling these two
together allows the creation of general variable-resolution meshes. However, current
algorithms for the generation of SCVTs provide less than desired performance as the
point set increases in size. In an effort to aid multi-resolution modeling, a new al-
gorithm is developed as part of this dissertation to allow the parallel computation
of SCVTs. SCVT generation involves two steps: first, a triangulation step in which
all points are triangulated; second, an integration step that enforces the centroidal
constraint on the Voronoi diagram. In current algorithms, the performance bottleneck
is the triangulation step because of its sequential implementation. Previous
research has attempted to parallelize planar triangulation computations [6]; however,
this work does not directly translate onto the surface of the sphere.
Combining existing computational geometry tools, such as stereographic projections
and domain decomposition, this new algorithm provides the parallel computation of
spherical Delaunay triangulations.
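The iterative centroid enforcement described above is Lloyd's algorithm. The sketch below is a hedged illustration, not the dissertation's MPI-SCVT implementation: it approximates each density-weighted Voronoi-cell centroid by Monte Carlo sampling rather than by the Delaunay-based integration the text describes, and the `density` function shown is a hypothetical example.

```python
# A minimal sketch of one Lloyd iteration on the unit sphere. The density
# function and sample counts are illustrative assumptions, not values from
# this dissertation.
import numpy as np

def random_sphere_points(n, rng):
    """Uniformly distributed points on the unit sphere."""
    v = rng.normal(size=(n, 3))
    return v / np.linalg.norm(v, axis=1, keepdims=True)

def density(p):
    """Hypothetical point-density function: highest near the north pole."""
    return (0.5 * (1.0 + p[:, 2])) ** 4 + (1.0 / 16.0) ** 4

def lloyd_iteration(generators, n_samples=100_000, seed=0):
    """Move each generator to the density-weighted centroid of its Voronoi
    cell, then project the centroid back onto the sphere (the centroidal
    constraint)."""
    rng = np.random.default_rng(seed)
    samples = random_sphere_points(n_samples, rng)
    weights = density(samples)
    # Nearest generator by maximum dot product, i.e. Voronoi cell membership.
    nearest = np.argmax(samples @ generators.T, axis=1)
    new_gens = generators.copy()
    for k in range(len(generators)):
        in_cell = nearest == k
        if in_cell.any():
            centroid = (weights[in_cell, None] * samples[in_cell]).sum(axis=0)
            new_gens[k] = centroid / np.linalg.norm(centroid)
    return new_gens

gens = random_sphere_points(40, np.random.default_rng(1))
for _ in range(5):
    gens = lloyd_iteration(gens)
```

A production generator such as the one developed in this dissertation would replace the sampling step with exact integration over the Delaunay triangulation and would distribute regions of the sphere across processors.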
One method yet to be explored using SCVTs in geophysical simulations is adaptive
mesh refinement (AMR). AMR has previously been explored in the context of the
shallow-water equations [5, 29]. However, the majority of this work uses cubed-sphere
meshes that provide static degrees of freedom. Grid cells are used to represent the root
nodes of quad-trees, providing easily implemented coarsening and refining. Typically,
some criterion is defined to determine whether a grid cell should be coarsened or
refined; however, grid cells are not allowed to become coarser than their initial size.
As the name adaptive mesh refinement implies, refinement of the meshes is performed
adaptively as the simulation progresses. Usually a field of interest, such as relative
vorticity, is used to define the refinement criteria. As the simulation progresses,
the field of interest propagates within the domain, providing new regions that need
to be refined while previously refined regions might need to be coarsened. This
process provides a usable framework typical of standard AMR techniques; however
this method does not translate easily to SCVT meshes. In order to relate AMR
techniques to SCVT meshes and the MPAS framework, a new technique is explored
providing AMR-like grid generation. The tools needed to fully implement an AMR
framework do not yet exist; however, part of the work in this dissertation is intended
to aid the creation of such a framework within the context of SCVTs.
The parallel generation of SCVTs is described in detail in Chapter 2, and results
from the new algorithm are presented in Chapter 3. Background material on the
MPAS model, as well as the test cases used, is provided in Chapter 4. Results from
the exploration of MPAS on variable-resolution meshes are presented in Chapter 5. A
brief background on AMR and the new AMR framework are provided in Chapter 6.
The results of this new AMR framework are presented in Chapter 7. Finally, this
dissertation concludes with a discussion of the presented material in Chapter 8.
1.1 Personal Contributions
This section explains my personal contributions to the work contained in this
dissertation. In Chapters 2 and 3 my personal contributions are as follows:
• Developed and implemented algorithm for parallel computation of spherical
Delaunay triangulations and spherical centroidal Voronoi tessellations;
• Developed and implemented load balancing algorithm for variable resolution
grids;
• Benchmarked algorithm to produce results.
In Chapters 4 and 5 my personal contributions are as follows:
• Wrote software for conversion from a point set and triangulation to MPAS grid;
• Wrote software for visualization of MPAS input/output/restart files;
• Generated all grids for simulations;
• Wrote software for computation of globally averaged diagnostic quantities;
• Wrote initial condition generator for barotropic instability test case;
• Wrote software for computation of global error norms;
• Ran all 25 simulations and computed global error norms.
In Chapters 6 and 7 my personal contributions are as follows:
• Developed AMR framework for SCVT meshes;
• Wrote software for refining SCVT meshes based on a field from the output of
MPAS;
• Wrote software for mapping a density field and smoothing it;
• Wrote software for computation of global error norms;
• Ran all 8 simulations and computed global error norms.
CHAPTER 2
PARALLEL SCVT GENERATOR BACKGROUND
This chapter provides the necessary background for, and a description of, the newly
developed algorithm for the parallel generation of spherical centroidal Voronoi
tessellations created as part of this dissertation. Results for this new grid generator
are presented in Chapter 3.
To begin, constructs required for the definition of SCVTs are described, beginning
with Delaunay triangulations and Voronoi tessellations. Stereographic projections
and their associated properties are then introduced, followed by a detailed description
of the parallel algorithm used for the construction of SCVTs.
2.1 Delaunay Triangulations
A k-simplex is defined as a k-dimensional polytope which is the convex hull of its
k + 1 vertices. For example, a 2-simplex would be a triangle, and a 3-simplex would
be a tetrahedron. A k-simplex is made up of what are referred to as s-faces, where
an s-face is made up of any s + 1 distinct vertices of the k-simplex. For example, a
2-face is a triangular face, a 1-face is an edge, and a 0-face is a vertex.
Given a point set, P , in Rd, the Delaunay triangulation of this point set, D(P ),
is the set of d-simplices such that:
• A point, p, in Rd, is a vertex of a simplex in D(P) if and only if p ∈ P;
• The intersection of two simplices in D(P ), is either the empty set, or a common
face;
• The interior of the circumscribing d-sphere through the d + 1 vertices of a
particular simplex contains no other points from the set P .
If the circumscribing d-sphere has more than d + 1 points lying on its perimeter,
the triangulation is Delaunay, but not unique. The Delaunay triangulation of a point
set defined in Rd is related to the convex hull of the point set when projected onto a
paraboloid in Rd+1 [6].
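The empty-circumcircle condition in the third bullet can be checked with the standard planar in-circle determinant. The following snippet is an illustrative sketch, not part of the dissertation's code; `in_circumcircle` is a hypothetical helper name.

```python
import numpy as np

def in_circumcircle(a, b, c, p):
    """Return True if p lies strictly inside the circumcircle of the
    triangle (a, b, c), given in counterclockwise order.  This is the
    empty-circumcircle test underlying the Delaunay criterion, written
    as the standard lifted (in-circle) determinant."""
    m = np.array([
        [a[0] - p[0], a[1] - p[1], (a[0] - p[0])**2 + (a[1] - p[1])**2],
        [b[0] - p[0], b[1] - p[1], (b[0] - p[0])**2 + (b[1] - p[1])**2],
        [c[0] - p[0], c[1] - p[1], (c[0] - p[0])**2 + (c[1] - p[1])**2],
    ])
    # Positive determinant means p is inside the circumcircle.
    return np.linalg.det(m) > 0.0
```

A Delaunay triangulation is exactly one in which this test fails for every triangle and every other point of the set.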
2.2 Voronoi Tessellations
The dual mesh of a Delaunay triangulation is called the Voronoi tessellation. Given
a set of points, P, called generators, the Voronoi tessellation, V = {V_i}, is defined by

\[ V_i = \{\, x : \|x - x_i\| < \|x - x_j\| \ \ \forall\, j \neq i \,\}, \tag{2.1} \]

where V_i represents a Voronoi cell, and x_i ∈ P and x_j ∈ P represent generators.
This property, called the Voronoi property, states that every point contained inside
a Voronoi cell is closer to its cell generator than to any other generator in the set P .
To be a centroidal Voronoi tessellation, the cell generators x_i are required to be the
centers of mass of their cells, meaning x_i = x_i^*, with x_i^* defined as

\[ x_i^* = \frac{\int_{V_i} x\,\rho(x)\,dx}{\int_{V_i} \rho(x)\,dx}, \tag{2.2} \]
where ρ(x) defines a non-negative point-density function which can be used to create
variable resolution meshes.
The center of mass and the generator of a Voronoi cell are generally not coincident.
The requirement that xi and x∗i be the same can be imposed through one of many
algorithms, such as Lloyd’s algorithm [17]. Lloyd’s algorithm imposes this by iterating
on the point set, moving each generator to its Voronoi cell’s center of mass until they
are identical. Lloyd’s algorithm is more rigorously discussed in [7].
In general, the density function in (2.2) affects the grid spacing of the final SCVT.
If we arbitrarily select two Voronoi cells from a tessellation, and index them i and j,
their grid spacing and density are related as

\[ \frac{h_i}{h_j} \approx \left[\frac{\rho(x_j)}{\rho(x_i)}\right]^{\frac{1}{d'+2}}, \tag{2.3} \]

where d' is the dimension of the simplicial elements in the tessellation, ρ(x_i) is the
density function of (2.2) evaluated at a point x_i ∈ V_i, and h_i is a measure of the
local grid spacing at the point x_i. Though (2.3) is an open conjecture, it has been
supported through many numerical studies; see [25].
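As a quick worked example of (2.3), the hypothetical snippet below evaluates the expected spacing ratio: with d' = 2, a density contrast of (1/16)^4 between fine and coarse regions yields a 16:1 spacing ratio, matching the x16 grids used later in this work.

```python
def spacing_ratio(rho_i, rho_j, d_prime=2):
    """Expected grid-spacing ratio h_i/h_j from (2.3).  For d' = 2
    (triangular elements on a surface) the exponent is 1/4."""
    return (rho_j / rho_i) ** (1.0 / (d_prime + 2))

# Density contrast of (1/16)**4 between the fine region (rho = 1)
# and the coarse region gives a 16:1 spacing ratio.
ratio = spacing_ratio(rho_i=(1 / 16) ** 4, rho_j=1.0)
```

This is why the density function in Chapter 3 uses a floor of (1/16)^4 to produce a factor-of-16 resolution change.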
Replacing all of the constructs defined in Sections 2.1 and 2.2 with their analogous
components on the surface of a sphere yields the spherical counterparts of Delaunay
triangulations and Voronoi tessellations. These spherical versions are used for the
construction of SCVTs, as opposed
to planar CVTs which have been discussed above for simplicity. While planar CVTs
tessellate a 2-dimensional region with polygons, an SCVT tessellates the surface of a
3-dimensional sphere with polygons.
2.3 Stereographic Projections
Stereographic projections are special mappings between the surface of a sphere and
a plane tangent to the sphere. Not only are stereographic projections a conformal
mapping, meaning that angles are preserved, but the projections also preserve circles.
As will be discussed below, preserving circles is a particularly important property of
stereographic projections. Stereographic projections also map the interior of these
circles to the interior of the mapped circles [3, 26]. Preserving circularity implies that
the stereographic projection preserves Delaunay criteria as described in Section 2.1,
because Delaunay triangle circumcircles (along with their interiors) are preserved,
and therefore Delaunay triangulations are preserved. This projection can be used to
compute a triangulation of a portion of the sphere, by allowing the triangulation to
be carried out in the more convenient geometry of the plane.
To define the stereographic projection, we need to define the following quantities,
all in Cartesian coordinates in R3. C is the center of the sphere, typically the origin,
T is the point of tangency (where the projection plane is tangent to the sphere), F is
the focus point, which is a reflection about C of T, and P is a point on the surface
of the sphere. The stereographic projection of P into a point Q in the plane defined
by T is defined by

\[ s = \frac{2\,(\mathbf{C}-\mathbf{F})\cdot(\mathbf{C}-\mathbf{F})}{(\mathbf{C}-\mathbf{F})\cdot(\mathbf{P}-\mathbf{F})}, \tag{2.4} \]
\[ \mathbf{Q} = s\,\mathbf{P} + (1-s)\,\mathbf{F}. \tag{2.5} \]
Figure 2.1 illustrates the stereographic projection, using the variables defined for
(2.4) and (2.5).
For the purposes of this research, it is more useful to define the projection relative
to T rather than F, for reasons that will be explained later. Substituting T = C − F
(and assuming a unit sphere centered at the origin, so that T · T = 1) produces

\[ s = \frac{2}{\mathbf{T}\cdot(\mathbf{P}+\mathbf{T})}, \tag{2.6} \]
\[ \mathbf{Q} = s\,\mathbf{P} + (s-1)\,\mathbf{T}. \tag{2.7} \]
This projection can be used to project from R^d to R^{d−1}, and can be repeated
until d − 1 = 2.
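Equations (2.6) and (2.7) can be sketched directly; the snippet below assumes a unit sphere centered at the origin (so T · T = 1) and uses a hypothetical helper name. A useful sanity check is that the projected point always lands in the tangent plane, i.e. T · Q = 1.

```python
import numpy as np

def stereographic(P, T):
    """Project a point P on the unit sphere onto the plane tangent at T,
    following (2.6)-(2.7); assumes the sphere is centered at the origin,
    so the focus point is F = -T and T . T = 1."""
    s = 2.0 / np.dot(T, P + T)
    return s * P + (s - 1.0) * T

T = np.array([0.0, 0.0, 1.0])      # point of tangency (north pole)
P = np.array([1.0, 0.0, 0.0])      # a point on the equator
Q = stereographic(P, T)            # Q = (2, 0, 1), which satisfies T . Q = 1
```

The check T · Q = 1 follows algebraically: T · Q = s(T · P) + (s − 1) = s(T · (P + T)) − 1 = 2 − 1 = 1.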
Figure 2.1: Cross-sectional illustration of a stereographic projection from a sphere into a tangent plane.
2.4 Parallel Algorithm Details
The parallel algorithm closely follows the layout of Lloyd’s algorithm, with a
few modifications. The key modification is computing a Delaunay triangulation in
parallel, since all other portions are considered embarrassingly parallel. The idea
of computing a planar triangulation in parallel has been discussed for several years
[6]. Typically, such algorithms divide the point set up into smaller regions that can
then be triangulated independently from each other. Each triangulation needs to be
stitched together to form a global triangulation. This stitching, or merge step, is
typically computed serially because it could involve modifying significant portions of
each triangulation if the division was not performed correctly. The merge step is the
main difference between most parallel algorithms. The main benefit of the algorithm
here is that the merge step is done in parallel. To create a spherical triangulation in
parallel, a similar technique is employed as in the planar triangulations.
First, the sphere is divided into N overlapping regions Y_k(T_k, R_k) for k = 1, . . . , N,
each defined by a geodesic arc length R_k and a tangent plane defined by the region's
point of tangency T_k. Each region is owned by an independent processor, and the
regions also have some connectivity, or list of neighbors, defined. On the sphere,
these regions look like overlapping umbrellas, as can be seen in Figure 2.2(a). Each
region (or processor) takes from the global point set, p_i ∈ P, the points that are
inside its region radius, i.e., those with cos^{-1}(T_k · p_i) ≤ R_k.
Keep in mind, this sorting may cause one point to be in several regions, as in Figure
2.2, where Figure 2.2(a) shows an example domain decomposition with 12 regions
that could be used on a set of generators shown triangulated in Figure 2.2(b). Since
the end goal of this algorithm is to compute an SCVT, the regional triangulations do
not need to be merged on every iteration because they overlap.
After a spherical point set, P_k, is determined, the stereographic projection P_k =
S[P_k, T_k] of P_k into the plane defined by the point of tangency T_k is computed.
Because a stereographic projection preserves circles (and their interiors), the projection
also preserves the Delaunay criteria that every triangle’s circumcircle needs to be
empty. The newly projected point set is now triangulated using some planar trian-
gulation algorithm, such as Triangle [28] which is used in this study. If the mapping
from global point index to local point index is appropriately maintained, a simple
map from local index to global index gives the approximate triangulation for the re-
gion on the sphere. One final step is needed to make this the true triangulation for
the region, which is to remove all “non-Delaunay” triangles. The criterion required
for a triangle to be Delaunay in the global triangulation is

\[ \cos^{-1}(\mathbf{T}_k \cdot \mathbf{c}_i) + r_i < R_k, \tag{2.8} \]

where T_k is a region center, R_k is a region radius, r_i is a triangle's geodesic
circumradius, and c_i is the triangle's circumcenter on the sphere.
Since each region is unaware of the triangles and points outside of its radius, only
triangles whose circumcircles are completely contained inside of the region radius Rk
are guaranteed to be Delaunay, as no other points from the point set can be in their
circumcircle. Any triangle whose circumcircle extends outside of its region's radius
may contain points that were not in P_k, and should be discarded from the region's
triangulation, because such a triangle is not guaranteed to adhere to the Delaunay criterion
for the entire point set. Figure 2.3 visualizes this point, where Figure 2.3(a) shows a
projected planar triangulation Pk before removing triangles that do not satisfy (2.8),
and Figure 2.3(b) shows the exact same triangulation after removing these potentially
non-Delaunay triangles. After this step is complete, the regional triangulation is now
exactly Delaunay. After the regional triangulation is computed, the integration step
of Lloyd’s algorithm can begin. The overlapping of regions is key to this portion of
the algorithm, because if the overlap is not large enough some true Delaunay triangles
might not be entirely in at least one region.
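Criterion (2.8) amounts to a simple mask over the regional triangles, assuming each triangle's spherical circumcenter c_i (a unit vector) and geodesic circumradius r_i are already known; the helper below is a hypothetical sketch.

```python
import numpy as np

def keep_delaunay(circumcenters, circumradii, T, R):
    """Apply (2.8): a regional triangle is guaranteed globally Delaunay
    only if its spherical circumcircle (center c_i, geodesic radius r_i)
    lies entirely inside the region, i.e. arccos(T . c_i) + r_i < R.
    Returns a boolean mask over the triangles."""
    dots = np.clip(circumcenters @ T, -1.0, 1.0)
    return np.arccos(dots) + circumradii < R

T = np.array([0.0, 0.0, 1.0])
# One circumcircle centered at the region center, one near the boundary.
centers_c = np.array([[0.0, 0.0, 1.0],
                      [np.sin(0.45), 0.0, np.cos(0.45)]])
mask = keep_delaunay(centers_c, np.array([0.1, 0.1]), T, 0.5)
```

The second triangle's circumcircle reaches a geodesic distance of 0.45 + 0.1 from the region center, exceeding R = 0.5, so it is discarded.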
In Lloyd’s algorithm, after the Delaunay triangulation of the point set is computed,
every Voronoi cell center of mass must be computed by integration, so its generator
can be replaced. This step typically requires the computation of the Voronoi diagram
for a region in addition to the Delaunay triangulation previously computed. However,
some careful geometry can reveal that one doesn’t actually need the Voronoi diagram.
A single triangle from a Delaunay triangulation contributes to the integration of three
different Voronoi cells. As seen in Figure 2.4, if the triangle is split into three kites,
each made up of two edge midpoints, the triangle’s circumcenter, and a vertex of the
triangle, each one contributes to the Voronoi cell associated with the triangle vertex
that is part of the kite. Integrating each kite and updating a portion of the centroid
integral allows one to use only the Delaunay triangulation when computing a CVT or
an SCVT, so that no mesh connectivity needs to be computed on a per-iteration basis.
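A planar, constant-density sketch of this kite decomposition is given below (hypothetical code, not the MPI-SCVT implementation). The three kite areas sum to the triangle's area, and each kite's area and centroid form the triangle's contribution to the corresponding vertex's centroid integral.

```python
import numpy as np

def circumcenter(a, b, c):
    """Circumcenter of a planar triangle (standard closed form)."""
    ax, ay = a; bx, by = b; cx, cy = c
    d = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    ux = ((ax**2 + ay**2) * (by - cy) + (bx**2 + by**2) * (cy - ay)
          + (cx**2 + cy**2) * (ay - by)) / d
    uy = ((ax**2 + ay**2) * (cx - bx) + (bx**2 + by**2) * (ax - cx)
          + (cx**2 + cy**2) * (bx - ax)) / d
    return np.array([ux, uy])

def polygon_area_centroid(verts):
    """Area (absolute) and centroid of a simple polygon, via shoelace."""
    v = np.asarray(verts, dtype=float)
    x, y = v[:, 0], v[:, 1]
    xn, yn = np.roll(x, -1), np.roll(y, -1)
    cross = x * yn - xn * y
    area = 0.5 * cross.sum()                   # signed area
    cx = ((x + xn) * cross).sum() / (6.0 * area)
    cy = ((y + yn) * cross).sum() / (6.0 * area)
    return abs(area), np.array([cx, cy])

def kite_contributions(a, b, c):
    """Split a Delaunay triangle into three kites (vertex, two edge
    midpoints, circumcenter); each kite's (area, centroid) pair is the
    triangle's contribution to that vertex's Voronoi-cell integral,
    assuming constant density."""
    o = circumcenter(a, b, c)
    m_ab = 0.5 * (np.asarray(a) + np.asarray(b))
    m_bc = 0.5 * (np.asarray(b) + np.asarray(c))
    m_ca = 0.5 * (np.asarray(c) + np.asarray(a))
    return {
        "A": polygon_area_centroid([a, m_ab, o, m_ca]),
        "B": polygon_area_centroid([b, m_bc, o, m_ab]),
        "C": polygon_area_centroid([c, m_ca, o, m_bc]),
    }

# Acute triangle, so the circumcenter lies inside and the kites tile it.
kites = kite_contributions((0.0, 0.0), (4.0, 0.0), (2.0, 3.0))
total = sum(a for a, _ in kites.values())      # equals the triangle area
```

Since the kites partition the triangle, the area-weighted centroids also reproduce the triangle's own first moment, which is what makes accumulating them per triangle equivalent to integrating over whole Voronoi cells.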
To make this algorithm parallel, one simply has to ensure that each generator
is only updated by one region. This can be done using one of a variety of domain
decomposition methods. The method used in this particular algorithm uses the set
of generators from a coarse SCVT to define region centers. Each region then updates
only the generators that are inside of its defined Voronoi cell based on (2.1), using
region centers, or points of tangency, Tk as xi and generators pi ∈ P as x. Since
Voronoi cells are non-overlapping, each generator will only get updated by one region.
As mentioned earlier, the overlapping of regions is necessary to ensure that the trian-
gulation of all points contained inside each region’s Voronoi cells is exact. In practice,
a region radius corresponding to the maximum distance to any adjacent region center
allows enough overlap for the triangulation to be exact, and is defined by

\[ R_i = \max_{j=1,\ldots,N} \cos^{-1}(\mathbf{T}_i \cdot \mathbf{T}_j), \tag{2.9} \]
where N is the number of region neighbors, Ti is the region center of interest, Tj is
a neighboring region center, and Ri is the geodesic arc distance for region i.
While this heuristic allows the algorithm to work correctly, it may not be optimal
for variable resolution grids, as some regions might contain many more points
than they need to when they border both a fine and a coarse region.
Once each of the generators is updated, each region needs to transfer its newly
updated points only to its adjacent neighbors, not to all of the active processors. This
limits each processor’s communications to roughly 6 sends and receives, regardless of
the total number of processors used. After this step, the convergence of the grid is
checked, and the iterations continue or stop depending on the result.
(a) 12 Generator SCVT
(b) 10242 Generator Delaunay Triangulation
Figure 2.2: Domain decomposition example. Figure 2.2(a) is an SCVT used for a 12-processor domain decomposition, and Figure 2.2(b) is a 10242-generator Delaunay triangulation computed using the 12-generator SCVT for parallelization. Each colored ring represents a region's radius R_k, where the region centers T_k are the Voronoi cell centers, at the center of each pentagonal structure in Figure 2.2(a).
(a) Before application of (2.8)
(b) After application of (2.8)
Figure 2.3: Triangulations in a plane after stereographic projection. Figure 2.3(a) is the triangulation before (2.8) is applied, and Figure 2.3(b) is after it is applied.
Figure 2.4: Triangle division used for integrating Voronoi cells using only the Delaunay triangulation, without any adjacency information. Kite sections contribute to the Voronoi cell centered at the vertex that is part of the kite. The vertices A, B, C are generators in the point set, and the point at the center of the triangle is its circumcenter. Similarly colored triangular regions contribute to the same vertex.
2.4.1 Convergence Criteria
When checking for convergence, two metrics are used: the L2 norm (2.10) and the
L∞ norm (2.11) of the generator movement, each compared against some tolerance.
If the norm of interest reaches the tolerance, the iteration process is deemed to have
converged. The L∞ norm is stricter, but both norms follow similar convergence paths
when plotted against iteration number. Other grid metrics can be used, such as the
clustering energy [8] in (2.12), but in practice this tends to be less strict, and more
computationally expensive, than generator movement.
\[ L_2 = \frac{\sqrt{\sum_{i=1}^{N_{pts}} \left(x_i^n - x_i^{n+1}\right)^2}}{N_{pts}} \tag{2.10} \]

\[ L_\infty = \max_{i=1,\ldots,N_{pts}} \left|x_i^n - x_i^{n+1}\right| \tag{2.11} \]

\[ CE = \sum_{i=1}^{N_{pts}} \int_{V_i} \rho(x)\,\|x - x_i\|^2\,dx \tag{2.12} \]
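The movement norms (2.10) and (2.11) can be sketched as follows (hypothetical helper, with generators stored as rows of an array):

```python
import numpy as np

def movement_norms(x_old, x_new):
    """L2 and L-infinity norms of generator movement between two Lloyd
    iterations, following (2.10) and (2.11)."""
    d = np.linalg.norm(x_new - x_old, axis=1)  # per-generator movement
    l2 = np.sqrt((d ** 2).sum()) / len(x_old)
    linf = d.max()
    return l2, linf

# One generator moves by 0.5, the other is stationary.
x_old = np.array([[0.0, 0.0], [1.0, 0.0]])
x_new = np.array([[0.3, 0.4], [1.0, 0.0]])
l2, linf = movement_norms(x_old, x_new)
```

Either norm can be tested against the tolerance; as noted above, the L∞ norm is the stricter stopping criterion.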
2.4.2 Initial Conditions
A variety of initial conditions can be used in an SCVT generator. The most
obvious choice is Monte Carlo points [20]. These can either be uniformly distributed over the
sphere, to create a quasi-uniform initial condition, or they can be sampled using the
target point-density function, to potentially reduce the number of iterations required
for convergence. In addition to using Monte Carlo initial conditions, one can use a
bisection method to build fine grids from a coarse grid [14]. To create a bisection
grid, a coarse grid is first converged using as few points as possible. Once this coarse
grid has converged, the midpoint of every Voronoi cell edge, or of every Delaunay
triangle edge, is added to the set of points. This reduces the overall grid spacing by
roughly a factor of two in every cell. It also makes the point set roughly four times
as large. In addition to Monte Carlo and bisection initial conditions, there are many
other choices that can be used.
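One bisection step might be sketched as follows, assuming the triangulation is given as index triples; each unique edge contributes one midpoint, projected back to the unit sphere (hypothetical code):

```python
import numpy as np

def bisect_refine(points, triangles):
    """One bisection refinement step: add the midpoint of every unique
    Delaunay edge, normalized back onto the unit sphere.  The point
    count roughly quadruples (e.g. 12 -> 42 -> 162 -> 642 -> ...)."""
    pts = [np.asarray(p, dtype=float) for p in points]
    seen = set()
    for tri in triangles:
        for i, j in ((tri[0], tri[1]), (tri[1], tri[2]), (tri[2], tri[0])):
            edge = (min(i, j), max(i, j))
            if edge not in seen:            # each edge only once
                seen.add(edge)
                m = pts[edge[0]] + pts[edge[1]]
                pts.append(m / np.linalg.norm(m))  # project to sphere
    return np.array(pts)

# Regular tetrahedron inscribed in the unit sphere: 4 vertices, 6 edges.
verts = np.array([[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]],
                 dtype=float) / np.sqrt(3.0)
tris = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]
refined = bisect_refine(verts, tris)        # 4 + 6 = 10 points
```

Starting from the icosahedron, repeated application produces the 12, 42, 162, 642, 10242, . . . generator counts seen throughout this chapter.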
CHAPTER 3
PARALLEL SCVT GENERATOR RESULTS
Two different types of grids are presented to show the robustness of this algorithm.
To begin, quasi-uniform meshes are created, followed by more complicated variable
resolution meshes, which cover the entire sphere. This method can also be used to
create limited-area grids on the sphere; however, this is outside the scope of this
dissertation.
All of the results presented below were computed using Florida State University’s
High Performance Computing Facility.
3.1 Quasi-Uniform Results
STRIPACK [23] is an ACM TOMS algorithm that computes Delaunay triangula-
tions on a sphere. STRIPACK is a serial code used as a baseline for comparison in
this study. It is currently one of the few well-known spherical triangulation libraries
available, and is written in Fortran 77. Figure 3.1 shows the performance of STRI-
PACK [23] as the number of generators is increased through bisection as mentioned in
Section 2.4.2. The green dashed line represents the portion of the code that performs
the integration of the Voronoi cells and the red solid line represents the portion of
the code that performs the Delaunay triangulation. It is clear that the majority of
the time per iteration is spent in computing the Delaunay triangulation, and as the
number of generators increases the time spent computing a Delaunay triangulation
grows more rapidly than the time to integrate all Voronoi cells.
Figure 3.1: Timings for a STRIPACK-based SCVT generator at 162, 642, 10242, 40962, and 163842 generators. The red solid line represents the time spent in STRIPACK computing a triangulation, while the green dashed line represents the time spent integrating the Voronoi cells outside of STRIPACK in one iteration of Lloyd's algorithm. Timings in this figure were computed using an Intel Core 2 Duo T8100 CPU with 3GB of RAM.
Since most climate models are shifting towards global high resolution simulations,
the target quasi-uniform grid for this research is a global 15km resolution grid, which
corresponds to 2621442 grid points, or Voronoi cells. Grids created from uniform
Monte Carlo initial conditions and from bisection initial conditions are compared. The time for these grids
to converge to an SCVT with a tolerance of 10−6 in the L2 norm, as in (2.10), is
presented. A threshold of 10^{-6} is the strictest convergence level that the Monte
Carlo grid can attain; for this reason, it is used as the convergence threshold for
this study. The bisection grid, however, can converge well beyond this point.
Table 3.1 shows timing results for the parallel algorithm comparing these two different
options of initial conditions. It is clear from this table that bisection initial conditions
provide a significant speedup in the overall cost to generate a grid: converging a
bisection grid takes roughly 1/20th of the time required for a Monte Carlo grid.
Based on the results presented in Table 3.1, only bisection initial conditions are used
for the following experiments, unless otherwise specified.
Table 3.1: Timing results for MPI-SCVT with bisection and Monte Carlo initial conditions, and the speedup of bisection relative to Monte Carlo initial conditions.

Timed Portion            Bisection (B)  Monte Carlo (MC)  Speedup MC/B
Total Time (ms)              3,526,041        70,581,300         20.01
Triangulation Time (ms)         73,684        21,164,512        287.23
Integration Time (ms)          235,016        12,211,376         51.95
Communication Time (ms)      3,152,376        33,713,473         10.69
Tables 3.2 and 3.3 compare the algorithm described in this dissertation (MPI-
SCVT) with STRIPACK [23], for computing spherical Delaunay triangulations. The
results in these tables compare the cost to compute a single triangulation of a 163842
generator (60km global) grid. Table 3.2 compares STRIPACK with the final triangu-
lation routine in MPI-SCVT. This routine produces a full triangulation of the entire
sphere, and is only called once, at the very end of the grid generation process.
Table 3.2: Comparison of STRIPACK with serial and parallel versions of MPI-SCVT using final triangulations.

Algorithm   Procs  Regions  Time (ms)   Speedup
STRIPACK        1        1  207528.81   Baseline
MPI-SCVT        1        2    9504.02         21
MPI-SCVT       42       42    5663.30         37
Table 3.3 compares STRIPACK with the triangulation routine in MPI-SCVT that
is called on every iteration. The results presented relative to MPI-SCVT in Table 3.3
are averages over 2000 iterations. It is clear from this table that a significant
speedup is achieved over both the serial version of MPI-SCVT and STRIPACK when
using only 42 processors.
As was previously mentioned, the drastic difference between Tables 3.2 and 3.3 is
due to the different algorithms for computing triangulations. While Table 3.2 presents
timings that are directly comparable to STRIPACK, Table 3.3 presents timings more
useful in computing SCVTs.
As a comparison with Figure 3.1, Figures 3.2 and 3.3 present timing graphs made
from MPI-SCVT. From these three plots, it is clear that the time to compute
the Delaunay triangulation does not grow as quickly with problem size as it did
in STRIPACK. Two processors are used because this is the minimum amount of
parallelization that MPI-SCVT supports: at least 2 regions are required because
the stereographic projection has a singularity at the focus point. Eventually, at
around 163842 generators, the triangulation becomes more expensive than
the integration step, at least for 2 processors. Figure 3.2 represents the timings of
MPI-SCVT for 2 regions as the problem size increases. Figure 3.3(a) represents the
timings for a 40962 generator grid, which is a global 120km resolution, where Figure
3.3(b) represents a 163842 generator grid, with a 60km resolution, and Figure 3.3(c)
represents a 2621442 generator grid with a 15km resolution.
Table 3.3: Comparison of STRIPACK with serial and parallel versions of MPI-SCVT using per-iteration triangulations.

Algorithm   Procs  Regions  Time (ms)   Speedup
STRIPACK        1        1  207528.81   Baseline
MPI-SCVT        1        2    3623.09         57
MPI-SCVT       42       42    50.6572       4092
Figure 3.2: Timings for various portions of MPI-SCVT using 2 processors and 2 regions. As the problem size increases, the slope of both the triangulation (red solid) and the integration (green dashed) curves remains constant. The triangulation does not become more expensive than the integration until after roughly 163842 generators, compared to Figure 3.1, where triangulation was more expensive after only 2562 generators. Also, a triangulation using 2621442 generators costs roughly the same with MPI-SCVT on 2 processors as a triangulation using 163842 generators in STRIPACK.
(a) 40962 Generator Timings
(b) 163842 Generator Timings
(c) 2621442 Generator Timings
Figure 3.3: Timing results from MPI-SCVT vs. number of processors, with constant problem size, as parallelization is increased. Red solid lines represent the cost of computing a triangulation, green dashed lines represent the cost of integrating all Voronoi cells, and blue dotted lines represent the cost of communicating each region's updated point set to its neighbors.
3.2 Variable Resolution Results
Variable resolution grids here are only computed using MPI-SCVT. This is done
because STRIPACK performs comparably in both the uniform case and variable
resolution cases. The main issue with regards to variable resolution grids is the
domain decomposition used for MPI-SCVT. For example, a poor choice of domain
decomposition could force the overlap in regions to be significantly larger than it needs
to be. The larger the overlap of regions, the more points each region must needlessly
triangulate. This is especially apparent when using variable resolution grids, as will
be seen later. Because of this, two simple domain decompositions are used on a grid
with a highly varying density function applied, in addition to one, more complicated
domain decomposition method. Timings are presented to determine which performs
better, and gives better load balancing. The density function used to compute the
grids in this section can be seen in Figure 3.4.
Figure 3.4: Density function that creates a grid with resolutions that differ by a factor of 16 between the coarse and the fine region. The maximum value of the density function is 1, and the minimum value is (1/16)^4.
The analytic form of the density function used in Figure 3.4 is defined as

\[ \rho(x_i) = \frac{1-\gamma}{2}\left[\tanh\!\left(\frac{\beta - |x_c - x_i|}{\alpha}\right) + 1\right] + \gamma, \tag{3.1} \]
where xi is constrained to lie on the surface of the unit sphere. This function results
in relatively large values of ρ within a distance β of the point xc where β is measured
in radians and xc is also constrained to lie on the surface of the sphere. The function
transitions to relatively small values of ρ across a radian distance of α. The distance
between xc and xi is computed as |xc − xi| = cos−1(xc · xi) with a range from 0 to
π. Figure 3.5 shows an example grid created using this density function, with x_c set
to be the center of the mountain defined in shallow-water test case number 5 from
[39], with φ_c = 3π/2 and λ_c = π/6 representing longitude and latitude respectively,
γ = (1/16)^4, β = π/6, and α = 0.15, with 10242 generators. This set of parameters
used in (3.1) is referred to as x16.
It was previously mentioned in Section 2.4 that the heuristic used to determine the
region radius does not provide good load balancing with respect to variable resolution
grids. To resolve this issue, a new algorithm was developed. The new algorithm
begins by sorting each point into a Voronoi cell. After all regions have their point
sets, the union of this point set with the neighboring Voronoi cell’s point sets gives
the final point set used. This sort method is more expensive to perform; however, the
better load balancing reduces idle computing time on processors with small loads.
Timings using this new method, in addition to two dot-product-based methods,
for domain decomposition can be seen in Table 3.4. Figure 3.6 shows the number
of points that each processor has to triangulate on a per iteration basis. These
timings and figures were computed using the exact same initial condition, a converged
x16 grid with 163842 generators, and they all used 42 processors and 42 regions.
Timings presented in Table 3.4 are averages over 3000 iterations.
Based on Table 3.4 and Figure 3.6, there is a significant advantage to the Voronoi-based
decomposition: it not only reduces the overall cost per iteration, but also provides a
more balanced load across the processors. Note that the timings in Table 3.4 are taken
relative to processor 0, and, as can be seen in Figure 3.6(a), processor 0 has a very
small load, so the majority of its iteration time is spent waiting for the processors
with large loads to finish; this waiting is included in the Communication column of
the table.
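The Voronoi-cell-based sort might be sketched as follows: each point is owned by the region with the nearest center, and a region's working set is the union of its own cell's points with its neighbors' (hypothetical code, assuming unit vectors and a given neighbor list):

```python
import numpy as np

def voronoi_sort(points, centers, neighbors):
    """Voronoi-cell-based domain decomposition: each point is owned by
    the region with the nearest center (largest dot product on the
    sphere); a region's working set is then its own points plus the
    points of its neighboring regions' cells, supplying the overlap."""
    owner = np.argmax(points @ centers.T, axis=1)   # nearest region center
    own = [np.where(owner == k)[0] for k in range(len(centers))]
    working = []
    for k in range(len(centers)):
        ids = np.concatenate([own[k]] + [own[j] for j in neighbors[k]])
        working.append(np.unique(ids))
    return own, working

# Two antipodal regions that are each other's only neighbor.
centers = np.array([[0.0, 0.0, 1.0], [0.0, 0.0, -1.0]])
points = np.array([[0.0, 0.0, 1.0], [0.0, 0.0, -1.0], [0.6, 0.0, 0.8]])
own, working = voronoi_sort(points, centers, neighbors=[[1], [0]])
```

Because ownership comes from non-overlapping Voronoi cells, each point is updated by exactly one region, while the working sets still overlap enough to keep the regional triangulations exact.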
Table 3.4: Timings based on the domain decomposition used. Uniform uses a coarse quasi-uniform SCVT to define region centers and their associated radii, and sorts using a simple dot product. x16 uses a coarse x16 SCVT to define region centers and their associated radii, and sorts using a simple dot product. Voronoi uses a coarse x16 SCVT to define region centers and their associated radii, and sorts using a Voronoi cell based sort.

Decomposition  Triangulation  Integration  Communication  Iteration  Speedup
Uniform              14.9779      39.3149       2556.971    2611.35     Base
x16                  104.793      276.681        1560.71    1965.56     1.32
Voronoi              98.5482       249.77        288.694    640.472     4.07
(a) Coarse Region (b) Transition Region
(c) Fine Region
Figure 3.5: A variable resolution grid created using a density function of the form defined in (3.1). All three figures show the same grid; only the viewing perspective changes. Figure 3.5(a) shows the coarse region of the grid, 3.5(b) shows the transition region, and 3.5(c) shows the fine region.
(a) Uniform
(b) x16
(c) Voronoi
Figure 3.6: Number of points each processor has to triangulate. 3.6(a) uses a quasi-uniform SCVT for its decomposition, with a simple dot product. 3.6(b) uses a x16 SCVT for its decomposition, with a simple dot product. 3.6(c) uses a x16 SCVT for its decomposition, with a more complicated sort based on the region's Voronoi diagram.
3.3 Grid Generator Performance
To assess the overall performance of MPI-SCVT, some scalability results are presented in Figure 3.7. Figure 3.7(a) shows that this algorithm can easily under-saturate processors; when this happens, communication ends up dominating the overall runtime of the algorithm, as can be seen in Figure 3.3(a), and scalability becomes sub-linear. As the number of generators increases (as seen in Figures 3.7(b) and 3.7(c)), the threshold for becoming under-saturated rises. Currently, communications in the algorithm are done asynchronously using non-blocking sends and receives. The overall communication volume is also reduced by only communicating with a region's neighbors. This is possible because points can only move within a region radius on any two subsequent iterations, and therefore can only move into another region which overlaps the current region. Further efficiency gains could be realized through improvements in the communication and integration algorithms, which could result in linear scaling. In theory, because all of the computation is local, this algorithm should scale nearly linearly up to hundreds, if not thousands, of processors.
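Reducing communication to a region's neighbors relies on a simple geometric test: if each region is a spherical cap with a center and a geodesic radius, two regions need to exchange points only when their caps overlap. The sketch below illustrates that test; it is not code from MPI-SCVT.

```python
import numpy as np

def geodesic_distance(a, b):
    """Great-circle distance between two unit vectors on the sphere."""
    return np.arccos(np.clip(np.dot(a, b), -1.0, 1.0))

def regions_overlap(c1, r1, c2, r2):
    """Spherical caps (center, geodesic radius) overlap iff the distance
    between their centers is less than the sum of their radii."""
    return geodesic_distance(c1, c2) < r1 + r2

# Region centered at the north pole vs. one on the equator (pi/2 apart):
north = np.array([0.0, 0.0, 1.0])
equator = np.array([1.0, 0.0, 0.0])
print(regions_overlap(north, 0.9, equator, 0.9))  # True: 0.9 + 0.9 > pi/2
print(regions_overlap(north, 0.5, equator, 0.5))  # False: 0.5 + 0.5 < pi/2
```

Since a point can move at most a region radius between iterations, any point leaving a region must land in one of these overlapping neighbors, which is why neighbor-only communication is sufficient.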
[Figure 3.7 speedup plots: (a) 40962 Generator Speedup, (b) 163842 Generator Speedup, (c) 2621442 Generator Speedup; each panel plots SpeedUp (Serial/Parallel) against Number of Processors, with a linear reference.]
Figure 3.7: Scalability results based on number of generators. Green is a linear reference, while red is the speedup of the parallel version of MPI-SCVT computed against a serial version.
CHAPTER 4
NUMERICAL MODEL BACKGROUND
Climate models are broken into sub-component models known as dynamical cores. These dynamical cores model the physics associated with various portions of the climate system and are combined in intelligent ways to create full climate models. Dynamical cores can be used to represent the ocean, atmosphere, sea ice, ice sheets, and other components of the climate. The work contained in this dissertation relates specifically to ocean models.
Ocean models typically utilize equations that describe fluid dynamics with full
3-dimensional motion. These equations can be complicated to solve and computationally expensive for certain problems, which makes them less than desirable for model development. For this reason, a simplification of these equations, called the shallow-water equations, is used as a starting point for ocean models.
The shallow-water equations can be used to explore a less expensive system that
can still capture some of the key physical features in the full ocean system. To explore
the capabilities of their models, developers use a shallow-water model coupled with
test cases that showcase the model’s ability to simulate specific physical processes.
For example, [39] provides test cases suitable for modeling anything from advection,
to non-linear geostrophic flow, to flow over an isolated mountain. Combining these
test cases together allows a developer to determine how well their numerical model
performs in specific situations, and benchmark the overall conservation of the numerical method. After the numerical method is explored in the shallow-water context,
it can then be implemented in a full 3-dimensional system to simulate ocean pro-
cesses. Once implemented, similar benchmarks can be performed, though they are
significantly more expensive.
This chapter begins by giving a basic background into the shallow-water equations
and the numerical method used for the research contained in this portion of the
dissertation, followed by an introduction of several test cases which are used in the
shallow water system to benchmark the numerical method.
4.1 Shallow-Water Equations and Numerical
Method
The shallow-water equations are described as follows:

∂h/∂t + ∇ · (hu) = 0,  (4.1)

∂u/∂t + ηk × u = −g∇(h + b) − ∇K,  (4.2)

where h represents the fluid layer thickness and u represents the fluid velocity along the surface of the sphere. The absolute vorticity, η, is defined as k · (∇ × u) + f, and the kinetic energy, K, is defined as |u|²/2. At all points on the surface of the sphere the vector k points in the local vertical direction, and we require k · u = 0 at all points. The three parameters in the system are gravity, g, the Coriolis parameter, f, and the bottom topography, b.
When using the shallow-water equations, four quantities are expected to be conserved: total mass, total energy, potential vorticity, and potential enstrophy. All of these conservation properties are explored in the results section using MPAS.
A more appropriate form of the continuous equations is expressed as:

∂h/∂t + ∇ · F = 0,  (4.3)

∂u/∂t + qF⊥ = −g∇(h + b) − ∇K,  (4.4)

where F = hu, F⊥ = k × hu, and η = hq, where q is the potential vorticity. Using the definition of potential vorticity, potential enstrophy is defined as the thickness-weighted variance of potential vorticity, q²/2.
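These pointwise relations are simple to check numerically. The sketch below uses toy values (not model output) to evaluate the absolute vorticity η = ζ + f, recover q from η = hq, and form the potential enstrophy q²/2:

```python
zeta = 1.0e-5         # relative vorticity, k . (curl u)  [1/s]
f = 1.0e-4            # Coriolis parameter                [1/s]
h = 1000.0            # fluid thickness                   [m]

eta = zeta + f        # absolute vorticity: eta = zeta + f
q = eta / h           # potential vorticity, from eta = h*q
enstrophy = 0.5*q**2  # potential enstrophy (thickness-weighted variance of q)

print(abs(eta - 1.1e-4) < 1e-18)  # True
print(abs(h*q - eta) < 1e-18)     # True: eta = h*q holds by construction
```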
The numerical method used in this research to model the shallow-water system is
discussed at length in [24, 31]. An analysis of the linearized version of (4.1) and
(4.2) is conducted in [31] in order to derive a numerical method that is able to
reproduce stationary geostrophic modes found in the continuous system, even when
the numerical method is implemented on variable resolution meshes such as those
shown in Figure 3.5. [24] extends the analysis to the nonlinear shallow-water equations
shown in (4.3) and (4.4) in order to derive a method that conserves total energy and
potential vorticity while allowing for a physically-appropriate amount of potential
enstrophy dissipation.
The work in this dissertation, and in [25], focuses on variable resolution meshes,
as seen in Figure 4.1, whereas both [24, 31] present results for quasi-uniform meshes,
even though the method is suitable for arbitrary Voronoi tessellations.
The numerical scheme is a standard finite-volume method that makes use of a
C-grid staggering as shown in Figure 4.2.
The discrete approximations of the divergence and gradient operator are shown
in Figure 3 of [24], and are used throughout this derivation.
The thickness field is defined on the Voronoi cells while all vorticity-related fields,
such as relative vorticity, absolute vorticity and potential vorticity, are defined on
the Delaunay triangles. Using a discrete approximation to the divergence operator,
a discrete thickness equation is derived. The equation for the normal-component
velocity is derived by taking the inner product of ne (from Figure 4.2) and (4.4). The
resulting discrete system is expressed as:

∂hi/∂t = −[∇ · Fe]i,  (4.5)

∂ue/∂t + F⊥e qe = −[∇(g(hi + bi) + Ki)]e,  (4.6)

where Fe = he ue represents the mass flux across the edge of a Voronoi cell and F⊥e represents the mass flux across the edge of each Delaunay cell. Ki, he, qe and F⊥e are defined following [24]. Also following [24], the anticipated potential vorticity method [27] is used to dissipate potential enstrophy.
The derivations in [24, 31] provide a numerical method that conserves total energy
to within time-truncation error, conserves total potential vorticity to within machine
round-off error and dissipates potential enstrophy at a rate that depends on a single
parameter. As mentioned previously, this derivation was carried out for use on a
general Voronoi mesh.
In an effort to produce a framework suitable for the rapid prototyping and devel-
opment of dynamical cores, this numerical method has been implemented in a joint
effort between Los Alamos National Laboratory (LANL) and the National Center
for Atmospheric Research (NCAR). This new framework is called the Model for Pre-
diction Across Scales (MPAS). Currently LANL is using this framework to develop
ocean and shallow-water models, and NCAR is developing an atmospheric model.
For the purposes of this dissertation, the ocean and shallow-water models developed
at LANL are used within MPAS.
(a) Quasi-Uniform Grid (x1) (b) Variable Resolution Grid (x2)
(c) Variable Resolution Grid (x4) (d) Variable Resolution Grid (x16)
Figure 4.1: Four members of a family of meshes constructed from (3.1). Each mesh uses 2562 grid points; the meshes differ only in the setting of the parameter γ. x1, x2, x4 and x16 are shown in the top-left, top-right, bottom-left and bottom-right, respectively.
Figure 4.2: C-grid staggering of variables for the finite-volume scheme used in MPAS. Fluid thickness, topography, and kinetic energy are stored at Voronoi cell centers. The normal component of the velocity field is defined at the mid-point of line segments connecting cell centers. Vorticity-related fields such as relative, absolute, and potential vorticity are stored at Voronoi cell vertices. Derived fields, he, qe, and F⊥e, must be reconstructed at each velocity point.
4.2 Shallow-Water Test Cases
The ocean modeling community tests its models using a variety of techniques.
One technique is to apply test problems that showcase various features present in the
ocean, and explore the errors associated with resolving these features. The test prob-
lems can involve anything from advection, to geostrophic flow, to Rossby-Haurwitz
waves. There are several test problems generally accepted by the community for the
testing of an ocean or shallow water dynamical core. One set of these test problems
can be found in [39]. This section describes two test cases defined in [39] that are used to benchmark the MPAS shallow-water dynamical core, along with the test problem defined in [11]. Some additional tests used to benchmark dynamical cores can be seen
in [37, 38].
4.2.1 Non-linear Geostrophic Flow (TC2)
As defined in [39], this test case represents nonlinear geostrophic flow. Geostrophic
flow is an extremely important physical process that naturally occurs in the ocean
and atmosphere. It occurs when the nonlinear Coriolis force balances the horizontal
pressure gradient. This leads to the momentum equation becoming steady state, taking the form

fk × u = −g∇(h + b).  (4.7)
The initial conditions for the geostrophic flow defined for this test case are given by

u = u0 (cos(λ) cos(α) + cos(φ) sin(λ) sin(α)),  (4.8)

v = −u0 sin(φ) sin(α),  (4.9)

gh = gh0 − (aΩu0 + u0²/2) (−cos(φ) cos(λ) sin(α) + sin(λ) cos(α))²,  (4.10)
where λ represents the latitude, φ represents the longitude, Ω represents the rotational rate of the Earth, and α represents the angle between the axis of solid-body rotation and the polar axis, which is taken to be 0.0 in the simulations presented in Chapter 5.
The velocity field described in (4.8) and (4.9) can also be written in stream function form as

ψ = −au0 (sin(λ) cos(α) − cos(φ) cos(λ) sin(α)),  (4.11)

χ = 0.  (4.12)
These stream functions produce purely zonal (u-direction) flow, with no flow in the meridional (v) direction. To define the initial flow field, the stream function (4.11) is sampled at the Delaunay cell points xv, and ue is computed from k × ∇ψ. The thickness field is defined by sampling (4.10) at the Voronoi cell points. Even though errors in ue are present at t = 0, this approach guarantees that the discrete divergence is identically zero at t = 0.
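The divergence-free guarantee follows from a telescoping sum: when the flux through each edge is the difference of the streamfunction values at the edge's two endpoints, the sum around a cell's closed boundary cancels term by term. A toy sketch (a single hypothetical cell, not MPAS code):

```python
import numpy as np

rng = np.random.default_rng(7)

# Streamfunction sampled at the 6 vertices of one (hypothetical) Voronoi cell.
n = 6
psi = rng.normal(size=n)

# With u = k x grad(psi), the outward flux through edge k integrates to the
# difference of psi at the edge's two endpoints.
flux = np.array([psi[(k + 1) % n] - psi[k] for k in range(n)])

# The discrete divergence telescopes to zero around the closed cell boundary.
print(abs(flux.sum()) < 1e-13)  # True, to round-off
```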
4.2.2 Zonal Flow Over an Isolated Mountain (TC5)
Shallow-water test case number 5, as defined in [39], represents zonal flow over an
isolated mountain. This test case begins with the geostrophic flow described in Section 4.2.1; however, at the initial time step a mountain is added to the topography. The center of the mountain is placed at φc = 3π/2, λc = π/6, with a height described by

hs = hs0 (1 − r/R),  (4.13)

where hs0 = 2000 m, R = π/9, and r² = min(R², (φ − φc)² + (λ − λc)²).
The zonal flow interacts with the added mountain, causing gravity and Rossby waves to propagate as the flow adjusts to the presence of the topography. This interaction leads to strong nonlinearity, which makes this test case useful for exploring a numerical method's conservation properties.
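The mountain profile (4.13) translates directly into code. The sketch below (function and variable names are mine) confirms that the height equals hs0 at the mountain center and drops to zero at distance R:

```python
import numpy as np

hs0 = 2000.0                        # peak height (m)
R = np.pi/9                         # mountain radius (radians)
phi_c, lam_c = 3*np.pi/2, np.pi/6   # mountain center

def mountain_height(phi, lam):
    """Conical mountain of TC5: hs = hs0*(1 - r/R), with r clipped at R."""
    r = np.sqrt(min(R**2, (phi - phi_c)**2 + (lam - lam_c)**2))
    return hs0*(1.0 - r/R)

print(mountain_height(phi_c, lam_c))      # 2000.0 at the peak
print(mountain_height(phi_c + R, lam_c))  # ~0.0 at the mountain's edge
```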
4.2.3 Barotropic Instability (BTI)
As defined in [11], this test case starts with a barotropically unstable zonal flow
that includes a simple perturbation added to induce the instability. The perturbation
first causes global gravity waves to propagate around the sphere within a few hours.
Secondly, it creates complex vortical dynamics which develop over a few days. This
test case requires initial conditions of the form

u(φ) = 0 for φ ≤ φ0,
u(φ) = (umax/en) exp[1/((φ − φ0)(φ − φ1))] for φ0 < φ < φ1,
u(φ) = 0 for φ ≥ φ1,  (4.14)

v(φ, λ) = 0,  (4.15)

gh(φ) = gh0 − ∫^φ a u(φ′) [f + (tan(φ′)/a) u(φ′)] dφ′,  (4.16)

where we take umax = 80 m/s, φ0 = π/7, φ1 = π/2 − φ0, and en = exp[−4/(φ1 − φ0)²], and a is the radius of the Earth. In (4.16), h0 is chosen such that the global average sea surface height is 10 km.
(4.14) is then used to derive a stream function, which is sampled at the Delaunay cell locations, as in TC2 in Section 4.2.1, to define the flow field. The height field is generated based on (4.16). After the initial conditions are generated, a perturbation is added to the height field that drives the barotropic instability throughout the system. This perturbation is defined as

h′(λ, φ) = h cos(φ) e^(−(λ/α)²) e^(−((φ2 − φ)/β)²) for −π < λ < π,  (4.17)

where φ2 = π/4, α = 1/3, β = 1/15, and h = 120 m.
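The jet profile (4.14) can be verified numerically: the normalization en is chosen so the jet peaks at exactly umax at the midpoint of (φ0, φ1), and the profile vanishes outside the jet. A sketch (not MPAS initialization code):

```python
import numpy as np

u_max = 80.0                         # m/s
phi0 = np.pi/7
phi1 = np.pi/2 - phi0
e_n = np.exp(-4.0/(phi1 - phi0)**2)  # normalization so the peak is u_max

def u_jet(phi):
    """Zonally symmetric unstable jet of the BTI test case."""
    if phi <= phi0 or phi >= phi1:
        return 0.0
    return (u_max/e_n)*np.exp(1.0/((phi - phi0)*(phi - phi1)))

mid = 0.5*(phi0 + phi1)
print(abs(u_jet(mid) - u_max) < 1e-6)  # True: the jet peaks at u_max
print(u_jet(phi0), u_jet(phi1))        # 0.0 0.0 at the jet edges
```

At the midpoint, (φ − φ0)(φ − φ1) = −((φ1 − φ0)/2)², so the exponent equals −4/(φ1 − φ0)² and the normalization cancels exactly.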
CHAPTER 5
NUMERICAL MODEL RESULTS
5.1 Shallow-Water Model Setup
SCVTs are used, as described in Chapter 2, for the spatial discretizations used in
the finite-volume scheme. These SCVTs are generated using (3.1) as the prescribed
density function. Twenty-five different grids are generated and used in this work; however, only a subset of these is shown. Of the generated grids, 20 are variable resolution and 5
are quasi-uniform. The quasi-uniform grids have grid spacings and generator counts
that can be found in Table 5.1.
Table 5.1: Table of grid sizes and spacings for quasi-uniform grids used in the shallow-water exploration.

Generators   Approx. Grid Spacing
2562         480 km
10242        240 km
40962        120 km
163842       60 km
655362       30 km
The variable resolution grids used have the same number of generators as their quasi-uniform counterparts, and differ only in the value of γ in the density function (3.1). The values used for γ can be found in Table 5.2. In these grids, there is a distinct fine region and a distinct coarse region, connected by a smooth transition region. Based on the parameters used for the density function, the fine region has a radius of π/6 radians from the center of the mountain as defined in TC5, the transition region extends past the fine region another π/9 radians, and the coarse region makes up the remainder of the sphere. γ is varied to give specific factors for the grid spacing between the coarse and fine regions, as can be seen in (2.3). As an example, the x2 grid (seen in Table 5.2) has a factor of 2 between grid spacings in the coarse and fine regions.
Table 5.2: Minimum values and grid spacing factors.

Grid Name   Grid Spacing Factor   Minimum Value
x1          1.0                   1.0
x2          2.0                   0.062500000
x4          4.0                   0.003906250
x8          8.0                   0.000244141
x16         16.0                  0.000015259
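The minimum values in Table 5.2 are consistent with grid spacing scaling as the inverse fourth root of the density, as in (2.3): a coarse-to-fine spacing factor γ corresponds to a minimum density of γ^(-4) when the fine-region density is 1. A quick check:

```python
# Grid spacing scales as density^(-1/4), so a coarse/fine spacing factor
# gamma implies a minimum density of gamma**(-4) (fine-region density is 1).
table_5_2 = {1.0: 1.0, 2.0: 0.062500000, 4.0: 0.003906250,
             8.0: 0.000244141, 16.0: 0.000015259}

for gamma, listed in table_5_2.items():
    computed = gamma**(-4)
    assert abs(computed - listed) < 1e-9   # matches Table 5.2 to its precision
    print(f"x{int(gamma)}: {computed:.9f}")
```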
Holding the number of grid points constant and varying the density function used to create the grids has advantages and disadvantages. In terms of disadvantages, making one region finer requires the rest of the sphere to become coarser, so a x2 2562 grid has cells that are larger than any cell in a x1 2562 grid. As for advantages, the refinement provided in the variable resolution grids can provide an accuracy increase in specific regions, similar to limited-area modeling approaches. Table 5.3 shows the approximate resolutions for the fine and coarse mesh regions when using the described density function. An example of the grids can also be seen in Figure 4.1.
Table 5.3: Approximate mesh resolutions (km) of the fine-mesh (dxf) and coarse-mesh (dxc) regions of the global domain for the x1 through x16 meshes as a function of the number of grid points.

Grid Points   x1 (dxf, dxc)   x2 (dxf, dxc)   x4 (dxf, dxc)   x8 (dxf, dxc)   x16 (dxf, dxc)
2562          (480, 480)      (282, 537)      (196, 737)      (169, 1293)     (163, 2419)
10242         (240, 240)      (141, 269)      (98, 368)       (85, 648)       (81, 1222)
40962         (120, 120)      (70, 134)       (49, 184)       (42, 324)       (40, 611)
163842        (60, 60)        (35, 67)        (25, 92)        (21, 162)       (20, 305)
655362        (30, 30)        (16, 32)        (12, 48)        (10, 78)        (9, 148)
All grid points are generated using the bisection method described in Section 2.4.2. Unique to x1, the grid points are also associated with the recursive bisection-projection of an inscribed icosahedron [14]. This method results in a particularly uniform distribution of grid points and, consequently, a relatively small solution error. This special distribution of nodes is lost when producing the variable-resolution meshes. As a result, a relatively large cost, in terms of global error, is incurred by choosing to move away from the special quasi-uniform meshes, but very little additional cost is incurred by increasing the extent of the mesh variation.
5.2 Shallow Water Test Case Results
Using the MPAS shallow-water dynamical core, three test cases are explored in
this Section, as defined in Sections 4.2.1, 4.2.2, and 4.2.3. The results in this Chapter
are also published as [25].
5.2.1 Shallow Water Test Case 5
The analysis of TC5 is presented first because it offers insight into the conserva-
tion properties of MPAS. TC5 contains a single mountain that is responsible for the
evolution of the system. While the mountain is large in scale, it is still localized and,
in that sense, is well suited for local mesh refinement. All of the meshes depicted in
Figure 4.1 and Table 5.3 enhance resolution in the vicinity of the defined mountain.
TC5 prescribes an analytic initial condition of large-scale geostrophic flow that
would be in steady state, if not for the presence of the mountain. This mountain
is centered at xc and extends π/9 radians in latitude and longitude. As described
in Chapter 3, the variable resolution meshes created are also centered at xc, and the
fine-mesh region extends a distance of π/6 radians, meaning that the fine-mesh region
includes all of the mountain.
To begin, a qualitative assessment of TC5 is presented. Figure 5.1 shows the fluid
height field hi + bi at day 15 for the x1, x2, x4, and x16 meshes, with 40962 cells.
As depicted, all four simulations appear to be identical. This is expected because
the flow is characterized by large-scale Rossby waves that are well resolved on the
coarse-mesh resolutions of all of the 40962 meshes. In the x16 simulation result, the
coarse grid cells can clearly be seen.
Based on the numerical scheme, two quantities are conserved to round-off error in every simulation: the area-weighted global sum of thickness and the volume-weighted potential vorticity. As found throughout the simulation,

∂V/∂t = ∂/∂t Σ_{i=1}^{Ni} hi Ai = 0,  (5.1)

∂/∂t Σ_{v=1}^{Nv} qv hv Av = 0,  (5.2)
to within round-off error in all simulations, where the quantity V represents the total
fluid volume.
In order to evaluate the energetics of the system, the total energy is computed
following [24, Eq. (70)] as
E = Σ_e Ae [he ue²/2] + Σ_i Ai [g hi (hi/2 + bi)] − Er,  (5.3)

where Er represents the unavailable potential energy, and has the form

Er = Σ_i g Hi Ai [Hi/2 + bi],  (5.4)

where

Hi = (Σ_i Ai (hi + bi)) / (Σ_i Ai) − bi  (5.5)
has been subtracted. From now on, "total energy" refers to "total available energy". Er represents the potential energy of the fluid at rest, which is unavailable to the system.
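The bookkeeping in (5.3)–(5.5) can be sketched on toy arrays (random values, not model output). The resting state defined by Hi preserves the total fluid volume, and the potential energy remaining after Er is subtracted, the available potential energy, is non-negative:

```python
import numpy as np

g = 9.80616
rng = np.random.default_rng(0)
n = 16
Ai = rng.uniform(1.0, 2.0, n)        # Voronoi cell areas
bi = rng.uniform(0.0, 100.0, n)      # bottom topography
hi = rng.uniform(900.0, 1100.0, n)   # fluid thickness

# Resting thickness (5.5): flat free surface with the same total volume.
Hi = (Ai*(hi + bi)).sum()/Ai.sum() - bi
# Unavailable potential energy (5.4).
Er = (g*Hi*Ai*(Hi/2 + bi)).sum()

# Potential-energy part of (5.3), with the resting reservoir subtracted.
available_pe = (Ai*g*hi*(hi/2 + bi)).sum() - Er

print(abs((Ai*Hi).sum() - (Ai*hi).sum()) < 1e-6)  # True: volume preserved
print(available_pe > 0.0)                         # True for a non-flat state
```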
Figure 5.2 demonstrates the conservation of the total energy in the simulations. The figures show log10(|E(t) − E(0)|/|E(0)|) over the 15-day integration for the x1, x2, x4, x8 and
x16 meshes with 40962 grid points. At day 15, all solutions conserve total energy to within 1.0 × 10^-8 relative to the total energy present at t = 0. This finding is orders of magnitude better than is required when considering the dissipation mechanisms present in the real atmosphere and ocean [31].
For the total energy to be conserved in a physically-appropriate manner, the nonlinear Coriolis force must neither create nor destroy kinetic energy, and the exchange of energy between its potential and kinetic forms must be equal and opposite. The degree to which the nonlinear Coriolis force is energetically neutral is explored by computing the time it would take for the nonlinear Coriolis force to double the kinetic energy in the system. With 40962 grid points, this doubling time is approximately 10^4 years for all meshes, which is in agreement with Figure 4 of [24].
The other important component in the total energy budget is the conservative exchange of energy between its potential and kinetic forms. The potential and kinetic energy equations each have a source term, and these source terms are equal and opposite (see (15) and (16) of [24]). Following (65) and (67) from [24], the source terms for kinetic and potential energy are explored, respectively. Since these RHS sources are algebraically equivalent in the discrete system, a very high degree of cancellation between them is expected. In all 25 simulations, the time scale for doubling the kinetic energy of the system due to the imperfect cancellation of the KE and PE source terms is approximately 10^10 years; this is essentially machine-precision round-off error.
With regard to conservation, the final quantity of interest is potential enstrophy. Figure 5.4 shows log10(|R(t) − R(0)|/R(0)), where R is the globally-integrated potential enstrophy defined as

R = (1/V) Σ_{v=1}^{Nv} qv² hv Av − Rr.  (5.6)
Potential enstrophy also has an unavailable reservoir, equal to the amount of potential enstrophy that exists when the fluid is at rest. This unavailable reservoir, Rr, is removed from the computation in order to obtain a more representative evaluation of potential enstrophy conservation.
Figure 5.3 shows the globally-averaged potential enstrophy as a function of time over a 15-day simulation for each of the x1, x2, x4, x8, and x16 meshes. Figure 5.4 shows the relative change in globally-averaged potential enstrophy for the x1, x2, x4, x8 and x16 meshes with 40962 nodes. At day 15, the relative changes in globally-averaged potential enstrophy vary between 10^-4 and 10^-2.5 for the x1 and x16 meshes, respectively. In these simulations, the x1 and x2 meshes show a monotonic decrease in globally-averaged potential enstrophy, while the x4, x8, and x16 meshes show a monotonic increase. A scale-aware anticipated potential vorticity method would likely reduce this discrepancy.
In terms of formal L2 global error norms, previous works using local mesh re-
finement with the shallow-water system all find that the solution error is relatively
unchanged when adding resolution in a specific region (e.g. [5, 29, 36]). This means
the solution error appears to be controlled by the coarse region of the mesh when us-
ing static mesh refinement. The global L2 error norm for each of the 25 simulations,
as a function of coarse-mesh resolution, is shown in Figure 5.5(b). Since TC5 does
not have a known analytic solution, error norms are computed with respect to a T511
global spectral model [30]. For TC5 at T511, the global spectral model requires a
scale-selective ∇^4 dissipation of 8.0 × 10^12 m^4/s in order to prevent the accumulation
of energy and potential enstrophy at the grid scale.
Figure 5.5 shows the error norms for TC5. Figure 5.5(a) shows the normalized L2 error as a function of the number of generators, while Figure 5.5(b) shows the error as a function of grid spacing in the coarse-mesh region. Based on Figure 5.5(b), the solution error appears to be controlled by the mesh resolution in the coarse region. All of the simulations show the same convergence rate of approximately 1.5. Note that these error norms are plotted on a log-log scale to emphasize the primary finding that the L2 error is controlled by the coarse-mesh resolution. Looking at the results more closely, it is apparent that the variable resolution meshes provide a small, but measurable, improvement in solution error.
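A convergence rate like the quoted 1.5 is read off a log-log error plot as the slope between two points: rate = ln(e1/e2)/ln(h1/h2). The values below are made up purely to demonstrate the computation; they are not taken from Figure 5.5:

```python
import numpy as np

# Illustrative error norms at two coarse-mesh spacings (NOT values read
# from Figure 5.5; chosen only to demonstrate the rate computation).
h = np.array([480.0, 240.0])       # coarse-mesh grid spacing (km)
e = np.array([4.0e-3, 1.414e-3])   # normalized L2 errors

rate = np.log(e[0]/e[1])/np.log(h[0]/h[1])
print(round(rate, 2))  # 1.5
```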
(a) x1 grid (b) x2 grid
(c) x4 grid (d) x16 grid
Figure 5.1: The fluid height, hi + bi, at day 15 for TC5. Starting at the upper left and moving clockwise shows results from the x1, x2, x16 and x4 meshes using 40962 cells. The black oval denotes the location of the mountain. The figures are generated by filling each Voronoi cell with a single color, i.e. there is no interpolation due to rendering. This allows the coarse-mesh grid cells to be seen in the x4 and x16 simulations. All results are plotted with an identical color scheme with a maximum of 5975 m and a minimum of 5025 m.
[Figure 5.2 plot: Relative Change in Total Energy (log scale, 1e-13 to 1e-08) against Time (s, 0 to 1.296e+06) for the x1, x2, x4, x8 and x16 meshes.]
Figure 5.2: Log10 of the relative change in available total energy for TC5 as a function of time for the x1, x2, x4, x8 and x16 meshes with 40962 grid points.
[Figure 5.3 plots: Potential Enstrophy against Time (s, 0 to 1.296e+06) for panels (a) x1, (b) x2, (c) x4, (d) x8, and (e) x16.]
Figure 5.3: Globally averaged potential enstrophy as a function of time for the x1, x2, x4, x8, and x16 meshes with 40962 grid points. Simulations are run for 15 days. The figures show decreasing potential enstrophy for the x1 and x2 meshes, and increasing potential enstrophy for the x4, x8, and x16 meshes.
[Figure 5.4 plot: Relative Change in Potential Enstrophy (log scale, 1e-08 to 0.01) against Time (s, 0 to 1.296e+06) for the x1, x2, x4, x8 and x16 meshes.]
Figure 5.4: Log10 of the relative change in available potential enstrophy for TC5 as a function of time for the x1, x2, x4, x8 and x16 meshes with 40962 grid points.
[Figure 5.5 plots: Normalized L2 Error for the x1 through x16 meshes; panel (a) against Number of Generators, panel (b) against Coarse Grid Spacing (km).]
Figure 5.5: The L2 error of the thickness field at day 15 for TC5 shown for the x1, x2, x4, x8 and x16 meshes. Figure 5.5(a) shows errors as a function of the number of generators, and Figure 5.5(b) shows errors as a function of coarse-mesh grid spacing. Error norms are computed against a T511 reference solution.
5.2.2 Shallow Water Test Case 2
Having confirmed the ability of the numerical model to simulate transient flows
in a robust manner with TC5, TC2 is now used to measure the method’s ability to
maintain large-scale geostrophic balance. Because TC2 is steady-state, any deviation
of the numerical solution from its initial condition is considered to be numerical error.
While TC5 offers a reason for mesh refinement, no comparable reason is present
in TC2. The motivation for evaluating the variable resolution meshes using TC2 is
not to demonstrate the approach's utility, but rather to measure the cost of mesh
refinement. Maintaining large-scale balance is an important property of any numerical
model of the atmosphere or ocean. TC2 provides an environment to precisely measure,
through the L2 error norm, the impact of mesh refinement on maintaining geostrophic
balance.
Figure 5.6 plots the error norms for TC2. Figure 5.6(a) plots the normalized error norms as a function of the number of generators, while Figure 5.6(b) shows the normalized L2 error as a function of the coarse-mesh grid spacing. As found with TC5, essentially all
of the variation in the L2 error in the simulations is controlled by the coarse resolution
grid spacing. For a given coarse resolution, solution error increases by approximately
a factor of 2 between the x2 and x16 meshes. However, the solution error for the x1
mesh is approximately a factor of 10 smaller, regardless of the coarse mesh resolution.
Unfortunately, the rate of convergence for TC2 does not appear to be uniform. Meshes with minimum grid resolutions above 100 km show a convergence rate of approximately 1.9 with respect to the coarse-mesh resolution, but as the minimum resolution of the mesh decreases, so does the rate of convergence. This reduction in convergence rate is likely caused by at least one of the following: deficiencies in the structure of the grids, deficiencies in the manner in which the error norms are computed, and/or deficiencies in the numerical model. None of these possibilities has yet been excluded, and research is ongoing to determine the underlying cause. In the absence of such deficiencies, the second-order convergence rate would be expected to continue as resolution is increased.
[Figure 5.6 plots: Normalized L2 Error for the x1 through x16 meshes; panel (a) against Number of Generators, panel (b) against Coarse Grid Spacing (km).]
Figure 5.6: The L2 error of the thickness field at day 12 for TC2 for the x1, x2, x4, x8 and x16 meshes. Figure 5.6(a) shows errors as a function of the number of generators, and Figure 5.6(b) shows errors as a function of coarse-mesh grid spacing. Error norms are computed against the analytic initial conditions.
5.2.3 Barotropic Instability Test Case
The final test case explored in the shallow-water system is the growth of a barotropic
instability on a zonally-symmetric zonal jet [11].
Figure 5.7 shows the relative vorticity field at day 6 for the x1, x2, x4, x8 and
x16 meshes with 655362 cells. The fine-mesh region is coincident with the center of
each panel. In addition, the envelope of the growing barotropic instability is roughly
coincident with the fine mesh region at day 6, with parts of the wave system entering
and exiting the fine-mesh region at this point in time.
Test cases based on instabilities that grow on a zonally-symmetric base state are particularly challenging for MPAS. The test case is zonally symmetric and the instability is triggered by a small-amplitude perturbation; however, the SCVT meshes used are not always zonally symmetric and, as a result, lead to some truncation error which projects onto non-zero zonal wave numbers. This truncation error serves as an additional trigger for the instability and can lead to wave growth that is either too fast or not in the correct location. As the resolution is increased, the amplitude of the spurious forcing by truncation error diminishes and the instability is controlled solely by the perturbation contained in the initial conditions.
In addition, the growth of the unstable waves depends strongly on the type and
strength of the sub-grid scale closures that are either implicit in the underlying nu-
merical formulation or explicitly added to the numerical models. For example, the
x1 panel in Figure 5.7 agrees very closely with panel D in Figure 17 of [15], but is
significantly different than panel D in Figure 9 of [11]. This is because the simula-
tions presented here and in [15] do not use any explicit closure, whereas [11] uses
hyper-diffusion on the RHS of the momentum equation.
The strong correspondence of the x1 simulations with panel D in Figure 17 of
[15] indicates that the x1 simulation is broadly representative of the instability when
simulated in a system with minimal or no damping. The primary purpose here is
to understand how the use of variable resolution meshes alters the growth of the
barotropic instability.
First, focusing on the deep, tilted trough just right of center in each panel, along
with the ridge-trough-ridge system just upstream to the west, one finds that these
dominant features are present in all simulations with the same amplitude and phase.
The x2 simulation is qualitatively equivalent to the x1 simulation in all respects.
In addition, the x8 simulation is qualitatively equivalent to the x4 simulation in all
respects. The x4 simulation differs from the x2 simulation only along the edges of
the panels that correspond to the centers of the coarse-mesh regions. The primary
difference between these two groups of simulations is that the x4/x8 simulations
produce an additional ridge in the upstream wave. The x16 simulation is qualitatively
different from the other simulations in all regions other than the fine-mesh region.
The x16 simulation produces relatively strong ridge-trough systems in the coarse-
mesh region that are not present in the other simulations. It is important to note
that the fine-mesh resolutions of the x8 and x16 simulations are essentially the same at
approximately 10 km, yet the coarse-mesh resolutions of these simulations differ
by a factor of two (as in Table 5.3). The x16/655362 simulation is more similar to
the x1/40962 simulation (not shown) than any of the other simulations with 655362
nodes. Since the coarse resolution of the x16/655362 simulation is comparable to
the x1/40962 simulation, this finding is consistent with Figures 5.5(b) and 5.6(b)
which demonstrate that the accuracy of the simulation is controlled primarily by the
resolution in the coarse-mesh region.
Figure 5.7: Each panel depicts the relative vorticity field at day 6 for a barotropically-unstable jet using 655362 cells. The panels differ only in the mesh used in the simulation. The vertical extent of each panel covers the northern hemisphere. The horizontal extent covers all longitudes starting at -90 degrees such that the fine-mesh region is approximately centered on each panel. The color scales are identical for every panel and saturate at ±1.0 × 10−4.
CHAPTER 6
ADAPTIVE MESH REFINEMENT BACKGROUND
This chapter describes the framework used for the exploration of Adaptive Mesh
Refinement (AMR) in the context of the shallow water equations using SCVT grids.
Results from the described AMR framework are presented in Chapter 7.
6.1 AMR Background
Adaptive Mesh Refinement is typically used as a means to gain spatial
accuracy without a significant increase in computational cost. Ordinarily, one
would have to increase the global resolution of a mesh to increase the global accuracy;
AMR instead makes use of output data from simulations to generate new meshes
that are better suited to that specific simulation. Because AMR meshes apply local
refinement around features of interest, they are a type of multi-resolution mesh. As a
simulation progresses, the meshes generated typically track features of interest. As was
seen in Section 5.2.1, error norms associated with multi-resolution meshes appear to
be controlled by the coarse mesh resolution. Because of this, AMR SCVT meshes are
only used with the motivation of reducing horizontal grid spacing in an area defined
by simulation output. At a later point in time, scale-aware parameterizations, which
adapt to changes in grid spacing, may reduce the error norms by providing more
accurate simulations on these multi-resolution AMR meshes.
The work in this portion of this dissertation follows two similar explorations [5, 29].
Both of these AMR approaches are implemented using cubed sphere grids that provide
two beneficial features for implementing AMR. First, each cell can be thought of as
the root of a quad tree allowing for local refinement simply by subdividing each cell
into four sub-cells. Second, since the cells do not move over time, de-refinement
is as simple as removing the sub-cells and replacing them with the previous “root” cell.
One disadvantage of locally refined cubed sphere grids is that they are non-conforming
meshes and, because of this, have hanging nodes. The hanging nodes require interpolation
schemes to remove spurious waves generated by reflection from the artificial boundary.
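The refine/de-refine cycle enabled by static quad-tree roots can be sketched as follows. This is a minimal illustration under assumed names (the `Cell` class and its fields are hypothetical), not the implementation of [5, 29]:

```python
class Cell:
    """A quad-tree node rooted at a static cubed-sphere cell."""

    def __init__(self, vorticity):
        self.vorticity = vorticity  # |relative vorticity| sampled in this cell
        self.children = []          # four sub-cells when refined

    def refine(self, threshold):
        # Subdivide into four sub-cells when the criterion is exceeded.
        if abs(self.vorticity) > threshold and not self.children:
            self.children = [Cell(self.vorticity) for _ in range(4)]

    def derefine(self, threshold):
        # The parent cell never moved, so de-refinement simply discards
        # the sub-cells and restores the previous "root" cell.
        if abs(self.vorticity) <= threshold and self.children:
            self.children = []


cell = Cell(vorticity=3.0e-5)
cell.refine(threshold=2.0e-5)    # criterion met: four sub-cells appear
cell.vorticity = 1.0e-5          # feature moves away from this cell
cell.derefine(threshold=2.0e-5)  # criterion fails: sub-cells removed
```

This fixed parent/child relationship is exactly what an SCVT lacks, since its generators move during Lloyd iteration.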
Both [5, 29] use the absolute value of the relative vorticity field as the criterion for
cell refinement. In their explorations they both use several of the standard shallow-
water test cases found in [39] to test their AMR schemes. A common test between
both explorations is shallow-water test case number 5, as defined in Section 4.2.2.
This test case is useful in testing AMR schemes due to the evolution of the vortical
dynamics. As the vortices migrate around the sphere, the AMR scheme should adapt
the grid by refining and derefining cells to compensate for this movement.
6.2 SCVT-AMR Framework
A typical AMR framework for use on cubed sphere meshes with static grid points
has a general format similar to the pseudo-code shown in Algorithm 1.
While Algorithm 1 is a typical AMR framework for use on structured grids with
static grid points, it does not have a direct translation for use on SCVT meshes.
As the focus of this dissertation is on multi-resolution meshes within SCVTs, a new
framework is developed for use within this context, described in Algorithm 2.
Currently, the tools required for the use of SCVTs in Algorithm 2 do not ex-
ist. Particularly difficult is re-mapping data between two SCVT meshes. There are
methods of re-mapping data on SCVTs [16]; however, the implementations are not
mature enough to support this type of application. Due to this constraint, the frame-
work presented in this dissertation only incorporates one time step of a typical AMR
framework. This is done under the assumption that eventually the re-mapping tools
will become mature enough to be combined with this AMR scheme, and will allow a
comparison with previously implemented schemes which use Algorithm 1.
The AMR framework used for a single time step is shown in Algorithm 3.
Following [5, 29], the relative vorticity field, ξ, is used to define both the point-density field and the refinement criteria. After one AMR time step, in this case one
day, the output from MPAS’ shallow-water model is used to build a density field.
In order to create a density field, several steps are required. To begin, the absolute
value of the relative vorticity field, |ξ|, is cut off at a threshold comparable to that of
[5, 29], |ξ| > 1.0 × 10−5. This step ensures that only cells with extreme values of relative
vorticity are refined, ignoring any relative vorticity in the mean flow. After the threshold
is applied, the remaining relative vorticity field is rescaled, where this scaling is defined
as
\rho_i = \frac{|\xi_i| - \min_{j=1}^{N} |\xi_j|}{\max_{j=1}^{N} |\xi_j|} \, (\gamma^4 - 1.0) + 1.0 \qquad (6.1)
where |ξi| represents the absolute value of the relative vorticity at cell i, N represents
the number of cells, ρi represents the density value at a cell, and γ represents an
arbitrary scaling.

Algorithm 1 General AMR Framework
t = 0
Initialize simulation
while t < T do
    Iterate simulation for time of ∆t
    Refine mesh based on chosen criteria, e.g. relative vorticity, ξ
    Map data from previous mesh to refined mesh
end while
Compute error norms
Using the scaling defined in (6.1) and the cut off previously discussed, the density
field has a minimum value of 1 and a maximum value defined by γ^4. The minimum
grid spacing is then given as a factor of the coarse grid spacing, based on γ^4, as in
(2.3). The resulting density field obtained by cutting off and mapping the relative
vorticity field tends to have sharp gradients and can be very concentrated in certain
areas. These sharp gradients cause issues when refinement is applied, as neighboring
cells are allowed to differ by more than one level of refinement.
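The threshold-and-rescale construction of (6.1) can be sketched as below. This is a hedged sketch, assuming cells below the cutoff receive the minimum density of 1 and that the min and max in (6.1) are taken over the thresholded field; the function and variable names are hypothetical:

```python
import numpy as np


def vorticity_to_density(xi, gamma=4.0, cutoff=1.0e-5):
    """Map relative vorticity to a point-density field, following (6.1).

    Cells with |xi| below the cutoff keep the minimum density of 1;
    the remaining cells are rescaled toward the maximum density gamma**4.
    """
    a = np.abs(xi)
    rho = np.ones_like(a)
    mask = a > cutoff
    if mask.any():
        amin, amax = a[mask].min(), a[mask].max()
        rho[mask] = (a[mask] - amin) / amax * (gamma**4 - 1.0) + 1.0
    return rho


# Below-cutoff cells keep density 1; the largest vorticity approaches gamma**4.
rho = vorticity_to_density(np.array([0.0, 2.0e-5, 5.0e-5]))
```

Because only extreme vorticity survives the cutoff, the resulting field has exactly the sharp gradients and concentrated maxima that motivate the smoothing step below.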
In an attempt to remove these sharp gradients in the density field a Laplacian
smoothing operator can be applied an arbitrary number of times. Laplacian smooth-
ing will cause the density field to diffuse over the sphere, which will smooth out
gradients in the density field. The motivation of this technique is that grids will be
of higher quality if the density field has smooth gradients. The Laplacian smoothing
operator is defined as
\rho_i^* = \frac{1}{2}\rho_i + \sum_{j=1}^{n} \frac{1}{2n}\rho_j \qquad (6.2)
where n is the number of neighbors of cell i and ρ is the density defined at cell
centers.
This Laplacian smoothing operator replaces a cell value with a weighted average
of its previous value and its neighbors' values. Because this operator smooths slowly,
it may have to be applied many times before a reasonably smooth density field is
obtained. Figure 6.1 illustrates this point with four plots, showing density fields
with no smoothing (Figure 6.1(a)), 16 smoothings (Figure 6.1(b)), 64 smoothings
(Figure 6.1(c)), and 128 smoothings (Figure 6.1(d)). A maximum of 128 smoothings
is used because, by that point, most of the features present in the 0- and 16-smoothing
fields are no longer present; also, in the 128 smoothings case, refinement is applied
within areas that do not appear to require it.
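The smoothing operator (6.2) can be sketched as repeated averaging over a cell-neighbor adjacency list (a hypothetical stand-in for the Voronoi mesh connectivity; the function name is illustrative):

```python
import numpy as np


def smooth(rho, neighbors, iterations=1):
    """Apply the Laplacian smoothing operator (6.2) `iterations` times.

    neighbors[i] lists the indices of the cells adjacent to cell i.
    """
    rho = np.asarray(rho, dtype=float)
    for _ in range(iterations):
        new = np.empty_like(rho)
        for i, nbrs in enumerate(neighbors):
            # Half the old value plus half the mean of the neighbor values.
            new[i] = 0.5 * rho[i] + sum(rho[j] for j in nbrs) / (2.0 * len(nbrs))
        rho = new
    return rho


# A ring of four cells: one pass spreads the density peak to its two
# neighbors while conserving the total density.
ring = smooth([4.0, 0.0, 0.0, 0.0], neighbors=[[1, 3], [0, 2], [1, 3], [0, 2]])
```

Each application moves only half a step toward the neighbor mean, which is why many iterations are needed before the field is reasonably smooth.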
After Laplacian smoothing is applied, the grid refinement levels are validated to
ensure only one level of refinement occurs across edges of Delaunay triangles. Al-
though this could add more points than are required based on the refinement criteria,
this technique is standard in the AMR techniques presented in [5, 29].
Cubed sphere AMR techniques maintain the coarse grid spacing of elements due
to the static grid points. In an attempt to retain this advantage, refinement can be
used to intelligently add points to SCVTs. As was discussed previously, bisection
provides a refinement in horizontal grid spacing of roughly a factor of two. In order
to refine the reference mesh, Delaunay triangles are bisected based on the density
value given from re-mapping and smoothing the relative vorticity field. This refine-
ment procedure subdivides Delaunay triangles, adding points on edges, as well as
the interior of triangles. The number of points added depends on the density in the
triangle compared with the minimum density. One edge of a Delaunay triangle is
subdivided n times, where n is defined as
n = \log_2\!\left(\rho_n^{1/4}\right) \qquad (6.3)
where ρn is the density value associated with the Delaunay triangle.
The Delaunay triangle can be refined either when n > 1 or when |ξ| > α. Both [5, 29]
use |ξ| > α and set α = 2.0 × 10−5. Because of the threshold on ξ, the implemented
method combines both criteria: triangles are refined when n > 1, but
n is only larger than 1 when |ξ| > α. As mentioned previously, refining triangles
based on n > 1 causes the coarse mesh resolution to roughly be preserved, based on
(2.3). Controlling γ from (6.1) allows control over the total number of points added
to a mesh. Although low values of γ produce grids with fewer total points, they also
produce grids with lower variance in horizontal grid spacing. Figure 6.2 shows three
triangles with varying levels of division based on the density value of the triangle.
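Under the relation (6.3) and the γ⁴ density scaling, the number of edge subdivisions follows directly from a triangle's density value; a small sketch (the function name is hypothetical):

```python
import math


def subdivision_count(rho):
    """Edge subdivisions for a Delaunay triangle of density rho, per (6.3)."""
    return math.log2(rho ** 0.25)


# With densities scaled into [1, gamma**4]: the minimum density of 1 adds
# no points, a density of 2**4 gives one subdivision (bisection), and a
# density of 4**4 gives two subdivisions, matching Figure 6.2.
```

This logarithmic mapping is what makes each refinement level roughly a factor-of-two change in grid spacing, so the coarse-mesh spacing of the reference grid is preserved.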
After a grid has been refined and smoothed using a combination of (6.1), (6.2), and
(6.3), the grid is no longer an SCVT. An SCVT generator, as described in Chapter
2, can be used to converge the point set to an SCVT. Before an SCVT generator
can be used to iterate on a refined point set, a spatial density function needs to be
defined in order to evaluate the point-density of each generator as they move around
the mesh. There are several choices of spatial density function, each with its own
drawbacks and benefits; the main trade-off among them is computational cost.
Some potential options are piecewise constant, piecewise
linear, and pointwise constant density functions. While pointwise constant density
functions have significantly lower computational cost, they depend
less on the underlying density function and can potentially move refined
regions out of the area of interest. Piecewise constant density functions are almost as
expensive as piecewise linear density functions; however, they do not smooth the final
point set over the mesh well, and end up with points clustered in reference Voronoi
cells. Because of these drawbacks,
piecewise linear density functions are used.
Before the refined mesh is output, a reference mesh is output, which contains
points, density values associated with those points, and a triangulation of those points.
The piecewise linear density function is defined as a barycentric interpolation within
reference Delaunay triangles. In order to evaluate the density function at an arbitrary
point, first the Delaunay triangle which contains the point must be determined. This
is done through a combination of vector dot and cross products to check the orien-
tation with each edge of the Delaunay triangle. After the in-out test is completed,
the barycentric weights of the test point need to be computed. This computation is
described as
\alpha = \frac{\mathrm{Area}(B, C, P)}{\mathrm{Area}(A, B, C)} \qquad (6.4)

\beta = \frac{\mathrm{Area}(C, A, P)}{\mathrm{Area}(A, B, C)} \qquad (6.5)

\gamma = \frac{\mathrm{Area}(A, B, P)}{\mathrm{Area}(A, B, C)} \qquad (6.6)
where A, B, and C are the vertices of the triangle, given in counter-clockwise order,
P is the test point contained inside triangle ABC, and α, β, and γ are the barycentric
weights for P associated with A, B, and C respectively.
After the barycentric weights are computed, the density function is a simple map
defined as follows
\rho_P = \rho_A\,\alpha + \rho_B\,\beta + \rho_C\,\gamma \qquad (6.8)
where ρ is the density value, and α, β, and γ are the barycentric weights for point P .
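The in-out test and the interpolation (6.4)-(6.8) can be sketched in the plane. The actual evaluation is performed on the sphere using vector dot and cross products; this planar version, with hypothetical names, only illustrates the barycentric step:

```python
def tri_area(a, b, c):
    """Signed area of planar triangle abc; positive for counter-clockwise order."""
    return 0.5 * ((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1]))


def density_at(A, B, C, P, rhoA, rhoB, rhoC):
    """Barycentric in-out test and piecewise linear density at P, per (6.4)-(6.8)."""
    total = tri_area(A, B, C)
    alpha = tri_area(B, C, P) / total  # weight for vertex A, (6.4)
    beta = tri_area(C, A, P) / total   # weight for vertex B, (6.5)
    gamma = tri_area(A, B, P) / total  # weight for vertex C, (6.6)
    # P lies inside (or on the boundary of) ABC iff all weights are non-negative.
    inside = alpha >= 0 and beta >= 0 and gamma >= 0
    return inside, rhoA * alpha + rhoB * beta + rhoC * gamma


inside, rho = density_at((0.0, 0.0), (1.0, 0.0), (0.0, 1.0),
                         (0.25, 0.25), 1.0, 2.0, 3.0)
```

Note the weights always sum to one, so the interpolated density stays within the range of the three vertex densities.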
The resulting framework provides a method of producing AMR-like grids that
maintain the specified coarse mesh resolution, while increasing resolution in areas
of interest based on reference simulations. Using this scheme, and some yet-to-be-
developed tools for re-mapping, a full AMR scheme can be implemented. Any of the
fields present in these simulations can be used for computing the density field and
refinement; relative vorticity is only used as a comparison to [5, 29].
Algorithm 2 Full AMR Framework for SCVT meshes
t = 0
Initialize simulation
while t < T do
    Iterate simulation for time of ∆t
    Convert field of interest, ξ, into a point-density field
    If desired, smooth point-density field
    Refine mesh based on point-density field
    Converge refined mesh to an SCVT as described in Chapter 2
    Map data from previous SCVT to new SCVT
end while
Compute error norms
Algorithm 3 Single time step SCVT AMR Framework
t = 0
Initialize simulation
Iterate simulation for time of ∆t
Convert field of interest, ξ, into a point-density field
If desired, smooth point-density field
Refine mesh based on point-density field
Converge refined mesh to an SCVT as described in Chapter 2
Initialize simulation with newly converged SCVT mesh at t = 0
Iterate simulation for time of ∆t
Compute error norms
(a) No smoothing (b) 16 smoothings
(c) 64 smoothings (d) 128 smoothings
Figure 6.1: Density field obtained after one simulation day using the relative vorticity field from shallow-water test case 5, on an x1 2562 generator grid, corresponding to the first four steps in Algorithm 3. Figure 6.1(a) has no smoothings applied, Figure 6.1(b) has 16 smoothings, Figure 6.1(c) has 64 smoothings, and Figure 6.1(d) has 128 smoothings. The smoothing operator is defined in (6.2). Red represents the minimum, while blue represents the maximum. To show transitions, color represents log2(ρ^{1/4}).
(a) n = 1 (b) n = 2
(c) n = 3
Figure 6.2: Three triangles with subdivision based on density values. Figure 6.2(a) shows a triangle whose density value is 1^4, giving no divisions; Figure 6.2(b) a triangle whose density value is 2^4, giving one division; and Figure 6.2(c) a triangle whose density value is 4^4, giving two divisions.
CHAPTER 7
ADAPTIVE MESH REFINEMENT RESULTS
To explore the potential of the AMR framework introduced in Chapter 6, results
are presented in this chapter. To begin, a 642 generator quasi-uniform mesh is used
as a reference mesh. After the 642 results are presented, a 2562 generator quasi-uniform mesh is used to produce another set of AMR grids. These point sets are
chosen due to their relatively small number of points and large grid spacing. All of
the results presented are computed using shallow-water test case number 5 involving
geostrophic flow over an isolated mountain, implemented inside MPAS. Test case 5
was previously defined in Section 4.2.2. Error norms are computed using a T511 high
resolution spectral element solution, as was used for the shallow-water test case 5
results in Chapter 5.
7.1 642 Point Suite
To begin, a suite of results is presented based on a 642 grid cell quasi-uniform grid
with roughly 960km grid spacing. Using this quasi-uniform reference grid, shallow-
water test case 5, as defined in Section 4.2.2, is simulated for one day. After one
simulation day, the relative vorticity field is mapped into a density field over the
mesh using (6.1). Delaunay triangles are then refined based on their density values.
The maximum density value is constrained using γ = 4. This provides four levels
of refinement within a mesh, reducing the 960km grid spacing to 120km in areas
with extreme relative vorticity. Delaunay triangles are refined in the aforementioned
manner, maintaining a fixed coarse grid resolution.
Figure 7.1 shows four grids that make up the 642 grid cell suite. Figures 7.1(a),
7.1(b), 7.1(c), and 7.1(d) show 0, 16, 64, and 128 iterations of Laplacian smoothing
on the density fields respectively. Color in these figures represents cell area, with red
representing the smallest area, and purple representing the largest area.
(a) Unsmoothed (b) 16 Smoothings
(c) 64 Smoothings (d) 128 Smoothings
Figure 7.1: AMR grids based on a 642 grid cell quasi-uniform grid. Color represents cell area, where red is the minimum area and purple is the maximum area. Presented are grids with 0, 16, 64, and 128 iterations of Laplacian smoothing applied.
Before exploring the results obtained from each simulation, reference data is shown
for a qualitative comparison. Figure 7.2 shows the thickness, potential vorticity, and
relative vorticity fields after one simulation day on the 642 quasi-uniform mesh. Again,
the relative vorticity field shown in Figure 7.2(c) was scaled to be the density field
used to generate the meshes shown in Figure 7.1.
(a) Thickness (b) Potential Vorticity
(c) Relative Vorticity
Figure 7.2: Reference data fields for the 642 quasi-uniform mesh. Shallow-water test case 5 was simulated for 1 day; plotted in Figure 7.2(a) is the fluid thickness field, in Figure 7.2(b) the potential vorticity field, and in Figure 7.2(c) the relative vorticity field.
The output fields from the four AMR grids after one simulation day using shallow-
water test case 5 are presented in Figures 7.3, 7.4, and 7.5. Figure 7.3 shows the
thickness fields for all four simulations, which appear qualitatively equivalent to the
reference simulation. Figure 7.4 shows the potential vorticity fields for all four
simulations, which also appear qualitatively equivalent to the reference simulation. Figure
7.5 shows the relative vorticity fields for all four simulations. The relative vorticity
fields appear to have an increase in noise as the number of smoothings applied to the
mesh is increased.
(a) Unsmoothed (b) 16 Smoothings
(c) 64 Smoothings (d) 128 Smoothings
Figure 7.3: Thickness fields from the 642 suite of AMR meshes. Figure 7.3(a) shows the thickness field from an unsmoothed AMR mesh, Figure 7.3(b) from a mesh with 16 smoothings, Figure 7.3(c) from a mesh with 64 smoothings, and Figure 7.3(d) from a mesh with 128 smoothings.
As was seen in Section 5.2.1, all conservation properties still hold on these AMR
meshes. In order to explore the effect each AMR mesh has on the simulation, error
norms are computed as was done in Section 5.2.1. Table 7.1 lists the L2 and L∞
norms of the error in the thickness field of each AMR simulation at the end of one
simulation day.
Table 7.1: Error norms associated with the suite of AMR meshes based on the 642 grid point reference mesh. Presented are L2 and L∞ norms of the error in the thickness field, compared to a T511 reference simulation.

Grid            L2           L∞           % Irregular Cells
Reference       7.45 × 10−4  4.83 × 10−3  1.86%
Unsmoothed      9.00 × 10−4  8.67 × 10−3  7.56%
16 Smoothings   8.16 × 10−4  6.04 × 10−3  9.36%
64 Smoothings   9.58 × 10−4  5.55 × 10−3  10.83%
128 Smoothings  1.45 × 10−3  8.93 × 10−3  7.73%
As can be seen in Table 7.1, all AMR meshes have slightly higher error than the
reference mesh. Also, applying more iterations of Laplacian smoothing does not
appear to reduce the error norms much, if at all. Most of the error in these meshes comes
from the addition of pentagons and heptagons, or Voronoi cells with five and seven
sides, respectively. These irregular cells cause distortion in the area of the mesh
surrounding them. Although the reference mesh has the minimum 12 pentagons, each
of the other four meshes has additional unneeded pentagons, each paired with a
heptagon. Although it is an incredibly difficult problem, removing these extra
pentagon-heptagon pairs may improve the error norms and make this AMR scheme
more useful.
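For reference, the thickness-error norms reported here can be computed along the following lines. This is a sketch using an area-weighted convention as in the standard shallow-water test suite [39]; the exact normalization used for Table 7.1 is the one defined in Section 5.2.1, and the function name is hypothetical:

```python
import numpy as np


def thickness_error_norms(h, h_ref, cell_area):
    """Normalized L2 and relative L_inf norms of the thickness error."""
    w = cell_area / cell_area.sum()  # area weight of each Voronoi cell
    diff = h - h_ref
    l2 = np.sqrt(np.sum(w * diff ** 2)) / np.sqrt(np.sum(w * h_ref ** 2))
    linf = np.max(np.abs(diff)) / np.max(np.abs(h_ref))
    return l2, linf


# A toy two-cell example with a 10% error in each cell.
l2, linf = thickness_error_norms(np.array([1.1, 0.9]),
                                 np.array([1.0, 1.0]),
                                 np.array([1.0, 1.0]))
```

The area weighting matters on AMR meshes: without it, the many small fine-mesh cells would dominate the norm.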
(a) Unsmoothed (b) 16 Smoothings
(c) 64 Smoothings (d) 128 Smoothings
Figure 7.4: Potential vorticity fields from the 642 suite of AMR meshes. Figure 7.4(a) shows the potential vorticity field from an unsmoothed AMR mesh, Figure 7.4(b) from a mesh with 16 smoothings, Figure 7.4(c) from a mesh with 64 smoothings, and Figure 7.4(d) from a mesh with 128 smoothings.
(a) Unsmoothed (b) 16 Smoothings
(c) 64 Smoothings (d) 128 Smoothings
Figure 7.5: Relative vorticity fields from the 642 suite of AMR meshes. Figure 7.5(a) shows the relative vorticity field from an unsmoothed AMR mesh, Figure 7.5(b) from a mesh with 16 smoothings, Figure 7.5(c) from a mesh with 64 smoothings, and Figure 7.5(d) from a mesh with 128 smoothings.
7.2 2562 Point Suite
As mentioned in Table 5.1, a 2562 generator quasi-uniform SCVT has a grid spacing
of roughly 480km. Shallow-water test case 5 is simulated for one day on the quasi-uniform mesh, after which the relative vorticity field, ξ, is extracted and converted
into a density field using (6.1). As was the case for the 642 grid point suite of meshes,
γ = 4 is chosen to give four levels of refinement between the grid resolutions of the
finest and coarsest grid regions as defined in (2.3). The factor 4 is chosen because it
allows a large variation in grid spacing between the finest and coarsest grid regions
without adding a significant number of points.
Figure 7.6 shows the four meshes created as part of the 2562 grid point suite
of AMR meshes. The four meshes presented consist of varying levels of Laplacian
smoothing, with 0, 16, 64, and 128 applications of Laplacian smoothing. Each of
these four meshes is colored by cell area, where red represents the smallest value, and
purple represents the largest value.
Figure 7.7 plots the thickness, relative vorticity, and potential vorticity fields on
the quasi-uniform 2562 generator mesh after one simulation day. Figure 7.7 is pro-
vided as a reference for data presented on AMR grids based on the quasi-uniform
2562 generator mesh.
Figures 7.8, 7.9, and 7.10 show the thickness, potential vorticity, and relative
vorticity fields for all four AMR simulations based on the 2562 grid point reference
mesh. All simulations are run for one simulation day using shallow-water test case 5.
As was seen in Section 7.1, an increase in the number of times Laplacian smoothing
is applied to the mesh appears to increase the overall noise in the simulation. While
the thickness and potential vorticity fields appear qualitatively similar to the reference
simulation, the relative vorticity field appears to have significantly more noise. In
order to determine the effect of this noise on the final simulations, the error norms
are presented relative to a T511 reference simulation in Table 7.2.
Table 7.2 shows the error norms from all five simulations to be essentially
equivalent in the L2 norm, and not significantly different in the L∞ norm. As was the
case in Section 7.1, all of the AMR grids have more pentagons and heptagons than
the reference mesh, and removing these may improve the error norm for the AMR
simulations. Also, the grids presented as part of these results have distinct bound-
aries between levels of refinement in the final SCVT, which is evident from Figure
7.6. Smoothing out this boundary in the final density function may also improve the
simulation results.
In contrast to the results presented in Table 7.1, Table 7.2 shows a slight decrease
in both error norms (excluding the 16 smoothings case) as more smoothings are
applied. As the number of points in the reference mesh increases, the resulting density
function can capture more of the dynamics of the relative vorticity field, allowing for
grids with smoother transition regions. This trend may continue to higher resolution
reference grids.
Table 7.2: Error norms for AMR grids based on the 2562 grid point reference mesh. L2 and L∞ norms are computed with the thickness field relative to a T511 simulation.

Grid            L2           L∞           % Irregular Cells
Reference       3.32 × 10−4  2.95 × 10−3  0.046%
Unsmoothed      4.97 × 10−4  5.91 × 10−3  4.40%
16 Smoothings   4.35 × 10−4  8.33 × 10−3  4.52%
64 Smoothings   4.22 × 10−4  5.63 × 10−3  5.34%
128 Smoothings  4.20 × 10−4  5.32 × 10−3  5.31%
(a) Unsmoothed (b) 16 Smoothings
(c) 64 Smoothings (d) 128 Smoothings
Figure 7.6: AMR grids based on a 2562 grid cell quasi-uniform grid. Color represents cell area, where red is the minimum area and purple is the maximum area. Presented are grids with 0, 16, 64, and 128 iterations of Laplacian smoothing applied.
(a) Thickness (b) Potential Vorticity
(c) Relative Vorticity
Figure 7.7: Reference data fields for the 2562 quasi-uniform mesh. Shallow-water test case 5 was simulated for 1 day; plotted in Figure 7.7(a) is the fluid thickness field, in Figure 7.7(b) the potential vorticity field, and in Figure 7.7(c) the relative vorticity field.
(a) Unsmoothed (b) 16 Smoothings
(c) 64 Smoothings (d) 128 Smoothings
Figure 7.8: Thickness fields from the 2562 suite of AMR meshes. Figure 7.8(a) shows the thickness field from an unsmoothed AMR mesh, Figure 7.8(b) from a mesh with 16 smoothings, Figure 7.8(c) from a mesh with 64 smoothings, and Figure 7.8(d) from a mesh with 128 smoothings.
(a) Unsmoothed (b) 16 Smoothings
(c) 64 Smoothings (d) 128 Smoothings
Figure 7.9: Potential vorticity fields from the 2562 suite of AMR meshes. Figure 7.9(a) shows the potential vorticity field from an unsmoothed AMR mesh, Figure 7.9(b) from a mesh with 16 smoothings, Figure 7.9(c) from a mesh with 64 smoothings, and Figure 7.9(d) from a mesh with 128 smoothings.
(a) Unsmoothed (b) 16 Smoothings
(c) 64 Smoothings (d) 128 Smoothings
Figure 7.10: Relative vorticity fields from the 2562 suite of AMR meshes. Figure 7.10(a) shows the relative vorticity field from an unsmoothed AMR mesh, Figure 7.10(b) from a mesh with 16 smoothings, Figure 7.10(c) from a mesh with 64 smoothings, and Figure 7.10(d) from a mesh with 128 smoothings.
CHAPTER 8
DISCUSSION
Presented in this dissertation was a broad scope of research, including parallel spherical
grid generation, shallow-water simulations, and adaptive mesh refinement.
Beginning with Chapters 2 and 3, a new algorithm was presented allowing the parallel
computation of spherical Delaunay triangulations and spherical centroidal Voronoi
tessellations. This new algorithm combines existing techniques in computational ge-
ometry to manipulate the point set, allowing spherical triangulations to be computed
in the simpler geometry of the plane. Existing spherical triangulation algorithms
are not easily parallelizable and tend to scale poorly with the number of
points. The new algorithm, MPI-SCVT, was compared against the well-known
spherical triangulation package STRIPACK and showed a respectable speed
up. As MPI-SCVT has two algorithms used to compute the triangulation, two values
of speed up are presented. A more direct comparison to STRIPACK uses the full
triangulation of all points in the set, for which MPI-SCVT achieved a speed up
of roughly 37 when using 42 processors. For a comparison more applicable to SCVT
generators, the regional triangulation from MPI-SCVT achieved a speed up of 4096
over STRIPACK when using 42 processors.
MPI-SCVT was then used to generate variable resolution meshes that were used
in a new shallow-water model from Los Alamos National Laboratory and the National
Center for Atmospheric Research. This new model is known as MPAS and provides
a numerical method that is capable of simulating flow on arbitrary Voronoi meshes.
The shallow-water model was explored using a suite of multi-resolution meshes to
determine the effect of variable resolution meshes in the numerical method. It was
shown that MPAS conserves mass, total energy, and potential vorticity, while dis-
sipating potential enstrophy when using variable resolution meshes. These results
are presented using shallow-water test case 5 from [39]. Also presented are L2 error
norms of the fluid thickness field for shallow-water test cases 2 and 5. Test case 5
shows that the coarse grid resolution controls the global error norm, which can also be
seen in previous results [5, 29, 36]. Test case 2 shows the numerical method currently
has an issue with its rate of convergence at high resolutions, which is the focus of
ongoing research. The shallow-water model is also explored using a barotropic instability test
case as defined in [11]. The barotropic instability provides a system suitable for the
exploration of the propagation of waves through mesh transition regions. In general,
moderately varying meshes had a minor effect on the overall simulation; only highly
varying meshes showed a significant change in results.
Finally, an adaptive mesh refinement (AMR) framework was presented for sim-
ulations using spherical centroidal Voronoi tessellations. Previous efforts to perform
AMR on spherical grids have required structured grids which provide the benefit of
having fixed grid points. These fixed grid points allow cells to be coarsened and
refined at will, while Voronoi tessellations do not maintain this ability. This new
framework needs to be coupled with tools for remapping data on SCVTs; however,
these tools have yet to be developed. Using this framework, two reference grid sizes
are used for a suite of 8 AMR simulations. All simulations use shallow water test case
5 from [39] due to its dynamic system. Quasi-uniform grids were used as reference
grids containing either 2562 generators or 642 generators. Reference grids were used
to simulate test case 5 for one day, after which the output was used to refine, smooth,
and generate density fields for new grids. Using this newly refined point set and
density field, SCVTs are generated and again simulated for one day. L2 error norms
of the thickness field are presented relative to the T511 reference simulation used for
multi-resolution shallow-water simulations.
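The grid-regeneration step of this cycle rests on density-weighted Lloyd iteration. A minimal planar sketch of one Monte Carlo Lloyd step is given below; the dissertation's generator works in parallel on the sphere, and the `density` argument here is a hypothetical stand-in for the density fields derived from the one-day model output:

```python
import numpy as np

def lloyd_step(generators, density, n_samples=20000, rng=None):
    """One Monte Carlo Lloyd iteration on the unit square. Sample points
    are kept with probability proportional to the density function,
    assigned to their nearest generator (a Monte Carlo estimate of the
    Voronoi regions), and each generator is moved to the approximate
    mass centroid of its region."""
    rng = np.random.default_rng(rng)
    pts = rng.random((n_samples, 2))
    # Rejection sampling against the (illustrative) density function.
    dens = density(pts)
    keep = rng.random(n_samples) < dens / dens.max()
    pts = pts[keep]
    # Nearest-generator assignment approximates the Voronoi regions.
    d2 = ((pts[:, None, :] - generators[None, :, :]) ** 2).sum(axis=-1)
    owner = d2.argmin(axis=1)
    new = generators.copy()
    for i in range(len(generators)):
        mine = pts[owner == i]
        if len(mine):
            new[i] = mine.mean(axis=0)
    return new
```

Iterating this step to convergence yields a (planar) centroidal Voronoi tessellation whose cell sizes follow the density field, which is the mechanism by which the refined density fields above produce locally refined SCVTs.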
These 8 AMR simulations help to solidify the finding that the coarse mesh res-
olution controls the global error. Although some noise appears to be introduced into
the simulation by irregular Voronoi cells, the global error norm remains close to
the reference global error norm. Future work reducing the number of defects in an
SCVT mesh may further reduce the global error. Although these AMR simulations
do not appear to share the benefits of traditional AMR simulations, they provide
another method of generating multi-resolution meshes suited to specific simulations.
Future work incorporating this technique into a full AMR framework, and creating
grids with less local distortion, may prove to be a useful endeavor.
8.1 Future Work
As this dissertation covers a broad range of topics, there are many opportunities
for future work. With regard to the parallel grid generator, future work
might include:
• Parallel Limited Area Meshing
• Parallel Planar Meshing
Future work with regards to MPAS might include:
• Variable Resolution Meshes Simulated in MPAS Ocean Model
• Exploration of MPAS Shallow-water Model in more Test Cases
Future work involving AMR on SCVT meshes might include:
• Reduction of Pentagon-Heptagon Pairs in SCVT Meshes
• Coupling of Remapping Tools with the Described AMR Framework
• Full AMR Simulations
BIBLIOGRAPHY
[1] A. Arakawa. Computational design for long-term numerical integration of the equations of fluid motion: Two-dimensional incompressible flow. Journal of Computational Physics, 1:119–143, 1966.
[2] G. Boccaletti, R. Ferrari, and B. Fox-Kemper. Mixed layer instabilities and restratification. Journal of Physical Oceanography, 37:2228–2250, 2007.
[3] P.L. Bowers, W.E. Diets, and S.L. Keeling. Fast algorithms for generating Delaunay interpolation elements for domain decomposition. Unpublished: http://www.math.fsu.edu/ aluffi/archive/paper77.ps.gz, 1998.
[4] J. Campin, C. Hill, H. Jones, and J. Marshall. Super-parameterization in ocean modelling: Application to deep convection. Ocean Modelling, 2010. Revised.
[5] C. Chen, F. Xiao, and X. Li. An adaptive multimoment global model on a cubed sphere. Monthly Weather Review, 139:523–548, 2011.
[6] P. Cignoni, C. Montani, and R. Scopigno. DeWall: A fast divide and conquer Delaunay triangulation algorithm in E^d. Computer-Aided Design, 30:333–341, 1998.
[7] Q. Du, M. Emelianenko, and L. Ju. Convergence of the Lloyd algorithm for computing centroidal Voronoi tessellations. SIAM Journal of Numerical Analysis, 44:102–119, 2006.
[8] Q. Du, V. Faber, and M. Gunzburger. Centroidal Voronoi tessellations: Applications and algorithms. SIAM Review, 41:637–676, 1999.
[9] M. Fox-Rabinovitz et al. Variable resolution general circulation models: Stretched-grid model intercomparison project (SGMIP). Journal of Geophysical Research, 111, 2006.
[10] M. Fox-Rabinovitz, G. Stenchikov, M. Suarez, and L. Takacs. A finite-difference GCM dynamical core with a variable-resolution stretched grid. Monthly Weather Review, 125:2943–2968, 1997.
[11] J. Galewsky. An initial-value problem for testing numerical models of the global shallow-water equations. Tellus, 56A:429–440, 2004.
[12] M. Giorgi and L. Mearns. Approaches to the simulation of regional climate change: A review. Reviews of Geophysics, 29:191–216, 1991.
[13] W. Grabowski. Coupling cloud processes with the large-scale dynamics using the cloud-resolving convection parameterization. Journal of Atmospheric Sciences, 58:978–997, 2010.
[14] R. Heikes and D. Randall. Numerical integration of the shallow-water equations on a twisted icosahedral grid. Part I: Basic design and results of tests. Monthly Weather Review, 123:1862–1880, 1995.
[15] S. Ii and F. Xiao. A global shallow water model using high order multi-moment constrained finite volume method and icosahedral grid. Journal of Computational Physics, 229:1774–1796, 2010.
[16] P.W. Jones. First- and second-order conservative remapping schemes for grids in spherical coordinates. Monthly Weather Review, 127:2204–2210, 1998.
[17] S.P. Lloyd. Least squares quantization in PCM. IEEE Transactions on Information Theory, 28:129–137, 1982.
[18] J.L. McClean et al. A prototype two-decade fully-coupled fine-resolution CCSM simulation. Ocean Modelling, 2011. In press.
[19] J.L. McGregor. Regional climate modelling. Meteorology and Atmospheric Physics, 63:105–117, 1997.
[20] N. Metropolis and S. Ulam. The Monte Carlo method. Journal of the American Statistical Association, 44:335–341, 1949.
[21] A. Okabe, B. Boots, K. Sugihara, and S.N. Chiu. Spatial Tessellations: Concepts and Applications of Voronoi Diagrams. John Wiley, second edition, 2000.
[22] D. Randall and S. Bony. Climate models and their evaluation. IPCC WG1 Fourth Assessment Report, 2007.
[23] R.J. Renka. Algorithm 772: STRIPACK: Delaunay triangulation and Voronoi diagram on the surface of a sphere. ACM Transactions on Mathematical Software, 23:416–434, 1997.
[24] T. Ringler et al. A unified approach to energy conservation and potential vorticity dynamics for arbitrarily-structured C-grids. Journal of Computational Physics, 229:3065–3090, 2010.
[25] T.D. Ringler et al. Exploring a multi-resolution modeling approach within the shallow-water equations. Monthly Weather Review, 2011. Accepted.
[26] A. Saalfeld. Delaunay triangulations and stereographic projections. Cartography and Geographic Information Science, 26:289–296, 1999.
[27] R. Sadourny and C. Basdevant. Parameterization of subgrid scale barotropic and baroclinic eddies in quasi-geostrophic models: Anticipated potential vorticity method. Journal of the Atmospheric Sciences, 42:1353–1363, 1985.
[28] J.R. Shewchuk. Triangle: Engineering a 2D quality mesh generator and Delaunay triangulator. Applied Computational Geometry: Towards Geometric Engineering, 1148:203–222, 1996.
[29] A. St-Cyr, C. Jablonowski, J.M. Dennis, H.M. Tufo, and S.J. Thomas. A comparison of two shallow-water models with nonconforming adaptive grids. Monthly Weather Review, 136:1898–1922, 2008.
[30] P.N. Swarztrauber. Spectral transform methods for solving the shallow water equations on the sphere. Monthly Weather Review, 124:730–744, 1996.
[31] J. Thuburn, T. Ringler, W. Skamarock, and J. Klemp. Numerical representation of geostrophic modes on arbitrarily structured C-grids. Journal of Computational Physics, 228:8321–8335, 2009.
[32] H. Tomita et al. A global cloud-resolving simulation: Preliminary results from an aqua planet experiment. Geophysical Research Letters, 32, 2005.
[33] G. Voronoi. Nouvelles applications des paramètres continus à la théorie des formes quadratiques. J. Reine Angew. Math., 134:198–287, 1908.
[34] Y. Wang, L.R. Leung, J.L. McGregor, D.K. Lee, W.C. Wang, Y. Ding, and F. Kimura. Regional climate modeling: Progress, challenges, and prospects. Journal of the Meteorological Society of Japan, 82:1599–1628, 2010.
[35] H. Weller and H. Weller. A high-order arbitrarily unstructured finite-volume model of the global atmosphere: Tests solving the shallow-water equations. International Journal for Numerical Methods in Fluids, 56:1589–1596, 2008.
[36] H. Weller, H.G. Weller, and A. Fournier. Voronoi, Delaunay, and block-structured mesh refinement for solution of the shallow-water equations on the sphere. Monthly Weather Review, 137:4208–4224, 2009.
[37] D.L. Williamson. Climate simulations with a spectral, semi-Lagrangian model with linear grids. Numerical Methods in Atmospheric and Oceanic Modelling: The André J. Robert Memorial Volume, pages 279–292, 1997.
[38] D.L. Williamson. The evolution of dynamical cores for global atmospheric models. Journal of the Meteorological Society of Japan, 85B:241–269, 2007.
[39] D.L. Williamson et al. A standard test set for numerical approximations to the shallow water equations in spherical geometry. Journal of Computational Physics, 102:211–224, 1992.
BIOGRAPHICAL SKETCH
Douglas Jacobsen performed both his Ph.D. and M.S. work under the advisement of
Prof. Max Gunzburger at Florida State University in the Department of Scientific
Computing. He entered the program in the Fall of 2007 after finishing his bachelor's
degree in Computational Physics at Oregon State University, where he studied the
effect of material defects on hysteresis curves in ferromagnetic materials.
Douglas’ master’s research included exploring the effect of vertical grid structures
and mixing strategies on north Atlantic overflow simulations.
Douglas’ current research interests include high performance computing including
GPGPU computing, spherical grid generation specifically related to SCVTs, and
geophysical fluid dynamics.