Accepted Manuscript
Adaptive Localized Replay: An efficient integration scheme for accurate simulation of coarsening dynamical systems
Scott A. Norris, Skyler Tweedie
PII: S0021-9991(14)00335-0
DOI: 10.1016/j.jcp.2014.05.003
Reference: YJCPH 5249
To appear in: Journal of Computational Physics
Received date: 17 October 2013
Revised date: 1 May 2014
Accepted date: 2 May 2014
Please cite this article in press as: S.A. Norris, S. Tweedie, Adaptive Localized Replay: An efficient integration scheme for accurate simulation of coarsening dynamical systems, Journal of Computational Physics (2014), http://dx.doi.org/10.1016/j.jcp.2014.05.003
Adaptive Localized Replay: an efficient integration scheme for
accurate simulation of coarsening dynamical systems
Scott A. Norris and Skyler Tweedie
May 8, 2014
Department of Mathematics
Southern Methodist University
Abstract
Coarsening dynamical systems (CDS) of ordinary differential equations (ODEs) have as a defining feature the
presence of discontinuities in the governing equations, due to the removal of vanishing elements from the system.
Simple, commonly-used integration strategies for ODEs neglect these discontinuities, resulting in large errors. In
contrast, available methods to resolve discontinuities exhibit total running times that scale like O(N^2) as the number
of elements N → ∞. The resulting need to choose between either speed or accuracy frustrates attempts to gather
well-converged statistical data from large ensembles of elements, a common goal in studies of coarsening.
In this work, we introduce a numerical scheme, called Adaptive Localized Replay (ALR), which overcomes
this dilemma, provided the underlying system has sparse neighbor dependence. By analyzing the system after each
timestep, and then selectively re-simulating only those (small) parts of the system that were affected by discontinu-
ities, we attain high-order accuracy while maintaining running times of O (N). Using a reference implementation of
our algorithm, applied to a sample problem from the field of faceted surface evolution, we obtain convergence and
runtime results for the ALR method, which compares favorably with existing methods.
1 Introduction
The characterization of micro-structured materials is a fundamental issue in materials science, since the statistics
of microstructural morphology play a central role in determining effective material properties. In many regimes of
growth or relaxation, microstructured materials can exhibit coarsening, during which small microstructural features
continually decrease in size until they vanish, causing the average lengthscale of those that remain to increase with
time. Of particular interest, many such systems exhibit dynamic scaling, whereby the microstructural variations ap-
proach a constant statistical state even as the length scale increases. This behavior is of intense interest, because if the
scale-invariant state can be understood theoretically, and its dependence on environmental parameters predicted, then
one has the possibility for creating predictable and controllable microstructures at a statistical level, despite a lack of
control over the structural evolution at any single location within the material.
A particularly interesting coarsening system for the study of scaling phenomena is that of completely faceted
surfaces. These have an inherent structural simplicity that arises because each facet in such a system moves as a single
unit. Hence, in many cases, the nonlinear PDEs that fully describe the system (e.g., [1, 2, 3, 4]) can be simplified
via asymptotic analysis to a facet velocity law providing the normal velocity of each facet (e.g., [5, 6, 7, 8, 9, 10]).
Because the facet velocity law specifies the interfacial velocity of a continuous surface by means of discrete, constant
quantities, the computational complexity of simulating the surface evolution reduces to that of a system of ordinary
differential equations (ODEs). In essence, the surface is governed by a dynamical system of the form
\[ \frac{dx}{dt} = f(x), \tag{1} \]
with the vector x containing one entry per facet, but where the dimension of the system decreases every time a coars-
ening event happens. The resulting structure has been termed a Coarsening Dynamical System (CDS), and the relative
simplicity of the governing equations, combined with the easy availability of standard numerical solvers, represent
notable attractions of the faceted surface problem relative to other coarsening problems. Accordingly, this approach
has been employed frequently for one-dimensional surfaces [11, 12, 13, 8, 10, 14, 15, 16, 17, 18], although less so for
two-dimensional simulation [19, 20, 21, 22, 23, 9, 24], where the resolution of topological events becomes the dominant computational cost (note that the use of phase-field [25, 26] or level-set [21, 22, 27] methods avoids the manual resolution of topological events, at the cost of losing the structure exhibited by Eq.(1)).
For the detailed study of scale-invariance exhibited by coarsening faceted surfaces, it is necessary to gather statis-
tics on the average geometric properties of the system (e.g., the coarsening exponent [28, 29]), to use as the basis for,
or test of, theories attempting to explain or predict the statistical behavior [23, 30]. In order to get well-converged
statistics, one typically wants to simulate systems with as many elements as possible. This is facilitated by the simple
structure of Eq.(1); however, coarsening systems can easily lose most of their elements prior to reaching the scale-
invariant state, and so reasonable convergence within that state can require truly enormous numbers of initial elements.
In order to perform such large simulations as quickly as possible, large timesteps are naturally desired. Finally, to en-
sure that these large timesteps do not introduce too much error into the gathered statistics, one ideally would like to
use high-order methods. Hence, for the acquisition of statistics from the simulation of very large CDSs, the use of
high-order methods for numerical ODEs is highly desired.
It turns out, however, that the use of standard high-order methods on CDS problems with the form (1) is prob-
lematic. Faceted surfaces, in particular, often exhibit facet velocity laws that are configurational, depending on local
facet geometry. During the coarsening process, topological changes to the surface can alter these geometries suddenly,
which means that the resulting dynamical system is generically discontinuous in time. Besides the theoretical prob-
lems posed by these discontinuities, there are computational problems as well. Specifically, the times of the coarsening
events are not known in advance, and generic ODE solvers inevitably overstep these times. Sudden, discontinuous
changes to a few components of the system, if not detected properly, induce errors proportional to the timestep length.
Although algorithms for avoiding these errors exist, the standard method of doing so involves reverting the entire
system to the time a discontinuity occurred, updating the system, and then proceeding. Because the discontinuities are
typically localized during a given timestep, reverting the whole system is extremely wasteful, consuming most of the
computational time in the simulation. Because both the speed and accuracy of simulations are currently limited in this
way, the quantity and quality of statistics that can be gathered and used to test theories is also limited.
In this manuscript we introduce a method for the simulation of Eq.(1) that avoids the poor scaling of traditional
methods for handling discontinuities, while preserving their accuracy, by exploiting the nearest-neighbor structure
of most facet evolution laws. We begin by quantifying the error induced by ignoring the discontinuity altogether
during a timestep of length Δt, and show that for systems with nearest-neighbor dependency structures, the error
scales like Δt^D, where D is the neighbor distance away from the facet exhibiting the discontinuity. Hence, if one wishes to preserve an O(Δt^P) method, one only needs to correct the outcome of the directly affected facets, and the P
nearest neighbors to each side. This can be done by carefully re-simulating only small subdomains of facets associated
with each coarsening event, and using polynomial interpolations of further neighbors to close the system at a small
size. To ensure the accurate correction of such subdomains, an adaptive approach is required that gracefully fails if
unanticipated events are detected, lending the algorithm the name of Adaptive Localized Replay.
2 An example problem and a reference method.
Specific examples of the kinds of CDSs described in the introduction include asymptotic reductions of the 1D Cahn-
Hilliard equation for spinodal decomposition [31], and the generalized convective Cahn-Hilliard equation [13, 2, 8]:
\[ u_t + \alpha u u_x = \left\{ \frac{\delta}{\delta u}\left[ F(u) + \frac{\varepsilon}{2}(u_x)^2 \right] \right\}_{xx} = \left[ F'(u) - \varepsilon u_{xx} \right]_{xx}. \tag{2} \]
Here u(x, t) is an order parameter describing the phase of the system, F (u) is a double-welled energy potential driving
the system toward one well or the other, ε is a small parameter penalizing rapid changes in u(x, t), and α measures
the strength of a non-conserved driving force. The variational structure of the right-hand side drives the system into
a configuration consisting of large alternating domains where u is nearly equal to one of the two minimizers of F (u),
separated by rapid transitions between the two values, called “steps” or “kinks”.
Asymptotic analysis of Eq.(2) leads to relatively simpler expressions for the evolution of the step locations, and
hence, of the domain sizes, which we will denote {xi}. For the original Cahn-Hilliard equation (α = 0), encountered
in studies of phase separation, the CDS governing domain size evolution takes the form [32, 33, 34]

\[ \frac{dx_i}{dt} = \frac{1}{x_{i+1}}\left[ \exp(-x_{i+2}) - \exp(-x_i) \right] + \frac{1}{x_{i-1}}\left[ \exp(-x_i) - \exp(-x_{i-2}) \right]. \tag{3} \]
More recently, the convective form of the Cahn-Hilliard equation (α > 0) has received much attention, appearing in
particular during the solidification of an anisotropic material into an undercooled melt [2]. An asymptotic analysis of
the CCH shows that the resulting CDS for the domain sizes is [13, 8]
\[ \frac{dx_i}{dt} = (-1)^i \left[ \frac{1}{\exp(x_{i+1}) - 1} - \frac{1}{\exp(x_{i-1}) - 1} \right]. \tag{4} \]
To most clearly communicate the essential features of our algorithm, we will here consider a simpler CDS, en-
countered in the directional solidification of binary alloys [35, 10]. Under a simple negative effective thermal gradient,
the vertical velocity of each facet is simply proportional to the vertical co-ordinate of its midpoint, leading to a CDS
on the facet widths ofdxi
dt=
14[2xi − xi−1 − xi+1] . (5)
We are not concerned with the specifics of this system, but for the illustration of a numerical algorithm, it has ad-
vantages over Eqs.(3)-(4). In contrast to Eq.(3), it has a simpler structure, with the rate of change of each domain
length depending only on the nearest neighbors. And, in contrast to Eq.(4), it is continuous for all values of the {xi}.
Nevertheless, it shares important general properties with Eqs.(3)-(4). In particular, we can immediately see that this
system will lead to coarsening. If any particular facet xi is much shorter than the average facet length, then on average,
the right-hand side of Eq.(5) is negative, and the length will continue decreasing until it reaches zero. Assuming that
facet slopes alternate, a zero-length facet causes its own elimination, and also the merger of its two neighbors into a
Figure 1: (color online) (a) A schematic of a coarsening event. The small facet shrinks in size until it vanishes, causing the annihilation of the bounding convex/concave corners, and the merger of its two neighbors into a single, larger facet. (b) A visualization of part of the simulation of a large surface, showing the location of corners colored as in (a); coarsening events occur whenever two corners meet and annihilate.
single, larger facet with length equal to the sum of the neighbors [35]:
\[ (\cdots, x_{i-1}, x_i, x_{i+1}, \cdots) \;\xrightarrow{\;x_i \to 0^+\;}\; (\cdots, x_{i-1} + x_{i+1}, \cdots). \tag{6} \]
A single coarsening event of this nature is illustrated in Figure 1a, and a representative simulation of Eq.(5) containing
many such events is illustrated in Figure 1b.
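As a concrete illustration, the right-hand side of Eq.(5) and the merger rule (6) can be written in a few lines. The sketch below is our own (the function names and the periodic boundary treatment are illustrative assumptions, not part of any reference implementation):

```python
def rhs(x):
    """dx_i/dt = (1/4)(2 x_i - x_{i-1} - x_{i+1}), Eq.(5), with periodic boundaries."""
    n = len(x)
    return [0.25 * (2.0 * x[i] - x[i - 1] - x[(i + 1) % n]) for i in range(n)]

def coarsen(x, i):
    """Eq.(6): facet i vanishes and its two neighbors merge into one facet.
    Assumes an interior facet (0 < i < len(x) - 1) for simplicity."""
    return x[:i - 1] + [x[i - 1] + x[i + 1]] + x[i + 2:]
```

Note that a facet much shorter than its neighbors has a negative rate of change under `rhs`, consistent with the coarsening argument above, and that `coarsen` conserves the total length of the surviving facets.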
By this point, it is clear that the system of equations (5), and any equation with similar behavior, is inherently
discontinuous: for every coarsening event (6), the right-hand side of the equation dx/dt = f(x) changes qualitatively, not just in its values but also in its dimensionality. And the coarsening events that cause these discontinuities can happen at any time; statistically, they almost always fall strictly between timesteps. Hence, it is necessary to consider the numerical simulation of
ODEs with discontinuities. We shall conclude this section by outlining some existing methods for numerical simula-
tion of discontinuous ODEs, and highlighting important limitations that render them unsuitable for the simulation of
large facet ensembles necessary for statistics.
2.1 Fast but Inaccurate: Ignoring the discontinuity.
The easiest way to simulate Eq.(5), used by many authors studying coarsening phenomena (see, e.g. Refs. [13, 8]),
is simply to use a standard solver to iterate Eq.(5), and perform coarsening events only at the discrete times visited by the solver. This is typically accomplished by removing, prior to each step, all facets below some small threshold
value (an “early” fix); one could also remove, after each step, all facets whose length has become negative (a “late”
fix). A simple algorithm describing this approach is the following:
Algorithm 1
Step 1. Take a timestep from tn to tn+1 using a standard ODE solver.
Step 2. List all facets whose values crossed the threshold (or became negative) during the timestep.
Step 3. Eliminate those facets from the system, and patch the surface in a physically-consistent way.
where “physically-consistent” means preserving the total length of both positive and negative-sloped facets. This approach, although simplistic, is easy to implement, with only the repair in Step 3 requiring some geometric considerations. It is also quite efficient, allowing rapid exploration of large ensembles of facets.
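A minimal sketch of Algorithm 1, here using a forward-Euler step and a "late" fix (removing facets whose lengths have gone negative after the step); the function names, the choice of Euler, the periodic boundaries, and the interior-facet assumption are ours:

```python
def rhs(x):
    """Eq.(5) with periodic boundaries (an illustrative choice)."""
    n = len(x)
    return [0.25 * (2.0 * x[i] - x[i - 1] - x[(i + 1) % n]) for i in range(n)]

def step_algorithm1(x, dt):
    """Step 1: a plain Euler step, ignoring discontinuities.
    Steps 2-3 ('late' fix): remove facets that have gone negative,
    merging each one's neighbors.  Assumes vanishing facets are interior."""
    x = [xi + dt * fi for xi, fi in zip(x, rhs(x))]          # Step 1
    while any(v <= 0.0 for v in x):                          # Step 2
        i = min(range(len(x)), key=lambda j: x[j])
        x = x[:i - 1] + [x[i - 1] + x[i + 1]] + x[i + 2:]    # Step 3
    return x
```

The merge in Step 3 is performed at the end of the step rather than at the (unknown) vanishing time, which is precisely the source of the O(Δt) error discussed below.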
Unfortunately, although it is fast and easy to program, Algorithm 1 is also fundamentally limited to being first-
order accurate in time, with errors proportional to the timestep taken. For, suppose facet i vanished during timestep j, at a time t^* ∈ (t_j, t_{j+1}). Then the true governing equations were

\[ \frac{dx}{dt} = \begin{cases} f_1(x), & t \in (t_j, t^*) \\ f_2(x), & t \in (t^*, t_{j+1}) \end{cases} \tag{7} \]
whereas Algorithm 1 simply uses one of these equations for the entire timestep (the second equation if performing “early” fixes, and the first if performing “late” fixes). At coarsening events, the right hand side of Eq.(5) can
change by an amount proportional to the average facet length – i.e., by an O (1) amount. By neglecting the change
in the equations that should have occurred at t = t∗, Algorithm 1 therefore used equations wrong by O (1) for a time
duration proportional to Δt, leading to an error of size O (Δt). Hence, Algorithm 1 can never be more than first-order
accurate, regardless of the underlying solver used to integrate the equations (indeed, results on the accuracy of standard
solvers are built upon the assumption of sufficiently smooth solutions). Any significant degree of accuracy would thus
require very small timesteps, rendering this method unsuitable for the rapid simulation of very large systems.
2.2 Accurate but slow: Full Replay of each discontinuity
A much more accurate method, which does not often appear in the coarsening literature but is still relatively simple
to code, is to estimate the time of a discontinuity using an interpolating polynomial associated with the numerical
solution [36]. One can then revert the system to this time, update the right-hand side of Eq.(5) to the correct state, and
continue the integration onward [37]. Specifically, given the state of the system xn at time tn, we find the state xn+1 at
time tn+1 using the following algorithm:
Algorithm 2
Step 1. Take a full timestep to t = tn+1 with an ordinary ODE solver.
Step 2. Find the time t∗ of the earliest vanishing facet, using a root-finder on interpolating polynomials.
Step 3. Move the entire system back to time t∗ (using either the solver itself, or the interpolating polynomial).
Step 4. Update the state of the data structure and equations to reflect the appropriate coarsening event.
Step 5. Take the rest of the step, from t∗ to tn+1. If additional events are found, return to Step 2.
We shall call this method the Full Replay method, in that once a discontinuity is detected, the full system is reverted
to the discontinuity time, and then re-simulated onward to the end of the timestep.
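A hedged sketch of one Full Replay step follows, using an Euler step and the linear interpolant through the step endpoints to locate t^* (for Euler, advancing directly to t^* is equivalent to stepping past the event and reverting along the interpolant). A production version would use the solver's own dense output and a proper root-finder; the names, the periodic boundaries, and the interior-facet assumption are ours.

```python
def rhs(x):
    """Eq.(5) with periodic boundaries (an illustrative choice)."""
    n = len(x)
    return [0.25 * (2.0 * x[i] - x[i - 1] - x[(i + 1) % n]) for i in range(n)]

def euler(x, h):
    return [xi + h * fi for xi, fi in zip(x, rhs(x))]

def full_replay_step(x, t, dt):
    """One Full Replay step from t to t + dt (Algorithm 2).
    Assumes vanishing facets are interior, for brevity."""
    t_end = t + dt
    while True:
        h = t_end - t
        if h <= 1e-14:                       # step is complete
            return x
        trial = euler(x, h)                  # Step 1: full trial step
        vanished = [i for i, v in enumerate(trial) if v <= 0.0]
        if not vanished:
            return trial
        # Step 2: earliest crossing time from the linear interpolant,
        #   x_i(t+s) ~ x_i + (s/h)(trial_i - x_i)  =>  s* = h x_i / (x_i - trial_i)
        s = min(h * x[i] / (x[i] - trial[i]) for i in vanished)
        x = euler(x, s)                      # Step 3: move system to t* = t + s
        t += s
        k = min(range(len(x)), key=lambda j: x[j])
        x = x[:k - 1] + [x[k - 1] + x[k + 1]] + x[k + 2:]  # Step 4: coarsen
        # Step 5: loop to take the remainder of the step
```

Note that the entire state vector is re-advanced after each event, which is exactly the O(N^2) cost discussed below.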
The primary advantage of Algorithm 2 is that the solver is used to advance the system only between discontinuities,
during which intervals the governing equations are continuous. In other words, it solves the true governing equations
given by Eq.(7), while still satisfying the smoothness assumptions underlying the accuracy results of the solver. If the
polynomial interpolation can be calculated to the same local accuracy as the method itself, then the error induced by
the discontinuity is no worse than the error of the underlying solver, and the overall method retains the accuracy of the
underlying solver.
Unfortunately, Algorithm 2 remains unsuitable for the simulation of large systems because it scales poorly as the
number of facets increases. Specifically, for a large number O (N) of facets, we expect the number of coarsening
events in any fixed interval of time also to scale like O(N). Because Algorithm 2 visits the time of each coarsening event, this algorithm effectively imposes a timestep of order Δt ∝ 1/N, and so the overall simulation time required to advance O(N) facets through a fixed interval scales like O(N^2). This is not a problem for small simulations, but as
the system size increases toward a size large enough to obtain a good statistical characterization of the scale-invariant
state, the cost becomes prohibitive.
3 An improved method: Adaptive Localized Replay
We now introduce a method, briefly envisioned in Refs.[38, 24], that preserves the accuracy of Algorithm 2 while
avoiding the scaling problem it exhibits. Our approach is based on two key observations.
First, because a full simulation - resulting in the removal of most facets - is expected to take many timesteps to complete with reasonable accuracy, only a small fraction of the facets are expected to vanish in any given timestep.
Second, because the governing equations (5) have only a nearest-neighbor dependency, a single, localized coarsening event should have little effect on facets tens or hundreds of neighbors away, at least during a single timestep.
Hence, at Step 3 of Algorithm 2, the reversion of the entire system to the time of the first occurring discontinuity is
likely very wasteful – only a fraction of the facets vanish during each step, and only a somewhat larger fraction (the
vanishing facets plus those in a neighborhood) are likely to feel the effect of these events. To avoid this waste, it
seems reasonable to try and revert only those parts of the network that are affected by the events. This requires both
theoretical and computational care, so in the rest of this section we will first justify our basic approach, and then detail
its implementation.
3.1 Analysis
From Figure 1, we see that, at the moment of discontinuity, the small facet vanishes, and its two neighbors join to form
a much larger facet. Therefore, due to the nearest-neighbor structure of the dynamics (5), every facet to the right of
facet k sees the coarsening event as a sudden change in the length of facet k, by an amount on the order of the average
facet length, or O (1). If an ODE solver fails to detect this change and update the governing equations accordingly,
this introduces errors into the values of nearby facets. Let us quantify these errors as a function of neighbor distance.
We begin by deriving a Taylor series method for the simulation of equations having a nearest-neighbor structure
of the form

\[ \frac{dx_i}{dt} = f_1(x_{i-1}, x_i, x_{i+1}); \tag{8} \]
clearly Eq.(5) is one example of this kind of system. The Taylor series method gives an estimate of the value of the ith
facet x_i at time t + Δt, by means of a Taylor expansion centered at time t:

\[ x_i(t + \Delta t) \approx x_i(t) + x_i'(t)\,\Delta t + \frac{1}{2} x_i''(t)\,(\Delta t)^2 + \dots \tag{9} \]
In order to obtain a quantitative value for the left side of Eq.(9) at time t + Δt, we need values for each of the derivatives on the right hand side at time t. The current value of x_i(t) is obviously one of the variables we are tracking, and its first derivative is available directly from Eq.(8). To obtain higher derivatives of x_i(t), we can differentiate Eq.(8) using
the chain rule. For instance, differentiating once, we obtain for the second derivative the expression
\[ \begin{aligned} \frac{d^2 x_i}{dt^2} &= \frac{\partial f_1}{\partial x_{i-1}} \frac{dx_{i-1}}{dt} + \frac{\partial f_1}{\partial x_i} \frac{dx_i}{dt} + \frac{\partial f_1}{\partial x_{i+1}} \frac{dx_{i+1}}{dt} \\ &= \frac{\partial f_1}{\partial x_{i-1}} f_1(x_{i-2}, x_{i-1}, x_i) + \frac{\partial f_1}{\partial x_i} f_1(x_{i-1}, x_i, x_{i+1}) + \frac{\partial f_1}{\partial x_{i+1}} f_1(x_i, x_{i+1}, x_{i+2}) \\ &\equiv f_2(x_{i-2}, x_{i-1}, x_i, x_{i+1}, x_{i+2}) \end{aligned} \tag{10} \]
where we have introduced the new function f2 to stand for the more complicated middle expression in Eq.(10). In
general, we can continue computing higher derivatives iteratively in this way, defining new functions in terms of
derivatives of the previous functions, via the following recursion relation:
\[ \begin{aligned} \frac{d^n x_i}{dt^n} &= \sum_{k=-n}^{n} \frac{\partial f_{n-1}}{\partial x_{i+k}} \frac{dx_{i+k}}{dt} \\ &= \sum_{k=-n}^{n} \frac{\partial f_{n-1}}{\partial x_{i+k}} f_1(x_{i+k-1}, x_{i+k}, x_{i+k+1}) \\ &\equiv f_n(\{x_{i+k} : k \in [-n, n]\}) \end{aligned} \tag{11} \]
The exact form of the functions f_n(·) clearly becomes complicated, but we observe the general feature that, because of the nearest-neighbor structure of the original system of ODEs, the nth time derivative of a facet length depends only on the lengths of the n nearest neighbors in both directions. Inserting these results back into Eq.(9), we obtain

\[ x_i(t + \Delta t) \approx x_i + \Delta t\, f_1(x_{i-1}, x_i, x_{i+1}) + \frac{\Delta t^2}{2!} f_2(x_{i-2}, x_{i-1}, x_i, x_{i+1}, x_{i+2}) + \frac{\Delta t^3}{3!} f_3(x_{i-3}, x_{i-2}, x_{i-1}, x_i, x_{i+1}, x_{i+2}, x_{i+3}) + \dots \tag{12} \]
Equation (12) tells us that an O(1) error, at time t, in the value of the nth neighbor of x_i will appear first in the function f_n(·), and therefore induces an error of size O(Δt^n) in the value of x_i(t + Δt). Alternatively, we can say that, if we seek a method with overall accuracy of O(Δt^P), then an O(1) error at facet x_k destroys the accuracy only of the Pth nearest neighbors of x_k. Hence, for the coarsening event shown in Fig. 1a, where three facets are replaced by a single facet with qualitatively different length, only these three facets and their P nearest neighbors need any attention; all facets further away remain accurate to within the order of the method.
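This prediction can be checked numerically for the linear system (5): an O(1) perturbation applied D neighbors away should change the result of a single Runge-Kutta step by O(Δt^D). The sketch below is our own illustration (function names and test values assumed), not part of the reference implementation:

```python
def rhs(x):
    """Eq.(5) with periodic boundaries (an illustrative choice)."""
    n = len(x)
    return [0.25 * (2.0 * x[i] - x[i - 1] - x[(i + 1) % n]) for i in range(n)]

def rk4(x, dt):
    """One classical fourth-order Runge-Kutta step."""
    add = lambda a, b, c: [ai + c * bi for ai, bi in zip(a, b)]
    k1 = rhs(x)
    k2 = rhs(add(x, k1, dt / 2.0))
    k3 = rhs(add(x, k2, dt / 2.0))
    k4 = rhs(add(x, k3, dt))
    return [xi + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
            for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]

def induced_error(dt, D, n=20, i=10, eps=1.0):
    """Error in x_i after one RK4 step caused by an O(1) perturbation
    applied D neighbors away."""
    x = [1.0] * n
    y = list(x)
    y[i + D] += eps
    return abs(rk4(x, dt)[i] - rk4(y, dt)[i])

# Halving dt should shrink the induced error at distance D = 2 by roughly 2^2 = 4.
ratio = induced_error(0.1, 2) / induced_error(0.05, 2)
```

Because the four RK4 stages propagate information at most four neighbors per step, a perturbation at distance D = 5 produces no change at all in x_i after a single step, consistent with the neighbor-widening structure of Eq.(12).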
3.2 Algorithm
The preceding analytical result suggests a strategy for an accurate method that can run in O (N) time. Briefly, we
first take the system through a single solver step, ignoring any discontinuities as in Algorithm 1. We then identify
the facets that were involved in coarsening events during the timestep. Except for these facets and their Pth nearest
neighbors, all of the other facets in the system must be accurate to O(Δt^P). So, if we could re-simulate the evolution
of these subdomains using the more accurate Algorithm 2, we would have accurate values for all facets in the system.
Furthermore, because the subdomain size 2P+ 3 is independent of the system size N, the subdomains can be re-
simulated in O (1) time, and the overall method should be O (N).
The challenge posed by re-simulating only a subdomain is the nearest-neighbor structure of Eq.(5), in which the evolution of the (P+1)th neighbors requires knowledge of the (P+2)th, which requires knowledge of the (P+3)th, and so on. However, because the (P+2)th neighbors are accurate to the order of the method, we may replace their values
during the re-simulation with polynomial interpolations of their evolution during the original solver step, resulting
in a closed system of 2P+ 3 equations. Care must be taken, because patches can overlap, and the re-simulation
process, because it is more accurate than the first “test” solve, may reveal additional events not originally detected.
Nevertheless, such difficulties can be handled in a robust manner according to the following algorithm.
Algorithm 3.
Step 1. Take a full timestep from tn to tn+1 with an ordinary ODE solver.
Step 2. Identify all facets that vanished during the timestep, which will have attained a negative length by the end of
the timestep.
Step 3. For each vanished facet, construct a subdomain consisting of (a) the facet itself and its nearest neighbors,
which join to form the new facet, (b) a “buffer zone” of the P nearest neighbors of these facets in each direction,
and (c) an O(Δt^P) polynomial interpolation of the evolution of the (P+1)st neighbors, or boundary facets. The nearest-neighbor structure of Eq.(5) ensures that errors due to vanished facets did not reach these boundaries
during Step 1.
Step 4. Combine any overlapping subdomains into larger subdomains, consisting of polynomial interpolations of the
leftmost and rightmost boundary facets, and all of the facets in between; place the resulting unique subdomains
into a list.
Step 5. Re-simulate each subdomain using Algorithm 2, substituting the included polynomial interpolants for the
values of the boundary facets within the system of ODEs. Because the evolution of the boundary facets is
accurate to the same O(Δt^P) as the global method, the interpolants may be used to provide closure to the re-
simulation of the subdomains. The use of Algorithm 2 for the re-simulation ensures that the correct governing
equations are used at all times. The more accurate re-simulation in Step 5 may reveal additional coarsening
events not originally detected in Step 2. Whether or not this is a problem depends on the location of the event.
Step 6. If a re-simulation is completed with no newly-discovered events inside the “buffer zones,” then the result is
taken to be an accurate correction to the result of Step 1. However, we do not immediately apply the correction,
but rather move the subdomain to a list of finished “patches” to the outcome of Step 1. Proceed to the next
subdomain and return to Step 5.
Step 7. On the other hand, if re-simulation reveals additional coarsening events that occur inside the buffer zone, then
the result remains insufficiently accurate, as the newly-discovered vanishing facets induce errors beyond the
original boundary facets. In this case, more care must be taken.
Step 7a. We first restore the re-simulated facets to their original values, and enlarge the subdomain so as to include the new vanishing facet, as well as sufficient buffer zones and boundary polynomials on the left and right hand sides.
Step 7b. Because this subdomain has grown larger than its initial size, it may now overlap with a previously distinct domain. We must compare the enlarged subdomain to both the list of waiting subdomains,
and also the list of finished patches.
Step 7c. If the enlarged subdomain overlaps with any member of these two lists, then we delete that mem-
ber, and uniquely merge its contents into the existing subdomain, including appropriate buffer zones and
boundary polynomials. Return to Step 5 with the enlarged subdomain.
Step 8. Proceed iteratively through Steps 5-7 until the list of subdomains is empty. At this point, the list of patches
contains updated values for all facets for which Step 1 produced large errors. Apply each patch in the list of
patches to the outcome of Step 1.
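The bookkeeping in Steps 3-4 (and the merging in Steps 7b-7c) amounts to merging overlapping index intervals. A minimal sketch, with the interval convention and helper names our own:

```python
def subdomain(i, P):
    """Step 3: facet i and its two merging neighbors, plus a P-facet buffer;
    the interval endpoints are the boundary (interpolated) facets."""
    return (i - 1 - P - 1, i + 1 + P + 1)

def merge_overlapping(intervals):
    """Step 4: combine overlapping subdomains into unique larger ones."""
    merged = []
    for lo, hi in sorted(intervals):
        if merged and lo <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], hi))
        else:
            merged.append((lo, hi))
    return merged
```

With P = 2 and events at facets 5, 7, and 16 (the configuration of Figure 2), the first two zones overlap and merge into a single subdomain while the third remains distinct, matching the behavior shown in panels (a)-(b) of that figure.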
To distinguish it from the Full Replay method (Algorithm 2), we shall call this method the Adaptive Localized Replay
(ALR) method, because only facets near to the coarsening events must be re-simulated, at the cost of a more complex,
adaptive algorithm.
The most important features of this algorithm are: (Step 1) most of the surface can be accurately simulated by a
standard ODE solver; (Steps 2-5) only localized regions of the surface need to be re-simulated with the discontinuity-
aware Algorithm 2, which can be done by closing the subdomains with polynomial interpolations of the boundary
facets; and (Step 7) unanticipated events discovered during re-simulation can be gracefully handled by an adaptive
procedure, regardless of whether or not those facets were originally detected in Step 2. It is this adaptive re-simulation
of local patches of surface - required to make the method O (N) - which is unusual in this approach, and gives the
method its name.
4 Testing and Results
We have implemented the Adaptive Localized Replay algorithm for the coarsening dynamical system described by
Eq.(5). In addition, we have implemented the simpler Algorithm 2 to validate the accuracy of our implementation of
Algorithm 3. Finally, we have implemented the common, but inaccurate, Algorithm 1 for reference. For each of these
algorithms, we will perform test simulations using a variety of underlying Runge-Kutta solvers, of varying orders of
accuracy. We present results on the relative accuracy of each combination, as well as the speed of the ALR method
relative to the Full Replay method.
Convergence: individual facet lengths. We first demonstrate the convergence properties of all three algorithms
described above. For an initial condition consisting of 100 facets with random lengths, we simulate from t = 0 to
t = 1, during which more than half of the facets vanish (one time unit is therefore roughly a “half-life” of a facet under
Eq.(5), and serves as a characteristic timescale of the overall system). As a reference solution, we solve via Algorithm
2, using RK4 with a timestep of Δt = .0005. Then, we compute the solution using all three algorithms, using each of
Figure 2: Schematic of various parts of the algorithm, for a patch of surface containing multiple small facets about to coarsen, and specially constructed so as to exhibit all possible difficulties. For the sake of a smaller illustration, we use buffer zones appropriate to a method of order P = 2. (a) During a test timestep (Step 1), three small facets k = {5, 7, 16} are found to have vanished (Step 2), and re-simulation zones are constructed (Step 3). (b) Prior to re-simulation, two of the zones are found to overlap, and are merged into a single zone (Step 4). (c) During re-simulation of the subdomains (Step 5), the left domain finishes successfully (Step 6), but the right domain encounters the unanticipated coarsening of facet k = 14 (Step 7). That domain is increased in size to account for the newly-discovered event (Step 7a). (d) However, the enlargement causes the new domain to overlap an existing domain (Step 7b), and so the contents of the two domains are merged (Step 7c), and the entire merged domain is re-simulated again as a single unit using Algorithm 2 (Step 5).
Figure 3: Convergence results on individual facet lengths, comparing each of the various algorithms described herein as the timestep Δt → 0. (a) If facets are merely removed after reaching a small threshold (Algorithm 1), the neglect of discontinuities leads to only O(Δt) convergence, regardless of the underlying solver. (b) In contrast, both the Full Replay method (Algorithm 2) and the ALR method (Algorithm 3) converge with the accuracy of the underlying solver, despite the presence of O(1) discontinuities in the solution. (Note: the less accurate methods, including all instances of Algorithm 1, can fail to observe some events that are captured by the reference solution, resulting in a computed solution with a different number of elements than the reference solution. In these cases, the application of Eq.(13) is impossible, and the “error” is instead set to unity.)
Euler, RK2, and RK4 as the underlying solver, for time steps Δt ∈ {.1, .05, .02, .01, .005, .002, .001}, and record the
error in each of these solutions relative to the “exact” solution. As an error measure we use the infinity norm
e(Δt) = max_k |x_{k,Δt} − x_{k,0.0005}|.  (13)
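As a concrete illustration, the error measure of Eq. (13) can be evaluated in a few lines. The sketch below assumes the facet lengths are stored as NumPy arrays; the function name is ours, and the fallback to unity mirrors the convention described in the caption of Figure 3.

```python
import numpy as np

def infinity_norm_error(x_dt, x_ref):
    """Eq. (13): the largest absolute difference, over all facets k,
    between a coarse solution x_dt and the reference solution x_ref.

    If the two solutions contain different numbers of facets (some
    coarsening events were missed), Eq. (13) cannot be applied and
    the error is set to unity, as in Figure 3."""
    x_dt = np.asarray(x_dt, dtype=float)
    x_ref = np.asarray(x_ref, dtype=float)
    if x_dt.shape != x_ref.shape:
        return 1.0
    return float(np.max(np.abs(x_dt - x_ref)))
```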
The error for the various methods as a function of Δt is shown in Figure 3, in log-log form. In Figure 3a, we see that
Algorithm 1 never attains better than first-order accuracy, regardless of the underlying solver. In Figure 3b, however,
we see that both the Full Replay and the ALR methods retain the accuracy of the underlying Runge-Kutta solver, despite
the presence of discontinuities due to coarsening. For relatively large values of Δt, the smaller error of the Full Replay
method is not due to any greater accuracy of the method itself, but rather to the effect discussed in Section 2.2, where
due to the full resimulation at each coarsening event, the effective timestep is actually smaller than the value shown on
the axis. As Δt → 0, the errors of both methods approach the same value.
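The observed order of accuracy in Figure 3 corresponds to the slope of these log-log curves. A minimal sketch of how such a slope might be estimated from (Δt, error) pairs; the function name and the least-squares fitting choice are ours:

```python
import numpy as np

def observed_order(dts, errors):
    """Estimate the observed order of accuracy p from error data
    behaving as e(dt) ~ C * dt**p, via a least-squares fit of
    log10(e) against log10(dt): the slope of a log-log plot."""
    slope, _intercept = np.polyfit(np.log10(dts), np.log10(errors), 1)
    return slope
```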
Convergence: statistical measures. A primary aim of large ensemble simulations is the collection of statistics, so
we also wish to demonstrate the convergence of some representative statistical property. The mean facet length is a
quantity of obvious interest, but because the total interface length is conserved, the mean facet length varies discretely
with the number of remaining facets, and is therefore a poor choice for convergence studies. However, subsequent
moments of the facet length distribution are continuous functions of the facet lengths. We therefore present in Figure 4
results on the convergence of the variance of the distribution, illustrating that statistics as well as individual variables
are affected by the numerical error associated with discontinuities. Here, using the now-validated ALR method, we ran simulations beginning with 1,000,000 facets for five time units, by which time only a few thousand facets remained and the distribution had reached scale-invariant behavior. Examining the dependence of the variance of this distribution on the timestep, we see the same behavior as for the individual facet lengths. Skewness, kurtosis, and higher moments all display similar
Figure 4: Convergence results for statistical data, using the ALR method with various underlying Runge-Kutta solvers. Here a sample of 1,000,000 facets is simulated for five time units, until only a few thousand remain, which have reached a scale-invariant state. At that time the variance of the length distribution is measured, and plotted as a function of the timestep used to perform the simulation. The less-accurate methods have significantly more remaining facets, leading to relatively large errors in the statistical data.
behavior.
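For reference, the centered moments discussed here follow directly from the list of facet lengths. The sketch below uses population (biased) normalization and illustrative names; it is not the paper's analysis code.

```python
import numpy as np

def distribution_moments(lengths):
    """Variance, skewness, and kurtosis of a facet-length
    distribution: the statistical measures whose convergence is
    examined in Figure 4. The mean itself varies discretely with
    the facet count and is a poor convergence measure, as noted in
    the text, so it is not returned."""
    x = np.asarray(lengths, dtype=float)
    mu = x.mean()
    var = ((x - mu) ** 2).mean()
    sd = np.sqrt(var)
    skew = (((x - mu) / sd) ** 3).mean()
    kurt = (((x - mu) / sd) ** 4).mean()  # raw (non-excess) kurtosis
    return var, skew, kurt
```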
Runtime Comparison. We now compare the run time of each method, by calculating the total CPU time needed to simulate from t = 0 to t = 1 for a variety of system sizes N. Because the main practical advantage of the ALR method lies not in extreme accuracy, but rather in reduced simulation time, we present results for a timestep of Δt = 0.1. This is close to the characteristic timescale of the system, but according to Figure 3 it still provides around five digits of accuracy when RK4 is used as the underlying solver. This choice therefore represents a reasonable balance of accuracy
and speed.
In Figure 5, we see clearly the effect discussed in Section 2.2, in which the Full Replay method scales like O(N²) for large systems. On the other hand, by adaptively re-simulating only local patches of the surface near to coarsening
events, the ALR method achieves essentially the same accuracy in only O (N) time. This difference powerfully
illustrates the advantage of the adaptive approach. For instance, the construction of Figure 4 required 21 simulations
of 1,000,000 facets each. Using the ALR method, this took approximately three hours of CPU time. By contrast,
if we had attempted to obtain the corresponding plots using the Full Replay method, extrapolation of the associated
curves in Figure 5 suggests a run time three and a half orders of magnitude longer, or approximately one year of CPU
time.
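The extrapolation just described amounts to fitting a power law t = C·N^a to the measured run times and evaluating it at a larger N. A hedged sketch (illustrative data and names, not the profiling code actually used):

```python
import numpy as np

def extrapolate_runtime(sizes, times, n_target):
    """Extrapolate measured run times to a larger system size,
    assuming a power law t = C * N**a (the straight lines of
    Figure 5). A fitted exponent a near 1 indicates the O(N)
    scaling of ALR; a near 2, the O(N^2) scaling of Full Replay."""
    a, log_c = np.polyfit(np.log10(sizes), np.log10(times), 1)
    return 10.0 ** (log_c + a * np.log10(n_target))
```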
According to profiling software, the vast majority of the runtime in the ALR method (around 90%) is spent re-simulating the subdomains, so that the Full Replay method remains competitive for domains of up to 200 facets. However, this is likely due to our use of Python for this reference implementation. Integrations of the full system, which dominate the Full Replay method, are performed in vectorized form by compiled NumPy libraries, and are therefore quite efficient. By contrast, the re-simulations and associated bookkeeping operations in the ALR method incur a large overhead due to the interpreted nature of Python. Were this method to be implemented in a compiled language, the cost of re-simulation, and therefore of the ALR method as a whole, would be substantially lower.
Figure 5: Comparison of runtime for the Full Replay and ALR methods, for various system sizes N. The O(N²) running time of the Full Replay method is readily apparent, as is the O(N) running time of the ALR method.
Relative Frequency of Events. Finally, we present a brief summary of the relative frequency of different parts of
Algorithm 3, again using a timestep of Δt = 0.1. At this timestep, an average system containing 10,000 facets will
display the following behavior during each step:
• Around 800 facets will vanish, requiring discontinuity detection and subdomain construction (Steps 2-3). Because
the fourth-order method requires 11 consecutive facets to be re-simulated per coarsening event, it is clear that a
large fraction of the facets must be re-simulated each timestep at this Δt.
• Due to the high fraction of vanishing facets, the 800 subdomains will display considerable overlap, resulting in
only around 350 distinct groups (Step 4). Hence, the average subdomain contains two to three events, involving
around 20-30 facets. At these sizes the overhead of Algorithm 2 is not significant.
• Of the 350 groups, around 300 will re-simulate successfully with no unexpected discontinuities (Steps 5-6).
However, around 50 will reveal additional events that were not detected initially, requiring the enlargement of
the subdomain before attempted re-simulation (Step 7a).
• Of the 50 enlargements, around 10 will turn out to overlap with an existing subdomain (Step 7b), requiring an
additional merger before re-simulation (Step 7c).
The latter observations highlight the importance of the adaptive nature of our approach. Without adaptivity, around
0.5% of coarsening events will be missed, which reduces the accuracy of around 5% of facets. This occurs each
timestep, and so most facets would be affected by the end of a simulation.
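The subdomain merging that produces these statistics (Steps 4 and 7b-7c) is, at its core, a merge of overlapping index intervals. A minimal sketch, with an interval representation of our own choosing:

```python
def merge_subdomains(intervals):
    """Merge overlapping or touching index intervals, as in Step 4
    of the algorithm, where ~800 per-event re-simulation zones
    collapse to ~350 distinct groups. Each interval is a (lo, hi)
    pair of facet indices; this representation is illustrative."""
    merged = []
    for lo, hi in sorted(intervals):
        if merged and lo <= merged[-1][1]:
            # Overlaps (or touches) the previous group: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], hi))
        else:
            merged.append((lo, hi))
    return merged
```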
We note that further increases in the timestep will be accompanied by increases in the number of vanishing facets.
Eventually, this will lead to percolation of the subdomains into extremely large groups of size O(N). The re-simulation
of these domains under Algorithm 2 then takes O(N²) time, and the speed advantage of the method is lost. However,
this merely reflects the presence of a characteristic time over which the governing equations of all facets change
discontinuously. It seems reasonable that this timescale represents a physical upper limit on the possible timestep size.
5 Conclusions and Future Work
We have presented a numerical scheme named Adaptive Localized Replay for the integration of coarsening dynamical
systems, which achieves high-order accuracy without the poor scaling performance typical of traditional algorithms
for integrating differential equations with discontinuities. For a fixed number of CPU cycles, our approach can perform
much more accurate simulations than Algorithm 1, resulting in a significant reduction in numerical error; or much
larger simulations than Algorithm 2, resulting in a significant reduction in statistical error. This combination of
benefits should be of particular interest to the coarsening community, where very large ensembles must be simulated
as rapidly as possible without excessive numerical error. By providing high-order accuracy in O(N) time, our
method should significantly enhance practitioners' ability to conduct exactly this kind of simulation.
Although our algorithm was developed in the context of faceted surface evolution, it is not specific to that particular
application. The ordered nature of a collection of facets makes the neighbor structure easy to visualize, and is therefore
an ideal first application of the ideas. However, the algorithm should be applicable to any system of discontinuous
ODEs for which the right-hand side of the equation has a sparse neighbor structure, i.e., where the number of "neighbors" of each element does not scale with the system size. For our demonstration problem, regardless of the number
of facets being considered, the velocity function of each facet depends only on two neighboring facets, clearly meeting
this requirement. However, many problems of physical interest can be described by systems of discontinuous ODEs
with only local, and therefore sparse, connectivity structures, and we anticipate that our approach will generalize to
such problems in a straightforward manner.
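To make the sparsity requirement concrete, a right-hand side of this form might be sketched as follows; the periodic boundary handling and the placeholder velocity function are our own, not the specific dynamics of Eq. (5):

```python
import numpy as np

def sparse_rhs(x, velocity):
    """Right-hand side with the sparse neighbor structure required
    by ALR: the rate of change of element k depends only on its two
    neighbors, dx_k/dt = velocity(x[k-1], x[k], x[k+1]). Periodic
    boundaries are assumed for simplicity."""
    left = np.roll(x, 1)    # x[k-1], wrapping around
    right = np.roll(x, -1)  # x[k+1], wrapping around
    return velocity(left, x, right)
```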
The system of Equations (5), although associated with a specific physical problem, is interesting primarily as a
simple model system in which the essential features of the algorithm could be clearly expressed. Future work will
include the extension to systems of broader general interest, each with its own specific complicating details. For
instance, the reduced ODE system associated with the Cahn-Hilliard equations of phase separation, written in the
form (3), has a second-nearest-neighbor structure [32, 33]; an analysis similar to that in Sec. 3.1 would reveal that
twice as many facets must be re-simulated per domain to retain solver-level accuracy. On the other hand, the reduced
ODE system associated with the convective Cahn-Hilliard equation, in Eq. (4), has a right hand side that diverges as
facets shrink to zero size [13, 8]; facets therefore exhibit a singular approach to zero, requiring more careful work
to obtain good estimates of the coarsening time. Finally, this method could also be extended to higher-dimensional
surfaces, with a wider variety of events to detect [24]. There, the complication would be that facets on such surfaces
no longer have a unique ordering, and so the concept of “neighbors” must be generalized. These are all tractable
extensions of the present work, but they shall not be pursued here.
Acknowledgements. SAN thanks Daniel Reynolds and Larry Shampine for many helpful discussions. ST acknowledges the support of an SMU Undergraduate Research Assistantship.
References
[1] J. Villain. Continuum models of crystal growth from atomic beams with and without desorption. Journal de
Physique I (France), 1:19–42, 1991.
[2] A. A. Golovin, S. H. Davis, and A. A. Nepomnyashchy. A convective Cahn-Hilliard model for the formation of
facets and corners in crystal growth. Physica D, 122:202–230, 1998.
[3] P. Politi, G. Grenet, A. Marty, A. Ponchet, and J. Villain. Instabilities in crystal growth by atomic or molecular
beams. Physics Reports, 324:271–404, 2000.
[4] T. V. Savina, A. A. Golovin, S. H. Davis, A. A. Nepomnyashchy, and P. W. Voorhees. Faceting of a growing
crystal surface by surface diffusion. Phys. Rev. E, 67:021606, 2003.
[5] A. van der Drift. Evolutionary selection, a principle governing growth orientation in vapor-deposited layers.
Philips Research Reports, 22:267–288, 1967.
[6] A. N. Kolmogorov. To the "geometric selection" of crystals. Dokl. Acad. Nauk. USSR, 65:681–684, 1940.
[7] M. E. Gurtin and P. W. Voorhees. On the effects of elastic stress on the motion of fully faceted interfaces. Acta
Materialia, 46(6):2103–2112, 1998.
[8] S. J. Watson, F. Otto, B. Y. Rubinstein, and S. H. Davis. Coarsening dynamics of the convective Cahn-Hilliard
equation. Physica D, 178(3-4):127–148, April 2003.
[9] S. A. Norris and S. J. Watson. Geometric simulation and surface statistics of coarsening faceted surfaces. Acta
Materialia, 55:6444–6452, 2007.
[10] S. A. Norris, S. H. Davis, P. W. Voorhees, and S. J. Watson. Faceted interfaces in directional solidification.
Journal of Crystal Growth, 310:414–427, 2008.
[11] L. Pfeiffer, S. Paine, G. H. Gilmer, W. van Saarloos, and K. W. West. Pattern formation resulting from faceted
growth in zone-melted thin films. Phys. Rev. Lett., 54(17):1944–1947, 1985.
[12] D. K. Shangguan and J. D. Hunt. Dynamical study of the pattern formation of faceted cellular array growth.
Journal of Crystal Growth, 96:856–870, 1989.
[13] C. L. Emmott and A. J. Bray. Coarsening dynamics of a one-dimensional driven Cahn-Hilliard system. Phys.
Rev. E, 54(5):4568–4575, 1996.
[14] C. Wild, N. Herres, and P. Koidl. Texture formation in polycrystalline diamond films. J. Appl. Phys., 68(3):973–
978, 1990.
[15] J. M. Thijssen, H. J. F. Knops, and A. J. Dammers. Dynamic scaling in polycrystalline growth. Phys. Rev. B,
45(15):8650–8656, 1992.
[16] Paritosh, D. J. Srolovitz, C. C. Battaile, and J. E. Butler. Simulation of faceted film growth in two dimensions:
Microstructure, morphology, and texture. Acta Materialia, 47(7):2269–2281, 1999.
[17] J. Zhang and J. B. Adams. FACET: a novel model of simulation and visualization of polycrystalline thin film
growth. Modeling Simul. Mater. Sci. Eng, 10:381–401, 2002.
[18] J. Zhang and J. B. Adams. Modeling and visualization of polycrystalline thin film growth. Computational
Materials Science, 31(3-4):317–328, November 2004.
[19] J. M. Thijssen. Simulations of polycrystalline growth in 2+1 dimensions. Phys. Rev. B, 51(3):1985–1988, 1995.
[20] S. Barrat, P. Pigeat, and E. Bauer-Grosse. Three-dimensional simulation of CVD diamond film growth. Diamond
and Related Materials, 5:276–280, 1996.
[21] G. Russo and P. Smereka. A level-set method for the evolution of faceted crystals. SIAM Journal of Scientific
Computing, 21(6):2073–2095, 2000.
[22] P. Smereka, X. Li, G. Russo, and D. J. Srolovitz. Simulation of faceted film growth in three dimensions: Mi-
crostructure, morphology, and texture. Acta Materialia, 53:1191–1204, 2005.
[23] S. J. Watson and S. A. Norris. Scaling theory and morphometrics for a coarsening multiscale surface, via a
principle of maximal dissipation. Phys. Rev. Lett., 96:176103, 2006.
[24] S.A. Norris and S. J. Watson. Simulating the kinematics of completely faceted surfaces. Journal of Computa-
tional Physics, 231:4560–4577, 2012.
[25] J. E. Taylor and J. W. Cahn. Diffuse interfaces with sharp comers and facets: Phase field models with strongly
anisotropic surfaces. Physica D, 112:381–411, 1998.
[26] J. J. Eggleston, G. B. McFadden, and P. W. Voorhees. A phase-field model for highly anisotropic interfacial
energy. Physica D, 150(1-2):91–103, March 2001.
[27] Colin Ophus, Erik Luber, and David Mitlin. Simulations of faceted polycrystalline thin films: Asymptotic
analysis. Acta Materialia, 57:1327–1336, 2009.
[28] M. Siegert. Coarsening dynamics of crystalline thin films. Phys. Rev. Lett., 81(25):5481–5484, 1998.
[29] A. A. Golovin, S. H. Davis, and A. A. Nepomnyashchy. Model for faceting in a kinetically controlled crystal
growth. Phys. Rev. E, 59:803–825, 1999.
[30] Sofia Biagi, Chaouqi Misbah, and Paolo Politi. Coarsening scenarios in unstable crystal growth. Physical Review
Letters, 109:096101, 2012.
[31] J. W. Cahn and J. E. Hilliard. Free energy of a nonuniform system. I. Interfacial free energy. J. Chem. Phys.,
28:258, 1958.
[32] K. Kawasaki and T. Ohta. Kink dynamics in one-dimensional nonlinear systems. Physica A, 116:573–593, 1982.
[33] T. Kawakatsu and T. Munakata. Kink dynamics in a one-dimensional conserved TDGL system. Progress of
Theoretical Physics, 74:11–19, 1985.
[34] P. W. Bates and J. Xun. Metastable patterns for the Cahn-Hilliard equation, Part II: Layer dynamics and slow
invariant manifold. Journal of Differential Equations, 117:165–216, 1995.
[35] S. A. Norris and S. J. Watson. A mean field theory for coarsening faceted surfaces. Phys. Rev. E, 85:021608,
2012.
[36] L. F. Shampine. Interpolation for Runge-Kutta methods. SIAM Journal on Numerical Analysis, 22:1014–1027,
1985.
[37] E. Hairer, S.P. Nørsett, and G. Wanner. Solving Ordinary Differential Equations I: Nonstiff Problems. Springer,
2009.
[38] Scott A. Norris. Evolving Faceted Surfaces: from continuum modeling, to geometric simulation, to mean-field
theory. PhD thesis, Northwestern University, Evanston, IL, 2006.