Dimension Reduction in “Heat Bath” Models
Raz Kupferman, The Hebrew University
Part I: Convergence Results
Andrew Stuart, John Terry, Paul Tupper, R.K.
Ergodicity results: Paul Tupper’s talk.
Set-up: Kac-Zwanzig Models
A (large) mechanical system consisting of a “distinguished” particle interacting through springs with a collection of “heat bath” particles.
The heat bath particles have random initial data (Gibbsian distribution).
Goal: derive “reduced” dynamics for the distinguished particle.
Motivation
•Represents a class of problems where dimension reduction is sought and rigorous analysis is possible.
•Convenient test problem for recent dimension reduction approaches/techniques:
•optimal prediction (Kast)
•stochastic modeling
•hidden Markov models (Huisinga-Stuart-Schuette)
•coarse-grained time stepping (Warren-Stuart, Hald-K)
•time-series model identification (Stuart-Wiberg)
The governing equations
$(P_n, Q_n)$: coordinates of distinguished particle
$(p_j, q_j)$: coordinates of j-th heat bath particle
$m_j$: mass of j-th particle
$k_j$: stiffness of j-th spring
$V(Q)$: external potential
The Hamiltonian:
\[
H = \frac{1}{2} P_n^2 + V(Q_n) + \sum_{j=1}^{n} \frac{p_j^2}{2 m_j} + \sum_{j=1}^{n} \frac{k_j}{2} (q_j - Q_n)^2
\]
The equations of motion:
\[
\begin{aligned}
\dot{Q}_n &= P_n \\
\dot{P}_n &= -V'(Q_n) + \sum_j k_j (q_j - Q_n) \\
\dot{q}_j &= p_j / m_j \\
\dot{p}_j &= -k_j (q_j - Q_n)
\end{aligned}
\]
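As an illustration of what "solving the large system" involves, here is a minimal sketch that integrates these equations of motion with velocity Verlet and monitors the Hamiltonian. The integrator, the double-well potential, and all parameter values are assumptions made for this example; the spring energy is taken as $k_j (q_j - Q_n)^2 / 2$, the normalization consistent with the equations of motion above.

```python
import numpy as np

def forces(Q, q, k, dV):
    """Force on the distinguished particle and on each bath particle."""
    spring = k * (q - Q)
    return -dV(Q) + spring.sum(), -spring

def energy(Q, P, q, p, m, k, V):
    """Hamiltonian with spring energy k_j (q_j - Q)^2 / 2."""
    return (0.5 * P**2 + V(Q)
            + np.sum(p**2 / (2.0 * m))
            + np.sum(0.5 * k * (q - Q)**2))

def velocity_verlet(Q, P, q, p, m, k, dV, dt, n_steps):
    """Integrate the equations of motion; returns final state and Q trajectory."""
    q, p = q.copy(), p.copy()
    traj = np.empty(n_steps + 1)
    traj[0] = Q
    F, f = forces(Q, q, k, dV)
    for i in range(n_steps):
        P = P + 0.5 * dt * F      # half kick
        p = p + 0.5 * dt * f
        Q = Q + dt * P            # drift (distinguished particle has unit mass)
        q = q + dt * p / m
        F, f = forces(Q, q, k, dV)
        P = P + 0.5 * dt * F      # half kick
        p = p + 0.5 * dt * f
        traj[i + 1] = Q
    return (Q, P, q, p), traj

# Hypothetical small test problem: double-well potential, 50 bath particles.
V = lambda Q: 0.25 * Q**4 - 0.5 * Q**2
dV = lambda Q: Q**3 - Q
rng = np.random.default_rng(0)
n = 50
k = rng.uniform(0.5, 1.5, n)
m = rng.uniform(0.5, 2.0, n)
q0 = rng.standard_normal(n)
p0 = rng.standard_normal(n)

E0 = energy(1.0, 0.0, q0, p0, m, k, V)
(Q, P, q, p), traj = velocity_verlet(1.0, 0.0, q0, p0, m, k, dV,
                                     dt=1e-3, n_steps=5000)
rel_drift = abs(energy(Q, P, q, p, m, k, V) - E0) / abs(E0)
```

The relative energy drift `rel_drift` is a quick sanity check on the time step; the cost per step grows linearly with the number of bath particles.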
Initial data
Heat bath particles have random initial data: Gibbs distribution with temperature T:
\[
f(p,q) = Z^{-1} \exp\left[ -\frac{H(p,q,P,Q)}{T} \right]
\]
Initial data are independent Gaussian:
\[
p_j(0) = \sqrt{m_j T}\,\eta_j, \qquad
q_j(0) = Q(0) + \sqrt{T / k_j}\,\xi_j, \qquad
\xi_j, \eta_j \ \text{i.i.d. } N(0,1)
\]
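The sampling rule above translates directly into code. The following is a minimal sketch; the function name and the constant masses and stiffnesses used in the demonstration are hypothetical choices for illustration only.

```python
import numpy as np

def sample_bath_initial_data(m, k, Q0, T, rng):
    """Draw (p_j(0), q_j(0)) from the Gibbs distribution given Q(0) = Q0."""
    m = np.asarray(m, dtype=float)
    k = np.asarray(k, dtype=float)
    eta = rng.standard_normal(m.shape)   # i.i.d. N(0, 1)
    xi = rng.standard_normal(k.shape)    # i.i.d. N(0, 1)
    p0 = np.sqrt(m * T) * eta            # variance m_j * T
    q0 = Q0 + np.sqrt(T / k) * xi        # mean Q0, variance T / k_j
    return p0, q0

# Demonstration with assumed values: all masses 2, all stiffnesses 0.5, T = 3.
rng = np.random.default_rng(0)
n = 200_000
p0, q0 = sample_bath_initial_data(np.full(n, 2.0), np.full(n, 0.5),
                                  Q0=1.0, T=3.0, rng=rng)
```

With these values both `p0` and `q0 - 1.0` should have sample variance close to 6.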
Generalized Langevin Equation
\[
\begin{aligned}
\dot{Q}_n &= P_n \\
\dot{P}_n &= -V'(Q_n) + \sum_j k_j (q_j - Q_n) \\
\dot{q}_j &= p_j / m_j \\
\dot{p}_j &= -k_j (q_j - Q_n)
\end{aligned}
\]
Solve the (p,q) equations and substitute back into the (P,Q) equation
\[
\ddot{Q}_n + V'(Q_n) = -\int_0^t K_n(t-s)\,\dot{Q}_n(s)\,ds + F_n(t)
\]
(Ford-Kac 65, Zwanzig 73)
Memory kernel:
\[
K_n(t) = \sum_j k_j \cos(\omega_j t), \qquad \omega_j = (k_j / m_j)^{1/2}
\]
Random forcing:
\[
F_n(t) = \sqrt{T} \sum_j k_j^{1/2} \left[ \xi_j \cos(\omega_j t) + \eta_j \sin(\omega_j t) \right]
\]
“Fluctuation-dissipation”:
\[
E\,F_n(t) F_n(s) = T\,K_n(t-s)
\]
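The substitution step can be spelled out as follows. This is a sketch of the standard variation-of-constants argument, written with $\omega_j^2 = k_j / m_j$; it is not taken verbatim from the slides.

```latex
\begin{align*}
% each bath particle is a forced harmonic oscillator
\ddot{q}_j + \omega_j^2 q_j &= \omega_j^2 Q_n(t), \\
q_j(t) &= q_j(0)\cos(\omega_j t) + \frac{p_j(0)}{m_j \omega_j}\sin(\omega_j t)
          + \omega_j \int_0^t \sin(\omega_j (t-s))\, Q_n(s)\, ds. \\
% integrate the last term by parts in s
q_j(t) - Q_n(t) &= \bigl(q_j(0) - Q_n(0)\bigr)\cos(\omega_j t)
          + \frac{p_j(0)}{m_j \omega_j}\sin(\omega_j t)
          - \int_0^t \cos(\omega_j (t-s))\, \dot{Q}_n(s)\, ds. \\
% multiply by k_j, sum over j, insert the Gaussian initial data
\sum_j k_j \bigl(q_j - Q_n\bigr) &= -\int_0^t \Bigl[\sum_j k_j \cos(\omega_j (t-s))\Bigr] \dot{Q}_n(s)\, ds
          + \sqrt{T} \sum_j k_j^{1/2} \bigl[\xi_j \cos(\omega_j t) + \eta_j \sin(\omega_j t)\bigr].
\end{align*}
```

The last line uses $k_j (q_j(0) - Q_n(0)) = \sqrt{T k_j}\,\xi_j$ and $k_j\, p_j(0) / (m_j \omega_j) = \sqrt{T k_j}\,\eta_j$, which follow from the initial data and $\sqrt{m_j}\,\omega_j = \sqrt{k_j}$. Inserting it into the equation for $\dot{P}_n$ gives the generalized Langevin equation with the stated kernel and forcing.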
Choice of parameters
Heat baths are characterized by broad and dense spectra. Random set of frequencies:
\[
\omega_j \sim U[0, n^a], \qquad a \in (0,1), \qquad \Delta\omega = n^a / n \to 0
\]
Ansatz for spring constants:
\[
k_j = f^2(\omega_j)\,\Delta\omega
\]
Assumption: $f^2(\omega)$ is bounded and decays faster than $1/\omega$.
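A minimal sketch of this parameter choice is given below. The function name and the particular spectrum `f2` are assumptions for illustration (any bounded $f^2$ decaying faster than $1/\omega$ would do); the masses are recovered from $\omega_j^2 = k_j / m_j$.

```python
import numpy as np

def sample_bath_parameters(n, a, f2, rng):
    """Random frequencies, spring constants and masses for n bath particles."""
    omega = rng.uniform(0.0, n**a, size=n)   # omega_j ~ U[0, n^a]
    d_omega = n**a / n                       # mean frequency spacing
    k = f2(omega) * d_omega                  # k_j = f^2(omega_j) * d_omega
    m = k / omega**2                         # from omega_j^2 = k_j / m_j
    return omega, k, m, d_omega

# Assumed example spectrum: bounded, decays like 1 / omega^2.
f2 = lambda w: 1.0 / (1.0 + w**2)

rng = np.random.default_rng(0)
omega, k, m, d_omega = sample_bath_parameters(1000, 0.5, f2, rng)
```

Frequencies close to zero produce very large masses, which is expected: slow oscillators with a finite spring constant must be heavy.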
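Generalized Langevin Eq.:
\[
\ddot{Q}_n + V'(Q_n) = -\int_0^t K_n(t-s)\,\dot{Q}_n(s)\,ds + F_n(t)
\]
Memory kernel:
\[
K_n(t) = \sum_j f^2(\omega_j) \cos(\omega_j t)\,\Delta\omega
\approx \int_0^\infty f^2(\omega) \cos(\omega t)\, d\omega
\]
(Monte-Carlo approximation of the Fourier transform of $f^2$)
Random forcing:
\[
F_n(t) = \sqrt{T} \sum_j f(\omega_j) \left[ \xi_j \cos(\omega_j t) + \eta_j \sin(\omega_j t) \right] \sqrt{\Delta\omega}
\approx \sqrt{T} \int_0^\infty f(\omega) \left[ \cos(\omega t)\, dB_1(\omega) + \sin(\omega t)\, dB_2(\omega) \right]
\]
(Monte-Carlo approximation of a stochastic integral)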
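For a fixed draw of frequencies, the fluctuation-dissipation relation $E\,F_n(t) F_n(s) = T K_n(t-s)$ can be checked by averaging over the Gaussian variables $\xi_j, \eta_j$. The sketch below does this; the Lorentzian spectrum and all numerical values are assumptions chosen for the demonstration.

```python
import numpy as np

rng = np.random.default_rng(1)
n, a, T, alpha = 200, 1.0 / 3.0, 2.0, 1.0

# Frequencies and spectrum; the Lorentzian f^2 is an assumed example.
omega = rng.uniform(0.0, n**a, size=n)
d_omega = n**a / n
f = np.sqrt((2.0 * alpha / np.pi) / (alpha**2 + omega**2))

def K_n(t):
    """Memory kernel: sum_j f^2(omega_j) cos(omega_j t) d_omega."""
    return np.sum(f**2 * np.cos(omega * t)) * d_omega

def F_n(t, xi, eta):
    """Random forcing; xi, eta have shape (samples, n), one row per realization."""
    modes = xi * np.cos(omega * t) + eta * np.sin(omega * t)
    return np.sqrt(T * d_omega) * (modes @ f)

samples = 20_000
xi = rng.standard_normal((samples, n))
eta = rng.standard_normal((samples, n))

t, s = 1.3, 0.5
empirical = np.mean(F_n(t, xi, eta) * F_n(s, xi, eta))
predicted = T * K_n(t - s)
```

The two numbers should agree up to Monte Carlo error of order `samples ** -0.5`.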
Lemma
1. For almost every choice of frequencies (ω-a.s.), $K_n(t)$ converges pointwise to $K(t)$, the Fourier cosine transform of $f^2(\omega)$.
2. $K_n \to K$ in $L^2(\Omega, L^2[0,T])$.
Theorem: (ω-a.s.) the sequence of random functions $F_n(t)$ converges weakly in $C[0,T]$ to a stationary Gaussian process $F(t)$ with mean zero and auto-covariance $T K(t)$, where $T$ is the temperature ($F_n \Rightarrow F$).
Proof: CLT + tightness.
Can be extended to “long term” behavior: convergence of empirical finite-dimensional distributions (Paul Tupper’s talk).
Example
If we choose
\[
f^2(\omega) = \frac{2\alpha}{\pi} \frac{1}{\alpha^2 + \omega^2}
\]
then
\[
K_n(t) \to K(t) = \exp(-\alpha |t|)
\]
and, by the above theorem, $F_n(t)$ converges weakly to the Ornstein-Uhlenbeck (OU) process $U(t)$ defined by the Itô SDE:
\[
dU = -\alpha U\, dt + \sqrt{2\alpha T}\, dB
\]
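The kernel convergence in this example is easy to observe numerically. The sketch below evaluates the random sum $K_n(t)$ for one draw of frequencies and compares it with $\exp(-\alpha |t|)$; the values of `n`, `a`, and `alpha` are assumptions for the demonstration.

```python
import numpy as np

def kernel_mc(t, n, a, alpha, rng):
    """K_n(t) = sum_j f^2(omega_j) cos(omega_j t) d_omega for a Lorentzian f^2."""
    omega = rng.uniform(0.0, n**a, size=n)
    f2 = (2.0 * alpha / np.pi) / (alpha**2 + omega**2)
    d_omega = n**a / n
    t = np.atleast_1d(t)
    # Rows index time, columns index bath frequencies.
    return (f2 * np.cos(np.outer(t, omega))).sum(axis=1) * d_omega

rng = np.random.default_rng(2)
alpha = 1.0
t = np.linspace(0.0, 3.0, 7)
K_n = kernel_mc(t, n=100_000, a=1.0 / 3.0, alpha=alpha, rng=rng)
K = np.exp(-alpha * np.abs(t))
max_err = np.max(np.abs(K_n - K))
```

Two error sources are visible in `max_err`: the sampling error of the random sum and the truncation of the frequency range at $n^a$, which is most noticeable at $t = 0$.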
Convergence of Qn(t)
\[
\ddot{Q}_n + V'(Q_n) = -\int_0^t K_n(t-s)\,\dot{Q}_n(s)\,ds + F_n(t)
\]
Theorem: (ω-a.s.) $Q_n(t)$ converges weakly in $C^2[0,T]$ to the solution $Q(t)$ of the limiting stochastic integro-differential equation:
\[
\ddot{Q} + V'(Q) = -\int_0^t K(t-s)\,\dot{Q}(s)\,ds + F(t)
\]
Proof: the mapping $(K_n, F_n) \mapsto Q_n$ is continuous.
Back to example
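If
\[
f^2(\omega) = \frac{2\alpha}{\pi} \frac{1}{\alpha^2 + \omega^2}
\]
then $Q_n(t)$ converges to the solution of
\[
\ddot{Q} + V'(Q) = -\int_0^t e^{-\alpha (t-s)}\,\dot{Q}(s)\,ds + U(t)
\]
which is equivalent to the (memoryless!!) SDE
\[
\begin{aligned}
dQ &= P\, dt \\
dP &= \left[ -V'(Q) + R \right] dt \\
dR &= (-\alpha R - P)\, dt + \sqrt{2\alpha T}\, dB
\end{aligned}
\]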
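A minimal sketch of how the three-variable SDE might be simulated is shown below, using the Euler-Maruyama scheme (our choice, not something the slides prescribe). The harmonic potential, the parameter values, and the function name are assumptions. One can check that the density proportional to $\exp(-(P^2/2 + V(Q) + R^2/2)/T)$ is invariant for this system, so with $V(Q) = Q^2/2$ both $Q$ and $P$ should end up with variance close to $T$.

```python
import numpy as np

def simulate_reduced_sde(dV, alpha, T, t_end, dt, n_paths, rng):
    """Euler-Maruyama for dQ = P dt, dP = (-V'(Q) + R) dt,
    dR = (-alpha R - P) dt + sqrt(2 alpha T) dB, for an ensemble of paths."""
    Q = np.zeros(n_paths)
    P = np.zeros(n_paths)
    R = np.sqrt(T) * rng.standard_normal(n_paths)   # R(0) = U(0), stationary OU
    for _ in range(int(round(t_end / dt))):
        dB = np.sqrt(dt) * rng.standard_normal(n_paths)
        Q_new = Q + P * dt
        P_new = P + (-dV(Q) + R) * dt
        R_new = R + (-alpha * R - P) * dt + np.sqrt(2.0 * alpha * T) * dB
        Q, P, R = Q_new, P_new, R_new
    return Q, P, R

rng = np.random.default_rng(3)
T = 0.5
Q, P, R = simulate_reduced_sde(dV=lambda q: q, alpha=1.0, T=T,
                               t_end=20.0, dt=0.005, n_paths=4000, rng=rng)
var_Q, var_P = Q.var(), P.var()
```

The explicit scheme carries an $O(dt)$ bias in the stationary variances; a smaller step or a splitting scheme would reduce it.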
Numerical validation
Empirical distribution of Qn(t) for n=5000 and various choices of V(Q) compared with the invariant measure of the limiting SDE
[Figure panels: single well; double well; triple well (extremely long correlation time)]
•“Unresolved” components of the solution are modeled by an auxiliary, memoryless, stochastic variable.
•Bottom line: instead of solving a large, stiff system in 2(n+1) variables, solve a Markovian system of 3 SDEs!
•Similar results can be obtained for nonlinear interactions. (Stuart-K ‘03)
Part II: Fractional Diffusion
Fractional (or anomalous) diffusion:
\[
E\,\Delta Q(t)^2 \sim t^{\gamma}, \qquad \gamma \neq 1
\]
Found in a variety of systems and models (e.g., Brownian particles in polymeric fluids, continuous-time random walk).
In all known cases, fractional diffusion reflects the divergence of relaxation times; extreme non-Markovian behaviour.
Question: can we construct heat bath models that generate anomalous diffusion?
Reminder
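\[
\ddot{Q}_n + V'(Q_n) = -\int_0^t K_n(t-s)\,\dot{Q}_n(s)\,ds + F_n(t)
\]
Memory kernel:
\[
K_n(t) = \sum_j k_j \cos(\omega_j t), \qquad \omega_j = (k_j / m_j)^{1/2}
\]
Random forcing:
\[
F_n(t) = \sqrt{T} \sum_j k_j^{1/2} \left[ \xi_j \cos(\omega_j t) + \eta_j \sin(\omega_j t) \right]
\]
Parameters:
\[
\omega_j \sim U[0, n^a], \qquad k_j = f^2(\omega_j)\,\Delta\omega
\]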
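If we take
\[
f^2(\omega) = \frac{C}{\omega^{1-\gamma}}
\]
then
\[
K_n(t) \to K(t) = \frac{C}{t^{\gamma}}
\]
(power law decay of the memory kernel; $C$ denotes a generic constant, not the same in both formulas).
Theorem: (ω-a.s.) $Q_n(t)$ converges weakly in $C^1[0,T]$ to the solution $Q(t)$ of the limiting stochastic integro-differential equation:
\[
\ddot{Q} + V'(Q) = -\int_0^t K(t-s)\,\dot{Q}(s)\,ds + F(t)
\]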
The limiting GLE
\[
K(t) = \frac{C}{t^{\gamma}}
\]
$F(t)$ is a Gaussian process with covariance $T K(t)$, where $T$ is the temperature; derivative of fractional Brownian motion (1/f-noise), interpreted in the distributional sense.
Solving the limiting GLE
\[
\ddot{Q} + V'(Q) = -\int_0^t \frac{\dot{Q}(s)}{(t-s)^{\gamma}}\,ds + F(t)
\]
For a free particle, $V'(Q) = 0$, and a particle in a quadratic potential well, $V'(Q) = Q$, the stochastic integro-differential equation (SIDE) can be solved using the Laplace transform.
Free particle: Gaussian profile, variance given by a sub-diffusive (Mittag-Leffler) function of time, $\mathrm{var}(Q) \sim t^{\gamma}$.
Quadratic potential: sub-exponential approach to the Boltzmann distribution.
Numerical results
Variance of an ensemble of 3000 systems, V(Q)=0 (compared to exact solution of the GLE)
Quadratic well: evolving distribution of 10,000 systems (dashed line: Boltzmann distribution)
What about dimensional reduction?
Even a system with power-law memory can be well approximated by a Markovian system with a few (fewer than 10) auxiliary variables.
How? Consider the following Markovian SDE:
\[
\begin{aligned}
\ddot{Q} &= -V'(Q) + G^T u \\
\dot{u} &= -G \dot{Q} - A u + C \dot{W}
\end{aligned}
\]
$u(t)$: vector of size $m$
$A$: $m \times m$ constant matrix
$C$: $m \times m$ constant matrix
$G$: constant $m$-vector
Solve for u(t) and substitute into Q(t) equation:
\[
\ddot{Q} + V'(Q) = -\int_0^t \tilde{K}(t-s)\,\dot{Q}(s)\,ds + \tilde{F}(t)
\]
where
\[
\tilde{K}(t) = G^T \exp(-At)\, G
\]
\[
\tilde{F}(t) = G^T \exp(-At)\, u(0) + G^T \int_0^t \exp(-A(t-s))\, C\, dW_s
\]
Goal: find G, A, C so that the fluctuation-dissipation relation is satisfied and the kernel approximates power-law decay.
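One sufficient way to meet the fluctuation-dissipation requirement is sketched below. It is a standard observation about linear SDEs rather than something stated on the slides: take $u(0) \sim N(0, T I)$ independent of $W$, and choose $C$ so that $C C^T = T (A + A^T)$.

```latex
\begin{align*}
% v solves the linear SDE dv = -A v dt + C dW with v(0) = u(0)
v(t) &= \exp(-At)\, u(0) + \int_0^t \exp(-A(t-s))\, C\, dW_s,
  \qquad \tilde{F}(t) = G^T v(t), \\
% Sigma = T I is a stationary covariance because it solves the Lyapunov equation
A \Sigma + \Sigma A^T &= C C^T \quad \text{with } \Sigma = T I
  \;\Longleftrightarrow\; C C^T = T (A + A^T), \\
% hence, for t >= s
E\, v(t) v(s)^T &= \exp(-A(t-s))\, \Sigma = T \exp(-A(t-s)), \\
E\, \tilde{F}(t) \tilde{F}(s) &= T\, G^T \exp(-A(t-s))\, G = T\, \tilde{K}(t-s).
\end{align*}
```

For symmetric $A$ the condition reduces to $C C^T = 2 T A$, so any matrix square root of $2TA$ (for example a Cholesky factor) can serve as $C$.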
It is easier to approximate in the Laplace domain:
\[
\hat{K}(s) = G^T (A + sI)^{-1} G
\]
(and the Laplace transform of a power is a power).
The right-hand side is a rational function of degree (m-1) over m, so the task is a Padé approximation of the Laplace transform of the memory kernel (classical methods in linear systems theory).
Even nicer if the kernel has a continued-fraction representation.
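\[
A = \begin{pmatrix}
1 & 2 & 2 & 2 & \cdots & 2 \\
2 & 5 & 6 & 6 & \cdots & 6 \\
2 & 6 & 9 & 10 & \cdots & 10 \\
\vdots & \vdots & \vdots & & \ddots & \vdots \\
2 & 6 & 10 & 14 & \cdots & 4m-3
\end{pmatrix},
\qquad
G = \begin{pmatrix} 1 \\ 1 \\ 1 \\ \vdots \\ 1 \end{pmatrix}
\]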
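The rational function $G^T (A + sI)^{-1} G$ for this $A$ and $G$ can be evaluated directly. In the sketch below, the entry pattern of $A$ (diagonal $4i - 3$, off-diagonal $4\min(i,j) - 2$) is read off from the displayed rows. The comparison target $1/(2\sqrt{s})$, which is the Laplace transform of $1/(2\sqrt{\pi t})$ and corresponds to $\gamma = 1/2$, is an inference from evaluating small cases by hand; the slides do not state which kernel this particular matrix approximates.

```python
import numpy as np

def build_A(m):
    """m x m matrix with diagonal 4i - 3 and off-diagonal 4*min(i, j) - 2."""
    i = np.arange(1, m + 1)
    A = 4.0 * np.minimum.outer(i, i) - 2.0
    A[np.diag_indices(m)] = 4.0 * i - 3.0
    return A

def K_hat(s, A, G):
    """Laplace transform of the approximate kernel: G^T (A + s I)^{-1} G."""
    return G @ np.linalg.solve(A + s * np.eye(len(G)), G)

m = 8
A = build_A(m)
G = np.ones(m)

s_vals = np.logspace(-1, 1, 9)                 # s between 0.1 and 10
approx = np.array([K_hat(s, A, G) for s in s_vals])
target = 0.5 / np.sqrt(s_vals)                 # assumed target, see lead-in
rel_err = np.max(np.abs(approx / target - 1.0))
```

The agreement degrades outside a window of $s$ values that widens as $m$ grows, which is the "intermediate asymptotics" character of the approximation.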
Laplace transform of memory kernel (solid line) compared with continued-fraction approximation for 2,4,8,16 modes (dashed lines).
Variance of GLE (solid line) compared with Markovian approximations with 2,4,8 modes.
Fractional diffusion scaling observed over long time.
Comment:
Approximation by a Markovian system is not only a computational tool. It is also an analytical approach to studying the statistics of the solution (e.g., calculating stopping times).
Controlled approximation (unlike the use of a “Fractional Fokker-Planck equation”).
Bottom line:
Even with long-range memory, the system can be reduced (with high accuracy) to a Markovian system of fewer than 10 variables (it is “intermediate asymptotics”, but that is what we care about in real life).