Algorithms and Computational Aspects of DFT Calculations
Part I
Juan Meza and Chao Yang
High Performance Computing Research
Lawrence Berkeley National Laboratory
IMA Tutorial
Mathematical and Computational Approaches to Quantum Chemistry
Institute for Mathematics and its Applications, University of Minnesota
September 26-27, 2008
Juan Meza (LBNL) Algorithms and Computational Aspects of DFT Calculations September 26, 2008 1 / 32
Outline
1 Preliminaries
2 Density Functional Theory
3 Pseudopotentials
4 Bloch’s Theorem
5 Diagonalization / Minimization
6 Improving Convergence
7 Summary
Goals
1 Brief introduction to Schrödinger’s equation and Density Functional Theory
2 Overview of most commonly used approximations
3 Description of the Self-Consistent Field Iteration
4 Overview of major algorithmic components
References
M. C. Payne, M. P. Teter, D. C. Allan, T. A. Arias, J. D. Joannopoulos, Iterative minimization techniques for ab initio total-energy calculations: Molecular dynamics and conjugate gradients, Reviews of Modern Physics, Vol. 64, Number 4, pp. 1045–1097 (1992).
Christopher J. Cramer, Essentials of Computational Chemistry, John Wiley and Sons (2003).
Richard M. Martin, Electronic Structure: Basic Theory and Practical Methods, Cambridge University Press (2005).
F. Nogueira, A. Castro, A.L. Marques, A Tutorial on Density Functional Theory, Chapter 6, pp. 218–256, A Primer in Density Functional Theory, Springer-Verlag (2002).
J.M. Thijssen, Computational Physics, Cambridge University Press (2003).
Many-body electronic Schrödinger equation
$$H \Psi_k(r_1, r_2, \ldots, r_N) = E_k \Psi_k(r_1, r_2, \ldots, r_N) \tag{1}$$
$$H = -\frac{\hbar^2}{2m} \sum_{i=1}^{N} \nabla_i^2 + \sum_{i=1}^{N} V_{ext}(r_i) + \frac{1}{2} \sum_{i \neq j} \frac{e^2}{|r_i - r_j|} \tag{2}$$
$\Psi_k$ contains all the information needed to study a system
$|\Psi_k|^2$ gives the probability density of finding the electrons in a given configuration
$V_{ext}$ represents an external potential, e.g. Coulomb attraction by nuclei
$E_k$ quantized energy
$\Psi_k$ is a function of 3N variables: the electron positions $r_1, \ldots, r_N$.
Computational work grows like $O(10^{3N})$
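To make the $O(10^{3N})$ growth concrete, here is a rough sketch that counts the memory needed just to tabulate $\Psi_k$ on a grid. It assumes 10 grid points per coordinate and 16 bytes per complex value; both numbers and the function name are illustrative choices, not taken from the slides.

```python
def wavefunction_storage_bytes(n_electrons, points_per_axis=10, bytes_per_value=16):
    """Bytes needed to tabulate Psi on a grid with points_per_axis values per coordinate.

    Psi depends on 3 * n_electrons coordinates, so the grid has
    points_per_axis ** (3 * n_electrons) entries.
    """
    n_grid_values = points_per_axis ** (3 * n_electrons)
    return n_grid_values * bytes_per_value


# Each additional electron multiplies the storage by points_per_axis ** 3.
for n in (1, 2, 3, 5):
    print(n, wavefunction_storage_bytes(n))
```

Under these assumptions each extra electron costs a factor of 1000 in storage, which is why the approximations on the following slides are needed.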
Approximations commonly used
Born–Oppenheimer
Also called adiabatic approximation
Due to large difference in mass between electrons and nuclei
Take nuclear positions as fixed
Density Functional Theory for modeling electron-electron interactions
Local Density Approximation (LDA)
Pseudopotentials for handling electron-ion interactions
Supercells to model systems with aperiodic geometries
Methods for minimizing total energy functional
Density Functional Theory
The unknown is very simple, i.e. the electron density, ρ(r)
Hohenberg-Kohn Theory
There is a unique mapping between the ground state energy, $E_0$, and the ground state density, $\rho_0$
Exact form of the functional unknown and probably unknowable
Independent particle model
Electrons move independently in an average effective potential field
Must add correction for exchange and correlation terms
Good compromise between accuracy and feasibility
Kohn-Sham Total Energy
Kohn and Sham proposed using $n_e$ noninteracting electrons moving in an effective potential due to the other electrons
Replace many-particle wavefunctions with single-particle wavefunctions
Kohn-Sham Total Energy
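$$E_{total}[\{\psi_i\}] = \frac{1}{2} \sum_{i=1}^{n_e} \int_\Omega |\nabla \psi_i|^2 + \int_\Omega V_{ext}\rho + \frac{1}{2} \int_\Omega \frac{\rho(r)\rho(r')}{|r - r'|}\,dr\,dr' + E_{xc}[\rho(r)],$$
where $\rho(r) = \sum_{i=1}^{n_e} |\psi_i(r)|^2$, $\int_\Omega \psi_i \psi_j = \delta_{i,j}$, $n_e$ is the number of electrons, and $E_{xc}[\rho(r)]$ denotes the exchange–correlation functional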
Kohn-Sham Equations
Goal is to find the ground state energy by minimizing the Kohn-Sham total energy, $E_{total}$
Leads to:
Kohn-Sham equations
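$$H\psi_i = \varepsilon_i \psi_i, \quad i = 1, 2, \ldots, n_e$$
$$H = \left[-\frac{1}{2}\nabla^2 + V(\rho(r))\right],$$
$$V(\rho(r)) = V_{ext}(r) + \int \frac{\rho(r')}{|r - r'|}\,dr' + V_{xc}(\rho)$$
Nonlinear eigenvalue problem since the Hamiltonian, H, depends on ψ through the charge density, ρ
$V_{xc}(\rho) = \partial E_{xc}(\rho(r)) / \partial \rho$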
Pseudopotentials
Interaction between electrons and the nucleus creates a problem; one needs to deal with a singularity near the atomic core, specifically the 1/r term in the computation of $V_{ext}(r)$
Pseudopotentials are based on the idea that most chemistry depends on valence electrons rather than core electrons
Therefore we replace the core electrons (and the ionic potential) with a weaker pseudopotential
Using pseudopotentials reduces the number of electrons that we need to consider, as well as the number of plane waves needed to accurately represent the wavefunctions, thereby reducing the computational cost
Both empirical and ab initio forms available.
Exchange–Correlation Functional
Most of the complexity of DFT is hidden in the exchange–correlation functional
Exchange arises from antisymmetry due to the Pauli exclusion principle
Correlation accounts for other many-body effects missing from the single-particle approximation, e.g. kinetic energy not covered by the first term of the Hamiltonian
No systematic way to improve the exchange–correlation functional
Local Density Approximation (LDA)
Simplest approximation to the exchange–correlation term
Assumes the exchange–correlation energy at each point equals that of a homogeneous electron gas with the same density
Purely local, yet remarkably successful
Known limitations
Literally hundreds of functionals proposed. For an interesting historical perspective see In Pursuit of the “Divine” Functional, A.E. Mattsson, Science, Vol. 298, No. 5594, pp. 759–760 (2002).
Discretization Options
Finite difference: $\psi'(r_j) \approx [\psi(r_j + h) - \psi(r_j - h)]/(2h)$
Finite elements
$\psi(r) \approx \sum_{j=1}^{n} \alpha_j \phi_j(r)$, with $\phi_j(r)$ nice functions with local support
Local orbital method (good for molecules)
Choose $\phi_j(r)$ as Gaussian or other “nice” functions
Planewave expansion
Choose $\phi_j(r)$ as $e^{i g_j \cdot r}$
Useful for modeling solids with a periodic structure
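As a small illustration of the finite difference option, the sketch below applies the centered difference (divided by $2h$) to a function with a known derivative. The test function `sin`, the evaluation point, and the step sizes are arbitrary choices made for this example.

```python
import math


def central_difference(f, x, h):
    """Centered finite-difference approximation to f'(x) with step h."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


x0 = 0.3
errors = {}
for h in (1e-1, 1e-2, 1e-3):
    approx = central_difference(math.sin, x0, h)
    errors[h] = abs(approx - math.cos(x0))
    print(h, errors[h])
```

The error should shrink roughly like $h^2$, so reducing $h$ by a factor of 10 reduces the error by about a factor of 100 until rounding takes over.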
Bloch’s Theorem and Periodic Supercells
Bloch’s Theorem: In a periodic solid each electronic wave function can be expressed as the product of a periodic function φ and exp(ik · r), where k is a wavevector, i.e.
$$\psi(r) = e^{i k \cdot r}\, \phi(r)$$
Can expand φ(r) in a set of plane waves so that ψ(r) is a sum of plane waves (more in a minute)
Bloch’s Theorem allows us to express the electronic wavefunctions in terms of a discrete set of plane waves
Can model large periodic systems by focusing on a smaller primary cell
Can also be used to model nonperiodic systems, like molecules
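A quick numerical check of the Bloch form in one dimension may help. The sketch below builds $\psi(x) = e^{ikx}\phi(x)$ from a cell-periodic $\phi$ and verifies that shifting by one cell only multiplies $\psi$ by the phase $e^{ika}$. The lattice constant `a`, the wavevector `k`, and the particular $\phi$ are made-up values for illustration.

```python
import numpy as np

a = 2.0                  # assumed lattice constant (length of the 1D cell)
k = 0.7                  # assumed wavevector
g = 2.0 * np.pi / a      # smallest reciprocal lattice vector


def phi(x):
    """Cell-periodic part, built from plane waves exp(i n g x): phi(x + a) == phi(x)."""
    return 1.0 + 0.5 * np.cos(g * x) + 0.25 * np.sin(2.0 * g * x)


def psi(x):
    """Bloch wave: plane-wave phase times the cell-periodic function."""
    return np.exp(1j * k * x) * phi(x)


x = np.linspace(0.0, a, 50)
shifted = psi(x + a)                      # wavefunction one cell over
phased = np.exp(1j * k * a) * psi(x)      # same values up to a constant phase
print(np.max(np.abs(shifted - phased)))
```

Because the two agree up to a phase, $|\psi|^2$ is the same in every cell, which is what lets the calculation focus on a single primary cell.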
Plane-wave Basis Set
Write wavefunction as:
$$\psi_i(r) = e^{i k \cdot r} \sum_{g_j} \alpha_j e^{i g_j \cdot r} \tag{3}$$
In principle, you need an infinite plane-wave basis set
In practice, you introduce an energy cutoff to truncate the basis set
All terms for which the kinetic energy is bigger than the cutoff are ignored
Pseudopotentials also allow us to use a much smaller number of plane-wave basis functions, thereby reducing the computational cost
As a bonus, the kinetic energy term of the Hamiltonian is diagonal (in Fourier space) when using a plane-wave basis set
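The energy cutoff can be sketched as follows for a simple cubic cell. The code assumes atomic units, so that the kinetic energy of the plane wave $e^{i(k+g)\cdot r}$ is $\frac{1}{2}|k+g|^2$, and assumes reciprocal lattice vectors $g = (2\pi/a)\,n$ with integer triples $n$. The function name and the cell are hypothetical, chosen only to show how the truncation works.

```python
import itertools

import numpy as np


def plane_waves_below_cutoff(cell_length, e_cut, k=(0.0, 0.0, 0.0)):
    """Return (n, kinetic) pairs with 0.5 * |k + (2*pi/a) * n|^2 <= e_cut.

    Assumes a simple cubic cell of side cell_length and atomic units.
    """
    k = np.asarray(k, dtype=float)
    b = 2.0 * np.pi / cell_length
    # Any kept g satisfies |g| <= sqrt(2 * e_cut) + |k|, which bounds each index.
    n_max = int(np.ceil((np.sqrt(2.0 * e_cut) + np.linalg.norm(k)) / b))
    kept = []
    for n in itertools.product(range(-n_max, n_max + 1), repeat=3):
        g = b * np.array(n, dtype=float)
        kinetic = 0.5 * np.dot(k + g, k + g)
        if kinetic <= e_cut:
            kept.append((n, kinetic))
    return kept


cell = 2.0 * np.pi   # with this side length, g is simply the integer triple n
for e_cut in (0.5, 1.0, 4.0):
    basis = plane_waves_below_cutoff(cell, e_cut)
    print(e_cut, len(basis))
```

The `kinetic` values returned here are exactly the diagonal entries of the kinetic energy operator in this basis, so applying it costs one multiplication per plane wave.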
Finite Dimensional Problem
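Recall we want to
$$\min E[\{\psi_i\}] = \frac{1}{2}\sum_{i=1}^{n_e}\int_\Omega |\nabla\psi_i|^2 + \int_\Omega V_{ext}\rho + \frac{1}{2}\int_\Omega \frac{\rho(r)\rho(r')}{|r-r'|}\,dr\,dr' + E_{xc}(\rho)$$
Substituting (3) and after some algebra we have
$$\min_{X^*X = I_{n_e}} E_{KS}(X) \equiv E_{kinetic}(X) + E_{ext}(X) + E_{Hartree}(X) + E_{xc}(X),$$
where
$E_{kinetic} = \frac{1}{2}\,\mathrm{trace}(X^* L X)$
$E_{ext} = \mathrm{trace}(X^* V_{ext} X)$ (ionic term)
$E_{Hartree} = \frac{1}{2}\rho(X)^T L^\dagger \rho(X)$
$E_{xc} = \rho(X)^T \mu_{xc}[\rho(X)]$
$\rho(X) = \mathrm{diag}(XX^*)$, and $X$ is an $N \times n_e$ matrix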
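The discrete density $\rho(X) = \mathrm{diag}(XX^*)$ can be formed without ever building the $N \times N$ matrix $XX^*$. A minimal sketch, using a random orthonormal $X$ as a stand-in for real wavefunction coefficients (the sizes and the random data are assumptions for illustration):

```python
import numpy as np


def charge_density(X):
    """rho(X) = diag(X X^*), computed row by row without forming X X^*."""
    return np.sum(np.abs(X) ** 2, axis=1)


rng = np.random.default_rng(0)
N, n_e = 8, 3
A = rng.standard_normal((N, n_e)) + 1j * rng.standard_normal((N, n_e))
X, _ = np.linalg.qr(A)          # reduced QR gives orthonormal columns: X^* X = I
rho = charge_density(X)
print(rho, rho.sum())
```

With $X^*X = I_{n_e}$ the entries of $\rho$ sum to $n_e$, which is a convenient sanity check on any implementation.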
Minimizing the Total Energy
KKT conditions
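$$\nabla_X \mathcal{L}(X, \Lambda) = 0, \quad X^*X = I_{n_e}.$$
Discretized Kohn-Sham equations can now be written as:
$$H(X)X = X\Lambda, \quad X^*X = I_{n_e}.$$
Kohn-Sham Hamiltonian given by:
$$H(X) = \frac{1}{2}L + V(X),$$
$$V(X) = V_{ext} + \mathrm{Diag}(L^\dagger \rho(X)) + \mathrm{Diag}\, g_{xc}(\rho(X))$$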
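The link between the energy and the Hamiltonian can be checked numerically. The sketch below works on a small real-valued model and drops the exchange–correlation term, so it is a simplification of the slide's $E_{KS}$. With $L$ symmetric and $V_{ext}$ diagonal, the gradient of the energy with respect to $X$ should then equal $2H(X)X$, where $H(X) = \frac{1}{2}L + V_{ext} + \mathrm{Diag}(L^\dagger\rho(X))$. The 1D Laplacian, the potential `v_ext`, and the sizes are all assumptions made for the example.

```python
import numpy as np


def density(X):
    """rho(X) = diag(X X^T) for real X."""
    return np.sum(X * X, axis=1)


def energy(X, L, L_pinv, v_ext):
    """Kinetic + external + Hartree energy (exchange-correlation omitted)."""
    rho = density(X)
    e_kin = 0.5 * np.trace(X.T @ L @ X)
    e_ext = np.trace(X.T @ (v_ext[:, None] * X))
    e_hartree = 0.5 * rho @ L_pinv @ rho
    return e_kin + e_ext + e_hartree


def hamiltonian(X, L, L_pinv, v_ext):
    """H(X) = L/2 + Diag(v_ext) + Diag(L^+ rho(X)), again without the xc term."""
    return 0.5 * L + np.diag(v_ext + L_pinv @ density(X))


N, n_e = 6, 2
L = 2.0 * np.eye(N) - np.eye(N, k=1) - np.eye(N, k=-1)   # 1D discrete Laplacian
L_pinv = np.linalg.pinv(L)
v_ext = -1.0 / (1.0 + np.arange(N))                       # made-up external potential
rng = np.random.default_rng(1)
X, _ = np.linalg.qr(rng.standard_normal((N, n_e)))

# Entry-by-entry centered finite-difference gradient of the energy.
h = 1e-6
grad_fd = np.zeros_like(X)
for i in range(N):
    for j in range(n_e):
        dX = np.zeros_like(X)
        dX[i, j] = h
        e_plus = energy(X + dX, L, L_pinv, v_ext)
        e_minus = energy(X - dX, L, L_pinv, v_ext)
        grad_fd[i, j] = (e_plus - e_minus) / (2.0 * h)

grad_exact = 2.0 * hamiltonian(X, L, L_pinv, v_ext) @ X
print(np.max(np.abs(grad_fd - grad_exact)))
```

Setting the gradient of the Lagrangian to zero with this expression is what produces $H(X)X = X\Lambda$; the factor of 2 is absorbed into the multipliers.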
Approaches for Solving the Kohn-Sham Equations
Work with the KS equations indirectly
Self-Consistent Field Iteration
View as solving a sequence of linear eigenvalue problems
Need to precondition
Need other acceleration techniques to improve convergence
Minimize the total energy directly
Direct Constrained Minimization
Constrained optimization problem
Also requires globalization techniques
In general more robust
The SCF Iteration
SCF cycle: $V(\rho(r))$ → solve $\left[-\frac{1}{2}\nabla^2 + V(\rho(r))\right]\psi_i = E_i\psi_i$ → $\{\psi_i\}_{i=1,\ldots,n_e}$ → $\rho(r) = \sum_{i}^{n_e} |\psi_i(r)|^2$ → back to $V(\rho(r))$
Most of the work is in solving the linear eigenvalue problem
Orthogonality constraint for the wavefunctions must be enforced explicitly
If using reciprocal (Fourier) space, then you also have many 3D FFTs
For large systems, the calculation of nonlocal potentials can also be expensive
SCF does NOT decrease the energy monotonically
Checking for Self-consistency
Convergence is usually checked by computing the change in total energy or density between iterations
Recall that neither quantity is guaranteed to decrease monotonically
Sometimes difficult to decide when self-consistency is reached
SCF Convergence Properties
Surprisingly few results
A good starting point:
E. Cancès and C. Le Bris, Can we outperform the DIIS approach for electronic structure calculations? Intl. J. Quantum Chem. 79 (2000), 82-90
E(x) may not monotonically decrease between SCF iterations
SCF does not always converge;
$\lim_{i\to\infty} \|E(x^{(i+1)}) - E(x^{(i)})\| \neq 0$, or $\lim_{i\to\infty} \|\rho(x^{(i+1)}) - \rho(x^{(i)})\| \neq 0$
For some problems, one can show subsequence convergence;
$\lim_{i\to\infty} \|\rho(x^{(i+1)}) - \rho(x^{(i-1)})\| = 0$
Example
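$$E(x) = \frac{1}{2}x^T L x + \frac{\alpha}{4}\rho(x)^T L^{-1}\rho(x)$$
$$L = \begin{pmatrix} 2 & -1 \\ -1 & 2 \end{pmatrix}, \quad x = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}, \quad \rho(x) = \begin{pmatrix} x_1^2 \\ x_2^2 \end{pmatrix}$$
$$\min E(x) \quad \text{s.t.} \quad x_1^2 + x_2^2 = 1$$
$$\left[L + \alpha\,\mathrm{Diag}(L^{-1}\rho(x))\right]x = \lambda_1 x$$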
SCF Converges When α = 1.0
[Figure: $\Delta\rho^{(i)} = \|\rho^{(i)} - \rho^{(i-1)}\|$]
SCF Fails When α = 12.0
Subsequence Convergence
[Figure panels: odd subsequence, even subsequence]
Why Does SCF Fail?
SCF is attempting to minimize a sequence of surrogate models
Objective:
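$$E(x) = \frac{1}{2}x^T L x + \frac{\alpha}{4}\rho(x)^T L^{-1}\rho(x)$$
$$E_{sur}(x) = \frac{1}{2}x^T H(x^{(i)})x$$
Gradient:
$$\nabla E(x) = H(x)x, \quad \nabla E_{sur}(x) = H(x^{(i)})x$$
Gradients match at $x^{(i)}$
$$\nabla E(x^{(i)}) = \nabla E_{sur}(x^{(i)})$$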
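The gradient formula can be checked by hand for the two-variable example, taking $H(x) = L + \alpha\,\mathrm{Diag}(L^{-1}\rho(x))$ and $\rho_k = x_k^2$. The short derivation below uses only the symmetry of $L$ and $L^{-1}$:

```latex
\begin{aligned}
\frac{\partial}{\partial x_k}\Big[\tfrac{1}{2}\,x^T L x\Big] &= (Lx)_k, \\
\frac{\partial}{\partial x_k}\Big[\tfrac{\alpha}{4}\,\rho^T L^{-1}\rho\Big]
  &= \tfrac{\alpha}{2}\,(L^{-1}\rho)_k\,\frac{\partial \rho_k}{\partial x_k}
   = \alpha\,(L^{-1}\rho)_k\,x_k, \\
\nabla E(x) &= \big[L + \alpha\,\mathrm{Diag}(L^{-1}\rho(x))\big]\,x = H(x)\,x .
\end{aligned}
```

For the surrogate, $H(x^{(i)})$ is a fixed symmetric matrix, so $\nabla E_{sur}(x) = H(x^{(i)})x$, and the two gradients agree at $x = x^{(i)}$. Away from $x^{(i)}$ nothing ties the surrogate to $E$, which is why a full SCF step can overshoot.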
SCF Step is Too Long!
Improving SCF
Construct better surrogate
Cannot afford to use local quadratic approximation (Hessian too expensive)
Charge mixing to improve convergence (heuristic)
Trust Region to restrict the update of x to a small neighborhood of the gradient matching point, e.g. TRSCF – Thøgersen, Olsen, Yeager & Jørgensen (2004)
Direct Constrained Minimization – Yang, Meza & Wang (2006) [1]
See talk by Chao Yang, Friday, Oct. 4, 2008
[1] C. Yang, J. Meza, L. Wang, A Constrained Optimization Algorithm for Total Energy Minimization in Electronic Structure Calculation, J. Comp. Phys., 217, 709-721 (2006)
Mixing
Linear mixing
New ρ is a linear combination of the previous value and the quantity computed from the solution of the linear eigenvalue problem, i.e.
$\rho_{i+1} = \beta\rho + (1 - \beta)\rho_i$
Anderson extrapolation
Broyden and Modified Broyden Mixing
DIIS (Direct Inversion in the Iterative Subspace)
All methods are some form of an acceleration technique for a nonlinear iteration
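Linear mixing is easy to try on the two-variable example from the earlier slides, $H = L + \alpha\,\mathrm{Diag}(L^{-1}\rho)$ with $\rho = (x_1^2, x_2^2)$. The sketch below is a minimal version; the mixing parameter $\beta = 0.3$, the starting density, and the iteration count are arbitrary choices for illustration, not recommended settings.

```python
import numpy as np

L = np.array([[2.0, -1.0], [-1.0, 2.0]])
L_INV = np.linalg.inv(L)


def scf_linear_mixing(alpha, beta, rho0, n_iter=200):
    """SCF with linear mixing: rho_{i+1} = beta * rho_out + (1 - beta) * rho_i."""
    rho_in = np.asarray(rho0, dtype=float)
    for _ in range(n_iter):
        # Solve the linear eigenvalue problem for the current input density.
        _, vecs = np.linalg.eigh(L + alpha * np.diag(L_INV @ rho_in))
        rho_out = vecs[:, 0] ** 2          # density of the lowest eigenvector
        rho_in = beta * rho_out + (1.0 - beta) * rho_in
    return rho_in


rho_mixed = scf_linear_mixing(alpha=12.0, beta=0.3, rho0=[1.0, 0.0])
rho_plain = scf_linear_mixing(alpha=12.0, beta=1.0, rho0=[1.0, 0.0])  # beta = 1: no mixing
print(rho_mixed, rho_plain)
```

With $\beta = 1$ the loop is plain SCF and, for $\alpha = 12$, keeps oscillating. Damping with $\beta = 0.3$ brings this example to the symmetric density $(1/2, 1/2)$. The best $\beta$ is problem dependent, which is why this remains a heuristic.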
Trust Region Subproblem
Solve
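$$\min E_{sur}(x) \quad \text{s.t.} \quad x^T x = 1,$$
$$\|xx^T - x^{(i)}(x^{(i)})^T\|_F^2 \leq \Delta \quad \text{(trust region constraint)}$$
Equivalent to solving
$$\left[H(x^{(i)}) - \sigma x^{(i)}(x^{(i)})^T\right]x = \lambda_1 x, \quad x^T x = 1$$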
σ is a penalty parameter (Lagrange multiplier for the trust region constraint)
Need heuristic for choosing σ
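The penalized eigenvalue problem can be tried on the two-variable example with $H(x) = L + \alpha\,\mathrm{Diag}(L^{-1}\rho(x))$. The sketch below keeps $\sigma$ fixed for all iterations, which is a simplifying assumption: a real trust-region method would adapt it, and the value $\sigma = 2$ is just something that works for this toy problem.

```python
import numpy as np

L = np.array([[2.0, -1.0], [-1.0, 2.0]])
L_INV = np.linalg.inv(L)


def penalized_scf(alpha, sigma, x0, n_iter=100):
    """Iterate [H(x_i) - sigma * x_i x_i^T] x = lambda_1 x with a fixed sigma."""
    x = np.asarray(x0, dtype=float)
    x = x / np.linalg.norm(x)
    for _ in range(n_iter):
        H = L + alpha * np.diag(L_INV @ x ** 2)
        # The rank-one shift lowers the energy of directions close to x_i,
        # which keeps the new eigenvector from moving too far away.
        _, vecs = np.linalg.eigh(H - sigma * np.outer(x, x))
        x = vecs[:, 0]
    return x


x_tr = penalized_scf(alpha=12.0, sigma=2.0, x0=[1.0, 0.0])
x_scf = penalized_scf(alpha=12.0, sigma=0.0, x0=[1.0, 0.0])   # sigma = 0: plain SCF
print(x_tr ** 2, x_scf ** 2)
```

Setting $\sigma = 0$ recovers plain SCF, which oscillates for $\alpha = 12$, while $\sigma = 2$ converges to the symmetric density in this example. The sign of the computed eigenvector is arbitrary, but both $xx^T$ and $\rho$ are unaffected by it.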
SCF + Charge Mixing Improves Convergence
[Figure: $\Delta E(x^{(i)}) = \|E(x^{(i)}) - E_{min}\|$, with $\alpha = 12$, $n = 10$, $n_e = 2$]
TRSCF Further Improves Convergence
How should we choose σ?
Summary
Reviewed basic approximations used in DFT
Introduced the major algorithmic components
Discussed methods for improving SCF convergence
Introduced trust region ideas
Part II of this talk will discuss many of the computational issues