Parallel Computations in Quantum Lanczos Representation Methods


Hong Zhang and Sean C. Smith

Quantum & Molecular Dynamics Group, Center for Computational Molecular Science

The University of Queensland, Australia

Outline

1. Introduction

2. Methodologies

3. Results

4. Conclusions & Future Work

1. Introduction

Lanczos (1950)

1. Wyatt: Recursive residue generation method
2. Iung and Leforestier; Yu and Nyman; Carrington: Spectrally transformed Lanczos algorithm
3. Manthe, Seideman, and Miller; Karlsson: GMRES/QMR DVR-ABC
4. Guo; Carrington: Symmetry adapted Lanczos method
5. Zhang and Smith: Lanczos representation methods

1.1 Lanczos representation methods

1. Lanczos representation filter diagonalisation in unimolecular dissociation and rovibrational spectroscopy: recent non-zero total angular momentum (J > 0) calculations using parallel computing implementation.
2. TI Lanczos subspace wavepacket method; real Lanczos artificial boundary inhomogeneity (LABI); real single Lanczos propagation ABI in bimolecular reactions.
3. Systems: HOCl and HO2

Importance in atmospheric chemistry and in combustion chemistry, e.g., balance of stratospheric ozone, etc.

[Zhang & Smith, PhysChemComm, 6: 12, 2003]

2. Methodologies

1. Transform the primary representation (DVR or FBR) of the fundamental scattering equations (e.g., TI Schrödinger equation or TI wavepacket-Lippmann-Schwinger equation) into the tridiagonal Lanczos representation.
2. Solve the eigen-problem or linear system within the subspace to obtain either eigen-pairs or subspace wavefunctions.
3. Extract physical information, e.g., bound states, resonances, and scattering (S) matrix elements.

Representation transformation

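$$\beta_{k+1}\,|v_{k+1}\rangle = \hat{H}'\,|v_k\rangle - \alpha_k\,|v_k\rangle - \beta_k\,|v_{k-1}\rangle$$

$$(E - \hat{H} + i\hat{\epsilon})\,|\psi(E)\rangle = i\,|\chi\rangle$$

$$(E - \mathbf{T}_M)\,\boldsymbol{\phi}(E) = \mathbf{e}_1$$

$$\alpha_k = \langle v_k|\hat{H}'|v_k\rangle, \qquad \beta_k = \langle v_{k-1}|\hat{H}'|v_k\rangle$$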

[Kouri, Arnold & Hoffman, CPL, 203: 166, 1993;Zhang H, Smith SC, JCP, 116: 2354, 2002]
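The transformation to a tridiagonal representation can be sketched in a few lines of NumPy. This is a minimal illustration only: it assumes a real symmetric matrix `H` in place of the Hamiltonian used in the talk, the function name `lanczos` is hypothetical, and the full reorthogonalisation step is a convenience for small demonstrations rather than something taken from the slides. Here `beta[k]` couples basis vectors `k-1` and `k`, and `beta[0]` is unused.

```python
import numpy as np

def lanczos(H, v1, M):
    """Run M Lanczos steps on a real symmetric H (assumption of this sketch).

    Returns (alpha, beta, V): diagonal and off-diagonal elements of the
    tridiagonal matrix, and the Lanczos vectors as rows of V.
    """
    n = len(v1)
    V = np.zeros((M, n))
    alpha = np.zeros(M)
    beta = np.zeros(M)              # beta[0] is unused
    V[0] = v1 / np.linalg.norm(v1)
    for k in range(M):
        w = H @ V[k]
        if k > 0:
            w = w - beta[k] * V[k - 1]
        alpha[k] = V[k] @ w
        w = w - alpha[k] * V[k]
        # Full reorthogonalisation against all previous vectors (demo only).
        w = w - V[:k + 1].T @ (V[:k + 1] @ w)
        if k + 1 < M:
            beta[k + 1] = np.linalg.norm(w)
            V[k + 1] = w / beta[k + 1]
    return alpha, beta, V

# Small demonstration on a random symmetric matrix.
rng = np.random.default_rng(0)
A = rng.standard_normal((30, 30))
H = (A + A.T) / 2
alpha, beta, V = lanczos(H, rng.standard_normal(30), 6)
T = np.diag(alpha) + np.diag(beta[1:], 1) + np.diag(beta[1:], -1)
print(np.allclose(V @ H @ V.T, T, atol=1e-8))
```

The only operation that touches the full matrix is `H @ V[k]`, which is why the matrix-vector product dominates the cost of the propagation.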

Solving subspace eigen-problem or linear system

Algorithm

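(a) Choose the Mth element of φ(E_j) to be arbitrary (but non-zero) and calculate φ_{M-1} from the last row of the tridiagonal system:

$$\beta_M\,\phi_{M-1} = (E_j - \alpha_M)\,\phi_M$$

(b) For k = M-1, M-2, …, 2, update scalar φ_{k-1}:

$$\beta_k\,\phi_{k-1} = (E_j - \alpha_k)\,\phi_k - \beta_{k+1}\,\phi_{k+1}$$

(c) Determine constant c (φ_true = cφ) by normalization:

$$c\,\left[(E_j - \alpha_1)\,\phi_1 - \beta_2\,\phi_2\right] = 1$$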
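The three steps can be sketched as follows for the linear system (E - T) φ = e_1, where T is tridiagonal with diagonal `alpha` and off-diagonal `beta`. This is an illustrative sketch, not the authors' code: the function name `backward_recursion` is hypothetical, `beta[k]` is assumed to couple elements `k-1` and `k` (zero-based, `beta[0]` unused), and no safeguards against overflow for large M are included.

```python
import numpy as np

def backward_recursion(alpha, beta, E):
    """Solve (E - T) phi = e_1 by backward recursion (illustrative sketch)."""
    M = len(alpha)
    phi = np.zeros(M, dtype=complex)
    phi[M - 1] = 1.0                      # (a) arbitrary non-zero last element
    if M > 1:
        phi[M - 2] = (E - alpha[M - 1]) * phi[M - 1] / beta[M - 1]
    for k in range(M - 2, 0, -1):         # (b) rows M-1, ..., 2 (one-based)
        phi[k - 1] = ((E - alpha[k]) * phi[k] - beta[k + 1] * phi[k + 1]) / beta[k]
    # (c) the first row fixes the overall constant
    first_row = (E - alpha[0]) * phi[0]
    if M > 1:
        first_row = first_row - beta[1] * phi[1]
    return phi / first_row

# Compare with a dense solve on a small example (values are arbitrary).
alpha = np.array([0.5, 1.0, 1.5, 2.0])
beta = np.array([0.0, 0.3, 0.2, 0.1])
E = 0.7 + 0.05j
T = np.diag(alpha) + np.diag(beta[1:], 1) + np.diag(beta[1:], -1)
e1 = np.zeros(4)
e1[0] = 1.0
reference = np.linalg.solve(E * np.eye(4) - T, e1)
print(np.allclose(backward_recursion(alpha, beta, E), reference))
```

Each filter energy costs a single sweep over the M tridiagonal elements, which matters because the solve is repeated for many energies.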

Comparison with other eigen-problem/linear system solvers

QMR/MINRES

QL/QR

Forward recursion

Efficiency is very important, as the linear system solver must be rerun for many different filter energies throughout the spectrum.

[Yu & Smith, BBPC, 101: 400, 1997; Yu & Smith, CPL, 283: 69, 1998; Zhang & Smith, JCP, 115: 5751, 2001; Zhang & Smith, PCCP, 3: 2282, 2001]

Representation

Parallel computing

Conceptually difficult

Computationally difficult: CPU time and memory

In propagation, the most time-consuming part is the matrix-vector multiplication. We use the Message Passing Interface (MPI) to perform the parallel computation.

Propagation

Final analysis

J > 0 DVR Hamiltonian

[Equations: J > 0 Hamiltonian in R and r, with kinetic terms T, potential V, and J- and j-dependent coupling terms]

Matrix-vector multiplication

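$$(\hat{H}\psi)_{\Omega} = H_{\Omega,\Omega-1}\,\psi_{\Omega-1} + H_{\Omega,\Omega}\,\psi_{\Omega} + H_{\Omega,\Omega+1}\,\psi_{\Omega+1}$$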
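A block-tridiagonal matrix-vector product of this kind can be sketched serially, with each loop iteration standing in for one worker processor. This is an assumption-laden illustration: the function name `blocked_matvec`, the block layout, and the random test data are hypothetical, and no actual MPI calls are made. Each block needs only its own vector and those of its nearest neighbours.

```python
import numpy as np

def blocked_matvec(diag, upper, psi):
    """Multiply a symmetric block-tridiagonal matrix by a blocked vector.

    diag[i]  : square diagonal block of component i
    upper[i] : coupling block between components i and i+1
    psi[i]   : vector segment held by component i
    """
    n = len(diag)
    out = []
    for i in range(n):                    # each i would be one worker
        y = diag[i] @ psi[i]
        if i > 0:                         # segment "received" from lower neighbour
            y = y + upper[i - 1].T @ psi[i - 1]
        if i < n - 1:                     # segment "received" from upper neighbour
            y = y + upper[i] @ psi[i + 1]
        out.append(y)
    return out

# Three components with different block sizes (arbitrary illustrative data).
rng = np.random.default_rng(1)
sizes = [4, 3, 2]
diag = []
for s in sizes:
    A = rng.standard_normal((s, s))
    diag.append((A + A.T) / 2)
upper = [rng.standard_normal((sizes[i], sizes[i + 1])) for i in range(2)]
psi = [rng.standard_normal(s) for s in sizes]
result = blocked_matvec(diag, upper, psi)
print([r.shape for r in result])
```

In an MPI implementation the two `if` branches would correspond to exchanging vector segments with the neighbouring ranks before the local products are formed.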

Processor assignment

a) Master processor (ID = 0)

Perform the main propagation; write α, β elements in bound and resonance calculations; all other related work except the matrix-vector multiplications.

b) Working/slave processors (ID = 1, 2, …)

Perform the matrix-vector multiplications for each Ω component.

Communications

According to the Coriolis coupling rules, only the two nearest neighbouring Ω components need to communicate.

Load balancing

j_min is different for each Ω component, but j_max is the same, i.e., the angular DVR size varies from component to component.

For the highest or the lowest Ω components, only one Coriolis coupling is required.
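The uneven load can be made concrete with a toy count of angular basis size and Coriolis neighbours per component. This sketch rests on assumptions that are not stated in the slides: j_min is taken equal to the component index Ω, even symmetry with components Ω = 0, ..., J is used, and the value j_max = 40 and the function name `component_load` are purely illustrative.

```python
def component_load(J, j_max):
    """List (omega, angular size, number of Coriolis neighbours) per component."""
    rows = []
    for omega in range(J + 1):            # even symmetry: J + 1 components
        n_angular = j_max - omega + 1     # ASSUMPTION: j_min = omega
        n_neighbours = int(omega > 0) + int(omega < J)
        rows.append((omega, n_angular, n_neighbours))
    return rows

for row in component_load(6, 40):         # j_max = 40 is a hypothetical value
    print(row)
```

Under these assumptions the end components do less communication and the high-Ω components hold smaller angular grids, so workers do not finish a matrix-vector product at the same time.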

Timing

Due to communication and load-balancing issues, the model doesn't scale ideally with (J+1) for even spectroscopic symmetry or J for odd spectroscopic symmetry.

However, one can achieve wall clock times (e.g., for the even symmetry J = 6 HO2 case) that are within about a factor of 2 of J = 0 calculations. Without parallel computing, the wall clock times would be approximately 7 times those of J = 0 calculations.
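The quoted factors translate into a rough speedup and efficiency for the J = 6 even-symmetry case. This is back-of-the-envelope arithmetic only: the factors 7 and 2 are the approximate values quoted above, efficiency is measured against the 7 worker processors (the master is not counted), and the function name `parallel_summary` is hypothetical.

```python
def parallel_summary(n_workers, serial_factor, parallel_factor):
    """Return (speedup, efficiency) from wall-clock factors relative to J = 0."""
    speedup = serial_factor / parallel_factor
    efficiency = speedup / n_workers
    return speedup, efficiency

# J = 6, even symmetry: J + 1 = 7 components, one per worker.
speedup, efficiency = parallel_summary(7, 7.0, 2.0)
print(speedup, efficiency)   # 3.5 0.5
```

Ideal scaling would give a speedup of 7; the gap reflects the communication and load-balancing costs mentioned above.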

3. Recent Results

HO2: J = 0-50 bound state energies and resonance energies and widths using both Lanczos and Chebyshev methods from parallel computing.

HOCl/DOCl: J = 0-30 ro-vibrational spectroscopy.

Table 1 Selected HO2 bound state energies for J = 30 (even symmetry) for comparison. All energies are in eV (Zhang & Smith, JPCA, 110: 3246, 2006).

N   DS/LSFD    LHFD       Bunker     J-shifting   Ka   Kc   (ν1, ν2, ν3)
1   0.124642   0.124642   0.124498   0.125357     0    30   (0, 0, 0)
2   0.125728   0.125729   0.125617   0.127746     1    30   (0, 0, 0)
3   0.135318   0.135319   0.135005   0.134914     2    29   (0, 0, 0)
4   0.146785   0.146786   0.146458   0.146859     3    28   (0, 0, 0)
5   0.163396   0.163396   0.162959   0.163583     4    27   (0, 0, 0)
6   0.184702   0.184702   0.184127   0.185085     5    26   (0, 0, 0)

Fig. 1 Plot of the quantum logarithmic rates versus resonance energies. Thin dotted line - QM results; red line - Troe et al. SACM/CT calculations; green line - quantum average. (Zhang & Smith, JCP, 123: 014308, 2005)

[Figure: panel (c), J = 20]

Fig. 2 Same as previous figure, except J = 30 (unpublished latest results).

[Figure: panel (d), J = 30]

Table 2 Selected low vibrational energies at J = 0 for HOCl for comparison. All energies are in cm^-1 (Zhang, Smith, Nanbu & Nakamura, JPCA, 2006, in press).

n   (ν1, ν2, ν3)   This work1   Bowman     This work2   Experimental
1   0, 0, 0        0.00         0.000      0.00         0.00
2   0, 0, 1        650.58       724.336    724.98       724.36
3   0, 1, 0        1261.97      1238.617   1245.11      1238.62
4   0, 0, 2        1309.21      1444.107   1442.31      1438.68
5   0, 1, 1        1926.92      1953.748   1967.41
6   0, 0, 3        1963.22      2154.028   2161.36
7   0, 2, 0        2522.28      2456.363   2458.67      2461.21

Table 3 Calculated HOCl ro-vibrational state energies in cm^-1 with spectroscopic assignments for J = 20 (Zhang, Smith, Nanbu & Nakamura, JPCA, 2006, in press).

n    quantum   AR       (J, Ka, Kc)   (ν1, ν2, ν3)   symmetry
1    205.15    209.04   20, 0, 20     0, 0, 0        even
2    223.49    229.01   20, 1, 20     0, 0, 0        even
3    283.76    288.91   20, 2, 19     0, 0, 0        even
4    381.70    388.74   20, 3, 18     0, 0, 0        even
21   226.16    229.01   20, 1, 19     0, 0, 0        odd
22   283.71    288.91   20, 2, 18     0, 0, 0        odd
23   381.70    388.74   20, 3, 17     0, 0, 0        odd

Table 4 Comparison of experiments and calculations for selected HOCl far infrared transitions in cm^-1 (Zhang, Smith, Nanbu & Nakamura, JPCA, 2006, in press).

n    (J', Ka, Kc)   (J'', Ka, Kc)   (ν1, ν2, ν3)   OBS         CAL
1    11, 1, 11      10, 0, 10       0, 0, 0        30.47268    30.00
13   11, 2, 9       10, 1, 10       0, 0, 0        71.12366    70.03
28   20, 5, 15      20, 4, 16       0, 0, 0        177.98236   175.37
33   10, 2, 8       11, 1, 11       0, 0, 1        49.33260    48.93
41   20, 3, 18      20, 2, 19       0, 0, 1        99.35080    98.43
56   11, 5, 7       10, 4, 6        0, 0, 1        188.69489   187.03

Table 5 Comparison of experiments and calculations for selected HO2 vibrational state energies in cm^-1 from three latest PESs (Zhang & Smith, unpublished latest results).

n   (ν1, ν2, ν3)   DMBE IV   Troe et al.   Xie et al.   Experimental
1   0, 0, 0        0.00      0.00          0.00         0.0
2   0, 0, 1        1065.36   1270.89       1090.51      1097.6
3   0, 1, 0        1296.33   1610.65       1388.93      1391.8
4   0, 0, 2        2090.87   2436.94       2163.14
5   0, 1, 1        2359.35   2896.98       2462.26
6   0, 2, 0        2516.60   3134.91       2752.11
7   0, 0, 3        3080.66   3514.81       3216.67
8   1, 0, 0        3333.63   3588.55       3430.08      3436.2

Table 6 Comparison of experiments and calculations for selected DOCl vibrational state energies in cm^-1 from the latest ab initio PES and LHFD calculations (Zhang & Smith, unpublished latest results).

n    cal        exp      (ν1, ν2, ν3)
1    0.00       0.0      (0, 0, 0)
2    722.588    723.3    (0, 0, 1)
3    916.213    909.6    (0, 1, 0)
8    2358.013            (0, 1, 2)
9    2532.055            (0, 2, 1)
10   2669.201   2665.6   (1, 0, 0)

Table 7 Comparison of experiments and calculations for selected DOCl ro-vibrational energies in cm^-1 from the latest ab initio PES (Zhang & Smith, latest results; Hu et al., J Mol Spec, 209, 105).

n   LHFD   AR        exp         (J, Ka, Kc)   (ν1, ν2, ν3)
0          3102.29   3096.8861   (30, 0, 30)   (1, 0, 0)
1          3112.57   3103.1051   (30, 1, 30)   (1, 0, 0)
2          3143.41   3138.8417   (30, 2, 29)   (1, 0, 0)
3          3194.80   3190.3545   (30, 3, 28)   (1, 0, 0)
4          3266.75   3261.8675   (30, 4, 27)   (1, 0, 0)
5          3359.25   3353.6688   (30, 5, 26)   (1, 0, 0)
6          3472.31   3465.6222   (30, 6, 25)   (1, 0, 0)
7          3605.93   3597.5640   (30, 7, 24)   (1, 0, 0)

4. Conclusions and Future Work

Development of Lanczos representation methods; Design of a parallel computing model; Combination of both has made rigorous quantum calculations possible for challenging J > 0 applications.

Further comparative studies between QD/ST; Develop quantum statistical theories; Mixed QD/MD simulations in larger systems.

Acknowledgements

Prof J. Troe (The University of Göttingen)

Prof S. Nanbu (Kyushu University)

Prof H. Nakamura (IMS)

Dr Marlies Hankel

CCMS/MD members

Australian Research Council

Supercomputer time from APAC & UQ
