20 Feb, 2019
Tutorial on msmbuilder
Wei Wang
Slides available: http://chemwwang.people.ust.hk/cgi-bin/songshanhu.php
Outline
• Prepare the virtual machine and software
• Tutorial of msmbuilder
Outline
• Prepare the virtual machine and software
  • Method 1: download or copy the TA’s preconfigured virtual machine and use it
  • Method 2: prepare the virtual machine, the required software, and the Python packages yourself
• Tutorial of msmbuilder
Preparing the Linux system and the softwares
• Method 1 (use the virtual machine preconfigured by the TA)
  • Download and install VirtualBox on your Windows or Mac computer (https://www.virtualbox.org)
  • Download or copy the preconfigured virtual machine from this website:
    • http://chz379.ust.hk/songshanhu/ubuntu14.04.zip
    • Or ask the TA for a copy, then extract the zip file
  • Click the icon to open it in VirtualBox
  • Password for the user: 111111
• Method 2 (prepare the virtual machine and software yourself)
  • Install VirtualBox, then install Ubuntu Linux in VirtualBox
  • Install anaconda3, vmd, and the required Python packages
Method 2-step1: Install Ubuntu Linux on VirtualBox
• Follow the link below to install the virtual machine on your computer
• https://itsfoss.com/install-linux-in-virtualbox/
• Better to use VirtualBox version > 5.2 and Ubuntu version > 14.04
https://mirrors.163.com/ubuntu-releases/14.04/
https://www.virtualbox.org/wiki/Downloads
• Remember to assign more than 15.0 GB of hard disk space to your Ubuntu system in the VirtualBox settings
Method 2-step2: Install anaconda3 and required python packages
• Open your Ubuntu system
• Download anaconda3 in Ubuntu
• https://repo.continuum.io/archive/
• Choose “Anaconda3-4.1.0-Linux-x86_64.sh” and click to download
• Install anaconda3 in your terminal
• Go to the “Downloads/” folder and type “bash Anaconda3-4.1.0-Linux-x86_64.sh” in the terminal
• Source ~/.bashrc after the installation finishes
• Required packages: msmbuilder (version > 3.6), pyemma, matplotlib (version > 2.1.5)
  • Use conda to install these packages in your terminal
• We also suggest installing vmd in your Ubuntu system to visualize the structures
  • Tutorial: https://www.biostars.org/p/196147/
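After conda finishes, one quick way to confirm the setup is to ask Python whether it can locate each package. The sketch below uses only the standard library. The helper name `missing_packages` is made up for this example, and it only checks that a package can be found, not that its version matches the numbers listed above.

```python
import importlib.util

# Package names taken from the list of required packages above.
REQUIRED = ["msmbuilder", "pyemma", "matplotlib"]


def missing_packages(names):
    """Return the subset of `names` that Python cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("All required packages were found.")
```

Run it with the Python interpreter that anaconda3 installed, so that the check looks at the same environment the notebook will use.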
Outline
• Prepare the virtual machine and software
• Tutorial of msmbuilder
  • General protocol of building MSM
  • Example on alanine dipeptide
  • Project: play with Fs peptide
General protocol of building Markov State Model (MSM)
Input: MD data
Output: Metastable states, Kinetics…
1. Loading data
2. tICA
3. Splitting
4. Lumping
Building MSM
Download the tutorial file (data & script)
• Download this file (not needed if you use the TA’s virtual machine)
  • http://chz379.ust.hk/songshanhu/songshanhu_tutorial_msmbuilder_v2.tar.gz
• Open the terminal (Ctrl+Alt+T) and extract the tutorial file in the ~/Downloads folder
$ cd Downloads/
$ tar -zxvf songshanhu_tutorial_msmbuilder_v2.tar.gz
• Go to the folder “songshanhu_tutorial_msmbuilder_v2” to check the files
$ cd songshanhu_tutorial_msmbuilder_v2/
$ cd data/; ls
$ cd alanine_dipeptide/; ls
$ cd ../../analysis/
$ cd alanine_dipeptide/; ls
Example on alanine dipeptide: about the data
• The system is well characterized by two torsion angles
• The Ramachandran plots of the data show three basins
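For reference, a torsion (dihedral) angle is defined by four consecutive atoms, and a Ramachandran plot shows the two backbone torsions phi and psi against each other. The sketch below computes one signed torsion angle with NumPy from four Cartesian positions. The function name `dihedral_degrees` and the sample coordinates are illustrative and are not taken from the tutorial notebook.

```python
import numpy as np


def dihedral_degrees(p0, p1, p2, p3):
    """Signed torsion angle in degrees defined by four points."""
    p0, p1, p2, p3 = (np.asarray(p, dtype=float) for p in (p0, p1, p2, p3))
    b0 = p0 - p1          # bond pointing from the second atom back to the first
    b1 = p2 - p1          # central bond
    b2 = p3 - p2          # bond from the third atom to the fourth
    b1 = b1 / np.linalg.norm(b1)
    # Components of b0 and b2 perpendicular to the central bond.
    v = b0 - np.dot(b0, b1) * b1
    w = b2 - np.dot(b2, b1) * b1
    x = np.dot(v, w)
    y = np.dot(np.cross(b1, v), w)
    return np.degrees(np.arctan2(y, x))


if __name__ == "__main__":
    # Made-up coordinates: the first and fourth points are a quarter turn apart.
    print(dihedral_degrees([1, 0, 0], [0, 0, 0], [0, 0, 1], [0, 1, 1]))
```

Applying such a function to every frame of a trajectory, once for phi and once for psi, gives the points of a Ramachandran plot.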
Example on alanine dipeptide: open the script
• Go to the folder “songshanhu_tutorial_msmbuilder_v2/analysis/alanine_dipeptide/” and open the IPython notebook “MSM_for_ala2.ipynb”
$ cd ~/Downloads/songshanhu_tutorial_msmbuilder_v2/
$ cd analysis/alanine_dipeptide
$ jupyter notebook MSM_for_ala2.ipynb
• The browser then opens the notebook
This notebook contains all the code needed for the MSM analysis
About the IPython notebook: how to run the cells
1. Click and select the cell
2. Click to run the selected cell
3. The results will show here
Example on alanine dipeptide: modules & functions
• Run the first three cells one by one
  • Description of MSM
  • Loading modules
  • Loading the functions
Example on alanine dipeptide: inputs
• Specify the inputs
• Load the input files
All the results will be shown in the browser and stored in the resultdir folder
Example on alanine dipeptide: tICA
• Code
  • Compute pairwise distances
  • Run tICA
  • Interpret the results
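The first two steps can be illustrated without msmbuilder. The following is a minimal NumPy sketch of plain tICA on pairwise-distance features, written under stated assumptions: the function names, the whitening cutoff, and the symmetrized covariance estimator are choices made for this example and are not taken from the tutorial notebook or from the msmbuilder implementation.

```python
import numpy as np


def pairwise_distances(xyz):
    """Turn coordinates of shape (n_frames, n_atoms, 3) into atom-pair distances."""
    i, j = np.triu_indices(xyz.shape[1], k=1)
    return np.linalg.norm(xyz[:, i, :] - xyz[:, j, :], axis=2)


def tica(features, lag, n_components=2):
    """Return (projection, eigenvalues) of a plain tICA at the given lag in frames."""
    x = features - features.mean(axis=0)
    x0, xt = x[:-lag], x[lag:]
    n = len(x0)
    c0 = (x0.T @ x0 + xt.T @ xt) / (2 * n)   # instantaneous covariance
    ct = (x0.T @ xt + xt.T @ x0) / (2 * n)   # symmetrized time-lagged covariance
    # Whiten with C0, then diagonalize the whitened time-lagged covariance.
    s, u = np.linalg.eigh(c0)
    keep = s > 1e-10 * s.max()
    w = u[:, keep] / np.sqrt(s[keep])
    evals, evecs = np.linalg.eigh(w.T @ ct @ w)
    order = np.argsort(evals)[::-1][:n_components]
    return x @ (w @ evecs[:, order]), evals[order]


if __name__ == "__main__":
    # Synthetic data: one slowly switching feature and two fast noise features.
    rng = np.random.RandomState(0)
    slow = np.repeat(rng.choice([-1.0, 1.0], size=50), 100)
    noise = rng.normal(size=(5000, 3))
    features = np.column_stack([slow + 0.1 * noise[:, 0], noise[:, 1], noise[:, 2]])
    projection, eigenvalues = tica(features, lag=5)
    print(eigenvalues)
```

Each eigenvalue is the autocorrelation of its tIC at the chosen lag, so the first tIC is the slowest linear combination of the input features.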
• Results
1. Data projected on principal components
Press Ctrl+Alt+T to open a new terminal:
$ cd results/
$ vmd tica-dimension-tIC0.xtc ../trajs/ala2.pdb
Visualizing trajectory using vmd
1. Click RMSD Trajectory Tool
2. Click RMSD, then click Align
3. Right click here, then click Delete
4. Delete the last frame (200)
5. Click Reset View
6. Click here
Example on alanine dipeptide: tICA
2. Interpret the physical meaning of the tIC
• Code
  • Play with these parameters
  • Visualize other tICs
Example on alanine dipeptide: Splitting
• Code
  • Clustering into microstates (play with the number of microstates)
  • Visualizing the clusters
• Output
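The slides do not say which clustering algorithm the notebook uses for the splitting step. As one illustration, the sketch below implements greedy k-centers clustering in NumPy; the function name `kcenters` and the choice of algorithm are assumptions made for this example.

```python
import numpy as np


def kcenters(points, n_clusters, first_center=0):
    """Greedy k-centers: each new center is the point farthest from all current centers."""
    centers = [first_center]
    dist = np.linalg.norm(points - points[first_center], axis=1)
    for _ in range(n_clusters - 1):
        new_center = int(np.argmax(dist))
        centers.append(new_center)
        dist = np.minimum(dist, np.linalg.norm(points - points[new_center], axis=1))
    center_coords = points[centers]
    # Assign every frame to its nearest center; the labels are the microstate indices.
    d = np.linalg.norm(points[:, None, :] - center_coords[None, :, :], axis=2)
    return d.argmin(axis=1), center_coords


if __name__ == "__main__":
    # Made-up 2D points standing in for frames projected on two tICs.
    frames = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0],
                       [5.1, 5.0], [10.0, 0.0], [10.0, 0.1]])
    labels, centers = kcenters(frames, n_clusters=3)
    print(labels)
```

Increasing `n_clusters` gives a finer partition of the projected data, which is the parameter the slide suggests playing with.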
Example on alanine dipeptide: Interpret microstate MSM
• Implied timescale
  • Play with the range
1. Gap after the top two curves => The molecule has three metastable states
2. The top two curves reach a plateau => Markovian lag time
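The implied timescales behind this plot come from the eigenvalues of the transition matrix estimated at lag time τ: t_i = -τ / ln λ_i(τ). The sketch below is a minimal NumPy version under stated assumptions: the function names are hypothetical, the counts are made reversible by simple symmetrization, and every state is assumed to be visited at least once.

```python
import numpy as np


def transition_matrix(dtraj, n_states, lag):
    """Row-normalized transition counts at the given lag (sliding window)."""
    counts = np.zeros((n_states, n_states))
    for i, j in zip(dtraj[:-lag], dtraj[lag:]):
        counts[i, j] += 1
    counts = 0.5 * (counts + counts.T)   # simple symmetrization to enforce reversibility
    return counts / counts.sum(axis=1, keepdims=True)


def implied_timescales(tmat, lag, n_timescales=2):
    """t_i = -lag / ln(lambda_i) for the largest eigenvalues below 1."""
    evals = np.sort(np.linalg.eigvals(tmat).real)[::-1]
    return -lag / np.log(evals[1:n_timescales + 1])


if __name__ == "__main__":
    # A made-up two-state Markov chain: its implied timescale is the same at every lag.
    t1 = np.array([[0.9, 0.1], [0.2, 0.8]])
    for lag in (1, 2, 4):
        t_lag = np.linalg.matrix_power(t1, lag)
        print(lag, implied_timescales(t_lag, lag, n_timescales=1))
```

For a truly Markovian model the timescales do not depend on the lag, which is why a plateau in the curves is used to pick the Markovian lag time.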
Example on alanine dipeptide: Interpret microstate MSM
• Slowest dynamic modes and corresponding timescale
  • Markovian lag time (obtained from the implied timescale)
  • Timescale for the slowest transition
  • Visualizing the slowest transition in this system
The slowest transition of this system is the change of the Psi angle
Example on alanine dipeptide: Lumping
• Code
  • Clustering into macrostates
  • Visualizing the clusters
Obtained from the implied timescale
• Output
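The slides do not name the lumping algorithm. To show the idea, the sketch below groups microstates by their coordinates in the slowest eigenvectors of the microstate transition matrix, since microstates in the same metastable basin have nearly equal entries there. This is a simplified spectral lumping written for illustration, not PCCA+ or any msmbuilder routine, and the function name `lump` is hypothetical.

```python
import numpy as np


def lump(tmat, n_macrostates):
    """Map each microstate to a macrostate using the slowest right eigenvectors."""
    evals, evecs = np.linalg.eig(tmat)
    order = np.argsort(evals.real)[::-1][:n_macrostates]
    coords = evecs[:, order].real[:, 1:]   # drop the constant first eigenvector
    # Pick one representative microstate per macrostate by farthest-point selection.
    reps = [int(np.argmax(np.linalg.norm(coords, axis=1)))]
    dist = np.linalg.norm(coords - coords[reps[0]], axis=1)
    while len(reps) < n_macrostates:
        reps.append(int(np.argmax(dist)))
        dist = np.minimum(dist, np.linalg.norm(coords - coords[reps[-1]], axis=1))
    # Assign every microstate to its nearest representative.
    d = np.linalg.norm(coords[:, None, :] - coords[reps][None, :, :], axis=2)
    return d.argmin(axis=1)
```

The number of macrostates is an input here; in the alanine dipeptide example it would be three, the value read off the gap in the implied timescales.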
Example on alanine dipeptide: Interpreting macrostates
• Code
• Output structures: view in vmd
[Figure: structures of macrostates S0, S1, and S2]
Example on alanine dipeptide: Kinetics & Thermodynamics
• Obtain the Markovian lag time to build the microstate MSM
• Stationary population, mean first passage time, kinetic rate, and dominant pathways can be obtained
Obtained from the microstate implied timescale
Slowest: 216 ps
Play with the two macrostates you are interested in
[Figure: dominant pathways between macrostates S0, S1, and S2, with pathway labels 24%, 27%, and 49%]
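Two of the quantities listed above follow directly from a transition matrix. The sketch below computes the stationary population and the mean first passage time (MFPT) with NumPy. The function names and the small example matrix are made up for illustration and do not reproduce the notebook's code or the numbers on the slides.

```python
import numpy as np


def stationary_distribution(tmat):
    """Left eigenvector of tmat for eigenvalue 1, normalized to sum to 1."""
    evals, evecs = np.linalg.eig(tmat.T)
    pi = evecs[:, np.argmax(evals.real)].real
    return pi / pi.sum()


def mfpt(tmat, target, lag=1.0):
    """Mean first passage time from every state to `target`, in the units of `lag`."""
    n = tmat.shape[0]
    others = [i for i in range(n) if i != target]
    # For i != target: m_i = lag + sum over j != target of T_ij * m_j
    a = np.eye(n - 1) - tmat[np.ix_(others, others)]
    m = np.zeros(n)
    m[others] = np.linalg.solve(a, lag * np.ones(n - 1))
    return m


if __name__ == "__main__":
    # Hypothetical three-state matrix; rows are S0, S1, S2.
    t = np.array([[0.90, 0.08, 0.02],
                  [0.10, 0.85, 0.05],
                  [0.02, 0.03, 0.95]])
    print(stationary_distribution(t))
    print(mfpt(t, target=2, lag=1.0))
```

Passing the Markovian lag time as `lag` converts the MFPT from steps into physical time.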
Outline
• Prepare the virtual machine and software
• Tutorial of msmbuilder
  • General protocol of building MSM
  • Example on alanine dipeptide
  • Project: play with Fs peptide
data & script
• Fs peptide dataset
• Fs peptide: contains 21 amino acids
• Its folding dynamics is complex
script
Play with the script
• Figure out the physical meaning of the first tIC
  • Tip: open “tica-dimension-tIC0.xtc” in vmd (remember to align the structures before visualization)
• Figure out the timescale for the slowest transition
  • Tip: see the first dynamic mode of the microstate MSM and the two figures below
Summary
Input: MD data
Output: Metastable states, Kinetics…
1. Loading data
2. tICA
3. Splitting
4. Lumping
Building MSM
Useful websites
• Vmd: https://www.ks.uiuc.edu/Training/Tutorials/vmd/tutorial-html/
• Msmbuilder: http://msmbuilder.org/3.8.0/
• PyEMMA: http://emma-project.org/latest/
Check the “reference_results” folder
• This folder contains the standard output for the two examples