ST-analyzer Tutorial
Ver. 0.1.0
Committee
Wonpil Im (Chair), University of Kansas
Jong Cheol Jeong (Developer), University of Kansas
Sunhwan Jo, University of Chicago
Yifei Qi, University of Kansas
Jeffery B. Klauda, University of Maryland
Lev Gorenstein, Purdue University
Min Sun Yeom, Korea Institute of Science and Technology Information
All rights reserved @ ImLab
This work is supported by NSF ABI
Table of Contents
Objective
Installation
  Required modules & programs
    Python (*required)
    Django (*required)
    MDAnalysis (*required)
    Pyhull (*optional)
    ST-analyzer (*required)
ST-analyzer GUI
  Login
    Username & Password
  Workspace
    Menu bar
    Button for minimizing window
  User accounts
    Create, delete, and edit account
    Searching and ordering the lists
    Logout ‘admin’ and login with new account
  ST-analyzer
    STEP1
    Preparing job submission
    STEP2
  Analysis modules
    System size
    Helix tilt
    Sterol tilt
    RMSD
    RMSF
    Membrane Density Profiles
    Membrane Order Parameters
    Membrane Thickness
    Average surface area per lipid
  Result Viewer
    Project Retrieval
    Outputs
    Page navigator
    Data manager
Selection Query
  Selection keywords and usages
Objective
ST-analyzer is a standalone GUI toolset for analyzing molecular dynamics simulation trajectories. It provides a variety of analysis routines with a particular focus on membrane systems (e.g., lipid chain order parameters, lipid area). Because trajectory files are generally too large to upload to a remote server, ST-analyzer is designed to be installed on the server where the trajectories are located. Once it is installed, users can access it over HTTP from a local machine running any operating system (OS). ST-analyzer is freely available through GitHub (https://github.com/stanalyzer/ST-analyzer).
Installation
ST-analyzer can be installed by running the install script:
    #HOME/install.sh
If you encounter any issues, follow the troubleshooting notes below.

Troubleshooting the installation
Although ST-analyzer is written in Python, some external Python modules and programs are required to cover the wide range of analysis demands and to keep the tool cross-platform. For users who only need particular modules, this documentation distinguishes ‘required modules’ from ‘optional modules’.

Using easy_install and pip
Some users have reported that their systems require both easy_install and pip to install MDAnalysis:
    sudo easy_install Django
    sudo easy_install Pyhull
    sudo easy_install -U GridDataFormats
    sudo pip install MDAnalysis

Required modules & programs
Python (*required)
Python v2.7 or above is required. Python is available through http://www.python.org/download/
If you do not have the privilege to install Python:
This issue is outside the scope of this tutorial, but the links listed under ‘Other information’ at the end of this section may help.
If you have ‘root’ or ‘administrator’ permission:
You need to be in the ‘wheel’ group to install the required packages with the sudo command. Type sudo in front of each installation command; you will be prompted for your password (the same one you use to log into your Linux/Unix account). There are two ways to add a user to the wheel/sudoers group:
a) Open /etc/group with vi or another editor, and add the user to the wheel group by typing ‘wheel::10:root,username’
b) Open /etc/sudoers with a text editor, and uncomment the %wheel line so that it reads:
    ## Allows people in group wheel to run all commands
    %wheel ALL=(ALL) ALL
Windows:
• Download and install an all-in-one package (Anaconda, http://continuum.io/downloads.html)
OS X, Linux & Unix:
• Install with virtualenv
  o Download Python and install it in your local directory
    § mkdir ~/src
    § cd ~/src
    § curl -O http://www.python.org/ftp/python/2.7.5/Python-2.7.5.tgz
    § tar -xzvf Python-2.7.5.tgz
    § cd Python-2.7.5
    § mkdir ~/localpython
    § ./configure --prefix=/path/for/your/localpython
    § make
    § make install
  o Install virtualenv
    § cd ~/src
    § curl -O https://pypi.python.org/packages/source/v/virtualenv/virtualenv-1.9.1.tar.gz
    § tar -xzvf virtualenv-1.9.1.tar.gz
    § cd virtualenv-1.9.1
    § ~/localpython/bin/python setup.py install
  o Create virtual environment
    § mkdir ~/newenvs
    § cd ~/newenvs
    § ~/localpython/bin/virtualenv py2.7 --python /absolute/path/to/localpython/bin/python2.7 (use an absolute path)
  o Activate virtual environment
    § cd ~/newenvs/py2.7/bin
    § source ./activate
    § Install whatever packages you need
  o Deactivate virtual environment (while the environment is active, the command prompt is prefixed with ‘(py2.7)’)
    § (py2.7)$ deactivate
This part of the tutorial is adapted from http://jessenoller.com/2009/03/16/so-you-want-to-use-python-on-the-mac/
Other information:
Using easy_install:
• http://stackoverflow.com/questions/7465445/how-to-install-python-modules-without-root-access
• http://peak.telecommunity.com/DevCenter/EasyInstall#custom-installation-locations
• http://stackoverflow.com/questions/5506110/it-is-possible-to-install-another-version-of-python-to-virtualenv
Django (*required)
ST-analyzer is optimized for Django v1.4.1. Django is available through https://www.djangoproject.com/download/
You can check the Django version in Python interactive mode as follows:
    >>> import django
    >>> django.get_version()
or
    >>> django.VERSION
Need to maintain different Django versions? See http://djangotricks.blogspot.com/2008/09/note-on-python-paths.html

Installing Django on Windows
If you are working in a Windows environment, download the compressed file from the website above and use WinRAR or another program to extract the contents (store them in a known location for easy access and installation).
• Download the compressed (.tar) version from the above website.
• Uncompress the file using WinRAR or 7-Zip (http://www.7-zip.org/)
• Read the Python documentation (online, or the file found in the doc directory/folder of the Python directory/folder) to make sure the Python path on your system calls the correct version (Python 2.7) when executing the installation scripts.
• Open a command window (open the Start menu and type cmd in the search bar)
• Go to the directory/folder where the uncompressed Django files are located.
• Install the package by typing: python setup.py install
  o On some systems you can type “py” instead of “python” on the command line. This depends on the Python environment settings chosen during installation or modified as described in the Python documentation (environment variables).
• Check the Django version in Python interactive mode by typing python in the command window, and then:
    >>> import django
    >>> print (django.get_version())
• You can also confirm that Django has been installed by going to the Anaconda folder, or the location of the Python folder on your system, and looking at the contents of Lib/site-packages
• Need more information? Please visit https://docs.djangoproject.com/en/1.5/topics/install/
MDAnalysis (*required)
ST-analyzer is optimized for MDAnalysis v0.7.6 and above. MDAnalysis depends on other modules, so to keep the installation simple we recommend installing an all-in-one package. Each of the following packages has its own license, so please visit their websites and check your eligibility before installing.
All-in-one packages
• Anaconda (http://continuum.io/downloads.html) for Linux, Windows and Mac.
• Enthought Canopy (https://www.enthought.com/products/canopy/) for Linux, Windows and Mac.
Install MDAnalysis
• Install one of the all-in-one packages listed above
• Download and install MDAnalysis through https://code.google.com/p/mdanalysis/
• Details of installing MDAnalysis can be found at https://code.google.com/p/mdanalysis/wiki/Install
• Users have reported installation problems. Most of them are caused by outdated versions of Python and the GNU C compiler (http://gcc.gnu.org/). If you have problems with the installation, please check the versions of your GNU C compiler and Python and discuss them with your system administrator
• For more questions about installation, please use the discussion board at https://code.google.com/p/mdanalysis/wiki/Install
Pyhull (*optional)
Pyhull is a Python wrapper for qhull (http://www.qhull.org) that ST-analyzer uses to calculate ‘area-per-lipid’. If you do not need to calculate area-per-lipid, you do not have to install this module. Detailed installation instructions for Pyhull can be found at http://pythonhosted.org/pyhull/
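Before moving on, it can be useful to confirm which of the modules above are importable from the Python you intend to use. The following is a minimal sketch, not part of ST-analyzer: the function name check_modules is hypothetical, and the import names (django, MDAnalysis, pyhull) are assumed from the package names in this section.

```python
import importlib

# Import names assumed from this tutorial's module list; 'pyhull' is optional.
REQUIRED = ["django", "MDAnalysis"]
OPTIONAL = ["pyhull"]


def check_modules(names):
    """Return {name: version string, or None if the import failed}."""
    found = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            found[name] = None
        else:
            # Not every module exposes __version__, so fall back to a label.
            found[name] = getattr(module, "__version__", "unknown")
    return found


if __name__ == "__main__":
    for label, names in (("required", REQUIRED), ("optional", OPTIONAL)):
        for name, version in sorted(check_modules(names).items()):
            status = "missing" if version is None else "found (%s)" % version
            print("%-8s %-12s %s" % (label, name, status))
```

Run the script with the same Python executable that you will later give to ST-analyzer as the ‘Python path’, since different Python installations on one machine can have different modules.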
ST-analyzer (*required)
Download ST-analyzer from GitHub using one of the methods below:
• Manual download: https://github.com/stanalyzer/ST-analyzer
Figure 1. Manual download through GitHub
  o Unzip the zip file: unzip ST-analyzer-master.zip
• Git clone (using the command line): git clone git@github.com:stanalyzer/ST-analyzer.git
Figure 2. Using git clone to install ST-analyzer
Configuration
Assume ST-analyzer is stored in $ST_HOME=/home/your_account/ST-analyzer/stanalyzer (Linux or Unix-based systems) or C:\home\ST-analyzer\stanalyzer (Windows). In $ST_HOME you will find the following files and directories:
• manage.py: required to run the Django server
• stanalyzer.db: database file used by ST-analyzer. ID: admin, Password: 12345
• gui: directory containing 'models' and 'views'
• media: default directory for storing results
• static: directory storing APIs and background modules
• stanalyzer: directory containing system setup files
• templates: directory containing template files for the ST-analyzer GUI
• trajectory: directory containing sample trajectory files
Checking DB consistency
At your system command line prompt, run the following:
• user@stanalyzer> cd $ST_HOME
• user@stanalyzer> python manage.py syncdb
Run Django to launch ST-analyzer
• user@stanalyzer> cd $ST_HOME
• user@stanalyzer> python manage.py runserver 8000
The number ‘8000’ is the port used to communicate with ST-analyzer; a different port number can be used instead.
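Since the port number is a free choice, it can help to check that the port is not already taken before starting the server. The sketch below uses only the Python standard library; the function names port_is_free and first_free_port are hypothetical helpers, not part of ST-analyzer or Django, and the check assumes the server will listen on 127.0.0.1.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if nothing is currently bound to host:port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        return True
    except socket.error:
        return False
    finally:
        sock.close()


def first_free_port(start=8000, tries=20):
    """Return the first free port in [start, start + tries), or None."""
    for port in range(start, start + tries):
        if port_is_free(port):
            return port
    return None


if __name__ == "__main__":
    port = first_free_port()
    if port is None:
        print("No free port found; try a different range.")
    else:
        print("Try: python manage.py runserver %d" % port)
```

If you choose a port other than 8000, remember to use the same number in the port forwarding settings and in the browser address.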
Forwarding port
Use the ssh configuration to forward the port:
• Go to the ‘.ssh’ directory
  o user@stanalyzer> cd ~/.ssh
• Edit or create the ‘config’ file using a text editor
  o For Linux-based systems: user@stanalyzer> vi config
  o For Windows: open or create ‘config’ using a text editor (e.g. notepad.exe)
Edit the 'config' file as follows:
    Host any_name
    HostName your.server.com
    LocalForward 8000 127.0.0.1:8000
Alternatively, use the command: ssh -L 8000:localhost:8000 username@your.server.com
Connecting to ST-analyzer through your web browser
• Open a terminal and connect to the server where ST-analyzer is installed, using the port forwarding described above
• Open http://127.0.0.1:8000 in your browser
• The ST-analyzer login page will appear.
• The initial account and password are 'admin' and '12345'
ST-analyzer GUI
The ST-analyzer GUI is designed around three principles: simple, neat, and useful. Because the interface is intuitive, many users have been able to use ST-analyzer on their first attempt without any prior information, so this section may not be necessary for most users. It is nevertheless intended to provide more detailed information that helps both inexperienced and experienced users work efficiently with ST-analyzer.
Login
Username & Password
ST-analyzer is designed for multi-user environments: the contents and work environment of each account are independent of every other account. Users can therefore hold multiple IDs to store data or work on multiple tasks independently. The initial user name and password are ‘admin’ and ‘12345’, respectively. Notice: the ‘admin’ account is only an initial account for creating users’ personal accounts. It has no special privileges such as accessing or modifying other users’ data.
Figure 3. Login menu
Workspace
Menu bar
The menu bar, denoted as (1) and (2) in Figure 4, has two submenus: ‘ST-analyzer’ and ‘About us’. ‘About us’ contains the help document and a link to the Im Lab website. The ‘ST-analyzer’ menu, denoted as (1), shows the current user information and the ‘Logout’ menu. To protect their data, users should click ‘Logout’ to close the current session.
Figure 4. ST-analyzer workspace
Button for minimizing window By clicking the button located at the bottom left of the workspace denoted as (3), you can hide all windows in the workspace.
User accounts
Once you have logged in to ST-analyzer with the ‘admin’ account, you should create your own account so that your workspace is protected from others.
Figure 5. Account manager
Create, delete, and edit account
The ‘NEW’ button, denoted as (1) in Figure 5, creates a new account. ‘Level’ is used for hierarchical account management, and level 10 is the highest authority. The principle is simple: an account with a higher level can control accounts with a lower level by deleting or editing them. An account cannot control other accounts that have the same level. Therefore, to delete or edit an account of the same level, including their own, users must log in with that account. For example, to edit or delete the ‘admin’ account, users must log in as ‘admin’. If you have problems creating a new user account, please check the following:
• Did you click the “CREATE” button after filling out the information in the “NEW” account window?
• Did you check DB consistency?
  o > cd $ST_HOME
  o > python manage.py syncdb
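The level rule described in this section can be summarized in a few lines. This is only an illustrative sketch of the rule as stated in the text, not ST-analyzer's actual implementation; the function name can_modify and its arguments are hypothetical.

```python
MAX_LEVEL = 10  # highest authority, as stated in this tutorial


def can_modify(actor_id, actor_level, target_id, target_level):
    """Return True if 'actor' may edit or delete the 'target' account."""
    if actor_id == target_id:
        # Users manage their own account by logging in with it.
        return True
    # A strictly higher level controls a lower one; equal levels cannot
    # control each other.
    return actor_level > target_level


if __name__ == "__main__":
    print(can_modify("alice", MAX_LEVEL, "bob", 5))   # higher level edits lower
    print(can_modify("bob", 5, "carol", 5))           # same level: not allowed
    print(can_modify("admin", 5, "admin", 5))         # own account: allowed
```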
The ‘DELETE’ and ‘EDIT’ buttons, denoted as (2), delete and edit the selected account. Before clicking either button, select the target account using the check or edit options denoted as (3).
Searching and ordering the lists
The list box provides simple searching and sorting functions. When you type a word in the search box, denoted as (4), only the accounts whose ID, e-mail, or level match the word remain in the list; unmatched items are removed. To clear the filter and see the full list again, delete all characters in the search box. Clicking the ID column, denoted as (5), sorts the list by ID.
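The search and sort behavior described above can be sketched as follows. The account records, field names (id, email, level), and case-insensitive matching are assumptions made for illustration; the GUI itself may match differently.

```python
def filter_accounts(accounts, query):
    """Keep accounts whose ID, e-mail, or level contains the query text."""
    query = query.strip().lower()
    if not query:
        # An empty search box shows the full list again.
        return list(accounts)
    fields = ("id", "email", "level")
    return [a for a in accounts
            if any(query in str(a[f]).lower() for f in fields)]


def sort_by_id(accounts):
    """Return the accounts ordered by ID, as when the ID column is clicked."""
    return sorted(accounts, key=lambda a: a["id"])


if __name__ == "__main__":
    # Hypothetical example records.
    accounts = [
        {"id": "wonpil", "email": "wonpil@example.org", "level": 10},
        {"id": "admin", "email": "admin@example.org", "level": 5},
    ]
    print(filter_accounts(accounts, "adm"))
    print([a["id"] for a in sort_by_id(accounts)])
```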
Logout ‘admin’ and login with new account
Once the new account is created, please log in with it unless you want to keep using ‘admin’ as your primary account. Please DO NOT USE admin as your personal account, especially in a multi-account environment.
ST-analyzer
Double-clicking the ‘ST-analyzer’ icon in the workspace opens the window shown in Figure 6. This window is one of the most important parts of the GUI: it prepares the job submission for trajectory analysis. To submit a job, the user has to tell ST-analyzer about the system environment. The system environment is stored in the database and can be reused later.
Figure 6. ST-analyzer initial window
STEP1
Step 1 configures a project, which tells ST-analyzer about the system environment.
Figure 7. ST-analyzer at step 1: (1) Creating a new project. (2) Editing the currently selected project. (3) Trajectory file selection (for details, see the ‘Trajectory selection’ section below). (4) List of existing files in the current ‘input path’. (5) List of selected trajectory files. (6) Modification menus for the selected trajectory files. (7) Button to move to STEP 2.
New
After clicking the ‘New’ button in Figure 7 (1), the project configuration window appears as follows:
Figure 8. Step1: creating a new project
(1) Title: the name of the project (e.g. My system: alpha helix).
(2) Input path: the location of the trajectory files, given as an absolute path (e.g. /home/mysystem/protein). If the path does not exist, a warning dialog box will appear.
(3) Undo drop box: recovers accidentally deleted entries; simply choose the deleted item in this drop box.
(4) Output path: the directory where the results are stored. A default path is given in the input box. If the output path does not exist, a warning dialog box will appear. Be careful: if ST-analyzer was installed under an account other than your own, the account that installed ST-analyzer must have write permission on the output path.
(5) Python path: for various reasons, an OS may need to maintain multiple versions of Python. To identify the right version, ST-analyzer requires you to specify the Python executable that has the required modules. If the file does not exist or does not have executable permission, a warning dialog will appear.
(6) Application path: reserved for future use; at the moment users do not need to specify this option.
(7) PBS: if you want to submit jobs to a cluster, provide your PBS information here. A PBS template is shown in the input box. Important PBS directives are described in Table 1.
Table 1. PBS options: definition of important PBS directives

PBS Directive | Description
#PBS -l walltime=HH:MM:SS | The maximum walltime (real time, not CPU time) that a job should take. If this limit is exceeded, PBS will stop the job.
#PBS -l pmem=SIZEgb | The maximum amount of physical memory used by any process in the job. For example, if the job would use up to 2 GB (gigabytes) of memory, then #PBS -l pmem=2gb.
#PBS -l nodes=N:ppn=M | The number of nodes (nodes=N) and the number of processors per node (ppn=M) that the job should use. For example, if the job requires 2 nodes with 12 cores each, then #PBS -l nodes=2:ppn=12.
#PBS -q queuename | The PBS queue that a job should be submitted to. This is only necessary if a user has access to a special queue.
#PBS -j oe | Writes both normal output and error output into the same output file.
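As an illustration of how the directives in Table 1 combine, the sketch below assembles a PBS header from a few values. The function pbs_header is a hypothetical helper and the values are examples only; the template shown in the PBS input box of ST-analyzer, and your cluster's own rules, take precedence.

```python
def pbs_header(walltime, pmem_gb, nodes, ppn, queue=None, join_output=True):
    """Build the '#PBS' directive lines described in Table 1."""
    lines = [
        "#PBS -l walltime=%s" % walltime,          # HH:MM:SS, real time
        "#PBS -l pmem=%dgb" % pmem_gb,             # memory per process
        "#PBS -l nodes=%d:ppn=%d" % (nodes, ppn),  # nodes and cores per node
    ]
    if queue:
        lines.append("#PBS -q %s" % queue)         # only for special queues
    if join_output:
        lines.append("#PBS -j oe")                 # merge stdout and stderr
    return "\n".join(lines)


if __name__ == "__main__":
    # Example values only: 2 nodes with 12 cores each, 2 GB per process.
    print(pbs_header("24:00:00", 2, 2, 12))
```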
(8) Create: to store the information in the database, users must click the “Create” button.
(9) Hide: hides the project configuration window.
Once the project configuration is done, step 1 reloads the trajectory data based on the information given while creating the project. A snapshot is shown below.
Figure 9 Step1 after creating a project
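The path checks described for the project configuration window (input path, output path, Python path) can be reproduced before opening the GUI. The following sketch uses only the standard library; check_project_paths is a hypothetical helper that mirrors the conditions stated in the text, not the code ST-analyzer runs.

```python
import os


def check_project_paths(input_path, output_path, python_path):
    """Return a list of problems; an empty list means all checks passed."""
    problems = []
    if not os.path.isdir(input_path):
        problems.append("input path does not exist: %s" % input_path)
    if not os.path.isdir(output_path):
        problems.append("output path does not exist: %s" % output_path)
    elif not os.access(output_path, os.W_OK):
        problems.append("output path is not writable: %s" % output_path)
    if not os.path.isfile(python_path):
        problems.append("python path does not exist: %s" % python_path)
    elif not os.access(python_path, os.X_OK):
        problems.append("python path is not executable: %s" % python_path)
    return problems


if __name__ == "__main__":
    # Example values taken from this tutorial; adjust to your system.
    for problem in check_project_paths("/home/mysystem/protein",
                                       "/home/mysystem/output",
                                       "/export/apps/bin/python2.7"):
        print("WARNING: " + problem)
```

Note that the write check is performed for the account running the script. If ST-analyzer was installed under a different account, run the check as that account.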
Edit
After clicking the ‘EDIT’ button in Figure 7 (2), the current project configuration window appears as follows:
Figure 10. Step1: Edit window
As shown in Figure 10, all information defined in the project is displayed. To update the current project, users can delete existing values or add new information. The “Undo” drop box helps users recover accidentally deleted items. As a final step, users must click the “Update” button to apply the changes.
Preparing job submission
Project selection
A project contains all information about the system environment; it is therefore the user’s responsibility to give each project a name that identifies it clearly. To help manage multiple projects, ST-analyzer provides a quick search that shows only the items in the list containing the user’s search word and filters out all other items.
Figure 11 Project selection
Input path selection
The input path contains the location of the trajectory files. Users MUST put all trajectories, PDB files, and structure files (e.g. a ‘psf’ file in either CHARMM or NAMD format) in the same directory. A quick search function is also provided to make it easier to find the desired input path.
Figure 12 Input path selection
Output path selection
The output path contains the location of the output files produced by the ST-analyzer analysis modules. The default value is “$ST_HOME/media/user ID”, but the location can be changed by the user when creating the project. With the default path, ST-analyzer automatically creates a subdirectory inside the output path each time a job is submitted. The name of each subdirectory has the format “Year+Month+date+hour+minutes+seconds+12 random characters”. If a user-defined directory is used, ST-analyzer writes the output directly into that directory. Output path selection also provides a quick search function.
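To make the subdirectory naming format concrete, the sketch below generates a name of the same shape. It is an illustration only: the exact digit widths (a 4-digit year and 2 digits for each other field) and the character set used for the 12 random characters are assumptions, and job_dirname is a hypothetical name.

```python
import random
import string
from datetime import datetime


def job_dirname(now=None, n_random=12):
    """Build 'Year+Month+date+hour+minutes+seconds' plus random characters."""
    now = now or datetime.now()
    # Assumed widths: YYYYMMDDHHMMSS (14 digits).
    stamp = now.strftime("%Y%m%d%H%M%S")
    # Assumed alphabet: ASCII letters and digits.
    alphabet = string.ascii_letters + string.digits
    suffix = "".join(random.choice(alphabet) for _ in range(n_random))
    return stamp + suffix


if __name__ == "__main__":
    print(job_dirname())
```

Because the name starts with the submission time, sorting the subdirectories by name lists the jobs in the order they were submitted.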
Figure 13 Output path selection
Python path selection
In some cases, an OS needs to maintain multiple versions of Python. Pointing ST-analyzer to the proper Python is therefore crucial for running the analysis modules. An example with multiple versions of Python is shown below; in this case, the Python 2.7 located at ‘/export/apps/bin/python2.7’ is selected. The desired Python path can be retrieved from the stored Python paths using the quick search tool.
Figure 14 Python path selection
Structure file selection
A structure file, such as a protein structure file (PSF) in CHARMM, contains all of the molecule-specific information needed to apply a particular force field to a molecular system. Use quick search to find a structure file by typing a word contained in its name, as shown below.
Figure 15 Structure file selection
PDB file selection
A PDB file contains the structure of the molecular system and can be found easily using the quick search function in ST-analyzer.
Figure 16 PDB file selection
Trajectory selection
Trajectory files are the actual results obtained from a molecular dynamics simulation. There are two ways to select trajectory files.
1) Using quick filter inputs:
• Type a word contained in the names of the trajectory files
• Select the desired files using the mouse and keyboard (e.g. click the first file with the left mouse button and then click the last file with the ‘shift’ key + left mouse button)
• Click the ‘Add’ button to move the selected files into the right list box, which lists the files finally selected for analysis
• To recover the full list of files, press the ‘ESC’ key on your keyboard
Figure 17 Trajectory file selection through a filter
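The quick filter behaves like a substring match on file names. The sketch below shows the idea with a hypothetical helper, filter_files; the file names are made-up examples, and case-sensitive matching is an assumption.

```python
def filter_files(filenames, word):
    """Keep the file names that contain 'word'; an empty word keeps all."""
    if not word:
        # Corresponds to clearing the filter (the 'ESC' key in the GUI).
        return list(filenames)
    return [name for name in filenames if word in name]


if __name__ == "__main__":
    # Hypothetical file names for illustration.
    files = ["step5_assembly.psf", "step7_1.dcd", "step7_2.dcd", "notes.txt"]
    print(filter_files(files, ".dcd"))
```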
2) Using quick selection filter:
• Type the name of the first file in the ‘FROM’ input box and the name of the last file in the ‘TO’ input box
• Click the ‘Select’ button
Figure 18 Trajectory file selection through file range inputs
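The FROM/TO selection can be pictured as taking an inclusive slice of the file list. The sketch below assumes the list is ordered alphabetically and that both end points are included; select_range is a hypothetical helper and the GUI may order files differently.

```python
def select_range(filenames, first, last):
    """Return all files from 'first' to 'last' (inclusive) in sorted order."""
    ordered = sorted(filenames)
    # list.index raises ValueError if a name is not in the list.
    i, j = ordered.index(first), ordered.index(last)
    if i > j:
        i, j = j, i  # tolerate FROM and TO given in reverse order
    return ordered[i:j + 1]


if __name__ == "__main__":
    # Hypothetical, zero-padded file names for illustration.
    files = ["run_03.dcd", "run_01.dcd", "run_04.dcd", "run_02.dcd"]
    print(select_range(files, "run_02.dcd", "run_04.dcd"))
```

With alphabetical ordering, zero-padded names (run_01, run_02, ..., run_10) keep trajectories in simulation order, whereas unpadded names place run_10 before run_2.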
Move to STEP 2
Clicking the ‘NEXT’ button moves to STEP 2, where the actual analysis modules are set up.
Figure 19 Reading trajectory information at Step 2
STEP2
Step 2 contains all available analysis modules and their configuration GUIs. When the ‘STEP2’ window opens, a summary of the selected trajectory information appears so that users can confirm the system has recognized their inputs.
Figure 20 GUI at Step 2
Select machine type
ST-analyzer can run in two ways: on cluster machines (PBS) or on a server (Interactive). Users choose either PBS or Interactive mode.
• PBS: the analysis modules are processed on cluster machines, with multiple jobs running on the nodes. A PBS script is required (see the PBS option and Table 1 in STEP1 for details).
• Interactive: the analysis modules are processed on the server where ST-analyzer is installed, and multiple jobs are processed sequentially.
Figure 21 Choose target machine
Analysis modules
ST-analyzer comes with built-in modules commonly used in MD analysis. More modules are planned, including but not limited to those in the list below. The order of the list is arbitrary and does not necessarily correspond to the development priority. To simplify the explanation, $ST_HOME denotes the home directory where ST-analyzer is installed.
• System size
• Density profile
• Lipid deuterium order parameters
• Root-mean-squared deviation
• Root-mean-squared fluctuation
• Lipid hydrophobic thickness
• Lipid surface area
• B-factors
• Residue-residue contacts
• Secondary structure
• Dihedral angles
• Distance between two atoms
• H-bond / salt-bridge profile
• Helix (ß-hairpin) tilt, rotation, crossing angles, and distance
• Solid-state NMR properties (chemical shift, dipolar coupling constant)
• Channel pore size
• Membrane potential
• Water/ion movement
• Lipid lateral diffusion constant
• Lipid chain relaxation time
• Lipid rotation/wobble motions
• Sterol tilt angle
• Lateral ion density inside channel
• Residue-water/lipid contact information
• Solvent accessible surface area
• Lipid adaptation (through selection of "local" and "bulk" lipids)
• Substrate binding
• Pore hydration
System size
Analyzes changes in the system size during a simulation. To run this example, users can use the trajectories located in $ST_HOME/trajectory/protein
• GUI location: $ST_HOME/stanalyzer/templates/gui/systemsize.html
• Module location: $ST_HOME/static/analyzers/box.py
Parameters
• Frame Interval: the interval between analyzed frames. For example, with a frame interval of 2, ST-analyzer analyzes every second frame (i.e. 2, 4, 6, 8, …) instead of every frame.
• Output File Name: the name of the output file that stores the results of the current module in text format.
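The effect of the Frame Interval parameter can be shown with a short sketch. It follows the numbering used in the text (frames counted from 1, so an interval of 2 gives 2, 4, 6, 8, …); frames_to_analyze is a hypothetical helper and not the code in box.py.

```python
def frames_to_analyze(n_frames, interval):
    """Return the 1-based frame numbers analyzed for a given interval."""
    if interval < 1:
        raise ValueError("frame interval must be at least 1")
    return list(range(interval, n_frames + 1, interval))


if __name__ == "__main__":
    print(frames_to_analyze(8, 2))
    print(frames_to_analyze(10, 3))
```

A larger interval reduces the number of frames processed roughly in proportion, which shortens the analysis at the cost of time resolution.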
Helix tilt
Analyzes the tilt angle of a helix. To run this example, users can use the trajectories located in $ST_HOME/trajectory/protein
• GUI location: $ST_HOME/stanalyzer/templates/gui/helixtilt.html
• Module location: $ST_HOME/static/analyzers/helix_tilt.py
Parameters
• Frame Interval: the interval between analyzed frames. For example, with a frame interval of 2, ST-analyzer analyzes every second frame (i.e. 2, 4, 6, 8, …) instead of every frame.
• Output File Name: the name of the output file that stores the results of the current module in text format.
• Segments: choose the segment ID that contains the helices. NOTE: the segment has to be defined before the start and end residues can be selected.
• Start: start residue. The drop box lists pairs of residue ID and three-letter residue name. When the segment ID is changed, the residue IDs and names are updated automatically.
• End: end residue. The drop box lists pairs of residue ID and three-letter residue name. When the segment ID is changed, the residue IDs and names are updated automatically.
• Add: by pressing the ‘Add’ button, users can define multiple helix ranges for which the tilt angle is calculated.
Sterol tilt
Analyzes the tilt angle of sterols with respect to the bilayer normal. To run this example, use the trajectories located in $ST_HOME/trajectory/lipids
Ring tilt
• GUI location: $ST_HOME/stanalyzer/templates/gui/steroltilt.html
• Module location: $ST_HOME/static/analyzers/sterol_tilt_ring.py
Parameters
• Frame Interval: the interval between analyzed frames. For example, with a frame interval of 2, ST-analyzer analyzes every second frame (i.e. 2, 4, 6, 8, …) instead of every frame.
• Output File Name: the name of the output file, which stores the results of the current module in text format.
• Segments: choose the segment ID that contains the sterol. In this system, cholesterol (CHL1) is in the MEMB segment.
• Selection Query: lets users define the atoms used for the angle calculation. In this example, the C3 and C17 atoms define the ring tilt. For more information about queries, please refer to the ‘Selection Query’ section in this document. NOTE: the given query is only an example; it is the users’ responsibility to modify the query according to their system.
• Verify: this button shows a summary of the query results.
Tail tilt
• GUI location: $ST_HOME/stanalyzer/templates/gui/steroltilt.html
• Module location: $ST_HOME/static/analyzers/sterol_tilt_tail.py
Parameters
• Frame Interval: the interval between analyzed frames. For example, with a frame interval of 2, ST-analyzer analyzes every second frame (i.e. 2, 4, 6, 8, …) instead of every frame.
• Output File Name: the name of the output file, which stores the results of the current module in text format.
• Segments: choose the segment ID that contains the sterol. In this system, cholesterol (CHL1) is in the MEMB segment.
• Selection Query: lets users define the atoms used for the angle calculation. In this example, the C17 and C25 atoms define the tail tilt. For more information about queries, please refer to the ‘Selection Query’ section in this document. NOTE: the given query is only an example; it is the users’ responsibility to modify the query according to their system.
• Verify: this button shows a summary of the query results.
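The ring and tail tilts are both angles between a two-atom vector and the bilayer normal. The sketch below shows that calculation for a single sterol. It assumes the bilayer normal is the z axis and uses made-up coordinates; the direction of the vector and any leaflet-dependent sign convention used by ST-analyzer are not stated in this tutorial, so `tilt_angle_deg` should be read as a hypothetical helper.

```python
import math


def tilt_angle_deg(atom_a, atom_b, normal=(0.0, 0.0, 1.0)):
    """Angle (degrees) between the vector atom_a -> atom_b and `normal`."""
    vec = [b - a for a, b in zip(atom_a, atom_b)]
    dot = sum(v * n for v, n in zip(vec, normal))
    norm = math.sqrt(sum(v * v for v in vec)) * math.sqrt(sum(n * n for n in normal))
    cos_t = max(-1.0, min(1.0, dot / norm))  # guard against rounding errors
    return math.degrees(math.acos(cos_t))


# Made-up coordinates (in Å) of one cholesterol molecule
c3 = (0.0, 0.0, 0.0)
c17 = (0.0, 1.0, 1.0)
c25 = (0.0, 1.0, 3.0)

ring_tilt = tilt_angle_deg(c3, c17)   # ring tilt from the C3 -> C17 vector
tail_tilt = tilt_angle_deg(c17, c25)  # tail tilt from the C17 -> C25 vector
print(ring_tilt, tail_tilt)
```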
RMSD
Analyzes the root-mean-square deviation (RMSD).
\[ \mathrm{RMSD} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\delta_i^{2}} \]
Here, δ_i is the distance between the i-th of N pairs of atoms (e.g. Cα). In the example, the Cα atoms in the PROA segment are used for aligning the structures, and then the RMSD of the Cβ atoms is calculated. To run this example, use the trajectories located in $ST_HOME/trajectory/protein
• GUI location: $ST_HOME/stanalyzer/templates/gui/rmsd.html
• Module location: $ST_HOME/static/analyzers/rmsd.py
Parameters
• Frame Interval: the interval between analyzed frames. For example, with a frame interval of 2, ST-analyzer analyzes every second frame (i.e. 2, 4, 6, 8, …) instead of every frame.
• Output File Name: the name of the output file, which stores the results of the current module in text format.
• Atom selection for alignment: the atoms used for aligning the structures.
• Atom selection for RMSD calculation: the atoms whose RMSD is calculated after the structures have been aligned on the alignment atoms selected above.
• Verify: this button shows a summary of the query results.
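The RMSD formula can be evaluated in a few lines of Python. The sketch below assumes that the two structures have already been aligned and that the atoms are given in the same order; the alignment step itself is not shown, and `rmsd` is a hypothetical helper rather than the code in rmsd.py.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equally sized lists of (x, y, z) positions."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    # delta_i^2: squared distance between the i-th pair of atoms
    sq_dist = [sum((a - b) ** 2 for a, b in zip(p, q))
               for p, q in zip(coords_a, coords_b)]
    return math.sqrt(sum(sq_dist) / len(sq_dist))


# Made-up example: every atom is displaced by 1 Å along z
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
current = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(reference, current))
```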
RMSF
Analyzes the root-mean-square fluctuation (RMSF).
\[ \mathrm{RMSF}_i = \sqrt{\frac{1}{T}\sum_{j=1}^{T}\left(x_i(t_j)-\bar{x}_i\right)^{2}} \]
Here, T is the time (i.e. the number of frames), x_i(t_j) is the position of atom i at time t_j, and \bar{x}_i is the time-averaged position of atom i. In the example below, the Cα atoms in the PROA segment are used for aligning the structures, and then the RMSF of all atoms in the PROA segment is calculated. To run this example, use the trajectories located in $ST_HOME/trajectory/protein
• GUI location: $ST_HOME/stanalyzer/templates/gui/rmsf.html
• Module location: $ST_HOME/static/analyzers/rmsf.py
Parameters
• Frame Interval: the interval between analyzed frames. For example, with a frame interval of 2, ST-analyzer analyzes every second frame (i.e. 2, 4, 6, 8, …) instead of every frame.
• Output File Name: the name of the output file, which stores the results of the current module in text format.
• RMSF based on residue (*default = based on individual atom): the RMSF can be reported either per residue or per individual atom.
• Atom selection for alignment: the atoms used for aligning the structures.
• Atom selection for RMSF calculation: the atoms whose RMSF is calculated after the structures have been aligned on the alignment atoms selected above.
• Verify: this button creates a summary of the query results unless the query contains syntax or logical errors.
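The RMSF of a single atom can be sketched as follows. The example assumes that all frames have already been aligned and treats the squared term in the formula as the squared distance from the time-averaged position; `rmsf` is a hypothetical helper, not the code in rmsf.py.

```python
import math


def rmsf(positions):
    """RMSF of one atom from its per-frame (x, y, z) positions."""
    n_frames = len(positions)
    if n_frames == 0:
        raise ValueError("need at least one frame")
    # Time-averaged position of the atom
    mean = [sum(p[k] for p in positions) / n_frames for k in range(3)]
    # Squared distance from the mean position in each frame
    sq_dev = [sum((p[k] - mean[k]) ** 2 for k in range(3)) for p in positions]
    return math.sqrt(sum(sq_dev) / n_frames)


# Made-up example: an atom oscillating by ±1 Å around the origin along x
frames = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0), (1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)]
print(rmsf(frames))
```

A per-residue value, as offered by the "RMSF based on residue" option, would require combining the atoms of each residue; how ST-analyzer combines them is not specified here.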
Membrane Density Profiles
Analyzes the density based on the number of atoms along a given axis with a given bin size.
all
This module calculates the density of all atoms in the system after recentering the coordinates in the trajectories. To run this example, use the trajectories located in $ST_HOME/trajectory/lipids
• GUI location: $ST_HOME/stanalyzer/templates/gui/densityprofile.html
• Module location: $ST_HOME/static/analyzers/density_all.py
Parameters
• Frame Interval: the interval between analyzed frames. For example, with a frame interval of 2, ST-analyzer analyzes every second frame (i.e. 2, 4, 6, 8, …) instead of every frame.
• Output File Name: the name of the output file, which stores the results of the current module in text format.
• Axis: the reference axis along which the density is calculated.
• Min: the lowest coordinate of the expected range. NOTE: this value only affects the range of the bins.
• Max: the highest coordinate of the expected range. NOTE: this value only affects the range of the bins.
• Bin size: the width of each bin used to analyze the density.
• Users can select the atoms involved in the density calculation. In this case all atoms are involved.
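The roles of Axis, Min, Max, and Bin size can be illustrated with a simple histogram. The sketch below only counts atoms per bin along one axis; it assumes the coordinates have already been recentered and it does not normalize by volume or by the number of frames, which the actual density modules may do. `density_histogram` is a hypothetical helper.

```python
import math


def density_histogram(coords, axis_min, axis_max, bin_size):
    """Count atoms per bin along one axis; returns (bin_centers, counts)."""
    if bin_size <= 0 or axis_max <= axis_min:
        raise ValueError("need bin_size > 0 and axis_max > axis_min")
    n_bins = int(math.ceil((axis_max - axis_min) / bin_size))
    counts = [0] * n_bins
    for c in coords:
        # Atoms outside [axis_min, axis_max) are ignored: Min and Max only set the bin range
        if axis_min <= c < axis_max:
            index = min(int((c - axis_min) // bin_size), n_bins - 1)
            counts[index] += 1
    centers = [axis_min + (i + 0.5) * bin_size for i in range(n_bins)]
    return centers, counts


# Made-up z coordinates (in Å) of six atoms; the last one lies outside the range
z_coords = [-1.5, -0.5, -0.4, 0.5, 1.5, 3.0]
centers, counts = density_histogram(z_coords, -2.0, 2.0, 1.0)
print(centers, counts)
```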
Lipid head
This module calculates the density of the atoms that belong to the lipid head group. To run this example, use the trajectories located in $ST_HOME/trajectory/lipids
• GUI location: $ST_HOME/stanalyzer/templates/gui/densityprofile.html
• Module location: $ST_HOME/static/analyzers/density_lpH.py
Parameters
• Frame Interval: the interval between analyzed frames. For example, with a frame interval of 2, ST-analyzer analyzes every second frame (i.e. 2, 4, 6, 8, …) instead of every frame.
• Output File Name: the name of the output file, which stores the results of the current module in text format.
• Axis: the reference axis along which the density is calculated.
• Min: the lowest coordinate of the expected range. NOTE: this value only affects the range of the bins.
• Max: the highest coordinate of the expected range. NOTE: this value only affects the range of the bins.
• Bin size: the width of each bin used to analyze the density.
• Users can select the atoms involved in the density calculation. To select the atoms in the head group, this example uses “segid MEMB and (name P or name N or (name C1* and not name C1) or name O1*)”. Within the MEMB segment, the query selects the atoms named P or N, the atoms whose names start with C1 (except C1 itself), and the atoms whose names start with O1.
• Verify: this button creates a summary of the query results unless the query contains syntax or logical errors.
Lipid tail
This module calculates the density of the atoms that belong to the lipid tail group. To run this example, use the trajectories located in $ST_HOME/trajectory/lipids
• GUI location: $ST_HOME/stanalyzer/templates/gui/densityprofile.html
• Module location: $ST_HOME/static/analyzers/density_lpT.py
Parameters
• Frame Interval: the interval between analyzed frames. For example, with a frame interval of 2, ST-analyzer analyzes every second frame (i.e. 2, 4, 6, 8, …) instead of every frame.
• Output File Name: the name of the output file, which stores the results of the current module in text format.
• Axis: the reference axis along which the density is calculated.
• Min: the lowest coordinate of the expected range. NOTE: this value only affects the range of the bins.
• Max: the highest coordinate of the expected range. NOTE: this value only affects the range of the bins.
• Bin size: the width of each bin used to analyze the density.
• Users can select the atoms involved in the density calculation. To select the atoms in the tail group, this example uses “segid MEMB and (name C2* or name C3*) and not (name C21 or name C31)”. As written, the query selects all atoms in the MEMB segment whose names start with C2 or C3, except C21 and C31.
• Verify: this button creates a summary of the query results unless the query contains syntax or logical errors.
Water
This module calculates the density of water atoms in the system. To run this example, use the trajectories located in $ST_HOME/trajectory/lipids
• GUI location: $ST_HOME/stanalyzer/templates/gui/densityprofile.html
• Module location: $ST_HOME/static/analyzers/density_water.py
Parameters
• Frame Interval: the interval between analyzed frames. For example, with a frame interval of 2, ST-analyzer analyzes every second frame (i.e. 2, 4, 6, 8, …) instead of every frame.
• Output File Name: the name of the output file, which stores the results of the current module in text format.
• Axis: the reference axis along which the density is calculated.
• Min: the lowest coordinate of the expected range. NOTE: this value only affects the range of the bins.
• Max: the highest coordinate of the expected range. NOTE: this value only affects the range of the bins.
• Bin size: the width of each bin used to analyze the density.
• Users can select the atoms involved in the density calculation. To select water atoms, this example uses “segid TIP3 and name OH2”. The query selects all OH2 atoms in the TIP3 segment.
• Verify: this button creates a summary of the query results unless the query contains syntax or logical errors.
Custom selection
This module calculates the density of user-defined atoms. To run this example, use the trajectories located in $ST_HOME/trajectory/lipids
• GUI location: $ST_HOME/stanalyzer/templates/gui/densityprofile.html
• Module location: $ST_HOME/static/analyzers/density_custom.py
Parameters
• Frame Interval: the interval between analyzed frames. For example, with a frame interval of 2, ST-analyzer analyzes every second frame (i.e. 2, 4, 6, 8, …) instead of every frame.
• Output File Name: the name of the output file, which stores the results of the current module in text format.
• Axis: the reference axis along which the distribution is calculated.
• Min: the lowest value of the expected range. NOTE: this value only affects the range of the bins.
• Max: the highest value of the expected range. NOTE: this value only affects the range of the bins.
• Bin size: the width of each bin used to analyze the density.
• Users can select any atoms in the system and analyze their density.
• Verify: this button creates a summary of the query results unless the query contains syntax or logical errors.
• Add: users can add multiple atom selections, each of which produces a separate output.
• Use GUI: when this option is checked, users can write the query with a simple graphical interface. For details about queries, please refer to the ‘Selection Query’ section.
Water dipole
This module calculates the water dipole in the system. To run this example, use the trajectories located in $ST_HOME/trajectory/lipids
• GUI location: $ST_HOME/stanalyzer/templates/gui/densityprofile.html
• Module location: $ST_HOME/static/analyzers/density_waterdp.py
Parameters
• Frame Interval: the interval between analyzed frames. For example, with a frame interval of 2, ST-analyzer analyzes every second frame (i.e. 2, 4, 6, 8, …) instead of every frame.
• Output File Name: the name of the output file, which stores the results of the current module in text format.
• Axis: the reference axis along which the distribution is calculated.
• Min: the lowest coordinate of the expected range. NOTE: this value only affects the range of the bins.
• Max: the highest coordinate of the expected range. NOTE: this value only affects the range of the bins.
• Bin size: the width of each bin used to analyze the density.
• Users can select the atoms involved in the calculation. To select water atoms, this example uses “segid TIP3”. The query selects all atoms in the TIP3 segment, which consists of H1, H2, and OH2 atoms.
• Verify: this button creates a summary of the query results unless the query contains syntax or logical errors.
Vector selection
This module calculates the distribution of angles (i.e. cosθ) between the reference axis and the selected atoms. To run this example, use the trajectories located in $ST_HOME/trajectory/lipids
• GUI location: $ST_HOME/stanalyzer/templates/gui/densityprofile.html
• Module location: $ST_HOME/static/analyzers/density_vector.py
Parameters
• Frame Interval: the interval between analyzed frames. For example, with a frame interval of 2, ST-analyzer analyzes every second frame (i.e. 2, 4, 6, 8, …) instead of every frame.
• Output File Name: the name of the output file, which stores the results of the current module in text format.
• Axis: the reference axis along which the distribution is calculated.
• Min: the lowest value of the expected range. NOTE: this value only affects the range of the bins.
• Max: the highest value of the expected range. NOTE: this value only affects the range of the bins.
• Bin size: the width of each bin used to analyze the distribution.
• Users can select the atoms involved in the vector calculation. This example uses “segid MEMB and name P”. The query selects all phosphorus (P) atoms in the MEMB segment.
• Verify: this button creates a summary of the query results unless the query contains syntax or logical errors.
Membrane Order Parameters
Calculates the deuterium order parameters of selected atoms.
CHARMM format
This module calculates the order parameters of selected atoms. Currently only the CHARMM format is available. To run this example, use the trajectories located in $ST_HOME/trajectory/lipids
• GUI location: $ST_HOME/stanalyzer/templates/gui/orderparameters.html
• Module location: $ST_HOME/static/analyzers/ordpara_charmm.py
Parameters
• segid: the segment ID. MEMB is selected in this example.
• resname: the residue name. DOPC is selected in this example.
• Selection query: specifies the atoms. In this example, one of the DOPC tails is selected by “name C2* and not (name C2 or name C21)”.
• Frame Interval: the interval between analyzed frames. For example, with a frame interval of 2, ST-analyzer analyzes every second frame (i.e. 2, 4, 6, 8, …) instead of every frame.
• Output File Name: the name of the output file, which stores the results of the current module in text format.
• Axis: the reference axis that defines the bilayer normal.
• Verify: this button creates a summary of the query results unless the query contains syntax or logical errors.
• Add: users can add multiple atom selections, each of which produces a separate output.
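For orientation, the deuterium order parameter of a carbon is commonly defined as the average of (3cos²θ − 1)/2, where θ is the angle between a C–H bond and the bilayer normal. The sketch below evaluates this average for a list of C–H bond vectors. It assumes the bilayer normal is the z axis, and whether ST-analyzer reports the signed or the absolute value is not stated in this tutorial; `order_parameter` is a hypothetical helper, not the code in ordpara_charmm.py.

```python
def order_parameter(ch_vectors, normal=(0.0, 0.0, 1.0)):
    """Average of (3*cos^2(theta) - 1) / 2 over C-H bond vectors."""
    if not ch_vectors:
        raise ValueError("need at least one C-H vector")
    normal_sq = sum(n * n for n in normal)
    total = 0.0
    for vec in ch_vectors:
        dot = sum(v * n for v, n in zip(vec, normal))
        # cos^2(theta) from the dot product, without taking square roots
        cos_sq = dot * dot / (sum(v * v for v in vec) * normal_sq)
        total += 0.5 * (3.0 * cos_sq - 1.0)
    return total / len(ch_vectors)


# Made-up bond vectors: one parallel and one perpendicular to the normal
print(order_parameter([(0.0, 0.0, 1.0), (1.0, 0.0, 0.0)]))
```

In practice the vectors would be collected over all selected lipids and all analyzed frames before averaging.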
Membrane Thickness
Estimates the thickness of the membrane.
Using Phosphate
This module calculates the thickness based on phosphate atoms. To run this example, use the trajectories located in $ST_HOME/trajectory/lipids
• GUI location: $ST_HOME/stanalyzer/templates/gui/thickness.html
• Module location: $ST_HOME/static/analyzers/thickness_phosphate.py
Parameters
• Query: selects the phosphate atoms.
• Frame Interval: the interval between analyzed frames. For example, with a frame interval of 2, ST-analyzer analyzes every second frame (i.e. 2, 4, 6, 8, …) instead of every frame.
• Output File Name: the name of the output file, which stores the results of the current module in text format.
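One simple way to estimate a thickness from the selected atoms of a single frame is sketched below. It assumes that the bilayer normal is the z axis and that the two leaflets can be separated by the mean z coordinate of the selected atoms; the method actually used by thickness_phosphate.py is not described in this tutorial, and `phosphate_thickness` is a hypothetical helper.

```python
def phosphate_thickness(p_z):
    """Distance between the mean z of upper- and lower-leaflet atoms."""
    if not p_z:
        raise ValueError("need at least one z coordinate")
    center = sum(p_z) / len(p_z)
    # Assign each atom to a leaflet by its position relative to the center
    upper = [z for z in p_z if z > center]
    lower = [z for z in p_z if z <= center]
    if not upper or not lower:
        raise ValueError("could not separate two leaflets")
    return sum(upper) / len(upper) - sum(lower) / len(lower)


# Made-up z coordinates (in Å) of six phosphorus atoms in one frame
print(phosphate_thickness([19.0, 20.0, 21.0, -19.0, -20.0, -21.0]))
```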
Using Carbon
This module calculates the thickness based on carbon atoms. To run this example, use the trajectories located in $ST_HOME/trajectory/lipids
• GUI location: $ST_HOME/stanalyzer/templates/gui/thickness.html
• Module location: $ST_HOME/static/analyzers/thickness_carbon.py
Parameters
• Query: selects the carbon atoms. Users must redefine the atoms so that they correspond to their own system.
• Frame Interval: the interval between analyzed frames. For example, with a frame interval of 2, ST-analyzer analyzes every second frame (i.e. 2, 4, 6, 8, …) instead of every frame.
• Output File Name: the name of the output file, which stores the results of the current module in text format.
Using custom
This module calculates the thickness based on user-defined atoms. To run this example, use the trajectories located in $ST_HOME/trajectory/lipids
• GUI location: $ST_HOME/stanalyzer/templates/gui/thickness.html
• Module location: $ST_HOME/static/analyzers/thickness_custom.py
Parameters
• segid: the segment ID. MEMB is selected in this example.
• resname: the residue name. DOPG is selected in this example.
• Query: selects the atoms. Users can select atoms according to their own purpose.
• Frame Interval: the interval between analyzed frames. For example, with a frame interval of 2, ST-analyzer analyzes every second frame (i.e. 2, 4, 6, 8, …) instead of every frame.
• Output File Name: the name of the output file, which stores the results of the current module in text format.
• Add: users can add multiple atom selections, each of which produces a separate output.
Average surface area per lipid
Calculates the area of individual lipids and the overall average surface area per lipid.
Using Voronoi diagram
This module calculates the area per lipid using a Voronoi diagram computed by Pyhull. To run this example, the Pyhull module (http://pythonhosted.org/pyhull/) must be installed for Python. The trajectories used in this example are located in $ST_HOME/trajectory/lipids
• GUI location: $ST_HOME/stanalyzer/templates/gui/aveperlipid.html
• Module location: $ST_HOME/static/analyzers/lipid_per_area_voro.py
Parameters
• System size: this information is obtained from the initial PDB file.
• Query: selects the atoms that define the surface area. The given system contains five different residues (CHL1, POPC, POPI, POPI13, and DOPC), so the query must specify the atoms that define the surface for each of them. This example uses the query “segid MEMB and ((resname CHL1 and name O3) or (resname DOPC and (name C2 or name C21 or name C31)) or (resname POPC and (name C2 or name C21 or name C31)) or (resname POPI and (name C2 or name C21 or name C31)) or (resname POPI13 and (name C2 or name C21 or name C31)))”, so that cholesterol (CHL1) is represented by one atom (O3) and all other residues (DOPC, POPC, POPI, and POPI13) are represented by three atoms (C2, C21, and C31).
• Frame Interval: the interval between analyzed frames. For example, with a frame interval of 2, ST-analyzer analyzes every second frame (i.e. 2, 4, 6, 8, …) instead of every frame.
• Output File Name: the name of the output file, which stores the results of the current module in text format.
Result Viewer
One of the unique features of ST-analyzer is its data retrieval function, which is built on SQLite3. This section introduces the ST-analyzer data retrieval GUI, ‘Result Viewer’, which consists of a simple search engine, a graphic viewer, data download, and data management tools.
Project Retrieval
Once Result Viewer is launched, users can see the list of existing projects. For data retrieval, Result Viewer provides the following tools to navigate the stored data:
1) Quick search engine: lists any data whose ID or Project contains the words typed by the user.
2) Sorting: clicking the triangles in the ID column sorts the data in either ascending or descending order.
3) Collapsing lists: clicking the triangle collapses the data lists associated with a project.
4) Expanding lists: clicking the triangle expands the row to show the associated data lists.
Figure 22. Result Viewer: project retrieval
Outputs
Users can see the results of submitted jobs by expanding each project as explained above. ST-analyzer provides two ways to access the actual outputs: a graph viewer and file download. The Outputs window also supports a quick search tool.
1) Quick search engine: retrieves the data that contain the words typed for each column.
2) Image viewer: clicking a thumbnail opens the corresponding image in a popup window.
3) File download: some modules produce multiple outputs, but the Outputs GUI shows only one image at a time, so downloading the entire dataset may be necessary. The files contain the raw data used for plotting the graphs, the parameter settings of the applied modules, and more.
Figure 23. Result Viewer: Outputs
Page navigator
The Outputs and Project Retrieval lists can contain more records than can be displayed in a small window. To keep the view organized, ST-analyzer provides page-based lists.
1) Refresh button: refreshes the lists.
2) Page navigator: provides easy navigation between pages. The navigator can also change the number of records shown on each page.
Figure 24. Result Viewer: Page navigator
Data manager
ST-analyzer provides a simple data management tool that can delete data and jobs in the queue.
1) Target: select the target group from which items are deleted by their IDs
   a. Project Retrieval: deleting a project also deletes all associated outputs
   b. Outputs: deletes output records
   c. Queue: deletes jobs in the queue by “job id”. NOTE: deleting an output also deletes the corresponding job if the submitted job is still in the queue.
2) Delete records
   a. IDs: either project or output IDs are used to identify the targeted records. For consecutive IDs, users can use a shorthand notation. For example, to delete Outputs from ID 111 to ID 115, users can simply type 111-115, which is equivalent to writing 111, 112, 113, 114, 115.
   b. Directory: when this option is checked, both the records in the database and the physical data in the output directory are deleted.
Figure 25. Result Viewer: Data manager
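The shorthand for consecutive IDs can be pictured with the following sketch, which expands an ID string into a list of integers. It is an illustration only; `expand_ids` is a hypothetical helper, and whether the Data manager accepts ranges mixed with single IDs in one entry is an assumption of this example.

```python
def expand_ids(text):
    """Expand an ID string such as '111-115' into a list of integers."""
    ids = []
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            # 'first-last' stands for every ID from first to last, inclusive
            first, last = (int(x) for x in part.split("-", 1))
            if first > last:
                raise ValueError("range start must not exceed range end")
            ids.extend(range(first, last + 1))
        else:
            ids.append(int(part))
    return ids


print(expand_ids("111-115"))
```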
Selection Query
Although ST-analyzer is designed to analyze simulation trajectories automatically, some technical issues prevent it from being fully automatic. One reason is the diversity of users’ demands and interests regarding the scope of analysis and the variety of biological systems. To maximize the applicability of the predefined modules in ST-analyzer, the selection queries used in the MDAnalysis module (https://code.google.com/p/mdanalysis/) are incorporated. This section introduces the syntax of selection queries with some example queries used in ST-analyzer.
Selection keywords and usages
NOTICE:
1) Keywords are case sensitive.
2) Selections are parsed left to right, and parentheses can be used for grouping.
3) Patterns only allow the wild card ‘*’.
4) ‘<>’ indicates required user inputs.
5) ‘[]’ indicates optional user inputs.
6) ‘|’ indicates alternative choices.
• * : wild card used for pattern matching
  o C1* will retrieve C11, C12, C13, …
• segid <segment name>: select by segid
  o segid DOPC – select all atoms in the segment named DOPC
• resid <residue-number[:range]>: select by residue ID
  o resid 134 – select the residue with ID 134
  o resid 134:150 – select the residues with IDs from 134 to 150
• resname <residue name>: select by residue name
  o resname LYS – select lysine
• name <atom name>: select by atom name as given in the topology. This is often force field dependent
  o name CA – select Cα atoms
• type <atom type>: select by atom type; this is either a string or a number and depends on the force field; it is read from the topology file (e.g. the CHARMM PSF file contains numeric atom types)
  o type OHL – select the atoms of type OHL
• atom <segment name> <residue id> <atom name>: select a single specified atom
  o atom DOPC 1 C2 – select the C2 atom of residue 1 in segment DOPC
• not <selections>: select all atoms except those matched by the selections
  o not name C1 – select all atoms except C1 atoms
• and | or: combine two selections according to the rules of Boolean algebra
  o segid MEMB and not (name C1 or name O1*) – select all atoms in the MEMB segment except C1 atoms and atoms whose names start with O1
• around <distance> <selections>: select all atoms within a certain cutoff distance of another selection
  o around 3.5 protein – select all atoms that are within 3.5 Å of the protein and do not belong to the protein
• point <x> <y> <z> <distance>: select all atoms located within a certain distance of a point defined by x, y, and z coordinates
  o point 5.0 5.0 5.0 3.5 – select all atoms within 3.5 Å of the point located at (5.0, 5.0, 5.0)
• prop [abs] <x|y|z> <<|<=|>|>=|==|!=> <value>: select atoms based on position
  o prop abs z <= 5.0 – select all atoms whose z coordinate satisfies -5.0 <= z <= 5.0
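The meaning of the wild card, the Boolean operators, and the prop keyword can be mimicked in plain Python. The sketch below does not call MDAnalysis or its query parser; it applies the same logic to a small made-up list of atoms. The atom names and coordinates are invented for illustration, and the shell-style matching of `fnmatchcase` is used as a stand-in for the ‘*’ wild card.

```python
from fnmatch import fnmatchcase

# Made-up atoms: name and z coordinate (in Å)
atoms = [
    {"name": "C1", "z": 1.0},
    {"name": "C11", "z": 4.0},
    {"name": "C12", "z": -6.0},
    {"name": "O11", "z": -4.5},
    {"name": "P", "z": 7.2},
]

# Equivalent of "name C1* and not name C1": the pattern also matches C1 itself,
# so C1 has to be excluded explicitly
c1_family = [a["name"] for a in atoms
             if fnmatchcase(a["name"], "C1*") and a["name"] != "C1"]

# Equivalent of "prop abs z <= 5.0": keep atoms with -5.0 <= z <= 5.0
near_center = [a["name"] for a in atoms if abs(a["z"]) <= 5.0]

print(c1_family)
print(near_center)
```

Writing a query out this way on a handful of atoms is a quick check that the Boolean grouping does what is intended before pressing ‘Verify’ in the GUI.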
For details and up-to-date selection queries, please visit: http://mdanalysis.googlecode.com/git/package/doc/html/documentation_pages/selections.html