Hindawi Publishing Corporation, Archaea, Volume 2015, Article ID 179196, 9 pages, http://dx.doi.org/10.1155/2015/179196

Research Article
Sequence, Structure, and Binding Analysis of Cyclodextrinase (TK1770) from T. kodakarensis (KOD1) Using an In Silico Approach

Ramzan Ali and Muhammad Imtiaz Shafiq

Institute of Biochemistry and Biotechnology, University of the Punjab, Lahore 54590, Pakistan

Correspondence should be addressed to Muhammad Imtiaz Shafiq; imtiazshafiq@gmail.com

Received 30 July 2015; Revised 12 October 2015; Accepted 1 November 2015

Academic Editor: Isaac K. O. Cann

Copyright © 2015 R. Ali and M. I. Shafiq. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Thermostable cyclodextrinase (Tk1770 CDase) from the hyperthermophilic archaeon Thermococcus kodakarensis (KOD1) hydrolyzes cyclodextrins into linear dextrins. The sequence of Tk1770 CDase retrieved from UniProt was aligned with sequences of sixteen CD hydrolyzing enzymes and a phylogenetic tree was constructed using Bayesian inference. The homology model of Tk1770 CDase was constructed and optimized with the Modeller v9.14 program. The model was validated with the ProSA server and PROCHECK analysis. Four conserved regions and the catalytic triad of the GH13 family, consisting of Asp411, Glu437, and Asp502, were identified in the catalytic site. An additional fifth conserved region downstream of the fourth region was also identified. The structure of Tk1770 CDase contains an additional N′-domain and a helix-loop-helix motif that is conserved in all archaeal CD hydrolyzing enzymes. The N′-domain contains an extended loop region that forms a part of the catalytic domain and plays an important role in stability and substrate binding. The docking of substrate into the catalytic site revealed interactions with different conserved residues involved in substrate binding and formation of the enzyme-substrate complex.

1. Introduction

Enzymatic hydrolysis of polysaccharides is a method of choice in many industrial processes due to its high efficiency and better yields of the products as compared to acid hydrolysis. Glycoside hydrolases have been used for processing of starch, cellulose, hemicellulose, and cyclodextrins [1]. Cyclodextrins (CDs) are cyclic oligosaccharides with six or more glucopyranosyl units linked through α-1,4 glycosidic bonds. Cyclodextrins with six, seven, and eight glucopyranosyl moieties are termed α-, β-, and γ-cyclodextrins, respectively. In water, CDs adopt a structure with all hydrophilic groups directed towards the exterior surface and hydrophobic groups towards an internal cavity. The internal hydrophobic cavity of CDs makes them resistant to hydrolysis by common amylases and also allows them to form inclusion complexes with different organic molecules [2, 3]. CDs have many applications in the food, agriculture, cosmetics, and pharmaceutical industries. The vast applications of CDs and their hydrolytic products create a need for efficient and specific enzymes for CD hydrolysis [2, 4, 5]. The glycoside hydrolases have been classified into 14 clans and 133 families. Each clan consists of at least two families that share catalytic fold and mechanism according to the database of carbohydrate active enzymes [6]. The family GH13 (also called the α-amylase superfamily), the largest family of glycoside hydrolases, is a member of clan GH-H along with families GH-70 and GH-77. The α-amylase family (GH13) is the most important family for industrial applications [7–9]. Although there is low sequence similarity among the enzymes of different families within this clan, they all exhibit certain structural features that have been conserved during evolution [10, 11].

The family GH13 is further classified into 35 subfamilies with at least 26 different specificities [12], including α-amylase (EC 3.2.1.1), pullulanase (EC 3.2.1.41), glucanotransferase (EC 2.4.1.25), and cyclodextrinase or cyclomaltodextrinase (EC 3.2.1.54). All of these subfamilies share a common catalytic domain comprising a TIM barrel or (β/α)8 barrel with

a catalytic triad [13, 14] and a C-terminal domain consisting of β-strands only [10]. Many enzymes also possess N- and/or C-terminal carbohydrate binding modules like CBM34, CBM20, CBM41, and CBM48 [12, 15]. The CD hydrolyzing enzymes include cyclodextrinase, maltogenic amylase, and neopullulanase, which hydrolyze CDs into linear maltodextrins or maltose [16, 17]. Recently, thermostable pullulan hydrolase III from Thermococcus kodakarensis (KOD1) has also been reported to hydrolyze CDs into maltose or glucose [18]. The thermostable enzymes from hyperthermophiles have many advantages, including higher rates of reaction, increased product yields, and decreased risks of contamination, as compared to their mesophilic homologs [19–21]. Due to advanced sequencing technologies and the rapidly increasing number of genomes being sequenced, the number of sequences classified as glycoside hydrolases far exceeds the number of enzymes that have been structurally or biochemically characterized [22, 23]. Currently, the GH13 family contains 26287 sequences but only 99 structures have been resolved [24]. At the time of writing, the Protein Data Bank contains only six CD hydrolyzing enzymes (PDB IDs 4AEF, 1EA9, 2XIE, 1J0H, 1H3G, and 1BVZ) [6]. There is a need for better understanding of the sequence and structural components of these proteins and their mechanism of catalysis as CDases. A bioinformatics approach can be used as a valuable predictive tool to provide information about the structure and function of these enzymes.

Table 1: List of the sequences used for alignment and phylogenetics.

Enzyme                  Organism                         Abbreviation     AA   Seq. similarity to Tk1770 (%)  UniProt ID
Cyclomaltodextrinase    T. kodakarensis                  Tk1770 CDase     656  100                            Q5JJ59
Cyclomaltodextrinase    Thermococcus sp. (strain CGMCC)  THES4 CDase      637  60                             G0HJP6
Maltogenic α-amylase    T. gammatolerans                 THEGJ MAse       638  59                             C5A4D9
Cyclomaltodextrinase    T. cleftensis                    THERCLF CDase    644  59                             I3ZTQ5
Cyclomaltodextrinase    T. onnurineus                    THEON CDase      652  59                             B6YV58
Cyclomaltodextrinase    Pyrococcus yayanosii             PYRYC CDase      656  57                             F8AHJ5
Neopullulanase          Pyrococcus furiosus              PYRFU NPase      645  56                             Q8TZP8
Cyclomaltodextrinase    T. paralvinella                  THERPA CDase     654  56                             W0I4Q4
Neopullulanase          T. litoralis                     THELN NPase      655  55                             H3ZKI8
Cyclomaltodextrinase    Thermococcus sp. B1001           THERSP CDase     660  54                             Q9HHC8
α-amylase               T. pendens                       THEPD α-amylase  644  52                             A1S075
α-amylase               Staphylothermus marinus          STAMF α-amylase  696  28                             A3DM60
Cyclomaltodextrinase    Geobacillus sp. G1w1             GBACI CDase      587  32                             A0A093UHG3
Cyclomaltodextrinase    Paenibacillus wynnii             PBACI CDase      581  31                             A0A098M8Z8
α-cyclomaltodextrinase  Bacillus mycoides                BACMY α-CDase    586  30                             C3APY4
Neopullulanase          G. stearothermophilus            GEOSE NPase      588  31                             Q9AIV2
Cyclomaltodextrinase    Bacillus indicus                 BACIIN CDase     589  29                             A0A084GIJ0

AA means amino acids.

In this work, we have used an in silico approach to provide insight into the sequence, structural components, domain arrangement, catalytic machinery, and enzyme-substrate interactions of thermophilic cyclodextrinase (Tk1770) from Thermococcus kodakarensis (KOD1), an enzyme with potential industrial applications. This study is the first attempt to use an in silico approach to examine the structure and key components of the catalytic machinery of cyclodextrinase (CDase) from T. kodakarensis.

2. Materials and Methods

2.1. Sequence Retrieval, Alignment, and Phylogenetic Analysis. The amino acid sequence (UniProt ID: Q5JJ59) of CDase from T. kodakarensis KOD1 (CDase-Tk, Tk1770) was retrieved from UniProtKB. A BLAST sequence similarity search was carried out against UniProtKB to find homologs of Tk1770. From the BLAST results, sixteen different sequences of CD hydrolyzing enzymes from bacterial and archaeal sources were selected for further studies (Table 1). The alignment of sequences was carried out with Clustal Omega and a rooted tree was generated using the Bayesian inference method with default parameters [25, 26].
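Table 1 reports the similarity of each homolog to Tk1770 as a percentage, but the text does not state how that figure was derived. The following is only a minimal sketch of one common definition, assuming identity is counted over alignment columns in which neither sequence has a gap. The function name `percent_identity` and the toy sequences are hypothetical and are not taken from the actual alignment.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over columns where neither aligned sequence has a gap.

    Assumption: gapped columns are excluded from the denominator. Other
    conventions (e.g. dividing by the full alignment length) give other values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Toy aligned pair (hypothetical, for illustration only).
toy_a = "GVFHH-TSFF"
toy_b = "GVFHHATAFF"
toy_identity = percent_identity(toy_a, toy_b)
```

In practice the two strings would be rows of the Clustal Omega alignment; the choice of denominator should be stated alongside any reported percentage.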

2.2. Homology Modeling. The Tk1770 CDase was subjected to NCBI BLAST against the RCSB PDB (Protein Data Bank) to search for suitable template(s) for comparative modeling. Multiple X-ray crystallographic structures (PDB ID: 4AEF, 1J0H, 4AEE, 1EA9, 1SMA, and 1WZL) with sequence identity ranging from 56% to 29% were selected as templates (Table 2). The sequences of the target (Tk1770) and templates were aligned with Clustal Omega using the UGENE program [25]. The alignment and the PDB structures were used as inputs for homology modeling with Modeller v9.14 [27]. The model optimization was carried out by the variable target function method (VTFM) with conjugate gradients (CG) and molecular dynamics (MD) with simulated annealing (SA) methods [27, 28]. The models generated by Modeller were scored on the basis of their DOPE (Discrete Optimized Protein Energy) values and the model with the lowest DOPE score was selected for further studies. The homology model was further validated by the ProSA-web server and PROCHECK [29, 30]. The model was refined by Modeller loop refinement functions and validated again. The resulting model was visualized using PyMOL [31].

Table 2: List of the PDB files used as templates for homology modeling of CDase Tk1770.

Serial number  PDB ID  Organism                    Enzyme                Identity with TK1770 (%)  Query cover (%)
1              4AEF    P. furiosus                 Amylase               56                        98
2              1EA9    Bacillus sp.                Cyclomaltodextrinase  33                        76
3              1J0J    G. stearothermophilus       Neopullulanase        33                        78
4              1SMA    Thermus sp.                 Maltogenic amylase    32                        79
5              4AEE    S. marinus                  Maltogenic amylase    29                        95
6              1WZL    Thermoactinomyces vulgaris  α-amylase II          35                        76
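The model selection step described above (keep the candidate with the lowest DOPE score) can be illustrated with a short sketch. This is not the authors' script: the helper name `select_best_model`, the model names, and every score below are hypothetical placeholders, and real values would be read from Modeller's output.

```python
from typing import Dict


def select_best_model(dope_scores: Dict[str, float]) -> str:
    """Return the name of the model with the lowest (most negative) DOPE score."""
    if not dope_scores:
        raise ValueError("no models to choose from")
    return min(dope_scores, key=dope_scores.get)


# Hypothetical DOPE scores for five candidate models (illustrative values only).
example_scores = {
    "model_1": -78120.5,
    "model_2": -77954.1,
    "model_3": -78410.9,
    "model_4": -78003.7,
    "model_5": -78255.2,
}
best_model = select_best_model(example_scores)
```

DOPE is a statistical potential, so lower values indicate a more native-like model; the ranking is only meaningful among models of the same sequence.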

2.3. Molecular Docking Studies. In order to investigate the enzyme-substrate interactions, the docking of substrates (α-, β-, and γ-cyclodextrins) into the active pocket of Tk1770 was carried out using AutoDock and MGL Tools v1.5.6 [33]. The substrates were prepared by adding polar hydrogen atoms and partial charges. The protein model was prepared by adding polar hydrogens and Gasteiger charges. The grid map dimensions were set around the active site with all other parameters set to default and rigid docking was performed. The candidate poses of the substrates were scored on the basis of their binding energy in kcal/mol and the best poses with the lowest binding energy were selected.
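The text states only that the grid map was set around the active site, without giving dimensions. As a hedged illustration of how such a box might be derived, the sketch below computes a box center and edge lengths from a set of atom coordinates plus a margin. The function `grid_box`, the 5 Å padding, and the coordinates are all assumptions made for this example and are not values from the study.

```python
from typing import Iterable, Tuple

Point = Tuple[float, float, float]


def grid_box(coords: Iterable[Point], padding: float = 5.0) -> Tuple[Point, Point]:
    """Return (center, size) of an axis-aligned box enclosing coords.

    size is the edge length per axis in the same units as coords,
    enlarged by `padding` on every side (padding is an assumed margin).
    """
    points = list(coords)
    if not points:
        raise ValueError("at least one coordinate is required")
    mins = [min(p[i] for p in points) for i in range(3)]
    maxs = [max(p[i] for p in points) for i in range(3)]
    center = tuple((lo + hi) / 2.0 for lo, hi in zip(mins, maxs))
    size = tuple((hi - lo) + 2.0 * padding for lo, hi in zip(mins, maxs))
    return center, size


# Hypothetical coordinates (in angstroms) standing in for active-site atoms.
site_atoms = [(10.0, 4.0, -2.0), (14.0, 8.0, 0.0), (12.0, 2.0, 4.0)]
box_center, box_size = grid_box(site_atoms, padding=5.0)
```

With real data, the input would be the coordinates of selected active-site atoms (for example, side-chain atoms of the catalytic residues) taken from the homology model, and the resulting center and size would then be converted to the docking program's grid settings.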

3. Results and Discussion

3.1. Sequence Alignment and Phylogenetic Tree. The sequence of Tk1770, consisting of 656 amino acids, was aligned with sixteen CD hydrolyzing enzymes from the GH13 family (Figure 1). These sequences included eleven archaeal enzymes and five bacterial enzymes having sequence identities from 28% to 60% with Tk1770 CDase (Table 1). All enzymes possess three major domains: (i) an N-domain, (ii) a catalytic TIM barrel, and (iii) a C-domain [10, 34]. The sequence analysis showed that archaeal enzymes contain two N-terminal domains (i.e., N′- and N-domain) in addition to the catalytic and C-domains, whereas the N′-domain is absent in all the bacterial CD hydrolyzing enzymes (Figure 1). A linker region from residues 190 to 203 in Tk1770 connects two N-terminal domains with two C-terminal domains. Four conserved regions of the GH13 family in the TIM barrel structure were identified from residues 299 to 310, 405 to 414, 433 to 441, and 496 to 502, with the catalytic triad being Asp411, Glu437, and Asp502. An additional conserved region of amino acids 533–539 was also identified downstream of the conserved regions I–IV.
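The residue ranges above can be checked against the catalytic triad with a few lines of code. The sketch below assumes that regions I to IV correspond to the four ranges in the order they are listed, and labels the additional 533–539 region "V" purely for convenience; the helper names `region_of` and `extract_region` are hypothetical.

```python
from typing import Dict, Optional, Tuple

# 1-based, inclusive residue ranges quoted in the text for Tk1770.
# Assumption: labels I-IV follow the listed order; "V" is a label chosen here.
CONSERVED_REGIONS: Dict[str, Tuple[int, int]] = {
    "I": (299, 310),
    "II": (405, 414),
    "III": (433, 441),
    "IV": (496, 502),
    "V": (533, 539),
}

CATALYTIC_TRIAD: Dict[str, int] = {"Asp411": 411, "Glu437": 437, "Asp502": 502}


def region_of(position: int) -> Optional[str]:
    """Return the label of the conserved region containing position, if any."""
    for label, (start, end) in CONSERVED_REGIONS.items():
        if start <= position <= end:
            return label
    return None


def extract_region(sequence: str, start: int, end: int) -> str:
    """Slice a sequence using 1-based, inclusive residue numbering."""
    if start < 1 or end > len(sequence) or start > end:
        raise ValueError("range is outside the sequence")
    return sequence[start - 1:end]


triad_regions = {name: region_of(pos) for name, pos in CATALYTIC_TRIAD.items()}
```

Under these assumptions each catalytic residue falls in a different conserved region (II, III, and IV), and `extract_region` could be applied to the UniProt sequence Q5JJ59 to list the residues of each region.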

A rooted phylogenetic tree was constructed from the alignment using MrBayes with the WAG (fixed) rate matrix to find the evolutionary relationship. The tree was divided into three clades, with all bacterial enzymes forming one clade and archaeal enzymes divided into two clades (Figure 2). The tree showed that Tk1770 CDase is most closely related to THEGJ MAse and THES4 CDase, with sequence identities of 59% and 60%, respectively (Figure 2). The STAMF α-amylase shows 28% sequence identity with Tk1770 CDase and acts as the outgroup in the phylogenetic tree. The α-amylases usually do not exhibit CD hydrolyzing activity and they also lack the N′-domain. The α-amylase (STAMF α-amylase) from S. marinus is unusual in this regard, as it exhibits both CD hydrolyzing activity and an additional N′-domain [35]. This suggests that, during the course of evolution, the presence of the N′-domain might be linked to CD hydrolyzing activity in archaea.

3.2. Homology Modeling. The homology modeling program Modeller v9.14 [27] was used to construct the 3D structure of Tk1770 with multiple templates as described in Materials and Methods. Out of five models generated, the best model with the lowest DOPE value was selected.

In homology modeling, the model may contain certain high-energy loops or residues with unusual geometry. The selected model was therefore refined using the Modeller built-in loop-refinement function on loops ranging from 3 to 7 amino acids in length and then validated with the ProSA-web server and PROCHECK analysis [30]. The overall quality of the model was estimated by the ProSA server in terms of Z-score by comparing it with Z-score values of experimentally resolved protein structures in the Protein Data Bank [29]. The Ramachandran plot placed all the nonglycine, nonproline residues in allowed regions and 87.9% of residues in the most favorable regions. This indicates that the residues of the model adopt acceptable stereochemistry.
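A Ramachandran plot is built from the backbone dihedral angles (φ, ψ) of each residue. PROCHECK computes these internally; the following is an independent, minimal sketch of how a signed dihedral angle can be obtained from four atom positions, included only to make the quantity concrete. The function name `dihedral_degrees` and the test coordinates are hypothetical.

```python
import math
from typing import List, Sequence


def _sub(a: Sequence[float], b: Sequence[float]) -> List[float]:
    return [a[i] - b[i] for i in range(3)]


def _dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def _cross(a: Sequence[float], b: Sequence[float]) -> List[float]:
    return [
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    ]


def dihedral_degrees(p0, p1, p2, p3) -> float:
    """Signed dihedral angle (degrees) about the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    if norm == 0.0:
        raise ValueError("p1 and p2 must be distinct points")
    b1 = [c / norm for c in b1]
    # Project b0 and b2 onto the plane perpendicular to the central bond.
    v = [b0[i] - _dot(b0, b1) * b1[i] for i in range(3)]
    w = [b2[i] - _dot(b2, b1) * b1[i] for i in range(3)]
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


# Hypothetical points forming a right-angle twist about the z axis.
example_angle = dihedral_degrees((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
```

For φ the four points would be C(i-1), N(i), Cα(i), C(i), and for ψ they would be N(i), Cα(i), C(i), N(i+1), taken from the model's PDB coordinates.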

The homology model of Tk1770 CDase was aligned with P. furiosus neopullulanase (PYRFU NPase) (PDB ID: 4AEF) for an analysis and comparison of the active site and other structural features. The overall structure of Tk1770 CDase folds into four major domains, with two N-terminal domains consisting of β-strands only (i.e., N′- and conventional N-domain) connected to the TIM barrel (A-domain) and a C-terminal domain also consisting of β-strands. The structure of the N′-domain of Tk1770 typically represents CBM48 with eight β-strands [15, 36]. The structural alignment of the N′- or CBM48 domain of Tk1770 and PYRFU NPase revealed that both contain a loop that extends into the catalytic site. However, the extended loop of the N′-domain of PYRFU NPase forms a more flexible helical turn as compared to the loop of Tk1770 (Figure 3). The substitution of P91 and S92 in the extended loop region of Tk1770 in place of K89 and G90 in the loop of PYRFU NPase might be responsible for this apparent decreased flexibility of the loop in the N′-domain of Tk1770 (Figure 3). Furthermore, K89 and G90 in extended loop of N′-domain in


[Figure 1: Multiple sequence alignment of Tk1770_CDase with THES4, THEGJ, THERCLF, THEON, PYRYC, PYRFU, THERPA, THELN, THERSP, THEPD, STAMF, GBACI, PBACI, BACMY, GEOSE, and BACIIN (abbreviations as in Table 1).]

D K Y L G T M E D F E K L V Q V L H S R K I K I V L D I T M H H T N P C N E L F V K A L R E G E N S P Y W E M F S F L S P P P K E I V E L M L K Y I D G E E C R S R E L Y K L D Y F R N N K P F Y E A F F N I W L M A K F N H D

D P H F G D K E T L K T L V Q R C H E K G I R V M L D A V F N H C G Y E F A P F Q D V L K N G E S S P Y K D W F H I R D F P L - - Q S E - P - - - - - - - - - - - - - - - - - - - - R P N - - - Y D T F A F V P Q M P K L N T A

D P Q F G D A K T L K K L V D V C H E R G I R V L L D A V F N H A G K T F A P F I D V Q E K G E A S P Y K D W F H I N Q F P L - - A F D Q D - - - - - - - - - - - - - - - - - - - - I P S - - - Y D T F A F E P L M P K L N T E

D P Q F G T K E T F K K L V N A C H K R G I K V M L D A V F N H S G Y F F D K F Q D V L K K G K Q S R Y T N W F H I H E F P I - - V T E - P - - - - - - - - - - - - - - - - - - - - L P N - - - Y D T F A F T P Y M P K L N T A

D P H F G D K E T L K T L I D R C H E K G I R V M L D A V F N H C G Y E F A P F Q D V W K N G E S S K Y K D W F H I H E F P L - - Q T E - S - - - - - - - - - - - - - - - - - - - - R P N - - - Y D T F A F V P Q M P K L N T A

D P Q F G D K E T F K R L V R T C H D N G I K V M L D A V F N H S G Y Y F P Q F Q D V L E H G E K S S Y K D W F H I R K F P L - - K N E D D - - - - - - - - - - - - - - - - - - - - T I N - - - Y D A F A F V E S M P K L N T E

lowast lowast

M Y K V F G F E E N F I H G R V A R - - V E F S L P D A G R W D Y A Y L L G N F N A F N E G S F R M K H E D K R W I I E I K L P E G L W R Y A F S A G G E F - - L L D P E N P E K E L Y R R P S Y K F E R E V S L A K I A

W

M R K V Y K I F G F E P D Q K F G R V A V - - V E F S I P A E P G N R Y A Y L L G S F N A F N E G S F R M R R K K G R W R T V V K L P E G V W H Y A F S I D G E F - - T P D P E N P R R E V Y R R L S Y K F E R E T S V A V I D

- - -

- - -

M Y K T F G F V E D P V F G R L A R - - V E F S I P Y R - G E R Y A Y L L G S F N A F N E G S F R M E R R G S R W F I R V L L P E G V W R Y A F S L E G R F - - E R D P E N E N V E T Y R R P S Y K F E K E V S V A G V I

- - - M Y K I F G F E P D W R F G R V A R - - V E F S I P A R -- G K Y A Y L L G N F N A F N E G S F R M E R K G E R W R I T L R L P E G V W Y Y G F S V D G E F - - L M D P E N P D V E T Y R K L S Y K L E K E A S V A R I V

- - - M Y K T F G F E S N E Y F G R I A K - - V E F S V P S R - - G S Y A Y L V G S F N A F N E G S F R M R E E N G R W R A T V E L P E G V W H Y G F S I D G K Y - - A P D P E N P E K R A Y R R F S Y K F E R E T S V A R I S

- - - M Y K I L E F G H N E Y F G R V A K - - V E F S F P K R -- G G Y A Y L V G S F N A F N E G S F R M R E K G D R W H I V I D L P E A I W Y Y G F S L D G K Y - - T P D I E N P E R T L Y R R L S Y K F E R E V S I A R I

- - - M Y K L V S F R D S E I F G R V A E - - V E F S L I R E - - G S Y A Y L L G D F N A F N E G S F R M E Q E G K N W K I K I A L P E G V W H Y A F S I D G K F - - V L D P D N P E R R V Y T R K G Y K F H R E V N V A R I V

- - - M Y K I F G F K N D K Y L G K V A E - - V E F S M L K R - - G S Y A Y L L G N F N A F N E G S F R M R E K G D R W S I K I E L P E G V W Y Y A F S I D G D L - - M L D P E N R E K T T Y K R H S Y K F R R T V N V A K I F

- - - M Y K I F G F K D D D Y L G K V G I - - T E F S I P K R - - G S Y A Y L L G N F N A F N E G S F R M K E K G D R W Y I K V E L P E G I W Y Y A F S I D G N L - - T L D F E N N E K A V Y R R L S Y K F E K T V N V A K I F

- - - M Y K I F G F K D N D Y L G K V G I - - T E F S I P K S - - G S Y A Y L L G N F N A F N E G S F R M R E K G D R W Y I K V E L P E G I W Y Y T F S V D G N L - - I L D F E N N E K T V Y R R L S Y K F E K T V N V A K I F

- - - M Y R V L G F R D D V Y L G R V V K - - A E F S A P R E - - G E Y A Y L L G N F N A F N E G S F R M R G A G D R W V V E V E L P E G V W Y Y L F S L G G R R - - A V D P E N P E T T V Y S R R A Y K F E E R V S V A K L L

- - - M Y K I I G R E I - Y G K G R K G R Y I V K F T R H W P Q Y A K N I Y L I G E F T S L Y P G F V K L R K I E E Q G I V Y L K L W P G E Y G Y G F Q I D N D F E N V L D P D N E E K K C V H T S F F P E Y K K C L S K L V I

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

105

108

108

104

103

103

103

103

103

103

103

103

Figure 1 Continued

Archaea 5

492

500

486

492

492

488

492

496

497

497

484

526

414

410

414

414

415

597

607

593

599

599

595

599

603

604

604

591

630

521

Y F T Y N F L D N H D T E R F I D L A - G - K E R Y L C A L T F L M T Y K G I P A I F Y G D E I G L R G S - G E G M S A G R T P M S W D E E K W D F Q I L R Q T M K L I E L R R S L K S L Q - V G S F R V I G A - - G E K W F V

Y A M Y N F L D N H D T E R F L D L V - G D K R R Y L C A L A F L M T Y K G I P S I F Y G D E I G L R G R L D G G L S A G R T S M V W D R G K W D T E I F E T T K R L I R L R R G S R A L Q - L G E F V P V R F - - Q G R T M I

Y Y A Y N F L D N H D T E R F L D L V - H D E R L Y L C A L A F L M T Y K G I P A V F Y G D E I G L R G R K G G G L D A G R T P M K W R E E N W N R E I L E T T R E L I H L R R N S K A L Q - F G T F R P L L F - - R G R T I V

Y A M Y N F I D N H D T E R F I D L V - N D E R R Y L C A L A F L M T Y K G I P S I F Y G D E I G L R G K L E G G L D A G R T P M E W N P E G W N E R I L E T T R K L I E L R K R S K A L Q - L G D F I P L R F - - E G D E I I

Y S M Y N F L D N H D V E R F L D L V - G D E R R Y L C A L A F L M T Y K G I P A L F Y G D E I G L R G I G A S G M E S S R T P M K W G K E T W N T K I L R V T K A L I R L R R K S K A L Q - L G E F R P L E F - - K G G L L L

Y M M Y N F L D N H D V E R F L D L V - G D R K R Y L C A L A F L M T Y K G I P S I F Y G D E I G L S G M E G K G L E V S R T P M R W E G N Q W D T E I L K V T K A L I R L R R N S R A L Q - L G F F R P L K F - - K G R L L V

Y L M Y N F L D N H D V E R F L D I V - G D K R K Y V C A L V F L M T Y K G I P S L F Y G D E I G L R G I N L Q G M E S S R A P M L W N E E E W D Q R I L E I T K T L V K I R K N N K A L L - F G N F V P V K F - - K R K F M V

Y T M Y N F L D N H D V E R F L G L V - R D K R K Y L C A L T F L M T Y K G I P A I Y Y G D E V G L E N M D V P S M E C S R V P M E W N E K K W D K E I L K I T K E L I D L R R R S K A L Q - R G T F V P I F F - - E D K L L I

Y V M Y N F L D N H D V D R M L S L L - G D K R K Y L C A L V F L F T Y K G V P S I Y Y G D E I G M R N I E A P F M E R S R A P M E W N K K R W D F E I L N I V K E L I K L R K G S K A L Q - V G T F E P V E F - - R E G M L L

Y V M Y N F L D N H D V D R M L S L L - G D K R K Y L C A L V F L F T Y K G V P S I Y Y G N E I G M K N I E A P F M E R S R A P M E W N K K K W D K E I L K T T K E L I K L R R R S K A L Q - K G I F K P V K F - - K D K L L V

Y A M Y N F L D N H D V D R L L S L V - G D R D K Y L C A L V F L F T Y K G V P S I Y Y G D E V G L E N T D S P F M E R S R A P M R W D E S T W D K A I L E A T R A L A S L R R R S A A L Q - R G A F E P V R F - - E G G L L V

L S L Y N M L G S H D V P R I K S M V - Q N N K L L K L M Y V L I F A L P G S P V I Y Y G D E I G L E G G R D P D - - - N R R P M I W D R G N W D L E L Y E H I K K L I R I Y K S C R S M R - H G Y F L V E N L - - G S N L L F

E A A F N L L G S H D T P R I L T V C G E D V R K A K L L F L F Q L T F T G S P C I Y Y G D E I G M T G G N D P G - - - C R K C M I W D D D K Q H R G L Y E H V K Q L I A L R R Q Y R A L R - R G H I A V L H A D E Q T N Q L V

E V A F N L L D S H D T P R L L T L A K G D K K K Q K L A S L F Q F T F M G T P C I Y Y G D E V G M D G G G D P D - - - C R K C M E W D K D K Q D L D L F E F Y R R L I H I R A S H P A L R - T G T L T F L E A S R Q G T K L A

K A A F H L L D S H D T P R I L T T C K G N K N K V K L L Y V F H L S F I G S P C V Y Y G D E I G M D G G M D P G - - - C R K C M V W D E D K Q D T V L F K H I Q T L I S L R R Q Y K A F G G H G L F Q C I E A N D E Q G Y I S

E A A F N L L G S H D T S R I L T V C G G D I R K V K L L F L F Q L T F T G S P C I Y Y G D E I G M T G G N D P E - - - C R K C M V W D P M Q Q N K E L H Q H V K Q L I A L R K Q Y R S L R - R G E I S F L H A D D E M N Y L I

E V A F N L L G S H D T P R I L T T S G G S K E K L K L L F A Y Q L S F I G T P C I Y Y G D E I G M D G E Q D P G - - - C R K C M I W E E D K Q D R E L F T Y V K K L I S L R K K Y P V F G N G G D I T F I E A N D E T N H V I

598

608

594

600

600

596

600

604

605

605

592

631

522

518

523

656

637

638

644

652

656

645

654

655

660

644

696

587

581

586

588

589

Y E R K A G S E R V L V G I N C S W N D V E T P V P S N G S - - - - - - - - - - - - - - - - - - N E Q I K I P A F S S I I R V K D S M N V H I G S D L Q E

Y E R V L G D E R V R V E I R Y S M E P E D C T F H V T A S - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Y E R A I D G E S L V V A I N C S E V H V K V S L P G G - - - - - - - - - - - - - - - - - - - - - K S L N L P P L S F R I V D T G R - - - - - - - - - - -

Y E R A L G K E R V R V E I R Y T K N P E E C R F K L F L S H L K - - - - - - - - - - - - - - R K Y W K N Y S P N T S - - - - - - - - - - - - - - - - - -

Y E R V Y Q N E G V L V G I N Y S D V P T A I Q I P E A Y R P A A - - - - - - - - - - - - - D G V S F L K M K P W S F V A L A S T I - - - - - - - - - - -

Y E R I Y E K E H V L V A I N C S S R V E S V L I P E K Y R P I V - - - - - - - - - - - - - - G K T S I E L A P W S F I V V F S R F N D V Q L L S W P - -

Y K R E H M G E R T I V A I N Y S N S R V K - - - - - - - - - - - - - - - - - - - - - - - - - - E L G I T I P E Y S G V I I N E D K V K L I K Y - - - - -

Y E R V S K G E R I L I G I N Y S E K E A K I K L P E K V K I L L - - - - - - - - - - - - - G Q L H G E R L P P F S F F I S S L - - - - - - - - - - - - -

Y E R I H G E E R L L I G I N Y S E N P V S L R K S P D E I L L - - - - - - - - - - - - - - G D L E N S V L K P F S F F V G R L S - - - - - - - - - - - -

Y K R V L N N E N I L V A I N Y S K K E K H L D L P P S F E I L F - - - - - - - - - Q S G S F D R V N I R L K P F S S I I A K K L - - - - - - - - - - - -

Y R R R L G D E S I L V A I N Y S E S E A V L E E P A Q S V L F R - - - - - - - - - - - - S G S V K E K L L G P F S S V V A G D R - - - - - - - - - - - -

I K R W I N N E E I I F L L N V S S K D I S V D L K K L - - G K Y S F D I Y N E K N I D Q H V E - N N V L L R G Y G F L I L G S K P C N I - - - - - - - -

Y E K T D G D E T V V I I I N R S N Q A A D I P L P F N A K K K R L V N L L T G E R W A A E A D G L S V S L P A Y G F A L Y A V E K - - - - - - - - - - -

Y E R R L G D D I L I V L V N T E E T A Q Y F Q L A V E - - E R Q W E N V L T D A P L R A E R G I L S M K L P A F G Y A V L K A V Y - - - - - - - - - - -

Y T K T Y G E E T I F F V L N P T N Q E I S A P I P F D I T G K K I V N L Y T N E E F S A E A D S L Q V A L P P Y G F S I L K W - - - - - - - - - - - - -

Y K K T D G D E T V L V I I N R S D Q K A D I P I P L D A R G T W L V N L L T G E R F A A E A E T L C T S L P P Y G F V L Y A I E R W - - - - - - - - - -

F T K Q N S S Q K M I A V L N N S D K E L S A T L P F S L E D T K L T D L L T G K E F A A H A E K L T V T V P P Y E M A F Y L V Q E - - - - - - - - - - -

522

524

517

522

521

523

Tk1770_CDaseTHES4THEGJ

THERCLFTHEONPYRYCPYRFU

THERPATHELN

THERSPTHEPDSTAMFGBACIPBACI

BACMYGEOSE

BACIIN

Tk1770_CDaseTHES4THEGJ

THERCLFTHEONPYRYCPYRFU

THERPATHELN

THERSPTHEPDSTAMFGBACIPBACI

BACMYGEOSE

BACIIN

lowast

Figure 1: Sequence alignment of Tk1770 CDase with sixteen CD hydrolyzing enzymes. The alignment of Tk1770 CDase with archaeal and bacterial CD hydrolyzing enzymes was carried out with Clustal Omega through the UGENE package. The novel N′-domain (CBM48) in archaeal sequences is shown in red and the protruding region of the CBM48 domain by a green dotted line. The arrow shows the start of the TIM barrel domain (residues 204–584); the four conserved regions (I–IV) and another downstream conserved region V are marked by a grey line below the sequences. The catalytic triad is indicated by asterisks. The HLH region of archaeal sequences, which is absent in all bacterial homologs, is marked by a blue dotted line.

[Figure 2 leaf labels: PYRYC CDase, THEGJ MAse, THERCLF CDase, PYRFU NPase, GBACI CDase, THERSP CDase, GEOSE NPase, THEON CDase, BACIIN CDase, THERPA CDase, THES4 CDase, PBACI CDase, Tk1770 CDase, THELN NPase, STAMF α-amylase, BACMY α-CDase, THEPD α-amylase.]

Figure 2: Phylogenetic tree. A rooted radial tree of 17 CD hydrolyzing enzymes was constructed using MrBayes with the WAG rate matrix (fixed) and visualized using FigTree. The tree displays three distinct clades. All the bacterial enzymes form a single clade (shown in blue), while the branch for archaeal enzymes splits into two clades (shown in green and red). On the basis of sequence identity and domain arrangement, Tk1770 CDase seems to be more closely related to THEGJ MAse, THES4 CDase, THERCLF CDase, PYRFU NPase, THEON CDase, and PYRYC CDase (green).


[Figure 3 panel labels. (a) Domain schematic: N′-domain (CBM48), N-domain (CBM34), linker 190–203, TIM barrel, C-domain; positions marked 1, 106, 585, 656. (b) HLH, CBM34, CBM48. (c) Loop sequences of PYRFU NPase and Tk1770 CDase.]

Figure 3: Structural features of Tk1770 CDase. (a) Schematic diagram of the domain arrangement within Tk1770, made with the software DOG (Domain Graph) v2.0 [32]. (b) The homology model of Tk1770 CDase, consisting of the N′- (blue), N- (yellow), catalytic (red), and C-domain (green). The catalytic domain also contains a helix-loop-helix (HLH) structure (cyan). (c) Structural alignment of the N′-domain (CBM48) of the Tk1770 CDase model (blue) and the template (4AEF) (grey), with an extension of the loop into the catalytic site. The sequence alignment between the loops of the model and the template (4AEF) suggests that the substitution of P91 and S92 in Tk1770 makes its loop rigid.

[Figure 4 residue labels. (a) Val376, Lys364, Ser375, Lys94, Asp502, Met546, Arg550. (b) Lys364, Val376, Ser375, Met546, Glu504, Lys94, Glu96. (c) Met546, Lys364, Asp502, Arg550, Arg97, Lys94.]

Figure 4: Docking of cyclodextrins into the active site of the Tk1770 CDase. (a) Complex of Tk1770 CDase with α-CD and the residues involved in interactions. (b) The docked conformation of β-CD with Tk1770 CDase. (c) The active site residues of Tk1770 interacting with γ-CD. The hydrogen bonds between substrate and amino acids are represented as red dashes.

PYRFU NPase form strong hydrogen bonds with D460 and E470 of the catalytic site. In Tk1770, S92 makes only one hydrogen bond with D460, thus reducing the interactions between the N′-domain and the catalytic domain. Moreover, S92 in Tk1770 rotates backward to form a hydrogen bond with Y93, which makes the loop more rigid. All of these factors may contribute to the decreased stability of the enzyme domains in Tk1770. Recently, it was reported that the optimum temperature for Tk1770 CDase is 65 °C, which is much lower than the optimal growth temperature of T. kodakarensis (85 °C) and the optimum temperatures of other archaeal CD hydrolyzing enzymes [37].

The (βα)8 barrel (A-domain) also contains a much larger B-domain between β-strand 3 and α-helix 3, spanning residues 306 to 403. The B-domain of all the archaeal enzymes possesses a helix-loop-helix (HLH) motif that extends at the entrance of the active site (Figure 3), but this HLH motif is absent in all five bacterial enzymes, as shown in Figure 1. It has been reported that, in order to maintain activity at high temperatures, archaea might have adapted additional structural features. These features include the N′-domain, with a loop extension into the catalytic site, and the HLH motif, which together provide all necessary components for substrate binding and catalysis in a monomer [35, 38].

3.3. Docking of Substrates into the Catalytic Site. The docking of substrates into the catalytic site provided information about the interactions in the enzyme-substrate complex. For this purpose, AutoDock was used to dock cyclodextrins, that is, α-, β-, and γ-cyclodextrins, into the active site of the Tk1770 CDase model. All ligand conformations generated by AutoDock were scored on the basis of their binding affinities in kcal/mol. The best poses of α-, β-, and γ-cyclodextrins were selected, with binding energies of −8.8, −6.1, and −7.8 kcal/mol, respectively.
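As a rough aid to interpreting such scores, a binding free energy can be converted into an estimated dissociation constant through ΔG = RT ln Kd. The sketch below is not part of the original analysis: it assumes a temperature of 298.15 K (not stated in the paper), reads the three reported energies as −8.8, −6.1, and −7.8 kcal/mol, and uses illustrative names (`estimated_kd`, `rank_by_affinity`, `BEST_POSES`).

```python
import math

# Gas constant in kcal/(mol*K) and an assumed temperature (not stated in the paper).
R_KCAL = 1.9872e-3
TEMP_K = 298.15

# Best-pose binding energies (kcal/mol) as reported in the text.
BEST_POSES = {"alpha-CD": -8.8, "beta-CD": -6.1, "gamma-CD": -7.8}


def estimated_kd(delta_g: float, temp_k: float = TEMP_K) -> float:
    """Return the dissociation constant (mol/L) implied by dG = RT ln(Kd)."""
    return math.exp(delta_g / (R_KCAL * temp_k))


def rank_by_affinity(poses: dict[str, float]) -> list[str]:
    """Order ligands from most to least favourable (most negative energy first)."""
    return sorted(poses, key=poses.get)


if __name__ == "__main__":
    for ligand in rank_by_affinity(BEST_POSES):
        dg = BEST_POSES[ligand]
        print(f"{ligand}: dG = {dg:.1f} kcal/mol, Kd ~ {estimated_kd(dg):.2e} M")
```

Under these assumptions the ranking is α-CD, then γ-CD, then β-CD. Docking scores are approximate, so the derived constants indicate an order of magnitude at best.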

The docking results showed that, apart from the interacting residues, a number of residues come into close proximity with the substrates, especially hydrophobic residues such as Y93, F95, F373, F374, and V376. In the case of docked α-cyclodextrin, residues D502, R550, and S375 formed hydrogen bonds with hydroxyl groups of the substrate (Figure 4). In our homology model, K94 in the loop extension of the N′-domain forms a salt bridge with E504 from the active site and might contribute to the stability of the two domains in the same manner as observed by Park et al. in amylase/neopullulanase (4AEF) from P. furiosus [38]. However, docking of β-CD showed strong interactions of K94 with hydroxyl groups of the substrate and with E504 (Figure 4). Similarly, docking of γ-CD revealed interactions of K94, R97, and K364 with the substrate (Figure 4). The amino acid K364 in the helical region of the HLH motif extends into the entrance of the active pocket, right above F373 and F374, and might have a role in guiding the substrate into the active site. The aromatic amino acids Y88, Y93, and F95, which seem to form the boundary wall of the active site, and K94, which protrudes into the entrance of the catalytic site, are conserved in archaeal homologs except STAMF α-amylase.
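The contacts described in this section can be collected per substrate to see which residues recur. The following is a minimal sketch that transcribes only the residues named in the text. The dictionary layout and helper names are illustrative, and recording E504 as the salt-bridge partner of K94 rather than as a substrate contact is an interpretation, not something the paper states in this form.

```python
# Substrate-contacting residues as named in the docking discussion.
# Assumption: only residues explicitly listed in the text are included.
CONTACTS = {
    "alpha-CD": {"D502", "R550", "S375"},
    "beta-CD": {"K94"},
    "gamma-CD": {"K94", "R97", "K364"},
}

# Residue pair described as a salt bridge between the N'-domain loop and the active site.
SALT_BRIDGE = ("K94", "E504")


def residue_number(residue: str) -> int:
    """Extract the sequence position from a label such as 'K94'."""
    return int(residue[1:])


def shared_contacts(contacts: dict[str, set[str]], ligands: list[str]) -> set[str]:
    """Residues that contact every ligand named in `ligands`."""
    return set.intersection(*(contacts[name] for name in ligands))


def all_contacts(contacts: dict[str, set[str]]) -> list[str]:
    """All contacting residues, sorted by sequence position."""
    return sorted(set().union(*contacts.values()), key=residue_number)


if __name__ == "__main__":
    print("beta/gamma shared:", shared_contacts(CONTACTS, ["beta-CD", "gamma-CD"]))
    print("all contacts:", all_contacts(CONTACTS))
```

In this transcription K94 is the only residue shared by the β-CD and γ-CD poses, which is consistent with the role the text assigns to the N′-domain loop.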

4. Conclusion

Cyclodextrinase from the hyperthermophilic archaeon T. kodakarensis hydrolyzes cyclodextrins into linear maltodextrins. The sequence alignment of CD hydrolyzing enzymes confirmed that archaea have developed an additional N′-domain and a helix-loop-helix (HLH) motif in the B-domain that is absent in all bacterial homologs. The homology model revealed that the loop connecting β-strand 7 and β-strand 8 of the N′-domain extends into the catalytic site (A-domain) and plays an important role in substrate binding. Residues Y88, Y93, K94, F95, and R97 in the extended loop of the N′-domain of Tk1770 CDase are conserved in CD hydrolyzing enzymes of archaea. Structural alignment between the model and the template (4AEF) indicated that P91 and S92 in the loop extension of the N′-domain of Tk1770 might decrease its flexibility and its interactions with the A-domain. This might contribute to the decreased stability of the two domains in Tk1770.

The docking studies indicated that residues K94, R97, K364, S375, D502, E504, and R550 form hydrogen bonds with substrates. Residue K364, in the helix of the HLH motif extending at the entrance of the catalytic site, interacts with the substrate and might be involved in guiding it into the catalytic site. From these results, it can be inferred that archaeal CD hydrolyzing enzymes have developed catalytic machinery in which an extension of the N′-domain not only constitutes a part of the active pocket but also plays an important role in substrate binding.

Conflict of Interests

The authors declare that no conflict of interests exists.

Authors' Contribution

Ramzan Ali and Muhammad Imtiaz Shafiq contributed equally to the paper.

References

[1] R. M. Kelly, L. Dijkhuizen, and H. Leemhuis, "Starch and α-glucan acting enzymes, modulating their properties by directed evolution," Journal of Biotechnology, vol. 140, no. 3-4, pp. 184–193, 2009.

[2] E. M. M. Del Valle, "Cyclodextrins and their uses: a review," Process Biochemistry, vol. 39, no. 9, pp. 1033–1046, 2004.

[3] W. J. Shieh and A. R. Hedges, "Properties and applications of cyclodextrins," Journal of Macromolecular Science, Part A: Pure and Applied Chemistry, vol. 33, no. 5, pp. 673–683, 1996.

[4] Y. Nakagawa, W. Saburi, M. Takada, Y. Hatada, and K. Horikoshi, "Gene cloning and enzymatic characteristics of a novel γ-cyclodextrin-specific cyclodextrinase from alkalophilic Bacillus clarkii 7364," Biochimica et Biophysica Acta - Proteins and Proteomics, vol. 1784, no. 12, pp. 2004–2011, 2008.

[5] K. Uekama, F. Hirayama, and H. Arima, "Pharmaceutical applications of cyclodextrins and their derivatives," in Cyclodextrins and Their Complexes: Chemistry, Analytical Methods, Applications, chapter 14, pp. 381–422, Wiley-VCH, 2006.

[6] 2015, http://www.cazy.org/.

[7] P. V. Aiyer, "Amylases and their applications," African Journal of Biotechnology, vol. 4, no. 13, pp. 1525–1529, 2005.

[8] W. D. Crabb and C. Mitchinson, "Enzymes involved in the processing of starch to sugars," Trends in Biotechnology, vol. 15, no. 9, pp. 349–352, 1997.

[9] T. Han, F. Zeng, Z. Li et al., "Biochemical characterization of a recombinant pullulanase from Thermococcus kodakarensis KOD1," Letters in Applied Microbiology, vol. 57, no. 4, pp. 336–343, 2013.

[10] E. A. MacGregor, "An overview of clan GH-H and distantly related families," Biologia, vol. 60, supplement 16, pp. 5–12, 2005.

[11] P. M. de Souza and P. D. O. e Magalhaes, "Application of microbial α-amylase in industry: a review," Brazilian Journal of Microbiology, vol. 41, no. 4, pp. 850–861, 2010.

[12] M. R. Stam, E. G. J. Danchin, C. Rancurel, P. M. Coutinho, and B. Henrissat, "Dividing the large glycoside hydrolase family 13 into subfamilies: towards improved functional annotations of α-amylase-related proteins," Protein Engineering, Design and Selection, vol. 19, no. 12, pp. 555–562, 2006.

[13] M. Machovic and S. Janecek, "The invariant residues in the α-amylase family: just the catalytic triad," Biologia, vol. 58, no. 6, pp. 1127–1132, 2003.

[14] S. Janecek, "How many conserved sequence regions are there in the α-amylase family?" Biologia, vol. 57, supplement 11, pp. 29–41, 2002.

[15] D. Guillen, S. Sanchez, and R. Rodríguez-Sanoja, "Carbohydrate-binding domains: multiplicity of biological roles," Applied Microbiology and Biotechnology, vol. 85, no. 5, pp. 1241–1249, 2010.

[16] J. Matzke, A. Herrmann, E. Schneider, and E. P. Bakker, "Gene cloning, nucleotide sequence, and biochemical properties of a cytoplasmic cyclomaltodextrinase (neopullulanase) from Alicyclobacillus acidocaldarius: reclassification of a group of enzymes," FEMS Microbiology Letters, vol. 183, no. 1, pp. 55–61, 2000.

[17] K.-H. Park, T.-J. Kim, T.-K. Cheong, J.-W. Kim, B.-H. Oh, and B. Svensson, "Structure, specificity and function of cyclomaltodextrinase, a multispecific enzyme of the α-amylase family," Biochimica et Biophysica Acta - Protein Structure and Molecular Enzymology, vol. 1478, no. 2, pp. 165–185, 2000.

[18] N. Ahmad, N. Rashid, M. S. Haider, M. Akram, and M. Akhtar, "Novel maltotriose-hydrolyzing thermoacidophilic type III pullulan hydrolase from Thermococcus kodakarensis," Applied and Environmental Microbiology, vol. 80, no. 3, pp. 1108–1115, 2014.

[19] G. D. Haki and S. K. Rakshit, "Developments in industrially important thermostable enzymes: a review," Bioresource Technology, vol. 89, no. 1, pp. 17–34, 2003.

[20] C. Bertoldo and G. Antranikian, "Starch-hydrolyzing enzymes from thermophilic archaea and bacteria," Current Opinion in Chemical Biology, vol. 6, no. 2, pp. 151–160, 2002.

[21] M. W. Bauer, L. E. Driskill, and R. M. Kelly, "Glycosyl hydrolases from hyperthermophilic microorganisms," Current Opinion in Biotechnology, vol. 9, no. 2, pp. 141–145, 1998.

[22] C. Vieille and G. J. Zeikus, "Hyperthermophilic enzymes: sources, uses, and molecular mechanisms for thermostability," Microbiology and Molecular Biology Reviews, vol. 65, no. 1, pp. 1–43, 2001.


[23] S. K. Khare, A. Pandey, and C. Larroche, "Current perspectives in enzymatic saccharification of lignocellulosic biomass," Biochemical Engineering Journal, vol. 102, pp. 38–44, 2015.

[24] 2015, http://www.cazy.org/GH13.html.

[25] K. Okonechnikov, O. Golosova, M. Fursov et al., "Unipro UGENE: a unified bioinformatics toolkit," Bioinformatics, vol. 28, no. 8, pp. 1166–1167, 2012.

[26] F. Ronquist and J. P. Huelsenbeck, "MrBayes 3: Bayesian phylogenetic inference under mixed models," Bioinformatics, vol. 19, no. 12, pp. 1572–1574, 2003.

[27] B. Webb and A. Sali, Protein Structure Modeling with MODELLER, Protein Structure Prediction, Springer, 2014.

[28] N. Eswar, B. Webb, M. A. Marti-Renom et al., "Comparative protein structure modeling using MODELLER," in Current Protocols in Protein Science, John Wiley & Sons, 2007.

[29] M. Wiederstein and M. J. Sippl, "ProSA-web: interactive web service for the recognition of errors in three-dimensional structures of proteins," Nucleic Acids Research, vol. 35, no. 2, pp. W407–W410, 2007.

[30] R. A. Laskowski, M. W. MacArthur, D. S. Moss, and J. M. Thornton, "PROCHECK: a program to check the stereochemical quality of protein structures," Journal of Applied Crystallography, vol. 26, no. 2, pp. 283–291, 1993.

[31] W. L. DeLano, The PyMOL Molecular Graphics System, DeLano Scientific, Palo Alto, Calif, USA, 2002.

[32] J. Ren, L. Wen, X. Gao, C. Jin, Y. Xue, and X. Yao, "DOG 1.0: illustrator of protein domain structures," Cell Research, vol. 19, no. 2, pp. 271–273, 2009.

[33] G. M. Morris, H. Ruth, W. Lindstrom et al., "AutoDock4 and AutoDockTools4: automated docking with selective receptor flexibility," Journal of Computational Chemistry, vol. 30, no. 16, pp. 2785–2791, 2009.

[34] M. Machovic, B. Svensson, E. Ann MacGregor, and S. Janecek, "A new clan of CBM families based on bioinformatics of starch-binding domains from families CBM20 and CBM21," FEBS Journal, vol. 272, no. 21, pp. 5497–5513, 2005.

[35] T.-Y. Jung, D. Li, J.-T. Park et al., "Association of novel domain in active site of archaic hyperthermophilic maltogenic amylase from Staphylothermus marinus," The Journal of Biological Chemistry, vol. 287, no. 11, pp. 7979–7989, 2012.

[36] M. Machovic and S. Janecek, "Domain evolution in the GH13 pullulanase subfamily with focus on the carbohydrate-binding module family 48," Biologia, vol. 63, no. 6, pp. 1057–1068, 2008.

[37] Y. Sun, X. Lv, Z. Li, J. Wang, B. Jia, and J. Liu, "Recombinant cyclodextrinase from Thermococcus kodakarensis KOD1: expression, purification, and enzymatic characterization," Archaea, vol. 2015, Article ID 397924, 8 pages, 2015.

[38] J.-T. Park, H.-N. Song, T.-Y. Jung et al., "A novel domain arrangement in a monomeric cyclodextrin-hydrolyzing enzyme from the hyperthermophile," Biochimica et Biophysica Acta - Proteins and Proteomics, vol. 1834, no. 1, pp. 380–386, 2013.


Table 1: List of the sequences used for alignment and phylogenetics.

Enzyme                  Organism                         Abbreviation     AA   Seq. similarity to Tk1770 (%)  UniProt ID
Cyclomaltodextrinase    T. kodakarensis                  Tk1770 CDase     656  100                            Q5JJ59
Cyclomaltodextrinase    Thermococcus sp. (strain CGMCC)  THES4 CDase      637  60                             G0HJP6
Maltogenic α-amylase    T. gammatolerans                 THEGJ MAse       638  59                             C5A4D9
Cyclomaltodextrinase    T. cleftensis                    THERCLF CDase    644  59                             I3ZTQ5
Cyclomaltodextrinase    T. onnurineus                    THEON CDase      652  59                             B6YV58
Cyclomaltodextrinase    Pyrococcus yayanosii             PYRYC CDase      656  57                             F8AHJ5
Neopullulanase          Pyrococcus furiosus              PYRFU NPase      645  56                             Q8TZP8
Cyclomaltodextrinase    T. paralvinella                  THERPA CDase     654  56                             W0I4Q4
Neopullulanase          T. litoralis                     THELN NPase      655  55                             H3ZKI8
Cyclomaltodextrinase    Thermococcus sp. B1001           THERSP CDase     660  54                             Q9HHC8
α-amylase               T. pendens                       THEPD α-amylase  644  52                             A1S075
α-amylase               Staphylothermus marinus          STAMF α-amylase  696  28                             A3DM60
Cyclomaltodextrinase    Geobacillus sp. G1w1             GBACI CDase      587  32                             A0A093UHG3
Cyclomaltodextrinase    Paenibacillus wynnii             PBACI CDase      581  31                             A0A098M8Z8
α-cyclomaltodextrinase  Bacillus mycoides                BACMY α-CDase    586  30                             C3APY4
Neopullulanase          G. stearothermophilus            GEOSE NPase      588  31                             Q9AIV2
Cyclomaltodextrinase    Bacillus indicus                 BACIIN CDase     589  29                             A0A084GIJ0

AA means amino acids.
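To make the trend in Table 1 easier to see, the similarity values can be grouped into archaeal and bacterial homologs. The sketch below is illustrative only: the values are transcribed from the table, the grouping follows the five bacterial enzymes the paper identifies, and the names (`SIMILARITY`, `mean_similarity`, `least_similar`) are hypothetical.

```python
from statistics import mean

# Sequence similarity to Tk1770 (as listed in Table 1), keyed by abbreviation.
# Tk1770 itself (100) is omitted so that it does not inflate the archaeal mean.
SIMILARITY = {
    "archaea": {
        "THES4 CDase": 60, "THEGJ MAse": 59, "THERCLF CDase": 59,
        "THEON CDase": 59, "PYRYC CDase": 57, "PYRFU NPase": 56,
        "THERPA CDase": 56, "THELN NPase": 55, "THERSP CDase": 54,
        "THEPD alpha-amylase": 52, "STAMF alpha-amylase": 28,
    },
    "bacteria": {
        "GBACI CDase": 32, "PBACI CDase": 31, "BACMY alpha-CDase": 30,
        "GEOSE NPase": 31, "BACIIN CDase": 29,
    },
}


def mean_similarity(group: str) -> float:
    """Average similarity to Tk1770 within one group."""
    return mean(SIMILARITY[group].values())


def least_similar(group: str) -> str:
    """Abbreviation of the homolog with the lowest similarity in a group."""
    values = SIMILARITY[group]
    return min(values, key=values.get)


if __name__ == "__main__":
    for group in SIMILARITY:
        print(f"{group}: mean {mean_similarity(group):.1f}, "
              f"lowest {least_similar(group)}")
```

The archaeal mean sits well above the bacterial mean, with STAMF α-amylase as the one archaeal entry that falls in the bacterial range.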

a catalytic triad [13, 14] and a C-terminal domain consisting of β-strands only [10]. Many enzymes also possess N- and/or C-terminal carbohydrate binding modules such as CBM34, CBM20, CBM41, and CBM48 [12, 15]. The CD hydrolyzing enzymes include cyclodextrinase, maltogenic amylase, and neopullulanase, which hydrolyze CDs into linear maltodextrins or maltose [16, 17]. Recently, thermostable pullulan hydrolase III from Thermococcus kodakarensis (KOD1) has also been reported to hydrolyze CDs into maltose or glucose [18]. Thermostable enzymes from hyperthermophiles offer several advantages over their mesophilic homologs, including higher reaction rates, increased product yields, and decreased risk of contamination [19–21]. Owing to advanced sequencing technologies and the rapidly increasing number of sequenced genomes, the number of sequences classified as glycoside hydrolases far exceeds the number of enzymes that have been structurally or biochemically characterized [22, 23]. Currently, the GH13 family contains 26,287 sequences, but only 99 structures have been resolved [24]. At the time of writing, the Protein Data Bank contains only six CD hydrolyzing enzymes (PDB IDs 4EAF, 1EA9, 2XIE, 1J0H, 1H3G, and 1BVZ) [6]. There is a need for a better understanding of the sequence and structural components of these proteins and of their mechanism of catalysis as CDases. A bioinformatics approach can serve as a valuable predictive tool to provide information about the structure and function of these enzymes.

In this work, we have used an in silico approach to provide insight into the sequence, structural components, domain arrangement, catalytic machinery, and enzyme-substrate interactions of thermophilic cyclodextrinase (Tk1770) from Thermococcus kodakarensis (KOD1), an enzyme with potential industrial applications. This study is the first attempt to use an in silico approach to provide insight into the structure and key components of the catalytic machinery of cyclodextrinase (CDase) from T. kodakarensis.

2. Materials and Methods

2.1. Sequence Retrieval, Alignment, and Phylogenetic Analysis. The amino acid sequence (UniProt ID Q5JJ59) of CDase from T. kodakarensis KOD1 (CDase-Tk, Tk1770) was retrieved from UniProtKB. A BLAST sequence similarity search was carried out against UniProtKB to find homologs of Tk1770. From the BLAST results, sixteen different sequences of CD hydrolyzing enzymes from bacterial and archaeal sources were selected for further study (Table 1). The sequences were aligned with Clustal Omega, and a rooted tree was generated using the Bayesian inference method with default parameters [25, 26].
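The identity percentages reported in Table 1 can be illustrated with a small calculation on a pair of aligned sequences. The sketch below is a minimal example and not the procedure used in this work: the function name `percent_identity` and the two toy fragments are hypothetical, and it assumes that identity is counted over alignment columns in which neither sequence has a gap. Other conventions, such as dividing by the full alignment length or by the length of the shorter sequence, give different values.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over columns where neither aligned sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gapped columns (assumed convention)
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Toy aligned fragments, invented for illustration only.
seq_a = "MYKIF-GF"
seq_b = "MYKTFEGF"
identity = percent_identity(seq_a, seq_b)
print(f"{identity:.1f}% identity")
```

In practice the two strings would be rows taken from the same multiple sequence alignment, so that they already have equal length.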

2.2. Homology Modeling. The Tk1770 CDase sequence was subjected to an NCBI BLAST search against the RCSB PDB (Protein Data Bank) to find suitable template(s) for comparative modeling. Multiple X-ray crystallographic structures (PDB IDs 4AEF, 1J0H, 4AEE, 1EA9, 1SMA, and 1WZL), with sequence identities ranging from 56% to 29%, were selected as templates (Table 2). The sequences of the target (Tk1770) and the templates were aligned with Clustal Omega using the UGENE program [25]. The alignment and the PDB structures were used as inputs for homology modeling with Modeller v9.14 [27]. Model optimization was carried out by the variable target function method (VTFM) with conjugate gradients (CG) and by molecular dynamics (MD) with simulated annealing (SA) [27, 28]. The models generated by Modeller were scored on the basis of their DOPE (Discrete Optimized Protein Energy) values, and the model with the lowest DOPE score was selected for further study. The homology model was further validated with the ProSA-web server and PROCHECK [29, 30]. The model was refined with Modeller's loop refinement functions and validated again. Thus, a reliable model was constructed and visualized using PyMOL [31].

Table 2: List of the PDB files used as templates for homology modeling of CDase Tk1770.

| Serial number | PDB ID | Organism | Enzyme | % identity with TK1770 | % query cover |
|---|---|---|---|---|---|
| 1 | 4AEF | P. furiosus | Amylase | 56 | 98 |
| 2 | 1EA9 | Bacillus sp. | Cyclomaltodextrinase | 33 | 76 |
| 3 | 1J0J | G. stearothermophilus | Neopullulanase | 33 | 78 |
| 4 | 1SMA | Thermus sp. | Maltogenic amylase | 32 | 79 |
| 5 | 4AEE | S. marinus | Maltogenic amylase | 29 | 95 |
| 6 | 1WZL | Thermoactinomyces vulgaris | α-amylase II | 35 | 76 |
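The model-selection step in Section 2.2 amounts to taking the minimum of an energy-like score over the successfully built models. The following is a minimal sketch of that step in plain Python, not the script used in this work. The record layout (keys `name`, `failure`, and `DOPE score`) is an assumption loosely modeled on the per-model summaries that Modeller reports after a run, and the file names and scores are invented placeholders.

```python
def pick_best_model(models):
    """Return the successfully built model with the lowest DOPE score."""
    built = [m for m in models if m.get("failure") is None]
    if not built:
        raise ValueError("no successfully built models to choose from")
    # DOPE is an energy-like score: lower (more negative) is better.
    return min(built, key=lambda m: m["DOPE score"])


# Hypothetical records; file names and scores are invented placeholders.
models = [
    {"name": "tk1770.model1.pdb", "failure": None, "DOPE score": -78210.4},
    {"name": "tk1770.model2.pdb", "failure": None, "DOPE score": -79355.9},
    {"name": "tk1770.model3.pdb", "failure": "optimizer error", "DOPE score": None},
    {"name": "tk1770.model4.pdb", "failure": None, "DOPE score": -78904.1},
    {"name": "tk1770.model5.pdb", "failure": None, "DOPE score": -77642.7},
]
best = pick_best_model(models)
print(best["name"], best["DOPE score"])
```

Filtering out failed builds before taking the minimum avoids comparing a missing score with numeric ones.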

2.3. Molecular Docking Studies. In order to investigate the enzyme-substrate interactions, the docking of substrates (α-, β-, and γ-cyclodextrins) into the active pocket of Tk1770 was carried out using AutoDock and MGL Tools v1.5.6 [33]. The substrates were prepared by adding polar hydrogen atoms and partial charges. The protein model was prepared by adding polar hydrogens and Gasteiger charges. The grid map dimensions were set around the active site, with all other parameters set to default, and rigid docking was performed. The candidate poses of the substrates were scored on the basis of their binding energy (kcal/mol), and the poses with the lowest binding energy were selected.
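One way to set a grid box around an active site is to enclose the coordinates of the catalytic residues and add a margin on each side. The sketch below illustrates that idea only, since the grid dimensions used in this work are not reported. The helper `grid_box`, the 8 Å padding, and the coordinates are hypothetical. The 0.375 Å spacing is the value commonly used with AutoDock, and rounding the point count up to an even number is an assumption about the grid convention.

```python
import math


def grid_box(coords, padding=8.0, spacing=0.375):
    """Return (center, npts) of a box enclosing coords plus padding on each side.

    coords  : list of (x, y, z) tuples in angstroms
    padding : margin added on every side, in angstroms (arbitrary choice)
    spacing : grid point spacing in angstroms (0.375 is assumed here)
    """
    center = []
    npts = []
    for axis in range(3):
        values = [c[axis] for c in coords]
        lo, hi = min(values), max(values)
        center.append((lo + hi) / 2.0)
        points = math.ceil((hi - lo + 2.0 * padding) / spacing)
        if points % 2:  # round up to an even count (assumed convention)
            points += 1
        npts.append(points)
    return tuple(center), tuple(npts)


# Placeholder coordinates standing in for atoms of Asp411, Glu437, and Asp502.
catalytic_atoms = [(10.0, 20.0, 30.0), (14.0, 22.0, 27.0), (12.0, 26.0, 33.0)]
center, npts = grid_box(catalytic_atoms)
print("center:", center, "npts:", npts)
```

For cyclodextrin substrates the padding would need to be large enough to accommodate the whole ring, so the value above should be treated as a starting point only.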

3. Results and Discussion

3.1. Sequence Alignment and Phylogenetic Tree. The sequence of Tk1770, consisting of 656 amino acids, was aligned with sixteen CD hydrolyzing enzymes from the GH13 family (Figure 1). These sequences included eleven archaeal enzymes and five bacterial enzymes with sequence identities of 28% to 60% to Tk1770 CDase (Table 1). All enzymes possess three major domains: (i) an N-domain, (ii) a catalytic TIM barrel, and (iii) a C-domain [10, 34]. The sequence analysis showed that the archaeal enzymes contain two N-terminal domains (i.e., N′- and N-domain) in addition to the catalytic and C-domains, whereas the N′-domain is absent in all the bacterial CD hydrolyzing enzymes (Figure 1). A linker region from residues 190 to 203 in Tk1770 connects the two N-terminal domains with the two C-terminal domains. Four conserved regions of the GH13 family were identified in the TIM barrel structure, spanning residues 299 to 310, 405 to 414, 433 to 441, and 496 to 502, with the catalytic triad being Asp411, Glu437, and Asp502. An additional conserved region of amino acids 533–539 was also identified downstream of conserved regions I–IV.
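The residue ranges above can be cross-checked against the catalytic triad with a short lookup. This is a minimal sketch: the ranges and residue numbers are taken from the text, while the assignment of roman-numeral labels in listing order, the label V for the additional region, and the helper name `region_of` are assumptions made for illustration.

```python
# Conserved regions of Tk1770 as inclusive residue ranges, taken from the text.
# Labels are assumed to follow the order in which the ranges are listed.
CONSERVED_REGIONS = {
    "I": (299, 310),
    "II": (405, 414),
    "III": (433, 441),
    "IV": (496, 502),
    "V": (533, 539),
}
CATALYTIC_TRIAD = {"Asp411": 411, "Glu437": 437, "Asp502": 502}


def region_of(position):
    """Return the label of the conserved region containing position, or None."""
    for label, (start, end) in CONSERVED_REGIONS.items():
        if start <= position <= end:
            return label
    return None


assignment = {name: region_of(pos) for name, pos in CATALYTIC_TRIAD.items()}
for name, label in assignment.items():
    print(f"{name}: region {label}")
```

Under these assumptions each catalytic residue falls inside a different conserved region, with Asp502 at the last position of the fourth range.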

A rooted phylogenetic tree was constructed from the alignment using MrBayes with the WAG rate matrix (fixed) to infer evolutionary relationships. The tree was divided into three clades, with all bacterial enzymes forming one clade and the archaeal enzymes divided into two clades (Figure 2). The tree showed that Tk1770 CDase is more closely related to THEGJ MAse and THES4 CDase, with sequence identities of 59% and 60%, respectively (Figure 2). The STAMF α-amylase shows 28% sequence identity with Tk1770 CDase and acts as the outgroup in the phylogenetic tree. α-Amylases usually do not exhibit CD hydrolyzing activity, and they also lack the N′-domain. The α-amylase from S. marinus (STAMF α-amylase) is unusual in this regard, as it has both CD hydrolyzing activity and the additional N′-domain [35]. This suggests that, during the course of evolution, the presence of the N′-domain might be linked to CD hydrolyzing activity in archaea.

3.2. Homology Modeling. The homology modeling program Modeller v9.14 [27] was used to construct the 3D structure of Tk1770 with multiple templates, as described in Materials and Methods. Out of five models generated, the model with the lowest DOPE value was selected.

In homology modeling, the model may contain high-energy loops or residues with unusual geometry. The selected model was therefore refined using Modeller's built-in loop-refinement function on loops ranging from 3 to 7 amino acids in length and then validated with the ProSA-web server and PROCHECK analysis [30]. The overall quality of the model was estimated by the ProSA server in terms of its Z-score, by comparing it with the Z-score values of experimentally resolved protein structures in the Protein Data Bank [29]. The Ramachandran plot placed all the nonglycine, nonproline residues in allowed regions and 87.9% of residues in the most favorable regions. This indicates that the residues of the model have acceptable stereochemistry.
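A Ramachandran plot is built from the backbone dihedral angles of each residue: φ is defined by the atoms C(i−1), N(i), Cα(i), C(i), and ψ by N(i), Cα(i), C(i), N(i+1). PROCHECK computes these internally. The sketch below only illustrates the underlying geometry with a generic dihedral function, and the helper names and the test points are hypothetical rather than coordinates from the Tk1770 model.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees defined by four points."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    length = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / length for x in b1)
    # Components of b0 and b2 perpendicular to the central bond b1.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))


# Toy points: the first three are fixed, the fourth is moved around the central bond.
a, b, c = (1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0)
cis = dihedral(a, b, c, (1.0, 0.0, 1.0))
trans = dihedral(a, b, c, (-1.0, 0.0, 1.0))
quarter = dihedral(a, b, c, (0.0, 1.0, 1.0))
print(cis, trans, quarter)
```

For a real model, the four points passed to `dihedral` would be the atom coordinates listed above for φ or ψ, read from the PDB file of the model.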

The homology model of Tk1770 CDase was aligned with P. furiosus neopullulanase (PYRFU NPase) (PDB ID 4AEF) to analyze and compare the active site and other structural features. The overall structure of Tk1770 CDase folds into four major domains: two N-terminal domains consisting of β-strands only (i.e., the N′-domain and the conventional N-domain) connected to the TIM barrel (A-domain), and a C-terminal domain also consisting of β-strands. The structure of the N′-domain of Tk1770 typically represents CBM48, with eight β-strands [15, 36]. The structural alignment of the N′- or CBM48 domain of Tk1770 and PYRFU NPase revealed that both contain a loop that extends into the catalytic site. However, the extended loop of the N′-domain of PYRFU NPase forms a more flexible helical turn than the loop of Tk1770 (Figure 3). The substitution of P91 and S92 in the extended loop region of Tk1770 in place of K89 and G90 in the loop of PYRFU NPase might be responsible for this apparently decreased flexibility of the loop in the N′-domain of Tk1770 (Figure 3). Furthermore, K89 and G90 in the extended loop of the N′-domain in

Figure 1: Continued. Multiple sequence alignment of Tk1770_CDase with THES4, THEGJ, THERCLF, THEON, PYRYC, PYRFU, THERPA, THELN, THERSP, THEPD, STAMF, GBACI, PBACI, BACMY, GEOSE, and BACIIN.

Y E R V Y Q N E G V L V G I N Y S D V P T A I Q I P E A Y R P A A - - - - - - - - - - - - - D G V S F L K M K P W S F V A L A S T I - - - - - - - - - - -

Y E R I Y E K E H V L V A I N C S S R V E S V L I P E K Y R P I V - - - - - - - - - - - - - - G K T S I E L A P W S F I V V F S R F N D V Q L L S W P - -

Y K R E H M G E R T I V A I N Y S N S R V K - - - - - - - - - - - - - - - - - - - - - - - - - - E L G I T I P E Y S G V I I N E D K V K L I K Y - - - - -

Y E R V S K G E R I L I G I N Y S E K E A K I K L P E K V K I L L - - - - - - - - - - - - - G Q L H G E R L P P F S F F I S S L - - - - - - - - - - - - -

Y E R I H G E E R L L I G I N Y S E N P V S L R K S P D E I L L - - - - - - - - - - - - - - G D L E N S V L K P F S F F V G R L S - - - - - - - - - - - -

Y K R V L N N E N I L V A I N Y S K K E K H L D L P P S F E I L F - - - - - - - - - Q S G S F D R V N I R L K P F S S I I A K K L - - - - - - - - - - - -

Y R R R L G D E S I L V A I N Y S E S E A V L E E P A Q S V L F R - - - - - - - - - - - - S G S V K E K L L G P F S S V V A G D R - - - - - - - - - - - -

I K R W I N N E E I I F L L N V S S K D I S V D L K K L - - G K Y S F D I Y N E K N I D Q H V E - N N V L L R G Y G F L I L G S K P C N I - - - - - - - -

Y E K T D G D E T V V I I I N R S N Q A A D I P L P F N A K K K R L V N L L T G E R W A A E A D G L S V S L P A Y G F A L Y A V E K - - - - - - - - - - -

Y E R R L G D D I L I V L V N T E E T A Q Y F Q L A V E - - E R Q W E N V L T D A P L R A E R G I L S M K L P A F G Y A V L K A V Y - - - - - - - - - - -

Y T K T Y G E E T I F F V L N P T N Q E I S A P I P F D I T G K K I V N L Y T N E E F S A E A D S L Q V A L P P Y G F S I L K W - - - - - - - - - - - - -

Y K K T D G D E T V L V I I N R S D Q K A D I P I P L D A R G T W L V N L L T G E R F A A E A E T L C T S L P P Y G F V L Y A I E R W - - - - - - - - - -

F T K Q N S S Q K M I A V L N N S D K E L S A T L P F S L E D T K L T D L L T G K E F A A H A E K L T V T V P P Y E M A F Y L V Q E - - - - - - - - - - -

522

524

517

522

521

523

Tk1770_CDaseTHES4THEGJ

THERCLFTHEONPYRYCPYRFU

THERPATHELN

THERSPTHEPDSTAMFGBACIPBACI

BACMYGEOSE

BACIIN

Tk1770_CDaseTHES4THEGJ

THERCLFTHEONPYRYCPYRFU

THERPATHELN

THERSPTHEPDSTAMFGBACIPBACI

BACMYGEOSE

BACIIN

lowast

Figure 1 Sequence alignment of Tk1770 CDase with sixteen CD hydrolyzing enzymes The alignment of Tk1770 CDase with archeal andbacterial CD hydrolyzing enzymes was carried out with Clustal Omega through UGENE packageThe novel N1015840-domain (CBM48) in archealsequences is represented in red and the protruding region of CBM48 domain in green dotted line The arrow shows the start of the TIMbarrel domain (residues 204ndash584) and four conserved regions (IndashIV) with another downstream conserved region V are represented in greyline below sequence The catalytic triad is indicated through esterics The HLH region of archeal sequences that is absent in all bacterialhomologs is represented in blue dotted line

Figure 2: Phylogenetic tree. A rooted radial tree of 17 CD hydrolyzing enzymes was constructed using MrBayes with the WAG rate matrix (fixed) and visualized using FigTree. The phylogenetic tree displays three distinct clades. All the bacterial enzymes form a single clade (shown in blue), while the branch for archaeal enzymes splits into two clades (shown in green and red). Based on sequence identity and domain arrangement, Tk1770 CDase seems to be more closely related to THEGJ MAse, THES4 CDase, THERCLF CDase, PYRFU NPase, THEON CDase, and PYRYC CDase (green).

Figure 3: Structural features of Tk1770 CDase. (a) Schematic diagram of the domain arrangement within Tk1770, made with the software DOG (Domain Graph) v2.0 [32]. Panel labels: N′-domain (CBM48), N-domain (CBM34), linker (190–203), TIM barrel, and C-domain, with residue marks at 1, 106, 585, and 656. (b) The homology model of Tk1770 CDase, consisting of the N′- (blue), N- (yellow), catalytic (red), and C-domain (green). The catalytic domain also contains a helix-loop-helix (HLH) structure (cyan). (c) Structural alignment of the N′-domain (CBM48) of the Tk1770 CDase model (blue) and the template (4AEF) (grey), with an extension of the loop into the catalytic site. The sequence alignment between the loops of the model and the template (4AEF) suggests that the substitution of P91 and S92 in Tk1770 makes its loop rigid.

Figure 4: Docking of cyclodextrins into the active site of the Tk1770 CDase. (a) Complex of Tk1770 CDase with α-CD and the residues involved in interactions (labelled: Lys94, Lys364, Ser375, Val376, Asp502, Met546, Arg550). (b) The docked conformation of β-CD with Tk1770 CDase (labelled: Lys94, Glu96, Lys364, Ser375, Val376, Glu504, Met546). (c) The active site residues of Tk1770 interacting with γ-CD (labelled: Lys94, Arg97, Lys364, Asp502, Met546, Arg550). The hydrogen bonds between substrate and amino acids are represented as red dashes.

PYRFU NPase form strong hydrogen bonds with D460 and E470 of the catalytic site. In Tk1770, S92 makes only one hydrogen bond with D460, thus reducing the interactions between the N′-domain and the catalytic domain. Moreover, S92 in Tk1770 rotates backward to form a hydrogen bond with Y93, which makes the loop more rigid. All of these factors may contribute to the decreased stability of the enzyme domains in Tk1770. Recently, it was reported that the optimum temperature for Tk1770 CDase is 65 °C, which is much lower than the optimal growth temperature of T. kodakarensis (85 °C) and the optimum temperature of other archaeal CD hydrolyzing enzymes [37].

The (β/α)8 barrel (A-domain) also contains a much larger B-domain between β-strand 3 and α-helix 3, from residues 306 to 403. The B-domain of all the archaeal enzymes possesses a helix-loop-helix (HLH) motif that extends at the entrance of the active site (Figure 3), but this HLH motif is absent in all five bacterial enzymes, as shown in Figure 1. It has been reported that, in order to maintain activity at high temperatures, archaea might have adopted additional structural features. These features include an N′-domain with a loop extension into the catalytic site and an HLH motif, which provide all necessary components for substrate binding and catalysis in a monomer [35, 38].

3.3. Docking of Substrates into the Catalytic Site. The docking of substrates into the catalytic site provided information about the interactions in the enzyme-substrate complex. For this purpose, AutoDock was used to dock cyclodextrins, that is, α-, β-, and γ-cyclodextrins, into the active site of the Tk1770 CDase model. All of the ligand conformations generated by AutoDock were scored on the basis of their binding affinities in kcal/mol. The best poses of α-, β-, and γ-cyclodextrins were selected, with binding energies of −8.8, −6.1, and −7.8 kcal/mol, respectively.

The docking results showed that, apart from the interacting residues, a number of residues come into close proximity with the substrates, especially hydrophobic residues such as Y93, F95, F373, F374, and V376. In the case of docked α-cyclodextrin, residues D502, R550, and S375 formed hydrogen bonds with hydroxyl groups of the substrate (Figure 4). In our homology model, K94 in the loop extension of the N′-domain forms a salt bridge with E504 from the active site and might contribute to the stability of the two domains in the same manner as observed by Park et al. in amylase/neopullulanase (4AEF) from P. furiosus [38]. However, docking of β-CD showed strong interactions of K94 with hydroxyl groups of the substrate and with E504 (Figure 4). Similarly, docking of γ-CD revealed interactions of K94, R97, and K364 with the substrate (Figure 4). The amino acid K364 in the helical region of the HLH motif extends into the entrance of the active pocket, right above F373 and F374, and might have a role in guiding the substrate into the active site. The aromatic amino acids Y88, Y93, and F95, which seem to form the boundary wall of the active site, and K94, which protrudes into the entrance of the catalytic site, are conserved in archaeal homologs except STAMF α-amylase.
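A salt bridge such as the one described between K94 and E504 is usually identified from the model by a distance criterion between the charged side-chain atoms. The following is a minimal sketch of such a check, not the procedure used in this study: the 4.0 Å cutoff is an assumed, commonly used threshold, and the coordinates are hypothetical placeholders rather than values taken from the Tk1770 model.

```python
import math

# Assumed cutoff in angstroms; the paper does not state the criterion it used.
SALT_BRIDGE_CUTOFF = 4.0


def is_salt_bridge(cation_atoms, anion_atoms, cutoff=SALT_BRIDGE_CUTOFF):
    """Return True if any cationic atom lies within `cutoff` of any anionic atom.

    Each argument is a list of (x, y, z) coordinates, e.g. the NZ atom of a
    lysine and the OE1/OE2 atoms of a glutamate.
    """
    return any(
        math.dist(cation, anion) <= cutoff
        for cation in cation_atoms
        for anion in anion_atoms
    )


# Hypothetical coordinates for illustration only (not from the model).
lys_nz = [(10.0, 12.0, 8.0)]
glu_oe = [(12.5, 13.0, 8.5), (13.8, 14.1, 9.0)]
distant_oe = [(20.0, 20.0, 20.0)]

close_pair = is_salt_bridge(lys_nz, glu_oe)
far_pair = is_salt_bridge(lys_nz, distant_oe)
```

With real data, the coordinate lists would be read from the PDB file of the homology model for the residues of interest.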

4. Conclusion

Cyclodextrinase from the hyperthermophilic archaeon T. kodakarensis hydrolyzes cyclodextrins into linear maltodextrins. The sequence alignment of CD hydrolyzing enzymes confirmed that archaea have developed an additional N′-domain and a helix-loop-helix (HLH) motif in the B-domain that are absent in all bacterial homologs. The homology model revealed that the loop connecting β-strand 7 and β-strand 8 of the N′-domain extends into the catalytic site (A-domain) and plays an important role in substrate binding. Residues Y88, Y93, K94, F95, and R97 in the extended loop of the N′-domain of Tk1770 CDase are conserved in CD hydrolyzing enzymes of archaea. Structural alignment between the model and the template (4AEF) indicated that P91 and S92 in the loop extension of the N′-domain of Tk1770 might decrease its flexibility and its interactions with the A-domain. This might contribute to the decreased stability of the two domains in Tk1770.

The docking studies indicated that residues K94, R97, K364, S375, D502, E504, and R550 form hydrogen bonds with the substrates. Residue K364, in the helix of the HLH motif extending at the entrance of the catalytic site, interacts with the substrate and might be involved in guiding it into the catalytic site. From these results, it can be inferred that archaeal CD hydrolyzing enzymes have developed catalytic machinery in which an extension of the N′-domain not only constitutes part of the active pocket but also plays an important role in substrate binding.

Conflict of Interests

The authors declare that no conflict of interests exists.

Authors' Contribution

Ramzan Ali and Muhammad Imtiaz Shafiq contributed equally to the paper.

References

[1] R. M. Kelly, L. Dijkhuizen, and H. Leemhuis, “Starch and α-glucan acting enzymes, modulating their properties by directed evolution,” Journal of Biotechnology, vol. 140, no. 3-4, pp. 184–193, 2009.

[2] E. M. M. Del Valle, “Cyclodextrins and their uses: a review,” Process Biochemistry, vol. 39, no. 9, pp. 1033–1046, 2004.

[3] W. J. Shieh and A. R. Hedges, “Properties and applications of cyclodextrins,” Journal of Macromolecular Science, Part A: Pure and Applied Chemistry, vol. 33, no. 5, pp. 673–683, 1996.

[4] Y. Nakagawa, W. Saburi, M. Takada, Y. Hatada, and K. Horikoshi, “Gene cloning and enzymatic characteristics of a novel γ-cyclodextrin-specific cyclodextrinase from alkalophilic Bacillus clarkii 7364,” Biochimica et Biophysica Acta - Proteins and Proteomics, vol. 1784, no. 12, pp. 2004–2011, 2008.

[5] K. Uekama, F. Hirayama, and H. Arima, “Pharmaceutical applications of cyclodextrins and their derivatives,” in Cyclodextrins and Their Complexes: Chemistry, Analytical Methods, Applications, chapter 14, pp. 381–422, Wiley-VCH, 2006.

[6] 2015, http://www.cazy.org/.

[7] P. V. Aiyer, “Amylases and their applications,” African Journal of Biotechnology, vol. 4, no. 13, pp. 1525–1529, 2005.

[8] W. D. Crabb and C. Mitchinson, “Enzymes involved in the processing of starch to sugars,” Trends in Biotechnology, vol. 15, no. 9, pp. 349–352, 1997.

[9] T. Han, F. Zeng, Z. Li et al., “Biochemical characterization of a recombinant pullulanase from Thermococcus kodakarensis KOD1,” Letters in Applied Microbiology, vol. 57, no. 4, pp. 336–343, 2013.

[10] E. A. MacGregor, “An overview of clan GH-H and distantly related families,” Biologia, vol. 60, supplement 16, pp. 5–12, 2005.

[11] P. M. de Souza and P. D. O. e Magalhaes, “Application of microbial α-amylase in industry - a review,” Brazilian Journal of Microbiology, vol. 41, no. 4, pp. 850–861, 2010.

[12] M. R. Stam, E. G. J. Danchin, C. Rancurel, P. M. Coutinho, and B. Henrissat, “Dividing the large glycoside hydrolase family 13 into subfamilies: towards improved functional annotations of α-amylase-related proteins,” Protein Engineering, Design and Selection, vol. 19, no. 12, pp. 555–562, 2006.

[13] M. Machovic and S. Janecek, “The invariant residues in the α-amylase family: just the catalytic triad,” Biologia, vol. 58, no. 6, pp. 1127–1132, 2003.

[14] S. Janecek, “How many conserved sequence regions are there in the α-amylase family?” Biologia, vol. 57, supplement 11, pp. 29–41, 2002.

[15] D. Guillen, S. Sanchez, and R. Rodríguez-Sanoja, “Carbohydrate-binding domains: multiplicity of biological roles,” Applied Microbiology and Biotechnology, vol. 85, no. 5, pp. 1241–1249, 2010.

[16] J. Matzke, A. Herrmann, E. Schneider, and E. P. Bakker, “Gene cloning, nucleotide sequence, and biochemical properties of a cytoplasmic cyclomaltodextrinase (neopullulanase) from Alicyclobacillus acidocaldarius: reclassification of a group of enzymes,” FEMS Microbiology Letters, vol. 183, no. 1, pp. 55–61, 2000.

[17] K.-H. Park, T.-J. Kim, T.-K. Cheong, J.-W. Kim, B.-H. Oh, and B. Svensson, “Structure, specificity and function of cyclomaltodextrinase, a multispecific enzyme of the α-amylase family,” Biochimica et Biophysica Acta - Protein Structure and Molecular Enzymology, vol. 1478, no. 2, pp. 165–185, 2000.

[18] N. Ahmad, N. Rashid, M. S. Haider, M. Akram, and M. Akhtar, “Novel maltotriose-hydrolyzing thermoacidophilic type III pullulan hydrolase from Thermococcus kodakarensis,” Applied and Environmental Microbiology, vol. 80, no. 3, pp. 1108–1115, 2014.

[19] G. D. Haki and S. K. Rakshit, “Developments in industrially important thermostable enzymes: a review,” Bioresource Technology, vol. 89, no. 1, pp. 17–34, 2003.

[20] C. Bertoldo and G. Antranikian, “Starch-hydrolyzing enzymes from thermophilic archaea and bacteria,” Current Opinion in Chemical Biology, vol. 6, no. 2, pp. 151–160, 2002.

[21] M. W. Bauer, L. E. Driskill, and R. M. Kelly, “Glycosyl hydrolases from hyperthermophilic microorganisms,” Current Opinion in Biotechnology, vol. 9, no. 2, pp. 141–145, 1998.

[22] C. Vieille and G. J. Zeikus, “Hyperthermophilic enzymes: sources, uses, and molecular mechanisms for thermostability,” Microbiology and Molecular Biology Reviews, vol. 65, no. 1, pp. 1–43, 2001.

[23] S. K. Khare, A. Pandey, and C. Larroche, “Current perspectives in enzymatic saccharification of lignocellulosic biomass,” Biochemical Engineering Journal, vol. 102, pp. 38–44, 2015.

[24] 2015, http://www.cazy.org/GH13.html.

[25] K. Okonechnikov, O. Golosova, M. Fursov et al., “Unipro UGENE: a unified bioinformatics toolkit,” Bioinformatics, vol. 28, no. 8, pp. 1166–1167, 2012.

[26] F. Ronquist and J. P. Huelsenbeck, “MrBayes 3: Bayesian phylogenetic inference under mixed models,” Bioinformatics, vol. 19, no. 12, pp. 1572–1574, 2003.

[27] B. Webb and A. Sali, Protein Structure Modeling with MODELLER, Protein Structure Prediction, Springer, 2014.

[28] N. Eswar, B. Webb, M. A. Marti-Renom et al., “Comparative protein structure modeling using MODELLER,” in Current Protocols in Protein Science, John Wiley & Sons, 2007.

[29] M. Wiederstein and M. J. Sippl, “ProSA-web: interactive web service for the recognition of errors in three-dimensional structures of proteins,” Nucleic Acids Research, vol. 35, no. 2, pp. W407–W410, 2007.

[30] R. A. Laskowski, M. W. MacArthur, D. S. Moss, and J. M. Thornton, “PROCHECK: a program to check the stereochemical quality of protein structures,” Journal of Applied Crystallography, vol. 26, no. 2, pp. 283–291, 1993.

[31] W. L. DeLano, The PyMOL Molecular Graphics System, DeLano Scientific, Palo Alto, Calif, USA, 2002.

[32] J. Ren, L. Wen, X. Gao, C. Jin, Y. Xue, and X. Yao, “DOG 1.0: illustrator of protein domain structures,” Cell Research, vol. 19, no. 2, pp. 271–273, 2009.

[33] G. M. Morris, R. Huey, W. Lindstrom et al., “AutoDock4 and AutoDockTools4: automated docking with selective receptor flexibility,” Journal of Computational Chemistry, vol. 30, no. 16, pp. 2785–2791, 2009.

[34] M. Machovic, B. Svensson, E. Ann MacGregor, and S. Janecek, “A new clan of CBM families based on bioinformatics of starch-binding domains from families CBM20 and CBM21,” FEBS Journal, vol. 272, no. 21, pp. 5497–5513, 2005.

[35] T.-Y. Jung, D. Li, J.-T. Park et al., “Association of novel domain in active site of archaic hyperthermophilic maltogenic amylase from Staphylothermus marinus,” The Journal of Biological Chemistry, vol. 287, no. 11, pp. 7979–7989, 2012.

[36] M. Machovic and S. Janecek, “Domain evolution in the GH13 pullulanase subfamily with focus on the carbohydrate-binding module family 48,” Biologia, vol. 63, no. 6, pp. 1057–1068, 2008.

[37] Y. Sun, X. Lv, Z. Li, J. Wang, B. Jia, and J. Liu, “Recombinant cyclodextrinase from Thermococcus kodakarensis KOD1: expression, purification, and enzymatic characterization,” Archaea, vol. 2015, Article ID 397924, 8 pages, 2015.

[38] J.-T. Park, H.-N. Song, T.-Y. Jung et al., “A novel domain arrangement in a monomeric cyclodextrin-hydrolyzing enzyme from the hyperthermophile,” Biochimica et Biophysica Acta - Proteins and Proteomics, vol. 1834, no. 1, pp. 380–386, 2013.


Table 2: List of the PDB files used as templates for homology modeling of CDase Tk1770.

Serial number   PDB ID   Organism                      Enzyme                 Identity with TK1770 (%)   Query cover (%)
1               4AEF     P. furiosus                   Amylase                56                         98
2               1EA9     Bacillus sp.                  Cyclomaltodextrinase   33                         76
3               1J0J     G. stearothermophilus         Neopullulanase         33                         78
4               1SMA     Thermus sp.                   Maltogenic amylase     32                         79
5               4AEE     S. marinus                    Maltogenic amylase     29                         95
6               1WZL     Thermoactinomyces vulgaris    α-amylase II           35                         76

score was selected for further studies. The homology model was further validated by the ProSA-web server and PROCHECK [29, 30]. The model was refined with the Modeller loop refinement functions and validated again. Thus, a reliable model was constructed and visualized using PyMOL [31].

2.3. Molecular Docking Studies. In order to investigate the enzyme-substrate interactions, the docking of substrates (α-, β-, and γ-cyclodextrins) into the active pocket of Tk1770 was carried out using AutoDock and MGL Tools v1.5.6 [33]. The substrates were prepared by adding polar hydrogen atoms and partial charges. The protein model was prepared by adding polar hydrogens and Gasteiger charges. The grid map dimensions were set around the active site with all other parameters set to default, and rigid docking was performed. The candidate poses of the substrates were scored on the basis of their binding energy in kcal/mol, and the best poses with the lowest binding energy were selected.
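The pose selection step described above amounts to keeping, for each substrate, the candidate pose with the lowest (most negative) binding energy. A minimal sketch of that selection is given below. The `Pose` record and the energy values are hypothetical placeholders for illustration; they are not AutoDock output from this study, and parsing of the actual docking log files is not shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pose:
    """One candidate docking pose (hypothetical record layout)."""
    ligand: str
    pose_id: int
    binding_energy: float  # kcal/mol; lower means more favorable


def best_pose_per_ligand(poses):
    """Return a dict mapping each ligand to its lowest-energy pose."""
    best = {}
    for pose in poses:
        current = best.get(pose.ligand)
        if current is None or pose.binding_energy < current.binding_energy:
            best[pose.ligand] = pose
    return best


# Placeholder energies for illustration only; not results from the paper.
example_poses = [
    Pose("alpha-CD", 1, -5.0),
    Pose("alpha-CD", 2, -7.5),
    Pose("beta-CD", 1, -6.0),
    Pose("beta-CD", 2, -4.2),
]

best = best_pose_per_ligand(example_poses)
```

Selecting on energy alone follows the text; in practice the chosen pose is usually also inspected visually to confirm that it sits in the active pocket.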

3. Results and Discussion

3.1. Sequence Alignment and Phylogenetic Tree. The sequence of Tk1770, consisting of 656 amino acids, was aligned with sixteen CD hydrolyzing enzymes from the GH13 family (Figure 1). These sequences included eleven archaeal enzymes and five bacterial enzymes having sequence identities from 28% to 60% with Tk1770 CDase (Table 1). All enzymes possess three major domains: (i) an N-domain, (ii) a catalytic TIM barrel, and (iii) a C-domain [10, 34]. The sequence analysis showed that archaeal enzymes contain two N-terminal domains (i.e., N′- and N-domain) in addition to the catalytic and C-domains, whereas the N′-domain is absent in all the bacterial CD hydrolyzing enzymes (Figure 1). A linker region from residues 190 to 203 in Tk1770 connects the two N-terminal domains with the two C-terminal domains. Four conserved regions of the GH13 family in the TIM barrel structure were identified from residues 299 to 310, 405 to 414, 433 to 441, and 496 to 502, with the catalytic triad being Asp411, Glu437, and Asp502. An additional conserved region of amino acids 533–539 was also identified downstream of the conserved regions I–IV.
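The relationship between the conserved regions and the catalytic triad can be checked directly from the residue ranges quoted above. The sketch below assumes that the four ranges are listed in the order I to IV and that the downstream region 533–539 is region V; the function and variable names are illustrative only.

```python
# Residue ranges (inclusive) taken from the text; the I-IV ordering is assumed
# to follow the order in which the ranges are listed.
CONSERVED_REGIONS = {
    "I": (299, 310),
    "II": (405, 414),
    "III": (433, 441),
    "IV": (496, 502),
    "V": (533, 539),
}

# Catalytic triad residues and their positions in Tk1770.
CATALYTIC_TRIAD = {"Asp411": 411, "Glu437": 437, "Asp502": 502}


def region_of(position, regions=CONSERVED_REGIONS):
    """Return the name of the conserved region containing `position`, or None."""
    for name, (start, end) in regions.items():
        if start <= position <= end:
            return name
    return None


triad_regions = {
    residue: region_of(position)
    for residue, position in CATALYTIC_TRIAD.items()
}
```

Under these assumptions, each catalytic residue falls inside one of regions II, III, and IV, while regions I and V contain none of the triad.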

A rooted phylogenetic tree was constructed from the alignment using MrBayes with the WAG rate matrix (fixed) to find evolutionary relationships. The tree was divided into three clades, with all bacterial enzymes forming one clade and the archaeal enzymes divided into two clades (Figure 2). The tree showed that Tk1770 CDase is more closely related to THEGJ MAse and THES4 CDase, with sequence identities of 59% and 60%, respectively (Figure 2). The STAMF α-amylase shows 28% sequence identity with Tk1770 CDase and acts as the outgroup in the phylogenetic tree. α-Amylases usually do not exhibit CD hydrolyzing activity, and they also lack the N′-domain. The α-amylase from S. marinus (STAMF α-amylase) is quite unique in this regard, as it both exhibits CD hydrolyzing activity and has an additional N′-domain [35]. This suggests that, during the course of evolution, the presence of the N′-domain might be linked to CD hydrolyzing activity in archaea.
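Pairwise sequence identities such as those quoted above are computed from aligned sequences. The following sketch shows one common definition, identical residues divided by the number of columns in which neither sequence has a gap. The text does not state which definition was used, so this choice is an assumption, and the two aligned fragments are toy strings rather than rows from the actual alignment.

```python
GAP = "-"


def percent_identity(seq_a, seq_b):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == GAP or b == GAP:
            continue  # skip columns containing a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Toy aligned fragments for illustration only.
toy_a = "YNFLDNHD-TER"
toy_b = "YNMLGSHDVPR-"

identity = percent_identity(toy_a, toy_b)
```

Other denominators (alignment length, or the length of the shorter sequence) give different percentages, so the definition should be reported alongside the values.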

3.2. Homology Modeling. The homology modeling program Modeller v9.14 [27] was used to construct the 3D structure of Tk1770 with multiple templates, as described in Materials and Methods. Out of five models generated, the best model, with the lowest DOPE value, was selected.

In homology modeling, the model may contain high-energy loops or residues with unusual geometry. Thus, the selected model was refined using Modeller's built-in loop-refinement function on loops ranging from 3 to 7 amino acids in length and then validated with the ProSA-web server and PROCHECK analysis [30]. The overall quality of the model was estimated by the ProSA server in terms of its Z-score, by comparing it with the Z-scores of experimentally resolved protein structures in the Protein Data Bank [29]. The Ramachandran plot placed all the nonglycine, nonproline residues in allowed regions, with 87.9% of residues in the most favorable regions. This indicates that the residues of the model have acceptable stereochemistry.

The homology model of Tk1770 CDase was aligned with P. furiosus neopullulanase (PYRFU NPase) (PDB ID 4AEF) for analysis and comparison of the active site and other structural features. The overall structure of Tk1770 CDase folds into four major domains: two N-terminal domains consisting only of β-strands (i.e., the N′- and the conventional N-domain), connected to the TIM barrel (A-domain), and a C-terminal domain also consisting of β-strands. The structure of the N′-domain of Tk1770 typically represents CBM48, with eight β-strands [15, 36]. The structural alignment of the N′- or CBM48 domain of Tk1770 and PYRFU NPase revealed that both contain a loop that extends into the catalytic site. However, the extended loop of the N′-domain of PYRFU NPase forms a more flexible helical turn as compared to the loop of Tk1770 (Figure 3). The substitution of P91 and S92 in the extended loop region of Tk1770, in place of K89 and G90 in the loop of PYRFU NPase, might be responsible for this apparent decreased flexibility of the loop in the N′-domain of Tk1770 (Figure 3). Furthermore, K89 and G90 in the extended loop of the N′-domain in

Figure 1 (alignment panels). Sequences shown: Tk1770_CDase, THES4, THEGJ, THERCLF, THEON, PYRYC, PYRFU, THERPA, THELN, THERSP, THEPD, STAMF, GBACI, PBACI, BACMY, GEOSE, and BACIIN.

F E A P I D D T A Y Y F C F P F L H R V D L F E A P D W V K D T V W Y Q I F P E R F A N G N P S I S P E G S R P W G N E D P T P T S F F G G D L Q G I I D H L D Y L V D L G I T G I Y L T P I F R S P S N H K Y D T A D Y F E V

K D V V S D D T A P Y F A F P F L N K A D V F H A P E W V K D T V W Y Q I F P E R F A N G D S S I N P E G T L E W G S I E P T S G N F F G G D F E G V I Q N I G Y L K E L G I S G I Y F T P V F K A Y S N H K Y D T I D Y M E L

S P K V R E F V A R V M N Y W L E K - G A D G W R L D V A H G V P P G F W R E V R E G - - - L P D D A Y L F G E V M D D P R L Y L F - G V F H G V M N Y P L Y D L L L R F F A F G E I G A T E F I N G I E L - L S A H L G P A E

S P E V R K F I R E V M E Y W L E R - G A D G W R L D V A H G V P P E L W G E M R K A - - - M P E G A Y L M G E V M D D P R L W V F - D A F H G T M N Y P L Y E L I L R F F V K G E I D A G E F L N G L E L - L S A H L G P A E

S E E V F E F V V N V M G Y W L K K - - A D G W R L D V A H G V P P D F W V R V R E R - - - M P S S A Y L I G E V M D D A R L Y L F - R G F H G V M N Y A L Y D A I L K F F A F G E I S A E E F L N E L E L - I S V R Y G P A E

N P E V K R L V K D V M M H W L E K - G A D G W R L D V A H G V P P E L W R E V R K A - - - L P K D A Y L V G E V M D D P R L W L F - D K F H G T M N Y P L Y E L I L R F F V E R E I D A G E F L N G L E L - L S A H L G P A E

N P Q V R E F I V S V M K H W L E E - G A D G W R L D V A H G V P P E L W R E V R E R - - - M P E D A Y L V G E V M D D A R L W L F - D K F H G T M N Y P L Y E A I L R F F V R G E I S A E E F L N W L E L - L S T Y Y G P A E

D P R V R K F I A K V M N Y W L E K - G I D G W R L D V A H G I P P D L W R E I R K E - - - M P E D A Y L V G E V M D D A R M W L F - D K F H G T M N Y P L Y E A I L R F F V T G E I T A E E F L N Y L E L - L S T Y Y G P A E

N P K V R E F I K N V I L F W T N K - G V D G F R M D V A H G V P P E V W K E V R E A - - - L P K E K Y L I G E V M D D A R L W L F - D K F H G V M N Y R L Y D A I L R F F G Y E E I T A E E F L N E L E L - L S S Y Y G P A E

N P E V K E F I R T V M K Y W L E R - G A D G W R L D V A H G V P P D V W R E I R K D - - - I P D D A Y L L G E V M D D A R L W L F - D K F H G T M N Y P L Y E A L L R F F V Y N E I T A E E F L N W L E L - L S V Y Y G P A E

S K G V R E F I G N I M E Y W I K K - G A D G W R L D V A H G V P P E V W E E I R E K - - - L P S N V Y L V G E V M D D A R L W I F - N K F H G T M N Y P L Y E A I L R F F V T R E I N A E Q F L N W L E L - L S F Y Y G P A E

S K G V R E F I R N I M E Y W I K K - G A D G W R L D V A H G V P P E V W E E I R E K - - - L P S N V Y L V G E V M D D A R L W I F - N K F H G T M N Y P L Y E A I L R F F V T R E I N A E Q F L N W L E L - L S F Y Y G P A E

N P E V R S F I T G V G R Y W V S R - G V D G W R L D V A H G V P P E L W R E F R E T - - - L P G D V Y L F G E V M D D A R I W L F - D K F H G A M N Y L L Y D A V L R F F A Y R E I T A E E F L N R L E L - L S V Y Y G P G E

N P R T V D Y F I D I T K F W I D K - G I D G F R I D V A M G I H Y S W M K Q Y Y E Y I K N T Y P D F L V L G E L A E N P R I Y M - - D Y F D S A M N Y Y L R K A I L E L L I Y K R I D L N E F I S R I N N V Y A Y I P H Y K A

H P D V R R Y L L D V A T Y W I R E C D I D G W R L D V A N E I D H E F W R E F R R A V K A Q K P D V Y I L G E I W H D A M P W L R G D Q F D A V M N Y P V A D A A L R F F A K E E I N A R E F A E R L M R V L H S Y P A T V N

N P E V K Q Y L L E V A E Y W I K E V G I D G W R L D V A N E V S H E F W R E F R K V V K R A N P D A Y I L G E I W H E S A P W L E G D K F D A V M N Y P F T S A V I D F F V F G N L D A E G F A N S I G K Q L S R Y P L Q A S

H P D V K E Y L L K V G R Y W V R E F H I D G W R L D V A N E V D H S F W R E F R S E I K A I N P E V Y I L G E I W H D A Q P W L Q G D Q F D A V M S Y P I T N A L H S Y F A N E T I G A S E F M E Q I T A S L H S Y S M N V N

N P E V K R Y L L D V A T Y W I R E F D I D G W R L D V A N E I D H E F W R E F R Q A V K A L K P D V Y I L G E I W H D A M P W L R G D Q F D A V M N Y P F T D G V L R F F A K E E I G A R Q F A N Q M V H V L H S Y P N N V N

H P D V R S Y L L E V G R Y W V R E F D I D G W R L D V A N E V D H A F W R E F R Q A V R A E K E D V Y I L G E I W H D S M P W L Q G D Q F D A V M N Y P F T T G T M N F I A N N K V K A E E F V H I M E S V L H S Y P K N V N

106

109

109

104

104

104

104

104

104

104

104

105

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

188

193

187

185

184

184

187

184

184

184

195

202

104

101

104

104

104

279

287

274

279

279

275

279

283

284

284

275

305

216

211

216

216

216

386

394

381

386

386

382

386

390

391

391

378

417

302

298

302

302

303

187

192

186

184

183

183

186

183

183

183

194

201

103

100

103

103

103

278

286

273

278

278

274

278

282

283

283

274

304

215

210

215

215

215

491

499

485

491

491

487

491

495

496

496

483

525

413

409

413

413

414

385

393

380

385

385

381

385

389

390

390

377

416

301

297

301

301

302

E R L G G E E A F R E L V K A L K S R D I K L V L D G V F H H T S F F H P F F R D V V E R G E E S E Y A D F Y R V K G F P V - - V S E E F I R V L K S D L P P M E K Y Q T L K K M G W N - - - Y E S F F S V W V M P R L N H D

D

A

A

A

A

A

A

A

A

R K L G G G G V F G E F V K E L K K R D I R L I L D G V F H H T S F F H P Y F Q D V V R K G E G S E Y R G F Y R I T G F P V - - V P E Q F L R V L H S E G P W I E R Y H L I K S L D W N - - - Y E S F Y S V W L M P R L N H D

S R R L G G D E A F D E L V K E L R R R G I K L I L D G V F H H T S F F H P Y F Q D V V E K G E R S R Y V G F Y R I L G F P V - - V S K R F L R A L D S G L L P G D T R S A P M G A E W N - - - Y E S F Y S V W L M P R L N S D

D R K L G G D G T F L K L A G E L K K R D I K L V L D G V F H H T S F F H P F F Q D L I A R G N E S D Y K D F Y R V T G F P V - - V S G E F L E V L R S K I S P R E K H R R L K E I G W N - - - Y E S F Y S V W L M P R L N H E

G K F G G N E A F G E L A R E L K R R D I K L I L D G V F H H T S F F H S Y F Q D V V K K G G E S R Y R D F Y R I L K F P V - - V S K D F L R V L D S N E P P E R K Y K G L K E L H Q N - - - Y E N F F S V W L M P R L N H D

K R L G G N A A F E K L V R E L K R R D I K L I L D G V F H H T S F F H P H F Q D V V R K G V E S V Y R D F Y R I T G F P V - - V S Q E F L E I L N S E E P W E E K F K R L K N L D W N - - - Y E S F F S V W L M P R L N H D

R R L G G D R A F V D L L S E L K R F D I K V I L D G V F H H T S F F H P Y F Q D V V R K G E N S S F K N F Y R I I K F P V - - V S K E F L Q I L H S K S S W E E K Y K K I K S L G W N - - - Y E S F F S V W I M P R L N H D

K K F G G D K A L K Q L V N E L K K R D I K L I L D G V F H H T S F F H P Y F Q D I L K K G K E S K Y R N F Y R I F G F P V - - I S K E F S K L L H S N E P W I E K Y Q K L R K L K W N - - - Y E S F F S V W L M P R L N H E

R K F G G D E A F E K L V Q K L K K R D I K L I L D G V F H H T S F F H P Y F Q D V V K N G K N S K Y K D F Y R I I S F P V - - V P E E F F E I L N S K L P W D E K Y R R L K S L K W N - - - Y E S F Y S V W L M P R L N H D

R K F G G D E A F E K L M Q K L K K R D I K L I L D G V F H H T S F F H P Y F Q D V V K N G K N S K Y K D F Y R I I S F P V - - V P E E F F E I L N S K L P W D E K Y R R L K S L K W N - - - Y E S F Y S V W L M P R L N H D

G R L G G D E A F G R L L A E L K K R G M R V V L D G V F H H T S F F H P Y F Q D L V E K G E E S R Y K G F Y R V L G F P V - - V P R E F L E A L R S G A P R H E - - - - L K K Y P R R - - - Y E S F F D V W L M P R L N H D

D K Y L G T M E D F E K L V Q V L H S R K I K I V L D I T M H H T N P C N E L F V K A L R E G E N S P Y W E M F S F L S P P P K E I V E L M L K Y I D G E E C R S R E L Y K L D Y F R N N K P F Y E A F F N I W L M A K F N H D

D P H F G D K E T L K T L V Q R C H E K G I R V M L D A V F N H C G Y E F A P F Q D V L K N G E S S P Y K D W F H I R D F P L - - Q S E - P - - - - - - - - - - - - - - - - - - - - R P N - - - Y D T F A F V P Q M P K L N T A

D P Q F G D A K T L K K L V D V C H E R G I R V L L D A V F N H A G K T F A P F I D V Q E K G E A S P Y K D W F H I N Q F P L - - A F D Q D - - - - - - - - - - - - - - - - - - - - I P S - - - Y D T F A F E P L M P K L N T E

D P Q F G T K E T F K K L V N A C H K R G I K V M L D A V F N H S G Y F F D K F Q D V L K K G K Q S R Y T N W F H I H E F P I - - V T E - P - - - - - - - - - - - - - - - - - - - - L P N - - - Y D T F A F T P Y M P K L N T A

D P H F G D K E T L K T L I D R C H E K G I R V M L D A V F N H C G Y E F A P F Q D V W K N G E S S K Y K D W F H I H E F P L - - Q T E - S - - - - - - - - - - - - - - - - - - - - R P N - - - Y D T F A F V P Q M P K L N T A

D P Q F G D K E T F K R L V R T C H D N G I K V M L D A V F N H S G Y Y F P Q F Q D V L E H G E K S S Y K D W F H I R K F P L - - K N E D D - - - - - - - - - - - - - - - - - - - - T I N - - - Y D A F A F V E S M P K L N T E

lowast lowast

M Y K V F G F E E N F I H G R V A R - - V E F S L P D A G R W D Y A Y L L G N F N A F N E G S F R M K H E D K R W I I E I K L P E G L W R Y A F S A G G E F - - L L D P E N P E K E L Y R R P S Y K F E R E V S L A K I A

W

M R K V Y K I F G F E P D Q K F G R V A V - - V E F S I P A E P G N R Y A Y L L G S F N A F N E G S F R M R R K K G R W R T V V K L P E G V W H Y A F S I D G E F - - T P D P E N P R R E V Y R R L S Y K F E R E T S V A V I D

- - -

- - -

M Y K T F G F V E D P V F G R L A R - - V E F S I P Y R - G E R Y A Y L L G S F N A F N E G S F R M E R R G S R W F I R V L L P E G V W R Y A F S L E G R F - - E R D P E N E N V E T Y R R P S Y K F E K E V S V A G V I

- - - M Y K I F G F E P D W R F G R V A R - - V E F S I P A R -- G K Y A Y L L G N F N A F N E G S F R M E R K G E R W R I T L R L P E G V W Y Y G F S V D G E F - - L M D P E N P D V E T Y R K L S Y K L E K E A S V A R I V

- - - M Y K T F G F E S N E Y F G R I A K - - V E F S V P S R - - G S Y A Y L V G S F N A F N E G S F R M R E E N G R W R A T V E L P E G V W H Y G F S I D G K Y - - A P D P E N P E K R A Y R R F S Y K F E R E T S V A R I S

- - - M Y K I L E F G H N E Y F G R V A K - - V E F S F P K R -- G G Y A Y L V G S F N A F N E G S F R M R E K G D R W H I V I D L P E A I W Y Y G F S L D G K Y - - T P D I E N P E R T L Y R R L S Y K F E R E V S I A R I

- - - M Y K L V S F R D S E I F G R V A E - - V E F S L I R E - - G S Y A Y L L G D F N A F N E G S F R M E Q E G K N W K I K I A L P E G V W H Y A F S I D G K F - - V L D P D N P E R R V Y T R K G Y K F H R E V N V A R I V

- - - M Y K I F G F K N D K Y L G K V A E - - V E F S M L K R - - G S Y A Y L L G N F N A F N E G S F R M R E K G D R W S I K I E L P E G V W Y Y A F S I D G D L - - M L D P E N R E K T T Y K R H S Y K F R R T V N V A K I F

- - - M Y K I F G F K D D D Y L G K V G I - - T E F S I P K R - - G S Y A Y L L G N F N A F N E G S F R M K E K G D R W Y I K V E L P E G I W Y Y A F S I D G N L - - T L D F E N N E K A V Y R R L S Y K F E K T V N V A K I F

- - - M Y K I F G F K D N D Y L G K V G I - - T E F S I P K S - - G S Y A Y L L G N F N A F N E G S F R M R E K G D R W Y I K V E L P E G I W Y Y T F S V D G N L - - I L D F E N N E K T V Y R R L S Y K F E K T V N V A K I F

- - - M Y R V L G F R D D V Y L G R V V K - - A E F S A P R E - - G E Y A Y L L G N F N A F N E G S F R M R G A G D R W V V E V E L P E G V W Y Y L F S L G G R R - - A V D P E N P E T T V Y S R R A Y K F E E R V S V A K L L

- - - M Y K I I G R E I - Y G K G R K G R Y I V K F T R H W P Q Y A K N I Y L I G E F T S L Y P G F V K L R K I E E Q G I V Y L K L W P G E Y G Y G F Q I D N D F E N V L D P D N E E K K C V H T S F F P E Y K K C L S K L V I

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

105

108

108

104

103

103

103

103

103

103

103

103

Figure 1 Continued

Archaea 5

492

500

486

492

492

488

492

496

497

497

484

526

414

410

414

414

415

597

607

593

599

599

595

599

603

604

604

591

630

521

Y F T Y N F L D N H D T E R F I D L A - G - K E R Y L C A L T F L M T Y K G I P A I F Y G D E I G L R G S - G E G M S A G R T P M S W D E E K W D F Q I L R Q T M K L I E L R R S L K S L Q - V G S F R V I G A - - G E K W F V

Y A M Y N F L D N H D T E R F L D L V - G D K R R Y L C A L A F L M T Y K G I P S I F Y G D E I G L R G R L D G G L S A G R T S M V W D R G K W D T E I F E T T K R L I R L R R G S R A L Q - L G E F V P V R F - - Q G R T M I

Y Y A Y N F L D N H D T E R F L D L V - H D E R L Y L C A L A F L M T Y K G I P A V F Y G D E I G L R G R K G G G L D A G R T P M K W R E E N W N R E I L E T T R E L I H L R R N S K A L Q - F G T F R P L L F - - R G R T I V

Y A M Y N F I D N H D T E R F I D L V - N D E R R Y L C A L A F L M T Y K G I P S I F Y G D E I G L R G K L E G G L D A G R T P M E W N P E G W N E R I L E T T R K L I E L R K R S K A L Q - L G D F I P L R F - - E G D E I I

Y S M Y N F L D N H D V E R F L D L V - G D E R R Y L C A L A F L M T Y K G I P A L F Y G D E I G L R G I G A S G M E S S R T P M K W G K E T W N T K I L R V T K A L I R L R R K S K A L Q - L G E F R P L E F - - K G G L L L

Y M M Y N F L D N H D V E R F L D L V - G D R K R Y L C A L A F L M T Y K G I P S I F Y G D E I G L S G M E G K G L E V S R T P M R W E G N Q W D T E I L K V T K A L I R L R R N S R A L Q - L G F F R P L K F - - K G R L L V

Y L M Y N F L D N H D V E R F L D I V - G D K R K Y V C A L V F L M T Y K G I P S L F Y G D E I G L R G I N L Q G M E S S R A P M L W N E E E W D Q R I L E I T K T L V K I R K N N K A L L - F G N F V P V K F - - K R K F M V

Y T M Y N F L D N H D V E R F L G L V - R D K R K Y L C A L T F L M T Y K G I P A I Y Y G D E V G L E N M D V P S M E C S R V P M E W N E K K W D K E I L K I T K E L I D L R R R S K A L Q - R G T F V P I F F - - E D K L L I

Y V M Y N F L D N H D V D R M L S L L - G D K R K Y L C A L V F L F T Y K G V P S I Y Y G D E I G M R N I E A P F M E R S R A P M E W N K K R W D F E I L N I V K E L I K L R K G S K A L Q - V G T F E P V E F - - R E G M L L

Y V M Y N F L D N H D V D R M L S L L - G D K R K Y L C A L V F L F T Y K G V P S I Y Y G N E I G M K N I E A P F M E R S R A P M E W N K K K W D K E I L K T T K E L I K L R R R S K A L Q - K G I F K P V K F - - K D K L L V

Y A M Y N F L D N H D V D R L L S L V - G D R D K Y L C A L V F L F T Y K G V P S I Y Y G D E V G L E N T D S P F M E R S R A P M R W D E S T W D K A I L E A T R A L A S L R R R S A A L Q - R G A F E P V R F - - E G G L L V

L S L Y N M L G S H D V P R I K S M V - Q N N K L L K L M Y V L I F A L P G S P V I Y Y G D E I G L E G G R D P D - - - N R R P M I W D R G N W D L E L Y E H I K K L I R I Y K S C R S M R - H G Y F L V E N L - - G S N L L F

E A A F N L L G S H D T P R I L T V C G E D V R K A K L L F L F Q L T F T G S P C I Y Y G D E I G M T G G N D P G - - - C R K C M I W D D D K Q H R G L Y E H V K Q L I A L R R Q Y R A L R - R G H I A V L H A D E Q T N Q L V

E V A F N L L D S H D T P R L L T L A K G D K K K Q K L A S L F Q F T F M G T P C I Y Y G D E V G M D G G G D P D - - - C R K C M E W D K D K Q D L D L F E F Y R R L I H I R A S H P A L R - T G T L T F L E A S R Q G T K L A

K A A F H L L D S H D T P R I L T T C K G N K N K V K L L Y V F H L S F I G S P C V Y Y G D E I G M D G G M D P G - - - C R K C M V W D E D K Q D T V L F K H I Q T L I S L R R Q Y K A F G G H G L F Q C I E A N D E Q G Y I S

E A A F N L L G S H D T S R I L T V C G G D I R K V K L L F L F Q L T F T G S P C I Y Y G D E I G M T G G N D P E - - - C R K C M V W D P M Q Q N K E L H Q H V K Q L I A L R K Q Y R S L R - R G E I S F L H A D D E M N Y L I

E V A F N L L G S H D T P R I L T T S G G S K E K L K L L F A Y Q L S F I G T P C I Y Y G D E I G M D G E Q D P G - - - C R K C M I W E E D K Q D R E L F T Y V K K L I S L R K K Y P V F G N G G D I T F I E A N D E T N H V I

598

608

594

600

600

596

600

604

605

605

592

631

522

518

523

656

637

638

644

652

656

645

654

655

660

644

696

587

581

586

588

589

Y E R K A G S E R V L V G I N C S W N D V E T P V P S N G S - - - - - - - - - - - - - - - - - - N E Q I K I P A F S S I I R V K D S M N V H I G S D L Q E

Y E R V L G D E R V R V E I R Y S M E P E D C T F H V T A S - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Y E R A I D G E S L V V A I N C S E V H V K V S L P G G - - - - - - - - - - - - - - - - - - - - - K S L N L P P L S F R I V D T G R - - - - - - - - - - -

Y E R A L G K E R V R V E I R Y T K N P E E C R F K L F L S H L K - - - - - - - - - - - - - - R K Y W K N Y S P N T S - - - - - - - - - - - - - - - - - -

Y E R V Y Q N E G V L V G I N Y S D V P T A I Q I P E A Y R P A A - - - - - - - - - - - - - D G V S F L K M K P W S F V A L A S T I - - - - - - - - - - -

Y E R I Y E K E H V L V A I N C S S R V E S V L I P E K Y R P I V - - - - - - - - - - - - - - G K T S I E L A P W S F I V V F S R F N D V Q L L S W P - -

Y K R E H M G E R T I V A I N Y S N S R V K - - - - - - - - - - - - - - - - - - - - - - - - - - E L G I T I P E Y S G V I I N E D K V K L I K Y - - - - -

Y E R V S K G E R I L I G I N Y S E K E A K I K L P E K V K I L L - - - - - - - - - - - - - G Q L H G E R L P P F S F F I S S L - - - - - - - - - - - - -

Y E R I H G E E R L L I G I N Y S E N P V S L R K S P D E I L L - - - - - - - - - - - - - - G D L E N S V L K P F S F F V G R L S - - - - - - - - - - - -

Y K R V L N N E N I L V A I N Y S K K E K H L D L P P S F E I L F - - - - - - - - - Q S G S F D R V N I R L K P F S S I I A K K L - - - - - - - - - - - -

Y R R R L G D E S I L V A I N Y S E S E A V L E E P A Q S V L F R - - - - - - - - - - - - S G S V K E K L L G P F S S V V A G D R - - - - - - - - - - - -

I K R W I N N E E I I F L L N V S S K D I S V D L K K L - - G K Y S F D I Y N E K N I D Q H V E - N N V L L R G Y G F L I L G S K P C N I - - - - - - - -

Y E K T D G D E T V V I I I N R S N Q A A D I P L P F N A K K K R L V N L L T G E R W A A E A D G L S V S L P A Y G F A L Y A V E K - - - - - - - - - - -

Y E R R L G D D I L I V L V N T E E T A Q Y F Q L A V E - - E R Q W E N V L T D A P L R A E R G I L S M K L P A F G Y A V L K A V Y - - - - - - - - - - -

Y T K T Y G E E T I F F V L N P T N Q E I S A P I P F D I T G K K I V N L Y T N E E F S A E A D S L Q V A L P P Y G F S I L K W - - - - - - - - - - - - -

Y K K T D G D E T V L V I I N R S D Q K A D I P I P L D A R G T W L V N L L T G E R F A A E A E T L C T S L P P Y G F V L Y A I E R W - - - - - - - - - -

F T K Q N S S Q K M I A V L N N S D K E L S A T L P F S L E D T K L T D L L T G K E F A A H A E K L T V T V P P Y E M A F Y L V Q E - - - - - - - - - - -

522

524

517

522

521

523

Tk1770_CDaseTHES4THEGJ

THERCLFTHEONPYRYCPYRFU

THERPATHELN

THERSPTHEPDSTAMFGBACIPBACI

BACMYGEOSE

BACIIN

Tk1770_CDaseTHES4THEGJ

THERCLFTHEONPYRYCPYRFU

THERPATHELN

THERSPTHEPDSTAMFGBACIPBACI

BACMYGEOSE

BACIIN

lowast

Figure 1: Sequence alignment of Tk1770 CDase with sixteen CD hydrolyzing enzymes. The alignment of Tk1770 CDase with archaeal and bacterial CD hydrolyzing enzymes was carried out with Clustal Omega through the UGENE package. The novel N′-domain (CBM48) in archaeal sequences is represented in red and the protruding region of the CBM48 domain by a green dotted line. The arrow shows the start of the TIM barrel domain (residues 204–584); the four conserved regions (I–IV) and a further downstream conserved region V are represented by grey lines below the sequence. The catalytic triad is indicated by asterisks. The HLH region of archaeal sequences, which is absent in all bacterial homologs, is represented by a blue dotted line.

[Figure 2 tip labels: PYRYC CDase, THEGJ MAse, THERCLF CDase, PYRFU NPase, GBACI CDase, THERSP CDase, GEOSE NPase, THEON CDase, BACIIN CDase, THERPA CDase, THES4 CDase, PBACI CDase, Tk1770 CDase, THELN NPase, STAMF α-amylase, BACMY α-CDase, THEPD α-amylase.]

Figure 2: Phylogenetic tree. A rooted radial tree of 17 CD hydrolyzing enzymes was constructed using MrBayes with the WAG rate matrix (fixed) and visualized using FigTree. The tree displays three distinct clades. All the bacterial enzymes form a single clade (shown in blue), while the branch for archaeal enzymes splits into two clades (shown in green and red). On the basis of sequence identity and domain arrangement, Tk1770 CDase seems to be more closely related to THEGJ MAse, THES4 CDase, THERCLF CDase, PYRFU NPase, THEON CDase, and PYRYC CDase (green).


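[Figure 3 panel labels: (a) domain diagram with N′-domain (CBM48), N-domain (CBM34), linker (190–203), TIM barrel, and C-domain; boundary markers at residues 1, 106, 585, and 656. (b) HLH, CBM34, CBM48. (c) Loop alignment: PYRFU NPase YTRKGYKF; Tk1770 CDase YRRPSYKF.]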

Figure 3: Structural features of Tk1770 CDase. (a) Schematic diagram of the domain arrangement within Tk1770, made with the software DOG (Domain Graph) v2.0 [32]. (b) The homology model of Tk1770 CDase, consisting of the N′- (blue), N- (yellow), catalytic (red), and C-domain (green). The catalytic domain also contains a helix-loop-helix (HLH) structure (cyan). (c) Structural alignment of the N′-domain (CBM48) of the Tk1770 CDase model (blue) and the template (4AEF) (grey), with an extension of the loop into the catalytic site. The sequence alignment between the loops of the model and the template (4AEF) suggests that the substitution of P91 and S92 in Tk1770 makes its loop rigid.


[Figure 4 residue labels: (a) VAL 376, LYS 364, SER 375, LYS 94, ASP 502, MET 546, ARG 550. (b) LYS 364, VAL 376, SER 375, MET 546, GLU 504, LYS 94, GLU 96. (c) MET 546, LYS 364, ASP 502, ARG 550, ARG 97, LYS 94.]

Figure 4: Docking of cyclodextrins into the active site of the Tk1770 CDase. (a) Complex of Tk1770 CDase with α-CD and the residues involved in interactions. (b) The docked conformation of β-CD with Tk1770 CDase. (c) The active site residues of Tk1770 interacting with γ-CD. The hydrogen bonds between substrate and amino acids are represented as red dashes.

PYRFU NPase form strong hydrogen bonds with D460 and E470 of the catalytic site. In Tk1770, S92 makes only one hydrogen bond with D460, thus reducing the interactions between the N′-domain and the catalytic domain. Moreover, S92 in Tk1770 rotates backward to form a hydrogen bond with Y93, which makes the loop more rigid. All of these factors may contribute to the decreased stability of the enzyme domains in Tk1770. Recently, it was reported that the optimum temperature for Tk1770 CDase is 65°C, which is much lower than the optimal growth temperature of T. kodakarensis (85°C) and the optimum temperature of other archaeal CD hydrolyzing enzymes [37].
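The loop substitutions discussed here can be located with a short comparison. The sketch below is illustrative only: the two loop strings are read from the panel (c) labels of Figure 3, and the assumption that the Tk1770 loop starts at Y88 is inferred from the residues named in the text (Y88, P91, S92, Y93, K94, F95). The function name `loop_substitutions` is hypothetical.

```python
# Loop sequences as read from the Figure 3(c) labels (an assumption about
# how the figure text maps onto the two enzymes).
TK1770_LOOP = "YRRPSYKF"  # Tk1770 CDase model, assumed to span residues 88-95
PYRFU_LOOP = "YTRKGYKF"   # PYRFU NPase (template 4AEF), aligned loop

FIRST_RESIDUE = 88  # assumed Tk1770 numbering of the first loop position


def loop_substitutions(model, template, start):
    """Return (position, template_residue, model_residue) for each mismatch.

    Positions use the model (Tk1770) numbering, starting at `start`.
    """
    if len(model) != len(template):
        raise ValueError("loops must be aligned to the same length")
    return [
        (start + i, t, m)
        for i, (m, t) in enumerate(zip(model, template))
        if m != t
    ]


substitutions = loop_substitutions(TK1770_LOOP, PYRFU_LOOP, FIRST_RESIDUE)
for position, template_res, model_res in substitutions:
    print(f"template {template_res} -> Tk1770 {model_res}{position}")
```

Under these assumptions the comparison flags positions 91 and 92, the two substitutions the text links to loop rigidity, along with a further difference at position 89 that the text does not discuss.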

The (βα)₈ barrel (A-domain) also contains a much larger B-domain between β-strand 3 and α-helix 3, spanning residues 306 to 403. The B-domain of all the archaeal enzymes possesses a helix-loop-helix (HLH) motif that extends to the entrance of the active site (Figure 3), but this HLH motif is absent in all five bacterial enzymes, as shown in Figure 1. It has been reported that, in order to maintain activity at high temperatures, archaea might have adopted additional structural features. These features include the N′-domain, with a loop extension into the catalytic site, and the HLH motif, which together provide all the components necessary for substrate binding and catalysis in a monomer [35, 38].

3.3. Docking of Substrates into the Catalytic Site. The docking of substrates into the catalytic site provided information about the interactions in the enzyme-substrate complex. For this purpose, AutoDock was used to dock cyclodextrins, that is, α-, β-, and γ-cyclodextrins, into the active site of the Tk1770 CDase model. All of the ligand conformations generated by AutoDock were scored on the basis of their binding affinities in kcal/mol. The best poses of α-, β-, and γ-cyclodextrins were selected, with binding energies of −8.8, −6.1, and −7.8 kcal/mol, respectively.
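To give a feel for what these scores imply, the sketch below ranks the three substrates and converts each binding energy into an approximate dissociation constant through the standard relation ΔG = RT ln K_d. It reads the reported energies as −8.8, −6.1, and −7.8 kcal/mol and assumes a temperature of 298.15 K, which the text does not state. Docking scores are rough estimates, so the resulting constants are order-of-magnitude illustrations rather than measured affinities.

```python
import math

R_KCAL = 1.987204e-3    # gas constant in kcal/(mol*K)
TEMPERATURE_K = 298.15  # assumed temperature; not given in the text

# Best-pose binding energies (kcal/mol) as reported for the docked substrates.
binding_energy = {"alpha-CD": -8.8, "beta-CD": -6.1, "gamma-CD": -7.8}


def dissociation_constant(delta_g, temperature=TEMPERATURE_K):
    """Approximate K_d (mol/L) from a binding free energy via dG = RT ln(K_d)."""
    return math.exp(delta_g / (R_KCAL * temperature))


# A more negative energy means a more favorable predicted binding.
ranked = sorted(binding_energy, key=binding_energy.get)
kd = {ligand: dissociation_constant(binding_energy[ligand]) for ligand in ranked}

for ligand in ranked:
    print(f"{ligand}: {binding_energy[ligand]:.1f} kcal/mol, K_d ~ {kd[ligand]:.1e} M")
```

On this reading, α-CD is predicted to bind most favorably, followed by γ-CD and then β-CD.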

The docking results showed that, apart from the interacting residues, a number of residues come into close proximity with the substrates, especially hydrophobic residues such as Y93, F95, F373, F374, and V376. In the case of docked α-cyclodextrin, residues D502, R550, and S375 formed hydrogen bonds with hydroxyl groups of the substrate (Figure 4). In our homology model, K94 in the loop extension of the N′-domain forms a salt bridge with E504 from the active site and might contribute to the stability of the two domains in the same manner as observed by Park et al. in amylase/neopullulanase (4AEF) from P. furiosus [38]. However, docking of β-CD showed strong interactions of K94 with hydroxyl groups of the substrate and with E504 (Figure 4). Similarly, docking of γ-CD revealed interactions of K94, R97, and K364 with the substrate (Figure 4). The amino acid K364 in the helical region of the HLH motif extends into the entrance of the active pocket right above F373 and F374 and might have a role in guiding the substrate into the active site. The aromatic amino acids Y88, Y93, and F95, which seem to form the boundary wall of the active site, and K94, which protrudes into the entrance of the catalytic site, are conserved in archaeal homologs except STAMF α-amylase.

4. Conclusion

Cyclodextrinase from the hyperthermophilic archaeon T. kodakarensis hydrolyzes cyclodextrins into linear maltodextrins. The sequence alignment of CD hydrolyzing enzymes confirmed that archaea have developed an additional N′-domain and a helix-loop-helix (HLH) motif in the B-domain that are absent in all bacterial homologs. The homology model revealed that the loop connecting β-strand 7 and β-strand 8 of the N′-domain extends into the catalytic site (A-domain) and plays an important role in substrate binding. Residues Y88, Y93, K94, F95, and R97 in the extended loop of the N′-domain of Tk1770 CDase are conserved in the CD hydrolyzing enzymes of archaea. Structural alignment between the model and the template (4AEF) indicated that P91 and S92 in the loop extension of the N′-domain of Tk1770 might decrease its flexibility and its interactions with the A-domain. This might contribute to the decreased stability of the two domains in Tk1770.

The docking studies indicated that residues K94, R97, K364, S375, D502, E504, and R550 form hydrogen bonds with the substrates. Residue K364, in the helix of the HLH motif extending to the entrance of the catalytic site, interacts with the substrate and might be involved in guiding the substrate into the catalytic site. From these results, it can be inferred that archaeal CD hydrolyzing enzymes have developed a catalytic machinery in which an extension of the N′-domain not only constitutes a part of the active pocket but also plays an important role in substrate binding.
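The residue list above can be cross-checked against the per-substrate docking description in Section 3.3. The sketch below is one reading of that description, not output from the docking software: the grouping of residues by substrate is an assumption, and E504 is counted under β-CD because the text reports it together with K94 in that complex.

```python
# Interacting residues per docked substrate, as read from the Section 3.3 text.
hbond_residues = {
    "alpha-CD": {"D502", "R550", "S375"},
    "beta-CD": {"K94", "E504"},
    "gamma-CD": {"K94", "R97", "K364"},
}

# Residues listed in the Conclusion as forming hydrogen bonds with substrates.
conclusion_list = {"K94", "R97", "K364", "S375", "D502", "E504", "R550"}

# Union over all substrates, and residues that occur with more than one substrate.
all_residues = set().union(*hbond_residues.values())
shared = {
    residue
    for residue in all_residues
    if sum(residue in group for group in hbond_residues.values()) > 1
}

print("matches conclusion:", all_residues == conclusion_list)
print("shared between substrates:", sorted(shared))
```

Under this reading, the union of the per-substrate sets reproduces the seven residues named in the Conclusion, and K94 is the only residue reported for more than one substrate.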

Conflict of Interests

The authors declare that no conflict of interests exists.

Authors’ Contribution

Ramzan Ali and Muhammad Imtiaz Shafiq contributed equally to the paper.

References

[1] R. M. Kelly, L. Dijkhuizen, and H. Leemhuis, “Starch and α-glucan acting enzymes, modulating their properties by directed evolution,” Journal of Biotechnology, vol. 140, no. 3-4, pp. 184–193, 2009.

[2] E. M. M. Del Valle, “Cyclodextrins and their uses: a review,” Process Biochemistry, vol. 39, no. 9, pp. 1033–1046, 2004.

[3] W. J. Shieh and A. R. Hedges, “Properties and applications of cyclodextrins,” Journal of Macromolecular Science, Part A: Pure and Applied Chemistry, vol. 33, no. 5, pp. 673–683, 1996.

[4] Y. Nakagawa, W. Saburi, M. Takada, Y. Hatada, and K. Horikoshi, “Gene cloning and enzymatic characteristics of a novel γ-cyclodextrin-specific cyclodextrinase from alkalophilic Bacillus clarkii 7364,” Biochimica et Biophysica Acta - Proteins and Proteomics, vol. 1784, no. 12, pp. 2004–2011, 2008.

[5] K. Uekama, F. Hirayama, and H. Arima, “Pharmaceutical applications of cyclodextrins and their derivatives,” in Cyclodextrins and Their Complexes: Chemistry, Analytical Methods, Applications, chapter 14, pp. 381–422, Wiley-VCH, 2006.

[6] 2015, http://www.cazy.org/.

[7] P. V. Aiyer, “Amylases and their applications,” African Journal of Biotechnology, vol. 4, no. 13, pp. 1525–1529, 2005.

[8] W. D. Crabb and C. Mitchinson, “Enzymes involved in the processing of starch to sugars,” Trends in Biotechnology, vol. 15, no. 9, pp. 349–352, 1997.

[9] T. Han, F. Zeng, Z. Li, et al., “Biochemical characterization of a recombinant pullulanase from Thermococcus kodakarensis KOD1,” Letters in Applied Microbiology, vol. 57, no. 4, pp. 336–343, 2013.

[10] E. A. MacGregor, “An overview of clan GH-H and distantly related families,” Biologia, vol. 60, supplement 16, pp. 5–12, 2005.

[11] P. M. de Souza and P. D. O. e Magalhaes, “Application of microbial α-amylase in industry - a review,” Brazilian Journal of Microbiology, vol. 41, no. 4, pp. 850–861, 2010.

[12] M. R. Stam, E. G. J. Danchin, C. Rancurel, P. M. Coutinho, and B. Henrissat, “Dividing the large glycoside hydrolase family 13 into subfamilies: towards improved functional annotations of α-amylase-related proteins,” Protein Engineering, Design and Selection, vol. 19, no. 12, pp. 555–562, 2006.

[13] M. Machovic and S. Janecek, “The invariant residues in the α-amylase family: just the catalytic triad,” Biologia, vol. 58, no. 6, pp. 1127–1132, 2003.

[14] S. Janecek, “How many conserved sequence regions are there in the α-amylase family?” Biologia, vol. 57, supplement 11, pp. 29–41, 2002.

[15] D. Guillen, S. Sanchez, and R. Rodríguez-Sanoja, “Carbohydrate-binding domains: multiplicity of biological roles,” Applied Microbiology and Biotechnology, vol. 85, no. 5, pp. 1241–1249, 2010.

[16] J. Matzke, A. Herrmann, E. Schneider, and E. P. Bakker, “Gene cloning, nucleotide sequence and biochemical properties of a cytoplasmic cyclomaltodextrinase (neopullulanase) from Alicyclobacillus acidocaldarius, reclassification of a group of enzymes,” FEMS Microbiology Letters, vol. 183, no. 1, pp. 55–61, 2000.

[17] K.-H. Park, T.-J. Kim, T.-K. Cheong, J.-W. Kim, B.-H. Oh, and B. Svensson, “Structure, specificity and function of cyclomaltodextrinase, a multispecific enzyme of the α-amylase family,” Biochimica et Biophysica Acta - Protein Structure and Molecular Enzymology, vol. 1478, no. 2, pp. 165–185, 2000.

[18] N. Ahmad, N. Rashid, M. S. Haider, M. Akram, and M. Akhtar, “Novel maltotriose-hydrolyzing thermoacidophilic type III pullulan hydrolase from Thermococcus kodakarensis,” Applied and Environmental Microbiology, vol. 80, no. 3, pp. 1108–1115, 2014.

[19] G. D. Haki and S. K. Rakshit, “Developments in industrially important thermostable enzymes: a review,” Bioresource Technology, vol. 89, no. 1, pp. 17–34, 2003.

[20] C. Bertoldo and G. Antranikian, “Starch-hydrolyzing enzymes from thermophilic archaea and bacteria,” Current Opinion in Chemical Biology, vol. 6, no. 2, pp. 151–160, 2002.

[21] M. W. Bauer, L. E. Driskill, and R. M. Kelly, “Glycosyl hydrolases from hyperthermophilic microorganisms,” Current Opinion in Biotechnology, vol. 9, no. 2, pp. 141–145, 1998.

[22] C. Vieille and G. J. Zeikus, “Hyperthermophilic enzymes: sources, uses, and molecular mechanisms for thermostability,” Microbiology and Molecular Biology Reviews, vol. 65, no. 1, pp. 1–43, 2001.


[23] S. K. Khare, A. Pandey, and C. Larroche, "Current perspectives in enzymatic saccharification of lignocellulosic biomass," Biochemical Engineering Journal, vol. 102, pp. 38-44, 2015.

[24] 2015, http://www.cazy.org/GH13.html.

[25] K. Okonechnikov, O. Golosova, M. Fursov et al., "Unipro UGENE: a unified bioinformatics toolkit," Bioinformatics, vol. 28, no. 8, pp. 1166-1167, 2012.

[26] F. Ronquist and J. P. Huelsenbeck, "MrBayes 3: Bayesian phylogenetic inference under mixed models," Bioinformatics, vol. 19, no. 12, pp. 1572-1574, 2003.

[27] B. Webb and A. Sali, Protein Structure Modeling with MODELLER, Protein Structure Prediction, Springer, 2014.

[28] N. Eswar, B. Webb, M. A. Marti-Renom et al., "Comparative protein structure modeling using MODELLER," in Current Protocols in Protein Science, John Wiley & Sons, 2007.

[29] M. Wiederstein and M. J. Sippl, "ProSA-web: interactive web service for the recognition of errors in three-dimensional structures of proteins," Nucleic Acids Research, vol. 35, no. 2, pp. W407-W410, 2007.

[30] R. A. Laskowski, M. W. MacArthur, D. S. Moss, and J. M. Thornton, "PROCHECK: a program to check the stereochemical quality of protein structures," Journal of Applied Crystallography, vol. 26, no. 2, pp. 283-291, 1993.

[31] W. L. DeLano, The PyMOL Molecular Graphics System, DeLano Scientific, Palo Alto, Calif, USA, 2002.

[32] J. Ren, L. Wen, X. Gao, C. Jin, Y. Xue, and X. Yao, "DOG 1.0: illustrator of protein domain structures," Cell Research, vol. 19, no. 2, pp. 271-273, 2009.

[33] G. M. Morris, H. Ruth, W. Lindstrom et al., "AutoDock4 and AutoDockTools4: automated docking with selective receptor flexibility," Journal of Computational Chemistry, vol. 30, no. 16, pp. 2785-2791, 2009.

[34] M. Machovic, B. Svensson, E. Ann MacGregor, and S. Janecek, "A new clan of CBM families based on bioinformatics of starch-binding domains from families CBM20 and CBM21," FEBS Journal, vol. 272, no. 21, pp. 5497-5513, 2005.

[35] T.-Y. Jung, D. Li, J.-T. Park et al., "Association of novel domain in active site of archaic hyperthermophilic maltogenic amylase from Staphylothermus marinus," The Journal of Biological Chemistry, vol. 287, no. 11, pp. 7979-7989, 2012.

[36] M. Machovic and S. Janecek, "Domain evolution in the GH13 pullulanase subfamily with focus on the carbohydrate-binding module family 48," Biologia, vol. 63, no. 6, pp. 1057-1068, 2008.

[37] Y. Sun, X. Lv, Z. Li, J. Wang, B. Jia, and J. Liu, "Recombinant cyclodextrinase from Thermococcus kodakarensis KOD1: expression, purification, and enzymatic characterization," Archaea, vol. 2015, Article ID 397924, 8 pages, 2015.

[38] J.-T. Park, H.-N. Song, T.-Y. Jung et al., "A novel domain arrangement in a monomeric cyclodextrin-hydrolyzing enzyme from the hyperthermophile," Biochimica et Biophysica Acta - Proteins and Proteomics, vol. 1834, no. 1, pp. 380-386, 2013.


[Figure 1: multiple sequence alignment panels for Tk1770 CDase and sixteen CD hydrolyzing enzymes, continued across two pages in the original; the residue-level alignment is not reproduced here.]

Figure 1 Sequence alignment of Tk1770 CDase with sixteen CD hydrolyzing enzymes The alignment of Tk1770 CDase with archeal andbacterial CD hydrolyzing enzymes was carried out with Clustal Omega through UGENE packageThe novel N1015840-domain (CBM48) in archealsequences is represented in red and the protruding region of CBM48 domain in green dotted line The arrow shows the start of the TIMbarrel domain (residues 204ndash584) and four conserved regions (IndashIV) with another downstream conserved region V are represented in greyline below sequence The catalytic triad is indicated through esterics The HLH region of archeal sequences that is absent in all bacterialhomologs is represented in blue dotted line

Figure 2: Phylogenetic tree. A rooted radial tree of 17 CD hydrolyzing enzymes was constructed using MrBayes with the WAG rate matrix (fixed) and visualized using FigTree. The tree displays three distinct clades. All the bacterial enzymes form a single clade (shown in blue), while the branch for archaeal enzymes splits into two clades (shown in green and red). On the basis of sequence identity and domain arrangement, Tk1770 CDase seems to be more closely related to THEGJ MAse, THES4 CDase, THERCLF CDase, PYRFU NPase, THEON CDase, and PYRYC CDase (green). Tip labels: PYRYC CDase, THEGJ MAse, THERCLF CDase, PYRFU NPase, GBACI CDase, THERSP CDase, GEOSE NPase, THEON CDase, BACIIN CDase, THERPA CDase, THES4 CDase, PBACI CDase, Tk1770 CDase, THELN NPase, STAMF α-amylase, BACMY α-CDase, and THEPD α-amylase.


Figure 3: Structural features of Tk1770 CDase. (a) Schematic diagram of the domain arrangement within Tk1770, made with the software DOG (Domain Graph) v2.0 [32]. Panel labels: N′-domain (CBM48), N-domain (CBM34), linker (190–203), TIM barrel, and C-domain, with position markers at residues 1, 106, 585, and 656. (b) The homology model of Tk1770 CDase, consisting of the N′- (blue), N- (yellow), catalytic (red), and C-domain (green). The catalytic domain also contains a helix-loop-helix (HLH) structure (cyan). (c) Structural alignment of the N′-domain (CBM48) of the Tk1770 CDase model (blue) and the template (4AEF, PYRFU NPase) (grey), showing the extension of the loop into the catalytic site. The sequence alignment between the loops of the model and the template (4AEF) suggests that the substitution of P91 and S92 in Tk1770 makes its loop rigid.


Figure 4: Docking of cyclodextrins into the active site of the Tk1770 CDase. (a) Complex of Tk1770 CDase with α-CD and the residues involved in interactions (labelled residues: Lys94, Lys364, Ser375, Val376, Asp502, Met546, and Arg550). (b) The docked conformation of β-CD with Tk1770 CDase (labelled residues: Lys94, Glu96, Lys364, Ser375, Val376, Glu504, and Met546). (c) The active site residues of Tk1770 interacting with γ-CD (labelled residues: Lys94, Arg97, Lys364, Asp502, Met546, and Arg550). The hydrogen bonds between substrate and amino acids are represented as red dashes.

PYRFU NPase form strong hydrogen bonds with D460 and E470 of the catalytic site. In Tk1770, S92 makes only one hydrogen bond with D460, thus reducing the interactions between the N′-domain and the catalytic domain. Moreover, S92 in Tk1770 rotates backward to form a hydrogen bond with Y93, which makes the loop more rigid. All of these factors may contribute to the decreased stability of the enzyme domains in Tk1770. Recently, it was reported that the optimum temperature for Tk1770 CDase is 65°C, which is much lower than the optimal growth temperature of T. kodakarensis (85°C) and the optimum temperatures of other archaeal CD hydrolyzing enzymes [37].

The (βα)₈ barrel (A-domain) also contains a much larger B-domain between β-strand 3 and α-helix 3, from residues 306 to 403. The B-domain of all the archaeal enzymes possesses a helix-loop-helix (HLH) motif that extends at the entrance of the active site (Figure 3), but this HLH motif is absent in all five bacterial enzymes, as shown in Figure 1. It has been reported that, in order to maintain activity at high temperatures, archaea might have adopted additional structural features. These features include an N′-domain with a loop extension into the catalytic site and an HLH motif, which together provide all the components necessary for substrate binding and catalysis in a monomer [35, 38].

3.3. Docking of Substrates into the Catalytic Site. The docking of substrates into the catalytic site provided information about the interactions in the enzyme-substrate complex. For this purpose, AutoDock was used to dock the cyclodextrins, that is, α-, β-, and γ-cyclodextrin, into the active site of the Tk1770 CDase model. All ligand conformations generated by AutoDock were scored on the basis of their binding affinities in kcal/mol. The best poses of α-, β-, and γ-cyclodextrin were selected, with binding energies of −8.8, −6.1, and −7.8 kcal/mol, respectively.
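To make the comparison of the three docking scores concrete, the short Python sketch below ranks the ligands and converts each score into an order-of-magnitude dissociation-type constant through ΔG = RT ln K. It is only an illustration and not part of the reported workflow: the energies are read here as −8.8, −6.1, and −7.8 kcal/mol, the temperature of 298.15 K is an assumption, and the function name `estimated_ki` is hypothetical. Docking scores are approximate, so the resulting constants should be treated as rough indications only.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def estimated_ki(delta_g_kcal, temperature_k=298.15):
    """Constant K (mol/L) implied by delta_G = R*T*ln(K).

    The temperature default is an assumption made for this sketch.
    """
    return math.exp(delta_g_kcal / (R_KCAL * temperature_k))


# Best-pose binding energies for the three cyclodextrins (kcal/mol),
# as read from the docking paragraph.
binding_energy = {"alpha-CD": -8.8, "beta-CD": -6.1, "gamma-CD": -7.8}

# Rank ligands from most to least favourable (most negative energy first).
ranked = sorted(binding_energy, key=binding_energy.get)

for ligand in ranked:
    ki = estimated_ki(binding_energy[ligand])
    print(f"{ligand}: {binding_energy[ligand]:.1f} kcal/mol, K ~ {ki:.1e} M")
```

Under these assumptions α-CD ranks first, followed by γ-CD and then β-CD; a difference of about 1.4 kcal/mol corresponds to roughly a tenfold change in the implied constant at room temperature.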

The docking results showed that, apart from the interacting residues, a number of residues come into close proximity with the substrates, especially hydrophobic residues such as Y93, F95, F373, F374, and V376. In the case of docked α-cyclodextrin, residues D502, R550, and S375 formed hydrogen bonds with hydroxyl groups of the substrate (Figure 4). In our homology model, K94 in the loop extension of the N′-domain forms a salt bridge with E504 from the active site and might contribute to the stability of the two domains in the same manner as observed by Park et al. in amylase/neopullulanase (4AEF) from P. furiosus [38]. However, docking of β-CD showed strong interactions of K94 with hydroxyl groups of the substrate and with E504 (Figure 4). Similarly, docking of γ-CD revealed interactions of K94, R97, and K364 with the substrate (Figure 4). The amino acid K364 in the helical region of the HLH motif extends into the entrance of the active pocket, right above F373 and F374, and might have a role in guiding the substrate into the active site. The aromatic amino acids Y88, Y93, and F95, which seem to form the boundary wall of the active site, and K94, which protrudes into the entrance of the catalytic site, are conserved in archaeal homologs except STAMF α-amylase.
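The residues named in the docking discussion can be placed on the domain layout with a small lookup, sketched below in Python. The linker (190–203), TIM barrel (204–584), B-domain (306–403), and C-domain (585–656) ranges are taken from the text and figure captions; the split between the N′- and N-domains at residue 106 is an assumption read from the Figure 3(a) position markers, and the names `DOMAINS`, `B_DOMAIN`, and `domain_of` are hypothetical.

```python
# Residue ranges are 1-based and inclusive.
DOMAINS = [
    ("N'-domain (CBM48)", 1, 105),      # assumed boundary
    ("N-domain (CBM34)", 106, 189),     # assumed boundary
    ("linker", 190, 203),
    ("TIM barrel (A-domain)", 204, 584),
    ("C-domain", 585, 656),
]
B_DOMAIN = (306, 403)  # inserted within the TIM barrel


def domain_of(position):
    """Return the name of the domain that contains a residue position."""
    if B_DOMAIN[0] <= position <= B_DOMAIN[1]:
        return "B-domain (within TIM barrel)"
    for name, start, end in DOMAINS:
        if start <= position <= end:
            return name
    raise ValueError(f"position {position} is outside residues 1-656")


# Residues reported to interact with the docked cyclodextrins.
residues = ["K94", "R97", "K364", "S375", "D502", "E504", "R550"]

by_domain = {}
for res in residues:
    by_domain.setdefault(domain_of(int(res[1:])), []).append(res)

for name, members in by_domain.items():
    print(f"{name}: {', '.join(members)}")
```

With these ranges, K94 and R97 fall in the N′-domain loop extension, K364 and S375 in the B-domain that carries the HLH motif, and D502, E504, and R550 in the rest of the TIM barrel, which matches the description of an active pocket assembled from several domains.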

4. Conclusion

Cyclodextrinase from the hyperthermophilic archaeon T. kodakarensis hydrolyzes cyclodextrins into linear maltodextrins. The sequence alignment of CD hydrolyzing enzymes confirmed that archaeal enzymes have an additional N′-domain and a helix-loop-helix (HLH) motif in the B-domain, both of which are absent in all bacterial homologs. The homology model revealed that the loop connecting β-strand 7 and β-strand 8 of the N′-domain extends into the catalytic site (A-domain) and plays an important role in substrate binding. Residues Y88, Y93, K94, F95, and R97 in the extended loop of the N′-domain of Tk1770 CDase are conserved in the CD hydrolyzing enzymes of archaea. Structural alignment between the model and the template (4AEF) indicated that P91 and S92 in the loop extension of the N′-domain of Tk1770 might decrease its flexibility and its interactions with the A-domain. This might contribute to the decreased stability of the two domains in Tk1770.

The docking studies indicated that residues K94, R97, K364, S375, D502, E504, and R550 form hydrogen bonds with the substrates. Residue K364, in the helix of the HLH motif that extends at the entrance of the catalytic site, interacts with the substrate and might be involved in guiding it into the catalytic site. From these results, it can be inferred that archaeal CD hydrolyzing enzymes have developed a catalytic machinery in which an extension of the N′-domain not only constitutes a part of the active pocket but also plays an important role in substrate binding.

Conflict of Interests

The authors declare that no conflict of interests exists.

Authors’ Contribution

Ramzan Ali and Muhammad Imtiaz Shafiq contributed equally to the paper.

References

[1] R. M. Kelly, L. Dijkhuizen, and H. Leemhuis, “Starch and α-glucan acting enzymes, modulating their properties by directed evolution,” Journal of Biotechnology, vol. 140, no. 3-4, pp. 184–193, 2009.

[2] E. M. M. Del Valle, “Cyclodextrins and their uses: a review,” Process Biochemistry, vol. 39, no. 9, pp. 1033–1046, 2004.

[3] W. J. Shieh and A. R. Hedges, “Properties and applications of cyclodextrins,” Journal of Macromolecular Science Part A: Pure and Applied Chemistry, vol. 33, no. 5, pp. 673–683, 1996.

[4] Y. Nakagawa, W. Saburi, M. Takada, Y. Hatada, and K. Horikoshi, “Gene cloning and enzymatic characteristics of a novel γ-cyclodextrin-specific cyclodextrinase from alkalophilic Bacillus clarkii 7364,” Biochimica et Biophysica Acta - Proteins and Proteomics, vol. 1784, no. 12, pp. 2004–2011, 2008.

[5] K. Uekama, F. Hirayama, and H. Arima, “Pharmaceutical applications of cyclodextrins and their derivatives,” in Cyclodextrins and Their Complexes: Chemistry, Analytical Methods, Applications, chapter 14, pp. 381–422, Wiley-VCH, 2006.

[6] 2015, http://www.cazy.org/.

[7] P. V. Aiyer, “Amylases and their applications,” African Journal of Biotechnology, vol. 4, no. 13, pp. 1525–1529, 2005.

[8] W. D. Crabb and C. Mitchinson, “Enzymes involved in the processing of starch to sugars,” Trends in Biotechnology, vol. 15, no. 9, pp. 349–352, 1997.

[9] T. Han, F. Zeng, Z. Li et al., “Biochemical characterization of a recombinant pullulanase from Thermococcus kodakarensis KOD1,” Letters in Applied Microbiology, vol. 57, no. 4, pp. 336–343, 2013.

[10] E. A. MacGregor, “An overview of clan GH-H and distantly related families,” Biologia, vol. 60, supplement 16, pp. 5–12, 2005.

[11] P. M. de Souza and P. D. O. e Magalhaes, “Application of microbial α-amylase in industry - a review,” Brazilian Journal of Microbiology, vol. 41, no. 4, pp. 850–861, 2010.

[12] M. R. Stam, E. G. J. Danchin, C. Rancurel, P. M. Coutinho, and B. Henrissat, “Dividing the large glycoside hydrolase family 13 into subfamilies: towards improved functional annotations of α-amylase-related proteins,” Protein Engineering, Design and Selection, vol. 19, no. 12, pp. 555–562, 2006.

[13] M. Machovic and S. Janecek, “The invariant residues in the α-amylase family: just the catalytic triad,” Biologia, vol. 58, no. 6, pp. 1127–1132, 2003.

[14] S. Janecek, “How many conserved sequence regions are there in the α-amylase family?” Biologia, vol. 57, supplement 11, pp. 29–41, 2002.

[15] D. Guillen, S. Sanchez, and R. Rodríguez-Sanoja, “Carbohydrate-binding domains: multiplicity of biological roles,” Applied Microbiology and Biotechnology, vol. 85, no. 5, pp. 1241–1249, 2010.

[16] J. Matzke, A. Herrmann, E. Schneider, and E. P. Bakker, “Gene cloning, nucleotide sequence, and biochemical properties of a cytoplasmic cyclomaltodextrinase (neopullulanase) from Alicyclobacillus acidocaldarius: reclassification of a group of enzymes,” FEMS Microbiology Letters, vol. 183, no. 1, pp. 55–61, 2000.

[17] K.-H. Park, T.-J. Kim, T.-K. Cheong, J.-W. Kim, B.-H. Oh, and B. Svensson, “Structure, specificity and function of cyclomaltodextrinase, a multispecific enzyme of the α-amylase family,” Biochimica et Biophysica Acta - Protein Structure and Molecular Enzymology, vol. 1478, no. 2, pp. 165–185, 2000.

[18] N. Ahmad, N. Rashid, M. S. Haider, M. Akram, and M. Akhtar, “Novel maltotriose-hydrolyzing thermoacidophilic type III pullulan hydrolase from Thermococcus kodakarensis,” Applied and Environmental Microbiology, vol. 80, no. 3, pp. 1108–1115, 2014.

[19] G. D. Haki and S. K. Rakshit, “Developments in industrially important thermostable enzymes: a review,” Bioresource Technology, vol. 89, no. 1, pp. 17–34, 2003.

[20] C. Bertoldo and G. Antranikian, “Starch-hydrolyzing enzymes from thermophilic archaea and bacteria,” Current Opinion in Chemical Biology, vol. 6, no. 2, pp. 151–160, 2002.

[21] M. W. Bauer, L. E. Driskill, and R. M. Kelly, “Glycosyl hydrolases from hyperthermophilic microorganisms,” Current Opinion in Biotechnology, vol. 9, no. 2, pp. 141–145, 1998.

[22] C. Vieille and G. J. Zeikus, “Hyperthermophilic enzymes: sources, uses, and molecular mechanisms for thermostability,” Microbiology and Molecular Biology Reviews, vol. 65, no. 1, pp. 1–43, 2001.


[23] S. K. Khare, A. Pandey, and C. Larroche, “Current perspectives in enzymatic saccharification of lignocellulosic biomass,” Biochemical Engineering Journal, vol. 102, pp. 38–44, 2015.

[24] 2015, http://www.cazy.org/GH13.html.

[25] K. Okonechnikov, O. Golosova, M. Fursov et al., “Unipro UGENE: a unified bioinformatics toolkit,” Bioinformatics, vol. 28, no. 8, pp. 1166–1167, 2012.

[26] F. Ronquist and J. P. Huelsenbeck, “MrBayes 3: Bayesian phylogenetic inference under mixed models,” Bioinformatics, vol. 19, no. 12, pp. 1572–1574, 2003.

[27] B. Webb and A. Sali, Protein Structure Modeling with MODELLER, Protein Structure Prediction, Springer, 2014.

[28] N. Eswar, B. Webb, M. A. Marti-Renom et al., “Comparative protein structure modeling using MODELLER,” in Current Protocols in Protein Science, John Wiley & Sons, 2007.

[29] M. Wiederstein and M. J. Sippl, “ProSA-web: interactive web service for the recognition of errors in three-dimensional structures of proteins,” Nucleic Acids Research, vol. 35, no. 2, pp. W407–W410, 2007.

[30] R. A. Laskowski, M. W. MacArthur, D. S. Moss, and J. M. Thornton, “PROCHECK: a program to check the stereochemical quality of protein structures,” Journal of Applied Crystallography, vol. 26, no. 2, pp. 283–291, 1993.

[31] W. L. DeLano, The PyMOL Molecular Graphics System, DeLano Scientific, Palo Alto, Calif, USA, 2002.

[32] J. Ren, L. Wen, X. Gao, C. Jin, Y. Xue, and X. Yao, “DOG 1.0: illustrator of protein domain structures,” Cell Research, vol. 19, no. 2, pp. 271–273, 2009.

[33] G. M. Morris, R. Huey, W. Lindstrom et al., “AutoDock4 and AutoDockTools4: automated docking with selective receptor flexibility,” Journal of Computational Chemistry, vol. 30, no. 16, pp. 2785–2791, 2009.

[34] M. Machovic, B. Svensson, E. Ann MacGregor, and S. Janecek, “A new clan of CBM families based on bioinformatics of starch-binding domains from families CBM20 and CBM21,” FEBS Journal, vol. 272, no. 21, pp. 5497–5513, 2005.

[35] T.-Y. Jung, D. Li, J.-T. Park et al., “Association of novel domain in active site of archaic hyperthermophilic maltogenic amylase from Staphylothermus marinus,” The Journal of Biological Chemistry, vol. 287, no. 11, pp. 7979–7989, 2012.

[36] M. Machovic and S. Janecek, “Domain evolution in the GH13 pullulanase subfamily with focus on the carbohydrate-binding module family 48,” Biologia, vol. 63, no. 6, pp. 1057–1068, 2008.

[37] Y. Sun, X. Lv, Z. Li, J. Wang, B. Jia, and J. Liu, “Recombinant cyclodextrinase from Thermococcus kodakarensis KOD1: expression, purification, and enzymatic characterization,” Archaea, vol. 2015, Article ID 397924, 8 pages, 2015.

[38] J.-T. Park, H.-N. Song, T.-Y. Jung et al., “A novel domain arrangement in a monomeric cyclodextrin-hydrolyzing enzyme from the hyperthermophile,” Biochimica et Biophysica Acta - Proteins and Proteomics, vol. 1834, no. 1, pp. 380–386, 2013.


Archaea 5

492

500

486

492

492

488

492

496

497

497

484

526

414

410

414

414

415

597

607

593

599

599

595

599

603

604

604

591

630

521

Y F T Y N F L D N H D T E R F I D L A - G - K E R Y L C A L T F L M T Y K G I P A I F Y G D E I G L R G S - G E G M S A G R T P M S W D E E K W D F Q I L R Q T M K L I E L R R S L K S L Q - V G S F R V I G A - - G E K W F V

Y A M Y N F L D N H D T E R F L D L V - G D K R R Y L C A L A F L M T Y K G I P S I F Y G D E I G L R G R L D G G L S A G R T S M V W D R G K W D T E I F E T T K R L I R L R R G S R A L Q - L G E F V P V R F - - Q G R T M I

Y Y A Y N F L D N H D T E R F L D L V - H D E R L Y L C A L A F L M T Y K G I P A V F Y G D E I G L R G R K G G G L D A G R T P M K W R E E N W N R E I L E T T R E L I H L R R N S K A L Q - F G T F R P L L F - - R G R T I V

Y A M Y N F I D N H D T E R F I D L V - N D E R R Y L C A L A F L M T Y K G I P S I F Y G D E I G L R G K L E G G L D A G R T P M E W N P E G W N E R I L E T T R K L I E L R K R S K A L Q - L G D F I P L R F - - E G D E I I

Y S M Y N F L D N H D V E R F L D L V - G D E R R Y L C A L A F L M T Y K G I P A L F Y G D E I G L R G I G A S G M E S S R T P M K W G K E T W N T K I L R V T K A L I R L R R K S K A L Q - L G E F R P L E F - - K G G L L L

Y M M Y N F L D N H D V E R F L D L V - G D R K R Y L C A L A F L M T Y K G I P S I F Y G D E I G L S G M E G K G L E V S R T P M R W E G N Q W D T E I L K V T K A L I R L R R N S R A L Q - L G F F R P L K F - - K G R L L V

Y L M Y N F L D N H D V E R F L D I V - G D K R K Y V C A L V F L M T Y K G I P S L F Y G D E I G L R G I N L Q G M E S S R A P M L W N E E E W D Q R I L E I T K T L V K I R K N N K A L L - F G N F V P V K F - - K R K F M V

Y T M Y N F L D N H D V E R F L G L V - R D K R K Y L C A L T F L M T Y K G I P A I Y Y G D E V G L E N M D V P S M E C S R V P M E W N E K K W D K E I L K I T K E L I D L R R R S K A L Q - R G T F V P I F F - - E D K L L I

Y V M Y N F L D N H D V D R M L S L L - G D K R K Y L C A L V F L F T Y K G V P S I Y Y G D E I G M R N I E A P F M E R S R A P M E W N K K R W D F E I L N I V K E L I K L R K G S K A L Q - V G T F E P V E F - - R E G M L L

Y V M Y N F L D N H D V D R M L S L L - G D K R K Y L C A L V F L F T Y K G V P S I Y Y G N E I G M K N I E A P F M E R S R A P M E W N K K K W D K E I L K T T K E L I K L R R R S K A L Q - K G I F K P V K F - - K D K L L V

Y A M Y N F L D N H D V D R L L S L V - G D R D K Y L C A L V F L F T Y K G V P S I Y Y G D E V G L E N T D S P F M E R S R A P M R W D E S T W D K A I L E A T R A L A S L R R R S A A L Q - R G A F E P V R F - - E G G L L V

L S L Y N M L G S H D V P R I K S M V - Q N N K L L K L M Y V L I F A L P G S P V I Y Y G D E I G L E G G R D P D - - - N R R P M I W D R G N W D L E L Y E H I K K L I R I Y K S C R S M R - H G Y F L V E N L - - G S N L L F

E A A F N L L G S H D T P R I L T V C G E D V R K A K L L F L F Q L T F T G S P C I Y Y G D E I G M T G G N D P G - - - C R K C M I W D D D K Q H R G L Y E H V K Q L I A L R R Q Y R A L R - R G H I A V L H A D E Q T N Q L V

E V A F N L L D S H D T P R L L T L A K G D K K K Q K L A S L F Q F T F M G T P C I Y Y G D E V G M D G G G D P D - - - C R K C M E W D K D K Q D L D L F E F Y R R L I H I R A S H P A L R - T G T L T F L E A S R Q G T K L A

K A A F H L L D S H D T P R I L T T C K G N K N K V K L L Y V F H L S F I G S P C V Y Y G D E I G M D G G M D P G - - - C R K C M V W D E D K Q D T V L F K H I Q T L I S L R R Q Y K A F G G H G L F Q C I E A N D E Q G Y I S

E A A F N L L G S H D T S R I L T V C G G D I R K V K L L F L F Q L T F T G S P C I Y Y G D E I G M T G G N D P E - - - C R K C M V W D P M Q Q N K E L H Q H V K Q L I A L R K Q Y R S L R - R G E I S F L H A D D E M N Y L I

E V A F N L L G S H D T P R I L T T S G G S K E K L K L L F A Y Q L S F I G T P C I Y Y G D E I G M D G E Q D P G - - - C R K C M I W E E D K Q D R E L F T Y V K K L I S L R K K Y P V F G N G G D I T F I E A N D E T N H V I

598

608

594

600

600

596

600

604

605

605

592

631

522

518

523

656

637

638

644

652

656

645

654

655

660

644

696

587

581

586

588

589

Y E R K A G S E R V L V G I N C S W N D V E T P V P S N G S - - - - - - - - - - - - - - - - - - N E Q I K I P A F S S I I R V K D S M N V H I G S D L Q E

Y E R V L G D E R V R V E I R Y S M E P E D C T F H V T A S - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Y E R A I D G E S L V V A I N C S E V H V K V S L P G G - - - - - - - - - - - - - - - - - - - - - K S L N L P P L S F R I V D T G R - - - - - - - - - - -

Y E R A L G K E R V R V E I R Y T K N P E E C R F K L F L S H L K - - - - - - - - - - - - - - R K Y W K N Y S P N T S - - - - - - - - - - - - - - - - - -

Y E R V Y Q N E G V L V G I N Y S D V P T A I Q I P E A Y R P A A - - - - - - - - - - - - - D G V S F L K M K P W S F V A L A S T I - - - - - - - - - - -

Y E R I Y E K E H V L V A I N C S S R V E S V L I P E K Y R P I V - - - - - - - - - - - - - - G K T S I E L A P W S F I V V F S R F N D V Q L L S W P - -

Y K R E H M G E R T I V A I N Y S N S R V K - - - - - - - - - - - - - - - - - - - - - - - - - - E L G I T I P E Y S G V I I N E D K V K L I K Y - - - - -

Y E R V S K G E R I L I G I N Y S E K E A K I K L P E K V K I L L - - - - - - - - - - - - - G Q L H G E R L P P F S F F I S S L - - - - - - - - - - - - -

Y E R I H G E E R L L I G I N Y S E N P V S L R K S P D E I L L - - - - - - - - - - - - - - G D L E N S V L K P F S F F V G R L S - - - - - - - - - - - -

Y K R V L N N E N I L V A I N Y S K K E K H L D L P P S F E I L F - - - - - - - - - Q S G S F D R V N I R L K P F S S I I A K K L - - - - - - - - - - - -

Y R R R L G D E S I L V A I N Y S E S E A V L E E P A Q S V L F R - - - - - - - - - - - - S G S V K E K L L G P F S S V V A G D R - - - - - - - - - - - -

I K R W I N N E E I I F L L N V S S K D I S V D L K K L - - G K Y S F D I Y N E K N I D Q H V E - N N V L L R G Y G F L I L G S K P C N I - - - - - - - -

Y E K T D G D E T V V I I I N R S N Q A A D I P L P F N A K K K R L V N L L T G E R W A A E A D G L S V S L P A Y G F A L Y A V E K - - - - - - - - - - -

Y E R R L G D D I L I V L V N T E E T A Q Y F Q L A V E - - E R Q W E N V L T D A P L R A E R G I L S M K L P A F G Y A V L K A V Y - - - - - - - - - - -

Y T K T Y G E E T I F F V L N P T N Q E I S A P I P F D I T G K K I V N L Y T N E E F S A E A D S L Q V A L P P Y G F S I L K W - - - - - - - - - - - - -

Y K K T D G D E T V L V I I N R S D Q K A D I P I P L D A R G T W L V N L L T G E R F A A E A E T L C T S L P P Y G F V L Y A I E R W - - - - - - - - - -

F T K Q N S S Q K M I A V L N N S D K E L S A T L P F S L E D T K L T D L L T G K E F A A H A E K L T V T V P P Y E M A F Y L V Q E - - - - - - - - - - -

522

524

517

522

521

523

Tk1770_CDaseTHES4THEGJ

THERCLFTHEONPYRYCPYRFU

THERPATHELN

THERSPTHEPDSTAMFGBACIPBACI

BACMYGEOSE

BACIIN

Tk1770_CDaseTHES4THEGJ

THERCLFTHEONPYRYCPYRFU

THERPATHELN

THERSPTHEPDSTAMFGBACIPBACI

BACMYGEOSE

BACIIN

lowast

Figure 1 Sequence alignment of Tk1770 CDase with sixteen CD hydrolyzing enzymes The alignment of Tk1770 CDase with archeal andbacterial CD hydrolyzing enzymes was carried out with Clustal Omega through UGENE packageThe novel N1015840-domain (CBM48) in archealsequences is represented in red and the protruding region of CBM48 domain in green dotted line The arrow shows the start of the TIMbarrel domain (residues 204ndash584) and four conserved regions (IndashIV) with another downstream conserved region V are represented in greyline below sequence The catalytic triad is indicated through esterics The HLH region of archeal sequences that is absent in all bacterialhomologs is represented in blue dotted line

PYRYC CDase THEG

J MAse

THERCLF CDase

PYRF

U N

Pase

GBACI CDase

THERSP CDase

GEO

SE NPase

THEO

N CD

ase

BACIIN

CDase

THERPA CDase

THES4 CDase

PBACI CDase

Tk1770 CDase

THELN NPase

STAMF 120572-amylase

BACM

Y120572

-CD

ase

THEPD120572-amylase

Figure 2 Phylogenetic tree rooted radial tree of 17 CD hydrolyzing enzymes was constructed using MrBayes with Wag rate matrix (fixed)and visualized using FigTree The phylogenetic tree obtained displays three distinct clades All the bacterial enzymes form a single clade(shown in blue) while the branch for archeal enzymes split into two clades (shown in green and red) Depending upon sequence identity anddomain arrangement Tk1770 CDase seems to be more closely related to THEGJ MAse THES4 CDase THERCLF CDase PYRFU NPaseTHEON CDase and PYRYC CDase (green)

6 Archaea

TIM barrel C-domain

1 106 585 656

N-domainN998400-domain(CBM48) (CBM34)

Linker 190ndash203

(a)

HLH

CBM34

CBM48

(b)

PYRFU NPaseYYRTRRPKSGYYKKFF

Tk1770 CDase

(c)

Figure 3 Structural features of Tk1770 CDase (a) The schematic diagram representing the domain arrangement within Tk1770 made withsoftware DOG (Domain Graph) v20 [32] (b) The homology model of Tk1770 CDase consisting of N1015840- (blue) N- (yellow) catalytic (red)and C-domain (green) The catalytic domain also contains helix-loop-helix (HLH) structure (cyan) (c) Structural alignment of N1015840-domain(CBM48) of Tk1770 CDase (blue) model and template (4AEF) (grey) with an extension of loop into the catalytic siteThe sequence alignmentbetween the loops of model and template (4AEF) suggests that the substitution of P91 and S92 in Tk1770 makes its loop rigid

Archaea 7

VAL 376LYS 364

SER 375

LYS 94

ASP 502

MET ARG 550546

(a)

LYS 364VAL 376

SER 375 MET 546

GLU 504

LYS 94

GLU 96

(b)

MET LYS 364

ASP 502

ARG 550

ARG 97 LYS 94

546

(c)

Figure 4 Docking of cyclodextrins into the active site of the Tk1770 CDase (a) Complex of Tk1770 CDase with 120572-CD and residues involvedin interactions (b) The docked conformation and 120573-CD with TK1770 CDase (c) The active site residues of Tk1770 interacting with 120574-CDThe hydrogen bonds between substrate and amino acids are represented as red dashes

PYRFUNPase form strong hydrogen bondingwithD460 andE470 of catalytic site In Tk1770 S92makes only one hydrogenbond with D460 thus reducing the interactions between theN1015840-domain and catalytic domain Moreover S92 in Tk1770rotates backward to formhydrogen bondwithY93 thatmakesthe loop more rigid All of these factors may contribute tothe decreased stability of the enzyme domains in Tk1770Recently it was reported that the optimum temperature forTk1770 CDase is 65∘C which is much lower as compared tooptimal growth temperature for T kodakarensis (85∘C) andthe optimum temperature for other archeal CD hydrolyzingenzymes [37]

The (120573120572)8barrel (A-domain) also contains a much

larger B-domain between 120573-strand 3 and alpha-helix 3 fromresidues 306 to 403The B-domain of all the archeal enzymespossesses a helix-loop-helix (HLH) motif that extends atthe entrance of active site (Figure 3) but this HLH motifis absent in all five bacterial enzymes as shown in Figure 1It has been reported that in order to maintain activity athigh temperatures archaea might have adapted additionalstructural features These features include N1015840-domain witha loop extension into the catalytic site and HLH motif thatprovides all necessary components for substrate binding andcatalysis in a monomer [35 38]

33 Docking of Substrates into the Catalytic Site The dockingof substrates into the catalytic site provided information

about the interactions in enzyme-substrate complex For thispurpose AutoDock was used to dock cyclodextrins thatis 120572- 120573- and 120574-cyclodextrins into the active site of theTk1770 CDase model All of the conformations of ligandsgenerated by the AutoDock were scored on the basis of theirbinding affinities in kcalmol The best poses of 120572- 120573- and120574-cyclodextrins were selected with binding energies of minus88minus61 and minus78 kcalmol respectively

The docking results showed that apart from interactingresidues a number of residues come in close proximity withsubstrates especially hydrophobic residues like Y93 F95F373 F374 and V376 In case of docked 120572-cyclodextrinresidues D502 R550 and S375 formed hydrogen bonds withhydroxyl groups of substrate (Figure 4) In our homologymodel K94 in the loop extension of N1015840-domain forms a saltbridge with E504 from the active site andmight contribute tothe stability of two domains in the same manner as observedby Park et al in amylaseneopullulanase (4AEF) from Pfuriosus [38] However docking of 120573-CD showed stronginteractions of K94 with hydroxyl groups of substrate andwith E504 (Figure 4) Similarly docking of 120574-CD revealedinteractions of K94 R97 and K364 with substrate (Figure 4)The amino acid K364 in helical region of HLHmotif extendsinto the entrance of active pocket right above the F373 andF374 and might have a role in guiding the substrate into theactive site The aromatic amino acids Y88 Y93 and F95 thatseem to be forming boundary wall of the active site and K94

8 Archaea

protruding into the entrance of catalytic site are conserved inarcheal homologs except STAMF 120572-amylase

4 Conclusion

Cyclodextrinase from hyperthermophilic archaea T kodaka-rensis hydrolyzes cyclodextrins into linearmaltodextrinsThesequence alignment of CD hydrolyzing enzymes confirmedthat archaea have developed an additional N1015840-domain and ahelix-loop-helix (HLH) motif in the B-domain that is absentin all bacterial homologs The homology model constructedrevealed that loop connecting 120573-strand 7 and 120573-strand 8of N1015840-domain extends into the catalytic site (A-domain)and plays an important role in substrate binding ResiduesY88 Y93 K94 F95 and R97 in extended loop of N1015840-domain of Tk1770 CDase are conserved in CD hydrolyzingenzymes of archaea Structural alignment betweenmodel andtemplate (4AEF) indicated that P91 and S92 in loop extensionof N1015840-domain of Tk1770 might decrease its flexibility andinteractions with A-domain This might contribute to thedecreased stability of two domains in Tk1770

The docking studies indicated that residues K94 R97K364 S375 D502 E504 andR550 formhydrogen bondswithsubstrates Residue K364 in the helix of HLHmotif extendingat the entrance of the catalytic site interacts with substrate andmight be involved in guiding the substrate into the catalyticsite From these results it can be inferred that archeal CDhydrolyzing enzymes have developed catalytic machinery inwhich an extension of N1015840-domain not only constitutes a partof active pocket but also plays an important role in substratebinding

Conflict of Interests

The authors declare that no conflict of interests exists.

Authors' Contribution

Ramzan Ali and Muhammad Imtiaz Shafiq contributed equally to the paper.

References

[1] R. M. Kelly, L. Dijkhuizen, and H. Leemhuis, “Starch and α-glucan acting enzymes, modulating their properties by directed evolution,” Journal of Biotechnology, vol. 140, no. 3-4, pp. 184–193, 2009.

[2] E. M. M. Del Valle, “Cyclodextrins and their uses: a review,” Process Biochemistry, vol. 39, no. 9, pp. 1033–1046, 2004.

[3] W. J. Shieh and A. R. Hedges, “Properties and applications of cyclodextrins,” Journal of Macromolecular Science Part A: Pure and Applied Chemistry, vol. 33, no. 5, pp. 673–683, 1996.

[4] Y. Nakagawa, W. Saburi, M. Takada, Y. Hatada, and K. Horikoshi, “Gene cloning and enzymatic characteristics of a novel γ-cyclodextrin-specific cyclodextrinase from alkalophilic Bacillus clarkii 7364,” Biochimica et Biophysica Acta - Proteins and Proteomics, vol. 1784, no. 12, pp. 2004–2011, 2008.

[5] K. Uekama, F. Hirayama, and H. Arima, “Pharmaceutical applications of cyclodextrins and their derivatives,” in Cyclodextrins and Their Complexes: Chemistry, Analytical Methods, Applications, chapter 14, pp. 381–422, Wiley-VCH, 2006.

[6] 2015, http://www.cazy.org/.

[7] P. V. Aiyer, “Amylases and their applications,” African Journal of Biotechnology, vol. 4, no. 13, pp. 1525–1529, 2005.

[8] W. D. Crabb and C. Mitchinson, “Enzymes involved in the processing of starch to sugars,” Trends in Biotechnology, vol. 15, no. 9, pp. 349–352, 1997.

[9] T. Han, F. Zeng, Z. Li, et al., “Biochemical characterization of a recombinant pullulanase from Thermococcus kodakarensis KOD1,” Letters in Applied Microbiology, vol. 57, no. 4, pp. 336–343, 2013.

[10] E. A. MacGregor, “An overview of clan GH-H and distantly related families,” Biologia, vol. 60, supplement 16, pp. 5–12, 2005.

[11] P. M. de Souza and P. D. O. e Magalhaes, “Application of microbial α-amylase in industry - a review,” Brazilian Journal of Microbiology, vol. 41, no. 4, pp. 850–861, 2010.

[12] M. R. Stam, E. G. J. Danchin, C. Rancurel, P. M. Coutinho, and B. Henrissat, “Dividing the large glycoside hydrolase family 13 into subfamilies: towards improved functional annotations of α-amylase-related proteins,” Protein Engineering, Design and Selection, vol. 19, no. 12, pp. 555–562, 2006.

[13] M. Machovic and S. Janecek, “The invariant residues in the α-amylase family: just the catalytic triad,” Biologia, vol. 58, no. 6, pp. 1127–1132, 2003.

[14] S. Janecek, “How many conserved sequence regions are there in the α-amylase family?” Biologia, vol. 57, supplement 11, pp. 29–41, 2002.

[15] D. Guillen, S. Sanchez, and R. Rodríguez-Sanoja, “Carbohydrate-binding domains: multiplicity of biological roles,” Applied Microbiology and Biotechnology, vol. 85, no. 5, pp. 1241–1249, 2010.

[16] J. Matzke, A. Herrmann, E. Schneider, and E. P. Bakker, “Gene cloning, nucleotide sequence and biochemical properties of a cytoplasmic cyclomaltodextrinase (neopullulanase) from Alicyclobacillus acidocaldarius: reclassification of a group of enzymes,” FEMS Microbiology Letters, vol. 183, no. 1, pp. 55–61, 2000.

[17] K.-H. Park, T.-J. Kim, T.-K. Cheong, J.-W. Kim, B.-H. Oh, and B. Svensson, “Structure, specificity and function of cyclomaltodextrinase, a multispecific enzyme of the α-amylase family,” Biochimica et Biophysica Acta - Protein Structure and Molecular Enzymology, vol. 1478, no. 2, pp. 165–185, 2000.

[18] N. Ahmad, N. Rashid, M. S. Haider, M. Akram, and M. Akhtar, “Novel maltotriose-hydrolyzing thermoacidophilic type III pullulan hydrolase from Thermococcus kodakarensis,” Applied and Environmental Microbiology, vol. 80, no. 3, pp. 1108–1115, 2014.

[19] G. D. Haki and S. K. Rakshit, “Developments in industrially important thermostable enzymes: a review,” Bioresource Technology, vol. 89, no. 1, pp. 17–34, 2003.

[20] C. Bertoldo and G. Antranikian, “Starch-hydrolyzing enzymes from thermophilic archaea and bacteria,” Current Opinion in Chemical Biology, vol. 6, no. 2, pp. 151–160, 2002.

[21] M. W. Bauer, L. E. Driskill, and R. M. Kelly, “Glycosyl hydrolases from hyperthermophilic microorganisms,” Current Opinion in Biotechnology, vol. 9, no. 2, pp. 141–145, 1998.

[22] C. Vieille and G. J. Zeikus, “Hyperthermophilic enzymes: sources, uses, and molecular mechanisms for thermostability,” Microbiology and Molecular Biology Reviews, vol. 65, no. 1, pp. 1–43, 2001.

[23] S. K. Khare, A. Pandey, and C. Larroche, “Current perspectives in enzymatic saccharification of lignocellulosic biomass,” Biochemical Engineering Journal, vol. 102, pp. 38–44, 2015.

[24] 2015, http://www.cazy.org/GH13.html.

[25] K. Okonechnikov, O. Golosova, M. Fursov, et al., “Unipro UGENE: a unified bioinformatics toolkit,” Bioinformatics, vol. 28, no. 8, pp. 1166–1167, 2012.

[26] F. Ronquist and J. P. Huelsenbeck, “MrBayes 3: Bayesian phylogenetic inference under mixed models,” Bioinformatics, vol. 19, no. 12, pp. 1572–1574, 2003.

[27] B. Webb and A. Sali, Protein Structure Modeling with MODELLER, Protein Structure Prediction, Springer, 2014.

[28] N. Eswar, B. Webb, M. A. Marti-Renom, et al., “Comparative protein structure modeling using MODELLER,” in Current Protocols in Protein Science, John Wiley & Sons, 2007.

[29] M. Wiederstein and M. J. Sippl, “ProSA-web: interactive web service for the recognition of errors in three-dimensional structures of proteins,” Nucleic Acids Research, vol. 35, no. 2, pp. W407–W410, 2007.

[30] R. A. Laskowski, M. W. MacArthur, D. S. Moss, and J. M. Thornton, “PROCHECK: a program to check the stereochemical quality of protein structures,” Journal of Applied Crystallography, vol. 26, no. 2, pp. 283–291, 1993.

[31] W. L. DeLano, The PyMOL Molecular Graphics System, DeLano Scientific, Palo Alto, Calif, USA, 2002.

[32] J. Ren, L. Wen, X. Gao, C. Jin, Y. Xue, and X. Yao, “DOG 1.0: illustrator of protein domain structures,” Cell Research, vol. 19, no. 2, pp. 271–273, 2009.

[33] G. M. Morris, H. Ruth, W. Lindstrom, et al., “AutoDock4 and AutoDockTools4: automated docking with selective receptor flexibility,” Journal of Computational Chemistry, vol. 30, no. 16, pp. 2785–2791, 2009.

[34] M. Machovic, B. Svensson, E. Ann MacGregor, and S. Janecek, “A new clan of CBM families based on bioinformatics of starch-binding domains from families CBM20 and CBM21,” FEBS Journal, vol. 272, no. 21, pp. 5497–5513, 2005.

[35] T.-Y. Jung, D. Li, J.-T. Park, et al., “Association of novel domain in active site of archaic hyperthermophilic maltogenic amylase from Staphylothermus marinus,” The Journal of Biological Chemistry, vol. 287, no. 11, pp. 7979–7989, 2012.

[36] M. Machovic and S. Janecek, “Domain evolution in the GH13 pullulanase subfamily with focus on the carbohydrate-binding module family 48,” Biologia, vol. 63, no. 6, pp. 1057–1068, 2008.

[37] Y. Sun, X. Lv, Z. Li, J. Wang, B. Jia, and J. Liu, “Recombinant cyclodextrinase from Thermococcus kodakarensis KOD1: expression, purification, and enzymatic characterization,” Archaea, vol. 2015, Article ID 397924, 8 pages, 2015.

[38] J.-T. Park, H.-N. Song, T.-Y. Jung, et al., “A novel domain arrangement in a monomeric cyclodextrin-hydrolyzing enzyme from the hyperthermophile,” Biochimica et Biophysica Acta - Proteins and Proteomics, vol. 1834, no. 1, pp. 380–386, 2013.


[Figure 3 image. Panel (a) labels: N′-domain (CBM48), N-domain (CBM34), TIM barrel, C-domain, linker 190–203, position markers 1, 106, 585, and 656. Panel (b) labels: HLH, CBM34, CBM48. Panel (c) labels: PYRFU NPase, Tk1770 CDase.]

Figure 3: Structural features of Tk1770 CDase. (a) The schematic diagram representing the domain arrangement within Tk1770, made with the software DOG (Domain Graph) v2.0 [32]. (b) The homology model of Tk1770 CDase consisting of N′- (blue), N- (yellow), catalytic (red), and C-domain (green). The catalytic domain also contains a helix-loop-helix (HLH) structure (cyan). (c) Structural alignment of the N′-domain (CBM48) of the Tk1770 CDase model (blue) and template (4AEF) (grey), with an extension of the loop into the catalytic site. The sequence alignment between the loops of model and template (4AEF) suggests that the substitution of P91 and S92 in Tk1770 makes its loop rigid.

[Figure 4 image. Residues labelled in panel (a): K94, K364, S375, V376, D502, M546, R550. Panel (b): K94, E96, K364, S375, V376, E504, M546. Panel (c): K94, R97, K364, D502, M546, R550.]

Figure 4: Docking of cyclodextrins into the active site of the Tk1770 CDase. (a) Complex of Tk1770 CDase with α-CD and the residues involved in interactions. (b) The docked conformation of β-CD with Tk1770 CDase. (c) The active site residues of Tk1770 interacting with γ-CD. The hydrogen bonds between substrate and amino acids are represented as red dashes.

PYRFU NPase form strong hydrogen bonding with D460 and E470 of the catalytic site. In Tk1770, S92 makes only one hydrogen bond with D460, thus reducing the interactions between the N′-domain and the catalytic domain. Moreover, S92 in Tk1770 rotates backward to form a hydrogen bond with Y93, which makes the loop more rigid. All of these factors may contribute to the decreased stability of the enzyme domains in Tk1770. Recently, it was reported that the optimum temperature for Tk1770 CDase is 65°C, which is much lower than the optimal growth temperature for T. kodakarensis (85°C) and the optimum temperature for other archaeal CD hydrolyzing enzymes [37].

The (β/α)8 barrel (A-domain) also contains a much larger B-domain between β-strand 3 and alpha-helix 3, from residues 306 to 403. The B-domain of all the archaeal enzymes possesses a helix-loop-helix (HLH) motif that extends at the entrance of the active site (Figure 3), but this HLH motif is absent in all five bacterial enzymes, as shown in Figure 1. It has been reported that, in order to maintain activity at high temperatures, archaea might have adopted additional structural features. These features include an N′-domain with a loop extension into the catalytic site and an HLH motif, which together provide all necessary components for substrate binding and catalysis in a monomer [35, 38].

3.3. Docking of Substrates into the Catalytic Site. The docking of substrates into the catalytic site provided information about the interactions in the enzyme-substrate complex. For this purpose, AutoDock was used to dock cyclodextrins, that is, α-, β-, and γ-cyclodextrins, into the active site of the Tk1770 CDase model. All of the ligand conformations generated by AutoDock were scored on the basis of their binding affinities in kcal/mol. The best poses of α-, β-, and γ-cyclodextrins were selected, with binding energies of −8.8, −6.1, and −7.8 kcal/mol, respectively.
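To put the docking scores on a more familiar scale, a binding free energy can be converted into an approximate dissociation constant through ΔG = RT ln Kd. The sketch below does this conversion and ranks the three substrates. It reads the reported energies as −8.8, −6.1, and −7.8 kcal/mol and assumes a temperature of 298.15 K, which is not stated in the text. Docking scores are empirical estimates, so the resulting constants are order-of-magnitude guides only.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def dissociation_constant(delta_g, temperature=298.15):
    """Approximate Kd in mol/L from a binding free energy in kcal/mol.

    Uses delta_g = R * T * ln(Kd); the default temperature is an assumption.
    """
    return math.exp(delta_g / (R_KCAL * temperature))


# Best-pose binding energies in kcal/mol, as read from the docking results.
ENERGIES = {"alpha-CD": -8.8, "beta-CD": -6.1, "gamma-CD": -7.8}

# Rank substrates from most to least favourable (most negative energy first).
ranked = sorted(ENERGIES, key=ENERGIES.get)

# Approximate dissociation constant for each substrate.
kd = {name: dissociation_constant(dg) for name, dg in ENERGIES.items()}
```

On this reading α-CD gives the most favourable score and β-CD the least, with γ-CD in between.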




8 Archaea

protruding into the entrance of catalytic site are conserved inarcheal homologs except STAMF 120572-amylase

4 Conclusion

Cyclodextrinase from hyperthermophilic archaea T kodaka-rensis hydrolyzes cyclodextrins into linearmaltodextrinsThesequence alignment of CD hydrolyzing enzymes confirmedthat archaea have developed an additional N1015840-domain and ahelix-loop-helix (HLH) motif in the B-domain that is absentin all bacterial homologs The homology model constructedrevealed that loop connecting 120573-strand 7 and 120573-strand 8of N1015840-domain extends into the catalytic site (A-domain)and plays an important role in substrate binding ResiduesY88 Y93 K94 F95 and R97 in extended loop of N1015840-domain of Tk1770 CDase are conserved in CD hydrolyzingenzymes of archaea Structural alignment betweenmodel andtemplate (4AEF) indicated that P91 and S92 in loop extensionof N1015840-domain of Tk1770 might decrease its flexibility andinteractions with A-domain This might contribute to thedecreased stability of two domains in Tk1770

The docking studies indicated that residues K94 R97K364 S375 D502 E504 andR550 formhydrogen bondswithsubstrates Residue K364 in the helix of HLHmotif extendingat the entrance of the catalytic site interacts with substrate andmight be involved in guiding the substrate into the catalyticsite From these results it can be inferred that archeal CDhydrolyzing enzymes have developed catalytic machinery inwhich an extension of N1015840-domain not only constitutes a partof active pocket but also plays an important role in substratebinding

Conflict of Interests

The authors declare no conflict of interests exists

Authorsrsquo Contribution

Ramzan Ali and Muhammad Imtiaz Shafiq contributedequally to the paper

References

[1] R M Kelly L Dijkhuizen and H Leemhuis ldquoStarch and 120572-glucan acting enzymes modulating their properties by directedevolutionrdquo Journal of Biotechnology vol 140 no 3-4 pp 184ndash193 2009

[2] E M M Del Valle ldquoCyclodextrins and their uses a reviewrdquoProcess Biochemistry vol 39 no 9 pp 1033ndash1046 2004

[3] W J Shieh and A R Hedges ldquoProperties and applications ofcyclodextrinsrdquo Journal of Macromolecular Science Part A Pureand Applied Chemistry vol 33 no 5 pp 673ndash683 1996

[4] Y Nakagawa W Saburi M Takada Y Hatada and KHorikoshi ldquoGene cloning and enzymatic characteristics of anovel 120574-cyclodextrin-specific cyclodextrinase from alkalophilicBacillus clarkii 7364rdquo Biochimica et Biophysica ActamdashProteinsand Proteomics vol 1784 no 12 pp 2004ndash2011 2008

[5] K Uekama F Hirayama andH Arima ldquoPharmaceutical appli-cations of cyclodextrins and their derivativesrdquo in Cyclodextrins

and Their Complexes Chemistry Analytical Methods Applica-tions chapter 14 pp 381ndash422 Wiley-VCH 2006




protruding into the entrance of the catalytic site are conserved in archaeal homologs except STAMF α-amylase.

4. Conclusion

Cyclodextrinase from the hyperthermophilic archaeon T. kodakarensis hydrolyzes cyclodextrins into linear maltodextrins. The sequence alignment of CD hydrolyzing enzymes confirmed that archaea have developed an additional N′-domain and a helix-loop-helix (HLH) motif in the B-domain that is absent in all bacterial homologs. The homology model revealed that the loop connecting β-strand 7 and β-strand 8 of the N′-domain extends into the catalytic site (A-domain) and plays an important role in substrate binding. Residues Y88, Y93, K94, F95, and R97 in the extended loop of the N′-domain of Tk1770 CDase are conserved in CD hydrolyzing enzymes of archaea. Structural alignment between the model and the template (4AEF) indicated that P91 and S92 in the loop extension of the N′-domain of Tk1770 might decrease its flexibility and its interactions with the A-domain. This might contribute to the decreased stability of the two domains in Tk1770.
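As an illustration of how the conservation of individual loop residues could be checked against a multiple sequence alignment, the following minimal Python sketch maps residue numbers of a reference sequence onto alignment columns and reports whether each column is identical in every sequence. It is not the procedure used in this study; the function names, the toy sequences, and the residue numbers in the example call are hypothetical placeholders.

```python
def column_of_residue(aligned_ref, residue_number):
    """Return the 0-based alignment column of a 1-based residue in the gapped reference."""
    seen = 0
    for col, char in enumerate(aligned_ref):
        if char != "-":          # gaps do not count as residues
            seen += 1
            if seen == residue_number:
                return col
    raise ValueError("residue number exceeds reference length")


def conserved_residues(alignment, ref_name, residue_numbers):
    """Map each residue number to True if its column is identical (and gap-free) in all sequences."""
    ref = alignment[ref_name]
    result = {}
    for number in residue_numbers:
        col = column_of_residue(ref, number)
        column = {seq[col] for seq in alignment.values()}
        result[number] = len(column) == 1 and "-" not in column
    return result


# Toy alignment (hypothetical sequences, equal gapped length), not data from the paper.
toy_alignment = {
    "ref":  "MK-YTFR",
    "seq2": "MKAYSFR",
    "seq3": "MR-YTFR",
}
report = conserved_residues(toy_alignment, "ref", [1, 2, 3, 4])
print(report)
```

In practice the alignment would be read from the output of an alignment tool, and a looser criterion (for example, conservation of residue class rather than identity) may be more appropriate.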

The docking studies indicated that residues K94, R97, K364, S375, D502, E504, and R550 form hydrogen bonds with substrates. Residue K364, in the helix of the HLH motif extending at the entrance of the catalytic site, interacts with the substrate and might be involved in guiding the substrate into the catalytic site. From these results it can be inferred that archaeal CD hydrolyzing enzymes have developed catalytic machinery in which an extension of the N′-domain not only constitutes a part of the active pocket but also plays an important role in substrate binding.
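For readers who wish to screen a docked complex for possible hydrogen-bond contacts, a simple distance filter can serve as a first pass. The sketch below is only a geometric heuristic under stated assumptions: the 3.5 Å cutoff is a commonly used default rather than a value taken from this study, the atom records and coordinates are hypothetical, and a full analysis would also consider donor-hydrogen-acceptor angles.

```python
import math


def hbond_candidates(protein_atoms, ligand_atoms, cutoff=3.5):
    """List protein/ligand polar-atom pairs whose distance is within the cutoff (in angstroms).

    protein_atoms: iterable of (residue_label, atom_name, (x, y, z))
    ligand_atoms:  iterable of (atom_name, (x, y, z))
    """
    pairs = []
    for residue, atom, p_xyz in protein_atoms:
        for lig_atom, l_xyz in ligand_atoms:
            distance = math.dist(p_xyz, l_xyz)  # Euclidean distance (Python 3.8+)
            if distance <= cutoff:
                pairs.append((residue, atom, lig_atom, round(distance, 2)))
    return pairs


# Hypothetical coordinates for illustration only; not taken from the docked model.
protein = [
    ("LYS1", "NZ", (0.0, 0.0, 0.0)),
    ("ASP2", "OD1", (10.0, 0.0, 0.0)),
]
ligand = [
    ("O2", (3.0, 0.0, 0.0)),
    ("O3", (0.0, 4.0, 0.0)),
]
contacts = hbond_candidates(protein, ligand)
print(contacts)
```

Only the pair at 3.0 Å passes the filter in this toy input; the pair at 4.0 Å is rejected by the cutoff.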

Conflict of Interests

The authors declare that no conflict of interests exists.

Authors’ Contribution

Ramzan Ali and Muhammad Imtiaz Shafiq contributed equally to the paper.

References

[1] R. M. Kelly, L. Dijkhuizen, and H. Leemhuis, “Starch and α-glucan acting enzymes, modulating their properties by directed evolution,” Journal of Biotechnology, vol. 140, no. 3-4, pp. 184–193, 2009.

[2] E. M. M. Del Valle, “Cyclodextrins and their uses: a review,” Process Biochemistry, vol. 39, no. 9, pp. 1033–1046, 2004.

[3] W. J. Shieh and A. R. Hedges, “Properties and applications of cyclodextrins,” Journal of Macromolecular Science, Part A: Pure and Applied Chemistry, vol. 33, no. 5, pp. 673–683, 1996.

[4] Y. Nakagawa, W. Saburi, M. Takada, Y. Hatada, and K. Horikoshi, “Gene cloning and enzymatic characteristics of a novel γ-cyclodextrin-specific cyclodextrinase from alkalophilic Bacillus clarkii 7364,” Biochimica et Biophysica Acta - Proteins and Proteomics, vol. 1784, no. 12, pp. 2004–2011, 2008.

[5] K. Uekama, F. Hirayama, and H. Arima, “Pharmaceutical applications of cyclodextrins and their derivatives,” in Cyclodextrins and Their Complexes: Chemistry, Analytical Methods, Applications, chapter 14, pp. 381–422, Wiley-VCH, 2006.

[6] 2015, http://www.cazy.org/.

[7] P. V. Aiyer, “Amylases and their applications,” African Journal of Biotechnology, vol. 4, no. 13, pp. 1525–1529, 2005.

[8] W. D. Crabb and C. Mitchinson, “Enzymes involved in the processing of starch to sugars,” Trends in Biotechnology, vol. 15, no. 9, pp. 349–352, 1997.

[9] T. Han, F. Zeng, Z. Li et al., “Biochemical characterization of a recombinant pullulanase from Thermococcus kodakarensis KOD1,” Letters in Applied Microbiology, vol. 57, no. 4, pp. 336–343, 2013.

[10] E. A. MacGregor, “An overview of clan GH-H and distantly related families,” Biologia, vol. 60, supplement 16, pp. 5–12, 2005.

[11] P. M. de Souza and P. D. O. e Magalhaes, “Application of microbial α-amylase in industry - a review,” Brazilian Journal of Microbiology, vol. 41, no. 4, pp. 850–861, 2010.

[12] M. R. Stam, E. G. J. Danchin, C. Rancurel, P. M. Coutinho, and B. Henrissat, “Dividing the large glycoside hydrolase family 13 into subfamilies: towards improved functional annotations of α-amylase-related proteins,” Protein Engineering, Design and Selection, vol. 19, no. 12, pp. 555–562, 2006.

[13] M. Machovic and S. Janecek, “The invariant residues in the α-amylase family: just the catalytic triad,” Biologia, vol. 58, no. 6, pp. 1127–1132, 2003.

[14] S. Janecek, “How many conserved sequence regions are there in the α-amylase family?” Biologia, vol. 57, supplement 11, pp. 29–41, 2002.

[15] D. Guillen, S. Sanchez, and R. Rodríguez-Sanoja, “Carbohydrate-binding domains: multiplicity of biological roles,” Applied Microbiology and Biotechnology, vol. 85, no. 5, pp. 1241–1249, 2010.

[16] J. Matzke, A. Herrmann, E. Schneider, and E. P. Bakker, “Gene cloning, nucleotide sequence and biochemical properties of a cytoplasmic cyclomaltodextrinase (neopullulanase) from Alicyclobacillus acidocaldarius: reclassification of a group of enzymes,” FEMS Microbiology Letters, vol. 183, no. 1, pp. 55–61, 2000.

[17] K.-H. Park, T.-J. Kim, T.-K. Cheong, J.-W. Kim, B.-H. Oh, and B. Svensson, “Structure, specificity and function of cyclomaltodextrinase, a multispecific enzyme of the α-amylase family,” Biochimica et Biophysica Acta - Protein Structure and Molecular Enzymology, vol. 1478, no. 2, pp. 165–185, 2000.

[18] N. Ahmad, N. Rashid, M. S. Haider, M. Akram, and M. Akhtar, “Novel maltotriose-hydrolyzing thermoacidophilic type III pullulan hydrolase from Thermococcus kodakarensis,” Applied and Environmental Microbiology, vol. 80, no. 3, pp. 1108–1115, 2014.

[19] G. D. Haki and S. K. Rakshit, “Developments in industrially important thermostable enzymes: a review,” Bioresource Technology, vol. 89, no. 1, pp. 17–34, 2003.

[20] C. Bertoldo and G. Antranikian, “Starch-hydrolyzing enzymes from thermophilic archaea and bacteria,” Current Opinion in Chemical Biology, vol. 6, no. 2, pp. 151–160, 2002.

[21] M. W. Bauer, L. E. Driskill, and R. M. Kelly, “Glycosyl hydrolases from hyperthermophilic microorganisms,” Current Opinion in Biotechnology, vol. 9, no. 2, pp. 141–145, 1998.

[22] C. Vieille and G. J. Zeikus, “Hyperthermophilic enzymes: sources, uses, and molecular mechanisms for thermostability,” Microbiology and Molecular Biology Reviews, vol. 65, no. 1, pp. 1–43, 2001.


[23] S. K. Khare, A. Pandey, and C. Larroche, “Current perspectives in enzymatic saccharification of lignocellulosic biomass,” Biochemical Engineering Journal, vol. 102, pp. 38–44, 2015.

[24] 2015, http://www.cazy.org/GH13.html.

[25] K. Okonechnikov, O. Golosova, M. Fursov et al., “Unipro UGENE: a unified bioinformatics toolkit,” Bioinformatics, vol. 28, no. 8, pp. 1166–1167, 2012.

[26] F. Ronquist and J. P. Huelsenbeck, “MrBayes 3: Bayesian phylogenetic inference under mixed models,” Bioinformatics, vol. 19, no. 12, pp. 1572–1574, 2003.

[27] B. Webb and A. Sali, Protein Structure Modeling with MODELLER, Protein Structure Prediction, Springer, 2014.

[28] N. Eswar, B. Webb, M. A. Marti-Renom et al., “Comparative protein structure modeling using MODELLER,” in Current Protocols in Protein Science, John Wiley & Sons, 2007.

[29] M. Wiederstein and M. J. Sippl, “ProSA-web: interactive web service for the recognition of errors in three-dimensional structures of proteins,” Nucleic Acids Research, vol. 35, no. 2, pp. W407–W410, 2007.

[30] R. A. Laskowski, M. W. MacArthur, D. S. Moss, and J. M. Thornton, “PROCHECK: a program to check the stereochemical quality of protein structures,” Journal of Applied Crystallography, vol. 26, no. 2, pp. 283–291, 1993.

[31] W. L. DeLano, The PyMOL Molecular Graphics System, DeLano Scientific, Palo Alto, Calif, USA, 2002.

[32] J. Ren, L. Wen, X. Gao, C. Jin, Y. Xue, and X. Yao, “DOG 1.0: illustrator of protein domain structures,” Cell Research, vol. 19, no. 2, pp. 271–273, 2009.

[33] G. M. Morris, H. Ruth, W. Lindstrom et al., “AutoDock4 and AutoDockTools4: automated docking with selective receptor flexibility,” Journal of Computational Chemistry, vol. 30, no. 16, pp. 2785–2791, 2009.

[34] M. Machovic, B. Svensson, E. Ann MacGregor, and S. Janecek, “A new clan of CBM families based on bioinformatics of starch-binding domains from families CBM20 and CBM21,” FEBS Journal, vol. 272, no. 21, pp. 5497–5513, 2005.

[35] T.-Y. Jung, D. Li, J.-T. Park et al., “Association of novel domain in active site of archaic hyperthermophilic maltogenic amylase from Staphylothermus marinus,” The Journal of Biological Chemistry, vol. 287, no. 11, pp. 7979–7989, 2012.

[36] M. Machovic and S. Janecek, “Domain evolution in the GH13 pullulanase subfamily with focus on the carbohydrate-binding module family 48,” Biologia, vol. 63, no. 6, pp. 1057–1068, 2008.

[37] Y. Sun, X. Lv, Z. Li, J. Wang, B. Jia, and J. Liu, “Recombinant cyclodextrinase from Thermococcus kodakarensis KOD1: expression, purification, and enzymatic characterization,” Archaea, vol. 2015, Article ID 397924, 8 pages, 2015.

[38] J.-T. Park, H.-N. Song, T.-Y. Jung et al., “A novel domain arrangement in a monomeric cyclodextrin-hydrolyzing enzyme from the hyperthermophile,” Biochimica et Biophysica Acta - Proteins and Proteomics, vol. 1834, no. 1, pp. 380–386, 2013.

