
Nitrosopumilus maritimus genome reveals unique mechanisms for nitrification and autotrophy in globally distributed marine crenarchaea

C. B. Walker^a,b, J. R. de la Torre^a, M. G. Klotz^c, H. Urakawa^a, N. Pinel^a, D. J. Arp^d, C. Brochier-Armanet^e, P. S. G. Chain^f,g,h, P. P. Chan^i, A. Gollabgir^j, J. Hemp^k, M. Hügler^l,m, E. A. Karr^n, M. Könneke^o, M. Shin^f,g, T. J. Lawton^p, T. Lowe^i, W. Martens-Habbena^a, L. A. Sayavedra-Soto^d, D. Lang^f,g, S. M. Sievert^q, A. C. Rosenzweig^p, G. Manning^j, and D. A. Stahl^a,1

^a Department of Civil and Environmental Engineering, University of Washington, Seattle, WA 98195; ^b Geosyntec Consultants, Seattle, WA 98101; ^c Department of Biology, University of Louisville, Louisville, KY 40292; ^d Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331; ^e Université de Provence Aix-Marseille I, Laboratoire de Chimie Bactérienne, Centre National de la Recherche Scientifique Unité Propre de Recherche, Marseille, 13402 France; ^f Biosciences Division, Lawrence Livermore National Laboratory, Livermore, CA 94550; ^g Microbial Program, Joint Genome Institute, Walnut Creek, CA 94598; ^h Center for Microbial Ecology, Michigan State University, East Lansing, MI 48824; ^i Department of Biomolecular Engineering, University of California, Santa Cruz, CA 95064; ^j Razavi Newman Center for Bioinformatics, Salk Institute for Biological Studies, La Jolla, CA 92037; ^k School of Chemical Sciences, University of Illinois, Urbana, IL 61801; ^l Leibniz-Institut für Meereswissenschaften, Kiel, 24105 Germany; ^m Water Technology Center, Karlsruhe, 76139 Germany; ^n Department of Botany and Microbiology, University of Oklahoma, Norman, OK 73019; ^o Institut für Chemie und Biologie des Meeres, Universität Oldenburg, Oldenburg, 26129 Germany; ^p Departments of Biochemistry, Molecular Biology and Cell Biology, and Chemistry, Northwestern University, Evanston, IL 60208; and ^q Biology Department, Woods Hole Oceanographic Institution, Woods Hole, MA 02543

Edited by David Karl, University of Hawaii, Honolulu, HI, and approved April 2, 2010 (received for review December 6, 2009)

Ammonia-oxidizing archaea are ubiquitous in marine and terrestrial environments and are now thought to be significant contributors to carbon and nitrogen cycling. The isolation of Candidatus “Nitrosopumilus maritimus” strain SCM1 provided the opportunity for linking its chemolithotrophic physiology with a genomic inventory of the globally distributed archaea. Here we report the 1,645,259-bp closed genome of strain SCM1, revealing highly copper-dependent systems for ammonia oxidation and electron transport that are distinctly different from known ammonia-oxidizing bacteria. Consistent with in situ isotopic studies of marine archaea, the genome sequence indicates N. maritimus grows autotrophically using a variant of the 3-hydroxypropionate/4-hydroxybutyrate pathway for carbon assimilation, while maintaining limited capacity for assimilation of organic carbon. This unique instance of archaeal biosynthesis of the osmoprotectant ectoine and an unprecedented enrichment of multicopper oxidases, thioredoxin-like proteins, and transcriptional regulators point to an organism responsive to environmental cues and adapted to handling reactive copper and nitrogen species that likely derive from its distinctive biochemistry. The conservation of N. maritimus gene content and organization within marine metagenomes indicates that the unique physiology of these specialized oligophiles may play a significant role in the biogeochemical cycles of carbon and nitrogen.

ammonia oxidation | marine microbiology | archaea | nitroxyl

Marine Group I archaea are among the most abundant microorganisms in the global oceans (1–3). They were originally discovered through ribosomal RNA gene sequencing (3, 4), and recent metagenomic, biogeochemical, and microbiological studies established the capacity of these organisms to oxidize ammonia, thus linking this abundant microbial clade to one of the key steps of the global nitrogen cycle (5–9). For a century following the discovery of autotrophic ammonia oxidizers, only Bacteria were thought to catalyze this generally rate-limiting transformation in the two-step process of nitrification (10). Despite recent enrichment of mesophilic as well as thermophilic ammonia-oxidizing archaea (AOA) (6, 11, 12), only a single Group I-related strain, isolated from a gravel inoculum from a tropical marine aquarium, has thus far been successfully obtained in pure culture (7).

The isolation of Nitrosopumilus maritimus strain SCM1 ultimately confirmed an archaeal capacity for chemoautotrophic growth on ammonia. More detailed characterization of this strain revealed cytological and physiological adaptations critical for life in an oligotrophic open ocean environment, most notably one of the highest substrate affinities yet observed (13). Among characterized ammonia oxidizers, only N. maritimus is capable of growing at the extremely low concentrations of ammonia generally found in the open ocean (7, 13). This strain therefore provided an excellent opportunity to investigate the core genetic inventory for ammonia-based chemoautotrophy by Group I crenarchaea.

The gene content and gene order of N. maritimus are highly similar to those of environmental populations represented in marine bacterioplankton metagenomes, confirming on a genomic level its close relationship to many oceanic crenarchaea. Thus, an evaluation of the genomic inventory of N. maritimus should offer a framework to identify features shared among ammonia-oxidizing Group I crenarchaea, resolve physiological diversity among AOA, and refine understanding of their ecology in relationship to the larger assemblage of marine archaea, not all of which are ammonia oxidizers. In support of this expectation, the physiological and genomic profiles together show that many of the “non-extreme” archaea identified in metagenomic studies, and currently assigned to the Crenarchaeota kingdom, are AOA that contribute to global carbon and nitrogen cycling, possibly determining rates of nitrification in a variety of environments (6, 8, 9, 13).

Author contributions: C.B.W., J.R.d.l.T., P.S.G.C., and D.A.S. designed research; C.B.W., J.R.d.l.T., M.G.K., H.U., N.P., C.B.-A., P.P.C., A.G., M.H., E.A.K., M.K., M.S., T.L., W.M.-H., M.S., D.L., S.M.S., A.C.R., G.M., and D.A.S. performed research; C.B.W. and J.R.d.l.T. contributed new reagents/analytic tools; C.B.W., J.R.d.l.T., M.G.K., H.U., N.P., D.J.A., C.B.-A., P.P.C., A.G., J.H., M.H., E.A.K., M.K., T.J.L., T.L., W.M.-H., L.A.S.-S., S.M.S., A.C.R., G.M., and D.A.S. analyzed data; and C.B.W., J.R.d.l.T., M.G.K., H.U., N.P., C.B.-A., J.H., M.H., E.A.K., M.K., T.L., S.M.S., A.C.R., G.M., and D.A.S. wrote the paper.

The authors declare no conflict of interest.

This article is a PNAS Direct Submission.

Data deposition: The sequence reported in this paper has been deposited in the NCBI database (accession no. NC_010085).

^1 To whom correspondence should be addressed. E-mail: [email protected].

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.0913533107/-/DCSupplemental.

www.pnas.org/cgi/doi/10.1073/pnas.0913533107 PNAS Early Edition | 1 of 6

MICROBIOLOGY

Results and Discussion

Primary Sequence Characteristics. N. maritimus strain SCM1 contains a single chromosome of 1,645,259 bp encoding 1,997 predicted genes and no extrachromosomal elements or complete prophage sequences (Table 1). No unambiguous origin of replication could be determined on the basis of local gene content or GC skew, as commonly observed for other archaeal genomes (14). Approximately 61% of the N. maritimus open-reading frames (ORFs) could be assigned to clusters of orthologous groups of proteins (COGs), a lower percentage than for genomes of ammonia-oxidizing bacteria (AOB) (Table S1) but similar to Cenarchaeum symbiosum (15). The genome possesses a relatively high coding density (91.9%), with a larger fraction dedicated to energy production/conservation, coenzyme transport/metabolism, and translation genes than other characterized Crenarchaeota, but similar to two common species of photoautotrophic marine Bacteria, Prochlorococcus and Synechococcus.
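The GC skew mentioned above is usually computed as (G − C)/(G + C) over windows along the chromosome; in many bacterial genomes the cumulative skew changes direction near the replication origin and terminus. The following is only a minimal sketch of that calculation, not the analysis the authors ran: the function names, the window size, and the toy sequence are all illustrative assumptions.

```python
def gc_skew(seq: str, window: int) -> list[float]:
    """Return (G - C) / (G + C) for consecutive non-overlapping windows."""
    seq = seq.upper()
    skews = []
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        g, c = chunk.count("G"), chunk.count("C")
        # Windows without any G or C get a skew of 0 to avoid dividing by zero.
        skews.append((g - c) / (g + c) if g + c else 0.0)
    return skews


def cumulative(values: list[float]) -> list[float]:
    """Running sum of the per-window skew values."""
    total, out = 0.0, []
    for v in values:
        total += v
        out.append(total)
    return out


# Hypothetical toy sequence (not from the genome): a G-rich half
# followed by a C-rich half, so the skew flips sign in the middle.
toy = "GGGCGGGCAT" * 3 + "CCCGCCCGAT" * 3
skews = gc_skew(toy, window=10)
cum = cumulative(skews)
turning_point = cum.index(max(cum))  # window index where the cumulative skew peaks
```

On a real genome a clear extremum of the cumulative curve is the signal one looks for; the text above reports that no such unambiguous signal was found for N. maritimus.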

Energy Metabolism. The stoichiometry of ammonia oxidation to nitrite is similar to that of characterized aerobic, obligate chemolithoautotrophic AOB (13), yet the contributing biochemistry is distinct. All AOB share a common pathway where hydroxylamine, produced by an ammonia monooxygenase (AMO), is oxidized to nitrite by a heme-rich hydroxylamine oxidoreductase (HAO) complex; the oxidation of hydroxylamine supplies electrons to both the AMO and a typical electron transport chain composed of cytochrome c proteins. N. maritimus lacks genes encoding a recognizable AOB-like HAO complex and pertinent cytochrome c proteins, indicating an alternative archaeal pathway. The numerous copper-containing proteins, including multicopper oxidases and small blue copper-containing proteins (similar to plasto-, halo-, and sulfocyanins), suggest an alternative electron transfer mechanism (Table S2). These predicted periplasmic proteins likely serve functionally similar roles to soluble cytochrome c proteins in other Archaea, including Natronomonas pharaonis and thermoacidophiles such as Sulfolobus (16). This apparent reliance on copper for redox reactions is a major divergence from the iron-based electron transfer system of AOB.

The N. maritimus genome contains genes coding for six soluble periplasmic multicopper oxidase (MCO) proteins: two nearly identical NO-forming nitrite reductase proteins (NirK; Nmar_1259 and -1667), each with three cupredoxin domains; two NcgA-like (nirK cluster gene A) MCOs (Nmar_1131 and -1663) with two cupredoxin domains; one MCO (Nmar_1136) with three cupredoxin domains; and one MCO (Nmar_1354) with two domains fused to a blue copper-containing protein (Table S2). Two (Nmar_1131 and Nmar_1663) of the three genes that are classified as belonging to the emerging class of two-domain MCOs (2dMCOs) resemble the general architecture of the 2dMCO NcgA present in Nitrosomonas europaea. Although the overall sequence identity is low, clustering of NcgA with a nitrite reductase suggests it may play a supporting role in nitrite reduction (17). Genes Nmar_1131 and Nmar_1663 are also colocated with a member of the DtxR family of metal regulators and a member of the ZIP metal transport family, suggesting a role in metal homeostasis (see below). The third 2dMCO (Nmar_1354), possessing a fused blue copper-containing domain, has not been found in AOB and appears to be unique to AOA. Redox interactions with the MCOs (and other predicted redox proteins) are likely mediated by eight soluble and nine membrane-anchored copper-binding proteins containing plastocyanin-like domains (Table S2). The corresponding genes appear to be the result of a series of duplications within the N. maritimus lineage (Fig. S1).

A second family of predicted redox active periplasmic proteins, composed of 11 thiol-disulfide oxidoreductases from the thioredoxin family (Nmar_0639, _0655, _0829, _0881, _1140, _1143, _1148, _1150, _1181, _1658, and Nmar_1670), shows low (but recognizable) identity with the better characterized disulfide bond oxidases/isomerases found in Bacillus subtilis (BdbD) and Escherichia coli (DsbA, DsbC, and DsbG). The mean percentage of sequence identity between the N. maritimus proteins and BdbD is 21 ± 3%. The significantly lower mean percentage of sequence identity to DsbA, DsbC, and DsbG (9.2, 10.7, and 10.4%, respectively) is comparable to that shared between the E. coli proteins (10–11%). Although functional equivalency cannot be established, all but Nmar_0881 preserve the conserved thioredoxin-like active site FX4CXXC sequence (18–20). In E. coli, both DsbA and DsbC rectify nonspecific disulfide bonds catalyzed by copper (21), whereas up-regulation of dsbA by the Cpx regulon occurs during copper stress (22, 23). Eukaryotic protein disulfide isomerase (PDI) homologs sequester and/or reduce oxidized Cu(II), perhaps serving as copper acceptors/donors for copper-containing proteins (24). Another described function of PDIs is the capture and transport of nitric oxide (25, 26), a possible intermediate or by-product of ammonia oxidation. The related protein family in N. maritimus may function in part to alleviate copper and nitric oxide toxicity.
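The FX4CXXC active-site pattern mentioned above reads as: a phenylalanine, any four residues, a cysteine, any two residues, and a second cysteine. As an illustration only, a screen for that pattern could look like the sketch below. The regular expression is a direct transcription of the motif, but the function name and both protein strings are made-up placeholders, not Nmar sequences, and this is not presented as the authors' method.

```python
import re

# F, any four residues, C, any two residues, C (the FX4CXXC pattern).
TRX_MOTIF = re.compile(r"F.{4}C.{2}C")


def find_trx_motif(protein: str):
    """Return (0-based start, matched text) of the first motif hit, or None."""
    match = TRX_MOTIF.search(protein.upper())
    return (match.start(), match.group()) if match else None


# Hypothetical toy sequences for illustration; NOT real N. maritimus proteins.
examples = {
    "toy_with_motif": "MKLVAFGDPWCPHCQRA",
    "toy_without_motif": "MKLVAFGDPWSPHSQRA",  # both cysteines replaced by serine
}

hits = {name: find_trx_motif(seq) for name, seq in examples.items()}
```

A pattern match of this kind only flags candidates; as the text notes, it does not establish functional equivalence with the bacterial Dsb or Bdb proteins.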

Table 1. Genome features of N. maritimus SCM1, C. symbiosum, sequenced AOB, and crenarchaeal genome fragments

| Feature | Nitrosopumilus maritimus SCM1 | Cenarchaeum symbiosum | Nitrosococcus oceani ATCC 19707 | Nitrosomonas europaea ATCC 19718 | Nitrosomonas eutropha C91 | Nitrosospira multiformis ATCC 25196 | Fosmid 4B7 | Cosmid DeepAnt-EC39 | Fosmid 74A4 |
|---|---|---|---|---|---|---|---|---|---|
| Size (bp) | 1,645,259 | 2,045,086 | 3,481,691 | 2,812,094 | 2,661,057 | 3,184,243 | 39,297 | 33,347 | 43,902 |
| Percent coding | 91.90% | 91.20% | 86.80% | 88.40% | 85.60% | 85.60% | 89.10% | 86.10% | 84.00% |
| GC content | 34.20% | 57.70% | 50.30% | 50.70% | 48.50% | 53.90% | 34.40% | 34.10% | 32.60% |
| ORFs | 1,997 | 2,066 | 3,186 | 2,628 | 2,578 | 2,827 | 41 | 41 | 51 |
| ORF density (ORF/kb) | 1.19 | 0.986 | 0.889 | 0.876 | 0.952 | 0.86 | 0.992 | 1.17 | 1.12 |
| Avg. ORF length (bp) | 757 | 924 | 964 | 1,009 | 890 | 980 | 898 | 737 | 753 |
| Standard tRNAs | 44 | 45 | 45 | 41 | 41 | 43 | 0 | 0 | 0 |
| rRNAs | 1 | 1 | 2 | 1 | 1 | 1 | 1 | 1 | 1 |
| Plasmids | 0 | 0 | 1 | 0 | 2 | 3 | NA | NA | NA |

NA, not analyzed.

Pathways for Ammonia Oxidation and Electron Transfer. The three genes (Nmar_1500, _1503, and _1502) annotated as amoA, amoB, and amoC and coding for a putative ammonia monooxygenase complex are the only recognizable genetic hallmarks of ammonia oxidation in the genome sequence. However, the N. maritimus sequences are no more similar (in either content or organizational structure) to bacterial amo genes than they are to the genes encoding bacterial particulate methane monooxygenases (pMMO), suggestive of functional differences between the archaeal and bacterial versions of AMO (7, 27). Notably, mapping the sequence encoded by amoB onto the pmoB crystal structure of Methylococcus capsulatus (Bath) (28) reveals the conservation of the ligands to the pMMO metal centers and the complete absence of both a transmembrane helix and a C-terminal cupredoxin domain predicted to be present in bacterial AMO (Fig. S2).

The structural differences in the archaeal AMO, the lack of genes encoding the hydroxylamine–ubiquinone redox module (29), and a periplasm enriched in redox active proteins together suggest significant divergence from the bacterial pathway of ammonia oxidation. There are two hypothetical mechanistic alternatives (Fig. 1, Table S2): either a unique biochemistry exists for the oxidation of hydroxylamine or the divergent AMO does not actually produce hydroxylamine. If the former is true, hydroxylamine oxidation may occur via one of the periplasmic MCOs (CuHAO). Given the lack of cytochrome c proteins, the four electrons would then be transferred to a quinone reductase (QRED) via small blue copper-containing plastocyanin-like electron carriers. The protein encoded by Nmar_1226, which contains four transmembrane-spanning regions and two plastocyanin-like domains, may serve as an analog of the membrane-bound cytochrome cM552 quinone reductase present in AOB (29) and is a good candidate for the QRED.

In an alternative scenario, the archaeal AMO produces not hydroxylamine, but the reactive intermediate nitroxyl (nitroxyl hydride, HNO). Nitroxyl is a highly toxic and reactive compound recently recognized as having biological significance in a number of systems (30, 31). During archaeal ammonia oxidation, nitroxyl might be formed by a unique monooxygenase function of archaeal AMO. Alternatively, the archaeal AMO may act as a dioxygenase and insert two oxygen atoms into ammonia, producing nitroxyl from the spontaneous decay of HNOHOH. Both reaction sequences eliminate the requirement for reductant recycling during the initial oxygenase reaction, a simplification offering significant ecological advantage (when compared with AOB) in nutrient poor environments. In this pathway, one of the MCO-like proteins may act as a nitroxyl oxidoreductase (NXOR) and facilitate the oxidation of nitroxyl to nitrite with the extraction of two protons and two electrons in the presence of water. The proposed NXOR would relay the two extracted electrons into the quinone pool via the QRED pathway described above.

In this proposed model, the electrons extracted by either a CuHAO or a NXOR (and transferred into the quinone pool) would generate a proton motive force (PMF) through complexes III (plastocyanin-like subunit, Nmar_1542; Rieske-type subunit, Nmar_1544; transmembrane subunit, Nmar_1543) and IV (Nmar_0182-5), driving the generation of ATP by an F0F1-type ATP synthase (Nmar_1688–1693). The production of reductant (i.e., NADH) would require the reverse operation of complex I (NuoABCDHIJKMLN, Nmar_0276–286) as a quinol oxidase driven by a PMF. The proposed biochemistry involving nitroxyl produces the same net gain as bacterial ammonia oxidation, providing two electrons for reduction of the quinone pool and subsequent linear electron flow and the generation of a PMF. The presence of a copper-containing (versus heme) complex III and the unique evolutionary placement of terminal oxidase (complex IV) between two of the heme–copper oxygen reductase families further distinguish this proposed ammonia oxidation pathway from that in AOB.
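The claim that both routes deliver the same net two electrons can be checked with simple bookkeeping. The sketch below writes each step as a balanced half-reaction and sums the electrons released minus those consumed. Two assumptions are made here that the text does not spell out: the hydroxylamine-forming AMO step is written with the conventional bacterial stoichiometry (two protons and two electrons consumed), and nitrite is written in its protonated form HNO2, following the labels in Fig. 1. The data structures and function names are illustrative.

```python
from collections import Counter

# Elemental composition and charge of each species used below.
SPECIES = {
    "NH3":   ({"N": 1, "H": 3}, 0),
    "O2":    ({"O": 2}, 0),
    "H2O":   ({"H": 2, "O": 1}, 0),
    "NH2OH": ({"N": 1, "H": 3, "O": 1}, 0),
    "HNO":   ({"H": 1, "N": 1, "O": 1}, 0),
    "HNO2":  ({"H": 1, "N": 1, "O": 2}, 0),
    "H+":    ({"H": 1}, +1),
    "e-":    ({}, -1),
}


def totals(side):
    """Sum atoms and charge over one side of a reaction ({species: count})."""
    atoms, charge = Counter(), 0
    for name, n in side.items():
        composition, q = SPECIES[name]
        for element, k in composition.items():
            atoms[element] += n * k
        charge += n * q
    return atoms, charge


def is_balanced(reactants, products):
    return totals(reactants) == totals(products)


# Each route is a list of (reactants, products) steps.
ROUTES = {
    "hydroxylamine": [
        # Assumed bacterial-type AMO step: consumes 2 H+ and 2 e-.
        ({"NH3": 1, "O2": 1, "H+": 2, "e-": 2}, {"NH2OH": 1, "H2O": 1}),
        # Oxidation by the proposed CuHAO: releases 4 H+ and 4 e-.
        ({"NH2OH": 1, "H2O": 1}, {"HNO2": 1, "H+": 4, "e-": 4}),
    ],
    "nitroxyl": [
        # AMO step without reductant input.
        ({"NH3": 1, "O2": 1}, {"HNO": 1, "H2O": 1}),
        # Oxidation by the proposed NXOR: releases 2 H+ and 2 e-.
        ({"HNO": 1, "H2O": 1}, {"HNO2": 1, "H+": 2, "e-": 2}),
    ],
}


def net_electrons(steps):
    """Electrons released minus electrons consumed over all steps."""
    return sum(p.get("e-", 0) - r.get("e-", 0) for r, p in steps)


summary = {name: net_electrons(steps) for name, steps in ROUTES.items()}
all_balanced = all(is_balanced(r, p) for steps in ROUTES.values() for r, p in steps)
```

Under these assumptions both routes yield a net of two electrons per ammonia oxidized; the nitroxyl route reaches that number without returning any electrons to the AMO, which is the simplification the text highlights.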

[Figure 1: membrane schematic with an "Ammonia Metabolism" module (AMO, CuHAO/NXOR, QRED, plastocyanin-like carriers), a "Nitrite Reduction" module (CuNIR), and complexes I (NDH I), III, IV (aa3), and V (ATPase); "Out" and "In" mark the two sides of the membrane.]

Fig. 1. Proposed AOA respiratory pathway. Text indicates the described possible hydroxylamine (blue text and arrows) and nitroxyl (green) pathways. Red arrows indicate electron flow not involved in ammonia oxidation. Blue shading denotes blue copper-containing proteins. Pink box indicates possible alternative respiratory electron sink. Hexagons containing Q and QH2 represent the oxidized and reduced quinone pools, respectively.

Carbon Fixation and Mixotrophy. N. maritimus, like all known AOB, grows chemolithoautotrophically by using inorganic carbon as the sole carbon source (7, 32). However, whereas AOB use the Calvin–Bassham–Benson cycle with the CO2-fixing enzyme ribulose bisphosphate carboxylase/oxygenase (RubisCO) as the key enzyme, the absence of genes in N. maritimus coding for RubisCO and other enzymes of this cycle points to an alternative pathway for carbon fixation. The most likely mechanism supported by the genome sequence is the 3-hydroxypropionate/4-hydroxybutyrate pathway elucidated in the thermophilic crenarchaeote Metallosphaera sedula and suggested as a potential pathway of carbon fixation in C. symbiosum (33). The pathway has two parts: a sequence including two carboxylation reactions transforming acetyl-CoA to succinyl-CoA and a multistep sequence converting succinyl-CoA into two molecules of acetyl-CoA. Genes identified in the N. maritimus genome coding for key enzymes of the pathway (Fig. S3) include a biotin-dependent acetyl-CoA/propionyl-CoA carboxylase (Nmar_0272–0274), methylmalonyl-CoA epimerase and mutase (Nmar_0953, _0954, and _0958), and 4-hydroxybutyrate dehydratase (Nmar_0207). With the exception of one gene (Nmar_1608), all of the genes implicated in the 3-hydroxypropionate/4-hydroxybutyrate pathway for carbon assimilation in N. maritimus are present and show highest similarity to the genes of C. symbiosum (34, 35). Although N. maritimus and M. sedula most likely use the same CO2-fixation reaction sequences, not all individual reactions appear to be catalyzed by identical enzymes. In one instance, the stepwise reductive transformation of malonyl-CoA to propionyl-CoA involves five enzymes in M. sedula (33, 36, 37). Although the N. maritimus genome lacks any close homologs of the M. sedula genes, it contains alternative alcohol dehydrogenases, aldehyde dehydrogenases, acyl-CoA synthetases, and enoyl-CoA hydratases possibly fulfilling the same functions. Similarly, M. sedula catalyzes the activation of 3-hydroxypropionate to 3-hydroxypropionyl-CoA, using an AMP-forming 3-hydroxypropionyl-CoA synthetase (37). The N. maritimus genome lacks an obvious homolog, although it does code for an ADP-forming acyl-CoA synthetase (Nmar_1309) that suggests a more energy efficient alternative.

In addition to the genes coding for the 3-hydroxypropionate/4-hydroxybutyrate pathway, the genome of N. maritimus contains a number of genes encoding enzymes of the tricarboxylic acid (TCA) cycle. No homologs for genes coding for a citrate-cleaving enzyme (ATP citrate lyase or citryl-CoA lyase) (38) were identified, permitting exclusion of the reductive TCA cycle as a pathway for carbon fixation. The lack of these genes suggests that N. maritimus utilizes either an incomplete (or horseshoe-type) TCA cycle for strictly biosynthetic purposes or possibly a complete oxidative TCA cycle.

N. maritimus grows on a completely inorganic medium, indicating that its genes encode the essential biosynthetic capacity (SI Materials and Methods), yet its genomic inventory also suggests some flexibility in the utilization of organic sources of phosphorus and carbon. Two systems for phosphorus acquisition are suggested: the high-affinity, high-activity phosphate transport system (pstSCAB, Nmar_0479, Nmar_0481–0483) and a phosphonate transporter (Nmar_0873–0875). However, because the genome lacks genes encoding known C-P lyases and hydrolases (39), and phosphate limitation is not alleviated by supplementation with phosphonates common in the marine environment (e.g., aminoethylphosphonate), there is as yet no support for a functional phosphonate utilization pathway. Numerous organic transport functions are also evident, broadly encompassing transporters for different amino acids, dipeptides/oligopeptides, sulfonates/taurine, and glycerol. Additional physiological characterization will likely demonstrate some capacity for mixotrophic growth, as suggested by isotopic studies of natural populations (40–42). No genomic evidence exists for growth on urea, as N. maritimus lacks the homologs of the putative urease and urea transporter genes identified in C. symbiosum (35).

Noncoding RNA Genes. The N. maritimus genome contains a full complement of essential noncoding RNA (ncRNA) genes, including one copy each of 5S/16S/23S ribosomal RNAs, RNase P, SRP RNA, and 44 transfer RNAs (Table S3). In addition to six normally placed canonical tRNA introns, noncanonical introns were found at different positions in six of the tRNAs [Val(CAC), Met, Trp, Arg(CCT), Leu(TAA), and Glu(TTC)], a phenomenon previously observed only in thermophiles and C. symbiosum (34, 43). All other sequenced crenarchaea (including C. symbiosum) contain at least 46 tRNAs. N. maritimus lacks tRNA sequences coding for Pro(CGG) or Arg(CCG), perhaps resulting from (or related to) the low G + C content of the genome and preference for protein codons ending in A/T. Other archaeal genomes with relatively low G + C content, such as the euryarchaeon Haloquadratum walsbyi, also lack these tRNAs, while possessing the exact complement of 44 tRNAs found in N. maritimus (43). This occurrence likely reflects a difference in posttranscriptional modification of the wobble base of tRNAs Pro(TGG) and Arg(TCG), allowing more efficient decoding of the rare codons CCG and CGG, respectively (44).

Six candidates for C/D box small RNAs (sRNAs) were identified (Nmar_sR1–sR6). Most C/D box sRNAs guide the precise positioning of posttranscriptional 2′-O-methyl group addition to rRNAs or tRNAs, a process also occurring in eukaryotes, but not Bacteria. In N. maritimus, predicted targets of 2′-O-methylation may include the wobble base of the Leu(CAA) tRNA and two different positions separated by 26 nucleotides in the 23S rRNA. Before the N. maritimus and C. symbiosum genomes, multiple C/D box sRNAs were found almost exclusively in hyperthermophilic archaea (45). Detection of these conserved, syntenic guide sRNAs in the two mesophilic crenarchaeal genomes supports an RNA stabilization/chaperone function not seen in other archaeal mesophiles and possibly more similar to their predicted function(s) in eukaryotes (46).
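The connection drawn in this section between low genomic G + C content and a preference for codons ending in A/T can be checked on any set of coding sequences. The following is a minimal Python sketch, not the authors' analysis; the function names and the toy sequence are illustrative assumptions and are not taken from the N. maritimus genome.

```python
from collections import Counter


def gc_content(seq: str) -> float:
    """Fraction of G and C among the unambiguous bases (A, C, G, T)."""
    counts = Counter(b for b in seq.upper() if b in "ACGT")
    total = sum(counts.values())
    return (counts["G"] + counts["C"]) / total if total else 0.0


def third_position_at_fraction(cds: str) -> float:
    """Fraction of codons in an in-frame coding sequence that end in A or T."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    codons = [c for c in codons if set(c) <= set("ACGT")]  # skip ambiguous codons
    if not codons:
        return 0.0
    return sum(c[2] in "AT" for c in codons) / len(codons)


# Hypothetical toy coding sequence, used only to exercise the functions.
toy_cds = "ATGAAATTTCCAAGATAA"
print(f"G+C fraction: {gc_content(toy_cds):.3f}")
print(f"codons ending in A/T: {third_position_at_fraction(toy_cds):.3f}")
```

Applied to all annotated ORFs of a genome, the second value gives a rough view of how rarely codons such as CCG and CGG would be expected to occur.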

Regulation of Transcription. The genome contains at least eight transcription factor B (TFB) and two TATA-box binding protein (TBP) genes (Table S3) required for site-specific transcription initiation, making N. maritimus among the densest and richest archaeal genomes for these transcription factors. TFB and TBP are thought to serve functions similar to the bacterial sigma factors (e.g., modulating cellular function in response to fluctuating environmental conditions) in Archaea with genomes coding for multiple copies, with optimal TFB/TBP partners (47). Although many other archaeal genomes contain multiple copies of these transcription factors, only the haloarchaea have more than five TFB genes (47). The functional significance of this exceptionally high density of regulatory factors in an apparently metabolically specialized organism will likely be informed by future transcriptional analyses of different growth states. Genes for two widely distributed types of archaeal chromatin proteins are present, an archaeal histone (Nmar_0683) and two Alba genes (Nmar_0255 and Nmar_0933). These are thought to maintain chromosomal material in a state permitting polymerase accessibility, with differential expression possibly providing for altered global transcription (48).

Unique Cell Division Machinery and Previously Uncharacterized Instance of Archaeal Biosynthesis of Hydroxyectoine. The N. maritimus genome contains elements homologous with two systems of cell division: ftsZ (Nmar_1262) and cdvABC (Nmar_0700, _0816, and _1088). The cdvABC operon, induced at the onset of genome segregation and cell division, codes for machinery related to the eukaryotic endosomal sorting complex (49, 50). With the exception of N. maritimus, C. symbiosum, and the Thermoproteales (where the cell division machinery remains uncharacterized), all available archaeal genomes have either the FtsZ or the Cdv cell division machinery, but not both. The two cell division systems may comprise a hybrid mechanism or serve two distinct processes in marine Crenarchaeota.

The genome of N. maritimus also encodes enzymes for synthesis of the compatible solute hydroxyectoine: ectoine synthase (ectC, Nmar_1344), an L-2,4-diaminobutyrate transaminase (ectB, Nmar_1345), an L-2,4-diaminobutyrate acetyltransferase (ectA, Nmar_1346), and an ectoine hydroxylase (ectD, Nmar_1343). Although widely distributed among Bacteria (in particular the genome sequences of marine organisms), the presence of these genes in N. maritimus represents a unique indication of archaeal biosynthesis.

Relationship to C. symbiosum. The genome of N. maritimus differs significantly in G + C content and size from that of the closely related sponge symbiont (~97% 16S rRNA gene sequence identity). Despite the differences in overall genomic G + C content (34.2% for N. maritimus versus 57.7% for C. symbiosum), the G + C content

Fig. 2. Synteny plots comparing the N. maritimus genome with (A) the Cenarchaeum symbiosum A type genome, (B) crenarchaeal genome fragments, and (C) Sargasso Sea fosmids. Vertical gray bar indicates location of ribosomal RNA operon.

4 of 6 | www.pnas.org/cgi/doi/10.1073/pnas.0913533107 Walker et al.


for the rRNAs is largely similar in both organisms (50–52%). The higher ORF density (1.19 ORF/kb) relative to C. symbiosum (0.986 ORF/kb) results principally from the 0.4 Mbp smaller genome of N. maritimus, not from a large disparity in the number of predicted ORFs. The two genomes share 1,267 genes (when compared via reciprocal BLAST with expectation cutoff values of 10⁻⁴), yet there is little conservation of synteny (Fig. 2A, Table S4). Most of the increased size of the C. symbiosum genome and much of the divergence in gene content are associated with discrete regions (“islands”), although no obvious functionality could be assigned to individual islands. Homologs for a majority of genes implicated in the archaeal ammonia oxidation pathway (51 of 69 listed in Table S2) appear present in C. symbiosum.
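The shared-gene count above rests on reciprocal BLAST comparison with an expectation cutoff. As an illustration only, the sketch below shows one common way to call reciprocal best hits from two sets of already-parsed BLAST results in Python. The `Hit` record, the function names, and the identifiers `N1`, `C1`, and so on are hypothetical; ranking hits by bit score is an assumption, since the text does not state how the best hit was chosen.

```python
from typing import Dict, Iterable, NamedTuple, Set, Tuple


class Hit(NamedTuple):
    """One parsed BLAST alignment (hypothetical minimal record)."""
    query: str
    subject: str
    evalue: float
    bitscore: float


def best_hits(hits: Iterable[Hit], max_evalue: float = 1e-4) -> Dict[str, str]:
    """Map each query to its top-scoring subject among hits passing the cutoff."""
    best: Dict[str, Hit] = {}
    for h in hits:
        if h.evalue > max_evalue:
            continue  # fails the expectation cutoff
        if h.query not in best or h.bitscore > best[h.query].bitscore:
            best[h.query] = h
    return {q: h.subject for q, h in best.items()}


def reciprocal_best_hits(a_vs_b: Iterable[Hit], b_vs_a: Iterable[Hit],
                         max_evalue: float = 1e-4) -> Set[Tuple[str, str]]:
    """Pairs (a, b) where b is the best hit of a and a is the best hit of b."""
    fwd = best_hits(a_vs_b, max_evalue)
    rev = best_hits(b_vs_a, max_evalue)
    return {(a, b) for a, b in fwd.items() if rev.get(b) == a}


# Toy data with made-up identifiers: N* for one genome, C* for the other.
n_vs_c = [Hit("N1", "C1", 1e-50, 200.0), Hit("N1", "C2", 1e-10, 60.0),
          Hit("N2", "C2", 1e-3, 30.0)]  # N2 fails the 1e-4 cutoff
c_vs_n = [Hit("C1", "N1", 1e-48, 195.0), Hit("C2", "N1", 1e-9, 58.0)]
pairs = reciprocal_best_hits(n_vs_c, c_vs_n)
print(sorted(pairs))
```

An additional alignment-coverage requirement could be applied as a further filter on each hit before ranking.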

Phylogeny and Evolution. The widely distributed Group I archaeal lineage with which N. maritimus is affiliated was earlier assigned to the hyperthermophilic Crenarchaeota (3). Questions regarding this association arose through phylogenetic analysis of C. symbiosum ribosomal proteins, indicating possible divergence before the Crenarchaeota–Euryarchaeota split and therefore deserving provisional assignment to a new archaeal kingdom, the Thaumarchaeota (51). The basal position of the Group I archaea previously inferred from protein sequences encoded by the C. symbiosum genome was reexamined by reanalysis of the combined dataset, using patterns of gene distribution (Table S5) and phylogeny inference. Maximum-likelihood analyses confirmed the basal branching with significant statistical support (bootstrap value = 90%, Fig. S4). Bayesian analysis of a selection of species from the same dataset produced results linking C. symbiosum and N. maritimus as sister taxa of Crenarchaeota, albeit with nonsignificant support (posterior probability = 0.88). Although a definitive placement within the Archaea still must be confirmed by inclusion of genomes from more distantly related lineages, both analyses strongly support a lineage distinct from all other cultivated Crenarchaeota.

High Similarity to the Metagenome of the Globally Distributed Marine Group I Archaea. The genome of N. maritimus shares remarkable conservation of gene content and gene order with numerous archaeal sequences previously recovered in fosmid libraries and recent oceanic surveys. The Antarctic genome fragments DeepAnt-EC39 (taken from 500 m depth) (52) and cosmid 74A4 (from surface waters) (53) both share very high synteny with portions of the N. maritimus genome (Fig. 2B and Table S6) despite significant differences in rRNA sequence identity (93 and 98%, respectively). Sixteen Sargasso Sea contigs (93% 16S rRNA gene sequence identity with N. maritimus) also have high synteny (Fig. 2C and Table S6). Retrieval of DNA fragments from the Global Ocean Sampling (GOS) database using differential protein sequence similarity showed that N. maritimus-like sequences constituted an average of 1.15% of all sequences across a wide range of temperatures (9–29 °C), salinities (freshwater to seawater, 0.1–63 practical salinity units), two open oceans, and several coastal environments (Fig. 3A). The Block Island, NY, coastal site and the Lake Gatun, Panama Canal, site (neither of which share any notable physical/chemical characteristics other than sample depth) both exhibited notably large increases in density (>2.5%). The available GOS sequences provided almost complete and uniform coverage of the N. maritimus genome (Fig. 3B), although at least three significant gaps in coverage exist (possibly corresponding with unique N. maritimus genomic islands). Whereas some of the coverage may result from matches to bacterial sequences, particularly for very highly conserved proteins, the majority of recruited proteins had >50% sequence identity to N. maritimus proteins. Together, these shared genomic features suggest N. maritimus is representative of many of the globally abundant marine Group I Crenarchaeota and that it should provide a useful model for developing an understanding of the basic physiology of these abundant and cosmopolitan organisms.

A comparison of the sequence content and genome organization hints at functionally more divergent marine population types. N. maritimus has limited syntenic similarity to a deep-water population represented by the North Pacific fosmid 4B7 (93% 16S rRNA sequence identity), but shares proteins highly similar to most of those coded on this fosmid. Previous comparison of marine crenarchaeal genomic fragments reported changes in genomic organization with sampling depths, suggestive of depth-related habitat types (54). Coupled with recent evidence indicating varied physiological lifestyles along depth and latitudinal gradients, distinct crenarchaeal ecotypes likely exist, analogous to those observed in marine cyanobacteria (2, 55). However, no clear correlations currently exist between environmental parameters and the similarity of recruited fragments.

The genome sequence presented here also offers further insight into the ecological success of AOA. For example, using the likely more energy-efficient 3-hydroxypropionate/4-hydroxybutyrate pathway for CO2 fixation rather than the Calvin–Bassham–Benson cycle used by AOB could provide a growth advantage. Further ecological advantage may be conferred by their potential capacity for mixotrophic growth or the use of copper as a major redox-active metal for respiration in generally iron-limited oceans. However, a deeper understanding of the remarkable success of this archaeal lineage will come only from more detailed physiological, biochemical, and genetic characterization of N. maritimus and additional environmental isolates.

Materials and Methods

Genome sequencing was performed on high-molecular-weight DNA extracted from two cultures of N. maritimus. Whole-genome shotgun sequencing of 3-, 8-, and 40-kb DNA libraries by the Joint Genome Institute produced at least 8× coverage. Annotation of the closed genome was performed using Department of Energy (DOE) computational support at Oak Ridge National Laboratory and The Institute for Genomic Research (TIGR) Autoannotation Service in conjunction with Manatee visualization software. Complete details describing high-molecular-weight DNA purification and sequence analysis are found in SI Materials and Methods.

ACKNOWLEDGMENTS. The authors thank David Bruce and Paul Richardson from the Joint Genome Institute for facilitating genome sequencing. This

Fig. 3. Comparison of N. maritimus genome to metagenomic sequence reads from GOS. (A) Percentage of reads from each GOS site that align to the N. maritimus genome by protein sequence similarity. (B) Number of GOS reads homologous to each N. maritimus protein-coding gene. Counts of 40 are mostly due to highly conserved proteins that include contaminants from other clades.

Walker et al. PNAS Early Edition | 5 of 6



work was supported by the Department of Energy Microbial Genome Program, by National Science Foundation Microbial Interactions and Processes Grant MCB-0604448 (to D.A.S. and J.R.d.l.T.), by National Science Foundation Molecular and Cellular Biosciences Grant MCB-0920741 (to D.A.S.), by National Science Foundation Biological Oceanography Grants OCE-0623174 (to D.A.S.) and OCE-0623908 (to S.M.S.), by National Science Foundation Grant EF-0412129 (to M.G.K.), by incentive funds from the University of Louisville VP Research office (to M.G.K.), by the Deutsche Forschungsgemeinschaft (M.K.), by US Department of Agriculture Grant 2010-65115-20380 (to A.C.R.), and by a Salk Institute Innovation Grant (to G.M.).

1. Karner MB, DeLong EF, Karl DM (2001) Archaeal dominance in the mesopelagic zone of the Pacific Ocean. Nature 409:507–510.

2. Agogué H, Brink M, Dinasquet J, Herndl GJ (2008) Major gradients in putatively nitrifying and non-nitrifying Archaea in the deep North Atlantic. Nature 456:788–791.

3. DeLong EF (1992) Archaea in coastal marine environments. Proc Natl Acad Sci USA 89:5685–5689.

4. Fuhrman JA, McCallum K, Davis AA (1992) Novel major archaebacterial group from marine plankton. Nature 356:148–149.

5. Venter JC, et al. (2004) Environmental genome shotgun sequencing of the Sargasso Sea. Science 304:66–74.

6. Wuchter C, et al. (2006) Archaeal nitrification in the ocean. Proc Natl Acad Sci USA 103:12317–12322.

7. Könneke M, et al. (2005) Isolation of an autotrophic ammonia-oxidizing marine archaeon. Nature 437:543–546.

8. Leininger S, et al. (2006) Archaea predominate among ammonia-oxidizing prokaryotes in soils. Nature 442:806–809.

9. Prosser JI, Nicol GW (2008) Relative contributions of archaea and bacteria to aerobic ammonia oxidation in the environment. Environ Microbiol 10:2931–2941.

10. Prosser JI (1989) Autotrophic nitrification in bacteria. Adv Microb Physiol 30:125–181.

11. de la Torre JR, Walker CB, Ingalls AE, Könneke M, Stahl DA (2008) Cultivation of a thermophilic ammonia oxidizing archaeon synthesizing crenarchaeol. Environ Microbiol 10:810–818.

12. Hatzenpichler R, et al. (2008) A moderately thermophilic ammonia-oxidizing crenarchaeote from a hot spring. Proc Natl Acad Sci USA 105:2134–2139.

13. Martens-Habbena W, Berube PM, Urakawa H, de la Torre JR, Stahl DA (2009) Ammonia oxidation kinetics determine niche separation of nitrifying Archaea and Bacteria. Nature 461:976–979.

14. Sernova NV, Gelfand MS (2008) Identification of replication origins in prokaryotic genomes. Brief Bioinform 9:376–391.

15. Makarova KS, Sorokin AV, Novichkov PS, Wolf YI, Koonin EV (2007) Clusters of orthologous genes for 41 archaeal genomes and implications for evolutionary genomics of archaea. Biol Direct 2:33.

16. Schäfer G (2004) Respiration in Archaea and Bacteria, ed Zannoni D (Springer, Dordrecht, The Netherlands), pp 1–33.

17. Beaumont HJ, Lens SI, Westerhoff HV, van Spanning RJ (2005) Novel nirK cluster genes in Nitrosomonas europaea are required for NirK-dependent tolerance to nitrite. J Bacteriol 187:6849–6851.

18. Andersen CL, Matthey-Dupraz A, Missiakas D, Raina S (1997) A new Escherichia coli gene, dsbG, encodes a periplasmic protein involved in disulphide bond formation, required for recycling DsbA/DsbB and DsbC redox proteins. Mol Microbiol 26:121–132.

19. Meima R, et al. (2002) The bdbDC operon of Bacillus subtilis encodes thiol-disulfide oxidoreductases required for competence development. J Biol Chem 277:6994–7001.

20. Raina S, Missiakas D (1997) Making and breaking disulfide bonds. Annu Rev Microbiol 51:179–202.

21. Hiniker A, Collet JF, Bardwell JC (2005) Copper stress causes an in vivo requirement for the Escherichia coli disulfide isomerase DsbC. J Biol Chem 280:33785–33791.

22. Pogliano J, Lynch AS, Belin D, Lin EC, Beckwith J (1997) Regulation of Escherichia coli cell envelope proteins involved in protein folding and degradation by the Cpx two-component system. Genes Dev 11:1169–1182.

23. Kershaw CJ, Brown NL, Constantinidou C, Patel MD, Hobman JL (2005) The expression profile of Escherichia coli K-12 in response to minimal, optimal and excess copper concentrations. Microbiology 151:1187–1198.

24. Narindrasorasak S, Yao P, Sarkar B (2003) Protein disulfide isomerase, a multifunctional protein chaperone, shows copper-binding activity. Biochem Biophys Res Commun 311:405–414.

25. Sliskovic I, Raturi A, Mutus B (2005) Characterization of the S-denitrosation activity of protein disulfide isomerase. J Biol Chem 280:8733–8741.

26. Ramachandran N, Root P, Jiang XM, Hogg PJ, Mutus B (2001) Mechanism of transfer of NO from extracellular S-nitrosothiols into the cytosol by cell-surface protein disulfide isomerase. Proc Natl Acad Sci USA 98:9539–9544.

27. Nicol GW, Tscherko D, Chang L, Hammesfahr U, Prosser JI (2006) Crenarchaeal community assembly and microdiversity in developing soils at two sites associated with deglaciation. Environ Microbiol 8:1382–1393.

28. Lieberman RL, Rosenzweig AC (2005) Crystal structure of a membrane-bound metalloenzyme that catalyses the biological oxidation of methane. Nature 434:177–182.

29. Klotz MG, Stein LY (2008) Nitrifier genomics and evolution of the nitrogen cycle. FEMS Microbiol Lett 278:146–156.

30. Fukuto JM, Switzer CH, Miranda KM, Wink DA (2005) Nitroxyl (HNO): Chemistry, biochemistry, and pharmacology. Annu Rev Pharmacol Toxicol 45:335–355.

31. Miranda KM, et al. (2003) A biochemical rationale for the discrete behavior of nitroxyl and nitric oxide in the cardiovascular system. Proc Natl Acad Sci USA 100:9196–9201.

32. Arp DJ, Chain PS, Klotz MG (2007) The impact of genome analyses on our understanding of ammonia-oxidizing bacteria. Annu Rev Microbiol 61:503–528.

33. Berg IA, Kockelkorn D, Buckel W, Fuchs G (2007) A 3-hydroxypropionate/4-hydroxybutyrate autotrophic carbon dioxide assimilation pathway in Archaea. Science 318:1782–1786.

34. Hallam SJ, et al. (2006) Genomic analysis of the uncultivated marine crenarchaeote Cenarchaeum symbiosum. Proc Natl Acad Sci USA 103:18296–18301.

35. Hallam SJ, et al. (2006) Pathways of carbon assimilation and ammonia oxidation suggested by environmental genomic analyses of marine Crenarchaeota. PLoS Biol 4:e95.

36. Alber B, et al. (2006) Malonyl-coenzyme A reductase in the modified 3-hydroxypropionate cycle for autotrophic carbon fixation in archaeal Metallosphaera and Sulfolobus spp. J Bacteriol 188:8551–8559.

37. Alber BE, Kung JW, Fuchs G (2008) 3-Hydroxypropionyl-coenzyme A synthetase from Metallosphaera sedula, an enzyme involved in autotrophic CO2 fixation. J Bacteriol 190:1383–1389.

38. Hügler M, Huber H, Molyneaux SJ, Vetriani C, Sievert SM (2007) Autotrophic CO2 fixation via the reductive tricarboxylic acid cycle in different lineages within the phylum Aquificae: Evidence for two ways of citrate cleavage. Environ Microbiol 9:81–92.

39. Quinn JP, Kulakova AN, Cooley NA, McGrath JW (2007) New ways to break an old bond: The bacterial carbon-phosphorus hydrolases and their role in biogeochemical phosphorus cycling. Environ Microbiol 9:2392–2400.

40. Herndl GJ, et al. (2005) Contribution of Archaea to total prokaryotic production in the deep Atlantic Ocean. Appl Environ Microbiol 71:2303–2309.

41. Ingalls AE, et al. (2006) Quantifying archaeal community autotrophy in the mesopelagic ocean using natural radiocarbon. Proc Natl Acad Sci USA 103:6442–6447.

42. Ouverney CC, Fuhrman JA (2000) Marine planktonic archaea take up amino acids. Appl Environ Microbiol 66:4829–4833.

43. Chan PP, Lowe TM (2009) GtRNAdb: A database of transfer RNA genes detected in genomic sequence. Nucleic Acids Res 37 (Database issue):D93–D97.

44. Grosjean H, Marck C, de Crécy-Lagard V (2007) The various strategies of codon decoding in organisms of the three domains of life: Evolutionary implications. Nucleic Acids Symp Ser (Oxf) 51:15–16.

45. Dennis PP, Omer A, Lowe T (2001) A guided tour: Small RNA function in Archaea. Mol Microbiol 40:509–519.

46. Maxwell ES, Fournier MJ (1995) The small nucleolar RNAs. Annu Rev Biochem 64:897–934.

47. Facciotti MT, et al. (2007) General transcription factor specified global gene regulation in archaea. Proc Natl Acad Sci USA 104:4630–4635.

48. Sandman K, Reeve JN (2005) Archaeal chromatin proteins: Different structures but common function? Curr Opin Microbiol 8:656–661.

49. Lindås AC, Karlsson EA, Lindgren MT, Ettema TJ, Bernander R (2008) A unique cell division machinery in the Archaea. Proc Natl Acad Sci USA 105:18942–18946.

50. Samson RY, Obita T, Freund SM, Williams RL, Bell SD (2008) A role for the ESCRT system in cell division in archaea. Science 322:1710–1713.

51. Brochier-Armanet C, Boussau B, Gribaldo S, Forterre P (2008) Mesophilic Crenarchaeota: Proposal for a third archaeal phylum, the Thaumarchaeota. Nat Rev Microbiol 6:245–252.

52. Stein JL, Marsh TL, Wu KY, Shizuya H, DeLong EF (1996) Characterization of uncultivated prokaryotes: Isolation and analysis of a 40-kilobase-pair genome fragment from a planktonic marine archaeon. J Bacteriol 178:591–599.

53. Béjà O, et al. (2002) Comparative genomic analysis of archaeal genotypic variants in a single population and in two different oceanic provinces. Appl Environ Microbiol 68:335–345.

54. López-García P, Brochier C, Moreira D, Rodríguez-Valera F (2004) Comparative analysis of a genome fragment of an uncultivated mesopelagic crenarchaeote reveals multiple horizontal gene transfers. Environ Microbiol 6:19–34.

55. Johnson ZI, et al. (2006) Niche partitioning among Prochlorococcus ecotypes along ocean-scale environmental gradients. Science 311:1737–1740.



Supporting Information

Walker et al. 10.1073/pnas.0913533107

SI Materials and Methods

High-Molecular-Weight Genomic DNA Preparation. Two cultures of Nitrosopumilus maritimus strain SCM1 were grown as previously described in 500 mL of media in 1-L flasks (1). Cells from both cultures were harvested in late-exponential phase using Sterivex filters for one culture and a 0.1-µm filter for the other. High-molecular-weight DNA was isolated as previously described using either agarose plugs (2) or a modified guanidinium thiocyanate protocol (3). Cells from the Sterivex filter were resuspended in 1 mL of 2× STE buffer [1 M NaCl, 0.1 M EDTA (pH 8.0), 10 mM Tris (pH 8.0)], extracted from the filter, and mixed with 1 vol of 1% molten SeaPlaque LMP agarose (FMC). The mixture was cooled to 40 °C, immediately drawn into a 1-mL syringe, and placed on ice for 10 min. The agarose plug was mixed with 10 mL of lysis buffer, incubated at 37 °C for 1 h, and then transferred to 40 mL of ESP buffer (1% Sarkosyl–1 mg of Proteinase K per ml in 0.5 M EDTA). After incubation at 55 °C for 16 h, the solution was replaced with fresh ESP buffer and incubated at 55 °C for another hour. DNA was purified using phenol:chloroform:isoamyl alcohol (24:24:1) and recovered by precipitation with isopropanol.

Cells collected on the 0.1-µm filter were resuspended in 100 µL of Tris-EDTA (pH 8.0) and 100 mg/mL lysozyme before incubating at 37 °C for 30 min. Then, 3.0 mL of a solution containing 5 M guanidinium thiocyanate, 100 mM EDTA (pH 8.0), and 0.5% (vol/vol) sarkosyl was added. The solution was mixed gently for 15 min before being cooled on ice for 10 min. After cooling, an equal volume of cold 7.5 M ammonium acetate was added, and the solution was mixed gently and cooled on ice. Purification and precipitation of DNA were performed with chloroform:isoamyl alcohol (24:1) and isopropanol.

Genome Sequencing. A completely sequenced and closed genome of N. maritimus was obtained through collaboration with the Joint Genome Institute. Whole-genome shotgun sequencing of 3-, 8-, and 40-kb DNA libraries produced at least 8× coverage of the entire genome. Specifics of clone library generation, sequencing, and assembly strategies may be found at the DOE JGI website (www.jgi.doe.gov/sequencing/index.html).

Genome Sequence Analysis. Autoannotation of the closed genome sequence was performed by both the Computational Biology group at Oak Ridge National Laboratory (http://genome.ornl.gov/microbial/nmar/02jul07/) and the TIGR Autoannotation Service (now hosted by JCVI; details available from http://www.jcvi.org/cms/research/projects/annotation-service/overview/). The genome visualization software Manatee (release 2.4.1; latest version available from http://manatee.sourceforge.net/) was used for manual curation. Analysis of potential transporter genes was performed using the Transporter Automatic Annotation Pipeline (TransAAP) through the TransportDB genomic comparison tool (membranetransport.org).

Direct comparisons with the C. symbiosum genome and genome fragments were performed using the Artemis Comparison Tool (3) with a comparison library generated through WebACT (www.webact.org/WebACT/home). Orthologous genes shared between these two organisms were identified through reciprocal BLAST searches, with an expectation cutoff value of 10⁻⁴ and a minimum of 75% alignable N. maritimus sequence. Comparisons with the Global Ocean Sampling (GOS) and Sargasso Sea metagenomic datasets were performed using several single-copy universal archaeal genes to determine a count of ~15 archaeal genomes in the GOS dataset. An initial set of candidate N. maritimus-like proteins was found by BLASTP searches with each N. maritimus protein-coding gene, using a cutoff of 100 hits and a maximum expected value of e = 10. This set consisted of 125,326 peptides drawn from 107,223 scaffolds. Neighboring ORFs on any scaffold with two or more hits were added to make a total of 319,585 peptides, amounting to 5.2% of all ORFs in GOS. These sequences were filtered by BLAST alignment to four sequence datasets, containing 24 complete proteomes from diverse euryarchaeota, crenarchaeota, and bacteria. Sequences scoring more highly to N. maritimus and/or C. symbiosum proteins than any other entry in these datasets were retained, giving a filtered dataset of 21,278 ORFs. Coverage of the N. maritimus proteome was measured by bidirectional BLASTP of N. maritimus proteins against the filtered dataset to assign putative orthologs. The average coverage was ~11×, although some highly conserved genes had a much greater number of hits, probably due to recruitment of nonarchaeal homologs: 48 ORFs had >30 hits, of which most were highly conserved. Searches for CRISPR regions were performed using the Java-based CRISPR recognition tool (CRT) with least stringent settings (4).
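The filtering step described in this section, which retains a metagenomic ORF only when it scores more highly against N. maritimus and/or C. symbiosum proteins than against any entry in the reference proteomes, can be sketched as a simple score comparison. The Python below is an illustrative assumption about how such a filter might be written, not the pipeline actually used; the function names, the dictionary inputs, and the ORF identifiers are hypothetical.

```python
from typing import Dict, List


def retain_target_like(target_best: Dict[str, float],
                       reference_best: Dict[str, float]) -> List[str]:
    """Return ORFs whose best score against the target proteomes strictly
    exceeds their best score against any reference proteome.

    target_best: ORF id -> best BLAST score against the target proteins.
    reference_best: ORF id -> best BLAST score against all other proteomes;
        an ORF with no reference hit is treated as scoring minus infinity.
    """
    kept = []
    for orf, score in target_best.items():
        if score > reference_best.get(orf, float("-inf")):
            kept.append(orf)
    return sorted(kept)


def mean_hits_per_protein(hits_per_protein: Dict[str, int]) -> float:
    """Average number of recruited ORFs per protein (a simple coverage figure)."""
    if not hits_per_protein:
        return 0.0
    return sum(hits_per_protein.values()) / len(hits_per_protein)


# Made-up scores for three hypothetical metagenomic ORFs.
target = {"orf_a": 310.0, "orf_b": 95.0, "orf_c": 150.0}
reference = {"orf_a": 120.0, "orf_b": 240.0}  # orf_c has no reference hit
print(retain_target_like(target, reference))
```

Here `orf_b` is discarded because a reference proteome matches it better, which is the behavior intended to exclude nonarchaeal homologs of highly conserved proteins.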

Maximum-Likelihood and Bayesian Trees of the Archaeal Domain. The trees are based on the concatenation of ribosomal proteins used by Brochier-Armanet et al. (5) but including sequences from N. maritimus and from “Candidatus Korarchaeum cryptofilum” OPF8. Sequences were aligned using MUSCLE (6). Resulting alignments were visually inspected and improved with the MUST software (7). Regions where homology between sites was doubtful were removed from further phylogenetic analyses. A total of 6,142 positions were kept for the phylogenetic analyses. The maximum-likelihood tree was computed with PHYML, using the WAG model corrected by a gamma law to take into account among-site variation in evolutionary rate (8). The parameter alpha of the gamma distribution and the proportion of invariable sites were estimated from the dataset. The robustness of each branch was estimated by the bootstrap procedure implemented in PHYML. A Bayesian tree analysis on a subset of 29 taxa was performed using MrBayes 3.2 (9) with a mixed model of amino acid substitution and a gamma distribution (eight discrete categories and an estimated proportion of invariant sites) to take into account among-site rate variation. MrBayes was run with four chains for 1 million generations and trees were sampled every 100 generations. To construct the consensus tree, the first 1,500 trees were discarded as “burn-in.” The reduction of the taxonomic sampling was necessary to reduce the computation time.
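The sampling settings quoted for the Bayesian analysis imply a specific burn-in proportion. The short Python calculation below works it out from the stated numbers (1 million generations, sampling every 100 generations, first 1,500 trees discarded). The helper name is illustrative, and the count assumes the starting tree is not included in the sample; if it is, the totals shift by one.

```python
def sampled_trees(generations: int, sample_every: int) -> int:
    """Number of trees sampled, not counting the starting state (assumption)."""
    return generations // sample_every


total = sampled_trees(1_000_000, 100)   # trees written per run
burn_in = 1_500                         # trees discarded, as stated in the text
retained = total - burn_in              # trees contributing to the consensus
burn_in_fraction = burn_in / total

print(f"sampled: {total}, retained: {retained}, burn-in: {burn_in_fraction:.0%}")
```

Under these assumptions the consensus tree summarizes 8,500 of 10,000 sampled trees, a burn-in of 15%.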

1. Könneke M, et al. (2005) Isolation of an autotrophic ammonia-oxidizing marine archaeon. Nature 437:543–546.

2. Stein LY, et al. (2007) Whole-genome analysis of the ammonia-oxidizing bacterium, Nitrosomonas eutropha C91: Implications for niche adaptation. Environ Microbiol 9:2993–3007.

3. Carver TJ, et al. (2005) ACT: The Artemis comparison tool. Bioinformatics 21:3422–3423.

4. Bland C, et al. (2007) CRISPR recognition tool (CRT): A tool for automatic detection of clustered regularly interspaced palindromic repeats. BMC Bioinformatics 8:209.

5. Brochier-Armanet C, Boussau B, Gribaldo S, Forterre P (2008) Mesophilic Crenarchaeota: Proposal for a third archaeal phylum, the Thaumarchaeota. Nat Rev Microbiol 6:245–252.

6. Edgar RC (2004) MUSCLE: Multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res 32:1792–1797.

7. Philippe H (1993) MUST, a computer package of management utilities for sequences and trees. Nucleic Acids Res 21:5264–5272.

8. Guindon S, Gascuel O (2003) A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst Biol 52:696–704.

9. Ronquist F, Huelsenbeck JP (2003) MrBayes 3: Bayesian phylogenetic inference under mixed models. Bioinformatics 19:1572–1574.

Walker et al. www.pnas.org/cgi/content/short/0913533107 1 of 5


[Figure panel A: phylogenetic tree; scale bar, 0.5. Tree topology and branch support values are not recoverable from the extracted text. Tip labels:]

Thermoplasma volcanium GSS1 [rusticyanin; gi13542283]
Acidithiobacillus ferrooxidans [rusticyanin; gi3282064]
Nmar0167
Nmar0747
Thermobaculum terrenum ATCC BAA-798 [plastocyanin; gi227373287]
Sphaerobacter thermophilus DSM 20745 [plastocyanin; gi229876615]
Mycobacterium avium subsp. avium ATCC 25291 [copper-binding protein; gi254776845]
Mycobacterium avium 104 [copper-binding protein; gi118462975]
Mycobacterium intracellulare ATCC 13950 [hypothetical protein; gi254819204]
Catenulispora acidiphila DSM 44928 [blue (type 1) copper domain protein; gi256391720]
Streptomyces sp. AA4 [hypothetical protein; gi256667890]
Burkholderia cenocepacia PC184 [plastocyanin; gi254250279]
Methylocella silvestris BL2 [blue (type 1) copper domain protein; gi217976580]
Methylocella silvestris BL2 [blue (type 1) copper domain protein; gi217976580]
Nitrobacter hamburgensis X14 [blue (type 1) copper domain protein; gi92118183]
Methylobacterium nodulans ORS [blue (type 1) copper domain protein; gi220920817]
Chromobacterium violaceum ATCC 12472 [hypothetical protein; gi34498952]
Methanosarcina mazei Go1 [copper-binding protein; gi21228445]
Methanosarcina acetivorans C2A [plastocyanin/azurin family copper-binding protein; gi20090218]
Methanosarcina mazei Go1 [copper-binding protein; gi21228446]
Methanosarcina acetivorans C2A [plastocyanin/azurin family copper-binding protein; gi20090219]
C. symbiosum A [copper-binding protein; gi118575216]
Nmar0621
Nmar0361
Candidatus Koribacter versatilis [plastocyanin-like protein; gi94968815]
Nmar0902
Thermobaculum terrenum ATCC BAA-798 [plastocyanin; gi227373254]
Deinococcus deserti VCD115 [copper-binding protein; gi226356528]
Synechococcus sp. WH 8102 [plastocyanin; gi33866031]
Prochlorococcus marinus MIT 9312 [plastocyanin; gi78778966]
Anabaena variabilis ATCC 29413 [plastocyanin; gi75908957]
Gloeobacter violaceus PCC 7421 [plastocyanin; gi35212909]
Jannaschia sp. CCS1 [blue (type 1) copper domain protein; gi89056161]
Nmar0718
Bradyrhizobium japonicum USDA 110 [putative Amicyanin precursor; gi27378126]
Paracoccus denitrificans PD1222 [amicyanin; gi119387457]
Methanosarcina mazei Go1 [hypothetical protein; gi21226177]
Methanosarcina acetivorans C2A [hypothetical protein; gi20090537]
Nmar0190
Methanococcus maripaludis C7 [copper-binding protein; gi150402166]
Methanococcus maripaludis S2 [copper-binding protein; gi45358560]
Nmar1913
Nmar1789
Nmar0300
Nmar0762
C. symbiosum A [copper-binding protein; gi118576966]
C. symbiosum A [copper-binding protein; gi118576965]
Nmar0516
Nmar1087
C. symbiosum A [hypothetical protein; gi118575848]
C. symbiosum A [hypothetical protein; gi118575831]
uncultured marine Crenarchaeote [putative copper-binding protein; gi167044521]
uncultured marine Crenarchaeote [putative copper-binding protein; gi167042919]
uncultured marine Crenarchaeote [putative copper-binding protein; gi167045254]
C. symbiosum A [copper-binding protein; gi118575244]
Nmar0121
Nmar0732
Nmar0324
C. symbiosum A [copper-binding protein; gi118576967]
C. symbiosum A [copper-binding protein; gi118575724]
Nmar0273

0.1

Bacillus subtilis subsp. subtilis str. 168 [BdbD thiol-disulfide oxidoreductase; gi 16080401]
Bacillus amyloliquefaciens FZB42 [BdbD; gi 154687467]
Paenibacillus larvae subsp. larvae BRL-230010 [disulfide dehydrogenase D; gi 167461952]
Deinococcus geothermalis DSM 11300 [DsbA oxidoreductase; gi 94984799]
Thermobaculum terrenum ATCC BAA-798 [protein-disulfide isomerase; gi 227374421]
Gemmatimonas aurantiaca T-27 [putative oxidoreductase; gi 226228008]
Sphaerobacter thermophilus DSM 20745 [protein-disulfide isomerase; gi 229876989]
Thermomicrobium roseum DSM 5159 [DsbA oxidoreductase; gi 221635547]
Mycobacterium kansasii ATCC 12478 [DsbA oxidoreductase; gi 240169146]
Rubrobacter xylanophilus DSM 9941 [DsbA oxidoreductase; gi 108804655]
Streptomyces coelicolor A3(2) [hypothetical protein SCO5993; gi 21224330]
Streptomyces ghanaensis ATCC 14672 [hypothetical protein SghaA1_08748; gi 239928300]
Thermus aquaticus Y51MC23 [DsbA oxidoreductase; gi 218295130]
Thermobifida fusca YX [protein-disulfide isomerase; gi 72161925]
Stigmatella aurantiaca DW4/3-1 [conserved hypothetical protein; gi 115374845]
Solibacter usitatus Ellin6076 [DsbA oxidoreductase; gi 116625220]
Myxococcus xanthus DK 1622 [putative lipoprotein; gi 108762810]
Stigmatella aurantiaca DW4/3-1 [disulfide interchange protein; gi 115379912]
Anaeromyxobacter dehalogenans 2CP-1 [DsbA oxidoreductase; gi 220919173]
Solibacter usitatus Ellin6076 [DsbA oxidoreductase; gi 116624599]
Syntrophobacter fumaroxidans MPOB [DsbA oxidoreductase; gi 116751066]
Archaeoglobus fulgidus DSM 4304 [hypothetical protein AF1354; gi 11498950]
Cenarchaeum symbiosum A [protein-disulfide isomerase; gi 118575694]
Nmar0218
Nmar0740
Nmar0175
Nmar0752
Nmar0179
Natrialba magadii ATCC 43099 [DsbA oxidoreductase; gi 224823141]
Nmar1863
Cenarchaeum symbiosum A [protein-disulfide isomerase; gi 118576454]
Nmar1805
Cenarchaeum symbiosum A [protein-disulfide isomerase; gi 118576169]
Nmar0169
Nmar1590
Cenarchaeum symbiosum A [protein-disulfide isomerase; gi 118575427]
Nmar1606
Nmar0164
Symbiobacterium thermophilum IAM 14863 [hypothetical protein STH2058; gi 51893196]
Roseiflexus castenholzii DSM 13941 [DsbA oxidoreductase; gi 156741642]
Chloroflexus aggregans DSM 9485 [DsbA oxidoreductase; gi 219849651]
Chloroflexus aggregans DSM 9485 [DsbA oxidoreductase; gi 219847445]
Marinobacter algicola DG893 [DsbA oxidoreductase; gi 149377658]
Roseiflexus castenholzii DSM 13941 [DsbA oxidoreductase; gi 156743646]
Chloroflexus aurantiacus J-10-fl [DsbA oxidoreductase; gi 163848707]
Roseiflexus castenholzii DSM 13941 [protein-disulfide isomerase-like protein; gi 156741356]
Herpetosiphon aurantiacus ATCC 23779 [DsbA oxidoreductase; gi 159896786]
Chloroflexus aurantiacus J-10-fl [DsbA oxidoreductase; gi 163845898]
Chloroflexus aggregans DSM 9485 [DsbA oxidoreductase; gi 219850456]
Vibrio harveyi HY01 [DsbA oxidoreductase; gi 153832115]
Vibrio parahaemolyticus RIMD 2210633 [hypothetical protein VPA0994; gi 28900849]
Moritella sp. PE36 [putative membrane protein; gi 149908466]
Shewanella benthica KT99 [hypothetical protein KT99_10483; gi 163751466]


C.

Nmar0121

Nmar0167

Nmar0190

Nmar0273

Nmar0300

Nmar0324

Nmar0361

Nmar0516

Nmar0621

Nmar0718

Nmar0732

Nmar0747

Nmar0762

Nmar1087

Nmar1789

Nmar1913

100 aa

B.

Fig. S1. Phylogeny of plastocyanin-like protein sequences. Sequences with significant matches to COG3794 (PetE: Plastocyanin [Energy production and conversion]) were used to query the non-redundant protein sequence database from NCBI. Sequences from the top 20–30 non-N. maritimus hits were retrieved and their match to the above conserved domain model verified. Sequences from experimentally characterized proteins were obtained from the available literature and included. All sequences were aligned with ClustalW and then curated manually. Distance-based phylogenies were inferred in Phylip using the Neighbor-Joining algorithm and 100 bootstrap replicates. Bootstrap support values >60% are displayed. Nodes with <50% bootstrap support were collapsed.
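The resampling and display rule described in the Fig. S1 caption can be illustrated with a minimal Python sketch. It is not the Phylip workflow used for the figure: the function names and the toy alignment are hypothetical, and no tree is inferred here. It only shows how one bootstrap replicate is drawn (alignment columns resampled with replacement) and how the stated thresholds (values >60% displayed, nodes <50% collapsed) would be applied to a node's support.

```python
import random


def bootstrap_replicate(alignment, rng):
    """Return one bootstrap replicate: alignment columns resampled with replacement."""
    n_cols = len(next(iter(alignment.values())))
    picks = [rng.randrange(n_cols) for _ in range(n_cols)]
    return {name: "".join(seq[i] for i in picks) for name, seq in alignment.items()}


def node_treatment(support_percent):
    """Apply the caption's display rule to one node's bootstrap support (in %)."""
    if support_percent < 50:
        return "collapse"
    if support_percent > 60:
        return "display value"
    return "keep, no value shown"


# Hypothetical toy alignment (not sequences from the study).
toy_alignment = {"seqA": "MKTAYV", "seqB": "MKSAYV", "seqC": "MRTAFV"}

# 100 replicates, matching the number stated in the caption.
replicates = [bootstrap_replicate(toy_alignment, random.Random(seed)) for seed in range(100)]
treatments = {s: node_treatment(s) for s in (45, 55, 79, 100)}
```

In a full analysis each replicate would be passed to the distance and Neighbor-Joining steps, and a node's support would be the percentage of replicate trees that contain it.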

Walker et al. www.pnas.org/cgi/content/short/0913533107 2 of 5


Fig. S2. Archaeal ammonia monooxygenase AmoB sequence mapped onto the crystal structure of the particulate methane monooxygenase (PDB accession code 1YEW). The pmoA subunit is shown in dark gray, the pmoC subunit in light gray, and the pmoB subunit in red and pink. The red part represents the region of pmoB conserved in the predicted N. maritimus AmoB. The transmembrane helix and C-terminal cupredoxin domain shown in pink are missing in the predicted N. maritimus AmoB. Cyan spheres represent copper ions. The gray sphere represents a zinc ion.


Acetyl-CoA carboxylase (Nmar_0272, 0273, 0274)
Malonyl-CoA reductase / Malonate semialdehyde reductase (unknown)
3-Hydroxypropionyl-CoA synthetase / 3-Hydroxypropionyl-CoA dehydratase / Acryloyl-CoA reductase (unknown)
Propionyl-CoA carboxylase (Nmar_0272, 0273, 0274)
Methylmalonyl-CoA epimerase / Methylmalonyl-CoA mutase (Nmar_0953, 0954, 0958)
Succinyl-CoA reductase (Nmar_1608)
Succinate semialdehyde reductase (Nmar_1110 or Nmar_0161)
4-Hydroxybutyryl-CoA synthetase (Nmar_0206)
4-Hydroxybutyryl-CoA dehydratase (Nmar_0207)
Crotonyl-CoA hydratase (Nmar_1308)
3-Hydroxybutyryl-CoA dehydrogenase (Nmar_1028)
Acetoacetyl-CoA β-ketothiolase (Nmar_0841 or Nmar_1631)

Fig. S3. Proposed 3-hydroxypropionate/4-hydroxybutyrate cycle for autotrophic carbon fixation by N. maritimus.


Other Supporting Information Files

Table S1 (DOC)
Table S2 (DOC)
Table S3 (DOC)
Table S4 (DOC)
Table S5 (DOC)
Table S6 (DOC)

[Fig. S4 tree labels. Scale bars: 0.1 in both panels. Node support values are not reproduced here.]

A. Taxa: Cenarchaeum symbiosum, Candidatus Nitrosopumilus maritimus, Candidatus Korarchaeum cryptofilum, Thermofilum pendens, Caldivirga maquilingensis, Pyrobaculum calidifontis, Pyrobaculum islandicum, Pyrobaculum aerophilum, Pyrobaculum arsenaticum, Ignicoccus hospitalis, Staphylothermus marinus, Aeropyrum pernix, Hyperthermus butylicus, Metallosphaera sedula, Sulfolobus solfataricus, Sulfolobus acidocaldarius, Sulfolobus tokodaii, Nanoarchaeum equitans, Thermococcus gammatolerans, Thermococcus kodakarensis, Pyrococcus furiosus, Pyrococcus abyssi, Pyrococcus horikoshii, Methanopyrus kandleri, Methanosphaera stadtmanae, Methanothermobacter thermautotrophicus, Methanocaldococcus jannaschii, Methanococcus aeolicus, Methanococcus maripaludis, Methanococcus vannielii, Picrophilus torridus, Ferroplasma acidarmanus, Thermoplasma acidophilum, Thermoplasma volcanium, Archaeoglobus fulgidus, Halobacterium sp., Natronomonas pharaonis, Haloarcula marismortui, Halorubrum lacusprofundi, Haloquadratum walsbyi, Haloferax volcanii, Methanocorpusculum labreanum, Methanoculleus marisnigri, Candidatus Methanoregula, Methanospirillum hungatei, Methanosaeta thermophila, Methanococcoides burtonii, Methanosarcina barkeri, Methanosarcina acetivorans, Methanosarcina mazei, Giardia lamblia, Entamoeba histolytica, Leishmania major, Trypanosoma cruzi, Trypanosoma brucei, Cryptosporidium parvum, Theileria parva, Plasmodium yoelii, Plasmodium falciparum, Arabidopsis thaliana, Oryza sativa, Dictyostelium discoideum, Homo sapiens, Anopheles gambiae, Saccharomyces cerevisiae, Schizosaccharomyces pombe.

B. Taxa: Saccharomyces cerevisiae, Dictyostelium discoideum, Oryza sativa, Candidatus Korarchaeum cryptofilum, Candidatus Nitrosopumilus maritimus, Cenarchaeum symbiosum, Pyrobaculum aerophilum, Thermofilum pendens, Aeropyrum pernix, Hyperthermus butylicus, Sulfolobus acidocaldarius, Metallosphaera sedula, Nanoarchaeum equitans, Thermococcus kodakarensis, Pyrococcus abyssi, Methanopyrus kandleri, Methanothermobacter thermautotrophicus, Methanosphaera stadtmanae, Methanocaldococcus jannaschii, Methanococcus maripaludis, Thermoplasma volcanium, Ferroplasma acidarmanus, Archaeoglobus fulgidus, Haloarcula marismortui, Natronomonas pharaonis, Methanosarcina mazei, Methanosaeta thermophila, Methanospirillum hungatei, Methanocorpusculum labreanum.

Clade labels (both panels): Eucarya; Thaumarchaeota; Korarchaeota; Crenarchaeota (Sulfolobales, Desulfurococcales, Thermoproteales); Nanoarchaeota; Euryarchaeota (Thermococcales, Methanopyrales, Methanobacteriales, Methanococcales, Thermoplasmatales, Archaeoglobales, Halobacteriales, Methanomicrobiales, Methanosarcinales).

Fig. S4. (A) Maximum-likelihood phylogeny of Group 1 Archaea. The phylogeny was inferred using an alignment of concatenated R-proteins (66 taxa, 6,142 positions). WAG+Inv+Gamma (4 classes); 100 replicates. (B) Bayesian tree of mesophilic Group 1 Archaea inferred using an alignment of concatenated R-proteins (29 taxa, 6,142 positions). Mixed model + Gamma (4 classes); 100 replicates.
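As an illustration of what "an alignment of concatenated R-proteins" means in practice, the following Python sketch joins per-protein alignments into one supermatrix with one row per taxon. The function name, the taxon labels, and the toy sequences are hypothetical. Padding a taxon that lacks a protein with gap characters is an assumption made for this sketch; the caption does not state how missing proteins were handled.

```python
def concatenate_alignments(alignments, taxa, gap="-"):
    """Concatenate per-protein alignments into one row per taxon.

    Each alignment maps taxon name -> aligned sequence (equal lengths within
    one alignment). A taxon missing from an alignment is padded with gaps
    (an assumption of this sketch, not a documented step of the study).
    """
    parts = {taxon: [] for taxon in taxa}
    for alignment in alignments:
        width = len(next(iter(alignment.values())))
        for taxon in taxa:
            parts[taxon].append(alignment.get(taxon, gap * width))
    return {taxon: "".join(chunks) for taxon, chunks in parts.items()}


# Hypothetical toy data: two short "protein" alignments over three taxa.
taxa = ["taxon1", "taxon2", "taxon3"]
protein_x = {"taxon1": "MKV", "taxon2": "MRV", "taxon3": "MKI"}
protein_y = {"taxon1": "GA", "taxon3": "GS"}  # taxon2 lacks this protein

supermatrix = concatenate_alignments([protein_x, protein_y], taxa)
```

Every row of the result has the same number of positions, which is the property that lets a single tree be inferred from all proteins at once.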


Supplemental Table 1: Number and genomic density of genes grouped as clusters of orthologous genes (COGs).

Columns, in order: (1) Nitrosopumilus maritimus SCM1; (2) Nitrosococcus oceani ATCC 19707; (3) Nitrosomonas europaea ATCC 19718; (4) Nitrosomonas eutropha C71; (5) Nitrosospira multiformis ATCC 25196; (6) Pelagibacter ubique HTCC1062; (7) Prochlorococcus species; (8) Synechococcus species; (9) Archaeoglobus fulgidus DSM 4304; (10) Halobacterium sp. NRC-1; (11) Methanococcus maripaludis S2; (12) Thermoplasma acidophilum DSM 1728; (13) Pyrococcus species; (14) Sulfolobus species.

Genome size (Mb): 1.65; 3.52; 2.81; 2.78; 3.23; 1.31; 1.64-2.68; 2.22-3.05; 2.18; 2.57; 1.66; 1.56; 1.74-1.91; 2.23-2.99
ORFs: 1,997; 3,186; 2,628; 2,578; 2,827; 1,389; 1,901-3,152; 2,580-3,401; 2,471; 2,674; 1,772; 1,548; 1,879-2,229; 2,305-3,031
COG Genes: 1,131; 2,290; 1,995; 1,952; 2,102; 1,131; 1,157-1,540; 1,459-1,968; 1,918; 1,812; 1,417; 1,201; 1,421-1,575; 1,563-2,105

For each COG category, entries give the number of genes (#) followed by genes per Mb (#/Mb) in parentheses, in the column order above.

Amino acid transport & metabolism: 109 (66); 180 (51); 139 (49); 141 (51); 154 (48); 182 (139); 121-152 (57-76); 141-178 (57-65); 149 (68); 151 (59); 125 (75); 130 (83); 114-159 (66-83); 150-202 (63-68)
Carbohydrate transport & metabolism: 46 (28); 105 (30); 73 (26); 81 (29); 94 (29); 64 (49); 62-94 (34-41); 86-108 (29-46); 54 (25); 60 (23); 50 (30); 92 (59); 80-93 (45-49); 88-128 (37-43)
Cell cycle control, cell division: 11 (7); 31 (9); 39 (14); 32 (12); 31 (10); 13 (10); 15-18 (6-10); 18-33 (7-11); 23 (11); 31 (12); 16 (10); 8 (5); 18-21 (9-12); 8-15 (4-5)
Cell motility: 6 (4); 75 (21); 82 (29); 75 (27); 64 (20); 12 (9); 1-20 (1-7); 4-28 (2-10); 21 (10); 45 (18); 27 (16); 14 (9); 13-27 (7-16); 10-14 (4-5)
Cell wall/membrane/envelope biogenesis: 49 (30); 186 (53); 167 (59); 152 (55); 194 (60); 99 (76); 86-136 (47-60); 113-137 (38-60); 60 (28); 59 (23); 38 (23); 41 (26); 46-61 (26-32); 51-79 (19-26)
Coenzyme transport & metabolism: 95 (58); 114 (32); 102 (36); 99 (36); 108 (33); 77 (59); 106-122 (45-66); 122-143 (47-56); 114 (52); 109 (42); 126 (76); 86 (55); 70-89 (40-47); 102-110 (37-49)
Defense mechanisms: 13 (8); 59 (17); 49 (17); 41 (15); 31 (10); 11 (8); 14-33 (8-14); 23-37 (9-14); 18 (8); 22 (9); 11 (7); 12 (8); 16-29 (9-15); 17-22 (7-8)
Energy production & conversion: 100 (61); 182 (52); 122 (43); 142 (51); 160 (49); 99 (76); 82-106 (40-52); 110-139 (43-51); 227 (104); 130 (51); 151 (91); 103 (66); 104-128 (60-68); 144-176 (59-65)
Function unknown: 86 (52); 235 (67); 172 (61); 149 (54); 197 (61); 78 (60); 85-145 (48-56); 140-179 (55-67); 259 (119); 187 (73); 219 (132); 116 (74); 189-197 (99-113); 147-187 (59-69)
General function prediction only: 152 (92); 282 (80); 221 (79); 196 (70); 228 (70); 112 (86); 141-189 (70-91); 171-270 (77-98); 331 (152); 276 (107); 207 (125); 186 (119); 262-287 (144-162); 272-346 (116-122)
Inorganic ion transport & metabolism: 63 (38); 135 (38); 164 (58); 115 (41); 105 (32); 40 (31); 55-77 (27-38); 72-148 (32-49); 91 (42); 128 (50); 77 (46); 51 (33); 78-99 (45-56); 63-85 (25-28)
Intracellular trafficking & secretion: 11 (7); 91 (26); 91 (32); 107 (38); 96 (30); 38 (29); 20-52 (12-21); 27-47 (12-17); 25 (11); 23 (9); 21 (13); 20 (13); 20-21 (10-12); 14-21 (5-7)
Lipid transport & metabolism: 30 (18); 72 (20); 65 (23); 63 (23); 78 (24); 51 (39); 37-49 (18-26); 35-51 (12-20); 109 (50); 56 (22); 16 (10); 56 (36); 19-25 (11-13); 70-89 (26-40)
Nucleotide transport & metabolism: 45 (27); 56 (16); 50 (18); 54 (19); 54 (17); 46 (35); 48-52 (18-31); 52-63 (18-24); 51 (23); 63 (25); 51 (31); 50 (32); 47-52 (27-28); 62-66 (21-28)
Posttranslational modifications: 74 (45); 130 (37); 103 (37); 127 (46); 129 (40); 61 (47); 70-95 (35-48); 90-119 (36-42); 69 (32); 105 (41); 53 (32); 49 (31); 51-55 (29-30); 67-88 (27-33)
Replication, recombination & repair: 56 (34); 202 (57); 170 (60); 225 (81); 182 (56); 54 (41); 62-84 (31-40); 78-207 (31-70); 98 (45); 169 (66); 58 (35); 67 (43); 67-109 (38-57); 84-326 (38-109)
Secondary metabolites biosynthesis & transport: 15 (9); 44 (12); 36 (13); 49 (18); 59 (18); 37 (28); 19-46 (11-17); 27-47 (9-18); 33 (15); 26 (10); 5 (3); 22 (14); 10-12 (6-7); 35-53 (13-24)
Signal transduction mechanisms: 47 (29); 121 (34); 126 (45); 86 (31); 124 (38); 34 (26); 23-49 (14-20); 42-116 (19-42); 71 (33); 76 (30); 35 (21); 9 (6); 19-33 (10-19); 23-28 (8-10)
Transcription: 76 (46); 96 (27); 113 (40); 92 (33); 99 (31); 42 (32); 36-63 (21-26); 60-93 (24-34); 97 (45); 122 (47); 75 (45); 71 (45); 80-85 (42-49); 95-107 (34-43)
Translation, ribosomal structure & biogenesis: 136 (83); 149 (42); 154 (55); 152 (55); 151 (47); 113 (86); 128-146 (54-80); 137-144 (45-64); 160 (73); 148 (58); 155 (93); 136 (87); 160-166 (87-92); 165-166 (55-74)
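The "#/Mb" values in Supplemental Table 1 appear to be gene counts divided by genome size. A minimal Python sketch of that arithmetic follows; the function and variable names are hypothetical, and rounding to the nearest integer is an assumption inferred from the printed values rather than a stated method.

```python
def genes_per_mb(gene_count, genome_size_mb):
    """Genomic density: gene count divided by genome size in Mb, rounded to an integer."""
    return round(gene_count / genome_size_mb)


# Genome sizes as printed in the table (already rounded to two decimals).
N_MARITIMUS_GENOME_MB = 1.65
P_UBIQUE_GENOME_MB = 1.31

# N. maritimus, amino acid transport & metabolism: 109 genes.
amino_acid_density = genes_per_mb(109, N_MARITIMUS_GENOME_MB)
# N. maritimus, carbohydrate transport & metabolism: 46 genes.
carbohydrate_density = genes_per_mb(46, N_MARITIMUS_GENOME_MB)
# Pelagibacter ubique, amino acid transport & metabolism: 182 genes.
p_ubique_amino_acid_density = genes_per_mb(182, P_UBIQUE_GENOME_MB)
```

These three cases reproduce the tabulated 66, 28, and 139 genes per Mb. Recomputing from the printed genome sizes can differ by one from the table (136 translation genes over 1.65 Mb gives 82, whereas the table lists 83), presumably because the printed genome size is itself rounded.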


Supplemental Table 2: Predicted coordinates for genes implicated in novel archaeal biochemistry for ammonia oxidation and respiratory chain.

NCBI locus tag Protein # Strand Left bp Right bp Annotation Best Hit Analog in Ammonia oxidizing Bacteria

Nmar_0588 ABX12484 F 523,890 525,197 ammonium transporter AmtB
Nmar_1698 ABX13594 R 1,559,816 1,558,251 ammonium transporter AmtB
Nmar_1500 ABX13396 R 1,367,890 1,367,240 putative ammonia monooxygenase, subunit A AMO (AmoA)
Nmar_1501 ABX13397 R 1,368,387 1,368,025 hypothetical protein
Nmar_1502 ABX13398 R 1,369,079 1,368,507 putative ammonia monooxygenase, subunit C AMO (AmoC)
Nmar_1503 ABX13399 F 1,369,326 1,369,895 putative ammonia monooxygenase, subunit B AMO (AmoB)

Nmar_0815 ABX12711 R 718,761 720,380 soluble periplasmic BCP (complete upstream CxxxxxC domain, complete CxxHxxM domain [C]) PetE, plastocyanin cytochrome c552

Nmar_1102 ABX12998 F 1,004,439 1,005,395 soluble periplasmic BCP (complete upstream CxxxxxC domain, complete CxxHxxM domain [C]) PetE, plastocyanin cytochrome c552

Nmar_1443 ABX13339 R 1,314,030 1,313,572 soluble periplasmic BCP (complete upstream CxxxxxC domain, complete CxxHxxM domain [C]) PetE, plastocyanin cytochrome c552

Nmar_1637 ABX13533 R 1,494,547 1,492,676 soluble periplasmic BCP (complete upstream CxxxxxC domain, complete CxxHxxM domain [C]) PetE, plastocyanin cytochrome c552

Nmar_0004 ABX11904 F 3,463 4,359 soluble periplasmic BCP (modified upstream CxxxxxC domain, modified CxxHxxM domain [C]) Halocyanin, P39442 cytochrome c552

Nmar_1307 ABX13203 F 1,195,212 1,195,736 soluble periplasmic BCP (modified upstream CxxxxxC domain, modified CxxHxxM domain [C]) Halocyanin, P39442 cytochrome c552

Nmar_0918 ABX12814 R 802,019 802,861 periplasmic membrane BCP (1 TMS [C], modified upstream CxxxxxC domain, complete CxxHxxM domain [M]) plastocyanin anchored cytochrome c552

Nmar_1678 ABX13574 R 1,538,377 1,537,367 periplasmic membrane BCP (1 TMS [C], modified upstream CxxxxxC domain, complete CxxHxxM domain [N]) plastocyanin anchored cytochrome c552

Nmar_1273 ABX13169 R 1,169,255 1,168,392 periplasmic membrane BCP (1 TMS [C], complete upstream CxxxxxC domain, modified CxxHxxM domain [C]) plastocyanin anchored cytochrome c552

Nmar_1161 ABX13057 R 1,063,718 1,062,348 periplasmic membrane BCP (1 TMS [N], modified upstream CxxxxxC domain, modified CxxHxxM domain [M]) plastocyanin anchored cytochrome c552

Nmar_1226 ABX13122 F 1,126,185 1,127,261 periplasmic membrane BCP (4 TMS [N], modified upstream CxxxxxC domain, 2 complete CxxHxxM domains [C]) plastocyanin cM552

Nmar_1142 ABX13038 F 1,045,689 1,046,150 cytoplasmic membrane BCP (1 TMS [N], modified upstream CxxxxxC domain, complete CxxHxxM domain [C]) Nmar_1542 "cytoplasmic BCP"

Nmar_0276 ABX12172 F 243,850 244,182 CI, NADH-ubiquinone/plastoquinone oxidoreductase chain 3
Nmar_0277 ABX12173 F 244,223 244,747 CI, NADH-quinone oxidoreductase, B subunit
Nmar_0278 ABX12174 F 244,747 245,349 CI, NADH dehydrogenase (ubiquinone) 30 kDa subunit
Nmar_0279 ABX12175 F 245,352 246,488 CI, NADH dehydrogenase (quinone)
Nmar_0280 ABX12176 F 246,489 247,787 CI, NADH dehydrogenase (quinone)
Nmar_0281 ABX12177 F 247,787 248,284 CI, [4Fe-4S] ferredoxin iron-sulfur binding domain protein Complex I
Nmar_0282 ABX12178 F 248,277 248,789 CI, NADH-ubiquinone/plastoquinone oxidoreductase chain 6
Nmar_0283 ABX12179 F 248,770 249,075 CI, NADH-ubiquinone oxidoreductase chain 4L
Nmar_0284 ABX12180 F 249,075 250,628 CI, proton-translocating NADH-quinone oxidoreductase, chain M
Nmar_0285 ABX12181 F 250,630 252,708 CI, proton-translocating NADH-quinone oxidoreductase, chain L
Nmar_0286 ABX12182 F 252,721 254,205 CI, proton-translocating NADH-quinone oxidoreductase, chain N

Nmar_1542 ABX13438 R 1,404,537 1,404,073 CIII, cytoplasmic membrane BCP, 1 TMS (N), modified upstream CxxxxxC domain, complete CxxHxxM domain (C) Nmar_1142 membrane-bound cytochrome c1

Nmar_1543 ABX13439 R 1,406,185 1,404,590 CIII, Cytochrome b/b6 domain Complex III cyt. B
Nmar_1544 ABX13440 R 1,406,774 1,406,169 CIII, Rieske [2Fe-2S] domain protein Complex III [Fe-S]
Nmar_0182 ABX12078 F 166,066 166,326 CIV, hypothetical protein HCO subunit IV
Nmar_0183 ABX12079 F 166,329 166,760 CIV, heme-copper oxidase subunit II HCO subunit II
Nmar_0184 ABX12080 F 166,798 168,324 CIV, heme-copper oxidase subunit I HCO subunit I

Nmar_0185 ABX12081 F 168,337 169,284 CIV, periplasmic membrane BCP, 1 TMS (N), complete upstream CxxxxxC domain, complete CxxHxxM domain (C) Nmar_1354 cytochrome c subunit III

Nmar_1259 ABX13155 R 1,155,995 1,154,583 NirK, soluble periplasmic MCO (Cu-oxidase_1; Cu-oxidase_2; Cu-oxidase_3), Nwi_2648; Nham_3281; NE0924; Neut_1403; Noc_0089 3dMCO, NirK

Nmar_1661 ABX13557 R 1,519,832 1,519,440 Transcriptional regulator, MarR family (winged helix-turn-helix HxlR type) HTH -regulator MarR
Nmar_1662 ABX13558 R 1,521,322 1,520,144 Metal cation transporter (ZIP Zinc transporter; cl00437) Zip

Nmar_1663 ABX13559 R 1,522,373 1,521,312 soluble periplasmic 2dMCO (Cu-oxidase_2; Cu-oxidase_3), similar to NcgA (BCO) Nwi_2651; Nham_3284; NE0927; Neut_1406 2dMCO, NcgA (BCO)

Nmar_1664 ABX13560 R 1,522,974 1,522,483 metal (Mn, Fe) dependent repressor protein, DtxR family Nmar_1132 HTH -regulator DtxR

Nmar_1665 ABX13561 R 1,523,638 1,523,135 soluble periplasmic BCP (modified upstream CxxxxxC domain, modified CxxHxxM domain [C]) Halocyanin, P39442 monoheme cytochrome c

Nmar_1666 ABX13562 R 1,524,788 1,523,754 beta-propeller structure oxidase (Kelch repeat-containing protein [CDD:121499]) Nmar_1133 beta propeller structure

Nmar_1667 ABX13563 R 1,526,542 1,525,130 NirK, soluble periplasmic 3dMCO (Cu-oxidase_1; Cu-oxidase_2; Cu-oxidase_3), NsrR motif in upstream region. Nwi_2648; Nham_3281; NE0924; Neut_1403; Noc_0089 3dMCO, NirK

Nmar_1136 ABX13032 R 1,041,807 1,040,200 soluble periplasmic 3dMCO (Cu-oxidase_1; Cu-oxidase_2; Cu-oxidase_3), similar to NcgA Nwi_2651; Nham_3284; NE0927; Neut_1406 3dMCO- modified NcgA (BCO)

Nmar_1135 ABX13031 R 1,040,071 1,038,395 putative dehydrogenase with Rossmann-fold NAD(P)+-binding domain [cl09931] Rossmann-fold - DH
Nmar_1134 ABX13030 R 1,038,244 1,037,921 hypothetical protein

Nmar_1133 ABX13029 R 1,037,891 1,036,866 beta-propeller structure oxidase (Kelch repeat-containing protein [CDD:121499]) Nmar_1666 beta propeller structure

Nmar_1132 ABX13028 R 1,036,419 1,035,937 metal (Mn, Fe) dependent repressor protein, DtxR family Nmar_1664 HTH -regulator DtxR

Nmar_1131 ABX13027 R 1,035,788 1,034,733 soluble periplasmic 2dMCO (Cu-oxidase_2; Cu-oxidase_3), similar to NcgA (BCO) Nwi_2651; Nham_3284; NE0927; Neut_1406 2dMCO, NcgA (BCO)

Nmar_1130 ABX13026 R 1,034,743 1,033,565 Metal cation transporter (ZIP Zinc transporter; cl00437) Zip

Nmar_1129 ABX13025 R 1,031,846 1,033,210 cytoplasmic membrane BCP (3 TMS [1N, 2C], complete upstream CxxxxxC domain, complete CxxHxxM domain [N]) "cytoplasmic BCP"

Nmar_1128 ABX13024 R 1,031,477 1,030,587 periplasmic binding protein (transport of ferric siderophores and metal ions such as Mn2+, Fe3+, Cu2+ and/or Zn2+) cation BP

Nmar_1352 ABX13248 R 1,238,660 1,237,860 Transcriptional regulator, ArsR family (winged helix-turn-helix type, COG4742) HTH -regulator
Nmar_1353 ABX13249 R 1,239,270 1,238,728 Uncharacterized conserved protein [COG3945; Function unknown]

Nmar_1354 ABX13250 R 1,240,615 1,239,281 Fusion Protein: soluble periplasmic 2dMCO (Cu-oxidase_2; Cu-oxidase_3) - BCP (modified upstream CxxxxxC domain & CxxHxxM domain [C]) Nmar_1131; Nmar_0185 2dMCO - BCP bi-functional MCO

Nmar_1355 ABX13251 R 1,242,507 1,241,821 hypothetical protein


Nmar_1356 ABX13252 R 1,243,134 1,242,610 IsiB, flavodoxin/nitric oxide synthase [cl00438] flavodoxin/nitric oxide synthase
Nmar_1357 ABX13253 F 1,243,214 1,243,858 nitroreductase [Nitro_FMN_reductase] Nitro_FMN_reductase
Nmar_0650 ABX12546 R 588,034 587,297 Nitrilase/cyanide hydratase and apolipoprotein N-acyltransferase

Nmar_1650 ABX13546 R 1,508,803 1,509,615 periplasmic membrane BCP (1 TMS [C], complete upstream CxxxxxC domain, complete CxxHxxM domain [M]) plastocyanin monoheme cytochrome c

Nmar_1651 ABX13547 R 1,510,321 1,509,629 hypothetical protein

Nmar_1652 ABX13548 F 1,510,361 1,513,144 periplasmic Cu resistance membrane hybrid protein: CopC - [8 TMS] - CopD - [1 TMS] CENSYa_1798 copC-copD 3' of amoCAB(E)D

Nmar_1653 ABX13549 R 1,513,952 1,513,134 ABC-3 protein; ABC-ATPase subunit interface [CDD:119348]
Nmar_1654 ABX13550 R 1,514,677 1,513,955 ABC-type Mn/Zn transport systems, ATPase component [COG1121]

Nmar_1655 ABX13551 R 1,515,591 1,514,671 periplasmic solute binding protein [ABC transport of ferric siderophores and metal ions such as Mn2+, Fe3+, Cu2+ and/or Zn2+] cation BP

Nmar_1656 ABX13552 R 1,516,066 1,515,683 NikR; transcriptional regulator with C-terminal nickel binding domain [pfam08753]

Nmar_1657 ABX13553 F 1,516,198 1,516,479 hypothetical protein

Nmar_1250 ABX13146 R 1,148,436 1,147,717 protease inhibitor, cytoplasmic membrane BCP (1 TMS [N], complete upstream CxxxxxC domain, complete CxxHxxM domain [C]) cytoplasmic protease inhibitor

Red labelling indicates presence in the C. symbiosum genome. Green shading indicates a gene cluster encoding a 2dMCO (Nmar_1663) and a 3dMCO-NirK (Nmar_1667) preceded by an NsrR binding site. Both genes flank a gene encoding a soluble periplasmic blue copper redox protein. This arrangement resembles that in AOB and nitrite-oxidizing bacteria, in which the MCO-encoding genes flank genes encoding cytochrome c proteins. Blue shading indicates a gene cluster that may have arisen by incomplete duplication of the cluster shaded in green (loss of the BCP gene), followed by modification of the 3dMCO gene (loss of nirK-specific domains) and acquisition of genes Nmar_1134 and Nmar_1135. Orange shading indicates a gene cluster containing a unique combination of gene homologues implicated in nitrogen oxide processing. Gene cluster Nmar_1650-7 may encode the inventory responsible for copper uptake and homeostasis. Nmar_1652 encodes a fusion protein of the CopC and CopD proteins, whose encoding genes are found in tandem immediately downstream of amo genes.


Supplemental Table 3: Gene prediction coordinates for stable RNAs and TFBs/TBPs

Gene Name Start End Product Note

STABLE RNAs

Nmar_rR16S 896,240 897,711 16S ribosomal RNA Small subunit ribosomal RNA

Nmar_rR23S 893,105 896,109 23S ribosomal RNA Large subunit ribosomal RNA

Nmar_rR5S 262,500 262,619 5S ribosomal RNA 5S ribosomal RNA; detected by Rfam model

Nmar_SRP 95,541 95,248 SRP RNA Signal recognition particle RNA; detected by Rfam model

Nmar_RNaseP 569,667 569,388 RNase P RNA Detected by Rfam model

Nmar_sR1 100,802 100,856 C/D box guide sRNA Predicted to modify tRNA tR12-LeuCAA at Cm36 (anticodon) by snoscan; conserved region in C. symbiosum

Nmar_sR2 537,407 537,466 C/D box guide sRNA Predicted to modify 23S rRNA at Cm2014 by snoscan

Nmar_sR3 256,057 256,113 C/D box guide sRNA Predicted to modify tRNA tR06-AlaGGC at Am60 by snoscan; conserved sRNA in C. symbiosum

Nmar_sR4 1,131,193 1,131,136 C/D box guide sRNA Predicted to modify tRNA tR35-TyrGTA at Am83 by snoscan; conserved sRNA in C. symbiosum

Nmar_sR5 471,299 471,354 C/D box guide sRNA No predicted target, but conserved with sRNA in C. symbiosum

Nmar_sR6 1,376,278 1,376,344 C/D box guide sRNA No predicted target, but conserved with sRNA in C. symbiosum

Nmar_tR01 88,221 88,305 LeuCAG No introns; tRNAscan-SE v1.23 Score 53.62

Nmar_tR02 171,008 171,081 LysCTT No introns; tRNAscan-SE v1.23 Score 71.92

Nmar_tR03 177,670 177,744 ValTAC No introns; tRNAscan-SE v1.23 Score 74.7

Nmar_tR04 188,725 188,796 AlaCGC No introns; tRNAscan-SE v1.23 Score 70.62

Nmar_tR05 210,730 210,813 SerTGA No introns; tRNAscan-SE v1.23 Score 65.85

Nmar_tR06 224,909 224,982 AlaGGC No introns; tRNAscan-SE v1.23 Score 62.77

Nmar_tR07 232,880 232,957 AspGTC No introns; tRNAscan-SE v1.23 Score 64.16

Nmar_tR08 255,938 256,011 MetCAT No introns; tRNAscan-SE v1.23 Score 60.27

Nmar_tR09 270,209 270,280 HisGTG No introns; tRNAscan-SE v1.23 Score 44.15

Nmar_tR10 281,501 281,585 LeuGAG No introns; tRNAscan-SE v1.23 Score 59.05

Nmar_tR11 366,393 366,493 SerGGA Intron at 366431-366444; tRNAscan-SE v1.23 Score 59.06

Nmar_tR12 454,316 454,416 LeuCAA Intron at 454356-454371; tRNAscan-SE v1.23 Score 50.5

Nmar_tR13 504,317 504,391 ArgGCG No introns; tRNAscan-SE v1.23 Score 62.07

Nmar_tR14 537,291 537,367 ProTGG No introns; tRNAscan-SE v1.23 Score 74.43

Nmar_tR15 647,814 647,895 SerCGA No introns; tRNAscan-SE v1.23 Score 55.73

Nmar_tR16 1,029,515 1,029,611 ValCAC Contains a non-canonical intron; tRNAscan-SE v1.23 Score 70.29

Nmar_tR17 1,079,755 1,079,829 ThrTGT No introns; tRNAscan-SE v1.23 Score 68.33

Nmar_tR18 1,212,832 1,212,945 MetCAT Contains a non-canonical intron; tRNAscan-SE v1.23 Score 80.94

Nmar_tR19 1,313,500 1,313,571 CysGCA No introns; tRNAscan-SE v1.23 Score 25.47

Nmar_tR20 1,349,774 1,349,927 TrpCCA Two introns: one at 1349839-1349891 and other is non-canonical; tRNAscan-SE v1.23 Score 67.18

Nmar_tR21 1,364,392 1,364,466 IleGAT No introns; tRNAscan-SE v1.23 Score 72.57

Nmar_tR22 1,366,079 1,366,152 ThrGGT No introns; tRNAscan-SE v1.23 Score 69.19

Nmar_tR23 1,403,611 1,403,685 ArgTCT No introns; tRNAscan-SE v1.23 Score 65.64

Nmar_tR24 1,430,303 1,430,377 iMetCAT No introns; tRNAscan-SE v1.23 Score 66.66

Nmar_tR25 1,592,406 1,592,498 ArgCCT Contains a non-canonical intron; tRNAscan-SE v1.23 Score 55.09


Nmar_tR26 1,630,610 1,630,537 ProGGG No introns; tRNAscan-SE v1.23 Score 63.49

Nmar_tR27 1,591,588 1,591,515 ValGAC No introns; tRNAscan-SE v1.23 Score 60.87

Nmar_tR28 1,479,706 1,479,633 GlyTCC No introns; tRNAscan-SE v1.23 Score 70.32

Nmar_tR29 1,365,987 1,365,864 LeuTAA Two introns: one at 1365923-1365909 and other is non-canonical; tRNAscan-SE v1.23 Score 56.34

Nmar_tR30 1,363,723 1,363,650 AlaTGC No introns; tRNAscan-SE v1.23 Score 73.24

Nmar_tR31 1,317,848 1,317,775 ThrCGT No introns; tRNAscan-SE v1.23 Score 76.73

Nmar_tR32 1,263,222 1,263,136 SerGCT No introns; tRNAscan-SE v1.23 Score 70.89

Nmar_tR33 754,037 753,964 PheGAA No introns; tRNAscan-SE v1.23 Score 73.7

Nmar_tR34 707,158 707,082 AsnGTT No introns; tRNAscan-SE v1.23 Score 64.81

Nmar_tR35 470,128 470,002 TyrGTA Intron at 470090-470041; tRNAscan-SE v1.23 Score 57.47

Nmar_tR36 418,681 418,604 GluCTC No introns; tRNAscan-SE v1.23 Score 70.32

Nmar_tR37 397,959 397,883 GlyGCC No introns; tRNAscan-SE v1.23 Score 70.33

Nmar_tR38 321,890 321,800 GlnCTG Intron at 321852-321835; tRNAscan-SE v1.23 Score 52.48

Nmar_tR39 232,783 232,710 LysTTT No introns; tRNAscan-SE v1.23 Score 70.32

Nmar_tR40 198,740 198,653 LeuTAG No introns; tRNAscan-SE v1.23 Score 70.33

Nmar_tR41 162,816 162,742 ArgTCG No introns; tRNAscan-SE v1.23 Score 70.34

Nmar_tR42 81,511 81,439 GlnTTG No introns; tRNAscan-SE v1.23 Score 70.35

Nmar_tR43 59,676 59,603 GlyCCC No introns; tRNAscan-SE v1.23 Score 70.36

Nmar_tR44 255,184 255,090 GluTTC Contains a non-canonical intron; tRNAscan-SE v1.23 Score 73.75

TRANSCRIPTION FACTOR B

Nmar_0013 12,800 13,720 TFB TFB5
Nmar_0020 16,889 17,794 TFB TFB6
Nmar_0519 464,679 464,978 TFB TFB7
Nmar_0624 564,972 565,880 TFB TFB8
Nmar_0979 856,416 857,327 TFB TFB1
Nmar_0987 863,158 864,105 TFB TFB2
Nmar_1340 1,227,903 1,228,823 TFB TFB3
Nmar_1341 1,229,190 1,230,119 TFB TFB4

TRANSCRIPTION BINDING PROTEINS

Nmar_0598 534,324 534,884 TBP TBP2
Nmar_1519 1,382,425 1,382,988 TBP TBP1
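The Start and End columns of Supplemental Table 3 can be read programmatically with a short Python sketch such as the one below. The helper names are hypothetical. Treating "Start greater than End" as a reverse-strand feature is an assumption of this sketch (the table has no strand column), and the length is simply the inclusive span between the two coordinates.

```python
def parse_coord(text):
    """Convert a coordinate written with thousands separators, e.g. '1,382,425', to an int."""
    return int(text.replace(",", ""))


def describe_feature(start_text, end_text):
    """Return (strand, length_in_nt) for a feature given its Start and End cells.

    Assumption: Start > End marks a feature on the reverse strand.
    """
    start, end = parse_coord(start_text), parse_coord(end_text)
    strand = "forward" if start <= end else "reverse"
    length = abs(end - start) + 1  # inclusive coordinates
    return strand, length


# Coordinates copied from the table rows for Nmar_tR01 and Nmar_SRP.
trna_leu = describe_feature("88,221", "88,305")
srp_rna = describe_feature("95,541", "95,248")
```

Under these assumptions Nmar_tR01 is an 85-nt forward-strand feature and Nmar_SRP is a 294-nt reverse-strand feature.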


Supplemental Table 4: Gene prediction coordinates shared between the N. maritimus and C. symbiosum genomes.

Columns: N. maritimus ORF; C. symbiosum ORF; aa % identity; N. maritimus SCM1 Start, End, Length (aa), Strand; C. symbiosum A Start, End, Length (aa), Strand; N. maritimus Annotation of Top BLAST Hit

146 1,348 400 f 1,814,836 1,816,068 410 f cell division control protein 6 family protein
Nmar_0002 CENSYa_1840 58 1,325 2,770 481 r 1,816,083 1,817,513 476 r DNA-directed DNA polymerase
Nmar_0003 CENSYa_1841 53 2,881 3,387 168 f 1,817,646 1,818,176 176 f hypothetical protein
Nmar_0005 CENSYa_1842 65 4,356 5,072 238 r 1,818,183 1,818,902 239 r precorrin-6y C5,15-methyltransferase (decarboxylating), CbiE subunit
Nmar_0006 CENSYa_1843 58 5,075 5,566 163 r 1,818,899 1,819,387 162 r hypothetical protein
Nmar_0007 CENSYa_1844 42 5,639 7,168 509 f 1,819,536 1,821,173 545 f peptidylprolyl isomerase
Nmar_0008 CENSYa_1845 70 7,169 8,524 451 r 1,821,115 1,822,368 417 r FAD-dependent pyridine nucleotide-disulphide oxidoreductase
Nmar_0009 CENSYa_1846 56 8,594 9,244 216 r 1,822,490 1,823,149 219 r hypothetical protein
Nmar_0010 CENSYa_1847 63 9,370 9,861 163 f 1,823,259 1,823,753 164 f heat shock protein HSP20
Nmar_0011 CENSYa_1848 72 9,889 12,030 713 f 1,823,778 1,825,907 709 f AAA family ATPase, CDC48 subfamily
Nmar_0013 CENSYa_1825 64 12,800 13,720 306 r 1,803,378 1,804,292 304 f transcription factor TFIIB cyclin-related
Nmar_0018 CENSYa_1820 41 15,531 15,830 99 f 1,800,611 1,800,907 98 r hypothetical protein
Nmar_0019 CENSYa_1875 49 15,827 16,771 314 r 1,848,134 1,849,048 304 r hypothetical protein
Nmar_0020 CENSYa_1876 88 16,889 17,794 301 f 1,849,320 1,850,231 303 f transcription factor TFIIB cyclin-related
Nmar_0021 CENSYa_1877 63 17,795 19,249 484 r 1,850,234 1,851,682 482 r sodium/hydrogen exchanger
Nmar_0022 CENSYa_1879 52 19,365 19,907 180 f 1,852,830 1,853,375 181 f conserved hypothetical protein
Nmar_0023 CENSYa_1880 80 19,920 20,459 179 f 1,853,445 1,853,939 164 f adenylylsulfate kinase
Nmar_0024 CENSYa_1881 49 20,451 21,266 271 r 1,853,945 1,854,727 260 r inositol monophosphatase
Nmar_0027 CENSYa_1883 44 22,532 22,960 142 r 1,855,370 1,855,804 144 r hypothetical protein
Nmar_0029 CENSYa_1885 56 23,675 24,322 215 f 1,857,324 1,857,911 195 f conserved hypothetical protein
Nmar_0033 CENSYa_1886 77 27,032 27,292 86 f 1,858,052 1,858,387 111 f hypothetical protein
Nmar_0034 CENSYa_1887 45 27,296 28,303 335 f 1,858,390 1,859,367 325 f conserved hypothetical protein
Nmar_0036 CENSYa_1891 68 28,710 28,904 64 f 1,861,583 1,861,777 64 f hypothetical protein
Nmar_0037 CENSYa_1892 58 28,946 29,659 237 f 1,861,823 1,862,539 238 f radical SAM domain protein
Nmar_0038 CENSYa_1893 75 29,723 30,280 185 f 1,862,589 1,863,158 189 f GTP cyclohydrolase I
Nmar_0039 CENSYa_1894 28 30,258 30,575 105 f 1,863,183 1,863,527 114 f hypothetical protein
Nmar_0040 CENSYa_1895 62 30,578 31,252 224 r 1,863,524 1,864,195 223 r ExsB family protein
Nmar_0041 CENSYa_1896 59 31,256 32,179 307 r 1,864,192 1,864,965 257 r homoserine kinase
Nmar_0042 CENSYa_1897 64 32,179 32,937 252 r 1,865,115 1,865,855 246 r protein of unknown function ATP binding
Nmar_0043 CENSYa_1898 66 33,030 34,121 363 f 1,865,926 1,866,957 343 f conserved hypothetical protein
Nmar_0044 CENSYa_1899 61 34,156 34,950 264 f 1,866,962 1,867,729 255 f short-chain dehydrogenase/reductase SDR
Nmar_0046 CENSYa_1900 57 35,374 35,685 103 r 1,867,726 1,867,968 80 r hypothetical protein
Nmar_0047 CENSYa_1901 65 35,839 37,074 411 r 1,868,110 1,869,285 391 r hypothetical protein
Nmar_0048 CENSYa_1902 41 37,114 38,277 387 r 1,869,343 1,870,470 375 r thiamine biosynthesis ATP pyrophosphatase-like protein
Nmar_0049 CENSYa_1903 68 38,305 39,030 241 f 1,870,540 1,871,325 261 f adenylylsulfate reductase, thioredoxin dependent
Nmar_0050 CENSYa_1904 68 39,032 40,174 380 f 1,871,322 1,872,461 379 f sulfate adenylyltransferase
Nmar_0051 CENSYa_1905 54 40,206 40,871 221 f 1,872,493 1,873,206 237 f hypothetical protein
Nmar_0052 CENSYa_1906 71 40,855 41,511 218 r 1,873,004 1,873,837 277 r DNA-(apurinic or apyrimidinic site) lyase
Nmar_0053 CENSYa_1907 59 41,561 42,715 384 f 1,873,844 1,875,040 398 f protein of unknown function DUF521
Nmar_0054 CENSYa_1908 62 42,712 43,098 128 f 1,875,088 1,875,396 102 f protein of unknown function DUF126
Nmar_0055 CENSYa_1920 49 43,073 45,031 652 r 1,885,975 1,887,870 631 f protein of unknown function DUF814
Nmar_0056 CENSYa_1919 46 45,126 45,602 158 f 1,885,281 1,885,766 161 r hypothetical protein
Nmar_0057 CENSYa_1918 61 45,627 46,217 196 f 1,884,678 1,885,274 198 r precorrin-6Y C5,15-methyltransferase (decarboxylating), CbiT subunit
Nmar_0058 CENSYa_1917 73 46,253 46,975 240 f 1,883,935 1,884,660 241 r precorrin-2 C20-methyltransferase
Nmar_0059 CENSYa_1916 68 46,968 47,738 256 f 1,883,031 1,883,933 300 r precorrin-4 C11-methyltransferase
Nmar_0061 CENSYa_1915 71 48,580 49,230 216 f 1,882,322 1,882,966 214 r conserved hypothetical protein
Nmar_0062 CENSYa_1914 62 49,278 50,438 386 r 1,881,108 1,882,265 385 f glycosyl transferase family 2
Nmar_0063 CENSYa_1924 72 50,600 52,057 485 r 1,889,509 1,890,846 445 r glycyl-tRNA synthetase
Nmar_0064 CENSYa_1925 64 52,088 52,927 279 r 1,890,979 1,891,821 280 r apurinic endonuclease Apn1
Nmar_0065 CENSYa_1926 59 53,013 54,038 341 f 1,891,958 1,892,971 337 f DNA primase, large subunit
Nmar_0066 CENSYa_1927 53 54,038 55,159 373 f 1,892,964 1,894,082 372 f DNA primase small subunit
Nmar_0067 CENSYa_1928 71 55,201 58,062 953 f 1,894,254 1,897,037 927 f conserved hypothetical protein
Nmar_0068 CENSYa_1929 41 58,063 58,605 180 f 1,897,039 1,897,578 179 f glutamine amidotransferase class-I
Nmar_0069 CENSYa_1930 50 58,640 59,602 320 f 1,897,613 1,898,572 319 f tetratricopeptide TPR_2 repeat protein
Nmar_0070 CENSYa_1932 59 59,729 59,902 57 f 1,898,697 1,898,876 59 f hypothetical protein
Nmar_0071 CENSYa_1934 60 60,101 60,514 137 f 1,899,102 1,899,503 133 f ribosomal protein S6e
Nmar_0072 CENSYa_1935 74 60,546 61,808 420 f 1,899,621 1,900,799 392 f protein synthesis factor GTP-binding
Nmar_0073 CENSYa_1936 40 61,801 62,169 122 f 1,900,831 1,901,166 111 f conserved hypothetical protein
Nmar_0074 CENSYa_1937 53 62,205 62,654 149 f 1,901,192 1,901,644 150 f PEBP family protein
Nmar_0075 CENSYa_1939 47 62,723 
64,861 712 f 1,902,465 1,904,675 736 f dolichyl-diphosphooligosaccharide--protein glycotransferase Nmar_0077 CENSYa_1948 67 65,021 65,818 265 r 1,922,910 1,923,689 259 f RNA binding S1 domain protein Nmar_0078 CENSYa_1947 62 65,859 66,389 176 r 1,922,333 1,922,863 176 f cob(I)alamin adenosyltransferase Nmar_0079 CENSYa_1944 52 66,477 67,829 450 f 1,908,224 1,909,495 423 r cobyrinic acid a,c-diamide synthase Nmar_0080 CENSYa_1943 61 67,818 68,459 213 r 1,907,632 1,908,222 196 f precorrin-8X methylmutase CbiC/CobH Nmar_0081 CENSYa_1942 60 68,449 69,201 250 r 1,906,845 1,907,600 251 f cobalamin (vitamin B12) biosynthesis CbiX protein Nmar_0082 CENSYa_1941 58 69,198 70,250 350 r 1,905,805 1,906,848 347 f cobalamin (vitamin B12) biosynthesis CbiG protein Nmar_0083 CENSYa_1940 61 70,282 71,367 361 r 1,904,838 1,905,773 311 f cobalamin biosynthesis protein CbiD Nmar_0084 CENSYa_1949 35 71,458 72,327 289 f 1,923,956 1,924,705 249 f NAD-dependent epimerase/dehydratase Nmar_0085 CENSYa_1950 70 72,304 73,470 388 r 1,924,679 1,925,845 388 r methionine adenosyltransferase Nmar_0086 CENSYa_1951 80 73,486 73,731 81 r 1,925,860 1,926,180 106 r like-Sm ribonucleoprotein core Nmar_0087 CENSYa_1952 37 73,840 74,493 217 f 1,926,247 1,926,903 218 f RecA/RadA recombinase-like protein Nmar_0088 CENSYa_1953 66 74,592 77,333 913 f 1,927,046 1,929,784 912 f DEAD/DEAH box helicase domain protein Nmar_0089 CENSYa_1954 54 77,330 78,730 466 r 1,929,781 1,931,187 468 r nucleic acid binding OB-fold tRNA/helicase-type Nmar_0090 CENSYa_1955 66 78,944 79,882 312 f 1,931,295 1,932,236 313 f ABC transporter related Nmar_0091 CENSYa_1956 73 79,866 80,627 253 f 1,932,292 1,932,981 229 f ABC-2 type transporter Nmar_0092 CENSYa_0589 40 80,633 81,373 246 r 539,411 540,283 290 r UDP-glucose/GDP-mannose dehydrogenase Nmar_0093 CENSYa_1958 44 81,548 82,600 350 r 1,933,158 1,934,201 347 r protein of unknown function DUF354 Nmar_0094 CENSYa_1959 61 82,664 83,803 379 f 1,934,200 1,935,390 396 f hypothetical 
protein Nmar_0095 CENSYa_1960 46 83,839 84,402 187 f 1,935,423 1,935,986 187 f GrpE protein Nmar_0096 CENSYa_1961 79 84,405 86,315 636 f 1,935,989 1,937,959 656 f chaperone protein DnaK Nmar_0097 CENSYa_1962 65 86,365 87,450 361 f 1,937,993 1,939,048 351 f chaperone protein DnaJ Nmar_0099 CENSYa_1967 49 88,308 90,212 634 r 1,941,188 1,943,074 628 r putative aspartyl-tRNA(Asn) amidotransferase, B subunit Nmar_0100 CENSYa_1968 57 90,221 91,510 429 r 1,943,097 1,944,275 392 r glutamyl-tRNA(Gln) amidotransferase, subunit D

Nmar_0101 CENSYa_1969 89 91,543 93,711 722 r 1,944,442 1,946,628 728 r AAA family ATPase, CDC48 subfamily Nmar_0102 CENSYa_1970 79 93,816 94,550 244 f 1,946,846 1,947,577 243 f ribosomal protein L2 Nmar_0105 CENSYa_0057 62 95,588 96,502 304 r 42,784 43,593 269 f PP-loop domain protein Nmar_0107 CENSYa_0056 72 97,656 98,852 398 r 41,429 42,655 408 f peptidase M50 Nmar_0108 CENSYa_1015 52 98,896 99,207 103 f 1,074,559 1,074,888 109 f CutA1 divalent ion tolerance protein Nmar_0109 CENSYa_1016 50 99,204 100,253 349 r 1,074,878 1,075,909 343 r eRF1 domain 1 protein Nmar_0110 CENSYa_1017 69 100,286 100,699 137 r 1,075,925 1,076,326 133 r hypothetical protein Nmar_0111 CENSYa_1018 71 100,838 101,290 150 f 1,076,501 1,076,920 139 f protein of unknown function UPF0047 Nmar_0112 CENSYa_1019 68 101,295 101,459 54 r 1,076,922 1,077,152 76 r hypothetical protein Nmar_0113 CENSYa_1014 75 101,499 101,951 150 r 1,074,077 1,074,523 148 f iron (metal) dependent repressor, DtxR family Nmar_0114 CENSYa_1013 71 101,953 102,720 255 r 1,073,307 1,074,074 255 f transcriptional regulator, TrmB Nmar_0115 CENSYa_1012 91 102,822 104,501 559 f 1,071,473 1,073,242 589 r radical SAM domain protein Nmar_0116 CENSYa_1002 26 104,553 105,578 341 f 1,057,826 1,058,809 327 f glycosyl transferase group 1 Nmar_0120 CENSYa_0995 11 109,495 110,676 393 r 1,049,888 1,051,114 408 r glycosyl transferase group 1 Nmar_0161 CENSYa_0549 45 149,412 150,347 935 f 480,454 481,398 315 r alcohol dehydrogenase, class IV Nmar_0167 CENSYa_0590 44 154,844 155,815 323 r 540,324 541,418 364 r transcriptional regulator, RpiR family Nmar_0168 CENSYa_1003 32 155,950 156,876 308 f 1,058,746 1,059,645 299 r NAD-dependent epimerase/dehydratase Nmar_0169 CENSYa_0591 66 156,890 157,927 345 f 541,417 542,457 346 f putative translation initiation factor, aIF-2BI family Nmar_0172 CENSYa_1971 73 159,978 160,616 212 f 1,947,720 1,948,343 207 f protein of unknown function DUF59 Nmar_0173 CENSYa_1972 60 160,619 161,266 215 r 1,948,340 
1,949,260 306 r short-chain dehydrogenase/reductase SDR Nmar_0177 CENSYa_1974 66 162,861 164,036 391 f 1,949,468 1,950,619 383 f aminotransferase class I and II Nmar_0178 CENSYa_1975 58 164,030 164,431 133 r 1,950,616 1,950,939 107 r pyridoxamine 5'-phosphate oxidase-related FMN-binding Nmar_0179 CENSYa_1976 49 164,415 164,648 77 r 1,951,001 1,951,237 78 r hypothetical protein Nmar_0180 CENSYa_1977 60 164,664 165,335 223 r 1,951,240 1,951,860 206 r NADP oxidoreductase coenzyme F420-dependent Nmar_0181 CENSYa_1978 59 165,387 166,010 207 f 1,951,880 1,952,566 228 f phosphoglycerate mutase Nmar_0182 CENSYa_1979 69 166,066 166,326 86 f 1,952,711 1,952,971 86 f hypothetical protein Nmar_0183 CENSYa_1980 82 166,329 166,760 143 f 1,952,975 1,953,406 143 f cytochrome c oxidase subunit II Nmar_0184 CENSYa_1981 85 166,798 168,324 508 f 1,953,456 1,954,982 508 f cytochrome c oxidase subunit I Nmar_0185 CENSYa_1982 64 168,337 169,284 315 f 1,954,992 1,955,579 195 f blue (type 1) copper domain protein Nmar_0186 CENSYa_1983 66 169,322 169,735 137 f 1,955,641 1,956,033 130 f conserved hypothetical protein Nmar_0187 CENSYa_1984 44 169,725 170,249 174 r 1,956,065 1,956,634 189 r hypothetical protein Nmar_0188 CENSYa_1985 76 170,346 170,942 198 f 1,956,860 1,957,351 163 f hypothetical protein Nmar_0190 CENSYa_1377 46 172,976 173,530 184 r 1,405,238 1,405,786 182 f methyltransferase type 11 Nmar_0191 CENSYa_1376 60 173,569 173,730 53 r 1,405,056 1,405,232 58 f hypothetical protein Nmar_0192 CENSYa_1375 64 173,772 174,686 304 r 1,404,088 1,405,023 311 f branched-chain amino acid aminotransferase Nmar_0193 CENSYa_1371 59 175,136 176,551 471 r 1,400,934 1,402,349 471 f aspartate/glutamate/uridylate kinase Nmar_0194 CENSYa_1370 43 176,568 177,590 340 r 1,399,888 1,400,904 338 f glycosyltransferase 28 domain Nmar_0195 CENSYa_1966 41 177,763 178,044 93 f 1,940,891 1,941,151 86 f hypothetical protein Nmar_0196 CENSYa_0389 50 178,141 178,488 115 r 338,160 338,510 116 r hypothetical protein 
Nmar_0197 CENSYa_0461 36 178,619 178,984 365 f 385,999 386,391 131 f hypothetical protein Nmar_0198 CENSYa_0009 68 178,988 179,500 170 r 7,457 7,954 165 f hypothetical protein Nmar_0200 CENSYa_0017 42 179,850 180,302 150 f 12,121 12,525 134 f hypothetical protein Nmar_0201 CENSYa_0018 71 180,299 181,180 293 r 12,522 13,436 304 r radical SAM domain protein Nmar_0205 CENSYa_0020 61 182,361 182,717 118 f 13,798 14,151 117 f ribonuclease H Nmar_0206 CENSYa_0021 74 182,783 184,879 698 f 14,200 16,305 701 f CoA-binding domain protein Nmar_0207 CENSYa_0022 80 184,953 186,479 508 f 16,378 17,901 507 f vinylacetyl-CoA Delta-isomerase Nmar_0208 CENSYa_0023 51 186,634 187,344 236 r 17,915 18,889 324 r peptidase S26B, signal peptidase Nmar_0209 CENSYa_0024 55 187,385 188,155 256 r 18,917 19,720 267 r hypothetical protein Nmar_0210 CENSYa_0025 82 188,271 188,597 108 f 19,871 20,203 110 f thioredoxin Nmar_0211 CENSYa_0027 57 188,840 189,199 119 f 21,002 21,361 119 r hypothetical protein Nmar_0212 CENSYa_0026 61 189,221 189,760 179 r 20,450 21,001 183 f hypothetical protein Nmar_0214 CENSYa_0031 60 190,720 192,114 464 f 21,858 23,306 482 f dihydropyrimidinase Nmar_0215 CENSYa_1613 49 192,117 192,944 275 r 1,615,452 1,616,273 273 f protein of unknown function DUF52 Nmar_0216 CENSYa_1923 50 192,965 193,582 205 r 1,888,900 1,889,499 199 f AMMECR1 domain protein Nmar_0217 CENSYa_0284 68 193,616 194,686 356 f 247,243 248,373 376 f radical SAM domain protein Nmar_0218 CENSYa_0034 53 194,923 195,597 224 r 25,253 25,930 225 r ribose 5-phosphate isomerase Nmar_0219 CENSYa_0035 31 195,598 196,557 319 r 25,927 26,514 195 r hypothetical protein Nmar_0220 CENSYa_0036 85 196,558 196,725 55 r 26,520 26,687 55 r ribosomal protein L37e Nmar_0221 CENSYa_0037 83 196,736 196,972 78 r 26,696 26,932 78 r like-Sm ribonucleoprotein core Nmar_0222 CENSYa_0038 42 197,030 197,752 240 r 26,994 27,704 236 r creatininase Nmar_0223 CENSYa_0039 65 197,758 198,627 289 r 27,701 28,543 280 r formyl transferase 
domain protein Nmar_0224 CENSYa_0292 54 198,788 200,137 449 f 252,466 253,737 423 f phosphoglucosamine mutase Nmar_0225 CENSYa_0293 85 200,199 200,585 128 f 253,819 254,199 126 f ribosomal protein L7Ae/L30e/S12e/Gadd45 Nmar_0226 CENSYa_0294 82 200,582 200,794 70 f 254,196 254,408 70 f ribosomal protein S28e Nmar_0227 CENSYa_0295 84 200,804 201,004 66 f 254,420 254,617 65 f ribosomal protein L24E Nmar_0228 CENSYa_0296 73 201,007 201,408 133 f 254,730 255,020 96 f nucleoside-diphosphate kinase Nmar_0229 CENSYa_0297 71 201,415 203,196 593 f 255,027 256,808 593 f translation initiation factor aIF-2 Nmar_0230 CENSYa_1987 52 203,199 203,600 133 r 1,957,626 1,958,033 135 r thioredoxin Nmar_0231 CENSYa_1734 45 203,816 205,486 556 r 1,723,506 1,725,080 524 r Ig family protein Nmar_0232 CENSYa_1988 52 205,565 205,981 138 r 1,958,075 1,958,494 139 r protein of unknown function DUF101 Nmar_0234 CENSYa_0852 56 206,483 207,019 178 f 878,130 878,741 203 f SNARE associated Golgi protein Nmar_0236 CENSYa_1730 50 207,468 208,751 427 f 1,719,258 1,720,511 417 r histidyl-tRNA synthetase Nmar_0237 CENSYa_1729 57 208,732 209,397 221 r 1,718,596 1,719,261 221 f putative translation initiation factor eIF-6 Nmar_0238 CENSYa_1728 74 209,508 209,999 163 f 1,717,976 1,718,470 164 r 4Fe-4S ferredoxin iron-sulfur binding domain protein Nmar_0239 CENSYa_1727 75 210,113 210,649 178 f 1,717,376 1,717,909 177 r 4Fe-4S ferredoxin iron-sulfur binding domain protein Nmar_0240 CENSYa_1725 66 210,817 211,767 316 r 1,716,277 1,717,212 311 f Replication factor C Nmar_0241 CENSYa_1724 44 211,964 212,482 172 f 1,715,560 1,716,078 172 r hypothetical protein Nmar_0242 CENSYa_1723 71 212,475 214,562 695 f 1,713,470 1,715,563 697 r MCM family protein Nmar_0243 CENSYa_1722 58 214,559 216,688 709 f 1,711,350 1,713,473 707 r DEAD/DEAH box helicase domain protein Nmar_0244 CENSYa_1721 64 216,685 217,677 330 f 1,710,360 1,711,349 329 r glycosyl transferase family 4 Nmar_0245 CENSYa_1720 56 217,901 218,377 158 r 
1,709,666 1,710,142 158 f acetyltransferase Nmar_0246 CENSYa_1719 61 218,381 218,650 89 r 1,709,392 1,709,661 89 f protein of unknown function DUF343 Nmar_0247 CENSYa_1718 62 218,647 219,582 311 r 1,708,460 1,709,395 311 f oxidoreductase domain protein Nmar_0248 CENSYa_1717 58 219,583 220,500 305 r 1,707,546 1,708,463 305 f branched-chain amino acid aminotransferase Nmar_0249 CENSYa_1630 15 220,541 221,920 459 r 1,633,373 1,635,529 718 r histidine kinase Nmar_0252 CENSYa_1716 60 223,281 223,886 201 r 1,707,197 1,707,493 98 f hypothetical protein Nmar_0253 CENSYa_1714 74 223,939 224,784 281 r 1,705,909 1,706,721 270 f oxidoreductase FAD/NAD(P)-binding domain protein Nmar_0255 CENSYa_1711 84 225,638 225,925 95 f 1,705,085 1,705,375 96 r alba, DNA/RNA-binding protein Nmar_0256 CENSYa_1710 70 225,951 226,817 288 r 1,704,218 1,705,060 280 f putative methylisocitrate lyase Nmar_0257 CENSYa_1708 65 227,020 227,529 169 r 1,703,109 1,703,915 268 f transcriptional regulator, PadR-like family

Nmar_0258 CENSYa_1705 82 227,624 229,336 570 f 1,700,274 1,701,332 352 r succinate dehydrogenase or fumarate reductase, flavoprotein subunit Nmar_0259 CENSYa_1704 77 229,337 229,768 143 f 1,699,840 1,700,274 144 r conserved hypothetical protein Nmar_0260 CENSYa_1703 73 229,770 230,114 114 f 1,699,496 1,699,843 115 r conserved hypothetical protein Nmar_0261 CENSYa_1702 69 230,117 230,866 249 f 1,698,720 1,699,496 258 r succinate dehydrogenase and fumarate reductase iron-sulfur protein Nmar_0262 CENSYa_1701 83 230,863 231,219 118 r 1,698,323 1,698,646 107 f protein of unknown function DUF59 Nmar_0263 CENSYa_1700 53 231,236 232,693 485 f 1,696,813 1,698,324 503 r argininosuccinate lyase Nmar_0264 CENSYa_1656 42 233,062 233,793 243 f 1,660,812 1,661,540 242 f hypothetical protein Nmar_0265 CENSYa_1657 49 233,806 234,882 358 f 1,661,549 1,662,619 356 f glutamyl-tRNA reductase Nmar_0266 CENSYa_1658 56 234,888 235,055 55 r 1,662,677 1,662,856 59 f hypothetical protein Nmar_0267 CENSYa_1659 70 235,140 235,982 280 r 1,662,970 1,663,809 279 r oxidoreductase FAD/NAD(P)-binding domain protein Nmar_0268 CENSYa_1652 58 236,068 236,868 266 r 1,657,288 1,658,103 271 r Inositol-phosphate phosphatase Nmar_0269 CENSYa_1653 59 236,951 238,072 373 f 1,658,197 1,659,321 374 f hypothetical protein Nmar_0270 CENSYa_1654 62 238,081 239,238 385 f 1,659,318 1,660,478 386 f AAA ATPase central domain protein Nmar_0271 CENSYa_1655 48 239,256 239,546 96 f 1,660,490 1,660,792 100 f hypothetical protein Nmar_0272 CENSYa_1660 79 239,653 241,200 515 f 1,663,911 1,665,458 515 f carboxyl transferase Nmar_0273 CENSYa_1661 77 241,206 242,693 495 f 1,665,512 1,666,942 476 f carbamoyl-phosphate synthase L chain ATP-binding Nmar_0274 CENSYa_1662 56 242,700 243,212 170 f 1,666,944 1,667,453 169 f biotin/lipoyl attachment domain-containing protein Nmar_0275 CENSYa_1663 81 243,229 243,693 154 r 1,667,459 1,667,929 156 r alkyl hydroperoxide reductase/ Thiol specific antioxidant/ Mal allergen Nmar_0276 
CENSYa_1664 77 243,850 244,182 110 f 1,668,048 1,668,419 123 f NADH-ubiquinone/plastoquinone oxidoreductase chain 3 Nmar_0277 CENSYa_1665 94 244,223 244,747 174 f 1,668,474 1,668,956 160 f NADH-quinone oxidoreductase, B subunit Nmar_0278 CENSYa_1666 64 244,747 245,349 200 f 1,668,953 1,669,765 270 f NADH dehydrogenase (ubiquinone) 30 kDa subunit Nmar_0279 CENSYa_1667 81 245,352 246,488 378 f 1,669,768 1,670,907 379 f NADH dehydrogenase (quinone) Nmar_0280 CENSYa_1668 75 246,489 247,787 432 f 1,670,907 1,672,205 432 f NADH dehydrogenase (quinone) Nmar_0281 CENSYa_1669 87 247,787 248,284 165 f 1,672,202 1,672,702 166 f 4Fe-4S ferredoxin iron-sulfur binding domain protein Nmar_0282 CENSYa_1670 77 248,277 248,789 170 f 1,672,728 1,673,207 159 f NADH-ubiquinone/plastoquinone oxidoreductase chain 6 Nmar_0283 CENSYa_1671 91 248,770 249,075 101 f 1,673,218 1,673,493 91 f NADH-ubiquinone oxidoreductase chain 4L Nmar_0284 CENSYa_1672 77 249,075 250,628 517 f 1,673,493 1,675,055 520 f proton-translocating NADH-quinone oxidoreductase, chain M Nmar_0285 CENSYa_1673 83 250,630 252,708 692 f 1,675,057 1,677,132 691 f proton-translocating NADH-quinone oxidoreductase, chain L Nmar_0286 CENSYa_1674 72 252,721 254,205 494 f 1,677,146 1,678,633 495 f proton-translocating NADH-quinone oxidoreductase, chain N Nmar_0287 CENSYa_1675 69 254,195 255,046 283 r 1,678,623 1,679,441 272 r Polyprenyl synthetase Nmar_0288 CENSYa_1684 60 255,156 255,896 246 f 1,684,842 1,685,531 229 f TENA/THI-4 domain protein Nmar_0289 CENSYa_1686 46 256,122 256,727 201 r 1,685,789 1,686,391 200 r orotate phosphoribosyltransferase Nmar_0290 CENSYa_1687 49 256,778 257,860 360 f 1,686,443 1,687,516 357 f hypothetical protein Nmar_0291 CENSYa_1688 64 257,852 258,958 368 r 1,687,508 1,688,602 364 r peptidase M50 Nmar_0292 CENSYa_1690 76 259,014 259,529 171 f 1,689,758 1,690,264 168 f ribosomal-protein-alanine acetyltransferase Nmar_0293 CENSYa_1691 70 259,567 259,785 72 f 1,690,295 1,690,513 72 f transcription 
regulator containing HTH domain-like protein Nmar_0294 CENSYa_1692 56 259,787 260,614 275 r 1,690,514 1,691,341 275 r protein of unknown function Met10 Nmar_0295 CENSYa_1694 73 260,620 262,407 595 r 1,691,905 1,693,689 594 r ABC transporter related Nmar_0296 CENSYa_0283 56 262,794 264,260 488 r 245,772 247,238 488 r leucyl aminopeptidase Nmar_0299 CENSYa_0086 67 265,235 266,902 555 f 61,147 62,661 504 f ribulose-phosphate 3-epimerase Nmar_0300 CENSYa_0087 71 266,899 267,873 324 f 62,658 63,614 318 f transketolase central region Nmar_0301 CENSYa_0088 71 267,870 268,538 222 f 63,665 64,276 203 f putative transaldolase Nmar_0303 CENSYa_1633 63 268,906 269,928 340 r 1,637,318 1,638,322 334 f Mg transporter protein CorA family protein Nmar_0304 CENSYa_0461 31 270,335 270,724 390 r 385,999 386,391 131 f hypothetical protein Nmar_0305 CENSYa_1622 65 270,781 271,128 115 r 1,623,311 1,623,655 114 f hypothetical protein Nmar_0306 CENSYa_1621 50 271,169 271,930 253 r 1,622,531 1,623,277 248 f hypothetical protein Nmar_0307 CENSYa_1620 67 271,956 272,225 89 r 1,622,173 1,622,520 115 f protein of unknown function UPF0147 Nmar_0309 CENSYa_0195 20 272,612 273,388 258 f 177,352 178,161 269 f GCN5-related N-acetyltransferase Nmar_0310 CENSYa_1619 59 273,380 274,249 289 r 1,621,263 1,622,129 288 f 5-carboxymethyl-2-hydroxymuconate Delta-isomerase Nmar_0311 CENSYa_1618 57 274,255 275,964 569 r 1,619,568 1,621,256 562 f glutamyl-tRNA synthetase Nmar_0312 CENSYa_1617 61 275,986 276,966 326 r 1,618,606 1,619,571 321 f polyprenyl synthetase Nmar_0313 CENSYa_1616 57 276,953 277,603 216 r 1,617,977 1,618,609 210 f isopentenyl-diphosphate delta-isomerase, type 1 Nmar_0314 CENSYa_1615 54 277,603 278,346 247 r 1,617,237 1,617,977 246 f aspartate/glutamate/uridylate kinase Nmar_0315 CENSYa_1614 44 278,371 279,312 313 r 1,616,279 1,617,208 309 f mevalonate kinase Nmar_0316 CENSYa_1612 58 279,341 279,964 207 r 1,614,826 1,615,452 208 f ribosomal protein S2 Nmar_0317 CENSYa_1611 62 279,972 
281,210 412 r 1,613,584 1,614,822 412 f phosphopyruvate hydratase Nmar_0318 CENSYa_1610 71 281,212 281,484 90 r 1,613,346 1,613,582 78 f RNA polymerase, N/8 Kd subunit Nmar_0319 CENSYa_1608 53 281,601 282,215 204 r 1,612,595 1,613,230 211 f translin Nmar_0320 CENSYa_1603 67 282,289 282,936 215 f 1,607,129 1,607,770 213 r carbohydrate kinase, YjeF related protein Nmar_0321 CENSYa_1602 62 282,933 283,541 202 f 1,606,521 1,607,132 203 r beta-lactamase domain protein Nmar_0322 CENSYa_1601 57 283,546 284,607 353 f 1,605,464 1,606,504 346 r nicotinate-nucleotide-dimethylbenzimidazole phosphoribosyltransferase Nmar_0323 CENSYa_1600 61 284,600 286,360 586 r 1,603,720 1,605,477 585 f glucosamine--fructose-6-phosphate aminotransferase, isomerizing Nmar_0324 CENSYa_1599 66 286,452 287,057 201 r 1,603,051 1,603,668 205 f ribosomal protein S4 Nmar_0325 CENSYa_1598 61 287,067 287,651 194 r 1,602,364 1,603,047 227 f ribosomal protein S13 Nmar_0326 CENSYa_1584 41 287,984 288,292 102 f 1,595,224 1,595,526 100 r hypothetical protein Nmar_0327 CENSYa_1583 30 288,284 288,787 167 r 1,594,731 1,595,231 166 f hypothetical protein Nmar_0333 CENSYa_1580 42 294,247 297,375 1042 f 1,589,264 1,592,332 1022 r hypothetical protein Nmar_0334 CENSYa_1579 50 297,421 298,452 343 f 1,588,210 1,589,202 330 r hypothetical protein Nmar_0335 CENSYa_1578 71 298,503 298,826 107 f 1,587,827 1,588,150 107 r conserved hypothetical protein Nmar_0336 CENSYa_1577 85 298,821 300,401 526 r 1,586,170 1,587,705 511 f radical SAM domain protein Nmar_0337 CENSYa_1576 54 300,474 301,250 258 r 1,584,905 1,585,708 267 f 1-(5-phosphoribosyl)-5-amino-4-imidazole- carboxylate (AIR) carboxylase Nmar_0338 CENSYa_1575 77 301,284 302,198 304 r 1,583,961 1,584,869 302 f lactate/malate dehydrogenase Nmar_0339 CENSYa_1574 46 302,289 303,509 406 f 1,582,613 1,583,830 405 r protein of unknown function DUF111 Nmar_0340 CENSYa_1573 60 303,510 304,307 265 f 1,581,822 1,582,616 264 r ExsB family protein Nmar_0341 CENSYa_1572 56 304,304 
305,470 388 f 1,580,668 1,581,792 374 r cysteine desulfurase Nmar_0342 CENSYa_1571 35 305,467 306,258 263 r 1,579,826 1,580,671 281 f hypothetical protein Nmar_0343 CENSYa_1570 45 306,364 307,248 294 r 1,578,886 1,579,749 287 f hypothetical protein Nmar_0344 CENSYa_0439 37 307,311 308,168 285 r 369,874 370,728 284 r hypothetical protein Nmar_0345 CENSYa_1569 53 308,232 309,071 279 r 1,578,018 1,578,821 267 f hypothetical protein Nmar_0346 CENSYa_1568 81 309,252 309,503 83 f 1,577,515 1,577,766 83 r RNA polymerase Rpb5 Nmar_0347 CENSYa_1567 88 309,504 312,851 1115 f 1,574,167 1,577,514 1115 r RNA polymerase Rpb2 domain 6 Nmar_0348 CENSYa_1566 82 312,851 316,642 1263 f 1,570,385 1,574,167 1260 r DNA-directed RNA polymerase subunit A' Nmar_0350 CENSYa_1565 42 316,926 318,104 392 f 1,569,188 1,570,354 388 r hypothetical protein Nmar_0352 CENSYa_0447 47 319,896 320,219 107 f 374,252 374,557 101 r ribosomal protein L7Ae/L30e/S12e/Gadd45 Nmar_0353 CENSYa_0446 74 320,271 320,735 154 f 373,714 374,178 154 r NusA family KH domain protein Nmar_0354 CENSYa_0445 88 320,739 321,176 145 f 373,274 373,711 145 r ribosomal protein S23 (S12) Nmar_0355 CENSYa_0444 82 321,179 321,778 199 f 372,672 373,271 199 r ribosomal protein S7 Nmar_0357 CENSYa_0442 61 322,289 322,993 234 f 371,462 372,409 315 r protein of unknown function UPF0153 Nmar_0359 CENSYa_0276 83 323,446 324,444 332 r 239,994 241,007 337 r dehydrogenase (flavoprotein)-like protein Nmar_0366 CENSYa_1913 58 330,106 330,759 217 r 1,880,399 1,881,082 227 f uracil-DNA glycosylase superfamily

Nmar_0367 CENSYa_1256 55 330,772 333,639 955 r 1,278,565 1,280,301 578 f leucyl-tRNA synthetase Nmar_0368 CENSYa_1254 62 333,646 336,315 889 r 1,274,761 1,277,445 894 f alanyl-tRNA synthetase Nmar_0370 CENSYa_1331 32 337,401 337,793 393 f 1,340,198 1,340,593 132 f hypothetical protein Nmar_0373 CENSYa_1252 60 340,670 341,179 169 f 1,273,836 1,274,363 175 r hypothetical protein Nmar_0374 CENSYa_1242 78 341,230 341,547 105 f 1,268,526 1,268,822 98 r ribosomal protein 60S Nmar_0381 CENSYa_1250 70 345,558 346,424 288 r 1,272,219 1,273,043 274 r ribosomal protein L10 Nmar_0382 CENSYa_1251 66 346,417 347,079 220 r 1,273,086 1,273,733 215 r ribosomal protein L1 Nmar_0383 CENSYa_0345 79 347,271 347,708 145 f 290,622 291,101 159 r transcriptional regulator, AsnC family Nmar_0385 CENSYa_0344 79 348,548 349,027 159 r 290,143 290,625 160 f ribosomal protein L11 Nmar_0386 CENSYa_0343 74 349,062 349,520 152 r 289,715 290,110 131 f NusG antitermination factor Nmar_0387 CENSYa_0341 58 349,736 350,512 258 f 288,664 289,440 258 r protein of unknown function DUF516 Nmar_0388 CENSYa_0340 59 350,509 350,808 99 f 288,353 288,667 104 r methylated-DNA--protein-cysteine methyltransferase Nmar_0389 CENSYa_1261 68 350,810 351,205 131 r 1,283,501 1,283,893 130 f glyoxalase/bleomycin resistance protein/dioxygenase Nmar_0390 CENSYa_0339 73 351,250 351,705 151 r 287,904 288,356 150 f ribosomal protein L19e Nmar_0391 CENSYa_0338 68 351,689 352,093 134 r 287,516 287,920 134 f ribosomal protein L32e Nmar_0392 CENSYa_0337 70 352,349 353,848 499 f 285,725 287,344 539 r phosphoenolpyruvate carboxykinase (ATP) Nmar_0394 CENSYa_0336 85 355,951 356,571 206 f 285,044 285,664 206 r superoxide dismutase Nmar_0395 CENSYa_0332 51 356,654 357,091 145 f 282,430 282,951 173 r prefoldin, alpha subunit Nmar_0396 CENSYa_0331 54 357,095 358,636 513 f 281,085 282,353 422 r signal recognition particle-docking protein FtsY Nmar_0397 CENSYa_0330 60 358,633 359,547 304 f 280,183 281,088 301 r ornithine 
carbamoyltransferase Nmar_0398 CENSYa_0329 68 359,584 360,066 160 f 279,669 280,148 159 r ribosomal protein L18P/L5E Nmar_0399 CENSYa_0328 79 360,068 360,784 238 f 279,010 279,669 219 r ribosomal protein S5 Nmar_0400 CENSYa_0327 60 360,791 361,258 155 f 278,538 279,005 155 r ribosomal protein L30 Nmar_0401 CENSYa_0326 71 361,260 361,724 154 f 278,121 278,537 138 r ribosomal protein L15 Nmar_0402 CENSYa_0325 85 361,717 363,147 476 f 276,684 278,120 478 r preprotein translocase, SecY subunit Nmar_0403 CENSYa_0324 59 363,134 363,712 192 f 276,119 276,655 178 r adenylate kinase Nmar_0404 CENSYa_0323 74 363,716 364,339 207 f 275,487 276,113 208 r membrane protein-like protein Nmar_0405 CENSYa_0322 64 364,336 364,896 186 f 274,930 275,487 185 r cytidylate kinase Nmar_0406 CENSYa_0321 69 364,893 365,891 332 f 273,932 274,933 333 r pseudouridylate synthase TruB domain protein Nmar_0407 CENSYa_0320 39 365,895 366,341 148 f 273,498 273,935 145 r hypothetical protein Nmar_0408 CENSYa_0388 38 366,847 367,329 160 f 337,276 338,094 272 f secreted protein Nmar_0409 CENSYa_0316 60 367,332 368,645 437 r 269,701 271,068 455 f FAD dependent oxidoreductase Nmar_0410 CENSYa_0315 49 368,734 369,015 93 f 269,324 269,605 93 r hypothetical protein Nmar_0411 CENSYa_0314 49 369,033 369,413 126 f 268,926 269,327 133 r hypothetical protein Nmar_0412 CENSYa_0313 55 369,607 370,575 322 r 267,895 268,941 348 f glyoxylate reductase Nmar_0413 CENSYa_0311 64 370,822 372,732 636 f 265,794 267,722 642 r pyruvate flavodoxin/ferredoxin oxidoreductase domain protein Nmar_0414 CENSYa_0310 79 372,722 373,684 320 f 264,842 265,771 309 r pyruvate ferredoxin/flavodoxin oxidoreductase, beta subunit Nmar_0415 CENSYa_0309 68 373,734 374,174 146 f 264,323 264,724 133 r hypothetical protein Nmar_0416 CENSYa_0308 50 374,167 374,700 177 r 263,824 264,330 168 f hypothetical protein Nmar_0418 CENSYa_0307 60 375,131 375,700 189 r 263,185 263,760 191 f hypothetical protein Nmar_0419 CENSYa_0306 56 375,731 376,510 259 r 
262,391 263,173 260 f uncharacterised conserved protein UCP005026 Nmar_0420 CENSYa_1484 77 376,584 377,852 422 f 1,485,551 1,486,840 429 f 3-isopropylmalate dehydratase Nmar_0421 CENSYa_1485 74 377,852 378,334 160 f 1,486,837 1,487,322 161 f 3-isopropylmalate dehydratase, small subunit Nmar_0422 CENSYa_1486 71 378,343 379,758 471 f 1,487,324 1,488,742 472 f 3-isopropylmalate dehydratase, large subunit Nmar_0423 CENSYa_1487 56 379,760 380,344 194 f 1,488,783 1,489,307 174 f 3-isopropylmalate dehydratase, small subunit Nmar_0424 CENSYa_1488 55 380,322 381,041 239 r 1,489,285 1,490,004 239 r nucleotidyl transferase Nmar_0425 CENSYa_1489 78 381,045 381,494 149 r 1,490,008 1,490,457 149 r ribosomal protein S9 Nmar_0426 CENSYa_1490 61 381,491 381,955 154 r 1,490,454 1,490,873 139 r ribosomal protein L13 Nmar_0427 CENSYa_1491 59 381,948 382,295 115 r 1,490,890 1,491,270 126 r ribosomal protein L15 Nmar_0428 CENSYa_1544 67 382,339 383,004 221 r 1,553,362 1,554,000 212 f RNA polymerase insert Nmar_0429 CENSYa_0389 35 383,085 383,438 354 f 338,160 338,510 117 r hypothetical protein Nmar_0430 CENSYa_1538 74 383,542 384,228 228 f 1,549,535 1,550,221 228 r Shwachman-Bodian-Diamond syndrome protein Nmar_0431 CENSYa_1537 70 384,228 384,908 226 f 1,548,858 1,549,535 225 r KH type 1 domain protein Nmar_0432 CENSYa_1536 76 384,908 385,642 244 f 1,548,127 1,548,858 243 r exosome complex exonuclease 1 Nmar_0433 CENSYa_1535 72 385,645 386,463 272 f 1,547,315 1,548,127 270 r 3' exoribonuclease Nmar_0434 CENSYa_1534 79 386,465 386,677 70 f 1,547,105 1,547,314 69 r hypothetical protein Nmar_0436 CENSYa_0007 53 386,930 387,319 129 f 4,974 5,492 172 r prefoldin, beta subunit Nmar_0437 CENSYa_0010 71 387,372 388,058 228 f 7,982 8,677 231 f ERCC4 domain protein Nmar_0439 CENSYa_0011 50 388,545 389,024 159 r 8,668 9,255 195 r nucleotide binding protein, PINc Nmar_0440 CENSYa_1369 70 389,014 389,850 278 r 1,398,983 1,399,816 277 r ATP-NAD/AcoX kinase Nmar_0441 CENSYa_1378 83 389,913 390,758 281 
f 1,405,811 1,406,680 289 f rhodanese domain protein Nmar_0442 CENSYa_1379 60 390,760 391,071 103 f 1,406,681 1,406,983 100 f hypothetical protein Nmar_0443 CENSYa_1126 71 391,108 392,055 315 f 1,156,940 1,157,869 309 r 4Fe-4S ferredoxin iron-sulfur binding domain protein Nmar_0444 CENSYa_1125 73 392,111 394,177 688 f 1,155,560 1,156,894 444 r glycosyl transferase family 2 Nmar_0445 CENSYa_1123 32 394,174 395,703 509 r 1,152,895 1,154,877 660 f hypothetical protein Nmar_0446 CENSYa_1129 73 396,135 397,460 441 f 1,158,729 1,160,099 456 f sodium/hydrogen exchanger Nmar_0447 CENSYa_1130 47 397,453 397,881 142 f 1,160,102 1,160,578 158 f hypothetical protein Nmar_0449 CENSYa_1135 71 398,278 398,787 169 r 1,164,146 1,164,658 170 r ribosomal protein L16 Nmar_0450 CENSYa_1136 58 398,850 399,449 199 f 1,164,797 1,165,360 187 f tRNA intron endonuclease Nmar_0451 CENSYa_1137 57 399,439 400,095 218 r 1,165,401 1,166,060 219 r DNA repair helicase Nmar_0455 CENSYa_1909 29 401,560 402,282 240 r 1,875,393 1,876,043 216 r hypothetical protein Nmar_0456 CENSYa_0508 13 402,319 410,613 2764 f 426,495 447,083 6862 r hypothetical protein Nmar_0458 CENSYa_0264 37 411,035 411,481 148 r 231,486 231,938 150 r hypothetical protein Nmar_0459 CENSYa_1909 33 411,535 412,170 636 r 1,875,393 1,876,043 217 r hypothetical protein Nmar_0462 CENSYa_1142 35 415,626 416,150 525 r 1,167,952 1,168,620 222 f HEAT repeat Nmar_0466 CENSYa_1142 37 418,792 419,673 293 f 1,167,952 1,168,620 222 f hypothetical protein Nmar_0467 CENSYa_1145 55 420,220 421,641 473 f 1,169,301 1,170,752 483 f hypothetical protein Nmar_0470 CENSYa_1147 64 422,609 423,250 213 f 1,171,630 1,172,274 214 f hypothetical protein Nmar_0473 CENSYa_1148 56 426,320 426,637 105 r 1,172,276 1,172,596 106 r hypothetical protein Nmar_0474 CENSYa_1149 62 426,726 427,178 150 f 1,172,685 1,173,140 151 f hypothetical protein Nmar_0477 CENSYa_0565 35 429,251 430,828 525 f 494,502 499,202 1566 f NHL repeat containing protein Nmar_0479 CENSYa_0570 60 
432,960 434,213 417 f 505,584 507,107 507 f phosphate ABC transporter, periplasmic phosphate-binding protein Nmar_0481 CENSYa_0576 69 434,638 435,615 325 f 512,356 513,321 321 f phosphate ABC transporter, inner membrane subunit PstC Nmar_0482 CENSYa_0577 63 435,612 436,544 310 f 513,354 514,232 292 f phosphate ABC transporter, inner membrane subunit PstA Nmar_0483 CENSYa_0578 70 436,541 437,374 277 f 514,235 515,056 273 f phosphate ABC transporter, ATPase subunit Nmar_0484 CENSYa_1367 66 437,374 438,033 219 f 1,397,483 1,398,139 218 r phosphate uptake regulator, PhoU Nmar_0489 CENSYa_1359 42 440,199 440,882 227 f 1,385,705 1,386,367 220 r major intrinsic protein Nmar_0490 CENSYa_1357 58 440,908 442,182 424 f 1,383,004 1,384,374 456 r glutamate-1-semialdehyde-2,1-aminomutase Nmar_0491 CENSYa_1356 60 442,179 443,114 311 f 1,382,072 1,383,007 311 r porphobilinogen deaminase Nmar_0492 CENSYa_1355 68 443,111 443,851 246 f 1,381,335 1,382,075 246 r uroporphyrin-III C-methyltransferase

Nmar_0493 CENSYa_1354 56 443,851 444,654 267 f 1,380,559 1,381,338 259 r uroporphyrinogen III synthase HEM4 Nmar_0495 CENSYa_1353 89 444,970 446,370 466 f 1,379,081 1,380,493 470 r FeS assembly protein SufB Nmar_0496 CENSYa_1352 67 446,390 447,790 466 f 1,377,672 1,379,057 461 r FeS assembly protein SufD Nmar_0497 CENSYa_1350 71 447,804 449,048 414 f 1,376,114 1,377,358 414 r cysteine desulfurase, SufS subfamily Nmar_0498 CENSYa_1349 67 449,045 449,488 147 f 1,375,687 1,376,094 135 r SUF system FeS assembly protein, NifU family Nmar_0499 CENSYa_1348 45 449,490 449,834 114 f 1,375,341 1,375,685 114 r hypothetical protein Nmar_0500 CENSYa_1347 70 449,827 451,053 408 r 1,374,119 1,375,339 406 f phosphoglycerate kinase Nmar_0501 CENSYa_1392 52 451,172 451,840 222 f 1,417,127 1,417,753 208 f phosphoesterase PA-phosphatase related Nmar_0502 CENSYa_1393 56 451,829 452,140 103 r 1,417,742 1,418,053 103 r hypothetical protein Nmar_0504 CENSYa_1397 32 453,282 453,887 201 f 1,421,180 1,421,770 196 f hypothetical protein Nmar_0505 CENSYa_1398 52 453,881 454,195 104 r 1,421,828 1,422,226 132 r hypothetical protein Nmar_0506 CENSYa_1400 59 454,758 455,378 206 f 1,422,629 1,423,249 206 f SNARE associated Golgi protein Nmar_0508 CENSYa_1401 50 455,996 456,307 103 r 1,423,246 1,423,503 85 r hypothetical protein Nmar_0509 CENSYa_1402 64 456,304 457,284 326 r 1,423,535 1,424,524 329 r porphobilinogen synthase Nmar_0510 CENSYa_1403 63 457,316 458,581 421 r 1,424,556 1,425,809 417 r glutamyl-tRNA reductase Nmar_0511 CENSYa_1404 38 458,578 459,231 217 r 1,425,806 1,426,459 217 r siroheme synthase Nmar_0512 CENSYa_1301 69 459,323 460,348 341 f 1,310,966 1,311,985 339 f putative transcriptional regulator, AsnC family Nmar_0514 CENSYa_1302 69 460,691 461,689 332 r 1,311,986 1,312,966 326 r aldo/keto reductase Nmar_0515 CENSYa_1416 65 461,765 462,442 225 r 1,436,202 1,436,873 223 f chlorite dismutase Nmar_0516 CENSYa_1414 82 462,514 463,284 256 r 1,434,079 1,434,846 255 f FeS assembly 
ATPase SufC Nmar_0517 CENSYa_1413 64 463,382 464,338 318 r 1,433,062 1,434,027 321 f zinc finger TFIIB-type domain protein Nmar_0518 CENSYa_1412 50 464,335 464,583 82 r 1,432,820 1,433,065 81 f hypothetical protein Nmar_0519 CENSYa_1411 62 464,679 464,978 99 r 1,432,434 1,432,739 101 f signal recognition particle, subunit SRP19 (SRP19) Nmar_0520 CENSYa_1410 73 464,980 465,363 127 r 1,432,040 1,432,429 129 f ribosomal protein S8e Nmar_0521 CENSYa_1409 60 465,432 466,010 192 r 1,431,384 1,431,977 197 f chromosome segregation and condensation protein ScpB Nmar_0522 CENSYa_1408 58 466,065 466,742 225 r 1,430,712 1,431,380 222 f chromosome segregation and condensation protein ScpA Nmar_0523 CENSYa_1407 66 466,837 467,889 350 f 1,429,504 1,430,562 352 r alcohol dehydrogenase GroES domain protein Nmar_0526 CENSYa_0779 73 469,360 469,671 103 f 731,412 731,969 185 f translation initiation factor eIF-1A Nmar_0527 CENSYa_0497 77 469,717 469,908 63 f 419,650 419,841 63 f cold-shock DNA-binding domain protein Nmar_0528 CENSYa_1173 53 470,238 470,846 202 f 1,206,514 1,207,119 201 r hypothetical protein Nmar_0529 CENSYa_0419 78 470,881 471,291 136 f 355,433 355,906 157 f translation initiation factor eIF-5A Nmar_0530 CENSYa_0420 51 471,402 472,091 229 f 355,998 356,804 268 f putative ATP binding protein Nmar_0531 CENSYa_0421 73 472,084 473,412 442 f 356,806 357,999 397 f GTP-binding signal recognition particle SRP54 G- domain Nmar_0532 CENSYa_0422 33 473,435 474,580 381 f 358,039 359,247 402 f pseudouridylate synthase-like protein Nmar_0533 CENSYa_0424 51 474,754 475,464 236 f 359,642 360,211 189 f hypothetical protein Nmar_0534 CENSYa_0425 45 475,477 475,650 57 r 360,216 360,383 55 r ribosomal protein S27a Nmar_0535 CENSYa_0426 51 475,650 476,093 147 r 360,383 360,997 204 r hypothetical protein Nmar_0536 CENSYa_0430 58 476,150 476,710 186 f 362,275 362,814 179 f protein of unknown function DUF309 Nmar_0537 CENSYa_0431 61 476,694 477,980 428 r 362,804 364,063 419 r 
phosphoglycerate mutase Nmar_0538 CENSYa_0432 73 478,209 479,225 338 f 364,127 365,182 351 f galactose-1-phosphate uridylyltransferase Nmar_0539 CENSYa_0433 36 479,268 479,714 148 f 365,236 365,673 145 f hypothetical protein Nmar_0540 CENSYa_0434 58 479,711 480,148 145 r 365,670 366,089 139 r GCN5-related N-acetyltransferase Nmar_0541 CENSYa_0435 59 480,145 480,861 238 r 366,104 366,751 215 r nucleotidyl transferase Nmar_0542 CENSYa_0221 40 480,908 482,206 432 r 199,804 201,120 438 f phosphomethylpyrimidine kinase Nmar_0543 CENSYa_0216 80 482,942 484,270 442 f 195,975 197,330 451 r thiamine biosynthesis protein ThiC Nmar_0544 CENSYa_0215 44 484,253 484,657 134 r 195,726 196,031 101 f NUDIX hydrolase Nmar_0545 CENSYa_0214 69 484,663 485,514 283 r 194,795 195,640 281 f prephenate dehydrogenase Nmar_0546 CENSYa_0213 52 485,511 486,881 456 r 193,434 194,783 449 f aminotransferase class I and II Nmar_0547 CENSYa_0212 69 486,886 487,983 365 r 192,343 193,434 363 f chorismate synthase Nmar_0548 CENSYa_0205 51 488,310 489,578 422 r 185,867 187,111 414 f 3-phosphoshikimate 1-carboxyvinyltransferase Nmar_0549 CENSYa_0204 58 489,568 490,422 284 r 185,020 185,865 281 f shikimate kinase Nmar_0550 CENSYa_0203 55 490,422 491,243 273 r 184,217 185,023 268 f shikimate 5-dehydrogenase Nmar_0551 CENSYa_0202 49 491,292 491,963 223 r 183,509 184,162 217 f 3-dehydroquinate dehydratase, type I Nmar_0552 CENSYa_0201 73 491,965 493,026 353 r 182,496 183,512 338 f 3-dehydroquinate synthase Nmar_0553 CENSYa_0200 72 493,019 493,801 260 r 181,742 182,473 243 f predicted phospho-2-dehydro-3-deoxyheptonate aldolase Nmar_0554 CENSYa_0199 73 493,936 495,054 372 f 180,425 181,531 368 r radical SAM domain protein Nmar_0555 CENSYa_0198 73 495,049 496,491 480 r 178,951 180,393 480 f UbiD family decarboxylase Nmar_0556 CENSYa_1358 30 496,619 499,099 826 f 1,384,458 1,385,690 410 f histidine kinase Nmar_0557 CENSYa_0196 76 499,165 499,548 127 r 178,254 178,628 124 f hypothetical protein Nmar_0558 
CENSYa_0192 52 499,587 500,318 243 f 171,229 172,347 372 r hypothetical protein Nmar_0559 CENSYa_0192 38 500,703 501,698 996 f 171,229 172,347 373 r hypothetical protein Nmar_0560 CENSYa_0191 67 501,721 502,185 154 r 170,706 171,167 153 f alkyl hydroperoxide reductase/ Thiol specific antioxidant/ Mal allergen Nmar_0561 CENSYa_0190 57 502,286 502,999 237 f 169,899 170,645 248 r major intrinsic protein Nmar_0563 CENSYa_0269 42 503,460 504,278 819 f 235,440 236,177 246 r hypothetical protein Nmar_0571 CENSYa_1990 57 508,373 508,744 123 f 1,959,314 1,959,685 123 r conserved hypothetical protein Nmar_0572 CENSYa_1806 54 508,749 510,014 421 f 1,786,901 1,788,181 426 f conserved hypothetical protein, membrane Nmar_0573 CENSYa_1826 57 510,081 510,596 171 f 1,804,394 1,804,939 181 r ferritin Dps family protein Nmar_0574 CENSYa_1161 68 510,959 511,132 57 r 1,195,220 1,195,393 57 r hypothetical protein Nmar_0576 CENSYa_1160 49 511,861 512,307 148 f 1,194,717 1,195,163 148 r pyridoxamine 5'-phosphate oxidase-related FMN-binding Nmar_0577 CENSYa_1159 30 512,302 513,054 250 r 1,193,941 1,194,732 263 f hypothetical protein Nmar_0578 CENSYa_1157 64 513,299 514,888 529 r 1,192,113 1,193,690 525 f lysyl-tRNA synthetase Nmar_0579 CENSYa_1156 60 514,892 516,475 527 r 1,190,545 1,192,113 522 f histone acetyltransferase, ELP3 family Nmar_0581 CENSYa_1152 76 516,887 518,137 416 r 1,175,420 1,176,811 463 f sodium/hydrogen exchanger Nmar_0582 CENSYa_1151 63 518,212 519,234 340 r 1,174,372 1,175,376 334 f phosphoribosylformylglycinamidine cyclo-ligase Nmar_0584 CENSYa_0161 33 520,425 522,080 1,656 f 138,954 144,167 1,738 r hypothetical protein Nmar_0589 CENSYa_0389 26 525,213 525,581 369 f 338,160 338,510 117 r hypothetical protein Nmar_0590 CENSYa_0472 59 525,903 526,793 296 f 395,221 396,105 294 r Na+/Ca+ antiporter, CaCA family Nmar_0592 CENSYa_1240 66 528,167 529,426 419 r 1,266,048 1,267,214 388 f radical SAM domain protein Nmar_0593 CENSYa_1239 58 529,423 530,595 390 r 1,264,894 
1,265,946 350 f radical SAM domain protein Nmar_0594 CENSYa_1238 62 530,634 531,773 379 r 1,263,583 1,264,740 385 f 2-alkenal reductase Nmar_0595 CENSYa_1236 61 531,812 533,020 402 r 1,261,060 1,262,217 385 f glycosyl transferase family 2 Nmar_0596 CENSYa_1235 51 533,024 533,440 138 r 1,260,497 1,261,018 173 f cyclase/dehydrase Nmar_0598 CENSYa_1234 98 534,324 534,884 186 r 1,259,842 1,260,402 186 f 2-alkenal reductase Nmar_0599 CENSYa_1233 57 535,023 536,144 373 f 1,258,601 1,259,671 356 r protein of unknown function DUF1512 Nmar_0600 CENSYa_1232 44 536,144 536,323 59 f 1,258,434 1,258,604 56 r hypothetical protein Nmar_0601 CENSYa_1231 71 536,325 537,218 297 r 1,257,517 1,258,437 306 f methionine aminopeptidase, type II Nmar_0602 CENSYa_1222 33 537,506 537,751 81 f 1,247,651 1,247,896 81 r hypothetical protein Nmar_0616 CENSYa_1221 36 554,316 557,606 1096 f 1,244,330 1,247,569 1079 r hypothetical protein Nmar_0617 CENSYa_1220 47 557,641 558,132 163 f 1,243,780 1,244,298 172 r protein of unknown function DUF367 Nmar_0618 CENSYa_1313 53 558,117 560,579 820 r 1,321,685 1,324,150 821 f solute binding protein-like protein

Nmar_0619 CENSYa_1312 49 560,659 561,312 217 f 1,320,816 1,321,439 207 r hypothetical protein Nmar_0620 CENSYa_1311 54 561,296 563,041 581 r 1,319,093 1,320,829 578 f membrane protein-like protein Nmar_0621 CENSYa_1209 55 563,248 563,577 109 r 1,234,676 1,235,170 164 r Sjogrens syndrome scleroderma autoantigen 1 Nmar_0622 CENSYa_1212 92 563,679 563,987 102 f 1,237,032 1,237,340 102 f translation initiation factor SUI1 Nmar_0623 CENSYa_1213 36 563,994 564,983 329 f 1,237,342 1,238,274 310 f abortive infection protein Nmar_0624 CENSYa_1876 69 564,972 565,880 909 r 1,849,320 1,850,231 304 f transcription initiation factor, TFIIB Nmar_0625 CENSYa_1216 50 565,999 566,637 212 r 1,239,556 1,240,089 177 r protein of unknown function DUF121 Nmar_0626 CENSYa_1217 79 566,639 567,598 319 r 1,240,091 1,241,017 308 r LPPG domain containing protein Nmar_0627 CENSYa_1218 47 567,675 569,051 458 f 1,241,346 1,242,695 449 f hypothetical protein Nmar_0629 CENSYa_1314 49 569,705 570,328 207 r 1,324,125 1,324,871 248 r methyltransferase type 11 Nmar_0630 CENSYa_1316 45 570,769 573,681 970 f 1,325,819 1,328,764 981 f hypothetical protein Nmar_0631 CENSYa_1317 53 573,682 574,890 402 r 1,328,925 1,329,713 262 r geranylgeranyl reductase Nmar_0632 CENSYa_1322 65 574,939 575,148 69 r 1,334,601 1,334,810 69 r hypothetical protein Nmar_0634 CENSYa_1323 63 575,917 576,678 253 r 1,334,887 1,335,645 252 r short-chain dehydrogenase/reductase SDR Nmar_0635 CENSYa_0230 52 576,732 577,610 292 r 206,615 207,463 282 f 2-hydroxy-3-oxopropionate reductase Nmar_0636 CENSYa_0228 55 577,614 578,000 128 r 205,590 205,976 128 f hypothetical protein Nmar_0637 CENSYa_1022 40 578,441 578,743 303 f 1,079,363 1,079,785 141 r hypothetical protein Nmar_0639 CENSYa_0223 60 579,023 579,685 220 f 201,595 202,233 212 r DSBA oxidoreductase Nmar_0640 CENSYa_0222 57 579,686 580,165 159 f 201,117 201,524 135 r hypothetical protein Nmar_0641 CENSYa_0387 63 580,411 581,352 313 f 336,405 337,172 255 r hypothetical protein 
Nmar_0642 CENSYa_0386 62 581,336 582,430 364 r 335,323 336,414 363 f DNA-directed DNA polymerase Nmar_0643 CENSYa_0385 49 582,503 582,754 83 f 335,006 335,251 81 r hypothetical protein Nmar_0644 CENSYa_0384 72 583,285 583,518 77 f 334,474 334,950 158 r DNA-binding protein Nmar_0647 CENSYa_0383 49 584,407 585,120 237 r 333,787 334,494 235 f protein of unknown function DUF75 Nmar_0648 CENSYa_0380 67 585,157 586,098 313 r 326,341 327,279 312 f deoxyhypusine synthase Nmar_0649 CENSYa_0379 81 586,202 587,296 364 f 325,073 326,290 405 r myo-inositol-1-phosphate synthase Nmar_0651 CENSYa_0378 50 588,210 591,734 1174 f 321,467 324,994 1175 r SMC domain protein Nmar_0652 CENSYa_0377 63 591,731 592,699 322 r 320,465 321,421 318 r oligopeptide/dipeptide ABC transporter, ATPase subunit Nmar_0653 CENSYa_0372 49 592,793 593,812 339 f 316,933 317,973 346 r conserved hypothetical protein Nmar_0655 CENSYa_0223 40 594,507 595,166 660 r 201,595 202,233 213 r protein-disulfide isomerase Nmar_0657 CENSYa_0371 65 596,491 597,801 436 r 315,657 316,961 434 f protein of unknown function DUF323 Nmar_0658 CENSYa_0369 68 597,843 599,285 480 r 313,409 314,839 476 f prolyl-tRNA synthetase Nmar_0659 CENSYa_0368 74 599,345 599,629 94 r 313,058 313,372 104 f conserved hypothetical protein Nmar_0666 CENSYa_0367 72 608,181 608,897 238 f 312,186 312,836 216 r phosphoserine phosphatase SerB Nmar_0667 CENSYa_0366 59 608,932 609,657 241 f 311,444 311,998 184 r F420-dependent oxidoreductase, putative Nmar_0668 CENSYa_0365 54 609,714 610,628 304 f 310,446 311,363 305 r PHP domain protein Nmar_0669 CENSYa_0364 68 610,696 610,932 78 f 310,177 310,398 73 r SirA family protein Nmar_0672 CENSYa_0363 69 612,652 613,656 334 r 309,204 310,175 323 f thioredoxin reductase Nmar_0674 CENSYa_1325 48 614,254 614,685 143 f 1,336,267 1,336,713 148 r UspA domain protein Nmar_0677 CENSYa_0362 40 615,949 616,719 256 r 308,437 309,147 236 f hypothetical protein Nmar_0678 CENSYa_0361 79 616,743 617,561 272 r 307,557 308,369 
270 f thiazole biosynthesis enzyme Nmar_0679 CENSYa_0358 71 617,751 619,559 602 f 302,735 304,540 601 r sulfite reductase (ferredoxin) Nmar_0680 CENSYa_0359 63 619,556 620,353 265 f 304,731 305,525 264 r rhodanese domain protein Nmar_0681 CENSYa_0357 66 620,354 621,481 375 r 301,484 302,593 369 f tetratricopeptide TPR_2 repeat protein Nmar_0682 CENSYa_0352 58 621,596 623,224 542 f 296,630 298,036 468 r collagen triple helix repeat Nmar_0684 CENSYa_0349 74 624,050 625,159 369 r 294,723 295,793 356 f small GTP-binding protein Nmar_0685 CENSYa_0348 51 625,481 626,176 231 f 293,881 294,684 267 r SPP-like hydrolase Nmar_0686 CENSYa_0347 58 626,178 628,064 628 f 292,037 293,860 607 r arginyl-tRNA synthetase Nmar_0689 CENSYa_0024 53 630,214 630,555 342 r 18,917 19,720 268 r hypothetical protein Nmar_0690 CENSYa_0346 66 631,181 632,062 293 f 291,169 291,993 274 r tRNA methyltransferase complex GCD14 subunit Nmar_0691 CENSYa_0257 63 632,107 632,970 287 f 226,468 227,448 326 f carbohydrate kinase, YjeF related protein Nmar_0692 CENSYa_0258 69 632,975 634,327 450 f 227,451 228,560 369 f peptidase M20 Nmar_0693 CENSYa_0260 57 634,324 634,740 138 r 228,794 229,225 143 r DoxX family protein Nmar_0694 CENSYa_0261 75 634,836 635,432 198 f 229,403 229,984 193 f proteasome endopeptidase complex Nmar_0695 CENSYa_0262 62 635,442 635,918 158 r 229,981 230,454 157 r PUA domain containing protein Nmar_0696 CENSYa_0263 69 635,915 636,925 336 r 230,451 231,458 335 r homoserine dehydrogenase Nmar_0697 CENSYa_0265 36 637,018 637,425 135 r 232,064 232,441 125 r methionine-R-sulfoxide reductase Nmar_0698 CENSYa_0266 61 637,462 639,099 545 r 232,489 234,069 526 r thymidylate synthase complementing protein ThyX Nmar_0699 CENSYa_0267 51 639,111 639,404 97 r 234,154 234,441 95 r hypothetical protein Nmar_0700 CENSYa_0268 56 639,538 640,416 292 f 234,581 235,402 273 f hypothetical protein Nmar_0703 CENSYa_0270 71 642,950 643,972 340 r 236,246 237,268 340 r XPG I domain protein Nmar_0710 CENSYa_0075 
41 648,858 649,091 234 f 52,786 53,025 80 r transcriptional regulator Nmar_0712 CENSYa_0273 46 649,883 650,224 113 r 237,590 237,922 110 r hypothetical protein Nmar_0714 CENSYa_1303 33 650,813 652,093 426 f 1,313,074 1,314,369 431 f hypothetical protein Nmar_0715 CENSYa_1304 61 652,125 652,526 133 f 1,314,397 1,314,801 134 f Mov34/MPN/PAD-1 family protein Nmar_0716 CENSYa_1305 75 652,664 652,927 87 f 1,314,887 1,315,177 96 f hypothetical protein Nmar_0717 CENSYa_1306 43 652,930 653,241 103 f 1,315,178 1,315,483 101 f hypothetical protein Nmar_0718 CENSYa_1307 63 653,242 654,609 455 r 1,315,480 1,316,775 431 r binding-protein-dependent transport systems inner membrane component Nmar_0719 CENSYa_1308 76 654,606 655,655 349 r 1,316,831 1,318,042 403 r binding-protein-dependent transport systems inner membrane component Nmar_0720 CENSYa_1733 46 655,859 656,167 102 f 1,723,076 1,723,480 134 r hypothetical protein Nmar_0723 CENSYa_1800 33 657,033 657,980 315 r 1,780,929 1,782,050 373 r hypothetical protein Nmar_0726 CENSYa_1802 72 659,913 661,637 574 f 1,782,471 1,784,195 574 f SNF2-related protein Nmar_0727 CENSYa_1803 33 661,638 662,252 204 r 1,784,167 1,784,916 249 r hypothetical protein Nmar_0729 CENSYa_0336 55 662,927 663,550 624 f 285,044 285,664 207 r superoxide dismutase Nmar_0730 CENSYa_1804 61 663,600 664,898 432 f 1,784,986 1,786,530 514 f hypothetical protein Nmar_0731 CENSYa_1343 48 665,007 665,381 124 f 1,348,305 1,348,667 120 r hypothetical protein Nmar_0732 CENSYa_1805 68 665,435 665,686 83 f 1,786,607 1,786,858 83 f hypothetical protein Nmar_0733 CENSYa_1482 60 665,703 665,846 144 r 1,485,117 1,485,260 48 f hypothetical protein Nmar_0734 CENSYa_1809 37 666,166 667,611 481 r 1,793,565 1,794,941 458 r FAD linked oxidase domain protein Nmar_0735 CENSYa_1810 60 667,620 668,933 437 r 1,794,938 1,796,248 436 r hypothetical protein Nmar_0736 CENSYa_1811 28 669,032 669,436 134 f 1,796,356 1,796,751 131 f hypothetical protein Nmar_0737 CENSYa_1812 34 669,433 
669,981 182 r 1,796,748 1,797,263 171 r heat shock protein DnaJ domain protein Nmar_0738 CENSYa_1813 67 669,992 670,435 147 r 1,797,267 1,797,725 152 r transcriptional regulator, AsnC family Nmar_0739 CENSYa_1814 53 670,559 670,786 75 f 1,797,797 1,798,018 73 f TPR repeat-containing protein Nmar_0740 CENSYa_1815 42 670,859 671,335 158 f 1,798,058 1,798,531 157 f TPR repeat-containing protein Nmar_0741 CENSYa_1816 61 671,322 671,495 57 r 1,798,528 1,798,701 57 r hypothetical protein Nmar_0742 CENSYa_1817 77 671,572 672,669 365 f 1,798,830 1,799,927 365 f radical SAM domain protein Nmar_0743 CENSYa_1818 57 672,666 672,971 101 r 1,799,924 1,800,238 104 r protein of unknown function DUF77 Nmar_0746 CENSYa_0919 69 673,708 674,535 275 r 974,439 975,221 260 f protein of unknown function DUF191 Nmar_0747 CENSYa_0916 40 674,573 675,526 317 f 956,842 957,774 310 r hypothetical protein

Nmar_0748 CENSYa_0915 84 675,576 677,768 730 f 954,593 956,785 730 r translation elongation factor aEF-2 Nmar_0755 CENSYa_0448 43 681,293 681,910 205 r 374,609 375,172 187 r O-methyltransferase family 3 Nmar_0756 CENSYa_0766 56 682,006 682,383 125 f 719,113 719,487 124 r hypothetical protein Nmar_0763 CENSYa_0910 52 685,116 685,928 270 r 951,992 952,753 253 f hypothetical protein Nmar_0764 CENSYa_0909 67 685,965 687,782 605 r 950,069 951,898 609 f CBS domain containing protein Nmar_0765 CENSYa_0908 74 687,923 688,366 147 f 949,553 950,002 149 r hypothetical protein Nmar_0770 CENSYa_0907 70 690,338 691,429 363 r 948,465 949,556 363 f IMP biosynthesis enzyme PurP domain protein Nmar_0771 CENSYa_0902 75 691,728 692,285 185 f 945,138 945,779 213 r 3-octaprenyl-4-hydroxybenzoate carboxy-lyase Nmar_0772 CENSYa_0901 36 692,293 692,721 142 f 944,670 945,080 136 r hypothetical protein Nmar_0773 CENSYa_0900 80 692,718 692,924 68 r 944,482 944,673 63 f sec-independent translocation protein mttA/Hcf106 Nmar_0774 CENSYa_0899 65 692,953 693,744 263 r 943,615 944,409 264 f sec-independent protein translocase, TatC subunit Nmar_0775 CENSYa_0898 63 693,772 694,236 154 r 943,089 943,586 165 f hypothetical protein Nmar_0776 CENSYa_2060 53 688,854 689,246 393 f 2,030,326 2,034,024 1,233 r hypothetical protein Nmar_0777 CENSYa_0353 61 695,077 695,337 86 f 298,139 298,399 86 r phosphoribosylformylglycinamidine synthase, purS Nmar_0779 CENSYa_1912 66 695,828 696,514 228 f 1,879,631 1,880,305 224 r phosphoribosylformylglycinamidine synthase I Nmar_0780 CENSYa_1911 61 696,511 698,676 721 f 1,877,493 1,879,634 713 r phosphoribosylformylglycinamidine synthase II Nmar_0781 CENSYa_1910 78 698,669 700,138 489 f 1,876,052 1,877,455 467 r glutamine amidotransferase class-II Nmar_0782 CENSYa_0895 70 700,141 700,419 92 f 911,771 912,091 106 r protein of unknown function DUF427 Nmar_0784 CENSYa_0887 59 700,874 701,698 274 f 906,283 907,104 273 r phosphoribosylaminoimidazolesuccinocarboxamide 
synthase Nmar_0785 CENSYa_0885 58 702,047 702,550 167 f 905,096 905,617 173 r transcriptional regulator, ArsR family Nmar_0787 CENSYa_0884 62 702,901 703,305 134 f 904,689 905,087 132 r pyridoxamine 5'-phosphate oxidase-related FMN-binding Nmar_0792 CENSYa_0877 79 706,645 706,893 82 r 891,729 891,974 81 f hypothetical protein Nmar_0793 CENSYa_0874 47 707,231 707,518 95 r 889,774 890,061 95 f hypothetical protein Nmar_0794 CENSYa_0873 63 707,534 708,091 185 r 889,274 889,771 165 f ribosomal protein L6 Nmar_0795 CENSYa_0872 79 708,081 708,473 130 r 888,844 889,236 130 f ribosomal protein S8 Nmar_0796 CENSYa_0871 91 708,480 708,659 59 r 888,659 888,838 59 f ribosomal protein S14 Nmar_0797 CENSYa_0870 69 708,659 709,180 173 r 888,150 888,659 169 f ribosomal protein L5 Nmar_0798 CENSYa_0869 60 709,185 709,901 238 r 887,432 888,148 238 f ribosomal protein S4E, central domain protein Nmar_0799 CENSYa_0868 48 709,901 710,407 168 r 886,929 887,432 167 f KOW domain protein Nmar_0800 CENSYa_0867 79 710,412 710,834 140 r 886,493 886,927 144 f ribosomal protein L14b/L23e Nmar_0801 CENSYa_0866 63 710,834 711,157 107 r 886,164 886,493 109 f ribosomal protein S17 Nmar_0802 CENSYa_0865 31 711,154 711,420 88 r 885,925 886,167 80 f ribonuclease P, Rpp29 Nmar_0803 CENSYa_0864 62 711,417 711,623 68 r 885,728 885,928 66 f ribosomal protein L29 Nmar_0804 CENSYa_0863 66 711,620 712,375 251 r 884,946 885,731 261 f ribosomal protein S3 Nmar_0805 CENSYa_0862 72 712,378 712,836 152 r 884,485 884,943 152 f ribosomal protein L22 Nmar_0806 CENSYa_0861 80 712,842 713,237 131 r 884,084 884,479 131 f ribosomal protein S19 Nmar_0807 CENSYa_0860 73 713,265 713,531 88 r 883,802 884,044 80 f ribosomal protein L25/L23 Nmar_0808 CENSYa_0859 61 713,528 714,343 271 r 882,972 883,787 271 f ribosomal protein L4/L1e Nmar_0809 CENSYa_0858 68 714,340 715,335 331 r 881,989 882,975 328 f ribosomal protein L3 Nmar_0810 CENSYa_0992 55 715,541 716,353 270 r 1,046,544 1,047,341 265 f protein of unknown function 
DUF171 Nmar_0811 CENSYa_0991 73 716,385 717,122 245 r 1,045,770 1,046,507 245 f proteasome endopeptidase complex Nmar_0812 CENSYa_0989 51 717,175 717,591 138 f 1,044,329 1,044,742 137 r protein of unknown function DUF371 Nmar_0813 CENSYa_0987 69 717,819 718,160 113 f 1,043,791 1,044,075 94 r hypothetical protein Nmar_0815 CENSYa_0033 45 718,761 720,380 1,620 r 24,575 25,207 211 f copper binding protein, plastocyanin/azurin family Nmar_0816 CENSYa_0986 53 720,467 721,117 216 r 1,043,156 1,043,785 209 f SNF7 Nmar_0817 CENSYa_0985 55 721,209 722,318 369 r 1,041,950 1,043,062 370 f redoxin domain protein Nmar_0818 CENSYa_0984 68 722,323 723,057 244 r 1,041,214 1,041,945 243 f cytochrome c biogenesis protein transmembrane region Nmar_0819 CENSYa_0982 50 723,259 724,383 374 r 1,039,061 1,040,083 340 f 8-amino-7-oxononanoate synthase Nmar_0820 CENSYa_1338 44 728,698 728,925 228 r 1,345,711 1,346,229 173 r hypothetical protein Nmar_0821 CENSYa_0981 66 725,326 726,306 326 r 1,038,016 1,038,972 318 f biotin synthase Nmar_0822 CENSYa_0980 57 726,380 727,660 426 r 1,036,637 1,037,956 439 f aminotransferase class-III Nmar_0823 CENSYa_0979 43 728,011 728,697 228 f 1,035,647 1,036,423 258 r dethiobiotin synthase Nmar_0827 CENSYa_0978 29 730,421 730,969 182 r 1,035,259 1,035,669 136 f hypothetical protein Nmar_0828 CENSYa_0977 76 730,969 732,363 464 r 1,033,737 1,035,131 464 f hydroxymethylglutaryl-CoA synthase Nmar_0829 CENSYa_0976 56 732,469 733,179 236 r 1,032,975 1,033,655 226 f DSBA oxidoreductase Nmar_0831 CENSYa_0972 64 734,100 735,152 350 f 1,030,389 1,031,432 347 r glyceraldehyde-3-phosphate dehydrogenase (NAD(P)(+)) (phosphorylating) Nmar_0832 CENSYa_0971 75 735,149 736,402 417 r 1,029,131 1,030,387 418 f protein of unknown function DUF21 Nmar_0833 CENSYa_0532 76 736,555 737,892 445 r 464,494 465,816 440 r threonine synthase Nmar_0835 CENSYa_0584 57 738,666 738,980 104 r 536,523 536,852 109 f hypothetical protein Nmar_0836 CENSYa_0580 41 739,036 739,842 268 r 515,690 
516,448 252 f hypothetical protein Nmar_0837 CENSYa_0579 53 739,895 740,443 182 r 515,094 515,630 178 f resolvase, Holliday junction-type Nmar_0839 CENSYa_0269 44 741,001 741,828 275 f 235,440 236,177 245 r hypothetical protein Nmar_0840 CENSYa_0586 26 741,821 742,129 102 r 537,033 537,323 96 r hypothetical protein Nmar_0841 CENSYa_0587 52 742,126 743,241 371 r 537,320 538,423 367 r propanoyl-CoA C-acyltransferase Nmar_0842 CENSYa_0588 55 743,238 744,203 321 r 538,420 539,382 320 r luciferase family protein Nmar_0843 CENSYa_0853 52 744,462 745,130 222 r 878,773 879,432 219 f nucleotidyl transferase Nmar_0845 CENSYa_0851 51 745,568 745,960 130 r 877,724 878,077 117 f hypothetical protein Nmar_0846 CENSYa_0847 53 745,994 747,136 380 r 847,322 848,458 378 f phosphoribosylaminoimidazole carboxylase, ATPase subunit Nmar_0847 CENSYa_1141 61 747,236 747,811 191 f 1,167,335 1,167,880 181 r phosphoribosylaminoimidazole carboxylase, catalytic subunit Nmar_0848 CENSYa_0843 72 747,812 749,020 402 r 830,743 831,909 388 f threonine dehydratase Nmar_0849 CENSYa_0841 46 749,201 750,037 278 r 818,673 819,863 396 f hypothetical protein Nmar_0850 CENSYa_0838 57 750,195 751,031 278 f 815,903 816,799 298 r adenylate/guanylate cyclase Nmar_0852 CENSYa_0835 53 751,758 752,144 128 f 814,758 815,144 128 r hypothetical protein Nmar_0853 CENSYa_0834 70 752,141 752,551 136 r 814,348 814,761 137 f glyoxalase/bleomycin resistance protein/dioxygenase Nmar_0854 CENSYa_0833 68 752,586 752,804 72 r 814,140 814,295 51 f transcriptional regulator, AsnC family Nmar_0855 CENSYa_0832 68 752,798 753,037 79 r 813,871 814,083 70 f hypothetical protein Nmar_0858 CENSYa_0231 56 754,081 754,617 178 r 207,460 208,083 207 f ATP--cobalamin adenosyltransferase Nmar_0859 CENSYa_0234 44 754,698 755,231 177 f 208,426 209,232 268 f metal dependent phosphohydrolase Nmar_0860 CENSYa_0234 50 755,231 755,455 225 f 208,426 209,232 269 f HD superfamily hydrolase Nmar_0862 CENSYa_0235 66 755,829 756,176 115 f 209,286 
209,642 118 f hypothetical protein Nmar_0863 CENSYa_0237 79 756,232 757,455 407 f 210,319 210,912 197 f LOR/SDH bifunctional protein conserved domain protein Nmar_0864 CENSYa_1734 54 757,600 758,112 513 f 1,723,506 1,725,080 525 r hypothetical protein Nmar_0868 CENSYa_0238 63 761,773 763,176 467 r 210,958 212,352 464 r glutamyl-tRNA(Gln) amidotransferase, B subunit Nmar_0869 CENSYa_0239 62 763,173 764,618 481 r 212,349 213,788 479 r glutamyl-tRNA(Gln) amidotransferase, A subunit Nmar_0870 CENSYa_0240 43 764,618 764,884 88 r 213,788 214,054 88 r hypothetical protein Nmar_0871 CENSYa_0241 70 764,895 766,205 436 r 214,051 215,619 522 r aspartyl-tRNA synthetase Nmar_0872 CENSYa_0242 69 766,788 767,813 341 f 215,698 216,723 341 f IMP biosynthesis enzyme PurP domain protein Nmar_0873 CENSYa_1262 75 767,816 768,613 265 r 1,283,890 1,284,627 245 r phosphonate ABC transporter, inner membrane subunit Nmar_0874 CENSYa_1263 61 768,591 769,421 276 r 1,284,662 1,285,462 266 r ABC transporter related Nmar_0875 CENSYa_1264 61 769,440 770,528 362 r 1,285,472 1,286,515 347 r phosphonate ABC transporter, periplasmic phosphonate-binding protein

Nmar_0876 CENSYa_1265 65 770,616 770,966 116 r 1,286,603 1,286,941 112 r hypothetical protein Nmar_0877 CENSYa_1266 68 770,972 771,304 110 r 1,286,943 1,287,272 109 r putative transcriptional regulator Nmar_0878 CENSYa_1267 43 771,370 772,098 242 r 1,287,406 1,288,119 237 r conserved hypothetical protein Nmar_0879 CENSYa_1268 56 772,111 772,326 71 r 1,288,120 1,288,317 65 r hypothetical protein Nmar_0880 CENSYa_1269 37 772,407 773,036 209 f 1,288,431 1,289,057 208 f hypothetical protein Nmar_0881 CENSYa_0976 41 773,072 773,716 645 f 1,032,975 1,033,655 227 f protein-disulfide isomerase Nmar_0882 CENSYa_1290 73 773,772 775,052 426 f 1,305,181 1,306,416 411 f eRF1 domain 2 protein Nmar_0883 CENSYa_1289 61 775,045 775,449 134 r 1,304,689 1,305,114 141 f histidine triad (HIT) protein Nmar_0886 CENSYa_1285 79 776,452 778,128 558 f 1,301,101 1,302,717 538 r dihydroxy-acid dehydratase Nmar_0888 CENSYa_0208 53 778,828 780,228 466 r 188,290 189,969 559 r helicase c2 Nmar_0906 CENSYa_0671 30 791,128 791,799 223 r 639,924 640,541 205 r conserved hypothetical protein Nmar_0911 CENSYa_1280 63 794,959 796,308 449 f 1,297,926 1,299,197 423 r Anthranilate synthase Nmar_0912 CENSYa_1279 66 796,305 796,901 198 f 1,297,339 1,297,929 196 r glutamine amidotransferase of anthranilate synthase Nmar_0913 CENSYa_1278 49 796,898 797,944 348 f 1,296,308 1,297,339 343 r anthranilate phosphoribosyltransferase Nmar_0914 CENSYa_1277 54 797,937 798,713 258 f 1,295,547 1,296,299 250 r indole-3-glycerol-phosphate synthase Nmar_0915 CENSYa_1276 70 798,710 799,897 395 f 1,294,375 1,295,490 371 r tryptophan synthase, beta subunit Nmar_0916 CENSYa_1275 50 799,884 800,690 268 f 1,293,558 1,294,337 259 r tryptophan synthase, alpha subunit Nmar_0918 CENSYa_1796 48 802,019 802,861 843 r 1,775,021 1,776,016 332 f copper binding protein, plastocyanin/azurin family Nmar_0919 CENSYa_1274 71 802,942 804,543 533 f 1,291,934 1,293,526 530 r CTP synthase Nmar_0921 CENSYa_1273 65 806,078 807,076 332 r 1,290,936 
1,291,937 333 f NAD(+) kinase
Nmar_0922 CENSYa_1272 46 807,118 807,987 289 r 1,290,026 1,290,889 287 f PfkB domain protein
Nmar_0923 CENSYa_1382 88 808,250 808,537 95 f 1,408,487 1,408,774 95 r ribosomal protein S26E
Nmar_0924 CENSYa_1381 59 808,540 809,112 190 r 1,407,918 1,408,490 190 f CDP-alcohol phosphatidyltransferase
Nmar_0925 CENSYa_1380 65 809,190 810,065 291 r 1,407,021 1,407,881 286 f putative agmatinase
Nmar_0926 CENSYa_0251 67 810,069 811,934 621 r 222,040 223,887 615 r threonyl-tRNA synthetase
Nmar_0927 CENSYa_0252 52 811,992 812,327 111 r 223,920 224,339 139 r hypothetical protein
Nmar_0928 CENSYa_0253 66 812,404 813,468 354 f 224,381 225,442 353 f deoxyhypusine synthase
Nmar_0929 CENSYa_0254 74 813,482 813,694 70 r 225,439 225,651 70 r hypothetical protein
Nmar_0930 CENSYa_0255 39 813,793 814,104 103 f 225,650 226,111 153 f small subunit ribosomal protein S25e
Nmar_0932 CENSYa_0303 58 814,486 815,034 182 f 259,382 259,849 155 r RNA polymerase Rpb6
Nmar_0933 CENSYa_0302 45 815,086 815,400 104 f 259,055 259,333 92 r alba, DNA/RNA-binding protein
Nmar_0934 CENSYa_0301 80 815,397 815,813 138 r 258,753 259,058 101 f hypothetical protein
Nmar_0935 CENSYa_0300 44 815,847 816,866 339 f 257,603 258,754 383 r asparagine synthase
Nmar_0936 CENSYa_1331 48 817,055 817,420 366 r 1,340,198 1,340,593 132 f hypothetical protein
Nmar_0937 CENSYa_1194 59 817,489 818,589 366 r 1,220,391 1,221,509 372 f aldo/keto reductase
Nmar_0938 CENSYa_1185 61 818,620 819,510 296 r 1,216,088 1,216,966 292 f UbiA prenyltransferase
Nmar_0939 CENSYa_1184 63 819,822 820,223 133 r 1,215,686 1,216,087 133 f conserved hypothetical protein
Nmar_0942 CENSYa_1183 79 821,297 821,575 92 f 1,215,288 1,215,566 92 r PpiC-type peptidyl-prolyl cis-trans isomerase
Nmar_0944 CENSYa_1182 65 821,861 823,369 502 r 1,213,605 1,215,281 558 f DEAD/DEAH box helicase domain protein
Nmar_0947 CENSYa_1181 66 824,040 824,642 200 f 1,213,154 1,213,606 150 r CMP/dCMP deaminase zinc-binding
Nmar_0948 CENSYa_1180 69 824,647 827,205 852 f 1,210,620 1,213,085 821 r DNA polymerase B region
Nmar_0949 CENSYa_1179 55 827,214 827,495 93 f 1,210,318 1,210,623 101 r conserved hypothetical protein
Nmar_0950 CENSYa_1177 56 827,535 828,191 218 f 1,209,091 1,209,744 217 f triosephosphate isomerase
Nmar_0951 CENSYa_0473 67 828,219 830,882 887 f 396,302 398,956 884 f pyruvate, phosphate dikinase
Nmar_0952 CENSYa_0474 67 830,925 831,482 185 f 398,970 399,527 185 f transcriptional regulator, XRE family
Nmar_0953 CENSYa_0475 67 831,494 831,889 131 r 399,511 399,909 132 r glyoxalase/bleomycin resistance protein/dioxygenase
Nmar_0954 CENSYa_0476 75 831,890 833,488 532 r 399,909 401,495 528 r methylmalonyl-CoA mutase, large subunit
Nmar_0955 CENSYa_0477 59 833,495 834,412 305 r 401,500 402,423 307 r LAO/AO transport system ATPase
Nmar_0956 CENSYa_0478 38 834,456 834,863 135 r 402,439 402,828 129 r hypothetical protein
Nmar_0957 CENSYa_0479 54 834,865 837,357 830 r 402,832 405,372 846 r peptidase M1 membrane alanine aminopeptidase
Nmar_0958 CENSYa_0482 80 837,459 837,881 140 f 408,103 408,525 140 f cobalamin B12-binding domain protein
Nmar_0959 CENSYa_1385 81 837,893 838,891 332 r 1,410,242 1,411,240 332 r ketol-acid reductoisomerase
Nmar_0960 CENSYa_1386 59 839,050 839,580 176 r 1,411,439 1,411,969 176 r cytidyltransferase-related domain
Nmar_0961 CENSYa_1387 49 839,615 840,535 306 r 1,412,000 1,412,932 310 r elongation factor Tu domain 2 protein
Nmar_0962 CENSYa_1388 42 840,580 841,332 250 r 1,412,929 1,413,630 233 r hypothetical protein
Nmar_0963 CENSYa_1389 69 841,334 842,989 551 r 1,413,645 1,415,291 548 r methionyl-tRNA synthetase
Nmar_0964 CENSYa_1390 73 843,027 844,268 413 r 1,415,334 1,416,584 416 r adenosylhomocysteinase
Nmar_0965 CENSYa_1391 60 844,342 844,644 100 f 1,416,705 1,417,040 111 f Rieske (2Fe-2S) domain protein
Nmar_0966 CENSYa_1345 52 844,775 845,488 237 f 1,350,107 1,350,922 271 r NAD+ synthetase
Nmar_0967 CENSYa_1344 53 845,485 846,873 462 f 1,348,743 1,350,110 455 r cysteinyl-tRNA synthetase
Nmar_0970 CENSYa_0105 75 847,835 848,116 93 f 78,732 79,031 99 f hypothetical protein
Nmar_0973 CENSYa_1060 50 849,754 850,653 299 r 1,104,640 1,105,479 279 r hypothetical protein
Nmar_0975 CENSYa_0438 59 851,328 852,485 385 r 368,725 369,849 374 f protein of unknown function DUF650
Nmar_0976 CENSYa_0436 43 852,580 853,506 308 f 367,004 367,903 299 r periplasmic binding protein
Nmar_0978 CENSYa_1058 42 855,102 855,944 280 f 1,103,679 1,104,167 162 r response regulator receiver protein
Nmar_0979 CENSYa_1332 40 856,416 857,327 912 f 1,340,590 1,341,576 269 r transcription initiation factor, TFIIB
Nmar_0980 CENSYa_1342 68 857,503 858,714 403 r 1,347,103 1,348,323 406 f GTPase of unknown function domain protein
Nmar_0981 CENSYa_1341 33 858,749 859,114 121 r 1,346,725 1,347,081 118 f hypothetical protein
Nmar_0982 CENSYa_1340 65 859,116 859,526 136 r 1,346,538 1,346,723 61 f CoA-binding domain protein
Nmar_0983 CENSYa_1338 67 859,607 860,128 173 f 1,345,711 1,346,229 172 r hypothetical protein
Nmar_0984 CENSYa_1337 41 860,165 861,172 335 f 1,344,730 1,345,701 323 r biotin--acetyl-CoA-carboxylase ligase
Nmar_0985 CENSYa_1336 39 861,173 861,535 120 r 1,344,424 1,344,741 105 f hypothetical protein
Nmar_0986 CENSYa_1335 48 861,646 863,085 479 f 1,342,821 1,344,224 467 r TPR repeat-containing protein
Nmar_0987 CENSYa_1332 77 863,158 864,105 315 f 1,340,590 1,341,576 328 r transcription factor TFIIB cyclin-related
Nmar_0989 CENSYa_1331 62 864,552 864,944 130 r 1,340,198 1,340,593 131 f hypothetical protein
Nmar_0990 CENSYa_1330 63 864,998 865,576 192 f 1,339,589 1,340,143 184 r hypothetical protein
Nmar_0991 CENSYa_1329 73 865,566 866,630 354 f 1,338,538 1,339,539 333 r poly(R)-hydroxyalkanoic acid synthase, class III, PhaC subunit
Nmar_0992 CENSYa_1328 52 866,668 867,096 142 r 1,338,051 1,338,476 141 f hypothetical protein
Nmar_0993 CENSYa_1327 59 867,093 867,518 141 r 1,337,629 1,338,054 141 f transcriptional regulator, AbrB family
Nmar_0994 CENSYa_1326 52 867,555 868,400 281 r 1,336,790 1,337,575 261 f alpha/beta hydrolase fold
Nmar_0995 CENSYa_1324 66 868,653 869,105 150 r 1,335,811 1,336,263 150 f peptide methionine sulfoxide reductase
Nmar_0996 CENSYa_0064 69 869,170 869,568 132 r 45,734 46,117 127 f hypothetical protein
Nmar_0997 CENSYa_0933 40 869,663 870,511 282 r 986,055 986,867 270 f methyltransferase type 11
Nmar_1001 CENSYa_0120 41 876,009 876,260 252 f 94,871 95,185 105 r transcriptional regulator
Nmar_1002 CENSYa_0064 37 876,296 876,688 393 f 45,734 46,117 128 f hypothetical membrane protein
Nmar_1007 CENSYa_0971 55 881,616 882,884 1,269 r 1,029,131 1,030,387 419 f hemolysin
Nmar_1017 CENSYa_1168 61 898,088 899,395 435 f 1,202,736 1,204,040 434 f aminotransferase class-III
Nmar_1019 CENSYa_1169 46 899,956 900,777 273 f 1,204,093 1,204,899 268 f tetratricopeptide TPR_2 repeat protein
Nmar_1020 CENSYa_1170 48 900,772 901,476 234 r 1,204,904 1,205,599 231 r hypothetical protein
Nmar_1021 CENSYa_1171 53 901,554 901,919 121 f 1,205,677 1,206,054 125 f hypothetical protein

Nmar_1022 CENSYa_1172 31 901,921 902,376 151 f 1,206,056 1,206,418 120 f cupin 2 conserved barrel domain protein
Nmar_1024 CENSYa_0418 50 902,804 903,475 223 f 354,691 355,434 247 r heat shock protein DnaJ domain protein
Nmar_1025 CENSYa_0417 53 903,476 903,883 135 r 354,274 354,690 138 f histidine triad (HIT) protein
Nmar_1026 CENSYa_0416 66 903,919 904,224 101 r 353,971 354,258 95 f hypothetical protein
Nmar_1027 CENSYa_0415 65 904,199 904,366 55 r 353,797 353,967 56 f hypothetical protein
Nmar_1028 CENSYa_0413 75 904,421 905,560 379 f 351,794 352,891 365 r 3-hydroxybutyryl-CoA dehydrogenase
Nmar_1030 CENSYa_0412 63 905,845 906,354 169 r 351,295 351,801 168 f Rossmann fold nucleotide-binding protein
Nmar_1031 CENSYa_0410 64 906,617 907,027 136 f 350,610 351,020 136 r hypothetical protein
Nmar_1032 CENSYa_0408 65 907,229 907,432 67 r 350,199 350,420 73 f RNA polymerase Rbp10
Nmar_1033 CENSYa_0407 78 907,455 907,763 102 r 349,860 350,168 102 f ribosomal protein S10
Nmar_1034 CENSYa_0406 90 907,775 909,073 432 r 348,543 349,853 436 f translation elongation factor EF-1, subunit alpha
Nmar_1035 CENSYa_0564 78 909,195 910,340 381 f 492,785 494,302 505 f protein of unknown function DUF100
Nmar_1036 CENSYa_1022 65 910,408 910,815 135 f 1,079,363 1,079,785 140 r hypothetical protein
Nmar_1037 CENSYa_1021 70 910,852 912,618 588 f 1,077,557 1,079,290 577 r DNA ligase I, ATP-dependent Dnl1
Nmar_1038 CENSYa_1036 87 912,685 913,338 217 f 1,088,248 1,088,901 217 f protein of unknown function DUF47
Nmar_1039 CENSYa_1037 66 913,430 913,966 178 f 1,088,903 1,089,436 177 f tRNA intron endonuclease
Nmar_1040 CENSYa_1258 36 914,012 916,327 2,316 r 1,281,178 1,282,698 507 f hypothetical protein
Nmar_1041 CENSYa_1038 61 916,375 916,752 125 r 1,089,433 1,089,804 123 r hypothetical protein
Nmar_1042 CENSYa_1039 64 916,755 916,991 78 r 1,089,807 1,090,043 78 r hypothetical protein
Nmar_1043 CENSYa_1197 59 917,089 918,855 588 f 1,223,211 1,224,965 584 f short-chain dehydrogenase/reductase SDR
Nmar_1047 CENSYa_1734 43 921,429 924,140 2,712 r 1,723,506 1,725,080 525 r hypothetical protein
Nmar_1048 CENSYa_1198 58 924,387 925,160 257 r 1,224,962 1,225,711 249 r protein of unknown function DUF81
Nmar_1050 CENSYa_1199 66 925,869 926,360 163 r 1,225,790 1,226,281 163 r protein of unknown function DUF192
Nmar_1052 CENSYa_1200 59 927,198 927,554 118 r 1,226,320 1,226,676 118 r hypothetical protein
Nmar_1053 CENSYa_1201 70 927,691 928,155 154 f 1,226,888 1,227,256 122 f iron (metal) dependent repressor, DtxR family
Nmar_1054 CENSYa_1202 62 928,142 929,959 605 r 1,227,253 1,229,088 611 r PilT protein domain protein
Nmar_1056 CENSYa_0598 47 931,785 932,375 196 f 546,285 546,935 216 f phage SPO1 DNA polymerase-related protein
Nmar_1057 CENSYa_0599 57 932,353 932,916 187 f 546,932 547,480 182 f DNA-3-methyladenine glycosylase
Nmar_1058 CENSYa_0600 66 932,913 933,545 210 r 547,482 548,114 210 r SNARE associated Golgi protein
Nmar_1059 CENSYa_0601 62 933,635 934,009 124 f 548,192 548,584 130 f hypothetical protein
Nmar_1061 CENSYa_1841 30 935,438 936,661 1,224 f 1,817,646 1,818,176 177 f hypothetical protein
Nmar_1064 CENSYa_0602 63 937,290 938,225 311 r 548,575 549,555 326 r oligopeptide/dipeptide ABC transporter, ATPase subunit
Nmar_1066 CENSYa_0611 68 938,826 940,112 428 f 556,181 557,458 425 f aspartyl-tRNA synthetase
Nmar_1067 CENSYa_1709 49 940,151 940,396 81 f 1,703,932 1,704,177 81 f hypothetical protein
Nmar_1069 CENSYa_0612 63 940,960 941,973 337 r 557,455 558,453 332 r isopropylmalate/isohomocitrate dehydrogenase
Nmar_1070 CENSYa_0613 73 942,007 943,524 505 r 558,460 559,977 505 r isopropylmalate/citramalate/homocitrate synthase
Nmar_1071 CENSYa_0614 77 943,521 944,006 161 r 559,974 560,462 162 r acetolactate synthase, small subunit
Nmar_1072 CENSYa_0615 86 944,015 945,715 566 r 560,465 562,120 551 r acetolactate synthase, large subunit, biosynthetic type
Nmar_1078 CENSYa_0624 72 982,395 983,390 331 r 573,573 574,379 268 r fructose-bisphosphate aldolase
Nmar_1079 CENSYa_0629 55 983,428 984,456 342 r 577,481 578,560 359 r alcohol dehydrogenase zinc-binding domain protein
Nmar_1080 CENSYa_0630 76 984,493 984,786 97 r 578,670 578,960 96 r vacuolar H+transporting two-sector ATPase F subunit
Nmar_1081 CENSYa_0765 69 984,872 985,174 100 f 718,810 719,109 99 f hypothetical protein
Nmar_1083 CENSYa_0767 34 985,476 985,778 100 r 719,534 719,896 120 r hypothetical protein
Nmar_1084 CENSYa_0768 46 985,813 986,562 249 f 719,908 720,600 230 f metallophosphoesterase
Nmar_1085 CENSYa_0770 71 986,566 989,808 1080 r 721,706 723,676 656 r carbamoyl-phosphate synthase, large subunit
Nmar_1086 CENSYa_0771 62 989,810 990,964 384 r 723,835 724,959 374 r carbamoyl-phosphate synthase, small subunit
Nmar_1087 CENSYa_1874 29 991,041 992,597 1,557 r 1,846,587 1,848,137 517 r secreted periplasmic Zn-dependent protease
Nmar_1088 CENSYa_0823 69 992,659 993,852 397 r 806,555 807,748 397 f AAA ATPase central domain protein
Nmar_1089 CENSYa_0822 26 993,863 994,363 166 r 806,250 806,558 102 f hypothetical protein
Nmar_1091 CENSYa_0821 36 995,066 996,226 386 f 804,862 806,019 385 r transcriptional regulator, TrmB
Nmar_1092 CENSYa_0818 50 996,256 996,711 151 f 763,435 763,908 157 r endoribonuclease L-PSP
Nmar_1094 CENSYa_0298 43 998,322 998,714 130 f 256,809 257,180 123 r pyridoxamine 5'-phosphate oxidase family protein
Nmar_1097 CENSYa_0260 39 1,000,385 1,000,831 447 f 228,794 229,225 144 r hypothetical protein
Nmar_1098 CENSYa_0815 68 1,000,832 1,001,974 380 r 756,993 757,994 333 f phosphoesterase DHHA1
Nmar_1099 CENSYa_0802 77 1,002,328 1,003,491 387 f 747,976 749,373 465 r 2-methylcitrate synthase/citrate synthase II
Nmar_1100 CENSYa_0801 73 1,003,495 1,003,701 68 r 747,776 747,979 67 f hypothetical protein
Nmar_1102 CENSYa_0033 35 1,004,439 1,005,395 318 f 24,575 25,207 210 f blue (type 1) copper domain protein
Nmar_1103 CENSYa_0543 59 1,005,416 1,007,002 528 r 471,136 472,731 531 r excinuclease ABC, C subunit
Nmar_1104 CENSYa_0544 70 1,006,999 1,009,818 939 r 472,728 475,535 935 r excinuclease ABC, A subunit
Nmar_1105 CENSYa_0545 75 1,009,805 1,011,757 650 r 475,537 477,486 649 r excinuclease ABC, B subunit
Nmar_1106 CENSYa_0547 59 1,011,807 1,012,457 216 r 478,638 479,270 210 r pyridoxamine 5'-phosphate oxidase-related FMN-binding
Nmar_1109 CENSYa_0548 87 1,013,628 1,014,734 368 r 479,274 480,446 390 r luciferase family protein
Nmar_1110 CENSYa_0549 82 1,014,829 1,015,773 314 r 480,454 481,398 314 r iron-containing alcohol dehydrogenase
Nmar_1111 CENSYa_0799 46 1,015,826 1,017,103 425 f 746,118 747,377 419 f hydroxypyruvate reductase
Nmar_1112 CENSYa_0119 58 1,017,166 1,017,402 78 f 94,631 94,870 79 r hypothetical protein
Nmar_1113 CENSYa_0118 61 1,017,395 1,018,360 321 r 93,587 94,630 347 f conserved hypothetical protein
Nmar_1114 CENSYa_0117 65 1,018,479 1,019,054 191 f 92,960 93,535 191 r phosphoribosylglycinamide formyltransferase
Nmar_1116 CENSYa_0278 45 1,019,373 1,019,657 94 r 241,761 241,916 51 r antibiotic biosynthesis monooxygenase
Nmar_1117 CENSYa_0115 65 1,019,750 1,020,613 287 f 91,675 92,523 282 r methylenetetrahydrofolate dehydrogenase (NADP(+))
Nmar_1118 CENSYa_0114 46 1,020,603 1,021,166 187 f 91,134 91,655 173 r 5-formyltetrahydrofolate cyclo-ligase
Nmar_1119 CENSYa_0113 61 1,021,157 1,022,422 421 r 89,903 91,144 413 f phosphoribosylamine--glycine ligase
Nmar_1120 CENSYa_0112 63 1,022,506 1,025,703 1065 f 86,634 89,813 1059 r isoleucyl-tRNA synthetase
Nmar_1121 CENSYa_0485 74 1,025,889 1,026,476 195 f 411,900 412,481 193 r ferritin Dps family protein
Nmar_1122 CENSYa_0111 56 1,026,579 1,026,908 109 f 86,244 86,579 111 r hypothetical protein
Nmar_1123 CENSYa_0110 73 1,026,949 1,028,055 368 f 85,041 86,138 365 r succinate--CoA ligase (ADP-forming)
Nmar_1124 CENSYa_0109 75 1,028,052 1,028,969 305 f 84,130 85,044 304 r CoA-binding domain protein
Nmar_1125 CENSYa_0108 76 1,029,001 1,029,168 55 f 83,933 84,100 55 r 50S ribosomal protein L40e
Nmar_1129 CENSYa_1579 76 1,031,846 1,033,210 1,365 r 1,588,210 1,589,202 331 r hypothetical protein
Nmar_1130 CENSYa_1581 66 1,033,565 1,034,743 392 r 1,592,406 1,593,530 374 r putative metal cation transporter
Nmar_1131 CENSYa_1582 75 1,034,733 1,035,788 351 r 1,593,556 1,594,596 346 r multicopper oxidase type 3
Nmar_1132 CENSYa_1201 59 1,035,937 1,036,419 483 r 1,226,888 1,227,256 123 f Mn-dependent transcriptional regulator
Nmar_1139 CENSYa_1882 63 1,043,604 1,043,813 69 r 1,854,724 1,855,053 109 r hypothetical protein
Nmar_1140 CENSYa_0223 39 1,044,115 1,044,795 681 r 201,595 202,233 213 r protein-disulfide isomerase
Nmar_1142 CENSYa_0003 54 1,045,689 1,046,150 153 f 2,443 2,907 154 f blue (type 1) copper domain protein
Nmar_1143 CENSYa_0223 55 1,046,241 1,046,906 666 f 201,595 202,233 213 r protein-disulfide isomerase
Nmar_1146 CENSYa_1238 54 1,048,465 1,049,589 1,125 r 1,263,583 1,264,740 386 f trypsin-like serine protease
Nmar_1148 CENSYa_0500 47 1,050,213 1,051,010 798 r 420,881 421,621 247 r protein-disulfide isomerase
Nmar_1150 CENSYa_0500 48 1,052,123 1,052,866 744 r 420,881 421,621 247 r protein-disulfide isomerase
Nmar_1152 CENSYa_1266 45 1,053,652 1,053,921 270 f 1,286,943 1,287,272 110 r hypothetical protein
Nmar_1161 CENSYa_0637 43 1,062,348 1,063,718 1,371 r 586,359 591,230 1,624 r hypothetical protein

Nmar_1165 CENSYa_0569 26 1,066,715 1,067,632 305 r 504,646 505,296 216 r MT-A70 family protein
Nmar_1171 CENSYa_1291 40 1,071,457 1,071,837 126 f 1,306,457 1,306,843 128 f hypothetical protein
Nmar_1174 CENSYa_0017 44 1,073,391 1,073,876 486 f 12,121 12,525 135 f hypothetical protein
Nmar_1177 CENSYa_0484 64 1,075,014 1,076,786 590 f 409,898 411,667 589 f oligoendopeptidase, pepF/M3 family
Nmar_1179 CENSYa_0487 41 1,077,622 1,078,767 381 f 412,775 413,887 370 f hypothetical protein
Nmar_1180 CENSYa_0488 80 1,078,768 1,079,319 183 r 413,884 414,450 188 r arginine decarboxylase, pyruvoyl-dependent
Nmar_1181 CENSYa_0500 54 1,079,893 1,080,690 798 f 420,881 421,621 247 r protein-disulfide isomerase
Nmar_1182 CENSYa_0491 69 1,080,792 1,082,114 440 f 415,609 416,913 434 f DEAD/DEAH box helicase domain protein
Nmar_1183 CENSYa_0506 45 1,082,163 1,082,396 234 f 425,326 425,631 102 r transcriptional regulator
Nmar_1184 CENSYa_0496 51 1,082,430 1,082,720 96 f 419,300 419,587 95 f hypothetical protein
Nmar_1185 CENSYa_0498 40 1,082,717 1,083,412 231 f 419,938 420,420 160 f hypothetical protein
Nmar_1187 CENSYa_0499 47 1,084,186 1,084,641 151 r 420,423 420,884 153 r hypothetical protein
Nmar_1188 CENSYa_0376 30 1,084,732 1,085,529 265 f 319,621 320,463 280 f hypothetical protein
Nmar_1189 CENSYa_0501 50 1,085,567 1,085,911 114 f 421,755 422,075 106 f hypothetical protein
Nmar_1190 CENSYa_0502 62 1,086,156 1,086,416 86 f 422,198 422,518 106 f hypothetical protein
Nmar_1191 CENSYa_0504 60 1,086,413 1,088,440 675 r 422,726 424,756 676 r protein of unknown function DUF255
Nmar_1192 CENSYa_0509 57 1,088,479 1,088,712 77 r 447,189 447,401 70 r hypothetical protein
Nmar_1193 CENSYa_0510 65 1,088,792 1,089,826 344 f 447,580 448,587 335 f Alcohol dehydrogenase GroES domain protein
Nmar_1194 CENSYa_0512 43 1,090,267 1,090,794 175 r 449,193 449,612 139 r hypothetical protein
Nmar_1196 CENSYa_0514 62 1,091,128 1,091,814 228 r 449,964 450,650 228 r hypothetical protein
Nmar_1197 CENSYa_0075 40 1,092,124 1,092,354 231 r 52,786 53,025 80 r transcriptional regulator
Nmar_1198 CENSYa_0515 51 1,092,421 1,093,617 398 r 450,723 451,910 395 r tRNA pseudouridine synthase D TruD
Nmar_1199 CENSYa_0516 49 1,093,614 1,095,215 533 r 451,910 453,436 508 r conserved hypothetical protein
Nmar_1200 CENSYa_0519 67 1,095,261 1,095,635 124 r 454,802 455,209 135 r aminoacyl-tRNA hydrolase
Nmar_1201 CENSYa_0161 59 1,096,377 1,101,503 5,127 r 138,954 144,167 1,738 r hypothetical protein
Nmar_1205 CENSYa_1796 37 1,103,274 1,104,188 915 f 1,775,021 1,776,016 332 f copper binding protein, plastocyanin/azurin family
Nmar_1211 CENSYa_2031 31 1,110,705 1,111,259 184 r 1,995,184 1,995,735 183 f conserved hypothetical protein
Nmar_1212 CENSYa_0956 44 1,111,485 1,112,732 415 f 1,015,051 1,016,304 417 r metallophosphoesterase
Nmar_1213 CENSYa_0955 27 1,112,729 1,115,149 806 f 1,012,608 1,014,941 777 r SMC domain protein
Nmar_1220 CENSYa_0520 51 1,121,044 1,122,180 378 f 455,208 456,329 373 f dehydrogenase (flavoprotein)-like protein
Nmar_1221 CENSYa_0525 66 1,122,289 1,122,417 42 f 460,035 460,274 79 r protein of unknown function DUF1610
Nmar_1222 CENSYa_0524 70 1,122,422 1,122,703 93 f 459,752 460,033 93 r translation elongation factor aEF-1 beta
Nmar_1223 CENSYa_0523 78 1,122,715 1,124,880 721 f 457,571 459,745 724 r AAA family ATPase, CDC48 subfamily
Nmar_1224 CENSYa_0522 58 1,124,883 1,125,179 98 f 457,271 457,570 99 r hypothetical protein
Nmar_1225 CENSYa_0521 74 1,125,183 1,126,052 289 r 456,459 457,274 271 f protoheme IX farnesyltransferase
Nmar_1226 CENSYa_0530 54 1,126,185 1,127,261 358 f 463,257 464,102 281 f blue (type 1) copper domain protein
Nmar_1227 CENSYa_0531 48 1,127,264 1,127,656 130 f 464,099 464,497 132 f Mov34/MPN/PAD-1 family protein
Nmar_1229 CENSYa_0970 80 1,128,090 1,128,347 85 f 1,028,713 1,028,970 85 r conserved hypothetical protein
Nmar_1230 CENSYa_0969 57 1,128,406 1,128,594 62 f 1,028,467 1,028,652 61 r hypothetical protein
Nmar_1231 CENSYa_0968 83 1,128,591 1,130,630 679 r 1,026,431 1,028,470 679 f V-type H(+)-translocating pyrophosphatase
Nmar_1232 CENSYa_0967 75 1,130,666 1,131,043 125 r 1,026,018 1,026,395 125 f hypothetical protein
Nmar_1233 CENSYa_0966 39 1,131,195 1,131,659 154 r 1,025,433 1,025,888 151 f protein of unknown function DUF359
Nmar_1234 CENSYa_0965 81 1,131,701 1,132,813 370 r 1,024,081 1,025,394 437 f protein of unknown function DUF59
Nmar_1235 CENSYa_0963 85 1,132,962 1,133,558 198 f 1,021,919 1,022,653 244 r DNA-directed RNA polymerase
Nmar_1236 CENSYa_0962 83 1,133,560 1,133,748 62 f 1,021,729 1,021,917 62 r DNA-directed RNA polymerase subunit E, RpoE2
Nmar_1237 CENSYa_0961 52 1,133,788 1,134,606 272 f 1,020,912 1,021,727 271 r nicotinate-nucleotide pyrophosphorylase
Nmar_1239 CENSYa_0960 69 1,135,009 1,135,965 318 r 1,019,962 1,020,915 317 f quinolinate synthetase complex, A subunit
Nmar_1240 CENSYa_0959 58 1,136,059 1,136,877 272 r 1,019,042 1,019,872 276 f aspartate dehydrogenase
Nmar_1241 CENSYa_0159 30 1,136,958 1,137,911 954 f 1,136,958 1,137,911 318 f hypothetical protein
Nmar_1243 CENSYa_0957 16 1,138,810 1,140,342 510 f 1,016,301 1,017,860 519 r protein of unknown function DUF87
Nmar_1250 CENSYa_0033 68 1,147,717 1,148,436 720 r 24,575 25,207 211 f copper binding protein, plastocyanin/azurin family
Nmar_1251 CENSYa_0954 79 1,148,438 1,148,926 162 r 1,012,165 1,012,611 148 f PBS lyase HEAT domain protein repeat-containing protein
Nmar_1252 CENSYa_0952 65 1,149,023 1,149,688 221 f 1,011,324 1,011,983 219 r alkyl hydroperoxide reductase/ Thiol specific antioxidant/ Mal allergen
Nmar_1253 CENSYa_1795 52 1,149,923 1,150,696 774 r 1,774,086 1,774,961 292 f copper binding protein, plastocyanin/azurin family
Nmar_1254 CENSYa_0949 69 1,150,749 1,151,213 154 r 999,655 1,000,113 152 f ribosomal protein L15e
Nmar_1255 CENSYa_1580 33 1,151,421 1,152,080 660 f 1,589,264 1,592,332 1,023 r secreted periplasmic Zn-dependent protease
Nmar_1257 CENSYa_0944 29 1,152,589 1,153,482 297 r 997,413 998,363 316 f hypothetical protein
Nmar_1258 CENSYa_0943 71 1,153,583 1,154,515 310 r 996,419 997,351 310 f D-isomer specific 2-hydroxyacid dehydrogenase NAD-binding
Nmar_1260 CENSYa_0942 67 1,156,106 1,157,467 453 f 994,972 996,420 482 r fumarate lyase
Nmar_1261 CENSYa_0941 63 1,157,471 1,158,739 422 f 993,703 994,968 421 r MiaB-like tRNA modifying enzyme
Nmar_1262 CENSYa_0940 46 1,158,813 1,159,763 316 f 992,699 993,631 310 r tubulin/FtsZ GTPase
Nmar_1263 CENSYa_0939 42 1,159,800 1,160,363 187 r 991,969 992,622 217 f ribosomal protein L31e
Nmar_1264 CENSYa_0938 84 1,160,364 1,160,522 52 r 991,810 991,968 52 f ribosomal protein L39e
Nmar_1265 CENSYa_0937 54 1,160,560 1,161,111 183 r 991,179 991,784 201 f NUDIX hydrolase
Nmar_1266 CENSYa_0936 35 1,161,181 1,161,918 245 f 990,356 991,099 247 r 5 10-methylenetetrahydrofolate reductase-like protein
Nmar_1267 CENSYa_0935 71 1,161,915 1,164,413 832 r 987,875 990,361 828 f methionine synthase
Nmar_1268 CENSYa_0934 66 1,164,446 1,165,408 320 r 986,890 987,843 317 f homocysteine S-methyltransferase
Nmar_1269 CENSYa_0932 59 1,165,474 1,166,622 382 f 984,780 985,925 381 r major facilitator superfamily MFS_1
Nmar_1273 CENSYa_1796 50 1,168,392 1,169,255 287 r 1,775,021 1,776,016 331 f blue (type 1) copper domain protein
Nmar_1276 CENSYa_1788 33 1,170,971 1,171,528 558 f 1,768,797 1,769,285 163 r hypothetical protein
Nmar_1281 CENSYa_1788 41 1,174,343 1,174,807 465 r 1,768,797 1,769,285 163 r hypothetical protein
Nmar_1282 CENSYa_0064 29 1,174,917 1,175,300 384 r 45,734 46,117 128 f hypothetical membrane protein
Nmar_1283 CENSYa_0931 73 1,175,356 1,176,114 252 r 984,041 984,805 254 f binding-protein-dependent transport systems inner membrane component
Nmar_1284 CENSYa_0930 71 1,176,120 1,176,884 254 r 983,286 984,044 252 f ABC transporter related
Nmar_1285 CENSYa_0929 48 1,176,884 1,177,909 341 r 982,315 983,289 324 f aliphatic sulfonates family ABC transporter, periplasmic ligand-binding protein
Nmar_1286 CENSYa_0928 69 1,178,045 1,179,244 399 f 981,022 982,206 394 r argininosuccinate synthase
Nmar_1287 CENSYa_1455 67 1,179,241 1,179,408 55 f 1,467,880 1,468,047 55 f conserved hypothetical protein
Nmar_1288 CENSYa_0927 67 1,179,408 1,180,265 285 f 980,160 981,005 281 r lysine biosynthesis enzyme LysX
Nmar_1289 CENSYa_1203 85 1,180,294 1,181,340 348 f 1,229,148 1,230,194 348 f N-acetyl-gamma-glutamyl-phosphate reductase
Nmar_1290 CENSYa_1204 69 1,181,343 1,182,146 267 f 1,230,197 1,230,997 266 f acetylglutamate kinase
Nmar_1291 CENSYa_1205 65 1,182,139 1,183,320 393 f 1,231,005 1,232,156 383 f acetylornithine and succinylornithine aminotransferase
Nmar_1292 CENSYa_1206 71 1,183,307 1,183,726 139 f 1,232,233 1,232,568 111 f transcriptional regulator, AsnC family
Nmar_1293 CENSYa_1207 77 1,183,825 1,185,006 393 f 1,232,629 1,233,825 398 f pyruvate carboxyltransferase
Nmar_1294 CENSYa_0123 72 1,185,028 1,185,195 55 f 99,306 99,473 55 f conserved hypothetical protein
Nmar_1295 CENSYa_0124 76 1,185,192 1,186,034 280 f 99,470 100,318 282 f lysine biosynthesis enzyme LysX
Nmar_1296 CENSYa_0125 74 1,186,038 1,187,165 375 f 100,362 101,444 360 f N-acetyl-ornithine/N-acetyl-lysine deacetylase
Nmar_1297 CENSYa_0126 56 1,187,165 1,188,202 345 f 101,444 102,481 345 f diphthine synthase
Nmar_1298 CENSYa_0127 58 1,188,243 1,188,863 206 f 102,510 103,118 202 f cytidyltransferase-related domain
Nmar_1301 CENSYa_0128 53 1,191,274 1,191,957 227 r 103,115 103,696 193 r protein of unknown function DUF120
Nmar_1302 CENSYa_0129 71 1,191,966 1,193,108 380 r 103,800 104,867 355 r Toprim sub domain protein

Nmar_1303 CENSYa_0131 86 1,193,261 1,193,611 116 r 105,653 106,003 116 r iron-sulfur cluster assembly accessory protein
Nmar_1305 CENSYa_1821 56 1,194,146 1,194,745 199 r 1,800,958 1,801,551 197 r hypothetical protein
Nmar_1306 CENSYa_1823 42 1,194,774 1,195,124 116 r 1,802,477 1,802,830 117 r hypothetical protein
Nmar_1307 CENSYa_0033 42 1,195,212 1,195,736 525 f 24,575 25,207 211 f copper binding protein, plastocyanin/azurin family
Nmar_1308 CENSYa_0166 77 1,195,786 1,196,547 253 r 149,820 150,575 251 r enoyl-CoA hydratase/isomerase
Nmar_1309 CENSYa_0167 75 1,196,584 1,198,701 705 r 150,613 152,679 688 r CoA-binding domain protein
Nmar_1310 CENSYa_0168 59 1,198,744 1,199,427 227 r 152,821 153,453 210 r glutamine amidotransferase class-I
Nmar_1312 CENSYa_0173 75 1,199,847 1,201,121 424 r 156,263 157,543 426 r Glu/Leu/Phe/Val dehydrogenase
Nmar_1314 CENSYa_0175 73 1,201,690 1,202,412 240 f 157,814 158,530 238 f proteasome endopeptidase complex
Nmar_1315 CENSYa_0176 69 1,202,461 1,203,153 230 f 158,584 159,264 226 f methyltransferase type 11
Nmar_1316 CENSYa_0177 57 1,203,150 1,204,304 384 f 159,261 160,517 418 f major facilitator superfamily MFS_1
Nmar_1317 CENSYa_1824 44 1,204,301 1,204,615 104 r 1,802,898 1,803,227 109 r nitrogen regulatory protein P-II
Nmar_1318 CENSYa_0178 56 1,204,740 1,205,087 115 f 160,861 161,199 112 f sec-independent translocation protein mttA/Hcf106
Nmar_1319 CENSYa_1368 56 1,205,088 1,205,864 258 r 1,398,175 1,398,969 264 r DNA methylase N-4/N-6 domain protein
Nmar_1320 CENSYa_0180 57 1,206,022 1,208,469 815 f 161,579 164,041 820 f DEAD/DEAH box helicase domain protein
Nmar_1321 CENSYa_1360 57 1,208,508 1,209,338 276 f 1,386,379 1,387,209 276 r short-chain dehydrogenase/reductase SDR
Nmar_1322 CENSYa_0182 67 1,209,416 1,211,071 551 f 164,390 165,949 519 f 2-alkenal reductase
Nmar_1328 CENSYa_0793 34 1,218,185 1,219,645 1,461 f 740,421 742,553 711 r hypothetical protein
Nmar_1331 CENSYa_1176 46 1,221,245 1,222,054 269 f 1,208,048 1,208,812 254 r parB-like partition protein
Nmar_1340 CENSYa_1332 54 1,227,903 1,228,823 921 r 1,340,590 1,341,576 329 r transcription initiation factor, TFIIB
Nmar_1341 CENSYa_1825 40 1,229,190 1,230,119 930 f 1,803,378 1,804,292 305 f transcription initiation factor, TFIIB
Nmar_1348 CENSYa_0779 56 1,234,431 1,234,745 315 f 731,412 731,969 186 f translation initiation factor 1
Nmar_1349 CENSYa_0926 35 1,234,843 1,236,567 1,725 r 978,506 980,158 551 f adenylate cyclase
Nmar_1350 CENSYa_0461 26 1,236,680 1,237,030 351 f 385,999 386,391 131 f hypothetical protein
Nmar_1353 CENSYa_1879 53 1,238,728 1,239,270 543 r 1,852,830 1,853,375 182 f hypothetical protein
Nmar_1354 CENSYa_1582 38 1,239,281 1,240,615 1,335 r 1,593,556 1,594,596 347 r multicopper oxidase
Nmar_1360 CENSYa_1138 38 1,246,835 1,247,611 258 f 1,166,066 1,166,956 296 f protein of unknown function DUF541
Nmar_1362 CENSYa_1266 42 1,248,624 1,248,965 342 f 1,286,943 1,287,272 110 r hypothetical protein
Nmar_1367 CENSYa_0185 56 1,253,150 1,254,346 398 f 166,477 167,673 398 f hypothetical protein
Nmar_1368 CENSYa_0186 44 1,254,350 1,255,168 272 f 167,673 168,464 263 f conserved hypothetical protein
Nmar_1369 CENSYa_0187 76 1,255,165 1,255,335 56 r 168,454 168,735 93 r hypothetical protein
Nmar_1370 CENSYa_0574 12 1,255,511 1,256,839 442 r 510,093 511,292 399 f conserved hypothetical protein
Nmar_1377 CENSYa_0165 34 1,261,372 1,262,889 505 r 148,284 149,819 511 f hypothetical protein
Nmar_1379 CENSYa_0471 64 1,264,147 1,265,178 343 f 394,117 395,085 322 r isocitrate dehydrogenase (NAD(+))
Nmar_1380 CENSYa_0470 48 1,265,179 1,265,715 178 r 393,630 394,127 165 f putative methylase
Nmar_1381 CENSYa_0469 44 1,265,690 1,266,394 234 r 392,957 393,622 221 f ribosomal RNA adenine methylase transferase
Nmar_1382 CENSYa_0468 59 1,266,391 1,266,957 188 r 392,352 392,927 191 f protein of unknown function DUF655
Nmar_1383 CENSYa_0467 79 1,266,978 1,267,304 108 r 392,018 392,344 108 f RNA polymerase Rpb4
Nmar_1384 CENSYa_0466 80 1,267,307 1,267,606 99 r 391,708 392,016 102 f ribosomal protein L21e
Nmar_1386 CENSYa_0250 60 1,268,792 1,269,958 388 r 220,780 221,976 398 f DNA repair and recombination protein RadA
Nmar_1387 CENSYa_0550 78 1,270,040 1,270,222 60 r 481,521 481,703 60 r hypothetical protein
Nmar_1388 CENSYa_0554 81 1,270,263 1,271,987 574 r 482,581 484,305 574 r radical SAM domain protein
Nmar_1389 CENSYa_0555 60 1,272,039 1,272,851 270 r 484,468 485,247 259 r oxidoreductase FAD/NAD(P)-binding domain protein
Nmar_1390 CENSYa_0556 65 1,272,835 1,273,746 303 r 485,261 486,106 281 r dihydroorotate dehydrogenase family protein
Nmar_1391 CENSYa_0557 51 1,273,832 1,275,244 470 f 486,266 487,444 392 f pre-mRNA processing ribonucleoprotein, binding domain protein
Nmar_1392 CENSYa_0558 58 1,275,231 1,275,905 224 f 487,518 488,123 201 f non-specific serine/threonine protein kinase
Nmar_1393 CENSYa_0559 55 1,275,902 1,276,519 205 r 488,120 488,737 205 r ribonuclease HII
Nmar_1394 CENSYa_0561 48 1,276,580 1,277,728 382 f 489,649 490,782 377 f tRNA (guanine-N(2)-)-methyltransferase
Nmar_1395 CENSYa_0562 66 1,277,718 1,278,311 197 r 490,779 491,369 196 r ribosomal RNA methyltransferase RrmJ/FtsJ
Nmar_1396 CENSYa_0563 48 1,278,308 1,279,183 291 r 491,366 492,454 362 r hypothetical protein
Nmar_1398 CENSYa_0794 72 1,279,813 1,281,906 697 f 742,613 744,697 694 r hypothetical protein
Nmar_1399 CENSYa_0793 38 1,281,980 1,284,148 722 f 740,421 742,553 710 r hypothetical protein
Nmar_1400 CENSYa_0792 57 1,284,188 1,284,700 170 f 739,797 740,309 170 r adenine phosphoribosyltransferase
Nmar_1401 CENSYa_0791 68 1,284,700 1,285,491 263 f 739,022 739,744 240 r methylthioadenosine phosphorylase
Nmar_1402 CENSYa_0790 68 1,285,488 1,285,772 94 r 738,741 739,025 94 f hypothetical protein
Nmar_1403 CENSYa_0789 76 1,285,781 1,286,179 132 r 738,342 738,740 132 f hypothetical protein
Nmar_1404 CENSYa_0788 58 1,286,201 1,286,551 116 r 738,060 738,320 86 f hypothetical protein
Nmar_1405 CENSYa_0786 42 1,286,593 1,286,814 73 r 736,980 737,195 71 f hypothetical protein
Nmar_1406 CENSYa_0785 47 1,286,833 1,287,345 170 f 736,361 736,807 148 r hypothetical protein
Nmar_1407 CENSYa_0784 86 1,287,346 1,288,458 370 r 735,490 736,410 306 f DNA topoisomerase (ATP-hydrolyzing)
Nmar_1408 CENSYa_0783 63 1,288,445 1,290,325 626 r 734,298 735,191 297 f DNA topoisomerase VI, B subunit
Nmar_1409 CENSYa_0781 61 1,290,312 1,290,884 190 r 732,767 733,351 194 f KH type 1 domain protein
Nmar_1410 CENSYa_0780 39 1,290,881 1,291,660 259 r 732,017 732,691 224 f non-specific serine/threonine protein kinase
Nmar_1411 CENSYa_0829 62 1,291,717 1,292,013 98 r 811,368 811,664 98 f protein of unknown function DUF424
Nmar_1412 CENSYa_0828 84 1,292,013 1,292,435 140 r 810,946 811,368 140 f translation initiation factor IF2/IF5
Nmar_1413 CENSYa_1482 85 1,292,507 1,292,668 53 r 1,485,117 1,485,260 47 f conserved hypothetical protein
Nmar_1414 CENSYa_1481 44 1,292,904 1,293,401 165 f 1,484,393 1,484,878 161 r hypothetical protein
Nmar_1415 CENSYa_1471 71 1,293,472 1,294,617 381 f 1,477,323 1,478,564 413 r protein of unknown function DUF373
Nmar_1416 CENSYa_1470 75 1,294,697 1,295,488 263 f 1,476,506 1,477,294 262 r undecaprenyl diphosphate synthase
Nmar_1417 CENSYa_1469 52 1,295,485 1,295,904 139 f 1,475,989 1,476,504 171 r NUDIX hydrolase
Nmar_1418 CENSYa_1468 52 1,295,884 1,296,624 246 r 1,475,264 1,475,992 242 f orotidine 5'-phosphate decarboxylase
Nmar_1419 CENSYa_1467 61 1,296,605 1,297,288 227 r 1,474,694 1,475,236 180 f hypothetical protein
Nmar_1420 CENSYa_1463 70 1,297,338 1,298,339 333 r 1,472,438 1,473,454 338 f Adenylosuccinate synthase
Nmar_1421 CENSYa_1462 58 1,298,420 1,298,881 153 f 1,471,829 1,472,290 153 r hypothetical protein
Nmar_1423 CENSYa_0461 35 1,299,224 1,299,649 426 r 385,999 386,391 131 f hypothetical protein
Nmar_1424 CENSYa_0492 71 1,299,803 1,300,012 69 f 416,951 417,175 74 f hypothetical protein
Nmar_1425 CENSYa_1460 66 1,300,073 1,300,714 213 f 1,470,761 1,471,363 200 r hypothetical protein
Nmar_1426 CENSYa_1459 62 1,300,835 1,301,122 95 f 1,470,393 1,470,677 94 r hypothetical protein
Nmar_1427 CENSYa_1699 24 1,301,276 1,301,872 198 f 1,696,144 1,696,713 189 f Sua5/YciO/YrdC/YwlC family protein
Nmar_1428 CENSYa_1384 43 1,301,869 1,302,390 173 f 1,409,614 1,410,144 176 r THUMP domain protein
Nmar_1429 CENSYa_1383 39 1,302,382 1,303,155 257 r 1,408,801 1,409,625 274 f thiamineS protein
Nmar_1430 CENSYa_0104 34 1,303,197 1,303,586 129 r 77,954 78,478 174 f hypothetical protein
Nmar_1431 CENSYa_0054 52 1,303,617 1,304,435 272 r 40,327 41,076 249 r tatD-related deoxyribonuclease
Nmar_1432 CENSYa_0055 70 1,304,454 1,304,690 78 f 41,164 41,391 75 f transcription factor CBF/NF-Y/histone domain protein
Nmar_1443 CENSYa_0033 46 1,313,572 1,314,030 459 r 24,575 25,207 211 f copper binding protein, plastocyanin/azurin family
Nmar_1445 CENSYa_1109 59 1,316,179 1,316,757 192 r 1,142,077 1,142,586 169 f hypothetical protein
Nmar_1446 CENSYa_1082 79 1,316,841 1,317,122 93 f 1,121,523 1,121,801 92 r conserved hypothetical protein
Nmar_1447 CENSYa_1078 53 1,317,902 1,319,200 432 f 1,119,536 1,120,801 421 r phosphoglucomutase/phosphomannomutase alpha/beta/alpha domain I
Nmar_1448 CENSYa_1077 43 1,319,184 1,320,116 310 f 1,118,332 1,119,534 400 r thiamine-monophosphate kinase
Nmar_1450 CENSYa_1076 93 1,320,746 1,321,072 108 f 1,117,961 1,118,287 108 r ribosomal protein S11


Nmar_1451 CENSYa_1075 50 1,321,069 1,321,377 102 r 1,117,653 1,117,964 103 f hypothetical protein Nmar_1452 CENSYa_1074 56 1,321,377 1,321,781 134 r 1,117,267 1,117,653 128 f hypothetical protein Nmar_1453 CENSYa_1073 73 1,321,778 1,323,226 482 r 1,115,891 1,117,270 459 f protein of unknown function UPF0027 Nmar_1455 CENSYa_1069 66 1,323,582 1,324,943 453 f 1,113,242 1,114,597 451 r adenylosuccinate lyase Nmar_1456 CENSYa_1060 44 1,325,203 1,326,207 1,005 r 1,104,640 1,105,479 280 r secreted periplasmic Zn-dependent protease Nmar_1462 CENSYa_2060 37 1,329,980 1,331,845 1,866 r 2,030,326 2,034,024 1,233 r hypothetical protein Nmar_1464 CENSYa_0461 33 1,332,454 1,332,846 130 r 385,999 386,391 130 f hypothetical protein Nmar_1465 CENSYa_0766 28 1,333,198 1,333,590 393 r 719,113 719,487 125 r hypothetical protein Nmar_1466 CENSYa_1068 48 1,333,685 1,334,440 251 r 1,112,509 1,113,294 261 f asparagine synthase Nmar_1467 CENSYa_0688 29 1,334,547 1,335,326 259 f 649,353 651,266 637 f excalibur domain protein Nmar_1468 CENSYa_1064 62 1,335,364 1,335,639 91 f 1,110,183 1,110,506 107 r transcriptional coactivator/pterin dehydratase Nmar_1469 CENSYa_1063 61 1,335,702 1,336,520 272 f 1,109,241 1,110,053 270 r hypothetical protein Nmar_1473 CENSYa_1631 23 1,340,205 1,341,287 360 r 1,635,661 1,636,638 325 r DNA-cytosine methyltransferase Nmar_1477 CENSYa_1062 66 1,345,206 1,347,521 771 r 1,106,918 1,109,239 773 r aminoacyl-tRNA synthetase class Ia Nmar_1478 CENSYa_0065 47 1,347,536 1,348,153 205 r 46,119 46,724 201 r SNO glutamine amidotransferase Nmar_1479 CENSYa_0066 76 1,348,150 1,349,118 322 r 46,721 47,689 322 r vitamin B6 biosynthesis protein Nmar_1480 CENSYa_0069 57 1,349,149 1,349,682 177 r 48,395 48,928 177 r hypothetical protein Nmar_1481 CENSYa_1822 22 1,349,952 1,350,800 282 r 1,801,614 1,802,432 272 r putative signal transduction protein with CBS domains Nmar_1482 CENSYa_0071 70 1,350,892 1,353,156 754 r 49,155 51,419 754 r aconitate hydratase Nmar_1483 CENSYa_0073 
79 1,353,303 1,353,932 209 f 51,561 52,199 212 f hypothetical protein Nmar_1484 CENSYa_0074 50 1,353,935 1,354,501 188 r 52,200 52,688 162 r hypothetical protein Nmar_1485 CENSYa_0075 96 1,354,520 1,354,759 79 r 52,786 53,025 79 r transcriptional regulator, AsnC family Nmar_1486 CENSYa_0076 68 1,354,803 1,355,222 139 r 53,068 53,514 148 r protein of unknown function DUF55 Nmar_1488 CENSYa_0077 54 1,355,702 1,357,345 547 r 53,729 54,940 403 r phenylalanyl-tRNA synthetase, beta subunit Nmar_1489 CENSYa_0079 61 1,357,336 1,358,724 462 r 55,338 56,711 457 r phenylalanyl-tRNA synthetase, alpha subunit Nmar_1490 CENSYa_0080 73 1,358,772 1,359,884 370 f 56,765 57,871 368 f tryptophanyl-tRNA synthetase Nmar_1493 CENSYa_0081 53 1,361,392 1,361,721 109 r 58,139 58,459 106 r hypothetical protein Nmar_1494 CENSYa_0082 57 1,361,774 1,362,250 158 r 58,537 58,995 152 r cytidyltransferase-related domain Nmar_1495 CENSYa_0084 37 1,362,284 1,363,348 354 r 59,423 60,526 367 r hypothetical protein Nmar_1496 CENSYa_0290 63 1,363,804 1,364,358 184 f 251,584 252,135 183 r alkyl hydroperoxide reductase/ Thiol specific antioxidant/ Mal allergen Nmar_1497 CENSYa_1117 34 1,364,467 1,365,354 295 r 1,147,995 1,149,071 358 f putative transcriptional regulator Nmar_1498 CENSYa_0404 62 1,365,473 1,365,856 127 f 347,787 348,173 128 f hypothetical protein Nmar_1499 CENSYa_0403 58 1,366,322 1,367,164 280 f 346,746 347,606 286 r DNA adenine methylase Nmar_1500 CENSYa_0402 93 1,367,240 1,367,890 216 r 345,988 346,653 221 f hypothetical protein Nmar_1501 CENSYa_0401 68 1,368,025 1,368,387 120 r 344,371 345,648 425 f hypothetical protein Nmar_1502 CENSYa_0399 95 1,368,507 1,369,079 190 r 343,290 343,859 189 f ammonia monooxygenase/methane monooxygenase, subunit C Nmar_1503 CENSYa_0394 82 1,369,326 1,369,895 189 f 340,470 341,045 191 r hypothetical protein Nmar_1504 CENSYa_0393 61 1,369,991 1,370,305 104 f 340,074 340,439 121 r hypothetical protein Nmar_1505 CENSYa_0392 51 1,370,346 1,370,864 172 f 
339,283 340,068 261 r hypothetical protein Nmar_1506 CENSYa_0391 91 1,370,967 1,371,236 89 r 338,795 339,064 89 f hypothetical protein Nmar_1508 CENSYa_0103 73 1,371,803 1,372,252 149 f 77,110 77,556 148 r ribosomal S13S15 domain protein Nmar_1509 CENSYa_0102 63 1,372,252 1,373,673 473 f 75,704 77,035 443 r phosphoesterase DHHA1 Nmar_1510 CENSYa_0101 43 1,373,636 1,373,878 80 f 75,502 75,627 41 r hypothetical protein Nmar_1511 CENSYa_0100 61 1,373,878 1,375,143 421 f 74,246 75,502 418 r seryl-tRNA synthetase Nmar_1512 CENSYa_0099 70 1,375,184 1,375,795 203 f 73,575 74,186 203 r ribosomal protein S3Ae Nmar_1513 CENSYa_0098 32 1,375,822 1,376,247 141 f 73,152 73,565 137 r protein of unknown function DUF54 Nmar_1514 CENSYa_0097 71 1,376,438 1,377,895 485 f 71,565 73,016 483 r geranylgeranyl reductase Nmar_1515 CENSYa_0096 75 1,377,936 1,378,730 264 f 70,691 71,485 264 r ATPase associated with various cellular activities AAA_5 Nmar_1516 CENSYa_0095 58 1,378,731 1,380,293 520 f 69,126 70,691 521 r von Willebrand factor type A Nmar_1517 CENSYa_0094 64 1,380,359 1,381,903 514 f 66,985 68,868 627 r pyridoxal-5'-phosphate-dependent protein beta subunit Nmar_1519 CENSYa_0092 87 1,382,425 1,382,988 187 f 65,855 66,415 186 r TATA-box binding family protein Nmar_1522 CENSYa_0090 36 1,383,857 1,384,567 236 r 64,822 65,451 209 f hypothetical protein Nmar_1524 CENSYa_0075 83 1,385,420 1,385,659 240 r 52,786 53,025 80 r transcriptional regulator Nmar_1526 CENSYa_1777 54 1,386,804 1,388,972 722 r 1,760,307 1,763,201 964 r thrombospondin type 3 repeat Nmar_1527 CENSYa_1634 57 1,389,055 1,391,223 722 r 1,638,466 1,641,234 922 r thrombospondin type 3 repeat Nmar_1528 CENSYa_1634 45 1,391,235 1,393,613 2,379 r 1,638,446 1,641,234 930 r autotransporter adhesin Nmar_1530 CENSYa_1640 46 1,394,794 1,395,360 188 r 1,648,357 1,648,737 126 r non-canonical purine NTP pyrophosphatase, rdgB/HAM1 family Nmar_1531 CENSYa_1641 54 1,395,344 1,395,964 206 r 1,648,913 1,649,869 318 r Mn2+-dependent 
serine/threonine protein kinase Nmar_1533 CENSYa_1642 74 1,396,322 1,397,305 327 r 1,649,969 1,650,682 237 r putative metalloendopeptidase, glycoprotease family Nmar_1534 CENSYa_1644 66 1,397,307 1,398,503 398 r 1,650,955 1,652,142 395 r GTP-binding protein HSR1-related Nmar_1535 CENSYa_1645 41 1,398,567 1,399,115 182 f 1,652,305 1,652,766 153 f conserved hypothetical protein Nmar_1536 CENSYa_1646 55 1,399,135 1,400,658 507 f 1,652,780 1,654,306 508 f tRNA-guanine transglycosylase Nmar_1537 CENSYa_1647 95 1,400,699 1,401,001 100 r 1,654,313 1,654,615 100 r 4Fe-4S ferredoxin iron-sulfur binding domain protein Nmar_1538 CENSYa_1648 77 1,401,142 1,401,354 70 r 1,654,830 1,655,054 74 r hypothetical protein Nmar_1539 CENSYa_1649 74 1,401,468 1,402,733 421 f 1,655,177 1,656,430 417 f beta-lactamase domain protein Nmar_1540 CENSYa_1651 47 1,402,714 1,403,577 287 f 1,656,651 1,657,328 225 f HhH-GPD family protein Nmar_1541 CENSYa_0004 60 1,403,726 1,404,076 116 f 2,904 3,251 115 r hypothetical protein Nmar_1542 CENSYa_0003 51 1,404,073 1,404,357 285 r 2,443 2,907 155 f copper binding protein, plastocyanin/azurin family Nmar_1543 CENSYa_0002 66 1,404,590 1,406,185 531 r 769 2,397 542 f cytochrome b/b6 domain Nmar_1544 CENSYa_0001 72 1,406,169 1,406,774 201 r 42 650 202 f Rieske (2Fe-2S) domain protein Nmar_1545 CENSYa_2066 57 1,406,818 1,410,627 1269 r 2,041,909 2,045,052 1047 f peptidase S8 and S53 subtilisin kexin sedolisin Nmar_1546 CENSYa_1415 61 1,410,659 1,411,663 334 r 1,435,019 1,435,984 321 r hypothetical protein Nmar_1547 CENSYa_0159 50 1,411,775 1,416,979 1734 r 133,102 138,144 1680 f hypothetical protein Nmar_1548 CENSYa_2046 64 1,417,196 1,418,527 443 r 2,007,223 2,008,599 458 f UBA/THIF-type NAD/FAD binding protein Nmar_1549 CENSYa_2045 73 1,418,527 1,419,741 404 r 2,005,985 2,007,223 412 f threonine synthase Nmar_1550 CENSYa_2044 32 1,419,947 1,420,534 195 f 2,005,331 2,005,741 136 r hypothetical protein Nmar_1551 CENSYa_2043 78 1,420,539 1,420,865 108 r 
2,005,008 2,005,334 108 f phosphoribosyl-AMP cyclohydrolase Nmar_1552 CENSYa_2042 64 1,420,888 1,421,688 266 r 2,004,223 2,004,984 253 f imidazoleglycerol phosphate synthase, cyclase subunit

Nmar_1553 CENSYa_2041 55 1,421,678 1,422,388 236 r 2,003,492 2,004,196 234 f 1-(5-phosphoribosyl)-5-((5-phosphoribosylamino)methylideneamino)imidazole-4-carboxamide isomerase

Nmar_1554 CENSYa_2040 55 1,422,385 1,422,987 200 r 2,002,890 2,003,495 201 f imidazole glycerol phosphate synthase, glutamine amidotransferase subunit Nmar_1555 CENSYa_2039 63 1,422,987 1,423,574 195 r 2,002,309 2,002,890 193 f Imidazoleglycerol-phosphate dehydratase Nmar_1556 CENSYa_2038 46 1,423,603 1,424,523 306 r 2,001,344 2,002,309 321 f haloacid dehalogenase domain protein hydrolase Nmar_1557 CENSYa_2037 52 1,424,527 1,425,597 356 r 2,000,283 2,001,347 354 f aminotransferase class I and II Nmar_1558 CENSYa_2036 50 1,425,594 1,426,856 420 r 1,999,015 2,000,283 422 f histidinol dehydrogenase Nmar_1559 CENSYa_2035 82 1,426,856 1,427,833 325 r 1,998,044 1,999,018 324 f ATP phosphoribosyltransferase Nmar_1560 CENSYa_2026 61 1,427,870 1,429,126 418 r 1,990,738 1,991,985 415 f hydroxymethylglutaryl-CoA reductase, degradative Nmar_1561 CENSYa_2025 57 1,429,128 1,430,219 363 r 1,989,566 1,990,738 390 f glucose sorbosone dehydrogenase Nmar_1562 CENSYa_2023 44 1,430,379 1,431,731 450 r 1,987,870 1,989,396 508 f domain of unknown function DUF1743 Nmar_1563 CENSYa_2018 72 1,431,790 1,432,785 331 f 1,983,977 1,984,972 331 r diphthamide biosynthesis protein


Nmar_1564 CENSYa_2017 44 1,432,780 1,433,379 199 r 1,983,345 1,983,965 206 f TPR repeat-containing protein Nmar_1565 CENSYa_1751 72 1,433,386 1,434,471 1,086 r 1,739,791 1,740,870 360 f Zn-dependent oxidoreductase Nmar_1566 CENSYa_2014 58 1,434,547 1,435,098 183 f 1,980,218 1,981,306 362 r RNA-binding protein (consists of S1 domain and a Zn-ribbon domain)-like protein Nmar_1567 CENSYa_2013 60 1,435,101 1,435,916 271 f 1,979,404 1,980,111 235 r prephenate dehydratase Nmar_1568 CENSYa_2012 40 1,435,913 1,436,380 155 r 1,978,994 1,979,407 137 f hypothetical protein Nmar_1569 CENSYa_2008 78 1,436,416 1,437,846 476 r 1,975,735 1,977,159 474 f inosine-5'-monophosphate dehydrogenase Nmar_1570 CENSYa_2007 46 1,437,892 1,438,737 281 r 1,974,898 1,975,725 275 f dihydropteroate synthase Nmar_1571 CENSYa_2006 47 1,438,772 1,439,506 244 f 1,974,097 1,974,816 239 r protein of unknown function DUF115 Nmar_1572 CENSYa_2004 40 1,439,563 1,440,114 183 f 1,972,862 1,973,485 207 r hypothetical protein Nmar_1573 CENSYa_2003 73 1,440,119 1,441,642 507 f 1,971,335 1,972,858 507 r GMP synthase, large subunit Nmar_1574 CENSYa_0461 37 1,441,652 1,442,008 357 f 385,999 386,391 131 f hypothetical protein Nmar_1575 CENSYa_0970 54 1,442,021 1,442,281 261 f 1,028,713 1,028,970 86 r conserved hypothetical protein Nmar_1576 CENSYa_2002 58 1,442,282 1,443,151 289 r 1,970,486 1,971,334 282 f ROK family protein Nmar_1577 CENSYa_2001 78 1,443,189 1,444,811 540 r 1,968,870 1,970,474 534 f thermosome Nmar_1578 CENSYa_2000 58 1,444,856 1,446,571 571 r 1,967,039 1,968,868 609 f adenine deaminase Nmar_1579 CENSYa_1999 74 1,446,690 1,446,971 93 f 1,966,668 1,966,949 93 r ribosomal protein L44E Nmar_1580 CENSYa_1695 40 1,446,968 1,447,165 65 f 1,693,717 1,694,391 224 r ribosomal protein S27E Nmar_1581 CENSYa_1998 38 1,447,148 1,447,732 194 r 1,966,100 1,966,696 198 f 5-deoxyadenosylcobinamide phosphate nucleotidyltransferase Nmar_1582 CENSYa_1997 56 1,447,729 1,448,454 241 r 1,965,441 1,966,100 219 f 
cobalamin 5'-phosphate synthase Nmar_1583 CENSYa_1996 54 1,448,447 1,449,418 323 r 1,964,417 1,965,379 320 f cobalamin biosynthesis protein CobD Nmar_1584 CENSYa_1995 52 1,449,415 1,450,254 279 r 1,963,590 1,964,420 276 f cobyrinic acid ac-diamide synthase Nmar_1585 CENSYa_1994 42 1,450,251 1,451,330 359 r 1,962,436 1,963,593 385 f aminotransferase class I and II Nmar_1586 CENSYa_1993 80 1,451,414 1,452,490 358 f 1,961,358 1,962,437 359 r aspartate-semialdehyde dehydrogenase Nmar_1587 CENSYa_1992 72 1,452,654 1,453,028 124 f 1,960,834 1,961,292 152 r transcriptional regulator, AsnC family Nmar_1588 CENSYa_1991 75 1,453,031 1,453,960 309 f 1,959,891 1,960,823 310 r protoheme IX farnesyltransferase Nmar_1591 CENSYa_1732 26 1,454,981 1,456,726 581 r 1,720,812 1,722,983 723 r hypothetical protein Nmar_1592 CENSYa_0091 58 1,456,838 1,457,221 127 f 65,453 65,809 118 r conserved hypothetical protein Nmar_1594 CENSYa_1467 36 1,457,903 1,458,568 666 r 1,474,694 1,475,236 181 f hypothetical protein Nmar_1597 CENSYa_1735 67 1,459,613 1,460,944 443 r 1,725,195 1,726,466 423 r UbiD family decarboxylase Nmar_1598 CENSYa_1736 38 1,460,937 1,461,278 113 r 1,726,513 1,727,001 162 r hypothetical protein Nmar_1599 CENSYa_0304 59 1,461,296 1,461,742 148 r 259,927 260,145 72 r hypothetical protein Nmar_1600 CENSYa_1744 59 1,462,053 1,462,715 220 r 1,731,593 1,732,126 177 f hypothetical protein Nmar_1601 CENSYa_1743 55 1,462,737 1,463,069 110 r 1,731,107 1,731,442 111 f hypothetical protein Nmar_1602 CENSYa_1742 66 1,463,132 1,463,473 113 r 1,730,734 1,731,072 112 f ubiquitin-associated- domain-containing protein Nmar_1603 CENSYa_1741 62 1,463,470 1,463,943 157 r 1,730,258 1,730,734 158 f PUA domain containing protein Nmar_1604 CENSYa_1739 64 1,464,034 1,465,080 348 f 1,728,946 1,729,980 344 r phosphate uptake regulator, PhoU Nmar_1605 CENSYa_1738 65 1,465,136 1,466,251 371 f 1,727,846 1,728,928 360 r GTP1/OBG protein Nmar_1606 CENSYa_1737 58 1,466,241 1,466,774 177 f 1,727,322 
1,727,849 175 r protein of unknown function DUF127 Nmar_1607 CENSYa_1760 87 1,466,801 1,467,316 171 f 1,746,840 1,747,355 171 f transcription factor TFIIE, alpha subunit Nmar_1610 CENSYa_1761 60 1,469,218 1,469,775 185 r 1,747,356 1,747,907 183 r hemerythrin HHE cation binding domain protein Nmar_1612 CENSYa_1759 42 1,470,160 1,470,816 218 r 1,746,161 1,746,814 217 f bifunctional deaminase-reductase domain protein Nmar_1613 CENSYa_1758 56 1,470,803 1,471,993 396 r 1,744,975 1,746,153 392 f amidohydrolase Nmar_1614 CENSYa_1757 42 1,471,987 1,472,742 251 r 1,744,421 1,744,978 185 f GTP cyclohydrolase IIa Nmar_1615 CENSYa_1756 67 1,472,739 1,473,155 138 r 1,743,846 1,744,256 136 f 6,7-dimethyl-8-ribityllumazine synthase Nmar_1616 CENSYa_1755 63 1,473,158 1,473,823 221 r 1,743,156 1,743,830 224 f 3,4-dihydroxy-2-butanone 4-phosphate synthase Nmar_1617 CENSYa_1754 60 1,473,902 1,474,495 197 r 1,742,512 1,743,099 195 f riboflavin synthase, alpha subunit Nmar_1618 CENSYa_1753 43 1,474,520 1,475,245 241 r 1,741,800 1,742,507 235 f TPR repeat-containing protein Nmar_1621 CENSYa_1752 54 1,476,524 1,477,378 284 r 1,740,877 1,741,743 288 f citryl-CoA lyase Nmar_1622 CENSYa_1751 83 1,477,386 1,478,462 358 r 1,739,791 1,740,870 359 f alcohol dehydrogenase zinc-binding domain protein Nmar_1623 CENSYa_1764 52 1,478,505 1,478,786 93 r 1,750,041 1,750,328 95 r DNA-binding TFAR19-related protein Nmar_1624 CENSYa_1765 79 1,478,794 1,479,246 150 r 1,750,336 1,750,770 144 r ribosomal protein S19e Nmar_1625 CENSYa_1768 52 1,479,729 1,480,397 222 r 1,751,289 1,751,957 222 r suppressor Mra1 family protein Nmar_1626 CENSYa_1769 45 1,480,399 1,481,523 374 r 1,751,954 1,753,033 359 r hypothetical protein Nmar_1627 CENSYa_1770 75 1,481,570 1,484,224 884 r 1,753,072 1,755,717 881 r ribonucleoside-diphosphate reductase, adenosylcobalamin-dependent Nmar_1628 CENSYa_1772 71 1,484,428 1,486,599 723 r 1,756,597 1,758,057 486 r regulatory protein ArsR Nmar_1629 CENSYa_1778 69 1,486,737 1,486,910 57 r 
1,763,185 1,763,346 53 r hypothetical protein Nmar_1630 CENSYa_1779 67 1,486,969 1,487,355 128 r 1,763,411 1,763,782 123 r protein of unknown function DUF35 Nmar_1631 CENSYa_1780 82 1,487,336 1,488,499 387 r 1,763,784 1,764,938 384 r propanoyl-CoA C-acyltransferase Nmar_1632 CENSYa_1782 54 1,488,548 1,489,006 152 f 1,765,747 1,766,373 208 f DNA topoisomerase type IA zn finger domain protein Nmar_1633 CENSYa_1783 54 1,488,990 1,489,796 268 r 1,766,363 1,767,172 269 r nitrilase/cyanide hydratase and apolipoprotein N-acyltransferase Nmar_1634 CENSYa_1784 70 1,489,962 1,490,138 58 r 1,767,203 1,767,379 58 r 4-oxalocrotonate tautomerase Nmar_1635 CENSYa_1785 79 1,490,172 1,490,930 252 r 1,767,418 1,768,185 255 r rhomboid family protein Nmar_1636 CENSYa_1874 30 1,490,971 1,492,524 1,554 r 1,846,587 1,848,137 517 r secreted periplasmic Zn-dependent protease Nmar_1637 CENSYa_1060 50 1,492,676 1,494,547 1,872 r 1,104,640 1,105,479 280 r secreted periplasmic Zn-dependent protease Nmar_1638 CENSYa_0049 30 1,494,591 1,496,462 1,872 r 35,479 36,987 503 r hypothetical protein Nmar_1639 CENSYa_0047 22 1,496,471 1,497,802 443 r 34,090 35,028 312 f hypothetical protein Nmar_1640 CENSYa_0046 23 1,497,802 1,499,169 455 r 32,852 34,093 413 f hypothetical protein Nmar_1641 CENSYa_0043 50 1,499,144 1,500,094 316 r 29,939 30,874 311 r von Willebrand factor type A Nmar_1642 CENSYa_0044 41 1,500,095 1,500,967 290 r 30,878 31,708 276 r conserved hypothetical protein Nmar_1643 CENSYa_0045 56 1,500,977 1,501,999 340 r 31,737 32,744 335 r ATPase associated with various cellular activities AAA_3 Nmar_1644 CENSYa_1060 41 1,402,127 1,503,287 101,161 r 1,104,640 1,105,479 280 r secreted periplasmic Zn-dependent protease Nmar_1645 CENSYa_1788 57 1,503,352 1,503,852 166 r 1,768,797 1,769,285 162 r transcriptional regulator, ArsR family Nmar_1646 CENSYa_1789 64 1,503,845 1,504,201 118 r 1,769,293 1,769,694 133 r hypothetical protein Nmar_1647 CENSYa_1874 36 1,504,301 1,505,896 531 r 1,846,587 
1,848,137 516 r hypothetical protein Nmar_1648 CENSYa_1791 50 1,506,093 1,507,811 572 r 1,769,957 1,771,690 577 r hypothetical protein Nmar_1649 CENSYa_1793 58 1,507,916 1,508,743 275 f 1,771,994 1,772,821 275 f hypothetical protein Nmar_1650 CENSYa_1795 45 1,508,803 1,509,615 813 f 1,774,086 1,774,961 292 f copper binding protein, plastocyanin/azurin family Nmar_1651 CENSYa_1797 47 1,509,629 1,510,321 230 r 1,776,029 1,776,727 232 r hypothetical protein Nmar_1652 CENSYa_1798 52 1,510,361 1,513,144 927 f 1,776,761 1,779,544 927 f copper resistance D domain protein Nmar_1653 CENSYa_1053 52 1,513,134 1,513,952 272 r 1,099,722 1,100,555 277 r ABC-3 protein Nmar_1654 CENSYa_1054 47 1,513,955 1,514,677 240 r 1,100,560 1,101,345 261 r ABC transporter related Nmar_1655 CENSYa_1055 26 1,514,671 1,515,591 306 r 1,101,351 1,103,033 560 r periplasmic solute binding protein Nmar_1656 CENSYa_1056 59 1,515,683 1,516,066 127 f 1,103,156 1,103,551 131 f putative transcriptional regulator, CopG family Nmar_1658 CENSYa_0500 42 1,516,880 1,517,671 263 f 420,881 421,621 246 r DSBA oxidoreductase Nmar_1659 CENSYa_0920 69 1,517,700 1,518,122 423 r 975,284 975,703 140 f universal stress protein Nmar_1662 CENSYa_1581 59 1,520,144 1,521,322 1,179 r 1,592,406 1,593,530 375 r divalent heavy-metal cation transporter Nmar_1663 CENSYa_1582 79 1,521,312 1,522,373 1,062 r 1,593,556 1,594,596 347 r multicopper oxidase Nmar_1664 CENSYa_1201 57 1,522,483 1,522,974 492 r 1,226,888 1,227,256 123 f Mn-dependent transcriptional regulator Nmar_1665 CENSYa_0033 37 1,523,135 1,523,638 504 r 24,575 25,207 211 f copper binding protein, plastocyanin/azurin family Nmar_1670 CENSYa_0500 52 1,529,002 1,529,724 723 f 420,881 421,621 247 r protein-disulfide isomerase


Nmar_1671 CENSYa_0516 39 1,530,045 1,531,619 1,575 f 451,910 453,436 509 r streptogramin lyase Nmar_1673 CENSYa_1238 53 1,532,270 1,533,415 1,146 r 1,263,583 1,264,740 386 f trypsin-like serine protease Nmar_1674 CENSYa_1788 30 1,533,567 1,534,310 744 f 1,768,797 1,769,285 163 r hypothetical protein Nmar_1676 CENSYa_1874 30 1,534,989 1,536,710 1,722 r 1,846,587 1,848,137 517 r secreted periplasmic Zn-dependent protease Nmar_1678 CENSYa_1796 40 1,537,367 1,538,377 1,011 r 1,775,021 1,776,016 332 f copper binding protein, plastocyanin/azurin family Nmar_1679 CENSYa_1905 57 1,539,601 1,540,227 627 f 1,872,493 1,873,206 238 f hypothetical protein Nmar_1684 CENSYa_1799 45 1,543,991 1,545,364 457 r 1,779,532 1,780,869 445 r hypothetical protein Nmar_1686 CENSYa_1440 73 1,546,135 1,547,064 309 r 1,454,784 1,455,713 309 r aspartate carbamoyltransferase Nmar_1687 CENSYa_1441 75 1,547,106 1,547,567 153 f 1,455,753 1,456,214 153 f aspartate carbamoyltransferase, regulatory subunit Nmar_1688 CENSYa_1442 69 1,547,603 1,547,911 102 r 1,456,242 1,456,541 99 r H+transporting two-sector ATPase C subunit Nmar_1689 CENSYa_1444 83 1,548,095 1,548,724 209 r 1,456,722 1,457,087 121 r V-type ATPase, D subunit Nmar_1690 CENSYa_1445 88 1,548,729 1,550,114 461 r 1,457,356 1,458,744 462 r sodium-transporting two-sector ATPase Nmar_1691 CENSYa_1446 86 1,550,111 1,551,889 592 r 1,458,741 1,460,519 592 r H+transporting two-sector ATPase alpha/beta subunit central region Nmar_1692 CENSYa_1447 50 1,551,891 1,552,487 198 r 1,460,521 1,461,117 198 r H+transporting two-sector ATPase E subunit Nmar_1693 CENSYa_1451 66 1,552,577 1,554,676 699 f 1,462,539 1,464,614 691 f V-type ATPase 116 kDa subunit Nmar_1697 CENSYa_1452 51 1,557,401 1,558,177 258 r 1,464,598 1,465,389 263 r rhodanese domain protein Nmar_1698 CENSYa_1453 81 1,558,251 1,559,816 521 r 1,465,453 1,467,018 521 r ammonium transporter Nmar_1699 CENSYa_1454 48 1,559,945 1,560,610 221 f 1,467,168 1,467,824 218 f hypothetical protein Nmar_1700 
CENSYa_1456 59 1,560,670 1,561,704 344 f 1,468,122 1,469,162 346 f proton-transporting two-sector ATPase Nmar_1701 CENSYa_1458 61 1,561,983 1,562,630 215 f 1,469,415 1,470,062 215 f hypothetical protein Nmar_1703 CENSYa_1533 40 1,562,875 1,563,429 184 r 1,546,355 1,546,888 177 f hypothetical protein Nmar_1704 CENSYa_1532 60 1,563,501 1,563,851 116 f 1,545,916 1,546,263 115 r hypothetical protein Nmar_1705 CENSYa_1531 64 1,563,854 1,564,537 227 r 1,545,226 1,545,912 228 f uridylate kinase, putative Nmar_1706 CENSYa_1530 56 1,564,534 1,565,769 411 r 1,544,003 1,545,226 407 f metal dependent phosphohydrolase Nmar_1707 CENSYa_1529 40 1,565,766 1,566,347 193 r 1,543,420 1,543,980 186 f thymidylate kinase Nmar_1708 CENSYa_1510 52 1,566,629 1,567,009 126 f 1,522,043 1,522,426 127 r heat shock protein Hsp20 Nmar_1709 CENSYa_1509 64 1,567,013 1,568,557 514 r 1,520,453 1,521,997 514 f ABC-1 domain protein Nmar_1710 CENSYa_1508 39 1,568,561 1,568,944 127 r 1,519,903 1,520,436 177 f conserved hypothetical protein Nmar_1712 CENSYa_1507 56 1,569,640 1,570,932 430 r 1,518,604 1,519,875 423 f hypothetical protein Nmar_1713 CENSYa_1506 68 1,570,929 1,571,552 207 r 1,517,927 1,518,607 226 f ABC transporter related Nmar_1714 CENSYa_1505 77 1,571,629 1,572,816 395 r 1,516,768 1,517,925 385 f protein of unknown function DUF214 Nmar_1715 CENSYa_1504 56 1,572,933 1,573,835 300 f 1,515,644 1,516,588 314 r putative transcriptional regulator, AsnC family Nmar_1717 CENSYa_1503 53 1,574,377 1,575,357 326 r 1,514,679 1,515,656 325 f phosphate uptake regulator, PhoU Nmar_1718 CENSYa_1502 49 1,575,422 1,576,333 303 f 1,513,742 1,514,614 290 r GHMP kinase Nmar_1719 CENSYa_1501 52 1,576,344 1,577,108 254 f 1,512,977 1,513,735 252 r protein of unknown function DUF137 Nmar_1720 CENSYa_1500 57 1,577,101 1,577,940 279 f 1,512,166 1,512,954 262 r 3-methyl-2-oxobutanoate hydroxymethyltransferase Nmar_1721 CENSYa_1499 57 1,577,930 1,579,174 414 f 1,510,896 1,512,083 395 r phosphopantothenoylcysteine 
decarboxylase/phosphopantothenate--cysteine ligase Nmar_1722 CENSYa_1498 43 1,579,343 1,580,185 280 f 1,510,052 1,510,873 273 r methyltransferase type 11 Nmar_1724 CENSYa_1497 63 1,580,685 1,581,749 354 r 1,508,989 1,510,050 353 f peptidase M24 Nmar_1725 CENSYa_1496 74 1,581,822 1,582,835 337 f 1,507,907 1,508,917 336 r tRNA (5-methylaminomethyl-2-thiouridylate)-methyltransferase Nmar_1726 CENSYa_1495 75 1,582,909 1,583,187 92 f 1,507,570 1,507,869 99 r protein of unknown function DUF1621 Nmar_1727 CENSYa_1545 79 1,583,179 1,585,119 646 r 1,554,047 1,555,984 645 r beta-lactamase domain protein Nmar_1728 CENSYa_1546 82 1,585,127 1,585,759 210 r 1,555,992 1,556,579 195 r proteasome endopeptidase complex Nmar_1729 CENSYa_1547 64 1,585,837 1,586,901 354 f 1,556,693 1,557,757 354 f 3-dehydroquinate synthase Nmar_1730 CENSYa_1548 45 1,586,903 1,587,649 248 r 1,557,762 1,558,547 261 r peptidylprolyl isomerase FKBP-type Nmar_1731 CENSYa_1549 70 1,587,692 1,588,837 381 r 1,558,576 1,559,706 376 r serine--pyruvate transaminase Nmar_1732 CENSYa_1550 41 1,588,879 1,589,373 164 r 1,559,764 1,560,099 111 r cytidyltransferase-related domain Nmar_1733 CENSYa_1552 42 1,589,437 1,590,453 338 f 1,560,320 1,561,339 339 f RNA-3'-phosphate cyclase Nmar_1734 CENSYa_1553 58 1,590,433 1,590,930 165 r 1,561,319 1,561,828 169 r hypothetical protein Nmar_1735 CENSYa_1554 46 1,591,012 1,591,506 164 f 1,561,963 1,562,388 141 f hypothetical protein Nmar_1736 CENSYa_1556 66 1,591,671 1,592,381 236 f 1,562,476 1,563,285 269 f cob(II)yrinic acid a,c-diamide reductase Nmar_1737 CENSYa_1558 60 1,592,549 1,593,073 174 f 1,563,477 1,564,001 174 f NADPH-dependent FMN reductase Nmar_1738 CENSYa_1559 28 1,593,432 1,593,893 153 r 1,564,055 1,564,522 155 r hypothetical protein Nmar_1739 CENSYa_1439 61 1,593,978 1,594,208 76 r 1,454,255 1,454,566 103 f hypothetical protein Nmar_1740 CENSYa_1438 75 1,594,446 1,594,922 158 f 1,453,532 1,454,011 159 r peptidylprolyl isomerase Nmar_1741 CENSYa_1437 33 1,594,923 
1,595,819 298 f 1,452,639 1,453,535 298 r hypothetical protein Nmar_1744 CENSYa_1435 50 1,596,801 1,597,250 149 f 1,451,646 1,452,200 184 r hypothetical protein Nmar_1745 CENSYa_1434 75 1,597,262 1,598,155 297 f 1,450,743 1,451,636 297 r PfkB domain protein Nmar_1746 CENSYa_1433 66 1,598,195 1,598,542 115 f 1,450,340 1,450,642 100 r dUTPase Nmar_1747 CENSYa_1432 43 1,598,547 1,599,014 155 r 1,449,921 1,450,346 141 f tetrahydromethanopterin S-methyltransferase 23 kD subunit Nmar_1748 CENSYa_1431 52 1,599,007 1,600,140 377 r 1,448,791 1,449,873 360 f glutamine--scyllo-inositol transaminase Nmar_1749 CENSYa_1430 63 1,600,176 1,600,748 190 f 1,448,141 1,448,704 187 r hypothetical protein Nmar_1751 CENSYa_1428 63 1,601,298 1,602,344 348 f 1,446,783 1,447,820 345 r tyrosyl-tRNA synthetase Nmar_1752 CENSYa_1427 75 1,602,341 1,603,135 264 r 1,446,017 1,446,802 261 f precorrin-3B C17-methyltransferase Nmar_1753 CENSYa_1424 47 1,603,180 1,604,646 488 r 1,442,671 1,444,221 516 f hypothetical protein Nmar_1754 CENSYa_1423 58 1,604,682 1,606,085 467 r 1,441,118 1,442,497 459 f aspartate kinase Nmar_1755 CENSYa_1422 77 1,606,121 1,606,867 248 r 1,440,313 1,441,059 248 f proliferating cell nuclear antigen PcnA Nmar_1756 CENSYa_1421 60 1,606,902 1,607,219 105 r 1,439,929 1,440,240 103 f transcription termination factor Tfs Nmar_1757 CENSYa_1420 54 1,607,221 1,607,499 92 r 1,439,648 1,439,932 94 f hypothetical protein Nmar_1758 CENSYa_1300 62 1,607,551 1,608,087 178 f 1,310,299 1,310,913 204 r hypothetical protein Nmar_1759 CENSYa_1298 50 1,608,281 1,608,427 48 f 1,309,959 1,310,108 49 r hypothetical protein Nmar_1760 CENSYa_1297 58 1,608,424 1,608,933 169 r 1,309,462 1,309,962 166 f dual specificity protein phosphatase Nmar_1761 CENSYa_0285 50 1,608,962 1,609,603 213 r 248,375 249,004 209 r cyclase family protein Nmar_1762 CENSYa_0286 64 1,609,679 1,609,936 85 f 249,085 249,345 86 f transcriptional regulator, AsnC family Nmar_1763 CENSYa_0287 50 1,609,931 1,611,076 381 r 249,340 
250,464 374 r aminotransferase class V Nmar_1764 CENSYa_0288 24 1,611,103 1,611,699 198 f 250,539 251,102 187 f hypothetical protein Nmar_1765 CENSYa_0289 92 1,611,819 1,612,118 99 f 251,173 251,472 99 f 4Fe-4S ferredoxin iron-sulfur binding domain protein Nmar_1766 CENSYa_0926 44 1,612,286 1,614,061 591 f 978,506 980,158 550 f adenylate/guanylate cyclase with integral membrane sensor Nmar_1767 CENSYa_0925 55 1,614,058 1,615,212 384 r 977,223 978,425 400 f heat shock protein DnaJ domain protein Nmar_1768 CENSYa_0920 73 1,615,312 1,615,734 140 r 975,284 975,703 139 f UspA domain protein Nmar_1770 CENSYa_1873 64 1,617,135 1,617,791 218 r 1,845,999 1,846,613 204 f hypothetical protein Nmar_1771 CENSYa_1872 79 1,617,863 1,619,308 481 r 1,844,447 1,845,874 475 f glutamine synthetase, type I Nmar_1772 CENSYa_1871 70 1,619,396 1,620,742 448 r 1,843,004 1,844,374 456 f lysine 2,3-aminomutase related protein Nmar_1773 CENSYa_1870 65 1,620,969 1,622,612 547 f 1,841,072 1,842,844 590 r MscS Mechanosensitive ion channel Nmar_1774 CENSYa_1869 48 1,622,652 1,623,530 292 f 1,840,087 1,840,938 283 r ribose-phosphate pyrophosphokinase Nmar_1776 CENSYa_1867 59 1,624,474 1,625,373 299 f 1,838,172 1,839,068 298 r ribonuclease Z Nmar_1779 CENSYa_1865 38 1,626,908 1,627,312 134 r 1,837,467 1,837,871 134 f protein of unknown function DUF54 Nmar_1780 CENSYa_1864 57 1,627,309 1,627,902 197 r 1,836,923 1,837,453 176 f dephospho-CoA kinase, CoaE Nmar_1781 CENSYa_1863 38 1,627,930 1,628,472 180 f 1,836,293 1,836,820 175 r 2'-5' RNA ligase Nmar_1782 CENSYa_1862 45 1,628,469 1,629,800 443 f 1,834,968 1,836,296 442 r tRNA cytidylyltransferase


Nmar_1783 CENSYa_1861 52 1,629,775 1,630,536 253 f 1,834,247 1,834,936 229 r putative serine/threonine protein kinase Nmar_1784 CENSYa_1858 50 1,630,667 1,630,876 69 f 1,833,306 1,833,506 66 r hypothetical protein Nmar_1786 CENSYa_1857 77 1,632,027 1,632,767 246 r 1,832,578 1,833,309 243 f short-chain dehydrogenase/reductase SDR Nmar_1787 CENSYa_1856 75 1,632,871 1,633,092 73 f 1,832,252 1,832,497 81 r AN1-type Zinc finger protein Nmar_1788 CENSYa_1855 53 1,633,089 1,633,373 94 r 1,831,980 1,832,255 91 f hypothetical protein Nmar_1789 CENSYa_1854 45 1,633,406 1,633,963 185 r 1,831,380 1,831,976 198 f hypothetical protein Nmar_1790 CENSYa_1853 82 1,634,005 1,635,477 490 f 1,829,910 1,831,358 482 r glutamine synthetase, type I Nmar_1791 CENSYa_1852 57 1,635,474 1,635,752 92 r 1,829,653 1,829,907 84 f hypothetical protein Nmar_1792 CENSYa_1851 82 1,635,808 1,637,520 570 r 1,827,824 1,829,524 566 f thermosome Nmar_1793 CENSYa_1849 77 1,637,624 1,638,946 440 f 1,825,900 1,827,225 441 r glycine hydroxymethyltransferase Nmar_1794 CENSYa_1827 70 1,638,948 1,642,325 1125 r 1,804,970 1,808,308 1112 r DNA polymerase II, large subunit DP2 Nmar_1795 CENSYa_1828 68 1,642,337 1,643,089 250 r 1,808,313 1,809,098 261 r geranylgeranylglyceryl phosphate synthase Nmar_1796 CENSYa_1834 58 1,643,155 1,643,517 120 f 1,811,835 1,812,197 120 f hypothetical protein Nmar_1797 CENSYa_1835 40 1,643,644 1,644,246 200 f 1,812,229 1,812,876 215 f hypothetical protein Nmar_1798 CENSYa_1836 56 1,644,373 1,644,534 53 f 1,813,052 1,813,207 51 f transcriptional regulator, AbrB family Nmar_1799 CENSYa_1838 62 1,644,540 1,644,926 128 r 1,813,938 1,814,321 127 r hypothetical protein

N. maritimus ORFs highlighted in blue and bold represent genes implicated in the proposed archaeal ammonia oxidation pathway.
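The coordinate and length columns in the rows above appear to be linked by simple arithmetic. The following is a minimal sketch, assuming (the column headers are not visible in this part of the table) that each pair of comma-grouped numbers is a 1-based inclusive start and end in base pairs, that the number after them is the protein length in amino acids, and that each ORF span includes its stop codon. The helper names `parse_coord` and `aa_length` are hypothetical and chosen only for illustration.

```python
def parse_coord(text: str) -> int:
    """Convert a comma-grouped coordinate such as '1,611,103' to an int."""
    return int(text.replace(",", ""))


def aa_length(start: int, end: int) -> int:
    """Protein length in amino acids for an ORF spanning start..end.

    Assumption: coordinates are 1-based and inclusive, and the span
    ends with a stop codon that is not counted as an amino acid.
    """
    span = end - start + 1  # nucleotides covered by the ORF
    if span % 3 != 0:
        raise ValueError("ORF span is not a whole number of codons")
    return span // 3 - 1  # drop the stop codon


# (locus tag, start, end, reported length) copied from rows of the table
rows = [
    ("Nmar_1764", "1,611,103", "1,611,699", 198),
    ("Nmar_1765", "1,611,819", "1,612,118", 99),
    ("Nmar_1794", "1,638,948", "1,642,325", 1125),
]

for tag, start, end, reported in rows:
    computed = aa_length(parse_coord(start), parse_coord(end))
    print(tag, computed, computed == reported)
```

Under these assumptions the computed lengths agree with the reported ones for the three rows shown, so the same check could be used to spot transcription errors in other rows.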


Supplemental Table 5: Predicted distribution pattern of proteins diagnostic for major archaeal phyla.

Protein COG N. maritimus C. symbiosum Euryarchaeota Crenarchaeota
Cell division protein FtsZ COG0206 YP_001582596 YP_875876 + -
Predicted ATP-dependent protease COG1067 - - + -
ERCC4-type nuclease COG1111 YP_001582278 YP_876110 + -
MiaB family, Radical SAM enzyme COG1244 - - + -
Archaeal DNA polymerase D, small subunit COG1311 YP_001581340 YP_876754 + -
Archaeal DNA polymerase D, large subunit COG1933 YP_001583128 YP_876741 + -
Predicted membrane protein COG1422 YP_001581738 YP_875263 + -
Uncharacterised MobA-related protein COG1873 -/?¹ -/?¹ + -
Uncharacterised conserved protein COG2450 YP_001583060 YP_876421 + -
Predicted membrane protein COG4243 - - - +
Ribosomal protein L13E COG4352 - - - +
Uncharacterised protein conserved in archaea COG4755 - - - +
Uncharacterised conserved protein COG4353 - - - +
Uncharacterised protein conserved in archaea COG4879 - - - +
Predicted aminopeptidase, Iap family COG4882 - - - +
Uncharacterized Zn ribbon-containing protein COG4888 - - - +
Predicted metallopeptidase COG4900 - - - +
Ribosomal protein S25 COG4901 -/? (YP_001582264) -/? (YP_875201) - +
Predicted nucleotidyltransferase COG4914 - - - +
Ribosomal protein S30 COG4919 YP_001582721 YP_875487 - +
Predicted membrane protein COG4920 - - - +
Uncharacterised protein conserved in archaea COG5399 - - - +
Uncharacterized conserved protein containing a coiled-coil COG5493 - - - +
Predicted thioredoxin/glutaredoxin COG5494 - - - +
EMAP domain RNA-binding protein COG0073 YP_001581931 - + +
Phosphoribosylanthranilate isomerase COG0135 - - + +
Molybdopterin biosynthesis enzyme COG0303 - - + +
Hydrogenase maturation factor COG0309 - - + +
Molybdenum cofactor biosynthesis enzyme COG0315 - - + +
Dihydrodipicolinate synthase/N-acetylneuraminate lyase COG0329 - - + +
Membrane protease subunit, stomatin/prohibitin homolog COG0330 -/? - + +
NAD(FAD)-dependent dehydrogenase COG0446 -/? - + +
Molybdopterin biosynthesis enzyme COG0521 - - + +
Topoisomerase IA COG0550 - - + +
CDP-diglyceride synthetase COG0575 - - + +
Methionine synthase II (cobalamin-independent) COG0620 - - + +
Deoxycytidine deaminase COG0717 - - + +
ABC-type molybdate transport system, periplasmic component COG0725 - - + +
Molybdopterin-guanine dinucleotide biosynthesis protein A COG0746 - - + +
Predicted transcriptional regulator COG1395 - - + +
NMD protein affecting ribosome stability and mRNA decay COG1499 - - + +


Molybdopterin-guanine dinucleotide biosynthesis protein COG1763 - - + +
Archaeal S-adenosylmethionine synthetase COG1812 - - + +
Predicted transcription factor, homolog of eukaryotic MBF1 COG1813 - - + +
Transcriptional regulators COG1846 - - + +
Zn-dependent protease COG1994 - - + +
Ribosomal protein S24E COG2004 - - + +
Zn-dependent hydrolase of the beta-lactamase fold COG2220 - - + +
Molybdenum cofactor biosynthesis enzyme COG2896 - - + +

¹ - Proteins present in N. maritimus and C. symbiosum showing similarity with the euryarchaeal protein.
? - Proteins present in N. maritimus and/or C. symbiosum but (i) showing only weak similarity to, or (ii) sharing only a domain with, euryarchaeal or crenarchaeal proteins.


Supplemental Table 6: Gene predictions conserved between the N. maritimus genome and crenarchaeal fosmids and Sargasso Sea contigs.

CRENARCHAEAL GENOME FRAGMENTS
ORF Genome fragment Strand Left bp Right bp Annotation of top BLAST hit
Nmar0007 DeepAntEC39 f 5,639 7,168 Peptidylprolyl isomerase
Nmar0129 DeepAntEC39 r 119,812 119,660 hypothetical protein
Nmar0221 54d9 r 196,972 196,736 Like-Sm ribonucleoprotein core
Nmar0271 54d9 f 239,256 239,546 Putative uncharacterized protein
Nmar0302 DeepAntEC39 r 268,839 268,579 hypothetical protein
Nmar0344 4B7 r 308,168 307,311 hypothetical protein
Nmar0375 54d9 f 341,630 342,262 protein of unknown function UPF0126
Nmar0413 54d9 f 370,822 372,732 2-oxoglutarate ferredoxin oxidoreductase, alpha subunit
Nmar0414 54d9 f 372,722 373,684 2-oxoglutarate ferredoxin oxidoreductase, beta subunit
Nmar0429 54d9 f 383,085 383,438 hypothetical protein
Nmar0438 74A4 f 388,048 388,542 Tetratricopeptide TPR_2 repeat protein
Nmar0456 54d9 f 402,319 410,613 hypothetical protein
Nmar0527 4B7 f 469,717 469,908 Cold-shock protein DNA-binding
Nmar0570 54d9 f 507,793 508,344 DNA-3-methyladenine glycosylase I
Nmar0601 4B7 r 536,325 537,218 Methionine aminopeptidase
Nmar0602 4B7 f 537,506 537,751 Zn-ribbon protein
Nmar0616 4B7 f 554,316 557,606 Putative uncharacterized protein
Nmar0617 4B7 f 557,641 558,132 Protein of unknown function DUF367
Nmar0624 54d9 r 565,880 564,972 Transcription factor TFIIB cyclin-related
Nmar0639 4B7 f 579,023 579,685 DSBA oxidoreductase
Nmar0640 4B7 f 579,686 580,165 hypothetical protein
Nmar0641 4B7 f 580,411 581,352 hypothetical protein
Nmar0642 4B7 r 582,430 581,336 DNA-directed DNA polymerase
Nmar0672 DeepAntEC39 r 613,656 612,652 thioredoxin reductase
Nmar0726 4B7 f 659,913 661,637 SNF2-related protein
Nmar0735 DeepAntEC39, 54d9 r 668,933 667,620 hypothetical protein
Nmar0736 4B7 f 669,032 669,436 hypothetical protein
Nmar0737 4B7 r 669,981 669,433 heat shock protein DnaJ domain protein
Nmar0748 4B7 f 675,576 677,768 translation elongation factor aEF-2
Nmar0756 54d9 f 682,006 682,383 hypothetical protein
Nmar0850 DeepAntEC39 f 750,195 751,031 adenylyl cyclase class-3/4/guanylyl cyclase
Nmar0878 4B7 r 772,098 771,370 conserved hypothetical protein
Nmar0921 DeepAntEC39 r 807,076 806,078 NAD(+) kinase
Nmar0922 DeepAntEC39 r 807,987 807,118 PfkB domain protein
Nmar0923 DeepAntEC39 f 808,250 808,537 Ribosomal protein S26E
Nmar0924 DeepAntEC39 r 809,112 808,540 CDP-alcohol phosphatidyltransferase
Nmar0925 DeepAntEC39 r 810,065 809,190 putative agmatinase
Nmar0926 DeepAntEC39 r 811,934 810,069 threonyl-tRNA synthetase
Nmar0927 DeepAntEC39 r 812,327 811,992 hypothetical protein
Nmar0928 DeepAntEC39 f 812,404 813,468 deoxyhypusine synthase
Nmar0929 DeepAntEC39 r 813,694 813,482 hypothetical protein
Nmar0930 DeepAntEC39 f 813,793 814,104 small subunit ribosomal protein S25e
Nmar0931 DeepAntEC39 r 814,361 814,086 hypothetical protein
Nmar0932 DeepAntEC39 f 814,486 815,034 RNA polymerase Rpb6
Nmar0933 DeepAntEC39 f 815,086 815,400 Alba, DNA/RNA-binding protein
Nmar0934 DeepAntEC39 r 815,397 815,813 Transcriptional regulator


Nmar0935 DeepAntEC39 f 815,847 816,866 Asparagine synthase (glutamine-hydrolyzing)
Nmar0937 54d9 r 817,489 818,589 Putative uncharacterized protein
Nmar0938 DeepAntEC39 r 818,620 819,510 1,4-dihydroxy-2-naphthoate octaprenyltransferase
Nmar0939 DeepAntEC39 r 819,822 820,223 Putative uncharacterized protein
Nmar0949 54d9 f 827,214 827,495 conserved hypothetical protein
Nmar0966 74A4 f 844,775 845,488 NH3-dependent NAD+ synthetase
Nmar0967 74A4 f 845,485 846,873 Cysteinyl-tRNA synthetase
Nmar0970 4B7, 74A4 f 847,835 848,116 hypothetical protein
Nmar0971 74A4 r 848,935 848,117 MscS Mechanosensitive ion channel
Nmar0972 74A4 r 849,716 848,940 conserved hypothetical protein
Nmar0975 4B7, 74A4 r 852,485 851,328 Protein of unknown function DUF650
Nmar0976 4B7, 74A4 f 852,580 853,506 periplasmic binding protein
Nmar0981 74A4 r 858,749 859,114 Putative uncharacterized protein
Nmar0982 74A4, 54d9 r 859,116 859,526 Coenzyme A-binding protein
Nmar0983 4B7, 74A4 f 859,607 860,128 Putative uncharacterized protein
Nmar0984 4B7, 74A4 f 860,165 861,172 Bifunctional protein; biotin repressor
Nmar0985 74A4 r 861,173 861,535 Putative uncharacterized protein
Nmar0986 4B7, 74A4 f 861,646 863,085 Zn-dependent metalloprotease
Nmar0987 4B7 f 863,158 864,105 Transcription initiation factor TFIIB
Nmar0988 74A4 r 864,515 864,108 phosphoribosyltransferase-like
Nmar0992 54d9 r 867,096 866,668 hypothetical protein
Nmar0996 4B7 r 869,170 869,568 Hypothetical membrane protein
Nmar0997 74A4 r 870,511 869,663 Methyltransferase type 11
Nmar1011 4B7 r 887,979 886,990 nitroreductase
Nmar1017 4B7, 74A4 f 898,088 899,395 aminotransferase class-III
Nmar1019 74A4, DeepAntEC39 f 899,956 900,777 TPR-repeat protein
Nmar1020 74A4, DeepAntEC39 r 900,772 901,476 Double-stranded beta-helix fold enzyme
Nmar1022 74A4, DeepAntEC39 f 901,921 902,376 Cupin 2 conserved barrel domain protein
Nmar1023 74A4 f 902,379 902,741 Cupin 2 conserved barrel domain protein
Nmar1024 74A4, DeepAntEC39 f 902,804 803,475 Molecular chaperone
Nmar1025 74A4, DeepAntEC39, 54d9 r 903,476 903,883 HIT superfamily hydrolase
Nmar1026 DeepAntEC39 r 904,224 903,919 hypothetical protein
Nmar1027 DeepAntEC39 r 904,366 904,199 hypothetical protein
Nmar1028 74A4 f 904,421 905,560 3-hydroxyacyl-CoA dehydrogenase
Nmar1030 74A4, DeepAntEC39 r 905,845 906,354 Rossmann fold nucleotide-binding protein
Nmar1031 74A4 f 906,617 907,027 Multiple Zn-finger protein
Nmar1032 74A4 r 907,432 907,229 RNA polymerase Rpb10
Nmar1033 74A4 r 907,455 907,763 Ribosomal protein S10
Nmar1034 74A4 r 907,775 909,073 Elongation factor 1-alpha
Nmar1035 74A4 f 909,195 910,340 Putative uncharacterized protein
Nmar1036 74A4 f 910,408 910,815 Secreted protein
Nmar1037 74A4 f 910,852 912,618 DNA ligase
Nmar1038 74A4 f 912,685 913,338 Coil protein
Nmar1039 74A4 f 913,430 913,966 tRNA intron endonuclease
Nmar1041 74A4 r 914,012 916,327 Putative uncharacterized protein
Nmar1043 74A4 f 917,089 918,855 Rossmann-fold oxidoreductase
Nmar1048 74A4 r 924,387 925,160 Permease
Nmar1049 74A4 r 925,828 925,196 hypothetical protein
Nmar1050 74A4 r 926,360 925,869 protein of unknown function DUF192
Nmar1082 DeepAntEC39 r 985,427 985,176 hypothetical protein
Nmar1098 54d9 r 1,001,974 1,000,832 phosphoesterase DHHA1


Nmar1121 4B7 f 1,025,889 1,026,476 Ferritin Dps family protein
Nmar1182 4B7 f 1,080,792 1,082,114 DEAD/DEAH box helicase domain protein
Nmar1183 4B7 r 985,476 985,778 Putative uncharacterized protein
Nmar1185 4B7 r 986,566 989,808 Carbamyl phosphate synthetase
Nmar1186 4B7 r 989,810 990,964 Carbamoylphosphate synthase, small subunit
Nmar1189 4B7 f 1,085,567 1,085,911 hypothetical protein
Nmar1190 4B7 f 1,086,156 1,086,416 hypothetical protein
Nmar1191 4B7 r 1,088,440 1,086,413 protein of unknown function DUF255
Nmar1275 4B7 r 1,170,906 1,170,133 Rhodanese domain protein
Nmar1310 54d9 r 1,199,427 1,198,744 glutamine amidotransferase class-I
Nmar1352 DeepAntEC39 r 1,238,660 1,237,860 regulatory protein ArsR
Nmar1357 4B7 f 1,243,214 1,243,858 nitroreductase
Nmar1455 54d9 f 1,323,582 1,324,943 Adenylosuccinate lyase
Nmar1466 54d9 r 1,333,685 1,334,440 Asparagine synthase (glutamine-hydrolyzing)
Nmar1469 54d9 f 1,335,702 1,336,520 hypothetical protein
Nmar1477 54d9 r 1,345,206 1,347,521 Valyl-tRNA synthetase
Nmar1500 54d9 r 1,367,890 1,367,240 hypothetical protein: putative archaeal ammonia monooxygenase subunit A
Nmar1501 54d9 r 1,368,387 1,368,025 hypothetical protein
Nmar1503 54d9 f 1,369,326 1,369,895 hypothetical protein: putative archaeal ammonia monooxygenase subunit B
Nmar1524 DeepAntEC39 r 1,385,659 1,385,420 regulatory protein AsnC/Lrp family
Nmar1622 54d9 r 1,478,462 1,477,386 Alcohol dehydrogenase zinc-binding domain protein
Nmar1649 54d9 f 1,507,916 1,508,743 hypothetical protein
Nmar1667 54d9 r 1,526,542 1,525,130 hypothetical protein
Nmar1685 4B7 r 1,545,979 1,545,383 hypothetical protein

SARGASSO SEA CONTIGS
ORF Accession number Strand Left bp Right bp Annotation of top BLAST hit
Nmar0037 EQ086597 f 28,946 29,659 Organic radical activating enzyme
Nmar0083 EQ085329 r 70,282 71,367 Cobalt-precorrin-6A synthase
Nmar0084 EQ085329 f 71,458 72,327 Nucleoside-diphosphate-sugar epimerase
Nmar0085 EQ085329 r 72,304 73,470 S-adenosylmethionine synthetase
Nmar0087 EQ085329 f 73,840 74,493 RecA/RadA recombinase related protein
Nmar0088 EQ085329 f 74,592 77,333 Lhr-like helicase
Nmar0089 EQ085329 r 77,330 78,730 Single-stranded DNA-binding replication protein A, large (70 kD) subunit
Nmar0090 EQ085329 f 78,944 79,882 ABC-type multidrug transport system, ATPase component
Nmar0091 EQ085329 f 79,866 80,627 ABC-type multidrug transport system, permease component
Nmar0093 EQ085329 r 81,548 82,600 Uncharacterized conserved archaeal protein
Nmar0094 EQ085329 f 82,664 83,803 Putative uncharacterized protein
Nmar0095 EQ085329 f 83,839 84,402 Molecular chaperone GrpE
Nmar0096 EQ085329 f 84,405 86,315 Molecular chaperone
Nmar0097 EQ085329 f 86,365 87,450 DnaJ-class molecular chaperone
Nmar0099 EQ085329 r 88,308 90,212 Archaeal Glu-tRNA Gln amidotransferase subunit E
Nmar0100 EQ085329 r 90,221 91,510 Archaeal Glu-tRNA Gln amidotransferase subunit D/asparaginase


Nmar0101 EQ085329 r 91,543 93,711 AAA ATPase
Nmar0102 EQ085329 f 93,816 94,550 Ribosomal protein L2
Nmar0104 EQ085329 r 94,876 95,208
Nmar0105 EQ085329 r 95,588 96,502 ATPase of the PP-loop superfamily
Nmar0107 EQ085329 r 97,656 98,852 Membrane-associated Zn-dependent protease
Nmar0108 EQ085329 f 98,896 99,207 Uncharacterized protein involved in tolerance to divalent cations
Nmar0168 EQ086682 f 155,950 156,876 Hypothetical UDP-glucose 4-epimerase
Nmar0173 EQ084738 r 160,619 161,266 Short-chain alcohol dehydrogenase
Nmar0198 EQ086546 r 178,988 179,500 Putative uncharacterized protein
Nmar0231 EQ086498, EQ085222 r 203,816 205,486 Putative uncharacterized protein
Nmar0269 EQ085249 f 236,951 238,072 Putative uncharacterized protein
Nmar0270 EQ085249 f 238,081 239,238 Replication factor C/ATPase involved in DNA replication
Nmar0271 EQ085249 f 239,256 239,546 Putative uncharacterized protein
Nmar0272 EQ085249 f 239,653 241,200 Acetyl-CoA carboxylase, alpha and beta subunits
Nmar0273 EQ085249 f 241,206 242,693 Biotin carboxylase
Nmar0274 EQ085249 f 242,700 243,212 Acetyl/propionyl-CoA carboxylase, alpha subunit
Nmar0275 EQ085249 r 243,229 243,693 Peroxiredoxin
Nmar0276 EQ085249 f 243,850 244,182 NADH-ubiquinone oxidoreductase, subunit A
Nmar0277 EQ085249 f 244,223 244,747 NADH-ubiquinone oxidoreductase, subunit B
Nmar0278 EQ085249 f 24,474 245,349 NADH-ubiquinone oxidoreductase, subunit C
Nmar0279 EQ085249 f 245,352 246,488 NADH-ubiquinone oxidoreductase, subunit D
Nmar0280 EQ085249 f 246,489 247,787 NADH-ubiquinone oxidoreductase, subunit H
Nmar0281 EQ085249 f 247,787 248,284 NADH-ubiquinone oxidoreductase, subunit I
Nmar0282 EQ085249 f 248,277 248,789 NADH-ubiquinone oxidoreductase, subunit J
Nmar0283 EQ085249 f 248,770 249,075 NADH-ubiquinone oxidoreductase, subunit 4L
Nmar0284 EQ085249 f 249,075 250,628 Formate hydrogenlyase, subunit 3/multisubunit Na+/H+ antiporter
Nmar0285 EQ085249 f 250,630 252,708 NADH-ubiquinone oxidoreductase, subunit L
Nmar0286 EQ085249 f 252,721 254,205 NADH-ubiquinone oxidoreductase, subunit N
Nmar0287 EQ085249 r 254,195 255,046 Geranylgeranyl pyrophosphate synthase/geranyltransferase
Nmar0289 EQ085222 r 256,122 256,727 Orotate phosphoribosyltransferase
Nmar0290 EQ085222 f 256,778 257,860 Putative uncharacterized protein
Nmar0291 EQ085222 r 257,852 258,958 Membrane-associated Zn-dependent protease
Nmar0292 EQ085222 f 259,014 259,529 Acetyltransferase
Nmar0294 EQ085222 r 259,787 260,614 Methyltransferase
Nmar0295 EQ085222 r 260,620 262,407 ATPase, RNase L inhibitor
Nmar0352 EQ085222 f 319,896 320,219 Ribosomal protein L30E
Nmar0353 EQ085222 f 320,271 320,735 Transcription elongation factor
Nmar0354 EQ085222 f 320,739 321,176 30S ribosomal protein S12P
Nmar0355 EQ085222 f 321,179 321,778 Ribosomal protein S7


Nmar0356 EQ085222 r 321,937 322,266
Nmar0357 EQ085222 f 322,289 322,993 Putative uncharacterized protein
Nmar0359 EQ085222 r 323,446 324,444 Dehydrogenase
Nmar0367 EQ085222, EP918968 r 330,772 333,639 Leucyl-tRNA synthetase
Nmar0368 EQ085222, EP918968 r 333,646 336,315 Alanyl-tRNA synthetase
Nmar0373 EQ085222 f 340,670 341,179 Putative uncharacterized protein
Nmar0374 EQ085222, EP918968 f 341,230 341,547 Ribosomal protein L12E/L44/L45/RPP1/RPP2
Nmar0375 EP918968 f 341,630 342,262 Putative uncharacterized protein
Nmar0381 EP918968 r 345,558 346,424 Ribosomal protein L10
Nmar0382 EQ085222 r 346,417 347,079 Ribosomal protein L1
Nmar0383 EQ085222 f 347,271 347,708 Transcriptional regulator
Nmar0385 EQ085222, EP918968 r 348,548 349,027 Ribosomal protein L11
Nmar0386 EQ085222, EP918968 r 349,062 349,520 Ribosomal protein L24A
Nmar0387 EQ085222, EP918968 f 349,736 350,512 Uncharacterized conserved archaeal protein
Nmar0388 EQ085222, EP918968 f 350,509 350,808 Methylated DNA-protein cysteine methyltransferase
Nmar0390 EQ085222, EP918968 r 351,250 351,705 Ribosomal protein L19E
Nmar0391 EQ085222, EP918968 r 351,689 352,093 Ribosomal protein L32E
Nmar0392 EQ085222, EP918968 f 352,349 353,848 Phosphoenolpyruvate carboxykinase (ATP)
Nmar0393 EQ085222 r 353,850 355,832 Vpr
Nmar0394 EQ085222 f 355,951 356,571 Superoxide dismutase
Nmar0395 EQ085222 f 356,654 357,091 Prefoldin, molecular chaperone
Nmar0396 EQ085222 f 357,095 358,636 Signal recognition particle GTPase
Nmar0409 EQ086635 r 367,332 368,645 Dehydrogenase
Nmar0410 EQ086635 f 368,734 369,015 Putative uncharacterized protein
Nmar0412 EQ086635 r 369,607 370,575 Lactate dehydrogenase
Nmar0413 EQ086635 f 370,822 372,732 2-oxoglutarate ferredoxin oxidoreductase, alpha subunit
Nmar0414 EQ086635 f 372,722 373,684 2-oxoglutarate ferredoxin oxidoreductase, beta subunit
Nmar0415 EQ086635 f 373,734 374,174 Putative uncharacterized protein
Nmar0416 EQ086635 r 374,167 374,700 Putative uncharacterized protein
Nmar0418 EQ086635 r 375,131 375,700 Putative uncharacterized protein
Nmar0419 EQ085799 r 375,731 376,510 Putative uncharacterized protein
Nmar0420 EQ085799 f 376,584 377,852 3-isopropylmalate dehydratase, large subunit
Nmar0421 EQ085799 f 377,852 378,334 3-isopropylmalate dehydratase, small subunit
Nmar0422 EQ085799 f 378,343 379,758 3-isopropylmalate dehydratase, large subunit
Nmar0423 EQ085799 f 379,760 380,344 3-isopropylmalate dehydratase, small subunit
Nmar0425 EQ085799 r 381,045 381,494 30S ribosomal protein S9P
Nmar0426 EQ085799 r 381,491 381,955 Ribosomal protein L13
Nmar0427 EQ085799 r 381,948 382,295 Ribosomal protein L18E
Nmar0428 EQ085799 r 382,339 383,004 DNA-directed RNA polymerase, alpha subunit
Nmar0430 EQ085799 f 383,542 384,228 Exosome complex subunit
Nmar0431 EQ085799 f 384,228 384,908 Exosome complex RNA-binding protein
Nmar0432 EQ085799 f 384,908 385,642 RNase PH
Nmar0433 EQ085799 f 385,645 386,463 RNase PH-related exoribonuclease
Nmar0435 EQ085799 f 386,661 386,906


Nmar0436 EQ085799 f 386,930 387,319 Prefoldin, chaperonin cofactor
Nmar0437 EQ085799 f 387,372 388,058 Helicase-associated endonuclease for fork-structured DNA
Nmar0439 EQ085799 r 388,545 389,024 Nucleic acid-binding protein
Nmar0440 EQ085799 r 389,014 389,850 Inorganic polyphosphate/ATP-NAD kinase
Nmar0443 EQ085799 f 391,108 392,055 Ferredoxin
Nmar0444 EQ085799 f 392,111 394,177 Glycosyltransferase
Nmar0445 EQ085799 r 394,174 395,703 Methyl-accepting chemotaxis protein
Nmar0446 EQ085799 f 396,135 397,460 Kef-type K+ transport system, membrane component
Nmar0447 EQ085799 f 397,453 397,881 Putative uncharacterized protein
Nmar0520 EQ085799 r 464,980 465,363 Ribosomal protein S8E
Nmar0521 EQ085799 r 465,432 466,010 Transcriptional regulator containing HTH domain
Nmar0523 EQ085799 f 466,837 467,889 Zn-dependent alcohol dehydrogenase
Nmar0526 EQ086555 f 469,360 469,671 Translation initiation factor 1
Nmar0528 EQ085799 f 470,238 470,846 Putative uncharacterized protein
Nmar0529 EQ085799 f 470,881 471,291 Translational elongation factor P (EF-P)/translation initiation factor 5A
Nmar0530 EQ085799 f 471,402 472,091 ATPase of the PP-loop superfamily
Nmar0531 EQ085799 f 472,084 473,412 Signal recognition particle GTPase
Nmar0532 EQ086635 f 473,435 474,580 Pseudouridylate synthase
Nmar0533 EQ086635 f 474,754 475,464 Putative uncharacterized protein
Nmar0535 EQ086635 r 475,650 476,093 Putative uncharacterized protein
Nmar0536 EQ086635 f 476,150 476,710 Putative uncharacterized protein
Nmar0537 EQ086635 r 476,694 477,980 Phosphoglycerate mutase
Nmar0538 EQ086635 f 478,209 479,225 Galactose-1-phosphate uridylyltransferase
Nmar0539 EQ086635 f 479,268 479,714 Putative uncharacterized protein
Nmar0562 EQ086635 f 503,038 503,463
Nmar0578 EQ085355 r 513,299 514,888 Lysyl-tRNA synthetase
Nmar0579 EQ085355 r 514,892 516,475 Histone acetyltransferase
Nmar0581 EQ085355 r 516,887 518,137 Kef-type K+ transport system, membrane component
Nmar0582 EQ085355 r 518,212 519,234 Phosphoribosylformylglycinamidine cyclo-ligase
Nmar0583 EQ085355 r 519,284 520,009 Putative uncharacterized protein
Nmar0585 EQ085355 r 522,090 522,437
Nmar0590 EQ085355 f 525,903 526,793 Ca2+/Na+ antiporter
Nmar0592 EQ085355, EQ086504 r 528,167 529,426 Thiamine biosynthesis enzyme
Nmar0593 EQ086504 r 529,423 530,595 Thiamine biosynthesis enzyme
Nmar0594 EQ086504 r 530,634 531,773 Trypsin-like serine protease
Nmar0595 EQ086504 r 531,812 533,020 Glycosyltransferase involved in cell wall biogenesis
Nmar0596 EQ086504 r 533,024 533,440 Putative uncharacterized protein
Nmar0597 EQ086504 f 533,768 534,178 Methionyl-tRNA synthetase
Nmar0598 EQ086504 r 534,324 534,884 TATA-box binding protein (TBP), component of TFIID and TFIIIB
Nmar0599 EQ086504 f 535,023 536,144 Uncharacterized conserved archaeal protein
Nmar0601 EQ086504 r 536,325 537,218 Methionine aminopeptidase
Nmar0602 EQ086504 f 537,506 537,751 Zn-ribbon protein
Nmar0616 EQ086504 f 554,316 557,606 Putative uncharacterized protein
Nmar0627 EQ086546 f 567,675 569,051 Putative uncharacterized protein


Nmar0652 EQ086546 r 591,731 592,699 ABC-type oligopeptide transport system, ATPase component
Nmar0653 EQ086546 f 592,793 593,812 Putative uncharacterized protein
Nmar0657 EQ086546 r 596,491 597,801 Putative uncharacterized protein
Nmar0674 EQ086597 f 614,251 614,685 Universal stress protein
Nmar0684 EQ086597 r 624,050 625,159 GTPase
Nmar0685 EQ086597 f 625,481 626,176 Hydrolase of the HAD superfamily
Nmar0686 EQ086597 f 626,178 628,064 Arginyl-tRNA synthetase
Nmar0690 EQ086597 f 631,181 632,062 L-isoaspartate methyltransferase/tRNA (1-methyladenosine) methyltransferase
Nmar0691 EQ086597 f 632,107 632,970 Sugar kinase
Nmar0694 EQ086597 f 634,836 635,432 20S proteasome, alpha and beta subunit
Nmar0695 EQ086597 r 635,442 635,918 RNA-binding protein
Nmar0696 EQ086597 r 635,915 636,925 Homoserine dehydrogenase
Nmar0697 EQ086597 r 637,018 637,425 Putative uncharacterized protein
Nmar0698 EQ086597 r 637,462 639,099 Thymidylate synthase
Nmar0699 EQ086597 r 639,111 639,404 Putative uncharacterized protein
Nmar0700 EQ086597 f 639,583 640,416 Putative uncharacterized protein
Nmar0701 EQ086597 f 640,471 642,387 acetate-CoA ligase
Nmar0703 EQ086597 r 642,950 643,972 5'-3' exonuclease
Nmar0788 EQ086597 r 703,382 703,780 Bacterioferritin comigratory protein
Nmar0795 EQ086466 r 708,081 708,473 Ribosomal protein S8
Nmar0797 EQ086466 r 708,659 709,180 Ribosomal protein L5
Nmar0798 EQ086466 r 709,185 709,901 Ribosomal protein S4E
Nmar0804 EQ086466 r 711,620 712,375 30S ribosomal protein S3P
Nmar0805 EQ086466 r 712,378 712,836 Ribosomal protein L22
Nmar0806 EQ086466 r 712,842 713,237 Ribosomal protein S19
Nmar0808 EQ086466 r 713,528 714,343 Ribosomal protein L4
Nmar0809 EQ086466 r 714,340 715,335 50S ribosomal protein L3
Nmar0810 EQ086466 r 715,541 716,353 Putative uncharacterized protein
Nmar0811 EQ086466 r 716,385 717,122 20S proteasome, alpha and beta subunit
Nmar0813 EQ086466 f 717,819 718,160 Putative uncharacterized protein
Nmar0816 EQ086466 r 720,467 721,117 Conserved protein implicated in secretion
Nmar0817 EQ086466 r 721,209 722,318 Thiol-disulfide isomerase
Nmar0818 EQ086466 r 722,323 723,057 Cytochrome c biogenesis protein
Nmar0839 EQ086635 f 741,001 741,828 Putative uncharacterized protein
Nmar0889 EP710054 f 780,605 780,922
Nmar0934 EQ086597 r 815,397 815,813 Transcriptional regulator
Nmar0935 EQ086597 f 815,847 816,866 Asparagine synthase (glutamine-hydrolyzing)
Nmar0937 EQ086597 r 817,489 818,589 Putative uncharacterized protein
Nmar0938 EQ086597 r 818,620 819,510 1,4-dihydroxy-2-naphthoate octaprenyltransferase
Nmar0939 EQ086597 r 819,822 820,223 Putative uncharacterized protein
Nmar0940 EQ086597 f 820,283 820,573
Nmar0962 EQ086597 r 840,580 841,332 Putative uncharacterized protein
Nmar0963 EQ086597, EP710054 r 841,334 842,989 Methionyl-tRNA synthetase
Nmar0964 EQ086597, EP710054 r 843,027 844,268 Adenosylhomocysteinase
Nmar0965 EP710054 f 844,342 844,644 Dioxygenase ferredoxin protein
Nmar0966 EP710054 f 844,775 845,488 NH3-dependent NAD+ synthetase
Nmar0967 EP710054 f 845,485 846,873 Cysteinyl-tRNA synthetase
Nmar0980 EP710054 r 857,503 585,714 GTPase
Nmar0981 EP710054 r 858,749 859,114 Putative uncharacterized protein


Nmar0982 EP710054 r 859,116 859,526 Coenzyme A-binding protein
Nmar0983 EP710054 f 859,607 860,128 Putative uncharacterized protein
Nmar0984 EP710054 f 860,165 861,172 Bifunctional protein; biotin repressor
Nmar0985 EP710054 r 861,173 861,535 Putative uncharacterized protein
Nmar0986 EP710054 f 861,646 863,085 Zn-dependent metalloprotease
Nmar0987 EP710054 f 863,158 864,105 Transcription initiation factor TFIIB
Nmar0989 EP710054 r 864,552 864,944 Putative uncharacterized protein
Nmar0990 EP710054 f 864,998 865,576 Putative uncharacterized protein
Nmar0991 EP710054 f 865,566 866,630 Poly(3-hydroxyalkanoate) synthetase
Nmar0993 EP710054 r 867,093 867,518 Transcriptional regulator
Nmar0994 EP710054 r 867,555 868,400 Acyltransferase
Nmar0996 EP710054 r 869,170 869,568 Hypothetical membrane protein
Nmar1019 EQ086597 f 899,956 900,777 TPR-repeat protein
Nmar1020 EQ086597 r 900,772 901,476 Double-stranded beta-helix fold enzyme
Nmar1024 EQ086597 f 902,804 803,475 Molecular chaperone
Nmar1025 EQ086597 r 903,476 903,883 HIT superfamily hydrolase
Nmar1028 EQ086597 f 904,421 905,560 3-hydroxyacyl-CoA dehydrogenase
Nmar1030 EQ086597 r 905,845 906,354 Rossmann fold nucleotide-binding protein
Nmar1031 EQ086597 f 906,617 907,027 Multiple Zn-finger protein
Nmar1033 EQ086597 r 907,455 907,763 Ribosomal protein S10
Nmar1034 EQ086597 r 907,775 909,073 Elongation factor 1-alpha
Nmar1035 EQ086597 f 909,195 910,340 Putative uncharacterized protein
Nmar1036 EQ086597 f 910,408 910,815 Secreted protein
Nmar1037 EQ086597 f 910,852 912,618 DNA ligase
Nmar1038 EQ086597 f 912,685 913,338 Coil protein
Nmar1039 EQ086597 f 913,430 913,966 tRNA intron endonuclease
Nmar1041 EQ086597 r 914,012 916,327 Putative uncharacterized protein
Nmar1043 EQ086597 f 917,089 918,855 Rossmann-fold oxidoreductase
Nmar1044 EQ086597 f 918,858 919,307
Nmar1048 EQ086597 r 924,387 925,160 Permease
Nmar1057 EQ086597 f 932,353 932,916 3-methyladenine DNA glycosylase
Nmar1058 EQ086597 r 932,913 933,545 Uncharacterized membrane-associated protein
Nmar1066 EQ086498 f 938,826 940,112 Aspartyl-/asparaginyl-tRNA synthetase
Nmar1069 EQ086498 r 940,960 941,973 Isocitrate/isopropylmalate dehydrogenase
Nmar1070 EQ086498 r 942,007 943,524 Isopropylmalate dehydrogenase
Nmar1071 EQ086498 r 943,521 944,006 Acetolactate synthase
Nmar1072 EQ086498 r 944,015 945,715 Acetolactate synthase
Nmar1077 EQ086498 f 981,989 982,393
Nmar1078 EQ086498 r 982,395 983,390 DhnA-type fructose-1,6-bisphosphate aldolase
Nmar1079 EQ086498 r 983,428 984,456 L-iditol 2-dehydrogenase/threonine dehydrogenase
Nmar1080 EQ086498 r 984,493 984,786 Putative uncharacterized protein
Nmar1081 EQ086498 f 984,872 985,174 Putative uncharacterized protein
Nmar1083 EQ086498 r 985,476 985,778 Putative uncharacterized protein
Nmar1084 EQ086498 f 985,813 986,562 ICC-like phosphoesterase
Nmar1085 EQ086498 r 986,566 989,808 Carbamyl phosphate synthetase
Nmar1086 EQ086498 r 989,810 990,964 Carbamoylphosphate synthase, small subunit

Nmar1088 EQ086498 r 992,659 993,852 AAA ATPase
Nmar1092 EQ086498 f 996,256 996,711 Translation initiation inhibitor
Nmar1098 EQ086498 r 1,000,832 1,001,974 Single-stranded DNA-specific exonuclease
Nmar1099 EQ086498 f 1,002,328 1,003,491 Citrate synthase
Nmar1101 EQ086498 f 1,003,778 1,004,125
Nmar1103 EQ086498 r 1,005,416 1,007,002 Excinuclease nuclease subunit
Nmar1104 EQ086498 r 1,006,999 1,009,818 Excinuclease ATPase subunit
Nmar1105 EQ086498 r 1,009,805 1,011,757 UvrABC system protein B
Nmar1106 EQ086498 r 1,011,807 1,012,457 Flavin-nucleotide-binding protein
Nmar1109 EQ086498 r 1,013,628 1,014,734 Flavin-dependent oxidoreductase
Nmar1110 EQ086498 r 1,014,829 1,015,773 Alcohol dehydrogenase
Nmar1111 EQ086498 f 1,015,826 1,017,103 Hydroxypyruvate reductase
Nmar1113 EQ086498 r 1,017,395 1,018,360 Exonuclease of the beta-lactamase fold

Nmar1117 EQ086498 f 1,019,750 1,020,613 5,10-methylene-tetrahydrofolate dehydrogenase/methenyltetrahydrofolase cyclohydrolase

Nmar1119 EQ086498 r 1,021,157 1,022,422 Phosphoribosylamine-glycine ligase Nmar1120 EQ086498 f 1,022,506 1,025,703 Isoleucyl-tRNA synthetase Nmar1122 EQ086498 f 1,026,579 1,026,908 Putative uncharacterized protein Nmar1123 EQ086498 f 1,026,949 1,028,055 Succinyl-CoA synthetase, beta subunit Nmar1124 EQ086498 f 1,028,052 1,028,969 Succinyl-CoA synthetase, alpha subunit Nmar1254 EQ086682 r 1,150,749 1,151,213 Ribosomal protein L15E

Nmar1255 EQ086682 f 1,151,421 1,152,080 Secreted periplasmic Zn-dependent protease

Nmar1257 EQ086682 r 1,152,589 1,153,482 Putative uncharacterized protein Nmar1258 EQ086682 r 1,153,583 1,154,515 Phosphoglycerate dehydrogenase

Nmar1259 EQ086682 r 1,154,583 1,155,995 Putative copper-containing nitrite reductase

Nmar1260 EQ086682 f 1,156,106 1,157,467 Fumarate hydratase/fumarase Nmar1261 EQ086682 f 1,157,471 1,158,739 2-methylthioadenine synthetase Nmar1262 EQ086682 f 1,158,813 1,159,763 Cell division GTPase Nmar1263 EQ086682 r 11,589,800 1,160,363 Ribosomal protein L31E Nmar1265 EQ086682 r 1,160,560 1,161,111 NTP pyrophosphohydrolase Nmar1266 EQ086682 f 1,161,181 1,161,918 Putative uncharacterized protein

Nmar1267 EQ086682 r 1,161,915 1,164,413 Methionine synthase I, cobalamin-binding domain

Nmar1268 EQ086682 r 1,164,446 1,165,408 Methionine synthase I (cobalamin-dependent), methyltransferase domain

Nmar1269 EQ086682 f 1,165,474 1,166,622 Permease

Nmar1283 EQ086682 r 1,175,356 1,176,114 ABC-type nitrate/sulfonate/bicarbonate transport system, permease component

Nmar1284 EQ086682 r 1,176,120 1,176,884 ABC-type nitrate/sulfonate/bicarbonate transport system, ATPase component

Nmar1285 EQ086682 r 1,176,884 1,177,909 ABC-type nitrate/sulfonate/bicarbonate transport system, periplasmic component

Nmar1286 EQ086682 f 1,178,045 1,179,244 Argininosuccinate synthase

Nmar1288 EQ086682 f 1,179,408 1,180,265 Glutathione synthase/ribosomal protein S6 modification enzyme

Nmar1289 EQ086682 f 1,180,294 1,181,340 N-acetyl-gamma-glutamyl-phosphate reductase

Nmar1291 EQ086682 f 1,182,139 1,183,320 Pyridoxal-phosphate-dependent aminotransferase

Nmar1292 EQ086682 f 1,183,307 1,183,726 Transcriptional regulator

Nmar1293 EQ086682 f 1,183,825 1,185,006 2-isopropylmalate synthase

Nmar1295 EQ086682 f 1,185,192 1,186,034 Glutathione synthase/ribosomal protein S6 modification enzyme

Nmar1296 EQ086682 f 1,186,038 1,187,165 Acetylornithine deacetylase/succinyl-diaminopimelate desuccinylase

Nmar1297 EQ086682 f 1,187,165 1,188,202 Diphthine synthase

Nmar1298 EQ086682 f 1,188,243 1,188,863 Cytidylyltransferase

Nmar1301 EQ086682 r 1,191,274 1,191,957 Transcriptional regulator of a riboflavin/FAD biosynthetic operon

Nmar1302 EQ086682 r 1,191,966 1,193,108 Bacterial-type DNA primase

Nmar1303 EQ086682 r 1,193,261 1,193,611 Putative copper-containing nitrite reductase

Nmar1308 EQ086682 r 1,195,786 1,196,547 Enoyl-CoA hydratase/carnitine racemase

Nmar1309 EQ086682 r 1,196,584 1,198,701 Acyl-CoA synthetase

Nmar1312 EQ086682 r 1,199,847 1,201,121 Glutamate/leucine dehydrogenase

Nmar1314 EQ086682 f 1,201,690 1,202,412 20S proteasome, alpha and beta subunits

Nmar1315 EQ086682 f 1,202,461 1,203,153 Methylase involved in ubiquinone/menaquinone biosynthesis

Nmar1316 EQ086682 f 120,150 1,204,304 Permease of the major facilitator superfamily

Nmar1321 EQ086635 f 1,208,508 1,209,338 Short-chain alcohol dehydrogenase

Nmar1322 EQ086635 f 1,209,416 1,211,071 DNA topoisomerase IB

Nmar1360 EQ086597 f 1,246,835 1,247,611 Putative uncharacterized protein

Nmar1378 EQ085799 r 1,263,279 1,264,037 Metallophosphoesterase

Nmar1379 EQ086635 f 1,264,147 1,265,178 Isocitrate/isopropylmalate dehydrogenase

Nmar1380 EQ086635 r 1,265,179 1,265,715 Methylase of polypeptide chain release factor

Nmar1381 EQ086635 r 1,265,690 1,266,394 Dimethyladenosine transferase (rRNA methylation)

Nmar1382 EQ086635 r 1,266,391 1,266,957 RNA-binding protein

Nmar1383 EQ086635 r 1,266,978 1,267,304 Putative uncharacterized protein

Nmar1384 EQ086635 r 1,267,307 1,267,606 Ribosomal protein L21E

Nmar1386 EQ086635 r 1,268,792 1,269,958 DNA repair and recombination protein, radA

Nmar1388 EQ086635 r 1,270,263 1,271,987 Fe-S oxidoreductase

Nmar1389 EQ086635 r 1,272,039 1,272,851 2-polyprenylphenol hydroxylase

Nmar1390 EQ086635 r 1,272,835 1,273,746 Dihydroorotate dehydrogenase

Nmar1391 EQ086635 f 1,273,832 1,275,244 Ribosomal biogenesis protein/nucleolar protein

Nmar1392 EQ086635 f 1,275,231 1,275,905 Fibrillarin-like rRNA methylase

Nmar1393 EQ086635 r 1,275,902 1,276,519 Ribonuclease

Nmar1394 EQ086635 f 1,276,580 1,277,728 N2,N2-dimethylguanosine tRNA methyltransferase

Nmar1395 EQ086635 r 1,277,718 1,278,311 rRNA large subunit methyltransferase J

Nmar1396 EQ086635 r 1,278,308 1,279,183 Transcriptional regulator

Nmar1397 EQ086635 r 1,279,233 1,279,709

Nmar1398 EQ086635 f 1,279,813 1,281,906 ATP-dependent endonuclease

Nmar1407 EQ086555 r 1,287,346 1,288,458 DNA topoisomerase VI, subunit A

Nmar1408 EQ086555 r 1,288,445 1,290,325 DNA topoisomerase VI, subunit B

Nmar1409 EQ086555 r 1,290,312 1,290,884 RNA-binding protein

Nmar1410 EQ086555 r 1,290,881 1,291,660 Serine/threonine protein kinase

Nmar1411 EQ086555 r 1,291,717 1,292,013 Putative uncharacterized protein

Nmar1412 EQ086555 r 1,292,013 1,292,435 Translation initiation factor 2, beta subunit/eIF-5 N-terminal domain

Nmar1414 EQ086555 f 1,292,904 1,293,401 Putative uncharacterized protein

Nmar1415 EQ086555 f 1,293,472 1,294,617 Membrane protein

Nmar1416 EQ086555 f 1,294,697 1,295,488 Undecaprenyl pyrophosphate synthase

Nmar1417 EQ086555 f 1,295,485 1,295,904 Diadenosine 5',5'''-P1,P4-tetraphosphate pyrophosphohydrolase

Nmar1418 EQ086555 r 1,295,884 1,296,624 Orotidine-5'-phosphate decarboxylase

Nmar1420 EQ086555 r 1,297,338 1,298,339 Adenylosuccinate synthetase

Nmar1421 EQ086555 f 1,298,420 1,298,881 Putative uncharacterized protein

Nmar1425 EQ086555 f 1,300,073 1,300,714 Putative uncharacterized protein

Nmar1447 EQ084738 f 1,317,902 1,319,200 Phosphomannomutase

Nmar1448 EQ084738 f 1,319,184 1,320,116 Thiamine monophosphate kinase

Nmar1450 EQ084738 f 1,320,746 1,321,072 30S ribosomal protein S11P

Nmar1451 EQ084738 r 1,321,069 1,321,377 Putative uncharacterized protein

Nmar1452 EQ084738 r 1,321,377 1,321,781 Putative uncharacterized protein

Nmar1453 EQ084738 r 1,321,778 1,323,226 Putative uncharacterized protein

Nmar1455 EQ084738 f 1,323,582 1,324,943 Adenylosuccinate lyase

Nmar1466 EQ084738 r 1,333,685 1,334,440 Asparagine synthase (glutamine-hydrolyzing)

Nmar1468 EQ084738 f 1,335,364 1,335,639 Pterin-4a-carbinolamine dehydratase

Nmar1477 EQ084738 r 1,345,206 1,347,521 Valyl-tRNA synthetase

Nmar1479 EQ084738 r 1,348,150 1,349,118 Pyridoxine/pyridoxal 5-phosphate biosynthesis enzyme

Nmar1480 EQ084738 r 1,349,149 1,349,682 Putative uncharacterized protein

Nmar1482 EQ084738 r 1,350,892 1,353,156 3-isopropylmalate isomerase/aconitase A

Nmar1483 EQ084738 f 1,353,303 1,353,932 Putative uncharacterized protein

Nmar1484 EQ084738 r 1,353,935 1,354,501 Putative uncharacterized protein

Nmar1486 EQ084738 r 1,354,803 1,355,222 Putative uncharacterized protein

Nmar1488 EQ084738 r 1,355,702 1,357,345 Phenylalanyl-tRNA synthetase, beta subunit

Nmar1489 EQ084738 r 1,357,336 1,358,724 Phenylalanyl-tRNA synthetase, alpha subunit

Nmar1490 EQ084738 f 1,358,772 1,359,884 Tryptophanyl-tRNA synthetase

Nmar1491 EQ084738 f 1,359,881 1,360,162

Nmar1493 EQ084738 r 1,361,392 1,361,721 Putative uncharacterized protein

Nmar1494 EQ084738 r 1,361,774 1,362,250 Pantetheine-phosphate nucleotidyltransferase

Nmar1496 EQ084738 f 1,363,804 1,364,358 Thiol-disulfide isomerase

Nmar1592 EP710054 f 1,456,838 1,457,221 Putative uncharacterized protein

Nmar1610 EQ086498 r 1,469,218 1,469,775 Putative uncharacterized protein

Nmar1618 EQ086597 r 1,474,520 1,475,245 TPR repeat protein

Nmar1640 EQ086546 r 1,497,802 1,499,169 Putative uncharacterized protein

Nmar1641 EQ086546 r 1,499,144 1,500,094 BatA

Nmar1642 EQ086546 r 1,500,095 1,500,967 Putative uncharacterized protein

Nmar1643 EQ086546 r 1,500,977 1,501,999 MoxR-like ATPase

Nmar1644 EQ086682, EP710054 r 1,502,127 1,503,287 Secreted Zn-dependent protease containing TPR repeats

Nmar1650 EQ086498 f 1,508,803 1,509,615 Copper binding protein, plastocyanin/azurin family

