STRUCTURAL INSIGHT INTO MICROBIAL NATURAL PRODUCT PATHWAYS

By

KUNHUA LI

A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT

OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA

2016

© 2016 Kunhua Li

To my Family

ACKNOWLEDGMENTS

Graduate school has been a wonderful time for me. There are so many people

whom I wish to thank for their help during my PhD study at the University of Florida. I would most

certainly not be writing this dissertation without all the support.

I would like to offer my sincere thanks to my advisor and committee chair, Prof.

Steven Bruner. Steve, I consider myself to be extremely lucky to be a member of the

Bruner lab. I enjoy your mentorship style. Your patience and trust in me have fostered

me as an independent researcher and allowed me to accomplish things that I often

thought I would not be able to do. You have been, and will remain, a role model for me.

I believe our supervisor-student relationship will continue to be a lifelong intellectual

friendship.

I also thank my other committee members Prof. Rebecca Butcher, Prof. Nicole

Horenstein, Prof. Robert McKenna, and Prof. Wei Wei. I greatly appreciate your time

reviewing this work. To Prof. Horenstein, you have been a source of encouragement,

always willing to discuss science, teaching, academic life and career with me.

I would like to thank Prof. McKenna for all the insightful discussions and suggestions on

X-ray protein crystallography.

I would like to thank all our collaborators. I would like to thank Prof. Butcher for the

AcoX project and Prof. Adrian Roitberg for the DpgC project. I wish I had enough room to

include these interesting and promising projects in my dissertation. To Prof. Andrew

Hansen, I appreciate your help on the MmuM project. You went through

the HMT manuscript with me line by line, which greatly improved my writing. I also appreciate

your helpful discussions on other metabolite repair projects. To Prof. Yousong

Ding, I enjoyed the collaborations on Y3 and other projects. And personally, I am

exceedingly grateful for your guidance and recommendations during my job hunt.

Lastly, I would like to thank the staff at Brookhaven National Laboratory and

Argonne National Laboratory for X-ray access, without whom I could hardly have finished any of

my projects.

To my fellow colleagues and friends, I would like to thank Dr. Heather Condurso. I came

here as one of the first Bruner Lab members at UF, and you certainly helped us a lot,

like a big sister. Now, as a senior graduate student myself, I can really appreciate your

kindness and patience back in the day. Dr. Jarrod Mousa, Wei-Hung Chen, Aleksandra

Zagulyaeva, Matthew Burg, Brian MacTavish and other Bruner members, you provided

me with advice and friendship that have made my graduate experience (and my

bachelor party!) wonderful. Gengnan Li, it has been my great pleasure to work with you, and

thank you for your help in several projects. To Dr. Rachel Jones and Dr. Zhanglong Liu,

you provided me with valuable advice and suggestions on my career. To Xinxing Zhang

of the Butcher Lab and Dr. Guang Yang, Peilan Zhang and Yi Zhang of the Ding Lab, I

enjoyed working with you over the past years.

My family is my lifelong support. My father Jinfu Li and mother Peiyu Zhang

encouraged me to come to the States for graduate school, and I have never regretted it.

My wife Jingwen Liu, you know I will say a thousand thanks for all your support and

understanding. I am so lucky to have you in my life. The work I present in the following

pages is dedicated to them.

I would like to acknowledge all the funding support for my projects, including grants

from NSF, NIH and UF (to S.D.B.). I am grateful for the assistantship from UF Chemistry. The UF

CLAS Dissertation Fellowship also supported this dissertation.

TABLE OF CONTENTS

page

ACKNOWLEDGMENTS .................................................................................................. 4

LIST OF TABLES .......................................................................................................... 11

LIST OF FIGURES ........................................................................................................ 12

LIST OF ABBREVIATIONS ........................................................................................... 15

ABSTRACT ................................................................................................................... 20

CHAPTER

1 MICROBIAL SIDEROPHORE-BASED IRON ACQUISITION AND MEDICINAL APPLICATION ........................................................................................................ 22

1.1 Introduction of Microbial Siderophores .............................................................. 22

1.2 General Approaches of Siderophore Biosynthesis ........................................... 23

1.3 Siderophore Secretion ...................................................................................... 26

1.4 Holo-Siderophore Acquisition ........................................................................... 28

1.5 Siderophore Utilization ...................................................................................... 30

1.6 Siderophores as Therapeutical Drug Leads ...................................................... 31

1.7 Targeting Siderophore Pathways ...................................................................... 34

1.8 Conclusions and Insights .................................................................................. 35

2 STRUCTURE AND MECHANISM OF THE SIDEROPHORE INTERACTING PROTEIN FROM THE FUSCACHELIN GENE CLUSTER OF Thermobifida fusca ....................................................................................................................... 37

2.1 Siderophore Utilization Pathway in Thermobifida fusca .................................... 37

2.2 FscN is a Flavin Containing Siderophore Interaction Protein ............................ 40

2.2.1 FscN Contains a Non-covalently Bound FAD Cofactor ........................... 40

2.2.2 The Siderophore Utilization Protein FscN Binds Ferric-Fuscachelin ....... 41

2.2.3 FscN Specifically Reduces Ferric Fuscachelin ........................................ 42

2.2.4 Structure Determination with X-ray Crystallography ................................ 43

2.2.5 Proposed NADH Cofactor Binding Site in FscN ...................................... 47

2.2.6 Metal Binding Site Adjacent to the Flavin Cofactor .................................. 48

2.3 Structure of FscN Facilitate the Understanding of Siderophore Pathway ......... 49

2.4 Experimental Procedures .................................................................................. 57

2.4.1 General Methods ..................................................................................... 57

2.4.2 Cloning, Expression, and Purification of FscN ......................................... 57

2.4.3 FAD Determination, Reduction and Enzyme Kinetics.............................. 59

2.4.4 Purification of the Ferric-Fuscachelin Complex ....................................... 60

2.4.5 Purification Ferric-Enterobactin ............................................................... 60

2.4.6 Metal Analysis ......................................................................................... 61

2.4.7 Isothermal Titration Calorimetry Binding Determination .......................... 62

2.4.8 Coexpression of FscN-FscP and Western Blot Analysis ......................... 62

2.4.9 Crystallography, Structure Determination, and Refinement ..................... 63

3 STRUCTURE AND FUNCTIONAL ANALYSIS OF THE SIDEROPHORE PERIPLASMIC BINDING PROTEIN FROM THE FUSCACHELIN GENE CLUSTER OF Thermobifida fusca .......................................................................... 65

3.1 Periplasmic Binding Protein Facilitates Siderophore Delivery ........................... 65

3.2 Characterization of FscJ as a Siderophore Binding Protein .............................. 67

3.2.1 FscJ and Fuscachelin Delivery ................................................................ 67

3.2.2 FscJ Structure Determination with X-ray Crystallography ....................... 69

3.2.3 FscJ Ferric-Siderophores Interaction ....................................................... 72

3.3 Structural Features and Siderophore Interaction Mechanism ........................... 74

3.3.1 FscJ Structural Comparison with Other Type III PBPs ............................ 74

3.3.2 pH-Dependent Dynamics Observed in Crystal Structures ....................... 75

3.3.3 FscJ Has a Large Binding Pocket with Unique Charge Arrangement ..... 76

3.3.4 Conclusions and Insights ......................................................................... 79

3.4 Materials and Methods ...................................................................................... 79

3.4.1 Cloning, Expression and Purification of FscJ .......................................... 79

3.4.2 Cloning of Full Length FscJ, Expression and Localization ....................... 81

3.4.3 Isothermal Titration Calorimetry Binding Affinity Determination ............... 81

3.4.4 FscJ Crystallization and Optimization ...................................................... 82

3.4.5 FscJ Atomic Structure Determination and Refinement ............................ 83

3.4.6 Detailed FscJ SAD-MR Based Experimental Phasing ............................. 84

3.4.7 Modeling the FscJ/Ferric-Siderophore Interaction ................................... 85

3.4.8 FscJ Siderophore Binding Site Mutagenesis ........................................... 85

4 PRECURSOR PROTEIN-DIRECTED PEPTIDE MACROCYCLIZATION IN A RIBOSOMAL PEPTIDE NATURAL PRODUCT BIOSYNTHETIC PATHWAY ........ 87

4.1 RiPPs Biosynthesis and Microviridin Biosynthetic Pathway .............................. 87

4.2 Leader Protein-Directed Microviridin J Biosynthesis ......................................... 93

4.2.1 MdnB and MdnC Catalyze MdnA Cyclization .......................................... 93

4.2.2 MdnA Leader Peptide Interacts with MdnB and MdnC Macrocyclases ... 94

4.2.3 Overall Structure of MdnC and MdnB ...................................................... 98

4.2.4 Key Residues for Nucleotide Binding and Catalysis .............................. 105

4.2.5 Recognition of MdnA Leader Peptide by MdnC ..................................... 106

4.3 Structures of MdnC and MdnB Facilitate to Understand the Precursor Recognition in RiPP Pathway ............................................................................ 109

4.4 Materials and Methods ................................................................................... 115

4.4.1 Protein Cloning, Expression and Purification ......................................... 115

4.4.2 MdnC Mutagenesis................................................................................ 117

4.4.3 Designed MdnA Variants ....................................................................... 117

4.4.4 Macrocyclization of MdnA Variants........................................................ 117

4.4.5 MdnC-MdnA Dicyclization Kinetics ........................................................ 117

4.4.6 Leader Peptide Binding Affinity Determination ...................................... 118

4.4.7 Preparation of SeMet Labelled Protein .................................................. 118

4.4.8 Crystallization, Data Collection and Crystallographic Analysis .............. 119

5 CRYSTAL STRUCTURES OF Y3 IN FUNGUS Coprinus comatus REVEAL A NOVEL LECTIN FAMILY WITH ANTIVIRAL AND ANTITUMOR ACTIVITIES ..... 122

5.1 Introduction of Lectins, and Proteinous Natural Product Y3 ............................ 122

5.2 Characterization of Y3 as a Novel Lectin ........................................................ 124

5.2.1 Genetic and Biochemical Characterization of Y3 .................................. 124

5.2.2 tNCS Complicates the Structure Determination .................................... 128

5.2.3 Structure of Y3 Reveals a Novel Lectin Family ..................................... 129

5.2.4 Carbohydrates Binding Site ................................................................... 131

5.2.5 Y3 with Antitumor and Antiviral Activities ............................................... 133

5.3 Discussions and Insights ................................................................................ 134

5.4 Materials and Methods .................................................................................... 138

5.4.1 Heterologous Expression of Mature Y3 ................................................. 138

5.4.2 Ellman's Test for Free Thiol Determination............................................ 139

5.4.3 Carbohydrate Content Determination .................................................... 139

5.4.4 Carbohydrates Specificity Determination with ITC ................................ 140

5.4.5 Protein Mass Analysis ........................................................................... 140

5.4.6 Crystallization of Y3 ............................................................................... 140

5.4.7 Diffraction Data Collection and Processing ........................................... 141

5.4.8 Atomic Structure Determination and Refinement .................................. 142

5.4.9 Models of Oligosaccharides Bound Y3 .................................................. 143

5.4.10 MTT Assay .......................................................................................... 143

6 HOMOCYSTEINE METHYLTRANSFERASE MmuM FROM Escherichia coli FACILITATES L-METHIONINE BIOSYNTHESIS AND DAMAGED COFACTOR (R,S)-S-ADENOSYL-L-METHIONINE REPAIR .................................................... 145

6.1 Metabolite Damage is an Under-Recognized Fact of Life ............................... 145

6.2 Homocysteine S-Methyltransferase and Metabolite Repair ............................ 145

6.3 HMT MmuM in E. coli ...................................................................................... 148

6.3.1 Biological Functions and Distribution of bacteria HMT .......................... 148

6.3.2 Overall Structure of MmuM in E. coli ..................................................... 150

6.3.3 The Zn2+-binding, Active Site ................................................................. 153

6.3.4 Modeling MmuM Substrate Recognition and Binding ............................ 157

6.3.5 Specific Potassium Ion Requirement for MmuM Activity ....................... 161

6.4 Conclusions and Insights ................................................................................ 161

6.5 Materials and Methods .................................................................................... 162

6.5.1 Chemicals .............................................................................................. 162

6.5.2 Cloning, Expression, and Purification of MmuM .................................... 162

6.5.3 MmuM Crystallization ............................................................................ 163

6.5.4 Data Collection and Processing, and Structure Refinement .................. 164

6.5.5 Models of Bound Methyl-Donors and Analysis ...................................... 165

6.5.6 MmuM Active Site Mutagenesis ............................................................ 165

6.5.7 ITC-Based Activity Assay ...................................................................... 166

6.5.8 Determination of Protein Secondary Structure ...................................... 166

LIST OF REFERENCES ............................................................................................. 168

BIOGRAPHICAL SKETCH .......................................................................................... 195

LIST OF TABLES

Table page

2-1 X-ray crystallography statistics of FscN .............................................................. 45

2-2 ICP-MS metal analysis of FscN .......................................................................... 49

2-3 PCR primers used for FscN related cloning ....................................................... 63

3-1 FscJ crystallization data collection and processing ............................................ 68

3-2 Protein structure comparison using the Dali server ............................................ 73

3-3 FscJ MR-SAD phasing and building cycles ........................................................ 84

3-4 Primers for FscJ site-directed mutagenesis ........................................................ 85

4-1 MS profiles of MdnA and MdnA variants ............................................................ 95

4-2 ITC-based precursor peptide/cyclase interaction ................................................ 98

4-3 Data collection and refinement statistics macrocyclases .................................. 102

4-4 Key residues in nucleotide interaction .............................................................. 103

4-5 Oligonucleotides for protein cloning and mutagenesis ..................................... 116

5-1 X-ray data collection, processing and structure refinement of Y3. .................... 127

5-2 Heavy atom derivative crystal preparation of Y3 .............................................. 142

6-1 X-ray data collection, processing and structure refinement of MmuM .............. 149

6-2 Kinetic parameters for HMT MmuM .................................................................. 155

6-3 Docking statistics of metallated MmuM and different methyl donors ................ 158

LIST OF FIGURES

Figure page

1-1 Microbial siderophores with diverse scaffolds..................................................... 23

1-2 Typical siderophore pathway in Gram-negative bacteria .................................... 25

1-3 Examples of siderophore related therapeutic drug leads. ................................... 32

2-1 The fuscachelin siderophore pathway of T. fusca............................................... 38

2-2 Annotated gene list of the T. fusca fuscachelin siderophore gene cluster .......... 39

2-3 Properties of the cofactor flavin bound FscN ...................................................... 41

2-4 ITC analysis of ferric-fuscachelin A binding to FscN ........................................... 42

2-5 FscN reduces ferric ion with the presence of NADH ........................................... 43

2-6 X-ray crystal structure of FscN ........................................................................... 44

2-7 Global multiple-protein sequence alignment of putative SIP proteins ................. 46

2-8 Two views of the flavin binding site .................................................................... 47

2-9 Domain arrangement of the flavoprotein reductase and electron transferase .... 54

2-10 S-Tag and His-tag Western blot for FscP-FscN coexpression............................ 56

3-1 FscJ purification and characterization ................................................................ 69

3-2 Overall structure of FscJ ..................................................................................... 70

3-3 Siderophore binding pockets in the SBP family .................................................. 71

3-4 ITC binding analysis of FscJ against purified ferric-fuscachelin A complex ........ 73

3-5 FscJ sequence alignments with clustalW2 ......................................................... 75

3-6 Molecular docking of FscJ with ferric fuscachelin A ............................................ 78

4-1 Representative precursor peptides and gene clusters in microviridin biosynthetic pathways ........................................................................................ 88

4-2 Biosynthesis of microviridin J.............................................................................. 91

4-3 Alignment of MdnC, MdnB and other reported ATP-grasp ligases ..................... 92

4-4 Expression and purification of MdnB and MdnC as dimeric proteins .................. 93

4-5 Analysis of full-length MdnA cyclizations ............................................................ 94

4-6 Kinetic analysis of the MdnC catalyzed dicyclization .......................................... 94

4-7 ITC profiles of MdnA variants interact with macrocyclases MdnB and MdnC ..... 96

4-8 Overall structures of MdnC and MdnB ................................................................ 97

4-9 Crystallographic packing of MdnC and MdnB ..................................................... 99

4-10 Electrostatics map of MdnC and MdnB dimers ................................................. 100

4-11 MdnC crystallized as a dimer............................................................................ 101

4-12 C-alpha distance difference plotting between MdnC and MdnB protomers. ..... 101

4-13 MdnC uses ATP to catalyze the macrocyclization ............................................ 103

4-14 Characterization of determinants for MdnC catalyzed cyclization..................... 104

4-15 MdnC central domain interacts with the precursor peptide MdnA..................... 104

4-16 Binding and cyclization activity of macrocyclases toward MdnA variants. ........ 107

4-17 MdnA variants and their macrocyclizations ...................................................... 107

4-18 Leader peptide directed peptide macrocyclization in the microviridin J biosynthetic pathway ........................................................................................ 114

5-1 Recombinant Y3 expression and characterization ............................................ 124

5-2 Characterization of purified Y3 ......................................................................... 125

5-3 tNCS complicates the structural determination of Y3 ....................................... 126

5-4 Overall structure of Y3 dimer ............................................................................ 128

5-5 High-resolution crystal structure of Y3 .............................................................. 132

5-6 Y3 interact with carbohydrates in a novel mechanism ...................................... 133

6-1 HMT converts L-homocysteine to L-methionine with unique substrate selectivity .......................................................................................................... 146

6-2 Overall structure of E. coli MmuM ..................................................................... 150

6-3 MmuM apo and metallated form crystals have different crystallographic packing ............................................................................................................. 151

6-4 MmuM belongs to the HMT superfamily of proteins ......................................... 152

6-5 The Zn2+ binding active site of MmuM .............................................................. 154

6-6 MmuM catalyzed kinetics ................................................................................ 155

6-7 Stability of MmuM mutants ............................................................................... 156

6-8 MmuM substrate binding and recognition ......................................................... 157

6-9 Crystallization and optimization of the MmuM .................................................. 163

LIST OF ABBREVIATIONS

A Adenylation Domain

ABC ATP-Binding Cassette

ACN Acetonitrile

ACP Acyl Carrier Protein

AdoHcy S-Adenosyl-L-Homocysteine

AdoMet S-Adenosyl-L-Methionine

ADP Adenosine Diphosphate

AMP Adenosine Monophosphate

Amp Ampicillin

Amx Amoxicillin

ANL Argonne National Laboratory

APS Advanced Photon Source

ASP Ammonium Sulfate Precipitation

AT Acyl Transferase

ATP Adenosine Triphosphate

BHMT Betaine Homocysteine Methyltransferase

C Condensation Domain

CD Circular Dichroism

CHES 2-(Cyclohexylamino)ethanesulfonic Acid

CoA Coenzyme-A

Cy Cyclization Domain

DH Dehydratase

DHB 2,3-Dihydroxybenzoic Acid

DMSO Dimethyl Sulfoxide

DTT Dithiothreitol

E Epimerization Domain

EDTA Ethylenediaminetetraacetic Acid

ESI Electrospray Ionisation

FAD Flavin Adenine Dinucleotide

FPLC Fast Protein Liquid Chromatography

FSR Ferric-Siderophore Reductase

Hcy L-Homocysteine

HEPES 2-[4-(2-Hydroxyethyl)piperazin-1-yl]ethanesulfonic Acid

HMT Homocysteine S-Methyltransferase

HPLC High-Performance Liquid Chromatography

ICP-MS Inductively Coupled Plasma-Mass Spectrometry

IMAC Immobilized Metal Affinity Chromatography

ITC Isothermal Titration Calorimetry

KR Ketoreductase

KS Ketosynthase

LC-MS Liquid Chromatography–Mass Spectrometry

LCP Lipid Cubic Phase

LS-CAT Life Sciences Collaborative Access Team

LSP Lipid Sponge Phase

MAD Multi-Wavelength Anomalous Dispersion

MALDI-TOF Matrix Assisted Laser Desorption Ionization Time-of-Flight

MDR Multidrug Resistance

MES 2-(N-morpholino)ethanesulfonic Acid

MetS Methionine synthase

MFS Major Facilitator Superfamily

MIC Minimum Inhibitory Concentration

MIR Multiple Isomorphous Replacement

MMT Methionine S-Methyltransferase

MPD 2-Methyl-2,4-Pentanediol

MR Molecular Replacement

MS Mass Spectrometry

Mt Methylation Domain

MTHF 5-Methyltetrahydrofolate

NADH Nicotinamide Adenine Dinucleotide (Reduced Form)

NADPH Nicotinamide Adenine Dinucleotide Phosphate (Reduced Form)

NBD Nucleotide-Binding Domains

NIS NRPS-Independent Synthetase

NMR Nuclear Magnetic Resonance

NRP Nonribosomal Peptide

NRPS Nonribosomal Peptide Synthetase

NSF National Science Foundation

Ox Oxidation Domain

PBP Periplasmic Binding Protein

PCP Peptidyl Carrier Protein

PCR Polymerase Chain Reaction

PDB Protein Data Bank

PEG Polyethylene Glycol

PK Polyketide

PKS Polyketide Synthase

RMSD Root-Mean-Square Deviation

RND Resistance, Nodulation, and Cell Division Superfamily

RP Ribosomal Peptide

RPS Ribosomal Peptide Synthetase

RT Room Temperature

S-ribosylMet S-Ribosyl-L-Methionine

SA Salicylic Acid

SAD Single Wavelength Anomalous Dispersion

SIP Siderophore-Interacting Protein

SMM S-Methyl-L-Methionine

TCEP Tris(2-carboxyethyl)phosphine

TE Thioesterase Domain

TFA Trifluoroacetic acid

TIM Triosephosphate Isomerase

Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

STRUCTURAL INSIGHT INTO MICROBIAL NATURAL PRODUCT PATHWAYS

By

Kunhua Li

May 2016

Chair: Steven D. Bruner

Major: Chemistry

Natural products include any substance produced by living organisms and provide enormous structural diversity with unique, biologically relevant functions. Natural products have attracted a surge of interest as sources of new therapeutics and have been found clinically effective for a variety of human ailments. The structures of the enzymes involved in natural product biosynthesis, delivery, utilization, degradation and repair are critical to understanding natural product pathways and thus facilitate the development of relevant drugs. The identification and engineering of natural products and their related pathways also facilitate the building, maintenance and development of natural product libraries. This dissertation focuses on proteins related to natural product pathways, including their functional and structural characterization and engineering.

The first three chapters provide insight into microbial siderophore pathways, specifically the NRPS-based fuscachelin pathway in Thermobifida fusca. Two critical proteins involved in fuscachelin delivery and utilization, FscJ and FscN, have been functionally and structurally characterized. FscN is the first structurally characterized siderophore-interacting protein with associated functions; FscJ is the first periplasmic binding protein responsible for mixed-type siderophore delivery.

The next two chapters describe ribosomal peptides and their biosynthetic pathways. Chapter 4 details microviridin biosynthesis in cyanobacteria. Microviridin inhibits protease activity and is a potential antitumor drug lead. Structural information on the two key enzymes in the microviridin J pathway, MdnC and MdnB, helps explain the unique precursor peptide recognition. Chapter 5 describes a proteinaceous natural product, Y3, from the fungus Coprinus comatus. Y3, which has antiviral and antitumor activities, belongs to a novel lectin family.

The last chapter describes metabolite damage, an under-recognized process that gives rise to useless or toxic byproducts. This chapter provides an example of MmuM-facilitated repair of the damaged metabolite (R,S)-AdoMet.

CHAPTER 1

MICROBIAL SIDEROPHORE-BASED IRON ACQUISITION AND MEDICINAL APPLICATION

1.1 Introduction of Microbial Siderophores

Iron is an essential element for all microorganisms and an important cofactor required for many cellular, metabolic and biosynthetic processes [1,2]. The specific acquisition of iron is critical for a microbe's survival and, frequently, its virulence [3,4]. Under common aerobic, neutral-pH conditions, environmental ferric ion has a low bioavailable concentration (10^-9 to 10^-18 M), which is usually below the level required for optimal microbial growth [5]. As a consequence, bacteria and fungi have evolved diverse strategies to import and utilize iron under iron-deficient conditions [6], including: the direct extracellular reduction of poorly soluble ferric compounds to soluble ferrous iron [7]; the acquisition of iron-bound heme or heme-containing proteins [8]; the removal of iron from host transferrin, lactoferrin or ferritin via specific outer membrane receptors [9]; and the synthesis, secretion and utilization of iron-specific chelators, the ferric-binding siderophores [10].

Siderophores are low molecular weight compounds (400-1200 Da) with extraordinarily high affinity for ferric (less so for ferrous) ion, and tend to use negatively charged oxygen atoms as donors [2,10]. Broadly, siderophores can be classified into three categories according to the ligand moiety that coordinates Fe3+: catecholate/phenolate, hydroxamate/carboxylate and mixed-type, exemplified by enterobactin (1, Escherichia coli) [11], aerobactin (2, E. coli) [12] and mycobactin (3, Mycobacterium tuberculosis) [13], respectively. The different subclasses are typically multidentate ligands and show strong affinity for the higher oxidation state of iron, with association constants of ~10^30 or greater. Siderophores help maintain the intracellular iron concentration between 10^-7 and 10^-5 M for microbial survival and multiplication [5]. In addition to iron, several siderophores can also bind and transport other metals [14].
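To give a sense of scale for the association constants quoted above, the following back-of-the-envelope calculation may help. It assumes a simple 1:1 ferric-siderophore complex FeL, a formation constant K of 10^30 M^-1, and a free ligand concentration of 1 micromolar. The stoichiometry, the units and the ligand concentration are illustrative assumptions, not values taken from this work, and ligand protonation and Fe3+ hydrolysis are ignored.

```latex
K = \frac{[\mathrm{FeL}]}{[\mathrm{Fe^{3+}}]\,[\mathrm{L}]}
\quad\Longrightarrow\quad
\frac{[\mathrm{FeL}]}{[\mathrm{Fe^{3+}}]} = K\,[\mathrm{L}]
= \left(10^{30}\ \mathrm{M^{-1}}\right)\left(10^{-6}\ \mathrm{M}\right) = 10^{24}
```

Under these idealized assumptions, bound ferric ion would outnumber free ferric ion by 24 orders of magnitude. Effective affinities at physiological pH are lower, because protons compete with iron for the ligand donor atoms.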

Figure 1-1. Microbial siderophores with diverse scaffolds.

1.2 General Approaches of Siderophore Biosynthesis

Siderophores are commonly produced in the cytosol or peroxisomes [4,15]. A large variety of siderophores has been discovered in past decades, and structurally distinct siderophores are biosynthesized by two different pathways. Most polypeptide-based siderophores are biosynthesized via well-characterized non-ribosomal peptide synthetase (NRPS) systems. NRPSs are large multi-enzyme complexes responsible for the synthesis of biologically important non-ribosomal peptides [16]. An NRPS, encoded in a gene cluster (or clusters [17]), consists of the common adenylation (A), condensation (C), peptidyl carrier protein (PCP) and thioesterase (TE) domains, along with other specific functional domains including epimerization (E), oxidation (Ox), methylation (Mt), and cyclization (Cy). Additional accessory proteins may also be present in the gene cluster to provide the molecular building blocks for the NRPS machinery [18,19]. NRPS functional modules carry out the steps of monomer selection (A-PCP), modification (E, Ox, Mt), peptidyl chain elongation (C), and cyclization/termination (Cy, TE). Siderophore NRPS assemblies frequently initiate the peptidyl chain with a non-canonical amino acid building block derived from an aryl acid, such as salicylic acid (SA) or 2,3-dihydroxybenzoic acid (DHB), acting as the catecholate chelating moiety. Thiazoline or oxazoline rings, installed by NRPS-embedded Cy/Ox domains, are also common in NRPS-derived siderophores. Post-assembly-line modification of siderophore NRPs is less common; a typical example is the C-glycosylation of enterobactin catalyzed by the glycosyltransferase IroB in salmochelin (4) production [20]. In addition, a suite of marine siderophores has a peptidic head group for ferric chelation as well as a fatty acid tail that varies in length and saturation to tune the overall molecular amphiphilicity [21,22]. Marinobactin (5) is produced by Marinobacter sp. largely by NRPS machinery, with further acylation by saturated and unsaturated C12-C18 fatty acids [23,24]. Comparably, amphi-enterobactin (6) from Vibrio harveyi BAA-1116 also contains a fatty acid moiety. The amphi-enterobactin biosynthetic pathway contains a bifunctional C domain, which can accept aminoacyl-phosphopantetheinyl-PCP as well as acyl-CoAs as substrates [25]. Other recently described novel NRPS siderophore scaffolds include mirubactin (7) from Actinosynnema mirum, which contains an unprecedented hydroxamic acid ester [26].

In contrast to NRPS systems, a number of non-polypeptide bacterial siderophores are constructed by NRPS-independent synthetase (NIS) pathways27,28. NIS siderophores include aerobactin, achromobactin (8, Pseudomonas syringae)29 and desferrioxamine (9, Streptomyces griseus)30. These siderophores are usually assembled from alternating dicarboxylic acid and diamine/amino alcohol building blocks linked by amide or ester bonds. Recent examples of NIS siderophores include baumannoferrin (10), isolated under iron-limiting conditions from an Acinetobacter baumannii AYE strain deficient in acinetobactin (the primary NRPS siderophore)31, and the putrebactins (11), the first identified unsaturated macrocyclic dihydroxamic acid siderophores, which are synthesized in Shewanella putrefaciens by a precursor-directed mechanism32.

Figure 1-2. Typical siderophore pathway in Gram-negative bacteria. The figure was prepared with PyMOL (https://www.pymol.org/) using PDB entries 3WDO, 2XMN, 4DK0, 4C48, 1FI1, 2W76, 2GSK, 2PFU, 5DH0, 2GZR, 4G1U and 4YHB.

1.3 Siderophore Secretion

Once synthesized, apo-siderophores are secreted into the medium to scavenge iron4. Siderophore secretion is one of the least understood steps of high-affinity iron acquisition in microorganisms. Several different secretion systems have been implicated in the process, comprising transporters from the major facilitator superfamily (MFS) and efflux pumps of the resistance, nodulation, and cell division (RND) superfamily.

Many NRPS-based siderophore gene clusters contain an MFS transporter-encoding gene. The MFS is one of the largest groups of transporters, conserved from bacteria to humans, and handles a wide spectrum of substrates including ions, carbohydrates, lipids, peptides, nucleosides, and other primary and secondary metabolites33. The majority of MFS members comprise 12 transmembrane α-helices (TMs), with some containing 14 TMs or more33. Several MFS proteins from different subfamilies have been structurally elucidated, but none of them is associated with siderophore secretion34. In Gram-negative E. coli, apo-enterobactin is transported by EntS, a 43-kDa archetypal MFS transporter encoded by the gene ybdA within the Fur-regulated ent-fep gene cluster35. EntS exports enterobactin across the cytoplasmic membrane. The ΔentS (ΔybdA) mutant shows a significant reduction in enterobactin secretion with an increased release of its byproducts. Comparably, in the Gram-positive B. subtilis, the MFS protein YmfE was identified as a participant in secretion of the siderophore bacillibactin during single-mutant screening36,37. The YmfE mutant strain has a severe bacillibactin secretion defect and fails to grow in iron-deficient medium.

In addition to the MFS class, the RND superfamily is an ancient and ubiquitous group of proton antiporters, especially widespread among Gram-negative bacteria, that catalyses active efflux of heavy metals, drugs and siderophores38. The MexAB-OprM system in Pseudomonas aeruginosa was the first RND pump identified in siderophore (pyoverdin) secretion39. The mex operon is iron-regulated. In E. coli, in addition to the MFS transporter, TolC is responsible for efflux of the siderophore enterobactin across the outer membrane40,41. TolC is a trans-outer membrane protein and an essential component of the RND pump AcrAB-TolC. AcrAB-TolC is a general transporter that confers bacterial tolerance to antibiotics and toxic compounds, and the whole assembly has been structurally characterized42. The AcrAB-TolC pump assembles with an AcrB:AcrA:TolC ratio of 3:6:3. AcrA interacts with TolC through its hairpin domain and connects with AcrB through its β-barrel and membrane-proximal domains. However, the role of AcrAB in enterobactin secretion is not entirely clear. The triple mutant ΔacrB/ΔacrD/ΔmdtABC, but not any of the individual deletions, shows decreased enterobactin excretion43. AcrAD and MdtABC are two other TolC-dependent RND pumps43. In contrast, the MmpS5-MmpL5 (or MmpS4-MmpL4) RND efflux pump, a homolog of AcrAB, has been identified as a critical siderophore export system for the virulence of M. tuberculosis44. MmpS5-MmpL5 is regulated by the MarR-like transcriptional regulator Rv0678, whose open reading frame is located downstream of the MmpS5-MmpL5 operon45. A mutant lacking the mmpS4 and mmpS5 genes shows unchanged uptake of external carboxymycobactin (12), but cannot grow under low-iron conditions46. The MmpS5-MmpL5 efflux system is also responsible for azole resistance in M. tuberculosis47.

Besides the MFS and RND subclasses, IroC, a type-I ATP-binding cassette (ABC) transporter in E. coli similar to eukaryotic multidrug resistance (MDR) proteins, which was previously suggested to be responsible for the uptake of ferric-salmochelin48, is now proposed to be an active enterobactin/salmochelin exporter49. The double mutant ΔentS/ΔiroC has a severely compromised growth rate, while IroC does not affect the utilization of ferric-enterobactin/salmochelin for growth promotion49,50.

1.4 Holo-Siderophore Acquisition

Holo-siderophore acquisition is one of the key steps in iron assimilation. Gram-negative bacteria possess an outer membrane layer and a peptidoglycan layer. Most siderophores have molecular weights in excess of 600 Da, and their ability to permeate the outer membrane is low10. Therefore, specific energy-dependent integral outer membrane porin proteins are essential to deliver iron-siderophore complexes from the extracellular space into the periplasmic space, as typified by the ferric-ferricrocin receptor FhuA51 and the ferric-enterobactin receptor FepA (and IroN)52. FhuA is a 22-stranded β-barrel that spans the outer membrane, with an N-terminal plug domain accessible to the barrel pore to facilitate ligand recognition and binding. FhuA also includes an N-terminal TonB box, which forms a four-stranded β-sheet with TonB, a cytoplasmic membrane sigma regulator53 with receptor specificity54. Meanwhile, the ferric-citrate receptor FecA55 in E. coli and the ferric-pyoverdine receptor FpvA in Pseudomonas aeruginosa56 belong to the TonB-transducer family, whose members possess a signaling domain upstream of the TonB box and are able to regulate their own synthesis as well as that of their cognate siderophores. TonB is part of the TonB/ExbB/ExbD energy transduction system. The complex is located in the cytoplasmic membrane and provides the energy required for the active transport of holo-siderophores through the outer membrane receptor57. TonB undergoes energized motion in the bacterial cell envelope58 and interacts with the ExbB4-ExbD2 complex59,60 to couple to the electrochemical gradient and initiate energization61,62. These interactions promote iron uptake through outer membrane transporters by a rotational mechanism.

The ABC transporter superfamily constitutes a group of transmembrane proteins that perform ATP-coupled translocation of a wide range of substrates across cell membranes63. ABC transporters deliver environmental ferric-siderophores to the cytosol in Gram-positive bacteria, or from the periplasm to the cytosol in Gram-negative bacteria. ABC transporters have a common architecture that consists of a pair of transmembrane domains (TMDs) embedded in the membrane lipid bilayer, and a pair of nucleotide-binding domains (NBDs) located in the cytoplasm63. Most siderophore ABC transporters belong to the type-II ABC importers, and their homolog, the heme transporter HmuUV in Yersinia pestis, has been structurally characterized and found to employ a coupling mechanism distinct from that of other ABC transporters64. Some siderophore-specific ABC transporters have been reported to be regulated by iron-dependent regulators65,66. Siderophore ABC transporters interact with external type-III periplasmic binding proteins (PBPs) to facilitate substrate delivery. Unlike type-I and type-II PBPs, which undergo significant domain transitions during substrate interaction, most siderophore type-III PBPs have a relatively rigid α-helix serving as the hinge between the two domains67, which ensures overall structural stability and also controls the domain movement between ‘open’ and ‘closed’ states upon recognition of specific substrate(s)68,69,70. ABC transporter-mediated siderophore delivery in Gram-positive bacteria, as in the YxeB-dependent system in Bacillus cereus, can proceed by a shuttle mechanism with iron exchange from ferric-siderophore to apo-siderophore without iron reduction71,72.

In addition to ABC importers, the MFS transporter MirB in the fungal pathogen Aspergillus fumigatus, a 14-TM protein, is reported to be responsible for the uptake of the hydroxamate siderophore N,N’,N’’-triacetylfusarinine-C (TAFC)73. No bacterial MFS transporter has been reported in holo-siderophore acquisition so far.

1.5 Siderophore Utilization

Imported ferric iron bound to a siderophore must be released upon holo-siderophore delivery to be made available to the cellular machinery. One strategy for iron release is hydrolytic destruction of the holo-siderophore by esterases of the α,β-hydrolase family of enzymes, represented by the E. coli fes and iroE gene products; this is a common strategy for macrolactone-based siderophores74,75. IroE is a periplasmic trilactone hydrolase that tends to hydrolyse enterobactin at one ester to produce linearized trimers, which still display considerable affinity for ferric ion. Thus, in addition to iron assimilation, IroE is likely involved in the production of triscatecholate siderophores74. In contrast, the periplasmic trilactone esterase Cee in Campylobacter hydrolyses both apo- and holo-forms of enterobactin effectively, and further digests the linearized trimer into dimers and monomers, which have significantly lower ferric affinity76. Genomic evidence further demonstrates that Cee is involved in enterobactin-mediated iron uptake76.

For cleavage-independent siderophore pathways, iron release is commonly proposed to be facilitated by single-electron reduction of the ferric-siderophore to the ferrous oxidation state77. The relatively weaker ferrous-siderophore interaction allows kinetic exchange with downstream iron-chelating sites found in cellular proteins or small molecules10. Two families of proteins have so far been identified as involved in the process, termed siderophore-interacting protein (SIP) and ferric-siderophore reductase (FSR). FSR contains a unique C-terminal [2Fe-2S] cluster-binding motif, C-C-x10-C-x2-C, and does not show significant similarity to other known [2Fe-2S] proteins. The FSR FhuF from E. coli is part of the siderophore utilization system and has a sufficient redox potential to reduce ferric-ferrioxamine78. FchR, a FhuF homolog in Bacillus halodurans DSM497, has been reported to work in a three-component electron donor system, along with NADPH and ferredoxin, to reduce various iron chelates, with optimal demonstrated reduction activity against ferric-dicitrate79. Meanwhile, SIP is a flavoreductase. YqjH contains a covalently bound FAD and reduces ferric-enterobactin in the presence of NADPH80,81. The yqjH gene is regulated by YqjI, a Ni-dependent regulator82,83. FscN, a YqjH homolog in T. fusca, has been structurally characterized and consists of an FAD-binding domain and an NAD(P)H-interaction domain77. FscN contains a non-covalently bound FAD and reduces ferric-fuscachelin-A (13) with NADH. Remarkably, the M. tuberculosis iron-regulated ABC transporter IrtA contains an extended N-terminal FAD/NAD(P)H-binding domain similar to the SIP family84. As a result, IrtA has dual functions in iron delivery and assimilation.
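
The FSR signature described above, C-C-x10-C-x2-C, can be read as a simple sequence pattern: two adjacent cysteines, ten residues of any type, a cysteine, two residues of any type, and a final cysteine. A minimal sketch of scanning a protein sequence for this pattern follows. The function name and the example sequence are hypothetical and were constructed only for illustration; they are not taken from FhuF or any other real protein.

```python
import re

# C-C-x10-C-x2-C written as a regular expression over one-letter amino acid codes.
# "[A-Z]" stands in for "any residue" (x); this is a simplification.
# The lookahead wrapper lets overlapping hits be reported as well.
FSR_MOTIF = re.compile(r"(?=(CC[A-Z]{10}C[A-Z]{2}C))")


def find_fsr_motifs(sequence):
    """Return (start_index, matched_text) for every C-C-x10-C-x2-C hit (0-based)."""
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in FSR_MOTIF.finditer(seq)]


# Hypothetical sequence built to contain one motif near its C-terminus.
example = "MKTAYIAKQR" + "CC" + "AGSTLVDEKR" + "C" + "PG" + "C" + "LE"
hits = find_fsr_motifs(example)
print(hits)
```

A pattern hit of this kind only flags a candidate; whether the cysteines actually coordinate a [2Fe-2S] cluster has to be established experimentally.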

1.6 Siderophores as Therapeutical Drug Leads

Bacterial siderophores themselves have been shown to exhibit specific antifungal or antibiotic activities85,86. Therapeutic iron chelators containing siderophore scaffolds have also been used in the treatment of diseases requiring blood transfusion87. In addition, in the past decade, siderophore-antibiotic conjugates have been developed as drug leads using “Trojan Horse”-type mechanisms88. The strategy can markedly reduce permeability-mediated drug resistance while providing target selectivity, and is especially advantageous in the control of MDR pathogens89.

Figure 1-3. Examples of siderophore related therapeutic drug leads.

A successful siderophore-drug conjugate design contains a siderophore moiety that can be recognized and imported; a suitable linker that is stable enough in the extracellular environment but can be cleaved by enzymatic reactions in either the cytoplasm or the periplasm; and an effective drug moiety, commonly of the β-lactam drug family90. Siderophore-linked lactam antibiotics show increased penetration through the outer membrane91, and their pathogen selectivity can be tuned by the conjugated siderophore. Specifically, tris-catecholate siderophore-aminopenicillin conjugates (14) inhibit Gram-negative bacteria, especially Pseudomonas aeruginosa92; biscatecholate-monohydroxamate siderophore-carbacephalosporin conjugates (15) are selective for pathogenic Acinetobacter baumannii93. Another hexadentate siderophore-mediated drug delivery system has been successfully demonstrated with enterobactin-based siderophore-cargo conjugates in E. coli (or Pseudomonas aeruginosa)94,95. Enterobactin-antibiotic conjugates, including conjugates of the β-lactam antibiotics ampicillin (Amp) and amoxicillin (Amx) (Ent-Amp/Amx, 16), show a 1000-fold decrease in minimum inhibitory concentration (MIC) against E. coli CFT073 relative to Amp/Amx alone95. The enterobactin-mediated delivery is FepA dependent. The selectivity of the Ent-Amp antibiotics can be further narrowed by salmochelin modification (GlcEnt-Amp/Amx, 17) to target specific pathogenic E. coli carrying the iroN-encoded salmochelin receptor96. The salmochelin-modified conjugates also show low mammalian cell toxicity96. Lactam-family drugs can also be delivered with relatively smaller siderophores. The bidentate siderophore-sulfactam BAL30072 (18) has promising activity against multi-resistant Gram-negative bacilli97, including strains that express multiple β-lactamases, such as meropenem-resistant Acinetobacter baumannii98. The antimicrobial efficacy of BAL30072 can be further enhanced through combination treatment with carbapenems99. BAL30072 was submitted to phase I clinical trials in 2013100. Comparably, the siderophore-monocarbam conjugate MC-1 (19) possesses in vitro and predicted in vivo activity against MDR pathogens such as Pseudomonas aeruginosa101,102. The crystal structure of the MC-1/penicillin-binding protein complex established the molecular basis for the recognition specificity and coupling activity103,104. A similar pyridone-monobactam conjugate also possesses in vitro antibacterial activity against clinically relevant MDR Gram-negative species105. MC-1 and its analogs most likely use the TonB-dependent outer membrane siderophore receptors (PiuA or PirA) as the primary means of entry105,106.

Besides siderophore-lactam conjugates, several lactam-independent siderophore-drug candidates have been reported. The trihydroxamate siderophore-fluoroquinolone conjugates (20) are active against Gram-positive Staphylococcus aureus SG511107. Lactivicin analog-phthalimide conjugates (21) use a wider set of TonB receptors (compared with hydroxypyridone-lactams) to target penicillin-binding proteins and achieve in vitro antibiotic activity against Gram-negative bacteria108. A mycobactin-T analog-artemisinin conjugate (22) displays high activity against extensively drug-resistant (XDR) M. tuberculosis strains109,110. The core drug moiety, artemisinin, has antimalarial activity but no antituberculosis activity.

1.7 Targeting Siderophore Pathways

Specific targeting of siderophore pathways is another applicable therapeutic approach to address microbial virulence. The disruption of iron recycling constrains microbial survival and reproduction4, and the interruption of holo-siderophore acquisition is a potential therapeutic approach. Most siderophore-drug conjugates have the side benefit of jamming native siderophore acquisition. In addition, the antibiotic lasso-peptide Microcin-J25 (23) and its analogs have been reported to target FepA111. Lasso peptides structurally hijack FepA to disrupt ferric-siderophore acquisition and reduce microbial growth in vitro112. Several pathogens survive by using mammalian siderophores through FepA113.

Arresting siderophore biosynthesis to reduce virulence has received much attention recently. In NRPS-based siderophore biosynthesis, analogs of acyl adenylate intermediates, including SA- and DHB-AMP analogs, interrupt substrate recognition by the initial A-domain and consequently inhibit a broad range of microorganisms that produce aryl-capped siderophores114. Sal-AMS (24) inhibits the NRPS MbtA in M. tuberculosis115,116, and significantly reduces pathogen growth in mouse lungs in vivo117. Other possible targets in siderophore synthesis include specific NRPS accessory proteins. Biaryl nitrile (25) and its analogs target PvdQ, an essential N-terminal nucleophile hydrolase in the pyoverdine biosynthesis pathway, and decrease siderophore production in vitro, thus limiting the growth of Pseudomonas aeruginosa under iron-limiting conditions19.

Siderophore secretion is an alternative viable target in pathogens. Pathogenic E. coli requires siderophores for iron acquisition during infection, and the exporters EntS and IroC are important for systemic virulence in a chicken infection model50. The outer membrane transporter mutant ΔtolC E. coli also displays a morphological defect in minimal medium, which is probably caused by periplasmic enterobactin accumulation118. Similar approaches have been successfully applied to the RND efflux system in M. tuberculosis46. Exporter-deficient ΔmmpS4/S5 M. tuberculosis has an inhibited growth rate, and the growth defect could not be rescued by an external mycobactin supplement46. The cytoplasmic accumulation of mycobactin (and carboxymycobactin) provides a possible self-poisoning inhibition mechanism. In this context, siderophore secretion is a more advantageous target, as a defect in biosynthesis or specific acquisition can be overcome by using alternative iron carriers such as heme or other siderophores46.

1.8 Conclusions and Insights

The chemistry and biology of siderophore-mediated iron acquisition is a complex process and a key step for microbial survival and virulence. Understanding the siderophore-related pathways enables the design and development of siderophore-scaffold drugs, as well as therapeutic approaches that target pathogenic iron assimilation. Recent progress in siderophores and their related applications provides a much-improved insight into the complex and important process of iron acquisition. The developing clinical endeavors to target microorganisms/pathogens in the “iron battle” will continue to stimulate research in the area.

CHAPTER 2

STRUCTURE AND MECHANISM OF THE SIDEROPHORE INTERACTING PROTEIN FROM THE FUSCACHELIN GENE CLUSTER OF Thermobifida fusca1

2.1 Siderophore Utilization Pathway in Thermobifida fusca

The specific acquisition of elemental iron is essential for a majority of organisms119,120. Additionally, the cellular machinery to transport and assimilate iron often contributes to the virulence of human microbial pathogens121. As described in the previous chapter, various siderophore pathway-based therapeutic applications have been developed during the past decades. These approaches mainly focus on siderophore production and acquisition, while siderophore-based iron reduction remains largely unexplored.

The intracellular reduction of ferric-siderophores is proposed to occur through SIP or FSR family proteins. A preference for NADPH in the reduction of bound flavin has been demonstrated for the SIP YqjH, along with the ability to reduce diverse iron chelates including ferric enterobactin81. The yqjH gene is part of the Fur regulon and under transcriptional control of YqjI, a winged helix transcriptional regulator82,122. Interestingly, the DNA binding of YqjI is dependent on Ni2+ binding (not iron), and it is proposed that the gene product helps regulate iron acquisition in conditions of high nickel ion concentration, which can disrupt iron homeostasis83. FSR contains a unique C-terminal C-C-x10-C-x2-C iron-sulfur cluster binding motif, which does not show significant similarity to other known [2Fe-2S] proteins78. The FSR FchR from B. halodurans DSM497 has been reported to work in a three-component electron donor system, along with NADPH and ferredoxin, to reduce various iron chelates, with optimal demonstrated reduction activity against ferric dicitrate123.

1 Adapted with permission from Li, K., Chen, W.-H. & Bruner, S. D. Structure and mechanism of the siderophore-interacting protein from the fuscachelin gene cluster of Thermobifida fusca. Biochemistry 54, 3989-4000 (2015) DOI: 10.1021/acs.biochem.5b00354. Copyright © 2015 American Chemical Society

Figure 2-1. The fuscachelin siderophore pathway of T. fusca. A) Structure of the nonribosomal peptide-based siderophore fuscachelin A, along with B) a three-dimensional model of the fuscachelin A-iron (green) complex. C) Schematic diagram showing the predicted roles of genes in the T. fusca siderophore gene cluster.

Our group has previously characterized a novel siderophore-producing biosynthetic gene cluster in the actinomycete T. fusca along with the structure of the unique metabolite (Figure 2-1AB)124,125. The nonribosomal peptide (NRP) fuscachelin is a mixed catecholate/hydroxamate siderophore with several unique structural features. The gene cluster for fuscachelin contains many of the commonly found components for iron acquisition (Figure 2-1C and Figure 2-2)124. The biosynthetic machinery includes NRP synthetases (FscFGHI) and accessory biosynthetic enzymes FscABCDEK, in addition to genes for transport and iron utilization. FscM is homologous to EntS-like siderophore transporters and is likely involved in export of the apo-siderophore. FscJ is a membrane-anchored siderophore binding protein proposed to bind and deliver the ferri-siderophore to the ABC transporter system. There are no genes for an ABC transporter system specific to the fuscachelin gene cluster; however, examination of the T. fusca genome suggests a cluster (tfu_0336-0338) as a likely candidate with homology to the E. coli FepCDG/FhuCB machinery126. These three genes are regulated by DtxR, an iron-dependent transcriptional repressor that controls the expression of siderophore gene clusters in T. fusca127. We are interested in exploiting the thermostability inherent in T. fusca to characterize the structure and mechanisms of the complex machinery involved in siderophore biosynthesis and iron acquisition/utilization.

Figure 2-2. Annotated gene list of the T. fusca fuscachelin siderophore gene cluster. Genes with verified function or high homology to genes with predictable function are annotated as fscA-P.

Representatives of the two proposed families involved in intracellular iron reduction are present in the T. fusca cluster. FscN is a flavoprotein of the SIP family, and FscP is a member of the FSR family whose Fe/S cluster motif shares high homology with that of E. coli FhuF. It is uncommon to find representatives of both protein classes in a single siderophore gene cluster. There is limited structural data available to provide a basis for the function of either family in siderophore utilization. For the SIP family, there is a preliminary X-ray diffraction analysis of YqjH from E. coli (but without structure determination)80, and a structure of a homolog from S. putrefaciens has been solved and deposited (PDB #: 2GPJ) as part of the Joint Center for Structural Genomics128. The 2.2 Å structure of the S. putrefaciens protein establishes the SIP class as a member of the broad flavoreductase family, with N-terminal β-barrel and C-terminal α/β/α sandwich domains that form the unique structural clades present in the SIP family across bacterial species81. There are no structures available (or significant structural homologs) for the FSR family of reductases. Here I report the structure and biochemical characterization of FscN as a flavin-dependent iron reductase. The results provide a structural basis for SIP function and insight into siderophore utilization in Gram-positive bacteria.

2.2 FscN is a Flavin Containing Siderophore Interaction Protein

2.2.1 FscN Contains a Non-covalently Bound FAD Cofactor

The fscN gene was cloned from T. fusca ATCC 27730 genomic DNA and the protein was purified as a hexa-histidine fusion using standard chromatographic techniques. The purified protein was bright yellow, indicative of a bound flavin-type cofactor. The UV-visible spectrum of the isolated protein was consistent with flavin adenine dinucleotide (FAD) in the oxidized form, with the absorption maxima at 388 and 459 nm and distinct shoulders located at 459 and 481 nm (Figure 2-3B). In order to further identify the cofactor, the purified protein was denatured at high temperature and the yellow supernatant was analyzed by LC-MS, giving a molecular weight of 785.55 Da for the UV-active component, which supports FAD as the isolated cofactor (Figure 2-3A). The protein precipitate formed during the denaturation appeared white and was not UV active, suggesting the cofactor is bound to the protein through a non-covalent or labile covalent interaction. In order to confirm that the FAD cofactor was redox-active in the context of FscN, the protein was reduced with sodium dithionite under anaerobic conditions129. During the reaction, the 459 nm shoulder decreased, suggesting the disappearance of FAD over time, while the increase of the 600 nm shoulder peak indicated the formation of reduced FADH2 (Figure 2-3B).

Figure 2-3. Properties of the flavin cofactor bound to FscN. A) LC-MS analysis of the purified flavin cofactor, confirming the cofactor identity as non-covalently bound FAD. B) UV-visible spectra of bound FAD (Ox) along with time points (16, 32, 48, 64, 80 min) of reduction with sodium dithionite to FADH2 (+R, dotted line). The figure inset shows the full spectrum of the reduced protein.

2.2.2 The Siderophore Utilization Protein FscN Binds Ferric-Fuscachelin

Isothermal titration calorimetry (ITC) supports FscN binding to ferric-fuscachelin A with moderate affinity. FscN was titrated against ferric-fuscachelin A, and the overall binding was modeled as an enthalpy-driven process with a calculated binding affinity (dissociation constant) of 30 μM (Figure 2-4). We tested several other ferric complexes for binding to FscN in order to determine its specificity. Ferric-EDTA, ferric-citrate and ferric-2,3-dihydroxybenzoic acid (DHB) were all tested under identical ITC conditions and no significant binding affinity was observed.
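
For orientation, a dissociation constant can be converted into a standard binding free energy with ΔG° = RT ln(Kd / 1 M). The sketch below applies this relation to the 30 μM value reported above. The temperature of 298.15 K is an assumption made only for illustration, since the ITC temperature is not stated in this section, and the function name is hypothetical.

```python
import math

R = 8.314462618  # gas constant, J mol^-1 K^-1


def binding_free_energy_kj(kd_molar, temperature_k=298.15):
    """Standard binding free energy (kJ/mol) from a dissociation constant.

    Uses dG = R * T * ln(Kd / 1 M). The default temperature is an assumption,
    not a value taken from the experiment.
    """
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R * temperature_k * math.log(kd_molar) / 1000.0


dg = binding_free_energy_kj(30e-6)  # Kd = 30 uM from the ITC fit
print(f"dG = {dg:.1f} kJ/mol")
```

Under the assumed temperature this comes to roughly -26 kJ/mol, the magnitude expected for a micromolar-range interaction.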

Figure 2-4. ITC analysis of ferric-fuscachelin A binding to FscN. The calculated binding constant and corresponding error are listed on the figure.

2.2.3 FscN Specifically Reduces Ferric Fuscachelin

Ferric reductase assays were performed anaerobically in a phosphate buffer to further address the bioactivity of FscN81. Purified FscN showed no ferric reductase activity toward ferric chloride, ferric-EDTA, ferric citrate, ferric 2,3-dihydroxybenzoic acid or the ferric enterobactin complex (data not shown). Relatively weak activity (40 nmol Fe2+ min⁻¹ mg⁻¹) toward purified ferric-fuscachelin A was detected in the presence of NADH (Figure 2-5). This activity is in the same range as that previously reported for YqjH, 22.0 nmol Fe2+ min⁻¹ mg⁻¹ with a KM of 33 μM82. In contrast to YqjH, inclusion of NADPH in the reaction in place of NADH did not result in any observable reduction, suggesting that FscN functions as an NADH-dependent ferric siderophore reductase.
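
The specific activity quoted above can be related to the raw spectrophotometric readout. The sketch below converts a rate of absorbance decrease at 340 nm into nmol Fe2+ min⁻¹ mg⁻¹ using the Beer-Lambert law. The NADH extinction coefficient (6220 M⁻¹ cm⁻¹ at 340 nm) is the standard literature value; the 1 cm path length, the assumption that each NADH (a two-electron donor) reduces two ferric ions, the function name, and all numerical inputs in the usage lines are illustrative assumptions rather than values from this work.

```python
EPSILON_NADH_340 = 6220.0  # M^-1 cm^-1, standard extinction coefficient of NADH at 340 nm


def specific_activity(delta_a340_per_min, volume_ml, enzyme_mg,
                      path_cm=1.0, fe_per_nadh=2.0):
    """Convert an A340 decrease rate into nmol Fe2+ min^-1 mg^-1.

    delta_a340_per_min: magnitude of the absorbance decrease per minute.
    fe_per_nadh: assumed stoichiometry (NADH donates two electrons,
                 each Fe3+ -> Fe2+ reduction consumes one).
    """
    if enzyme_mg <= 0 or volume_ml <= 0:
        raise ValueError("volume and enzyme mass must be positive")
    # Beer-Lambert: concentration change (M/min) = dA / (epsilon * path length)
    nadh_molar_per_min = delta_a340_per_min / (EPSILON_NADH_340 * path_cm)
    # M/min -> nmol/min in the assay volume: (mol/L) * L * 1e9
    nadh_nmol_per_min = nadh_molar_per_min * (volume_ml / 1000.0) * 1e9
    return nadh_nmol_per_min * fe_per_nadh / enzyme_mg


# Hypothetical inputs chosen so the arithmetic is easy to follow:
# 0.0311 A/min decrease, 1 mL assay volume, 0.5 mg enzyme.
activity = specific_activity(0.0311, volume_ml=1.0, enzyme_mg=0.5)
print(f"{activity:.1f} nmol Fe2+ min^-1 mg^-1")
```

If the ferrous product is instead quantified directly with a colorimetric chelator, the same function structure applies with that complex's extinction coefficient and a stoichiometry factor of one.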

Figure 2-5. FscN reduces ferric ion in the presence of NADH. The reaction was carried out anaerobically with 1.0 mM NADH and 20 mM purified ferric fuscachelin A in 100 mM phosphate buffer, pH 7.5. A detection wavelength of 340 nm was selected to monitor the consumption of NADH. NADH remained stable in the reaction system (red line) and was converted into NAD+ in the presence of 100 µM FscN (blue line). The reduction rate of FscN toward purified ferric fuscachelin A was calculated as 40 nmol Fe2+ min⁻¹ mg⁻¹.

2.2.4 Structure Determination with X-ray Crystallography

FscN protein crystals belonged to the space group C2, and the structure was solved using the deposited structure of the SIP from S. putrefaciens (PDB entry: 2GPJ) as a molecular replacement search model. The final model was refined to 1.89 Å resolution with two monomers of FscN present in the asymmetric unit (Figure 2-6 and Table 2-1). The observed dimer is most likely a result of crystal packing and does not represent the biological unit, based on analysis of the interface using the PISA server130, and the size exclusion chromatography elution profiles are consistent with an FscN monomer. The two monomers in the asymmetric unit are highly similar, with an all-atom RMSD of 0.26 Å, and the following discussion and figure representations refer specifically to the deposited monomer A.

Figure 2-6. X-ray crystal structure of FscN. A) The dimer present in the asymmetric unit shown in ribbon format with the flavin cofactor in licorice representation. B) One monomer of FscN illustrating the two N- and C-terminal subdomains and the location of an observed bound metal (shown as a Zn2+ ion).

The overall molecular structure of FscN consists of two subdomains: an N-terminal FAD-binding domain and a C-terminal NADH-binding domain, connected by a linker region consisting of a two-stranded β-sheet. The FAD cofactor is clearly resolved in the electron density maps and is non-covalently bound in the boundary cleft between the two major domains. The FAD-binding domain has a six-stranded anti-parallel β-sheet with a small α-helix connecting a long loop. The NADH-binding domain forms a β1-α-β2 architecture, as is typical of the NAD(P)H-binding motif common to the classic Rossmann fold.

Table 2-1. X-ray crystallography statistics of FscN

Data Collection FscN complex with FAD

Wavelength (Å) 1.100

Resolution range (Å) 36.74 - 1.89 (1.96 - 1.89)*

Space group C 1 2 1

Unit cell (Å/º) 129.54 75.07 73.90 90 102.62 90

Total reflections 199130 (17743)

Unique reflections 54927 (4941)

Multiplicity 3.4 (3.3)

Completeness (%) 98.95 (94.18)

Mean I/sigma(I) 12.81 (2.64)

Wilson B-factor (Å2) 26.66

Rmerge 0.0587 (0.44)

Rmeas 0.0686

CC1/2 0.998 (0.899)

Refinement

Rwork / Rfree 0.189 / 0.219

Number of atoms 4643

protein 4141

ligand 140

water 362

Protein residues 523

RMS deviations

bond lengths (Å) 0.007

bond angles (º) 1.31

Ramachandran

favored (%) 99

outliers (%) 0

Clashscore 3.79

Average B-factors (Å2) 35.20

protein 35.30

ligand/ion 29.30

water 37.20

PDB entry 4YHB

*Data values for the highest resolution bin are shown in parentheses.
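As a plausibility check on the cell parameters in Table 2-1, the monoclinic unit cell volume and an approximate Matthews coefficient can be estimated. The following is a minimal sketch, not part of the original analysis: it assumes four asymmetric units per cell (the general-position multiplicity of C2), an average residue mass of 110 Da (a rule of thumb, not a value from this work) applied to the 523 modeled residues, and the conventional 1.23 Å3/Da protein constant for the solvent estimate. All function names are illustrative.

```python
import math

def monoclinic_cell_volume(a, b, c, beta_deg):
    """Volume (A^3) of a monoclinic cell: V = a * b * c * sin(beta)."""
    return a * b * c * math.sin(math.radians(beta_deg))

def matthews_coefficient(volume, n_asu, mass_per_asu_da):
    """Matthews coefficient V_M (A^3/Da) = V / (Z * M)."""
    return volume / (n_asu * mass_per_asu_da)

def solvent_fraction(v_m):
    """Approximate solvent fraction from V_M, using 1.23 A^3/Da for protein."""
    return 1.0 - 1.23 / v_m

# Cell parameters taken from Table 2-1 (a, b, c in A; beta in degrees).
volume = monoclinic_cell_volume(129.54, 75.07, 73.90, 102.62)

# ASSUMPTION: 523 modeled residues per asymmetric unit at ~110 Da each.
mass_per_asu = 523 * 110.0

v_m = matthews_coefficient(volume, n_asu=4, mass_per_asu_da=mass_per_asu)
solvent = solvent_fraction(v_m)

print(f"Cell volume: {volume:.0f} A^3")
print(f"Matthews coefficient: {v_m:.2f} A^3/Da")
print(f"Estimated solvent fraction: {solvent:.2f}")
```

Because the residue mass is only an estimate, the resulting values should be read as a rough consistency check on two molecules per asymmetric unit rather than as reported statistics.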

Figure 2-7. Global multiple-protein sequence alignment of putative SIP proteins. Subgroups with differences in the N- and C-terminal regions are clustered. Conserved residues are highlighted in green for the NAD(P)H binding motif, yellow for the FAD binding pocket and red for residues associated with putative metal binding. Abbreviations: Tfu, Thermobifida fusca; Nal, Nocardiopsis alba; Npo, Nocardiopsis potens; Sba, Shewanella baltica; Ptu, Pseudoalteromonas tunicata; Spu, Shewanella putrefaciens; Eco, Escherichia coli; Cyo, Citrobacter youngae; Pfl, Pseudomonas fluorescens.

FscN displays significant structural homology to the diverse family of

FAD/NAD(P)H oxidoreductases. Using the structural homology program Dali131, several

NAD(P)H/ferredoxin reductases show Z-scores in the modest 17-19 range with overall

sequence identities of 10-15%. The highest overall similarities are with

flavohemoglobins, three-domain proteins that contain an iron-heme domain in addition

to the two ferredoxin-reductase subdomains common to FscN. The FAD cofactor is

bound in a typical conformation as seen with examples from this superfamily (Figure 2-7

and Figure 2-8). The negatively charged FAD diphosphate group is stabilized largely via

hydrogen bonding interactions with His108 and Ser261. The isoalloxazine ring of flavin is

bound through aromatic stacking interactions between Tyr255, His53, Tyr89 and

His265. Additionally, an aromatic stacking arrangement holds the adenine moiety

between Trp256 and His108, and hydrogen bond interactions between the FAD and the

protein are seen with the 2’-hydroxyl of the ribose and the exocyclic amino of the

adenine to further reinforce the protein-cofactor interaction.

Figure 2-8. Two views of the flavin binding site illustrating key residues involved with cofactor interaction. A) Adenine view and, B) riboflavin view.

2.2.5 Proposed NADH Cofactor Binding Site in FscN

The presented structure is in a ‘closed’ conformation where the C-terminus is

occupying the likely NADH binding site. This is a common structural rearrangement

observed in this reductase class, and both NAD(P)H-bound and unbound states are

represented in the PDB. In the ferredoxin reductase superfamily, two

electrons are commonly transferred from NAD(P)H to oxidized flavin and then to an

additional electron acceptor. In FscN, the NADH binding site is at the interface of the

two domains, and NADH can only bind while the protein is in an ‘open’ conformation.

Overlaying the structure of FscN with NADH-cytochrome b5 reductase132 (PDB entry

3W2G) allows insight into the predicted NADH binding to FscN. The open structure of

NADH-cytochrome b5 reductase has the nicotinamide stacked with the flavin

isoalloxazine, in a position consistent with electron transfer. This stacking position in

FscN is occupied by Tyr255 at the C-terminus and disordering and/or displacing of the

terminus would need to accompany NADH binding. The structural arrangement has the

common features consistent with an established ping-pong, bi-bi mechanism of NADH

oxidation and substrate binding/reduction133. Residues observed to bind the NADH

cofactor in cytochrome b5 reductase are conserved in FscN, suggesting a similar

binding mode (see Figure 2-9A). In contrast to E. coli YqjH, with reported activity using

NADPH as cofactor, FscN has a longer loop between β12 and α3 to bind the adenine

part of the cofactor (see Figure 2-7). Structural homologs determined to have different

cofactor affinity also have slightly different sequence characteristics in this region.

Enzymes that utilize NADPH have a relatively shorter loop, while NADH-dependent

proteins commonly have a longer loop with a notable presence of negatively charged

residues134,135. Previous research also suggests that negatively charged amino acids

(residue Glu173 in FscN) located on this loop interact with the adenosine ribose moiety and

decrease NADPH affinity, enhancing NADH utilization136. The structural

analysis and the kinetic assays are both consistent with FscN serving as a reductase

with cofactor specificity for NADH.

2.2.6 Metal Binding Site Adjacent to the Flavin Cofactor

From the calculated electron density maps, there is a pronounced metal site

adjacent to the hydrophobic side of the flavin binding pocket (Figure 2-9B). The calculated

anomalous difference map shows a strong peak at a 15σ contour level for data collected at a

wavelength of 1.100 Å. The electron density at the metal site is present up to a 12σ

contour level in the 2mFo-DFc map (Figure 2-9B), indicating high occupancy of a heavy

atom. The metal ion is coordinated by three histidines, His53, His255 and His256, with

nitrogen-metal distances of 2.18, 2.17 and 2.16 Å, respectively. Three water molecules

also surround the metal center at distances of 2.18, 2.18 and 2.27 Å, giving a six-

coordinate metal center with ideal octahedral geometry. To identify the metal bound to

FscN, ICP-MS analysis was performed. The IMAC-purified protein sample contained Ni2+ and Zn2+ (0.8 and

0.25 molar ratios, respectively; Table 2-2). To exclude possible external metal ion

influence from the Ni2+-NTA resin used for protein purification, further analysis was

carried out with protein purified through ammonium sulfate precipitation. Analysis of this

sample resulted in an FscN:Ni2+:Zn2+ ratio of 1:0.22:0.80, indicating a predominant Zn2+

form. However, the 0.22 molar ratio of Ni2+ is not insignificant and is beyond that expected for

background.

Table 2-2. ICP-MS metal analysis of FscN. The protein was purified with Ni2+-NTA based immobilized metal ion affinity chromatography (IMAC) and ammonium sulfate precipitation (ASP) methods. Target protein FscN concentration was pre-adjusted to 100 ppb before the analysis.

Method   Sample     Mg       Mn      Co      Fe‡    Ni      Zn

IMAC     Blank 1    0.957†   0.32    0.638   11.4   0.861   17.8

IMAC     Sample 1   0.047    0.285   0.538   13.5   81.2    43.51

ASP      Blank 2    0.622    -0.06   0.026   12.1   0.692   1.064

ASP      Sample 2   2.483    1.044   0.49    22.6   22.75   81.5

†Unit: parts per billion (ppb).

‡Fe analysis was carried out separately and has an inherently higher background due to instrumental limit and error.
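The molar ratios quoted in the text can be related to the values in Table 2-2 by blank subtraction. The sketch below is illustrative only: it assumes that the tabulated values are normalized so that 100 units correspond to one equivalent of FscN (an inference from the "pre-adjusted to 100 ppb" note and from the quoted ratios, not a procedure described in the text), and the names `icp_ms` and `metal_ratios` are hypothetical.

```python
# ASSUMPTION: tabulated values are normalized so that 100 units
# correspond to one equivalent of FscN.
PROTEIN_REFERENCE = 100.0

# Ni and Zn readings copied from Table 2-2.
icp_ms = {
    "IMAC": {"blank": {"Ni": 0.861, "Zn": 17.8},
             "sample": {"Ni": 81.2, "Zn": 43.51}},
    "ASP": {"blank": {"Ni": 0.692, "Zn": 1.064},
            "sample": {"Ni": 22.75, "Zn": 81.5}},
}

def metal_ratios(entry, reference=PROTEIN_REFERENCE):
    """Blank-corrected metal signal divided by the protein reference."""
    return {
        metal: (entry["sample"][metal] - entry["blank"][metal]) / reference
        for metal in entry["sample"]
    }

ratios = {method: metal_ratios(entry) for method, entry in icp_ms.items()}

for method, per_metal in ratios.items():
    summary = ", ".join(f"{m}: {r:.2f}" for m, r in per_metal.items())
    print(f"{method}: {summary}")
```

Under this assumed normalization, the blank-corrected values are close to the Ni2+ and Zn2+ ratios quoted above for both purification methods.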

2.3 Structure of FscN Facilitates the Understanding of the Siderophore Pathway

Presented is the first structure coupled with functional characterization of a SIP

flavoenzyme. The fate of iron-siderophores after import into cells is a complex and

important step of bacterial iron acquisition. The study of the overall process provides

basic biochemical insight into the complex molecular machinery and can provide details

into virulence mechanisms. The fuscachelin siderophore biosynthetic gene cluster

encoded in the genome of the moderate thermophile T. fusca is a useful model system

to study the overall mechanism. All common genes, including those for siderophore

peptide biosynthesis, export/import and utilization are present in the cluster, including

representatives implicated in intracellular iron reduction, the SIP, FscN and the FSR,

FscP. Iron is imported into T. fusca complexed with siderophore in the oxidized ferric

form and is at some point reduced to the ferrous state for uptake by cellular machinery.

There exist limited published biochemical characterizations of intracellular siderophore

iron reduction. As discussed, several pathways have been proposed, including

extracellular reduction and non-specific intracellular reduction paths. A majority of

bacterial siderophore gene clusters contain one representative of either the SIP or FSR

superfamily proteins. The FSR family is represented by E. coli FhuF, a ferredoxin

homolog containing an uncommon C-C-x10/11-C-x2-C binding motif, which shares limited

sequence similarity to other characterized 2Fe-2S ferredoxins. Examples from both

families have recently been reported to reduce iron-siderophores in vitro: FchR

reduction of ferric dicitrate123 and SIP YqjH reduction of ferric enterobactin81. It is not

clear if these functions are redundant or synergistic in siderophore pathways.

The fscN gene is part of the siderophore gene cluster of T. fusca. FscN is a

member of the SIP family and homologous to ViuB, required for siderophore utilization

in V. cholerae137. This family functions as NAD(P)H:flavin oxidoreductases, and

homologs are found to be widely distributed in bacteria. Two different subfamilies of

SIPs can be distinguished, primarily based on C-terminal sequence alterations138,139.

Group I (Figure 2-7), including FscN from T. fusca, contains an additional 15-20 amino

acids predicted to form a C-terminal α-helical element with a conserved HH(K)x5DE

sequence. In contrast, group II, represented by E. coli YqjH, has a relatively shorter C-

terminus. All sequences have highly conserved flavin cofactor binding motifs, including

RxYT, DxV(F)xH, Gx2S and YWK(R), highlighted in yellow (Figure 2-7). As compared to

group II, group I SIPs have a more conserved NAD(P)H binding motif (TA-x3-EVL-x3-GE,

shown in green). The NAD(P)H motif found in group II, QA-x3-SVL, is less

conserved. Consistent with bioinformatics analyses, the group II YqjH uses NADPH,

whereas in this report we demonstrate that the group I FscN utilizes NADH as a

reducing agent.

The structure of FscN contains a flavin cofactor bound in an overall conformation

typical of flavoreductases, with the isoalloxazine ring in a planar configuration indicative of

oxidized FAD (Figure 2-8). The dinucleotide is in an extended orientation with the

terminal adenosine at the surface of the protein. The identity of the cofactor was

determined by protein denaturation followed by spectroscopic characterization of the

isolated cofactor. These results, in combination with the presented structure, clearly suggest a

noncovalently bound flavin. This is in contrast to the reported characterization of the E. coli

homolog, YqjH, in which the cofactor was suggested to be covalently bound. Covalently

bound flavin is commonly linked through the C6 atom and/or the 8-methyl group on the

isoalloxazine ring to cysteine, histidine or tyrosine residues140,141. In our structure of

FscN, the well-conserved His53 faces the 8-methyl group in close proximity (3.8 Å) and has no

obvious constraints to forming an FAD adduct. We have shown that purified holo-FscN is

redox-active by the reduction of the enzyme with sodium dithionite, which produced

spectra consistent with the formation of FADH2. Efforts to show direct FscN reduction

with NADPH or NADH have been unsuccessful in our hands, possibly due to instability

of the reduced flavoenzyme in the absence of substrate. From analysis of the structure

and comparison with co-complex structures of flavoenzymes bound to NADH (highest

structural homology: NADH-cytochrome b5 reductase, PDB entry 3W2G, Figure 2-9), residues

likely involved with NADH binding are present in FscN; however, the very low overall

structural identity (17% to 3W2G) suggests a possible divergent function for FscN. We

have made considerable efforts to demonstrate the reduction of various iron chelates

with FscN, using the ferene assay that was previously used to characterize the

reduction activity of YqjH and FchR82,123. Unlike with FchR, we do not see reduction of ‘generic’ iron

chelates such as ferric citrate, ferrichrome or ferric aerobactin. The postulated natural

substrate for FscN would be ferric-fuscachelin A. Although we have access to

fuscachelin through isolation from T. fusca and through total synthesis, the generation

of sufficient quantities of pure holo-siderophore is a challenge. Regardless, we were

able to measure holo-fuscachelin A reduction activity of 40 nmol Fe2+ min-1 mg-1,

comparable to E. coli SIP YqjH with a reported activity of 22.0 nmol Fe2+ min-1 mg-1.

Interestingly, YqjH is an NADPH-dependent reductase while FscN only accepts NADH.

Commonly, the ratio of NAD+ to NADH inside the cell is high, while the ratio of NADP+ to

NADPH is kept low; therefore, bacteria would use NADPH in vivo as a reducing agent

and NAD+ as an oxidizing agent. The NADH dependence of FscN may suggest that the

siderophore utilization pathway in T. fusca has an uncommon redox equilibrium.

Additionally, we have demonstrated a 30 μM FscN-ferric fuscachelin A binding affinity,

and basic molecular docking calculations using Hexserver142 suggest that the

siderophore may bind the surface of the protein, anchored by Glu62 and Asp264, which

correspond to the two positively charged D-Arg residues on the siderophore. The

pseudo-symmetric D-arginines in the natural product (see Figure 2-1B) are about 25 Å

apart, which is about the same distance as that between Glu62 and Asp264 on the protein, suggesting

a possible binding mode.

Despite significant effort, we were not able to observe significant direct reduction

of flavo-FscN with NADH or NADPH nor were we successful at reducing a large variety

of ferric iron chelates with chemically (sodium dithionite) reduced FADH2-FscN.

Structurally, FscN is most similar to flavoheme proteins such as cytochrome b5

reductase which catalyze electron transfer from the two-electron carrier, NADH to the

one-electron carrier (heme) cytochrome b5. During iron-siderophore reduction, the

terminal electron acceptor (ferric-siderophore) has to be a single electron acceptor.

There is no equivalent heme binding protein near the FscN regulon, and based on the

requirement for a single electron reduction step, it is possible the 2Fe-2S of the FSR

FscP plays a role in facilitating single electron transfer from NADH to holo-siderophore.

There are multiple known reduction pathways that utilize NAD(P)H/flavin/FexSx as

components in electron transfer clusters, through in cis or in trans protein interactions.

Additionally, FscN shares 14% sequence identity and structural similarity (RMSD of 3.3

Å) with the 1,2-dioxygenase reductase from Acinetobacter sp. (PDB entry 1KRH)143.

Both enzymes have similar NAD(P)H and FAD binding domains, while the dioxygenase

reductase has an additional N terminal 2Fe-2S subdomain for substrate reduction.

Another example is the phthalate dioxygenase reductase (PDB entry 2PIA, RMSD of

2.8 Å) which utilizes FMN to mediate electron transfer from the two-electron donor

NADH to a [2Fe-2S] single-electron acceptor located on the C-terminus of the protein144.

In the case of T. fusca, electron transfer could proceed from NADH to flavin/FscN to

the 2Fe-2S/FscP (or vice versa) and then to the iron-siderophore, similar to the domain

arrangements of other members of flavoprotein reductase families (see Figure

2-9)143,145,146,147,148. In this scenario, FscP would be in the position of the iron-porphyrin

heme ring as cytochrome b5 in cytochrome b5 reductase. Another possibility is that the

2Fe-2S/FscP acts as a holder of a single electron from the flavin semiquinone after a

ferric to ferrous-siderophore reduction. This can allow the ferrous-siderophore to

dissociate and a ferric-siderophore to bind followed by the turnover of the flavin

semiquinone to complete the stoichiometric reaction.

Figure 2-9. Domain arrangement of selected members of the flavoprotein reductase and electron transferase family. Listed proteins have structurally similar FAD (FMN)/NAD domains as compared with FscN, along with linked or dissociable electron carriers. The carrier can be either iron sulfur protein or a porphyrin binding domain. Abbreviations: FNR, ferredoxin-NADP reductase; b5R, cytochrome b5 reductase; MMO-C, methane monooxygenase; BenC, benzoate 1,2-dioxygenase reductase; NapR, NADH-ferredoxin NAP reductase; PDR, phthalate dioxygenase reductase; VanB, Demethylase oxido-reductase.

In the T. fusca gene cluster, FscN and FscP are separated by a single gene and

are part of the siderophore biosynthesis, transportation and utilization gene island. This

is a unique orientation as compared to most bacteria, where the siderophore gene

cluster contains one or the other of SIP/FSR. For example, in V. cholerae, the vibriobactin

utilization protein ViuB (SIP) is located next to the nonribosomal peptide synthetase

VibF, and an FSR-like 2Fe-2S protein is not identified in the organism149,150. In E. coli,

both SIP and FSR homologs can be found in the genome, both with characterized

function but neither is located within a single siderophore biosynthetic regulon.

Recently, DtxR has been identified as an iron-dependent transcriptional repressor that

regulates the expression of siderophore gene clusters in T. fusca127. It is possible that

the fuscachelin gene cluster is also regulated by the same gene under iron-deficient

conditions. Meanwhile, another iron-dependent transcriptional regulator in T. fusca Tf-

IdeR binds the toxPO sequence in the presence of a variety of divalent metal ions66.

Neither of these regulators, however, has been shown to affect transcription of genes in

the fuscachelin siderophore cluster. We have conducted initial experiments to assess a

potential FscN/FscP, protein/protein interaction (Figure 2-10). FscN and FscP were co-

expressed in a pACYDuet-1 vector with N-terminus His-tag and C-terminus S-tag

respectively. Ni-NTA resin elution fraction from the co-expression was analyzed with

SDS-PAGE and His/S-tag Western blots. The positive S-tag Western blot indicates the

prospective protein-protein interaction between FscP and FscN supporting a possible

synergistic in vivo interaction. Unfortunately, significant efforts to overexpress and purify

FscP for in vitro analyses and/or structure determination have been unsuccessful in our

hands likely due to the significant instability of the holo-iron/sulfur protein.

An intriguing and unexpected aspect of our FscN structure is the observation of a

bound metal adjacent to the flavin site. It is not clear if this is an artifact of our

purification/crystallization or a natural aspect of the function of FscN. A bound zinc ion

that plays a structural role in FscN function is reasonable, but we did observe a

surprisingly high molar ratio of nickel ion bound to FscN, even in the absence of metal

affinity purification and a hexa-His-tag. Octahedrally coordinated Zn2+ is rarely observed in

protein crystal structures151, whereas octahedral Ni2+ centers are more common. A

relationship exists between nickel ions and the E. coli SIP, YqjH. Transcription of the

yqjH gene is under control of the YqjI transcription factor, a nickel-responsive

transcriptional repressor82. This system and others demonstrate a correlation between

iron homeostasis and nickel concentration. There is no apparent yqjI homolog in T.

fusca and an interesting possibility is that nickel binding to FscN could play a role in the

regulation of enzyme activity. To date, the low in vitro reduction activity of FscN has

prevented us from testing this hypothesis. In addition, the three histidine residues are

not absolutely conserved in SIP members (see Figure 2-7), suggesting that the observed

metal binding is likely restricted to a subset of Gram-positive bacteria.

Figure 2-10. S-Tag and His-tag Western blot for FscP-FscN coexpression. FscN and FscP were coexpressed in pACYDuet-1 with an N-terminus His-tag and C-terminus S-tag respectively. Ni-NTA resin elution fractions from coexpression experiments were analyzed with SDS-PAGE and His/S-tag Western blots. The positive S-tag blot (middle line) indicates the potential protein-protein interaction between FscP and FscN. Degradation of the 2Fe/2S FscP is evident in the SDS-PAGE gel.

The chemistry and biology of iron acquisition by small molecule siderophores is a

complex and fascinating process involving metal coordination, protein/protein

interactions and redox chemistry. The specific intracellular fate of iron-bound

siderophore is among the least understood steps at the molecular level; there are

limited examples of direct biochemical demonstration of intracellular iron reduction and

transfer to cellular components. The presented structure and biochemical

characterization of T. fusca FscN provide insight into the role of the SIP family and its

interplay with siderophores and possibly other protein players, such as FSR. In addition,

unexpected structural features and the inability to observe predicted in vitro function

suggest that further investigation into the role of the SIP/FSR family is needed.

2.4 Experimental Procedures

2.4.1 General Methods

Unless otherwise stated, reagents and chemicals were purchased from Fisher

Scientific or Sigma-Aldrich.

2.4.2 Cloning, Expression, and Purification of FscN

The gene for fscN was amplified with PCR from T. fusca (ATCC 27730) genomic

DNA with the primers: FscN_Nterm (5’-GCG GGA TCC ATC ACC GCA ACC GTG), and

FscN_Cterm (5’-GCG AAG CTT CTA CTC GTC GTC ATC GTC). The PCR products

were purified through agarose gel-electrophoresis and cleaved with the corresponding

restriction endonucleases. The resultant fscN gene was then ligated into the expression

vector pET28a and transformed into E. coli BL21(DE3)pLysS cells for expression.

Cultures (1L) were grown to OD600 0.6 - 0.8 at 37°C, and overexpression was initiated

by the addition of 100 μM IPTG and growth continued for 16 hours at 16°C before

harvesting by centrifugation. Cell pellets were resuspended in 25 mL 20 mM Tris-HCl

pH 7.5, 0.5 M NaCl and lysed at 14,000 psi through a nitrogen-pressure microfluidizer

cell (M-110L Pneumatic). The lysate was clarified by centrifugation at 10,000 g for 40 min

at 4°C. FscN was purified using metal affinity chromatography (HisPur Ni-NTA Resin,

Thermo Scientific). After binding for 1 h, the resin was washed with 4 × 10 mL of 20 mM

Tris-HCl pH 7.5, 0.5 M NaCl, 25 mM imidazole, and the bound protein was eluted with 3

× 4 mL of wash buffer with 250 mM imidazole. The resulting yellow elution solution was

dialyzed against 1 L 20 mM Tris-HCl pH 7.5, 100 mM NaCl for 16 h and further purified

with HiTrapQ HP anion exchange column (column size 5 mL, GE Healthcare, AKTA

FPLC System) and HiLoad 16/60 SuperDex-200 gel filtration chromatography (GE

Healthcare). To rule out contaminating Ni2+ from Ni2+-NTA based purification, an

alternative protein purification strategy using ammonium sulfate precipitation (ASP) was

also used for ICP-MS analysis. Briefly, the cell pellet was resuspended and lysed as

described above and ammonium sulfate was added into the supernatant and the

concentration was increased stepwise, from 10%, 30%, 50%, to 80% saturation. For

each step of addition, ammonium sulfate was gently added to the tube over 5 min and

rocked for another 5 min to facilitate dissolution. When all of the ammonium sulfate was

dissolved, the tube was placed on ice for 30 min to allow further precipitation of protein

before harvesting the precipitate through centrifugation (3,000 g for 10 min, 4°C). The

precipitate was resuspended in 5 mL 20 mM Tris-HCl pH 7.5 and analyzed at each

stage, and the target protein was recovered in the 50% precipitate fraction. The protein

was dialyzed and further purified by anion-exchange and gel filtration chromatography

as described above for Ni-NTA purified protein.

2.4.3 FAD Determination, Reduction and Enzyme Kinetics

In order to determine the cofactor composition, purified FscN was denatured by

heating at 95°C for 20 min. After removing the precipitate by centrifugation, the

supernatant was analyzed by LC-MS (Agilent 6130, SB-C18 column) with a gradient of

2 - 98% CH3CN in aqueous 0.1% TFA over 17 min at a flow rate of 1 mL min-1.

Preparative HPLC (GraceVydac-C18) purification was carried out with a linear gradient

of 3 - 40% methanol in aqueous 0.1% TFA over 30 min at a flow rate of 10 mL min-1.

Detection wavelengths of 320/460 nm were chosen to monitor the elutions. The m/z 784

peak in LC-MS corresponds to FAD as the cofactor.
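The assignment of the m/z 784 peak can be checked against the monoisotopic mass of FAD. The sketch below assumes the standard molecular formula of FAD (C27H33N9O15P2) and singly charged ions; the ionization mode is not stated in the text, so reading the peak as a deprotonated ion is an assumption, and the function names are illustrative.

```python
# Monoisotopic masses (Da) of the most abundant isotope of each element.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
    "P": 30.97376200,
}
PROTON_MASS = 1.00727647  # Da

# Standard molecular formula of FAD (not taken from the text).
FAD_FORMULA = {"C": 27, "H": 33, "N": 9, "O": 15, "P": 2}

def monoisotopic_mass(formula):
    """Sum of monoisotopic atomic masses for an element-count mapping."""
    return sum(MONOISOTOPIC[element] * count for element, count in formula.items())

def mz(formula, charge):
    """m/z for [M+zH]z+ (charge > 0) or [M-zH]z- (charge < 0)."""
    mass = monoisotopic_mass(formula)
    return (mass + charge * PROTON_MASS) / abs(charge)

fad_mass = monoisotopic_mass(FAD_FORMULA)
fad_neg = mz(FAD_FORMULA, -1)  # ASSUMPTION: deprotonated ion
fad_pos = mz(FAD_FORMULA, +1)  # protonated ion, for comparison

print(f"FAD monoisotopic mass: {fad_mass:.3f} Da")
print(f"[M-H]-: {fad_neg:.3f}   [M+H]+: {fad_pos:.3f}")
```

Under these assumptions the deprotonated ion falls at a nominal m/z of 784, whereas the protonated ion would appear at a nominal m/z of 786.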

Reduction of the FAD cofactor in vitro was assayed as described previously80.

FscN at a concentration of 1.5 mg mL-1 in a 1 cm cuvette was reduced by addition of an

excess of sodium hydrosulfite, and the spectrum was subsequently recorded at 16 min

intervals (16, 32, 48, 64, and 80 min). The detection wavelength of 459 nm was

selected to determine FAD reduction.

The FscN catalyzed reduction assay was performed anaerobically in 100 mM

potassium phosphate buffer, pH 7.5 at 37°C. For substrates ferric-EDTA, ferric chloride,

ferric citrate, ferricyanide and ferric 2,3-dihydroxybenzoic acid, the enzyme

concentration was set at 10 μM with the NADH concentration from 0.5 - 2.0 mM. 0.6

mM of the ferrous indicator 3-(2-pyridyl)-5,6-bis(2-furyl)-1,2,4-triazine (ferene) was

added to monitor the formation of ferrous ion by measuring the absorbance at 595 nm.

For the substrates ferric enterobactin and ferric fuscachelin A, due to the self-absorbance

change of the Fe2+/3+-siderophore complex near 600 nm, reactions were monitored

through the concentration change of the reactant NADH at a detection wavelength of

340 nm. Final concentrations of enzyme, NADH and substrate were set at 10 μM, 0.1

mM and 0.2 mM, respectively, and the reaction rate was calculated using ε340(NADH) =

6220 M-1 cm-1.
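The conversion from an A340 slope to a specific activity follows from the Beer-Lambert law. The following is a minimal sketch under stated assumptions rather than the exact calculation used here: it assumes a 1 cm path length, that each NADH (a two-electron donor) reduces two ferric ions, and it uses hypothetical input values (slope, reaction volume, enzyme mass) chosen only to exercise the arithmetic.

```python
EPSILON_NADH_340 = 6220.0  # M^-1 cm^-1, extinction coefficient given in the text

def nadh_rate_molar_per_min(delta_a340_per_min, path_cm=1.0):
    """NADH consumption rate (M/min) from the A340 slope via Beer-Lambert."""
    return abs(delta_a340_per_min) / (EPSILON_NADH_340 * path_cm)

def specific_activity(delta_a340_per_min, volume_ml, enzyme_mg,
                      fe_per_nadh=2.0, path_cm=1.0):
    """Specific activity in nmol Fe2+ min^-1 mg^-1.

    ASSUMPTION: fe_per_nadh = 2, i.e. one NADH reduces two Fe3+ ions.
    """
    rate_molar = nadh_rate_molar_per_min(delta_a340_per_min, path_cm)
    nmol_nadh_per_min = rate_molar * (volume_ml / 1000.0) * 1e9
    return fe_per_nadh * nmol_nadh_per_min / enzyme_mg

# Hypothetical inputs: slope of -0.00622 A340/min, 1 mL reaction, 0.1 mg enzyme.
example_activity = specific_activity(-0.00622, volume_ml=1.0, enzyme_mg=0.1)
print(f"Specific activity: {example_activity:.1f} nmol Fe2+ min^-1 mg^-1")
```

If a different electron stoichiometry is appropriate, only the `fe_per_nadh` argument needs to change.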

2.4.4 Purification of the Ferric-Fuscachelin Complex

Apo fuscachelins A-C were produced from T. fusca culture as previously reported

with modifications and identified correspondingly152. Briefly, T. fusca spores (ATCC

27730) were cultured in 5 mL LB broth at 55°C and 150 rpm shaking overnight then

cells were fully exchanged into 5 mL of iron-deficient Hägerdal medium before

inoculating 1 L iron-deficient Hägerdal medium and growth continued for 7 d at 55°C.

Cell pellets were collected by centrifugation and extracted with methanol. The methanol

extracts were concentrated and subjected to preparative HPLC by using a Vydac

218TP1022 protein and peptide C18 column (250×22 mm, 10 μm). A linear gradient of

2-50% methanol in 0.1% TFA and water was run over 30 min at 10 mL min-1 and

fractions collected at 17.6, 18.2 and 19.1 min were confirmed to be fuscachelin C, B,

and A, respectively. Purified fuscachelin A was incubated with ferric ammonium sulfate

in a 1:10 ratio and the resulting ferric fuscachelin A complex was purified using

preparative HPLC with a linear gradient of 3 - 80% methanol in 5 mM NH4OAc over 25

min at 10 mL min-1. Apo-fuscachelin A with an elution time of 11.2 min was converted

quantitatively into holo-fuscachelin A with an elution time of 14.6 min.

2.4.5 Purification of Ferric-Enterobactin

For enterobactin production and purification, we applied a novel approach to

produce enterobactin by overexpressing the putative enterobactin exporter EntS gene in

E. coli. The entS gene was cloned from E. coli BL21(DE3) and ligated into pET16b for

overexpression. The EntS/pET16b plasmid was transformed into E. coli BL21(DE3) and

cell growth was performed in iron-deficient M9 media. Supernatant from 6 L of cultured

E. coli (grown to stationary phase) was lyophilized to 150 mL and extracted with equal

volumes of ethyl acetate four times and concentrated. The residue was dissolved in

methanol for further LC-MS analysis and HPLC purification153. LC-MS analysis was run

in a linear gradient of 2-98% acetonitrile in 0.1% trifluoroacetic acid /water over 15 min

at 0.2 mL min-1. Fractions eluted at 11.86, 12.91, 13.81 and 13.97 min have molecular

weights of 464.2, 687.3, 669.2 and 669.2 Da. Based on these masses and the UV-vis spectra, we assigned

the four fractions to be (DHB-Ser)2, (DHB-Ser)3, apo-enterobactin and holo-

enterobactin, respectively. The total methanol extract was mixed with 1 mM ferric

ammonium sulfate before preparative HPLC purification. A linear gradient of 10-50%

acetonitrile in 0.1% trifluoroacetic acid and water was run over 25 min at 8 mL min-1 and

a 50-98% acetonitrile gradient was then run from 25 to 30 min. Fractions collected at

26.0 min were confirmed to be holo-Fe+3-enterobactin by mass spec and UV-Vis

analyses.

2.4.6 Metal Analysis

The hexa-histidine tags of purified FscN from both nickel NTA affinity

chromatography and ammonium sulfate precipitation were removed using a standard

thrombin cleavage protocol. Briefly, 1 μL of thrombin (1 U L-1, Novagen) was mixed with

10 mg of protein in 1 mL thrombin cleavage buffer (20 mM Tris-HCl, 150 mM NaCl, 2.5

mM CaCl2, pH 8.4) and incubated for 16 hours at room temperature. Cleaved protein

was further purified using size exclusion chromatography (as described above). FscN

containing fractions (8 mL) were dialyzed against 1 L metal-free 0.1 M ammonium

acetate pH 6.5 for 16 h (pretreated with 5% w/v Chelex-100 Resin for 4 h) and diluted

with the same buffer to 10 mL (final concentration of 0.3 mg mL-1). Concentrated nitric

acid (trace metal grade, Sigma) was then added to a final concentration of 1%. A blank

10 mL reference was also prepared in parallel using identical procedures. Samples

were analyzed with inductively coupled plasma mass spectrometry (ICP-MS) at the

Center for Applied Isotope Studies, University of Georgia. Cu, Zn, Mn, Mg, Ni, Co and

Fe were analyzed for each sample along with the blank reference.

2.4.7 Isothermal Titration Calorimetry Binding Determination

FscN-siderophore binding affinities were determined using a MicroCal iTC200

isothermal titration calorimeter system. FscN was purified with Ni-NTA based

immobilized metal ion affinity chromatography followed by size exclusion

chromatography. Ferric-fuscachelin A was prepared from the isolated natural product

and purified to homogeneity154. 200 μL of 100 μM FscN in 10 mM HEPES-NaOH, 50

mM NaCl, pH 7.5 was placed in the ITC reaction cell and 40 μL Fe+3-fuscachelin A (1

mM) was titrated into the protein solution over time at 25°C. Heat change during the

reaction was detected and recorded for binding affinity calculation. All titrations were

performed in triplicate. The binding parameters and corresponding errors were calculated

using the Origin software package155.
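Two quantities implied by this titration setup can be worked out directly: the final ligand-to-protein molar ratio in the cell and the binding free energy corresponding to a given affinity. The sketch below is illustrative and rests on assumptions: it treats the 30 μM affinity reported in Section 2.3 as a dissociation constant with a 1 M standard state, ignores volume displaced from the cell during injection, and uses hypothetical function names.

```python
import math

GAS_CONSTANT = 8.314462618  # J mol^-1 K^-1

def final_molar_ratio(ligand_ul, ligand_um, protein_ul, protein_um):
    """Ligand:protein molar ratio after the full titration.

    ASSUMPTION: displaced cell volume is ignored.
    """
    return (ligand_ul * ligand_um) / (protein_ul * protein_um)

def binding_free_energy_kj(kd_molar, temperature_k=298.15):
    """Standard binding free energy (kJ/mol): dG = R * T * ln(Kd / 1 M)."""
    return GAS_CONSTANT * temperature_k * math.log(kd_molar) / 1000.0

# Volumes and concentrations from the text: 40 uL of 1 mM ligand
# titrated into 200 uL of 100 uM FscN at 25 C.
ratio = final_molar_ratio(ligand_ul=40, ligand_um=1000,
                          protein_ul=200, protein_um=100)

# ASSUMPTION: the reported 30 uM binding affinity is a dissociation constant.
delta_g = binding_free_energy_kj(30e-6)

print(f"Final ligand:protein ratio: {ratio:.1f}")
print(f"Binding free energy: {delta_g:.1f} kJ/mol")
```

The enthalpic and entropic contributions cannot be derived this way; they come from fitting the measured heats.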

2.4.8 Coexpression of FscN-FscP and Western Blot Analysis

FscN-FscP coexpression was carried out using the pACYDuet-1 vector. The fscN

gene was ligated into the multiple cloning site 1 (MCS1) with an N-terminal hexa-

histidine tag and the fscP gene was cloned from genomic DNA (primers FscP_Duet_N

and FscP_w/oSDuet_C, Table 2-3) and inserted into the MCS2 site with a C-terminal S-

tag. Cultures were grown to OD600 0.6 at 37°C and induced at 16°C overnight with 250

μM IPTG. Proteins were purified with IMAC as described previously and the elution

fraction was separated by SDS-PAGE. His-tag Western blot and S-tag assays were

carried out by using standard techniques. The proteins were transferred onto

polyvinylidene difluoride (PVDF) membranes (Bio-Rad) for His-tag Western blot using a

poly-histidine monoclonal antibody (Sigma) as the primary antibody and goat anti-rabbit

IgG-HRP as secondary antibody. Western blot signal was developed with SuperSignal

West Pico Chemiluminescent Substrate (Thermo Forma) and detected on KODAK RP

X-OMAT film. Nitrocellulose membrane (S∙Tag AP Western Blot Kit, Novagen) was

used in S-tag Western blot. The transferred membrane was incubated in TBST + 1%

gelatin (TBST: 10 mM Tris-HCl, pH 8.0, 150 mM NaCl, 0.1% Tween-20) for 15 min to

block excess protein binding sites. The membrane was further incubated in a 1:5,000

dilution of S-protein alkaline phosphatase conjugate in TBST for 15 min at room temperature

and developed with NBT + BCIP in alkaline phosphatase buffer. All Western blot

experiments were repeated twice to confirm the results.

Table 2-3. PCR primers used for FscN related cloning

Primer Vector Sequence

FscN_N pET28a/pACYDuet GCGGGATCCATCACCGCAACCGTGACG

FscN_C pET28a/pACYDuet GCGAAGCTTCTACTCGTCGTCATCGTC

FscP_Duet_N pACYDuet GCGCATATGATGACCCGTCAGTGTC

FscP_w/oSDuet_C pACYDuet GCGGATATCGTTGCGCAGCCACTC

C-EntS-Nhe1 pET16b GCGGCTAGCACTGTCGGACGCTGT

N-EntS-NcoI pET16b GCGCCATGGATAAACAATCCTGGC
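
Each primer in Table 2-3 begins with a GCG clamp followed by a six-base palindrome. As an illustrative check, the short Python sketch below scans the primers for a few common restriction enzyme recognition sequences. The assignments to BamHI, HindIII, NdeI and EcoRV are inferred here from the standard recognition sequences and are not stated in the text; only NheI and NcoI are named in the primer labels. The dictionary and function names are hypothetical.

```python
# Standard recognition sequences for six common restriction enzymes.
# Which enzymes were actually used for cloning is an assumption of this sketch.
SITES = {
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
    "NdeI": "CATATG",
    "EcoRV": "GATATC",
    "NheI": "GCTAGC",
    "NcoI": "CCATGG",
}

# Primer sequences copied from Table 2-3.
PRIMERS = {
    "FscN_N": "GCGGGATCCATCACCGCAACCGTGACG",
    "FscN_C": "GCGAAGCTTCTACTCGTCGTCATCGTC",
    "FscP_Duet_N": "GCGCATATGATGACCCGTCAGTGTC",
    "FscP_w/oSDuet_C": "GCGGATATCGTTGCGCAGCCACTC",
    "C-EntS-Nhe1": "GCGGCTAGCACTGTCGGACGCTGT",
    "N-EntS-NcoI": "GCGCCATGGATAAACAATCCTGGC",
}


def find_sites(sequence):
    """Return {enzyme: 0-based position} for every site found in the primer."""
    sequence = sequence.upper()
    return {
        enzyme: sequence.find(site)
        for enzyme, site in SITES.items()
        if site in sequence
    }


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, find_sites(seq))
```

With these six recognition sequences, each primer reports a single site directly after the three-base clamp.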

2.4.9 Crystallography, Structure Determination, and Refinement

Initial crystal screening for FscN was performed in vapor diffusion sitting drop

format with commercial matrix screens (Hampton Research). Optimization of initial

conditions gave rod-shaped crystals with approximate dimensions of 400 × 100 × 100 µm

using a reservoir solution of 0.95 M ammonium sulfate, 0.1 M Tris-HCl pH 6.6 and 35% v/v

2-methyl-2,4-pentanediol (hanging drop format, room temperature). Typically, crystals

appeared in 4 weeks and grew to maximum dimensions in 8-10 weeks. Crystals

suitable in size were harvested and frozen in liquid nitrogen using additional 30%

glycerol as a cryoprotectant. Diffraction data were collected at the PXRR X25 beamline

of the National Synchrotron Light Source at Brookhaven National Laboratory and processed with

XDS156 to a resolution of 1.89 Å in space group C2 (Table 2-1). The asymmetric unit

contains two molecules of FscN. The deposited structure of the SIP protein from S. putrefaciens (PDB code

2GPJ, 32% sequence identity) was used as the starting model for molecular replacement

in PHASER157. Refinement was performed in PHENIX.REFINE158 in PHENIX 1.7.1159,

and model building was performed using COOT160. Based on the refined model, the

sigma-A-weighted, simulated annealing composite omit maps and anomalous difference electron

density maps were calculated with PHENIX.MAPS159. Graphical representations were

prepared with PyMOL161. Several independent PCR amplifications and sequencing runs confirmed that the

construct of fscN from T. fusca ATCC 27730 deviates from the published genome of T.

fusca strain YX at one codon: position 99 is an Arg (AGA) in ATCC 27730 but a Gly

(GGA) in T. fusca strain YX. Arg99 is located on the surface of the resolved structure of

FscN and is outside the active site.
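
Whether a single-codon difference between strains changes the protein can be checked against the standard genetic code. The following Python sketch is an illustration only: it builds the standard codon table and classifies a codon pair as synonymous or nonsynonymous. The function name `classify_codon_change` is hypothetical.

```python
# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def classify_codon_change(reference, variant):
    """Return (reference residue, variant residue, kind) in one-letter code."""
    ref_aa = CODON_TABLE[reference.upper()]
    var_aa = CODON_TABLE[variant.upper()]
    kind = "synonymous" if ref_aa == var_aa else "nonsynonymous"
    return ref_aa, var_aa, kind


if __name__ == "__main__":
    # Strain YX codon (GGA) versus the ATCC 27730 codon (AGA) at position 99.
    print(classify_codon_change("GGA", "AGA"))
```

For the position 99 codons, the table gives Gly for GGA and Arg for AGA, in agreement with the assignment above; a silent change such as TCC to TCT would be reported as synonymous.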

CHAPTER 3

STRUCTURE AND FUNCTIONAL ANALYSIS OF THE SIDEROPHORE PERIPLASMIC BINDING PROTEIN FROM THE FUSCACHELIN GENE CLUSTER OF

Thermobifida fusca1

3.1 Periplasmic Binding Protein Facilitates Siderophore Delivery

As noted in Chapter 1, iron is an essential cofactor required by the majority of

organisms, functioning as a global regulator for many cellular, metabolic, and

biosynthetic processes2,10. Importantly, siderophore-based iron acquisition is often

critical for microbial survival and virulence1. Holo-siderophore uptake across the cellular

membrane(s) is similar in both Gram-positive and Gram-negative bacteria (see Figure

1-1). For Gram-positive bacteria, the FepBCDG-like multifunctional ABC-transporter

systems acquire and deliver the holo-siderophore through a siderophore-shuttle iron-

exchange mechanism71,72 with the assistance of a FepB-like periplasmic binding protein

(PBP)162.

Most siderophore PBPs are type III PBPs67. Unlike type I and II PBPs, which

undergo significant domain transition during substrate interaction, type III PBPs have a

relatively rigid α-helix hinge between the two domains that controls the domain movement

between ‘open’/‘closed’ states67. Type III PBPs accept a wide range of substrates,

including free ions, ferric-siderophores, and ferric-hemes162. Aside from its natively

produced ligand, a particular PBP can also accept alternative ligands in vivo. The E. coli

PBP FhuD recognizes gallichrome as well as other structurally similar hydroxamate-

type siderophores, including the antibiotic albomycin163,164. To date, several type III

1 Adapted with permission (License Number 3779401197741) from Li, K. & Bruner, S. D. Structure and

functional analysis of the siderophore periplasmic binding protein from the fuscachelin gene cluster of Thermobifida fusca. Proteins Struct. Funct. Bioinforma. 84, 118-128 (2016). DOI: 10.1002/prot.24959. Copyright © 2015 Wiley Periodicals, Inc.

siderophore PBP structures have been solved through X-ray crystallography, illustrating

‘open’ and ‘closed’ conformations relevant to the substrate binding mechanism. For

example, ligand free FitE from E. coli has been identified in two different conformations

from the same crystallization condition and comparison of these two states shows a

change in the size of its binding pocket165. Likewise, the S. aureus siderophore receptor

HtsA undergoes a localized conformational shift upon staphyloferrin recognition166.

Additionally, recent NMR studies have provided insights into the structural dynamics of

the E. coli FepB in the solution phase68. Although most PBPs share a similar function

and have a relatively conserved general domain arrangement, the sequence diversity

within the superfamily is high and most solved PBP structures differ significantly

in structural fold from one another, which makes it difficult to establish a

unified PBP-siderophore interaction mechanism. Moreover, most reported PBPs accept

only a single type of siderophore (hydroxamate or catecholate). There is little information

about mixed-type siderophore-PBP interactions.

Here I present the structure of FscJ, a type III periplasmic binding protein, and its

interaction with a mixed-type siderophore, ferric-fuscachelin from T. fusca. The

siderophore fuscachelin is a secondary metabolite produced by a nonribosomal peptide

synthetase (NRPS) gene124,152 and is a catecholate/hydroxamate type siderophore with

positively charged D-arginine subunits. The fscJ gene is located in the fuscachelin

biosynthetic gene cluster and encodes the siderophore PBP, predicted to deliver

ferric-fuscachelin A to the ABC transporter system137. To gain insight into the complex

recognition and transportation mechanisms of the unique siderophore fuscachelin, we

overexpressed FscJ and determined its structure through X-ray crystallography. Several

unique structures detail ligand-free conformational changes at different pH values,

indicating complex interdomain flexibility of the siderophore receptors. Additionally,

measured binding affinity of ligands and molecular docking provide valuable insight into

FscJ siderophore recognition mechanisms.

3.2 Characterization of FscJ as a Siderophore Binding Protein

3.2.1 FscJ and Fuscachelin Delivery

FscJ is a representative of the broad family of lipid-anchored periplasmic binding

proteins, key players in siderophore-mediated iron acquisition. The overall process of

siderophore-mediated iron acquisition is complex and most of the common protein

players are present in the T. fusca fuscachelin gene cluster. No gene in the fuscachelin

gene island encodes the ABC transporter. However, the cluster tfu_0336-0338

in the T. fusca genome is a likely candidate, with high homology to the E. coli

FepCDG/FhuCB machinery126. As is common in the PBP family, FscJ has a predicted

N-terminal transmembrane α-helix (residues 46-64)167. Our expression of full-length FscJ

resulted in insoluble protein that was incorporated into the E. coli membrane (Figure 3-1). In

order to improve the heterologous expression level and produce soluble protein, the

fscJ gene was engineered to exclude the transmembrane and lipidation sites (residues 1-78).

Eight additional N-terminal residues (79-86) were also omitted due to predicted

disorder139. Purified His6-FscJ has a molecular weight of 35.6 kDa as measured by

MALDI-TOF mass spectrometry, but appears as a dimer on SDS-PAGE and in

size-exclusion chromatography (Figure 3-1).

Table 3-1. FscJ crystallization data collection and processing

Crystals                  FscJ_SeMet               FscJ_P41                 FscJ_P21                 FscJ_I222
Wavelength (Å)            0.9786                   0.9786                   0.9786                   0.9786
Space group               I222                     P41                      P21                      I222
a, b, c (Å)               102.5, 107.0, 136.0      73.0, 73.0, 135.3        64.41, 135.6, 80.9       106.6, 107.1, 136.3
α, β, γ (°)               90, 90, 90               90, 90, 90               90, 91.98, 90            90, 90, 90
Mosaicity (°)             0.32                     0.18                     0.19                     0.18
Resolution (Å)            41.76-3.20 (3.31-3.20)*  38.36-2.44 (2.53-2.44)   46.53-2.84 (2.94-2.84)   37.78-2.59 (2.68-2.59)
Total reflections         99379 (9626)             162648 (15443)           166896 (15936)           184378 (18691)
Unique reflections        12647 (1223)             26090 (2528)             32606 (3174)             24621 (2441)
Completeness (%)          99.7 (99.0)              99.6 (97.2)              99.7 (98.0)              99.9 (99.8)
Multiplicity              7.9 (7.9)                6.2 (6.1)                5.1 (5.0)                7.5 (7.7)
<I/σ(I)>                  18.35 (3.09)             13.5 (3.3)               13.7 (2.2)               25.2 (3.8)
Wilson B factor (Å2)      79.7                     17.0                     12.6                     33.4
Rmerge (%)                0.114 (0.92)             0.107 (0.82)             0.147 (0.78)             0.049 (0.57)
Rmeas (%)                 0.126                    0.118                    0.166                    0.053
CC1/2                     0.998 (0.895)            0.997 (0.939)            0.992 (0.793)            1.000 (0.995)
Rwork (%)                 -                        0.173 (0.230)            0.1823 (0.267)           0.230 (0.282)
Rfree (%)                 -                        0.2267 (0.330)           0.247 (0.319)            0.263 (0.306)
No. of atoms              -                        4411                     8547                     4282
  protein                 -                        4325                     8524                     4247
  water                   -                        86                       23                       35
Protein residues          -                        556                      1103                     548
RMS deviations
  bond lengths (Å)        -                        0.008                    0.012                    0.004
  bond angles (º)         -                        1.01                     1.39                     0.74
Ramachandran
  favored (%)             -                        97                       93                       96
  outliers (%)            -                        0                        0.64                     0.19
Clashscore                -                        1.78                     4.16                     2.66
Average B-factors         -                        45.80                    40.10                    77.00
  protein (Å2)            -                        45.90                    40.10                    77.10
  water (Å2)              -                        42.20                    36.20                    66.10
PDB entry                 -                        5DH0                     5DH1                     5GH2

*Values for the outer shell are given in parentheses

Figure 3-1. FscJ purification and characterization. A) Size-exclusion column purification of FscJ (with hexa-Histag). The FscJ elution time is similar to that of proteins with a molecular weight of ~60 kDa; B) SDS-PAGE of purified FscJ with/without hexa-Histag (left/right lane) with an observed molecular weight of 43-56 kDa; C) MALDI-TOF-MS analysis of FscJ (with hexa-Histag) provides a single mass of 35.6 kDa; and D) SDS-PAGE and Western blot analysis of full-length FscJ.

3.2.2 FscJ Structure Determination with X-ray Crystallography

The FscJ structure was determined with SAD phasing. MR-SAD greatly facilitated the

final atomic coordinate determination. Native FscJ protein crystals diffracted in

multiple space groups (I222, P21 and P41) with resolutions between 2.4 and 2.8 Å

(Table 3-1). The I222 crystal form was obtained exclusively at pH 4.6 and P21

crystals at pH 5.6, while P41 crystals were harvested at either pH 6.5 or 7.5.

The overall protein has a conserved two-domain structure (Figure 3-2A). The

N-terminal domain of the FscJ monomer has a three-stranded parallel β-sheet sandwiched

by α-helices α4 and α5. Meanwhile, the C-terminal domain contains a five-stranded

β-sheet surrounded by α-helices on the protein surface. The two domains are connected

by a 40 Å long helix, α6, spanning the whole molecule. A large cleft is formed between the two

domains with an interface surface of 750 Å2 (PISA server)130.

Figure 3-2. Overall structure of FscJ. A) The FscJ dimer (two orientations) present in the asymmetric unit; each monomer contains two domains (N-terminal in blue and C-terminal in yellow) connected by an α-helical hinge (green). One monomer is shown in ribbon format and the other in surface representation. A fragment of the hexa-Histag is shown at the interface of one monomer. B) FscJ adopts two different overall conformations when crystallized at different pH values, the F1 (I222) form in green and the F2 (P41) form in grey; C) Cα-Cα difference map between the F1 and F2 forms (calculated with Matlab); the absolute value of the difference between the two matrices is displayed. The arrows indicate the center of motion.

FscJ forms a dimer in the crystallographic asymmetric unit. The surface of the

interdomain cleft is positioned close to the neighbouring α’6 hinge. A fragment of the

hexa-Histag is present at the interface between protein molecules, contributing to the

dimerization (Figure 3-2A). Hydrogen bonds between Gln217 and His61 (of the

hexa-Histag), Glu221 and His61, and His61 and the neighbouring Glu’249 largely stabilize the

intermolecular interactions. The two monomers within each of the I222 and P41 structures are

very similar, with all-atom root-mean-square deviation (RMSD) values of 0.18 Å

(I222) and 0.26 Å (P41), respectively. Meanwhile, the RMSD between monomers in the I222

and P41 crystal forms is 0.86 Å. Interestingly, the P21 crystal form appears to be in an

intermediate state, with two of the four monomers similar to the I222 form and the other two

similar to the P41 form. The following discussion and figures focus on

monomer A in the I222 and P41 structures and refer to them as Form 1 (F1) and Form 2

(F2), respectively.

Figure 3-3. Siderophore binding pockets in the SBP family (solvent accessible surface representations). A) The FscJ binding pocket spanning the cleft between the N- and C-domains. The conserved Arg164 on β2α4 is shown in blue, along with negatively charged amino acids proposed to interact with fuscachelin in red; B) structure of fuscachelin A; C) YfiY surface with siderophore schizokinen in the binding pocket; D) SirA with staphyloferrin B.

The FscJ siderophore binding pocket is located at the top of the central cleft

between the N-domain α2 and C-domain β7α11 loop region (Figure 3-2A). Residues

lining the pocket primarily contain hydrophobic side chains, with one basic residue

(Arg164) located in the pocket center and three acidic residues (Glu279, Glu302 and

Asp310) at the side (Figure 3-3A). As in most other class III/α-helix bridged SBPs,

the N- and C-domains of FscJ overlay well in the F1 and F2 structures, while the size of

the binding pocket is altered by movement of the C-domain. FscJ structural

changes between F1 and F2 were calculated quantitatively using a Cα-Cα (backbone

alpha carbon) difference distance map (Figure 3-2C). Between F1 and F2, there is no

observable Cα movement within the N-domain (91-206) or C-domain (236-361) alone.

Meanwhile, distances between N-domain and C-domain Cα atoms change by up to 3.4 Å.

From the Cα-Cα difference distance map, FscJ domain movement is centered at

residues 222-228, which is consistent with the DynDom server calculation168. The

relatively ‘closed’ form, F1, has a hinged-motion rotation angle of 5.5° compared with the

‘open’ form F2 (Figure 3-2B). The F1 loop region between C-domain β4 and β5 is shifted 3 Å

towards the N-domain. There is also a 2 Å shift of F1 Phe229 on the α9β6 loop towards the

middle substrate-binding cleft. Comparing the F1 and F2 structures, the minimum distance between

Arg164 (N-domain β2α4 loop) and Phe281 changes by 3 Å, with the residues shifting closer to the middle binding cleft.

Trp258 on the C-domain β5α8 loop also moves 3 Å closer to Arg164 in the F1 structure.
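
A Cα-Cα difference distance map compares two conformations without superimposing them: the pairwise Cα distance matrix of each form is computed, and the absolute difference is taken element by element. The map in Figure 3-2C was calculated with Matlab; the Python sketch below only illustrates the same idea on hypothetical toy coordinates, and the function names are not from the original analysis.

```python
import math


def distance_matrix(coords):
    """Pairwise Euclidean distances between Cα coordinates (x, y, z) in Å."""
    return [[math.dist(a, b) for b in coords] for a in coords]


def difference_distance_map(coords_1, coords_2):
    """Absolute element-wise difference of two Cα-Cα distance matrices."""
    if len(coords_1) != len(coords_2):
        raise ValueError("both forms must contain the same residues")
    d1 = distance_matrix(coords_1)
    d2 = distance_matrix(coords_2)
    return [
        [abs(x - y) for x, y in zip(row_1, row_2)]
        for row_1, row_2 in zip(d1, d2)
    ]


if __name__ == "__main__":
    # Toy model: residues 0-1 form one rigid "domain", residues 2-3 another.
    # In the second form the second domain has moved 3 Å towards the first.
    open_form = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (0.0, 10.0, 0.0), (3.8, 10.0, 0.0)]
    closed_form = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (0.0, 7.0, 0.0), (3.8, 7.0, 0.0)]
    for row in difference_distance_map(open_form, closed_form):
        print(["%.2f" % value for value in row])
```

Entries within a rigid domain stay at zero, while entries linking the two domains are non-zero, which is the pattern used above to separate intradomain from interdomain movement.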

3.2.3 FscJ Ferric-Siderophores Interaction

ITC supports high-affinity binding of FscJ to ferric-fuscachelin A. FscJ was

titrated against ferric-fuscachelin A and the overall binding was modeled as an

enthalpy-driven process with ΔH = -4.0 kcal mol-1 and a calculated dissociation constant of 5.4 μM

(Figure 3-4A). Mutation of the critical negatively charged amino acids (E279A/N280A or

E302A/D310A) near the central cleft greatly decreased the ability of FscJ to bind

ferric-fuscachelin A (Figure 3-4B, C). Other ferric complexes, including ferric-EDTA,

ferric-citrate, ferric-2,3-dihydroxybenzoic acid (DHB) and ferric-enterobactin, were also tested

as putative substrates of FscJ. However, no significant binding affinity was observed.
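
The reported dissociation constant and enthalpy can be related through the standard expressions ΔG = RT ln(Kd) and ΔG = ΔH - TΔS. The Python sketch below is a worked illustration rather than part of the original analysis: it assumes a 1 M standard state and the 25°C titration temperature, and the function name is hypothetical.

```python
import math

GAS_CONSTANT_KCAL = 1.987204e-3  # kcal mol^-1 K^-1


def binding_thermodynamics(kd_molar, delta_h_kcal, temperature_c=25.0):
    """Return (ΔG, -TΔS) in kcal/mol from Kd (in M) and ΔH (in kcal/mol)."""
    temperature_k = temperature_c + 273.15
    delta_g = GAS_CONSTANT_KCAL * temperature_k * math.log(kd_molar)
    minus_t_delta_s = delta_g - delta_h_kcal  # from ΔG = ΔH - TΔS
    return delta_g, minus_t_delta_s


if __name__ == "__main__":
    # Values taken from the text: Kd = 5.4 μM, ΔH = -4.0 kcal/mol.
    dg, mtds = binding_thermodynamics(5.4e-6, -4.0)
    print(f"ΔG = {dg:.2f} kcal/mol, -TΔS = {mtds:.2f} kcal/mol")
```

Under these assumptions ΔG is approximately -7.2 kcal mol-1 and -TΔS approximately -3.2 kcal mol-1, so the enthalpic term is the larger of the two contributions.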

Figure 3-4. ITC binding analysis of FscJ (hexa-Histag free) against purified ferric-fuscachelin A complex. A) Wild type FscJ; B) E279A/N280A mutant and, C) E302A/D310A mutant.

Table 3-2. Protein structure comparison using the Dali server. FscJ monomer (P41-A) was used as the search model and top results are shown according to the Z-score and alignment root-mean-square deviation (RMSD, Å)

pdb Z-score RMSD lali Res. %ID Siderophore (Type)

3tny-A 23.0 3.4 251 280 28 Schizokinen (Hydroxamate)

4fkm-A 21.3 3.4 247 256 19 Ferrioxamine-B (Hydroxamate)

4b8y-A 20.9 3.7 250 277 22 (Hydroxamate)

3lhs-A 20.7 3.9 258 291 21 Staphyloferrin A (Hydroxamate)

3mwf-A 20.7 3.8 250 292 26 Staphyloferrin B (Hydroxamate)

3be5-A 20.7 3.8 255 294 24 (Unknown)

3eiw-A 20.6 4.0 260 292 20 Staphyloferrin A (Hydroxamate)

3.3 Structural Features and Siderophore Interaction Mechanism

3.3.1 FscJ Structural Comparison with Other Type III PBPs

FscJ shares low structural homology with other substrate binding proteins in this

large protein family. A structure homology search (Dali server) shows that the top ten

most similar proteins have 17%-28% sequence identity and an RMSD of 3.4 Å or higher

relative to FscJ (Table 3-2)131. Among the top structures, a number bind hydroxamate-type

siderophore ligands, including YfiY from B. cereus (PDB entry 3TNY, sequence

identity 28%)169 and SirA from S. aureus (PDB entry 3MWF, 26%)170. Heme-binding

proteins also share topological similarities with FscJ, with a common N-domain and a

similar large cleft located at the domain interface; among these are IsdE from S. aureus

(PDB entry 2Q8Q)171 and ShuT from S. dysenteriae (PDB entry 2R7A)172. In contrast, FscJ

does not appear similar to free ion binding proteins belonging to the type III PBP

family173,174. In order to confirm a lack of free metal binding, we employed single ion

soaking and no observable anomalous signal was present in crystals soaked with ferric

chloride, while diffraction data from an Hg derivative showed only an outer surface

interaction (HgI - His168). In neither experiment was there a resolved metal located in or

near the middle cleft binding pocket. Overall, FscJ shows the highest similarity to

uncharacterized proteins from N. alba and O. turbata with sequence identity 65% and

41% respectively, while other sequences in the current databases share an identity of

35% or lower (Figure 3-5). Interestingly, compared with other homologs, FscJ has an

extended N-terminus with an extra ~40 amino acids predicted to lie on the intracellular side of the

transmembrane helix. Similar to the PBP TM0322 from Thermotoga maritima175, soluble

FscJ forms a strong dimer in solution. The stable dimers are hexa-Histag independent,

and the physiological role of FscJ assembly is unknown. We postulate that this may be

related to the in situ substrate-binding/ABC transporter system, which could possess

two substrate-binding sites per functional complex176.

Figure 3-5. FscJ sequence alignments with ClustalW2. The top two non-redundant sequences (below 80% sequence identity) and the top two sequences with structural information are shown aligned. FscJ residues are numbered as in the full-length protein. Residues highlighted in yellow contain the Arg facing the ligand binding pocket, while residues highlighted in red are the negatively charged amino acids located on the edge of the N-/C-domain cleft. Residues highlighted in blue are the hinge regions. Abbreviations: Tfu, Thermobifida fusca; Nal, Nocardiopsis alba; Otu, Oerskovia turbata; Bce, Bacillus cereus; Sau, Staphylococcus aureus.

3.3.2 pH-Dependent Dynamics Observed in Crystal Structures

We observed multiple forms of ligand-free FscJ under similar crystallization

conditions but at different pH values. The FscJ conformations may represent the ‘open’ and

‘closed’ states of the PBP, with a hinged motion of 5.5° between the F1 and F2 forms (Figure

3-2B). The binding pocket of FscJ tends to ‘open’ for ligand binding as the pH

increases. This observation could be relevant to iron acquisition, as the free ferric ion

concentration usually correlates negatively with the environmental pH. FscJ is located

on the outer surface of the cell membrane of Gram-positive bacteria and is in direct contact with the

environment. As a consequence, it may be favourable for FscJ to be in an open

ligand-binding conformation over an extended pH range. The domain rotation region is located

between amino acids 224-228, at the end of the long hinge α6. The hinge α6 consists

primarily of highly charged amino acid residues with negative Glu residues pointing

outward to bulk solvent (Figure 3-5, boxed in green). This charge arrangement may help

to immobilize the hinge region, especially the portion predicted to be closest to the cell

membrane177.

3.3.3 FscJ Has a Large Binding Pocket with Unique Charge Arrangement

Extensive efforts were made to cocrystallize various ligands, including

ferric-fuscachelin A, with FscJ with and without a hexa-Histag. Unfortunately, we could only obtain

diffracting crystals with tagged FscJ, and the structure shows that the poly-histidine is bound

in the ligand pocket. Additionally, soaking of FscJ crystals with ligands failed to yield any

diffraction-quality crystals. As all the prospective ligands are heavily charged and

relatively large, the protein-ligand interaction likely disrupts the crystallographic

packing, resulting in non-isomorphous crystals. ITC confirmed that FscJ binds

ferric-fuscachelin A with an affinity of 5.4 μM. Fuscachelin A is the product of the NRPS

biosynthesis pathway located in the same gene cluster as fscJ and is likely the

primary substrate for FscJ. Among structurally characterized proteins, FscJ shares the highest sequence identity with those that

transport six-coordinated hydroxamate type siderophores (YfiY and SirA). A structural

comparison of these and other homologues indicates that siderophores are bound in a

similar binding pocket (Figure 3-3). One prominent, common feature is a highly

conserved arginine (Arg164 in FscJ) located on the flexible loop between β2 and α4.

The residue is close to the ligated ferric ion and could help to balance the negative

charges of hydroxamate type ferric-siderophores. Compared with YfiY or SirA, FscJ has a

wider binding pocket, with the N-terminal α2β2α4 and C-terminal β4β5β6 elements moved

outward (Figure 3-3). The larger pocket can accommodate ferric-fuscachelin A, a

relatively large siderophore. Meanwhile, Glu279, Glu302 and Asp310 on flexible loops

are conserved in the uncharacterized proteins from N. alba and O. turbata, which could bind

siderophores similar to the fuscachelins, but appear in neither YfiY nor SirA. The

negatively charged glutamic/aspartic acids located on the edge of the siderophore binding

pocket are proposed to participate in stabilizing the positively charged D-arginine groups of the

ferric-fuscachelin A complex. In contrast, no negatively charged amino acid residues were

identified near the YfiY/SirA substrate binding pockets, and neither YfiY nor SirA interacts with

siderophores with large arginine subunits.

Docking attempts suggest that the siderophore binding occurs in the middle cleft,

as expected (Figure 3-6). Positive charges on the siderophore D-Arg subunits are stabilized

through electrostatic/hydrogen bond interactions between Glu302/D-Arg1 (FuscA),

Asp310/D-Arg1 and Glu279/D-Arg2 (FuscA). The proposed binding site is surrounded

by hydrophobic residues including Phe281, Trp258, Phe244, Val246 and Ala282.

Arg164 on the N-domain faces the center of the pocket, helping to stabilize the negative

charge on the siderophore N-hydroxycarboxamino group. Overall, the

ferric-fuscachelin/FscJ interaction has a surface area of 1080 Å2 and a predicted binding

constant of 6.6 μM, consistent with our measured binding affinity (5.4 μM). Two sets of

mutations were designed to confirm the critical electrostatic protein-ligand

interactions observed in the docking experiments. Mutant 1 contains E279A/N280A to

probe residues proposed to interact with D-Arg2 (Figure 3-6); mutant 2 (E302A/D310A)

examines side chains that stabilize D-Arg1. Both mutants have a significantly

decreased binding affinity for ferric-fuscachelin A, which further

confirms the importance of these negatively charged amino acids located on the edge

of the FscJ binding pocket (Figure 3-4).

Figure 3-6. Molecular docking of FscJ with ferric-fuscachelin A. The positively charged siderophore binds in the middle cleft and forms specific Glu302/D-Arg1 (FuscA, orange), Asp310/D-Arg1 and Glu279/D-Arg2 (FuscA, blue) interactions.

3.3.4 Conclusions and Insights

The chemistry and biology of metal-binding siderophores are critical for microbial

survival and virulence. The ABC transporter system with its substrate specificity

commonly serves as the first ‘gate’ in the iron acquisition process. The siderophore

fuscachelin A of the moderate thermophile T. fusca is a unique siderophore with

uncommon molecular architecture. We are exploiting the production and utilization of

fuscachelin as a model system to study the specific iron acquisition process in

Gram-positive bacteria. The presented structure and biochemical characterization of the type III

SBP FscJ provide insight into the complex molecular mechanisms of siderophore

recognition. In common with other family members, FscJ contains a large cleft for

fuscachelin recognition. The multiple observed conformations of FscJ provide insight into a

dynamic substrate binding mechanism. FscJ utilizes unique binding features,

including the recognition of a positively charged siderophore and a highly negatively

charged hinge region facilitating subdomain movement. Additionally, FscJ contains an

unusually long mobile N-terminus that could be exploited in efforts to engineer

SBPs as biosensors178. Overall, the structure of FscJ helps to explain the siderophore

recognition mechanism in Gram-positive bacteria and provides unique insights that could

further facilitate the development of drugs delivered through the iron acquisition

system for the treatment of pathogens88.

3.4 Materials and Methods

3.4.1 Cloning, Expression and Purification of FscJ

The gene for fscJ was amplified by using PCR from T. fusca genomic DNA with

the primers fscJ_N87 (5’-GCG GGATCC ACCGTCGAGATCCCCGCT), and fscJ_Cstop

(5’-GCG AAGCTT TCATAGGCCCTTGAGGAC) with the N-terminal secretion signal and

lipidation site excluded (residues 1-86). The PCR product was treated with the

corresponding endonucleases and then ligated into the expression vector pET28a.

Sequencing of several independent clones indicated that our construct of fscJ from T.

fusca ATCC 27730 deviates from the published genome of T. fusca strain YX at one

codon without changing the encoded amino acid: TCT (Ser82) in ATCC 27730 versus

TCC in T. fusca strain YX. The plasmid was transformed into E. coli BL21(DE3)pLysS

cells for overexpression. Cultures (Luria Broth, 1 L) were grown to OD600 = 0.8 at 37°C.

Overexpression was initiated by the addition of 400 μM isopropyl β-D-1-

thiogalactopyranoside (IPTG) and continued for 16 h at 25°C before harvesting by

centrifugation. Cell pellets were resuspended in 25 mL 0.5 M NaCl, 20 mM Tris-HCl, pH

7.5, and lysed at 14,000 psi through a microfluidizer (M-110L Pneumatic). The lysate

was then centrifuged at 15,000g for 20 min at 4°C. FscJ protein was purified through

immobilized metal affinity chromatography (IMAC). The supernatant was incubated for 1

h with 1 mL Ni-NTA Histag affinity resin (HisPur Ni-NTA resin, Thermo Scientific). The

bound resin was washed with 4 × 10 mL of 0.5 M NaCl, 25 mM imidazole, 20 mM Tris-

HCl, pH 7.5, and protein was eluted with 3 × 2 mL of 0.5 M NaCl, 250 mM imidazole, 20

mM Tris-HCl, pH 7.5. The elution fraction was further purified with gel filtration

chromatography (HiLoad 16/60 SuperDex-200 column, AKTA FPLC System, GE

Healthcare, 100 mM NaCl, 20 mM Tris-HCl, pH 7.5) before crystallization.

Selenomethionine (SeMet)-labelled protein was produced using metabolic inhibition of

methionine biosynthesis as described179. Cells were cultured in 1 L M9 minimal media

and grown to OD600 = 0.6 at 37°C. 60 mg L-SeMet was added to the culture 20 min

before cell induction and overexpression was initiated by adding 200 μM IPTG followed

by growth for 16 h at 25°C. The purification of SeMet-labelled protein was the same as

described above for the native protein with the addition of 10 mM dithiothreitol (DTT) to

the gel filtration purification buffer.

3.4.2 Cloning of Full Length FscJ, Expression and Localization

Full-length fscJ (fscJ_F) was amplified by using PCR from T. fusca genomic DNA

with the primers fscJ_N1 (GCG GGATCC ATG GGG TTG GGA AAG) and the same

fscJ_Cstop. FscJ_F was expressed under conditions similar to those used for the truncated protein.

Cell lysate from 1 L of culture was centrifuged at 10,000g for 2 h and the pelleted inclusion bodies

were discarded. The cellular membrane was harvested with an additional 1 h centrifugation at

25,000g. The membrane was washed with 800 mM NaCl, 20 mM HEPES, pH 8.0, and

resuspended in 120 mM NaCl, 20 mM HEPES, pH 8.0, with 0.05% dodecylmaltoside.

FscJ_F was purified through a Ni-NTA based IMAC method. Histag Western Blot was

used to confirm the expression; purified proteins from the membrane fraction were

analyzed with SDS-PAGE and transferred to a polyvinylidene difluoride (PVDF)

membrane (Bio-Rad). Poly-histidine monoclonal antibody (Sigma) was used as the

primary antibody and goat anti-rabbit IgG-HRP was used as the secondary antibody.

Western Blot signal was developed with SuperSignal West Pico Chemiluminescent

Substrate (Thermo Forma) and detected on a KODAK RP X-OMAT film.

3.4.3 Isothermal Titration Calorimetry Binding Affinity Determination

FscJ-siderophore binding affinity was determined with isothermal titration

calorimetry (ITC, MicroCal iTC200 system, Malvern). For binding studies, the hexa-

Histag was removed: 1 μL of thrombin (1 U L-1, Novagen) was mixed with 10 mg of

protein in 1 mL thrombin cleavage buffer (150 mM NaCl, 2.5 mM CaCl2, 20 mM Tris-

HCl, pH 8.4) and incubated for 16 h at room temperature. Cleaved protein was further

purified using size-exclusion chromatography (as described above). Ferric-fuscachelin A

was prepared from the isolated natural product and purified to homogeneity as reported

before77. 200 μL of 25 μM FscJ (0.025 mM) in 100 mM NaCl, 20 mM Tris-HCl, pH 7.5

was placed in the ITC reaction cell and 40 μL ferric-fuscachelin A (0.5 mM) was titrated

into the protein solution over time at 25°C. Heat change during the reaction was

detected and recorded for binding affinity calculation. Other protein-ligand interaction

measurements were performed in a similar fashion. All titrations were repeated in

triplicate. The binding parameters and corresponding errors were calculated using the

Origin software package155.
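
The titration conditions above fix the final ligand-to-protein molar ratio and, together with the measured dissociation constant, the commonly used c-value of the experiment (c = n[M]/Kd). The Python sketch below is an illustration only: it assumes a single binding site (n = 1), ignores the volume displaced from the cell during injection, and uses the 5.4 μM dissociation constant reported in Section 3.2.3. The function name is hypothetical.

```python
def itc_setup(cell_conc_um, cell_vol_ul, syringe_conc_um, syringe_vol_ul,
              kd_um, n_sites=1.0):
    """Return (protein nmol, ligand nmol, final molar ratio, c-value)."""
    # μM × μL gives pmol; divide by 1000 to convert to nmol.
    protein_nmol = cell_conc_um * cell_vol_ul / 1000.0
    ligand_nmol = syringe_conc_um * syringe_vol_ul / 1000.0
    molar_ratio = ligand_nmol / protein_nmol
    c_value = n_sites * cell_conc_um / kd_um
    return protein_nmol, ligand_nmol, molar_ratio, c_value


if __name__ == "__main__":
    # 200 μL of 25 μM FscJ in the cell, 40 μL of 0.5 mM ferric-fuscachelin A.
    protein, ligand, ratio, c = itc_setup(25.0, 200.0, 500.0, 40.0, kd_um=5.4)
    print(f"protein {protein:.1f} nmol, ligand {ligand:.1f} nmol, "
          f"ratio {ratio:.1f}, c = {c:.2f}")
```

Under these assumptions the titration ends at a four-fold molar excess of ligand, with a c-value between 4 and 5.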

3.4.4 FscJ Crystallization and Optimization

FscJ initial crystal screenings were performed with vapor diffusion sitting drop

format (CrystalQuick 96 well sitting drop plate, Hampton Research). Small ‘angel hair’

like crystals were identified in the condition containing 0.2 M ammonium acetate, 0.1 M

sodium acetate, pH 4.6 and 30% w/v polyethylene glycol-4,000. Optimization of salt

concentration and pH was carried out in vapor diffusion hanging drop format with

microseeding (24 well VDX crystallization plate, Hampton Research). 2 μL of protein (10

mg mL-1) solution plus 2 μL of precipitant were mixed and equilibrated against 1 mL of

reservoir solution at 25°C. Resultant rod-shaped single crystals were obtained in 0.16 M

ammonium acetate, 28% w/v polyethylene glycol-4,000 and 0.1 M buffer at pH 4.6

(sodium acetate), 5.6 (sodium acetate), 6.5 (MES) and 7.5 (HEPES). Se-Met labelled

protein was crystallized in pH 4.6 conditions with same salt and precipitant as above. To

detect the possible metal/FscJ interaction, FscJ protein crystals were soaked in ferric

chloride or mercury acetate (1 mM) at pH 4.6 for 24 h. Crystals suitable in size and

shape were harvested and flash frozen in liquid nitrogen using 15% v/v glycerol as

cryoprotectant.

3.4.5 FscJ Atomic Structure Determination and Refinement

FscJ native and SeMet single-wavelength anomalous diffraction (SeMet-SAD)

data sets were collected at LS-CAT 21-ID-G beamline, Advanced Photon Source (APS),

Argonne National Laboratory (ANL) at a wavelength of 0.9786 Å (MARMOSAIC300

detector, 100 K). A mercury soaked data set was collected at GM/CA 23-ID-D beamline,

APS, ANL at a wavelength of 1.0064 Å (Pilatus3 6M, 100 K). Native data sets were

processed with the XDS package156 in space group I222, P21 or P41, with two, four or

two molecules per asymmetric unit, respectively. The SeMet-SAD data set was

processed in space group I222. Four Se sites per asymmetric unit were found

using PHENIX.HYSS180,159 and phases were calculated with PHASER157. Initial rounds

of model building and refinement were completed using BUCCANEER181,182, COOT160

and REFMAC5183. The resultant poly-alanine model and the diffraction data were

processed with six rounds of SAD-MR184. The locations of the heavy atoms were further

refined with MLPHARE185 and PIRATE186, and density maps were improved with

RESOLVE187, while model building was carried out with SHELXE188 and BUCCANEER

sequentially. Native data sets were solved using molecular replacement with the model

from SAD-MR. Structures in space groups I222, P21 and P41 were modeled and refined

independently with ARP/wARP189 and PHENIX.REFINE158. Hg ions in the Hg-soaked data

set were located using anomalous difference maps calculated with PHENIX.MAP.

Sigma-A weighted, simulated annealing composite omit maps were used to judge and

verify structures throughout refinement. Water molecules were inspected carefully in

calculated electron density maps. Structural illustrations were prepared with PyMOL161

and LIGPLOT190. Crystallographic data and refinement statistics are summarized.

Table 3-3. FscJ MR-SAD phasing and building cycles

Cycle          Residues built (BUCCANEER)   Rwork / Rfree

0 (poly-Ala)   -                            0.4399 / 0.4951

1              474                          0.3859 / 0.4873

2              522                          0.3682 / 0.4834

3              537                          0.3317 / 0.4593

4              546                          0.3491 / 0.4653

5              549                          0.3338 / 0.4303

6              556                          0.3079 / 0.3859

3.4.6 Detailed FscJ SAD-MR Based Experimental Phasing

FscJ shares low sequence identity with all published homolog

structures (Table 3-2), and direct molecular replacement did not yield any

solution. FscJ has only two methionines in its sequence of ~300 amino acids,

both located at the N-terminus of the protein, making standard SAD phasing a challenge.

Harvested SeMet-FscJ crystals were screened for anomalous signal and the best data

set was indexed to 3.2 Å resolution in space group I222. The anomalous measurability

of the data set was 0.1 at 8.3 Å resolution and 0.02 at 6.2 Å resolution from

phenix.xtriage analysis. Se site identification was initiated with PHENIX.HYSS at 6.6 Å

resolution and phases were calculated with an initial figure of merit of 0.38, which

improved to 0.64 after density modification at 5.6 Å. A poly-alanine model was built

manually into the 5.6 Å electron density map, and the combined model and heavy atom (HA) sites

yielded an interpretable 4.5 Å map from PHENIX.AUTOSOL. The final 3.2 Å map was

obtained by combining the HA sites from the 5.6 Å map with the model from the 4.5 Å map, and the

corresponding full-length poly-Ala model was built in Coot based on the predicted protein

3D structure from the Rosetta server (RMSD of 3.6 Å compared with the final structure). With

a relatively low overall resolution and inaccurate phase information, the poly-alanine

model could be refined only to Rwork/Rfree = 0.440/0.495 (Table 3-3), and the assignment of the corresponding

side chains was difficult. A detailed SAD-MR strategy adapted from EMBL-Hamburg

Auto-Rickshaw was used to improve the model. The Rfree value dropped below 40%

after six rounds of MR-SAD (Table 3-3), and the resultant model was used for

molecular replacement against the higher-resolution native data sets.
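The progress of the phasing and building cycles can be followed directly from the values in Table 3-3. The short Python sketch below tabulates the Rfree minus Rwork gap for each cycle and finds the first cycle at which Rfree falls below a chosen threshold; the helper name and the choice of 0.40 as the threshold are illustrative, not part of the original protocol.

```python
# (cycle, Rwork, Rfree) values transcribed from Table 3-3.
CYCLES = [
    (0, 0.4399, 0.4951),
    (1, 0.3859, 0.4873),
    (2, 0.3682, 0.4834),
    (3, 0.3317, 0.4593),
    (4, 0.3491, 0.4653),
    (5, 0.3338, 0.4303),
    (6, 0.3079, 0.3859),
]

def first_cycle_below(cycles, threshold):
    """Return the first cycle whose Rfree is below the threshold, or None."""
    for cycle, _r_work, r_free in cycles:
        if r_free < threshold:
            return cycle
    return None

# Gap between Rfree and Rwork, a rough indicator of overfitting.
gaps = {cycle: round(r_free - r_work, 4) for cycle, r_work, r_free in CYCLES}

if __name__ == "__main__":
    for cycle, r_work, r_free in CYCLES:
        print(f"cycle {cycle}: Rwork={r_work:.4f} Rfree={r_free:.4f} gap={gaps[cycle]:.4f}")
    print("first cycle with Rfree < 0.40:", first_cycle_below(CYCLES, 0.40))
```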

3.4.7 Modeling the FscJ/Ferric-Siderophore Interaction

Ferric-fuscachelin A was docked into the FscJ structure (P41-A) with AutoDock 4

to understand the binding mode191. The ligand ferric-fuscachelin A was fully optimized with

density functional B3LYP/6-31G in Spartan '08 (solvent: water, pH 7.0)192. The core

region of the ligand was treated as a rigid body while the D-arginine arms remained flexible

during the docking. Standard docking genetic algorithms were used with a rigid protein

in a grid covering the entire central cleft region.

Table 3-4. Primers for FscJ site-directed mutagenesis

Mutant #   Site(s)       Primer

1          E279A/N280A   CAGCAGCCGGCGGGGgcggccTTCGCTGCCTTCTAC

2          E302A         GTCATCTTCTACgcgACCGACGCCCAG

2          D310A         CAGGAGAACCCCgccCCGTTCACCGAG

3.4.8 FscJ Siderophore Binding Site Mutagenesis

Mutations of non-conserved binding site residues in the FscJ C-domain were

carried out using the Q5 site-directed mutagenesis kit (NEB) following the

manufacturer’s instructions. Mutant 1 contained mutations E279A and N280A; mutant 2

contained E302A and D310A. All mutant constructs were confirmed by

DNA sequencing. Primers designed for the mutagenesis are listed in Table 3-4. Protein

mutants were purified and prepared as above.
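The primers in Table 3-4 mark the mutated bases in lowercase. As a minimal sanity check, the Python sketch below translates each primer with the standard genetic code and reports the residues encoded by the lowercase codons. It assumes that each primer is in frame from its first base, which is an assumption made here for illustration (it is consistent with every lowercase codon encoding alanine); the function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Primers transcribed from Table 3-4 (lowercase = mutated bases).
PRIMERS = {
    "E279A/N280A": "CAGCAGCCGGCGGGGgcggccTTCGCTGCCTTCTAC",
    "E302A": "GTCATCTTCTACgcgACCGACGCCCAG",
    "D310A": "CAGGAGAACCCCgccCCGTTCACCGAG",
}

def codons(primer: str) -> list:
    """Split a primer into codons, assuming the frame starts at base 1."""
    usable = len(primer) - len(primer) % 3
    return [primer[i:i + 3] for i in range(0, usable, 3)]

def translate(primer: str) -> str:
    """Translate the whole primer in the assumed frame."""
    return "".join(CODON_TABLE[c.upper()] for c in codons(primer))

def mutated_residues(primer: str) -> str:
    """Residues encoded by the fully lowercase (mutated) codons."""
    return "".join(CODON_TABLE[c.upper()] for c in codons(primer) if c.islower())

if __name__ == "__main__":
    for name, primer in PRIMERS.items():
        print(f"{name:12s} {translate(primer):14s} mutated -> {mutated_residues(primer)}")
```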


CHAPTER 4

PRECURSOR PROTEIN-DIRECTED PEPTIDE MACROCYCLIZATION IN A RIBOSOMAL PEPTIDE NATURAL PRODUCT BIOSYNTHETIC PATHWAY

4.1 RiPPs Biosynthesis and Microviridin Biosynthetic Pathway

The diverse biosynthetic pathways to natural product secondary metabolites

generate enormous structural diversity and biological activity193,194. Natural products can

be divided into distinct classes based on their biosynthetic pathways and the origins of

their building blocks, including ribosomally produced and post-translationally modified

peptides (RiPPs), non-ribosomal peptides (NRPs), polyketides (PKs), saccharides,

terpenes, and alkaloids195. Members of the large and diverse family of RiPPs exhibit

therapeutically useful properties including antitumor, antifungal, antibacterial and

antiviral activities196. RiPPs are derived from short DNA-encoded precursor proteins, necessarily

composed of proteinogenic amino acids196. The precursor proteins contain an N-terminal

leader peptide region197,198 (or, less commonly, a C-terminal follower peptide199), which is

removed during maturation, releasing one or more core peptides. Enzymatic post-translational

modifications of the core peptides install functional/structural motifs and

include, for example, dehydration, epimerization, heterocyclization, prenylation,

lanthionine formation and macrocyclization (≥ 6 amino acids)197,198. Post-translational

modification commonly requires the presence of a leader peptide, as the core peptide

region alone is not recognized by the modifying enzymes200,201.

Genetic engineering of core peptides is an attractive and proven approach to

create libraries of biomedically relevant macrocyclic peptides; recent examples include

the manipulation of lasso peptides and microviridins in the heterologous host E. coli112,202.

Lasso peptides, as indicated by their name, are peptides folded into a threaded

lasso that target cell surface receptors111, whereas microviridins are depsipeptide toxins

that inhibit serine proteases in the nM range203. Engineered, optimized heterologous

expression can produce diverse RiPPs in high yield; for example, reconstruction of

the tru pathway in E. coli results in the production of diverse natural and designed

cyanobactins204.

Figure 4-1. Representative precursor peptides and gene clusters in microviridin biosynthetic pathways. A) Aligned precursor peptides have a strictly conserved region (PFFARFL) highlighted in green. Residues involved in the cyclizations are labeled in orange (lactonization) and blue (lactamization). B) The microviridin J biosynthetic gene cluster in Microcystis aeruginosa MRC contains the genes mdnA-mdnC, with the notable absence of the mdnD- and mdnE-like genes present in M. aeruginosa NIES298. The Planktothrix agardhii microviridin biosynthetic gene cluster contains two genes (mvdE and mvdF) that encode two independent precursor peptides. Gene all7013 in Nostoc sp. PCC7120 encodes a putative microviridin precursor peptide with three sequential core peptide regions following a leader peptide.

Microviridin L biosynthesis in M. aeruginosa NIES298 involves five clustered

genes (mdnA-E) (Figure 4-1)205. The gene mdnA encodes a 49-amino acid precursor

peptide, with the 14 C-terminal amino acids representing the core peptide region of the

final natural product. The genes mdnB and mdnC encode two ATP-grasp

ligase homologs of 325 and 324 amino acids, respectively. This cluster also contains

mdnD, encoding an N-acetyl transferase, and mdnE, encoding a putative dual-function

polypeptide protease and transporter205,206. The leader peptide of MdnA is a prerequisite

in the microviridin biosynthetic pathways, with a strictly conserved PFFARFL motif key

to recognition by the two ATP-grasp macrocyclases (Figure 4-1)207. MdnC and MdnB

condense MdnA to form a tricyclic MdnAΔ3 (MdnA - 3 H2O) intermediate; the leader

peptide region of MdnAΔ3 is then cleaved by MdnE and the product is capped by N-terminal

acetylation (MdnD)205. The order of bond formation in the three macrocyclization

reactions is well resolved, as typified by the investigation of microviridin K biosynthesis in

Planktothrix agardhii CYA126/8 (Figure 4-1)208. The first ATP-grasp ligase (MvdD,

homologous to MdnC) sequentially catalyzes the Thr/Asp and Ser/Glu lactonizations,

and the intermediate is then modified by MvdC (homologous to MdnB) through the

Lys/Glu lactamization. Apart from the six key amino acid residues directly involved in the

cyclizations, the core peptide region (MvdE) tolerates limited

modification209. Next, the leader peptide of MdnAΔ3 is cleaved, presumably by MdnE, and

the nascent modified core peptide is then capped with an N-terminal acetyl group by

MdnD205. In microviridin biosynthesis, the leader peptide is a prerequisite, and its

strictly conserved motif is key to the post-translational modifications (Figure

4-1)207.

On the basis of domain similarity and secondary structure prediction, the two

macrocyclases in microviridin pathways are three-domain ATP-grasp proteins

exemplified by RimK, a ligase involved in poly-α-glutamic acid synthesis and ribosomal

S6 modification210. The ATP-grasp superfamily is characterized by a unique ATP-binding

fold that ‘grasps’ a molecule of ATP between the C-domain and the central

domain211. Core peptide binding and cyclization most likely occur between the N-domain

and the central domain212. Several long loop regions between the three

subdomains of ATP-grasp ligases increase the overall protein flexibility and establish

the basis for the recognition of structurally diverse substrates. Most ATP-grasp ligases

are dimeric, with some tetrameric examples (dimers of dimers). For

example, LysX, involved in the conversion of α-aminoadipate (AAA) to lysine, is a

recently structurally characterized member of the ATP-grasp superfamily with

relevance to the microviridin cyclases213. LysX ligates AAA to the protein LysW to

activate the δ-carboxyl group for conversion to the ε-amine. The tetrameric LysX

consists of two dimers interacting through the central domains and binds two LysW

proteins, each sandwiched between two LysX N-domains. Additional members of the

ATP-grasp family that have been structurally characterized include D-alanine/D-alanine

ligase214, biotin carboxylase215 and diphosphoinositol pentakisphosphate kinase-2216.

While MdnB and MdnC produce secondary metabolite microviridins, most ATP-grasp

ligases are engaged in primary metabolic pathways.

The microviridin J biosynthetic gene cluster (Figure 4-2), identified in M.

aeruginosa MRC, contains the genes mdnA-mdnC, with the notable absence of mdnD

and mdnE homologs (Figure 4-1)202. Mature microviridin J is a toxin that causes lethal

molting disruption in Daphnia through inhibition of serine-type proteases217. Compared

with other microviridins, microviridin J contains a unique Arg at position 40 that

facilitates its binding to trypsin202,218. The two ATP-grasp ligases, MdnB and MdnC, in

M. aeruginosa MRC share significant sequence identity (40%) with

each other but little (< 20% sequence identity, see Figure 4-3) with proteins of known crystal

structure, limiting the generation of reliable homology models for either enzyme.

Figure 4-2. Biosynthesis of microviridin J. A) The microviridin J biosynthetic pathway in Microcystis aeruginosa MRC involves three gene products: MdnC and MdnB catalyze the macrocyclization of MdnA to generate the intermediates MdnAΔ2 and MdnAΔ3, respectively. Cleavage of the leader peptide followed by acetylation produces the mature microviridin J. The residues coupled by MdnC are colored red and those coupled by MdnB, blue. B) In vitro cyclization of MdnA with purified MdnC and MdnB enzymes. HPLC analysis shows separation of the precursor peptide and the two macrocyclization intermediates (wavelength = 220 nm).


Figure 4-3. Alignment of MdnC, MdnB and other reported ATP-grasp ligases. MdnB and MdnC belong to the ATP-grasp ligase superfamily, whose members share a tri-domain structure. Of the deposited structures, LysX is the most similar, sharing sequence identities of 19.5% and 17.2% with MdnC and MdnB, respectively, while MdnB and MdnC share a sequence identity of 40.1%. Key residues corresponding to ATP binding interactions are highlighted in red. The amino acid region highlighted in green corresponds to leader peptide recognition.

Cyclization of peptides is a common approach in both natural and synthetic

systems to generate biologically active peptides. Cyclic structures are less

susceptible to in vivo degradation and proteolysis and present rigid conformations

that are entropically favored for target binding. Several enzymatic strategies are known to

catalyze the cyclization of peptides, including serine protease-type mechanisms in

nonribosomal219 and ribosomal peptide220,221 pathways along with ATP-dependent

activation paths202,222. These approaches often exhibit high degrees of specificity and

efficiency that can rival synthetic methods. Here we describe the structural and

biochemical characterization of the two peptide macrocyclases, MdnC and MdnB, from

the microviridin J gene cluster. The results provide a structural basis for the interactions

of peptide macrocyclases with their precursor peptide substrates, a common and

integral feature of pathways to ribosomal peptide natural products.

Figure 4-4. Expression and purification of MdnB and MdnC as dimeric proteins. A) MdnC and MdnB share a similar retention time in size exclusion chromatography and are predicted to be dimers in solution. Molecular weight standards (BioRad) were plotted as retention volume vs log (MW). B) MdnC and MdnB, with a molecular weight of ~35 kDa, run as dimers in native PAGE analysis (~80 kDa).

4.2 Leader Protein-Directed Microviridin J Biosynthesis

4.2.1 MdnB and MdnC Catalyze MdnA Cyclization

Full-length MdnB and MdnC were heterologously expressed in E. coli

independently as C-terminal hexa-histidine fusion proteins utilizing a short alanine linker

between the tag and the enzyme. As is common within the ATP-grasp superfamily, size-exclusion

chromatography profiles and native-PAGE analysis indicate that both MdnB

and MdnC form homodimers (Figure 4-4). In vitro assays using purified MdnC and MdnB

showed complete cyclization of MdnA to the microviridin precursor in the presence of

Mg2+ and ATP. MdnC catalyzes two lactonizations to form MdnAΔ2, while MdnB installs

the final macrolactam to produce MdnAΔ3 (Figure 4-2, Figure 4-5 and Table 4-1).

Additionally, kinetic analysis of the MdnC-catalyzed double cyclization gave a

measured KM (MdnA) of 23.8±1.2 μM and an apparent kcat of 0.47±0.02 min-1 at 25°C

(Figure 4-6).
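To illustrate what the measured kinetic constants imply, the sketch below evaluates the Michaelis-Menten equation, v/[E] = kcat[S]/(KM + [S]), with the values reported above. The function and variable names are illustrative, and the substrate concentrations in the loop are arbitrary examples rather than the concentrations used in the assays.

```python
# Kinetic constants reported in the text for the MdnC double cyclization.
KM_UM = 23.8          # KM for MdnA, in uM
KCAT_PER_MIN = 0.47   # apparent kcat, in min^-1

def turnover_rate(substrate_um: float) -> float:
    """Michaelis-Menten rate per enzyme molecule, v/[E], in min^-1."""
    return KCAT_PER_MIN * substrate_um / (KM_UM + substrate_um)

def catalytic_efficiency() -> float:
    """kcat/KM converted to M^-1 s^-1."""
    kcat_per_s = KCAT_PER_MIN / 60.0
    km_molar = KM_UM * 1e-6
    return kcat_per_s / km_molar

if __name__ == "__main__":
    # Example substrate concentrations (illustrative only).
    for s in (5.0, KM_UM, 100.0):
        print(f"[MdnA] = {s:6.1f} uM -> v/[E] = {turnover_rate(s):.3f} min^-1")
    print(f"kcat/KM = {catalytic_efficiency():.0f} M^-1 s^-1")
```

At [S] = KM the rate is half of kcat by construction, which gives a convenient check on any fitted curve.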

Figure 4-5. Analysis of full-length MdnA cyclizations. MdnA and its cyclic products were identified by MS analysis. Refer to Table 4-1 for masses.

Figure 4-6. Kinetic analysis of the MdnC-catalyzed dicyclization. A) Representative HPLC traces for MdnC-catalyzed MdnA dicyclization and initial rate determination. B) Michaelis-Menten plot of initial rates at different substrate concentrations. Each initial rate was determined in triplicate.

4.2.2 MdnA Leader Peptide Interacts with MdnB and MdnC Macrocyclases

A hallmark of ribosomal peptide biosynthetic pathways is the protein/protein

interaction of precursor peptides with the catalytic enzymes through specific structural

elements. MdnA is 49 amino acids in length, with 13 of those amino acids included in the

final natural product structure. To understand the interaction of MdnA with the

macrocyclases, we conducted ITC-based binding assays. Full-length MdnA (amino

acids 1-49) binds MdnC with a measured dissociation constant (KD) of 112±52 nM (Figure

4-7 and Table 4-2). This binding interaction is ATP-independent, and the addition of the

stable ATP analog AMPPNP or of ADP does not change the MdnA/MdnC interaction

thermal profile. We next removed the core peptide region of MdnA to create a leader

peptide-only analog (MdnA1-35) and observed a similar level of interaction between

MdnA1-35 and MdnC (KD = 146±52 nM). These results indicate that the MdnA/MdnC

interaction is predominantly leader peptide-dependent, while the core peptide of MdnA

does not contribute significantly in the initial stage of substrate recognition. Using a

similar approach, we observed a weaker interaction between MdnB and full-length

MdnA or MdnA1-35 (KD values were 4.85±1.02 and 2.85±0.77 μM, respectively). Because the

natural substrate for MdnB is the di-cyclo MdnAΔ2 and not MdnA, the reduced binding

affinities suggest that the modified core peptide likely contributes to the MdnB/MdnA

interactions.
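The difference between the nanomolar and micromolar dissociation constants can be put on an energy scale with the standard relation ΔG° = RT ln KD. The Python sketch below applies it to the KD values in Table 4-2. The temperature of 25°C and the 1 M standard state are assumptions made for this illustration, and the dictionary keys are labels chosen here.

```python
import math

R = 8.314462618   # gas constant, J mol^-1 K^-1
T = 298.15        # K; 25 C is assumed for this illustration

def binding_free_energy_kj(kd_molar: float) -> float:
    """Standard binding free energy (kJ/mol) from KD in mol/L (1 M standard state)."""
    return R * T * math.log(kd_molar) / 1000.0

# Dissociation constants taken from Table 4-2.
KD_MOLAR = {
    "MdnC/MdnA": 112e-9,
    "MdnC/MdnA1-35": 146e-9,
    "MdnB/MdnA": 4.85e-6,
    "MdnB/MdnA1-35": 2.85e-6,
}

if __name__ == "__main__":
    for pair, kd in KD_MOLAR.items():
        print(f"{pair:14s} KD = {kd:.2e} M  dG = {binding_free_energy_kj(kd):6.1f} kJ/mol")
```

Under these assumptions, the roughly 40-fold weaker binding of full-length MdnA to MdnB relative to MdnC corresponds to a difference of about 9 kJ/mol.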

Table 4-1. MS profiles of MdnA and MdnA variants

Peptide          MS fragments (positive mode), m/z                  Mass (Da)

                 +3       +4       +5       +6      +7      +8      Experimental  Theoretical

MdnA             -        1430.3   1144.5   953.9   817.8   715.7   5717.5        5718.31

MdnAΔ1           -        1426.0   1140.8   950.8   815.3   -       5699.5        -

MdnAΔ2           -        1421.4   1137.3   948.0   812.8   711.3   5681.6        -

MdnAΔ3           -        1416.6   1134.1   944.8   810.2   -       5663.8        -

MdnA1-35         -        978.4    782.9    652.7   -       -       3909.8        3910.37

MdnAAc20-49      1189.4   892.3    714.1    -       -       -       3565.3        3563.76

MdnAAc20-49Δ1    1183.3   887.8    710.5    -       -       -       3547.2        -

MdnA11-49        -        1147.0   917.8    765.0   -       -       4584.0        4585.07

MdnA11-49Δ2      -        1138.1   910.7    759.1   -       -       4548.5        -

MdnA11-49Δ3      -        1133.6   907.0    756.0   -       -       4530.1        -
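The entries in Table 4-1 are related by the usual electrospray relation m/z = (M + z·m_H)/z, and each cyclization removes one water from the peptide. The sketch below is a minimal illustration of both relations; the proton mass and the average mass of water are standard values supplied here, not taken from the original analysis, and the function names are hypothetical.

```python
PROTON_MASS = 1.00728   # Da, mass of a proton (standard value)
WATER_MASS = 18.015     # Da, average mass of H2O (assumed mass type)

def expected_mz(neutral_mass: float, charge: int) -> float:
    """m/z of the [M + zH]z+ ion for a given neutral mass."""
    return (neutral_mass + charge * PROTON_MASS) / charge

def neutral_mass(mz: float, charge: int) -> float:
    """Neutral mass recovered from an observed m/z and charge state."""
    return charge * (mz - PROTON_MASS)

def cyclized_mass(linear_mass: float, n_cyclizations: int) -> float:
    """Mass after n dehydrative cyclizations (one water lost per ring)."""
    return linear_mass - n_cyclizations * WATER_MASS

if __name__ == "__main__":
    mdna_theoretical = 5718.31  # Da, from Table 4-1
    for z in range(4, 9):
        print(f"MdnA +{z}: expected m/z = {expected_mz(mdna_theoretical, z):.1f}")
    for n in (1, 2, 3):
        print(f"MdnA with {n} ring(s): {cyclized_mass(mdna_theoretical, n):.1f} Da")
```

The successive experimental masses in the table (5717.5, 5699.5, 5681.6 and 5663.8 Da) differ by about 18 Da, as expected for the stepwise loss of water.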

Figure 4-7. ITC profiles of MdnA variants interacting with the macrocyclases MdnB and MdnC. Detailed fitting information is listed in Table 4-2.

Figure 4-8. Overall structures of MdnC and MdnB. A) Two views of the MdnB dimer (blue/grey) and B) the MdnC dimer (orange/grey). MdnC forms a co-complex with the precursor peptide MdnA1-35 (green). C) Superimposition of the MdnC and MdnB protomers illustrates the domain movement involved in precursor peptide binding. D) Evolutionary conservation map of the MdnC protomer highlights conserved regions of the C-domain. The modeled ADP/Mg2+ binding site is noted on the map.

Table 4-2. ITC-based precursor peptide/cyclase interaction

Protein            Ligand         Site(s)      KD

MdnC               MdnA           0.95±0.02    112±52 nM

MdnC               MdnA1-35       0.90±0.01    146±52 nM

MdnC               MdnA11-49      1.04±0.01    308±78 nM

MdnC               MdnAAc20-49                 n.d.*

MdnC               ATP                         n.d.

MdnCAA             MdnA                        n.d.

MdnCKK             MdnA                        n.d.

MdnCE293A/N295A    MdnA           0.84±0.05    575±302 nM

MdnCD281A          MdnA           0.71±0.01    568±80 nM

MdnB               MdnA           0.80±0.10    4.85±1.02 μM

MdnB               MdnA1-35       1.12±0.02    2.85±0.77 μM

MdnB               MdnAAc20-49                 n.d.

* n.d., not detected (no observable heat change during the titration).

4.2.3 Overall Structure of MdnC and MdnB

To further understand the leader peptide requirement in the microviridin

biosynthetic pathway, we determined the X-ray crystal structures of MdnC and MdnB.

The phase information for both MdnC and MdnB was calculated from SeMet-derivatized

proteins using SAD (Table 4-3). Multiple crystallization trials of MdnC and MdnB with

MdnA and/or ATP/nonhydrolyzable analogs all resulted in the same “active site free”

structures, with several unresolved loop regions key to the enzymatic

reaction and no substrate or cofactor apparent in the electron density maps. To obtain a

more complete model, MdnA1-35 was chosen as an alternative substrate mimic to probe

the precursor peptide/enzyme interactions. MdnC was crystallized in the presence of

the C-terminally truncated MdnA1-35 (without the core peptide region) and the structure of the

complex was solved. The MdnC/MdnA1-35 complex crystallized in a tetragonal form, with four

dimers per asymmetric unit. In contrast, only the apo form of MdnB was obtained when

co-crystallized with MdnA1-35. The monoclinic (C2) crystal form of MdnB consists of two

protomers per asymmetric unit, where the biological unit (dimer) is formed via

crystallographic symmetry. Compared with the circular octameric geometry of MdnC in

the crystal, MdnB dimers pack in a linear form (Figure 4-8 and Figure 4-9). Residues

2-244/250-325 in MdnC (chain A) and 5-238/254-326 in MdnB (chain A) were successfully

built into the electron density maps. Residues 11-22 of MdnA (chain I) are resolved in

the MdnC/MdnA1-35 complex.

Figure 4-9. Crystallographic packing of MdnC and MdnB. MdnC crystallizes in space group P41, with four dimers per asymmetric unit. MdnB crystallizes in space group C2, with two protomers per asymmetric unit. MdnA1-35 was observed bound to the MdnC protomers in a 1:1 ratio. The calculated mFo-DFc difference maps corresponding to the MdnA fragments are displayed in green at the 2.0σ contour level.

Both MdnB and MdnC adopt the overall ATP-grasp ligase fold, with

three subdomains: N-domain, central domain and C-domain (Figure 4-8). MdnC and

MdnB form similar homodimer assemblies with interface areas of approximately 2748

Å2 (MdnC) and 2442 Å2 (MdnB). Comparison of the MdnC and

MdnB dimers shows an overall Cα RMSD of 1.7 Å. Amino acids near the dimeric interface are

predominantly hydrophobic in nature, while the ATP binding sites have a higher

calculated electrostatic potential (Figure 4-10). The dimeric interaction is mediated

through two key antiparallel β-strands between the N-domain (β3) and the neighboring

central domain (β’8) (Figure 4-11). These two strands are structurally conserved in all

ATP-grasp ligase family members. In addition, α3 extends along the neighboring central

domain, interacting with β’9 and contributing to the stabilization of the dimer. Compared

with other reported ATP-grasp ligases, the 23-amino acid long helix (α3) in MdnC/B is

atypical, as others commonly contain a much shorter helix at this position. Two helices,

α4 and α’4, connecting the N-domain and central domain, are located in the middle of

the dimeric interface. The α4 and α’4 helices in the MdnC dimer are in close contact, whereas in

the MdnB dimer they are ~6 Å apart, creating a top-to-bottom tunnel-like feature.

Figure 4-10. Electrostatic surface maps of the MdnC and MdnB dimers. The surfaces are color-coded by electrostatic potential. The MdnC and MdnB dimers are in the same orientations as in Figure 4-8. A tunnel-like feature is observed at the MdnB dimeric interface. Hydrophobic residues cluster near the dimeric interface, while hydrophilic residues are predominantly near the ATP binding site.

Figure 4-11. MdnC crystallized as a dimer. The N-domain of MdnC (chain A) is in orange, while the neighboring central domain (chain B) is in grey. Key interactions between the antiparallel β3/β’8 strands stabilize the dimer. Other interactions include α3/β’9 and α4/α’4.

Figure 4-12. Cα distance difference plot between the MdnC and MdnB protomers. Each point (x, y) gives the absolute difference between the Cα-Cα distance of residues x and y in MdnC and the corresponding distance in MdnB. The largest differences between the MdnC and MdnB protomers are observed in the β9β10 hairpin and helix α7 regions.
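The quantity plotted in Figure 4-12 can be sketched in a few lines of Python. The example below is a minimal illustration of a Cα distance difference matrix, not the script used to prepare the figure. It assumes two equal-length lists of Cα coordinates for equivalent residues; the three-residue coordinates are hypothetical toy values chosen only to show the calculation.

```python
from math import dist

def distance_matrix(coords):
    """All pairwise distances between the given 3D points."""
    return [[dist(a, b) for b in coords] for a in coords]

def distance_difference(coords_a, coords_b):
    """|d_A(x, y) - d_B(x, y)| for every residue pair (x, y).

    Both inputs must list equivalent residues in the same order.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have the same length")
    d_a = distance_matrix(coords_a)
    d_b = distance_matrix(coords_b)
    n = len(coords_a)
    return [[abs(d_a[i][j] - d_b[i][j]) for j in range(n)] for i in range(n)]

if __name__ == "__main__":
    # Hypothetical toy coordinates (in angstroms): an extended and a bent chain.
    extended = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
    bent = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0)]
    for row in distance_difference(extended, bent):
        print("  ".join(f"{value:5.2f}" for value in row))
```

Because only internal distances are compared, this representation does not depend on how the two protomers are superimposed, which makes it a convenient way to locate regions that move relative to the rest of the structure.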

The MdnC and MdnB protomers are overall structurally similar, but there are

several differences in their central domains (Figure 4-8), specifically in the β9β10

hairpin region and the α7 region (Figure 4-12). MdnC possesses a long two-stranded

antiparallel β-sheet (β9β10), followed by a relatively ordered helix, α7. The β9β10 and

α7 regions are anchored by the MdnA leader peptide. In contrast, the β9β10

hairpin of MdnB lies on the C-domain, while the α7 region is disordered into a flexible

loop. In the two structures, the β9β10 hairpins are in significantly different

conformations; the turns of the hairpins are 25 Å apart when MdnC and MdnB are compared.

Table 4-3. Data collection and refinement statistics of macrocyclases MdnB and MdnC

                        SeMet derivative                                   Native

                        MdnB                     MdnC                      MdnB                     MdnC+MdnA1-35

Data collection

Space group             C121                     P41                       C121                     P41

Cell dimensions

  a, b, c (Å)           128.49, 36.13, 139.90    132.62, 132.62, 196.98    128.62, 37.99, 139.38    132.56, 132.56, 198.16

  α, β, γ (°)           90, 116.9, 90            90, 90, 90                90, 116.6, 90            90, 90, 90

Resolution (Å)          45.2-2.51 (2.67-2.51)*   39.5-2.76 (2.81-2.76)     33.4-2.28 (2.36-2.28)    39.7-2.66 (2.71-2.66)

Rmerge                  0.107 (0.66)             0.183 (0.80)              0.053 (0.41)             0.114 (0.67)

I / σI                  18.7 (1.8)               19.4 (2.1)                17.0 (3.6)               17.1 (3.8)

Completeness (%)        88.3 (69.1)              99.8 (99.7)               99.9 (97.1)              99.9 (92.8)

Redundancy              4.6 (4.5)                12.5 (11.4)               4.9 (4.9)                11.9 (10.8)

Refinement

Resolution (Å)                                                             33.4-2.28                39.7-2.66

No. reflections                                                            137054                   1155055

Rwork / Rfree                                                              0.215 / 0.255            0.219 / 0.260

No. atoms                                                                  4433                     20643

  Protein                                                                  4376                     19872

  Precursor peptide                                                        -                        771

  Water                                                                    43                       -

B-factors                                                                  47.82                    55.97

  Protein                                                                  47.81                    55.16

  Precursor peptide                                                        -                        76.70

  Water                                                                    48.60                    -

R.m.s. deviations

  Bond lengths (Å)                                                         0.009                    0.003

  Bond angles (°)                                                          1.02                     0.64

A single crystal was used for each data set.

*Values in parentheses are for the highest-resolution shell.

Figure 4-13. MdnC uses ATP to catalyze the macrocyclization of the precursor peptide MdnA. A) Modelled ADP interacts with the C-domain and central domain of MdnC. The positions of Mg2+ and ADP are adapted from the aligned LysX structure (PDB entry 3VPD, 20% sequence identity). B) Interactions of MdnC with the precursor peptide MdnA. The strictly conserved region of MdnA (PFFARFL) is well-resolved in the electron density map. Helix α7 and the β9β10 hairpin of MdnC sandwich the helical region of the resolved MdnA leader peptide. The calculated mFo-DFc map is displayed at a 1.5σ contour level.

Table 4-4. Key residues in nucleotide interaction

MdnC     MdnB     LysX     RimK     DDL      Proposed interaction

Lys125   Lys131   Lys127   Lys100   Lys97    Binds to -phosphate of ATP

Lys166   Lys171   Lys87    Lys141   Lys144   Binds to -phosphate of ATP

Gln207   Gln211   Gln167   Glu178   Glu180   Hydrogen bond to adenosine N6

Glu215   Glu219   Asp176   Asp187   Glu187   Hydrogen bond to ribose O3’

Asp281   Asp284   Asp234   Asp248   Asp257   Metal ion coordination

Glu294   Glu297   Glu250   Glu260   Glu270   Metal ion coordination

Asn296   Asn299   Asn252   Asn262   Asn272   Metal ion coordination

Figure 4-14. Characterization of determinants for MdnC-catalyzed cyclization. A) Crude reaction mixtures were extracted and analyzed by mass spectrometry; the relative ion intensities corresponding to ADP and AMP are plotted. MdnC produces ADP but not AMP during catalysis. B) Mass spectrometric analysis of product formation by MdnC and site-directed mutants. Masses (positive mode) are 1137:1138 for MdnAΔ2 and 1140:1141 for MdnAΔ1. Of the mutants, only MdnCAA produces cyclic product. C) MS of the MdnCAA reaction mixture, indicating the formation of MdnAΔ1.

Figure 4-15. MdnC central domain interacts with the precursor peptide MdnA. Alignment of the β9β10 hairpin and helix α7 regions among enzymes in microviridin biosynthetic pathways. D192 (dark red) is well conserved in MdnC-like and MdnB-like homologs. E191 is conserved in MdnC, MdnB and MdnB-like homologs, but is substituted by an aspartate in most other MdnC-like macrocyclases.

4.2.4 Key Residues for Nucleotide Binding and Catalysis

The macrolactone cyclizations of Thr39/Asp45 and Ser43/Glu47 (by MdnC) and the macrolactam cyclization of

Lys41/Glu48 (by MdnB) require Mg2+ and ATP208. The reactions of MdnC and MdnB

both generate ADP, but not AMP (Figure 4-13 and Figure 4-14A). These results suggest

that the cyclizations most likely occur via acylphosphorylation, as opposed to

the (also common) generation of acyl-AMP intermediates. Unfortunately, as mentioned

and despite extensive efforts, we were unable to observe ATP, ADP or various

non-hydrolyzable analogs soaked or co-crystallized with MdnC or MdnB in the calculated

electron density maps. Also, unlike other ATP-grasp ligases, whose interactions with

ATP have been experimentally determined213, neither MdnC nor MdnB shows an

observable heat change with ATP (or its analogs) in ITC analysis. Nonetheless,

the ATP binding pocket of the ATP-grasp ligases is structurally conserved, allowing the

identification of the key residues in MdnC and MdnB that interact with the nucleotide

and facilitate the macrocyclization (Figure 4-13A and Table 4-4). Based on structural

alignment, critical residues involved in ATP interactions and metal ion coordination

include Lys125, Lys166, Gln207, Glu215, Asp281, Glu294 and Asn296 (for MdnC). To

probe the importance of these residues in the MdnC reaction, we mutated them to

alanine. Each of the mutations tested (K125A, K166A, Q207A, D281A

and E294A/N296A, Figure 4-14) completely abolishes the MdnC cyclization activity.

These residues are predicted to be primarily involved in ATP binding and not in interaction

with the precursor peptide. Indeed, representatives of these MdnC mutants (D281A and

E294A/N296A) showed a similar level of interaction with the precursor peptide MdnA

as measured with ITC (Figure 4-7).


4.2.5 Recognition of MdnA Leader Peptide by MdnC

Our binding measurements suggest that MdnA interacts with MdnC in a core

peptide-independent manner (Figure 4-7 and Table 4-2). Consistent with this, amino

acids 11-22 of MdnA1-35 were well-resolved in the calculated electron density map of the

MdnC/MdnA1-35 complex. This region of the leader peptide includes the strictly

conserved motif of PFFARFL bound in an α-helix conformation (Figure 4-13B)207.

MdnA1-35 interacts with the central domain of MdnC through electrostatic interactions

between Arg17 of MdnA and Glu191/Asp192/Asn195 of MdnC α7. In most other MdnC-

like macrocyclases, Glu191 is replaced by an aspartate residue, while Asp192 is well

conserved among all microviridin macrocyclases (Figure 4-15). Meanwhile, electrostatic

interactions between the backbone amides of Ser20 (MdnA) and Val182 (MdnC)

induce the movement of the MdnC β9β10 hairpin toward MdnA1-35. The interface area

between bound MdnA1-35 and MdnC is ~740 Å2.

To probe the key residues of MdnC in binding to MdnA, we created two MdnC

variants with double mutations, MdnCE191K/D192K and MdnCE191A/D192A (referred to as

MdnCKK and MdnCAA, respectively). The charge reversal of the conserved Glu191 and

Asp192 residues completely abolished the MdnA/MdnCKK interaction (Figure 4-7), and

MdnCKK had no macrocyclization activity toward the full length MdnA. On the other

hand, the more conservative mutant MdnCAA retained partial cyclization activity but had

no detectable MdnA/MdnCAA binding interaction via ITC analysis. Compared with WT

MdnC, MdnCAA exhibited a significantly slower reaction, and promoted the formation of

MdnAΔ1 and MdnAΔ2 in a ~1:1 ratio (Figure 4-14). These observations confirm the

roles of Glu191, Asp192 and helix α7 in precursor protein recognition.

107

Figure 4-16. Binding and cyclization activity of macrocyclases toward MdnA variants. Truncated MdnA variants were utilized to probe the binding and activity determinants of MdnC and MdnB. For binding, ++ represents strong binding and +, weak binding (see Figure 4-7 and Table 4-2). Cyclization is indicated by a ‘+’ or ‘-’ for no observed reaction. n.d. = no activity detected.

Figure 4-17. MdnA variants and their macrocyclizations. A) MdnA11-39 can be recognized and processed by both MdnC and MdnB to form the tri-cyclic product while, B) MdnAAc20-49 alone is not an active substrate for MdnC. MdnC can be constitutively activated with the presence of MdnA1-35, and catalyze the single cyclization of MdnAAc20-49. However, the single cyclic product in in trans reaction cannot be further converged into di- (or tri-) cyclo product, neither with an extended reaction time (data not shown), nor with an increased MdnA1-35 concentration.

To further evaluate the importance of observed MdnA structural elements in the

enzymatic macrocyclization chemistry, we created two truncated precursor protein

variants and characterized their functions (Figure 4-16). MdnAAc20-49 (amino acids 20-49

with an N-terminal acetyl group; Figure 4-16, line iv) does not include the conserved,

structurally ordered α-helix region, while MdnA11-49 (Figure 4-16, line iii) contains the α-helix

but lacks the ten N-terminal amino acids of MdnA, which were not observed in the

MdnC/MdnA1-35 complex. As predicted, MdnAAc20-49 showed no binding to

MdnC and was not cyclized in our in vitro assays (Figure 4-17). In contrast, the

MdnA11-49 variant bound to MdnC with parameters comparable to those of full-length MdnA and

could be di-cyclized (Figure 4-7 and Figure 4-17A). Additionally, the reaction product of

MdnC, MdnA11-49Δ2, was further converted into MdnA11-49Δ3 by MdnB. These results

along with the crystal structure have revealed the minimum requirements for efficient

precursor peptide binding and cyclization.

Next, we explored the potential of leader-free macrocyclization, following approaches

similar to those used in other RiPP biosynthetic pathways223,224. We attempted in trans

cyclization of an ‘inactive’ substrate (MdnAAc20-49) with the α-helical MdnA leader peptide

MdnA1-35. Indeed, the combination of ‘inactive’ MdnAAc20-49 with the inducer MdnA1-35 led

to the production of a cyclic intermediate. Unexpectedly, conversion to a single

cyclization product was observed (MdnAAc20-49Δ1) and the product of the second

cyclization was not detected. Identical results were obtained employing extended

reaction times or increased MdnA1-35 concentration. MdnAAc20-49Δ1 was not a substrate

of MdnB in the presence or absence of inducer MdnA1-35 (Figure 4-17B).

4.3 Structures of MdnC and MdnB Facilitate Understanding of Precursor Recognition in RiPP Pathways

Post-translational modification of ribosomal precursor peptides allows for the

production of large libraries of natural (and synthetic) variants. Macrocyclization is one

of the most common modifications in RiPPs and other natural product classes, including

nonribosomal peptides and polyketides. The cyclic architecture contributes to the

structural and functional diversity, as well as the rigidity and in vivo stability of the final

product. The well characterized patellamide (RiPP) biosynthetic pathway utilizes a

subtilisin-like serine protease domain (PatG) to generate the mature cyclic peptide

products220. PatG recognizes the linear precursor peptide PatA, and directs peptide

cyclization through serine acyl-enzyme intermediate chemistry. A similar Asp-His-Ser

catalytic assembly and mechanism is utilized by nonribosomal peptide synthetase

(NRPS) thioesterase domains219. The microviridins and marinostatins are structurally

complex ribosomal peptides with multiple macrolactam/lactone cyclizations, formed by

condensation of amino acid side chains and/or termini198. This class of RiPPs is

produced by cyanobacteria and exhibits serine protease inhibition leading to cytotoxic

effects on neighboring organisms217. The microviridin J gene cluster of M. aeruginosa

MRC contains a minimal set of genes (mdnA-C) for the biosynthesis of microviridin,

making it an excellent model system for the study of RiPP biosynthesis and cyclization.

In this report, we characterized two macrocyclases in the microviridin J biosynthesis

pathway (MdnC and MdnB), along with the precursor protein MdnA. The in vitro

catalytic activity of MdnC and MdnB has been demonstrated, and the crystal structures

of both enzymes have been solved. Both enzymes belong to the ATP-grasp family

whose members utilize a wide range of chemistries with ATP-based activation in

common, and represent a distinct strategy for peptide cyclization. The general

mechanism of the ligations has been dissected in similar systems and has been shown

to utilize ATP to activate the carboxylate substrates to yield an acylphosphate

intermediate and ADP212. A distinct structural feature of the two macrocyclases is the

presence of several long interdomain loops that show significant disorder in the

electron density maps. As in many ATP-grasp family members, this feature likely

provides the interdomain flexibility necessary for the binding of the 13-amino-acid core

peptide substrate.

MdnC catalyzes the first step in the post-translational modification of microviridin

J by the formation of two macrolactone rings (Figure 4-2, MdnA to MdnAΔ2). Compared

with the ‘apo-form’ of MdnB, MdnA-bound MdnC exhibits a large movement of the

central domain. The ~25 Å shift of the β9β10 hairpin region in MdnC opens the

interdomain pocket and allows the core peptide to access bound ATP between the

central domain and C-domain. By contrast, the crystal structure of MdnB represents the

‘closed’ conformation, with the β9β10 region blocking the pocket. Binding of the

leader peptide of MdnA restrains the β9β10 hairpin and the α7 helix in MdnC. These

regions may act as allosteric sites for subsequent core peptide recognition and

catalysis, assisting in orienting the inherently disordered active site. In support of this

hypothesis, mutagenesis of conserved residues on α7 significantly alters the enzymatic

efficiency.

The structural basis and mechanism of precursor peptide recognition are key

factors in enzyme/substrate selectivity in RiPP pathways. Only a few structures of RiPP

biosynthetic enzymes have been determined in complex with their corresponding

precursor peptides; these include the dehydratase NisB, involved in nisin biosynthesis225, and

the heterocyclase LynD, involved in cyanobactin biosynthesis224. Both enzymes use a

three-stranded antiparallel β-sheet to interact with the conserved regions of the leader

region of the precursor peptide. A RiPP precursor peptide recognition element (RRE)

has been predicted based on these partial co-complex structures and related

mutagenesis studies198,226,227. By comparison, MdnC and MdnB do not possess a

similar RRE. The interaction between MdnA and the two ATP-grasp ligases represents a

novel type of RiPP precursor peptide recognition mechanism. Most leader peptides tend

to contain α-helices in solution, and presumably maintain this structure when bound to

the biosynthetic proteins197. Structural comparison between MdnA-bound MdnC and

apo-MdnB indicates that the α-helical leader peptide is critical for the conformational

change of the biosynthetic enzymes, and that this conformational change is the key factor

modulating the enzyme activity. While the α-helix is important for post-translational

modification machinery, the first 10 amino acids at the N-terminus of MdnA (1-10) are

not required for precursor peptide recognition. Similar substrate specificity has been

observed in the lacticin biosynthetic pathway, where a propeptide LctA variant with the

deletion of its N-terminal eight amino acids could be fully processed200. In addition,

the MdnC/leader peptide (MdnA1-35) binding interaction is ~10-fold tighter than that of

MdnB. The differential binding affinities of the precursor peptides can be

rationalized in several ways. It is reasonable to predict that the linear MdnA peptide is

less stable and thus must be processed rapidly to the macrocycle. Also, the

substrate for MdnB is the di-cyclized core peptide, and it is likely that both the leader

peptide and the cyclized core contribute to binding and recognition by the second

enzyme, MdnB. Likewise, distinct leader peptide recognition may help to avoid

competitive inhibition of MdnB by the precursor peptide.

In trans activation of the macrocyclase by the helical leader peptide alone led to

an active enzyme, an interesting feature of a few of the RiPP systems studied.

Unexpectedly, MdnC in trans processing of the leader-free MdnAAc20-49 with MdnA1-35

only produced a product with a single cyclization. In comparison, LctM catalyzes

multiple dehydrations of leader-free LctA with a high concentration of leader peptide223.

The full length MdnA may also facilitate the movement (and rotation) following the first

cyclization of the core peptide, thus facilitating the second cyclization. This process

might require the leader peptide and core peptide to stay bound. A trapped MdnAΔ1 in the

MdnCAA mutant could inhibit these dynamics, which would explain the absence of further

processing.

The helix of the MdnA leader peptide is ~15 amino acids away from the core

peptide of MdnA. This peptide region was unresolved in the MdnC/MdnA1-35 complex

structure, and may be disordered and/or flexible. However, simple modelling suggests

that the antiparallel β-strands (β9β10) between the leader peptide and the ATP active site

could interact in a β-sheet motif, in a fashion similar to the PatE/LynD interaction224.

The MdnA α-helical C-terminus might extend along MdnC β10 near the MdnC dimeric

interface, and then fold back to the active site between the MdnC N-domain and central domain,

where the core peptide could approach bound ATP for activation (Figure 4-18). Another

ATP-grasp ligase, LysX, uses its N-domain to interact with the β-barrel

LysW; the C-terminus of LysW extends into the LysX active site213. In contrast, the

N-domains of both MdnC and MdnB differ significantly from the LysX

N-domain (Figure 4-3). One of the structural differences between MdnB and MdnC is the

tunnel-like feature of MdnB at the dimeric interface. This larger central cavity

could be necessary for binding and processing of its more rigid substrate, MdnA∆2.

We were not able to observe a complex structure of MdnC or MdnB with the

MdnA core peptide (the region cyclized and in the final product), nor did we observe any

core peptide binding interaction with MdnC or MdnB by ITC. This suggests that the

macrocyclase-core peptide interaction is not the key determinant for binding/recognition

and the interaction between the macrocyclase and the leader peptide plays a more

significant role. Indeed, several reports have shown that alteration of the core peptide

region is tolerated whereas changes to the conserved region of the leader peptide are

not207. This is also consistent with the natural, genetic variability observed in microviridin

gene clusters. In the P. agardhii gene cluster, mvdE and mvdF represent two distinct

propeptides that are both substrates for the macrocyclases (Figure 4-1). Additionally,

there are several examples of a single precursor peptide gene product containing one

leader peptide and multiple core peptides. The all7013 gene product in Nostoc sp. PCC 7120

contains the strictly conserved α-helical element in the leader peptide region and three

different core peptide regions, indicating the potential to form three different final

products (Figure 4-1). Similar combinatorial assembly in RiPP systems has been

detailed in the patellamide biosynthetic pathway228.

The structural basis of substrate precursor peptide binding in the microviridin

biosynthetic pathway is summarized in Figure 4-18. The α-helical region of the leader

peptide binds to the α7 of macrocyclases and induces a conformational change of the

β9β10 hairpin, which allows the core peptide region to access the ATP binding pocket

for activation and cyclization. These processes require the presence of the helical

element of the leader peptide, but the amino acids N-terminal to this helix region are

not required for the macrocyclizations. The disruption of the leader peptide/macrocyclase

interaction by mutating the key residues on α7 also abolishes the cyclization reactions.

In addition, a constitutively activated macrocyclization can be observed through an in

trans reaction. The results presented provide novel insights into

RiPP biosynthetic pathways and facilitate the discovery and engineering of RiPP

pathways.

Figure 4-18. Leader peptide directed peptide macrocyclization in the microviridin J biosynthetic pathway. A) The leader peptide of MdnA activates macrocyclization by orienting α7 and by inducing a large shift of the β9β10 hairpin. B) The α-helical element of the precursor peptide, as well as residues on α7 (Glu191/Asp192, referred to as ‘ED’), are critical, and removing these elements results in no cyclization, C) whereas the N-terminus of the substrate is not required for either MdnB or MdnC macrocyclization. D) In trans activation of MdnC using the α-helical element of the precursor peptide and an inactive substrate produces a partially active macrocyclase. E) Model of the interactions of full-length MdnA (green) with the macrocyclase MdnC. The cyclized core peptide region was adapted from PDB entry 4KTU and docked into MdnC with modelled ADP.

4.4 Materials and Methods

4.4.1 Protein Cloning, Expression and Purification

Codon-optimized, full-length mdnB and mdnC (M. aeruginosa MRC) were

purchased from Mr. Gene (Regensburg, Germany), supplied in pMA vectors. mdnB and

mdnC were amplified using PCR (Table 4-5), and a short C-terminal AAAHHHHHH

hexa-histidine tag was encoded in the C-terminal primers bearing the XhoI restriction

site for both MdnB and MdnC. The mdnB and mdnC genes were then ligated into the

expression vector pET30a. MdnC was expressed in E. coli BL21(DE3) pLysS. Cultures

(1 L) were grown at 37°C to an OD600=0.6 and overexpression was initiated by adding

isopropyl β-D-1-thiogalactopyranoside (IPTG, final concentration 100 μM). Growth was

continued for 12 h at 18°C, before the cells were harvested by centrifugation. Cell

pellets were resuspended in 25 mL of 0.5 M NaCl and 20 mM Tris-HCl, pH 7.5, and

lysed at 14,000 psi through a nitrogen-pressure microfluidizer cell (M-110L Pneumatic).

The lysate was clarified by centrifugation at 15,000g for 20 min at 4°C. MdnC was

purified by immobilized metal affinity chromatography (HisPur Ni-NTA Resin, Thermo

Scientific). After binding for 1 h, the resin was washed with 4 × 10 mL of 0.5 M NaCl, 10

mM imidazole and 20 mM Tris-HCl, pH 7.5, and the bound protein was eluted with 3 × 2

mL of 0.5 M NaCl, 250 mM imidazole and 20 mM Tris-HCl, pH 7.5. The elution fraction

was dialysed into a low salt buffer (50 mM NaCl and 20 mM Tris-HCl, pH 7.5), and

further purified by anion exchange chromatography (MonoQ HR 10/10, ÄKTA FPLC

System, GE Healthcare) with a linear gradient of 50-500 mM NaCl over 30 min, followed

by size-exclusion chromatography (HiLoad 16/60 Superdex 200 column, ÄKTA FPLC

System, GE Healthcare) in 150 mM NaCl and 20 mM Tris-HCl, pH 7.5. Pooled

MdnC was concentrated to ~10 mg mL-1 for crystallization. MdnB was overexpressed

and purified following the same procedure described for MdnC.

Table 4-5. Oligonucleotides for protein cloning and mutagenesis

Protein        Primers
MdnB WT        5’-GACCTTCATATGAAAGAATCGCCGAA*
               5’-CTAACTCGAGTCAGTGATGGTGATGGTGATGGGCGGCGGCACCGAACACCAGAAAATC
MdnC WT        5’-GCCATATGACGGTAGTGATTGTGACG
               5’-GACTCGAGTCAGTGATGATGATGATGATGAGCAGCAGCGGAGTTCACCAGAATCTC
K125A          5’-CGCCAACCACgcaCAACTGCAGC
               5’-TGGTCCACTTTGGCAATC
K166A          5’-TATTGTCACTgcAATGCTGTCCCAG
               5’-CCGGTCGCTTCAAACTCT
Q207A          5’-GATGACATTTgcAGAAAACATCCCG
               5’-GGACAAAATTGCAGACCC
Q215A          5’-AAAGCACTGGcGCTGCGTATTAC
               5’-CGGGATGTTTTCTTGAAATGTC
D281A          5’-GGCGCCATTGcTATGATCGTG
               5’-ATAGTTCAGGCCGAAATATTTC
E294A/N296A    5’-tgcTCCGGTTGGTGAGTTCTTC
               5’-atcgCCAGGAAGATATAACGTTCATC
E191K/D192K    5’-GGTTACAAAAaaaaaaCTGGATAACCTGG
               5’-GGGCTGGTAAAGACGACC
E191A/D192A    5’-GTTACAAAAGcagcTCTGGATAACCTGGAG
               5’-CGGGCTGGTAAAGACGAC
SeMet M25K     5’-GCGATCGAAGCAAAAGGCAAAAAGGCC
SeMet M104L    5’-GTATTCGTGGCCTGATTGCCTCACTGTC
SeMet M167L    5’-CGGTATTGTCACTAAACTGCTGTCCCAGTTCG
SeMet M181N    5’-GGGGATAAACAGGAGGAAAACGTCGTCTTTAC
SeMet M282I    5’-CGCCATTGATATCATCGTGACCCCG

*Restriction sites are underlined.
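As an illustrative check on the tag design described in Section 4.4.1, the two wild-type reverse primers in Table 4-5 can be reverse-complemented and translated to confirm that they encode the AAAHHHHHH tag followed by a stop codon and carry an XhoI site (CTCGAG). The sketch below is not part of the original workflow. The helper names are hypothetical, and it assumes that the reading frame starts at the first base of the reverse complement, which is our reading of the primers rather than something stated in the text.

```python
# Reverse primers copied from Table 4-5 (5'->3'), with the line wrap joined.
MDNB_REV = "CTAACTCGAGTCAGTGATGGTGATGGTGATGGGCGGCGGCACCGAACACCAGAAAATC"
MDNC_REV = "GACTCGAGTCAGTGATGATGATGATGATGAGCAGCAGCGGAGTTCACCAGAATCTC"

XHOI_SITE = "CTCGAG"  # XhoI recognition sequence

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate_until_stop(seq: str) -> str:
    """Translate from the first base (assumed frame) up to the first stop codon."""
    protein = []
    for pos in range(0, len(seq) - 2, 3):
        residue = CODON_TABLE[seq[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


def encoded_c_terminus(reverse_primer: str) -> str:
    """Peptide encoded by the coding strand that the reverse primer anneals to."""
    return translate_until_stop(reverse_complement(reverse_primer))


for name, primer in (("MdnB", MDNB_REV), ("MdnC", MDNC_REV)):
    tail = encoded_c_terminus(primer)
    print(name, tail, "XhoI site present:", XHOI_SITE in primer)
```

For both primers the translated C-terminus ends in AAAHHHHHH before the stop codon, in agreement with the tag described in Section 4.4.1.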

4.4.2 MdnC Mutagenesis

Mutations of MdnC were carried out using the Q5 Site-Directed Mutagenesis Kit

(NEB) following the manufacturer’s instructions. The validity of mutagenesis was

confirmed by DNA sequencing. The MdnC mutant list and primers designed for the

mutagenesis are listed in Table 4-5.

4.4.3 Designed MdnA Variants

Chemically synthesized MdnA variants were purchased: full-length MdnA

(residues 1-49; GenScript, US and SynPeptide, China); MdnA1-35, which contains the leader

peptide region only (amino acids 1-35; SynPeptide); and MdnA11-49 (LifeTein, US) and

MdnAAc20-49 (EZBioLabs, US), which are shorter versions of MdnA truncated

in the N-terminal leader peptide region.

4.4.4 Macrocyclization of MdnA Variants

For in vitro cyclization of MdnA (and its variants), MdnA substrates were incubated

in 100 mM Tris-HCl, pH 8.0, 2 mM ATP, 10 mM MgCl2 and 50 mM KCl with 1.8 μM

MdnC (alone or together with 1.8 μM MdnB) at 37°C and quenched at 4 h with an equal volume of 0.5 M

EDTA. HPLC was used for product analysis: 30% acetonitrile for 3 min, then 30-40%

acetonitrile from 3-14 min in 0.1% TFA-water. LC-MS was used to analyze the

corresponding cyclization with the following method: 3% 0.5% formic acid (FA)-acetonitrile for 3

min, then 3-98% 0.5% FA-acetonitrile from 5-11 min in 0.5% FA-water.
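The LC-MS readout of cyclization rests on simple mass arithmetic: each ester or amide ring closure is a condensation that releases one water molecule, so a product with n rings should be lighter than the linear peptide by n × 18.011 Da (monoisotopic). The sketch below, offered only as an illustration, computes these expected shifts and checks the 35.9 Da difference reported in Section 4.4.5 against them. The function names and the 0.5 Da tolerance are assumptions made for this example and are not taken from the original analysis.

```python
# Monoisotopic atomic masses in Da.
MASS_H = 1.00782503
MASS_O = 15.99491462
MASS_WATER = 2 * MASS_H + MASS_O  # released once per ester/amide ring closure


def cyclization_mass_shift(n_rings: int) -> float:
    """Expected mass change (Da) of the peptide after n side-chain condensations."""
    if n_rings < 0:
        raise ValueError("n_rings must be non-negative")
    return -n_rings * MASS_WATER


def rings_from_mass_loss(observed_loss_da: float, tolerance_da: float = 0.5) -> int:
    """Infer the ring count from an observed mass loss.

    tolerance_da is a hypothetical acceptance window, not an instrument spec.
    """
    n_rings = round(observed_loss_da / MASS_WATER)
    if abs(observed_loss_da - n_rings * MASS_WATER) > tolerance_da:
        raise ValueError("mass loss is not a whole number of water molecules")
    return n_rings


for n in (1, 2, 3):
    print(f"MdnAΔ{n}: {cyclization_mass_shift(n):+.2f} Da relative to linear MdnA")

# The 35.9 Da difference quoted in Section 4.4.5 corresponds to two ring closures.
print("rings inferred from 35.9 Da:", rings_from_mass_loss(35.9))
```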

4.4.5 MdnC-MdnA Dicyclization Kinetics

The kinetics of the MdnC-catalyzed dicyclization of MdnA were analyzed using an

HPLC-based assay. Various concentrations of MdnA (8.75 - 52.5 μM) were processed

with 1.8 μM MdnC under the reaction conditions described above at 25°C, and quenched

at various time points (2-60 min). The reaction mixtures were analyzed by analytical

HPLC equipped with an auto-sampler, using the method: 30% acetonitrile for 5 min, then

30-40% acetonitrile from 5-11 min in 0.1% TFA-water. The linear form of MdnA eluted

at 8.5 min, and the MdnC reaction product, dicyclo-MdnA (MdnAΔ2), eluted at 9.5 min.

LC-MS analysis verified that the product differed from the standard by 35.9 Da, as

described for microviridin K209, indicating that two cyclizations had occurred. Kinetic

parameters were deduced from integration of the starting material and product

peak areas. Reactions at each initial MdnA concentration were measured in triplicate.
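One plausible way to turn integrated peak areas into rates is sketched below. It is a minimal illustration and not the procedure actually used: it assumes that linear MdnA and MdnAΔ2 have equal detector response, that only these two peaks are integrated, and that product formation is linear over the early time points with no product at time zero. The peak areas, the 17.5 μM starting concentration and all function names are hypothetical placeholders, not measured data.

```python
ENZYME_UM = 1.8  # MdnC concentration stated in the text (uM)


def fraction_converted(area_substrate: float, area_product: float) -> float:
    """Fraction of MdnA converted, assuming equal response for both peaks."""
    total = area_substrate + area_product
    if total <= 0:
        raise ValueError("peak areas must sum to a positive value")
    return area_product / total


def product_concentration(s0_um: float, area_substrate: float, area_product: float) -> float:
    """Product concentration (uM) from the initial substrate concentration."""
    return s0_um * fraction_converted(area_substrate, area_product)


def initial_rate(times_min, product_um) -> float:
    """Least-squares slope through the origin (uM per min) over early time points."""
    numerator = sum(t * p for t, p in zip(times_min, product_um))
    denominator = sum(t * t for t in times_min)
    return numerator / denominator


# Hypothetical peak areas in arbitrary units: (linear MdnA, MdnAΔ2). NOT real data.
s0_um = 17.5
times_min = [2.0, 4.0, 6.0]
peak_areas = [(95.0, 5.0), (90.0, 10.0), (85.0, 15.0)]

product_um = [product_concentration(s0_um, sub, prod) for sub, prod in peak_areas]
example_rate = initial_rate(times_min, product_um)

print(f"initial rate: {example_rate:.4f} uM/min")
print(f"apparent turnover: {example_rate / ENZYME_UM:.4f} per min")
```

Repeating this calculation at each initial MdnA concentration would give the rate-versus-concentration data from which kinetic parameters can be fitted.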

4.4.6 Leader Peptide Binding Affinity Determination

Isothermal titration calorimetry (ITC) was used to characterize the interactions of MdnB and MdnC

with MdnA. 200 μL of 0.1 mM enzyme in 50 mM KCl, 10 mM MgCl2 and 100

mM HEPES-NaOH, pH 8.0 was placed in a MicroCal iTC200 reaction cell, and 39.6 μL of

full (or partial) length MdnA peptide (1.0 mM) was titrated into the protein solution over

time at 25°C. The heat change during the reaction was detected and recorded for binding

affinity calculation. All titrations were performed in triplicate. The binding parameters and

corresponding errors were calculated using the Origin software package155.
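The stated volumes and concentrations fix the peptide-to-enzyme molar ratio reached at the end of each titration, which can be worked out as follows. This is a back-of-the-envelope sketch that ignores dilution and displaced volume in the cell. The number of injections is not given in the text, so the value of 20 used below is purely hypothetical, as are the function names.

```python
CELL_VOLUME_UL = 200.0     # enzyme solution in the cell
ENZYME_MM = 0.1            # enzyme concentration
SYRINGE_VOLUME_UL = 39.6   # total peptide volume titrated
PEPTIDE_MM = 1.0           # peptide concentration


def amount_nmol(volume_ul: float, conc_mm: float) -> float:
    """Amount in nanomoles: uL x mmol/L = nmol."""
    return volume_ul * conc_mm


enzyme_nmol = amount_nmol(CELL_VOLUME_UL, ENZYME_MM)
peptide_nmol = amount_nmol(SYRINGE_VOLUME_UL, PEPTIDE_MM)
final_ratio = peptide_nmol / enzyme_nmol


def molar_ratio_profile(n_injections: int) -> list:
    """Cumulative peptide:enzyme ratio after each of n equal injections.

    n_injections is a hypothetical parameter; dilution effects are ignored.
    """
    step_ul = SYRINGE_VOLUME_UL / n_injections
    return [
        amount_nmol(step_ul * (i + 1), PEPTIDE_MM) / enzyme_nmol
        for i in range(n_injections)
    ]


print(f"enzyme in cell: {enzyme_nmol:.1f} nmol")
print(f"peptide titrated: {peptide_nmol:.1f} nmol")
print(f"final peptide:enzyme ratio: {final_ratio:.2f}")
print("ratio after first of 20 assumed injections:", round(molar_ratio_profile(20)[0], 3))
```

A final ratio close to 2 means the titration extends past a 1:1 stoichiometric endpoint.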

4.4.7 Preparation of SeMet Labelled Protein

Selenomethionine (SeMet) labelled MdnC was produced using a methionine

auxotroph of E. coli (T7 Crystal Express, NEB). To improve the crystal diffraction quality of

SeMet-MdnC, mutagenesis was carried out to decrease the number of non-conserved Met residues

(M25K, M104L, M167L, M181N, and M282I). Cells were cultured in 6 L of M9 minimal

medium supplemented with 50 mg L-1 L-Met and grown to an OD600=0.6 at 37°C. The

cells were then pelleted by centrifugation, resuspended in 6 L of fresh M9, and

incubated at 37°C for 3 h to deplete the intracellular methionine. 50 mg L-1 L-SeMet was

then added to the culture 20 min before overexpression, which was initiated by

adding 100 μM IPTG followed by growth for 16 h at 18°C. The purification of SeMet

labelled MdnC was the same as described above for the native protein with the addition

of 10 mM dithiothreitol (DTT) to the dialysis, anion exchange, and gel filtration

purification buffers. SeMet labelled MdnB was prepared and purified in the same

fashion.

4.4.8 Crystallization, Data Collection and Crystallographic Analysis

Purified MdnC was mixed with 10 eq. of AMP-PNP, 1.2 eq. of MgCl2 and 1.05

eq. of MdnA1-35 (final concentration [MdnC] = 7.4 mg mL-1) and allowed to equilibrate at

4°C overnight. Initial crystal screening was performed in a vapor diffusion, sitting drop

format using commercial sparse matrix screens. Small plate clusters were identified in a

condition containing 200 mM ammonium sulfate, 20% PEG-3350, and 100 mM bis-Tris-

HCl, pH 6.5. Optimization of salt and pH, along with microseeding, was performed in

hanging drop format at 20°C. The resultant crystals, with a maximum size of

~50×200×200 μm, were obtained in a final condition that contained 250 mM ammonium

sulfate, 28% PEG-3350, 8% dioxane and 100 mM bis-Tris-HCl, pH 5.5.

MdnB was concentrated to 6.0 mg mL-1 and premixed with 5 eq. AMP-PNP and

1.5 eq. MgCl2. The mixture was incubated at 4°C for 4 h and screened with commercial

sparse matrix screens in a vapor diffusion, sitting drop format. Diffracting, plate-shaped single

crystals with a size of ~40×100×100 μm were obtained in the best condition, which contained

20% PEG 4,000, 20% 2-propanol and 100 mM sodium citrate, pH 5.6. SeMet labelled

MdnB was crystallized in the same manner. Crystals were harvested and flash frozen with

15% glycerol as cryoprotectant.

Diffraction data for MdnB crystals were collected on beamline 22-ID of the

Advanced Photon Source-Argonne National Laboratory (APS-ANL) at a wavelength of

0.9787 Å. MdnC diffraction data were collected on beamline 21-ID-G of the Life

Sciences Collaborative Access Team (LS-CAT) facility, APS-ANL, at a wavelength of

0.9786 Å. Data were collected at 100 K, and integrated, merged and scaled using the XDS

package156. The MdnB SeMet-SAD data set was processed in space group C2

with two molecules per asymmetric unit. Phases were calculated with

PHENIX.AUTOSOL180,159 and the resultant partial model was used in molecular

replacement for the native MdnB data. PHENIX.AUTOBUILD succeeded in placing 80% of

the amino acids in the MdnB sequence. The remaining residues were built into the electron

density maps manually with COOT160. MdnC crystals belonged to space group

P41, with 8 monomers per asymmetric unit. The MdnC structure was solved by molecular

replacement with PHASER157, using dimeric MdnB as the search model. PHENIX.AUTOBUILD placed

85% of the amino acids in the MdnC sequence, while the rest of MdnC as well as the MdnA

fragments were built manually with COOT. A SeMet labelled MdnC data set facilitated

phase improvement throughout the model building. Structures were refined using

REFMAC5183 and PHENIX.REFINE159. Sigma-A weighted, simulated annealing

composite omit maps were used to judge and verify structures throughout refinement.

In the refined MdnC and MdnB structures, 95% and 96% of residues are Ramachandran favored,

with 0.6% and 0.9% outliers, respectively.

Crystallographic data and refinement statistics are shown in Table 4-3. Sequence

alignments were carried out with Clustal Omega229. MdnC sequence evolutionary

conservation was calculated with ConSurf230. Protein surface electrostatics were

calculated and mapped with APBS231 and PDB2PQR232. Structural illustrations were

prepared with PyMOL161.

CHAPTER 5

CRYSTAL STRUCTURES OF Y3 IN FUNGUS Coprinus comatus REVEAL A NOVEL LECTIN FAMILY WITH ANTIVIRAL AND ANTITUMOR ACTIVITIES

5.1 Introduction to Lectins and the Proteinaceous Natural Product Y3

Lectins are a diverse group of non-immune proteins that exhibit reversible and

specific binding to carbohydrates and have been found in almost all organisms,

including a high prevalence in fungi, plants and animals233. The proteins belong to a

number of unrelated structure classes, with substantially varying carbohydrate

specificities, from monosaccharides to complex oligosaccharides234. Lectin-initiated

carbohydrate/protein interaction is a key feature of many fundamental biological

processes and often facilitates cell adhesion, cell-to-cell interaction, as well as

carbohydrate storage and utilization235. The chemistry and biology of lectins, with their extraordinarily

tight and specific carbohydrate binding, are of general interest because of the

associated biologically relevant activities, and lectins are potential targets for the development of

therapeutic treatments. Lectins have been exploited as biomarkers and applied as

diagnostic tools in cancer cell detection, imaging and targeting236,237,238. Additionally, the

potential application of utilizing these carbohydrate-binding proteins as antifungal,

antiviral, or antitumor agents239 is under investigation.

Fungal lectins have attracted significant attention during the past decades as a

source of biomolecules with interesting and novel characteristics. Additionally, lectin

genes account for many uncharacterized ORFs in a variety of species. A number of

novel fungal lectins have been described recently, with a majority arising from

mushroom species240. Mushrooms show great potential for the production of bioactive

metabolites and are a prolific resource for natural product drug leads241,242,243. Most

mushroom lectins are fruiting body cytoplasmic lectins244, and only a few secreted

mushroom lectins have been reported so far245,246. Mushroom

lectins are proposed to contribute to self-defense against predators244,247. Similar to

lectins from other species, mushroom lectins have significant pharmacological

potential244, including mitogenic248, anti-proliferative249, antiviral250 and antitumor

activities251. Several mushroom lectins have been structurally characterized recently by

X-ray crystallography, revealing diverse folds: BEL in Boletus edulis has a jelly-roll fold252;

CNL in Clitocybe nebularis253 and rMpL in Macrolepiota procera are both β-trefoil folded

lectins254; secreted LDL in Lyophyllum decastes has been reported to form an αβ2α-

sandwich architecture245.

Y3 is a proteinaceous natural product from the mushroom Coprinus comatus and was

previously reported as a probable glycoprotein, and an inhibitor against the

multiplication of tobacco mosaic virus (TMV)255. The amino acid sequence of Y3 was

deposited several years after the initial report (ADK35888.1). A limited number of

homologous sequences, sharing low similarity with Y3, are in current databases and

none has been characterized256 (Figure 5-1A). In order to understand the antiviral (TMV)

mechanisms, as well as the toxicity and pharmacology of Y3, we heterologously

expressed mature Y3 for functional and structural analysis. Purified Y3 contains several

disulfide bridges and interacts with carbohydrates in a non-covalent fashion. The

characterization of Y3 in vitro, along with the high resolution crystal structures, indicates

that Y3 belongs to a novel lectin family. Structural identification of Y3 facilitates our

understanding of its antiviral mechanisms, and advances our characterization of its

expanded antiviral/antitumor activity. Additionally, Y3 provides a reference for the

identification of novel lectins and provides a structural basis for protein-glycan

interactions in situ.

Figure 5-1. Recombinant Y3 expression and characterization. A) Sequence alignment (PRALINE) of Y3 (C. comatus) with uncharacterized proteins from Hebeloma cylindrosporum h7 and Galerina marginata CBS 339.88. Predicted signal peptide regions are in grey; conserved cysteines are in orange; the predicted N-glycosylation site (▲) of Y3 is in blue. B) Y3 harvested from P. pastoris heterologous expression was purified by size exclusion chromatography as a probable dimer. Running positions of 8.4-kDa and 42-kDa molecular mass standards are indicated. C) Native and denaturing PAGE analysis of Y3 suggested the aggregation of Y3 in solution. D) ESI-MS analysis shows a mass peak corresponding to a single Y3 protomer, excluding an intermolecular disulfide bridge. The calculated mass of the Y3 protomer (starting from the mature peptide sequence, •) is 12229.54 Da.

5.2 Characterization of Y3 as a Novel Lectin

5.2.1 Genetic and Biochemical Characterization of Y3

Y3 is isolated from C. comatus, a common fungus often seen growing on lawns,

roads and waste areas, which is an edible mushroom when young257. Full-length Y3

consists of 130 amino acids encoded by a 390-bp gene, with a 12-a.a. N-terminal signal

peptide258. Consistent with previous reports of Y3 as a glycoprotein255, several

post-translational modification prediction servers predict a potential N-glycosylation site

(Asn92) with high confidence, as well as additional probable sites of

O-glycosylation259,260. Recombinant mature Y3 was produced using a heterologous

expression system in P. pastoris for further characterization. The putative signal peptide

of Y3 was excluded from the clone. Meanwhile, to maximally preserve the original

structure and activity, no purification tag was included.

Figure 5-2. Characterization of purified Y3. A) The phenol-sulfuric acid method was used for total carbohydrate determination. D-glucose (100 μg∙mL-1) or protein (10 mg∙mL-1) was diluted to 400 μL with water and mixed with 1 mL concentrated sulfuric acid. Purified Y3 contains carbohydrates at a w/w ratio of 22%. B) LC-MS profile of Y3-containing size exclusion chromatography fractions. Two peaks were identified in the MS (positive mode) trace; only peak II shows the protein-associated UV-vis absorbance at 260 nm. C) Peak II belongs to Y3 protomers with a calculated molecular weight of 12.20 kDa, while D) peak I corresponds to the bound carbohydrates.

Overexpressed Y3 was collected and purified from the P. pastoris culture

supernatant. The size exclusion chromatography profile indicates that Y3 (MW 12.2 kDa) is

aggregated in solution, with a retention time similar to that of the standard E. coli

maltose binding protein (MBP, MW 42 kDa, Figure 5-1B). Native PAGE analysis

suggests an even higher level of aggregation. Denaturing SDS-PAGE analysis indicates

a molecular weight of ~14 kDa, consistent with the mass of 12.2 kDa from mass

spectrometry (Figure 5-1C, D and Figure 5-2). Based on MS, PNGase-F treatment did

not produce any additional species, indicating that the protein is unlikely to be

N-glycosylated. Purified Y3 subjected to heat (90 °C for 20 min) or acid/base treatment (pH

3-11, 100 mM buffer) retained the same SDS-PAGE profile and still produced protein

crystals of good size and shape. Ellman's test excluded the presence of free thiols in

the purified protein, suggesting that the eight cysteines form four disulfide bridges in mature

Y3. Carbohydrates were identified in Y3-containing fractions throughout our

purification. Phenol-sulfuric acid analysis indicates a 22% w/w carbohydrate/protein

ratio in the final size exclusion chromatography purified samples (Figure 5-2A). LC-MS

analysis also indicates a series of small peaks as potential carbohydrate fragments.

These peaks lack the UV-vis absorbance at 260 nm associated with the

protein and span a wide range of molecular weights (Figure 5-2D).

Figure 5-3. tNCS complicates the structural determination of Y3. A) Native Y3 data sets have a non-origin Patterson peak at [0.5 0.0 0.5] with an intensity of 47% of the origin peak, indicating the presence of translational non-crystallographic symmetry. B) ARCIMBOLDO ab initio phasing with a library containing three antiparallel β-strands (library size ~8,000). Calculations of rotation functions suggest several top antiparallel β-strand candidates, but none of them was retained in the translation function search and packing evaluation.

Table 5-1. X-ray data collection, processing and structure refinement of Y3.

                              Derivative Form              Native Form
                              Small unit cell - 5HA2       High resolution - 5HA3
Data Collection
Resolution range (Å)          41.3 - 1.70 (1.76 - 1.70)*   41.8 - 1.18 (1.22 - 1.18)
Diffraction source            21-ID-F (0.9787 Å)           21-ID-G (0.9787 Å)
Space group                   P21                          P21
a, b, c (Å); β (°)            41.1, 55.2, 41.0; 99.5       53.3, 56.1, 62.7; 92.7
Total no. of reflections      148850 (11787)               567305 (38773)
Unique reflections            20217 (1790)                 113409 (9627)
Multiplicity                  7.4 (6.6)                    5.0 (4.0)
Completeness                  0.99 (0.87)                  0.93 (0.80)
<I/σ(I)>                      14.57 (3.70)                 22.98 (4.44)
Wilson B factor (Å2)          14.0                         8.1
Rmerge                        0.093 (0.44)                 0.041 (0.29)
Rmeas                         0.099 (0.48)                 0.046 (0.34)
CC1/2                         1 (0.93)                     1 (0.91)
Structure Refinement
Rwork                         0.159 (0.187)                0.172 (0.205)
Rfree                         0.194 (0.248)                0.189 (0.240)
No. of non-H atoms            1950                         4189
  Protein                     1751                         3420
  Ligand                      34                           52
Protein residues              224                          448
RMS bonds (Å)                 0.019                        0.006
RMS angles (°)                1.54                         1.05
Ramachandran
  Favored (%)                 95                           97
  Outliers (%)                0                            0
Clashscore                    2.9                          1.3
Average B factors (Å2)        18.0                         11.9
  Protein                     16.9                         10.0
  Ligand                      30.5                         14.9
  Water                       27.2                         20.7
PDB entry                     5HA2                         5HA3

*Values for the outer shell are given in parentheses.

Figure 5-4. Overall structure of the Y3 dimer. A) The Y3 dimer forms a 10-stranded β-sheet; helices α1 and α2 are located on one side of the dimeric interface, B) while the two α3 helices are located on the other side. C) Topology diagram of the Y3 dimer. D) LDL shares a similar protomer structure (pdb entry 4NDV, RMSD of 2.4 Å), but has distinct dimeric packing. E) Plant killer toxin SMKT (pdb entry 1KVD), with a similar 10-stranded β-sheet, forms dimeric heterodimers.

5.2.2 tNCS Complicates the Structure Determination

Native Y3 crystals were harvested at high pH (pH 9.5) and diffract X-rays to 1.18 Å resolution in space group P21, with unit-cell parameters a = 53.3, b = 56.1, c = 62.7 Å, β = 92.7º. Integrated data sets possess a large non-origin Patterson peak at [0.5, 0, 0.5], with 45.5% of the origin peak intensity (Figure 5-3A). Matthews coefficient analysis indicates a possible tetramer solution with a low water percentage of 33.6 % (MW 12.4 kDa, Matthews coefficient 2.55 Å3∙Da-1). No homologous structure is available for solving the crystal phases by molecular replacement (MR). We made considerable efforts with ab initio phasing261,262,263 and sulfur/heavy atom (HA)-SAD phasing, but the calculations failed owing (we believe) to the presence of translational non-crystallographic symmetry (tNCS). tNCS is described as a pathology of protein crystals264,265. When multiple copies of a macromolecule are found in similar orientations, phasing and refinement programs commonly confuse crystallographic and non-crystallographic symmetry and thus produce problematic structure solutions. The phase problem was eventually solved for Y3 by identifying a crystal with an alternative lattice (P21, a = 41.1, b = 55.2, c = 41.0 Å, β = 99.5º). This crystal was soaked in K2PtCl4 and does not exhibit tNCS. The locations of the HAs were identified with SHELXD188, and the phases were calculated with SHELXE and refined with additional rounds of MR-SAD266,184 as described70.
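The Matthews coefficient analysis mentioned above can be illustrated with a short calculation. The following is a minimal sketch, not the procedure used in this work: it assumes the standard relation V_M = V_cell / (Z × n × MW), a monoclinic cell volume of a·b·c·sin β, Z = 2 for space group P21, and the common approximation that the solvent fraction is about 1 − 1.23/V_M. All function names are hypothetical.

```python
import math


def monoclinic_cell_volume(a, b, c, beta_deg):
    """Volume (Å^3) of a monoclinic unit cell with unique angle beta (degrees)."""
    return a * b * c * math.sin(math.radians(beta_deg))


def matthews_coefficient(cell_volume, z, n_per_asu, mw_da):
    """V_M (Å^3/Da): cell volume divided by the total protein mass in the cell."""
    return cell_volume / (z * n_per_asu * mw_da)


def solvent_fraction(vm):
    """Approximate solvent fraction, assuming 1.23 Å^3/Da for a typical protein."""
    return 1.0 - 1.23 / vm


if __name__ == "__main__":
    # Native-form cell parameters and molecular weight as quoted in the text.
    volume = monoclinic_cell_volume(53.3, 56.1, 62.7, 92.7)
    for n in range(1, 7):
        vm = matthews_coefficient(volume, z=2, n_per_asu=n, mw_da=12400)
        print(f"{n} copies/ASU: V_M = {vm:.2f} Å^3/Da, "
              f"solvent ~ {solvent_fraction(vm):.1%}")
```

The result depends on the assumed molecular weight and partial specific volume, so the numbers from such a sketch may differ from those reported by dedicated crystallographic software.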

5.2.3 Structure of Y3 Reveals a Novel Lectin Family

Y3 contains two dimers or one dimer (PISA server)130 per asymmetric unit in the two crystal forms. The Y3 protomer forms a compact single-domain architecture with an αβα-sandwich, composed of three α-helices and a five-stranded β-sheet. The β-sheet consists of strands β1-β4-β5-β3-β2, from edge to center, in an antiparallel orientation (Figure 5-4). Two long α-helices, α1 and α2, pack against one side of the β-sheet, while the C-terminal α3 is located on the other side. Helix α3 is a unique feature of Y3 compared with the reported αβ2α-sandwich fold lectin (Figure 5-4D). The structure is highly ordered, with only short turns and loops connecting the α-helices and β-strands. As predicted, four intramolecular disulfide bridges further stabilize the protomer: Cys6-Cys83 connects β1 and α2, Cys18-Cys51 connects α1 and β3, Cys43-Cys108 connects β2 and α3, and Cys26-Cys72 joins α1 and α2 (Figure 5-5B). The structure clearly shows that the N-terminus of Y3 is modified267. The α-factor signal sequence was removed during expression, while Gln1 was cyclized to form PCA1 (pyroglutamic acid, Figure 5-5A) by glutaminyl cyclase during protein synthesis268. PCA is the common +1 residue following the signal peptide269. A CHES buffer molecule from the crystallization condition was well resolved in the electron density maps and is located between the β3α2 and β4β5 loops, with the anionic sulfonate group interacting with NPhe97, NSer68 and the Asn71 sidechain, and with the hydrophobic cyclohexane ring extending into the bulk solvent (Figure 5-5E). The four Y3 protomers (native form) in the crystallographic asymmetric unit are very similar to one another, with RMSD values of 0.14-0.36 Å. The protomers differ primarily in the β1α1 loop region, with Cα (backbone carbon) movements of up to 4.5 Å (residues 8-13, Figure 5-5F). The two five-stranded antiparallel β-sheets from two Y3 protomers assemble into a large ten-stranded antiparallel β-sheet, with all long α-helices (α1 and α2) located on one side of the β-sheet and the short α-helices (α3) on the other side. In addition, residues 32-35, located on the α1β2 loop, lie close to the neighboring β’2 and form intermolecular hydrogen bonds. Residues 106-108 of α3 also interact with β’2 through electrostatic effects and π-π stacking (Figure 5-5D).

Structural homology searches (DALI server)131 revealed that Y3 shares the highest structural similarity with the α-galactosyl-binding lectin LDL from the mushroom L. decastes, with 12% sequence identity245. LDL is a 96-residue protein consisting of a 5-stranded antiparallel β-sheet, two α-helices and three disulfide bridges. The two α-helices pack against one side of the β-sheet, similar to the long helices α1 and α2 of Y3 (Figure 5-4D). Y3 is also structurally similar to Gnk2, a secreted plant mannose-binding lectin (Ginkgo biloba, 7% sequence identity) that exhibits antifungal activity270. Neither LDL nor Gnk2 shares the disulfide bridge positions of Y3. Y3 protomers also form a distinct dimer compared with LDL or Gnk2. Gnk2 exists as a monomer in solution271, while LDL forms an alternate dimer by stacking the exposed sides of the β-strands of the two subunits as an αβ2α sandwich272. Interestingly, SMKT from Millerozyma farinosa273 shares a similar assembly with Y3 dimers. SMKT is a plant acidophilic, antifungal killer toxin and forms dimeric heterodimers (Figure 5-4E).

5.2.4 Carbohydrates Binding Site

Y3 has a large hydrophilic pocket at the dimeric interface. The pocket sits on the 10-stranded β-sheet plane and is surrounded by two α3 helices, one from each protomer. Extra electron density was identified in the middle of this pocket near the sidechains of the hydrophilic amino acids Asp102 and Asn104. In the absence of a sulfur anomalous signal, the density cannot be attributed to CHES. Although the exact identity of the molecule remains unclear, we assumed it to be a carbohydrate fragment. A molecule of mannose, which is the major carbohydrate unit of P. pastoris, was modeled into the calculated mFo-DFc map (Figure 5-6A). The C2-hydroxyl group is close to Asp102 and Asn104 and is stabilized by hydrogen-bonding interactions. The aromatic residue Tyr7 is located beneath the six-membered ring of the carbohydrate and restrains the molecule in the highly hydrophilic cavity.

Figure 5-5. High-resolution crystal structure of Y3. A) The N-terminal Met0 was cleaved during Y3 expression. PCA1 was formed through Gln1 cyclization. B) Four intramolecular disulfide bridges stabilize Y3 protomers. The 2mFo-DFc map is displayed at a 2.0σ contour level. C) HA-derivative Y3 dimer in the alternative P21 crystal form. The anomalous difference map is displayed at a 4.0σ contour level. Pt3 interacts with a disulfide bridge in an isomorphous fashion. D) Gln41 and Arg’35 stabilize the Y3 dimer. E) A bound CHES molecule is well resolved near the β4β5 and β3α2 loops. F) Y3 protomers differ mainly in the β1α1 loop region. The orange loop (chain D) is shifted ~ 4.5 Å compared with the loops of the other protomers. The α2β4 loop is also noted. G) CCL shares a similar protomer structure, but possesses longer β1α1 and α2β4 loops to assist the protein-carbohydrate interaction. CCL also includes an extra disulfide bridge (*) between the β1α1 loop and the C-terminal loop.

The titration of mono- or disaccharides into Y3 did not produce detectable heat as measured by ITC, indicating no detectable binding interaction. The potential interaction might be blocked by polysaccharide glycans that remained bound during our purification274. To further address the protein-carbohydrate recognition, the dominant yeast glycan, poly-mannose (1’, 4’ linkage), was modeled into the Y3 carbohydrate-binding site to detail the interaction. Hexa-mannose interacted with the Y3 dimer, with the 4’-terminal saccharide lying near Asp102/Asn104 and the chain extending across the dimeric interface through a groove along the flexible β1α1 loop (Figure 5-6B).

Figure 5-6. Y3 interacts with carbohydrates through a novel mechanism. A) A mannose molecule was modeled into a hydrophilic pocket near Asp102 and Asn104. Extra electron density in the calculated mFo-DFc map in the pocket is displayed at a 3.0σ contour level. B) Hydrophobicity map of the Y3 dimer. The electrostatic potential (±6 kT∙e-1) is plotted on the solvent-accessible surface. The Y3 dimer possesses an amphiphilic ‘Janus’ architecture. The modeled polysaccharide (hexa-mannose) plugs into the central hydrophilic pocket and extends through the groove along the β1α1 loop. The 4’-terminus is noted in yellow.

5.2.5 Y3 with Antitumor and Antiviral Activities

Y3 isolated from C. comatus was reported to inhibit TMV infection of N. glutinosa leaves at a concentration of 12.5 μg∙mL-1 protein255. Heterologously expressed recombinant Y3 maintains the reported anti-TMV activity (data not shown). In addition, we evaluated Y3-mediated tumor cell cytotoxicity. Y3 displayed antitumor activity against human cervical cancer (HeLa) cells in a dose-dependent manner, with an IC50 value of 13.2 μg∙mL-1. In addition, our preliminary results indicate that Y3 restricts HCV entry into Huh7.5 cells. The inhibition is ~100% at a protein concentration of 200 µg∙mL-1.

5.3 Discussions and Insights

This study presents the functional and structural characterization of Y3, a novel lectin from the mushroom C. comatus. Recombinant Y3 is secreted from P. pastoris harboring the pPICZαA-Y3 plasmid, and shares the same SDS-PAGE profile and the same anti-TMV activity as the C. comatus isolates255. Our purified Y3 contains a large w/w ratio of carbohydrates, in line with previous reports suggesting that it is a glycoprotein. Multiple analyses, including mass spectrometry, indicate that Y3 interacts with large glycans through a non-covalent association. The large aggregates of Y3 in native PAGE also support this hypothesis. Hence, we assigned Y3 as a lectin. The presence of the signal peptide along with multiple disulfide bridges suggests that Y3 is most likely secreted in the natural producer. Most mushroom lectins are cytosolic, with the majority found in the fruit body and a few located in the mycelium244. Only a few examples of extracellular mushroom lectins have been reported so far, exemplified by LEL from Lentinus edodes246 and LDL from L. decastes245. Consistent with Y3 being a secreted lectin, the protein remains intact over a broad pH/temperature range, and this extraordinarily high stability expands its pharmacological potential.

Y3 crystallization was carried out using a conventional vapor diffusion method. Although Y3 crystals diffract X-rays to high resolution, attempted ab initio phasing did not provide interpretable electron density maps. tNCS is most likely the key problematic factor. ARCIMBOLDO phasing with an antiparallel β-strand library provided rotation function solutions with good contrast, but none of the solutions was retained during the translation function search and packing evaluation263 (Figure 5-3B). tNCS also complicated the experimental phasing calculations. The final structure solution was greatly facilitated by a single ‘anomalous’ crystal with a smaller unit cell and without apparent tNCS. Of note, the new crystal form has a higher calculated water percentage (37% compared with 33%), suggesting that the new crystal form might be an artifact caused by Pt soaking rather than by direct dehydration275.

The detailed high-resolution crystal structure does not show protein N/O-glycosylation, and confirms Y3 as a novel lectin. Dimeric Y3 forms a large bowl-shaped 10-stranded antiparallel β-sheet, with four and two α-helices on either side. Y3 possesses a unique tertiary structure and does not share a similar fold with other lectins. Reported fungal lectins have been grouped into different structural classes276, typified by the bladed β-propeller fold277, β-trefoil fold278, jelly roll fold273, and αβ2α-sandwich fold245. The Y3 protomer is similar to the αβ2α-sandwich protomer. However, with an additional C-terminal helix, the Y3 dimer forms a large αβα-sandwich. This αβα-sandwich represents a novel class of lectin. Several lectins, including L-type lectins279 and C-type lectins280, require Ca2+ (and Mn2+) to mediate carbohydrate recognition and interaction. No metal ions are conclusively resolved in the native structures, and Y3 may employ a metal-independent carbohydrate interaction. Although the Y3 protomer is relatively similar to CCL, Y3 and CCL do not share the same carbohydrate interaction mechanism (Figure 5-5FG)245. CCL and Y3 share the same β1α2 disulfide bridge. The β1α1 loop of CCL has 3 extra amino acids. The longer β1α1 loop, combined with the longer α2β4 loop (2 amino acid residues longer), provides a large pocket for the CCL carbohydrate interaction. In addition, CCL contains an extra disulfide bridge between the β1α1LLC loop and the C-terminusLLC loop. The extra disulfide bridge partially restrains the flexibility of β1α1LLC, probably for carbohydrate specificity. Partially resolved carbohydrates are bound in a highly hydrophilic pocket of Y3 located at the dimeric interface, with the C2’-hydroxyl group interacting with Asp102 and Asn104. Tyr7 also helps to stabilize the modeled mannose molecule. Several other lectins have similar reported pockets; for example, site 1 in CLL also consists of a Glu-Gln-Tyr pocket that interacts with N-acetylglucosamine (GlcNAc)281. Other reported systems also use a β-stranded dimeric interface for specific molecule or peptide recognition, exemplified by the allosteric site of procaspase-3282, 283. In addition, a molecule of CHES is well resolved at the β3α2 and β4β5 loop region of Y3. This region has a critical Asn71 that provides electrostatic interactions, and shows flexibility similar to that of the LLC carbohydrate-binding site near the β1α1LLC and α2β4LLC loops (Figure 5-5E)245. The CHES binding site might also assist the carbohydrate interaction.

Y3 does not show detectable interaction with common mono- or disaccharides in our ITC-based screening. As with Chlorophyllum molybdites lectin (CML)274, the Y3 protein might already have been non-specifically saturated with longer-chain glycans during yeast expression/purification, whereas mono- or disaccharides usually possess a much lower lectin binding affinity284. To better understand the Y3-glycan interaction, poly-mannose chains (1’, 4’ linkage) of different lengths were modeled into the Y3 carbohydrate-binding pocket with a grid covering the entire Y3 dimer. The hydrophilic residues near the dimeric interface pocket, including Asp102, Asn104 and Asp111, largely stabilized the modeled glycans. The 4’-terminus of the modeled polysaccharide shares a similar protein contact surface with the mono-mannose fitted into the calculated mFo-DFc map (Figure 5-6A). The hexa-mannose chain extends from 4’ to 1’ along the protein β1α1 loop and interacts with several additional key residues such as Asp8 and Asn9, while the 4’-terminus could also extend in the same fashion (Figure 5-6B). Further determination of Y3-glycan specificity is still under investigation. However, from the calculated protein contact potential map285, the Y3 dimer forms a ‘Janus’ conformation, with all hydrophilic residues assembled on one side, while the other side is almost completely hydrophobic (Figure 5-6B). This is also a unique feature compared with other reported lectins, and extends the pharmacological potential of Y3 in preventing specific intercellular contact. Y3 may act as a ‘molecular lid’ that covers the glycan chains on the cell/virus surface and creates a hydrophobic shell to prohibit cell (virus)-cell interaction. This structural information provides valuable insight into the potential antiviral and antitumor activity of Y3.

With the information above, the anti-TMV mechanism of Y3 can be predicted. TMV is the archetypal positive-sense single-stranded RNA virus. Although the monomeric unit of the TMV coat protein is not glycosylated, the reported lectin AAL (from Agrocybe aegerita) interacts with TMV in isoelectric focusing (IEF) in vitro250. In addition, Y3 may interact with a specific receptor/transporter on plant leaves and compete with TMV binding and entry. Y3 could employ an antiviral mechanism similar to that of PR-1 (pathogenesis-related protein-1). The PR-1 gene encodes a 168-amino acid extracellular protein with an αβα sandwich architecture, and reacts hypersensitively to TMV286,287. Pathogenesis-related (PR) genes, as well as the protein secretory pathway genes, are directly regulated by NPR1, the key factor in the systemic immune response in plants288. However, the substrate for PR-1 is not yet known. Y3 is also a potential candidate for TMV inhibition in vivo. Transgenic N. tabacum plants overexpressing BanLec-1 (a lectin from Musa paradisiaca) show reduced local spread of TMV compared with WT289.

In addition, we carried out a broad antiviral and antitumor screen for Y3. The detected antitumor activity of Y3 against human cervical cancer (HeLa) cells has an IC50 value of 13.2 μg∙mL-1 (Figure 5-7A). Intriguingly, Y3 almost completely restricts HCV entry into Huh7.5 cells at a protein concentration of 200 µg∙mL-1 (Figure 5-7B). We predict that the activity and specificity of Y3 can be further tuned by altering the amino acid residues on the protein hydrophilic surface (Figure 5-6B). Designed mutagenesis studies and corresponding activity assays are currently ongoing.

In sum, lectins employ widely different strategies to generate ligand specificity and multivalency. The low primary sequence identity and diverse structural features often complicate the identification and characterization of novel lectins290. As a novel family of lectin, Y3 possesses a unique structure as well as unique biomedical potential. Our structural insight allows us to understand the mechanism of its specific activities and to engineer the protein function rationally.


5.4.1 Heterologous Expression of Mature Y3

A codon-optimized gene encoding the leader peptide-free Y3 (Eurofins Genomics) was PCR amplified with primers Y3FXh (5’-GTATCT CTCGAG AAAAGACAAGATCCTTTG) and Y3RNt (5’-TTTTCCTTTTGCGGCCGC TTAAAAATCAGTGG), digested with XhoI and NotI, and ligated into the pPICZαA plasmid. Circular pPICZαA-Y3 was linearized with SacI endonuclease and transformed into P. pastoris X-33 following the manufacturer’s instructions (Pichia Easycomp, Invitrogen) for expression. Cultures (500 mL YPD) were grown at 30ºC for 48 h to an OD600 of 12~18 and harvested by centrifugation (2,000g, 10 min). Cells were resuspended in 150 mL of BMMY media (100 mM potassium phosphate, pH 6.0, 1.34% YNB, 4×10-5% biotin, and 0.5% v/v methanol) for protein overexpression. Methanol was added every 24 h to a final concentration of 0.5% to maintain induction. The cell culture supernatant was harvested on day 3 and dialyzed against PBS buffer (pH 7.4) for 48 h. The supernatant was further concentrated (10,000 MWCO, Millipore centricon) and purified by gel filtration chromatography (HiLoad 16/60 SuperDex-75 column, AKTA FPLC System, GE Healthcare) with a running buffer of 150 mM NaCl and 20 mM Tris-HCl, pH 7.5. Purified Y3 was concentrated to 10.0 mg∙mL-1 (600 μL) for further analysis and crystallization. Protein concentration was determined with the Bradford assay (BSA as standard).

5.4.2 Ellman's Test for Free Thiol Determination

A standard Ellman's test was used to detect free thiol groups in purified Y3 protein291. Y3 (1.0 μL, 10 mg∙mL-1) was mixed with 1 mL DTNB reagent (GoldBio) and incubated at room temperature for 5 min. The optical absorbance at 412 nm was measured and recorded.
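The absorbance reading from an Ellman's test is typically converted to a free thiol concentration with the Beer-Lambert law. The sketch below is an illustration only and not the calculation performed in this work: it assumes a molar extinction coefficient of 14,150 M-1 cm-1 for the TNB product at 412 nm (a commonly quoted value; the reagent supplier may specify a different one), a 1 cm path length, and hypothetical function names.

```python
def free_thiol_molar(a412_sample, a412_blank, epsilon=14150.0, path_cm=1.0):
    """Free thiol concentration (mol/L) from the blank-corrected A412 (Beer-Lambert)."""
    return (a412_sample - a412_blank) / (epsilon * path_cm)


def protein_molar_in_assay(stock_mg_per_ml, sample_ul, total_ul, mw_da):
    """Protein concentration (mol/L) after dilution into the assay volume."""
    stock_molar = stock_mg_per_ml / mw_da  # (g/L) / (g/mol) = mol/L
    return stock_molar * sample_ul / total_ul


def thiols_per_protein(a412_sample, a412_blank, protein_molar, **kwargs):
    """Molar ratio of free thiols to protein; near zero if all cysteines are oxidized."""
    return free_thiol_molar(a412_sample, a412_blank, **kwargs) / protein_molar


if __name__ == "__main__":
    # 1.0 µL of 10 mg/mL Y3 (12.4 kDa) in about 1 mL of reagent, as in the text.
    protein = protein_molar_in_assay(10.0, 1.0, 1001.0, 12400.0)
    # The absorbance values below are placeholders, not measured data.
    print(thiols_per_protein(0.012, 0.011, protein))
```

A ratio close to zero would be consistent with all cysteines being engaged in disulfide bridges; the interpretation still depends on the sensitivity of the assay at the protein concentration used.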

5.4.3 Carbohydrate Content Determination

Concentrated Y3-containing fractions from size exclusion chromatography were processed with the phenol-sulfuric acid method for total carbohydrate determination292. Protein solutions (0.2-10 μL, 10 mg∙mL-1) were diluted to 400 μL with water and mixed with 10 μL of 80% w/w phenol. Sulfuric acid (1.0 mL) was quickly added to the mixture, which was left to stand for 10 min before a 10 min water bath at 25ºC. UV-vis absorbance was recorded at 490 nm. Protein-free blanks and D-glucose standards were also prepared.
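The D-glucose standards described above are normally used to build a linear standard curve, from which the carbohydrate content of a sample is read. A minimal sketch of that calculation follows, assuming a linear response of A490 to glucose mass over the working range; the function names and all numerical values are hypothetical placeholders, not measured data.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def carbohydrate_ug(a490, slope, intercept):
    """Glucose-equivalent carbohydrate mass (µg) read from the standard curve."""
    return (a490 - intercept) / slope


def carb_to_protein_ratio(a490, slope, intercept, protein_ug):
    """Carbohydrate/protein ratio (w/w) for a sample with a known protein mass."""
    return carbohydrate_ug(a490, slope, intercept) / protein_ug


if __name__ == "__main__":
    # Placeholder standard curve: glucose mass (µg) versus A490.
    glucose_ug = [0.0, 5.0, 10.0, 20.0, 40.0]
    a490 = [0.02, 0.07, 0.12, 0.22, 0.42]
    slope, intercept = fit_line(glucose_ug, a490)
    # Placeholder sample: 100 µg protein giving A490 = 0.24.
    print(carb_to_protein_ratio(0.24, slope, intercept, protein_ug=100.0))
```

Because the result is expressed in glucose equivalents, the ratio is an estimate whose accuracy depends on how closely the bound glycans respond like the standard.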

5.4.4 Carbohydrates Specificity Determination with ITC

Isothermal titration calorimetry (ITC) was used to analyze the carbohydrate specificity of Y3. Purified Y3 was diluted to 1.2 mg∙mL-1 (100 μM) and dialyzed against 150 mM NaCl and 20 mM Tris-HCl, pH 7.5 for 16 h. Protein (200 μL) was placed into the reaction cell of a MicroCal iTC200 (Malvern), while different saccharides in the same buffer at a concentration of 1-5 mM were injected into the reaction cell over time at 25°C. D-Glucose, α-D-mannose, β-D-galactose and lactose were tested. Raw data were integrated, normalized and evaluated using ORIGIN 7.0 according to the one-site binding model.
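The idea behind a one-site binding model can be sketched with simple mass-action arithmetic. The code below is not the ORIGIN fitting routine; it is a minimal illustration, with hypothetical function names, of how the equilibrium concentration of a 1:1 complex follows from the total protein and ligand concentrations and an assumed dissociation constant Kd. The Kd values in the usage example are arbitrary.

```python
import math


def complex_concentration(p_total, l_total, kd):
    """Equilibrium [PL] for P + L <-> PL; all inputs in the same molar units."""
    s = p_total + l_total + kd
    # Physically meaningful root of [PL]^2 - s*[PL] + p_total*l_total = 0.
    return (s - math.sqrt(s * s - 4.0 * p_total * l_total)) / 2.0


def fraction_bound(p_total, l_total, kd):
    """Fraction of protein sites occupied by ligand at equilibrium."""
    return complex_concentration(p_total, l_total, kd) / p_total


if __name__ == "__main__":
    # 100 µM protein with 1 mM ligand, for a few arbitrary Kd values.
    for kd in (1e-6, 1e-4, 1e-2):
        print(f"Kd = {kd:.0e} M: fraction bound = "
              f"{fraction_bound(100e-6, 1e-3, kd):.3f}")
```

In such a model, the heat evolved per injection scales with the change in complex concentration, so a very weak interaction would produce little signal at these concentrations.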

5.4.5 Protein Mass Analysis

For ESI mass spectrometry analysis, protein crystals were picked, washed with a solution matching the crystallization condition, and dissolved in 3.0 μL water. MALDI-TOF-MS used sinapic acid as the matrix (Protea). Protein samples were treated using a standard PNGase-F digestion protocol (New England Biolabs), and LC-MS (Agilent) was used to analyze the bound carbohydrates. Protein fractions harvested from size exclusion chromatography were analyzed with the following method: 3.0 % - 60 % of 0.1 % TFA-acetonitrile in 0.1 % TFA-water over 40 min at a flow rate of 1.0 mL∙min-1 (ZORBAX SB-C18, Agilent).

5.4.6 Crystallization of Y3

Initial Y3 crystallization was carried out in a vapor diffusion sitting drop format with a homemade matrix screen designed for glycoproteins (CrystalQuick 96 well sitting drop plate). The homemade matrix screen adapted common conditions reported for other glycoproteins. Compared with commercially available matrix screens, the homemade matrix covers a broader pH range (pH 3.0 ~ 11.5) and a lower precipitant (PEG) percentage. Small plate clusters of Y3 crystals were identified within 15 min at 25ºC in a condition containing 20% PEG-4000 and 0.1 M CHES, pH 10.5. Optimization of precipitant and pH, along with microseeding, was performed in a hanging drop format (24 well VDX crystallization plate). Protein (1.8 μL, 10 mg∙mL-1) plus 2.0 μL of precipitant were equilibrated against 1 mL of reservoir solution. Plate-shaped single crystals with a size of 20×20×100 μm were obtained within 48 h in a final condition that contained 16% PEG-4000, 10% v/v glycerol and 0.1 M CHES, pH 9.5. Crystals of suitable size were harvested and flash frozen in liquid nitrogen with an additional 10% glycerol as cryoprotectant. Heavy-atom derivative crystals for single-wavelength anomalous diffraction (SAD) and multiple isomorphous replacement (MIR) were obtained through co-crystallization or soaking. Compounds (Heavy Atom Screen Kits) containing derivative elements (Pt, Au, Hg, I) were selected according to their solubility (pH 9.5~10.5). Co-crystallization and soaking procedures are detailed in Table 5-2.

5.4.7 Diffraction Data Collection and Processing

Y3 native and derivative crystal X-ray diffraction data sets were collected on beamlines 21-ID-G and 21-ID-F of the Life Sciences Collaborative Access Team (LS-CAT) facility at the Advanced Photon Source (APS), Argonne National Laboratory (ANL) with a wavelength of 0.9786 Å at 100 K. Sulfur-SAD or iodine derivative data sets were collected on an in-house X-ray facility (Cu Kα source, R-AXIS IV++ imaging, Rigaku) with a wavelength of 1.5418 Å at 100 K. Data sets were merged using the XDS program package156 and then scaled with AIMLESS from the CCP4 suite182 in space group P21, in two distinct forms with different unit cell parameters. Y3 native crystals diffract X-rays to 1.18 Å, while the Pt-derivative data set was truncated at 1.70 Å resolution (Table 5-1). Native data sets have large non-origin Patterson peaks as indicated by PHENIX.XTRIAGE from the PHENIX suite159, suggesting the presence of translational non-crystallographic symmetry (tNCS)264.

Table 5-2. Heavy atom derivative crystal preparation of Y3

Compound                                 Concentration   Co-Cry. / Soaking time (30 s, 10 m, 4 h, 24 h)
Potassium iodide                         500 mM          -* - ×† ×
5-amino-2,4,6-triiodoisophthalic acid    3 mM            - - - ×
Potassium tetrachloroplatinate(II)       10 mM           -‡ + +#
Potassium tetracyanoplatinate(II)        10 mM           + ×
Gold(I) potassium cyanide                10 mM           - + +
Gold(III) chloride                       10 mM           - +
Sodium tetrachloroaurate(III)            10 mM           × × ×
Mercury(II) chloride                     Satu.           - -
Mercury(II) nitrate monohydrate          Satu.           - -
Methyl mercury                           Satu.           - -
Ethyl mercuric phosphate                 Satu.           + +
Ethyl mercury chloride                   Satu.           + +

*Crystals without detectable anomalous signal; †crystals unable to harvest or do not diffract; ‡crystals with detectable anomalous signal; and #protein crystals harvested in the alternative space group.

5.4.8 Atomic Structure Determination and Refinement

The structure of Y3 was determined with Pt-SAD. The number and location of heavy atoms were identified with SHELXD (Figure 5-5C)188. Initial phase calculation was carried out using SHELXE188, and the phases were further refined with MLPHARE185, PIRATE186 and PHASER157. An interpretable electron-density map was generated with SOLOMON with 20% solvent flattening293. Model building was initiated with SHELXE, and completed using BUCCANEER181, ARP/wARP189 and COOT160. Final coordinates were refined with REFMAC5183 and PHENIX.REFINE158. Ligands were added manually with COOT160 upon careful inspection of the electron-density maps. Derivative ions were added manually into the calculated anomalous difference map, with their occupancy and anisotropic B-factors refined sequentially. The quality of the models was evaluated using a sigma-weighted, simulated annealing composite omit map. Structural illustrations were prepared with PyMOL161. Statistics on data collection and atomic structure refinement are given in Table 5-1. The refined coordinates have been deposited in the Protein Data Bank (accession codes 5HA2 and 5HA3).

5.4.9 Models of Oligosaccharides Bound Y3

Fitting of oligosaccharides into the Y3 dimeric interface was conducted with

AutoDock 4.2191. Poly-mannose chains (1’, 4’ linkage, n = 3~6) were constructed with

eLBOW294 and further energy minimized with Spartan ’08192. Standard algorithms and

docking procedures were used for a rigid protein and a flexible ligand in a grid covering

the entire protein dimer. The best docking poses were analysed with AutoDockTools.

5.4.10 MTT Assay

The MTT assay was performed using human cervical cancer (HeLa) cells. The cells were cultured in DMEM medium containing 10 % fetal bovine serum and 100 U∙mL-1 penicillin and streptomycin, and maintained at 37°C in a humidified incubator under 5 % CO2. The cells (10^4 cells in 100 μL) were seeded onto a 96-well microtiter plate and incubated overnight. Purified Y3 (final concentrations: 0, 5, 10, 20, 40, 80, 160 μg∙mL-1) was added to the wells. After incubation at 37 °C for 72 h, 10 µL of MTT (5 mg∙mL-1) in PBS was added and incubated for 4 h, followed by aspiration of the medium. Dimethyl sulfoxide (DMSO, 100 μL) was added to each well to dissolve the MTT in the wells, and the plate was agitated for 1 h. The optical density (OD) was measured at 570 nm. The IC50 value for Y3 treatment of the tested tumor cells, defined as the Y3 concentration causing 50 % cell inhibition, was calculated with SPSS.
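The conversion from OD570 readings to an IC50 estimate can be sketched as follows. This is not the SPSS procedure used here; it is a simplified illustration that assumes percent inhibition is computed relative to untreated control wells and that the IC50 is estimated by linear interpolation on a log10 dose scale between the two doses that bracket 50 % inhibition. Function names and example readings are hypothetical.

```python
import math


def percent_inhibition(od_treated, od_control, od_blank=0.0):
    """Percent inhibition of a treated well relative to the untreated control."""
    return 100.0 * (1.0 - (od_treated - od_blank) / (od_control - od_blank))


def ic50_log_interpolation(doses, inhibitions):
    """Estimate the dose giving 50 % inhibition.

    doses must be positive and ascending (exclude the 0 µg/mL control, since
    log10(0) is undefined). Returns None if 50 % is not bracketed.
    """
    points = list(zip(doses, inhibitions))
    for (d_lo, i_lo), (d_hi, i_hi) in zip(points, points[1:]):
        if i_lo <= 50.0 <= i_hi and i_hi != i_lo:
            frac = (50.0 - i_lo) / (i_hi - i_lo)
            log_ic50 = math.log10(d_lo) + frac * (math.log10(d_hi) - math.log10(d_lo))
            return 10.0 ** log_ic50
    return None


if __name__ == "__main__":
    doses = [5, 10, 20, 40, 80, 160]  # µg/mL, as in the assay design
    # Placeholder OD570 readings; the untreated control is taken as 1.00.
    ods = [0.90, 0.62, 0.35, 0.20, 0.12, 0.08]
    inhibitions = [percent_inhibition(od, 1.00) for od in ods]
    print(ic50_log_interpolation(doses, inhibitions))
```

A fitted sigmoidal dose-response model would generally give a more robust estimate than interpolation, particularly when replicate wells are noisy.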

CHAPTER 6

HOMOCYSTEINE METHYLTRANSFERASE MmuM FROM Escherichia coli FACILITATES L-METHIONINE BIOSYNTHESIS AND DAMAGED COFACTOR (R,S)-S-ADENOSYL-L-METHIONINE REPAIR1

6.1 Metabolite Damage is an Under-Recognized Fact of Life

Metabolites are usually defined as the small-molecule natural products that are the intermediates and products of metabolism295,296,297. Metabolites are recognized and acted upon by enzymes, act as regulators that control the pace of metabolism, and serve important biological functions in cells. While cells have well-developed machinery for metabolite synthesis, utilization and degradation, many metabolites are damaged in vivo by spontaneous or enzymatic side-reactions298,299. This metabolite damage occurs on a large scale in all organisms and produces metabolite-like side products, which are useless and often toxic300,301. Thus, damage-control systems have evolved either to restore damaged metabolites to their original state or to convert dangerous compounds into harmless ones299,302. Metabolite damage and its control are analogous to DNA and protein damage and repair, but have been far less studied and remain poorly understood303,304.

6.2 Homocysteine S-Methyltransferase and Metabolite Repair

L-Methionine (Met) is a structurally unique amino acid with a hydrophobic thioether side chain305. Although Met is one of the less abundant proteinogenic amino acids, it has an indispensable role in the initiation of translation306 and commonly contributes to the hydrophobic cores of proteins307. Met is also the main cellular carrier of methyl groups as a key component of S-adenosyl-L-methionine (AdoMet), the universal methyl donor308. Met biosynthesis in bacteria, fungi, and plants is well understood309,310. The biosynthetic pathway proceeds from aspartate via homoserine and cystathionine to homocysteine (Hcy), whose thiol group is then methylated to give Met311. Met is converted to AdoMet in a reaction mediated by S-adenosylmethionine synthetases, EC 2.5.1.6 (also known as methionine adenosyltransferases).

1 Adapted with permission from Li, K., Li, G., Bradbury, L. M. T., Hanson, A. D. & Bruner, S. D. Crystal structure of the homocysteine methyltransferase MmuM from Escherichia coli. Biochem. J. 473, 277–284 (2016) DOI: 10.1042/BJ20150980. Copyright © 2015 The Author(s)

Figure 6-1. HMT converts L-homocysteine (Hcy) to L-methionine (Met) with unique substrate selectivity. A) HMT (MmuM)-catalyzed methionine biosynthesis with SMM or (R,S)-AdoMet, while structural analogs, B) (S,S)-AdoMet and, C) S-ribosylMet are not active substrates for MmuM.

Plants have a unique additional reaction in which Met is S-methylated to yield S-

methyl-L-methionine (SMM) by methionine S-methyltransferase (MMT, EC 2.1.1.12)312.

SMM can serve as a methyl donor to Hcy in a reaction mediated by homocysteine S-

methyltransferase (HMT, EC 2.1.1.10), an enzyme present in plants, bacteria, fungi,

and animals. The presence of both MMT and HMT in plants sets up a cycle (the SMM

cycle) that allows SMM to serve as a storage and transport form of Met313,314,315. The

occurrence of HMTs in bacteria, fungi, and animals311,316,317,303 enables the use of plant

SMM as a source of Met303,318. Additionally, almost all HMTs can also use the non-

natural (R,S) form of AdoMet, which is generated in vivo by spontaneous racemization

of the natural (S,S) form319 and which cannot serve as methyl donor for other

methyltransferases316,317. The ability to process (R,S)-AdoMet enables HMTs to recycle

it back to S-adenosylhomocysteine (AdoHcy) and re-enter the methyl cycle316,317,299.

HMT can thus serve as a repair enzyme for damaged AdoMet, and this is probably its

ancestral function, the ability to use SMM as methyl donor having arisen later in

evolution303. An additional evolutionary innovation is the capacity of HMT from the

selenium-resistant plant Astragalus bisulcatus to use selenocysteine as preferred

methyl acceptor320.

Escherichia coli K12 strain MG1655 has an HMT; this enzyme is encoded by the

mmuM gene, which is in a two-gene operon with the SMM transporter gene mmuP 321.

E. coli MmuM is a 310-residue protein that, like most other HMTs, can use either SMM

or the ‘damaged’ (R,S)-AdoMet as methyl donor; it shows a moderate preference for

SMM and cannot use (S,S)-AdoMet303. Surprisingly, E. coli MmuM - and most other

HMTs - cannot use S-ribosylmethionine (S-ribosylMet) as a methyl donor although this

compound is structurally intermediate between SMM and (R,S)-AdoMet (Figure 6-1)303.

MmuM and other HMTs show sequence similarity to Zn-binding enzymes:

betaine-homocysteine methyltransferase (BHMT, EC 2.1.1.5), which uses betaine as

the methyl donor322, and methionine synthase (MetS, EC 2.1.1.13), which uses 5-

methyltetrahydrofolate (5-MTHF) as the methyl donor323. Structural prediction indicates

that HMTs contain a functional Zn-binding domain. Although several BHMT and MetS

structures have been reported, no structural information is available for any HMT

protein. In this chapter, I present the first HMT protein structure, MmuM from E. coli.

Structures in the oxidized (Cys disulfide), apo (Zn2+-free), and metallated (Zn2+-Hcy)

forms provide a comprehensive view of the active site pocket and allow insights into


substrate selectivity, substrate binding, and the methyl transfer mechanism for the HMT

family. The project is a collaboration between the Bruner group and the Hanson group

(Prof. Andrew D. Hanson, Horticultural Science Department, University of Florida).

6.3 HMT MmuM in E. coli

6.3.1 Biological Functions and Distribution of Bacterial HMT

MmuM enables certain E. coli strains324 to utilize the plant compound SMM as a

methyl donor in methionine biosynthesis. Most E. coli strains also possess two

additional methionine synthases, MetE and MetH, that catalyze the S-methylation of Hcy

using 5-MTHF as methyl donor. MetE is an 85-kDa monomer with a Zn cofactor325 while

MetH is a 136-kDa monomer containing Zn and cobalamin cofactors326. The E. coli

KL19 ΔmetE ΔmetH strain but not the ΔmetE ΔmetH ΔmmuM strain can utilize SMM as

a Met source321, indicating that SMM is a biologically relevant methyl donor.

Additionally, the ΔmmuM ΔmetK mutant in E. coli K-12 ilvA pSAMT (carrying the

AdoMet transporter gene from Rickettsia prowazekii) requires more Met for cell growth

than the ΔmetK (AdoMet synthetase) single mutant grown in the presence of equal

AdoMet327. This observation suggests that AdoMet is a methyl donor for MmuM in vivo.

Besides E. coli, the mmuM gene is present in several clinical pathogens, in most cases

encoded on the chromosome. Several mmuM genes are, however, found encoded on

plasmids, increasing the possibility of horizontal transfer and dissemination of mmuM in

bacteria. Interestingly, humans and other mammals also have a MmuM-like enzyme

with specific activity against SMM, and to a much lesser extent, (S,S)-AdoMet328. This

enzyme (hBHMT2) is annotated as a member of the BHMT family, not the HMT family,

although it does not use betaine as methyl donor329,330.


Table 6-1. X-ray data collection, processing and structure refinement of MmuM

                            Oxidized form                 Apo form                      Metallated form
                            (disulfide bridge)            (Zn2+ free)                   (Zn2+ and Hcy)
Data Collection
Resolution (Å)              34.39 - 2.44 (2.54 - 2.44)*   38.42 - 2.89 (2.99 - 2.89)    24.23 - 1.78 (1.84 - 1.78)
Space group                 I222                          P21212                        I222
a, b, c (Å)                 78.45, 84.46, 85.82           85.82, 85.94, 79.02           78.95, 85.92, 87.66
Total no. of reflections    58379 (6367)                  82600 (7945)                  141424 (13523)
Unique reflections          10331 (1158)                  13530 (1302)                  27033 (2711)
Multiplicity                5.7 (5.5)                     6.1 (6.1)                     5.2 (5.0)
Completeness (%)            99.76 (97.69)                 99.71 (98.41)                 93.40 (95.36)
<I/σ(I)>                    15.8 (3.4)                    11.9 (3.5)                    14.8 (2.5)
Wilson B factor (Å2)        40.7                          41.6                          18.1
Rmerge                      0.082 (0.49)                  0.141 (0.68)                  0.093 (0.60)
Rmeas                       0.090 (0.59)                  0.161 (0.70)                  0.104 (0.72)
CC1/2                       0.999 (0.996)                 0.996 (0.932)                 0.996 (0.735)
Structure Refinement
Rwork                       0.202 (0.276)                 0.225 (0.280)                 0.181 (0.257)
Rfree                       0.217 (0.296)                 0.270 (0.324)                 0.206 (0.282)
No. of non-H atoms
  Protein                   2169                          3932                          2190
  Ligand                    -                             10                            13
  Water                     32                            13                            179
RMS Bonds (Å)               0.009                         0.005                         0.007
RMS Angles (°)              1.26                          0.92                          1.12
Ramachandran
  Favored (%)               95.49                         96.00                         98.00
  Outliers (%)              0                             0.2                           0.35
Clashscore                  7.69                          3.81                          2.05
Average B factors (Å2)
  Protein                   47.2                          42.7                          23.3
  Ligand                    -                             39.3                          37.0
  Water                     43.2                          38.8                          35.7
PDB entry                   5DML                          5DMN                          5DMM

*Values for the outer shell are given in parentheses.
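As a quick consistency check on Table 6-1, the multiplicity of each data set should equal the total number of reflections divided by the number of unique reflections. The short Python sketch below recomputes it from the tabulated overall values. The dictionary layout and the function name are illustrative only and are not part of the original data-processing pipeline.

```python
# Overall reflection counts taken from Table 6-1: (total, unique).
DATASETS = {
    "oxidized": (58379, 10331),
    "apo": (82600, 13530),
    "metallated": (141424, 27033),
}


def multiplicity(total_reflections, unique_reflections):
    """Average number of observations per unique reflection."""
    return total_reflections / unique_reflections


for name, (total, unique) in DATASETS.items():
    print(f"{name}: multiplicity = {multiplicity(total, unique):.1f}")
```

Rounded to one decimal place, the recomputed values agree with the multiplicities listed in the table.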


Figure 6-2. Overall structure of E. coli MmuM. A) Side and, B) top view of the overall structure of MmuM, which forms an (α/β)8 TIM barrel; C) topological diagram of the MmuM monomer. The missing loop (260-274) is boxed in cyan. D) Size exclusion chromatography of purified MmuM, suggesting that the protein is monomeric in solution. Running positions of 66-kDa and 29-kDa molecular mass standards are indicated.

6.3.2 Overall Structure of MmuM in E. coli

Purified recombinant MmuM has a molecular mass of approximately 35 kDa as

estimated by SDS-PAGE, and a calculated molecular weight of 34.49 kDa. The protein

crystallized in the space group I222 or P21212 and the structures were determined using

the deposited structure of T. maritima MetS (PDB entry 1Q7M, N-terminal domain) as

an initial molecular replacement search model. MmuM crystals were identified in the

oxidized (cysteine disulfide), apo (Zn2+ free) or metallated (Zn2+-Hcy) forms. The final

coordinates were refined to resolutions of 1.8-2.9 Å, with one or two monomers of


MmuM present in the asymmetric unit (Figure 6-2 and Table 6-1). Crystallographic

packing analysis using the PISA server130, along with the size exclusion

chromatography elution profile (Figure 6-2D), are both consistent with MmuM being a

monomer in solution. The resolved metallated and oxidized structures in the space

group I222 contain ordered residues 4-259 and 275-310. The missing residues (260-274)

constitute a loop region between β7 and α7 and are presumed disordered. The apo

form (P21212) crystals have three additional disordered regions (34-39, 72-78 and

127-138; Figure 6-3A). The overall crystallographic packing between I222 and

P21212 is similar. MmuM monomers have electrostatic interactions between α2’/α1 and

the neighbouring α4 region. Relative to the I222 structures, the second monomer in

space group P21212 has the region containing α1, α2 and α8 shifted ~ 3.0 Å relative to

other monomers (Figure 6-3B).

Figure 6-3. MmuM apo (P21212) and metallated (I222) form crystals have different crystallographic packing. A) Three missing loops (dark blue, superimposed from the metallated form) in P21212 monomer A (grey) are located on the top surface of the protein. B) MmuM packs differently in the two space groups. P21212 monomer B (light blue) was aligned with the I222 monomer (dark blue), while P21212 monomer A (grey) is shifted ~ 3 Å relative to the I222 crystallographic symmetry copy (orange).


Figure 6-4. MmuM belongs to the HMT superfamily of proteins. Sequence alignments (Clustal W2) of MmuM from E. coli with HMTs from Bacillus subtilis (BSHMT) and Arabidopsis thaliana (ATHMT1), with MetS from T. maritima (MetH-Tm), and with BHMT from rat liver (rBHMT). Zn2+-binding cysteine residues are in red. Critical residues (Y71 and T169) for MmuM activity are in green. The HMT-unique α’5/α’6 loop region is in blue.

MmuM forms a globular compact structure composed of an (α/β)8 TIM barrel331,

with eight outside α-helices and eight inside parallel β-strands that alternate along the

peptide backbone (Figure 6-2). The protein sequence starts from the α’1 helix, located

at the base of the (α/β) barrel. A missing loop (260-274) is located on the outer surface

of the protein, opposite the metal/substrate-binding site. Two short helices α’2/α’3

between β1 and α1 are located on the top of the (α/β) barrel covering the β1, β2 and α1

region, while α’5/α’6 (inserted between β3/α3) cover the top of β3 - β5.

Structural homology searches (DALI server)131 reveal that MmuM shares the

highest structural similarity (Z-score of 33.6) with the Hcy-binding domain of T. maritima

MetS (MetH-Tm, PDB entry 1Q7Z)323. MetH-Tm contains two domains, the N-terminal


Hcy binding domain and the C-terminal 5-MTHF-binding domain, linked by a flexible

loop of ~12 amino acids. The RMSD value is 2.0 Å for 263 aligned Cα atoms between

the N-terminal domain of the MetH-Tm and the MmuM monomer. MmuM also shares a

high structural similarity with BHMT from rat liver (rBHMT, Z-score of 27.7, PDB entry

1UMY) with an RMSD value of 2.4 Å for 261 aligned Cα atoms between the N-terminal

domain of the rBHMT and MmuM monomers. rBHMT contains an extended ~30-residue

C-terminal region, which ends as a long α-helix that is involved in assembly of the active

tetramer322,332. HMT has an elongated α2 and an additional helix-turn-helix at α’5 and

α’6, whereas BHMT and MetS have a much shorter β3-α3 loop (Figure 6-4). All three of

these proteins have a conserved Cys on β6 and a Cys-Cys group at β8. These three

cysteines are metal-binding and form the enzyme active site.

6.3.3 The Zn2+-Binding Active Site

MmuM requires Zn2+ for enzymatic activity303. The oxidized MmuM structure is in

a metal-free state with a disulfide bridge between Cys229 and Cys295 (Figure 6-5A).

The biological role of the oxidized form of MmuM is not known; however, the oxidized protein

can be reversibly reduced and charged with zinc to generate the metallated form. The

MmuM apo form (also Zn2+ free) has Cys229 and Cys295 in the reduced thiol form with

a sulfur-to-sulfur distance of 4.3 Å (Figure 6-5B). This suggests the observed apo form

is in an intermediate state between the oxidized and metallated forms. In contrast, the

metallated form has Cys229, Cys295 and Cys296 coordinated to a Zn2+ center with

distances of 2.4, 2.1 and 3.2 Å, respectively (Figure 6-5C). The coordination of Hcy to

metal with a distance of 2.8 Å contributes to a near-ideal Zn2+-centered tetrahedron

geometry. Additionally, Cys296 in the metallated form appears to form a (partial


occupancy) disulfide bond with a molecule of β-mercaptoethanol, BME (Figure 6-5C).

BME was introduced into the system during the protein dialysis step incorporating Zn2+

into the protein. The oxidized and metallated forms are similar, with an RMSD of 0.28 Å.

In contrast, the crystals of apo MmuM are in a different space group, with an RMSD of

0.42 Å as compared to the metallated form. In the absence of either the Zn2+ center or the

disulfide linkage, several loop regions on the top of the

TIM barrel are disordered, including the loop region between β2 and α2 surrounding the active site

pocket. This observation suggests that the Zn2+ center plays an important structural role

in addition to being a key player in catalysis.

Figure 6-5. The Zn2+-binding active site of MmuM. The structure of MmuM showing the Zn2+-binding pocket in the A) oxidized, B) apo and, C) metallated forms; D) Zn2+-binding site in rBHMT with thiolate ligands and Tyr160, critical for enzyme activity; and E) MetH-Tm with bound Cd2+ (mimicking Zn2+) and substrate Hcy. 2mFo-DFc maps are shown at a contour level of 1.5σ.


Table 6-2. Kinetic parameters for HMT MmuM. MmuM activity was measured as described in the Experimental Section. Affinities for the substrates in mutants with less than 5% of the WT activity were not determined (n.d.). Standard deviations were calculated from three independent titrations.

Enzyme        Relative Vmax (%)    Km (μM, SMM)
WT            100 ± 2.6            45.2 ± 2.4
WT (no K+)    101.4 ± 4.2          73.8 ± 6.8
Y71F          8.7 ± 3.6            127.5 ± 55.7
Y71A          n.d.                 n.d.
Y71T          n.d.                 n.d.
Y71F/T169Y    n.d.                 n.d.

Figure 6-6. MmuM-catalyzed kinetics. A) ITC-based MmuM (WT)-catalyzed methyl transfer reactions with SMM as co-substrate and B) converted thermal power (dq/dt) versus [SMM] plot.

Compared with other similar structures, MmuM has a unique Zn2+-binding pocket.

In the structure of rBHMT, the Zn2+ center is coordinated by three cysteine residues

and Tyr160 on β4 as the fourth ligand (Figure 6-5D). A tyrosine at this position is a

hallmark of the BHMT family. Mutagenesis studies have confirmed the key role of this

tyrosine, as none of the mutants (Tyr160Ala, Tyr160Phe or Tyr160Thr) has significant

BHMT activity332. HMT, as well as MetS family proteins, have a conserved threonine

(Thr169) instead of tyrosine at the corresponding position on β4. MmuM does have a


tyrosine (Tyr71) on β2 close to the Zn2+-binding site. Tyr71 is highly conserved in the

HMT family, but is replaced by phenylalanine in the MetS family of proteins.

Figure 6-7. Stability of MmuM mutants. CD spectra of WT and mutant MmuM are shown. Spectra were recorded over a wavelength range of 200 - 250 nm.

The co-substrate Hcy binds the Zn2+ center at a similar position in MmuM and

MetH-Tm (Figure 6-5E), and is sandwiched between a conserved Phe (MetH-Tm) or Tyr

(MmuM) and Zn2+. To probe the role of Tyr71 near the metal-binding

site, a set of site-directed mutants was designed and assayed. The Y71F mutant has

decreased activity toward the co-substrate SMM but retains a Vmax ~10% of that measured

for the wild type (Figure 6-6, Table 6-2). In contrast, the Y71A mutant is > 20-fold

less active than the wild type. This suggests that loss of the large aromatic side

chain significantly interferes with substrate recognition and interaction. Interestingly, the

Y71T mutant has no detectable activity. As all the MmuM mutants retain similar

secondary structure elements as WT protein (Figure 6-7), a possible explanation is that

the threonine hydroxyl group provides a hydrogen bonding interaction, forcing the

substrate Hcy away from its active conformation. To gain further insight into the

differences between the HMT and BHMT active site arrangements, a double mutant

containing Y71F and T169Y was designed to better mimic the BHMT substrate binding


pocket; this mutant displayed no activity, however. These data suggest that Tyr169 in

the MmuM T169Y mutant may interact with Zn2+ in much the same way as Tyr160 does in

rBHMT (see Figure 6-5D).

Figure 6-8. MmuM substrate binding and recognition. Different methyl donors were modeled into the metallated form of the protein. Bound Hcy (green) is located at the top of the pocket. A) Modeled (R,S)-AdoMet docked near Hcy; B) detailed interactions between (R,S)-AdoMet and Hcy. The S-methyl group in (R,S)-AdoMet is close to the Hcy thiol group, facilitating methyl transfer. C) Modeled (S,S)-AdoMet; D) (S,S)-S-ribosylMet; E) (R,S)-S-ribosylMet and, F) SMM near the active site.

6.3.4 Modeling MmuM Substrate Recognition and Binding

The catalytic mechanism of BHMT has been dissected before, demonstrating

that the reaction may proceed through a transition state where the activated methyl

group reacts directly with the homocysteine thiol group333. HMT is both structurally and

functionally similar to BHMT, and likely retains a similar transition state of the methyl


transfer chemistry. The structure of metallated MmuM has the co-substrate Hcy bound

to the Zn2+ center, but the position of the other co-substrate, the methyl donor, was not

apparent from any of our attempts at co-crystallization. The various enzymes that

methylate Hcy use diverse methyl donors: MetS uses 5-MTHF, and BHMT betaine,

whereas the HMT family proteins generally utilize SMM, (R,S)-AdoMet, or both.

Previously, we demonstrated that MmuM can use SMM or (R,S)-AdoMet as the methyl

donor, but not (S,S)-AdoMet or S-ribosylMet303. To understand the basis for the

observed substrate preference, and the reaction mechanism, co-substrates SMM and

(R,S)-AdoMet, and substrate mimics (S,S)-AdoMet and S-ribosylMet were docked into

the metallated form of MmuM to detail the protein /substrate interaction (Table 6-3 and

Figure 6-8).

Table 6-3. Docking statistics of metallated MmuM (with Hcy) and different methyl donors

                               Cluster Rank 1                       Cluster Rank 2
Methyl Donor           No.     Free Energy of      Binding          Free Energy of      Binding
                               Binding†            Constant‡        Binding†            Constant‡
                               (kcal mol-1)                         (kcal mol-1)
(R,S)-AdoMet           300     -7.79               1.95 μM          -6.21               27.9 μM
(S,S)-AdoMet           300     -7.09               6.39 μM          -6.05               36.6 μM
(S,S)-S-ribosylMet     200     -4.21               827 μM           -3.19               4.61 mM
(R,S)-S-ribosylMet     200     -4.13               931 μM           -3.98               1.21 mM
SMM                    200     -6.42               400 μM           -6.13               664 μM

†Calculated at 298.15 K.
‡Calculated from the docking energy.
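The binding constants in Table 6-3 are derived from the docking free energies. A minimal Python sketch of that conversion is given here, assuming the standard thermodynamic relation K = exp(ΔG/RT) at the 298.15 K stated in the table footnote. The function name is hypothetical and the value of R is the usual gas constant in kcal mol-1 K-1; the docking software's internal implementation is not reproduced.

```python
import math

R_KCAL = 1.9872e-3   # gas constant, kcal mol^-1 K^-1
T_KELVIN = 298.15    # temperature given in the Table 6-3 footnote


def binding_constant_uM(delta_g_kcal, temperature=T_KELVIN):
    """Convert a binding free energy (kcal/mol) to a binding constant in μM.

    Uses K = exp(ΔG / RT); a more negative ΔG gives a smaller (tighter) constant.
    """
    k_molar = math.exp(delta_g_kcal / (R_KCAL * temperature))
    return k_molar * 1e6


# Rank 1 free energies for the two AdoMet diastereomers (Table 6-3).
for donor, dg in [("(R,S)-AdoMet", -7.79), ("(S,S)-AdoMet", -7.09)]:
    print(f"{donor}: {binding_constant_uM(dg):.2f} uM")
```

Under these assumptions the AdoMet and S-ribosylMet entries of Table 6-3 are reproduced to within the rounding of the tabulated free energies.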

(R,S)-AdoMet docks well into metallated MmuM, with a best

estimated inhibition constant of 1.95 μM (Table 6-3). The aromatic adenine ring of (R,S)-AdoMet

lies in a pronounced hydrophobic binding pocket with the exocyclic amino group

interacting with the backbone of Gly294 and Asp20 on β1. The α-amino group of (R,S)-

AdoMet is stabilized through an electrostatic interaction with Glu134, and its

carboxylic group interacts with Gln72 on β2 through a hydrogen bond (Figure 6-8A,B).

Gln72 and Glu134 could be the key factors determining HMT preference for SMM or

(R,S)-AdoMet. Glu134 is located in the α’5/α’6 region connecting β3 and α3. The α’5/α’6

region lies on the protein surface and is a unique feature of HMT family proteins. This

region is disordered in our apo MmuM structure, suggesting that its flexibility may play a role in

protein/co-substrate interaction. Gln72 is located on the β2/α2 loop (71-79, L2) region,

which is also flexible in the MmuM apo form. L2 is critical for the substrate binding in

BHMT. rBHMT is very sensitive to cleavage between rArg86 and rLys93, but has

substantial resistance to proteolysis when bound to the Hcy mimic S-(δ-carboxybutyl)-

homocysteine332. Gln72 in MmuM is a tyrosine in rBHMT (rTyr77), and this residue

plays a role in Hcy recognition334. Gln72 in MmuM opens a larger pocket, allowing the

binding of co-substrate (R,S)-AdoMet.

Compared with (R,S)-AdoMet, the ‘undamaged’ (S,S)-AdoMet binds metallated

MmuM with a very similar estimated binding free energy and inhibition constant (Table

6-3). Most of the protein-ligand interactions remain the same; however, there is a

difference between the orientations of the S-methyl positions (Figure 6-8C). While the

S-methyl group in (R,S)-AdoMet faces the bound Hcy in a predicted, catalytically

productive mode (Figure 6-8B), in the (S,S)-AdoMet model, the activated methyl group

faces outward, pointing in the opposite direction to the thiol group of the Hcy co-

substrate. As a consequence, (S,S)-AdoMet is predicted to bind but not to serve as a

methyl donor.

Additionally, modeling of an S-ribosylMet/MmuM interaction does not result in a

binding pattern similar to that of either of the AdoMet diastereomers. (S,S)-S-ribosylMet


docks into the substrate binding pocket in a non-productive orientation, with the α-amino

and carboxyl group near the co-substrate Hcy, and the ribosyl group on the protein

surface (Figure 6-8D). The interaction has an estimated inhibition constant of 827 μM,

which is about 400 times weaker than that for the metallated MmuM/(R,S)-AdoMet

interaction (1.95 μM; Table 6-3). (R,S)-S-ribosylMet binds the protein with a slightly less favorable free energy

(-4.13 kcal mol-1) and a similar geometry (Figure 6-8E). As S-ribosylMet is not a productive methyl donor

for MmuM303, we propose that the presence of the aromatic adenine ring in AdoMet

helps to correctly position the methyl donor, and thus to facilitate access of the activated

S-methyl group to the Hcy co-substrate.

Although SMM is a good methyl donor in vitro, modeling suggests that it binds

MmuM less tightly than the (R,S)-AdoMet diastereomer. In addition, our top three

docking solutions place SMM deep in a pocket, in a conformation not productive in

terms of methyl transfer (Figure 6-8F). Additional bond conformations (Rank 4, Figure 6-

8F) do show a productive solution with SMM bound in a fashion similar to that of the

Met moiety of (R,S)-AdoMet. This suggests that SMM may also access Hcy from the α’5/α’6

and L2 regions. As SMM is small compared to AdoMet, it is not surprising that docking

algorithms predict a relatively lower binding affinity. The prediction of a higher binding

affinity for (R,S)-AdoMet than for SMM is particularly useful because difficulties in separating

(R,S)-AdoMet and (S,S)-AdoMet, along with the stability of the diastereomers, make it

difficult to measure kinetic parameters for MmuM with (R,S)-AdoMet as methyl

donor303,320.


6.3.5 Specific Potassium Ion Requirement for MmuM Activity

Human BHMT (hBHMT, PDB entry 4M3P) has a K+ ion near the Zn2+ site that is critical

for catalytic activity335. hBHMT has much lower enzymatic activity in the K+-

deficient betaine/Hcy reaction. By comparison, our kinetic data indicate that the

MmuM reaction retains essentially the same Vmax with or without K+, with a modestly higher Km for SMM in its absence (Table 6-2). Interestingly,

we did observe a significant, unexpected electron density located adjacent to the

substrate binding pocket in the MmuM oxidized form (Figure 6-5A). This electron

density is located near the α’2 helix, closest to the backbone nitrogen of Met23 at a distance of 3.3 Å, and is visible at a

contour level of 8σ in the calculated mFo-DFc map. The density is also observable in

the metallated form of the protein, but at a much lower contour level. The position

does not exhibit an anomalous signal at 0.9787 Å, ruling out the possibility of Ni2+ or Zn2+

ions. As there is a prominent dipole interaction of helix α’2 (N-terminus) pointing at the

feature, we fit the density as chloride in the deposited structure.
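The anomalous-signal argument above rests on the photon energy at the data-collection wavelength lying above the K absorption edges of Ni and Zn, so that either metal, if present, would be expected to scatter anomalously. A small Python sketch of that check follows. The K-edge energies are standard tabulated values rather than numbers from this work, and the function names are illustrative.

```python
HC_KEV_ANGSTROM = 12.39842  # Planck constant x speed of light, in keV·Å

# K absorption edge energies (keV) from standard X-ray data tables.
K_EDGES_KEV = {"Ni": 8.333, "Zn": 9.659}


def photon_energy_kev(wavelength_angstrom):
    """Photon energy (keV) for a given X-ray wavelength (Å)."""
    return HC_KEV_ANGSTROM / wavelength_angstrom


def above_k_edge(element, wavelength_angstrom):
    """True if the photon energy exceeds the element's K absorption edge."""
    return photon_energy_kev(wavelength_angstrom) > K_EDGES_KEV[element]


wavelength = 0.9787  # Å, wavelength quoted in the text
print(f"Photon energy: {photon_energy_kev(wavelength):.2f} keV")
for element in K_EDGES_KEV:
    print(element, "K edge exceeded:", above_k_edge(element, wavelength))
```

Lighter ions such as Cl- and K+ have K edges at much lower energies and scatter anomalously only weakly at this wavelength, which is consistent with the absence of a detectable anomalous peak for the modeled chloride.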

6.4 Conclusions and Insights

HMTs catalyze the conversion of homocysteine to methionine using S-

methylmethionine or S-adenosylmethionine as the methyl donor. HMTs play an

important role in methionine biosynthesis and are widely distributed among

microorganisms, plants, and animals. Additionally, HMTs play a role in metabolite repair

of S-adenosylmethionine by removing an inactive diastereomer from the pool. The

mmuM gene product from Escherichia coli is an archetypal HMT family protein and

contains a predicted Zn-binding motif in the enzyme active site. To summarize, we

present here the first structure of a member (MmuM) of the homocysteine

methyltransferase family, a widely distributed enzyme family that plays key roles in the


activated methyl cycle. Using a combination of X-ray crystallography, in vitro and in

silico assays, the work provides a structural basis for catalysis and substrate

discrimination.


6.5.1 Chemicals

Unless otherwise stated, reagents and chemicals were purchased from Fisher

Scientific, Sigma/Aldrich or Hampton.

6.5.2 Cloning, Expression, and Purification of MmuM

The gene for mmuM was amplified via polymerase chain reaction (PCR) from E.

coli K-12 MG1655 genomic DNA with the following primers: mmuM_NdeIF (GCG

CATATG TCGCAGAATAATCCGTTA), and mmuM_XhoIR (GCG CACGAG

TCAGCTTCGCGCTTTTAA). The PCR product was cleaved with the corresponding

endonucleases and then ligated into the expression vector pET30a with a C-terminal

hexa-His tag. The resultant plasmid was transformed into E. coli C43 (DE3) cells for

expression. Cultures (1 L) were grown at 37°C to an OD600=0.8, and overexpression

was initiated by adding isopropyl β-D-1-thiogalactopyranoside (final concentration 400

μM). Growth was continued for 25 h at 25°C, before the cells were harvested by

centrifugation. Cell pellets were resuspended in 25 mL of 0.5 M NaCl and 20 mM Tris-

HCl, pH 7.5, and lysed at 14,000 psi through a nitrogen-pressure microfluidizer cell (M-

110L Pneumatic). The lysate was clarified by centrifugation at 15,000g for 20 min at

4°C. MmuM was purified by immobilized metal affinity chromatography (HisPur Ni-NTA

Resin, Thermo Scientific). After binding for 1 h, the resin was washed with 4 × 10 mL of

0.5 M NaCl, 10 mM imidazole and 20 mM Tris-HCl, pH 7.5, and the bound protein was

eluted with 3 × 2 mL of 0.5 M NaCl, 250 mM imidazole and 20 mM Tris-HCl, pH 7.5.


The elution fraction was further purified by gel filtration chromatography (HiLoad 16/60

SuperDex-200 column, AKTA FPLC System, GE Healthcare) with buffer 150 mM NaCl,

5 mM dithiothreitol and 20 mM Tris-HCl, pH 7.5. The pooled protein was dialyzed

against 1 L of 150 mM NaCl, 5 mM ZnCl2, 1 mM βME and 20 mM Tris-HCl, pH 7.5, for 4 h,

then subsequently dialyzed against 1 L of 150 mM NaCl, 1 mM βME and 20 mM Tris-

HCl, pH 7.5, for an additional 4 h to remove non-specifically bound Zn2+. The protein

was centrifuged to remove precipitated protein before crystallization. Protein

concentration was determined with the Bradford assay using bovine serum albumin (BSA)

as standard.

Figure 6-9. Crystallization and optimization of MmuM. A) Crystals formed in a sitting drop, colored by methylene blue; B) optimization of pH and salt with glycerol as additive and, C) harvested MmuM rod-shaped single crystals.

6.5.3 MmuM Crystallization

Initial crystal screening was performed in a vapor diffusion, sitting drop format

using commercial sparse matrix screens. Small clusters of needle crystals were

identified in a condition containing 2.0 M ammonium sulfate and 0.1 M sodium acetate,

pH 4.6. Optimization of salt and pH along with microseeding were performed in hanging

drop format at 20°C. Protein (1.8 μL of a 4 mg mL-1 solution) and 2 μL of precipitant

were mixed and equilibrated against 1 mL of reservoir solution. Resultant rod-shaped


single crystals with a size of ~20×20×100 μm were obtained in a final condition that

contained 1.6 M ammonium sulfate, 10% v/v glycerol and 0.1 M sodium acetate, pH 4.8

(Figure 6-9). Crystals of suitable size were harvested and frozen in liquid nitrogen

without additional cryoprotectant. Harvested MmuM crystals were in an oxidized metal-

free form. To obtain the apo form crystals, 10 mM dithiothreitol was added to the

reservoir solution and equilibrated with the crystal drop for 16 h. Metallated form crystals

were obtained by further soaking in 1 mM ZnCl2 and 10 mM Hcy for 5 min before

harvesting and flash freezing.

6.5.4 Data Collection and Processing, and Structure Refinement

Diffraction data were collected on beamline 21-ID-F/G of the Life Sciences

Collaborative Access Team (LS-CAT) facility at the Advanced Photon Source (APS),

Argonne National Laboratory. Data were collected at 100 K at a wavelength of 0.9786

Å and were integrated, merged and scaled using the XDS package156 to a resolution of 1.76 - 2.88 Å

in space group I222 or P21212, with one or two protein molecules per asymmetric unit.

Phases for MmuM in space group I222 were determined by molecular replacement

with MOLREP336 using the N-terminal structure of B12-dependent methionine

synthase323 (PDB entry 1Q7M, 26% sequence identity) from Thermotoga maritima as a

search model. The molecular replacement solution was refined by rigid-body refinement

in REFMAC5183 to an Rwork of 43% and Rfree of 51%. The refined molecular replacement

model, combined with an anomalous signal from Zn atoms, provided an interpretable

electron density map, and the atomic model was completed by several rebuilding and

refinement cycles using SHELXDE188, RESOLVE187, ARP/wARP189,

PHENIX.REFINE158 and COOT160. Zn2+ ions were added manually into the calculated


anomalous difference map, and occupancy and anisotropic B-factors were refined

sequentially. Hcy was also added manually, adjacent to the Zn2+ ion. Our metallated

crystal form has extra electron density near Cys296, which we assigned as βME. A

distance restraint between SβME and SCys296 was added (2.05 Å) during the refinement.

Water molecules were placed in the structure based on manual inspection of the 2mFo-

DFc and mFo-DFc electron density maps. No TLS/NCS restraint was used during the

refinement process. The quality of the model was evaluated using a simulated

annealing composite omit map. The refined coordinates have been deposited in the

Protein Data Bank (accession codes 5DML, 5DMN and 5DMM). Statistics on data

collection and atomic structure refinement are summarized in Table 6-1. Structural

illustrations were prepared with PyMOL161.

6.5.5 Models of Bound Methyl-Donors and Analysis

Fitting of methyl donors into the active site was conducted with AutoDock 4.2191.

The methyl donors SMM, (R,S)-AdoMet, or their mimics (S,S)-AdoMet and S-ribosylMet

were constructed and energy minimized with Spartan ’08192. The position of bound Hcy

molecule in the metallated form MmuM was held fixed as the methyl donors were fitted.

Standard algorithms and docking procedures were used for a rigid protein and a flexible

ligand in a grid covering the entire protein. The best docking poses were analyzed with

DS Visualizer 2.5 and the hydrogen bonding/π–π interactions and corresponding

estimated free energy of ligand binding (ΔG) were calculated.

6.5.6 MmuM Active Site Mutagenesis

Mutations of residues in the MmuM active site were carried out using the Q5

Site-Directed Mutagenesis Kit (New England Biolabs) following the manufacturer’s

instructions. The oligonucleotides used were: Y71F


(CACTGCCAGCtttCAGGCGACGC), Y71A (CACTGCCAGCgctCAGGCGACGCCGG),

Y71T (CACTGCCAGCactCAGGCGACGCCG), and Y71F/T169Y (plus

GGCCTGCGAAtacCTGCCGAATTTTTCCGAGATTG). The validity of mutagenesis was

confirmed by DNA sequencing.
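The mutagenic primers above can be sanity-checked by translating the substituted codon in each one. The Python sketch below assumes that the lowercase bases mark the mutated codon, which is how the sequences are written here, and uses only the four codons needed from the standard genetic code. The dictionary and function names are illustrative.

```python
# Subset of the standard genetic code covering the codons used in the primers.
CODON_TABLE = {"TTT": "F", "GCT": "A", "ACT": "T", "TAC": "Y"}

# Primer sequences as listed in the text, paired with the intended residue.
PRIMERS = {
    "Y71F": ("CACTGCCAGCtttCAGGCGACGC", "F"),
    "Y71A": ("CACTGCCAGCgctCAGGCGACGCCGG", "A"),
    "Y71T": ("CACTGCCAGCactCAGGCGACGCCG", "T"),
    "T169Y": ("GGCCTGCGAAtacCTGCCGAATTTTTCCGAGATTG", "Y"),
}


def mutated_codon(primer):
    """Return the lowercase (mutated) codon of a primer in upper case."""
    codon = "".join(base for base in primer if base.islower())
    if len(codon) != 3:
        raise ValueError("expected exactly one lowercase codon")
    return codon.upper()


def encoded_residue(primer):
    """Translate the mutated codon to its one-letter amino acid code."""
    return CODON_TABLE[mutated_codon(primer)]


for name, (primer, intended) in PRIMERS.items():
    print(name, mutated_codon(primer), encoded_residue(primer), intended)
```

Each substituted codon translates to the residue named in the corresponding mutant.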

6.5.7 ITC-Based Activity Assay

MmuM wild type (WT) and mutant proteins were purified as described above. WT

and mutant proteins, prepared concurrently, were each concentrated to 10.0 mg mL-1

for isothermal titration calorimetry (ITC)-based kinetics assays. Protein samples (1.0 μL,

final concentration 1.0 μM), along with 2 mM DL-Hcy, were mixed with 300 μL of freshly

prepared reaction buffer (100 mM NaCl, 100 mM KCl, 20 mM Tris-HCl, pH 7.5, 1 mM

DTT) and injected into the MicroCal iTC200 (Malvern) reaction cell (cell volume, 200

μL). 39.6 μL of 5.0 mM DL-SMM was dissolved in the same reaction buffer and titrated

into the cell in the single injection mode over 800 s at 30°C. Potassium-free protein

(WT) kinetics were carried out in K+-deficient medium (200 mM NaCl, 20 mM Tris-HCl, pH

7.5, 1 mM DTT). Protein-free blank runs were performed in a similar fashion. Heat

change during the reaction was detected and recorded. All titrations were repeated in

triplicate. The reaction kinetics parameters337 were calculated using the Origin software

package155.
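In a single-injection ITC kinetics experiment, the measured thermal power is proportional to the reaction rate, v = (dq/dt) / (ΔHapp · Vcell), and the resulting rate versus substrate concentration curve can be fitted to the Michaelis-Menten equation. The Python sketch below illustrates both steps under stated assumptions: it uses a Hanes-Woolf linearization as a simple stand-in for the nonlinear fitting performed in Origin, the apparent enthalpy value in the usage check is made up, and all function names are hypothetical. It is not the procedure used to obtain Table 6-2.

```python
def rate_from_power(power_ucal_per_s, delta_h_cal_per_mol, cell_volume_l):
    """Reaction rate (μM s^-1) from thermal power (μcal s^-1).

    μcal/s divided by (cal/mol x L) gives μmol L^-1 s^-1, i.e. μM/s.
    """
    return power_ucal_per_s / (delta_h_cal_per_mol * cell_volume_l)


def michaelis_menten(substrate, vmax, km):
    """Michaelis-Menten rate for a given substrate concentration."""
    return vmax * substrate / (km + substrate)


def fit_hanes_woolf(substrate_values, rate_values):
    """Estimate (Vmax, Km) by least squares on [S]/v = [S]/Vmax + Km/Vmax."""
    n = len(substrate_values)
    y_values = [s / v for s, v in zip(substrate_values, rate_values)]
    mean_x = sum(substrate_values) / n
    mean_y = sum(y_values) / n
    sxx = sum((x - mean_x) ** 2 for x in substrate_values)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(substrate_values, y_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    vmax = 1.0 / slope
    km = intercept * vmax
    return vmax, km


# Synthetic, noise-free data generated with the WT Km from Table 6-2 (45.2 μM)
# and an arbitrary Vmax of 100 (relative units).
substrate = [10.0, 25.0, 50.0, 100.0, 200.0, 400.0]
rates = [michaelis_menten(s, 100.0, 45.2) for s in substrate]
print(fit_hanes_woolf(substrate, rates))
```

With real calorimetric data, a direct nonlinear fit is generally preferable to a linearization because it weights the measurement error more evenly across substrate concentrations.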

6.5.8 Determination of Protein Secondary Structure

Far-UV circular dichroism spectra (CD) of purified MmuM and site-directed

mutants were recorded at 25°C (Aviv 202 CD Spectrometer). Proteins were dialyzed

against 150 mM NaCl, 5 mM ZnCl2, 1 mM βME and 20 mM Tris-HCl, pH 7.5, for 4 h after

immobilized metal affinity chromatography purification, and then dialyzed against 100

mM NaCl, 100 mM KCl, 20 mM Tris-HCl, pH 7.5, and diluted to a concentration of 0.3


mg mL-1 for CD spectrometry. Samples were placed in a 0.1 cm pathlength cuvette, with

ellipticities (θ) recorded at a wavelength range of 200 - 250 nm.
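Raw ellipticities depend on protein concentration and path length, so spectra of WT and mutant proteins are often compared after conversion to mean residue ellipticity. A minimal Python sketch of that conversion follows, using the 0.3 mg mL-1 concentration and 0.1 cm path length given above. The 310 residues and 34.49 kDa quoted earlier for MmuM are used as placeholders (the His-tagged construct would differ slightly), and the function name and the example reading of -10 mdeg are hypothetical. The source does not state that this normalization was applied.

```python
def mean_residue_ellipticity(theta_mdeg, mol_weight_da, n_residues,
                             conc_mg_per_ml, path_cm):
    """Mean residue ellipticity in deg cm^2 dmol^-1.

    [θ] = θ(mdeg) x MRW / (10 x path(cm) x conc(mg/mL)),
    where MRW is the mean residue weight, MW / (N - 1).
    """
    mean_residue_weight = mol_weight_da / (n_residues - 1)
    return theta_mdeg * mean_residue_weight / (10.0 * path_cm * conc_mg_per_ml)


# Hypothetical reading of -10 mdeg under the conditions described in the text.
mre = mean_residue_ellipticity(
    theta_mdeg=-10.0,
    mol_weight_da=34490.0,   # calculated mass quoted for MmuM
    n_residues=310,          # length quoted for MmuM
    conc_mg_per_ml=0.3,
    path_cm=0.1,
)
print(f"{mre:.1f} deg cm^2 dmol^-1")
```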


LIST OF REFERENCES

1. Sutak, R., Lesuisse, E., Tachezy, J. & Richardson, D. R. Crusade for iron: iron uptake in unicellular eukaryotes and its significance for virulence. Trends Microbiol. 16, 261–268 (2008).

2. Saha, R., Saha, N., Donofrio, R. S. & Bestervelt, L. L. Microbial siderophores: a mini review. J. Microbiol. 52, 1–15 (2012).

3. Weinberg, E. D. Iron availability and infection. Biochim. Biophys. Acta - Gen. Subj. 1790, 600–605 (2009).

4. Miethke, M. & Marahiel, M. A. Siderophore-based iron acquisition and pathogen control. Microbiol. Mol. Biol. Rev. 71, 413–451 (2007).

5. Braun, V., Hantke, K., Winkelmann, G. & Carrano, C. J. Transition Metals in Microbial Metabolism. (Harwood, Amsterdam, 1997).

6. Sandy, M. & Butler, A. Microbial iron acquisition: marine and terrestrial siderophores. Chem. Rev. 109, 4580–4595 (2009).

7. Deneer, H. G., Healey, V. & Boychuk, I. Reduction of exogenous ferric iron by a surface-associated ferric reductase of Listeria spp. Microbiology 141, 1985–1992 (1995).

8. Tong, Y. & Guo, M. Bacterial heme-transport proteins and their heme-coordination modes. Arch. Biochem. Biophys. 481, 1–15 (2009).

9. Braun, V. Iron uptake mechanisms and their regulation in pathogenic bacteria. Int. J. Med. Microbiol. 291, 67–79 (2001).

10. Hider, R. C. & Kong, X. Chemistry and biology of siderophores. Nat. Prod. Rep. 27, 637–657 (2010).

11. Raymond, K. N., Dertz, E. A. & Kim, S. S. Enterobactin: an archetype for microbial iron transport. Proc. Natl. Acad. Sci. U. S. A. 100, 3584–3588 (2003).

12. de Lorenzo, V., Bindereif, A., Paw, B. H. & Neilands, J. B. Aerobactin biosynthesis and transport genes of plasmid ColV-K30 in Escherichia coli K-12. J. Bacteriol. 165, 601–611 (1986).

13. Luo, M., Fadeev, E. A. & Groves, J. T. Mycobactin-mediated iron acquisition within macrophages. Nat. Chem. Biol. 1, 149–153 (2005).

14. Johnstone, T. C. & Nolan, E. M. Beyond iron: non-classical biological functions of bacterial siderophores. Dalt. Trans. 44, 6320–6339 (2015).

15. Gründlinger, M. et al. Fungal siderophore biosynthesis is partially localized in peroxisomes. Mol. Microbiol. 88, 862–875 (2013).

16. Condurso, H. L. & Bruner, S. D. Structure and noncanonical chemistry of nonribosomal peptide biosynthetic machinery. Nat. Prod. Rep. 29, 1099–1110 (2012).

17. Lazos, O. et al. Biosynthesis of the putative siderophore erythrochelin requires unprecedented crosstalk between separate nonribosomal peptide gene clusters. Chem. Biol. 17, 160–173 (2010).

18. di Russo, N. V., Condurso, H. L., Li, K., Bruner, S. D. & Roitberg, A. E. Oxygen diffusion pathways in a cofactor-independent dioxygenase. Chem. Sci. 6, 6341–6348 (2015).

19. Wurst, J. M. et al. Identification of inhibitors of PvdQ, an enzyme involved in the synthesis of the siderophore pyoverdine. ACS Chem. Biol. 9, 1536–1544 (2014).

20. Fischbach, M. A., Lin, H., Liu, D. R. & Walsh, C. T. In vitro characterization of IroB, a pathogen-associated C-glycosyltransferase. Proc. Natl. Acad. Sci. U. S. A. 102, 571–576 (2005).

21. Martinez, J. S. & Butler, A. Marine amphiphilic siderophores: marinobactin structure, uptake, and microbial partitioning. J. Inorg. Biochem. 101, 1692–1698 (2007).

22. Kem, M. P. & Butler, A. Acyl peptidic siderophores: structures, biosyntheses and post-assembly modifications. BioMetals 28, 445–459 (2015).

23. Gauglitz, J. M., Iinishi, A., Ito, Y. & Butler, A. Microbial tailoring of acyl peptidic siderophores. Biochemistry 53, 2624–2631 (2014).

24. Kem, M. P., Naka, H., Iinishi, A., Haygood, M. G. & Butler, A. Fatty acid hydrolysis of acyl marinobactin siderophores by Marinobacter acylases. Biochemistry 54, 744–752 (2015).

25. Zane, H. K. et al. Biosynthesis of amphi-enterobactin siderophores by Vibrio harveyi BAA-1116: identification of a bifunctional nonribosomal peptide synthetase condensation domain. J. Am. Chem. Soc. 136, 5615–5618 (2014).

26. Giessen, T. W. et al. Isolation, structure elucidation, and biosynthesis of an unusual hydroxamic acid ester-containing siderophore from Actinosynnema mirum. J. Nat. Prod. 75, 905–914 (2012).

27. Barry, S. M. & Challis, G. L. Recent advances in siderophore biosynthesis. Curr. Opin. Chem. Biol. 13, 205–215 (2009).

28. Challis, G. L. A widely distributed bacterial pathway for siderophore biosynthesis independent of nonribosomal peptide synthetases. ChemBioChem 6, 601–611 (2005).

29. Berti, A. D. & Thomas, M. G. Analysis of achromobactin biosynthesis by Pseudomonas syringae pv. syringae B728a. J. Bacteriol. 191, 4594–4604 (2009).

30. Barona-Gómez, F., Wong, U., Giannakopulos, A. E., Derrick, P. J. & Challis, G. L. Identification of a cluster of genes that directs desferrioxamine biosynthesis in Streptomyces coelicolor M145. J. Am. Chem. Soc. 126, 16282–16283 (2004).

31. Penwell, W. F. et al. Discovery and characterization of new hydroxamate siderophores, Baumannoferrin A and B, produced by Acinetobacter baumannii. ChemBioChem 16, 1896–1904 (2015).

32. Soe, C. Z. & Codd, R. Unsaturated macrocyclic dihydroxamic acid siderophores produced by Shewanella putrefaciens using precursor-directed biosynthesis. ACS Chem. Biol. 9, 945–956 (2014).

33. Reddy, V. S., Shlykov, M. A., Castillo, R., Sun, E. I. & Saier, M. H. The major facilitator superfamily (MFS) revisited. FEBS J. 279, 2022–2035 (2012).

34. Yan, N. Structural biology of the major facilitator superfamily transporters. Annu. Rev. Biophys. 44, 257–283 (2015).

35. Furrer, J. L., Sanders, D. N., Hook-Barnard, I. G. & McIntosh, M. A. Export of the siderophore enterobactin in Escherichia coli: involvement of a 43 kDa membrane exporter. Mol. Microbiol. 44, 1225–1234 (2002).

36. Miethke, M., Schmidt, S. & Marahiel, M. A. The major facilitator superfamily-type transporter YmfE and the multidrug-efflux activator Mta mediate bacillibactin secretion in Bacillus subtilis. J. Bacteriol. 190, 5143–5152 (2008).

37. Hotta, K., Kim, C. Y., Fox, D. T. & Koppisch, A. T. Siderophore-mediated iron acquisition in Bacillus anthracis and related strains. Microbiology 156, 1918–1925 (2010).

38. Nikaido, H. & Pagès, J. M. Broad-specificity efflux pumps and their role in multidrug resistance of Gram-negative bacteria. FEMS Microbiol. Rev. 36, 340–363 (2012).

39. Li, X. Z., Nikaido, H. & Poole, K. Role of MexA-MexB-OprM in antibiotic efflux in Pseudomonas aeruginosa. Antimicrob. Agents Chemother. 39, 1948–1953 (1995).

40. Bleuel, C. et al. TolC is involved in enterobactin efflux across the outer membrane of Escherichia coli. J. Bacteriol. 187, 6701–6707 (2005).

41. Pei, X.-Y. et al. Structures of sequential open states in a symmetrical opening transition of the TolC exit duct. Proc. Natl. Acad. Sci. U. S. A. 108, 2112–2117 (2011).

42. Du, D. et al. Structure of the AcrAB-TolC multidrug efflux pump. Nature 509, 512–515 (2014).

43. Horiyama, T. & Nishino, K. AcrB, AcrD, and MdtABC multidrug efflux systems are involved in enterobactin export in Escherichia coli. PLoS One 9, e108642 (2014).

44. Wells, R. M. et al. Discovery of a siderophore export system essential for virulence of Mycobacterium tuberculosis. PLoS Pathog. 9, (2013).

45. Radhakrishnan, A. et al. Crystal structure of the transcriptional regulator Rv0678 of Mycobacterium tuberculosis. J. Biol. Chem. 289, 16526–16540 (2014).

46. Jones, C. M. et al. Self-poisoning of Mycobacterium tuberculosis by interrupting siderophore recycling. Proc. Natl. Acad. Sci. U. S. A. 111, 1945–1950 (2014).

47. Milano, A. et al. Azole resistance in Mycobacterium tuberculosis is mediated by the MmpS5–MmpL5 efflux system. Tuberculosis 89, 84–90 (2009).

48. Caza, M., Lépine, F., Milot, S. & Dozois, C. M. Specific roles of the iroBCDEN genes in virulence of an avian pathogenic Escherichia coli O78 strain and in production of salmochelins. Infect. Immun. 76, 3539–3549 (2008).

49. Crouch, M.-L. V., Castor, M., Karlinsey, J. E., Kalhorn, T. & Fang, F. C. Biosynthesis and IroC-dependent export of the siderophore salmochelin are essential for virulence of Salmonella enterica serovar Typhimurium. Mol. Microbiol. 67, 971–983 (2008).

50. Caza, M., Lépine, F. & Dozois, C. M. Secretion, but not overall synthesis, of catecholate siderophores contributes to virulence of extraintestinal pathogenic Escherichia coli. Mol. Microbiol. 80, 266–282 (2011).

51. Ferguson, A. D., Hofmann, E., Coulton, J. W., Diederichs, K. & Welte, W. Siderophore-mediated iron transport: crystal structure of FhuA with bound lipopolysaccharide. Science 282, 2215–2220 (1998).

52. Rabsch, W., Voigt, W., Reissbrodt, R., Tsolis, R. M. & Bäumler, A. J. Salmonella typhimurium IroN and FepA proteins mediate uptake of enterobactin but differ in their specificity for other siderophores. J. Bacteriol. 181, 3610–3612 (1999).

53. Braun, M. et al. Structure of TonB in complex with FhuA, E. coli outer membrane receptor. Science. 1399–1402 (2006).

54. Naikare, H. et al. Campylobacter jejuni ferric–enterobactin receptor CfrA is TonB3 dependent and mediates iron acquisition from structurally different catechol siderophores. Metallomics 5, 988 (2013).

55. Ferguson, A. D. et al. Structural basis of gating by the outer membrane transporter FecA. Science 295, 1715–1719 (2002).

56. Brillet, K. et al. A β strand lock exchange for signal transduction in TonB-dependent transducers on the basis of a common structural motif. Structure 15, 1383–1391 (2007).

57. Krewulak, K. D. & Vogel, H. J. TonB or not TonB: is that the question? Biochem. Cell Biol. 89, 87–97 (2011).

58. Jordan, L. D. et al. Energy-dependent motion of TonB in the Gram-negative bacterial inner membrane. Proc. Natl. Acad. Sci. U. S. A. 110, 11553–11558 (2013).

59. Sverzhinsky, A. et al. Coordinated rearrangements between cytoplasmic and periplasmic domains of the membrane protein complex ExbB-ExbD of Escherichia coli. Structure 22, 791–797 (2014).

60. Sverzhinsky, A. et al. Amphipol-trapped ExbB–ExbD membrane protein complex from Escherichia coli: a biochemical and structural case study. J. Membr. Biol. 247, 1005–1018 (2014).

61. Ollis, A. A. & Postle, K. ExbD mutants define initial stages in TonB energization. J. Mol. Biol. 415, 237–247 (2012).

62. Ollis, A. A. & Postle, K. Identification of functionally important TonB-ExbD periplasmic domain interactions in vivo. J. Bacteriol. 194, 3078–3087 (2012).

63. Rees, D. C., Johnson, E. & Lewinson, O. ABC transporters: the power to change. Nat. Rev. Mol. Cell Biol. 10, 218–227 (2009).

64. Woo, J.-S., Zeltina, A., Goetz, B. A. & Locher, K. P. X-ray structure of the Yersinia pestis heme transporter HmuUV. Nat. Struct. Mol. Biol. 19, 1310–1315 (2012).

65. Rodriguez, G. M. & Smith, I. Identification of an ABC transporter required for iron acquisition and virulence in Mycobacterium tuberculosis. J. Bacteriol. 188, 424–430 (2006).

66. Granger, J. B., Lu, Z., Ferguson, J. B., Santa Maria, P. J. & Novak, W. R. P. Cloning, expression, purification and characterization of an iron-dependent regulator protein from Thermobifida fusca. Protein Expr. Purif. 92, 190–194 (2013).

67. Chu, B. C. H. & Vogel, H. J. A structural and functional analysis of type III periplasmic and substrate binding proteins: their role in bacterial siderophore and heme transport. Biol. Chem. 392, 39–52 (2011).

68. Chu, B. C. H., Otten, R., Krewulak, K. D., Mulder, F. A. A. & Vogel, H. J. The solution structure, binding properties, and dynamics of the bacterial siderophore-binding protein FepB. J. Biol. Chem. 289, 29219–29234 (2014).

69. Grigg, J. C., Cooper, J. D., Cheung, J., Heinrichs, D. E. & Murphy, M. E. P. The Staphylococcus aureus siderophore receptor HtsA undergoes localized conformational changes to enclose staphyloferrin A in an arginine-rich binding pocket. J. Biol. Chem. 285, 11162–11171 (2010).

70. Li, K. & Bruner, S. D. Structure and functional analysis of the siderophore periplasmic binding protein from the fuscachelin gene cluster of Thermobifida fusca. Proteins Struct. Funct. Bioinforma. 84, 118–128 (2016).

71. Fukushima, T., Allred, B. E. & Raymond, K. N. Direct evidence of iron uptake by the Gram-positive siderophore-shuttle mechanism without iron reduction. ACS Chem. Biol. 9, 2092–2100 (2014).

72. Fukushima, T. et al. Gram-positive siderophore-shuttle with iron-exchange from Fe-siderophore to apo-siderophore by Bacillus cereus YxeB. Proc. Natl. Acad. Sci. U. S. A. 110, 13821–13826 (2013).

73. Raymond-Bouchard, I. et al. Structural requirements for the activity of the MirB ferrisiderophore transporter of Aspergillus fumigatus. Eukaryot. Cell 11, 1333–1344 (2012).

74. Lin, H. & Fischbach, M. In vitro characterization of salmochelin and enterobactin trilactone hydrolases IroD, IroE, and Fes. J. Am. Chem. Soc. 6097–6104 (2005).

75. Larsen, N. A., Lin, H., Wei, R., Fischbach, M. A. & Walsh, C. T. Structural characterization of enterobactin hydrolase IroE. Biochemistry 45, 10184–10190 (2006).

76. Zeng, X., Mo, Y., Xu, F. & Lin, J. Identification and characterization of a periplasmic trilactone esterase, Cee, revealed unique features of ferric enterobactin acquisition in Campylobacter. Mol. Microbiol. 87, 594–608 (2013).

77. Li, K., Chen, W.-H. & Bruner, S. D. Structure and mechanism of the siderophore-interacting protein from the fuscachelin gene cluster of Thermobifida fusca. Biochemistry 54, 3989–4000 (2015).

78. Matzanke, B. F., Anemüller, S., Schünemann, V., Trautwein, A. X. & Hantke, K. FhuF, part of a siderophore-reductase system. Biochemistry 43, 1386–1392 (2004).

79. Miethke, M., Pierik, A. J., Peuckert, F., Seubert, A. & Marahiel, M. A. Identification and characterization of a novel-type ferric siderophore reductase from a gram-positive extremophile. J. Biol. Chem. 286, 2245–2260 (2011).

80. Bamford, V. A. et al. Preliminary X-ray diffraction analysis of YqjH from Escherichia coli: a putative cytoplasmic ferri-siderophore reductase. Acta Crystallogr. F 64, 792–796 (2008).

81. Miethke, M., Hou, J. & Marahiel, M. A. The siderophore-interacting protein YqjH acts as a ferric reductase in different iron assimilation pathways of Escherichia coli. Biochemistry 50, 10951–10964 (2011).

82. Wang, S., Wu, Y. & Outten, F. W. Fur and the novel regulator YqjI control transcription of the ferric reductase gene yqjH in Escherichia coli. J. Bacteriol. 193, 563–574 (2011).

83. Wang, S., Blahut, M., Wu, Y., Philipkosky, K. E. & Outten, F. W. Communication between binding sites is required for YqjI regulation of target promoters within the yqjH-yqjI intergenic region. J. Bacteriol. 196, 3199–3207 (2014).

84. Ryndak, M. B., Wang, S., Smith, I. & Rodriguez, G. M. The Mycobacterium tuberculosis high-affinity iron importer, IrtA, contains an FAD-binding domain. J. Bacteriol. 192, 861–869 (2010).

85. Sulochana, M. B., Jayachandra, S. Y., Kumar, S. K. A. & Dayanand, A. Antifungal attributes of siderophore produced by the Pseudomonas aeruginosa JAS-25. J. Basic Microbiol. 54, 418–424 (2014).

86. Pramanik, A. et al. Albomycin is an effective antibiotic, as exemplified with Yersinia enterocolitica and Streptococcus pneumoniae. Int. J. Med. Microbiol. 297, 459–469 (2007).

87. Sajid, R., Ghani, F., Adil, S. & Khurshid, M. Oral iron chelation therapy with deferiprone in patients with Thalassemia major. JPMA 59, 388–390 (2009).

88. Górska, A., Sloderbach, A. & Marszałł, M. P. Siderophore-drug complexes: potential medicinal applications of the ‘Trojan horse’ strategy. Trends Pharmacol. Sci. 35, (2014).

89. Wencewicz, T. A., Möllmann, U., Long, T. E. & Miller, M. J. Is drug release necessary for antimicrobial activity of siderophore-drug conjugates? Syntheses and biological studies of the naturally occurring salmycin ‘Trojan Horse’ antibiotics and synthetic desferridanoxamine-antibiotic conjugates. BioMetals 22, 633–648 (2009).

90. de Carvalho, C. C. C. R. & Fernandes, P. Siderophores as ‘Trojan Horses’: tackling multidrug resistance? Front. Microbiol. 5, 1–3 (2014).

91. Kline, T. et al. Antimicrobial effects of novel siderophores linked to β-lactam antibiotics. Bioorg. Med. Chem. 8, 73–93 (2000).

92. Ji, C., Miller, P. A. & Miller, M. J. Iron transport-mediated drug delivery: practical syntheses and in vitro antibacterial studies of tris-catecholate siderophore-aminopenicillin conjugates reveals selectively potent antipseudomonal activity. J. Am. Chem. Soc. 134, 9898–9901 (2012).

93. Wencewicz, T. A. & Miller, M. J. Biscatecholate–monohydroxamate mixed ligand siderophore–carbacephalosporin conjugates are selective sideromycin antibiotics that target Acinetobacter baumannii. J. Med. Chem. 56, 4044–4052 (2013).

94. Zheng, T., Bullock, J. L. & Nolan, E. M. Siderophore-mediated cargo delivery to the cytoplasm of Escherichia coli and Pseudomonas aeruginosa: syntheses of monofunctionalized enterobactin scaffolds and evaluation of enterobactin-cargo conjugate uptake. J. Am. Chem. Soc. 134, 18388–18400 (2012).

95. Zheng, T. & Nolan, E. M. Enterobactin-mediated delivery of β-lactam antibiotics enhances antibacterial activity against pathogenic Escherichia coli. J. Am. Chem. Soc. 136, 9677–9691 (2014).

96. Chairatana, P., Zheng, T. & Nolan, E. M. Targeting virulence: salmochelin modification tunes the antibacterial activity spectrum of β-lactams for pathogen-selective killing of Escherichia coli. Chem. Sci. 6, 4458–4471 (2015).

97. Page, M. G. P., Dantier, C. & Desarbre, E. In vitro properties of BAL30072, a novel siderophore sulfactam with activity against multiresistant Gram-negative bacilli. Antimicrob. Agents Chemother. 54, 2291–2302 (2010).

98. Higgins, P. G., Stefanik, D., Page, M. G. P., Hackel, M. & Seifert, H. In vitro activity of the siderophore monosulfactam BAL30072 against meropenem-non-susceptible Acinetobacter baumannii. J. Antimicrob. Chemother. 67, 1167–1169 (2012).

99. Hofer, B. et al. Combined effects of the siderophore monosulfactam BAL30072 and carbapenems on multidrug-resistant Gram-negative bacilli. J. Antimicrob. Chemother. 68, 1120–1129 (2013).

100. Butler, M. S., Blaskovich, M. A. & Cooper, M. A. Antibiotics in the clinical pipeline in 2013. J. Antibiot. 66, 571–591 (2013).

101. Flanagan, M. E. et al. Preparation, Gram-negative antibacterial activity, and hydrolytic stability of novel siderophore-conjugated monocarbam diols. ACS Med. Chem. Lett. 2, 385–390 (2011).

102. Murphy-Benenato, K. E. et al. Discovery of efficacious Pseudomonas aeruginosa-targeted siderophore-conjugated monocarbams by application of a semi-mechanistic pharmacokinetic/pharmacodynamic model. J. Med. Chem. 58, 2195–2205 (2015).

103. Han, S. et al. Structural basis for effectiveness of siderophore-conjugated monocarbams against clinically relevant strains of Pseudomonas aeruginosa. Proc. Natl. Acad. Sci. U. S. A. 107, 22002–22007 (2010).

104. Murphy-Benenato, K. E. et al. SAR and structural analysis of siderophore-conjugated monocarbam inhibitors of Pseudomonas aeruginosa PBP3. ACS Med. Chem. Lett. 6, 537–542 (2015).

105. Brown, M. F. et al. Pyridone-conjugated monobactam antibiotics with gram-negative activity. J. Med. Chem. 56, 5541–5552 (2013).

106. McPherson, C. J. et al. Clinically relevant Gram-negative resistance mechanisms have no effect on the efficacy of MC-1, a novel siderophore-conjugated monocarbam. Antimicrob. Agents Chemother. 56, 6334–6342 (2012).

107. Wencewicz, T. A., Long, T. E., Möllmann, U. & Miller, M. J. Trihydroxamate siderophore-fluoroquinolone conjugates are selective sideromycin antibiotics that target Staphylococcus aureus. Bioconjug. Chem. 24, 473–486 (2013).

108. Starr, J. et al. Siderophore receptor-mediated uptake of lactivicin analogues in gram-negative bacteria. J. Med. Chem. 57, 3845–3855 (2014).

109. Miller, M. J. et al. Design, synthesis, and study of a mycobactin-artemisinin conjugate that has selective and potent activity against tuberculosis and malaria. J. Am. Chem. Soc. 133, 2076–2079 (2011).

110. Juárez-Hernández, R. E., Franzblau, S. G. & Miller, M. J. Syntheses of mycobactin analogs as potent and selective inhibitors of Mycobacterium tuberculosis. Org. Biomol. Chem. 10, 7584 (2012).

111. Mathavan, I. et al. Structural basis for hijacking siderophore receptors by antimicrobial lasso peptides. Nat. Chem. Biol. 10, 340–342 (2014).

112. Pan, S. J. & Link, A. J. Sequence diversity in the lasso peptide framework: discovery of functional microcin J25 variants with multiple amino acid substitutions. J. Am. Chem. Soc. 133, 5016–5023 (2011).

113. Liu, Z. et al. Regulation of mammalian siderophore 2,5-DHBA in the innate immune response to infection. J. Exp. Med. 211, 1197–1213 (2014).

114. Miethke, M. et al. Inhibition of aryl acid adenylation domains involved in bacterial siderophore synthesis. FEBS J. 273, 409–419 (2006).

115. Ferreras, J. A., Ryu, J.-S., Di Lello, F., Tan, D. S. & Quadri, L. E. N. Small-molecule inhibition of siderophore biosynthesis in Mycobacterium tuberculosis and Yersinia pestis. Nat. Chem. Biol. 1, 29–32 (2005).

116. Qiao, C. et al. 5′-O-[(N-acyl)sulfamoyl]adenosines as antitubercular agents that inhibit MbtA: An adenylation enzyme required for siderophore biosynthesis of the mycobactins. J. Med. Chem. 50, 6080–6094 (2007).

117. Lun, S. et al. Pharmacokinetic and in vivo efficacy studies of the mycobactin biosynthesis inhibitor salicyl-AMS in mice. Antimicrob. Agents Chemother. 57, 5138–5140 (2013).

118. Vega, D. E. & Young, K. D. Accumulation of periplasmic enterobactin impairs the growth and morphology of Escherichia coli tolC mutants. Mol. Microbiol. 91, 508–521 (2013).

119. Sigel, A. & Sigel, H. Metal ions in biological systems, Volume 35: Iron transport and storage microorganisms, plants, and animals. Met. Based. Drugs 5, 262 (1998).

120. Kobayashi, T. & Nishizawa, N. K. Iron uptake, translocation, and regulation in higher plants. Annu. Rev. Plant Biol. 63, 131–152 (2012).

121. Goetz, D. H. et al. The neutrophil lipocalin NGAL is a bacteriostatic agent that interferes with siderophore-mediated iron acquisition. Mol. Cell 10, 1033–1043 (2002).

122. Kaplan, C. D. & Kaplan, J. Iron acquisition and transcriptional regulation. Chem. Rev. 109, 4536–4552 (2009).

123. Miethke, M., Pierik, A. J., Peuckert, F., Seubert, A. & Marahiel, M. A. Identification and characterization of a novel-type ferric siderophore reductase from a Gram-positive extremophile. J. Biol. Chem. 286, 2245–2260 (2011).

124. Dimise, E. J., Widboom, P. F. & Bruner, S. D. Structure elucidation and biosynthesis of fuscachelins, peptide siderophores from the moderate thermophile Thermobifida fusca. Proc. Natl. Acad. Sci. U. S. A. 105, 15311–15316 (2008).

125. Dimise, E. J., Condurso, H. L., Stoker, G. E. & Bruner, S. D. Synthesis and structure confirmation of fuscachelins A and B, structurally unique natural product siderophores from Thermobifida fusca. Org. Biomol. Chem. 10, 5353–5356 (2012).

126. Lykidis, A. et al. Genome sequence and analysis of the soil cellulolytic actinomycete Thermobifida fusca YX. J. Bacteriol. 189, 2477–2486 (2007).

127. Deng, Y. & Zhang, X. DtxR, an iron-dependent transcriptional repressor that regulates the expression of siderophore gene clusters in Thermobifida fusca. FEMS Microbiol. Lett. 362, 1–6 (2014).

128. JCSG. Crystal structure of siderophore-interacting protein (ZP_00813641.1) from Shewanella putrefaciens CN-32 at 2.20 Å resolution. To be Publ.

129. Fox, J. L. Sodium dithionite reduction of flavin. FEBS Lett. 39, 53–55 (1974).

130. Krissinel, E. & Henrick, K. Inference of macromolecular assemblies from crystalline state. J. Mol. Biol. 372, 774–797 (2007).

131. Holm, L. & Rosenström, P. Dali server: conservation mapping in 3D. Nucleic Acids Res. 38, 545–549 (2010).

132. Yamada, M. et al. Elucidations of the catalytic cycle of NADH-cytochrome b5 reductase by X-ray crystallography: new insights into regulation of efficient electron transfer. J. Mol. Biol. 425, 4295–4306 (2013).

133. Koder, R. L. & Miller, A. F. Steady-state kinetic mechanism, stereospecificity, substrate and inhibitor specificity of Enterobacter cloacae nitroreductase. Biochim. Biophys. Acta - Protein Struct. Mol. Enzymol. 1387, 395–405 (1998).

134. Rane, M. J. & Calvo, K. C. Reversal of the nucleotide specificity of ketol acid reductoisomerase by site-directed mutagenesis identifies the NADPH binding site. Arch. Biochem. Biophys. 338, 83–89 (1997).

135. Petschacher, B., Leitgeb, S., Kavanagh, K. L., Wilson, D. K. & Nidetzky, B. The coenzyme specificity of Candida tenuis xylose reductase (AKR2B5) explored by site-directed mutagenesis and X-ray crystallography. Biochem. J. 385, 75–83 (2005).

136. Brinkmann-Chen, S. et al. General approach to reversing ketol-acid reductoisomerase cofactor dependence from NADPH to NADH. Proc. Natl. Acad. Sci. U. S. A. 110, 10946–10951 (2013).

137. Butterton, J. R. & Calderwood, S. B. Identification, cloning, and sequencing of a gene required for ferric vibriobactin utilization by Vibrio cholerae. J. Bacteriol. 176, 5631–5638 (1994).

138. McWilliam, H. et al. Analysis tool web services from the EMBL-EBI. Nucleic Acids Res. 41, 597–600 (2013).

139. Söding, J., Biegert, A. & Lupas, A. N. The HHpred interactive server for protein homology detection and structure prediction. Nucleic Acids Res. 33, 244–248 (2005).

140. Mewies, M., McIntire, W. S. & Scrutton, N. S. Covalent attachment of flavin adenine dinucleotide (FAD) and flavin mononucleotide (FMN) to enzymes: the current state of affairs. Protein Sci. 7, 7–20 (1998).

141. Huang, C. H. et al. Crystal structure of glucooligosaccharide oxidase from Acremonium strictum: A novel flavinylation of 6-S-cysteinyl, 8α-N1-histidyl FAD. J. Biol. Chem. 280, 38831–38838 (2005).

142. Macindoe, G., Mavridis, L., Venkatraman, V., Devignes, M. D. & Ritchie, D. W. HexServer: An FFT-based protein docking server powered by graphics processors. Nucleic Acids Res. 38, 445–449 (2010).

143. Karlsson, A. et al. X-ray crystal structure of benzoate 1,2-dioxygenase reductase from Acinetobacter sp. Strain ADP1. J. Mol. Biol. 318, 261–272 (2002).

144. Correll, C. C. et al. Phthalate dioxygenase reductase: a modular structure for electron transfer from pyridine nucleotides to [2Fe-2S]. Science 258, 1604–1610 (1992).

145. Rosenzweig, A. C., Frederick, C. A., Lippard, S. J. & Nordlund, P. Crystal structure of a bacterial non-haem iron hydroxylase that catalyses the biological oxidation of methane. Nature 366, 537–543 (1993).

146. Senda, M. et al. Molecular mechanism of the redox-dependent interaction between NADH-dependent ferredoxin reductase and Rieske-type [2Fe-2S] ferredoxin. J. Mol. Biol. 373, 382–400 (2007).

147. Fang, R. et al. LSD2/KDM1B and its cofactor NPAC/GLYR1 endow a structural and molecular model for regulation of H3K4 demethylation. Mol. Cell 49, 558–570 (2013).

148. Bruns, C. M. & Karplus, P. A. Refined crystal structure of spinach ferredoxin reductase at 1.7 Å resolution: oxidized, reduced and 2′-phospho-5′-AMP bound states. J. Mol. Biol. 247, 125–145 (1995).

149. Butterton, J. R., Choi, M. H., Watnick, P. I., Carroll, P. A. & Calderwood, S. B. Vibrio cholerae VibF is required for vibriobactin synthesis and is a member of the family of nonribosomal peptide synthetases. J. Bacteriol. 182, 1731–1738 (2000).

150. Altschul, S. F., Gish, W., Miller, W., Myers, E. W. & Lipman, D. J. Basic local alignment search tool. J. Mol. Biol. 215, 403–410 (1990).

151. Patel, K., Kumar, A. & Durani, S. Analysis of the structural consensus of the zinc coordination centers of metalloprotein structures. Biochim. Biophys. Acta - Proteins Proteomics 1774, 1247–1253 (2007).

152. Dimise, E. J., Condurso, H. L., Stoker, G. E. & Bruner, S. D. Synthesis and structure confirmation of fuscachelins A and B, structurally unique natural product siderophores from Thermobifida fusca. Org. Biomol. Chem. 10, 5353–5356 (2012).

153. Young, I. G. & Gibson, F. Isolation of enterochelin from Escherichia coli. Methods Enzymol. 56, 394–398 (1979).

154. Dimise, E. J., Widboom, P. F. & Bruner, S. D. Structure elucidation and biosynthesis of fuscachelins, peptide siderophores from the moderate thermophile Thermobifida fusca. Proc. Natl. Acad. Sci. U. S. A. 105, 15311–15316 (2008).

155. Keller, S. et al. High-precision isothermal titration calorimetry with automated peak-shape analysis. Anal. Chem. 84, 5066–5073 (2012).

156. Kabsch, W. XDS. Acta Crystallogr. Sect. D Biol. Crystallogr. 66, 125–132 (2010).

157. McCoy, A. J. et al. Phaser crystallographic software. J. Appl. Crystallogr. 40, 658–674 (2007).

158. Afonine, P. V. et al. Towards automated crystallographic structure refinement with phenix.refine. Acta Crystallogr. Sect. D Biol. Crystallogr. 68, 352–367 (2012).

159. Adams, P. D. et al. PHENIX: A comprehensive Python-based system for macromolecular structure solution. Acta Crystallogr. Sect. D Biol. Crystallogr. 66, 213–221 (2010).

160. Emsley, P. & Cowtan, K. Coot: Model-building tools for molecular graphics. Acta Crystallogr. Sect. D Biol. Crystallogr. 60, 2126–2132 (2004).

161. PyMOL: The PyMOL molecular graphics system, version 1.4.1, Schrödinger, LLC.

162. Berntsson, R. P. A., Smits, S. H. J., Schmitt, L., Slotboom, D. J. & Poolman, B. A structural classification of substrate-binding proteins. FEBS Lett. 584, 2606–2617 (2010).

163. Clarke, T. E., Ku, S. Y., Dougan, D. R., Vogel, H. J. & Tari, L. W. The structure of the ferric siderophore binding protein FhuD complexed with gallichrome. Nat. Struct. Biol. 7, 287–291 (2000).

164. Clarke, T. E., Braun, V., Winkelmann, G., Tari, L. W. & Vogel, H. J. X-ray crystallographic structures of the Escherichia coli periplasmic protein FhuD bound to hydroxamate-type siderophores and the antibiotic albomycin. J. Biol. Chem. 277, 13966–13972 (2002).

165. Shi, R. et al. Trapping open and closed forms of FitE - a group III periplasmic binding protein. Proteins Struct. Funct. Bioinforma. 75, 598–609 (2009).

166. Grigg, J. C., Cooper, J. D., Cheung, J., Heinrichs, D. E. & Murphy, M. E. P. The Staphylococcus aureus siderophore receptor HtsA undergoes localized conformational changes to enclose staphyloferrin A in an arginine-rich binding pocket. J. Biol. Chem. 285, 11162–11171 (2010).

167. Ikeda, M., Masafumi, A., Okuno, T. & Shimizu, T. TMPDB: a database of experimentally-characterized transmembrane topologies. Nucleic Acids Res. 31, 406–409 (2003).

168. Taylor, D., Cawley, G. & Hayward, S. Quantitative method for the assignment of hinge and shear mechanism in protein domain movements. Struct. Bioinforma. 30, 3189–3196 (2014).

169. Clifton, M. C., Strong, R. K. & Raymond, K. N. Structure of YfiY from Bacillus cereus bound to the siderophore iron (III) schizokinen. To be Publ.

170. Grigg, J. C., Cheung, J., Heinrichs, D. E. & Murphy, M. E. Staphylococcus aureus SirA specificity for staphyloferrin B is driven by localized conformational change. To be Publ.

171. Grigg, J. C., Vermeiren, C. L., Heinrichs, D. E. & Murphy, M. E. P. Heme coordination by Staphylococcus aureus IsdE. J. Biol. Chem. 282, 28815–28822 (2007).

172. Ho, W. W. et al. Holo- and apo-bound structures of bacterial periplasmic heme-binding proteins. J. Biol. Chem. 282, 35796–35802 (2007).

173. Sun, X. et al. Crystal structure and metal binding properties of the lipoprotein MtsA, responsible for iron transport in Streptococcus pyogenes. Biochemistry 48, 6184–6190 (2009).

174. Loisel, E. et al. AdcAII, a new Pneumococcal Zn-binding protein homologous with ABC transporters: biochemical and structural analysis. J. Mol. Biol. 381, 594–606 (2008).

175. Cuneo, M. J. et al. Structural analysis of a periplasmic binding protein in the tripartite ATP-independent transporter family reveals a tetrameric assembly that may have a role in ligand transport. J. Biol. Chem. 283, 32812–32820 (2008).

176. van der Heide, T. & Poolman, B. ABC transporters: one, two or four extracytoplasmic substrate-binding sites? EMBO Rep. 3, 938–943 (2002).

177. Inouye, S. et al. Role of positive charge on the amino-terminal region of the signal peptide in protein secretion across the membrane. Proc. Natl. Acad. Sci. U. S. A. 79, 3438–3441 (1982).

178. Dwyer, M. A. & Hellinga, H. W. Periplasmic binding proteins: a versatile superfamily for protein engineering. Curr. Opin. Struct. Biol. 14, 495–504 (2004).

179. Doublié, S. Preparation of selenomethionyl proteins for phase determination. Methods Enzymol. 276, 523–530 (1997).

180. Adams, P. D. Substructure search procedures for macromolecular structures. Acta Crystallogr. Sect. D Biol. Crystallogr. 1966–1973 (2003).

181. Cowtan, K. The Buccaneer software for automated model building. 1. Tracing protein chains. Acta Crystallogr. Sect. D Biol. Crystallogr. 62, 1002–1011 (2006).

182. Winn, M. D. et al. Overview of the CCP4 suite and current developments. Acta Crystallogr. Sect. D Biol. Crystallogr. 67, 235–242 (2011).

183. Murshudov, G. N. et al. REFMAC5 for the refinement of macromolecular crystal structures. Acta Crystallogr. Sect. D Biol. Crystallogr. 67, 355–367 (2011).

184. Panjikar, S., Parthasarathy, V., Lamzin, V. S., Weiss, M. S. & Tucker, P. A. On the combination of molecular replacement and single-wavelength anomalous diffraction phasing for automated structure determination. Acta Crystallogr. Sect. D Biol. Crystallogr. 65, 1089–1097 (2009).

185. Otwinowski, Z. Maximum likelihood refinement of heavy atom parameters. CCP4 Study Weekend (Science and Engineering Research Council, Daresbury Laboratory, Warrington, UK, 1991).

186. Cowtan, K. Recent developments in classical density modification. Acta Crystallogr. Sect. D Biol. Crystallogr. 66, 470–478 (2010).

187. Terwilliger, T. C. Maximum likelihood density modification by pattern recognition of structural motifs. Acta Crystallogr. Sect. D Biol. Crystallogr. 57, 1755–1762 (2001).

188. Sheldrick, G. M. Experimental phasing with SHELXC/D/E: combining chain tracing with density modification. Acta Crystallogr. Sect. D Biol. Crystallogr. 66, 479–485 (2010).

189. Langer, G., Cohen, S. X., Lamzin, V. S. & Perrakis, A. Automated macromolecular model building for X-ray crystallography using ARP/wARP version 7. Nat. Protoc. 3, 1171–1179 (2008).

190. Wallace, A. C., Laskowski, R. A. & Thornton, J. M. Ligplot - a program to generate schematic diagrams of protein ligand interactions. Protein Eng. 8, 127–134 (1995).

191. Morris, G. M. et al. AutoDock4 and AutoDockTools4: automated docking with selective receptor flexibility. J. Comput. Chem. 30, 2785–2790 (2009).

192. Shao, Y. et al. Advances in methods and algorithms in a modern quantum chemistry program package. Phys. Chem. Chem. Phys. 8, 3172–3191 (2006).

193. Newman, D. J. & Cragg, G. M. Natural products as sources of new drugs over the 30 years from 1981 to 2010. J. Nat. Prod. 75, 311–335 (2012).

194. Harvey, A. L., Edrada-Ebel, R. & Quinn, R. J. The re-emergence of natural products for drug discovery in the genomics era. Nat. Rev. Drug Discov. 14, 111–129 (2015).

195. Medema, M. H. & Fischbach, M. A. Computational approaches to natural product discovery. Nat. Chem. Biol. 11, 639–648 (2015).

196. McIntosh, J. A., Donia, M. S. & Schmidt, E. W. Ribosomal peptide natural products: bridging the ribosomal and nonribosomal worlds. Nat. Prod. Rep. 26, 537–559 (2009).

197. Arnison, P. G. et al. Ribosomally synthesized and post-translationally modified peptide natural products: overview and recommendations for a universal nomenclature. Nat. Prod. Rep. 30, 108–160 (2013).

198. Ortega, M. A. & van der Donk, W. A. New insights into the biosynthetic logic of ribosomally synthesized and post-translationally modified peptide natural products. Cell Chem. Biol. 23, 31–44 (2016).

199. Crone, W. J. K., Leeper, F. J. & Truman, A. W. Identification and characterisation of the gene cluster for the anti-MRSA antibiotic bottromycin: expanding the biosynthetic diversity of ribosomal peptides. Chem. Sci. 3, 3516 (2012).

200. Xie, L. et al. Lacticin 481: in vitro reconstitution of lantibiotic synthetase activity. Science 303, 679–681 (2004).

201. Oman, T. J. & van der Donk, W. A. Follow the leader: the use of leader peptides to guide natural product biosynthesis. Nat. Chem. Biol. 6, 9–18 (2010).

202. Weiz, A. R. et al. Harnessing the evolvability of tricyclic microviridins to dissect protease-inhibitor interactions. Angew. Chemie - Int. Ed. 53, 3735–3738 (2014).

203. López-Otín, C. & Matrisian, L. M. Emerging roles of proteases in tumour suppression. Nat. Rev. Cancer 7, 800–808 (2007).

204. Tianero, M. D. et al. Metabolic model for diversity-generating biosynthesis. Proc. Natl. Acad. Sci. U. S. A. 113, 1772–1777 (2016).

205. Ziemert, N., Ishida, K., Liaimer, A., Hertweck, C. & Dittmann, E. Ribosomal synthesis of tricyclic depsipeptides in bloom-forming cyanobacteria. Angew. Chemie - Int. Ed. 47, 7756–7759 (2008).

206. Lin, D. Y., Huang, S. & Chen, J. Crystal structures of a polypeptide processing and secretion transporter. Nature 523, 425–430 (2015).

207. Weiz, A. R. et al. Leader peptide and a membrane protein scaffold guide the biosynthesis of the tricyclic peptide microviridin. Chem. Biol. 18, 1413–1421 (2011).

208. Philmus, B., Christiansen, G., Yoshida, W. Y. & Hemscheidt, T. K. Post-translational modification in microviridin biosynthesis. ChemBioChem 9, 3066–3073 (2008).

209. Philmus, B., Guerrette, J. P. & Hemscheidt, T. K. Substrate specificity and scope of MvdD, a GRASP-like ligase from the microviridin biosynthetic gene cluster. ACS Chem. Biol. 4, 429–434 (2009).

210. Zhao, G. et al. Structure and function of Escherichia coli RimK, an ATP-grasp fold, L-glutamyl ligase enzyme. Proteins Struct. Funct. Bioinforma. 81, 1847–1854 (2013).

211. Iyer, L. M., Abhiman, S., Maxwell Burroughs, A. & Aravind, L. Amidoligases with ATP-grasp, glutamine synthetase-like and acetyltransferase-like domains: synthesis of novel metabolites and peptide modifications of proteins. Mol. Biosyst. 5, 1636–1660 (2009).

212. Fawaz, M. V., Topper, M. E. & Firestine, S. M. The ATP-grasp enzymes. Bioorg. Chem. 39, 185–191 (2011).

213. Ouchi, T. et al. Lysine and arginine biosyntheses mediated by a common carrier protein in Sulfolobus. Nat. Chem. Biol. 9, 277–283 (2013).

214. Liu, S. et al. Allosteric inhibition of Staphylococcus aureus D-alanine:D-alanine ligase revealed by crystallographic studies. Proc. Natl. Acad. Sci. U. S. A. 103, 15178–15183 (2006).

215. Chou, C. Y., Yu, L. P. C. & Tong, L. Crystal structure of biotin carboxylase in complex with substrates and implications for its catalytic mechanism. J. Biol. Chem. 284, 11690–11697 (2009).

216. Wang, H. et al. Synthetic Inositol phosphate analogs reveal that PPIP5K2 has a surface-mounted substrate capture site that is a target for drug discovery. Chem. Biol. 21, 689–699 (2014).

217. Rohrlack, T., Christoffersen, K., Kaebernick, M. & Neilan, B. A. Cyanobacterial protease inhibitor microviridin J causes a lethal molting disruption in Daphnia pulicaria. Appl. Environ. Microbiol. 70, 5047–5050 (2004).

218. Rohrlack, T. et al. Isolation, characterization, and quantitative analysis of microviridin J, a new Microcystis metabolite toxic to Daphnia. J. Chem. Ecol. 29, 1757–1770 (2003).

219. Liu, Y., Zheng, T. & Bruner, S. D. Structural basis for phosphopantetheinyl carrier domain interactions in the terminal module of nonribosomal peptide synthetases. Chem. Biol. 18, 1482–1488 (2011).

220. Koehnke, J. et al. The mechanism of patellamide macrocyclization revealed by the characterization of the PatG macrocyclase domain. Nat. Struct. Mol. Biol. 19, 767–772 (2012).

221. Wang, B., Zhao, A., Novick, R. P. & Muir, T. W. Key driving forces in the biosynthesis of autoinducing peptides required for staphylococcal virulence. Proc. Natl. Acad. Sci. U. S. A. 112, 10679–10684 (2015).

222. Pan, S. J., Rajniak, J., Cheung, W. L. & Link, A. J. Construction of a single polypeptide that matures and exports the lasso peptide microcin J25. ChemBioChem 13, 367–370 (2012).

223. Oman, T. J., Knerr, P. J., Bindman, N. A., Velásquez, J. E. & van der Donk, W. A. An engineered lantibiotic synthetase that does not require a leader peptide on its substrate. J. Am. Chem. Soc. 134, 6952–6955 (2012).

224. Koehnke, J. et al. Structural analysis of leader peptide binding enables leader-free cyanobactin processing. Nat. Chem. Biol. 11, 558–563 (2015).

225. Ortega, M. A. et al. Structure and mechanism of the tRNA-dependent lantibiotic dehydratase NisB. Nature 517, 509–512 (2014).

226. Burkhart, B. J., Hudson, G. A., Dunbar, K. L. & Mitchell, D. A. A prevalent peptide-binding domain guides ribosomal natural product biosynthesis. Nat. Chem. Biol. 11, 564–570 (2015).

227. Dong, S.-H. et al. The enterococcal cytolysin synthetase has an unanticipated lipid kinase fold. Elife 4, e07607 (2015).

228. Schmidt, E. W. et al. Patellamide A and C biosynthesis by a microcin-like pathway in Prochloron didemni, the cyanobacterial symbiont of Lissoclinum patella. Proc. Natl. Acad. Sci. U. S. A. 102, 7315–7320 (2005).

229. Sievers, F. et al. Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega. Mol. Syst. Biol. 7, 539 (2011).

230. Goldenberg, O., Erez, E., Nimrod, G. & Ben-Tal, N. The ConSurf-DB: pre-calculated evolutionary conservation profiles of protein structures. Nucleic Acids Res. 37, 323–327 (2009).

231. Baker, N. A., Sept, D., Joseph, S., Holst, M. J. & McCammon, J. A. Electrostatics of nanosystems: application to microtubules and the ribosome. Proc. Natl. Acad. Sci. U. S. A. 98, 10037–10041 (2001).

232. Dolinsky, T. J. et al. PDB2PQR: expanding and upgrading automated preparation of biomolecular structures for molecular simulations. Nucleic Acids Res. 35, 522–525 (2007).

233. Komath, S. S., Kavitha, M. & Swamy, M. J. Beyond carbohydrate binding: new directions in plant lectin research. Org. Biomol. Chem. 4, 973–988 (2006).

234. Sharon, N. & Lis, H. History of lectins: from hemagglutinins to biological recognition molecules. Glycobiology 14, 53–62 (2004).

235. Sharon, N. Lectins: carbohydrate-specific reagents and biological recognition molecules. J. Biol. Chem. 282, 2753–2764 (2007).

236. Matsuda, A. et al. Lectin microarray-based sero-biomarker verification targeting aberrant O-linked glycosylation on Mucin 1. Anal. Chem. 87, 7274–7281 (2015).

237. Mody, R., Joshi, S. & Chaney, W. Use of lectins as diagnostic and therapeutic tools for cancer. J. Pharmacol. Toxicol. Methods 33, 1–10 (1995).

238. Madariaga, D. et al. Detection of tumor-associated glycopeptides by lectins: the peptide context modulates carbohydrate recognition. ACS Chem. Biol. 10, 747–756 (2015).

239. Varrot, A., Basheer, S. M. & Imberty, A. Fungal lectins: structure, function and potential applications. Curr. Opin. Struct. Biol. 23, 678–685 (2013).

240. Singh, R. S., Bhari, R. & Kaur, H. P. Characteristics of yeast lectins and their role in cell-cell interactions. Biotechnol. Adv. 29, 726–731 (2011).

241. Nowacka, N. et al. Antibacterial, antiradical potential and phenolic compounds of thirty-one Polish mushrooms. PLoS One 10, e0140355 (2015).

242. Chang, C.-J. et al. Ganoderma lucidum reduces obesity in mice by modulating the composition of the gut microbiota. Nat. Commun. 6, 7489 (2015).

243. Seow, S. L. et al. Lignosus rhinocerotis (Cooke) Ryvarden mimics the neuritogenic activity of nerve growth factor via MEK/ERK1/2 signaling pathway in PC-12 cells. Sci. Rep. 5, 16349 (2015).

244. Singh, R. S., Bhari, R. & Kaur, H. P. Mushroom lectins: current status and future perspectives. Crit. Rev. Biotechnol. 30, 99–126 (2010).

245. van Eerde, A., Grahn, E. M., Winter, H. C., Goldstein, I. J. & Krengel, U. Atomic-resolution structure of the α-galactosyl binding Lyophyllum decastes lectin reveals a new protein family found in both fungi and plants. Glycobiology 25, 492–501 (2015).

246. Vetchinkina, E. P., Nikitina, V. E., Tsivileva, O. M. & Garibova, L. V. Activity of Lentinus edodes intracellular lectins at various developmental stages of the fungus. Appl. Biochem. Microbiol. 44, 76–83 (2008).

247. Bleuler-Martínez, S. et al. A lectin-mediated resistance of higher fungi against predators and parasites. Mol. Ecol. 20, 3056–3070 (2011).

248. Ho, J. C. K., Sze, S. C. W., Shen, W. Z. & Liu, W. K. Mitogenic activity of edible mushroom lectins. Biochim. Biophys. Acta - Gen. Subj. 1671, 9–17 (2004).

249. Zhang, G. Q., Sun, J., Wang, H. X. & Ng, T. B. A novel lectin with antiproliferative activity from the medicinal mushroom Pholiota adiposa. Acta Biochim. Pol. 56, 415–421 (2009).

250. Sun, H., Zhao, C. G., Tong, X. & Qi, Y. P. A lectin with mycelia differentiation and antiphytovirus activities from the edible mushroom Agrocybe aegerita. J. Biochem. Mol. Biol. 36, 214–222 (2003).

251. Patel, S. & Goyal, A. Recent developments in mushrooms as anti-cancer therapeutics: a review. 3 Biotech 2, 1–15 (2012).

252. Bovi, M. et al. Structure of a lectin with antitumoral properties in king bolete (Boletus edulis) mushrooms. Glycobiology 21, 1000–1009 (2011).

253. Pohleven, J. et al. Bivalent carbohydrate binding is required for biological activity of Clitocybe nebularis lectin (CNL), the N,N′-diacetyllactosediamine (GalNAcβ1-4GlcNAc,LacdiNAc)-specific lectin from basidiomycete C. nebularis. J. Biol. Chem. 287, 10602–10612 (2012).

254. Žurga, S. et al. A novel β-trefoil lectin from the parasol mushroom (Macrolepiota procera) is nematotoxic. FEBS J. 281, 3489–3506 (2014).

255. Wu, L., Wu, Z., Lin, Q. & Xie, L. Purification and activities of an alkaline protein from mushroom Coprinus comatus. Acta Microbiol. Sin. 43, 793–798 (2003).

256. Simossis, V. A. & Heringa, J. PRALINE: a multiple sequence alignment toolbox that integrates homology-extended and secondary structure information. Nucleic Acids Res. 33, 289–294 (2005).

257. Stojković, D. et al. Nutrients and non-nutrients composition and bioactivity of wild and cultivated Coprinus comatus (O.F.Müll.) Pers. Food Chem. Toxicol. 59, 289–296 (2013).

258. Petersen, T. N., Brunak, S., von Heijne, G. & Nielsen, H. SignalP 4.0: discriminating signal peptides from transmembrane regions. Nat. Methods 8, 785–786 (2011).

259. Chuang, G. Y. et al. Computational prediction of N-linked glycosylation incorporating structural properties and patterns. Bioinformatics 28, 2249–2255 (2012).

260. Hamby, S. E. & Hirst, J. D. Prediction of glycosylation sites using random forests. BMC Bioinformatics 9, 500 (2008).

261. Yao, J. X. ACORN in CCP4 and its applications. Acta Crystallogr. Sect. D Biol. Crystallogr. 58, 1941–1947 (2002).

262. Rappleye, J., Innus, M., Weeks, C. M. & Miller, R. SnB version 2.2: an example of crystallographic multiprocessing. J. Appl. Crystallogr. 35, 374–376 (2002).

263. Rodríguez, D. D. et al. Crystallographic ab initio protein structure solution below atomic resolution. Nat. Methods 6, 651–653 (2009).

264. Read, R. J., Adams, P. D. & McCoy, A. J. Intensity statistics in the presence of translational noncrystallographic symmetry. Acta Crystallogr. Sect. D Biol. Crystallogr. 69, 176–183 (2013).

265. Sliwiak, J., Jaskolski, M., Dauter, Z., McCoy, A. J. & Read, R. J. Likelihood-based molecular-replacement solution for a highly pathological crystal with tetartohedral twinning and sevenfold translational noncrystallographic symmetry. Acta Crystallogr. Sect. D Biol. Crystallogr. 70, 471–480 (2014).

266. Panjikar, S., Parthasarathy, V., Lamzin, V. S., Weiss, M. S. & Tucker, P. A. Auto-Rickshaw: an automated crystal structure determination platform as an efficient tool for the validation of an X-ray diffraction experiment. Acta Crystallogr. Sect. D Biol. Crystallogr. 61, 449–457 (2005).

267. Zhao, H. L. et al. Secretory expression of glycosylated and aglycosylated mutein of onconase from Pichia pastoris using different secretion signals and their purification and characterization. FEMS Yeast Res. 9, 591–599 (2009).

268. Huang, K.-F., Liu, Y.-L., Cheng, W.-J., Ko, T.-P. & Wang, A. H.-J. Crystal structures of human glutaminyl cyclase, an enzyme responsible for protein N-terminal pyroglutamate formation. Proc. Natl. Acad. Sci. U. S. A. 102, 13117–13122 (2005).

269. Zhang, Z. & Henzel, W. J. Signal peptide prediction based on analysis of experimentally verified cleavage sites. Protein Sci. 13, 2819–2824 (2004).

270. Miyakawa, T. et al. A secreted protein of the plant-specific DUF26 family functions as a mannose-binding lectin that exhibits antifungal activity. Plant Physiol. 166, 766–778 (2014).

271. Miyakawa, T., Miyazono, K. I., Sawano, Y., Hatano, K. I. & Tanokura, M. Crystal structure of ginkbilobin-2 with homology to the extracellular domain of plant cysteine-rich receptor-like kinases. Proteins Struct. Funct. Bioinforma. 77, 247–251 (2009).

272. Goldstein, I. J. et al. A new α-galactosyl-binding protein from the mushroom Lyophyllum decastes. Arch. Biochem. Biophys. 467, 268–274 (2007).

273. Kashiwagi, T. et al. The novel acidophilic structure of the killer toxin from halotolerant yeast demonstrates remarkable folding similarity with a fungal killer toxin. Structure 5, 81–94 (1997).

274. Kobayashi, Y. et al. Purification, characterization, and sugar binding specificity of an N-Glycolylneuraminic acid-specific lectin from the mushroom Chlorophyllum molybdites. J. Biol. Chem. 279, 53048–53055 (2004).

275. Heras, B. & Martin, J. L. Post-crystallization treatments for improving diffraction quality of protein crystals. Acta Crystallogr. Sect. D Biol. Crystallogr. 61, 1173–1180 (2005).

276. Perez, S. et al. Glyco3D: a portal for structural glycoscience. (2013).

277. Cioci, G. et al. Beta-propeller crystal structure of Psathyrella velutina lectin: an integrin-like fungal protein interacting with monosaccharides and calcium. J. Mol. Biol. 357, 1575–1591 (2006).

278. Grahn, E. et al. Crystal structure of the Marasmius oreades mushroom lectin in complex with a xenotransplantation epitope. J. Mol. Biol. 369, 710–721 (2007).

279. Sousa et al. High-resolution structure of a new Tn antigen-binding lectin from Vatairea macrocarpa and a comparative analysis of Tn-binding legume lectins. Int. J. Biochem. Cell Biol. 59, 103–110 (2015).

280. McMahon, S. A. et al. The C-type lectin fold as an evolutionary solution for massive sequence variation. Nat. Struct. Mol. Biol. 12, 886–892 (2005).

281. Patil, D. N. et al. Structural investigation of a novel N-acetyl glucosamine binding chi-lectin which reveals evolutionary relationship with class III chitinases. PLoS One 8, e63779 (2013).

282. Feeney, B. & Clark, A. C. Reassembly of active caspase-3 is facilitated by the propeptide. J. Biol. Chem. 280, 39772–39785 (2005).

283. Cade, C., Swartz, P., MacKenzie, S. H. & Clark, A. C. Modifying caspase-3 activity by altering allosteric networks. Biochemistry 53, 7582–7595 (2014).

284. Kanagawa, M. et al. Structural basis for multiple sugar recognition of Jacalin-related human ZG16p lectin. J. Biol. Chem. 289, 16954–16965 (2014).

285. Dolinsky, T. J., Nielsen, J. E., McCammon, J. A. & Baker, N. A. PDB2PQR: an automated pipeline for the setup of Poisson-Boltzmann electrostatics calculations. Nucleic Acids Res. 32, W665–667 (2004).

286. Fernández, C. et al. NMR solution structure of the pathogenesis-related protein P14a. J. Mol. Biol. 266, 576–593 (1997).

287. Stintzi, A. et al. Plant ‘pathogenesis-related’ proteins and their role in defense against pathogens. Biochimie 75, 687–706 (1993).

288. Wang, D., Weaver, N. D., Kesarwani, M. & Dong, X. Induction of protein secretory pathway is required for systemic acquired resistance. Science 308, 1036–1040 (2005).

289. Liu, X., Li, H. & Zhang, W. The lectin from Musa paradisiaca binds with the capsid protein of tobacco mosaic virus and prevents viral infection. Biotechnol. Biotechnol. Equip. 28, 408–416 (2014).

290. Abhinav, K. V., Samuel, E. & Vijayan, M. Archeal lectins: An identification through a genomic search. Proteins Struct. Funct. Bioinforma. 84, 21–30 (2016).

291. Ellman, G. L. Tissue sulfhydryl groups. Arch. Biochem. Biophys. 82, 70–77 (1959).

292. Masuko, T. et al. Carbohydrate analysis by a phenol-sulfuric acid method in microplate format. Anal. Biochem. 339, 69–72 (2005).

293. Abrahams, J. P. & Leslie, A. G. Methods used in the structure determination of bovine mitochondrial F1 ATPase. Acta Crystallogr. Sect. D Biol. Crystallogr. 52, 30–42 (1996).

294. Moriarty, N. W., Grosse-Kunstleve, R. W. & Adams, P. D. Electronic Ligand Builder and Optimization Workbench (eLBOW): a tool for ligand coordinate and restraint generation. Acta Crystallogr. Sect. D Biol. Crystallogr. 65, 1074–1080 (2009).

295. Demain, A. L. Microbial production of primary metabolites. Naturwissenschaften 67, 582–587 (1980).

296. Maplestone, R. A., Stone, M. J. & Williams, D. H. The evolutionary role of secondary metabolites--a review. Gene 115, 151–157 (1992).

297. Cairns, R. A., Harris, I. S. & Mak, T. W. Regulation of cancer cell metabolism. Nat. Rev. Cancer 11, 85–95 (2011).

298. Golubev, A. G. The other side of metabolism: a review. Biochemistry 61, 1443–1460 (1996).

299. Linster, C. L., Van Schaftingen, E. & Hanson, A. D. Metabolite damage and its repair or pre-emption. Nat. Chem. Biol. 9, 72–80 (2013).

300. Keller, M. A., Piedrafita, G. & Ralser, M. The widespread role of non-enzymatic reactions in cellular metabolism. Curr. Opin. Biotechnol. 34C, 153–161 (2015).

301. Notebaart, R. A. et al. Network-level architecture and the evolutionary potential of underground metabolism. Proc. Natl. Acad. Sci. U. S. A. 111, 11762–11767 (2014).

302. Hanson, A. D., Henry, C. S., Fiehn, O. & de Crécy-Lagard, V. Metabolite damage and metabolite damage control in plants. Annu. Rev. Plant Biol. 67, annurev–arplant–043015–111648 (2016).

303. Bradbury, L. M. T., Ziemak, M. J., Elbadawi-Sidhu, M., Fiehn, O. & Hanson, A. D. Plant-driven repurposing of the ancient S-adenosylmethionine repair enzyme homocysteine S-methyltransferase. Biochem. J. 463, 279–286 (2014).

304. Li, K., Li, G., Bradbury, L. M. T., Hanson, A. D. & Bruner, S. D. Crystal structure of the homocysteine methyltransferase MmuM from Escherichia coli. Biochem. J. 473, 277–284 (2016).

305. Ferla, M. P. & Patrick, W. M. Bacterial methionine biosynthesis. Microbiology 160, 1571–1584 (2014).

306. Pasamontes, A. & Garcia-Vallve, S. Use of a multi-way method to analyze the amino acid composition of a conserved group of orthologous proteins in prokaryotes. BMC Bioinformatics 7, 257 (2006).

307. Valley, C. C. et al. The methionine-aromatic motif plays a unique role in stabilizing protein structure. J. Biol. Chem. 287, 34979–34991 (2012).

308. Chiang, P. K. et al. S-Adenosylmethionine and methylation. FASEB J. 10, 471–480 (1996).

309. Alaminos, M. & Ramos, J. L. The methionine biosynthetic pathway from homoserine in Pseudomonas putida involves the metW, metX, metZ, metH and metE gene products. Arch. Microbiol. 176, 151–154 (2001).

310. Frank, A., Cohen, H., Hoffman, D. & Amir, R. Methionine and S-methylmethionine exhibit temporal and spatial accumulation patterns during the Arabidopsis life cycle. Amino Acids 47, 497–510 (2015).

311. Blanco, J., Moore, R. A., Kabaleeswaran, V. & Viola, R. E. A structural basis for the mechanism of aspartate-beta-semialdehyde dehydrogenase from Vibrio cholerae. Protein Sci. 12, 27–33 (2003).

312. Bourgis, F. et al. S-Methylmethionine plays a major role in phloem sulfur transport and is synthesized by a novel type of methyltransferase. Plant Cell 11, 1485–1498 (1999).

313. Ranocha, P. et al. The S-methylmethionine cycle in angiosperms: ubiquity, antiquity and activity. Plant J. 25, 575–584 (2001).

314. Mudd, S. H. & Datko, A. H. The S-Methylmethionine cycle in Lemna paucicostata. Plant Physiol. 93, 623–630 (1990).

315. Hanson, A. D. & Roje, S. One-carbon metabolism in higher plants. Annu. Rev. Plant Physiol. Plant Mol. Biol. 52, 119–137 (2001).

316. Vinci, C. R. & Clarke, S. G. Homocysteine methyltransferases Mht1 and Sam4 prevent the accumulation of age-damaged (R,S)-AdoMet in the yeast Saccharomyces cerevisiae. J. Biol. Chem. 285, 20526–20531 (2010).

317. Vinci, C. R. & Clarke, S. G. Recognition of age-damaged (R,S)-adenosyl-L-methionine by two methyltransferases in the yeast Saccharomyces cerevisiae. J. Biol. Chem. 282, 8604–8612 (2007).

318. Brosnan, J. T. & Brosnan, M. E. The sulfur-containing amino acids: an overview. J. Nutr. 136, 1636–1640 (2006).

319. Hoffman, J. L. Chromatographic analysis of the chiral and covalent instability of S-adenosyl-L-methionine. Biochemistry 25, 4444–4449 (1986).

320. Neuhierl, B., Thanbichler, M., Lottspeich, F. & Böck, A. A family of S-methylmethionine-dependent thiol/selenol methyltransferases. Role in selenium tolerance and evolutionary relation. J. Biol. Chem. 274, 5407–5414 (1999).

321. Thanbichler, M., Neuhierl, B. & Böck, A. S-Methylmethionine metabolism in Escherichia coli. J. Bacteriol. 181, 662 (1999).

322. Evans, J. C. et al. Betaine-homocysteine methyltransferase: Zinc in a distorted barrel. Structure 10, 1159–1171 (2002).

323. Evans, J. C. et al. Structures of the N-terminal modules imply large domain motions during catalysis by methionine synthase. Proc. Natl. Acad. Sci. U. S. A. 101, 3729–3736 (2004).

324. Ying, J. et al. Molecular variation and horizontal gene transfer of the homocysteine methyltransferase gene mmuM and its distribution in clinical pathogens. Int. J. Biol. Sci. 11, 11–21 (2015).

325. Pejchal, R. & Ludwig, M. L. Cobalamin-independent methionine synthase (MetE): A face-to-face double barrel that evolved by gene duplication. PLoS Biol. 3, 0254–0265 (2005).

326. Datta, S., Koutmos, M., Pattridge, K. A., Ludwig, M. L. & Matthews, R. G. A disulfide-stabilized conformer of methionine synthase reveals an unexpected role for the histidine ligand of the cobalamin cofactor. Proc. Natl. Acad. Sci. U. S. A. 105, 4115–4120 (2008).

327. El-Hajj, Z. W., Reyes-Lamothe, R. & Newman, E. B. Cell division, one-carbon metabolism and methionine synthesis in a metK-deficient Escherichia coli mutant, and a role for MmuM. Microbiology 159, 2036–2048 (2013).

328. Szegedi, S. S., Castro, C. C., Koutmos, M. & Garrow, T. A. Betaine-homocysteine S-methyltransferase-2 is an S-methylmethionine-homocysteine methyltransferase. J. Biol. Chem. 283, 8939–8945 (2008).

329. Ganu, R. S. et al. Evolutionary analyses and natural selection of betaine-homocysteine S-methyltransferase (BHMT) and BHMT2 Genes. PLoS One 10, e0134084 (2015).

330. Vinci, C. R. & Clarke, S. G. Yeast, plants, worms, and flies use a methyltransferase to metabolize age-damaged (R,S)-AdoMet, but what do mammals do? Rejuvenation Res. 13, 362–364 (2010).

331. Wierenga, R. K. The TIM-barrel fold: A versatile framework for efficient enzymes. FEBS Lett. 492, 193–198 (2001).

332. González, B., Pajares, M. A., Martínez-Ripoll, M., Blundell, T. L. & Sanz-Aparicio, J. Crystal structure of rat liver betaine homocysteine S-methyltransferase reveals new oligomerization features and conformational changes upon substrate binding. J. Mol. Biol. 338, 771–782 (2004).

333. Castro, C. et al. Dissecting the catalytic mechanism of betaine-homocysteine S-methyltransferase by use of intrinsic tryptophan fluorescence and site-directed mutagenesis. Biochemistry 43, 5341–5351 (2004).

334. González, B. et al. Active-site-mutagenesis study of rat liver betaine-homocysteine S-methyltransferase. Biochem. J. 370, 945–952 (2003).

335. Mládková, J. et al. Specific potassium ion interactions facilitate homocysteine binding to betaine-homocysteine S-methyltransferase. Proteins 82, 2552–2564 (2014).

336. Vagin, A. & Teplyakov, A. Molecular replacement with MOLREP. Acta Crystallogr. Sect. D Biol. Crystallogr. 66, 22–25 (2010).

337. Noske, R., Cornelius, F. & Clarke, R. J. Investigation of the enzymatic activity of the Na+,K+-ATPase via isothermal titration microcalorimetry. Biochim. Biophys. Acta - Bioenerg. 1797, 1540–1545 (2010).

BIOGRAPHICAL SKETCH

Kunhua Li was born in Kunming, Yunnan province, China in 1988. He attended

Yunnan University Secondary Middle School, and Kunming No. 1 High School. Kunhua

represented Yunnan Province in the 2006 National Chemistry Olympiad Competition

and was admitted to Peking University (PKU) for undergraduate study, where he

majored in Chemistry. Kunhua joined the Key Laboratory of Polymer Chemistry under

Professor Yu-Guo Ma in 2007, and focused his undergraduate research on the

synthesis and characterization of amphiphilic supramolecules. Kunhua also worked in

the College of Pharmaceutical Science from 2009 to 2010, which inspired his interest in

biochemistry. Kunhua received multiple awards while at PKU, including several

outstanding student awards and two grants for undergraduate research.

After obtaining his Bachelor of Science (BS) degree from PKU, Kunhua moved to the

University of Florida (UF) to begin his PhD in Biochemistry, where he worked under

Professor Steven D. Bruner. While at UF, Kunhua focused on natural product pathways

and made heavy use of protein crystallography, organic synthesis, chemical biology, and

cell biology in his projects. Kunhua worked on the siderophore-based iron

acquisition and utilization project, which resulted in several publications. Kunhua

participated in several long-standing collaborative projects with Professor Andrew D.

Hanson and Professor Yousong Ding, parts of which are highlighted in this dissertation.

Kunhua also focused on structural insight into the microviridin biosynthesis pathway

as his dissertation project. Besides graduate research, Kunhua devoted himself to

teaching and earned a Master of Science in Teaching (Chemistry) in 2013. While at UF,

Kunhua was named an outstanding international student, was awarded the UF CLAS

Dissertation Fellowship, and received several travel grants.

In his future research, Kunhua intends to use his interdisciplinary skills to

pursue a career applying structural approaches to understand the protein-protein

interactions related to cell signaling and regulation. Kunhua is particularly interested in

cancer-related research and will join Harvard Medical School (HMS) and Dana-Farber

Cancer Institute (DFCI) as a post-doctoral fellow, where he will work with Professor

Michael J. Eck.
