
CHAPTER 1

INTRODUCTION

1.1 STATINS

Statins are among the largest selling drugs available worldwide.

They are cholesterol-lowering agents used to treat and prevent coronary

diseases by reducing cholesterol levels in the blood (Maron et al 2000). High

levels of low-density lipoprotein (LDL) form plaques in the artery walls and

decrease the flow of blood and oxygen to the heart which results in heart

attack (Ross 1999). Statins lower plasma LDL cholesterol levels by blocking

HMG-CoA reductase in the liver, which in turn stimulates LDL receptors to

clear LDL from the bloodstream, thus helping to prevent heart attack (Shepherd 2006). In

addition to the treatment of cardiovascular disease, statins can also be used in

treating conditions ranging from Alzheimer's disease to high blood pressure and cancer (Austen 2002,

Dale et al 2006). The statins can be broadly categorized into fermentation-

derived statins and synthetic statins.

1.1.1 Fermentation-Derived Statins

Lovastatin and pravastatin belong to the category of natural statins or

fermentation-derived statins, while simvastatin is a semi-synthetic statin.

Lovastatin (Mevacor – manufactured by Merck) was the first statin

approved by the FDA, in August 1987 (von Haehling and Anker 2005), for the

treatment of hypercholesterolemia.

Pravastatin (Pravachol – manufactured by Bristol-Myers Squibb (BMS))

is an antilipemic microbial metabolite isolated from cultures of

Nocardia autotrophica. It acts as a competitive inhibitor of HMG-CoA

reductase, the enzyme which catalyzes the conversion of HMG-CoA to

mevalonate, a key step in cholesterol synthesis (Todd and Goa 1990).

Pravastatin lowers plasma cholesterol and lipoprotein levels.

Simvastatin (Zocor – manufactured by Merck) is a lipid-lowering

agent derived semi-synthetically from lovastatin; it lowers LDL by up to 50%.

This lovastatin derivative is efficiently synthesized from monacolin J

(lovastatin without the side chain) by a process that uses the Aspergillus terreus

acyltransferase LovD. Simvastatin competitively inhibits hepatic

hydroxymethyl-glutaryl coenzyme A (HMG-CoA) reductase. Huvastatin is a

semi-synthetic statin which is also used in the treatment of hyperlipidemia,

with good medical effect at a lower dose. LovD converts 6-hydroxyl-6-desmethylmonacolin J

into huvastatin when an alpha-dimethylbutyryl-N-acetylcysteamine

thioester is used as the acyl donor (Xie et al 2006).

Both pravastatin and simvastatin lower plasma cholesterol and

lipoprotein levels, and modulate immune responses by suppressing MHC II

(Major Histocompatibility Complex II) on interferon gamma-stimulated,

antigen-presenting cells such as human vascular endothelial cells.

1.1.2 Synthetic Statins

Atorvastatin (Lipitor - manufactured by Pfizer) is used to reduce

the amounts of bad cholesterol (LDL), total cholesterol, triglycerides (another

type of fat), and apolipoprotein B (a protein needed to make cholesterol), but

increases the level of good cholesterol (HDL) in the blood (Funatsu et al

2001). These actions are important in reducing the risk of hardening of the

arteries, which can lead to heart attacks, stroke, and peripheral vascular

disease.

Cerivastatin (Baycol – marketed by the pharmaceutical company

Bayer A.G.) was introduced in the late 1990s as a new synthetic statin to

compete with Pfizer's highly successful atorvastatin. On August 8, 2001 the

U.S. Food and Drug Administration (FDA) announced that Bayer had

voluntarily withdrawn Baycol from the U.S. market because of reports of fatal

rhabdomyolysis (Angelmar 2007).

Fluvastatin (Lescol – manufactured by Novartis) is a synthetic

lipid-lowering agent with antilipidemic and potential antineoplastic properties.

Fluvastatin competitively inhibits hepatic 3-hydroxy-3-methylglutaryl

coenzyme A (HMG-CoA) reductase, which catalyzes the conversion of

HMG-CoA to mevalonate, a key step in cholesterol synthesis. Fluvastatin has

also been shown to exhibit antiviral activity against Hepatitis C (Milazzo et al

2009). The extended-release formulation of fluvastatin is marketed as Lescol XL.

Pitavastatin (usually formulated as a calcium salt) is a

newer member of the statin class (Mukhtar et al 2005). Like the other statins, it is an

inhibitor of HMG-CoA reductase, the enzyme that catalyses the rate-limiting step of

cholesterol synthesis. It has been available in Japan since 2003, and is

marketed under license in South Korea and in India.

Rosuvastatin (Crestor) (Teramoto and Watkins 2005) is a newer

member of the HMG-CoA reductase inhibitors. In addition to lowering LDL (low-density

lipoprotein) or bad cholesterol, rosuvastatin has been

shown to provide a significant increase in HDL (high-density lipoprotein) or

good cholesterol.

1.2 STATIN MARKET

Merck's Mevacor (lovastatin), Bristol-Myers Squibb's Pravachol

(pravastatin sodium), Merck's Zocor (simvastatin), Novartis' Lescol

(fluvastatin sodium), AstraZeneca's Crestor (rosuvastatin calcium) and

Merck/Schering-Plough's Vytorin (ezetimibe/simvastatin) are among the leading statin

brands in the world market.

Biocon, India's largest USFDA-qualified producer and exporter

of statins, is the market leader. Other prominent players in this market include

Ranbaxy, Lupin, Themis Medicare, RPG Life Sciences, Claris Lifesciences,

Intas Pharma, Medley, Sun Pharma, USV, Concord Biotech, Emcure, Zydus

Cadila, Torrent, Cadila Pharma, Carsyon (a division of Micro Labs) and Cipla.

Among these, Biocon, Ranbaxy, Themis Medicare and Zydus Cadila focus

more on exports.

Pfizer's Lipitor (atorvastatin calcium) is the largest-selling

pharmaceutical product in the world. Lipitor generated revenues of

$13.2 billion in the year 2009 (Kaitin 2010).

The market value of lovastatin is higher because it also serves as the

immediate precursor of the multi-billion-dollar drug simvastatin (Campbell and

Vederas 2010).

1.3 STATIN SIDE EFFECTS

All statins have side effects. The most prominent side

effect, cited in the withdrawal of cerivastatin, is rhabdomyolysis, the

breakdown of skeletal muscle tissue (Angelmar 2007). Most

statins are metabolized in part by one or more hepatic cytochrome P450

enzymes, leading to an increased potential for drug interactions and problems

with certain foods (such as grapefruit juice). Pitavastatin appears to be a

substrate of CYP2C9 and not CYP3A4, which is a common source of

interactions with other statins.

1.4 WHY LOVASTATIN?

The multistep synthesis of lovastatin derivatives by

enzymatic transformations using lipases and esterases is important for the

engineered biosynthesis of pharmaceutically relevant statin compounds and for

cost reduction. Simvastatin is a semi-synthetic derivative of the fungal

polyketide lovastatin, and is an important drug for lowering cholesterol levels

in adults. Production of these statins through a combinatorial synthesis approach is not

economically feasible because of their complex structures. Alternative methods

such as genetic engineering for increasing the yield are also time-consuming and

expensive. In contrast, the fermentation process can be used to increase the yield of the

secondary metabolite lovastatin by the addition of certain inhibitors, without

incurring any additional infrastructure cost.

1.5 LOVASTATIN

Lovastatin, a polyketide, is produced by A. terreus and is used in

various biomedical applications including the treatment of cardiovascular

disease, Alzheimer's disease, renal disease and cancer (Seenivasan et al 2008).

Although lovastatin as a drug is usually well tolerated, in rare cases it causes

myopathy or rhabdomyolysis. As with other statins metabolized by CYP3A4, drinking grapefruit

juice during therapy increases the risk of serious side effects by decreasing the

metabolism of the drug and increasing its plasma concentration.

1.5.1 Structure

All statins have a common hexahydronaphthalene system and a

β-hydroxy lactone. Lovastatin, being a member of the statin family, also possesses a

hexahydronaphthalene structure and a β-hydroxy lactone (Figure 1.1). The

side chains, a methylbutyryl group and a methyl group, are present at the 8α and 6α positions of the

naphthalene ring, respectively.

Figure 1.1 Chemical structure of lovastatin

1.5.2 Systematic (IUPAC) Name

(1S,3R,7S,8S,8aR)-8-{2-[(2R,4R)-4-hydroxy-6-oxooxan-2-yl]ethyl}-3,7-dimethyl-1,2,3,7,8,8a-hexahydronaphthalen-1-yl (2S)-2-methylbutanoate.

1.5.3 Chemical Data

Formula C24H36O5

Mol. mass 404.54 g/mol
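
The molar mass quoted above can be checked directly from the molecular formula. The following is a minimal sketch, assuming rounded standard atomic weights for C, H and O; the function name `molar_mass` and the weight table are illustrative and handle only simple formulas without brackets.

```python
import re

# Rounded standard atomic weights in g/mol (assumed values, only C, H, O needed here).
ATOMIC_WEIGHT = {"C": 12.011, "H": 1.008, "O": 15.999}

def molar_mass(formula: str) -> float:
    """Sum atomic weights for a simple formula such as 'C24H36O5' (no brackets)."""
    total = 0.0
    for symbol, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += ATOMIC_WEIGHT[symbol] * (int(count) if count else 1)
    return total

lovastatin_mass = molar_mass("C24H36O5")
print(f"{lovastatin_mass:.2f} g/mol")
```

With these weights the result agrees with the tabulated 404.54 g/mol to within rounding of the atomic weights.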

1.5.4 Pharmacokinetic Data

Oral Bioavailability : <5%

Protein binding : >95%

Metabolism : hepatic (CYP3A substrate)

Half life : 1.1-1.7 hours

Excretion : negligible

Routes : oral

1.5.5 An Overview of Lovastatin Biosynthetic Cluster

In contrast to most primary metabolism genes, the genes involved

in secondary metabolism and certain nutrient utilization pathways are

clustered in fungi. A cluster of 18 genes is involved in lovastatin biosynthesis

in A. terreus. The lovastatin biosynthetic gene cluster consists of lovastatin

nonaketide synthase (lovB), lovastatin diketide synthase (lovF), enoyl

reductase gene (lovC), transesterase gene (lovD), HMG-CoA reductase gene

(ORF8), regulatory genes (lovE and ORF13) and cytochrome P450

monooxygenase gene.

The biosynthesis of lovastatin (Xie et al 2006) is coordinated by

two megasynthases, lovastatin nonaketide synthase (LNKS, encoded by lovB) and lovastatin

diketide synthase (LDKS, encoded by lovF), and numerous accessory enzymes. LNKS acts

iteratively and, along with the dissociated enoyl reductase (LovC), synthesizes the

intermediate dihydromonacolin L, while methylbutyrate is supplied by LDKS.

The overall chemical reaction catalyzed by the enzyme LNKS

(EC 2.3.1.161) is given in Equation (1.1).

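$$\text{acetyl-CoA} + 8\,\text{malonyl-CoA} + 11\,\text{NADPH} + 10\,\text{H}^+ + \text{S-adenosyl-L-methionine} \rightarrow \text{dihydromonacolin L} + 9\,\text{CoA} + 8\,\text{CO}_2 + 11\,\text{NADP}^+ + \text{S-adenosyl-L-homocysteine} + 6\,\text{H}_2\text{O} \tag{1.1}$$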

LDKS acts non-iteratively and is involved in the synthesis of the

5-carbon unit 2R-2-methylbutyrate (Equation (1.2)). Because LDKS lacks a

transesterase domain, the diketide is released by LovD.

Dihydromonacolin L produced by LNKS is converted to monacolin J.

The C-8 hydroxyl group of monacolin J is then esterified with 2-methylbutyrate

to form lovastatin.

The overall chemical reaction catalyzed by the enzyme LDKS is

given below:

$$\text{acetyl-CoA} + 6\,\text{malonyl-CoA} + \text{S-adenosyl-L-methionine} \rightarrow \text{2R-2-methylbutyrate} \tag{1.2}$$

There are 35 enzymatic reaction steps (Appendix 1) involved in

biosynthesis of dihydromonacolin L from acetyl coenzyme A, malonyl

coenzyme A, NADPH, and S-adenosylmethionine (SAM). The double

oxidation of dihydromonacolin L to monacolin J is catalyzed by CYP450

oxygenases.

1.6 SOURCE ORGANISMS

The organisms producing lovastatin include Pleurotus ostreatus,

Penicillium citrinum, P. funiculosum, Monascus ruber, M. purpureus, M. paxi,

M. anka, Aspergillus terreus, A. flavipes, A. fischeri, A. flavus, A. umbrosus, A.

parasiticus, Acremonium chrysogenum, Trichoderma viride and

T. longibrachiatum, of which A. terreus is the most prominent

lovastatin producer (Samiee et al 2003).

1.6.1 Aspergillus terreus

Aspergillus terreus is the major source organism for lovastatin

production (Seenivasan et al 2008). The scientific classification of A. terreus

is given in Table 1.1. It has both medical and industrial importance. It acts

as an opportunistic pathogen and causes aspergillosis, a pulmonary disorder

(Mokaddas et al 2010). It also produces commercially important enzymes such as

xylanase and various toxins, including asterrein, patulin, terreic acid and citrinin.

Aspergillus terreus is ubiquitous in nature, i.e. it is common and

widespread. Aspergillus species are among the most successful groups of molds, with

important roles in natural ecosystems and the human economy. The fungus

shows tenacious resistance to amphotericin B, which is a crucial

treatment for fungal infections.

Table 1.1 The Scientific classification of Aspergillus terreus

Kingdom : Fungi

Subkingdom : Dikarya

Phylum : Ascomycota

Class : Eurotiomycetes

Order : Eurotiales

Family : Trichocomaceae

Genus : Aspergillus

Species : A. terreus

1.6.1.1 Colony characteristics

A. terreus is usually identified by its macroscopic colony

morphology. Its colonies range from cinnamon-buff to sand brown in color.

The colonies grow rapidly and their texture is downy to powdery.

1.6.1.2 Microscopy

The conidial heads are compactly columnar. Conidiophore

stipes are hyaline and smooth-walled. Conidiogenous cells are biseriate, with

phialides limited mainly to the upper part of the subspherical vesicle surface.

Conidia are in chains, round to ovoid, hyaline to slightly yellow and smooth-walled.

The microscopic structure of A. terreus NIH2624 is given in

Figure 1.2.

Figure 1.2 The microscopic structure of A. terreus NIH2624

Courtesy: (http://www.mycology.adelaide.edu.au/Fungal_Descriptions/Hyphomycetes_(hyaline)/Aspergillus/terreus.html)

1.6.1.3 Pathogenicity

The species occasionally causes pulmonary aspergillosis in humans.

The fungus is also found as an isolate from otomycosis (ear infection) and

onychomycosis (infection of finger or toe nails).

1.6.1.4 Ecology

Aspergillus terreus is cosmopolitan and more common in tropical

or subtropical areas. It grows in warmer soil and in grains, straw, cotton, and

decomposing vegetation.

1.6.1.5 Genome information

Whole-genome sequence analysis of A. terreus is required for

a detailed understanding of the metabolism that underlies lovastatin

biosynthesis. Although Aspergillus terreus ATCC 20542 is widely used in

industry for the production of lovastatin, its genome sequence is not available

for further computational analysis. Aspergillus terreus strain NIH2624,

sequenced by the Broad Fungal Genome Initiative, funded by the National

Institute of Allergy and Infectious Diseases (NIAID) through the Broad's

Microbial Sequencing Center (MSC) (http://www.broadinstitute.org/seq/msc)

using the whole-genome shotgun sequencing method, was used for computational

analysis in this study. The genome of A. terreus is haploid, about 35

Mb in size and organized in 8 chromosomes. Its G+C content is about

50-60%. The genome assembly has 267 contigs placed into 26 scaffolds.

1.7 THE INTEGRATED APPROACH FOR AUGMENTING

INDUSTRIAL PRODUCTS

Several tools are being developed for handling the huge volume of dynamic

data stored in databases. Integrating tools from different areas of

bioinformatics to understand biological systems can be further utilized in

augmenting industrial bio-products. Exploiting the data obtained

from genomics and the related research areas of genome-wide

transcriptomics, proteomics and metabolomics helps in understanding

biological systems. The production of desired industrial bio-products can be

improved using stoichiometric or kinetic modeling approaches (Reed et al

2010). Oksman-Caldentey and Saito (2005) integrated genomic data

with metabolic profiles to identify key genes that could be

engineered for the production of improved crop plants. L-valine production

by Escherichia coli was improved through transcriptome analysis and gene

knockout simulation of the in silico genome-scale metabolic network (Park

et al 2007). Askenazi et al (2003) used association analysis of transcription

and metabolic profiles to improve a strain for lovastatin production.

Aldor and Keasling (2003) have used external substrate manipulation,

inhibitor addition, recombinant gene expression, host cell genome

manipulation and protein engineering of polyhydroxyalkanoates (PHA)

biosynthetic enzymes. Thus the integration of different strategies can be used

for manipulation of important secondary metabolites.

1.7.1 Functional Annotation

The functions of more than 40% of the genes in a newly sequenced genome are

initially unknown, and their products are labeled as hypothetical proteins (proteins

with unknown function). These hypothetical proteins lead to gaps in

metabolic networks, which can be filled using functional reannotation data

(Vongsangnak et al 2008). Functional annotation is the assignment of functions to

genes/proteins. Computational functional annotation is a rapid

way of studying the bio-molecular activities of a protein, since

conventional functional assignment through biochemical methods

is time-consuming and costly (Joshi et al 2004). An integrated

genome-scale reannotation is the most promising approach to predict the

functions of hypothetical proteins (Rajadurai et al 2011). Several

meta-servers provide automated analysis of protein sequences for functional

annotation, including JAFA (Friedberg et al 2006), GeneQuiz (Andrade et al

1999), Pedant (Frishman et al 2001), IprScan (Hunter et al 2009) and Phydbac

(Enault et al 2005). The most

popular method of annotation is inference by sequence similarity or homology,

in which similar sequences are searched for in the available

databases (Itoh et al 2005).

Homologous proteins are proteins that are likely to

share common ancestry. A homology search aligns protein or DNA sequences

to measure their similarity. The alignment may be local, in which a portion of

one sequence is matched with a portion of the other, or global, in which the whole of one sequence

is matched with the whole of the other. The Basic Local Alignment Search Tool, BLAST

(Altschul et al 1990) is a standard tool used for finding the homologous

relationship between sequences. A BLAST search allows the user to compare a

sequence of interest against a library of sequences and calculates the

statistical significance of the matches. BLAST can be used to infer functional and

evolutionary relationships between sequences as well as to help identify

members of gene families. AIM-BLAST (Aravindhan et al 2009) facilitates

BLAST searches of multiple sequences using an AJAX interface. Proteins

that have no functionally characterized homologs in other organisms must be analyzed

in terms of biologically significant sites, domains, orthologous groups

or subfamilies. Profiles, families and domains are stored in databases, and

searches against these databases suggest functions for specific proteins. Tools

such as KOGnitor (Tatusov et al 2003), OrthoMCL-DB (Chen et al 2006), Pfam

(Bateman et al 2004), CDD (Marchler-Bauer et al 2005), ProDom (Servant et

al 2002), BLAST (Altschul et al 1990), ScanProsite (Gasteiger et al 2003),

PRINTS (Attwood et al 2003), SMART (Letunic et al 2012) and Bioinfotracker

(Ramesh Kumar et al 2009) provide search functions for characterizing specific

protein sequences. When protein sequences are compared, some are found

to share a short region or “motif” that is specific to

a particular function. Identifying such distinctive motif patterns in

protein sequences can therefore help in predicting the functions of unannotated

proteins with similar motifs. ScanProsite makes use of context-dependent

annotation templates to discover functional and structural intra-domain

residues by scanning protein sequences for occurrences of possible

motifs, and predicts their function.
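
The idea of scanning a sequence for a motif pattern can be sketched in a few lines. This is not ScanProsite's implementation: it is a simplified illustration that assumes a PROSITE-style pattern syntax restricted to residues, `x`, `[...]`, `{...}` and repetition counts (anchors such as `<` and `>` are not handled). The pattern and the toy sequence are hypothetical examples, and the function names are invented for this sketch.

```python
import re

def prosite_to_regex(pattern: str) -> str:
    """Translate a simple PROSITE-style pattern into a Python regular expression."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        core, low, high = re.fullmatch(
            r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element
        ).groups()
        if core == "x":                    # x = any residue
            core = "."
        elif core.startswith("{"):         # {P} = any residue except P
            core = "[^" + core[1:-1] + "]"
        if low is not None:                # (n) or (n,m) = repetition count
            core += "{" + low + ("," + high if high else "") + "}"
        parts.append(core)
    return "".join(parts)

def find_motifs(sequence: str, pattern: str):
    """Return (1-based start, matched segment) for each non-overlapping hit."""
    regex = re.compile(prosite_to_regex(pattern))
    return [(m.start() + 1, m.group()) for m in regex.finditer(sequence)]

# Illustrative zinc-finger-style pattern and a made-up sequence containing one hit.
example_pattern = "C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H-x(3,5)-H"
toy_sequence = "MK" + "CAAC" + "AAA" + "L" + "A" * 8 + "H" + "AAA" + "H" + "GG"
hits = find_motifs(toy_sequence, example_pattern)
print(hits)
```

A protein of unknown function that yields a hit for a pattern tied to a known function becomes a candidate for that annotation, which would then need to be confirmed by other evidence.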

Knowledge of protein domains and their organization helps in

protein sequence analysis. Domains are conserved between protein families,

and they aid in predicting the function of a protein even when no orthologous

gene is found by a homology search. ProDom can be useful in providing

functional information on uncharacterized proteins by carrying out a global

comparison of the submitted sequence against the structural, functional and

evolutionary relationships of known protein sequences. Functional annotation

based on such domain superfamilies improves knowledge of

protein functions. Sequences that are likely to perform similar

functions/activities in the same or different species are often of homologous origin.

Homologs are of two classes: homologs within

one species that arose by gene duplication are called paralogs, while homologs in different species

that arose by speciation are called orthologs. Orthologous sequences are expected to have

the highest level of pairwise similarity. However, inference accuracy also

depends on evolutionary distance and the particular functional attribute under

consideration. The database of Clusters of Orthologous Groups of proteins

(COGs) classifies protein sequences

based on their phylogenies. Family-based classification also remains an

important means of providing functional annotation for biological

sequences. Pfam is a collection of multiple sequence alignments and profile

Hidden Markov Models that represent protein families. An integrated

approach using the above methodologies is likely to assign functions with fewer

errors and to detect more relevant functions of an uncharacterized

protein.

1.7.2 Comparative Genomics

Comparative genomics is the analysis and comparison of the genomes

of two or more species. The comparison is usually done by aligning

two protein sequences (pairwise alignment) and can be extended to the alignment of multiple

protein sequences. The alignment can be either local or global. Global

alignment aligns every residue of the sequences and uses the Needleman–Wunsch

algorithm (Needleman and Wunsch, 1970), which is based on dynamic programming.

Local alignments are more useful for dissimilar sequences that are suspected

to contain regions of similarity or similar sequence motifs within their larger

sequence context. The Smith–Waterman algorithm (Smith and Waterman, 1981)

is a general local alignment method based on dynamic programming.

BLAST is the standard local alignment search tool used universally. A

major application of comparative genomics is in the field of drug discovery:

comparing the genomes of pathogenic and non-pathogenic organisms

reveals their similarities and differences, and the genes that

differ can be used as possible drug targets. Another

major application of comparative genomics is functional reannotation. In

general, only about 60% of the genes in a microbial genome have documented functions,

and the remaining 40% are indicated as hypothetical proteins or

proteins of unknown function. The most popular approach in comparative

genomics is the study of phylogenetic relationships, which conveys knowledge of

the evolutionary relationships between organisms that can be used for

functional annotation. Most of the available comparative genomics tools, such as

BLAST (Altschul et al 1990), MUMmer (Schatz et al 2007), VISTA (Mayor

et al 2000), CGAT (Uchiyama et al 2006), PipMaker (Schwartz et al 2000),

WABA (Baillie and Rose 2000), ACGT (Xie et al 2003), genoPlotR (Guy et

al 2010) and CGView Server (Grant and Stothard 2008), are primarily

alignment-based visualization tools. Very few tools provide information on

both homologous and non-homologous genes in a tabular format for

comparing microorganisms. Comprehensive information on homologous genes of known and unknown function, placed side by side in tabular format, helps

in rapid functional annotation transfer.
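
The difference between global (Needleman–Wunsch) and local (Smith–Waterman) alignment can be made concrete with a score-only sketch of both dynamic programming recurrences. The scoring scheme here (match +1, mismatch -1, linear gap penalty -1) is an assumption chosen for illustration; real protein alignments normally use substitution matrices and affine gap penalties, and the example sequences are made up.

```python
def global_score(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch: best score for aligning all of a with all of b."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]

def local_score(a, b, match=1, mismatch=-1, gap=-1):
    """Smith-Waterman: best score over all pairs of substrings of a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cell = max(0, diag, prev[j] + gap, curr[j - 1] + gap)  # never below 0
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best

# Two made-up sequences sharing the core "GATTACA" but with unrelated flanks.
seq_a = "AAAGATTACATTT"
seq_b = "CCCGATTACAGGG"
print(global_score(seq_a, seq_b), local_score(seq_a, seq_b))
```

The local score rewards only the shared core, while the global score is pulled down by the unrelated flanks, which is why local alignment suits sequences that are similar only in part.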

1.7.3 Protein Interaction

Although some proteins perform their functions independently, most proteins

interact with others to regulate their function and biological

activity. Most enzyme activities are mediated by these protein

interactions (Sharan et al 2006). Interactions between proteins are used for

determining the functional association of proteins (Deng et al 2003). They also

provide information such as involvement in specific pathways, and can confirm

existing annotations provided by other methods. Protein interactions can

also be used for discovering unknown protein functions in a pathway and

cross talk between pathways (Van Leene 2010). Thus the analysis of protein

interaction data provides a framework for understanding systems biology (De

Las Rivas et al 2010). Different computational methods are available for

predicting protein interactions, and integrating the different sources of

evidence provides a comprehensive set of binary interaction partners.

Fusion events (Phizicky et al 1995) of proteins in closely related genomes

are used for determining interacting proteins. Two proteins are considered to

be fused (Enright et al 1999) if they satisfy the triangular inequality: if

two proteins A and B of Organism 1 both match protein C of Organism 2, in

such a way that A <-> C and B <-> C where A ≠ B (A and B do not match each other), then A and B are predicted to interact. Fused (composite) proteins in a given reference genome are

detected by searching for unfused component protein sequences that are

homologous to the reference protein, but not to each other. These unfused

query sequences align to different regions of the reference protein, indicating

that the reference protein is a composite protein resulting from a gene fusion event. The component proteins of candidate

fusion genes detected in this way are predicted to interact physically.

Phylogenetic profiles are applied for finding co-evolution in organisms with

whole genome sequences (Kim et al 2006). Cellular functions are performed

by means of physical interactions between proteins, which take place through

interacting domains. The interaction information for Pfam domain pairs is

obtained from the DOMINE database (Raghavachari et al 2008), a database of

known and predicted domain interactions. Physical contacts can be determined

using domain-based interactions. If domains D1 and D2 are present in

proteins P1 and P2 respectively, i.e. P1 has {D1} and P2 has {D2}, then P1

and P2 are predicted to interact if D1 and D2 interact. A wide range

of computational tools such as STRING (von Mering et al 2005), APID (Prieto et

al 2006), ADVICE (Tan et al 2004), IPPRED (Goffard et al 2003) and ISPOT

(Brannetti et al 2003) are available online for the prediction of protein

interactions.
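
The two prediction rules described above, gene fusion and domain-based inference, can be written down as simple set operations. The sketch below is illustrative only: the input formats, function names and identifiers (P1, D1, A, B, C and so on) are assumptions, the domain pair list stands in for a DOMINE-style table, and the alignment coordinates are made up rather than taken from any real search.

```python
from itertools import combinations

def domain_based_pairs(protein_domains, interacting_domains):
    """Predict P1-P2 when some domain of P1 interacts with some domain of P2."""
    known = {frozenset(pair) for pair in interacting_domains}
    predicted = set()
    for p1, p2 in combinations(sorted(protein_domains), 2):
        if any(frozenset((d1, d2)) in known
               for d1 in protein_domains[p1]
               for d2 in protein_domains[p2]):
            predicted.add((p1, p2))
    return predicted

def fusion_based_pairs(hits, homologous_pairs):
    """Predict A-B when both match the same composite protein C in different,
    non-overlapping regions and A and B are not homologous to each other.

    hits: (query protein, composite protein, start, end) alignment records.
    """
    homologous = {frozenset(pair) for pair in homologous_pairs}
    predicted = set()
    for (a, c1, s1, e1), (b, c2, s2, e2) in combinations(sorted(hits), 2):
        same_composite = c1 == c2 and a != b
        non_overlapping = e1 < s2 or e2 < s1
        unrelated = frozenset((a, b)) not in homologous
        if same_composite and non_overlapping and unrelated:
            predicted.add(tuple(sorted((a, b))))
    return predicted

# Hypothetical inputs.
domains = {"P1": {"D1"}, "P2": {"D2"}, "P3": {"D3"}}
domain_hits = domain_based_pairs(domains, [("D1", "D2")])

alignments = [("A", "C", 1, 120), ("B", "C", 130, 300), ("D", "C", 10, 100)]
fusion_hits = fusion_based_pairs(alignments, [("A", "D")])
print(domain_hits, fusion_hits)
```

In this toy input, A and D are homologous and align to overlapping regions of C, so they are not paired, whereas B aligns to a separate region and is paired with each of them.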

1.7.4 Metabolic Pathway Reconstruction

Metabolic pathways are series of chemical reactions occurring

within a cell, and these series of reactions are interlinked to give a

continuous flow of biochemical processes (Thomas et al 2007). This flow

is controlled by enzymes acting on reactants (metabolites

or other biochemicals), and the reaction flow can be

inhibited or reversed. The reactions can be of the types single-substrate,

single-product; single-substrate, multi-product; multi-substrate,

multi-product; and multi-substrate, single-product. The reactants and

products can be joined together based on the substrate-product relation to

form pathways. This method of joining reactants and products to form a

pathway from seed metabolites is known as the bottom-up approach (Natalie et al

2006). The reconstructed metabolic pathway can be further used in metabolic

flux analysis. The enzyme assigned to a reaction catalyzes the reaction flow in a

pathway, and its activity can inhibit or reverse that flow. The formation of certain

products can be augmented or inhibited by either increasing the flux to desired

products or diverting flux from undesired products in the pathway (Siahsar

et al 2011). Therefore, to improve the productivity of the industrially important

bio-product lovastatin from A. terreus, a detailed understanding of the

metabolic processes involved in its production is crucial, and this information

can be utilized for strain engineering (Askenazi et al 2003). Many

pathway databases, such as KEGG (Kanehisa et al 2008), Cyclone (Francois et

al 2007), MedicCyc (Urbanczyk-Wochniak and Sumner 2007), AsperCyc

(http://130.88.248.2/) and PathCase (Elliott et al 2008), are available for metabolic

pathway information. AsperCyc is a database of the metabolic pathways

of the Aspergillus group, constructed using Pathway Tools.

Reconstruction of a genome-scale

metabolic pathway database is essential to understand cellular processes.

Such a database is required to reduce the complexity of retrieving

metabolic pathway data. Sub-networks from the database can be used to

fill metabolic holes or create alternative paths by means of the network

expansion method (Handorf et al 2005).
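
The bottom-up joining of reactions from seed metabolites can be sketched as an iterative expansion: a reaction is added once all of its substrates are available, and its products then become available to other reactions. The reaction list below is a deliberately simplified, hypothetical stand-in (cofactors omitted, multi-step conversions collapsed into single reactions) and is not the reconstructed A. terreus network; the function name is also an assumption.

```python
def expand_network(reactions, seeds):
    """Grow a network from seed metabolites.

    reactions: {name: (substrates, products)}
    Returns the set of reachable metabolites and the reactions in firing order.
    """
    available = set(seeds)
    fired = []
    changed = True
    while changed:
        changed = False
        for name, (substrates, products) in reactions.items():
            if name not in fired and set(substrates) <= available:
                fired.append(name)
                available |= set(products)
                changed = True
    return available, fired

# Simplified, illustrative reactions (not stoichiometrically complete).
toy_reactions = {
    "R4": (["monacolin J", "2-methylbutyrate"], ["lovastatin"]),
    "R1": (["acetyl-CoA", "malonyl-CoA"], ["dihydromonacolin L"]),
    "R2": (["dihydromonacolin L"], ["monacolin J"]),
    "R3": (["acetyl-CoA", "malonyl-CoA"], ["2-methylbutyrate"]),
    "R5": (["metabolite X"], ["metabolite Y"]),   # substrate never becomes available
}
reachable, order = expand_network(toy_reactions, ["acetyl-CoA", "malonyl-CoA"])
print(order)
```

Reactions that never fire, such as R5 here, point to metabolites that the seeds cannot supply, which is one way of locating holes in a reconstructed network.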

1.7.5 Metabolic Flux Analysis

The traditional method for improving the yield of industrial

bio-products is random mutagenesis and screening, which is time-consuming.

Flux Balance Analysis (FBA) can be used to identify gene knockout

targets, which is more efficient than the traditional methods. FBA is a

mathematical approach for analyzing the flow of metabolites through a

metabolic network. Regulation of flux is vital for all metabolic pathway

activities under different growth conditions, such as growth on glucose, glyoxylate or acetate. Flux is

regulated by the enzymes involved in a pathway within the cell. The

understanding of cellular behavior can be achieved through modeling and

analysis of metabolic pathways, regulatory and signal transduction networks.

The different modeling techniques include interaction-based modeling (graph-

based representations of networks), constraint-based modeling (stoichiometric

representation) and mechanistic modeling (kinetic parameters and

stoichiometric representation). In most cases the kinetic data is not available

for simulation and so constraint based modeling is used as an alternative for

mechanistic modeling. The constraints to be considered for constraint based

modeling include physico-chemical constraints, spatial or topological

constraints, environmental constraints or gene regulatory constraints. Flux

balance analysis (FBA) is a constraint-based approach and it involves

(i) Mathematical representation of individual reactions and their

constraints.

(ii) The assumption of a steady state. The mass balance of the reactions

at steady state is given in Equation (1.3):

$$S \cdot v = 0 \tag{1.3}$$

where $S$ is the stoichiometric matrix and $v = [v_1\ v_2\ \dots\ v_{n_i},\ b_1\ b_2\ \dots\ b_{n_{ext}}]^T$; $v_i$ signifies the internal fluxes, $b_i$

represents the exchange fluxes in the system, $n_i$ is the number of internal

fluxes and $n_{ext}$ is the number of exchange fluxes in the system.

(iii) Defining a biologically relevant objective function and adding

other biochemical constraints.

(iv) Optimization of the objective function $f(v)$ (Equation (1.4)),

where $c_i$ is the weight given to flux $v_i$ in the objective:

$$f(v) = \sum_{i=1}^{r} c_i v_i \tag{1.4}$$

A wide spectrum of tools is available for the simulation and analysis

of biochemical systems, including the MATLAB-based CellNetAnalyzer (Klamt et al

2007) and COBRA toolbox (Becker et al 2007), OptKnock (Burgard et al

2003), OptStrain (Pharkya et al 2004), OptForce (Ranganathan et al 2010),

METATOOL (von Kamp and Schuster 2006), MetaFluxNet (Lee et al 2003)

and the General Algebraic Modeling System (GAMS) (http://www.gams.com/)

software. The major applications of flux balance analysis are in

drug discovery, through the analysis of cellular metabolism, and in the

production of industrially important metabolites (Raman and Chandra 2009).

By altering the bounds on certain reactions, growth on different media or the effect of

multiple gene knockouts on the metabolic pathway for production of

bio-products can be simulated (Edwards et al 2001). Linear programming is

widely used to identify the flux distribution that optimizes the biological

objective function. The objective here is to improve lovastatin

production by reducing byproduct formation through inhibition of certain

enzymes and redirection of the flux through the lovastatin biosynthetic

pathway.
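
The steady-state constraint of Equation (1.3), the linear objective of Equation (1.4) and the effect of changing reaction bounds can be illustrated on a very small network. The sketch below is an assumption-laden toy: the five-flux network is hypothetical and is not the A. terreus model, fluxes are restricted to integers, and an exhaustive search over the bounded flux grid is used in place of linear programming so that the example needs only the standard library. A real analysis would use a linear programming solver or one of the toolboxes listed above.

```python
from itertools import product

# Hypothetical toy network. Columns are fluxes, rows are internal metabolites.
#   b_in: -> A,  v1: A -> B,  v2: A -> C,  b_prod: B ->,  b_byp: C ->
S = [
    [1, -1, -1,  0,  0],   # metabolite A (precursor)
    [0,  1,  0, -1,  0],   # metabolite B (desired product)
    [0,  0,  1,  0, -1],   # metabolite C (byproduct)
]
C_OBJ = [0, 0, 0, 1, 0]    # objective weights c_i: maximize product export

def is_steady_state(S, v):
    """True when S . v = 0 for every internal metabolite (Equation 1.3)."""
    return all(sum(s * x for s, x in zip(row, v)) == 0 for row in S)

def objective(c, v):
    """Linear objective f(v) = sum of c_i * v_i (Equation 1.4)."""
    return sum(ci * vi for ci, vi in zip(c, v))

def best_flux(S, c, bounds):
    """Exhaustive search over integer fluxes within (lower, upper) bounds."""
    best, best_value = None, None
    for v in product(*(range(lo, hi + 1) for lo, hi in bounds)):
        if is_steady_state(S, v):
            value = objective(c, v)
            if best_value is None or value > best_value:
                best, best_value = v, value
    return best

# Untreated case: the byproduct branch v2 is assumed to carry at least 4 units.
untreated = [(0, 10), (0, 10), (4, 10), (0, 10), (0, 10)]
# Inhibited case: the enzyme for v2 is blocked, so its flux is fixed at 0.
inhibited = [(0, 10), (0, 10), (0, 0), (0, 10), (0, 10)]

flux_untreated = best_flux(S, C_OBJ, untreated)
flux_inhibited = best_flux(S, C_OBJ, inhibited)
print(flux_untreated, flux_inhibited)
```

Blocking the byproduct branch frees the precursor for the product branch, which is the same redirection of flux that the inhibitor strategy aims at, here reduced to its simplest form.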

1.8 AUGMENTATION OF LOVASTATIN BIOSYNTHESIS IN WET LAB

Lovastatin biosynthesis can be enhanced in different ways using fermentation techniques and gene silencing techniques. Of the five statins currently prescribed by physicians, three (pravastatin, simvastatin and lovastatin) are derived by fermentation. Lovastatin is produced by A. terreus and is found to accumulate mostly in the mycelia (Wei et al 2007). Producing lovastatin by fermentation is cheaper than chemical synthesis (Seenivasan et al 2008). The research methodologies used in different countries to augment lovastatin production, and the amounts produced, are given in Appendix 2.

1.8.1 Substrate Concentration

Lovastatin production depends on the amount of substrate provided. Lovastatin biosynthesis is high when glutamate and histidine are added as substrates. In the presence of glucose and glutamate, lovastatin synthesis begins when glucose consumption levels off. High yields of lovastatin are obtained with slowly utilized carbon sources such as lactose and glycerol (Pecyna and Bizukojc 2011). Lactose gives the least biomass but higher lovastatin production, and soluble starch is more beneficial to lovastatin production (Jia et al 2009a). Rice or wheat bran can be a suitable substrate for lovastatin production (Jaivel and Marimuthu 2010, Sri Ramireddy et al 2011, Patil et al 2011). Chang et al (2002) used a rice-glycerol complex medium to improve lovastatin production. Kaur et al (2010), Osman et al (2011), Latha et al (2012) and Szakacs et al (1998) used soybean, bagasse, coconut oil cake and cheese whey, respectively, as substrates for lovastatin production. Hajjaj et al (2001) state that carbon source starvation is required for lovastatin biosynthesis, in addition to glucose repression. According to Barrios-Gonzalez et al (2008), lovastatin yield can be improved by arresting growth through nitrogen limitation rather than carbon limitation. An optimal C:N mass ratio increases lovastatin yield (Lopez et al 2003). The Box-Behnken factorial design (Panda et al 2008) and response surface methodology (RSM) (Lopez et al 2004b, Panda et al 2009a, Panda et al 2009b, Sayyad et al 2007, Pansuriya and Singhal 2010) are also used to design optimal media for lovastatin production. Supplements such as tylosin (Jia et al 2010) and Tween 80 (Danuri 2008) are also added to improve lovastatin production.
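
As an illustration of how a Box-Behnken design lays out medium-optimization experiments, the following is a minimal Python sketch. The factor names and their low and high levels are hypothetical and are not taken from the cited studies. The pairwise construction used here corresponds to the standard design for three to five factors only.

```python
from itertools import combinations, product


def box_behnken(k, center_points=3):
    """Coded (-1, 0, +1) Box-Behnken runs built from pairwise 2^2 blocks.

    Each pair of factors is varied over all (+/-1, +/-1) combinations while
    the remaining factors are held at their mid level (0).
    """
    runs = []
    for i, j in combinations(range(k), 2):
        for a, b in product((-1, 1), repeat=2):
            row = [0] * k
            row[i], row[j] = a, b
            runs.append(tuple(row))
    # Replicated center points allow the pure error to be estimated.
    runs.extend([(0,) * k] * center_points)
    return runs


def decode(run, levels):
    """Map a coded run to real settings, given {name: (low, high)}."""
    settings = {}
    for coded, (name, (low, high)) in zip(run, levels.items()):
        mid, half_range = (low + high) / 2, (high - low) / 2
        settings[name] = mid + coded * half_range
    return settings


# Hypothetical factors and ranges, for illustration only.
levels = {
    "lactose_g_per_L": (10, 30),
    "nitrogen_g_per_L": (2, 6),
    "pH": (5.5, 6.5),
}
design = box_behnken(len(levels))
for run in design[:3]:
    print(run, decode(run, levels))
```

For three factors this gives 12 edge runs plus the chosen number of center points, so 15 runs with the default of three. The measured lovastatin titers from such runs would then be fitted with a second-order response surface model, which is not shown here.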

1.8.2 Effect of Metal Ions

Jia et al (2009b) studied lovastatin biosynthesis in the presence of metal ions. Cu2+ inhibited cell growth with no influence on lovastatin biosynthesis, whereas Fe2+, Ca2+, Zn2+, Mg2+ and Mn2+ promoted both cell growth and lovastatin biosynthesis. Zn2+ promotes the activity of lovD, which in turn increases lovastatin production, and it also promotes the upstream steps in lovastatin biosynthesis. LovE and orf13 require zinc ions to take part in regulation during lovastatin biosynthesis. The highest improvement in lovastatin production was obtained with Mn2+ when its concentration was raised from 1 to 2 mM.

1.8.3 pH and Dissolved Oxygen Level

Maintaining the pH above 4.5 is crucial when glucose is the carbon source. Fermentations are typically carried out at approximately 28 °C and a pH of 5.8 to 6.3. The dissolved oxygen level is controlled at greater than 40% of air saturation.

1.8.4 Gene Manipulation

More intense transcription of the biosynthetic genes would lead to enhanced lovastatin production. A lovC disruptant is able to convert dihydromonacolin L or monacolin J into lovastatin. A lovA mutant shows an unexpected beta-oxidation and gives a small amount of lovastatin upon addition of the immediate precursor monacolin J (Sorensen et al 2003). Huang and Li (2009) showed that lovE is essential for lovastatin biosynthesis: introducing an additional copy of lovE increases lovastatin production, and in the absence of lovE no lovastatin production is observed. Expression of the gldB gene in solid state fermentation for lovastatin biosynthesis indicates that the fungus senses osmotic stress (Barrios-Gonzalez et al 2008). When alpha-dimethylbutyryl-N-acetylcysteamine thioesters are used as acyl donors, LovD converts monacolin J and 6-hydroxyl-6-desmethyl monacolin J into simvastatin and huvastatin, respectively. The interaction between LovD and LovF is essential for lovastatin biosynthesis (Xie et al 2006). LaeA is a universal regulator of secondary metabolites. LaeA transcriptionally regulates multiple novel secondary metabolite clusters and can be used as a potent metabolite-mining tool (Keller et al 2006). Deletion of laeA blocks the expression of metabolic gene clusters, which lowers the levels of several secondary metabolites (Fox and Howlett 2008). Lovastatin is a secondary metabolite, and transcription of its genes is triggered by overexpression of laeA. Overexpression of aflR downregulates the laeA transcript level, while overexpression of PKA and RasA completely inhibits laeA expression (Bok and Keller 2004).

1.8.5 Modes of Fermentation

Solid state fermentation shows higher levels of lovastatin accumulation in the extract (extracellular and intracellular) than liquid submerged fermentation, owing to the accumulation of lovE and lovF transcripts (Barrios-González et al 2008). Banos et al (2009) optimized lovastatin production using solid state fermentation on polyurethane foam. Lovastatin is mostly accumulated in the mycelia (Dewi 2011). Fed-batch fermentation is superior to batch fermentation, which lasts for less than 10 days. A feedback regulatory mechanism exists in lovastatin biosynthesis, and eliminating or suppressing this mechanism greatly enhances the production of lovastatin (Lopez et al 2004a).

1.8.6 Growth Mechanism

Pelleted growth of A. terreus yields higher titers of lovastatin than filamentous growth (Lopez et al 2005, Porcel et al 2007). The formation of A. terreus secondary metabolites is mixed-growth-associated or non-growth-associated (Bizukojc and Pecyna 2011). The highest lovastatin concentration in batch fermentation is obtained with a vegetative inoculum of spherical pellets with short, thick, branched peripheral hyphae, while maintaining the dissolved oxygen tension (DOT) at 70% (Kumar et al 2000).

1.8.7 Co-metabolites

The predominant co-metabolites of lovastatin are benzophenone and sulochrin. Other co-metabolites are asterric acid, butyrolactone, citrinin, emodin, itaconic acid, geodin and terretonin. Sulochrin and asterric acid are produced from the common polyketide precursor emodin (Vinci et al 2001). Sulochrin is a toxic co-metabolite and a contaminant that has to be removed in the downstream process. Couch and Gaucher (2004) genetically disrupted the emodin anthrone PKS gene to eliminate sulochrin production. Lovastatin biosynthesis proceeds through the utilization of acetate units as precursors in polyketide elongation. Butyrolactone-I is able to induce morphological change and increase lovastatin production (Schimmel et al 1998).

1.9 OBJECTIVES OF THE RESEARCH

Lovastatin is a statin drug produced by fermentation using A. terreus. Improving lovastatin production through genetic engineering or the isolation of new, improved strains is time-consuming and expensive. The overall objective is to enhance lovastatin production, based on whole genome analysis, by adding inhibitors to the production medium without increasing the cost of the fermentation process.

1. To shed light on the overall biochemistry of A. terreus for clues to the augmentation of lovastatin production.

2. To gain a better understanding of the sequenced A. terreus through whole genome sequence analysis, in line with the systems biology aim to “understand gene functions”.

3. To perform constraint-based modeling and flux analysis of central carbon metabolism in order to redirect maximum flux through acetyl-CoA and malonyl-CoA, the precursors for lovastatin biosynthesis.

4. To eliminate or reduce byproduct formation and increase the flux towards the desired product, lovastatin.

5. To validate the results obtained from computational analysis in the wet lab.

1.10 IMPORTANCE OF STUDY

Increasing the yields of pharmaceutically important secondary metabolites like lovastatin is a main goal for the commercial success of a fermentation process. The market value of lovastatin is higher because it also serves as an immediate precursor to the multi-billion dollar drug simvastatin (Zocor) (Campbell and Vederas 2010). Aspergillus terreus ATCC 20542 is widely used in industry for the production of lovastatin, but its whole genome sequence is not available for further analysis. The computational analysis of pathways and interaction networks is therefore done for A. terreus strain NIH2624, whose whole genome sequence is publicly available. Various bioinformatics tools are employed to understand the biological system, and this understanding is then used in the augmentation of lovastatin biosynthesis. Each tool is specialized in its own way of analysis, and combining these tools increases both the prediction capability and the confidence level. The integrated functional annotation approach, comprising homology, feature and orthology, predicts the functions of A. terreus proteins with greater confidence. Understanding the similarities and differences between genomes offers the information necessary for reconstructing metabolic pathways and identifying targets for strain improvement. Uncharacterized proteins are compared with known proteins of closely related Aspergillus species to acquire consistent functional annotation. Flux optimization was done on the modeled central carbon metabolism to increase the lovastatin precursors acetyl-CoA and malonyl-CoA. Protein interaction network analysis was done to determine the interacting partners of the lovastatin biosynthetic cluster proteins and to validate their function. Interacting partners of the lovastatin biosynthetic cluster that are associated with toxins were identified, as they serve as better targets for improving lovastatin biosynthesis. The clues obtained from computational analysis for improving lovastatin production were validated through a wet lab approach.
