Date post: | 10-Apr-2019 |
Upload: | hoangkhuong |
CHAPTER 1
INTRODUCTION
1.1 STATINS
Statins are among the best-selling drugs worldwide. They are
cholesterol-lowering agents used to treat and prevent coronary diseases by
reducing cholesterol levels in the blood (Maron et al 2000). High levels of
low-density lipoprotein (LDL) form plaques in the artery walls and decrease
the flow of blood and oxygen to the heart, which results in heart attack
(Ross 1999). Statins lower plasma LDL cholesterol by inhibiting
3-hydroxy-3-methylglutaryl coenzyme A (HMG-CoA) reductase in the liver, which
in turn stimulates LDL receptor activity and the clearance of LDL from the
bloodstream, thus helping to prevent heart attack (Shepherd 2006). In
addition to the treatment of cardiovascular disease, statins have been
investigated for conditions ranging from Alzheimer's disease to high blood
pressure and cancer (Austen 2002, Dale et al 2006). Statins can be broadly
categorized into fermentation-derived statins and synthetic statins.
1.1.1 Fermentation-Derived Statins
Lovastatin and pravastatin belong to the category of natural or
fermentation-derived statins, while simvastatin is a semi-synthetic statin.
Lovastatin (Mevacor – manufactured by Merck) was the first statin
approved by the FDA, in August 1987 (von Haehling and Anker 2005), for the
treatment of hypercholesterolemia.
Pravastatin (Pravachol – manufactured by BMS (Bristol-Myers
Squibb)) is an antilipemic fungal metabolite which is isolated from cultures of
Nocardia autotrophica. It acts as a competitive inhibitor of HMG-CoA
reductase, the enzyme which catalyzes the conversion of HMG-CoA to
mevalonate, a key step in cholesterol synthesis (Todd and Goa 1990).
Pravastatin lowers plasma cholesterol and lipoprotein levels.
Simvastatin (Zocor – manufactured by Merck) is a lipid-lowering
agent derived synthetically from lovastatin; it lowers LDL by up to 50%.
This lovastatin derivative is efficiently synthesized from monacolin J
(lovastatin without the side chain) by a process that uses the Aspergillus
terreus acyltransferase LovD. Simvastatin competitively inhibits hepatic
hydroxymethylglutaryl coenzyme A (HMG-CoA) reductase. Huvastatin is a
semi-synthetic statin that is also used in the treatment of hyperlipidemia,
with good medical effect at a lower dose. LovD converts
6-hydroxyl-6-desmethylmonacolin J into huvastatin when an
alpha-dimethylbutyryl-N-acetylcysteamine thioester is used as the acyl donor
(Xie et al 2006).
Both pravastatin and simvastatin lower plasma cholesterol and
lipoprotein levels, and modulate immune responses by suppressing MHC II
(Major Histocompatibility Complex II) on interferon gamma-stimulated
antigen-presenting cells such as human vascular endothelial cells.
1.1.2 Synthetic Statins
Atorvastatin (Lipitor – manufactured by Pfizer) is used to reduce
the levels of bad cholesterol (LDL), total cholesterol, triglycerides (another
type of fat) and apolipoprotein B (a protein needed to make cholesterol), and
to increase the level of good cholesterol (HDL) in the blood (Funatsu et al
2001). These actions are important in reducing the risk of hardening of the
arteries, which can lead to heart attack, stroke and peripheral vascular
disease.
Cerivastatin (Baycol – marketed by the pharmaceutical company
Bayer A.G.) was introduced in the late 1990s as a new synthetic statin to
compete with Pfizer's highly successful atorvastatin. On August 8, 2001, the
U.S. Food and Drug Administration (FDA) announced that Bayer had
voluntarily withdrawn Baycol from the U.S. market following reports of fatal
rhabdomyolysis (Angelmar 2007).
Fluvastatin (Lescol – manufactured by Novartis) is a synthetic
lipid-lowering agent with antilipidemic and potential antineoplastic properties.
Fluvastatin competitively inhibits hepatic 3-hydroxy-3-methylglutaryl
coenzyme A (HMG-CoA) reductase, which catalyzes the conversion of
HMG-CoA to mevalonate, a key step in cholesterol synthesis. Fluvastatin has
also been shown to exhibit antiviral activity against Hepatitis C (Milazzo et al
2009). The extended-release formulation of fluvastatin is marketed as Lescol XL.
Pitavastatin (Mukhtar et al 2005), usually formulated as a calcium salt, is a
newer member of the statin class. Like the other statins, it is an
inhibitor of HMG-CoA reductase, the enzyme that catalyses the rate-limiting
step of cholesterol synthesis. It has been available in Japan since 2003 and
is marketed under license in South Korea and India.
Crestor (rosuvastatin) (Teramoto and Watkins 2005) is a newer
member of the HMG-CoA reductase inhibitors. In addition to lowering
low-density lipoprotein (LDL) or bad cholesterol, Crestor has been
shown to provide a significant increase in high-density lipoprotein (HDL) or
good cholesterol.
1.2 STATIN MARKET
Merck's Mevacor (lovastatin), Bristol-Myers Squibb's Pravachol
(pravastatin sodium), Merck's Zocor (simvastatin), Novartis' Lescol
(fluvastatin sodium), AstraZeneca's Crestor (rosuvastatin calcium) and
Merck/Schering-Plough's Vytorin (ezetimibe/simvastatin) are leading statin
brands in the world market.
Biocon, India's largest, USFDA-qualified producer and exporter
of statins, is the market leader. Other prominent players in this market include
Ranbaxy, Lupin, Themis Medicare, RPG Life Sciences, Claris Lifesciences,
Intas Pharma, Medley, Sun Pharma, USV, Concord Biotech, Emcure, Zydus
Cadila, Torrent, Cadila Pharma, Carsyon (a division of Micro Labs) and Cipla.
Among these, Biocon, Ranbaxy, Themis Medicare and Zydus Cadila focus
more on exports.
Pfizer's Lipitor (atorvastatin calcium) is the best-selling
pharmaceutical product in the world. Lipitor generated revenues of
$13.2 billion in 2009 (Kaitin 2010).
The market value of lovastatin is further increased because it also serves as
the immediate precursor of the multi-billion-dollar drug simvastatin (Campbell
and Vederas 2010).
1.3 STATIN SIDE EFFECTS
All statins can cause side effects. The most prominent side
effect, cited in the withdrawal of cerivastatin, is rhabdomyolysis, the
breakdown of skeletal muscle tissue (Angelmar 2007). Most
statins are metabolized in part by one or more hepatic cytochrome P450
enzymes, leading to an increased potential for interactions with other drugs
and with certain foods (such as grapefruit juice). Pitavastatin appears to be a
substrate of CYP2C9 and not of CYP3A4, the enzyme that is a common source of
interactions with other statins.
1.4 WHY LOVASTATIN?
Multistep syntheses of lovastatin derivatives, performed by
enzymatic transformations using lipases and esterases, are important for the
engineered biosynthesis of pharmaceutically relevant statin compounds and for
cost reduction. Simvastatin is a semi-synthetic derivative of the fungal
polyketide lovastatin and is an important drug for lowering cholesterol levels
in adults. Production of these statins by a combinatorial approach is not
economically feasible because of their complex structures. Alternative methods
for increasing the yield, such as genetic engineering, are also time-consuming
and expensive. Fermentation can instead be used to increase the yield of the
secondary metabolite lovastatin by adding certain inhibitors, without
incurring any additional infrastructure cost.
1.5 LOVASTATIN
Lovastatin, a polyketide, is produced by A. terreus and is used in
various biomedical applications including the treatment of cardiovascular
disease, Alzheimer's disease, renal disease and cancer (Seenivasan et al 2008).
Lovastatin is usually well tolerated as a drug, although in rare cases it
causes myopathy or rhabdomyolysis. As with other statins metabolized by
CYP3A4, drinking grapefruit juice during therapy increases the risk of serious
side effects by decreasing the metabolism of the drug and increasing its
plasma concentration.
1.5.1 Structure
The fermentation-derived statins share a common hexahydronaphthalene system
and a β-hydroxy lactone. Lovastatin, as a member of this group, also
possesses a hexahydronaphthalene structure and a β-hydroxy lactone
(Figure 1.1). Its side chains, a methylbutyryl group and a methyl group, are
present at positions 8α and 6α of the naphthalene ring, respectively.
Figure 1.1 Chemical structure of lovastatin
1.5.2 Systematic (IUPAC) Name
(1S,3R,7S,8S,8aR)-8-{2-[(2R,4R)-4-hydroxy-6-oxooxan-2-yl]ethyl}-3,
7-dimethyl-1,2,3,7,8,8a-hexahydronaphthalen-1-yl(2S)-2-methylbutanoate.
1.5.3 Chemical Data
Formula C24H36O5
Mol. mass 404.54 g/mol
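The molar mass quoted above can be checked from the molecular formula. The following is a minimal sketch; the atomic weights are rounded standard values supplied as an assumption of this example, not taken from the source.

```python
# Rounded standard atomic weights in g/mol (assumed values for this sketch).
ATOMIC_WEIGHT = {"C": 12.0107, "H": 1.00794, "O": 15.9994}


def molar_mass(composition):
    """Sum atomic weight times atom count over an element -> count mapping."""
    return sum(ATOMIC_WEIGHT[element] * count
               for element, count in composition.items())


lovastatin = {"C": 24, "H": 36, "O": 5}  # C24H36O5
print(f"{molar_mass(lovastatin):.2f} g/mol")  # prints 404.54 g/mol
```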
1.5.4 Pharmacokinetic Data
Oral Bioavailability : <5%
Protein binding : >95%
Metabolism : hepatic (CYP3A substrate)
Half life : 1.1-1.7 hours
Excretion : negligible
Routes : oral
1.5.5 An Overview of Lovastatin Biosynthetic Cluster
In contrast to most primary metabolism genes, the genes involved
in secondary metabolism and certain nutrient utilization pathways are
clustered in fungi. A cluster of 18 genes is involved in lovastatin biosynthesis
in A. terreus. The lovastatin biosynthetic gene cluster includes the lovastatin
nonaketide synthase gene (lovB), the lovastatin diketide synthase gene (lovF),
an enoyl reductase gene (lovC), a transesterase gene (lovD), an HMG-CoA
reductase gene (ORF8), regulatory genes (lovE and ORF13) and a cytochrome
P450 monooxygenase gene.
The biosynthesis of lovastatin (Xie et al 2006) is coordinated by
two megasynthases, lovastatin nonaketide synthase (LNKS, LovB) and lovastatin
diketide synthase (LDKS, LovF), and numerous accessory enzymes. LNKS acts
iteratively and, together with the dissociated enoyl reductase (LovC),
synthesizes the intermediate dihydromonacolin L, while LDKS supplies the
methylbutyrate side chain.
The overall chemical reaction catalyzed by the enzyme LNKS
(EC 2.3.1.161) is given in Equation (1.1).
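\[
\text{acetyl-CoA} + 8\,\text{malonyl-CoA} + 11\,\text{NADPH} + 10\,\text{H}^{+} + \text{S-adenosyl-L-methionine} \rightarrow \text{dihydromonacolin L} + 9\,\text{CoA} + 8\,\text{CO}_2 + 11\,\text{NADP}^{+} + \text{S-adenosyl-L-homocysteine} + 6\,\text{H}_2\text{O} \tag{1.1}
\]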
LDKS is non-iterative. Because LDKS lacks a transesterase domain, the
diketide is released by LovD. LDKS is involved in
the synthesis of the 5-carbon unit (2R)-2-methylbutyrate (Equation (1.2)).
Dihydromonacolin L produced by LNKS is converted to monacolin J.
The C-8 hydroxyl group of monacolin J is then esterified with 2-methylbutyrate
to form lovastatin.
The overall chemical reaction catalyzed by the enzyme LDKS is
given below:
\[
\text{acetyl-CoA} + 6\,\text{malonyl-CoA} + \text{S-adenosyl-L-methionine} \rightarrow \text{(2R)-2-methylbutyrate} \tag{1.2}
\]
There are 35 enzymatic reaction steps (Appendix 1) involved in the
biosynthesis of dihydromonacolin L from acetyl coenzyme A, malonyl
coenzyme A, NADPH and S-adenosylmethionine (SAM). The double
oxidation of dihydromonacolin L to monacolin J is catalyzed by cytochrome
P450 (CYP450) oxygenases.
1.6 SOURCE ORGANISMS
The organisms producing lovastatin include Pleurotus ostreatus,
Penicillium citrinum, P. funiculosum, Monascus ruber, M. purpureus, M. paxi,
M. anka, Aspergillus terreus, A. flavipes, A. fischeri, A. flavus, A. umbrosus,
A. parasiticus, Acremonium chrysogenum, Trichoderma viride and
T. longibrachiatum, of which A. terreus is the most prominent
lovastatin producer (Samiee et al 2003).
1.6.1 Aspergillus terreus
Aspergillus terreus is the major source organism for lovastatin
production (Seenivasan et al 2008). The scientific classification of A. terreus
is given in Table 1.1. It has both medical and industrial importance. It acts
as an opportunistic pathogen and causes aspergillosis, a pulmonary disorder
(Mokaddas et al 2010). It also produces commercially important enzymes such as
xylanase and various toxins, including asterrein, patulin, terreic acid and
citrinin. A. terreus is ubiquitous in nature, i.e. it is common and
widespread. Aspergillus species are among the most successful groups of molds,
with important roles in natural ecosystems and the human economy. The fungus
shows tenacious resistance to amphotericin B, which is a crucial
treatment for fungal infections.
Table 1.1 The scientific classification of Aspergillus terreus
Rank          Classification
Kingdom       Fungi
Subkingdom    Dikarya
Phylum        Ascomycota
Class         Eurotiomycetes
Order         Eurotiales
Family        Trichocomaceae
Genus         Aspergillus
Species       A. terreus
1.6.1.1 Colony characteristics
A. terreus is usually identified by its macroscopic colony
morphology. The colonies range from cinnamon-buff to sand brown in color.
They grow rapidly, and their texture is downy to powdery.
1.6.1.2 Microscopy
Conidial heads are compactly columnar. Conidiophore
stipes are hyaline and smooth-walled. Conidiogenous cells are biseriate, with
phialides limited mainly to the upper part of the subspherical vesicle surface.
Conidia are in chains, round to ovoid, hyaline to slightly yellow and
smooth-walled. The microscopic structure of A. terreus NIH2624 is given in
Figure 1.2.
Figure 1.2 The microscopic structure of A. terreus NIH2624
Courtesy: (http://www.mycology.adelaide.edu.au/Fungal_Descriptions/
Hyphomycetes_(hyaline)/Aspergillus/terreus.html)
1.6.1.3 Pathogenicity
The species occasionally causes pulmonary aspergillosis in humans.
The fungus has also been isolated from cases of otomycosis (ear infection) and
onychomycosis (infection of finger or toe nails).
1.6.1.4 Ecology
Aspergillus terreus is cosmopolitan but more common in tropical
and subtropical areas. It grows in warmer soils and on grains, straw, cotton
and decomposing vegetation.
1.6.1.5 Genome information
Whole genome sequence analysis of A. terreus is required to
understand the metabolism that underlies lovastatin
biosynthesis. Although Aspergillus terreus ATCC 20542 is widely used in
industry for the production of lovastatin, its genome sequence is not available
for computational analysis. Aspergillus terreus strain NIH2624,
sequenced by the Broad Fungal Genome Initiative with funding from the National
Institute of Allergy and Infectious Diseases (NIAID) through the Broad's
Microbial Sequencing Center (MSC) (http://www.broadinstitute.org/seq/msc)
using the whole genome shotgun sequencing method, was therefore used for
computational analysis in this study. The genome of A. terreus is haploid,
about 35 Mb in size and organized in 8 chromosomes. Its G+C content is about
50-60%. The genome assembly has 267 contigs placed into 26 scaffolds.
1.7 THE INTEGRATED APPROACH FOR AUGMENTING
INDUSTRIAL PRODUCTS
Several tools are being developed for handling the large volumes of dynamic
data stored in databases. Integrating tools from different areas of
bioinformatics to understand biological systems can be further exploited for
augmenting industrial bio-products. The exploitation of data obtained
from genomics and the related research areas of genome-wide
transcriptomics, proteomics and metabolomics helps in understanding
biological systems. The production of desired industrial bio-products can be
improved using stoichiometric or kinetic modeling approaches (Reed et al
2010). Oksman-Caldentey and Saito (2005) integrated genomic data
with metabolic profiles to identify key genes that could be
engineered for the production of improved crop plants. L-valine production
by Escherichia coli was improved through transcriptome analysis and gene
knockout simulation of the in silico genome-scale metabolic network (Park
et al 2007). Askenazi et al (2003) used association analysis of transcription
and metabolic profiles to improve a strain for lovastatin production.
Aldor and Keasling (2003) used external substrate manipulation,
inhibitor addition, recombinant gene expression, host cell genome
manipulation and protein engineering of polyhydroxyalkanoate (PHA)
biosynthetic enzymes. Thus the integration of different strategies can be used
for manipulating the production of important secondary metabolites.
1.7.1 Functional Annotation
The functions of more than 40% of the genes in newly sequenced genomes are
initially unknown, and their products are labeled as hypothetical proteins
(proteins with unknown function). These hypothetical proteins lead to gaps in
metabolic networks, which can be filled using functional reannotation data
(Vongsangnak et al 2008). Functional annotation is the assignment of functions
to genes and proteins. Computational functional annotation is a rapid
way of studying the biomolecular activities of a protein, since
conventional functional assignment through biochemical methods
is time-consuming and costly (Joshi et al 2004). An integrated
genome-scale reannotation is the most promising approach to predict the
functions of hypothetical proteins (Rajadurai et al 2011). Several
meta-servers provide automated analysis of protein sequences for functional
annotation, including JAFA (Friedberg et al 2006), GeneQuiz (Andrade et al
1999), Pedant (Frishman et al 2001), IprScan (Hunter et al 2009) and Phydbac
(Enault et al 2005). The most popular annotation method is sequence similarity
or homology-based inference: functions are most often annotated by a homology
search for similar sequences in the available databases (Itoh et al 2005).
Homologous proteins are proteins, in the same or different organisms, that are
likely to share common ancestry. A homology search aligns protein or DNA
sequences to measure their similarity. The alignment may be local, in which a
portion of one sequence is matched with a portion of the other, or global, in
which the whole of one sequence is matched with the whole of the other. The
Basic Local Alignment Search Tool, BLAST (Altschul et al 1990), is a standard
tool for finding homologous relationships between sequences. A BLAST search
compares a sequence of interest with a library of sequences, identifies the
library sequences similar to it and calculates the
statistical significance of the matches. BLAST can be used to infer functional
and evolutionary relationships between sequences as well as to help identify
members of gene families. AIM-BLAST (Aravindhan et al 2009) facilitates
BLAST searches with multiple sequences using AJAX as an interface. Proteins
that do not share similar functions with proteins of another organism must be
analyzed in terms of biologically significant sites, domains, orthologous
groups or subfamilies. Profiles, families and domains are stored in databases,
and searches against these databases provide functions for specific proteins.
Tools such as KOGnitor (Tatusov et al 2003), OrthoMCL-DB (Chen et al 2006), Pfam
(Bateman et al 2004), CDD (Marchler-Bauer et al 2005), ProDom (Servant et
al 2002), BLAST (Altschul et al 1990), ScanProsite (Gasteiger et al 2003),
PRINTS (Attwood et al 2003), SMART (Letunic et al 2012) and Bioinfotracker
(Ramesh Kumar et al 2009) provide search functions for identifying specific
protein sequences. Some protein sequences share a short region or "motif" that
is characteristic of a specific function. Identifying such distinctive motif
patterns in protein sequences can therefore help in predicting the functions of
unannotated proteins with similar motifs. ScanProsite makes use of
context-dependent annotation templates to discover functional and structural
intra-domain residues by scanning protein sequences for the occurrence of
possible motifs, and predicts their function.
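Motif scanning of the kind described above can be illustrated with a small pattern matcher. This is only a sketch, not ScanProsite itself: it translates a simplified subset of PROSITE-style pattern syntax (x, [..], {..}, repetition counts) into a regular expression and ignores anchors and annotation templates. The function names and the test sequence are hypothetical; the pattern N-{P}-[ST]-{P} is the familiar N-glycosylation site pattern.

```python
import re


def prosite_to_regex(pattern):
    """Translate a simple PROSITE-style pattern into a Python regex string."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        # Split off an optional repetition suffix such as x(2) or x(2,4).
        match = re.fullmatch(r"(.+?)(?:\((\d+(?:,\d+)?)\))?", element)
        core, repeat = match.group(1), match.group(2)
        if core == "x":
            core = "."                       # any residue
        elif core.startswith("{"):
            core = "[^" + core[1:-1] + "]"   # any residue except those listed
        # [ABC] groups and single letters are already valid regex fragments.
        parts.append(core + ("{" + repeat + "}" if repeat else ""))
    return "".join(parts)


def scan(sequence, pattern):
    """Return (1-based start, matched segment) for every hit, overlaps included."""
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(sequence)]


# Hypothetical sequence used only to exercise the scanner.
hits = scan("MKNGSAANPTQNLTW", "N-{P}-[ST]-{P}")
print(hits)  # [(3, 'NGSA'), (12, 'NLTW')]
```

The lookahead wrapper is a design choice that lets overlapping occurrences be reported; a plain `finditer` on the bare pattern would skip them.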
Knowledge of protein domains and their organization helps in
protein sequence analysis. Domains are conserved between protein families,
and they aid in predicting the function of a protein even when no orthologous
gene is found by a homology search. ProDom can provide
functional information on uncharacterized proteins by carrying out a global
comparison of the submitted sequence against the structural, functional and
evolutionary relationships of known protein sequences. Functional annotation
based on such domain superfamilies improves knowledge of
protein functions. Sequences that are likely to perform similar
functions in the same or different species are often of homologous origin.
Homologs are of two classes: paralogs, which arise by gene duplication within
a species, and orthologs, which are found in different species and usually
retain the same function. Orthologous sequences are expected to have
the highest level of pairwise similarity. However, the accuracy of functional
inference also depends on evolutionary distance and on the particular
functional attribute under consideration. The database of Clusters of
Orthologous Groups of proteins (COGs) classifies protein sequences
on the basis of their phylogenies. Family-based classification also remains an
important means of providing functional annotation for biological
sequences. Pfam is a collection of multiple sequence alignments and profile
hidden Markov models that represent protein families. An integrated
approach using the above methodologies is likely to assign functions with fewer
errors and to detect the more relevant function of an uncharacterized
protein.
1.7.2 Comparative Genomics
Comparative genomics is the analysis and comparison of the genomes
of two or more species. The comparison is usually done by aligning
two protein sequences (pairwise alignment) and can be extended to the
alignment of multiple protein sequences. The alignment can be either local or
global. Global alignments align the sequences over their entire length and use
the Needleman–Wunsch algorithm (Needleman and Wunsch 1970), which is based on
dynamic programming.
Local alignments are more useful for dissimilar sequences that are suspected
to contain regions of similarity or similar sequence motifs within their larger
sequence context. The Smith–Waterman algorithm (Smith and Waterman 1981)
is a general local alignment method, also based on dynamic programming.
BLAST is the standard local alignment search tool used universally. A
major application of comparative genomics is in drug discovery:
comparing the genomes of pathogenic and non-pathogenic organisms
reveals their similarities and differences, and the genes that differ
can be used as possible drug targets. Another
major application of comparative genomics is functional reannotation. In
general, the genome of a microorganism has only about 60% of its gene functions
documented, and the remaining 40% are indicated as hypothetical proteins or
proteins of unknown function. The most popular approach in comparative
genomics is the study of phylogenetic relationships, which conveys knowledge of
the evolutionary relationships between organisms that can be used for
functional annotation. Most of the available comparative genomics tools, such as
BLAST (Altschul et al 1990), MUMmer (Schatz et al 2007), VISTA (Mayor
et al 2000), CGAT (Uchiyama et al 2006), PipMaker (Schwartz et al 2000),
WABA (Baillie and Rose 2000), ACGT (Xie et al 2003), genoPlotR (Guy et
al 2010) and CGView Server (Grant and Stothard 2008), are primarily
alignment-based visualization tools. Very few tools provide information on
both homologous and non-homologous genes in a tabular format for
comparing microorganisms. Comprehensive information on homologous genes of
known and unknown function, placed side by side in a tabular format, helps
in rapid functional annotation transfer.
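The difference between global (Needleman–Wunsch) and local (Smith–Waterman) alignment described in this section can be sketched with one dynamic-programming routine that computes only the optimal score. The scoring values (match +1, mismatch -1, linear gap -2) and the example sequences are arbitrary assumptions chosen for illustration; practical tools use substitution matrices and affine gap penalties.

```python
def align_score(a, b, local=False, match=1, mismatch=-1, gap=-2):
    """Optimal alignment score of a and b.

    local=False: global alignment (Needleman-Wunsch recurrence).
    local=True:  local alignment (Smith-Waterman recurrence).
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    if not local:
        # Global alignment pays for leading gaps; local alignment does not.
        for i in range(1, rows):
            score[i][0] = i * gap
        for j in range(1, cols):
            score[0][j] = j * gap
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diagonal = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cell = max(diagonal, score[i - 1][j] + gap, score[i][j - 1] + gap)
            if local:
                cell = max(cell, 0)  # a local alignment may restart anywhere
            score[i][j] = cell
            best = max(best, cell)
    # Global: score of aligning both full sequences. Local: best cell anywhere.
    return best if local else score[-1][-1]


print(align_score("GATTACA", "CCCGATTACACCC"))              # -5
print(align_score("GATTACA", "CCCGATTACACCC", local=True))  # 7
```

In this example the shorter sequence occurs intact inside the longer one, so the local score is high while the global score is dragged down by the six unavoidable gap positions.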
1.7.3 Protein Interaction
Although proteins can function independently, most proteins
interact with others to regulate their function and biological
activity. Most enzyme activities are mediated by these protein
interactions (Sharan et al 2006). Interactions between proteins are used for
determining the functional association of proteins (Deng et al 2003). They also
provide information such as involvement in specific pathways, and can confirm
existing annotations provided by other methods. Protein interactions can
also be used for discovering unknown protein functions in a pathway and
cross-talk between pathways (Van Leene 2010). Thus the analysis of protein
interaction data provides a framework for understanding systems biology (De
Las Rivas et al 2010). Different computational methods are available for
predicting protein interactions, and integrating the different sources of
evidence provides a comprehensive set of binary interaction partners. The
fusion events (Phizicky et al 1995) of proteins in closely related genomes
are used for determining interacting proteins. Proteins are considered to
be fused (Enright et al 1999) if they satisfy the triangular inequality: if
two proteins A and B of Organism 1 both match protein C of Organism 2
(A <-> C and B <-> C) but do not match each other (A ≠ B), then A and B are
said to be interacting. Fused (composite) proteins in a given reference genome are
detected by searching for unfused component protein sequences that are
homologous to the reference protein, but not to each other. These unfused
query sequences align to different regions of the reference protein, indicating
that the reference protein is a composite protein resulting from a gene fusion
event. The component proteins of candidate fusion genes detected in this way
are predicted to interact physically.
Phylogenetic profiles are applied for finding co-evolution in organisms with
whole genome sequences (Kim et al 2006). Cellular functions are performed
by means of physical interactions between proteins, which take place through
interacting domains. The interaction information for Pfam domain pairs is
obtained from the DOMINE database (Raghavachari et al 2008), a database of
known and predicted domain interactions. Physical contacts can thus be
determined using domain-based interactions: if domains D1 and D2 are present in
proteins P1 and P2 respectively, i.e. P1 has {D1} and P2 has {D2}, then P1
and P2 are predicted to interact if D1 and D2 interact. A wide range
of computational tools, such as STRING (von Mering et al 2005), APID (Prieto et
al 2006), ADVICE (Tan et al 2004), IPPRED (Goffard et al 2003) and ISPOT
(Brannetti et al 2003), is available online for the prediction of protein
interactions.
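The two prediction rules described in this section, gene fusion and domain-based inference, can be sketched as follows. All function names, protein and domain identifiers, and alignment coordinates are hypothetical; real pipelines would take the matches from a homology search and the domain pairs from a resource such as DOMINE.

```python
def fusion_based_interactions(hits, homologous_pairs=()):
    """Predict interactions from gene fusion events.

    hits: (component, composite, start, end) tuples, each a match of an
    organism-1 protein to a region of an organism-2 protein.
    Two components are paired when they match different, non-overlapping
    regions of the same composite protein and are not homologous to each other.
    """
    related = {frozenset(pair) for pair in homologous_pairs}
    by_composite = {}
    for component, composite, start, end in hits:
        by_composite.setdefault(composite, []).append((component, start, end))
    predicted = set()
    for composite, matches in by_composite.items():
        for i, (a, a_start, a_end) in enumerate(matches):
            for b, b_start, b_end in matches[i + 1:]:
                separate_regions = a_end < b_start or b_end < a_start
                if a != b and separate_regions and frozenset((a, b)) not in related:
                    predicted.add((tuple(sorted((a, b))), composite))
    return sorted(predicted)


def domain_based_interactions(protein_domains, interacting_domain_pairs):
    """Predict P1-P2 if some domain of P1 interacts with some domain of P2."""
    known = {frozenset(pair) for pair in interacting_domain_pairs}
    proteins = sorted(protein_domains)
    predicted = []
    for i, p1 in enumerate(proteins):
        for p2 in proteins[i + 1:]:
            if any(frozenset((d1, d2)) in known
                   for d1 in protein_domains[p1]
                   for d2 in protein_domains[p2]):
                predicted.append((p1, p2))
    return predicted


# Hypothetical data: A and B match separate regions of C; D overlaps both.
hits = [("A", "C", 1, 120), ("B", "C", 150, 300), ("D", "C", 100, 200)]
print(fusion_based_interactions(hits))  # [(('A', 'B'), 'C')]

domains = {"P1": {"D1"}, "P2": {"D2"}, "P3": {"D3"}}
print(domain_based_interactions(domains, [("D1", "D2")]))  # [('P1', 'P2')]
```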
1.7.4 Metabolic Pathway Reconstruction
Metabolic pathways are series of chemical reactions occurring
within a cell; these series of reactions are interlinked to give a
continuous flow of biochemical processes (Thomas et al 2007). This flow
is controlled by enzymes acting on reactants such as
metabolites or other biochemicals, and it can be
inhibited or reversed. Reactions can be of four types:
single-substrate, single-product; single-substrate, multi-product;
multi-substrate, multi-product; and multi-substrate, single-product. Reactants
and products can be joined on the basis of the substrate-product relation to
form pathways. This method of joining reactants and products to form a
pathway from seed metabolites is known as the bottom-up approach (Natalie et al
2006). The reconstructed metabolic pathway can be further used in metabolic
flux analysis. The enzyme assigned to a reaction catalyzes the reaction flow in
a pathway, and its regulation can inhibit or reverse that flow. Formation of
certain products can be augmented or inhibited by either increasing flux to
desired products or diverting flux from undesired products in the pathway
(Siahsar et al 2011). To improve the productivity of the industrially important
bio-product lovastatin from A. terreus, a detailed understanding of
the metabolic processes involved in its production is therefore crucial, and
this information can be utilized for strain engineering (Askenazi et al 2003).
There are many pathway databases, such as KEGG (Kanehisa et al 2008), Cyclone
(Francois et al 2007), MedicCyc (Urbanczyk-Wochniak and Sumner 2007), AsperCyc
(http://130.88.248.2/) and PathCase (Elliott et al 2008), available for
metabolic pathway information. AsperCyc is a database of the metabolic pathways
of the Aspergillus group, which uses Pathway Tools for the construction of
metabolic pathways in Aspergillus. Reconstruction of a genome-scale
metabolic pathway database is essential to understand cellular processes.
Such a database reduces the complexity of retrieving
metabolic pathway data. Sub-networks from the database can be used to
fill metabolic holes or create alternative paths by means of the network
expansion method (Handorf et al 2005).
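The bottom-up joining of reactions from seed metabolites can be illustrated with a small expansion loop: a reaction is added once all of its substrates are available, and its products then become available to further reactions. This is a minimal sketch in the spirit of network expansion, not the published implementation. The reaction table is a hypothetical, heavily simplified rendering of the lovastatin route described in this chapter, with cofactors and by-products omitted.

```python
def expand_network(reactions, seeds):
    """Grow a pathway from seed metabolites.

    reactions: name -> (substrates, products), both sets of metabolite names.
    Returns the reactions in the order they became feasible and the final
    set of reachable metabolites.
    """
    available = set(seeds)
    used = []
    changed = True
    while changed:
        changed = False
        for name, (substrates, products) in reactions.items():
            if name not in used and set(substrates) <= available:
                used.append(name)
                available |= set(products)
                changed = True
    return used, available


# Hypothetical, simplified reaction set (illustration only).
reactions = {
    "LovD": ({"monacolin J", "2-methylbutyrate"}, {"lovastatin"}),
    "P450": ({"dihydromonacolin L"}, {"monacolin J"}),
    "LNKS": ({"acetyl-CoA", "malonyl-CoA", "NADPH", "SAM"}, {"dihydromonacolin L"}),
    "LDKS": ({"acetyl-CoA", "malonyl-CoA", "SAM"}, {"2-methylbutyrate"}),
}
seeds = {"acetyl-CoA", "malonyl-CoA", "NADPH", "SAM"}

order, reachable = expand_network(reactions, seeds)
print(order)                      # ['LNKS', 'LDKS', 'P450', 'LovD']
print("lovastatin" in reachable)  # True
```

Removing a seed (for example SAM) leaves lovastatin unreachable, which is how such an expansion exposes metabolic holes in a reconstruction.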
1.7.5 Metabolic Flux Analysis
The traditional method for improving the yield of industrial
bio-products is random mutagenesis and screening, which is time-consuming.
Flux Balance Analysis (FBA) can be used to identify gene knockout
targets, which is more efficient than the traditional methods. FBA is a
mathematical approach for analyzing the flow of metabolites through a
metabolic network. Regulation of flux is vital for all metabolic pathway
activities under different growth conditions, such as growth on glucose,
glyoxylate or acetate. Flux is regulated by the enzymes involved in a pathway
within the cell. An understanding of cellular behavior can be achieved through
modeling and analysis of metabolic pathways and of regulatory and signal
transduction networks.
The different modeling techniques include interaction-based modeling
(graph-based representations of networks), constraint-based modeling
(stoichiometric representation) and mechanistic modeling (kinetic parameters
and stoichiometric representation). In most cases kinetic data are not available
for simulation, so constraint-based modeling is used as an alternative to
mechanistic modeling. The constraints considered in constraint-based
modeling include physico-chemical constraints, spatial or topological
constraints, environmental constraints and gene regulatory constraints. Flux
balance analysis (FBA) is a constraint-based approach and it involves
(i) Mathematical representation of individual reactions and their
constraints.
(ii) Assumption of a steady state, in which the mass balance of the
reactions is given by Equation (1.3):
\[ S \cdot v = 0 \tag{1.3} \]
where \(S\) is the stoichiometric matrix and
\( v = [v_1\ v_2\ \dots\ v_{n_i}\ \ b_1\ b_2\ \dots\ b_{n_{ext}}]^T \); \(v_i\) signifies
the internal fluxes, \(b_i\) represents the exchange fluxes in the system,
\(n_i\) is the number of internal fluxes and \(n_{ext}\) is the number of
exchange fluxes in the system.
(iii) Defining biologically relevant objective function and addition
of other biochemical constraints.
(iv) Optimization of the objective function f (v) (Equation (1.4)).
\[ f(v) = \sum_{i=1}^{r} c_i v_i \tag{1.4} \]
A wide spectrum of tools is available for the simulation and analysis
of biochemical systems, including the MATLAB-based CellNetAnalyzer (Klamt et al
2007) and COBRA toolbox (Becker et al 2007), OptKnock (Burgard et al
2003), OptStrain (Pharkya et al 2004), OptForce (Ranganathan et al 2010),
METATOOL (von Kamp and Schuster 2006), MetaFluxNet (Lee et al 2003)
and the General Algebraic Modeling System (GAMS) (http://www.gams.com/)
software. The major applications of flux balance analysis are in
drug discovery, through the analysis of cellular metabolism, and in the
production of industrially important metabolites (Raman and Chandra 2009).
By altering the bounds on certain reactions, growth on different media or the
effect of multiple gene knockouts on the metabolic pathway for production of
bio-products can be simulated (Edwards et al 2001). Linear programming is
widely used to identify the flux distribution that optimizes the biological
objective function. The objective here is to improve lovastatin
production by reducing byproduct formation through inhibition of certain
enzymes and redirection of flux through the lovastatin biosynthetic
pathway.
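The steady-state constraint of Equation (1.3) and the linear objective of Equation (1.4) can be illustrated on a toy network. The sketch below does not perform the optimization itself (a full FBA would hand the stoichiometric matrix, flux bounds and objective to a linear programming solver); it only checks the mass balance and evaluates the objective for hand-picked flux vectors. The three-metabolite network, the flux values and all names are hypothetical.

```python
def mat_vec(S, v):
    """Matrix-vector product S . v for list-of-lists S and list v."""
    return [sum(s_ij * v_j for s_ij, v_j in zip(row, v)) for row in S]


def is_steady_state(S, v, tol=1e-9):
    """True when every metabolite is mass balanced, i.e. S . v = 0."""
    return all(abs(balance) <= tol for balance in mat_vec(S, v))


def objective(c, v):
    """Linear objective f(v) = sum_i c_i * v_i."""
    return sum(c_i * v_i for c_i, v_i in zip(c, v))


# Toy network (hypothetical): precursor A is taken up and converted either
# to product P or to byproduct W; both are exported.
# Flux order: v1 (A -> P), v2 (A -> W), b1 (uptake of A),
#             b2 (export of P), b3 (export of W)
S = [
    [-1, -1, 1,  0,  0],  # A
    [ 1,  0, 0, -1,  0],  # P
    [ 0,  1, 0,  0, -1],  # W
]
c = [0, 0, 0, 1, 0]  # objective: export of product P

reference = [6, 4, 10, 6, 4]     # part of the flux is lost to the byproduct
inhibited = [10, 0, 10, 10, 0]   # byproduct branch blocked, flux redirected
unbalanced = [10, 4, 10, 10, 4]  # consumes more A than is taken up

for name, v in [("reference", reference), ("inhibited", inhibited),
                ("unbalanced", unbalanced)]:
    print(name, is_steady_state(S, v), objective(c, v))
```

With the same uptake flux, blocking the byproduct branch raises the objective from 6 to 10 while still satisfying the mass balance, which is the kind of redirection the chapter aims at for lovastatin.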
1.8 AUGMENTATION OF LOVASTATIN BIOSYNTHESIS IN
WETLAB
Lovastatin biosynthesis can be enhanced in different ways using
fermentation techniques and gene silencing techniques. Of the five statins
currently prescribed by physicians, three (pravastatin, simvastatin and
lovastatin) are derived by fermentation. Lovastatin is produced by A. terreus
and accumulates mostly in the mycelia (Wei et al 2007). Lovastatin biosynthesis
by fermentation reduces the cost of production compared with chemical
synthesis (Seenivasan et al 2008). The research methodologies used for
augmenting lovastatin production in different countries and the yields
obtained are given in Appendix 2.
1.8.1 Substrate Concentration
Lovastatin production depends on the amount of substrate provided.
Lovastatin biosynthesis is high when glutamate and histidine are added as
substrates. In the presence of glucose and glutamate, the synthesis of
lovastatin is initiated once glucose consumption has leveled off. High
yields of lovastatin are obtained in the presence of slowly utilized carbon
sources such as lactose and glycerol (Pecyna and Bizukojc 2011). Lactose
produces the least biomass but higher lovastatin production, and soluble
starch is more beneficial to lovastatin production (Jia et al 2009a). Rice
or wheat bran can be a suitable substrate for lovastatin production (Jaivel
and Marimuthu 2010, Sri Ramireddy et al 2011, Patil et al 2011). Chang et al
(2002) used a rice-glycerol complex medium to improve lovastatin
production. Kaur et al (2010), Osman et al (2011), Latha et al (2012) and
Szakacs et al (1998) used soybean, bagasse, coconut oil cake and cheese
whey, respectively, as substrates for lovastatin production. Hajjaj et al
(2001) state that carbon source starvation is required for lovastatin
biosynthesis, in addition to glucose repression. According to
Barrios-González et al (2008), lovastatin yield can be improved by
arresting growth through nitrogen limitation rather than carbon source
limitation. An optimal C:N mass ratio increases lovastatin yield (Lopez et
al 2003). Box–Behnken factorial design (Panda et al 2008) and response
surface methodology (RSM) (Lopez et al 2004b, Panda et al 2009a, Panda et al
2009b, Sayyad et al 2007, Pansuriya and Singhal 2010) are also used for
designing an optimal medium for lovastatin production. Supplements such as
tylosin (Jia et al 2010) and Tween 80 (Danuri 2008) are also added to
improve lovastatin production.
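As an illustration of how a Box–Behnken design lays out medium optimization
experiments, the following minimal sketch builds the coded design matrix
(levels -1, 0, +1) and converts one run into actual concentrations. The
function names, factor names and concentration ranges are hypothetical and
are not taken from the cited studies. The pairwise construction used here
matches the standard design for three to five factors only.

```python
from itertools import combinations, product


def box_behnken(n_factors, n_center=3):
    """Coded Box-Behnken design: every pair of factors is varied over a
    2x2 factorial (-1/+1) while the remaining factors stay at 0, followed
    by n_center replicate center points."""
    if not 3 <= n_factors <= 5:
        raise ValueError("pairwise construction is standard for 3 to 5 factors")
    runs = []
    for i, j in combinations(range(n_factors), 2):
        for a, b in product((-1, 1), repeat=2):
            run = [0] * n_factors
            run[i], run[j] = a, b
            runs.append(run)
    runs.extend([0] * n_factors for _ in range(n_center))
    return runs


def decode(run, ranges):
    """Map coded levels to actual values using (low, high) per factor."""
    return {
        name: (low + high) / 2 + level * (high - low) / 2
        for level, (name, (low, high)) in zip(run, ranges.items())
    }


# Hypothetical medium components and ranges in g/L (assumed values).
ranges = {
    "carbon_source": (20, 60),
    "nitrogen_source": (2, 10),
    "phosphate": (1, 5),
}

design = box_behnken(len(ranges))

if __name__ == "__main__":
    for run in design:
        print(run, decode(run, ranges))
```

For three factors this gives 12 edge-midpoint runs plus the center
replicates. The measured lovastatin titers from these runs would then be
fitted to a quadratic response surface to locate the optimum.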
1.8.2 Effect of Metal Ions
Jia et al (2009b) studied lovastatin biosynthesis in the presence of
metal ions and reported the following results. Cu2+ inhibited cell growth
but had no influence on lovastatin biosynthesis. Fe2+, Ca2+, Zn2+, Mg2+ and
Mn2+ promoted both cell growth and lovastatin biosynthesis. Zn2+ promotes
the activity of lovD, which in turn increases lovastatin production, and it
also promotes the upstream steps in lovastatin biosynthesis. LovE and
orf13 require zinc ions for their regulatory role during lovastatin
biosynthesis. The greatest improvement in lovastatin production was
obtained when Mn2+ was raised from 1 to 2 mM.
1.8.3 pH and Dissolved Oxygen Level
Maintaining the pH above 4.5 is crucial when glucose is the carbon
source. Fermentations are typically carried out at approximately 28 °C and
a pH of 5.8 to 6.3. The dissolved oxygen level is controlled at greater
than 40% of air saturation.
1.8.4 Gene Manipulation
More intense transcription of the biosynthetic genes would lead to
enhanced lovastatin production. A lovC disruptant is able to convert
dihydromonacolin L or monacolin J into lovastatin. A lovA mutant shows an
unexpected β-oxidation and gives a small amount of lovastatin upon
addition of the immediate precursor monacolin J (Sorensen et al 2003).
Huang and Li (2009) showed that lovE is essential for lovastatin
biosynthesis: introducing an additional copy of lovE increases lovastatin
production, whereas no lovastatin production is observed in the absence of
lovE. Expression of the gldB gene during solid-state fermentation for
lovastatin biosynthesis is associated with the sensing of osmotic stress
(Barrios-González et al 2008). When alpha-dimethyl-butyryl-N-acetylcysteamine
thioesters are used as the acyl donor, LovD converts monacolin J and
6-hydroxyl-6-desmethyl monacolin J into simvastatin and huvastatin,
respectively. Interaction between LovD and LovF is essential for lovastatin
biosynthesis (Xie et al 2006). LaeA is a universal regulator of secondary
metabolites. It transcriptionally regulates multiple novel secondary
metabolite clusters and can be used as a potent metabolite-mining tool
(Keller et al 2006). Deletion of laeA blocks the expression of metabolic
gene clusters, which lowers the levels of several secondary metabolites
(Fox and Howlett 2008). Lovastatin is a secondary metabolite, and
transcription of its genes is triggered by overexpression of laeA.
Overexpression of aflR downregulates the laeA transcript level, while
overexpression of PKA and RasA completely inhibits laeA expression (Bok and
Keller 2004).
1.8.5 Modes of Fermentation
Solid-state fermentation gives higher levels of lovastatin
accumulation in the extract (extracellular and intracellular) than liquid
submerged fermentation, owing to the accumulation of lovE and lovF
transcripts (Barrios-González et al 2008). Banos et al (2009) optimized
lovastatin production using solid-state fermentation on polyurethane foam.
Lovastatin accumulates mostly in the mycelia (Dewi 2011). Fed-batch
fermentation is superior to batch fermentation, which lasts for less than
10 days. A feedback regulatory mechanism exists in lovastatin biosynthesis,
and eliminating or suppressing this mechanism greatly enhances the
production of lovastatin (Lopez et al 2004a).
1.8.6 Growth Mechanism
Pelleted growth of A. terreus yields higher titers of lovastatin than
filamentous growth (Lopez et al 2005, Porcel et al 2007). The formation of
A. terreus secondary metabolites is either mixed growth associated or
non-growth associated (Bizukojc and Pecyna 2011). The highest lovastatin
concentration in batch fermentation is obtained with a vegetative inoculum
of spherical pellets bearing short, thick, branched peripheral hyphae,
while maintaining the dissolved oxygen tension (DOT) at 70% (Kumar et al
2000).
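The terms "mixed growth associated" and "non-growth associated" can be made
concrete with the Luedeking-Piret relation, dP/dt = alpha * dX/dt + beta * X,
a standard textbook model that is not taken from the cited studies. The
sketch below assumes logistic biomass growth and uses hypothetical parameter
values; alpha > 0 with beta > 0 gives mixed growth-associated formation, and
alpha = 0 with beta > 0 gives non-growth-associated formation.

```python
def simulate(alpha, beta, mu_max=0.1, x0=0.5, x_max=10.0, hours=240, dt=0.1):
    """Euler integration of logistic growth with Luedeking-Piret product
    formation. Returns final biomass X and product P (arbitrary units).
    All parameter values are hypothetical."""
    x, p = x0, 0.0
    for _ in range(int(round(hours / dt))):
        dx_dt = mu_max * x * (1 - x / x_max)  # logistic biomass growth
        dp_dt = alpha * dx_dt + beta * x      # Luedeking-Piret relation
        x += dx_dt * dt
        p += dp_dt * dt
    return x, p


if __name__ == "__main__":
    print("growth associated:    ", simulate(alpha=0.02, beta=0.0))
    print("non-growth associated:", simulate(alpha=0.0, beta=0.001))
    print("mixed:                ", simulate(alpha=0.02, beta=0.001))
```

With beta = 0, product formation stops once biomass reaches its plateau,
whereas with beta > 0 the product continues to accumulate in the stationary
phase, as is typical of secondary metabolites.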
1.8.7 Co-metabolites
The predominant co-metabolites of lovastatin are benzophenone and
sulochrin. Other co-metabolites are asterric acid, butyrolactone,
citrinin, emodin, itaconic acid, geodin and terretonin. Sulochrin and
asterric acid are both produced from the common polyketide precursor
emodin (Vinci et al 2001). Sulochrin is a toxic co-metabolite and a
contaminant that has to be removed in the downstream process. Couch
and Gaucher (2004) genetically disrupted the emodin anthrone PKS gene to
eliminate sulochrin production. Lovastatin biosynthesis proceeds through
utilization of acetate units as precursors in polyketide elongation.
Butyrolactone-I has the ability to induce morphological change and increase
lovastatin production (Schimmel et al 1998).
1.9 OBJECTIVES OF THE RESEARCH
Lovastatin is a statin drug produced by fermentation using
A. terreus. Improving lovastatin production through genetic engineering or
through the isolation of new, improved strains is time-consuming and
expensive. The overall objective is to enhance the production of
lovastatin, without increasing the cost of the fermentation process, by
analyzing the whole genome and adding inhibitors to the production medium.
1. To shed light on the overall biochemistry of A. terreus in order
to find clues for the augmentation of lovastatin production.
2. To gain a better understanding of the sequenced A. terreus from
whole genome sequence analysis, in line with the motivation of
systems biology research to “understand gene functions”.
3. To perform constraint-based modeling and flux analysis of central
carbon metabolism in order to redirect maximum flux through acetyl CoA
and malonyl CoA, the precursors for lovastatin biosynthesis.
4. To eliminate or reduce byproduct formation and increase the flux
towards the desired product, lovastatin.
5. To validate the results obtained from computational analysis in
the wet lab.
1.10 IMPORTANCE OF STUDY
Increasing the yield of pharmaceutically important secondary
metabolites such as lovastatin is the main goal for the commercial success
of a fermentation process. The market value of lovastatin is higher because
it also serves as the immediate precursor to simvastatin (Zocor), a
multi-billion dollar drug (Campbell and Vederas 2010). Aspergillus terreus
ATCC 20542 is widely used in industry for the production of lovastatin, but
its whole genome sequence is not available for further analysis. The
computational analysis of pathways and interaction networks is therefore
done for A. terreus strain NIH2624, whose whole genome sequence is publicly
available.
Various bioinformatics tools are employed to understand the biological
system, and this understanding is then used in the augmentation of
lovastatin biosynthesis. Each tool is specialized in its own way of
analysis, and combining these tools increases both the prediction
capability and the confidence level. The integrated functional annotation
approach, comprising homology, feature and orthology, predicts the
functions of A. terreus proteins with a high level of confidence.
Understanding the similarities and differences between genomes offers the
information necessary for the reconstruction of metabolic pathways and the
identification of targets for strain improvement. Uncharacterized proteins
are compared with known proteins of closely related Aspergillus species to
acquire consistent functional annotation. Flux optimization was done on the
modeled central carbon metabolism to increase the lovastatin precursors
acetyl CoA and malonyl CoA. Protein interaction network analysis was done
to determine the interacting partners of the lovastatin biosynthetic
cluster proteins and to validate their functions. Interacting partners of
the lovastatin biosynthetic cluster that are associated with toxins were
identified, as these serve as better targets for improving lovastatin
biosynthesis. The clues obtained from computational analysis for improving
lovastatin production were validated through a wet lab approach.