Comparative Genomics of Sterol Homeostasis

Dissertation by Philip Allen Dy

In Partial Fulfilment of the Requirements for the Degree of Bachelor of Science in Biosciences

2nd May 2014

Submitted To: Dr. Steve Meaney
School of Biological Sciences
College of Sciences and Health
Dublin Institute of Technology
Acknowledgements

Foremost, I would like to express my special appreciation and thanks to my supervisor, Dr. Steve Meaney, who has been an exceptional mentor. His guidance has made this a thoughtful and rewarding journey.

Besides my supervisor, I would like to thank the other members of the dissertation committee, Dr. Orla Howe, Dr. Celine Herra, Dr. John Kearney and Dr. Alison Malkin, for providing guidance and support on the different aspects of the dissertation.

Thanks also to Alan Lennon and Gavin Meehan for the stimulating discussions and for their input towards the completion of the dissertation.

Last but not least, my gratitude extends to my family and friends for their continuing support in the final stages of the dissertation.

Abbreviations

Abbreviation                  Description
Annu Rev Genomics Hum Genet   Annual Review of Genomics and Human Genetics
ATP                           Adenosine Triphosphate
Biochem Biophys Res Commun    Biochemical and Biophysical Research Communications
BLAST                         Basic Local Alignment Search Tool
BLAT                          Blast-Like Alignment Tool
BMC Cancer                    BioMed Central Cancer
BMC Evol Biol                 BioMed Central Evolutionary Biology
bp                            Base pair
cDNA                          complementary DNA
Clin Biochem                  Clinical Biochemistry
CNS                           Central Nervous System
CRM                           cis-Regulatory Modules
Curr Opin Cell Biol           Current Opinion in Cell Biology
CYP46A1                       Cholesterol 24-hydroxylase
CYP7B1                        25-hydroxycholesterol 7-alpha-hydroxylase
DHEA                          dehydroepiandrosterone
DNA                           Deoxyribonucleic Acid
DOE                           Department of Energy
ECR                           Evolutionary Conserved Regions
ENCODE                        Encyclopedia Of DNA Elements
Exp Biol Med                  Experimental Biology and Medicine
Genome Res                    Genome Research
HGNC                          HUGO Gene Nomenclature Committee
HMG-CoA                       3-hydroxy-3-methylglutaryl-coenzyme A
HMGCR                         3-hydroxy-3-methyl-glutaryl-CoA reductase
HMM                           Hidden Markov Model
J Am Coll Nutr                Journal of the American College of Nutrition
J Biol                        Journal of Biology
J Biol Chem                   Journal of Biological Chemistry
J Clin Invest                 Journal of Clinical Investigation
J Mol Biol                    Journal of Molecular Biology
J Neurochem                   Journal of Neurochemistry
J Neurosci                    Journal of Neuroscience
LDF                           Linear Discriminant Function
Methods Mol Biol              Methods in Molecular Biology
Mol Syst Biol                 Molecular Systems Biology
mRNA                          messenger RNA
MULAN                         Multiple sequence Local Alignment
NADPH                         Nicotinamide Adenine Dinucleotide Phosphate
Nat Genet                     Nature Genetics
Nat Rev Mol Cell Biol         Nature Reviews Molecular Cell Biology
NCBI                          National Centre for Biotechnology Information
NFY                           Nuclear Factor Y
NHGRI                         National Human Genome Research Institute
NIH                           National Institutes of Health
Nucleic Acids Res             Nucleic Acids Research
PLoS Genet                    PLoS Genetics
PNS                           Peripheral Nervous System
Prog Neurobiol                Progress in Neurobiology
RNA                           Ribonucleic Acid
SP1                           Specificity Protein 1
SREBP                         Sterol Regulatory Element-Binding Proteins
TFBS                          Transcription Factor Binding Sites
tRNA                          transfer RNA
TSS                           Transcription Start Site
UV                            Ultraviolet

List of Figures

Figure 1: Cholesterol Biosynthesis Pathway
Figure 2: Intermembrane cholesterol regulation via sterol regulatory element-binding proteins (SREBPs)
Figure 3: Cytogenetic location of CYP7B1 gene.
Figure 4: ENSEMBL homepage.
Figure 5: Search results for the CYP7B1 gene.
Figure 6: Results for the CYP7B1 Transcript.
Figure 7: Summary of the CYP7B1-001 Transcript.
Figure 8: Choices of different configurations for data extraction.
Figure 9: Fasta sequence of the human CYP7B1 gene.
Figure 10: The species to be sequenced are organised into folders.
Figure 11: Mulan homepage which can be accessed at http://mulan.dcode.org.
Figure 12: Homepage for the sequences to be applied.
Figure 13: Summary page of results of sequences aligned.
Figure 14: Dynamic visualisation profile in standard stacked-pairwise structure.
Figure 15: Dynamic visualisation profile in color density by interspecies conservation configuration.
Figure 16: Summary of conservation of the sequences inputted.
Figure 17: Phylogenetic tree result of the sequences submitted.
Figure 18: multiTF homepage within Mulan.
Figure 19: Results summary of multiTF profile.
Figure 20: zPicture homepage for the sequences to be applied.
Figure 21: Results page of the sequences differentiated.
Figure 22: Dynamic visualisation profile of ECR of species of interest.
Figure 23: Spidey homepage.
Figure 24: Summary results of the sequence inputted.
Figure 25: FPROM homepage
Figure 26: Summary page of results of the position of the predicted promoter regions.
Figure 27: Mulan conservation analysis for the human CYP7B1 gene from 159.5kb to 212.6kb compared with guinea pig, elephant, cat, rabbit, mouse, cow, dog, armadillo, horse and dolphin. The conservation profile depicts the differential prediction of the non-coding ECRs in different species (the legend on the left describes colouring of different type of elements).
Figure 28: Mulan conservation analysis for the human CYP7B1 gene from 0kb to 53.2kb compared with guinea pig, elephant, cat, rabbit, mouse, cow, dog, armadillo, horse and dolphin.
Figure 29: Mulan conservation analysis for the human CYP7B1 gene from 53.2kb to 106.3 compared with guinea pig, elephant, cat, rabbit, mouse, cow, dog, armadillo, horse and dolphin.
Figure 30: Mulan conservation analysis for the human CYP7B1 gene from 106.3kb to 159.5 compared with guinea pig, elephant, cat, rabbit, mouse, cow, dog, armadillo, horse and dolphin.
Figure 31: Mulan conservation analysis for the human CYP7B1 gene from 159.9kb to 212.6kb compared with guinea pig, elephant, cat, rabbit, mouse, cow, dog, armadillo, horse and dolphin.
Figure 32: Phylogenetic tree represents the evolutionary history of the CYP7B1 gene in different vertebrate lineages (with the numbers corresponding to the number of nucleotide mismatches per kb).
Figure 33: Phylogenetic shadowing profile of the human and 10 other species comparison (100bp/70%) identity threshold.
Figure 34: multiTF identification of conserved TFBS of the CYP7B1 gene. The identified TFBS are depicted as coloured tick marks above the conservation profile.
Figure 35: The potential transcription binding sites found indicated within the CYP7B1 gene.

Table of Contents

Abstract
1.0 Introduction
    1.1 Genetics
    1.2 Genomics
    1.3 Comparative Genomics
    1.4 Cholesterol Biosynthesis Pathway
    1.5 Study of Cholesterol Biosynthesis
    1.6 CYP7B1 Gene
    1.7 Objectives
2.0 Materials and Methods
    2.1 Methodology Flow Chart
    2.2 Materials
        2.2.1 Software
    2.3 Methods
        2.3.1 Retrieval of Sequences
        2.3.2 Generating Alignments
3.0 Results
    3.1 Generating and Visualising DNA Alignments: Mulan
    3.2 Phylogenetic Shadowing Implemented in Mulan
    3.3 Detection of Evolutionary Conserved TFBS: multiTF
    3.4 Exon and Promoter Region Using Spidey and FPROM
4.0 Discussion
    4.1 Genetics and Sequencing
    4.2 How Cholesterol Plays a Role in CYP7B1
    4.3 Problems and Challenges in Bioinformatics
    4.4 Problems with Online Tools
    4.5 Summary of Results
    4.6 Future of Comparative Genomics
Bibliography
Appendix I


Abstract

Comparative genomics provides the means to delineate functional regions in anonymous DNA sequences. The application of this method is currently shifting towards deciphering how gene regulation is encoded in the non-coding regions of genomes. The CYP7B1 gene sequences of eleven mammals (human, armadillo, cat, cow, dog, dolphin, elephant, guinea pig, horse, mouse and rabbit) were retrieved and compared. To facilitate the practical application of comparative sequence analysis to genetics and genomics, several analytical and visualisation tools are used for the analysis of arbitrary sequences and whole genomes. These tools include Mulan, an alignment tool that also produces a phylogenetic tree; multiTF, an evolutionary transcription factor analysis tool; Spidey, an exon prediction tool; and FPROM, a human promoter prediction tool. The overall goal of this research is to determine whether the CYP7B1 gene shares a common relationship, in terms of function and structure, across the species under investigation.

1.0 Introduction

1.1 Genetics

Genetics is the study of heredity and hereditary variation. A monk named Gregor Mendel documented a particulate mechanism for inheritance in the mid-19th century (Weiling, 1991). He observed that organisms inherit traits by way of discrete 'units of inheritance', an early description of what is now called a gene. Genes are made from a long molecule called deoxyribonucleic acid (DNA). DNA is a polymer of monomer subunits (nucleotides), each made up of a sugar, a base and a phosphate group, and the genes it carries code for thousands of different kinds of proteins. The sugar in DNA is 2'-deoxyribose, a five-carbon (pentose) sugar. There are four types of nucleobases: adenine (A), thymine (T), cytosine (C) and guanine (G). Adenine and guanine have two nitrogen-containing rings and are purines. Thymine and cytosine have a single nitrogen-containing ring and are pyrimidines. The bases are attached to the 1'-carbon of the sugar, deoxyribose. The observation that the same bases are present in the genomes of different species led to the concept that the sequence of bases is the form in which genetic information is carried. The power of DNA sequencing as a research tool has driven the dramatic advancement of sequencing technology since the discovery of the double-helical structure of DNA by Watson and Crick (Watson & Crick, 1953), allowing ever more genomes to be sequenced and making comparative genomics an accessible focal point for the study of any form of life.

1.2 Genomics

Genomics focuses on the study of genome structure and function. Bioinformatics is a branch of science that uses computational approaches to solve genomics problems (Lesk, 2008). Bioinformatics develops databases to store, retrieve, organise and analyse relationships between biological data sets. The central dogma of molecular biology is that the sequence specifies the function, i.e.,

DNA → Ribonucleic Acid (RNA) → Protein
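The flow from DNA to RNA to protein can be illustrated with a minimal Python sketch. It is not part of the methods used in this dissertation. It assumes that the input is the coding strand of the DNA, that the reading frame starts at the first base, and that the standard genetic code applies; the function names `transcribe` and `translate` are illustrative only.

```python
from itertools import product

# Standard genetic code, codons ordered with bases U, C, A, G
# (the third position varies fastest). "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_dna):
    """Return the mRNA for a coding-strand DNA sequence (T becomes U)."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna):
    """Translate an mRNA from its first base until a stop codon or the end."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # stop codon: end of the protein
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    dna = "ATGGCCTAA"          # toy sequence, not taken from CYP7B1
    mrna = transcribe(dna)
    print(mrna, translate(mrna))
```

Real genes add complications that the sketch ignores, such as introns, untranslated regions and alternative reading frames.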

Frederick Sanger and colleagues (Sanger, 1981), and Alan Maxam and Walter Gilbert, proposed methods for the rapid sequencing of DNA molecules in the mid-1970s. Geneticists use two approaches to genome sequencing (Reece et al., 2011):

• Map-based sequencing: starts with vectors that can accommodate large fragments of an organism's genome, which are then assembled into genetic maps using recombinational analysis. The National Institutes of Health (NIH) and the Department of Energy (DOE) chose the map-based sequencing method for the Human Genome Project (HGP).

• Shotgun sequencing: was made possible by advances in sequencing technology and by the development of assembly software. The DNA is broken up into shorter fragments, which are sequenced and then assembled by software into a continuous sequence.
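The assembly step of shotgun sequencing can be sketched as a toy greedy algorithm that repeatedly merges the two fragments with the longest suffix-to-prefix overlap. This is a simplified illustration under stated assumptions (error-free reads, a single strand, no repeats), not the software used in any real genome project; the names `overlap` and `greedy_assemble` and the `min_len` parameter are hypothetical.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(fragments, min_len=3):
    """Merge fragments greedily by longest overlap; return the contigs."""
    frags = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while len(frags) > 1:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_len:
                        best_len, best_i, best_j = k, i, j
        if best_len == 0:  # nothing left to merge
            break
        merged = frags[best_i] + frags[best_j][best_len:]
        frags = [f for idx, f in enumerate(frags)
                 if idx not in (best_i, best_j)] + [merged]
    return frags


if __name__ == "__main__":
    reads = ["ATGGCGT", "GCGTACC", "TACCTGA"]  # toy reads
    print(greedy_assemble(reads))
```

Production assemblers handle sequencing errors, both strands and repeated regions, which this greedy approach does not.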

The International Human Genome Sequencing Consortium published the first draft of the human genome sequence in the journal Nature in February 2001, with the sequence of the entire genome's three billion base pairs almost 90% complete. The full sequence was completed and published in April 2003 (http://www.genome.gov/11006929).

After the completion of the HGP, the National Human Genome Research Institute (NHGRI) launched the Encyclopedia of DNA Elements (ENCODE) Consortium in September 2003. ENCODE aims to identify all functional elements in the human genome, including those in the large remaining component once regarded as 'junk' (DNA thought to have no biological function and not to be transcribed).

Projects like the '1000 Genomes Project' were made possible by the reduction in costs brought about by next-generation sequencing technologies ("next-gen" sequencing platforms). The 1000 Genomes Project is the first project to sequence the genomes of a large number of anonymous participants from different ethnic groups, providing a comprehensive resource on human genetic variation. The project has reported the sequencing of 1,092 genomes (McVean, 2012).

1.3 Comparative Genomics

Comparative genomics is a field of biological research in which the genome sequences of different species are compared computationally (Touchman, 2010) to gain insights into how genomes and genes evolved and to assess the functional significance of genome components (Strachan & Read, 2011).

A simple comparison of the general features of genomes, such as genome size, number of genes and chromosome number, presents an entry point into comparative genomic analysis. Comparative genomics relies on the comparison of sequences at the genome scale (Strachan & Read, 2011). It begins with powerful computer programs that identify homologous regions within the genomes under comparison.

The analysis of individual genome sequences gives much insight into genome structure but less into genome function. One big challenge for the next phase of genomics is to distinguish functional DNA and assign a role to it (Collins et al., 2003). Only a small fraction of the genome (5%) consists of protein-coding sequences (Korf, 2007; Lesk, 2008). The average human gene is approximately 30 kilobases (kb) long, and the total number of human genes is well below the 100,000 initially estimated in the early 1990s (Korf, 2007). It is only a modest increase over flies and worms, which have 13,600 genes (50% of human equivalence) and 19,000 genes (40% of human equivalence), respectively (Howard Hughes Medical Institute, 2001).

Functional sequences are subject to evolutionary selection, which can leave a signature in aligned sequences in the form of regions of similarity. Comparing sequences can find these signatures of selection, and so putative functional sequences can be deduced (Miller et al., 2004). Pairwise sequence alignment, the identification of residue-residue correspondences between two sequences, is the basic tool of bioinformatics. By contrast, multiple sequence alignment is an alignment of three or more biological sequences, which gives a more reliable assessment of similarity (Zvelebil & Baum, 2008). Multiple sequence alignments can then be used to detect evolutionary conserved regions (ECRs) in the sequence. These conserved regions may act as potential transcription factor binding sites (cis-regulatory elements) that regulate gene expression (Lu, 2011). Cis-regulatory modules (CRMs) include promoters, enhancers, silencers, and insulators or boundary elements (Miller et al., 2004).

Pairwise sequence alignments can be carried out using online comparative genomics tools such as the Basic Local Alignment Search Tool (BLAST) and the Blast-Like Alignment Tool (BLAT).

BLAST, implemented at the National Centre for Biotechnology Information (NCBI) (http://blast.ncbi.nlm.nih.gov/Blast.cgi), is used regularly via a web interface to compare a query sequence rapidly against a database or library of sequences (Altschul et al., 1990). BLAST uses a heuristic method that looks for short matches between two sequences and extends them to find regions of similarity. It also provides statistical information about each alignment, notably the expect value (E-value), which is the number of alignments with an equal or better score that would be expected by chance (Ye et al., 2006).
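The "short matches" idea behind heuristic search can be sketched as a word index followed by ungapped extension. The sketch below is a deliberately simplified illustration, not BLAST itself: real BLAST scores extensions with a substitution matrix, stops at a score drop-off and computes E-values, none of which is modelled here. The function names and the default word size `k=4` are assumptions made for the example.

```python
from collections import defaultdict


def index_words(seq, k):
    """Map every word of length k in seq to the positions where it starts."""
    index = defaultdict(list)
    for i in range(len(seq) - k + 1):
        index[seq[i:i + k]].append(i)
    return index


def find_seeds(query, subject, k=4):
    """Return (query_pos, subject_pos) pairs where a k-letter word matches."""
    index = index_words(subject, k)
    return [(q, s)
            for q in range(len(query) - k + 1)
            for s in index.get(query[q:q + k], [])]


def extend_seed(query, subject, q, s, k):
    """Extend an exact seed left and right while the letters keep matching.

    Returns (query_start, subject_start, length) of the ungapped match.
    """
    left = 0
    while (q - left - 1 >= 0 and s - left - 1 >= 0
           and query[q - left - 1] == subject[s - left - 1]):
        left += 1
    right = k
    while (q + right < len(query) and s + right < len(subject)
           and query[q + right] == subject[s + right]):
        right += 1
    return q - left, s - left, left + right


if __name__ == "__main__":
    query, subject = "GATTACA", "TTGATTACAGG"  # toy sequences
    seeds = find_seeds(query, subject, k=4)
    print(seeds)
    print(extend_seed(query, subject, *seeds[1], 4))
```

Indexing the subject once and looking up query words avoids comparing every position against every other position, which is where the speed of seed-based methods comes from.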


BLAT, available at http://genome.ucsc.edu/cgi-bin/hgBlat, is an alignment tool similar to BLAST that is used to compare biological sequences such as DNA, RNA and proteins, but it is structured differently. BLAT keeps an index of an entire genome in memory, so the target database for BLAT is the index derived from the assembly of the entire genome rather than a set of sequences. Kent (2002) states that BLAT is more accurate than popular existing tools for mRNA/DNA alignment.

Multiple sequence alignments can be performed with online comparative genomics software; Clustal and VISTA are among the most widely used tools.

Clustal Omega, available at https://www.ebi.ac.uk/Tools/msa/clustalo/, is the latest addition to the Clustal family. The new release increases scalability over previous versions and improves the accuracy of the progressive alignment procedure (Thompson et al., 1994).

VISTA is developed and hosted at the Genomics Division of Lawrence Berkeley National Laboratory and is accessible at http://genome.lbl.gov/vista/index.shtml. It is a collection of tools and databases that allows for extensive comparative genomics analyses. mVISTA is the server used to align and compare sequences from multiple species.

Phylogenetic footprinting is an approach for finding functional elements from sequence data (Ganley & Kobayashi, 2007). It relies on detecting high degrees of conservation across different species (Zhang & Gerstein, 2003). Phylogenetic footprinting reduces the amount of sequence under consideration by focusing attention on conserved regions that are more likely to serve a biological function (Thompson et al., 2004; Wasserman et al., 2000).
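Detecting conserved regions in an alignment can be sketched as a sliding-window percent-identity scan over two already-aligned sequences. This is an illustrative simplification and not the algorithm implemented by Mulan or VISTA. The defaults of a 100-position window and 70% identity echo the 100bp/70% threshold quoted for the phylogenetic shadowing figure, and the function names are hypothetical.

```python
def percent_identity(a, b):
    """Percentage of aligned columns where both sequences carry the same base.

    The inputs must be equal-length, non-empty rows of an alignment;
    gap characters ("-") never count as matches.
    """
    if len(a) != len(b) or not a:
        raise ValueError("aligned sequences must be non-empty and equal in length")
    matches = sum(x == y and x != "-" for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def conserved_windows(ref, other, window=100, min_identity=70.0):
    """Return merged (start, end) intervals whose windows meet the threshold.

    Coordinates are 0-based alignment columns with an exclusive end.
    """
    regions = []
    for start in range(len(ref) - window + 1):
        end = start + window
        if percent_identity(ref[start:end], other[start:end]) >= min_identity:
            if regions and start <= regions[-1][1]:
                regions[-1] = (regions[-1][0], end)  # extend previous region
            else:
                regions.append((start, end))
    return regions


if __name__ == "__main__":
    ref = "ACGTACGTAAAAAAAA"    # toy aligned sequences
    other = "ACGTACGTCCCCCCCC"
    print(conserved_windows(ref, other, window=4, min_identity=75.0))
```

With more than two species, the same scan could be run for each species against the reference and the resulting intervals intersected, which is one simple way to approximate a footprint shared across lineages.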


1.4 Cholesterol Biosynthesis Pathway

Cholesterol is the major sterol in animal tissues. It is a 27-carbon sterol derived from a single precursor, acetate (Nelson & Cox, 2008). The steroid nucleus consists of four fused rings, and a hydrocarbon chain extends from C17. Cholesterol is essential in regulating cell membrane permeability and fluidity, and for the production of bile acids and steroid hormones such as oestrogen, testosterone and cortisone.

Cholesterol plays a unique role among the many lipids in mammalian cells. The endoplasmic reticulum is the main organelle responsible for the regulation of cholesterol synthesis. Intracellular cholesterol concentration, adenosine triphosphate (ATP) levels and hormones (glucagon and insulin) regulate cholesterol production. In vivo, cholesterol concentration is dictated by diet and biosynthesis. High concentrations are generally associated with an increased risk of cardiovascular disease. According to Maxfield & van Meer (2010), the levels of cholesterol vary 5 to 10 fold between organelles, but the mechanisms behind these differences are only partially understood.

Figure 1: Cholesterol Biosynthesis Pathway (Rosanoff & Seelig, 2004).


Nicotinamide adenine dinucleotide phosphate (NADPH), produced in the pentose phosphate pathway, is required for cholesterol synthesis. Cholesterol is made from acetyl-CoA in four distinct stages: (1) the condensation of three two-carbon acetyl units from acetyl-CoA to form the six-carbon compound mevalonate; (2) the conversion of mevalonate to activated isoprene units; (3) the polymerisation of six five-carbon isoprene units to form the 30-carbon linear squalene; and (4) the cyclisation of squalene to form the steroid nucleus, which is subsequently converted to cholesterol (Nelson & Cox, 2008).

Cholesterol is widespread in biological membranes, especially in animals, and its presence can modify the role of membrane-bound proteins. The presence of cholesterol in a membrane reduces fluidity by stabilising extended conformations of the hydrocarbon tails of fatty acids through van der Waals interactions (Campbell & Farell, 2012). Cholesterol-rich membrane microdomains are also rich in glycosphingolipids, glycosylphosphatidylinositol-anchored proteins and signalling molecules. They function as signalling platforms and have been shown to be crucial for the assembly and activity of various signalling networks (Le Roy & Wrana, 2005).

The cholesterol biosynthesis pathway also synthesises non-sterol isoprenoids such as dolichol, heme A, isopentenyl transfer RNA (tRNA) and ubiquinone. As reported by Buhaescu & Izzedine (2007), these molecules appear to be interesting potential therapeutic targets for ongoing research in oncology, autoimmune disorders, atherosclerosis and Alzheimer's disease. Statins (e.g. atorvastatin, lovastatin, simvastatin) are a class of drugs that lower cholesterol levels by acting as reversible, competitive inhibitors of 3-hydroxy-3-methylglutaryl-coenzyme A (HMG-CoA) reductase, and they help prevent cardiovascular disease in those who are at high risk (Lewington et al., 2007). They are also being tested for neuroprotective properties. The fundamental mechanism of statins is to inhibit cellular cholesterol synthesis. However, the cholesterol biosynthesis pathway also has several by-products, the non-sterol isoprenoids, which are also important in cellular functioning (van der Most et al., 2009). Isoprenoid-mediated inhibition of cholesterol synthesis is also being applied to cancer chemotherapy and chemoprevention (Mo & Elson, 2004).


Cholesterol is also the precursor of important steroid hormones (Campbell & Farell, 2012). Like cholesterol, these hormones have a four-ring sterol nucleus. The steroid hormones (glucocorticoids, mineralocorticoids and sex hormones) are produced from cholesterol by modification of the side chain and the addition of oxygen atoms to the steroid ring system (Nelson & Cox, 2008), but they lack the alkyl chain attached to the D-ring of cholesterol. Vitamin D is produced by a related route. Cholecalciferol is synthesised in the skin by the action of ultraviolet (UV) light on 7-dehydrocholesterol, the immediate precursor of cholesterol. The cholecalciferol produced is hydroxylated in the liver to 25-hydroxyvitamin D, which is the main circulating form of vitamin D. 25-hydroxyvitamin D is then further hydroxylated in the kidneys to give the final product, 1,25-dihydroxycholecalciferol (the active form of vitamin D).

Cholesterol in the mammalian brain has been linked to certain neurodegenerative diseases. In the vertebrate nervous system, the majority of cholesterol resides in myelin (Saher et al., 2009). In the peripheral nervous system (PNS), Schwann cells autonomously synthesise essentially all the cholesterol that they require for myelination (Fu et al., 1998). Oligodendroglial inactivation of squalene synthase, a crucial enzyme of the cholesterol biosynthesis pathway, showed that cholesterol is a rate-limiting factor for central myelination (Saher et al., 2005). Myelin forms an insulating sheath around axons that consists of tightly compacted membranes. A study by Saher et al. (2011) reported that cholesterol appears to be the only integral myelin component that is critical and rate-limiting for the development of central nervous system (CNS) and PNS myelin. The function of myelin, which is highly enriched in glycosphingolipids, depends on its unique composition for rapid and efficient saltatory nerve conduction.


1.5 Study of Cholesterol Biosynthesis

Figure 2: Intermembrane cholesterol regulation via sterol regulatory element-binding proteins (SREBPs) (Sun et al., 2005). SREBPs, which are the membrane-embedded transcriptional activators of cholesterol synthesis, are transported by SCAP to the Golgi complex in COPII-coated vesicles for processing. Cholesterol triggers SCAP to bind to Insig, which blocks the binding of COPII proteins to SCAP, halting SREBP transport and thus shutting down cholesterol synthesis.

The ubiquity of cholesterol and its precursors in the cell membranes of eukaryotic species makes cholesterol biosynthesis an ideal pathway to analyse and model. Ohyama et al. (2006) have already evaluated the biological importance of transcriptional regulation of the cholesterol 24-hydroxylase (CYP46A1) gene by using orthologous sequence comparison to localise conserved non-coding regions, i.e. potential regulatory regions. CYP46A1 is a key regulator of brain cholesterol elimination (Shafaati et al., 2009).

The synthesis of cholesterol and its derivatives provides an example of a novel eukaryotic membrane component, which in higher animals is used as a precursor for the synthesis of more complex molecules (Freilich et al., 2008). The production of cholesterol is regulated by intracellular cholesterol concentration.


The rate-limiting step in the pathway to cholesterol is the conversion of HMG-CoA to mevalonate, the reaction catalysed by HMG-CoA reductase (Nelson & Cox, 2008; Wilcox et al., 2007).

The response to cholesterol levels is mediated by a system of transcriptional regulation of the gene encoding 3-hydroxy-3-methylglutaryl-CoA reductase (HMGCR). Sterol regulatory element-binding proteins (SREBPs) are regulatory proteins that control the HMGCR gene, along with other genes, to mediate the uptake of cholesterol. The SREBP family member SREBP1 is a major transcriptional activator of cholesterol and fatty acid metabolism that has been implicated in insulin resistance, diabetes and other diet-related diseases (Reed et al., 2008). Recent studies have shown that SREBPs interact with the transcription factors specificity protein 1 (SP1) and nuclear factor Y (NFY) in regulating specific classes of target genes (Reed et al., 2008; Horton et al., 2002).
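Searching a promoter sequence for candidate SP1 and NFY sites can be sketched as a scan for short consensus motifs on both strands. The sketch assumes the widely quoted core consensus sequences, the GC box (GGGCGG) for SP1 and the CCAAT box for NFY, and exact matching only. Tools such as multiTF rely on position weight matrices and cross-species conservation instead, so this is an illustration rather than the method used in this work; the names `MOTIFS` and `scan_motifs` are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Core consensus motifs (assumed for illustration; real sites are degenerate).
MOTIFS = {
    "SP1 (GC box)": "GGGCGG",
    "NFY (CCAAT box)": "CCAAT",
}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def scan_motifs(seq, motifs=MOTIFS):
    """Return (name, strand, start) for every exact motif hit, by position.

    start is the 0-based position on the forward strand; hits on the
    reverse strand are found by searching for the reverse complement.
    """
    seq = seq.upper()
    hits = []
    for name, motif in motifs.items():
        for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
            if strand == "-" and pattern == motif:
                continue  # palindromic motif: avoid reporting it twice
            start = seq.find(pattern)
            while start != -1:
                hits.append((name, strand, start))
                start = seq.find(pattern, start + 1)
    return sorted(hits, key=lambda hit: hit[2])


if __name__ == "__main__":
    promoter = "TTGGGCGGAACCAATTTATTGG"  # toy sequence, not from CYP7B1
    for hit in scan_motifs(promoter):
        print(hit)
```

Exact consensus matching yields many chance hits in long sequences, which is why restricting the search to evolutionarily conserved regions is useful.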

1.6 CYP7B1 Gene

25-hydroxycholesterol 7-alpha-hydroxylase is an enzyme that is encoded by the CYP7B1 gene in humans (Setchel et al., 1998). This gene encodes a member of the cytochrome P450 superfamily of enzymes (family 7, subfamily B, polypeptide 1). Cytochrome P450 enzymes catalyse many reactions, including drug metabolism and the synthesis of cholesterol, steroids and other lipids.

Stapleton first discovered CYP7B in a differential screen of transcripts expressed in a rat hippocampal complementary DNA (cDNA) library versus the remainder of the brain. CYP7B catalyses the 7-alpha-hydroxylation of oxysterols and 3-beta-hydroxysteroids, including dehydroepiandrosterone (DHEA) (Rose et al., 1997), a major adrenal steroid (Stapleton et al., 1995). The CYP7B1 gene has been demonstrated by reporter tagging to be expressed particularly strongly in the brain, liver, spleen, kidney and heart (Rose et al., 2001).


This endoplasmic reticulum membrane protein catalyses the first reaction in a cholesterol catabolic pathway that converts cholesterol to bile acids. It plays only a minor role in total bile acid synthesis, but may also be involved in the development of atherosclerosis, neurosteroid metabolism and sex hormone synthesis (NCBI RefSeq, 2008).

Figure 3: Cytogenetic location of the CYP7B1 gene.

The CYP7B1 gene is located on the long (q) arm of chromosome 8 at position 21.3, from base pair 64,595,971 to base pair 64,798,790 (National Library of Medicine, 2014).

1.7 Objectives

1. Retrieve the gene sequences for each of the ten species (armadillo, cat, cow, dog, dolphin, elephant, guinea pig, horse, mouse and rabbit), with human as the reference sequence, and save them in FASTA format.

2. Perform multiple sequence alignment using, e.g., Mulan, Clustal Omega and/or mVISTA.

3. Identify the ECRs and analyse these regions for conserved transcription factor binding sites (TFBS).

4. Review the data and identify TFBS in regions present in some or all species (e.g. mammals).

5. Summarise the results.

6. Analyse the presumptive sites and try to assign meaning to the possible regulatory networks.


2.0 Materials and Methods

2.1 Methodology Flow Chart

A flow chart of the methods involved in retrieving and analysing the sequence of the CYP7B1 gene from the ENSEMBL database.

1. Retrieve sequence from ENSEMBL
2. Save sequence in FASTA format
3. Organise saved files into folders
4. Perform multiple sequence alignment using Mulan
5. View dynamic overlay
6. Open multiTF within Mulan
7. Tabulate multiTF data
8. Perform exon prediction using NCBI's Spidey
9. Perform promoter prediction using FPROM
10. Analyse data results


2.2 Materials

2.2.1 Software

Since comparative genomics is a computer-based discipline, online programs were used to retrieve the sequences and compare them across species. These include:

ENSEMBL Genome Browser (http://www.ensembl.org/index.html) is a scientific project launched jointly by the European Bioinformatics Institute and the Wellcome Trust Sanger Institute in 1999. It is mostly used to locate individual genes and describe the relationships between them.

BLAST (http://blast.ncbi.nlm.nih.gov/Blast.cgi) is used to compare primary biological sequence information, such as the amino acid sequences of different proteins or the nucleotides of DNA sequences. It is by far the most widely used technique for detecting similarity between sequences of interest (Altschul et al., 1990).

Clustal Omega (https://www.ebi.ac.uk/Tools/msa/clustalo/) is a multiple sequence alignment tool that uses seeded guide trees and hidden Markov model (HMM) profile-profile techniques to generate alignments. It can align virtually any number of protein sequences quickly, scaling to hundreds of thousands of sequences, and delivers accurate alignments (Sievers et al., 2011).

MULAN Multiple Sequence Local Alignment (http://mulan.dcode.org) employs two alignment strategies, which allow comparative analysis of multiple sequences that are in either draft or finished form.

mVISTA (http://genome.lbl.gov/vista/mvista/submit.shtml) is a set of programs for comparing DNA sequences up to megabases long from two or more species and visualising the alignments with annotation information (Frazer et al., 2004).


multiTF (http://multitf.dcode.org) identifies transcription factor binding sites (TFBS) that are conserved across the multiple species included in the alignment. It is dynamically interconnected with Mulan.

Spidey (http://www.ncbi.nlm.nih.gov/spidey/) is a messenger RNA (mRNA)-to-genomic alignment program. Sarah Wheelan created the program, relying heavily on the alignment manager to easily manage and quickly access alignments and sets of alignments.

FPROM Human Promoter Prediction (http://linux1.softberry.com/berry.phtml?topic=fprom&group=programs&subgroup=promoter) is an online program that identifies promoter regions and regulatory sites (Solovyev et al., 2010).

Promoter 2.0 Prediction Server (http://www.cbs.dtu.dk/services/Promoter/) is an online bioinformatics tool used to predict transcription start sites in DNA sequences. It was developed by evolving simulated transcription factors that interact with sequences in promoter regions.

2.3 Methods

2.3.1 Retrieval of Sequences

The gene, CYP7B1, was entered into the human genome database at http://www.ensembl.org/index.html and the sequences were retrieved. Each human gene name and international symbol was retrieved from the HUGO Gene Nomenclature Committee (HGNC) website (http://www.genenames.org). It is necessary to provide a unique symbol for each gene to facilitate electronic data retrieval from publications. Each symbol maintains parallel construction across the different members of a gene family and can also be used in other species.


ENSEMBL Accession No.   Linnaean Classification   Common Name
ENST00000310193         Homo sapiens              Human
ENSDNOT00000004738      Dasypus novemcinctus      Armadillo
ENSFCAT00000010214      Felis catus               Cat
ENSBTAT00000001710      Bos taurus                Cow
ENSCAFT00000011608      Canis lupus familiaris    Dog
ENSTTRT00000001843      Tursiops truncatus        Dolphin
ENSLAFT00000012755      Loxodonta africana        Elephant
ENSCPOT00000004704      Cavia porcellus           Guinea Pig
ENSECAT00000007698      Equus caballus            Horse
ENSMUST00000035625      Mus musculus              Mouse
ENSOCUT00000000440      Oryctolagus cuniculus     Rabbit

Table 1: List of species to be analysed and gene sequences to be retrieved.

A detailed procedure for retrieving the sequences, illustrated with screenshots, follows:

Figure 4: ENSEMBL homepage.

The gene of interest, CYP7B1, was entered into the search box with ‘Human’ selected as the species, before selecting ‘Go’.

Figure 5: Search results for the CYP7B1 gene.

On the left-hand side, the category is restricted to ‘Transcript’.

Figure 6: Results for the CYP7B1 transcript.

The ENSEMBL human transcript of interest (CYP7B1-001) is then selected.

Figure 7: Summary of the CYP7B1-001 transcript.

The option to export the data is found on the left-hand side.

Figure 8: Choices of different configurations for data extraction.

FASTA sequence is chosen as the output format; the resulting file is then used to input sequences into the different online software. The upstream (5’) and downstream (3’) flanking sequence is restricted to 5,000 bases each, and the cDNA option is ticked for use in exon and promoter prediction.


Figure 9: FASTA sequence of the human CYP7B1 gene.

This sequence is then saved in FASTA format as a plain .txt file and placed in a folder for easy access. FASTA format is a text format in which each record begins with a single-line description. A ‘>’ must appear in the first column; the rest of the title line is arbitrary but should be informative (Lesk, 2008).
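The FASTA layout described above can be illustrated with a minimal Python sketch. This is not part of the workflow used in the dissertation, which relied on saving files from the ENSEMBL web interface; the function name `parse_fasta` and the two example records are hypothetical and contain made-up sequence, not real CYP7B1 data.

```python
from typing import Dict, Iterable


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA-formatted lines into a {description: sequence} mapping."""
    records: Dict[str, str] = {}
    header = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A '>' in the first column starts a new record; the rest of
            # the line is the free-text description.
            header = line[1:].strip()
            if header in records:
                raise ValueError(f"duplicate description line: {header}")
            records[header] = ""
        elif header is None:
            raise ValueError("sequence data found before any '>' line")
        else:
            # Sequence may be wrapped over several lines; join them.
            records[header] += line.upper()
    return records


# Hypothetical two-record example (illustrative sequence only).
example = """>human_example illustrative record
ACGTAC
GTTA
>mouse_example illustrative record
acgtcc
""".splitlines()

records = parse_fasta(example)
for description, sequence in records.items():
    print(description, len(sequence), "bases")
```

Keeping the description line as the dictionary key mirrors the point made above that the title line is arbitrary but should be informative enough to identify the record.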

Figure 10: The sequences of the species to be analysed, organised into folders.

Each sequence retrieved was saved in FASTA format, with each file labelled with the biological classification of its species, and then placed in its corresponding folder.
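The retrieval in this section was carried out by hand through the ENSEMBL web pages. As a hedged alternative sketch only, the same kind of FASTA record could in principle be requested from the public Ensembl REST service. The assumptions here are that the `sequence/id` endpoint with the `type`, `expand_5prime` and `expand_3prime` parameters behaves as documented by Ensembl, and that the 2014 transcript accessions from Table 1 are still current, which may not be the case. The function names are hypothetical, and `fetch_fasta` needs network access, so it is defined but not called here.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumption: the public Ensembl REST server; not used in the dissertation.
REST_SERVER = "https://rest.ensembl.org"


def build_sequence_url(stable_id: str, flank: int = 5000) -> str:
    """Build a REST URL for a genomic sequence with 5' and 3' flanks."""
    params = urlencode(
        {"type": "genomic", "expand_5prime": flank, "expand_3prime": flank}
    )
    return f"{REST_SERVER}/sequence/id/{stable_id}?{params}"


def fetch_fasta(stable_id: str, flank: int = 5000) -> str:
    """Download one record as FASTA text (requires network access)."""
    request = Request(
        build_sequence_url(stable_id, flank),
        headers={"Content-Type": "text/x-fasta"},  # ask for FASTA output
    )
    with urlopen(request, timeout=60) as response:
        return response.read().decode("utf-8")


# Human transcript accession from Table 1; the 5,000-base flank matches
# the limit chosen in the export step described above.
print(build_sequence_url("ENST00000310193"))
```

Separating URL construction from the download keeps the part that can be checked offline apart from the part that depends on the remote service.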


2.3.2 Generating Alignments

Figure 11: Mulan homepage, which can be accessed at http://mulan.dcode.org.

Once all the sequences have been saved in FASTA format, they are inputted to Mulan for multiple sequence alignment. On the homepage, the number of species under investigation (e.g. 11) is chosen.

Figure 12: Input page for the sequences to be submitted.

Each sequence can be pasted in FASTA format, uploaded as a FASTA file or entered as an accession number, with any available annotation supplied on the right-hand side, before hitting the ‘submit’ button.

Figure 13: Summary page of results for the aligned sequences.

A completed alignment request results in a ‘summary page’, which provides links to the interactive dynamic visualisation tool, pairwise dynamic plots, dot-plots, annotation files, sequence files and a portal to the transcription factor binding site analysis tool, multiTF.

Figure 14: Dynamic visualisation profile in standard stacked-pairwise structure.

The dynamic visualisation gives a graphical representation of the relationship between the species inputted, with evolutionarily conserved regions indicated in red. A legend on the left-hand side highlights intronic regions in pink, coding regions in blue, untranslated regions in yellow and repeat regions in green.


Figure 15: Dynamic visualisation profile in the ‘color density by interspecies conservation’ configuration.

Under the visualisation options, ‘color density by interspecies conservation’ can be selected. This illustrates the relationship between a conserved element and the number of species that share a particular region (Loots & Ovcharenko, 2007).

Figure 16: Summary of conservation of the sequences inputted.

‘Summary conservation’ can also be selected as the visualisation option. It collects the shared similarities from all the pairwise comparisons into a single conservation profile.

Figure 17: Phylogenetic tree of the sequences submitted.

There is an option to view the phylogenetic tree of the sequences in question. It describes the evolutionary relationships between the human, dolphin, armadillo, mouse, horse, guinea pig, rabbit, cat, elephant, cow and dog sequences of the CYP7B1 gene. Each tree branch is labelled with the estimated number of substitutions per 1 kb of sequence.

Figure 18: multiTF homepage within Mulan.

The Mulan alignment can be submitted to multiTF, a tool used for the identification of conserved TFBS. After submitting the alignment to multiTF, the transcription factors to be investigated can be selected.


Figure 19: Results summary of the multiTF profile.

A results readout page is then displayed. This page can be used to navigate a summary of the multi-conserved sites.

Figure 20: zPicture homepage, where the sequences are submitted.

zPicture can be used as an alternative for identifying ECRs. It is highly flexible, allowing users to differentially predict ECRs. All eleven sequences are inputted into their corresponding boxes.

Figure 21: Results page for the sequences compared.

Options for viewing the results include dynamic visualisation, pairwise visualisation and dot-plots.

Figure 22: Dynamic visualisation profile of the ECRs of the species of interest.

This is a graphical representation of the different species added. Ultraconserved regions are displayed in red. Annotations are highlighted in pink for introns, blue for known coding exons, yellow for untranslated regions and green for repeats.


Figure 23: Spidey homepage.

Spidey is used to predict the exon regions in a DNA sequence. Each species’ cDNA and genomic DNA are inputted into their corresponding boxes before hitting ‘align’. All eleven species were inputted and the exons of each species were predicted.

Figure 24: Summary results for the sequence inputted.

The summary page shows the predicted exon regions for each species. It gives the genomic coordinates of the exons as well as their mRNA coordinates and the actual length of each exon.

Figure 25: FPROM homepage.

FPROM is an online tool used to predict the sites of promoters in a gene sequence. The genomic DNA sequence is inputted before hitting ‘proceed’.

Figure 26: Summary page of results showing the positions of the predicted promoter regions.

The summary page shows the positions of the predicted promoters in the sequence as well as the positions of the TATA boxes. The promoter regions and TATA boxes are expected within the first 5,000 bases of the sequence and, therefore, results beyond 5,000 bases are generally excluded.


After retrieving all the required sequences for each gene, the results of the alignment studies were tabulated to make the output easier to analyse and summarise. The first result created was a physical map of the overall gene for each species, including the predicted exons and promoter regions, drawn in Microsoft PowerPoint. All other results are pasted directly from the relevant site’s page.

3.0 Results

3.1 Generating and visualising DNA alignments: Mulan

The genome contains biologically functional elements that have mutated at a slower rate than the neutrally evolving genomic background. Therefore, comparative sequence analysis across species that identifies ECRs facilitates the prediction of functional regions (Loots & Ovcharenko, 2005). A widely employed technique is to represent sequence conservation profiles graphically in reference to a base DNA sequence, which is laid out linearly along the horizontal x-axis, while the vertical axis displays the percent identity with the secondary sequence (50%, 75% and 100%) (Schwartz et al., 2000). Regions of conservation appear as peaks, and evolutionary thresholds can be defined to highlight ECRs of user-defined minimal percent identity and length. In practice, a 100 bp/70% identity threshold provides high sensitivity for analysing conservation profiles between human and other species.

Multiple-sequence comparative analysis is a challenging task, both in generating highly reliable alignments and in graphically displaying the alignment results. To address the complexity stemming from user input that potentially consists of a large number of sequences of varying lengths and different phylogenetic relationships, a set of different visualisation options is applicable to any finished multiple-sequence local alignment. For example, the reference sequence can be dynamically changed, and the new stacking order of the conservation profiles for the rest of the species will be automatically determined from the evolutionary relationship of each sequence to the reference sequence, with the most closely related species at the bottom.
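The 100 bp/70% identity threshold mentioned in this section can be illustrated with a simplified sliding-window sketch. This is an assumption-laden illustration rather than Mulan's actual algorithm: it takes two sequences that have already been aligned to the same length (gaps written as "-"), and the function name `find_ecrs` and the toy sequences are hypothetical.

```python
from typing import List, Tuple


def find_ecrs(
    ref: str, other: str, window: int = 100, min_identity_pct: int = 70
) -> List[Tuple[int, int]]:
    """Return half-open (start, end) regions covered by at least one
    window of `window` aligned columns with >= min_identity_pct identity."""
    if len(ref) != len(other):
        raise ValueError("aligned sequences must have equal length")
    n = len(ref)
    if n < window:
        return []
    # 1 where the aligned bases match (gap columns never count as matches).
    matches = [
        1 if a == b and a != "-" else 0
        for a, b in zip(ref.upper(), other.upper())
    ]
    covered = [False] * n
    count = sum(matches[:window])
    for start in range(n - window + 1):
        if start > 0:
            # Slide the window by one column.
            count += matches[start + window - 1] - matches[start - 1]
        if count * 100 >= min_identity_pct * window:
            for i in range(start, start + window):
                covered[i] = True
    # Merge consecutive covered columns into regions.
    regions: List[Tuple[int, int]] = []
    region_start = None
    for i, flag in enumerate(covered):
        if flag and region_start is None:
            region_start = i
        elif not flag and region_start is not None:
            regions.append((region_start, i))
            region_start = None
    if region_start is not None:
        regions.append((region_start, n))
    return regions


# Toy alignment: 100 identical columns flanked by 50 mismatching columns.
ref = "A" * 50 + "C" * 100 + "G" * 50
other = "T" * 50 + "C" * 100 + "T" * 50
print(find_ecrs(ref, other))  # [(20, 180)]
```

In the toy example the reported region is wider than the 100 identical columns, because any 100-column window with at least 70 matches qualifies. This shows why a percent-identity threshold gives approximate rather than exact element boundaries.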


In Figure 27, the graph represents a dynamic visualisation profile showing the similar conservation profiles for each of the species. There are “ghosting” regions, which appear blank; these regions may still have a function, but one that is not shared with the conserved regions found among all the species. The ECRs represent a relationship between all of the species in query, which are presumed to share the same function at these regions. These ECRs determine the locations of the different TFBS predicted using multiTF.

The results shown in Figures 28-31 exhibit a dynamic visualisation profile in ‘color density by interspecies conservation’ mode, which illustrates the relationship between the colour density of a conserved element and the number of species that share a particular region. It shows the full gene sequence of the species investigated. Similarities exist throughout the sequence, but the regions concerned potentially have different functions. The colour intensity of a conserved region depends on the number of different species that contain the region (the darker the colour, the more species in which it is conserved). From the graph, this conservation is shared among all of the mammals. This analysis is performed for every pixel-wide region of the conservation plot: the number of ECRs from different species that overlap a particular pixel counts towards the number of species sharing this region. In a recent study, it was observed that regions conserved in multiple species often correlate with functional elements (Frazer et al., 2004). Therefore, the colour density of the plot can potentially highlight different DNA segments in the base sequence with unique evolutionary character.

Two additional data representation modules are implemented in the Mulan tool: phylogenetic shadowing and ‘summary of conservation’. While ‘summary of conservation’ collects all the shared nucleotide similarities from all the pairwise comparisons into a single conservation profile, the phylogenetic shadowing option effectively collects all the cumulative nucleotide matches (Ovcharenko et al., 2004a). The phylogenetic shadowing display accurately depicts the coding exon as the most highly conserved element (Figure 33). In addition, the identified ECR sharply defines the exon boundaries without any a priori knowledge of its location.
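The counting behind the ‘color density by interspecies conservation’ view can be sketched as follows. This is a simplified illustration working per base of the reference sequence rather than per pixel, and it is not Mulan's own code; the function name `species_depth` and the ECR intervals are hypothetical.

```python
from typing import Dict, List, Tuple


def species_depth(
    length: int, ecrs_by_species: Dict[str, List[Tuple[int, int]]]
) -> List[int]:
    """For each reference position, count how many species have an ECR
    (half-open interval) overlapping that position."""
    depth = [0] * length
    for intervals in ecrs_by_species.values():
        hit = [False] * length
        for start, end in intervals:
            for i in range(max(start, 0), min(end, length)):
                hit[i] = True
        # Each species contributes at most once per position, even if
        # several of its ECRs overlap there.
        for i, flag in enumerate(hit):
            if flag:
                depth[i] += 1
    return depth


# Hypothetical ECR coordinates on a 10-base reference.
ecrs = {
    "mouse": [(0, 4)],
    "dog": [(2, 6)],
    "cow": [(2, 3), (8, 10)],
}
print(species_depth(10, ecrs))  # [1, 1, 3, 2, 1, 1, 0, 0, 1, 1]
```

A plotting layer would then map higher counts to darker colours, which corresponds to the "darker means shared by more species" reading described above.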


A phylogenetic tree, more commonly known as a phylogeny, is a graphical representation that depicts evolutionary relationships among a set of species (Baum, 2008).

As seen in Figure 32, the horizontal dimension gives the amount of genetic change. The lines are branches and represent evolutionary lineages changing over time (the longer the branch, the larger the amount of change). Each tree branch is labelled with the estimated number of substitutions per kilobase (kb) of sequence. Each lineage has a part of its history that is unique to it alone and parts that are shared with other lineages. For example, as shown in Figure 32, the dog and the horse each have their own unique history but also share history with other lineages, e.g. cow and dolphin. Similarly, each lineage has ancestors that are unique to that lineage and ancestors that are shared with other lineages.

Phylogenetic trees provide an efficient structure for organising knowledge of biodiversity and allow one to develop an accurate, non-progressive conception of the totality of evolutionary history (Baum, 2008).

Figure 27: Mulan conservation analysis for the human CYP7B1 gene from 159.5 kb to 212.6 kb compared with guinea pig, elephant, cat, rabbit, mouse, cow, dog, armadillo, horse and dolphin. The conservation profile depicts the differential prediction of the non-coding ECRs in different species (the legend on the left describes the colouring of the different types of element).

Figure 28: Mulan conservation analysis for the human CYP7B1 gene from 0 kb to 53.2 kb compared with guinea pig, elephant, cat, rabbit, mouse, cow, dog, armadillo, horse and dolphin.

Figure 29: Mulan conservation analysis for the human CYP7B1 gene from 53.2 kb to 106.3 kb compared with guinea pig, elephant, cat, rabbit, mouse, cow, dog, armadillo, horse and dolphin.

Figure 30: Mulan conservation analysis for the human CYP7B1 gene from 106.3 kb to 159.5 kb compared with guinea pig, elephant, cat, rabbit, mouse, cow, dog, armadillo, horse and dolphin.

Figure 31: Mulan conservation analysis for the human CYP7B1 gene from 159.9 kb to 212.6 kb compared with guinea pig, elephant, cat, rabbit, mouse, cow, dog, armadillo, horse and dolphin.

Figure 32: Phylogenetic tree representing the evolutionary history of the CYP7B1 gene in different vertebrate lineages (the numbers correspond to the number of nucleotide mismatches per kb).


3.2 Phylogenetic shadowing implemented in Mulan

Phylogenetic shadowing has emerged as a strategy for deciphering putative regulatory elements in comparisons of closely related species. It compares many closely related sequences simultaneously, combining the mutations from all the sequences into a single conservation profile (Loots & Ovcharenko, 2005). Phylogenetic shadowing as implemented in Mulan provides easy sequence and annotation input through several routes, combined with a fast and dynamic visualisation interface. It is best used for the analysis of large sequence intervals with previously established or known conservation detection parameters.

The results shown in Figure 33 support the locations of the TFBS found using multiTF, which are shown in Figure 35. Most of the TFBS found using multiTF lie within 184 kb to 199 kb. It is important to keep in mind, though, that not all TFBS can be found using phylogenetic shadowing, owing to the nature of the technique, and that false positives can occur (Blanchette & Tompa, 2002).

Figure 33: Phylogenetic shadowing profile of the comparison between human and the 10 other species (100 bp/70% identity threshold).


3.3 Detection of evolutionarily conserved TFBS: multiTF

The complexity of transcriptional regulation in vertebrates is achieved through the combinatorial and synchronised binding of different transcription factors to gene regulatory elements, or cis-regulatory modules (CRMs). These CRMs contain a specific footprint consisting of several TFBS (Loots & Ovcharenko, 2005). CRMs are usually several hundred base pairs in length and stand out from the neighbouring genomic sequence as well-conserved regions. Their function can be inferred computationally only from functions that have been associated with known TFBS patterns present in the CRM.

Predicting functional TFBS is very challenging because binding sites are very short (usually ranging from 6 to 12 bp). Consequently, matching sequences occur at high frequency across a genome and result in an overabundance of false-positive predictions. Nevertheless, the ability to accurately predict functional TFBS is a powerful approach for the sequence-based discovery of gene regulatory sequences.
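The link between short binding sites and false positives can be made concrete with a back-of-the-envelope calculation. The sketch below is an illustration under strong simplifying assumptions: the four bases are equally likely and independent, the motif is an exact (non-degenerate) sequence, and both strands are searched. Real TFBS models are position weight matrices that tolerate mismatches, so chance matches are generally more frequent than this estimate. The function name is hypothetical; the 212,627 bp length is the human sequence length reported in Table 3.

```python
def expected_random_hits(
    seq_len: int, motif_len: int, both_strands: bool = True
) -> float:
    """Expected number of chance occurrences of one exact motif of
    length motif_len in a random sequence of length seq_len."""
    if motif_len <= 0 or motif_len > seq_len:
        raise ValueError("motif length must be between 1 and seq_len")
    positions = seq_len - motif_len + 1      # possible start positions
    strands = 2 if both_strands else 1
    return strands * positions / 4 ** motif_len


HUMAN_SEQUENCE_LENGTH_BP = 212_627  # from Table 3

for k in (6, 8, 10, 12):
    hits = expected_random_hits(HUMAN_SEQUENCE_LENGTH_BP, k)
    print(f"{k:>2} bp motif: about {hits:.3g} chance matches")
```

Under these assumptions a 6 bp motif is expected to match by chance on the order of a hundred times in a sequence of this size, whereas a 12 bp motif is expected far less than once. This is one way to see why requiring conservation across species is useful for filtering predictions.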

Mulan is integrated with multiTF, which operates on multiple-sequence alignments, benefits from extensive sampling of the phylogeny and searches for TFBS that are represented in all the species.

In this report, multiTF is used to analyse the CYP7B1 gene for any TFBS known to enhance the expression of this gene across the 10 species in query, with the human gene as the reference sequence. Ten TFBS were identified as well-defined clusters towards the end region (184 kb to 199 kb) of the CYP7B1 gene (Figure 34). These are illustrated as coloured tick marks above the conservation profile. The potential locations of each of these predicted TFBS are highlighted in Figure 35, with * indicating high similarity between species.

These data suggest that by analysing TFBS patterns in multiple-sequence alignments, one can dramatically filter out sites that have diverged throughout evolution and select for sites that are most likely functional.

Figure 34: multiTF identification of conserved TFBS of the CYP7B1 gene. The identified TFBS are depicted as coloured tick marks above the conservation profile.

Figure 35: The potential transcription factor binding sites indicated within the CYP7B1 gene.


3.4 Exon and promoter region using Spidey and Fprom

Expressed sequences are the key to the inner workings of an organism. To understand fully the function of an expressed sequence, however, it needs to be put in its genomic context. Alignment of expressed sequences to their parent genomic sequences can be used to find or confirm a gene’s position, and to locate potential regulatory elements and alternative splicing. With estimates of only about 30,000 human genes, alternative splicing may be an important factor in generating transcriptional diversity, so mRNA-to-genomic alignments will be crucial to our understanding of the genome.

Species                  No. of exons predicted   No. of exons correct
Homo sapiens             5                        6
Dasypus novemcinctus     5                        6
Felis catus              6                        6
Bos taurus               5                        7
Canis lupus familiaris   5                        6
Tursiops truncatus       5                        6
Loxodonta africana       7                        7
Cavia porcellus          6                        8
Equus caballus           6                        7
Mus musculus             5                        6
Oryctolagus cuniculus    6                        6

Table 2: Number of exons predicted by Spidey vs. the correct number of exons taken from Ensembl.

The difference between the number of exons predicted and the correct number could arise because there are no good splice sites at either the donor or the acceptor splice junctions, so that Spidey could not place those junctions unambiguously.

For all 11 species, at least 83% of the mRNA aligned to the genomic sequence, with an overall identity of 100% for most. If the 3’ end of the mRNA does not align completely, it is first examined for the presence of a poly(A) tail. Of the 11 species in query, the armadillo indicates the possibility of a pseudogene (a dysfunctional relative of a gene that has lost its protein-coding ability or is otherwise no longer expressed in the cell), since a non-aligning poly(A) tail (‘non-aligning poly(A) tail: 9’ in the Spidey output) was found.

The exact locations of the exons of each species are given in Appendix I, which also shows the overall percent identity, the percent coverage of the mRNA and the presence of an aligning or non-aligning poly(A) tail.

Species                  Length (bp)   Predicted TSS   Predicted TATA box   TATA sequence
Homo sapiens             212,627       3,925           3,894                TATATATG
Dasypus novemcinctus     194,050       4,711           4,681                TTTAAAAG
Felis catus              39,628        5,505           5,477                TATAAGTA
Bos taurus               181,409       1,405           1,375                AATAAAAG
Canis lupus familiaris   178,270       5,368           5,343                TATTAAAG
Tursiops truncatus       187,667       1,530           1,500                AATATATC
Loxodonta africana       40,375        3,371           3,341                TATAAAAA
Cavia porcellus          38,224        1,570           1,530                TATATAAT
Equus caballus           209,282       4,469           4,439                AATAAAAG
Mus musculus             181,389       3,968           3,939                TATAAAAA
Oryctolagus cuniculus    46,718        -               -                    -

Table 3: Fprom predictions of TSS and TATA box.

Fprom (find promoter) can be used to identify transcription start sites (TSS) upstream of the annotated coding parts of genes found by gene prediction software. According to Softberry (the developers of Fprom), at a true promoter recognition level of approximately 50-55%, the Fprom program gives about one false positive prediction per 4,000 bp.
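The quoted false positive rate can be turned into a rough expectation for each sequence. The sketch below simply divides each sequence length from Table 3 by 4,000 bp and sets the result beside the number of promoters reported for that species in the model results later in this section. It assumes, for illustration only, that those counts refer to the same sequences whose lengths appear in Table 3 and that the quoted rate applies uniformly. The variable and function names are hypothetical.

```python
# Sequence lengths (bp) copied from Table 3.
SEQUENCE_LENGTHS_BP = {
    "Homo sapiens": 212_627,
    "Dasypus novemcinctus": 194_050,
    "Felis catus": 39_628,
    "Bos taurus": 181_409,
    "Canis lupus familiaris": 178_270,
    "Tursiops truncatus": 187_667,
    "Loxodonta africana": 40_375,
    "Cavia porcellus": 38_224,
    "Equus caballus": 209_282,
    "Mus musculus": 181_389,
    "Oryctolagus cuniculus": 46_718,
}

# Numbers of predicted promoters reported per species in this section.
PREDICTED_PROMOTERS = {
    "Homo sapiens": 49,
    "Dasypus novemcinctus": 69,
    "Felis catus": 14,
    "Bos taurus": 58,
    "Canis lupus familiaris": 51,
    "Tursiops truncatus": 92,
    "Loxodonta africana": 10,
    "Cavia porcellus": 9,
    "Equus caballus": 77,
    "Mus musculus": 59,
    "Oryctolagus cuniculus": 13,
}

FALSE_POSITIVE_SPACING_BP = 4000  # about one false positive per 4,000 bp


def expected_false_positives(
    length_bp: int, spacing_bp: int = FALSE_POSITIVE_SPACING_BP
) -> float:
    """Expected number of false positive predictions in length_bp bases."""
    return length_bp / spacing_bp


for species, length in SEQUENCE_LENGTHS_BP.items():
    expected = expected_false_positives(length)
    predicted = PREDICTED_PROMOTERS[species]
    print(f"{species:<24}{expected:6.1f} expected by chance, {predicted} predicted")
```

For the human sequence, for instance, the estimate is about 53 chance predictions against 49 promoters reported. This is consistent with treating most of the predictions outside the expected promoter region with caution.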


Examples of Fprom predictions are presented in Table 3. The predicted promoter and TATA box regions are narrowed down to within roughly the first 5 kb, but it is important to note that many promoters were predicted in each species; these are highlighted in the models shown on the following pages. No promoter or TATA box was found for the rabbit within the 5 kb limit.

For each position in a given sequence, the Fprom program evaluates the occurrence of a TSS using two linear discriminant functions (separate ones for TATA+ and TATA- promoters) with characteristics computed at that position. If it finds a TATA box (using a TATA-box weight matrix) in the region, it computes the value of the linear discriminant function (LDF) for TATA+ promoters; otherwise it computes the value of the LDF for TATA-less promoters.

The computational identification of promoters in genomic DNA is an extremely difficult problem (Solovyev, 2002). The task is two-fold: finding the exact position of a TSS within the long upstream region of a typical eukaryotic gene, and avoiding false positive predictions within exon and intron sequences (Solovyev et al., 2006).

Regions of DNA that signal the initiation of transcription are termed promoters and lie ‘upstream’ of the start of the actual gene. Initiation starts with molecules such as RNA polymerase II finding promoter regions upstream (towards the 5’ end of the coding strand) of a gene. These regions often contain a specific pattern of bases known as the TATA box. In eukaryotes, the start point of a gene is typically about 25 bases downstream of the TATA box.
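The spacing between the TATA box and the TSS can be checked directly against the Fprom predictions in Table 3. The short sketch below subtracts the predicted TATA box position from the predicted TSS for each species with a prediction (the rabbit has none). It assumes that both columns of Table 3 use the same coordinate system and that the TATA position marks one fixed end of the box, which the table does not state. The variable names are hypothetical.

```python
from statistics import median

# (predicted TSS, predicted TATA box position) copied from Table 3.
FPROM_PREDICTIONS = {
    "Homo sapiens": (3925, 3894),
    "Dasypus novemcinctus": (4711, 4681),
    "Felis catus": (5505, 5477),
    "Bos taurus": (1405, 1375),
    "Canis lupus familiaris": (5368, 5343),
    "Tursiops truncatus": (1530, 1500),
    "Loxodonta africana": (3371, 3341),
    "Cavia porcellus": (1570, 1530),
    "Equus caballus": (4469, 4439),
    "Mus musculus": (3968, 3939),
}

# Distance in bases from the TATA box position to the predicted TSS.
offsets = {
    species: tss - tata for species, (tss, tata) in FPROM_PREDICTIONS.items()
}
median_offset = median(offsets.values())

for species, offset in offsets.items():
    print(f"{species:<24}{offset:>3} bases")
print("median offset:", median_offset)
```

On these figures the offsets range from 25 to 40 bases with a median of 30, which is broadly in line with the spacing of about 25 bases described above.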


The models below show the results of exon and promoter prediction:

Homo sapiens (Human)

For the human sequence, 49 promoters were predicted. The transcript contains 5 exons that are 100% identical to the genomic sequence, covering 86% of the mRNA length.

Dasypus novemcinctus (Armadillo)

For the armadillo sequence, 69 promoters were predicted. The transcript contains 5 exons that are 100% identical to the genomic sequence, covering 94% of the mRNA length, with a non-aligning poly(A) tail: 9.

Felis catus (Cat)

For the cat sequence, 14 promoters were predicted. The transcript contains 6 exons that are 100% identical to the genomic sequence, covering 100% of the mRNA length.

Bos taurus (Cow)

For the cow sequence, 58 promoters were predicted. The transcript contains 5 exons that are 100% identical to the genomic sequence, covering 83% of the mRNA length.

Canis lupus familiaris (Dog)

For the dog sequence, 51 promoters were predicted. The transcript contains 5 exons that are 100% identical to the genomic sequence, covering 92% of the mRNA length.

Tursiops truncatus (Dolphin)

For the dolphin sequence, 92 promoters were predicted. The transcript contains 5 exons that are 100% identical to the genomic sequence, covering 91% of the mRNA length.

Loxodonta africana (Elephant)

For the elephant sequence, 10 promoters were predicted. The transcript contains 7 exons that are 100% identical to the genomic sequence, covering 100% of the mRNA length.

Cavia porcellus (Guinea Pig)

For the guinea pig sequence, 9 promoters were predicted. The transcript contains 6 exons that are 98.7% identical to the genomic sequence, covering 100% of the mRNA length.

Equus caballus (Horse)

For the horse sequence, 77 promoters were predicted. The transcript contains 6 exons that are 100% identical to the genomic sequence, covering 94% of the mRNA length.

Mus musculus (Mouse)

For the mouse sequence, 59 promoters were predicted. The transcript contains 5 exons that are 100% identical to the genomic sequence, covering 88% of the mRNA length.

Oryctolagus cuniculus (Rabbit)

For the rabbit sequence, 13 promoters were predicted. The transcript contains 6 exons that are 100% identical to the genomic sequence, covering 100% of the mRNA length.

All of the promoters predicted by Fprom for each species contained sequence elements within the commonly expected -10 to -35 bp region.

4.0$Discussion$

4.1$Genetics$and$sequencing$

Genes have long been known to reside on chromosomes, which are ultimately composed of protein and DNA. Since Mendel’s work in the mid-19th century, competing theories emerged as to which of the two is responsible for inheritance. These included Griffith’s experiment in 1928, which suggested that bacteria are capable of transferring genetic information through transformation (Reece et al., 2011). Sixteen years later, Oswald Avery, Colin MacLeod and Maclyn McCarty identified DNA as the carrier of genetic information in bacteria (Klug et al., 2007). In 1953, Watson and Crick determined the structure of DNA, drawing on Franklin and Wilkins’s work on X-ray crystallography. This showed that genetic information exists in the sequence of nucleotides on each strand of DNA. In the following years, scientists tried to understand how DNA controls the process of protein production. It was then discovered that the cell uses DNA as a template to create matching mRNA. The nucleotide sequence of the mRNA is used to create the amino acid sequence of a protein, and this translation between nucleotide sequences and amino acid sequences is known as the genetic code (Rice, 2009). This newfound molecular understanding of inheritance led to the development of DNA sequencing.
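The flow of information from DNA to mRNA to protein can be illustrated with a minimal Python sketch. It assumes the input is the coding (sense) strand of the DNA, so the mRNA is obtained simply by replacing T with U, and it uses the standard genetic code. The function names transcribe and translate are hypothetical and chosen for illustration only.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG order for each codon position; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_strand):
    """Return the mRNA matching a DNA coding strand (assumption: sense strand given)."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna):
    """Translate an mRNA from its first base, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    mrna = transcribe("ATGGCCTAA")
    print(mrna, translate(mrna))
```

The sketch reads a single reading frame starting at the first base; real gene prediction also has to locate the start codon and account for introns, which is outside the scope of this illustration.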

The genome of an organism contains thousands of genes, but not all of these genes need to be active at any given moment. A gene is expressed when it is transcribed into mRNA, and the cell has many ways of controlling gene expression so that proteins are produced only when needed. Regulatory proteins such as transcription factors bind to DNA to either promote or inhibit the transcription of a gene (Brivanlou & Darnell, 2002).

As the entire genomes of many different species have been sequenced, current research on gene finding has turned to a comparative genomics approach. This approach is based on the principle that natural selection causes genes and other functional elements to accumulate mutations at a slower rate than the rest of the genome, since mutations in functional elements are more likely to negatively impact the organism than mutations elsewhere. Genes can thus be detected by comparing the genomes of related species to find this evolutionary pressure for conservation.
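The principle of detecting conservation can be sketched in a few lines of Python. The example below assumes two sequences that have already been aligned to equal length (with "-" for gaps) and reports the windows whose percent identity meets a threshold. The window size, the 70% threshold and the function names are illustrative assumptions, not the parameters or algorithm of any particular tool used in this study.

```python
def window_identity(seq_a, seq_b, window=10):
    """Percent identity of each sliding window of a pairwise alignment.

    Gap characters ("-") never count as matches.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    scores = []
    for start in range(len(seq_a) - window + 1):
        pairs = zip(seq_a[start:start + window], seq_b[start:start + window])
        matches = sum(1 for x, y in pairs if x == y and x != "-")
        scores.append(100.0 * matches / window)
    return scores


def conserved_windows(seq_a, seq_b, window=10, threshold=70.0):
    """0-based start positions of windows whose identity meets the threshold."""
    scores = window_identity(seq_a, seq_b, window)
    return [i for i, score in enumerate(scores) if score >= threshold]


if __name__ == "__main__":
    # Toy alignment: the first half is conserved, the second half is not.
    print(conserved_windows("AAAAACCCCC", "AAAAAGGGGG", window=5))
```

Windows that overlap and pass the threshold could then be merged into a single conserved region; that step is omitted here to keep the sketch short.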

4.2 How cholesterol plays a role in CYP7B1

The metabolism of cholesterol to 7alpha-hydroxylated bile acids is a principal pathway of cholesterol degradation. Cholesterol 7alpha-hydroxylase (CYP7A1) is the initial and rate-determining enzyme in the “classic pathway” of bile acid synthesis. An “alternative” pathway of bile acid synthesis begins with 27-hydroxylation of cholesterol by 27-hydroxylase (CYP27), followed by CYP7B1 (Ren et al., 2003). This alternative pathway plays a minor role in total bile acid synthesis, but the regulation of CYP7B1, possibly a rate-determining enzyme in the alternative pathway, has not been thoroughly studied (Pandak et al., 2002).


Role / Tissue                               Substrates
Bile salt synthesis
    Liver                                   25-Hydroxycholesterol, 27-hydroxycholesterol
Steroid hormone metabolism
    Brain                                   Pregnenolone, dehydroepiandrosterone
Metabolism of estrogen receptor ligands
    Prostate                                5α-Androstane-3β,17β-diol
    Prostate                                Dehydroepiandrosterone (?)
    Vascular                                27-Hydroxycholesterol
Immunoglobulin production
    Immune cells                            25-Hydroxycholesterol

Table 4: Physiological roles of CYP7B1, adapted from Stiles et al. (2009).

4.3 Problems and Challenges in Bioinformatics

Bioinformatics was developed to handle and analyse the vast amounts of information generated by sequencing projects. It was anticipated that, once the human genome was sequenced, handling the sequence data would pose a major logistical problem.

Some of the earliest problems in genomics concerned how to measure the similarity of DNA and protein sequences, either within a genome or across the genomes of different species. DNA and proteins can be similar in terms of their function, their structure, or their linear sequence of nucleotides or amino acids. The key presumption for DNA is that if two DNA sequences are similar, they probably share the same function, even if they occur in different parts of the genome or across two or more genomes (Keedwall & Narayanan, 2005).
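One common way to quantify the similarity of two linear sequences is a global alignment score. The following Python sketch implements the Needleman-Wunsch recurrence and returns only the score. The scoring values (match +1, mismatch -1, gap -2) and the function name are assumptions made for illustration; tools such as BLAST use different, heuristic local-alignment methods.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment score, keeping two rows of the matrix."""
    # Row 0: aligning an empty prefix of `a` against each prefix of `b`.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # column 0: prefix of `a` against an empty string
        for j in range(1, len(b) + 1):
            diagonal = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            gap_in_b = prev[j] + gap      # a[i-1] aligned to a gap
            gap_in_a = curr[j - 1] + gap  # b[j-1] aligned to a gap
            curr.append(max(diagonal, gap_in_b, gap_in_a))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(global_alignment_score("ACGT", "ACGT"))
    print(global_alignment_score("ACGT", "AGT"))
```

A higher score indicates more similar sequences under the chosen scoring scheme; scores from different schemes or sequence lengths are not directly comparable.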

Prediction has been notoriously difficult in biology, and the role of theory in biology is very different from that in theoretical physics, where theory usually takes a leading role in research (Buehler & Rashidi, 2005). Many of the difficulties of theoretical biology are rooted in the complexity of biological systems. Many cellular components and mechanisms remain to be discovered. Genomics and proteomics provide good examples of the difficulties in predicting the behaviour of complex systems. The reasons lie in incomplete data sets and incomplete or incorrect data annotation.

4.4 Problems with online tools

The tools used in bioinformatics are those of applied mathematics and computer science. Information storage and retrieval, statistical analysis, data fitting, and computer simulation are central tasks, and today’s molecular biology would be impossible without them. Computers are essential for processing large amounts of data in a time-efficient manner, which would be impractical through manual processing. Computers, however, need instructions, and the analytical process that goes into the system is the work of the human operator and must be included in the overall time it takes to solve a problem with a numerical processor (Buehler & Rashidi, 2005). This human intervention takes time and is error-prone.

Running a comparative analysis or searching for predicted transcription start sites is not an easy task. Complications arise particularly with publicly funded online software: many of these tools have either run their course or are no longer funded by their developers, or they severely limit the length of the sequence that one can submit.

4.5 Summary of results

To understand the potential function of a novel sequence in an organism, multiple sequence alignment provides biological information through evolutionarily related genes and proteins.

Using Mulan to align sequences from multiple species allowed the ECRs shared among the species in question to be determined. A summary of these alignments confirmed the region in which the TFBS are located. Transcription factors are essential for the regulation of gene expression. Predicting the positions of the different TFBS shows where these transcription factors bind, in either the enhancer or the promoter regions of the DNA adjacent to the genes they regulate.

Determining the phylogeny showed that greater sequence homology between sequences indicates a closer evolutionary relationship among the species.

Overall, the increasing interest in ‘junk’ DNA, that is, DNA believed not to code for any protein, has over the years led scientists and researchers to question whether such sections of DNA are the remains of previously useful DNA that now has no function, or whether non-coding DNA provides a structural aid that helps stabilise chromosomes and the nucleus (Keedwall & Narayanan, 2005).

4.6 Future of comparative genomics

In the next several years, genomes from a wide variety of species covering many taxa will be sequenced, which will bring many advances in comparative genomics. The resources for comparative genomics are expected to become much more user friendly and to become part of the toolkit of virtually every experimental biologist. However, building the bioinformatics infrastructure to realise this exciting potential will require new developments (Miller et al., 2004).

The amount of biological sequence information is increasing very rapidly and appears to follow an exponential growth law. Computational methods are playing an increasing role in the biological sciences. Genome sequencing projects have been remarkably successful, and comparative analysis of whole genomes is now possible. This provides challenges and opportunities for new types of study in bioinformatics. At the same time, several types of experimental methods that may be classed as ‘high-throughput’ are currently being developed (Higgs & Attwood, 2005). These include microarrays, proteomics, and structural genomics. The philosophy behind these methods is to study large numbers of genes or proteins simultaneously, rather than to specialise in individual cases. Bioinformatics therefore has a role in developing statistical methods for the analysis of large data sets, and in developing methods of information management for the new types of data being generated.

Bibliography

Altschul, S. et al., 1990. Basic local alignment search tool. J Mol Biol, 215(3), pp.403-10.

Baum, D., 2008. Reading a Phylogenetic Tree: The Meaning of Monophyletic Groups. Nature Education, 1(1), p.190.

Blanchette, M. & Tompa, M., 2002. Discovery of Regulatory Elements by a Computational Method for Phylogenetic Footprinting. Genome Res, 12(5), pp.739-48.

Brivanlou, A.H. & Darnell, J.E., 2002. Signal Transduction and the Control of Gene Expression. Science, 295(5556), pp.813-18.

Buehler, L.K. & Rashidi, H.H., 2005. Computers in Biology and Medicine. In L.K. Buehler, ed. Bioinformatics Basics: Applications in Biological Science and Medicine. 2nd ed. Boca Raton: CRC Press.

Buhaescu, I. & Izzedine, H., 2007. Mevalonate pathway: a review of clinical and therapeutical implications. Clin Biochem, 40(9-10), pp.575-84.

Campbell, M. & Farell, S., 2012. Lipid Metabolism: Cholesterol Biosynthesis. In A. White, ed. Biochemistry. 7th ed. Belmont: Brooks/Cole. pp.613-20.

Collins, F. & Galas, D., 1993. A New Five-Year Plan for the United States: Human Genome Program. Science, 262, pp.43-46.

Collins, F., Green, E., Guttmacher, A. & Guyer, M., 2003. A Vision for the Future of Genomics Research: A Blueprint for the Genomic Era. Nature, 422(6934), pp.835-47.

Collins, F.S. et al., 1998. New goals for the U.S. Human Genome Project: 1998-2003. Science, 282(5389), pp.682-89.

Dietrich, W.F. et al., 1996. A comprehensive genetic map of the mouse genome. Nature, 380(6570), pp.149-52.

Fletcher, H. & Hickey, I., 2013. DNA Structure. In E. Owen, ed. Genetics. 4th ed. New York: Garland Science. p.2.

Frazer, K. et al., 2004. VISTA: computational tools for comparative genomics. Nucleic Acids Res, 32(Web Server Issue), pp.W273-9.

Freilich, S., Goldovsky, L., Ouzounis, C.A. & Thornton, J.M., 2008. Metabolic innovations towards the human lineage. BMC Evol Biol, 8, p.247.

Fu, Q. et al., 1998. Control of cholesterol biosynthesis in Schwann cells. J Neurochem, 71(2), pp.549-55.

Ganley, A.R. & Kobayashi, T., 2007. Phylogenetic footprinting to find functional DNA elements. Methods Mol Biol, 395, pp.367-80.

Higgs, P.G. & Attwood, T.K., 2005. Introduction: the revolution in biological information. In P.G. Higgs, ed. Bioinformatics and Molecular Evolution. Oxford: Blackwell Science Ltd.

Horton, J., Goldstein, J. & Brown, M., 2002. SREBPs: activators of the complete program of cholesterol and fatty acid synthesis in the liver. J Clin Invest, 109(9), pp.1125-31.

Howard Hughes Medical Institute, 2001. Species: Comparing their Genome. [Online] Available at: http://www.actionbioscience.org/genomics/hhmi.html [Accessed 6 April 2014].

Keedwall, E. & Narayanan, A., 2005. Introduction to Problems and Challenges in Bioinformatics. In E. Keedwall, ed. Intelligent Bioinformatics. West Sussex: John Wiley & Sons Ltd. pp.31-49.

Kent, W., 2002. BLAT - The BLAST-Like Alignment Tool. Genome Res, 12, pp.656-64.

Klug, W.S., Cummings, M.R. & Spencer, C.A., 2007. Introduction to Genetics. In G. Carlson, ed. Essentials of Genetics. 6th ed. New Jersey: Pearson Prentice Hall. p.4.

Korf, B.R., 2007. The Human Genome. In M. Sugden, ed. Human Genetics and Genomics. 3rd ed. Oxford: Blackwell Publishing Ltd. pp.77-80.

Le Roy, C. & Wrana, J.L., 2005. Clathrin- and non-clathrin-mediated endocytic regulation of cell signalling. Nat Rev Mol Cell Biol, 6, pp.112-26.

Lesk, A.M., 2008. Genome organization and evolution. In A.M. Lesk, ed. Introduction to Bioinformatics. 3rd ed. New York: Oxford University Press Inc. p.104.

Lewington, S. et al., 2007. Blood cholesterol and vascular mortality by age, sex, and blood pressure: a meta-analysis of individual data from 61 prospective studies with 55 000 vascular deaths. The Lancet, 370(9602), pp.1829-39.

Lieberman, M. & Ricer, R., 2013. BRS Biochemistry, Molecular Biology and Genetics. 6th ed. Philadelphia: Lippincott Williams & Wilkins.

Loots, G.G. & Ovcharenko, I., 2005. Dcode.org anthology of comparative genomic tools. Nucleic Acids Res, 33(Web server issue), pp.W56-64.

Loots, G.G. & Ovcharenko, I., 2007. Mulan Multiple-Sequence Alignment to Predict Functional Elements in Genomic Sequences. Methods Mol Biol, 395, pp.237-54.

Lu, H., 2011. Application of comparative genomics for the detection of genomic features and transcriptional regulatory elements. Graduate Theses and Dissertations, Paper 12151.

Maxfield, F. & van Meer, G., 2010. Cholesterol, the central lipid of mammalian cells. Curr Opin Cell Biol, 22(4), pp.422-29.

McVean, G.A., 2012. An integrated map of genetic variation from 1,092 human genomes. Nature, 491(7422), pp.56-65.

Miller, W., Makova, K.D., Nekrutenko, A. & Hardison, R.C., 2004. Comparative Genomics. Annu Rev Genomics Hum Genet, 5, pp.15-56.

Mo, H. & Elson, C., 2004. Studies of the isoprenoid-mediated inhibition of mevalonate synthesis applied to cancer chemotherapy and chemoprevention. Exp Biol Med, 229(7), pp.567-85.

National Library of Medicine, 2014. CYP7B1 - cytochrome P450, family 7, subfamily B, polypeptide 1. [Online] Available at: http://ghr.nlm.nih.gov/gene/CYP7B1 [Accessed 13 April 2014].

NCBI RefSeq, 2008. CYP7B1 cytochrome P450, family 7, subfamily B, polypeptide 1 [Homo sapiens (human)]. [Online] Available at: http://www.ncbi.nlm.nih.gov/gene?Db=gene&Cmd=ShowDetailView&TermToSearch=9420 [Accessed 13 April 2014].

Nelson, D.L. & Cox, M.M., 2008. In K. Ahr, ed. Lehninger's Principles of Biochemistry. 5th ed. New York: W.H. Freeman and Company. pp.831-45.

Ohyama, Y. et al., 2006. Studies on the transcriptional regulation of cholesterol 24-hydroxylase (CYP46A1): Marked insensitivity towards different regulatory axes. J Biol Chem, 281(7), pp.3810-20.

Ovcharenko, I. et al., 2005. Mulan: Multiple-sequence local alignment and visualization for studying function and evolution. Genome Res, 15(1), pp.184-94.

Ovcharenko, I. et al., 2004. zPicture: dynamic alignment and visualization tool for analyzing conservation profiles. Genome Res, 14(3), pp.472-77.

Pandak, W. et al., 2002. Regulation of oxysterol 7alpha-hydroxylase (CYP7B1) in primary cultures of rat hepatocytes. Hepatology, 35(6), pp.1400-8.

Parker, S. et al., 2009. Local DNA Topography Correlates with Functional Noncoding Regions of the Human Genome. Science, 324(5925), pp.389-92.

Pennisi, E., 2007. Breakthrough of the year. Human genetic variation. Science, 318(5858), pp.1842-43.

Pennisi, E., 2013. The CRISPR Craze. Science, 341(6148), pp.833-36.

Reece, J.B. et al., 2011. Genomes and Their Evolution: New approaches have accelerated the pace of genome sequencing. In B. Wilbur, ed. Campbell Biology. 9th ed. San Francisco: Pearson Education, Inc. pp.473-74.

Reece, J.B. et al., 2011. The Molecular Basis of Inheritance: DNA is the Genetic Material. In Campbell Biology. 9th ed. San Francisco: Pearson Education, Inc. pp.351-56.

Reece, J.B. et al., 2011. The Structure and Function of Large Biological Molecules: Nucleic acids store, transmit, and help express hereditary information. In B. Wilbur, ed. Campbell Biology. 9th ed. San Francisco: Pearson Education Inc. p.133.

Reed, B. et al., 2008. Genome-Wide Occupancy of SREBP1 and Its Partners NFY and SP1 Reveals Novel Functional Roles and Combinatorial Regulation of Distinct Classes of Genes. PLoS Genet, 4(7).

Ren, S. et al., 2003. Regulation of oxysterol 7alpha-hydroxylase (CYP7B1) in the rat. Metabolism, 52(5), pp.636-42.

Rice, S.A., 2009. DNA (raw material of evolution). In S.A. Rice, ed. Encyclopedia of Evolution. New York: Infobase Publishing. p.134.

Rosanoff, A. & Seelig, M.S., 2004. Comparison of Mechanism and Functional Effects of Magnesium and Statin Pharmaceuticals. J Am Coll Nutr, 23(5), pp.501-05.

Rose, K. et al., 2001. Neurosteroid Hydroxylase CYP7B vivid reporter activity in dentate gyrus of gene-targeted mice and abolition of a widespread pathway of steroid and oxysterol hydroxylation. Journal of Biological Chemistry, 276, pp.23937-44.

Rose, K.A. et al., 1997. Cyp7b, a novel brain cytochrome P450, catalyzes the synthesis of neurosteroids 7alpha-hydroxy dehydroepiandrosterone and 7alpha-hydroxy pregnenolone. Proceedings of the National Academy of Sciences of the United States of America, 94(10), pp.4925-30.

Saher, G. et al., 2005. High cholesterol level is essential for myelin membrane growth. Nat Neurosci, 8(4), pp.468-75.

Saher, G. et al., 2009. Cholesterol Regulates the Endoplasmic Reticulum Exit of the Major Membrane Protein P0 Required for Peripheral Myelin Compaction. J Neurosci, 29(19), pp.6094-104.

Saher, G., Quintes, S. & Nave, K., 2011. Cholesterol: a novel regulatory role in myelin formation. The Neuroscientist, 17(1), pp.79-93.

Samuelsson, T., 2012. Genomics and Bioinformatics. New York: Cambridge University Press.

Sanger, F., 1981. Determination of Nucleotide Sequences in DNA. Science, 214(4526), pp.1205-10.

Schlicker, A., 2005. A Global Approach to Comparative Genomics: Comparison of Functional Annotation over the Taxonomic Tree. Master Thesis.

Schwartz, S. et al., 2000. PipMaker - a web server for aligning two genomic DNA sequences. Genome Res, 10(4), pp.577-86.

Setchel, K.D. et al., 1998. Identification of a new inborn error in bile acid synthesis: mutation of the oxysterol 7alpha-hydroxylase gene causes severe neonatal liver disease. J Clin Invest, 102(9), pp.1690-703.

Shafaati, M., O'Driscoll, R., Björkhem, I. & Meaney, S., 2009. Transcriptional regulation of cholesterol 24-hydroxylase by histone deacetylase inhibitors. Biochem Biophys Res Commun, 378(4), pp.689-94.

Sievers, F. et al., 2011. Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega. Mol Syst Biol, 7, p.539.

Solovyev, V., 2002. Finding genes by computer: probabilistic and discriminative approaches. In T. Jiang, T. Smith, Y. Xu & M. Zhang, eds. Current Topics in Computational Biology. Massachusetts: The MIT Press. pp.365-401.

Solovyev, V., Kosarev, P., Seledsov, I. & Vorobyev, D., 2006. Automatic annotation of eukaryotic genes, pseudogenes and promoters. Genome Biol, 7(1), p.S10.

Solovyev, V.V., Shahmuradov, I.A. & Salamov, A.A., 2010. Identification of promoter regions and regulatory sites. Methods Mol Biol, 674, pp.57-83.

Stapleton, G. et al., 1995. A novel cytochrome P450 expressed primarily in brain. J Biol Chem, 270(50), pp.29739-45.

Stiles, A., McDonald, J., Bauman, D. & Russell, D., 2009. CYP7B1: One Cytochrome P450, Two Human Genetic Diseases, and Multiple Physiological Functions. J Biol Chem, 284(42), pp.28485-89.

Strachan, T. & Read, A., 2011. Comparative Genomics. In T. Strachan, ed. Human Molecular Genetics. 4th ed. New York: Garland Science. p.306.

Sun, L.-P., Li, L., Goldstein, J.L. & Brown, M.S., 2005. Insig Required for Sterol-mediated Inhibition of Scap/SREBP Binding to COPII Proteins in Vitro. J Biol Chem, 280, pp.26483-90.

Thompson, J., Higgins, D. & Gibson, T., 1994. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res, 22(22), pp.4673-80.

Thompson, W. et al., 2004. Decoding Human Regulatory Circuits. Genome Res, 14(10a), pp.1967-74.

Touchman, J., 2010. Comparative Genomics. Nature Education Knowledge, 3(10), p.13.

van der Most, P. et al., 2009. Statins: Mechanisms of neuroprotection. Prog Neurobiol, 88, pp.64-75.

Wasserman, W.W. et al., 2000. Human-mouse genome comparisons to locate regulatory sites. Nat Genet, 26(2), pp.225-28.

Watson, J.D. & Crick, F.H.C., 1953. A Structure for Deoxyribose Nucleic Acid. Nature, 171, pp.737-38.

Weiling, F., 1991. Historical study: Johann Gregor Mendel 1822–1884. Am J Med Genet, 40(1), pp.1-25.

Wheelan, S.J., Church, D.M. & Ostell, J.M., 2001. Spidey: A tool for mRNA-to-Genomic Alignments. Genome Res, 11(11), pp.1952-57.

Wilcox, C. et al., 2007. Coordinate up-regulation of TMEM97 and cholesterol biosynthesis genes in normal ovarian surface epithelial cells treated with progesterone: implications for pathogenesis of ovarian cancer. BMC Cancer, 7, p.223.

Ye, J., McGinnis, S. & Madden, T.L., 2006. BLAST: improvements for better sequence analysis. Nucleic Acids Res, 34(Web Server), pp.W36-39.

Zhang, Z. & Gerstein, M., 2003. Of mice and men: phylogenetic footprinting aids the discovery of regulatory elements. J Biol, 2(2), p.11.

Zvelebil, M. & Baum, J.O., 2008. Producing and Analyzing Sequence Alignments. In D. Holdsworth, ed. Understanding Bioinformatics. New York: Garland Science. pp.89-90.

Appendix I

Exact positions of the exons of each species, as predicted using Spidey. [Figure panels, images not reproduced: Human, Armadillo, Cat, Cow, Dog, Dolphin, Elephant, Guinea Pig, Horse, Mouse, Rabbit.]

