University of Groningen

Characterisation of the M-locus and functional analysis of the male-determining gene in the housefly
Wu, Yanli

IMPORTANT NOTE: You are advised to consult the publisher's version (publisher's PDF) if you wish to cite from it. Please check the document version below.

Document Version: Publisher's PDF, also known as Version of record

Publication date: 2018

Link to publication in University of Groningen/UMCG research database

Citation for published version (APA): Wu, Y. (2018). Characterisation of the M-locus and functional analysis of the male-determining gene in the housefly. University of Groningen.

Copyright: Other than for strictly personal use, it is not permitted to download or to forward/distribute the text or part of it without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license (like Creative Commons).

Take-down policy: If you believe that this document breaches copyright please contact us providing details, and we will remove access to the work immediately and investigate your claim.

Downloaded from the University of Groningen/UMCG research database (Pure): http://www.rug.nl/research/portal. For technical reasons the number of authors shown on this cover page is limited to 10 maximum.

Download date: 24-05-2021

Chapter 2
Characterisation of the complex nature of M-loci in Musca domestica

Part of this chapter is published in: Sharma, A., Heinze, S.D., Wu, Y., Kohlbrenner, T., Morilla, I., Brunner, C., Wimmer, E.A., Zande, L. van de, Robinson, M.D., Beukeboom, L.W., Bopp, D. (2017). Male sex in houseflies is determined by Mdmd, a paralog of the generic splice factor gene CWC22. Science 356, 642–645.

2.1 Abstract
The housefly (Musca domestica) is a perfect model to study insect sex determination as it harbours various systems. An M-locus that contains the male-determining gene(s) is typically located on the Y-chromosome, but can also be present on any of the five autosomes or even the X-chromosome. Recently, based upon a differential transcriptome analysis of early male and female embryos, "orphan reads" (ORMs) were identified as possible transcripts of the male-determining gene Mdmd (for Musca domestica male determiner), which is thought to reside in the M-locus. To further investigate the nature of the M-locus, I used these ORM sequences to find adjacent genomic DNA sequences. I found that the MIII-locus (M-locus on chromosome III) and the MV-locus (M-locus on chromosome V) contain multiple copies of sequences, with various levels of homology to each other. Cladogram analysis further demonstrated that sequences in the MIII-locus and the MV-locus could be divided into different clades, with sequences within clades being more similar than sequences between clades. Interestingly, the MIII-locus and the MV-locus share some similar sequences. These results are most easily explained by assuming that there have been independent amplification events before and after the translocation of the M-locus to autosomes III and V, possibly from the Y-chromosome.

2.2 Introduction
Various insect sex determination systems exist that can be variable even within species (Sánchez, 2004; Bachtrog et al., 2014; Beukeboom and Perrin, 2014; Blackmon et al., 2017). How this diversity of insect sex determination systems has evolved still remains unclear. The housefly, Musca domestica, harbours several sex determination systems and is therefore a perfect model to study the evolution of sex determination. An M-locus that contains the male-determining gene(s) is typically located on the Y-chromosome, but can also be present on any autosome or even the X-chromosome (Wagoner, 1969; Inoue and Hiroyoshi, 1982; Denholm et al., 1983; Inoue et al., 1986). Md-transformer (Mdtra) was identified as the female-determining gene in the M. domestica sex determination pathway (Hediger et al., 2010). Mdtra mRNA and Mdtra2 mRNA are maternally provided to kick-start a positive autoregulatory feed-back loop of female-specific splicing of Mdtra mRNA in the zygote (Bopp, 2010). MdTRA protein leads to female-specific splicing of Mdtra mRNA with the assistance of other essential co-factors such as MdTRA2 protein (Hediger et al., 2010). The mRNA of the M. domestica doublesex homologue, Mddsx, is spliced by MdTRA protein and its co-factor MdTRA2 protein into the female variant, which leads to female development (Burghardt et al., 2005; Hediger et al., 2010). The action of the male-determining gene(s) is the interruption of this autoregulatory loop. This results in male-specific splicing of Mdtra mRNA, yielding a non-functional MdTRA truncated protein (Hediger et al., 2010). Hence, in the presence of a male-determining gene(s), Mddsx is spliced into its male-specific isoform, leading to male development.

A differential transcriptome analysis on early unisexual embryos identified four transcript parts among the top male-specifically expressed sequences that were also absent in the female genome (Scott et al., 2014; Sharma et al., 2017). These "orphan sequences" were termed ORM#1, ORM#2, ORM#3 and ORM#6 (Sharma et al., 2017). PCR amplification from the genomes of MIII males (M-locus on autosome III) with primers located in these ORMs confirmed that all four ORMs belong to the same gene. This candidate male-determining gene was named Mdmd (for Musca domestica male determiner) (Sharma et al., 2017). Mdmd is only present in the male genome (Sharma et al., 2017). Fig. 2.1 shows the order of ORMs in the Mdmd assembly.

Figure 2.1: The position of ORMs in Mdmd: ORM#1 is located on the 5’ region and ORM#6 on the 3’ region. ORM#3 is spanning a small intron and ORM#2 is located in the middle part of Mdmd. MIF4G and MA3 are two conserved domains.

Silencing of Mdmd by RNAi confirmed that Mdmd is necessary for testes differentiation (Sharma et al., 2017). Moreover, knockout of Mdmd by CRISPR-Cas9 resulted in complete feminisation, indicating that Mdmd plays an important role in male development. Disruption of Mdmd also affected its downstream genes Mdtra and Mddsx. When Mdmd is disrupted by CRISPR-Cas9, Mdtra is spliced in the female variant in sex-reversed individuals (Sharma et al., 2017). Similarly, the female splice variant of Mddsx was also detected in sex-reversed individuals, in contrast to the male splice variant of control males (Sharma et al., 2017). These results confirmed that Mdmd plays an important role in male development and serves as the primary signal in the M. domestica sex determination pathway.

Several questions remain about the structure of the M-locus and the function of Mdmd. How Mdmd is embedded in the M-locus remains unknown, and the regions adjacent to the Mdmd ORMs have not been determined yet. Moreover, it is not yet known whether expression of Mdmd is sufficient to turn genotypic females into males or whether additional genes are involved. Identifying the genomic regions adjacent to the orphan contigs will provide molecular evidence for the organisation of the M-locus and help to characterise the complete sequence of Mdmd. In this chapter, I describe the genomic regions adjacent to ORM#1 and ORM#6 in two autosomal M strains, MIII (M-locus on chromosome III) and MV (M-locus on chromosome V), obtained by genome walking (Siebert et al., 1995). I present evidence that the M-loci in both strains contain multiple copies of sequences that all show various levels of homology to each other. I further investigate whether the M-loci also contain interspersed genomic sequences that exist both in the male and the female genome. In addition, I describe the common sequences shared by the MIII-locus and the MV-locus. These results contribute to a further understanding of sex chromosome evolution in M. domestica.

2.3 Materials and Methods
2.3.1 Musca domestica strains and culturing
Two different M. domestica strains were used for genome walking analysis. (1) 3-6 MIII strain: M is located on autosome III. Females have genotype X/X; pw bwb w/pw bwb w and males X/X; pw+ MIII bwb+ w/pw + bwb w. pw stands for pointed wings, bwb for brown body and w for white eyes, all being recessive visible markers on autosome III. Females have brown body, white eyes and notched wings. Males are heterozygous for M and they have black body, white eyes and normal wings. (2) 35-4 MV strain: M is located on autosome V. Females are X/X; bwb/bwb; ocra/ocra, males are X/X; bwb/bwb; MV ocra+/+ ocra. ocra is a recessive yellow eye colour marker on autosome V. Females are phenotypically brown body with yellow eyes. Males are heterozygous for M and they have brown body with red eyes. Strains were reared at 25°C as described previously (Schmidt et al., 1997).

2.3.2 Genome walking
DNA of single adult males from the MIII and MV strains was extracted with the NucleoSpin® Tissue Genomic DNA purification kit from Macherey Nagel (Düren, Germany). Genome walking was performed according to the Universal GenomeWalker™ 2.0 User Manual from Clontech (Fig. 2.2; California, United States). The concentration of experimental genomic DNA was checked on a Nanodrop from Thermo Fisher Scientific (Massachusetts, United States). The size and the quality of genomic DNA were checked on a 0.6% agarose/EtBr gel; the genomic DNA should be larger than 50 kb with minimum smearing. Subsequently, to test whether the genomic DNA can be digested by restriction enzymes, the experimental genomic DNA was digested by DraI (TTT|AAA) with the following concentrations: 5 µL experimental genomic DNA (0.1 µg/µL), 1.6 µL DraI (10 units/µL), 2 µL 10× DraI Restriction Buffer in a total volume of 20 µL. After incubation at 37°C overnight, 5 µL of digested products were analysed on a 0.6% agarose/EtBr gel along with 0.5 µL of undigested experimental genomic DNA as a control. A smear was observed in the gel, indicating that the experimental genomic DNA can be digested by restriction enzymes. Subsequently, genomic DNA was digested separately by four enzymes provided by the kit: DraI (TTT|AAA), EcoRV (GAT|ATC), PvuII (CAG|CTG) and StuI (AGG|CCT). Each digest was set up with the following concentrations: 25 µL genomic DNA (0.1 µg/µL), 8 µL restriction enzyme (10 units/µL), 10 µL 10× restriction enzyme buffer in a total volume of 100 µL. After incubation at 37°C for 2 hrs, the reaction was vortexed at slow
speed for 5-10 sec and incubated at 37°C overnight afterwards. The reaction products were checked on a 0.6% agarose/EtBr gel.

After digestion, DNA fragments were purified with the NucleoSpin® Gel and PCR Clean-up kit from Macherey Nagel (Düren, Germany) and ligated to the GenomeWalker adaptors to establish so-called GenomeWalker™ "libraries". After building four libraries, a primary "touchdown" PCR was performed with primer pairs GSP1 and AP1 for 5’_genome walking and GSP3 and AP1 for 3’_genome walking (Fig. 2.2; primer sequences are shown in the appendix). The following concentrations and conditions were used for the primary PCR: 1 µL DNA library, 0.5 µL 10 µM forward primer, 0.5 µL 10 µM reverse primer, 0.5 µL 10 mM dNTP, 2.5 µL 10× Advantage 2 PCR Buffer and 0.5 µL Advantage 2 Polymerase Mix (50×) in a total volume of 25 µL; 7 cycles of 94°C denaturation for 25 sec, annealing/extension for 6 min at 72°C, followed by 32 cycles of 94°C denaturation for 25 sec, annealing/extension at 67°C for 6 min, and finally extension at 67°C for 7 min.

After the primary PCR, a secondary (nested) "touchdown" PCR was performed by taking 1 µL of 50× diluted primary PCR product. The primers for the secondary PCR are AP2 and GSP2a or GSP2b, respectively, for 5’_genome walking and AP2 and GSP4a or GSP4b, respectively, for 3’_genome walking. Component concentrations were the same as for the primary PCR. The following conditions were used for the secondary PCR: 5 cycles of 94°C denaturation for 25 sec, annealing/extension for 6 min at 72°C, followed by 20 cycles of 94°C denaturation for 25 sec, annealing/extension at 67°C for 6 min, and finally extension at 67°C for 7 min. PCR products were analysed on a 1% agarose/EtBr gel.

Target fragments were purified with the NucleoSpin® Gel and PCR Clean-up kit from Macherey Nagel (Düren, Germany) and subsequently cloned according to the TA Cloning® Kit, with pCR®II vector from Clontech (California, United States) under the following concentrations and conditions: 1-5.5 µL DNA (DNA from gel purification was diluted in 20 µL water), 2 µL 5× ExpressLink™ T4 DNA Ligase Buffer, 1.5 µL linearised pCR®II vector (25 ng/µL) in a total volume of 9 µL; 1 µL ExpressLink™ T4 DNA Ligase (5 Weiss units) was then added to reach the final volume of 10 µL. Alternatively, if the PCR products only show a single fragment, 1 µL of PCR products can be directly ligated into the pCR®II vector without gel and PCR purification. The ligation concentrations for the rest of the components were the same as above. Ligation was performed at 16°C overnight. The construct was used to transform competent E. coli DH5α. The pCR®II vector contains the lacZα gene that allows for blue-white screening of
positive colonies by α-complementation. White colonies were cultured in Luria-Bertani (LB) medium containing 100 µg/mL ampicillin at 37°C overnight. Plasmids were extracted the following day and the size of inserted DNA fragments was checked by digestion with EcoRI-HF® (G|AATTC) from NEB (Massachusetts, United States). LGC Genomics (Berlin, Germany) carried out sequencing of the candidate fragments by using the primers M13F and M13R located in the vector.

The primers GSP_Dra52_R2 and GSP_Dra52_R1 combined with AP1 and AP2, respectively, were used for a second round of 5’_genome walking of both the MIII and MV strains. MIII_GSP_Stu93_F1 and MIII_GSP_Stu93_F2 combined with AP1 and AP2, respectively, were used for a second round of 3’_genome walking of the MIII strain. MV_GSP_Dra13B_F1 and MV_GSP_Dra13B_F2 combined with AP1 and AP2, respectively, were used for the second round of 3’_genome walking of the MV strain. A third round of genome walking was performed after having obtained new sequences from the second round of genome walking. The primers MIII_GSP_Pvu3B_R1 and MIII_GSP_Pvu3B_R2 combined with AP1 and AP2, respectively, were used for a third round of 5’_genome walking of the MIII strain. MV_GSP_Pvu7B_R1 and MV_GSP_Pvu7B_R2 combined with AP1 and AP2, respectively, were used for the third round of 5’_genome walking of the MV strain.
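As a quick arithmetic check on the reaction set-ups listed in this section, the volume needed to reach each stated total can be obtained by subtraction. The sketch below is an illustration only and was not part of the original work: the function name `water_volume` and the dictionary layout are hypothetical, and it assumes that the remainder of each reaction is made up with water, which the text does not state explicitly.

```python
def water_volume(components_ul, total_ul):
    """Return the volume (µL) still needed to reach total_ul.

    components_ul: mapping of component name -> volume in µL.
    Assumption: the remaining volume is filled with water.
    """
    used = sum(components_ul.values())
    if used > total_ul:
        raise ValueError("components exceed the total reaction volume")
    # Round to avoid floating-point noise such as 11.399999999999999
    return round(total_ul - used, 3)


# Volumes taken from the text of section 2.3.2
test_digest = {"gDNA": 5.0, "DraI": 1.6, "10x buffer": 2.0}            # total 20 µL
library_digest = {"gDNA": 25.0, "enzyme": 8.0, "10x buffer": 10.0}     # total 100 µL
primary_pcr = {
    "DNA library": 1.0,
    "forward primer": 0.5,
    "reverse primer": 0.5,
    "dNTP": 0.5,
    "10x PCR buffer": 2.5,
    "polymerase mix": 0.5,
}                                                                       # total 25 µL

for name, mix, total in [
    ("test digest", test_digest, 20),
    ("library digest", library_digest, 100),
    ("primary PCR", primary_pcr, 25),
]:
    print(name, water_volume(mix, total), "µL")
```

The same helper could be applied to the other reaction mixes in this chapter, provided all volumes are entered in µL.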

Figure 2.2: Genome walking. Genomic DNA was digested by four enzymes: DraI, EcoRV, PvuII and StuI. Each enzyme digested the genomic DNA separately. After digestion, GenomeWalker™ adaptors were ligated to the DNA. Gene specific primer GSP1 and adaptor primer AP1 are primers for primary PCR. GSP2 and AP2 are primers for secondary PCR. N: Amine group blocks extension of the 3’ end of the adaptor-ligated genomic fragments, preventing the generation of an AP1 binding site in the lower adaptor strand (if double-stranded adaptor sequences are present at both ends, they will form a "panhandle" structure that cannot be extended) (modified from Universal GenomeWalker™ 2.0 User Manual from Clontech, California, United States).
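To make the library-building step concrete, the following is a minimal in-silico sketch of how the four blunt-cutting enzymes fragment a sequence. It is an illustration under stated assumptions rather than a procedure used in this chapter: the names `ENZYMES`, `cut_positions` and `digest` and the toy sequence are hypothetical, only the top strand is searched (sufficient here because all four recognition sites are palindromic), and each enzyme is assumed to cut in the middle of its six-base site.

```python
# Recognition site and cut offset (bases from the start of the site).
# All four enzymes leave blunt ends, cutting in the middle of the site.
ENZYMES = {
    "DraI": ("TTTAAA", 3),   # TTT|AAA
    "EcoRV": ("GATATC", 3),  # GAT|ATC
    "PvuII": ("CAGCTG", 3),  # CAG|CTG
    "StuI": ("AGGCCT", 3),   # AGG|CCT
}


def cut_positions(seq, site, offset):
    """Return the 0-based positions at which the sequence is cut."""
    upper = seq.upper()
    positions = []
    start = upper.find(site)
    while start != -1:
        positions.append(start + offset)
        start = upper.find(site, start + 1)
    return positions


def digest(seq, enzyme):
    """Return the fragments produced by a complete single-enzyme digest."""
    site, offset = ENZYMES[enzyme]
    bounds = [0] + cut_positions(seq, site, offset) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


# Hypothetical toy sequence, not housefly data
toy = "ACGTTTAAACCGATATCGG"
for name in ENZYMES:
    print(name, digest(toy, name))
```

Because each library is cut with a different enzyme, the distance from a gene-specific primer to the nearest cut site differs between libraries, which is why the four libraries can yield walking products of different lengths.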

2.3.3 Rapid amplification of cDNA ends (RACE)
5’_RACE PCR and 3’_RACE PCR were performed using the SMARTer™ RACE cDNA Amplification Kit from Clontech (California, United States) according to the SMARTer™ RACE cDNA Amplification Kit User Manual. RNA was purified from 0-24 hrs embryos from the MIII strain with the ZR Tissue & Insect RNA MicroPrep™ kit from Zymo Research (California, United States). First-strand 5’_RACE_Ready cDNA and 3’_RACE_Ready cDNA were synthesised according to the SMARTer™ RACE cDNA Amplification Kit User Manual.

After first-strand synthesis of the 5’_RACE_Ready cDNA and the 3’_RACE_Ready cDNA, a "touchdown" PCR was performed with primer pairs Universal Primer A Mix (UPM) and GSP1 or GSP2b, respectively, for 5’_RACE PCR and UPM and GSP4b for 3’_RACE PCR. The following concentrations and conditions were used for the PCR: 2.5 µL RACE_Ready cDNA, 1 µL 50× UPM, 1 µL 10 µM primer, 1 µL 10 mM dNTP, 5 µL 10× Advantage 2 PCR Buffer and 1 µL Advantage 2 Polymerase Mix (50×) in a total volume of 50 µL; 5 cycles of 94°C denaturation for 30 sec, annealing/extension for 5 min at 72°C, followed by 5 cycles of 94°C denaturation for 30 sec, annealing at 70°C for 30 sec and extension at 72°C for 5 min, and finally 25 cycles of 94°C denaturation for 30 sec, annealing at 68°C for 30 sec and extension at 72°C for 5 min. The PCR products were checked on a 1% agarose/EtBr gel. The cloning procedure was the same as the cloning step in genome walking. Plasmids from white colonies were extracted and the size of inserted DNA fragments was checked by digestion with EcoRI-HF® (G|AATTC) from NEB (Massachusetts, United States). Sequencing of the candidate fragments with the primers M13F and M13R in the vector was carried out by LGC Genomics (Berlin, Germany).

2.3.4 PCR amplification of M-locus sequences
PCR was performed on single male housefly gDNA and cDNA with the primer combinations GSP2b-Dra-52-F or GSP1-9-F and GSP3-R or GSP4b-R, respectively. GSP2b-Dra-52-F and GSP1-9-F are located on the sequences newly obtained from genome walking and RACE. The following concentrations and conditions were used in the gDNA PCR: 100 ng gDNA, 0.5 µL 10 µM forward primer, 0.5 µL 10 µM reverse primer, 3 µL 2.5 mM dNTP, 3 µL 10× Advantage 2 PCR Buffer and 0.5 µL Advantage 2 Polymerase Mix (50×) in a total volume of 30 µL; denaturation at 94°C for 2 min, then 30 cycles of 94°C denaturation for 30 sec, annealing at 70°C for 30 sec and extension at 72°C for 7 min, and lastly extension at 72°C for 10 min. For the cDNA PCR, cDNA was first synthesised with the Thermo Fisher Scientific (Massachusetts, United States) Maxima First Strand
cDNA Synthesis Kit with the following concentrations and conditions: 4 µL 5× Reaction Mix, 2 µL Maxima Enzyme Mix and 1.5 µL template RNA (1.2 µg/µL) in a total volume of 20 µL. The mixture was incubated at 25°C for 10 min followed by 30 min at 50°C. The reaction was terminated by incubating at 85°C for 5 min. The cDNA was diluted 5× with nuclease-free water after synthesis and 1 µL cDNA was used in each PCR reaction. The cDNA PCR was performed under the same conditions as the gDNA PCR.

PCR products were analysed on a 1% agarose/EtBr gel. The cloning procedure was the same as the cloning step in genome walking. Plasmids from white colonies were extracted and the size of inserted DNA fragments was checked by digestion with EcoRI-HF® (G|AATTC) from NEB (Massachusetts, United States). The candidate fragments from positive plasmids were sequenced with the primers M13F and M13R in the vector combined with PCR primers by LGC Genomics (Berlin, Germany).

2.3.5 Sequence analysis
Sequencing data were analysed with "Geneious" (Kearse et al., 2012). Nucleotide multiple alignment was used in the data processing. A BLAST survey of all unique Mdmd sequences against database Genome (Musca_domestica-2.0.2 reference Annotation Release 102) and organism Musca domestica (taxid: 7370) (Scott et al., 2014) was performed using the NCBI on-line BLAST tool. Phylogenetic trees were built with the Geneious tree builder, based on the Jukes-Cantor genetic distance model and the Neighbor-joining method combined with the bootstrap resampling method (1000 replicates).
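For readers unfamiliar with the Jukes-Cantor model named above, the distance between two aligned sequences is d = -(3/4) ln(1 - (4/3) p), where p is the proportion of sites that differ. The sketch below illustrates this textbook formula only; it is not the Geneious implementation, the function names are hypothetical, and skipping alignment columns that contain a gap is an assumption that may differ from how the tree builder treats gaps.

```python
import math


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences.

    Columns in which either sequence has a gap ('-') are skipped
    (an assumption made for this sketch).
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper())
             if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    return sum(x != y for x, y in pairs) / len(pairs)


def jukes_cantor(a, b):
    """Jukes-Cantor (JC69) distance: d = -3/4 * ln(1 - 4p/3)."""
    p = p_distance(a, b)
    if p == 0:
        return 0.0
    if p >= 0.75:
        # The correction is undefined once p reaches saturation
        return math.inf
    return -0.75 * math.log(1 - 4 * p / 3)


# Hypothetical toy alignment, not housefly data
print(jukes_cantor("ACGTACGTAC", "ACGTACGTAA"))
```

The corrected distance is always at least as large as p, because the model accounts for multiple substitutions at the same site. A matrix of such pairwise distances is the input to Neighbor-joining.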

2.4 Results
2.4.1 The MIII-locus consists of multiple copies of Mdmd
The sequences obtained from genome walking revealed that the MIII-locus consists of multiple copies of sequences, with various levels of homology to each other (Fig. 2.3). The first round of genome walking yielded four new and different sequences (sequences #1-4) with ORM#1 based primers and six (sequences #5-10) with ORM#6 based primers. Each sequence from 3’_genome walking might be connected to any sequence from 5’_genome walking. Hence, the MIII-locus contains at least six copies of Mdmd. Among the obtained sequences, the longest is around 2.4 kb (sequence #5), and the shortest 260 bp (sequence #2). Sequences #1, #9 and #10 contain genomic sequences that exist in both the male and the female genome. The genomic sequences in sequences #9 and #10
share identical parts. Among these ten newly obtained sequences, some are completely different from each other, but some are partly similar (displayed in similar colours: from dark green to light green (sequences #4 and #5) in Fig. 2.4A, from dark blue to light blue (sequences #6, #7 and #8) in Fig. 2.4B and from red to light red (sequences #9 and #10) in Fig. 2.4C). Alignment of these partly similar sequences reveals that they have indels (insertion or deletion mutations) and nucleotide variations. Since the genomic DNA comes from a single male, nucleotide variations cannot be population polymorphism, but must come from independent repeats of the MIII-locus.
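The two kinds of difference described above, indels and nucleotide variations, can be tallied from a pairwise alignment as sketched below. This is an illustration only, not the procedure used in Geneious: the function name `tally_differences` and the toy sequences are hypothetical, and a run of consecutive gapped columns is counted as a single indel event, which is a simplifying assumption.

```python
def tally_differences(a, b):
    """Count substitutions and indel events in a pairwise alignment.

    a, b: aligned sequences of equal length, with '-' marking gaps.
    Returns (substitutions, indel_events). A run of adjacent gapped
    columns counts as one indel event (simplifying assumption).
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    substitutions = 0
    indel_events = 0
    in_gap = False
    for x, y in zip(a.upper(), b.upper()):
        gap = x == "-" or y == "-"
        if gap:
            if not in_gap:
                indel_events += 1   # a new gap run starts here
        elif x != y:
            substitutions += 1      # nucleotide variation
        in_gap = gap
    return substitutions, indel_events


# Hypothetical toy alignment, not housefly data
subs, indels = tally_differences("ACGT--ACGTA", "ACCTGGAC-TA")
print(subs, indels)
```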

Figure 2.3: The MIII-locus consists of multiple copies of sequences with various levels of homology to each other. 5’_genome walking started from ORM#1 and 3’_genome walking started from ORM#6 on the MIII strain genomic DNA. GSP1 and GSP2b are primers for 5’_genome walking on ORM#1. GSP3 and GSP4b are primers for 3’_genome walking on ORM#6. Sequences #1-4 overlapped with ORM#1. Sequences #5-10 overlapped with ORM#6. Different sequences are colour coded (5’ yellow, orange, grey and green, 3’ green, blue and red). Partly similar sequences are marked with different colour intensities (e.g. sequences marked from dark green to light green) and labelled with A, B and C. The dotted line represents potential Mdmd homologous sequences of so far unknown variations. The shaded boxes indicate sequences that exist in both the male and the female genome.

Figure 2.4: Alignment of partly similar sequences obtained from genome walking of the MIII strain. A: Alignment of similar sequences marked from dark green to light green (sequences #4 and #5). B: Alignment of similar sequences marked from dark blue to light blue (sequences #6, #7 and #8). C: Alignment of similar sequences marked from red to light red (sequences #9 and #10). Sequence #4 was obtained from 5’_genome walking with the primers GSP1 and GSP2b in ORM#1. Sequences #5, #6, #7, #8, #9 and #10 were obtained from 3’_genome walking with the primers GSP3 and GSP4b in ORM#6. All these partly similar sequences can be aligned with each other, but with indels (insertion or deletion mutations) and nucleotide variations. The horizontal bars indicate the presence of the same sequences and the lines indicate indels. The vertical lines in the bars indicate nucleotide variations among sequences.

More evidence indicated that the MIII-locus contains multiple copies of Mdmd. Sequence #4 might be a "bridge" between Mdmd homologous sequences as it went out from ORM#1 and into ORM#6 (Fig. 2.5). Also, sequence #5, which is similar to sequence #4, went out from ORM#6 and into ORM#1 and part of ORM#3 (Fig. 2.5). These results demonstrate that the MIII-locus consists of tandem copies of Mdmd repeats and some of the copies are quite similar.

Figure 2.5: Sequences from genome walking linked several copies of Mdmd. Sequence #4 went from ORM#1 into ORM#6. Sequence #5 went from ORM#6 into ORM#1 and part of ORM#3. Sequence #4 was obtained from 5’_genome walking with the primers GSP1 and GSP2b in ORM#1. Sequence #5 was obtained from 3’_genome walking with the primers GSP3 and GSP4b in ORM#6. The dotted lines are undefined sequences of potential Mdmd repeats, and the solid lines are known sequences.

The MIII-locus contains interspersed genomic sequences that exist in both the male and the female genome. However, those sequences represent mostly repetitive sequences that cannot be used to design primers for further genome
walking. For example, sequence #1 contains such sequences, as shown in Fig. 2.3. To characterise the whole MIII-locus, I performed a second and third round of 5’_genome walking based on some of the male specific sequences acquired in the previous round. I obtained three new sequences containing genomic sequences that exist in both the male and the female genome. Walking out from sequence #2 yielded sequences #2-1 and #2-2 (Fig. 2.6A). Interestingly, sequence #2-1 overlapped with Musca_Mariner_Like_Elements (MLEs), and sequence #2-2 with sequences from tapeworms and trematodes. In the third round of 5’_genome walking, I used primers in the male specific part of sequence #2-2, yielding sequence #2-2-1 (Fig. 2.6B). This sequence also overlapped with Musca_Mariner_Like_Elements (MLEs).

When walking out from sequence #9 in the second round of 3’_genome walking, I acquired four diverse sequences. All of them contained genomic sequences that exist in both the male and the female genome (Fig. 2.6C). Sequences #9-3 and #9-4 are very similar, with two nucleotide differences (Fig. 2.6C). Sequences #9-2, #9-3 and #9-4 overlapped with the Musca domestica pre-mRNA-splicing factor CWC22 homolog, which contains two conserved domains, MIF4G and MA3. Genomic sequences "b" in #9-3 and "c" in #9-4 are identical, and are also similar to "a" in #9-2. Since ORM#3 overlapped with conserved domain MIF4G, I also aligned sequences #9-2, #9-3 and #9-4 with ORM#3. I found that sequence #9-2 overlapped with part of ORM#3 (Fig. 2.7A). Sequences #9-3 and #9-4 overlapped with part of ORM#3 and with part of sequence #9 (Fig. 2.7B). These results confirmed that the MIII-locus consists of multiple tandemly repeated, partially truncated copies of Mdmd interspersed by genomic sequences that exist in both the male and the female genome.

In addition, I performed RACE PCR to characterise the MIII-locus at the cDNA level. I obtained eight sequences (Fig. 2.8A). Interestingly, sequences #1-RACE and #2-RACE are identical to sequences #1 and #2 from genome walking, respectively. When I compared sequence #3-RACE with sequence #2 from genome walking, I only found one nucleotide change (data not shown). Sequence #4-RACE is partly similar to sequence #2 from genome walking and sequence #5-RACE is partly similar to sequence #4 from genome walking. The results from RACE PCR indicated that some of the copies in the M-locus are transcribed into RNA. I also performed PCR with primers in the newly obtained sequences and in ORM#6, yielding two sequences (Fig. 2.8B). Alignment of these two partly similar sequences shows that they have indels (insertion or deletion mutations) and nucleotide variations (Fig. 2.8C).

Figure 2.6: Results of the second and third rounds of genome walking of the MIII strain. The MIII-locus contains interspersed genomic sequences that exist in both the male and the female genome. A: Second round of 5’_genome walking started from sequence #2, yielding two new sequences: sequences #2-1 and #2-2. B: Third round of 5’_genome walking started from sequence #2-2, yielding one new sequence: sequence #2-2-1. C: Second round of 3’_genome walking started from sequence #9, yielding four new sequences: sequences #9-1, #9-2, #9-3 and #9-4. "a", "b" and "c" are similar genomic sequences. GSP_Dra52_R2 and GSP_Dra52_R1 are primers for the second round of 5’_genome walking. MIII_GSP_Pvu3B_R1 and MIII_GSP_Pvu3B_R2 are primers for the third round of 5’_genome walking. MIII_GSP_Stu93_F1 and MIII_GSP_Stu93_F2 are primers for the second round of 3’_genome walking. The shaded boxes indicate sequences that exist in both the male and the female genome. The dotted lines are undefined sequences of potential Mdmd repeats. In the sequence alignment, the horizontal bars indicate the presence of the same sequences and the vertical lines in the bars indicate nucleotide variations among sequences.

Figure 2.7: Sequences from genome walking showing the complexity of the MIII-locus. A: Sequence #9-2 overlapped with part of ORM#3. B: Sequences #9-3 and #9-4 overlapped with part of ORM#3 and with part of sequence #9. Sequences #9-2, #9-3 and #9-4 were obtained from the second round of 3’_genome walking with the primers MIII_GSP_Stu93_F1 and MIII_GSP_Stu93_F2. The shaded boxes indicate sequences that exist in both the male and the female genome. The dotted lines are undefined sequences, and the solid lines are known sequences of potential Mdmd repeats.

Figure 2.8: RACE PCR to determine the MIII-locus on the cDNA level. A: The 5’_RACE PCR started from ORM#1. GSP1 or GSP2b are the primers for the 5’_RACE PCR on ORM#1. Sequences #1-6-RACE overlapped with ORM#1. The 3’_RACE PCR started from ORM#6. GSP4b is the primer for the 3’_RACE PCR on ORM#6. Sequences #7-8-RACE overlapped with ORM#6. B: PCR with primers GSP1-9-F and GSP2b-Dra52-F in the newly obtained sequences and primers GSP4b-R and GSP3-R in ORM#6, respectively, yielding two sequences. C: Alignment of these two partly similar sequences shows that they have indels (insertion or deletion mutations) and nucleotide variations. The dotted lines are undefined sequences, and the solid lines are known sequences of potential Mdmd repeats. In the sequence alignment, the horizontal bars indicate the presence of the same sequences and the lines indicate indels. The vertical lines in the bars indicate nucleotide variations among sequences.


2.4.2 Multiple copies of Mdmd exist in MII, MIII, MV and MY males

Genomic DNA from MI, MII, MIII, MV and MY males was amplified with divergent primers localised at ORM#1 and ORM#6. The results are displayed in Fig. 2.9, which was kindly provided by Claudia Brunner. They revealed that there are multiple copies of Mdmd in males from the MII, MIII, MV and MY strains but not in the MI strain, which probably has a different male-determining gene (or genes). The MIII-locus contains at least six copies of Mdmd. The MV-locus seems less complicated, as only two fragments were amplified with divergent primers. To find sequences adjacent to ORM#1 and ORM#6 in the MV strain, I also performed genome walking in this strain.

Figure 2.9: Multiple copies of Mdmd exist in MII, MIII, MV and MY males. A: Sequences between Mdmd were amplified by divergent primers 1as localised at ORM#1 and 6as localised at ORM#6. B: Multiple fragments were amplified in MII, MIII, MV and MY males indicating that there were multiple tandemly repeated copies in the M-loci. The dotted lines are undefined sequences. This figure was kindly provided by Claudia Brunner from University of Zürich.


2.4.3 The MV-locus consists of multiple copies of Mdmd

Genome walking in the MV strain revealed that the MV-locus also contains repetitive sequences (Fig. 2.10). The first round of genome walking yielded six new and different sequences (sequences #11-16) with ORM#1 based primers and four (sequences #17-20) with ORM#6 based primers. Each sequence from 5’_genome walking might be connected to any sequence from 3’_genome walking. Hence, the MV-locus contains at least six copies of Mdmd. Among the obtained sequences, the longest is around 2.1 kb (sequence #20), and the shortest is 195 bp (sequence #16). Sequence #20 contains genomic sequences that exist in both the male and the female genome. Similar sequences are displayed in similar colours: from dark purple to light purple (sequences #12, #13 and #14) in Fig. 2.11A, from dark orange to light orange (sequences #15 and #16) in Fig. 2.11B and from dark pink to light pink (sequences #18, #19 and #20) in Fig. 2.11C. Alignment of these partly similar sequences shows that they have indels (insertion or deletion mutations) and nucleotide variations. Since the genomic DNA comes from a single male, the nucleotide variations cannot be population polymorphisms, but most likely come from independent repeats of the MV-locus.
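The lower bound on the copy number stated above can be reproduced with a small counting sketch. It assumes, for illustration only, that every Mdmd copy has exactly one upstream and one downstream flank, so the number of copies is at least the larger of the two counts of distinct flanking sequences. The function name is hypothetical.

```python
def min_copies_from_flanks(upstream_ids, downstream_ids):
    """Lower bound on copy number from distinct flanking sequences.

    Assumption: each copy contributes one upstream (5') and one
    downstream (3') flank, and identical flanks may be shared by
    several copies, so only distinct flanks are counted.
    """
    return max(len(set(upstream_ids)), len(set(downstream_ids)))


# Sequence labels from the first round of genome walking in the MV strain.
upstream = ["#11", "#12", "#13", "#14", "#15", "#16"]  # ORM#1 based primers
downstream = ["#17", "#18", "#19", "#20"]              # ORM#6 based primers

print(min_copies_from_flanks(upstream, downstream))  # 6
```

The bound says nothing about which upstream sequence is joined to which downstream sequence; it only counts how many distinct copies are needed to account for all observed flanks.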

Figure 2.10: The MV-locus consists of multiple copies of sequences with various levels of homology to each other. 5’_genome walking started from ORM#1 and 3’_genome walking started from ORM#6 on the MV strain genomic DNA. The primer GSP1 combined with GSP2a and GSP2b, respectively, was used for 5’_genome walking on ORM#1 and the primer GSP3 combined with GSP4a and GSP4b, respectively, was used for 3’_genome walking on ORM#6. Sequences #11-16 overlapped with ORM#1. Sequences #17-20 overlapped with ORM#6. Different sequences are colour coded. Partly similar sequences are marked with different colour intensities (e.g. sequences marked from dark purple to light purple) and labeled with A, B and C. The dotted line is an undefined sequence. The shaded boxes indicate sequences that exist in both the male and the female genome.


Figure 2.11: Alignment of partly similar sequences obtained from genome walking of the MV strain. A: Alignment of similar sequences marked from dark purple to light purple (sequences #12, #13 and #14). B: Alignment of similar sequences marked from dark orange to light orange (sequences #15 and #16). C: Alignment of similar sequences marked from dark pink to light pink (sequences #18, #19 and #20). Sequences #12, #14 and #16 were obtained from 5’_genome walking with the primers GSP1 and GSP2a in ORM#1. Sequences #13 and #15 were obtained from 5’_genome walking with the primers GSP1 and GSP2b in ORM#1. Sequence #18 was obtained from 3’_genome walking with the primers GSP3 and GSP4a in ORM#6. Sequences #19 and #20 were obtained from 3’_genome walking with the primers GSP3 and GSP4b in ORM#6. All these partly similar sequences can be aligned but with indels (insertion or deletion mutations) and nucleotide variations. The horizontal bars indicate the presence of the same sequences and the lines indicate indels. The vertical lines in the bars indicate nucleotide variations among sequences.

To characterise the whole MV-locus, I performed a second and third round of 5’_genome walking based on some of the male-specific sequences acquired in the previous round, choosing in particular those sequences that are shared with the MIII-locus (see the following section). I obtained five new sequences. Walking out from sequence #15 yielded sequences #15-1 and #15-2, which are very similar (Fig. 2.12A). In the third round of 5’_genome walking, I used primers in the male-specific part of sequence #15-2, yielding sequences #15-2-1, #15-2-2 and #15-2-3 (Fig. 2.12B). Sequences #15-2-2 and #15-2-3 are very similar, with two nucleotide differences (data not shown). Interestingly, sequences #15-2-1, #15-2-2 and #15-2-3 overlapped with Musca_Mariner_Like_Elements (MLEs) and with Musca domestica clone MdAG226 microsatellite sequences.


When walking out from sequence #20 in the second round of 3’_genome walking, I acquired three sequences. All of them include #20, which contains similar genomic sequences that exist in both the male and the female genome (Fig. 2.12C). Genomic sequences “a” in #20, “b” in #20-1, “c” in #20-2 and “d” in #20-3 are similar. Sequences #20-2 and #20-3 are very similar with four nucleotide differences (Fig. 2.12C). These results confirmed that the MV-locus consists of repetitive sequences interspersed by genomic sequences that exist in both the male and the female genome.


Figure 2.12: Results of second and third round of genome walking of the MV strain. The MV-locus contains interspersed genomic sequences that exist in both the male and the female genome. A: Second round of 5’_genome walking started from sequence #15, yielding two new sequences: sequences #15-1 and #15-2. B: Third round of 5’_genome walking started from sequence #15-2, yielding three new sequences: sequences #15-2-1, #15-2-2 and #15-2-3. C: Second round of 3’_genome walking started from sequence #20, yielding three new sequences: sequences #20-1, #20-2 and #20-3. “a”, “b”, “c” and “d” are similar genomic sequences. GSP_Dra52_R2 and GSP_Dra52_R1 are primers for the second round of 5’_genome walking. MV_GSP_Pvu7B_R1 and MV_GSP_Pvu7B_R2 are primers for the third round of 5’_genome walking. MV_GSP_13B_F1 and MV_GSP_13B_F2 are primers for the second round of 3’_genome walking. The shaded boxes indicate sequences that exist in both the male and the female genome. The dotted lines are undefined sequences. In sequence alignment, the horizontal bars indicate the presence of the same sequences and the lines indicate indels. The vertical lines in the bars indicate nucleotide variations among sequences.


2.4.4 MIII-locus and MV-locus share intergenic sequences between Mdmd repeats

Interestingly, in 5’_genome walking of the MV strain, I obtained sequences that were very similar to some of the sequences in the MIII-locus (Fig. 2.13A). Alignment of the sequences revealed only a few nucleotide differences. In 3’_genome walking, I also found similarities between sequences of the MIII-locus and the MV-locus (Fig. 2.13B). These results indicate that the MIII-locus and the MV-locus share some similar sequences.

Figure 2.13: Sequence alignments from genome walking sequences of the MIII and MV strains reveal similarity. A: Sequence #2 was obtained from 5’_genome walking of the MIII strain and sequences #15 and #16 were obtained from 5’_genome walking of the MV strain. B: Sequences #9 and #10 were obtained from 3’_genome walking of the MIII strain and sequences #18, #19 and #20 were obtained from 3’_genome walking of the MV strain. The horizontal bars indicate the presence of the same sequences and the lines indicate indels. The vertical lines in the bars indicate nucleotide variations among sequences.

I composed a cladogram of the sequences from 5’_genome walking in the MIII-locus and the MV-locus and of sequence #5 from 3’_genome walking in the MIII-locus, after trimming the variable end. It turns out that these sequences belong to six clades, which I labeled A-F. Sequences #2, #15 and #16 belong to clade A, sequences #12, #13 and #14 belong to clade B and sequences #4 and #5 belong to clade C (Fig. 2.14). Sequences #1, #3 and #11 form their own clades. Similarly, I composed a cladogram of the sequences from 3’_genomic DNA walking in the MIII-locus and the MV-locus and of sequence #4 from 5’_genome walking in the MIII-locus, after trimming the variable end. It turns out that these sequences belong to five clades, which I labeled A-E (N.B. these are different clades from those in the 5’ cladogram). Sequences #9, #18, #19 and #20 belong to clade A, sequences #6, #7 and #8 belong to clade C, and sequences #4 and #5 belong to clade D (Fig. 2.15). Sequences #10 and #17 form their own clades. The cladograms confirmed that the MIII-locus and the MV-locus share some similar sequences.
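The pairwise identity values reported with the cladograms (Figs. 2.14 and 2.15) can be illustrated by a minimal sketch. It assumes two already aligned sequences of equal length with "-" marking indels, and it counts a gap in only one sequence as a mismatch; the software used for the actual analysis may treat gaps differently. The toy sequences are hypothetical and are not taken from the M-loci.

```python
def percent_identity(aligned_a, aligned_b):
    """Percentage of alignment columns with identical bases.

    Assumptions: inputs are pre-aligned, equal in length, and use '-'
    for gaps. Columns where both sequences have a gap are ignored;
    columns with a gap in only one sequence count as mismatches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no comparable columns")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Hypothetical toy alignment: one substitution and one single-base indel.
seq_x = "ACGTACGTAC"
seq_y = "ACGTTCG-AC"
print(percent_identity(seq_x, seq_y))  # 80.0
```

Applying such a function to every pair of sequences gives a symmetric matrix of the kind shown in the tables below the cladograms.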


Figure 2.14: Cladogram of 5’_genome walking sequences in the MIII-locus and the MV-locus and of sequence #5 from 3’_genome walking in the MIII-locus. They belong to six clades, sequences #2, #15 and #16 to clade A, sequences #12, #13 and #14 to clade B, and sequences #4 and #5 to clade C. The branch labels show the percentage of consensus support. The scale bar indicates the number of substitutions per site. The bottom table shows the percentage of bases/residues that are identical between two sequences. High similarities between two sequences are indicated by white numbers in dark cells and low similarities by dark numbers in light cells.


Figure 2.15: Cladogram of 3’_genome walking sequences in the MIII-locus and the MV-locus and of sequence #4 from 5’_genome walking in the MIII-locus. They belong to five clades, sequences #9, #18, #19 and #20 to clade A, sequences #6, #7 and #8 to clade C, and sequences #4 and #5 to clade D. The branch labels show the percentage of consensus support. The scale bar indicates the number of substitutions per site. The bottom table shows the percentage of bases/residues that are identical between two sequences. High similarities between two sequences are indicated by white numbers in dark cells and low similarities by dark numbers in light cells.


2.5 Discussion

The objective of this study was to determine the structure of the M-loci in two autosomal M strains, MIII and MV, through genome walking. I first performed genome walking to identify genomic regions adjacent to the Mdmd ORMs in the MIII strain. I found that the MIII-locus consists of multiple copies of sequences, which all show homology to each other. The MIII-locus contains at least six copies of Mdmd, which was confirmed by genomic DNA amplification with the divergent primers localised at ORM#1 and ORM#6 in the MIII strain. The MV-locus seems to have fewer copies, as there were only two fragments amplified with divergent primers in Fig. 2.9, indicating that it might contain a minimum of three copies of Mdmd. However, genome walking on the MV strain revealed that it contains at least six copies of Mdmd. The different results obtained from genome walking and genomic DNA amplification with divergent primers indicate that various methods are required to determine the structure of the M-loci in other M. domestica strains.

Cladogram analysis further illustrated that sequences in the MIII-locus and the MV-locus could be divided into different clades, with sequences within clades being more similar than sequences between clades. Interestingly, the MIII-locus and the MV-locus share some similar sequences. These results are most easily explained by assuming that there have been independent amplification events before and after translocation of the M-locus to autosomes III and V, possibly from the Y-chromosome. In addition, some sequences are always interspersed by identical or similar genomic sequences that exist in both the male and the female genome, indicating that amplification of Mdmd occurred with inclusion of their flanking genomic regions. Also, it is still not known how many repetitive sequences exist in the M-loci from different M. domestica strains. Further characterisation of the M-loci by Pacific Biosciences (PacBio) sequencing that produces long reads will be required to determine the precise structure of the M-loci in different M. domestica strains.
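The two copy-number estimates for the MV-locus discussed in this section can be compared with a short sketch. It assumes that each distinct fragment amplified with divergent primers corresponds to one junction between two adjacent Mdmd copies, so that k fragments imply at least k + 1 copies, and that independent lower bounds are combined by taking the largest one. Both function names are hypothetical.

```python
def min_copies_from_junctions(n_fragments):
    """Lower bound on tandem copies from divergent-primer PCR.

    Assumption: every distinct fragment spans one junction between two
    neighbouring copies, so k junctions need at least k + 1 copies.
    With zero fragments the result (1) is only meaningful if the gene
    is known to be present at all.
    """
    if n_fragments < 0:
        raise ValueError("fragment count cannot be negative")
    return n_fragments + 1


def combined_lower_bound(*bounds):
    """Several lower bounds on the same quantity: the largest one holds."""
    return max(bounds)


# MV strain: two fragments in Fig. 2.9, at least six copies from genome walking.
pcr_bound = min_copies_from_junctions(2)
walking_bound = 6
print(pcr_bound, combined_lower_bound(pcr_bound, walking_bound))  # 3 6
```

Under these assumptions the two methods do not contradict each other: each gives a minimum, and the PCR-based minimum is simply the less informative one for this strain.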
One of the most striking findings regarding the structure of the M-loci is the presence of transposable element sequences that are homologous to Musca_Mariner_Like_Elements (MLEs). MLEs belong to class II transposons/DNA transposons that are characterised by cut-and-paste transposition. DNA transposons are known to play a role in gene duplication and translocation (Feschotte and Pritham, 2007). M-loci that contain multiple copies of Mdmd flanked by MLEs may suggest that TEs have played a role in generating Mdmd duplications. In addition, transposons may be involved in translocation of the M-loci. For example, the Tc1/mariner element was found to be capable of transposing host sequences in Drosophila melanogaster, Lucilia cuprina and Bactrocera tryoni (Coates et al., 1997). An alternative explanation is that they are not functionally involved in the M-locus evolution, but merely have landed there after amplification and translocation of Mdmd. Besides the presence of transposable element sequences in the Mdmd region, I also observed the insertion of microsatellite sequences in the MV-locus. Microsatellites are simple sequence repeats (SSR), which have high mutation rates (Li et al., 2002). It is currently unknown how the transposable element sequences and microsatellite sequences were inserted in the M-loci and what their role might have been in Mdmd amplification and translocation. Further study is required to determine the causes and effects of the observed association between the M-loci and these repetitive sequences in the housefly genome.

The question of how sex chromosomes evolve is currently receiving a lot of attention given that we now have the genomic tools to address this question in a number of systems (Beukeboom and Perrin, 2014). The housefly polymorphic sex determination systems can be uniquely used to study Y-chromosome evolution. Based on my results, I am able to formulate a hypothesis for the M-loci evolution in the context of the generally accepted model for Y-chromosome evolution (Fig. 2.16; Charlesworth, 1996; Rice, 1996; Beukeboom and Perrin, 2014). The initial stage of the Y-chromosome evolution is considered to be the acquisition of a sex-determining gene by a standard chromosome. First, Mdmd must have evolved as a new male-determining gene and taken up a position at the top of the M. domestica sex determination hierarchy. Whether this happened on the ancestral Y or an autosomal pair that was not yet involved in sex determination cannot be answered at this moment. The next stage would be the reduction of recombination in the surrounding Mdmd region, as predicted by the theory of sex chromosome evolution (Rice, 1996).
This would be followed by accumulation of deleterious mutations and repetitive DNA sequences, including transposable elements, due to a lack of recombination on the proto-sex chromosomes (Bachtrog, 2005, 2006, 2013). I indeed found that M-loci contain transposable elements and repetitive sequences. Insertions of transposons may play a dynamic and early role in proto-Y chromosome degeneration and may cause functional genes to gradually lose their function (Bachtrog, 2005). Also, accumulation of transposable elements and related repeats can induce heterochromatin (Lippman et al., 2004). In at least one other study of novel sex determination genes, in the fish Oryzias latipes, it was found that the young Y-chromosome accumulated inactive repetitive elements and transposable element-like sequences in the male-specific region (Nanda et al., 2002; Kondo et al., 2004). The finding of transposable element insertions close to Mdmd homologous sequences in my study is consistent with this model of Y-chromosome degeneration.

Over evolutionary time, lack of recombination and accumulation of deleterious mutations were counteracted by the amplification of Mdmd on the Y-chromosome, thus forming the M-locus that contains multiple copies of sequences, with various levels of homology to each other. In a more advanced phase of Y-chromosome evolution, the M-locus may translocate again to an autosome and form a new proto-Y chromosome, starting the whole cycle over again. The finding of multiple copies of Mdmd in MIII and MV males may reflect this process. The M-locus may have translocated multiple times from the Y to an autosome and/or subsequently between autosomes. In addition, my cladogram analysis of sequences obtained from genome walking revealed that to some extent different sequences exist in different autosomes, indicating that after translocation, the M-locus underwent further independent amplification on each autosome. The existence of multiple different autosomal M variants in the housefly provides a unique opportunity for further study of early stages of sex chromosome evolution.


Figure 2.16: Model for the evolution of M-loci. Mdmd evolved as a new male-determining gene, generating a proto-Y chromosome. Lack of recombination and accumulation of deleterious mutations were counteracted by the amplification of Mdmd on the Y, which formed the M-locus. After amplification, the M-locus translocated from the Y to autosomes, either multiple times and/or subsequently between autosomes. After translocation, the M-locus underwent independent amplification on each autosome. The differently coloured boxes indicate repetitive sequences in the M-loci that are partly shared between chromosomes. The shaded boxes indicate sequences that exist in both the male and the female genome.

2.6 Acknowledgements

I acknowledge Akash Sharma for providing the four orphan reads of the male-biased sequences and Claudia Brunner for the gel picture in Fig. 2.9.


2.7 Appendix

2.7.1 Primer sequences

GSP1: 5’-TCTACTGGGTGTTCATTTGAATCCGTTGTG-3’
GSP2b: 5’-CCAATACGACTTCCCTTTGCCCTGATAG-3’
GSP2a: 5’-TTCGAGATTCGGCGTCGGTGGCR(A/G)TTCAT-3’
GSP3: 5’-GGTW(A/T)GACGCGGACAATCAACGAGATATT-3’
GSP4b: 5’-AGTGAAATTAAAAGACGCCGGGAAGAGC-3’
GSP4a: 5’-R(A/G)GCAGAATCATGAAATATCACAACGTCATG-3’
AP1: 5’-GTAATACGACTCACTATAGGGC-3’
AP2: 5’-ACTATAGGGCACGCGTGGT-3’
GSP_Dra52_R1: 5’-TCCCTAATTATAGGGTGGCTCAGAACATCG-3’
GSP_Dra52_R2: 5’-CCGTCTTTTAATACCCAAAGTTCTGAAACG-3’
MIII_GSP_Stu93_F1: 5’-CTTCTGTTGTTGGCCCTTCCACCTTTAG-3’
MIII_GSP_Stu93_F2: 5’-GCTGCAATGTCAGATTGTGCATGGGTTAC-3’
MV_GSP_Dra13B_F1: 5’-AAAGCTGTTCTCTCATCCATACAATTCGTG-3’
MV_GSP_Dra13B_F2: 5’-ATGTATACCTACCCAAACTTCGGTGTCCTG-3’
MIII_GSP_Pvu3B_R1: 5’-AGAAACATTTAACGGCACCGGGACACCTC-3’
MIII_GSP_Pvu3B_R2: 5’-GCTGTTTGCCTTGGGCTTAGTTTGTGTGC-3’
MV_GSP_Pvu7B_R1: 5’-TTGGGCTTGACTTGTGTGTATTTTTTCTGC-3’
MV_GSP_Pvu7B_R2: 5’-AAACTTGTTGTTGCAAAATGGTAAGCCTGG-3’
UPM for RACE:
Long: 5’-ATTAACCCTCACTAAAGGGAAAGCAGTGGTATCAACGCAGAGT-3’
Short: 5’-ATTAACCCTCACTAAAGGGA-3’
GSP2b-Dra-52-F: 5’-TGGAAAATTACGATGTTCTGAGCCACCCTA-3’
GSP3-R: 5’-AATATCTCGTTGATTGTCCGCGTCAACC-3’
GSP1-9-F: 5’-CAAACCACCCTGACGACCAGAAGATGATG-3’
GSP4b-R: 5’-GCTCTTCCCGGCGTCTTTTAATTTCACT-3’
M13F: 5’-GTAAAACGACGGCCAGTG-3’
M13R: 5’-CAGGAAACAGCTATGAC-3’
1as: 5’-GATTGGCTCAGATCGGCGTA-3’
6as: 5’-GGTTGACGCGGACAATCAAC-3’
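Some primers in this list appear to be reverse complements of each other, for example GSP4b and GSP4b-R. A minimal sketch for checking such relationships is given here. It assumes the standard IUPAC nucleotide codes (R = A/G, W = A/T) and that annotations such as "R(A/G)" are first reduced to the single code letter; the helper name is hypothetical.

```python
# Complement table for the bases and the IUPAC codes used in the primer list.
COMPLEMENT = {
    "A": "T", "T": "A", "C": "G", "G": "C",
    "R": "Y", "Y": "R", "W": "W", "S": "S", "N": "N",
}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


GSP4B = "AGTGAAATTAAAAGACGCCGGGAAGAGC"
GSP4B_R = "GCTCTTCCCGGCGTCTTTTAATTTCACT"

print(reverse_complement(GSP4B) == GSP4B_R)  # True
```

The same check can be applied to other forward and reverse pairs, keeping in mind that a degenerate position such as W matches either of its bases in the partner primer.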

