Daisy-chain gene drives for the alteration of local populations
Charleston Noble1-2*, John Min1,3-4*, Jason Olejarz2, Joanna Buchthal3-4, Alejandro Chavez1,4,5, Andrea L. Smidler6, Erika A. DeBenedictis3, George M. Church1,4, Martin A. Nowak2,7-8, and Kevin M. Esvelt3
1Department of Genetics, Harvard Medical School, 2Program for Evolutionary Dynamics, Harvard University, 3Media Laboratory, Massachusetts Institute of Technology, 4Wyss Institute for Biologically Inspired Engineering, Harvard University, 5Department of Pathology, Massachusetts General Hospital, 6Department of Immunology and Infectious Diseases, Harvard School of Public Health, Boston, Massachusetts, 7Department of Mathematics, 8Department of Organismic and Evolutionary Biology, Harvard University, USA.
Correspondence to: [email protected].
For full functionality and commenting, please view the online version at www.responsivescience.org/pub/daisydrives.
bioRxiv preprint doi: https://doi.org/10.1101/057307; this version posted June 7, 2016. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. It is made available under a CC-BY-ND 4.0 International license.
Abstract
RNA-guided gene drive elements could address many ecological problems by altering the traits of wild organisms, but the likelihood of global spread tremendously complicates ethical development and use. Here we detail a localized form of CRISPR-based gene drive composed of genetic elements arranged in a daisy-chain such that each element drives the next. “Daisy drive” systems can duplicate any effect achievable using an equivalent global drive system, but their capacity to spread is limited by the successive loss of non-driving elements from the base of the chain. Releasing daisy drive organisms constituting a small fraction of the local wild population can drive a useful genetic element to local fixation for a wide range of fitness parameters without resulting in global spread. We additionally report numerous highly active guide RNA sequences sharing minimal homology that may enable evolutionarily stable daisy drive as well as global CRISPR-based gene drive. Daisy drives could simplify decision-making and promote ethical use by enabling local communities to decide whether, when, and how to alter local ecosystems.
Author’s Summary
‘Global’ gene drive systems based on CRISPR are likely to spread to every population of the target species, hampering safe and ethical use. ‘Daisy drive’ systems offer a way to alter the traits of only local populations in a temporary manner. Because they can exactly duplicate the activity of any global CRISPR-based drive at a local level, daisy drives may enable safe field trials and empower local communities to make decisions concerning their own shared environments.
For more details and an animation intended for a general audience, see the summary at Sculpting Evolution.
Introduction
RNA-guided gene drive elements based on the CRISPR/Cas9 nuclease could be used to spread many types of genetic alterations through sexually reproducing species [1]. These elements function by “homing”, or the conversion of heterozygotes to homozygotes in the germline, which renders offspring more likely to inherit the gene drive element and the accompanying alteration than via Mendelian inheritance (Fig. 1a) [2]. To date, gene drive elements based on Cas9 have been demonstrated in yeast [3], fruit flies [4], and two species of mosquito [5][6]. Drive homing occurred at high efficiency (>90%) in all four species, strongly suggesting that refined versions may be capable of altering entire wild populations. Potential applications include eliminating vector-borne and parasitic diseases, promoting sustainable agriculture, and enabling ecological conservation by curtailing or removing invasive species.
The self-propagating nature of global gene drive systems renders the technology uniquely suited to addressing large-scale ecological problems, but tremendously complicates discussions of whether and how to proceed with any given intervention. Technologies capable of unilaterally altering the shared environment require broad public support. Hence, ethical gene drive research and development must be guided by the communities and nations that depend on the potentially affected ecosystems. Unfortunately, attaining this level of engagement and informed consent becomes progressively more challenging as the size of the affected region increases. Candidate applications that will affect multiple nations could be delayed indefinitely due to lack of consensus.
A method of confining gene drive systems to local populations would greatly simplify community-directed development and deployment while also enabling safe field testing. Existing theoretical strategies [7][8] can locally spread cargo genes nearly to fixation if sufficient organisms (>30% of the local population) are released. “Threshold-dependent” drive systems such as those employing underdominance [9] will spread to fixation in small and geographically isolated subpopulations if organisms exceeding the threshold for population takeover are released (typically ~50%). Toxin-based underdominance approaches are promising and have been demonstrated in fruit flies [10][11], but are more limited in their potential effects than are homing-based drive systems. All of these approaches involve releasing comparatively large numbers of organisms, which may not be politically, economically, or environmentally feasible.
A way to construct locally-confined RNA-guided drive systems could enable many potential applications for which neither global drive systems nor existing local drives are suitable. Here we describe ‘daisy drive’, a powerful form of local drive based on CRISPR-mediated homing in which the drive components are separated into an interdependent daisy-chain. We additionally report newly characterized guide RNA sequences required for evolutionary stability and safe use.
Results
Design and Modeling
Figure 1 | a, Standard CRISPR gene drives distort inheritance in a self-sustaining manner by converting wild-type (W) alleles to drive alleles in heterozygous germline cells. b, A “daisy drive” system consists of a linear chain of serially dependent, unlinked drive elements; in this example, A, B, and C are on separate chromosomes. Elements at the base of the chain cannot drive and are successively lost over time via natural selection, limiting overall spread.
A daisy drive system consists of a linear series of genetic elements arranged such that each element drives the next in the chain (Fig. 1b). The top element, which carries the “payload”, is driven to higher and higher frequencies in the population by the elements below it in the chain. No element in the chain drives itself. The bottom element is lost from the population over time, causing the next element to cease driving and be lost in turn. This process continues up the chain until, eventually, the population returns to its wild-type state (Fig. 1b).
The simplest form of daisy drive, a two-element chain, is obtained by separating CRISPR gene drive components such that the payload-carrying element, designated ‘A’, exhibits drive only in the presence of an unlinked, non-driving element, ‘B’ (Supplementary Fig. 1). These “split drives” have been described [1], demonstrated [3], and recommended [12] as a stringent laboratory confinement strategy. Because any accidental release would involve only a small number of organisms carrying the B element, the driving effect experienced by the A element, and thus its spread, would be negligible in a large population [3]. As long as the payload confers a fitness cost to the host organism, both elements will eventually disappear due to natural selection.
We hypothesized that the spread of the payload-carrying element, A, could be enhanced by adding more elements to the base of the daisy chain. To explore this idea, we formulated a deterministic model which considers the evolution of a large population of diploid organisms affected by a daisy drive system with elements spread across n loci (Supplementary Methods Section 1). At each locus there are two alleles, the wild-type (W) and the corresponding daisy drive element (D). To model the effects of drive in individuals, we assume that germline cells which are heterozygous at a locus convert to drive-homozygotes at that locus with probability H if the previous locus has at least one copy of a drive allele. In other words, individuals with genotype DW at locus i+1 and at least one copy of D at locus i produce gametes having the D allele at i+1 with probability (1+H)/2. We assume that standard Mendelian inheritance occurs in the absence of drive and that all loci are unlinked (e.g., on different chromosomes). We ignore the possible emergence of drive-resistant alleles because these can be prevented by ensuring that each element targets an essential gene with multiple guide RNAs and replaces it with a recoded version [1][13].
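The transmission rule in the preceding paragraph can be written out as a short sketch. The function name and the encoding of genotypes as drive-allele counts (0, 1, or 2) are illustrative assumptions, not part of the published model code, and the sketch covers only inheritance bias, not the fitness costs.

```python
def drive_gamete_probability(copies_next, copies_prev, H):
    """Probability that a gamete carries the drive allele D at locus i+1.

    copies_next: number of D alleles at locus i+1 (0, 1, or 2).
    copies_prev: number of D alleles at locus i, the element that drives i+1.
    H: homing probability (0 <= H <= 1).

    Illustrative encoding only; viability costs are not modeled here.
    """
    if copies_next not in (0, 1, 2) or copies_prev not in (0, 1, 2):
        raise ValueError("allele counts must be 0, 1, or 2")
    if not 0.0 <= H <= 1.0:
        raise ValueError("H must lie in [0, 1]")
    if copies_next == 0:
        return 0.0  # WW: no drive allele to transmit
    if copies_next == 2:
        return 1.0  # DD: every gamete carries D
    if copies_prev >= 1:
        # DW with a driving element present: homing converts DW to DD
        # with probability H, giving (1 + H) / 2 overall.
        return (1.0 + H) / 2.0
    return 0.5  # DW without drive: standard Mendelian inheritance
```

With H = 0 the heterozygote case reduces to the Mendelian value of 1/2, and with H = 1 every gamete of a driven heterozygote carries D.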
To model selection dynamics, we assumed that each construct confers a dominant fitness cost, ci, on its host organism and that these costs are independent (Supplementary Fig. 2; SM Section 1.3). We assume that the target gene, a recoded copy of which is also contained in the corresponding drive element, is haploinsufficient. In this scenario, if a locus i contains a drive element and the next locus does not, then the drive cuts both wild-type alleles at the next locus until both copies are disrupted, rendering resulting gametes nonviable (ci=1). If the next locus instead contains two copies of the drive, then no cutting occurs (ci=0). If there is exactly one drive allele at the next locus, then the wild-type allele is disrupted by cutting, rendering the organism nonviable unless a successful homing event occurs, in which case the drive is copied, a second copy of the target gene is created, and function is rescued. This occurs with probability H, so the associated cost is ci=1-H.
Importantly, these costs are expected to be low because reported RNA-guided gene drives exhibit very high homing efficiencies: over 99% for each of the many drive systems tested in yeast [3], 95% for the fruit fly drive element [4], 99.8% for the drive element in An. stephensi [5], and 87.3% to 99.7% for the three drive systems in An. gambiae [6]. If the target gene is haploinsufficient for gametogenesis, the cost may even be zero (Supplementary Fig. 3). Finally, we assume that the payload element confers an additional dominant cost, cn, which is independent of this process. The total fitness of an individual is equal to f = (1-c1)(1-c2)…(1-cn).
An additional implicit assumption of our model for selection dynamics is that non-payload elements only confer costs via wild-type target gene disruption. We consider this reasonable because most elements in the daisy chain can consist of only guide RNAs, which should confer much lower costs than typical payloads [14][15]; moreover, potentially costly off-target cutting is minimal when using high-fidelity Cas9 variants [16][17].
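As a minimal sketch of the cost rules just described, the fitness of one individual can be computed from its drive-allele counts. The genotype encoding (a list ordered from the base of the chain to the payload locus) and the function name are assumptions made for illustration; the sketch also assumes that a locus with no drive allele imposes no cutting cost on the next locus, which the text implies but does not state explicitly.

```python
def individual_fitness(genotype, H, payload_cost):
    """Fitness f = (1 - c1)(1 - c2)...(1 - cn) for one individual.

    genotype: drive-allele counts (0, 1, or 2) per locus, ordered from the
              base of the daisy chain (index 0) to the payload locus (last).
    H: homing probability.
    payload_cost: dominant cost cn of carrying the payload element.

    Hypothetical helper for illustration, not the authors' implementation.
    """
    f = 1.0
    for i in range(len(genotype) - 1):
        driver, target = genotype[i], genotype[i + 1]
        if driver == 0:
            cost = 0.0      # no drive element at locus i, so no cutting (assumed)
        elif target == 0:
            cost = 1.0      # both wild-type target alleles cut, no rescue
        elif target == 1:
            cost = 1.0 - H  # rescued only if homing succeeds
        else:
            cost = 0.0      # target locus already homozygous for the drive
        f *= 1.0 - cost
    if genotype[-1] >= 1:
        f *= 1.0 - payload_cost  # dominant payload cost
    return f
```

For example, `individual_fitness([1, 1, 1], H=0.95, payload_cost=0.10)` multiplies two homing-dependent factors of 0.95 by the payload factor of 0.90.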
We studied a three-element daisy drive system (C->B->A) via numerical simulation (Fig. 2). We find that arbitrarily high frequencies of the payload element, A, can be achieved by varying the release frequency. However, the system displays high sensitivity to the homing rate and payload cost. In particular, large release sizes (>10% of the resident population) are required to drive costly payloads (>10%) if homing has efficiency on the lower end of observed drive systems (~90%).
Figure 2 | Dynamics of C->B->A daisy drive systems. a, A highly efficient daisy drive (98% homing efficiency) with a 10% fitness cost for the payload element, seeded at 1%, exhibits limited spread (left). The same drive seeded at 5% rapidly spreads the payload to near-fixation (middle). Decreasing the homing efficiency to 90% would then require a larger release size (right). b, The maximum frequency achieved by C->B->A daisy drives as a function of the homing efficiency and the payload cost, for release sizes of 1% (left), 5% (middle), and 10% (right).
We next explored the effects of adding elements to the daisy drive system as a potential means of increasing its potency. We observe that longer chains lead to much stronger drive (Fig. 3). At a homing efficiency of 95% per daisy drive element, which is readily accessible to current drive systems, four- and five-element systems driving a payload with 10% cost could be released at frequencies as low as 5% and 3%, respectively, and still exceed 99% frequency in fewer than 20 generations. On a per-organism basis, these are over 100-fold more efficient than simply releasing organisms with the payload (Supplementary Fig. 4).
Adjusting the model to include repeated releases in every subsequent generation, we observed that daisy drives can readily alter local populations if repeatedly released in very small numbers, although the benefit of repeated release is lost when the initial release size becomes large (>10%) (Supplementary Fig. 5). This may be useful for applications that must affect large geographic regions over extended periods of time, as well as for local eradication campaigns [18].
Figure 3 | The maximum frequency of the payload element (A), as well as its time to 99% frequency in a population, increases with the number of elements in the daisy chain. a, Example simulations assuming a 1% release of daisy drive organisms having a 10% payload fitness cost, and 95% (left), 98% (middle), or 100% (right) homing efficiencies. Darker shades indicate longer daisy chains (from 2 to 5 elements). b, Generations required for the payload element to attain 99% frequency.
Evolutionary Stability and CRISPR Multiplexing
Despite these promising theoretical results, current technological limitations preclude the safe use of daisy drive elements. Specifically, any recombination event that moves one or more guide RNAs within an upstream element of the chain into any downstream element will convert a linear daisy drive chain into a self-sustaining gene drive ‘necklace’ anticipated to spread globally (Fig. 4a).
The only way to reliably prevent such events is to eliminate regions of homology between the elements. Promoter homology can be removed by using different U6, H1, or tRNA promoters to express the required guide RNAs [19][20][21]; if there are insufficient promoters then each can drive expression of multiple guide RNAs using tRNA processing [22][23] or by connecting a pair of sgRNAs by a short linker. However, each element must still encode multiple guide RNAs >80 base pairs in length in order to prevent the creation of drive-resistant alleles, precluding safe and stable daisy drive designs.
One alternative is to use a distinct orthogonal CRISPR system for every daisy element [24] (Supplementary Fig. 6). Unfortunately, enhanced-specificity variants are only available for the S. pyogenes Cas9, it is more difficult to find multiple promoters suitable for Cas9 expression than for guide RNA expression, and the fitness cost is likely higher than that of an equivalent guide RNA element. We accordingly sought to identify highly active guide RNA sequences with minimal homology to one another that could enable safe daisy drive using only a single CRISPR nuclease.
We compared known tracrRNA, crRNA, and alternative sgRNA sequences for CRISPR systems related to that of S. pyogenes to identify bases tolerant of variation within the sequence of the most commonly used sgRNA (Fig. 4b-c). We then created dozens of sgRNA variants designed to be as divergent from one another as possible. Assaying these using a sensitive tdTomato-based transcriptional activation reporter identified 15 different sgRNAs with activities comparable to the standard version (Fig. 4d). Activity increased with the length of the first stem in agreement with other reports (Supplementary Figs. 7-8) [25]. This set of minimally homologous sgRNAs can be used to construct stable daisy drive systems of up to 5 elements with 4 sgRNAs per driving element, and will also facilitate multiplexed Cas9 targeting in the laboratory by permitting the commercial synthesis of DNA fragments encoding many sequence-divergent guide RNAs. Future studies will need to examine the stability of the resulting daisy drive systems in large populations of animal models.
Importantly, our divergent guide RNAs will also enable global CRISPR gene drive elements to overcome the problem of instability caused by including multiple repetitive guide RNA sequences in the drive cassette [26], which in turn is required in order to overcome drive-resistant alleles [13]. Using non-repetitive guides may consequently allow stable and efficient global drive elements to affect every organism in the target population.
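One simple way to think about “minimal homology” between two guide RNA scaffolds is the length of the longest stretch of identical sequence they share. The sketch below is only an illustration of that idea: the metric, the function name, and the toy sequences are assumptions, not the design procedure or the sgRNA sequences used in this study, and it compares sequences in one orientation only.

```python
def longest_shared_run(seq_a, seq_b):
    """Length of the longest contiguous substring common to both sequences.

    Standard dynamic-programming longest-common-substring computation,
    used here as a crude proxy for shared homology.
    """
    best = 0
    prev = [0] * (len(seq_b) + 1)
    for base_a in seq_a:
        cur = [0] * (len(seq_b) + 1)
        for j, base_b in enumerate(seq_b, start=1):
            if base_a == base_b:
                # Extend the run that ended at the previous position in both.
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


# Toy sequences, not real sgRNA scaffolds.
print(longest_shared_run("AAAGGGCCC", "TTTGGGTTT"))
```

A design screen could, for instance, reject any candidate pair whose longest shared run exceeds a chosen cutoff; the appropriate cutoff would have to come from experimental data on recombination.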
Figure 4 | a, Any recombination event that moves a guide RNA from one element to another could create a "daisy necklace" capable of self-sustaining global drive. b, Because promoters can be changed, repetition of the conserved guide RNA sequence is a key problem. c, Using existing data, we generated a template identifying candidate positions presumed tolerant of sequence changes. d, Relative activities of candidate guide RNAs generated from the template were assayed using a Cas9 transcriptional activator screen using a tdTomato reporter in human cells.
Discussion
Construction and Deployment
On a practical level, researchers need only construct one ‘generic’ daisy drive strain per species, the equivalent of a multistage rocket that could be loaded with any desired payload. This generic daisy drive system, which would harbor the Cas9 gene in the B position but lack any A elements, could be used in three different ways.
First, one or more elements carrying payloads could be added directly to the generic daisy drive strain. In addition to the payload, each such A element must encode guide RNAs sufficient to drive itself in the presence of Cas9. These daisy-drive organisms would then be mass-produced and released in a single-strain, single-stage approach.
Second, the generic daisy drive strain itself could be released in the target region to spread the Cas9 gene and accompanied by one or more strains carrying payload elements. Matings in the wild would combine the elements to generate the desired effect. This is a multi-strain, single-stage approach.
Third, the generic daisy drive strain could be released and the spread of the Cas9 gene monitored in order to identify the exact region that would be affected. Optionally, spread within this region could be adjusted by releasing wild-type organisms. Once acceptably distributed, a subsequent release of strains carrying payload elements would initiate the desired effect.
Field Trials and Safeguards
Some ecological problems are so widely distributed geographically that addressing them may require global gene drive systems. However, global drive systems cannot be tested in field trials without a substantial risk of eventual worldwide spread [27]. Daisy drive systems, which are capable of mimicking the molecular effects of any given global drive on a local level, may offer a potential solution.
Similarly, scientists currently have few attractive options for controlling unauthorized or accidentally-released global drive systems. While it is possible to overwrite genome-level alterations and undo phenotypic changes using immunizing reversal drives [1], these countermeasures must necessarily spread to the entire population in order to immunize it against the unwanted drive system; strategies based on pure reversal drives [3] or variations such as gene drive ‘brakes’ [28] will only slow it down. In contrast, daisy drive systems may be powerful enough to eliminate all copies of an unwanted global drive system via local immunizing reversal or population suppression before disappearing themselves.
Lastly, daisy drive systems could permit controlled and persistent population suppression by linking a sex-specific effect to a genetic locus unique to the other sex. For example, female fertility genes such as those recently identified in malarial mosquitoes [6] could be targeted by a genetic load daisy drive whose basal element is located on the Y chromosome or an equivalent male-specific locus (Supplementary Fig. 9). These males would suffer no fitness costs due to suppression relative to competing wild-type males. If female fertility gene disruption occurred early in development rather than in the germline, the same system could produce a male-linked dominant sterile-daughter effect that would be less powerful but more readily modulated (Supplementary Fig. 10).
By enabling scientists to reversibly control local population abundance, daisy drives could become a valuable tool for the study of ecological interactions and the likely consequences of releasing global RNA-guided suppression drives.
Conclusion
RNA-guided gene drives based on CRISPR/Cas9 have generated considerable excitement as a potential means of addressing otherwise intractable ecological problems. While experiments have raced ahead at a breathtaking pace, the likelihood of global spread once released into the wild may prove a formidable barrier to deployment due to the need for international public support, field trials, and subsequent regulatory approval. These ethical and diplomatic complications are most acute for drive systems aiming to solve the most urgent humanitarian problems, including malaria, schistosomiasis, dengue, Zika, and other vector-borne and parasitic diseases. Lack of international consensus could delay approval by years or even decades.
Similarly, the potential for global RNA-guided drive systems to be released accidentally or deployed unilaterally has led to many calls for caution and expressions of alarm, not least from scientists in the vanguard of the field [1][29][12]. Any such event could have potentially devastating consequences for public trust and support for future interventions.
In contrast, daisy drive systems might be safely developed in the laboratory, assessed in the field, and deployed to accomplish transient alterations that do not impact other nations or jurisdictions. By using molecular constraints to limit generational and geographic spread in a tunable manner, daisy drives could expand the scope of ecological engineering by enabling local communities to make decisions concerning their own local environments.
References
1. Esvelt, K. M., Smidler, A. L., Catteruccia, F. & Church, G. M. Concerning RNA-guided gene drives for the alteration of wild populations. eLife e03401 (2014). doi:10.7554/eLife.03401
2. Burt, A. Site-specific selfish genes as tools for the control and genetic engineering of natural populations. Proc. Biol. Sci. 270, 921–928 (2003).
3. DiCarlo, J. E., Chavez, A., Dietz, S. L., Esvelt, K. M. & Church, G. M. Safeguarding CRISPR-Cas9 gene drives in yeast. Nat. Biotechnol. 33, 1250–1255 (2015).
4. Gantz, V. M. & Bier, E. Genome editing. The mutagenic chain reaction: a method for converting heterozygous to homozygous mutations. Science 348, 442–444 (2015).
5. Gantz, V. M. et al. Highly efficient Cas9-mediated gene drive for population modification of the malaria vector mosquito Anopheles stephensi. Proc. Natl. Acad. Sci. U.S.A. 112, E6736-6743 (2015).
6. Hammond, A. et al. A CRISPR-Cas9 gene drive system targeting female reproduction in the malaria mosquito vector Anopheles gambiae. Nat. Biotechnol. 34, 78–83 (2016).
7. Gould, F., Huang, Y., Legros, M. & Lloyd, A. L. A killer-rescue system for self-limiting gene drive of anti-pathogen constructs. Proc. Biol. Sci. 275, 2823–2829 (2008).
8. Rasgon, J. L. Multi-Locus Assortment (MLA) for Transgene Dispersal and Elimination in Mosquito Populations. PLoS ONE 4, e5833 (2009).
9. Curtis, C. F. Possible use of translocations to fix desirable genes in insect pest populations. Nature 218, 368–369 (1968).
10. Akbari, O. S. et al. A synthetic gene drive system for local, reversible modification and suppression of insect populations. Curr. Biol. CB 23, 671–677 (2013).
11. Reeves, R. G., Bryk, J., Altrock, P. M., Denton, J. A. & Reed, F. A. First steps towards underdominant genetic transformation of insect populations. PLoS ONE 9, e97557 (2014).
12. Akbari, O. S. et al. Safeguarding gene drive experiments in the laboratory. Science 349, 927–929 (2015).
13. Noble, C., Olejarz, J., Esvelt, K. M., Church, G. M. & Nowak, M. A. Evolutionary dynamics of CRISPR gene drives. XXXXX (2016). doi:10.1101/0XXXXX
14. Marrelli, M. T., Moreira, C. K., Kelly, D., Alphey, L. & Jacobs-Lorena, M. Mosquito transgenesis: what is the fitness cost? Trends Parasitol. 22, 197–202 (2006).
15. Harvey-Samuel, T., Ant, T., Gong, H., Morrison, N. I. & Alphey, L. Population-level effects of fitness costs associated with repressible female-lethal transgene insertions in two pest insects. Evol. Appl. 7, 597–606 (2014).
16. Slaymaker, I. M. et al. Rationally engineered Cas9 nucleases with improved specificity. Science 351, 84–88 (2016).
17. Kleinstiver, B. P. et al. High-fidelity CRISPR-Cas9 nucleases with no detectable genome-wide off-target effects. Nature 529, 490–495 (2016).
18. Wyss, J. H. Screwworm eradication in the Americas. Ann. N.Y. Acad. Sci. 916, 186–193 (2000).
19. Port, F., Chen, H.-M., Lee, T. & Bullock, S. L. Optimized CRISPR/Cas tools for efficient germline and somatic genome engineering in Drosophila. Proc. Natl. Acad. Sci. U.S.A. 111, E2967-2976 (2014).
20. Ranganathan, V., Wahlin, K., Maruotti, J. & Zack, D. J. Expansion of the CRISPR-Cas9 genome targeting space through the use of H1 promoter-expressed guide RNAs. Nat. Commun. 5, 4516 (2014).
21. Mefferd, A. L., Kornepati, A. V. R., Bogerd, H. P., Kennedy, E. M. & Cullen, B. R. Expression of CRISPR/Cas single guide RNAs using small tRNA promoters. RNA N.Y. N 21, 1683–1689 (2015).
22. Xie, K., Minkenberg, B. & Yang, Y. Boosting CRISPR/Cas9 multiplex editing capability with the endogenous tRNA-processing system. Proc. Natl. Acad. Sci. U.S.A. 112, 3570–3575 (2015).
23. Port, F. & Bullock, S. L. Expansion of the CRISPR toolbox in an animal with tRNA-flanked Cas9 and Cpf1 gRNAs. bioRxiv 46417 (2016). doi:10.1101/046417
24. Esvelt, K. M. et al. Orthogonal Cas9 proteins for RNA-guided gene regulation and editing. Nat. Methods 10, 1116–1121 (2013).
25. Dang, Y. et al. Optimizing sgRNA structure to improve CRISPR-Cas9 knockout efficiency. Genome Biol. 16, 280 (2015).
26. Simoni, A. et al. Development of synthetic selfish elements based on modular nucleases in Drosophila melanogaster. Nucleic Acids Res. (2014). doi:10.1093/nar/gku387
27. Marshall, J. M. The effect of gene drive on containment of transgenic mosquitoes. J. Theor. Biol. 258, 250–265 (2009).
28. Wu, B., Luo, L. & Gao, X. J. Cas9-triggered chain ablation of cas9 as a gene drive brake. Nat. Biotechnol. 34, 137–138 (2016).
29. Oye, K. A. et al. Regulating gene drives. Science 345, 626–628 (2014).
Acknowledgments
We thank M. Tuttle for performing preliminary guide RNA activity assays; F. Gould, A. Lloyd, and L. Alphey for helpful discussions; and L. Alphey for critical reading of the manuscript.
Author contributions
K.M.E. conceived the study; J.M. and K.M.E. ran preliminary simulations with advice from A.L.S.; C.N., J.O., and M.A.N. created the evolutionary dynamics model; J.M. and K.M.E. designed the guide RNA template and candidate sequences; J.B. and A.C. designed and performed guide RNA experiments with advice from K.M.E.; E.D. created the interactive version of the model; and C.N., J.M., and K.M.E. wrote the manuscript with contributions from all other authors.
Methods
Guide RNA Design
We examined existing data on guide RNA variants and corresponding activities as well as the crystal structure of S. pyogenes Cas9 in complex with sgRNA to identify bases that would likely tolerate mutation. Using this information, we constructed a set of 20 sgRNAs and assayed activity (see below) using only two replicates to identify sequence changes that were harmful to activity. These experiments suggested that the large insertion found in sgRNAs from closely related bacteria was well-tolerated in only one case. It was consequently removed and additional sgRNAs designed. All candidates were then assayed to identify those with sufficiently high activity. Future experiments requiring additional highly divergent sgRNAs, such as daisy suppression drives in which the A element encodes many guide RNAs that disrupt multiple recessive fertility genes at multiple sites, will require a more comprehensive library-based approach to activity profiling.
Measuring Guide RNA Activity
HEK293T cells were grown in Dulbecco’s Modified Eagle Medium (Life Technologies) fortified with 10% FBS (Life Technologies) and Penicillin/Streptomycin (Life Technologies). Cells were incubated at a constant temperature of 37°C with 5% CO2. In preparation for transfection, cells were split into 24-well plates, divided into approximately 50,000 cells per well. Cells were transfected using 2 µL of Lipofectamine 2000 (Life Technologies) with 200 ng of dCas9 activator plasmid, 25 ng of guide RNA plasmid, 60 ng of reporter plasmid and 25 ng of EBFP2-expressing plasmid.
Fluorescent transcriptional activation reporter assays were performed using a modified version of Addgene plasmid #47320, a reporter expressing a tdTomato fluorescent protein adapted to contain an additional gRNA binding site 100 bp upstream of the original site. gRNAs were co-transfected with the reporter, dCas9-VPR (a tripartite transcriptional activator fused to the C-terminus of nuclease-null Streptococcus pyogenes Cas9), and an EBFP2-expressing control plasmid into HEK293T cells. 48 hours post-transfection, cells were analyzed by flow cytometry. In order to exclusively analyze transfected cells, cells with less than 10^3 EBFP2 expression were ignored. The preliminary screen of the initial 20 designs was performed with only two replicates to identify critical bases. Experiments evaluating the final set of sgRNA sequences were performed with six biological replicates.
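The gating step described above (ignoring cells with EBFP2 signal below 10^3) can be sketched as a simple filter over per-cell measurements. The data layout, the function name, and the use of the median as the summary statistic are assumptions for illustration; the text does not specify how reporter signal was summarized after gating.

```python
from statistics import median


def gated_reporter_median(events, ebfp2_threshold=1e3):
    """Median tdTomato signal among cells that pass the EBFP2 gate.

    events: iterable of (ebfp2, tdtomato) fluorescence pairs, one per cell.
    ebfp2_threshold: cells with EBFP2 below this value are treated as
                     untransfected and ignored (10^3 in the text).

    The median is an assumed summary statistic, not the authors' stated one.
    """
    kept = [tdtomato for ebfp2, tdtomato in events if ebfp2 >= ebfp2_threshold]
    if not kept:
        raise ValueError("no events passed the EBFP2 gate")
    return median(kept)


# Made-up fluorescence values, for illustration only.
example_events = [(500, 10), (2000, 100), (5000, 300), (1000, 200)]
print(gated_reporter_median(example_events))
```

Relative activity for a candidate guide could then be expressed as the ratio of its gated summary value to that of the standard sgRNA measured in the same experiment.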
Supplementary Methods: Daisy-chain gene drives for the alteration of local populations
Charleston Noble, John Min, Jason Olejarz, Joanna Buchthal, Alejandro Chavez, Andrea L. Smidler, Erika A. DeBenedictis, George M. Church, Martin A. Nowak, and Kevin M. Esvelt
In these Supplementary Methods, we study the evolutionary dynamics of “daisy drive” genetic constructs. The aim of engineering such constructs is to genetically manipulate some, but not all, of a wild population.
1 Evolutionary dynamics of a daisy drive construct
We first describe a daisy drive system consisting of only two elements. This simple case demonstrates the principle behind daisy drive engineering. We then describe a daisy drive system with an arbitrary number of elements.
1.1 Model for the evolutionary dynamics of a 2-element daisy drive
We consider a wild population of diploid organisms and focus on two loci, “1” and “2”. The wild-type alleles at the two loci are 1W and 2W, and we denote by 1WW 2WW the genotype of an individual which is homozygous for both. Using CRISPR genome editing technology, one can engineer what we refer to as “daisy” alleles at both loci (1D and 2D). They function as follows. The 1D allele effects cutting of the 2W allele in an individual’s germline. After cutting, if the individual additionally has a 2D allele (i.e., the individual is 2WD), then the individual is converted from 2WD to 2DD with some probability. Otherwise the individual remains 2WD. This results in super-Mendelian inheritance of the 2D allele in a 1D-mediated fashion. Importantly, the 1D allele undergoes standard inheritance and does not facilitate its own spread similarly. We assume that the two loci are independent and that a single copy of 1D produces cutting.
To see how the daisy drive works, consider Table 1, which is understood as follows.
Gametes of haplotype 1W 2W are produced in the following ways:
• 1WW 2WW individuals produce only 1W 2W gametes.
• 1WW 2WD individuals produce gametes with a wild-type allele at the second locus with probability 1/2. There is a fitness effect, F, due to the payload of the drive allele at the second locus. So 1WW 2WD individuals produce 1W 2W gametes at relative rate F/2.
• 1WD 2WD individuals produce gametes with a wild-type allele at the first locus with probability 1/2. The action of the drive allele at the first locus is to bias the inheritance of the drive allele at the second locus, quantified by a factor fW regarding the transmission of the wild-type allele at the second locus. There is a fitness effect, F, due to the payload of the drive allele at the second locus. So 1WD 2WD individuals produce 1W 2W gametes at relative rate fW F/2.

Genotype     1W 2W      1W 2D      1D 2W      1D 2D
1WW 2WW      1          0          0          0
1WW 2WD      F/2        F/2        0          0
1WW 2DD      0          F          0          0
1WD 2WW      0          0          0          0
1WD 2WD      fW F/2     fD F/2     fW F/2     fD F/2
1WD 2DD      0          F/2        0          F/2
1DD 2WW      0          0          0          0
1DD 2WD      0          0          fW F       fD F
1DD 2DD      0          0          0          F

Table 1: Gamete production table showing the relative rates at which individuals of each genotype (rows) produce gametes of each haplotype (columns).
Gametes of haplotype $1_W2_D$ are produced in the following ways:
• $1_{WW}2_{WD}$ individuals produce gametes with a wild-type allele at the first locus with probability 1 and gametes with a drive allele at the second locus with probability 1/2. There is a fitness effect, $F$, due to the payload of the drive allele at the second locus. So $1_{WW}2_{WD}$ individuals produce $1_W2_D$ gametes at relative rate $F/2$.
• $1_{WW}2_{DD}$ individuals produce only $1_W2_D$ gametes. There is a fitness effect, $F$, due to the payload of the drive allele at the second locus. So $1_{WW}2_{DD}$ individuals produce $1_W2_D$ gametes at relative rate $F$.
• $1_{WD}2_{WD}$ individuals produce gametes with a wild-type allele at the first locus with probability 1/2. The action of the drive allele at the first locus is to bias the inheritance of the drive allele at the second locus, quantified by a factor $f_D$ regarding the transmission of the drive allele at the second locus. There is a fitness effect, $F$, due to the payload of the drive allele at the second locus. So $1_{WD}2_{WD}$ individuals produce $1_W2_D$ gametes at relative rate $f_DF/2$.
• $1_{WD}2_{DD}$ individuals produce gametes with a wild-type allele at the first locus with probability 1/2 and gametes with a drive allele at the second locus with probability 1. There is a fitness effect, $F$, due to the payload of the drive allele at the second locus. So $1_{WD}2_{DD}$ individuals produce $1_W2_D$ gametes at relative rate $F/2$.
Gametes of haplotype $1_D2_W$ are produced in the following ways:
• $1_{DD}2_{WD}$ individuals produce gametes with a drive allele at the first locus with probability 1. The action of the drive allele at the first locus is to bias the inheritance of the drive allele at the second locus, quantified by a factor $f_W$ regarding the transmission of the wild-type allele at the second locus. There is a fitness effect, $F$, due to the payload of the drive allele at the second locus. So $1_{DD}2_{WD}$ individuals produce $1_D2_W$ gametes at relative rate $f_WF$.
• $1_{WD}2_{WD}$ individuals produce gametes with a drive allele at the first locus with probability 1/2. The action of the drive allele at the first locus is to bias the inheritance of the drive allele at the second locus, quantified by a factor $f_W$ regarding the transmission of the wild-type allele at the second locus. There is a fitness effect, $F$, due to the payload of the drive allele at the second locus. So $1_{WD}2_{WD}$ individuals produce $1_D2_W$ gametes at relative rate $f_WF/2$.
Gametes of haplotype $1_D2_D$ are produced in the following ways:
• $1_{WD}2_{WD}$ individuals produce gametes with a drive allele at the first locus with probability 1/2. The action of the drive allele at the first locus is to bias the inheritance of the drive allele at the second locus, quantified by a factor $f_D$ regarding the transmission of the drive allele at the second locus. There is a fitness effect, $F$, due to the payload of the drive allele at the second locus. So $1_{WD}2_{WD}$ individuals produce $1_D2_D$ gametes at relative rate $f_DF/2$.
• $1_{WD}2_{DD}$ individuals produce gametes with a drive allele at the first locus with probability 1/2 and gametes with a drive allele at the second locus with probability 1. There is a fitness effect, $F$, due to the payload of the drive allele at the second locus. So $1_{WD}2_{DD}$ individuals produce $1_D2_D$ gametes at relative rate $F/2$.
• $1_{DD}2_{WD}$ individuals produce gametes with a drive allele at the first locus with probability 1. The action of the drive allele at the first locus is to bias the inheritance of the drive allele at the second locus, quantified by a factor $f_D$ regarding the transmission of the drive allele at the second locus. There is a fitness effect, $F$, due to the payload of the drive allele at the second locus. So $1_{DD}2_{WD}$ individuals produce $1_D2_D$ gametes at relative rate $f_DF$.
• $1_{DD}2_{DD}$ individuals produce only $1_D2_D$ gametes. There is a fitness effect, $F$, due to the payload of the drive allele at the second locus. So $1_{DD}2_{DD}$ individuals produce $1_D2_D$ gametes at relative rate $F$.
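The production rules listed above can be collected into a single lookup structure. The following Python sketch transcribes Table 1 as a function of $F$, $f_W$ and $f_D$. The function name `gamete_table`, the string labels, and the numerical parameter values are illustrative assumptions and are not taken from the model itself.

```python
from fractions import Fraction

# Column order of Table 1 (gamete haplotypes).
HAPLOTYPES = ("1W2W", "1W2D", "1D2W", "1D2D")


def gamete_table(F, fW, fD):
    """Return Table 1 as a dict: genotype -> relative gamete production rates.

    F  : fitness factor associated with the payload at locus 2
    fW : transmission factor for the wild-type allele at locus 2 under drive
    fD : transmission factor for the daisy allele at locus 2 under drive
    """
    half = Fraction(1, 2)
    return {
        "1WW2WW": (1, 0, 0, 0),
        "1WW2WD": (half * F, half * F, 0, 0),
        "1WW2DD": (0, F, 0, 0),
        "1WD2WW": (0, 0, 0, 0),  # both wild-type targets are cut: no gametes
        "1WD2WD": (half * fW * F, half * fD * F, half * fW * F, half * fD * F),
        "1WD2DD": (0, half * F, 0, half * F),
        "1DD2WW": (0, 0, 0, 0),  # both wild-type targets are cut: no gametes
        "1DD2WD": (0, 0, fW * F, fD * F),
        "1DD2DD": (0, 0, 0, F),
    }


# Hypothetical parameter values, chosen only for illustration.
table = gamete_table(F=Fraction(9, 10), fW=Fraction(1, 20), fD=Fraction(19, 20))
for genotype, rates in table.items():
    print(genotype, dict(zip(HAPLOTYPES, rates)))
```

Exact fractions are used so that row totals can be compared without rounding; floating-point values would work equally well.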
Using these rules, we can formally express the rates at which the four types of gametes are produced in the population. We denote by $g(z)$ the rate (with implicit time-dependence) at which gametes with haplotype $z$ are produced by individuals in the population.
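\[
\begin{aligned}
g(1_W2_W) &= x(1_{WW}2_{WW}) + \tfrac{1}{2}F\,x(1_{WW}2_{WD}) + \tfrac{1}{2}f_WF\,x(1_{WD}2_{WD}) \\
g(1_W2_D) &= F\,x(1_{WW}2_{DD}) + \tfrac{1}{2}F\,x(1_{WW}2_{WD}) + \tfrac{1}{2}F\,x(1_{WD}2_{DD}) + \tfrac{1}{2}f_DF\,x(1_{WD}2_{WD}) \\
g(1_D2_W) &= f_WF\,x(1_{DD}2_{WD}) + \tfrac{1}{2}f_WF\,x(1_{WD}2_{WD}) \\
g(1_D2_D) &= F\,x(1_{DD}2_{DD}) + \tfrac{1}{2}F\,x(1_{WD}2_{DD}) + \tfrac{1}{2}f_DF\,x(1_{WD}2_{WD}) + f_DF\,x(1_{DD}2_{WD})
\end{aligned}
\]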
Here we have used the following notation: $x(z)$ is the frequency of individuals with genotype $z$ and $f(z)$ is the fitness of individuals with genotype $z$.
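As a quick consistency check, consider a hypothetical population made up entirely of $1_{WD}2_{WD}$ individuals, so that $x(1_{WD}2_{WD}) = 1$ and every other frequency is zero. Reading the corresponding row of Table 1, the gamete production rates would be
\[
g(1_W2_W) = g(1_D2_W) = \tfrac{1}{2}f_WF, \qquad g(1_W2_D) = g(1_D2_D) = \tfrac{1}{2}f_DF .
\]
The total gamete output is then $F(f_W + f_D)$, and the fraction of gametes carrying $2_D$ is $f_D/(f_W + f_D)$, which exceeds the Mendelian value of 1/2 whenever $f_D > f_W$. The fraction carrying $1_D$ remains exactly 1/2, reflecting the standard inheritance of the first element.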
The selection dynamics are then modeled by the following system of equations:
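\[
\begin{aligned}
\dot{x}(1_{WW}2_{WW}) &= g(1_W2_W)^2 - \psi^2 x(1_{WW}2_{WW}) \\
\dot{x}(1_{WW}2_{WD}) &= 2g(1_W2_W)g(1_W2_D) - \psi^2 x(1_{WW}2_{WD}) \\
\dot{x}(1_{WW}2_{DD}) &= g(1_W2_D)^2 - \psi^2 x(1_{WW}2_{DD}) \\
\dot{x}(1_{WD}2_{WW}) &= 2g(1_W2_W)g(1_D2_W) - \psi^2 x(1_{WD}2_{WW}) \\
\dot{x}(1_{WD}2_{WD}) &= 2g(1_W2_D)g(1_D2_W) + 2g(1_W2_W)g(1_D2_D) - \psi^2 x(1_{WD}2_{WD}) \\
\dot{x}(1_{WD}2_{DD}) &= 2g(1_W2_D)g(1_D2_D) - \psi^2 x(1_{WD}2_{DD}) \\
\dot{x}(1_{DD}2_{WW}) &= g(1_D2_W)^2 - \psi^2 x(1_{DD}2_{WW}) \\
\dot{x}(1_{DD}2_{WD}) &= 2g(1_D2_W)g(1_D2_D) - \psi^2 x(1_{DD}2_{WD}) \\
\dot{x}(1_{DD}2_{DD}) &= g(1_D2_D)^2 - \psi^2 x(1_{DD}2_{DD})
\end{aligned}
\]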
Here, an overdot denotes the time derivative, $d/dt$. Throughout this SI, we omit explicitly writing the time dependence of our dynamical quantities. Note that this formulation assumes random mating, i.e., that two random gametes come together to form an individual. Also note that products $g(y)g(z)$ represent the pairings of different gametes. At any given time, we require that the genotype frequencies sum to one:
\[
\sum_z x(z) = 1
\]
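To enforce this density constraint, we set
\[
\psi = g(1_W2_W) + g(1_W2_D) + g(1_D2_W) + g(1_D2_D)
\]
With this choice, summing the nine equations above gives $\sum_z \dot{x}(z) = \psi^2 - \psi^2 \sum_z x(z)$, which vanishes whenever the frequencies sum to one.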
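The two-element system can be explored numerically. The sketch below integrates the nine selection equations with a simple forward-Euler scheme, using the Table 1 entries for the gamete production rates. The integration scheme, step size, function names, and parameter values are all assumptions made for illustration; the text does not specify how the equations were solved.

```python
GENOTYPES = ["1WW2WW", "1WW2WD", "1WW2DD",
             "1WD2WW", "1WD2WD", "1WD2DD",
             "1DD2WW", "1DD2WD", "1DD2DD"]


def gamete_rates(x, F, fW, fD):
    """Gamete production rates g(z), using the entries of Table 1."""
    return {
        "1W2W": x["1WW2WW"] + 0.5 * F * x["1WW2WD"] + 0.5 * fW * F * x["1WD2WD"],
        "1W2D": (F * x["1WW2DD"] + 0.5 * F * x["1WW2WD"]
                 + 0.5 * F * x["1WD2DD"] + 0.5 * fD * F * x["1WD2WD"]),
        "1D2W": fW * F * x["1DD2WD"] + 0.5 * fW * F * x["1WD2WD"],
        "1D2D": (F * x["1DD2DD"] + 0.5 * F * x["1WD2DD"]
                 + 0.5 * fD * F * x["1WD2WD"] + fD * F * x["1DD2WD"]),
    }


def derivatives(x, F, fW, fD):
    """Right-hand sides of the nine selection equations."""
    g = gamete_rates(x, F, fW, fD)
    psi = sum(g.values())  # enforces the density constraint
    a, b, c, d = g["1W2W"], g["1W2D"], g["1D2W"], g["1D2D"]
    births = {
        "1WW2WW": a * a, "1WW2WD": 2 * a * b, "1WW2DD": b * b,
        "1WD2WW": 2 * a * c, "1WD2WD": 2 * b * c + 2 * a * d, "1WD2DD": 2 * b * d,
        "1DD2WW": c * c, "1DD2WD": 2 * c * d, "1DD2DD": d * d,
    }
    return {z: births[z] - psi ** 2 * x[z] for z in GENOTYPES}


def simulate(x0, F, fW, fD, dt=0.01, steps=2000):
    """Forward-Euler integration (an illustrative choice of scheme)."""
    x = dict(x0)
    for _ in range(steps):
        dx = derivatives(x, F, fW, fD)
        x = {z: x[z] + dt * dx[z] for z in GENOTYPES}
    return x


# Hypothetical release: 10% 1DD2DD organisms into a wild-type population.
x0 = {z: 0.0 for z in GENOTYPES}
x0["1WW2WW"], x0["1DD2DD"] = 0.9, 0.1
final = simulate(x0, F=0.9, fW=0.025, fD=0.975)
print({z: round(v, 4) for z, v in final.items()})
print("total frequency:", sum(final.values()))
```

Because $\psi$ is recomputed at every step, the total frequency stays at one up to floating-point error, which gives a simple check on any implementation.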
1.2 Model for the evolutionary dynamics of an n-element daisy drive
We can apply the same engineering to a daisy drive chain of arbitrary length $n$, where the drive allele at one locus induces cutting of the wild-type allele at the next locus in the sequence. To describe this mathematically, it is helpful to generalize our notation.
Consider a daisy drive construct with only two loci, as in the previous section. We use a “1” bit to denote a wild-type allele, and we use a “0” bit to denote a daisy drive allele. The two alleles at a particular locus are denoted in a vertical pair. Thus, we denote the nine possible genotypes as
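\[
\begin{aligned}
1_{WW}2_{WW} &= \substack{11\\11} \\
1_{WW}2_{WD} &= \substack{11\\10} \\
1_{WW}2_{DD} &= \substack{10\\10} \\
1_{WD}2_{WW} &= \substack{11\\01} \\
1_{WD}2_{WD} &= \substack{11\\00} \\
1_{WD}2_{DD} &= \substack{10\\00} \\
1_{DD}2_{WW} &= \substack{01\\01} \\
1_{DD}2_{WD} &= \substack{01\\00} \\
1_{DD}2_{DD} &= \substack{00\\00}
\end{aligned}
\]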
Notice that if an individual is heterozygous at a particular locus, then this notation allows for two ways of writing the alleles at that locus. For example, genotype $1_{WD}2_{WD}$ can be written in any one of four equivalent ways: $\substack{11\\00}$, $\substack{00\\11}$, $\substack{10\\01}$, or $\substack{01\\10}$.
When modeling daisy drives with a large number of loci, it is helpful to adopt shorthand notation. We use $p$ and $q$ to denote two binary strings of alleles. The strings $p$ and $q$ need not be equal (i.e., the bits at corresponding positions need not all be equal). For example, genotype $1_{WW}2_{WD}$ can be written as $\substack{p\\q}$, where $p = 11$ and $q = 10$. In this case, we denote the individual bits in the strings $p$ and $q$ by $p_1 = 1$, $p_2 = 1$, $q_1 = 1$, and $q_2 = 0$. Notice that $p = 10$ and $q = 11$ (i.e., $p_1 = 1$, $p_2 = 0$, $q_1 = 1$, and $q_2 = 1$) also describe the same genotype.
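Since several bit-string pairs describe the same genotype, an implementation needs some way of picking one representative. The Python sketch below uses one possible convention, which is an assumption of this example rather than part of the notation: at every locus the larger bit is placed in $p$. The helper names `canonical` and `all_genotypes` are likewise hypothetical.

```python
from itertools import product


def canonical(p, q):
    """Map any (p, q) pair of bit strings to one representative per genotype.

    Convention (an assumption of this sketch): at each locus the larger bit
    goes into p and the smaller into q, so heterozygous loci read 1 over 0.
    """
    top = "".join(max(a, b) for a, b in zip(p, q))
    bottom = "".join(min(a, b) for a, b in zip(p, q))
    return top, bottom


def all_genotypes(n):
    """Enumerate the distinct genotypes of an n-locus daisy chain."""
    haplotypes = ["".join(bits) for bits in product("10", repeat=n)]
    return sorted({canonical(p, q) for p in haplotypes for q in haplotypes})


# The four equivalent ways of writing 1WD2WD collapse to a single pair.
print({canonical(p, q) for p, q in [("11", "00"), ("00", "11"),
                                    ("10", "01"), ("01", "10")]})
print(len(all_genotypes(2)), "genotypes for n = 2")
```

Each locus is independently WW, WD, or DD, so the enumeration contains $3^n$ genotypes; counting each once is what avoids double-counting in sums over genotypes.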
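We denote by $x_{\substack{p\\q}}$ the frequency of individuals with genotype $\substack{p\\q}$. We denote by $g_q$ the rate at which gametes with haplotype $q$ are produced. For an $n$-element daisy drive, $g_q$ is given by
\[
g_q = \sum_{\alpha,\beta} x_{\substack{\alpha\\\beta}} F^{1-\alpha_n\beta_n} \prod_{i=1}^{n} \left\{ \delta_{\alpha_i q_i}\delta_{\beta_i q_i}\left[\delta_{0,q_i} + \alpha_{i-1}\beta_{i-1}\delta_{1,q_i}\right] + (1-\delta_{\alpha_i\beta_i})\left[\frac{\alpha_{i-1}\beta_{i-1}}{2} + (1-\alpha_{i-1}\beta_{i-1})f_{q_i}\right] \right\} \tag{1}
\]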
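Here, we have defined $\alpha_0 = \beta_0 = 1$, and the factor $f_{q_i}$ equals $f_W$ when $q_i = 1$ and $f_D$ when $q_i = 0$, matching the two-element notation. In the sum over $\alpha, \beta$ when enumerating genotypes, heterozygous loci ($\alpha_i \neq \beta_i$) are each counted once, so there is no double-counting. $g_q$ is linear in each $x_{\substack{\alpha\\\beta}}$, where all genotypes $\substack{\alpha\\\beta}$ are summed over.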
We understand the terms in the factors in brackets as follows. Consider just a single factor in brackets for a particular value of $i$.
• If $\alpha_i = \beta_i = q_i = 0$, then individuals of genotype $\substack{\alpha\\\beta}$ have two identical copies of allele 0 at the $i$th locus, and those individuals create only gametes with allele 0 at position $i$.
• If $\alpha_i = \beta_i = q_i = 1$ and $\alpha_{i-1}\beta_{i-1} = 1$, then individuals of genotype $\substack{\alpha\\\beta}$ have two identical copies of allele 1 at the $i$th locus and no copy of allele 0 at the $(i-1)$th locus, and those individuals create only gametes with allele 1 at position $i$.
• If $\alpha_i = \beta_i = q_i = 1$ and $\alpha_{i-1}\beta_{i-1} = 0$, then individuals of genotype $\substack{\alpha\\\beta}$ have two identical copies of allele 1 at the $i$th locus and at least one copy of allele 0 at the $(i-1)$th locus. Since the daisy drive allele at the $(i-1)$th locus destroys both copies of the wild-type allele at the $i$th locus, these individuals do not produce viable gametes. Hence, there is no corresponding term in Equation (1).
• If $\alpha_i \neq \beta_i$ and $\alpha_{i-1}\beta_{i-1} = 1$, then individuals of genotype $\substack{\alpha\\\beta}$ have a single copy of allele $q_i$ at the $i$th locus, and without any action from the daisy drive, those individuals create gametes with allele $q_i$ and allele $(1 + (-1)^{q_i})/2$ at position $i$ in equal proportion.
• If $\alpha_i \neq \beta_i$ and $\alpha_{i-1}\beta_{i-1} = 0$, then individuals of genotype $\substack{\alpha\\\beta}$ have a single copy of allele $q_i$ at the $i$th locus, and the daisy drive allele at the $(i-1)$th locus biases the inheritance of allele $q_i$ at position $i$.
The prefactor $F^{1-\alpha_n\beta_n}$ is the fitness cost associated with the payload. It appears if there is at least one copy of the daisy drive allele at the last position, $n$, in the daisy chain.
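The case analysis above translates fairly directly into code. The following Python sketch evaluates Equation (1) for a single haplotype $q$. The function name `gamete_rate`, the tuple-based data layout, and the parameter values are assumptions of this example, and the mapping $f_1 = f_W$, $f_0 = f_D$ is the reading under which the formula reproduces Table 1 for $n = 2$.

```python
def gamete_rate(q, x, F, f):
    """Evaluate Equation (1) for haplotype q.

    q : tuple of bits, one per locus (1 = wild-type, 0 = daisy allele)
    x : dict mapping genotype (alpha, beta) -> frequency, where alpha and
        beta are bit tuples and each unphased genotype is listed once
    F : payload fitness factor
    f : dict {1: fW, 0: fD} of transmission factors under drive
    """
    n = len(q)
    total = 0.0
    for (alpha, beta), freq in x.items():
        # Payload prefactor: applies unless both alleles at locus n are wild-type.
        term = freq * F ** (1 - alpha[-1] * beta[-1])
        prev = 1  # alpha_0 * beta_0 = 1 by definition
        for i in range(n):
            a, b, qi = alpha[i], beta[i], q[i]
            if a == b:
                # Homozygous locus: the allele must match q_i. A wild-type
                # pair yields no gametes if a daisy allele sits upstream.
                factor = (1 if qi == 0 else prev) if a == qi else 0
            else:
                # Heterozygous locus: Mendelian 1/2 without an upstream
                # daisy allele, biased factor f[q_i] with one.
                factor = 0.5 if prev == 1 else f[qi]
            term *= factor
            prev = a * b
        total += term
    return total


# Hypothetical parameters; population consisting only of 1WD2WD individuals.
F, f = 0.9, {1: 0.05, 0: 0.95}
x = {((1, 1), (0, 0)): 1.0}
for q in [(1, 1), (1, 0), (0, 1), (0, 0)]:
    print(q, gamete_rate(q, x, F, f))
```

For $n = 2$ the four printed values should agree with the $1_{WD}2_{WD}$ row of Table 1, which is a convenient regression check before moving to longer chains.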
The selection dynamics for an n-element daisy drive are modeled by the following equations:
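\[
\dot{x}_{\substack{p\\q}} = \sum_{\alpha,\beta} g_\alpha g_\beta \prod_{i=1}^{n} \left[\delta_{p_i q_i}\delta_{\alpha_i p_i}\delta_{\beta_i q_i} + (1-\delta_{p_i q_i})(1-\delta_{\alpha_i\beta_i})\right] - \psi^2 x_{\substack{p\\q}} \tag{2}
\]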
There is one such equation for each possible genotype $\substack{p\\q}$.
We make sense of Equations (2) as follows. Each pair of gametes $g_\alpha$ and $g_\beta$ makes a new individual.
• If $p_i = q_i = \alpha_i = \beta_i$, then gametes of haplotypes $\alpha$ and $\beta$ pair to make only individuals with genotype $\substack{p_i\\q_i}$ at locus $i$.
• If $p_i \neq q_i$ and $\alpha_i \neq \beta_i$, then gametes of haplotypes $\alpha$ and $\beta$ pair to make only individuals with genotype $\substack{p_i\\q_i}$ at locus $i$.
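We impose the density constraint
\[
\sum_{p,q} x_{\substack{p\\q}} = 1
\]
As already noted, in the sum over $p, q$ when enumerating genotypes, heterozygous loci ($p_i \neq q_i$) are each counted once, so there is no double-counting. We use the following identity
\[
\sum_{p,q} \prod_{i=k}^{n} \left[\delta_{p_i q_i}\delta_{\alpha_i p_i}\delta_{\beta_i q_i} + (1-\delta_{p_i q_i})(1-\delta_{\alpha_i\beta_i})\right] = 1 \quad \forall \; 1 \leq k \leq n
\]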
The form of ψ that enforces the density constraint is
\[
\psi = \sum_\alpha g_\alpha
\]
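The role of the identity and of $\psi$ can be checked numerically. The Python sketch below evaluates the right-hand side of Equation (2) by assigning each ordered pair of gametes to the genotype it forms, which is what the product of Kronecker deltas selects. The function names, the convention of storing the larger bit in $p$, and the gamete rates used in the example are illustrative assumptions; in a full simulation the rates $g_\alpha$ would come from Equation (1).

```python
from itertools import product


def canonical(alpha, beta):
    """Unphased genotype formed by two gametes (larger bit stored in p)."""
    p = tuple(max(a, b) for a, b in zip(alpha, beta))
    q = tuple(min(a, b) for a, b in zip(alpha, beta))
    return p, q


def genotype_derivatives(g, x):
    """Right-hand side of Equation (2) for every genotype.

    g : dict haplotype (tuple of bits) -> gamete production rate
    x : dict canonical genotype (p, q) -> frequency
    """
    psi = sum(g.values())
    births = {}
    for alpha, beta in product(g, repeat=2):  # ordered gamete pairs
        key = canonical(alpha, beta)
        births[key] = births.get(key, 0.0) + g[alpha] * g[beta]
    genotypes = set(births) | set(x)
    return {z: births.get(z, 0.0) - psi ** 2 * x.get(z, 0.0) for z in genotypes}


# Arbitrary illustrative gamete rates for n = 2 and an all-wild-type population.
g = {(1, 1): 0.6, (1, 0): 0.2, (0, 1): 0.1, (0, 0): 0.05}
x = {((1, 1), (1, 1)): 1.0}
dx = genotype_derivatives(g, x)
print("sum of derivatives:", sum(dx.values()))
```

Every ordered gamete pair lands in exactly one genotype, so the birth terms total $\psi^2$ and the derivatives sum to zero whenever the frequencies sum to one.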
Supplementary Figure 1 | Family tree analysis of (A) B->A split drive and (B) C->B->A daisy drive demonstrates the power of including an additional daisy drive element in spreading the payload to more offspring in the F4 generation. (C) Graphical depiction of total alleles per generation for B->A through D->C->B->A daisy drives.
Supplementary Figure 2 | Mechanistic model for fitness parameters assumed in the daisy drive system. a, We assume that fitness costs primarily arise from cutting and misrepair events that result in disrupted haploinsufficient target genes. Such events occur only when there is a drive element at some locus ($i$) and a susceptible target at the next locus ($i+1$). If the next locus has two wild-type alleles (that is, no drive elements), then cutting and misrepair is lethal: the fitness cost associated with this pair of loci is thus $c_i = 1$. If the next locus has one wild-type allele, then misrepair events occur precisely when homing does not succeed, and this happens with probability $1 - H$; thus the associated cost is $c_i = 1 - H$. If there is no drive element at the first position and/or no susceptible targets at the second position, then no cutting can occur and thus no cost is incurred. b, We assume that the cost contributed by each link in the chain is independent. We calculate the costs $c_i$ for each pair of adjacent links, and the total fitness of the organism is then the product $(1 - c_1)(1 - c_2)\cdots(1 - c_n)$, where $c_n$ is the cost associated with the payload gene.
Supplementary Figure 3 | A potential means of eliminating the fitness cost resulting from incorrect repair might involve targeting a gene whose loss impairs gametogenesis, such as a ribosomal gene. Increased replication of correctly repaired cells carrying the drive system would theoretically result in a wild-type level of gametes, all of which carry the drive system.
Supplementary Figure 4 | The comparative efficacy of daisy drive systems can be assessed by comparing the payload frequency 20 generations after releasing one daisy drive organism, for different daisy-chain lengths, relative to releasing organisms with only the payload.
Supplementary Figure 5 | Repeated seeding of engineered organisms improves daisy drive spread for low release frequencies. a, Example simulations assuming a 1% release of daisy drive organisms having a 10% payload fitness cost, and 95% (left), 98% (middle), or 100% (right) homing efficiencies. Darker shades indicate longer daisy chains (from 2 to 5 elements). b, Generations required for the payload element to attain 99% frequency. All simulations are identical to those in Fig. 3 of the main text, except here we assume that the initial release is repeated each generation.
Supplementary Figure 6 | Daisy drive systems can be constructed using orthogonal Cas9 elements. Such a drive system is resistant to conversion into a daisy necklace, which would require a recombination event that moved the entire Cas9 gene and associated guide RNAs into a subsequent locus in the daisy-chain. Ensuring that all the Cas9 proteins are expressed appropriately without re-using promoters and thereby creating homology between elements could be challenging.
Supplementary Figure 7 | Complete list of sequence-divergent guide RNAs generated and assayed using the transcriptional activation reporter.
Supplementary Figure 8 | Results of the pilot screen of the first set of designed sgRNA sequences. #3-6, #10-13, and #17-20 all carried the extra insert; the latter 8 displayed markedly lower activity and were not further considered. The cause of the difference is unclear, although it is worth noting that these all had longer stem-loops than did #3-6, all of which were closer to the activity of the standard or 'wild-type' sgRNA.
Supplementary Figure 9 | Potential family tree of a C->B->A genetic load daisy drive for which the payload in the A element disrupts a female fertility gene. The C element is male-linked, ensuring that it does not suffer a fitness cost from the loss of female fertility. Mating events between two parents carrying the A element (boxed) can produce sterile female offspring that will suppress the population. Males do not suffer a fitness cost due to disruption of female-specific fertility genes.
Supplementary Figure 10 | A daisy-like system can create males with a male-linked sterile-daughter trait. Element (A) disrupts one or more recessive genes required for female fertility. The presence of a single copy of the B element causes the A element to drive in the zygote or early embryo, ensuring that females are sterile but leaving males unaffected. The C element is male-linked and causes the B element to drive, which can occur either in the germline as depicted or in the zygote or early embryo. The result is a male that produces sons like himself, but daughters that are sterile. These males should have fitness only slightly lower than that of wild-type males and consequently should remain in the population for an extended period once released. Since population reproductive capacity depends directly on the number of sterile females, which in turn depends on the fraction of males with the sterile-daughter trait, this architecture could permit reversible titration of the local population by releasing sterile-daughter or wild-type organisms.