
30 September 2016

Quality control analysis of the 1000 Genomes Project Omni2.5 genotypes

Nicole M. Roslin (1), Weili Li (1), Andrew D. Paterson (1,2,3)*, Lisa J. Strug (1,2,4)

1) The Centre for Applied Genomics, The Hospital for Sick Children, Toronto, ON, Canada
2) Program in Genetics and Genome Biology, The Hospital for Sick Children, Toronto, ON, Canada
3) Epidemiology and Biostatistics, Dalla Lana School of Public Health, University of Toronto, Toronto, ON, Canada
4) Biostatistics and Statistics, University of Toronto, Toronto, ON, Canada
*) Corresponding author (andrew.paterson@sickkids.ca)

Citation

For any use of the 1000 Genomes Project data, please use the citation as noted here: http://www.1000genomes.org/faq/how-do-i-cite-1000-genomes-project. To cite this report or the lists described here, please use the following:

Roslin NM, Li W, Paterson AD, Strug LJ. Quality control analysis of the 1000 Genomes Project Omni2.5 genotypes (Abstract/Program #576/F). Presented at the 66th Annual Meeting of The American Society of Human Genetics, October 18-22, 2016, Vancouver, Canada.

Data Summary

Chips: Illumina HumanOmni2.5-4v1_B and Illumina HumanOmni25M-8v1-1_B
Initial number of SNPs: 2458861
Initial number of samples: 2318
Number of SNPs passing QC: 1989184 (80.9%)
Number of samples passing QC: 2318 (100%)
Number of quasi-unrelated samples with consistent ethnicity and well inferred sex: 1736

Abstract

The 1000 Genomes Project genotyped 2318 individuals (48.1% male) from 19 populations in 5 continental groups on the Illumina Omni2.5 platform. The data are publicly available, and will prove a valuable resource for obtaining ethnic-specific allele frequencies, as well as for exploring population histories through principal components analysis (PCA), estimation of inbreeding coefficients, and admixture analysis. As in any study, the data should be cleaned prior to analysis, to remove individuals or markers of questionable quality. Furthermore, a thorough understanding of the relationships between individuals must be established. Here we report our findings after comprehensive examination of the data for quality control.

The basic quality of the genotypes was assessed using standard procedures. KING version 1.4 was used to confirm the relationships in the provided pedigrees, and also to detect undeclared relationships. PCA was used to examine the similarities and differences between individuals within and between population groups.


In general, the data were found to be of high quality. No samples were removed due to low call rate (<97%) or excess heterozygosity. Sex chromosome genotypes showed two individuals with discrepancies between reported and inferred sex, and we were unable to determine sex in an additional 20 individuals; the sex for these was changed to unknown. Relationship checking found discrepancies between first-degree relationships in the provided pedigrees and the genotypes in 9 families, including one instance where a reported parent/child pair was unrelated, two instances where full sibs were unrelated, and one set of three individuals who formed a newly defined trio. A set of 1756 individuals who were inferred to be more distant than 3rd degree relatives was extracted and used in PCA. These individuals clustered in a pattern that is consistent with other published reports of global populations. We identified 4 individuals whose genotypes clustered more closely with a different geographic region than the one in the provided data.

Although the genotype data are of high quality, errors exist in the publicly available dataset that require attention prior to using the genotypes. PLINK-format files including SNPs with good quality metrics and revised pedigree structures are available at http://tcag.ca. Files with distantly related or unrelated individuals, with sex inference consistent with provided gender, and with PCA consistent with continental group are also available.

1 Background

Genotypes generated on the Illumina Omni2.5 platform are available for download from the 1000 Genomes Project website (http://www.1000genomes.org/). These data are a valuable resource of genotypes from individuals collected from multiple ethnic groups around the world without regard to disease. The data can be used either as representative genotypes from individuals of known ethnicity for population structure analysis, or as a source of ethnic-specific allele frequencies which can be used in analyses such as linkage analysis, relationship estimation and estimation of inbreeding coefficients. As in any study, the data should be cleaned prior to analysis, to remove individuals or markers of questionable quality. Furthermore, a thorough understanding of the relationships between individuals must be established. This report describes the quality analyses that we performed on the chip data. Lists of SNPs and samples with good quality metrics are available at http://tcag.ca/. This analysis was approved by the Research Ethics Board at The Hospital for Sick Children (REB number 1000054008).

2 Data acquisition

Genotype files, in vcf format, were downloaded on 16 May 2016 from the following ftp site at the 1000 Genomes Project: ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/supporting/hd_genotype_chip/ (1000 Genomes Project Consortium et al., 2015). The link to this directory came from the Frequently Asked Questions section of the website (http://www.1000genomes.org/faq/what-omni-genotype-data-do-you-have/). Genotypes were generated at the Broad and Sanger Institutes, using two different versions of the Omni2.5 platform. The majority of the samples were genotyped at the Broad, and no individuals were genotyped in both locations. The analysis presented in this report started with the downloaded file ALL.chip.omni_broad_sanger_combined.20140818.snps.genotypes.vcf.gz, which is the combined set of Broad and Sanger genotypes. The data had 2318 individuals and 2458861 markers. See Appendix 2 for the specific details regarding the downloaded files.

3 Prior processing by 1000 Genomes

Before the data were made available for download, the data providers performed a small amount of processing on the data, which is summarized here.

For the data generated at the Broad, genotypes were called using GenomeStudio v2010.3 with the calling algorithm/genotyping module version 1.8.4 using the default cluster file. Physical positions were taken from build 36, and were subsequently converted to hg19 using the human_g1k_v37 dataset. SNPs were removed according to the following filters: the metadata on the chip for the SNP was inconsistent, the SNP had a duplicate with a higher GenTrain score, the assay was designed for multiple alternate alleles, the SNP was not polymorphic in the 1000 Genomes dataset, or the SNP was within 50 bp of a known indel. These filters resulted in 2441898 SNPs remaining (out of 2450000, 99.7%). The specific chip used was Illumina HumanOmni2.5-4v1_B.

For the data generated at the Sanger Institute, detailed information on prior processing was not provided. However, of the 2391739 SNPs on the chip, genotypes were provided for 2176330 (91.0%). Note that rs numbers were not provided for the variants in this dataset. It was assumed that the physical positions were based on hg19. The genotypes came from the Illumina HumanOmni25M-8v1-1_B chip.

The two datasets were merged using GATK v3.1.1 CombineVariants. The final dataset had 2318 individuals and 2458861 markers. No individuals were present in both sets of data. The majority of SNPs (2150028, 87.4%) were present in both the Broad and Sanger datasets; 282531 (11.5%) SNPs were present just in the data generated at the Broad, while 26302 (1.1%) SNPs were only in the data from the Sanger. In the final dataset, there were no markers with identical chromosome and physical positions. The merged data were saved to file Omni25_sanger_broad_combined.vcf.gz and made available on the 1000 Genomes ftp site.

4 Family and ethnic information

Gender and basic information on relationships between certain individuals were provided by 1000 Genomes in a pedigree file (Appendix 2, file 2). Additionally, reported ethnic background was provided as a file from each genotyping centre (Appendix 2, files 3 and 4). The pedigree file included only individuals genotyped or sequenced in the 1000 Genomes Project, and so additional "dummy" individuals needed to link up related individuals were not present. Because of this, several known relationships were not made explicit by the pedigree. All samples from Sanger and 2098 individuals from the Broad had available ethnic information, while 43 individuals genotyped at the Broad had missing ethnicity and gender in the provided files, for a total of 2318 individuals. Missing genders and ethnicities were filled in using file 5 in Appendix 2. Counts of individuals from each of the sampled populations are shown in Appendix 1.

5 Quality control (QC) analysis

The provided vcf files were converted to PLINK binary format using PLINK 1.90 (https://www.cog-genomics.org/plink2/). One marker had a non-valid chromosome code (chromosome GL000202.1 for SNP 11-69436716); the chromosome for this marker was set to missing. The provided sex and relationship information were incorporated into the pedigree files. These files were used as the basis for the QC steps that follow. Analyses were performed using in-house scripts, unless otherwise stated.

5.1 Basic sample quality

Analyses were performed to determine the general quality of the samples. Since the two different genotyping centres used slightly different sets of markers, only the 2150028 SNPs present in both the Broad and Sanger datasets were used in these tests.

5.1.1 Sex chromosome analysis

Sex chromosome composition was inferred for each of the samples based on the proportion of homozygous genotypes on chromosome X (the complement of the heterozygosity rate) and the call rate on chromosome Y. The majority of samples could be clearly identified as XY or XX, according to the following criteria:

1. XY (male) = proportion of homozygous genotypes on the X chromosome > 0.9, and call rate on the Y chromosome > 0.9.

2. XX (female) = proportion of homozygous genotypes on the X chromosome < 0.9, and call rate on the Y chromosome < 0.4.

All other individuals were declared to have ambiguous (or unknown) sex. Based on these thresholds, two individuals, NA21310 and HG02300, were listed as males, but had genotypes consistent with females (Figure 1). No other sex discrepancy was found; however, for 20 individuals, the sex could not be inferred using the above rules. The sex for these individuals was set to missing. For 8 of these individuals (reported to be male), the heterozygosity rate on the X chromosome was consistent with males, but the call rate on the Y chromosome was reduced. Such a loss of Y could be the result of cell line artifacts or environmental factors (Dumanski et al., 2015), rather than genotyping or sample error, and so these individuals were not flagged as having poor quality. Similarly, 11 reported females showed excessive homozygosity on chromosome X, also consistent with cell line artifacts. One individual (HG01683: male, IBS) had a slightly reduced call rate on Y for a male, but X chromosome heterozygosity similar to females, suggesting the possibility of XXY.
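The classification rule can be sketched in a few lines of Python. This is only an illustration, not the in-house script used for the report: it assumes the X-chromosome statistic is the proportion of homozygous genotypes (as plotted in Figure 1), and the function name and the example values are hypothetical.

```python
def infer_sex(x_hom, y_call, x_cut=0.9, y_male=0.9, y_female=0.4):
    """Classify a sample from X homozygosity and Y call rate.

    x_hom  : proportion of homozygous genotypes on chromosome X
    y_call : proportion of non-missing genotypes on chromosome Y
    Thresholds follow the criteria in the text; anything that meets
    neither rule is returned as 'unknown'.
    """
    if x_hom > x_cut and y_call > y_male:
        return "male"
    if x_hom < x_cut and y_call < y_female:
        return "female"
    return "unknown"


# Hypothetical samples (values invented for illustration only).
examples = {
    "typical_male": (0.99, 0.98),
    "typical_female": (0.80, 0.02),
    "male_with_reduced_Y": (0.99, 0.60),   # X looks male, Y call rate too low
    "female_high_X_hom": (0.95, 0.02),     # excess X homozygosity
}

for name, (x_hom, y_call) in examples.items():
    print(name, infer_sex(x_hom, y_call))
```

The last two hypothetical samples fall outside both rules, which mirrors the two kinds of ambiguous samples described above (reduced Y call rate in reported males, excess X homozygosity in reported females).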


Figure 1. Proportion of homozygous genotypes on the X chromosome vs. proportion of non-missing genotypes on the Y chromosome. Symbols are coded based on the sex in the provided pedigree file: male = red triangles, female = green plus symbols. Dashed lines delimit the thresholds applied above.

5.1.2 Call rate

The proportion of non-missing genotypes (call rate) was calculated per individual. In general, the call rate was very high, with an average call rate of 99.6% in the data. The lowest call rate per sample was 97.0%, which is at the commonly accepted threshold of 97%. Call rates were calculated using PLINK 1.90.

[Figure 2 plot: sample call rate (y-axis, 0.80 to 1.00) against quantile (x-axis, 0.0 to 1.0) for the 2318 samples.]

Figure 2. Proportion of non-missing genotypes per individual. The dashed line shows the 97% threshold.
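As a rough illustration of the per-sample call rate and the 97% threshold, the sketch below computes the statistic on toy data. The report itself used PLINK 1.90 for this step; the data layout (a list of genotype calls per sample, with `None` for a missing call), the function names, and the sample names here are assumptions made for the example.

```python
def call_rate(genotypes):
    """Proportion of non-missing genotypes (None marks a missing call)."""
    if not genotypes:
        raise ValueError("no genotypes supplied")
    called = sum(g is not None for g in genotypes)
    return called / len(genotypes)


def low_call_rate_samples(data, threshold=0.97):
    """Names of samples whose call rate is strictly below the threshold."""
    return sorted(name for name, g in data.items() if call_rate(g) < threshold)


# Toy data: genotypes coded as 0/1/2 copies of an allele, None = no call.
toy = {
    "sampleA": [0, 1, 2, 1] * 25,         # 100 of 100 called
    "sampleB": [0, 1, 2, None] * 25,      # 75 of 100 called
    "sampleC": [0] * 97 + [None] * 3,     # 97 of 100 called, at the threshold
}

for name, genos in toy.items():
    print(name, call_rate(genos))
print("below threshold:", low_call_rate_samples(toy))
```

A sample sitting exactly at the threshold, like the hypothetical `sampleC`, is retained, in line with the statement that the lowest observed call rate of 97.0% did not lead to any removal.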

5.1.3 Multilocus heterozygosity

In order to detect possible sample contamination, the average autosomal heterozygosity was calculated for each individual. The average heterozygosity was 0.20, and no sample was unusually extreme compared to the other samples (Figure 3). The wide distribution with multiple peaks is likely a reflection of the multi-ethnic nature of the genotyped individuals.

[Figure 3 plot: histogram of count (y-axis) against proportion of heterozygous loci (x-axis, 0.16 to 0.24); mean = 0.204, range = 0.158 to 0.248.]

Figure 3. Histogram of the proportion of heterozygous autosomal loci per sample.

5.1.4 Heterozygous haploid genotypes

Haploid regions of the genome are typically represented as homozygous diploid genotypes. These regions include the X chromosome for males, the Y chromosome, and the mitochondrion. Individuals with an unusually high number of heterozygous genotypes at these loci could provide additional evidence of quality issues. Since the regions to examine depend on sex, the sex inferred above was used, rather than the provided sex. In the current dataset, no individual had an unusually high number of heterozygous haploid genotypes, relative to the other samples (results not shown). These genotypes were detected using PLINK.

5.1.5 Summary of basic sample quality

In general, the quality of the samples was excellent. There was no evidence of contamination, and no evidence of sample failure. Inferred sex was consistent with provided sex for 2296 of 2318 (99%) of the individuals. The sex for two individuals (NA21310 and HG02300) was changed from male to female, while the sex for 20 individuals was set to missing.

5.2 Basic quality per SNP

Standard tests were performed to determine SNP quality. Only markers that were present on both versions of the chip were included; the other markers have non-random sets of missing values, which would cause bias in many of the tests. The analyses were performed after the sex changes described above.

5.2.1 Call rate

The proportion of non-missing genotypes was calculated per SNP. The call rate was excellent: 2114511 of 2150028 (98.3%) of the SNPs had call rate > 97% (Figure 4).


Figure 4. Call rate per SNP, for the markers present in both the Broad and Sanger datasets. The dashed line shows the 97% threshold.

5.2.2 Heterozygous haploid (HH) genotypes

The distribution of heterozygous haploid genotypes, defined in section 5.1.4 above, was examined. Non-missing genotypes for females on the Y chromosome were also identified. A total of 11888 SNPs had at least one HH genotype. No SNP appeared to have an excessive number of these problematic genotypes, compared to the rest of the SNPs (results not shown).

5.2.3 Hardy-Weinberg equilibrium

SNPs were tested for deviation from the proportions expected under Hardy-Weinberg equilibrium (HWE), using an exact test (Wigginton, Cutler, & Abecasis, 2005) implemented in PLINK. For SNPs on the X chromosome, only inferred females were used in the test. SNPs on the Y chromosome and the mitochondrion were not tested. A total of 355357 SNPs (17%) had p-values < 10^-6. Although this could indicate substantial problems in the allele calling algorithm, it is more likely due to the fact that the samples came from a wide variety of ethnic backgrounds, and the assumptions of Hardy-Weinberg (most importantly the assumption of random mating) do not hold in the data. Therefore, SNPs were not excluded due to deviation from HWE.
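The reasoning that pooling populations breaks HWE can be illustrated with a small calculation (the Wahlund effect). The sketch below is not the exact test used in the report; it simply assumes two hypothetical subpopulations that are each in HWE, with invented allele frequencies, and compares the heterozygosity observed in the pooled sample with the value HWE would predict from the pooled allele frequency.

```python
def hwe_het(p):
    """Expected heterozygote frequency under HWE for allele frequency p."""
    return 2 * p * (1 - p)


def pooled_heterozygosity(freqs, weights):
    """Observed vs. HWE-expected heterozygosity in a pooled sample.

    freqs   : allele frequency in each subpopulation (each assumed in HWE)
    weights : relative sample size of each subpopulation
    Returns (observed, expected), where 'expected' uses the pooled frequency.
    """
    total = sum(weights)
    observed = sum(w * hwe_het(p) for p, w in zip(freqs, weights)) / total
    p_pooled = sum(w * p for p, w in zip(freqs, weights)) / total
    return observed, hwe_het(p_pooled)


# Hypothetical SNP with very different frequencies in two equal-sized groups.
obs, exp = pooled_heterozygosity([0.1, 0.9], [1, 1])
print(f"observed het = {obs:.3f}, HWE-expected het = {exp:.3f}")
```

With these invented frequencies the pooled sample shows far fewer heterozygotes than HWE predicts, even though each group is in equilibrium on its own. When the subpopulation frequencies are equal, the two values coincide.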

5.2.4 Summary of basic SNP quality

The SNPs had very high quality. SNPs were removed if the call rate was < 97% or if an HH genotype was detected. Furthermore, SNPs that were monomorphic in all the samples were discarded, since they do not contain any information. Out of the original 2150028 markers that were present in both the Broad and Sanger datasets, 2105791 (97.9%) passed these quality filters, and were used in the more sophisticated analyses below.


5.3 Relationship testing

The software KING 1.4 (Manichaikul et al., 2010) was used to infer close relationships between pairs of samples. This program estimates kinship coefficients for all pairs of individuals based on their heterozygosity, and does not require allele frequency estimates. The high quality set of 2105791 SNPs described above was used. KING was also used to select a set of individuals who were inferred to be unrelated or more distantly related than 3rd degree (first cousin or equivalent).

Prior to testing, the relationship information provided by 1000 Genomes was incorporated. There were 28 pedigrees that were larger than trios, shown in Appendix 3, plus an additional 373 trios (based on the number of individuals per family ID). In many cases, larger families were composed of multiple smaller families with different pedigree IDs. We assigned these a new family ID based on the population from which the families came. For example, the three individuals HG00501 (a mother in a trio, family SH028), HG00512 (a father in a trio, family SH032) and HG00524 (a father in a trio, family SH036) were all listed as siblings, even though they had different family IDs. We combined all three trios into a single family, adding dummy parents, and named the extended family CHS3.

Within the 28 pedigrees, the pairwise kinship coefficients were consistent with the provided structure for 23 of them. Of the remaining 5 families, ASW3 was split into two separate families (there may have been a typo in the provided pedigree information, mixing up IDs NA20334 and NA20344). Families ASW4 and ASW5 were combined and one half sibling changed fathers. Family LWK003 originally consisted of two individuals with a common mother, with no information on the father(s). The genotypes were consistent with a full sibling relationship, and so a single dummy father was created. Finally, family YRI4 was split into two unrelated families.

In the trio CLM23, the kinship coefficient between the parents was estimated to be 0.033, indicating that they may be distantly related (consistent with first cousin once removed, or equivalent relationship). Since there are no other family members with genotypes available, it is not possible to confirm this relationship, and so the pedigree was not changed. A previous paper estimated inbreeding coefficients in 1000 Genomes samples based on whole genome sequence data (Gazal, Sahbatou, Babron, Genin, & Leutenegger, 2015), including the parents of trio CLM23. The father of the trio (HG01277) was found to have an inbreeding coefficient estimate consistent with offspring of second cousins. However, the child in the trio (HG01279) was not analyzed, and so this analysis cannot confirm a relationship between the parents.

Other than the changes made to the extended families, 5 additional changes were made, based on inferred first-degree relationships between pairs of individuals with different family IDs. There were 3 instances of unreported full sibling relationships, one instance where three supposedly unrelated individuals formed a trio, and one instance where a reported parent/offspring relationship was inferred to be unrelated. A substantial number of 2nd degree relationships were also observed, but pedigrees were not modified based on these more distant kinship coefficients. All pedigree changes are shown in Appendix 4.
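To make the link between a kinship coefficient and a degree of relationship concrete, the sketch below uses two standard facts that are not spelled out in this report: the expected kinship for non-inbred relatives of degree d is (1/2)^(d+1), and the KING documentation describes inference ranges of > 0.354, 0.177 to 0.354, 0.0884 to 0.177 and 0.0442 to 0.0884 for duplicates/MZ twins and 1st, 2nd and 3rd degree relatives. The function names are hypothetical, and this is not a description of how KING itself is implemented.

```python
import math


def expected_kinship(degree):
    """Expected kinship coefficient for non-inbred relatives of a given degree."""
    return 0.5 ** (degree + 1)


def nearest_degree(kinship):
    """Degree whose expected kinship is closest on the log2 scale.

    Returns None for non-positive estimates, which carry no degree information.
    """
    if kinship <= 0:
        return None
    return round(-math.log2(kinship) - 1)


def king_category(kinship):
    """Relationship category using the ranges quoted from the KING documentation."""
    if kinship > 0.354:
        return "duplicate/MZ twin"
    if kinship > 0.177:
        return "1st degree"
    if kinship > 0.0884:
        return "2nd degree"
    if kinship > 0.0442:
        return "3rd degree"
    return "more distant or unrelated"


k = 0.033  # estimate reported for the parents of trio CLM23
print(nearest_degree(k), expected_kinship(4), king_category(k))
```

For the CLM23 parents, the estimate of 0.033 lies closest to the 4th degree expectation of 1/32 (about 0.031), which is the value expected for first cousins once removed, and it falls below the 3rd degree range, so such a pair would still count as quasi-unrelated.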


KING was used to select quasi-unrelated individuals. It identified 1756 individuals who were no closer than 3rd degree relatives (first cousin or equivalent).

5.3.1 Mendelian errors

The program PEDSTATS 0.6.10 (Wigginton & Abecasis, 2005) was used to look for errors in Mendelian transmission in the revised pedigrees identified above. Only the autosomes and chromosome X were examined. A total of 228542 errors were found. Out of the pedigrees having at least one parent/child pair, family 1349 had the most errors (5743). Since over 2 million SNPs were tested, this represents errors in approximately 0.2% of the markers, and so does not present a quality concern. The distribution of errors by SNP was also not remarkable (results not shown). All 116607 SNPs with ≥ 1 error were removed, resulting in 1989184 SNPs.
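What counts as a Mendelian error can be shown with a minimal check for a single trio at biallelic autosomal SNPs. This sketch is not the PEDSTATS algorithm, which handles general pedigrees and chromosome X; the genotype coding (0, 1 or 2 copies of one allele, `None` for missing) and the function names are assumptions made for the example.

```python
# Alleles (0 or 1) that a parent with a given genotype can transmit.
GAMETES = {0: {0}, 1: {0, 1}, 2: {1}}


def is_mendelian_consistent(father, mother, child):
    """True if the child's genotype can arise from the parents' genotypes.

    Genotypes are allele counts (0, 1, 2); a missing genotype (None) in any
    member means no error can be demonstrated, so the trio is treated as
    consistent at that SNP.
    """
    if None in (father, mother, child):
        return True
    return any(f + m == child
               for f in GAMETES[father]
               for m in GAMETES[mother])


def count_errors(trio_genotypes):
    """Number of SNPs at which a (father, mother, child) trio is inconsistent."""
    return sum(not is_mendelian_consistent(*t) for t in trio_genotypes)


# Toy trio genotypes at five SNPs (invented for illustration).
snps = [(0, 0, 1), (0, 2, 1), (1, 1, 2), (2, 2, 0), (None, 1, 0)]
print(count_errors(snps))
```

In the toy data, the first SNP (two homozygous reference parents with a heterozygous child) and the fourth (two homozygous alternate parents with a homozygous reference child) are the inconsistent ones.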

5.4 Principal components analysis

Principal components analysis (PCA) was used as a visual aid to show how similar the genotypes from the data were, when the provided ethnic information was included. In order to ease computational burden, and to avoid artifacts from unusual patterns of linkage disequilibrium (LD) that are unrelated to ethnicity, the set of 1989184 SNPs was further reduced. First, all markers on chromosome X were removed. Next, two regions with unusual patterns of LD were removed: the major histocompatibility region on chromosome 6, and an inversion polymorphism on chromosome 8. Positions on the genetic map were obtained from the Rutgers combined map (Matise et al., 2007), and only markers with unique cM positions (up to 3 decimal places) were retained. Arbitrarily, out of a group of SNPs at the same genetic position, the one with the lowest physical position was retained. Markers with minor allele frequency (MAF) > 0.3 in the complete set of data were extracted. This resulted in a set of 282273 markers. To reduce tight LD between markers, the set of SNPs was further reduced so that all markers had pairwise r^2 < 0.2 within each chromosome, to produce a final set of 57931 SNPs. Since close relationships between individuals can distort the PCA results, only the 1756 quasi-unrelated individuals identified above were used in the analysis. PCA was performed on this data using the SmartPCA package (version 10210) of Eigenstrat (Patterson, Price, & Reich, 2006; Price et al., 2006).
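The thinning step based on genetic map position can be sketched as follows. This is an illustration of the stated rule only (one SNP per cM position to 3 decimal places, keeping the lowest physical position), not the in-house script; the tuple layout, the function name, the use of rounding rather than truncation, and the marker names are all assumptions.

```python
def thin_by_genetic_position(markers, decimals=3):
    """Keep one marker per (chromosome, cM position rounded to `decimals`).

    markers : iterable of (name, chromosome, bp_position, cm_position)
    Within a group sharing the same rounded cM position, the marker with the
    lowest physical (bp) position is retained. The result is sorted by
    chromosome and physical position.
    """
    best = {}
    for name, chrom, bp, cm in markers:
        key = (chrom, round(cm, decimals))
        if key not in best or bp < best[key][2]:
            best[key] = (name, chrom, bp, cm)
    return sorted(best.values(), key=lambda m: (m[1], m[2]))


# Hypothetical markers: snpA and snpB share cM position 0.123 on chromosome 1.
toy_markers = [
    ("snpB", 1, 1500, 0.1231),
    ("snpA", 1, 1000, 0.1234),
    ("snpC", 1, 2000, 0.1250),
    ("snpD", 2, 900, 0.1234),
]

kept = thin_by_genetic_position(toy_markers)
print([m[0] for m in kept])
```

Here `snpB` is dropped because `snpA` has the same rounded genetic position and a lower physical position, while `snpD` is kept because it lies on a different chromosome.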


Although only a subset of individuals recruited for the 1000 Genomes Project have Omni2.5 genotypes, individuals from all continental groups are represented. The distribution of samples in the PCA is consistent with what has been shown elsewhere, with AFR, EAS and EUR forming major axes, and AMR and SAS lying within these axes (Figure 5). Four individuals did not cluster well with other individuals from their continental groups. Three individuals from AMR (HG01241, HG01242 and HG01108, all from PUR) clustered more closely with individuals from AFR, and one individual from AFR (NA20314, ASW) was more similar to AMR. These results are not surprising, given the population histories. The two individuals with sex discrepancy clustered well within their listed geographic region.

[Figure 5 plots: a three-dimensional view of PC1, PC2 and PC3, and pairwise scatter plots of PC1 vs. PC2, PC1 vs. PC3 and PC2 vs. PC3. Populations in the legend: ACB, ASW, LWK, MKK, YRI, CLM, MXL, PEL, PUR, GIH, CDX, CHB, CHD, CHS, JPT, KHV, CEU, IBS, FIN, GBR, TSI.]

−0.04 −0.02 0.00 0.02

−0.02

0.02

0.06

0.10

PC1

PC3

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

●●●●●●●●●●●●●●●●●●●●●

●●●● ●●●

●●

●●●●●● ●●● ●●●●

●●

●●

● ●●●● ● ●

● ●

●●●●● ●●● ●●●

●●

●●●●●●

●● ●●

●● ●●●●●● ●●●●● ● ●● ●●●●●● ●●

●●● ●●●● ●● ●●●

●●●● ●● ●●●

●●● ●●● ● ●●● ●●

●●●●● ●●●●●●● ●

●●●●●

●● ●●●●●●●●●●●

●●

●●

●●

●●

●●●

●●

● ●

●●

● ●

●●●●

●● ●

●●

●●

●●●

● ●

●●

●●

●● ●

●●

●●●● ●●●●

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

●●●●●●●●●●●●●

●●●● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

●●●●●●●●●●●●

●●●

●●

●●

●●●●●●●●

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

●●●●●●●●●●●●●●●●●●●●●●●●●●●

●● ●

●●

●●

●●●

●●

●●

●●

●●

●●

●●●

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ●●●●●●●●●●●●●●●●●●

●●●●●●●●

●●●●●●●●●●●●●●●●●

●●●●●●●●●●●●●●●●●●●●●●●●●

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

●●●● ●● ●●

●● ●●● ●●●●● ●

●●●●●● ●●● ●●●●●● ●●●● ●●●●●

●●●●●●●●●●●●

●●●●●●●●●●●●● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

●●●●●●●●●●●●●●●●●●●●●●●●●●

●●

●●

●●●

●●

●●

●●

●●●

●●●

●●

●●●

●●●

●●

●●●

●●

●● ●

●●●●●

●●●●

●●●

●●

●●●

●●

●● ●

●●●●

●●● ●●

●●

●●

●●●●

●● ●● ●●

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

●●●● ●●●●●●●●● ●●●●●

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

−0.03 −0.01 0.01 0.03

−0.02

0.02

0.06

0.10

PC2

PC3

Figure5.Threedimensionalplotofthefirstthreeprincipalcomponents(PC1,PC2,PC3)of1756quasi-unrelatedindividuals.Pointsareshadedbytheirpopulationcodes.Ingeneral,green=AFR,blue=AMR,yellow=EAS,red=EUR,andorange=SAS.

30September2016 11

The 4 individuals who did not cluster well with their geographic region were removed, and clustering was repeated within each geographic region. For AFR, LWK, MKK and YRI each formed tight clusters, while ACB and ASW were more spread out (Figure 6).
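The within-region clustering described here could be reproduced along the following lines. This is a minimal sketch, not the pipeline used for this report: it assumes a numeric genotype matrix of alternate-allele counts (0/1/2, with NaN for missing calls), and the function name `genotype_pca` and the choice of mean imputation for missing genotypes are illustrative assumptions.

```python
import numpy as np


def genotype_pca(genotypes, n_components=3):
    """Principal components of a (n_samples, n_snps) matrix of 0/1/2 allele counts.

    Missing genotypes are given as np.nan. Illustrative sketch only.
    """
    g = np.asarray(genotypes, dtype=float)

    # Alternate-allele frequency per SNP, from non-missing calls only.
    p = np.nanmean(g, axis=0) / 2.0

    # Monomorphic SNPs carry no information and would cause division by zero.
    keep = (p > 0) & (p < 1)
    g, p = g[:, keep], p[keep]

    # Centre each SNP on its mean and scale by the binomial standard deviation.
    z = (g - 2.0 * p) / np.sqrt(2.0 * p * (1.0 - p))

    # After centring, setting missing entries to 0 is mean imputation.
    z = np.nan_to_num(z, nan=0.0)

    # Sample coordinates on the leading components via SVD.
    u, s, _ = np.linalg.svd(z, full_matrices=False)
    return u[:, :n_components] * s[:n_components]
```

To repeat the analysis within one continental group, the rows of the matrix would first be restricted to the samples of that group, so that the leading components describe variation within the group rather than between groups.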

[Figure 6: panels ACB, ASW, LWK, MKK, YRI; axes PC1 and PC2]

Figure 6. Plots of the first two PCs from an analysis of AFR. Each plot highlights one population from this continental group.

In AMR, the 4 populations did not form tight clusters (Figure 7). Although they could be differentiated somewhat, they showed substantial overlap.

[Figure 7: panels CLM, MXL, PEL, PUR; axes PC1 and PC2]

Figure 7. Plots of the first two PCs from an analysis of AMR. Each sub-population is highlighted in a separate plot.

Among the 5 EAS populations, JPT was distinct from CDX, CHB, CHS and KHV (Figure 8). One JPT sample may be admixed with an ancestor from a Chinese population. CDX (Chinese Dai) and KHV (Kinh from Vietnam) were more similar to each other than to the other populations from China, with the exception of one CDX individual who clustered with either CHB or CHS.

[Figure 8: panels CDX, CHB, CHS, JPT, KHV, CHD; axes PC1 and PC2]

Figure 8. Plots of the first two PCs from an analysis of EAS. Each sub-population is highlighted in a separate plot.

The 5 populations from EUR formed three main clusters (Figure 9). CEU and GBR were indistinguishable from each other, while IBS and TSI showed similarities to each other. FIN was distinct from the other 4 populations.

[Figure 9: panels CEU, FIN, GBR, IBS, TSI, Unknown; axes PC1 and PC2]

Figure 9. Plots of the first two PCs from an analysis of EUR. Each sub-population is highlighted in a separate plot.

Since only one ethnic group from SAS was included in the Omni2.5 genotyping (GIH), PCA was not performed within this continental group.

In total, there were 1752 individuals who were unrelated (no relationships of 3rd degree or closer), and who clustered well within their assigned geographic region.
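The notion of "unrelated up to 3rd degree" can be made concrete with the kinship coefficient cut-offs published for KING (Manichaikul et al., 2010). The sketch below maps an estimated kinship coefficient to a relationship class using those published boundaries. Whether exactly these cut-offs were applied to select the individuals counted here is an assumption, and the function names are hypothetical.

```python
def relationship_degree(kinship):
    """Classify a pair from its estimated kinship coefficient.

    Cut-offs follow the inference criteria published for KING:
    > 0.354 duplicate/MZ twin, 0.177-0.354 1st degree,
    0.0884-0.177 2nd degree, 0.0442-0.0884 3rd degree.
    """
    if kinship > 0.354:
        return "duplicate/MZ twin"
    if kinship > 0.177:
        return "1st degree"
    if kinship > 0.0884:
        return "2nd degree"
    if kinship > 0.0442:
        return "3rd degree"
    return "unrelated"


def is_quasi_unrelated(kinship):
    """True if the pair is more distant than 3rd degree relatives."""
    return relationship_degree(kinship) == "unrelated"
```

For example, a full-sibling or parent/offspring pair has an expected kinship coefficient of 0.25 and falls in the 1st degree class, while first cousins (expected 0.0625) fall in the 3rd degree class and would not both be kept in a quasi-unrelated set.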

6 Summary

The Omni2.5 genotypes provided by the 1000 Genomes Project were analyzed for quality. In general, the quality of the data was excellent. Out of 2318 samples, two individuals had genotypes on the X and Y chromosomes that were inconsistent with the provided gender, while sex could not be determined for another 20 individuals. It is recommended that these 22 individuals be removed for any analysis in which sex is important. All samples had call rate > 97%. No sample had unusually high heterozygosity across the autosomes. Over 98% of the SNPs had call rate > 97%.

Dummy individuals were added to the provided families to form "complete" pedigrees. Based on pairwise relationship testing, the pedigrees were modified to be consistent with first-degree estimated kinship coefficients. Changes were made to a total of 10 pedigrees.

A set of unrelated or distantly related individuals was also chosen. These individuals were more distant than 3rd degree relatives, clustered with other samples from their continental group, and had clear sex inference that was consistent with the provided gender. This set had 1736 individuals.

A set of 1989184 high quality SNPs was selected. These markers were present in the datasets from both the Broad and Sanger Institutes, had call rate > 97%, had two observed alleles, and had no observed errors in Mendelian transmission.

The following files are available at our website (http://tcag.ca):

• PLINK binary format files including the 1989184 SNPs passing QC with two alleles, and the inferred pedigree structures, including all necessary dummy individuals

• PLINK binary format files including the 1989184 SNPs passing QC with two alleles, and the 1736 individuals who were inferred to be more distant than 3rd degree relatives, clustered well with other individuals in their geographic region, and had clear sex inference that was consistent with provided gender

• Table of quality statistics per sample

• Table of quality statistics per SNP
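Three of the marker and sample criteria summarised above (call rate > 97%, two observed alleles, no Mendelian transmission errors) can be sketched as follows. This is an illustration under stated assumptions rather than the code used for this report: genotypes are assumed to be coded as alternate-allele counts (0/1/2, NaN for missing), the function names are hypothetical, and the check for presence in both the Broad and Sanger datasets is not covered.

```python
import numpy as np


def qc_masks(genotypes, min_call_rate=0.97):
    """Return (sample_pass, snp_pass) boolean masks for a (n_samples, n_snps) matrix."""
    g = np.asarray(genotypes, dtype=float)
    called = ~np.isnan(g)

    # Call rate: fraction of non-missing genotypes per sample and per SNP.
    sample_call_rate = called.mean(axis=1)
    snp_call_rate = called.mean(axis=0)

    # Two observed alleles: the alternate-allele count is neither zero
    # nor equal to the total number of called alleles.
    alt_count = np.nansum(g, axis=0)
    n_called = called.sum(axis=0)
    two_alleles = (alt_count > 0) & (alt_count < 2 * n_called)

    sample_pass = sample_call_rate > min_call_rate
    snp_pass = (snp_call_rate > min_call_rate) & two_alleles
    return sample_pass, snp_pass


def mendel_consistent(child, father, mother):
    """True if an autosomal child genotype (0/1/2) can arise from the parental genotypes."""
    # Alleles (0 = reference, 1 = alternate) each parental genotype can transmit.
    transmit = {0: {0}, 1: {0, 1}, 2: {1}}
    return any(a + b == child for a in transmit[father] for b in transmit[mother])
```

A SNP would then be retained only if `snp_pass` is true and `mendel_consistent` holds for every genotyped trio at that SNP.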

Appendix 1: List of 1000 Genomes populations and descriptions (number of individuals with Omni2.5 data)

AFR (Africans, 508)
• ACB = African Caribbean in Barbados (102)
• ASW = African ancestry in Southwest US (104)
• LWK = Luhya in Webuye, Kenya (116)
• MKK = Maasai in Kinyawa, Kenya (31)
• YRI = Yoruba in Ibadan, Nigeria (189)

AMR (Americas, 418)
• CLM = Colombian in Medellin, Colombia (107)
• MXL = Mexican ancestry in Los Angeles, California (103)
• PEL = Peruvian in Lima, Peru (105)
• PUR = Puerto Rican in Puerto Rico (111)

EAS (East Asians, 587)
• CDX = Chinese Dai in Xishuangbanna, China (100)
• CHB = Han Chinese in Beijing, China (108)
• CHD = Chinese in Denver, Colorado (1)
• CHS = Southern Han Chinese, China (153)
• JPT = Japanese in Tokyo, Japan (105)
• KHV = Kinh in Ho Chi Minh City, Vietnam (121)

EUR (Europeans, 649)
• CEU = Utah residents with Northern and Western European ancestry (183)
• FIN = Finnish in Finland (100)
• GBR = British in England and Scotland (104)
• IBS = Iberian populations in Spain (150)
• TSI = Toscani in Italy (112)

SAS (Southern Asians, 113)
• GIH = Gujarati Indian in Houston, TX (113)

Appendix 2: List of downloaded files

1. ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/supporting/hd_genotype_chip/ALL.chip.omni_broad_sanger_combined.20140818.snps.genotypes.vcf.gz (last modified August 18, 2014; merged genotype file, in vcf format)

2. ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/technical/working/20140502_sample_summary_info/20140502_all_samples.ped (last modified May 2, 2014; sex and relationship information)

3. ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/supporting/hd_genotype_chip/broad_intensities/omni25.2141.sample.panel (last modified November 22, 2013; population information for samples genotyped at the Broad)

4. ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/supporting/hd_genotype_chip/sanger_intensities/sanger_omni_chip.20130805.ALL.panel (last modified August 12, 2013; population information for samples genotyped at Sanger)

5. http://www.1000genomes.org/sites/1000genomes.org/files/documents/20101214_1000genomes_samples.xls (population information for 43 individuals missing from Broad samples population file)
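Since the merged genotype file is distributed in VCF format, a reader may want to turn its genotype columns into allele counts before any QC. The sketch below parses the GT field of standard VCF data lines. It is an illustration only: the function names are hypothetical, the local file path is supplied by the caller, and no assumption is made about which other FORMAT fields the file carries.

```python
import gzip


def parse_vcf_genotypes(line):
    """Return (chrom, pos, snp_id, counts) for one tab-delimited VCF data line.

    counts holds the number of non-reference alleles per sample,
    or None where the genotype is missing.
    """
    fields = line.rstrip("\n").split("\t")
    chrom, pos, snp_id = fields[0], int(fields[1]), fields[2]

    # The FORMAT column (9th field) says where GT sits in each sample entry.
    gt_index = fields[8].split(":").index("GT")

    counts = []
    for sample in fields[9:]:
        gt = sample.split(":")[gt_index]
        alleles = gt.replace("|", "/").split("/")
        if "." in alleles:
            counts.append(None)  # missing call
        else:
            counts.append(sum(allele != "0" for allele in alleles))
    return chrom, pos, snp_id, counts


def iter_vcf(path):
    """Yield parsed records from a gzip-compressed VCF, skipping header lines."""
    with gzip.open(path, "rt") as handle:
        for line in handle:
            if not line.startswith("#"):
                yield parse_vcf_genotypes(line)
```

Called on a made-up line with four samples carrying `0/0`, `0/1`, `1/1` and `./.`, `parse_vcf_genotypes` gives the counts 0, 1, 2 and None, from which per-sample and per-SNP call rates follow directly.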


Appendix 3: Pedigrees larger than trios based on the information provided by 1000 Genomes; original pedigree identifiers are shown in non-black colours

Family ASW3
ASW34 NA20344 ASW33 NA20336 ASW35 NA20349 ASW36 NA20334
ASW32 ASW31
NA20345 NA20350 NA20337 NA20335
2484 2485 2489 2492

Family ASW1
ASW13 NA19713 NA19982 NA19985
ASW11 ASW12
NA19714 NA19983
2436
2437

Family ASW2
NA20341 NA20289 ASW23
ASW22 ASW21
NA20290
2471
2487
Note: family 2487 has another member, NA20340, who is unrelated to everyone in ASW2

Family ASW4
NA20301 ASW44 NA20282 NA20278 ASW43
ASW42 ASW41
NA20302 NA20285 NA20279 NA20284
2469 2467
2477

Family ASW5
NA20312 NA20313
ASW51 ASW52
2478i 2478ii

Family CDX4
HG02381 HG02373
CDX42 CDX41
HG02373
HG02381

Family CDX3
HG00983 HG00978
CDX32 CDX31
HG00983 HG00978

Family CEU1
NA06986 NA12813 NA12812 NA07045
CEU12 CEU11
NA12801 NA06997
13291 1454

Family CHS1
HG00580 HG00635 HG00581 HG00583 HG00578 HG00577 HG00634 HG00584
CHS11 CHS12 CHS14 CHS13
HG00636 HG00582 HG00579 HG00585
SH054 SH052 SH053 SH071
Note: families SH058 and SH062 are more distantly related to family CHS1

Family CHS2
HG00701 HG00702 HG00658
HG00657 HG00656
HG00703
SH089
SH074

Family CHS3
HG00525 HG00500 HG00512 HG00524 HG00513 HG00501
CHS31 CHS32
HG00526 HG00514 HG00502
SH028 SH032 SH036

Family GBR002
HG00147 HG00146
GBR0022 GBR0021
GBR002a GBR002

Family GIH004
NA20874 NA20879
GIH0041 GIH0042
NA20879 NA20874

Family KHV1
HG02046 HG02067
KHV11 KHV12
VN063 VN056

Family LWK003
NA19444 NA19434
LWK0031 LWK0032 NA19432

Family LWK005
LWK0053 NA19470 NA19443
LWK0052 LWK0051
NA19469 LWK005
NA19443

Family LWK007
NA19396 NA19397
LWK0071 LWK0072
NA19397 NA19396

Family LWK008
NA19373 NA19374
LWK0081 LWK0082
NA19373 NA19374

Family LWK009
NA19347 NA19352
LWK0091 LWK0092
NA19347 NA19352

Family MXL1
NA19676 NA19680 NA19675
NA19679 NA19678
NA19677 m004
m009

Family MXL2
NA19672 NA19661 MXL23 NA19660
NA19686
MXL21 MXL22
NA19662 NA19674 NA19684 NA19685
m008
m011
2382

Family PEL100
HG02299 HG02302 HG02301 HG02298
PEL1001 PEL1002
HG02303 HG02300
PEL52 PEL53
HG02300 shows sex inferred by genotypes (male in ped file)

Family TSI1
NA20792 NA20526
TSI12 TSI11
NA20526 NA20792

Family YRI1
NA18913 NA18912 NA19240
NA19239 YRI11 NA19238
NA18914
Y117
Y028

Family YRI2
NA18861 NA18862
YRI21 NA19105
NA18863
Y024
Y080

Family YRI3
NA19152 NA19104 NA19153
YRI31 YRI32
NA19154
Y072 Y080

Family YRI4
NA19185 NA19166 NA19204 NA19203 NA19184
YRI42 YRI41
NA19205 NA19186
Y039 Y049 Y048

Family YRI5
NA19214 NA19254 NA19213
YRI51 YRI52
NA19215
Y062 Y110

Appendix 4: Pedigrees that were modified after relationship inference in KING. Pedigrees prior to testing are shown in blue boxes, while pedigrees after testing are shown in red boxes.

Family ACB1
HG02478 HG02479 HG02429
ACB12 ACB11
HG02480

Family BB53
HG02480
HG02478 HG02479

Family BB39
HG02429

HG02429 and HG02478 inferred to be full sibs

Family ASW3
ASW35 ASW36 NA20336 NA20334
ASW31 ASW32
NA20337 NA20335

Family ASW6
ASW63 NA20349 ASW64 NA20344
ASW61 ASW62
NA20350 NA20345

ASW3 split into two disjoint families

Family ASW3
ASW34 NA20344 ASW33 NA20336 ASW35 NA20349 ASW36 NA20334
ASW32 ASW31
NA20345 NA20350 NA20337 NA20335

Family ASW4
NA20301 ASW44 NA20282 NA20278 ASW43
ASW42 ASW41
NA20302 NA20285 NA20279 NA20284

Family ASW5
NA20312 NA20313
ASW51 ASW52

Family ASW4
NA20282 NA20301 ASW44 ASW45 ASW43 NA20278
ASW42 ASW41
NA20302 NA20284 NA20313 NA20279 NA20285
NA20312

ASW4 and ASW5 combined, with one unrelated individual split off; half siblings switched

Family ASW7
NA20274 NA20414
ASW71 ASW72

Family GIH005
NA20900
NA20891 NA20882

NA20274
NA20414
NA20882
NA20900
NA20891

two individuals inferred to be full siblings
three individuals inferred to be a trio

Family LWK001
NA19313
NA19331 LWK0011
NA19334

Family LWK001
LWK0011 NA19334 NA19331
LWK0013 LWK0012
NA19313

two individuals inferred to be full siblings

Family 2395
NA19742
23951 NA19740

Family LWK003
NA19434 NA19444
LWK0031 NA19432

Family 2395
NA19742
NA19741 NA19740

NA19741

Family LWK003
NA19444 NA19434
LWK0031 LWK0032 NA19432

siblings of unknown type inferred to be full sibs
parent/offspring inferred to be unrelated

Family YRI4
NA19185 NA19166 NA19204 NA19203 NA19184
YRI42 YRI41
NA19205 NA19186

Family YRI4
NA19166 NA19185 NA19184
YRI41 YRI42
NA19186

Family Y048
NA19205
NA19203 NA19204

YRI4 split into two disjoint families
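Several of the modifications in this appendix split one declared pedigree into disjoint families (ASW3 and YRI4, for example). Once the parent/offspring links supported by the kinship estimates are fixed, the resulting families are the connected components of the pedigree graph. The sketch below finds them with a union-find structure. It illustrates the idea only and is not the procedure used for this report; the function name and the sample identifiers in the example are hypothetical.

```python
from collections import defaultdict


def split_into_families(trios):
    """Group individuals into connected families.

    trios: iterable of (child, father, mother); a parent may be None if unknown.
    Returns a list of sets of identifiers, one set per disjoint family.
    """
    parent = {}  # union-find forest: identifier -> representative

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for child, father, mother in trios:
        find(child)  # register singletons as well
        for p in (father, mother):
            if p is not None:
                union(child, p)

    families = defaultdict(set)
    for person in list(parent):
        families[find(person)].add(person)
    return sorted(families.values(), key=sorted)
```

For instance, with two hypothetical siblings `c1` and `c2` sharing parents `f1` and `m1`, a separate trio `c3`, `f2`, `m2`, and an unlinked individual `s1`, the function returns three families, so a pedigree file listing all of them under one family identifier would be split three ways.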


References

Dumanski, J. P., Rasi, C., Lonn, M., Davies, H., Ingelsson, M., Giedraitis, V., ... Forsberg, L. A. (2015). Mutagenesis. Smoking is associated with mosaic loss of chromosome Y. Science, 347(6217), 81-83. doi:10.1126/science.1262092

Gazal, S., Sahbatou, M., Babron, M. C., Genin, E., & Leutenegger, A. L. (2015). High level of inbreeding in final phase of 1000 Genomes Project. Sci Rep, 5, 17453. doi:10.1038/srep17453

1000 Genomes Project Consortium, Auton, A., Brooks, L. D., Durbin, R. M., Garrison, E. P., Kang, H. M., ... Abecasis, G. R. (2015). A global reference for human genetic variation. Nature, 526(7571), 68-74. doi:10.1038/nature15393

Manichaikul, A., Mychaleckyj, J. C., Rich, S. S., Daly, K., Sale, M., & Chen, W. M. (2010). Robust relationship inference in genome-wide association studies. Bioinformatics, 26(22), 2867-2873. doi:10.1093/bioinformatics/btq559

Matise, T. C., Chen, F., Chen, W., De La Vega, F. M., Hansen, M., He, C., ... Buyske, S. (2007). A second-generation combined linkage physical map of the human genome. Genome Research, 17(12), 1783-1786. doi:10.1101/gr.7156307

Patterson, N., Price, A. L., & Reich, D. (2006). Population structure and eigenanalysis. PLoS Genetics, 2(12), e190. doi:10.1371/journal.pgen.0020190

Price, A. L., Patterson, N. J., Plenge, R. M., Weinblatt, M. E., Shadick, N. A., & Reich, D. (2006). Principal components analysis corrects for stratification in genome-wide association studies. Nature Genetics, 38(8), 904-909. doi:10.1038/ng1847

Wigginton, J. E., & Abecasis, G. R. (2005). PEDSTATS: descriptive statistics, graphics and quality assessment for gene mapping data. Bioinformatics, 21(16), 3445-3447. doi:10.1093/bioinformatics/bti529

Wigginton, J. E., Cutler, D. J., & Abecasis, G. R. (2005). A note on exact tests of Hardy-Weinberg equilibrium. American Journal of Human Genetics, 76(5), 887-893. doi:10.1086/429864