Evaluation of the External RNA Controls Consortium (ERCC) reference material using a modified Latin square design

Pine PS1, Munro SA1, Parsons JR1, McDaniel J1, Bergstrom Lucas A2, Lozach J3, Myers TG4, Su Q4, Jacobs-Helber SM5,6, and Salit M1

1. Joint Initiative for Metrology in Biology, National Institute of Standards and Technology, Stanford, CA, 94305, USA
2. Genomics Research and Development, Agilent Technologies, Santa Clara, CA, 95051, USA
3. Illumina, Inc., San Diego, CA, 92122, USA
4. National Institute of Allergy and Infectious Diseases, Bethesda, MD, 20892, USA
5. AIBioTech, Inc., Richmond, VA, 23235, USA
6. Current affiliation: GENETWORx, LLC., Glen Allen, VA, USA

Corresponding Author: P Scott Pine ([email protected]), Joint Initiative for Metrology in Biology, National Institute of Standards and Technology, 443 Via Ortega, Stanford, CA, 94305, USA

E-mail:
Sarah A Munro ([email protected])
Jerod R Parsons ([email protected])
Jennifer McDaniel ([email protected])
Anne Bergstrom Lucas ([email protected])
Jean Lozach ([email protected])
Timothy G Myers ([email protected])
Qin Su ([email protected])
Sarah M Jacobs-Helber ([email protected])
Marc Salit ([email protected])
bioRxiv preprint doi: https://doi.org/10.1101/034868; this version posted January 7, 2016. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. It is made available under a CC-BY-ND 4.0 International license.
Abstract

BACKGROUND: Highly multiplexed assays for quantitation of RNA transcripts are being used in many areas of biology and medicine. Using data generated by these transcriptomic assays requires measurement assurance with appropriate controls. Methods to prototype and evaluate multiple RNA controls were developed as part of the External RNA Controls Consortium (ERCC) assessment process. These approaches included a modified Latin square design to provide a broad dynamic range of relative abundance with known differences between four complex pools of ERCC RNA transcripts spiked into a human liver total RNA background.

RESULTS: ERCC pools were analyzed on four different microarray platforms: Agilent 1- and 2-color, Illumina bead, and NIAID lab-made spotted microarrays; and two different second-generation sequencing platforms: the Life Technologies 5500xl and the Illumina HiSeq 2500. Individual ERCCs were assessed for reproducible performance in signal response to concentration among the platforms. Most demonstrated linear behavior if they were not located near one of the extremes of the dynamic range. Performance issues with any individual ERCC transcript could be attributed to detection limitations, platform-specific target probe issues, or potential mixing errors. Collectively, these pools of spike-in RNA controls were evaluated for suitability as surrogates for endogenous transcripts to interrogate the performance of the RNA measurement process of each platform. The controls were useful for establishing the dynamic range of the assay, as well as delineating the useable region of that range where differential expression measurements, expressed as ratios, would be expected to be accurate.

CONCLUSIONS: The modified Latin square design presented here uses a composite testing scheme for the evaluation of multiple performance characteristics: linear performance of individual controls, signal response within dynamic range pools of controls, and ratio detection between pairs of dynamic range pools. This compact design provides an economical sample format for the evaluation of multiple external RNA controls within a single experiment per platform. These results indicate that well-designed pools of RNA controls, spiked into samples, provide measurement assurance for endogenous gene expression experiments.

Keywords
ERCC, gene expression, microarray, RNA controls, RNA sequencing, RNA-Seq, spike-in controls
Background

In 2003, the National Institute of Standards and Technology (NIST) hosted a meeting to discuss the need for a universal RNA reference material, which could be used for gene expression profiling assays [1]. As a result of this effort, the External RNA Controls Consortium (ERCC) was formed, of which NIST is a founding member and host. The ERCC assembled a sequence library of 176 DNA sequences that could be transcribed into RNA to serve as controls in systems used to measure gene expression [2, 3]. These controls were cataloged as ERCC-00001 through ERCC-00176, and are collectively referred to as ERCCs in this manuscript. These were evaluated and a subset was selected for dissemination as a standard. A set of 96 controls is now available as sequence-certified DNA plasmids, NIST Standard Reference Material (SRM) 2374 [4].

In the final phase of evaluation, an experimental design for assessing the combined performance of ERCCs prepared as complex RNA pools was used. Each ERCC subpool was designed to have a 2^20 dynamic range of abundance of controls, and particular controls in the different pools were present in different abundances according to a modified Latin square design. This design provides known relative differences between the pools across a large dynamic range of abundance (Figure 1). With this design, individual ERCCs were assessed for their signal response to 1.5-, 2.5-, and 4-fold increases in concentration. Pairwise comparisons of these pools also provide an assessment of ratio-based performance as a function of dynamic range. Initially assessed with three different microarray platforms, these same pools were subsequently measured by RNA sequencing (RNA-Seq) with two second-generation (NGS) sequencing platforms. The data from these two sets of experiments, corresponding to the 96 controls of the SRM, are presented here.

Methods

Pool Design

The ERCCs were distributed into 5 subpools (A–E), each containing a unique set of controls (see Fig. 1A). These subpools were prepared at AIBioTech (formerly CBI Services, Richmond, VA) to ERCC specifications. This design results in the relative abundance within each subpool covering a dynamic range of 2^20. Subpools A–E were then mixed by volume in a modified Latin square design to create 4 different pools (see Fig. 1B and Table 1). Subpools B–E have different relative abundances between the four pools (in a Latin square design), while subpool A is held at a constant proportion (the “modification”). In addition, the ERCCs in subpools B–E participate in 6 pairwise comparisons between pools to produce ratios of 4-, 2.7-, 2.5-, 1.7-, 1.6-, and 1.5-to-1 (Fig. 1B and Suppl. Figs. 1–4). The ERCCs in subpool A are always present at 10% in any of the pools, and create the 1-to-1 component in any of the 6 possible pairwise comparisons. These pools were designated as Pools 12, 13, 14, and 15 in the set of pools developed for ERCC testing [2]. Each of these ERCC pools was spiked into a common “background” of human liver total RNA (Ambion) to create 4 corresponding samples. Each
microarray test site determined the relative amount of spike-in pools to add to the background. Agilent, Illumina, and NIAID used 0.144%, 0.25%, and 0.265% (wt/wt) of ERCC pool per total liver RNA, respectively. For the sequencing test sites, total RNA samples were spiked at NIST at 0.3% (wt/wt) and then sequenced by NIST and Illumina.

The ERCC molecules used in these pools were prepared by in vitro transcription of polymerase chain reaction (PCR) products representing “candidate” sequences prior to the release of NIST SRM 2374. The plasmids were designed to produce either “sense” or “antisense” RNA controls [4]. In this study, seven of these ERCC transcripts were determined to be antisense using a stranded RNA-Seq protocol (see Table 1) and were excluded from further data analysis, because the microarrays were designed to detect “sense” RNA controls.

Microarray measurements

Samples were measured at each test site using the following methods.

The NIAID in-house spotted microarrays contain long (70-mer) oligonucleotides designed to hybridize the ERCC transcripts, printed on epoxy-coated glass slides (Corning) in quadruplicate using an OmniGrid robot (Genomic Solutions) with 16 SMP3 print tips (Telechem). RNA was reverse transcribed using Oligo dT primer (12-20 mer) mix (Invitrogen) and Superscript II reverse transcriptase (Invitrogen). Fluorescent Cy-Dye-dUTP (GE) nucleotide was incorporated into first-strand cDNA during the reverse transcription. After degradation of the mRNA template strand, labeled single-stranded cDNA target was purified using Vivaspin 500 (10K, Millipore). Hybridization was performed at 45 °C for 16 hours on a MAUI hybridization station. The arrays were washed twice in 1X SSC and 0.05% SDS and twice in 0.1X SSC, then air dried. Microarrays were scanned on GenePix 4000B (Axon) at 10 micron resolution. GenePix Pro software was used for image analysis. Median pixel intensity (no background subtraction) was taken for each of the 4 replicate spots, and the median of these four values was taken to represent the data.

The Agilent microarrays (8x60K Agilent G3 8-pack format with the Design ID 022439) contain 60-mer oligonucleotide probes synthesized in situ onto slides using a proprietary non-contact industrial inkjet printing process. Labeled cRNA for both the one-color and two-color microarray experiments was prepared using the Agilent Low Input Quick Amp Labeling Kit, Two-Color (5190-2306). RNA was reverse transcribed using AffinityScript RT, Oligo(dT) Promoter Primer, and T7 RNA Polymerase. Fluorescent Cy-Dye-dCTP nucleotide was incorporated during cRNA synthesis and amplification. Microarrays were hybridized at 65 °C for 17 hours. All microarrays were scanned in one batch in random order using default settings for Agilent C Scanner using a single pass over the scan area at a resolution of 3 µm and a 20-bit scan type. Data were extracted with Agilent Feature Extraction Software (ver. 10.7.3.1) using the default settings for either the one-color protocol or the two-color protocol.

The Illumina Human-6 Expression BeadChips contain 50-mer oligonucleotide probes with a 29-mer address sequence attached to beads held in etched microwells. RNA was reverse
transcribed using a T7 Oligo(dT) primer containing a T7 promoter sequence. Biotinylated cRNA was prepared using the Illumina TotalPrep RNA Amplification Kit (Ambion). BeadChips were hybridized at 58 °C for 14–20 hours, washed, and labeled with streptavidin-Cy3. BeadChips were scanned with the Illumina iScan System. Intensity values are determined for every bead and summarized for each bead type. For more details refer to the Whole-Genome Gene Expression Direct Hybridization Assay Guide (Illumina, part no. 11322355).

RNA Sequencing measurements

NIST prepared samples of spiked liver total RNA for sequencing analysis with the 5500xl at NIST and the HiSeq 2500 at Illumina. Prior to library preparation, samples were depleted of ribosomal RNA. The 5500xl experiment produced an average of 23,866,495 single-ended reads (75 base) per sample and the HiSeq 2500 experiment yielded an average of 48,168,710 paired-end reads (2x75 base). For both platforms, sequence reads were aligned against a reference sequence consisting of the human genome (hg19) and the ERCC transcript sequences of SRM 2374 (Note: ERCC-00114 is not part of the SRM and not included as part of the reference transcriptome). Alignment and quantification of sequence reads to obtain per-transcript counts were performed with the LifeScope bioinformatic analysis suite (Life Technologies) for 5500xl data and with the TopHat-Cufflinks suite for HiSeq 2500 data [5, 6].

Results and Discussion

For each of the platforms, if the ERCC spike-in pools are added to the background RNA in the proper proportion, then the 2^20 range of relative abundance will cover the distribution of the endogenous transcript signals. In the first set of experiments, each microarray platform provider empirically determined in pilot studies their chosen spike-in proportion to add to the total RNA background (not shown). Agilent used 0.144% (wt/wt) for both one-color and two-color arrays, and Illumina and NIAID used 0.25% and 0.265%, respectively. For the RNA-Seq experiments, ERCC pools were added to the background at NIST at 0.3% and shared with the Illumina site. The LifeTech 5500xl and Illumina HiSeq measurements were performed at NIST and Illumina, respectively. The distribution of ERCC signals relative to the endogenous liver background transcripts is shown for all platforms in Table 2. For all sites, the dynamic range of the signals from the controls matched the range of signal expression from the endogenous genes of the liver background. This supports the use of these signals to derive metrics useful for characterizing each measurement system.

Dose-response and Outlier Detection

For each platform, we can determine whether the analytical signal (fluorescence intensity in microarrays or length-normalized counts in sequencing) changes with the concentration of an analyte (the ERCC being measured). For each control, the signal from each pool can be plotted against the corresponding relative abundance (Table 1), producing a collection of dose-response curves representing each individual ERCC in the study (Figures 2–7, panel A). ERCCs
that were missing data for one or more concentrations in the RNA-Seq experiments were flagged as partially detected or undetected, and excluded from further analysis (Figures 6 and 7, panel A).

The mid-point of each ERCC dose-response curve (average signal versus average relative abundance from the Latin square) was used to assess whether any particular ERCC was an outlier relative to the entire set of controls. The data were fit to an appropriate model for each platform (Figures 2–7, panel B). For the microarray experiments, a model based on the Langmuir isotherm was used [7, 8]. The dissociation constant, Kd, was determined by fitting the data as follows:

$$ I = I_{max}\,\frac{c}{K_d + c} + bg \qquad (1) $$

where I is the signal intensity and c is the relative abundance of the ERCC. The maximal intensity of a feature at saturation, Imax, and the background, bg, are experimentally derived from the average of the most abundant ERCC in each of the 4 pools and from ERCC-00073, a component omitted from the pools, respectively. For the RNA-Seq experiments, a linear fit with a slope of 1 and fitted y-intercept was used as the model. For either model, ERCCs outside the 99% confidence interval (CI) were flagged as outliers (Figs 2–7, panel B) and compared across platforms to identify any ERCC-specific anomalies (Table 3).

With the exception of the ERCCs in the 1-to-1 subpool, the signal for each control should follow a strictly increasing monotonic function determined by the pool fraction of the Latin square design, 10% < 15% < 25% < 40% (see Fig. 1B). This monotonicity was assessed with Spearman’s rho, ρ, where ERCCs with ρ < 1 were identified for comparison across platforms. In addition, the slope of each individual ERCC dose-response curve can be calculated and plotted as a function of the relative abundance, where a slope of m = 1 corresponds to an ideal dose-response. For the microarray data, the first derivative of the Langmuir function also provides us with a model of the expected slope, and the inflection points allow us to demarcate a region of the dynamic range where we should expect a linear response (Figs 2–5, panel C) [9]. Non-monotonic ERCCs that fall within that portion of the dynamic range were also identified as outliers. For the RNA-Seq data, all non-monotonic ERCCs are flagged as outliers (Figs 6 and 7, panel C).

One control, ERCC-00113, was an outlier on all platforms, with ρ = -0.2 for each. Closer inspection of the monotonic trend indicated that the least abundant target feature produced the highest signal in each case. This ERCC was more consistent with membership in subpool C, indicating a likely error in the preparation of the subpools. Therefore, Figures 2–7 include this ERCC plotted as a component of subpool C.

Table 3 includes all ERCCs identified as outliers by the two criteria above and highlighted in Figures 2–7; specific controls discussed below are indicated with an asterisk (*). The majority of non-monotonic ERCCs in the microarray experiments occurred below the lower inflection point on the slope plots, and those flagged for non-detection in the RNA-Seq experiments also appear in the lower range of the signal response curves. For these ERCCs, it is difficult to assess performance beyond their utility for defining the lower limits of the linear range, so these are not included in the outlier table.
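The monotonicity and slope checks described above can be sketched in a few lines of Python. This is a minimal illustration, not the analysis code used in the study: the function names and the example signal values are hypothetical, ties are assumed to be absent, and the slope is assumed to be computed on log2-transformed signal versus log2-transformed pool fraction. The second example mimics the pattern reported for ERCC-00113, where the least abundant pool gave the highest signal.

```python
import math

# Pool fractions (percent) of a titrated subpool across the four pools,
# ordered from least to most abundant (10% < 15% < 25% < 40%).
POOL_FRACTIONS = [10, 15, 25, 40]


def ranks(values):
    """Return 1-based ranks; assumes all values are distinct (no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rho(x, y):
    """Spearman's rho via the rank-difference formula (valid without ties)."""
    n = len(x)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


def loglog_slope(x, y):
    """Least-squares slope of log2(y) versus log2(x); 1 means proportional."""
    lx = [math.log2(v) for v in x]
    ly = [math.log2(v) for v in y]
    mean_x = sum(lx) / len(lx)
    mean_y = sum(ly) / len(ly)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    sxx = sum((a - mean_x) ** 2 for a in lx)
    return sxy / sxx


# Hypothetical signals for two controls (arbitrary units).
well_behaved = [120.0, 180.0, 300.0, 480.0]  # proportional to pool fraction
shifted = [480.0, 120.0, 180.0, 300.0]       # lowest fraction, highest signal

rho_ok = spearman_rho(POOL_FRACTIONS, well_behaved)
rho_shifted = spearman_rho(POOL_FRACTIONS, shifted)
slope_ok = loglog_slope(POOL_FRACTIONS, well_behaved)
print(rho_ok, round(rho_shifted, 2), round(slope_ok, 2))
```

With these illustrative values, the proportional control gives ρ = 1 and a slope of 1, while the shifted control gives ρ = -0.2, the value reported above for ERCC-00113.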
There were nine controls that were outliers on at least one platform for each criterion. Six of those were outliers for both criteria on the same platform: ERCC-00156 on LifeTech; ERCC-00131, ERCC-00134 and ERCC-00143 on AGL-1; ERCC-00148 on AGL-1 and ILM HiSeq; and ERCC-00168 on AGL-2 and ILM HiSeq. All of these controls performed well on the majority of platforms.

Fifteen ERCCs were non-monotonic only. ERCC-00046 and ERCC-00062 were the most highly abundant outliers in this class. In both cases, the two lowest concentrations for each control produced nearly identical values, with the value at the lowest concentration slightly higher. With the exception of ERCC-00138, all of these controls performed well on the majority of platforms.

There are 26 ERCCs that appear to be outliers with respect to the overall dose-response model but are still monotonic. For example, ERCC-00058 was the only control determined to be a response curve outlier on all microarray platforms and one RNA-Seq platform; however, the observed slope on all platforms tested was greater than 0.9. ERCC-00170 was also flagged on every platform except the NIAID microarray, but was not evaluated for monotonicity because it is in the 1-to-1 subpool. Some of these results may be attributable to difficulties with accurately preparing large dynamic range pools with multiple controls, so that the actual concentration is different from the nominal abundance. The linear signal responses indicate that the proper combinations of the subpools A–E were achieved for the Latin square design. Some of these outliers might also be the result of an RNA processing bias that may be analyte specific and proportional to abundance, for example poly-A enrichment [10].

Intensity-dependent differential expression

For microarray data, an intensity-dependent bias is often visualized using an MA-plot, where M is the log2 transformation of the ratio of red and green fluorescence intensities in 2-channel data, and A is the log2 transformation of the average of the two [11]. This view has also been applied to two-condition single channel data, where M becomes the ratio of two different conditions, which is also referred to as a ratio-intensity plot (RI-plot) [12]. These comparative visualizations have been extended to sequencing data in the form of RA-plots, where the ratios and averages of integer count data form a characteristic pattern at the lower end of the signal range [13]. Each of these visualizations is a variation of a Bland-Altman plot (or difference plot), which is used here to visualize the ability to detect the nominal differences between two measurements [14].

A Bland-Altman plot of the ERCC components can be generated for any pairwise combination of Pools 12–15. One possible pairwise comparison, which produces fold-changes of 2.5 and 2.7 in both “up” and “down” directions (see Fig 1B), is shown in Figures 2–7, panel D. Additional pairwise comparisons are shown in Supplemental Figures 1–6. For the microarray platforms, the discrimination between the target ratios is optimal near the middle of their dynamic range, and the ratios are “compressed” at both the lower and upper
extremes. This constraint upon log2 ratios has been previously described [15]. The ratios converge towards unity at the lower end due to background noise, which is additive and contributes to both samples being compared. A similar compression is seen at high signal, where saturation dominates. We can also use Equation 1 to derive the expected intensity ratios and average intensities for any fold-change of relative abundance. These fitted curves are also shown in Figures 2–5, panel D.

Signals in RNA-Seq are not subject to saturation (though high abundance transcripts can dominate the counting, and “crowd out” signals from lower abundance controls). As a consequence, the ratios do not compress at the upper end of the dynamic range. The RNA-Seq signals in this data set are derived from counting technical replicates, where the variation can be characterized by a Poisson distribution [16]. In this case, “shot noise” dominates the signal at the low end, where counts might be added to either sample, and the ratios may deviate from target values in either direction (Figs 6 and 7, panel D).

Conclusions

The modified Latin square design provided for simultaneous evaluation of multiple controls with a minimal number of samples. While each individual ERCC was only tested over a small range of relative abundance (up to 4-fold for the ERCCs tested at multiple ratios, and a single relative abundance value for the 1-to-1 components), in aggregate they describe the overall measurement behavior of a platform. The spread of the data indicates that differences in signals observed between different RNA species within the same sample may not accurately reflect the relative abundance between different RNA components of the same sample. Some of this dispersion may be due to the complexity of the pools used in these experiments, where the distribution of target abundances described in Fig. 1A may not have been attained. For microarrays, probe designs for each ERCC target may also introduce some variability in signal between different ERCCs at the same relative abundance. For RNA-Seq, a non-uniform distribution of reads along different control sequences may also contribute to the variability [17].

The ERCCs did demonstrate that there is a linear region of the dynamic range of each platform where changes in abundance of a particular RNA transcript can produce a proportional change in signal. In this region, the ratios obtained with each platform approach the target ratios of the modified Latin square design. As a consequence, comparisons between samples for any particular RNA species can be expected to be accurate with respect to ratio-based measurements if they fall within this region.

A pair of complex mixtures of RNA controls derived from NIST SRM 2374, designed to provide a set of ratios across a similar dynamic range, is commercially available (Ambion™ ERCC ExFold RNA Spike-In Mixes). NIST has developed an R-based tool, the erccdashboard, to provide metrics and visualizations for these controls [18].

The ERCC RNA controls demonstrated utility in four different gene expression microarray platforms and two RNA-Seq platforms. Performance issues with any individual ERCC could be
attributed to detection limitations or a target probe issue for particular platforms. The spike-in RNA controls were useful for establishing the dynamic range of relative abundance for a platform as well as delineating a reliable region where ratios can be measured accurately. The composite testing scheme used in this study demonstrated that using well-designed pools of RNA controls provides measurement assurance for endogenous gene expression experiments.

Pools of RNA controls from this study have been used as spike-ins for RNA-Seq experiments [19], and commercially available versions of these controls have been used for their intended purpose as quality controls [20–23]. These controls have also proven useful in product and method development due to their certified sequences and known concentrations [24–32]. Recently, they have become important in comparing transcriptomes between cell types in immunology [20, 32, 33], agriculture [34, 35], and other biology studies [21, 36–38], as well as key to understanding and accounting for the technical noise in single-cell sequencing experiments [39–42].
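The ratio compression at the extremes of a microarray's dynamic range, and the region in between where ratios can be measured accurately, can be illustrated with the Langmuir form of Equation 1. The sketch below is only an illustration under stated assumptions: the parameter values (Imax, Kd, bg) and the function names are hypothetical rather than fitted values from this study, and A is taken here as the mean of the two log2 intensities.

```python
import math


def langmuir_signal(c, i_max, k_d, bg):
    """Langmuir isotherm signal model: I = Imax * c / (Kd + c) + bg."""
    return i_max * c / (k_d + c) + bg


def expected_ma(c, fold, i_max, k_d, bg):
    """Expected log2 ratio (M) and mean log2 intensity (A) for two samples
    whose relative abundances are c and fold * c."""
    low = math.log2(langmuir_signal(c, i_max, k_d, bg))
    high = math.log2(langmuir_signal(fold * c, i_max, k_d, bg))
    return high - low, (high + low) / 2


# Hypothetical parameters, chosen only to make the three regimes visible.
I_MAX, K_D, BG = 60000.0, 1000.0, 50.0
FOLD = 2.5  # one of the target ratios of the design

target_m = math.log2(FOLD)
m_low, a_low = expected_ma(1e-4, FOLD, I_MAX, K_D, BG)   # background-dominated
m_mid, a_mid = expected_ma(10.0, FOLD, I_MAX, K_D, BG)   # between the extremes
m_high, a_high = expected_ma(1e7, FOLD, I_MAX, K_D, BG)  # saturated

for label, m, a in [("low", m_low, a_low), ("mid", m_mid, a_mid),
                    ("high", m_high, a_high)]:
    print(f"{label:>4}: A = {a:6.2f}, M = {m:5.3f} (target {target_m:.3f})")
```

Under this model the expected log2 ratio is always smaller than the target, approaching zero where background or saturation dominates and coming closest to the target between those extremes.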
Abbreviations

ERCC: External RNA Controls Consortium
NIAID: National Institute of Allergy and Infectious Diseases
NIST: National Institute of Standards and Technology
SRM: Standard Reference Material
PCR: Polymerase chain reaction
RNA-Seq: RNA sequencing
NGS: Next-generation sequencing
AGL-1: Agilent 1-color
AGL-2: Agilent 2-color
ILM-bead: Illumina bead array
ILM-ngs: Illumina next-generation sequencing
LifeTech: Life Technologies

Competing Interests

JL is employed by Illumina Inc., manufacturer of one of the microarray platforms and one of the sequencing platforms used in this study. ABL is employed by Agilent Inc., manufacturer of one of the microarray platforms used in this study. These authors provided data. All data analysis was performed by NIST.

Authors’ Contributions

ABL, JL, TGM, and SM designed the study. JM, SMJ-H, and SM developed the reference samples. SAM, JM, ABL, JL, TGM, and SQ acquired and processed the data. All authors participated in the preliminary analysis and interpretation. PSP developed metrics and visualizations and drafted the manuscript. All authors participated in the revision process and provided final approval.

Acknowledgements

This research was supported in part by the Intramural Research Program of the NIH, NIAID.

Disclaimer

Certain commercial equipment, instruments, or materials are identified in this paper in order to specify the experimental procedure adequately. Such identification is not intended to imply recommendation or endorsement by the National Institute of Standards and Technology (NIST), nor is it intended to imply that the materials or equipment identified are necessarily the best available for the purpose.
References

1. Cronin M, Ghosh K, Sistare F, Quackenbush J, Vilker V, O'Connell C. Universal RNA reference materials for gene expression. Clin Chem. 2004;50(8):1464-71.
2. External RNA Controls Consortium. Proposed methods for testing and selecting the ERCC external RNA controls. BMC Genomics. 2005;6:150.
3. External RNA Controls Consortium. External RNA Controls Consortium: a progress report. Nat Methods. 2005;2(10):731-4.
4. NIST SRM 2374 Certificate of Analysis: https://www-s.nist.gov/srmors/view_detail.cfm?srm=2374.
5. Trapnell C, Roberts A, Goff L, Pertea G, Kim D, Kelley DR et al. Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks. Nat Protoc. 2012;7(3):562-78.
6. Roberts A, Trapnell C, Donaghey J, Rinn JL, Pachter L. Improving RNA-Seq expression estimates by correcting for fragment bias. Genome Biol. 2011;12(3):R22.
7. Halperin A, Buhot A, Zhulina EB. On the hybridization isotherms of DNA microarrays: the Langmuir model and its extensions. J. Phys.: Condens. Matter. 2006;18:S463–S490.
8. Held GA, Grinstein G, Tu Y. Relationship between gene expression and observed intensities in DNA microarrays--a modeling study. Nucleic Acids Res. 2006;34(9):e70.
9. Sebaugh JL, McCray PD. Defining the linear portion of a sigmoid-shaped curve: bend points. Pharmaceut. Statist. 2003;2:167–174.
10. Qing T, Yu Y, Du T, Shi L. mRNA enrichment protocols determine the quantification characteristics of external RNA spike-in controls in RNA-Seq studies. Sci. China Life Sci. 2013;56(2):134-42.
11. Dudoit S, Yang YH, Callow MJ, Speed TP. Statistical methods for identifying differentially expressed genes in replicated cDNA microarray experiments. Stat. Sin. 2002;12:111–139.
12. Pine PS, Rosenzweig BA, Thompson KL. An adaptable method using human mixed tissue ratiometric controls for benchmarking performance on gene expression microarrays in clinical laboratories. BMC Biotechnol. 2011;11:38.
13. Marchetti A, Schruth DM, Durkin CA, Parker MS, Kodner RB, Berthiaume CT et al. Comparative metatranscriptomics identifies molecular bases for the physiological responses of phytoplankton to varying iron availability. Proc Natl Acad Sci USA. 2012;109(6):E317-25.
14. Bland JM, Altman DG. Measuring agreement in method comparison studies. Stat Methods Med Res. 1999;8(2):135-60.
15. Sharov V, Kwong KY, Frank B, Chen E, Hasseman J, Gaspard R et al. The limits of log-ratios. BMC Biotechnol. 2004;4:3.
16. Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B. Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods. 2008;5(7):621-8.
17. Finotello F, Lavezzo E, Bianco L, Barzon L, Mazzon P, Fontana P et al. Reducing bias in RNA sequencing data: a novel approach to compute counts. BMC Bioinformatics. 2014;15(Suppl 1):S7.
18. Munro SA, Lund SP, Pine PS, Binder H, Clevert DA, Conesa A et al. Assessing technical performance in differential gene expression experiments with external spike-in RNA control ratio mixtures. Nat Commun. 2014;5:5125.
19. Jiang L, Schlesinger F, Davis CA, Zhang Y, Li R, Salit M et al. Synthetic spike-in standards for RNA-seq experiments. Genome Res. 2011;21(9):1543-51.
20. Fritz EL, Rosenberg BR, Lay K, Mihailović A, Tuschl T, Papavasiliou FN. A comprehensive analysis of the effects of the deaminase AID on the transcriptome and methylome of activated B cells. Nat Immunol. 2013;14:749–755.
21. Yu Y, Fuscoe JC, Zhao C, Guo C, Jia M, Qing T et al. A rat RNA-Seq transcriptomic BodyMap across 11 organs and 4 developmental stages. Nat Commun. 2014;5:1–11.
22. Hashimshony T, Wagner F, Sher N, Yanai I. CEL-Seq: Single-Cell RNA-Seq by Multiplexed Linear Amplification. Cell Rep. 2012;2:666–673.
23. Fu GK, Xu W, Wilhelmy J, Mindrinos MN, Davis RW, Xiao W et al. Molecular indexing enables quantitative targeted RNA sequencing and reveals poor efficiencies in standard library preparations. Proc Natl Acad Sci. 2014;111:1891–1896.
24. Devonshire AS, Sanders R, Wilkes TM, Taylor MS, Foy CA, Huggett JF. Application of next generation qPCR and sequencing platforms to mRNA biomarker analysis. Methods. 2013;59:89–100.
25. Kralj JG, Salit ML. Characterization of in vitro transcription amplification linearity and variability in the low copy number regime using External RNA Control Consortium (ERCC) spike-ins. Anal Bioanal Chem. 2013;405:315–320.
26. Sanders R, Mason DJ, Foy CA, Huggett JF. Evaluation of Digital PCR for Absolute RNA Quantification. PLoS One. 2013;8:e75296.
27. Devonshire AS, Elaswarapu R, Foy CA. Evaluation of external RNA controls for the standardisation of gene expression biomarker measurements. BMC Genomics. 2010;11:662.
28. Mercer TR, Clark MB, Crawford J, Brunck ME, Gerhardt DJ, Taft RJ et al. Targeted sequencing for gene discovery and quantification using RNA CaptureSeq. Nat Protoc. 2014;9:989–1009.
29. Liao Y, Smyth GK, Shi W. The Subread aligner: fast, accurate and scalable read mapping by seed-and-vote. Nucleic Acids Res. 2013;41:e108–e108.
30. Law CW, Chen Y, Shi W, Smyth GK. voom: precision weights unlock linear model analysis tools for RNA-seq read counts. Genome Biol. 2014;15:R29.
31. Zhu Y, Li M, Sousa AM, Šestan N. XSAnno: a framework for building ortholog models in cross-species transcriptome comparisons. BMC Genomics. 2014;15:343.
32. Mohammadi P, di Iulio J, Muñoz M, Martinez R, Bartha I, Cavassini M et al. Dynamics of HIV Latency and Reactivation in a Primary CD4+ T Cell Model. PLoS Pathog. 2014;10:e1004156.
33. Shin H, Shannon CP, Fishbane N, Ruan J, Zhou M, Balshaw R et al. Variation in RNA-Seq Transcriptome Profiles of Peripheral Whole Blood from Healthy Individuals with and without Globin Depletion. PLoS One. 2014;9:e91041.
34. MacGregor DR, Gould P, Foreman J, Griffiths J, Bird S, Page R et al. HIGH EXPRESSION OF OSMOTICALLY RESPONSIVE GENES1 Is Required for Circadian Periodicity through the Promotion of Nucleo-Cytoplasmic mRNA Export in Arabidopsis. Plant Cell. 2013;25:4391–4404.
35. Hirsch CN, Foerster JM, Johnson JM, Sekhon RS, Muttoni G, Vaillancourt B et al. Insights into the Maize Pan-Genome and Pan-Transcriptome. Plant Cell. 2014;26:121–135.
36. Sheean ME, McShane E, Cheret C, Walcher J, Müller T, Wulf-Goldenberg A et al. Activation of MAPK overrides the termination of myelin growth and replaces Nrg1/ErbB3 signals during Schwann cell development and myelination. Genes Dev. 2014;28:290–303.
37. Malone JH, Cho DY, Mattiuzzo NR, Artieri CG, Jiang L, Dale RK et al. Mediation of Drosophila autosomal dosage effects and compensation by network interactions. Genome Biol. 2012;13:R28.
38. Jagya N, Varma SP, Thakral D, Joshi P, Durgapal H, Panda SK. RNA-Seq based transcriptome analysis of hepatitis E virus (HEV) and hepatitis B virus (HBV) replicon transfected Huh-7 cells. PLoS One. 2014;9(2):e87835.
39. Brennecke P, Anders S, Kim JK, Kołodziejczyk AA, Zhang X, Proserpio V et al. Accounting for technical noise in single-cell RNA-seq experiments. Nat Methods. 2013;10:1093–5.
40. Buettner F, Natarajan KN, Casale FP, Proserpio V, Scialdone A, Theis FJ et al. Computational analysis of cell-to-cell heterogeneity in single-cell RNA-sequencing data reveals hidden subpopulations of cells. Nat Biotechnol. 2015;33(2):155-60.
41. Jaitin DA, Kenigsberg E, Keren-Shaul H, Elefant N, Paul F, Zaretsky I et al. Massively Parallel Single-Cell RNA-Seq for Marker-Free Decomposition of Tissues into Cell Types. Science. 2014;343(6172):776-9.
42. Grün D, Kester L, van Oudenaarden A. Validation of noise models for single-cell transcriptomics. Nat Methods. 2014;11:637–640.
Figure Legends

Figure 1. Latin square plus pool design. Panel A shows a schematic design of the relative abundance of 95 unique ERCCs distributed into 5 subpools. Panel B shows the proportion of each subpool within each pool. Subpools B–E are mixed using a Latin square of proportions 40, 25, 15, and 10 percent, plus subpool A as an additional 10 percent component of each. Subpools A, B, C, D, and E are shaded black, white, light grey, medium grey, and dark grey, respectively. Refer to Table 1 for the target relative abundance of an ERCC used in the design of each pool.

Figure 2. ERCC signal response as a function of relative abundance in each of the four pools on the Illumina microarray platform. In Panel A, each line represents an individual ERCC, where grey = titrated, black = 1-to-1, red = outlier, and dashed line = background (average ERCC-00073). In Panel B, the centroid of each ERCC is plotted, where the red line corresponds to the fitted Langmuir model, open circles = within 99% CI, red circles = outliers, and dashed line = background. In Panel C, the slope of each ERCC is plotted, where the red line corresponds to the expected slope (first derivative of the Langmuir model), the vertical dotted lines correspond to the margins of the linear region (inflection points of the first derivative of the Langmuir model), open circles = monotonic ERCCs (ρ = 1), grey squares = non-monotonic, and red = outliers. Numbers in Panels B and C correspond to the last three digits of the Control ID in Table 3. In Panel D, each ERCC is represented on the Bland-Altman plot of Mix 1 vs Mix 3, where the red line corresponds to the ratio versus average intensity derived from the fitted Langmuir model, with outliers coded as in Panels B and C above.

Figure 3. ERCC signal response as a function of relative abundance in each of the four pools on the NIAID microarray platform. See Figure 2 legend.

Figure 4. ERCC signal response as a function of relative abundance in each of the four pools on the Agilent 1-color microarray platform. See Figure 2 legend.

Figure 5. ERCC signal response as a function of relative abundance in each of the four pools on the Agilent 2-color microarray platform. See Figure 2 legend.

Figure 6. ERCC signal response as a function of relative abundance in each of the four pools on the LifeTech NGS platform. In Panel A, each line represents an individual ERCC, where grey = titrated, black = 1-to-1, and red = outlier. Partially detected and undetected ERCCs are included at the bottom to indicate their targeted relative abundance. In Panel B, the centroid of each ERCC is plotted, where the red line corresponds to the linear fitted model, open circles = within 99% CI, and red circles = outliers. In Panel C, the slope of each ERCC is plotted, where open circles = monotonic ERCCs (ρ = 1), grey squares = non-monotonic, and red = outliers. Numbers in Panels B and C correspond to the last three digits of the Control ID in Table 3. In Panel D, each ERCC is represented on the Bland-Altman plot of Mix 1 vs Mix 3, with outliers coded as in Panels B and C above.
Figure 7. ERCC signal response as a function of relative abundance in each of the four pools on the Illumina NGS platform. See Figure 6 legend.

Supplemental Figure 1. Bland-Altman plot of each pair-wise pool comparison using the Illumina microarray platform. Symbols correspond to subpools A–E (see Fig. 1). Filled circles = A, open circles = B, open diamonds = C, open triangles = D, and open squares = E. The red line corresponds to the ratio versus average intensity derived from the fitted Langmuir model.

Supplemental Figure 2. Bland-Altman plot of each pair-wise pool comparison using the NIAID microarray platform. See Supplemental Figure 1 legend.

Supplemental Figure 3. Bland-Altman plot of each pair-wise pool comparison using the Agilent 1-color microarray platform. See Supplemental Figure 1 legend.

Supplemental Figure 4. Bland-Altman plot of each pair-wise pool comparison using the Agilent 2-color microarray platform. See Supplemental Figure 1 legend.

Supplemental Figure 5. Bland-Altman plot of each pair-wise pool comparison using the LifeTech NGS platform. Symbols correspond to subpools A–E (see Fig. 1). Filled circles = A, open circles = B, open diamonds = C, open triangles = D, and open squares = E.

Supplemental Figure 6. Bland-Altman plot of each pair-wise pool comparison using the Illumina NGS platform. See Supplemental Figure 5 legend.
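The ρ = 1 monotonicity criterion and the Bland-Altman coordinates used in these legends can be illustrated with a minimal Python sketch. It assumes that ρ is Spearman's rank correlation between target relative abundance and measured signal across the four pools (the legends do not name the statistic). All function names and signal values are hypothetical and serve only as an illustration.

```python
import math

def ranks(values):
    """Rank values 1..n (assumes distinct values; no tie handling)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

def spearman_rho(x, y):
    """Spearman rank correlation for untied data: 1 - 6*sum(d^2) / (n*(n^2 - 1))."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d2 / (n * (n * n - 1))

def bland_altman(signal_a, signal_b):
    """Return (average log2 signal, log2 ratio a/b) for one ERCC measured in two pools."""
    log_a, log_b = math.log2(signal_a), math.log2(signal_b)
    return (log_a + log_b) / 2, log_a - log_b

# Target relative abundance of a subpool B control across the four pools (Table 1).
target = [1, 1.5, 2.5, 4]
# Hypothetical linear-scale signals, invented for illustration only.
monotonic_signal = [110.0, 160.0, 270.0, 400.0]
non_monotonic_signal = [110.0, 160.0, 150.0, 400.0]

rho_ok = spearman_rho(target, monotonic_signal)       # 1.0: signal order follows target order
rho_bad = spearman_rho(target, non_monotonic_signal)  # below 1: two pools swap order
average, ratio = bland_altman(monotonic_signal[0], monotonic_signal[2])
```

Controls in subpool A have the same target in every pool, so a rank-based test of this kind would only apply to the titrated subpools B–E.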
Table 1. Distribution of ERCCs among pools and mixtures.

                     Target Relative Abundance
Control ID  Subpool  Pool12     Pool13     Pool14     Pool15     Note
ERCC-00073  N/A      0          0          0          0          Omitted
ERCC-00162  A        1          1          1          1
ERCC-00154  A        2          2          2          2
ERCC-00144  A        4          4          4          4
ERCC-00136  A        8          8          8          8
ERCC-00126  A        16         16         16         16
ERCC-00114  A        32         32         32         32         Non-SRM
ERCC-00108  A        64         64         64         64         Antisense
ERCC-00096  A        128        128        128        128
ERCC-00053  A        256        256        256        256
ERCC-00077  A        512        512        512        512
ERCC-00071  A        1,024      1,024      1,024      1,024
ERCC-00060  A        2,048      2,048      2,048      2,048
ERCC-00084  A        4,096      4,096      4,096      4,096
ERCC-00043  A        8,192      8,192      8,192      8,192
ERCC-00035  A        16,384     16,384     16,384     16,384
ERCC-00025  A        32,768     32,768     32,768     32,768
ERCC-00079  A        65,536     65,536     65,536     65,536
ERCC-00170  A        131,072    131,072    131,072    131,072
ERCC-00003  A        262,144    262,144    262,144    262,144
ERCC-00012  A        1,048,576  1,048,576  1,048,576  1,048,576
ERCC-00163  B        1          1.5        2.5        4
ERCC-00156  B        2          3          5          8
ERCC-00145  B        4          6          10         16
ERCC-00137  B        8          12         20         32
ERCC-00128  B        16         24         40         64
ERCC-00116  B        32         48         80         128        Antisense
ERCC-00109  B        64         96         160        256
ERCC-00097  B        128        192        320        512
ERCC-00085  B        256        384        640        1,024
ERCC-00078  B        512        768        1,280      2,048
ERCC-00171  B        1,024      1,536      2,560      4,096
ERCC-00054  B        2,048      3,072      5,120      8,192
ERCC-00044  B        4,096      6,144      10,240     16,384
ERCC-00039  B        8,192      12,288     20,480     32,768
ERCC-00028  B        16,384     24,576     40,960     65,536
ERCC-00019  B        32,768     49,152     81,920     131,072
ERCC-00061  B        65,536     98,304     163,840    262,144
ERCC-00013  B        262,144    393,216    655,360    1,048,576
ERCC-00002  B        1,048,576  1,572,864  2,621,440  4,194,304
ERCC-00164  C        1.5        2.5        4          1
ERCC-00157  C        3          5          8          2
ERCC-00147  C        6          10         16         4
ERCC-00138  C        12         20         32         8
ERCC-00130  C        24         40         64         16
ERCC-00117  C        48         80         128        32
ERCC-00111  C        96         160        256        64
ERCC-00098  C        192        320        512        128
ERCC-00086  C        384        640        1,024      256
ERCC-00004  C        768        1,280      2,048      512
ERCC-00074  C        1,536      2,560      4,096      1,024
ERCC-00057  C        3,072      5,120      8,192      2,048      Antisense
ERCC-00062  C        6,144      10,240     16,384     4,096
ERCC-00046  C        12,288     20,480     32,768     8,192
ERCC-00040  C        24,576     40,960     65,536     16,384
ERCC-00051  C        49,152     81,920     131,072    32,768
ERCC-00022  C        98,304     163,840    262,144    65,536
ERCC-00014  C        393,216    655,360    1,048,576  262,144    Antisense
ERCC-00018  C        1,572,864  2,621,440  4,194,304  1,048,576
ERCC-00165  D        2.5        4          1          1.5
ERCC-00158  D        5          8          2          3
ERCC-00148  D        10         16         4          6
ERCC-00142  D        20         32         8          12
ERCC-00131  D        40         64         16         24
ERCC-00120  D        80         128        32         48
ERCC-00099  D        160        256        64         96         Antisense
ERCC-00112  D        320        512        128        192
ERCC-00092  D        640        1,024      256        384
ERCC-00081  D        1,280      2,048      512        768
ERCC-00075  D        2,560      4,096      1,024      1,536
ERCC-00058  D        5,120      8,192      2,048      3,072
ERCC-00067  D        10,240     16,384     4,096      6,144
ERCC-00048  D        20,480     32,768     8,192      12,288
ERCC-00041  D        40,960     65,536     16,384     24,576
ERCC-00033  D        81,920     131,072    32,768     49,152
ERCC-00007  D        163,840    262,144    65,536     98,304
ERCC-00023  D        655,360    1,048,576  262,144    393,216
ERCC-00016  D        2,621,440  4,194,304  1,048,576  1,572,864
ERCC-00123  E        4          1          1.5        2.5
ERCC-00160  E        8          2          3          5
ERCC-00150  E        16         4          6          10
ERCC-00143  E        32         8          12         20
ERCC-00134  E        64         16         24         40
ERCC-00113  E        128        32         48         80         Re-assigned to Pool C
ERCC-00168  E        256        64         96         160
ERCC-00104  E        512        128        192        320
ERCC-00095  E        1,024      256        384        640
ERCC-00083  E        2,048      512        768        1,280
ERCC-00076  E        4,096      1,024      1,536      2,560
ERCC-00069  E        8,192      2,048      3,072      5,120
ERCC-00059  E        16,384     4,096      6,144      10,240     Antisense
ERCC-00031  E        32,768     8,192      12,288     20,480
ERCC-00042  E        65,536     16,384     24,576     40,960
ERCC-00034  E        131,072    32,768     49,152     81,920
ERCC-00009  E        262,144    65,536     98,304     163,840    Antisense
ERCC-00017  E        1,048,576  262,144    393,216    655,360
ERCC-00024  E        4,194,304  1,048,576  1,572,864  2,621,440
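
The pool composition in Table 1 can be reproduced from the Latin square described in the Figure 1 legend. The following Python sketch assumes a cyclic arrangement of the four proportions, which matches the tabulated values. The function and variable names are illustrative and are not taken from the original work.

```python
# Percent contributions rotated among subpools B-E (Figure 1 legend).
PROPORTIONS = [10, 15, 25, 40]
POOLS = ["Pool12", "Pool13", "Pool14", "Pool15"]
SUBPOOLS = ["B", "C", "D", "E"]
SUBPOOL_A_PERCENT = 10  # subpool A is a constant 10 percent of every pool

def latin_square(values):
    """Cyclic Latin square: row i is `values` rotated left by i positions."""
    n = len(values)
    return [[values[(i + j) % n] for j in range(n)] for i in range(n)]

# Map each titrated subpool to its percent contribution in Pool12..Pool15.
square = dict(zip(SUBPOOLS, latin_square(PROPORTIONS)))

def target_abundance(subpool, log2_base):
    """Target relative abundance in Pool12..Pool15 for a control whose
    lowest level is 2**log2_base (assumed scaling, consistent with Table 1)."""
    if subpool == "A":
        return [2 ** log2_base] * len(POOLS)
    lowest = min(PROPORTIONS)
    return [2 ** log2_base * p / lowest for p in square[subpool]]

# Every pool should total 100 percent: four titrated subpools plus subpool A.
totals = [sum(square[s][j] for s in SUBPOOLS) + SUBPOOL_A_PERCENT
          for j in range(len(POOLS))]

for subpool in SUBPOOLS:
    print(subpool, dict(zip(POOLS, target_abundance(subpool, 0))))
```

With `log2_base = 0` this yields the 1, 1.5, 2.5, 4 pattern of the first row of each titrated subpool, so any pair of pools differs by a known ratio for every control in subpools B–E while subpool A stays at 1-to-1.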
Table 2. Dynamic Range Coverage.

Platform           Units          Subset    ERCC-00073   Minimum       Maximum       Range
Illumina Bead      log2 signal    ERCC      5.81 ± 0.08  5.72 ± 0.07   13.88 ± 0.11  8.16 ± 0.13
Illumina Bead      log2 signal    BKGD [1]               5.35 ± 0.07   14.34 ± 0.08  8.99 ± 0.10
NIAID In-house     log2 signal    ERCC      5.53 ± 0.02  5.53 ± 0.03   15.83 ± 0.38  10.30 ± 0.39
NIAID In-house     log2 signal    BKGD                   5.20 ± 0.01   15.26 ± 0.27  10.06 ± 0.27
Agilent One-color  log2 signal    ERCC      2.62 ± 0.14  2.57 ± 0.16   20.73 ± 0.18  18.16 ± 0.24
Agilent One-color  log2 signal    BKGD                   2.41 ± 0.10   20.66 ± 0.06  18.25 ± 0.12
Agilent Two-color  log2 signal    ERCC      2.57 ± 0.06  2.40 ± 0.06   18.37 ± 0.10  15.98 ± 0.12
Agilent Two-color  log2 signal    BKGD                   4.40 ± 0.15   20.00 ± 0.10  15.60 ± 0.19
Illumina HiSeq     log2 FPKM      ERCC      undetected   -4.98 ± 0.67  14.58 ± 0.27  19.56 ± 0.72
Illumina HiSeq     log2 FPKM      BKGD                   -6.34 ± 0.40  18.27 ± 0.05  24.61 ± 0.40
LifeTech SOLiD     log2 RPKM [2]  ERCC      undetected   -3.26 ± 0.38  16.47 ± 0.34  19.73 ± 0.51
LifeTech SOLiD     log2 RPKM [2]  BKGD                   -6.64 ± 0.00  17.30 ± 0.35  23.94 ± 0.35

[1] All transcripts measured in the total human liver RNA background.
[2] Minimum RPKM value reported is truncated at 0.01 for all replicates.
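
As a worked check of the dynamic range table, the Range column is the difference between the Maximum and Minimum columns. The reported uncertainties on the range also appear consistent, to within about 0.01, with combining the two uncertainties in quadrature; this is an inference from the tabulated numbers, not a method stated in the source. The helper name below is hypothetical.

```python
import math

def dynamic_range(minimum, maximum):
    """Each argument is (mean, sd) in log2 units.
    Returns (maximum - minimum, quadrature sum of the two sds).
    The quadrature rule is an assumption inferred from the table values."""
    (low, low_sd), (high, high_sd) = minimum, maximum
    return high - low, math.hypot(low_sd, high_sd)

# Illumina bead array, ERCC subset: minimum 5.72 ± 0.07, maximum 13.88 ± 0.11.
span, span_sd = dynamic_range((5.72, 0.07), (13.88, 0.11))
print(f"{span:.2f} ± {span_sd:.2f}")  # prints "8.16 ± 0.13"

# An RPKM floor of 0.01 corresponds to log2(0.01), about -6.64.
rpkm_floor_log2 = math.log2(0.01)
print(f"{rpkm_floor_log2:.2f}")  # prints "-6.64"
```

The second value matches the LifeTech background minimum of -6.64 with zero spread, as expected if every replicate sits at the truncation floor.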
Table 3. ERCC outliers grouped by performance criteria.

Column groups: Response Curve Outliers and Non-monotonic, each subdivided by platform (ILM, NIAID, LifeTech, ILM HiSeq, AGL).

Controls     Log2 Target Relative Abundance  Subpool  Response Curve Outliers / Non-monotonic
ERCC-00156*  1   B  ✓ ✓
ERCC-00147   2   C  ✓ ✓ ✓
ERCC-00148*  2   D  ✓ [1,2] ✓ ✓ [1] ✓ ✓
ERCC-00137   3   B  ✓ [2] ✓ [1] ✓
ERCC-00143*  3   E  ✓ [1] ✓ [1]
ERCC-00131*  4   D  ✓ [1] ✓ ✓ [1,2]
ERCC-00134*  4   E  ✓ [1] ✓ [1,2]
ERCC-00168*  6   E  ✓ [1,2] ✓ ✓ [2] ✓ ✓
ERCC-00095   8   E  ✓ ✓
ERCC-00157   1   C  ✓
ERCC-00158   1   D  ✓ [1]
ERCC-00160   1   E  ✓ [1]
ERCC-00145   2   B  ✓ [1]
ERCC-00150   2   E  ✓ [1]
ERCC-00138*  3   C  ✓ [1,2] ✓ ✓
ERCC-00142   3   D  ✓ [2] ✓
ERCC-00128   4   B  ✓ [2] ✓
ERCC-00111   6   C  ✓
ERCC-00097   7   B  ✓
ERCC-00098   7   C  ✓ ✓
ERCC-00104   7   E  ✓
ERCC-00086   8   C  ✓ ✓
ERCC-00062*  12  C  ✓
ERCC-00046*  13  C  ✓
ERCC-00162   0   A  ✓ [1]
ERCC-00126   4   A  ✓
ERCC-00113*  5   C  ✓ [2] ✓ ✓
ERCC-00117   5   C  ✓
ERCC-00120   5   D  ✓
ERCC-00109   6   B  ✓ ✓ ✓
ERCC-00077   8   A  ✓
ERCC-00081   9   D  ✓
ERCC-00060   10  A  ✓
ERCC-00075   10  D  ✓
ERCC-00171   10  B  ✓
ERCC-00054   11  B  ✓
ERCC-00058*  11  D  ✓ [1,2] ✓ ✓ ✓
ERCC-00069   11  E  ✓ ✓
ERCC-00044   12  B  ✓ ✓
ERCC-00025   14  A  ✓
ERCC-00028   14  B  ✓
ERCC-00040   14  C  ✓
ERCC-00042   14  E  ✓ ✓
ERCC-00007   16  D  ✓
ERCC-00022   16  C  ✓ [1,2]
ERCC-00170*  16  A  ✓ [2] ✓ ✓ ✓
ERCC-00023   18  D  ✓ [1,2]
ERCC-00002   20  B  ✓
ERCC-00012   20  A  ✓
ERCC-00024   20  E  ✓ [1,2]

[1] Agilent 1-color data. [2] Agilent 2-color data. * Discussed further in main text.
Note: The following analytes were incorrectly prepared as their antisense sequence and omitted from the data analysis: ERCC-00009, ERCC-00014, ERCC-00057, ERCC-00059, ERCC-00099, ERCC-00108, and ERCC-00116.
Figure 1. Panels A and B. Axis labels: Log2 Relative Abundance, Percent of Mixture. Key: Subpools A, B, C, D, E.
Figure 2 (Illumina bead). Panels A, B, C, D. Axis labels: Log2 Signal, Average Log2 Signal, Slope, Log2 Ratio, Relative Abundance, Average Relative Abundance.
Figure 3 (NIAID). Panels A, B, C, D. Axis labels: Log2 Signal, Average Log2 Signal, Slope, Log2 Ratio, Relative Abundance, Average Relative Abundance.
Figure 4 (Agilent 1-color). Panels A, B, C, D. Axis labels: Log2 Signal, Average Log2 Signal, Slope, Log2 Ratio, Relative Abundance, Average Relative Abundance.
Figure 5 (Agilent 2-color). Panels A, B, C, D. Axis labels: Log2 Signal, Average Log2 Signal, Slope, Log2 Ratio, Relative Abundance, Average Relative Abundance.
Figure 6 (LifeTech). Panels A, B, C, D. Axis labels: Log2 Signal, Average Log2 Signal, Slope, Log2 Ratio, Relative Abundance, Average Relative Abundance. Additional labels: Partial, Undetected.
Figure 7 (Illumina NGS). Panels A, B, C, D. Axis labels: Log2 Signal, Average Log2 Signal, Slope, Log2 Ratio, Relative Abundance, Average Relative Abundance. Additional labels: Partial, Undetected.
Supplemental Figure 1 (ILMN). Panels: Pool12/Pool13, Pool14/Pool15, Pool12/Pool14, Pool13/Pool15, Pool15/Pool12, Pool13/Pool14. Axis labels: Average Log2 Signal, Log2 Ratio.
Supplemental Figure 2 (NIAID). Panels: Pool12/Pool13, Pool14/Pool15, Pool12/Pool14, Pool13/Pool15, Pool15/Pool12, Pool13/Pool14. Axis labels: Average Log2 Signal, Log2 Ratio.
Supplemental Figure 3 (AGL-1). Panels: Pool12/Pool13, Pool14/Pool15, Pool12/Pool14, Pool13/Pool15, Pool15/Pool12, Pool13/Pool14. Axis labels: Average Log2 Signal, Log2 Ratio.
Supplemental Figure 4 (AGL-2). Panels: Pool12/Pool13, Pool14/Pool15, Pool12/Pool14, Pool13/Pool15, Pool15/Pool12, Pool13/Pool14. Axis labels: Average Log2 Signal, Log2 Ratio.
Supplemental Figure 5 (LifeTech). Panels: Pool12/Pool13, Pool14/Pool15, Pool12/Pool14, Pool13/Pool15, Pool15/Pool12, Pool13/Pool14. Axis labels: Average Log2 Signal, Log2 Ratio.
Supplemental Figure 6 (ILMN). Panels: Pool12/Pool13, Pool14/Pool15, Pool12/Pool14, Pool13/Pool15, Pool15/Pool12, Pool13/Pool14. Axis labels: Average Log2 Signal, Log2 Ratio.