Accurate sampling and deep sequencing of the HIV-1protease gene using a Primer IDCassandra B. Jabaraa,b,c, Corbin D. Jonesa,d, Jeffrey Roache, Jeffrey A. Andersonb,c,f,1, and Ronald Swanstromb,c,g,2
aDepartment of Biology, bLineberger Comprehensive Cancer Center, cUniversity of North Carolina Center for AIDS Research, dCarolina Center for GenomeSciences, eResearch Computing Center, fDivision of Infectious Diseases, and gDepartment of Biochemistry and Biophysics, University of North Carolina, ChapelHill, NC 27599
Edited by John M. Coffin, Tufts University School of Medicine, Boston, MA, and approved November 8, 2011 (received for review June 24, 2011)
Viruses can create complex genetic populations within a host, anddeep sequencing technologies allow extensive sampling of thesepopulations. Limitations of these technologies, however, potentiallybias this sampling, particularly when a PCR step precedes the se-quencing protocol. Typically, an unknown number of templates areused in initiating the PCR amplification, and this can lead to un-recognized sequence resampling creating apparent homogeneity;also, PCR-mediated recombination can disrupt linkage, and differ-ential amplification can skew allele frequency. Finally, misincorpora-tion of nucleotides during PCR and errors during the sequencingprotocol can inflate diversity. We have solved these problems byincluding a random sequence tag in the initial primer such that eachtemplate receives a unique Primer ID. After sequencing, repeatedidentification of a Primer ID reveals sequence resampling. These re-sampled sequences are then used to create an accurate consensussequence for each template, correcting for recombination, allelicskewing, and misincorporation/sequencing errors. The resultingpopulation of consensus sequences directly represents the initialsampled templates. We applied this approach to the HIV-1 protease(pro) gene to view the distribution of sequence variation of a com-plex viral population within a host. We identified major and minorpolymorphisms at coding and noncoding positions. In addition, weobserved dynamic genetic changes within the population during in-termittent drug exposure, including the emergence of multiple re-sistant alleles. These results provide an unprecedented view of acomplex viral population in the absence of PCR resampling.
drug resistance | genetic diversity | high throughput sequencing | HIV |population dynamics
High throughput sequencing allows the acquisition of largeamounts of sequence data that can encompass entire
genomes (1–4). With sufficient amounts of starting DNA, PCR isnot needed before the library preparation step of the sequencingprotocol. Sequencing miscalls inherent in high throughput se-quencing approaches are resolved using multiple reads over agiven base.Deep sequencing can also capture the genetic diversity of viral
populations (5–10), including intrahost populations derived fromclinical samples. This approach offers the opportunity to viewpopulation diversity and dynamics and viral evolution in un-precedented detail. One place where the presence of minorvariants is of immediate practical importance is in the detectionof drug-resistant variants. Standard bulk sequencing methodstypically miss allelic variants below 20% in frequency within apopulation (11, 12). Alternative assays can detect less abundantvariants that confer drug resistance, but require a priori selectionof sites and variants (13–23). Thus, deep sequencing approachesoffer the opportunity to identify minor variants associated withresistance de novo with the goal of understanding their role intherapy failure.Although screening for drug-resistant variants is a practical
application of the deep sequencing technology, this technologyalso addresses broader questions of sequence diversity andstructure for a complex population like HIV-1. However, therelatively high sequencing error rates of these technologiesartificially increase genetic diversity, which confounds the detec-
tion of natural genetic variation especially when sequencinga highly heterogeneous viral population. Moreover, the use ofPCR to amplify the amount of material before starting the se-quencing protocol adds the potential for several serious artifacts(24–27): First, nucleotide misincorporation by the polymeraseduring many rounds of amplification artificially increases se-quence diversity; second, artifactual recombination during am-plification occurs when premature termination products primea subsequent round of synthesis, which can obscure the linkage oftwo sequence polymorphisms (28, 29); third, differential amplifi-cation can skew allelic frequencies; and fourth, PCR amplificationcan create a significant mass of DNA from a small number ofstarting templates, which obscures the true sampling of the orig-inal population as these few starting templates/genomes get re-sampled in the PCR product, creating sequence resampling ratherthan the observation of independent genomes (30). Overall, thesebiases artificially decrease true diversity while introducing arti-factual diversity and also skew allelic frequencies, which can leadto incongruence between the real and observed viral populations.Most investigators use statistical tools to attempt to control forthe types of sequencing errors that are associated with each se-quencing platform.To make deep sequencing useful for complex populations, it is
necessary to overcome PCR resampling, which is mistaken forsampling of the original population, and PCR and sequencingerrors, which can be mistaken for diversity. As nucleotide mis-incorporation is largely random across sites and templateswitching/recombination is more likely to occur in the later cyclesof a PCR (31), strategies that create a bulk or consensus se-quence for each sampled template will call the correct base ateach position. One approach to sampling highly heterogeneouspopulations, such as the HIV-1 env gene, is through endpointdilution titration of the template before nested PCR, such thata single template is present in each PCR amplification (32–35).In addition to masking the misincorporations, PCR-mediatedrecombination produces recombinant templates identical to theparental sequence. Although highly accurate, this technique islabor-intensive and, as population sampling is dependent on thenumber of templates sequenced, this methodology does not lenditself to the identification of minor variants or to understandingthe structure of a complex population, nor is it easily adaptableto a high throughput approach.We have developed a high throughput technique for directly
resolving the genetic diversity of a viral population. This tech-nique avoids the recording of PCR and sequencing errors that
Author contributions: C.B.J., C.D.J., J.A.A., and R.S. designed research; C.B.J. and J.A.A.performed research; C.B.J., C.D.J., and J.R. contributed new reagents/analytic tools; C.B.J.,C.D.J., J.R., J.A.A., and R.S. analyzed data; and C.B.J., C.D.J., and R.S. wrote the paper.
The authors declare no conflict of interest.
This article is a PNAS Direct Submission.1Present address: Discovery Medicine-Virology, Discovery Medicine and Clinical Pharma-cology, 311 Pennington-Rocky Hill Rd., 8A-1.14, Bristol-Myers Squibb, Pennington,NJ 08543.
2To whom correspondence should be addressed. E-mail: [email protected].
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1110064108/-/DCSupplemental.
20166–20171 | PNAS | December 13, 2011 | vol. 108 | no. 50 www.pnas.org/cgi/doi/10.1073/pnas.1110064108
Dow
nloa
ded
by g
uest
on
May
30,
202
0
create artificial diversity, and corrects for artificial allelic skewingand PCR resampling, revealing the original genomes in thepopulation. This is accomplished by embedding a degenerateblock of nucleotides within the primer used in the first round ofcDNA synthesis. This creates a random library of sequenceswithin the primer population. As primers are individually usedout of this library, each viral template is copied such that thecomplement (cDNA) now includes a unique sequence tag, orPrimer ID. This Primer ID is carried through all of the sub-sequent manipulations to mark all sequences that derive fromeach independent templating event, and PCR resampling thenbecomes over-coverage for each template to create a consensussequence of that template. Using this approach, we were able todirectly remove error, correct for PCR resampling, and capturethe fluctuation of minor variants in the viral population within ahost. We also resolved minor drug-resistant variants below 1% infrequency before the initiation of antiretroviral therapy, andwere able to correlate these variants with the emergence of drugresistance. The value of this strategy and its applicability to otherdeep sequencing protocols has been further emphasized througha recent parallel effort by Kinde et al. in short read sequencing ofhuman genomic material (36).
ResultsA cDNA Synthesis Primer Containing a Primer ID Can Be Used to TrackIndividual Viral Templates. A population of cDNA synthesis pri-mers was designed to prime DNA synthesis downstream of theHIV-1 protease (pro) gene, with the primer containing two ad-ditional blocks of identifying information (Fig. 1A). The firstblock was a string of eight degenerate nucleotides that created65,536 distinct sequence combinations (48), or Primer IDs. Thisregion was flanked by an a priori selected three nucleotide bar-code, creating a sample identification block so that multiplesamples could be pooled together in a sequencing run (7). Adesigned sequence at the 5′ end of the cDNA primer was used forsubsequent amplification of the cDNA sequences by nested PCR.Viral RNA was extracted from three longitudinal blood plasma
samples from an individual infected with subtype B HIV-1 whowas participating in a protease inhibitor efficacy trial (M94-247)
(ref. 37; Fig. S1). Approximately 10,000 copies of viral RNA fromeach sample were used in a reverse transcription reaction forcDNA synthesis and tagging using the Primer ID. The cDNAproduct was separated from the unused cDNA primers, and thenthe viral sequences were amplified by nested PCR and sequencedon the 454 GS FLX Titanium. Our data were distilled from totalreads of 20,429, 24,658, and 27,075 for the three time points (T1,T2, and T3, respectively). Raw sequence reads were assessed forthe cDNA tagging primer and a full length pro gene sequence(297 nucleotides long representing 99 codons), and when three ormore sequences within a sample contained an identical PrimerID, a consensus sequence was formed to represent one sequence/genome in the population (Fig. 1 B and C and Fig. S2).With these manipulations we generated 857, 1,609, and 2,213
consensus sequences, respectively, for the three time points (Fig.1C). The median number of reads per Primer ID was 6, rangingfrom 1 to 96 (Fig. S3A). The distribution of identical Primer IDsdid not form a normal distribution as would be expected if alltemplates were amplified equally. We saw a higher than expectednumber of single reads of Primer IDs; although we do not knowthe reason for this, such a result is consistent with different cDNAtemplates entering the PCR at different cycles. Because eachtemplate is individually tagged the different number of reads is anindication of allelic skewing, as noted this can be nearly 100-fold.In an analysis of a number of low abundant variants we saw a 20-fold range of representation through allelic skewing, with half ofthe variants up to 2- to 3-fold more abundant than the mean, andthe other half up to 5- to 10-fold less abundant (Fig. S4).We conservatively estimate the combined in vitro error rate of
the cDNA synthesis step by reverse transcriptase (RT) and thefirst strand synthesis by the Taq polymerase to be on the order of1 mutation in 10,000 bases, or approximately one mutation per33 pro gene sequences, based on an RT error rate of 1 in 22,000nucleotides (38) and a Taq polymerase error rate of 1.1 in 10,000nucleotides (39) but reduced by half because only the first roundof synthesis is relevant and a misincorporation at this step gives amixture. Later rounds of Taq polymerase errors should be largelylost through the creation of the consensus sequence. Thus, wewould expect 139 sequence misincorporations to be present inthe data set of 4,679 total sequences representing T1+T2+T3,and with an excess of transitions. These would be expected tooccur as 113 single copy single-nucleotide polymorphisms(SNPs) and 13 SNPs that appeared twice. We observed 98 singlecopy SNPs in the data set with a threefold excess of transitions,and with three-fourths of them being coding changes, which isconsistent with random mutations. We expect there to be lowfrequency SNPs in the viral population from rare but persistentvariants that are fortuitously sampled, and from the intrinsicerror rate of viral replication (the error rate during one round ofviral replication would represent approximately one mutationper 150 pro gene sequences; ref. 24). However, we cannot dis-tinguish real polymorphisms from the inferred background errorrate associated with the first and second rounds of in vitro DNAsynthesis. Thus, we have limited the analysis of population di-versity to SNPs that appeared at least twice in the data set (i.e.,linked to at least two separate Primer IDs), either at the sametime point or at multiple time points in the overall data set(Table S1). We have not corrected the data set for the presumed13 SNPs that appeared twice that are expected to be present dueto error even though this represents 33% of all of the SNPs thatappeared twice (13 of 39). Overall, 80% of the SNPs (i.e., anysequence change from the consensus that appeared at leastonce) in the total data set of 72,162 sequence reads were re-moved as error. Also, 60–65% of the sequence reads wererevealed as resampling. Finally, allelic skewing of up to nearly100 fold was corrected (Fig. S4).
Longitudinal Sequencing of the HIV-1 Protease (pro) Gene in anUntreated Individual Reveals Dynamic Changes in Genetic Variation.We analyzed the sequences of the pro gene populations to assessallelic frequency at the two sampled time points, separated by 6mo
pro pol
NNNNNNNNNNNNNNN NNNNNNNNNNN NN NN NN NN NNNNNN BARBARBARreverse complement Primer ID Barcode PCR priming site
vRNA 3’
primer 5’A
B CPrimer ID Barcode
CATAATAC TAGCATAATAC TAG
CATAATAC TAGCATAATAC TAG
CATAATAC TAG
CATAATAC TAG
CATAATAC TAG
Consensus sequence
Raw sequence readsSample
Ritonavir
Totalreads
Consensussequences
T1
-
20,429
857
T2
-
24,658
1,609
T3
+
27,075
2,213
Fig. 1. Tagging viral RNA templates with a Primer ID before PCR amplifica-tion and sequencing allows for direct removal of artifactual errors and iden-tifies resampling. (A) A primer was designed to bind downstream of theprotease coding domain. In the 5′ tail of the primer, a degenerate string ofeight nucleotides created a Primer ID, allowing for 65,536 unique combina-tions. An a priori selected three nucleotide barcode was designed for thesample ID. Finally, a heterologous string of nucleotides with low affinity tothe HIV-1 genome was included in the far 5′ end for use as the priming site inthe PCR amplification. (B) PCR biases and sequencing error are introducedduring amplification and sequencing of viral templates. Repetitive identifica-tion of the barcode and Primer ID allow for tracking of each templating eventfrom a single tagged cDNA. As errors are minor components within the PrimerID population, forming a consensus sequence directly removes them, andcorrects for PCR resampling. (C) HIV-1 RNA templates isolated from plasmasamples from two pre- and one postintermittent ritonavir drug therapy weretagged, amplified, and deep sequenced. Tagged sequences containing full-length protease were used to create a population of consensus sequenceswhen at least three sequences contained an identical barcode and Primer ID.
Jabara et al. PNAS | December 13, 2011 | vol. 108 | no. 50 | 20167
MICRO
BIOLO
GY
Dow
nloa
ded
by g
uest
on
May
30,
202
0
and before ritonavir (37) drug selection (Fig. S1). The combinedsequence population from the two time points (T1 and T2) beforetherapy consisted of 492 unique pro gene sequences with 155SNPs. About 4% (i.e., 21) of these unique gene sequences wereabove 0.5% abundance, and these 21 unique gene sequencesrepresented 67% of all sampled genomes, with the genome rep-resenting the overall consensus sequence comprising 21% of thetotal population (Fig. S5 A and B). The relatively small number ofunique gene sequences above 0.5% frequency in the populationcontained only 7% of the 155 detected SNPs. Thus, a large pro-portion of the viral population’s diversity was associated witha large number of pro gene sequences that were present at lowabundance (Fig. S5 A and C); conversely, the majority of thepopulation consisted of a small number of SNPs. Similarly, Taji-ma’s D statistic for T1 and T2 in this individual were −2.35 and−2.31, respectively (Table S2), indicative of a population structurethat has an excess of low frequency polymorphisms. This pattern isconsistent with but more extreme than that observed in a priorshallow intrahost survey in which a metapopulation model wasproposed to explain the pattern of Tajima’s D statistic (40). Fig. 2shows the encoded amino acid variability and synonymous nucle-otide variability present in two or more individual genomes acrossthe 99 codons in the pro gene for these samples.Synonymous variability. There were 57 codons (with 63 variants/SNPs) that contained synonymous diversity that appeared inboth pretherapy time points, and 30 codons (with 31 variants)that appeared in only one time point. Taken together, 75 of the99 codons contained some level of synonymous diversity (Fig. 2and Table S1). Of the 63 variants that were present in bothuntreated time points, 92% were transitions. Of the 31 variantsthat appeared in only one of the time points, 71% were tran-sitions, representing a significantly smaller fraction of transitionsthan among the synonymous variants that appeared at both timepoints (P = 0.012; Fisher’s exact test). This suggests that syn-onymous transversions are selected against over time.Nonsynonymous variability. There were 26 codons (28 variants) thatcontained coding variability that appeared in both pretherapytime points, and an additional 28 codons (33 variants) withnonsynonymous changes found in only one of the time points.Taken together, 49 of the 99 codons contained some level ofnonsynonymous diversity (Fig. 2 and Table S1). For the 28nonsynonymous variants detected at both time points, 22 weretransitions, and these mostly represented conservative aminoacid changes. In the case of synonymous mutations two-thirds ofthe variants were present at both time points, whereas in the caseof nonsynonymous mutations, less than half were present at bothtime points (P = 0.012; Fisher’s exact test). This observation
suggests that, at this level of sequence sampling, we are able tosee a difference in stability within the population in comparingsynonymous and nonsynonymous substitutions.Genetic fluctuation. We compared the stability of minor SNPspresent at both T1 and T2. A total of 14 of the 91 SNPs (syn-onymous and nonsynonymous that appeared at both time points)had significant changes in abundance between the two timepoints (χ2 test with a false discovery rate of 0.05). Of the 14 SNPswith significant changes in abundance, 11 had a decrease in theabundance, with an average decrease around 7.5-fold. Therewere three SNPs that had a significant increase in abundance, allof which were synonymous, ranging from a 4- to 47-fold increase.Although a majority of SNPs that changed in abundance hada decrease in the frequency between T1 and T2, on a populationlevel, there was not a large change in diversity between the twotime points (T1 π = 0.0080, T2 π = 0.0079; Table S2). However,the trend of increased abundance at the three sites may be drivenby selection of cryptic epitopes in an alternative reading frame(see Discussion).Significance of rare variants. We observed two extremes in terms ofbiological relevance in the untreated population among variantsdetected as at least two independent sequences across the threetime points. At one extreme was the detection of nonviablegenomes in the form of a coding variant at position 25, whichmutates the active site of the protease, and the detection of ter-mination codons at positions 42 and 61 (Table S1). At the otherextreme was the detection of the L90M and V82A variants (attime points 1 and 2, respectively) that became the major resistancepopulations after ritonavir therapy was initiated (see below, Fig.3); in addition, V82I and V82L were detected at T2. We foundtwo more examples of primary resistance mutations at low abun-dance, K20R at all three time points and M46I at two time points,but these did not grow out in the presence of ritonavir (Fig. 3 andTable S1). Similarly, fitness compensatory mutations were alsodetected at low abundance (L10F, M36I, L63P, A71T, and V77I),all below 1%, and only L63P increased (modestly) in abundanceafter exposure to ritonavir. More generally, of the 28 substitutionsmost closely associated with protease inhibitor drug resistance (41,42), we found 10 such variants, half of which were detected at bothpretherapy time points (Table S1).
Assessment of Linkage Disequilibrium (LD) Within the HIV-1 pro GenePopulation. We measured LD for the sequences in the T1 and T2populations. We identified very few examples of LD at these twotime points using the Fisher’s exact test with a Bonferroni cor-rection. Of the 103 polymorphic sites in T1, only three pairs werein significant LD. Similarly, in T2 with 118 polymorphic sites, only
+ * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33
+ * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99
0.1 1
10
0.1 1
10
+ * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r + * s r 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66
0.1 1
10
0.1 1
10
NS
S
NS
S
S
NS
Fig. 2. Frequency of codon variation across all 99 positions in protease over three time points. Within a codon position, the first two bars represent untreatedtime points 1 and 2, respectively. Bars 3 and 4 are the third time point split based on the presence or absence of the resistance mutations to ritonavir. Bar 3 isthe population of susceptible genotypes (defined as not V82A, I84V, or L90M), and bar 4 is the major resistant variant, V82A, population. Upward facing barsare nonsynonymous changes (scale in regular typeface), and downward facing bars are synonymous changes (scale in bolded typeface). Within a codonposition, different shading represents different SNPs.
20168 | www.pnas.org/cgi/doi/10.1073/pnas.1110064108 Jabara et al.
Dow
nloa
ded
by g
uest
on
May
30,
202
0
four pairs displayed significant LD. A positive D (i.e., linkage) wasfound for six of the seven pairs in the untreated populations, withone pair associating at a lower than expected frequency. Overall,LD did not appear to play a significant role in defining the pro genepopulation in this late stage individual, with only a single pair ofSNPs showing linkage in both of the time points.
Detection of Multiple Drug-Resistant Alleles After Exposure to Selec-tion by a Protease Inhibitor. The third plasma sample we examinedfrom this subject was from a time point (T3) after the initiationof therapy with the protease inhibitor ritonavir. It is apparentfrom the cyclical pattern of viral load and self-report that thisperson had incomplete adherence to the drug regimen (Fig. S1).Thus, we expected selective pressure from the drug to disrupt theviral population but not to select for the more homogeneouspopulations that are associated with virologic failure solely dueto the appearance of drug resistance. The choice of this sampleallowed us to look at the evolution of resistance and the per-sistence of polymorphisms in both the resistant and nonresistantportions of the population. Over two-thirds of the sequencesfrom T3 carried a resistance mutation, with ∼50% of thesequences carrying the V82A allele, the most common resistancemutation associated with resistance to ritonavir (43).There were two divergent paths for population diversity at the
third time point. For the large V82A-containing populationthere was a general trend of decreased diversity (π = 0.0069),consistent with the expected bottleneck associated with fixing adrug resistance mutation. In contrast, the diversity in the coex-isting drug sensitive population was higher than the drug-re-sistant population and comparable to the earlier time points (π=0.0082; Table S2).Although V82A is the most common resistance mutation asso-
ciatedwith ritonavir resistance, the I84V allele andL90Mallele canalso be selected and in combination with V82A can confer a higherlevel of resistance (44). We detected all three of these distinct drugresistance alleles in the T3 sequence population, collectively rep-resenting 69% of the total T3 population: V82A (50% of the
population), I84V (5%), and L90M (14%). These three resistancemutations appeared on different genomes, with only a single ex-ample of a sequence with two of these resistance mutations (V82A/L90M). In total, there were 136 unique sequences carrying theV82A mutation (all with the GCC Ala codon), 29 uniquesequences carrying the I84V mutation (all with the GTA Val co-don), and 36 unique sequences carrying the L90M mutation.There were also small groups of pro gene sequences in T3 that
appear to be the result of selection by ritonavir. Two othersubstitutions at position 82, V82I and V82L, were detected at alow level at T2 and also seen at T3, but now representing 1.3%and 1.1% of the population. V82F was also detected as 0.14% ofthe population at T3. Finally, the compensatory mutation L63Pwas detected at T1 and modestly expanded at T3, with half of thesequences in the V82A background (Table S2).An important issue is the number of times each of the resistance
mutations evolved in the presence of drug selection. The data areconsistent with the major V82A variant (42% of the V82 sequen-ces) growing out from the preexisting variant detected at T2. Forthe six genomic variants of V82A that each accounted for greaterthan 2.5% of the V82A population, all were on the background ofthe consensus except for the three different polymorphisms atpositions 19 and 70 (Fig. S6B). In total, these represented∼71% oftheV82A population and presumably arose via recombination withthe founding sequence (Fig. S6A). The remaining 29% of theV82A-containing genomes vary in relative abundance from 2.3%to 0.1%, including over 100 unique sequences that each appearedonce but to a large extent represent the variation seen at T1 and T2added on to the predominant V82A genotypes.The composition of the I84V and L90M populations were
similar to the V82A population. In each case there was a pre-dominant population defined by a 5′ polymorphism: the majorL90M lineage (69% of the L90M sequences) was on the G16G/L19V background (Fig. S6 C and D), whereas the major I84Vlineage (35% of the I84V sequences) was on the consensus se-quence background for the 5′ polymorphisms (G16/L19) (Fig. S6E and F). The next three most abundant I84V lineages, repre-senting 28% of the I84V sequences, differed from the mostabundant sequence by other 5′ polymorphisms (Fig. S6F). Sim-ilarly, the next three most abundant L90M lineages, representing14% of the L90M sequences, differed from the most abundantL90M sequence by 5′ polymorphisms (Fig. S6D). With the ex-ception of the 5′ polymorphisms and the resistance mutations, alleight of these lineages were in the consensus sequence back-ground. The remaining sequences are accounted for by the lowlevel variability added onto these major lineages.As noted above, the major V82A lineage was detected at T2 (as
a single genome), and this population was likely clonally amplifiedto form the large proportion of the drug-resistant population seenat T3 (Fig. 3). L90M was also detected on the same pro genebackground in the therapy-naïve environment at T1, and waslikely also clonally amplified to form the large proportion of theL90M sequences (Fig. 3 and Fig. S6D). In contrast, V82I andV82L were detected in the pretherapy time points on backgroundsequences that did not become the predominant sequence whenthese mutation modestly expanded at T3, although these twopopulations have complex mixtures of the 5′ polymorphisms,which may indicate low level persistence and recombinationduring the period of drug exposure. Finally, I84V and V82F werenot detected in either pretherapy population (Table S1).
DiscussionComplex viral populations can form within a host (45–47). Highthroughput sequencing technologies allow for extensive samplingof these populations (1–3, 5, 22, 48). However, these technologiesare severely limited when a PCR amplification precedes the se-quencing protocol, as each sequence read has the potential to bereported as an independent observation without properly con-trolling for PCR resampling, PCR-mediated recombination, al-lelic skewing, PCR-introduced misincorporations, and sequencingerrors. When working with pathogenic agents in clinical samples,
T1T2T3
V82
V82A
L90M
I84V
V82I/L/F
0.0030
TGG
TGTG
A_13_T3_V82A_G
ACC
CC
AT_15_T3_V82A_
AATGCCGC_10_T3_V82A_
ACGTTTAC_39_T2
TCGAAATA_21_T3_L90M_
CTCCCAGA_17
_T3_V82A
_
TGTC
TCTA
_35_
T2
CCCGATAT_29_T2
TGC
TAACA_15_T3_V82A_
TCTC
ACCC
_30_
T1
TCTGAAGC_29_T1
TCC
ATACA_15_T3_V82A_
GG
CTAA
GG
_17_
T1
CGACCGGC_7_T2
GGACATTT_12_T2
CACG
CCGT
_22_
T1
TTGG
TCCG
_42_
T1
CCGGTTCG_44_T1
TGAT
CCTA
_40_
T1
CAGCCCCG_19_T1
AACGTTCG_7_T2
TCGC
AGAT
_51_
T1
CTGT
CAAC
_23_
T2
TGCGTCTC_19_T3_V82A_
GAAATAGT_14_T3_L90M_
GG
ATTCAC_9_T3_V82A_
TCGCGCCG_13_T2
AAGGACTT_11_T3
AAGACCAG
_11_T3_V82A_
ACTATAAC_7_T3
TTCACAGT_
14_T3
_V82A
_
ACACCAAC_9_T3_
V82A_
AGCTTTAG_18_T2
CCGCGACG_51_T1
GAATAGTG_11_T3_V82A_
TCCGGTTC_28_T2
2T_0
1_GA
CCCC
CT
ACACGGCG_13_T3_V82A_
CTAT
CGCT
_3_T
3
TCCCGCGG_14_T2
AAATCAAG_14_T1
_A28V_3T_82_AATTG
GAC
CAAT
CCTC
_16_
T3
AAAC
GGAC_1
1_T3
_V82
A_
ATAGAAAC_15_T1
GACC
GCCG
_25_
T1
GC
ATGTAG
_15_T3_V82A_
TGCGCTTA_7_T3_I84V_
GTATAGCT_8_T3
CTGCCTGC_12_T2GCCCCCAC_16_
T2
CGACAGCC_41_T1
ATTGGGGC_11_T2
ATGCACAA_79_T1
GCA
TCG
GT_
22_T
2
AAAC
AATA
_4_T
3_V8
2A_
CTCCGCGA_15_T2
_A28
V_3T
_42_
TCT
TAAT
A
TTTGTAAG_42_T1
GGTCCCCC_7_T2
CCGAGAAT_24_T2
TCTAAG
AA_26_T3
CTCGCCCC_16_T2
CCGCCCGC_32_T2
CCCG
CTCC
_40_
T1
CCCTCCCA_27_T
1
ACTGCTCC_24_T2
GGAGTTTG_5_T3
CCATACCT_15_T3_I84V_
GAGCGTTA_29_T3
AAATGTAT_7_T3_I84V_
CCTT
CTAT
_45_
T1G
ACTTCCA_8_T3_V82A_
TAAG
ACAT
_12_
T3
GTCTCGCT_26_T2
GAT
GTA
GC_
6_T3
GCACCGTC_19_T2
ATAC
CAAT_
10_T
3_V8
2A_
AGGAT
AAG_6_
T1
GAACATAG_7_T3TTAACTCA_21_T2
ATTCTCCA_20_T3
ACTATGG
C_7_T2
CTTA
TCAT
_12_
T1
AATACCCA_8_T3GCCGACCT_3_T3
CCTG
CCGC
_14_
T2
GCCGCCCG_4_T2
AAATGGCC_3_T3
CC
GG
AAAT_13_T3_V82A_
CTCCCCGT_2
4_T1
ATAAGCGC_9_T3
TACG
AAAC_13_T3_V82A_ _A
28V_
3T_3
2_TT
CTT
CGA
CCCAGTCG_15_T2
CCTC
GCG
A_3_
T3_V
82A_
GGAGTCCT_8_T3
GCGCCCGC_54_T1GAGGTTTC_16_T3_V82A_
TATGGAAA_35_T1
GTAAATCC_3_T3_V
82A_
CATGTCCC_11_T2
TCG
CTTG
C_10
_T3
CAAGATTA_15_T3_V82A_
CTAACAGA_4_T3GCCCGCTA_12_T2
CGAACTAA_26_T1
GCCCCACC_28_T2
CTTG
GAC
C_33
_T2
AGGCCTTT_3_T1
AGCCTTGT_10_T3_V82A_
ACCT
CCCA
_14_
T2
CCCCCCCA_6_T2
CCCGATAA_10_T3_V82A_
GTA
CTG
AG_2
1_T2
CCCTAGCT_1
7_T1
GCGCGGAC_4_T3
TCCCATGT_11_T3_V82A_
GG
ATCTAT_3_T2
ACTGCACT_4_T3_V82A_
GCTTCGTC_38_T1
TTGCGGGT_32_T1
CGCACCTC_8_T2
_A28V_3T_72_CTATAATA
ATCTC
CG
T_13_T3_V82A_
GTAG
TTTG
_20_
T3
GCTG
GTCT_31_T1
ATTG
CTTT
_30_
T1
CCCGCATC_5_T2TCATCTTG_10_T3_V82A_
TTTGGTAA_9_T3_V82L
TTCTCAGC_18_T2
CGTAAGCA_24_T3_V82A_
AAAGTATA_16_T3_V82A_
GAG
AGCTA_9_T3_V82A_
ACTA
CTTT
_20_
T1
CG
ATTC
TC_3
4_T3
_V82
A_
CTAA
TTAA
_29_
T1
GGTCGGCC_8_T2
CTGTTCCT_6_T1
CGATAACC_45_T1
GCACAACG_12_T3
CTCGTAAA_15_T3_I84V_
AGAGTATG_4_T3_V82A_
ACCCTGCA_6_T2
AACG
GAAC
_14_T3_V82I
GATCGCCA_3_T1
CCCT
GCCC
_14_
T2
GTG
TGC
TA_13_T3_V82A_
CCAAAACC_15_T2
GAAAAAGA_8_T3
AGG
CCCT
C_4_
T2
GCGTAGCG_21_T2
GCCCACCG_17_T2
GAGTTAGA_14
_T3_V
82A_
GGTAGGGA_38_T1
ACCACGTG_23_T2
AAGCGCTG_12_T3_L90M_
TTCCGCCT_16_T2
CGACCTAT_23_T1
TGAATGGT_21_T3_V
82A_
GTTCCCTT_
5_T3_V
82A_
AATCG
CC
T_19_T3_V82A_
TTTAATTA_16_T3CGCT
GG
AA_1
4_T2
CAGCCTTC_3_T
2
CTTGTC
TT_5_
T1
TACTGTTA_10_T3_L90M_
AGAACAAT_9_T3_V82A_
GCGC
GCGC
_4_T
2
GACGTTTT_17_T1
TACC
TCAT
_8_T
2
TAATAGTA_5_T3_L90M_
AGCA
GCGG
_9_T
3
CCAACAAG_17_T2
GTGAGAGT_10_T3
GCTTGGCG_24_T1
GTAT
TGGC
_6_T
3_L9
0M_
TCGAGGGA_3_T2
CGAT
GGCC
_13_
T2
GACC
CACG
_5_T2
ACCTAGCT_36_T2
CCAGGGTG_18_T2
TAGCGGGA_5_T3_V82A_
CCCG
ACCA
_3_T
2
TACTTACC_4_T3_V82A_
CCCA
TTTC
_6_T
3_V8
2A_
TGGTCTCC_14_T2
GAGC
CACG
_25_
T2
GTGAAAGT_17_T3_L90M_ GTT
TTC
AA_1
5_T3
CTGGTGTC_4_T1
ATTTTGAC
_14_T3_V82A_
TTCCTCGA_21_T2
AATTGCTG_29_T2
TACC
GG
TT_2
5_T2
ACTCGACA_18_T3_V82L
CCCCATAT_7_T1
AGAC
CAAT
_6_T
3_V8
2A_
CTAGACAA_17_T3_V82A_
GCC
AGAT
A_28
_T1
AAAACACT_11_T3
AAGG
CAGC
_35_
T1
CAAC
CACT
_23_
T2
TACGCTCA_12_T3_V82A_
CCCCTTTC_14_T2
AATG
AAGT_
11_T
3_V8
2A_
GTCCAAGT_15_T3_V82A_
AAAAGAAA_4_T3_L90M_
CG
ATTTAG_23_T3
CGG
CGCA
T_17
_T2
CTTAAC
AA_16_T3_V82A_
CCAT
GAAT
_41_
T1
CGCC
CCTT
_3_T
2
TGC
GAATA_12_T3_V82A_
TGCG
GTCT
_22_
T3
TGAGGTTT_20_T3_V82A_
TCAG
ATAT
_41_
T3_V
82A_
TAG
CAAA
C_8_
T3
_A28V_3T_52_AC
GGATTA
TGG
TCCC
T_21
_T2
CGTTTA
GG_11_
T3_V
82A_
CGGACTTC_26_T2GCCCACCC_23_T2
CGGACAAA_3_T1
CCCAAGAG_5_T2TGTGGCCG_22_T2
GAATATCA_3_T3_L90M_
TCCA
CGGG
_4_T
1
TTAAACTG_18_T3_L90M_
CG
AATA
AC_1
4_T3
_V82
A_
TGTCCCTC_36_T2
AAAAACTA_4_T3_L90M_
AGGA
TACT
_3_T
1
TGTC
AGCT
_23_
T3
CACGTGTA_11_T3_L90M_
TGCCTCG
A_13_T1
CACCGCCA_7_T2
CCATAACC_16_T2GTCCACCC_12_T
2
AACAATGC_13_T3_V82A_
TTCGTTCT_18_T1
TCAAAAGG_20_T3
ACATTTCA_38_T1
CCAACGCC_14_T3
TATCTACT_41_T2
AATC
ACCC
_28_
T1
TATATTTG_10_T2
TGCCTA
TA_9
_T3_
V82A
_
CAGAGTTG_15_T3_V82A_
GTTCCACC_3_T3_V82A_
CGGCCGCC_9_T2
GCCCCGCC_11_T2
GGCC
CGTC
_3_T1
AGGGAAAC_17_T3TTGTGCGG_9_T3_I84V_
AAAGACGG_15_T3
CTACCATA_31_T1
CCAG
CCAG
_24_T
2
GACGGACA_7_T3_V82A_
GGGGTAGT_7_T3_V82A_
TACCACGC_11_T2
GCATCAAG_27
_T1
GGGGCCCC
_19_T1
GCCTCCG
C_17_T2
AGTAG
AGG
_18_T3_V82A_
TAGCGTGA_37_T1
TTGTGCAG_4_T3_V82A_
ACTACGAC_41_T1
TTCCTACG_42
_T1
ACGGCAAT_20_T1CCAC
CGTG
_38_
T1
GCAACACC_14_T1
CCCCCTCG_15_T2
TGTCAATA_33_T1
ACGC
CGAA
_7_T
2
ATAA
CAT
T_14
_T3
CTCCCGCG_3_T2
GG
TTGG
TG_19_T3_V82A_
TGCGGACC_36_T1
TCGTGTTA_25_T1
GG
AATA
GA_
15_T
3_V8
2A_
ATATTGCC_3_T3_V82A_
TAGAG
TCA_14_T3_V82A_
TTAG
CTCA
_4_T2
GAG
AAAT
A_6_
T3
TGTCCGAC_30_T2
GACC
CCAC
_25_
T2
TTGGCCCC_4_T2
GG
GAACCC_7_T2
TCCT
CCG
T_6_
T2
TACGAGTC_20_T2
CCCGTCAC_16_T2
GCCCGGGA_14_T2
GACCGCCC_13_T2
TTATACCA_13_T1
GTAT
AGGT
_30_
T3_L
90M
_
TTATATTA_10_T3_V82A_
ACGCACCG
_17_T3_V82A_
ATAA
TCCC_1
0_T3
_V82
A_
GTGGTCGC_13_T2
GCCCTAGT_6_T3_V82A_
GCTACCAG_40_T1
CCCT
AGCA
_22_
T3
CCCCGTCA_15_T2
CCCATATG_27_T1GAGCGGCG_20_T2TAACTCAC_17_T2GGAGGGAC_4_T
2
CCGCGCAG_11_T1
CCGACTGC_20_T3
CGGTG
CAC_
6_T1
CCCGGCGT_25_T2
ATG
GG
AGA_
7_T2
TTTAACGA_7_T3_V82A_
GGTCTCTC_12_T2
CTCC
CCTG
_16_
T1
CGTCG
TAT_47_T1
TCAGTTTT_7_T2
GGGA
AATT
_11_
T2
CTG
ACCT
A_5_
T2
GCTCGACC_22_T2GCACGATC_25_T2 TTTCTCTC_19_T2
CTAATACT_11_T3
TTTCTTCT_26_T2
CCCCCCCC_20_T2
ACAAATTA_6_T3_V82A_
GCGCGCCT_19_T
1
ACTGAATA_10_T3_V82A_
TGCGGCGC_16_T3_V82A_
CTAAATTC_36_T1
ATTC
TAG
G_1
4_T2
AGAG
AGCT
_36_
T1AG
GTC
CCT_
28_T
2
GGATGCCT_19_T3_L90M_
CCCCTTTT_20_T1
TAGTA
AAA_32_T1
GCGAAACT_10_T3
CATAGCGT_29_T3_L90M_
TAAACCAT_2
2_T1
AAGGTTAA_16_T3_I84V_ATTCCCTG_31_T2
TATCCTTT_25_T1
CACACACC_4_T2CCAGACGC_32_T2
GCGGAAAG_10_T3_I84V_
GC
ATCAC
A_14_T3_V82A_
GAGACGAG_12_T3
TATA
AGAA
_5_T
3_V8
2A_
TCAAGACA_17_T3
GCTCCCAC_12_T2
ACTC
ACTG
_18_T3_V82A_
GACGCCAT_16_T2
GCGA
CCCC
_15_
T3
CAGAATCC_6_T3_V82A_
CCCG
GCGA
_15_
T2
ACCTACCC_3_T3
ACAT
CC
AC_2
0_T3
_V82
A_TA
TATT
AA_7
_T3_
V82A
_
CGCCCCCT_19_T2
AAACTAG
T_15_T3_V82A_
GTTC
AGTG
_20_T3_V82A_
ATTCCCGG_16_T2
ACAGGGCT_13_T3_I84V_
AACG
GAT
G_5_
T3
TCAGGCCA_14_T3_L90M_
TGTC
TGAT_2
1_T3_V
82A_
GAACCG
GG
_19_T1
CTCATAGA_10_T3_V82A_
CGATTT
TG_7
_T3_
V82A
_
CGG
CGAG
A_9_T3_V82A_
GCGCCCCT_17_T
3_V82A
_
CTCCGCGC_13_T2
ACTC
GG
AC_1
4_T2
CCCCACCC_6_T2
GAAACCG
T_10_T3_V82A_
CGGCCCCG_11_T2
CCTCCACC_3_T2
CCGCCCCG_36_T2
AACGCACT_32_T2
TGTA
CCTC
_9_T
2
TGCGCGCG_12_T2
ATGCCTTT_3_T3
CAAGAC
AT_1
2_T3
_V82A
_
TCGTGCCA_36_T2
TTTCCGAT_9_T3_L90M_
GTTAATTA_9_T3_V82A_
CAC
ATACA_17_T3_V82A_
GGAACCCT_8_T2
AACCCCCT_9_T3_V82A_
CACTAGTG_9_T3_L90M_
ACAGCGCA_29_T1
TCCCCTAG_5_T2
GACTTAAG
_9_T3_V82A_
GAGCCCTT_
15_T1
GGTTCCCG_48_T1
CCCGCACT_11_T2
AGTC
GC
GC
_17_T3_V82I
3T_71_ATTAATCT
CTCT
CACC
_12_
T3
GG
AAAATA_5_T3_V82A_
CTGCT
AGC_
17_T1
AACGGGGA_1
7_T1
TACTTAAT_8_T2
TTGAC
TAA_
8_T3
_V82
A_
CCAGAGAC_28_T1
CAAAGTCT_7_T3_V82A_
CTTTCTGC_11_T2
ACCCGTCT_31_T2
CCAC
ACAC
_37_
T2G
CTAG
TGG
_10_
T2
AAGCCATA_22_T1
CATC
CACC
_18_
T1
TCTCTGAA_14_T3
GTG
CCG
TT_1
5_T2
TCTA
CGGG
_28_
T1
CCGA
CCCG
_9_T
3
CCTCTCCT_12_T2
AGCCTCTT_25_T2
CCATTCCC_5_T1
ATCT
CGCC
_21_
T2
CCAA
CGCC
_34_
T1
CC
CC
ATGA_16_T3_V82A_
GTTCGGAG_12_T2GTTTATAA_6_T3
GAC
GTC
TT_7
_T3_
V82A
_
AGCGTCAG_17_T2
GATCG
AGT_
5_T3
_V82
A_
CCCC
GAC
T_16
_T3
GCCCCAGA_16_T2
CGTACGGA_12_T3
_V82A_
TTCGCCGT_18_T1
ACCCCCAG_10_T1TACTAGCC_12_T3_V82A_
CGAC
CTTG
_28_
T2
AGGTA
AAA_
3_T3
_V82
A_
TACCACTC_5_T3
CTCCATGC_5_T3_V82A_
CGTGGCTA_4_T3_V82A_
CGTCGGAA_17_T3
TCG
CGG
GT_
5_T2
GTG
TTTAC_13_T3_V82A_
CTTGCCCA_10_T2
CACAATCA_24_T2
GAATAAC
A_15_T3_V82I
CTCGCTG
G_14_T3_V82A_
GG
CCTG
TC_9
_T2
ACTA
CAAC
_14_
T1
TAGAGCGT_31_T3
ACC
TACC
A_14_T3_V82A_
GTTG
AGAT_5_T3
TCCGGATG
_29_T3
_V82A
_
CGAA
CCAC
_11_
T2
GTT
TTTA
T_3_
T3_L
90M
_
ATTGGCCA_14_T3_V82L
TCGGTGCC_18_T2
TTTGTGAC_9_T2
CGAGTAGA_7_T3_L90M_
GATGGTA
A_21_T1
ACTCCCCG_18_T2TGAG
TTCG_12
_T1TCGCTGCG_3_T2
GGTA
CCTC
_7_T
2
GCCCACCT_11_T2
CCGCACTC_51_T1
CCCA
CTCC
_7_T
2
GCACCCCA_12_T3_V82A_
TTTC
GACC_11
_T3_V
82A_
GGACGTCG_31_T1
TCG
GTA
AC_3
_T3
CCACCCGT_47_T2
AGTTGTCG_4_T2
TCCTGACT_11_T3_V82A_
ACTT
GAT
A_11
_T3_
V82A
_
GTATTG
CG_25_T1
AACCCAGC_3_T1
CAAA
CCCT
_16_
T2
CTTTTGCA_10_T3
TTAC
CGAC
_7_T
2CG
CATC
AT_7
_T2
ACTGTCAA_4_T3_V82A_
TCAATGAT_18_T3
GCCGCCCC_6_T2
CGTTTTCT_4_T3_V82A_
CCTGAATA_25_T3
CCCAGCTG_8_T2
TCCAGGCG_4_T3_V82A_
CCGC
TGCG
_20_
T2
GATGCGTG_7_T2
ATC
TCTT
A_11
_T2
CCCAGTAG_3_T2
AAGCTCCT_15_T3_I84V_
TGTCGGGC_16_T2
GCCG
TGCC_16_T3_V82A_
TGCGCGCT_6_T2
TCG
GAA
CC_2
9_T2
TAGCGATG_3_T3_V82A_
CCTCGGTC_15_T2
CCCC
ATAC
_3_T
3_V8
2A_
TCGCC
GTG_7
_T3_
V82A
_
TAGG
CAAG
_25_
T2
TACAATCT_7_T3_V82A_
CTTT
GCGC
_6_T
1
CTACCCCC_30_T2
CCCCCCAC_26
_T1
CATC
CC
AT_20_T3_V82A_TTTTC
AAA_19_T3_V82A_
GCTAGGGT_23_T2
ACTTCCAG_9_T2
CCAATCCG_5_T2
GGGCCCAT_12
_T3_V
82A_
GG
GATC
AT_15_T3_V82A_
AAGAAGAA_26_T3
CGAGCTTA_7_T3_L90M_
TGCG
AAAG
_10_
T2
CACT
GACA
_3_T
3
CCTG
GTCT
_18_
T1
CCTG
CCAA
_9_T
2
TGCGCCCC_18_T2
TGATTAGC_21_T2
CCCCCGCC_18_T2
TGGGGCGG_17_T2
CTAT
CTCG
_34_
T1
CTCCTGTG_21_T3
AGAT
GCAC_8
_T3_
V82A
_
CACAAGAC_3_T3_V82A_
GGTCACTC_16_T2
GC
CATG
TA_13_T3_V82A_
GGAGACAA_3_T3_I84V_
CATT
AAGC_
6_T3
_V82
A_
ACTA
CATC
_14_
T3
AATGCAG
C_6_T3
GAGCCTGA_6_T3_L90M_
_A28V_3T_7_AG
CAGATA
GAGC
TTTG
_40_
T1
TTGGACCG_13_T3_V82A_
TACT
TACT
_18_
T2
CTCCCATA_35_T1
CCCACCGA_33_T
1
TGAG
AAAA
_9_T
2
CACG
CCCG
_3_T
1
AACAACTG_9_T3_V82A_
AGCC
ACCT
_28_
T2
GCTC
GGTC
_7_T
3
CTGG
ACGT_7_T3
CGG
AACC
C_20
_T2
GTC
CCCT
T_6_
T2
CCTC
GGGT
_11_
T2
GTCC
CCGG
_8_T
2
GCTCACCG_13_T2
CGG
ATAAC_12_T3_V82A_
GCCTCCGT_6_T3_V82A_ACTGAACC_24_T1
TGAATTAC_4_T3_V82A_
TTCTCCAA_4_T3_V82A_
GACG
GCCT
_20_
T2
CATAT
CAA_9
_T3_
V82A
_
TCTTGACC_8_T
2
AATACTCT_5_T3_V82A_
CTCCCCAA_6_T1
TAGTAACC_15_T3
CGGATAGC_8_T2GGGCAGGG_14_T2GATGTTAC_3_T3
ATGTACTG_14_T3
GGGACCCT_15_T2
GCAGCCAC_29_T2
CAGCCGTG_9_T2
AATGTTG
T_7_T3_V82A_
_A28V_3T_71_TAC
CAAC
G
ACCCACTC_21_T3_L90M_
TTGCAGAG_3_T3_V82A_
CATT
CGGC
_9_T
2
AAACTAGT_44_T1
TACCAC
GA_8_
T3_V
82A_
GACCACAC_3_T2
GCCCGGCC_25_T2
CGTCTAGT_55_T1
GCCTGCCA_4_T2
AGCC
TCCG
_18_
T2
CCCGCCTG_51_T1ACACGGCC_18_T3_I84V_CTGTTGCT_7_T1
CTCT
TTAA
_20_
T3
GCAAGCCA_3_T2
CCCATCTT_40_T1
GTATTGTG_17_T1
TCACCCCT_18_T2
GACT
GTAA
_35_
T1
GTTTATTT_17_T3_V82A_
GGAG
CCCC
_17_
T2
ACCAA
AAT_
13_T
1
GCCTTCAG_6_T2
CGCTCCCG_5_T2
TCTC
TTCC
_8_T
2
TTGT
CGTT
_6_T
1
AAAAGAAT_7_T3_V82A_
GTTTACGT_29_T2
TTGAGCCC_11
_T3_V82A
_
CGGGAGCC_17_T3
GCAC
TCGA
_35_
T2
TGAGTCAA_39_T3_V82A_
CGAT
GATT
_27_
T3
TCCGCGGG_19_T2
TATGG
ACC_8_T3_V82A_
GC
GC
TAGA_15_T3_V82A_
CTCC
TCAC
_22_
T2
CCGCACCC_11_T2
ATCTTAAT_23_T3_L90M_
ACGAGGGA_3_T2
CGAAATCA_9_T3_V82A_
GTAAGCGA_22_T3_I84V_
AACA
TTCT
_5_T
3
TAGAGGTA_20_T3
ACTTTTAT_25_T1
TCCC
TCTC
_7_T
3_V8
2L
CGTCCGGT_27_T
1
TTCA
CCGA
_13_
T3_L
90M
_
GACTCCCA_2
8_T1
AAAT
ACC
T_9_
T3
GCTG
GACT_5_T3_V82A_
CCACTGGC_7_T2
ACCCCGCC_15_T3_V82I
GTA
GG
TTC_
17_T
3
CCG
GAC
TC_1
8_T2
TTAA
AAGC_10
_T1
CCCCACAC_12_T3_L90M_
CGATGGCA_5_T1
TCTCGCGC_17_T2
AGACTCTA_41_T1
TTTTCCGT_17_T2
CCGG
TGCA_5_T1
CTTGCGAG_4_T1
ACACCATC_11_T3_L90M_
TCAC
CAG
A_12
_T3
CCTATTCC_17_T3
ACTA
TTGC
_42_
T1
AGCG
GCAC
_11_
T2
GACC
CCCG
_7_T
1
GTAGTAAC_29_T1
AAAG
GCGG
_26_
T2
CCGCCCAA_14_T3_V82A_
TAGACCCC_14_T3_L90M_
TGCTG
AAG_19_T3
_V82A
_
CG
TTCTC
A_13_T3_V82A_
TCTA
TAAT
_18_
T3
TTCC
TTCT
_6_T
1
CAGCACCT_11_T2
GTAAGTGG_16_T3
ATATTTAC_10_T1TC
ATTT
CC_1
9_T2
GAGGAAAA_23_T2
CAT
AGAC
C_1
4_T3
_V82
A_
CCGCGGGA_5_T2
CACTGCAC_26_T1
CTCGCATC_13_T2
CCGTGGCC_18_T2
CTACCCGC_15_T2
AAAA
TTGT
_3_T
3_V8
2A_
ACCAG
AAC_13
_T3_V
82A_
ATCATGCC_9_T3
CCCG
CCCC
_25_
T1
GGGA
CGGC
_12_
T3
GTTTAGGA_22_T2
CATCTCAA_43_T1CTCTAAGA_18_T3_V82A_
GAATATTT_35_T1
TAGTC
GTA_3_T3_V82I
CAGGGCGG_24_T2
CAAATTTG_18_T2
TCCCCGGT_14_T2
GAC
TTCC
T_14
_T3
TCCCGAAA_3_T1
TGTTTATC_9_T3_L90M_
CCCC
CTAG
_3_T
2
CTGGTGTT_38_T1
GCCCTCCC_24_T1
AATT
TTAT
_5_T
3
ACATTTCA_6_T2
TATCCGCC_16_T3_L90M_
CGGGCGTT_22_T2
TTTT
GTC
C_13
_T2
TAATG
TTC_1
2_T3
_V82
A_
GAGCACCC_12_T3_V82A_
TGGC
CTAG
_23_
T1
CACGCCGA_38_T2
ACCAATGC_6_T3_L90M_
AAAGCACG
_8_T3_V82A_
GGTTGGTT_18_T2
GCGTGAC
G_22_T3
_V82A
_
GCCGGGCT_4_T2
AAGTCG
GA_8_T3_V82A_
GATATGCG_19_T1
GATATCCT_5_T1
TCTGTAAC_10_T2
TCTGAGCT_33_T1
ATAAGCAT_3_T3_V82A_
AGTGATA
G_21_T3
_V82A_
CGACAGCC_8_T2
AGTC
AAG
C_7_
T3_V
82A_
_A28V_3T_52_AAACA
GGA
AGCA
CATC
_34_
T3
GCAG
GG
AC_23_T1
AGCATCAA_8_T3_V82A_
CCTGGGGC_7_T2
AGTACATC_3_T3_L90M_
CGAGGTTA_8_T3_V82A_
CCGGCTTC_13_T2
GGTCGGGA_10_T2
GCG
TCG
CC_3
1_T2
TCGGCCAC_5_T2
TAAAGTAT_15_T3_V82A_
CCCCGCCG_19_T2
GCGCCCGC_8_T2
AACTCTCC_39_T1
GGGTAGAA_3_T3_V82A_
GCCCAG
AA_27_T1
TATC
ACCT
_32_
T1
ACCTGCGG_32_T2
CTGCGTGG_10_T3_L90M_
_A28V_3T_92_AATAG
CC
G
GACATCGT_12_T1
CGTATGCC_18_T3_L90M_
AGGATCCG_18_T2
ATAAACGT_20_T3
AACACACG_9_T3_L90M_
CTGACCGC_30_T2
GGCCCCTT_25_T1
CCGG
TAAT_9_T3_V82A_
ACCACCGG_14_T2
TAGCATTC_3_T2
CTTAGTGC_9_T1
CCTACCTC_5_T3_V82A_
CGGCGAAT_6_T2
AGATTGTT_12_T3_I84V_
AATT
CATG
_19_
T3
GGGACAGC_23_T2
CGCC
GACC
_7_T
3
ACTA
GGGA_12_
T3_V
82A_
CTAAATCA_18_T3_L90M_
CC
TGTTAC
_14_T3_V82A_
TGCCCGTA_10_T3
GCC
TTTG
C_15
_T2
ACACTGCC_10_T3_V82A_
CAGTAAAA_16_T3_V82A_
ATTATCCC_3_T3
TGGACAAA_35_T3
GGGT
GAAC
_35_
T1
TATTAAGC
_17_T3_V82A_
CTCACACA_6_T2
GACAACCC_17_T3_L90M_
GC
GTTG
GT_21_T3_V82A_
AACCAGTT_13_T3_V82A_
GTC
ATTG
G_9
_T3_
V82A
_
GACGCGCC_9_T2
TGCC
GGGA
_15_
T1
GACGCAAC_70_T1
AGTCCGAA_12_T3_L90M_
TTAATATT_5_T3
ACCGGAGG_17_T2
ATGATGAG_31_T1
CCCT
TCCC
_11_
T2
CGAG
GAAA_10_T3_V82A_
3T_81_CACAAGAC
TTTCGCCG_3_T2
TATGG
ACG
_21_T3_V82A_
CTAAAGCC_17_T3_V82A_
TCCC
ACTG
_13_
T2
CGTGCGTT_37_T2
AGTTTTG
T_12_T3_V82A_
TCCCGCCC_7_T2
TGAT
CATA
_3_T
3_V8
2A_
GCACCTAA_23_T1
TTAG
TAAC
_22_
T1
GGGATCGT_9_T2
TTATG
ACC_32_T1
GCCCCCTC_9_T2
TGAAGCCT_14_T3_L90M_
ACTG
CGTC
_21_
T2
CCCC
CCGT
_6_T
2
GCCTTTCG_11_T2
AGG
ACAC
C_5_
T3
TCTA
CAAG
_16_
T3_L
90M
_
TTGC
CAC
A_19_T3_V82A_
TCCACTGA_3_T1
3T_4
_TAC
TCCC
T
CGAACCAC_8_T3_V82A_
ATTG
AAAG
_3_T
3
TCCATCGT_8_T3_I84V_
ACCACCCG_11_T2
GCGCGCCC_3_T2
CTCC
CCAG
_24_
T2
GCAGTACT_5_T3
GGTCATTC_19_T2
TCAATAAC_13_T1
CCGGATAC_19_T3_L90M_
CTTT
TAGG_1
0_T1
GGCCAGCT_9_T2
AATTGC
GA_15_T3_V82A_
TAAACTTC
_15_T3_V82A_
ACATTCTT
_19_T
3_V82A
_
CCCGTCCC_13_T2
TCGTATCA_31_T1
AATCCGTT_13_T3_L90M_
GTGGCGCT_13_T2
ATCTATAC_17_T3_L90M_
CGGCGCGT_11_T2
TTCCCATG_23_T2
AAAGAATG_12_T3_L90M_
GGCCAGCC_4_T2
TGG
CCCGT_12_T3_V82A_
TTATGTCA_5_T3_V82A_AATGG
GG
C_8_T3_V82I
CGCG
ACTC_13_T3_V82A_TCCCACTA_5_T3_L90M
_
_A28
V_3T
_32_
TG
CAT
GG
G
CCCTGCTT_3_T2
CGAGACAT_7_T1
TAGGCGGA_18_T2
GATCTACT_34_T1
AATACGAC_7_T3_V82A_
ACGGTG
GC_17_T1
ATTGTGTG_11_T2
ACACGGTG_8_T3_V82A_
CTCCACCA_13_T2
GTCCGTCC_4_T2
GCCACCGG_4_T3_V82A_
CGTATCGG_7_T2
GGAGGGGA_8_T2
2T_3_CACTCCCC
TTTCCTGA_13_T2
CGACCAGC_3_T2
AAGGCGTA_3_T3
ACTCACTT_23_T3
TCGCACCG_5_T3_V82A_
CCTG
CGGC
_3_T
1 TCCC
AAAC
_31_
T1
CGTTGACG_43_T1
GCGCAACG_30_T1
CGTA
CCG
T_15
_T3
AAACACTA_24_T3_V82A_
CGTCACTC_23_T1
ACCC
GCCC
_11_
T2
TCCAAGAT_6_T3_V82A_
GGGT
CCGC
_18_
T2
CCAAGCAC_10_T3_V82A_
GCCT
CCCG
_24_
T2
GTCTGCCC_33_T2
TGGT
ATGA
_17_
T3
GCTATGAA_4_T2
GGCCGCAA_9_T2
TACTTAAT_13_T3_I84V_
GATCTG
AA_3_T3_
V82A_
ATCCTAAA_11_T3
TGAATTCT_22_T3_L90M_
CATGCATT_4_T3
CCCTCTGA_29_T1
GCCGGAGA_8_T3_V82A_
CACA
ATAA
_16_
T3
ATTA
TTGG_1
0_T3
_V82
A_
ACAA
GCA
A_25
_T3
ACCCACAG_3_T1
ATTACTTA_5_T1
TACC
TACT
_22_
T2
CAAT
TCAA
_32_
T1
TGAGGGAC_10_T2
ACGTCCAC_17_T2
CCGTTGGT_3_T3_I84V_
TTACCAGC_35_T1
TTG
ACAA
T_26
_T3_
V82A
_
AAGG
TCCC_10_T3_V82A_
CTCTGCTC_24_T2
CGAG
TCAT_8_T3_V82A_
ACCT
TTAC
_4_T
3
CCAGGTAC_39_T1
AGAATTG
T_8_T3_V82A_
AAGAACG
A_19_T1
TTTATATA_11_T2
CCCCGCCA_4_T2
CCCTGTCA_22_T2
TGCCCGAA_3_T2
TACGGCCC_5_T1
ATTG
GC
AC_1
5_T3
_V82
A_
ATGAACCA_19_T3_I84V_
TACCCGGC_9_T3_
V82A_
TGTCACCG_9_T2
AGGA
AGAC
_8_T
3
CCACCCGT_8_T3_I84V_
ATACTGGT_9_T3
TGTTCGTT_15_T2
GTTTCGGA_16_T3_I84V_
_A28V_3T_14_GTA
CGT
GA
CGTGCACG_17_T2
GAAAACCC_21_T3_V82A_
CTTATCTA_18_T3_L90M_
ACCAT
AGC_11
_T1
CCACCGTC_48_T1
TCCC
TCTG
_5_T1
ACTTATCC_15_T2
TGGTTCCA_8_T2
CGACACGG_15_T2
AACT
TTAC
_5_T
3
CCTTTAGG_13
_T3_V82A
_
CCGCCCCC_9_T1
AAGCGCGT_16_T3_L90M_
CCGATCTA_36_T1
GAATAGCT_5_T3_V82A_
AACCCCCG_26
_T1
TTGATTCA_6_T3_V82A_
GATACAAA_6_T3_I84V_
2T_3
1_GT
GTCC
CC
ATCAAAAT_7_T3_V
82A_
AATCCTTG_11_T3_L90M_
CCCG
GG
TA_5
_T3
TTTTGATC_17_T1
GAA
AGC
TT_1
7_T2
TTCCGATT_3_T2
GCCACAGC_7_T1
GATGCACG_14_T2 GCACCACT_61_T2
GGAAAACC_24_T1
CTTTCCTT_8_T1
ATGTTTAC_16_T1
GAAGTT
TC_9
_T3_
V82A
_
_A28
V_3T
_42_
GAA
GG
CTT
ACCCAAAC_22_T1
CTTAACGC_22_T2
CAGCTATA
_17_T
3_V82A
_
GATCATGC_14_T3TTCAAGGC_15_T2
ACCACGCG_28_T2
ACACGCCT_12_T3_I84V_
TTTGCCGT_35_T1
GACC
TACG
_3_T
3
TTGC
GCCA
_3_T
2
TTGTC
GAG_8
_T3_
V82A
_
TGGCACGA_12_T2
CCCCCAAC_3_T2
CCTC
ACCA
_3_T1
ATAAGG
AG_13_T3_V82A_
TCGTTATA_5_T2
CGCTCCAT
_14_T1
GCCTCTTC_28_T2
GCTCGAAA_6_T3_L90M_
CGAC
ACAA
_7_T
3_V8
2A_
ACCTGTAC_32_T3_V82F
CTTC
AAG
T_17
_T3
TTCG
GGGC
_11_T
2
TCCCGGTT_5_T2
GAAACTAT_11_T3_I84V_
ACCTTCAA_9_T3
CACAGGTG_11_T2
AGCAAATA_31_T1
GCGTGTGC_15_T2
TAGTTCAT_3_T3
TAAA
CCG
A_15
_T3
ATAT
CC
AA_9
_T3_
V82I
CACAAGGT_16_T1
TCTGCAAT_16_T3_L90M_
GCCACGAA_9_T1
CTGCCAAG_24_T2
GGGGGTGC_8_T1
CCCAAAT
T_17_T
1
TCGG
CCCA
_21_
T1
GG
AATTTG_13_T1
CGGGCCGA_8_T2
GCCT
CATT
_14_
T3_V
82L
ATAGTC
AA_3_T3_V82A_
TATACAG
G_13_T3_V82A_
TCTGCATG
_17_T2
GGAATGTT_7_T2
GTAGCCAG_14_T2
GCCCGCTT_27_T1
AAAAAATC_5_T3_V82A_
TTAACTCT_13_T3_I84V_
CACTAAGT_2
3_T1
ATTCTGCA_31_T1
ACGC
AAAT
_18_
T1
CCTAGAAC_9_T3_I84V_
TAACGCAT_16_T3
CGTAAAAA_6_T3_V82A_
GCGTT
AAT_
5_T3
_V82
A_
GTATTACG_10_T3_L90M_
ACCACATA_37_T1
ACCCAGCT_7_T2
TCGATGTT_21_T2CGACCCGC_13_T2TTCCGCGA_21_T2
TAG
AATG
G_1
6_T3
_V82
A_
GCGGGTCT_4_T2
GG
AAAG
GA_
5_T3
_V82
A_CC
ACTC
GA_
6_T2
GG
ATCA
TA_5
_T3
CGACTAAT_19_T3_I84V_
TCCCCATA_19_T3
GTTAG
CC
T_16_T3_V82A_
ACGGGCCC_11_T2
TTAT
ATC
A_18
_T3
GGTGCCCC_17_T2
AAC
CAT
CA_
3_T3
_V82
A_
ATCTTTAT_3_T3_V82A_
GTGTGAGA_24_T1
TGCTGGTC_28_T2
ATG
CAAT
C_20
_T3
ACGG
CGCC
_17_
T2
AGCGTCTC_32_T2
TTCCCTAT_9_T2
CCCC
AACC
_26_
T3
TGGAGATT_16_T3_L90M_GCAAGTCT_6_T3_V82A_
TGCGCCGA_5_T2
AATTATAT_8_T3_V82A_
CCCGGCCC_11_T1
TGATG
CCC_19_T2
ACTC
TGG
T_7_
T3
TCTC
ACAG
_29_
T1
2T_7
_ATT
CCG
GA
AGCAGTAC_40_T1
GCTC
GGAG
_19_
T1
TAATGTAC
_14_T3_V82A_
GCCG
TGTA
_38_
T2
CACAACCG_35_T2
GCGATGTT_6_T3_V82A_
GCAGGAGC_10_T3_L90M_
GCCC
ATCT
_9_T1
GATATGGG_7_T3_I84V_
ATACTTTG_3_T3_V82A_
GCC
ATCC
C_21
_T3
AGCCGCAA_4_T2
TAGCCGGC_17_T2
CCCA
AGAT
_3_T1
CCCG
CGG
A_10
_T3
TACACATG_16_T2
CCGCGTCG_5_T2
CCGCACGC_12_T2
GTCG
CTAC
_38_
T2
CTCCGGAC_5_T2
GTACATTG_4_T3_V82A_
TTAG
TGAG
_22_
T2
CTCCCTCG_5_T2
AACTCGTC_31_T2
CCACACGA_8_T3_V82A_
ACCC
CCCC
_12_
T1
CCACTTAC_20_T2
TTCGGTAG_9_T3_V
82A_
ACCGTGAA_9_T3_L90M_
TGGGTTGT_9_T3_I84V_
GTA
GAC
AA_3
9_T3
_V82
A_
TAAGACAA_26_T1
CGGC
ACCC
_16_T
2
ACGCGTCT_5_T2
GCCCCGCT_3_T2
CTCAATAT_15_T3_L90M_
CG
TAGC
AA_17_T3_V82A_
AGCACTAC_30_T3
CGGATGTA_4_T2
TGAC
TGAT
_17_
T3_V
82A_
GGCCCAAC_9_
T3_V
82A_
TCAAGCTT_27_T3
TCCGGACC_15_T2
AGTCCAAA_20
_T3_V
82A_
TTTC
TCCA
_21_
T1
ACTCCATG_39_T1
AGCTTCTT_27_T1
ATCACCGA_4_T3_I84V_
TCACCCAT_40_T1
CGTATCCA_8_T3_V82A_
GG
GACATG
_9_T3_V82A_
CGGATGTT_22_T3_I84V_
ACTG
ACAT
_41_
T1
TTGCAG
TT_5_
T1
GAGCCCCC_15_T1TCGC
TCAC
_5_T2
AACACAAC_4_T3_V82A_
TGCC
TTCG
_26_
T2
GAGAG
AGT_
15_T3
_V82A
_
ACCCCTTC
_26_T1
CCCCACCC_5_T3
AAGGGGAT_17_T3
GAACTAG
C_10_T3_V82A_
TAG
AAAA
G_7
_T3
GCCATCCC_10_T2
ACTTCCCC_16_T2
GG
GCC
CTC_
19_T
2
CCCCCCAC_4_T2_V82A_
CTG
ATGAC
_17_T3_V82A_
TCAA
CTTT
_29_
T1
GTTCTTTA_42_T1
ATATTG
GT_7_T3
_V82A_
GTCACCCG_17_T2
CGGGTTCC_7_T2
CAATCCAC_40_T2CCTC
CCTA_24_
T1
CTGC
CCCT
_25_
T1
AGCT
ACTA
_7_T
3
AGAC
CCAA
_9_T
3
CTG
GCC
GC_
19_T
2
ATTTATCA_13_T3_V82A_
CCTCGTGA_5_T3
CGTC
ACCG
_17_
T1
TGC
GC
CTC
_21_
T3_V
82A_CC
TCAC
CG_2
0_T2
CAAG
CAG
C_7_T3_V82I
CCCCTCTC_13_T2
T2
TTACCAGC_10_T3_V82A_
TAAA
GGTC
_9_T
2
GGGG
GCCC
_3_T
1
ATCAC
ATG_3_T3_V82I
GCCCTGCA_19_T2
GTATT
TAT_
10_T
1
CGCTACAC_12_T1
ACTG
GCG
G_9
_T2
CGAC
CGCC
_19_
T2
TTCACCTC_11_T3_V82A_
ACGCGGCG_4_T3_V82A_
GCTA
CCCG
_20_
T2
AACCGTT
G_13_T
3_V82A
_
CCCACGTT_21_T3_V82A_
AGTGACCA_11_T3_V82A_
ATAA
CCAA
_5_T
2
AGTA
GTC
C_10
_T3
CCCG
ACAA
_29_
T2
TGG
CAG
CT_8
_T2 CCGCGCCG_17_T2
GG
ACG
GAA
_6_T
3
GCGG
GCTC
_5_T
2
TGGCCGTC_4_T2
ATCAAATA_5_T1
ACTCCGTT_14_T3_V82A_
AGAG
GG
CT_21_T3_V82A_
TACGGGAT_21_T2
GTGCACGA_17_T3
TCACCGCC_31_T1
AATTAGAC
_11_T1
TTTA
TATC
_11_
T2
TGGT
CCGG
_16_
T2
TGTCGCCT_11_T1ACTGTCGA_11_
T2
ACCTATCT_36_T1
CC
AATG
AT_1
7_T3
_V82
A_
CCCATATG_14_T3_V82A_
TTCTA
GAG_3_T3_
V82A_
GATAAAAG_9_T2
ACTCGGCT_27_T2
TTTT
TCTC
_12_
T3_L
90M
_
GTCGCCGA_7_T3_L90M_
TGTCGCGC_13_T2
AGCGACCC_28_T2TGGGGTGC_16_T2TGCCACCC_7_T2
CTCC
CCCC
_7_T
2
ACACTGAT_1
0_T3_V
82A_
CTTAAAAC_3_T1
CCAG
CCCT
_13_
T1
GACCCACC_6_T3
GTCA
GACC
_13_
T3ATTGACGT_3
0_T1
CTGCCGCG_19_T2
TCC
ATGC
A_17_T3_V82A_
CGG
TAG
AA_1
4_T2
GACTCGTC
_5_T3_
V82A_
AGGAGAAC_17_T3
ACAT
GGCG
_37_
T2
GGTCTCAG_27_T1
ACACCAAT_5_T3_L90M_
TCACTTAA_12_T2
GCGCA
CAC_
18_T
3_V8
2A_
AGCTGCAT_38_T
1
CATGATTA_24_T3
CATG
CCCG
_43_
T1
ACTCCCCG_4_T3_V82A_
TTCGGTGG_19_T2
TTAT
AGAT
_14_
T2
TGGGGCCC_3_T2
TAACTGCA_8_T3_L90M_
GCGAGTGG_12_T2ACTGCATG_16_T2
ACTA
TGG
G_1
2_T2
TGCGGGAG_27
_T3_V82A
_
TTGGCTCT_31_T1
TAACCGCT_23_T3_V82A_
ACCGTGTC_24_T1
GCCACGGC_3_T2
TTAC
TGTT
_10_
T3
ACTCGGAC_13_T3
_V82A_
GCCAGACT_26_T1
TTCAGGAT_29_T1
TGAC
AGGC
_9_T
1
TAATTAGA_6_T3
GAGTGGGG_30_T1
TAAGCCCC_8_T3_V82A_
ATGCCAGT_16_T3_L90M_
GCCACCGG_25_T2
TTCTTCGC_5_T2
CCCCCGTG
_17_T2
ATCAT
CCC_8_T
3_V8
2A_
GTCACCGC_8_T3
ACTC
TGAA
_13_
T3
AACCGCCC_17_T2
CC
TGTTG
G_19_T3_V82A_
GG
CCTGTT_8_T1
AACA
TGTA
_17_
T1
CCCC
TCCA
_22_
T1CC
CTCA
CA_1
7_T2
ATTACCCG_14_T3
ACGAGCCA_31_T1
ATTGTGGG_10_T2
TACTTAGT_14_T3_I84V_
CGCC
CACC
_29_
T1
CGGCGGTG_21_T3
CTCTTTCT_22_T1
CCTGTGGC_7_T1
CCCA
CAGC
_17_
T3
AAACGCCG_7_T1
GACGGGGG_7_T2
ACTACCCC_12_T2
TGACCAGG_26_T1
CCACTTCT_9_
T2
TTC
GAA
GT_
10_T
3_V8
2A_
TTGTC
AAT_35_T
1
AACTGCTG
_11_T2
TCAT
TAGG
_7_T
2
TGAGACTT_8_T3CCAAAACT_5_T3
TGTTTACA_3_T3_V82A_
TCTT
ACCT
_28_
T2
AAGCGGCG_23
_T1
CTGC
TCGT
_22_
T2
TCTCCCCC_9_T2
CCAACTTT_17_T3_V82A_
CCGAATAA_25
_T3_V
82A_
ACGGGTTA_6_T3_V82A_
AGAA
ACG
C_15
_T2
GCCCCCTG_20_T2
GGTC
CGTA
_15_
T3_L
90M
_
GGAGAAGC_5_T3_V82A_
CCACCTCC_96_T1
CGCCGCGT_6_T2
TCTG
GAG
T_11
_T3_
V82I
TCGTATAC_14_T3
TACC
ATTG
_5_T
3
GTGGAGTG_7_T3
AGCC
CCGA
_27_
T2
TCC
GC
AAA_14_T3_V82A_
TCTG
AGTG
_29_
T3_L
90M
_
AATAGTGG_16_T3_V82A_
TTCCACTA_6_T2GCTATGCG_11_T2
AAAA
CAGT_
14_T1
GACACTGC_14_T2
TTCATG
GT_5_T3
_V82A_
AGCG
CATT
_15_
T1
CCCCCCGA_14_T3_V82A_
TAGCC
AGT_
11_T
2
CCCACCTA_36_T1
GAC
ACTC
T_18
_T2
CGTAAACA_8_T3_V82A_
_A28
V_3T
_22_
TACA
CG
CA
CTCTGCCC_25_T2
ATCTTGGT_16_T2
GCCGCCGG_24_T2
TCGGCTCT_15_T2
GTAGGCAG_17_T1
GAGTATAA_3_T3_V82A_
TTAAATGC
_12_T3_V82A_
ACCCGCAC_11_T2
TGAGCCAC_10_T3_V82L
AAGCACCG_9_T2
GCAATCAA_16_T3_L90M_
CCTG
CATT
_14_
T1
AAATGTTA_14_T3_V82A_
CCTC
TTG
A_15
_T3
CTCC
AACG
_33_
T1
TTGCCTTA
_3_T2
CCGCTGAC_14_T2
CCTG
ATTC
_12_
T2
CCAGCACC_15_T3_V82A_
AAACCAAA_13_T3_V82A_
CCATCGCC_36_T1
CGAT
TCTT
_47_
T1CA
GCGG
GT_7
_T2
TTCGAGTA_18_T3_V82A_
ATGACTTT_7_T2
CC
ATAATA_18_T3_V82A_
GGCTCCGA_9_T3
TAAT
TAG
G_7
_T3
TCAGCGAA_13_T3_V82A_
TGTC
TTTA
_19_
T2
AGCCCTAA_21_T3
CAAAA
TAT_
5_T3
_V82
A_
ACGT
GTAT
_3_T
3
CGTT
TCGA
_31_
T1
CCGA
TTCC
_21_
T2
ATTGAGCA_12_T3
AGTCTACC_18_T3_V82A_
CCCCATAA_32_T1
GATGCCTG_16_T1
GCACCCCG_8_T2
3T_0
2_CT
ATC
GTA
CCCGGATA_3_T3_V82A_
GGATCCCT_10_T3
TCAC
CCCG
_5_T
1
GGCCTATA_8_T2
CAAACATC_6_T3_V82A_
TACGAAAC_12_T2
CCGCCGCT_17_T2
CTCTTGAG_37
_T1
CATC
AACT
_6_T
2
TTCCACTC_15_T3
CGTGTTGA_11_T3_V82A_
TCCCGTCG_29_T2
CTCTGACG_3_T3_V82A_
TTGTTAGG_13_T3
GATG
GAAA_11_T3_V82A_
CGGCATAA_34_T1TAAAAGAA_16_
T3
ACCCGATC_10_T2
TATTGGAA_23_T2
CTTGGGAC_21_T2
CGCC
CTGA
_8_T
2
CACAGAGC_9_T2
CATTCTAC_19_T3
ACACGAAC_20_T3
_V82A
_
TCCATCCT_41_T
1
GCGTTCCC_17_T2
ATGACGGG_14_T2
ACACCGCG
_11_T2
TTTA
GG
TG_2
0_T3
_V82
A_
CCTCTGCA_29_T2
GAA
CGAG
C_17
_T3
ATATCATC
_18_T3_V82A_
CGCT
TCTA
_34_
T1
TGATCAAA_19_T3_L90M_
ATCG
GGGT_7_T1
ACGTTTGT_41_T1
GAGAGAGT_29_T
1
TTATCGCG
_3_T2
CATTTG
GT_14_
T3_V
82A_
CCGTTCCC_15_T1
TAC
GAG
TA_1
7_T2
GGGC
CCTT
_6_T
2
GTTACCTC
_29_T1
TATGCGCT_17_T2
TCCGTTCG
_11_T3_V82A_
CCGTCCCC_7_T2
ACAA
ATCA_
7_T3
_V82
A_
GGCG
GAAG
_23_
T1
TGTACCGG_7_T2GCCT
CCCC
_17_
T2
GGACTGCG_35_T1
AAGCACCA_12_T3_V82L
TCCCCGTT_18_T2
AGGCT
AAG_
14_T
3_V8
2A_
TTGG
CCAA
_16_
T3
ACGGATCT_15_T
3_V82A
_
CGCT
ACCT
_24_
T1
TCGTAACT_9_T2
AATTACTT_4_T3
ATCGATG
G_14_T1
GGAC
ACCC
_12_T
2
ACCGCCAA_5_T3_V82A_
GGGA
GAAG
_14_
T2
TTAACCCC_5_T1
GGACGCGG_19_T3_V82A_
GTACCGAC_20_T3_L90M_
CAGG
GTGA
_12_
T3_L
90M
_
GGGGACCA_7_T3_V82A_ ACCC
CGCA
_15_
T3_L
90M
_
GAGCTTGG_3_T3
ACACCCGT_3_T1
AGG
TTCCC_10_T1
CTCACTAT_11_T2
CGAGGCGA_4_T1
ACTG
ATGC
_6_T
3
CTGCCCTC_34_T2CCAC
CAGC_17
_T1
CGCA
TACG
_17_
T2
CACCAAGG_11_T2
GGACATTG_31_T3AGCGGTCT_36_T2
TCCCCCCG_16_T1CCGAGCCG_39_T1
TAGCATGT_12_T3
TTCCCCGA_3_T1
AGCTTCCC_10_T2
GAAAC
TTT_11_T3_V82I
ACTGTCCC_8_T2
GG
GG
GTC
A_5_
T2
GTGAGACT_12_T3_V82A_
GG
GAG
AAG
_36_
T1
CCG
CATT
T_4_
T2
TCTCGGTC_32_T2
CTTAAGGA_16_T2
GTCCACGC_7_T3_V82A_
TGCACCTA_3_T3_L90M_
TGCG
TAAG
_21_
T3
GTCCCCGC_23_T1
TCACCCAC_21_T3
AGG
TGTC
T_11
_T2
CAGGTGGA_40_T2
CAACAACG_9_T3_V82A_
TCCGAGAT_28_T2
GCCCG
GG
A_11_T3_V82A_AAGGCGGG_26_T2
ACCCCACG_7_T2
CCCCGCCC_6_T2
ACCA
CAAC
_14_
T2
GATCCGTT_28_T1
CCCC
GCCA_1
3_T1
ATTGACTT_20_T3_V
82A_
GCACGAGT_19_T
3_V82A
_
AAACTTG
T_13_T3_V82A_
ACCCCAAT_12_T2
GTTTACGT_3_T3_L90M_
TTTGGGGT_17_T2
GTAGCTTA_9_T3_V82A_
ATTCCGCC_3_T3_V82A_
CCGTTCTC_11_
T2
TTAC
CGCG
_5_T1
TGGAGTAC_9_T3GAGAGGAG_18_T3
GAC
CG
TCA_18_T3_V82A_GGACCAGG_7_T3_L90M_
CACAACCC_11_T3_L90M_CCTCACCA_5_T3_V82A_
_A28V_3T_8_AG
CTG
GTA
ATGCTACC_7_T3_V82A_
ATTGGGCA_6_T3_V82A_
TACAC
TTG_13_T3_V82A_
GCGTTTGT_11_T2
2T_9
_CCT
CCTC
C
TTCTACGT_4
3_T1
TCCCCCGC_22_T2
CACT
CGCC
_11_
T2
GTAACGAG_24_T1
ACTC
CGTC
_16_
T2
CTAC
GGCA
_37_
T1
TTTAGGTG_13_T1
CGGT
CCGG
_7_T
2
TGCC
GG
GA_
9_T3
AGTA
ATAA
_15_
T3_I8
4V_
CGGCCCAC_27_T2
ACCG
TCTG
_30_
T2
ACACGTTC_20_T3_L90M_
CCGTGCTT_17_T3
GGTCTACA_5_T2
CTCGCGGC_7_T3_V82A_
TAAG
AATT
_10_
T3_L
90M
_
AGTTTGCA_10_T3_I84V_
TCATATTA_23_T3_V82A_
GCGT
AAGC
_24_
T1
CCCTGGAG_47_T2_V82I
ACATACTT_3_T3
ATTT
AAGA_
7_T3
_V82
A_
TTCTCGCG_41_T2
AACGCTAC_9_T3_V82A_
AACC
GACC
_25_
T1
TATGTAAT_25_T3_L90M_
CCTGTGTA_3_T2
CAGCCGAG_9_T3_
V82A_
CGAG
TAGG_33
_T1TC
CGTT
TC_1
1_T3
TACTA
AAA_21_T
3_V82A
_
TCGAGCCA_30_T1
CCAGGGTC_19_T3_V82A_
CC
CG
TCTT
_10_
T3_V
82A_
CATTCTTA_23_T2 TGCGGTGG_17_T2ACCAGATT_14_T
2
CCATTACA_22_T3_V82A_
CCCCCGTG_16
_T3_V
82A_
CCCT
CTAT
_20_
T2
AAAC
TGCT
_6_T
1
TAGGTAGA_40_T1
ATACAAG
A_13_T3_V82A_
CGCAGATT_31_
T1
GGTTATGG_17_T3_V82A_
ACGC
TACG
_11_
T2
CGCGGGCT_13_T2
TCC
AAAT
G_1
4_T3
_V82
A_G
AAC
CAG
T_15
_T3_
V82A
_
AGCGGCTT_37_T2
ACCACCGC_31_T2CGTGTCCC_11_T2
AACAATTT_26_T3_L90M_
TTAGCGTG_19_T3_I84V_
TGAAAGTC_3_T3_L90M_
AACCAT
TG_7
_T3_
V82A
_
CCAA
CTG
C_6_
T3_V
82A_
AAAACGCG
_9_T3_V82A_
TCACCATA_13_T3
TCCGCGTG_8_T3_V82A_
GGGC
ACAG
_9_T
1
GCCA
GGCT
_3_T
1
CTG
CTCT
G_1
5_T3
CCCGCTTC_22_T2
TATT
GG
TC_5
_T2
TCCTGGTC_15_T2
CTTCCGGT_9_T2
ACAGGCTT_8_T3
TTATAGTC_3_T3_V82A_
CTCCTCAT_3_T3 CGGCCGCA_31_T2
CAAC
AGGA_
6_T3
_V82
A_
TCTT
GTGG_15
_T1
AAAATAGT_7_T3_I84V_
CCAGAGGA_26_T2
AAGATAGA_5_T3_I84V_
AGGATCCC_27_T1
GGGACACG_25_T2
AGGAGCCC_4_T2
ATACAAGG_13_T2ACTTACTC_22_
T2
ACTG
CCG
C_8_
T2
CTAC
ACAG
_9_T
2AACGTTCA_4_T3_L90M_
ACTGTGCG_13_T2
CGCG
CCTC_13_T2
TATTGATG_3_T3
_A28
V_3T
_22_
TCT
ACT
AA
GATAGAGT_8_T3_V82A_
ACCCCCAG_11_T2
TAATTTGG_15_T3
AGCGTGTA_20_T3_L90M_
AACCTGAA_29
_T1
CCCCGACC_3_T2
AGAG
AGG
T_24
_T2
GG
CACT
TC_2
4_T2
CGGCCCCC_5_T2
AGG
AATAA_7_T3_V82A_
ACCCCTCC_15_T2
GGCTATCA_7_T2
ACTCTTGG_11_T2AC
GCCCC
C_6_T
1CCTC
GGGG
_3_T
2
AAATAAAA_12_T2
TTTT
AGAT
_17_
T2
CGCCCTTC_17
_T1
TCCCCGGC_49_T1
GTCGCACA_7_T3_V82A_
CTCCTTGT_4_T2
TAAGAC
AC_14_T3_V82A_
CCGAAGGC_3_T3_I84V_
CCACAGAG_22_T2
CAGCTAAC_14_T1
ACAAAAGA_14_T3_V82L
TCCC
ACCC
_4_T1
GTACACGG_15_T2
CCCG
CATG
_26_
T2
AATT
AATA
_30_
T1
CAATTTGA_9_T1
TAAAGCAA_3_T3_V82A_
CCGGTGCA_11_T2
GG
GTT
ATA_
5_T3
GAC
ACTT
A_17
_T3
AGAG
TGCG
_17_
T1
TTTA
TATG
_8_T
3_L9
0M_
TGCA
GCCC
_3_T
2
ACCACTCT_10_T3_V82A_
CCGCTGCT_6_T2
TTG
TCAA
C_4_
T3
ACTG
TCTG
_11_T1
GGCCTCCC_33_T2
GAG
TATGC_42_T1
CTTCGGAC_35
_T1
AGTACACA_6_T3_V82A_
GCCGGATA_15_T2
AGCG
CCCC_27_T1
GAAATCGA_14_T3
TCTGCGCG_5_T2
TCCTAGGA_23_T3_V82A_
AGTG
ATAA
_4_T
3
TTCC
ACAC
_17_
T2
CGGTACAC_4_T3_V82A_
GAGG
TAGC
_35_
T1
CATCTAAA_5_T3_L90M_
AAAACGCT_10_T3
CCCG
CACC
_31_
T2
CAGA
CCCA
_35_
T1
ATAA
CTTA
_15_
T2
ATCACGTC_15_T2
CCCCAGCT_15_T2
ATGC
GCCT
_31_
T2
CCAT
CTG
C_9_
T3
GAAATTTT_5_T3_V82A_
ACGGTTAC_6_T3_L90M_
AAACGAAG_3_T3_V82A_
CCCCGTAC_14_T2
_A28V_3T_03_GA
GATAG
C
CTAGGGTT_5_T3_V82A_
GGTC
CGAC
_19_
T2
CCGA
CCCT
_7_T
2
GCCACCCG_10_T3
CCGC
CCTG
_16_
T2
TGGG
GGAA
_17_
T1
TTCGTTTT_11_T2
ACGTGAGG_21_T1
AAAC
CAT
T_10
_T3_
V82A
_
GTGTATAG_15_T3_I84V_
TGTCTCCT_19_T2
CAATGACG_5_T3_V82A_TTCGACGT_18_T3
TATTCAAC_13_T3_L90M_
ACCT
AGTA
_7_T
3
CAGCTCTC_13_T2
ACCGG
ATC_36_T1
ACC
CG
TGG
_17_
T3_V
82A_
ATTG
TGGC
_14_
T2
AAG
AATG
G_1
2_T3
_V82
A_
ACTGACCT_18_T
3
ACAACCAG_5_T3_
V82A_
TCC
TCAC
C_20_T3_V82A_
CAAGGGAC_8_T3_V82A_ACGT
AGGT
_30_
T1
ACCAATAA_17_T2ATTATTGT_13_T3
AGGTG
CTT_8
_T3_
V82A
_
TATCGACC_6_T3_V82A_
GCTGCTTC_25_T2
3T_6_CCTTGCCC
CTG
GAG
TT_6
_T2
CGGGACGA_2
2_T3_V
82A_
GAGC
CGCC
_20_
T2
TTGC
TCAC
_18_
T2
CAGAACAT_15_T3_L90M_ ATG
ATAC
C_5_
T2
CCGGGGCG_9_T2
ACCGAT
TA_17
_T1
AATTACTA_11_T3
TAAA
CGCT_15_T
1
CGTTTGCT_18_T2
GTTCCCAA_8_T3
_A28V_3T_4_ACTAATA
G
CTTC
CTCC
_13_
T2
TACCCGAC_7_T2
ACGCCTTC_6_T3_V82A_
AGAA
CTA
G_1
3_T3
_V82
A_
TCCCCATG_7_T2
CAGA
ACCC
_11_T
3
CG
ACATTA_19_T2
AGCA
CTTT
_11_
T3
GTT
CCG
CG_1
3_T3
_V82
A_
AATA
ACGC_
6_T3
_V82
A_
CGTC
CGCA
_29_
T1
CCAGGACC_18_T3_V82A_
ACCCGACA_10_T3
_V82A_
CCCCCCCG_39_T2TCACCCAT_18_T2
AGCTAGTT_3_T3_V82A_
GGTCTTAC_29_T1
2
TATA
AATT
_6_T
3_L9
0M_
CGCC
GTCC
_17_T
1
ATCA
AGGA
_25_
T3_L
90M
_
TGAACCCA_3_T2
CCCC
CTGC
_3_T
2 GTGC
CCAC
_21_
T2
ACACGATG_21_T3_I84V_
TCTG
TATG_15_T3_V82A_
ACAA
TGGA_
6_T3
_V82
A_
TCAG
GACT
_21_
T3
AACTATG
C_12_T3_V82A_
ATGAGCCC_12_T1CAAGCTCA_12_T3
TTCCGACA_19_T2
CCACTCAC_18_T2
GGGG
TTGC
_6_T
2
ATGAAATG_8_T3_V82A_
CCCCCTGA_3_T3_V82A_
ACCC
CTAC
_10_
T2
CAAAGATG_7_T3_V82A_
CTGC
ACTG
_18_
T2
ACCCAACG_22_T2
AAAGACCG_5_T3_V82A_
CGCGTACC_5_T1
TCAAGCAC_6_T3_L90M_CCCATCTC_7_T1
CGACTG
GT_8_T3_V82A_
TATG
CTA
A_9_
T3
ACCCGAC
C_21_T3
_V82A
_
ACCA
CCCC
_8_T
2
CAAA
TCG
T_7_
T3
TCGT
ACCC
_3_T
2
GCCCGCCT_13_T2
AGCAGCCT_3_T3_V82A_
CGTTACTG_11_T2
CTAA
ATCA
_19_
T2
3T_01_TCGCA
GGT
CCTTGTTC_15_T2
TAATAAAA_10_T3
2T_9
_TGT
CCCC
GG
AAGTCCA_30_T1
ACCGTCCG_4_T2
GGCACGCA_2
2_T1
TTATGCCG_11_T2
GTGC
TCAA
_35_
T1
CCTC
GTTT
_11_
T2
ATGCTGAA_20_T3_V82A_
AAAGG
GG
C_10_T3_V82A_
CCCC
AGCC
_6_T
2
CCCGGTGC_3_T2
CAGGTT
AG_13
_T3_V
82A_
TCCCATCG_11_T2TCGATTCC_8_T1
CATTGGTG_34_T2
CATCACGT_18_T3_V82A_
ATCCCTAC_48_T1
CCTA
CACC
_31_
T2
CTCTAACT_19_T3_V82A_
CTCCCCCG_17_T1
CTACTATC_7_T3
ATGACCCT_12_T3_L90M_
AGTCTAG
G_9_T3
GATGATTA_50_T1
CGTGTCCA_7_T1
TTGCCGTG_7_T
2
TTTACTAT_11_T3_L90M_
TCC
CG
GAT_16_T3_V82A_
CCTT
GG
TT_2
6_T2
TGCT
AATT
_14_
T2
TCTCTACA_5_T2
TGCC
CGGG
_47_
T2
GGCGGTGC_24
_T1
GCCC
TGAG
_5_T
2CC
AGGC
AA_4
_T1
TTCGCACC_14_T1
CTATTCTC_7_T1
AGCC
ACAG
_4_T
2
CCCTGCAG_51_T3
AACTTACT_6_T1
ACGAC
TCG_21
_T1
ACCCCGCG_30_T1
GAATGGCA_9_
T3_V82A
_
CACCCCAC_18_T2
GAAC
GGGA
_9_T
3_L9
0M_
CCGA
GCCC
_6_T1
ACTGACAC_8_
T3_V82A
_
CCGC
CAAT
_15_
T1
GGATTTCG_33_T1
TATGGGAC_8_T3
ATACATTA_9_T3_I84V_
CCTCCTCC_11_T1
TTCG
GATG
_8_T
1
CCTAGCCA_
18_T3_
V82A
_
ACAA
GAC
A_11
_T3
ACTATATT_14_T3_V82A_
ATACCGTG_3_T3
TTCGCCAC_4_T3_I84V_
CGCATGCG_23_T3_L90M_
CTTTTCCT_15_T2
ATAG
CTG
A_8_
T3_V
82A_
CGTC
CGCA
_38_
T2
ATAGGTG
T_21_
T3_V
82A_
CATG
GCCT
_18_
T2
GTTTAG
CC_12_T3_V82A_
CATTCTTC_7_T2
ATACGTGC_6_T3_V82A_
GGCCGCCG_7_T2
CACACCGC_20_T3
_V82A_
TTCGTT
CT_12_T
3_V82A
_
CTCC
CGGA
_33_
T2
GCCGCACG_18_T2TAGGGATA_9_T3
3T_4
_TTC
TTCC
A
CGGTTGTA_24_T3
GGCCCACG_12_T2CCCCTATC_8_T
2
GTATCCAG_7_T3_V82A_CAACTGTA_25_T2 ACGAAATG_4_T3_V82A_
CCACGGCC_16_T2
TAGCCCAC_14_T2
AAAA
ACCT
_10_
T3_L
90M
_
ACGCGCAT_4_T3_V82A_
CCGGAGCC_3_T2
AGAG
ATAA
_36_
T3_V
82A_
TCATCTTC_31_T3
GTTTAAGA_23_T3_L90M_
ACCCACCT_10_T1
CTGCAC
AA_13
_T1
TCCGCCAC_13_T2
CATG
TCG
C_44
_T2
ACAC
AACC_20
_T1
GCCCCG
GG
_13_T2
ACCC
CCCT
_6_T
2
TATGCCCC_15_T2
GGTC
CGGT
_21_
T3
GAG
CACA
G_1
3_T3
GAAGCAAC_17_T3
AATTACTG_17_T3_L90M_
ACCA
TGAA
_7_T
2
TGCACCCT_11_T3_I84V_
CTCCTTAA_20_T3GTTGTCAG_5_T3_I84V_
AACGTCCT_12_T3
CTCTGCCT_20_T2
TCATGTGC_16_T3_I84V_
CCTCCACC_15_T3
AGCTCATA_16_T2
CTAAGATA_16_T1
AAACTAAA_11_T3_L90M_
AACC
CGTA
_15_
T3_L
90M
_
TCTGGTAT_57_T2
CCTC
TCGC
_13_
T2
GTGTAAAG_11_T3_L90M_
CACT
TCCT
_23_
T1
GGATGACA_28_T2
GG
TTG
TGA_
21_T
2
GAGATTTC_16_T3_V82A_
AACGTATT_10_T3_V82A_
TGTTTATG_6_T3
TCGCCTGA_35_T2
GCGCGGCG_5_T3_V82A_
TCGCGAGC_5_T2
AGAACCAC_12_T3GATGACAT_4
6_T1
CCCCTTGC_13_T2
CCTCACGG_16_T2
TTACCGCG_8_T2
TGACCACA_37_T1
AAATTAAT_5_T3_V82A_
AGG
GAAAA_11_T3
AATCCCCC_8_T1
ACGGCAAA_9_T3
ATCCATCC_12_T1
GCCCGGGG_17_T2
AGAG
TTTG_12_T3_V82A_
CATTTCCC_34_T1
GGCAAAGT_3_T3
CAACACGT_9_T3
TATC
GCGA
_18_
T3_L
90M
_
AGGACGAC_15_T3_V82A_
ACCG
TGG
C_8_
T2
TCCCGCCC_21_T1
ATAA
ATTG
_8_T
3
GGACGGGA_25_T2
CGGCCGGG_11_T1
3T_1
1_CC
GCTT
CT
CTT
TCAC
A_20
_T2
GCTCGGCC_23_T2 CGAGATAT_6_T3_V82A_
TTACCATG_7_T2
ACTTTTGG_5_T
2
CAGC
TGCT
_19_
T3
CGTAGCGA_8_T2
CGGACCCG_3_T2
ACTCACTC_21_T2
ATAGGTCG_13_T3
_V82A_
GATTTCAT_10_T3_L90M_
CCATCGAG
_26_T1
TCCAAACA_17_T2
TCGCTCGG_5_T3_V82A_
GCTGATCC_37_T1
CCTAGTAG_45_T1
GCC
GTA
TG_1
3_T3
TAAGTTAC_20_T3
CGAAGTCG_17_T3_I84V_
AATTTAAA_6_T3
ATCT
GG
CC_4
2_T3
TCAT
CGG
G_1
9_T3
_V82
A_AGC
TAACC
_21_T3_V82A_
ATTAGAAC_11_T3_I84V_
CTGAACAT_5_T3
AGTGCCCG_17_T2
TCTGGTCG_9_T2
CTCTAATT_3_T1
GCG
AACAT_12_T3_V82A_
ACCCGCCT_5_T2
GTGC
CAAT
_39_
T1
ACCC
CACA
_18_
T2
TTTACC
CT_11_T3_V82A_
CCCTCTCC_35_T2
GGCTACAC_12_T3GACCCCGC_6_T2
AGAA
GCGC
_9_T
3
TAAA
GTC
A_29
_T3_
V82A
_
ACACGTAA_59_T1
CGCCCAAA_13_T3
GTAGAAAG_8_T3_L90M_
TTGAGGGA_7_T3_V82A_
TCTCCCCT_26_T1
2T_82_TTCCAGCT
CCGG
TGG
A_11_T1
AAAACCCC_9_T2
CTAAC
TAC_12_T3_V82A_
TCTGTTCT_11_T2
CG
CG
ACAT
_13_
T3_V
82A_
CCCC
CCGC
_3_T1
CGTT
TTGC
_17_
T2
GCAAGACT_23_T2
CTCGCCCA_6_T2
GG
TAAT
TC_5
_T2
GCCTCCTT_8_T3
ATACTATA_18_T3_V82A_
CAGA
GTCT
_45_
T1
CGGGCAAG_42_T1
CTCGTACA_23_T1
CTTA
AGCG
_3_T
3
TGCG
CCTG
_14_T
1
CCTC
ATCA
_6_T
3
CCAACAGT_6_T3_V82A_
ACG
GAA
AA_1
2_T3
CACCTCCC_5_T2
CG
TATTTT_5_T3
GCACGTTG_9_T2
CGGCGCGC_12_T2
AGAC
GCC
T_16
_T2
GAATAGCG_6_T1
TACT
ACG
C_10
_T3
GTCG
GTCT_13_T3_V82A_
AAGTCATG_17_T2
TCATGACC_32_T2
TGTAG
TTG_7_T3_V82A_
CCG
GAA
AC_5
_T3
GCCC
GACG
_13_
T2
TTCTCTAT_18_T2AACAC
CCA_27_T
1
TGTT
GAAC
_5_T
3
AACACCCC_5_T3_I84V_GCC
CACA
C_42_
T1
CCTATTGG_13_T2
TATTTAAA_3_T3_V82A_
ATTAGTAG
_8_T3_V82A_
GCGA
GGGT
_3_T
2
CAAG
AGCT
_29_
T2
GTGCACGT_3_T3_V82A_
ACGGGATC_10_T2
CGCACTTC_3_T2
GATCAAAG
_11_T3_V82A_
TGTC
AAG
C_1
5_T3
_V82
A_
ACGGAG
GG_14_T1
GTCGCCAC_9_T1
TTTT
CGTA_19
_T3_V
82A_
AACCCCAC_5_T3_V82L
TGAT
GCCG_9_T3
_V82A
_
CTGCCATA_54_T1
CGCGAGCA_7_T3_V82L
TTGATCAG_8_T3_L90M_
AACAAAAT_12
_T3_
V82A_
ACCCCTCC_15_T3_V82A_
AACATGTA_10_T3
ATCAATAC_4_T3_L90M_
ATTG
TTTT
_9_T
2
GCGGTATC_31_T1
CAGTCGAC_12_T2
AGTCGCGG_23_T3_V82A_
CTATAGTC_14_T2
GACTGCCG_25_T2
CTC
GTAG
A_14_T3_V82A_
CGATGTTG_24_T2
CCCG
GGCG
_22_
T2
CTTGTTCG_15_T2
CTCA
CCG
C_3_
T3_V
82I
TATTCACC_11_T3_L90M_
CACCCGAG_24_T1
CGAAACCA_11_T3
_V82A_
AGCAGGAG_22_T3ACCTAAGT_7_T3
CGAAATGT_15_T3_L90M_
TTACTGGA_6_T2
ACCCTG
CA_21_T
1
TATTAGTC_23_T3_V
82A_
AAGAAATG_6_T3
GCGTAAGT_9_T3_L90M_
GACCAACC_15_T3_V82A_
TCCGGCCA_6_T2
TATTGACG_14_T3_L90M_
TAAG
AGGC
_23_
T2
TAAC
CACA
_11_
T3
CCGCCATT_2
0_T1
GACCCCTA_9_T3_L90M_GGATGAAT_18_T3_L90M_
CCGCCCCA_18_T2
CCATTAAT_2
3_T1
GCGCTTAA_15_T2
GGTAACGT_3_T3_L90M_
GATCCTCT_32_T1
AGCGGGAC_38_T1
GGCACC
CC_5_
T1
AGATATAT_15_T3_V82A_
GAAAACAG_9_T3_V82A_
CGCTCTAG_10_T2
GAC
GC
GC
T_17_T3_V82A_
ATG
AAAA
A_3_
T3
CTTAGCCC_25
_T1
TGAA
GCC
A_17
_T2
CGTAATTA_7_T3
CTAAGATG_14_T3_L90M_
TTCGCACC_9_T2
CGCAACAA_8_T3
TCCCGTTT_15_T1
TCCGCGGT_7_T3
TAGTTAAA_8_T3_V82A_
TTACTCAC_26_T2
CTCT
CTG
C_4_
T3
GAC
TATCT_14_T3_V82A_
GATCTCCA_8_T3
TCTT
TCG
A_48
_T1 CCGCAGCA_13_T2
GTG
TCTA
G_1
7_T2
CGCCCCAC_3_T2
AGGC
TTCC
_8_T
2
GGGTGTGC_12_T2
TATG
CACT_15_
T3_V
82A_
CAGGAAGT_5_T3_L90M_
CCGGACCG_16_T2
CCAT
TCGC
_7_T
3_V8
2A_
TGTA
TTAA_1
6_T3
_V82
A_
TCGCACAA_3_T1
ATTC
GTT
C_14
_T3
ACCGCATG_31_T1
AAACTCGC_32_T1
TAGCCGGA_30_T1
CCGAT
TTT_
32_T
1
CCAGGTTG_3_T3_V82A_
CTACAACC_47_T1
GG
GTTAAT_3_T3_V82A_
ACCCACCC_12_T2
GG
ACAT
GA_
12_T
3_V8
2A_
GGTGTATA_3_T3ACTT
GG
CC_8
_T2
ATACCCGC_19_T1
CTC
CAC
GA_18_T3_V82A_
TCCCGATC_6_T3
TGCAGCGG_16_T3_L90M_
GCAGGACC_23_T2
TCGGGGGA_9_T2
ACCT
AGCC
_20_
T3_L
90M
_
ATCCCTAG_7_T3
CC
GATTAC
_17_T3_V82A_
AACACCCA_12_T3_V82A_
GTGGCGCC_18_T2
TCG
TGAA
G_8
_T2
AACCCCAG_5_T1
CCCC
CGCG
_12_T
1
TCACTACG_14_T2
GAC
CGCT
T_7_
T2
GCCCGCGC_24_T2
AGAAAACA_7_T3
CATAACCC_9_T3_V82A_
ATCC
ACGT
_31_
T1
TTGCACAC_12_T3
CAATAGAC_7_T3_V82A_
AGCT
GGGC
_26_T
2
AGTCCCCT_4_T3
AAAAGTCC_5_T3_V82A_
TCCCGAC
T_25_T
1
TCGCACG
G_15_T3_V82A_
ATGCG
ACC_37_T1
ATTTATGC_8_T3_V82A_
TTCACTGA_19_T1
CTAG
TTGG_18
_T1
TGCACCCG_17_T2
GACTTAAT_15_T3_L90M_
GTG
ACTG
T_19_T3_V82A_
GGAGCTGC_13_T2
GTCA
CTCC
_22_
T3
GCCGGACC_15_T2
CGCC
CGCT
_19_
T2
ATTA
CCAA_9_
T1
ATTTTACT_4_T3_V
82A_
AGTTGTCG_5_T3_L90M_
CCTCTCAA_32_T2
AATCAAGA_7_T3_L90M_
TCCCGGTC_21_T2
CCCAAAAT_5_T3_V82A_
AATAAAGG_9_T3_I84V_
ATAGATAA_10_T3_V82A_
ATCT
CCTT
_7_T
2
ACGGCCTC_15_T2
TGTCTGTT_3_T3_V82A_
GCT
CTG
TA_3
_T3
ACTG
AAGC
_18_T3_V82A_
CGTACACA_12_T3_V82A_
TTGTAAAT_24_T3_V
82A_
TGCTTGGT_15_T2
GTTAC
AAG_3
_T3_
V82A
_
CTG
ATCAG
_16_T3_V82A_
TACC
CGGT
_16_
T2
GTTTCCGA_24_T2
CC
ATTG
GC
_16_
T3_V
82I
TAGGGACG_23_T3_L90M_GCCATCCT_11_T3_L90M_
CAACACAC_36_T1
CTCCCGTA_18_T2
TACTACAC_20_T1
CGGTGCTA_4_T2
TTTCGTCA_13_T2
CCTTCATA_7_T3_I84V_
CCCACCCG_21_T2
CCTCATCC_5_T2
AATACG
AT_17_T3_V82A_
GAACCCAT_19_T3_V82A_
GCCTAAGA_13
_T3_V82A
_
GCCGGCCC_3_T3_L90M_
GGTCCTCT_22_T1
CTCCATTG_42
_T1
AACC
CCCA
_13_
T1
CTG
TATTT_16_T3_V82L
TCTTCAAT_5_T3_V82A_
TGCC
CAGC_
22_T1
AAGGCTAG_6_T1
CGACTCCC_28_T1
GTCCCTGC_13_T2
GTCCCCGC_8_T2
CGAGGCCC_24_T2
TATCCGTA_38_T1
TCCCCCGT_13_T1
TGTGCTTT_4_T3_L90M_
GCTTCCGT_16_T2
GCT
GTC
GC_
14_T
2
GTCCGAT
G_16_T3
_V82A
_
CTG
TTCC
G_2
5_T2
CGGAGGCC_24_T2
CCCTCCGC_5_T2
ACTTATTT_12_T3
TGCG
GCTC_12_T3_V82A_
CCACCAAC_23_T3_L90M_
TTGGGGCC_7_T
2
TACTAGTG_10_T3_V
82A_
CTGA
CCCG
_3_T
3
TGTGGTCT_6_T3_V82A_
ACCCCCAT_17_T1
TTGTAAC
T_13_T3_V82A_
CTCAGCAC_22_T1
CTTCTGCA_7_T1
CCCC
TACT
_12_
T2
CTTACTG
A_14_T
3_V82A
_
TCCCCCCG_10_T2
GTCACCCC_5_T3_L90M_
CCATAGAA_3_T1
CAATATCT_3_T2
CAAGTAAC_10_T3_L90M_
CTCC
GCGC_
11_T1
AAAGTTAA_11_T3_L90M_
GCGCACCC_16_T2
CTTCGATC_21_T3
CAGA
CGAG
_8_T
2
GGGGTCGA_5_T2
GC
AGAT
TT_1
5_T3
_V82
A_
AGGGCCGC_22_T2
TTTT
TGCG
_7_T
2
CACA
AAG
T_5_
T3
GAGTTCTC_44_T1
TCCC
GTC
T_6_
T2
ATTG
TCGT_
16_T
2
TTGTT
CAG_23_T3
_V82A
_
AAGACCGC_22_T3
CTCCTC
GC_10_T
3_V82A
_
ACCCAGTG_11_T3_V82A_
ATACCTGC_36_T1
TCGGAG
TC_1
2_T3
_V82
A_
AGG
GC
AAG_19_T3_V82A_
ACAGCCCA_36_T1
GAATGTTC_16_T3_L90M_
TCTCCGCC_3_T2
TCGGAGAA_6_T3
AACCCTGC_38_T1
TCTCGTTC_14_T2
CGCG
TTGA_
9_T2
CTTTGTCG_3_T1
GCCTGCCC_10_T2
TAGTCCCC_9_T2
CGCA
CAAT
_6_T
2
_A28
V_3T
_22_
TATA
AGA
G
CCCT
CCCC
_29_
T2
CAACTAGT_18_T3_L90M_
AAATAAAA_15_T3_V82A_
CCGG
TGCG
_24_
T2
CGCCCGCA_12_T2
CCAA
GAGG
_42_
T1
ACTGAG
GT_24_T3_V82A_
GAACGGCC_8_T2 AAAGAAAC_10_T3
CAAG
CGGA_
7_T3
_V82
A_
GCCTTGCG_9_T2
CGGGGATT_21_T2
CTAAG
GTG
_21_T3_V82A_
AGGC
TGCG
_3_T
3_L9
0M_
GAAAG
ATT_16_T3_V82A_
CTG
CACC
G_7
_T3_
V82A
_
GAATCTA
A_9_T3_
V82A_
TCGACCCA_19_T3
_V82A_
AGATAC
CA_18_T3_V82A_
CGGATAAA_18_T3
CTCTCCAT_54_T1
ATAAAAAG_9_T3_V82A_
CCGATCTA_10_T3_L90M_
GGACGGGA_9_T1
CAGATGCT_4_T2GTTACCCC_22_T2
TTGT
CTAG
_36_
T1
GGAG
CGAG
_3_T
3
GCCACTCG_41_T1
ACGG
GATT_12_T3_V82A_
CGACGTCG_21_T1
GCGTGATC_36_T1
CTACCCAA_9_T3_L90M_
CAGC
TCCC
_21_
T2
GGTCCGGT_22_T1
TAC
TAAT
A_13
_T3
TCCGGTTA_43
_T1
CGGAGGCC_10_T3
TCCGCACG_12_T2
CACCCACA_19_T2
_A28V_3T_92_CA
CTTCA
G
GCCTCCTG_18_T3
GGACTGGG_6_T3
3T_51_ACT
GGAAA
TGCATG
CC_13_T3_V82A_
ACATTTGC_5_T1
GCCT
ATTG
_31_
T1
ATCACAAT_22_T3_V82A_
TTAAACCC_4_T3
CGGC
CCGC
_26_T
2
GGTCTACC_8_T3
TGTG
CTTG
_20_
T2
TAAC
CGCG
_28_T
1
CCCCGTAT_15_T2
GACAAG
CT_9_T3_V82A_
CTCTCCCG_20_T2
ATCCTT
CA_19_T
3_V82A
_
CTCCCCCG_24_T2
AGCGAATT_11_T3_L90M_
GACCGTCC_4_T2
CGTCACGC_14_T2
CACT
TTG
A_19
_T1
GAATAATA_12_T3_V82A_
AACGCCCT_13_T2
GATCATAC_14_T1
TTTTGAAC_38_T1
GATTTCTC_31_T2
TCGAT
TTA_
5_T3
_V82
A_
GCCCGGCG_36_T2
CAAGAACC_10_T3_I84V_
GTCAAACA_19_T3
ATTG
GCAG
_37_
T1
_A28
V_3T
_22_
TAG
CC
CAC
CTTCGTGT_19_T3
AGCG
AAAG_18_T3_V82A_
GCCG
AAAA
_12_
T3_V
82L
CGTTCTAA_32
_T1
GG
ATAG
AG_5
_T3_
V82A
_
2T_4
_GCG
GGCG
G
GTCGGAAT_4_T3_L90M_
GTGCAACG_4_T2
AAGA
CGGG
_6_T
1
AGCT
TAGT_
4_T3
_V82
A_
ACTA
AAC
A_21
_T3_
V82A
_
GCTGCGAC_23_T2
CTGTTACG
_11_T2
ATG
ACTC
T_3_
T3_I
84V_
ATTG
TAG
T_16
_T3
AAGTG
GG
G_6_T2
GCACCTTG_52_T1
TGCC
CTTT
_38_
T1
CTGTCTG
T_11_T3_V82A_
CGAC
TCGC_
13_T1
AAACATAA_16_T3_L90M_
CGCCCTCC_21_T1
CTCCGCCG_35_T1
AATATACA_33_T2
GGTTGTGG_8_T
2 CACGCGAC_14_T2
GTC
TTAAG_21_T3_V82A_
GTCCCCCT_11_T2ACCCTCTC_3_T1
ATCAAGGG_9_T3_L90M_
TGAACCCA_32
_T1
GC
ACAAAG
_17_T3_V82A_
TCTGCGAC_22_T2
CCCC
CCAG
_4_T
2
TAAGTGTC_3_T3_V82A_
ATAA
GCC
T_12
_T2
AGGGGTTT_10_T3
GGACAAAA_31_T3
_V82A
_
ATTC
TGTC
_19_
T2
_A28
V_3T
_32_
AAAT
CGT
G
ACTGTTGA_39_T1
CGGA
AACC
_9_T3
CCACCGCC_35_T2
GCCCGGCC_31_T1
CCCTTAGT_8_T2
CAGGCGTG_4_T3_V82A_
CGGTCCTA_16_T2
A28V_3T_13_AC
CC
GTAT_
ACAACAGG
_33_T1
TATGG
ACA_17_T3_V82A_
2T_3
_ACT
CGA
CC
CTGCCTAT_3_T3_V82A_
GCGATCCG_17_T2CGTACCCC_9_T
2
ACCTCCCG_9_T3_V82A_
TATAATTT_15_T3_V82A_
TCAA
ATTA
_5_T
3
CCAGCAAT_30_T2
TACCCCTA_27_T3
ATCACTTA_19_T3_L90M_CAAATGCC_8_T3_L90M_
ACGCAACA_20_T3_L90M_
AGGG
AGCG
_7_T
3_L9
0M_
GTAACG
TA_11_T3_V82A_
AGCCTACG_21_T2
GG
GTC
TAA_
6_T2
AGGCG
CCC_1
7_T1
TAAACTGG_9_T3_L90M_
AACACTGT_3_T3_V82A_
CATTTTAA_6_T3_L90M_
GTCCGCGG_4_T2
ATGGTGGT_13_T3
ACGCA
CCC_
13_T1
CATATTTA_13_T2
GGGGAAAT_5_T3_I84V_TTG
GCGCC_6_T2
TCCCTCCC_6_T2
ATGTTGTT_3_T3_V82A_
TGG
ACCTG_43_T1
CGG
CGCCT_12_T2
CCCTGAAC_8_T3_L90M_
CCCC
AGCG
_5_T
2
GAT
TAAG
A_10
_T3
ACATGATT_16_T3_V82A_
CTAAAACG_3_T3_V82A_
CGCACGTG_16_T2
AATA
CTTC
_27_
T1
CCTCCGCG_17_T2
ACCCGCGC_7_T2
GG
GAC
CCT_
3_T3
AAACCCGA_11_T1TT
TGCAT
T_36
_T1
GATTGCAT_19_T3_L90M_
CCGC
ACCG
_23_T1
CTATTATG_4_T3
TGTA
GGTA_2
3_T1
CATAGGTG_10_T3
GACAAGGG_13_T3_L90M_
GGTGGATG_29_T1
GATA
CCCC
_40_
T1
CACTTGG
A_30_T2
TTAGTGCG_33_T1
ACCT
GCTA
_8_T
1
TCCCTGAC_27_T1
ACTACACC_9_T3_V82A_
GTAATG
CC_10_T3_V82A_
CCACGGTG_3_T3_L90M_
GTCCTCAC_28_T1
CCACTCTC_18_T2
AGGGCCCC_4_T2
ATCTGGGG_25_T2
TCTC
GATC
_32_
T1
CTGTCTTT_13_T2CGCGGGGT_3_T2
AAGGCGTG_29_T1
GAA
CTCG
C_10
_T2
GAGCAGAA_22_T3_I84V_
GCG
ATCCT_21_T3_V82A_
TAAG
TGGA_
8_T3
_V82
A_
ACG
ACTG
C_16_T3_V82A_
AAAG
GAC
A_9_
T3_V
82A_
ATTACAGC_14_T3_I84V_
TCCG
TTCC
_3_T
1
CACACGAG_17_T2
AGCAGCCC_21_T3
TGTA
GCGC
_5_T
1
CCCCCGAC_22_T2
GCCGGTAT_33_T
1
TCCTGTGG_3_T2
GTTTAATG_13_T3_L90M_
AGCCCCAC_3_T2
TACG
CAAG
_9_T
2GGATCCTG_25_T2
CTCT
TACC
_27_
T1
CCGA
GTGT
_42_
T2
ATCT
CCAT
_5_T
3
TATAGATT_24_T3_L90M_
AAGA
CCTC
_22_
T2
CCACCCGC_29_T2
GAAGTCAG_9_T3_L90M_
GACG
ATGC
_18_
T2
CCGTCGGG_42_T1
_A28V_3T_72_AG
GCATAT AG
CCCC
AA_4
_T3
GCAGTTCG_34_T1
GCG
TTCC
G_1
3_T2
CGAGACGG_16_T1
3T_4
1_AA
GAA
GGT
CCTGACCC_5_T3_V82A_
GGAATTAC_7_T3_L90M_
GTCTCGCC_65_T1
CTTCTCCT_22_T2
GGATGCAG_30_T1TCCTATCC_11_T3
ACCGATGT_5_T3
AGGGCCAT_18_T1
GG
AGCCG
C_14_T2C
TGTC
TAC_7_T3_V82L
TCCT
CGCG
_9_T
2
GCCCCCCC_5_T2
ACAACCAT_13_T3
GCGTCTAG_5_T3_V82A_
CAACCTAC_29_T2
CGAG
GATA
_22_
T1 GTCCAGGA_3_T3CCTG
GCT
T_6_
T2
TAAG
GAT
G_3
3_T2
CATCCCGG_17_T2TGTTC
CGC_15_T1
ATACACCG_13_T3_V82F
GAGCATCC_18_T2
GCG
GG
GAT
_4_T
2
TCAC
TTGA_18_T3_V82A_
GCCACCGC_14_T2
GGCGCCTG_8_T2
AAGCTGAA_18_T2
AGAAAGCG_9_T3_
V82A_
GTCCCCAC_26_T2AGTCGACC_20_T2
GGCCCATT_14_T3_L90M_
AGCCGTTA_17_T3_V82A_
ACGTG
GAT_10_T3_V82A_
AGTGCCCT_49_T
1
CCCA
GCC
T_12
_T2
TCAT
GCGC
_28_
T2
ACATACCC_19_T2
GTA
TGCC
T_31
_T2
TTTCGAGC_10_T3_V82A_
GCCGCCAC_5_T1
CTT
CC
CG
A_5_
T3_V
82A_
CCAAAAGC_18_T3_V82A_
ACGCATAT_10_T1
TCCTGATA_5_T3
CGCCTCCC_7_T2
CAACACCG_35_T1
CCACGTTC_3_T3_I84V_
CGATCCCC_10_T2
CAGGACAG_8_T2
CCGCGGCG_3_T1
AAGTCCAT_12_T2
CCCCCGCC_8_T1
GCAT
TTGA
_27_
T2
ATG
TATG
G_9
_T3_
V82A
_2T_22_TTC
GTGCT
TACGTCCC_3_T3
CTCACCAG_13_T3
_V82A_TTG
CCTGC_14_T2
GCCTAAGT_22_T3_V82A_
GAAGAAAT_10_T3
AGACCCAC_12_T3_V82A_
CCCGATTG_3_T3_I84V_
ACGTCCAG_4_T3_V82A_
CCCCCGAG_9_T3
CCCCTGCG_17_T2
CCCACACC_14_T2
ACCCCCCG_13_T2
AGAC
CCAC
_31_
T2
CGCC
CTCG
_9_T
2
CGCG
TCTC
_9_T2
TGTAATG
T_16_T3_V82A_
ATAAACAA_3_T3_V82A_
AGCTCGCG_21_T2
CATCCGCA_4_T1
CCACATCC_5_T1
CTGTACGC_17_T2
GTACCCAA_3_T3_
V82A_
TGTCGCCC_3_T3_I84V_
TGATTCCT_19_T3_L90M_
CCGTAATA_10_T3_V82A_
ACTC
AACC
_35_
T1
CCCTAGCC_41_T1ATTAGACA_8_T3
GGGC
AGCG
_12_
T2
ATACGCGG_28_T2
CCGCAG
GA_4_T2
CCCGACCC_31_T1
TCCTCCCT_30_T2
GTTGATGC_22_T3
ACCTTCCT_11_T3
CAACCTCG_11_T2
CTCGACGG_8_T3_V82A_
GATACACC_22_T2
CACTTAGC_12_T2CTATCGCC_13_T1
_A28V_3T_82_AG
CAAC
GT
ACTTTGTA_19_T2
CAGG
CGAC
_23_
T2
CTGCAGCT_19_T2
CTTTAATT_25_T1
CTTC
CCCT
_13_
T2
TATGGGTT_9_T3_L90M_
CCTC
CGGG
_11_
T2
CCACCCCA_13_T3_V82A_
TCTT
GCGC
_11_
T2
GCTCCCTA_10_T2
AAACCTAC_21_T1
GAACCCAC_5_T3_L90M_
CAAA
GTG
A_16
_T2 GCGTACTG_16_T2
GAGTTGAC_12_T3_V82A_
TGAA
CTCA_
11_T
3_V8
2A_
GCAACCTC
_25_T1
TCC
TATT
A_10
_T3_
V82A
_
GG
TATTGG
_15_T3_V82I
CGCC
CTTC
_25_
T2
ACGCCGCA_11_T3_V82A_
CGGCGCCC_21_T2CCCCCAGC_21_T2
TTTCAC
AT_16_T3_V82A_
ATAC
ACCC
_16_
T2
TAAA
CACG
_21_
T2
CCGCCCAC_11_T2
GCT
TCCG
G_2
3_T2
AGTTATA
A_16_
T3_V
82A_
TACAAGAG_19_T1
TTGCCTTC_14_T2
AGCCGACC_9_T3_V82A_
CCAGACTC_17_T2
TGTTGCGG_33_T2
CCCGTGGC_20_T2
_A28
V_3T
_42_
AG
CTTT
CA
TATCGCTT_8_T3_V82A_
CCTA
GTT
C_3_
T3
GTAAAATG
_6_T3
GGTGTACG_15_T2
GTCCTTCT_7_T3_V82A_
CCTCGCCG_33_T1ATTTGCTT_19_T2
CCCC
CCGG
_7_T
2
CTTTTGCG_9_T1
GGTCCATT_15_T1
GTCCCAGT_39_T1
GCTCT
CTC_
12_T2
CCCCCGGC_39_T2
TCTATATA_10_T3
GTACTCG
G_16_T2GT
ACGG
GA_1
2_T3
CC
TGAC
AA_7
_T3_
V82A
_
CGGCCGGT_21_T2AATA
GGGC_8_T1
GAGCCGGA_12_T2
ACCGCACT_33_T1
CAACGTCG_7_T2
CAAT
CCAT
_17_
T3
GCGCGCGG_27_T2
GCTTCGAG_8_T2
AGCG
CTCA
_3_T
3AG
CCAA
CC_7
_T1
TCCTCACC_3_T2
GTAG
GGAA
_9_T
2
TGAA
CGCA
_18_
T2
TGGAAAGA_7_T3_L90M_
CCCC
GAGC
_5_T1
CAGCCCTC
_15_T1
TCATGTCA_36_T1 TACATAGT_11_T3
AAAGAGAA_12_T3
ACCCTTTA_5_T3_V82A_
GCGCCGCT_29_T1
CGATTTCT_4_T2
GCCATG
GG
_13_T3
TCGCGGCC_3_T2
ATCC
GCGC
_10_
T3_V
82L
AACGAGTT_18_T3_L90M_
TTCCATTG_6_T3_V82A_
CAATGAAC_19_T3_L90M_
GAGTAAAG_8_T3
TTTGTAAG
_15_T3_V82A_
TATG
GAGC_10
_T1
TTGTAAGC_11_T3_L90M_
GCGG
TGGC
_17_
T1
CCCA
CAAC
_11_
T2TC
CGAA
GT_
17_T
2
TAAA
GTC
C_13
_T3
TAAATTAA_6_T3
ATCGTG
TG_7_T3_V82A_
ATGCCGCC_9_T2
GACCATGC_5_T2
CGTTTTTT_9_T2
CCGCCCTA_34_T1
GAGG
CCCA
_29_T
1
GAG
TTAGA_36_T1
CGGC
TGCG
_11_
T2
AGAG
AACA_19_T3_V82A_
CCCC
GCAC
_7_T
2
AGG
AGG
AC_17_T3_V82A_
TGAT
TCAT
_16_
T3_L
90M
_
TCTACATC_14_T3_V82A_
2
GTAATAAT_20_T3_V82A_ACGTCTGT_9_T2
CCGTCCTA_26_T2
CAATCGTA_11_T3_I84V_
GCGG
TCAG
_7_T
2
GTTGGCCG_21_
T2
CCTG
AAAC
_10_
T3
ACGTTATG
_32_T1
ACGTCGCA_28_T3
_V82A_
CATAAC
TA_14_T3_V82A_
GCA
GCT
CC_5
_T2
CATGATAT_18_T3_L90M_
TATG
GAC
T_12
_T3_
V82A
_
AACTATGT_6_T3_V82A_
GCCTCTCG_7_T3
TCAA
TACG
_6_T
3
ATTG
GAC
A_17
_T2
ATG
GTG
TA_9
_T3_
V82A
_
TAGTA
AGC_35_T3
_V82A
_
TGAG
CCGC
_24_
T2
CCCCTCCC_38_T2
AGGC
GACT
_12_
T3
AGCCGATC_18_T3_V82A_
CG
GG
TCTT_7_T3_V82I
ATTCGACC_17_T3
CAAAGATC_11_T3_V82A_
CTAG
TCG
G_19_T3_V82A_
ACTG
TTTG
_6_T
1
AAGCTCGC_3_T1
TGTGTATC_14_T3_I84V_
ACGC
TGTC
_44_
T1
TAGGCCCC_4_T2
TTGG
AGTT_14_T3_V82A_
TCAGCTTG_5_T3_V82A_TAAACAG
C_11_T3_V82A_
GTA
AGCG
G_1
4_T3
CACC
AACA
_3_T
2
AAAA
TGAG
_9_T
3
GACCCCCC_3_T3_V82A_
ACAGAGTT_10_T3_L90M_
GACTAGTA_8_T3
CGCGAACT_5_T1
GTCCCATG_31_T2
CCCGGCGG_4_T2
TGCGCGCC_18_T2
TGTGGTAG_12_T3
CGAT
GCA
G_1
7_T2
ACGTCATC_13_T3
ACCCTATA_8_T3_V82A_
CCCGCCAG_12_T2
CGGTTGCA_13_T3_L90M_
CCCC
GCA
A_6_
T2
CGCACAAC_15_T2
GGCCCACC_28_T1
ACTTAG
TA_19_T3_V82A_
ATTG
TGGT
_3_T
1
GGAGGCAT_25_T1
TTAC
GG
GT_
8_T3
CAAA
CGAG
_5_T
3
ATCCAGCC_30_T1
TCAAC
CAA_13_T3_V82A_
CCCCCCCA_7_T3
AAACACAT_38_T3_V82A_
GTCGGACT_7
_T3_V82A
_
TTCCTCCG_10_T3_I84V_CTAGCACC_6_T
2
GTCTACTG
_15_T1
CCGC
AGGT
_7_T
3_L9
0M_
ACACGTGC_66_T1
CCGACGAT_24_T2
GCCCCTAT_10_T2
CCGGGGCC_25_T1
TCCA
AAGC
_42_
T1
TAAGAGGG_13_T2
CCTGCTCC_30_T2
TAGG
TGGC
_9_T
3_L9
0M_
CCGGGACC_12_T2
ACGCCTCC_7_T3_V82A_
CCCA
GCGG
_7_T
2
TTCGCCCC_29_T2
CGGGGCGC_19_T2
CAAGATAC_19_T3
CCGTGAGA_4_T3_L90M_
CGTAGCAG_9_T3_L90M_
TCTC
GTC
C_22
_T2
CCAGACCA_18_T3_I84V_
TAACTTAT_6_T3_L90M_
ACTA
AATG
_10_
T3_V
82A_
AGCATATG_17_T3_I84V_
CGGT
CCCT
_4_T1
AACTTG
TC_15_T3_V82A_
AGGCGCTC_15_T1
TATC
CCCA
_38_
T2
CCTGTCGT_6_T2
GTAAAATT_15_T3
AACGCGAA_27_T3_V82A_
GG
GCTATG
_9_T3_V82A_
ACCGGCGA_29_T2
TCACTTCT_12_T2
GGTGCGCC_13_T2
ATTATTTG_12_T3_V82A_
GCCGGTGC_19_T2
CCGGGGGT_10_
T3_V
82A_
CTTGACAC_11_T2
CGATAA
AC_1
1_T3
_V82
A_
TGCATAAA_39_T2
AGTGGTTG_23_T3_I84V_
GTG
CTTCG_17_T2
CCCC
CAAA
_9_T1
ATTCAAAC_3_T3_V82A_
GAT
TGTC
A_20
_T3_
V82A
_
ATCCTACT_13_T1
CACA
TGTC
_11_
T2
CATG
GCT
T_10
_T3
AGTACGTT_19_T2
ATGACACC_32_T1
AATT
AAG
A_3_
T3
GCTATGGC_11_T3_V82A_
TAAACTAG_8_T3_V82A_
AGCACGAG_31_T1ATCTGGGA_47_T1
GCTG
CCAA
_9_T
3_L9
0M_
CTTTGTCC_18_T3
ACCCAAAT_3_T1
CAAAAG
CA_15_T3_V82A_
CTGACCCC_5_T2
TTTT
TATG
_9_T
3_V8
2A_GG
TCTA
TC_1
7_T3
AAC
TCAC
T_3_
T3_V
82A_
ACAC
CATA
_3_T
3
GCGCCCCC_7_T2
GCGCCAAT_15_T2
CCACCGTG
_4_T3_V82A_
GTCATCTG_9_T2
TGCCCG
AC_16_T2
ATACTCAA_5_T3_L90M_
CACTAGTA_4_T3
CACGTGGA_14
_T3_V82A
_
CCCCCGAT_3_T2
CCACCTCT_3_T2
ATCA
GG
TA_4
_T3
CC
TGATG
G_12_T2_V82L
GGTACAAC_11_T2
CACACCGA_24_T1
TTGACGGG_8_T3_L90M_
GTA
GTC
CC_1
7_T2
GCCCAACC_10_T3_V82A_
GTAG
CCCG
_5_T
2
GTAATCGC_12_T3
_V82A_
ACCGCCCC_3_T1
AGAT
AGGC_8
_T3_
V82A
_
GCGGCCCC_4_T1
TATG
CGGT_27_T
3_V82A
_
CGCTTATG_11_T3_L90M_
CATCGCCG_18
_T1
CCCATTTT_3_T2
TCCTGCCG_28
_T1AACT
GCCC_
10_T2
GCCGGCGT_26_T2
CTCTTCAT_19_T2
GTAGATAC_3_T3_V82A_
CGGACGAG_20_T2
CGAAGCTG_21_T1
CGCA
CGGG
_3_T
2
CCTGCGAG_10_T2
TTCAAACC_3_T3_V82A_
TCACTTCA_27_T2
TCCTGTAC_10_T3
AATA
GGGA
_26_
T1
CCTAGTAT_7_T3_V82A_
TAAGATAA_8_T3
GTC
TGTG
T_36
_T1
TCTC
TCG
T_23
_T2
ATGAAAAC_4_T3_I84V_
TGCCCGCC_28
_T1
CTCAAGCC_10_T2
TGCC
CCTG
_22_
T2
TTGCTCCG
_10_T3_V82A_
AACACAAT_18_T3_L90M_
GGAAGACC_9_T2
TTGAAATA_13_T3
TTGGCGTA_4_T2
TCGCGACT_23_T2
GGCCACGC_5_T3
GTGTGAAC_11_T3
AGTATATT_15_T3_L90M_
GCCGGCAC_34_T1
ACCCCGCG_14_T2
CTCACCTT_11_T2
CTTTCTTT_7_T1
GCTG
CCGC_21_T2
CTGTAG
GG
_10_T3_V82A_
CCGA
TTGA
_21_
T1
AAGAGATG_19_T1
TCACTCAC_10_T3_V82A_
ATGC
GGAC
_16_
T3_L
90M
_
AGCGGAAA_15_T3_I84V_
CCAGACTT_7_T3_L90M_
TCTCTACC_23_T2
ATCCCTTC_20_T1
TGAAAGTG_3_T2
AAACGTCA_6_T3_L90M_
CATCCTCG_20_T2
CACTTAAT_5_T2
CCATGCGC_21_T2
GCGTCCGC_9_T2 GACTAACC_11_T3CACGTGGC_3_T3_V82A_
CATTATGC_19_T3
CCGAAATG_20_T3
CAACACAA_11_T3_V82A_
CGTC
GCT
G_1
3_T2
GACC
CTCC
_5_T
3
TGAG
CGAA_11_T3_V82A_
CCGTCAAG_9_T3_V82A_
ATCTCAGA_27_T3GCAAGATG_15_T3
CAC
AAAAA_21_T3_V82A_
ACTA
ACCC
_4_T
3_V8
2A_
TTACACCC_14_T2
GCACTTCC_17_T3_V82A_
ATTTAACA_17_T3_V82A_
ATATGAGG_4_T3_V82A_TAGTAGCT_14_T3_L90M_ CT
ATCG
TC_7
_T2
GCACCCTG_4_T2
ATCT
GGAG
_10_
T2
CCAGTCGC_38_T1
ATAG
AATC
_9_T
3_V8
2A_
CTTCGGGG_26_T2
GTCT
CTCT
_17_
T2
TCCTTATA_4_T3_L90M_
ATGACTGT_8_T3_I84V_
GTGGGCCG_16_T2TCGTTCTT_15_T2
ATGTCG
CG_3_T2
AAGACGGT_5_T3
GGCAGGCG_8_T2
GTG
TTG
AC_2
7_T2
CTAGACCC_27_T2CCCGCCAC_16_T2TATGCCTG_6_T2
TTGACACG_51_T1
AACAG
ATG_22_T3_V82A_
GTCG
CTGC_10_T3_V82A_
GAG
CGAT
C_15
_T2
TCAACTAA_45_T1
GTG
GG
TGT_16_T2
ATCATCTT_5_T3
CCTT
ATG
G_2
0_T1
CAACGAA
T_9_T1
CACCGTAT_7_T2
TCTGGGCC_31_T2
AAAACTGA_5_T3_V82A_
AACCACGG_34_T1
AAAT
GGTA_12
_T1
CTGAG
ATT_30_T1
ACCATGCA_16_T3_V82A_
CCTTAAAC_4_T3
AAGG
GTAG
_9_T
2
TTTATCTT_34_T1
ACACACCA_9_T3_V82A_
AGG
TCTTA_18_T3_V82A_
CTCC
TAGT_
25_T1
TGGCCGCA_13_T2
TAACGGAG_5_T3
TCCAACAA_9_T3_L90M_
TATCCAAA_15_T3_V82A_CTCCCGGG_9_T2
CCCT
CGCC
_15_
T2
CCGCCGCA_30_T2
CCTTATCT_7_T2
CTTAAATC_10_T3_I84V_
TCTT
CGCC
_41_
T1
ACAACAAT_11_T
3_V82A
_
CATGGGGT_23_T3_V82A_
CCAG
GCCG
_30_
T2
TGTAAGCA_19_T2
GCG
CCATT_29_T2
TTGCTATT_7_T3
GTT
ACCC
C_5_
T3
CATCGATA_7_T1
CCGTAAAT_3_T3_V82A_
ACCC
AGAT
_19_
T3
CACGCGCT_13_T3_L90M_ ATAT
TTC
A_4_
T3
GAGAAAAC_33_T1
GG
AACACC_24_T3_V82A_
GCAGCAAC_19_T3_L90M_
AAACATC
C_13_T3_V82A_
GGATTT
AT_6
_T3
GCTTCACT_7_
T2
TGAG
TTGA
_25_
T2
TGTTTACC_3_T3
GCAA
ATGG
_18_
T1
GCC
TCTT
T_24
_T2
TCAACATA_11_T3_V82A_
CGATCTG
T_46_T1CCG
CTG
TA_1
6_T2
AGACTAAG_5_T3
AGTAGCGG_14_T3
TATTGTGG_9_T2
AAAA
CG
AC_3
_T3_
V82A
_
CAGG
GCTT_48_T1
GGTATGAG_15_T3_V82A_ _A28
V_3T
_32_
TC
CTT
GGA
TCGCTCCC_35_T2
AATT
CTTC
_10_
T3
ACCGACCG_19_T3
_V82A_
GCG
TCAT
A_19
_T2
GCGGCGAA_9_T3_V82A_
TCAGGTCG_3_T3_V82A_GGTTGGCA_33_T2
CGTG
CATG
_17_
T3_L
90M
_TAC
ATAT
A_12
_T3_
V82A
_
CCGG
CTGT_18_T2
TGATTAAA_14_T3
CCTCATAA_8_T3_L90M_
CAACCAAT_38_T1
TGAACAGT_24_T3
GGGTGGTA_24_T2
GGTTAAGC_3_T3_L90M_
CCTC
CACC
_6_T1
CTATTATT_38_T2
CGGTGTAG_21_T1
CCACCCTC_11_T2
TCTG
GTG
T_11
_T2
CTCC
GGCG_6_
T1
TGCA
TCCT
_17_
T2
CATAAGGG_9_T2
CCTG
TAGC
_3_T
3
AGCCG
GG
C_12_T2
TTGG
GGGA
_5_T
2
TAAC
GCAC
_3_T
3
ACGCACCA_66_T1
TTCCAATT_10_T2
AGGG
TGTA
_17_
T2
CACC
GGGA
_33_
T1
CACTTCCC_5_T3_L90M_ ACAT
CTGC
_9_T
1
GC
ATTTAA_7_T3_V82A_
TGTG
AGAT_40_T1
GTTCCATA_4_T3_I84V_
ACTCCATC
_22_T3
_V82A
_
TCAC
AACA
_23_
T2
GACACCAG_3_T3
GTGTAGAG_8_T3_V82A_
CATCCTT
C_8_T
3_V8
2A_
ATG
CCAG
G_1
1_T3
GCCCGGAC_8_T2
CCCG
CTGC
_11_
T2
ACGC
ACGT
_11_
T3
CCGTGGGT_11_T3
CGTGAAAC_16_T3_V82A_TAAAAGCT_6_T3_I84V_
TCAAACCT_9_T3_V82A_
GTTCTATA_23_T3
GCCC
CGCG
_9_T
2
CATGTTCC_3_T3_V82A_
TACG
CCAT
_4_T
3
AGAGAGAC_18_T3_L90M_
CCCCCAAA_4_T3_V82A_
ATCCCCGC_8_T2
AAAC
TTCG
_3_T
3
TCAAAGCG
_30_T1
TCCGACTC_27_T2
GCAGGTCC_48_T1
CTGGTTGC_13_
T2
GCTGCTAC_13_T2
TCCATCGT_13_T2
GG
GTA
CAA_
17_T
2
CATACGTT_13_T3_L90M_
3T_7_GA
GGA
GTA
GTCATCAG_5_T2
CCCG
CACG
_13_
T2
ATTTGTG
A_17_T3_V82A_
AAGCGCCG_9_T3_V82A_
TGGG
TGGG
_7_T
3_L9
0M_
CTCC
TAAC
_16_
T3
CTCGTCTC_17_T3_I84V_
CGACTGAT_11_T3_V82A_
ACTA
ATGA
_29_
T1
GAGTACAA_5_T3_V82A_
GG
AAATGA_9_T3_V82A_
TGTGATTT_10_T3_V
82A_
CCAG
AAG
C_15
_T3
TCCCTTTC_28_T2
CCTCTGCT_21_T3
CCCTCCCT_1
7_T1
CCCCTTGG_13_
T2
AGAA
ACCA
_9_T
2
CTCTTTCG_4_T3
CGGC
TTCC
_41_
T2
CCCCTCGA_17_T2
ACAG
CCTC
_14_
T3
GTG
AGAT
C_8_
T3
CTG
TATC
A_8_
T3_V
82A_
GATATACC_9_T3_V82A_
CTGTAAAA_11_T3_V82A_
CAAACACC_34_T1
AATCACTA_15_T3
TGCATTTG_11_T2
CTAAAGTA_9_T3_L90M_
CCCTTTCA_9_T3_V82A_
ACAC
CG
GA_17_T3_V82A_
CACC
CCCG
_12_
T2
CTGCCCGC_24_T2
GCCTGAAC_10
_T3_V82A
_
CGCCTCCA_15_T3_L90M_
GTATCATT_33_T2
GAT
ATG
AA_1
0_T1
TGAGTGTT_13_T3
AACTTGCT_9_T3_V82A_
TCTGCTGC_20_T2
TCTG
TACT
_15_
T2
GGCATATT_3_T3
_A28
V_3T
_42_
AC
CCA
AAC
CGCTGCAC_8_T2
TGG
GATAT_8_T3_V82A_
CTTT
CATA
_17_
T1
ATACTTAC_14_T3_V82A_
TACT
TCCT
_19_
T2
GCTCTCCT_9_T2
TAAAGTTC_6_T3_L90M_
CAG
AGG
AA_1
0_T3
GTAGCTAA_16_T3
TCTCCCGG_6_T2
CCAG
AGGG
_8_T
3TC
TTAG
CT_1
5_T2
AAGCACTT_3_T
2
TTC
CTA
AC_2
0_T3
_V82
A_
CATGCTTT_9_T1
2T_2
2_G
GCCT
GTG
GCCCTGCG_3_T1
CACATTCT_22_T2
CATATTTT_4_T3
CCAAAAGT_41_T3_V
82A_CTTTCTTA_5_
T2
AGC
CTA
TT_1
1_T3
GACCGAT
A_9_
T3_V
82A_
TCGG
ATGG
_4_T
2
TTCCTTTC_5_T1
GCAC
GCAA
_3_T
1
_A28V_3T_03_G
CGA
GTG
C
TAAC
TTTG
_7_T
3_L9
0M_
AAAT
CTTA
_10_
T3
CCTATCCC_6_T2
AGCA
CTG
T_17
_T2
GTCAATGT_3_T3_V82A_
GTTCTAAC_23_T1
TTGATAGA_11_T2
TACC
CCCT
_4_T
2
TCCAACAC_20_T3_L90M_
AGAC
CATT
_44_
T1
ACGATCTC_33_T1
CTGTCCTC_33_T2
AAAT
TATG
_16_
T3_L
90M
_
GTCTTG
AT_26_T1
TTAT
GAC
T_10
_T2
GTCATCAA_9_T3_V
82A_
GC
TACC
AT_16_T3_V82A_
GTCTTTCC_10_T3_L90M_
AGTTGGCC_15_T3_I84V_
CACGTATG_19_T3_V82A_
GTCT
GCCC
_39_
T1
GGCCGTGT_12_T2
CACCCAGC_27_T2
AACG
ATTG_19_T3_V82A_
TCCGCCCA_16_T2ACGCGATC_21_T3
AAATCCGT_8_T1
AAGAGAAC_11_T3
_V82A_
CTTATCTC_38_T1
GG
CTCGAC_12_T3_V82A_
CCTCTCCA_3_T1
CCCA
CCTC
_23_
T2
CTTGTCTT_9_T3
ATCGCACA_15
_T3_V82A
_
ACCAAGAG_49_T1
AACCGCTG_25_T3_L90M_
CGG
ACAC
C_15
_T3
ACCCATCC_15
_T3_V82A
_
CCACACAC_10_T3_V82A_
GCTCCTAT_9_T2
_A28V_3T_03_TATG
GAGT
CCAGGCGT_14_T
3_V82A
_
GC
CC
ATCT_18_T3_V82A_
TCAA
TACC
_17_
T3
AACTC
TCG
_13_T3_V82A_
TGAC
TAG
T_18
_T3_
V82A
_
ACAACATT_24_T3_I84V_
CTTG
AAAC_17_T3_V82A_
AATCC
CTG
_15_T3_V82A_
AAGA
CACC
_6_T
2
ACCC
GCTC
_29_
T2
ACAAAGAG_24_T3
TGAGGCTT
_18_T
3_V82A
_
AGCGCCCC_12_T2
CACGACCT_13_T
1
AACC
AACC
_20_
T1
AACTTGCG_10_T3_V82A_
ACAGGTGC_9_T2
TCAAGG
CA_3_T3
CCCC
TCCG
_10_
T2
GGGCGTAA_21_T2
CAAG
AGCA
_22_
T1
ATTT
GCGG
_15_
T2
GCG
CTCGG
_18_T2
GTCTGGTC_17_T3
TGACCCCG_13_T2
TACTCCCT_38_T1
GTTTTTTC_5_T2
CCCCTTGC_22_T1
ATATATTA_33_T1
TATACCCC_3_T1
ATTATAAG_11_T3_V82A_
GTCGTAAG_4_T2
ATGCCACC_5_T3_V82A_
ACTCGCTA_16_T2GGTCCCCC_17
_T1
TAAAGCTG_7_T3CCCCTCGC_11_T2
CCCA
CATG
_7_T
2
TAGATACA_7_T3_V82L
GCTACTCC_23_T2
GTTTA
ATA_
6_T3
_V82
A_
CGGGCATA_11_T2
TGTCCTTC_17_T3_V82A_
ACGCGCCG_4_T3
ACAA
GCTT
_12_
T2
GCGGACCC_31_T2 CCTACTTA_13_T2
CAG
GCG
AC_3
_T3
ATCC
TTCC
_11_
T2
ACGACATC_20
_T3_V82A
_
CGGTTCAC_14_T2
GATGATA
T_14
_T3_
V82A_
TGGTCGCA_9_T3
GTT
GCG
CA_1
4_T2
TGATAGCC_6_T2
GGGTTCTG_22_T2
TCCC
GTTA
_4_T
1
TACTCCAT_19_T2
ACAA
AGCC
_48_
T1
CCGTGATA_16_T2
ATTTTTGA_3_T3
CCGGACAT_15_T3_L90M_
GCTCCGTG_8_T3_L90M_
TATGG
TAT_12_T3_V82A_
GAGCGCAA_5_T3_V82A_
TCTC
CATA
_40_
T1CT
GCGG
TC_2
8_T2
GGTCTTCA_10_
T2
ACCCACGC_38_T2
TCCT
GACT_
3_T1
AAGAATGG_5_T1
GAGCCAAT_9_T2
ACACCCTC_3_T3_I84V_
GCGCACCT_43_T
1
TCCTATGC_35
_T1
TAGG
CACA
_37_
T1
CGACGTGA_13_T1
CAC
CTT
GC
_14_
T3_V
82A_
ATCCGTTT_35_T2AAGCCGCG_7_T
2
GTCTT
ACT_
3_T1
CGCGTACC_3_T3_V82A_
ACCATACA_5_T3_V82A_
ATACCCGA_6_T2 AGTTCCAA_37_T3_V82A_
CACGTGCC_22_T2
AAAA
GTA
T_5_
T3_L
90M
_
ACTTTTGA_12_T2
CACCCGTG_7_T3_V82A_
GGTGCATG_5_T3_L90M_ AACA
GGGA
_5_T
3
TGTTTTCT_33_T1
TCCGCATC_14_T2
GCCCGCGG_19_T2
_A28V_3T_62_CTTT
GAGT
TCTCCCGT_30_T1
CTCA
CTCT
_17_
T2
CCGG
GTCC
_18_
T2
CGCCCCCC_13_T2CCATTTTG
_4_T1
TGTCTACC_14_T2
TGTG
CGTT
_15_
T2
TTGCCCGC_12_T2
CCCCTCAC_4_T1
GTAAAG
CC_12_T3_V82A_
CATCCCCA_13_T2
TCATC
CAT_20_T3_V82A_GGGCTCAA_5_T3_L90M_
CAGCAAG
C_22_T2ACCA
GAG
A_21
_T2
ACTA
ACAG
_9_T
3
ATCCGTGG_13_T2
CCGTAAAG_15_T3
TTCTGGGC_6_T2
CAGT
TACT
_4_T
2
TGTGCCGC_9_T2
CACTGAGA_15_T3_V82A_
CAAAACTG_13_T3_I84V_
CCGCTCCC_4_T2
AGACAAAT_8_T3_V82A_
GTAC
GAAT_19_T3_V82A_
AGGGAGTC_34_T1
CACTGTGT_9_T3_V82A_
CCG
AATA
A_14
_T1
GCCGCGCT_45_T1
GTTC
ACAT_15_T3_V82A_
CCCGCTTA_4_T2
TGCCGTTT_18_T2
CGGAACTG_19_T3_L90M_
TCC
GTTTA_12_T3_V82A_
GCCCAACC_24_T1
GTGTTACG_6_T3_L90M_
GAGT
TCCT
_11_
T2
ACCCACAC_9_T2
CTAGTACT_20_T2
CCAC
TGGT_3
_T1
TTAATCTG_20_T2
CAATGTAC_4_T2
GTCCCTTC_5_T
2
AGCCCCTT_26_T2
TTAA
ACTC
_25_
T1
CATATG
AC_25_T1
AGGTATTT_4_T3_V82A_
CCTGGACG_13_T3
TAAAAGAT_4_T3_V
82A_
GTAACACC_12_T3
ACATTCGC_6_T3_V82A_ACGAGTCA_27_T1
ATCACCAA_3_T3_V82A_
GAATGTAA_20_T3
CTTCTCGG_20_T1
TCGTCGAC_8_T2
CCGCCCAG
_14_T3_V82A_
CGAACTGA_17
_T3_V
82A_
CTG
CCAC
T_16
_T2
ACGAACTG_9_T3_I84V_
CATTCGGA_8_T2
GCG
GG
AGA_16_T3_V82A_AACATTAA_5_T3_L90M_
GAC
ACC
CA_18_T3_V82A_
TTCCCCCT_22_T2
GACTTTAA_14_T3_L90M_
CGTAGCCT_7_T3_L90M_
TGCGAGCT_3_T3
CAGCTCCG
_23_T1
AGG
GG
CAC_
16_T
2
GCGCGGGA_6_T2
ATGTCTAG_8_T3_L90M_
CGCG
GTTC_9_T3_V82A_
ACTA
AATT
_7_T
1
CGCT
GATA
_30_
T1 TCTTGG
CG_22_T1
CCGCGCCA_13_T2
ATAT
AGTC
_17_
T3
AGTTCTGA_3_T3_I84V_
AGTA
ACG
C_13
_T2
ACTA
TACA
_5_T
3_L9
0M_
CCCTACAA
_12_T
3_V82A
_
CATCCCGC_23
_T3_V8
2A_
TGGAATAC_12_T3AA
GGTCAG
_6_T1
GCCCCCTG_3_T1
CTTA
TACT
_17_
T3
ATTT
AGG
G_1
3_T2
ACCCCACT_21_T2
GCTAGACT_30_T1
CCCTCCCG_20_T2
CCCTGACC_10_T2
TATTTCC
T_19_T3_V82A_
GCCG
ACCC
_21_T
1
CCCC
ACAA
_12_
T2
GCTCCCTC_27_T2
GTAAAACG_14_T3
2T_3
_CAC
TGT
TT
GATCTT
CT_12_T
3_V82A
_
ACCCCCTG_3_T2
ATTT
ACCC_1
0_T3
_V82
A_
GGCA
CCCG
_38_T1
CCCA
CCCC
_23_
T1
TCTCTCCC_19_T1
AGGCCGCA_9_T2
GCCCCAAC_29
_T1
CCAGGCGC_27_T2
TCCGCTCG_29_T2
TGCG
GGCG
_24_T
2
CGAC
GGGT
_7_T
3
TGGACCAA_22_T3
ATCAATA
G_19_T
3_V82A
_
CATTCTCC_8_T3
TCTTAATC_12_T3_I84V_
TCCTATCA_17_T1
CGCTGAAA_8_T3_I84V_
GTAC
GTCT
_24_
T2
AACAACTG_17_T2
CTG
AAAC
G_8
_T3
ACCC
CTG
C_4_
T2GTGACAGA_11_T3_L90M_
GCCACACA_15_T2
CCAATACT_21_T3_L90M_AGCGCAAC_10_T3_L90M_
GGAGTAAC_35_T1
TGGGTGCT_26_T2
CGTG
CGCT_43_T1
GACACTCA_14_T3_V82A_
ACCCCGTC_9_T2
GAAC
TTAC_17_T3_V82A_
CCCCCACC_44_T2
AGGTGTTT_13_T2
CCTC
AACT
_35_T
1
TTGCTGAA_15_T1
TTTCGGTG_3_T2
TGCC
AATT
_24_
T3
CCTA
GCAT
_3_T
2
CTTCACCT_6_
T2
CTAG
ATAA
_52_
T1
GGATGCGT_37_T1
CGAAAGAC_5_T3_V82A_
CTTGG
AAA_11_T3_V82A_
3T_11_GCA
GCTTC
GCCCATCC_30_T1
TTAA
AATG
_13_
T3
AAGGTAAG_7_T3_V82A_
CCCT
CCCC
_24_T
1
GCG
TATAA_41_T1
ACTTAGAA_17_T2
TTTCTCCC_12_T2
TTGCAGTC_13_T3
CAAG
AACA_16_T3_V82A_
CAC
CG
TGC
_22_T3_V82I
TTTAGCAC_12_T2
CACCTCCG_16_T3_L90M_
CACCACAG_13_T3
TCTGGCCC_28_T2
CTAGGCCC_16_T2
CG
GTT
CC
T_22
_T3_
V82A
_
GATTGTAC_6_T1
GCGT
CGGG
_15_
T2
CCCCTTCC_46_T1
ACTTAAC
A_14_T3_V82A_
CCTG
AGAT
_7_T
3_V8
2A_
TGTGTGCC_6_T2
GCCC
TGCC
_22_
T2
GCCTGGGA_6_T2
TCACCTCC_26_T1
CATCACCA_20_T2
GAAAC
ATT_
7_T3
_V82
A_
TTCATCCA_27_T2
CCCA
TTG
C_6_
T3
CTAT
CCTG
_3_T
2
CGGTAAG
T_24_T
1
ACTTGAAA_9_T3
CCCC
GGGG
_5_T1
CCTCCCCG_40_T1
AACT
CACA
_9_T
3_L9
0M_
CCCCGCAT_21_T2
TCACTTTT_8_T1
ACAC
ACCG_13
_T3_V
82A_
AGG
CAACG_12_T3_V82A_
CGCAAATT_10_T2
ACGTAACA_15_T3_V
82A_
AGCCAATC_10_T3_V82A_
TGCC
GCCA
_51_
T1
ATCCTT
TC_9
_T3_
V82A
_
GCGTGACA_35_T2
TTGC
AACT_15_T3_V82A_
GTGCGTCG_26_T2
GAC
GTT
AG_9
_T2
CCCCCCCT_29_T2
GCGTCCCA_15_T2
AAAA
TGC
A_7_
T3_V
82A_
CCCA
TCGC
_5_T
2
GCTCCCAG_3_T1
TATT
TAG
C_10
_T3
TCCCCATC_15_T2
GTG
CAAC
T_12
_T3
TCCGGCGG_26_T1
CCTA
CAAC
_15_
T3
TGG
TTGG
C_22_T2
CTA
GTA
CA_
10_T
3_V8
2A_
TCCCCGCC_27_T2
CTCCTTCA_43_T1
CTAGAACG
_12_T3_V82A_
ATGGCCCC_16_T2
GTCG
GTGC
_21_
T2
CCACCCAC_3_T1
GCAC
TGGT
_20_
T1
AGTAATAG_7_T3
GTTGTATG_20_T3
CCCCACCG_8_T2
GCCTATCG_15_T2TC
CAACGT_1
9_T1
CCCC
CTTG
_4_T
2
TTACGTAC_23
_T2
TTAG
AAGC_17
_T1
TACA
TGAC
_12_
T3
TAAA
ACAC
_17_
T2 CTTT
GG
CA_9
_T2 GCCCTCCC_8_T2
GCAGCGTC_10_T2
ACTATCTT_20_T1
GTA
CACT
T_13
_T2
GC
ATCC
TT_21_T3_V82A_
GAGGGCAC_23_T2
TTGCACCG_12_T3_L90M_
CCCTTACC_3_T1AACCTCCC_5_T3_
V82A_
TTCCCTAT_9_T3_V82A_
CTAC
CCGG
_7_T
3_L9
0M_V
82I
CCGAGGCG_3_
T3_V82
A_
CTCATCGG_35_T1
CTCGTTCC_25_T2
GGAATTCC_9_T3_L90M_
TTATGACA_7_T3
CTG
TTG
AC_1
1_T2
AACGGACA_7_T3_I84V_
GCGCGCGA_14_T3_L90M_
TAGCTCTT_9_T2
CTTCACTT_11_T2GTTTCTAG_7_T
2
ACTCCCCT_16_T2
CACCTATT_17_T3
TTGTG
CAC_25_T1
TAACGTTT_33
_T1
AGAGTGGT_17_T3_V82A_
TCCA
CGCC
_23_T2
GCCTCGCG_14_T2
ACCCCCCA_15_T2
CGCGGACT_17_T3_V82L
TGATGCGG_41_T1
GTCGGGCC_4_T2
CACGTCCT_6_T3
CG
AGTTAC
_7_T3_V82I
GAACG
GAC_12_T3_V82A_
ACAACTCT_3_T1
TGGTGTGT_8_T2
CCCT
AATT
_11_
T3_L
90M
_
GCCGTCGA_10_T2
TAAAGACG_6_T3
GATCGAAC_12_T3
CTTAGG
TA_11_T3_V82A_
CCCCGTTA_21_T2
AAATTTAC_8_T3_I84V_
GCTGACCC_25_T1
TAGC
CTTC
_15_T3_V82A_
CGTCATTC_7_T3_V82A_
CCAAGTCC_21_T2
CCAT
GTCA
_26_
T2
AACGAACA_7_T3
TCCC
CGTT
_21_
T1
ACGATGTC_6_T3_V82A_
ATCCTT
TC_17
_T1
CACCACCC_16_T2
GGCCCCGT_6_T2
CCGA
CCGC
_17_
T1
CATA
CCTG
_5_T
2
GCGCTTTG_11_T2
ACGCATG
T_9_T3_V82A_
AGCCCCCA_1
5_T1
CACC
CGTG
_19_
T2
AATCTTCA_9_T2
CGG
ATCC
G_1
3_T2
ACTACACC_13_T1
CCACTGAT_23_T1AAACTTAC_33_T1
GAGAGGGT_14_T3_I84V_
GTAGGGAG
_11_T
3_V82A
_
TTTATTAA_22_T3
TCAAACTT_8_T3_V82A_
ACCA
ACTC
_6_T
3
AAGCTCG
A_10_T3_V82A_
CTAGTCCT_28_T1
CTACCATT_5_T3_V82A_
ATG
TTAA
G_1
2_T3
_V82
A_
CCCC
AACC
_7_T
2
TTTCGTCC_19_T2
TGAG
AAAG_12_T3_V82A_
CTCC
CCTA
_10_
T2
CTCACGTA_14_T3_V82A_
TGTCCCAA_19_T3_V82A_
CC
GTG
TCT_15_T3_V82A_
GTAT
CGGT
_9_T
1
CGCAGGAG_15_T3
AACCTGAA_19_T3
TAGGACAT_
14_T3
_V82A
_
CCCGACGC_26_T2
TACAGATT_5_T3_V
82A_
AGTACTTT_10_T3_V82A_
CCTC
GGTA
_25_
T2
AGTG
TCAG
_9_T
2
ACGCTGAT_23_T3_L90M_
CACACGGA_7_T3_V82I
GG
GTG
AGA_11_T3_V82A_
GTTAGCGT_28_T1
TGATCCAG_5_T3_V82A_
TTTAGTCC_7_T2
TTCGCCTT_
14_T1
ATACCGCA_15_T3
_V82A_
ATTC
GTGG
_9_T
2
GATG
GTG
T_11_T3_V82A_
TTAG
GAC
A_16
_T3_
V82A
_
GCTCCG
AA_10_T3_V82A_
TGGA
TCCG
_34_
T2
CTGCGATC_28_T2
CCGTTCTT_20_T1
GTAGTGCT_6_T3_V82A_
AACGTGAG_52_T1
TCAG
GAG
T_8_T3_L90M_V82I
GGAG
GGTC
_18_
T2
TCAG
GTCC_
7_T1
TCTA
TCTA
_33_
T2
GCGC
CTGC
_15_
T2
ATCCTGTT_11_T2
CGCTTCGG_4_T2
TCCCACAT_9_T3_V82A_
TTG
ATCT
C_5_
T2
CCCTCGTA_6_T2
GCCCCCCT_9_T3
ACCACCCT_11_T3_L90M_
GAAAC
TCT_18_T3_V82A_
CGTGCCGT_24_T1TAATGGGA_15_T3TCAACACC_12_T2
GGTTAAAT_5_T3_L90M_
TCGCTTTG_33_T2
GCCTTTTA_8_T3_V82A_
TCAGTTCC_17_T2
AAGCTGTC
_30_T1
TGTCCCAG_8_T3_V82A_
CGGT
GGAG
_29_
T2
TGCTCCCG_14_T2
AGCC
TATC
_29_T
2
CCCGGAGG_8_T2
CATGAAAC_13_T3_V82A_
TCGAGCCA_7_T3_V82A_
GAACGGTA
_8_T
3_V8
2A_
GTGTCCTA_23_T2
CCTCAGCG_7_T3_I84V_
TTACTTG
T_19_T3_V82A_
CCCTTCAC_18_T2
TAG
GCG
CT_2
8_T2 ATTCCCTC_12_T2
CGTACCTG_35_T1TGAC
ACCC
_8_T
2
AATC
CACC
_7_T
3_V8
2A_
CTGCGCGC_32_T2
AGAGCACA_17_T3_V82A_
ATCC
ACCC
_21_
T3
CACG
AGTC
_11_
T2
ATAC
CCCC
_9_T1
AGTAAG
CC
_17_T3_V82A_
CAGATACT_14_T2
TGG
CATC
T_7_
T3AT
CTCA
AT_1
9_T3
AGTAAAGA_19_T3
CAG
AGC
TC_8
_T3_
V82A
_
TCGA
AGAG
_8_T
3_L9
0M_
GAGG
CGGC
_18_
T2
ACTC
TAAG
_11_
T1
AAG
CAAT
G_2
0_T3
GCCGCGCC_26_T2
CCTCCTCT_24_T2
CTGTAACC_9_T3_L90M_
GACTA
CGT_
40_T
1
TTAC
GAGG
_5_T
1
TTTAGGCC_7_T1
ATCGCGTC_12_T2
GCCGGGCT_40_T1
TCAG
GG
AA_1
0_T2
ATTAAAGT_4_T3_I84V_ ATCCCCGT_2
2_T1
TATA
TAG
A_13
_T3_
V82A
_
CACAAATT_4_T3_L90M_
TAGT
GGTT
_3_T
3
GCTTGTAT_5_T3
2T_3
_CTC
CGT
TT
CTGGTTAC_16_T2
GG
ACTACT_10_T3_V82A_
ATCAAGAC_12_T3
CCCCCCGC_15_T2
TTTCCTTG_3_T3_V82A_
GGAAAATC_16_T3_V82A_
GAACCCGG_53_T1
AAGAAAAG_12_T3_V82A_
GACCACAG_31_T2
CAAGATCG_11_T3_V82F
GCGC
CCAC
_12_
T2
CTTG
ACG
G_1
4_T3
ACTCTTTA_10_T3_L90M_
ACCCAGCG_21_T3
CTGACCCT_20_T
3_V82A
_
CTCT
GTAC_
11_T2
CATT
ACCG
_14_
T2
CAATTTAC_20_T3
GTA
AGG
TA_5
_T3_
V82A
_
GCCCGCCC_29_T2CGTCTCCC_24_T2CG
CCCC
CC_7_
T1
TCAT
CAGT
_7_T
3_L9
0M_
GTCATCTT_12_T3
ACC
AATAT_21_T3_V82A_
TCTG
CATT
_3_T1
CCCTCCCT_24_T2
GCCAAAAG
_8_T3_V82A_
GACGCGCC_42_T1
CTTATGG
G_39_T1
AGGT
CTCT
_17_
T2
TCCCACCA_46_T1
TTCCAGAC_6_
T1
GTACTCCC_13_T3_L90M_
TAAG
TACG_1
1_T3
_V82
A_
CTGT
GGGT
_3_T
2
ACCC
CCAC
_12_
T2
CTAAACTA
_29_T3
_V82A_
CAGATTGC_3_T3_V82A_
GTATCGCC_9_T3
CCCCGTAA_4_T3
GTGGAT
CC_5_
T1
AGGG
AGAC
_9_T3
GATAC
TTA_
7_T1
TCGC
TGTT
_30_
T3
CACCGTCG_14_T2
TATTCGGG_17_T2
TTAG
ACGC
_6_T
2
TCAC
GCAC
_28_
T1
TCTC
CCAC
_7_T
3_V8
2A_ CCG
TTGTG
_17_T3
GAC
CGAG
T_12
_T2
CAAC
CGGG
_8_T
3_L9
0M_
AAGTGCAA_11_T2
CATGCACG_4_T2
GCGGAGGC_12_T2
CGTC
CCCA
_13_
T2
CACGGTAT_12_T2
ATGAAACA_35
_T1
GG
ACG
ATC_16_T3_V82A_
GTTATAAG_4_T3_V82A_
TCAT
TACT
_8_T
1
GTCGTGTC_15_T3
ATAGTCAT_19_T3_L90M_
CTCC
CCGG
_4_T
2
CTAAAC
TC_17_T3_V82A_
CTGC
CCCC
_14_
T1
CGGGCCCT_18_T2
CTGATGAC_13_T2 TAACCCTC_4_T3
CATCTGCC_43_T1
GACATTTC_6_T3
TCCT
GCGG
_12_
T2
CGCAGCAT_4_T2
TCTCCCCT_31_T2
ACATACCA_28_T1
CTAGCGGA_15_T3_V82A_
CTATC
ATT_7_
T3_V82A
_
ACCCACGT_28_T2
ACTAGCGA_9_T3_V82A_
AGGTACAT_30_T3_L90M_
CCTACAGA_11_T3_V82A_
CCAT
TCG
A_13
_T2
TGATTA
AC_3_T3_
V82A_
TCTGAATG
_6_T3_V82A_
TCCATGAT_17_T3_V
82A_ TC
TGAATA_34_T3_V82L
CCAACTCT_9_T3_V82A_
GCCC
GCCC
_10_
T3
CAGAG
ACG_23_T1
CCCC
GCAG_13
_T1
CCAT
ATG
C_7_
T3
CTCGTGGT_44_T
1
CCCA
TTCA
_3_T
2
GTATCTCG_9_T1
GCCC
GCAC
_9_T
2
TATA
TGTT
_23_
T1
GGTGGAAT_27_T
2
AGTCCTAA_38_T1
CGCC
GTCT
_13_
T2
GGGCTCCG_5_T2
CGCG
GTCC
_17_
T2
ACTTCCAT_5_T3_I84V_CATATACG
_8_T3_V82A_
TTTT
CCGA
_17_
T3_L
90M
_
CCAC
CCAC
_12_
T2
GTCTCCAC_3_T2CTCCCCTC_12_T2
TACC
ATAT_12_T3_V82A_
CCCGGCGC_15_T2
TTTT
TGCC_1
0_T3
_V82
A_
GCGTTGGG_5_T
2
ATAGGACT_11_T3_V
82A_
TAAC
GAA
T_6_
T3
GTCTTGGA_3_T1
GAGCGTGC_37_T1
GGAAACCG_8_T2
GTTTCTTC
_46_T1
GCCGCCTC_33_T1
GCAG
GAAG
_13_T3_V82A_
TTTGTACT_12_T2
TTCT
CAG
C_3_
T3
GGTTCCAA_5_T3_V82A_
GTCGTCTC_11_T3_I84V_
CGTC
CGAG
_25_
T1
TCTTGCCT_7_
T2
ACCCCCTT_5_T2
CCCCTCGA_19_T1
AATAGACA_41_T1
3T_01_G
GGCACAA
GTGAATCT_24_T3
CTCAACAG_16_T2
ATCTAGCC_15_T3
CACCAGCC_4_T1
CGTAGCGT_19_T2
TCCCTTCC_9_T2
CAAGACGT_27_T1
TCCCCACC_33_T2
CGGAATGT_11_T1
GAATCGGA_28_T1
CTCGCCG
G_25_T1
ATAG
CGGC_12_T
3_V82A
_
AGGACCGG_7_T1
TAAGTG
AT_17_T3_V82A_
CGGCG
GAC_14
_T1
CCGCCCCT_13_T2
AAAATATT_15_T3_L90M_
ATTGTG
GA_16_T3_V82A_
CGTCGCAA_8_T2
CCGAAATA_9_T3_I84V_
TCAAAGTA_7_T3_V82A_
GCAG
TGCT
_33_
T1
CGTTACGA_4_
T3_V82A
_
CACATACG_37_T1
ACGTCTG
C_3_T1
CCAT
AGCA
_3_T
2
CC
AGAAC
A_20_T3_V82A_G
GC
AGTC
T_30_T3_V82I
GGGGTTCG_18_T2
TCGG
TCTG_20_T2
CGGGATCC_29_T2
CCTC
CCTT
_15_
T1
AACGCCGC_8_T2
GGGGCCCC_10_T2
GG
TTCA
AA_9
_T3
CCGCG
TAA_10_T3_V82A_
CCAC
CCTA
_34_
T1
CGCACGCC_12_T2
CAAGGCAT_6_T3_V82A_
CTG
TAG
CA_1
5_T2
ATACTAAC_4_T3_V82A_
ACTCATTA_4_T3_V82A_TCGGCGAT_7_T3_V82A_
CTCCGATT_21_T3_V82A_
GATTGTAA_15_T3_L90M_
TGCTATTC_23_T1
AGCCGCAC_24_T3
_V82A_
TACCATAA_12_T1
CGGCGGGC_10_T2
ACGC
GCTG
_25_
T2
TGCC
GTTC
_4_T
1
TTTC
TATT
_30_
T2
AAACTACT_6_T3_I84V_CGCT
AGAG
_6_T
3_V8
2A_
CTCACCCG_13_T3_V82A_
GG
TCG
CAG
_9_T
2
TGCTTTTG_7_T2
CACGACTG_21_T3_L90M_
TTACAATG_14_T3_V
82A_
CTCT
CGAT
_3_T
1
TCCATTCT_3_T3
ATTACCGA_3_T1
TACC
CCTT
_22_
T2
TCCT
CCTC
_23_
T2
AAAGGCTG_7_T3_V82I
CGAGGGCG_9_T1
TACTATAT_15_T3_I84V_
AACTACAT_15_T3_V82A_
GCG
GG
TTT_8_T3_V82A_
_A28
V_3T
_32_
ATC
GAG
CA
GAGCTAGC_9_T2
GC
GTTC
AT_16_T3_V82A_
3T_4
_AA
GACC
GT
CCCC
TGTC
_15_
T2
TTTCGGAC_9_T3_V82A_
GCTGTGTT_10_T3_L90M_
TACAT
GGG_13_T
3_V82A
_
AATCCATC_11_T2
GCTACCGG_5_T2
GGTA
GCTC
_15_
T1
TAAT
TGG
C_17
_T3
CTAG
ACAC
_23_
T3
TGTAATCT_11_T3
ACCTGGGA_3_T2
ATCCGGGC_13_T2
ATGC
GACA
_9_T
2
ACCCAACA_8_T2 GCCCTCCG_10_T2
AAGCCCTT_11_T3
ACCCCCCC_17_T2TCGGCGCG_17_T2
_A28
V_3T
_52_
AC
GGA
GCA
AGAG
TCTG
_11_
T3_V
82A_
TCCCGTGC_11_T2
AGGAC
AGA_
11_T1
GCAC
CAGA
_17_
T3
TTCGAAGA_7_T3_V82A_
CATG
AGCG
_18_
T2
TGCTT
ATA_21
_T3_V
82A_
AATG
AGAT
_4_T
2
TTGCCCCT_17_T2TCAGCTGC_10_T2
CTATAAAA_15_T3_L90M_
GCTTCTCG_46_T1
CCCC
TCTT
_16_
T2
TCTC
AGCT
_21_
T1_A28
V_3T
_22_
ATT
GCT
AG
GGCC
CCCC
_18_
T2
AGGCATCA_20_T3
CGCC
CCTT
_15_
T1
GAGA
CCGC
_8_T
1
ACGCCGAC_7_T3_V82A_
GACCCGAC_3_T3_I84V_
TTTCC
GC
A_13_T3_V82A_
CCCTACGC_22_T1CCGGTGGC_5_T2
GTG
CCAG
A_8_
T2
CAATCACG_29_T2
CCTATCGT_19_T2
ACATTGTA_3_T3_V82A_
CGGGATTA
_9_T
3_V8
2A_
CTCTACAA_9_T3_V82A_
TGCCTAGA_9_T2
CCCGCCGA_42_T2
AAGCCTGA_19_T3_V82A__L90M_
GACCGGCC_3_T1
CCCACCCC_28_T2CCTG
CGTC
_5_T
2
GTT
CGAG
C_36
_T2
TTATCACC_10_T2
CAGTTCAG_12
_T3_V82A
_
_A28
V_3T
_12_
CC
CTGT
CT
GTAAC
AAA_
8_T3
_V82
A_
TCTT
CTA
G_1
3_T3
CTAT
TTAT
_21_
T3
CTCCTATC_30_T1
GGAC
GAGT
_9_T3
CCCAAATA_44_T1
ATTC
AACC
_3_T
3
CC
CTATAA_13_T3_V82A_
TCTTTC
CG
_17_T3_V82A_
CCCT
AGCA
_12_
T2
GACAC
GAC_7_
T1
TCCG
TCG
C_14
_T2
TCCC
TGGC
_21_
T2
GAATAACT_14_T3_V82L
GC
AGTA
AG_3
_T3_
V82I
CAGCAACC_10_T3_L90M_
GCGCACAC_18_T2
GCCGTCGC_16_T3_L90M_
CTAGGGCT_2
0_T1
TCCTC
CAA_17_T
3_V82A
_
CGTTCAAG_9_T3_I84V_
CCCC
TGTG
_9_T
3_I84
V_CA
CCG
TAG
_7_T
2
CCAACACC_5_T3_V82A_
CCTCTTCC_7_T2
CG
ACG
CAT_17_T3_V82A_
CCGGGTGG_3_T1
CCGC
CCCC
_5_T
2
CATG
TAG
G_1
5_T3
_V82
A_
TCCA
TCTC
_50_
T1
CCGTAGCC_3_T1
TTATTATC_17_T3_L90M_TATGGTTG_9_T3_L90M_
CGCC
CCAT
_6_T
2
GCTAGCCT_3_T2
ATGTCCCC_3_T2
TCTT
TATC
_26_T3
_V82A
_
CATAAGCA_3_T3_I84V_
CAATTCCT_46_T1GTGCTCGG_22_T2
CTTTCCTC_12_T3_V82LAGCCAAAA_9_T3 ATCCCCTC_10_T
2
TCGCCTCC_3_T1
CAAA
TTAG
_7_T
3
GG
GCG
CGC_
25_T
2
TTTCCCTT_15_T2
CCCCCGGA_18_T2
TTTA
AATG
_7_T
3_V8
2A_
ATCTCGCG_27_T2
AGCC
TAGG
_24_
T2
GGGCGTCG_33_T1
CGACACTA_9_T3_
V82A_
AAGGGATC_33_T3_L90M_
TTCTGTCG_3_T3_V82A_
ATACTAAA_60_T1
ACCTAATC_4_T2
AGGGTGGG_15_T2
AATAGAC
T_13_T3_V82A_
GAC
GG
TTC_
19_T
2
CACCCGCT_28_T2
GAAACACA_19_T3
_V82A
_
CATG
GTCA
_24_
T2
ATCAC
CCT_15_T3_V82A_
CACTAGAT_14_T3
GTGCATTG_8_T3
ACCGTCGA_8_T3_V82A_
TCCAAAAA_9_T3_V82A_
ATGAG
CTA_10_T1
AAATCCGC_27_T1
ACCCGGGC_35_T2CCGCGACT_11_
T2
CCCACGGT_17_T1
GTTTTGAG_12_T2
CTGC
CCCC
_4_T
2
GCCGAGAT_51_T
1
ATTATTAA_11_T2
TAATCAAC_37_T1
TCAAAATT_14_T3_V82A_
CTAAACTT_19_T3_V82A_
GAGTGCGG_11_T2
GTG
TTAG
T_9_
T2
CCAGGGAG_22_T1
TAGCGTTA_7_T3GTCGTGTG_21_T1
CCGG
TGGC
_3_T1
CCCG
CGCA
_20_T
1
GGAT
TGCG
_10_
T1
CTCCCTGT_3_T1_L90M_
AGTCCACC_7_T3_V82A_
TCTGAAGG_29_T1
GCCTCAA
C_9_T3
_V82A
_
TTTGTCCG_9_T3_L90M_CTCCGAAC_18_T3_L90M_
ACAC
ATCG
_11_
T3
AGCAG
TCA_11_T3_V82A_
GTAGCATC_5_T2
CCCAACTT_12_T2
TAATGTCG_8_T2CTCCCCCA_25_T1
GC
AAAGG
G_14_T3_V82A_
GGATGCCG_14_T2
ACGCATTA_8_T3_V82A_
ATCCAAGC_9_T3_V82A_
GCCCG
CTA_12_T3_V82A_
TGAT
CTAT
_14_
T2
TCCTCTGA_7_T3_V82A_
TCAA
CAA
T_9_
T3_V
82A_
CCGCCGCC_25_T2
CCTTGTCG_5_T1
GCCCCTTA_22_T2AGAACAGA_15_T2
TTAATCCG_10_T3_V82A_
AACT
TTG
A_5_
T3_L
90M
_
GAAAG
TAT_19_T3_V82I
AGCACTAA_15_T1
ACCC
ATCA
_5_T
2
TATA
ACAA
_9_T
3_L9
0M_
AGGGTCCC_16_T1
TACC
GCCC
_6_T
1
TAGATATA_13_T3_V82A_
TAGA
GTAA
_29_
T3
ATAGGAAC_13_T3
AATAATCA_14_T3_L90M_
TTGATC
TA_14_T3_V82A_
CCAA
ATGC
_14_
T3_V
82A_
ATAACAAA_7_T3_V82A_ACAGAACC_19
_T3_V
82A_
_A28V_3T_52_GT
GCT
GCA
TTCCTCCC_36_
T1
ACGC
GGGT
_11_
T2
GG
TAAACG_12_T3
TACGTACC_11_T3_V82A_
TGCG
TCGC
_19_
T1
CTACCTAG_7_T2
ACATTTCC_5_T3_V82A_
GCCACGGA_6_T3
CCCGCAAA_15_T1
GAAAT
AGA_
5_T3
_V82
A_
GCGAGCGC_33_T2
ACTTTCCT_9_T3CTATGCGA_18_T3
TCAG
AAGG
_11_
T3
TTCG
TTAA
_5_T
3_L9
0M_
3T_4_AACAT
GAT
ACCTGGGA_22_T3_L90M_
GTGCCCGT_41_T2
CGCC
GACG
_24_
T2
CACG
CGCG
_9_T
2
CATGGAGC_21_T3_V82A_
TAAGGCTC_8_T3_V82A_
CTTCCTAT_6_T3
AGAATG
GC
_15_T3_V82A_
CGGATAAG_3_T1
ATTGGGTA
_20_T1
CAC
AGC
AT_19_T3_V82A_
TAGATCGG_12_T2
ACCTCAAA_26
_T1
CTGCTATT_5_T1
CTTAAAGT_14_T3
TTTAACTC_25_T3_L90M_
CCGC
CATC
_9_T
2
CTAGTTCC_11_T3_V82A_
ACTG
CTCT
_17_
T3
TCCAAACT_5_T3_V82A_
AAGTTACA_4_T3
GCCTCTTG_15_T2
CCAACCCC_4_T2
CTTATGTG
_10_T3_V82A_
GCCT
GTGC
_18_
T2
GG
GTA
GG
A_22
_T1
CGTGTCTC_19_T2
CCCC
GTAG
_13_
T2
TCTTTAG
A_20_T3_V82A_
AAGTCAAG_11_T1
AATCTGGA_8_T3_V82A_
CCG
GG
CAC_
18_T
2
ACGAATAA_15_T3_V82A_
TATT
CAAT_
8_T3
_V82
A_
CACGGACA_6_T3_
V82A_
TACCGGCG_3_T1
AAATAGAA_22_T3_I84V_
ACATTTAA_13_T3_I84V_
CAAGGCCC_17_T2
CCTCACTA_5_T3_V82A_
CGAATATT_35_T1
TTTA
CAAC
_5_T
3
GAG
GG
GCG
_5_T3_V82A_
TGAACGCT_6_T3_V82A_
GTTCTT
AA_21_T3
_V82A
_
ACACCCAC_17_T3_V82A_
ACAAG
TCG
_18_T3_V82A_
CGGTGCAC_13_T2
ACTA
GTC
A_11
_T3
TAAACTAA_3_T3_L90M_
CTGCTCCC_9_T2
TACAG
TCA_14_T3_V82A_
ACTTC
CTT_17_T3_V82A_
CTCATGCC_5_T2
GCGCACAT_11_T2
CGTACGAT_16_T2
AAGTATAT_12_T3_L90M_
CAACGAT
C_10_
T3_V
82A_
AGACCGAA_11_T3_I84V_
GTCAACCA_21_T2
GTGGCGAC_6_
T3_V82A
_
CCGGCGCC_3_T2
CTTCCCCA_25_T2
CCTT
AAAG
_15_
T2
CCAGATAT_21_T3_L90M_CTG
CGTAA_9_T3_V82A_
TGACG
GG
T_24_T2
CTTC
TCTG
_4_T1
AAACTAAG_6_T3
GTTAC
CCT_13_
T3_V
82A_
TCGA
ACCC
_24_
T1
ATAAGTGA_9_T3_V82A_CCCCCTCC_14_T1
ATCTGG
GA_12_T3_V82A_
AGTTCAAA_22_T3
GTCCGCAT_7_T3_V82A_
TACT
ACAT
_13_
T3_L
90M
_
GCGCG
GGG_19_T1
GTTCTCTC_4_T2
TGACCCCA_5_T2
TTAGTTAC_10_T3_L90M_
TTGGAACT_7_T3_I84V_
AAGA
TGTT
_7_T
1
Fig. 3. Phylogenetic representation of protease population derived from deepsequencing with a Primer ID. A Neighbor-Joining tree was constructed fromsequences derived from all three time points and colored based on susceptibilityto ritonavir. Blue colored taxa represent susceptible variants (defined as notV82A/I/L/F, I84V, or L90M). Red colored taxa represent variants containing themajor ritonavir resistant variant, V82A. Pink colored taxa represent the minorresistant variants V82I/L/F. Green and orange colored taxa represent the minorresistant alleles L90M and I84V, respectively. Within a color, color brightness iscorrelated with sample time. Dark green and red arrows point to pre-RTV low-abundance sequences that clonally amplified to their respective clades.
Jabara et al. PNAS | December 13, 2011 | vol. 108 | no. 50 | 20169
MICRO
BIOLO
GY
Dow
nloa
ded
by g
uest
on
May
30,
202
0
the number of pathogen genomes in the sample is limited, and theuse of PCR can obscure the quality of the sampling by creating alarge amount of DNA from a relatively small number of startingtemplates. This can create artificial homogeneity, inflate estimatesof segregating genetic variation, skew the distribution of alleles inthe population, and introduce artificial diversity.We have developed a strategy that allows each sampled tem-
plate to be tagged with a unique ID by a primer that has a de-generate sequence tag incorporated during the primer oligonu-cleotide synthesis (Fig. S7). This tag can then be followed throughthe PCR and the deep sequencing protocol to identify sequenc-ing over-coverage (resampling) of the individual viral templates.Because the Primer ID allows for the identification of over-cov-erage, this can then be used to create a consensus sequence foreach template, avoiding both PCR-related errors and sequencingerrors (Fig. S8). In addition, the number of different Primer IDsreflects the number of templates that were actually sampled. Thisallows a realistic assessment of the depth of population samplingand makes it possible to apply a more rigorous analysis of minorvariants by correcting the allelic skewing during the PCR.We tested the Primer ID approach by sequencing the HIV-1
protease coding domain at three time points in a subject who wasintermittently exposed to a protease inhibitor between the secondand third time points. A key feature of our approach is the re-moval of fortuitous errors and accounting for resampling, whichresults in a dramatic reshaping of the original data set of 72,162reads. Other approaches that rely on statistical modeling havebeen developed to deal with the problem of high sequencing errorrates associated with deep sequencing technologies (49–51). Theuse of the Primer ID to create consensus sequences resulted inthe removal of 80% of the unique sequence polymorphisms(defined as a change in the consensus without regard to frequencyof appearance) in the data set. Similarly, allelic skewing wasdramatic among the sampled sequences, in most cases rangingfrom 2- to 15-fold but going up to nearly 100-fold. Although thePrimer ID reveals such skewing and helps correct it, this is clearlya poorly controlled feature of PCR amplifications that can dra-matically affect the observed abundance of complex populations,especially the minor variants. Allelic skewing may still persist ifthe cDNA primer or the upstream PCR primer binds differen-tially among the templates, or if cDNAs enter the PCR amplifi-cation in later rounds and are discarded because they do notresult in at least three reads to allow a consensus sequence to beformed. Also, residual misincorporation errors by RT and in thefirst round of PCR synthesis still limit the interpretation ofmutations that occur in the range of 0.01–0.1%. This problem isnot overcome with larger numbers of sequences. Given the lowdiversity in these samples, we removed all substitutions thatappeared once because their number approximated the expectednumber of residual sequence errors, and this resulted in a sensi-tivity of detection in the range of 0.1% for SNPs that appearedabove the frequency of the residual sequence error rate.Using the Primer ID approach, we were able to describe a
number of features of the protease sequence population, howeverour results are from a single individual and therefore cannot begeneralized. First, a pooled analysis of two time points six monthsapart showed that the variants present at greater than 0.5% inabundance made up two-thirds of the total population but rep-resented only 4% of unique genome sequences and containedonly 7% of the total unique sequence polymorphisms. About 60%of the diversity was stable over both time points, with synonymousSNPs maintained at a significantly higher proportion in the twotime points than nonsynonymous SNPs. Only 18% of the totaldiversity represented nonsynonymous SNPs that were present atboth time points. However, our ability to assess persistence ofthese sequences is limited by the depth of sampling, although wefeel we are approaching the practical limit of sampling with thistechnology. We observed nonviable substitutions and estimatethat most of the SNPs that appeared once were the result of re-maining method error. We found no pattern of conserved linkage
among these SNPs, consistent with high levels of recombinationacross the population.Although the overall measurement of diversity (π) was similar
between the first two time points, we noted that the biggestchanges in SNP abundance between the two time points were inthree synonymous codon positions (L24L, K70K, and G73G).These dynamic increases made these SNPs part of a larger groupof SNPs that accounted for 51% of the total sequences that wereotherwise identical to the consensus sequence (Q18Q, L19I, L24L,K70K, G73G, and Q18Q/L19I/L24L’). These SNPs also over-lapped the major SNPs that defined subgroups of the resistantvariants (L19I; L19V; G16G/L19V). We considered the possibilitythat there was a unifying feature of these SNPs. We found such afeature in that all of these SNPs, both coding and noncoding, resultin changes in two relatively large alternative ORFs that lie at the5′ and 3′ ends of the pro gene. Alternative reading frames havebeen suggested to generate cryptic CTL epitopes (52–54). In thisscenario, these abundant SNPs would represent various escapemutants. Such selective pressures could explain the dynamic be-havior of several of these SNPs between the first two time points.After intermittent exposure to the protease inhibitor ritonavir,
we were able to identify six independent lineages of drug re-sistance mutations. With the intermittent exposure in this par-ticular subject, it was possible to see the major V82A lineagemost often seen with ritonavir resistance, but also significantpopulations of I84V and L90M. We also saw minor populationsof V82I, V82L, and V82F. This mixed population of resistantlineages likely represents the early stages of the evolution ofresistance, a conclusion supported by the minor appearance ofthe L63P compensatory mutation and the complete absence ofI54V, which is an often seen compensatory mutation for V82A.We saw few examples of genomes with multiple resistancemutations, although these would be expected after more exten-sive selection (55, 56). We and others have previously examinedviral sequences that have been collected in large databases.Typically, these sequences represent the single predominant se-quence within an individual, and the use of these sequencesallows for assessment of interperson diversity. In the future, itwill be an interesting exercise to compare the conclusionsreached by examining viral diversity within a person to viral di-versity between people; however more intraperson diversityneeds to be measured at this level of detail to allow comparisonof inter- versus intraperson diversity.The presence of preexisting drug-resistant variants and their
role in therapy failure is of great interest, and accurate, deepsampling of a viral population can add significantly to our un-derstanding of this question. We were able to detect severalexamples of drug-resistance mutations but only at a very lowlevel. Our ability to reliably detect these mutations is limited tothose that appear at a frequency of 0.1–0.2%, limited in part bythe low overall diversity in the population. We were able to seeexamples of mutations that are typically seen only in the pres-ence of drug selection. However, the detection was usually as onegenome at two time points or two genomes at one time point.This was also the level of detection of active site mutations in theprotease and of termination codons, which must represent eithertransient viral genomes or residual misincorporation errors. Intwo cases, we were able to observe the resistance mutation(V82A and L90M) at pretherapy time points linked to the samepolymorphisms that were present on the variant that grew outduring drug exposure. Thus, although it is likely that we aredetecting relevant preexisting drug-resistant variants, these are atthe limit of detection and, if they are maintained at a steady-statelevel, it is well under 0.5% abundance.Most protocols of high throughput sequencing technologies
still require an initial quantity of DNA that necessitates anupfront PCR step for many applications. The use of a Primer IDwill help clarify the sequencing products in any strategy that usesan initial PCR step with its attendant error rate, recombination,and resampling. In an independent effort Kinde et al. have de-scribed an analogous approach in another deep sequencing
20170 | www.pnas.org/cgi/doi/10.1073/pnas.1110064108 Jabara et al.
Dow
nloa
ded
by g
uest
on
May
30,
202
0
setting (36). We believe a strategy that allows an initial taggingof individual templates before PCR and subsequent sequenceanalysis will be essential for understanding the true complexityand diversity of genetically dynamic populations.
Materials and MethodsViral RNA was isolated from blood plasma using the QIAmp Viral RNA kit(Qiagen). cDNA was generated using SuperScript III Reverse Transcriptase (Invi-trogen) using the primer (with Primer ID) as described. Following the reaction,RNA in hybrid was removed by RNaseH treatment (Invitrogen). Unincorporated
cDNAprimerwas removed, and the cDNAproduct amplifiedby PCR. Sequencingwas done using the 454 platform (Roche). Detailed methods for cDNA tagging,amplification, and analysis are presented in the SI Materials and Methods.
ACKNOWLEDGMENTS. C.B.J. thanks Jesse Walsh for training on the GS FLXplatform. We also thank Dr. Dale Kempf of Abbott Laboratories for makingclinical samples available. This work was supported by National Institutes ofHealth (NIH) Awards GM P01 GM066524 (with subcontract to R.S.) and R37AI44667 (to R.S.). In addition, we received support from University of NorthCarolina (UNC) Center For AIDS Research NIH Award P30 AI50410 and UNCLineberger Comprehensive Cancer Center NIH Award P30 CA16086.
1. Margulies M, et al. (2005) Genome sequencing in microfabricated high-density pico-litre reactors. Nature 437:376–380.
2. Eid J, et al. (2009) Real-time DNA sequencing from single polymerase molecules.Science 323:133–138.
3. Bentley DR, et al. (2008) Accurate whole human genome sequencing using reversibleterminator chemistry. Nature 456:53–59.
4. Metzker ML (2010) Sequencing technologies - the next generation. Nat Rev Genet 11:31–46.
5. Fischer W, et al. (2010) Transmission of single HIV-1 genomes and dynamics of earlyimmune escape revealed by ultra-deep sequencing. PLoS ONE 5:e12303.
6. Wang C, Mitsuya Y, Gharizadeh B, Ronaghi M, Shafer RW (2007) Characterization ofmutation spectra with ultra-deep pyrosequencing: Application to HIV-1 drug re-sistance. Genome Res 17:1195–1201.
7. Hoffmann C, et al. (2007) DNA bar coding and pyrosequencing to identify rare HIVdrug resistance mutations. Nucleic Acids Res 35:e91.
8. Bushman FD, et al. (2008) Massively parallel pyrosequencing in HIV research. AIDS 22:1411–1415.
9. Varghese V, et al. (2009) Minority variants associated with transmitted and acquiredHIV-1 nonnucleoside reverse transcriptase inhibitor resistance: Implications for theuse of second-generation nonnucleoside reverse transcriptase inhibitors. J AcquirImmune Defic Syndr 52:309–315.
10. Mitsuya Y, et al. (2008) Minority human immunodeficiency virus type 1 variants inantiretroviral-naive persons with reverse transcriptase codon 215 revertant muta-tions. J Virol 82:10747–10755.
11. Gunthard HF, Wong JK, Ignacio CC, Havlir DV, Richman DD (1998) Comparativeperformance of high-density oligonucleotide sequencing and dideoxynucleotide se-quencing of HIV type 1 pol from clinical samples. AIDS Res Hum Retroviruses 14:869–876.
12. Van Laethem K, et al. (1999) Phenotypic assays and sequencing are less sensitive thanpoint mutation assays for detection of resistance in mixed HIV-1 genotypic pop-ulations. J Acquir Immune Defic Syndr 22:107–118.
13. Palmer S, et al. (2006) Persistence of nevirapine-resistant HIV-1 in women after single-dose nevirapine therapy for prevention of maternal-to-fetal HIV-1 transmission. ProcNatl Acad Sci USA 103:7094–7099.
14. Flys TS, et al. (2006) Quantitative analysis of HIV-1 variants with the K103N resistancemutation after single-dose nevirapine in women with HIV-1 subtypes A, C, and D.J Acquir Immune Defic Syndr 42:610–613.
15. Cai F, et al. (2007) Detection of minor drug-resistant populations by parallel allele-specific sequencing. Nat Methods 4:123–125.
16. Beck IA, et al. (2008) Optimization of the oligonucleotide ligation assay, a rapid andinexpensive test for detection of HIV-1 drug resistance mutations, for non-NorthAmerican variants. J Acquir Immune Defic Syndr 48:418–427.
17. Johnson JA, et al. (2008) Minority HIV-1 drug resistance mutations are present inantiretroviral treatment-naive populations and associate with reduced treatmentefficacy. PLoS Med 5:e158.
18. Johnson JA, et al. (2007) Simple PCR assays improve the sensitivity of HIV-1 subtype Bdrug resistance testing and allow linking of resistance mutations. PLoS ONE 2:e638.
19. Metzner KJ, et al. (2003) Emergence of minor populations of human immunodefi-ciency virus type 1 carrying the M184V and L90M mutations in subjects undergoingstructured treatment interruptions. J Infect Dis 188:1433–1443.
20. Paredes R, Marconi VC, Campbell TB, Kuritzkes DR (2007) Systematic evaluation ofallele-specific real-time PCR for the detection of minor HIV-1 variants with pol andenv resistance mutations. J Virol Methods 146:136–146.
21. Li JZ, et al. (2011) Low-frequency HIV-1 drug resistance mutations and risk of NNRTI-based antiretroviral treatment failure: A systematic review and pooled analysis. JAMA305:1327–1335.
22. Metzner KJ, et al. (2009) Minority quasispecies of drug-resistant HIV-1 that lead toearly therapy failure in treatment-naive and -adherent patients. Clin Infect Dis 48:239–247.
23. Halvas EK, et al. (2006) Blinded, multicenter comparison of methods to detect a drug-resistant mutant of human immunodeficiency virus type 1 at low frequency. J ClinMicrobiol 44:2612–2614.
24. Mansky LM, Temin HM (1995) Lower in vivo mutation rate of human immunodefi-ciency virus type 1 than that predicted from the fidelity of purified reverse tran-scriptase. J Virol 69:5087–5094.
25. Perelson AS, Neumann AU, Markowitz M, Leonard JM, Ho DD (1996) HIV-1 dynamicsin vivo: Virion clearance rate, infected cell life-span, and viral generation time. Science271:1582–1586.
26. Hughes JP, Totten P (2003) Estimating the accuracy of polymerase chain reaction-based tests using endpoint dilution. Biometrics 59:505–511.
27. Kanagawa T (2003) Bias and artifacts in multitemplate polymerase chain reactions(PCR). J Biosci Bioeng 96:317–323.
28. Meyerhans A, Vartanian JP, Wain-Hobson S (1990) DNA recombination during PCR.Nucleic Acids Res 18:1687–1691.
29. Yang YL, Wang G, Dorman K, Kaplan AH (1996) Long polymerase chain reactionamplification of heterogeneous HIV type 1 templates produces recombination ata relatively high frequency. AIDS Res Hum Retroviruses 12:303–306.
30. Liu SL, et al. (1996) HIV quasispecies and resampling. Science 273:415–416.31. Judo MS, Wedel AB, Wilson C (1998) Stimulation and suppression of PCR-mediated
recombination. Nucleic Acids Res 26:1819–1825.32. Salazar-Gonzalez JF, et al. (2008) Deciphering human immunodeficiency virus type 1
transmission and early envelope diversification by single-genome amplification andsequencing. J Virol 82:3952–3970.
33. Palmer S, et al. (2005) Multiple, linked human immunodeficiency virus type 1 drugresistance mutations in treatment-experienced patients are missed by standard ge-notype analysis. J Clin Microbiol 43:406–413.
34. Simmonds P, Balfe P, Ludlam CA, Bishop JO, Brown AJ (1990) Analysis of sequencediversity in hypervariable regions of the external glycoprotein of human immuno-deficiency virus type 1. J Virol 64:5840–5850.
35. Edmonson PF, Mullins JI (1992) Efficient amplification of HIV half-genomes fromtissue DNA. Nucleic Acids Res 20:4933.
36. Kinde I, Wu J, Papadopoulos N, Kinzler KW, Vogelstein B (2011) Detection andquantification of rare mutations with massively parallel sequencing. Proc Natl AcadSci USA 108:9530–9535.
37. Cameron DW, et al. (1998) Randomised placebo-controlled trial of ritonavir in ad-vanced HIV-1 disease. The Advanced HIV Disease Ritonavir Study Group. Lancet 351:543–549.
38. Potter J, Zheng W, Lee J (2003) Thermal stability and cDNA synthesis capability ofSuperScript III reverse transcriptase. Focus (Invitrogen, Carlsbad, CA), pp 19–24.
39. Barnes WM (1992) The fidelity of Taq polymerase catalyzing PCR is improved by anN-terminal deletion. Gene 112:29–35.
40. Shriner D, Liu Y, Nickle DC, Mullins JI (2006) Evolution of intrahost HIV-1 genetic di-versity during chronic infection. Evolution 60:1165–1176.
41. Johnson VA, et al. (2005) Update of the drug resistance mutations in HIV-1: Fall 2005.Top HIV Med 13:125–131.
42. Shafer RW, Jung DR, Betts BJ (2000) Human immunodeficiency virus type 1 reversetranscriptase and protease mutation search engine for queries. Nat Med 6:1290–1292.
43. Carrillo A, et al. (1998) In vitro selection and characterization of human immunode-ficiency virus type 1 variants with increased resistance to ABT-378, a novel proteaseinhibitor. J Virol 72:7532–7541.
44. Eastman PS, et al. (1998) Genotypic changes in human immunodeficiency virus type 1associated with loss of suppression of plasma viral RNA levels in subjects treated withritonavir (Norvir) monotherapy. J Virol 72:5154–5164.
45. Drake JW, Holland JJ (1999) Mutation rates among RNA viruses. Proc Natl Acad SciUSA 96:13910–13913.
46. Duffy S, Shackelton LA, Holmes EC (2008) Rates of evolutionary change in viruses:Patterns and determinants. Nat Rev Genet 9:267–276.
47. Onafuwa-Nuga A, Telesnitsky A (2009) The remarkable frequency of human immu-nodeficiency virus type 1 genetic recombination. Microbiol Mol Biol Rev 73:451–480.
48. Shafer RW (2009) Low-abundance drug-resistant HIV-1 variants: Finding significancein an era of abundant diagnostic and therapeutic options. J Infect Dis 199:610–612.
49. Eriksson N, et al. (2008) Viral population estimation using pyrosequencing. PLOSComput Biol 4:e1000074.
50. Zagordi O, Klein R, Daumer M, Beerenwinkel N (2010) Error correction of next-generationsequencingdataandreliableestimationofHIVquasispecies.NucleicAcidsRes38:7400–7409.
51. Zagordi O, Geyrhofer L, Roth V, Beerenwinkel N (2010) Deep sequencing of a ge-netically heterogeneous sample: Local haplotype reconstruction and read error cor-rection. J Comput Biol 17:417–428.
52. Cardinaud S, et al. (2004) Identification of cryptic MHC I-restricted epitopes encodedby HIV-1 alternative reading frames. J Exp Med 199:1053–1063.
53. Bansal A, et al. (2010) CD8 T cell response and evolutionary pressure to HIV-1 crypticepitopes derived from antisense transcription. J Exp Med 207:51–59.
54. Berger CT, et al. (2010) Viral adaptation to immune selection pressure by HLA class I-restricted CTL responses targeting epitopes in HIV frameshift sequences. J Exp Med207:61–75.
55. Resch W, Parkin N, Watkins T, Harris J, Swanstrom R (2005) Evolution of human im-munodeficiency virus type 1 protease genotypes and phenotypes in vivo under se-lective pressure of the protease inhibitor ritonavir. J Virol 79:10638–10649.
56. HanceAJ, et al. (2001) Changes in human immunodeficiency virus type 1populations aftertreatment interruption in patients failing antiretroviral therapy. J Virol 75:6410–6417.
Jabara et al. PNAS | December 13, 2011 | vol. 108 | no. 50 | 20171
MICRO
BIOLO
GY
Dow
nloa
ded
by g
uest
on
May
30,
202
0