Null Alleles in Genetic Genealogy
Thomas Krahn, FTDNA Conference 2009
Definition of Null Allele
● Original meaning: A mutant copy of a gene that completely lacks that gene's normal function. (Wikipedia)
[Figure: gene expression pathway. DNA (promoter, gene) -> transcription by RNA polymerase -> splicing -> mRNA -> translation at ribosomes -> protein]
Not a sharp definition.
Many things can go wrong in the complex gene expression process.
Definition of Null Allele
● Concerning DNA markers: A DNA segment of good quality, delimited by the two primers of a PCR reaction, that doesn't yield a PCR product in some biological samples while all other samples of that kind show a clearly detectable signal with the same PCR reaction.
Note: This is my own definition. Other definitions I found in the literature and on the internet usually focus on a very narrow subtype of DNA segment, e.g. STR markers.
For a PCR reaction we need a solution of intact DNA. Degraded (sheared) DNA cannot be amplified because the Taq polymerase needs to extend one DNA strand all the way to the reverse primer binding site. If the Taq polymerase drops off the DNA segment before it reaches the reverse primer site, we will not get an exponential amplification. Since degradation is a property of the sample and not a heritable trait that can be passed on to descendants, we exclude degraded DNA from being a Null Allele for genealogical purposes.
All our known STR markers (e.g. DYS391, DYF385S1, vWA etc.) are DNA segments that are defined by flanking PCR primer sequences. DYS stands for “DNA Y-chromosome Segment”. The famous database GDB, which recorded all primer pairs, has unfortunately been offline since summer 2007, so it is sometimes difficult to look up the exact primers for D markers from older publications. GenBank still keeps a record of a subset of the GDB markers. There they are also called STS markers (= Sequence Tagged Sites). An STS may also contain one or more SNP markers.
The actual characteristic of a Null Allele is that we can't detect a signal from a PCR product. We'll go into detail later on what “detection” means, but this already makes clear that we need to precisely define a detection limit above the background noise of the detection instrument. Some mutations in the primer binding region don't completely inhibit the formation of a PCR product, so that a small signal persists despite the mutation. With alternative assays, a sample with such a small residual signal may still be identified as a Null Allele.
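The idea of a detection limit above the instrument's background noise can be sketched as follows. This is only an illustration, not the lab's actual peak-calling procedure: the function names, the baseline readings and the factor `k = 10` standard deviations are all assumptions made up for the example.

```python
from statistics import mean, stdev

def detection_limit(baseline, k=10.0):
    """Threshold = mean background + k standard deviations.

    The factor k is an assumed value chosen for illustration only.
    """
    return mean(baseline) + k * stdev(baseline)

def call_signal(peak_height, baseline, k=10.0):
    """Classify one marker's peak height against the detection limit."""
    if peak_height > detection_limit(baseline, k):
        return "detected"
    return "no signal (possible Null Allele)"

# Hypothetical fluorescence readings (arbitrary units) from a signal-free region.
baseline = [4, 6, 5, 7, 5, 6, 4, 5]

print(call_signal(1200, baseline))  # a normal, strong peak
print(call_signal(40, baseline))    # a weak residual peak still above the limit
print(call_signal(9, baseline))     # indistinguishable from background
```

A weak residual signal, as produced by a primer-site mutation that only partly inhibits PCR, lands just above such a threshold, which is why the limit has to be defined explicitly.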
Here comes the population genetic aspect of Null Alleles as usable phylogenetic markers. It is, however, important to understand the molecular genetic background of the mutation mechanism. Some of these genetic changes may occur independently on completely different branches of the phylogenetic tree, and some of them may even be reversible. Depending on the stability of the marker, we may need to select independent assays to restrict or confirm the phylogenetic position of a Null Allele marker.
The phrase “all other samples of that kind” makes clear that every Null Allele requires a positive control. This is usually easy with routine STR markers. However, if “other samples” is restricted to a narrow population, the samples with Null Alleles may become the majority.
Alternatively, a competitive primer may sometimes be designed that inverts the definition of a Null Allele marker: this primer matches only the samples that carry the mutation and doesn't yield a PCR product for the “normal” samples.
In our lab we have designed assays that combine both primers so that we are able to properly distinguish the alleles and always get at least one positive result.
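The logic of an assay that combines the regular primer with a mutation-specific (competitive) primer can be sketched like this. It is a minimal illustration under stated assumptions: the function name and the result strings are hypothetical, and each signal is reduced to a simple yes/no value.

```python
def interpret_combined_assay(normal_primer_signal: bool,
                             mutation_primer_signal: bool) -> str:
    """Interpret a two-primer assay in which every valid sample gives one positive result."""
    if normal_primer_signal and not mutation_primer_signal:
        return "normal allele"
    if mutation_primer_signal and not normal_primer_signal:
        return "Null Allele (confirmed by a positive signal)"
    if not normal_primer_signal and not mutation_primer_signal:
        return "no result: sample or assay failed, repeat the test"
    return "both signals: ambiguous, needs follow-up"

print(interpret_combined_assay(True, False))
print(interpret_combined_assay(False, True))
print(interpret_combined_assay(False, False))
```

The point of the design is the third case: a complete absence of signal no longer looks like a Null Allele, because a true Null Allele sample is expected to light up with the mutation-specific primer.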
Basics
● Capillary electrophoresis to detect the PCR products
[Figure: capillary electrophoresis of PCR products, with the negative (-) and positive (+) electrodes marked]
What can go wrong at PCR?
● Bad DNA template
● Assay doesn't work
● Detection method fails
If we can exclude the above, but still get no signal from a PCR product
=> then a NULL allele is very likely
(...but not proven).
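The exclusion reasoning above can be written down as a small checklist. This is a sketch, not a validated lab protocol: the function name and the three boolean inputs are assumptions, and how each one is established (for example by other markers of the same sample, by positive controls, or by the size standard) is left to the lab.

```python
def assess_missing_signal(template_ok: bool, assay_ok: bool, detection_ok: bool) -> str:
    """Interpret a marker that gave no PCR signal, following the checklist above."""
    problems = []
    if not template_ok:
        problems.append("bad DNA template")
    if not assay_ok:
        problems.append("assay doesn't work")
    if not detection_ok:
        problems.append("detection method fails")
    if problems:
        return "cannot call a Null Allele: " + ", ".join(problems)
    # All technical causes excluded: likely, but still not proven.
    return "Null Allele very likely (but not proven)"

print(assess_missing_signal(True, True, True))
print(assess_missing_signal(False, True, True))
```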
DYS439 Null mutation (L1)
Not observed when STR testing was performed in the GRC lab because we use a different forward primer.
DYS437 Null
DYS391 Null
DYS463 Null
DYS565 Null
DYS448 Null
PCR with more distant primers did NOT yield any PCR products.
[Figure: agarose gel of DYS448, amplified with the regular primer pair and with the outer primers]
Size standard bands (bp): 4000, 2000, 1250, 800, 500, 300, 200, 100
Lanes:
GRC005356 -> DYS448 Null
GRC003436 -> DYS448 19
GRC000001 -> DYS448 18
GRC000027 -> female
Size Standard
DYS448 Null PCR product on agarose gel
The DYS448 Y-STR marker has been amplified with alternative primers
DYS448_f: GAGGAGGATATGTCAAAGGATTC
DYS448_r: CAGTTTCACTTCATGTTTGGG
and PCR products have been sized on an agarose gel (FlashGel 1.2% agarose, Lonza, 200 V / 5 min).
The positive controls (19 and 18 repeats) show a band at the expected size of ca. 800 bp.
The female negative control and the DYS448 Null allele sample don't have a PCR product and their lanes on the gel are empty. Amplification assays with alternative primer sets practically eliminate the hypothesis of an inhibited PCR due to a mutation on the primer binding site.
DYS448 Null
Palindromic Pack results are generally inconspicuous...
Except DYF397 has possibly only 2 alleles
DYS448 Null
DYS448 is located on the unique loop of the P3 palindrome
[Figure: map of the palindromes P1, P2 and P3 with marker positions: DYS464 (G-type, C-type), DYS725, DYF399 (T-type, C-type, ins G T-type), DYF408 (188 bp), DYS724, DYS459, DYF385, DYF401, DYF397, DYF387, DYF371 (C-type), N.N., DYS448, DYS392, DYS485, DYS461, DYS452]
DYS392 is NOT missing
DYF397 only 2 alleles
DYS464 and DYS725 only 2 alleles
DYF399 only 2 alleles and no .1 allele
DYS448 = Null
DYS464 = 14-15
DYS459 = 9-10
DYS401 = 14-17
DYF408 = 188-188-8-13
DYF399 = 21t-25c (no .1 allele!)
DYF397 = 14-15
DYS725 = 31-31
DYS392 = 11
DYS448 Null
[Figure: palindrome map with marker labels, annotated “Loop Constellation!” and “Recombination”]
DYS448 Null
[Figure: palindrome map with marker labels, annotated “Loop Constellation!”]
DYS389 Null
[Figure: DYS389 results, panels: Yfiler, Singleplex]
DYS389 Null
[Figure: DYS389 repeat structure, repeat units numbered per block (1-5, 1-10, 1-3, 1-11), with DYS389I and DYS389II marked]
Deletion of the middle fragment in between DYS389I and DYS389B
The nomenclature of DYS389 is defined as
DYS389I: [TCTG]q [TCTA]r = GenBank top strand
DYS389II: [TCTG]n [TCTA]p [TCTG]q [TCTA]r = GenBank top strand
See: http://www.cstl.nist.gov/biotech/strbase/str_y389.htm
The deleted sample matches the first 5 repeats [TCTG] from the related samples in R1b1c. It shows 10 repeats of TCTA which we can align to the left or to the right side.
5 x [TCTG] + 10 x [TCTA] = 15 repeat units (the same total for either alignment)
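The repeat arithmetic for the deleted allele can be checked with a few lines of code. This is a minimal sketch: the sequence is built from the repeat counts stated above rather than taken from a real sequencing read, and the function name is hypothetical.

```python
def count_repeat_blocks(seq, unit_length=4):
    """Split a pure tandem-repeat sequence into runs of identical repeat units."""
    units = [seq[i:i + unit_length] for i in range(0, len(seq), unit_length)]
    blocks = []
    for unit in units:
        if blocks and blocks[-1][0] == unit:
            blocks[-1][1] += 1          # same unit as before: extend the run
        else:
            blocks.append([unit, 1])    # new unit: start a new run
    return [(unit, count) for unit, count in blocks]

# Repeat region of the deleted allele as described above: 5 x TCTG, then 10 x TCTA.
deleted_allele = "TCTG" * 5 + "TCTA" * 10

blocks = count_repeat_blocks(deleted_allele)
total = sum(count for _, count in blocks)
print(blocks)  # [('TCTG', 5), ('TCTA', 10)]
print(total)   # 15
```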
DYS389 Null
Peak shows up at “16”, but the allele really has 15 repeats!
DYS389 Null
13 24 14 10 11 15 12 12 12 15 13 0 18 9 10 11 11 25 15 19 30 15 15 17 1813 24 14 10 11 15 12 12 12 13 13 30 18 9 10 11 11 25 15 19 30 15 15 17 1713 24 14 10 11 15 12 12 12 13 13 30 18 9 10 11 11 25 15 19 30 15 15 17 1713 24 14 10 11 15 12 12 12 13 13 29 18 9 10 11 11 25 15 19 30 15 16 16 1713 24 14 10 11 15 12 12 12 13 13 30 18 9 10 11 11 25 15 19 30 15 15 16 1713 24 14 10 11 15 12 12 12 13 13 29 18 9 10 11 11 25 15 19 30 13 15 17 1713 24 14 10 11 15 12 12 12 13 13 29 18 9 10 11 11 25 15 19 30 13 15 17 1713 24 14 10 11 15 12 12 12 13 13 30 18 9 10 11 11 25 15 19 30 15 15 17 1713 24 14 10 11 15 12 12 12 13 13 31 18 9 10 11 11 25 15 19 30 15 15 17 1713 24 14 10 11 15 12 12 12 13 13 30 18 9 10 11 11 25 15 19 30 15 15 16 1713 24 14 10 11 15 12 12 12 14 13 30 18 9 10 11 11 25 15 19 30 13 15 16 1713 24 14 10 11 15 12 12 12 13 13 30 18 9 10 11 11 25 15 19 30 15 15 17 1713 24 14 10 11 15 12 12 12 13 13 29 18 9 10 11 11 25 15 19 30 13 15 17 1713 24 14 10 11 15 12 12 12 13 13 30 18 9 10 11 11 25 15 19 30 15 15 17 1713 24 14 10 11 15 12 12 12 13 13 30 18 9 10 11 11 25 15 19 30 15 15 16 1713 24 14 10 11 15 12 12 12 13 13 30 18 9 10 11 11 25 15 19 30 15 15 17 1713 24 14 10 11 15 12 12 12 13 13 30 18 9 10 11 11 25 15 19 30 15 15 16 1713 24 14 10 11 15 12 12 12 13 13 30 18 9 10 11 11 25 15 19 30 15 15 16 1713 24 14 10 11 15 12 12 12 13 13 29 18 9 10 11 11 25 15 19 30 13 15 16 1713 24 14 10 11 15 12 12 12 13 13 29 18 9 10 11 11 25 15 19 30 13 13 15 1513 24 14 10 11 15 12 12 12 13 13 29 18 9 10 11 11 25 15 19 30 13 16 16 1713 24 14 10 11 15 12 12 12 13 13 29 18 9 10 11 11 25 15 19 30 15 15 15 1713 24 14 10 11 15 12 12 12 13 13 30 18 9 10 11 11 25 15 19 30 13 15 16 1713 24 14 10 11 15 12 12 12 13 13 29 18 9 10 11 11 25 15 19 30 12 15 16 1713 24 14 10 11 15 12 12 12 13 13 30 18 9 10 11 11 25 15 19 30 15 15 17 1713 24 14 10 11 15 12 12 12 13 13 29 18 9 10 11 11 25 15 19 30 13 15 17 1713 24 14 10 11 15 12 12 12 13 13 30 18 9 10 11 11 25 15 19 30 15 15 17 1713 24 14 10 11 15 12 12 12 13 
13 30 18 9 10 11 11 25 15 19 30 15 15 17 1713 24 14 10 11 15 12 12 12 13 13 30 18 9 10 11 11 25 15 19 30 15 15 17 1713 24 14 10 11 15 12 12 12 13 13 30 18 9 10 11 11 25 15 19 30 15 15 17 1713 24 14 10 11 15 12 12 12 13 13 29 18 9 10 11 11 25 15 19 30 13 15 17 1713 24 14 10 11 15 12 12 12 13 13 30 18 9 10 11 11 25 15 19 30 15 15 17 1713 24 14 10 11 15 12 12 12 13 13 30 18 9 10 11 11 25 15 19 30 15 15 17 1713 24 14 10 11 15 12 12 12 13 13 29 18 9 10 11 11 25 15 19 30 13 15 17 1713 24 14 10 11 15 12 12 12 13 13 29 18 9 10 11 11 25 15 19 30 15 15 17 1713 24 14 10 11 15 12 12 12 13 13 29 18 9 10 11 11 25 15 19 30 15 15 17 1713 24 14 10 11 15 12 12 12 13 13 29 18 9 10 11 11 25 15 19 30 13 15 16 1713 24 14 10 11 15 12 12 12 13 13 29 18 9 10 11 11 25 15 19 30 13 15 17 1713 24 14 10 11 15 12 12 12 13 13 30 18 9 10 11 11 25 15 19 30 15 15 17 1713 24 14 10 11 15 12 12 12 13 13 29 18 9 10 11 11 25 15 19 30 13 15 17 17
393
390
19
391
385a
385b
426
388
439
389|1
392
389|2
458
459a
459b
455
454
447
437
448
449
464a
464b
464c
464d
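Spotting the unusual haplotype in a list like the one above is easy to automate. The sketch below uses the 25 marker names and three of the rows from the slide; the function names are hypothetical, and treating a value of 0 as “no signal” is an assumption about how the table encodes a Null Allele.

```python
MARKERS = ["393", "390", "19", "391", "385a", "385b", "426", "388", "439",
           "389|1", "392", "389|2", "458", "459a", "459b", "455", "454",
           "447", "437", "448", "449", "464a", "464b", "464c", "464d"]

# First row (the DYS389 Null sample) and two of the related samples.
rows = [
    "13 24 14 10 11 15 12 12 12 15 13 0 18 9 10 11 11 25 15 19 30 15 15 17 18",
    "13 24 14 10 11 15 12 12 12 13 13 30 18 9 10 11 11 25 15 19 30 15 15 17 17",
    "13 24 14 10 11 15 12 12 12 13 13 29 18 9 10 11 11 25 15 19 30 15 16 16 17",
]

def parse(row):
    """Turn one space-separated haplotype row into a marker -> value mapping."""
    values = [int(v) for v in row.split()]
    if len(values) != len(MARKERS):
        raise ValueError("row does not have 25 values")
    return dict(zip(MARKERS, values))

def null_markers(row):
    """Markers reported as 0, i.e. without a detectable signal (assumed encoding)."""
    return [m for m, v in parse(row).items() if v == 0]

def differences(row_a, row_b):
    """Markers where two haplotypes differ, with both values."""
    a, b = parse(row_a), parse(row_b)
    return {m: (a[m], b[m]) for m in MARKERS if a[m] != b[m]}

print(null_markers(rows[0]))          # ['389|2']
print(differences(rows[0], rows[1]))  # {'389|1': (15, 13), '389|2': (0, 30), '464d': (18, 17)}
```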
DYS389 Null
DYS389 Null
[Figure: DYS389 repeat structure, repeat units numbered per block (1-5, 1-10, 1-3, 1-11), with DYS389I and DYS389II marked. Steps shown: looping constellation, recombination, deletion. Labels: 5? 10? 3? 10?, 13?, 28?, 5 10]
DYS425 Null
[Figure: map of the palindromes P1, P2, P3, P4, P5 and P8 with marker positions: DYS413, YCAII, DYF408 (188 bp), DYS385a*, DYS385b*, DYS464 (G-type, C-type), DYS725, DYF399 (T-type, C-type, ins G T-type), DYS724, DYS459, DYF385, DYF401, DYF411, DYF395, DYF397, DYF387, DYF371 (T-type, C-type), N.N., DYS448, DYS392, DYS485, DYS452, DYS461, DYS390]
DYS425 = DYF371 T-type allele
The T-type SNP can get lost by a recLOH
This is seen as a “NULL-Allele” if only DYS425 is tested
DYS425 Null / DYF371X
[Figure: DYF371X results, panels: DYS425 12, DYS425 Null]
DYS425 Null
The HUGO sequence also has a Null allele at DYS425:
10c-10c-13c-14c
Normally in R1b (and most other haplogroups):
10c-12t-13c-14c
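The relationship between the two markers can be made explicit in code: a DYS425-only test sees just the T-type copy of DYF371. This is a sketch that assumes the DYF371 result is written as in the two examples above (repeat count followed by `c` or `t`, joined by hyphens); the function name is hypothetical.

```python
def dys425_from_dyf371(dyf371: str) -> str:
    """Derive the DYS425 result from a typed DYF371 result such as '10c-12t-13c-14c'."""
    alleles = dyf371.split("-")
    # DYS425 corresponds to the T-type copy; the trailing letter is the SNP type.
    t_type = [allele[:-1] for allele in alleles if allele.endswith("t")]
    return "-".join(t_type) if t_type else "Null"

print(dys425_from_dyf371("10c-12t-13c-14c"))  # 12
print(dys425_from_dyf371("10c-10c-13c-14c"))  # Null
```

When the T-type SNP has been lost by a recLOH, all four copies are still present and typeable with DYF371, so the “Null” here reflects the assay design rather than missing DNA.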
Multi Marker Deletion
Marker    Allele  Region  Start     Stop
DYS393    13      ChrY    3191128   3191246
DYS19     16      ChrY    10131934  10132128
DYS391    10      ChrY    12612758  12613044
DYS437    14      ChrY    12976972  12977163
DYS439    11      ChrY    13025167  13025418
DYS389I   14      ChrY    13122100  13122515
DYS389II  32      ChrY    13122100  13122515
DYS388    13      ChrY    13256856  13257013
DYS438    10      ChrY    13447189  13447409
DYS390    0       ChrY    15784268  15784613
DYS426    0       ChrY    17644207  17644303
DYS385b   0       ChrY    19260844  19261212
DYS385a   0       ChrY    19301724  19302104
DYS392    0       ChrY    21043146  21043399
Possible P1/P5 deletion in the palindromic region
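Sorting the markers by position shows why a single large deletion is a plausible explanation: the five missing markers form one uninterrupted block. The sketch below uses the coordinates from the table above; the function name and the result keys are hypothetical, and allele 0 is taken to mean “no signal”.

```python
# (marker, allele, start, stop) on ChrY, copied from the table above; allele 0 = no signal.
MARKERS = [
    ("DYS393", 13, 3191128, 3191246),
    ("DYS19", 16, 10131934, 10132128),
    ("DYS391", 10, 12612758, 12613044),
    ("DYS437", 14, 12976972, 12977163),
    ("DYS439", 11, 13025167, 13025418),
    ("DYS389I", 14, 13122100, 13122515),
    ("DYS389II", 32, 13122100, 13122515),
    ("DYS388", 13, 13256856, 13257013),
    ("DYS438", 10, 13447189, 13447409),
    ("DYS390", 0, 15784268, 15784613),
    ("DYS426", 0, 17644207, 17644303),
    ("DYS385b", 0, 19260844, 19261212),
    ("DYS385a", 0, 19301724, 19302104),
    ("DYS392", 0, 21043146, 21043399),
]

def deletion_bounds(markers):
    """Summarise the region covered by the missing markers and its nearest present neighbours."""
    nulls = [m for m in markers if m[1] == 0]
    present = [m for m in markers if m[1] != 0]
    if not nulls:
        return None
    first_start = min(m[2] for m in nulls)
    last_stop = max(m[3] for m in nulls)
    before = [m[3] for m in present if m[3] < first_start]
    after = [m[2] for m in present if m[2] > last_stop]
    return {
        "null_markers": [m[0] for m in sorted(nulls, key=lambda m: m[2])],
        "minimal_region": (first_start, last_stop),
        # Present markers inside the region would argue against one contiguous deletion.
        "present_markers_inside": [m[0] for m in present if first_start <= m[2] <= last_stop],
        "last_present_stop_before": max(before) if before else None,
        "first_present_start_after": min(after) if after else None,
    }

result = deletion_bounds(MARKERS)
print(result["null_markers"])
print(result["minimal_region"])
print(result["last_present_stop_before"])
```

With these data the deletion must span at least positions 15784268 to 21043399 and starts somewhere after DYS438 (stop 13447409); none of the tested markers lies beyond DYS392, so the far end is not bounded by this panel.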
GRC Lab Pics
Astrid Krahn (mt hg J)
Dr. Connie Bormans (mt hg I)
Jory Clark (Y hg T)
Brent Maning (Y hg R-U106*)
Dr. Arjan Bormans (Y hg R-L2*)
...and our other lab coworkers
Thanks for listening!