Molecular Genetic Methods in Psychology
www.well.ox.ac.uk/~tprice/presentations.xml
Tom Price
Recap: Heredity
• ‘Heritable’ characteristics are influenced by genetic variation (Mendel’s pea plants)
• Traits are correlated within families (Galton)
• Twin and adoption studies provide evidence of heritability
How?
• Crick and Watson (1953) provided the mechanism.
“the single biggest advance in molecular biology”
Central Dogma
DNA
• DNA exists in the nucleus in twin strands
• Each strand consists of A, C, G, T ‘bases’ on a sugar-phosphate ‘backbone’
• Each base binds only to its complement
• The sequence of bases along a strand is called the ‘DNA sequence’
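The pairing rule above (A with T, C with G) can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides; the function names are hypothetical.

```python
# Watson-Crick pairing: each base binds only to its complement.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(seq: str) -> str:
    """Return the base-by-base complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in seq.upper())


def reverse_complement(seq: str) -> str:
    """Return the opposite strand, read in its own 5'->3' direction."""
    return complement(seq)[::-1]


print(complement("ATGC"))          # TACG
print(reverse_complement("ATGC"))  # GCAT
```

Because the two strands run in opposite directions, the reverse complement is the form in which the partner strand is normally written.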
DNA Replication
• During replication the DNA molecule unwinds, with each single strand becoming a template for synthesis of a new, complementary strand. Each daughter molecule, consisting of one old and one new DNA strand, is an exact copy of the parent molecule.
Transcription & Translation
• DNA is first transcribed (copied) to a molecule of messenger RNA in a process similar to DNA replication.
• The mRNA molecules then leave the cell nucleus and enter the cytoplasm to be translated into protein in the ribosomes.
• Triplets of bases (codons) in the mRNA form the genetic code that specifies the particular amino acids that make up an individual protein.
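As a rough sketch of the two steps above, the following Python reads a DNA coding strand, transcribes it to mRNA, and translates codons until a stop codon. The codon table is only a small excerpt of the standard genetic code, and the function names are illustrative assumptions.

```python
# Excerpt of the standard genetic code (mRNA codon -> amino acid).
CODON_TABLE = {
    "AUG": "Met", "UUU": "Phe", "GGC": "Gly", "GCU": "Ala",
    "UAA": "Stop", "UAG": "Stop", "UGA": "Stop",
}


def transcribe(coding_strand: str) -> str:
    """mRNA carries the coding-strand sequence with U in place of T."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna: str) -> list[str]:
    """Read triplets from the first AUG until a stop codon is reached."""
    start = mrna.find("AUG")
    if start == -1:
        return []
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE.get(mrna[i:i + 3], "???")  # ??? = not in excerpt
        if amino_acid == "Stop":
            break
        protein.append(amino_acid)
    return protein


mrna = transcribe("ATGTTTGGCTAA")
print(mrna)             # AUGUUUGGCUAA
print(translate(mrna))  # ['Met', 'Phe', 'Gly']
```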
Genes
• A gene is a region of DNA whose sequence encodes a protein.
• The human genome contains ~30,000 genes.
• Only about 10% of the genome is known to include the protein-coding sequences (exons) of genes.
[Figure: gene structure, showing the start of transcription, exons and introns]
Chromosomes
• Humans are ‘diploid’: each chromosome is present in 2 copies, one inherited from each parent
• Humans have 22 pairs of ‘autosomal’ chromosomes and 1 pair of sex chromosomes (XX for females, XY for males)
The extra copy of chromosome 21 identifies this individual as having Down syndrome.
Genetic Variation
• Genetic variants (polymorphisms) arise by mutation, either spontaneously or from radiation, viruses, cancer, toxins…
• Mutations in coding regions can change the gene product (‘coding variations’) – or not (‘silent mutations’)
• Variations in non-coding regions can affect transcription (‘gene expression’)
• Most variation occurs in ‘junk’ DNA
Polymorphisms
• Deletion (e.g. Williams Syndrome)
• Polysomy (e.g. Down Syndrome)
• Variable-number repeat (e.g. Fragile X)
• Single-Nucleotide Polymorphism (e.g. FOXP2 mutation in KE family with severe speech disorder)
• Insertions, inversions, translocations…
Meiosis and Recombination
• During meiosis, the chromosomes duplicate, then cross over (‘recombine’) to produce a haploid gamete (sperm/egg)
• The gamete derives genetic variants from both parents
• Meiosis is the basis for heredity
[Figure: Mother → (meiosis) → Egg; Father → (meiosis) → Sperm; Egg + Sperm → (fertilisation) → Child]
Alleles and Genotype
• Alleles = the genetic variants that exist at a particular genetic location (locus)
• Genotype = the alleles present at a locus (cp. Phenotype = trait(s) of organism)
• Homozygous = 2 of same allele
• Heterozygous = different alleles
• Allele frequency = % of allele in a population
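These definitions can be made concrete with a small Python sketch. The four genotypes below are made-up illustration data, and the helper names are hypothetical.

```python
from collections import Counter


def is_homozygous(genotype: tuple[str, str]) -> bool:
    """Homozygous = two copies of the same allele at the locus."""
    return genotype[0] == genotype[1]


def allele_frequencies(genotypes: list[tuple[str, str]]) -> dict[str, float]:
    """Proportion of each allele among all chromosomes sampled."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


# Hypothetical sample of four diploid individuals at one locus.
sample = [("A", "A"), ("A", "a"), ("a", "a"), ("A", "a")]

print([is_homozygous(g) for g in sample])  # [True, False, True, False]
print(allele_frequencies(sample))          # {'A': 0.5, 'a': 0.5}
```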
How to Find A Gene
• Candidate genes:
– You already have good reason to believe it is implicated, e.g. pharmacological evidence: dopamine transporter & receptor genes in ADHD
• Functional genes:
– Candidate based on what it is known to do, e.g. expression patterns in relevant tissue.
BUT ~15,000 genes expressed in the brain
Positional Cloning
• The identification of a gene based solely on its position in the genome
• Most widespread strategy in human genetics in the last 15 years
• Strengths:
– No knowledge of gene product required
– Very strong track record in single-gene disorders
• Weaknesses:
– Understanding of function not a certain outcome
– Poor track record with multifactorial traits
Sequencing of Human Genome Facilitates Positional Cloning
Collins, F.S. Positional cloning moves from perditional to traditional, Nat Genet, 9:347-350, 1995
Positional Cloning
Mendel’s Laws: I. Segregation
There are two elements of heredity governing a trait in each individual, and these segregate (separate) during reproduction.
[Figure: segregation of ‘+’ and ‘-’ alleles; panels: Dominant, Recessive]
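The law of segregation can be sketched as a Punnett-square enumeration: each parent passes on one of its two alleles with equal probability. The code below is an illustrative sketch (the function name is an assumption), using ‘+’ and ‘-’ for the two alleles.

```python
from collections import Counter
from itertools import product


def offspring_genotypes(parent1: str, parent2: str) -> dict[str, float]:
    """Expected offspring genotype proportions for a single-locus cross.

    Each parent is a two-character string of alleles, e.g. "+-".
    """
    crosses = Counter(
        "".join(sorted(pair)) for pair in product(parent1, parent2)
    )
    total = sum(crosses.values())
    return {genotype: n / total for genotype, n in crosses.items()}


# Cross between two heterozygotes.
print(offspring_genotypes("+-", "+-"))  # {'++': 0.25, '+-': 0.5, '--': 0.25}
```

If ‘+’ is dominant, three quarters of the offspring of this cross show the ‘+’ phenotype; if it is recessive, one quarter do.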
Mendelian Disorders
• Measured phenotype caused by a single gene
– May have multiple mutations in gene
– May be additional (environmental) causes
• Follow clear segregation in families
• Typically rare in population
• Examples
– Duchenne Muscular Dystrophy
– Cystic Fibrosis (1989)
– Huntington’s Disease (1993)
– ~1200 have been mapped
Pedigree Analysis
• Genetic disorders, e.g. PKU caused by a recessive allele, have characteristic patterns of inheritance within families.
• above: autosomal dominant • below: autosomal recessive
Mendel’s Laws: II. Independent Assortment
• Traits are inherited independently of each other.
NB. This law is violated for traits governed by genes close by on the same chromosome. Alleles of these ‘linked’ loci will tend to co-segregate during recombination.
Linkage
• Only ~1 recombination per chromosome
→ Loci that are close together on the same chromosome tend to be inherited together (‘linked’ or ‘in LD’)
• The closer the loci, the more linkage
→ Degree of linkage is a measure of genetic distance
• Linkage is measured by the recombination fraction, θ = proportion of recombinants
θ = 0: complete linkage
θ = 0.5: no linkage
Recombinants & Nonrecombinants
• Grandchildren in generation III who received either A1B1 or A2B2 from their father are the product of nonrecombinant sperm; persons who received A1B2 or A2B1 are recombinant.
Estimated recombination fraction = 2 / 7 ≈ 0.29
• We cannot classify any of the individuals in generations I and II as recombinant or nonrecombinant, or identify recombinants arising from oogenesis in individual II2.
Paternal alleles (where it can be worked out)
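The counting in this example can be sketched as follows. The list of paternal haplotypes is hypothetical, chosen only so that 2 of 7 are recombinant as in the pedigree; the function name is also an assumption.

```python
def recombination_fraction(
    haplotypes: list[str],
    parental: tuple[str, str] = ("A1B1", "A2B2"),
) -> float:
    """Proportion of transmitted haplotypes that differ from the parental ones."""
    recombinants = sum(h not in parental for h in haplotypes)
    return recombinants / len(haplotypes)


# Hypothetical paternal haplotypes of seven grandchildren (2 recombinants).
paternal = ["A1B1", "A2B2", "A1B2", "A1B1", "A2B1", "A2B2", "A1B1"]

theta_hat = recombination_fraction(paternal)
print(round(theta_hat, 2))  # 0.29
```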
Markers
• A polymorphic ‘marker’ locus can be informative about a disease locus over 10^6 base pairs away
• Originally, phenotypic markers used in place of genotype e.g. blood groups and APOe4 in Alzheimer’s Disease
• Sequencing of genome → many markers
• The vast majority of markers have no effect on phenotype.
Genetic Linkage
Trait co-segregates with marker allele within families
Requirements:
(i) Many families with trait of interest
(ii) Informative markers
Linkage Analysis
• We do not usually have this much information to work out recombinants / nonrecombinants.
• If inheritance (e.g. dominant / recessive) is known, the likelihood of linkage can be calculated:
LOD = log10[ P(data | θ = 0) / P(data | θ = 0.5) ]
where θ = 0 corresponds to complete linkage and θ = 0.5 to no linkage.
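A LOD score compares the likelihood of the family data at a candidate recombination fraction with the likelihood under no linkage (θ = 0.5). The sketch below assumes the simplest textbook case, in which every meiosis can be scored as recombinant or nonrecombinant, so the likelihood is binomial; real linkage programs handle missing phase and genotype information, which this does not.

```python
import math


def lod_score(recombinants: int, nonrecombinants: int, theta: float) -> float:
    """log10 likelihood ratio of linkage at `theta` versus theta = 0.5.

    Assumes fully informative, phase-known meioses (a simplification).
    """
    n = recombinants + nonrecombinants
    likelihood = theta ** recombinants * (1.0 - theta) ** nonrecombinants
    if likelihood == 0.0:
        # e.g. theta = 0 is impossible once any recombinant is observed
        return float("-inf")
    return math.log10(likelihood / 0.5 ** n)


# Ten nonrecombinant meioses, evaluated at complete linkage (theta = 0).
print(round(lod_score(0, 10, 0.0), 2))  # 3.01

# Two recombinants out of seven, evaluated at theta = 2/7.
print(lod_score(2, 5, 2 / 7))
```

Under these assumptions, about ten fully informative nonrecombinant meioses are needed to reach a LOD of 3, which shows why small families carry little linkage information on their own.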
Single Gene Linkage Analysis
Nonparametric Linkage Analysis
• In practice, complex inheritance is the norm, and nonparametric linkage analysis, which does not require the genetic model to be specified, is most commonly used.
• A design employing affected sib pairs allows model-free analysis in nuclear families using programs like MAPMAKER/SIBS or GENEHUNTER.
• LOD > 3.3 generally accepted as threshold for genome-wide significance.
Netherton Syndrome Linkage
Chavanas et al., Am J Hum Genet, 66:914-921, 2000
Netherton Syndrome Haplotypes
Netherton Syndrome Gene
Chavanas et al. 2000, Nature Genetics
Linkage: Success Stories
• Linkage analysis has been successfully used to map many single gene disorders, e.g. early-onset Alzheimer’s Disease, many forms of mental retardation
Linkage: Problems
• For complex traits, there have been many unreplicated findings
“True linkage is hard to find”
Multifactorial (‘Complex’) Traits
• No clear segregation pattern in families
• Caused by > 1 gene
• Possibly triggered / moderated by environment
• Each gene (environment) may have small effect
• Epistasis or intragenic interactions likely
• Pleiotropy, environmental influences, gene x environment interactions likely
• Epigenetic influences possible
• Measurement of phenotype not highly reliable
• Heterogeneity
Why such limited success with Complex Trait Linkage studies?
• Power
– Power calculations have always indicated need for many 100’s, probably thousands of families to detect genes of even moderate effect
– N ~ 200 for most studies conducted to date
– For QTL, this is about enough to detect a locus explaining 25% of the total variance in the trait
• Hope for ‘low-hanging’ fruit
– If there are one or a few monogenic-like loci within oligogenic spectrum, could lead to pathway information
– Not supported by data.
• Practical problems: errors in data
A ‘Link’ in the Chain
• Linkage analysis can do no more than point to broad regions – ‘linkage hotspots’ – at best ~20cM, ~200 genes
• More powerful methods must be used to ‘home in’ on the crucial gene.
The Next Link
(Allelic) Association
• Why?
Markers remain in LD with the ‘founding’ mutation over many generations
Trait correlates with marker allele in population
Association = same ancestral origin
Generation 1: a disease-causing mutation occurs on a chromosome
Generation 2: about 50% of the children receive the mutation and the surrounding chromosomal segment from the mutated founder
Generation 3: segments originating from the mutated founder chromosome get shorter
…
Generation N: very short segments around the mutated locus are conserved
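The shrinking of the conserved segment can be illustrated with the standard population-genetic result that, under random mating, linkage disequilibrium D between two loci decays by a factor (1 − θ) each generation. This formula is not stated on the slides and is introduced here as an assumption; the starting value of D is arbitrary.

```python
def ld_after(d0: float, theta: float, generations: int) -> float:
    """Expected LD after some generations: D_t = D_0 * (1 - theta) ** t.

    Assumes random mating and ignores drift, selection and new mutation.
    """
    return d0 * (1.0 - theta) ** generations


d0 = 0.25  # arbitrary starting disequilibrium
for theta in (0.5, 0.01, 0.0001):
    print(theta, ld_after(d0, theta, 100))
```

For unlinked loci (θ = 0.5) the association is essentially gone within a few generations, whereas for very tightly linked markers it is almost unchanged after 100 generations, which is why only markers very close to the mutation remain informative in the population.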
Linkage: Allelic association within families
Allelic Association: Extension of linkage to the population
• For both families, the same marker is ‘linked’ with the trait, but a different allele is implicated
Allelic Association: Extension of linkage to the population
Trait is ‘linked’ with the same marker in all families:
Allele 6 is ‘associated’ with trait.
Allelic Association
Allele 6 is ‘associated’ with disease
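One common way to test whether an allele is associated with disease is a chi-square test on a 2×2 table of allele counts in cases and controls. The sketch below uses the standard shortcut formula for a 2×2 table; the counts are invented for illustration and do not come from the slides.

```python
def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square (1 df, no continuity correction) for the table

                 allele 6   other alleles
    cases            a            b
    controls         c            d
    """
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Hypothetical allele counts: 200 case chromosomes, 200 control chromosomes.
chi2 = chi_square_2x2(60, 140, 40, 160)
print(round(chi2, 2))  # 5.33
```

A value above 3.84 corresponds to p < 0.05 for a single test with one degree of freedom; a genome-wide study would need a far stricter threshold.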
Allelic Association: Three Common Forms
• Direct Association
– Mutant or ‘susceptible’ polymorphism
– Allele of interest is itself involved in phenotype
• Indirect Association
– Allele itself is not involved, but a nearby correlated gene changes phenotype
• Spurious association
– Apparent association not related to genetic aetiology
– Including: natural selection, statistical artifact, and population stratification (see later)
Indirect & Direct Allelic Association
Direct Association
Measure trait relevance (*) directly, ignoring correlated markers nearby
Indirect Association & LD
Assess trait effects on D via correlated markers (Mi) rather than susceptibility/etiologic variants.
Linkage Disequilibrium: correlation between (any) markers in population
Allelic Association: correlation between marker allele and trait
Population Stratification
• Recent admixture of populations
• Requirements:
– Group differences in allele frequency
– Group differences in outcome
• Leads to spurious association
• In epidemiology, this is a classic matching problem, with genetics as a confounding variable
Most oft-cited reason for lack of association replication
Population Stratification
Association induced by sample mixing
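A small numerical sketch shows how sample mixing alone can induce an association. The two subpopulations below are hypothetical: within each, the allele is equally common in cases and controls (odds ratio 1), but they differ in both allele frequency and the proportion of cases.

```python
def odds_ratio(case_with: int, case_without: int,
               ctrl_with: int, ctrl_without: int) -> float:
    """Odds ratio for carrying the allele, cases versus controls."""
    return (case_with * ctrl_without) / (case_without * ctrl_with)


# Hypothetical allele counts: (case_with, case_without, ctrl_with, ctrl_without)
pop_a = (160, 40, 80, 20)   # allele frequency 0.8, many cases
pop_b = (20, 80, 40, 160)   # allele frequency 0.2, few cases
pooled = tuple(x + y for x, y in zip(pop_a, pop_b))

print(odds_ratio(*pop_a))   # 1.0
print(odds_ratio(*pop_b))   # 1.0
print(odds_ratio(*pooled))  # 2.25
```

The pooled odds ratio of 2.25 arises purely because the population with the higher allele frequency also contributes more cases, which is the confounding described above.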
Population Stratification: Solutions
• Because of fear of stratification, complex trait genetics turned away from case/control studies
1. Family-based controls (e.g. TDT)
2. ‘Genetic control’: extra genotyping
• Look for evidence of background population substructure and account for it.
Linkage v. Association
Linkage | Association
Requires families | Families or unrelateds
Matching/ethnicity generally unimportant | Matching/ethnicity important
Few markers for genome coverage (300-400 STRs) | Many markers for genome coverage (10^5 – 10^6 SNPs)
Weak design (allele-sharing based on covariances) | Powerful design (based on mean differences)
Yields coarse location | Yields fine-scale location
Good for initial detection, poor for fine-mapping | Good for fine-mapping, poor for initial detection
Powerful for rare variants | Powerful for common variants, rare variants generally impossible
Association Study Outcomes
Reported p-values from association studies in
Am J Med Genet or Psychiatric Genet, 1997
Terwilliger & Weiss, Curr Opin Biotech, 9:578-594, 1998
Why limited success with association studies?
1. Small sample sizes → results overinterpreted
2. Phenotypes are complex. Candidate genes difficult to choose
3. Allelic/genotypic contributions are complex. Even true associations difficult to see.
4. Background patterns of LD are unknown. Difficult to appreciate signal when can’t assess noise.
5. Spurious results due to population stratification
Heterogeneity
Effects of Linkage Disequilibrium
Roses, Nature 2000
Alzheimer’s Disease
• Common Disease of old age:
– Main cause of dementia in the elderly
– 4th leading cause of death
– Prevalence increases with age; much earlier onset in rare cases
• Progressive loss of memory, cognitive deterioration, and emotional disturbance
• Loss of neurons with many amyloid-containing plaques, neurofibrillary tangles
Genetic Epidemiology
• Early-onset disease is sometimes Mendelian and autosomal dominant.
• Standard lod score analysis in dominant early-onset families allowed mapping and subsequently cloning of three genes.
• Multicase late-onset families showed evidence of linkage to chromosome 19 when analyzed by the affected pedigree member method.
Apolipoprotein E
• 3 alleles: E2 (8%), E3 (77%), E4 (15%).
• Risk relative to E3/E3 at age 65+
– E3/E4: ~3
– E4/E4: ~14
• Accounts for ~20% of susceptibility
• APOe risk associated with age of onset, clinical manifestations of AD, selective effect on episodic memory
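To get a feel for how common the higher-risk genotypes are, the allele frequencies quoted above can be converted into expected genotype frequencies. This sketch assumes Hardy-Weinberg equilibrium (random mating, no selection), which the slides do not state; the function name is illustrative.

```python
from itertools import combinations_with_replacement

# Allele frequencies as quoted on the slide.
APOE_FREQS = {"E2": 0.08, "E3": 0.77, "E4": 0.15}


def hardy_weinberg(freqs: dict[str, float]) -> dict[str, float]:
    """Expected genotype frequencies: p^2 for homozygotes, 2pq for heterozygotes."""
    genotypes = {}
    for a, b in combinations_with_replacement(sorted(freqs), 2):
        p = freqs[a] * freqs[b]
        genotypes[f"{a}/{b}"] = p if a == b else 2 * p
    return genotypes


for genotype, frequency in hardy_weinberg(APOE_FREQS).items():
    print(genotype, round(frequency, 4))
```

Under this assumption roughly 23% of people would be E3/E4 and about 2% E4/E4, so the genotype with the largest relative risk is also the rarest of the two.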
Investigation of APOe Risk
• Mechanism currently not known
• Possible ethnic differences
• Genetic risk interacts with head injury, education, possibly nutrition (anti-oxidants?)
• Clinical trials of folic acid, statins, NSAIDs as protective factors.
AD & APOe
Poster child for behavioural genetics?
Or cautionary tale?
Further Reading
Plomin R, DeFries JC, McClearn GE & McGuffin P. (2001). Behavioral Genetics (4th ed.). Worth.
Strachan T & Read AP (1999). Human Molecular Genetics. Bios. (look online)
Lahiri DK, Sambamurti K & Bennett DA. Apolipoprotein gene and its interaction with the environmentally driven risk factors: molecular, genetic and epidemiological studies of Alzheimer’s disease. Neurobiology of Aging 25:651–660.