Page 1: Issues affecting the calibration of Affymetrix GeneChips harry@essex.ac.uk

Issues affecting the calibration of Affymetrix GeneChips

harry@essex.ac.uk

Page 2: Issues affecting the calibration of Affymetrix GeneChips harry@essex.ac.uk

Overview of GeneChip technology and how gene expression is measured.

How the data is calibrated

Biophysics of GeneChip technology

Biological issues such as splicing

Potential problems with parts of the data

Page 3: Issues affecting the calibration of Affymetrix GeneChips harry@essex.ac.uk

Many laboratories have an almost identical set-up for running GeneChips.

Page 4: Issues affecting the calibration of Affymetrix GeneChips harry@essex.ac.uk

Probe cells of an Affymetrix GeneChip contain millions of identical 25-mers

25-mer

Page 5: Issues affecting the calibration of Affymetrix GeneChips harry@essex.ac.uk

Fragmentation of RNA to mean length of ~100 bases.

Hybridization

Labelling with a fluorescent marker (on the Us).

Remove partial hybrids by washing in a solution with a reduced salt content (phosphate backbones of nucleic acids have negative charge).

Detect fluorescence

Page 6: Issues affecting the calibration of Affymetrix GeneChips harry@essex.ac.uk

Reverse transcription has a chance of stopping at every base. This results in a population of aRNA copies which are biased towards the 3’ end.
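
The 3’ bias can be illustrated with a small sketch. It assumes, purely for illustration, a constant per-base stopping probability `p_stop` (a hypothetical parameter, not a measured value) and synthesis that starts at the 3’ end of the transcript.

```python
def fraction_reaching(distance_from_3prime: int, p_stop: float) -> float:
    """Fraction of copies that extend at least this many bases from the 3' end.

    Assumes the same, independent chance of stopping at every base.
    """
    if not 0.0 <= p_stop <= 1.0:
        raise ValueError("p_stop must be a probability")
    return (1.0 - p_stop) ** distance_from_3prime


# Coverage falls off geometrically with distance from the 3' end.
for d in (0, 100, 500, 1000):
    print(d, round(fraction_reaching(d, p_stop=0.001), 3))
```

Under this assumption, probes that target regions far from the 3’ end see fewer copies than probes close to it.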

Page 7: Issues affecting the calibration of Affymetrix GeneChips harry@essex.ac.uk

perfect match probe cells

mismatch probe cells

Reference sequence (5’ to 3’): GTGGGAATTGGGTCAGAAGGACTGTGGCTAGG

Perfect match probe: GGAATTGGGTCAGAAGGACTGTGGC

Mismatch probe: GGAATTGGGTCACAAGGACTGTGGC

Probe-pairs scattered on chip

Affymetrix microarrays
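
The two 25-mers on this slide differ only at base 13, where the mismatch carries the complementary base. A minimal sketch of that relationship (the function name `mismatch_probe` is hypothetical):

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def mismatch_probe(pm: str, position: int = 13) -> str:
    """Return the MM probe: the PM 25-mer with the base at `position` (1-based) complemented."""
    if len(pm) != 25:
        raise ValueError("expected a 25-mer")
    i = position - 1
    return pm[:i] + COMPLEMENT[pm[i]] + pm[i + 1:]


pm = "GGAATTGGGTCAGAAGGACTGTGGC"  # perfect match probe from the slide
print(mismatch_probe(pm))
```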

Page 8: Issues affecting the calibration of Affymetrix GeneChips harry@essex.ac.uk

Probes can map to different transcripts. Such multiple targeting should be identified (Okoniewski and Miller BMC Bioinformatics 2006)

Some probes no longer map to the most up-to-date transcript information for a gene. These probes will not detect the same expression as the rest of the probe-set to which they belong.

There are a number of alternative Chip Definition Files, which group probes into probe-sets.

Page 9: Issues affecting the calibration of Affymetrix GeneChips harry@essex.ac.uk

Affymetrix probe set

Perfect Match (PM)

Mismatch (MM)

Probe cell (aka feature)

Probe pair

Page 10: Issues affecting the calibration of Affymetrix GeneChips harry@essex.ac.uk

Affymetrix software derives the intensity for each probe from the 75% quantile of the pixel values in each box.
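
As a rough illustration of taking the 75% quantile of a cell's pixel values, the sketch below uses Python's standard library. The interpolation rule (`method="inclusive"`) and the choice of which pixels enter the calculation are assumptions; the slide does not specify them.

```python
import statistics


def cell_intensity(pixels):
    """75% quantile (upper quartile) of the pixel values in one probe cell."""
    # quantiles(..., n=4) returns the three quartile cut points; index 2 is the upper one.
    return statistics.quantiles(pixels, n=4, method="inclusive")[2]


print(cell_intensity([1, 2, 3, 4, 5, 6, 7, 8, 9]))
```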

Page 11: Issues affecting the calibration of Affymetrix GeneChips harry@essex.ac.uk

Chip calibration

High-level analysis, biological interpretation

Correct Background, Normalise, Correct for Cross Hybridisation, Expression Measure

Page 12: Issues affecting the calibration of Affymetrix GeneChips harry@essex.ac.uk

Background Fluorescence needs to be corrected

e.g. MAS and RMA algorithms

Page 13: Issues affecting the calibration of Affymetrix GeneChips harry@essex.ac.uk

Chips need to be normalised against each other.

e.g. invariant genes, lowess, quantiles

Each chip is a different colour

Page 14: Issues affecting the calibration of Affymetrix GeneChips harry@essex.ac.uk

RMA uses Quantile normalisation at the probe level

Raw intensities, by probe:

         PA    PB    PC    PD    PE
Chip 1    1     2     4     3     5
Chip 2    7     2     5     3     1
Chip 3    5     3     4     2     9

Order by ranks (each chip sorted from lowest to highest):

Chip 1    1     2     3     4     5
Chip 2    1     2     3     5     7
Chip 3    2     3     4     5     9

Average the intensities at each rank:

Chip 1  1.33  2.33  3.33  4.67     7
Chip 2  1.33  2.33  3.33  4.67     7
Chip 3  1.33  2.33  3.33  4.67     7

Reorder by probe:

         PA    PB    PC    PD    PE
Chip 1  1.33  2.33  4.67  3.33     7
Chip 2     7  2.33  4.67  3.33  1.33
Chip 3  4.67  2.33  3.33  1.33     7
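
The worked example on this slide can be reproduced with a short sketch. It is not the RMA implementation itself, only the three steps shown here (sort, average each rank, put values back in probe order). Tied values within a chip are simply given distinct ranks in input order, which is a simplification.

```python
def quantile_normalise(chips):
    """Quantile-normalise a list of chips (each a list of probe intensities).

    Assumes every chip lists the same probes in the same order.
    """
    n_probes = len(chips[0])
    sorted_chips = [sorted(chip) for chip in chips]
    # Mean intensity at each rank across chips.
    rank_means = [
        sum(chip[r] for chip in sorted_chips) / len(chips)
        for r in range(n_probes)
    ]
    normalised = []
    for chip in chips:
        # order[r] is the index of the probe that holds rank r on this chip.
        order = sorted(range(n_probes), key=lambda i: chip[i])
        out = [0.0] * n_probes
        for rank, probe_index in enumerate(order):
            out[probe_index] = rank_means[rank]
        normalised.append(out)
    return normalised


chips = [[1, 2, 4, 3, 5], [7, 2, 5, 3, 1], [5, 3, 4, 2, 9]]  # probes PA..PE
for row in quantile_normalise(chips):
    print([round(v, 2) for v in row])
```

After normalisation every chip has exactly the same distribution of intensities; only the assignment of values to probes differs.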

Page 15: Issues affecting the calibration of Affymetrix GeneChips harry@essex.ac.uk

The intensities of the multiple probes within a probeset are combined into ONE measure of expression

Expression Measure

Page 16: Issues affecting the calibration of Affymetrix GeneChips harry@essex.ac.uk

Cross Hybridisation

MAS 5.0 (Affymetrix) corrects for cross-hybridisation by subtracting the MisMatch signal from the Perfect-Match.

RMA ignores the mismatch probes because they also hybridise to the target that the perfect-match probes are designed to detect.

Page 17: Issues affecting the calibration of Affymetrix GeneChips harry@essex.ac.uk

MAS 5.0 (Signal) takes the Tukey biweight mean of the log of the difference between PM and MM.
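
A minimal sketch of a one-step Tukey biweight mean applied to log2(PM − MM). The tuning constants `c` and `epsilon` are assumed defaults, and the sketch requires PM > MM for every probe pair; the production algorithm handles the other case with an adjusted mismatch value, which is not reproduced here.

```python
import math
import statistics


def tukey_biweight(values, c=5.0, epsilon=1e-4):
    """One-step Tukey biweight mean: a robust average that down-weights outliers."""
    m = statistics.median(values)
    mad = statistics.median([abs(v - m) for v in values])
    weights = []
    for v in values:
        u = (v - m) / (c * mad + epsilon)
        # Values far from the median (|u| >= 1) get zero weight.
        weights.append((1 - u * u) ** 2 if abs(u) < 1 else 0.0)
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


def signal(pm, mm):
    """Sketch of a MAS 5.0-style signal; assumes PM > MM for every pair."""
    log_diffs = [math.log2(p - m) for p, m in zip(pm, mm)]
    return 2 ** tukey_biweight(log_diffs)


print(tukey_biweight([1, 2, 3, 4, 100]))  # the outlier 100 receives zero weight
```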

Page 18: Issues affecting the calibration of Affymetrix GeneChips harry@essex.ac.uk

dChip and RMA ‘model’ the systematic hybridisation patterns when calibrating an expression measure.

1-9 are different chips.

Page 19: Issues affecting the calibration of Affymetrix GeneChips harry@essex.ac.uk

Spike-in measurements show there remains considerable signal at low concentrations.

Page 20: Issues affecting the calibration of Affymetrix GeneChips harry@essex.ac.uk

Cross Hybridisation

MAS 5.0 (Affymetrix) corrects for cross-hybridisation by subtracting the MisMatch signal from the Perfect-Match.

RMA ignores the mismatch probes because they also hybridise to the target that the perfect-match probes are designed to detect.

How can you measure cross-hybridisation without using the MisMatch signal?

There is a need for a model of the physics of hybridisation (Naef and Magnasco 2003)

Page 21: Issues affecting the calibration of Affymetrix GeneChips harry@essex.ac.uk

G

A

T

C

Purines (G & A) are large

Pyrimidines (C & T) are small

There will be no steric hindrance between the pyrimidine in the mismatch and the pyrimidine in the mRNA of interest.

There will be a large steric hindrance between the purine in the mismatch and the purine in the mRNA of interest.

e.g. perfect match #13 = A, so mismatch #13 is T, and the complementary base in mRNA is also T/U

Size is important
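
The size argument on this slide can be written as a small classifier. Because the mismatch base is the complement of the perfect-match base, it is the same base (T/U aside) as the one it faces in the target, so both are purines or both are pyrimidines. The function name is hypothetical.

```python
PURINES = {"A", "G"}
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def mismatch_clash(pm_probe: str, position: int = 13) -> str:
    """Classify the base-versus-base clash at the mismatch position of the MM probe."""
    pm_base = pm_probe[position - 1]
    mm_base = COMPLEMENT[pm_base]
    # The target base pairs with the PM base, so it is also the PM base's
    # complement (U instead of T in RNA): same size class as the MM base.
    return "purine-purine" if mm_base in PURINES else "pyrimidine-pyrimidine"


# Slide example: PM base 13 is A, so MM base 13 is T and faces a T/U in the target.
print(mismatch_clash("C" * 12 + "A" + "C" * 12))
```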

Page 22: Issues affecting the calibration of Affymetrix GeneChips harry@essex.ac.uk

Naef and Magnasco (2003)

Page 23: Issues affecting the calibration of Affymetrix GeneChips harry@essex.ac.uk

A-T base pairs have two hydrogen bonds.

G-C base pairs have three hydrogen bonds.

GC content is important
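
A short sketch of why GC content matters: counting hydrogen bonds for a fully paired probe. It only counts bonds (two per A-T or A-U pair, three per G-C pair) and is not a model of binding strength.

```python
def hydrogen_bonds(seq: str) -> int:
    """Count Watson-Crick hydrogen bonds if every base in `seq` were paired."""
    bonds = {"A": 2, "T": 2, "U": 2, "G": 3, "C": 3}
    return sum(bonds[base] for base in seq.upper())


def gc_content(seq: str) -> float:
    """Fraction of bases that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


probe = "GGAATTGGGTCAGAAGGACTGTGGC"  # perfect match probe from an earlier slide
print(gc_content(probe), hydrogen_bonds(probe))
```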

Page 24: Issues affecting the calibration of Affymetrix GeneChips harry@essex.ac.uk

Probes are ssDNA

Target is RNA

Page 25: Issues affecting the calibration of Affymetrix GeneChips harry@essex.ac.uk

Biotin labelling interferes with the hybridisation

C & T (pyrimidines) are labelled. So GC* binds less strongly than CG, and AT* binding is weaker than TA.

If the probe contains no C & T, it will hybridise well but with no fluorescence. If you have all C & T, it will have difficulty hybridising.

Page 26: Issues affecting the calibration of Affymetrix GeneChips harry@essex.ac.uk

Van der Waals interactions between adjacent bases

H-bond interactions between adjacent bases

Nearest-neighbour interactions predict duplex stability, so sequence order is important (SantaLucia).

The binding energy of GAC is not the same as CAG
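
The dependence on sequence order can be sketched by summing nearest-neighbour stacking terms. The table below is the unified DNA/DNA parameter set at 37 °C (SantaLucia 1998), used here only as an illustration: GeneChip duplexes are DNA/RNA on a surface, so the actual values differ, and initiation and end corrections are omitted.

```python
# Nearest-neighbour free energies (kcal/mol), keyed by the 5'->3' dinucleotide
# on one strand. DNA/DNA values in solution; illustrative only for GeneChips.
NN_DG = {
    "AA": -1.00, "TT": -1.00,
    "AT": -0.88,
    "TA": -0.58,
    "CA": -1.45, "TG": -1.45,
    "GT": -1.44, "AC": -1.44,
    "CT": -1.28, "AG": -1.28,
    "GA": -1.30, "TC": -1.30,
    "CG": -2.17,
    "GC": -2.24,
    "GG": -1.84, "CC": -1.84,
}


def stacking_energy(seq: str) -> float:
    """Sum of nearest-neighbour stacking terms along `seq` (no end corrections)."""
    return sum(NN_DG[seq[i:i + 2]] for i in range(len(seq) - 1))


# Same base composition, different order, different energy.
print(stacking_energy("GAC"), stacking_energy("CAG"))
```

With these particular parameters the difference between GAC and CAG is small; other orderings of the same bases differ by more.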

Page 27: Issues affecting the calibration of Affymetrix GeneChips harry@essex.ac.uk

Wu and Irizarry (2004) have written GCRMA (which is available in Bioconductor).

Many closely related sequences will hybridise to a given probe. Wu and Irizarry describe the variation in hybridisation of these similar sequences with a statistical model.

GCRMA determines the contribution to the PM from Signal and from Non-Specific Hybridisation

Page 28: Issues affecting the calibration of Affymetrix GeneChips harry@essex.ac.uk

GCRMA produces a good linear relationship between intensity and concentration

Page 29: Issues affecting the calibration of Affymetrix GeneChips harry@essex.ac.uk

Once chips have gone through the DAT → CEL → Expression Measure process, changes in gene expression between conditions or over time can be observed.

M = log2(fold change), A = mean log2 intensity

The change in expression between two conditions for all the genes on an array can be viewed on a MA plot
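
A minimal sketch of the two MA-plot coordinates for one gene, assuming `x` and `y` are its expression measures (on the raw, unlogged scale) in the two conditions:

```python
import math


def ma_values(x: float, y: float):
    """Return (M, A) for intensities x and y of one gene in two conditions."""
    m = math.log2(x) - math.log2(y)          # log2 fold change
    a = 0.5 * (math.log2(x) + math.log2(y))  # mean log2 intensity
    return m, a


print(ma_values(8, 2))
```

Plotting M against A for every gene on the array gives the MA plot; genes that do not change lie along M = 0.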

Page 30: Issues affecting the calibration of Affymetrix GeneChips harry@essex.ac.uk

Some genes are represented by multiple probe-sets.

Probe-set A Probe-set B

If they are measuring the same gene the signals should be up and down regulated together!

Is that always true? No

Stalteri and Harrison, 2007, BMC Bioinformatics, 8:13

Page 31: Issues affecting the calibration of Affymetrix GeneChips harry@essex.ac.uk
Page 32: Issues affecting the calibration of Affymetrix GeneChips harry@essex.ac.uk

Alternative Splicing results in different permutations of exons from the same gene. Sometimes introns are also included. Each of these permutations results in a different protein.

Alternative Splicing affects > 50% of our genes. It is the most obvious way in which 25,000 genes in the Human Genome can produce ~100,000 proteins.

Page 33: Issues affecting the calibration of Affymetrix GeneChips harry@essex.ac.uk

Probes map to different exons. Because of alternative splicing, some of the exons may be upregulated whereas others may be downregulated.

Page 34: Issues affecting the calibration of Affymetrix GeneChips harry@essex.ac.uk

Tian et al. (2000)

Alternative Polyadenylation

Single poly(A) site

Alternative poly(A) sites in the 3’-most exon

Alternative poly(A) sites in different exons

Page 35: Issues affecting the calibration of Affymetrix GeneChips harry@essex.ac.uk

Probes map to different sides of a polyadenylation signal. Because of alternative polyadenylation, some of the probes may be upregulated whereas others may be downregulated.

Page 36: Issues affecting the calibration of Affymetrix GeneChips harry@essex.ac.uk

Number of contiguous Gs    Mean Correlation
3                          0.14
4                          0.42
5                          0.49
6                          0.62
7                          0.75

Comparing probes with runs of Gs.

We are only looking at a small fraction of the entire probe, yet it is dominating the effects across all experiments.
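
Probes can be grouped by their longest run of contiguous Gs with a helper like the one below (the function name is hypothetical); this is the quantity in the first column of the table.

```python
def longest_g_run(probe: str) -> int:
    """Length of the longest run of contiguous G bases in a probe sequence."""
    best = current = 0
    for base in probe.upper():
        current = current + 1 if base == "G" else 0
        best = max(best, current)
    return best


print(longest_g_run("GGAATTGGGTCAGAAGGACTGTGGC"))
```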

Page 37: Issues affecting the calibration of Affymetrix GeneChips harry@essex.ac.uk

A tetrad of Guanines can bind to each other through Hoogsteen Hydrogen bonds with the help of a central cation. G-quadruplexes are prevalent in telomeres (single stranded DNA at the end of chromosomes). G-quadruplexes are thermally stable.

G-quadruplexes take a range of topologies.

Page 38: Issues affecting the calibration of Affymetrix GeneChips harry@essex.ac.uk

Adjacent probes within a cell on a GeneChip have the same sequence – a run of Guanines will result in closely packed DNA with just the right properties to form quadruplexes.

Upton et al. 2008 BMC Genomics, 9, 613

Page 39: Issues affecting the calibration of Affymetrix GeneChips harry@essex.ac.uk

Parallel G-quadruplexes have a left-handed helical twist.

We suggest 4 probes can efficiently form a “Maypole”. Outside the corset of the “G-spot”, the probes have little affinity for bases of the same sequence and the phosphate backbones will repel each other. Inside the G-spot the bases are on the inside and cannot bind target.

GGGG

Page 40: Issues affecting the calibration of Affymetrix GeneChips harry@essex.ac.uk

Probe + Target ⇌ Duplex

Hybridization: forward rate constant k_f

Dissociation: reverse rate constant k_r

K = k_f / k_r

ΔG = −RT ln K

All spontaneous physical and chemical changes take place in the direction of a decrease in free energy, ΔG < 0.

R is the gas constant, and T is temperature.
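
The relation between the equilibrium constant and the free energy change can be checked numerically. The temperature used in the example is an assumed value chosen for illustration, not one given on the slide.

```python
import math

R = 8.314  # gas constant, J/(mol K)


def delta_g(k_eq: float, temperature_k: float) -> float:
    """Free energy change in J/mol for equilibrium constant K = k_f / k_r."""
    return -R * temperature_k * math.log(k_eq)


# K > 1 (hybridization favoured) gives a negative free energy change.
for k_eq in (0.1, 1.0, 10.0):
    print(k_eq, delta_g(k_eq, temperature_k=318.15))
```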

Page 41: Issues affecting the calibration of Affymetrix GeneChips harry@essex.ac.uk

Phosphates on chains of nucleic acids have a negative charge.

There is a Coulomb blockage of hybridization on microarrays (Vainrub and Pettitt 2002). The environment created by probe-probe interactions modifies the hybridization of RNA.

Hagan and Chakraborty 2004, Journal of Chemical Physics

The strength of binding depends upon probe density

K = k_f / k_r

ΔG = −RT ln K

Page 42: Issues affecting the calibration of Affymetrix GeneChips harry@essex.ac.uk

GGGG

GGGG

GGGG

ΔG = −RT ln K

Probes that are not bound in G-quadruplexes will have a reduced probe density in the immediate environment of the runs of Guanines. This will result in very effective nucleation, and binding, with respect to hybridization to the rest of the probe.

K = k_f / k_r

The binding will efficiently occur in the G-spot. Any RNA molecule with a run of Cs will hybridize. Thus, there will be enhanced correlations between all the probes that are able to form G-quadruplexes.

Page 43: Issues affecting the calibration of Affymetrix GeneChips harry@essex.ac.uk
Page 44: Issues affecting the calibration of Affymetrix GeneChips harry@essex.ac.uk

Kerkhoven et al. 2008, PLoS ONE 3(4): e1980

Probes containing GCCTCCC will hybridize to the primer spacer sequence that is attached to all aRNA prior to hybridization.
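
Flagging affected probes is a simple substring check. The sketch below assumes probe sequences are available as plain strings; the function name is hypothetical.

```python
SPACER_MOTIF = "GCCTCCC"  # motif named on the slide


def binds_primer_spacer(probe: str) -> bool:
    """True if the probe sequence contains the primer spacer motif."""
    return SPACER_MOTIF in probe.upper()


print(binds_primer_spacer("AAGCCTCCCTTTGACGATCGATCGA"))
```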

Page 45: Issues affecting the calibration of Affymetrix GeneChips harry@essex.ac.uk

GeneChips require calibration in order to reliably combine the information from multiple probes.

The biophysics of GeneChip technology has been studied in great detail, but there are still things to learn. Algorithms such as GCRMA attempt to provide a physics-based transformation between light and RNA concentration.

Biological issues such as splicing

Potential problems with particular sequences

Page 46: Issues affecting the calibration of Affymetrix GeneChips harry@essex.ac.uk

Bioinformatix, Genomix, Mathematix, Physix, Statistix, Transcriptomix

are needed in order to extract reliable information from Affymetrix GeneChips

Thank you for your attention.

