GeneChips and Microarray Expression Data
David Paoletti
The Problem
• Determine gene expression (activity)
• What proteins are being produced by a group of cells?
The Assumption
• The RNA present in the cell determines what proteins are being produced
• Efficiency
The Why
• Understanding
• Toxicology
• Drug design
– Evaluation
– Specificity
– Response
What is a GeneChip?
• 1.28 x 1.28 cm glass wafer
• 500,000 features
– 24 x 24 µm probe site
– 25-mer oligo, complementary
• PM: perfect match
• MM: mismatch
2.5 M copies
GeneChip
The Solution
The Gains
• Speed
• Possibility
• Sensitivity
• Reproducibility
The Process
• Cells
• Poly-A RNA (AAAA)
• cDNA
• IVT: biotin-labeled antisense cRNA
• Fragment (heat, Mg2+): labeled fragments
• Hybridize
• Wash/stain
• Scan
Hybridization and Staining
• GeneChip + biotin-labeled cRNA: hybridized array
• Hybridized array + SAPE (streptavidin-phycoerythrin)
Specialized Equipment
How Features Are Chosen
• Gene sequence, 5’ to 3’
• Multiple oligo probes (25-mers)
• Perfect match and mismatch
Feature Values
83 112 96 32
47 382 165 87
55 246 140 93
104 552 187 65
Remove outermost rows and columns
Find 75th percentile of remaining values
This value is taken as representative of this feature
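The steps above can be sketched in a few lines of Python, using the 4 x 4 grid from this slide. The function names are hypothetical, and the percentile convention (linear interpolation between neighboring ranks) is an assumption, since the slide does not say which convention is used.

```python
import math


def percentile(values, pct):
    """Percentile by linear interpolation between ranks (assumed convention)."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("no values to summarize")
    rank = (len(ordered) - 1) * pct / 100.0
    lo, hi = math.floor(rank), math.ceil(rank)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (rank - lo)


def feature_value(pixels, pct=75):
    """Drop the outermost rows and columns, then take the given percentile."""
    inner = [v for row in pixels[1:-1] for v in row[1:-1]]
    return percentile(inner, pct)


pixels = [
    [83, 112, 96, 32],
    [47, 382, 165, 87],
    [55, 246, 140, 93],
    [104, 552, 187, 65],
]
value = feature_value(pixels)  # inner values are 382, 165, 246, 140
print(value)
```

With this interpolation rule the four inner values give 280.0; a different percentile convention would give a slightly different number.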
Background Noise Removal
• The array is divided into 16 equal sectors
• For each sector
– Find the lowest 2% of the feature intensities
– Average these
– Subtract this average from the intensity value of all features in the sector
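A minimal sketch of this sector-wise subtraction, assuming the array is given as a rectangular list of feature intensities and that the 16 sectors form a 4 x 4 grid. The function name and the rule of always using at least one feature for the "lowest 2%" are assumptions made for illustration.

```python
def subtract_background(grid, sectors_per_side=4, fraction=0.02):
    """Subtract each sector's background (mean of its lowest intensities)."""
    n_rows, n_cols = len(grid), len(grid[0])
    corrected = [list(row) for row in grid]
    for sr in range(sectors_per_side):
        r0 = sr * n_rows // sectors_per_side
        r1 = (sr + 1) * n_rows // sectors_per_side
        for sc in range(sectors_per_side):
            c0 = sc * n_cols // sectors_per_side
            c1 = (sc + 1) * n_cols // sectors_per_side
            cells = [(r, c) for r in range(r0, r1) for c in range(c0, c1)]
            if not cells:
                continue
            lowest = sorted(grid[r][c] for r, c in cells)
            # Assumption: use at least one feature even in tiny sectors.
            k = max(1, int(len(lowest) * fraction))
            background = sum(lowest[:k]) / k
            for r, c in cells:
                corrected[r][c] = grid[r][c] - background
    return corrected


# Toy 8 x 8 array, so each of the 16 sectors holds 2 x 2 features.
grid = [[100 + 8 * r + c for c in range(8)] for r in range(8)]
corrected = subtract_background(grid)
print(corrected[0][:4])
```

On a real array each sector holds tens of thousands of features, so the lowest 2% is a sizeable sample instead of the single minimum used in this toy case.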
Noise Calculation
$$Q_{raw} = \frac{1}{N} \sum_{i \in bg} \frac{stdev_i}{\sqrt{pixel_i}}$$
$$Q = Q_{raw} \cdot SF \cdot NF$$
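As a rough illustration, the noise calculation can be written as below. This reads the formula as an average, over the N background features, of each feature's pixel standard deviation divided by the square root of its pixel count, then scaled by SF and NF. The square root, the function names, and the input layout are assumptions of this sketch.

```python
import math


def raw_noise(background_features):
    """Mean of stdev_i / sqrt(pixel_i) over the background features.

    background_features: list of (stdev, pixel_count) tuples (assumed layout).
    """
    terms = [stdev / math.sqrt(n_pixels) for stdev, n_pixels in background_features]
    return sum(terms) / len(terms)


def noise(background_features, sf=1.0, nf=1.0):
    """Q = Q_raw * SF * NF, with SF and NF supplied by the caller."""
    return raw_noise(background_features) * sf * nf


background = [(8.0, 16), (12.0, 36)]  # made-up example values
q_raw = raw_noise(background)
q = noise(background, sf=1.5, nf=2.0)
print(q_raw, q)
```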
Average Difference Intensity
• For a given gene
– For each probe pair for the given gene
• Calculate the difference PM - MM
– Calculate μ, σ for this set
– If abs((PM - MM) - μ) > 3σ, delete from set
– Remaining set is "pairs in avg"
$$AvgDiff = \frac{1}{\#\,\text{pairs in avg}} \sum_{i \in \text{pairs in avg}} (PM_i - MM_i)$$
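The average difference can be sketched as follows, assuming a single outlier pass that drops any probe pair whose PM - MM difference lies more than three standard deviations from the mean. The function name, the use of the population standard deviation, and the strict "more than" cutoff are assumptions, since the slide does not specify them.

```python
import statistics


def average_difference(pairs, n_sd=3.0):
    """Return (AvgDiff, number of pairs kept) for a list of (PM, MM) tuples."""
    diffs = [pm - mm for pm, mm in pairs]
    mu = statistics.mean(diffs)
    sigma = statistics.pstdev(diffs)  # assumption: population standard deviation
    # Keep pairs within n_sd standard deviations of the mean difference.
    kept = [d for d in diffs if abs(d - mu) <= n_sd * sigma]
    return sum(kept) / len(kept), len(kept)


# Fifteen well-behaved probe pairs plus one extreme pair (made-up values).
pairs = [(300, 200)] * 15 + [(2200, 200)]
avg_diff, n_kept = average_difference(pairs)
print(avg_diff, n_kept)
```

Here the extreme pair is removed and the remaining 15 pairs give an average difference of 100.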
Positive & Negative Probe Pairs
PM - MM ≥ SDT
PM/MM ≥ SRT
If both true, mark as positive
MM - PM ≥ SDT
MM/PM ≥ SRT
If both true, mark as negative
SDT = Q · STDmult
By default, SRT = 1.5, STDmult = 2.0 (low density), 4.0 (high)
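A small sketch of the probe-pair vote, using the default SRT and STDmult values quoted above. The function name is hypothetical, the thresholds are treated as inclusive (an assumption), and the ratio tests are written in multiplied form (PM ≥ SRT · MM) purely to avoid dividing by zero.

```python
def classify_pair(pm, mm, q, std_mult=2.0, srt=1.5):
    """Label one probe pair as 'positive', 'negative', or 'neutral'."""
    sdt = q * std_mult  # SDT = Q * STDmult
    if pm - mm >= sdt and pm >= srt * mm:
        return "positive"
    if mm - pm >= sdt and mm >= srt * pm:
        return "negative"
    return "neutral"


q = 10.0  # made-up noise value, so SDT = 20 with the low-density default
labels = [
    classify_pair(300, 100, q),  # large difference and ratio
    classify_pair(100, 300, q),  # mismatch dominates
    classify_pair(110, 100, q),  # difference below SDT
    classify_pair(130, 100, q),  # ratio below SRT
]
print(labels)
```

Passing `std_mult=4.0` applies the high-density default instead.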
Voting Methods for Absolute Call
• Positive/negative ratio
PNR = #pos / #neg
• Positive fraction
PF = #pos / #used
• Log average ratio
$$LA = 10 \cdot \frac{1}{\#\,\text{pairs in avg}} \sum_{\text{pairs in avg}} \log(PM/MM)$$
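The three voting metrics can be computed together, as in this sketch. It assumes that each probe pair already carries a positive, negative, or neutral label, that "#used" means the number of pairs kept for averaging, and that the logarithm is base 10; the function name and the handling of a zero negative count are also assumptions.

```python
import math


def voting_metrics(pairs, labels):
    """Return (PNR, PF, LA) for (PM, MM) tuples and their per-pair labels."""
    n_pos = labels.count("positive")
    n_neg = labels.count("negative")
    n_used = len(pairs)
    # Assumption: report an infinite ratio when there are no negative pairs.
    pnr = n_pos / n_neg if n_neg else math.inf
    pf = n_pos / n_used
    la = 10 * sum(math.log10(pm / mm) for pm, mm in pairs) / n_used
    return pnr, pf, la


pairs = [(1000, 100), (1000, 100), (100, 100), (100, 1000)]  # made-up values
labels = ["positive", "positive", "neutral", "negative"]
pnr, pf, la = voting_metrics(pairs, labels)
print(pnr, pf, la)
```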
Decision Matrix
Metric   Absent/Marginal boundary   Marginal/Present boundary
PNR      3.00                       4.00
PF       0.33                       0.43
LA       0.90                       1.30
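One way to read the matrix is as two cutoffs per metric that separate Absent, Marginal, and Present. The sketch below turns each metric into its own vote under that reading. How values exactly on a boundary are treated is an assumption, and the slide does not say how the three votes are combined into the final call, so that step is deliberately left out.

```python
# Cutoffs from the decision matrix: (absent/marginal, marginal/present).
THRESHOLDS = {
    "PNR": (3.00, 4.00),
    "PF": (0.33, 0.43),
    "LA": (0.90, 1.30),
}


def metric_vote(name, value):
    """Map one metric value to 'absent', 'marginal', or 'present'."""
    low, high = THRESHOLDS[name]
    if value < low:
        return "absent"
    if value < high:  # assumption: the upper boundary itself counts as present
        return "marginal"
    return "present"


votes = {
    "PNR": metric_vote("PNR", 2.0),
    "PF": metric_vote("PF", 0.40),
    "LA": metric_vote("LA", 2.5),
}
print(votes)
```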
Average Difference and Absolute Call
• Which of these should decide whether a gene is being expressed?
• Use the absolute call for decision
• Use average difference to compare those which are present
Conclusions
• Incredible amalgam of biological and computational processes
• Allows analyses that would not be performed otherwise
• Already of proven worth
References
• Moore, S K; Making chips to probe genes, IEEE Spectrum, March 2001, 54-60.
• GeneChip Gene Expression Algorithm Training, Part I: Absolute Analysis; Affymetrix.
• Berberich, S, and McGorry, M; GeneChip protocols; Wright State University.