Date posted: 20-Dec-2015
Low-Level Analysis and QC
Regional Biases
Mark Reimers, NCI
Outline
Regional biases on spotted arrays
  Relation to background
  Measures of bias
Affy technical variation measures
  Dynamic range
  RNA degradation
Regional biases on Affymetrix arrays
  Using bias.display and affyPLM for QC
The Quality Issue
Frequent outliers in experiments
Lack of agreement between labs
The hybridization process is complex and cannot be observed directly
Many factors cannot be optimized for all reactions
Statistical QC tools attempt to make visible subtle but pervasive effects
What are Regional Biases?
Regions where all genes give consistently higher reading in one dye than other regions, or the same region on other slides
  Most spots in images are relatively dark
  Region may not appear brighter in one dye or the other
  Biases not obvious by image inspection
Barazsi et al (2003), Qian et al (2003) identified high correlation between nearby probes in Spellman cell-cycle data, other data sets
Workman et al (2002), Colantuoni et al (SNOMAD, 2003) identified regional biases in cDNA arrays by fitting loess surfaces to ratios across each slide
Visualizing Bias by Ratios
Display ratios for each spot at constant brightness: biases are easier to see
Some slides show bias toward one color in some areas
A Common Standard
Expression ratios vary from spot to spot
  Harder to see patterns
Often a series of experiments on a single tissue, use a common reference
Construct average ratios (tissue typical ratios?)
More informative image: spot ratios compared to typical ratio for that spot across all slides
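The comparison to a common standard can be sketched in a few lines. This is a minimal illustration, not the code behind the slides: the function name `log_ratios_vs_standard`, the use of a plain mean of log2 ratios as the "typical" ratio, and the toy intensities are all assumptions.

```python
import math

def log_ratios_vs_standard(red, green):
    """red, green: one list of spot intensities per slide, same spot order.

    Returns (standard, relative):
      standard[k]    = mean log2(R/G) of spot k across all slides
      relative[s][k] = log2(R/G) of spot k on slide s minus standard[k]
    """
    log_ratio = [[math.log2(r / g) for r, g in zip(rs, gs)]
                 for rs, gs in zip(red, green)]
    n_slides = len(log_ratio)
    n_spots = len(log_ratio[0])
    standard = [sum(slide[k] for slide in log_ratio) / n_slides
                for k in range(n_spots)]
    relative = [[slide[k] - standard[k] for k in range(n_spots)]
                for slide in log_ratio]
    return standard, relative

# Toy data (invented): two slides, two spots each.
red = [[400.0, 100.0], [800.0, 100.0]]
green = [[100.0, 100.0], [100.0, 100.0]]
standard, relative = log_ratios_vs_standard(red, green)
```

Values in `relative` near zero mean a spot behaves like its typical ratio; a cluster of same-sign values in one area of a slide is the kind of pattern the ratio-of-ratios images make visible.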
Common Reference Highlights Difference
Red/Green ratios show variation
Ratios of ratios on slide to ratios on standard show less variation
Visualizing Bias using Standard
Ratio of ratios shows much clearer concentration of red spots on some slides
Note non-random but highly irregular concentration of red
Bias and Background
We observe that local background contributes to bias
Does subtracting background remove bias?
Local off-spot background may not be the best estimate of spot background (non-specific hyb)
Spots BG subtracted
Bias and Background (2)
Raw spot ratios show a mild bias relative to average
After subtracting a high green bg in the center a red bias results
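A small worked example shows how subtracting background can create a bias rather than remove one. The intensities are invented for illustration, and `log2_ratio` is a hypothetical helper, not a function from any package mentioned here.

```python
import math

def log2_ratio(red, green, bg_red=0.0, bg_green=0.0):
    """log2 of the red/green ratio after subtracting per-channel background."""
    return math.log2((red - bg_red) / (green - bg_green))

# A spot with equal raw signal in both channels: no bias before subtraction.
raw = log2_ratio(1000.0, 1000.0)

# Same spot in a region where the local green background estimate is high.
# If that estimate overstates the true non-specific signal on the spot,
# subtraction pushes the ratio toward red.
corrected = log2_ratio(1000.0, 1000.0, bg_red=100.0, bg_green=400.0)

print(raw, corrected)
```

Here the raw log2 ratio is 0, while the background-subtracted ratio is log2(900/600), a red bias of more than half a log2 unit that comes entirely from the background estimate.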
Other Bias Patterns
This spotted oligo array shows strong biases at the beginning and end of each print-tip group
The background shows a milder version of this effect
Subtracting background removes some regional biases while adding bias in other regions
Panels: Processed, Raw Spot, Background
How to Measure Regional Biases?
Correlation between neighboring probes
  $r = \mathrm{Cor}\big(r_{i,j},\ (r_{i-1,j} + r_{i+1,j} + r_{i,j-1} + r_{i,j+1})/4\big)$,
  where $r_{i,j}$ is log ratio relative to standard at row $i$, column $j$
Red-green ratios: r ~ 0.05-0.1
Ratio to average: r ~ 0.1-0.3
For some slides r > 0.5
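The neighbor correlation above can be computed directly on a grid of log ratios. The sketch below is one possible reading of the formula: the names `pearson` and `neighbor_correlation` are made up, and skipping edge spots (which lack four neighbors) is an assumption, since the slides do not say how edges are handled.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def neighbor_correlation(grid):
    """grid[i][j]: log ratio relative to standard at row i, column j.

    Correlates each interior spot with the mean of its 4 nearest neighbors.
    """
    centers, neighbor_means = [], []
    for i in range(1, len(grid) - 1):
        for j in range(1, len(grid[0]) - 1):
            centers.append(grid[i][j])
            neighbor_means.append((grid[i - 1][j] + grid[i + 1][j]
                                   + grid[i][j - 1] + grid[i][j + 1]) / 4)
    return pearson(centers, neighbor_means)

# Two extreme toy grids: a smooth gradient and a checkerboard.
gradient = [[float(i + j) for j in range(5)] for i in range(5)]
checker = [[(-1.0) ** (i + j) for j in range(5)] for i in range(5)]
r_gradient = neighbor_correlation(gradient)
r_checker = neighbor_correlation(checker)
```

A smooth regional trend drives the statistic toward 1, while spot-to-spot alternation drives it toward -1; spatially unstructured log ratios would sit near 0.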
Regional Bias Affects Analysis
A major source of false positives for single slides
  In some slides half the apparently most up-regulated genes come from 10% of slide area
In replicated experimental samples, regional bias results in increased variance - false negatives
In clinical samples, regional bias results in serious distortion of exploratory procedures such as clustering
Visualizing Other QC Measures
A heat plot of signal/SD ratios shows clearly that some slides and regions are better than others
One persistently bad region in a batch was printed poorly
S/N ratio
Low S/N implies less reliable ratios
Prospects for Normalization
Try to fit smooth (loess) surface to ratios to estimate bias.
  Workman (2002) finds modest (20%) improvements in replicates’ variance
  Colantuoni (2003) finds moderate improvements
  Qian et al (2004) find that SNOMAD does not remove a majority of correlation between neighboring probes
Prospects for Normalization (2)
Are ratios described well by smooth gradient?
  Irregular regions are common
  Short-range effects
Poor prospects for normalization by smoothing
Regional Bias on Affy Chips
Current Quality Measures
RNA quality
  Gel or BioAnalyzer
Affymetrix Microarray Suite:
  3’/5’ ratios
    Process of reverse transcription
  Scaling factor
    Labeling efficiency (and total RNA)
  Per cent present calls
  PM/MM ratios
    Specificity of hybridization
    Varies with stringency of wash solution
Types of Problems Undetected
Local Artifacts - scratches, smudges
Regional Bias - large regions shifted
Hybridization differences causing differences in dynamic range
Small differences in RNA degradation
Three Variables
RNA Quality
  RNA degrades rapidly in intact samples
  cRNA production may be variable
Hybridization conditions
  Temperature, salinity
Defects or uneven conditions on chip
  Bubbles spend more time in some places
  Leading to regional biases
RNA Degradation Plot
MAS5.0 displays 5’/3’ ratios for selected genes
Degradation plot displays relative signal at each position from 5’ to 3’ end of probe sequence
AffyRNAdeg function in affy package of Bioconductor
Home-crafted plotting functionHome-crafted plotting function
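The idea behind a degradation plot can be sketched as follows. This is not a reimplementation of the affy function or of the home-crafted plotting function: the names `degradation_profile` and `slope`, the use of mean log2 intensity per position, and the toy intensities are assumptions made for illustration.

```python
import math

def degradation_profile(probe_sets):
    """probe_sets: lists of PM intensities, each ordered 5' -> 3', equal length.

    Returns the mean log2 intensity at each probe position across probe sets.
    """
    n_positions = len(probe_sets[0])
    return [sum(math.log2(ps[k]) for ps in probe_sets) / len(probe_sets)
            for k in range(n_positions)]

def slope(values):
    """Least-squares slope of values against position 0, 1, 2, ..."""
    n = len(values)
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    num = sum((x - mean_x) * (v - mean_y) for x, v in enumerate(values))
    den = sum((x - mean_x) ** 2 for x in range(n))
    return num / den

# Toy probe sets (invented): signal doubles at each step toward the 3' end.
probe_sets = [[100.0, 200.0, 400.0],
              [200.0, 400.0, 800.0]]
profile = degradation_profile(probe_sets)
profile_slope = slope(profile)
```

A flat profile suggests even signal along the transcript; a steep rise toward the 3' end is the pattern usually read as degradation, and comparing slopes across chips is more informative than any single value.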
Amplified RNA Deg. Plot
Doubly amplified cRNA
  Fairly even
  No great discrepancies
Hybridization Conditions
Variation in thermodynamics of hybridization affects
  Background
  Ratios of PM to MM
  Specificity of hybridization
  Distribution of signals from probes
Each of these can be investigated
Visualizing Probe Distribution
Either as signal distribution (log scale works best) or as ratios
Ratios:
  Construct reference standard: average each probe over all chips (20% trimmed mean)
  Log scale works best
  Subtract log standard from log probe signals
Effects of Distribution Changes
MDS Plot of Chips
Distribution of Probe Ratios
90122, 90123, 90124, 97444 are replicates
Local Artifacts, Regional Bias
Workman et al (2003) identified artifacts by displaying raw data image on log2 scale
Not many scars visible - are the chips that good?
Running means of (log2) intensity show little bias
Dynamic range - neighboring probes vary 10X to 100X
  No obvious reference
Need to compensate for large dynamic range
Visualizing Artifacts by Ratio
Construct a standard (virtual) chip:
  Trimmed (20%) mean of each probe across all chips
  Roughly estimates ‘typical’ level
  Robust: genes highly expressed in few samples don’t affect
Compute ratio of each probe on any chip to corresponding probe on standard chip
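The standard-chip construction can be sketched as below. The function names are hypothetical, and two details are assumptions: 20% is trimmed from each end (the convention of R's `mean(x, trim = 0.2)`), and the trimmed mean is taken on the log2 scale.

```python
import math

def trimmed_mean(values, trim=0.2):
    """Mean after dropping the lowest and highest `trim` fraction of values."""
    ordered = sorted(values)
    k = int(len(ordered) * trim)          # number dropped at each end
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)

def standard_chip(chips, trim=0.2):
    """chips: one list of probe intensities per chip, same probe order.

    Returns the per-probe trimmed mean of log2 intensity across chips.
    """
    n_probes = len(chips[0])
    return [trimmed_mean([math.log2(chip[p]) for chip in chips], trim)
            for p in range(n_probes)]

def log_ratios_to_standard(chip, standard):
    """log2 of each probe on one chip relative to the standard chip."""
    return [math.log2(x) - s for x, s in zip(chip, standard)]

# Toy data (invented): probe 0 is highly expressed in one sample only.
chips = [[100.0, 200.0],
         [100.0, 200.0],
         [100.0, 200.0],
         [100.0, 200.0],
         [6400.0, 200.0]]
standard = standard_chip(chips)
ratios_last_chip = log_ratios_to_standard(chips[4], standard)
```

Because the single high value is trimmed away, the standard for probe 0 stays at the typical level, and the unusual sample shows up as a log2 ratio of 6 rather than pulling the reference toward itself.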
Visualizing Artifacts, Bias
Image of raw data on a log2 scale shows striations but no obvious artifacts
Image of ratios of probes to standard shows a smudge
Non-coding probes
Background and Scale
For each region: fit regression lines to probes on this chip vs corresponding probes on standard
Intercept and slope may be interpreted as local minimum intensity (background) and sensitivity (scale factor)
Slope ~ 1.4
y=x
Background ~ +10
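A per-region fit of this kind can be sketched with ordinary least squares. The slides do not say which regression method or region size was used, so plain least squares, square blocks, and the names `fit_line` and `regional_fits` are all assumptions; the toy values are chosen to reproduce a background of +10 and a slope of 1.4 in one block.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def regional_fits(chip, standard, block):
    """chip, standard: 2-D grids of probe intensities with the same layout.

    Splits the grid into block x block regions and returns
    {(block_row, block_col): (intercept, slope)} for each region.
    """
    fits = {}
    n_rows, n_cols = len(chip), len(chip[0])
    for r0 in range(0, n_rows, block):
        for c0 in range(0, n_cols, block):
            cells = [(i, j)
                     for i in range(r0, min(r0 + block, n_rows))
                     for j in range(c0, min(c0 + block, n_cols))]
            xs = [standard[i][j] for i, j in cells]
            ys = [chip[i][j] for i, j in cells]
            fits[(r0 // block, c0 // block)] = fit_line(xs, ys)
    return fits

# Toy grids (invented): left block matches the standard (y = x),
# right block follows y = 10 + 1.4 * x.
standard = [[100.0, 200.0, 100.0, 200.0],
            [300.0, 400.0, 300.0, 400.0]]
chip = [[100.0, 200.0, 150.0, 290.0],
        [300.0, 400.0, 430.0, 570.0]]
fits = regional_fits(chip, standard, block=2)
```

Plotting the intercepts and slopes as two images over the chip gives separate views of local background and local scale.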
Visualizing Bias as BG and Scale
A Good Chip
Probe ratio image shows small (<5%) elevated region
Background plot shows this artifact mostly in background
An Acceptable Chip
Less than 10% of chip area affected in both background and scale
A Bad Chip
Half of this chip shows strong biases in background
Quantifying Bias
Compute correlation over the chip between probe log-intensities and the averages of the 4 nearest neighbors
Typical ‘good’ Affy chip has correlation of ratios ~0.2
Some chips have correlations near 0.8
Horizontal correlation > vertical correlation
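The horizontal-versus-vertical comparison can be sketched by correlating each probe with its right-hand neighbor and, separately, with the neighbor below it. How the directional correlations were actually defined is not stated in the slides, so this pairing and the name `directional_correlations` are assumptions.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def directional_correlations(grid):
    """grid[i][j]: log ratio of a probe to the standard chip.

    Returns (horizontal, vertical): correlation of each probe with its
    right-hand neighbor, and with the neighbor in the row below.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    left = [grid[i][j] for i in range(n_rows) for j in range(n_cols - 1)]
    right = [grid[i][j + 1] for i in range(n_rows) for j in range(n_cols - 1)]
    upper = [grid[i][j] for i in range(n_rows - 1) for j in range(n_cols)]
    lower = [grid[i + 1][j] for i in range(n_rows - 1) for j in range(n_cols)]
    return pearson(left, right), pearson(upper, lower)

# Toy grid (invented): each row is constant, rows differ from one another,
# which mimics horizontal striping.
striped = [[v] * 4 for v in [0.0, 3.0, 1.0, 4.0, 2.0]]
horizontal, vertical = directional_correlations(striped)
```

On the striped toy grid the horizontal correlation is 1 and the vertical one is negative, an exaggerated version of the asymmetry noted above.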
Does Bias Affect Measures?
Affymetrix distributes probes - robust?
Experiment: distort a chip in software
  10,000 probes raised 2X
4% of genes distorted > 0.2 in MAS5 (log2 scale)
0.2% show distortions > 0.2 by RMA (log2 scale)
MAS5 RMA
Bias Affects Measures - II
Experiment: 50% of probes raised 2X
Consequences for Analysis
A study with 41 chips founders on quality
Six groups - color coded in plot at right
Several chips seem very atypical for their groups
QC by affyPLM
Robust Multi-chip Analysis (RMA) fits a linear model to each probe set
High residuals in green
High residuals show regional patterns
Mean residuals a global indicator of quality
Available in affyPLM package at www.bioconductor.org
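The residuals used for this kind of QC come from fitting probe and chip effects to each probe set. The sketch below uses Tukey's median polish, the fitting step used by RMA, as a stand-in; it does not reproduce affyPLM's own robust regression, and the function names and toy numbers are assumptions.

```python
from statistics import median

def median_polish(table, n_iter=10):
    """table[p][c]: log2 intensity of probe p on chip c, for one probe set.

    Fits overall + probe effect + chip effect by alternately sweeping out
    row and column medians. Returns (overall, probe_eff, chip_eff, residuals).
    """
    resid = [row[:] for row in table]
    n_rows, n_cols = len(resid), len(resid[0])
    overall = 0.0
    probe_eff = [0.0] * n_rows
    chip_eff = [0.0] * n_cols
    for _ in range(n_iter):
        for i in range(n_rows):                      # sweep row medians
            m = median(resid[i])
            probe_eff[i] += m
            resid[i] = [v - m for v in resid[i]]
        m = median(chip_eff)
        overall += m
        chip_eff = [v - m for v in chip_eff]
        for j in range(n_cols):                      # sweep column medians
            m = median([resid[i][j] for i in range(n_rows)])
            chip_eff[j] += m
            for i in range(n_rows):
                resid[i][j] -= m
        m = median(probe_eff)
        overall += m
        probe_eff = [v - m for v in probe_eff]
    return overall, probe_eff, chip_eff, resid

def mean_abs_residual_per_chip(resid):
    """Average absolute residual for each chip (column)."""
    n_rows, n_cols = len(resid), len(resid[0])
    return [sum(abs(resid[i][j]) for i in range(n_rows)) / n_rows
            for j in range(n_cols)]

# Toy probe set (invented): additive probe and chip effects, plus one
# probe on the third chip raised by 4 log2 units.
table = [[8.0, 9.0, 14.0],
         [9.0, 10.0, 11.0],
         [10.0, 11.0, 12.0]]
overall, probe_eff, chip_eff, resid = median_polish(table)
chip_scores = mean_abs_residual_per_chip(resid)
```

The distorted probe keeps its full residual of 4 while the fitted effects are unchanged; mapping such residuals back to probe positions on the chip is what reveals regional patterns, and the per-chip average gives a single global score.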
Current Affy Pipeline
Construct standard chip
  if few samples, add samples of similar tissues
Compute ratios of probes to standard
Compute correlations of ratios
Examine images
Decide to accept/reject