
Two Color Microarrays

Date post: 24-Feb-2016
Page 1: Two Color Microarrays

Two Color Microarrays

SPH 247 Statistical Analysis of Laboratory Data

Page 2: Two Color Microarrays

Two-Color Arrays

Two-color arrays are designed to account for variability in slides and spots by using two samples on each slide, each labeled with a different dye.

If a spot is too large, for example, both signals will be too big, and taking the difference or ratio of the two signals will eliminate that source of variability.
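A small numeric sketch may make the cancellation concrete. The intensities and the spot-size factors below are made-up values, and the sketch assumes the spot effect multiplies both channels equally, which is the case in which the ratio removes it.

```r
# Hypothetical "true" intensities for five spots in the red and green channels
red   <- c(1200, 300,  80, 5000, 650)
green <- c( 600, 300, 160, 2500, 650)

# Assumed multiplicative spot-size effect, shared by both channels on a spot
spot_effect <- c(1.0, 1.8, 0.6, 2.5, 1.2)

obs_red   <- red   * spot_effect
obs_green <- green * spot_effect

# The log ratio of the observed signals equals the log ratio of the true signals
log_ratio_true <- log2(red / green)
log_ratio_obs  <- log2(obs_red / obs_green)
print(cbind(log_ratio_true, log_ratio_obs))
```

An additive spot effect would instead cancel in the difference of the two signals rather than in the ratio.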

May 14, 2010

Page 3: Two Color Microarrays

Dyes

The most common dye sets are Cy3 (green) and Cy5 (red), which absorb most strongly at approximately 550 nm and 649 nm respectively and fluoresce at slightly longer wavelengths, approximately 570 nm and 670 nm (red light ~ 700 nm, green light ~ 550 nm).

The dyes are excited with lasers at 532 nm (Cy3 green) and 635 nm (Cy5 red).

The emissions are read via filters using a CCD device.


Page 7: Two Color Microarrays

File Format

A slide scanned with Axon GenePix produces a file with extension .gpr that contains the results: http://www.axon.com/gn_GenePix_File_Formats.html

This contains 29 rows of headers followed by 43 columns of data (in our example files).

For full analysis one may also need a .gal file that describes the layout of the arrays.
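One way to read such a file into R is sketched below. It assumes the .gpr file follows the Axon Text File layout, in which the second line gives the number of optional header records and the number of data columns, so the number of lines to skip does not have to be hard-coded. The helper name read_gpr and the tiny demonstration file are hypothetical, not part of the course materials.

```r
# Hypothetical helper: read the data table from an ATF-style .gpr file
read_gpr <- function(path) {
  # Line 2 holds two counts: optional header records and data columns
  counts <- scan(path, what = integer(), skip = 1, nlines = 1, quiet = TRUE)
  n_header <- counts[1]
  # Skip the two leading lines plus the header records;
  # the next line is then used as the column titles
  read.delim(path, skip = 2 + n_header, header = TRUE,
             check.names = FALSE, stringsAsFactors = FALSE)
}

# Made-up miniature file, only to show the call
tmp <- tempfile(fileext = ".gpr")
writeLines(c("ATF\t1.0",
             "2\t3",
             "\"Type=demo\"",
             "\"Scanner=demo\"",
             "\"Block\"\t\"F635 Median\"\t\"F532 Median\"",
             "1\t1200\t600",
             "1\t300\t310"), tmp)
gpr <- read_gpr(tmp)
print(gpr)
```

Setting check.names = FALSE keeps column titles such as "F635 Median" exactly as they appear in the file, so they can be accessed with gpr[["F635 Median"]].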


Page 8: Two Color Microarrays

"Block" "Column" "Row" "Name" "ID" "X" "Y" "Dia." "F635 Median" "F635 Mean" "F635 SD" "B635 Median" "B635 Mean" "B635 SD" "% > B635+1SD" "% > B635+2SD" "F635 % Sat." "F532 Median" "F532 Mean" "F532 SD" "B532 Median" "B532 Mean" "B532 SD" "% > B532+1SD" "% > B532+2SD" "F532 % Sat." "Ratio of Medians (635/532)" "Ratio of Means (635/532)" "Median of Ratios (635/532)" "Mean of Ratios (635/532)" "Ratios SD (635/532)" "Rgn Ratio (635/532)" "Rgn R² (635/532)" "F Pixels" "B Pixels" "Sum of Medians" "Sum of Means" "Log Ratio (635/532)" "F635 Median - B635" "F532 Median - B532" "F635 Mean - B635" "F532 Mean - B532" "Flags"

Page 9: Two Color Microarrays

Analysis Choices

Mean or median foreground intensity

Background corrected or not

Log transform (base 2, e, or 10) or glog transform

Log is compatible only with no background correction, because background-corrected intensities can be zero or negative.

Glog is best with background correction.
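The contrast between the two transforms can be sketched as follows. The glog form used here, log(x + sqrt(x^2 + lambda)), is one common way of writing the generalized log; the value of lambda and the intensities are made-up, and in practice lambda would be estimated from the data.

```r
# One common form of the generalized log; lambda = 100 is an arbitrary choice
glog <- function(x, lambda = 100) log(x + sqrt(x^2 + lambda))

# Hypothetical background-corrected intensities, including a negative and a zero
bc <- c(-20, 0, 15, 400, 12000)

plain_log <- suppressWarnings(log(bc))  # NaN for the negative value, -Inf for zero
g <- glog(bc)                           # finite and increasing for every value

print(data.frame(bc, plain_log, glog = g, log_2x = suppressWarnings(log(2 * bc))))
```

For large intensities glog(x) is close to log(2x), so it behaves like an ordinary log shifted by a constant, while staying defined near and below zero.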


Page 10: Two Color Microarrays

Array normalization

Array normalization is meant to increase the precision of comparisons by adjusting for variations that cover entire arrays.

Without normalization, the analysis would be valid, but possibly less sensitive.

However, a poor normalization method will be worse than none at all.


Page 11: Two Color Microarrays

Possible normalization methods

We can equalize the mean or median intensity across arrays by adding or multiplying a correction term.

We can use different normalizations at different intensity levels (intensity-based normalization), for example by lowess or quantiles.

We can normalize for other things such as print tips.
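As an illustration of the quantile option, here is a minimal base-R sketch of one textbook version of quantile normalization: each array's values are replaced by the average of the values holding the same rank across arrays. The function name quantile_normalize is hypothetical, and this is not necessarily the variant used in the course. The matrix is the 3 x 4 normalization example from these slides.

```r
# Hypothetical helper: force every column (array) to share the same distribution
quantile_normalize <- function(mat) {
  mat <- as.matrix(mat)
  sorted <- apply(mat, 2, sort)                        # sort each array separately
  target <- rowMeans(sorted)                           # mean of the k-th smallest values
  ranks  <- apply(mat, 2, rank, ties.method = "first") # rank of each gene within its array
  out <- apply(ranks, 2, function(r) target[r])        # substitute the target values
  dimnames(out) <- dimnames(mat)
  out
}

normex <- matrix(c(1100, 110, 80, 900, 95, 65, 425, 85, 55, 550, 110, 80), ncol = 4)
qn <- quantile_normalize(normex)
print(qn)
```

After this step every array contains exactly the same set of values, so any group difference in a gene comes only from changes in its rank within the array.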


Page 12: Two Color Microarrays

Example for Normalization

              Group 1             Group 2
          Array 1   Array 2   Array 3   Array 4
Gene 1       1100       900       425       550
Gene 2        110        95        85       110
Gene 3         80        65        55        80

Page 13: Two Color Microarrays

> normex <- matrix(c(1100,110,80,900,95,65,425,85,55,550,110,80),ncol=4)
> normex
     [,1] [,2] [,3] [,4]
[1,] 1100  900  425  550
[2,]  110   95   85  110
[3,]   80   65   55   80
> group <- as.factor(c(1,1,2,2))

> anova(lm(normex[1,] ~ group))
Analysis of Variance Table

Response: normex[1, ]
          Df Sum Sq Mean Sq F value  Pr(>F)
group      1 262656  262656  18.888 0.04908 *
Residuals  2  27812   13906
---
Signif. codes:  0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1

Page 14: Two Color Microarrays

> anova(lm(normex[2,] ~ group))
Analysis of Variance Table

Response: normex[2, ]
          Df Sum Sq Mean Sq F value Pr(>F)
group      1   25.0    25.0  0.1176 0.7643
Residuals  2  425.0   212.5

> anova(lm(normex[3,] ~ group))
Analysis of Variance Table

Response: normex[3, ]
          Df Sum Sq Mean Sq F value Pr(>F)
group      1   25.0    25.0  0.1176 0.7643
Residuals  2  425.0   212.5
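The three gene-by-gene tests can also be run in one pass. The sketch below wraps the same anova(lm(...)) call in a small helper, here given the hypothetical name gene_pvalue, and applies it to every row of the example matrix.

```r
normex <- matrix(c(1100, 110, 80, 900, 95, 65, 425, 85, 55, 550, 110, 80), ncol = 4)
group  <- as.factor(c(1, 1, 2, 2))

# Hypothetical helper: p-value of the one-way ANOVA for a single gene
gene_pvalue <- function(y, group) {
  anova(lm(y ~ group))[["Pr(>F)"]][1]
}

# One p-value per row (gene) of the matrix
pvals <- apply(normex, 1, gene_pvalue, group = group)
print(round(pvals, 4))
```

The same helper can be reused on any normalized version of the matrix to compare how each normalization changes the per-gene results.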

Page 15: Two Color Microarrays

Additive Normalization by Means

              Group 1             Group 2
          Array 1   Array 2   Array 3   Array 4
Gene 1        975       851       541       608
Gene 2        -15        46       201       168
Gene 3        -45        16       171       138

Page 16: Two Color Microarrays

> cmn <- apply(normex,2,mean)
> cmn
[1] 430.0000 353.3333 188.3333 246.6667

> mn <- mean(cmn)
> normex - rbind(cmn,cmn,cmn)+mn
         [,1]   [,2]   [,3]     [,4]
cmn 974.58333 851.25 541.25 607.9167
cmn -15.41667  46.25 201.25 167.9167
cmn -45.41667  16.25 171.25 137.9167
> normex.1 <- normex - rbind(cmn,cmn,cmn)+mn

Page 17: Two Color Microarrays

> anova(lm(normex.1[1,] ~ group))
Analysis of Variance Table

Response: normex.1[1, ]
          Df Sum Sq Mean Sq F value  Pr(>F)
group      1 114469  114469  23.295 0.04035 *
Residuals  2   9828    4914

> anova(lm(normex.1[2,] ~ group))
Analysis of Variance Table

Response: normex.1[2, ]
          Df  Sum Sq Mean Sq F value  Pr(>F)
group      1 28617.4 28617.4  23.295 0.04035 *
Residuals  2  2456.9  1228.5

> anova(lm(normex.1[3,] ~ group))
Analysis of Variance Table

Response: normex.1[3, ]
          Df  Sum Sq Mean Sq F value  Pr(>F)
group      1 28617.4 28617.4  23.295 0.04035 *
Residuals  2  2456.9  1228.5
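The additive normalization by means can also be written with sweep(), which avoids stacking copies of the column means with rbind() and so works for any number of genes. This is an equivalent sketch, not the code from the slides; the matrix is the same example.

```r
normex <- matrix(c(1100, 110, 80, 900, 95, 65, 425, 85, 55, 550, 110, 80), ncol = 4)

cmn <- colMeans(normex)   # mean intensity of each array
mn  <- mean(cmn)          # overall target level

# Subtract each array's excess over the target from every gene on that array
normex.1 <- sweep(normex, 2, cmn - mn, "-")

print(normex.1)
print(colMeans(normex.1))  # every array now has mean mn
```

Because the same amount is added to every gene on an array, small intensities can become negative, as the values for genes 2 and 3 on the first array show.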

Page 18: Two Color Microarrays

Multiplicative Normalization by Means

              Group 1             Group 2
          Array 1   Array 2   Array 3   Array 4
Gene 1        779       776       687       679
Gene 2         78        82       137       136
Gene 3         57        56        89        99

Page 19: Two Color Microarrays

> normex*mn/rbind(cmn,cmn,cmn)
         [,1]      [,2]      [,3]      [,4]
cmn 779.16667 775.82547 687.33407 679.13851
cmn  77.91667  81.89269 137.46681 135.82770
cmn  56.66667  56.03184  88.94912  98.78378
> normex.2 <- normex*mn/rbind(cmn,cmn,cmn)
> anova(lm(normex.2[1,] ~ group))

Response: normex.2[1, ]
          Df Sum Sq Mean Sq F value   Pr(>F)
group      1 8884.9  8884.9  453.71 0.002197 **
Residuals  2   39.2    19.6

> anova(lm(normex.2[2,] ~ group))

Response: normex.2[2, ]
          Df Sum Sq Mean Sq F value   Pr(>F)
group      1 3219.7  3219.7  696.33 0.001433 **
Residuals  2    9.2     4.6

> anova(lm(normex.2[3,] ~ group))

Response: normex.2[3, ]
          Df  Sum Sq Mean Sq F value  Pr(>F)
group      1 1407.54 1407.54  57.969 0.01682 *
Residuals  2   48.56   24.28
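Multiplying each array by a correction factor on the raw scale is the same as subtracting a per-array constant on the log scale, which is why the multiplicative and additive views meet once the data are log transformed. The short check below uses the same example matrix; the choice of log base 2 is only for illustration.

```r
normex <- matrix(c(1100, 110, 80, 900, 95, 65, 425, 85, 55, 550, 110, 80), ncol = 4)

cmn <- colMeans(normex)
mn  <- mean(cmn)

# Multiplicative normalization on the raw scale
normex.2 <- sweep(normex, 2, mn / cmn, "*")

# Additive normalization of the log2 intensities with offset log2(cmn) - log2(mn)
log_additive <- sweep(log2(normex), 2, log2(cmn) - log2(mn), "-")

print(all.equal(log2(normex.2), log_additive))
```

Unlike the additive correction on the raw scale, the multiplicative correction keeps every intensity positive, so a log transform can still be applied afterwards.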

Page 20: Two Color Microarrays

Multiplicative Normalization by Medians

              Group 1             Group 2
          Array 1   Array 2   Array 3   Array 4
Gene 1       1000       947       500       500
Gene 2        100       100       100       100
Gene 3         73        68        65        73

Page 21: Two Color Microarrays

> cmd <- apply(normex,2,median)
> cmd
[1] 110  95  85 110
> md <- mean(cmd)   # target level; equals 100 here, matching the output below
> normex.3 <- normex*md/rbind(cmd,cmd,cmd)
> normex.3
          [,1]      [,2]      [,3]      [,4]
cmd 1000.00000 947.36842 500.00000 500.00000
cmd  100.00000 100.00000 100.00000 100.00000
cmd   72.72727  68.42105  64.70588  72.72727
> anova(lm(normex.3[1,] ~ group))

Response: normex.3[1, ]
          Df Sum Sq Mean Sq F value   Pr(>F)
group      1 224377  224377     324 0.003072 **
Residuals  2   1385     693

> anova(lm(normex.3[2,] ~ group))

Response: normex.3[2, ]
          Df Sum Sq Mean Sq F value Pr(>F)
group      1      0       0
Residuals  2      0       0

> anova(lm(normex.3[3,] ~ group))

Response: normex.3[3, ]
          Df Sum Sq Mean Sq F value Pr(>F)
group      1  3.451   3.451  0.1665 0.7228
Residuals  2 41.443  20.722

Page 22: Two Color Microarrays

Intensity-based normalization

Normalize by means, medians, etc., but do so only in groups of genes with similar expression levels.

lowess is a procedure that produces a running estimate of the middle, like a robustified mean.

If we subtract the lowess fit of each array and add the average of the lowess fits across arrays, we get the lowess normalization.
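To see what "a running estimate of the middle" means, the sketch below applies R's lowess() to simulated intensities that rise with rank and contain two gross outliers. The data, the seed, and the span f = 0.1 are all made-up choices for illustration.

```r
set.seed(247)

# Simulated intensities for 200 genes ordered by rank, with a gentle upward trend
rank_index <- 1:200
intensity  <- 6 + 0.02 * rank_index + rnorm(200, sd = 0.5)

# Two gross outliers that an ordinary running mean would chase
intensity[c(50, 120)] <- c(15, 14)

# lowess returns the fitted curve in $y, one value per input point
fit <- lowess(rank_index, intensity, f = 0.1)

print(cbind(observed = intensity[c(50, 120)], fitted = fit$y[c(50, 120)]))
```

The fitted values at the two outliers stay near the local trend because lowess downweights points with large residuals in its robustness iterations.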


Page 23: Two Color Microarrays

norm <- function(mat1)
{
  mat2 <- as.matrix(mat1)
  p <- dim(mat2)[1]                  # number of genes (rows)
  n <- dim(mat2)[2]                  # number of arrays (columns)
  cmean <- apply(mat2,2,mean)        # mean of each array
  cmean <- cmean - mean(cmean)       # deviation of each array mean from the overall mean
  mnmat <- matrix(rep(cmean,p),byrow=TRUE,ncol=n)
  return(mat2-mnmat)                 # additive normalization by means
}

Page 24: Two Color Microarrays

lnorm <- function(mat1,span=.1)
{
  mat2 <- as.matrix(mat1)
  p <- dim(mat2)[1]
  n <- dim(mat2)[2]
  # order the rows (genes) by their mean intensity across arrays
  rmeans <- apply(mat2,1,mean)
  rranks <- rank(rmeans,ties.method="first")
  matsort <- mat2[order(rranks),]
  r0 <- 1:p
  # lowess fit of one array's sorted intensities against rank
  lcol <- function(x)
  {
    lx <- lowess(r0,x,f=span)$y
  }
  lmeans <- apply(matsort,2,lcol)
  # average of the lowess fits across arrays
  lgrand <- apply(lmeans,1,mean)
  lgrand <- matrix(rep(lgrand,n),byrow=FALSE,ncol=n)
  # subtract each array's fit and add back the average fit
  matnorm0 <- matsort-lmeans+lgrand
  # restore the original row order
  matnorm1 <- matnorm0[rranks,]
  return(matnorm1)
}
