Date post: | 17-Dec-2015 |
Upload: | sharyl-richard |
Design of Experiments and Data Analysis
Let’s Work an Example
• Data obtained from MS Thesis
• Studied the “bioavailability” of metals in sediment cores
• We’ll analyze chromium data
Pt. Mugu Marsh
Analytical Techniques
• Sediment samples were taken with cores
• Sliced into 1 cm slices
• Sediment in each slice was extracted using a strong acid
• Extracts were analyzed using an Inductively Coupled Plasma Mass Spectrometer (ICP-MS)
• Calibrations were also conducted
• Surface areas (SA) and organic carbon (OC) contents of sediment in each slice were also measured
Core processing
[Figure: core processing workflow; labels: 1-cm slices, Organic Carbon, Surface Areas, Tessier Extractions]
Objectives
• To determine if there is a correlation between sediment surface area and organic carbon content
• To determine if there is a relationship between concentration of a specific metal and sediment SA and/or OC
• To determine if there is a relationship between or among metal concentrations
Example of Results
[Figure: depth profiles (depth 0 to 10 cm) of organic carbon (%) and surface area (m2/g) for cores CC01, CC02, CC03, LM01, and LM02]
Example of Results
[Figure: organic carbon content (%) plotted against surface area (m2/g) with a best-fit line; Slope = 0.39, R2 = 0.7]
Data File
• Create a folder entitled “REU” in the C:\My Documents folder
• Create a folder entitled “2006” in this REU folder
• Create a folder entitled “Data Analysis Workshop” in this 2006 folder
• Download Excel File REU_dataanalysis_data.xls from instructional1.calstatela.edu/ckhachi into the Data Analysis Workshop folder
• Open the file
Data File Structure
• There should be 2 worksheets in the workbook:
– Data: raw SA, OC, and metals concentration data
– Calibration Curves: ICP-MS calibration data (relating raw metals concentrations to known calibration concentrations)
• Data for the cores are separated by yellow bands
Data File Structure
• Data Columns include:
– ID: Random sample ID
– Ave Depth: Average depth of each slice
– Solid Mass: Mass of sediments in each slice
– Raw ICP-MS data for each of five metals
• Calibration Columns include:
– Conc: Concentration of standards in parts per billion (ppb)
– ICP-MS responses for the 5 metals
Let’s Start with Calibration Curves
• Most instruments over reasonable ranges have linear responses (i.e., calibration curves are straight lines)
• We need to “model” the data – regression analysis to determine the best-fit line that relates ICP-MS response to concentrations
• We will then use these calibration equations to calculate concentrations for our samples
• Note: because we know that calibrations are usually linear, we will choose a linear regression model. If you don’t know the relationship between 2 variables, it sometimes helps to start with plots
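To make the regression step concrete, here is a minimal sketch of an ordinary least-squares straight-line fit in plain Python. The workshop does this fit in a spreadsheet; the function name `fit_line` and the four standards below are hypothetical illustration values, not the workshop's calibration data.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = m*x + b. Returns (m, b)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of squared deviations in x, and sum of cross-deviations
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    m = sxy / sxx            # slope
    b = y_bar - m * x_bar    # intercept
    return m, b


# Hypothetical standards (ppb) and responses, chosen to lie exactly on
# response = 25 * conc + 100. These are NOT the workshop data.
conc = [0.0, 10.0, 20.0, 30.0]
resp = [100.0, 350.0, 600.0, 850.0]
m, b = fit_line(conc, resp)
print(f"slope = {m:.2f}, intercept = {b:.2f}")
```

With real calibration data the points scatter around the line, and the slope and intercept then carry standard errors of their own.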
Calibration Curve for Cr
• Linear response
• We know slope and intercept
• R2 value provided
• Best-fit line drawn (looks good to me)
• Not enough statistical information provided to be able to conduct proper error analysis
[Figure: Cr calibration curve, ICP-MS response vs. standard concentration (0 to 60), with best-fit line y = 259.07x + 1787.3, R2 = 0.999]
Regression Analysis for Cr
Rename worksheet “Cr Analysis”
Assumptions
• On average, errors are neither consistently positive nor consistently negative.
– Linear Model: yi = m xi + b + ei, where ei is the error associated with each observation
– Line goes through the middle of data
• Variance of error terms is the same across all observations
• Data are independent of each other
• Error terms are normally distributed (not that important)
Residual Plot
[Figure: residuals plot, residual vs. observation (1 to 7); gridlines are spaced one standard error of the estimate apart (161.9482)]
Look at the data and the linear fit carefully: points lie above the line at smaller concentrations. If you delete the last point, you get a very different result.
Regression Statistics
• Multiple R (or just r) is the correlation:
– +1: perfectly positively correlated (as x goes up, so does y)
– 0: not correlated
– -1: perfectly negatively correlated (as x goes up, y goes down)

$$r = \frac{1}{n-1} \sum_{i=1}^{n} \frac{(x_i - \bar{x})(y_i - \bar{y})}{\sigma(x)\,\sigma(y)}$$
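The correlation formula can be sketched in a few lines of Python. This is an illustration only: the function name `pearson_r` is made up here, σ is taken to be the sample standard deviation (n - 1 in the denominator, matching the 1/(n - 1) factor), and the data points are hypothetical.

```python
import math


def pearson_r(x, y):
    """Correlation r = 1/(n-1) * sum((xi - x_bar)(yi - y_bar)) / (s_x * s_y)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sample standard deviations (n - 1 in the denominator)
    s_x = math.sqrt(sum((xi - x_bar) ** 2 for xi in x) / (n - 1))
    s_y = math.sqrt(sum((yi - y_bar) ** 2 for yi in y) / (n - 1))
    cross = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return cross / ((n - 1) * s_x * s_y)


# Hypothetical data illustrating the three limiting cases
r_up = pearson_r([1, 2, 3], [2, 4, 6])      # y rises with x
r_down = pearson_r([1, 2, 3], [6, 4, 2])    # y falls as x rises
r_none = pearson_r([-1, 0, 1], [1, 0, 1])   # no linear trend
print(r_up, r_down, r_none)
```

The third case is a reminder that r measures only linear association: y depends on x there, but not along a line.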
Regression Statistics
• R Square (R2): coefficient of determination
– Between 0 and 1
  • 0: no linear relationship
  • 1: perfect linear relationship (+ or -)
– Square of the r value
– Theoretically, as the number of data points → ∞, R2 → 1 (denominator is fixed)
• Adjusted R Square: fixes this problem and is probably a better measure of how strong the linear relationship is (though R2 is more commonly reported)
• Use 2 or 3 significant figures to report these numbers
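For reference, a common textbook definition of adjusted R2 is 1 - (1 - R2)(n - 1)/(n - p - 1), where n is the number of observations and p the number of predictors. The sketch below applies it to the rounded Cr calibration value R2 = 0.999; n = 7 is an assumption based on the seven observations in the residuals plot (consistent with df = 5), and the function name is hypothetical.

```python
def adjusted_r2(r2, n, p=1):
    """Adjusted R^2 for n observations and p predictors (textbook definition)."""
    if n - p - 1 <= 0:
        raise ValueError("need more observations than fitted parameters")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


# R2 = 0.999 is the rounded value from the Cr calibration plot;
# n = 7 is assumed from the residuals plot, p = 1 for a straight line.
adj = adjusted_r2(0.999, n=7, p=1)
print(f"Adjusted R2 = {adj:.4f}")   # prints: Adjusted R2 = 0.9988
```

Because the input R2 is already rounded to three decimals, the result will differ slightly from the value a spreadsheet reports from unrounded data.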
Regression Statistics
• Standard Error: a measure of the amount of error in the prediction of y for an individual x.
• Observations: # of data points
ANOVA
• ANalysis Of VAriance (sometimes called an F test)
• df: degrees of freedom
• SS: sum of squares
– R2 = 1 - (SSresidual/SStotal)
• MS: mean squares = SS/df
• F = MSregression/MSresidual; the larger F is, the stronger the case for rejecting the null hypothesis (no correlation)
• Not very useful for single treatment
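The ANOVA quantities for a straight-line fit can be computed directly from the data. The following is a sketch under stated assumptions: one predictor (so df for regression is 1 and df for residuals is n - 2), a hypothetical function name `anova_linear`, and four made-up data points that are not from the workshop file.

```python
def anova_linear(x, y):
    """ANOVA table entries for a least-squares line y = m*x + b."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two equal-length sequences with at least 3 points")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    m = sxy / sxx
    b = y_bar - m * x_bar
    # Sums of squares: total, residual (about the line), regression
    ss_total = sum((yi - y_bar) ** 2 for yi in y)
    ss_resid = sum((yi - (m * xi + b)) ** 2 for xi, yi in zip(x, y))
    ss_reg = ss_total - ss_resid
    df_reg, df_resid = 1, n - 2
    ms_reg = ss_reg / df_reg
    ms_resid = ss_resid / df_resid
    return {
        "SS_regression": ss_reg,
        "SS_residual": ss_resid,
        "SS_total": ss_total,
        "R2": 1.0 - ss_resid / ss_total,
        "F": ms_reg / ms_resid,
    }


# Hypothetical data, NOT from the workshop file
table = anova_linear([1, 2, 3, 4], [2, 4, 5, 8])
for name, value in table.items():
    print(f"{name}: {value:.4f}")
```

A perfect fit gives SS_residual = 0 and an undefined F, so real use would need a guard for that case.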
Correlation results
• Linear Calibration: y = mx + b
– Slope (m) = 259.0709
– Intercept (b) = 1787.2679
• Standard Error: used for hypothesis testing and confidence band formation
Correlation results
• Confidence intervals
– Intercept
  • Lower: 1787.2679 – 70.2724(2.571) = 1606.597
  • Upper: 1787.2679 + 70.2724(2.571) = 1967.938
  • 2.571: from a standard two-tailed t-test table with df = 5 and probability = 0.05
– Slope
  • Lower: 259.0709 – 3.6280(2.571) = 249.74
  • Upper: 259.0709 + 3.6280(2.571) = 268.40
• t stat = Coefficient/Standard Error
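The confidence-interval arithmetic can be checked with a short Python sketch. The coefficients, standard errors, and the critical value 2.571 are the ones quoted on the slides; the function name is hypothetical.

```python
def confidence_interval(coef, std_err, t_crit):
    """Return (lower, upper) = coef -/+ t_crit * std_err."""
    half_width = t_crit * std_err
    return coef - half_width, coef + half_width


T_CRIT = 2.571  # two-tailed t, df = 5, probability = 0.05 (from the slides)

slope, slope_se = 259.0709, 3.6280
intercept, intercept_se = 1787.2679, 70.2724

slope_lo, slope_hi = confidence_interval(slope, slope_se, T_CRIT)
int_lo, int_hi = confidence_interval(intercept, intercept_se, T_CRIT)

# t stat = coefficient / standard error
t_slope = slope / slope_se
t_intercept = intercept / intercept_se

print(f"slope: {slope_lo:.2f} to {slope_hi:.2f}, t = {t_slope:.1f}")
print(f"intercept: {int_lo:.2f} to {int_hi:.2f}, t = {t_intercept:.1f}")
```

Neither interval contains zero, which is consistent with the large t statistics (about 71.4 for the slope and 25.4 for the intercept).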
Correlation results
• P-value: probability of wrongly rejecting the null hypothesis (H0), in this case no correlation, if it is in fact true
– p > 0.10: null hypothesis may be OK
– 0.05 < p < 0.10: slight evidence against null hypothesis
– p < 0.05: moderate evidence against null hypothesis
– p < 0.01: strong evidence against null hypothesis
• Consult statistical tables again:
– For df = 5 and t stat = 25.4, p < 0.000005
– For df = 5 and t stat = 71.4, p < 0.0000001
• Very, very strong evidence that H0 is false, i.e., the calibration curves are linear!
Correlation results
• Linear Model:

$$\text{Response} = 259.07\,(\pm 3.63) \times \text{Concentration} + 1787.27\,(\pm 70.27)$$
Using Calibration Equations
• Now we have an equation that relates the response of our equipment to concentrations
• Let’s use this equation to determine concentrations in our samples
Raw Data Excel Sheet
Measurement Errors
• Add 2 columns to the right of the Cr data
• Assume instrument has a 3% error (in reality, you need to run each sample 3 times to get the proper error)
Propagation of Errors
• Let us assume that X is dependent upon the experimental variables p, q, and r, which fluctuate in a random and independent way.
• Addition or Subtraction: X = p + q - r:

$$s_x = \sqrt{s_p^2 + s_q^2 + s_r^2}$$

• Where “s” is the standard deviation or error for each of the variables
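As a quick illustration of the addition/subtraction rule, the absolute errors add in quadrature regardless of the signs in X = p + q - r. The function name and the example errors below are hypothetical.

```python
import math


def propagate_sum(*errors):
    """Absolute error of a sum or difference: sqrt(s_p^2 + s_q^2 + ...)."""
    return math.sqrt(sum(s ** 2 for s in errors))


# Hypothetical errors: s_p = 3 and s_q = 4 combine to 5, not 7,
# because independent random errors partially cancel.
s_x = propagate_sum(3.0, 4.0)
print(s_x)
```

The rule assumes the variables fluctuate randomly and independently, as stated above; correlated errors would need covariance terms.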
Propagation of Errors (cont’d)
• Multiplication or Division: X = p * (q/r)

$$\frac{s_x}{X} = \sqrt{\left(\frac{s_p}{p}\right)^2 + \left(\frac{s_q}{q}\right)^2 + \left(\frac{s_r}{r}\right)^2}$$

• Other equations exist for logs, etc.
• For addition/subtraction, round to the number of decimal places of the component number with the fewest decimal places
• For multiplication/division, round to the number of significant digits of the component number with the fewest significant digits
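For multiplication and division it is the relative errors that add in quadrature. A minimal Python sketch follows; the function name `propagate_product` and the numbers are hypothetical illustration values.

```python
import math


def propagate_product(result, terms):
    """Absolute error of a product/quotient.

    result: the computed value X
    terms:  iterable of (value, std_dev) pairs for each factor or divisor
    """
    relative = math.sqrt(sum((s / v) ** 2 for v, s in terms))
    return abs(result) * relative


# Hypothetical example: X = p * q with p = 10 +/- 1 and q = 20 +/- 2.
# Each factor has a 10% relative error, so X has about a 14% relative error.
p, s_p = 10.0, 1.0
q, s_q = 20.0, 2.0
X = p * q
s_X = propagate_product(X, [(p, s_p), (q, s_q)])
print(f"X = {X:.0f} +/- {s_X:.0f}")
```

The same function covers division, since a divisor contributes its relative error in exactly the same way as a factor.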
Let’s use the Calibration Eqn
• Response: detector output
• Concentration: what we are looking for in the column labeled “Cr Conc (ppb)”

$$\text{Response} = 259.07\,(\pm 3.63) \times \text{Concentration} + 1787.27\,(\pm 70.27)$$
Let’s use the Calibration Eqn
• Let’s look at the first line:

$$8156.35\,(\pm 244.69) = 259.07\,(\pm 3.63) \times \text{Conc} + 1787.27\,(\pm 70.27)$$

• Rearrange to solve for Conc:

$$\text{Conc} = \frac{8156.35\,(\pm 244.69) - 1787.27\,(\pm 70.27)}{259.07\,(\pm 3.63)}$$

• Let’s look at the numerator
Let’s use the Calibration Eqn
• Num = 8156.35 - 1787.27 = 6369.08
• Error in Num:
– Recall for +/-:

$$s_x = \sqrt{s_p^2 + s_q^2 + s_r^2}$$

– Error in Num =

$$\sqrt{244.69^2 + 70.27^2} = 254.58$$

• So now:

$$\text{Conc} = \frac{6369.08\,(\pm 254.58)}{259.07\,(\pm 3.63)}$$
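Let’s use the Calibration Eqn
• Conc =

$$\frac{6369.08}{259.07} = 24.58$$

• Recall, for x/÷:

$$\frac{s_x}{X} = \sqrt{\left(\frac{s_p}{p}\right)^2 + \left(\frac{s_q}{q}\right)^2 + \left(\frac{s_r}{r}\right)^2} \quad\text{or}\quad s_x = X \sqrt{\left(\frac{s_p}{p}\right)^2 + \left(\frac{s_q}{q}\right)^2 + \left(\frac{s_r}{r}\right)^2}$$

• So, ErrConc =

$$24.58 \sqrt{\left(\frac{254.58}{6369.08}\right)^2 + \left(\frac{3.63}{259.07}\right)^2} = 1.04$$

• Final result: Conc = 24.58 ± 1.04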
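The whole calculation for one sample can be sketched in Python, which is handy for checking the spreadsheet columns. The calibration coefficients, the 3% instrument error, and the raw response 8156.35 are taken from the slides; the function and constant names are hypothetical, and the slope and intercept errors are treated as independent, as in the slides.

```python
import math

# Calibration: Response = SLOPE * Conc + INTERCEPT (values from the slides)
SLOPE, SLOPE_ERR = 259.07, 3.63
INTERCEPT, INTERCEPT_ERR = 1787.27, 70.27
INSTRUMENT_REL_ERR = 0.03  # assumed 3% instrument error


def response_to_conc(response):
    """Convert a raw ICP-MS response to (concentration, error) in ppb.

    Assumes response != INTERCEPT, since the relative-error rule
    divides by the numerator.
    """
    response_err = INSTRUMENT_REL_ERR * response
    # Numerator: subtraction, so absolute errors add in quadrature
    num = response - INTERCEPT
    num_err = math.sqrt(response_err ** 2 + INTERCEPT_ERR ** 2)
    # Division: relative errors add in quadrature
    conc = num / SLOPE
    conc_err = abs(conc) * math.sqrt((num_err / num) ** 2
                                     + (SLOPE_ERR / SLOPE) ** 2)
    return conc, conc_err


conc, err = response_to_conc(8156.35)
print(f"{conc:.2f} ± {err:.2f} ppb")   # prints: 24.58 ± 1.04 ppb
```

Applying the same function to every row of the raw Cr column gives the concentration and error columns that feed the depth plot.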
Final Results
• Use error bars in the plots
[Figure: chromium concentration (ppb, 0 to 30) vs. depth (cm, 0 to 16), plotted with error bars]
Plotting Error Bars
• Error bars can be:
– 1-3 standard deviation(s)
– Standard error
– etc…
• Just be clear in your figure caption what your error bar represents
Next Presentation
• A little about design of experiments
• A little more about errors, hypothesis testing, etc…