Statistics: Error (Chpt. 5) - Home : SLUrmccull2/resources/statistics_for_data.pdf · Sample...

Date post: 30-Jan-2018
Upload: haliem
Transcript
Page 1: Statistics: Error (Chpt. 5) - Home : SLUrmccull2/resources/statistics_for_data.pdf · Sample standard deviation (s): more calculator friendly • Use sample std. dev. (s) with data

Statistics: Error (Chpt. 5)

• Always some amount of error in every analysis (How much can you tolerate?)

• We examine error in our measurements to know reliably that a given amount of analyte is in the sample

• To determine the error in the measurement, we run replicate samples: samples of about the same size that are carried through an analysis in exactly the same way

• If a measurement has no error, the replicate samples should yield the same answer. This does not happen

• With replicate data, we usually report the mean or average

• In some instances, we are interested in the median: middle value in a set of data that has been arranged in order of size

• The median is important in data sets with outliers. Outliers can have large effects on the mean, but they will have little effect on the median.

• Example: Consider masses: 3.080, 3.094, 3.107, 3.056, 3.112, 3.174, 3.198. What happens if 3.107 is recorded as 31.07 by accident?
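The example can be checked numerically. The sketch below assumes the mis-recorded value is 3.107 entered as 31.07 (the slide does not say which entry is affected) and uses Python's standard `statistics` module.

```python
from statistics import mean, median

masses = [3.080, 3.094, 3.107, 3.056, 3.112, 3.174, 3.198]

# Assumption: the 3.107 reading is the one mis-recorded as 31.07.
with_typo = [31.07 if m == 3.107 else m for m in masses]

for label, data in (("correct", masses), ("with typo", with_typo)):
    print(f"{label:>9}: mean = {mean(data):.3f}, median = {median(data):.3f}")
```

The mean jumps from about 3.117 to about 7.112, while the median only moves from 3.107 to 3.112.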

Page 2:

Statistics: Error (Chpt. 5)

Precision vs. Accuracy

Precision is the closeness of data to other data that have been obtained in exactly the same way

High precision measurements have small standard deviations, variances, and coefficients of variation. These terms are a function of deviation from the mean value and have no relationship to the true value.

Accuracy is the closeness of a result to its true or accepted value. Accuracy determines how much error is in the method, not how reproducible the method is

Page 3:

Statistics: Error (Chpt. 5)

Error related to Accuracy

Absolute error: difference between the measured value and the true value. It bears a sign: E = xi – xt, where xt is the true or accepted value and xi is the measured value

Relative error: absolute error divided by the true value (aka % error when expressed as a percentage): Er = (xi – xt)/xt × 100%

Example: True value is 20.0 ppm and measured value is 19.8 ppm

Precision is determined by comparing replicate data, but accuracy is not as easy to determine (we usually don’t know the true value)
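A short sketch of the 20.0 ppm example, using E = xi – xt and the relative error as a percentage of the true value. The function names are illustrative only.

```python
def absolute_error(measured, true):
    """Signed absolute error E = x_i - x_t."""
    return measured - true


def relative_error_percent(measured, true):
    """Signed relative error, as a percentage of the true value."""
    return (measured - true) / true * 100.0


E = absolute_error(19.8, 20.0)
Er = relative_error_percent(19.8, 20.0)
print(f"E = {E:+.1f} ppm, Er = {Er:+.1f}%")  # E = -0.2 ppm, Er = -1.0%
```

The negative sign shows that the measurement is low relative to the accepted value.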

Different types of error

Random (or indeterminate) errors: affect the precision of measurement; non-traceable

Systematic (or determinate) errors: affect the accuracy of results; traceable; have an assignable cause; same magnitude for replicate measurements

Gross errors (aka outliers): quite large, don’t occur often, caused by human error (loss of precipitate, etc.)

Page 4:

Statistics: Error (Chpt. 5)

Sources of systematic errors

1. Instrumental errors (fixed by calibration)
- volumetric glassware may differ from listed value
- electrical: increased resistance from dirty contacts or temperature changes

2. Method errors (from non-ideal behavior of reagents used in analysis)
- slow reactivity between analyte and titrant, side reactions, end point vs. equiv. point
- often most difficult to detect
- fixed by doing analysis of standard samples (standard reference material) and/or by performing blank determinations
- also fixed by cross validation with another method

3. Personal errors (fixed by taking care and doing replicates)
- incorrect reading of liquid level in a buret
- error in detecting color change in a titration (esp. if color blind)
- prejudice in numerical readings
- incorrect significant figures

Page 5:

Statistics: Error (Chpt. 5)

Effect of systematic error on results

1. Constant error: the same absolute error is made each time, so the relative error changes with sample size
- absolute error is independent of sample size
- becomes more serious as sample size decreases

Example: 0.5 mg of a precipitate (ppt) is lost as a result of a wash with liquid. Calculate Er if the ppt is 500 mg or 50 mg

2. Proportional error: the absolute error changes, but the relative error remains constant
- absolute error is dependent on sample size
- absolute error changes with sample size

Example: When washing a ppt with a liquid, a proportional error occurs. If Er is 2.5%, calculate E for washing a 50 mg and a 500 mg ppt
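Both examples reduce to one line of arithmetic each. In the sketch below the loss of precipitate is written as a negative error. The proportional case is treated as a magnitude, since the slide gives Er = 2.5% without a sign. The helper names are illustrative.

```python
def relative_error_percent(abs_error_mg, sample_mg):
    """Relative error (%) for a given absolute error and sample mass."""
    return abs_error_mg / sample_mg * 100.0


def absolute_error_mg(rel_error_percent, sample_mg):
    """Absolute error (mg) for a given relative error and sample mass."""
    return rel_error_percent / 100.0 * sample_mg


# Constant error: 0.5 mg is lost regardless of sample size.
for sample in (500.0, 50.0):
    print(f"{sample:5.0f} mg ppt: Er = {relative_error_percent(-0.5, sample):+.1f}%")

# Proportional error: Er stays at 2.5%, so E scales with sample size.
for sample in (50.0, 500.0):
    print(f"{sample:5.0f} mg ppt: E = {absolute_error_mg(2.5, sample):.2f} mg")
```

The constant error grows from -0.1% to -1.0% as the sample shrinks tenfold. The proportional error grows from 1.25 mg to 12.50 mg as the sample grows tenfold.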

Page 6:

Random Errors (Chpt. 6)

Significant Figures

General rule: don’t report what you don’t know

See pages 134-136; you must know these, we won’t cover them
1. Addition/subtraction: do not be more specific than your least specific number
2. Multiplication/division: same general rules apply

Page 7:

Random Errors (Chpt. 6)

All measurements have random error (can only be minimized, not eliminated)

Consider measuring the volume dispensed by a 10-mL volumetric pipet

As N exceeds 30, the distribution of results starts to form a bell-shaped curve

Central limit theorem: distribution of measurements subject to random errors is often a normal distribution (Gaussian distribution)

Page 8:

Random Errors (Chpt. 6)

Properties of a Gaussian Curve

Population (collection of all measurements of interest to an experiment) vs. sample (subset of measurements selected from the population)

Population mean (µ) vs. sample mean (x̄)

Precision = closeness of data to other data that have been obtained in a similar manner, expressed usually by standard deviation

Population std. dev. (σ)

Page 9:

Sample standard deviation (s): more calculator friendly

• Use sample std. dev. (s) with data sets of 30 points or fewer
• Lower value of s indicates better precision
• Scatter from “true” value will decrease as N is increased
• What is n-1? Degrees of freedom: anytime you make an assumption, you lose one degree of freedom; N-1 = # of data that remain independent

Random Errors (Chpt. 6)

Properties of a Gaussian Curve

z-variable: deviation from the mean relative to the standard deviation; describes all populations of data regardless of standard deviation

µ ± 1σ = 68.3%
µ ± 2σ = 95.5%
µ ± 3σ = 99.7%
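The three percentages follow from the Gaussian curve itself: the fraction of a normal population within ±kσ of the mean is erf(k/√2). The sketch below uses the standard-library `math.erf` to reproduce them and adds a small z-variable helper, z = (x − µ)/σ. The function names are illustrative.

```python
import math


def z_score(x, mu, sigma):
    """z-variable: deviation from the mean in units of the standard deviation."""
    return (x - mu) / sigma


def coverage(k):
    """Fraction of a normal population lying within mu +/- k*sigma."""
    return math.erf(k / math.sqrt(2.0))


for k in (1, 2, 3):
    print(f"mu +/- {k} sigma: {coverage(k) * 100:.2f}%")
```

The computed values are about 68.27%, 95.45%, and 99.73%, which match the slide's figures after rounding.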

$$s = \sqrt{\frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n - 1}}$$
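As a sketch, the sample standard deviation (square root of the summed squared deviations from x̄, divided by n − 1) can be computed directly and cross-checked against `statistics.stdev`, which uses the same n − 1 denominator. The masses from the Page 1 example are reused here purely as convenient data.

```python
import math
from statistics import stdev


def sample_std(data):
    """Sample standard deviation with n - 1 degrees of freedom."""
    n = len(data)
    x_bar = sum(data) / n
    squared_devs = sum((x - x_bar) ** 2 for x in data)
    return math.sqrt(squared_devs / (n - 1))


masses = [3.080, 3.094, 3.107, 3.056, 3.112, 3.174, 3.198]
print(f"manual s = {sample_std(masses):.4f}")
print(f"stdev  s = {stdev(masses):.4f}")
```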

Page 10:

Random Errors (Chpt. 6)

Relative standard deviation: $\mathrm{RSD\ (parts\ per\ thousand)} = (s/\bar{x}) \times 1000$

Coefficient of variation: $\mathrm{CV\ (\%\ RSD)} = (s/\bar{x}) \times 100\%$

Standard error of the mean: $s_m = s/\sqrt{N}$
- Shows relationship between mean and std. dev.

Pooled Standard Deviation

spooled is used to pool standard deviations from different measurements, done when increasing # of measurements is not possible (several subsets of data)

When there are 2 sets of data, the formula can be simplified to be calculator friendly (not in the book)
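The slide's own pooled formulas did not survive in this transcript. The block below gives the usual textbook forms as an assumption about what was shown. Here $N_t$ is the number of data subsets, and $s_1$, $s_2$ are the standard deviations of sets with $N_1$ and $N_2$ points.

```latex
% General form: pool the squared deviations of every subset about its own mean
s_{pooled} = \sqrt{\frac{\sum_{i=1}^{N_1}(x_i - \bar{x}_1)^2
                       + \sum_{j=1}^{N_2}(x_j - \bar{x}_2)^2 + \cdots}
                      {N_1 + N_2 + \cdots - N_t}}

% Two-set simplification, using (N - 1)s^2 = \sum (x - \bar{x})^2 for each set
s_{pooled} = \sqrt{\frac{(N_1 - 1)\,s_1^2 + (N_2 - 1)\,s_2^2}{N_1 + N_2 - 2}}
```

The two-set form needs only each set's s and N, which is what makes it calculator friendly.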

Page 11:

Statistical Treatment of Data (Chpt. 7)

Scientists use statistical calculations to judge the quality of experimental measurements

These calculations are based upon means, standard deviations, Gaussian curves and test statistics

Confidence Limits

Confidence limits define an interval around the experimentally determined mean that “probably” contains the population mean (µ)

If population standard deviation is known, the confidence interval is:

$$\mathrm{CI\ for\ } \mu = \bar{x} \pm \frac{z\sigma}{\sqrt{N}}$$

Again, the CI narrows by a factor of $\sqrt{N}$ as N increases

Value for z depends on confidence level in measurement

Example: Determine 80% and 95% confidence interval for experimentally determined glucose level of 1108 mg/L if s = 19 mg/L and s is good estimator of σ (n=7)
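A minimal sketch of the glucose example, assuming the interval has the form x̄ ± zσ/√N and that 1108 mg/L is the mean of the n = 7 measurements, as the slide states. The z values 1.28 (80%) and 1.96 (95%) are the usual two-sided table values.

```python
import math

# Two-sided z values from a standard normal table.
Z_TABLE = {80: 1.28, 95: 1.96}


def z_confidence_interval(x_bar, sigma, n, z):
    """Return (low, high) for x_bar +/- z*sigma/sqrt(n)."""
    half_width = z * sigma / math.sqrt(n)
    return x_bar - half_width, x_bar + half_width


for level, z in Z_TABLE.items():
    low, high = z_confidence_interval(1108.0, 19.0, 7, z)
    print(f"{level}% CI: {low:.1f} to {high:.1f} mg/L")
```

This gives roughly 1108 ± 9.2 mg/L at 80% and 1108 ± 14.1 mg/L at 95%.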

Page 12:

Statistical Treatment of Data (Chpt. 7)

But s is not always a good estimator of σ

Then use the t statistic, which depends on the number of measurements:

$$\mathrm{CI\ for\ } \mu = \bar{x} \pm \frac{ts}{\sqrt{N}}$$

Example: A chemist found the following data for the alcohol content of a sample of blood: 0.084%, 0.089%, and 0.079%. Calculate the 95% confidence interval for the mean.
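A sketch of the blood-alcohol example, assuming the interval x̄ ± ts/√N. The value t = 4.303 is the standard two-sided 95% table entry for N − 1 = 2 degrees of freedom. It is hard-coded here rather than looked up.

```python
import math
from statistics import mean, stdev

T_95_DF2 = 4.303  # two-sided 95% t value for 2 degrees of freedom (table value)

data = [0.084, 0.089, 0.079]  # % alcohol
x_bar = mean(data)
s = stdev(data)  # sample standard deviation (n - 1)
half_width = T_95_DF2 * s / math.sqrt(len(data))

print(f"mean = {x_bar:.3f}%, s = {s:.3f}%")
print(f"95% CI: {x_bar:.3f} +/- {half_width:.3f}%")
```

With only three measurements the interval is wide, about 0.084 ± 0.012%, because t is large at low degrees of freedom.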

Page 13:

Statistical Treatment of Data (Chpt. 7)

Often use t or z statistic to accept or reject data: Hypothesis testing

Null hypothesis: postulates that there is no difference between two observed quantities

Rules for hypothesis testing when true mean is known:
1. Write null hypothesis
2. Depending upon whether σ or s is to be used, look up corresponding test statistic (z or t) for a given confidence level
3. Determine zcal or tcal
4. If calculated value is greater than table value, reject null hypothesis. If calculated value is less than table value, accept null hypothesis

Example: A new procedure for testing sulfur in fuel. Certified standard gives 0.123% S. New test (n=4) gives 0.112, 0.118, 0.115 and 0.119% S. Is there a bias at the 95% confidence level?
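The four steps can be sketched for the sulfur example. This assumes the usual one-sample form t = (x̄ − µ0)/(s/√N), compared by magnitude against the two-sided 95% table value for 3 degrees of freedom (3.182, hard-coded here).

```python
import math
from statistics import mean, stdev

T_CRIT_95_DF3 = 3.182  # two-sided 95% t value for 3 degrees of freedom (table value)

certified = 0.123  # % S, accepted value
results = [0.112, 0.118, 0.115, 0.119]  # % S, new procedure

x_bar = mean(results)
s = stdev(results)
t_calc = (x_bar - certified) / (s / math.sqrt(len(results)))

# Null hypothesis: the new procedure's mean equals the certified value.
reject_null = abs(t_calc) > T_CRIT_95_DF3
print(f"mean = {x_bar:.3f}, s = {s:.4f}, t_calc = {t_calc:.2f}")
print("bias detected" if reject_null else "no bias detected")
```

Here |t_calc| is about 4.43, which exceeds 3.182, so the null hypothesis is rejected and a bias is indicated at the 95% level.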

Page 14:

Statistical Treatment of Data (Chpt. 7)

Often want to compare two different experimentally determined means (N<30)

Use spooled and a different tcal formula but the same t table

Example: Analysis of two barrels of wine for alcohol content. 6 analyses of 1st barrel = 12.61%; 4 analyses of 2nd barrel = 12.53%. For the 10 analyses, spooled = 0.070%. At 95% CL, is there a difference between the 2 wines?

Note: number of degrees of freedom: N1+N2-2
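The slide's tcal formula is missing from the transcript. The sketch below assumes the common pooled form t = |x̄1 − x̄2| / (spooled·√((N1 + N2)/(N1·N2))) and uses the two-sided 95% table value for N1 + N2 − 2 = 8 degrees of freedom (2.306, hard-coded).

```python
import math

T_CRIT_95_DF8 = 2.306  # two-sided 95% t value for 8 degrees of freedom (table value)


def t_two_means(mean1, n1, mean2, n2, s_pooled):
    """t statistic for comparing two means with a pooled standard deviation."""
    return abs(mean1 - mean2) / (s_pooled * math.sqrt((n1 + n2) / (n1 * n2)))


t_calc = t_two_means(12.61, 6, 12.53, 4, 0.070)
different = t_calc > T_CRIT_95_DF8
print(f"t_calc = {t_calc:.2f}, t_table = {T_CRIT_95_DF8}")
print("wines differ" if different else "no significant difference")
```

Since t_calc is about 1.77, below 2.306, the data do not show a difference between the two barrels at the 95% level.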

Page 15:

Statistical Treatment of Data (Chpt. 7)

Comparison of precision: F-test

Similar to t-tests, but this test compares precision of two sets of data

Can be used to test experimental and true standard dev.

$$F_{calc} = \frac{s_1^2}{s_2^2}$$

with the larger variance in the numerator so that Fcalc > 1.0

If Fcalc > Ftable: reject null hypothesis

Example: Standard method for measuring CO has a std. dev. of 0.21 ppm. This is a well established method that has been performed a number of times. Modification of method that was done 13 times leads to std. dev. of 0.15 ppm. Is one method more precise?
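A sketch of the CO example, taking F as the ratio of variances with the larger one on top. The critical value 2.30 is an assumed one-tailed 95% table entry for ∞ degrees of freedom in the numerator (the well-established method) and 12 in the denominator (13 runs). Check it against your own F table before relying on it.

```python
def f_calc(s_a, s_b):
    """Ratio of variances, arranged so that the result is >= 1."""
    var_large, var_small = sorted((s_a ** 2, s_b ** 2), reverse=True)
    return var_large / var_small


# Assumed table value: one-tailed 95%, df = infinity (numerator), 12 (denominator).
F_TABLE = 2.30

F = f_calc(0.21, 0.15)
more_precise = F > F_TABLE
print(f"F_calc = {F:.2f}, F_table = {F_TABLE}")
print("precision differs" if more_precise else "no significant improvement")
```

Fcalc is 1.96, below the assumed table value, so under these assumptions the modification is not shown to be more precise.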

Page 16:

Statistical Treatment of Data (Chpt. 7)

Test for Outliers: Q-test

Is the outlier from a gross error?

For small data sets, it is best to try and collect more data. If not possible, apply the Q-test:

$$Q_{exp} = \frac{|x_q - x_n|}{w}$$

where xq is questionable result, xn is nearest neighbor, and w is spread

If Qexp is greater than Qcrit then reject the questionable result; it is from a gross error

Consider the following data set: 81, 100, 101, 102, 103. Is 81 bad at 99% CL?
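A sketch of the Q-test for this data set, taking Qexp as the gap between the questionable result and its nearest neighbor, divided by the spread. The critical value 0.821 is an assumed 99% table entry for 5 observations, so confirm it against your own Q table. The helper only handles a suspect at either end of the sorted data.

```python
def q_exp(data, suspect):
    """Q statistic for a suspect value at the low or high end of the data."""
    ordered = sorted(data)
    if suspect == ordered[0]:
        nearest = ordered[1]
    elif suspect == ordered[-1]:
        nearest = ordered[-2]
    else:
        raise ValueError("suspect must be the smallest or largest value")
    spread = ordered[-1] - ordered[0]
    return abs(suspect - nearest) / spread


Q_CRIT_99_N5 = 0.821  # assumed table value: 99% CL, 5 observations

q = q_exp([81, 100, 101, 102, 103], 81)
reject = q > Q_CRIT_99_N5
print(f"Q_exp = {q:.3f}, Q_crit = {Q_CRIT_99_N5}")
print("reject 81" if reject else "retain 81")
```

Qexp is 19/22, about 0.864, which exceeds the assumed Qcrit, so 81 would be rejected at the 99% level.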
