Statistics and Curve Fitting
Vipuil Kishore
Lab Lecture 2
Statistics and Curve Fitting
We will learn the following:
Mean
Standard Deviation and Standard Error
Gaussian or Normal Distribution
Confidence Intervals
Curve Fitting using DataFit
Significant Figures
Solving y = ax^b
Mean (Average)
Example:
Set of test scores: X1 = 80; X2 = 100; X3 = 60; X4 = 70; X5 = 90
$\text{Mean} = \bar{X} = \frac{1}{N}\sum_{i=1}^{N} X_i$
N – number of data points
$\bar{X} = \frac{80 + 100 + 60 + 70 + 90}{5} = 80$
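The mean calculation can be checked with a few lines of Python. This is a minimal sketch; the variable names are illustrative only.

```python
# Test scores from the example.
scores = [80, 100, 60, 70, 90]

# Mean: sum of the data points divided by the number of data points N.
n = len(scores)
mean = sum(scores) / n

print(mean)
```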
Standard Deviation (Spread of the data)
Example:
Set of test scores: X1 = 80; X2 = 100; X3 = 60; X4 = 70; X5 = 90
$\text{Standard Deviation} = s = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \bar{X})^2}{N - 1}}$
$s = \sqrt{\frac{(80 - 80)^2 + (100 - 80)^2 + (60 - 80)^2 + (70 - 80)^2 + (90 - 80)^2}{5 - 1}}$
$s = 15.81$
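As a cross-check, the same standard deviation can be computed step by step and compared with Python's built-in sample standard deviation, which also divides by N - 1. A minimal sketch, with illustrative variable names:

```python
import math
import statistics

scores = [80, 100, 60, 70, 90]
n = len(scores)
mean = sum(scores) / n

# Sum of squared deviations from the mean, divided by N - 1, then square root.
sum_sq = sum((x - mean) ** 2 for x in scores)
s = math.sqrt(sum_sq / (n - 1))

# statistics.stdev is the sample standard deviation (N - 1 divisor).
s_check = statistics.stdev(scores)

print(round(s, 2), round(s_check, 2))
```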
Gaussian or Normal Distribution
Example:
Set of test scores: X1 = 80; X2 = 100; X3 = 60; X4 = 70; X5 = 90
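[Figure: two plots against X (values). Actual Distribution: the five scores 60, 70, 80, 90, 100, with 0.2 marked on the vertical axis. Normal Distribution (for large sample size): bell curve labeled with $\bar{X}$ and $s$.]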
Standard Error
Example:
Set of test scores: X1 = 80; X2 = 100; X3 = 60; X4 = 70; X5 = 90
Standard Error: standard deviation of the sample means
$\text{Standard Error} = \frac{s}{\sqrt{N}}$
$\text{Standard Error} = \frac{15.81}{\sqrt{5}} = 7.07$
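The standard error follows directly from s and N. A short sketch using only the Python standard library (the names are illustrative):

```python
import math
import statistics

scores = [80, 100, 60, 70, 90]

# Sample standard deviation divided by the square root of N.
s = statistics.stdev(scores)
standard_error = s / math.sqrt(len(scores))

print(round(standard_error, 2))
```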
Confidence Intervals (CI)
A confidence interval is an estimated range of values within
which the true value is likely to lie
Two assumptions: 1) the sample size is large and 2) all errors are
random
For most experimental cases, 95% CI is acceptable
$X_i = \bar{X} \pm \frac{z\,s}{\sqrt{N}}$
$I = \int_{-z}^{+z} \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{z^2}{2}\right) dz$
z I
0 0
1 0.6826
1.96 0.95
6 0.999999998
[Figure: normal curve centered on $\bar{X}$ with the limits -z and +z marked]
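The area I between -z and +z can be evaluated numerically. The sketch below relies on the standard identity I = erf(z / sqrt(2)) for the normal distribution, which is not stated on the slide; the function name `coverage` is made up for illustration.

```python
import math

def coverage(z):
    """Area under the standard normal curve between -z and +z."""
    return math.erf(z / math.sqrt(2))

# Evaluate I for a few of the z values in the table.
for z in (0, 1, 1.96):
    print(z, round(coverage(z), 4))
```

The printed values should agree with the table to within rounding of the last digit.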
Confidence Intervals (CI)
Example:
Set of test scores: X1 = 80; X2 = 100; X3 = 60; X4 = 70; X5 = 90
$X_i = \bar{X} \pm \frac{z\,s}{\sqrt{N}}$
For 95% CI, $80 \pm \frac{1.96(15.81)}{\sqrt{5}}$
95% CI = 80 ± 13.86
Upper Interval: 80 + 13.86 = 93.86
Lower Interval: 80 − 13.86 = 66.14
Therefore, 95% CI is 66.14 to 93.86
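The interval arithmetic is easy to get wrong by hand, so it is worth recomputing. A minimal sketch that prints the half-width and the lower and upper limits, rounded to two decimals (names are illustrative):

```python
import math
import statistics

scores = [80, 100, 60, 70, 90]
z = 1.96  # z value for a 95% confidence interval

mean = statistics.mean(scores)
# Half-width of the interval: z * s / sqrt(N).
half_width = z * statistics.stdev(scores) / math.sqrt(len(scores))

lower = mean - half_width
upper = mean + half_width

print(round(half_width, 2), round(lower, 2), round(upper, 2))
```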
Curve Fitting
DataFit: datafit6.zip (Install from CD or http://my.fit.edu/~vkishore/CHE3265/Spring%202014/)
Licensed to Florida Tech; License Key: BTOG-MTPI-HWFZ-LFEC
Curve Fitting: Process of finding a mathematical equation
which reproduces the experimental data
Attempts to fit the data with as few parameters as possible
$\text{Residual Sum of Squares} = \sum_{i=1}^{N} (y_{i,exp} - y_{i,calc})^2$
$y_{i,calc} = f(x_i) = a x_i + b$
[Figure: experimental data points and the fitted line, Y versus X]
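To make the residual sum of squares concrete, the sketch below evaluates it for two candidate lines. The data points and the function name are hypothetical, chosen only for illustration; the better-fitting line is the one with the smaller sum.

```python
def residual_sum_of_squares(xs, ys, a, b):
    """Sum of (y_exp - y_calc)^2 with y_calc = a*x + b."""
    return sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))

# Hypothetical experimental points (not from the lecture).
xs = [0, 1, 2, 3]
ys = [1.0, 3.1, 4.9, 7.2]

rss_good = residual_sum_of_squares(xs, ys, a=2, b=1)  # line close to the data
rss_poor = residual_sum_of_squares(xs, ys, a=1, b=2)  # line far from the data

print(rss_good, rss_poor)
```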
Curve Fitting
Example: Fit the stress vs. strain data using DataFit and
calculate the slope and intercept (y = ax + b)
Stress (psi) Strain (in/in)
0 0.00000
2087 0.00015
4174 0.00035
6261 0.00055
8348 0.00075
10435 0.00090
12522 0.00105
14609 0.00130
16696 0.00155
18783 0.00170
20870 0.00190
22957 0.00215
25044 0.00235
27131 0.00260
29218 0.00285
31304 0.00310
33391 0.00335
35478 0.00370
37565 0.00465
38191 0.00560
DataFit Output:
95% Confidence Intervals
Variable   Value          95% (+/-)
a          9753608.341    383448.105
b          1320.759293    772.5472262
Model Plot: [Figure: fitted line plotted through the data]
a is slope and b is intercept
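The slope and intercept can also be reproduced outside DataFit with the closed-form least-squares formulas. This is a sketch under two assumptions that the slides do not state: stress is taken as y and strain as x, and only the first 18 rows (the linear region) are fitted, since that subset appears to match the reported a and b.

```python
# (stress in psi, strain in in/in) pairs copied from the table.
data = [
    (0, 0.00000), (2087, 0.00015), (4174, 0.00035), (6261, 0.00055),
    (8348, 0.00075), (10435, 0.00090), (12522, 0.00105), (14609, 0.00130),
    (16696, 0.00155), (18783, 0.00170), (20870, 0.00190), (22957, 0.00215),
    (25044, 0.00235), (27131, 0.00260), (29218, 0.00285), (31304, 0.00310),
    (33391, 0.00335), (35478, 0.00370), (37565, 0.00465), (38191, 0.00560),
]

# Assumption: drop the last two rows, which fall off the straight line.
linear = data[:18]
y = [stress for stress, _ in linear]
x = [strain for _, strain in linear]

n = len(linear)
x_mean = sum(x) / n
y_mean = sum(y) / n

# Least-squares slope and intercept for y = a*x + b.
sxx = sum((xi - x_mean) ** 2 for xi in x)
sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
a = sxy / sxx
b = y_mean - a * x_mean

print(a, b)
```

Fitting all 20 rows instead gives a noticeably smaller slope, which is why the choice of rows matters here.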
Significant Figures
The only significant digit in the error is the leftmost non-zero digit
The last significant digit in the number itself is in the same decimal
place as the significant digit in the error
Rounding: 5 or above, round up; 4 or below, round down
Example:
Slope (a): 9753608.341 ± 383448.105
rounds to 9800000 ± 400000
Intercept (b): 1320.759293 ± 772.5472262
rounds to 1300 ± 800
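These rounding rules can be automated. The sketch below uses Python's decimal module with half-up rounding so that a 5 always rounds up; the function name `round_with_error` is hypothetical, and this is one possible implementation rather than a prescribed method.

```python
from decimal import Decimal, ROUND_HALF_UP

def round_with_error(value, error):
    """Round error to one significant digit and value to the same decimal place."""
    val = Decimal(str(value))
    err = Decimal(str(error))

    # Position of the leftmost non-zero digit of the error (5 for 383448.105).
    exponent = err.adjusted()
    quantum = Decimal(1).scaleb(exponent)
    err_rounded = err.quantize(quantum, rounding=ROUND_HALF_UP)

    # If rounding carried into a new leading digit (0.96 -> 1.0), move up one place.
    if err_rounded.adjusted() > exponent:
        quantum = Decimal(1).scaleb(err_rounded.adjusted())
        err_rounded = err_rounded.quantize(quantum, rounding=ROUND_HALF_UP)

    val_rounded = val.quantize(quantum, rounding=ROUND_HALF_UP)
    return val_rounded, err_rounded

slope, slope_err = round_with_error(9753608.341, 383448.105)
intercept, intercept_err = round_with_error(1320.759293, 772.5472262)

# format(..., "f") prints plain digits instead of scientific notation.
print(format(slope, "f"), format(slope_err, "f"))
print(format(intercept, "f"), format(intercept_err, "f"))
```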
Linearizing an Exponent Equation
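$y = a x^b$, where a, b are unknown constants
Taking log on both sides,
$\log y = \log a + b \log x$
This has the linear form $y = c + m x$
[Figure: straight line of log y plotted against log x]
Slope = b
Intercept = log a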
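The linearization can be tried on data whose answer is known in advance. The sketch below generates hypothetical points from y = 2x^1.5 (values chosen only for illustration), fits a straight line to log y versus log x, and recovers a and b from the intercept and slope.

```python
import math

# Hypothetical data generated from y = a * x**b with known a and b.
true_a, true_b = 2.0, 1.5
xs = [1.0, 2.0, 4.0, 8.0, 16.0]
ys = [true_a * x ** true_b for x in xs]

# Linearize: log y = log a + b * log x.
log_x = [math.log10(x) for x in xs]
log_y = [math.log10(y) for y in ys]

# Least-squares straight line through the (log x, log y) points.
n = len(xs)
lx_mean = sum(log_x) / n
ly_mean = sum(log_y) / n
sxx = sum((lx - lx_mean) ** 2 for lx in log_x)
sxy = sum((lx - lx_mean) * (ly - ly_mean) for lx, ly in zip(log_x, log_y))

b_fit = sxy / sxx                      # slope = b
log_a_fit = ly_mean - b_fit * lx_mean  # intercept = log a
a_fit = 10 ** log_a_fit

print(a_fit, b_fit)
```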