I. Overview of Kinetics (Chemical Reaction Engineering, Levenspiel, 1999)
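$$-r_A = -\frac{1}{V}\frac{dN_A}{dt} = \frac{\text{amount of A disappearing}}{(\text{volume})(\text{time})}, \quad \left[\frac{\text{mol}}{\text{m}^3\cdot\text{s}}\right] \text{ or [concentration/time]} \tag{1}$$
$$-\frac{dC_A}{dt} = -r_A = f(\text{temperature, concentration}) = kC_A^n = k_0 e^{-E_a/RT} C_A^n \tag{2}$$
$$\frac{[\text{concentration}]}{[\text{time}]} = k\,[\text{concentration}]^n$$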
if n = 1, the order of the reaction = 1 with respect to A;
if n ≠1 (including zero), the order of the reaction is nth
with respect to A;
CA concentration of A (mol/m3)
NA amount of A (mol)
A. Rate constant k
Solve for dimensions of k
$$k \; [=] \; (\text{concentration})^{1-n}\,(\text{time})^{-1}$$
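Physical meaning of rate constant and activation energy
k: sets the magnitude of the rate at a given concentration; for n > 0 the rate is highest at the highest C and decreases as C decreases.
Ea: describes how the rate changes with temperature.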
$$k = k_0 e^{-E_a/RT}, \qquad \ln k = \ln k_0 - \frac{E_a}{R}\,\frac{1}{T} \tag{3}$$
or, using a reference temperature $T_r$, at which $k = k_r$:
$$k = k_r \exp\left[-\frac{E_a}{R}\left(\frac{1}{T} - \frac{1}{T_r}\right)\right] \tag{4}$$
Although Eq. (3) is found more often in the literature, I find the form of Eq. (4) more useful
and easier to work with.
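As a quick check that Eq. (3) and Eq. (4) describe the same temperature dependence, the following Python sketch evaluates both forms. The values of Ea, T_r, and k_r are hypothetical and chosen only for illustration, and the function names are not from the original notes.

```python
import math

R = 8.314  # gas constant, J/(mol K)

def k_arrhenius(T, k0, Ea):
    """Eq. (3): k = k0 * exp(-Ea / (R T)), with T in kelvin."""
    return k0 * math.exp(-Ea / (R * T))

def k_reference(T, k_r, T_r, Ea):
    """Eq. (4): k = k_r * exp(-(Ea / R) * (1/T - 1/T_r))."""
    return k_r * math.exp(-(Ea / R) * (1.0 / T - 1.0 / T_r))

# Hypothetical values, for illustration only
Ea = 100e3     # activation energy, J/mol
T_r = 373.15   # reference temperature, K
k_r = 0.05     # rate constant at T_r, 1/min

# k0 implied by (k_r, T_r), so that both forms describe the same reaction
k0 = k_r * math.exp(Ea / (R * T_r))

for T in (353.15, 373.15, 393.15):
    print(T, k_arrhenius(T, k0, Ea), k_reference(T, k_r, T_r, Ea))
```

Note that k_r has the same magnitude as rate constants measured near T_r, whereas k0 is a very large number extrapolated to infinite temperature, which is one reason the reference-temperature form can be more convenient.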
B. Derivation of equations for 1st, zero, and nth-order reactions
1. First-order reaction
Used very frequently for nutrient degradation and microbial destruction.
From Eq. (2)
$$\frac{dC}{dt} = -kC \tag{5}$$
$$\int_{C_0}^{C} \frac{dC}{C} = -k\int_0^t dt \tag{6}$$
$$\ln C - \ln C_0 = -kt \tag{7}$$
$$C/C_0 = e^{-kt} \tag{8}$$
Often we are interested in the amount of loss (conversion) of C. Loss = X = 1- C/C0. For
first-order reaction,
$$X = 1 - e^{-kt} \tag{9}$$
The right side of Eq. (9) does not contain C0, so the fractional loss is not a function of the initial concentration.
One advantage of a 1st-order reaction is that model experiments can be run at high
concentrations, and the % retention (C/C0) will be the same at low concentrations. This is not
true for any reaction that is not first-order (including zero order).
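The independence from initial concentration can be illustrated with a short Python sketch. It compares the conversion X = 1 - C/C0 for a first-order reaction, Eq. (8), with that of a zero-order reaction (C = C0 - kt) at two initial concentrations. The rate constants, time, and concentrations are hypothetical values chosen only for illustration.

```python
import math

def first_order_C(C0, k, t):
    """Eq. (8): C = C0 * exp(-k t)."""
    return C0 * math.exp(-k * t)

def zero_order_C(C0, k, t):
    """Zero-order reaction: C = C0 - k t, floored at zero."""
    return max(C0 - k * t, 0.0)

def conversion(C0, C):
    """Loss X = 1 - C/C0."""
    return 1.0 - C / C0

# Hypothetical values, for illustration only
k_first = 0.1   # 1/min
k_zero = 0.5    # concentration/min
t = 10.0        # min

results = {}
for C0 in (10.0, 100.0):
    X_first = conversion(C0, first_order_C(C0, k_first, t))
    X_zero = conversion(C0, zero_order_C(C0, k_zero, t))
    results[C0] = (X_first, X_zero)
    print(f"C0 = {C0:6.1f}  X(first order) = {X_first:.4f}  X(zero order) = {X_zero:.4f}")
```

The first-order conversion is the same for both initial concentrations, while the zero-order conversion changes by the same factor as C0.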
a. First type of experimental design (most common) to determine if the reaction is first-
order: Vary time, hold temperature and initial concentration constant.
Take samples from isothermal conditions at different times. Plot ln C vs. t from Eq.
(7). Must be straight line.
To meet the requirement of isothermal conditions, lag time must be neglected.
Therefore, data are taken after isothermal conditions have been nearly reached.
b. Second type of experimental design to determine if reaction is first-order: Vary initial
concentration, hold time and temperature constant.
Take all samples at same time and temperature. Plot lnC vs. lnC0 from Eq.(7). Must
be a straight line of slope 1.0.
2. Zero-order reaction
Often used for changes in color of a processed food.
Rate of reaction is independent of the concentration of the materials:
$$-r = -\frac{dC}{dt} = kC^0 = k \tag{10}$$
$$\int_{C_0}^{C} dC = -k\int_0^t dt \tag{11}$$
$$C - C_0 = -kt \tag{12}$$
$$X = 1 - C/C_0 = kt/C_0$$
Therefore, X = f(initial concentration).
Most zero-order reactions are at high concentrations. As concentration is lowered, the
order rises from zero.
3. nth-order reaction
$$\frac{dC}{dt} = -kC^n \tag{13}$$
Many nonlinear regression routines can do this. Find Solver in Excel under "Tools." If
not there, click "Add-ins" and check the "Solver" box. It may ask you for the CD.
$$X = 1 - (C/C_0) = 1 - \left[1 + \frac{(n-1)\,kt}{C_0^{\,1-n}}\right]^{1/(1-n)}, \quad n \neq 1 \tag{14}$$
Loss is also dependent on initial concentration.
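A minimal Python sketch of Eq. (14) follows. It assumes the integrated form of Eq. (13) for n ≠ 1 and falls back to Eq. (9) for n = 1; the function name and the numerical values are hypothetical.

```python
import math

def conversion_nth_order(C0, k, t, n):
    """Loss X = 1 - C/C0 for dC/dt = -k C^n."""
    if n == 1:
        return 1.0 - math.exp(-k * t)                 # Eq. (9)
    # (n - 1) k t C0^(n-1) is the same as (n - 1) k t / C0^(1-n)
    base = 1.0 + (n - 1.0) * k * t * C0 ** (n - 1.0)
    if base <= 0.0:
        return 1.0   # for n < 1 the reactant is used up in finite time
    return 1.0 - base ** (1.0 / (1.0 - n))            # Eq. (14)

# Hypothetical values: the same k and t at two initial concentrations
for n in (0, 0.5, 1, 2):
    X_low = conversion_nth_order(10.0, 0.01, 5.0, n)
    X_high = conversion_nth_order(100.0, 0.01, 5.0, n)
    print(f"n = {n}: X(C0=10) = {X_low:.4f}, X(C0=100) = {X_high:.4f}")
```

Only the n = 1 row gives the same loss at both initial concentrations.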
C. Matlab Built-in functions for regression
Functions for Linear regression:
p = polyfit(x, y, n)
where x and y are the vectors of the independent and the dependent variables, respectively, and n
is the order of the polynomial. For a straight-line fit (n = 1), p(1) = slope and p(2) = intercept.
Functions for Multiple linear regression:
[b,bint,r,rint,stats] = regress(y,X)
where b are the coefficients (b(1) is the intercept when the first column of X is a column of ones), bint are the 95% confidence intervals for the coefficients, r
are the residuals, and rint are the intervals for the residuals. If the rint interval for an observation does not contain zero, that observation may be an outlier. (We are not discussing
outliers in this class.) If the bint interval for a coefficient does not contain 0, then that coefficient is significant.
residual = Yobserved – Ypredicted
dfittool will be used to show how to plot a residual histogram.
plot([x1 x2], [y1 y2], 'k', 'LineWidth', 2) will plot a black line from (x1, y1) to (x2, y2). We use this
to plot a horizontal line at y = 0 for residuals, for visual ease.
Functions for Nonlinear regression:
[beta,r,J,COVB,mse] = nlinfit(X,y,fun,beta0)
X is an n-by-q matrix of the independent variables; you can put extra columns
in with other data;
y is an n-by-1 vector of observed responses.
fun is a function handle to a function of the form:
yhat = modelfun(b,X)
beta0 is a vector containing initial values for the coefficients.
II. Numerical Integration
Integrals are usually needed when scaling up from a several-gram sample to kilogram-
sized sample, because the larger sample will have temperature, moisture, shear-rate, or
concentration gradients.
When one wishes to maximize Y (e.g. nutrient retention), one must track history of all
points in a commercial process. Often every element of the product has a different
history. In these cases, one method to calculate Y is integration of a model function.
Many of the most important models cannot easily be integrated analytically. Therefore,
numerical integration is required.
Following are three of the most common numerical integration techniques well-suited for
a spreadsheet or Matlab.
For Numerical Integration of Tabulated Data or of Functions
a. Trapezoidal Rule
Easiest to understand and quickest to use. Least accurate, yet often sufficiently accurate
for a first cut. The Trapezoidal Rule uses a straight-line approximation, so polynomial
degree = 1. Its advantage over Simpson’s Rule is that it can handle unevenly spaced
panels.
For evenly spaced x intervals:
$$I \approx \frac{\Delta x}{2}\left[f_0 + 2\sum_{j=1}^{n-1} f_j + f_n\right] \tag{15}$$
For evenly or unevenly spaced x intervals:
$$I \approx \sum_{j=0}^{n-1} \frac{f_j + f_{j+1}}{2}\,(x_{j+1} - x_j) = \text{sum of (average heights} \times \text{widths)} \tag{16}$$
Eq. (16) is always correct, whereas Eq. (15) carries an assumption. Eq. (16) is better
suited than Eq. (15) for spreadsheets, and can be coded within a few minutes.
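Eq. (16) can be coded in a few lines. The Python sketch below applies it to tabulated data; the function name and the unevenly spaced test data are illustrative assumptions, and the evenly spaced case uses x² cos x on [0, π/2] with 4 panels.

```python
import math

def trapezoid(x, f):
    """Eq. (16): sum of (average height * width) over all panels."""
    return sum(0.5 * (f[j] + f[j + 1]) * (x[j + 1] - x[j])
               for j in range(len(x) - 1))

# Evenly spaced: x^2 cos x on [0, pi/2] with 4 panels
x_even = [j * (math.pi / 2) / 4 for j in range(5)]
f_even = [xi ** 2 * math.cos(xi) for xi in x_even]
I_even = trapezoid(x_even, f_even)
print(f"4 even panels: {I_even:.6f}")

# Unevenly spaced points (hypothetical spacing); the same function works unchanged
x_uneven = [0.0, 0.2, 0.5, 0.9, 1.2, math.pi / 2]
f_uneven = [xi ** 2 * math.cos(xi) for xi in x_uneven]
I_uneven = trapezoid(x_uneven, f_uneven)
print(f"5 uneven panels: {I_uneven:.6f}")
```

Because each panel uses its own width, no assumption about spacing is needed.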
b. Simpson’s Rule
More accurate than trapezoidal rule, but cannot handle unevenly spaced intervals, and
must have even number of panels.
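$$I \approx \frac{\Delta x}{3}\left[f_0 + 4\sum_{\substack{j=1 \\ j\ \text{odd}}}^{n-1} f_j + 2\sum_{\substack{j=2 \\ j\ \text{even}}}^{n-2} f_j + f_n\right] \tag{17}$$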
Uses parabolic approximation (degree of polynomial = 2). Exact solution for
polynomial degree = 2 or lower (and, because the cubic error terms cancel by symmetry, also for degree = 3).
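A minimal Python sketch of Eq. (17) is shown below, assuming the integrand is available as a function and the panels are evenly spaced. The function name is illustrative.

```python
import math

def simpson(f, a, b, n):
    """Eq. (17): Simpson's Rule with n evenly spaced panels (n must be even)."""
    if n < 2 or n % 2 != 0:
        raise ValueError("Simpson's Rule needs an even number of panels")
    dx = (b - a) / n
    total = f(a) + f(b)
    total += 4.0 * sum(f(a + j * dx) for j in range(1, n, 2))   # odd j
    total += 2.0 * sum(f(a + j * dx) for j in range(2, n, 2))   # even j
    return dx / 3.0 * total

I_simpson = simpson(lambda x: x ** 2 * math.cos(x), 0.0, math.pi / 2, 4)
print(f"Simpson, 4 panels: {I_simpson:.6f}")
```

The explicit check on n reflects the requirement that the number of panels be even.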
For Numerical Integration of Functions
b. Romberg integration
Evenly spaced panels, constant panel width.
More accurate than trapezoidal and Simpson’s rule because it combines these
two methods using a generalized formula.
The main steps are
a) Compute the trapezoidal approximation for n panels ($I_n$) and for n/2
panels ($I_{n/2}$). For $I_n$, panel size = h; for $I_{n/2}$, panel size = 2h.
b) Because the leading error term is proportional to $h^2$, subtracting $I_{n/2}$
from $4 I_n$ eliminates the leading error, and leaves an estimate for
3 times the integral.
c) We divide the result by 3 to obtain a better estimate, because the
leading error term has been removed.
d) This procedure is generalized below:
$$I_n^{(k)} = \frac{4^k I_n^{(k-1)} - I_{n/2}^{(k-1)}}{4^k - 1} \tag{18}$$
where $I_n^{(0)}$ is the trapezoidal estimate with n panels,
n = number of panels (intervals) in the trapezoidal rule and is an integer = $2^k$,
k = the number of levels of error elimination. k = 1, 2, 3, …
A table can be set up showing Romberg integration results. See pdf by Peter
Young. The most accurate Romberg integration approximation is the one with
the most error extrapolations, i.e. the diagonal entry $I_{2^k}^{(k)}$.
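The Romberg table can be sketched in Python as follows. The layout (row i holds the estimates for n = 2^i panels, column k holds the level of error elimination) is an assumption made for this illustration, and the function name is not from the original notes.

```python
import math

def romberg_table(f, a, b, levels):
    """Build the Romberg table; table[i][k] is I_n^(k) of Eq. (18) with n = 2**i."""
    h = b - a
    table = [[0.5 * h * (f(a) + f(b))]]        # one-panel trapezoidal estimate
    n = 1
    for i in range(1, levels + 1):
        n *= 2
        h /= 2.0
        # Trapezoidal estimate with n panels, reusing the previous one and
        # adding only the new midpoints a + h, a + 3h, ...
        new_points = sum(f(a + (2 * m - 1) * h) for m in range(1, n // 2 + 1))
        row = [0.5 * table[i - 1][0] + h * new_points]
        for k in range(1, i + 1):
            # Eq. (18): combine the n-panel and n/2-panel estimates of level k-1
            row.append((4 ** k * row[k - 1] - table[i - 1][k - 1]) / (4 ** k - 1))
        table.append(row)
    return table

table = romberg_table(lambda x: x ** 2 * math.cos(x), 0.0, math.pi / 2, 2)
for row in table:
    print("  ".join(f"{value:.6f}" for value in row))
```

In the last row (n = 4), the first entry is the trapezoidal estimate, the second is identical to Simpson's Rule with 4 panels, and the diagonal entry is the Romberg estimate.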
c. Gauss Quadrature—the Cadillac of numerical integration
Most accurate of the three, and can be very accurate with few panels, so it reduces
computer time. Based on a normalized interval from –1 to +1.
Disadvantage compared to Trapezoidal and Simpson’s is that Gauss cannot handle
discrete data points without a continuous function.
Gauss integration should be your first choice unless a) no function, or b) time series.
$$I \approx C\,[w_1 f(x_1) + w_2 f(x_2) + \dots + w_m f(x_m)] \tag{19}$$
where the xi are the m unequally spaced points determined by the type and degree of
function, and the wi are the weight factors that depend on m. C = (b-a)/2.
Uses approximation polynomial of degree 2m-1.
If we wish to evaluate $\int_a^b f(x)\,dx$, we must transform from $-1 < \xi < 1$ to $a < x < b$. Use
the transformation:
$$x = \frac{b+a}{2} + \frac{b-a}{2}\,\xi \tag{20}$$
Example 1 (Numerical Methods, Robert Hornbeck, 1982, Prentice Hall)
Evaluate the integral $\int_0^{\pi/2} x^2 \cos x \, dx$
Exact solution = $\left[x^2 \sin x + 2x\cos x - 2\sin x\right]_0^{\pi/2}$ = 0.467401
Gauss Quadrature
Choose m = 4
ξk                 wk
± 0.3399810436     0.6521451549
± 0.8611363116     0.3478548451
Solve for the xk’s using Eq. (20)
$$x_1 = \frac{\pi/2 + 0}{2} + \frac{\pi/2 - 0}{2}(-0.861136) = \frac{1.570796}{2} + \frac{1.570796}{2}(-0.861136) = 0.109064$$
$$x_2 = \frac{1.570796}{2} + \frac{1.570796}{2}(-0.339981) = 0.518378$$
x3 = 1.05242
x4 = 1.46173
Corresponding values of f(x) = x² cos x are
f(x1) = (0.109064)² cos(0.109064) = 0.011824
f(x2) = 0.233413
f(x3) = 0.548777
f(x4) = 0.232572
[Figure: map from the Gauss interval onto our function]
Trapezoidal Rule, 4 panels
x          f(x)       av height   av ht * Δx
0          0
0.392699   0.142474   0.0712369   0.027975
0.785398   0.436179   0.2893264   0.113618
1.178097   0.531131   0.4836552   0.189931
1.570796   1.51E-16   0.2655657   0.104287
                      Sum =       0.435811

Simpson’s Rule, 4 panels
x          f(x)       2*f or 4*f   term
0          0          0            f0
0.392699   0.142474   0.569895     4*f1
0.785398   0.436179   0.872358     2*f2
1.178097   0.531131   2.124525     4*f3
1.570796   1.51E-16   6.05E-16     f4
                      (Δx/3) * sum = 0.46689
From Eq. (19),
$$I \approx \frac{\pi/2 - 0}{2}\left[w_1 f(x_1) + w_2 f(x_2) + w_3 f(x_3) + w_4 f(x_4)\right]$$
$$= \frac{1.570796}{2}\left[(0.347855)(0.011824) + (0.652145)(0.233413) + (0.652145)(0.548777) + (0.347855)(0.232572)\right]$$
$$= 0.467402$$
Comparison of results for each method:
Method                           Value       Error
Exact value                      0.467401
Gauss Quadrature, m = 4          0.467402    -0.000001 = -0.0002139%
Romberg integration, 4 panels    0.467565    -0.000164 = -0.0351%
Simpson’s Rule, 4 panels         0.466890    +0.00051 = +0.109%
Trapezoidal Rule, 4 panels       0.435811    +0.03159 = +6.759%
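The Gauss quadrature result in Example 1 can be reproduced with a short Python sketch of Eq. (19) and Eq. (20). The nodes and weights are the m = 4 values tabulated in the example; the function and variable names are illustrative assumptions.

```python
import math

# m = 4 Gauss points and weights on the normalized interval (-1, +1)
NODES = [-0.8611363116, -0.3399810436, 0.3399810436, 0.8611363116]
WEIGHTS = [0.3478548451, 0.6521451549, 0.6521451549, 0.3478548451]

def gauss4(f, a, b):
    """Four-point Gauss quadrature of f over [a, b]."""
    C = (b - a) / 2.0            # scale factor in Eq. (19)
    mid = (b + a) / 2.0
    # Eq. (20) maps each normalized point onto [a, b]
    return C * sum(w * f(mid + C * xi) for xi, w in zip(NODES, WEIGHTS))

I_gauss = gauss4(lambda x: x ** 2 * math.cos(x), 0.0, math.pi / 2)
exact = (math.pi / 2) ** 2 - 2.0
print(f"Gauss, m = 4: {I_gauss:.6f}   exact: {exact:.6f}")
```

Only four function evaluations are needed, compared with five for the 4-panel trapezoidal and Simpson estimates.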
III. Introduction to Parameter Estimation
Reference: Parameter Estimation in Engineering and Science, James V. Beck and Kenneth J. Arnold,
1977, John Wiley & Sons, Inc.
“Parameter estimation is a discipline that provides tools for the efficient use of data in the estimation
of constants appearing in mathematical models and for aiding in modeling of phenomena” (Beck and
Arnold, p.1).
Mathematical models have dependent variables, independent variables, and constants (parameters).
In first classes on physics or example math problems, typically the student is asked to calculate the
dependent variable for given conditions (independent variables set and parameters given). However,
in experiments based on mathematical models, often the experimenter sets the independent
variables, measures (not calculates) the dependent variable, and uses these two sets of data to infer,
or estimate the parameter. When this is not the case, it is usually because the parameter has been
well-established (such as g = 9.81 m/s², or an electrical wire guaranteed to have a certain resistance).
Inverse Problem
An inverse problem is where unknown causes must be determined from known effects. Or: The
answer is known, but the question is not.
Forward problem: Given the parameters β = p, and the function Y(t, β), compute Y(t) at t1, t2,…
tn.
Inverse problem: Given the responses Y(t1), Y(t2),…Y(tn), and a function Y(t, β), estimate p.
Trivial example:
Forward problem: Y = mx + b Given, m, b, compute Y(x1, x2, …xp)
Inverse problem: Y = mx + b Given Y(x1, x2, …xp), estimate m and b
Inverse problems always require measurements and data. Forward problems can be done with no data,
e.g. simulations of pressure and velocity of air or water flow over an airfoil or over a cylinder. Most
commercial simulation software (such as Comsol, Fluent) focuses on the forward problem, meaning all
parameters must be input before the program is run once. An explicit solution is given. The same
software can be used to run inverse problems, as follows:
1. Let Matlab be the “brain” that feeds the initial guesses of the parameters to the software;
2. Let the software run once and give answer 1;
3. Matlab computes the sum of squares of errors (S), and chooses a new set of parameters;
4. Repeat #2 and #3 until S is minimized. The final set of parameters will be the best estimate.
Therefore, the inverse problem requires that the forward problem be run numerous times with different
parameter estimates.
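The loop described above can be sketched in Python. Everything here is a stand-in chosen for illustration: the forward model is a simple first-order decay rather than a commercial simulation, the data are synthetic and noise-free, and a golden-section search plays the role of the routine that chooses new parameter values.

```python
import math

def forward(k, times, C0=100.0):
    """Stand-in forward problem: given the parameter k, compute Y at each time."""
    return [C0 * math.exp(-k * t) for t in times]

def sum_of_squares(k, times, Y_measured):
    """S: sum of squares of errors between measurements and one forward run."""
    return sum((y - y_hat) ** 2 for y, y_hat in zip(Y_measured, forward(k, times)))

def golden_section(func, lo, hi, tol=1e-8):
    """Minimize a one-parameter function that has a single minimum on [lo, hi]."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    c, d = hi - g * (hi - lo), lo + g * (hi - lo)
    while hi - lo > tol:
        if func(c) < func(d):
            hi = d        # minimum lies in [lo, d]
        else:
            lo = c        # minimum lies in [c, hi]
        c, d = hi - g * (hi - lo), lo + g * (hi - lo)
    return (lo + hi) / 2.0

# Synthetic "measurements" generated with k = 0.05 (hypothetical, noise-free)
times = [0.0, 5.0, 10.0, 20.0, 40.0]
Y_measured = forward(0.05, times)

# Inverse problem: run the forward problem repeatedly until S is minimized
k_hat = golden_section(lambda k: sum_of_squares(k, times, Y_measured), 0.001, 1.0)
print(f"estimated k = {k_hat:.6f}")
```

Each evaluation of S inside the search is one complete run of the forward problem, which is why inverse problems cost many forward runs.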
From Heidi Korhonen, “Introduction to Inverse Problems”
http://www.space.fmi.fi/graduateschool/Lectures07/HK_inversion.pdf
Applications of Inverse Problems:
Estimation of thermal properties from temperature measurements
Estimation of hydraulic conductivity from flow measurements in the soil
Seismology—locating groundwater, oil, gas from measurement of seismic waves
CT scans in medicine to determine shape of organs
Nondestructive evaluation (NDE)
Scattering problems, such as identifying internal defects in fruit from measured scattering
data.
Airport security and x-rays
Physics and astronomy
Life sciences--Estimation of rate constants in chemical and biological reactions where
functions are known
Because multiple models may fit the same data, we desire prior information to eliminate some models
or give less weight to them.
“Parameter estimation can also be visualized as a study of inverse problems (Beck and Arnold, p.1).”
“When it is possible to determine governing equations(s), shapes(s) and size(s) of the domain(s),
boundary and initial conditions, material properties of the media contained in the field, and internal
sources and external forces or inputs, then the analysis determining the unknown field is considered
mathematically well-posed and solvable. If any of these elements is unknown or unavailable, then the
field problem becomes incompletely defined (ill-posed) and is of an indirect (or inverse) type.” (Inverse
Problems in Science & Engineering Journal.)
Examples:
Ohm’s Law V = I R.
Gravitational acceleration: Throw an object up vertically with initial velocity $V_0$. The distance, D,
above the earth is described by $D = V_0 t - g t^2/2$.
Heat transfer: conduction: $q = -k\,(dT/dx)$
convection: $q = hA(T - T_\infty) = -\rho c_p V\,(dT/dt)$
$(T - T_\infty)/(T_0 - T_\infty) = \exp[-hAt/(\rho c_p V)]$
Rheology: $\tau = K\dot{\gamma}^n$
A. Linear vs. nonlinear models--General Case
Consider the true model as
$$\eta = \eta(X_1, X_2, \dots, X_k;\ \beta_1, \beta_2, \dots, \beta_p) \tag{21}$$
where $\eta$ is the dependent variable, X are the independent variables, and $\beta$ are the true values of
the parameters.
Y are the measurements, where the ith measurement is $Y_i$, and there are a total of n
measurements. Often, the subscript “i” represents time. Because most experimental
measurements of the dependent variable include some error $\varepsilon_i$, Y can be expressed as a
combination of the true model and the unknown errors:
$$Y_i = \eta(X_{i1}, \dots, X_{ik};\ \beta_1, \dots, \beta_p) + \varepsilon_i = \eta(\mathbf{X}_i, \boldsymbol{\beta}) + \varepsilon_i = \eta_i + \varepsilon_i \tag{22}$$
When we study regression, we will call the regression predicted value of the dependent value
$\hat{Y}$ (“Y hat”). This predicted value is not the same as the true model, though we hope they are
close. Therefore, note that the unknown errors $\varepsilon = Y - \eta$ are not the same as the known residuals
$e = Y - \hat{Y}$.
The sensitivity coefficient for parameter $\beta_i$ is given by the first derivative of the dependent
variable with respect to $\beta_i$:
$$X_i = \frac{\partial \eta}{\partial \beta_i}$$
The model is linear if none of the sensitivity coefficients $X_i$ is a function of any parameter $\beta_j$,
i.e., if the model’s first derivative $\partial \eta_i / \partial \beta_j \neq f(\beta)$, which is the same as the second derivative
$$\frac{\partial^2 \eta_i}{\partial \beta_j^2} = \frac{\partial X_{ij}}{\partial \beta_j} = 0,$$
where i = 1,…,n, j = 1,…,p. Otherwise, the model is nonlinear.
Examples:
Identify the dependent and independent variables, and the parameters. Determine if the
models are linear or nonlinear with respect to each parameter.
$\eta = mx + b$
$\eta = mx^2 + b$
$\eta = m^2 x + b$
$\eta = Ae^{-kt}$
$\log(\eta) = \log A - kt$
$\tau = K\dot{\gamma}^n$
Significance of linear vs. non-linear:
There is an analytical solution for a parameter if the model is linear in that parameter.
Therefore, no initial estimate of a linear parameter is needed. In contrast, for nonlinear
parameters, an initial estimate of that parameter is needed, because there is no analytical
solution and an iterative routine is required. There are several popular routines available,
called nonlinear regression algorithms. Solver in Excel® uses the Generalized Reduced Gradient (GRG) method.
Question: How do software packages (e.g. Excel®) calculate the parameters for models that
are non-linear, such as power-law, without initial estimates?
B. Regression
1. Linear
Parameters are estimated by minimizing sum of squares of errors.
$Y_i$ is the measured value; $\hat{Y}_i$ is the predicted (regression) value; $\bar{Y}$ is the mean of all $Y_i$.
sum of squares of errors: $S = SSE = \sum_i (Y_i - \hat{Y}_i)^2$
sum of squares total: $SST = \sum_i (Y_i - \bar{Y})^2$
$0 < R^2 < 1.0$ is a measure of how well the data fit the regression line, for linear models only:
$$R^2 = 1 - \frac{SSE}{SST} \tag{23}$$
Therefore, maximizing R2 is identical to minimizing SSE.
Root mean square error (RMSE) is a measure of error in the data, and is the standard deviation
of the estimate. RMSE has the same units as the measured Y. RMSE is a preferred method for
reporting error in one’s data, rather than SSE.
$$RMSE = \sqrt{\frac{SSE}{n - p}} \tag{24}$$
where n is the number of data that were fit to a model, and p is the number of parameters.
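Eq. (23) and Eq. (24) can be computed directly from the measured and predicted values. The Python sketch below uses a small hypothetical data set; the function name is illustrative.

```python
import math

def fit_statistics(Y, Y_hat, p):
    """Return (SSE, SST, R2, RMSE) for measured Y, predicted Y_hat, p parameters."""
    n = len(Y)
    Y_bar = sum(Y) / n
    SSE = sum((y - yh) ** 2 for y, yh in zip(Y, Y_hat))
    SST = sum((y - Y_bar) ** 2 for y in Y)
    R2 = 1.0 - SSE / SST                 # Eq. (23)
    RMSE = math.sqrt(SSE / (n - p))      # Eq. (24)
    return SSE, SST, R2, RMSE

# Hypothetical measured and predicted values for a two-parameter (straight-line) fit
Y = [1.0, 2.0, 3.0, 4.0]
Y_hat = [1.1, 1.9, 3.2, 3.8]
SSE, SST, R2, RMSE = fit_statistics(Y, Y_hat, p=2)
print(f"SSE = {SSE:.3f}, SST = {SST:.3f}, R2 = {R2:.3f}, RMSE = {RMSE:.4f}")
```

RMSE carries the units of Y, which makes it easier to interpret than SSE.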
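Example 2 of parameter estimation for model $\eta = \beta X$
$$S = \sum_i (Y_i - \beta X_i)^2$$
$$\partial S/\partial \beta = 2\sum_i (Y_i - \beta X_i)(-X_i) = 0 \tag{25}$$
$$\sum_i Y_i X_i - \beta \sum_i X_i^2 = 0 \tag{26}$$
$$\beta = \frac{\sum_i Y_i X_i}{\sum_i X_i^2} \tag{27}$$
The slope, $\beta$, can be calculated directly from the data points $(X_i, Y_i)$.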
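Example 3 for two parameters
With the same method, the slope and intercept parameters can be calculated for the model
$\eta_i = \beta_1 X_i + \beta_2$
$$S = \sum_i (Y_i - \beta_1 X_i - \beta_2)^2$$
Setting both first derivatives to zero
$$\partial S/\partial \beta_1 = 2\sum_i (Y_i - \beta_1 X_i - \beta_2)(-X_i) = 0 \tag{28}$$
$$\partial S/\partial \beta_2 = 2\sum_i (Y_i - \beta_1 X_i - \beta_2)(-1) = 0 \tag{29}$$
gives two equations and two unknowns:
$$\beta_1 \sum_i X_i^2 + \beta_2 \sum_i X_i = \sum_i Y_i X_i$$
$$\beta_1 \sum_i X_i + n\,\beta_2 = \sum_i Y_i \tag{30}$$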
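and the solution for the most widely used linear least squares model is the slope
$$\beta_1 = \frac{n\sum_i Y_i X_i - \sum_i Y_i \sum_i X_i}{n\sum_i X_i^2 - \left(\sum_i X_i\right)^2} = \frac{\sum_i (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_i (X_i - \bar{X})^2} \tag{31}$$
and intercept
$$\beta_2 = \bar{Y} - \beta_1 \bar{X} \tag{32}$$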
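The closed-form results of Eq. (27), Eq. (31), and Eq. (32) can be checked with a few lines of Python. The data below are hypothetical, and the function names are illustrative.

```python
def slope_through_origin(X, Y):
    """Eq. (27): least-squares slope for the model eta = beta * X."""
    return sum(x * y for x, y in zip(X, Y)) / sum(x * x for x in X)

def slope_and_intercept(X, Y):
    """Eq. (31) and Eq. (32): least-squares slope and intercept."""
    n = len(X)
    Sx, Sy = sum(X), sum(Y)
    Sxy = sum(x * y for x, y in zip(X, Y))
    Sxx = sum(x * x for x in X)
    b1 = (n * Sxy - Sx * Sy) / (n * Sxx - Sx ** 2)
    b2 = Sy / n - b1 * Sx / n
    return b1, b2

# Hypothetical data scattered about Y = 2 X + 1
X = [0.0, 1.0, 2.0, 3.0, 4.0]
Y = [1.1, 2.9, 5.2, 6.8, 9.1]
b1, b2 = slope_and_intercept(X, Y)
print(f"slope = {b1:.4f}, intercept = {b2:.4f}")
print(f"slope through origin = {slope_through_origin(X, Y):.4f}")
```

No initial guesses are needed because both models are linear in their parameters.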
D. Matrix Formulation for Parameter Estimation for Linear Model Ordinary Least Squares Estimation (OLS) (Beck & Arnold, revised Chap 6, p. 5.24-5.25; p. 5.44-5.51)
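Model: $\boldsymbol{\eta} = \mathbf{X}\boldsymbol{\beta}$;
n is the number of measurements,
p is the number of parameters.
When nothing is known about the measurement errors, OLS is recommended.
where $\boldsymbol{\eta}$ is the model vector (n x 1),
$\mathbf{X}$ is the sensitivity matrix (n x p),
$\boldsymbol{\beta}$ is the parameter vector (p x 1):
$$\boldsymbol{\eta} = \begin{bmatrix} \eta_1 \\ \eta_2 \\ \vdots \\ \eta_n \end{bmatrix}, \quad \mathbf{X} = \begin{bmatrix} X_{11} & X_{12} & \cdots & X_{1p} \\ X_{21} & X_{22} & \cdots & X_{2p} \\ \vdots & \vdots & & \vdots \\ X_{n1} & X_{n2} & \cdots & X_{np} \end{bmatrix}, \quad \boldsymbol{\beta} = \begin{bmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_p \end{bmatrix} \tag{33}$$
(dimensions: n x 1, n x p, and p x 1, respectively)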
Sum of squares in matrix form is
$$S_{LS} = (\mathbf{Y} - \mathbf{X}\boldsymbol{\beta})^T (\mathbf{Y} - \mathbf{X}\boldsymbol{\beta}) \tag{34}$$
Where Y = vector of measured dependent variables (n x 1)
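From Beck & Arnold Revised Chapter 6 Eq. 5.2.7 and 5.2.10, the matrix derivative of a matrix
transpose times itself is
$$\nabla_\beta (\mathbf{A}^T \boldsymbol{\Phi} \mathbf{A}) = 2\,(\nabla_\beta \mathbf{A}^T)\,\boldsymbol{\Phi} \mathbf{A} \qquad \text{(5.2.10, Beck and Arnold)}$$
where $\boldsymbol{\Phi}$ is a symmetric matrix that is not a function of $\boldsymbol{\beta}$ (for OLS, $\boldsymbol{\Phi}$ is the identity matrix).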
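Using 5.2.10, take the matrix derivative of $S_{LS}$ and then set it equal to zero:
$$\nabla_\beta S_{LS} = 2\,[\nabla_\beta (\mathbf{Y} - \mathbf{X}\boldsymbol{\beta})^T]\,[\mathbf{Y} - \mathbf{X}\boldsymbol{\beta}] \tag{35}$$
Because Y and X are not functions of $\boldsymbol{\beta}$ and $(\mathbf{AB})^T = \mathbf{B}^T\mathbf{A}^T$,
$$\nabla_\beta (\mathbf{Y} - \mathbf{X}\boldsymbol{\beta})^T = -\nabla_\beta (\boldsymbol{\beta}^T \mathbf{X}^T) = -\mathbf{X}^T \tag{36}$$
Therefore, setting (35) equal to zero at $\mathbf{b}_{LS}$ becomes
$$2\,(-\mathbf{X}^T)(\mathbf{Y} - \mathbf{X}\mathbf{b}_{LS}) = 0 \tag{37}$$
Solving for the parameter estimate vector b:
$$\mathbf{b}_{LS} = (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{Y} \tag{38}$$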
Check matrix dimensions:
This estimator requires, for unique estimation of all p parameters, that the p x p matrix $\mathbf{X}^T\mathbf{X}$
be nonsingular, i.e. $|\mathbf{X}^T\mathbf{X}| \neq 0$. Even if this determinant only approaches zero, difficulty will be
experienced in the estimation.
Therefore,
1. No column in X can be proportional to any other column or to any linear combination of other columns in X, because if such a proportionality (i.e., linear dependence) exists, $|\mathbf{X}^T\mathbf{X}| = 0$;
2. n, the number of measurements of $Y_i$, must be equal to or greater than the number of parameters p;
3. If the predicted curve $\hat{Y}_i$ is not to pass through each observation, it is further necessary that n
> p+1.
Another method to check how difficult the estimation will be is to compute the condition
number of the matrix X (Beck and Arnold, Revised Chapter 6, p. 5.8-5.10). In Matlab, use
cond(X). Estimation becomes more difficult, and the problem tends to become ill-conditioned, as the
condition number becomes large (roughly 1,000,000 or more). The smallest possible
condition number is 1.0. A linear estimation problem is more stable as the
condition number becomes smaller.
Example 5.4.1, p. 5.46, Beck and Arnold, Revised Chapter 6.
-How to set up the X matrix
Example 5.4.2, p. 5.47, Beck and Arnold, Revised Chapter 6.
-Showing the difference between errors and residuals.
Example 5.4.3, p. 5.49. We will use Matlab for this, as shown below:
Use OLS to find the parameters for the model:
$\log Nu = \beta_1 + \beta_2 \log Re + \beta_3 (\log Re)^2 + \varepsilon$
Also give Nu as a function of Re and compare with the data given below:
Re   0.1    1      10     10^2   10^3   10^4   10^5
Nu   0.45   0.84   1.83   5.1    15.7   56.5   245
Write the model in the familiar form
$\eta = \beta_1 + \beta_2 t + \beta_3 t^2$, $Y = \eta + \varepsilon$
where $\eta$ is the model, Y is the measurement, and $\varepsilon$ is the error, which could include
systematic error, natural variability in the substance being measured, and instrument error.
Here Y = log Nu, $\eta$ = E(Y), and t = log Re. We are assuming that the mean of the errors = zero.
i 1 2 3 4 5 6 7
ti
Yi
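The class works this example in Matlab; as a language-neutral illustration, the following Python sketch sets up the X matrix for the Nu-Re data and solves the normal equations behind Eq. (38) by Gaussian elimination. The helper names are hypothetical, and base-10 logarithms are assumed for "log".

```python
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]      # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):                    # back substitution
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def ols(X, Y):
    """Solve (X^T X) b = X^T Y, which is equivalent to Eq. (38)."""
    p = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    XtY = [sum(row[i] * y for row, y in zip(X, Y)) for i in range(p)]
    return solve(XtX, XtY)

Re = [0.1, 1.0, 10.0, 1e2, 1e3, 1e4, 1e5]
Nu = [0.45, 0.84, 1.83, 5.1, 15.7, 56.5, 245.0]
t = [math.log10(r) for r in Re]
Y = [math.log10(v) for v in Nu]
X = [[1.0, ti, ti ** 2] for ti in t]       # sensitivity matrix, n x p = 7 x 3

b = ols(X, Y)
Y_hat = [sum(bj * xj for bj, xj in zip(b, row)) for row in X]
residuals = [y - yh for y, yh in zip(Y, Y_hat)]
print("b =", [round(bj, 4) for bj in b])
for r, nu, yh in zip(Re, Nu, Y_hat):
    print(f"Re = {r:8.1f}  Nu measured = {nu:7.2f}  Nu predicted = {10 ** yh:7.2f}")
```

Solving the normal equations directly is adequate for a small, well-conditioned problem like this one; for poorly conditioned X, library routines based on QR or SVD factorization are generally preferred.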
Confidence intervals for Linear Regression models
Excel’s “Data” menu, “Data Analysis”, “Regression” gives a large amount of statistical
information:
CIs for parameters in linear models
SSE, MSE, confidence intervals for the parameters (slope and intercept), and the p value for
the parameters. If the p value <= 0.05 for the slope, then the slope is significantly different
from zero. Especially when there is scatter in the data, and a lower R2, you should check the p
value.
Matlab function “regress” also gives these data.
CIs for the dependent variable in linear models
For linear models, neither Excel nor Matlab automatically compute a) the confidence interval
(CI) for the regression line, or b) the prediction interval (PI) for individual data.
The confidence limits (a,b) for the regression line at a predicted point $\hat{Y}_k$:
$$(a, b) = \hat{Y}_k \pm t_{\alpha/2,\,n-p}\; s\,\sqrt{\frac{1}{n} + \frac{(x_k - \bar{x})^2}{\sum_i (x_i - \bar{x})^2}} \tag{39}$$
where s is the standard error:
$$s = \sqrt{MSE}, \qquad MSE = \frac{SS}{n - p} = \text{mean square error}, \qquad SS = \sum_i (Y_i - \hat{Y}_i)^2 = \text{sum of squares of errors} \tag{40}$$
The prediction limits (c,d) for a point $\hat{Y}_k$:
$$(c, d) = \hat{Y}_k \pm t_{\alpha/2,\,n-p}\; s\,\sqrt{1 + \frac{1}{n} + \frac{(x_k - \bar{x})^2}{\sum_i (x_i - \bar{x})^2}} \tag{41}$$
The prediction limits are larger than the confidence limits due only to the “1” inside the
square root.
The sum of squares in the denominator of Eq. (39) and Eq. (41) is more easily calculated using this
formula:
$$\sum_i (x_i - \bar{x})^2 = \sum_i x_i^2 - \left(\sum_i x_i\right)^2 / n \tag{42}$$
The value of the Student t distribution can be confusing due to notation. Assume there are 10
data. The degrees of freedom = n-p = 10-2 = 8. t0.025,8 = 2.306. Excel assumes a two-tailed
distribution, so use “=TINV(0.05,8).” In Matlab, use tinv(0.975,8).
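Eq. (39) through Eq. (42) can be combined into one short Python function for a straight-line fit. The ten data points below are hypothetical, the t value 2.306 is the one quoted above for 8 degrees of freedom, and the function name is illustrative.

```python
import math

def regression_intervals(x, Y, x_k, t_value, p=2):
    """Return (Y_k, CI half-width, PI half-width, s) for a straight-line fit at x_k."""
    n = len(x)
    x_bar = sum(x) / n
    Y_bar = sum(Y) / n
    Sxx = sum(xi ** 2 for xi in x) - sum(x) ** 2 / n                  # Eq. (42)
    Sxy = sum(xi * yi for xi, yi in zip(x, Y)) - sum(x) * sum(Y) / n
    slope = Sxy / Sxx
    intercept = Y_bar - slope * x_bar
    SS = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, Y))
    s = math.sqrt(SS / (n - p))                                       # Eq. (40)
    Y_k = intercept + slope * x_k
    core = 1.0 / n + (x_k - x_bar) ** 2 / Sxx
    half_ci = t_value * s * math.sqrt(core)                           # Eq. (39)
    half_pi = t_value * s * math.sqrt(1.0 + core)                     # Eq. (41)
    return Y_k, half_ci, half_pi, s

# Hypothetical data: n = 10 points, so n - p = 8 degrees of freedom
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
Y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1, 18.0, 19.9]
t_value = 2.306   # t(0.025, 8), as quoted in the text

for x_k in (1.0, 5.5, 10.0):
    Y_k, half_ci, half_pi, s = regression_intervals(x, Y, x_k, t_value)
    print(f"x = {x_k:4.1f}  Y_hat = {Y_k:6.3f}  CI = +/-{half_ci:.3f}  PI = +/-{half_pi:.3f}")
```

The CI half-width is smallest at the mean of x and grows toward the extremes, while the PI half-width is always larger because of the extra 1 under the square root.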
We are 95% confident that the true regression line falls within the CI. The curved CI does not
mean that some regression lines are curved. The CI is curved like an hourglass because it is
the boundary of all possible straight lines. Because the slope and intercept both have
confidence intervals, like a fulcrum there will be more uncertainty at the extremes than at the
center. Many points, far more than 5% of all data, may lie outside the CI. If you collect many
data, you may see 50% or more of the points lying outside the CI.
See Motulsky and Christopoulos, p. 51-57.
The figure below shows four possible regression lines (solid) that fall within the 95% CI (dashed).
If you collect many points, the PI can be much larger than the CI. The 95% PI is the region where
95% of the data are expected to lie. The figure below shows how much larger the PI (solid lines)
is. If you run Monte Carlo simulations, approximately 5% of the simulated data will fall outside
the PIs.