2DS00 Statistics 1 for Chemical Engineering, lecture 5

Post on 20-Dec-2015

Week schedule

Week 1: Measurement and statistics

Week 2: Error propagation

Week 3: Simple linear regression analysis

Week 4: Multiple linear regression analysis

Week 5: Non-linear regression analysis

Detailed contents of week 5

• intrinsically linear models

• well-known non-linear models

• non-linear regression

– model choice

– start values

– Marquardt and Gauss-Newton algorithm

– confidence intervals

– hypothesis testing

– residual plots

– overfitting

Approaches to non-linear models

1. transformation to linear model

2. approximation of non-linear model by linear model (linearization through Taylor approximation)

3. non-linear regression analysis (numerically find parameters for which sum of squares is minimal)

Remark: although approach 2 is often applied in the chemical literature, we strongly recommend against it, because there is no guarantee that it yields accurate results.

Intrinsically linear models

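Some non-linear models may be transformed into linear models, for example

$y = \beta_0 e^{\beta_1 x}$

Taking logarithms gives a model that is linear in the parameters:

$\ln(y) = \ln(\beta_0) + \beta_1 x$, i.e. $y' = b_0 + b_1 x$ with $y' = \ln(y)$, $b_0 = \ln(\beta_0)$, $b_1 = \beta_1$

The transformed model must fulfil the usual assumptions! With a multiplicative error term, the transformed model has an additive error term:

$y = \beta_0 e^{\beta_1 x} \cdot e^{\varepsilon} \;\Rightarrow\; \ln(y) = \ln(\beta_0) + \beta_1 x + \varepsilon$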
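The transformation can be tried out with a few lines of Python. This is a minimal sketch, assuming the model $y = \beta_0 e^{\beta_1 x}$; the function names and the noise-free data are hypothetical and only serve to show that regressing $\ln(y)$ on $x$ recovers the parameters.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for y = b0 + b1 * x; returns (b0, b1)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = y_mean - b1 * x_mean
    return b0, b1

def fit_exponential_via_logs(xs, ys):
    """Fit y = beta0 * exp(beta1 * x) by regressing ln(y) on x."""
    b0, b1 = fit_line(xs, [math.log(y) for y in ys])
    # Back-transform the intercept: b0 = ln(beta0).
    return math.exp(b0), b1

# Hypothetical noise-free data generated with beta0 = 2.0 and beta1 = 0.5.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * math.exp(0.5 * x) for x in xs]
beta0_hat, beta1_hat = fit_exponential_via_logs(xs, ys)
print(beta0_hat, beta1_hat)
```

With real measurements the result depends on whether the errors are closer to multiplicative or additive, so the residuals of the transformed model should still be inspected.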

Examples of non-linear models

exponential growth model

Mitscherlich model

inverse polynomial model

logistic growth model

Gompertz growth model

Von Bertalanffy model

Michaelis-Menten model

Exponential growth model

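$\frac{dY}{dt} = \beta_1 Y \;\Rightarrow\; Y(t) = \beta_0 e^{\beta_1 t}$

non-linear, additive error term: $Y_i = \beta_0 e^{\beta_1 t_i} + \varepsilon_i$

intrinsically linear, multiplicative error term: $Y_i = \beta_0 e^{\beta_1 t_i + \varepsilon_i}$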

Mitscherlich model

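$\frac{dY}{dx} = \beta_1 (\beta_0 - Y)$

$Y_i = \beta_0 \left(1 - e^{-\beta_1 x_i}\right) + \varepsilon_i$

• $\beta_0$: horizontal asymptote

• $\beta_1$: determines speed of growth

• special case (under a restriction on the parameters): monomolecular model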

Inverse polynomial model

$\frac{dY}{dx} = \frac{(1 - \beta_1 Y)^2}{\beta_0}$

$Y_i = \frac{x_i}{\beta_0 + \beta_1 x_i} + \varepsilon_i$

Slow convergence to the asymptote $1/\beta_1$.

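Logistic (autocatalytic) growth model

$\frac{dY}{dt} = \beta_1 Y \, \frac{\beta_0 - Y}{\beta_0}$

$Y_i = \frac{\beta_0}{1 + \beta_2 e^{-\beta_1 t_i}} + \varepsilon_i$

$Y(0) = \frac{\beta_0}{1 + \beta_2}$

$\beta_0$ is the horizontal asymptote.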
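The interpretation of the logistic parameters can be checked numerically. The sketch below assumes the curve $Y(t) = \beta_0 / (1 + \beta_2 e^{-\beta_1 t})$; the parameter values are hypothetical. It evaluates the starting value, the value far to the right, and compares a numerical slope with the right-hand side of the differential equation $\beta_1 Y (\beta_0 - Y)/\beta_0$.

```python
import math

def logistic(t, beta0, beta1, beta2):
    """Logistic growth curve beta0 / (1 + beta2 * exp(-beta1 * t))."""
    return beta0 / (1.0 + beta2 * math.exp(-beta1 * t))

# Hypothetical parameter values, chosen only for illustration.
beta0, beta1, beta2 = 10.0, 0.8, 4.0

y_start = logistic(0.0, beta0, beta1, beta2)    # should equal beta0 / (1 + beta2)
y_late = logistic(50.0, beta0, beta1, beta2)    # should be close to beta0

# Central difference approximation of dY/dt at t = 1, compared with the
# right-hand side of the differential equation.
h = 1e-5
t = 1.0
slope_numeric = (logistic(t + h, beta0, beta1, beta2)
                 - logistic(t - h, beta0, beta1, beta2)) / (2 * h)
y = logistic(t, beta0, beta1, beta2)
slope_model = beta1 * y * (beta0 - y) / beta0
print(y_start, y_late, slope_numeric, slope_model)
```

Reading $\beta_0$ off the plateau of the data and $\beta_2$ off the first observation is one way to obtain start values for this model.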

Gompertz growth model

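$\frac{dY}{dt} = \beta_1 Y \ln\!\left(\frac{\beta_0}{Y}\right)$

$Y_i = \beta_0 \exp\!\left(-\beta_2 e^{-\beta_1 t_i}\right) + \varepsilon_i$

The logarithm of the Gompertz curve is a monomolecular curve.

• $\beta_0$: horizontal asymptote

• $\beta_1$: determines growth speed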
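The remark that the logarithm of the Gompertz curve is a monomolecular curve can be checked in one line. The sketch assumes the Gompertz curve is written as $Y = \beta_0 \exp(-\beta_2 e^{-\beta_1 t})$ and that a monomolecular curve has the form $A - B e^{-\beta_1 t}$, an exponential approach to the horizontal asymptote $A$; the symbols $A$ and $B$ are introduced here only for illustration.

```latex
\begin{align*}
\ln Y &= \ln\!\left(\beta_0 \exp\!\left(-\beta_2 e^{-\beta_1 t}\right)\right) \\
      &= \ln(\beta_0) - \beta_2 e^{-\beta_1 t} \\
      &= A - B e^{-\beta_1 t}, \qquad A = \ln(\beta_0),\; B = \beta_2
\end{align*}
```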

Von Bertalanffy model

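$Y_i = \left(\beta_0^{1-m} - \theta e^{-\beta_1 t_i}\right)^{1/(1-m)} + \varepsilon_i$

Special cases of this general model are:

• $m = 0$ (with a matching choice of $\theta$): monomolecular model

• $m = 2$ and $\theta = -\beta_2/\beta_0$: logistic model

• $m \to 1$: Gompertz model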

Michaelis-Menten model

$Y_i = \alpha_1 \left(1 - e^{-\beta_1 t_i}\right) + \alpha_2 \left(1 - e^{-\beta_2 t_i}\right) + \varepsilon_i$

This model is often used to describe diffusion kinetics.

Watch out for overfitting in models with many parameters.

Marquardt algorithm

Non-linear regression requires a numerical search for the parameter values that minimise the error sum of squares.

Most important algorithms:

1. Gauss-Newton algorithm (uses 1st-order approximation; may overshoot minimum)

2. steepest descent algorithm (searches for direction with largest downhill slope; may be slow)

3. Marquardt algorithm (switches between the two algorithms above, depending on the situation)

Gauss algorithm

Marquardt algorithm

The choice between both methods is determined by the Marquardt parameter $\lambda$:

• $\lambda \to 0$: algorithm approaches Gauss-Newton

• $\lambda \to \infty$: algorithm approaches steepest descent
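As a sketch of how the Marquardt parameter interpolates between the two methods, the parameter updates can be written as follows. The notation is an assumption and is not taken from the slides: $b$ is the current parameter vector, $r = y - f(b)$ the vector of residuals, $J$ the matrix of derivatives of the model values with respect to the parameters, and $\delta$ the step that is added to $b$.

```latex
\begin{align*}
\text{Gauss-Newton:} \quad & \delta = (J^{T} J)^{-1} J^{T} r \\
\text{steepest descent:} \quad & \delta = \alpha \, J^{T} r, \qquad \alpha > 0 \\
\text{Marquardt:} \quad & \delta = (J^{T} J + \lambda I)^{-1} J^{T} r
\end{align*}
```

For $\lambda = 0$ the Marquardt step equals the Gauss-Newton step; for large $\lambda$ it approaches $\lambda^{-1} J^{T} r$, a short step in the steepest descent direction.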

The Marquardt algorithm is (deservedly) the most used method in practice.
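A minimal sketch of such a search in Python, for the two-parameter exponential model $y = b_0 e^{b_1 t}$. The function names, the factor 10 used to adjust the Marquardt parameter, the stopping rule and the data are illustrative assumptions; statistical packages use more refined versions.

```python
import math

def model(t, b0, b1):
    """Exponential model y = b0 * exp(b1 * t)."""
    return b0 * math.exp(b1 * t)

def sse(ts, ys, b0, b1):
    """Error sum of squares for the given parameter values."""
    return sum((y - model(t, b0, b1)) ** 2 for t, y in zip(ts, ys))

def marquardt_fit(ts, ys, b0, b1, lam=1e-3, max_iter=500, tol=1e-10):
    """Search for (b0, b1) minimising the error sum of squares."""
    current = sse(ts, ys, b0, b1)
    for _ in range(max_iter):
        # Build J'J (2x2, symmetric) and J'r from the model derivatives.
        a00 = a01 = a11 = g0 = g1 = 0.0
        for t, y in zip(ts, ys):
            e = math.exp(b1 * t)
            j0 = e               # d(model)/d(b0)
            j1 = b0 * t * e      # d(model)/d(b1)
            r = y - b0 * e       # residual
            a00 += j0 * j0
            a01 += j0 * j1
            a11 += j1 * j1
            g0 += j0 * r
            g1 += j1 * r
        # Solve (J'J + lam * I) * step = J'r with Cramer's rule.
        m00 = a00 + lam
        m11 = a11 + lam
        det = m00 * m11 - a01 * a01
        d0 = (g0 * m11 - a01 * g1) / det
        d1 = (m00 * g1 - a01 * g0) / det
        if abs(d0) + abs(d1) < tol:
            break                # step is negligible: stop searching
        trial = sse(ts, ys, b0 + d0, b1 + d1)
        if trial < current:
            # Improvement: accept the step, behave more like Gauss-Newton.
            b0, b1, current = b0 + d0, b1 + d1, trial
            lam /= 10.0
        else:
            # No improvement: reject, behave more like steepest descent.
            lam *= 10.0
    return b0, b1, current

# Hypothetical noise-free data generated with b0 = 2.0 and b1 = 0.7.
ts = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
ys = [2.0 * math.exp(0.7 * t) for t in ts]
b0_hat, b1_hat, sse_min = marquardt_fit(ts, ys, b0=1.0, b1=0.5)
print(b0_hat, b1_hat, sse_min)
```

From the start values (1.0, 0.5) the first Gauss-Newton-like steps overshoot and are rejected, so the parameter grows until a shorter step lowers the sum of squares; after that it shrinks again and the search speeds up.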

Numerical search for minimum of error sum of squares

[Figure: error sum of squares with a local minimum and the true minimum marked]

Where should we start the numerical search?

Choice of start values

• inspect data and use interpretation of parameters in model

– parameter is related to value of asymptote

– model value at certain setting

• use linear regression to obtain approximations to parameter values

– transform model to linear model

– approximate model by linear model
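Both ideas can be combined in a short Python sketch. It assumes a growth model of the form $Y = \beta_0 (1 - e^{-\beta_1 x})$: the asymptote $\beta_0$ is read off the data, and the rate $\beta_1$ follows from the transformed linear model $\ln(1 - Y/\beta_0) = -\beta_1 x$. The function name, the 5% headroom above the largest observation and the data are hypothetical choices.

```python
import math

def start_values_asymptotic(xs, ys, headroom=1.05):
    """Rough start values for Y = beta0 * (1 - exp(-beta1 * x)).

    beta0: slightly above the largest observed response (the plateau).
    beta1: minus the slope of the regression through the origin of
           ln(1 - y / beta0) on x (the transformed, linear model).
    """
    beta0 = headroom * max(ys)
    zs = [math.log(1.0 - y / beta0) for y in ys]
    beta1 = -sum(x * z for x, z in zip(xs, zs)) / sum(x * x for x in xs)
    return beta0, beta1

# Hypothetical noise-free data generated with beta0 = 5.0 and beta1 = 0.6.
xs = [0.5, 1.0, 2.0, 3.0, 4.0, 6.0]
ys = [5.0 * (1.0 - math.exp(-0.6 * x)) for x in xs]
beta0_start, beta1_start = start_values_asymptotic(xs, ys)
print(beta0_start, beta1_start)
```

The values are only approximations, but they are close enough to serve as a starting point for the numerical search.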

Possible causes for non-convergence

• model does not match data

• badly determined numerical derivatives

• overfitting:

– model has too many parameters

– some model parameters have almost same function

Important issues in non-linear regression analysis

• carefully consider choice of model

• choose starting values that relate to the model at hand

• experiment with different starting values to prevent convergence to a local minimum

• watch out for overfitting
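Trying several starting values can be automated. The sketch below uses a toy one-parameter model, $y = \sin(w x)$, chosen only because its error sum of squares has several local minima; the simple downhill search stands in for a real algorithm such as Marquardt, and all names and data are hypothetical.

```python
import math

def make_sse(xs, ys):
    """Return the error sum of squares as a function of the parameter w."""
    def sse(w):
        return sum((y - math.sin(w * x)) ** 2 for x, y in zip(xs, ys))
    return sse

def local_search(f, w, step=0.1, tol=1e-8):
    """Simple downhill search: move while f improves, else halve the step."""
    best = f(w)
    while step > tol:
        for candidate in (w + step, w - step):
            value = f(candidate)
            if value < best:
                w, best = candidate, value
                break
        else:
            step /= 2.0   # neither neighbour improves: refine the step
    return w, best

def multi_start(f, starts):
    """Run the local search from every start value and keep the best result."""
    results = [local_search(f, w0) for w0 in starts]
    return min(results, key=lambda result: result[1]), results

# Hypothetical noise-free data from the toy model with w = 2.0.
xs = [0.25 * k for k in range(13)]          # 0.0, 0.25, ..., 3.0
ys = [math.sin(2.0 * x) for x in xs]
(best_w, best_sse), all_results = multi_start(make_sse(xs, ys), [0.5, 1.9, 4.0, 7.0])
print(best_w, best_sse)
print(all_results)
```

Start values far from the true parameter may end in a different minimum with a larger sum of squares, which is why the results of all runs are compared before a fit is accepted.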

Fritz and Schluender equation: start values for a and b

$q_1 = \frac{a \, C_1^{\,b + X_1}}{C_1^{X_1} + X_2 C_2^{X_3}}$

For $C_2 = 0$, this reduces to

$q_1 = a \, C_1^{\,b}$

Use the first 10 measurements (i.e., those with $C_2 = 0$) to obtain start values for a and b.
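One way to obtain these start values is a linear regression on logarithms, since $q_1 = a C_1^{b}$ implies $\ln(q_1) = \ln(a) + b \ln(C_1)$. The sketch below is an illustration under that assumption; the function name and the ten data points are hypothetical and do not come from the course data set.

```python
import math

def start_values_a_b(c1, q1):
    """Regress ln(q1) on ln(C1); returns (a, b) for q1 = a * C1**b."""
    u = [math.log(c) for c in c1]
    v = [math.log(q) for q in q1]
    n = len(u)
    u_mean = sum(u) / n
    v_mean = sum(v) / n
    b = (sum((ui - u_mean) * (vi - v_mean) for ui, vi in zip(u, v))
         / sum((ui - u_mean) ** 2 for ui in u))
    # The intercept of the log-log line is ln(a).
    a = math.exp(v_mean - b * u_mean)
    return a, b

# Ten hypothetical measurements with C2 = 0, generated with a = 1.5, b = 0.4.
c1 = [0.1 * k for k in range(1, 11)]
q1 = [1.5 * c ** 0.4 for c in c1]
a_start, b_start = start_values_a_b(c1, q1)
print(a_start, b_start)
```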

Fritz and Schluender equation: other initial values

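$\frac{1}{q_1} = \frac{C_1^{X_1} + X_2 C_2^{X_3}}{a C_1^{\,b + X_1}} = \frac{1}{a C_1^{\,b}} + \frac{X_2 C_2^{X_3}}{a C_1^{\,b + X_1}}$

$y = \frac{1}{q_1} - \frac{1}{a C_1^{\,b}} = \frac{X_2 C_2^{X_3}}{a C_1^{\,b + X_1}}$

$\ln(y) = \ln(X_2) + X_3 \ln(C_2) - \ln(a) - (b + X_1) \ln(C_1)$

$\ln(y) = \ln(X_2 / a) + X_3 \ln(C_2) - (b + X_1) \ln(C_1)$

$\ln(y) = \beta_0 + \beta_1 \ln(C_2) + \beta_2 \ln(C_1)$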
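Once the linear regression of $\ln(y)$ on $\ln(C_2)$ and $\ln(C_1)$ has been carried out, the estimated coefficients can be converted back into start values. The sketch assumes the linear form $\ln(y) = \beta_0 + \beta_1 \ln(C_2) + \beta_2 \ln(C_1)$ with $\beta_0 = \ln(X_2/a)$, $\beta_1 = X_3$ and $\beta_2 = -(b + X_1)$, and that a and b are already available from the measurements with $C_2 = 0$.

```latex
\begin{align*}
X_3 &= \beta_1 \\
X_1 &= -\beta_2 - b \\
X_2 &= a \, e^{\beta_0}
\end{align*}
```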

Examination

• bring your notebook

• Monday October 6, 14.00 – 17.00 in Paviljoen J17 and L10 (not Auditorium)

• clean copy of Statistisch Compendium is allowed

• contents:

– one exercise on error propagation

– three statistical analyses to be performed on your notebook