What Could We Do better? Alternative Statistical Methods Jim Crooks and Xingye Qiao.

Date post: 27-Mar-2015
What Could We Do Better? Alternative Statistical Methods

Jim Crooks and Xingye Qiao

Review of the Statistical Model

• We take measurements of the beam displacement, y, at times $t_1, \dots, t_n$

• What we actually observe is ỹ, which is a noisy version of y:

$$\tilde{y}(t_j) = y(t_j) + \varepsilon(t_j)$$

– or, in shorthand,

$$\tilde{y}_j = y_j + \varepsilon_j$$

• $\varepsilon(t_j)$ is the error resulting from imperfect measurement at time $t_j$

A Statistical Model For Displacement

• Under the spring model, the displacement over time is assumed to be a function of time t and the model parameters C and K:

$$y(t; C, K)$$

So we could write:

$$\tilde{y}_j = y(t_j; C, K) + \varepsilon_j$$

A Statistical Model for Displacement

• Remember the assumptions:
– Data at different time points are independent
– Residuals are normally distributed
– Residuals’ variance is constant over time

• With these assumptions the model can be written:

$$\tilde{y}_j \sim N\big(y(t_j; C, K),\ \sigma^2\big)$$
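As a rough illustration of what this model says about the data, the sketch below simulates observations by adding independent normal noise of constant variance to a displacement curve. The curve `true_displacement` is a hypothetical stand-in, since the spring model formula is not given in this transcript, and the noise level and time grid are made-up values.

```python
import math
import random

def true_displacement(t):
    # Hypothetical stand-in for y(t; C, K); not the spring model from the slides.
    return math.exp(-0.25 * t) * math.cos(2.0 * t)

def simulate_observations(times, sigma, seed=0):
    """Draw independent observations ~ N(y(t_j), sigma^2), one per time point."""
    rng = random.Random(seed)
    return [rng.gauss(true_displacement(t), sigma) for t in times]

times = [0.1 * j for j in range(50)]
noisy = simulate_observations(times, sigma=0.05)
```

Each simulated value differs from the curve only by its own noise term, which is exactly the independence and constant-variance assumption listed above.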

What if we have Replicates?

• We may have repeated measurements of the same beam

• Notation: let $t_{ij}$ denote the jth time point in the ith replicate, so that i indexes the repeats at $t_j$

• Denote the ith repeated measurement of the beam displacement at $t_j$ by $\tilde{y}_i(t_j) = \tilde{y}_{ij}$.

• If we believe C and K are the same across replicates then we may write the model as:

$$\tilde{y}_{ij} \sim N\big(y(t_{ij}; C, K),\ \sigma^2\big), \quad \text{independent over time and replicate}$$

The likelihood

• Because of our independence assumption, the likelihood of the model (which we think of as a function of the parameters C and K, not the data) is the product of the individual density functions evaluated at the data ỹ:

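$$L(C, K, \sigma^2; \tilde{y}) = \prod_{i=1}^{\text{reps}} \prod_{j=1}^{\text{times}} N\big(\tilde{y}_{ij};\ y(t_{ij}; C, K),\ \sigma^2\big)$$

where $N(x; \mu, \sigma^2)$ denotes the normal density with mean $\mu$ and variance $\sigma^2$ evaluated at x.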
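A minimal Python sketch of this likelihood, working on the log scale so that the product becomes a sum (a long product of small densities can underflow). The function names and the small data set are hypothetical; `predicted[i][j]` stands for the model value $y(t_{ij}; C, K)$ already evaluated at some chosen C and K.

```python
import math

def normal_log_density(x, mean, variance):
    """Log of the normal density N(x; mean, variance)."""
    return (-0.5 * math.log(2.0 * math.pi * variance)
            - (x - mean) ** 2 / (2.0 * variance))

def log_likelihood(observed, predicted, variance):
    """Sum of log densities over replicates i and time points j."""
    total = 0.0
    for obs_row, pred_row in zip(observed, predicted):
        for y_obs, y_model in zip(obs_row, pred_row):
            total += normal_log_density(y_obs, y_model, variance)
    return total

# Hypothetical data: 2 replicates, 3 time points each.
observed = [[1.02, 0.48, -0.11], [0.97, 0.55, -0.05]]
predicted = [[1.00, 0.50, -0.10], [1.00, 0.50, -0.10]]
print(log_likelihood(observed, predicted, variance=0.01))
```

Candidate values of C and K whose predictions sit closer to the observations give a larger log-likelihood.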

Maximum Likelihood Estimates

• The Maximum Likelihood Estimates (MLEs) for C and K, denoted $\hat{C}$ and $\hat{K}$, are the values of C and K that maximize the likelihood function

• Given σ² known, the MLEs are the same as what you’d get with a Least-Squares procedure (the former tends to justify the use of the latter)
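A short derivation of why the two coincide, under the independence, normality, and constant-variance assumptions. Here n is shorthand, introduced only for this sketch, for the total number of observations over all replicates and time points.

```latex
\log L(C, K, \sigma^2; \tilde{y})
  = -\frac{n}{2} \log\!\left(2\pi\sigma^2\right)
    - \frac{1}{2\sigma^2} \sum_{i} \sum_{j}
      \left(\tilde{y}_{ij} - y(t_{ij}; C, K)\right)^2
```

With σ² held fixed, the first term does not depend on C or K, and the second is a negative multiple of the sum of squared errors. Maximizing the likelihood over C and K is therefore the same as minimizing the sum of squared errors.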

How Good is the Model?

• We can assess the goodness of fit by using the spring model to “predict” the observed measurements

• The “predicted” (AKA “fitted”) values, $\hat{y}(t)$, are obtained by evaluating the spring model with $\hat{C}$ and $\hat{K}$:

$$\hat{y}(t) = y(t; \hat{C}, \hat{K})$$

• We can compare the fitted values at the observed times, $\hat{y}(t_{ij}) = \hat{y}_{ij}$, to the observed values $\tilde{y}_{ij}$.

• Run the MATLAB file ‘inv_beam.m’ by typing:

> inv_beam
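For readers without MATLAB, here is a rough Python sketch of the same idea: estimate C and K by least squares, then evaluate the model at the estimates to get fitted values. This is not the contents of inv_beam.m, which are not shown in this transcript. The model form is an assumption (the underdamped solution of y'' + C y' + K y = 0 with y(0) = y0 and y'(0) = 0), and a plain grid search stands in for whatever optimizer the workshop code uses.

```python
import math

def spring_displacement(t, C, K, y0=1.0):
    # Assumed model form (not from the slides): underdamped solution of
    # y'' + C*y' + K*y = 0 with y(0) = y0, y'(0) = 0. Needs K > C**2 / 4.
    omega = math.sqrt(K - C * C / 4.0)
    return y0 * math.exp(-C * t / 2.0) * (
        math.cos(omega * t) + C / (2.0 * omega) * math.sin(omega * t))

def sum_squared_error(times, observed, C, K):
    return sum((y - spring_displacement(t, C, K)) ** 2
               for t, y in zip(times, observed))

def fit_by_grid_search(times, observed, C_grid, K_grid):
    """Return (C_hat, K_hat) minimizing the sum of squared errors on a grid."""
    candidates = [(C, K) for C in C_grid for K in K_grid if K > C * C / 4.0]
    return min(candidates,
               key=lambda ck: sum_squared_error(times, observed, *ck))

times = [0.1 * j for j in range(60)]
# Hypothetical noise-free "data" generated with C = 0.4 and K = 9.0.
observed = [spring_displacement(t, 0.4, 9.0) for t in times]

C_grid = [0.1 * k for k in range(1, 11)]   # 0.1, 0.2, ..., 1.0
K_grid = [1.0 * k for k in range(5, 14)]   # 5.0, 6.0, ..., 13.0
C_hat, K_hat = fit_by_grid_search(times, observed, C_grid, K_grid)
fitted = [spring_displacement(t, C_hat, K_hat) for t in times]
```

With real, noisy measurements the fitted values will not match the data exactly; the gap between them is what the goodness-of-fit checks look at.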

Spring Model

[Figures: three slides of plots titled “Spring Model”; the images are not reproduced in this transcript.]

Model Residuals

• We need to know the difference between beam displacement data and our model’s predictions for the beam displacement:

$$e_{ij} = \tilde{y}_{ij} - \hat{y}_{ij}$$

• These are called the model residuals

• The residuals are our best guess for the values of $\varepsilon_{ij}$. Hence from our current model we would expect the $e_{ij}$ to look independent and normally distributed with constant variance

Spring Model Residuals

• Are the residuals normally distributed?

• Are they independent (i.e., is there correlation in time)?

• Is their variance constant?

• Run the MATLAB file plotresidual.m by typing:

> plotresidual
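The independence and constant-variance questions can also be checked with simple numerical summaries. The sketch below is not plotresidual.m; it is a hypothetical Python helper that computes the lag-1 autocorrelation of the residuals and the ratio of residual variances between the second and first halves of the record. Normality is better judged graphically, so it is not covered here.

```python
def lag1_autocorrelation(residuals):
    """Sample lag-1 autocorrelation; values near 0 are consistent
    with residuals that are uncorrelated in time."""
    n = len(residuals)
    mean = sum(residuals) / n
    denom = sum((e - mean) ** 2 for e in residuals)
    numer = sum((residuals[k] - mean) * (residuals[k + 1] - mean)
                for k in range(n - 1))
    return numer / denom

def variance_ratio(residuals):
    """Sample variance of the second half divided by that of the first half;
    values far from 1 hint at non-constant variance."""
    half = len(residuals) // 2

    def sample_variance(xs):
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

    return sample_variance(residuals[half:]) / sample_variance(residuals[:half])
```

A strongly positive autocorrelation means neighbouring residuals tend to share a sign, which suggests the model is missing some systematic structure in the data.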

[Figures: three slides titled “Spring Model Residuals” and one titled “Spring Model QQ-plot”; the images are not reproduced in this transcript.]

Coefficient of Determination R²

• One criterion to use when judging a model is the fraction of the variability in the data it can explain:

$$R^2 = 1 - \frac{SSE}{SSTot} = 1 - \frac{\sum_j (\tilde{y}_j - \hat{y}_j)^2}{\sum_j (\tilde{y}_j - \bar{y})^2}$$

where $\bar{y}$ is the mean of the observed values

• SSTot is the total variability in the data ỹ
• SSE is the variability left over after fitting the model (SSE ≤ SSTot)
• So R² represents the fraction of variability in the data that is explained by the model
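A minimal Python sketch of this calculation, with hypothetical function and variable names and made-up numbers:

```python
def r_squared(observed, fitted):
    """Coefficient of determination: 1 - SSE / SSTot."""
    mean_obs = sum(observed) / len(observed)
    ss_tot = sum((y - mean_obs) ** 2 for y in observed)
    sse = sum((y - f) ** 2 for y, f in zip(observed, fitted))
    return 1.0 - sse / ss_tot

# Made-up example values, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
fitted = [1.1, 1.9, 3.2, 3.8]
print(r_squared(observed, fitted))
```

A perfect fit gives R² = 1, and a model that only ever predicts the mean of the data gives R² = 0.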

Coefficient of Determination R²

• In the example shown above we can find that:

$$R^2 = 1 - \frac{5.9506 \times 10^{-7}}{1.2557 \times 10^{-6}} = 0.5261$$

• This means the spring model accounts for about 52% of the variability in our displacement measurements

• Is 52% a lot?

Coefficient of Determination R²

• Brief aside: note that we can also get an estimate of σ² from SSE:

$$\hat{\sigma}^2 = \frac{SSE}{n - df}$$

where n is the number of data points and df is the number of ‘degrees of freedom’ (AKA the number of unknown parameters)
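In code this is one line beyond computing SSE. The sketch below uses hypothetical names and made-up numbers, and takes df = 2 on the assumption that C and K are the only unknown parameters.

```python
def estimate_noise_variance(observed, fitted, n_params):
    """Estimate sigma^2 as SSE / (n - df), with df = number of fitted parameters."""
    sse = sum((y - f) ** 2 for y, f in zip(observed, fitted))
    return sse / (len(observed) - n_params)

# Made-up example values; n_params=2 assumes the unknowns are C and K.
observed = [1.0, 2.0, 3.0, 4.0]
fitted = [1.1, 1.9, 3.2, 3.8]
sigma2_hat = estimate_noise_variance(observed, fitted, n_params=2)
```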

How Good is this Model?

• Is R² = 52% any good? It depends.

• It can be useful (or even necessary) to set up a naïve “straw man” alternative against which to compare a physical model

• There are many possible alternatives and choosing between them is subjective

• To illustrate we will use a smoothing spline alternative

Smoothing Splines

• A cubic spline is a function that is a piecewise cubic polynomial:
– Between each sequential pair of time points the function is a cubic polynomial
– At each time point the function is continuous and has continuous first and second derivatives
– The time points are called “knots”

Smoothing Splines

• A smoothing spline is a type of cubic spline where:
– The time points are specifically those at which measurements are made, $t_j$
– Given $(y_j, t_j)$ for all j, $\hat{f}(t)$ is determined to be that cubic spline $f$ that minimizes

$$\sum_{j=1}^{\text{Times}} \big(y_j - f(t_j)\big)^2 + \alpha \int_a^b \left(\frac{d^2 f(t)}{dt^2}\right)^2 dt$$

– Here α is called the smoothing parameter

• What happens to the smoothness as α→∞?

Smoothing Splines

• The value of α parameterizes the relative importance of smoothness to fit
– Larger values of α result in a bigger penalty for curvature and hence in a smoother fit that may not fit the data well
– Smaller values of α result in a wigglier spline that more closely follows the data

$$\lim_{\alpha \to \infty} \hat{f} = \text{Straight Line}, \qquad \lim_{\alpha \to 0} \hat{f} = \text{Exact Interpolator}$$
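The two limits can be seen numerically with a simplified stand-in for the smoothing spline. The sketch below is not a cubic smoothing spline: it is a discrete analogue (often called a Whittaker smoother) that assumes equally spaced time points and replaces the integral of the squared second derivative with a sum of squared second differences. All names and data values are hypothetical.

```python
def second_difference_penalty(n):
    """Return D^T D, where D takes second differences of a length-n vector."""
    P = [[0.0] * n for _ in range(n)]
    for k in range(n - 2):
        stencil = [(k, 1.0), (k + 1, -2.0), (k + 2, 1.0)]
        for r, a in stencil:
            for c, b in stencil:
                P[r][c] += a * b
    return P

def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def smooth(y, alpha):
    """Minimize sum (y_j - f_j)^2 + alpha * sum (second differences of f)^2
    by solving (I + alpha * D^T D) f = y."""
    n = len(y)
    P = second_difference_penalty(n)
    A = [[alpha * P[r][c] + (1.0 if r == c else 0.0) for c in range(n)]
         for r in range(n)]
    return solve_linear_system(A, y)

y = [0.0, 1.2, 1.9, 3.3, 3.8, 5.1]   # made-up measurements
interpolating = smooth(y, 0.0)        # reproduces the data
nearly_linear = smooth(y, 1e8)        # second differences shrink toward zero
```

With α = 0 the penalty vanishes and the fit passes through every point; with a very large α any curvature is so costly that the fit is pushed toward a straight line.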

Smoothing Splines

• But how do we choose the value of α?
• Another choice without an objectively correct answer!
• One useful answer is the value that minimizes the “leave-one-out” predictive error:
– Fit the spline to all the displacement data except one point
– Use the spline to predict the displacement at this time point
– Repeat over all displacement points and sum the squared residual errors

• This is called “leave-one-out” cross-validation
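The procedure above can be sketched in Python as follows. To keep the example short it does not use a true cubic smoothing spline: it uses a discrete analogue on equally spaced time points (a weighted Whittaker smoother), where leaving a point out is done by giving it zero weight. This is not splineplot.m; all names, the data, and the candidate α values are hypothetical, and α must be positive here.

```python
def second_difference_penalty(n):
    """Return D^T D, where D takes second differences of a length-n vector."""
    P = [[0.0] * n for _ in range(n)]
    for k in range(n - 2):
        stencil = [(k, 1.0), (k + 1, -2.0), (k + 2, 1.0)]
        for r, a in stencil:
            for c, b in stencil:
                P[r][c] += a * b
    return P

def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def smooth(y, alpha, weights):
    """Solve (W + alpha * D^T D) f = W y; a zero weight leaves that point out."""
    n = len(y)
    P = second_difference_penalty(n)
    A = [[alpha * P[r][c] + (weights[r] if r == c else 0.0) for c in range(n)]
         for r in range(n)]
    return solve_linear_system(A, [weights[r] * y[r] for r in range(n)])

def loo_cv_error(y, alpha):
    """Sum of squared errors when each point is predicted from all the others."""
    total = 0.0
    for j in range(len(y)):
        weights = [1.0] * len(y)
        weights[j] = 0.0                      # leave point j out of the fit
        prediction = smooth(y, alpha, weights)[j]
        total += (y[j] - prediction) ** 2
    return total

def choose_alpha(y, candidates):
    """Pick the candidate alpha with the smallest leave-one-out error."""
    return min(candidates, key=lambda a: loo_cv_error(y, a))

y = [0.1, 0.9, 1.4, 1.2, 0.6, -0.2, -0.7, -0.5]   # made-up measurements
best_alpha = choose_alpha(y, [0.01, 0.1, 1.0, 10.0, 100.0])
```

The search is only over the listed candidates, so the result is the best value on that grid rather than a true minimizer; a finer grid can be used once the rough scale of α is known.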

Fitted Spline

[Figures: three slides of plots titled “Fitted Spline”; the images are not reproduced in this transcript.]

Fitted Spline Residuals

• Are the residuals normally distributed?

• Are they independent (i.e., is there correlation in time)?

• Is their variance constant?

• You can make your own cross-validated spline using the MATLAB file splineplot.m

• Don’t do it now!!!

[Figures: three slides titled “Spline Residuals” and one titled “Spline QQ-plot”; the images are not reproduced in this transcript.]

Compare the Spring Model to Splines

• If you compare residuals, those for the spline are generally smaller (i.e., it fits the data better)

• The spline’s coefficient of determination is

$$R^2 = 1 - \frac{1.5422 \times 10^{-7}}{1.2557 \times 10^{-6}} = 0.8772$$

• Our spring model explains less of the variation than does a naive spline (52% < 88%)
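The two R² figures quoted in this deck can be checked with a few lines of arithmetic. The sums of squares below are the values from the slides; the negative signs on the exponents are an assumption, since they are lost in this transcript but are required for the quoted results to come out.

```python
def r_squared_from_sums(sse, ss_tot):
    """R^2 = 1 - SSE / SSTot."""
    return 1.0 - sse / ss_tot

SS_TOT = 1.2557e-6       # total variability in the data
SSE_SPRING = 5.9506e-7   # left over after fitting the spring model
SSE_SPLINE = 1.5422e-7   # left over after fitting the smoothing spline

r2_spring = r_squared_from_sums(SSE_SPRING, SS_TOT)
r2_spline = r_squared_from_sums(SSE_SPLINE, SS_TOT)
print(round(r2_spring, 4), round(r2_spline, 4))   # 0.5261 0.8772
```

Both models are scored against the same SSTot, so the comparison comes down to how much unexplained variability (SSE) each one leaves behind.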

Compare the Spring Model to Splines

• Is the difference big enough to reject the use of the spring model?

• Again, this is subjective, but we can use statistical tests to answer the question as objectively as possible

• Such tests are beyond the scope of this workshop, but if you are interested in supercharging your group project using them please ask me

Good Luck!!!

