Practical Tips for Analyzing Materials Science Data Jamie Kruzic School of Mechanical, Industrial, and Manufacturing Engineering Oregon State University Materials Science Seminar, October 25 th , 2013 Institute for Advanced Ceramics, Hamburg University of Technology November 12, 2012
Page 1: Practical Tips for Analyzing Materials Science Datamatsci.oregonstate.edu/files/Kruzic_MatSciSeminar_Fall...Practical Tips for Analyzing Materials Science Data Jamie Kruzic School

Practical Tips for Analyzing Materials Science Data

Jamie Kruzic School of Mechanical, Industrial, and Manufacturing Engineering

Oregon State University

Materials Science Seminar, October 25th, 2013

Institute for Advanced Ceramics, Hamburg University of Technology November 12, 2012


Introduction

•  Publication of results is always easier with rigorous data analysis and statistical tests of results
–  In bio-related fields (e.g., biomaterials, biomechanics, etc.) this is absolutely necessary!

•  In Materials Science we are often tempted to use the “eyeball test” to compare data.
–  Only convincing for very large effects
–  Not useful for overlapping data sets, scattered data, etc.

•  You can have a smoother journal review process by doing a proper analysis before submission.

http://www.ifimages.com/


Outline

•  Today we will discuss these common topics with examples from my research:
–  Comparing two means: Student’s t-test
–  Comparing >2 means: the analysis of variance (ANOVA)
–  Comparing >2 means: post-hoc tests
–  Regression analysis & proving significant trends
–  Calculating uncertainty from experimental linear calibrations

•  OSU has a Windows software license for StatGraphics, which can handle many of your needs.
–  http://oregonstate.edu/helpdocs/software/recommended-software/osuware

•  On my Mac I run it using Citrix Receiver.
–  http://engineering.oregonstate.edu/computing/citrix/steps.html

http://beyeager.blogspot.com/


Statistical Tests

•  This tutorial is not a rigorous lecture on statistical tests. Rather, the aim is to point out useful tips for materials scientists.

•  Generally a statistical test is used to test a null hypothesis that implies no effect.
–  e.g., the means of two groups are not different

•  If the null hypothesis is rejected, then there is a measurable difference between the groups.

•  Testing the null hypothesis is usually done by calculating the p-value
–  p = probability of obtaining a test result at least as extreme as the one observed if the null hypothesis is true.
–  Usually p < 0.05 (i.e., 5%) is considered statistically significant, but the cutoff is somewhat arbitrary!

http://www.xda-developers.com/


Comparing two means: Student’s t-test

•  In Materials Science we often wish to compare two material states, processes, etc.

•  Often compare an experimental treatment to a known control condition:

–  Does heat treating cast alloy for 1hr at 650°C affect yield strength?

–  Does adding 1% Bi affect the piezoelectric constant?

•  Or alternatively, compare two methods or processes:

–  Do sol-gel and pulse laser deposition give the same thin film properties?

–  Do two different fracture toughness test methods produce the same result?

•  If the effect is small, an “eyeball test” will not be effective at convincing journal reviewers

http://www.vias.org/

http://mychinaconnection.com/


Comparing two means: Student’s t-test

•  Student’s t-test is ideal for assessing two different data sets (not more!).

•  Assumes scatter in the data is normally distributed around the mean.
–  Good for random scatter; also assumes the two data sets have equal variance
–  Not good for systematic scatter (e.g., instrument drift with time)

•  Null hypothesis is that the difference between two means = zero
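The pooled-variance calculation behind an unpaired Student's t-test can be sketched in plain Python. This is only an illustration under stated assumptions, not the StatGraphics procedure: the toughness values are invented for the example, the helper names `t_two_sided_p` and `unpaired_t_test` are hypothetical, and the p-value is obtained by numerically integrating the t-distribution density rather than from a statistics package.

```python
import math
from statistics import mean, variance


def t_two_sided_p(t_stat, df, steps=2000):
    """Two-sided p-value for Student's t distribution.

    Integrates the t probability density from 0 to |t_stat| with Simpson's
    rule (steps must be even) and returns the area outside +/- |t_stat|.
    """
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    c = math.exp(log_c) / math.sqrt(df * math.pi)
    h = abs(t_stat) / steps
    total = 0.0
    for i in range(steps + 1):
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * c * (1 + (i * h) ** 2 / df) ** (-(df + 1) / 2)
    return max(0.0, 1.0 - 2.0 * total * h / 3.0)


def unpaired_t_test(a, b):
    """Student's t-test for two independent samples, assuming equal variances."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t_stat = (mean(a) - mean(b)) / math.sqrt(pooled_var * (1 / na + 1 / nb))
    return t_stat, df, t_two_sided_p(t_stat, df)


# Hypothetical fracture toughness values, six tests per method
notch_only = [2.1, 2.4, 2.2, 2.6, 2.3, 2.5]
pre_cracked = [2.0, 2.2, 1.9, 2.4, 2.1, 2.3]

t_stat, df, p_value = unpaired_t_test(notch_only, pre_cracked)
print(f"t = {t_stat:.2f}, df = {df}, p = {p_value:.3f}")
```

If p falls below the chosen cutoff (often 0.05), the null hypothesis of equal means is rejected. Where SciPy is installed, `scipy.stats.ttest_ind` runs the same test.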

http://projectile.sv.cmu.edu/

http://projectile.sv.cmu.edu/


Comparing two means: Student’s t-test

•  Data sets may be paired:
–  Data points in each set are paired
–  E.g., test hardness of each of 10 samples before and after a heat treatment.

•  More commonly in Materials Science data sets are unpaired:
–  Data sets are produced with independent samples
–  E.g., test dielectric constant for 10 samples made with 1% Bismuth added and 10 samples without.
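For the paired case, the test works on the per-sample differences instead of on the two groups separately. The sketch below assumes ten hypothetical hardness readings before and after a heat treatment; the numbers and the helper names `t_two_sided_p` and `paired_t_test` are illustrative only.

```python
import math
from statistics import mean, stdev


def t_two_sided_p(t_stat, df, steps=2000):
    """Two-sided p-value for Student's t, by Simpson integration of the density."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    c = math.exp(log_c) / math.sqrt(df * math.pi)
    h = abs(t_stat) / steps
    total = 0.0
    for i in range(steps + 1):
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * c * (1 + (i * h) ** 2 / df) ** (-(df + 1) / 2)
    return max(0.0, 1.0 - 2.0 * total * h / 3.0)


def paired_t_test(before, after):
    """Paired t-test: a one-sample t-test on the per-sample differences."""
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    t_stat = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t_stat, n - 1, t_two_sided_p(t_stat, n - 1)


# Hypothetical hardness of the same 10 samples before and after heat treatment
hardness_before = [200, 205, 198, 210, 202, 207, 199, 204, 201, 206]
hardness_after = [204, 208, 203, 212, 207, 209, 204, 207, 205, 210]

t_paired, df_paired, p_paired = paired_t_test(hardness_before, hardness_after)
print(f"t = {t_paired:.2f}, df = {df_paired}, p = {p_paired:.2g}")
```

Pairing removes the sample-to-sample scatter from the comparison, which is why it is only valid when each "before" value truly belongs to the same specimen as its "after" value.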

http://blogs.jcvi.org/

http://apileofsheep.wordpress.com/


Comparing two fracture toughness test methods

•  N = 6 tests with only a notch
•  N = 6 tests with a notch and pre-crack
•  Data sets overlap considerably; are the means different?

Kruzic, Kuskowski, Ritchie, J. Biomed Mater A (2005)


Comparing two fracture toughness test methods

Example is perfect for unpaired t-test

Kruzic, Kuskowski, Ritchie, J. Biomed Mater A (2005)


Comparing two fracture toughness test methods

•  p = 0.13 and we cannot reject the null hypothesis
•  It is still possible there is a small difference between the means, but our data cannot prove it.

Kruzic, Kuskowski, Ritchie, J. Biomed Mater A (2005)


Effect of Sample Size

•  Using small sample size (low N) makes it more difficult to statistically prove differences

•  Example at right shows how more samples make the distribution tails overlap less

•  If you want to prove a small difference, test lots of samples!
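The effect of N can be made concrete with a short calculation. The sketch below assumes two groups of equal size n with the same standard deviation, and a fixed, hypothetical difference in means; only n changes. The function names are illustrative.

```python
import math


def t_two_sided_p(t_stat, df, steps=2000):
    """Two-sided p-value for Student's t, by Simpson integration of the density."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    c = math.exp(log_c) / math.sqrt(df * math.pi)
    h = abs(t_stat) / steps
    total = 0.0
    for i in range(steps + 1):
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * c * (1 + (i * h) ** 2 / df) ** (-(df + 1) / 2)
    return max(0.0, 1.0 - 2.0 * total * h / 3.0)


def p_for_equal_groups(mean_diff, sd, n):
    """p-value of an unpaired t-test when both groups have n samples and the same SD."""
    t_stat = mean_diff / (sd * math.sqrt(2 / n))
    return t_two_sided_p(t_stat, 2 * n - 2)


# Same hypothetical difference (0.2) and scatter (SD = 0.19), different sample sizes
p_values = {n: p_for_equal_groups(mean_diff=0.2, sd=0.19, n=n) for n in (3, 6, 12, 24)}
for n, p in p_values.items():
    print(f"N = {n:2d} per group: p = {p:.4f}")
```

With these made-up numbers the same difference is not significant at N = 6 per group but becomes significant at N = 12, because the standard error of the difference shrinks as 1/sqrt(N).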



Comparing >2 means: ANOVA

•  What if we have >2 different data sets to compare?
•  What if we have more than one independent variable (factor)?

•  If either case applies, we should begin with an analysis of variance (ANOVA)

•  With x independent variables (factors) we use an “x-way” ANOVA.

http://mips.stanford.edu/


Comparing >2 means: one-way ANOVA

•  One-way ANOVA is a conceptual extension of the t-test
–  We still have one independent variable (factor)
–  However, we change it to more than two different conditions
–  E.g., Test elongation to failure of alloy with 0.1%, 2.5%, 5% spinel additions.

•  Null hypothesis: All groups are simply random samples of the same population.

•  Rejecting the null hypothesis (p < 0.05) tells us the factor has an effect, but does not tell us which means are different!
–  Need a post-hoc test for pair-wise comparisons
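A one-way ANOVA compares the scatter between group means with the scatter inside the groups through an F ratio. The following is a minimal sketch with invented elongation values for three spinel levels; `one_way_anova` and `f_upper_tail_p` are hypothetical helper names, and the p-value comes from a simple numerical integration that assumes at least two within-group degrees of freedom.

```python
import math
from statistics import mean


def f_upper_tail_p(f_stat, df_num, df_den, steps=2000):
    """P(F > f_stat) for an F distribution with (df_num, df_den) degrees of freedom.

    Uses P = I_x(df_den/2, df_num/2) with x = df_den / (df_den + df_num * f_stat),
    where I_x is the regularized incomplete beta function, integrated here with
    Simpson's rule. This simple version assumes df_den >= 2.
    """
    if f_stat <= 0:
        return 1.0
    a, b = df_den / 2, df_num / 2
    x = df_den / (df_den + df_num * f_stat)
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    h = x / steps
    total = 0.0
    for i in range(steps + 1):
        t = i * h
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * t ** (a - 1) * (1 - t) ** (b - 1)
    return min(1.0, total * h / 3 / math.exp(log_beta))


def one_way_anova(groups):
    """Return (F, df_between, df_within, p) for a list of samples, one list per group."""
    all_values = [x for g in groups for x in g]
    grand = mean(all_values)
    group_means = [mean(g) for g in groups]
    ss_between = sum(len(g) * (gm - grand) ** 2 for g, gm in zip(groups, group_means))
    ss_within = sum((x - gm) ** 2 for g, gm in zip(groups, group_means) for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within, f_upper_tail_p(f_stat, df_between, df_within)


# Hypothetical elongation to failure (%) for three spinel contents
elongation = [
    [1.0, 1.2, 0.9, 1.1, 1.3],  # 0.1% spinel
    [1.6, 1.9, 1.4, 2.0, 1.6],  # 2.5% spinel
    [1.2, 1.5, 1.1, 1.4, 1.3],  # 5% spinel
]
f_stat, df_between, df_within, p_value = one_way_anova(elongation)
print(f"F({df_between}, {df_within}) = {f_stat:.2f}, p = {p_value:.4f}")
```

A small p here says only that at least one group differs; it does not identify which one.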


Comparing >2 means with one way ANOVA

•  In this study we wanted to know if % spinel (factor) affected the elongation to fracture

•  Eyeball test says 2.5% is quite different from others, but what does statistical test say?

Schneibel, Brady, Kruzic, Ritchie, Zeit. Metall. (2005)


Comparing >2 means with one way ANOVA

•  p = 0.0179, reject null hypothesis
•  One way ANOVA tells us only that result is affected by factor, but not which results are different!
•  To do pairwise, multiple sample comparisons we need a post-hoc test

Schneibel, Brady, Kruzic, Ritchie, Zeit. Metall. (2005)


Post-hoc tests for multiple sample comparisons

•  There are many different post-hoc tests out there. They are also called multiple-range tests in StatGraphics. Some examples are:
–  Fisher's least significant difference (LSD)
–  Duncan's new multiple range test
–  Newman–Keuls method
–  Rodger's method
–  Scheffé's method
–  Tukey's range test
–  Dunnett's test

•  Choosing can be confusing!

•  Tukey’s test is well respected by journals I publish in so I choose it.

•  You can look up advantages and disadvantages of various methods on your own.
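As a rough sketch of what Tukey's test does for groups of equal size: it converts the within-group mean square from the ANOVA into a single "honestly significant difference" and flags every pair of means that differ by more than that. The data below are invented, `tukey_hsd_equal_n` is a hypothetical helper, and the critical studentized-range value `q_crit` is an input that must be looked up; the 3.77 used here is the commonly tabulated value for 3 groups, 12 within-group degrees of freedom and α = 0.05, and should be checked against your own table or software.

```python
import math
from itertools import combinations
from statistics import mean


def tukey_hsd_equal_n(groups, q_crit):
    """Tukey pairwise comparisons for groups of equal size.

    groups maps a label to a list of measurements. q_crit is the critical
    studentized-range value for the chosen alpha, number of groups, and
    within-group degrees of freedom (looked up, not computed here).
    """
    names = list(groups)
    n = len(groups[names[0]])
    means = {name: mean(values) for name, values in groups.items()}
    df_within = len(names) * (n - 1)
    ss_within = sum((x - means[name]) ** 2 for name in names for x in groups[name])
    ms_within = ss_within / df_within
    hsd = q_crit * math.sqrt(ms_within / n)  # smallest difference judged significant
    results = {}
    for a, b in combinations(names, 2):
        diff = abs(means[a] - means[b])
        results[(a, b)] = (diff, diff > hsd)
    return hsd, results


# Hypothetical elongation to failure (%) for three spinel contents
elongation = {
    "0.1%": [1.0, 1.2, 0.9, 1.1, 1.3],
    "2.5%": [1.6, 1.9, 1.4, 2.0, 1.6],
    "5%": [1.3, 1.6, 1.2, 1.5, 1.4],
}
hsd, comparisons = tukey_hsd_equal_n(elongation, q_crit=3.77)
for pair, (diff, significant) in comparisons.items():
    print(pair, f"difference = {diff:.2f}", "significant" if significant else "not significant")
```

In this made-up data set only the 0.1% vs. 2.5% pair exceeds the threshold, even though the ANOVA as a whole would reject the null hypothesis.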


Post-hoc tests for multiple sample comparisons

•  Despite the apparent difference in means, we cannot prove 2.5% spinel is different from 5% spinel without more testing

•  Statistical tests can indicate how much data is needed to prove your point (e.g., to your advisor, a reviewer, etc.)

Schneibel, Brady, Kruzic, Ritchie, Zeit. Metall. (2005)


Comparing means with >1 factors

•  When you have >1 independent variables (factors) you need to start with a multi-factor ANOVA

•  Two way ANOVA is used when there are two factors.

–  E.g., Strength of two different dental composites tested with two different pre-treatments (hydrated vs. not hydrated)

•  Result of ANOVA tells us:
–  If the factor has an effect (called a main effect)
–  If there are interaction effects between factors.

•  Again, ANOVA does not tell us which means are different!

–  To do pairwise, multiple sample comparisons we need a post-hoc test

Shah, Ferracane, Kruzic, Dent. Mater. (2009)


Two-way ANOVA for strength of composites

•  Main effects are caused by the individual factors. In this case both factors have significant effects on strength (p < 0.05).

–  Since both factors only have two levels, we don’t need a post-hoc test. We know where the differences lie.

•  Interactions tell if a factor’s effect depends on other factors.
–  In this case, there is no significant interaction (p = 0.14)
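The main effects and the interaction come from splitting the total scatter into separate sums of squares. The sketch below handles a balanced design (the same number of replicates in every cell) with invented strength values; the composite and treatment labels, the data, and the helper names are all hypothetical.

```python
import math
from statistics import mean


def f_upper_tail_p(f_stat, df_num, df_den, steps=2000):
    """P(F > f_stat), via Simpson integration of the incomplete beta function.

    Simple version: assumes df_den >= 2.
    """
    if f_stat <= 0:
        return 1.0
    a, b = df_den / 2, df_num / 2
    x = df_den / (df_den + df_num * f_stat)
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    h = x / steps
    total = 0.0
    for i in range(steps + 1):
        t = i * h
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * t ** (a - 1) * (1 - t) ** (b - 1)
    return min(1.0, total * h / 3 / math.exp(log_beta))


def two_way_anova(data):
    """Balanced two-way ANOVA; data maps (level_a, level_b) to a list of replicates."""
    levels_a = sorted({a for a, _ in data})
    levels_b = sorted({b for _, b in data})
    n = len(next(iter(data.values())))  # replicates per cell, assumed equal
    cell = {key: mean(values) for key, values in data.items()}
    mean_a = {a: mean([cell[(a, b)] for b in levels_b]) for a in levels_a}
    mean_b = {b: mean([cell[(a, b)] for a in levels_a]) for b in levels_b}
    grand = mean(list(cell.values()))
    ss_a = n * len(levels_b) * sum((mean_a[a] - grand) ** 2 for a in levels_a)
    ss_b = n * len(levels_a) * sum((mean_b[b] - grand) ** 2 for b in levels_b)
    ss_ab = n * sum((cell[(a, b)] - mean_a[a] - mean_b[b] + grand) ** 2
                    for a in levels_a for b in levels_b)
    ss_err = sum((x - cell[key]) ** 2 for key, values in data.items() for x in values)
    df_a, df_b = len(levels_a) - 1, len(levels_b) - 1
    df_err = len(levels_a) * len(levels_b) * (n - 1)
    ms_err = ss_err / df_err
    table = {}
    for name, ss, df in (("A", ss_a, df_a), ("B", ss_b, df_b), ("A x B", ss_ab, df_a * df_b)):
        f_stat = (ss / df) / ms_err
        table[name] = (f_stat, f_upper_tail_p(f_stat, df, df_err))
    return table


# Hypothetical strength (MPa): factor A = composite, factor B = pre-treatment
strength = {
    ("composite 1", "dry"): [120, 125, 118, 122, 115],
    ("composite 1", "hydrated"): [110, 108, 112, 105, 115],
    ("composite 2", "dry"): [105, 110, 100, 108, 102],
    ("composite 2", "hydrated"): [92, 95, 88, 90, 85],
}
anova_table = two_way_anova(strength)
for effect, (f_stat, p) in anova_table.items():
    print(f"{effect}: F = {f_stat:.2f}, p = {p:.3g}")
```

In this invented data set both main effects are significant while the "A x B" interaction is not, which is the same pattern described for the dental composites above.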

Shah, Ferracane, Kruzic, Dent. Mater. (2009)


Two-way ANOVA for strength of composites

•  Interaction plots can help us understand the interactions.

•  Here, composite type seems to have greater effect for hydrated case (i.e., an interaction).

•  However, this interaction is not significant (p > 0.05).


Post-hoc tests after multi-factor ANOVA

•  What if factors have >2 levels?
–  Need post-hoc test to find which levels are significantly different.

•  Is there an interaction?
–  No: You can test each significant factor individually.
–  Yes: I know of two options, but there are more.

1.  Analyze the whole set with one post-hoc test
–  Each factor combination grouped as a single factor (e.g.: AA, AB, BA, BB)
–  Simplest way, but stats purists might take some issue

2.  Conduct a “simple main effects test.”
–  Considered a more rigorous method.
–  I don’t think StatGraphics can do this.
–  Probably not worth the effort unless asked of you.

•  Be aware, there is no universal agreement on best way to do pairwise post-hoc testing of data with a significant interaction.
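Option 1 amounts to relabeling the data so that every combination of factor levels becomes one level of a single combined factor, which can then go through a one-way ANOVA and a post-hoc test. A minimal sketch, with hypothetical factor levels "A" and "B" and made-up values:

```python
from collections import defaultdict


def combine_factors(records):
    """Group measurements by the combination of both factor levels.

    records is an iterable of (level_of_factor_1, level_of_factor_2, value).
    Returns a dict such as {"AA": [...], "AB": [...], ...}.
    """
    groups = defaultdict(list)
    for level_1, level_2, value in records:
        groups[f"{level_1}{level_2}"].append(value)
    return dict(groups)


# Hypothetical measurements for a 2 x 2 design
records = [
    ("A", "A", 10.1), ("A", "A", 9.8),
    ("A", "B", 12.3), ("A", "B", 12.0),
    ("B", "A", 8.9), ("B", "A", 9.2),
    ("B", "B", 14.5), ("B", "B", 14.1),
]
combined = combine_factors(records)
print(sorted(combined))
```

Each entry of `combined` is then treated as one group in the one-way analysis.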

http://www.articulate.com/

http://www.ideachampions.com/


Post-hoc tests after multi-factor ANOVA

•  Strength of surgical sutures
–  Factor 1: Suture type, 13 levels (nylon, silk, Polybutester, etc.)
–  Factor 2: Test type, 2 levels (knotted or straight)

•  Two way ANOVA shows significant main effects AND interaction effect.

Naleway, Lear, Kruzic, Maughan JAAD (in review)

Interaction effects

[Figure: interaction plot for suture types A B C D E F G H I J K L M]

•  Interaction plot shows the strength of some suture types rank differently in knotted vs. straight configuration (e.g., F vs. G, L vs. M, etc.)

Naleway, Lear, Kruzic, Maughan JAAD (in review)


Post-hoc tests after multi-factor ANOVA

•  To find pairwise significant differences, did one-way ANOVA and Tukey’s multiple comparison test on 2 x 13 = 26 levels.

•  Result lists all pair comparisons and shows which pairs are significantly different (p < 0.05)

Naleway, Lear, Kruzic, Maughan JAAD (in review)


Regression Analysis & Proving Significant Trends

•  When the independent variable is continuous, a regression analysis is often most appropriate
–  E.g., our example of ductility with % spinel content

•  Also, what if we don’t have resources to do multiple tests for each level?
–  A regression analysis can help prove a trend with only 1 or 2 data points per level.

http://pn.bmj.com/

http://www.vias.org/


Example: Fatigue threshold results

•  Fatigue crack growth testing requires many samples to get one curve

•  ΔKTH is lower asymptote and often we get only one value despite multiple tests

•  Comparing ΔKTH with a statistical test is problematic if there is only one value per level

•  However, a regression is possible

[Figure: Typical Crack Propagation Data. Crack growth rate, da/dN (m/cycle), on a log scale from 10^-10 to 10^-6, versus log applied stress intensity range, ΔKapp, for a typical ductile material and a typical brittle material. The curves are annotated with da/dN = C(ΔK)^m and with the threshold ΔKTH.]


Example: Fatigue crack growth results

•  We measured fatigue crack growth curves for four different dental composites.

•  We substituted 0%, 5%, 10% and 15% of the filler with bioactive glass

•  Maybe we see a decrease in fatigue threshold?

•  Is it significant?

Khvostenko, Mitchell, Hilton, Ferracane, Kruzic, Dental Materials (in press)


Example: Fatigue crack growth results

•  We can try a simple linear regression of the data:


Example: Fatigue crack growth results

•  Note, we can try many different functions if desired:


Example: Fatigue crack growth results

•  In this case p = 0.354
–  no significant effect of adding bioactive glass

•  R² = 0.41737
–  A linear fit only explains 41.7% of the variability

•  We cannot conclude that there is a linear downward trend

•  Be careful! Significance (p-value) depends on fitting function chosen (linear in this case)
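For a simple linear regression, the significance of the slope follows from the correlation coefficient and the number of points through a t statistic with n − 2 degrees of freedom. The sketch below uses invented threshold values at the four glass contents; the data and helper names are hypothetical, and the p-value is computed by numerical integration rather than by a statistics package.

```python
import math
from statistics import mean


def t_two_sided_p(t_stat, df, steps=2000):
    """Two-sided p-value for Student's t, by Simpson integration of the density."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    c = math.exp(log_c) / math.sqrt(df * math.pi)
    h = abs(t_stat) / steps
    total = 0.0
    for i in range(steps + 1):
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * c * (1 + (i * h) ** 2 / df) ** (-(df + 1) / 2)
    return max(0.0, 1.0 - 2.0 * total * h / 3.0)


def slope_p_value(r, n):
    """p-value for the null hypothesis 'slope = 0' given R and n points (|R| < 1)."""
    t_stat = r * math.sqrt((n - 2) / (1 - r * r))
    return t_two_sided_p(t_stat, n - 2)


def linear_fit(x, y):
    """Least-squares line y = m*x + b; returns (m, b, R)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    m = sxy / sxx
    return m, y_bar - m * x_bar, sxy / math.sqrt(sxx * syy)


# Hypothetical fatigue thresholds for 0, 5, 10 and 15% bioactive glass
glass_percent = [0, 5, 10, 15]
threshold = [0.60, 0.52, 0.58, 0.50]

m, b, r = linear_fit(glass_percent, threshold)
p_value = slope_p_value(r, len(glass_percent))
print(f"slope = {m:.4f}, R^2 = {r * r:.3f}, p = {p_value:.3f}")

# Cross-check against the reported R^2, assuming n = 4 points were fitted
p_from_reported_r2 = slope_p_value(math.sqrt(0.41737), 4)
print(f"p from R^2 = 0.41737 with n = 4: {p_from_reported_r2:.3f}")
```

Assuming the regression above used four points (one threshold per composite), R² = 0.41737 gives p ≈ 0.354 by this calculation, consistent with the reported value. With so few points, even a fit that explains a sizeable share of the variability is far from significant.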


Reporting regression values as results

•  In some cases the regression is the data we want to report.
–  E.g., the slope of plot gives the fracture toughness.
–  How do we put error bars on that fracture toughness value?

•  We can calculate uncertainty from the correlation coefficient, R
–  J. Higbie, “Uncertainty in the linear regression slope,” Am. J. Phys, 1991, 59(2) 184-185.

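–  Linear fit: $y = mx + b$

–  Uncertainty in m: $\dfrac{m \tan(\arccos R)}{\sqrt{n-2}}$

–  Uncertainty in b: $\dfrac{m \tan(\arccos R)\, x_{rms}}{\sqrt{n-2}}$, where $x_{rms}$ is the root-mean-square of the x values and n is the number of data points

•  Don’t trust Excel LINEST function –  it can produce errors!

Eilertsen, Subramanian, Kruzic, J. Alloys. Comp. (2013)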
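The slope and intercept uncertainties, m·tan(arccos R)/√(n−2) and that same quantity multiplied by the root-mean-square of x, are easy to evaluate by hand or in a few lines of code. The sketch below uses invented data and a hypothetical function name; it also computes the textbook standard error of the slope from the residuals so the two can be compared.

```python
import math
from statistics import mean


def fit_with_uncertainty(x, y):
    """Least-squares line with uncertainties from the correlation coefficient R.

    Returns (m, b, r, sigma_m, sigma_b) using
    sigma_m = m * tan(arccos(R)) / sqrt(n - 2) and sigma_b = sigma_m * x_rms.
    """
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    m = sxy / sxx
    b = y_bar - m * x_bar
    r = max(-1.0, min(1.0, sxy / math.sqrt(sxx * syy)))  # guard against rounding
    sigma_m = m * math.tan(math.acos(r)) / math.sqrt(n - 2)
    x_rms = math.sqrt(sum(xi * xi for xi in x) / n)
    return m, b, r, sigma_m, sigma_m * x_rms


# Hypothetical data lying close to a straight line
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
m, b, r, sigma_m, sigma_b = fit_with_uncertainty(x, y)

# Textbook standard error of the slope, computed from the residuals
x_bar = mean(x)
sse = sum((yi - m * xi - b) ** 2 for xi, yi in zip(x, y))
se_from_residuals = math.sqrt(sse / (len(x) - 2) / sum((xi - x_bar) ** 2 for xi in x))

print(f"m = {m:.3f} +/- {sigma_m:.3f}, b = {b:.3f} +/- {sigma_b:.3f}")
print(f"slope standard error from residuals = {se_from_residuals:.3f}")
```

The two slope uncertainties agree because tan(arccos R) = √(1 − R²)/R, so the R-based expression is an algebraic rearrangement of the residual-based standard error.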


Reporting regression values as results

•  In this case, we forced fit through zero (y = mx).

•  We can use correlation coefficient to calculate uncertainty in m


Uncertainty in results calculated from regressions

•  Suppose we do get a good linear fit and want to use the slope in calculations.
–  E.g., We used a linear calibration to calculate stress from Raman data.

•  How do we determine uncertainty in the values calculated from the fit?

•  Need more than uncertainty in fit parameters.

[Figure: Stress (MPa) versus Shift in Raman Peak Position (cm-1)]

Greene, Gallops, Fünfschilling, T. Fett, M.J. Hoffmann, Ager III, Kruzic, J. Mech. Phys. Sol. (2012)


Uncertainty in results calculated from regressions

•  For the linear fit y = mx:
–  y = stress (MPa)
–  x = Raman peak shift (cm-1)

•  We can calculate y* for any measured value x = x*

•  What is the uncertainty in y*?
–  Simply knowing the uncertainty in the slope does not answer this.

•  We can calculate the uncertainty in y* for x = x* as:

$$y_{error} = t_{\alpha/2,\,n-2}\,\sqrt{\frac{\sum \left(y_i - m x_i - b\right)^2}{n-2}}\;\sqrt{\frac{1}{n} + \frac{n\left(x^* - \bar{x}\right)^2}{n\sum x_i^2 - \left(\sum x_i\right)^2}}$$

–  n = number of data points in the fit.
–  $t_{\alpha/2,\,n-2}$ is the critical value for the t-distribution. Look this up.
–  α is the desired significance level (α = 0.05 would be for 95% confidence)
–  $\bar{x}$ is the mean experimental x value.
–  $x_i$, $y_i$ are the data points used in the fit.
–  m and b are the slope and intercept of the fit (b = 0 for y = mx)
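The uncertainty in a value read off a linear calibration can be evaluated directly from the calibration points. The sketch below assumes a fit with a free intercept, invented calibration data, and a critical t value supplied by the user; 3.182 is the tabulated two-sided 95% value for n − 2 = 3 degrees of freedom. The function name is hypothetical.

```python
import math
from statistics import mean


def calibration_uncertainty(x, y, x_star, t_crit):
    """Predicted y* and its uncertainty at x = x_star for a fitted line y = m*x + b.

    t_crit is the critical t value t(alpha/2, n-2), looked up in a table.
    """
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    m = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b = y_bar - m * x_bar
    # Residual standard deviation of the fit, with n - 2 degrees of freedom
    s_fit = math.sqrt(sum((yi - m * xi - b) ** 2 for xi, yi in zip(x, y)) / (n - 2))
    # Distance term: n*(x* - x_bar)^2 / (n*sum(x_i^2) - (sum x_i)^2)
    spread = n * (x_star - x_bar) ** 2 / (n * sum(xi * xi for xi in x) - sum(x) ** 2)
    return m * x_star + b, t_crit * s_fit * math.sqrt(1 / n + spread)


# Hypothetical calibration: Raman peak shift (cm-1) vs. applied stress (MPa)
shift = [0.0, 0.5, 1.0, 1.5, 2.0]
stress = [2.0, 98.0, 205.0, 296.0, 399.0]
T_CRIT_95_DF3 = 3.182  # two-sided 95% critical t for 3 degrees of freedom

y_mid, err_mid = calibration_uncertainty(shift, stress, x_star=1.0, t_crit=T_CRIT_95_DF3)
y_end, err_end = calibration_uncertainty(shift, stress, x_star=2.0, t_crit=T_CRIT_95_DF3)
print(f"x* = 1.0: stress = {y_mid:.1f} +/- {err_mid:.1f} MPa")
print(f"x* = 2.0: stress = {y_end:.1f} +/- {err_end:.1f} MPa")
```

The uncertainty is smallest at the mean of the calibration x values and grows toward the ends of the calibrated range, which is why knowing the slope uncertainty alone is not enough.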


Summary

•  You can strengthen your publications and ease your review process by doing statistical analyses of your data.

–  If you work in biomaterials/biomechanics, this will be absolutely required!

•  Resources available at OSU can help you ease this process

•  OSU has a Windows software license for StatGraphics, which can handle many of your needs.

–  http://oregonstate.edu/helpdocs/software/recommended-software/osuware

•  On my Mac I run it using Citrix Receiver.

–  http://engineering.oregonstate.edu/computing/citrix/steps.html

http://4.bp.blogspot.com/

http://www.kilkku.com/

