Introduction to resampling techniques for generating confidence measures.

Date post: 04-Jan-2016
Upload: joel-edwards
Page 1: Introduction to resampling techniques for generating confidence measures.

Introduction to resampling techniques for generating confidence measures

Page 2: Introduction to resampling techniques for generating confidence measures.

Resampling techniques

1) Randomization
– Resampling without replacement (re-ordering, permutations)

2) Jackknife
– Leaving one data point out at a time (not good for small sample sizes); in paleobiology usually used for phylogenetic analyses

3) Sampling standardization
– When comparing samples of different sizes

4) Bootstrap
– Parametric
  • Generate datasets from a parametrized model and compare these with empirical data
– Non-parametric
  • Most common in paleobiology

Page 3: Introduction to resampling techniques for generating confidence measures.

Randomization

[Diagram: the empirical data are re-ordered to produce Randomized Sample 1, Randomized Sample 2, Randomized Sample 3, …, Randomized Sample N.]
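
The re-ordering idea can be sketched as a small randomization (permutation) test. This is only an illustration under stated assumptions: the two groups of measurements are invented, the statistic (difference in group means) is an arbitrary choice, and the function name `permutation_test` is hypothetical rather than taken from the slides.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_perm=2000, seed=0):
    """Two-sided randomization test for a difference in group means."""
    rng = random.Random(seed)
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_perm):
        # Re-order the pooled data: resampling WITHOUT replacement.
        rng.shuffle(pooled)
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Count the observed arrangement itself so the p-value is never zero.
    p_value = (extreme + 1) / (n_perm + 1)
    return observed, p_value


# Hypothetical measurements, invented purely for illustration.
group_a = [4.1, 5.0, 5.3, 6.2, 5.8, 4.9]
group_b = [6.9, 7.4, 6.1, 8.0, 7.2, 6.6]

observed, p_value = permutation_test(group_a, group_b)
print(f"observed difference = {observed:.3f}, p = {p_value:.4f}")
```

Each shuffle keeps every observation exactly once and only changes which group it is assigned to, which is what distinguishes randomization from the bootstrap.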

Page 4: Introduction to resampling techniques for generating confidence measures.

Jackknife

[Diagram: the empirical data yield jackknife sample 1, …, jackknife sample N.]
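
A minimal sketch of leave-one-out resampling follows. The data are invented, `jackknife_se` is a hypothetical helper name, and the standard jackknife standard-error formula, sqrt((n − 1)/n · Σ(θᵢ − θ̄)²), is used here as an assumption about what one would compute from the jackknife samples; the slides themselves do not state a formula.

```python
import math
from statistics import mean, stdev


def jackknife_se(data, statistic):
    """Return the leave-one-out estimates and the jackknife standard error."""
    n = len(data)
    # Jackknife sample i is the data with observation i removed.
    leave_one_out = [statistic(data[:i] + data[i + 1:]) for i in range(n)]
    centre = mean(leave_one_out)
    variance = (n - 1) / n * sum((t - centre) ** 2 for t in leave_one_out)
    return leave_one_out, math.sqrt(variance)


# Hypothetical measurements, invented purely for illustration.
data = [2.3, 3.1, 2.8, 4.0, 3.6, 2.9, 3.3, 3.8]

leave_one_out, se_jack = jackknife_se(data, mean)
print(f"jackknife SE of the mean = {se_jack:.4f}")
# For the mean, the jackknife SE coincides with the textbook s / sqrt(n).
print(f"s / sqrt(n)              = {stdev(data) / math.sqrt(len(data)):.4f}")
```

With n data points there are only n jackknife samples, which is one way to see why the technique is weak for small sample sizes.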

Page 5: Introduction to resampling techniques for generating confidence measures.

Sampling Standardization

[Diagram: Empirical data 1, Empirical data 2 and Empirical data 3 are each subsampled to produce Standardized Sample 1, Standardized Sample 2, …, Standardized Sample N.]
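
One simple way to standardize samples of different sizes is to subsample each one down to a common quota and average the statistic over many draws. The sketch below does this for taxon richness. The taxon lists, the choice of quota (the size of the smallest sample) and the name `standardized_richness` are all assumptions made for illustration; this is not necessarily the procedure used in any study cited in these slides.

```python
import random
from statistics import mean


def standardized_richness(occurrences, quota, n_trials=500, seed=0):
    """Mean number of distinct taxa in random subsamples of size `quota`."""
    rng = random.Random(seed)
    counts = []
    for _ in range(n_trials):
        # Draw `quota` occurrences without replacement from this sample.
        subsample = rng.sample(occurrences, quota)
        counts.append(len(set(subsample)))
    return mean(counts)


# Hypothetical occurrence lists of very different sizes (letters are taxa).
samples = {
    "sample_1": ["A"] * 30 + ["B"] * 20 + ["C"] * 10 + ["D"] * 5,
    "sample_2": ["A"] * 12 + ["B"] * 6 + ["E"] * 2,
    "sample_3": ["A"] * 100 + ["B"] * 40 + ["C"] * 30 + ["D"] * 20 + ["F"] * 10,
}

# Assumed quota: the size of the smallest sample.
quota = min(len(occ) for occ in samples.values())

richness = {name: standardized_richness(occ, quota) for name, occ in samples.items()}
for name, value in richness.items():
    print(f"{name}: raw size = {len(samples[name])}, richness at quota {quota} = {value:.2f}")
```

Comparing richness at the same quota removes the advantage that a larger sample has simply because it contains more occurrences.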

Page 6: Introduction to resampling techniques for generating confidence measures.

Non-parametric bootstrap

[Diagram: the empirical data are resampled to produce Bootstrapped Sample 1, Bootstrapped Sample 2, Bootstrapped Sample 3, …, Bootstrapped Sample N.]

Page 7: Introduction to resampling techniques for generating confidence measures.

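Non-parametric bootstrap vs. parametric bootstrap

[Diagram with two panels. Non-parametric bootstrap, boxes labelled in order: Empirical data; Estimate parameters (model); Bootstrap samples; Estimate parameters. Parametric bootstrap, boxes labelled in order: Empirical data; Estimate parameters (model); Simulated samples; Estimate parameters.]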
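
The contrast between the two panels can be sketched in a few lines. In the non-parametric version the replicate datasets are resampled from the data themselves; in the parametric version they are simulated from a model fitted to the data. The normal model, the invented data and both function names are assumptions chosen only to keep the example short.

```python
import random
from statistics import mean, stdev


def nonparametric_bootstrap(data, statistic, n_boot=1000, seed=0):
    """Replicates come from resampling the data with replacement."""
    rng = random.Random(seed)
    n = len(data)
    return [statistic(rng.choices(data, k=n)) for _ in range(n_boot)]


def parametric_bootstrap(data, statistic, n_boot=1000, seed=0):
    """Replicates are simulated from a model fitted to the data.

    Assumption for illustration: the model is a normal distribution
    whose mean and standard deviation are estimated from the data.
    """
    rng = random.Random(seed)
    n = len(data)
    mu, sigma = mean(data), stdev(data)  # estimate parameters (model)
    return [
        statistic([rng.gauss(mu, sigma) for _ in range(n)])  # simulated sample
        for _ in range(n_boot)
    ]


# Hypothetical measurements, invented purely for illustration.
data = [10.2, 11.5, 9.8, 12.1, 10.9, 11.3, 9.5, 10.7, 11.8, 10.1, 12.4, 10.4]

np_means = nonparametric_bootstrap(data, mean)
p_means = parametric_bootstrap(data, mean)
print(f"non-parametric spread of the mean: {stdev(np_means):.3f}")
print(f"parametric spread of the mean:     {stdev(p_means):.3f}")
```

The parametric version is only as good as the assumed model, whereas the non-parametric version relies on the data being a representative sample.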

Page 9: Introduction to resampling techniques for generating confidence measures.

Why resampling (now)

• Underlying distribution of data not well understood and/or complex
• Convenient way to generate uncertainty measures
• Computer intensive (possible only with faster computers)

Page 10: Introduction to resampling techniques for generating confidence measures.

Bootstrapping

• Construct an estimate of the frequency distributions expected from a “generative process”
• Equivalent to generating replicate outcomes from an experiment (doing something many times to see the range of results)
• Assumption: the data are a representative sample of independent observations derived randomly from the studied statistical population

Page 11: Introduction to resampling techniques for generating confidence measures.

Bootstrap error estimates

• Estimate standard error by resampling from the single sample we have.
• This approach uses sampling with replacement from the observed sample to simulate sampling without replacement from the underlying distribution.

Procedure

• Start with the observed sample of size n and the observed sample statistic, call it Z.
• Randomly pick a sample of size n, with replacement, from the observed sample.
• Calculate the sample statistic of interest on this random sample; call it Zboot.
• Repeat many times (generally hundreds to thousands, ideally until the estimate of SE stabilizes).
• Calculate the standard deviation of the Zboot values.
• This is an estimate of the standard error of the observed sample statistic Z:

SD(Zboot) ≈ SE(Z)
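
The procedure above can be sketched step by step as follows. The sample values are invented, the median is an arbitrary choice of statistic, and `bootstrap_se` is a hypothetical helper name; printing the running estimate at a few replicate counts is one simple way to judge whether the SE has stabilized.

```python
import random
from statistics import median, stdev


def bootstrap_se(sample, statistic, n_boot=2000, seed=0):
    """Return the observed statistic Z and the list of Zboot values."""
    rng = random.Random(seed)
    n = len(sample)
    z = statistic(sample)  # observed sample statistic Z
    z_boot = []
    for _ in range(n_boot):
        # Pick a sample of size n, with replacement, from the observed sample.
        resample = rng.choices(sample, k=n)
        z_boot.append(statistic(resample))
    return z, z_boot


# Hypothetical observed sample, invented purely for illustration.
sample = [3.2, 4.8, 5.1, 2.9, 6.3, 4.4, 5.7, 3.9, 4.1, 7.0, 5.5, 3.6, 4.9, 6.1, 4.0]

z, z_boot = bootstrap_se(sample, median)

# Watch the SE estimate settle as more replicates are included.
for b in (100, 500, 1000, 2000):
    print(f"replicates = {b:4d}, SD(Zboot) = {stdev(z_boot[:b]):.3f}")

se = stdev(z_boot)  # SD(Zboot), the estimate of SE(Z)
print(f"Z = {z}, estimated SE(Z) = {se:.3f}")
```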

Page 12: Introduction to resampling techniques for generating confidence measures.

Example (sampling standardization)

Alroy et al. 2008. Phanerozoic trends in the global diversity of marine invertebrates. Science 321:97-100.

Page 13: Introduction to resampling techniques for generating confidence measures.

Example (non-parametric bootstrap)

Foote, M. 2006. Substrate affinity and diversity dynamics of Paleozoic marine animals. Paleobiology 32:345-366.

Page 14: Introduction to resampling techniques for generating confidence measures.

Example (non-parametric bootstrap)

Liow et al. 2009. Lower extinction risk in Sleep-or-Hide Mammals. Am Nat 173:264–272.

Page 15: Introduction to resampling techniques for generating confidence measures.

R demo

• Packages (e.g. boot, bootstrap)
• Write your own: use the function sample

Nice help: http://www.ats.ucla.edu/stat/r/library/bootstrap.htm

Page 16: Introduction to resampling techniques for generating confidence measures.

Links

• http://www.paleo.geos.vt.edu/MK/Kowalewski_PNG_2010.pdf
• http://www.stat.cmu.edu/~cshalizi/402/lectures/08-bootstrap/lecture-08.pdf

