Http://mathworld.wolfram.com/Chi-SquaredDistribution.html More stats... Outliers, R 2, and sample...

Date post: 02-Jan-2016
Upload: evan-hart

http://mathworld.wolfram.com/Chi-SquaredDistribution.html

More stats... Outliers, R², and sample size

•Stats practice in next lab

•Also need to start putting together your group for inquiry 2... 3-5 people/group

•Inquiry 1 written and oral reports are due in lab Th 9/23 or M 9/27

•Homework #2 and #3 coming soon

•Online evaluation

•TA office hours calendar online

•In your lab notebook: Write everything about your experiments. Each entry should have a date. Include notes (intro and conclusions), so when you, or someone else, go back to look at your notebook, the entries make sense.

Notebooks will be turned in as a HW later in the semester.

Outliers…

2, 2, 2, 2, 2, 3, 3, 3, 4, 4, 5, 5, 5, 6, 7, 121, 130

Median = 4

Mean = 18
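
The effect of the two large values on the mean can be checked with a short Python sketch. The data are the values listed above; the `trimmed` list, which drops the two largest values, is only an illustrative comparison and is not part of the original slides.

```python
from statistics import mean, median

# Values from the slide above.
data = [2, 2, 2, 2, 2, 3, 3, 3, 4, 4, 5, 5, 5, 6, 7, 121, 130]

print("median:", median(data))  # 4, as on the slide
print("mean:", mean(data))      # 18, as on the slide

# Illustrative assumption: drop the two largest values to see how much
# they pull the mean compared with the median.
trimmed = sorted(data)[:-2]
print("median without 121 and 130:", median(trimmed))
print("mean without 121 and 130:", round(mean(trimmed), 2))
```

The median barely moves when the two extreme values are removed, while the mean changes a great deal.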

Outliers: When is data invalid?

Not simply when you want it to be.

Dixon’s Q test can determine if a value is statistically an outlier.

Q = |suspect value – nearest value| ÷ |largest value – smallest value|

Example: results from a blood test: 789, 700, 772, 766, 777

Q = |700 – 766| ÷ |789 – 700| = 66 ÷ 89 = 0.742. So?

You need the table of critical values for Q:

Sample size (n)    Q critical value
3                  0.970
4                  0.831
5                  0.717
6                  0.621
7                  0.568
10                 0.466
12                 0.426
15                 0.384
20                 0.342
25                 0.317
30                 0.298

If Q calc > Q crit, the suspect value can be rejected.

From: E.P. King, J. Am. Statist. Assoc. 48: 531 (1958)

If Q calc > Q crit, then the outlier can be rejected.

Q calc = 0.742

Q crit = 0.717 (sample size of 5)

Q calc > Q crit, so the value 700 can be rejected.
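
The Q test can be sketched in Python as follows. This is a minimal sketch, not a definitive implementation: the function names `dixon_q` and `can_reject` are hypothetical, and the critical values are copied from the table in this section, so only the sample sizes listed there are supported.

```python
# Critical values copied from the table in this section
# (sample size -> Q critical value).
Q_CRIT = {3: 0.970, 4: 0.831, 5: 0.717, 6: 0.621, 7: 0.568, 10: 0.466,
          12: 0.426, 15: 0.384, 20: 0.342, 25: 0.317, 30: 0.298}


def dixon_q(values, suspect):
    """Return Q = |suspect - nearest| / |largest - smallest|."""
    ordered = sorted(values)
    smallest, largest = ordered[0], ordered[-1]
    if suspect not in (smallest, largest):
        raise ValueError("the suspect value must be the smallest or largest value")
    if largest == smallest:
        raise ValueError("all values are identical, so Q is undefined")
    # The nearest value is the neighbour of the suspect in sorted order.
    nearest = ordered[1] if suspect == smallest else ordered[-2]
    return abs(suspect - nearest) / abs(largest - smallest)


def can_reject(values, suspect):
    """True if Q calc exceeds Q crit for this sample size."""
    n = len(values)
    if n not in Q_CRIT:
        raise ValueError(f"no critical value listed for a sample size of {n}")
    return dixon_q(values, suspect) > Q_CRIT[n]


blood_test = [789, 700, 772, 766, 777]
print(round(dixon_q(blood_test, 700), 3))  # 0.742
print(can_reject(blood_test, 700))         # True
```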

What can outliers tell us?

If you made a mistake, you should have already accounted for that.

Outliers can lead to important and fascinating discoveries.

Transposons (“jumping genes”) were discovered because they did not fit known modes of inheritance.

What about relating 2 variables?

XKCD.com

R² gives a measure of fit to a line.

If R² = 1, the data fit a straight line perfectly

If R² = 0, there is no linear correlation between the variables
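
One common way to compute this value is to fit a least-squares line and compare the leftover scatter with the total scatter. The sketch below assumes that definition (1 minus residual sum of squares over total sum of squares). The function name `r_squared` and both data sets are made up for illustration and do not come from the slides.

```python
def r_squared(xs, ys):
    """R squared of the least-squares line through the points (xs, ys)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or ss_tot == 0:
        raise ValueError("x and y must each vary for R squared to be defined")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Scatter left over after fitting the line.
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    return 1 - ss_res / ss_tot


# Hypothetical data: points that lie exactly on y = 2x + 1.
print(r_squared([0, 1, 2, 3, 4], [1, 3, 5, 7, 9]))    # 1.0

# Hypothetical data: a symmetric curve (y = x squared) with no linear trend.
print(r_squared([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4]))  # 0.0
```

The second data set follows a clear pattern, just not a straight line, so a value near 0 only says that a line is a poor description of the data.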

[Figure: birth month vs birth day. x-axis: Birth Month; y-axis: Birth Day; R² = 0.00546238003477373]

[Figure: Apyrase Assay Standard Curve 3-7-05, phosphate quantity vs absorbance. x-axis: nMol Pi; y-axis: OD; R² = 0.999918160770785]

•To use R², the data must be continuously variable...

Samples vs populations

Population- everything or everyone about which information is sought

Sample- a subset of a population (that is hopefully representative of the population)

Population-

• U.S. census

• Dogs

• 1 – infinity

Sample-

• Travis county

• Poodles

• Prime numbers

Why use a sample instead of a population?

•Logistics

•Cost

•Time

Samples:

Random- each member of the population has an equal chance of being part of the sample.

or

Representative- ensuring that certain parameters of your sample match the population.
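
The two kinds of sample can be sketched in Python. This is only an illustration: the function names, the dog population, and the choice of breed as the parameter to match are all hypothetical, and proportional (stratified) selection is just one way to make a sample representative.

```python
import random
from collections import defaultdict


def random_sample(population, k, seed=None):
    """Every member has an equal chance of being chosen."""
    return random.Random(seed).sample(population, k)


def representative_sample(population, key, k, seed=None):
    """Match the population's proportions for one parameter (key).

    Each group contributes round(k * group share) members, so the total
    can differ slightly from k when the shares do not divide evenly.
    """
    rng = random.Random(seed)
    groups = defaultdict(list)
    for member in population:
        groups[key(member)].append(member)
    chosen = []
    for members in groups.values():
        quota = round(k * len(members) / len(population))
        chosen.extend(rng.sample(members, quota))
    return chosen


# Hypothetical population: 60 poodles and 40 beagles.
dogs = [("poodle", i) for i in range(60)] + [("beagle", i) for i in range(40)]

picked = representative_sample(dogs, key=lambda dog: dog[0], k=10, seed=1)
print(sum(1 for breed, _ in picked if breed == "poodle"))  # 6
print(sum(1 for breed, _ in picked if breed == "beagle"))  # 4

# A purely random sample of 10 may or may not contain 6 poodles.
print(random_sample(dogs, 10, seed=1))
```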

Replicates:

Technical vs Experimental

Technical replicate- one treatment is divided into multiple samples.

Experimental replicate- separate, replicated treatments are applied to different samples.

Testing blood sugar levels after eating a Snickers:

Divide a participant's blood into 3 samples and test blood sugar in each sample.

Technical or Experimental replicate?

Test 3 different people.

Technical or Experimental replicate?

Test the same person on 3 different days.

Technical or Experimental replicate?

What sample size do you need?

It depends on the error you expect.

To determine an appropriate sample size, you need to estimate a few parameters.

•Means

•Standard Deviation

•Power: The probability that an experiment will have a significant (positive) result, that is, a p-value less than the specified significance level (usually 5%).
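
To show how these parameters combine, here is a minimal sketch using a standard textbook normal-approximation formula for comparing two group means: n per group = 2 × ((z for significance + z for power) × SD ÷ difference in means)². The formula is an assumption brought in for illustration and is not taken from the slides. The function name and the example numbers are hypothetical, and the sketch assumes two groups of equal size with the same standard deviation.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(mean1, mean2, sd, alpha=0.05, power=0.80):
    """Approximate sample size per group for a two-sided comparison of two means."""
    difference = abs(mean1 - mean2)
    if difference == 0:
        raise ValueError("the two means must differ")
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)  # significance level
    z_power = standard_normal.inv_cdf(power)          # desired power
    return ceil(2 * ((z_alpha + z_power) * sd / difference) ** 2)


# Hypothetical estimates: means of 100 and 110, standard deviation of 15.
print(sample_size_per_group(100, 110, sd=15))  # 36

# A larger expected error (standard deviation) calls for a larger sample.
print(sample_size_per_group(100, 110, sd=30))
```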

This calculator will help you determine the appropriate sample size:

http://www.stat.ubc.ca/~rollin/stats/ssize/n2.html

What sample size do you need?

It depends on the error you expect.

(So it is impossible to predict with 100% accuracy before the experiment is carried out.)

3rd Thursday at Blanton Art Museum (http://blantonmuseum.org/calendar_events/details/third_thursday7)

•Stats practice in next lab

•Also need to start putting together your group for inquiry 2... 3-5 people/group

•Inquiry 1 written and oral reports are due in lab Th 9/23 or M 9/27

•Homework #2 and #3 coming soon

•Online evaluation

•TA office hours calendar online

