Date post: | 13-Dec-2015 |
Upload: | eustace-richardson |
Last lecture summary
• Which measures of central tendency do you know?
• Which measures of variability do you know?
• Empirical rule
• Population, census, sample, statistic, parameter
• Statistical inference
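Empirical rule: in a normal distribution, about 68%, 95% and 99.7% of values lie within 1, 2 and 3 standard deviations of the mean.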
Statistical jargon
Population - parameter: mean, standard deviation
Sample - statistic: sample mean, sample standard deviation
population (census) vs. sample
parameter (population) vs. statistic (sample)
Sample vs. population SD
• We use the sample standard deviation to approximate the population parameter.
• But don’t confuse it with the actual standard deviation of a small dataset.
• For example, take this dataset: 5 2 1 0 7. Do you divide by n or by n - 1?
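The two choices can be compared directly on the dataset above. This is a minimal sketch using Python's standard `statistics` module: `pstdev` divides by n (the five numbers are the whole population), while `stdev` divides by n - 1 (the five numbers are a sample used to estimate a population SD). The variable names are illustrative only.

```python
import statistics

data = [5, 2, 1, 0, 7]  # dataset from the slide

# Treat the five values as the whole population: divide by n.
population_sd = statistics.pstdev(data)

# Treat them as a sample from a larger population: divide by n - 1.
sample_sd = statistics.stdev(data)

print(f"population SD (divide by n):     {population_sd:.4f}")
print(f"sample SD     (divide by n - 1): {sample_sd:.4f}")
```

The sample version is always slightly larger, and the gap shrinks as the dataset grows.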
Median absolute deviation (MAD)
• standard deviation is not robust
• IQR is robust
• median absolute deviation (MAD): a robust equivalent of the standard deviation
• Take your data, find the median, calculate the absolute deviations from the median, then find the median of those absolute deviations
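The steps above can be sketched in a few lines of Python. The function name `median_absolute_deviation` is made up for this illustration, the first dataset is borrowed from the earlier SD slide, and the second one (with 7 replaced by 700) is a hypothetical outlier case added only to show what "robust" means.

```python
import statistics

def median_absolute_deviation(values):
    """Median of the absolute deviations from the median."""
    center = statistics.median(values)
    return statistics.median(abs(v - center) for v in values)

data = [5, 2, 1, 0, 7]            # dataset from the earlier SD slide
with_outlier = [5, 2, 1, 0, 700]  # hypothetical: one extreme value

for label, values in [("original", data), ("with outlier", with_outlier)]:
    mad = median_absolute_deviation(values)
    sd = statistics.pstdev(values)
    print(f"{label}: MAD = {mad}, SD = {sd:.2f}")
```

The outlier leaves the MAD unchanged while the standard deviation grows by two orders of magnitude.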
Median absolute deviation (MAD)
Data | Deviation from median | Absolute deviation
5
10
30
20
30
5
15
10
15
Median:
MAD:
Playing chess
• Pretend I am a chess player.
• Which of the following tells you most about how good I am?
1. My rating is 1800.
2. 8110th place among world competitive chess players.
3. Ranked higher than 88% of competitive chess players.
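Option 3 is a statement about position within a distribution. As a rough sketch, the percentage of players ranked below a given rating could be computed as follows; the list of ratings is hypothetical and far smaller than any real rating list, and `percent_ranked_below` is a name invented for this example.

```python
def percent_ranked_below(all_ratings, my_rating):
    """Percentage of ratings strictly lower than my_rating."""
    below = sum(1 for r in all_ratings if r < my_rating)
    return 100 * below / len(all_ratings)

# Hypothetical ratings of ten players (not real data).
ratings = [1200, 1400, 1500, 1600, 1650, 1700, 1750, 1800, 1900, 2100]

print(percent_ranked_below(ratings, 1800))
```

A raw rating or a raw rank only becomes informative once the whole distribution is known; the percentage already carries that information.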
Distribution
Distribution of scores in one particular year
We should use relative frequencies and convert all absolute frequencies to proportions.
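Converting absolute frequencies to proportions means dividing each count by the total. A minimal sketch, with made-up bin labels and counts that are not taken from the lecture data:

```python
# Hypothetical absolute frequencies per score bin (not data from the lecture).
absolute = {"0-19": 4, "20-39": 10, "40-59": 16, "60-79": 8, "80-100": 2}

total = sum(absolute.values())

# Relative frequency = count in the bin / total number of observations.
relative = {bin_label: count / total for bin_label, count in absolute.items()}

for bin_label, proportion in relative.items():
    print(f"{bin_label}: {proportion:.1%}")
```

The proportions sum to 1, so histograms built from datasets of different sizes become directly comparable.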
Height data – absolute frequencies
http://wiki.stat.ucla.edu/socr/index.php/SOCR_Data_Dinov_020108_HeightsWeights
Height data – relative frequencies
30%
What proportion of values is between 170 cm and 173.75 cm?
173.5
Height data – relative frequencies
What proportion of values is between 170 cm and 175 cm?
We can’t tell for certain.
• How should we modify the data or histogram to show more detail?
1. Add more values to the dataset
2. Increase the bin size
3. Use a smaller bin size
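The effect of bin size can be tried out on synthetic data. The sketch below assumes heights drawn from a normal distribution with mean 173 cm and SD 5 cm (generated here, not the SOCR dataset), and the helper `relative_frequencies` is invented for this illustration.

```python
import random
from collections import Counter

def relative_frequencies(values, bin_width, start):
    """Map each bin's left edge to the proportion of values in that bin."""
    counts = Counter(
        start + bin_width * int((v - start) // bin_width) for v in values
    )
    n = len(values)
    return {edge: counts[edge] / n for edge in sorted(counts)}

random.seed(0)
heights = [random.gauss(173, 5) for _ in range(1000)]  # synthetic heights

wide = relative_frequencies(heights, bin_width=5.0, start=150.0)
narrow = relative_frequencies(heights, bin_width=1.25, start=150.0)

# With 5 cm bins, 170-175 cm is one bar; 1.25 cm bins split it into four.
print("170-175 cm (one wide bin):", wide[170.0])
for k in range(4):
    edge = 170.0 + 1.25 * k
    print(f"{edge}-{edge + 1.25} cm:", narrow.get(edge, 0.0))
```

A range can only be read off a histogram when its endpoints coincide with bin edges, which is why narrower bins allow more questions to be answered.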
Quiz
• What does a negative Z-score mean?
1. The original value is negative.
2. The original value is less than mean.
3. The original value is less than 0.
4. The original value minus the mean is negative.
Quiz II
• If we standardize a distribution by converting every value to a Z-score, what will be the new mean of this standardized distribution?
• If we standardize a distribution by converting every value to a Z-score, what will be the new standard deviation of this standardized distribution?
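Both questions can be checked numerically. This sketch standardizes the small dataset from the earlier SD slide, assuming the population standard deviation (dividing by n) is used in the Z-score; `standardize` is a name chosen for this example.

```python
import statistics

def standardize(values):
    """Convert each value to a Z-score: (value - mean) / SD."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)  # assumption: population SD
    return [(v - mean) / sd for v in values]

data = [5, 2, 1, 0, 7]  # dataset from the earlier SD slide
z_scores = standardize(data)

print("Z-scores:", [round(z, 3) for z in z_scores])
print("mean of Z-scores:", round(statistics.fmean(z_scores), 10))
print("SD of Z-scores:  ", round(statistics.pstdev(z_scores), 10))
```

Values below the original mean of 3 come out with negative Z-scores, whatever the sign of the original value.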
Z
Z – number of standard deviations away from the mean
If the Z-value is +1, what percentage of values is less than that value?
approx. 84%
-3 -2 -1 0 +1 +2 +3
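The figure of about 84% can be checked against the standard normal distribution, assuming the data are normally distributed. `statistics.NormalDist` is part of the Python standard library (3.8 and later); its `cdf` method gives the proportion of values below a given point.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

# Proportion of values below Z = +1.
below_plus_one = standard_normal.cdf(1)
print(f"{below_plus_one:.1%}")

# Empirical-rule reasoning: 50% below the mean plus half of the central 68%.
rule_of_thumb = 0.50 + 0.68 / 2
print(f"{rule_of_thumb:.1%}")
```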
Quiz
• Approximately what proportion of people are between 163 cm and 178 cm tall?
163 168 173 178 183
81.5%
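The 81.5% follows from the empirical rule if the axis ticks are read as a mean of 173 cm and a standard deviation of 5 cm; that reading is an assumption made for this sketch. Under it, 163 cm is 2 SD below the mean and 178 cm is 1 SD above it.

```python
from statistics import NormalDist

# Assumed from the axis ticks: mean 173 cm, SD 5 cm.
heights = NormalDist(mu=173, sigma=5)

# Empirical rule: half of the central 95% (left side, down to -2 SD)
# plus half of the central 68% (right side, up to +1 SD).
empirical = 0.95 / 2 + 0.68 / 2

# Value from the normal CDF for comparison.
exact = heights.cdf(178) - heights.cdf(163)

print(f"empirical rule: {empirical:.1%}")
print(f"normal CDF:     {exact:.1%}")
```

The small difference arises because 68% and 95% are themselves rounded values.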
Quiz
• What is the probability of randomly selecting a height in the sample that is >5 standard deviations above the mean?
1. 0.01
2. 0.3
3. 0.8
4. 0.99
Quiz
• What is the probability of randomly selecting a height in the sample that is <5 standard deviations below the mean?
1. 0.01
2. 0.3
3. 0.8
4. 0.99
Quiz
• For a normal distribution, what proportion of the data lies more than 2 standard deviations below or above the mean?
95% within 2 standard deviations, 2.5% in each tail
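Tail proportions like these can be computed for any number of standard deviations. A minimal sketch, assuming an exactly normal distribution; `two_sided_tail` is a helper name invented here.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, SD 1

def two_sided_tail(k):
    """Proportion of a normal distribution more than k SDs from the mean."""
    # By symmetry, the upper tail equals the lower tail.
    return 2 * standard_normal.cdf(-k)

print(f"beyond 2 SD (both tails): {two_sided_tail(2):.2%}")
print(f"above +5 SD (one tail):   {standard_normal.cdf(-5):.1e}")
```

The two-tail value for 2 SD is close to the 5% implied by the empirical rule, and the tail beyond 5 SD is far smaller than any of the quiz options, so the smallest option is the closest one.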