measures of centrality

Date post: 23-Feb-2016
Upload: phuong
Transcript
Page 1: measures of centrality

MEASURES OF CENTRALITY

Page 2: measures of centrality

Last lecture summary

• Which graphs did we meet?
  • scatter plot (Czech: bodový graf)
  • bar chart (Czech: sloupcový graf)
  • histogram
  • pie chart (Czech: koláčový graf)
• How do they work, and what are their advantages and disadvantages?

Page 3: measures of centrality

Random noise

SIZE [ft²]    COST [$]
1 300          88 000
1 400          72 000
1 600          94 000
1 900          86 000
2 100         112 000
2 300          98 000

Page 4: measures of centrality

Histogram

• Now I will collect the heights of everyone in this room.
• Use the Interactive Histogram Applet: http://www.shodor.org/interactivate/activities/Histogram/
• Key terms: interval, bin
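As a rough illustration of what "interval" and "bin" mean, the sketch below counts how many values fall into each interval of a fixed width. The heights are made up for illustration (they are not the class data), and `histogram_counts` is a hypothetical helper, not part of the applet.

```python
from collections import Counter

def histogram_counts(values, bin_width, start=0):
    """Count how many values fall into each interval [left, left + bin_width)."""
    counts = Counter()
    for v in values:
        # Left edge of the bin that contains v.
        left = start + ((v - start) // bin_width) * bin_width
        counts[left] += 1
    return dict(sorted(counts.items()))

# Hypothetical heights in cm (assumed values, not measured in the lecture).
heights = [158, 162, 165, 167, 170, 171, 173, 174, 178, 181, 185, 192]

# Print a simple text histogram with 10 cm bins starting at 150 cm.
for left, count in histogram_counts(heights, bin_width=10, start=150).items():
    print(f"{left}-{left + 10}: {'#' * count}")
```

Changing `bin_width` changes how the same values are grouped, which is the effect the applet lets you explore interactively.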

Page 5: measures of centrality

Histogram – Body fat

• In the Interactive Histogram Applet, choose the "Body fat % in 252 men" dataset.
• Find a reasonable bin size.
• Answer the following question: regardless of bin size, which statements are always true?
  • Most scores fall around 20%.
  • The shape is roughly symmetrical.
  • Most scores fall in the middle of the distribution.
  • There are more scores between 15 and 25 than between 35 and 50.
  • There are more scores between 0 and 10 than between 18 and 24.
  • Relatively more men have a body fat above 35% or below 5%.

Page 6: measures of centrality

Histogram – Income distribution

• United States Census Bureau – http://www.census.gov

Income    Number of houses
10 000     9401
20 000    14447
30 000    13642
40 000    12388
50 000    11028

Page 7: measures of centrality

Histogram – Income distribution

• This is an example of a positively skewed (right-skewed) distribution (Czech: zprava zešikmené rozdělení).

• This distribution is not symmetrical.

• Most incomes fall on the left side of the distribution.
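The skew can be checked against the five rows of the income table shown earlier. This is only a sketch: it assumes each income value labels one histogram bin, and it uses only the rows visible on the slide, not the full Census data.

```python
# Rows from the income table on the slide (income label -> count).
income_counts = {
    10_000: 9401,
    20_000: 14447,
    30_000: 13642,
    40_000: 12388,
    50_000: 11028,
}

# The bin with the highest count sits near the left edge of the range.
peak_income = max(income_counts, key=income_counts.get)

# Counts from the peak onward, ordered by income.
right_of_peak = [c for inc, c in sorted(income_counts.items()) if inc >= peak_income]

# True if every bin to the right of the peak is smaller than the one before it.
declining = all(a > b for a, b in zip(right_of_peak, right_of_peak[1:]))

for income, count in sorted(income_counts.items()):
    print(f"{income:>6}: {'#' * round(count / 1000)}")
print("peak bin:", peak_income, "| counts decline to the right:", declining)
```

A peak on the left followed by steadily shrinking counts to the right is the pattern the slide describes as positive skew.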

Page 8: measures of centrality

Bar chart and scatter plot

• Which scatter plot corresponds to this bar chart?

Page 9: measures of centrality

Pie chart to histogram

• Which histogram looks like it came from the same data?

Page 10: measures of centrality

About statistics

• Statistics – the science of collecting, organizing, summarizing, analyzing, and interpreting data
• Goal – use imperfect information (our data) to infer facts, make predictions, and make decisions
• Descriptive statistics – summarizing data with numbers or pictures
• Inferential statistics – drawing conclusions or making decisions based on data

Page 11: measures of centrality

Choosing a profession

Chemistry          Geography
50 000 – 60 000    40 000 – 55 000

Page 12: measures of centrality

Choosing a profession

• We made an interval estimate.
• But ideally we want one number that describes the entire dataset. This allows us to quickly summarize all our data.

Page 13: measures of centrality

Choosing a profession

1. The value at which frequency is highest.
2. The value at which frequency is lowest.
3. The value in the middle.
4. The biggest value on the x-axis.
5. The mean.

[Figure panels: Chemistry, Geography]

Page 14: measures of centrality

Three big M’s

• The value at which frequency is highest is called the mode, i.e. the mode is the most common value.

• The value in the middle of the distribution is called the median.

• The mean is the arithmetic average: the sum of all values divided by their count.
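A minimal sketch of the three M's using Python's standard `statistics` module. The salary values are invented for illustration and are not the Chemistry or Geography data from the slides.

```python
import statistics

# Hypothetical salaries (assumed values, chosen so that the three measures differ).
salaries = [40_000, 45_000, 45_000, 50_000, 52_000, 58_000, 95_000]

mode = statistics.mode(salaries)      # most common value
median = statistics.median(salaries)  # middle value of the sorted data
mean = statistics.mean(salaries)      # sum of values divided by their count

print("mode:", mode)
print("median:", median)
print("mean:", mean)
```

In this made-up sample the single large salary pulls the mean above the median, while the mode only reflects the one value that happens to repeat.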

[Figure panels: Chemistry, Geography]

Page 15: measures of centrality

Quick quiz

• What is the mode in our data?

Page 16: measures of centrality

Mode in negatively skewed distribution

Page 17: measures of centrality

Mode in uniform distribution

Page 18: measures of centrality

Multimodal distribution

Page 19: measures of centrality

Mode in categorical data

Page 20: measures of centrality

More on the mode

True or false?

1. The mode can be used to describe any type of data we have, whether it's numerical or categorical.
2. All scores in the dataset affect the mode.
3. If we take a lot of samples from the same population, the mode will be the same in each sample.
4. There is an equation for the mode.

• Regarding statement 3:
  • http://onlinestatbook.com/stat_sim/sampling_dist/
  • The mode changes as you change the bin size.
• The mode depends on how you present the data, so we can't use it to learn something about our population.
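The dependence of the mode on bin size can be seen in a small sketch. The scores below are made up for illustration, and `modal_bin` is a hypothetical helper that returns the left edge of the most frequent bin.

```python
from collections import Counter

def modal_bin(values, bin_width):
    """Return the left edge of the most frequent bin [left, left + bin_width)."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    # On a tie, the bin with the smallest left edge is returned.
    return max(sorted(counts), key=counts.get)

# Hypothetical scores (assumed values, not data from the lecture).
scores = [1, 2, 3, 11, 12, 13, 14, 18, 19, 20, 21, 22, 23, 24]

for width in (5, 10):
    left = modal_bin(scores, width)
    print(f"bin width {width}: modal bin is {left}-{left + width}")
```

With a bin width of 5 the most frequent bin is 20-25, but with a bin width of 10 it becomes 10-20, even though the underlying scores are unchanged.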

Page 21: measures of centrality

Life expectancy data

• Watch the TED talk by Hans Rosling, Gapminder Foundation: http://www.ted.com/talks/hans_rosling_shows_the_best_stats_you_ve_ever_seen.html

