Date posted: 14-Apr-2017
Category: Education
Uploaded by: bahareh-heravi
DATA JOURNALISM
Dr. Bahareh Heravi @Bahareh360
Week 11: Newsroom Statistics
What we have learned so far
What Data Journalism is about
Finding Data
Data collection
Data scraping
Data mashing and summarisation
Data cleaning
Data analysis
Data visualisation with graphs, charts and infographics
Data visualisation with maps
FOI
Social Media as a source
NEWSROOM STATISTICS
We have learned before
Simple newsroom math
sum, average, median
Rate
Percent change
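As a quick refresher, the newsroom math listed above can be sketched in a few lines of Python. All figures here (the yearly counts and the population) are invented purely for illustration, and the function names are hypothetical.

```python
from statistics import mean, median

# Hypothetical figures, invented for illustration only.
incidents_by_year = {2014: 120, 2015: 150, 2016: 135}
population = 50_000  # hypothetical population of the area


def rate_per_100k(count, population):
    """Express a raw count as a rate per 100,000 people."""
    return count * 100_000 / population


def percent_change(old, new):
    """Change from old to new, as a percentage of the old value."""
    return (new - old) / old * 100


counts = list(incidents_by_year.values())
total = sum(counts)
average = mean(counts)
middle = median(counts)
rate_2016 = rate_per_100k(incidents_by_year[2016], population)
change_2014_2015 = percent_change(incidents_by_year[2014], incidents_by_year[2015])

print(f"sum={total}, average={average}, median={middle}")
print(f"2016 rate per 100,000: {rate_2016}")
print(f"Percent change 2014 to 2015: {change_2014_2015}%")
```

Rates make areas of different sizes comparable, and percent change is always measured against the earlier value.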
ANALYSING RELATIONSHIPS
Correlation analysis
Correlation concerns the strength of relationship between values of two variables.
Are height and weight correlated?
Are engine size and max speed in cars correlated?
Correlation
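[Figure: correlation scale running from -1 (perfect negative) through 0 (no correlation) to 1 (perfect positive), with -0.5 and 0.5 marked; values near -1 or 1 are strong, values near 0 are weak.]
Source: Statistics without tears, Derek Rowntree
[Figure: correlation scale with the points -1, -0.8, -0.3, 0, 0.3, 0.8 and 1 marked.]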
Student  Theory  Practical
A        59      70
B        63      69
C        64      76
D        70      79
E        76      74
F        78      80
G        82      77
H        79      86
I        86      84
J        92      90
[Two figures: scatter plots of the Theory and Practical scores for students A to J; both axes run from 50 to 95.]
Student  Theory  Practical
G        82      77
H        79      86
I        86      84
[Two figures: scatter plots of the Theory and Practical scores for students G, H and I only; axes run from 76 to 87 and from 78 to 87.]
? !
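To make the comparison between the full table and the three-student subset concrete, here is a minimal sketch that computes the Pearson correlation coefficient r for both. The scores are copied from the student tables above; the helper names (`pearson_r`, `r_for`) are assumptions made for this example, and Pearson's r is assumed to be the correlation measure intended.

```python
from math import sqrt

# Scores copied from the student table: (theory, practical).
scores = {
    "A": (59, 70), "B": (63, 69), "C": (64, 76), "D": (70, 79), "E": (76, 74),
    "F": (78, 80), "G": (82, 77), "H": (79, 86), "I": (86, 84), "J": (92, 90),
}


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    co_variation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return co_variation / sqrt(spread_x * spread_y)


def r_for(students):
    """Correlation between theory and practical scores for the named students."""
    theory = [scores[s][0] for s in students]
    practical = [scores[s][1] for s in students]
    return pearson_r(theory, practical)


r_all = r_for(list(scores))
r_subset = r_for(["G", "H", "I"])
print(f"All ten students: r = {r_all:.2f}")
print(f"Students G, H, I: r = {r_subset:.2f}")
```

With all ten students r comes out at about 0.84, a strong positive correlation. With only G, H and I it is about -0.13, weak and slightly negative, which shows how a small sample can give a very different picture.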
SIGNIFICANCE TEST
Significance test
A significance test helps determine whether an observed relationship is real, or just one that we would expect to see quite often by chance anyway.
We start out assuming that there is no real relationship between the two variables. This assumption is the null hypothesis.
p value
p value: the probability of seeing a relationship at least as strong as the one observed if chance alone were at work. The smaller the p value, the more significant the relationship.
The p value is the calculated probability of an observed difference occurring by chance when no difference or relationship actually exists (that is, when the null hypothesis is true).
If the p value is small enough (?*), we can reject the null hypothesis.
p value cut-offs
p < 0.05 (the 0.05 level): significant *
p < 0.01 (the 0.01 level): highly significant **
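One way to see what a p value means is a permutation test: shuffle one variable many times, which breaks any real relationship, and count how often chance alone produces a correlation at least as strong as the observed one. The sketch below applies this to the Theory and Practical scores from the student table. It is an illustration under stated assumptions, not the calculation SPSS or PSPP performs (those typically rely on a t distribution); the function names, the number of shuffles and the random seed are all choices made for this example.

```python
import random
from math import sqrt

# Theory and Practical scores for students A to J, from the student table.
theory = [59, 63, 64, 70, 76, 78, 82, 79, 86, 92]
practical = [70, 69, 76, 79, 74, 80, 77, 86, 84, 90]


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    co_variation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return co_variation / sqrt(spread_x * spread_y)


def permutation_p_value(xs, ys, n_shuffles=10_000, seed=0):
    """Estimate a two-sided p value under the null hypothesis of no relationship."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    at_least_as_strong = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)  # shuffling pairs the scores up at random
        if abs(pearson_r(xs, shuffled)) >= observed:
            at_least_as_strong += 1
    # Add one to both counts so the estimate is never exactly zero.
    return (at_least_as_strong + 1) / (n_shuffles + 1)


p = permutation_p_value(theory, practical)
if p < 0.01:
    verdict = "highly significant (0.01 level)"
elif p < 0.05:
    verdict = "significant (0.05 level)"
else:
    verdict = "not significant at the 0.05 level"
print(f"p is roughly {p:.4f}: {verdict}")
```

Because the estimate comes from random shuffles, the exact value changes with the seed and the number of shuffles; only its rough size matters when comparing against the cut-offs.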
WARNING
Correlation = Causation?
Other statistical analysis tools
R
PSPP
Excel solver
Hands-on
Correlation analysis and significance test for:
Penalty points in counties in Ireland and rate of road fatalities.
Use SPSS or PSPP
Go back to your penalty points and road fatalities story/data.
You have now completed all the data analysis and visualisation needed for our penalty points story.
Well done!
Resources:
Statistics without tears: A primer for non-mathematicians, Derek Rowntree, first published 1981
Statistics done wrong, Alex Reinhart, 2015, http://www.statisticsdonewrong.com/
Questions?
Bahareh R. Heravi
@Bahareh360