Analysis 101 correlation v causation

Date post: 23-Jun-2015
Upload: veolia
Description:
Examining why correlations are useful tools for analysing the relationship between two samples, such as expected and actual results.
Transcript
Page 1: Analysis 101   correlation v causation

Analysis 101: Correlation v Causation

Correlations & Dependence

Page 2: Analysis 101   correlation v causation

Well Known…

“Correlation doesn’t equal Causation”

Page 3: Analysis 101   correlation v causation

WHY?

Page 4: Analysis 101   correlation v causation

Definition: Causation

• Two or More Things Happen at Once
• Known, Observable Chain of Events/Links
• Maths Best Tool for Job (Iconic Models [Fairly] Simple)
• Doesn’t Need Tonnes of Data

[Figure: 1kg and 2kg weights, labelled “This Goes Down” and “This Goes Up”]

Page 5: Analysis 101   correlation v causation

Definition: Correlation

• Two or More Things Happen at Once
• Non-Observable Chain of Events (What’s in the black-box?)
• May or May Not be Causal
• Statistics Best Tool for Job
  – Strength of Correlation can be Considered the ‘Chance’ they’re Causal

…But go find out for sure!

[Figure: 1kg and 2kg weights, labelled “This Goes Down” and “This Goes Up”]

Page 6: Analysis 101   correlation v causation

Because for All You Know…

In Complex/Black Box Systems, Never Assume

Makes an Ass out of U and Me!

[Figure: labels “This Goes Down” and “This Goes Up”, with weights 1kg, 2kg and 2kg]

Page 7: Analysis 101   correlation v causation

Calculating Correlation Coefficients

• Many Equations/Algorithms
• Most Famous, Pearson Product Moment Coefficient
• BEWARE: Pearson Only Spots Linear Correlations!

Page 8: Analysis 101   correlation v causation

Calculating Correlation Coefficients

• Calculations for Data Samples
• Bars denote Mean-Average for Datasets X & Y
• Xi, Yi Denote i-th Sample of Dataset X and Y
• Commonly Available
  – Excel CORREL() Function = Sample Correlation
  – cor() in R
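As a sketch, the standard sample form of the Pearson coefficient can be written directly from the notation above (means of X and Y, and the i-th samples Xi and Yi). The function name `pearson_r` and the five data points are illustrative assumptions, not taken from the slides.

```python
import math

def pearson_r(xs, ys):
    """Sample Pearson product-moment correlation coefficient.

    r = sum((Xi - Xbar) * (Yi - Ybar))
        / sqrt(sum((Xi - Xbar)^2) * sum((Yi - Ybar)^2))
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / len(xs)  # Xbar
    mean_y = sum(ys) / len(ys)  # Ybar
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    sum_yy = sum((y - mean_y) ** 2 for y in ys)
    return sum_xy / math.sqrt(sum_xx * sum_yy)

# Hypothetical data, far fewer points than you would want in practice.
example_x = [1, 2, 3, 4, 5]
example_y = [2, 4, 5, 4, 5]
r_example = pearson_r(example_x, example_y)
print(f"r = {r_example:.4f}")
```

For real work the built-in functions named on the slide are the more sensible choice; writing it out by hand just shows what they compute.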

Page 9: Analysis 101   correlation v causation

Correlations Need DATA!

• …Because it’s Statistical
• 25 – 30 Data Points Minimum for Each of X & Y
• < 25 Causes Greater Uncertainty
• Sign is Important!
  – Positive Correlation = Increase in Y when X increases
  – Negative Correlation = Decrease in Y when X increases
• Magnitude = Correlation Strength.

Value of the Correlation Coefficient    Strength of Correlation
1                                       Perfect
0.7 - 0.99                              Strong
0.4 - 0.69                              Moderate
0.1 - 0.39                              Weak
0 - 0.09                                Zero
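The sign and magnitude rules above can be sketched as a small lookup. The function name `describe_correlation` is hypothetical. It also assumes that values falling between the listed bands (e.g. 0.695) belong to the lower band, which the table itself does not specify.

```python
def describe_correlation(r):
    """Return (direction, strength) for a correlation coefficient r in [-1, 1]."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("correlation coefficient must lie in [-1, 1]")

    magnitude = abs(r)  # strength ignores the sign
    if magnitude == 1.0:
        strength = "Perfect"
    elif magnitude >= 0.7:
        strength = "Strong"
    elif magnitude >= 0.4:
        strength = "Moderate"
    elif magnitude >= 0.1:
        strength = "Weak"
    else:
        strength = "Zero"

    # Sign: positive means Y tends to rise with X, negative means Y tends to fall.
    if r > 0:
        direction = "positive"
    elif r < 0:
        direction = "negative"
    else:
        direction = "none"
    return direction, strength

print(describe_correlation(-0.85))
print(describe_correlation(0.25))
```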

Page 10: Analysis 101   correlation v causation

EXAMPLE: Maths & Writing Tests

Maths Data

Data Point   Math Scores      Data Point   Math Scores
1            44.5             14           41.2
2            44.7             15           66.4
3            70.5             16           51.0
4            54.7             17           46.9
5            38.4             18           53.0
6            61.4             19           52.3
7            56.3             20           59.6
8            46.3             21           59.3
9            54.4             22           50.3
10           38.3             23           52.2
11           58.8             24           41.8
12           45.1             25           46.4
13           53.9             26           49.9

Writing Data

Data Point   Writing Scores   Data Point   Writing Scores
1            64.5             14           51.5
2            43.7             15           65.1
3            56.7             16           59.3
4            56.7             17           56.7
5            46.3             18           54.1
6            64.5             19           43.0
7            39.1             20           56.7
8            39.1             21           54.1
9            51.5             22           47.6
10           64.5             23           48.9
11           43.7             24           48.9
12           41.1             25           54.1
13           59.3             26           64.5

Correlation = 0.215601457 (Not very strong)

CONCLUSION: Maths test scores can’t be used to set any sort of expectation for writing test scores

Page 11: Analysis 101   correlation v causation

Visualising Correlations

Graphs

e.g. A change in one variable is closely matched by a change in another.

Correlation Matrices

[Figure: correlation matrix with axes Dataset X and Dataset Y]

e.g. Quantitative Surveys.
1. Count respondent scores per question
2. Plot questions against each other

Page 12: Analysis 101   correlation v causation

Great For

• A/B-testing Hypotheses
• Effect of Retrospective Changes on Stories
  – Multiple Items = Multiple Data Points
• Guerrilla Testing Factors
  – Post-experiments
• Experimental Verification
• Empirically Verifying Claims from Politicians ;)

Not So Good For

• Where System Statics and Dynamics are Known
  – Unless identifying reasons for error
  – Simply Generates Waste Otherwise
• Qualitative Results
• Retrospective Changes Where Only a Handful of Results Are Available
  – e.g. team changes or sickness (unless you have enough data)

Page 13: Analysis 101   correlation v causation

Advanced Concepts

• Using Multiple, Linked Correlations Increases Certainty
  – Identify Factors or Behaviours…
  – …Potentially using Other, Strongly Correlated Variables
• Correlation Matrices First Step in Factor Analysis
  – Identifying Influential Factors Above the Noise

Page 14: Analysis 101   correlation v causation

Thanks for Viewing

Further Reading

Correlation (Math is Fun, Advanced)
http://www.mathsisfun.com/data/correlation.html

“Pearson Product-moment Correlation Coefficient” Wikipedia
http://en.wikipedia.org/wiki/Pearson_product-moment_correlation_coefficient

“Correlation & Dependence” Wikipedia
http://en.wikipedia.org/wiki/Correlation_and_dependence

Factor Analysis
http://en.wikipedia.org/wiki/Factor_analysis

Ethar Alali @EtharUK @Dynacognetics

Managing Director & Chief Architect

Polymath-MathMo. Programming since 9 years old. TOGAF 9 Certified, Classic and Agile-EA, change agent. Blog: GoadingtheITGeek.blogspot.co.uk

About Us

Specialist ICT Strategists & Advisors. Member of HiveMind Network for some of the biggest household and corporate multi-nationals.

Accreditations & Associations

Accredited Growth Voucher Advisors certified to deliver IT & Web Growth Consultancy as part of the UK government’s Growth Voucher Scheme.

