Lecture 3 - Wikimedia · 2018. 1. 8. · Lecture 3 Survey Research & Design in Psychology James...

Page 1: Lecture 3 - Wikimedia · 2018. 1. 8. · Lecture 3 Survey Research & Design in Psychology James Neill, 2016 Creative Commons Attribution 4.0 Descriptives & Graphing 2 Overview: Descriptives

Lecture 3
Survey Research & Design in Psychology

James Neill, 2016
Creative Commons Attribution 4.0

Descriptives & Graphing

2

Overview: Descriptives & Graphing

1. Steps with data
2. Level of measurement & types of statistics
3. Descriptive statistics
4. Normal distribution
5. Non-normal distributions
6. Effect of skew on central tendency
7. Principles of graphing
8. Univariate graphical techniques

3

Steps with data (how to approach data)

4

Steps with data

Check and screen data → Explore, describe, & graph → Test hypotheses

6

Data checking

• Have one person read the survey responses aloud to another person who checks the electronic data file.

• For large studies, check a proportion of the surveys and declare the error-rate in the research report.

7

Data screening

• Carefully 'screening' a data file helps to minimise errors and maximise validity.

• For example, screen for:
– Out of range values (min. and max.)
– Mis-entered data
– Missing cases
– Duplicate cases
– Missing data
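The screening checks listed above could be sketched roughly as follows. The records, the plausible age limits, and the 1-5 rating scale are all hypothetical values chosen for illustration; a real study would take its limits from the survey codebook.

```python
from collections import Counter

# Hypothetical survey records: (respondent id, age, rating on a 1-5 scale).
# None marks a missing response.
records = [
    (1, 23, 4),
    (2, 19, 7),      # rating out of range (possibly mis-entered)
    (3, 230, 3),     # age out of range (possibly mis-entered)
    (4, 31, None),   # missing data
    (4, 31, None),   # duplicate case
    (6, 45, 2),      # id 5 never appears: a missing case
]

AGE_RANGE = (17, 100)    # assumed plausible limits
RATING_RANGE = (1, 5)    # assumed response scale
EXPECTED_IDS = set(range(1, 7))  # assumed: surveys 1..6 were handed out


def out_of_range(value, limits):
    """True when a non-missing value falls outside the (min, max) limits."""
    low, high = limits
    return value is not None and not (low <= value <= high)


bad_age = [r[0] for r in records if out_of_range(r[1], AGE_RANGE)]
bad_rating = [r[0] for r in records if out_of_range(r[2], RATING_RANGE)]
missing_data = sorted({r[0] for r in records if None in r})
id_counts = Counter(r[0] for r in records)
duplicates = sorted(rid for rid, n in id_counts.items() if n > 1)
missing_cases = sorted(EXPECTED_IDS - set(id_counts))

print("Out-of-range age:", bad_age)
print("Out-of-range rating:", bad_rating)
print("Missing data:", missing_data)
print("Duplicate cases:", duplicates)
print("Missing cases:", missing_cases)
```

Flagged ids are only candidates for correction; each one still needs to be checked against the original survey form.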

11

Level of measurement & types of statistics

12

Golden rule of data analysis

Level of measurement determines type of descriptive statistics and graphs

Level of measurement determines which types of descriptive statistics and which types of graphs are appropriate.

13

Levels of measurement and non-parametric vs. parametric

• Categorical & ordinal DVs → non-parametric (does not assume a normal distribution)
• Interval & ratio DVs → parametric (assumes a normal distribution), or → non-parametric (if distribution is non-normal)

DVs = dependent variables

14

Parametric statistics

• Procedures which estimate parameters of a population, usually based on the normal distribution

• Key parametric statistics:
– Univariate: M, SD, skewness, kurtosis → t-tests, ANOVAs
– Bi/multivariate: r → linear regression, multiple linear regression

15

Parametric statistics

• More powerful (more sensitive)
• More assumptions (normal distribution)
• More vulnerable to violations of assumptions

16

Non-parametric statistics (Distribution-free tests)

• Fewer assumptions (do not assume a normal distribution)
• Common non-parametric statistics:
– Frequency → sign test, chi-squared
– Rank order → Mann-Whitney U test, Wilcoxon matched-pairs signed-ranks test

17

Univariate descriptive statistics

19

What do we want to describe?

The distributional properties of underlying variables, based on the following measures for sampled data:
● Central tendency(ies): Frequencies, Mode, Median, Mean
● Shape: Skewness, Kurtosis
● Spread (dispersion): Min., Max., Range, IQR, Percentiles, Var/SD

20

Measures of central tendency

Statistics which represent the ‘centre’ of a frequency distribution:
– Mode (most frequent)
– Median (50th percentile)
– Mean (average)

Which ones to use depends on:
– Type of data (level of measurement)
– Shape of distribution (esp. skewness)

Reporting more than one may be appropriate.

21

Measures of central tendency

            Mode / Freq. / %s   Median          Mean
Nominal     √                   x               x
Ordinal     √                   If meaningful   x
Interval    √                   √               √
Ratio       If meaningful       √               √

22

Measures of distribution

• Measures of shape, spread, dispersion, and deviation from the central tendency

Non-parametric:
• Min and max
• Range
• Percentiles

Parametric:
• SD
• Skewness
• Kurtosis

23

Measures of spread / dispersion / deviation

            Min / Max, Range   Percentile      Var / SD
Nominal     x                  x               x
Ordinal     √                  If meaningful   x
Interval    √                  √               (unclear)
Ratio       √                  √               √

24

Descriptives for nominal data

• Nominal LOM = Labelled categories
• Descriptive statistics:
– Most frequent? (Mode, e.g., females)
– Least frequent? (e.g., males)
– Frequencies (e.g., 20 females, 10 males)
– Percentages (e.g., 67% females, 33% males)
– Cumulative percentages
– Ratios (e.g., twice as many females as males)
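As a minimal sketch, the nominal descriptives above can be reproduced with Python's standard library, using the slide's own example of 20 females and 10 males. The variable names are illustrative only.

```python
from collections import Counter

# The slide's example: 20 females and 10 males.
responses = ["female"] * 20 + ["male"] * 10

counts = Counter(responses)                # frequencies
n = len(responses)
ordered = counts.most_common()             # categories, most frequent first
mode, mode_count = ordered[0]              # most frequent category
least, least_count = ordered[-1]           # least frequent category
percentages = {cat: round(100 * c / n) for cat, c in counts.items()}
ratio = counts["female"] / counts["male"]  # females per male

# Cumulative percentages, in order of decreasing frequency.
cumulative = {}
running = 0
for cat, c in ordered:
    running += c
    cumulative[cat] = round(100 * running / n)

print("Frequencies:", dict(counts))
print("Mode:", mode, "| Least frequent:", least)
print("Percentages:", percentages)
print("Cumulative %:", cumulative)
print("Ratio (female : male):", ratio)
```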

25

Descriptives for ordinal data

• Ordinal LOM = Conveys order but not distance (e.g., ranks)

• Descriptives approach is as for nominal (frequencies, mode etc.)

• Plus percentiles (including median) may be useful

26

Descriptives for interval data

• Interval LOM = order and distance, but no true 0 (0 is arbitrary).

• Central tendency (mode, median, mean)

• Shape/Spread (min., max., range, SD, skewness, kurtosis)

Interval data is discrete, but is often treated as ratio/continuous (especially for > 5 intervals)

27

Descriptives for ratio data

• Ratio = Numbers convey order and distance, meaningful 0 point

• As for interval, use median, mean, SD, skewness etc.

• Can also use ratios (e.g., Category A is twice as large as Category B)

28

Mode (Mo)

• Most common score - highest point in a frequency distribution – a real score – the most common response
• Suitable for all levels of data, but may not be appropriate for ratio (continuous)
• Not affected by outliers
• Check frequencies and bar graph to see whether it is an accurate and useful statistic

29

Frequencies ( f) and percentages (%)

• # of responses in each category
• % of responses in each category
• Frequency table
• Visualise using a bar or pie chart

30

Median (Mdn)

• Mid-point of distribution (Quartile 2, 50th percentile)
• Not badly affected by outliers
• May not represent the central tendency in skewed data
• If the Median is useful, then consider what other percentiles may also be worth reporting

31

Summary: Descriptive statistics

• Level of measurement and normality determine whether data can be treated as parametric
• Describe the central tendency:
– Frequencies, Percentages
– Mode, Median, Mean
• Describe the variability:
– Min, Max, Range, Quartiles
– Standard Deviation, Variance

32

Properties of the normal distribution

33

Four moments of a normal distribution

[Figure: bell curve annotated with Mean, ←SD→, -ve Skew, +ve Skew, and ←Kurt→]

34

Four moments of a normal distribution

Four mathematical qualities (parameters) can describe a continuous distribution which at least roughly follows a bell curve shape:

• 1st = mean (central tendency)
• 2nd = SD (dispersion)
• 3rd = skewness (lean / tail)
• 4th = kurtosis (peakedness / flatness)

35

Mean (1st moment)

• Average score: Mean = Σ X / N
• For normally distributed ratio or interval (if treating it as continuous) data.
• Influenced by extreme scores (outliers)

36

Beware inappropriate averaging...

• With your head in an oven and your feet in ice you would feel, on average, just fine.
• The majority of people have more than the average number of legs (M = 1.9999).
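The legs example can be checked with a small sketch. The population below (9,999 people with two legs and one person with one leg) is a hypothetical construction chosen so that the mean matches the slide's M = 1.9999.

```python
from statistics import mean, median, mode

# Hypothetical population: 9,999 people with two legs, one person with one leg.
legs = [2] * 9999 + [1]

m = mean(legs)                # 19999 / 10000
mdn = median(legs)
mo = mode(legs)
share_above_mean = sum(x > m for x in legs) / len(legs)

print(f"Mean = {m}, Median = {mdn}, Mode = {mo}")
print(f"Share of people above the mean: {share_above_mean:.2%}")
```

Here the median and mode both describe the typical person, while the mean describes nobody.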

37

Standard deviation (2nd moment)

• SD = square root of the variance: $SD = \sqrt{\dfrac{\sum (X - \bar{X})^2}{N - 1}}$
• For normally distributed interval or ratio data
• Affected by outliers
• Can also derive the Standard Error (SE) = SD / square root of N
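A worked sketch of the SD and SE formulas above, using a small hypothetical set of scores. The hand calculation is compared with `statistics.stdev` and `statistics.variance`, which also use the N - 1 denominator.

```python
import math
from statistics import mean, stdev, variance

scores = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical scores
n = len(scores)

m = mean(scores)                                     # 40 / 8 = 5
sum_of_squares = sum((x - m) ** 2 for x in scores)   # 9+1+1+1+0+0+4+16 = 32
var = sum_of_squares / (n - 1)                       # sample variance, 32 / 7
sd = math.sqrt(var)                                  # standard deviation
se = sd / math.sqrt(n)                               # standard error of the mean

print(f"M = {m}, Var = {var:.3f}, SD = {sd:.3f}, SE = {se:.3f}")
print("Matches statistics.stdev:", math.isclose(sd, stdev(scores)))
```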

38

Skewness (3rd moment)

• Lean of distribution
– +ve = tail to right
– -ve = tail to left

• Can be caused by an outlier, or ceiling or floor effects

• Can be accurate (e.g., cars owned per person would have a skewed distribution)

39

Skewness (3rd moment) (with ceiling and floor effects)

● Negative skew
● Ceiling effect

● Positive skew
● Floor effect

40

Kurtosis (4th moment)

• Flatness or peakedness of distribution
– +ve = peaked
– -ve = flattened
• By altering the X &/or Y axis, any distribution can be made to look more peaked or flat – add a normal curve to help judge kurtosis visually.

41

Kurtosis (4th moment)

Blue = Negative (platykurtic)

Red = Positive (leptokurtic)

42

Judging severity of skewness & kurtosis

• View histogram with normal curve
• Deal with outliers
• Rule of thumb: Skewness and kurtosis between -1 and +1 are generally considered sufficiently normal for meeting the assumptions of parametric inferential statistics
• Significance tests of skewness: Tend to be overly sensitive (therefore avoid using)
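The rule of thumb above could be applied along the following lines. This sketch uses the simple moment-based definitions of skewness and excess kurtosis; statistical packages typically apply small-sample adjustments, so their output may differ slightly. The data sets and function names are hypothetical.

```python
def central_moment(data, k):
    """k-th central moment: the average of (x - mean) ** k."""
    m = sum(data) / len(data)
    return sum((x - m) ** k for x in data) / len(data)


def skewness(data):
    """Moment-based skewness (0 for a symmetrical distribution)."""
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5


def excess_kurtosis(data):
    """Moment-based kurtosis minus 3 (0 for a normal distribution)."""
    return central_moment(data, 4) / central_moment(data, 2) ** 2 - 3


def roughly_normal(data, limit=1.0):
    """Rule of thumb: both statistics lie between -limit and +limit."""
    return abs(skewness(data)) < limit and abs(excess_kurtosis(data)) < limit


symmetric = [1, 2, 2, 3, 3, 3, 4, 4, 5]        # hypothetical, bell-like
floor_effect = [0, 0, 0, 0, 0, 1, 1, 2, 3, 9]  # hypothetical, piled up at 0

for name, data in [("symmetric", symmetric), ("floor effect", floor_effect)]:
    print(f"{name}: skew = {skewness(data):.2f}, "
          f"kurtosis = {excess_kurtosis(data):.2f}, "
          f"roughly normal = {roughly_normal(data)}")
```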

43

Areas under the normal curve

If distribution is normal (bell-shaped - or close):

~68% of scores within +/- 1 SD of M
~95% of scores within +/- 2 SD of M
~99.7% of scores within +/- 3 SD of M
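These areas can be verified from the standard normal cumulative distribution function. The sketch below uses `statistics.NormalDist` from the Python standard library.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: M = 0, SD = 1

# Proportion of scores within +/- k SD of the mean.
areas = {k: z.cdf(k) - z.cdf(-k) for k in (1, 2, 3)}

for k, area in areas.items():
    print(f"Within +/- {k} SD of M: {area:.1%}")
```

The exact proportion within +/- 2 SD is about 95.4%; exactly 95% of scores fall within about +/- 1.96 SD.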

44

Areas under the normal curve

45

Non-normal distributions

46

Types of non-normal distribution

• Modality
– Uni-modal (one peak)
– Bi-modal (two peaks)
– Multi-modal (more than two peaks)
• Skewness
– Positive (tail to right)
– Negative (tail to left)
• Kurtosis
– Platykurtic (Flat)
– Leptokurtic (Peaked)

47

Non-normal distributions

48

Histogram of weight

[Figure: histogram of WEIGHT (X-axis 40.0 to 110.0; Y-axis Frequency 0 to 8). Mean = 69.6, Std. Dev = 17.10, N = 20]

49

Histogram of daily calorie intake

50

Histogram of fertility

51

Example ‘normal’ distribution 1

[Figure: histogram (X-axis 0 to 140; Y-axis Frequency 0 to 60). Mean = 81.21, Std. Dev. = 18.228, N = 188]

52

Example ‘normal’ distribution 2

[Figure: bar chart of Count (0 to 60) by Femininity-Masculinity (Very masculine, Fairly masculine, Androgynous, Fairly feminine, Very feminine)]

This bimodal graph actually consists of two different, underlying normal distributions.

53

Example ‘normal’ distribution 2

[Figure: three bar charts of Count by Femininity-Masculinity (Very masculine, Fairly masculine, Androgynous, Fairly feminine, Very feminine): the overall distribution, the distribution for males (Gender: male), and the distribution for females (Gender: female)]

54

Non-normal distribution: Use non-parametric descriptive statistics

• Min. & Max.
• Range = Max. - Min.
• Percentiles
• Quartiles
– Q1
– Mdn (Q2)
– Q3
– IQR (Q3 - Q1)
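The non-parametric descriptives above can be sketched with `statistics.quantiles`. The data are hypothetical and include one extreme score. Quartile definitions vary between software packages; this sketch relies on the function's default ("exclusive") method, so other tools may report slightly different Q1 and Q3 values.

```python
from statistics import quantiles

data = [1, 2, 3, 4, 5, 6, 70]   # hypothetical scores with one extreme value

minimum, maximum = min(data), max(data)
data_range = maximum - minimum
q1, mdn, q3 = quantiles(data, n=4)   # cut points for four equal groups
iqr = q3 - q1

print(f"Min = {minimum}, Max = {maximum}, Range = {data_range}")
print(f"Q1 = {q1}, Mdn (Q2) = {mdn}, Q3 = {q3}, IQR = {iqr}")
```

The extreme score inflates the range but leaves the median and IQR untouched.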

55

Effects of skew on measures of central tendency

+vely skewed distributions: mode < median < mean

symmetrical (normal) distributions: mean = median = mode

-vely skewed distributions: mean < median < mode
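The three orderings above can be illustrated with small hypothetical data sets. The negatively skewed set is simply the mirror image of the positively skewed one.

```python
from statistics import mean, median, mode

positive = [1, 1, 1, 2, 2, 3, 4, 10]       # tail to the right
symmetric = [1, 2, 2, 3, 3, 3, 4, 4, 5]    # no skew
negative = [11 - x for x in positive]      # mirror image: tail to the left

for name, data in [("+ve skew", positive),
                   ("symmetrical", symmetric),
                   ("-ve skew", negative)]:
    print(f"{name}: mode = {mode(data)}, "
          f"median = {median(data)}, mean = {mean(data)}")
```

In each skewed case the mean is pulled furthest toward the tail, the mode stays at the peak, and the median sits between them.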

56

Effects of skew on measures of central tendency

57

Transformations

• Converts data using various formulae to achieve normality and allow more powerful tests

• Loses original metric
• Complicates interpretation
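As one illustration, a natural-log transformation is a common choice for positively skewed data; the slides do not name a specific formula, so the choice of log here is an assumption. The scores are hypothetical, and skewness is computed with the simple moment-based formula.

```python
import math


def skewness(data):
    """Moment-based skewness: third central moment / (variance ** 1.5)."""
    n = len(data)
    m = sum(data) / n
    m2 = sum((x - m) ** 2 for x in data) / n
    m3 = sum((x - m) ** 3 for x in data) / n
    return m3 / m2 ** 1.5


# Hypothetical, positively skewed scores (all > 0, as the log requires).
raw = [1, 2, 2, 3, 4, 5, 8, 13, 30, 100]
logged = [math.log(x) for x in raw]

skew_raw = skewness(raw)
skew_logged = skewness(logged)

# Back-transforming the mean of the logs gives the geometric mean,
# which is not the same quantity as the arithmetic mean of the raw scores.
raw_mean = sum(raw) / len(raw)
back_transformed = math.exp(sum(logged) / len(logged))

print(f"Skewness: raw = {skew_raw:.2f}, logged = {skew_logged:.2f}")
print(f"Raw mean = {raw_mean:.1f}, back-transformed log mean = {back_transformed:.1f}")
```

The gap between the two means shows what losing the original metric costs: results on the log scale do not translate directly back into the original units.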

58

Review questions

1. If a survey question produces a ‘floor effect’, where will the mean, median and mode lie in relation to one another?

59

Review questions

2. Would you expect the mean # of cars owned in Australia to exceed the median?

60

Review questions

3. Would you expect the mean score on an easy test to exceed the median performance?

63

Science is beautiful (Nature Video)

(YouTube – 5:30 mins)

64

Is Pivot a turning point for web exploration?

(Gary Flake)

(TED talk - 6 min.)

65

Principles of graphing

• Clear purpose
• Maximise clarity
• Minimise clutter
• Allow visual comparison

66

Graphs (Edward Tufte)

• Visualise data
• Reveal data
– Describe
– Explore
– Tabulate
– Decorate
• Communicate complex ideas with clarity, precision, and efficiency

67

Graphing steps

1. Identify purpose of the graph (make large amounts of data coherent; present many #s in small space; encourage the eye to make comparisons)
2. Select type of graph to use
3. Draw and modify graph to be clear, non-distorting, and well-labelled (maximise clarity, minimise clutter; show the data; avoid distortion; reveal data at several levels/layers)

68

Software for data visualisation (graphing)

1. Statistical packages
● e.g., SPSS Graphs or via Analyses
2. Spreadsheet packages
● e.g., MS Excel
3. Word-processors
● e.g., MS Word – Insert – Object – Microsoft Graph Chart

69

Cleveland’s hierarchy

70

Univariate graphs

• Bar graph
• Pie chart
• Histogram
• Stem & leaf plot
• Data plot / Error bar
• Box plot

Non-parametric: i.e., nominal, ordinal, or non-normal interval or ratio
Parametric: i.e., normally distributed interval or ratio

71

Bar chart (Bar graph)

[Figure: two bar charts of Count by AREA (Biology, Anthropology, Information Technolo..., Psychology, Sociology). One uses a truncated Y-axis (9 to 13); the other starts the Y-axis at 0 (0 to 12).]

• Allows comparison of heights of bars
• X-axis: Collapse if too many categories
• Y-axis: Count/Frequency or % - truncation exaggerates differences
• Can add data labels (data values for each bar)

72

Pie chart

[Figure: pie chart with slices for Biology, Anthropology, Information Technolo..., Psychology, Sociology]

• Use a bar chart instead
• Hard to read
– Difficult to show small values and small differences
– Rotation of chart and position of slices influence perception

73

Histogram

[Figure: three histograms of Participant Age drawn with different numbers of categories; each reports Mean = 24.0, Std. Dev = 9.16, N = 5575]

• For continuous data (Likert?, Ratio)
• X-axis needs a happy medium for # of categories
• Y-axis matters (can exaggerate)

74

Histogram of male & female heights

Wild & Seber (2000)

75

Stem & leaf plot

● Use for ordinal, interval and ratio data (if rounded)
● May look confusing to unfamiliar reader

76

Stem & leaf plot

• Contains actual data
• Collapses tails
• Underused alternative to histogram

Frequency    Stem &  Leaf
     7.00       1 .  &
   192.00       1 .  22223333333
   541.00       1 .  444444444444444455555555555555
   610.00       1 .  6666666666666677777777777777777777
   849.00       1 .  88888888888888888888888888899999999999999999999
   614.00       2 .  0000000000000000111111111111111111
   602.00       2 .  222222222222222233333333333333333
   447.00       2 .  4444444444444455555555555
   291.00       2 .  66666666677777777
   240.00       2 .  88888889999999
   167.00       3 .  000001111
   146.00       3 .  22223333
   153.00       3 .  44445555
   118.00       3 .  666777
    99.00       3 .  888999
   106.00       4 .  000111
    54.00       4 .  222
   339.00 Extremes    (>=43)
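A stem & leaf plot is simple enough to build by hand. The sketch below is a simplified illustration, not a reproduction of the statistical package's layout: it uses the tens digit as the stem, prints one leaf per case, and does not collapse extremes. The ages and the function name are hypothetical.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Return display lines: tens digit as stem, units digits as leaves."""
    stems = defaultdict(list)
    for v in sorted(values):
        stems[v // 10].append(v % 10)
    lines = []
    # Include empty stems so gaps in the distribution stay visible.
    for stem in range(min(stems), max(stems) + 1):
        leaves = "".join(str(d) for d in stems.get(stem, []))
        lines.append(f"{stem} | {leaves}")
    return lines


ages = [12, 15, 18, 21, 21, 24, 27, 30, 33, 45]   # hypothetical ages
plot = stem_and_leaf(ages)
print("\n".join(plot))
```

Each row reads like a histogram bar turned on its side, while the leaves keep the actual values.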

77

Box plot (Box & whisker)

● Useful for interval and ratio data

● Represents min., max, median, quartiles, & outliers

78

Box plot (Box & whisker)

• Alternative to histogram
• Useful for screening
• Useful for comparing variables
• Can get messy - too much info
• Confusing to unfamiliar reader

[Figure: box plots of Time Management-T1 and Self-Confidence-T1 (Y-axis 0 to 10) by Participant Gender (Female, Male, Missing), with many outlying cases labelled by case number]

79

Data plot & error bar

[Figures: Data plot; Error bar]

80

Line graph

• Alternative to histogram
• Implies continuity e.g., time
• Can show multiple lines

[Figure: line graph of Mean (Y-axis 5.0 to 8.0) for OVERALL SCALES-T0, OVERALL SCALES-T1, OVERALL SCALES-T2, and OVERALL SCALES-T3]

81

Graphical integrity

(part of academic integrity)

82

"Like good writing, good graphical displays of data communicate ideas with clarity, precision, and efficiency. Like poor writing, bad graphical displays distort or obscure the data, make it harder to understand or compare, or otherwise thwart the communicative effect which the graph should convey."

Michael Friendly – Gallery of Data Visualisation

83

Tufte’s graphical integrity

• Some lapses intentional, some not
• Lie Factor = (size of effect in graph) / (size of effect in data)
• Misleading uses of area
• Misleading uses of perspective
• Leaving out important context
• Lack of taste and aesthetics
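The Lie Factor can be worked through for a bar chart with a truncated Y-axis. This sketch assumes "size of effect" means the relative change from the first value to the second; the counts and the axis baseline are hypothetical.

```python
def effect_size(first, second):
    """Relative change from the first value to the second."""
    return (second - first) / first


def lie_factor(graph_first, graph_second, data_first, data_second):
    """Size of effect shown in the graph divided by size of effect in the data."""
    return (effect_size(graph_first, graph_second)
            / effect_size(data_first, data_second))


# Hypothetical bar chart: counts of 10 and 12 drawn on a Y-axis that starts at 9,
# so the bars are drawn 1 and 3 units tall.
baseline = 9
data = (10, 12)
drawn = (data[0] - baseline, data[1] - baseline)

truncated = lie_factor(*drawn, *data)   # (2 / 1) / (2 / 10)
honest = lie_factor(*data, *data)       # bars drawn from zero

print(f"Lie Factor with truncated axis: {truncated:.1f}")
print(f"Lie Factor with axis from zero: {honest:.1f}")
```

A Lie Factor of 1 means the graph shows the effect at its true size; here the truncated axis makes a 20% difference look like a 200% difference.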

84

Review exercise: Fill in the cells in this table

Level                   Properties   Examples   Descriptive Statistics   Graphs
Nominal / Categorical
Ordinal / Rank
Interval
Ratio

Answers: http://goo.gl/Ln9e1

85

References

1. Chambers, J., Cleveland, B., Kleiner, B., & Tukey, P. (1983). Graphical methods for data analysis. Boston, MA: Duxbury Press.
2. Cleveland, W. S. (1985). The elements of graphing data. Monterey, CA: Wadsworth.
3. Jones, G. E. (2006). How to lie with charts. Santa Monica, CA: LaPuerta.
4. Tufte, E. R. (1983). The visual display of quantitative information. Cheshire, CT: Graphics Press.
5. Tufte, E. R. (2001). Visualizing quantitative data. Cheshire, CT: Graphics Press.
6. Tukey, J. (1977). Exploratory data analysis. Addison-Wesley.
7. Wild, C. J., & Seber, G. A. F. (2000). Chance encounters: A first course in data analysis and inference. New York: Wiley.

