
HLTH 300 Biostatistics for Public Health Practice, Raul Cruz-Cano, Ph.D.

Page 1: HLTH 300 Biostatistics for Public Health Practice, Raul Cruz-Cano, Ph.D.

© 2014 by Pearson Higher Education, Inc., Upper Saddle River, New Jersey 07458 • All Rights Reserved

HLTH 300 Biostatistics for Public Health Practice
Raul Cruz-Cano, Ph.D.
2/3/2014, Spring 2014

Fox/Levin/Forde, Elementary Statistics in Social Research, 12e

Chapter 2: Organizing the Data

Page 2: HLTH 300 Biostatistics for Public Health Practice, Raul Cruz-Cano, Ph.D.

Announcement

With regard to the homework: you are allowed to literally “copy” and “paste” the problem from the book

Page 3: HLTH 300 Biostatistics for Public Health Practice, Raul Cruz-Cano, Ph.D.

CHAPTER OBJECTIVES

2.1 Create frequency distributions of nominal data
2.2 Calculate proportions, percentages, ratios, and rates
2.3 Create simple and grouped frequency distributions
2.4 Create cross-tabulations
2.5 Distinguish between various forms of graphic presentations

Page 4: HLTH 300 Biostatistics for Public Health Practice, Raul Cruz-Cano, Ph.D.

Learning Objectives
After this lecture, you should be able to complete the following Learning Outcomes

2.1 Create frequency distributions of nominal data

Page 5: HLTH 300 Biostatistics for Public Health Practice, Raul Cruz-Cano, Ph.D.

2.1 Introduction

Formulas and statistical techniques are used by researchers to:
• Organize raw data
• Test hypotheses

Raw data is often difficult to synthesize

Frequency tables make raw data easier to understand

Page 6: HLTH 300 Biostatistics for Public Health Practice, Raul Cruz-Cano, Ph.D.

2.1 Frequency Distributions of Nominal Data

Responses of Young Boys to Removal of Toy

Response of Child          f
Cry                       25
Express Anger             15
Withdraw                   5
Play with another toy      5
                      N = 50

Characteristics of a frequency distribution of nominal data:
• Title
• Consists of two columns:
  • Left column: characteristics (e.g., Response of Child)
  • Right column: frequency (f)
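As a rough illustration of how such a two-column table can be tallied from raw observations, here is a minimal Python sketch. The list `responses` is a hypothetical stand-in for the raw data, built only so that its counts match the table on this slide; `collections.Counter` does the tallying.

```python
from collections import Counter

# Hypothetical raw observations, constructed to match the slide's counts.
responses = (
    ["Cry"] * 25
    + ["Express Anger"] * 15
    + ["Withdraw"] * 5
    + ["Play with another toy"] * 5
)

# Counter tallies how often each category occurs.
frequencies = Counter(responses)
n_total = sum(frequencies.values())

# Two columns: the characteristic on the left, the frequency (f) on the right.
print(f"{'Response of Child':<24}{'f':>3}")
for category, f in frequencies.most_common():
    print(f"{category:<24}{f:>3}")
print(f"{'N =':<24}{n_total:>3}")
```

With nominal data the row order carries no meaning, so sorting by frequency (as `most_common()` does) is only a presentation choice.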

Page 7: HLTH 300 Biostatistics for Public Health Practice, Raul Cruz-Cano, Ph.D.

2.1 Comparing Distributions

Comparing distributions clarifies results, adds information, and shows how groups differ

Response to Removal of Toy by Gender of Child

                          Gender of Child
Response of Child          Male    Female
Cry                         25       28
Express Anger               15        3
Withdraw                     5        4
Play with another toy        5       15
Total                       50       50

Page 8: HLTH 300 Biostatistics for Public Health Practice, Raul Cruz-Cano, Ph.D.

Learning Objectives
After this lecture, you should be able to complete the following Learning Outcomes

2.2 Calculate proportions, percentages, ratios, and rates

Page 9: HLTH 300 Biostatistics for Public Health Practice, Raul Cruz-Cano, Ph.D.

2.2 Proportions and Percentages

Allow for a comparison of groups of different sizes

Proportion – the number of cases compared to the total size of the distribution

$$P = \frac{f}{N}$$

Percentage – the frequency of occurrence of a category per 100 cases

$$\% = (100)\frac{f}{N}$$

Page 10: HLTH 300 Biostatistics for Public Health Practice, Raul Cruz-Cano, Ph.D.

Examples

Responses of Young Boys to Removal of Toy

Response of Child          f
Cry                       25
Express Anger             15
Withdraw                   5
Play with another toy      5
                      N = 50

Proportion of children that cried

$$P = \frac{f}{N} = \frac{25}{50} = .5$$

Percentage of children that cried

$$\% = (100)\frac{f}{N} = (100)\frac{25}{50} = 50\%$$
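The proportion and percentage formulas, P = f/N and % = (100) f/N, can be sketched in a few lines of Python. The function names are illustrative, and the second group (30 out of 120) is an assumed example added only to show how groups of different sizes become comparable.

```python
def proportion(f: int, n: int) -> float:
    """Proportion P = f / N."""
    if n <= 0:
        raise ValueError("N must be a positive number of cases")
    return f / n


def percentage(f: int, n: int) -> float:
    """Percentage % = (100) f / N."""
    return 100 * proportion(f, n)


# Boys who cried, from the toy-removal table: f = 25 out of N = 50.
p_cried = proportion(25, 50)
pct_cried = percentage(25, 50)

# Hypothetical second group of a different size (assumed numbers).
pct_other = percentage(30, 120)

print(f"P = {p_cried}, % = {pct_cried}%")
print(f"Hypothetical group: {pct_other}%")
```

Although the hypothetical group has more cases in absolute terms (30 versus 25), its percentage is lower, which is the kind of comparison raw frequencies would hide.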

Page 11: HLTH 300 Biostatistics for Public Health Practice, Raul Cruz-Cano, Ph.D.

2.2 Ratio and Rates

Ratio – compares the frequency of one category to another

$$\text{Ratio} = \frac{f_1}{f_2}$$

Rate – compares the number of actual cases with the number of potential cases

$$\text{Rate} = (1{,}000)\frac{f\ \text{actual cases}}{f\ \text{potential cases}}$$

Page 12: HLTH 300 Biostatistics for Public Health Practice, Raul Cruz-Cano, Ph.D.

Examples

Responses of Young Boys to Removal of Toy

Response of Child          f
Cry                       25
Express Anger             15
Withdraw                   5
Play with another toy      5
                      N = 50

Ratio of children that cried for every child that withdrew

$$\text{Ratio} = \frac{f_1}{f_2} = \frac{25}{5} = 5$$

Rate of children that cried

$$\text{Rate} = (1{,}000)\frac{f\ \text{actual cases}}{f\ \text{potential cases}} = (1{,}000)\frac{25}{50} = 500 \text{ per 1,000 children}$$
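The ratio and rate formulas can be checked with a small Python sketch. The function names and the `base` parameter are illustrative assumptions; the counts come from the toy-removal table (25 boys cried, 5 withdrew, 50 boys in total).

```python
def ratio(f1: int, f2: int) -> float:
    """Ratio = f1 / f2: frequency of one category relative to another."""
    if f2 == 0:
        raise ValueError("f2 must be nonzero")
    return f1 / f2


def rate(actual_cases: int, potential_cases: int, base: int = 1000) -> float:
    """Rate = (base) * actual cases / potential cases."""
    if potential_cases <= 0:
        raise ValueError("potential_cases must be positive")
    return base * actual_cases / potential_cases


cried, withdrew, n_boys = 25, 5, 50

cry_to_withdraw = ratio(cried, withdrew)  # boys who cried per boy who withdrew
cry_rate = rate(cried, n_boys)            # boys who cried per 1,000 boys

print(f"Ratio: {cry_to_withdraw} to 1")
print(f"Rate: {cry_rate} per 1,000 children")
```

The base of 1,000 follows the slide; other bases (for example, per 100,000) are common when the event is rare.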

Page 13: HLTH 300 Biostatistics for Public Health Practice, Raul Cruz-Cano, Ph.D.

Learning Objectives
After this lecture, you should be able to complete the following Learning Outcomes

2.3 Create simple and grouped frequency distributions

Page 14: HLTH 300 Biostatistics for Public Health Practice, Raul Cruz-Cano, Ph.D.

2.3

Table 2.4

Page 15: HLTH 300 Biostatistics for Public Health Practice, Raul Cruz-Cano, Ph.D.

2.3

Table 2.5

Not in Order!

Page 16: HLTH 300 Biostatistics for Public Health Practice, Raul Cruz-Cano, Ph.D.

2.3 Grouped Frequency Distribution of Interval Data

Used to clarify the presentation of interval-level scores spread over a wide range

Class Intervals
• Smaller categories or groups containing more than one score
• Class interval size determined by the number of score values it contains

Page 17: HLTH 300 Biostatistics for Public Health Practice, Raul Cruz-Cano, Ph.D.

2.3 Class Limits and the Midpoint

Class Limits
• The point halfway between adjacent intervals
• Upper and lower limits
  – The distance between the upper and lower limits determines the size of the class interval

$$i = U - L$$

where
  i = size of a class interval
  U = upper limit of a class interval
  L = lower limit of a class interval

The Midpoint
• The middlemost score value in a class interval
  – The sum of the lowest and highest value in a class interval divided by two

$$m = \frac{\text{lowest score value} + \text{highest score value}}{2}$$

Careful: many times they are not as evident as they seem

Page 18: HLTH 300 Biostatistics for Public Health Practice, Raul Cruz-Cano, Ph.D.

More about the length of class intervals

TABLE 2.7 Grouped Frequency Distribution of Final-Examination Grades for 71 Students

Class Interval     f
50-54              4
55-59              5
60-64              5
65-69             12
70-74             17
75-79             12
80-84              7
85-89              4
90-94              2
95-99              3
Total             71

Usually the second category would be considered to run from 54.5 to 59.5

But notice that in a survey about age the respondents would consider it to run from 55.0 to 59 + (364/365)

In other words “…comes down to personal preference, feasibility and logical sense, not what is strictly right or wrong” (page 52)

Midpoint = (55 + 59)/2 = 114/2 = 57
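A short Python sketch of the class-size and midpoint arithmetic for the 55-59 interval, assuming the usual convention that class limits sit halfway between adjacent intervals (54.5 and 59.5). The function names are illustrative, not from the textbook.

```python
def class_size(lower_limit: float, upper_limit: float) -> float:
    """Size of a class interval: i = U - L."""
    return upper_limit - lower_limit


def midpoint(lowest_score: float, highest_score: float) -> float:
    """Midpoint: m = (lowest score value + highest score value) / 2."""
    return (lowest_score + highest_score) / 2


# Interval 55-59 with class limits halfway to the neighbouring intervals.
size_grades = class_size(54.5, 59.5)
m_grades = midpoint(55, 59)

# The size also equals the number of whole score values in the interval.
score_values = list(range(55, 60))  # 55, 56, 57, 58, 59

print(f"i = {size_grades}, m = {m_grades}, score values = {len(score_values)}")
```

Note that subtracting the stated scores (59 - 55 = 4) would undercount the size; the limits, not the lowest and highest score values, go into i = U - L.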

Page 19: HLTH 300 Biostatistics for Public Health Practice, Raul Cruz-Cano, Ph.D.

2.3 Cumulative Distributions

Cumulative Frequencies
• Total number of cases having a given score or a score that is lower
  – Shown as cf
  – Obtained by the sum of frequencies in that category plus all lower categories’ frequencies

Cumulative Percentage
• Percentage of cases having a given score or a score that is lower

$$c\% = (100)\frac{cf}{N}$$

Page 20: HLTH 300 Biostatistics for Public Health Practice, Raul Cruz-Cano, Ph.D.

2.3 Table 2.7

TABLE 2.7 Grouped Frequency Distribution of Final-Examination Grades for 71 Students

Class Interval     f       %     cf      c%
50-54              4     5.63     4     5.63
55-59              5     7.04     9    12.68
60-64              5     7.04    14    19.72
65-69             12    16.90    26    36.62
70-74             17    23.94    43    60.56
75-79             12    16.90    55    77.46
80-84              7     9.86    62    87.32
85-89              4     5.63    66    92.96
90-94              2     2.82    68    95.77
95-99              3     4.23    71   100.00
Total             71   100.00
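The cf and c% columns of Table 2.7 can be reproduced with a running sum. The sketch below is one possible way to do it in Python with `itertools.accumulate`; the variable names are illustrative, and intervals are identified only by their lowest score value.

```python
from itertools import accumulate

# Lowest score value of each class interval, from lowest to highest,
# and the frequency (f) of each interval, as listed in Table 2.7.
interval_starts = [50, 55, 60, 65, 70, 75, 80, 85, 90, 95]
frequencies = [4, 5, 5, 12, 17, 12, 7, 4, 2, 3]

n_total = sum(frequencies)

# cf: frequency of the category plus all lower categories' frequencies.
cum_freq = list(accumulate(frequencies))

# c% = (100) cf / N, rounded to two decimals as in the table.
cum_pct = [round(100 * cf / n_total, 2) for cf in cum_freq]

for start, f, cf, c_pct in zip(interval_starts, frequencies, cum_freq, cum_pct):
    print(f"from {start}: f={f:>2}  cf={cf:>2}  c%={c_pct:6.2f}")
```

The accumulation must run from the lowest interval upward; reversing the order would give the number of cases at or above each interval instead.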

Page 21: HLTH 300 Biostatistics for Public Health Practice, Raul Cruz-Cano, Ph.D.

2.3 Percentiles

The percentage of cases falling at or below a given score

• Deciles – points that divide a distribution into 10 equally sized portions
• Quartiles – points that divide a distribution into quarters
• Median – the point that divides a distribution in two, half above it and half below it

Let’s talk about it after the Frequency Polygons and Line Charts

Page 22: HLTH 300 Biostatistics for Public Health Practice, Raul Cruz-Cano, Ph.D.

Learning Objectives
After this lecture, you should be able to complete the following Learning Outcomes

2.4 Create cross-tabulations

Page 23: HLTH 300 Biostatistics for Public Health Practice, Raul Cruz-Cano, Ph.D.

2.4 Table 2.17

Notice that sometimes it is useful to divide the data using more than one variable, e.g., by Relationship and by Victim Sex

Page 24: HLTH 300 Biostatistics for Public Health Practice, Raul Cruz-Cano, Ph.D.

2.4 Percents within Cross-Tabulations

Total Percents: $\text{total}\% = (100)\frac{f}{N_{\text{total}}}$

Row Percents: $\text{row}\% = (100)\frac{f}{N_{\text{row}}}$

Column Percents: $\text{column}\% = (100)\frac{f}{N_{\text{column}}}$

The choice comes down to which is more relevant to the purpose of the analysis

• If the independent variable is on the rows, use row percents
• If the independent variable is on the columns, use column percents
• If the independent variable is unclear, use whichever percent is most meaningful

Page 25: HLTH 300 Biostatistics for Public Health Practice, Raul Cruz-Cano, Ph.D.

Page 26: HLTH 300 Biostatistics for Public Health Practice, Raul Cruz-Cano, Ph.D.

Solution

a) Does your class determine whether you buy a new car? Or does buying a new car determine your class?
b) Pct. New Car = 100(17/73) = 23.29%
c) Pct. Upper class with new car = 100(10/33) = 30.30%
d) Pct. Middle class with new car = 100(6/27) = 22.22%
e) Pct. Lower class with new car = 100(1/13) = 7.69%
f) Effect of social class on buying a new car?

                No New Car    New Car    Row Total
Upper Class         23           10          33
Middle Class        21            6          27
Lower Class         12            1          13
Column Total        56           17          73
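Total, row, and column percents for a cross-tabulation like this one can be sketched in Python as follows. The nested-dictionary layout and function names are assumptions made for illustration; the cell counts are those of the table on this slide.

```python
# Cell frequencies: social class (rows) by car purchase (columns).
table = {
    "Upper Class": {"No New Car": 23, "New Car": 10},
    "Middle Class": {"No New Car": 21, "New Car": 6},
    "Lower Class": {"No New Car": 12, "New Car": 1},
}
columns = ["No New Car", "New Car"]

# Marginal totals.
row_totals = {row: sum(cells.values()) for row, cells in table.items()}
column_totals = {col: sum(table[row][col] for row in table) for col in columns}
n_total = sum(row_totals.values())


def total_percent(row: str, col: str) -> float:
    """total% = (100) f / N_total"""
    return 100 * table[row][col] / n_total


def row_percent(row: str, col: str) -> float:
    """row% = (100) f / N_row"""
    return 100 * table[row][col] / row_totals[row]


def column_percent(row: str, col: str) -> float:
    """column% = (100) f / N_column"""
    return 100 * table[row][col] / column_totals[col]


# Social class is on the rows, so row percents answer the question.
for row in table:
    print(f"{row}: {row_percent(row, 'New Car'):.2f}% bought a new car")
print(f"Overall: {100 * column_totals['New Car'] / n_total:.2f}% bought a new car")
```

Row percents are used here on the assumption that social class is the independent variable; if the analysis treated car purchase as the independent variable, `column_percent` would be the relevant function.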

Page 27: HLTH 300 Biostatistics for Public Health Practice, Raul Cruz-Cano, Ph.D.

Page 28: HLTH 300 Biostatistics for Public Health Practice, Raul Cruz-Cano, Ph.D.

Solution

Score Value     f
39              4
38              4
35              2
32              3
31              4
27              9
26              7
25              6
21             13
20             10
17              5
15              7
Total          74

Class Interval    f    Midpoint         Percentage           cf          Cum. Pct.
15-19            12    (15+19)/2=17     100(12/74)=16.21     12          100(12/74)=16.21
20-24            23    (20+24)/2=22     100(23/74)=31.08     12+23=35    100(35/74)=47.29
25-29            22    (25+29)/2=27     100(22/74)=29.72     35+22=57    100(57/74)=77.02
30-34             7    (30+34)/2=32     100(7/74)=9.45       57+7=64     100(64/74)=86.48
35-39            10    (35+39)/2=37     100(10/74)=13.51     64+10=74    100
Total            74                     99.97=100 approx.
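The grouping step in this solution, collapsing individual score values into class intervals of size 5, can be sketched in Python. The dictionary `scores` holds the score values and frequencies listed on this slide; `interval_size` and `lowest_start` are assumed parameters matching the 15-19, 20-24, ... intervals.

```python
# Simple frequency distribution: score value -> frequency.
scores = {39: 4, 38: 4, 35: 2, 32: 3, 31: 4, 27: 9,
          26: 7, 25: 6, 21: 13, 20: 10, 17: 5, 15: 7}

interval_size = 5   # number of score values per class interval
lowest_start = 15   # lowest score value of the first interval

# Add each score's frequency to the interval that contains it.
grouped = {}
for score, f in scores.items():
    offset = (score - lowest_start) // interval_size
    low = lowest_start + offset * interval_size
    high = low + interval_size - 1
    grouped[(low, high)] = grouped.get((low, high), 0) + f

n_total = sum(grouped.values())

# Walk the intervals from lowest to highest, accumulating cf as we go.
cf = 0
for (low, high), f in sorted(grouped.items()):
    cf += f
    mid = (low + high) / 2
    print(f"{low}-{high}  f={f:>2}  m={mid}  %={100 * f / n_total:.2f}  "
          f"cf={cf:>2}  c%={100 * cf / n_total:.2f}")
```

The printed percentages are rounded to two decimals, so a few differ in the last digit from the hand-computed values in the table, which appear to be truncated.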

Page 29: HLTH 300 Biostatistics for Public Health Practice, Raul Cruz-Cano, Ph.D.

Learning Objectives
After this lecture, you should be able to complete the following Learning Outcomes

2.5 Distinguish between various forms of graphic presentations

Page 30: HLTH 300 Biostatistics for Public Health Practice, Raul Cruz-Cano, Ph.D.

2.5

Figure 2.4 Pie Charts

Page 31: HLTH 300 Biostatistics for Public Health Practice, Raul Cruz-Cano, Ph.D.

2.5

Figure 2.6 Bar Graph & Histograms

Page 32: HLTH 300 Biostatistics for Public Health Practice, Raul Cruz-Cano, Ph.D.

2.5

Figure 2.9 Frequency polygons

(frequency indicated at midpoint of each class)

Page 33: HLTH 300 Biostatistics for Public Health Practice, Raul Cruz-Cano, Ph.D.

From Table 2.7

Class Interval   Midpoint     f       %     cf      c%
50-54              52.5       4     5.63     4     5.63
55-59              57.5       5     7.04     9    12.68
60-64              62.5       5     7.04    14    19.72
65-69              67.5      12    16.90    26    36.62
70-74              72.5      17    23.94    43    60.56
75-79              77.5      12    16.90    55    77.46
80-84              82.5       7     9.86    62    87.32
85-89              87.5       4     5.63    66    92.96
90-94              92.5       2     2.82    68    95.77
95-99              97.5       3     4.23    71   100.00
Total                        71   100.00

[Chart: horizontal axis shows the midpoints 52.5 to 97.5; vertical axis runs from 0 to 100]

50th percentile = 70 approx.

The smaller the class interval the better
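One way to read a percentile off the cumulative percentages is linear interpolation between neighbouring points. The sketch below assumes, as the slide's table does, that each c% value is paired with the listed midpoint; the function name is hypothetical and this is an approximation, not the textbook's procedure.

```python
midpoints = [52.5, 57.5, 62.5, 67.5, 72.5, 77.5, 82.5, 87.5, 92.5, 97.5]
cum_pct = [5.63, 12.68, 19.72, 36.62, 60.56, 77.46, 87.32, 92.96, 95.77, 100.0]


def score_at_percentile(target: float, xs: list, cum: list) -> float:
    """Interpolate the score at which the cumulative percentage reaches target."""
    if not cum[0] <= target <= cum[-1]:
        raise ValueError("target lies outside the tabulated cumulative range")
    points = list(zip(xs, cum))
    for (x0, c0), (x1, c1) in zip(points, points[1:]):
        if c0 <= target <= c1:
            if c1 == c0:
                return x0
            # Straight line between the two neighbouring points.
            return x0 + (x1 - x0) * (target - c0) / (c1 - c0)
    raise ValueError("cumulative percentages must be nondecreasing")


median_estimate = score_at_percentile(50, midpoints, cum_pct)
print(f"50th percentile is roughly {median_estimate:.1f}")
```

The estimate lands close to the slide's reading of about 70. Narrower class intervals place the points closer together, so the straight-line approximation between them becomes less coarse.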

Page 34: HLTH 300 Biostatistics for Public Health Practice, Raul Cruz-Cano, Ph.D.

2.5

Figure 2.11

Taller than who? Flatter than who?

Page 35: HLTH 300 Biostatistics for Public Health Practice, Raul Cruz-Cano, Ph.D.

2.5

Figure 2.12

Page 36: HLTH 300 Biostatistics for Public Health Practice, Raul Cruz-Cano, Ph.D.

2.5

Figure 2.14 Line Chart (discrete values)

Page 37: HLTH 300 Biostatistics for Public Health Practice, Raul Cruz-Cano, Ph.D.

2.5

Figure 2.15 Maps

Page 38: HLTH 300 Biostatistics for Public Health Practice, Raul Cruz-Cano, Ph.D.

Let’s work in MS Excel

Page 39: HLTH 300 Biostatistics for Public Health Practice, Raul Cruz-Cano, Ph.D.

CHAPTER SUMMARY

2.1 Frequency distributions can be created to help researchers visualize distributions
2.2 Proportions, percentages, ratios, and rates can be calculated as a way to describe data
2.3 Simple frequency distributions can be created using data at any level of measurement, while interval-level data is needed to create a grouped frequency distribution
2.4 Cross-tabulations can be created to illustrate the relationship between two variables
2.5 Several forms of graphs can be used to demonstrate patterns and relationships between variables

Page 40: HLTH 300 Biostatistics for Public Health Practice, Raul Cruz-Cano, Ph.D.

Homework

Problems: 14 and 31

I know that they are not exactly the same as those solved in class

No Excel this time, but maybe next

