
© 2006 Dr Rotimi Adigun. Recommended text High Yield Biostatistics, Epidemiology and Public...

Date post: 16-Jan-2016
Upload: michael-goodwin

© 2006

Dr Rotimi Adigun

Recommended text

High Yield Biostatistics, Epidemiology and Public Health, Michael Glaser, Fourth edition.

Pre-Test Preventive Medicine and Public Health, Sylvie Ratelle (Chapter 1), for practice questions.

I. Descriptive statistics
- Populations, samples and elements
- Probability
- Types of data
- Frequency distribution
- Measures of central tendency
- Measures of variability
- Z scores

II. Inferential statistics
- Statistics and parameters
- Estimating mean
- T-scores
- Hypothesis testing
- Steps of hypothesis testing
- Z-tests
- The meaning of statistical significance
- Type I and II errors
- Power
- Differences between groups

III. Correlation and predictive techniques
- Correlation
- Survival analysis

IV. Research methods
- Sampling techniques
- Assessing the evidence
- Hierarchy of evidence
- Systematic review
- Clinical decision making

Descriptive Statistics

Populations, Samples, Elements, Types of Data, Measures of Central Tendency, Measures of Variability, Z-scores.

Descriptive statistics (DSs)
- A way to summarize data from a sample or a population
- DSs illustrate the shape, central tendency, and variability of a set of data
- The shape of data has to do with the frequencies of the values of observations

Population
- A population is the group from which a sample is drawn, e.g., automobile crash victims in an emergency room, or student scores on Block I exams at Windor
- In research, it is usually not practical to include all members of a population
- Thus, a sample (a subset of a population) is taken

Elements
- A single observation, such as one student's score, is an element, denoted by X
- The number of elements in a population is denoted by N
- The number of elements in a sample is denoted by n

Data
- Measurements or observations of a variable

Variable
- A characteristic that is observed or manipulated
- Can take on different values

Independent variables
- Precede dependent variables in time
- Are often manipulated by the researcher
- The treatment or intervention that is used in a study

Dependent variables
- What is measured as an outcome in a study
- Values depend on the independent variable

Independent and dependent variables, e.g.

- You study the effect of different drugs on colon cancer. The drugs, dosage, and timing would be the independent variables; the effect of the different drugs, dosages, and timing on the cancer would be the dependent variable.

- We compare the effect of stimulants on memory and retention. The independent variables would be the different stimulants and dosages. The dependent variables would be the different outcomes (improved retention, decreased retention, or no effect).

Mathematically, on a graph or in an equation where Y depends on the value of X, Y is a function of X, written Y = f(X)

The independent variable is usually plotted on the X axis, while the dependent variable is plotted on the Y axis.

Parameters
- Summary data from a population

Statistics
- Summary data from a sample

Central tendency describes the location of the middle of the data

Variability is the extent to which values are spread above and below the middle values (a.k.a. dispersion)

DSs can be distinguished from inferential statistics
- DSs are not capable of testing hypotheses

Mean (a.k.a. average)
- The most commonly used DS
- To calculate the mean, add all values of a series of numbers and then divide by the total number of elements

Mean of a sample: $\bar{X} = \frac{\sum X}{n}$

Mean of a population: $\mu = \frac{\sum X}{N}$

- $\bar{X}$ (X bar) refers to the mean of a sample and $\mu$ refers to the mean of a population
- $\sum X$ is a command that adds all of the X values
- n is the total number of values in the series of a sample and N is the same for a population
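The mean calculation can be sketched in a few lines of Python. The scores below are made-up illustration values, not data from the slides; the arithmetic is the same for a sample (divide by n) and for a population (divide by N).

```python
# Hypothetical exam scores (illustration only, not from the slides)
scores = [62, 70, 75, 81, 92]

# Sum of all X values divided by the number of elements
n = len(scores)
mean = sum(scores) / n

print(mean)  # 76.0
```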

Mode
- The most frequently occurring value in a series
- The modal value is the highest bar in a histogram

[Figure: histogram with the mode labeled]

Median
- The value that divides a series of values in half when they are all listed in order
- When there is an odd number of values, the median is the middle value
- When there is an even number of values, count from each end of the series toward the middle and then average the 2 middle values
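The odd/even rule for the median can be written out as a short Python function. This is only a sketch; the function name and the input values are chosen here for illustration.

```python
def median(values):
    """Median following the odd/even rule: middle value, or mean of the two middle values."""
    ordered = sorted(values)          # list the values in order
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:                    # odd count: take the middle value
        return ordered[middle]
    # even count: average the two middle values
    return (ordered[middle - 1] + ordered[middle]) / 2

print(median([3, 9, 5]))       # 5
print(median([3, 9, 5, 7]))    # 6.0
```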

Each of the three methods of measuring central tendency has certain advantages and disadvantages

Which method should be used?
- It depends on the type of data being analyzed (e.g., categorical, continuous) and the level of measurement involved

There are 4 levels of measurement: nominal, ordinal, interval, and ratio

1. Nominal
- Data are coded by a number, name, or letter that is assigned to a category or group
- No ordering
- Examples: gender (e.g., male, female); race; treatment preference (e.g., surgery, radiotherapy, hormone, chemotherapy)

2. Ordinal
- Is similar to nominal because the measurements involve categories
- However, the categories are ordered by rank
- Examples: pain level (e.g., mild, moderate, severe); military rank (e.g., lieutenant, captain, major, colonel, general); opinion (e.g., agree, strongly agree); severity of dysplasia (mild, moderate, severe)

Ordinal values only describe order, not quantity
- Thus, severe pain is not the same as 2 times mild pain

The only arithmetic allowed for nominal and ordinal data is counting of categories, e.g., 25 males and 30 females (ordinal categories can additionally be compared by rank)

3. Interval
- Measurements are ordered (like ordinal data)
- Have equal intervals
- Do not have a true zero
- Example: the Fahrenheit scale, where 0° does not correspond to an absence of heat (no true zero), in contrast to Kelvin, which does have a true zero

4. Ratio
- Measurements have equal intervals
- There is a true zero
- Ratio is the most advanced level of measurement and can handle most types of mathematical operations (including multiplication and division, which are not meaningful on interval scales)

Ratio examples

Range of motion
- No movement corresponds to zero degrees
- The interval between 10 and 20 degrees is the same as between 40 and 50 degrees

Lifting capacity
- A person who is unable to lift scores zero
- A person who lifts 30 kg can lift twice as much as one who lifts 15 kg

NOIR is a mnemonic to help remember the names and order of the levels of measurement
- Nominal
- Ordinal
- Interval
- Ratio

| Measurement scale | Permissible mathematical operations | Best measure of central tendency |
|---|---|---|
| Nominal | Counting | Mode |
| Ordinal | Greater-than or less-than operations | Median |
| Interval | Addition and subtraction | Symmetrical: mean; skewed: median |
| Ratio | Addition, subtraction, multiplication, and division | Symmetrical: mean; skewed: median |

Histograms of frequency distributions have shape

Distributions are often symmetrical, with most scores falling in the middle and fewer toward the extremes

Many biological data are approximately symmetrically distributed and form a normal curve (a.k.a. bell-shaped curve)

[Figure: line depicting the shape of the data]

Data that form a normal curve have a normal distribution (a.k.a. Gaussian distribution)

Properties of a normal distribution
- It is symmetric about its mean
- The highest point is at its mean
- The height of the curve decreases as one moves away from the mean in either direction, approaching, but never reaching, zero

[Figure: normal curve with the mean marked. Labels: a normal distribution is symmetric about its mean; the highest point of the overlying normal curve is at the mean; as one moves away from the mean in either direction the height of the curve decreases, approaching, but never reaching, zero]

[Figure: normal curve, Mean = Median = Mode]

The data are not distributed symmetrically in skewed distributions
- Consequently, the mean, median, and mode are not equal and are in different positions
- Scores are clustered at one end of the distribution
- A small number of extreme values are located at the opposite end

Skew is always toward the direction of the longer tail (not the hump)
- Positive if skewed to the right
- Negative if skewed to the left

The mean is shifted the most (toward the tail)

Because the mean is shifted so much, it is not the best estimate of the average score for skewed distributions

The median is a better estimate of the center of skewed distributions
- It will be the central point of any distribution
- 50% of the values are above and 50% below the median
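A small Python sketch can make the shift of the mean visible. The lengths of stay below are invented for illustration (they are not from the slides) and form a positively skewed set: most values cluster at the low end and one extreme value forms a long right tail.

```python
from statistics import mean, median

# Hypothetical lengths of hospital stay in days (illustration only)
stays = [2, 3, 3, 4, 4, 5, 30]

print(mean(stays))    # pulled toward the long right tail
print(median(stays))  # stays with the bulk of the data
```

Here the single value of 30 drags the mean above every other observation, while the median remains at the middle of the ordered values.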

Mean, Median and Mode

Summary statistics

- Midrange: $\frac{\text{smallest observation} + \text{largest observation}}{2}$
- Mode: the value which occurs with the greatest frequency, i.e., the most common value
- Median: the observation which lies in the middle of the ordered observations
- Arithmetic mean (mean): $\frac{\text{sum of all observations}}{\text{number of observations}}$
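As a rough illustration, the four summary statistics can be computed side by side for one small data set. The observations are hypothetical, and Python's standard `statistics` module is used for the mode, median, and mean.

```python
from statistics import mean, median, mode

# Hypothetical observations (illustration only, not from the slides)
obs = [4, 7, 7, 9, 13]

midrange = (min(obs) + max(obs)) / 2   # (smallest + largest) / 2

print(midrange)      # 8.5
print(mode(obs))     # 7
print(median(obs))   # 7
print(mean(obs))     # 8
```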

The mean represents the average of a group of scores, with some of the scores being above the mean and some below
- This range of scores is referred to as variability or spread

Measures of variability
- Range
- Variance
- Standard deviation
- Semi-interquartile range
- Coefficient of variation
- “Standard error”

The standard deviation (SD) is the average amount of spread in a distribution of scores

The next slide shows a group of 10 patients whose mean age is 40 years
- Some are older than 40 and some younger

[Figure: ages spread out along an X axis. The amount by which the ages are spread out is known as dispersion or spread]

[Figure: deviations of the ages from the mean. Adding the deviations always equals zero]

To find the average deviation, one would normally total the deviations of the scores above and below the mean and then divide by the number of values

However, the total of the deviations always equals zero
- The deviations must first be squared, which cancels the negative signs

Symbol for SD: $S$ for a sample, $\sigma$ for a population

The variance ($S^2$) is not in the same units as the data (age), but the SD is
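The steps from deviations to variance to SD can be sketched as follows. The formula on the original slide did not survive in this transcript, so this assumes the usual sample formulas (divide the summed squared deviations by n − 1; a population would divide by N). The ten ages are invented so that their mean is 40.

```python
import math

# Hypothetical ages of 10 patients with a mean of 40 (values are made up)
ages = [25, 30, 34, 38, 40, 41, 44, 46, 49, 53]

n = len(ages)
mean = sum(ages) / n
deviations = [x - mean for x in ages]

print(sum(deviations))                   # 0.0: the deviations cancel out

squared = [d ** 2 for d in deviations]   # squaring removes the negative signs
variance = sum(squared) / (n - 1)        # sample variance S^2, in years squared
sd = math.sqrt(variance)                 # sample SD S, back in years

print(variance, sd)
```

Taking the square root is what returns the measure of spread to the original units.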


Three sets of 6 values with the same mean but different spread:

- 7, 7, 7, 7, 7, 7: Mean = 7, SD = 0
- 7, 8, 7, 7, 7, 6: Mean = 7, SD = 0.63
- 3, 2, 7, 8, 13, 9: Mean = 7, SD = 4.05
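The three data sets on this slide can be checked with Python's `statistics` module. The quoted SDs match the sample formula (dividing by n − 1), which is what `stdev` computes. The third SD works out to 4.0497..., which is 4.05 when rounded to two decimals.

```python
from statistics import mean, stdev

datasets = [
    [7, 7, 7, 7, 7, 7],
    [7, 8, 7, 7, 7, 6],
    [3, 2, 7, 8, 13, 9],
]

for data in datasets:
    # stdev() is the sample SD (divides by n - 1)
    print(mean(data), round(stdev(data), 2))
```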

About 68.3% of the area under a normal curve is within one standard deviation (SD) of the mean

About 95.5% is within two SDs

About 99.7% is within three SDs
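These percentages can be reproduced numerically. The sketch below uses the standard identity that the area of a normal curve within k SDs of the mean equals erf(k / √2); the function name is chosen here for illustration.

```python
import math

def area_within(k):
    """Proportion of a normal curve lying within k SDs of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    # Print the percentage to two decimals for 1, 2 and 3 SDs
    print(k, round(100 * area_within(k), 2))
```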


Z-scores

A z-score is the number of SDs that a specific score is above or below the mean in a distribution

Raw scores can be converted to z-scores by subtracting the mean from the raw score and then dividing the difference by the SD:

$z = \frac{X - \mu}{\sigma}$

If the element lies above the mean, it will have a positive z-score; if it lies below the mean, it will have a negative z-score.
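A minimal sketch of the z-score conversion, including the sign convention. The exam mean of 70 and SD of 10 are assumed values for illustration only.

```python
def z_score(x, mean, sd):
    """Number of SDs that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

# Hypothetical exam with mean 70 and SD 10 (illustration only)
print(z_score(85, 70, 10))   # 1.5  (above the mean)
print(z_score(60, 70, 10))   # -1.0 (below the mean)
```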

Standardization
- The process of converting raw scores to z-scores
- The resulting distribution of z-scores will always have a mean of zero, an SD of one, and an area under the curve equal to one

The proportion of scores that are higher or lower than a specific z-score can be determined by referring to a z-table
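Standardizing a whole set of scores can be sketched as below. The raw scores are hypothetical, and the population SD (`pstdev`) is used so that the z-scores come out with an SD of exactly one.

```python
from statistics import mean, pstdev

# Hypothetical raw scores (illustration only)
raw = [55, 60, 65, 70, 75, 80, 85]

mu = mean(raw)
sigma = pstdev(raw)            # population SD of the raw scores

# Convert every raw score to a z-score
z_scores = [(x - mu) / sigma for x in raw]

print(round(mean(z_scores), 10))    # mean of the z-scores
print(round(pstdev(z_scores), 10))  # SD of the z-scores
```

Whatever the original mean and SD were, the standardized scores are centered on zero with unit spread.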

[Figure: refer to a z-table to find the proportion under the curve]

Partial z-table (to z = 1.5) showing proportions of the area under a normal curve for different values of z. Each entry is the proportion of the area lying below the given z.

| Z | 0.00 | 0.01 | 0.02 | 0.03 | 0.04 | 0.05 | 0.06 | 0.07 | 0.08 | 0.09 |
|---|---|---|---|---|---|---|---|---|---|---|
| 0.0 | 0.5000 | 0.5040 | 0.5080 | 0.5120 | 0.5160 | 0.5199 | 0.5239 | 0.5279 | 0.5319 | 0.5359 |
| 0.1 | 0.5398 | 0.5438 | 0.5478 | 0.5517 | 0.5557 | 0.5596 | 0.5636 | 0.5675 | 0.5714 | 0.5753 |
| 0.2 | 0.5793 | 0.5832 | 0.5871 | 0.5910 | 0.5948 | 0.5987 | 0.6026 | 0.6064 | 0.6103 | 0.6141 |
| 0.3 | 0.6179 | 0.6217 | 0.6255 | 0.6293 | 0.6331 | 0.6368 | 0.6406 | 0.6443 | 0.6480 | 0.6517 |
| 0.4 | 0.6554 | 0.6591 | 0.6628 | 0.6664 | 0.6700 | 0.6736 | 0.6772 | 0.6808 | 0.6844 | 0.6879 |
| 0.5 | 0.6915 | 0.6950 | 0.6985 | 0.7019 | 0.7054 | 0.7088 | 0.7123 | 0.7157 | 0.7190 | 0.7224 |
| 0.6 | 0.7257 | 0.7291 | 0.7324 | 0.7357 | 0.7389 | 0.7422 | 0.7454 | 0.7486 | 0.7517 | 0.7549 |
| 0.7 | 0.7580 | 0.7611 | 0.7642 | 0.7673 | 0.7704 | 0.7734 | 0.7764 | 0.7794 | 0.7823 | 0.7852 |
| 0.8 | 0.7881 | 0.7910 | 0.7939 | 0.7967 | 0.7995 | 0.8023 | 0.8051 | 0.8078 | 0.8106 | 0.8133 |
| 0.9 | 0.8159 | 0.8186 | 0.8212 | 0.8238 | 0.8264 | 0.8289 | 0.8315 | 0.8340 | 0.8365 | 0.8389 |
| 1.0 | 0.8413 | 0.8438 | 0.8461 | 0.8485 | 0.8508 | 0.8531 | 0.8554 | 0.8577 | 0.8599 | 0.8621 |
| 1.1 | 0.8643 | 0.8665 | 0.8686 | 0.8708 | 0.8729 | 0.8749 | 0.8770 | 0.8790 | 0.8810 | 0.8830 |
| 1.2 | 0.8849 | 0.8869 | 0.8888 | 0.8907 | 0.8925 | 0.8944 | 0.8962 | 0.8980 | 0.8997 | 0.9015 |
| 1.3 | 0.9032 | 0.9049 | 0.9066 | 0.9082 | 0.9099 | 0.9115 | 0.9131 | 0.9147 | 0.9162 | 0.9177 |
| 1.4 | 0.9192 | 0.9207 | 0.9222 | 0.9236 | 0.9251 | 0.9265 | 0.9279 | 0.9292 | 0.9306 | 0.9319 |
| 1.5 | 0.9332 | 0.9345 | 0.9357 | 0.9370 | 0.9382 | 0.9394 | 0.9406 | 0.9418 | 0.9429 | 0.9441 |

The highlighted entry, 0.9332 (z = 1.50), corresponds to the area under the curve shown in black on the slide's figure.

Tables of z-scores state what proportion of any normal distribution lies above or below any given z-score, not just z-scores of ±1, 2, or 3.

Z-score tables can be used to
- Determine the proportion of the distribution lying above or below a certain score (e.g., finding the proportion of the class that scored above 65% on an exam)
- Find scores that divide the distribution into certain proportions (e.g., finding what score separates the top 5% of the class from the remaining 95%)
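Both uses of a z-table can be sketched with `NormalDist` from Python's standard library (available from Python 3.8), which plays the role of the printed table here. This is an illustration of the lookup, not a method taken from the slides.

```python
from statistics import NormalDist

std_normal = NormalDist()          # standard normal: mean 0, SD 1

# Use 1: proportion of the distribution below a given z (a z-table lookup)
print(round(std_normal.cdf(1.5), 4))   # 0.9332, the table entry for z = 1.50

# Use 2: z-score that separates the top 5% from the remaining 95%
cutoff = std_normal.inv_cdf(0.95)
print(round(cutoff, 3))                # 1.645
```

A cutoff z-score converts back to a raw score as mean + z × SD.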

Z-scores allow us to specify the probability that a randomly picked element will lie above or below a particular score.

For example, if we know that 5% of the population has a heart rate above 86.5 beats/min, then the probability of one randomly selected person from this population having a heart rate above 86.5 beats/min will be 5%.

What is the probability that a randomly picked person would have a heart rate of less than 50 beats per minute in a population with an SD of 10 and a mean heart rate of 70?
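One way to work this question is sketched below. It assumes heart rates in this population are normally distributed, which the question implies but does not state. The z-score is (50 − 70) / 10 = −2, and the area below z = −2 is the required probability.

```python
from statistics import NormalDist

mean_hr, sd_hr = 70, 10            # values given in the question
x = 50

z = (x - mean_hr) / sd_hr          # (50 - 70) / 10
p_below = NormalDist().cdf(z)      # proportion of the curve below z

print(z)                    # -2.0
print(round(p_below, 4))    # 0.0228
```

The result, about 2.3%, agrees with the rule of thumb that roughly 95.5% of values lie within two SDs of the mean, leaving about 2.3% in each tail.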

