
Determining the Size of a Sample

Kajal Srivastava, JR-3, SPM Deptt., S.N. Medical College

• Sample definition

• Characteristics of an ideal sample

• Terminologies

• C.I. method of calculating sample size

• Formulas

• Other methods

Contents

A finite set of objects drawn from the population with an aim is called a sample.

Probability sampling

Aim

What is sample

Determining sample size is a very important issue because samples that are too large may waste time, resources and money, while samples that are too small may lead to inaccurate results.

Why determine sample size

True representative

Precision

Unbiased character

Characteristics of an Ideal Sample

Study design

Types of outcome measure

Guess at likely result

Required level of significance

Required precision / power

Before calculating sample size one has to decide on the following:

Types of outcome measures- Proportion, rates and means.

Sampling error: An estimate of an outcome measure calculated in an intervention study is subject to sampling error, because it is based on a sample of individuals and not on the whole population of interest.

The confidence interval is a range of plausible values for the true value of the outcome measure.

It is conventional to quote the 95% confidence interval (also called 95% confidence limits).

There are 95 chances out of 100 that the true value will lie within this range.

95% of sample estimates drawn from a population will fall within ± 1.96 × standard error (sampling error) of the true value.

Confidence Interval
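As a rough illustration of the ± 1.96 × standard error rule, the sketch below computes a 95% confidence interval for a proportion using the normal approximation. The function name `proportion_ci` and the figures (40 events among 200 subjects) are hypothetical and not taken from the slides.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(successes, n, confidence=0.95):
    """Normal-approximation confidence interval for a proportion."""
    # z is about 1.96 for a 95% confidence level
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p = successes / n
    standard_error = sqrt(p * (1 - p) / n)
    return p - z * standard_error, p + z * standard_error


# Hypothetical data: 40 events observed among 200 subjects
low, high = proportion_ci(40, 200)
print(f"95% CI: {low:.3f} to {high:.3f}")
```

The approximation is only reasonable when the sample is not too small and the proportion is not too close to 0 or 1.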

It is appropriate to test a specific hypothesis about the outcome measure.

Null hypothesis

Significance tests

Determine the p – value (probability value) or ‘significance’ of the results.

Significance tests & P value

Type I error (α): rejecting a null hypothesis when it is true.

The probability of this error is the level of significance, conventionally set at 0.05. When the P value is smaller than this threshold, it is conventional to conclude that the groups are different.

Type II error (β): failure to reject the null hypothesis when it is actually false. The complementary probability of a type II error is the statistical power (1-β). Thus the power of a statistical test is the probability of correctly rejecting a null hypothesis when it is false.

Types of error

Power of the study indicates the probability of finding a statistically significant difference between the two groups when a true difference exists.

The power of a study depends on:

1. The value of the true difference between the study groups (effect size). The greater the effect, the higher the power to detect the effect as statistically significant for a study of a given size.

2. The study size: the larger the study size, the higher the power.

3. The probability level at which a difference will be regarded as ‘statistically significant’.

Power of study
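The three factors listed for power can be seen in a small sketch. It assumes a two-sided comparison of two proportions with equal group sizes and uses the normal approximation; the function name `power_two_proportions` and all input values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def power_two_proportions(p1, p2, n_per_group, alpha=0.05):
    """Approximate power of a two-sided test comparing two proportions."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    p_bar = (p1 + p2) / 2
    spread_null = sqrt(2 * p_bar * (1 - p_bar))         # under the null hypothesis
    spread_alt = sqrt(p1 * (1 - p1) + p2 * (1 - p2))    # under the alternative
    z = (abs(p1 - p2) * sqrt(n_per_group) - z_alpha * spread_null) / spread_alt
    return NormalDist().cdf(z)


base = power_two_proportions(0.5, 0.4, 100)               # reference case
bigger_effect = power_two_proportions(0.5, 0.3, 100)      # larger true difference
bigger_study = power_two_proportions(0.5, 0.4, 400)       # larger study size
stricter = power_two_proportions(0.5, 0.4, 100, alpha=0.01)  # stricter significance level
print(base, bigger_effect, bigger_study, stricter)
```

In this sketch, power rises with a larger effect or a larger study and falls when a stricter significance level is required.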

Two-tailed: when the hypothesis is that the sample statistic differs from the population parameter in either direction (lesser or greater), the test is called two-tailed.

One-tailed: when the hypothesis is that the sample statistic differs from the population parameter in one specified direction only (either lesser or greater), the test is called one-tailed.

One sided and Two sided tests

Usually, it is more important to estimate the effect of the intervention and to specify a confidence interval around the estimate to indicate the likely range, than to test a specific hypothesis.

Therefore, in many situations it may be more appropriate to choose the sample size by setting the width of the confidence interval, rather than to rely on power calculations.

Confidence interval approach: applies the concepts of accuracy, variability, and confidence interval to create a “correct” sample size

• Variability: refers to how similar or dissimilar responses are to a given question

• P (%): share that “have” or “are” or “will do” etc.

• Q (%): 100%-P%, share of “have nots” or “are nots” or “won’t dos” etc.

With Nominal data (i.e. Yes, No), we can conceptualize answer variability with bar charts…the highest variability is 50/50

If we conducted our study over and over, e.g. 1,000 times, we would expect our result to fall within a known range (± 1.96 s.d.’s of the mean). Based upon this, there are 95 chances in 100 that the true value of the universe statistic (proportion, share, mean) falls within this range!

The Confidence Interval Method of Determining Sample Size

Normal Distribution

± 1.96 × s.d. defines the endpoints for the central 95% of the distribution

The level of confidence we desire that our results be repeated within some known range if we were to conduct the study again, and…

the variability (in responses) in the population and…

the amount of acceptable sample error (desired accuracy) we wish to have and…

the size of the sample.

There is a relationship among:

The reduced power or precision resulting from losses may be avoided by increasing the initial sample size in order to compensate for the expected number of losses.

A 5-20% allowance is generally considered appropriate.

Practical constraints

Allowances of losses
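One common way to apply such an allowance is to divide the calculated sample size by the proportion of subjects expected to remain in the study. This convention is an assumption here, since the slides do not state a formula, and the name `adjust_for_losses` and the figures are hypothetical.

```python
from math import ceil


def adjust_for_losses(n_required, expected_loss):
    """Inflate a sample size so that n_required subjects remain after losses.

    expected_loss is a fraction, e.g. 0.10 for a 10% allowance.
    """
    if not 0 <= expected_loss < 1:
        raise ValueError("expected_loss must be in [0, 1)")
    return ceil(n_required / (1 - expected_loss))


# Hypothetical example: 385 subjects needed, 10% losses expected
n_recruit = adjust_for_losses(385, 0.10)
print(n_recruit)
```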

If the sample is small the confidence interval will be very wide and even though it will probably include the null value, it will extend to include large values of the effect measure.

In other words, the study will have failed to establish that the intervention has no appreciable effect.

In case the intervention does have an appreciable effect, a study that is too small will have low power i.e. it will have little chance of giving a statistically significant difference.

Consequences of studies that are too small

Estimating a population proportion: With specified absolute precision

Required information:

• Anticipated population proportion (P): a rough estimate of P is sufficient.

• Desired confidence level.

• Absolute precision (d): the total percentage points of error that can be tolerated on each side of the figure obtained.

FORMULAE FOR SAMPLE SIZE ESTIMATION
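The formula itself did not survive on this slide. A minimal sketch, assuming the usual normal-approximation formula n = z² P(1-P) / d² for estimating a proportion with absolute precision d, is given below. The function name and the example values are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_proportion_absolute(p, d, confidence=0.95):
    """Sample size to estimate a proportion p to within +/- d (absolute)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    return ceil(z ** 2 * p * (1 - p) / d ** 2)


# P = 50% gives the largest variability and hence the largest n
n_max = n_proportion_absolute(0.5, 0.05)
n_20 = n_proportion_absolute(0.2, 0.05)
print(n_max, n_20)
```

When no rough estimate of P is available, P = 0.5 is the safest choice because P(1-P) is then at its maximum.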

The estimated sample size is applicable only in the case of simple random sampling (SRS).

If another sampling method is used, a larger sample size is likely to be needed because of design effect.

For cluster sampling strategy, the estimated sample size as above is multiplied by design effect, which is defined as the ratio of variance obtained in cluster survey to the variance for the same sample size adopting SRS.

In cluster sampling, a design effect of 2 is commonly assumed.

This means twice as many individuals would have to be studied to obtain the same precision as with SRS.
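The design-effect adjustment described above is a simple multiplication, sketched here. The function name `apply_design_effect` and the SRS sample size of 385 are hypothetical.

```python
from math import ceil


def apply_design_effect(n_srs, design_effect=2.0):
    """Scale a simple-random-sampling sample size for a cluster design."""
    if design_effect < 1:
        raise ValueError("design_effect is normally at least 1")
    return ceil(n_srs * design_effect)


# Hypothetical example: SRS sample size of 385 with a design effect of 2
n_cluster = apply_design_effect(385, 2.0)
print(n_cluster)
```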

Estimating a population proportion: With specified relative precision

Relative precision (ε): the sample result should fall within ε % of the true value.
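For relative precision, the tolerated error is a fraction ε of P itself, so d is replaced by εP in the absolute-precision formula, giving n = z² (1-P) / (ε² P). This is a sketch under the normal approximation; the function name and values are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_proportion_relative(p, epsilon, confidence=0.95):
    """Sample size to estimate p to within epsilon * p (relative precision)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil(z ** 2 * (1 - p) / (epsilon ** 2 * p))


# Hypothetical example: P about 20%, estimate to within 10% of the true value
n_rel = n_proportion_relative(0.2, 0.10)
print(n_rel)
```

Note that small values of P require much larger samples under relative precision than under absolute precision.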

Estimating the difference between two population proportions with specified absolute precision (Two – sample situations)

P1, P2 = anticipated value of the proportions in the two populations.
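A minimal sketch for this two-sample situation, assuming equal group sizes and the normal-approximation formula n = z² [P1(1-P1) + P2(1-P2)] / d² per group. The function name and the example proportions are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_difference_proportions(p1, p2, d, confidence=0.95):
    """Per-group sample size to estimate P1 - P2 to within +/- d."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil(z ** 2 * variance_sum / d ** 2)


# Hypothetical example: P1 = 40%, P2 = 32%, difference within 5 percentage points
n_each = n_difference_proportions(0.40, 0.32, 0.05)
print(n_each)
```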

Required information:

• Anticipated values of the population proportions: P1 & P2

• Level of significance

• Power of the test: 100 (1-β) %

Hypothesis testing for two population proportions
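Using the three inputs listed (P1 and P2, level of significance, power), a per-group sample size for a two-sided test can be sketched as below. It assumes equal group sizes and the normal-approximation formula n = [z(1-α/2) √(2P̄(1-P̄)) + z(1-β) √(P1(1-P1) + P2(1-P2))]² / (P1-P2)², where P̄ is the mean of P1 and P2. The function name and values are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_test_two_proportions(p1, p2, alpha=0.05, power=0.80):
    """Per-group sample size for a two-sided test of P1 versus P2."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(numerator / (p1 - p2) ** 2)


# Hypothetical example: detect 50% versus 40% at 5% significance, 80% power
n_test = n_test_two_proportions(0.5, 0.4)
print(n_test)
```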

Estimating a population mean: With specified absolute precision

Estimating a population mean: With specified relative precision
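For a single population mean, a sketch assuming the usual formulas n = z² σ² / d² (absolute precision d) and n = z² σ² / (ε² μ²) (relative precision ε) follows. Here σ is the anticipated standard deviation and μ the anticipated mean; the function names and example values are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_mean_absolute(sigma, d, confidence=0.95):
    """Sample size to estimate a mean to within +/- d."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil(z ** 2 * sigma ** 2 / d ** 2)


def n_mean_relative(sigma, mu, epsilon, confidence=0.95):
    """Sample size to estimate a mean to within epsilon * mu."""
    return n_mean_absolute(sigma, epsilon * mu, confidence)


# Hypothetical example: s.d. 15, mean about 100
n_abs = n_mean_absolute(15, 3)            # within 3 units
n_rel_mean = n_mean_relative(15, 100, 0.05)  # within 5% of the mean
print(n_abs, n_rel_mean)
```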

Estimating difference between means of two populations with specified precision

Hypothesis testing for two population means
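For two population means, a sketch assuming equal group sizes and a common standard deviation σ in both populations: n = 2 z² σ² / d² per group for estimating the difference to within d, and n = 2 σ² [z(1-α/2) + z(1-β)]² / (μ1-μ2)² per group for a two-sided test. Function names and values are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_difference_means(sigma, d, confidence=0.95):
    """Per-group sample size to estimate mu1 - mu2 to within +/- d."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil(2 * z ** 2 * sigma ** 2 / d ** 2)


def n_test_two_means(sigma, mu1, mu2, alpha=0.05, power=0.90):
    """Per-group sample size for a two-sided test of mu1 versus mu2."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return ceil(2 * sigma ** 2 * (z_alpha + z_beta) ** 2 / (mu1 - mu2) ** 2)


# Hypothetical example: common s.d. 15, difference of 5 units
n_est = n_difference_means(15, 5)
n_hyp = n_test_two_means(15, 100, 95)
print(n_est, n_hyp)
```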

There are computer programs available that perform sample size calculations.

In particular, this facility is available in the package ‘Epi Info’, though it does not cover the full range of possibilities.

Other Methods of Sample Size Determination

• Arbitrary “percentage rule of thumb” sample size

• Arbitrary sample size approaches rely on erroneous rules of thumb (e.g. “n must be at least 5% of the population”).

• Arbitrary sample sizes are simple and easy to apply, but they are neither efficient nor economical. (e.g. Using the “5 percent rule,” if the universe is 12 million, n = 600,000 – a very large and costly result)

Other Methods of Sample Size Determination…cont.

• Conventional sample size specification

• The conventional approach follows some “convention” or number believed somehow to be the right sample size (e.g. 1,000 – 1,200 used for national opinion polls with ± 3% error).

• Using conventional sample size can result in a sample that may be too large or too small.

• Conventional sample sizes ignore the special circumstances of the survey at hand.
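The roughly 3% error quoted for poll sizes of 1,000 to 1,200 can be checked with the confidence interval approach. The sketch assumes simple random sampling, a 95% confidence level and the most variable case P = 50%; the function name `margin_of_error` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(n, p=0.5, confidence=0.95):
    """Half-width of the confidence interval for a proportion at sample size n."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * sqrt(p * (1 - p) / n)


moe_1000 = margin_of_error(1000)
moe_1200 = margin_of_error(1200)
print(f"n=1000: +/- {moe_1000:.1%}, n=1200: +/- {moe_1200:.1%}")
```

The same calculation for a survey with a different P or a cluster design would give a different error, which is why a conventional size may be too large or too small.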

Special Sample Size Determination Situations

Sample Size Using Nonprobability Sampling

• When using nonprobability sampling, sample size is unrelated to accuracy, so cost-benefit considerations must be used

References

• Sample size determination in health studies: A practical manual. S.K. Lwanga and S. Lemeshow.

• Sample size determination in health studies. V.K. Chadha. NTI Bulletin 2006, 42/3&4, 55-62.

• Sampling guide. Robert Magnani. Tulane University School of Public Health.

• On validity of assumptions while determining sample size. S.B. Sarmukaddam, S.G. Garad. IJCM.

• Essentials of biostatistics. Nishi Agarwal.

• Methods in biostatistics. B.K. Mahajan.

Thank You