Date posted: 16-Dec-2015
Uploaded by: jithesh-kumar-k
Experimental Design
Nilesh B. Patel, PhD
Dept. of Medical Physiology
University of Nairobi, Kenya
What is science?
The systematic study of the structure and behavior of the physical world, involving experimentation and measurement and the development of theories to describe the results of these activities.
Cambridge International Dictionary of English
What is an experiment?
A test done in order to learn something or to discover whether something works or is true (plausible).
Experiments are done to try to prove that a hypothesis is false.
Only falsification is possible with certainty; proving something can never be done with certainty.
What can be studied with the methods of science?
1. The capital of Kenya is Nairobi.
2. Holland is twice the size of Kenya.
3. All humans are mortal.
4. All rabbits are grey.
5. All ziwats are blue.
6. Mentally ill people are possessed by an evil spirit that
7. Mentally ill people are possessed by an evil spirit that cannot be detected by any known means.
Adapted from: Circadian Physiology. Roberto Refinetti. CRC Press 2000
Galileo Galilei (1564-1642)
Father of modern physics
Father of modern observational astronomy
Father of modern science
Pioneered the use of quantitative experiments whose results could be analyzed with mathematical precision
How many of you accept that the sun goes round the earth?
How many of you accept that the earth goes round the sun?
Types of Experiments
Experiments are done to test a hypothesis (a prediction).
Observational
Experimental
Quasi-experimental
Experimental Design
Whether or not a study's findings are useful depends crucially on its design.
No matter how ingenious or important an idea for an experiment might be, if the study is badly designed, it is worthless.
An excellent procedure for determining cause and effect.
Does night cause day or does day cause night?
Three Aims of Research
Valid: the results actually show what you intend them to show.
Reliable: potentially replicable by yourself or anyone else.
Important: subjective.
Research can possess all the above qualities but still be essentially trivial.
Research cannot be important if its findings are unreliable or invalid.
Score or Measurement
Consists of a number made up of different components:
1) A true measure of the thing we want to measure
2) A measure of other things
3) Systematic (non-random) bias: measuring other things inadvertently
4) Random (non-systematic) error, which should cancel out over a large number of observations
We want the score or measurement to consist of as much true score as possible and as little of the other factors as possible (validity and reliability).
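The components of a score listed above can be illustrated with a small simulation. This is only a sketch: the true score, the size of the bias and the spread of the random error are hypothetical values chosen for illustration, and the random error is assumed to be normally distributed.

```python
import random
import statistics


def observed_score(true_score, bias, noise_sd, rng):
    """One observed score = true score + systematic bias + random error."""
    return true_score + bias + rng.gauss(0.0, noise_sd)


rng = random.Random(1)   # fixed seed so the sketch is repeatable
TRUE_SCORE = 50.0        # hypothetical true value
BIAS = 2.0               # hypothetical systematic (non-random) bias
NOISE_SD = 5.0           # hypothetical spread of the random error

scores = [observed_score(TRUE_SCORE, BIAS, NOISE_SD, rng) for _ in range(10_000)]
mean_score = statistics.fmean(scores)

# The random error averages out over many observations; the bias does not.
print(f"mean of observed scores: {mean_score:.2f}")
print(f"offset from the true score: {mean_score - TRUE_SCORE:.2f}")
```

Taking more observations shrinks the random part of the offset but leaves the systematic part untouched, which is why bias has to be dealt with by design rather than by sample size.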
Types of Measurement
Nominal
Ordinal
Interval
Ratio
Is what you measured a real measure of what you are interested in?
It is a measure of what you are interested in.
Random (non-systematic) errors.
Non-random (systematic) errors.
Evidence for the intellectual inferiority of women
Paul Broca (19th century) made careful measurements of brain weight and found that Caucasian men had larger brains than Caucasian women, who in turn had larger brains than Black people or any other non-Caucasian group.
Modern brains were supposedly heavier than medieval brains, and French brains were heavier than German brains.
The differences in brain weight were considered to reflect differences in intelligence between these groups.
Morris water maze (MWM) and disruption of the biological clock, where the measure is the time to reach the platform.
Anesthesia and learning.
Experimental Design
This is important so that the results can be interpreted and other causes can be excluded from the most likely conclusion.
The results have to be valid, reliable and generalizable (important)
Simplest Experiment
[Diagram: Baseline, Measurement, Treatment, Measurement]
There is a difference so my treatment has an effect.
Time, animal handler effect, habituation
The need of control group
Lazzaro Spallanzani, 1793
Found that bats and owls differ in their ability to avoid objects in the dark.
With hoods the bats could not avoid the objects.
So the bats have better night vision than owls.
Covered with transparent hoods, the bats still have a problem (key control experiment).
Blind the bats and they are fine.
Wax in the ears
Pierce invented the ultrasound microphone and Griffin walked in with a bat.
Association is not causation, but can be.
Experiment with control group
[Diagram: Experimental group: Treatment, then Measurement. Control group: Treatment, then Measurement.]
Great!!! There is a difference so my treatment has an effect.
[Figure: Population distribution. Frequency against height (2 cm to 100 cm), with treatment and control marked.]
Randomization
Random allocation
[Diagram: random allocation to an Experimental or a Control group; each group receives its treatment and is then measured]
Post-test/control group
Independent variable and dependent variable
[Diagram: the same Experimental and Control groups, with two measurements per group]
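Random allocation can be sketched in a few lines. The subject IDs, the group names and the helper `randomly_allocate` are hypothetical; the only idea taken from the slide is that chance, not the experimenter, decides which group a subject joins.

```python
import random


def randomly_allocate(subjects, groups=("experimental", "control"), seed=None):
    """Shuffle the subjects, then deal them out to the groups in turn."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    allocation = {group: [] for group in groups}
    for index, subject in enumerate(shuffled):
        allocation[groups[index % len(groups)]].append(subject)
    return allocation


animals = [f"rat{i:02d}" for i in range(1, 21)]   # hypothetical subject IDs
allocation = randomly_allocate(animals, seed=42)
for group, members in allocation.items():
    print(group, len(members), sorted(members))
```

Dealing the shuffled list out in turn keeps the group sizes equal, which a coin toss per subject would not guarantee.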
Between-group Design
Advantages
Simplicity
Less chance of practice or fatigue
Useful when it is impossible for an individual to participate in all experimental conditions
Disadvantages
Expense in terms of time, effort and number
Repeated-Measure Design
[Diagram: the same subjects are measured after Treatment and after No Treatment; two such sequences are shown]
Repeated-Measures Design
Advantages
Economy
Sensitivity
Disadvantages
Carry-over effect
The need for the condition to be reversible
Multiple Levels of Independent Variable: 4 levels
[Diagram: random allocation to four groups (Control, Treatment Level A, Treatment Level B, Treatment Level C), with two measurements per group]
Latin Square Design
To deal with the problem of order effects in within-subject designs.
The order of the various conditions is counter-balanced so that each condition occurs just once in each position.
Example
Three conditions: A, B, and C.
Avoid systematic order effects.
6 possible combinations of order
Order of conditions or trials
Group 1: A B C
Group 2: B C A
Group 3: C A B
James Brown, EDGAR programme: www.jic.bbsrc.ac.uk/services/statistics/edgar.htm
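The three-group ordering in the example can be produced by rotating the list of conditions one place at a time. The sketch below uses that cyclic construction, which is one simple way to build a Latin square and is not claimed to be what the EDGAR programme does; the function name is hypothetical.

```python
from itertools import permutations


def cyclic_latin_square(conditions):
    """Row i is the list of conditions rotated left by i places."""
    n = len(conditions)
    return [[conditions[(row + col) % n] for col in range(n)] for row in range(n)]


conditions = ["A", "B", "C"]
square = cyclic_latin_square(conditions)
all_orders = list(permutations(conditions))

for group, order in enumerate(square, start=1):
    print(f"Group {group}: {' '.join(order)}")
print(f"{len(square)} of the {len(all_orders)} possible orders are used")
```

Each condition appears once in every row and once in every position, so no condition is always first or always last, even though only half of the possible orders are run.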
Multi-Factorial Design
Two or three independent variables in the same study.
It is not recommended to use more than 2, as the analysis gets quite complicated and the conclusions have to be qualified.
Examples: gender and treatment; day and performance.
Correlations
How do changes in one variable relate to changes in another variable?
Can have more than one variable and determine the contribution, weight or % that each variable has on the measured value.
Multi-factorial, multi-dimensions, e.g. physics
Correlations are not cause and effect
Statistical Analysis
Statistics
Descriptive: graphs/tables, numerical summary
Inferential: calculation of probability, tests of significance
Descriptive statistics summarizes the data obtained.
Mode, Median, Mean
Range, Quartiles, Squared Sum, Variance, Standard deviation
Inferential statistics is used to reach certain conclusions that can be applied to public health planning or patient care.
Modern knowledge methods and objective decision making process
Descriptive Statistics
Summarize the data
Measure of central tendency: mean, mode, median
Spread of data: range, standard deviation (sd), standard error of the mean (sem)
Graphs: line graphs, histograms, pie charts
Tables
Measuring the accuracy of the mean
The mean is a statistic that predicts the likely score of a person or measurement.
Most of the time it will not be a number that occurs in the data or measurements; it is a hypothetical value.
Mean number of children is 2.5 per family in the UK.
The mean will only be a perfect representation of the data if all of the scores collected are the same.
[Figure: data points plotted around the mean]
The closer the mean is to all the data points the more representative is the mean of the data
Distribution of data
The means are the same
The distribution of the scores is different
Range
Distribution measures
Range
Quartile
Mean deviation
Sum of squared deviations
Mean of the sum of squared deviations (variance)
Standard deviation
Standard error of the mean (sem)
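The measures in this list build on one another, which a short worked example makes visible. The data are hypothetical, and the variance is computed with n - 1 in the denominator (the usual sample estimate), which is an assumption since the slide does not say which form is meant.

```python
import math
import statistics

scores = [2, 4, 4, 5, 7, 8]   # hypothetical data

n = len(scores)
mean = statistics.fmean(scores)
deviations = [x - mean for x in scores]

data_range = max(scores) - min(scores)
quartiles = statistics.quantiles(scores, n=4)         # three cut points
mean_deviation = sum(abs(d) for d in deviations) / n
sum_of_squares = sum(d * d for d in deviations)       # sum of squared deviations
variance = sum_of_squares / (n - 1)                   # sample variance
sd = math.sqrt(variance)                              # standard deviation
sem = sd / math.sqrt(n)                               # standard error of the mean

print(f"range={data_range}, quartiles={quartiles}")
print(f"mean deviation={mean_deviation:.3f}, SS={sum_of_squares:.1f}")
print(f"variance={variance:.2f}, sd={sd:.3f}, sem={sem:.3f}")
```

The raw deviations always sum to zero, which is why they are either taken as absolute values (mean deviation) or squared (variance) before averaging.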
Standard deviation and percentage of data
Within 1 sd of the mean: 68%
Within 1.96 sd of the mean: 95%
Within 2 sd of the mean: 95.4%
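For normally distributed data these percentages follow from the error function. The sketch below computes them directly; the helper name `fraction_within` is hypothetical, and the result only applies if the data really are normally distributed.

```python
import math


def fraction_within(k):
    """Fraction of a normal distribution lying within k sd of the mean."""
    return math.erf(k / math.sqrt(2))


for k in (1.0, 1.96, 2.0):
    print(f"within {k} sd of the mean: {100 * fraction_within(k):.1f}%")
```

The value 1.96 is used so often because it marks off almost exactly 95% of the distribution, leaving 2.5% in each tail.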
Population and sample parameters
Population parameters: μ (mu, mean) and σ (sigma, standard deviation)
Sample parameters: X (mean) and sd (standard deviation)
The sample parameters are used to estimate the population parameters.
Standard error of the mean (sem)
How well does the sample mean represent the population mean? This is, after all, the purpose of research.
If we take different samples from the same population, the means of some samples will be similar and those of others different.
But usually we do not take many samples; we usually take 1 sample.
So how good an estimate of the population mean is it?
Population and sample parameters
[Figure: samples from the same population, with sample means X = 10, 10, 12, 9, 11, 11 and 10]
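How much sample means wander around the population mean can be checked by simulation. In this sketch the population (mean 10, sd 3, normally distributed) and the sample size of 25 are hypothetical choices; the point is that the spread of the sample means is close to sigma / sqrt(n), and that a single sample can estimate it as sd / sqrt(n).

```python
import math
import random
import statistics

rng = random.Random(7)                                        # fixed seed
population = [rng.gauss(10.0, 3.0) for _ in range(20_000)]    # hypothetical
sigma = statistics.pstdev(population)

SAMPLE_SIZE = 25
sample_means = [
    statistics.fmean(rng.sample(population, SAMPLE_SIZE)) for _ in range(2_000)
]

# Spread of many sample means versus the value predicted by sigma / sqrt(n).
observed_sem = statistics.stdev(sample_means)
predicted_sem = sigma / math.sqrt(SAMPLE_SIZE)

# In practice there is only one sample: estimate the sem from its own sd.
one_sample = rng.sample(population, SAMPLE_SIZE)
estimated_sem = statistics.stdev(one_sample) / math.sqrt(SAMPLE_SIZE)

print(f"observed {observed_sem:.3f}, predicted {predicted_sem:.3f}, "
      f"estimated from one sample {estimated_sem:.3f}")
```

A small sem means that repeated samples would give similar means, so the one sample actually taken is likely to be a good estimate of the population mean.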
Why Inferential Statistics?
Descriptive statistics does not help to answer research questions.
Most of modern research is hypothesis driven.
There are two common hypotheses (predictions) inherent:
1. Experimental hypothesis (HA): the experimental manipulation will have an effect.
2. Null hypothesis (H0): the experimental manipulation will have no effect.
So what if there is a difference between the means?
Two samples from the same population will have different means.
Samples taken from the extremes of the distribution will have a large difference in their means.
So large differences suggest that the means are from two different populations, but how can we be certain that the difference is not due to sampling from the extremes of the distribution?
Application of inferential statistics.
The question
What is the probability that the difference between the means is due to chance and not due to my manipulation?
Carry out tests of significance
The tests of significance give the probability of obtaining, by chance alone, a difference between the means at least as large as the one observed.
The choice of test depends on the type of measurement and the experimental design.
Whichever test you use, you will end up with a number, the test statistic, e.g. t, z, F.
What is the probability of obtaining that value of the test statistic for the means and the spread of data that I have?
On what basis do I or anyone else accept that my manipulation did indeed have an effect?
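The question of how often chance alone would produce a difference this large can be answered directly by shuffling. The sketch below is a permutation test, swapped in here because it needs no distributional formula; it is not one of the tests named in these slides, and the data and function name are hypothetical.

```python
import random
import statistics


def permutation_p_value(group_a, group_b, n_shuffles=10_000, seed=0):
    """Share of random regroupings whose mean difference is at least as
    large (in absolute value) as the difference actually observed."""
    rng = random.Random(seed)
    observed = abs(statistics.fmean(group_a) - statistics.fmean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_large = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)   # pretend the group labels mean nothing
        diff = abs(statistics.fmean(pooled[:n_a]) - statistics.fmean(pooled[n_a:]))
        if diff >= observed:
            at_least_as_large += 1
    return at_least_as_large / n_shuffles


treatment = [24, 27, 22, 29, 25, 26]   # hypothetical scores
control = [20, 21, 19, 23, 18, 22]     # hypothetical scores
p = permutation_p_value(treatment, control)
print(f"p = {p:.4f}")
```

If the group labels really did not matter, shuffling them should often give a difference as large as the observed one; when that almost never happens, chance becomes an unconvincing explanation.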
Test Statistic
A test statistic (t, F, z) is a statistic with a known distribution, e.g. age and death.
Test statistic = systematic variation / unsystematic variation.
The exact form of this equation changes from test to test, but essentially the comparison is the amount of variance created by an experimental effect against the amount of variance due to random factors.
If the experiment has had an effect then it will create more variance than random factors alone.
Ronald Aylmer Fisher (1890-1962)
"To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of."
Fisher tea test
Is the milk added after or before the tea is added?
The level of significance: p < 0.05.
If chance alone were at work, a difference between the means as large as the one obtained would occur 5% of the time or less.
I am therefore willing to accept, with 95% confidence, that the difference I obtained was due to my manipulation and not due to chance.
My results are significant but are they important?
Which test to use?
Sample size
Data distribution
Homogeneity of variance
Independent measurements or repeated measurements
Comparison between 2 groups
Comparison between 3 or more groups
Type of data
Parametric or Non-parametric
Parametric Tests
1. Arithmetic means
   Interval or ratio data
2. Variance in the one condition = variance in the other condition
   Homogeneity of variance (Levene's test)
   Sphericity assumption (Mauchly's test)
3. Normal distribution
   Tests: plot histogram, Kolmogorov-Smirnov test, Shapiro-Wilk test
Parametric tests
Student's t-test
   Independent t-test
   Dependent t-test
Analysis of Variance (ANOVA)
   One-Way ANOVA
   Two-Way ANOVA
   Repeated measures ANOVA
   Mixed ANOVA
Student's t-test
Only two groups of data
Sample size < 30
Ratio: difference between the means / estimated s.e. of the difference between those two sample means
Independent t-test (unpaired)
Two groups
Different participants, i.e. each participant contributes only one data point
Normal distribution
Levene's test not significant (p = 0.492)
Presenting the data: m = 24.2, se = 1.49; m = 20.00, se = 1.30. t(18) = 2.12, p < 0.05, r = 0.45.
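The reported t and r can be reproduced from the two means and standard errors. This sketch assumes two groups of equal size (10 each, which is what 18 degrees of freedom implies) so that the standard error of the difference is the square root of the sum of the squared standard errors; the effect size uses the common conversion r = sqrt(t² / (t² + df)).

```python
import math

mean_1, se_1 = 24.2, 1.49
mean_2, se_2 = 20.00, 1.30
df = 18   # two groups of 10, assuming equal group sizes

# t = difference between the means / s.e. of that difference
se_difference = math.sqrt(se_1 ** 2 + se_2 ** 2)
t = (mean_1 - mean_2) / se_difference

# Effect size r from t and its degrees of freedom
r = math.sqrt(t ** 2 / (t ** 2 + df))

print(f"t({df}) = {t:.2f}, r = {r:.2f}")
```

Run as written, this gives t(18) = 2.12 and r = 0.45, matching the values on the slide.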
Analysis of Variance (ANOVA)
For 3 or more experimental groups
Generates an F ratio
Tells whether the means of the samples are significantly different, but not which means.
Additional tests need to be done to determine which means are statistically different (planned comparison or post-hoc tests).
Types of ANOVA
One-Way ANOVA
Two-Way ANOVA
Repeated measures ANOVA
Mixed ANOVA
One-Way ANOVA
3 or more groups
Different participants in each group
One-way for 1 independent variable
Groups are independent, i.e. data in one group does not influence the data in another group
Homogeneity of variance
If Levene's test is significant, use non-parametric tests or transform the data
Output from SPSS
                 Sum of squares   df    Mean square   F         Sig
Between groups   450.664          5     90.133        269.733   0.000
Within groups    38.094           114   0.334
Total            488.758
Presentation of results: F(5, 114) = 269.73, p < 0.05
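The columns of the ANOVA table are linked by simple arithmetic: each mean square is a sum of squares divided by its degrees of freedom, and F is the ratio of the two mean squares. The sketch below recomputes them from the sums of squares and df shown in the SPSS output; small differences in the last decimal are expected because the printed sums of squares are already rounded.

```python
ss_between, df_between = 450.664, 5
ss_within, df_within = 38.094, 114

ms_between = ss_between / df_between   # mean square between groups
ms_within = ss_within / df_within      # mean square within groups
f_ratio = ms_between / ms_within       # systematic / unsystematic variance
ss_total = ss_between + ss_within

print(f"MS between = {ms_between:.3f}, MS within = {ms_within:.3f}")
print(f"F({df_between}, {df_within}) = {f_ratio:.2f}")
print(f"SS total = {ss_total:.3f}")
```

The two degrees of freedom quoted with F are the between-groups and within-groups df, in that order.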
Post-hoc tests
REGWQ or HSD: equal sample size and homogeneity of variance
Bonferroni: conservative, affected by group variance
Gabriel's procedure: slight differences in sample size
Hochberg's GT2: big differences in sample size
Games-Howell procedure: if the variances are not equal
Assumption of sphericity
Equality of variance of the differences between treatment levels
Mauchly's test
Violation of sphericity leads to loss of power, i.e. increased probability of Type II error
Correct the df by one of:
Greenhouse-Geisser estimate of sphericity
Huynh-Feldt estimate of sphericity
Lowest possible estimate of sphericity (lower-bound)
If the estimate is > 0.75 use Huynh-Feldt; if < 0.75 use Greenhouse-Geisser
Some sort of result
Mauchly's test indicated that the assumption of sphericity had been violated, χ²(5) = 13.12, p < 0.05; therefore degrees of freedom were corrected using Huynh-Feldt estimates of sphericity (ε = 0.85). The result showed that the number of women eyed-up was significantly affected by the amount of alcohol drunk, F(2.55, 48.40) = 4.73, p < 0.05, r = 0.40.
Non-Parametric Test
Assumption-free tests
Less strict about the distribution of the data being analyzed.
Thus raw scores are used
Data is ranked: lowest score rank 1, next lowest rank 2, etc.
Analysis is carried out on the ranks
Less power than parametric tests, hence a higher chance of Type II error.
Basic tests
Mann-Whitney test = independent t-test
Wilcoxon Signed-Rank test = dependent t-test
Kruskal-Wallis test = one-way ANOVA
Friedman's ANOVA = one-way repeated measures ANOVA
Mann-Whitney Test = independent t-test
Two conditions and different participants in each condition.
Men (Mdn = 27) did not seem to differ from dogs (Mdn = 24) in the amount of dog-like behaviour they displayed (U = 194.5, ns).
Graphs are usually box and whisker plots.
Use of the median
Non-parametric tests are testing the difference in ranks, not the difference in means.
Means are biased by outliers, whereas ranks are not.
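The ranking step and the U statistic can be worked through by hand. The sketch below is a plain implementation for illustration, with hypothetical data and function names; ties are given the average of the ranks they span, and the smaller of the two U values is reported, which is a common convention. It does not compute a p value.

```python
def average_ranks(values):
    """Rank values from lowest (rank 1) upward; ties share the average rank."""
    ordered = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(ordered):
        end = start
        while (end + 1 < len(ordered)
               and values[ordered[end + 1]] == values[ordered[start]]):
            end += 1
        shared_rank = (start + end) / 2 + 1   # average of positions start..end
        for position in range(start, end + 1):
            ranks[ordered[position]] = shared_rank
        start = end + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """Return the smaller of the two U statistics for two independent groups."""
    ranks = average_ranks(list(group_a) + list(group_b))
    n_a, n_b = len(group_a), len(group_b)
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return min(u_a, u_b)


group_a = [3, 5, 8, 100]   # hypothetical scores; 100 is an outlier
group_b = [1, 2, 4, 5]     # hypothetical scores
print(average_ranks(group_a + group_b))
print("U =", mann_whitney_u(group_a, group_b))
```

Replacing the outlier 100 with 9 leaves every rank, and therefore U, unchanged, while the mean of the first group would drop sharply.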
Kruskal-Wallis Test = one-way ANOVA
More than 2 conditions and different participants have been used in each condition.
The test statistic is H and has a chi-squared distribution.
Children's fear beliefs about clowns were significantly affected by the format of information given to them (H(3) = 17.06, p < 0.01). Mann-Whitney tests were used to follow up this finding. A Bonferroni correction was applied and so all effects are reported at a 0.0167 level of significance. It appears that fear beliefs were significantly higher after the adverts compared to the control, U = 37.50, r = -.60.
Claude Bernard (1813-1878)
Those whose minds are bound and cramped. They make poor observations, because they choose among the results of their experiments, observation, and reading only what suits their object, neglecting whatever is unrelated to it and carefully setting aside everything which might tend toward the idea they wish to combat.
Ignaz Semmelweis (1818-1865)
Savior of mothers
10-35% mortality
Death of colleague Jacob Kolletschka (1847)
Introduced lime washing of hands: 12.24% to 2.38%
The Etiology, Concept and Prophylaxis of Childbed Fever (1861)
Semmelweis Reflex
Dismissing or rejecting out of hand information, automatically, without thought, inspection, or experiment.