Statistics
• It's easy to be a really good methodologist without being a statistical genius
• But you have to have some knowledge of statistics
• The statistical skills it takes to be a good methodologist are mainly conceptual rather than computational
• If you can understand the logic of inferential statistics and familiarize yourself with a few relatively simple statistical tests, you will have enough working knowledge to be a great methodologist
• Statistics are a set of mathematical procedures for summarizing and interpreting observations
• These observations are usually numerical or categorical facts about specific people or things, and they are usually referred to as data
Descriptive Statistics
• The most fundamental branch of statistics is descriptive statistics
• Statistics used to summarize or describe a set of observations
• The easy ones!
• Ex: means, medians, modes, percentages
• Very useful to the general public
• Think of sports: Bradshaw ran for 17 yards… then 2 yards… then 5 yards… then 8 yards… then 15 yards… then 1 yard…
• Wouldn't it be easier to say that he ran for an average of 8 yards/carry?
• A simple mean gives us more information a lot faster
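A minimal Python sketch of that computation, using the carry values from the example above:

```python
# The mean summarizes all six carries in a single number
carries = [17, 2, 5, 8, 15, 1]  # yards gained on each carry
mean_yards = sum(carries) / len(carries)
print(f"Average: {mean_yards} yards/carry")  # -> 8.0
```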
Inferential Statistics
• The branch of statistics used to interpret or draw inferences about a set of observations is referred to as inferential statistics
• The harder ones!
• We will talk more about these later
Central Tendency & Dispersion
• Descriptive statistics used by laypeople are often incomplete in one important respect
• Laypeople make frequent use of descriptive statistics that summarize the central tendency (basically, the average) of a set of observations
• Most people are unaware of an equally useful and important category of descriptive statistics: those that summarize the dispersion, or variability, of a set of scores
• Measures of dispersion are particularly important in inferential statistics
Central Tendency & Dispersion
• One common measure of dispersion is the range of a set of scores
• The difference between the highest and the lowest value in the entire set of scores
• Ex: 6.6, 5.6, 7.8, 6.7, 7.9… the range is 7.9 − 5.6 = 2.3
• Another common measure of dispersion is the standard deviation
• A measure of how much, on average, each score in the sample differs from the sample mean
• Ex: the mean IQ score is 100, with an SD of 15
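A minimal Python sketch computing both measures of dispersion for the example scores above (note that the standard library's statistics.stdev uses the sample formula, with n − 1 in the denominator):

```python
import statistics

scores = [6.6, 5.6, 7.8, 6.7, 7.9]  # the example scores from the slide

# Range: difference between the highest and lowest score
score_range = max(scores) - min(scores)
print(round(score_range, 1))  # -> 2.3

# Standard deviation: typical distance of scores from the sample mean
print(round(statistics.stdev(scores), 2))  # -> ~0.95
```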
Central Tendency & Dispersion
• Measures of central tendency, like the mean, tell you what the typical person is like
• Measures of dispersion, like the standard deviation, tell you how much you can expect specific people to differ from this typical person
The Shape of Distributions
• A third statistical property of a set of observations is the shape of the distribution of scores
• Arrange the scores in order from lowest to highest, and graph them so that taller parts of the graph represent more frequently occurring scores
• There are three main types of distributions: rectangular, bimodal, and normal
The Shape of Distributions
• A rectangular distribution contains scores that are all about equally frequent or probable
• Ex: the theoretical distribution representing the two possible outcomes that can be obtained by tossing a coin
The Shape of Distributions
• In a bimodal distribution, two distinct ranges of scores are more common than any other
• Bimodal distributions are relatively rare
• A bimodal distribution usually reflects a sample that contains two meaningful subsamples
• Ex: the heights of athletes attending the annual sports banquet for a very large high school that has only two sports teams: women's gymnastics and men's basketball
The Shape of Distributions
• The most important and common type of distribution is the normal distribution
• A symmetrical, bell-shaped distribution in which most scores cluster near the mean and in which scores become increasingly rare as they become increasingly divergent from this mean
• Ex: height, weight, extroversion, self-esteem, age at which infants begin to walk
The Shape of Distributions
• The nice thing about the normal distribution is that if you know a set of observations is normally distributed, you can describe the entire set of scores much more precisely
• More specifically, you can make very good guesses about the exact proportion of scores that fall within any given number of standard deviations of the mean
The Shape of Distributions
• About 68% of scores will fall within one standard deviation of the mean
• About 95% will fall within two SDs of the mean
• About 99% will fall within three SDs
• Ex: Wechsler Adult Intelligence Scale, with a mean of 100 and an SD of 15
• 68% of people have an IQ between 85 and 115
• 99% fall between 55 and 145
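A short Python sketch, using only the standard library, that recovers these proportions from the normal curve (the exact value for three SDs is 99.7%, which the slide rounds to 99%):

```python
from math import erf, sqrt

def within_k_sd(k):
    """Proportion of a normal distribution within k SDs of the mean."""
    return erf(k / sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} SD: {within_k_sd(k):.1%}")
# -> 68.3%, 95.4%, 99.7%
# For IQ (mean 100, SD 15): 1 SD spans 85-115, 3 SDs span 55-145
```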
The Shape of Distributions
• One important use of this kind of analysis is to put a particular score in perspective, which is the first step toward making inferences
• Ex: a set of 400 scores on an astronomy midterm approximates a normal distribution, with a mean of 70 and a standard deviation of 6
• Your friend got an 84. How impressed should you be?
• You can figure out that she scored more than 2 standard deviations above the mean, meaning she scored in roughly the top 1% of the class
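A minimal Python sketch of the same reasoning for the midterm example:

```python
from math import erf, sqrt

mean, sd, score = 70, 6, 84   # the astronomy midterm example
z = (score - mean) / sd       # how many SDs above the mean? -> 2.33
percentile = 0.5 * (1 + erf(z / sqrt(2)))  # normal cumulative proportion
print(f"z = {z:.2f}, percentile = {percentile:.1%}")  # ~99th percentile
```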
Descriptive Statistics
• This only scratches the surface of descriptive statistics; you will learn much more in your statistics class
• Descriptive statistics provide researchers with an enormously powerful tool for organizing and simplifying data
• At the same time, they are only half the picture
• We need to do more than just simplify and organize our data
• We need to be able to draw conclusions about populations from our sample data
• To do this, we need to rely on inferential statistics
Inferential Statistics
• The basic idea behind these kinds of statistics is that decisions about what to conclude from a set of research findings need to be made in a logical, unbiased fashion
• The logic of statistical testing is largely a reflection of the skepticism and empiricism that are crucial to the scientific method
• When conducting statistical tests to aid in the interpretation of a set of findings, researchers begin by assuming that the null hypothesis is true
• That is, they begin by assuming that their own predictions are wrong
Inferential Statistics
• In a simple two-group experiment, this would mean assuming that the experimental group and the control group are not really different from each other after the manipulation, and that any apparent difference is due to luck (a failure of random assignment)
• The main thing statistical testing does is tell us exactly how likely it is that someone would get results as impressive as those actually observed in an experiment if chance alone were at work
Inferential Statistics
• The logic of statistical testing is almost identical to the logic of what happens in an ideal courtroom…
• Researchers begin by assuming that the null hypothesis is correct: that the findings reflect chance variation and are not real
• The opposite of the null hypothesis is the alternative hypothesis: that any observed differences between the experimental and the control group are real
• We assume "wrong until proven right," just as a court assumes "innocent until proven guilty"
Inferential Statistics
• Jurors decide that a person is guilty only if the evidence suggests beyond a reasonable doubt that the defendant committed the crime in question
• The statistical equivalent of "beyond a reasonable doubt" is the alpha level
• In most cases, the alpha level is set at .05
• That is, researchers may reject the null hypothesis and conclude that their hypothesis is correct only when findings as extreme as those observed in the study would have occurred by chance alone less than 5% of the time
• Even when a researcher finds statistically significant results, they still need to explain why those results occurred, just as a court looks for a motive for the crime
Probability Theory
• All inferential statistics are grounded firmly in the logic of probability theory, which deals with the mathematical rules and procedures used to predict and understand chance events
• The probability of an event is the number of specific outcomes that qualify as the event in question, divided by the total number of possible outcomes
• Ex: the probability of rolling a 3 on a single roll of a standard die is 1/6, or about .167, because there is only one outcome that qualifies as a 3 and there are 6 equally likely outcomes
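A minimal Python sketch contrasting the theoretical probability with a quick simulation (the simulation and its 100,000-roll count are illustrative choices, not from the slides):

```python
import random

# Theoretical probability: one qualifying outcome out of six equally likely ones
print(1 / 6)  # -> 0.1667

# A large number of simulated rolls should converge on the same value
rolls = [random.randint(1, 6) for _ in range(100_000)]
print(rolls.count(3) / len(rolls))  # -> ~0.167
```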
Factors That Influence the Results of Significance Tests
• Alpha levels and Type I and Type II errors
• It is important to remember that when a researcher conducts a statistical test and obtains a significant result, it does not always mean that their hypothesis is correct
• Even if an experiment is perfectly executed with no design flaws, it is always possible that the results were due to chance
• In fact, the p-value (the exact probability) we observe in an experiment tells us exactly how likely results at least as extreme as ours would be if nothing but dumb luck were operating in our study
Factors That Influence the Results of Significance Tests
• Statisticians refer to this worrisome possibility of incorrectly rejecting the null hypothesis when it is true as a Type I error
• The likelihood of making a Type I error is a direct function of where we set our alpha level
• If we set it at .001, we are taking only one chance in 1,000 of falsely rejecting the null hypothesis
• But we can't set our alpha at .001 all the time
• Doing so would increase the likelihood of committing a Type II error
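A hedged Python sketch of the Type I error idea: both groups below are drawn from the same population, so the null hypothesis is true and every significant result is a false positive. It assumes SciPy is installed for the t-test; the sample sizes and number of trials are arbitrary illustrative choices:

```python
import random
from scipy import stats  # assumes SciPy is available

random.seed(1)
alpha, false_positives, trials = 0.05, 0, 2000

# Both "groups" come from the SAME population, so any significant
# result is, by definition, a Type I error
for _ in range(trials):
    a = [random.gauss(0, 1) for _ in range(30)]
    b = [random.gauss(0, 1) for _ in range(30)]
    if stats.ttest_ind(a, b).pvalue < alpha:
        false_positives += 1

print(false_positives / trials)  # hovers around .05, as alpha promises
```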
Factors That Influence the Results of Significance Tests
• A Type II error occurs when we fail to reject the null hypothesis when it is actually false
• This trade-off is why we usually set our alpha at .05
• Effect size also influences the results of our tests
• Effect size is the magnitude of the effect in which the researcher is interested
• Ex: finding a correlation between height and foot size in a sample of NBA players
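One common effect-size measure (not named on the slide) is Cohen's d, the standardized difference between two group means. A minimal Python sketch with hypothetical data:

```python
from statistics import mean, stdev

def cohens_d(group1, group2):
    """Cohen's d: mean difference in units of the pooled standard deviation."""
    n1, n2 = len(group1), len(group2)
    pooled_var = ((n1 - 1) * stdev(group1) ** 2 +
                  (n2 - 1) * stdev(group2) ** 2) / (n1 + n2 - 2)
    return (mean(group1) - mean(group2)) / pooled_var ** 0.5

treatment = [5.1, 6.3, 5.8, 6.9, 6.0]  # hypothetical scores
control   = [4.2, 5.0, 4.8, 5.5, 4.6]
print(round(cohens_d(treatment, control), 2))  # -> ~2.07, a very large effect
```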
Meta-Analyses
• Researchers have developed a special set of statistical techniques to summarize and evaluate entire sets of research findings
• Meta-analysis refers to the use of these techniques to analyze the results of studies rather than the responses of individual participants
• Literally, it means that you are analyzing analyses!
• It can help us determine in what contexts a specific effect holds true
• Ex: gender conformity experiment, pg. 291
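As an illustration of the idea, here is a minimal sketch of one simple meta-analytic technique, fixed-effect inverse-variance weighting, in which more precise studies count more toward the overall effect. The effect sizes and variances are hypothetical, and this is only one of several approaches:

```python
# Each study contributes (effect size d, variance of d); smaller variance
# means a more precise study, which earns a larger weight
studies = [
    (0.42, 0.04),  # hypothetical study 1
    (0.31, 0.02),  # hypothetical study 2
    (0.55, 0.08),  # hypothetical study 3
]
weights = [1 / v for _, v in studies]
pooled = sum(w * d for (d, _), w in zip(studies, weights)) / sum(weights)
print(round(pooled, 3))  # weighted average effect across studies -> ~0.376
```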
T-Tests
• One of the most important statistical tests
• Used to compare TWO GROUPS or TWO MEANS
Conditions of a t-test
• The groups must be independent: their outcomes don't affect each other
• The response variable you are measuring must be quantitative: its values have numerical meaning and represent quantities
What do we use it for?
• Typically used to study the mean of a population, not individuals within a population
• It is especially useful if your data set is small or if you don't know the standard deviation of the population
• What your actual distribution looks like depends on your sample size: smaller samples yield larger standard deviations and therefore a flatter curve
• The distribution is also based on the degrees of freedom: how big is your sample?
• Degrees of freedom are the number of items that are free to vary; for two independent groups, df = n1 + n2 − 2
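A minimal sketch of a two-group t-test in Python, assuming SciPy is available; the group scores are hypothetical:

```python
from scipy import stats  # assumes SciPy is available

control = [72, 68, 75, 70, 69, 74]  # hypothetical scores, group 1
treated = [78, 80, 74, 79, 82, 77]  # hypothetical scores, group 2

result = stats.ttest_ind(control, treated)  # independent two-sample t-test
df = len(control) + len(treated) - 2        # degrees of freedom: n1 + n2 - 2
print(f"t = {result.statistic:.2f}, df = {df}, p = {result.pvalue:.4f}")
```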
When are t-tests not enough?
• If you are extensively studying a population, at some point comparing just two means might not be enough…
• This is where ANOVAs come in
ANOVA
• One of the most commonly used statistical tests at a more advanced level
• Stands for analysis of variance
• All about examining the variance in a variable and trying to figure out where that variance comes from
• Used to compare several populations on some quantitative variable
• The populations you are comparing constitute different groups, denoted by another variable (ex: age, ethnicity, etc.)
• Particularly useful when you are comparing groups that receive different treatments
ANOVA
• Has several components:
• Sum of squares: the pieces of variability
• F-test: compares how much the group means differ with how much variability there is within each group
• ANOVA table
• Ex: applied to a one-factor/one-way ANOVA, comparing responses based on only one treatment variable
• A two-way ANOVA has two treatment variables, or two IVs
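A minimal Python sketch of those pieces for a one-way ANOVA, computed by hand on hypothetical data:

```python
from statistics import mean

# Total variability is partitioned into between-group and within-group
# sums of squares; their ratio (per degree of freedom) is the F statistic
groups = [[4, 5, 6, 5], [7, 8, 6, 7], [10, 9, 11, 10]]  # hypothetical data
grand = mean([x for g in groups for x in g])

ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
ss_within  = sum((x - mean(g)) ** 2 for g in groups for x in g)

df_between = len(groups) - 1                            # k - 1
df_within  = sum(len(g) for g in groups) - len(groups)  # N - k

F = (ss_between / df_between) / (ss_within / df_within)
print(round(F, 2))  # large F: group differences dwarf within-group noise
```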
ANOVA Conditions
• The groups are independent: their outcomes don't affect each other
• The populations you have sampled are normally distributed
• Proper random sampling and assignment help here, but normality is an assumption to check rather than a guarantee
• The variance within each group is assumed equal (to start)
• After you have checked all of this, decide on your H0 (null) and Ha (alternative)
ANOVA
• Verify your assumptions
• Come up with your hypotheses
• Collect your data and analyze it
• Determine your p-value
• If you find significance, you can run multiple comparisons, which tell you exactly where the differences are (between which groups)
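A minimal sketch of these steps in Python, assuming SciPy is available; the group scores are hypothetical:

```python
from scipy import stats  # assumes SciPy is available

# Hypothetical scores for three treatment groups
group_a = [12, 14, 11, 13, 15]
group_b = [16, 18, 17, 15, 19]
group_c = [12, 13, 14, 12, 13]

F, p = stats.f_oneway(group_a, group_b, group_c)  # one-way ANOVA
print(f"F = {F:.2f}, p = {p:.4f}")
# If p < .05, follow up with multiple comparisons (e.g., Tukey's HSD)
# to pinpoint WHICH group means differ
```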
There's more…
• You can do two-way, three-way, or four-way ANOVAs (factorial ANOVAs) when you have more than one independent variable
• If you use these, you have to interpret main effects as well as interactions
References
Pelham, B. W., & Blanton, H. (2012). Conducting research in psychology: Measuring the weight of smoke (4th ed.). Belmont, CA: Wadsworth, Cengage Learning.