Chris Cunningham, Ph.D.
UC Foundation Asst. Prof., Dept. of Psychology, UTC
Adjunct, Department of Internal Medicine, UTCOM‐C
Cover the basics of PASW
Help you feel confident enough to try using PASW
Show you how to be smart about data organization
Teach you when certain statistics are more useful than others
Most of this material comes from my own experience, but I have also found the following resources particularly helpful and worth mentioning to you:
Field, A. (2009). Discovering statistics using SPSS (3rd ed.). Sage Publications
http://www.statisticshell.com/
Dr. Michael Biderman (UTC): http://www.utc.edu/faculty/michael‐biderman
PASW tutorials (built into the program’s Help function)
Design comes before data collection and before data analysis
No statistical analysis program can fix a bad research design
Garbage in, garbage out
When you are stuck, ask for help
No one knows everything
Research involves discovering relationships
between independent variables (IV) and dependent variables (DV)
between treatments and results
between procedures and outcomes
Make sure your IVs and DVs are measurable, meaningful, and clearly definable
For a relationship to be detectable, there must be variation in two or more variables that you are trying to study
Some patients must get better while others do not
Some subjects receive one treatment, while others get a placebo
Read
Search the literature
Talk with other professionals
Listen to your patients
Take long walks
Write out your expectations, stating how the outcome varies as the treatment varies.
State why you have the above expectations, based on your examination of the literature and other factors
2/11/2010 CJL Cunningham
Type I (false alarm)
Highlighting an “effect” that is really due to chance/error
Set by our choice of alpha
Type II (miss)
Failing to identify an effect that is truly there
Linked to our choice of alpha level and other factors
Why would we design a study that is destined to not find any effects?
This is what we’re doing if we do not consider statistical power ahead of time
Power is defined as the probability that we will detect a significant effect or relationship if it truly exists
It is influenced by four main factors
Without careful design, a research study can be seen as a search for a needle in a haystack
Wouldn’t you prefer to search for a pitchfork?
When designing a study in which you control the IV, make the manipulation and comparison groups as distinct as possible
Common in the social and behavioral sciences is an error risk (Type I) of 5% (alpha = .05)
Depending on your area of research, you may want to be more stringent (alpha = .01)
Generally, the more stringent your error risk criterion is, the lower your statistical power (and the higher your risk of committing a Type II error)
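The trade-off between alpha and power can be sketched with a few lines of Python. This is only a rough illustration, assuming a two-sided comparison of two group means and a normal approximation; the effect size (d = 0.5) and group size (50 per group) are hypothetical values chosen for the example, and approx_power is a name made up here, not a PASW function.

```python
from statistics import NormalDist


def approx_power(d, n_per_group, alpha):
    """Approximate power for a two-sided comparison of two group means.

    Uses a normal approximation; d is the standardized mean difference.
    The negligible contribution of the opposite tail is ignored.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)          # critical value set by alpha
    signal = d * (n_per_group / 2) ** 0.5      # expected size of the test statistic
    return z.cdf(signal - z_crit)


# Hypothetical study: medium effect, 50 cases per group
for alpha in (0.05, 0.01):
    print("alpha =", alpha, "approximate power =", round(approx_power(0.5, 50, alpha), 2))
```

Holding everything else constant, the stricter alpha gives the lower power, which is the pattern described above.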
Variability comes from two main sources: random error and systematic bias
Bias can be controlled with good design
Random error just is
If there is less error, there is less “noise” in our dataset, making it easier for us to detect what was really happening in our sample
Use homogeneous samples and reliable measurement techniques
Larger samples tend to better represent the broader, real population
Statistics calculated from large samples tend to be less biased than statistics calculated from small samples
Your sample size needs are primarily dictated by the size of the effect you are expecting to observe
Large effects require smaller samples
Estimate for the weakest expected effect
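To see how strongly the expected effect size drives sample size, here is a minimal sketch using a common normal-approximation formula for comparing two group means. The defaults (alpha = .05, power = .80) and the three effect sizes are assumptions for illustration only, and n_per_group is a hypothetical helper, not part of PASW. A dedicated power analysis tool will give slightly larger numbers.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate cases needed per group to detect a standardized
    mean difference d with a two-sided test (normal approximation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) / d) ** 2)


# Hypothetical small, medium, and large standardized effects
sizes = {d: n_per_group(d) for d in (0.2, 0.5, 0.8)}
for d, n in sizes.items():
    print("effect size", d, "-> about", n, "cases per group")
```

Planning around the weakest effect you expect means using the smallest d in this calculation, which yields the largest sample.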
This is better demonstrated within the program…
Refer to the guide and the screen here as we walk through some of the basic features and layout of this program
Variables are what we measure/observe
Three main types relevant to PASW:
Nominal – most basic, categorical “names”
Only good for frequencies and grouping
Ordinal – numerical, rank‐ordered
Good for basic comparisons (< or >)
Scale – numerical, meaningful tie to construct
Good for most analyses within PASW
Alternatives:
A word processor such as Word (if you have only a few measurements on each case)
A spreadsheet such as Excel
A database program such as Access
A statistics software program such as PASW
If entered following basic rules, data can be moved from the above to any statistics package
Cleaning and organizing data is sometimes easier in Excel than PASW
Two main ways to import or transfer data from Excel to PASW
Ugly way: Copy and paste
Not as simple as it sounds
Professional way: Use the Database Wizard
Follow me and your guide for a demo…
First place to turn is the PASW Help Tutorial
Reading data
Using the data editor
A general walk‐through may also be helpful, so please follow along in the guide as I demonstrate…
Is Vine St. or Oak St. safer? You decide to measure the speeds of randomly selected cars between 5 and 6pm:
Vine St. Speeds: 34 32 32 30 31 29 35 38 37 36 35 39 35 37 35 34 33 32 39 37 36 35 30 29 31 33 35 32
Oak St. Speeds: 27 26 29 30 38 37 36 37 24 30 31 36 34 38 31 20 37 29 30 35 37 31 26 38 31 30 27 26
On which street are speeds typically faster?
First, run Frequencies on Vine St. speeds.
Then run Frequencies on Oak St. speeds.
We are focusing on means (averages) here
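For anyone who wants to check the PASW output by hand, the same frequency tables and means can be sketched in Python. The speeds are the ones listed above; the variable names vine and oak are just labels chosen for this example.

```python
from collections import Counter
from statistics import mean

vine = [34, 32, 32, 30, 31, 29, 35, 38, 37, 36, 35, 39, 35, 37,
        35, 34, 33, 32, 39, 37, 36, 35, 30, 29, 31, 33, 35, 32]
oak = [27, 26, 29, 30, 38, 37, 36, 37, 24, 30, 31, 36, 34, 38,
       31, 20, 37, 29, 30, 35, 37, 31, 26, 38, 31, 30, 27, 26]

for street, speeds in (("Vine St.", vine), ("Oak St.", oak)):
    print(street, "mean speed:", round(mean(speeds), 2))
    # Frequency table: each observed speed and how often it occurred
    for speed, count in sorted(Counter(speeds).items()):
        print("  ", speed, ":", count)
```

With these numbers the Vine St. mean comes out higher (about 34 versus about 31.5), while the Oak St. speeds are more spread out (20 to 38, compared with 29 to 39 on Vine St.).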
There are several basic “rules” to remember when entering data
Most of these are learned by trial and error, but hopefully I can save you some time by reviewing a few of them now (not in your guide)
Create a rectangular table
Rows must represent patients, the objects of study
Columns represent the different measurements on those patients (i.e., variables)
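A rectangular table of this kind can be sketched as a small comma-separated file, which most statistics packages can read. The patients, variable names, and values below are invented purely for illustration.

```python
import csv
import io

# Hypothetical data: one dictionary (row) per patient, one key (column) per variable
patients = [
    {"patient_id": 1, "age": 54, "charge": 1250.00},
    {"patient_id": 2, "age": 37, "charge": 980.50},
    {"patient_id": 3, "age": 68, "charge": 2100.75},
]

buffer = io.StringIO()  # stands in for a file on disk
writer = csv.DictWriter(buffer, fieldnames=["patient_id", "age", "charge"])
writer.writeheader()        # first row holds the variable names
writer.writerows(patients)  # every later row is one patient

table_text = buffer.getvalue()
print(table_text)
```

Every row has the same columns in the same order, which is what makes the table rectangular.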
Actual age rather than age group
Actual charge
Excess detail can always be collapsed away by creating categories, for example, age categories or charge categories.
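Collapsing detail after the fact is simple, as this short sketch suggests. The ages and the cut points (40 and 65) are hypothetical; choose categories that make sense for your own study. Note that the reverse is impossible: actual ages cannot be recovered from age groups.

```python
def age_group(age):
    """Collapse an actual age into one of three illustrative categories."""
    if age < 40:
        return "under 40"
    if age < 65:
        return "40-64"
    return "65 and over"


ages = [23, 41, 67, 35, 58]  # hypothetical actual ages as entered
groups = [age_group(a) for a in ages]
print(groups)
```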
Use the value labeling option after you set numeric placeholders (as described in the guide)
For dates, use four-digit years (unlike the example above…sorry)
Text and Statistical Packages
[Figure: what is in Excel compared with what ends up in the statistics package]
The > sign makes this entry text.
The statistics package treats problematic values as missing.
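The effect of a stray character in a numeric column can be mimicked in a few lines. This is a simplified sketch of the general behavior, not PASW's actual import logic; the cell contents are hypothetical and to_number is a helper invented for the example.

```python
def to_number(cell):
    """Return the cell as a number, or None (missing) if it is not purely numeric."""
    try:
        return float(cell)
    except ValueError:
        return None


cells = ["120", ">200", "95.5", "", "n/a"]  # hypothetical spreadsheet entries
values = [to_number(c) for c in cells]
print(values)
```

Only the purely numeric entries survive. The entry with the > sign, the blank cell, and the text note all end up missing.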
Wrong way
Each cell contains two pieces of information
Right way
Use two columns to record the two variables, Location and Side
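If data have already been entered the wrong way, the combined cells can often be split afterward. The sketch below assumes hypothetical entries in which the two pieces are joined by a hyphen; real data may need a different separator or some manual cleanup.

```python
# Hypothetical cells that each hold two pieces of information
combined = ["Arm-Left", "Leg-Right", "Arm-Right"]

rows = []
for cell in combined:
    # Split on the first hyphen only: the first part is Location, the rest is Side
    location, side = cell.split("-", 1)
    rows.append({"Location": location, "Side": side})

for row in rows:
    print(row)
```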
Missing data
[Figure: data in spreadsheet compared with data in statistics package]
The entry “No”.
In the following excerpt from a very wide database
Each patient “owns” a row
All data for that patient or case should be on same row
Hyperplasia was assessed at 3, 6, 12, 18, 24, and 30 months after biopsy
To facilitate inspection of key data values, put those that’ll be viewed at the same time together.
Statistical packages can now analyze millions of cases and thousands of measures per case
Don’t worry about entering too much data
Not likely to reach the limit of the PASW program
For the sake of simplicity, though, only enter the data you realistically expect to analyze now or in the near future
Start with a decision tree (see the guide)
Take advantage of the PASW Statistics Coach
Consult with someone who might know how to help you figure this out and possibly conduct the appropriate analyses
Follow me and the guide…
Generally two options
Generate basic graphics in PASW
Export or copy and paste your PASW output into Excel and work with it there
Choice depends on your comfort/familiarity with Excel and the level of detail you need in your graphics
Follow me and the guide for some basic tips…
Descriptive statistics describe a set of data (and conceptually the population that the data are supposed to represent)
Central tendency
Variability/dispersion
A couple of ways to generate these statistics within PASW
Follow me and the guide…
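As a cross-check on what PASW reports, the basic measures of central tendency and variability can be computed with Python's standard library. The scores below are hypothetical.

```python
from statistics import mean, median, stdev

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical measurements

# Central tendency
center_mean = mean(scores)
center_median = median(scores)

# Variability/dispersion
spread_range = max(scores) - min(scores)
spread_sd = stdev(scores)  # sample standard deviation (divides by n - 1)

print("mean:", center_mean, "median:", center_median)
print("range:", spread_range, "standard deviation:", round(spread_sd, 3))
```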
Goal of a correlational analysis is to demonstrate whether and to what degree two or more variables are related to each other
Pearson’s r is the most common statistic
Ranges from ‐1 to +1 (any other value = mistake)
Values closer to ‐1 or +1 indicate a stronger relationship between two variables
Follow me and the guide for a demo…
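For those curious about what is behind the number, Pearson's r can be written out directly from its definition. The two variables below (hours and score) are made-up values for illustration, and pearson_r is a helper written for this sketch.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # How the two variables vary together, and how each varies on its own
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


hours = [1, 2, 3, 4, 5]   # hypothetical variable 1
score = [2, 4, 5, 4, 5]   # hypothetical variable 2
r = pearson_r(hours, score)
print("r =", round(r, 3))
```

The result always falls between -1 and +1. A perfectly inverse pair, such as hours against its own negatives, gives -1.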
This workshop is really a beginner’s affair
There are many, many other ways to analyze and explore your data
These other methods are designed to allow you to answer your research questions
Do not allow the methods to dictate the questions you ask
For these more advanced analyses, you may want to consult with someone experienced
Comparing groups
t‐tests and ANOVA
Predicting outcomes
Regression (linear and otherwise)
Data behaving badly (or counted data)
Nonparametric statistics