Post on 26-Mar-2015
CBR221 Introduction to Survey Data Analysis with Excel
Workshop Objectives
Use Excel to help you:
– Organize data for analysis
– Systematically work with data
– Analyze data
– Graphically display analysis
We’ll Examine
– Types of variables used in analysis
– Types of measurement scales used in analysis
– How to describe data with Frequency counts, Descriptive Statistics, Histograms, and Pivot Tables
– How to create charts
Survey
Questions:
1. Area of the city child lives in?
2. Number of colds child had last year?
3. Gender: Male_____ Female _____
4. Age
5. Describe cold symptoms __________
Analysis
Helps describe, conclude, recommend
Systematic exploration for interpreting data
Survey Data
Answers can be in text or numbered formats
Statistics
Systematic method of converting and analyzing data by using numbers
Excel
Support tool for statistical methods
© The Wellesley Institute www.wellesleyinstitute.com
Start Excel
Moving from Model to Excel Data Analysis
• Assumption, hypothesis, or model
• Collect data
• Organize data for analysis
Example Model or Assumption
• “West area children get fewer colds than central area children.”
• Want proof
• Analysis: Find mean (average) number of colds by area
When Data Fit Model
• Data cluster as expected
• Findings support assumption or model
For 33 west area children,
the mean is 5 colds.
For 31 central area children,
the mean is 6 colds.
The results indicate west area children do have fewer colds than central area children.
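The comparison of means by area can be sketched outside Excel as well. The following is a minimal Python illustration, assuming a small made-up list of (area, colds) rows; the values and the helper name `mean_colds_by_area` are hypothetical and are not the workshop data.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical survey rows: (area, number of colds last year).
# These values are made up for illustration; they are not the workshop data.
responses = [
    ("West", 4), ("West", 6), ("West", 5),
    ("Central", 7), ("Central", 5), ("Central", 6),
]

def mean_colds_by_area(rows):
    """Group the cold counts by area and return the mean for each area."""
    groups = defaultdict(list)
    for area, colds in rows:
        groups[area].append(colds)
    return {area: mean(values) for area, values in groups.items()}

means = mean_colds_by_area(responses)
for area, value in means.items():
    print(f"{area}: mean colds = {value}")
```

Grouping first and averaging second mirrors the workshop's approach of organizing data by area before analyzing it.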
Table Illustrates
Zone      Participants (n)   Colds (mean, x̄)
West      33                 5
Central   31                 6
Total     64
Table 1. Colds by City Area
Graph Illustrates
[Bar chart of mean colds by area: West 5, Central 6]
Figure 1. West area children had a lower mean number of colds than central area children.
Pie Chart Illustrates
[Pie chart: West, 5; Central, 6]
Figure 1. West area children had a lower mean of five (5) colds than central area children, who had a mean of six (6).
Data Organizing Tips
As required, ensure:
• All questions were answered
• Questions that needed to be skipped were skipped
• Multiple-choice answers are split into 1 answer per column
Open File 1
Coding Open Ended Questions
What cold symptoms did your child have?
• Take first 100 answers
• Group similar answers together
• You define what is similar
• Reduce to 10-20 codes, or fewer if useful
• Pilot test
Exercise: Create 3 codes
Question: Describe child’s symptoms?
• Response 1: Stuffy nose
• Response 2: Sinus congestion, Runny nose
• Response 3: Difficulty breathing through nose
• Response 4: Phlegm, Body ache, Runny nose
• Response 5: No energy, Cough
• Response 6: No energy
• Response 7: Sore throat
• Response 8: Cough
• Response 9: No energy
Exercise: Code Symptoms
Code    Description
Block   Stuffy nose, Sinus congestion, Difficulty breathing through nose, No energy
Expel   Cough, Phlegm, Runny nose
Pain    Body Ache, Headache, Sore throat
Assign Values to Codes
Code    Description                                                                   Value
Block   Stuffy nose, Sinus congestion, Difficulty breathing through nose, No energy   Yes=1, No=0
Expel   Cough, Phlegm, Runny nose                                                     Yes=1, No=0
Pain    Body Ache, Headache, Sore throat                                              Yes=1, No=0
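To show how the three codes turn free-text answers into Yes=1 / No=0 values, here is a minimal Python sketch. It assumes that matching each answer against the description keywords by lowercase substring is good enough; that matching rule and the helper name `code_response` are assumptions for illustration, not the workshop's procedure. The nine answers are the ones from the coding exercise.

```python
# Keyword lists copied from the code table above; matching by lowercase
# substring is an assumption made for this sketch.
CODES = {
    "Block": ["stuffy nose", "sinus congestion",
              "difficulty breathing through nose", "no energy"],
    "Expel": ["cough", "phlegm", "runny nose"],
    "Pain": ["body ache", "headache", "sore throat"],
}

def code_response(text):
    """Return {code: 1 or 0} for one open-ended answer."""
    lowered = text.lower()
    return {
        code: int(any(keyword in lowered for keyword in keywords))
        for code, keywords in CODES.items()
    }

responses = [
    "Stuffy nose",
    "Sinus congestion, Runny nose",
    "Difficulty breathing through nose",
    "Phlegm, Body ache, Runny nose",
    "No energy, Cough",
    "No energy",
    "Sore throat",
    "Cough",
    "No energy",
]

coded = [code_response(answer) for answer in responses]
for number, row in enumerate(coded, start=1):
    print(number, row["Block"], row["Expel"], row["Pain"])
```

Each answer becomes one row with three 1/0 columns, which is the shape the data-entry slides ask for.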
I entered the code variables and values on a new Excel sheet
I entered the responses with values on a new Excel sheet
To Enter Data
        A    B          C
row 1   ID   Location   Block
row 2   1    1          0
row 3   2    2          1
row 4   3    1          0
• Row 1 has a label for each variable
• Enter data 1 survey at a time
• 1 row per ID, work left to right
• 1 answer per column
To Enter Data
• Enter 1 survey at a time
• For each question, work left to right across a single Excel row
• Look at File 1b
Practice
Enter answers on File 1b, sheet 3
Tip: Split answers across 3 columns
Question 2
• ID#1 Answer: a) Block X  b) Expel  c) Pain
• ID#2 Answer: a) Block X  b) Expel X  c) Pain
• ID#3 Answer: a) Block X  b) Expel  c) Pain
• ID#4 Answer: a) Block X  b) Expel X  c) Pain X
• ID#5 Answer: a) Block X  b) Expel X  c) Pain
Compare your results with Data sheet
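One way to picture the "split answers across 3 columns" tip is as rows of 1/0 values, one row per ID. The Python sketch below assumes an X means Yes=1 and no X means No=0, and writes the rows as CSV text that a spreadsheet could open. The dictionary layout and the helper name `to_rows` are hypothetical; the marks are read from the practice answers.

```python
import csv
import io

# Marks copied from the practice answers: which of a) Block, b) Expel,
# c) Pain carry an X for each ID.
marked = {
    1: {"Block"},
    2: {"Block", "Expel"},
    3: {"Block"},
    4: {"Block", "Expel", "Pain"},
    5: {"Block", "Expel"},
}
COLUMNS = ["Block", "Expel", "Pain"]

def to_rows(marks):
    """Build a header row plus one row per ID with a 1/0 value per column."""
    rows = [["ID"] + COLUMNS]
    for respondent_id in sorted(marks):
        values = [int(column in marks[respondent_id]) for column in COLUMNS]
        rows.append([respondent_id] + values)
    return rows

rows = to_rows(marked)

# Write to an in-memory CSV; saving to a real file is left out of the sketch.
buffer = io.StringIO()
csv.writer(buffer).writerows(rows)
print(buffer.getvalue())
```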
Other Possible Open-Ended Question Codes
1 = positive comment
2 = negative comment
3 = neutral comment, positive and negative
Verbatim codes:
e.g., “ache”, “congested”, “cough”
Code only those related to research question
Text or Numbers
• Text codes often easier to remember, fewer entry errors
• e.g. M for male and F for female
• But numbers often faster entry
• Easier to work with numbers in Excel
Check for Response Accuracy
• See File 1b, Accuracy Sheet
• Take 1 question at a time, i.e., pick out a single column and check its answers
• ID numbers at left
• For an unanswered question, leave the cell blank (pivot tables count blanks)
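Checking one column at a time can be sketched as a small validation step. The Python example below assumes a hypothetical gender column keyed by ID, with `None` standing for a blank cell and the coding 1 = male, 2 = female; the data and the helper name `check_column` are invented for illustration.

```python
# Hypothetical column of answers keyed by ID; None stands for a blank cell.
gender_column = {1: 1, 2: 2, 3: None, 4: 3, 5: 1}
ALLOWED = {1, 2}  # assumed coding: 1 = male, 2 = female

def check_column(column, allowed):
    """Return (blank_ids, invalid_ids) for one question's answers."""
    blank = [i for i, value in column.items() if value is None]
    invalid = [
        i for i, value in column.items()
        if value is not None and value not in allowed
    ]
    return blank, invalid

blank_ids, invalid_ids = check_column(gender_column, ALLOWED)
print("Blank:", blank_ids)
print("Out of range:", invalid_ids)
```

Keeping the ID with each flagged answer makes it easy to go back to the original survey form and correct the entry.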
Use Excel
When you have a larger number of respondents
Makes calculations that would otherwise be done by hand easier
Organize Data
• Make sure data are entered into Excel in such a way that mathematical operations can be performed on them
• E.g., if studying gender, let male equal 1 and female equal 2. You can then count the 1s and 2s in your study.
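Counting the 1s and 2s is a frequency count. A minimal Python sketch, assuming a made-up gender column that uses the coding 1 = male, 2 = female (the values are hypothetical):

```python
from collections import Counter

# Hypothetical gender column using the coding 1 = male, 2 = female.
gender = [1, 2, 2, 1, 2, 2, 1]
LABELS = {1: "male", 2: "female"}

counts = Counter(gender)
for value in sorted(counts):
    print(f"{LABELS[value]} ({value}): {counts[value]}")
```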
To Explain Findings
Use common terms e.g., Variable
Use accepted methods of analyses
(Certain variables and measurement scales use certain tests)
Variable
An object or human characteristic that:
• Is observable
• Can be subject to variation
• Can be classified according to a type (discrete, continuous as well as dependent, independent)
Variable and Measurement Scale
• Certain variables also use certain scales
• Can do frequency test for all variables
• But variable and scale type may also call for an additional statistical test, e.g., descriptive statistics
Discrete vs Continuous Variable
Continuous
• Infinite values
• Valid values in-between
• E.g., distance, height, age
Discrete
• Finite values
• No valid values in-between
• E.g., male/female, full/part-time/co-op
Discrete or Continuous Variable
Hypothesis: City areas have different water temperatures.
Discrete Variable: e.g., West, Central, East area
Yes or No value, no values in-between, no equi-distance
Continuous: e.g., water temperature
Infinite number or range of possible values in-between
Dependent vs Independent Variable
Dependent
• Assumed to change
Independent
• Assumed to influence or does not change
Hypothesis: City areas have different water temperatures
Dependent: Water temperature
Independent: City area
Measurement Scales
How do we measure what we are working with?
Types:
1. Nominal
2. Ordinal
3. Interval
4. Ratio
Nominal Scale
• Finite values
• No values in-between
• No logical order
• Yes/No answer
• E.g., East, West, Central
• E.g., colour, gender
Ordinal Scale
• Finite values
• No values in-between
• Has a logical order
• Yes/No answer
• E.g., letter grades
[Diagram: letter grades shown in order, e.g., Grade A, Grade B, Grade C]
Interval Scale
• Infinite values
• Logical order
• Values in-between
• Equal distance between data points
• How much? Numerical value
• No natural zero, keeps going
• No meaningful ratio between numbers
• E.g., temperature: 20 degrees is NOT twice as hot as 10
Ratio Scale
• Infinite values
• Logical order
• Values in-between
• Equal distance between data points
• Comparing how much? Numerical values
• “0” value means something
• Meaningful ratio between numbers
• E.g., age
• E.g., adult earns $50K; teenager earns $25K; adult earns twice as much as teenager
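The difference between interval and ratio scales can be checked with a little arithmetic. The Python sketch below assumes the "20 degrees" and "10 degrees" in the temperature example are degrees Celsius (the slides do not say which scale) and re-expresses them in Kelvin, which has a true zero. The income figures are the ones from the ratio scale example.

```python
def celsius_to_kelvin(celsius):
    """Convert Celsius to Kelvin, a temperature scale with a true zero."""
    return celsius + 273.15

# Interval scale: dividing Celsius readings gives a misleading "ratio".
naive_ratio = 20 / 10
# The same two temperatures compared on the Kelvin scale.
kelvin_ratio = celsius_to_kelvin(20) / celsius_to_kelvin(10)

# Ratio scale: income has a meaningful zero, so the ratio can be read directly.
income_ratio = 50_000 / 25_000

print(f"Celsius 20/10: {naive_ratio}")
print(f"Kelvin ratio: {kelvin_ratio:.3f}")
print(f"Income ratio: {income_ratio}")
```

On the Kelvin scale the two temperatures differ by only a few percent, which is why 20 degrees is not "twice as hot" as 10, while $50K really is twice $25K.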
Variable Type Review
Variable Type                                     Infinite   Values       Values     Values        Scale *
                                                  range      in-between   in order   equidistant
Discrete (Yes/no): Areas                          X          X            X          X             Nominal
Discrete (Yes/no): Grade A B C                    X          X            √          X             Ordinal
Continuous (How much?): Temperature or $$, Age    √          √            √          √             Interval or Ratio
Dependent: Assumed changes due to an independent variable
Independent: Assumed does not change
* Certain variables lend themselves to using certain types of measurement scales
Measurement Scale Review
Scale                                                                         Finite in   Values       Values       Values        Test *
                                                                              a range     have order   in-between   equidistant
Nominal (yes/no): Areas                                                       √           X            X            X             Frequency
Ordinal (yes/no): A, B, C                                                     √           √            X            X             Frequency
Interval/Ratio (how much for 1 sample?): Age                                  X           √            √            √             Descriptive Statistics
Interval/Ratio (how much for comparing 2 or more samples?): G1 age, G2 age    X           √            √            √             Inferential Statistics, e.g., t-test
Note: * Certain scales lend themselves to using certain statistical tests.
Statistical Tests
• Frequency count for discrete data (nominal, ordinal scale) *
  – quantity
• Descriptive Statistics for continuous data (interval, ratio scale)
  – mean, median, mode
  – Characteristics of single sample
• Inferential Statistics for continuous data (interval, ratio scale)
  – t-test
  – Comparison of two or more samples
  – Making inference from samples to populations
Note: * Can do frequency count for all data types
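The slides name the t-test as an example of an inferential test for comparing two samples, without saying which form to use. As a hedged illustration only, the Python sketch below computes Welch's t statistic (one common form for two independent samples) for two hypothetical groups of ages. The data and the helper name `welch_t` are invented, and the sketch stops at the statistic without computing a p-value.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical ages for two groups (G1, G2); invented for illustration.
g1_ages = [10, 12, 11, 13]
g2_ages = [14, 15, 13, 16]

def welch_t(sample_a, sample_b):
    """Return Welch's t statistic for two independent samples."""
    standard_error = sqrt(
        variance(sample_a) / len(sample_a)
        + variance(sample_b) / len(sample_b)
    )
    return (mean(sample_a) - mean(sample_b)) / standard_error

t_statistic = welch_t(g1_ages, g2_ages)
print(f"t = {t_statistic:.3f}")
```

A t statistic far from zero suggests the two group means differ by more than the spread within the groups would explain; judging significance still requires the degrees of freedom and a p-value.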
Methodology
• “By using a discrete/continuous independent variable called area (for city areas West, East, and Central), this study examined the discrete/continuous, dependent variable (with its possible multiple range of values) namely, water temperature.”
• “To measure the independent variable, (to indicate if someone lived in the West, East, or Central area), a nominal/ordinal/interval/ratio scale was used.”
• “To measure the dependent variable (to indicate what the water temperature is), a(n) nominal/ordinal/interval/ratio scale was used.”
Methodology
• “By using a discrete independent variable called area, this study examined the continuous, dependent variable namely, water temperature.”
• “To measure the independent variable, a nominal scale was used.”
• “To measure the continuous dependent variable, an interval scale was used.”
Calculations
The statistical calculations we would conduct with Excel would be:
• Frequency counts for each of the 3 city areas
• Descriptive statistics of mean, median, mode for water temperatures
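The two calculations listed here can be sketched in a few lines of Python. The readings below are hypothetical (invented area labels paired with invented water temperatures), so only the shape of the calculation carries over to the workshop files.

```python
from collections import Counter
from statistics import mean, median, mode

# Hypothetical readings: (city area, water temperature in degrees).
# The numbers are invented for illustration only.
readings = [
    ("West", 12), ("West", 14), ("Central", 15),
    ("Central", 14), ("East", 11), ("East", 14), ("West", 18),
]

# Frequency count for the discrete variable (city area).
area_counts = Counter(area for area, _ in readings)

# Descriptive statistics for the continuous variable (water temperature).
temperatures = [temperature for _, temperature in readings]
summary = {
    "mean": mean(temperatures),
    "median": median(temperatures),
    "mode": mode(temperatures),
}

print("Frequency by area:", dict(area_counts))
print("Water temperature:", summary)
```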
Review of Descriptive Statistical Terms
• Mean – average
• Median – the middle score: 50% of scores lie above it and 50% lie below it
• Mode – score that occurs most often
Format Cells
Open Booklet
Open File 1c