Date post: | 22-Dec-2015 |
Data Analysis
Statistics
OVERVIEW
Getting Ready for Data Collection
The Data Collection Process
Getting Ready for Data Analysis
Descriptive Statistics
GETTING READY FOR DATA COLLECTION
Four steps
• Constructing a data collection form
• Establishing a coding strategy
• Collecting the data
• Entering data onto the collection form
THE DATA COLLECTION PROCESS
Begins with raw data
Raw data are unorganized data
CONSTRUCTING DATA COLLECTION FORMS
ID   Gender   Grade   Building   Reading Score   Mathematics Score
1    2        8       1          55              60
2    2        2       6          41              44
3    1        8       6          46              37
4    2        4       6          56              59
5    2        10      6          45              32
One column for each variable
One row for each subject
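As a rough illustration of the "one column per variable, one row per subject" layout, the sample form above can be written out as CSV text. This is only a sketch: the snake_case field names and the `to_csv` helper are illustrative choices, not part of the original form.

```python
import csv
import io

# Column names follow the form above; the snake_case spellings are illustrative.
FIELDS = ["id", "gender", "grade", "building", "reading_score", "math_score"]

# The five subjects from the sample form, one tuple per row.
ROWS = [
    (1, 2, 8, 1, 55, 60),
    (2, 2, 2, 6, 41, 44),
    (3, 1, 8, 6, 46, 37),
    (4, 2, 4, 6, 56, 59),
    (5, 2, 10, 6, 45, 32),
]

def to_csv(fields, rows):
    """Return the form as CSV text: a header line, then one line per subject."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(fields)    # one column for each variable
    writer.writerows(rows)     # one row for each subject
    return buffer.getvalue()

form_csv = to_csv(FIELDS, ROWS)
print(form_csv)
```

Keeping the data in this rectangular shape means any spreadsheet or statistics package can read it without further restructuring.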
CODING DATA
Use single digits when possible
Use codes that are simple and unambiguous
Use codes that are explicit and discrete
Variable            Range of Data Possible      Example
ID Number           001 through 200             138
Gender              1 or 2                      2
Grade               1, 2, 4, 6, 8, or 10        4
Building            1 through 6                 1
Reading Score       1 through 100               78
Mathematics Score   1 through 100               69
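One way to put the coding table to work is to check each entered record against the allowed codes before analysis. The sketch below assumes records are stored as dictionaries; the field names and the `invalid_fields` helper are hypothetical.

```python
# Allowed codes, taken from the coding table above.
CODEBOOK = {
    "id": set(range(1, 201)),             # 001 through 200
    "gender": {1, 2},
    "grade": {1, 2, 4, 6, 8, 10},
    "building": set(range(1, 7)),         # 1 through 6
    "reading_score": set(range(1, 101)),  # 1 through 100
    "math_score": set(range(1, 101)),     # 1 through 100
}

def invalid_fields(record):
    """Return the names of fields whose value is missing or not an allowed code."""
    return [name for name, allowed in CODEBOOK.items()
            if record.get(name) not in allowed]

# The example column of the coding table.
example = {"id": 138, "gender": 2, "grade": 4, "building": 1,
           "reading_score": 78, "math_score": 69}
# A data-entry slip: grade 3 is not one of the coded grades.
typo = dict(example, grade=3)

print(invalid_fields(example))  # []
print(invalid_fields(typo))     # ['grade']
```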
Interpretation
• The process of making pertinent inferences and drawing conclusions concerning the meaning and implications of a research investigation
The Basics
Descriptive statistics
Inferential statistics
Sample statistics
Population parameters
Sample--------------population
Sample statistics
Variables in a sample or measures computed from sample data
Population parameters
The variables in a population or measured characteristics of the population
Making Data Usable
…Or what to do with all those numbers
Descriptive Statistics
Frequency Distributions
Organizing a set of data by summarizing the number of times a particular value of a variable occurs
Frequency distribution of ice cream consumption
Age      Frequency (number in range)
0        25
1-5      15
6-10     8
11-15    2
TOTAL    50
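Building a frequency distribution amounts to sorting each raw value into its range and counting. A minimal sketch follows, assuming the four age ranges of the table above; the raw ages in `sample_ages` are made up purely for illustration and are not the data behind the table.

```python
from collections import Counter

# (label, lowest age, highest age) for each range in the table above.
BINS = [("0", 0, 0), ("1-5", 1, 5), ("6-10", 6, 10), ("11-15", 11, 15)]

def age_range(age):
    """Return the label of the range that contains the given age."""
    for label, low, high in BINS:
        if low <= age <= high:
            return label
    raise ValueError(f"age {age} is outside the tabulated ranges")

def frequency_distribution(ages):
    """Count how many ages fall in each range, keeping the table's row order."""
    counts = Counter(age_range(age) for age in ages)
    return {label: counts.get(label, 0) for label, _, _ in BINS}

sample_ages = [0, 3, 4, 7, 0, 12, 5, 9]  # hypothetical raw data
print(frequency_distribution(sample_ages))
# {'0': 2, '1-5': 3, '6-10': 2, '11-15': 1}
```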
Percentage Distributions
Organizing the frequency distribution into a chart or graph that summarizes percentage values associated with particular values of a variable
Proportion
The share of elements that meet some criterion, expressed as a percentage, fraction, or decimal
Percentage distribution of ice cream consumption by age
Age      Percent (of people who consumed ice cream in range)
0        50
1-5      30
6-10     16
11-15    4
TOTAL    100%
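Each percentage is the range's frequency divided by the total number of observations, times 100. The sketch below applies that to the frequencies from the ice cream table; the function name is an illustrative choice.

```python
def percentage_distribution(frequencies):
    """Convert a {category: count} mapping into {category: percent of total}."""
    total = sum(frequencies.values())
    return {category: 100 * count / total
            for category, count in frequencies.items()}

# Frequencies from the ice cream consumption table (total of 50).
frequencies = {"0": 25, "1-5": 15, "6-10": 8, "11-15": 2}
percentages = percentage_distribution(frequencies)

print(percentages)
# {'0': 50.0, '1-5': 30.0, '6-10': 16.0, '11-15': 4.0}
```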
Graphic Representations of Data
[Pie Chart: Ice cream consumption. Slices: Winter, Spring, Summer, Fall]
[Bar Chart: Frequency of Seasonal Ice Cream consumption. X-axis: Winter, Spring, Summer, Fall; Y-axis: Amt, 0 to 90]
Cross tabulation
Cross tabulation: a technique for organizing data by groups, categories or classes, thus facilitating comparisons; a joint frequency distribution of observations on two or more sets of variables
Types of Cross tabs
Contingency table: the results of a cross tabulation of two variables, such as survey questions
Cross tab of question: Do you have children under the age of six currently living with you? This is a 2x2 table. Why?
          Yes   No    Total
Males     5     15    20
Females   10    20    30
Total     15    35    50
Types of Cross tabs
Percentage cross-tab: using percentages helps us make relative comparisons. The total number of respondents/observations may be used as a base for computing the percentage in each cell. In the table below, each row total (shown in parentheses) is the base.
Percentage Cross tab: Do you have children under the age of six currently living with you?
          Yes      No       Total
Males     25%      75%      100% (20)
Females   33.33%   66.67%   100% (30)
Total     30%      70%      100% (50)
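Row percentages can be recomputed directly from the counts in the contingency table (Males 5 and 15, Females 10 and 20), which is a useful check on any hand-built percentage table. This is a sketch using plain dictionaries; the names are illustrative.

```python
def row_percentages(table):
    """Express each cell as a percentage of its own row total (two decimals)."""
    result = {}
    for row_label, cells in table.items():
        base = sum(cells.values())  # the row total is the base
        result[row_label] = {column: round(100 * count / base, 2)
                             for column, count in cells.items()}
    return result

# Counts from the contingency table on children under six.
counts = {
    "Males": {"Yes": 5, "No": 15},
    "Females": {"Yes": 10, "No": 20},
}
# Add the column totals as a third row before converting.
counts["Total"] = {column: sum(row[column] for row in counts.values())
                   for column in ("Yes", "No")}

percent_table = row_percentages(counts)
for row_label, cells in percent_table.items():
    print(row_label, cells)
```

With the row total as the base, every row sums to 100%, so the groups can be compared even though they differ in size.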
[Bar Chart: Frequency of Seasonal Ice Cream consumption Shown By Gender. X-axis: Winter, Spring, Summer, Fall; Y-axis: 0 to 90; Series: Male, Female]
Graphical representation of results from cross tab
Elaboration Analysis of Cross tabs
Analysis of the basic cross-tab for each level of another variable, such as subgroups of the same sample
Percentage Cross tab: Do you have children under the age of six currently living with you?
Moderator variable; spurious relationship

Aged 17-25
          Male   Female
Yes       0      2
No        10     20

Aged 25 and up
          Male   Female
Yes       5      8
No        0      0
Calculating Rank Data
Please place in rank order the following varieties of cookies (1 = most preferred to 4 = least preferred)
__ Chocolate chip
__ Marshmallow
__ Oatmeal
__ Oreo

Respondent   Choco chip   Marshm   Oatmeal   Oreo
1            1            2        4         3
2            1            3        4         2
3            2            1        3         4
4            2            4        3         1
5            2            1        3         4
6            3            4        1         2
7            2            3        1         4
8            1            4        2         3
9            4            3        2         1
10           2            1        3         4

Chocolate chip: (3X1) + (5X2) + (1X3) + (1X4) = 20 (lowest total, most preferred)
Marshmallow: (3X1) + (1X2) + (3X3) + (3X4) = 26
Oatmeal: (2X1) + (2X2) + (4X3) + (2X4) = 26
Oreo: (2X1) + (2X2) + (2X3) + (4X4) = 28
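The rank totals can also be computed straight from the ten rows of the table by summing each column; the variety with the lowest total is the most preferred overall. A minimal sketch, with illustrative names:

```python
COOKIES = ["Chocolate chip", "Marshmallow", "Oatmeal", "Oreo"]

# One row per respondent, in the column order of COOKIES (1 = most preferred).
RANKINGS = [
    (1, 2, 4, 3),
    (1, 3, 4, 2),
    (2, 1, 3, 4),
    (2, 4, 3, 1),
    (2, 1, 3, 4),
    (3, 4, 1, 2),
    (2, 3, 1, 4),
    (1, 4, 2, 3),
    (4, 3, 2, 1),
    (2, 1, 3, 4),
]

def rank_totals(cookies, rankings):
    """Sum the ranks each variety received across all respondents."""
    return {cookie: sum(row[i] for row in rankings)
            for i, cookie in enumerate(cookies)}

totals = rank_totals(COOKIES, RANKINGS)
most_preferred = min(totals, key=totals.get)  # lowest total wins

print(totals)
print(most_preferred)
```

A quick sanity check: with ten respondents each assigning the ranks 1 through 4 once, the four totals must add up to 10 x (1 + 2 + 3 + 4) = 100.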
Measures of central tendency
Mode: the value that occurs most often
Median: the midpoint; the value below which half the values in a distribution fall
Mean: the arithmetic average
Remember: what type of scale you use determines the type of statistic you may calculate
WHEN TO USE WHICH MEASURE
Measure of Central Tendency   Level of Measurement   Use When                        Examples
Mode                          Nominal                Data are categorical            Eye color, party affiliation
Median                        Ordinal                Data include extreme scores     Rank in class, birth order
Mean                          Interval and ratio     You can, and the data fit       Speed of response, age in years
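The three measures can be tried out with Python's standard `statistics` module. The sketch below reuses the five subjects from the sample data collection form: gender codes are nominal, so only the mode applies, while reading scores support the median and the mean.

```python
import statistics

# Values from the sample data collection form (five subjects).
gender_codes = [2, 2, 1, 2, 2]          # nominal: codes are labels, not quantities
reading_scores = [55, 41, 46, 56, 45]   # treated as interval data here

most_common_gender = statistics.mode(gender_codes)
median_reading = statistics.median(reading_scores)
mean_reading = statistics.mean(reading_scores)

print(most_common_gender)  # 2
print(median_reading)      # 46
print(mean_reading)        # 48.6
```

Averaging the gender codes would run without error, but the result would be meaningless, which is the point of matching the statistic to the level of measurement.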
Measures of dispersion
What is the tendency for measures to depart from the central tendency?
Range: simplest measure of dispersion
Deviation scores: quantitative index of dispersion
Average deviation: never used
Variance: the sum of squared deviation scores divided by sample size minus 1; often used (variance is in squared units, e.g. squared dollars)
Standard deviation: square root of variance
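To make the definitions concrete, the sketch below computes the range, the sample variance (sum of squared deviations divided by n minus 1), and the standard deviation for the five mathematics scores on the sample data collection form. The helper name `sample_variance` is illustrative; the standard `statistics` module is used only as a cross-check.

```python
import math
import statistics

# Mathematics scores from the sample data collection form.
math_scores = [60, 44, 37, 59, 32]

def sample_variance(scores):
    """Sum of squared deviations from the mean, divided by (n - 1)."""
    mean = sum(scores) / len(scores)
    squared_deviations = [(score - mean) ** 2 for score in scores]
    return sum(squared_deviations) / (len(scores) - 1)

score_range = max(math_scores) - min(math_scores)  # highest minus lowest
variance = sample_variance(math_scores)            # in squared score units
std_dev = math.sqrt(variance)                      # back in score units

print(score_range)            # 28
print(round(variance, 1))     # 161.3
print(round(std_dev, 2))      # 12.7

# Cross-check against the standard library's sample variance.
assert math.isclose(variance, statistics.variance(math_scores))
```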
MEASURES OF VARIABILITY
Variability is the degree of spread or dispersion in a set of scores
Range: difference between highest and lowest score
Standard deviation: average difference of each score from mean
THE MEAN AND THE STANDARD DEVIATION
STANDARD DEVIATIONS AND % OF CASES
The normal curve is symmetrical
One standard deviation on each side of the mean contains about 34% of the area under the curve
About 68% of scores lie within ± 1 standard deviation of the mean
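The 34% and 68% figures can be checked numerically with `statistics.NormalDist` from the Python standard library (Python 3.8 or later). The sketch uses a standard normal curve, with mean 0 and standard deviation 1, as an assumption for illustration; the percentages are the same for any normal distribution.

```python
from statistics import NormalDist

# Standard normal curve: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)

# Area between the mean and one standard deviation above it.
one_side = standard_normal.cdf(1) - standard_normal.cdf(0)

# Area within one standard deviation on both sides of the mean.
within_one_sd = standard_normal.cdf(1) - standard_normal.cdf(-1)

print(f"{one_side:.1%}")       # 34.1%
print(f"{within_one_sd:.1%}")  # 68.3%
```

Because the curve is symmetrical, the two-sided area is exactly twice the one-sided area.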