8/3/2019 SPSS Summary Statistics
SPSS Summary Statistics
Our first topic of statistical analysis is the method we use to derive tables of summary
statistics from SPSS for Windows. For this purpose we choose the data obtained for
Question 3 in the questionnaire we used for the previous section, which is the
likelihood of the respondents buying a packet of Woolworth dog biscuits at $15 per
packet. The answers are given on a scale from 0, which means that the respondent will
definitely not buy it, to 10, which means that the respondent will definitely buy it.
To obtain our summary statistics using Frequencies analysis, from the menus choose:
Analyze
Descriptive Statistics
Frequencies...
The following dialogue box appears. On the left hand side, you will see a list of all of
the variables that you have defined. We select 'likelihood of buying':
Now click on the right arrow near the middle of the dialogue box. You will see that
the variable has now been moved to the Variables list on the right hand side of the
dialogue box.
Now click on the Statistics button. A dialogue box called "Frequencies: Statistics"
comes up, in which you can select from a large number of summary statistics for the
computer to calculate from your data. Here we have selected the mean, standard
deviation, median, minimum, maximum and mode, among others.
Click on the Continue button to return to the original dialogue box.
Another button that you can click on is the Charts button, which brings up a dialogue
box named "Frequencies: Charts." Here you can ask the computer to draw a variety of
graphs from your data. Here we have selected Pie charts.
Click on Continue to confirm and leave. Then, after you have returned to the original
dialogue box, click on the OK button, which runs all of the calculations and graphs
that you have asked for.
The output for the summary statistics calculations is in a file called Output1. You
will see that a frequency distribution table has been constructed for 'likelihood of
buying' and that all of the summary statistics selected are shown above the frequency
distribution table. The summary statistics are:
Statistics
likelihood of buying

N                Valid         25
                 Missing        0
Mean                         5.56
Median                       6.00
Mode                            2
Std. Deviation              3.042
Variance                    9.257
Minimum                         1
Maximum                        10
Percentiles      25          2.50
                 50          6.00
                 75          8.50
The center of the distribution can be approximated by the median (or second quartile)
6.00, and half of the data values fall between 2.5 and 8.5, the first and third quartiles.
The mean is quite close to the median, suggesting that the distribution is symmetric.
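As a cross-check outside SPSS, the summary statistics can be reproduced from the frequency counts in the frequency distribution table of this section. The sketch below uses Python's standard statistics module, which is our choice and not part of the SPSS workflow. It assumes that the row labelled "99% chance" holds the score 10, and all variable names are illustrative.

```python
import statistics

# Frequency counts per score, read from the frequency distribution table.
# Assumption: the row labelled "99% chance" corresponds to the score 10.
counts = {1: 2, 2: 4, 3: 2, 4: 2, 5: 2, 6: 3, 7: 2, 8: 2, 9: 3, 10: 3}

# Expand the counts back into the 25 individual scores.
scores = [value for value, n in counts.items() for _ in range(n)]

summary = {
    "N": len(scores),
    "Mean": round(statistics.mean(scores), 2),
    "Median": statistics.median(scores),
    "Mode": statistics.mode(scores),
    # stdev and variance use the sample (n - 1) denominator.
    "Std. Deviation": round(statistics.stdev(scores), 3),
    "Variance": round(statistics.variance(scores), 3),
    "Minimum": min(scores),
    "Maximum": max(scores),
    # The default "exclusive" method interpolates at position p * (n + 1).
    "Quartiles": statistics.quantiles(scores, n=4),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

For these data the values agree with the SPSS Statistics table, including the quartiles 2.5, 6.0 and 8.5. Other percentile conventions exist, so agreement on a different data set is not guaranteed.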
Let's look first at the frequency distribution table shown below. The first column is
called value label. This column is empty because our data for Question 3 can be left
as a score out of 10 without further explanation; we therefore did not need to define
what each of the numbers means in the Value column when we were defining the
variables. On the other hand, if you are drawing a
frequency distribution table for the shopping duty of the respondents, the value label
column would be filled with "Yes" in the first row, "Shared duty" in the second
row, etc. A row is also given for missing values if there are any.
likelihood of buying

                      Frequency   Percent   Valid Percent   Cumulative Percent
Valid   1                  2         8.0         8.0                8.0
        2                  4        16.0        16.0               24.0
        3                  2         8.0         8.0               32.0
        4                  2         8.0         8.0               40.0
        5                  2         8.0         8.0               48.0
        6                  3        12.0        12.0               60.0
        7                  2         8.0         8.0               68.0
        8                  2         8.0         8.0               76.0
        9                  3        12.0        12.0               88.0
        99% chance         3        12.0        12.0              100.0
        Total             25       100.0       100.0
The second column is called value, and it simply lists the numbers entered as data.
The third column shows the frequency of occurrence of each number or response. The
fourth column is the percentage of each response, including the missing values: the
frequency of each response or missing value is divided by the total number of
responses and missing values. The fifth column is also the percentage of each
response, but without taking the missing values into account: the frequency of each
response is divided by the total number of valid responses only. Both columns should
therefore add up to 100%. The final column is the cumulative percentage of the valid
percentages in the fifth column.
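The arithmetic behind the Percent, Valid Percent and Cumulative Percent columns can be sketched in a few lines of Python. This is an illustration rather than SPSS's own code. It assumes that the row labelled "99% chance" holds the score 10, and the names counts and missing are ours.

```python
# Frequency counts from the frequency distribution table.
# Assumption: the row labelled "99% chance" corresponds to the score 10.
counts = {1: 2, 2: 4, 3: 2, 4: 2, 5: 2, 6: 3, 7: 2, 8: 2, 9: 3, 10: 3}
missing = 0  # number of respondents with no answer (0 in this data set)

valid_total = sum(counts.values())
grand_total = valid_total + missing

rows = []
running = 0
for value, n in sorted(counts.items()):
    running += n
    rows.append({
        "value": value,
        "frequency": n,
        # Percent: divided by valid responses plus missing values.
        "percent": 100 * n / grand_total,
        # Valid Percent: divided by valid responses only.
        "valid_percent": 100 * n / valid_total,
        # Cumulative Percent: running total of the valid percentages.
        "cumulative_percent": 100 * running / valid_total,
    })

for row in rows:
    print("{value:>5} {frequency:>9} {percent:>7.1f} "
          "{valid_percent:>13.1f} {cumulative_percent:>18.1f}".format(**row))
```

Because there are no missing values here, Percent and Valid Percent coincide. Setting missing to a positive number makes the Percent column smaller than the Valid Percent column.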
The pie chart is as below:
You may also want a bar chart, ordered by descending frequencies, to help you find
the mode and also to visually compare the relative frequencies. To obtain an ordered
bar chart, recall the Frequencies dialog box. Click Charts. Select Bar charts. Click
Continue.
Then click Format in the Frequencies dialog box. Select Descending counts. The
frequency table can be arranged according to the actual values in the data or according
to the count (frequency of occurrence) of those values, and in either ascending or
descending order.
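To see what ordering by descending counts does to the table, the following Python sketch sorts the frequency counts and prints a rough text bar for each value. It assumes that the row labelled "99% chance" holds the score 10. The tie-breaking rule (smaller value first) is our assumption, made only so that the result is deterministic.

```python
# Frequency counts from the frequency distribution table.
# Assumption: the row labelled "99% chance" corresponds to the score 10.
counts = {1: 2, 2: 4, 3: 2, 4: 2, 5: 2, 6: 3, 7: 2, 8: 2, 9: 3, 10: 3}

# Sort by count, largest first; ties are broken by ascending value.
by_descending_count = sorted(counts.items(), key=lambda item: (-item[1], item[0]))

# The first entry is the mode, because it has the highest count.
for value, n in by_descending_count:
    print(f"{value:>2} {'#' * n} ({n})")
```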
Click Continue. Click OK in the Frequencies dialog box to get the following bar
chart:
With your output, it is possible to do two things: print it or copy it to another
application such as Microsoft Word. To print, go to the File pull down menu and select
Print, then press the OK button. If you only want to print a selection of the output,
highlight your selection and, when you are in the Print dialogue box, select Print
Selection before clicking on the OK button.
Alternatively, highlight a part or the whole of your output and in the Edit pull down
menu select Copy. This puts the selection on the Clipboard so that when you enter
Microsoft Word, you can select Paste from the Edit pull down menu to transfer the
selection to your Microsoft Word file. A small tip: change the font of the pasted table
to Courier New in Word for proper alignment.
Going back to Output1 in SPSS, we are done with our output for now, so
minimize the window for Output1, after which you should see an icon for Output1 at
the bottom of your screen.
Descriptives
Descriptive statistics can also be obtained using the Descriptives procedure in
SPSS. To run a Descriptives analysis, from the menus choose:
Analyze
Descriptive Statistics
Descriptives...
Select the following variables: bring shopping bags, willing to support, protect
environment, use scrap paper and government support. Check the Save standardized
values as variables box to get measurements that are free from units of measurement:
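A standardized value (z-score) is the raw value minus the mean, divided by the standard deviation. The Python sketch below illustrates the calculation, assuming the sample (n - 1) standard deviation. The function name standardize and the ratings are hypothetical and invented for illustration; they are not taken from the questionnaire data.

```python
import statistics


def standardize(values):
    """Return z-scores: (x - mean) / sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # n - 1 denominator (an assumption here)
    return [(x - mean) / sd for x in values]


# Hypothetical 1-5 ratings for one item, invented for illustration only.
ratings = [1, 5, 4, 2, 5, 3, 1, 4]
z_scores = standardize(ratings)

# Standardized values have mean 0 and standard deviation 1, whatever the
# original unit of measurement was.
print([round(z, 3) for z in z_scores])
```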
Click Options, and make the following selections:
Click OK to get the following output:
SPSS Split File Procedure
Suppose we want to see if there are differences between male and female respondents
in terms of their attitudes towards protecting the environment. From the
menus, select Data, and then Split File. Click Compare groups, and move gender
into the text box as shown below:
Click OK and then repeat the Descriptives procedure as described in the previous
section to get the following output:
Descriptive Statistics

gender                          N   Minimum   Maximum     Sum     Mean   Std. Deviation
female   bring shopping bags   11     1.00      5.00    37.00   3.3636      1.56670
         willing to support    11     4.00      5.00    49.00   4.4545       .52223
         protect environment   11     4.00      5.00    50.00   4.5455       .52223
         use scrap paper       11     1.00      5.00    33.00   3.0000      1.61245
         government support    11     4.00      5.00    51.00   4.6364       .50452
         Valid N (listwise)    11
male     bring shopping bags   14     1.00      5.00    31.00   2.2143      1.57766
         willing to support    14     4.00      5.00    64.00   4.5714       .51355
         protect environment   14     4.00      5.00    62.00   4.4286       .51355
         use scrap paper       14     1.00      5.00    32.00   2.2857      1.54066
         government support    14     4.00      5.00    61.00   4.3571       .49725
         Valid N (listwise)    14
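Conceptually, Split File groups the cases by the chosen variable and then runs the same procedure once per group. The Python sketch below illustrates this. The raw environmental scores are not listed in this document, so it uses the likelihood of buying scores by gender, read off the stem-and-leaf plots in the Exploratory Data Analysis section. All names are illustrative.

```python
import statistics
from collections import defaultdict

# (gender, likelihood of buying) pairs, read off the stem-and-leaf plots
# in the Exploratory Data Analysis section of this document.
records = (
    [("female", s) for s in [2, 2, 3, 4, 4, 5, 6, 6, 9, 10, 10]]
    + [("male", s) for s in [1, 1, 2, 2, 3, 5, 6, 7, 7, 8, 8, 9, 9, 10]]
)

# "Split" the file: collect the scores separately for each group.
groups = defaultdict(list)
for gender, score in records:
    groups[gender].append(score)

# Run the same descriptive statistics once per group.
descriptives = {}
for gender, scores in groups.items():
    descriptives[gender] = {
        "N": len(scores),
        "Minimum": min(scores),
        "Maximum": max(scores),
        "Sum": sum(scores),
        "Mean": statistics.mean(scores),
        "Std. Deviation": statistics.stdev(scores),  # n - 1 denominator
    }

for gender, stats in descriptives.items():
    print(gender, {name: round(value, 4) for name, value in stats.items()})
```

The group sizes (11 female, 14 male) match the N column of the Descriptive Statistics table, which is a quick way to confirm that the split was applied as intended.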
SPSS Summary Procedure
Use Summarize to create a summary report. To begin the analysis, from the menus
choose:
Analyze
Reports
Case Summaries...
Select likelihood of buying as the variable to summarize. Select education as the
grouping variable. Because this is a grouped summary report, individual case
listings are not required, so deselect Display cases.
Click Statistics. Select Mean, Median, Minimum, and Maximum as the cell statistics.
Note that Number of Cases appears by default in that list.
Click Continue, and then click Options in the Summarize dialog box. Type
Likelihood of Buying as the title. Type Grouped by Education as the caption.
Click Continue. Click OK in the Summarize dialog box to get the output:
Summarize

Case Processing Summary
                                                      Cases
                                     Included        Excluded          Total
                                     N   Percent     N   Percent     N   Percent
likelihood of buying * education    25   100.0%      0     .0%      25   100.0%

Likelihood of Buying
likelihood of buying

education               N    Mean   Median   Minimum   Maximum
primary                 6    5.67    6.00       2      99% chance
secondary               6    6.33    6.50       2      99% chance
some university         6    4.83    4.50       1      9
university or above     7    5.43    5.00       1      9
Total                  25    5.56    6.00       1      99% chance

Group by Education
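A quick consistency check on a grouped report is that the group sizes add up to the total N and that the N-weighted average of the group means reproduces the overall mean. The Python sketch below applies this check to the N and Mean values in the report above; the dictionary name is ours.

```python
# (N, mean) per education group, copied from the summary report.
group_stats = {
    "primary": (6, 5.67),
    "secondary": (6, 6.33),
    "some university": (6, 4.83),
    "university or above": (7, 5.43),
}

total_n = sum(n for n, _ in group_stats.values())

# The overall mean is the N-weighted average of the group means.
overall_mean = sum(n * mean for n, mean in group_stats.values()) / total_n

print(total_n, round(overall_mean, 2))
```

The group means are themselves rounded to two decimals, so the recomputed overall mean can only be expected to agree with the Total row after rounding.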
Exploratory Data Analysis
Exploring data can help to determine whether the statistical techniques that you are
using for data analysis are appropriate. The Explore procedure requires that the
dependent variable be a scale variable and that the grouping variables be ordinal or
nominal. To use the analysis, from the menus choose:
Analyze
Descriptive Statistics
Explore...
Select likelihood of buying as the dependent variable. Select gender as the factor
variable, and label cases by identification number.
Click Statistics and select the following:
Click Continue. Click Plots in the Explore dialog box. Request tests of normality
for the data. These tests will be calculated individually for each gender.
Click Continue. Click OK in the Explore dialog box to get the output:
Stem-and-Leaf Plots

likelihood of buying Stem-and-Leaf Plot for
GENDER= female

 Frequency    Stem &  Leaf
     5.00        0 .  22344
     4.00        0 .  5669
     2.00        1 .  00

 Stem width:  10
 Each leaf:   1 case(s)

likelihood of buying Stem-and-Leaf Plot for
GENDER= male

 Frequency    Stem &  Leaf
     5.00        0 .  11223
     8.00        0 .  56778899
     1.00        1 .  0

 Stem width:  10
 Each leaf:   1 case(s)
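The stem-and-leaf layout can be rebuilt by hand: with a stem width of 10, the stem is the tens digit and the leaf is the units digit, and each stem is split into a low row (leaves 0 to 4) and a high row (leaves 5 to 9). The Python sketch below illustrates this. The function name is hypothetical, and unlike SPSS it does not print rows that contain no cases.

```python
def stem_and_leaf(values):
    """Return (frequency, stem, leaves) rows for a stem width of 10.

    Each stem is split into a low half (leaves 0-4) and a high half
    (leaves 5-9), matching the layout of the plots in this section.
    """
    rows = {}
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        key = (stem, leaf >= 5)  # False = low half, True = high half
        rows.setdefault(key, []).append(str(leaf))
    return [(len(leaves), stem, "".join(leaves))
            for (stem, _), leaves in sorted(rows.items())]


# Scores read off the stem-and-leaf plots for each gender.
female = [2, 2, 3, 4, 4, 5, 6, 6, 9, 10, 10]
male = [1, 1, 2, 2, 3, 5, 6, 7, 7, 8, 8, 9, 9, 10]

for label, scores in (("female", female), ("male", male)):
    print(f"GENDER= {label}")
    for frequency, stem, leaves in stem_and_leaf(scores):
        print(f"{frequency:>9.2f} {stem:>8} .  {leaves}")
```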
The Explore procedure also outputs boxplots. Boxplots allow you to compare each
group using a five-number summary: the median, the 25th and 75th percentiles, and
the minimum and maximum observed values that are not statistically outlying.
Outliers and extreme values are also highlighted in the drawing.
The heavy black line inside each box marks the 50th percentile, or median, of that
distribution. The lower and upper hinges, or box boundaries, mark the 25th and 75th
percentiles of each distribution, respectively. Whiskers appear above and below the
hinges. Whiskers are vertical lines ending in horizontal lines at the largest and
smallest observed values that are not statistical outliers. Outliers are identified with an
O. Extreme values are marked with an asterisk (*).
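The quantities drawn in a boxplot can be sketched in Python as follows. Two assumptions are not stated in the text above: the cut-offs follow the usual convention of 1.5 box lengths for outliers and 3 box lengths for extreme values, and the quartiles from statistics.quantiles stand in for the hinges, which SPSS may compute slightly differently. The function name is hypothetical.

```python
import statistics


def boxplot_summary(values):
    """Five-number summary plus outliers and extremes for one group."""
    data = sorted(values)
    # Assumption: these quartiles approximate the hinges of the box.
    q1, median, q3 = statistics.quantiles(data, n=4)
    box_length = q3 - q1
    inside, outliers, extremes = [], [], []
    for x in data:
        distance = max(q1 - x, x - q3, 0)  # distance beyond the box edge
        if distance > 3 * box_length:
            extremes.append(x)      # drawn as an asterisk (*)
        elif distance > 1.5 * box_length:
            outliers.append(x)      # drawn as an O
        else:
            inside.append(x)
    return {
        "median": median, "q1": q1, "q3": q3,
        # Whiskers end at the most distant values that are not outlying.
        "whisker_low": min(inside), "whisker_high": max(inside),
        "outliers": outliers, "extremes": extremes,
    }


# Male scores read off the stem-and-leaf plot.
male = [1, 1, 2, 2, 3, 5, 6, 7, 7, 8, 8, 9, 9, 10]
print(boxplot_summary(male))
```

Under these assumptions the male group has no outliers, so its whiskers simply run from the minimum score to the maximum score.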
Normal Q-Q Plots
Compute
Let's compute the age of each respondent. In the questionnaire, we only ask for
the year in which the respondent was born, so the result is the age in whole years.
To compute the age, select Transform, and then Compute to display the following
dialog box. Enter "Age" in the Target Variable text box, then "2008 - year" in the
Numeric Expression text box, and click on OK to calculate the age of each
respondent.
You will notice that a new column with the column heading "Age" appears in the data
file.
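The Compute step applies one expression to every case and stores the result in a new variable. The Python sketch below shows the same idea. The birth years are hypothetical and invented for illustration, and the function name is ours.

```python
SURVEY_YEAR = 2008  # reference year used in the Numeric Expression


def compute_age(birth_year, survey_year=SURVEY_YEAR):
    """Age in whole years; one year too high if the birthday is still to come."""
    return survey_year - birth_year


# Hypothetical birth years, invented for illustration only.
birth_years = [1975, 1982, 1990]

# Equivalent of the new "Age" column: one computed value per respondent.
ages = [compute_age(year) for year in birth_years]
print(ages)  # [33, 26, 18]
```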
Factor Analysis
The objective of factor analysis is to identify the underlying dimensions, or to
combine correlated variables into a smaller number of variables. To do factor
analysis, select Analyze, then Data Reduction, then Factor to open the following
dialog box. Move the five variables related to environmental protection to the text
box under Variables:
Click on the Rotation box to display the following dialog box, then select Varimax:
Click on Continue to go back to the previous dialog box. Then click the Options
button to display the following dialog box. Change the value in the text box next to
"Suppress absolute values less than:" from 0.1 to 0.4.
Click on Continue, then OK to obtain the output. Look at the following table in the
output:
Rotated Component Matrix(a)

                            Component
                            1        2
likelihood of buying                 .782
bring shopping bags         .938
willing to support                   .854
protect environment         .472     .549
use scrap paper             .927

Extraction Method: Principal Component Analysis.
Rotation Method: Varimax with Kaiser Normalization.
a Rotation converged in 3 iterations.
From the above, we know that we can combine "bring shopping bags" and "use scrap
paper" into component or factor 1, and we can name this factor actual
environmental protection behaviour. Meanwhile, the remaining three variables can
be combined into component or factor 2, which we can label environmental
protection with mere lip service.
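The reading of the rotated component matrix can be made explicit: each variable is assigned to the component on which it has the largest absolute loading. The Python sketch below illustrates that rule. It does not perform the factor analysis itself. Blank cells (loadings suppressed below 0.4) are stored as None, and the column of each single loading is inferred from the interpretation in the text, which is an assumption.

```python
# Rotated loadings as (component 1, component 2).
# None marks a loading left blank because its absolute value was below
# the 0.4 suppression threshold. Column placement of the single loadings
# is inferred from the interpretation given in the text.
loadings = {
    "likelihood of buying": (None, 0.782),
    "bring shopping bags": (0.938, None),
    "willing to support": (None, 0.854),
    "protect environment": (0.472, 0.549),
    "use scrap paper": (0.927, None),
}


def dominant_component(row):
    """Return the 1-based index of the largest absolute loading."""
    sizes = [abs(x) if x is not None else 0.0 for x in row]
    return sizes.index(max(sizes)) + 1


# Group the variables by their dominant component.
factors = {}
for variable, row in loadings.items():
    factors.setdefault(dominant_component(row), []).append(variable)

for component, variables in sorted(factors.items()):
    print(component, variables)
```

Note that "protect environment" loads on both components (.472 and .549), so its assignment to component 2 is less clear-cut than the others.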
Reliability Analysis
Very often we use more than one item to measure an important construct/concept in
our study. To check if the answers to the items measuring the same construct are
consistent, select Analyze, then Scale and then Reliability Analysis to pop up the
following dialog box. Move environ2, environ3 and environ5 into the Items list as
shown below:
Click on Statistics to display the following dialog box. Select Item and Scale
if item deleted.
Click OK to get the results. If the value of Alpha (Cronbach's alpha) is greater than
0.7, the answers given by the respondents are conventionally regarded as consistent.
You may then use the summation of the items for subsequent analysis.
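Cronbach's alpha compares the sum of the individual item variances with the variance of the summed scale: alpha = k / (k - 1) * (1 - sum of item variances / variance of total score), where k is the number of items. The Python sketch below implements this standard formula. The answers are hypothetical and invented for illustration; they are not the environ2, environ3 and environ5 data.

```python
import statistics


def cronbach_alpha(items):
    """items: one list of scores per item, all in the same respondent order."""
    k = len(items)
    item_variances = [statistics.variance(scores) for scores in items]
    # Each respondent's total score across the k items.
    totals = [sum(answers) for answers in zip(*items)]
    total_variance = statistics.variance(totals)
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical 1-5 answers from six respondents to three items.
item_a = [4, 5, 3, 5, 2, 4]
item_b = [4, 4, 3, 5, 2, 5]
item_c = [5, 5, 2, 4, 3, 4]

alpha = cronbach_alpha([item_a, item_b, item_c])
print(round(alpha, 3))
```

When the items move together, the variance of the total is large relative to the item variances and alpha approaches 1. Items that are identical for every respondent give an alpha of exactly 1.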