SPSS Summary Statistics

8/3/2019 SPSS Summary Statistics


Our first topic of statistical analysis is how to derive tables of summary
statistics from SPSS for Windows. For this purpose we use the data obtained for
Question 3 in the questionnaire from the previous section, which is the
likelihood of the respondents buying a packet of Woolworth dog biscuits at $15 per
packet. The answers are given on a scale from 0, which means that the respondent will
definitely not buy it, to 10, which means that the respondent will definitely buy it.

To obtain our summary statistics using Frequencies analysis, from the menus choose:

Analyze

Descriptive Statistics

Frequencies...

The following dialogue box appears. On the left-hand side, you will see a list of all of
the variables that you have defined. We select 'likelihood of buying':

Now click on the right arrow near the middle of the dialogue box. You will see that
the variable has now been moved to the Variables list on the right-hand side of the
dialogue box.



Now click on the Statistics button. A dialogue box called "Frequencies: Statistics"
comes up, in which you can select from a large number of summary statistics
for the computer to calculate on your data. Here we have selected, among others,
the mean, standard deviation, median, minimum, maximum and mode.

Click on the Continue button to return to the original dialogue box.

Another button that you can click on is the Charts button, which brings up a dialogue
box named "Frequencies: Charts". Here you can ask the computer to draw a variety of
graphs from your data. Here we have selected Pie charts.



Click on Continue to confirm and leave. Then, after you have returned to the original
dialogue box, click on the OK button, which runs all of the calculations and graphs
that you have asked for.

The output for the summary statistics calculations is in a file called Output1. You
will see that a frequency distribution table has been constructed for 'likelihood of
buying' and that all of the summary statistics selected are shown above the frequency
distribution table. The summary statistics are:

Statistics

likelihood of buying

N               Valid        25
                Missing       0
Mean                       5.56
Median                     6.00
Mode                          2
Std. Deviation            3.042
Variance                  9.257
Minimum                       1
Maximum                      10
Percentiles     25         2.50
                50         6.00
                75         8.50



The center of the distribution can be approximated by the median (or second quartile)

6.00, and half of the data values fall between 2.5 and 8.5, the first and third quartiles.

The mean is quite close to the median, suggesting that the distribution is symmetric.
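As a cross-check outside SPSS, the summary statistics above can be reproduced from the counts in the frequency distribution table for 'likelihood of buying'. The sketch below uses Python's standard `statistics` module. The variable names are illustrative, and the `exclusive` quantile method is an assumption, chosen because it gives the same quartiles as the SPSS output for these data.

```python
import statistics

# Counts per score, copied from the frequency distribution table
# (the row labelled "99% chance" is taken to be the score 10).
frequency = {1: 2, 2: 4, 3: 2, 4: 2, 5: 2, 6: 3, 7: 2, 8: 2, 9: 3, 10: 3}

# Expand the counts back into the 25 individual responses.
scores = [score for score, count in sorted(frequency.items()) for _ in range(count)]

summary = {
    "N": len(scores),
    "Mean": round(statistics.mean(scores), 2),
    "Median": statistics.median(scores),
    "Mode": statistics.mode(scores),
    "Std. Deviation": round(statistics.stdev(scores), 3),  # sample (n - 1) formula
    "Variance": round(statistics.variance(scores), 3),     # sample (n - 1) formula
    "Minimum": min(scores),
    "Maximum": max(scores),
    # Quartiles; the 'exclusive' method is an assumption that happens to
    # agree with the SPSS percentiles for these data.
    "Percentiles": statistics.quantiles(scores, n=4, method="exclusive"),
}

for name, value in summary.items():
    print(f"{name:15} {value}")
```

The standard deviation and variance use the sample (n - 1) formulas, which is what the values 3.042 and 9.257 in the output correspond to.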

Let's look first at the frequency distribution table shown below. The first column is
called value label. This column is empty because our data for Question 3 can be left
as a score out of 10 without further explanation; we therefore did not need to
define what each number means in the Value column when we were defining the
variables. On the other hand, if you are drawing a frequency distribution table for
the shopping duty of the respondents, the value label column would be filled with
"Yes" in the first row, "Shared duty" in the second row, etc. A row is also given
for missing values if there are any.


                     Frequency   Percent   Valid Percent   Cumulative Percent
Valid   1                 2         8.0          8.0               8.0
        2                 4        16.0         16.0              24.0
        3                 2         8.0          8.0              32.0
        4                 2         8.0          8.0              40.0
        5                 2         8.0          8.0              48.0
        6                 3        12.0         12.0              60.0
        7                 2         8.0          8.0              68.0
        8                 2         8.0          8.0              76.0
        9                 3        12.0         12.0              88.0
        99% chance        3        12.0         12.0             100.0
        Total            25       100.0        100.0

The second column is called value and simply contains the numbers used in the input of
the data. The third column shows the frequency of occurrence of each number or
response. The fourth column is the percentage of each response, including
the missing values: the frequency of each response or missing value is divided
by the total number of responses and missing values. The fifth column is also the
percentage of each response, but excluding the missing values:
the frequency of each response is divided by the total number of valid responses
only. Both columns should therefore add up to 100%. The final column is the
cumulative percentage of the valid percentages in the fifth column.
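The three percentage columns can be worked out by hand from the frequencies alone. The following is a minimal sketch in Python, assuming the counts shown in the table and no missing values; the variable names are illustrative only.

```python
# Counts per score from the frequency distribution table; there are no
# missing values in this data set (the last row, "99% chance", is score 10).
frequency = {1: 2, 2: 4, 3: 2, 4: 2, 5: 2, 6: 3, 7: 2, 8: 2, 9: 3, 10: 3}
missing = 0

valid_total = sum(frequency.values())
grand_total = valid_total + missing

rows = []
cumulative = 0.0
for score, count in sorted(frequency.items()):
    percent = 100 * count / grand_total        # includes missing values
    valid_percent = 100 * count / valid_total  # excludes missing values
    cumulative += valid_percent
    rows.append((score, count, round(percent, 1),
                 round(valid_percent, 1), round(cumulative, 1)))

for row in rows:
    print("{:>5} {:>9} {:>7} {:>13} {:>18}".format(*row))
```

Because there are no missing values here, Percent and Valid Percent coincide; they would differ as soon as `missing` is greater than zero.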

The pie chart is as below:

You may also want a bar chart, ordered by descending frequencies, to help you find

the mode and also to visually compare the relative frequencies. To obtain an ordered

bar chart, recall the Frequencies dialog box. Click Charts. Select Bar charts. Click

Continue.

Then click Format in the Frequencies dialog box. Select Descending counts. The

frequency table can be arranged according to the actual values in the data or according



to the count (frequency of occurrence) of those values, and in either ascending or

descending order.

Click Continue. Click OK in the Frequencies dialog box to get the following bar

chart:

With your output, it is possible to do two things: print it or copy it to another
application such as Microsoft Word. To print, go to the File pull-down menu and select
Print, then press the OK button. If you only want to print a selection of the output,
highlight your selection and, in the Print dialogue box, select Print
Selection before clicking on the OK button.

Alternatively, highlight a part or the whole of your output and select Copy from the
Edit pull-down menu. This puts the selection on the Clipboard so that when you enter
Microsoft Word, you can select Paste from the Edit pull-down menu to transfer the
selection to your Microsoft Word file. A small tip: change the font of the pasted table
to Courier New in Word for proper alignment.

Going back to Output1 in SPSS, we are done with our output for now, so
minimize the window for Output1, after which you should see an icon for Output1 at
the bottom of your screen.

Descriptives

Descriptive statistics can also be obtained using the descriptives procedure in

SPSS. To run a Descriptives analysis, from the menus choose:

Analyze

Descriptive Statistics

Descriptives...

Select the following variables: bring shopping bags, willing to support, protect
environment, use scrap paper and government support. Tick the Save standardized
values as variables box to get measurements that are free from units of measurement:
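A standardized value (z-score) expresses each answer as the number of standard deviations it lies from the mean of its variable. The sketch below illustrates the idea in Python; the ratings are hypothetical, made up for illustration, and the use of the sample (n - 1) standard deviation is an assumption about how the saved values are scaled.

```python
import statistics

def standardize(values):
    """Return z-scores: (value - mean) / sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample (n - 1) formula, an assumption here
    return [(value - mean) / sd for value in values]

# Hypothetical ratings on a 1-5 scale, made up purely for illustration.
ratings = [1, 5, 3, 4, 2]
z_scores = standardize(ratings)
print([round(z, 3) for z in z_scores])
```

Whatever the original scale, the standardized values have a mean of 0 and a standard deviation of 1, which is why they can be compared across variables.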

Click Options, and make the following selections:



Click OK to get the following output:

SPSS Split File Procedure

Suppose we want to see if there are differences between male and female respondents
in terms of their attitudes towards protecting the environment. From the
menus, select Data, and then Split File. Click Compare groups, and move gender
into the text box as shown below:



Click OK and then repeat the Descriptives procedure as described in the previous
section to get the following output:


gender   Variable               N   Minimum   Maximum     Sum     Mean   Std. Deviation
female   bring shopping bags   11      1.00      5.00   37.00   3.3636          1.56670
         willing to support    11      4.00      5.00   49.00   4.4545           .52223
         protect environment   11      4.00      5.00   50.00   4.5455           .52223
         use scrap paper       11      1.00      5.00   33.00   3.0000          1.61245
         government support    11      4.00      5.00   51.00   4.6364           .50452
         Valid N (listwise)    11
male     bring shopping bags   14      1.00      5.00   31.00   2.2143          1.57766
         willing to support    14      4.00      5.00   64.00   4.5714           .51355
         protect environment   14      4.00      5.00   62.00   4.4286           .51355
         use scrap paper       14      1.00      5.00   32.00   2.2857          1.54066
         government support    14      4.00      5.00   61.00   4.3571           .49725
         Valid N (listwise)    14
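The Mean column in this output is simply Sum divided by N within each gender group. As a quick arithmetic check, the sketch below recomputes every mean from the N and Sum values in the output; the dictionary layout and names are illustrative only.

```python
# (N, Sum, Mean) for each variable, copied from the split-file output.
output = {
    "female": {
        "bring shopping bags": (11, 37.00, 3.3636),
        "willing to support": (11, 49.00, 4.4545),
        "protect environment": (11, 50.00, 4.5455),
        "use scrap paper": (11, 33.00, 3.0000),
        "government support": (11, 51.00, 4.6364),
    },
    "male": {
        "bring shopping bags": (14, 31.00, 2.2143),
        "willing to support": (14, 64.00, 4.5714),
        "protect environment": (14, 62.00, 4.4286),
        "use scrap paper": (14, 32.00, 2.2857),
        "government support": (14, 61.00, 4.3571),
    },
}

checks = {}
for gender, variables in output.items():
    for name, (n, total, reported_mean) in variables.items():
        recomputed = round(total / n, 4)  # mean = sum / number of cases
        checks[(gender, name)] = (recomputed, reported_mean)
        print(f"{gender:7} {name:20} {recomputed:.4f} (reported {reported_mean:.4f})")
```

Reading the two groups side by side, the largest gaps are in bring shopping bags and use scrap paper, where the female means are higher than the male means.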

SPSS Summary Procedure

Use Summarize to create a summary report. To begin the analysis, from the menus

choose:

Analyze

Reports

Case Summaries...



Select likelihood of buying as the variable to summarize. Select education as the

grouping variable. Because this is a grouped summary report, individual case

listings are not required, so deselect Display cases.

Click Statistics. Select Mean, Median, Minimum, and Maximum as the cell statistics.

Note that Number of Cases appears by default in that list.



Click Continue, and then click Options in the Summarize dialog box. Type

Likelihood of Buying as the title. Type Grouped by Education as the caption.

Click Continue. Click OK in the Summarize dialog box to get the output:

Summarize

Case Processing Summary

                                                      Cases
                                      Included       Excluded        Total
                                     N   Percent    N   Percent    N   Percent
likelihood of buying * education    25   100.0%     0     .0%     25   100.0%

Likelihood of Buying

education              N   Mean   Median   Minimum   Maximum
primary                6   5.67     6.00         2   99% chance
secondary              6   6.33     6.50         2   99% chance
some university        6   4.83     4.50         1   9
university or above    7   5.43     5.00         1   9
Total                 25   5.56     6.00         1   99% chance

Group by Education
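The Total row can be checked against the group rows: the overall mean is the mean of the group means weighted by the number of cases in each group. A minimal sketch in Python, using the rounded group means printed in the report (so the result is only accurate to about two decimals):

```python
# (N, Mean) per education group, copied from the summary report.
groups = {
    "primary": (6, 5.67),
    "secondary": (6, 6.33),
    "some university": (6, 4.83),
    "university or above": (7, 5.43),
}

total_n = sum(n for n, _ in groups.values())
# Weight each group mean by its number of cases.
overall_mean = sum(n * mean for n, mean in groups.values()) / total_n

print(total_n, round(overall_mean, 2))
```

A plain average of the four group means would ignore the fact that the last group has seven cases rather than six.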

Exploratory Data Analysis

Exploring data can help to determine whether the statistical techniques that you are
using for data analysis are appropriate. The Explore procedure requires that the
dependent variable be a scale variable, while the grouping variables may be ordinal or
nominal. To use the analysis, from the menus choose:



Analyze

Descriptive Statistics

Explore...

Select likelihood of buying as the dependent variable. Select gender as the factor
variable, and label cases by identification number.

Click Statistics and select the following:

Click Continue. Click Plots in the Explore dialog box. Request tests of normality

for the data. These tests will be calculated individually for each gender.



Click Continue. Click OK in the Explore dialog box to get the output:



Stem-and-Leaf Plots

likelihood of buying Stem-and-Leaf Plot for
GENDER= female

 Frequency    Stem &  Leaf

     5.00        0 .  22344
     4.00        0 .  5669
     2.00        1 .  00

 Stem width:  10
 Each leaf:   1 case(s)

likelihood of buying Stem-and-Leaf Plot for
GENDER= male

 Frequency    Stem &  Leaf

     5.00        0 .  11223
     8.00        0 .  56778899
     1.00        1 .  0

 Stem width:  10
 Each leaf:   1 case(s)

The Explore procedure also outputs boxplots. Boxplots allow you to compare each
group using a five-number summary: the median, the 25th and 75th percentiles, and
the minimum and maximum observed values that are not statistically outlying.
Outliers and extreme values are also highlighted in the drawing.

The heavy black line inside each box marks the 50th percentile, or median, of that
distribution. The lower and upper hinges, or box boundaries, mark the 25th and 75th
percentiles of each distribution, respectively. Whiskers appear above and below the
hinges. Whiskers are vertical lines ending in horizontal lines at the largest and
smallest observed values that are not statistical outliers. Outliers are identified with an
O. Extreme values are marked with an asterisk (*).
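To make the five-number summary concrete, the sketch below computes one for each gender from the responses read off the stem-and-leaf plots. It is a rough illustration only. The quartiles come from Python's `statistics.quantiles` and may differ slightly from the hinges SPSS draws, and the 1.5 box-length rule used to flag outliers is an assumption based on the common boxplot convention.

```python
import statistics

# Responses read off the stem-and-leaf plots (stem width 10, one case per leaf).
scores = {
    "female": [2, 2, 3, 4, 4, 5, 6, 6, 9, 10, 10],
    "male": [1, 1, 2, 2, 3, 5, 6, 7, 7, 8, 8, 9, 9, 10],
}

def five_number_summary(values):
    """Return (whisker low, Q1, median, Q3, whisker high) and any outliers."""
    values = sorted(values)
    # Assumption: quartiles from statistics.quantiles; SPSS hinges may differ.
    q1, median, q3 = statistics.quantiles(values, n=4, method="exclusive")
    box_length = q3 - q1
    low_fence = q1 - 1.5 * box_length   # assumed outlier rule
    high_fence = q3 + 1.5 * box_length
    inside = [v for v in values if low_fence <= v <= high_fence]
    outliers = [v for v in values if v < low_fence or v > high_fence]
    return (min(inside), q1, median, q3, max(inside)), outliers

summaries = {gender: five_number_summary(values) for gender, values in scores.items()}
for gender, (summary, outliers) in summaries.items():
    print(gender, summary, outliers)
```

With these data no response falls outside the fences, so under these assumptions the whiskers simply run to the smallest and largest observed values in each group.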



Normal Q-Q Plots

Compute

Let's compute the age of each respondent. In the questionnaire, we only ask for
the year in which the respondent was born, so the result is accurate only to
within a year.

To compute the age, select Transform, and then Compute to display the following
dialog box. Enter "Age" in the Target Variable text box, then "2008 - year" in the
Numeric Expression text box, and click on OK to calculate the age of each
respondent.



You will notice that a new column with the column heading "Age" appears in the data

file.
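The same calculation is easy to express outside SPSS. The sketch below mirrors the expression 2008 - year in Python; the function name and the birth years are hypothetical, made up for illustration.

```python
REFERENCE_YEAR = 2008  # the year used in the Numeric Expression above

def compute_age(year_of_birth, reference_year=REFERENCE_YEAR):
    """Age in whole years, mirroring the expression 2008 - year."""
    return reference_year - year_of_birth

# Hypothetical birth years, made up for illustration.
birth_years = [1975, 1983, 1990]
ages = [compute_age(year) for year in birth_years]
print(ages)
```

Keeping the reference year as a named parameter makes it obvious what has to change if the survey is repeated in a later year.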



Factor Analysis

The objective of factor analysis is to identify the underlying dimensions, or to
combine correlated variables into a smaller number of variables. To do factor
analysis, select Analyze, then Data Reduction, then Factor to open the following
dialog box. Move the five variables related to environmental protection to the text
box under Variables:



Click on the Rotation button to display the following dialog box, then select Varimax:

Click on Continue to go back to the previous dialog box. Then click the Options

button to display the following dialog box. Change the value in the text box next to

"Suppress absolute values less than:" from 0.1 to 0.4.



Click on Continue, then OK to obtain the output. Look at the following table in the

output:

Rotated Component Matrix(a)

                           Component
                           1        2
likelihood of buying                .782
bring shopping bags       .938
willing to support                  .854
protect environment       .472     .549
use scrap paper           .927

Extraction Method: Principal Component Analysis.
Rotation Method: Varimax with Kaiser Normalization.
a Rotation converged in 3 iterations.

From the above, we know that we can combine "bring shopping bags" and "use scrap
paper" into component or factor 1, and we can name this factor actual
environmental protection behaviour. Meanwhile, the remaining three variables can
be combined into component or factor 2, which we can label as environmental
protection with mere lip service.
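The grouping described here amounts to assigning each variable to the component on which it has the largest absolute loading. The sketch below shows that rule in Python. The loadings are taken from the Rotated Component Matrix, with their column placement inferred from the interpretation in the text, and the function name is illustrative only.

```python
# Rotated loadings from the Rotated Component Matrix; None marks a loading
# that SPSS left blank because its absolute value is below the 0.4 cut-off.
# The column placement follows the interpretation given in the text.
loadings = {
    "likelihood of buying": (None, 0.782),
    "bring shopping bags": (0.938, None),
    "willing to support": (None, 0.854),
    "protect environment": (0.472, 0.549),
    "use scrap paper": (0.927, None),
}

def dominant_component(row):
    """Return the 1-based component with the largest absolute loading."""
    shown = [(abs(value), index + 1)
             for index, value in enumerate(row) if value is not None]
    return max(shown)[1]

assignment = {name: dominant_component(row) for name, row in loadings.items()}

# Collect the variables that belong to each component.
components = {}
for name, component in assignment.items():
    components.setdefault(component, []).append(name)
print(components)
```

Note that "protect environment" loads on both components (.472 and .549), so its assignment to factor 2 is less clear-cut than the others.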

Reliability Analysis

Very often we use more than one item to measure an important construct/concept in

our study. To check if the answers to the items measuring the same construct are

consistent, select Analyze, then Scale and then Reliability Analysis to pop up the

following dialog box. Move environ2, environ3 and environ5 into the Items list as

shown below:



Click on Statistics to display the following dialog box. Select Item and Scale
if item deleted.

Click OK to get the results. If the value of Alpha (Cronbach's alpha) is greater than

0.7, it means that answers given by the respondents are consistent. You may then use

the summation of the items for subsequent analysis.
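For readers who want to see what the Alpha value measures, the sketch below computes Cronbach's alpha with the standard formula: alpha = k / (k - 1) * (1 - sum of item variances / variance of the total score), where k is the number of items. The answers are hypothetical, made up for illustration; only the item names environ2, environ3 and environ5 come from the text.

```python
import statistics

def cronbach_alpha(items):
    """Cronbach's alpha for a list of items, each a list of respondent scores."""
    k = len(items)
    item_variances = sum(statistics.variance(item) for item in items)
    # Total score of each respondent across all items.
    totals = [sum(scores) for scores in zip(*items)]
    total_variance = statistics.variance(totals)
    return k / (k - 1) * (1 - item_variances / total_variance)

# Hypothetical 1-5 answers from six respondents, made up for illustration;
# the names mirror the items environ2, environ3 and environ5.
environ2 = [4, 5, 4, 5, 4, 5]
environ3 = [4, 5, 5, 5, 4, 4]
environ5 = [5, 5, 4, 5, 4, 4]

alpha = cronbach_alpha([environ2, environ3, environ5])
print(round(alpha, 3), "consistent" if alpha > 0.7 else "not consistent")
```

When alpha clears the 0.7 threshold, the `totals` list inside the function corresponds to the summation of the items mentioned above.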

Date post:	06-Apr-2018
Upload:	wan-yip