SPSS TUTORIAL FOR NURSING

SPSS PASW Statistics 18 Lesson 1

Opening and Importing Files

The goal of this lesson is to get you to the point where you can open SPSS and have some data to work with. First, we’ll start with the basics of opening SPSS, and then I’ll show you three different ways of loading data. The first way is by typing in data manually, which is very laborious but also very effective when no other methods are available. We’ll also look at how to import data from a Microsoft Excel spreadsheet as well as importing data from a regular old text file. In order to practice the last two ways of entering data, please feel free to download the demo data sets here:

[ insert two links ]

Alright, let’s get down to business. First, we’ll look at how to open SPSS. Please read this tutorial while sitting in front of a computer that runs PASW 18 and follow along with the demonstration.

Open SPSS PASW 18 in Windows XP

First, point your mouse to the Start menu, then “All Programs” and then “SPSS Inc.”, as shown here:

After a few moments of loading, the following screen should appear:

This is the first screen in SPSS. From here, you can do a lot of things, most of which are beyond the scope of this lesson. In order to get into the meat of the program, we have to get past this screen by telling the system what data we want to work on. For the sake of this tutorial, the first thing we’ll do is type in our own data. In actual practice, we will rarely type in our own data, but it is much simpler to explain than going right into opening other file types. So for now, please select “Type in data” on the right-hand side of the dialog box, where the red oval is highlighted in the above picture.

This should bring up a screen with a large grid on it that looks like the picture above. This first screen is called the “Data View” of SPSS. There are two main views in this program, and you can toggle back and forth between them using the tabs on the bottom left, where the red oval is in the picture.

You can start typing information into the Data View and it will record your input; however, it can get a little confusing if you don’t label variables properly first. So, for now, let’s switch into “Variable View” by using the tabs on the bottom left.

Variable View

Variable View is used to define the parameters of the variables that are included in a data set. Switching from Data View to Variable View brings up different column headers on the grid. They are:

Name

Type

Width

Decimals

Label

Values

Missing

Columns

Align

Measure

Role

Our first goal is to create a single variable. In order to do so, we will concentrate on the first three columns in the Variable View. So, for now, enter the title “ID” into the Name column. Next, click in the “Type” box in the first row. This will bring up the default variable type, which is Numeric. Click on the “…” button in this box, and it will bring up the Variable Type dialog, pictured here:

You will notice that there are 8 different possible variable types in SPSS. These variable types tell the program what format your data is in and how to interpret it. The program has to know what type of data to expect so that it knows how much memory to allocate to each data point. For instance, if you enter “5/6/1942”, should the computer recognize and store that information as a date, or do you want to store the actual numbers and symbols as a “word”? That’s the purpose that this column in the Variable View serves.

Data formats are one of the more common sources of error people encounter when trying to import data. A lot of times you will have a file that’s pretty long, let’s say 2000 lines, and there will be missing values or incorrectly formatted values. For instance, if one column is date information, each observation may be formatted slightly differently, like mm/dd/yyyy, then mm-dd-yyyy and so forth. These types of errors cause major problems for computers, especially when you want to base calculations off the original data. So for instance, if you wanted to calculate age based off of date of birth and current date, you need to make sure that each of the dates was stored correctly, so the computer correctly interprets days as days, months as months and years as years.
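To make the date problem concrete, here is a minimal sketch in Python (Python is not part of the SPSS workflow, and the helper names `parse_date` and `age_on` and the list of accepted formats are assumptions made up for this illustration). It tries each expected format in turn and fails loudly on anything else, so a badly formatted date cannot quietly end up in an age calculation.

```python
from datetime import date, datetime

# Formats we are willing to accept (an assumption for this sketch).
ACCEPTED_FORMATS = ("%m/%d/%Y", "%m-%d-%Y")


def parse_date(text):
    """Return a date for text in one of the accepted formats, else raise ValueError."""
    for fmt in ACCEPTED_FORMATS:
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date format: {text!r}")


def age_on(birth, today):
    """Whole years between birth and today."""
    years = today.year - birth.year
    # Subtract one year if the birthday has not happened yet this year.
    if (today.month, today.day) < (birth.month, birth.day):
        years -= 1
    return years


dob = parse_date("5/6/1942")
same_dob = parse_date("05-06-1942")
print(dob == same_dob)                # both spellings describe the same day
print(age_on(dob, date(2010, 5, 5)))  # the day before a birthday
```

A value such as "May 6th, 1942" raises a `ValueError` here instead of being stored as text, which is usually what you want when ages will be calculated from the column later.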

Try toggling through the different possible data types now and you will see sub-menus that allow you to further drill down to what specific data type you want to define. Here is the Date variable type dialog:

And here is the Dollar dialog:

Followed by the String dialog:

A string is just another word for text. If you plan to string together a number of letters into words, or you would like to store strings of characters exactly as you type them in (5/2/1942), use this data type. Please realize that if you use the String data type, you will not be able to base calculations off these data points later on. If you’re having trouble importing data, this is the “brute force” approach to getting past those problems.
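The same limitation shows up in any programming language, not just SPSS. As a small illustration (a Python sketch with made-up salary values, not SPSS behavior), text that looks like numbers still behaves like text until it is converted:

```python
# Salaries stored as text versus as numbers (hypothetical values).
salaries_as_text = ["74000", "42000", "56000"]
salaries_as_numbers = [74000, 42000, 56000]

# "Adding" text just glues the characters together.
glued = salaries_as_text[0] + salaries_as_text[1]
print(glued)

# Real arithmetic needs a numeric type.
mean_salary = sum(salaries_as_numbers) / len(salaries_as_numbers)
print(mean_salary)

# Converting works only when every value is a clean number.
converted = [float(s) for s in salaries_as_text]
print(sum(converted) / len(converted))
```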

The next column in the variable definition is the Width column. This defines the maximum width allowable for each entry in this column. This is also put in place to allow for correct memory allocation. Short words and numbers that aren’t too big and have only a few decimals can be stored using less space than long words and numbers with many digits. This column puts a cap on how much space is assigned to each data point.

Finally, look at the Label column in the Variable View. This column is used to identify what each variable represents to the researcher. You have 40 characters to write about your data here so that when you re-open a data file four months after using it, you’ll still remember what’s stored where. This column comes in very handy when you have large sets of data.

Your First Variable Definition

So, let’s go ahead and define a variable called “ID”. In this variable, we will place the ID numbers of 3 nurses at an area hospital. Use the following information to define this variable:

Name: ID

Type: String

Width: 7

Decimals: 0

Label: “This is the subject ID”

Once you’re done entering this information, the variable view should look something like this:

Now, switch to Data View and enter the following ID’s into the ID variable of the sheet:

Congratulations! You have just created your first variable in SPSS.

Do you understand what’s going on here? You now have an ID variable that holds information regarding person IDs. You no longer have just a column of data from a spreadsheet or a word processor; you now have a variable definition as well as a number of observations of that variable.

Now, let’s define a few more variables. Switch back to Variable View and enter the following variable:

Name: Name

Type: String

Width: 15

Decimals: 0

Label: Employee Name

Now, let’s define a date variable. Name this variable “DOB”. In the Type box, select “Date”. This should bring up the following dialog:

In the scroll box to the right, select “mm/dd/yyyy” to define the date format the program should expect in this column. All the other information will be filled in automatically, but you can still add a label.

Now, let’s define a numeric variable. Name it “Salary” and in the Type box, select “Numeric”. This should bring up the following dialog:

Leave it as 8 characters wide and 2 decimal places. Give the variable a label of your choosing.

Your Variable View should now look something like this:

Switch back to the Data View and enter the following information:

ID        Name              DOB         Salary
NUR501a   Sarah Louis       4/23/1965   74000
NUR502a   Berthenia Biehl   5/12/1975   42000
NUR503a   Rufus Murdock     3/12/1981   56000

Your Data View should look like this:

Good job! You’ve just completed your first data file. Now let’s save your work before it’s too late!

Saving a File

To save your data, point the mouse to “File → Save As”, as pictured here:

Next, give the file a filename and select the place you would like to save it. Then click “Save” on the dialog pictured here:

The program will produce a file that looks something like this:

You can ignore and close this file.

Alright, now that you know how to enter and save your own information in SPSS, let’s look at how to import other people’s data into the program. There are a TON of ways data can be stored, but the two most popular, and the ones you’ll be using the most, are Microsoft Excel and tab-delimited text format. The following two sections will show you how to import both.

Importing Data from a Microsoft Excel Document

Close whatever you’re working on and reopen the program to import a file. This will bring up that first screen again, as pictured here:

On this screen, make sure the “Open an existing data source” radio button is selected, then highlight “More Files” and click OK in the bottom right of the screen. That will bring up the dialog pictured here:

First, navigate to the folder where your file is stored in the “Look in” drop-down. If you don’t see your file, make sure to select “Excel” in the “Files of type” drop-down at the bottom.

That should bring up the dialog pictured here:

Select your file and click “Open”. That will bring up this dialog:

Make sure the settings on this dialog are correct before proceeding. Specifically, if your Excel spreadsheet has multiple worksheets, make sure the one you want to import is selected correctly. Also, if you would like to import the first row of data as variable names, make sure the checkbox is selected to do so. After clicking OK, you will see the results in Data View. The following is what the sample Excel data looks like after importation:

Importing Data from a Text File

Again, close everything you’ve been working on and restart the program, which will bring you to the following welcome screen:

On this screen, select “Open an existing data source”, highlight “More Files” and click “OK”. This will bring up the following dialog:

Navigate to the correct folder at the top of the screen and make sure to tell it to look for a “Text” file at the bottom (where the red oval is highlighted in the picture). Then click “Open” and it will bring up the following dialog:

This is the first dialog in a 6-step process. SPSS needs to go through this process because files stored as text data are much less structured than files stored in Excel format. In the first step, just indicate that your data does not match any predefined format (predefined formats are way beyond the scope of this lesson). Then click “Next”, which will bring up the following screen:

In this screen, indicate that the text is “Delimited” and that variable names are included at the top of our file. The file we’re importing has been saved in such a way that columns are distinguished using tab characters hidden inside our file. This is an agreed-upon format that’s often used when migrating data between two programs that haven’t been designed specifically to talk to each other. We’re indicating that variable names are at the top of the file because SPSS can then fill in the Name column of our Variable View from that first line instead of us typing the names in manually (such a drag). Click Next after you’ve completed these steps and it will bring up the following dialog:

Since we included variable names at the top of our data, on this screen the default line to start with will indicate a “2”, because the actual data starts on line 2 of our file. Since our file has a carriage-return character hidden at the end of each line, we can indicate that each line represents a case. Sometimes, when data is saved in a less structured format, multiple cases can be stored in one line, but we don’t want to confuse things by doing that here. Finally, since our file is extremely short, we will import all cases. We might want to import fewer if we have over 50,000 cases and we just want to experiment speedily with less data. After you’ve indicated these responses, press “Next” and step 4 will pop up (pictured next).

Step 4 is the most difficult in this whole shebang. I hope you’ve noticed that the only intervention needed thus far is hitting “Next”. Well, in this step, it’s important to un-check “Space”. As it stands, the default is to move to the next variable every time the file has EITHER a tab OR a space. Since this file holds names with spaces in them, that default breaks the data conversion. If you look at the textbox in the picture above, there are more variables than we want and the Name variable holds no information. If you uncheck the “Space” checkbox it will fix this problem. Check out the results here:

Now, the data looks as we hoped. Press Next to get to step 5.

This step allows you to work more closely with the variable definitions. Since we know how to do this directly in the Variable View of SPSS, we don’t need to do so here. Just notice that you can select different columns in the textbox and the information at the top of the dialog changes depending on which column is selected. I selected column 2 in the next picture:

After experimenting a little in that view, click Next to get to the final step.

In this step, we can save this format for future use or do some other mumbo jumbo, but that stuff starts to get beyond this lesson, so just click Finish and voilà, your data has been imported!
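If you are curious why the “Space” checkbox in step 4 matters so much, the effect is easy to reproduce outside of SPSS. The following is a minimal Python sketch (the file contents are a hypothetical stand-in that mirrors the ID / Name / DOB / Salary example from this lesson, not the demo download itself). It splits the same line once on tabs only and once on tabs and spaces.

```python
import csv
import io

# A tiny stand-in for a tab-delimited file (hypothetical contents).
raw_text = (
    "ID\tName\tDOB\tSalary\n"
    "NUR501a\tSarah Louis\t4/23/1965\t74000\n"
    "NUR502a\tBerthenia Biehl\t5/12/1975\t42000\n"
)

# Splitting on tabs only keeps "Sarah Louis" together in one field.
reader = csv.reader(io.StringIO(raw_text), delimiter="\t")
header = next(reader)  # the first line holds the variable names
rows = list(reader)    # the data start on line 2

# Splitting on tabs AND spaces breaks the name into two fields.
naive_fields = raw_text.splitlines()[1].split()

print(header)
print(rows[0])
print(len(rows[0]), "fields with tab only;", len(naive_fields), "with tab or space")
```

With tab as the only delimiter, each row has as many fields as there are variable names. Once spaces count too, the first and last name land in separate columns and everything after them shifts over by one.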

Good job! You’ve completed this lesson.

Laboratory 1:

The goal of this laboratory is to get you familiar with the concepts of:

- Population

- Sample

- Parameter

- Statistic

As well as help you to distinguish between levels of measurement:

- Nominal

- Ordinal

- Interval

- Ratio

And finally to describe the difference between classes of variables:

- Qualitative/Quantitative

- Continuous/Categorical

Download and import the file “lesson1.txt” that can be found on the website along with this lesson. If you are having trouble importing the file, go back to the previous lesson to learn how to import a text file. This is a fictional dataset, but let’s say it represents all of the statewide cardiac ICU hospital admissions over a 10-year period. The variables included represent:

1). Gender of subject

2). New York Heart Association Heart Failure Class of each subject

3). Date of Admission

4). Weight of subject at time of admission

How does this survey differ from one that would usually be conducted by a more typical research team? Answer the following questions about this dataset to help find out:

1). How many observations do you see in this file?

2). Circle all that apply:

a. The GENDER variable is:

Nominal Ordinal Interval Ratio

Qualitative Quantitative

Continuous Categorical

b. the NYHA Heart Failure Class variable is:


c. the Date of Admission variable is:


d. the Weight of Subject at time of admission is:


Qualitative Quantitative


This data represents an entire statewide cross-section of information. It is a complete dataset from the entire state population and would be very costly to assemble. Researchers don’t have access to this amount of data except in rare circumstances. The US census attempts to assemble information from the entire population every ten years and needs an entire government department to do it. Let’s say we’re interested in knowing the average weight of subjects that have been admitted to cardiac ICUs over this ten-year period in the entire state. Since we have this dataset, it’s not very difficult to answer that question. Click on the “Analyze” menu, followed by “Descriptive Statistics”, and then “Descriptives”, as shown here:

This will result in a table like the following popping up:

Descriptive Statistics

                     N       Minimum        Maximum        Mean             Std. Deviation
wt                   10000   172.33563157   247.43160594   210.0096817362   10.05286172188
Valid N (listwise)   10000

From this table, you can clearly see that the population mean is 210 lbs.

Now, what would happen if we were a poor, struggling nursing professor interested in writing a grant that dealt with this measurement and didn’t have the resources available to find data from all admissions in the entire state? The answer is we’d have to take a sample of data, either by phone, internet or some other manner, and estimate the population mean with our sample mean.

Let’s try doing that in SPSS. We’re going to select a random sample of the data from this dataset right inside of the software. To do so, go to the “Data” menu, followed by “Select Cases”, as shown here:

This will bring up the following window, in which we will select the “wt” variable and “Random sample of cases”.

To define our sample, click “Sample…”, which will bring up the next window:

In this window, type “1000” into the box after “Exactly” and “10000” into the next box. Then hit “Continue” and “OK”. The computer randomly selects 1000 cases for us, which makes our screen look like this:

There are now a bunch of cases that have been crossed out on the left and a new “filter” variable, as shown above. Now, go back and request a “Descriptives” table as we did before. This will show a table that looks something like this:

Descriptive Statistics

                     N      Minimum        Maximum        Mean             Std. Deviation
wt                   1000   180.29881653   243.81648201   209.9492964045   10.22023479108
Valid N (listwise)   1000

Notice, I said it will look something like this. This is because your random sample will be different from mine, so your numbers will be slightly different from mine. Look at the new descriptive table. Can you see how many observations it based its calculations on? There are 1000. The mean of this sample is 209.949. Now, you should be able to answer the following questions:

Review Questions:

1). How large is the population in this study?

2). How large of a sample did we take?

3). What was the population mean?

4). What was the sample mean?

5). Is the population mean a parameter or a statistic?

6). Is the sample mean a parameter or a statistic?
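If you don’t have SPSS in front of you, the idea behind this laboratory can be tried in a few lines of Python. This is only a sketch under stated assumptions: the 10,000 weights are randomly generated here (they are NOT the lesson1.txt data), and the mean of 210 and spread of 10 are picked simply to resemble the tables above.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical set of 10,000 weights (NOT the lesson1.txt data).
all_weights = [random.gauss(210, 10) for _ in range(10_000)]

# Draw exactly 1,000 cases at random, without replacement.
drawn_weights = random.sample(all_weights, 1_000)

mean_of_all = statistics.mean(all_weights)
mean_of_drawn = statistics.mean(drawn_weights)

print(f"mean of all 10,000 weights:  {mean_of_all:.3f}")
print(f"mean of the 1,000 drawn:     {mean_of_drawn:.3f}")
```

Change the seed and run it again: the first mean changes only because the made-up data change, while the second one also wobbles from draw to draw, just as your SPSS numbers differ slightly from mine.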

Lesson 2

The goals of this lesson are:

- Demonstrate how to make a frequency table
- Demonstrate how to adjust the categories of a frequency table
- Demonstrate how to calculate a new variable in SPSS
- Demonstrate how to calculate quartiles of a dataset in SPSS
- Demonstrate how to produce a histogram in SPSS

Part I: Frequencies

A frequency table is generally considered part of a large category of statistics called Descriptive Statistics. SPSS has a whole menu of these inside the Analyze menu that you can use to describe a dataset. In this lesson, we will use the dataset from Lesson 1 to build a frequency table for the NYHA class variable. A frequency table will allow us to see how many respondents were entered into each category of this ordinal variable. You should be able to visualize the general form of the frequency table before you request it. In fact, it’s a very good idea to think about what it should look like ahead of time so that you don’t just rely on the software blindly. The software will output whatever you request, but sometimes the most difficult part of the whole analysis is figuring out exactly what to request.

Question 1: How many categories do you expect to appear in the frequency table and why?

So load up the dataset from Lesson 1, point your mouse to Analyze → Descriptive Statistics and click on Frequencies, as pictured here:

This should bring up the dialog box on the left in the picture below:

In the Frequencies box, you have to tell the software which variable (or variables) you would like frequency tables for. In this case, click on NYHA and then click the arrow in the middle of the dialog box, which should move the NYHA variable over to the right side of the dialog, as pictured on the right side of the above picture. Then click ‘Ok’ twice and the table should be output, as pictured here:

NYHA

                Frequency   Percent   Valid Percent   Cumulative Percent
Valid   1       2471        24.7      24.7            24.7
        2       2475        24.8      24.8            49.5
        3       2559        25.6      25.6            75.1
        4       2495        25.0      25.0            100.0
        Total   10000       100.0     100.0

Let’s interpret this table a little bit. Does it correspond well to the picture of the frequency table that you visualized earlier? There are 4 categories, as you would probably expect, since the description of the NYHA class variable included four total categories. You can see that there are about 25% of respondents in each category (24.7%, 24.8%, 25.6%, 25.0%), and all the data appears to be valid (as evidenced by the 100 in the bottom row of the Valid Percent column). Some datasets you work on might have missing data, and this column is helpful in telling you that. Some datasets might have incorrect responses too; for instance, someone might be listed as having a category ‘5’ NYHA class, which is impossible and should be corrected. If that is the case, it will be very apparent in this table.
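To see where the Percent and Cumulative Percent columns come from, here is a minimal Python sketch that rebuilds the NYHA column from the four frequencies in the table above and recomputes the other columns. The variable names are made up for this illustration; SPSS does this work for you behind the Frequencies dialog.

```python
from collections import Counter

# Rebuild the NYHA column from the frequencies in the table above.
nyha = [1] * 2471 + [2] * 2475 + [3] * 2559 + [4] * 2495

counts = Counter(nyha)
total = len(nyha)

table = []
cumulative = 0.0
for category in sorted(counts):
    percent = 100 * counts[category] / total
    cumulative += percent  # running total of the Percent column
    table.append((category, counts[category], percent, cumulative))

for category, frequency, percent, cum in table:
    print(f"{category}  {frequency:5d}  {percent:5.1f}  {cum:6.1f}")
```

An impossible category such as ‘5’ would simply show up as an extra row in this loop, which is exactly why a frequency table is a handy first check on a new dataset.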

Now, let’s look at how to group frequencies into different categories. Let’s say we’re interested in comparing two groups of subjects, one with ‘low’ NYHA class, and one with ‘high’. In this case, it would be helpful to recode NYHA classes of 1 or 2 into a single category, and those with a 3 or 4 into a second category. To do so, point your mouse to Transform and click on Recode into Different Variables, as pictured here:

This will bring up the dialog pictured here:

Since we want to recode the NYHA variable, click on it in the left text area and click the arrow to move it over to the right window. Give the new variable a name by typing “NYHA_grouped” into the textbox under Name: in the area to the right. Then highlight NYHA and click the Old and New Values button, which should bring up the following dialog:

This is where we will define the new mapping. In the Old Value area on the left, type the number 1 into the textbox under Value. In the New Value area on the right, type the number 1 into the textbox after Value. This should enable the “Add” button. Click the “Add” button to define the first new mapping. Now, do the same thing, but this time the old value should be 2 and the new value should be 1. Then, old value 3 should be mapped to new value 2 and old value 4 should be mapped to new value 2. When you’re done, click Continue and display a frequency table for the new variable. It should look something like this:

NYHA_grouped

                Frequency   Percent   Valid Percent   Cumulative Percent
Valid   1.00    4946        49.5      49.5            49.5
        2.00    5054        50.5      50.5            100.0
        Total   10000       100.0     100.0
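The old-value-to-new-value rules you just entered are nothing more than a lookup table. As a hedged illustration (Python, with nine made-up NYHA classes rather than the lesson data, and the name `RECODE_MAP` invented for this sketch), the same recode looks like this:

```python
# Old value -> new value, the same four rules entered in the dialog.
RECODE_MAP = {1: 1, 2: 1, 3: 2, 4: 2}

# Hypothetical NYHA classes for nine subjects (not the lesson data).
nyha = [1, 2, 3, 4, 4, 2, 1, 3, 3]

# Build the new variable without touching the original one.
nyha_grouped = [RECODE_MAP[value] for value in nyha]

print(nyha_grouped)
print("low:", nyha_grouped.count(1), "high:", nyha_grouped.count(2))
```

Like “Recode into Different Variables”, this leaves the original column untouched. A value with no rule, such as an impossible class 5, raises a `KeyError` in this sketch instead of being recoded silently.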

Question 2: Where do the numbers 4946 and 5054 in the above frequency table come from?

Part II: Percentages

Now we’re going to look at how to calculate percentages. Let’s say we’re interested in knowing how each respondent’s weight compares to the person with the highest weight. To do so, we will first need to find the person with the highest weight, and then we will need to code a new variable that divides each person’s weight by the overall maximum.

First, to find the maximum weight, we will sort the whole dataset by weight. To do so, point your mouse to Data and then click on Sort Cases, as shown here:

This should bring up the following dialog:

Now, since we want to sort by the variable Wt, click on it in the left text area and then click the arrow to move it over to the right. Notice, we can sort in either ascending or descending order by toggling the Sort Order radio buttons. Click ‘Ok’ and take a look at the dataset. It should now be sorted by the weight variable.

Question 3: What is the weight of the person with the maximum weight in this dataset?

Now, we are going to compute a new variable that represents the percentage of each person’s weight compared against the maximum weight in the dataset. To do so, point your mouse to Transform and click on Compute Variable, as shown here:


First, give your new variable a name by typing ‘Percent_wt’ into the textbox in the top left. Then, move the wt variable over into the numeric expression text area by highlighting it in the left-most text area and clicking the arrow to move it to the right. Now, use your mouse to click the division sign on the calculator in the middle of the screen, and then type in the maximum weight you found in the last step. Finally, click ‘Ok’ and check out your new variable.
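The expression you just built is simply each weight divided by the largest weight. Here is a small Python sketch of the same calculation, using four made-up weights rather than the lesson data:

```python
# Hypothetical weights in pounds (not the lesson data).
weights = [172.3, 198.0, 210.5, 247.4]

max_weight = max(weights)  # what sorting by wt lets you read off by eye

# Same expression as in the Compute Variable dialog: wt / maximum.
percent_wt = [w / max_weight for w in weights]

for w, p in zip(weights, percent_wt):
    print(f"{w:6.1f}  {p:.3f}  ({100 * p:.1f}%)")
```

Notice that dividing by the maximum gives a proportion between 0 and 1. Multiply by 100 if you want to read it as a percent.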

Question 4: What percent of the weight of the heaviest respondent is the weight of the lightest respondent?

Part III: Quartiles

We will now compute the quartiles of the weight variable. This will show us the cutoff points for the first, second, third and fourth quarter of weights. To do so, click on Analyze → Descriptive Statistics → Frequencies, as we did when we made a frequency table for NYHA. This should bring up the following dialog:

Now, bring the wt variable over to the right-hand text area and click on the Statistics button. This should bring up the following dialog:

Simply click on the ‘Quartiles’ checkbox here and click the Continue button. This should output the following table:

Statistics

Wt

N             Valid     10000
              Missing   0
Percentiles   25        203.047543142
              50        210.089653430
              75        216.720281375
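For comparison, Python’s standard library can compute the same kind of cut points. This is a sketch with nine made-up weights, not the lesson data, and statistical programs differ slightly in how they interpolate between neighboring values, so don’t expect every package to agree to the last decimal.

```python
import statistics

# Hypothetical weights in pounds (not the lesson data).
weights = [195.0, 201.5, 204.0, 208.2, 210.1, 213.7, 216.9, 221.3, 230.0]

# n=4 asks for the three cut points that split the data into quarters.
q1, q2, q3 = statistics.quantiles(weights, n=4)

print(q1, q2, q3)
print(statistics.median(weights))  # compare with the middle cut point
```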

Question 5: What are the quartile cutoffs for the wt variable in this dataset?

Question 6: What percentiles do the quartiles correspond to?

Part IV: Histogram Plot

We will now produce our first graph in SPSS. Let’s say we’re interested in plotting out the weight variable across its whole range. We call this a histogram, and SPSS is particularly well suited to producing graphs of this kind. To construct a histogram of the wt variable, point your mouse to Graphs and click on Chart Builder, as pictured here:


In the bottom-left text area, click Histogram and in the area to the right of that, DOUBLE-click the first plot. This should give you the graph at the top right. Now, drag the wt variable from the top-left text area to the x-axis on the top right of the dialog. Then click OK. This should output the following graph:

This is a very basic histogram. Notice that there are many, many ways to customize different aspects of the plot. One particular customization that can be very controversial is the choice of bin size. Since weight is a continuous variable, you have to choose where to make the cutoffs for each bar. For instance, here is another histogram of the same data with much different bins:

Question 7: How many options are there for defining bin size in a histogram?

A). 1

B). 4

C). less than 10

D). Infinite
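To get a feel for how much the bin size changes the picture, here is a minimal Python sketch that counts the same data into bins of two different widths. The 1,000 weights are randomly generated for the illustration (not the lesson data), and `bin_counts` is a helper made up for this sketch, not anything SPSS provides.

```python
import math
import random
from collections import Counter

random.seed(0)
# Hypothetical weights (not the lesson data).
weights = [random.gauss(210, 10) for _ in range(1_000)]


def bin_counts(values, width):
    """Count values per bin; each bin covers [k*width, (k+1)*width)."""
    counts = Counter(math.floor(v / width) * width for v in values)
    return dict(sorted(counts.items()))


for width in (20, 5):
    counts = bin_counts(weights, width)
    print(f"bin width {width}: {len(counts)} bars")
    for left_edge, n in counts.items():
        print(f"  {left_edge:5.0f} to {left_edge + width:5.0f}: {n}")
```

Both runs describe exactly the same 1,000 values. Only the cutoffs moved, yet one version shows a handful of wide bars and the other many narrow ones.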

Lesson 3: Descriptive Statistics

The goal of this lesson is to demonstrate two different ways of calculating descriptive statistics in SPSS.

Alright, let’s look at how to apply some of the concepts from Chapter 3 in SPSS. First, import the dataset from Lesson 1 again. Next, click Analyze → Descriptive Statistics → Descriptives, as shown in the following dialog:

This should pop-up the following window:

Bring the wt variable over to the right and then click the ‘Options’ button. That should bring up the following dialog:

In this dialog, you can request any of the descriptive statistics that are discussed in the book.

Now, do you see any problems? Notice that there are only two variables available for this menu. What if you want to know the mode of the gender variable? This is done differently. There is a second place where SPSS allows you to output most of the same statistics. Go back to the ‘Analyze’ menu and this time, point to the “Frequencies” choice inside of Descriptive Statistics. This should give you a window that looks like this:

Now move gender over to the right window and click on the “Statistics” button. This should give you the following dialog:

In this dialog, you can request all of the same statistics as in the last one and more. Click ‘Mode’ here and then hit the Continue button.

Question 1: What is the mode of the Gender variable in this dataset?

Question 2: What is the median of the NYHA variable in this dataset?

Question 3: What are the mean and standard deviation of the Wt variable in this dataset?

Lesson 4

The goal of this lesson is to show you how to calculate a Chi-Square test in SPSS. Start by importing the dataset from Lesson 1 into SPSS. Then click on the Analyze menu, select Descriptive Statistics and finally click Crosstabs, as shown below. Remember, a chi-square test checks whether there is an association between two categorical variables. Crosstabs is short for cross-tabulation, which is the process we’re interested in.

This will bring up the following dialog:

Start by highlighting the ‘gender’ variable and hitting the arrow to put it in the ‘Rows’ text area. Next, move the NYHA variable over to the ‘Columns’ text area. Finally, click the Statistics button, which will bring this dialog up:

In here, check the Chi-Square checkbox and click Continue, followed by ‘Ok’ in the dialog underneath. This should output the following tables:

gender * NYHA Crosstabulation

Count

                     NYHA
             1      2      3      4      Total
gender   F   1224   1237   1273   1262   4996
         M   1247   1238   1286   1233   5004
Total        2471   2475   2559   2495   10000

Chi-Square Tests

                     Value   df   Asymp. Sig. (2-sided)
Pearson Chi-Square   .611a   3    .894
Likelihood Ratio     .611    3    .894
N of Valid Cases     10000

a. 0 cells (.0%) have expected count less than 5. The minimum expected count is 1234.51.

Notice the top table shows you the cross-tabulation with a box for each of the 8 possible categories. Also notice the resulting value of the Pearson Chi-Square test statistic is .611. This corresponds to a p-value of .894.
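To see where the .611 comes from, the Pearson statistic can be recomputed by hand from the eight counts in the crosstabulation. The Python sketch below does that. The p-value line uses a closed-form shortcut that is valid only for 3 degrees of freedom (which is what this table has), so treat it as an illustration of this one case and not as a general-purpose routine.

```python
import math

# Observed counts from the gender * NYHA crosstabulation above.
observed = {
    "F": [1224, 1237, 1273, 1262],
    "M": [1247, 1238, 1286, 1233],
}

row_totals = {g: sum(counts) for g, counts in observed.items()}
col_totals = [sum(col) for col in zip(*observed.values())]
n = sum(row_totals.values())

# Pearson chi-square: sum of (observed - expected)^2 / expected.
chi_square = 0.0
for g, counts in observed.items():
    for j, obs in enumerate(counts):
        expected = row_totals[g] * col_totals[j] / n
        chi_square += (obs - expected) ** 2 / expected

df = (len(observed) - 1) * (len(col_totals) - 1)

# Upper-tail probability of a chi-square variable. This closed form
# holds only for 3 degrees of freedom, hence the guard.
assert df == 3
p_value = math.erfc(math.sqrt(chi_square / 2)) + math.sqrt(
    2 * chi_square / math.pi
) * math.exp(-chi_square / 2)

print(f"chi-square = {chi_square:.3f}, df = {df}, p = {p_value:.3f}")
```

Each expected count is just (row total × column total) / N. The smallest of them is the 1234.51 reported in the footnote under the Chi-Square Tests table.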

Question 1: What is the null hypothesis in the test we just performed?

Question 2: How many female subjects had a NYHA class greater than 2?

Question 3: Would we accept the null hypothesis in this test? Why or why not?

Lesson 5

The goal of this lesson is to demonstrate how to perform an independent samples T-test in SPSS.

To start, import the dataset from Lesson 1 into SPSS. Next, point to Analyze → Compare Means and click on Independent-Samples T Test, as shown here:


The variables from your dataset are listed in the left-hand box. Let’s say we’re interested in knowing whether there is a difference in weight between men and women. In this instance, we’d move the weight variable from the left-hand box into the Test Variable box, and the gender variable from the left-hand box to the Grouping Variable box, as shown in the above diagram. However, SPSS doesn’t know which groups to use in the gender variable, which is why, when the dialog originally pops up, there are two question marks inside the gender parentheses. To fix this, hit the Define Groups button, which will bring up the following dialog:

Now, type ‘M’ into the Group 1 textbox and ‘F’ into the Group 2 textbox (gender is coded M and F in this dataset). Then press the Continue button. Can you see what happened to the question marks that used to be in the gender parentheses? They have been replaced by the values you just defined. Now press the Ok button and the following tables will be output:

Group Statistics

     gender   N      Mean             Std. Deviation   Std. Error Mean
wt   M        5004   209.9919464507   10.00559438300   .14144390651
     F        4996   210.0274454209   10.10095335376   .14290622619

Independent Samples Test

Columns F and Sig. belong to Levene's Test for Equality of Variances. The remaining columns belong to the t-test for Equality of Means. Lower and Upper are the bounds of the 95% Confidence Interval of the Difference.

                             F      Sig.   t       df        Sig. (2-tailed)   Mean Diff   Std. Error Diff   Lower   Upper
wt   Equal var assumed       .575   .448   -.177   9998      .860              -.035       .201              -.429   .358
     Equal var not assumed                 -.177   9996.77   .860              -.035       .201              -.429   .358
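The t value in the “equal variances assumed” row can be rebuilt from nothing more than the Group Statistics table. Here is a minimal Python sketch of that arithmetic. It covers only the pooled-variance version of the test and stops at the t statistic, since turning t into a p-value needs the t distribution, which Python’s standard library does not include.

```python
import math

# Group statistics from the SPSS output above.
n_m, mean_m, sd_m = 5004, 209.9919464507, 10.00559438300
n_f, mean_f, sd_f = 4996, 210.0274454209, 10.10095335376

# Pooled variance (the "equal variances assumed" row).
df = n_m + n_f - 2
pooled_var = ((n_m - 1) * sd_m**2 + (n_f - 1) * sd_f**2) / df

mean_diff = mean_m - mean_f
std_error = math.sqrt(pooled_var * (1 / n_m + 1 / n_f))
t_stat = mean_diff / std_error

print(f"mean difference = {mean_diff:.3f}")
print(f"std. error      = {std_error:.3f}")
print(f"t = {t_stat:.3f} on {df} degrees of freedom")
```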

Now, let’s see if you know how to interpret these tables:

1). How many Men were in this dataset?

2). How many Women were in this dataset?

3). What was the mean weight of the men?

4). What was the mean weight of the women?

5). All things considered, are these two means very far apart?

6). Based on your response to question 5, would you expect a t-test to have sufficient evidence to say these two means are different?

7). The two-tailed p-value associated with this test is 0.860. Should we reject the null hypothesis?

8). Is there sufficient evidence to suggest that there is a statistically significant difference between the two means in question?

9). If we now were interested in knowing whether there’s a difference in weight between those with low NYHA class (1 and 2) versus those with high NYHA class (3 and 4), what p-value would you report?

Lesson 6

The goal of this lesson is to demonstrate how to run a One-Way ANOVA in SPSS.

To begin, import the dataset from lesson 1. Next, point to AnalyzeCompare Means and then click onOne-Way ANOVA, as shown below:

This will bring up the following window:

Let’s say we’re interested in knowing whether there is any difference in the weight variable across different NYHA groups. Our null hypothesis is:

H0: There is no difference in mean weight between NYHA groups

And our alternative hypothesis is:

HA: There is a difference in mean weight between different NYHA groups

In order to perform this test, move the weight variable over to the Dependent List box and move the NYHA variable over to the Factor box, as shown in the above figure. Then hit the Ok button. This will bring up the following table:

ANOVA

wt
                  Sum of Squares   df     Mean Square   F       Sig.
Between Groups    707.725          3      235.908       2.335   .072
Within Groups     1009791.503      9996   101.020
Total             1010499.228      9999

You should be able to interpret this table.

Lab Questions

1). The p-value for this test is .072. Do we reject the null hypothesis?

2). Is there sufficient evidence to conclude that there is a statistically significant difference in meanweight across different NYHA groups?

3). Pretend for a moment that the resulting p-value was .02. Would we reject the null hypothesis in this case? Why or why not? If we did reject the null hypothesis we would conclude that there is sufficient evidence to suggest a statistically significant difference in mean weight between NYHA groups. The NYHA variable has 4 levels in it; would we know which of the 4 NYHA groups’ means resulted in the statistically significant result?

4). Divide each of the first two sums of squares by their respective degrees of freedom. What column of the table holds the same answers as these?

5). Divide the between groups mean square by the within groups mean square. What column of the table holds this answer?

6). Subtract the number of NYHA groups from the total number of observations. What value in the table holds this same answer?

7). Subtract 1 from the number of NYHA groups. What value in the table holds this same answer?

8). Add your answers from questions 6 and 7. What value in the table holds this same answer?
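
The arithmetic behind lab questions 4 through 8 can be checked with a few lines of Python once you have tried them by hand. This is only a sketch: the variable names are our own, and the numbers are copied from the ANOVA table above. It does not compute the Sig. column, which requires the F distribution.

```python
# Values copied from the ANOVA table.
ss_between, df_between = 707.725, 3
ss_within, df_within = 1009791.503, 9996
n_total, n_groups = 10000, 4  # 10000 observations, 4 NYHA classes

# Mean square = sum of squares divided by its degrees of freedom.
ms_between = ss_between / df_between
ms_within = ss_within / df_within

# F is the ratio of the between groups to the within groups mean square.
f_ratio = ms_between / ms_within

# Degrees of freedom follow directly from the counts.
check_df_within = n_total - n_groups
check_df_between = n_groups - 1
check_df_total = check_df_within + check_df_between

print(f"Mean squares: {ms_between:.3f} (between), {ms_within:.3f} (within)")
print(f"F = {f_ratio:.3f}")
print(f"df: {check_df_between}, {check_df_within}, {check_df_total}")
```

Each printed value should match one cell of the ANOVA table, which is a quick way to confirm how the columns relate to each other.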

Lesson 7

The goal of this lesson is to show you how to calculate Pearson’s Correlation Coefficient in SPSS.

To start, import the dataset from lesson 1. Then point your mouse to Analyze → Correlate and click on Bivariate, as shown here:

This will bring up the following window:

Let’s say we’re interested in testing the linear association between the variables ht and wt in this dataset. To do this, move those variables from the left hand box to the Variables box using the arrow in the middle of the dialog, as shown in the picture above. Then make sure the Pearson box is checked, so that SPSS knows which statistic you would like displayed. Then hit the Ok button and the following table will be output:

Correlations

                               wt        ht
wt    Pearson Correlation      1         -.022*
      Sig. (2-tailed)                    .028
      N                        10000     10000
ht    Pearson Correlation      -.022*    1
      Sig. (2-tailed)          .028
      N                        10000     10000

*. Correlation is significant at the 0.05 level (2-tailed).

This table is called the “Correlation Matrix”. Pearson’s Correlation coefficient between ht and wt is displayed in the cells of the table that have been greyed out. The correlation itself is -.022, which corresponds to a p-value of .028. You should be able to interpret the rest of this table.

Review Questions:

1). What is the null hypothesis to be tested by this analysis?

2). What does a negative correlation coefficient indicate?

3). Do we reject the null hypothesis?

4). What are the maximum and minimum values the correlation can take on?

5). Why is the Correlation in the other two cells of the table 1?

6). What does a correlation coefficient of 0 indicate?

7). How many observations were tested?
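
To see how such a small correlation can still be statistically significant, the p-value can be approximated from the correlation and the sample size alone. The sketch below is our own illustration, not SPSS output: the function name correlation_t_test is hypothetical, the input r is the rounded value from the table, and the p-value uses a normal approximation that is only reasonable because N is very large.

```python
import math


def correlation_t_test(r, n):
    """Test H0: correlation = 0 using t = r * sqrt((n - 2) / (1 - r^2))."""
    df = n - 2
    t = r * math.sqrt(df / (1 - r ** 2))
    # ASSUMPTION: with df this large the t distribution is close to the
    # standard normal, so the two-tailed p-value is taken from the normal tail.
    p_two_tailed = math.erfc(abs(t) / math.sqrt(2))
    return t, df, p_two_tailed


# r and N copied from the Correlations table (r is rounded to 3 decimals).
t, df, p = correlation_t_test(-0.022, 10000)
print(f"t = {t:.2f}, df = {df}, p = {p:.3f}")
```

The test statistic grows with the square root of the sample size, so with 10000 observations even a correlation this close to zero gives a p-value of about .028.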

Lesson 8

The goal of this lesson is to demonstrate how to calculate a Relative Risk or an Odds Ratio in SPSS.

To begin, import the dataset from lesson 1 and create a new variable called NYHA_grouped that is equal to 1 when NYHA class is 3 or 4 and equal to 2 when NYHA class is 1 or 2 (go back to lesson 2 for a reminder if you are having trouble with this). The group with the high NYHA class has the most severe heart disease. By coding the group with the high NYHA scores as one and the healthier group as two, you are telling SPSS to put the group with severe heart disease in the first column of your two by two table, which is where the disease outcome should be. NYHA_grouped will represent whether each respondent is categorized as having a high (1) or low (2) NYHA class, which is the disease outcome of interest. We are interested in examining whether gender (the exposure of interest) is associated with the risk of having a high NYHA class, that is, whether the number of cases of high NYHA class is significantly different for women compared to men in a sample of individuals with heart disease.
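
The recoding rule is easier to see written out as logic. The following Python sketch only illustrates the rule and the counting that Crosstabs will do later; it is not how SPSS performs the recode, and the six records are made up for the example rather than taken from the lesson 1 dataset.

```python
from collections import Counter


def nyha_grouped(nyha_class):
    """Recode NYHA class: 3 or 4 -> 1 (high), 1 or 2 -> 2 (low)."""
    if nyha_class in (3, 4):
        return 1
    if nyha_class in (1, 2):
        return 2
    raise ValueError(f"unexpected NYHA class: {nyha_class}")


# HYPOTHETICAL records of (gender, NYHA class), for illustration only.
records = [("F", 3), ("M", 1), ("F", 2), ("M", 4), ("F", 4), ("M", 2)]

# Count respondents in each (gender, NYHA_grouped) cell of the 2x2 table.
table = Counter((gender, nyha_grouped(nyha)) for gender, nyha in records)

for gender in ("F", "M"):
    print(gender, table[(gender, 1)], table[(gender, 2)])
```

Because the high NYHA group is coded 1, its counts are printed first in each row, mirroring the column order described above.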

To start, point your mouse to Analyze → Descriptive Statistics and click on Crosstabs, just like we did for doing a Chi-Square Test, as pictured here:

That should bring up the next dialog as pictured here:

Since we’re interested in examining how gender relates to NYHA_grouped, move the gender variable to the Rows box (exposures are usually listed in the rows) and the NYHA_grouped variable to the Column box (outcomes or disease are usually listed in the columns), as shown above. Then click on the Statistics button. This will bring up the following dialog box:

You should recognize this dialog box because we used it when we requested a chi-square test in Lesson 4. This time, make sure the Risk checkbox is checked and then hit the Continue button, followed by the Ok button. This should make SPSS output the following tables:

gender * NYHA_grouped Crosstabulation

Count
                  NYHA_grouped
                  1.00     2.00     Total
gender    F       2535     2461     4996
          M       2519     2485     5004
Total             5054     4946     10000

Review questions:

1.) Looking at the above table, which gender is the “unexposed” group or referent group?

2.) Using the above 2x2 table, calculate the appropriate RR of developing severe heart disease (NYHA group one) for women compared to men.

3.) What kind of data must you have to calculate the RR?

4.) Give an example of a study design that will produce the type of data you must have to calculate a RR.

5.) If the information you have in the above 2x2 table is from a case control study, what would you calculate to estimate the RR? Please do so.

Now look at the second table SPSS gives you as output from this data.

Risk Estimate

                                            95% Confidence Interval
                                   Value    Lower    Upper
Odds Ratio for gender (F / M)      1.016    .940     1.099
For cohort NYHA_grouped = 1.00     1.008    .970     1.048
For cohort NYHA_grouped = 2.00     .992     .953     1.032
N of Valid Cases                   10000

6.) Is the OR the same as what you calculated?

7.) Is the OR significant?

8.) What would you conclude about your null hypothesis?

9.) If this 2x2 table were from a cohort study, you would want to report a RR. Can you find the relative risk you calculated on this table?

10.) Keeping in mind that a high NYHA class indicates a worse heart condition, solely in terms of the NYHA_grouped variable, is it better to be a man or a woman?
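
Once you have worked the questions by hand, the Risk Estimate table can be checked from the four cell counts of the crosstabulation. The Python sketch below uses the standard large-sample (Wald) formulas on the log scale with z = 1.96; that these are comparable to what SPSS reports is an assumption on our part, and the function name risk_estimates is hypothetical.

```python
import math


def risk_estimates(a, b, c, d, z=1.96):
    """Odds ratio and relative risk with approximate 95% confidence intervals.

    a, b: exposed row (F), outcome column 1 and column 2
    c, d: referent row (M), outcome column 1 and column 2
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    or_ci = (
        math.exp(math.log(odds_ratio) - z * se_log_or),
        math.exp(math.log(odds_ratio) + z * se_log_or),
    )

    # Relative risk of outcome column 1: risk in exposed / risk in referent.
    rel_risk = (a / (a + b)) / (c / (c + d))
    se_log_rr = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    rr_ci = (
        math.exp(math.log(rel_risk) - z * se_log_rr),
        math.exp(math.log(rel_risk) + z * se_log_rr),
    )
    return odds_ratio, or_ci, rel_risk, rr_ci


# Cell counts copied from the gender * NYHA_grouped crosstabulation.
odds_ratio, or_ci, rel_risk, rr_ci = risk_estimates(2535, 2461, 2519, 2485)
print(f"OR = {odds_ratio:.3f} ({or_ci[0]:.3f} to {or_ci[1]:.3f})")
print(f"RR = {rel_risk:.3f} ({rr_ci[0]:.3f} to {rr_ci[1]:.3f})")
```

The odds ratio corresponds to the first row of the Risk Estimate table and the relative risk to the row labelled “For cohort NYHA_grouped = 1.00”. Both confidence intervals include 1, which is what you should keep in mind when answering questions 7 and 8.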
