SW388R7 Data Analysis & Computers II Slide 1 Computing Transformations Transforming variables...

Date post: 01-Apr-2015
Upload: katerina-kenward
SW388R7 Data Analysis & Computers II

Slide 1

Computing Transformations

Transforming variables

Transformations for normality

Transformations for linearity

Slide 2

Transforming variables to satisfy assumptions

When a metric variable fails to satisfy the assumption of normality, homogeneity of variance, or linearity, we may be able to correct the deficiency by using a transformation.

We will consider three transformations for normality, homogeneity of variance, and linearity: the logarithmic transformation, the square root transformation, and the inverse transformation, plus a fourth that is useful for problems of linearity: the square transformation.

Slide 3

Computing transformations in SPSS

In SPSS, transformations are obtained by computing a new variable. SPSS functions are available for the logarithmic (LG10) and square root (SQRT) transformations. The inverse transformation uses a formula which divides one by the original value for each case.

For each of these calculations, there may be data values which are not mathematically permissible. For example, the log of zero is not defined mathematically, division by zero is not permitted, and the square root of a negative number results in an “imaginary” value. We will usually adjust the values passed to the function to make certain that these illegal operations do not occur.

Slide 4

Two forms for computing transformations

There are two forms for each of the transformations to induce normality, depending on whether the distribution is negatively skewed (tail to the left) or positively skewed (tail to the right).

Both forms use the same SPSS functions and formula to calculate the transformations.

The two forms differ in the value or argument passed to the functions and formula. The argument to the functions is an adjustment to the original value of the variable to make certain that all of the calculations are mathematically correct.

Slide 5

Functions and formulas for transformations

Symbolically, if we let x stand for the argument passed to the function or formula, the calculations for the transformations are:

Logarithmic transformation: compute log = LG10(x)

Square root transformation: compute sqrt = SQRT(x)

Inverse transformation: compute inv = 1 / (x)

Square transformation: compute s2 = x * x

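Keeping the argument greater than zero guarantees that the logarithmic, square root, and inverse calculations are mathematically legitimate. (The square of zero is defined, so the adjustment for the square transformation is handled differently.)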
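As a rough illustration outside SPSS, the four calculations can be sketched in Python. The function names below are hypothetical, and the checks reflect only what each operation strictly requires mathematically; this is a sketch of the arithmetic, not of how SPSS itself handles invalid values.

```python
import math


def log_transform(x):
    """Base-10 logarithm, mirroring LG10(x); x must be greater than zero."""
    if x <= 0:
        raise ValueError("the log is not defined for zero or negative values")
    return math.log10(x)


def sqrt_transform(x):
    """Square root, mirroring SQRT(x); x must not be negative."""
    if x < 0:
        raise ValueError("the square root of a negative number is imaginary")
    return math.sqrt(x)


def inverse_transform(x):
    """Inverse, mirroring 1 / (x); x must not be zero."""
    if x == 0:
        raise ValueError("division by zero is not permitted")
    return 1 / x


def square_transform(x):
    """Square, mirroring x * x; defined for every value."""
    return x * x
```

Each function takes the already adjusted argument x, so the adjustment to the original value is applied before the call.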

Slide 6

Transformation of positively skewed variables

For positively skewed variables, the argument is an adjustment to the original value based on the minimum value for the variable.

If the minimum value for a variable is zero, the adjustment requires that we add one to each value, e.g. x + 1.

If the minimum value for a variable is a negative number (e.g. –6), the adjustment requires that we add the absolute value of the minimum value (e.g. 6) plus one (e.g. x + 6 + 1, which equals x + 7).
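The adjustment rule for positively skewed variables can be sketched as a small Python helper. The name positive_skew_argument is hypothetical, and leaving the values unchanged when the minimum is already greater than zero is our assumption, since the slide only covers a minimum of zero and a negative minimum.

```python
def positive_skew_argument(values):
    """Return the adjusted argument for each value of a positively skewed variable."""
    minimum = min(values)
    if minimum > 0:
        # Assumption: no adjustment is needed when every value is already positive.
        shift = 0
    else:
        # Minimum of 0 gives a shift of 1; a minimum of -6 gives 6 + 1 = 7.
        shift = abs(minimum) + 1
    return [v + shift for v in values]
```

After the adjustment the smallest argument is 1, so the log, square root, and inverse are all defined.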

Slide 7

Example of positively skewed variable

Suppose our dataset contains the number of books read (books) for 5 subjects: 1, 3, 0, 5, and 2, and the distribution is positively skewed.

The minimum value for the variable books is 0. The adjustment for each case is books + 1.

The transformations would be calculated as follows:

Compute logBooks = LG10(books + 1)

Compute sqrBooks = SQRT(books + 1)

Compute invBooks = 1 / (books + 1)
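To see the numbers these three computations produce for the five subjects, the same arithmetic can be run in Python. This is only a sketch of the calculation; the list names are ours and simply echo the SPSS variable names.

```python
import math

books = [1, 3, 0, 5, 2]

# The minimum is 0, so the argument for every case is books + 1.
argument = [b + 1 for b in books]

log_books = [math.log10(a) for a in argument]  # LG10(books + 1)
sqr_books = [math.sqrt(a) for a in argument]   # SQRT(books + 1)
inv_books = [1 / a for a in argument]          # 1 / (books + 1)

for b, lg, sq, inv in zip(books, log_books, sqr_books, inv_books):
    print(f"books={b}  log={lg:.3f}  sqrt={sq:.3f}  inverse={inv:.3f}")
```

The case with zero books gets an argument of 1, so its log is 0 and its inverse is 1 rather than an undefined value.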

Slide 8

Transformation of negatively skewed variables

If the distribution of a variable is negatively skewed, the adjustment of the values reverses, or reflects, the distribution so that it becomes positively skewed. The transformations are then computed on the values in the positively skewed distribution.

Reflection is computed by subtracting each value of the variable from one plus the absolute value of the maximum value for the variable. This results in a positively skewed distribution with all values larger than zero.

Slide 9

Example of negatively skewed variable

Suppose our dataset contains the number of books read (books) for 5 subjects: 1, 3, 0, 5, and 2, and the distribution is negatively skewed.

The maximum value for the variable books is 5. The adjustment for each case is 6 - books.

The transformations would be calculated as follows:

Compute logBooks = LG10(6 - books)

Compute sqrBooks = SQRT(6 - books)

Compute invBooks = 1 / (6 - books)
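Reflection can be sketched in Python as follows. The function name reflect is hypothetical; it applies the rule of subtracting each value from one plus the absolute value of the maximum.

```python
import math


def reflect(values):
    """Reflect a negatively skewed variable so that it becomes positively skewed."""
    constant = abs(max(values)) + 1  # for books: 5 + 1 = 6
    return [constant - v for v in values]


books = [1, 3, 0, 5, 2]
reflected = reflect(books)  # the same as 6 - books for each case

log_books = [math.log10(r) for r in reflected]  # LG10(6 - books)
sqr_books = [math.sqrt(r) for r in reflected]   # SQRT(6 - books)
inv_books = [1 / r for r in reflected]          # 1 / (6 - books)
```

Note that reflection reverses the order of the cases: the subject who read the most books now has the smallest argument, which matters when interpreting the direction of any relationship found with the transformed variable.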

Slide 10

The Square Transformation for Linearity

The square transformation is computed by multiplying the value for the variable by itself.

It does not matter whether the distribution is positively or negatively skewed.

It does matter if the variable has negative values, since we would not be able to distinguish their squares from the square of a comparable positive value (e.g. the square of -4 is equal to the square of +4). If the variable has negative values, we add the absolute value of the minimum value to each score before squaring it.

Slide 11

Example of the square transformation

Suppose our dataset contains change scores (chg) for 5 subjects that indicate the difference between test scores at the end of a semester and test scores at mid-term: -10, 0, 10, 20, and 30.

The minimum score is -10. The absolute value of the minimum score is 10.

The transformation would be calculated as follows:

Compute squarChg = (chg + 10) * (chg + 10)
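The reason for the adjustment shows up in this very dataset: squaring the raw scores gives -10 and 10 the same value. A short Python sketch of both calculations (list names are ours) makes the difference visible.

```python
chg = [-10, 0, 10, 20, 30]

# Squaring without an adjustment: -10 and 10 both become 100.
raw_squares = [c * c for c in chg]

# Add the absolute value of the minimum when the minimum is negative.
minimum = min(chg)
shift = abs(minimum) if minimum < 0 else 0

# Equivalent to (chg + 10) * (chg + 10) for this dataset.
squar_chg = [(c + shift) * (c + shift) for c in chg]

print(raw_squares)  # [100, 0, 100, 400, 900]
print(squar_chg)    # [0, 100, 400, 900, 1600]
```

After the adjustment every subject has a distinct squared score and the original ordering of the subjects is preserved.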

Slide 12

Transformations for normality

[Figure: Normal Q-Q Plot of TOTAL TIME SPENT ON THE INTERNET; horizontal axis: Observed Value, vertical axis: Expected Normal]

[Figure: Histogram of TOTAL TIME SPENT ON THE INTERNET; vertical axis: Frequency; Std. Dev = 15.35, Mean = 10.7, N = 93]

Both the histogram and the normality plot for Total Time Spent on the Internet (netime) indicate that the variable is not normally distributed.

Slide 13

Determine whether reflection is required

Descriptives: TOTAL TIME SPENT ON THE INTERNET

Statistic                                        Value      Std. Error
Mean                                             10.73      1.59
95% Confidence Interval for Mean, Lower Bound    7.57
95% Confidence Interval for Mean, Upper Bound    13.89
5% Trimmed Mean                                  8.29
Median                                           5.50
Variance                                         235.655
Std. Deviation                                   15.35
Minimum                                          0
Maximum                                          102
Range                                            102
Interquartile Range                              10.20
Skewness                                         3.532      .250
Kurtosis                                         15.614     .495

Skewness, in the table of Descriptive Statistics, indicates whether or not reflection (reversing the values) is required in the transformation.

If Skewness is positive, as it is in this problem, reflection is not required. If Skewness is negative, reflection is required.
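The decision rule depends only on the sign of the skewness statistic. As a sketch, the Python below computes the adjusted Fisher-Pearson sample skewness, which we assume is the form reported in the Descriptives table, and tests its sign. The function names and the small datasets in the usage lines are hypothetical.

```python
import statistics


def sample_skewness(values):
    """Adjusted Fisher-Pearson sample skewness (assumed to match the table's statistic)."""
    n = len(values)
    if n < 3:
        raise ValueError("skewness needs at least three values")
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1 in the denominator)
    if sd == 0:
        raise ValueError("skewness is undefined when all values are equal")
    cubed = sum(((v - mean) / sd) ** 3 for v in values)
    return n / ((n - 1) * (n - 2)) * cubed


def reflection_required(values):
    """Reflection is required only when the skewness is negative."""
    return sample_skewness(values) < 0


print(reflection_required([1, 2, 3, 4, 20]))     # long tail to the right
print(reflection_required([1, 17, 18, 19, 20]))  # long tail to the left
```

For netime the table reports a skewness of 3.532, which is positive, so the rule returns "no reflection" without any further calculation.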

Slide 14

Compute the adjustment to the argument

[Table: Descriptives for TOTAL TIME SPENT ON THE INTERNET, repeated from Slide 13; Minimum = 0, Maximum = 102]

In this problem, the minimum value is 0, so 1 will be added to each value in the formula, i.e. the argument to the SPSS functions and formula for the inverse will be:

netime + 1.

Slide 15

Computing the logarithmic transformation

To compute the transformation, select the Compute… command from the Transform menu.

Slide 16

Specifying the transform variable name and function

First, in the Target Variable text box, type a name for the log transformation variable, e.g. “lgnetime”.

Second, scroll down the list of functions to find LG10, which calculates logarithmic values using a base of 10. (The logarithmic value is the power to which 10 must be raised to produce the original number.)

Third, click on the up arrow button to move the highlighted function to the Numeric Expression text box.
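The meaning of the base-10 logarithm can be checked with a few hypothetical netime values (these are illustrative numbers, not cases from the dataset). The sketch below computes the log of netime + 1 in Python and then raises 10 to that power to recover the argument.

```python
import math

netime = [0, 9, 99]  # hypothetical values chosen so the logs come out as whole numbers

# Same calculation as LG10(netime + 1).
lgnetime = [math.log10(v + 1) for v in netime]

# Raising 10 to the logarithm recovers the argument netime + 1.
recovered = [10 ** lg for lg in lgnetime]

for v, lg, back in zip(netime, lgnetime, recovered):
    print(f"netime={v}  lgnetime={lg:.3f}  10**lgnetime={back:.1f}")
```

Because each step up in the log corresponds to a tenfold increase in the argument, large values are pulled in much more strongly than small ones, which is what reduces positive skew.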

Slide 17

Adding the variable name to the function

First, scroll down the list of variables to locate the variable we want to transform. Click on its name so that it is highlighted.

Second, click on the right arrow button. SPSS will replace the highlighted text in the function (?) with the name of the variable.

Slide 18

Adding the constant to the function

Following the rules stated for determining the constant that needs to be included in the function either to prevent mathematical errors, or to do reflection, we include the constant in the function argument. In this case, we add 1 to the netime variable.

Click on the OK button to complete the compute request.

Slide 19

The transformed variable

The transformed variable that we asked SPSS to compute is shown in the data editor in a column to the right of the other variables in the dataset.

Slide 20

Computing the square root transformation

To compute the transformation, select the Compute… command from the Transform menu.

Slide 21

Specifying the transform variable name and function

First, in the Target Variable text box, type a name for the square root transformation variable, e.g. “sqnetime”.

Second, scroll down the list of functions to find SQRT, which calculates the square root of a variable.

Third, click on the up arrow button to move the highlighted function to the Numeric Expression text box.

Slide 22

Adding the variable name to the function

First, scroll down the list of variables to locate the variable we want to transform. Click on its name so that it is highlighted.

Second, click on the right arrow button. SPSS will replace the highlighted text in the function (?) with the name of the variable.

Slide 23

Adding the constant to the function

Following the rules stated for determining the constant that needs to be included in the function either to prevent mathematical errors, or to do reflection, we include the constant in the function argument. In this case, we add 1 to the netime variable.

Click on the OK button to complete the compute request.

Slide 24

The transformed variable

The transformed variable that we asked SPSS to compute is shown in the data editor in a column to the right of the other variables in the dataset.

Slide 25

Computing the inverse transformation

To compute the transformation, select the Compute… command from the Transform menu.

Slide 26

Specifying the transform variable name and formula

First, in the Target Variable text box, type a name for the inverse transformation variable, e.g. “innetime”.

Second, there is no function for computing the inverse, so we type the formula directly into the Numeric Expression text box, i.e. 1 / (netime + 1).

Third, click on the OK button to complete the compute request.

Slide 27

The transformed variable

The transformed variable that we asked SPSS to compute is shown in the data editor in a column to the right of the other variables in the dataset.

Slide 28

Adjustment to the argument for the square transformation

[Table: Descriptives for TOTAL TIME SPENT ON THE INTERNET, repeated from Slide 13; Minimum = 0, Maximum = 102]

In this problem, the minimum value is 0, so no adjustment is needed for computing the square. If the minimum were a number less than zero, we would add the absolute value of the minimum (dropping the sign) as an adjustment to the variable.

It is mathematically correct to square a value of zero, so the adjustment to the argument for the square transformation is different. What we need to avoid are negative numbers, since the square of a negative number produces the same value as the square of the corresponding positive number.

Slide 29

Computing the square transformation

To compute the transformation, select the Compute… command from the Transform menu.

Slide 30

Specifying the transform variable name and formula

First, in the Target Variable text box, type a name for the square transformation variable, e.g. “s2netime”.

Second, there is no function for computing the square, so we type the formula directly into the Numeric Expression text box, i.e. netime * netime.

Third, click on the OK button to complete the compute request.

Slide 31

The transformed variable

The transformed variable that we asked SPSS to compute is shown in the data editor in a column to the right of the other variables in the dataset.

