17b. Accessing Data: Manipulating Variables in SAS®

Date post: 24-Feb-2016
Upload: gavril
Prerequisites
• Recommended modules to complete before viewing this module
    1. Introduction to the NLTS2 Training Modules
    2. NLTS2 Study Overview
    3. NLTS2 Study Design and Sampling
    NLTS2 Data Sources, either
      • 4. Parent and Youth Surveys or
      • 5. School Surveys, Student Assessments, and Transcripts
    NLTS2 Documentation
      • 10. Overview
      • 11. Data Dictionaries
      • 12. Quick References

Prerequisites
• Recommended modules to complete before viewing this module (cont’d)
    13. Analysis Example: Descriptive/Comparative Using Longitudinal Data
    Accessing Data
      • 14b. Files in SAS
      • 15b. Frequencies in SAS

Overview
    Purpose
    Modifying existing variables
    Creating new variables
    Summary
    Closing
    Important information

NLTS2 restricted-use data
• NLTS2 data are restricted.
• Data used in these presentations are from a randomly selected subset of the restricted-use NLTS2 data.
• Results in these presentations cannot be replicated with the NLTS2 data licensed by NCES.

Purpose
• Learn to
    Modify an existing variable
    Create a new variable
    Join/combine data from different sources

Modifying existing variables
• How to modify a variable.
• To collapse categories, break a continuous variable into categories, or recode a variable, it is not always necessary to create a new variable in SAS.
    User-assigned formats control how output prints but do not change the variable.
• Syntax for categorizing an existing variable with a format

    PROC FORMAT ;
      VALUE b2catfmt
        low-1   = "(<=1) 1 or younger"
        2-5     = "(2-5) 2 to 5 years of age"
        6-10    = "(6-10) 6 to 10 years of age"
        11-high = "(>=11) 11 or older" ;
    PROC FREQ data = collapse ;
      TABLES np1B2a ;
      FORMAT np1B2a b2catfmt. ;
    RUN ;
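To see that a format changes only how values print, the same idea can be tried on a few made-up records. This is a minimal sketch, not NLTS2 data: the data set `demo`, the variable `age`, and the format `agefmt` are hypothetical names chosen for illustration.

```sas
/* Toy data: five hypothetical ages, one of them missing. */
DATA demo ;
  INPUT age ;
  DATALINES ;
0
3
7
12
.
;
RUN ;

PROC FORMAT ;
  VALUE agefmt
    low-1   = "1 or younger"
    2-5     = "2 to 5"
    6-10    = "6 to 10"
    11-high = "11 or older" ;
RUN ;

/* The format groups the values for this table only. */
PROC FREQ DATA = demo ;
  TABLES age / MISSPRINT ;
  FORMAT age agefmt. ;
RUN ;

/* No format is stored with the data set, so the original values print unchanged. */
PROC PRINT DATA = demo ;
RUN ;
```

For numeric formats the `low` keyword does not include missing values, so the missing record stays in its own row rather than falling into the first category.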

These results cannot be replicated with the full dataset; all output in these modules was generated with a random subset of the full data.

Modifying existing variables
• Syntax to modify an existing variable
    Create a new variable rather than permanently changing the existing variable.
    Create a new format so values are meaningful.

    PROC FORMAT ;
      VALUE b2catfmt
        1 = "(1) 1 or younger"
        2 = "(2) 2 to 5 years of age"
        3 = "(3) 6 to 10 years of age"
        4 = "(4) 11 or older" ;

    Recode the variable in a data step.
    • This would result in a temporary change. Why? What would make it a permanent change?

    DATA collapse ;
      SET sasdb.n2w1parent ;

Modifying existing variables
• Syntax to recode an existing variable into a new variable with value and variable labels.

      /* create age of youth when diagnosed - with age range categories */
      if missing(np1B2a) then np1B2a_Cat = np1B2a ;
      else if np1B2a <= 1 then np1B2a_Cat = 1 ;
      else if 2<=np1B2a<=5 then np1B2a_Cat = 2 ;
      else if 6<=np1B2a<=10 then np1B2a_Cat = 3 ;
      else if np1B2a > 10 then np1B2a_Cat = 4 ;
      FORMAT np1B2a_Cat b2catfmt. ;
      LABEL np1B2a_Cat = '(np1B2a_cat) Age of youth when diagnosed - categorized into ranges' ;
    RUN ;
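The earlier question about why this recode is a temporary change comes down to where the output data set is stored. The sketch below contrasts the two cases. It assumes the `sasdb` libref is already assigned as in the module; the libref `mylib` and its folder path are hypothetical and would need to point to a location you can write to.

```sas
/* Hypothetical libref and folder; replace the path with a writable location. */
LIBNAME mylib "C:\nlts2\derived" ;

/* One-level name: by default written to the WORK library, */
/* which SAS clears when the session ends (temporary).     */
DATA collapse ;
  SET sasdb.n2w1parent ;
  /* recode statements go here */
RUN ;

/* Two-level name: written to the folder behind mylib */
/* and kept after the session ends (permanent).        */
DATA mylib.collapse ;
  SET sasdb.n2w1parent ;
  /* recode statements go here */
RUN ;
```

User-defined formats are also stored in WORK by default. If a permanent data set has such a format attached, the format must be available in later sessions, for example by rerunning the PROC FORMAT step or by storing the format in a permanent library with the `LIBRARY=` option of PROC FORMAT and listing that library in the `FMTSEARCH=` system option.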

Modifying existing variables
• Look at results
    Run a frequency of the new variable.
    It is useful to look at a crosstab of the original variable by the new variable to check how values were coded.
• Look at frequency distributions and crosstab of new vs. old variables
    The “LIST” option on the TABLES statement will print the crosstab table more compactly.
    A FORMAT statement without a format specified will strip existing formats.

    TABLES np1B2a_Cat * np1B2a / MISSPRINT LIST ;
    FORMAT np1B2a ;
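The two statements above belong inside a PROC FREQ step. A minimal sketch of the complete check, assuming the recoded data set is named `collapse` as in the earlier data step:

```sas
PROC FREQ DATA = collapse ;
  /* One-way frequency of the new categorized variable. */
  TABLES np1B2a_Cat ;
  /* Crosstab in list form: each row pairs a new category with an original value. */
  TABLES np1B2a_Cat * np1B2a / MISSPRINT LIST ;
  /* Strip any format from the original variable for this step only. */
  FORMAT np1B2a ;
RUN ;
```

Because the FORMAT statement sits inside the procedure, it affects only this output; any format stored with `np1B2a` in the data set is left as it was.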

Modifying existing variables: Example
• Modifying a variable
    Use the Wave 3 parent/youth interview file.
    Collapse np3NbrProbs into a new variable with these categories:
      • 0-1
      • 2
      • 3
      • 4-6
    Remember to
      • Label the variable.
      • Add value formats.
      • Account for missing values.
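One possible way to work this exercise is sketched below. It is not the module's own solution. The format name `probcat`, the category codes 1 to 4, and the wording of the labels are assumptions made for illustration; the file name `sasdb.n2w3paryouth` and the variable `ID` are taken from elsewhere in this module.

```sas
PROC FORMAT ;
  VALUE probcat              /* hypothetical format name */
    1 = "(1) 0 to 1"
    2 = "(2) 2"
    3 = "(3) 3"
    4 = "(4) 4 to 6" ;
RUN ;

DATA collapse ;
  SET sasdb.n2w3paryouth (KEEP = ID np3NbrProbs) ;
  /* Carry missing values through unchanged. */
  IF MISSING(np3NbrProbs) THEN np3NbrProbs_Cat = np3NbrProbs ;
  ELSE IF 0 <= np3NbrProbs <= 1 THEN np3NbrProbs_Cat = 1 ;
  ELSE IF np3NbrProbs = 2 THEN np3NbrProbs_Cat = 2 ;
  ELSE IF np3NbrProbs = 3 THEN np3NbrProbs_Cat = 3 ;
  ELSE IF 4 <= np3NbrProbs <= 6 THEN np3NbrProbs_Cat = 4 ;
  FORMAT np3NbrProbs_Cat probcat. ;
  LABEL np3NbrProbs_Cat = "(np3NbrProbs_Cat) np3NbrProbs collapsed into categories" ;
RUN ;
```

Any value outside 0 to 6 would leave the new variable missing, which a crosstab of the new variable against the original would make visible.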

Modifying existing variables: Example

• PROC FREQ with a user-defined format (no change made to np3NbrProbs)

Modifying existing variables: Example

• PROC FREQ with new variable np3NbrProbs_Cat created from np3NbrProbs

Modifying existing variables: Example

• Created np3NbrProbs_Cat compared with original np3NbrProbs
• Stripped existing formats from np3NbrProbs with a FORMAT statement

    FORMAT np3NbrProbs ;

Creating new variables
• How to create a new variable.
• The values in the new variable can be the results of calculations, assignments, or logic.
• A new variable can be created from an existing variable or from multiple variables, including variables from other sources and/or waves.
    Variables from other sources/waves must be added to the active data file before creating the new variable.

Creating new variables
• Be aware of any coding differences between the variables when combining values.
• Decide what to do with missing values.
• Example: Create a variable using parent interview data from Waves 1 through 4.
    Has the student been suspended and/or expelled in any wave?

Creating new variables
Create a format for the new variable and join the data needed.

    PROC FORMAT ;
      VALUE fmta
        0 = "(0) Never suspended/expelled"
        1 = "(1) Suspended or expelled in any wave"
        2 = "(2) Suspended or expelled every wave" ;

    DATA collapse ;
      MERGE sasdb.n2w1parent   (keep=ID np1d7h)
            sasdb.n2w2paryouth (keep=ID np2d5d)
            sasdb.n2w3paryouth (keep=ID np3d5d)
            sasdb.n2w4paryouth (keep=ID np4d5d) ;
      BY ID ;

Creating new variables
• Syntax

      if np1D7h>=0 and np2D5d>=0 and np3D5d>=0 and np4D5d>=0 then do ;
        if np1D7h=1 and np2D5d=1 and np3D5d=1 and np4D5d=1 then np4D5d_ever = 2 ;
        else if np1D7h=1 or np2D5d=1 or np3D5d=1 or np4D5d=1 then np4D5d_ever = 1 ;
        else np4D5d_ever = 0 ;
      end ;

• Code will result in a variable that
    Requires a value for every wave
    Is 0 if never suspended/expelled
    Is 1 if suspended/expelled in at least one wave, but not all
    Is 2 if suspended/expelled in all four waves.
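Requiring a value for every wave is one answer to the question of what to do with missing values. A different choice, sketched here only as an illustration, treats a "yes" in any wave as enough and assigns "never" only when all four waves were answered. The sketch assumes the merged data set `collapse` from the step above has been completed, and that the four items are coded 1 for yes and 0 for no. The names `collapse_any`, `np4D5d_any`, `n_valid`, and `n_yes` are hypothetical.

```sas
DATA collapse_any ;                     /* hypothetical output name */
  SET collapse ;
  /* Number of waves with a nonmissing response. */
  n_valid = N(np1D7h, np2D5d, np3D5d, np4D5d) ;
  /* Number of waves with a "yes"; each comparison evaluates to 1 or 0. */
  n_yes = (np1D7h = 1) + (np2D5d = 1) + (np3D5d = 1) + (np4D5d = 1) ;
  IF n_yes >= 1 THEN np4D5d_any = 1 ;
  ELSE IF n_valid = 4 THEN np4D5d_any = 0 ;
  /* Otherwise np4D5d_any stays missing: no "yes" was reported, */
  /* but at least one wave has no response.                     */
  DROP n_valid n_yes ;
RUN ;
```

This version keeps more cases than the all-waves rule, at the cost of mixing youth with different numbers of observed waves, so the choice should follow the analysis question.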

Creating new variables: Example
• Creating a new variable
    Use the Wave 4 parent/youth interview file.
    Bring in np1F7 from Wave 1, np2P8_J4 from Wave 2, and np3P8_J4 from Wave 3 interview files.
    Create a new variable np4P8_J4_ever (ever done volunteer or community service).
    Initialize value to “0” if any value in np1F7, np2P8_J4, np3P8_J4, or np4P8_J4 is “0.”
    Reassign to “1” if any value in np1F7, np2P8_J4, np3P8_J4, or np4P8_J4 is “1.”

Creating new variables: Example
• Creating a new variable (cont’d)
    Assign a variable label and value labels.
    Run a frequency of np4P8_J4_ever.
    Run a crosstabulation of np4P8_J4_ever by np4P8_J4.
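The steps of this exercise could be put together roughly as follows. This is a sketch under stated assumptions, not the module's own solution: it assumes the four variables live in the same interview files used in the suspension example, that each file is sorted by `ID`, and that the items are coded 1 for yes and 0 for no. The data set name `volunteer`, the format name `everfmt`, the flag `inW4`, and the choice to keep only Wave 4 records are hypothetical.

```sas
PROC FORMAT ;
  VALUE everfmt                          /* hypothetical format name */
    0 = "(0) No"
    1 = "(1) Yes" ;
RUN ;

DATA volunteer ;                         /* hypothetical data set name */
  MERGE sasdb.n2w1parent   (KEEP = ID np1F7)
        sasdb.n2w2paryouth (KEEP = ID np2P8_J4)
        sasdb.n2w3paryouth (KEEP = ID np3P8_J4)
        sasdb.n2w4paryouth (KEEP = ID np4P8_J4 IN = inW4) ;
  BY ID ;
  /* Assumption: Wave 4 is the active file, so keep only its records. */
  IF inW4 ;
  /* Initialize to 0 if any wave has a "no". */
  IF np1F7 = 0 OR np2P8_J4 = 0 OR np3P8_J4 = 0 OR np4P8_J4 = 0
    THEN np4P8_J4_ever = 0 ;
  /* Reassign to 1 if any wave has a "yes". */
  IF np1F7 = 1 OR np2P8_J4 = 1 OR np3P8_J4 = 1 OR np4P8_J4 = 1
    THEN np4P8_J4_ever = 1 ;
  FORMAT np4P8_J4_ever everfmt. ;
  LABEL np4P8_J4_ever = "(np4P8_J4_ever) Ever done volunteer or community service" ;
RUN ;

PROC FREQ DATA = volunteer ;
  TABLES np4P8_J4_ever ;
  TABLES np4P8_J4_ever * np4P8_J4 / MISSPRINT LIST ;
  FORMAT np4P8_J4 ;
RUN ;
```

Records with no 0 or 1 in any wave keep a missing value for `np4P8_J4_ever`, which the crosstab with MISSPRINT makes visible.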

Summary
• Be aware of differences in coding between similar variables when building composite variables.
• Missing values must be considered.
    Know how missing values are being coded, particularly when using more than one variable to create another.
    Joined data are more likely to have missing values.
• Weights
    Generally, the analysis weight would be the weight from the smallest sample when combining data.
    When filling in values for a variable in an active file with values from another, it is OK to use the weight in the active file.

Know the values, mind the missing, and watch your weights!

Closing
• Topics discussed in this module
    Modifying existing variables
    Creating new variables
    Summary
• Next module: 18b. PROC SURVEY Procedures in SAS

Important information
    NLTS2 website contains reports, data tables, and other project-related information
      http://nlts2.org/
    Information about obtaining the NLTS2 database and documentation can be found on the NCES website
      http://nces.ed.gov/statprog/rudman/
    General information about restricted data licenses can be found on the NCES website
      http://nces.ed.gov/statprog/instruct.asp
    E-mail address: [email protected]

