Reading Raw Data, Formats and Data Types
2.1 SAS Data Libraries
2.2 SAS List Reports from SAS Data Sets
2.3 Formats in SAS
2.4 Reading Raw Data into SAS
2.5 Minitab List Reports from Minitab Worksheets
2.6 Formats in Minitab
2.7 Reading Raw Data into Minitab

SAS Data Libraries
Objectives
- Explain the concept of a SAS data library.
- State the difference between a permanent library and a temporary library.
- Use the CONTENTS procedure to investigate a SAS data library.
Airline Data Library
In this class, you will be working with business data from International Airlines (IA). The various kinds of data IA maintains are
- flight data
- passenger data
- cargo data
- employee data
- revenue data.
SAS Data Libraries
A SAS data library is a collection of SAS files that are recognized as a unit by SAS.
A SAS data set is a type of SAS file.
On directory-based systems, a SAS data library is a directory (Windows example: c:\mysasfiles).
[Figure: a SAS data library containing several SAS files]
SAS Data Libraries
You can think of a SAS data library as a drawer in a filing cabinet and a SAS data set as one of the file folders in the drawer.
SAS Data Libraries
When you invoke SAS, you automatically have access to a temporary and a permanent SAS data library:
- work: temporary library
- sasuser: permanent library
You can create and access your own permanent libraries:
- ia: permanent library
Assigning a Libref
Regardless of which host operating system you use, you identify SAS data libraries by assigning each a library reference name (libref).
Assigning a Libref
You can use the LIBNAME statement to assign a libref to a SAS data library.
General form of the LIBNAME statement:
LIBNAME libref 'SAS-data-library' <options>;
Rules for naming a libref:
- must be 8 characters or fewer
- must begin with a letter or underscore
- remaining characters are letters, numbers, or underscores.
Assigning a Libref
Example (Windows):
libname ia "\\Cqaspdc\lab share\LaLonde\742\data";
When you submit the LIBNAME statement, a connection is made between a libref in SAS and the physical location of files on your operating system.
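A minimal sketch of the libref life cycle, assuming the Windows directory c:\mysasfiles mentioned earlier exists on your machine; substitute any directory you can read. The LIST and CLEAR forms of the LIBNAME statement are shown only as a convenient way to confirm and then remove the connection.

```sas
/* Assumption: c:\mysasfiles is an existing directory on a Windows host. */
libname ia 'c:\mysasfiles';

/* Write the attributes of the ia library (engine, physical name) to the log. */
libname ia list;

/* Remove the association; the files in the directory are not deleted. */
libname ia clear;
```

Clearing a libref only breaks the connection made by the LIBNAME statement, so the same directory can be assigned again later under the same or a different libref.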
Two-level SAS Filenames
Every SAS file has a two-level name:
libref.filename
- The first name (libref) refers to the library.
- The second name (filename) refers to the file in the library.
The data set ia.sales is a SAS file in the ia library.
Temporary SAS Filename
The libref work can be omitted when you refer to a file in the work library. The default libref is work if the libref is omitted, so work.employee and employee refer to the same data set.
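The one-level and two-level names can be compared with a short sketch. The data values and the variable names Name and Salary are made up for illustration.

```sas
/* Hypothetical sample data, created in the temporary work library. */
data employee;
   input Name $ Salary;
   datalines;
SCOTT 32278.40
BELL 59803.16
;
run;

/* Both steps print the same data set. */
proc print data=work.employee;
run;

proc print data=employee;
run;
```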
Browsing a SAS Data Library
During an interactive SAS session, the LIBNAME window enables you to investigate the contents of a SAS data library.
In the LIBNAME window, you can
- view a list of all the libraries available during your current SAS session
- drill down to see all members of a specific library
- display the descriptor portion of a SAS data set.
LIBNAME Window: Windows
Browsing a SAS Data Library
Use the _ALL_ keyword to list all the SAS files in the library and the NODS option to suppress the descriptor portions of the data sets.
General form of the NODS option:
PROC CONTENTS DATA=libref._ALL_ NODS;
RUN;
NODS must be used in conjunction with the keyword _ALL_.
Example:
proc contents data=ia._all_ nods;
run;
PROC CONTENTS Output (Partial Output)

The SAS System

The CONTENTS Procedure

-----Directory-----

Libref:         IA
Engine:         V8
Physical Name:  C:\workshop\winsas\prog1
File Name:      C:\workshop\winsas\prog1

                             File
 #  Name          Memtype    Size  Last Modified
-----------------------------------------------------
 1  ALLGOALS      DATA       5120  07MAY2001:09:24:53
 2  ALLGOALS2     DATA       5120  07MAY2001:09:24:47
 3  ALLSALES      DATA       5120  18JUL2001:15:55:01
 4  ALLSALES2     DATA       5120  07MAY2001:09:23:43
 5  APRTARGET     DATA      17408  09AUG2001:19:02:44
 6  CHICAGO       DATA      17408  05MAY2001:21:20:10
 7  CREW          DATA      13312  29JUN2001:21:55:59
 8  DELAY         DATA      66560  18JUL2001:16:03:02
 9  DFWLAX        DATA       5120  25JUN2001:17:27:28
. . .
32  TARGET121999  DATA     115712  09AUG2001:18:38:20
33  WEEKREV       DATA       5120  18JUL2001:15:52:42
Browsing a SAS Data Library
To explore the descriptor portion of a SAS data set, specify the data set name in the DATA= option.
PROC CONTENTS DATA=libref.SAS-data-set-name;
RUN;
Example:
proc contents data=ia.crew;
run;
PROC CONTENTS Output – Part 1

The SAS System

The CONTENTS Procedure

Data Set Name: IA.CREW                        Observations:          69
Member Type:   DATA                           Variables:             8
Engine:        V8                             Indexes:               0
Created:       15:15 Friday, June 29, 2001    Observation Length:    120
Last Modified: 15:41 Friday, June 29, 2001    Deleted Observations:  0
Protection:                                   Compressed:            NO
Data Set Type:                                Sorted:                NO
Label:

-----Engine/Host Dependent Information-----

Data Set Page Size:          12288
Number of Data Set Pages:    1
First Data Page:             1
Max Obs per Page:            102
Obs in First Data Page:      69
Number of Data Set Repairs:  0
File Name:                   C:\workshop\winsas\prog1\crew.sas7bdat
Release Created:             8.0202M0
Host Created:                WIN_PRO
PROC CONTENTS Output – Part 2

-----Alphabetic List of Variables and Attributes-----

#  Variable   Type  Len  Pos  Format  Informat
----------------------------------------------
6  EmpID      Char    6  104
3  FirstName  Char   32   48
1  HireDate   Num     8    0  DATE9.  DATE9.
7  JobCode    Char    6  110
2  LastName   Char   32   16
4  Location   Char   16   80
5  Phone      Char    8   96
8  Salary     Num     8    8
Summary
- SAS data sets are stored in SAS libraries (synonymous with folders in Windows…).
- A SAS library reference (libref) is used to point to a SAS library (or folder).
- SAS data sets can be either temporary (erased at the end of the SAS session) or permanent (not automatically erased at the end of the session).
- PROC CONTENTS is used to display the contents of a SAS data library or the descriptor portion of a SAS data set.
SAS List Reports from SAS Data Sets
Objectives
- Generate simple list reports using the PRINT procedure.
- Sequence (sort) observations in a SAS data set.
- Group observations in a list report.
- Print column subtotals in a list report.
- Use the ID statement to identify observations.
- Define titles and footnotes to enhance reports.
- Define descriptive column headings.
- Create HTML reports using the Output Delivery System.
Overview of PROC PRINT
List reports are typically generated with the PRINT procedure.

                        The SAS System

       Emp                                   Job
Obs    ID      LastName         FirstName    Code        Salary

  1    0031    GOLDENBERG       DESIREE      PILOT     50221.62
  2    0040    WILLIAMS         ARLENE M.    FLTAT     23666.12
  3    0071    PERRY            ROBERT A.    FLTAT     21957.71
  4    0082    MCGWIER-WATTS    CHRISTINA    PILOT     96387.39
  5    0091    SCOTT            HARVEY F.    FLTAT     32278.40
  6    0106    THACKER          DAVID S.     FLTAT     24161.14
  7    0355    BELL             THOMAS B.    PILOT     59803.16
  8    0366    GLENN            MARTHA S.    PILOT    120202.38
Overview of PROC PRINT
You can display
- titles and footnotes
- descriptive column headings
- formatted data values.

                        Salary Report

       Emp                                   Job           Annual
Obs    ID      LastName         FirstName    Code          Salary

  1    0031    GOLDENBERG       DESIREE      PILOT     $50,221.62
  2    0040    WILLIAMS         ARLENE M.    FLTAT     $23,666.12
  3    0071    PERRY            ROBERT A.    FLTAT     $21,957.71
  4    0082    MCGWIER-WATTS    CHRISTINA    PILOT     $96,387.39
  5    0091    SCOTT            HARVEY F.    FLTAT     $32,278.40
  6    0106    THACKER          DAVID S.     FLTAT     $24,161.14
  7    0355    BELL             THOMAS B.    PILOT     $59,803.16
  8    0366    GLENN            MARTHA S.    PILOT    $120,202.38
Overview of PROC PRINT
You can display
- column totals
- column subtotals
- page breaks for each subgroup.

                        The SAS System

------------------------ JobCode=FLTAT -------------------------

       Emp
Obs    ID      LastName         FirstName       Salary

  1    0040    WILLIAMS         ARLENE M.     23666.12
  2    0071    PERRY            ROBERT A.     21957.71
  3    0091    SCOTT            HARVEY F.     32278.40
  4    0106    THACKER          DAVID S.      24161.14
-------                                      ---------
JobCode                                      102063.37
Overview of PROC PRINT

                        The SAS System

------------------------ JobCode=PILOT -------------------------

       Emp
Obs    ID      LastName         FirstName       Salary

  5    0031    GOLDENBERG       DESIREE       50221.62
  6    0082    MCGWIER-WATTS    CHRISTINA     96387.39
  7    0355    BELL             THOMAS B.     59803.16
  8    0366    GLENN            MARTHA S.    120202.38
-------                                      ---------
JobCode                                      326614.55
                                             =========
                                             428677.92
Creating a Default List Report
General form of the PRINT procedure:
PROC PRINT DATA=SAS-data-set;
RUN;
Example:
libname ia 'SAS-data-library';
proc print data=ia.empdata;
run;
Printing Selected Variables
The VAR statement enables you to
- select variables to include in the report
- define the order of the variables in the report.
General form of the VAR statement:
VAR variable(s);
Suppressing the Obs Column
The NOOBS option suppresses the row numbers on the left side of the report.
General form of the NOOBS option:
PROC PRINT DATA=SAS-data-set NOOBS;
RUN;
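The VAR statement and the NOOBS option are often used together. The sketch below builds a small stand-in for the employee data in the work library (three rows copied from the reports above, with FirstName left out so that list input can read the values); the data set name work.empdata is used here only for illustration.

```sas
/* Hypothetical subset of the employee data shown in the reports above. */
data work.empdata;
   length LastName $ 13;   /* list input would otherwise truncate to 8 */
   input EmpID $ LastName $ JobCode $ Salary;
   datalines;
0031 GOLDENBERG PILOT 50221.62
0040 WILLIAMS FLTAT 23666.12
0071 PERRY FLTAT 21957.71
;
run;

/* Print three variables, in the order listed, without the Obs column. */
proc print data=work.empdata noobs;
   var LastName JobCode Salary;
run;
```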
Subsetting Data: WHERE Statement
Use the WHERE statement to produce a listing report that displays information for pilots only.
The WHERE statement
- enables you to select observations that meet a certain condition
- can be used with most SAS procedures.
Subsetting Data: WHERE Statement
General form of the WHERE statement:
WHERE where-expression;
where-expression is a sequence of operands and operators.
Operands include
- variables
- constants.
Subsetting Data: WHERE Statement
Operators include
- comparison operators
- logical operators
- special operators
- functions.
Comparison Operators

Mnemonic  Symbol     Definition
EQ        =          equal to
NE        ^= ¬= ~=   not equal to
GT        >          greater than
LT        <          less than
GE        >=         greater than or equal to
LE        <=         less than or equal to
IN                   equal to one of a list
Comparison Operators
Examples:
where Salary>25000;
where EmpID='0082';
where Salary=.;
where LastName=' ';
where JobCode in('PILOT','FLTAT');
where JobCode in('PILOT' 'FLTAT');
Character comparisons are case-sensitive.
The IN operator allows commas or blanks to separate values.
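As a sketch of how a WHERE statement fits into a complete step, the following builds a small stand-in data set (rows copied from the reports above, FirstName omitted) and lists pilots only. The data set name work.empdata is an assumption made for this example.

```sas
/* Hypothetical subset of the employee data. */
data work.empdata;
   length LastName $ 13;
   input EmpID $ LastName $ JobCode $ Salary;
   datalines;
0031 GOLDENBERG PILOT 50221.62
0040 WILLIAMS FLTAT 23666.12
0082 MCGWIER-WATTS PILOT 96387.39
0091 SCOTT FLTAT 32278.40
;
run;

/* Pilots only. The comparison is case-sensitive, so 'pilot' would select nothing. */
proc print data=work.empdata;
   where JobCode='PILOT';
run;

/* Either job code, restricted to salaries above 25000. */
proc print data=work.empdata;
   where JobCode in ('PILOT','FLTAT') and Salary>25000;
run;
```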
Logical Operators
Logical operators include

AND (&)  if both expressions are true, then the compound expression is true
         where JobCode='FLTAT' and Salary>50000;
OR (|)   if either expression is true, then the compound expression is true
         where JobCode='PILOT' or JobCode='FLTAT';
NOT (^)  can be combined with other operators to reverse the logic of a comparison.
         where JobCode not in('PILOT','FLTAT');
Special Operators
Special operators include

BETWEEN-AND   selects observations in which the value of the variable falls within a range of values, inclusively.
              where Salary between 50000 and 70000;
CONTAINS (?)  selects observations that include the specified substring.
              where LastName ? 'LAM';
              (LAMBERT, BELLAMY, and ELAM are selected.)
Special Operators
The following are special operators:

LIKE  selects observations by comparing character values to specified patterns.
      - A percent sign (%) replaces any number of characters.
      - An underscore (_) replaces one character.
      where Code like 'E_U%';
      Selects observations where the value of Code begins with an E, followed by a single character, followed by a U, followed by any number of characters.
Special Operators
The sounds like (=*) operator selects observations that contain spelling variations of the word or words specified.
where Name=*'SMITH';
Selects names like SMYTHE and SMITT.
IS NULL or IS MISSING selects observations in which the value of the variable is missing.
where Flight is missing;
where Flight is null;
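The special operators can be tried on a small made-up data set. The names below come from the examples above; the Salary values and the data set name work.names are invented for illustration.

```sas
/* Hypothetical data: names from the slides, salaries made up. */
data work.names;
   input Name $ Salary;
   datalines;
LAMBERT 52000
BELLAMY 61000
ELAM 48000
SMYTHE 70000
SMITT .
;
run;

/* BETWEEN-AND is inclusive, so a salary of exactly 70000 qualifies. */
proc print data=work.names;
   where Salary between 50000 and 70000;
run;

/* CONTAINS: the substring LAM may appear anywhere in the value. */
proc print data=work.names;
   where Name contains 'LAM';
run;

/* LIKE: any one character, then LAM, then any number of characters. */
proc print data=work.names;
   where Name like '_LAM%';
run;

/* Sounds-like comparison against SMITH. */
proc print data=work.names;
   where Name=*'SMITH';
run;

/* Observations with a missing Salary. */
proc print data=work.names;
   where Salary is missing;
run;
```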
Requesting Column Totals
The SUM statement produces column totals.
General form of the SUM statement:
SUM variable(s);
The SUM statement also produces subtotals if you print the data in groups.
Sorting a SAS Data Set
To request subgroup totals in PROC PRINT, the observations in the data set must be grouped.
The SORT procedure
- rearranges the observations in a SAS data set
- can create a new SAS data set containing the rearranged observations
- can sort on multiple variables
- can sort in ascending (default) or descending order
- does not generate printed output
- treats missing values as the smallest possible value.
Sorting a SAS Data Set
General form of the PROC SORT step:
PROC SORT DATA=input-SAS-data-set
          <OUT=output-SAS-data-set>;
   BY <DESCENDING> by-variable(s);
RUN;
Examples:
proc sort data=ia.empdata;
   by Salary;
run;

proc sort data=ia.empdata out=work.jobsal;
   by JobCode descending Salary;
run;
Sorting a SAS Data Set

ia.empdata
EmpID  LastName    FirstName  JobCode  Salary
0031   GOLDENBERG  DESIREE    PILOT    50221.62
0040   WILLIAMS    ARLENE M.  FLTAT    23666.12
0071   PERRY       ROBERT A.  FLTAT    21957.71

PROC Step
proc sort data=ia.empdata out=work.empdata;
   by JobCode;
run;

work.empdata
EmpID  LastName    FirstName  JobCode  Salary
0040   WILLIAMS    ARLENE M.  FLTAT    23666.12
0071   PERRY       ROBERT A.  FLTAT    21957.71
0031   GOLDENBERG  DESIREE    PILOT    50221.62
Printing Subtotals and Grand Totals
Print the data set grouped by JobCode with a subtotal for the Salary column for each JobCode.
proc sort data=ia.empdata out=work.empdata;
   by JobCode;
run;
proc print data=work.empdata;
   by JobCode;
   sum Salary;
run;
Using a BY statement and a SUM statement together in a PROC PRINT step produces subtotals and grand totals.
Printing Subtotals and Grand Totals

                        The SAS System

------------------------ JobCode=FLTAT -------------------------

       Emp
Obs    ID      LastName         FirstName       Salary

  1    0040    WILLIAMS         ARLENE M.     23666.12
  2    0071    PERRY            ROBERT A.     21957.71
  3    0091    SCOTT            HARVEY F.     32278.40
  4    0106    THACKER          DAVID S.      24161.14
-------                                      ---------
JobCode                                      102063.37

------------------------ JobCode=PILOT -------------------------

       Emp
Obs    ID      LastName         FirstName       Salary

  5    0031    GOLDENBERG       DESIREE       50221.62
  6    0082    MCGWIER-WATTS    CHRISTINA     96387.39
  7    0355    BELL             THOMAS B.     59803.16
  8    0366    GLENN            MARTHA S.    120202.38
-------                                      ---------
JobCode                                      326614.55
                                             =========
                                             428677.92
Page Breaks
Use the PAGEBY statement to put each subgroup on a separate page.
General form of the PAGEBY statement:
PAGEBY by-variable;
The PAGEBY statement must be used with a BY statement.
proc print data=work.empdata;
   by JobCode;
   pageby JobCode;
   sum Salary;
run;
c04s2d2
Identifying Observations
The ID statement enables you to
- suppress the Obs column in the report
- specify which variable(s) should replace the Obs column.
General form of the ID statement:
ID variable(s);
Creating a Default List Report

ia.empdata
EmpID  LastName    FirstName  JobCode  Salary
0031   GOLDENBERG  DESIREE    PILOT    50221.62
0040   WILLIAMS    ARLENE M.  FLTAT    23666.12
0071   PERRY       ROBERT A.  FLTAT    21957.71

PROC Step
proc print data=ia.empdata;
   id JobCode;
   var EmpID Salary;
run;

      The SAS System

Job      Emp
Code     ID        Salary

PILOT    0031    50221.62
FLTAT    0040    23666.12
FLTAT    0071    21957.71

The ID statement suppresses the Obs column.
c04s3d1
Defining Titles and Footnotes
You use titles and footnotes to enhance reports.
General form of the TITLE statement:
TITLEn 'text';
General form of the FOOTNOTE statement:
FOOTNOTEn 'text';
Examples:
title1 'Flight Crew Employee Listing';
footnote2 'Employee Review';
Defining Titles and Footnotes
Features of titles:
- Titles appear at the top of the page.
- The default title is The SAS System.
- The value of n can be from 1 to 10.
- An unnumbered TITLE is equivalent to TITLE1.
- Titles remain in effect until they are changed, cancelled, or you end your SAS session.
- The null TITLE statement, title;, cancels all titles.
Defining Titles and Footnotes
Features of footnotes:
- Footnotes appear at the bottom of the page.
- No footnote is printed unless one is specified.
- The value of n can be from 1 to 10.
- An unnumbered FOOTNOTE is equivalent to FOOTNOTE1.
- Footnotes remain in effect until they are changed, cancelled, or you end your SAS session.
- The null FOOTNOTE statement, footnote;, cancels all footnotes.
Changing Titles and Footnotes
TITLEn or FOOTNOTEn
- replaces a previous title or footnote with the same number
- cancels all titles or footnotes with higher numbers.
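The replace-and-cancel behavior can be seen with a short sketch. It assumes the sample data set sashelp.class that ships with SAS; the title text is made up.

```sas
title1 'The First Line';
title2 'The Second Line';

/* Both title lines appear on this report. */
proc print data=sashelp.class(obs=3);
run;

/* Redefining TITLE1 replaces line 1 and cancels TITLE2. */
title1 'The Top Line';

/* Only the new TITLE1 appears on this report. */
proc print data=sashelp.class(obs=3);
run;

/* The null TITLE statement cancels all titles. */
title;
```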
Assigning Column Labels
General form of the LABEL statement:
LABEL variable='label'
      variable='label';
'label' specifies a label up to 256 characters.
Labels are used
- to replace variable names in SAS output
- automatically by many procedures
- by the PRINT procedure when the LABEL or SPLIT= option is specified in the PROC PRINT statement.
Assigning Column Labels

ia.empdata
EmpID  LastName    FirstName  JobCode  Salary
0031   GOLDENBERG  DESIREE    PILOT    50221.62
0040   WILLIAMS    ARLENE M.  FLTAT    23666.12
0071   PERRY       ROBERT A.  FLTAT    21957.71

PROC Step
proc print data=ia.empdata label;
   label LastName='Last Name'
         FirstName='First Name'
         Salary='Annual Salary';
   title1 'Salary Report';
run;

                      Salary Report

       Emp                     First        Job        Annual
Obs    ID      Last Name       Name         Code       Salary

  1    0031    GOLDENBERG      DESIREE      PILOT    50221.62
  2    0040    WILLIAMS        ARLENE M.    FLTAT    23666.12
  3    0071    PERRY           ROBERT A.    FLTAT    21957.71

c05s1d1
Using SAS System Options
You can use SAS system options to change the appearance of a report.
General form of the OPTIONS statement:
OPTIONS option . . . ;
The OPTIONS statement is not usually included in a PROC or DATA step.
Using SAS System Options
Selected SAS system options:

DATE (default)             specifies to print the date and time the SAS session began at the top of each page of the SAS output.
NODATE                     specifies not to print the date and time the SAS session began.
LINESIZE=width (LS=width)  specifies the line size for the SAS log and SAS output.
PAGESIZE=n (PS=n)          specifies the number of lines (n) that can be printed per page of SAS output.
Using SAS System Options
Selected SAS system options:

NUMBER (default)  specifies that page numbers be printed on the first line of each page of output.
NONUMBER          specifies that page numbers not be printed.
PAGENO=n          specifies a beginning page number (n) for the next page of SAS output.

Example:
options nodate nonumber ls=72;
Generating HTML Files
The ODS HTML statement opens, closes, and manages the HTML destination.
General form of the ODS HTML statement:
ODS HTML FILE='HTML-file-specification' <options>;
SAS code that generates output
ODS HTML CLOSE;
Generating HTML Files
Output is directed to the specified HTML file until you
- close the HTML destination
- specify another destination file.
ods html file='…';
proc print…
proc means…
proc freq…
ods html close;
[Figure: the reports from all three procedures are written to one HTML file]
Creating an HTML Report
1. Open an HTML destination for the listing report.
2. Generate the report.
3. Close the HTML destination.
ods html file='c05s3d1.html';
proc print data=ia.empdata label noobs;
   label Salary='Annual Salary';
   format Salary money. Jobcode $codefmt.;
   title1 'Salary Report';
run;
ods html close;
c05s3d1
Summary
- PROC PRINT is used to display a listing of SAS data.
- PROC PRINT can be customized to display only particular variables from a SAS data set and to create subtotals and totals, labels, page breaks, etc.
- OPTIONS statements and TITLE statements can be used to further customize the appearance of output from PROC PRINT and other procedures.
Formats in SAS
Objectives
- Display formatted values using SAS formats in a list report.
- Create user-defined formats using the FORMAT procedure.
- Apply user-defined formats to variables in a list report.
Using SAS Formats
Enhance the readability of reports by formatting the data values.

                        Salary Report

       Emp     Last             First        Job           Annual
Obs    ID      Name             Name         Code          Salary

  1    0031    GOLDENBERG       DESIREE      PILOT     $50,221.62
  2    0040    WILLIAMS         ARLENE M.    FLTAT     $23,666.12
  3    0071    PERRY            ROBERT A.    FLTAT     $21,957.71
  4    0082    MCGWIER-WATTS    CHRISTINA    PILOT     $96,387.39
  5    0091    SCOTT            HARVEY F.    FLTAT     $32,278.40
  6    0106    THACKER          DAVID S.     FLTAT     $24,161.14
  7    0355    BELL             THOMAS B.    PILOT     $59,803.16
  8    0366    GLENN            MARTHA S.    PILOT    $120,202.38
Using User-defined Formats
Create custom formats to recode data values in a report.

                 Salary Report in Categories

Emp     Last             First                            Annual
ID      Name             Name         JobCode             Salary

0031    GOLDENBERG       DESIREE      Pilot               More than 50,000
0040    WILLIAMS         ARLENE M.    Flight Attendant    Less than 25,000
0071    PERRY            ROBERT A.    Flight Attendant    Less than 25,000
0082    MCGWIER-WATTS    CHRISTINA    Pilot               More than 50,000
0091    SCOTT            HARVEY F.    Flight Attendant    25,000 to 50,000
0106    THACKER          DAVID S.     Flight Attendant    Less than 25,000
0355    BELL             THOMAS B.    Pilot               More than 50,000
0366    GLENN            MARTHA S.    Pilot               More than 50,000
Formatting Data Values
You can enhance reports by using SAS formats to format data values.
Values in the SAS data set are not changed.
[Figure: SAS data set, then format, then report]
Formatting Data Values
To apply a format to a specific SAS variable, use the FORMAT statement.
General form of the FORMAT statement:
FORMAT variable(s) format;
Example:
proc print data=ia.empdata;
   format Salary dollar11.2;
run;
What Is a SAS Format?
A format is an instruction that SAS uses to write data values.
SAS formats have the following form:
<$>format<w>.<d>
- $ indicates a character format
- format is the format name
- w is the total width (including decimal places and special characters)
- . is a required delimiter
- d is the number of decimal places.
SAS Formats
Selected SAS formats:

Format     Description                          Example
w.d        standard numeric format              8.2         Width=8, 2 decimal places: 12234.21
$w.        standard character format            $5.         Width=5: KATHY
COMMAw.d   commas in a number                   COMMA9.2    Width=9, 2 decimal places: 12,234.21
DOLLARw.d  dollar signs and commas in a number  DOLLAR10.2  Width=10, 2 decimal places: $12,234.21
SAS Formats
If you do not specify a format width large enough to accommodate a numeric value, the displayed value is automatically adjusted to fit into the width.

Stored Value  Format      Displayed Value
27134.2864    COMMA12.2   27,134.29
27134.2864    12.2        27134.29
27134.2864    DOLLAR12.2  $27,134.29
27134.2864    DOLLAR9.2   $27134.29
27134.2864    DOLLAR8.2   27134.29
27134.2864    DOLLAR5.2   27134
27134.2864    DOLLAR4.2   27E3
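The width adjustment can be checked directly by writing one value to the log with each format. This is only a sketch; the variable name Value is arbitrary, and the results should match the Displayed Value column of the table for this slide.

```sas
data _null_;
   Value = 27134.2864;
   /* Each PUT statement writes the same stored value with a different format. */
   put Value comma12.2;
   put Value 12.2;
   put Value dollar12.2;
   put Value dollar9.2;
   put Value dollar8.2;
   put Value dollar5.2;
   put Value dollar4.2;
   /* The stored value itself is unchanged by any of the formats. */
   put Value=;
run;
```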
Formatting Data Values

ia.empdata
EmpID  LastName    FirstName  JobCode  Salary
0031   GOLDENBERG  DESIREE    PILOT    50221.62
0040   WILLIAMS    ARLENE M.  FLTAT    23666.12
0071   PERRY       ROBERT A.  FLTAT    21957.71

PROC Step
proc print data=ia.empdata split=' ';
   label LastName='Last Name'
         FirstName='First Name'
         Salary='Annual Salary';
   format Salary dollar11.2;
   title1 'Salary Report';
run;

                      Salary Report

       Emp     Last          First        Job          Annual
Obs    ID      Name          Name         Code         Salary

  1    0031    GOLDENBERG    DESIREE      PILOT    $50,221.62
  2    0040    WILLIAMS      ARLENE M.    FLTAT    $23,666.12
  3    0071    PERRY         ROBERT A.    FLTAT    $21,957.71

c05s2d1
SAS Formats
SAS date formats display SAS date values in standard date forms.
Recall that a SAS date is stored as the number of days between 01JAN1960 and the specified date.
Selected SAS date formats:

MMDDYYw.
Format      Displayed Value
MMDDYY6.    101601
MMDDYY8.    10/16/01
MMDDYY10.   10/16/2001

DATEw.
Format      Displayed Value
DATE7.      16OCT01
DATE9.      16OCT2001
SAS Formats
Examples:

Stored Value  Format      Displayed Value
0             MMDDYY8.    01/01/60
0             MMDDYY10.   01/01/1960
1             DATE9.      02JAN1960
-1            WORDDATE.   December 31, 1959
365           DDMMYY10.   31/12/1960
366           WEEKDATE.   Sunday, January 1, 1961
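A short sketch for experimenting with date formats: it writes a handful of stored values to the log using several of the formats listed above. The variable name Days is arbitrary.

```sas
data _null_;
   /* Each value is a number of days relative to 01JAN1960. */
   do Days = 0, 1, -1, 365, 366;
      put Days= @12 Days mmddyy10. +2 Days date9. +2 Days weekdate.;
   end;
run;
```

Because 1960 was a leap year, the stored value 365 is still in 1960 and 366 is the first day of 1961, which agrees with the table.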
Creating User-defined Formats
SAS also provides the FORMAT procedure, which enables you to define custom formats.
To create and use your own formats,
1. use the FORMAT procedure to create the format
2. apply the format to specific variable(s) by using a FORMAT statement.
Creating User-defined Formats
General form of a PROC FORMAT step:
PROC FORMAT;
   VALUE format-name range1='label'
                     range2='label'
                     . . . ;
RUN;
Creating User-defined Formats
Format-name
- names the format you are creating
- cannot be more than 8 characters
- for character values, must have a dollar sign ($) as the first character, a letter or underscore as the second character, and no more than 6 additional letters, numbers, and underscores
- for numeric values, must have a letter or underscore as the first character and no more than 7 additional letters, numbers, and underscores
- cannot end in a number
- cannot be the name of a SAS format
- does not end with a period in the VALUE statement.
Creating User-defined Formats
Labels
- can be up to 32,767 characters in length
- are typically enclosed in quotes, although it is not required.
Range(s) can be
- single values
- ranges of values.
Creating User-defined Formats
Assign labels to single numbers.
proc format;
   value gender 1='Female'
                2='Male'
                other='Miscoded';
run;
- gender is the numeric format name
- 1 and 2 are numeric data values
- 'Female' and 'Male' are formatted values
- other is a keyword.
Creating User-defined Formats
Assign labels to ranges of numbers.
proc format;
   value boardfmt low-49='Below'
                  50-99='Average'
                  100-high='Above Average';
run;
- low-49, 50-99, and 100-high are numeric data ranges
- low and high are keywords.
Creating User-defined Formats
Assign labels to character values and ranges of character values.
proc format;
   value $grade 'A'='Good'
                'B'-'D'='Fair'
                'F'='Poor'
                'I','U'='See Instructor'
                other='Miscoded';
run;
- $grade is the character format name
- 'A', 'F', 'I', and 'U' are discrete character values
- 'B'-'D' is a character value range
- other is a keyword.
Creating User-defined Formats
Step 1: Create the format ($codefmt).
proc format;
   value $codefmt 'FLTAT'='Flight Attendant'
                  'PILOT'='Pilot';
run;
Step 2: Apply the format.
proc print data=ia.empdata;
   format JobCode $codefmt.;
run;
Creating User-defined Formats
Step 1: Create the format (money).
proc format;
   value money low-<25000 ='Less than 25,000'
               25000-50000='25,000 to 50,000'
               50000<-high='More than 50,000';
run;
Step 2: Apply the format.
proc print data=ia.empdata;
   format Salary money.;
run;
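The range notation in the money format decides which label the boundary values receive: low-<25000 excludes 25000, and 50000<-high excludes 50000. A minimal sketch for checking this uses the PUT function on a few test values; the test values and the variable name Category are made up.

```sas
proc format;
   value money low-<25000 ='Less than 25,000'
               25000-50000='25,000 to 50,000'
               50000<-high='More than 50,000';
run;

data _null_;
   /* Values chosen to sit on and next to the range boundaries. */
   do Salary = 24999.99, 25000, 50000, 50000.01;
      Category = put(Salary, money.);   /* formatted value as text */
      put Salary= Category=;
   end;
run;
```

In the log, 25000 and 50000 should both fall in the middle category, while 24999.99 and 50000.01 fall in the outer ones.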
Creating User-defined Formats
You can use multiple VALUE statements in a single PROC FORMAT step.
proc format;
   value $codefmt 'FLTAT'='Flight Attendant'
                  'PILOT'='Pilot';
   value money low-<25000 ='Less than 25,000'
               25000-50000='25,000 to 50,000'
               50000<-high='More than 50,000';
run;
c05s2d2
Applying User-defined Formats
proc print data=ia.empdata split=' ' noobs;
   label LastName='Last Name'
         FirstName='First Name'
         Salary='Annual Salary';
   format Jobcode $codefmt. Salary money.;
   title1 'Salary Report in Categories';
run;

                 Salary Report in Categories

Emp     Last             First                            Annual
ID      Name             Name         JobCode             Salary

0031    GOLDENBERG       DESIREE      Pilot               More than 50,000
0040    WILLIAMS         ARLENE M.    Flight Attendant    Less than 25,000
0071    PERRY            ROBERT A.    Flight Attendant    Less than 25,000
0082    MCGWIER-WATTS    CHRISTINA    Pilot               More than 50,000
0091    SCOTT            HARVEY F.    Flight Attendant    25,000 to 50,000
0106    THACKER          DAVID S.     Flight Attendant    Less than 25,000
0355    BELL             THOMAS B.    Pilot               More than 50,000
0366    GLENN            MARTHA S.    Pilot               More than 50,000
Summary
- The FORMAT statement can be used to assign a particular format to the display of a variable.
- There are many SAS numeric and character formats available to customize the appearance of variables.
- PROC FORMAT can be used to create custom numeric and character formats.
Reading Raw Data into SAS
Section 2-4a
Reading Internal Data: List Input
Objectives
- Create a temporary SAS data set from data contained within the SAS DATA step using the DATALINES statement.
- Read space-delimited data using list input.
Reading Internal Data in LIST Format
The data must be separated by one or more spaces, and the character data must contain no spaces. For example, the height (in inches), weight (in pounds), and gender (M/F) of ten different individuals:
66 154 M
72 188 M
69 167 M
74 225 M
64 138 F
65 180 M
78 280 M
54 132 F
66 172 F
70 176 M
Creating a SAS Data Set
In order to create a SAS data set from internal data:
1. Start a DATA step and name the SAS data set being created (DATA statement).
2. List the variables in the order to be read, indicating character data with a $ (INPUT statement).
3. Indicate that the data is to follow (DATALINES statement).
data SAS-data-set-name;
   input input-specifications;
   datalines;
*** put data here ***
;
run;
Reading Data Fields
List form of the INPUT statement:
INPUT input-specifications;
input-specifications
- names the SAS variables
- identifies the variables as character or numeric
- lists the variables in order.
Reading Data Using List Input
List input is appropriate for reading
- data separated by single or multiple spaces
- standard character and numeric data
- character data that has no spaces.
General form of a list INPUT statement:
INPUT variable1 <$> variable2 <$> … ;
Create a Temporary SAS Data Set
Store the one data set in the work library.
data work.one; /* or just "data one;" */
   input height weight gender $;
   datalines;
66 154 M
72 188 M
69 167 M
74 225 M
64 138 F
65 180 M
78 280 M
54 132 F
66 172 F
70 176 M
;
run;
Section 2-4b
Reading Raw Data Files: Column Input
Objectives
- Create a temporary SAS data set from a raw data file.
- Create a permanent SAS data set from a raw data file.
- Explain how the DATA step processes data.
- Read standard data using column input.
Reading Raw Data Files
Data for flights from New York to Dallas (DFW) and Los Angeles (LAX) are stored in a raw data file. Create a SAS data set from the raw data.

Description              Column
Flight Number            1- 3
Date                     4-11
Destination              12-14
First Class Passengers   15-17
Economy Passengers       18-20

         1    1    2
1---5----0----5----0
43912/11/00LAX 20137
92112/11/00DFW 20131
11412/12/00LAX 15170
98212/12/00dfw  5 85
43912/13/00LAX 14196
98212/13/00DFW 15116
43112/14/00LaX 17166
98212/14/00DFW  7 88
11412/15/00LAX 187
98212/15/00DFW 14 31
Creating a SAS Data Set
In order to create a SAS data set from a raw data file, you must
1. start a DATA step and name the SAS data set being created (DATA statement)
2. identify the location of the raw data file to read (INFILE statement)
3. describe how to read the data fields from the raw data file (INPUT statement).

DATA Step
data SAS-data-set-name;
   infile 'raw-data-filename';
   input input-specifications;
run;

Raw Data File
         1    1    2
1---5----0----5----0
43912/11/00LAX 20137
92112/11/00DFW 20131
11412/12/00LAX 15170

SAS Data Set
Flight  Date      Dest  First Class  Economy
439     12/11/00  LAX            20      137
921     12/11/00  DFW            20      131
114     12/12/00  LAX            15      170
Creating a SAS Data Set
General form of the DATA statement:
DATA libref.SAS-data-set(s);
Example: This DATA statement creates a temporary SAS data set named dfwlax:
data work.dfwlax;
Example: This DATA statement creates a permanent SAS data set named dfwlax:
libname ia 'SAS-data-library';
data ia.dfwlax;
Pointing to a Raw Data File
General form of the INFILE statement:
INFILE 'filename' <options>;
Example (Windows):
infile 'c:\workshop\winsas\prog1\dfwlax.dat';
The PAD option in the INFILE statement is useful for reading variable-length records typically found in Windows environments.
The FIRSTOBS= option in the INFILE statement is useful for skipping the first few records in a raw data file.
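A minimal sketch of the two INFILE options, written so that it does not depend on any file on disk: a temporary fileref (here called rawdata, a made-up name) stands in for a real raw data file, and the header record is invented for the example.

```sas
/* Assumption: a temporary file stands in for a real raw data file. */
filename rawdata temp;

/* Write one header record followed by two data records. */
data _null_;
   file rawdata;
   put 'FltDate     Dst';
   put '43912/11/00LAX';
   put '92112/11/00DFW';
run;

/* FIRSTOBS=2 skips the header; PAD pads short records with blanks. */
data work.flights;
   infile rawdata firstobs=2 pad;
   input Flight $ 1-3 Date $ 4-11 Dest $ 12-14;
run;

proc print data=work.flights;
run;
```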
Reading Data Fields
General form of the INPUT statement:
INPUT input-specifications;
input-specifications
- names the SAS variables
- identifies the variables as character or numeric
- specifies the locations of the fields in the raw data
- can be specified as column, formatted, list, or named input.
Reading Data Using Column Input
Column input is appropriate for reading
- data in fixed columns
- standard character and numeric data.
General form of a column INPUT statement:
INPUT variable <$> startcol-endcol . . . ;
The Raw Data

Description              Column
Flight Number            1- 3
Date                     4-11
Destination              12-14
First Class Passengers   15-17
Economy Passengers       18-20

         1    1    2
1---5----0----5----0
43912/11/00LAX 20137
92112/11/00DFW 20131
11412/12/00LAX 15170
98212/12/00dfw  5 85
43912/13/00LAX 14196
98212/13/00DFW 15116
43112/14/00LaX 17166
98212/14/00DFW  7 88
11412/15/00LAX 187
98212/15/00DFW 14 31
Create Temporary SAS Data Sets
Store the dfwlax data set in the work library.
data work.dfwlax;
   infile 'raw-data-file';
   input Flight $ 1-3 Date $ 4-11
         Dest $ 12-14 FirstClass 15-17
         Economy 18-20;
run;

NOTE: The data set WORK.DFWLAX has 10 observations and 5 variables.
Create Permanent SAS Data Sets
Alter the previous DATA step to permanently store the dfwlax data set.
libname ia 'SAS-data-library';
data ia.dfwlax;
   infile 'raw-data-file';
   input Flight $ 1-3 Date $ 4-11
         Dest $ 12-14 FirstClass 15-17
         Economy 18-20;
run;

NOTE: The data set IA.DFWLAX has 10 observations and 5 variables.
DATA Step Execution: Summary
1. Compile the program.
2. Initialize variables to missing.
3. Execute the INPUT statement.
4. End of file? If yes, go on to the next step. If no, continue.
5. Execute other statements.
6. Output to the SAS data set, then return to step 2.
Section 2-4c
Reading Raw Data Files: Formatted Input
Objectives
- Read standard and nonstandard character and numeric data using formatted input.
- Read date values and convert them to SAS date values.
Reading Data Using Formatted Input
Formatted input is appropriate for reading
- data in fixed columns
- standard and nonstandard character and numeric data
- calendar values to be converted to SAS date values.
108
Reading Data Using Formatted Input
General form of the INPUT statement with formatted input:

INPUT pointer-control variable informat . . . ;

Formatted input is used to read data values by
- moving the input pointer to the starting position of the field
- specifying a variable name
- specifying an informat.
109
Reading Data Using Formatted Input
Pointer controls:
- @n moves the pointer to column n.
- +n moves the pointer n positions.

An informat specifies
- the width of the input field
- how to read the data values that are stored in the field.
110
What Is a SAS Informat?
An informat is an instruction that SAS uses to read data values.
SAS informats have the following form:

<$>informat-namew.<d>

$                Indicates a character informat
informat-name    Informat name
w                Total width of the field to read
.                Required delimiter
d                Number of decimal places
111
Selected Informats
8. or 8.0 reads 8 columns of numeric data.
8.2 reads 8 columns of numeric data and may insert a decimal point in the value.

Raw Data Value    Informat    SAS Data Value
1234567           8.0         1234567
1234567           8.2         12345.67
1234.567          8.2         1234.567
1234.567          8.0         1234.567
112
Selected Informats
$8. reads 8 columns of character data and removes leading blanks.
$CHAR8. reads 8 columns of character data and preserves leading blanks.

Raw Data Value    Informat    SAS Data Value
"   JAMES"        $8.         "JAMES   "
"   JAMES"        $CHAR8.     "   JAMES"
113
Selected Informats
COMMA7. reads 7 columns of numeric data and removes selected nonnumeric characters such as dollar signs and commas.
MMDDYY8. reads dates of the form 10/29/01.

Raw Data Value    Informat    SAS Data Value
$12,567           COMMA7.0    12567
10/29/01          MMDDYY8.    15277
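The informats on these slides can be compared on a single in-stream record. The sketch below is illustrative only: the variable names (Name1, Name2, Amount) and the record layout are assumptions, and the data line deliberately begins with three blanks.

```sas
/* Sketch: the same 8 columns read with $8. and $CHAR8., */
/* followed by a COMMA7. field in columns 10-16.          */
data _null_;
   input @1  Name1  $8.        /* leading blanks removed   */
         @1  Name2  $char8.    /* leading blanks preserved */
         @10 Amount comma7.;   /* dollar sign and comma removed */
   put '[' Name1 $char8. ']';  /* brackets make blanks visible in the log */
   put '[' Name2 $char8. ']';
   put Amount=;
   datalines;
   JAMES $12,567
;
run;
```

The brackets in the log show where each character value starts and ends, which is the only visible difference between the two character informats.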
114
Working with Date Values
Date values that are stored as SAS dates are special numeric values.
A SAS date value is interpreted as the number of days between January 1, 1960, and a specific date.

Date (DATE9.)       01JAN1959     01JAN1960     01JAN1961
Date (MMDDYY10.)    01/01/1959    01/01/1960    01/01/1961
SAS date value      -365          0             366

An informat converts a written date into a SAS date value; a format displays a SAS date value as a date.
115
Convert Dates to SAS Date Values
SAS uses date informats to read and convert dates to SAS date values.

Examples:

Raw Data Value    Informat     Converted Value
10/29/2001        MMDDYY10.    15277
10/29/01          MMDDYY8.     15277
29OCT2001         DATE9.       15277
29/10/2001        DDMMYY10.    15277

The converted value is the number of days between 01JAN1960 and 29OCT2001.
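The conversions in the table above can be checked with a short DATA step. This is a sketch under assumptions: the variable names d1-d4 and the column positions are invented for the example, and the two-digit year in 10/29/01 is taken to mean 2001, which depends on the YEARCUTOFF= system option in effect.

```sas
/* Sketch: read one date written four ways and print the stored values. */
data _null_;
   input @1  d1 mmddyy10.
         @12 d2 mmddyy8.
         @21 d3 date9.
         @31 d4 ddmmyy10.;
   put d1= d2= d3= d4=;    /* unformatted SAS date values */
   put d1= date9.;         /* the same value displayed with a format */
   datalines;
10/29/2001 10/29/01 29OCT2001 29/10/2001
;
run;
```

If the informats behave as the table describes, all four variables hold the same number of days since 01JAN1960.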
116
Reading Data: Formatted Input

Raw Data File
         1    1    2
1---5----0----5----0
43912/11/00LAX 20137
92112/11/00DFW 20131
11412/12/00LAX 15170

data work.dfwlax;
   infile 'raw-data-file';
   input @1 Flight $3. @4 Date mmddyy8.
         @12 Dest $3. @15 FirstClass 3.
         @18 Economy 3.;
run;
117
Reading Data: Formatted Input

proc print data=work.dfwlax;
run;

           The SAS System

                          First
Obs  Flight   Date  Dest  Class  Economy
  1  439     14955  LAX      20      137
  2  921     14955  DFW      20      131
  3  114     14956  LAX      15      170
  4  982     14956  dfw       5       85
  5  439     14957  LAX      14      196
  6  982     14957  DFW      15      116
  7  431     14958  LaX      17      166
  8  982     14958  DFW       7       88
  9  114     14959  LAX       .      187
 10  982     14959  DFW      14       31

The Date column contains SAS date values.
118
Reading Data: Formatted Input

proc print data=work.dfwlax;
   format Date date9.;
run;

             The SAS System

                              First
Obs  Flight  Date       Dest  Class  Economy
  1  439     11DEC2000  LAX      20      137
  2  921     11DEC2000  DFW      20      131
  3  114     12DEC2000  LAX      15      170
  4  982     12DEC2000  dfw       5       85
  5  439     13DEC2000  LAX      14      196
  6  982     13DEC2000  DFW      15      116
  7  431     14DEC2000  LaX      17      166
  8  982     14DEC2000  DFW       7       88
  9  114     15DEC2000  LAX       .      187
 10  982     15DEC2000  DFW      14       31

The Date column now shows formatted SAS date values.
119
What Are Data Errors?
SAS detects data errors when
- the INPUT statement encounters invalid data in a field
- illegal arguments are used in functions
- impossible mathematical operations are requested.
120
Examining Data Errors
When SAS encounters a data error,
1. a note that describes the error is printed in the SAS log
2. the input record being read is displayed in the SAS log (contents of the input buffer)
3. the values in the SAS observation being created are displayed in the SAS log (contents of the PDV)
4. a missing value is assigned to the appropriate SAS variable
5. execution continues.
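A data error of the first kind is easy to provoke. The sketch below assumes a hypothetical data set name (work.err_demo) and two invented records, the second of which has a letter in a numeric field.

```sas
/* Sketch: the second record has invalid data in columns 4-6. */
data work.err_demo;                 /* hypothetical name */
   input @1 Flight $3. @4 FirstClass 3.;
   datalines;
439 20
921 2x
;
run;

/* FirstClass should be missing for flight 921. */
proc print data=work.err_demo;
run;
```

After running it, the log should contain the note, the input record, and the PDV contents listed in steps 1 to 3, and the step still completes with both observations written.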
Section 2-4d
Assigning Variable Attributes
122
Objectives
- Assign permanent attributes to SAS variables.
- Override permanent variable attributes.
123
Default Variable Attributes
When a variable is created in a DATA step,
- the name, type, and length of the variable are automatically assigned
- remaining attributes such as label and format are not automatically assigned.

When the variable is used in a later step,
- the name is displayed for identification purposes
- its value is displayed using a system-determined format.
124
Default Variable Attributes
Create the ia.dfwlax data set.

libname ia 'SAS-data-library';
data ia.dfwlax;
   infile 'raw-data-file';
   input @1 Flight $3. @4 Date mmddyy8.
         @12 Dest $3. @15 FirstClass 3.
         @18 Economy 3.;
run;
125
Default Variable Attributes
Examine the descriptor portion of the ia.dfwlax data set.

proc contents data=ia.dfwlax;
run;

Partial Output
-----Alphabetic List of Variables and Attributes-----

#    Variable      Type    Len    Pos
-------------------------------------
2    Date          Num       8      0
3    Dest          Char      3     27
5    Economy       Num       8     16
4    FirstClass    Num       8      8
1    Flight        Char      3     24
126
Specifying Variable Attributes
Use LABEL and FORMAT statements in the
- PROC step to temporarily assign the attributes (for the duration of the step only)
- DATA step to permanently assign the attributes (stored in the data set descriptor portion).
127
Temporary Variable Attributes
Use LABEL and FORMAT statements in a PROC step to temporarily assign attributes.

proc print data=ia.dfwlax label;
   format Date mmddyy10.;
   label Dest='Destination'
         FirstClass='First Class Passengers'
         Economy='Economy Passengers';
run;
128
Temporary Variable Attributes

                      The SAS System

                                      First Class     Economy
Obs  Flight  Date        Destination   Passengers  Passengers
  1  439     12/11/2000  LAX                   20         137
  2  921     12/11/2000  DFW                   20         131
  3  114     12/12/2000  LAX                   15         170
  4  982     12/12/2000  dfw                    5          85
  5  439     12/13/2000  LAX                   14         196
  6  982     12/13/2000  DFW                   15         116
  7  431     12/14/2000  LaX                   17         166
  8  982     12/14/2000  DFW                    7          88
  9  114     12/15/2000  LAX                    .         187
 10  982     12/15/2000  DFW                   14          31
129
Permanent Variable Attributes
Assign labels and formats in the DATA step.

libname ia 'SAS-data-library';
data ia.dfwlax;
   infile 'raw-data-file';
   input @1 Flight $3. @4 Date mmddyy8.
         @12 Dest $3. @15 FirstClass 3.
         @18 Economy 3.;
   format Date mmddyy10.;
   label Dest='Destination'
         FirstClass='First Class Passengers'
         Economy='Economy Passengers';
run;
130
Permanent Variable Attributes
Examine the descriptor portion of the ia.dfwlax data set.

proc contents data=ia.dfwlax;
run;

Partial Output
-----Alphabetic List of Variables and Attributes-----

#    Variable      Type    Len    Pos    Format       Label
----------------------------------------------------------------
2    Date          Num       8      0    MMDDYY10.
3    Dest          Char      3     27                 Destination
5    Economy       Num       8     16                 Economy Passengers
4    FirstClass    Num       8      8                 First Class Passengers
1    Flight        Char      3     24
131
Permanent Variable Attributes

proc print data=ia.dfwlax label;
run;

                      The SAS System

                                      First Class     Economy
Obs  Flight  Date        Destination   Passengers  Passengers
  1  439     12/11/2000  LAX                   20         137
  2  921     12/11/2000  DFW                   20         131
  3  114     12/12/2000  LAX                   15         170
  4  982     12/12/2000  dfw                    5          85
  5  439     12/13/2000  LAX                   14         196
  6  982     12/13/2000  DFW                   15         116
  7  431     12/14/2000  LaX                   17         166
  8  982     12/14/2000  DFW                    7          88
  9  114     12/15/2000  LAX                    .         187
 10  982     12/15/2000  DFW                   14          31
132
Override Permanent Attributes
Use a FORMAT statement in a PROC step to temporarily override the format stored in the data set descriptor.

proc print data=ia.dfwlax label;
   format Date date9.;
run;
133
Override Permanent Attributes

                     The SAS System

                                     First Class     Economy
Obs  Flight  Date       Destination   Passengers  Passengers
  1  439     11DEC2000  LAX                   20         137
  2  921     11DEC2000  DFW                   20         131
  3  114     12DEC2000  LAX                   15         170
  4  982     12DEC2000  dfw                    5          85
  5  439     13DEC2000  LAX                   14         196
  6  982     13DEC2000  DFW                   15         116
  7  431     14DEC2000  LaX                   17         166
  8  982     14DEC2000  DFW                    7          88
  9  114     15DEC2000  LAX                    .         187
 10  982     15DEC2000  DFW                   14          31
Section 2-4e
Changing Variable Attributes
135
Objectives
- Use features in the windowing environment to change variable attributes.
- Use programming statements to change variable attributes.
136
The DATASETS Procedure
You can use the DATASETS procedure to modify a variable's
- name
- label
- format
- informat.
137
The DATASETS Procedure
General form of PROC DATASETS for changing variable attributes:

PROC DATASETS LIBRARY=libref;
   MODIFY SAS-data-set;
      RENAME old-name-1=new-name-1
             <. . . old-name-n=new-name-n>;
      LABEL variable-1='label-1'
            <. . . variable-n='label-n'>;
      FORMAT variable-list-1 format-1
             <. . . variable-list-n format-n>;
      INFORMAT variable-list-1 informat-1
               <. . . variable-list-n informat-n>;
RUN;
138
Data Set Contents
Use the DATASETS procedure to change the name of the variable Dest to Destination.
Look at the attributes of the variables in the ia.dfwlax data set.

proc contents data=ia.dfwlax;
run;

-----Alphabetic List of Variables and Attributes-----

#    Variable      Type    Len    Pos
-------------------------------------
2    Date          Char      8     19
3    Dest          Char      3     27
5    Economy       Num       8      8
4    FirstClass    Num       8      0
1    Flight        Char      3     16
139
The DATASETS Procedure
Rename the variable Dest to Destination.

proc datasets library=ia;
   modify dfwlax;
   rename Dest=Destination;
run;
140
Data Set Contents
Look at the attributes of the variables in the ia.dfwlax data set after running PROC DATASETS.

proc contents data=ia.dfwlax;
run;

-----Alphabetic List of Variables and Attributes-----

#    Variable       Type    Len    Pos
--------------------------------------
2    Date           Char      8     19
3    Destination    Char      3     27
5    Economy        Num       8      8
4    FirstClass     Num       8      0
1    Flight         Char      3     16
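The general form also allows LABEL and FORMAT statements inside a MODIFY group. The following is a hedged sketch rather than part of the course program: it assumes the libref ia is already assigned and that ia.dfwlax exists as listed above, and the label text and the 4. format are invented for illustration.

```sas
/* Sketch: change a label and a format without rereading the raw data. */
proc datasets library=ia nolist;          /* NOLIST suppresses the directory listing */
   modify dfwlax;
   label Flight='Flight Number';          /* hypothetical label text */
   format FirstClass Economy 4.;          /* numeric variables only */
run;
quit;                                     /* PROC DATASETS stays active until QUIT */
```

Only the descriptor portion of the data set is updated, so the data values themselves are not rewritten.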
Section 2-4f
The Import Wizard and Proc Import
142
Objectives
- Create a SAS data set from an Excel spreadsheet using the Import Wizard.
- Create a SAS data set from an Excel spreadsheet using PROC IMPORT.
143
Business Task
The flight data for Dallas and Los Angeles are in an Excel spreadsheet. Read the data into a SAS data set.

Excel Spreadsheet -> SAS Data Set

SAS Data Set
Flight    Date        Dest    FirstClass    Economy
439       12/11/00    LAX     20            137
921       12/11/00    DFW     20            131
114       12/12/00    LAX     15            170
144
The Import Wizard
The Import Wizard is a point-and-click graphical interface that enables you to create a SAS data set from several types of external files, including
- dBASE files (*.DBF)
- Excel spreadsheets (*.XLS)
- Microsoft Access tables (*.MDB)
- delimited files (*.*)
- comma-separated values (*.CSV).

The file formats that you are able to import may vary depending on what was installed with your particular installation of SAS.
145
The IMPORT Procedure
General form of the IMPORT procedure:

PROC IMPORT OUT=SAS-data-set
            DATAFILE='external-file-name'
            DBMS=file-type;
   GETNAMES=YES;
RUN;
146
The IMPORT Procedure
Look at the file created by the Import Wizard.

PROC IMPORT OUT= WORK.DFWLAX
            DATAFILE= "DallasLA.xls"
            DBMS=EXCEL2000 REPLACE;
   GETNAMES=YES;
RUN;

What if the data in the previous example were stored in a tab-delimited file?
147
The IMPORT Procedure
Change the PROC IMPORT code to read the tab-delimited file.

PROC IMPORT OUT= WORK.DFWLAX
            DATAFILE= "DallasLA.txt"
            DBMS=TAB REPLACE;
   GETNAMES=YES;
RUN;
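The same pattern extends to the comma-separated files listed for the Import Wizard. A minimal sketch, assuming a hypothetical file DallasLA.csv whose first row holds the column names:

```sas
/* Sketch: import a comma-separated file; the file name is hypothetical. */
proc import out=work.dfwlax
            datafile="DallasLA.csv"
            dbms=csv replace;      /* REPLACE overwrites an existing WORK.DFWLAX */
   getnames=yes;                   /* take variable names from the first row */
run;
```

Only the DATAFILE= and DBMS= values differ from the tab-delimited version.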
148
Summary
- INFILE statements are used to identify where the raw data file exists.
- INPUT statements are used to tell SAS how the raw data should be read into SAS numeric and character variables.
- The format of the raw data can range from a simple space- or tab-delimited list of values to a complicated multi-line formatted series of values.
- Variable attributes such as names, storage lengths, labels, and formats are assigned in the DATA step; names, labels, formats, and informats can later be changed with PROC DATASETS.
- Various forms of data can also be converted to a SAS data set using PROC IMPORT or the Import Wizard.
Minitab List Reports from Minitab Worksheets
150
Objectives
In Minitab we will:
- Generate simple list reports using the PRINT command.
- Use the NAME command to create descriptive column headings.
- Sequence (SORT) observations in a Minitab Worksheet.

Things we could do in SAS, but for which Minitab has no options:
- Print column subtotals in a list report.
- Use the ID statement to identify observations.
- Define titles and footnotes to enhance reports.
- Group observations in a list report.
- Create HTML reports using the Output Delivery System.
151
Overview of the PRINT Command
List reports are generated with the PRINT command.

Data Display

Row   C1  C2             C3         C4         C5
  1   31  GOLDENBERG     DESIREE    PILOT   50222
  2   40  WILLIAMS       ARLENE M.  FLTAT   23666
  3   71  PERRY          ROBERT A.  FLTAT   21958
  4   82  MCGWIER-WATTS  CHRISTINA  PILOT   96387
  5   91  SCOTT          HARVEY F.  FLTAT   32278
  6  106  THACKER        DAVID S.   FLTAT   24161
  7  355  BELL           THOMAS B.  PILOT   59803
  8  366  GLENN          MARTHA S.  PILOT  120202
152
Overview of PRINT Command
You can display
- descriptive column headings
- formatted data values (although more restricted than in SAS)
but not both.

Data Display (formatted data values)
  31 GOLDENBERG    DESIREE    PILOT  $ 50221.62
  40 WILLIAMS      ARLENE M.  FLTAT  $ 23666.12
  71 PERRY         ROBERT A.  FLTAT  $ 21957.71
  82 MCGWIER-WATTS CHRISTINA  PILOT  $ 96387.39
  91 SCOTT         HARVEY F.  FLTAT  $ 32278.40
 106 THACKER       DAVID S.   FLTAT  $ 24161.14
 355 BELL          THOMAS B.  PILOT  $ 59803.16
 366 GLENN         MARTHA S.  PILOT  $120202.38

Data Display (descriptive column headings)
Row  Emp ID  LastName       FirstName  Job Code  Salary
  1      31  GOLDENBERG     DESIREE    PILOT      50222
  2      40  WILLIAMS       ARLENE M.  FLTAT      23666
  3      71  PERRY          ROBERT A.  FLTAT      21958
  4      82  MCGWIER-WATTS  CHRISTINA  PILOT      96387
  5      91  SCOTT          HARVEY F.  FLTAT      32278
  6     106  THACKER        DAVID S.   FLTAT      24161
  7     355  BELL           THOMAS B.  PILOT      59803
  8     366  GLENN          MARTHA S.  PILOT     120202
153
Creating a Default List Report
General form of the PRINT command:

PRINT list of columns, matrices or constants

Example:

PRINT c1-c5.

You must list the columns to be printed. If names exist for the variables, they will be displayed.
154
Sorting a Minitab Worksheet
The SORT command
- rearranges the rows in a Worksheet
- can sort on multiple variables
- can sort in ascending (default) or descending order
- does not generate printed output
- treats missing values as the smallest possible value
- can sort the data in place, or put sorted data in new columns on the worksheet.
155
Sorting a Minitab Worksheet
General form of the SORT command:

SORT C [carry along C...C] put into C [and C...C]
  BY C...C
  DESCENDING C...C

Examples:

SORT c1-c5 c1-c5;
  BY c5.

SORT c1-c5 c11-c15;
  BY c5.
156
Sorting a Minitab Worksheet

Before sorting:
 C1  C2        C3         C4        C5
 31  GOLDENBERG  DESIREE    PILOT  50222
 40  WILLIAMS    ARLENE M.  FLTAT  23666
 71  PERRY       ROBERT A.  FLTAT  21958
 . . .

SORT c1-c5 c1-c5;
  BY c5.

After sorting:
 C1  C2        C3         C4        C5
 71  PERRY       ROBERT A.  FLTAT  21958
 40  WILLIAMS    ARLENE M.  FLTAT  23666
106  THACKER     DAVID S.   FLTAT  24161
 . . .
157
Assigning Column Labels
General form of the NAME command:

Name column='name' column='name';

'name' specifies a name up to 31 characters.

Names are used
- to replace column numbers in Minitab output
- automatically by many commands
- specifically, by the PRINT command.
158
Assigning Column Labels

MTB > name c1='Emp ID' c2='LastName' &
MTB > c3='FirstName' c4='Job Code' &
MTB > c5='Salary'.
MTB > print c1-c5.

Row  Emp ID  LastName  FirstName  Job Code  Salary
  1      71  PERRY     ROBERT A.  FLTAT      21958
  2      40  WILLIAMS  ARLENE M.  FLTAT      23666
  3     106  THACKER   DAVID S.   FLTAT      24161
159
Summary
- The PRINT command in Minitab is used to display data from columns, constants, and matrices in the session window.
- The NAME command in Minitab is used to assign a more descriptive label to a column, constant, or matrix than the required C, K, or M.
- The SORT command in Minitab can be used to reorder, or sort, the data in a column or group of columns.
Formats in Minitab
161
Objectives
- Define the allowable format values for Minitab.
- Demonstrate the use of formats in displaying data.
162
Using Minitab Formats
Enhance the readability of reports by formatting the data values.

Data Display
  31 GOLDENBERG    DESIREE    PILOT  $ 50221.62
  40 WILLIAMS      ARLENE M.  FLTAT  $ 23666.12
  71 PERRY         ROBERT A.  FLTAT  $ 21957.71
  82 MCGWIER-WATTS CHRISTINA  PILOT  $ 96387.39
  91 SCOTT         HARVEY F.  FLTAT  $ 32278.40
 106 THACKER       DAVID S.   FLTAT  $ 24161.14
 355 BELL          THOMAS B.  PILOT  $ 59803.16
 366 GLENN         MARTHA S.  PILOT  $120202.38

Unlike SAS, the format must be supplied for all the data being printed, not for individual columns.
(Exception: right-click on the column header to specify a format.)
163
Formatting Data Values
You can enhance reports by using Minitab formats to format data values.

Minitab Worksheet -> PRINT with FORMAT -> Report

Values in the worksheet are not changed.
164
Formatting Data Values
To apply a format to the data being printed, use the FORMAT subcommand of the PRINT command.

General form of the FORMAT subcommand:

FORMAT (specification)

Example:

MTB > print c1-c5;
SUBC> format (f4,1x,a14,a11,a6,1x,'$',f9.2).
165
Allowable Formats
Fw      F format: for reading numbers
Fw.d    F format: with decimal places specified
Aw      A format: for text (alpha) data
DT      DT format: for date/time data
X       X format: says to skip a space
Tn      T format: says to move to position n
n       Repeat factor
(       Open parenthesis
)       Close parenthesis
,       Comma: used to separate format items
/       Slash: says to go to a new data line
166
What Is a Minitab (Fortran) Format?
A format is an instruction that Minitab uses to read or write data values. Most formats have the following form:

format<w>.<d>

format    Format name
w         Total width (including decimal places and special characters)
.         Required delimiter
d         Number of decimal places
167
Minitab Formats
Standard formats:

Fw.d    standard numeric format
        F8.2    Width=8, 2 decimal places: 12234.21
Aw      standard character format
        A5      Width=5: KATHY
168
Minitab Formats

Stored Value    Format        Displayed Value
27134.2864      (F4)          ****
27134.2864      (F6)          27134
27134.2864      (F8)          27134
27134.2864      (F8.2)        27134.29
27134.2864      (F10.2)       27134.29
27134.2864      (F12.4)       27134.2864
27134.2864      ('$',f8.2)    $27134.29

If you do not specify a format width large enough to accommodate a numeric value, the displayed value will be adjusted, if possible, or replaced by asterisks.
169
Raw Data
0031 GOLDENBERG DESIREE PILOT 50221.62
0040 WILLIAMS ARLENE M. FLTAT 23666.12
0071 PERRY ROBERT A. FLTAT 21957.71
0082 MCGWIER-WATTS CHRISTINA PILOT 96387.39
0091 SCOTT HARVEY F. FLTAT 32278.4
0106 THACKER DAVID S. FLTAT 24161.14
0355 BELL THOMAS B. PILOT 59803.16
0366 GLENN MARTHA S. PILOT 120202.38
170
Default PRINT Command

MTB > print c1-c5

Row   C1  C2             C3         C4         C5
  1   31  GOLDENBERG     DESIREE    PILOT   50222
  2   40  WILLIAMS       ARLENE M.  FLTAT   23666
  3   71  PERRY          ROBERT A.  FLTAT   21958
  4   82  MCGWIER-WATTS  CHRISTINA  PILOT   96387
  5   91  SCOTT          HARVEY F.  FLTAT   32278
  6  106  THACKER        DAVID S.   FLTAT   24161
  7  355  BELL           THOMAS B.  PILOT   59803
  8  366  GLENN          MARTHA S.  PILOT  120202
171
PRINT Command with FORMAT Subcommand

MTB > print c1-c5;
SUBC> format (f4,1x,2a15,a6,1x,'$',f9.2).

  31 GOLDENBERG     DESIREE        PILOT  $ 50221.62
  40 WILLIAMS       ARLENE M.      FLTAT  $ 23666.12
  71 PERRY          ROBERT A.      FLTAT  $ 21957.71
  82 MCGWIER-WATTS  CHRISTINA      PILOT  $ 96387.39
  91 SCOTT          HARVEY F.      FLTAT  $ 32278.40
 106 THACKER        DAVID S.       FLTAT  $ 24161.14
 355 BELL           THOMAS B.      PILOT  $ 59803.16
 366 GLENN          MARTHA S.      PILOT  $120202.38
172
Summary
- FORTRAN-style formats are used in Minitab to modify the way in which data are displayed.
- The FORMAT subcommand of the PRINT command is used to apply the formats.
- The FORMAT subcommand does not change the values of the actual data; it only changes the way they are displayed.
Reading Raw Data into Minitab
174
Objectives
- Read a tab-delimited file using the READ command with a TAB subcommand.
- Read a data file using the READ command with a FORMAT subcommand.
175
READ Command
Command Syntax

READ data into C...C
  FILE "filename"
  FORMAT (format statement)
  NOBS = K
  SKIP K lines

With READ; FILE only:
  TAB
  NONAMES (use with TAB)
  ALPHA K...K (use with TAB)
  DECIMAL "." or ","
176
Raw Data
EmpID <TAB> LastName <TAB> FirstName <TAB> JobCode <TAB> Salary
0031 <TAB> GOLDENBERG <TAB> DESIREE <TAB> PILOT <TAB> 50221.62
0040 <TAB> WILLIAMS <TAB> ARLENE M. <TAB> FLTAT <TAB> 23666.12
0071 <TAB> PERRY <TAB> ROBERT A. <TAB> FLTAT <TAB> 21957.71
0082 <TAB> MCGWIER-WATTS <TAB> CHRISTINA <TAB> PILOT <TAB> 96387.39
0091 <TAB> SCOTT <TAB> HARVEY F. <TAB> FLTAT <TAB> 32278.4
0106 <TAB> THACKER <TAB> DAVID S. <TAB> FLTAT <TAB> 24161.14
0355 <TAB> <TAB> BELL <TAB> THOMAS B. <TAB> PILOT <TAB> 59803.16
0366 <TAB> GLENN <TAB> MARTHA S. <TAB> PILOT <TAB> 120202.38

Notes:
(1) The first column can be treated as numeric or alpha.
(2) The second, third, and fourth columns are alpha data.
(3) The fifth column is numeric data.
(4) The first row contains the column names.
(5) The file is stored as "empdata_tab.txt".
177
Minitab Commands

MTB > READ c1-c5;
SUBC> FILE 'filename';
SUBC> TAB;
SUBC> ALPHA 2 3 4.

Row  EmpID  LastName       FirstName  JobCode  Salary
  1     31  GOLDENBERG     DESIREE    PILOT     50222
  2     40  WILLIAMS       ARLENE M.  FLTAT     23666
  3     71  PERRY          ROBERT A.  FLTAT     21958
  4     82  MCGWIER-WATTS  CHRISTINA  PILOT     96387
  5     91  SCOTT          HARVEY F.  FLTAT     32278
  6    106  THACKER        DAVID S.   FLTAT     24161
  7    355  BELL           THOMAS B.  PILOT     59803
  8    366  GLENN          MARTHA S.  PILOT    120202
178
Reading Formatted Raw Data Files
Data for flights from New York to Dallas (DFW) and Los Angeles (LAX) are stored in a raw data file. This is the same data as before.

         1    1    2
1---5----0----5----0
43912/11/00LAX 20137
92112/11/00DFW 20131
11412/12/00LAX 15170
98212/12/00dfw  5 85
43912/13/00LAX 14196
98212/13/00DFW 15116
43112/14/00LaX 17166
98212/14/00DFW  7 88
11412/15/00LAX    187
98212/15/00DFW 14 31

Description               Column
Flight Number              1- 3
Date                       4-11
Destination               12-14
First Class Passengers    15-17
Economy Passengers        18-20

Stored as: dfwlax.dat
179
180
Minitab Commands
Using formats:

MTB > READ C1-C5;
SUBC> file 'path\dfwlax.dat';
SUBC> format (f3,dt8mm/dd/yy,a3,f3,f3).

439  12/11/00  LAX   20  137
921  12/11/00  DFW   20  131
114  12/12/00  LAX   15  170
982  12/12/00  dfw    5   85
439  12/13/00  LAX   14  196
982  12/13/00  DFW   15  116
431  12/14/00  LaX   17  166
982  12/14/00  DFW    7   88
114  12/15/00  LAX    *  187
982  12/15/00  DFW   14   31
181
Minitab Commands
Using formats with pointer control:

MTB > READ C1-C5;
SUBC> file 'path\dfwlax.dat';
SUBC> format &
SUBC> (T1,f3,T4,dt8mm/dd/yy,T12,a3,T15,f3,T18,f3).

439  12/11/00  LAX   20  137
921  12/11/00  DFW   20  131
114  12/12/00  LAX   15  170
982  12/12/00  dfw    5   85
439  12/13/00  LAX   14  196
982  12/13/00  DFW   15  116
431  12/14/00  LaX   17  166
982  12/14/00  DFW    7   88
114  12/15/00  LAX    *  187
982  12/15/00  DFW   14   31
182
Minitab Commands
Using formats with pointer control (columns read in a different order):

MTB > READ C1-C5;
SUBC> file 'path\dfwlax.dat';
SUBC> format &
SUBC> (T4,dt8mm/dd/yy,T1,f3,T15,f3,T18,f3,T12,a3).

12/11/00  439   20  137  LAX
12/11/00  921   20  131  DFW
12/12/00  114   15  170  LAX
12/12/00  982    5   85  dfw
12/13/00  439   14  196  LAX
12/13/00  982   15  116  DFW
12/14/00  431   17  166  LaX
12/14/00  982    7   88  DFW
12/15/00  114    *  187  LAX
12/15/00  982   14   31  DFW
183
Minitab – FILE/OPEN WORKSHEET
May read a variety of files, e.g.
- Minitab Worksheets (*.mtw)
- Excel Spreadsheets (*.xls)
- dBase databases (*.dbf)
- Text files (*.txt)
- Dat files (*.dat)
- All files (*.*)

Options specific to each file type are available.
The Preview option allows you to 'test out' your options.
The dialog does not generate commands and therefore cannot be used in subsequent programs.
184
Minitab – FILE/OPEN WORKSHEET options
185
Summary
- The READ command in Minitab is used to identify where the raw data file exists and to tell Minitab how the raw data should be read into numeric and character columns or matrices.
- The format of the raw data can range from a simple space- or tab-delimited list of values to a complicated multi-line formatted series of values.
- FORTRAN-style formatting can be used to specify more complicated raw data formats.
- The FILE/OPEN WORKSHEET dialog contains much of the functionality of the READ command.
- Unlike other Minitab pull-down menu options, the FILE/OPEN WORKSHEET option does not print its commands in the Session window.