Date post: 18-Jan-2018

ISQS 6347, Data & Text Mining
ISQS 6339, Data Management & Business Intelligence

Data Preparation for Analytics Using SAS

Zhangxi Lin
Texas Tech University

Outline

- An overview of data preparation for analytics
- SAS programming essentials
  - Running SAS programs
  - Mastering fundamental concepts
  - SAS program debugging
- Making use of SAS Enterprise Guide for programming

Structure and Components of Business Intelligence

Overview: From Data Warehousing to Data Analysis

- Previous major topics in data warehousing (using SQL Server 2008)
  - Dimensional model design
  - ETL
  - Cube design and OLAP
- Data analysis topics (using SAS)
  - Data preparation
    - Analytic business questions
    - Data format and data conversion
    - Data cleansing
  - Data exploration
  - Data analysis
  - Data visualization

Components of the SAS System

[Figure: components of the SAS System arranged around Base SAS: Reporting and Graphics; Data Access and Management; User Interface; Analytical; Application Development; Visualization and Discovery; Business Solutions; Web Enablement.]

SAS Programming Essentials

Find more information from http://support.sas.com

Data-driven Tasks

The functionality of the SAS System is built around four data-driven tasks common to virtually any application:
- Data access
- Data management
- Data analysis
- Data presentation

Turning Data into Information

Process of delivering meaningful information:
- 80% data-related
  - Access
  - Scrub
  - Transform
  - Manage
  - Store and retrieve
- 20% analysis

[Figure: Data -> DATA step -> SAS data sets -> PROC steps -> Information]

Design of the SAS System

MultiVendor Architecture

[Figure: SAS runs on PCs, workstations, servers/midrange machines, mainframes, and supercomputers. The diagram labels 90% of the system as independent and 10% as dependent.]

MultiEngine Architecture

[Figure: DATA reached through engines for Teradata, SYBASE, Microsoft Excel, ORACLE, dBase, SAP, and DB2.]

SAS Programming – Level I

- Fundamentals (ch1-3)
- Producing list reports (ch4)
- Enhancing output (ch5)
- Creating data sets (ch6)
- Data step programming (ch7)
  - Reading data
  - Creating variables
  - Conditional processing
  - Keeping and dropping variables
  - Reading Excel files
- Combining SAS data sets (ch8)
- Producing summary reports (ch9)
- SAS graphing (ch10)

Course Scenario

In this course, you work with business data from International Airlines (IA). The various kinds of data that IA maintains are listed below:
- flight data
- passenger data
- cargo data
- employee data
- revenue data

The following are some tasks that you will perform:
- importing data
- creating a list of employees
- producing a frequency table of job codes
- summarizing data
- creating a report of salary information

SAS Programs

A SAS program is a sequence of steps that the user submits for execution.

- DATA steps are typically used to create SAS data sets.
- PROC steps are typically used to process SAS data sets (that is, generate reports and graphs, edit data, and sort data).

[Figure: Raw data -> DATA step -> SAS data set -> PROC step -> Report]

DATA step:

   data work.staff;
      infile 'raw-data-file';
      input LastName $ 1-20 FirstName $ 21-30
            JobTitle $ 36-43 Salary 54-59;
   run;

PROC steps:

   proc print data=work.staff;
   run;

   proc means data=work.staff;
      class JobTitle;
      var Salary;
   run;

Step Boundaries

SAS steps begin with either of the following:
- DATA statement
- PROC statement

SAS detects the end of a step when it encounters one of the following:
- a RUN statement (for most steps)
- a QUIT statement (for some procedures)
- the beginning of another step (DATA statement or PROC statement)

In the program below, the PROC PRINT step has no RUN statement. It ends where the PROC MEANS step begins.

   data work.staff;
      infile 'raw-data-file';
      input LastName $ 1-20 FirstName $ 21-30
            JobTitle $ 36-43 Salary 54-59;
   run;

   proc print data=work.staff;

   proc means data=work.staff;
      class JobTitle;
      var Salary;
   run;
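The QUIT boundary is easiest to see with PROC SQL, one of the procedures that stays active until QUIT is submitted. The following is a minimal sketch, assuming work.staff has already been created by the DATA step shown in these slides; the alias AvgSalary is made up for illustration.

```sas
/* Assumes work.staff already exists (created by the DATA step above). */
proc sql;
   /* Each SQL statement executes as soon as its semicolon is reached. */
   select JobTitle,
          mean(Salary) as AvgSalary   /* AvgSalary: illustrative alias */
      from work.staff
      group by JobTitle;
quit;   /* QUIT, rather than RUN, ends the PROC SQL step */
```

Starting a new DATA or PROC step would also end PROC SQL, but an explicit QUIT makes the boundary visible in the code.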

Running a SAS Program

You can invoke SAS in the following ways:
- interactive windowing mode (SAS windowing environment)
- interactive menu-driven mode (SAS Enterprise Guide, SAS/ASSIST, SAS/AF, or SAS/EIS software)
- batch mode
- noninteractive mode

Preparation for SAS Programming

- Data sets: \SAS-Programming
- Create a user-defined library reference
  Statement: LIBNAME libref 'SAS-data-library' <options>;
  Example:   LIBNAME ia 'c:\workshop\winsas\prog1';
- Two-level SAS file names: libref.filename
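A short sketch of how a libref and a two-level name fit together. The path is the one given on the slide; the member name staff in the last step is hypothetical and should be replaced by a data set that actually exists in the library.

```sas
/* Assign the libref ia to the course folder (path from the slide). */
libname ia 'c:\workshop\winsas\prog1';

/* List the members of the library; NODS suppresses per-variable detail. */
proc contents data=ia._all_ nods;
run;

/* Two-level name: libref.filename. "staff" is a hypothetical member. */
proc print data=ia.staff;
run;
```

The work.staff data set used throughout these slides follows the same pattern, with work being the temporary library that SAS assigns automatically.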

SAS Programming Essentials

Demo: c02s2d1
Exercise: c02ex1

Browsing the Descriptor Portion

General form of the CONTENTS procedure:

   PROC CONTENTS DATA=SAS-data-set;
   RUN;

Example:

   proc contents data=work.staff;
   run;

c02s3d1

SAS Data Sets: Data Portion

The data portion of a SAS data set is a rectangular table of character and/or numeric data values.

   LastName    FirstName   JobTitle    Salary
   TORRES      JAN         Pilot        50000
   LANGKAMM    SARAH       Mechanic     80000
   SMITH       MICHAEL     Mechanic     40000
   WAGSCHAL    NADJA       Pilot        77500
   TOERMOEN    JOCHEN      Pilot        65000

The first row holds the variable names and the remaining rows hold the variable values. LastName, FirstName, and JobTitle hold character values; Salary holds numeric values.

Variable names are part of the descriptor portion, not the data portion.
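To try this without an external raw data file, the table can be typed in-stream. This is a sketch rather than the course's own program: it swaps the column input used elsewhere in these slides for list input, which works here because no value contains a blank and every character value fits the default length of 8.

```sas
/* In-stream data replaces 'raw-data-file'; list input is an assumption
   made for this sketch, not the column input shown on the slides. */
data work.staff;
   input LastName $ FirstName $ JobTitle $ Salary;
   datalines;
TORRES   JAN     Pilot    50000
LANGKAMM SARAH   Mechanic 80000
SMITH    MICHAEL Mechanic 40000
WAGSCHAL NADJA   Pilot    77500
TOERMOEN JOCHEN  Pilot    65000
;
run;

/* Descriptor portion: variable names, types, and lengths. */
proc contents data=work.staff;
run;

/* Data portion: the values themselves. */
proc print data=work.staff;
run;
```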

SAS Variable Values

There are two types of variables:

- character: contain any value: letters, numbers, special characters, and blanks. Character values are stored with a length of 1 to 32,767 bytes. One byte equals one character.
- numeric: stored as floating point numbers in 8 bytes of storage by default. Eight bytes of floating point storage provide space for 16 or 17 significant digits. You are not restricted to 8 digits.
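A small sketch of the two types. The data set and variable names (work.types, City, BigNumber) are invented for illustration. PROC CONTENTS should report City as a character variable of length 20 and BigNumber as a numeric variable of length 8.

```sas
data work.types;
   length City $ 20;              /* character variable stored in 20 bytes */
   City = 'Lubbock';
   BigNumber = 123456789012345;   /* 15 digits in the default 8 bytes */
   format BigNumber 15.;          /* display all 15 digits */
run;

/* Shows the type and length of each variable. */
proc contents data=work.types;
run;

proc print data=work.types;
run;
```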

SAS Data Set and Variable Names

SAS names have these characteristics:
- can be up to 32 characters long
- can be uppercase, lowercase, or mixed-case
- are not case sensitive
- must start with a letter or underscore; subsequent characters can be letters, underscores, or numerals

Valid SAS Names

Select the valid default SAS names.

- data5mon: valid
- 5monthsdata: invalid (starts with a numeral)
- five months data: invalid (contains blanks)
- data#5: invalid (contains a special character)
- fivemonthsdata: valid
- FiveMonthsData: valid

Missing Data Values

   LastName    FirstName   JobTitle    Salary
   TORRES      JAN         Pilot        50000
   LANGKAMM    SARAH       Mechanic     80000
   SMITH       MICHAEL     Mechanic         .
   WAGSCHAL    NADJA       Pilot        77500
   TOERMOEN    JOCHEN                   65000

A value must exist for every variable for each observation. Missing values are valid values.
- A numeric missing value is displayed as a period.
- A character missing value is displayed as a blank.
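The table above can be reproduced and queried for its missing values. A sketch under stated assumptions: the data set name work.staffmiss is invented, and list input is used so that a single period in the data lines stands for a missing value of either type.

```sas
/* With list input, a lone period marks a missing value for both
   numeric and character variables. work.staffmiss is a made-up name. */
data work.staffmiss;
   input LastName $ FirstName $ JobTitle $ Salary;
   datalines;
TORRES   JAN     Pilot    50000
LANGKAMM SARAH   Mechanic 80000
SMITH    MICHAEL Mechanic .
WAGSCHAL NADJA   Pilot    77500
TOERMOEN JOCHEN  .        65000
;
run;

/* List only the observations that have a missing value. */
proc print data=work.staffmiss;
   where missing(Salary) or missing(JobTitle);
run;

/* Count nonmissing (N) and missing (NMISS) salaries. */
proc means data=work.staffmiss n nmiss mean;
   var Salary;
run;
```

The MISSING function accepts both character and numeric arguments, so one WHERE condition covers both kinds of missing value.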

Browsing the Data Portion

The PRINT procedure displays the data portion of a SAS data set.

By default, PROC PRINT displays the following:
- all observations
- all variables
- an Obs column on the left side

General form of the PRINT procedure:

   PROC PRINT DATA=SAS-data-set;
   RUN;

Example:

   proc print data=work.staff;
   run;

c02s3d1
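Each of the three defaults can be overridden. A brief sketch, assuming work.staff exists with the variables used in these slides; the salary cutoff is arbitrary.

```sas
proc print data=work.staff noobs;   /* NOOBS suppresses the Obs column */
   var LastName JobTitle Salary;    /* VAR selects and orders the variables */
   where Salary > 60000;            /* WHERE selects the observations */
run;
```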

SAS Data Set Terminology

SAS documentation and text in the SAS windowing environment use the following terms interchangeably:

   SAS data set   =   SAS table
   Variable       =   Column
   Observation    =   Row

SAS Syntax Rules

SAS statements have these characteristics:
- usually begin with an identifying keyword
- always end with a semicolon

   data work.staff;
      infile 'raw-data-file';
      input LastName $ 1-20 FirstName $ 21-30
            JobTitle $ 36-43 Salary 54-59;
   run;

   proc print data=work.staff;
   run;

   proc means data=work.staff;
      class JobTitle;
      var Salary;
   run;

SAS statements are free-format.
- One or more blanks or special characters can be used to separate words.
- They can begin and end in any column.
- A single statement can span multiple lines.
- Several statements can be on the same line.

Unconventional spacing:

   data work.staff;
    infile 'raw-data-file';
   input LastName $ 1-20 FirstName $ 21-30
   JobTitle $ 36-43 Salary 54-59;
   run;
      proc means data=work.staff;
    class JobTitle; var Salary;run;

Good spacing makes the program easier to read.

Conventional spacing:

   data work.staff;
      infile 'raw-data-file';
      input LastName $ 1-20 FirstName $ 21-30
            JobTitle $ 36-43 Salary 54-59;
   run;

   proc print data=work.staff;
   run;

   proc means data=work.staff;
      class JobTitle;
      var Salary;
   run;

SAS Comments

Type /* to begin a comment. Type your comment text. Type */ to end the comment.

   /* Create work.staff data set */
   data work.staff;
      infile 'raw-data-file';
      input LastName $ 1-20 FirstName $ 21-30
            JobTitle $ 36-43 Salary 54-59;
   run;

   /* Produce listing report of work.staff */
   proc print data=work.staff;
   run;

c02s3d2
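SAS also accepts a second comment form not shown on the slide: a statement that begins with an asterisk and ends at the next semicolon. The sketch below shows both forms side by side and assumes work.staff exists.

```sas
* This whole statement is a comment and ends at the semicolon;
proc print data=work.staff;   /* a block comment can follow a statement */
run;

/* A block comment can also disable code temporarily,
   because semicolons inside it are ignored:
proc means data=work.staff;
   var Salary;
run;
*/
```

Block comments do not nest, so a section that already contains a block comment cannot be wrapped in another one.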

Syntax Errors

Syntax errors include the following:
- misspelled keywords
- missing or invalid punctuation
- invalid options

The program below contains one error of each kind: the misspelled keyword daat, a missing semicolon after the PROC PRINT statement, and the invalid option average.

   daat work.staff;
      infile 'raw-data-file';
      input LastName $ 1-20 FirstName $ 21-30
            JobTitle $ 36-43 Salary 54-59;
   run;

   proc print data=work.staff
   run;

   proc means data=work.staff average max;
      class JobTitle;
      var Salary;
   run;

Debugging a SAS Program

This demonstration illustrates how to submit a SAS program that contains errors, diagnose the errors, correct the errors, and save the corrected program.

   c02s4d1.sas   userid.prog1.sascode(c02s4d1)
   c02s4d2.sas   userid.prog1.sascode(c02s4d2)

Recall a Submitted Program

Program statements accumulate in a recall buffer each time you issue a SUBMIT command.

Submit Number 1:

   daat work.staff;
      infile 'raw-data-file';
      input LastName $ 1-20 FirstName $ 21-30
            JobTitle $ 36-43 Salary 54-59;
   run;

   proc print data=work.staff
   run;

   proc means data=work.staff average max;
      class JobTitle;
      var Salary;
   run;

Submit Number 2:

   data work.staff;
      infile 'raw-data-file';
      input LastName $ 1-20 FirstName $ 21-30
            JobTitle $ 36-43 Salary 54-59;
   run;

   proc print data=work.staff;
   run;

   proc means data=work.staff mean max;
      class JobTitle;
      var Salary;
   run;

- Issue the RECALL command once to recall the most recently submitted program. The Submit Number 2 statements are recalled.
- Issue the RECALL command again to recall the Submit Number 1 statements.

Exercise 8: Basic SAS Programming

- Define libraries IA and Out.
- Go through all SAS programs in Chapters 2-5.
- Write a SAS program to read a dataset created by yourself, or simply use Person0.txt in \\TechShare\coba\d\ISQS3358\OtherDatasets\ .
- Output the dataset to your library Out.
- Try to apply whatever SAS features in Chapter 5 of Prog-I to generate a nice-looking report.
- Go through all exercises for Ch 2, 3, 4, 5, 6 (answer keys are available, so no need to submit the results).

Hands-on exercise

Write a SAS program to calculate the number of days passed in 2012 up to 3/3/2012. The input is in the format date9.:

   01JAN2012 03MAR2012

Answer: 62 days
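One possible solution to the hands-on exercise, offered as a sketch rather than the official answer key. SAS stores dates as day counts, so subtracting one date value from another gives the number of days between them. The data set and variable names (work.days, StartDate, EndDate, DaysPassed) are invented for illustration.

```sas
data work.days;
   /* The colon modifier reads each blank-delimited value with date9. */
   input StartDate :date9. EndDate :date9.;
   /* SAS dates are day counts, so the difference is in days. */
   DaysPassed = EndDate - StartDate;
   format StartDate EndDate date9.;
   datalines;
01JAN2012 03MAR2012
;
run;

proc print data=work.days;
run;
```

For the input shown, DaysPassed is 62: 30 days remain in January after the 1st, February 2012 has 29 days, and 3 days fall in March.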

Making Use of SAS Enterprise Guide Code

- Import a text file
  Example: Orders.txt
- Import an Excel file
  Example: SupplyInfo.xls

Learn from Examples

SAS Help Contents -> Learning to use SAS -> Sample SAS Programs -> Base SAS
"Base Usage Guide Examples": Chapter 3, 4

Import an Excel Sheet

   proc import out=work.commrex
         datafile="C:\Lin\Shared\ISQS6339\Commrex_3358.xls"
         dbms=excel replace;
      sheet="Company";
      getnames=yes;
      mixed=no;
      scantext=yes;
      usedate=yes;
      scantime=yes;
   run;

   proc print data=work.commrex;
   run;

Excel SAS/ACCESS LIBNAME Engine

   libname xlsdata 'C:\Lin\Shared\ISQS6339\Commrex_3358.xls';

   proc print data=xlsdata.New1;
   run;

EG EX5: SAS Data Step Programming

http://zlin.ba.ttu.edu/6339/ExerciseSASProgramming.htm