sas_intro

Date post: 12-Nov-2015
Upload: abhijeet-jha
Statistics in Science
Introducing SAS® software
Acknowledgements to David Williams
Caroline Brophy
  • Need to know
    SAS environment
    SAS files (datasets, catalogs etc) & libraries
    SAS programs
    How to:
      Get data in
      Manipulate data
      Get results out

  • SAS software environment

  • SAS Windows (SAS 9)

  • Some (!) SAS windows
    Editor: where code is written or imported, and submitted
    Log: what happened, including what went wrong
    Output: results of program procedures that produce output
    Explorer: shows libraries (SAS & Windows), their files, and where you can see data, graphs
    Results: shows how the output is made up of tables, graphs, datasets etc
    Notepad: a useful place to keep bits of code

  • SAS software programs

  • SAS Programs

    data one;
      input x y;
      datalines;
    -3.2 0.0024
    -3.1 0.0033
    . . .
    ;
    run;

    proc print data = one (obs = 5);
    run;

    proc means data = one;
    run;

    DATA step: creates a SAS data set
    PROC steps: process data in a data set

  • Step Boundaries
    SAS steps begin with a
      DATA statement
      PROC statement.

    SAS detects the end of a step when it encounters
      a RUN statement (for most steps)
      a QUIT statement (for some procedures)
      the beginning of another step (DATA statement or PROC statement).
    Recommendation: use RUN; at the end of each step

  • Step Boundaries

    data seedwt;
      input oz $ rad wt;
      datalines;
    Low 118.4 0.7
    High 109.1 1.3
    Low 215.2 2.9
    ;
    run;

    /* no RUN; here: this step ends where the next PROC statement begins */
    proc print data = seedwt;

    proc means data = seedwt;
      class oz;
      var rad wt;
    run;

  • Submitting a SAS Program
    When you execute a SAS program, the output generated by SAS is divided into two major parts:
      SAS log: contains information about the processing of the SAS program, including any warning and error messages.
      SAS output: contains reports generated by SAS procedures and DATA steps.

  • Recommended steps!
    Submit all (or selected) code by
      F4
      Click on the runner in the toolbar
    Read log
    Look in output window if you expect code to produce output
    Problems
      Bad syntax
      Missing ; at end of line
      Missing quote at end of title (nasty!)

  • Improved output - HTML
    Tools > Options > Preferences > Results
    Do this & resubmit code
    Check HTML output in Results Window

  • SAS data sets

  • SAS data sets
    SAS procedures (PROC ...) process data from SAS data sets
    Need to know (briefly!)
      What a SAS data set looks like
      How to get our data into a SAS data set

  • SAS data sets
    live in libraries
    have a descriptor part (with useful info)
    have a data part which is a rectangular table of character and/or numeric data values (rows called observations)
    have names with syntax libname.datasetname
      libname defaults to work if omitted

  • work library
    SAS data sets with a single part name like oz, wp or mybestdata99
      are stored in the work library
      can be referenced e.g. as mybestdata99 or work.mybestdata99
      are deleted at end of SAS session!
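
As a minimal sketch of the naming rule above (the variable x and its value are made up purely for illustration), a data set created with a one-part name can be read back with the two-part name work.mybestdata99:

```sas
/* One-part name: SAS stores this data set in the work library */
data mybestdata99;
  x = 1;   /* illustrative value only */
run;

/* Two-part name: refers to the same data set */
proc print data = work.mybestdata99;
run;
```

Both names point at the same temporary data set, which disappears when the SAS session ends.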

  • Don't lose your data!
    Keep the SAS program that read the data from its original source

    . . . More later!

  • Viewing descriptor & data

    /* view descriptor part */
    proc contents data = wp;
    run;

    /* view data part */
    proc print data = work.wp;
    run;

    Alternatively:
    Use SAS Explorer:
      Open (for data)
      Properties (for descriptor)
    Properties is not as clear as CONTENTS

  • SAS variables
    There are two types of variables:
    character
      contain any value: letters, numbers, special characters, and blanks. Character values are stored with a length of 1 to 32,767 bytes (default is 8). One byte equals one character.

    numeric
      stored as floating point numbers in 8 bytes of storage by default. Eight bytes of floating point storage provide space for 16 or 17 significant digits. You are not restricted to 8 digits. Don't change the 8 byte length!

  • SAS variables
    OUTPUT

    The CONTENTS Procedure

    Alphabetic List of Variables and Attributes

    #  Variable  Type  Len
    1  oz        Char    8
    2  rad       Num     8
    3  wt        Num     8

  • SAS names for data sets & variables
    can be up to 32 characters long.
    can be uppercase, lowercase, or mixed-case but are not case sensitive!
    must start with a letter or underscore. Subsequent characters can be letters, underscores, or numeric digits - no %$!*@ or spaces.

  • Missing Data Values

    LastName  FirstName  JobTitle  Salary
    TORRES    JAN        Pilot     50000
    LANGKAMM  SARAH      Mechanic  80000
    SMITH     MICHAEL    Mechanic  .
    WAGSCHAL  NADJA      Pilot     77500
    TOERMOEN  JOCHEN               65000

    A value must exist for every variable for each observation.
    Missing values are valid values.
    A numeric missing value is displayed as a period.
    A character missing value is displayed as a blank.
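
The table above could be read in as follows. This is only a sketch: the data set name staff is an assumption, and it relies on simple list input, where a single period in the data lines stands for a missing value (numeric or character).

```sas
/* Sketch: data set name "staff" is hypothetical */
data staff;
  input LastName $ FirstName $ JobTitle $ Salary;
  datalines;
TORRES JAN Pilot 50000
LANGKAMM SARAH Mechanic 80000
SMITH MICHAEL Mechanic .
WAGSCHAL NADJA Pilot 77500
TOERMOEN JOCHEN . 65000
;
run;

/* List only the observations whose Salary is missing */
proc print data = staff;
  where Salary = .;
run;
```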

  • SAS syntax
    Not case sensitive
    Each statement usually begins with a keyword and ends with ;
    Common Errors:
      Forget ;
      Misspelt or wrong keyword
      Missing final quote in title

    title 'Woodpecker Habitat;    /* quote mark missing */
    title 'Woodpecker Habitat';

  • Comments
    Type /* to begin a comment.
    Type your comment text.
    Type */ to end the comment.
    To comment selected typed text remember: Ctrl+/
    Alternative:
      * comment ;

  • Creating a SAS data set

  • Getting data in!
    Consider 2 methods
      Data in program (briefly!)
      Data in Excel workbook

  • Getting data in!
    Data in program file:

    data oz;
      input oz $ rad wt;
      datalines;
    Low 118.4 0.7
    High 109.1 1.3
    Low 215.2 2.9
    . . .
    ;
    run;

    Note:
      oz is a text variable so requires $
      No missing values
      Values of oz
        don't contain spaces
        are at most 8 characters long
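
If a text value were longer than 8 characters, one common approach is to declare the variable's length before the INPUT statement. The sketch below assumes a made-up data set name (oz_long) and a made-up level "Intermediate" with invented rad and wt values, purely to show the mechanism:

```sas
/* Sketch: oz_long and the "Intermediate" row are hypothetical */
data oz_long;
  length oz $ 12;        /* allow up to 12 characters for oz */
  input oz $ rad wt;
  datalines;
Low 118.4 0.7
Intermediate 150.0 1.0
;
run;

/* Check the declared length in the descriptor part */
proc contents data = oz_long;
run;
```

Without the LENGTH statement, list input would store oz with the default length of 8 and truncate the longer value.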

  • Getting data in! from Excel

    Use the IMPORT wizard, saving the generated program to reduce future clicking!
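
A saved import program typically looks something like the PROC IMPORT step below. This is a hedged sketch, not the wizard's exact output: the file path and sheet name are hypothetical, and the appropriate DBMS value (XLS here) depends on the file type and on which SAS/ACCESS components are installed.

```sas
/* Sketch: path and sheet name are hypothetical; adjust to your own file */
proc import datafile = "C:\mydata\oz.xls"
            out = work.oz
            dbms = xls
            replace;
  sheet = "Sheet1";      /* worksheet to read */
  getnames = yes;        /* first row holds variable names */
run;
```

Rerunning this step reads the workbook again without any clicking.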

  • Creating new variables
    Adding a new variable to an existing SAS data set (say work.old)
      Use set
      Give definition of new variable

    data new;
      /* read data from work.old */
      set old;
      y2 = y**2;              /* y squared */
      ly = log(y);            /* natural log */
      ly_base10 = log10(y);   /* log to base 10 */
      t1 = (treat = 1);       /* indicator: 1 if treat is 1, otherwise 0 */
    run;

  • Data set: work.new

  • Data Screening

  • Data Screening
    checking input data for gross errors
    Use PRINT procedure to scan for obvious anomalies
    Use MEANS procedure & examine summary table
      MAXIMUM, MINIMUM - reasonable?
      MEAN - near middle of range?
      MISSING VALUES - input or calculation error e.g. log(0)?
      CV (= 100*std.dev/mean) - < 10% for plant growth, between 12 & 30% for animal production variables, > 50% implies skewness for any positive variable
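
The screening checks above might be run as follows. This is a sketch using the three seedwt rows shown earlier in these slides; the choice of statistic keywords is an assumption about what a screening run would request, not a prescribed list.

```sas
data seedwt;
  input oz $ rad wt;
  datalines;
Low 118.4 0.7
High 109.1 1.3
Low 215.2 2.9
;
run;

/* Scan the raw values for obvious anomalies */
proc print data = seedwt;
run;

/* Summary table: count, missing count, range, mean, std dev and CV */
proc means data = seedwt n nmiss min max mean std cv;
  var rad wt;
run;
```

NMISS gives the number of missing values for each variable, and CV is reported as a percentage (100*std.dev/mean).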

  • SAS syntax: MEANS syntax

    What else should go here?

  • Dealing with data errors
    Check original records
    Change mistakes in recording where the correct value is beyond question
    Regenerate observations where possible e.g. reweigh sample, redo chemical analysis
    With a large body of data in an unbalanced design, err on the side of omitting questionable data
    Do not proceed until data has been properly cleaned; if necessary perform a number of screening runs

    Open wp.sas for illustration
    Do a few examples using wp.sas
    Do demo with wp.sas using CONTENTS and Explorer
    See program rd_oz.sas:
      Check long value for oz
      value with space
    Do demo using oz.xls