SAS Tips & TricksBrought to you by your fellow MASUG members
Jun Tang, Data Analyst, MS UT Health Science center Missing Data in Multivariate Analysis
Missing Data in Multivariate AnalysisIn multivariate analysis, we encountered this: 76 of 306 observations in data set LIB.FILENAME omitted due to missing values. Its 25% of the sample. Sample size is drastically reduced.
Missing Data in Multivariate AnalysisTo solve this problem, we consider two situations:Do not use Multivariate Analysis, in stead, use Univariate Test. This will not affect the sample size of those non-missing value variables. Example: MANOVA change to TTEST.
proc anova;class AAMC; model ratevar1 ratevar2 ratevar3 =AAMC;manova h=AAMC/printe printh;run;
Change to:proc ttest data=lib.filename;class AAMC;VAR ratevar1 ratevar2 ratevar3 ;run;
Missing Data in Multivariate AnalysisTo replace the missing data :Mean Estimation;Hot-Deck imputation;Predict missing values from Regression;Multiple Imputation (which is still in experimental form in Release 8.1 of the SAS system).SAS procedures for doing MI:a. PROC MI creates Multiple Imputed data sets;b. PROC MIANALYZE combines results after analysis.Example:proc MI data=lib.filename out=outmi;VAR ratevar1 ratevar2 ratevar3;run;
Refs: 1. Kim Chantala and C.Suchindran, Multiple Imputation for missing data; 2. www.ats.ucla.edu/stat/sas/v8/miv802.pdf, chapter 9, the MI procedure;
Shelly Lensing, Sr. Biostatistician IIDeqing Pei, Statistical Analyst II St. Jude Children's Research Hospital Missing Readings Without a Missing Record
Missing Readings Without a Missing Record This is a problem that just came up: We needed to group each set of readings (3h, 7h, 23h, 44h) for each ID to indicate group 1, 2, etc. over time. Some readings were missing, but there was no missing record
Missing Readings Without a Missing Recorddata one; input id sample $ 10-18 mtx_con @37 hdmtx_date mmddyy8.; sampnum=compress(sample,'MTXHR')+0; format hdmtx_date mmddyy8.; datalines; *** example data;11111 MTX 3HR 81.29 11/20/9511111 MTX 7HR 22.07 11/20/9511111 MTX 23HR 2.24 11/21/9511111 MTX 44HR 1.49 11/22/9511111 MTX 3HR 92.16 01/19/9611111 MTX 44HR 0.18 01/21/9611111 MTX 7HR 24.27 02/20/9611111 MTX 23HR 3.34 02/21/9622222 MTX 3HR 127.22 12/17/9422222 MTX 7HR 29.73 12/17/9422222 MTX 23HR 0.65 12/18/9422222 MTX 7HR 45.00 12/28/9422222 MTX 23HR 1.88 12/29/9422222 MTX 3HR 132.48 02/06/9522222 MTX 7HR 37.96 02/06/9522222 MTX 23HR 1.53 02/07/9522222 MTX 44HR 0.13 02/08/9522222 MTX 7HR 85.71 02/09/9522222 MTX 44HR 30.01 02/11/9522222 MTX 44HR 35.01 02/13/9533333 MTX 3HR 65.02 02/13/9533333 MTX 23HR 34.13 02/14/9533333 MTX 23HR 24.13 02/15/9533333 MTX 3HR 74.23 02/16/9533333 MTX 44HR 24.06 02/17/95;run;
Missing Readings Without a Missing Recordproc sort data=one; *** make sure data are in order; by id hdmtx_date sampnum;run;
data two; set one; by id;
lagsamp=lag(sampnum); * Uses logic of when hour-based measurements should occur relative to each other; * to find first record of each set; if sampnum=3 then first=1; if sampnum=7 and lagsamp ne 3 then first=1; if sampnum=23 and lagsamp not in(7,3) then first=1; if sampnum=44 and lagsamp not in(23,7,3) then first=1;
if first.id then do; * restarts group number with each new id; group=0; end; retain group; if first=1 then group+1; * increments group number for each grouping;run;
Vaibhav Gard , Senior Marketing Analyst FedEx Services - Marketing, Planning and Analysis Group Interactive ModeBeats the Batch Mode in most cases
I like to use the SAS Interactive Mode on UNIX. It is almost as convenient to use as Windows, and my code gets neatly color coded! This makes it easy to catch minor error in syntax like forgetting a quote, semi-colon, etc.
In the interactive mode, it is quite easy to see the exact location, size, modification date, etc. of all your tables.
You can even right click to open (view) Oracle tables, SAS datasets to see the records contained therein. Much easier than the usual UNIX SQL options! (Can also delete, copy, save, etc.)
You can see a continuously visible log file (again color coded) to monitor the progress/success of your program. In case there is an error, it shows up in RED.There are other benefits of using the Interactive mode, but I have highlighted just these few. Hope you find it worthwhile to check it out!
Alan Teal, Chief, Technical Support Branch Defense Contract Audit Agency Macro to Document Date/Time of Processing
Macro to Document Date/Time of ProcessingPurpose: It is a good practice to document the date and time when a SAS job was run, especially if the same job is often run but with different data sets. The following macro writes the date and time to the SAS log. It can also be used to determine how long each Data step or PROC processed, in order to evaluate which steps might be made more efficient.
Macro to Document Date/Time of ProcessingSAS Log Shown Below:1 * TheTimeIs.SAS Writes the date and time to the SAS Log to document processing.;2 * Created 03-25-2003, Alan Teal;34 %MACRO Tim;5 DATA _Null_;6 TheTimeIs=Datetime();7 PUT TheTimeIs= DateTime20.;8 %MEND Tim;910 %Tim1112 RUN;
TheTimeIs=25MAR2003:12:36:47NOTE: DATA statement used: real time 0.32 seconds
When the %Tim macro is called TheTimeIs (date/time) is written to the SAS Log, as shown above.
The Unknown SAS Instructor I just have one comment to make
I just have one comment to make/* Heres an easy way to comment out a chunk of code: */
%macro skip;data new; set old; . . .run;%mend skip;/* And the best part is, you can use it as many times as you want within the same program! */
Stephanie Thompson, Sr. Merchandising Analyst AutoZone Extra rsubmits
Extra rsubmits Ever forget to remove an extra rsubmit from your code and it crashes? Keep them there and use them at will with...
Extra rsubmitsrsubmit;proc sql;create table payprelim_vp as (SELECT VP, sum(purchase_cost) as p_cost, sum(inventory_cost) as i_cost FROM payprelim_t group by vp);*rsubmit;data q.payables_vp;set payprelim_vp;payables = p_cost / i_cost;run;proc print data = q.payables_vp;format purchase_cost dollar12. inventory_cost dollar12. payables percent6.2;run;Higlight from after the * to the end and submit.
Walter Smith, Marketing Project Manager WorldWide Forecasting, FedEx Dates and Fixing Capitalization
Creating SAS Dates From Character DataOften date data is stored in files or tables as character data and is more useful when converted to SAS date values (number of days since Jan 1, 1960)
*-- read in some dates as character ---;data chardates; infile cards; input yyyymm $6. ;cards;200206200207200208200303run;
*---------- create SAS dates ----------;data sasdates; set chardates; yy = input( substr( yyyymm , 1 , 4 ) , 4. ); mm = input( substr( yyyymm , 5 ) , 2. ); sasdate = mdy( mm , 1 , yy ); date9 = sasdate; mmddyy = sasdate; attrib sasdate label='Numeric Date' format=5.; attrib date9 label='Displayed As Date9' format=date9.; attrib mmddyy label='Displayed As mmddyy' format=mmddyy10.; keep yyyymm sasdate date9 mmddyy; tt = 1; put 'tt=' tt 'tt=' tt date9.;run;title 'Character & SAS Dates';proc print; run;
The ResultCharacter & SAS Dates
Obs yyyymm sasdate date9 mmddyy
1 200206 15492 01JUN2002 06/01/2002 2 200207 15522 01JUL2002 07/01/2002 3 200208 15553 01AUG2002 08/01/2002 4 200303 15765 01MAR2003 03/01/2003
SAS Dates in Macro VariablesSometimes it is convenient to have date values in the macro environmentAdditional manipulations can be done the macro function %sysfunc() gives access to nearly all the datastep functions!
Sample Code & Results*--- put min & max into macro vars ----;proc sql noprint; select put(min(sasdate),5.), max(sasdate) into :mindate, :maxdate from sasdates;quit; %put >> mindate="&mindate" maxdate="&maxdate";%let nextmonth = %sysfunc( intnx( month , &maxdate , 1 ) , date9. );%let sas_date_const = %unquote(%bquote(')&nextmonth%bquote('d));%put >> nextmonth=&nextmonth sas_date_const=&sas_date_const;
>> mindate="15492" maxdate=" 15765>> nextmonth=01APR2003 sas_date_const='01APR2003'd
Capitalizing Proper NamesOften names of people, places and things in data is in mixed or case, or worse, IN ALL CAPITAL LETTERS.Names are easier to read when each word of the name has only the initial letter capitalized.
Sample Code*--------- read in some data ----------;data names; infile cards; input fullname $30. ;cards;Chris GeithFaRoUK HiJaZi unusual caseSOUTH KOREATARIQ AZIZWalter j SMITHlos angeles CA state abbrevrun;
*---- change names to initial caps ----;data newnames; set names; oldname = fullname; length newname $30; fullname = lowcase( fullname ); n_wrds = len
Click here to load reader