I have written a macro that will create a waterfall plot for the user with just a few input variables (and of course, some additional options if so desired).
Before I get started, I must say that I am not a huge fan of SAS waterfall plots. If you do not care why I use SAS over R, then please skip to The Macro Description section. I have heard that with the new SAS 9.3 that there are some SG procedures that can produce these plots quite easily and attractively. I do not have access to SAS 9.3 (thanks to the kind programmers at SAS Institute who require paychecks). I still use SAS 9.1.3. I have always been more fond of my waterfall plots in R (which you can compare here).
The only reason I have settled with SAS plots is because it became too much of a management issue for me. First, let me explain my SAS to R process (There is a SAS to R to SAS method that I have yet to figure out). Since SAS is great for data management and not so wonderful when it comes to graphics while R is fabulous with graphics while not so user friendly when it comes to data manipulation (this may just be a personal opinion), I would manipulate my data in SAS, export it to a .csv Excel file, read it into R, perform minor manipulations if needed, and output a PDF of the graph.
Now my issues...
First, my plots are usually not wanted in PDF because it makes it difficult to copy into PowerPoint presentations for management. Second, my overseers have this loving habit of asking me to produce a plot. They tell me how wonderful it looks, but they would like to see what happens if I change just one criteria. Well, they are not programmers, so they do not realize that sometimes that one change requires quite a bit more programming--not necessarily getting the result, but getting the output to adjust.
In short, to skip all that hassle, I have just dropped my R waterfall plots (sorry my wonderful R graphics, maybe we can get together some other time) and moved on to SAS. R is kind of like that girl that you really wanted to date in high school, but she was way too classy for you. So, you went out with the next best thing because she like you back.
The Macro Description
Okay, back to the macro! This macro is called %waterfall (yes, a very creative name--this is why programmers get paid the big bucks). %waterfall requires 4 input parameters with 5 more to enhance presentation. There is some internal stuff that can be changed around if you want to even more personalize the output for your liking. Take time to adjust the code in this macro to make it fit your company profile or your personal programming preferences!
Parameter
|
Required?
|
Default Value
|
Description
|
dsin
|
Yes
|
The input dataset used containing the data with at least yvar, byvar, and any variables required for
the whr filtering
|
|
whr
|
No
|
1=1 (captures all
observations in the dsin dataset)
|
Filters out any unwanted observations in the dsin dataset
|
yvar
|
Yes
|
The name of the variable containing the values that will
be plotted on the y-axis
|
|
byvar
|
No
|
A grouping variable if you want multiple waterfalls side
by side (e.g., suppose you want to compare males to females)
|
|
title
|
No
|
Plot title
|
|
ylab
|
No
|
Label for the y-axis
|
|
outpath
|
Yes
|
The path for the output file
|
|
filename
|
Yes
|
The name of the output file
|
|
barwidth
|
No
|
3
|
The width of the waterfall bars
|
The Macro Code
This is a very straightforward macro for those familiar with SAS programming. I have tried to leave sufficient comments throughout the code, but feel free to contact me with questions (via e-mail or by leaving a comment).
%macro
waterfall(dsin=, whr=%str(), yvar=, byvar=,
title=%str(), ylab=%str(),
outfile=%str(), filename=%str(),
barwidth=3);
/* Display error
message if WATERFALL dataset is attempting to be overwritten */
%if
%sysfunc(upcase(&dsin.))=WATERFALL
%then %do;
%put
ERROR: The dataset WATERFALL is used in the waterfall macro. Please rename your input dataset and rerun
the macro;
%return;
%end;
/* Get variable type
and format of BYVAR */
proc sql noprint;
select type, format into :vartyp,
:varfmt from sashelp.vcolumn
where upcase(memname)=upcase("&dsin.")
& upcase(name)=upcase("&byvar.");
quit;
/* Subset DSIN into
WATERFALL & sort */
data waterfall;
set &dsin.;
where &whr.;
run;
proc sort data=waterfall;
by &byvar.
descending &yvar.;
run;
data waterfall;
set waterfall;
n=_n_;
run;
/* Set up BYVAR
distinct values and number of distinct values */
proc sql ;
select count(distinct &byvar.)
into :nby from waterfall;
%let
nby=%sysfunc(compress(&nby.));
%if
&nby.=1
%then %do;
select distinct &byvar.
into :by1 from waterfall;
%end;
%if
&nby.>1
%then %do;
select distinct &byvar.
into :by1-:by&nby. from waterfall;
%end;
quit;
/* Create waterfall
plot */
%let
col1=blue; %let
col2=red; %let
col3=yellow; %let
col4=green; %let
col5=purple;
%let
col6=orange; %let
col7=pink; %let
col8=brown; %let
col9=gray; %let
col10=black;
options formchar="|____|+|___+=|_/\<>*"
pageno=1 nonumber nodate
orientation=landscape center;
ods listing close;
ods rtf file="&outfile.\&filename..rtf"
style=journal;
goptions reset=all ftext=swissl
ftitle=swissl colors=(black) hsize=9
in vsize=6.5 in;
axis1 label=(angle=90 "&ylab.")
order=(-100 to 150
by 50) minor=(n=4);
axis2 label=('
') value=none major=none
minor=none;
legend1 across=1
mode=protect position=(top left inside) cborder=black label=none value=(j=c %do
i=1 %to &nby.;
"&&by&i." %end;);
title1 h=2
"&title.";
proc gplot data=waterfall;
plot &yvar.*n=&byvar.
/ vaxis=axis1 haxis=axis2 legend=legend1;
where &yvar.^=.;
%do
i=1 %to &nby.;
symbol&i.
interpol=Needle value=none width=&barwidth.
color=&&col&i.;
%end;
run;
quit;
ods rtf close;
ods listing;
title; footnote;
proc datasets lib=work;
delete waterfall;
run;
quit;
%mend
waterfall;
Example
First, please note from the first few lines of the macro that you cannot name your input dataset WATERFALL. I guess you could, but you will get an error message, and your macro will not run. This is set up so that you do not accidentally overwrite any desired data.
With that said, let's create a dataset with a more creative name like WATERFALLDATA.
data
waterfalldata;
input
patient $ change gender $ @@;
cards;
001 64 Male 002 0 Female
003 -30 Female 004 -42 Female
005 7 Female 006 19 Male
007 4 Female 008 0 Male
009 0 Male 010 -100 Female
011 -19 Female 012 -3 Female
013 14 Female 014 28 Male
015 -13 Male 016 -67 Female
017 -50 Female 018 59 Female
019 27 Female 020 -24 Male
021 16 Female 022 -54 Female
023 35 Male 024 -69 Male
025 -9 Female 026 61 Female
027 -19 Female 028 95 Male
029 3 Female 030 -5 Male
031 107 Male 032 -2 Female
033 65 Male 034 78 Female
035 65 Female 036 -41 Female
037 12 Female 038 15 Male
039 -13 Male 040 -35 Female
041 -21 Male 042 15 Female
043 35 Female 044 -54 Male
045 21 Female 046 10 Male
047 -100 Male 048 10 Female
;
run;
If you have followed any of my other posts, you will know that I love to write reusable macros, save them in a separate file, and call them using the %inc statement. It makes the current program less crowded
%inc
'D:\Documents and Settings\dbateman\Desktop\Reusable
Macros\waterfall.sas';
%waterfall(dsin=waterfalldata,
yvar=change, byvar=gender,
title=%str(Best
Change in Sum of Longest Diameter by Gender), ylab=%str(Best
Change in Sum LD),
outpath=%str(D:\Documents
and Settings\dbateman\Desktop\Waterfall), filename=%str(waterfall_example),
barwidth=8);
And with that short amount of code, you can create this beautiful thing:
2 comments:
Great!!!
Good Work and Helpfull...
Clinnovo Research Labs provides Internships, Trainings, Job Assistance in the Domains of Clinical Research, Clinical Data Management, SAS, Pharmacovigilance. For more details contact us @ +91-9985492233. www.clinnovo.com
SAS training
Online Pharmacovigilance course
class room pharmacovilance training
SDTM Training
ADAM Training
clinical Research Training
clinical data management Training
Post a Comment