Friday, April 27, 2012

Survival Analysis

The number one reason clinical programmers are employed is to produce presentable output on efficacy.  That is, companies want to know if their drug is working better than the comparative drug.

I have taken time to create a macro for survival analysis output.  It is mostly for progression-free survival (PFS) and overall survival (OS), but you may adapt it to your own needs.

Examples

First, I am creating some random survival times from the exponential distribution.  Sorting the data is not necessary, but it can make the data cleaner to view if you want to open the dataset.  Please note that this is arbitrary data, so the results for the different groups will not be significantly different from each other.

Suppose that we have a disease that is being treated by Drug A (the treatment) and Drug B (the control or current standard of care).  randexp assigns 400 patients to one of the two treatment arms, gives them one of four racial status' (White[1], Black[2], Asian[3], or Pacific Islander[4]), assigns a survival time on the drug, and finally tells us if the survival time is an observed event [1] or a censored event [0].


data randexp;
      call streaminit(1234);
      do i=1 to 400;
            trtgrp = rand('BERNOULLI',.5);
            race = rand('BINOMIAL',.25,4)+1;
            time = rand('EXPONENTIAL');
            event = rand('BERNOULLI',.8);
            output;
      end;
run;
proc sort data=randexp;
      by trtgrp time event;
run;

I have stored the macro code in a separate file, so all I need to do is %include the file and call the macro.  (The full macro can be found at the end of this post.)

%inc 'D:\Documents and Settings\Desktop\efficacy.sas';

Now, I just call the efficacyreport() macro, and SAS will take care of the rest.  First, let me display a table explaining what each of the parameters represent.

Parameter
Description
pop
Name of population being described.
cols
Number of columns that will be displayed.
protocol
Character string for the protocol name.
stratlab1
Character string for ARM A name.
stratlab2
Character string for ARM B name.
cond1
Filtering condition for the first group of output.
cond2
Filtering condition for the second group of output.
cond3
Filtering condition for the third group of output.
Cond4
Filtering condition for the forth group of output.
grouplab1
Character string for the label corresponding to cond1.
Grouplab2
Character string for the label corresponding to cond2.
Grouplab3
Character string for the label corresponding to cond3.
Grouplab4
Character string for the label corresponding to cond4.
event
Character string for the event.  Possible values: PFS/OS
efficacydata
Name of the dataset storing strata, survtime, censorflag, and any variables required for the cond variables above.
strata
Name of the stratification variable stored in efficacydata.  This is a binary variable.  The lesser value always represent Arm B, and the greater value always represents Arm B.
survtime
Name of the survival time variable stored in efficacydata.
censorflag
Name of the censoring variable stored in efficacydata.
censorval
The value from censorflag that represent a censored value.
timemeas
Character string for the time unit measurement.  Possible values: weeks/months
directory
Character string for the output directory.
filename
Character string for the name of the output file.
pgm
Character string for the name of the program being used.





Example 1

This example is only going to display the output for one group: the ITT population.  It will be accompanied with one KM curve for the ITT population.  It will represent the data presented in the output table.

%efficacyreport(pop=%str(ITT), cols=1, protocol=%str(AZ1234), stratlab1=%str(Drug A), stratlab2=%str(Drug B), cond1=%str(1=1), grouplab1=%str(ITT), event=PFS, efficacydata=randexp, strata=trtgrp, survtime=time, censorflag=event, censorval=0, timemeas=months, directory=%str(D:\Documents and Settings\Desktop), filename=%str(Example1), pgm=%str(survival_example.sas));

Output 1
I was not able to get the header or the footer in the screenshot of the output table.






Example 2

This example will now display the output for four group.  The data will come from the same ITT population, but now it will be grouped by race.  The summary table will be accompanied by four KM curves, one for each race group.

Notice that in Example 1, there was only one column represented, so I only had to specify  cond1 and  grouplab1 .  Here, I have multiple columns, so I need multiple  cond and  grouplab entries.

%efficacyreport(pop=%str(ITT), cols=4, protocol=%str(AZ1234), stratlab1=%str(Drug A), stratlab2=%str(Drug B), cond1=%str(race=1), cond2=%str(race=2), cond3=%str(race=3), cond4=%str(race=4), grouplab1=%str(White), grouplab2=%str(Black), grouplab3=%str(Asian), grouplab4=%str(Pacific Islander), event=PFS, efficacydata=randexp, strata=trtgrp, survtime=time, censorflag=event, censorval=0, timemeas=months, directory=%str(D:\Documents and Settings\Desktop), filename=%str(Example2), pgm=%str(survival_example.sas));


Output 2
I was able to get the header in the screenshot of this one, but the footer is still missing.













Macro Code

/***  This macro is called within the efficacyreport() macro.  ***/
/*****************************************************************/

%macro efficacy(set=,event=,cond=,survtime=,censorflag=,censorval=,timemeas=);
      ods listing close;  /* Suppress SAS listing */
      ods output CensoredSummary=cs HomTests=ht Quartiles=qu;  /* Keep datasets with desired output */
            proc lifetest data=efficacy outsurv=survdata method=km;
                  time survtime*censorflag(&censorval.);
                  strata strata;
                  where &cond.;
            run;
      ods listing;  /* Undo suppression of listing */

      /***  HR estimates  ***/
      ods listing close;
      ods output parameterestimates=pe;
            proc phreg data=efficacy;
                  model survtime*censorflag(&censorval.)=strata / ties=exact rl;
                  where &cond.;
            run;
      ods listing;

      data sdfdata&set.;
            set survdata;
            if survival^=. then do;
                  sdf_value=survival;
                  retain sdf_value;  /*  Extend survival values so tail ends are not missing  */
            end;
            if sdf_lcl^=. then do;
                  sdf_lower=sdf_lcl;
                  retain sdf_lower;  /*  Extend lower CI values so tail ends are not missing  */
            end;
            if sdf_ucl^=. then do;
                  sdf_upper=sdf_ucl;
                  retain sdf_upper;  /*  Extend upper CI values so tail ends are not missing  */
            end;
            if _censor_=1 then censvalue=sdf_value;
            if stratum=1 then survrate1=sdf_value;
                  else if stratum=2 then survrate2=sdf_value;
            %if &timemeas.=months %then %do;
                  if survtime=0 then weeks=0;
                        do i=1 to 10;
                              if (i-1)*3 < survtime <= (i*3) then weeks=i*3;
                        end;
            %end;
            %if &timemeas.=weeks %then %do;
                  if survtime=0 then weeks=0;
                        do i=1 to 5;
                              if (i-1)*6 < survtime <= (i*6) then weeks=i*6;
                        end;
                        if 24 < survtime <= 32 then weeks=32;
                        do i=5 to 10;
                              if (i-1)*8 < survtime <= (i*8) then weeks=i*8;
                        end;
            %end;
      run;


      /***  Number of assessments, events, and censored observations  ***/
      proc transpose data=cs out=transcs (where=(_NAME_ in ('Total','Failed','Censored')));
            id stratum;
      run;
      data group1;
            length est2 $15 header $50;
            set transcs;
            groupnum=1;
            if _NAME_='Total' then do;
                  _LABEL_='Number Assessed';
                  linenum=1;
            end;
            if _NAME_='Failed' then do;
                  _LABEL_='Number of Events';
                  linenum=2;
            end;
            if _NAME_='Censored' then linenum=3;
            est1=compress(put(_2,best.));
            est2=compress(put(_1,best.));
            header=_LABEL_;
            drop _NAME_ _LABEL_ _1 _2;
      run;

     
      /***  Quartile estimates and 95% CIs   ***/
      data qu0;
            set qu;
            ci='('||compress(put(lowerlimit,5.1)||','||put(upperlimit,5.1)||')');
            drop strata lowerlimit upperlimit;
      run;
      proc sort data=qu0;
            by percent;
      run;
      proc transpose data=qu0 out=transqu;
            id stratum;
            by percent;
            var estimate ci;
      run;

      data est;
            length est2 $15;
            set transqu;
            where _NAME_='Estimate';
            est1=compress(put(roundz(_2,.01),9.2));
            est2=compress(put(roundz(_1,.01),9.2));
            drop _1 _2;
      run;
      data ci;
            set transqu;
            where _NAME_='ci';
            rename _1=ci2 _2=ci1;
      run;

      data group2;
            length header $50;
            merge est ci;
            by percent;
            groupnum=2;
            if percent=25 then header='    25th Percentile';
            if percent=50 then header='    Median';
            if percent=75 then header='    75th Percentile';
            linenum=percent;
            drop _NAME_ _LABEL_ percent;
      run;
      proc sql;
            insert into group2(header,groupnum,linenum)
                  values("&event.~{super 1} (&timemeas.)",2,1);   * Add superscript for footnote reference;
      quit;


      /***  PFS/OS rate in 3-week/month intervals  ***/
      proc sort data=sdfdata&set.;
            by weeks survtime;
      run;
      data group3a;
            set sdfdata&set.;
            where STRATUM=2;
            by weeks survtime;
            if last.weeks;
            if sdf_value>0 then do;
                  est1=compress(put(sdf_value,6.3));
                  ci1='(' || compress(put(sdf_lower,6.3) || ',' || compress(put(sdf_upper,6.3))||')');
            end;
                  else do;
                        est1=compress(put(sdf_value,6.3));
                        ci1='(.,.)';
                  end;
            keep weeks est1 ci1;
      run;

      data group3b;
            length est2 $15;
            set sdfdata&set.;
            where STRATUM=1;
            by weeks survtime;
            if last.weeks;
            if sdf_value>0 then do;
                  est2=compress(put(sdf_value,6.3));
                  ci2='(' || compress(put(sdf_lower,6.3) || ',' || compress(put(sdf_upper,6.3))||')');
            end;
                  else do;
                        est2=compress(put(sdf_value,6.3));
                        ci2='(.,.)';
                  end;
            keep weeks est2 ci2;
      run;
     
      data group3;
            length est2 $15 header $50;
            merge group3a group3b;
            by weeks;
            header=put("&event. Rate through " || compress(put(weeks,2.)) || " &timemeas.",90.);
            if est1^=' ' then tempest1=est1;
            if est2^=' ' then tempest2=est2;
            if ci1^=' ' then tempci1=ci1;
            if ci2^=' ' then tempci2=ci2;
            retain tempest1 tempest2 tempci1 tempci2;
            if est1=' ' and tempest1^=' ' then est1=tempest1;
            if est2=' ' and tempest2^=' ' then est2=tempest2;
            if ci1=' ' and tempci1^=' ' then ci1=tempci1;
            if ci2=' ' and tempci2^=' ' then ci2=tempci2;
            groupnum=3;
            linenum=weeks;
            %if &timemeas.=months %then %do;
                  if weeks in (3,6,9,12);  /*  Only time intervals requested for output  */
            %end;
            %if &timemeas.=weeks %then %do;
                  if weeks in (6,12,18,24);  /*  Only time intervals requested for output  */
            %end;
      run;


      /***  Hazard ratio & CI, log-rank p-value  ***/
      data group4;
            length est2 $15 header $50;
            set pe (in=a) pe (in=b) ht (in=c) ht (in=d);
            groupnum=4;
            if a then do;
                  header='Hazard Ratio';
                  est2=compress(put(hazardratio,5.3));
                  linenum=1;
            end;
                  else if b then do;
                        header='    95% CI';
                        est2='(' || compress(put(hrlowercl,5.3) || ', ' || put(hruppercl,5.3) || ')');
                        linenum=2;
                  end;
                  else if c & test='Log-Rank' then do;
                        header='    Log-Rank p-value (one-sided)';
                        est2=compress(put(probchisq/2,PVALUE6.4));
                        linenum=3;
                  end;
                  else if d & test='Log-Rank' then do;
                        header='    Log-Rank p-value (two-sided)';
                        est2=compress(put(probchisq,PVALUE6.4));
                        linenum=4;
                  end;
                  else delete;
            keep header est2 groupnum linenum;
      run;

      /* Combine all data and insert line breaks */
      data final&set.;
            set group1 group2 group3 group4;
            rename est1=est1&set. est2=est2&set.;
      run;
      proc sql;
            insert into final&set. (groupnum)
                  values(1.5) values(2.5) values(3.5);
      quit;
      proc sort data=final&set.;
            by groupnum linenum;
      run;
%mend efficacy;

%macro efficacyreport(pop=,cols=,protocol=,stratlab1=Arm A,stratlab2=Arm B,cond1=,cond2=,cond3=,cond4=,grouplab1=,grouplab2=,grouplab3=,grouplab4=,event=,efficacydata=,strata=,survtime=,censorflag=,censorval=,timemeas=,directory=,filename=,pgm=);
      %if &event.=PFS %then %let main=Progression-Free Survival;
      %if &event.=OS %then %let main=Overall Survival;

      /* normalize efficacy dataset */
      data efficacy;
            set &efficacydata.;
            strata=&strata.;
            survtime=&survtime.;
            censorflag=&censorflag.;
      run;
      proc sql noprint;
            select (ceil(max(survtime)/5))*5 into :xaxis
            from efficacy;
      quit;

      /* Call the EFFICACY macro for every condition */
      %do i=1 %to &cols.;
            %efficacy(set=&i.,event=&event.,cond=&&cond&i.,survtime=survtime,censorflag=censorflag,censorval=&censorval.,timemeas=&timemeas.);
      %end;

      /***  Prepare final output  ***/
      data report;
            merge
                  %do i=1 %to &cols.;
                        final&i.
                  %end;
            ;
            by groupnum linenum;
            drop ci1 ci2 weeks tempest1 tempest2 tempci1 tempci2;
      run;

      /* Prepare standardized output and output summary table */
      options formchar="|____|+|___+=|_/\<>*" pageno=1 nonumber nodate orientation=landscape center;
      ods listing close;
      ods rtf file="&directory.\&filename..rtf" style=journal;
      ods escapechar='~';
      option noquotelenmax;

      %let user=%sysfunc(upcase(&sysuserid.));
      %let proddt=%sysfunc(date(),worddate.);

      title1 j=left font='Arial' 'ABC Company, Inc.' j=right "Page ~{pageof}";
      title2 j=left font='Arial' "Protocol: &protocol.";
      footnote1 j=left font='Arial' "Program: &pgm..sas by &user." j=right "Date Produced: &proddt.";
      proc report data=report nowindows headline split='^' style(column)={asis=on}
                  style(report)={font=('Arial',9.5pt,bold italic) pretext="&main. \line\ (&pop. Population)"};
            column (header
                  %do j=1 %to &cols.;
                        ("&&grouplab&j. Analysis Set^" est1&j. est2&j.)
                  %end;
                        );
            define header / display left ' ' style=[CellWidth=20%];
            %do i=1 %to &cols.;
                  define est1&i. / display center "&stratlab1." style=[CellWidth=9%];
                  define est2&i. / display center "&stratlab2." style=[CellWidth=9%];
            %end;
      run;

      %let margin=%sysfunc(cats(%eval((100-((&cols.*18)+20))/2),%));
      %let foottime=%sysfunc(propcase(%sysfunc(compress(&timemeas.,'s'))));
      %if &timemeas.=months %then %let footcalc=30.4375;
      %if &timemeas.=weeks %then %let footcalc=7;

      %if &event.=PFS %then %do;
            %let foottext1=Progression-Free Survival is the number of &timemeas. from the randomization date to the day the subject experienced an event of radiographically or clinically defined disease progression or death;
            %let foottext2= or the censoring day of the last RECIST evaluation.  &foottime. is calculated as ((date of event[censoring] - randomization date) + 1)/&footcalc..;
      %end;
      %if &event.=OS %then %do;
            %let foottext1=Overall Survival is the number of &timemeas. from the randomization date to the day the subject died or the censoring day that the subject was last known to be alive.;
            %let foottext2=&foottime. is calculated as ((date of death/censoring - randomization date) + 1)/&footcalc..;
      %end;
      ods rtf text="~S={leftmargin=&margin. rightmargin=&margin. just=l font=('Arial', 9pt)}~{super 1} &foottext1. &foottext2.";
      ods rtf text="~S={leftmargin=&margin. rightmargin=&margin. just=l font=('Arial', 9pt)}NOTES: &main. and &event. Rate based on Kaplan-Meier estimates.";
      ods rtf text="~S={leftmargin=&margin. rightmargin=&margin. just=l font=('Arial', 9pt)}~n KEY:~n &event. = &main.~n CI = Confidence Interval";

      /* Output Kaplan-Meier Curves */
      %let timeaxis=%sysfunc(propcase(&timemeas.));
      goptions reset=all rotate=landscape ftext=swissl ftitle=swissl colors=(black) htext=3 hsize=9 in vsize=6.5 in;
      axis1 order=(0 to &xaxis. by 5) minor=(number=4) label=(height=1.25 "&timeaxis. from Randomzation") value=(height=1.25);
      axis2 order=(0 to 1 by 0.1) minor=none label=(a=90 height=1.25 justify=center "Probability of &main.") value=(height=1.25);
      legend1 position=(top right inside) across=1 label=none value=(height=.75 j=center "&stratlab2." j=center "&stratlab1." j=center "Censored Observation");

      title; footnote;
      %do i=1 %to &cols.;
            proc sort data=sdfdata&i.;
                  by stratum descending survival;
            run;

            title1 h=1.5 justify=center "Kaplan-Meier &main. Curves - &&grouplab&i.";
            title2 h=1.25 justify=center "(&pop. Population)";
            proc gplot data=sdfdata&i.;
                  plot survrate1*survtime survrate2*survtime censvalue*survtime / overlay haxis=axis1 vaxis=axis2 legend=legend1;
                  symbol1 i=stepj l=20 color=black;
                  symbol2 i=stepj l=1 color=black;
                  symbol3 i=none v=plus color=black;
            run;
            quit;
      %end;

      ods rtf close;
      ods listing;
      title; footnote;

      /* Clean up your work */
      proc datasets lib=work;
            delete ci cs efficacy est final1 final2 final3 final4 group1 group2 group3 group3a group3b group4
                        ht pe qu qu0 report survdata transcs transqu sdfdata1 sdfdata2 sdfdata3 sdfdata4;
      run;
      quit;
%mend efficacyreport;









No comments: