This is by no means ground breaking code. It is not even an ingenious idea. But maybe it will help someone out there save a little time with trying to find a way to simply append the date to your output files.
When I use dates in file names (or even folder names), I like to use pure numerals rather than alphanumeric for sorting purposes only. For example, I could name a file "Updated_Output_25APR2013.csv", but this does not like to be sorted properly; therefore, making it difficult to find files quickly. In other words, usingly strictly numbers for the date, the same file would read "Updated_Output_2013-04-25.csv".
Suppose you have 3 updated files named
- Updated_Output_2Jan2013.csv
- Updated_Output_11Jan2013.csv
- Updated_Output_1Feb2013.csv
The would show up in your output folder sorted as:
- Updated_Output_1Feb2013.csv
- Updated_Output_2Jan2013.csv
- Updated_Output_11Jan2013.csv
This is not the correct ordering because Feb. 1, 2013 does not come before Jan. 2, 2013. In fact, it should be on the bottom. This is because the date is a character string and is sorted alphabetically.
Now suppose you have the same files, but you used a different date convention:
- Updated_Output_2013-01-02.csv
- Updated_Output_2013-01-11.csv
- Updated_Output_2013-02-01.csv
The ordered file names will come out in the same order. Note that you must use YYYY-MM-DD or things will not sort chronologically. If you did MM-DD-YYYY like Americans like to speak, things can go amiss. They will sort fine for the examples above, but if you throw in different years, you can see that "Updated_Output_12-31-2013.csv" will come after "Updated_Output_02-01-2013.csv" because alphabetically, a "0" comes before a "1". Thus, "02" (or February) will ALWAYS come before "12" (or December) regardless of the year.
Some might ask, "Well, you can sort files in a folder by the 'Date modified' column, so why do we need the date to sort things?" There are two issues with that. First this is the date modified, not the date created. If you were to go into the Jan. 1 file and make any edit and then save it, the date modified would change to today. Now when you sort, it shows up as the most recent when it is actually the oldest file. Second, if you are storing a whole bunch of files with different files names, things can become very jumbled very quickly. You may be able to assume that no changes are ever made to the files, so the date modified will remain the same as the date created; however, with different files being added in between updates with toally unrelated names, you can see how they may be sorted in date order, but they will not be grouped by file name.
It is best to use the same base name for the file and then append the date to it. That way, when you sort by Name, everything is grouped together, and all similar file names are sorted chronologically.
Code
First, let me show you my little snippet of code, then we can walk through what it means. This code would obviously be put at the end of a larger program. You would do your data manipulation and then add:data _null_;
year=year(today());
month=month(today());
day=day(today());
date=put(year,best.)
|| '-' || put(month,z2.)
|| '-' || put(day,z2.);
call
symputx('date',date);
run;
%put
~~~&date.~~~;
proc
export data=safety
outfile="C:\Temp\Safety Update - &date..csv"
dbms=csv replace;
run;
Explanation
data _null_ is a way to do data-step processing without taking the time to actually create a dataset. It is kind of like a function in Excel where you give the formula some parameters, and it gives back a value. None of the processing is saved anywhere. In this case, we are going to create a macro variable, but we need to do a little processing in order to get it.SAS has some nice date functions. The first is today(). You can get the same result by using date(). The function takes in no parameters, but it returns today's date. So, if we submitted that code today (April 25, 2013), then you would get the SAS date for today. Remember that SAS dates are the number of days from January 1, 1960, so today would be stored as 19473.
The other important date function used here are year(), month(), and day(). These return just what you would expect: the year of a given date, the numeric month of a given date, or the day of a given date, respectively. Thus, you can see that the first 3 lines inside the data _null_ statement are saving the numeric parts of today's date.
The 4th line is concatenating ("||" is the concatenation operator in SAS) the strings to appear as we want. I have to use the put() function to change a numeric value to a character value. The z2. format will change the first 9 months of the year or the first 9 days of the month to 01, 02, 03, etc. in stead of 1, 2, 3, etc. It just makes things look nicer. It essetially says, "I want my value to be 2 digits long. If the value has fewer digits than that, then add zeros to the front to make it 2 digits long."
Finally, using the call symputx() function, we can take the value stored in the variable date and store it in a macro variable (which I also named date).
Just for fun, I used the %put ~~~&date.~~~; line to print to the log the value of the macro variable &date. This way, I can be sure that it will come out correctly without any spaces. For example, try running the same code, but use call symput('date',date); instead of call symputx('date',date);. Notice that I just removed the x from the end of the function name. call symputx() removes leading and trailing blanks. My %put statement just makes sure that there are not leading or trailing blanks.
The last thing to do is to output your results and simply adding the &date macro variable in the file name. Tada! Now you can run your same code over and over with updated data, and not tweaks are required to manually change the file name.
Well, this certainly turned into a much longer post that I imagined, but I hope it is a little bit of help in better understanding the how and why of appending dates to file names.