Iterating over a 1-dimensional (column vector) SAS table and storing the entry as a macro-variable to perform some computation

I am currently learning how to use SAS and I have a question on whether it is possible to store an entry from a SAS table as a macro variable, to then do some computation before moving onto the next iteration. That is, obtain values from a table using something like DUMMY_TABLE[i,j], such as what would be possible in R or Python.

Suppose we were to have a SAS table with name, "dummy_tab", with one variable named, "NAME" and entries [a,b,c,d,e,f]. Is it then possible using a DO Loop, or some other method, to be able to iterate over the values in "dummy_tab" and store this as a macrovariable to perform some operation on it before iterating to the next entry?

To give some idea as to why I am even considering such an approach, this is my problem:

I am trying to create a macro which automatically renames variables which have special characters with an underscore. My current approach is to extract the variable names and store it in a 1-dimensional SAS table (column vector).

Then, I will iterate over the entries of this column vector and store each entry into a macro variable (mc1). Upon which, I will use PXRCHANGE to replace all the special characters with an underscore and then store this as a new macro variable (mc2.)

Finally, I will then rename the variable in the original database using the two macro variables I've defined, namely: RENAME &mc1. = &mc2. After which I will go onto the next iteration, and begin the process to rename the second variable, etc.

/* Generate dummy data */
option validvarname = any;
data dummy_data;
    input "dummy?"n "te!st"n;
    datalines;
    1 1
    2 2
    3 3
    ;
run;

/* Obtain variable names as a column vector */
proc contents
    data = dummy_data
    noprint
    out = col_name
        (keep = name);
run;

/* Stores number of variables in dummy_data in macrovariable n_obs*/
proc sql;
    select count(NAME) into : n_obs
    from col_name;
quit;

/* Iterates over column names and stores i_th column name as a macrovariable */
data output_tab;
set col_name;
    
    do i = 1 to &n_obs.;
        
/* Here is where the problem lies: */
        %put i_col_name = col_name[i];
                
        /* PRXCHANGE step and rename of variable to go here */
run;

All of what you're describing is possible, but there's some difference between "possible" and "SAS idiomatic".

The main thing to understand in SAS, that doesn't tend to easily transfer from some other languages (R/Python), is that the "data step" is the place that basically all programming lies, and "columns" are the main variable you act on, in SAS. Macro variables do exist, but their purpose is to store SAS code (or, rather, things that will become SAS code) and not data.

In your case, it's more complicated because you are writing SAS code, so it does make sense to put it in a macro variable - but not at the stage you're talking about. Keep everything in datasets until you're ready to actually do the renaming!

The straightforward way to do that is...

data vars_to_rename;    *this is identical, basically, to the proc contents output - use whichever you prefer;
  set sashelp.vcolumn;
  where libname='WORK' and memname='DUMMY_DATA';
  new_name = translate(name,'__','!?');  *replace this with your PRXCHANGE;
run;

%macro renamer(from=,to=);   *simple one line macro to do the renaming;
  rename "&from."n = &to.;
%mend renamer;

proc sql;  *create one renamer macro call for each variable to be renamed;
  select cats('%renamer(from=',name,',to=',new_name,')')
    into :renamelist separated by ' '
    from vars_to_rename
  ;   *could easily add a WHERE name ne new_name; to only rename vars that actually need it;
quit; 

data new_dataset;  *could do this in PROC DATASETS also to save disk time;
  set dummy_data;
  &renamelist;
run;

The only thing that ends up in the macro language is the actual code you want to write. It's possible to do all of this solely in the macro language, but that's neither idiomatic nor efficient.


Assuming VALIDVARNAME isn't a good fix for you for whatever reason, here's a different approach.

  • Use VCOLUMNS/dictionary.columns data set to source the column names
  • Create a new variable, new_name with the new variable name based on the name variable
  • Use CALL EXECUTE to generate code, instead of macros

Full example below that takes the SASHELP.CLASS data set and renames the variables to VAR001- VAR###. You can add in whatever you need to the new_name calculation to simplify the process.

This renames a data set in place though, if you want to create a new data set something similar can be done.

And if you want functionality more similar to R/Python you can look into IML if you'd like.

*fake data to play with;
data class;
set sashelp.class;
run;

*data set name to fix;
*making code slightly easier to follow;
%let dsn_in = CLASS;

data _null_;
set sashelp.vcolumn end=eof;
where libname = 'WORK' and memname = "&dsn_in";

 if _n_=1 then
    call execute ("proc datasets lib=WORK nodetails nolist; modify &dsn_in; rename ");
 
 *create new variable name here;
 *in this example just renaming it to VAR001 - VAR###;
 *variable name is 'name';
 new_name=catt('VAR', put(_n_, z3.));
 
 *pass new and old name to proc datasets;
 call execute (name);
 call execute ('=');
 call execute (new_name);
 
 *if last record then quit;
 If eof then
    call execute (';run;quit;');
    
    
run;

Partial Log:

NOTE: CALL EXECUTE generated line.
 1         + proc datasets lib=WORK nodetails nolist; modify CLASS;
 1         +                                                        rename
 2         + Name
 3         + =
 4         + VAR001
 5         + Sex
 6         + =
 7         + VAR002
 8         + Age
 9         + =
 10        + VAR003
 11        + Height
 12        + =
 13        + VAR004
 14        + Weight
 15        + =
 16        + VAR005
 17        + ;
 NOTE: Renaming variable Name to VAR001.
 NOTE: Renaming variable Sex to VAR002.
 NOTE: Renaming variable Age to VAR003.
 NOTE: Renaming variable Height to VAR004.
 NOTE: Renaming variable Weight to VAR005.
 17        +  run;
 
 NOTE: MODIFY was successful for WORK.CLASS.DATA.
 17        +      quit;
 
 NOTE: PROCEDURE DATASETS used (Total process time):
       real time           0.00 seconds
       user cpu time       0.00 seconds
       system cpu time     0.01 seconds
       memory              596.56k
       OS Memory           26020.00k
       Timestamp           01/18/2022 05:14:09 PM
       Step Count                        26  Switch Count  0
       Page Faults                       0
       Page Reclaims                     245
       Page Swaps                        0
       Voluntary Context Switches        0
       Involuntary Context Switches      0
       Block Input Operations            0
       Block Output Operations           272

EDIT: to understand logic look at this dataset and then the log:

data temp;
*data _null_;
set sashelp.vcolumn (keep = libname memname name type) end=eof;
where libname = 'WORK' and memname = "&dsn_in";

 if _n_=1 then
    call execute ("proc datasets lib=WORK nodetails nolist; modify &dsn_in; rename ");
 
 *create new variable name here;
 *in this example just renaming it to VAR001 - VAR###;
 *variable name is 'name';
 new_name=catt('VAR', put(_n_, z3.));
 
 *pass new and old name to proc datasets;
 call execute (name);
 call execute ('=');
 call execute (new_name);
 
 *if last record then quit;
 If eof then
    call execute (';run;quit;');
    
    
run;