Use of PRXCHANGE to rename variables causes excessive replacement to happen at the end of the variable name
Your issue is how SAS pads character variables to their full length. While most languages have variable-length strings, a SAS character variable is more akin to the SQL char type, without the accompanying varchar type. This gives SAS very good performance in some ways, thanks to predictable row sizes, but it has some consequences. Note that you can get effectively variable-length strings in stored datasets using options compress, but during a data step the values are uncompressed.
In SAS, a string variable of length 10 that is assigned "A" will actually hold the value "A         ": A, plus 9 spaces. Those are not null characters but actual space characters. That usually doesn't matter, as SAS is written in many ways to ignore trailing spaces (so "A" = "A " = "A         "), but in this particular case it does matter, since your regex replaces the space character.
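To make the padding visible, here is a minimal sketch. The variable names (name, raw, fixed, len, clen) and the sample value are made up for illustration; the regex is the one from the question.

```sas
data _null_;
    length name raw fixed $10;
    name = "my var";        /* stored as "my var" plus 4 trailing blanks */
    len  = length(name);    /* position of the last non-blank character */
    clen = lengthc(name);   /* declared length, padding included */

    /* Untrimmed: the 4 padding blanks are replaced as well */
    raw   = prxchange("s/[^a-zA-Z0-9]/_/", -1, name);

    /* Trimmed: only the embedded blank is replaced */
    fixed = prxchange("s/[^a-zA-Z0-9]/_/", -1, trim(name));

    put len= clen= raw= fixed=;
run;
```

The log should show len=6 and clen=10, with raw=my_var____ and fixed=my_var, which is the excess replacement at the end of the name that you are seeing.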
You can use the trim function to remove the trailing spaces during execution, though the result will of course be padded with spaces again once it is stored in a variable.
new_name = prxchange("s/[^a-zA-Z0-9]/_/", -1, trim(name));
Note that trim cannot return a null value. For a blank input it always returns a single space, which the regex would then turn into an underscore. If that's a possibility, you should wrap this in a check for missing (a string variable containing only spaces is missing).
if not missing(name) then do;
    new_name = prxchange("s/[^a-zA-Z0-9]/_/", -1, trim(name));
end;
else new_name = ' ';
There is a trimn function that can return a zero-length string, but there's no reason to call prxchange at all when the value is missing, so the explicit check also saves time.
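For comparison, here is a small sketch of how the two functions behave on a blank value. The variable names with_trim and with_trimn are hypothetical, and this is only meant to illustrate the difference, not to replace the missing check above.

```sas
data _null_;
    length name with_trim with_trimn $10;
    name = " ";   /* blank, i.e. missing */

    /* TRIM leaves one blank, which the regex turns into an underscore */
    with_trim  = prxchange("s/[^a-zA-Z0-9]/_/", -1, trim(name));

    /* TRIMN can return a zero-length string, so there is nothing to replace */
    with_trimn = prxchange("s/[^a-zA-Z0-9]/_/", -1, trimn(name));

    put with_trim= with_trimn=;
run;
```

With trim the result should be a single underscore, while with trimn the result should stay blank. Either way the regex still runs on every row, which the missing check avoids.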