What specifically are the dangers of eval(parse(...))?
Solution 1:
Most of the arguments against eval(parse(...))
arise not because of security concerns, after all, no claims are made about R being a safe interface to expose to the Internet, but rather because such code is generally doing things that can be accomplished using less obscure methods, i.e. methods that are both quicker and more human parse-able. The R language is supposed to be high-level, so the preference of the cognoscenti (and I do not consider myself in that group) is to see code that is both compact and expressive.
So the danger is that eval(parse(..))
is a backdoor method of getting around lack of knowledge and the hope in raising that barrier is that people will improve their use of the R language. The door remains open but the hope is for more expressive use of other features. Carl Witthoft's question earlier today illustrated not knowing that the get
function was available, and the question he linked to exposed a lack of understanding of how the [[
function behaved (and how $
was more limited than [[
). In both cases an eval(parse(..))
solution could be constructed, but it was clunkier and less clear than the alternative.
Solution 2:
The security concerns only really arise if you start calling eval on strings that another user has passed to you. This is a big deal if you are creating an application that runs R in the background, but for data analysis where you are writing code to be run by yourself, then you shouldn't need to worry about the effect of eval
on security.
Some other problems with eval(parse(
though.
Firstly, code using eval-parse is usually much harder to debug than non-parsed code, which is problematic because debugging software is twice as difficult as writing it in the first place.
Here's a function with a mistake in it.
std <- function()
{
mean(1to10)
}
Silly me, I've forgotten about the colon operator and created my vector wrongly. If I try and source this function, then R notices the problem and throws an error, pointing me at my mistake.
Here's the eval-parse version.
ep <- function()
{
eval(parse(text = "mean(1to10)"))
}
This will source, because the error is inside a valid string. It is only later, when we come to run the code that the error is thrown. So by using eval-parse, we've lost the source-time error checking capability.
I also think that this second version of the function is much more difficult to read.
The other problem with eval-parse is that it is much slower than directly executed code. Compare
system.time(for(i in seq_len(1e4)) mean(1:10))
user system elapsed
0.08 0.00 0.07
and
system.time(for(i in seq_len(1e4)) eval(parse(text = "mean(1:10)")))
user system elapsed
1.54 0.14 1.69
Solution 3:
Usually there's a better way of 'computing on the language' than working with code-strings; evalparse heavy-code needs a lot of safe-guarding to guarantee a sensible output, in my experience.
The same task can usually be solved by working on R code as a language object directly; Hadley Wickham has a useful guide on meta-programming in R here:
The defmacro() function in the gtools library is my favourite substitute (no half-assed R pun intended) for the evalparse construct
require(gtools)
# both action_to_take & predicate will be subbed with code
F <- defmacro(predicate, action_to_take, expr =
if(predicate) action_to_take)
F(1 != 1, action_to_take = print('arithmetic doesnt work!'))
F(pi > 3, action_to_take = return('good!'))
[1] 'good!'
# the raw code for F
print(F)
function (predicate = stop("predicate not supplied"), action_to_take = stop("action_to_take not supplied"))
{
tmp <- substitute(if (predicate) action_to_take)
eval(tmp, parent.frame())
}
<environment: 0x05ad5d3c>
The benefit of this method is that you are guaranteed to get back syntactically-legal R code. More on this useful function can be found here:
Hope that helps!
Solution 4:
In some programming languages,
eval()
is a function which evaluates a string as though it were an expression and returns a result; in others, it executes multiple lines of code as though they had been included instead of the line including the eval. The input to eval is not necessarily a string; in languages that support syntactic abstractions (like Lisp), eval's input will consist of abstract syntactic forms. http://en.wikipedia.org/wiki/Eval
There are all kinds of exploits that one can take advantage of if eval is used improperly.
An attacker could supply a program with the string "session.update(authenticated=True)" as data, which would update the session dictionary to set an authenticated key to be True. To remedy this, all data which will be used with eval must be escaped, or it must be run without access to potentially harmful functions. http://en.wikipedia.org/wiki/Eval
In other words, the biggest danger of eval()
is the potential for code injection into your application. The use of eval()
can also cause performance issues in some languages depending on what is being used for.
Specifically in R, it's probably because you can use get()
in place of eval(parse())
and your results will be the same without having to resort to eval()