Python interface for R Programming Language [duplicate]
As pointed out by @lgautier, there is already another answer on this subject. I leave my answer here as it adds the experience of approaching R as a novice, knowing Python first.
I use both Python and R and sympathise with your need as a newcomer to R.
Since any answer you get will be subjective, I summarise a few points from my experience:
- I use rpy2 as my interface and find it is 'Pythonic', stable, predictable, and effective enough for my needs. I have not used the other packages so this is not a comment on them, rather on the merits of rpy2 itself.
- BUT do not expect an easy way of using R from Python without learning both. I find that an interface between the two languages makes coding easy when you know both, but makes debugging a nightmare for someone who is weak in one of them.
My advice:
- For most applications, Python has packages that allow you to do most of the things that you want to do in R, from data wrangling to plotting. Check out SciPy, NumPy, pandas, BioPython, matplotlib and other scientific packages, or even the full Anaconda or Enthought Python distributions. This allows you to stay within the Python environment and gives you most of the power that you need.
- At the same time, you will want R's vast range of specialised packages, so spend some time learning it in an interactive environment. I found it almost impossible to master even basic R on the command line, but RStudio and the tutorials at Quick-R and Learn-R got me going very fast.
Once you know both, then you will do magic with rpy2 without the horrors of cross-language debugging.
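To give a feel for what the interface looks like, here is a minimal sketch that calls R's mean() from Python through rpy2. It assumes rpy2 and a working R installation are present; the helper name r_mean is mine and purely illustrative.

```python
def r_mean(values):
    """Return the mean of `values`, computed by R's mean() through rpy2.

    Assumes rpy2 and a working R installation are available. The import is
    kept inside the function so this module still loads when they are not.
    """
    import rpy2.robjects as ro

    r_vector = ro.FloatVector(values)   # Python sequence -> R numeric vector
    r_mean_function = ro.r["mean"]      # look up the R function by its R name
    result = r_mean_function(r_vector)  # R returns a numeric vector of length 1
    return result[0]


if __name__ == "__main__":
    print(r_mean([1.0, 2.0, 3.0]))
```

Note that even in this small case you need to know that R hands back a vector rather than a scalar, which is the kind of detail that makes cross-language debugging hard if you only know one side.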
New Resources
Update on 29 Jan 2015
This answer has proved popular and so I thought it would be useful to point out two more recent resources:
- Ralph Heinkel gave a great talk on this subject at EuroPython 2014. The video on Combining the powerful worlds of Python and R is available on the EuroPython YouTube channel. Quoting him:
The triplet R, Rserve, and pyRserve allows the building up of a network bridge from Python to R: Now R-functions can be called from Python as if they were implemented in Python, and even complete R scripts can be executed through this connection.
- It is now possible to combine R and Python using rmagic in IPython/Jupyter, greatly easing the work of producing reproducible research and notebooks that combine both languages.
A question about comparing rpy2, pyrserve, and pyper with each other was answered on the site earlier.
Regarding the number of contributors, I'd say that all three have a relatively small number. A site like Ohloh can give a more detailed answer.
How actively a package is used is tricky to determine. One indication might be the number of downloads; another might be the number of posts on mailing lists, the number of questions on a site like Stack Overflow, the number of other packages using or citing it, or the number of CVs or job openings mentioning the package. As much as I believe that I could give a fair evaluation, I might also be seen as having a conflict of interest. ;-)
All three have their pros and cons. I'd say that you should base your choice on those.
My personal experience has been with Rpy, not Rpy2. I used it for a while, but dropped it in favor of using system commands. A typical case for me was running a FORTRAN model using Python scripts, and post-processing with R. In my experience the easiest solution was to create a command line tool using R, which is quite straightforward (at least under Linux). The command line tool could be executed in the root of the model run, and the script would produce a set of R objects and plots in an Routput directory. The advantage of disconnecting R and Python in this way was that I could easily debug the R code separately from the Python code.
I think Rpy really shines when a lot of back and forth communication between R and Python is needed. But if the functionality is nicely separable, and the overhead of disk i/o is not too bad, I would stick to system calls. See ?system for more information regarding system calls, and Rscript for running R scripts as a command line tool.
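A rough sketch of the Python side of that approach, using the standard subprocess module to launch Rscript in the root of a model run. The script name postprocess.R and the directory model_run are made-up placeholders, and it assumes Rscript is on the PATH.

```python
import shutil
import subprocess
from pathlib import Path


def build_rscript_command(script, *script_args):
    """Build the argument list for running an R script with Rscript.

    The script path is made absolute because the command is later run
    from a different working directory (the model run root).
    """
    return ["Rscript", str(Path(script).resolve()), *map(str, script_args)]


def run_postprocessing(script, model_dir, *script_args):
    """Run the R script from within model_dir and return the finished process.

    Raises RuntimeError if Rscript is not installed, and
    subprocess.CalledProcessError if the R script exits with an error.
    """
    if shutil.which("Rscript") is None:
        raise RuntimeError("Rscript was not found on PATH")
    command = build_rscript_command(script, *script_args)
    return subprocess.run(
        command,
        cwd=model_dir,        # the R script writes its output relative to this
        check=True,           # turn a failing R script into a Python exception
        capture_output=True,  # keep R's stdout/stderr for inspection
        text=True,
    )


if __name__ == "__main__":
    # Hypothetical names: adjust to your own script and model run directory.
    finished = run_postprocessing("postprocess.R", "model_run")
    print(finished.stdout)
```

Because the R script is an ordinary command line tool, you can also run it by hand with Rscript to debug it, without Python being involved at all.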
Regarding your wish to write R code in a Python way, this is not possible as all the solutions require you to write R code in R syntax. For Rpy this means R syntax, but a little different (no "." for example). I agree with @gauden that there is no shortcut in using R through Rpy.