long/bigint/decimal equivalent datatype in R
What datatype choices do we have for handling large numbers in R? By default, integers in R are 32-bit, so bigint values from SQL Server, as well as any large numbers passed from Python via rpy2, get mangled.
> 123456789123
[1] 123456789123
> 1234567891234
[1] 1.234568e+12
When reading a bigint value of 123456789123456789 via RODBC, it comes back as 123456789123456784 (note the last digit), and the same number deserialized via RJSONIO comes back as -1395630315L (which looks like an additional bug/limitation in RJSONIO).
> fromJSON('[1234567891]')
[1] 1234567891
> fromJSON('[12345678912]')
[1] -539222976
I actually do need to handle large numbers coming from JSON, so given RJSONIO's limitation I may have no workaround short of finding a better JSON library (which does not seem like an option right now). I would like to hear what experts have to say about this, and about large-number handling in R in general.
Solution 1:
See help(integer):
Note that on almost all implementations of R the range of
representable integers is restricted to about +/-2*10^9: ‘double’s
can hold much larger integers exactly.
so I would recommend using numeric (i.e. 'double') -- a double-precision number.
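A caveat: a double has a 53-bit significand, so it holds integers exactly only up to 2^53 (about 9.007e15). The mangled RODBC value in the question is exactly this rounding at work, as a quick base-R sketch shows:

```r
# A double stores integers exactly only up to 2^53.
big <- 123456789123456789          # the literal is parsed as a double
sprintf("%.0f", big)               # "123456789123456784" -- last digits rounded
2^53 == 2^53 + 1                   # TRUE: adding 1 is absorbed at this magnitude
2^52 == 2^52 + 1                   # FALSE: still exact below 2^53
```

So numeric is fine for integers up to about 15-16 digits, but not for an 18-digit bigint.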
Solution 2:
I understood your question a little differently from the two who posted before I did.
If R's largest default type is not big enough for you, you have a few choices (disclaimer: I have used each of the libraries mentioned below, though through other language bindings or the native library rather than through R).
The Brobdingnag package: stores values as natural logarithms; like Rmpfr, it is implemented using R's S4 class structure. I'm always impressed by anyone whose work requires numbers of this scale.
library(Brobdingnag)
googol <- as.brob(1e100)
The gmp package: R bindings to the venerable GMP (GNU Multiple Precision Arithmetic Library). GMP must go back 20 years, because I used it at university. Its motto is "Arithmetic Without Limits," which is a credible claim: integers, rationals, and floats, right up to the limits of the RAM on your box.
library(gmp)
x = as.bigq(8000, 21)
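For exact integers specifically, gmp's as.bigz parses arbitrarily long values from strings, and arithmetic on bigz values never rounds. A small sketch (assumes the gmp package and native library are installed):

```r
library(gmp)

# Pass the value as a string so it never touches a double.
x <- as.bigz("123456789123456789")
x + 1            # exact: 123456789123456790, no mangled last digit
as.bigz(2)^100   # 1267650600228229401496703205376
```

Parsing from a string is the important step: writing the bare literal would round it to a double before gmp ever sees it.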
The Rmpfr package: R bindings that interface to both gmp (above) and MPFR (MPFR is in turn a contemporary multiple-precision floating-point library built on gmp). I have used MPFR's Python bindings ('bigfloat') and can recommend them highly. This might be your best option of the three, given its scope, given that it appears to be the most actively maintained, and finally given what appears to be the most thorough documentation.
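A minimal Rmpfr sketch (assuming the package and the native MPFR library are installed): construct numbers from strings with an explicit precision, so an 18-digit value survives intact.

```r
library(Rmpfr)

# 120 bits of precision is more than enough for an 18-digit integer.
x <- mpfr("123456789123456789", precBits = 120)
x + 1    # exact; compare with the rounded RODBC result in the question
```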
Note: to use either of the last two, you'll need to install the native libraries, GMP and MPFR.