Max Length for a Vector in R

Solution 1:

If you're willing to work with the development version of R, you can get experimental support for this feature. From http://stat.ethz.ch/R-manual/R-devel/doc/html/NEWS.html:

LONG VECTORS

There are the beginnings of support for vectors longer than 2^31 - 1 elements on 64-bit platforms. This applies to raw, logical, integer, double, complex and character vectors, as well as lists. (Elements of character vectors remain limited to 2^31 - 1 bytes.)

All aspects are currently experimental.

What can be done with such vectors is currently somewhat limited, and most operations will return the error ‘long vectors not supported yet’. They can be serialized and unserialized, coercion, identical() and object.size() work and means can be computed. Their lengths can be get and set by xlength(): calling length() on a long vector will throw an error.

Most aspects of indexing are available. Generally double-valued indices can be used to access elements beyond 2^31 - 1.

See the link for more details. I haven't experimented with this at all myself, so I can't comment on whether it is practically useful yet or not.

If you go to http://developer.r-project.org/R_svnlog_2011 (and http://developer.r-project.org/R_svnlog_2012) and search for "long vectors" you can get a sense of the work that is going on.
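
To make the 2^31 - 1 limit concrete, here is a minimal sketch in base R. It is my own illustration rather than anything from the NEWS file, and the variable names are made up. It only inspects the integer limit and never allocates a long vector, so it should be safe to run on any build.

```r
# Largest value an R integer can hold; ordinary vector lengths are capped here.
max_int <- .Machine$integer.max
print(max_int == 2^31 - 1)

# One past the limit overflows in integer arithmetic and yields NA
# (R also emits a warning, suppressed here to keep the output clean).
overflowed <- suppressWarnings(max_int + 1L)
print(is.na(overflowed))

# A double holds the same value exactly, which is why double-valued
# indices can address elements beyond 2^31 - 1.
beyond <- max_int + 1
print(is.double(beyond))
print(format(beyond, scientific = FALSE))
```

The point of the last step is that doubles represent every whole number up to 2^53 exactly, so they have plenty of room to act as indices into vectors longer than an integer can count.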

Solution 2:

Here are some more details to complement Solution 1. The limitations seem to be inherited from the lower-level languages used to build R, apparently the FORTRAN code in particular. Transitioning R so that it can take full advantage of 64-bit addressing is therefore going to be a major project.

From the R-admin manual:

Even on 64-bit builds of R there are limits on the size of R objects (see help("Memory-limits")), some of which stem from the use of 32-bit integers (especially in FORTRAN code). On all builds of R, the maximum length (number of elements) of a vector is 2^31-1, about 2 billion, and on 64-bit builds the size of a block of memory allocated is limited to 2^34-1 bytes (8GB). It is anticipated these will be raised eventually* but the need for 8GB objects is (when this was written in 2011) exceptional.

(There's also a wry footnote in the manual, where I've put a *, noting that "this comment has been in the manual since 2005". :)
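
As a quick sanity check on those figures, here is a short arithmetic sketch. It is my own and assumes 8-byte doubles; the variable names are just for illustration. It shows how the element limit and the block-size limit fit together for a numeric vector.

```r
max_len <- 2^31 - 1           # maximum vector length quoted above
bytes_per_double <- 8         # assumption: a double occupies 8 bytes
double_bytes <- max_len * bytes_per_double

block_limit <- 2^34 - 1       # block-size limit quoted above, in bytes

# A maximal-length double vector needs 2^34 - 8 bytes of data,
# which fits just under the quoted block limit.
print(double_bytes <= block_limit)

# The block limit expressed in GiB (2^30 bytes).
print(block_limit / 2^30)
```

Note that 2^34 bytes works out to about 16 GiB, so the "8GB" in the quoted passage does not seem to match its own 2^34-1 figure. I can't say which of the two the manual intended.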