Long Numbers As A Character String

As part of my dataset, one of the columns is a series of 24-digit numbers.

Example:

bigonumber <- 429382748394831049284934

When I import it using either data.table::fread or read.csv, it shows up as numeric in exponential format (e.g. 4.293827e+23).

options(digits=...) won't work since the number is longer than 22 digits.

When I do

as.character(bigonumber) 

what I get is "4.29382748394831e+23"

Is there a way to get bigonumber converted to a character string and show all of the digits as characters? I don't need to do any math on it, but I do need to search against it and do dplyr joins on it.

I need to do this after import, since the column number varies from month to month.

(Yes, in the perfect world, my upstream data provider would use a hash instead of a long number and a static number of columns that stay the same every month, but I don't get to dictate that to them.)


You can specify colClasses on your fread or read.csv statement.

bignums
429382748394831049284934
429382748394831049284935
429382748394831049284936
429382748394831049284937
429382748394831049284938
429382748394831049284939

bignums <- read.csv("~/Desktop/bignums.txt", sep="", colClasses = 'character')
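If the column's position changes from month to month, a named colClasses vector may help, since it picks the column by its header name rather than by its index. A minimal sketch, assuming the file has a header row and the column is called bignums (the file contents and the id column are made up for illustration):

```r
# Write a small sample file to a temporary location (hypothetical data).
tmp <- tempfile(fileext = ".csv")
writeLines(c("id,bignums",
             "1,429382748394831049284934",
             "2,429382748394831049284935"), tmp)

# Name the column in colClasses so its position in the file does not matter.
# Columns that are not named are still type-converted as usual.
dat <- read.csv(tmp, colClasses = c(bignums = "character"))

str(dat)
```

The resulting character column can then be used directly as a key for searching, merge() or a dplyr join.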

You can suppress the scientific notation with

options(scipen=999)

If you define the number as a numeric literal,

bigonumber <- 429382748394831049284934

you can then try to convert it into a string:

big.o.string <- as.character(bigonumber)

Unfortunately, this does not work because R converts the number to a double, thereby losing precision:

#[1] "429382748394831019507712"

The last digits are not preserved, as pointed out by @SabDeM. Even setting

options(digits=22)

doesn't help: 22 is the largest value that digits accepts, your numbers have 24 digits, and a double only carries about 15 to 17 significant decimal digits in the first place. So it seems that you will have to read the data directly as character or factor. Other answers here show how this can be achieved.
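To illustrate why a numeric column is unsafe for searching or joining, here is a small sketch using two of the sample values from this thread. The variable names are only for illustration.

```r
# Two different 24-digit IDs written as numeric literals.
a_num <- 429382748394831049284934
b_num <- 429382748394831049284935

# Both literals round to the same double, so this comparison is TRUE.
a_num == b_num

# Kept as strings, every digit survives and the IDs remain distinct.
a_chr <- "429382748394831049284934"
b_chr <- "429382748394831049284935"
a_chr == b_chr
nchar(a_chr)

# Exact lookup on a character vector finds only the matching ID.
ids <- c(a_chr, b_chr)
which(ids == "429382748394831049284935")
```

A join on the numeric version would therefore match rows that belong to different IDs, which is why the conversion has to happen at import time rather than afterwards.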

As a side note, there is a package called gmp that allows using arbitrarily large integer numbers. However, there is a catch: they have to be read as characters (again, in order to prevent R's internal conversion into double).

library(gmp)
bigonumber <- as.bigz("429382748394831049284934")
bigonumber
# Big Integer ('bigz') :
# [1] 429382748394831049284934
class(bigonumber)
# [1] "bigz"

The advantage is that you can indeed treat these entries as numbers and perform calculations while preserving all the digits.

bigonumber * 2
# Big Integer ('bigz') :
# [1] 858765496789662098569868

This package and my answer here may not solve your problem, because reading the numbers directly as characters is an easier way to achieve your goal, but I thought I would post this anyway as information for users who may need to use large integers with more than 22 digits.
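For completeness, a sketch of how gmp might be combined with a column that was read as character: keep the strings for searching and joining, and convert to bigz only where arithmetic is needed. This assumes the gmp package is installed (the code is skipped otherwise), and the ids vector is made-up sample data.

```r
# Sample IDs as they would look after reading the column as character.
ids <- c("429382748394831049284934", "429382748394831049284935")

if (requireNamespace("gmp", quietly = TRUE)) {
  # as.bigz() accepts a character vector and keeps every digit.
  z <- gmp::as.bigz(ids)

  # Arithmetic is exact; convert back to character for display or joins.
  next_ids <- as.character(z + 1)
  print(next_ids)
}
```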


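Use digest::digest on bigonumber to generate an md5 hash of the number yourself? (Note that bigonumber is stored as a double here, so the hash is of the rounded value; numbers that differ only in their last digits would get the same hash unless they are read in as character first.)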

bigonumber <- 429382748394831049284934
hash_big <- digest::digest(bigonumber)
hash_big
# "e47e7d8a9e1b7d74af6a492bf4f27193"

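I saw this before I posted my answer, but don't see it here anymore.

Set options(scipen) to a big value so that the number is printed in full rather than in scientific notation (the trailing digits are still lost to double precision, as the output shows):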

options(scipen = 999)
bigonumber <- 429382748394831049284934
bigonumber
# [1] 429382748394831019507712
as.character(bigonumber)
# [1] "429382748394831019507712"