cut() function puts all data in a single interval
I'm trying to cut my data to build a frequency distribution, but after calling cut() all the data is assigned to one interval:
points <- 224 * 0:5
cut_data <- cut(rs$amount, points ,dig.lab = 10)
My rs$amount data:
integer64
[1] 517 200 391 186 262 1020 791 124 437 238 896 212 144 529 523 190
And I get something like this
> cut_data
[1] (0,224] (0,224] (0,224] (0,224] (0,224] (0,224] (0,224] (0,224] (0,224] (0,224] (0,224] (0,224] (0,224] (0,224]
[15] (0,224] (0,224]
Levels: (0,224] (224,448] (448,672] (672,896] (896,1120]
What am I doing wrong?
EDIT:
result of dput() on rs$amount
structure(c(2.55431938899924e-321, 9.88131291682493e-322, 1.93179667523927e-321,
9.18962101264719e-322, 1.29445199210407e-321, 5.03946958758071e-321,
3.90805925860426e-321, 6.12641400843146e-322, 2.15906687232625e-321,
1.17587623710217e-321, 4.42682818673757e-321, 1.04741916918344e-321,
7.11454530011395e-322, 2.61360726650019e-321, 2.58396332774972e-321,
9.38724727098368e-322), class = "integer64")
EDIT2:
Casting rs$amount as numeric helped with the issue
cut_data <- cut(as.numeric(rs$amount),points,dig.lab = 10)
I think you have two alternatives: use cut(as.numeric(vec), ...) or findInterval.
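As for why cut() misbehaves: bit64 stores each 64-bit integer in the bits of a double, which is why dput() shows values around 1e-321. My guess, consistent with that output, is that cut() ends up reading those raw doubles, and every one of them is greater than 0 and far below 224. Here is a base-R sketch of that idea; it does not need bit64, raw_view is a made-up name, and multiplying by 2^-1074 only imitates the bit pattern a small positive integer would have when read as a double:

```r
points <- 224 * 0:5
amounts <- c(517, 200, 391, 1020)

# Imitate how a small positive 64-bit integer looks when its bits are
# read as a double: n times the smallest subnormal number, 2^-1074.
raw_view <- amounts * 2^-1074

# Every imitated value is tiny but positive, so it lands in (0,224].
binned <- cut(raw_view, points, dig.lab = 10)
table(binned)

# Binning the real magnitudes spreads them across the intervals.
table(cut(amounts, points, dig.lab = 10))
```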
as.numeric
If you are not concerned about the theoretical precision loss when converting integer64 to numeric (it may be hard to find a case where this actually happens), then you can convert to numeric:
cut(as.numeric(vec), points ,dig.lab = 10)
# [1] (448,672] (0,224] (224,448] (0,224] (224,448] (896,1120] (672,896] (0,224] (224,448] (224,448] (672,896] (0,224] (0,224] (448,672] (448,672] (0,224]
# Levels: (0,224] (224,448] (448,672] (672,896] (896,1120]
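To put a number on that precision caveat: a double has a 53-bit significand, so integers are represented exactly only up to 2^53 (roughly 9e15). A quick base-R check of that limit, meant only as an illustration (the amounts in the question are nowhere near it):

```r
# Largest power of two up to which every integer is exactly representable.
limit <- 2^53

limit - 1 == limit   # FALSE: values below the limit stay distinct
limit + 1 == limit   # TRUE: 2^53 + 1 rounds back to 2^53
```

If your integer64 values can exceed that magnitude, the findInterval route is probably the safer choice.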
findInterval
table(cut(vec, points ,dig.lab = 10))
# (0,224] (224,448] (448,672] (672,896] (896,1120]
# 16 0 0 0 0
table(findInterval(vec, points))
# 1 2 3 4 5
# 6 4 3 1 2
You can mock this to produce similarly formatted factors manually. findInterval uses left-closed intervals by default, so pass left.open = TRUE to match the (a,b] bins that cut produces:
labels <- sprintf("(%i,%i]", points[-length(points)], points[-1])
labels
# [1] "(0,224]" "(224,448]" "(448,672]" "(672,896]" "(896,1120]"
factor(labels[findInterval(vec, points, left.open = TRUE)], levels = labels)
# [1] (448,672] (0,224] (224,448] (0,224] (224,448] (896,1120] (672,896] (0,224] (224,448] (224,448] (672,896] (0,224] (0,224] (448,672] (448,672] (0,224]
# Levels: (0,224] (224,448] (448,672] (672,896] (896,1120]
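One detail to watch when swapping cut() for findInterval() is how values that sit exactly on a break are handled. A small base-R comparison on plain numeric copies of the amounts (x, by_cut, by_default and by_left_open are names I made up for this sketch):

```r
points <- 224 * 0:5
x <- c(517, 200, 391, 186, 262, 1020, 791, 124, 437, 238,
       896, 212, 144, 529, 523, 190)

# cut() defaults to right-closed intervals (a, b].
by_cut <- as.integer(cut(x, points))

# findInterval() defaults to left-closed intervals [a, b).
by_default <- findInterval(x, points)

# left.open = TRUE switches findInterval() to (a, b].
by_left_open <- findInterval(x, points, left.open = TRUE)

identical(by_cut, by_left_open)   # TRUE
x[by_cut != by_default]           # 896, the only value sitting on a break
```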
Data
vec <- structure(c(2.55431938899924e-321, 9.88131291682493e-322, 1.93179667523927e-321, 9.18962101264719e-322, 1.29445199210407e-321, 5.03946958758071e-321, 3.90805925860426e-321, 6.12641400843146e-322, 2.15906687232625e-321, 1.17587623710217e-321, 4.42682818673757e-321, 1.04741916918344e-321, 7.11454530011395e-322, 2.61360726650019e-321, 2.58396332774972e-321, 9.38724727098368e-322), class = "integer64")