Order a "mixed" vector (numbers with letters)
How can I order a vector like
c("7","10a","10b","10c","8","9","11c","11b","11a","12") -> alph
into
alph
[1] "7" "8" "9" "10a" "10b" "10c" "11a" "11b" "11c" "12"
and use it to sort a data.frame, like
V1 <- c("A","A","B","B","C","C","D","D","E","E")
V2 <- 2:1
V3 <- alph
df <- data.frame(V1,V2,V3)
and order the rows to obtain (ordered by V2, then by V3)
V1 V2 V3
C 1 9
A 1 10a
B 1 10c
D 1 11b
E 1 12
A 2 7
C 2 8
B 2 10b
E 2 11a
D 2 11c
> library(gtools)
> mixedsort(alph)
[1] "7" "8" "9" "10a" "10b" "10c" "11a" "11b" "11c" "12"
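For intuition, the ordering that mixedsort produces on this vector can be approximated in base R. This is only a sketch, not how gtools implements it, and it assumes every value is a run of digits optionally followed by letters; the names num and suffix are made up for the illustration.

```r
alph <- c("7","10a","10b","10c","8","9","11c","11b","11a","12")

# Assumption: each element is digits optionally followed by letters.
num    <- as.integer(sub("[^0-9].*$", "", alph))  # leading numeric part
suffix <- sub("^[0-9]+", "", alph)                # trailing letters ("" if none)

# Sort numerically first, then alphabetically by suffix
sorted <- alph[order(num, suffix)]
sorted
```

The numeric part decides the order first, and the letters only break ties between equal numbers, which is why "9" comes before "10a" here while a plain character sort would put "10a" first.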
To sort a data.frame, use mixedorder instead of mixedsort:
> mydf <- data.frame(alph, USArrests[seq_along(alph),])
> mydf[mixedorder(mydf$alph),]
alph Murder Assault UrbanPop Rape
Alabama 7 13.2 236 58 21.2
California 8 9.0 276 91 40.6
Colorado 9 7.9 204 78 38.7
Alaska 10a 10.0 263 48 44.5
Arizona 10b 8.1 294 80 31.0
Arkansas 10c 8.8 190 50 19.5
Florida 11a 15.4 335 80 31.9
Delaware 11b 5.9 238 72 15.8
Connecticut 11c 3.3 110 77 11.1
Georgia 12 17.4 211 60 25.8
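The relationship between the two functions mirrors that of sort and order in base R: mixedorder returns the index permutation rather than the sorted values. A minimal check, assuming gtools is installed (idx is just an illustrative name):

```r
library(gtools)  # provides mixedsort() and mixedorder()

alph <- c("7","10a","10b","10c","8","9","11c","11b","11a","12")

# mixedorder() returns positions, like base order()
idx <- mixedorder(alph)

# Indexing with those positions reproduces what mixedsort() returns
identical(alph[idx], mixedsort(alph))
```

Because the result is an index vector, it can be used to reorder the rows of any object with the same length, which is what the data.frame example above relies on.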
mixedorder on multiple vectors (columns)
Apparently mixedorder cannot handle multiple vectors. I have written a function that works around this by converting every character vector to a factor whose levels are sorted with mixedsort, and then passing all the vectors on to the standard order function.
library(gtools)  # for mixedsort()

multi.mixedorder <- function(..., na.last = TRUE, decreasing = FALSE){
    do.call(order, c(
        lapply(list(...), function(l){
            if(is.character(l)){
                # Encode the mixed order in the factor levels so order() respects it
                factor(l, levels=mixedsort(unique(l)))
            } else {
                l
            }
        }),
        list(na.last = na.last, decreasing = decreasing)
    ))
}
However, in the example below multi.mixedorder gets you the same result as the standard order, since V2 is numeric and has no ties, so V3 is never needed to break them.
df <- data.frame(
    V1 = c("A","A","B","B","C","C","D","D","E","E"),
    V2 = 19:10,
    V3 = alph,
    stringsAsFactors = FALSE)
df[multi.mixedorder(df$V2, df$V3),]
V1 V2 V3
10 E 10 12
9 E 11 11a
8 D 12 11b
7 D 13 11c
6 C 14 9
5 C 15 8
4 B 16 10c
3 B 17 10b
2 A 18 10a
1 A 19 7
Notice that:
- 19:10 is equivalent to c(19:10). c means concatenate, that is, to make one long vector out of many short ones, but in your case you only have one vector (19:10), so there is no need to concatenate anything. In the case of V1, however, you have 10 vectors of length 1, so there you do need to concatenate, as you already do.
- You need stringsAsFactors=FALSE to keep V1 and V3 from being converted to (incorrectly sorted) factors, which was the default before R 4.0.0.
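The factor trick is easiest to see on the data from the question, where V2 has ties and V3 really has to break them. The sketch below applies the same idea inline rather than through the function; it assumes gtools is installed, writes the recycled 2:1 out explicitly with rep, and uses V3f and res as illustrative names.

```r
library(gtools)  # for mixedsort()

alph <- c("7","10a","10b","10c","8","9","11c","11b","11a","12")
df <- data.frame(
    V1 = c("A","A","B","B","C","C","D","D","E","E"),
    V2 = rep(2:1, 5),          # the question's V2 <- 2:1, recycled to 10 rows
    V3 = alph,
    stringsAsFactors = FALSE)

# Recode V3 as a factor whose levels are in mixed order,
# then let base order() sort by V2 first and V3 second
V3f <- factor(df$V3, levels = mixedsort(unique(df$V3)))
res <- df[order(df$V2, V3f), ]
res
```

Calling df[multi.mixedorder(df$V2, df$V3), ] should give the same rows, since the function performs this recoding for every character argument before handing everything to order.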