How to order my dataframe lexicographicaly

I have a following data frame

a = data.frame(a=c(1,2,3,4,5,6,7),b=c(1,2,3,10,12,21,4),c=c(1,2,10,11,"X","Y",3))
> a
  a  b  c
1 1  1  1
2 2  2  2
3 3  3 10
4 4 10 11
5 5 12  X
6 6 21  Y
7 7  4  3

I want to sort whole data frame in lexicographical order, so that the output (for example, column "c") should be like

> a[,"c"]
[1] 1  2  3 10 11  X  Y

I tried and I am geting different answer

indata <- a[do.call(order,a[,c("c","a","b")]),]
> indata[,"c"]
[1] 1  10 11 2  3  X  Y
Levels: 1 10 11 2 3 X Y

I tried gtools, mixedorder package and worked fine on one column:

> a[mixedorder(a$c),]
  a  b  c
1 1  1  1
2 2  2  2
3 3  3 10
4 4 10 11
5 5 12  X
6 6 21  Y
7 7  4  3

but it doesn't work if I include multiple columns:

> a[with(a,order(mixedorder(c),mixedorder(b),mixedorder(a))),]
  a  b  c
1 1  1  1
2 2  2  2
4 4 10 11
5 5 12  X
6 6 21  Y
7 7  4  3
3 3  3 10

though I am expecting :

  a  b  c
1 1  1  1
2 2  2  2
4 7  4  3
5 3  3 10
6 4 10 11
7 5 12  X
3 6 21  Y

Solution 1:

One option is to use mixedorder() from the gtools package.

library(gtools)
a[mixedorder(a$c),]
#   a  b  c
# 1 1  1  1
# 2 2  2  2
# 7 7  4  3
# 3 3  3 10
# 4 4 10 11
# 5 5 12  X
# 6 6 21  Y

Solution 2:

Sticking in base you could make a function yourself:

a = data.frame(a=c(1,2,3,4,5,6,7),b=c(1,2,3,10,12,21,4),c=c(1,2,10,11,"X","Y",3))

SORTER_DEVICE <- function(x) {
    c(sort(as.numeric(na.omit(gsub("[a-zA-Z]", NA, x)))),
        sort(na.omit(gsub("[0-9]", NA, x))))
}
data.frame(apply(a, 2, SORTER_DEVICE))

Solution 3:

Unfortunately mixedsort does not (yet) support multiple column sorting. So, you need to implement it yourself, for example like this:

a[order(sub("[0-9]+", "", a$c),
        as.numeric(sub("[[:alpha:]]*([[:digit:]]*)", '\\1', a$c)),
        as.numeric(a$b),
        as.numeric(a$a)), ]

This first, alphanumerically sorts data.frame using a$c, and for tie situations(which actually does not exist in your data.frame 'a'), it uses a$b and a$a.

Output is:

  a  b  c
1 1  1  1
2 2  2  2
7 7  4  3
3 3  3 10
4 4 10 11
5 5 12  X
6 6 21  Y

PS: This was written by David Winsemius in this post as a reply to a similar question.