Add ID column by group [duplicate]

Solution 1:

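Here's one way, using interaction() to build a factor from the LAT/LONG combinations and as.numeric() to turn its levels into integer IDs. Note that the IDs follow the factor's level order, not the order in which the groups first appear.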

d <- read.table(text='LAT LONG
13.5330 -15.4180 
13.5330 -15.4180 
13.5330 -15.4180 
13.5330 -15.4180 
13.5330 -15.4170 
13.5330 -15.4170 
13.5330 -15.4170 
13.5340 -14.9350 
13.5340 -14.9350 
13.5340 -15.9170 
13.3670 -14.6190', header=TRUE)

d <- transform(d, Cluster_ID = as.numeric(interaction(LAT, LONG, drop=TRUE)))

#       LAT    LONG Cluster_ID
# 1  13.533 -15.418          2
# 2  13.533 -15.418          2
# 3  13.533 -15.418          2
# 4  13.533 -15.418          2
# 5  13.533 -15.417          3
# 6  13.533 -15.417          3
# 7  13.533 -15.417          3
# 8  13.534 -14.935          4
# 9  13.534 -14.935          4
# 10 13.534 -15.917          1
# 11 13.367 -14.619          5

EDIT: Incorporated @Spacedman's suggestion to supply drop=TRUE to interaction.
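
To see what drop=TRUE changes, here is a small sketch that rebuilds the same data under an illustrative name (d2, so that d is left alone) and compares the factor with and without it. Without drop=TRUE the factor keeps a level for every possible LAT/LONG combination, including ones that never occur, so the numeric codes are still distinct per group but no longer run consecutively from 1.

```r
# Same coordinates as above; d2 is just an illustrative name.
d2 <- data.frame(
  LAT  = c(13.533, 13.533, 13.533, 13.533, 13.533, 13.533, 13.533,
           13.534, 13.534, 13.534, 13.367),
  LONG = c(-15.418, -15.418, -15.418, -15.418, -15.417, -15.417, -15.417,
           -14.935, -14.935, -15.917, -14.619)
)

# Factor over every possible LAT x LONG pairing vs. only the pairings present.
all_levels  <- with(d2, interaction(LAT, LONG))
used_levels <- with(d2, interaction(LAT, LONG, drop = TRUE))

nlevels(all_levels)   # 3 distinct LAT values x 5 distinct LONG values
nlevels(used_levels)  # only the combinations that actually occur

# Integer codes: the first set has gaps, the second runs 1..nlevels.
ids_gappy   <- as.numeric(all_levels)
ids_compact <- as.numeric(used_levels)
```

If the IDs only need to be distinct, either version works; drop=TRUE matters when they should be consecutive.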

Solution 2:

The data:

dat <- read.table(text="
LAT        LONG
13.5330 -15.4180
13.5330 -15.4180
13.5330 -15.4180
13.5330 -15.4180
13.5330 -15.4170
13.5330 -15.4170
13.5330 -15.4170
13.5340 -14.9350
13.5340 -14.9350
13.5340 -15.9170
13.3670 -14.6190", header = TRUE)

These commands create an ID variable that starts at 1 and numbers the groups in order of first appearance:

comb <- with(dat, paste(LAT, LONG))
within(dat, Cluster_ID <- match(comb, unique(comb)))

The output:

      LAT    LONG Cluster_ID
1  13.533 -15.418          1
2  13.533 -15.418          1
3  13.533 -15.418          1
4  13.533 -15.418          1
5  13.533 -15.417          2
6  13.533 -15.417          2
7  13.533 -15.417          2
8  13.534 -14.935          3
9  13.534 -14.935          3
10 13.534 -15.917          4
11 13.367 -14.619          5
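
The match/unique idiom used above can be seen in isolation on a toy vector (keys is a made-up example, not part of the original data). unique() keeps one copy of each value in order of first appearance, and match() returns each element's position in that list, which is what makes the IDs start at 1 and increase as new groups are encountered.

```r
# Toy keys standing in for the pasted "LAT LONG" strings.
keys <- c("b", "a", "b", "c", "a")

first_seen <- unique(keys)       # "b" "a" "c"
ids <- match(keys, first_seen)   # 1 2 1 3 2

ids
```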