Add ID column by group
Solution 1:
Here's one way using interaction().
d <- read.table(text='LAT LONG
13.5330 -15.4180
13.5330 -15.4180
13.5330 -15.4180
13.5330 -15.4180
13.5330 -15.4170
13.5330 -15.4170
13.5330 -15.4170
13.5340 -14.9350
13.5340 -14.9350
13.5340 -15.9170
13.3670 -14.6190', header=TRUE)
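# interaction() builds one factor level per LAT/LONG pair; drop=TRUE discards
# pairs that never occur, and as.numeric() turns the level codes into IDs
d <- transform(d, Cluster_ID = as.numeric(interaction(LAT, LONG, drop=TRUE)))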
# LAT LONG Cluster_ID
# 1 13.533 -15.418 2
# 2 13.533 -15.418 2
# 3 13.533 -15.418 2
# 4 13.533 -15.418 2
# 5 13.533 -15.417 3
# 6 13.533 -15.417 3
# 7 13.533 -15.417 3
# 8 13.534 -14.935 4
# 9 13.534 -14.935 4
# 10 13.534 -15.917 1
# 11 13.367 -14.619 5
EDIT: Incorporated @Spacedman's suggestion to supply drop=TRUE to interaction().
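To see what drop=TRUE changes, the sketch below compares the integer codes with and without it on the same data. It is an illustration, not part of the original answer, and the names ids_all and ids_used are made up here. interaction() creates one level per combination of the LAT and LONG levels, so without drop=TRUE the codes index into all 3 x 5 = 15 combinations and leave gaps. With drop=TRUE, only the combinations that actually occur keep a level. In both cases the IDs follow the sorted factor levels, not the order in which the groups first appear in the data.

```r
d <- read.table(text='LAT LONG
13.5330 -15.4180
13.5330 -15.4180
13.5330 -15.4180
13.5330 -15.4180
13.5330 -15.4170
13.5330 -15.4170
13.5330 -15.4170
13.5340 -14.9350
13.5340 -14.9350
13.5340 -15.9170
13.3670 -14.6190', header=TRUE)

# Codes over every LAT x LONG combination, whether it occurs or not
ids_all <- as.numeric(interaction(d$LAT, d$LONG))

# Codes over only the combinations present in the data
ids_used <- as.numeric(interaction(d$LAT, d$LONG, drop=TRUE))

nlevels(interaction(d$LAT, d$LONG))  # 3 LAT levels * 5 LONG levels
sort(unique(ids_all))                # five distinct codes, with gaps
sort(unique(ids_used))               # consecutive codes 1 to 5
```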
Solution 2:
The data:
dat <- read.table(text="
LAT LONG
13.5330 -15.4180
13.5330 -15.4180
13.5330 -15.4180
13.5330 -15.4180
13.5330 -15.4170
13.5330 -15.4170
13.5330 -15.4170
13.5340 -14.9350
13.5340 -14.9350
13.5340 -15.9170
13.3670 -14.6190", header = TRUE)
These commands create an ID variable that starts at 1 and numbers the groups in order of first appearance:
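# One key string per row, e.g. "13.533 -15.418"
comb <- with(dat, paste(LAT, LONG))
# Each row's ID is the position of its key among the distinct keys
within(dat, Cluster_ID <- match(comb, unique(comb)))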
The output:
LAT LONG Cluster_ID
1 13.533 -15.418 1
2 13.533 -15.418 1
3 13.533 -15.418 1
4 13.533 -15.418 1
5 13.533 -15.417 2
6 13.533 -15.417 2
7 13.533 -15.417 2
8 13.534 -14.935 3
9 13.534 -14.935 3
10 13.534 -15.917 4
11 13.367 -14.619 5
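The match()/unique() idiom numbers groups by where each key first appears, which is why these IDs run 1, 2, 3, ... down the table. A minimal sketch of the same idea wrapped in a function follows. add_group_id is a hypothetical helper name, not something from the original answer, and it assumes the grouping columns can be pasted into strings without two different groups producing the same key.

```r
# Hypothetical helper: number groups by order of first appearance.
add_group_id <- function(df, cols, id_name = "Cluster_ID") {
  # Build one key string per row from the grouping columns
  key <- do.call(paste, c(unname(as.list(df[cols])), sep = " "))
  # Position of each key among the distinct keys is the group number
  df[[id_name]] <- match(key, unique(key))
  df
}

toy <- data.frame(LAT  = c(13.533, 13.533, 13.534, 13.533),
                  LONG = c(-15.418, -15.417, -14.935, -15.418))

# Rows 1 and 4 share a key, so they get the same ID
add_group_id(toy, c("LAT", "LONG"))$Cluster_ID
# [1] 1 2 3 1
```

Unlike the version above, this does not rely on a comb variable in the calling environment, so it can be reused with other column sets.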