Remove duplicates keeping entry with largest absolute value
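The example data frame a is not reproduced in this excerpt; a definition consistent with the outputs shown below would be:

a <- data.frame(id    = c(1, 1, 2, 2, 3, 4),
                value = c(1, 2, 3, -4, -5, 6))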
First, sort so that the less desired items come last within each id group:
aa <- a[order(a$id, -abs(a$value)), ]  # sort by id, then by decreasing abs(value)
Then, remove all but the first item within each id group:
aa[ !duplicated(aa$id), ] # take the first row within each id
  id value
2  1     2
4  2    -4
5  3    -5
6  4     6
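As a usage sketch, the two steps can be wrapped in a small helper so the sorted intermediate never needs its own name (the function name keep_max_abs and its column-name arguments are hypothetical):

keep_max_abs <- function(df, id_col = "id", value_col = "value") {
  # sort so the largest absolute value comes first within each id group
  df <- df[order(df[[id_col]], -abs(df[[value_col]])), ]
  # keep only the first (largest absolute value) row per id
  df[!duplicated(df[[id_col]]), ]
}
keep_max_abs(a)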
A data.table approach might be in order if your data set is very large:
library(data.table)
aDT <- as.data.table(a)
setkey(aDT,"id")
aDT[J(unique(id)), list(value = value[which.max(abs(value))]), by = .EACHI]  # by = .EACHI groups within each joined id (required in recent data.table versions)
Or a not-quite-as-fast, but still fast, alternative:
library(data.table)
as.data.table(a)[, .SD[which.max(abs(value))], by=id]
This version returns all the columns of a, in case there are more in the real dataset.
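Another commonly used data.table idiom (not shown in the answers above, included here as a sketch) computes the winning row index per group with .I and then subsets once, which also keeps every column:

library(data.table)
aDT <- as.data.table(a)
# .I[which.max(abs(value))] gives the global row number of the winner in each id group
aDT[aDT[, .I[which.max(abs(value))], by = id]$V1]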
Here is a dplyr approach:
library(dplyr)
a %>%
  group_by(id) %>%
  top_n(1, abs(value))
# A tibble: 4 x 2
# Groups:   id [4]
#      id value
#   <dbl> <dbl>
#1     1     2
#2     2    -4
#3     3    -5
#4     4     6
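In dplyr 1.0.0 and later, top_n() is superseded by slice_max(), so an equivalent (and arguably more readable) version is:

library(dplyr)
a %>%
  group_by(id) %>%
  slice_max(abs(value), n = 1)

Both top_n() and slice_max() keep all tied rows by default; pass with_ties = FALSE to slice_max() if exactly one row per id is required.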