Count how many values are above limit in dataframe
Solution 1:
Try this:
below <- rowSums(data[,-c(1,2)] < data[,'ll']) # below
above <- rowSums(data[,-c(1,2)] > data[,'ul']) # above
DF <- cbind(data, below, above) # altogether
head(DF)
ll ul unit1 unit2 unit3 unit4 unit5 below above
1 10 82 75 10 24 80 60 0 0
2 13 87 60 20 4 94 31 1 1
3 8 85 58 9 98 94 65 0 2
4 11 95 68 45 5 38 76 1 0
5 11 87 15 79 5 43 67 1 0
6 11 89 33 6 18 1 22 2 0
Solution 2:
I would use within()
for this, following the same logic as @Jilber's answer:
# First, use set.seed to make your example reproducible
set.seed(1)
data <- data.frame(
ll=round(runif(5, 5, 15)),
ul=round(runif(5, 80, 95)),
unit1=sample(1:100, 5, TRUE),
unit2=sample(1:100, 5, TRUE),
unit3=sample(1:100, 5, TRUE),
unit4=sample(1:100, 5, TRUE),
unit5=sample(1:100, 5, TRUE)
)
data
# ll ul unit1 unit2 unit3 unit4 unit5
# 1 8 93 21 50 94 39 49
# 2 9 94 18 72 22 2 60
# 3 11 90 69 100 66 39 50
# 4 14 89 39 39 13 87 19
# 5 7 81 77 78 27 35 83
The within
function lets you add new columns in a convenient way.
within(data, {
below = rowSums(data[-c(1:2)] < ll)
above = rowSums(data[-c(1:2)] > ul)
})
# ll ul unit1 unit2 unit3 unit4 unit5 above below
# 1 8 93 21 50 94 39 49 1 0
# 2 9 94 18 72 22 2 60 0 1
# 3 11 90 69 100 66 39 50 1 0
# 4 14 89 39 39 13 87 19 0 1
# 5 7 81 77 78 27 35 83 1 0
Alternatively, you can also use transform()
to achieve the same output:
transform(data,
below = rowSums(data[-c(1:2)] < ll),
above = rowSums(data[-c(1:2)] > ul))
Benchmarking the Jilber's solution and these two on a 2,000,000 row dataset, here are the results:
test replications elapsed relative user.self sys.self
3 jilber 3 33.586 1.000 31.490 1.916
1 within 3 34.493 1.027 32.542 1.584
2 transform 3 33.813 1.007 31.870 1.828
I think these two functions hold up pretty well for the convenience they offer!