How to determine if TRUE occurs before FALSE in two columns
I have a list of TRUE/FALSE statements in two columns. I want to determine if a test passes before it fails, a test fails before it passes, or neither. I am trying to create an output for each situation as well.
So for example if a test passes before it fails
Pass Fail
1 TRUE FALSE
2 FALSE FALSE
3 FALSE FALSE
I want the output to name the event that occurred first and the row number at which it occurred. For the example above, the output would look like this:
Pass 1
Another example of a test that failed
Pass Fail
1 FALSE FALSE
2 FALSE TRUE
3 TRUE FALSE
The expected output would look like
Fail 2
And then a situation where there was no pass or fail
Pass Fail
1 FALSE FALSE
2 FALSE FALSE
3 FALSE FALSE
and the expected output would be
None 0
As I mentioned before, I want to find out which event occurs first.
Pass dataset
structure(list(Pass = c(TRUE, FALSE, FALSE), Fail = c(FALSE,
FALSE, FALSE)), row.names = c(NA, 3L), class = "data.frame")
Fail dataset
structure(list(Pass = c(FALSE, FALSE, TRUE), Fail = c(FALSE,
TRUE, FALSE)), row.names = c(NA, 3L), class = "data.frame")
None dataset
structure(list(Pass = c(FALSE, FALSE, FALSE), Fail = c(FALSE,
FALSE, FALSE)), row.names = c(NA, 3L), class = "data.frame")
One option, for cases where only the first event is of interest, could be:
fun <- function(data) {
  ind <- sapply(data, function(x) match(TRUE, x))
  if (all(is.na(ind))) {
    setNames(0, "None")
  } else {
    ind[which.min(ind)]
  }
}
Results for the first dataset:
Pass
1
second:
Fail
2
third:
None
0
If the output is supposed to be a dataframe, then it could be adjusted to:
fun <- function(data) {
  ind <- sapply(data, function(x) match(TRUE, x))
  if (all(is.na(ind))) {
    stack(setNames(0, "None"))
  } else {
    stack(ind[which.min(ind)])
  }
}
Result for the second dataset:
  values  ind
1      2 Fail
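As a quick illustration of why the function above works: match(TRUE, x) returns the position of the first TRUE in x (or NA if there is none), and which.min() skips NA values. A minimal sketch using the Fail dataset from the question; the variable names fail_df and first are chosen here only for illustration:

```r
# Fail dataset from the question; the name fail_df is only illustrative.
fail_df <- data.frame(
  Pass = c(FALSE, FALSE, TRUE),
  Fail = c(FALSE, TRUE, FALSE)
)

# Position of the first TRUE in each column (NA if a column has no TRUE).
ind <- sapply(fail_df, function(x) match(TRUE, x))
ind
#> Pass Fail 
#>    3    2 

# which.min() ignores NA and picks the smallest position, keeping its name.
first <- ind[which.min(ind)]
first
#> Fail 
#>    2 
```

Note that which.min() returns the first minimum, so if both columns first become TRUE on the same row, the column listed first (here Pass) is the one reported.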
A simple base R solution:
Solution
find_result <- function(data) {
  # Make a vector naming the first occurrence of each result.
  firsts <- c("Pass" = which(data$Pass)[1], "Fail" = which(data$Fail)[1])
  if (all(is.na(firsts))) {
    # If there are no occurrences, default to "None".
    data.frame(Result = "None", Row = 0)
  } else {
    # Otherwise locate and name the earliest of the two occurrences.
    first <- which.min(firsts)
    data.frame(Result = names(firsts)[first], Row = unname(firsts)[first])
  }
}
Results
Given your sample data reproduced here
pass_df <- structure(
  list(
    Pass = c(TRUE, FALSE, FALSE),
    Fail = c(FALSE, FALSE, FALSE)
  ),
  row.names = c(NA, 3L),
  class = "data.frame"
)
fail_df <- structure(
  list(
    Pass = c(FALSE, FALSE, TRUE),
    Fail = c(FALSE, TRUE, FALSE)
  ),
  row.names = c(NA, 3L),
  class = "data.frame"
)
none_df <- structure(
  list(
    Pass = c(FALSE, FALSE, FALSE),
    Fail = c(FALSE, FALSE, FALSE)
  ),
  row.names = c(NA, 3L),
  class = "data.frame"
)
the find_result() function should yield the following results:
#> find_result(pass_df)
Result Row
1 Pass 1
#> find_result(fail_df)
Result Row
1 Fail 2
#> find_result(none_df)
Result Row
1 None 0
Here's an approach using dplyr::cumany:
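PF <- function(data) {
  # Smallest value in x, or NA when x is empty or all NA.
  mymin <- function(x) if (!all(is.na(x))) min(x, na.rm = TRUE) else NA
  # cumany() is TRUE from the first TRUE onward, so a larger sum means an earlier event.
  npass <- sum(dplyr::cumany(data$Pass))
  firstPass <- mymin(which(data$Pass))
  nfail <- sum(dplyr::cumany(data$Fail))
  firstFail <- mymin(which(data$Fail))
  if (npass > nfail) {
    c(result = "pass", which = firstPass)
  } else if (npass < nfail) {
    c(result = "fail", which = firstFail)
  } else {
    # Reached when neither event occurs; also when both first occur on the same row.
    c(result = "none", which = 0)
  }
}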
PF(pass)
result which
"pass" "1"
PF(fail)
result which
"fail" "2"
PF(none)
result which
"none" "0"
I’d approach this by first finding the first row where any of the columns in the input is TRUE. Then find the column that is TRUE on that row.
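A minimal sketch of that two-step idea, assuming the input holds only logical columns with no NA values. The helper name find_first_event is hypothetical and not part of the original answer:

```r
# Hypothetical helper illustrating the two steps described above.
# Assumes every column of x is logical and contains no NA values.
find_first_event <- function(x) {
  m <- as.matrix(x)
  # Step 1: rows on which at least one column is TRUE
  hits <- which(rowSums(m) > 0)
  if (length(hits) == 0) {
    return(data.frame(event = "None", row = 0L))
  }
  row <- hits[[1]]
  # Step 2: names of the columns that are TRUE on that first row
  data.frame(event = colnames(m)[m[row, ]], row = row)
}

fail <- data.frame(
  Pass = c(FALSE, FALSE, TRUE),
  Fail = c(FALSE, TRUE, FALSE)
)
find_first_event(fail)
#>   event row
#> 1  Fail   2
```

If several columns are TRUE on the first such row, this sketch returns one output row per column rather than picking a winner.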
A cleverer version uses weighted sums to determine exactly which event(s)
occurred on a row:
find_event <- function(x) {
  # Encode event combination on each row as a binary integer
  codes <- rowSums(2^(col(x) - 1) * x)
  # Find first row with any events
  row <- head(which(codes > 0), 1)
  # Decode the event combination on that row
  cols <- which(intToBits(codes[row]) > 0)
  events <- colnames(x)[cols]
  if (length(events) == 0) {
    data.frame(event = "None", row = 0L)
  } else {
    data.frame(event = events, row = row)
  }
}
Tests on the original example data:
pass <- data.frame(
  Pass = c(TRUE, FALSE, FALSE),
  Fail = c(FALSE, FALSE, FALSE)
)
find_event(pass)
#>   event row
#> 1  Pass   1
fail <- data.frame(
  Pass = c(FALSE, FALSE, TRUE),
  Fail = c(FALSE, TRUE, FALSE)
)
find_event(fail)
#>   event row
#> 1  Fail   2
none <- data.frame(
  Pass = c(FALSE, FALSE, FALSE),
  Fail = c(FALSE, FALSE, FALSE)
)
find_event(none)
#>   event row
#> 1  None   0
And on some more exotic data:
pass_fail <- data.frame(
  Pass = c(FALSE, TRUE, TRUE),
  Fail = c(FALSE, TRUE, FALSE)
)
find_event(pass_fail)
#>   event row
#> 1  Pass   2
#> 2  Fail   2
indeterminate <- data.frame(
  Pass = c(FALSE, FALSE, TRUE),
  Fail = c(FALSE, FALSE, TRUE),
  Indeterminate = c(TRUE, FALSE, FALSE)
)
find_event(indeterminate)
#>           event row
#> 1 Indeterminate   1
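To make the weighted-sum encoding used in find_event() more concrete, here is a small worked sketch on the pass_fail data. The weights are written out by hand for the two columns, and the names codes and decoded are only illustrative:

```r
pass_fail <- data.frame(
  Pass = c(FALSE, TRUE, TRUE),
  Fail = c(FALSE, TRUE, FALSE)
)

# Column weights are powers of two: Pass = 2^0 = 1, Fail = 2^1 = 2.
# Row codes: 0 = no event, 1 = Pass only, 2 = Fail only, 3 = both.
codes <- pass_fail$Pass * 1 + pass_fail$Fail * 2
codes
#> [1] 0 3 1

# Decode row 2: the code 3 has bits 1 and 2 set, so both events occurred.
decoded <- colnames(pass_fail)[which(intToBits(as.integer(codes[2])) > 0)]
decoded
#> [1] "Pass" "Fail"
```

Because each column gets its own bit, every combination of events maps to a distinct code, which is why the function can report ties instead of silently choosing one column.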