How to determine if TRUE occurs before FALSE in two columns

I have a list of TRUE/FALSE statements in two columns. I want to determine if a test passes before it fails, a test fails before it passes, or neither. I am trying to create an output for each situation as well.

For example, if a test passes before it fails:

   Pass  Fail
1  TRUE FALSE
2 FALSE FALSE
3 FALSE FALSE

I want the output to say which event occurred first and the row number at which it occurred. For the example above, the output would look like this:

Pass 1

Another example, where the test fails before it passes:

   Pass  Fail
1 FALSE FALSE
2 FALSE  TRUE
3  TRUE FALSE

The expected output would look like

Fail 2

And then a situation where there was no pass or fail

   Pass  Fail
1 FALSE FALSE
2 FALSE FALSE
3 FALSE FALSE

and the expected output would be

None 0

As I mentioned before, I want to find out which event occurs first.

Pass dataset

structure(list(Pass = c(TRUE, FALSE, FALSE), Fail = c(FALSE, 
FALSE, FALSE)), row.names = c(NA, 3L), class = "data.frame")

Fail dataset

structure(list(Pass = c(FALSE, FALSE, TRUE), Fail = c(FALSE, 
TRUE, FALSE)), row.names = c(NA, 3L), class = "data.frame")

None dataset

structure(list(Pass = c(FALSE, FALSE, FALSE), Fail = c(FALSE, 
FALSE, FALSE)), row.names = c(NA, 3L), class = "data.frame")

One option, if only a single result (the earliest event) is needed, could be:

fun <- function(data) {
    ind <- sapply(data, function(x) match(TRUE, x))
    
    if(all(is.na(ind))) {
        setNames(0, "None")
    } else {
        ind[which.min(ind)]
    }
}

Results for the first dataset:

Pass 
   1

second:

Fail 
   2

third:

None 
   0
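
To see why this works, it may help to look at the intermediate vector `ind`. A minimal sketch, assuming the Fail dataset from the question is stored in a variable called `fail_df` (a name chosen here only for illustration):

```r
# Fail dataset from the question (the variable name is an assumption).
fail_df <- data.frame(
  Pass = c(FALSE, FALSE, TRUE),
  Fail = c(FALSE, TRUE, FALSE)
)

# match(TRUE, x) returns the position of the first TRUE in x,
# or NA if x contains no TRUE.
ind <- sapply(fail_df, function(x) match(TRUE, x))
ind
#> Pass Fail 
#>    3    2 

# which.min() skips NA values and returns the index (with its name)
# of the smallest entry.
ind[which.min(ind)]
#> Fail 
#>    2 
```

If both columns have their first TRUE on the same row, which.min() returns the first minimum, so the column that comes first in the data frame (here Pass) would be reported.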

If the output is supposed to be a data frame, the function could be adjusted to:

fun <- function(data) {
    ind <- sapply(data, function(x) match(TRUE, x))
    
    if(all(is.na(ind))) {
        stack(setNames(0, "None"))
    } else {
        stack(ind[which.min(ind)])
    }
}

Result for the second dataset:

  values  ind
1      2 Fail

A simple base solution:

Solution

find_result <- function(data) {
  # Make a vector naming the first occurrence of each result.
  firsts <- c("Pass" = which(data$Pass)[1], "Fail" = which(data$Fail)[1])
  
  # If there are no occurrences, default to "None".
  if(all(is.na(firsts)))
    data.frame(Result = "None", Row = 0)

  # Otherwise locate and name the earliest of the two occurrences.
  else {
    # Named `earliest` rather than `which` to avoid shadowing base::which().
    earliest <- which.min(firsts)
    
    data.frame(Result = names(firsts)[earliest], Row = unname(firsts)[earliest])
  }
}

Results

Given your sample data reproduced here

pass_df <- structure(
  list(
    Pass = c(TRUE, FALSE, FALSE),
    Fail = c(FALSE, FALSE, FALSE)
  ),
  row.names = c(NA, 3L),
  class = "data.frame"
)

fail_df <- structure(
  list(
    Pass = c(FALSE, FALSE, TRUE),
    Fail = c(FALSE, TRUE, FALSE)
  ),
  row.names = c(NA, 3L),
  class = "data.frame"
)

none_df <- structure(
  list(
    Pass = c(FALSE, FALSE, FALSE),
    Fail = c(FALSE, FALSE, FALSE)
  ),
  row.names = c(NA, 3L),
  class = "data.frame"
)

the find_result() function should yield the following results:

#> find_result(pass_df)
  Result Row
1   Pass   1

#> find_result(fail_df)
  Result Row
1   Fail   2

#> find_result(none_df)
  Result Row
1   None   0

Here's an approach using dplyr::cumany:

library(dplyr)  # provides cumany()

PF <- function(data){

  # Smallest value in x, or NA if x has no non-missing values.
  mymin <- function(x) ifelse(!all(is.na(x)), min(x, na.rm=TRUE), NA)
  
  # Count the rows from the first TRUE onward in each column.
  npass <- sum(cumany(data$Pass))
  firstPass <- mymin(which(data$Pass))
  nfail <- sum(cumany(data$Fail))
  firstFail <- mymin(which(data$Fail))
  
  if(npass > nfail) c(result = "pass", which = firstPass) else 
    if (npass < nfail) c(result = "fail", which = firstFail) else 
      c(result = "none", which = 0)
}
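
The comparison of `npass` and `nfail` works because `cumany()` stays TRUE from the first TRUE onward, so a column whose first TRUE comes earlier has a larger sum. A small base R sketch of that idea, using `cumsum(x) > 0` as a stand-in for `cumany(x)` (the two should agree for logical vectors without NA; the helper name `count_from_first_true` is made up for illustration):

```r
# Hypothetical helper: number of rows from the first TRUE to the end.
# cumsum(x) > 0 is used as a base R stand-in for dplyr::cumany(x),
# assuming x is a logical vector with no NA values.
count_from_first_true <- function(x) sum(cumsum(x) > 0)

pass_col <- c(FALSE, FALSE, TRUE)  # first TRUE on row 3
fail_col <- c(FALSE, TRUE, FALSE)  # first TRUE on row 2

count_from_first_true(pass_col)
#> [1] 1
count_from_first_true(fail_col)
#> [1] 2

# The larger count belongs to the column whose TRUE appears first.
count_from_first_true(fail_col) > count_from_first_true(pass_col)
#> [1] TRUE
```

Equal counts mean either that neither column contains a TRUE or that both first occur on the same row; PF() reports "none" in both cases.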

PF(pass)
result  which 
"pass"    "1"

PF(fail)
result  which 
"fail"    "2" 

PF(none)
result  which 
"none"    "0"

I’d approach this by finding the first row where any column in the input is TRUE, then finding the column(s) that are TRUE on that row. A cleverer version uses weighted sums to determine exactly which event(s) occurred on a row:

find_event <- function(x) {
  # Encode event combination on each row as a binary integer
  codes <- rowSums(2^(col(x) - 1) * x)
  
  # Find first row with any events
  row <- head(which(codes > 0), 1)
  
  # Decode the event combination on that row
  cols <- which(intToBits(codes[row]) > 0)
  events <- colnames(x)[cols]
  
  if (length(events) == 0) {
    data.frame(event = "None", row = 0L)
  } else {
    data.frame(event = events, row = row)
  }
}

Tests on the original example data:

pass <- data.frame(
  Pass = c(TRUE, FALSE, FALSE),
  Fail = c(FALSE, FALSE, FALSE)
)

find_event(pass)
#>   event row
#> 1  Pass   1

fail <- data.frame(
  Pass = c(FALSE, FALSE, TRUE),
  Fail = c(FALSE, TRUE, FALSE)
)

find_event(fail)
#>   event row
#> 1  Fail   2

none <- data.frame(
  Pass = c(FALSE, FALSE, FALSE),
  Fail = c(FALSE, FALSE, FALSE)
)

find_event(none)
#>   event row
#> 1  None   0

And on some more exotic data:

pass_fail <- data.frame(
  Pass = c(FALSE, TRUE, TRUE),
  Fail = c(FALSE, TRUE, FALSE)
)

find_event(pass_fail)
#>   event row
#> 1  Pass   2
#> 2  Fail   2

indeterminate <- data.frame(
  Pass = c(FALSE, FALSE, TRUE),
  Fail = c(FALSE, FALSE, TRUE),
  Indeterminate = c(TRUE, FALSE, FALSE)
)

find_event(indeterminate)
#>           event row
#> 1 Indeterminate   1
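
The simpler two-step approach mentioned at the start of this answer (find the first row with any TRUE, then the columns that are TRUE on that row) could be sketched as follows. The function name `find_event_simple` is made up for illustration, and the sketch assumes all columns are logical with no NA values:

```r
# Hypothetical two-step variant of find_event();
# assumes logical columns without NA values.
find_event_simple <- function(x) {
  m <- as.matrix(x)

  # Step 1: first row on which any column is TRUE (NA if there is none).
  row <- which(rowSums(m) > 0)[1]

  # No TRUE anywhere: report "None" at row 0.
  if (is.na(row)) {
    return(data.frame(event = "None", row = 0L))
  }

  # Step 2: names of the columns that are TRUE on that row.
  events <- colnames(m)[m[row, ]]
  data.frame(event = events, row = unname(row), row.names = NULL)
}

pass_fail <- data.frame(
  Pass = c(FALSE, TRUE, TRUE),
  Fail = c(FALSE, TRUE, FALSE)
)

find_event_simple(pass_fail)
#>   event row
#> 1  Pass   2
#> 2  Fail   2
```

This avoids the binary encoding step, at the cost of indexing the matrix a second time to read off the columns.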
