Create a co-occurrence matrix from dummy-coded observations
Is there a simple approach to converting a data frame with dummies (binary coded) on whether an aspect is present, to a co-occurrence matrix containing the counts of two aspects co-occuring?
E.g. going from this
X <- data.frame(rbind(c(1,0,1,0), c(0,1,1,0), c(0,1,1,1), c(0,0,1,0)))
X
X1 X2 X3 X4
1 1 0 1 0
2 0 1 1 0
3 0 1 1 1
4 0 0 1 0
to this
X1 X2 X3 X4
X1 0 0 1 0
X2 0 0 2 1
X3 1 2 0 1
X4 0 1 1 0
Solution 1:
This will do the trick:
X <- as.matrix(X)
out <- crossprod(X) # Same as: t(X) %*% X
diag(out) <- 0 # (b/c you don't count co-occurrences of an aspect with itself)
out
# [,1] [,2] [,3] [,4]
# [1,] 0 0 1 0
# [2,] 0 0 2 1
# [3,] 1 2 0 1
# [4,] 0 1 1 0
To get the results into a data.frame exactly like the one you showed, you can then do something like:
nms <- paste("X", 1:4, sep="")
dimnames(out) <- list(nms, nms)
out <- as.data.frame(out)