Product of adjacency matrices
Solution 1:
Suppose $A$ is the adjacency matrix of graph $G_1$ and $B$ the adjacency matrix of graph $G_2$, where we consider both $G_1$ and $G_2$ to have the same vertices $1,\ldots,n$. Then $(AB)_{ij}$ is the number of ways to get from $i$ to $j$ by going first along an edge of $G_1$ and then along an edge of $G_2$.
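As a quick sanity check, here is a minimal Python/NumPy sketch with two made-up graphs on three vertices, verifying that $(AB)_{ij}$ counts two-step walks that use an edge of $G_1$ and then an edge of $G_2$:

```python
import numpy as np

# Made-up example: two graphs G1 and G2 on the same vertices (indexed 0, 1, 2).
A = np.array([[0, 1, 1],   # adjacency matrix of G1
              [1, 0, 0],
              [1, 0, 0]])
B = np.array([[0, 1, 0],   # adjacency matrix of G2
              [1, 0, 1],
              [0, 1, 0]])

AB = A @ B
# (AB)[i, j] counts the ways i -> k -> j with an edge of G1 from i to k
# and an edge of G2 from k to j.
walks_0_to_2 = sum(A[0, k] * B[k, 2] for k in range(3))
assert AB[0, 2] == walks_0_to_2
print(AB)
```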
Solution 2:
The story extends to non-square matrices. Consider a collection $A$ of $a$ vertices, another collection $B$ of $b$ vertices, and a collection of directed edges from the first collection to the second (in other words, a bipartite graph, but where we explicitly choose a $2$-coloring of the vertices). We can organize this data into a $b \times a$ matrix whose $(j, i)$ entry is the number of edges from vertex $i$ of $A$ to vertex $j$ of $B$.
Now consider a third collection $C$ of $c$ vertices and some directed edges from vertices in $B$ to vertices in $C$. We can organize this data into a $c \times b$ matrix as above.
Then the product of the two matrices above is a $c \times a$ matrix whose entries count the two-step paths from vertices in $A$ to vertices in $C$ that pass through $B$.
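A minimal sketch of this rectangular case, with made-up edge counts between small vertex collections $A$, $B$, and $C$:

```python
import numpy as np

# Made-up sizes: a = 2 vertices in A, b = 3 in B, c = 2 in C.
M = np.array([[1, 0],      # b x a matrix: M[j, i] = number of edges from
              [2, 1],      # vertex i of A to vertex j of B
              [0, 1]])
N = np.array([[1, 0, 1],   # c x b matrix: N[k, j] = number of edges from
              [0, 1, 1]])  # vertex j of B to vertex k of C

NM = N @ M  # c x a matrix: NM[k, i] = number of two-step paths from
print(NM)   # vertex i of A to vertex k of C passing through B
```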
This idea is the basis of a combinatorial approach to linear algebra which is described, for example, in Brualdi and Cvetković's *A Combinatorial Approach to Matrix Theory and Its Applications*.
For the initiated, the above admits the following nice interpretation: we have described composition of morphisms of a particularly nice category. In fact, if I'm not mistaken, it's the free category with finite biproducts on one object.
Solution 3:
The product of an adjacency matrix's transpose with the matrix itself is a measure of similarity between nodes. For instance, take the non-symmetric directed adjacency matrix
$$A = \begin{pmatrix} 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 1 & 0 & 1 & 0 \end{pmatrix}.$$
Then $A^T A$ (the Gram matrix) gives the un-normalized similarity between column $i$ and column $j$; it is the symmetric matrix
$$A^T A = \begin{pmatrix} 3 & 0 & 2 & 0 \\ 0 & 1 & 0 & 1 \\ 2 & 0 & 2 & 0 \\ 0 & 1 & 0 & 1 \end{pmatrix}.$$
This is much like the Gram matrix of a linear kernel in an SVM. An alternative is the RBF kernel; like the linear kernel, it is simply a measure of similarity between two data points that can be looked up in the $n \times n$ matrix. A Gram matrix is simply the product of a matrix's transpose with the matrix itself.
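For concreteness, here is a short NumPy check, using the matrix $A$ above, that the Gram matrix $A^T A$ comes out as displayed:

```python
import numpy as np

# The directed adjacency matrix A from above.
A = np.array([[1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 0, 0],
              [1, 0, 1, 0]])

gram = A.T @ A   # entry (i, j) is the dot product of columns i and j of A
print(gram)
# [[3 0 2 0]
#  [0 1 0 1]
#  [2 0 2 0]
#  [0 1 0 1]]
```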
Now say you have a matrix $B$ which is also a non-symmetric directed adjacency matrix, say
$$B = \begin{pmatrix} 1 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 1 & 0 & 0 & 1 \end{pmatrix}.$$
Then $A^T B$ is the non-symmetric matrix
$$A^T B = \begin{pmatrix} 3 & 0 & 1 & 1 \\ 1 & 0 & 0 & 0 \\ 2 & 0 & 1 & 1 \\ 1 & 0 & 0 & 0 \end{pmatrix}.$$
Its $(i, j)$ entry measures how similar column $i$ of $A$ is to column $j$ of $B$. Thus the product of the transpose of the first matrix with the second matrix is a measure of similarity between nodes, as characterized by their edges.
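Similarly, a short NumPy check that $A^T B$ matches the matrix displayed above:

```python
import numpy as np

A = np.array([[1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 0, 0],
              [1, 0, 1, 0]])
B = np.array([[1, 0, 1, 0],
              [1, 0, 0, 0],
              [1, 0, 0, 0],
              [1, 0, 0, 1]])

cross = A.T @ B  # entry (i, j) is the dot product of column i of A with column j of B
print(cross)
# [[3 0 1 1]
#  [1 0 0 0]
#  [2 0 1 1]
#  [1 0 0 0]]
```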