Theoretical link between the graph diffusion/heat kernel and spectral clustering
The diffusion kernel of a graph is the matrix exponential of its Laplacian, $\exp(-\beta L)$ (or a similar expression, depending on how you define the kernel). If you have labels on some vertices, you can label the remaining vertices by summing the kernel contributions from the labelled vertices and taking the majority (i.e., a kernel-weighted vote). In spectral clustering, the signs of the components of the eigenvector for the second-smallest eigenvalue of the Laplacian (the Fiedler vector) determine the class assignment of the vertices. Since both techniques derive a labelling from the graph Laplacian, is there a link between spectral clustering and the diffusion or heat kernel? Is one a generalization of the other under some assumptions?
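To make the comparison concrete, here is a minimal sketch (my own toy example, not taken from any paper) of both procedures on a small graph of two triangles joined by an edge; the graph, $\beta$, and seed labels are arbitrary choices:

```python
import numpy as np
from scipy.linalg import expm

# Two triangles (vertices 0-2 and 3-5) joined by the edge (2, 3).
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1.0
L = np.diag(A.sum(axis=1)) - A           # combinatorial Laplacian L = D - A

# Heat-kernel label propagation: diffuse seed labels on vertices 0 and 5.
beta = 1.0
H = expm(-beta * L)                      # H = exp(-beta L)
y = np.array([1.0, 0, 0, 0, 0, -1.0])    # +1 on vertex 0, -1 on vertex 5
labels_heat = np.sign(H @ y)             # kernel-weighted vote

# Spectral clustering: sign of the Fiedler vector (2nd-smallest eigenvalue).
vals, vecs = np.linalg.eigh(L)           # eigenvalues in ascending order
labels_spectral = np.sign(vecs[:, 1])

print(labels_heat)                       # separates {0,1,2} from {3,4,5}
print(labels_spectral)                   # same partition, up to a global sign
```

The Fiedler vector's global sign is arbitrary, so the two outputs may differ by an overall flip; both recover the two-triangle partition.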
What I have found so far:
The paper "The Principal Components Analysis of a Graph, and Its Relationships to Spectral Clustering" by Saerens et al. seems to say something about this. They say that one flavor of the diffusion kernel (the Euclidean Commute Time Distance) is the same as one type of spectral clustering.
The paper "Diffusion Maps, Spectral Clustering and Eigenfunctions of Fokker-Planck Operators" by Nadler et al. also has a result that I am yet to parse.
Following "Graph spectral image smoothing using the heat kernel" by Zhan and Hancock, we can observe that if the Laplacian $L = USU^T$, (with $U$ as the eigenvectors and $S$ the diagonal matrix of eigenvalues) and the heat kernel is $H=\exp(-\beta L)$, then $H = Uexp(-\beta S)U^T$. So for some $\beta$ we just have to take the smallest eigenvalue to get the approximate $H$.
Disclaimer: This is cross-posted from CV.
Solution 1:
Yes, there is indeed, and this link is the basis of the class of dimensionality reduction techniques known as Diffusion Maps.
Effectively, it is a generalization of Laplacian Eigenmaps/spectral clustering that uses the random-walk-normalized Laplacian (as opposed to the unnormalized or symmetrically normalized graph Laplacian).
The generalization comes from first renormalizing the affinity (kernel) matrix, written $L$ here, by a density parameter $\alpha$:
$$L^{(\alpha)} = D^{-\alpha}LD^{-\alpha}$$
before constructing the random-walk-normalized operator, which defines a reversible Markov chain on the data:
$$M = (D^{(\alpha)})^{-1}L^{(\alpha)}$$
where $D^{(\alpha)}$ is the degree matrix of $L^{(\alpha)}$.
The normalization step with $\alpha$ is introduced in order to "tune the influence of the data point density on the infinitesimal transition of the diffusion".
Setting $\alpha = 1$ approximates the Neumann heat kernel (see [1] for the full derivation), while setting $\alpha = 0$ recovers the Laplacian eigenmaps method; a short sketch of the construction follows.
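As a minimal sketch of the construction, assuming a Gaussian affinity matrix built from data points (the kernel choice, $\epsilon$, and the helper name `diffusion_operator` are illustrative, not from the references):

```python
import numpy as np

def diffusion_operator(X, eps=1.0, alpha=1.0):
    """Row-stochastic diffusion operator M from data points X (n x d)."""
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    K = np.exp(-sq / eps)                            # Gaussian affinity matrix
    d = K.sum(axis=1)
    K_alpha = K / np.outer(d ** alpha, d ** alpha)   # L^(a) = D^-a L D^-a
    d_alpha = K_alpha.sum(axis=1)                    # degrees of L^(a)
    return K_alpha / d_alpha[:, None]                # M = (D^(a))^-1 L^(a)

X = np.random.default_rng(0).normal(size=(50, 2))
M0 = diffusion_operator(X, eps=0.5, alpha=0.0)  # alpha=0: Laplacian eigenmaps
M1 = diffusion_operator(X, eps=0.5, alpha=1.0)  # alpha=1: Neumann heat kernel limit
assert np.allclose(M1.sum(axis=1), 1.0)         # each row is a probability dist.
```

The eigenvectors of $M$ (equivalently, of the corresponding random-walk Laplacian) then give the diffusion-map embedding.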
1. Coifman, R. R., and Lafon, S., "Diffusion Maps," Applied and Computational Harmonic Analysis 21(1), 2006 (the original paper proposing the method).
2. Nadler, B., Lafon, S., Coifman, R. R., and Kevrekidis, I. G., "Diffusion Maps, Spectral Clustering and Eigenfunctions of Fokker-Planck Operators," NIPS 2005.