Spectral clustering $k$ vs. $k$-means $k$?
In spectral clustering, as I've seen the algorithm presented, we often choose the first $k$ eigenvectors of the "graph Laplacian" to cluster on. It appears to me that this $k$ is the number of clusters expected, just like in $k$-means.
However, it seems like an arbitrary choice to cluster on the first $k$ eigenvectors based on the fact that we believe there are $k$ clusters. What is the mathematical intuition behind reducing the number of dimensions to the number of expected clusters?
A second thing I have learned is that the multiplicity of the 0 eigenvalue tells you how many connected components there are in the graph represented by the Laplacian. This makes rough sense to me.
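To check that understanding, here is a tiny self-contained example I put together (the adjacency matrix is made up for illustration): a graph with two connected components should give a 0 eigenvalue with multiplicity two.

```python
import numpy as np

# Hand-built adjacency matrix: two components, nodes {0,1,2} and {3,4}.
W = np.array([
    [0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 1, 0],
], dtype=float)

# Unnormalized graph Laplacian L = D - W.
L = np.diag(W.sum(axis=1)) - W

# Spectrum is {0, 0, 2, 3, 3}: two (numerically) zero eigenvalues,
# matching the two connected components.
print(np.linalg.eigvalsh(L))
```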
But, what seems to be constantly glossed over is how to interpret the eigenvalues of a fully connected similarity graph, which is often used in practice for spectral clustering. How can the eigenvalues be interpreted in this fully connected case?
Solution 1:
For the first question I recommend von Luxburg's (2007) paper, A Tutorial on Spectral Clustering, pages 9-14. Basically, you take those $k$ eigenvectors because you are transforming your clustering problem (minimizing a graph cut such as RatioCut) into a linear algebra problem: you relax the discrete problem to a continuous one, and by the Rayleigh-Ritz theorem the relaxed problem is solved exactly by the matrix whose columns are the first $k$ eigenvectors of the Laplacian. So the dimension $k$ is not arbitrary: each of the $k$ eigenvectors is a relaxed indicator vector for one of the $k$ clusters.
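To make that concrete, here is a minimal sketch of unnormalized spectral clustering in Python. The RBF similarity, the default `gamma=1.0`, and the function name `spectral_clustering` are my own illustrative choices, not prescribed by the tutorial.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics.pairwise import rbf_kernel

def spectral_clustering(X, k, gamma=1.0):
    """Sketch of unnormalized spectral clustering on data matrix X."""
    # Fully connected similarity graph (illustrative choice of kernel).
    W = rbf_kernel(X, gamma=gamma)
    np.fill_diagonal(W, 0.0)

    # Unnormalized graph Laplacian L = D - W.
    L = np.diag(W.sum(axis=1)) - W

    # eigh sorts eigenvalues in ascending order, so the first k columns
    # are the eigenvectors for the k smallest eigenvalues.
    _, eigvecs = np.linalg.eigh(L)
    U = eigvecs[:, :k]  # n x k embedding: rows are relaxed indicator vectors

    # k-means on the rows of U turns the continuous relaxed solution
    # back into a discrete cluster assignment.
    return KMeans(n_clusters=k, n_init=10).fit_predict(U)
```

The point is that each row of $U$ is the relaxed indicator of the cluster its data point belongs to, so taking exactly $k$ eigenvectors gives $k$-means exactly the $k$ coordinates it needs.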
For the second question: the multiplicity of the eigenvalue 0 of the Laplacian equals the number of connected components (which you can think of as clusters) of the graph. In a fully connected similarity graph no eigenvalue beyond the first is exactly zero, but the idea carries over approximately: if the first $k$ eigenvalues are close to zero and the $(k+1)$-th is clearly larger, the graph is close to having $k$ connected components, i.e., it is a small perturbation of a graph with $k$ components. This is the basis of the eigengap heuristic for choosing $k$.
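As a quick numerical illustration of the eigengap idea (the three-blob dataset and the bandwidth are invented for the example):

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

rng = np.random.default_rng(0)
# Three well-separated Gaussian blobs -> "close to" 3 connected components.
X = np.vstack([rng.normal(loc=c, scale=0.3, size=(30, 2)) for c in (0, 5, 10)])

W = rbf_kernel(X, gamma=1.0)    # fully connected: every entry is > 0
np.fill_diagonal(W, 0.0)
L = np.diag(W.sum(axis=1)) - W  # unnormalized Laplacian

# Expect three eigenvalues near zero, then a clear jump (the eigengap).
print(np.linalg.eigvalsh(L)[:6])
```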