Finding all disconnected subgraphs in a graph

I have a graph which contains an unknown number of disconnected subgraphs. What's a good algorithm (or Java library) to find them all?


What you are describing is finding the connected components of the graph, and the usual way to do it is a flood fill. You can traverse the graph with either a BFS or a DFS; both visit the same nodes.

Basically, you take an unlabeled (AKA uncoloured) node and assign a new label to it. You assign the same label to all nodes adjacent to that one, then to their neighbours, and so on, until every node reachable from the starting node carries that label.

When no more reachable nodes can be labeled, you start over by picking another unlabeled node. Notice that the fact that this new node is unlabeled implies that it is not reachable from our earlier node and is thus in a different disconnected component.

When there are no more unlabeled nodes, the number of distinct labels you had to use is the number of components of the graph. Each node's label tells you which component it belongs to.
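The labeling procedure described above can be sketched roughly as follows. This is only a minimal illustration, not a library API: the function name `label_components` and the adjacency-dictionary representation are assumptions made for the example, and it assumes an undirected graph in which every node appears as a key. It uses a BFS, but a DFS would label the nodes the same way.

```python
from collections import deque


def label_components(adjacency):
    """Label every node of an undirected graph with a component number.

    `adjacency` (a hypothetical input format) maps each node to an
    iterable of its neighbours; every node must appear as a key.
    Returns (labels, number_of_components).
    """
    labels = {}
    next_label = 0
    for start in adjacency:
        if start in labels:
            continue  # already reached from an earlier start node
        # An unlabeled node starts a new component.
        labels[start] = next_label
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in labels:
                    labels[neighbour] = next_label
                    queue.append(neighbour)
        next_label += 1
    return labels, next_label


graph = {
    "a": ["b"],
    "b": ["a", "c"],
    "c": ["b"],
    "d": ["e"],
    "e": ["d"],
    "f": [],
}
labels, count = label_components(graph)
print(count)   # 3
print(labels)  # {'a': 0, 'b': 0, 'c': 0, 'd': 1, 'e': 1, 'f': 2}
```

Each node is put on the queue at most once and each edge is examined a constant number of times, so the running time is linear in the number of nodes plus edges.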


This is not a Java implementation, but it may be useful to someone. Here is how to do it in Python with NetworkX:

import networkx as nx
g = nx.Graph()
# add nodes/edges to graph

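# connected_components yields one set of nodes per component, in no guaranteed order
d = sorted(nx.connected_components(g), key=len, reverse=True)
# d is a list of node sets, one per connected component
# d[0] is the largest component because of the sort; use g.subgraph(d[0]) to get it as a graph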

More information is available in the NetworkX documentation for connected_components.