Merge lists that share common elements
Solution 1:
You can see your list as a notation for a graph, i.e. ['a','b','c'] is a graph with 3 nodes connected to each other. The problem you are trying to solve is finding the connected components of this graph.
You can use NetworkX for this, which has the advantage that it's pretty much guaranteed to be correct:
import networkx
from networkx.algorithms.components.connected import connected_components

l = [['a','b','c'],['b','d','e'],['k'],['o','p'],['e','f'],['p','a'],['d','g']]

def to_graph(l):
    G = networkx.Graph()
    for part in l:
        # each sublist is a bunch of nodes
        G.add_nodes_from(part)
        # it also implies a number of edges:
        G.add_edges_from(to_edges(part))
    return G

def to_edges(l):
    """
    Treat `l` as a graph and yield its edges:
    to_edges(['a','b','c','d']) -> (a,b), (b,c), (c,d)
    """
    it = iter(l)
    last = next(it)
    for current in it:
        yield last, current
        last = current

G = to_graph(l)
# recent NetworkX versions return a generator of sets, so wrap it in list()
print(list(connected_components(G)))
# prints two components, e.g.
# [{'a', 'b', 'c', 'd', 'e', 'f', 'g', 'o', 'p'}, {'k'}]
# (the order of elements inside each set may vary)
To solve this efficiently yourself, you have to convert the list into something graph-like anyway, so you might as well use NetworkX from the start.
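For comparison, here is a minimal sketch of what a "graph-ish" hand-rolled version could look like without NetworkX. It uses a union-find (disjoint-set) structure, which is a different technique from the one in the code above; the function name `merge_overlapping` and its helpers are made up for this illustration, and it assumes the items are hashable.

```python
def merge_overlapping(lists):
    """Group items that are linked through shared sublists (hypothetical helper)."""
    parent = {}  # maps each item to its parent in the union-find forest

    def find(x):
        # Register unseen items as their own root, then walk up to the root,
        # shortening the path along the way.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for part in lists:
        if not part:
            continue  # an empty sublist contributes nothing
        first = part[0]
        find(first)  # make sure single-element sublists are registered
        for item in part[1:]:
            union(first, item)

    # Collect the items under their root to form the merged groups.
    groups = {}
    for item in list(parent):
        groups.setdefault(find(item), set()).add(item)
    return list(groups.values())


l = [['a','b','c'],['b','d','e'],['k'],['o','p'],['e','f'],['p','a'],['d','g']]
result = merge_overlapping(l)
print(result)
```

On the sample input this should produce the same two groups as the NetworkX version: one set with the nine connected letters and one set containing only 'k'.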
Solution 2:
Algorithm:
1. Take the first set A from the list.
2. For each other set B in the list: if B has common element(s) with A, join B into A and remove B from the list.
3. Repeat step 2 until nothing else overlaps with A.
4. Put A into the output.
5. Repeat from step 1 with the rest of the list.
So you might want to use sets instead of lists. The following program should do it.
l = [['a', 'b', 'c'], ['b', 'd', 'e'], ['k'], ['o', 'p'], ['e', 'f'], ['p', 'a'], ['d', 'g']]

out = []
while len(l) > 0:
    first, *rest = l
    first = set(first)

    lf = -1
    while len(first) > lf:
        lf = len(first)

        rest2 = []
        for r in rest:
            if len(first.intersection(set(r))) > 0:
                first |= set(r)
            else:
                rest2.append(r)
        rest = rest2

    out.append(first)
    l = rest

print(out)