Efficient algorithm to generate random multigraph (undirected) given nodes and degree
Since you have also posted parts of the code that are unrelated to the core algorithm, it makes going through the code and finding bottlenecks relatively difficult.
Here's an algorithm that is faster from what I've seen in your code. It runs in O(n * m)
for creating each graph, where n
is the number of nodes, and m
is the max degree that any of the nodes can have. In other words it's O(V + E)
where V
is the number of vertices and E
the number of edges.
- Create a list for the nodes, called
nodes
, like[1, 2, ..., n]
. - Create a corresponding list for degrees, called
degrees
, wheredegrees[i]
is the degree ofnodes[i]
. - Create a store for your graph however you like it. Adjacency list, matrix. Just make sure that adding edges to the graph is of
O(1)
complexity. Let's call thisgraph
. Adefaultdict(list)
fromcollections
in python would make a good adjacency list. For this algorithm I assumegraph
is adefaultdict(list)
. - Run a while loop on nodes.
while len(nodes) > 0:
and do as follows:
# Get a random index from current nodes
node_idx = random.randint(0, len(nodes)-1)
# Store the node and its corresponding degree
node = nodes[node_idx]
degree = degrees[node_idx]
# Swap that node and its degree with the last node/degree and pop
# This helps us to remove them in O(1) time
# We don't need them anymore since we are going to exhaust the required edges
# for this particular node.
# This also prevents self-edges.
nodes[node_idx], nodes[-1] = nodes[-1], nodes[node_idx]
nodes.pop()
degrees[node_idx], degrees[-1] = degrees[-1], degrees[node_idx]
degrees.pop()
for _ in degree: # this is the amount of edges this node needs
# To make sure we don't get out of bounds.
# This could potentially happen unless
# there is a guarantee that the degrees and number of nodes
# are made such that they fit exactly
if len(nodes) == 0:
break
neighbor_idx = random.randint(0, len(nodes)-1)
graph[node].append(nodes[neighbor_idx])
graph[nodes[neighbor_idx]].append(node)
degrees[neighbor_idx] -= 1
if degrees[neighbor_idx] == 0:
# we need to remove the neighbor node if it has its maximum edges already
nodes[neighbor_idx], nodes[-1] = nodes[-1], nodes[neighbor_idx]
nodes.pop()
degrees[neighbor_idx], degrees[-1] = degrees[-1], degrees[neighbor_idx]
degrees.pop()
This algorithm potentially leaves one node at the end that has not all its required edges, but this isn't a shortcoming of the algorithm but can happen if the number of edges for the nodes dosn't work out. I'm not sure how to express is mathematically though.
Also note that this algorithm could produce multiple edges between two nodes. It isn't clear to me if this is allowed or not for the particular graph you are looking for. If so, the code can be ammended such that it avoids such edges without sacrificing the time complexity. But it has the potential to leave multiple nodes with less edges than required. This wouldn't be a shortcoming of the algorithm but a result of how the degrees for particular nodes are defined.