Generate a dependency graph for the whole system

Here is a hacky way to do it in Python.

First write all installed packages to a file.

apt list --installed > installed_packages.txt

Iterate over that list and get first order dependencies of every package with debtree. This can take quite some time, if you have a lot packages (Mostly lib*).

This script is best run in an interactive environment like Jupyter Lab. It caches all work in edges_dict and skips the retrieval process if it already has an entry.

import json
import subprocess
from io import StringIO
import pydot
import networkx as nx



names = [package.split("/")[0] for package in open("installed_packages.txt").read().split("\n")[1:-1]]

edges_dict = dict()

command = "debtree --with-suggests --max-depth 1 {}"
for name in names:
    if name not in edges_dict.keys():
        try:
            process = subprocess.Popen(command.format(name).split(), stdout=subprocess.PIPE)
            output, error = process.communicate()
            if error is None:
                g = nx.drawing.nx_pydot.read_dot(StringIO(output.decode()))
                edges_dict[name] = list(g.edges())
            else:
                print(name, error)
        except:
            print(name, "Failed")

json.dump(edges_dict, open("edges.json", "w"))
edges_dict = json.load(open("edges.json"))

With this data you can either create a graph with networkx...

import networkx as nx
g = nx.DiGraph()
for name, edges in edges_dict.items():
    g.add_edges_from(edges)

or a DataFrame with pandas.

data = list()
for name, edges in edges_dict.items():
    for node1, node2 in edges:
        data.append([name, node1, node2])

df = pd.DataFrame(data, columns=["name", "package", "dependency"])

# many packages have themself as a dependency
df = df[df["package"] != df["dependency"]]

I hope that helps :)

Generate a dependency graph for the whole system

Related

Recent Posts