Generate a dependency graph for the whole system
Here is a hacky way to do it in Python.
First write all installed packages to a file.
apt list --installed > installed_packages.txt
Iterate over that list and get first order dependencies of every package with debtree. This can take quite some time, if you have a lot packages (Mostly lib*).
This script is best run in an interactive environment like Jupyter Lab. It caches all work in edges_dict
and skips the retrieval process if it already has an entry.
import json
import subprocess
from io import StringIO
import pydot
import networkx as nx
names = [package.split("/")[0] for package in open("installed_packages.txt").read().split("\n")[1:-1]]
edges_dict = dict()
command = "debtree --with-suggests --max-depth 1 {}"
for name in names:
if name not in edges_dict.keys():
try:
process = subprocess.Popen(command.format(name).split(), stdout=subprocess.PIPE)
output, error = process.communicate()
if error is None:
g = nx.drawing.nx_pydot.read_dot(StringIO(output.decode()))
edges_dict[name] = list(g.edges())
else:
print(name, error)
except:
print(name, "Failed")
json.dump(edges_dict, open("edges.json", "w"))
edges_dict = json.load(open("edges.json"))
With this data you can either create a graph with networkx...
import networkx as nx
g = nx.DiGraph()
for name, edges in edges_dict.items():
g.add_edges_from(edges)
or a DataFrame with pandas.
data = list()
for name, edges in edges_dict.items():
for node1, node2 in edges:
data.append([name, node1, node2])
df = pd.DataFrame(data, columns=["name", "package", "dependency"])
# many packages have themself as a dependency
df = df[df["package"] != df["dependency"]]
I hope that helps :)