What does apt-get install do under the hood?

What does the apt-get install ... command do?

When I run apt-get install ..., some text appears on the screen, but it does not give me enough information. I want to know whether any files are created or edited, whether any services are started, and what other actions take place.

Is a .sh file executed when apt-get install ... runs? If so, how can I see the contents of that file?

The reason for this question is that I recently tried to install tomcat7 with apt-get install tomcat7. Everything worked fine until I installed tomcat7-admin (the manager web application), after which the server became unresponsive to every request. I tried this many times, and it always happens.


Solution 1:

Mostly, apt-get does the following things:

  • checks for dependencies (and asks to install them),
  • downloads the package, verifies it and then tells dpkg to install it.

dpkg will:

  • extract the package and copy the content to the right location, and check for pre-existing files and modifications on them,
  • run the package maintainer scripts: preinst and postinst (plus the old version's prerm and postrm if a package is being upgraded),
  • execute some actions based on triggers

You might be interested in the maintainer scripts, which are usually located at /var/lib/dpkg/info/<package-name>.{pre,post}{rm,inst}. These are usually shell scripts, but there's no hard rule. For example:

$ ls /var/lib/dpkg/info/xml-core.{pre,post}{rm,inst}
/var/lib/dpkg/info/xml-core.postinst
/var/lib/dpkg/info/xml-core.postrm
/var/lib/dpkg/info/xml-core.preinst
/var/lib/dpkg/info/xml-core.prerm
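To actually read those scripts, something like the following could be used. It is a minimal sketch: show_maintainer_scripts is a name invented here, and the optional second argument exists only so the function can be pointed at a different directory.

```shell
#!/bin/sh
# Print every maintainer script that dpkg has stored for a package.
# Usage: show_maintainer_scripts PACKAGE [INFO_DIR]
# show_maintainer_scripts is a hypothetical helper, not part of dpkg.
show_maintainer_scripts() {
    pkg=$1
    dir=${2:-/var/lib/dpkg/info}
    for script in preinst postinst prerm postrm; do
        # Multi-arch packages may be stored as PACKAGE:ARCH.SCRIPT.
        for f in "$dir/$pkg.$script" "$dir/$pkg":*."$script"; do
            [ -f "$f" ] || continue
            printf '==> %s <==\n' "$f"
            cat "$f"
        done
    done
}
```

Running show_maintainer_scripts tomcat7-admin would print whichever of the four scripts that package ships, which is where a service restart or similar action would show up.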

Solution 2:

In short: apt-get install does everything needed so that your system can successfully run the newly installed software.

Longer:

Preliminaries:

From the manpage:

All packages required by the package(s) specified for installation will also be retrieved and installed.

Those packages are stored in a repository on the network. apt-get downloads all the needed ones from a web or FTP server into a local cache directory (/var/cache/apt/archives/). The repositories to use are specified in the so-called sources.list, a list of repositories. From then on, the packages get installed one by one.
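To check which repositories apt-get will download from, the active entries can be listed. This is a sketch only: list_repositories is a made-up name, and it assumes the classic one-line "deb ..." format in /etc/apt/sources.list and /etc/apt/sources.list.d/*.list (it does not look at any other file format).

```shell
#!/bin/sh
# List the active repository entries from sources.list-style files.
# Usage: list_repositories [FILE...]
# list_repositories is a hypothetical helper; with no arguments it reads
# the standard locations.
list_repositories() {
    if [ "$#" -eq 0 ]; then
        set -- /etc/apt/sources.list /etc/apt/sources.list.d/*.list
    fi
    for f in "$@"; do
        [ -f "$f" ] || continue
        # Keep "deb" and "deb-src" lines; commented-out and blank lines are skipped.
        grep -E '^[[:space:]]*deb(-src)?[[:space:]]' "$f" || :
    done
}
```

To fetch the packages into the cache without installing them, apt-get install --download-only <package-name> can be used.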

The packages without further dependencies are installed first, since nothing else has to be installed for them. Once they are in place, the dependencies of other packages are satisfied, so those can be installed next. The system repeats this process until the specified packages are installed.
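The order can be previewed without changing anything by running apt-get in simulation mode (-s), which prints an "Inst" line for each package it would unpack and a "Conf" line for each one it would configure. A small sketch of a filter for that output follows; install_order is a name chosen here for illustration.

```shell
#!/bin/sh
# Reduce "apt-get -s install ..." output to the planned sequence of steps.
# Usage: apt-get -s install PACKAGE | install_order
# install_order is a hypothetical helper that reads from standard input.
install_order() {
    # Keep only the action (Inst or Conf) and the package name.
    awk '$1 == "Inst" || $1 == "Conf" { printf "%s %s\n", $1, $2 }'
}
```

For instance, apt-get -s install tomcat7-admin | install_order lists the dependencies in the order apt-get intends to handle them.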

Each package undergoes an installation procedure.

Package installation:

In Debian-based Linux distributions such as Ubuntu, packages come in a standardized format: deb, the Debian binary package format.

Such a package contains the files to be installed on the system, along with a control archive. That archive contains scripts that the packaging system executes in specific situations, the so-called maintainer scripts. Those scripts are:

  • preinst: before the installation of the files into the system's file hierarchy
  • postinst: after the installation
  • prerm: before the uninstallation
  • postrm: after the uninstallation

There is an interesting picture showing the procedure for installing a new package:

[Figure: installation]

There are also more control files; the most important are as follows:

  • control: A list of the dependencies, and other useful information to identify the package
  • conffiles: A list of config files (usually those in /etc)
  • debian-binary: contains the deb format version, currently 2.0 (stored at the top level of the .deb file, next to the other control files rather than among them)
  • md5sums: A list of md5sums of each file in the package for verifying
  • templates: The text of the questions and messages shown (via debconf) during installation
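These files can be inspected before a package is ever installed, using dpkg-deb on the downloaded .deb file. The following is a rough sketch; inspect_deb and the default output directory ./control-files are names picked for this example.

```shell
#!/bin/sh
# Show the metadata and contents of a .deb and extract its control files.
# Usage: inspect_deb FILE.deb [OUTPUT_DIR]
# inspect_deb is a hypothetical helper; OUTPUT_DIR defaults to ./control-files.
inspect_deb() {
    deb=$1
    out=${2:-./control-files}
    if [ ! -f "$deb" ]; then
        echo "inspect_deb: no such file: $deb" >&2
        return 1
    fi
    dpkg-deb --info "$deb"              # control fields and the list of control files
    dpkg-deb --contents "$deb"          # files that would be unpacked onto the system
    dpkg-deb --control "$deb" "$out"    # extract control files, including maintainer scripts
    ls -l "$out"
}
```

A .deb to try this on can be fetched into the current directory with apt-get download <package-name>.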

Solution 3:

For the actual under-the-hood stuff, you'll need to grab the Apt source. Fairly simple if you have source repositories enabled:

apt-get source apt

The apt-get command itself lives in cmdline/apt-get.cc. It's a pain to read through, but most of apt-get's actions are spelled out quite extensively in there. Installation, however, is handled by a DoInstall function, which lives in apt-private/private-install.{cc,h}.

You have to remember that apt-get is merely one side of the coin. dpkg handles the actual installation, but DoInstall doesn't know about dpkg directly. apt-get is actually surprisingly package-manager agnostic: all the functionality is abstracted through apt-pkg/packagemanager.cc.

I'm only looking briefly but even there I can't see where this actually attaches to the dpkg systems. Some of this seems to be autoconfigured through apt-pkg/aptconfiguration.cc but this is a deep well. You could spend days unravelling this.
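One way to narrow the search is to list the source files that mention dpkg at all and start reading from there. This is just a sketch around grep; find_in_source is a name invented for the example.

```shell
#!/bin/sh
# List the files under a directory that mention a pattern (case-insensitive).
# Usage: find_in_source PATTERN [DIR]
# find_in_source is a hypothetical helper; DIR defaults to the current directory.
find_in_source() {
    pattern=$1
    dir=${2:-.}
    # -r: recurse, -l: print file names only, -i: ignore case
    grep -rli -- "$pattern" "$dir" | sort
}
```

Run from the directory where apt-get source apt unpacked the code, find_in_source dpkg apt-*/apt-pkg gives a list of candidate files.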

The source documentation is good though. You could do worse things than to go through each file and read the header to work out what's actually happening.

Solution 4:

There are some fantastic answers here which are better than this short one, but something you might consider to help you get a better understanding of the changes made by a package manager is Docker. You can diff the changes made in a container using docker diff <container> and it will show you all of the changes. This is especially useful for taking a look under the hood to see what apt-get install does to a system. A quick search will get you several resources to help implement this.
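As a rough sketch of that approach: install the package in a throwaway container, then summarize what docker diff reports. docker diff prefixes each path with A (added), C (changed) or D (deleted). The container name apt-probe, the ubuntu image, and the summarize_diff helper are all assumptions made for this example.

```shell
#!/bin/sh
# Count the added, changed and deleted paths in "docker diff" output.
# Usage: docker diff CONTAINER | summarize_diff
# summarize_diff is a hypothetical helper that reads from standard input.
summarize_diff() {
    awk '
        $1 == "A" { added++ }
        $1 == "C" { changed++ }
        $1 == "D" { deleted++ }
        END { printf "added=%d changed=%d deleted=%d\n", added, changed, deleted }
    '
}

# Possible usage (container name and image are placeholders):
#   docker run --name apt-probe ubuntu sh -c 'apt-get update && apt-get install -y PACKAGE'
#   docker diff apt-probe                  # full list of paths
#   docker diff apt-probe | summarize_diff
#   docker rm apt-probe
```

Note that the diff also includes files touched by apt-get update, such as the package lists, so not every path comes from the package itself.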