Why don't apt-get install and apt-get remove use and free the same amount of space?

Why don't apt-get install package and a subsequent apt-get remove package consume and free almost exactly the same amount of space? For example, with the package "latex2html" one gets:

ubuntu:~$ sudo apt install latex2html
Need to get 758 MB of archives.
After this operation, 1,211 MB of additional disk space will be used.
Do you want to continue? [Y/n]
....
ubuntu:~$ sudo apt remove latex2html
The following packages will be REMOVED:
  latex2html
0 upgraded, 0 newly installed, 1 to remove and 92 not upgraded.
After this operation, 5,578 kB disk space will be freed.
Do you want to continue? [Y/n]

If we could inspect the full output of those two commands we would likely find that apt-get install latex2html installed many more packages (dependencies of latex2html) than apt-get remove latex2html removed (only a single one, latex2html itself).

It's easy to see that the sum of a set A of positive numbers is larger than the sum of a set B that is a proper subset of A. More concretely: the package latex2html alone (set B in the analogy) occupies less space than the same package plus all of the dependencies that were installed with it (set A, assuming at least one dependency was not yet satisfied when the package was installed).
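To make the arithmetic concrete, here is a small sketch. Only the 5,578 kB figure for latex2html comes from the output above; the two dependency names (dep-a, dep-b) and their sizes are made up so that the total lands near the reported 1,211 MB.

```shell
# sum_kb: add up the second column of "name size_kb" lines read from stdin.
sum_kb() {
    awk '{ total += $2 } END { print total + 0 }'
}

# Set A: everything "apt install" pulled in (dependency sizes are hypothetical).
installed='latex2html 5578
dep-a 700000
dep-b 505422'

# Set B: what "apt remove" actually removes.
removed='latex2html 5578'

used=$(printf '%s\n' "$installed" | sum_kb)
freed=$(printf '%s\n' "$removed" | sum_kb)

echo "used by install: ${used} kB"
echo "freed by remove: ${freed} kB"
echo "left behind:     $((used - freed)) kB"
```

The "left behind" figure is the space still held by the dependencies, which is what apt autoremove is meant to reclaim.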

If you want to remove all unused[1] dependency packages you can use:

sudo apt remove <PACKAGE>
sudo apt autoremove

or simply

sudo apt autoremove <PACKAGE>

All these commands will ask for confirmation if Apt intends to do something beyond what you explicitly requested, e.g. install or remove packages other than the ones specified on the command line. You can also ask apt-get to only show what it would do, without actually doing it, via the command-line options -s, --simulate, --just-print, --dry-run, --recon, or --no-act (all equivalent).
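A simulated run prints one line per planned dpkg operation, so comparing install and remove comes down to counting those lines. The helper below is a minimal sketch; count_actions is a made-up name, and it assumes the usual Inst/Remv/Purg line prefixes of apt-get's simulation output.

```shell
# count_actions: read `apt-get -s ...` output on stdin and count the
# planned operations by their line prefix.
# Assumption: unpack lines start with "Inst", removals with "Remv",
# purges with "Purg". "Conf" (configure) lines are ignored.
count_actions() {
    awk '
        $1 == "Inst" { inst++ }
        $1 == "Remv" { remv++ }
        $1 == "Purg" { purg++ }
        END { printf "install=%d remove=%d purge=%d\n", inst, remv, purg }
    '
}

# Usage on a system with apt-get (simulation changes nothing):
#   apt-get -s install latex2html | count_actions
#   apt-get -s remove latex2html | count_actions
#   apt-get -s autoremove latex2html | count_actions
```

If the install count is much larger than the remove count, the difference is the set of dependencies that a plain remove leaves installed.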


[1] In this context "unused" means that no manually installed package depends on it, directly or transitively. "Manually installed" means that someone or something instructed Apt to install this particular package directly (e.g. via apt-get install <PACKAGE>, Software Center, or some other package manager interface), rather than Apt merely selecting it for installation as a dependency of a different package.
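Apt records the manual/automatic flag per package, and apt-mark showmanual prints the manually installed ones, one name per line. As a small sketch, the hypothetical helper is_manual below checks whether a given package is in such a list.

```shell
# is_manual PKG: succeed if PKG appears as a whole line in the list of
# package names read from stdin.
is_manual() {
    grep -qxF -- "$1"
}

# Usage on a system with apt-mark:
#   apt-mark showmanual | is_manual latex2html && echo "manually installed"
#   apt-mark showauto       # the complementary list: installed as dependencies
```

Packages that appear only in the showauto list are the candidates for apt autoremove once nothing manually installed depends on them.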


In addition to what @DavidFoerster said: apt remove not only leaves dependencies installed, it also leaves the package's configuration files behind. From man apt-get:

remove

remove is identical to install except that packages are removed instead of installed. Note that removing a package leaves its configuration files on the system. If a plus sign is appended to the package name (with no intervening space), the identified package will be installed instead of removed.

purge

purge is identical to remove except that packages are removed and purged (any configuration files are deleted too).

To get rid of everything, including the config files, run sudo apt purge instead, as noted in the quote.
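Packages that were removed but not purged show up in dpkg -l with the status "rc" (removed, configuration files remain). The following sketch filters those out; list_residual is a made-up helper name.

```shell
# list_residual: read `dpkg -l` output on stdin and print the names of
# packages whose status column is "rc" (removed, config files remain).
list_residual() {
    awk '$1 == "rc" { print $2 }'
}

# Usage on a dpkg-based system:
#   dpkg -l | list_residual
# Each name printed can then be passed to: sudo apt purge <PACKAGE>
```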

Also note that apt keeps the downloaded package files themselves in /var/cache/apt/archives/ in case it needs them again. None of sudo apt remove, autoremove, or purge deletes them, so they continue to take up disk space. You can delete the cached package files that can no longer be downloaded by running sudo apt autoclean, or delete all of them with sudo apt clean.

From man apt-get:

clean

clean clears out the local repository of retrieved package files. It removes everything but the lock file from /var/cache/apt/archives/ and /var/cache/apt/archives/partial/.

autoclean (and the auto-clean alias since 1.1)

Like clean, autoclean clears out the local repository of retrieved package files. The difference is that it only removes package files that can no longer be downloaded, and are largely useless. This allows a cache to be maintained over a long period without it growing out of control. The configuration option APT::Clean-Installed will prevent installed packages from being erased if it is set to off.
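To see how much space the cache actually holds before and after cleaning, something like the following sketch can help. cache_usage is a hypothetical helper; it only looks at .deb files directly inside the given directory and defaults to the cache path mentioned above.

```shell
# cache_usage [DIR]: print how many .deb files sit directly in DIR and
# their combined size in bytes. DIR defaults to the apt archive cache.
cache_usage() {
    dir=${1:-/var/cache/apt/archives}
    count=0
    bytes=0
    for f in "$dir"/*.deb; do
        [ -f "$f" ] || continue   # skip the unexpanded pattern when no .deb exists
        size=$(wc -c < "$f")
        count=$((count + 1))
        bytes=$((bytes + size))
    done
    echo "$count files, $bytes bytes"
}

# Usage:
#   cache_usage            # before cleaning
#   sudo apt clean
#   cache_usage            # afterwards
```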