Is it possible to mirror the apt repositories?

I am a student at Case Western Reserve University, and the bandwidth from the Ubuntu servers to my location is often horrendously bad (on the order of a few hundred bytes per second). A few friends and I would like to download the packages once and have them cached for the rest of our Ubuntu installations on campus. To do that, we would need to either set up our own APT repository or set up some form of caching server (Squid?) at which we could point our systems.

Is setting up such a mirror a difficult process? How would one accomplish it?


Solution 1:

You might want to use apt-proxy instead of a full mirror, since it takes considerably less disk space and time to set up:

https://help.ubuntu.com/community/AptProxy

You would then need to update the repository lists for anyone wanting to use your proxy.
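
As a rough illustration of what updating the repository lists involves, here is a minimal Python sketch that rewrites the archive URLs in a sources.list so they point at the proxy. The host name aptproxy.example.edu, the port 9999 and the /ubuntu backend path are assumptions for illustration; substitute whatever your proxy is actually configured with. Other hosts (security.ubuntu.com, for instance) are left untouched here and would need their own rule.

```python
import re

# Hypothetical proxy address: host, port and backend path are assumptions,
# not values taken from any real setup.
PROXY_BASE = "http://aptproxy.example.edu:9999/ubuntu"

# Matches the main Ubuntu archive and its country mirrors, e.g.
# http://archive.ubuntu.com/ubuntu or http://us.archive.ubuntu.com/ubuntu/
ARCHIVE_URL = re.compile(r"http://(?:[a-z]{2}\.)?archive\.ubuntu\.com/ubuntu/?")


def point_at_proxy(sources_text, proxy_base=PROXY_BASE):
    """Return sources.list text with archive URLs replaced by the proxy URL."""
    out = []
    for line in sources_text.splitlines():
        stripped = line.lstrip()
        # Only touch active "deb" / "deb-src" entries; leave comments alone.
        if stripped.startswith(("deb ", "deb-src ")):
            line = ARCHIVE_URL.sub(proxy_base, line)
        out.append(line)
    return "\n".join(out)


if __name__ == "__main__":
    example = (
        "deb http://us.archive.ubuntu.com/ubuntu/ focal main\n"
        "# deb http://archive.ubuntu.com/ubuntu focal universe"
    )
    print(point_at_proxy(example))
```

Run it against a copy of /etc/apt/sources.list first and check the result before replacing the real file.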

Solution 2:

There are several ways to mirror a repository or cache package downloads. The best solution depends on how many people are going to use it and what infrastructure is already available.

For example, many universities already have local software mirrors, and in that case the easiest solution is probably to add Ubuntu to that mirror. ;)

And if your university already has a proxy server, it might be possible to use that (maybe with some custom settings for the repositories?).

If you want to mirror all or part of the official repositories (and/or other repositories), you can use something like apt-mirror, debmirror, debpartial-mirror, mirrorkit or ubumirror. Mirroring entire repositories may pull in a lot of packages that nobody ever uses, so if bandwidth is really an issue (even at night), it might be better to mirror only the popular packages.

If you want to cache only the packages that are actually used, there are apt-cacher, apt-cacher-ng and apt-p2p, or a general-purpose proxy like Squid.
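
With a caching proxy such as apt-cacher-ng, clients can usually leave their sources.list alone and simply tell APT to send its HTTP requests through the cache. A minimal sketch that generates such a configuration line is below. The host name is hypothetical, 3142 is apt-cacher-ng's default port, and the file name 01proxy is just a common choice, not a requirement.

```python
def apt_proxy_snippet(host, port=3142):
    """Build an APT configuration line that routes HTTP downloads via a proxy."""
    return 'Acquire::http::Proxy "http://{}:{}";\n'.format(host, port)


if __name__ == "__main__":
    # Hypothetical cache host. Save the printed line on each client as,
    # for example, /etc/apt/apt.conf.d/01proxy
    print(apt_proxy_snippet("aptcache.example.edu"), end="")
```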

One advantage of having a local mirror (compared to a cache) is that installations and upgrades will always be fast (for the packages that are available on the mirror), whereas with a cache the first person who needs a package has to wait until it is downloaded. You can also configure the mirror to update at night, so that downloading packages happens when (almost) nobody else is using the internet uplink.

On the other hand, the advantage of a cache is that you download only the packages that are actually needed, and never more than that.
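
To make the trade-off concrete, here is a back-of-the-envelope sketch comparing how much each approach pulls over the uplink. All package names and sizes are made up for illustration, and it ignores index files and package updates: a full mirror fetches everything in the repository, while a cache fetches each requested package once.

```python
def mirror_download_bytes(repo_sizes):
    """A full mirror downloads every package in the repository once."""
    return sum(repo_sizes.values())


def cache_download_bytes(repo_sizes, requests):
    """A cache downloads each distinct requested package once;
    repeat requests are served from local storage."""
    return sum(repo_sizes[name] for name in set(requests))


if __name__ == "__main__":
    # Made-up repository: package name -> size in bytes.
    repo = {
        "vim": 1_200_000,
        "gcc": 9_000_000,
        "emacs": 30_000_000,
        "obscure-tool": 50_000_000,
    }
    # Several machines install overlapping sets of packages.
    requests = ["vim", "gcc", "vim", "gcc", "vim"]

    print("mirror:", mirror_download_bytes(repo))
    print("cache: ", cache_download_bytes(repo, requests))
```

The more the machines' package sets overlap and the smaller the share of the repository they touch, the more the cache wins on bandwidth; the mirror wins on first-install latency.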