Downloading Entirety of Lubuntu/Ubuntu Man-pages?

I know about this page which is almost exactly what I want. Unfortunately, it is not current.

What I would like is the entirety of the Ubuntu man-pages in a nice, easy-to-read PDF format. I'll accept other formats, but I'd prefer an indexed PDF file for simplicity and portability.

I am also aware of HTTrack which can pull down the pages in HTML format. There are a few reasons that I wish to avoid this - the primary reason being that it's not really a nice thing to do to their bandwidth and servers.

I've searched the Ubuntu site, used an external search engine, and have searched this site. I did find one answer that led me back to HTTrack which is a potential solution but not the ideal solution and, as mentioned, isn't very nice to their servers or bandwidth.

Even more special would be being able to get this specifically for Lubuntu because there are a few differences in software and I'm an avid Lubuntu user but, if need be, I can make do with just the Ubuntu man-pages.

The reason that I want this is because, well, I'd like to read it - in its entirety. More like a book than like a file that is called when needed. I want to be able to read it when I only have access to my phone, tablet, or other computing device, and in an easier-to-read format than the man-pages typically use.


EDIT:

Specifically for Ubuntu (or Lubuntu) version 15.10, as noted in the tags and title. Also, yes - all the man-pages (even redundant and short ones). I'm aware that this is a lot of information which is one of the reasons that I'm trying to avoid using HTTrack.


Solution 1:

Even more special would be being able to get this specifically for Lubuntu because there are a few differences in software and I'm an avid Lubuntu user but, if need be, I can make do with just the Ubuntu man-pages.

There are no differences in manpages between Lubuntu and Ubuntu. One of the points of becoming a recognized flavour is using the same repositories as Ubuntu, so the software is identical; only the starting points (the packages installed by default) differ.

Also, http://manpages.ubuntu.com suffers from a bug where identically named manpages from different packages aren't distinguished - the manpages of the last package read show up.

Instead of hammering the manpages site, hammer the repositories.

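Get a list of the packages that ship manpages from the Contents file for, say, the binary-amd64 architecture (it should be identical to the others), then download each package, extract its manpages, and convert them to PDF: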

mkdir temp
cd temp
# Build a list of every package that ships files under usr/share/man
curl http://archive.ubuntu.com/ubuntu/dists/wily/Contents-amd64.gz |
  gunzip |
  grep 'usr/share/man/' |
  sed 's/.* //;s/,/\n/g' |
  awk -F/ '{print $NF}' |
  sort -u > packages.txt
# Download each package, extract its manpages, and convert them to PDF
while IFS= read -r package
do
    # Skip names that cannot be downloaded instead of failing later
    apt-get download "$package" || continue
    dpkg-deb --fsys-tarfile "$package"_*.deb | tar x ./usr/share/man
    mkdir -p "$package"-manpages
    find ./usr/share/man/man* -type f -exec mv -t "$package"-manpages {} +
    rm "$package"_*.deb
    for page in "$package"-manpages/*
    do
        man -t "$page" | ps2pdf - "$page".pdf
    done
done < packages.txt

Of course, this is going to consume an insane amount of bandwidth - the repository servers are used to it, the question is: is your network up to the task?
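To see what the package-list pipeline does without downloading anything, here is a minimal sketch that runs the same steps on a few sample lines. The function name `packages_with_manpages` and the sample lines are made up for illustration; the only assumption about the real Contents file is the layout the pipeline above already relies on: a path, whitespace, then a comma-separated list of section/package entries.

```shell
# Print the unique names of packages that own files under usr/share/man.
# Reads Contents-style lines on stdin:
#   path   section/package[,section/package...]
packages_with_manpages() {
    grep '^usr/share/man/' |
        sed 's/.*[[:space:]]//;s/,/\n/g' |
        awk -F/ '{print $NF}' |
        sort -u
}

# Made-up sample lines in the assumed Contents layout (illustration only)
sample='usr/bin/ls                      utils/coreutils
usr/share/man/man1/ls.1.gz      utils/coreutils
usr/share/man/man1/editor.1.gz  editors/nano,editors/vim-common'

# Prints coreutils, nano and vim-common, one per line
printf '%s\n' "$sample" | packages_with_manpages
```

Running `wc -l` on the resulting list before starting the download loop gives a rough idea of how many packages it will fetch.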

Solution 2:

For this approach, you will need html2ps, ps2pdf, and a working LaTeX installation. You should be able to install all requirements with

sudo apt-get install html2ps ghostscript texlive-latex-base

Once you've installed the required packages, run this to get the man pages as pdf files:

curl http://manpages.ubuntu.com/manpages/wily/en/man1/ | 
    grep -oP 'href="\K.*?\.1\.html' | 
        while read man; do 
            wget http://manpages.ubuntu.com/manpages/wily/en/man1/"$man" && 
                html2ps "$man"  | ps2pdf - "${man/.html/.pdf}"
        done

You should now have a (huge) collection of pdf files in the directory you ran the command in. By the way, make sure to run the command in a new, empty directory.
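The command above only covers section 1. A possible way to cover the other sections is sketched below. It assumes that the remaining sections live under the same URL layout as man1 (man2, man3, and so on) and that page links end in the section number plus an optional suffix such as 3pm; neither is confirmed here, so check one index page by hand first. The function names are hypothetical, and the one-second pause is only a courtesy to the server.

```shell
base=http://manpages.ubuntu.com/manpages/wily/en

# URL of the index page for one manual section (layout assumed from man1)
section_url() {
    printf '%s/man%s/\n' "$base" "$1"
}

# Read an index page on stdin and print the page links for section $1,
# e.g. ls.1.html, or foo.3pm.html for section 3
page_links() {
    grep -oP 'href="\K[^"/]*\.'"$1"'[^"/.]*\.html'
}

# Fetch every page of section $1 and convert it to PDF,
# pausing between requests
fetch_section() {
    curl -s "$(section_url "$1")" | page_links "$1" |
        while IFS= read -r man; do
            wget -q "$(section_url "$1")$man" &&
                html2ps "$man" | ps2pdf - "${man%.html}.pdf"
            sleep 1
        done
}

# Example (not run here): for s in 1 2 3 4 5 6 7 8; do fetch_section "$s"; done
```

Keeping the URL building and link extraction in small functions makes it easy to test them on a saved index page before starting a long download.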

Now, to combine them into a single, indexed PDF file, you'll need LaTeX, and you'll need to rename the files first because LaTeX doesn't like dots (.) in file names:

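# Turn every dot into a dash, then restore the final .pdf extension
rename 's/\./-/g;s/-pdf$/.pdf/' *pdf
cat <<'EoF' > man1.tex
\documentclass{article}
\usepackage[colorlinks=true,linkcolor=blue]{hyperref}
\usepackage{pdfpages}
\begin{document}
\tableofcontents
\newpage
EoF
# Add one section (and so one table-of-contents entry) per man page
for f in *.pdf; do
    file="${f%.pdf}"
    printf '\\section{%s}\n\\includepdf[pages=-]{%s}\n\n' "$file" "$f" >> man1.tex
done
printf '%s\n' '\end{document}' >> man1.tex
# Run twice so the table of contents picks up the section entries
pdflatex man1.tex && pdflatex man1.tex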

The result is an indexed PDF file of all man pages (I only used 10 for testing).
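One thing to watch with a larger set of pages: some names contain characters such as underscores, and an unescaped underscore in a section title normally makes pdflatex stop with an error. A minimal sketch of a helper that escapes the most common special characters before a name is written as a section title follows. The name `tex_escape` and the sample file name are made up, and the helper deliberately ignores rarer characters such as backslashes, braces, `^` and `~`.

```shell
# Escape characters that LaTeX treats specially in ordinary text
# (handles only _ & % # $)
tex_escape() {
    printf '%s\n' "$1" | sed 's/[_&%#$]/\\&/g'
}

# Made-up example name; prints some\_tool-1
tex_escape 'some_tool-1'
```

In the loop that writes man1.tex, the title argument could then be passed as `"$(tex_escape "$file")"` while the file name argument stays unchanged.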