Why is there a man entry in /etc/passwd?

man (the command, not the user) is a help application. Applications provide man pages in their packages, but man needs to know where they are and what help they provide. To speed things up (so man isn't searching the whole filesystem when you type man <command>), these man pages are indexed into a database by a command called mandb.

In Ubuntu, mandb stores its indexes in a GNU gdbm database at /var/cache/man/index.db (and a few language-specific versions in the same directory). This is a key-value hashing database, not dissimilar to memcache or a hundred other implementations of similar ideas. It's binary, light and fast. I'll throw in an example of how to play with it at the end.

This indexing is scheduled to run daily in Ubuntu by /etc/cron.daily/man-db. The whole script runs as root and does some cleaning up first but right at the end we see mandb being run as the man user:

# --pidfile /dev/null so it always starts; mandb isn't really a daemon,
# but we want to start it like one.
start-stop-daemon --start --pidfile /dev/null \
                  --startas /usr/bin/mandb --oknodo --chuid man \
                  $iosched_idle \
                  -- --no-purge --quiet

It's not changing group, which is why all the group ownerships in /var/cache/man are still root.
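If you want to check that ownership yourself, something like the following should do. It is a minimal sketch that assumes GNU stat (the Ubuntu default); show_owner is a made-up helper name, not part of man-db.

```shell
# Print "owner:group path" for each path given.
# Assumes GNU stat from coreutils; BSD stat uses different flags.
show_owner() {
    stat -c '%U:%G %n' "$@"
}
```

Running `show_owner /var/cache/man /var/cache/man/index.db` should report man as the owner and root as the group, if your setup matches the one described here.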

But why does mandb run as a different user at all? It could (probably) run just as well as root, but it's processing input from a variety of sources (look at manpath). Running as its own user insulates the system from the process blowing up or, worse, being exploited by malformed, corrupted or malicious man pages.
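To get a feel for how many sources that is, you can split the manpath output and count what lives under each directory. This is only a rough sketch: count_pages is a hypothetical helper, and it counts every regular file under each directory, whether or not it is really a man page.

```shell
# Read a colon-separated path list on stdin and print
# "<regular file count> <directory>" for each entry that exists.
count_pages() {
    tr ':' '\n' | while IFS= read -r dir || [ -n "$dir" ]; do
        [ -d "$dir" ] || continue          # skip entries that are not directories
        n=$(find "$dir" -type f | wc -l | tr -d ' ')
        printf '%s %s\n' "$n" "$dir"
    done
}
```

Use it as `manpath | count_pages`. Every one of those files is input that mandb has to parse.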

The worst that could happen would only affect the man pages index. Boo hoo. You can confirm that with something like:

sudo -u man find / -writable 2>/dev/null

And you can use that approach to see how much damage any user could wreak on a system. It's a good idea to audit your file permissions (I just found out that any user could delete my entire music collection, for example).
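The raw find output can be long, so it may help to group it by top-level directory first. A small sketch, assuming one absolute path per line on stdin (summarize_by_topdir is a name I made up for illustration):

```shell
# Count absolute paths read from stdin by their top-level directory,
# printing "<count> <directory>" with the largest count first.
summarize_by_topdir() {
    awk -F/ '{ count["/" $2]++ }
             END { for (d in count) print count[d], d }' | sort -rn
}
```

For example: `sudo -u man find / -writable 2>/dev/null | summarize_by_topdir`. Swap man for any other user to audit them the same way.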


You can peek at the database with accessdb. Here are a few random records:

$ accessdb | shuf -n3
fpurge -> "- 3 3 1380819168 A - - gz purge a stream"
fcgetlangs -> "FcGetLangs 3 3 1402007131 A - - gz Get list of languages"
ipython -> "- 1 1 1393443907 A - - gz Tools for Interactive Computing in Python."

Though not entirely clear from the above, there are actually tab-separated fields in there:

<name> -> <ext> <sec> <mtime> <ID> <ref> <comp> <whatis> 

You can read more about the actual field contents in the technical manual.
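If you want to see the tab-separated fields more clearly, you can split a record up in the shell. This is a sketch that assumes each record looks like the ones above (a key, then ` -> `, then a quoted value with tabs inside). parse_record is a hypothetical helper, and it numbers the fields rather than naming them because the exact layout may differ between man-db versions.

```shell
# Split one accessdb record of the form: key -> "field<TAB>field<TAB>..."
# Prints the key, then each tab-separated field on its own numbered line.
parse_record() {
    line=$1
    key=${line%% -> *}      # text before the first " -> "
    value=${line#* -> }     # text after the first " -> "
    value=${value#\"}       # drop the surrounding double quotes
    value=${value%\"}
    printf 'key: %s\n' "$key"
    printf '%s\n' "$value" | tr '\t' '\n' | awk '{ printf "%d: %s\n", NR, $0 }'
}
```

Feed it records with `accessdb | shuf -n3 | while IFS= read -r rec; do parse_record "$rec"; done`. The last numbered line of each record is the whatis description.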