How can I view updatedb database content, and then exclude certain files/paths?
The updatedb database on my Debian (squeeze) server is quite slow.
- Where is the database located?
- How can I view its content and find out whether some paths contain useless stuff that I could add to PRUNEPATHS?
- How can I prune all paths that match */.git/*, */.svn/* and similar?
- Why don't the files I listed in PRUNEPATHS get excluded?
My /etc/updatedb.conf looks like this:
...
# filesystems which are pruned from updatedb database
PRUNEFS="NFS nfs nfs4 afs binfmt_misc proc smbfs autofs iso9660 ncpfs coda devpts ftpfs devfs mfs shfs sysfs cifs lustre_lite tmpfs usbfs udf"
export PRUNEFS
# paths which are pruned from updatedb database
PRUNEPATHS="/tmp /usr/tmp /var/tmp /afs /amd /alex /var/spool /sfs /media /var/backups/rsnapshot /var/mod_pagespeed/"
...
EDIT:
- The locate database is in /var/cache/locate/locatedb
- locate / will list all files and directories in the database. (I looked through the results by exporting them to a file with locate / > /tmp/locatedb.txt, downloading that text file, and finding a large amount of useless stuff.)
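Instead of reading the exported list by hand, the paths can be grouped by their leading directories to see which trees contribute the most entries. The following is only a sketch: count_by_prefix is a hypothetical helper name, and it assumes one absolute path per line on stdin with no newlines inside file names.

```shell
#!/bin/sh
# Hypothetical helper: count database entries per directory prefix,
# largest first. Reads one absolute path per line on stdin.
# $1 is the number of leading path components to group by (default 2).
count_by_prefix() {
    depth="${1:-2}"
    awk -F/ -v d="$depth" '
        {
            # Splitting on "/" leaves field 1 empty for absolute paths,
            # so the first real component is field 2.
            prefix = ""
            for (i = 2; i <= d + 1 && i <= NF; i++) prefix = prefix "/" $i
            count[prefix]++
        }
        END { for (p in count) print count[p], p }
    ' | sort -k1,1nr -k2
}
```

Possible usage: locate / | count_by_prefix 2 | head -n 20 prints the twenty most populated two-level prefixes, which are natural candidates for PRUNEPATHS.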
Solution 1:
You are probably using the GNU findutils version of locate, which doesn't support the PRUNENAMES option. Installing mlocate will provide these configuration options:
apt-get remove locate
mv /etc/updatedb.conf /etc/updatedb.conf-GNU.old
apt-get install mlocate
Now, with the mlocate package installed, you can edit or create /etc/updatedb.conf and add these lines:
PRUNENAMES=".git .bzr .hg .svn"
PRUNEPATHS="/tmp /var/spool /var/cache /media /usr/tmp /var/tmp /sfs /afs /amd /alex /var/backups/rsnapshot /var/mod_pagespeed"
# the paths in `PRUNEPATHS` must be without trailing slashes
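The PRUNEPATHS value in the question ends with /var/mod_pagespeed/, so it may be worth normalising an existing list before pasting it into the new file. This is a minimal sketch: strip_trailing_slashes is a hypothetical helper, and it assumes the entries contain no spaces or shell wildcard characters.

```shell
#!/bin/sh
# Hypothetical helper: remove trailing slashes from every entry of a
# whitespace-separated path list passed as $1.
strip_trailing_slashes() {
    out=""
    for p in $1; do
        # Drop each trailing "/", but leave a bare "/" untouched.
        while [ "$p" != "/" ] && [ "${p%/}" != "$p" ]; do
            p="${p%/}"
        done
        out="${out:+$out }$p"
    done
    printf '%s\n' "$out"
}
```

For example, strip_trailing_slashes "/tmp /var/mod_pagespeed/" prints the two paths with the trailing slash removed, ready to be placed inside the PRUNEPATHS quotes.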
Then rebuild the database with:
updatedb
You can probably remove the huge old locate database:
rm /var/cache/locate/locatedb
(The mlocate database is stored at /var/lib/mlocate/mlocate.db.)
Check out https://apps.ubuntu.com/cat/applications/mlocate/ for more information about the package.
(I spent a ridiculous amount of time trying to solve a similar issue!)
Solution 2:
Use PRUNENAMES, as described in man updatedb.conf:
A whitespace-separated list of directory names (without paths) which should not be scanned by updatedb(8). By default, no directory names are skipped.
Setting
PRUNENAMES=".git .hg .svn"
should do the trick (the line above is the default value on Fedora 18).
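To gauge how much such a setting would save, one can count how many entries of the current database sit inside version-control directories. The sketch below assumes one path per line on stdin; count_vcs_entries is a hypothetical name, and the directory list simply mirrors the names mentioned in this thread.

```shell
#!/bin/sh
# Hypothetical helper: count paths that are, or lie inside, a
# .git, .hg, .svn or .bzr directory. Reads one path per line on stdin.
count_vcs_entries() {
    # The "(/|$)" part keeps names such as ".gitignore" from matching.
    # grep -c exits non-zero when the count is 0, hence "|| true".
    grep -cE '/\.(git|hg|svn|bzr)(/|$)' || true
}
```

Possible usage: locate / | count_vcs_entries, run once before and once after changing the configuration and rebuilding the database.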
Solution 3:
locate / will list all files and directories in the database.
Solution 4:
why don't the files get excluded, I defined in PRUNEPATHS
Although the OP's problem turned out to be the locate version and PRUNENAMES, there is an alternative (or addition) to trawling through the locate database output: running updatedb manually with the --debug-pruning flag prints the individual pruning decisions to stderr, which is really useful for tracking down pruning problems.
For example, send the output to a file (as root in this case):
updatedb --debug-pruning > ~/updatedb_debug.log 2>&1 &
Sample output:
Matching bind_mount_paths:
...done
Checking whether filesystem `/boot' is excluded:
`/', type `rootfs'
`/proc', type `proc'
=> type matches, dir `/proc'
`/run', type `tmpfs'
...
Checking whether filesystem `/mnt/windows' is excluded:
Checking whether filesystem `/proc' is excluded:
Checking whether filesystem `/run' is excluded:
...
Skipping `/dev/mqueue': in prunefs
Skipping `/dev/pts': in prunefs
etc.
(I am using mlocate.)
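The debug log can get long, so it may help to pull out only the "Skipping" lines. The following sketch relies solely on the line shape shown in the sample output above; list_skipped is a hypothetical helper, and it is an assumption that other pruning reasons are logged in the same "Skipping `PATH': in REASON" form.

```shell
#!/bin/sh
# Hypothetical helper: print "REASON PATH" for every line of the form
#   Skipping `PATH': in REASON
# read from stdin; all other lines are ignored.
list_skipped() {
    sed -n "s/^Skipping \`\(.*\)': in \(.*\)\$/\2 \1/p"
}
```

Possible usage: list_skipped < ~/updatedb_debug.log | sort | uniq -c shows which paths were pruned and why. A path that is expected to be pruned but is missing from this list points at a mismatch in the configuration.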