find vs. locate
The commands find and locate both search for files on disk.
I know that find recursively walks every relevant subdirectory, so it is slow but up to date, whereas locate uses a database that gets updated every now and then (when exactly?), so it shows results quickly but they might be outdated.
Are there any other differences? In which situations would one prefer one over the other? And when does the locate database usually get updated?
locate is really only good for finding files and displaying them to humans. You can do a few things with it, but I wouldn't trust its output enough to parse it and, as you say, it's impossible to guarantee the state of the internal database, all the more so because the update is only scheduled to run from /etc/cron.daily/mlocate, once a day!
find is live. It filters, excludes, executes. It's suitable for parsing. It can output relative paths. It can output full paths. It can do things based on attributes, not just names.
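To make those points concrete, here is a minimal sketch run against a throwaway directory. The directory layout and file names are made up for illustration, and the options shown are only a few of what find offers.

```shell
#!/bin/sh
# Build a throwaway tree (hypothetical names) so nothing real is touched.
demo=$(mktemp -d)
mkdir -p "$demo/src" "$demo/cache"
printf 'hello\n' > "$demo/src/notes.txt"
printf 'stale\n' > "$demo/cache/notes.txt"
: > "$demo/src/empty.log"

# Filter by type and name, and exclude the cache directory with -prune.
by_name=$(find "$demo" -path "$demo/cache" -prune -o -type f -name 'notes.txt' -print)
echo "$by_name"

# Select on an attribute instead of a name: regular files of size zero.
by_size=$(find "$demo" -type f -size 0)
echo "$by_size"

# Execute a command on every match (here: count lines).
find "$demo/src" -type f -name '*.txt' -exec wc -l {} \;

# Relative paths: start from "." inside the tree.
relative=$(cd "$demo" && find . -type f -name '*.log')
echo "$relative"

rm -rf "$demo"
```

Starting from an absolute directory gives full paths and starting from "." gives relative ones, which is what makes the output easy to feed into other tools.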
locate certainly has a place in my toolbox, but it's usually right at the bottom as a last-ditch effort to find something. It's easier to use than find, too.
As much as I like Oli (which is a lot!), I disagree with him on the find command. I don't like it.
find command takes over three minutes
Take for example this simple command:
$ time find / -type f -name "mail-transport-agent.target"
find: ‘/lost+found’: Permission denied
find: ‘/etc/ssmtp’: Permission denied
find: ‘/etc/ssl/private’: Permission denied
(... SNIP ...)
find: ‘/run/user/997’: Permission denied
find: ‘/run/sudo’: Permission denied
find: ‘/run/systemd/inaccessible’: Permission denied
real 3m40.589s
user 0m4.156s
sys 0m8.874s
It takes over three minutes for find to search everything starting from /. By default, reams of error messages appear and you must search through them to find what you are looking for. Still, it is better than using grep to search the whole drive for a string, which takes 53 hours (see "`grep`ing all files for a string takes a long time").
I know I can fiddle with the find command's parameters to make it work better, but the point here is the amount of time it takes to run.
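For what it's worth, the kind of fiddling meant here might look like the sketch below. The function name quiet_find is made up for this example. It discards the "Permission denied" messages by redirecting stderr, and uses -xdev so find does not cross into other mounted filesystems. It does not make a search from / fundamentally faster; only a narrower starting directory does that.

```shell
#!/bin/sh
# quiet_find is a hypothetical helper name, not a standard command.
# $1 = directory to start from, $2 = file name to match
quiet_find() {
    # 2>/dev/null hides error messages such as "Permission denied";
    # -xdev keeps the search on the filesystem that holds $1.
    find "$1" -xdev -type f -name "$2" 2>/dev/null
}
```

Calling quiet_find / mail-transport-agent.target would repeat the search above without the noise, and quiet_find /lib mail-transport-agent.target would limit it to a single tree.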
locate command takes less than a second
Now let's use locate:
$ time locate mail-transport-agent.target
/lib/systemd/system/mail-transport-agent.target
real 0m0.816s
user 0m0.792s
sys 0m0.024s
The locate command takes less than a second!
updatedb is only run once a day by default
It is true that the updatedb command, which updates the locate database, is only run once a day by default. You can run it manually before searching for recently added files:
$ time sudo updatedb
real 0m3.460s
user 0m0.503s
sys 0m1.167s
Although this takes about 3 seconds, that is small in comparison to the find command's 3+ minutes.
I've updated root's crontab (via sudo crontab -e) to include the last line below:
# m h dom mon dow command
0 0 1 * * /bin/journalctl --vacuum-size=200M
*/5 * * * * /usr/bin/updatedb
Now updatedb runs every five minutes and the locate command's database is almost always up to date.
But there are no attributes?
You can pipe locate output to other commands. If, for example, you want the file attributes, you can use:
$ locate mail-transport-agent.target | xargs stat
File: '/lib/systemd/system/mail-transport-agent.target'
Size: 473 Blocks: 8 IO Block: 4096 regular file
Device: 10305h/66309d Inode: 667460 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2018-03-31 18:11:55.091173104 -0600
Modify: 2017-10-27 04:11:45.000000000 -0600
Change: 2017-10-28 07:18:24.860065653 -0600
Birth: -
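One caveat with a plain pipe into xargs is that paths containing spaces get split into several arguments. A possible variant, assuming a locate that supports -0 (mlocate and plocate both do) and GNU xargs, passes NUL-delimited paths instead. The helper name stat_paths0 is made up for this sketch.

```shell
#!/bin/sh
# stat_paths0 is a hypothetical helper name.
# It reads NUL-delimited paths on stdin and runs stat on them;
# any arguments are passed through to stat as options.
stat_paths0() {
    # -0: split input on NUL bytes, so spaces in paths are safe.
    # -r: (GNU xargs) do not run stat at all when there is no input.
    xargs -0 -r stat "$@"
}

# Intended use, not run here because it needs a locate database:
#   locate -0 mail-transport-agent.target | stat_paths0
```

Because the database can lag behind the disk, stat may still report an error for a path that locate lists but that has since been deleted.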
Summary
I posted this answer to show the speed and ease of use of locate. I tried to address some of the command's shortcomings pointed out by others.
The find command needs to traverse the entire directory structure to find files. The locate command has its own database, which gives it lightning speed in comparison.