What makes 'locate' so fast compared with 'find'?

In my mind, both locate and find search for files, so why does locate run so much faster?

According to its documentation, locate:

DESCRIPTION
locate reads one or more databases prepared by updatedb(8) and writes file names matching at least one of the PATTERNs to standard output, one per line.

What files are in that database, and is every created file in that database?


Solution 1:

In my mind, both locate and find search for files, so why does locate run so much faster?

find searches the file system itself. The file system is optimised for telling you everything about the file at a specific path (including its content, which can be many gigabytes in size) and for being written to frequently. It is not optimised for searching by name, so find has to walk the directory tree on every query.

locate searches a database generated from previously indexing the file system. The database is optimised for the types of searches locate performs.
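
To make the difference concrete, here is a minimal Python sketch of the two approaches. It is not how find or locate are actually implemented; the function names (find_by_walk, build_index, search_index) are made up for illustration, and matching is done on the file name only, which is a simplification of what locate does.

```python
import fnmatch
import os
import tempfile


def find_by_walk(root, pattern):
    """Search like find: visit every directory again on each query."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if fnmatch.fnmatch(name, pattern):
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)


def build_index(root):
    """Do the expensive walk once (the updatedb role) and keep only the paths."""
    paths = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            paths.append(os.path.join(dirpath, name))
    return sorted(paths)


def search_index(index, pattern):
    """Search like locate: scan the saved list without touching the directory tree."""
    return [p for p in index if fnmatch.fnmatch(os.path.basename(p), pattern)]


def demo():
    """Build a small tree and check that both searches agree."""
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "a", "b"))
        for parts in (("a", "notes.txt"), ("a", "b", "report.txt"), ("a", "b", "image.png")):
            open(os.path.join(root, *parts), "w").close()
        index = build_index(root)              # paid once
        walked = find_by_walk(root, "*.txt")   # paid on every query
        looked_up = search_index(index, "*.txt")
        return walked == looked_up, len(looked_up)


if __name__ == "__main__":
    print(demo())
```

The walk costs one directory read per directory on every search, while the index search is a pass over a list that was built ahead of time.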

What files are in that database, and is every created file in that database?

The database is populated by updatedb. The files in it are determined by what options are passed to updatedb. A file will be in it unless it is outside the areas being searched or was created after updatedb last ran.
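
The "created since updatedb last ran" case is easy to see with a toy snapshot. This is only a sketch under the assumption that the database is a plain set of paths; snapshot and demo_staleness are hypothetical names, not part of any real tool.

```python
import os
import tempfile


def snapshot(root):
    """Record every file path under root at this moment (a stand-in for updatedb)."""
    return {
        os.path.join(dirpath, name)
        for dirpath, _dirs, files in os.walk(root)
        for name in files
    }


def demo_staleness():
    """Show that a file created after the snapshot is missing until the next one."""
    with tempfile.TemporaryDirectory() as root:
        old = os.path.join(root, "old.txt")
        new = os.path.join(root, "new.txt")
        open(old, "w").close()
        index = snapshot(root)    # the "updatedb" run
        open(new, "w").close()    # created after the index was built
        before = (old in index, new in index)
        index = snapshot(root)    # the next "updatedb" run
        after = (old in index, new in index)
        return before, after


if __name__ == "__main__":
    print(demo_staleness())
```

The same staleness works in the other direction: a file deleted after the snapshot would still be listed until the index is rebuilt.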

For example, my default install of Ubuntu has:

PRUNE_BIND_MOUNTS="yes"
PRUNEPATHS="/tmp /var/spool /media /var/lib/os-prober /var/lib/ceph /home/.ecryptfs /var/lib/schroot"
PRUNEFS="NFS afs autofs binfmt_misc ceph cgroup cgroup2 cifs coda configfs curlftpfs debugfs devfs devpts devtmpfs ecryptfs ftpfs fuse.ceph fuse.cryfs fuse.encfs fuse.glusterfs fuse.gvfsd-fuse fuse.mfs fuse.rozofs fuse.sshfs fusectl fusesmb hugetlbfs iso9660 lustre lustre_lite mfs mqueue ncpfs nfs nfs4 ocfs ocfs2 proc pstore rpc_pipefs securityfs shfs smbfs sysfs tmpfs tracefs udev udf usbfs"

in the /etc/updatedb.conf file.

So it indexes everything except for certain directories which shouldn't be indexed for various (but hopefully fairly obvious) reasons, and a bunch of different file system types (typically ones containing secret data, data on remote file systems, and system APIs).
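
As a rough illustration of how those two settings exclude things, here is a Python sketch that reads NAME="value" lines and decides whether a path would be skipped. It is a simplified reader written for this answer, not updatedb's real parser: parse_updatedb_conf and is_pruned are hypothetical names, the sample config is an abridged subset of the values above, the file system type is passed in by hand instead of being looked up from the mount table, and the case-insensitive PRUNEFS comparison is my understanding of the behaviour rather than something shown here.

```python
SAMPLE_CONF = '''
PRUNE_BIND_MOUNTS="yes"
PRUNEPATHS="/tmp /var/spool /media"
PRUNEFS="NFS proc sysfs tmpfs"
'''


def parse_updatedb_conf(text):
    """Parse NAME="value" lines into a dict of whitespace-separated lists."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        # Drop the surrounding double quotes, then split the list on whitespace.
        settings[name.strip()] = value.strip().strip('"').split()
    return settings


def is_pruned(path, fstype, settings):
    """Return True if the path or its file system type is excluded from indexing."""
    # Assumption: file system types are compared case-insensitively.
    pruned_types = {t.lower() for t in settings.get("PRUNEFS", [])}
    if fstype.lower() in pruned_types:
        return True
    for prefix in settings.get("PRUNEPATHS", []):
        # Match the directory itself or anything below it, but not "/tmpfoo".
        if path == prefix or path.startswith(prefix.rstrip("/") + "/"):
            return True
    return False


if __name__ == "__main__":
    conf = parse_updatedb_conf(SAMPLE_CONF)
    for path, fstype in [("/tmp/scratch", "ext4"), ("/home/me/a.txt", "ext4"), ("/mnt/share/b", "nfs")]:
        print(path, fstype, is_pruned(path, fstype, conf))
```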