What's a command line way to find large files/directories to remove and free up space?
Looking for a series of commands that will show me the largest files on a drive.
If you just need to find large files, you can use find with the -size option. The next command will list all files larger than 10MiB (not to be confused with 10MB):
find / -size +10M -ls
If you want to find files in a certain size range, you can combine it with a "size lower than" search. The next command finds files between 10MiB and 12MiB:
find / -size +10M -size -12M -ls
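If you also want to see how big each match is and keep the search on a single filesystem, a sketch along these lines should work (the /home starting point and the limit of 20 results are just illustrative choices):
# print a human-readable size next to each file over 10MiB, biggest first
find /home -xdev -type f -size +10M -exec du -h {} + | sort -hr | head -n 20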
apt-cache search 'disk usage'
lists some programs available for disk usage analysis. One application that looks very promising is gt5.
From the package description:
Years have passed and disks have become larger and larger, but even in this era of incredibly huge hard disks, the space seems to disappear over time. This small and effective program provides a more convenient listing than the default du(1). It displays what has happened since the last run and shows directory sizes and the total percentage. It is possible to navigate and ascend to directories by using cursor keys with a text-based browser (links, elinks, lynx, etc.).
On the "related packages" section of gt5, I found ncdu
. From its package description:
Ncdu is an ncurses-based du viewer. It provides a fast and easy-to-use interface to the famous du utility. It allows browsing through directories and shows percentages of disk usage with the ncurses library.
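If you go the ncdu route, a typical invocation (assuming the package is already installed, e.g. with apt install ncdu) is simply:
sudo ncdu -x /
The -x flag keeps the scan on one filesystem, and you can then browse directories with the cursor keys and delete files straight from the interface.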
My favorite solution uses a mix of several of these good answers.
du -aBM 2>/dev/null | sort -nr | head -n 50 | more
du arguments:
- -a for "all" files and directories. Leave it off for just directories.
- -BM to output the sizes in megabyte (M) block sizes (B).
- 2>/dev/null to exclude "permission denied" error messages (thanks @Oli).
sort arguments:
- -n for "numeric".
- -r for "reverse" (biggest to smallest).
head arguments:
- -n 50 for just the top 50 results.
- Leave off more if using a smaller number.
Note: Prefix the command with sudo to include directories that your account does not have permission to access.
Example showing the top 10 biggest files and directories in /var (including the grand total).
cd /var
sudo du -aBM 2>/dev/null | sort -nr | head -n 10
7555M .
6794M ./lib
5902M ./lib/mysql
3987M ./lib/mysql/my_database_dir
1825M ./lib/mysql/my_database_dir/a_big_table.ibd
997M ./lib/mysql/my_database_dir/another_big_table.ibd
657M ./log
629M ./log/apache2
587M ./log/apache2/ssl_access.log
273M ./cache
I just use a combination of du and sort.
sudo du -sx /* 2>/dev/null | sort -n
0 /cdrom
0 /initrd.img
0 /lib64
0 /proc
0 /sys
0 /vmlinuz
4 /lost+found
4 /mnt
4 /nonexistent
4 /selinux
8 /export
36 /media
56 /scratchbox
200 /srv
804 /dev
4884 /root
8052 /bin
8600 /tmp
9136 /sbin
11888 /lib32
23100 /etc
66480 /boot
501072 /web
514516 /lib
984492 /opt
3503984 /var
7956192 /usr
74235656 /home
Then it's a case of rinse and repeat. Target the subdirectories you think are too big, run the command on them, and you'll find out what's causing the problem.
Note: I use du's -x flag to keep things limited to one filesystem (I have quite a complicated arrangement of cross-mounted things between SSD and RAID5).
Note 2: 2>/dev/null redirects any error messages into oblivion. If they don't bother you, it's not obligatory.
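The "rinse and repeat" step above might look like this, assuming /home turned out to be the biggest directory in the first pass (the path is just an example):
sudo du -sx /home/* 2>/dev/null | sort -n | tail -n 10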
To display the top 20 biggest directories (recursively) in the current folder, use the following one-liner:
du -ah . | sort -rh | head -20
or (more Unix oriented):
du -a . | sort -rn | head -20
For the top-20 biggest files in the current directory (recursively):
ls -1Rs | sed -e "s/^ *//" | grep "^[0-9]" | sort -nr | head -n20
or with human readable sizes:
ls -1Rhs | sed -e "s/^ *//" | grep "^[0-9]" | sort -hr | head -n20
Please note that -h is available for GNU sort only, so to make it work properly on OSX/BSD you have to install it from coreutils. Then add its folder to your PATH.
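On macOS, for instance, one way to do that (assuming Homebrew, whose coreutils package installs the GNU tools with a g prefix) might be:
brew install coreutils
ls -1Rhs | sed -e "s/^ *//" | grep "^[0-9]" | gsort -hr | head -n20
Alternatively, put Homebrew's gnubin directory at the front of your PATH so the plain sort name resolves to GNU sort.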
So these aliases are useful to keep in your rc files, so they're available whenever you need them:
alias big='du -ah . | sort -rh | head -20'
alias big-files='ls -1Rhs | sed -e "s/^ *//" | grep "^[0-9]" | sort -hr | head -n20'
qbi's answer is correct, but it will be very slow when there are a lot of files, since it starts a new ls process for each item.
A much faster version using find, without spawning child processes, is to use -printf to print the size in bytes (%s) and the path (%p):
find "$directory" -type f -printf "%s - %p\n" | sort -n | tail -n $num_entries