How to find the n largest files in a folder?
How can I find the n largest files under a folder, excluding the files that sit directly in the top-level folder?
In this example, for n=2:
dir
--file (size 50KB)
--dir1
--dir2
----file2_1.txt (size 25KB)
--dir3
----dir3_1
------file3_1.txt (size 35KB)
------file3_2 (size 25KB)
Result:
dir/dir3/dir3_1/file3_1.txt 35KB
dir/dir2/file2_1.txt 25KB
Solution 1:
find . -mindepth 2 -type f -printf "%s\t%p\n" | sort -n | cut -f 2- | tail -n "$n"
Here -mindepth 2 skips everything directly inside the starting folder, and -type f keeps regular files only so that directories are not listed. The largest file comes last. To list the largest file first instead:
find . -mindepth 2 -type f -printf "%s\t%p\n" | sort -nr | cut -f 2- | head -n "$n"
# .....................................................^...............^^^^
With the GNU toolset, you can handle filenames that contain newlines (annoying but valid):
find . -mindepth 2 -type f -printf "%s\t%p\0" | sort -znr | cut -zf 2- | head -zn "$n"
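The NUL-delimited output is meant to be read by another program rather than printed straight to a terminal. A minimal sketch of consuming it in bash, assuming bash plus GNU find, sort, cut and head; the function name largest_files is made up for this example:

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of any tool): print the n largest regular
# files found at least one directory below DIR, one per line. Each path is
# quoted with %q so that odd characters, newlines included, stay visible.
largest_files() {
    local n=$1 dir=${2:-.} path
    (
        cd -- "$dir" || exit 1
        find . -mindepth 2 -type f -printf '%s\t%p\0' |
            sort -znr |
            cut -zf 2- |
            head -zn "$n" |
            while IFS= read -r -d '' path; do
                printf '%q\n' "$path"
            done
    )
}
```

For example, largest_files 2 dir lists the two largest files below dir; the paths are relative to dir because the function changes into it first.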
To get the output in your desired format, with a human-readable size after each path:
find . -mindepth 2 -type f -printf "%s\t%p\n" |
  sort -nr |
  head -n "$n" |
  perl -MNumber::Bytes::Human=format_bytes -F'\t' -lane '
    push @F, format_bytes(shift @F);
    print join "\t", @F;
  '
This uses the Perl module Number::Bytes::Human from CPAN.
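If installing a CPAN module is not an option, numfmt from GNU coreutils can do a similar conversion. The following is only a sketch under that assumption (GNU find and numfmt available); the function name largest_files_human is made up, and the size strings will not match format_bytes exactly:

```shell
#!/usr/bin/env bash
# Hypothetical helper: list the n largest regular files at least one
# directory below the current directory as "path<TAB>human-readable size".
largest_files_human() {
    local n=$1
    find . -mindepth 2 -type f -printf '%s\t%p\n' |
        sort -nr |
        head -n "$n" |
        # convert the first tab-separated field (bytes) to IEC units
        numfmt --to=iec --field=1 --delimiter=$'\t' |
        # move the size from the front of the line to the end
        awk -F'\t' -v OFS='\t' '{ size = $1; sub(/^[^\t]*\t/, ""); print $0, size }'
}
```

Like the newline-delimited commands above, this version assumes that file names contain no newlines.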
Solution 2:
Although you tagged your question bash, here is a zsh solution in case others find it useful.
Given
% tree -h dir
dir
├── [ 512]  dir1
├── [ 512]  dir2
│   └── [ 25K]  file2_1.txt
├── [ 512]  dir3
│   └── [ 512]  dir3_1
│       ├── [ 35K]  file3_1.txt
│       └── [ 25K]  file3_2.txt
└── [ 50K]  file

4 directories, 4 files
then using zsh with glob qualifiers:
% print -RC1 dir/*/**/*(.OLon[1,2])
dir/dir3/dir3_1/file3_1.txt
dir/dir2/file2_1.txt
where
- dir/*/ ensures we start at least 1 directory below dir, the equivalent of find's -mindepth
- **/* is a shell glob that matches recursively (the same is available in bash if the globstar option is set)
- () encloses a collection of qualifiers, specifically
  - . matches regular files only (the equivalent of find -type f)
  - OL orders the results by size (Length) descending, while the on qualifier breaks ties by name ascending
  - [1,2] selects a range of results

Unlike find, shell globs generally omit hidden files by default. If you want to include them, add D to the qualifiers, i.e. (.DOLon[1,2])
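Since bash has globstar but no glob qualifiers, the filtering and ordering have to be done by hand there. A rough sketch of a bash counterpart, assuming bash 4 or later and GNU stat (for stat -c %s); the function name largest_files_glob is made up, and unlike the zsh glob this breaks on file names containing newlines:

```shell
#!/usr/bin/env bash
# Hypothetical helper: approximate the zsh glob DIR/*/**/*(.OLon[1,n]) in bash.
largest_files_glob() {
    local n=$1 dir=$2 f
    (
        # set the options in a subshell so they do not leak to the caller;
        # adding dotglob here would play the role of zsh's D qualifier
        shopt -s globstar nullglob
        for f in "$dir"/*/**/*; do
            # like the "." qualifier: regular files only, symlinks excluded
            [[ -f $f && ! -L $f ]] || continue
            printf '%s\t%s\n' "$(stat -c %s -- "$f")" "$f"
        done |
            # size descending (like OL), then name ascending (like on)
            sort -t $'\t' -k1,1nr -k2 |
            cut -f 2- |
            head -n "$n"
    )
}
```

With the tree shown earlier, largest_files_glob 2 dir should select the same two files as the zsh command, although spawning stat once per file makes it much slower on large trees.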
Solution 3:
Off the top of my head:
ls -lsR */ | awk '$2 ~ /^-/ {print $6, $10}' | sort -nr | head -n 5