Calculating total file size by extension in shell
Solution 1:
For any given extension you can use
find /path -name '*.frq' -exec ls -l {} \; | awk '{ Total += $5} END { print Total }'
to get the total file size for that type.
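If you would rather not parse ls output, the same idea can be sketched with find printing the sizes itself. This assumes GNU find (for -printf), and the function name total_size_for_ext is made up for illustration:

```shell
#!/bin/bash
# total_size_for_ext DIR EXT
# Print the combined size in bytes of regular files under DIR whose
# names end in .EXT. Hypothetical helper; assumes GNU find for -printf.
total_size_for_ext() {
    local dir=$1 ext=$2
    # %s is the size in bytes; one number per line goes to awk.
    # "total + 0" makes awk print 0 instead of an empty line when nothing matches.
    find "$dir" -type f -name "*.${ext}" -printf '%s\n' |
        awk '{ total += $1 } END { print total + 0 }'
}
```

Called as total_size_for_ext /path frq, it should print a single number, and it is not confused by spaces in file names because the name never reaches awk.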
And after some thinking, here is a script that does this for every extension it finds:
#!/bin/bash
ftypes=$(find . -type f | grep -E ".*\.[a-zA-Z0-9]*$" | sed -e 's/.*\(\.[a-zA-Z0-9]*\)$/\1/' | sort | uniq)
for ft in $ftypes
do
echo -n "$ft "
find . -name "*${ft}" -exec ls -l {} \; | awk '{total += $5} END {print total}'
done
This outputs the total size in bytes of each file type found.
Solution 2:
With bash version 4, you only need to call find; ls and awk are not necessary:
declare -A ary
while IFS=$'\t' read -r name size; do
ext=${name##*.}
((ary[$ext] += size))
done < <(find . -type f -printf "%f\t%s\n")
for key in "${!ary[@]}"; do
printf "%s\t%s\n" "$key" "${ary[$key]}"
done
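One thing to watch with the loop above: a file with no dot in its name ends up with its whole name as the "extension". A possible variation, only a sketch, groups such files under a placeholder key. The function name sum_by_ext and the "(none)" label are my own choices, and it assumes bash 4+ and GNU find:

```shell
#!/bin/bash
# sum_by_ext [DIR]
# Print "extension<TAB>total bytes" for regular files under DIR.
# Hypothetical helper; needs bash 4+ (associative arrays) and GNU find.
sum_by_ext() {
    local dir=${1:-.} size name ext key
    local -A totals=()
    # find prints "size/basename" terminated by NUL. A basename never
    # contains "/", so splitting on it keeps spaces, tabs and newlines
    # in the name intact.
    while IFS=/ read -r -d '' size name; do
        ext=${name##*.}
        # No dot, only a leading dot (.bashrc), or a trailing dot:
        # count the file under the placeholder key instead.
        if [[ $name != ?*.* || -z $ext ]]; then
            ext='(none)'
        fi
        totals[$ext]=$(( ${totals[$ext]:-0} + size ))
    done < <(find "$dir" -type f -printf '%s/%f\0')
    for key in "${!totals[@]}"; do
        printf '%s\t%s\n' "$key" "${totals[$key]}"
    done
}
```

The order of the output lines is not defined for associative arrays, so pipe it through sort if you need a stable listing.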
Solution 3:
The second column of each line (the file name) is split on . and the last part (the extension) is used as the key of an array that accumulates the sizes.
#!/bin/bash
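find . -type f -printf "%s\t%f\n" | awk -F'\t' '
{
# -F"\t" keeps file names containing spaces in one field;
# split() returns the number of pieces, so ext[n] is the last one
n = split($2, ext, ".")
e = ext[n]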
size[e] += $1
}
END{
for(i in size)
print size[i], i
}' | sort -n
This gives you the total size of each extension in bytes:
60055 gemspec
321991 txt
2075312 html
2745143 rb
13387264 gem
47196526 jar
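To see what the split step does with different names, here is a tiny sketch that isolates it. The helper name ext_of is made up for illustration:

```shell
#!/bin/bash
# ext_of NAME
# Print the last "."-separated piece of NAME, the same way the awk
# script above picks the extension. Hypothetical helper.
ext_of() {
    awk -v name="$1" 'BEGIN {
        n = split(name, parts, ".")   # n is the number of pieces
        print parts[n]                # the last piece
    }'
}
```

For archive.tar.gz only the final piece counts, and a name without any dot comes back unchanged, which means such files are totalled under their own name rather than under an extension.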
Solution 4:
Extending on Iain's script with a faster version for working with a large number of files.
#!/bin/bash
ftypes=$(find . -type f | grep -E ".*\.[a-zA-Z0-9]*$" | sed -e 's/.*\(\.[a-zA-Z0-9]*\)$/\1/' | sort | uniq)
for ft in $ftypes
do
echo -ne "$ft\t"
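# du reads the NUL-separated list from stdin, so a single total covers all files
# (with -exec ... + a very long list can be split across several du runs)
find . -type f -name "*${ft}" -print0 | du -bch --files0-from=- | tail -1 | sed 's/\stotal//'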
done