Find the number of files for each extension in a directory
find "$path" -type f | sed -e '/.*\/[^\/]*\.[^\/]*$/!s/.*/(none)/' -e 's/.*\.//' | LC_COLLATE=C sort | uniq -c
Explanation:
- find "$path" -type f
  gets a recursive listing of all the files in the "$path" folder.
- sed -e '/.*\/[^\/]*\.[^\/]*$/!s/.*/(none)/' -e 's/.*\.//'
  applies two regular expressions:
  - /.*\/[^\/]*\.[^\/]*$/!s/.*/(none)/
    replaces every file name without an extension with (none).
  - s/.*\.//
    strips everything up to the last dot, leaving the extension of the remaining files.
- LC_COLLATE=C sort
  sorts the result by byte value, keeping the symbols at the top.
- uniq -c
  counts the number of repeated entries.
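For comparison, the steps above can be sketched in Python. This is only a rough equivalent under stated assumptions, not part of the original answer: the function name count_extensions is hypothetical, and os.walk lists every non-directory entry (symlinks included), which is slightly broader than find -type f.

```python
import os
from collections import Counter

def count_extensions(path):
    """Count files under `path` recursively, keyed by extension.

    Hypothetical helper that mirrors the find | sed | sort | uniq -c pipeline.
    """
    counts = Counter()
    # os.walk yields every non-directory entry, so symlinks are counted too.
    for _dirpath, _dirnames, filenames in os.walk(path):
        for name in filenames:
            # Same rule as the sed expressions: no dot in the base name
            # means "(none)", otherwise keep what follows the last dot.
            ext = name.rsplit(".", 1)[1] if "." in name else "(none)"
            counts[ext] += 1
    return counts

if __name__ == "__main__":
    counts = count_extensions(".")
    # Plain string sorting orders by code point, similar to LC_COLLATE=C sort.
    for ext in sorted(counts):
        print(f"{counts[ext]:7d} {ext}")
```

Because the extension is taken from the base name only, a dot in a directory name does not affect the result, which matches the first sed expression.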
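Using Python:
import os
from collections import Counter
from pprint import pprint

# Collect the extension of every entry in the current directory (not recursive).
lst = []
for file in os.listdir('./'):
    # splitext returns (root, ext); ext keeps its leading dot, or is '' if there is none.
    name, ext = os.path.splitext(file)
    lst.append(ext)
pprint(Counter(lst))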
The output:
Counter({'': 7,
         '.png': 4,
         '.mp3': 3,
         '.jpg': 3,
         '.mkv': 3,
         '.py': 1,
         '.swp': 1,
         '.sh': 1})
If you have GNU awk, you could do something like
printf '%s\0' * | gawk 'BEGIN{RS="\0"; FS="."; OFS="\t"}
{a[(NF>1 ? $NF : "(none)")]++}
END{for(i in a) print a[i],i}
'
i.e. construct / increment an associative array keyed on the last dot-separated field, or on some arbitrary fixed string such as (none) if there is no extension.
mawk doesn't seem to allow a null-byte record separator. You could use mawk with the default newline separator if you are confident that you don't need to deal with newlines in your file names:
printf '%s\n' * | mawk 'BEGIN{FS="."; OFS="\t"} {a[(NF>1 ? $NF : "(none)")]++} END{for(i in a) print a[i],i}'
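The keying rule used by both awk commands can be illustrated with a small Python sketch. The helper name ext_key and the sample file names are made up for illustration. Note that under this rule a leading-dot name such as .bashrc is keyed as bashrc, whereas os.path.splitext would report no extension for it.

```python
from collections import Counter

def ext_key(name):
    # Split on every "." like awk with FS="."; hypothetical helper name.
    fields = name.split(".")
    # More than one field means there was a dot: use the last field ($NF).
    return fields[-1] if len(fields) > 1 else "(none)"

# Made-up sample names, standing in for the shell glob *.
names = ["a.txt", "b.txt", "archive.tar.gz", "README", ".bashrc"]
counts = Counter(ext_key(n) for n in names)

# Print count and key separated by a tab, like OFS="\t" in the awk script.
for key, n in counts.items():
    print(n, key, sep="\t")
```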
With basic /bin/sh or even bash the task can be a little difficult, but, as the other answers show, tools that work on aggregate data handle it easily. One such tool is the sqlite database.
A very simple way to use sqlite is to create a .csv file with two fields: file name and extension. sqlite can then use the aggregate function COUNT() with GROUP BY ext to count the files for each value of the extension field:
$ { printf "file,ext\n"; find . -type f -exec sh -c 'f=${1##*/}; printf "%s,%s\n" "$1" "${f##*.}"' sh {} \; ; } > files.csv
$ sqlite3 <<EOF
> .mode csv
> .import ./files.csv files_tb
> SELECT ext,COUNT(file) FROM files_tb GROUP BY ext;
> EOF
csv,1
mp3,6
txt,1
wav,27
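The same GROUP BY idea can also be sketched with Python's built-in sqlite3 module, which skips the intermediate .csv file, so file names containing commas cannot break the import. This is a sketch under assumptions rather than the method above: the function name count_by_extension is hypothetical, the table name files_tb is reused from the shell session, and files without a dot are stored with an empty extension.

```python
import os
import sqlite3

def count_by_extension(path):
    """Hypothetical helper: load (file, ext) rows into SQLite, then GROUP BY ext."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE files_tb (file TEXT, ext TEXT)")
    rows = []
    for dirpath, _dirnames, filenames in os.walk(path):
        for name in filenames:
            # Text after the last dot of the base name; "" when there is no dot.
            ext = name.rsplit(".", 1)[1] if "." in name else ""
            rows.append((os.path.join(dirpath, name), ext))
    # Parameterized inserts avoid any quoting problems with unusual names.
    conn.executemany("INSERT INTO files_tb VALUES (?, ?)", rows)
    result = conn.execute(
        "SELECT ext, COUNT(file) FROM files_tb GROUP BY ext ORDER BY ext"
    ).fetchall()
    conn.close()
    return result

if __name__ == "__main__":
    for ext, n in count_by_extension("."):
        print(f"{ext},{n}")
```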