Find the number of files for each extension in a directory

find "$path" -type f | sed -e '/.*\/[^\/]*\.[^\/]*$/!s/.*/(none)/' -e 's/.*\.//' | LC_COLLATE=C sort | uniq -c

Explanation:

  • find "$path" -type f lists every regular file under "$path", recursing into subdirectories.
  • sed -e '/.*\/[^\/]*\.[^\/]*$/!s/.*/(none)/' -e 's/.*\.//' applies two expressions to each path:
    • /.*\/[^\/]*\.[^\/]*$/!s/.*/(none)/ replaces every path whose last component contains no dot (a file without an extension) with (none).
    • s/.*\.// strips everything up to and including the last dot, leaving only the extension of the remaining files.
  • LC_COLLATE=C sort sorts the result in byte order, which keeps symbols such as (none) at the top.
  • uniq -c collapses each run of identical lines into one and prefixes it with its count.
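
To see what the two sed expressions do to individual paths, here is a minimal Python sketch of the same two steps. The sample paths and the helper name ext_label are made up for illustration; the regular expressions are meant as straight translations of the sed ones, not a replacement for the pipeline.

```python
import re
from collections import Counter

# Translation of the address in the first sed expression:
# a path whose last component contains a dot.
HAS_EXT = re.compile(r'.*/[^/]*\.[^/]*$')

def ext_label(path):
    """Hypothetical helper: map one path to its extension or "(none)"."""
    # Step 1: /.../!s/.*/(none)/  -> no dot in the last component.
    if not HAS_EXT.search(path):
        return '(none)'
    # Step 2: s/.*\.//  -> strip everything up to and including the last dot.
    return re.sub(r'.*\.', '', path, count=1)

# Invented sample paths, in the form find would print them.
paths = ['./a/b.txt', './a.d/b', './.bashrc', './x.tar.gz', './Makefile', './c.txt']
counts = Counter(ext_label(p) for p in paths)

# sorted() orders by code point, which matches LC_COLLATE=C sort for ASCII input.
for label in sorted(counts):
    print(counts[label], label)
```

Two things worth noticing in the sample: a dot in a directory name (./a.d/b) does not count as an extension, while a hidden file such as ./.bashrc is reported under bashrc.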

Using Python:

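import os
from collections import Counter
from pprint import pprint

# Collect the extension of every entry in the current directory.
# os.listdir() is not recursive and also returns subdirectories,
# which usually end up under the empty extension ''.
exts = []
for entry in os.listdir('.'):
    _, ext = os.path.splitext(entry)  # ext keeps its leading dot, e.g. '.png'
    exts.append(ext)

pprint(Counter(exts))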

The output:

Counter({'': 7,
         '.png': 4,
         '.mp3': 3,
         '.jpg': 3,
         '.mkv': 3,
         '.py': 1,
         '.swp': 1,
         '.sh': 1})

If you have GNU awk, you could do something like

printf '%s\0' * | gawk 'BEGIN{RS="\0"; FS="."; OFS="\t"} 
  {a[(NF>1 ? $NF : "(none)")]++} 
  END{for(i in a) print a[i],i}
'

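That is, build an associative array keyed on the last "."-separated field of each name, or on a fixed placeholder such as (none) when the name contains no dot, and increment the matching entry for every name. Note that by default * matches only the non-hidden entries of the current directory, subdirectories included, so this version is not recursive.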
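
The keying rule can be sketched in Python roughly as follows. The name awk_key and the sample file names are invented for illustration; the sketch only mimics the (NF>1 ? $NF : "(none)") expression and does not read a directory.

```python
from collections import Counter

def awk_key(name):
    """Hypothetical helper mirroring (NF>1 ? $NF : "(none)") with FS="."."""
    fields = name.split('.')                  # awk: FS="."
    return fields[-1] if len(fields) > 1 else '(none)'

# Invented sample names.
names = ['song.mp3', 'notes', 'a.tar.gz', '.bashrc', 'trailing.']
counts = Counter(awk_key(n) for n in names)   # awk: a[key]++

for key, n in counts.items():
    print(n, key, sep='\t')                   # awk: OFS="\t"
```

With this rule a hidden file like .bashrc is keyed on bashrc, and a name ending in a dot is keyed on the empty string. That differs from os.path.splitext, which treats a leading dot as part of the name and returns an empty extension for .bashrc.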

mawk doesn't seem to allow a null-byte record separator. If you are confident that none of your file names contain newlines, you could use mawk with the default newline separator instead:

printf '%s\n' * | mawk 'BEGIN{FS="."; OFS="\t"} {a[(NF>1 ? $NF : "(none)")]++} END{for(i in a) print a[i],i}'
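
To illustrate why the separator matters, here is a small Python sketch that feeds the same invented list of names through a NUL-separated and a newline-separated stream. The function names key and count are hypothetical, and the streams are built in memory rather than read from printf.

```python
from collections import Counter

def key(record):
    # Same rule as the awk one-liners: last "."-separated field, else "(none)".
    fields = record.split('.')
    return fields[-1] if len(fields) > 1 else '(none)'

def count(stream, sep):
    # printf ends every name with the separator, so the piece after the
    # final separator is empty and gets dropped.
    records = stream.split(sep)[:-1]
    return Counter(key(r) for r in records)

# Invented names; the second one contains a newline.
names = ['a.txt', 'odd\nname.txt', 'b']

print(count(''.join(n + '\0' for n in names), '\0'))
print(count(''.join(n + '\n' for n in names), '\n'))
```

The NUL-separated stream yields three records, one per file. The newline-separated stream yields four, because the name with the embedded newline is split in two and its first half is counted as an extra (none) entry.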

With plain /bin/sh or even bash the task can be a little awkward, but as the other answers show, tools that work on aggregate data handle it particularly easily. One such tool is the SQLite database.

A simple way to use SQLite is to create a .csv file with two fields, file name and extension (this assumes the file names contain no commas, double quotes or newlines). SQLite can then count the files per extension with the aggregate function COUNT() and a GROUP BY ext clause:

$ { printf "file,ext\n"; find . -type f -exec sh -c 'f=${1##*/}; case $f in *.*) e=${f##*.};; *) e="(none)";; esac; printf "%s,%s\n" "$f" "$e"' sh {} \; ; } > files.csv
$ sqlite3 <<EOF
> .mode csv
> .import ./files.csv files_tb
> SELECT ext,COUNT(file) FROM files_tb GROUP BY ext;
> EOF
csv,1
mp3,6
txt,1
wav,27
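
If the intermediate CSV file is a concern, the same GROUP BY idea can be sketched with Python's built-in sqlite3 module and an in-memory database. This is only a sketch under assumptions: count_by_ext is a hypothetical helper, the table and column names are reused from the transcript above, and (none) is an arbitrary placeholder for names without a dot.

```python
import os
import sqlite3

def count_by_ext(root):
    """Hypothetical helper: count files per extension with GROUP BY."""
    con = sqlite3.connect(':memory:')
    con.execute('CREATE TABLE files_tb (file TEXT, ext TEXT)')
    # os.walk lists every non-directory entry, which is slightly
    # broader than find -type f (symlinks to files are included).
    for _dirpath, _dirs, filenames in os.walk(root):
        for name in filenames:
            # Extension = text after the last dot of the base name.
            ext = name.rsplit('.', 1)[1] if '.' in name else '(none)'
            # Bound parameters: commas or quotes in names need no escaping.
            con.execute('INSERT INTO files_tb VALUES (?, ?)', (name, ext))
    rows = con.execute(
        'SELECT ext, COUNT(file) FROM files_tb GROUP BY ext ORDER BY ext'
    ).fetchall()
    con.close()
    return rows

if __name__ == '__main__':
    for ext, n in count_by_ext('.'):
        print(f'{ext},{n}')
```

Because the rows are inserted through bound parameters instead of a CSV import, file names containing commas or quotes do not need special handling.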