How can I find media files not encoded with a specific codec?

Solution 1:

A combination of find, ffprobe, jq utilities could be used to print info about all *.avi, *.mp4 files (in the current directory tree recursively) that are NOT encoded using "h264" video codec:

$ find -name \*.avi -o -name \*.mp4 \
    -exec ffprobe -v quiet -show_streams -show_format -of json {} \; |
  jq -c '.format.filename as $path |
         .streams[]? |
         select(.codec_type=="video" and .codec_name!="h264") |
         {codec: .codec_name, path: $path}'

To install the utilities, run:

$ sudo apt-get install ffmpeg jq findutils

The example works on jq 1.4. If apt-get version provides an older version; you could download a newer jq binary (just put the downloaded file somewhere in $PATH, rename it to jq and make it executable: chmod +x jq).

Solution 2:

You can get a sorted list of your codec specific media files under the working directory by

$ mediainfo * | grep -v codec_id | grep .file_extension | cut -f2 -d: > list.txt

where codec_id is the codec in question (ex. h264) and file_extension is the extension of the container in question (ex. .mkv)

but if the file names have blank spaces between names then the command will not work as desired.

Solution 3:

You can use this small python script together with find to print all files with specific codec:

filterByCodec.py

import os
import sys
import json

inputPath = sys.argv[1]
codec = sys.argv[2]
type = sys.argv[3]

cmd = 'ffprobe -v quiet -show_streams -print_format json ' + inputPath
output = os.popen(cmd).read()
output = json.loads(output)

if not 'streams' in output:
    sys.exit(0)
for stream in output['streams']:
    if stream['codec_name'] == codec and stream['codec_type'] == type:
        print inputPath
        sys.exit(0)

This will call ffprobe, store its output in json string, iterate over all streams and print the input path in case codec name and type match. You will need ffprobe for this. You can get it as a static build from here if you don't have it installed on your system.

Then you can call it on every file using find like this:

find . -type f -exec python filterByCodec.py {} hevc video \;

this will print all videos containing HEVC video codec. More examples:

find . -type f -exec python filterByCodec.py {} h264 video \;
find . -type f -exec python filterByCodec.py {} mp3 audio \; 

You can extend the script and move those files into some directory or whatever. This could look something like this:

cmd = 'mv ' + inputPath + ' onlyhevcDir'
os.system(cmd)

I know that this is not the best way to do it, but using python it's pretty simple to do.