ffmpeg: How to determine output extension automatically (-c:a copy)
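A container is not the same thing as an encoding. M4V, MP4, AVI, and WAV are containers, while AAC, MP3, and Vorbis are encodings (codecs), and most containers can hold several different encodings. There are some general patterns, though, and ffprobe can tell you which audio codec a given file actually contains.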
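As a rough illustration of how ffprobe can help, the sketch below asks it for the codec name of the first audio stream only. The function name `audio_codec` is made up for this example; the options are regular ffprobe options, and ffprobe is assumed to be on the PATH.

```shell
#!/bin/sh
# audio_codec FILE: print the codec_name of the first audio stream.
# "audio_codec" is a hypothetical helper name used only in this sketch.
audio_codec() {
    # -select_streams a:0   look at the first audio stream only
    # -show_entries ...     report just the codec_name field
    # -of default=...       print the bare value, without section wrappers or key
    ffprobe -v error -select_streams a:0 \
        -show_entries stream=codec_name \
        -of default=noprint_wrappers=1:nokey=1 "$1"
}
```

Calling `audio_codec myfile.m4v` should then print a single word such as aac for a file whose first audio stream is AAC.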
Extracting audio from video files using ffmpeg is covered very thoroughly here: https://gist.github.com/protrolium/e0dbd4bb0f1a396fcb55
That page includes an example of how you could do what you are asking, at least in some cases, using ffprobe and sed:
for file in *mp4 *avi; do ffmpeg -i "$file" -vn -acodec copy "$file".`ffprobe "$file" 2>&1 |sed -rn 's/.*Audio: (...), .*/\1/p'`; done
On the linked page, the command above appeared to be corrupted by HTML encoding, so I've attempted to fix it. For a single file, it could likely be simplified to:
ffmpeg -i "myfile.m4v" -vn -acodec copy "myfile".`ffprobe "myfile.m4v" 2>&1 |sed -rn 's/.*Audio: (...), .*/\1/p'`
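But if you aren't using sed and a bash shell, this won't work (for example, in a plain Windows command prompt). It also won't work if the encoding in the video file doesn't map commonly to a file extension, and because (...) captures exactly three characters, codec names of any other length, such as opus or vorbis, are not matched at all. On Windows, you could probably come up with a PowerShell or VBScript script that does the same thing.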
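To see what the sed expression does and does not extract, here is a small sketch that runs it against two sample lines, with `.*` on both sides of the match. The sample lines are hand-written to resemble ffprobe's stream summary, not captured output, and the function names and the looser second pattern are my own assumptions rather than anything from the linked gist. `-E` is used as the portable spelling of `-r`.

```shell
#!/bin/sh
# Hand-written sample lines resembling ffprobe's stream summary (not real output).
line_mp3='    Stream #0:1: Audio: mp3, 44100 Hz, stereo, fltp, 128 kb/s'
line_aac='    Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp'

# The three-character pattern: exactly three characters followed by ", ".
three_char_codec() {
    printf '%s\n' "$1" | sed -En 's/.*Audio: (...), .*/\1/p'
}

# A looser variant (an assumption of this sketch): take the first run of
# lowercase letters, digits and underscores after "Audio: ".
first_word_codec() {
    printf '%s\n' "$1" | sed -En 's/.*Audio: ([a-z0-9_]+).*/\1/p'
}

three_char_codec "$line_mp3"   # prints: mp3
three_char_codec "$line_aac"   # prints nothing: "aac" is not followed by ", "
first_word_codec "$line_aac"   # prints: aac
```

Either way, this parses text meant for humans, so it stays fragile; the codec name still has to be turned into a sensible extension afterwards.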
Encountering the same needs, I have crafted the following PHP script:
<?php
isset($argv[1]) || exit('You have to specify a file.');
$file = new SplFileInfo($argv[1]);
$file->isFile() || exit('File not found.');

$input = '"' . $file->getPathname() . '"';
// full path to the containing folder
$full_dir = $file->getPathInfo()->getRealPath();
// filename only: without path, without extension
$base_name = $file->getBasename('.' . $file->getExtension());
// deduce file extension from the audio stream
$output_extension = get_output_extension($file->getPathname());
// combine all that stuff
$output = '"' . $full_dir . '/' . $base_name . '.' . $output_extension . '"';

exec('ffmpeg -i ' . $input . ' -vn -acodec copy ' . $output);

function get_output_extension($file)
{
    // wrap the path in double quotes for the shell
    $file = '"' . trim($file, '"') . '"';
    // ask ffprobe for the audio streams only, as JSON
    $stream_info = shell_exec('ffprobe -v quiet -print_format json -show_streams -select_streams a ' . $file);
    $data = json_decode($stream_info);
    if (!isset($data->streams[0]->codec_name)) {
        exit('Audio not found - ' . $file);
    }
    $audio_format = $data->streams[0]->codec_name;
    // map the codec name to a conventional file extension
    $output_extensions = [
        'aac' => 'm4a',
        'mp3' => 'mp3',
        'opus' => 'opus',
        'vorbis' => 'ogg',
    ];
    if (!isset($output_extensions[$audio_format])) {
        exit('Audio not supported - ' . $file);
    }
    return $output_extensions[$audio_format];
}
This script is designed so that it can handle files that are not in the current directory, whether they are referenced by full or relative paths.
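For comparison, the same path handling can be sketched in POSIX shell. `output_path` is a hypothetical helper name, not part of the PHP script, and unlike the PHP version it does not resolve the directory to an absolute path; it only swaps the extension while keeping whatever directory part was given.

```shell
#!/bin/sh
# output_path INPUT EXT: print INPUT with its last extension replaced by EXT.
# Hypothetical helper for this sketch. The body runs in a subshell "( ... )"
# so that dir and base do not leak into the caller.
output_path() (
    dir=$(dirname -- "$1")
    base=$(basename -- "$1")
    # Strip only the last extension of the file name itself; names without
    # a dot (or with only a leading dot) are left untouched.
    case $base in
        ?*.*) base=${base%.*} ;;
    esac
    printf '%s/%s.%s\n' "$dir" "$base" "$2"
)

output_path "videos/clip.mp4" m4a        # prints: videos/clip.m4a
output_path "/srv/media.v2/movie" ogg    # prints: /srv/media.v2/movie.ogg
```

Splitting the directory from the file name first avoids cutting at a dot that belongs to a directory name, as in the second call.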
I'm not really happy with it, as the code is too long for such a simple task. If someone can make it more concise, you are very welcome :)
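One possible shorter take, as a sketch in POSIX shell rather than PHP: it uses the same codec-to-extension table as the script above, but the function names `ext_for_codec` and `extract_audio` are invented here, and it assumes ffprobe and ffmpeg are on the PATH and that the input file name has an extension.

```shell
#!/bin/sh
# Map an ffprobe codec_name to a file extension (same table as the PHP script).
ext_for_codec() {
    case $1 in
        aac)    echo m4a ;;
        mp3)    echo mp3 ;;
        opus)   echo opus ;;
        vorbis) echo ogg ;;
        *)      return 1 ;;   # unknown codec: no output, non-zero status
    esac
}

# extract_audio FILE: copy the first audio stream of FILE, without re-encoding,
# into a file next to it whose extension is derived from the codec.
# Note: codec and ext are plain (global) shell variables in this sketch.
extract_audio() {
    codec=$(ffprobe -v error -select_streams a:0 \
        -show_entries stream=codec_name \
        -of default=noprint_wrappers=1:nokey=1 "$1")
    [ -n "$codec" ] || { echo "Audio not found - $1" >&2; return 1; }
    ext=$(ext_for_codec "$codec") || { echo "Audio not supported - $1" >&2; return 1; }
    # Assumes the file name itself contains an extension to strip.
    ffmpeg -i "$1" -vn -acodec copy "${1%.*}.$ext"
}
```

It could then be called as `extract_audio path/to/video.mp4`, or in a loop over several files.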
Actually, the most complex code is not about ffmpeg, but about SplFileInfo (which has a horrible API, as the above script may demonstrate).
For a related script I had given the plain old pathinfo() a try, but it is locale-aware and it unexpectedly missed some files, so for me it's a no-no.