My ffmpeg output always adds an extra 30 s of silence at the end
Use
ffmpeg -y -loop 1 -framerate 2 -i "some.png" -i "with.mp3" -c:v libx264 -tune stillimage -c:a aac -b:a 192k -pix_fmt yuv420p -shortest -fflags +shortest -max_interleave_delta 100M "result.mkv"
Containers (AVI, MP4, MKV) usually store multiple streams in an interleaved fashion, i.e. a few seconds of video, then a few seconds of audio, and so on. Because of this, ffmpeg buffers data from all streams when writing.
-shortest acts at a relatively high level and is triggered when the first of the streams has finished. However, buffered data from other streams will still be written to the file.
-fflags +shortest acts at a lower level and stops the buffered data from being written when used with a sufficiently high -max_interleave_delta.
When the frame rate is too low, as one is tempted to set it when merging audio with a "poster" image, ffmpeg runs into problems.
The duration of the output comes out wrong. I still got a wrong duration with "-shortest -fflags +shortest -max_interleave_delta 100M" (though it made it better), so I had to cut the output with the "-t" option.
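For the "-t" cut, here is a minimal sketch, assuming you want the output to be exactly as long as the audio. The function name poster_video is made up for illustration, and the duration ffprobe reports for an MP3 may be an estimate, so treat the result as approximate.

```shell
# Hypothetical helper: still image + audio, output capped at the audio's length.
# Usage: poster_video IMAGE AUDIO OUTPUT
poster_video() {
    image=$1
    audio=$2
    output=$3
    # Ask ffprobe for the duration of the audio file, in seconds.
    duration=$(ffprobe -v error -show_entries format=duration \
        -of default=noprint_wrappers=1:nokey=1 "$audio") || return 1
    # -t placed before the output file name limits the output duration.
    ffmpeg -y -loop 1 -framerate 2 -i "$image" -i "$audio" \
        -c:v libx264 -tune stillimage -c:a aac -b:a 192k -pix_fmt yuv420p \
        -t "$duration" "$output"
}
```

Called as: poster_video "some.png" "with.mp3" "result.mkv"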
And then, if you take the output and copy it with "ffmpeg -i output.mp4 output-copy.mp4", it fails with the error described in this ticket:
"https://trac.ffmpeg.org/ticket/6375?cversion=0"
(
Too many packets buffered for output stream 0:1.
[aac @ 0x7ffda6818c00] Qavg: 65179.457
[aac @ 0x7ffda6818c00] 2 frames left in the queue on closing
)
This is solved by "-max_muxing_queue_size 9999" (placed before the output, after the input).
And again, if you set the fps (or "-r") higher, all of these problems disappear.
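To make the placement of the queue-size option concrete, here is a small sketch. The wrapper name recode_with_bigger_queue is hypothetical; 9999 is just the value that worked for me, not a recommended setting.

```shell
# Hypothetical wrapper around the copy step.
# Usage: recode_with_bigger_queue INPUT OUTPUT
recode_with_bigger_queue() {
    # Output options go after "-i INPUT" and before the output file name.
    ffmpeg -i "$1" -max_muxing_queue_size 9999 "$2"
}
```

Called as: recode_with_bigger_queue output.mp4 output-copy.mp4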
Looking at the ffmpeg doc for "max_muxing_queue_size", I got some insight:
-max_muxing_queue_size packets (output,per-stream)
When transcoding audio and/or video streams, ffmpeg will not begin writing into the output until it has one packet for each such stream. While waiting for that to happen, packets for other streams are buffered. This option sets the size of this buffer, in packets, for the matching output stream.
I think that at a low frame rate ffmpeg has to take one video frame together with a lot of audio frames, so it needs to buffer a lot of audio frames at once, and it isn't used to that. It's used to buffering a much smaller batch of audio frames (1/30th of them, for 30 fps), joining them with a video frame, and moving on. Maybe.
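If that guess is right, some rough arithmetic shows the size of the difference. This is only a sketch under assumptions that are not from the docs quoted above: 44100 Hz audio and 1024 samples per AAC frame. The function name is made up.

```shell
# Rough count of audio frames that pile up per video frame.
# Assumptions: 44100 Hz sample rate, 1024 samples per AAC frame.
# Usage: audio_frames_per_video_frame FPS
audio_frames_per_video_frame() {
    awk -v fps="$1" -v rate=44100 -v samples=1024 \
        'BEGIN { printf "%.1f\n", (rate / samples) / fps }'
}
```

With these numbers, audio_frames_per_video_frame 2 prints 21.5 and audio_frames_per_video_frame 30 prints 1.4, so at 2 fps roughly fifteen times as many audio frames would be waiting on each video frame.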
I think ffmpeg should handle this more smoothly, maybe... but idk, maybe you just have to read all of the ffmpeg docs once and for all.
Maybe it could show a prompt, something like "detected that the muxer buffer needs to be increased, want to raise it? To preset this, see the docs for 'max_muxer (etc)'".
I don't know why the incorrect output duration happens, though; probably something similar.