FFMPEG / libx264: How to specify a variable frame rate but with a maximum?
Solution 1:
Frustrated that you hadn't found an answer either, I was going to at least answer other people's questions about how to enable VFR (not VBR) output from FFMPEG.
The answer to that is the oddly named -vsync option. You can set it to a few different values, but the one you want is '2' or vfr. From the man page:
-vsync parameter
Video sync method. For compatibility reasons old values can be specified as numbers. Newly added values will have to be specified as strings always.
0, passthrough - Each frame is passed with its timestamp from the demuxer to the muxer.
1, cfr - Frames will be duplicated and dropped to achieve exactly the requested constant frame rate.
2, vfr - Frames are passed through with their timestamp or dropped so as to prevent 2 frames from having the same timestamp.
drop - As passthrough but destroys all timestamps, making the muxer generate fresh timestamps based on frame-rate.
-1, auto - Chooses between 1 and 2 depending on muxer capabilities. This is the default method.
Note that the timestamps may be further modified by the muxer, after this. For example, in the case that the format option avoid_negative_ts is enabled.
With -map you can select from which stream the timestamps should be taken. You can leave either video or audio unchanged and sync the remaining stream(s) to the unchanged one.
However, I don't quite have enough reputation to post a comment just to answer that 'sub-question' that everyone seems to be having. I did have a few ideas that I honestly wasn't very optimistic about, but the first one I tried actually worked.
You simply need to combine the -vsync 2 option with the -r $maxfps option, of course where you replace $maxfps with the maximum framerate you want! And it WORKS! It doesn't duplicate frames from a source file, but it will drop frames that cause the file to go over the maximum framerate!
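A minimal sketch of that combination, written as a dry run so nothing is encoded by accident. The file names, the cap of 30 and the vfr_capped_cmd helper are placeholders for illustration, not values from this answer; -c:v libx264 is added only because the question is about libx264.

```shell
#!/bin/sh
# Print (rather than run) an ffmpeg command that keeps variable frame
# timing but caps the rate at MAXFPS.
# vfr_capped_cmd is a hypothetical helper; file names are placeholders
# and must not contain spaces, since the output is not shell-quoted.
vfr_capped_cmd() {
  in=$1
  out=$2
  maxfps=$3
  # -vsync 2 selects the vfr sync method; -r supplies the ceiling.
  printf 'ffmpeg -i %s -c:v libx264 -vsync 2 -r %s %s\n' "$in" "$maxfps" "$out"
}

vfr_capped_cmd input.mp4 output.mkv 30
# prints: ffmpeg -i input.mp4 -c:v libx264 -vsync 2 -r 30 output.mkv
```

Once the printed command looks right, paste it into a terminal to run it. On recent FFmpeg releases -vsync is, as far as I know, deprecated in favour of -fps_mode, so check the man page of your build before relying on the exact spelling.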
By default it seems that -r $maxfps by itself just causes it to duplicate/drop frames to achieve a constant framerate, and -vsync 2 by itself causes it to pull the frames in directly without really affecting the PTS values.
I wasn't optimistic about this because I already knew that -r $maxfps puts it at a constant framerate. I honestly expected an error or for it to just obey whichever came first or last or whatever. The fact that it does exactly what I wanted makes me quite pleased with the FFMPEG developers.
I hope this helps you, or someone else later on if you no longer need to know this.
Solution 2:
I would like to specify a variable frame rate with a MAXIMUM value, and allow libx264 to down the frame rate as it sees fit. The idea here is to get extra compression when there is something like an extended still frame
In my understanding, this may be possible, in a comparatively clumsy way, but it is undesirable for some complex and counterintuitive reasons.
Even though an x264 stream carries frame rate information, frame rate is more a container-level concern than a codec one.
In a passthrough VFR encode, there is what is essentially a text file (a timecode file) detailing which frame rate applies over which frames or times. When encoding a source, an x264 option like --tcfile-in or --tcfile-out passes those timestamps through to the encode, mapping where the rate changes so that the video stays subjectively consistent with the source.
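As a rough illustration of what such a timecode file contains, the sketch below writes one in what I understand to be the "timecode format v2" layout: a header line, then one presentation timestamp in milliseconds per frame. The make_timecodes helper and the segment lengths are made up for the example.

```shell
#!/bin/sh
# Emit a v2-style timecode listing for a clip whose frame spacing changes.
# make_timecodes is a hypothetical helper. Each argument is COUNT:STEP_MS,
# meaning COUNT frames spaced STEP_MS milliseconds apart.
make_timecodes() {
  echo '# timecode format v2'
  t=0
  for seg in "$@"; do
    count=${seg%%:*}   # text before the colon
    step=${seg##*:}    # text after the colon
    i=0
    while [ "$i" -lt "$count" ]; do
      echo "$t"
      t=$((t + step))
      i=$((i + 1))
    done
  done
}

# 5 frames at 25 fps (40 ms apart), then 3 frames at 5 fps (200 ms apart)
make_timecodes 5:40 3:200
# A file made this way could then be given to x264, for example:
#   x264 --tcfile-in timecodes.txt -o out.mkv input.y4m
```

The call above prints the header followed by 0, 40, 80, 120, 160, 200, 400 and 600, one value per line. Redirect the output to a file to use it as a timecode file.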
The low-framerate idea is a logical one, but it doesn't work out, for several reasons. Though x264 is VFR-aware and has some related capabilities, I don't think there is an analysis function that will vary the framerate according to motion in order to lower the file size (in a way analogous to the many bitrate controls).
The source is also a problem: VFR sources will by default retain their frame variability, but apparently encoding a CFR file at variable bitrate (a good idea sometimes, esp. when telecine is needed) will simply produce the same CFR.
This means you would probably have to rewrite the frame rate by hand (i.e. mux timestamps for the slow scenes into the file), or resort to a frame decimation algorithm like dup, dedup, and exactDedup for avisynth. If your video does have extremely low motion, some frames (even half?) would be thrown out. The problem is that these algorithms are not advanced, and don't make good choices with "real life" footage as to what will contribute to the best encode.
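For anyone staying inside FFmpeg rather than Avisynth, the mpdecimate filter is, as far as I know, the nearest built-in counterpart. Note that this is a substitution for the Avisynth filters named above, and the same quality caveats apply. The sketch only prints the command; decimate_cmd and the file names are placeholders.

```shell
#!/bin/sh
# Print (rather than run) an ffmpeg command that drops near-duplicate
# frames with the mpdecimate filter and keeps the resulting variable timing.
# decimate_cmd is a hypothetical helper; file names must not contain spaces.
decimate_cmd() {
  in=$1
  out=$2
  # -vsync vfr stops ffmpeg from re-duplicating the frames the filter dropped.
  printf 'ffmpeg -i %s -vf mpdecimate -c:v libx264 -vsync vfr %s\n' "$in" "$out"
}

decimate_cmd input.mp4 decimated.mkv
# prints: ffmpeg -i input.mp4 -vf mpdecimate -c:v libx264 -vsync vfr decimated.mkv
```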
Also, removing frames, including ones that would otherwise have been coded as I or B frames, reduces the amount of detail available over time, which makes motion look "steppy" and can interfere with the other basic video parameters and cause artifacts like aliasing.
And because of the way the quantizers work, x264 will actually decrease the bitrate disproportionately further in these scenes of low motion. Unless you have a slideshow of identical images, there will be motion (if only grain and other artifacts) and there will be a loss in quality that would not be seen without drastic changes to the bitrate.
And finally, the reason there aren't many options to do what you want is that x264 is really good at managing bitrate just using temporal compression (recording changes in partial frames). Going to 1/2 framerate will not cut the file size in half; 10% is probably a realistic gain to expect from low motion or animation.
So in short, dropping the frame rate of your static scenes will do very little for your file size, but will add a host of quality and sync issues, not to mention incompatibility with video editing software.
If you do want to try a decimator, you might be able to limit the maximum new frame rate by using the level options, each of which specifies a maximum resolution and framerate. Unfortunately, with profiles and levels you would probably have to work at very low resolutions to get the kind of frame rates you want. It comes back to editing the rates by hand, either entirely or to correct frame rates you think are too high. Either way, it will take juggling to keep the sound in sync with the new framerates if alterations are made after the encoding process, when the tcfile is conserved.
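To make the level limit concrete: as I understand the H.264 level tables, each level caps the number of macroblocks processed per second (MaxMBPS), so the frame rate ceiling at a given resolution is roughly MaxMBPS divided by the macroblocks per frame. The figures used below (108000 for Level 3.1, 245760 for Level 4) are quoted from memory of the spec's level table and should be checked against it. max_fps_for_level is a hypothetical helper, and it ignores the separate per-level frame size limit.

```shell
#!/bin/sh
# Estimate the frame rate ceiling an H.264 level implies at a resolution.
# Usage: max_fps_for_level WIDTH HEIGHT MAX_MBPS
# MAX_MBPS is the level's macroblocks-per-second limit (assumed values below).
max_fps_for_level() {
  width=$1
  height=$2
  max_mbps=$3
  # A macroblock is 16x16 pixels; round partial blocks up.
  mbs_w=$(( (width + 15) / 16 ))
  mbs_h=$(( (height + 15) / 16 ))
  # Integer division, so the result is rounded down.
  echo $(( max_mbps / (mbs_w * mbs_h) ))
}

max_fps_for_level 1280 720 108000    # Level 3.1 at 720p, prints 30
max_fps_for_level 1920 1080 245760   # Level 4 at 1080p, prints 30
```

This also shows why the level route is clumsy for capping frame rate: the ceiling is tied to resolution, so lowering it means picking a lower level or shrinking the picture.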
The takeaway is that spending time optimizing the many bitrate settings will yield much more in the way of file size management, and improve the quality of your video, rather than cause complications for little gain. Preserving the original FPS is probably the best idea unless you're aiming for broadcast or media standards. Players are well capable of playing variable frame rates (unlike editors), and the more frames in your video, the smoother the playback and perhaps the smaller the file size, due to smaller changes in motion between frames.
Here's a collection of links to standards info and forum discussions that should help with this confusing aspect of encoding:
-avisynth decimation tools
-fps and -r switches
-x264 General (tcfile, fps)
-timecode file standards
-Levels and profiles
-Short, clear CFR/VFR setting summary ("framerate" section)
doom9, videohelp, &c theoretical discussions