Merging video and audio with different durations
Following https://superuser.com/a/762752/7796, I'm using the command below to merge a video file and an audio file into one file with some custom audio panning rules:
ffmpeg -i vid.mp4 -i rec.mp3 -filter_complex "[0:a][1:a]amerge[a]" \
-strict experimental -c:a aac -map 0:v -map "[a]" -ar 48000 -ab 128k \
-c:v copy -f mp4 out.mp4 -y
ffmpeg version N-61286-gdbc3e11 Copyright (c) 2000-2014 the FFmpeg developers
built on Mar 11 2014 22:01:37 with gcc 4.8.2 (GCC)
configuration: --enable-gpl --enable-version3 --disable-w32threads --enable-avisynth --enable-bzlib --enable-fontconfig --enable-frei0r --enable-gnutls --enable-iconv --enable-libass --enable-libbluray --enable-libcaca --enable-libfreetype --enable-libgsm --enable-libilbc --enable-libmodplug --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libopus --enable-librtmp --enable-libschroedinger --enable-libsoxr --enable-libspeex --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvo-aacenc --enable-libvo-amrwbenc --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libx264 --enable-libx265 --enable-libxavs --enable-libxvid --enable-zlib
libavutil 52. 66.101 / 52. 66.101
libavcodec 55. 52.102 / 55. 52.102
libavformat 55. 34.100 / 55. 34.100
libavdevice 55. 11.100 / 55. 11.100
libavfilter 4. 3.100 / 4. 3.100
libswscale 2. 5.101 / 2. 5.101
libswresample 0. 18.100 / 0. 18.100
libpostproc 52. 3.100 / 52. 3.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'vid.mp4':
Metadata:
major_brand : isom
minor_version : 512
compatible_brands: isomiso2avc1mp41
encoder : Lavf55.34.100
Duration: 00:00:20.07, start: 0.033333, bitrate: 214 kb/s
Stream #0:0(eng): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 480x270 [SAR 1:1 DAR 16:9], 80 kb/s, 25 fps, 25 tbr, 12800 tbn, 50 tbc (default)
Metadata:
handler_name : VideoHandler
Stream #0:1(eng): Audio: aac (mp4a / 0x6134706D), 48000 Hz, mono, fltp, 128 kb/s (default)
Metadata:
handler_name : SoundHandler
[mp3 @ 02c6c2a0] Estimating duration from bitrate, this may be inaccurate
Input #1, mp3, from 'rec.mp3':
Duration: 00:00:50.31, start: 0.000000, bitrate: 127 kb/s
Stream #1:0: Audio: mp3, 44100 Hz, mono, s16p, 128 kb/s
[Parsed_amerge_0 @ 02d514e0] No channel layout for input 1
[Parsed_amerge_0 @ 02d514e0] Input channel layouts overlap: output layout will be determined by the number of distinct input channels
Output #0, mp4, to 'out.mp4':
Metadata:
major_brand : isom
minor_version : 512
compatible_brands: isomiso2avc1mp41
encoder : Lavf55.34.100
Stream #0:0(eng): Video: h264 ([33][0][0][0] / 0x0021), yuv420p, 480x270 [SAR 1:1 DAR 16:9], q=2-31, 80 kb/s, 25 fps, 12800 tbn, 12800 tbc (default)
Metadata:
handler_name : VideoHandler
Stream #0:1: Audio: aac ([64][0][0][0] / 0x0040), 48000 Hz, stereo, fltp, 128 kb/s (default)
Stream mapping:
Stream #0:1 (aac) -> amerge:in0
Stream #1:0 (mp3) -> amerge:in1
Stream #0:0 -> #0:0 (copy)
amerge -> Stream #0:1 (aac)
Press [q] to stop, [?] for help
[mp3 @ 02c6f960] overread, skip -6 enddists: -4 -4
frame= 501 fps=227 q=-1.0 Lsize= 527kB time=00:00:20.07 bitrate= 215.0kbits/s
video:196kB audio:315kB subtitle:0 data:0 global headers:0kB muxing overhead 3.170177%
The problem is that if the video is 30 seconds long and the audio is 45, the merged output is only 30 seconds long, so the last 15 seconds of audio are missing. Showing just a black screen or the final frame of the video for the remaining seconds (until the audio finishes) would be fine.
How can I do this?
Solution 1:
After some testing, it seems that the problem is caused by the amerge filter. According to the documentation:
6.7 amerge
[...]
If inputs do not have the same duration, the output will stop with the shortest.
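Try running the video's audio through apad, which appends silence to the end of a stream, before it reaches amerge, so that the video's audio is no longer the shortest input: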
ffmpeg -i vid.mp4 -i rec.mp3 -filter_complex \
"[0:a]apad [b] ; [b][1:a]amerge[a]" \
-strict experimental -c:a aac -map 0:v -map "[a]" -ar 48000 -ab 128k \
-c:v copy -f mp4 out.mp4 -y
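To confirm that the padded command kept all of the recording, one option is to compare stream durations with ffprobe. The following is only a sketch: the function names stream_duration and covers are hypothetical helpers, the half-second tolerance is an arbitrary assumption, and the file names are the ones used in the question.

```shell
# Print the duration in seconds of one stream of a file.
# $1 = file name, $2 = stream specifier such as v:0 or a:0.
stream_duration() {
    ffprobe -v error -select_streams "$2" \
        -show_entries stream=duration \
        -of default=noprint_wrappers=1:nokey=1 "$1"
}

# Hypothetical helper: succeed if the measured duration ($2) reaches the
# wanted duration ($1), allowing half a second of slack (an assumption).
covers() {
    awk -v want="$1" -v got="$2" 'BEGIN { exit !(got + 0.5 >= want) }'
}

# Usage, assuming rec.mp3 and out.mp4 are in the current directory:
#   covers "$(stream_duration rec.mp3 a:0)" "$(stream_duration out.mp4 a:0)" \
#       && echo "audio kept in full"
```

If the check fails for the output of the original command but passes for the padded one, that would be consistent with amerge stopping at its shortest input.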