Merge many audio files with specific positions

Solution 1:

Treat the start of the first recording as t=0 and express every other recording's start relative to it. For example, if the first recording started at 16:59:23 and the third recording started at 17:14:13, then the third's relative start time is 14:50 (14 minutes 50 seconds).
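The subtraction can be scripted so the offsets come out directly in milliseconds. This is only a sketch: the helper names `to_seconds` and `offset_ms` are made up for illustration, and it assumes both timestamps are in HH:MM:SS form and fall on the same day.

```shell
# Convert an HH:MM:SS wall-clock time to seconds since midnight.
# awk treats the fields as decimal numbers, so "08" and "09" are safe.
to_seconds() {
    printf '%s\n' "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

# Offset of a recording relative to the first one, in milliseconds.
# Usage: offset_ms FIRST_START OTHER_START
# Assumption: both times are on the same day (no midnight rollover).
offset_ms() {
    echo $(( ( $(to_seconds "$2") - $(to_seconds "$1") ) * 1000 ))
}

offset_ms 16:59:23 17:14:13   # prints 890000, i.e. 14 min 50 s
```

The printed number is the value to pass to adelay for that recording.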

With that in mind, the basic command structure is:

ffmpeg -i first.mka -i second.mka -i third.mka -i fourth.mka \
       -filter_complex \
         "[1]adelay=184000|184000[b];
          [2]adelay=360000|360000[c];
          [3]adelay=962000|962000[d];
          [0][b][c][d]amix=inputs=4" \
       merged.mka

The command delays every audio file except the first so that its start matches its real-life offset from the first recording. The delayed streams are then mixed together. adelay pads the start of each delayed stream with silence, and amix, by default, keeps mixing until the longest input ends.

adelay takes its delays in milliseconds, so 3 minutes 4 seconds = 184 seconds = 184000 ms. One value has to be supplied for each channel of an audio stream, separated by |, which is why the command uses 184000|184000. If you're dealing with mono streams, the syntax is [1]adelay=184000[b].
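For more than a handful of files, the filter graph can be generated rather than typed by hand. The following is a minimal sketch, not a definitive script: `adelay_arg` and `build_filter` are hypothetical helper names, the labels `[a1]`, `[a2]`, ... are arbitrary, and it assumes every input has the same channel count and that the first input is the undelayed reference.

```shell
# Repeat one delay value for every channel of a stream.
# Usage: adelay_arg DELAY_MS CHANNELS   (e.g. 184000 2 gives 184000|184000)
adelay_arg() {
    ms=$1; channels=$2; out=$ms; i=1
    while [ "$i" -lt "$channels" ]; do
        out="$out|$ms"
        i=$((i + 1))
    done
    printf '%s\n' "$out"
}

# Build the filter graph: input 0 is left as-is, inputs 1..N are delayed.
# Usage: build_filter CHANNELS DELAY_MS...
build_filter() {
    channels=$1; shift
    graph=""; labels="[0]"; n=1
    for ms in "$@"; do
        graph="${graph}[$n]adelay=$(adelay_arg "$ms" "$channels")[a$n];"
        labels="${labels}[a$n]"
        n=$((n + 1))
    done
    # n now equals the total number of inputs, including the first.
    printf '%s\n' "${graph}${labels}amix=inputs=$n"
}

# Print the graph for three delayed stereo inputs.
build_filter 2 184000 360000 962000

# Possible use with the files from the command above:
# ffmpeg -i first.mka -i second.mka -i third.mka -i fourth.mka \
#        -filter_complex "$(build_filter 2 184000 360000 962000)" merged.mka
```

Passing 1 as the channel count produces the single-value mono form of each adelay.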