Convert multiple .mp3 files (or single .m4a) into .m4b with ffmpeg and afconvert on macOS

I'm using macOS and have a set of .mp3 files that represent the chapters of an audiobook. I want to combine them into a single .m4b file.

How can I do this from the command line without significantly increasing file size?


EDIT: Whelp, looks like m4b-tool is a great CLI tool for just this. I'm going to be switching over to use their solution instead of my custom ffmpeg/afconvert. I still use the python script to generate chapters.txt to be used with m4b-tool's merge script.

ORIGINAL: After a lot of research, I found a CLI solution and wanted to share my findings here (if you're looking for an app to handle this for you, I've had really good luck with Audiobook Binder, which can merge mp3s, m4as, and m4bs into a single m4b with very high efficiency).

The general idea of the process is to:

  • Combine the separate mp3s into a single mp3
  • Convert the combined mp3 into an m4a using afconvert (AFAIK Mac-only) with the iTunes Plus settings (stackexchange topic; apple docs), which limits how much the file grows during the mp3-to-m4a conversion. My attempts at using ffmpeg for this part resulted in huge file size bloat or long processing times.
    • Note that there is still some file size bloat, because the command I used to turn the caf into an m4a leaves out the -u pgcm 2 parameter; including it resulted in an error (Couldn't set audio converter property ('prop')).
  • Generate an FFMETADATA file (more info) using a python script (inspired by this) to facilitate the conversion of an m4a to m4b and preserve chapters.
  • Combine the m4a and FFMETADATA file into an m4b

Prereqs:

  • Order your mp3 files in a directory for just that audiobook (ex. 00 - Chapter 1.mp3, 01 - Chapter 2.mp3, etc)
  • Python 3 (brew install python if you don't have it), plus the mutagen package that the script below imports (pip3 install mutagen)
  • FFmpeg (brew install ffmpeg if you don't have it)
  • afconvert (pre-installed on macOS)
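
A quick way to confirm the prerequisites are in place before starting is to look them up from Python. This is only a sketch: the function name missing_prerequisites is made up for illustration, and it assumes the tools are expected on your PATH under the names ffmpeg and afconvert.

```python
import importlib.util
import shutil


def missing_prerequisites(tools=("ffmpeg", "afconvert"), modules=("mutagen",)):
    """Return the names of command-line tools and Python modules that cannot be found."""
    # shutil.which returns None when the executable is not on PATH
    missing = [tool for tool in tools if shutil.which(tool) is None]
    # find_spec returns None when a top-level module is not installed
    missing += [mod for mod in modules if importlib.util.find_spec(mod) is None]
    return missing
```

Calling missing_prerequisites() with no arguments returns an empty list when everything is installed; otherwise it lists what still needs a brew install or pip3 install.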

Steps:

  1. Save the Python script below to a file, then execute it. When prompted, supply the audiobook directory that your mp3s are in. If you followed the ordering example above, the "enumeration separator" would be " - " (enter it without the quotes).
import re
import glob
from mutagen.mp3 import MP3
import os
import datetime
from pprint import pprint

chapterFileName = "chapters.txt"
metadataFileName = "FFMETADATAFILE"

def main():
   global chapterFileName
   global metadataFileName

   print("This script will help generate an FFMETADATA file to facilitate\nconverting an .m4a to a .m4b file")

   # scan given directory for file type
   directory=input('Directory (default pwd): ') or os.getcwd()
   print('   using: "' + directory + '"')
   chapterFileName = directory + "/chapters.txt"
   metadataFileName = directory + "/FFMETADATAFILE"

   skip = input('Skip chapter.txt creation? (default n): ') or 'n'
   if skip == 'y':
      createMetadataFile()
      return

   fileType=input('Input audio file type (default mp3): ') or 'mp3'
   print('   using: "' + fileType + '"')
   numberSeperator=input('Enumeration separator (symbol/phrase between enumeration and title): ') or ''
   print('   using: "' + (numberSeperator or '(blank)') + '"')
   if not directory or not fileType:
      print('Input missing - exiting')
      return

   fileNames = list()
   for file in glob.glob(directory + '/*.' + fileType):
      fileNames.append(file)
   fileNames.sort()

   rawChapters = list()
   currentTimestamp = 0 # in seconds
   for file in fileNames:
      audioLength = ''
      if fileType == 'mp3':
         audioLength = MP3(file).info.length
      else:
         audioLength = with_ffprobe(file)

      
      time = str(datetime.timedelta(seconds=currentTimestamp)) + '.000'

      title = os.path.splitext(file)[0].split('/')[-1]
      if numberSeperator != '':
         title = numberSeperator.join(title.split(numberSeperator)[1:])

      rawChapters.append(time + ' ' + title)
      currentTimestamp = int(currentTimestamp + audioLength)

   with open(chapterFileName, "w") as chaptersFile:
      for chapter in rawChapters:
         chaptersFile.write(chapter + "\n")

   input('File created at "' + chapterFileName + '". Review to make sure it looks right\n ("<timestamp> <title>"), then hit Enter to continue... ')
   createMetadataFile()

def createMetadataFile():
   global chapterFileName
   global metadataFileName

   # import chapters and create ffmetadatafile
   chapters = list()
   with open(chapterFileName, 'r') as f:
      for line in f:
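         # expected line format: "<h>:<mm>:<ss>.<mmm> <title>"
         x = re.match(r"(\d*):(\d{2}):(\d{2})\.(\d{3}) (.*)", line)
         if x is None:
            continue  # skip blank or malformed lines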
         hrs = int(x.group(1))
         mins = int(x.group(2))
         secs = int(x.group(3))
         title = x.group(5)

         minutes = (hrs * 60) + mins
         seconds = secs + (minutes * 60)
         timestamp = (seconds * 1000)
         chap = {
            "title": title,
            "startTime": timestamp
         }
         chapters.append(chap)

   text = ";FFMETADATA1\n"
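   # Each chapter ends 1 ms before the next one starts.
   # NOTE: the final chapter has no following start time, so this loop
   # leaves it out of the metadata file.
   for i in range(len(chapters)-1):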
      chap = chapters[i]
      title = chap['title']
      start = chap['startTime']
      end = chapters[i+1]['startTime']-1
      text += f"[CHAPTER]\nTIMEBASE=1/1000\nSTART={start}\nEND={end}\ntitle={title}\n"

   with open(metadataFileName, "w") as myfile:
       myfile.write(text)
   
   print('Created metadata file at "' + metadataFileName + '"')
   removeChapters = input('Remove chapter.txt? (default n): ') or 'n'
   if removeChapters == 'y':
      os.remove(chapterFileName)


def with_ffprobe(filename):
    import subprocess, json

    result = subprocess.check_output(
            # a:0 selects the first audio stream (v:0 would select a video stream)
            f'ffprobe -v quiet -show_streams -select_streams a:0 -of json "{filename}"',
            shell=True).decode()
    fields = json.loads(result)['streams'][0]
    return float(fields['duration'])

main()
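
The script above writes a [CHAPTER] block for every chapter except the last one, because the last chapter has no following start time to use as its end. Here is a minimal sketch of one way to include it, assuming you know the total length of the book in milliseconds (for example, currentTimestamp * 1000 once the loop in main() has finished). The names chapters_to_ffmetadata, escape_value and total_ms are hypothetical and not part of the original script. The escaping follows the FFMETADATA rule that '=', ';', '#' and '\' in values need a backslash.

```python
def escape_value(value):
    """Backslash-escape the characters that are special in FFMETADATA values."""
    for ch in ("\\", "=", ";", "#"):  # backslash first, so added escapes are not re-escaped
        value = value.replace(ch, "\\" + ch)
    return value


def chapters_to_ffmetadata(chapters, total_ms):
    """Build FFMETADATA text from (start_ms, title) pairs sorted by start time.

    total_ms is the length of the whole audiobook and is used as the end
    of the final chapter.
    """
    lines = [";FFMETADATA1"]
    for i, (start, title) in enumerate(chapters):
        if i + 1 < len(chapters):
            end = chapters[i + 1][0] - 1  # 1 ms before the next chapter starts
        else:
            end = total_ms  # last chapter runs to the end of the file
        lines += [
            "[CHAPTER]",
            "TIMEBASE=1/1000",
            f"START={start}",
            f"END={end}",
            f"title={escape_value(title)}",
        ]
    return "\n".join(lines) + "\n"
```

For example, chapters_to_ffmetadata([(0, "Chapter 1"), (754000, "Chapter 2")], 1500000) produces two [CHAPTER] blocks, the second ending at 1500000.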
  2. Open Terminal in the directory containing your mp3s and the FFMETADATAFILE, then execute the following command:
ffmpeg -f concat -safe 0 -i <(for f in ./*.mp3; do echo "file '$PWD/$f'"; done) -c copy output.mp3 && afconvert output.mp3 intermediate.caf -d 0 -f caff --soundcheck-generate -v && afconvert intermediate.caf -d aac -f m4af --soundcheck-read -b 256000 -q 127 -s 2 output.m4a -v && ffmpeg -i output.m4a -i FFMETADATAFILE -map_metadata 1 -codec copy output.m4b && rm output.mp3 && rm output.m4a && rm FFMETADATAFILE && rm intermediate.caf

This is a chained command (you can split it at each "&&" and run the parts separately if you want). It will: (1) combine the mp3s into a single mp3 called "output.mp3"; (2) convert the combined mp3 into an intermediate CAF file; (3) convert the CAF file into an m4a; (4) combine the m4a and FFMETADATAFILE into an m4b; (5) clean up the files used and generated by this command.
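
If you would rather run the parts one at a time and see exactly which one fails, the same steps can be driven from Python. This is a sketch under a few assumptions: the shell process substitution is replaced by a concat list file that I have called files.txt (a hypothetical name), and the function names are made up for illustration. The ffmpeg and afconvert arguments are copied from the command above.

```python
import subprocess


def concat_line(path):
    """Format one entry for ffmpeg's concat demuxer, escaping apostrophes in the path."""
    return "file '" + path.replace("'", "'\\''") + "'"


def write_concat_list(paths, list_file="files.txt"):
    """Write the sorted mp3 paths to the list file that ffmpeg reads with -f concat."""
    with open(list_file, "w") as f:
        for path in sorted(paths):
            f.write(concat_line(path) + "\n")


def build_commands(list_file="files.txt", metadata="FFMETADATAFILE"):
    """Return parts (1) to (4) of the chained command as argument lists."""
    return [
        ["ffmpeg", "-f", "concat", "-safe", "0", "-i", list_file,
         "-c", "copy", "output.mp3"],
        ["afconvert", "output.mp3", "intermediate.caf", "-d", "0", "-f", "caff",
         "--soundcheck-generate", "-v"],
        ["afconvert", "intermediate.caf", "-d", "aac", "-f", "m4af",
         "--soundcheck-read", "-b", "256000", "-q", "127", "-s", "2",
         "output.m4a", "-v"],
        ["ffmpeg", "-i", "output.m4a", "-i", metadata, "-map_metadata", "1",
         "-codec", "copy", "output.m4b"],
    ]


def run_all(commands):
    """Run each command in order, stopping at the first failure (like '&&')."""
    for argv in commands:
        subprocess.run(argv, check=True)
```

Typical use would be write_concat_list(glob.glob("/path/to/book/*.mp3")) followed by run_all(build_commands()). Cleanup (part 5) is left out on purpose so you can inspect the intermediate files before deleting them.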

If the file bloat from afconvert is too much, you can instead import the combined mp3 (output.mp3) into Music/iTunes and convert it to AAC there (in that case you don't need parts 2 and 3 of the command above), but it's a lot slower and may not yield significant improvements.

  3. You now have an m4b file (output.m4b) to use! I like to open the file in the freeware Kid3 tag editor and add the following fields:
    • Title: title of audiobook
    • Author
    • Album: title of audiobook
    • Comment: audiobook description
    • Genre: "Audiobook"
    • Date: Year of recording or book published
    • Cover
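
If you prefer to script the tagging instead of using Kid3, mutagen (already needed for the chapter script) can write the same fields. This is a sketch rather than what I actually use: the function names are hypothetical, and mapping "Author" to the artist atom is my assumption about how audiobook apps read it.

```python
def audiobook_tags(title, author, description, year):
    """Map the fields listed above to MP4 atom names as used by mutagen."""
    return {
        "\xa9nam": [title],          # Title
        "\xa9ART": [author],         # Author (stored as artist; an assumption)
        "\xa9alb": [title],          # Album: title of audiobook
        "\xa9cmt": [description],    # Comment
        "\xa9gen": ["Audiobook"],    # Genre
        "\xa9day": [str(year)],      # Date
    }


def tag_m4b(path, tags, cover_jpeg=None):
    """Write the tags (and optionally a JPEG cover) into the m4b file at path."""
    from mutagen.mp4 import MP4, MP4Cover  # third-party: pip3 install mutagen

    audio = MP4(path)
    for key, value in tags.items():
        audio[key] = value
    if cover_jpeg:
        with open(cover_jpeg, "rb") as f:
            audio["covr"] = [MP4Cover(f.read(), imageformat=MP4Cover.FORMAT_JPEG)]
    audio.save()
```

Usage would look like tag_m4b("output.m4b", audiobook_tags("Book Title", "Author Name", "Description", 2020), cover_jpeg="cover.jpg"), where the file names and values are placeholders.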

From here, you can add the m4b to your audiobooks app of choice or store it in your calibre library