How to convert an epub package to regular epub?

How iBooks imports an ePub file:

A .epub file, as noted in other answers, is essentially a zip archive. When iBooks imports a .epub file, it stores the book unzipped, as a directory that macOS displays as a package. This explains why it has the Show Package Contents option, which lets you explore the "unzipped" files. However, just zipping the package doesn't always work.

Re-creating the .epub file from the package:

Manually:

  1. Right click on the .epub file and click on Show Package Contents.
  2. Select all the contents (CMD + A) → Right click → Compress.
  3. This will create a .zip file. Simply change the extension from .zip to .epub and voila! The file has become an ePub document.
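
The manual steps leave the order and compression of the archive entries up to the Finder. For a single package, the same idea can be sketched in Python with the mimetype entry written first and uncompressed. This is only a minimal sketch: `pack_epub` and its parameters are hypothetical names, and it assumes the package contains a `mimetype` file and that the target file lies outside the package directory.

```python
import os
import zipfile

def pack_epub(package_dir, target_file):
    """Zip the contents of an unpacked .epub directory into target_file."""
    with zipfile.ZipFile(target_file, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        # Write mimetype first and store it without compression
        archive.write(os.path.join(package_dir, "mimetype"), arcname="mimetype",
                      compress_type=zipfile.ZIP_STORED)
        for root, dirs, files in os.walk(package_dir):
            dirs.sort()  # walk subdirectories in a stable order
            for name in sorted(files):
                full_path = os.path.join(root, name)
                # Paths inside the archive are relative to the package root
                arcname = os.path.relpath(full_path, package_dir)
                # mimetype is already written; the plist is added by iBooks and left out
                if arcname in ("mimetype", "iTunesMetadata.plist"):
                    continue
                archive.write(full_path, arcname=arcname)
```

Calling `pack_epub("/path/to/book.epub", "/path/to/output/book.epub")` on a copy of the package would produce a regular file; the paths here are placeholders.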

Automation:

I had more than 2,000 .epub packages I wanted to convert to .epub files, so the above method wasn't feasible. To avoid the manual labor, I wrote a script that essentially runs the above method on all the files. I used the simple and elegant shell code provided by Matthias here and wrapped it in a Python script, shared below:

# Convert epub packages to epub files
import os
import shlex
import subprocess

filenames = []

#   Function to store all filenames in a list
def extract_filename(path_to_files):    # "/Users/****/Desktop/Old_epubs"
    os.chdir(path_to_files)     # later commands run relative to this directory
    for f in os.listdir("."):
        f_name, f_ext = os.path.splitext(f)
        # Only directories are packages; regular .epub files are left alone
        if f_ext == ".epub" and os.path.isdir(f):
            filenames.append(f_name)

    filenames.sort()

#   Function to generate new epub files
def create_epub(path_to_new_files): # "/Users/****/Desktop/new_epubs/"
    total_files = len(filenames)
    for i in range(total_files):
        filename = filenames[i] + ".epub"
        # Quote the target path so names containing spaces survive the shell
        target = shlex.quote(os.path.join(path_to_new_files, filename))
        # rm -f: no error if the plist is absent; mimetype is added to the archive first
        comm = "rm -f iTunesMetadata.plist; zip -X -r " + target + " mimetype *"
        # cwd makes the command run inside the package directory
        p1 = subprocess.run(comm, capture_output = True, text = True, shell = True, cwd = filename)
        if p1.returncode == 0:
            rem_files = total_files - (i + 1)
            print("File #", i+1, " has been processed successfully. Remaining files: ", rem_files)
        else:
            print("File #", i+1, " failed: ", p1.stderr)

#   Enter the paths
extract_filename("/Users/****/Desktop/Books")   # Path to directory containing epub packages
create_epub("/Users/****/Desktop/new_epubs/")   # Path to store new epub files in

The extract_filename function takes a path to a directory that contains the .epub packages that need to be converted. [WARNING] It is best to work on a copy of the .epub packages in case something goes wrong. To be safe, just copy the packages to a different directory and work on that.

The create_epub function takes a path to a directory where you want to store the generated files. It then runs a shell command to open each .epub package and generate a .epub file.


Hope this helps! It certainly solved a big headache of mine.


FWIW, here's a shell command that works:

 cd my-broken.epub

 # iTunes/Books seems to add a file 
 # 'iTunesMetadata.plist', and it produces a warning.
 # May also contain private data, so better delete it.

 rm iTunesMetadata.plist 

 zip -X -r ../fixed.epub mimetype *

As far as I can tell, compression does not need to be deactivated (-0). epubcheck has no complaints. There might be differences between versions of the epub spec, however. My test was with an epub 3.0 file.
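
Besides running epubcheck, one way to see how the mimetype entry actually ended up in the archive is to read the zip directory with Python's zipfile module. A minimal sketch; `describe_mimetype_entry` is a hypothetical helper name, and it only reports what is in the archive without judging validity:

```python
import zipfile

def describe_mimetype_entry(epub_path):
    """Return (is_first, is_stored) for the mimetype entry of an epub archive."""
    with zipfile.ZipFile(epub_path) as archive:
        entries = archive.infolist()
    # Is mimetype the very first entry in the archive?
    is_first = bool(entries) and entries[0].filename == "mimetype"
    # Is that first entry stored without compression?
    is_stored = is_first and entries[0].compress_type == zipfile.ZIP_STORED
    return is_first, is_stored
```

For example, `describe_mimetype_entry("../fixed.epub")` could be run right after the zip command above.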


An ePub file is essentially just a zipped folder, though it has a mimetype file inside which apparently needs to not be compressed.

This would imply that it's not completely straightforward to recreate with a simple zip app. However, it may be simpler than that.
Let's assume nothing has actually unpacked it, merely got confused about how to deal with it. Work on a copy.

Two things to try...

  1. Try just renaming it, change .epub to .zip, then change it back again, see if it's recognised correctly.

  2. Open it in Calibre
    You then have myriad ways to deal with it. The simplest is to see if it can talk to your ebook reader via OPDS. Calibre can run its own local server on your wifi & you can copy books over very simply.
    If still no joy, get Calibre to convert it to an ePub [again]. This is a great method for fixing a file, as it can re-examine it, fix fonts, bad hyphenations, all kinds of issues.

Calibre itself is too big a subject to really cover in a simple Q&A, but there are reams of data about it on the site itself & at http://www.mobileread.com/forums/ including sections for most major e-readers too.


Reproduction of the problem:

  1. An ePub file named, say, book.epub is a file (-rw-r--r--).
  2. Open book.epub using iBooks app.
  3. Take out the cached file stored in ~/Library/Containers/com.apple.BKAgentService/Data/Documents/iBooks/Books/, which has been renamed to another name such as A22DFAF7E75C21D979C375B1AD07008F.epub and becomes a directory (drwxr-xr-x@).
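
To check which entries in a folder have turned into directories, the file/directory distinction above can be tested directly. A minimal sketch, where `split_epubs` is a hypothetical helper name and the folder passed in is whichever copy of the Books directory you are working on:

```python
import os

def split_epubs(directory):
    """Return (packages, files): .epub names that are directories vs. regular files."""
    packages, files = [], []
    for name in sorted(os.listdir(directory)):
        if not name.endswith(".epub"):
            continue  # ignore anything that is not named *.epub
        full_path = os.path.join(directory, name)
        if os.path.isdir(full_path):
            packages.append(name)   # unpacked by iBooks (drwxr-xr-x)
        else:
            files.append(name)      # still a regular archive (-rw-r--r--)
    return packages, files
```

Only the names in the first list need to be zipped back up.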

Steps of the work-around that works on my Mac:

  1. Change extension of A22DFAF7E75C21D979C375B1AD07008F.epub from .epub to .zip.
  2. Go into the zip package and zip up all the contents inside into a new .zip file, say, Archive.zip.
  3. Drag out the new .zip file and change extension back to .epub.
  4. The Archive.epub file is a file (-rw-r--r--).

I've taken some of the comments here and provided a Jupyter notebook that does this for a backup of say a Books directory:

import pathlib
import glob
import os
import zipfile

# Extract the relevant components of a full path
def pathComponents(fullpath):
    path = pathlib.Path(fullpath)
    name = path.name
    stem = path.stem
    suffix = path.suffix
    parent = path.parent
    return name, stem, suffix, parent

# Get directories in your path that have a .epub suffix
def getePubDirs(path):
    epubSearch = os.path.join(path, "*" + "." + "epub")
    eDirs=[]
    for p in glob.glob(epubSearch):
        if (os.path.isdir(p)):    # Only if it's an epub directory
            eDirs.append(p)
    return eDirs

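def createZipFile(fname, files):
    # The epub container format expects mimetype to be the first entry, stored uncompressed
    isMimetype = lambda f: os.path.normpath(f) == "mimetype"
    with zipfile.ZipFile(fname, mode='w', compression=zipfile.ZIP_DEFLATED) as zFile:
        for f in sorted(files, key=lambda f: not isMimetype(f)):
            zFile.write(f, compress_type=zipfile.ZIP_STORED if isMimetype(f) else None)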

def convertDirToePub(dirname):
    name, stem, suffix, parent = pathComponents(dirname)        # get dirname details
    newdirname = str(parent) + "/" + name + ".ori"
    zFile = str(parent) + "/" + stem + ".epub"
    os.rename(dirname,newdirname)     # rename the epub dir so the epub can take its old name
    # get all the files within the epub directory
    os.chdir(newdirname) # We need to be in directory so the resultant epub paths are correct
    everything = [os.path.join(r,file) for r,d,f in os.walk(".") for file in f]
    createZipFile(zFile, everything)  # Create a zip file containing all those files
    os.chdir(parent)

dirtoconvert='/Users/<username>/tmp/Booksdir'

eDirs = getePubDirs(dirtoconvert)
totalDirs=len(eDirs)
print(totalDirs, "docs to convert")
ctr=1
for i in eDirs:
    print(ctr, "of", totalDirs, ": Converting ", i)
    convertDirToePub(i)
    ctr = ctr + 1
print("Done!")