How to deduplicate iMovie files in "Original Media"?
With each project, iMovie imports videos, images, and audio into an iMovie Library.imovielibrary. I want my originals in a separate folder, along with other originals that I didn't import into iMovie and others for which I use other tools such as ffmpeg, but I still want to keep the ability to edit and export projects. At the moment I have videos in two places and the iMovie library is a bloated 300 GB on a 1 TB drive.
How can I avoid duplicating video or other files in an iMovie library and save disk space?
Solution 1:
First a disclaimer: As you can see from miguelmorin's answer, some people have created various scripts to replace the duplicate images and videos in the iMovie library with hard links or symlinks. Before going any further: I would avoid hard links. Symlinks seem to work fine with iMovie, whereas hard links can have weird side effects; for instance, Time Machine may back them up as separate files.
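To make the distinction concrete, here is a minimal sketch of how the two kinds of link look from the file system's point of view. The file names are made up for illustration, and it says nothing about how iMovie or Time Machine treat them; it only shows that a hard link is a second name for the same data, while a symlink is a small pointer that records the path of the original.

```python
import os
import tempfile


def describe_links(workdir):
    """Create one hard link and one symlink to the same file and report on them."""
    original = os.path.join(workdir, "original.mov")  # hypothetical file names
    hard = os.path.join(workdir, "hard.mov")
    soft = os.path.join(workdir, "soft.mov")

    with open(original, "wb") as fh:
        fh.write(b"fake video data")

    os.link(original, hard)     # a second directory entry for the same inode
    os.symlink(original, soft)  # a separate small file that stores a path

    return {
        "hard_is_symlink": os.path.islink(hard),
        "soft_is_symlink": os.path.islink(soft),
        "same_inode": os.stat(original).st_ino == os.stat(hard).st_ino,
        "link_count": os.stat(original).st_nlink,
        "soft_target": os.readlink(soft),
    }


# Demo in a throwaway directory, so nothing in a real library is touched.
with tempfile.TemporaryDirectory() as tmp:
    for key, value in describe_links(tmp).items():
        print(key, "=", value)
```

Because the hard link shares an inode with the original, nothing on disk marks which of the two names is "the original", whereas a symlink is always recognizable as a link.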
In my case, I used rdfind, which is an existing utility for cleaning up duplicate files and isn't specific to iMovie or even macOS.
- Install rdfind:
  brew install rdfind
- Do a dry run, e.g.:
  rdfind -dryrun true -minsize 1048576 -makesymlinks true ~/Pictures/ ~/Movies/
  - -minsize makes rdfind ignore files smaller than the given size in bytes (here 1 MiB). It is used to avoid touching any files that aren't image or video files. Adjust it as needed.
  - Replace ~/Pictures/ with the location(s) of the original image/video files. You can list as many directories as you want, but ~/Movies/ should be last because rdfind expects the locations of the original files to be listed first.
  Update: YMMV, but it looks like iMovie 10 puts all of the original images and videos in Original Media directories under ~/Movies/iMovie Library.imovielibrary. The following goes through those directories only and runs rdfind on each of them, in which case -minsize shouldn't be needed (as above, replace ~/Pictures/ as needed):
  find ~/Movies/ -type d -name "Original Media" -exec rdfind -dryrun true -makesymlinks true ~/Pictures/ {} \;
- Create the symlinks. Once you're happy with the output of the command from the dry run, remove -dryrun true to replace the duplicate files with symlinks, e.g.:
  rdfind -minsize 1048576 -makesymlinks true ~/Pictures/ ~/Movies/
  Or:
  find ~/Movies/ -type d -name "Original Media" -exec rdfind -makesymlinks true ~/Pictures/ {} \;
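To check what the run achieved, it can help to measure the Original Media directories before and after. The following is a rough sketch rather than part of rdfind: the function name `original_media_usage` is made up here, and it only looks at files directly inside each Original Media directory.

```python
import os


def original_media_usage(root):
    """Return (bytes_in_regular_files, symlink_count) for files that sit
    directly inside any directory named "Original Media" under root."""
    total_bytes = 0
    symlinks = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        if os.path.basename(dirpath) != "Original Media":
            continue
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                symlinks += 1  # already replaced by a link
            else:
                total_bytes += os.path.getsize(path)  # still a real copy
    return total_bytes, symlinks
```

For example, `original_media_usage(os.path.expanduser("~/Movies"))` run before and after the symlink step should show the byte count dropping and the symlink count rising.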
Pros:
- Dry run option to show you what it's going to do first
- Will actually check the files to see if they're the same rather than just compare file names
- Will find duplicates even if the filenames are different
Cons:
- There's no way to restrict it to only image and video files (worked around above either by using -minsize or by only running rdfind on the Original Media directories)
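The idea behind the second and third pros, comparing contents instead of names, can be sketched in a few lines. This is only an illustration of content-based duplicate detection, not rdfind's actual algorithm; the function names and the choice of SHA-256 are assumptions made for the example.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Hash a file in chunks so large videos are never loaded whole."""
    h = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(directories):
    """Return lists of paths whose contents are identical.

    Files are grouped by size first (cheap), then by digest (expensive).
    Within each group, paths keep the order of `directories`, so files
    from earlier directories come first.
    """
    by_size = defaultdict(list)
    for directory in directories:
        for dirpath, _dirnames, filenames in os.walk(directory):
            for name in sorted(filenames):
                path = os.path.join(dirpath, name)
                # Skip symlinks and anything that is not a regular file.
                if os.path.islink(path) or not os.path.isfile(path):
                    continue
                by_size[os.path.getsize(path)].append(path)

    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # a unique size cannot have a duplicate
        by_digest = defaultdict(list)
        for path in paths:
            by_digest[file_digest(path)].append(path)
        groups.extend(g for g in by_digest.values() if len(g) > 1)
    return groups
```

Two files with different names but the same bytes end up in one group, while two files that merely share a size do not.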
Solution 2:
This page suggests replacing the video files with links to the original, which saves space. It has this gist in Ruby, and I coded this gist in Python, which is also below. My iMovie library went from 300 GB to 5 GB; what remains includes two projects that I skipped because I was still working on them.
Like the Ruby version:
- it goes through an iMovie 10 library and replaces the files in Original Media for which it can find a correspondence with links
- it requires you to import into the library, quit iMovie, and then run the script.
Unlike the Ruby version:
- it uses symlinks to the original media instead of hard links (I confirmed that it works just as well)
- you can define the filetypes to replace (movie, audio, image) through the global variable FILE_SUFFIXES_LOWERCASE
- you can adapt the global variable PROJECTS_TO_SKIP to avoid replacing media in projects that you are still working on
- it assumes that your iMovie library and originals folder are organized by the same event name, because in my case I had multiple DSC001.MOV and I use the event name to distinguish them
- if the event names are different, e.g. if you create two events titled "movie" then iMovie renames the second to "movie 1", you can adapt the global variable SHOW_NAME_CORRESPONDENCE to map the name of the iMovie event to the name of the folder with the original content.
import doctest
import glob
import os
import pathlib
import shutil
import sys
FILE_SUFFIXES_LOWERCASE = [".mp4", ".mts", ".mov", ".jpg", ".jpeg", ".png"]
PROJECTS_TO_SKIP = [] # e.g., ["project 1", "project 2"]
SHOW_NAME_CORRESPONDENCE = {} # e.g. {"movie": "movie 1"}


def skip(f):
    """Returns a boolean for whether to skip a file depending on suffix.

    >>> skip("abc.mp4")
    False
    >>> skip("ABC.JPEG")
    False
    >>> skip("abc.plist")
    True
    >>> skip("00114.MTS")
    False
    """
    suffix = pathlib.Path(f).suffix.lower()
    return suffix not in FILE_SUFFIXES_LOWERCASE


def get_show_and_name(f):
    """Returns the lowercase (show, file name) pair for a path.

    >>> show, name = get_show_and_name("/Volumes/video/iMovie Library.imovielibrary/my great show/Original Media/00117.mts")
    >>> "my great show" == show
    True
    >>> "00117.mts" == name
    True
    >>> show, name = get_show_and_name("/Volumes/video/path/to/originals/my great show/00117.mts")
    >>> "my great show" == show
    True
    >>> "00117.mts" == name
    True
    """
    path = pathlib.Path(f)
    name = path.name.lower()
    dirname = str(path.parents[0])
    imovie = "iMovie Library.imovielibrary" in dirname
    # In the library, the show is two levels up (above "Original Media").
    parent_dir = str(path.parents[2 if imovie else 1])
    show = dirname.replace(parent_dir, "")
    if imovie:
        assert show.endswith("/Original Media"), f
        show = show.replace("/Original Media", "")
    assert show.startswith("/")
    show = show[1:].lower()
    if show in SHOW_NAME_CORRESPONDENCE:
        show = SHOW_NAME_CORRESPONDENCE[show]
    return show, name


def build_originals_dict(originals):
    """Go through the original directory to build a dictionary of filenames to paths."""
    originals_dic = dict()
    for f in glob.glob(os.path.join(originals, "**", "*.*"), recursive=True):
        if skip(f):
            continue
        show, name = get_show_and_name(f)
        originals_dic[(show, name)] = f
    return originals_dic


def replace_files_with_symlinks(library, originals):
    """Go through the iMovie library and find the replacements."""
    originals_dic = build_originals_dict(originals)
    # Show names are lowercased above, so compare against lowercase names.
    to_skip = [p.lower() for p in PROJECTS_TO_SKIP]
    # List files recursively
    for f in glob.glob(os.path.join(library, "**", "*.*"), recursive=True):
        if skip(f) or os.path.islink(f):
            continue
        show, name = get_show_and_name(f)
        # Leave media alone for anything listed in PROJECTS_TO_SKIP
        # (assumes the entries match the show name returned above).
        if show in to_skip:
            continue
        if (show, name) in originals_dic:
            target = originals_dic[(show, name)]
            print("Replacing %s with %s" % (f, target))
            os.unlink(f)
            os.symlink(target, f)
        else:
            print("No original found for %s" % f)


def main():
    args = sys.argv
    # sys.argv[0] is the script name, so two user arguments give a length of 3.
    assert 3 == len(args), "You need to pass 2 arguments: the iMovie library and the originals folder"
    library = args[1]
    originals = args[2]
    replace_files_with_symlinks(library=library, originals=originals)


if "__main__" == __name__:
    r = doctest.testmod()
    assert 0 == r.failed, "Problem: doc-tests do not pass!"
    main()
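Once the media files are symlinks, the library depends on the originals staying where they are: a symlink only stores a path, so moving or renaming the originals folder leaves dangling links. A small check along the following lines can catch that; it is a sketch that is not part of the gist, and the function name `find_broken_symlinks` is made up here.

```python
import os


def find_broken_symlinks(library):
    """Return a sorted list of symlinks under `library` whose target is missing."""
    broken = []
    for dirpath, dirnames, filenames in os.walk(library):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            # os.path.exists() follows the link, so it is False when the
            # target is gone even though the link itself is still there.
            if os.path.islink(path) and not os.path.exists(path):
                broken.append(path)
    return sorted(broken)
```

Running it on the library path after reorganizing the originals folder, and before opening iMovie, should return an empty list if every link still resolves.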