Command Line Tool to Batch Convert .EML/.EMLX/.MBOX to Searchable PDFs?
Solution 1:
I had to do this with ~180 emails, and I used a command-line tool I found on GitHub that converts .eml to .pdf via .html: https://github.com/nickrussler/eml-to-pdf-converter
Conversion is slow - 22 minutes for 186 emails with lots of images, or roughly 7 seconds per email - so it's probably not practical for a 500k email task. (Maybe if you're reeeally not in a rush and not afraid of multiprocessing!) If it is helpful for you or anyone else, though, here's how I got it to work from the bash command line:
git clone the repo.
Install the wkhtmltopdf tool from a binary (installing with pip is insufficient) from here: https://wkhtmltopdf.org/downloads.html
From within the cloned repo, generate the email converter .jar file:
./gradlew shadowJar
Run a for loop to convert every file in the .mbox (or a directory of .eml files):
for file in /path/to/mailbox.mbox/*; do
    java -jar ./build/libs/emailconverter-2.0.1-all.jar "$file"
done
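Since each file is converted by its own java invocation, the multiprocessing idea can be tried by handing the file list to xargs and letting it run several converters at once. This is only a minimal sketch, not something from the converter's documentation: convert_all is a hypothetical helper name, it assumes the emails are .eml files somewhere under one directory, and it assumes your xargs supports -0 and -P (the GNU and BSD versions do).

```shell
# convert_all DIR JOBS COMMAND...
# Runs COMMAND once per .eml file found under DIR, up to JOBS at a time.
# (convert_all is a hypothetical helper, not part of the converter.)
convert_all() {
    local dir=$1 jobs=$2
    shift 2
    # -print0 / -0 keep file names containing spaces intact;
    # -n 1 passes one file per invocation, -P caps the parallel processes.
    find "$dir" -type f -name '*.eml' -print0 |
        xargs -0 -n 1 -P "$jobs" "$@"
}
```

For example, `convert_all /path/to/emails 4 java -jar ./build/libs/emailconverter-2.0.1-all.jar` would keep up to four conversions running. Each one starts its own JVM, so pick a job count that fits your CPU and memory.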
Solution 2:
I recently came across "How to open eml files?" on AskUbuntu.
It suggests using munpack, which is part of mpack. It can unpack an eml into its html or plain txt parts. There are several tools to convert html to a pdf. WeasyPrint is one of them. You can install it via pip.
mpack is also available in Homebrew. Assuming you have Homebrew installed, it's easily installed via:
brew install mpack
Then run
munpack -t /path/to/my.eml
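The two steps (unpack with munpack, then render the html with WeasyPrint) could be chained roughly as follows. Treat it as a sketch under assumptions: find_html_parts and eml_to_pdf are hypothetical helper names, munpack and the weasyprint command are assumed to be on your PATH, and because I'm not sure what file names munpack gives the unpacked parts, the html part is detected by its content rather than by name. Emails with only a plain text part produce no PDF here.

```shell
# Print the files in directory $1 whose content looks like HTML.
# Matching on content avoids guessing munpack's output file names.
find_html_parts() {
    grep -lis '<html' "$1"/* || true
}

# eml_to_pdf EML OUTDIR
# Unpacks EML into OUTDIR, then renders each HTML part to a PDF next to it.
eml_to_pdf() {
    local eml_dir eml outdir
    eml_dir=$(cd "$(dirname "$1")" && pwd)
    eml="$eml_dir/$(basename "$1")"   # absolute path, since we cd below
    outdir=$2
    mkdir -p "$outdir"
    # munpack writes its output into the current directory.
    (cd "$outdir" && munpack -t "$eml")
    find_html_parts "$outdir" | while IFS= read -r html; do
        weasyprint "$html" "$html.pdf"
    done
}
```

Usage would look like `eml_to_pdf my.eml my_parts`, which leaves the unpacked parts and any generated PDFs in my_parts.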