How can I convert .epub files to plain text?
I'm able to view an epub file in, say, okular, select all the text and copy-paste into a text editor. I'd like a command line method - anyone know of such a thing?
I don't know if Calibre is worth installing for your job, but if you have it you could use the powerful ebook converter:
ebook-convert input.epub output.txt
The output format is deduced from the extension of the output file.
I imagine there are also XML tools or scripts (XSLT, for example) that can transform an epub into text, since an epub is basically XHTML in a ZIP archive.
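To illustrate the "XHTML in a ZIP archive" point, here is a rough sketch using only the Python standard library rather than XSLT. It assumes a DRM-free epub with UTF-8 content and the usual META-INF/container.xml pointing at the package (OPF) file. The names epub_to_text and TextExtractor are my own, not part of any existing tool, and the output will be cruder than what Calibre produces.

```python
import posixpath
import re
import sys
import xml.etree.ElementTree as ET
import zipfile
from html.parser import HTMLParser
from urllib.parse import unquote

CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}
OPF_NS = {"opf": "http://www.idpf.org/2007/opf"}

# Tags that should start/end a line, and tags whose content is not body text.
BLOCK_TAGS = {"p", "div", "br", "li", "tr", "blockquote",
              "h1", "h2", "h3", "h4", "h5", "h6"}
SKIP_TAGS = {"head", "script", "style"}


class TextExtractor(HTMLParser):
    """Hypothetical helper: collects visible text from one XHTML document."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1
        elif tag in BLOCK_TAGS:
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self.skip_depth:
            self.skip_depth -= 1
        elif tag in BLOCK_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        if not self.skip_depth:
            self.parts.append(data)


def epub_to_text(path):
    """Return the text of an epub, following the reading order in its spine."""
    with zipfile.ZipFile(path) as zf:
        # container.xml names the package (OPF) file.
        container = ET.fromstring(zf.read("META-INF/container.xml"))
        opf_path = container.find(".//c:rootfile", CONTAINER_NS).get("full-path")
        opf_dir = posixpath.dirname(opf_path)
        package = ET.fromstring(zf.read(opf_path))

        # The manifest maps item ids to files; the spine lists ids in order.
        manifest = {
            item.get("id"): item.get("href")
            for item in package.iterfind(".//opf:manifest/opf:item", OPF_NS)
        }
        chunks = []
        for itemref in package.iterfind(".//opf:spine/opf:itemref", OPF_NS):
            href = manifest.get(itemref.get("idref"))
            if href is None:
                continue
            member = posixpath.normpath(posixpath.join(opf_dir, unquote(href)))
            parser = TextExtractor()
            # Assumption: content documents are UTF-8 encoded.
            parser.feed(zf.read(member).decode("utf-8", errors="replace"))
            parser.close()
            chunks.append("".join(parser.parts))

    text = "\n".join(chunks)
    text = re.sub(r"[ \t]+", " ", text)        # collapse runs of spaces
    text = re.sub(r"\n\s*\n+", "\n\n", text)   # collapse runs of blank lines
    return text.strip()


if __name__ == "__main__" and len(sys.argv) > 1:
    print(epub_to_text(sys.argv[1]))
```

Saved as, say, epub_text.py (a made-up file name), it would be run as: python3 epub_text.py input.epub > output.txt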
An alternative is epub2txt by Kevin Boone, available on GitHub.
epub2txt is a simple command-line utility for extracting text from EPUB documents and, optionally, re-flowing it to fit a text display of a particular number of columns. It is written entirely in ANSI-standard C.
Usage example:
epub2txt input.epub > output.txt
MuPDF can convert from epub to html and txt. To install it:
sudo apt install mupdf mupdf-tools
To use it:
mutool convert -o somefilename.txt somefilename.epub
It infers the txt output format from the extension of the file name given to the -o option.
See the mutool convert documentation for more information.