PDF to text converter [closed]
I'm looking for a "one-click" way of taking ANY PDF and converting it to plain text, ideally on OS X or Linux.
Ideally, the solution would include OCR functionality, but that is not a requirement.
The top priority is having something that can take ANY file WITHOUT configuration.
Solution 1:
There's xpdf, which includes the pdftotext binary.
Pdftotext converts Portable Document Format (PDF) files to plain text.
On Linux there's an installer available. It also seems to ship in the poppler-utils package. On OS X you could install it using Homebrew (install that first) and then run
brew install homebrew/x11/xpdf
which downloads the source files and compiles them for OS X. After that, just run it like this:
pdftotext your_pdf_file.pdf
which generates a plain text file with the same name and a .txt extension (your_pdf_file.txt). There are a few options as well; check out man pdftotext for more details.
An alternative is poppler. On OS X:
brew install poppler
On Debian and friends:
apt-get install poppler-utils
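Since the question asks for something close to "one-click", here is a minimal sketch of a Python wrapper that runs pdftotext over every PDF in a folder. It assumes pdftotext is already on your PATH (from either xpdf or poppler); the function names build_command and convert_folder are made up for this example, and -layout is the only option it exposes.

```python
import shutil
import subprocess
from pathlib import Path


def build_command(pdf_path, keep_layout=False):
    """Return the pdftotext argument list for one PDF (hypothetical helper).

    The output path is passed explicitly: same name, .txt suffix,
    which is also what pdftotext chooses by default.
    """
    pdf_path = Path(pdf_path)
    cmd = ["pdftotext"]
    if keep_layout:
        # -layout asks pdftotext to preserve the physical page layout
        cmd.append("-layout")
    cmd += [str(pdf_path), str(pdf_path.with_suffix(".txt"))]
    return cmd


def convert_folder(folder, keep_layout=False):
    """Convert every *.pdf in folder; return the PDFs that failed."""
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext not found on PATH; install xpdf or poppler first")
    failed = []
    for pdf in sorted(Path(folder).glob("*.pdf")):
        # A non-zero exit code means pdftotext could not process the file
        result = subprocess.run(build_command(pdf, keep_layout))
        if result.returncode != 0:
            failed.append(pdf)
    return failed
```

Calling convert_folder(".") converts everything in the current directory and returns a list of any files pdftotext could not handle. Note that pdftotext only extracts text that is already embedded in the PDF; it does no OCR, so scanned, image-only pages will come out empty.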