PDF to text converter [closed]

I'm looking for a "one-click" way of taking ANY PDF and converting it to plain text, ideally on OS X or Linux.

Ideally, the solution would include OCR functionality, but that is not required.

The top priority is having something that can take ANY file WITHOUT configuration.


Solution 1:

There's xpdf, which includes the pdftotext binary.

Pdftotext converts Portable Document Format (PDF) files to plain text.

On Linux there's an installer available, and pdftotext also seems to come in the poppler-utils package. On OS X you can install it with Homebrew (install Homebrew first) and then run

brew install homebrew/x11/xpdf

which will download the source files and compile them for OS X. After that, just use it like this:

pdftotext your_pdf_file.pdf

which will generate a plain text file with the same base name (your_pdf_file.txt). There are a few options as well; check out man pdftotext for more details.
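If you have a whole folder of PDFs, the single command above can be wrapped in a small loop. This is only a rough sketch: the function name convert_all is made up for illustration, and the -layout option (which asks pdftotext to keep the physical page layout) is optional and can be dropped.

```shell
#!/bin/sh
# convert_all DIR: run pdftotext on every *.pdf file in DIR (default: current dir).
# convert_all is a hypothetical helper name, not part of xpdf or poppler.
# Each PDF gets a .txt file with the same base name next to it.
convert_all() {
    dir=${1:-.}
    for pdf in "$dir"/*.pdf; do
        # Skip the unexpanded pattern when the directory has no PDFs.
        [ -e "$pdf" ] || continue
        # -layout tries to preserve the original physical layout of the text.
        pdftotext -layout "$pdf" "${pdf%.pdf}.txt"
    done
}
```

Calling convert_all ~/Documents/papers would then leave one .txt file beside each PDF in that directory. Scanned PDFs without a text layer will still come out empty, since pdftotext does no OCR.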

An alternative is poppler. On OS X:

brew install poppler

On Debian and friends:

apt-get install poppler-utils
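Whichever package you pick, a quick way to confirm the binary ended up on your PATH is something like the following. The function name check_pdftotext is just a placeholder I chose for this sketch.

```shell
#!/bin/sh
# check_pdftotext: report whether a pdftotext command is available,
# regardless of whether xpdf or poppler provided it.
# check_pdftotext is a hypothetical helper name used only for this example.
check_pdftotext() {
    if command -v pdftotext >/dev/null 2>&1; then
        echo "pdftotext found"
    else
        echo "pdftotext not found; install xpdf or poppler first" >&2
        return 1
    fi
}
```

Run check_pdftotext after installing; a non-zero exit status means the install did not put pdftotext where your shell can find it.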