Are there more robust tools than Automator to extract text from multiple PDFs?

Solution 1:

I don't know how it compares with other options, but you could use pdftotext. It can be installed with brew install xpdf.

do shell script "/usr/local/bin/pdftotext /usr/share/doc/bash/bash.pdf -" without altering line endings
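Since the question is about multiple PDFs, here is a minimal sketch of a loop around the same command, assuming pdftotext is on your PATH. The function name pdfs_to_text and the PDFTOTEXT override variable are made up for illustration; they are not part of xpdf.

```shell
#!/bin/sh
# Convert every PDF in a directory to a .txt file next to it.
# PDFTOTEXT is a hypothetical override variable (not part of xpdf);
# it defaults to whatever "pdftotext" resolves to on PATH.
pdfs_to_text() {
    dir=${1:-.}
    converter=${PDFTOTEXT:-pdftotext}
    failed=0
    for pdf in "$dir"/*.pdf; do
        # If nothing matched, the glob stays literal; skip it.
        [ -e "$pdf" ] || continue
        # "pdftotext INPUT OUTPUT" writes the extracted text to OUTPUT.
        if ! "$converter" "$pdf" "${pdf%.pdf}.txt"; then
            echo "failed: $pdf" >&2
            failed=1
        fi
    done
    # Non-zero if at least one file could not be converted.
    return "$failed"
}
```

After defining the function, something like pdfs_to_text ~/Documents/papers should leave one .txt file beside each PDF. It only matches the lowercase .pdf extension and does not descend into subfolders. If the binary lives elsewhere (for example /usr/local/bin/pdftotext as above), set PDFTOTEXT to that path first.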

Calibre also comes with some command-line utilities, such as ebook-convert, which picks the output format from the extension of the output file:

/Applications/calibre.app/Contents/MacOS/ebook-convert /usr/share/doc/bash/bash.pdf /tmp/output.txt

Related questions:

  • How to convert a pdf file into a text file?
  • PDF to TEXT open source command line tool
  • How to extract text from pdf in script on Linux?