How to convert all pdf files to text (within a folder) with one command?

The following will convert all PDF files in the current directory:

for file in *.pdf; do pdftotext "$file" "$file.txt"; done
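
Note that the loop above names its output `report.pdf.txt`. If `report.txt` is preferred, a small variation is sketched below. The function name `pdfs_to_txt` is made up for this example, and it assumes a POSIX-style shell such as bash with `pdftotext` on the PATH.

```shell
# Sketch: convert every PDF in one directory, writing foo.txt rather than foo.pdf.txt.
# pdfs_to_txt is a hypothetical helper name, not something shipped with pdftotext.
pdfs_to_txt() {
  local dir="${1:-.}" file
  for file in "$dir"/*.pdf; do
    # When nothing matches, the glob is left as a literal string; skip it.
    [ -e "$file" ] || continue
    # ${file%.pdf} strips the trailing .pdf before .txt is appended.
    pdftotext "$file" "${file%.pdf}.txt"
  done
}
```

Call it as `pdfs_to_txt` for the current directory or `pdfs_to_txt ~/some/folder` for another one. Spelling out the output name this way also makes it easy to change the target later.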

ls *.pdf | xargs -n1 pdftotext

xargs is often a quick solution for running the same command multiple times with just a small change each time. The -n1 option makes sure that only one pdf file is passed to pdftotext at a time.
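
To see what -n1 changes, `echo` can stand in for pdftotext. This matters because pdftotext treats a second file name as the output text file, so handing it several PDFs at once would not convert them all. The file names below are placeholders.

```shell
# Without -n1, xargs packs all arguments into a single invocation.
printf '%s\n' a.pdf b.pdf c.pdf | xargs echo pdftotext
# prints: pdftotext a.pdf b.pdf c.pdf

# With -n1, the command runs once per argument.
printf '%s\n' a.pdf b.pdf c.pdf | xargs -n1 echo pdftotext
# prints:
# pdftotext a.pdf
# pdftotext b.pdf
# pdftotext c.pdf
```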

Edit: If you're worried about spaces in filenames and such, you can use this alternative:

find . -name '*.pdf' -print0 | xargs -0 -n1 pdftotext
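
The difference is easy to check with `echo` in place of pdftotext. This is only an illustration with a made-up file name: by default xargs splits its input on whitespace, while -0 makes it split on NUL bytes, which is what find's -print0 emits.

```shell
# Newline-separated input: xargs splits on whitespace, so the name is cut in two.
printf '%s\n' 'my file.pdf' | xargs -n1 echo
# prints:
# my
# file.pdf

# NUL-separated input read with -0: the name stays in one piece.
printf '%s\0' 'my file.pdf' | xargs -0 -n1 echo
# prints: my file.pdf
```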

Write a bash script:

for f in *.pdf; do
  pdftotext "$f"
done

or type it in a one-line command as follows:

for f in *.pdf; do pdftotext "$f"; done

I hope this helps. I do not have a large group of .pdfs to test this on, but I use this strategy to convert my .flac files to .ogg files.


First, thanks to Sam, to Ryan Thompson, and to all the other answerers. My answer is only a variation on theirs: it shows how their solutions can be added to Thunar's custom actions.

Like any terminal command, a command that converts all PDF files in a folder to text can be added to the list of custom actions in the Thunar file manager.

The command I use for the custom action is find . -name '*.pdf' -print0 | xargs -0 -n1 pdftotext (from Ryan Thompson). It is the one I prefer, but it has a nasty side effect, described below.

It is a command to be used with care: find descends into every subfolder, so it converts all PDFs in the folder where it is run and in every folder below it. If it is run by mistake in the home folder, it will have unwanted effects: every PDF under your home folder will get a text file next to it!

(I tested it like this: I created a folder called "test" on the desktop and put a PDF file in it, along with a series of nested folders (/Desktop/test/a/b/c/e/f/g/h/i), each containing the same PDF. Running the command in /Desktop/test converted all the PDFs, down to the one in the "i" folder.)

(I would welcome comments on how to adjust this command so as to avoid that risk.)
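
One possible adjustment, offered as a sketch rather than a tested Thunar recipe: find descends into subfolders by default, and the -maxdepth 1 option (supported by GNU and BSD find, though not required by POSIX) keeps it in the starting folder. The function name `pdfs_to_txt_flat` is only illustrative. The sketch also swaps xargs for find's own -exec, so nothing is run when no PDF matches.

```shell
# Convert only the PDFs directly inside the given folder (default: current one),
# without descending into subfolders. Assumes a find that supports -maxdepth.
# pdfs_to_txt_flat is a hypothetical name chosen for this example.
pdfs_to_txt_flat() {
  # -exec ... {} \; runs pdftotext once per matching file and is safe for any file name.
  find "${1:-.}" -maxdepth 1 -type f -name '*.pdf' -exec pdftotext {} \;
}
```

For a custom action, the find line on its own, with `.` as the folder, should be the part to paste in.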

Replacing it with the other command (for file in *.pdf; do pdftotext "$file" "$file.txt"; done), from Sam, avoids the problem, because the *.pdf pattern only matches files in the current folder.

But in certain cases one might want exactly what Ryan's solution does!