Is there a better PDF-to-text converter than pdftotext?

I'm using pdftotext (part of poppler-utils) to convert PDF documents to text. It works for the most part, but I wish it inserted blank lines between separate paragraphs instead of mashing them together.

Is there a way to get pdftotext to do this? And if not, is there another PDF-to-text utility that can?


If you are using pdftotext, you can pass the -layout flag to preserve the physical layout of the text on each page of the input PDF:

pdftotext -layout input.pdf output.txt
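If the -layout output keeps first-line indents but still has no blank lines between paragraphs, a small filter can add them. This is only a sketch: add_paragraph_breaks is a made-up helper name, and it assumes a new paragraph is marked by a line indented deeper than the line before it, which will not hold for every PDF.

```shell
# Hypothetical helper (not part of poppler-utils): reads text on stdin and
# prints a blank line before any line that is indented deeper than the
# previous line, on the assumption that such a line starts a new paragraph.
add_paragraph_breaks() {
  awk '
    {
      # Count leading spaces; 0 when the line has none.
      indent = match($0, /^ +/) ? RLENGTH : 0
      # Skip the first line, empty lines, and lines already preceded by a blank.
      if (NR > 1 && $0 != "" && prev != "" && indent > prev_indent) print ""
      print
      prev = $0
      prev_indent = indent
    }
  '
}

# Usage (pdftotext writes to stdout when the output file is "-"):
#   pdftotext -layout input.pdf - | add_paragraph_breaks > output.txt
```

Headings, list items, and centered lines will also trigger a blank line under this heuristic, so check the result on a page or two before trusting it.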

You could try ebook-convert from Calibre.

If anything, I'd say it errs in the other direction: too many line breaks.
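If the extra breaks get in the way, they can be trimmed after conversion. A rough sketch, assuming the excess shows up as runs of consecutive empty lines; squeeze_blank_lines is a made-up name, not a Calibre feature.

```shell
# Hypothetical helper: collapse each run of empty (or whitespace-only)
# lines on stdin into a single empty line.
squeeze_blank_lines() {
  awk '
    /^[ \t]*$/ { if (!blank) print ""; blank = 1; next }
    { print; blank = 0 }
  '
}

# Usage:
#   ebook-convert input.pdf output.txt
#   squeeze_blank_lines < output.txt > cleaned.txt
```

This leaves single blank lines alone, so paragraph separation survives. It does nothing about breaks that fall in the middle of a paragraph.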

Another thing I'd definitely consider, though, is converting to HTML using pdfreflow and then converting the HTML to TXT.