How to search in PDFs using regular expressions? [closed]
I usually use Notepad++ to search in files using regular expressions. Today I am wondering whether there is a PDF program that does the same for PDFs. Of course I could convert the PDF to text and use Notepad++, but is there an easier way that avoids converting?
Several options:
- Agent Ransack, whose Lite edition is free and which supports PDF, as its release notes confirm (it is also the top answer in "Best way to *confidently* search files and contents in Windows without using an indexing service?").
- dnGrep, which is free and open-source software. Unfortunately it is currently available only on Windows (a feature request for other platforms has been opened).
- PowerGREP is a commercial product.
As you said, the obvious alternative is to convert the PDFs to text. One way for a programmer to set that up for bulk processing is to use the Python package PDFMiner. Agent Ransack uses "pdftotext" from the Xpdf project (and you can too).
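If you go the conversion route, a minimal Python sketch of "pdftotext plus regex" could look like the following. It assumes the pdftotext binary is installed and on your PATH; the function names pdf_to_text and grep_text are made up for this illustration, and the page numbering relies on pdftotext separating pages with a form feed character, which is its default behaviour.

```python
import re
import subprocess


def pdf_to_text(pdf_path):
    """Convert one PDF to text by calling the external pdftotext tool.

    Assumption: pdftotext is installed and on PATH.
    The trailing "-" tells pdftotext to write to stdout instead of a file.
    """
    result = subprocess.run(
        ["pdftotext", "-enc", "UTF-8", "-layout", pdf_path, "-"],
        check=True,
        capture_output=True,
    )
    return result.stdout.decode("utf-8", errors="replace")


def grep_text(text, pattern, flags=0):
    """Yield (page_number, line) for every line matching the regex.

    Pages are split on the form feed that pdftotext emits between pages.
    """
    regex = re.compile(pattern, flags)
    for page_no, page in enumerate(text.split("\f"), start=1):
        for line in page.splitlines():
            if regex.search(line):
                yield page_no, line.strip()
```

Usage would be something like `for page, line in grep_text(pdf_to_text("report.pdf"), r"\d{4}-\d{2}"): print(page, line)`, where report.pdf is a placeholder file name. For bulk processing you could loop over a folder of PDFs and call the same two functions for each file.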