How to replace images of text in PDFs with formatted text using OCR

Solution 1:

Even Adobe's own software does not do this well, nor does it make clear how to do it.

With Adobe Acrobat X, you can create a text layer through the menus (View | Tools | Recognize Text) or by clicking Tools in the toolbar and then Recognize Text in the Tools pane.

You then have options to perform OCR on the document or find "suspects". The "suspects" are possible OCR results that don't look right (perhaps words that fail a spellcheck?). Once you have gone through the suspects, there doesn't seem to be any way to access or edit the text layer again short of redoing the OCR.
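To make the idea of "suspects" more concrete, here is a minimal Python sketch of how an OCR tool might flag doubtful words. This is only a guess at the general approach, not how Acrobat actually does it: the `ocr_words` data, the `KNOWN_WORDS` list, the `find_suspects` function and the 0.6 threshold are all hypothetical and made up for illustration.

```python
import re

# Hypothetical OCR output: (recognized word, engine confidence from 0.0 to 1.0).
# Neither this data format nor the threshold comes from Acrobat; both are assumptions.
ocr_words = [
    ("The", 0.98),
    ("qu1ck", 0.91),
    ("brown", 0.42),
    ("fox", 0.97),
]

# Tiny stand-in word list; a real spellchecker would use a full dictionary.
KNOWN_WORDS = {"the", "quick", "brown", "fox"}


def find_suspects(words, known_words, min_confidence=0.6):
    """Return (index, word, reason) for each word that looks doubtful."""
    suspects = []
    for index, (word, confidence) in enumerate(words):
        # Strip punctuation and lowercase before the word-list lookup.
        normalized = re.sub(r"[^\w]", "", word).lower()
        if confidence < min_confidence:
            suspects.append((index, word, "low confidence"))
        elif normalized not in known_words:
            suspects.append((index, word, "not in word list"))
    return suspects


suspects = find_suspects(ocr_words, KNOWN_WORDS)
for index, word, reason in suspects:
    print(f"word {index}: {word!r} ({reason})")
```

With the sample data, "qu1ck" is flagged because it is not in the word list and "brown" because its confidence is below the threshold. A reviewer would then step through such a list and correct each entry, which is roughly what the suspects workflow in Acrobat asks you to do.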

You can choose page ranges to limit OCR (e.g. if you have a multilingual document), but you can't limit it to a selection.

Given that this is such a useful feature, it's disappointing that Adobe don't make it very user-friendly.

Edit: Two other possible solutions.

Adobe Acrobat using ClearScan

When you perform OCR with Adobe Acrobat you can change the PDF Output Style from the default Searchable Image format to ClearScan. This format will actually change the image as well, replacing characters with outlines derived from the OCR. This would both make your PDF more readable and add a text layer, but it does change the original image.

Infix PDF Editor

This program does seem to be able to display the text layer, but it still seems tricky to fix places where Adobe's OCR goes wrong (e.g. lone words placed in their own positioned paragraph).

Sadly none of these options are freely available.
