pytesseract cannot find the file specified
My code is straightforward:
import pytesseract
from PIL import Image
img = Image.open('C:/temp/foo.jpg')
img.load()
i = pytesseract.image_to_string(img)
and the error response I get back is:
Traceback (most recent call last):
  File "img.py", line 6, in <module>
    i = pytesseract.image_to_string(img)
  File "build\bdist.win32\egg\pytesseract\pytesseract.py", line 161, in image_to_string
  File "build\bdist.win32\egg\pytesseract\pytesseract.py", line 94, in run_tesseract
  File "C:\Users\%USER%\AppData\Local\Continuum\Anaconda\lib\subprocess.py", line 710, in __init__
    errread, errwrite)
  File "C:\Users\%USER%\AppData\Local\Continuum\Anaconda\lib\subprocess.py", line 958, in _execute_child
    startupinfo)
WindowsError: [Error 2] The system cannot find the file specified
Any guidance would be fantastic.
Adding the Tesseract directory to my PATH variable helped:
C:\Program Files (x86)\Tesseract-OCR
But the code now crashes when trying to run the pytesseract piece.
I just hit the same error and decided to answer this question; it might save someone some time.
First, make sure you have installed (or copied) the Tesseract-OCR executables.
Windows can't find the tesseract executable in the directories listed in your PATH environment variable. So either make sure that the directory containing tesseract is in your PATH variable, or override the tesseract_cmd variable in your Python script as follows (use your own path instead):
import pytesseract
pytesseract.pytesseract.tesseract_cmd = 'C:/Program Files (x86)/Tesseract-OCR/tesseract'
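To see which of the two cases applies before calling pytesseract, a small check along these lines may help. This is only a sketch, assuming Python 3.3+ (for shutil.which); the helper name find_tesseract and the DEFAULT_TESSERACT path are illustrative and not part of pytesseract.

```python
import os
import shutil

# Hypothetical default install location; replace it with your own.
DEFAULT_TESSERACT = r"C:\Program Files (x86)\Tesseract-OCR\tesseract.exe"


def find_tesseract(fallback=DEFAULT_TESSERACT):
    """Return a path to the tesseract executable, or None if none is found.

    The PATH lookup is tried first; the explicit fallback path is only
    used when that lookup fails.
    """
    on_path = shutil.which("tesseract")
    if on_path:
        return on_path
    if os.path.isfile(fallback):
        return fallback
    return None


# Typical use (requires pytesseract to be installed):
#
#   import pytesseract
#   path = find_tesseract()
#   if path is None:
#       raise RuntimeError("tesseract not found; install it or fix PATH")
#   pytesseract.pytesseract.tesseract_cmd = path
```

If find_tesseract() returns None, neither PATH nor the fallback location contains the executable, which matches the "cannot find the file specified" error above.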
Besides that, make sure that the TESSDATA_PREFIX Windows environment variable is set to the directory that contains the tessdata directory. For example:
TESSDATA_PREFIX=C:\Program Files (x86)\Tesseract-OCR
if the tessdata location is: C:\Program Files (x86)\Tesseract-OCR\tessdata
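If you would rather not change the system-wide environment, the variable can also be set from inside the script before pytesseract is called, since the tesseract child process inherits os.environ. A minimal sketch, assuming the layout described above; the helper names tessdata_prefix_for and set_tessdata_prefix are made up for illustration, and ntpath is used only so the example handles Windows paths the same way on any OS (on Windows, os.path is ntpath).

```python
import ntpath
import os


def tessdata_prefix_for(tessdata_dir):
    """Return the directory that contains the given tessdata directory."""
    # normpath drops any trailing separator so dirname yields the parent.
    return ntpath.dirname(ntpath.normpath(tessdata_dir))


def set_tessdata_prefix(tessdata_dir):
    """Set TESSDATA_PREFIX for this process and any child processes."""
    prefix = tessdata_prefix_for(tessdata_dir)
    os.environ["TESSDATA_PREFIX"] = prefix
    return prefix


# Typical use, before the first call to pytesseract.image_to_string:
#
#   set_tessdata_prefix(r"C:\Program Files (x86)\Tesseract-OCR\tessdata")
```

Some Tesseract versions may expect TESSDATA_PREFIX to point at the tessdata directory itself rather than its parent, so it is worth checking the documentation for the version you have installed.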