pytesseract OCR fails with error 3221225477 for languages other than English

Asked: 2019-04-27 05:03:44

Tags: python python-3.x opencv tesseract python-tesseract

I am trying to extract words from an image using the pytesseract library. I have installed Google's Tesseract OCR, pytesseract, PIL, OpenCV and Pillow.

Then I downloaded tessdata and langdata from GitHub and put them in place.
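
One thing worth checking at this step is whether the language file actually ended up where Tesseract looks for it. Below is a minimal sketch, assuming the data directory is a `tessdata` folder next to `tesseract.exe`. That location and the helper name `find_traineddata` are assumptions for illustration, not something confirmed by my setup.

```python
from pathlib import Path
from typing import Optional


def find_traineddata(tessdata_dir: str, lang: str) -> Optional[Path]:
    """Return the path to <lang>.traineddata if it exists and is non-empty, else None."""
    candidate = Path(tessdata_dir) / f"{lang}.traineddata"
    # A zero-byte file usually means an interrupted or failed download.
    if candidate.is_file() and candidate.stat().st_size > 0:
        return candidate
    return None


if __name__ == "__main__":
    # Assumed install location; adjust to wherever tessdata really lives.
    tessdata = r"C:\Program Files (x86)\Tesseract-OCR\tessdata"
    for lang in ("eng", "sin"):
        print(lang, "->", find_traineddata(tessdata, lang))
```

If `sin` comes back as `None` while `eng` is found, the Sinhala data file is missing or empty in that directory.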

I am using Tesseract 4.0.0 and pytesseract 0.2.6.

With lang='eng' I get perfect results, but with lang='sin' I get the following error message.

---------------------------------------------------------------------------
TesseractError                            Traceback (most recent call last)
<ipython-input-1-a50dd4690117> in <module>
     10 cv2.destroyAllWindows()
     11 test_image = Image.fromarray(img)
---> 12 text = tess.image_to_string(test_image, lang='sin')
     13 print("PyTesseract Detected the following text: ", text)

~\Anaconda3\envs\mainenv\lib\site-packages\pytesseract\pytesseract.py in image_to_string(image, lang, config, nice, output_type)
    307         Output.DICT: lambda: {'text': run_and_get_output(*args)},
    308         Output.STRING: lambda: run_and_get_output(*args),
--> 309     }[output_type]()
    310 
    311 

~\Anaconda3\envs\mainenv\lib\site-packages\pytesseract\pytesseract.py in <lambda>()
    306         Output.BYTES: lambda: run_and_get_output(*(args + [True])),
    307         Output.DICT: lambda: {'text': run_and_get_output(*args)},
--> 308         Output.STRING: lambda: run_and_get_output(*args),
    309     }[output_type]()
    310 

~\Anaconda3\envs\mainenv\lib\site-packages\pytesseract\pytesseract.py in run_and_get_output(image, extension, lang, config, nice, return_bytes)
    216         }
    217 
--> 218         run_tesseract(**kwargs)
    219         filename = kwargs['output_filename_base'] + os.extsep + extension
    220         with open(filename, 'rb') as output_file:

~\Anaconda3\envs\mainenv\lib\site-packages\pytesseract\pytesseract.py in run_tesseract(input_filename, output_filename_base, extension, lang, config, nice)
    192 
    193     if status_code:
--> 194         raise TesseractError(status_code, get_errors(error_string))
    195 
    196     return True

TesseractError: (3221225477, '')
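
The status code is easier to read in hexadecimal. 3221225477 is 0xC0000005, the value Windows uses for an access violation (STATUS_ACCESS_VIOLATION). That suggests tesseract.exe itself crashed rather than reported an OCR error, which would also fit the empty error string. A small sketch of the conversion follows; the helper name and the lookup table are illustrative and cover only this one code.

```python
# Illustrative lookup table; only the code seen in the traceback is listed.
KNOWN_NTSTATUS = {0xC0000005: "STATUS_ACCESS_VIOLATION"}


def describe_exit_status(code: int) -> str:
    """Format a process exit status as 32-bit hex plus a name, if one is known."""
    unsigned = code & 0xFFFFFFFF  # normalise in case the code arrives as a negative int
    name = KNOWN_NTSTATUS.get(unsigned, "unknown")
    return f"0x{unsigned:08X} ({name})"


print(describe_exit_status(3221225477))  # 0xC0000005 (STATUS_ACCESS_VIOLATION)
```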

Python 3.6 code:

from PIL import Image
import pytesseract as tess
import cv2

tess.pytesseract.tesseract_cmd = r"C:\Program Files (x86)\Tesseract-OCR\tesseract.exe"

img = cv2.imread("./images/scr.png")
cv2.imshow("Test Image", img)
cv2.waitKey(0)
cv2.destroyAllWindows()
test_image = Image.fromarray(img)
text = tess.image_to_string(test_image, lang='sin')
print("PyTesseract Detected the following text: ", text)
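
To separate pytesseract from Tesseract itself, it may help to ask the executable directly which languages it can see. The sketch below calls `tesseract --list-langs` through `subprocess`. The function names are hypothetical, and the parser assumes the output is a "List of available languages" header followed by one language code per line.

```python
import subprocess
from typing import List

# Same path as in the code above; adjust if Tesseract is installed elsewhere.
TESSERACT_CMD = r"C:\Program Files (x86)\Tesseract-OCR\tesseract.exe"


def parse_list_langs(output: str) -> List[str]:
    """Extract language codes from `tesseract --list-langs` output (assumed format)."""
    lines = [line.strip() for line in output.splitlines()]
    return [
        line for line in lines
        if line and not line.lower().startswith("list of available languages")
    ]


def installed_languages(tesseract_cmd: str = TESSERACT_CMD) -> List[str]:
    """Run the Tesseract executable and return the languages it reports."""
    result = subprocess.run(
        [tesseract_cmd, "--list-langs"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # some versions print the list to stderr
        universal_newlines=True,
    )
    return parse_list_langs(result.stdout)
```

Calling `installed_languages()` and checking whether `'sin'` is in the result shows whether Tesseract finds the Sinhala data at all. If it is listed, running `tesseract.exe` on the image from a command prompt with `-l sin` would show whether the crash happens without pytesseract involved.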

How can I resolve this error?

0 answers:

No answers yet.