TesseractNotFoundError: tesseract is not installed or it's not in your path

Question

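I am trying to print the text from an image using Tesseract OCR, but I get the error shown below. I installed Tesseract OCR from https://github.com/UB-Mannheim/tesseract/wiki and installed pytesseract by running pip install pytesseract in the Anaconda prompt, but it does not work. If anyone has run into a similar problem, please help.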

(base) C:\Users\500066016>pip install pytesseract
Collecting pytesseract
  Downloading https://files.pythonhosted.org/packages/13/56/befaafbabb36c03e4fdbb3fea854e0aea294039308a93daf6876bf7a8d6b/pytesseract-0.2.4.tar.gz (169kB)
    100% |████████████████████████████████| 174kB 288kB/s
Requirement already satisfied: Pillow in c:\users\500066016\appdata\local\continuum\anaconda3\lib\site-packages (from pytesseract) (5.1.0)
Building wheels for collected packages: pytesseract
  Running setup.py bdist_wheel for pytesseract ... done
  Stored in directory: C:\Users\500066016\AppData\Local\pip\Cache\wheels\a8\0c\00\32e4957a46128bea34fda60b8b01a8755986415cbab3ed8e38
Successfully built pytesseract

Here is the code:

---
title: "Minimal"
output: 
  bookdown::html_document2:
    fig_caption: yes
  bookdown::pdf_document2:
    fig_caption: yes
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

Here is a reference to the plot below \@ref(fig:minGraph)

```{r minGraph, echo=FALSE, fig.cap="test"}
plot(x=1)
```

Here is the error:

Traceback (most recent call last):

  File "", line 1, in <module>
    runfile('C:/Users/500066016/.spyder-py3/project1.py', wdir='C:/Users/500066016/.spyder-py3')

  File "C:\Users\500066016\AppData\Local\Continuum\anaconda3\lib\site-packages\spyder\utils\site\sitecustomize.py", line 705, in runfile
    execfile(filename, namespace)

  File "C:\Users\500066016\AppData\Local\Continuum\anaconda3\lib\site-packages\spyder\utils\site\sitecustomize.py", line 102, in execfile
    exec(compile(f.read(), filename, 'exec'), namespace)

  File "C:/Users/500066016/.spyder-py3/project1.py", line 23, in <module>
    print(get_string('quotes.jpg'))

  File "C:/Users/500066016/.spyder-py3/project1.py", line 20, in get_string
    res = pytesseract.image_to_string('thesh.jpg')

  File "C:\Users\500066016\AppData\Local\Continuum\anaconda3\lib\site-packages\pytesseract\pytesseract.py", line 294, in image_to_string
    return run_and_get_output(*args)

  File "C:\Users\500066016\AppData\Local\Continuum\anaconda3\lib\site-packages\pytesseract\pytesseract.py", line 202, in run_and_get_output
    run_tesseract(**kwargs)

  File "C:\Users\500066016\AppData\Local\Continuum\anaconda3\lib\site-packages\pytesseract\pytesseract.py", line 172, in run_tesseract
    raise TesseractNotFoundError()

TesseractNotFoundError: tesseract is not installed or it's not in your path

Answer 1

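Step 1: Download and install Tesseract OCR from the link.

Step 2: After installation, find the "Tesseract-OCR" folder, open it, and locate tesseract.exe.

Step 3: Once you have found tesseract.exe, copy its file location.

Step 4: Pass that location to pytesseract in your code like this: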

pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"

Note: C:\Program Files\Tesseract-OCR\tesseract.exe stands for the location you copied.
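
Steps 2 to 4 can also be done from Python. The sketch below is only an illustration: the helper names `find_tesseract` and `configure_pytesseract` are hypothetical, and the candidate folders are assumed typical install locations that may not match your machine. pytesseract is imported lazily so that the lookup itself has no third-party dependency.

```python
import os
import shutil

# Assumed typical install locations on Windows; replace with the path copied in step 3.
CANDIDATES = [
    r"C:\Program Files\Tesseract-OCR\tesseract.exe",
    r"C:\Program Files (x86)\Tesseract-OCR\tesseract.exe",
]


def find_tesseract(candidates=CANDIDATES):
    """Return the first candidate that exists, else whatever is on PATH, else None."""
    for path in candidates:
        if os.path.isfile(path):
            return path
    return shutil.which("tesseract")


def configure_pytesseract(candidates=CANDIDATES):
    """Point pytesseract at the executable, or raise if none can be found."""
    exe = find_tesseract(candidates)
    if exe is None:
        raise FileNotFoundError("tesseract executable not found; check the install folder")
    import pytesseract  # imported here so find_tesseract works without it
    pytesseract.pytesseract.tesseract_cmd = exe
    return exe
```

Calling `configure_pytesseract()` once at the top of the script, before `image_to_string`, has the same effect as the assignment in step 4.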

Answer 2

It is clear from the error that the system cannot find the tesseract package. If you are on Windows, just run the following command in the command prompt:

pip install tesseract

I hope this solves your problem :)
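
The error message says the `tesseract` executable is either not installed or not on your PATH, and pytesseract works by launching that executable. One way to see which case applies is to look the command up and ask it for its version. This is a minimal sketch; the function name `tesseract_status` is made up for illustration.

```python
import shutil
import subprocess


def tesseract_status(command="tesseract"):
    """Return the first line of `<command> --version`, or None if it is not on PATH."""
    exe = shutil.which(command)
    if exe is None:
        return None
    result = subprocess.run([exe, "--version"], capture_output=True, text=True)
    # Some builds print the version banner to stderr instead of stdout.
    output = (result.stdout or result.stderr).strip()
    return output.splitlines()[0] if output else ""
```

If `tesseract_status()` returns `None`, the executable is not reachable through PATH, which is the situation the traceback in the question describes.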

Answer 3

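You should install:

!apt install tesseract-ocr
!apt install libtesseract-dev

and

!pip install Pillow
!pip install pytesseract

import pytesseract
from PIL import ImageEnhance, ImageFilter, Image

I ran the code in Colab from Google Drive. Below is my sample code:

I used an arbitrary image of text taken from a website.

Step 1: Import some packages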

import pytesseract
import cv2
import matplotlib.pyplot as plt
from PIL import Image

Step 2: Upload the text.png file to Colab

from google.colab import files
uploaded = files.upload()

current browser session. Please rerun this cell to enable.
---------------------------------------------------------------------------
MessageError                              Traceback (most recent call last)
<ipython-input-31-21dc3c638f66> in <module>()
      1 from google.colab import files
----> 2 uploaded = files.upload()

2 frames
/usr/local/lib/python3.6/dist-packages/google/colab/_message.py in read_reply_from_input(message_id, timeout_sec)
    104         reply.get('colab_msg_id') == message_id):
    105       if 'error' in reply:
--> 106         raise MessageError(reply['error'])
    107       return reply.get('data', None)
    108 
MessageError: TypeError: Cannot read property '_uploadFiles' of undefined

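-> Don't worry. Run the cell again and it will work; you can then choose the file to upload.

Step 3:

Read the image with OpenCV

image = cv2.imread("text.png")

Or you can use Pillow

image = Image.open("text.png")

Check it. Does it display the image of the text?

image

Get the string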

string = pytesseract.image_to_string(image)

Print

print(string)

Done. I hope this helps.
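
Put together, steps 1 to 3 amount to roughly the following sketch. It assumes pytesseract and Pillow are installed and the Tesseract executable is reachable. The function name `ocr_image` is illustrative, and the up-front file check is an addition that is not part of the steps above.

```python
import os


def ocr_image(path):
    """Open an image with Pillow and return the text Tesseract recognizes in it."""
    if not os.path.isfile(path):
        # Fail early with a clear message instead of an error from deep inside a library.
        raise FileNotFoundError(f"image not found: {path}")
    # Imported lazily so the file check above runs even before the packages are installed.
    import pytesseract
    from PIL import Image

    with Image.open(path) as image:
        return pytesseract.image_to_string(image)
```

With text.png uploaded as in step 2, `print(ocr_image("text.png"))` would print the recognized text.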
