How do I get text from an image?

Time: 2021-02-25 22:44:09

Tags: python-3.x opencv python-tesseract

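I am trying to read coordinates from a map. I cropped the part of the image that contains the coordinates and ran OCR on it with Pytesseract, but I can't get the coordinates out. Here is a link to the image: https://ibb.co/hVynk2b. This is the script I tried: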

import numpy as np
import cv2 as cv
%matplotlib inline
from matplotlib import pyplot as plt
img = cv.imread('a.jpg')
corped = img[460:700, 700:1000]
image=cv2.cvtColor(corped,cv2.COLOR_BGR2GRAY)
se=cv2.getStructuringElement(cv2.MORPH_RECT , (8,8))
bg=cv2.morphologyEx(image, cv2.MORPH_DILATE, se)
out_gray=cv2.divide(image, bg, scale=255)
out_binary=cv2.threshold(out_gray, 0, 255, cv2.THRESH_OTSU )[1] 
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'
from pytesseract import Output
d = pytesseract.image_to_data(out_binary, output_type=Output.DICT)
print(d['text'])

1 Answer:

Answer 0 (score: 1)

It seems to work for me. I ran the code you pasted, but cleaned it up a little:

import numpy as np
import cv2 as cv
from matplotlib import pyplot as plt
import pytesseract
from pytesseract import Output
img = cv.imread(r'a.jpg')
cropped = img[460:700, 700:1000]
image = cv.cvtColor(cropped, cv.COLOR_BGR2GRAY)
se = cv.getStructuringElement(cv.MORPH_RECT, (8, 8))
bg = cv.morphologyEx(image, cv.MORPH_DILATE, se)
out_gray = cv.divide(image, bg, scale=255)
out_binary = cv.threshold(out_gray, 0, 255, cv.THRESH_OTSU)[1]
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'
d = pytesseract.image_to_data(out_binary, output_type=Output.DICT)
print(d['text'])

It returns '35°21'24°'
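The list printed by d['text'] usually contains many empty strings alongside the recognized words. As a minimal sketch (not part of the code I ran), the helper below keeps only non-empty entries whose confidence is above a threshold. It assumes the dictionary has the 'text' and 'conf' keys that image_to_data produces with Output.DICT; the name confident_words, the default threshold of 60, and the sample dictionary are made up for illustration.

```python
def confident_words(data, min_conf=60.0):
    """Return non-empty OCR words whose confidence is at least min_conf.

    `data` is expected to look like the dict returned by
    pytesseract.image_to_data(..., output_type=Output.DICT).
    """
    words = []
    for text, conf in zip(data["text"], data["conf"]):
        # conf may be a str, int or float depending on the pytesseract
        # version; entries that are not words carry a confidence of -1.
        if text.strip() and float(conf) >= min_conf:
            words.append(text.strip())
    return words


# Hypothetical sample shaped like image_to_data output (not real OCR results).
sample = {
    "text": ["", "", "35°21'24\"", " "],
    "conf": ["-1", "-1", "91", "-1"],
}
print(confident_words(sample))
```

With the real result you would call confident_words(d) instead of printing d['text'] directly.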

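However, I did notice that pytesseract did not pick up the vertical text. You could pass a config argument to image_to_data and tune it, or simply rotate the image 90 degrees clockwise and run it again: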

import numpy as np
import cv2 as cv
from matplotlib import pyplot as plt
import pytesseract
from pytesseract import Output
img = cv.imread(r'C:\Users\guneh\Desktop\a.jpg')
cropped = img[460:700, 700:1000]  # same crop as in the first snippet
rotate = cv.rotate(cropped, cv.ROTATE_90_CLOCKWISE)
image = cv.cvtColor(rotate, cv.COLOR_BGR2GRAY)
se = cv.getStructuringElement(cv.MORPH_RECT, (8, 8))
bg = cv.morphologyEx(image, cv.MORPH_DILATE, se)
out_gray = cv.divide(image, bg, scale=255)
out_binary = cv.threshold(out_gray, 0, 255, cv.THRESH_OTSU)[1]
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'
d = pytesseract.image_to_data(out_binary, output_type=Output.DICT)
print(d['text'])

It returns '10°37'02"'
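For the other option, the config argument, here is a rough sketch of how a page segmentation mode could be passed through. I have not tested which mode, if any, picks up the vertical text in this image, so treat the mode number as something to experiment with. The helper names psm_config and ocr_with_psm are made up; --psm is Tesseract's own command-line option, and config is the image_to_data parameter that carries it.

```python
def psm_config(psm):
    """Build a Tesseract config string that selects a page segmentation mode."""
    if not 0 <= psm <= 13:
        raise ValueError("Tesseract page segmentation modes range from 0 to 13")
    return f"--psm {psm}"


def ocr_with_psm(image, psm):
    """Run image_to_data with the given mode and return the non-empty words."""
    # Imported here so psm_config can be used without pytesseract installed.
    import pytesseract
    from pytesseract import Output

    data = pytesseract.image_to_data(
        image, config=psm_config(psm), output_type=Output.DICT
    )
    return [text for text in data["text"] if text.strip()]
```

You could then compare a few modes on the binarized crop, for example ocr_with_psm(out_binary, 6) against ocr_with_psm(out_binary, 11), and keep whichever result looks right. Rotating the image as shown above is still the simpler route.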