Question

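I have been using Tesseract OCR to recognize text in images. It works well for images containing plain text such as "HI, this is Ramakrishna". However, it fails to recognize the text correctly when the image contains special characters (including spaces), as in the image below.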

Here is my code:

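// Create an English-language engine with automatic page segmentation.
let tesseract = G8Tesseract(language: "eng")
tesseract.pageSegmentationMode = .Auto
tesseract.engineMode = .TesseractOnly
// Characters the engine is allowed to return: punctuation, a space, letters, and digits.
let value = "~`!@#$%^&*()-=_+[] {};:<>,'.?/\\\"abcdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"
// The whitelist is set twice here: once through the raw variable and once through the property.
tesseract.setVariableValue(value, forKey: "tessedit_char_whitelist")
tesseract.charWhitelist = value
// Run recognition on the bundled sample image and print the result.
tesseract.image = UIImage(named: "Sample3")!
tesseract.recognize()
print(tesseract.recognizedText)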

The result I get looks like this:

 l_ _I
 IHHIHIHIHIH
 / \vvww \

How can I recognize special characters in an image?

Tesseract cannot recognize special characters in images

0 answers: