Tesseract on iOS with languages other than English

Asked: 2011-06-01 01:20:20

Tags: ios tesseract

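I am trying out the open-source Tesseract engine to see whether I can compile it for the iPhone and recognize English characters, and that works. Now I have added "ita.traineddata" to tessdata and changed the Init call from the first form below to the second: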

tess->Init([dataPath cStringUsingEncoding:NSUTF8StringEncoding],    // Path to tessdata-no ending /.
           "eng");                                                  // ISO 639-3 string or NULL.

tess->Init([dataPath cStringUsingEncoding:NSUTF8StringEncoding],    // Path to tessdata-no ending /.
           "ita");                                                  // ISO 639-3 string or NULL.

But I get this error: Error openning data file /var/mobile/Applications/A37DB8B7-2272-4F80-9836-0034CEB56CC5/Documents/tessdata/ita.traineddata

What am I missing, and how do I fix it?

1 Answer:

Answer 0: (score: 1)

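First add tessdata to the project's <project name> folder. Then (important) go to Target > Build Phases > Copy Bundle Resources and add the tessdata folder as a folder REFERENCE, not as a group, so that the tessdata directory itself ends up inside the app bundle.

Then initialize Tesseract as follows: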

// Set up the tessdata path. This is included in the application bundle
// but is copied to the Documents directory on the first run.
NSArray *documentPaths = NSSearchPathForDirectoriesInDomains(NSDocumentDirectory, NSUserDomainMask, YES);
NSString *documentPath = ([documentPaths count] > 0) ? [documentPaths objectAtIndex:0] : nil;

NSString *dataPath = [documentPath stringByAppendingPathComponent:@"tessdata"];
NSFileManager *fileManager = [NSFileManager defaultManager];
// If the expected store doesn't exist, copy the default store.
if (![fileManager fileExistsAtPath:dataPath]) {
    // Get the path to the tessdata directory inside the app bundle.
    NSString *bundlePath = [[NSBundle mainBundle] bundlePath];
    NSString *tessdataPath = [bundlePath stringByAppendingPathComponent:@"tessdata"];
    NSError *copyError = nil;
    // Log a failed copy instead of ignoring it; a missing folder reference ends up here.
    if (![fileManager copyItemAtPath:tessdataPath toPath:dataPath error:&copyError]) {
        NSLog(@"Could not copy tessdata: %@", copyError);
    }
}
setenv("TESSDATA_PREFIX", [[documentPath stringByAppendingString:@"/"] UTF8String], 1);
// init the tesseract engine.
tesseract = new tesseract::TessBaseAPI();    
tesseract->Init([dataPath cStringUsingEncoding:NSUTF8StringEncoding], "ita");
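
Before calling Init, it can help to confirm that the language file is really where Tesseract will look for it. The following is a minimal C++ sketch, not part of the Tesseract API: the helper names TrainedDataPath and TrainedDataIsReadable are hypothetical, and the path layout (prefix, then "tessdata/", then the language code plus ".traineddata") is an assumption based on the path printed in the error message in the question.

```cpp
#include <fstream>
#include <string>

// Assumption: the data file is expected at <prefix>/tessdata/<lang>.traineddata,
// which matches the path shown in the error message from the question.
// `prefix` is the directory that CONTAINS the tessdata folder.
std::string TrainedDataPath(std::string prefix, const std::string& lang) {
    // Add the trailing slash if the caller left it out.
    if (!prefix.empty() && prefix.back() != '/') {
        prefix += '/';
    }
    return prefix + "tessdata/" + lang + ".traineddata";
}

// Hypothetical helper: true if the language file can be opened for reading.
bool TrainedDataIsReadable(const std::string& prefix, const std::string& lang) {
    std::ifstream file(TrainedDataPath(prefix, lang), std::ios::binary);
    return file.is_open();
}
```

In the Objective-C++ code above, one could call TrainedDataIsReadable([documentPath UTF8String], "ita") right after the copy step and log the result, which separates "the file was never copied" from other initialization problems.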

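Note: Tesseract initializes itself with English by default. When I deleted the entire tessdata folder, it still worked without the eng.traineddata file. That is why English works for you and the Italian trained data does not: your tessdata folder is not being set up correctly.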