Reliably detecting the Spyder IDE

Time: 2021-07-12 22:03:16

Tags: python multiprocessing ide spyder ocrmypdf

How can I reliably detect whether a script/module is running inside the Spyder IDE?

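I'm having trouble running ocrmypdf from within the Spyder IDE. It works from cmd and from the Anaconda Prompt, but it fails when run in the Spyder IDE on Windows 7 and 10, on various machines, with various new and old Anaconda setups. (See the stub and inline comments below for details of the errors.) The ocrmypdf developer suggested that this is because multiprocessing does not work in the Spyder IDE (Python's multiprocessing doesn't work in Spyder IDE). I would like to know whether there is a reliable way for ocrmypdf, or any script/module, to detect that it is running inside the Spyder IDE.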
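To separate the multiprocessing suspicion from ocrmypdf itself, a minimal check along the following lines might help. It is only a sketch: `square` and `multiprocessing_works` are hypothetical names introduced here, and the 10-second timeout is an arbitrary assumption. It asks a one-worker pool for a single result and reports whether that result came back.

```python
import multiprocessing as mp


def square(x):
    """Trivial task for the worker process."""
    return x * x


def multiprocessing_works(timeout=10):
    """Return True if a one-worker pool returns square(3) within `timeout` seconds.

    Returns False if no result arrives in time or if any exception is raised.
    The timeout value is an arbitrary choice for this sketch.
    """
    try:
        with mp.Pool(processes=1) as pool:
            return pool.apply_async(square, (3,)).get(timeout=timeout) == 9
    except Exception:
        return False


if __name__ == "__main__":
    # The guard keeps child processes from re-running this check when the
    # main module is imported again under the "spawn" start method (Windows).
    print("multiprocessing works:", multiprocessing_works())
```

Running the same file from cmd and from the Spyder console would show whether the two environments behave differently, without involving ocrmypdf or Tesseract.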

This is essentially a duplicate of: Detect where Python code is running (e.g., in Spyder interpreter vs. IDLE vs. cmd)

I'm asking again because that question dates from 2013, and its accepted answer, which checks for environment variables that Spyder sets in os.environ, works but risks false positives.
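For reference, the environment-variable approach can be sketched roughly as below. The function name `looks_like_spyder` is hypothetical, and the name patterns it matches ("SPYDER" anywhere in the name, or a "SPY_" prefix, as in `SPYDER_ARGS` or `SPY_PYTHONPATH`) are assumptions about what Spyder exports. The actual variable names may differ between Spyder versions.

```python
import os


def looks_like_spyder(environ=None):
    """Heuristically guess whether the process was started from Spyder.

    Assumption: Spyder exports environment variables whose names contain
    "SPYDER" or start with "SPY_"; the exact names vary between versions.
    """
    if environ is None:
        environ = os.environ
    return any(
        "SPYDER" in name.upper() or name.upper().startswith("SPY_")
        for name in environ
    )


if __name__ == "__main__":
    print("Spyder-like environment:", looks_like_spyder())
```

The false-positive risk is visible in the sketch: any unrelated variable that happens to match the pattern, or any process that merely inherited its environment from a Spyder session, is reported as Spyder.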

If there is a smarter way to do this, please let me know!


import os, io
import ocrmypdf
from wand.image import Image as Img

try:
    from PIL import Image
except ImportError:
    import Image
    
import pytesseract
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"


ocrmypdf_exitcodes = {0:'ok', 1:'bad_args', 2:'input file', 3:'missing_dependency', 
                      4:'invalid_output_pdf', 5:'file_access_error', 6:'already_done_ocr', 
                      7:'child_process_error', 8:'encrypted_pdf', 9:'invalid_config', 
                      10:'pdfa_conversion_failed', 15:'other_error', 130:'ctrl_c'}

path = r"C:\Users\public\Documents"  # raw string: "\U" would otherwise start a Unicode escape
tess_lang = "eng"

#Test files from https://github.com/jbarlow83/OCRmyPDF/tree/master/tests/resources

file = "skew.pdf" #works
file = "cardinal.pdf" #breaks at scanning contents section/it's been 20 minutes with no progress past first page
file = "c02-22.pdf" #Breaks at OCR section on first page - logs say 0.5 and then it stalls for 10+ minutes. Sometimes breaks by saying [Errno9] Bad File Descriptor instead.


pdf = os.path.join(path, file)
try:
    filename = pdf.rsplit('.', 1)[0]+'_new.pdf'
    ocrmypdf.ocr(input_file = pdf, output_file = filename, language = '+'.join(list(set([tess_lang, 'eng']))), rotate_pages=True, deskew=True, force_ocr = False)
except Exception as e:
    filename = pdf
    print('Error occurred when trying to process file {} error message is: {}'.format(pdf, repr(e) + " " + str(e)))
    print(repr(e))
    try:
        print(ocrmypdf_exitcodes[e.returncode])
    except (AttributeError, KeyError):
        # The exception has no returncode, or the code is not in the table
        pass

0 answers:

No answers