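My if statement raises a syntax error and I'm not sure why

Time: 2018-12-28 14:31:46

Tags: python syntax syntax-error pdfminer

I am trying to run the following code in Python 3.7. I keep getting an invalid syntax error and I'm not sure why. Can anyone spot what I'm doing wrong? The indentation looks fine, and I believe my print call sits inside the correct parentheses, but I am completely unfamiliar with "if" and "else" statements.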

class pdfPositionHandling:

    def parse_obj(self, lt_objs):

        # loop over the object list
        for obj in lt_objs:

            if isinstance(obj, pdfminer.layout.LTTextLine):
                print ("%6d, %6d, %s" % (obj.bbox[0], obj.bbox[1], obj.get_text().replace('\n', '_'))

            # if it's a textbox, also recurse
            if isinstance(obj, pdfminer.layout.LTTextBoxHorizontal):
                self.parse_obj(obj._objs)

            # if it's a container, recurse
            elif isinstance(obj, pdfminer.layout.LTFigure):
                self.parse_obj(obj._objs)

    def parsepdf(self, filename, startpage, endpage):

        # Open a PDF file.
        fp = open(filename, 'rb')

        # Create a PDF parser object associated with the file object.
        parser = PDFParser(fp)

        # Create a PDF document object that stores the document structure.
        # Password for initialization as 2nd parameter
        document = PDFDocument(parser)

        # Check if the document allows text extraction. If not, abort.
        if not document.is_extractable:
            raise PDFTextExtractionNotAllowed

        # Create a PDF resource manager object that stores shared resources.
        rsrcmgr = PDFResourceManager()

        # Create a PDF device object.
        device = PDFDevice(rsrcmgr)

        # BEGIN LAYOUT ANALYSIS
        # Set parameters for analysis.
        laparams = LAParams()

        # Create a PDF page aggregator object.
        device = PDFPageAggregator(rsrcmgr, laparams=laparams)

            # Create a PDF interpreter object.
        interpreter = PDFPageInterpreter(rsrcmgr, device)


        i = 0
        # loop over all pages in the document
        for page in PDFPage.create_pages(document):
            if i >= startpage and i <= endpage:
                # read the page into a layout object
                interpreter.process_page(page)
                layout = device.get_result()

                # extract text from this object
                self.parse_obj(layout._objs)
            i += 1

I get the following error:

File "C:/Users/951298/Documents/Python Scripts/PDF Scraping/untitled1.py", line 12
    if isinstance(obj, pdfminer.layout.LTTextBoxHorizontal):
                                                           ^
SyntaxError: invalid syntax

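I'm not sure why it points at the colon at the end of the line.

3 answers:

Answer 0: (score: 1)

On line 9 (the print call) you need three closing parentheses at the end, but you only typed two. Add one more closing parenthesis and the error goes away.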
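A minimal sketch of the corrected call, assuming the same format string as in the question. The helper describe and the sample bbox and text values are hypothetical stand-ins for obj.bbox and obj.get_text(), used only so the snippet runs without pdfminer. The three closing parentheses end replace(...), the tuple, and print(...), in that order.

```python
def describe(bbox, text):
    """Hypothetical helper: format one text line the way the question's print call does."""
    return "%6d, %6d, %s" % (bbox[0], bbox[1], text.replace('\n', '_'))

# Sample values standing in for obj.bbox and obj.get_text() (assumed, not from a real PDF).
bbox = (72.0, 700.5, 300.0, 712.0)
text = "Hello world\n"

# The corrected call: replace(...) closes first, then the tuple, then print(...).
print("%6d, %6d, %s" % (bbox[0], bbox[1], text.replace('\n', '_')))
```

Counting the brackets from the innermost call outward is usually the quickest way to check a line like this by hand.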

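Answer 1: (score: 0)

You forgot the closing parenthesis on the print call. The error surfaces on a later line because the interpreter ignores newlines while it is reading code inside brackets. In fact, the only reason it raises the error on line 12 is that if isinstance(obj, pdfminer.layout.LTTextBoxHorizontal): is not a valid argument to pass to print.

So the following code raises an error on line 11.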

bar = "a"
baz = "a"

def foo(msg, bar="\n"):
    print(msg, end=bar)

if bar == baz:
    foo("bar is equal to baz",
    bar = baz

else: #Throws error here
    foo("bar is not equal to baz")

#Not the best example, I know, sorry.

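Strange, isn't it? Always check the lines above the one where the error is reported. They give you context and may contain the actual mistake. Watch for this kind of error especially in languages where a newline ends a statement.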
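One way to experiment with this is to compile a snippet from a string and inspect the SyntaxError that comes back. This is only a sketch: locate_syntax_error and the two sample snippets are made up for illustration. Python 3.7 reports the line after the unclosed parenthesis, as in the question; newer interpreters may report a different line or a more specific message, so the reported line number is printed rather than assumed.

```python
def locate_syntax_error(source):
    """Return (lineno, message) for the first SyntaxError in source, or None if it compiles."""
    try:
        # compile() only parses the code; nothing in the snippet is executed.
        compile(source, "<snippet>", "exec")
    except SyntaxError as err:
        return err.lineno, err.msg
    return None

# Hypothetical snippet with the same mistake as the question: line 2 lacks its final ")".
broken = (
    'values = (1, 2)\n'
    'print("%d, %d" % (values[0], values[1])\n'
    'if values:\n'
    '    print("done")\n'
)

# The same snippet with the missing parenthesis added.
fixed = broken.replace('values[1])\n', 'values[1]))\n')

print("broken:", locate_syntax_error(broken))
print("fixed: ", locate_syntax_error(fixed))
```

If the reported line looks correct, read upward from it until the brackets balance.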

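Answer 2: (score: 0)

On line 9 you should have three closing parentheses. I also happened to notice that you have two if statements and an elif statement but no else; they should all be if statements. Hope this helps!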
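For comparison, here is a small sketch of the two forms, using hypothetical stand-in classes TextBox and Figure rather than the real pdfminer layout types. Python accepts an if/elif chain with no final else, so either form compiles; the two only behave differently when one object could pass more than one of the tests.

```python
class TextBox:
    """Hypothetical stand-in for pdfminer.layout.LTTextBoxHorizontal."""

class Figure:
    """Hypothetical stand-in for pdfminer.layout.LTFigure."""

def classify_with_elif(obj):
    found = []
    if isinstance(obj, TextBox):
        found.append("textbox")
    elif isinstance(obj, Figure):  # valid without a trailing else
        found.append("figure")
    return found

def classify_with_ifs(obj):
    found = []
    if isinstance(obj, TextBox):
        found.append("textbox")
    if isinstance(obj, Figure):  # always checked, even if the first test matched
        found.append("figure")
    return found

for sample in (TextBox(), Figure(), object()):
    print(type(sample).__name__, classify_with_elif(sample), classify_with_ifs(sample))
```

Because the stand-in classes are unrelated, both functions return the same result for every sample here.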