Pyparsing finds more matches than expected

Time: 2017-11-16 16:40:10

Tags: python-3.x pyparsing

I am writing code to parse lines of basic computer instructions. My input string looks like this:

ADD(input1,input2) DEL(input3), SUB(input1,input2) INS(input3)

The result I expect is as follows:

<line>
  <instruction>
    <type>ADD</type>
    <args>
      <ITEM>input1</ITEM>
      <ITEM>input2</ITEM>
    </args>
  </instruction>
  <instruction>
    <type>DEL</type>
    <args>
      <ITEM>input3</ITEM>
    </args>
  </instruction>
</line>
<line>
  <instruction>
    <type>SUB</type>
    <args>
      <ITEM>input1</ITEM>
      <ITEM>input2</ITEM>
    </args>
  </instruction>
  <instruction>
    <type>INS</type>
    <args>
      <ITEM>input3</ITEM>
    </args>
  </instruction>
</line>

My actual result has the general structure I am looking for, but the line and instruction parsers seem to match in the wrong places, or the labels end up in the wrong places.

Actual result:

<line>
  <line>
    <instruction>
      <type>ADD</type>
      <args>
        <ITEM>input1</ITEM>
        <ITEM>input2</ITEM>
      </args>
    </instruction>
    <instruction>
      <type>DEL</type>
      <args>
        <ITEM>input3</ITEM>
      </args>
    </instruction>
  </line>
  <instruction>
    <instruction>
      <type>SUB</type>
      <args>
        <ITEM>input1</ITEM>
        <ITEM>input2</ITEM>
      </args>
    </instruction>
    <instruction>
      <type>INS</type>
      <args>
        <ITEM>input3</ITEM>
      </args>
    </instruction>
  </instruction>
</line>

Result dump:

[[['OTE', ['output1']]], [['XIO', ['input2']], ['OTE', ['output2']]]]
- branch: [[['OTE', ['output1']]], [['XIO', ['input2']], ['OTE', ['output2']]]]
  [0]:
    [['OTE', ['output1']]]
    - instruction: ['OTE', ['output1']]
      - args: ['output1']
      - type: 'OTE'
  [1]:
    [['XIO', ['input2']], ['OTE', ['output2']]]
    - instruction: ['OTE', ['output2']]
      - args: ['output2']
      - type: 'OTE'

For some reason, line matches the entire structure, and the instructions on the second line are matched as a single instruction group. I tried calling .setDebug() on the line parser, but I am not sure how to interpret the output. I don't understand why the last one would be matched as an instruction, since it does not follow the Word(Word) pattern.

My code:

[code listing missing from this copy]

Debug output:

[debug output missing from this copy]

What am I doing wrong?

1 Answer:

Answer 0: (score: 0)

The dump output from the code you posted looks like this:

ADD(input1,input2) DEL(input3), SUB(input1,input2) INS(input3)

[[['ADD', ['input1', 'input2']], ['DEL', ['input3']]], [['SUB', ['input1', 'input2']], ['INS', ['input3']]]]
- line: [[['ADD', ['input1', 'input2']], ['DEL', ['input3']]], [['SUB', ['input1', 'input2']], ['INS', ['input3']]]]
  [0]:
    [['ADD', ['input1', 'input2']], ['DEL', ['input3']]]
    - instruction: ['DEL', ['input3']]
      - args: ['input3']
      - type: 'DEL'
  [1]:
    [['SUB', ['input1', 'input2']], ['INS', ['input3']]]
    - instruction: ['INS', ['input3']]
      - args: ['input3']
      - type: 'INS'

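We can see in the dump() output that all of the instructions are parsed, but only the last instruction in each group appears under the "instruction" name. This happens because, just as with a Python dict, when multiple values (such as you might get inside a ZeroOrMore or OneOrMore) are assigned to the same key, only the last value is kept.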
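The overwriting can be reproduced with a small grammar. This is only a sketch: the names instructionType, arguments, line, and parser are assumptions modeled on the fragment quoted in this answer, not the asker's actual listing.

```python
from pyparsing import Group, OneOrMore, Suppress, Word, alphanums, alphas, delimitedList

# Assumed grammar: TYPE(arg, arg, ...) repeated on a line; lines separated by commas.
instructionType = Word(alphas)("type")
arguments = Group(delimitedList(Word(alphanums)))("args")
instruction = Group(
    instructionType + Suppress("(") + arguments + Suppress(")")
)("instruction")
line = Group(OneOrMore(instruction))
parser = delimitedList(line)

text = "ADD(input1,input2) DEL(input3), SUB(input1,input2) INS(input3)"
first_line = parser.parseString(text)[0]

# Positional access still sees both instructions on the first line.
all_instructions = first_line.asList()
# Named access keeps only the last value stored under "instruction".
named_instruction = first_line["instruction"].asList()

print(all_instructions)   # [['ADD', ['input1', 'input2']], ['DEL', ['input3']]]
print(named_instruction)  # ['DEL', ['input3']]
```

Nothing is lost from the parse itself; only the by-name lookup is limited to the most recent match.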

There are two solutions. One is to remove the ("instruction") results name, so that each sublist simply contains the parsed instructions:

[[['ADD', ['input1', 'input2']], ['DEL', ['input3']]], [['SUB', ['input1', 'input2']], ['INS', ['input3']]]]
- line: [[['ADD', ['input1', 'input2']], ['DEL', ['input3']]], [['SUB', ['input1', 'input2']], ['INS', ['input3']]]]
  [0]:
    [['ADD', ['input1', 'input2']], ['DEL', ['input3']]]
    [0]:
      ['ADD', ['input1', 'input2']]
      - args: ['input1', 'input2']
      - type: 'ADD'
    [1]:
      ['DEL', ['input3']]
      - args: ['input3']
      - type: 'DEL'
  [1]:
    [['SUB', ['input1', 'input2']], ['INS', ['input3']]]
    [0]:
      ['SUB', ['input1', 'input2']]
      - args: ['input1', 'input2']
      - type: 'SUB'
    [1]:
      ['INS', ['input3']]
      - args: ['input3']
      - type: 'INS'
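With the name removed, the instructions are reached by position instead. A minimal sketch, again assuming a grammar shaped like the fragment in this answer (instructionType, arguments, line, and parser are illustrative names):

```python
from pyparsing import Group, OneOrMore, Suppress, Word, alphanums, alphas, delimitedList

instructionType = Word(alphas)("type")
arguments = Group(delimitedList(Word(alphanums)))("args")
# No results name on the instruction group: each one is just a list item.
instruction = Group(instructionType + Suppress("(") + arguments + Suppress(")"))
line = Group(OneOrMore(instruction))
parser = delimitedList(line)

text = "ADD(input1,input2) DEL(input3), SUB(input1,input2) INS(input3)"
result = parser.parseString(text)

# Walk lines, then instructions; "type" and "args" are still available by name
# because each instruction group holds exactly one of each.
summary = [
    [(instr["type"], instr["args"].asList()) for instr in line_tokens]
    for line_tokens in result
]
for line_summary in summary:
    print(line_summary)
```

Iterating like this keeps the order of instructions within each line, which is usually what matters for instruction sequences.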

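It is also possible in pyparsing to keep multiple values for a given name. The setResultsName() method has an optional argument, listAllMatches, that enables this behavior. When you use the callable shortcut for setResultsName, you cannot pass listAllMatches=True; instead, end the results name with a '*':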

instruction = Group(instructionType
                    + Literal("(").suppress()
                    + arguments
                    + Literal(")").suppress())("instruction*")

This gives the following output:

[[['ADD', ['input1', 'input2']], ['DEL', ['input3']]], [['SUB', ['input1', 'input2']], ['INS', ['input3']]]]
- line: [[['ADD', ['input1', 'input2']], ['DEL', ['input3']]], [['SUB', ['input1', 'input2']], ['INS', ['input3']]]]
  [0]:
    [['ADD', ['input1', 'input2']], ['DEL', ['input3']]]
    - instruction: [['ADD', ['input1', 'input2']], ['DEL', ['input3']]]
      [0]:
        ['ADD', ['input1', 'input2']]
        - args: ['input1', 'input2']
        - type: 'ADD'
      [1]:
        ['DEL', ['input3']]
        - args: ['input3']
        - type: 'DEL'
  [1]:
    [['SUB', ['input1', 'input2']], ['INS', ['input3']]]
    - instruction: [['SUB', ['input1', 'input2']], ['INS', ['input3']]]
      [0]:
        ['SUB', ['input1', 'input2']]
        - args: ['input1', 'input2']
        - type: 'SUB'
      [1]:
        ['INS', ['input3']]
        - args: ['input3']
        - type: 'INS'
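The '*' suffix and an explicit listAllMatches=True should behave the same way. The sketch below compares the two spellings; build_parser is a hypothetical helper, and the grammar is an assumed reconstruction based on the fragment above.

```python
from pyparsing import Group, OneOrMore, Suppress, Word, alphanums, alphas, delimitedList

def build_parser(name_instruction):
    """Build the assumed grammar; name_instruction attaches the results name."""
    instructionType = Word(alphas)("type")
    arguments = Group(delimitedList(Word(alphanums)))("args")
    body = Group(instructionType + Suppress("(") + arguments + Suppress(")"))
    line = Group(OneOrMore(name_instruction(body)))
    return delimitedList(line)

text = "ADD(input1,input2) DEL(input3), SUB(input1,input2) INS(input3)"

# Shortcut form: a trailing '*' on the name keeps every match.
starred = build_parser(lambda expr: expr("instruction*"))
# Explicit form: the same request spelled out with listAllMatches=True.
explicit = build_parser(
    lambda expr: expr.setResultsName("instruction", listAllMatches=True)
)

starred_first = starred.parseString(text)[0]["instruction"].asList()
explicit_first = explicit.parseString(text)[0]["instruction"].asList()

print(starred_first)  # [['ADD', ['input1', 'input2']], ['DEL', ['input3']]]
print(starred_first == explicit_first)
```

Both forms return every instruction on the line under the "instruction" name, so the choice between them is a matter of style.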

You can choose whichever approach suits you better.