Question

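How do I parse nodes and node relations in pyparsing?

I have already built a crude parser, but I would really like to get this working in pyparsing.

There are two kinds of strings I want to parse: one that lists only nodes, and a second that describes node relationships.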

verb node1, node2, ...

and

verb node1->node2->node3

You can specify one or more nodes to reference. In addition, you can add ^ to indicate that a node is inside another node:

verb node1, node2 ^ node3, node4

You may also want to use the ->, <- or <-> indicators to specify node relationships.

verb node1->node2<->node3

Again, you can use ^ to indicate that a node is inside another node:

verb node1->node2^node4<->node3

Answer 1

A conceptual BNF for this format looks like this:

node :: word composed of alphas, digits, '_'
verb :: one of several defined keywords
binop :: '->' | '<-' | '<->'
nodeFactor :: node '^' node | node
nodeExpr :: nodeFactor [binop nodeFactor]...
nodeCommand :: verb nodeExpr [',' nodeExpr]...

This maps to pyparsing almost step for step:

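# ParseException and ParseResults are used by the test loop and the
# chained-expression parse action later in this answer
from pyparsing import (Word,alphas,alphanums,Keyword,
    infixNotation,opAssoc,oneOf,delimitedList,
    ParseException,ParseResults)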

nodeRef = Word(alphas,alphanums+'_')
GO, TURN, FOLLOW = map(Keyword, "GO TURN FOLLOW".split())
verb = GO | TURN | FOLLOW
binop = oneOf('-> <- <->')
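
Before combining these pieces, it can help to try each one on its own. The following is a minimal sketch, assuming pyparsing is installed and that the legacy names used in this answer (oneOf, parseString) are available in your version; the sample strings such as "node_1" and "GOTO node1" are made up for illustration.

```python
from pyparsing import Word, alphas, alphanums, Keyword, oneOf, ParseException

nodeRef = Word(alphas, alphanums + '_')
GO, TURN, FOLLOW = map(Keyword, "GO TURN FOLLOW".split())
verb = GO | TURN | FOLLOW
binop = oneOf('-> <- <->')

# Each expression can be exercised separately with parseString.
node_tokens = nodeRef.parseString("node_1").asList()
verb_tokens = verb.parseString("TURN node1").asList()

# oneOf tests longer alternatives first, so '<->' is not cut short at '<-'.
op_tokens = binop.parseString("<->").asList()

# Keyword only matches whole words, so "GOTO" is not accepted as "GO".
try:
    verb.parseString("GOTO node1")
    goto_matched = True
except ParseException:
    goto_matched = False

print(node_tokens, verb_tokens, op_tokens, goto_matched)
```

Using Keyword rather than a plain literal for the verbs is what keeps a node or word that merely starts with a verb from being misread as that verb.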

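The next part is most easily implemented using pyparsing's infixNotation method (formerly known as operatorPrecedence). infixNotation lets us define a hierarchy of operations, and it groups the parsed output according to the precedence that the hierarchy defines. I am assuming that your '^' "is inside" operator should be evaluated before the binary '->', etc. operators. infixNotation also allows nesting within parentheses, but none of your examples show this as absolutely needed. You define an infixNotation by specifying the base operand type, followed by a list of 3-tuples, each giving the operator, a value of 1, 2, or 3 for unary, binary, or ternary operators, and the constant opAssoc.LEFT or opAssoc.RIGHT for the left or right associativity of the operator:

nodeExpr = infixNotation(nodeRef,
    [
    ('^', 2, opAssoc.LEFT),
    (binop, 2, opAssoc.LEFT),
    ])

Finally, we define the overall expression, which I interpreted as some kind of command. The comma-separated list of node expressions could be implemented directly as nodeExpr + ZeroOrMore(Suppress(',') + nodeExpr) (we suppress the commas from the parsed output; they are useful at parse time, but afterward we would just have to skip over them). But this comes up so often that pyparsing offers the method delimitedList:

nodeCommand = verb('verb') + delimitedList(nodeExpr)('nodes')

The names 'verb' and 'nodes' cause the results parsed by the respective expressions to be associated with those names, which makes it easier to work with the parsed data once parsing is done.

Now to test the parser:

tests = """\
    GO node1,node2
    TURN node1->node2->node3
    GO node1,node2^node3,node4
    FOLLOW node1->node2<->node3
    GO node5,node1->node2^node4<->node3,node6
    """.splitlines()
for test in tests:
    test = test.strip()
    if not test:
        continue
    print(test)
    try:
        result = nodeCommand.parseString(test, parseAll=True)
        print(result.dump())
    except ParseException as pe:
        print("Failed:", test)
        print(pe)

The dump() method prints the parsed tokens as a nested list, then lists each results name and its attached value:

GO node1,node2
['GO', 'node1', 'node2']
- nodes: ['node1', 'node2']
- verb: GO
TURN node1->node2->node3
['TURN', ['node1', '->', 'node2', '->', 'node3']]
- nodes: [['node1', '->', 'node2', '->', 'node3']]
- verb: TURN
GO node1,node2^node3,node4
['GO', 'node1', ['node2', '^', 'node3'], 'node4']
- nodes: ['node1', ['node2', '^', 'node3'], 'node4']
- verb: GO
FOLLOW node1->node2<->node3
['FOLLOW', ['node1', '->', 'node2', '<->', 'node3']]
- nodes: [['node1', '->', 'node2', '<->', 'node3']]
- verb: FOLLOW
GO node5,node1->node2^node4<->node3,node6
['GO', 'node5', ['node1', '->', ['node2', '^', 'node4'], '<->', 'node3'], 'node6']
- nodes: ['node5', ['node1', '->', ['node2', '^', 'node4'], '<->', 'node3'], 'node6']
- verb: GO

At this point, you could just parse your commands and then, based on the verb, dispatch to whatever method performs that verb.

But let me suggest a structure that I have found helpful for capturing this logic using Python objects. Define a simple hierarchy of command classes that implement the various verb functions in an abstract method doCommand:

# base class
class Command(object):
    def __init__(self, tokens):
        self.cmd = tokens.verb
        self.nodeExprs = tokens.nodes

    def doCommand(self):
        """
        Execute command logic, using self.cmd and self.nodeExprs.
        To be overridden in sub classes.
        """
        print(self.cmd, '::', self.nodeExprs.asList())

# these should implement doCommand, but not needed for this example
class GoCommand(Command): pass
class TurnCommand(Command): pass
class FollowCommand(Command): pass

This method will convert your parsed results to an instance of the appropriate command class:

verbClassMap = {
    'GO' : GoCommand,
    'TURN' : TurnCommand,
    'FOLLOW' : FollowCommand,
    }
def tokensToCommand(tokens):
    cls = verbClassMap[tokens.verb]
    return cls(tokens)

But you can also build this into your parser as a parse-time callback, so that once parsing is done you get not just a list of strings and sublists, but an object that is ready to be "executed" by calling its doCommand method. To do this, just attach tokensToCommand as a parse action on the overall nodeCommand expression:

nodeCommand.setParseAction(tokensToCommand)

Now we slightly modify our test code:

for test in tests:
    test = test.strip()
    if not test:
        continue
    try:
        result = nodeCommand.parseString(test, parseAll=True)
        result[0].doCommand()
    except ParseException as pe:
        print("Failed:", test)
        print(pe)

Since we have not actually implemented doCommand on the subclasses, all we get is the default base class behavior, which just echoes back the parsed verb and node list:

GO :: ['node1', 'node2']
TURN :: [['node1', '->', 'node2', '->', 'node3']]
GO :: ['node1', ['node2', '^', 'node3'], 'node4']
FOLLOW :: [['node1', '->', 'node2', '<->', 'node3']]
GO :: ['node5', ['node1', '->', ['node2', '^', 'node4'], '<->', 'node3'], 'node6']

(This code was run with Python 3, pyparsing 2.0.0. It will also run with Python 2, pyparsing 1.5.7.)

EDIT

To get your chained expressions a op b op c into [a,op,b], [b, op, c], use a parse action to restructure the [a,op,b,op,c] results that get parsed into pairwise expressions. The infixNotation method allows you to define a parse action to attach to a level in the operator hierarchy.

A method to restructure the chained expression results looks like this:

def expandChainedExpr(tokens):
    ret = ParseResults([])
    tokeniter = iter(tokens[0])
    lastexpr = next(tokeniter)
    for op,nextexpr in zip(tokeniter,tokeniter):
        ret += ParseResults([[lastexpr, op, nextexpr]])
        lastexpr = nextexpr
    return ret

This builds an entirely new ParseResults to replace the original chained results. Notice how each lastexpr op nextexpr gets saved as its own subgroup, then nextexpr gets copied to lastexpr, and then the loop comes around to get the next op-nextexpr pair.

To attach this reformatter to the parser, add it as a fourth element of that hierarchy level in infixNotation:

nodeExpr = infixNotation(nodeRef,
    [
    ('^', 2, opAssoc.LEFT),
    (binop, 2, opAssoc.LEFT, expandChainedExpr),
    ])

Because nodeCommand was built from the earlier nodeExpr, redefine nodeCommand (and reattach its parse action) after this change so that it picks up the new definition.

Now the chained expression in:

FOLLOW node1->node2<->node3

is expanded into the pairwise groups ['node1', '->', 'node2'] and ['node2', '<->', 'node3'].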
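
The pairing idiom inside expandChainedExpr can be tried without pyparsing. The sketch below is only an illustration on plain Python lists: expand_chain is a hypothetical helper name, not part of pyparsing, and it assumes the input strictly alternates operand, operator, operand, and so on.

```python
def expand_chain(chain):
    """Regroup [a, op, b, op, c] as [[a, op, b], [b, op, c]]."""
    it = iter(chain)
    last = next(it)  # the first operand
    pairs = []
    # zip(it, it) draws two items per step from the same iterator,
    # yielding successive (operator, operand) pairs
    for op, nxt in zip(it, it):
        pairs.append([last, op, nxt])
        last = nxt  # the right operand becomes the next left operand
    return pairs

# Operands may themselves be nested groups, such as a '^' sub-expression.
chain = ['node1', '->', ['node2', '^', 'node4'], '<->', 'node3']
for pair in expand_chain(chain):
    print(pair)
```

Each operand in the middle of the chain appears twice in the result, once as the right side of one pair and once as the left side of the next, which is what lets every relationship be handled independently.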
