Python PLY issue with if-else and while statements

Asked: 2017-12-11 04:48:56

Tags: python python-3.x ply

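My if and while statements keep raising syntax errors from p_error(p), and PLY tells me at runtime that there is a conflict. The problem comes from the if-else and while statements, since everything worked before I added them. Any help would be greatly appreciated.

If possible, please don't change the implementation, even if it is bad practice. I just want help understanding it; I don't want a complete overhaul (that would be plagiarism).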

import ply.lex as lex
import ply.yacc as yacc

# === Lexical tokens component ===

# List of possible token names that can be produced by the lexer
# NAME: variable name, L/RPAREN: Left/Right Parenthesis
tokens = (
    'NAME', 'NUMBER',
    'PLUS', 'MINUS', 'TIMES', 'DIVIDE', 'MODULO', 'EQUALS',
    'LPAREN', 'RPAREN',
    'IF', 'ELSE', 'WHILE',
    'EQUAL', 'NOTEQ', 'LARGE', 'SMALL', 'LRGEQ', 'SMLEQ',
)

# Regular expression rules for tokens format: t_<TOKEN>
# Simple tokens: regex for literals +,-,*,/,%,=,(,) and variable names (alphanumeric)
t_PLUS    = r'\+'
t_MINUS   = r'-'
t_TIMES   = r'\*'
t_DIVIDE  = r'/'
t_MODULO  = r'%'
t_EQUALS  = r'='
t_LPAREN  = r'\('
t_RPAREN  = r'\)'
t_NAME    = r'[a-zA-Z_][a-zA-Z0-9_]*'
t_IF      = r'if'
t_ELSE    = r'else'
t_WHILE   = r'while'
t_EQUAL   = r'\=\='
t_NOTEQ   = r'\!\='
t_LARGE   = r'\>'
t_SMALL   = r'\<'
t_LRGEQ   = r'\>\='
t_SMLEQ   = r'\<\='


# complex tokens
# number token
def t_NUMBER(t):
    r'\d+'  # digit special character regex
    t.value = int(t.value)  # convert str -> int
    return t


# Ignored characters
t_ignore = " \t"  # spaces & tabs regex

# newline character
def t_newline(t):
    r'\n+'  # newline special character regex
    t.lexer.lineno += t.value.count("\n")  # increase current line number accordingly


# error handling for invalid character
def t_error(t):
    print("Illegal character '%s'" % t.value[0])  # print error message with causing character
    t.lexer.skip(1)  # skip invalid character


# Build the lexer
lex.lex()

# === Yacc parsing/grammar component ===

# Precedence & associative rules for the arithmetic operators
# 1. Unary, right-associative minus.
# 2. Binary, left-associative multiplication, division, and modulus
# 3. Binary, left-associative addition and subtraction
# Parenthesis precedence defined through the grammar
precedence = (
    ('left', 'PLUS', 'MINUS'),
    ('left', 'TIMES', 'DIVIDE', 'MODULO'),
    ('right', 'UMINUS'),
)

# dictionary of names (for storing variables)
names = {}

# --- Grammar:
# <statement> -> NAME = <expression> | <expression>
# <expression> -> <expression> + <expression>
#               | <expression> - <expression>
#               | <expression> * <expression>
#               | <expression> / <expression>
#               | <expression> % <expression>
#               | - <expression>
#               | ( <expression> )
#               | NUMBER
#               | NAME
# ---
# defined below using function definitions with format string/comment
# followed by logic of changing state of engine


# if statement
def p_statement_if(p):
    '''statement : IF LPAREN comparison RPAREN statement
                    | IF LPAREN comparison RPAREN statement ELSE statement'''
    if p[3]:
        p[0] = p[5]
    else:
        if p[7] is not None:
            p[0] = p[7]


def p_statement_while(p):
    'statement : WHILE LPAREN comparison RPAREN statement'
    while(p[3]):
        p[5];


# assignment statement: <statement> -> NAME = <expression>
def p_statement_assign(p):
    'statement : NAME EQUALS expression'
    names[p[1]] = p[3]  # PLY engine syntax, p stores parser engine state


# expression statement: <statement> -> <expression>
def p_statement_expr(p):
    'statement : expression'
    print(p[1])


# comparison
def p_comparison_binop(p):
    '''comparison : expression EQUAL expression
                          | expression NOTEQ expression
                          | expression LARGE expression
                          | expression SMALL expression
                          | expression LRGEQ expression
                          | expression SMLEQ expression'''
    if p[2] == '==':
        p[0] = p[1] == p[3]
    elif p[2] == '!=':
        p[0] = p[1] != p[3]
    elif p[2] == '>':
        p[0] = p[1] > p[3]
    elif p[2] == '<':
        p[0] = p[1] < p[3]
    elif p[2] == '>=':
        p[0] = p[1] >= p[3]
    elif p[2] == '<=':
        p[0] = p[1] <= p[3]


# binary operator expression: <expression> -> <expression> + <expression>
#                                          | <expression> - <expression>
#                                          | <expression> * <expression>
#                                          | <expression> / <expression>
#                                          | <expression> % <expression>
def p_expression_binop(p):
    '''expression : expression PLUS expression
                          | expression MINUS expression
                          | expression TIMES expression
                          | expression DIVIDE expression
                          | expression MODULO expression'''
    if p[2] == '+':
        p[0] = p[1] + p[3]
    elif p[2] == '-':
        p[0] = p[1] - p[3]
    elif p[2] == '*':
        p[0] = p[1] * p[3]
    elif p[2] == '/':
        p[0] = p[1] / p[3]
    elif p[2] == '%':
        p[0] = p[1] % p[3]


# unary minus operator expression: <expression> -> - <expression>
def p_expression_uminus(p):
    'expression : MINUS expression %prec UMINUS'
    p[0] = -p[2]


# parenthesis group expression: <expression> -> ( <expression> )
def p_expression_group(p):
    'expression : LPAREN expression RPAREN'
    p[0] = p[2]


# number literal expression: <expression> -> NUMBER
def p_expression_number(p):
    'expression : NUMBER'
    p[0] = p[1]


# variable name literal expression: <expression> -> NAME
def p_expression_name(p):
    'expression : NAME'
    # attempt to lookup variable in current dictionary, throw error if not found
    try:
        p[0] = names[p[1]]
    except LookupError:
        print("Undefined name '%s'" % p[1])
        p[0] = 0


# handle parsing errors
def p_error(p):
    print("Syntax error at '%s'" % p.value)


# build parser
yacc.yacc()

# start interpreter and accept input using commandline/console
while True:
    try:
        s = input('calc > ')  # get user input. use raw_input() on Python 2
    except EOFError:
        break
    yacc.parse(s)  # parse user input string

1 Answer:

Answer 0 (score: 0)

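Your basic problem is that your lexer does not recognize the keywords if and while (nor else), because the t_NAME pattern matches in those cases. The problem and a possible solution are described in section 4.3 of the Ply documentation. The problem is that:

"Tokens defined by strings are added next by sorting them in order of decreasing regular expression length (longer expressions are added first)."

and the expression for t_NAME is longer than the simple keyword patterns.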
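To see the effect of that ordering rule without building a lexer, here is a minimal simulation using only the standard re module. It is not PLY's own code; the patterns dictionary and the token_type helper are illustrative names, and the sketch only assumes that patterns are tried in order of decreasing regex length, as the quoted documentation says.

```python
import re

# String-defined token patterns copied from the question's lexer.
patterns = {
    'NAME': r'[a-zA-Z_][a-zA-Z0-9_]*',
    'IF': r'if',
    'ELSE': r'else',
    'WHILE': r'while',
}

# Mimic the documented ordering: longer regular expressions are tried first.
ordered = sorted(patterns.items(), key=lambda item: len(item[1]), reverse=True)

# Combine into one alternation; the first alternative that matches wins.
master = re.compile('|'.join('(?P<%s>%s)' % (name, rx) for name, rx in ordered))

def token_type(text):
    """Return the name of the pattern that claims the start of `text`."""
    match = master.match(text)
    return match.lastgroup if match else None

for word in ('if', 'else', 'while', 'counter'):
    print(word, '->', token_type(word))
```

Every word, keywords included, comes out as NAME, which is why the parser never sees an IF, ELSE or WHILE token.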

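You can't fix this just by turning t_NAME into a lexer function, because tokens defined by functions are checked before tokens defined by strings.

But you can turn t_NAME into a function and, inside it, look the matched string up in a dictionary to see whether it is a reserved word. (See the example at the end of the linked section, in the paragraph beginning "To handle reserved words...".) When you do this, you don't define t_IF, t_WHILE, or t_ELSE at all.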
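A minimal sketch of that reserved-word lookup, adapted to the token names in the question, might look like the following. The rule would replace the string definitions of t_NAME, t_IF, t_ELSE and t_WHILE; the tokens tuple can stay as it is, since it already lists 'IF', 'ELSE' and 'WHILE'. The SimpleNamespace object is only a stand-in for a PLY token so the rule can be tried without building the lexer.

```python
from types import SimpleNamespace

# Reserved words mapped to their token types (already listed in `tokens`).
reserved = {
    'if': 'IF',
    'else': 'ELSE',
    'while': 'WHILE',
}

def t_NAME(t):
    r'[a-zA-Z_][a-zA-Z0-9_]*'
    # Keywords match the identifier pattern too, so reclassify them here.
    t.type = reserved.get(t.value, 'NAME')
    return t

# Stand-in for a PLY token, used only to show the reclassification.
tok = t_NAME(SimpleNamespace(type='NAME', value='while'))
print(tok.type)
```

An identifier that merely starts with a keyword, such as iffy, is still a NAME, because the lookup is done on the whole matched string.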

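The shift-reduce conflict is the "dangling else" problem. If you search for that phrase, you will find a variety of solutions.

The simplest solution is to do nothing and just ignore the warning, since Ply does the right thing by default.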
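To make "the right thing" concrete, here is a small hand-written parser (not PLY, and not the question's grammar) that resolves the ambiguity the same way a shift does: an else is attached to the nearest unmatched if. The token format is an assumption made for brevity: conditions and simple statements are single words, with no parentheses.

```python
def parse_statement(tokens, pos=0):
    """Parse one statement starting at tokens[pos]; return (tree, next_pos)."""
    if tokens[pos] == 'if':
        cond = tokens[pos + 1]  # single-word condition
        then_branch, pos = parse_statement(tokens, pos + 2)
        else_branch = None
        # Taking the 'else' as soon as it appears corresponds to "shift".
        if pos < len(tokens) and tokens[pos] == 'else':
            else_branch, pos = parse_statement(tokens, pos + 1)
        return ('if', cond, then_branch, else_branch), pos
    # Any other word is treated as a simple statement.
    return tokens[pos], pos + 1

tree, _ = parse_statement('if a if b s1 else s2'.split())
print(tree)
```

The else ends up inside the inner if (the one testing b), and the outer if has no else branch.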

The second simplest solution is to add ('nonassoc', 'IF'), ('left', 'ELSE') to the precedence list, and to add a precedence marker to the if production:

'''statement : IF LPAREN comparison RPAREN statement %prec IF
             | IF LPAREN comparison RPAREN statement ELSE statement'''

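Giving ELSE a higher precedence than IF ensures that when the parser must choose between shifting ELSE in the second production and reducing by the first production, it chooses to shift (because ELSE has higher precedence). In fact, that is the default behaviour, so the precedence declarations don't affect parsing behaviour at all; they do, however, suppress the shift-reduce conflict warning, because the conflict has been resolved.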
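Put together with the question's existing table, the declarations might look like the sketch below. Two points are assumptions rather than something stated above: PLY accepts only 'left', 'right' and 'nonassoc' as associativity names, so 'nonassoc' is used for IF; and the two new entries are placed at the start of the table, although only their order relative to each other matters here (entries run from lowest to highest precedence).

```python
# Precedence table from the question, extended with IF and ELSE.
# Entries run from lowest to highest precedence, so IF must precede ELSE.
precedence = (
    ('nonassoc', 'IF'),
    ('left', 'ELSE'),
    ('left', 'PLUS', 'MINUS'),
    ('left', 'TIMES', 'DIVIDE', 'MODULO'),
    ('right', 'UMINUS'),
)

def p_statement_if(p):
    '''statement : IF LPAREN comparison RPAREN statement %prec IF
                 | IF LPAREN comparison RPAREN statement ELSE statement'''
    # Action body left out here; only the grammar annotation matters.
```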

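For other solutions, see this question and answer.

Finally, please look at the comments on your question. Your actions for the if and while statements won't work at all. An action runs when its production is reduced, and by then p[3] and p[5] are already-computed values: the inner statement has executed exactly once regardless of the condition, and while(p[3]) either does nothing or loops forever.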
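The answer does not spell out a fix. One common approach, offered here only as an assumption, is to have the parser actions build a tree and to evaluate that tree afterwards, so that a condition can be tested again on every pass. The node shapes and the evaluate function below are hypothetical and are not part of the question's code.

```python
import operator

# Hypothetical node shapes:
#   ('num', n), ('name', id), ('binop', op, left, right),
#   ('assign', id, expr), ('while', cond, body),
#   ('if', cond, then_branch, else_branch_or_None)
OPS = {'+': operator.add, '-': operator.sub, '<': operator.lt,
       '>': operator.gt, '==': operator.eq, '!=': operator.ne}

def evaluate(node, names):
    kind = node[0]
    if kind == 'num':
        return node[1]
    if kind == 'name':
        return names[node[1]]
    if kind == 'binop':
        return OPS[node[1]](evaluate(node[2], names), evaluate(node[3], names))
    if kind == 'assign':
        names[node[1]] = evaluate(node[2], names)
        return None
    if kind == 'if':
        if evaluate(node[1], names):
            return evaluate(node[2], names)
        if node[3] is not None:
            return evaluate(node[3], names)
        return None
    if kind == 'while':
        while evaluate(node[1], names):  # condition is re-evaluated every pass
            evaluate(node[2], names)
        return None
    raise ValueError('unknown node: %r' % (kind,))

# Tree a parser action might build for: while (x < 3) x = x + 1
loop = ('while', ('binop', '<', ('name', 'x'), ('num', 3)),
        ('assign', 'x', ('binop', '+', ('name', 'x'), ('num', 1))))
env = {'x': 0}
evaluate(loop, env)
print(env['x'])
```

With this design each p_ function would set p[0] to a node tuple instead of computing a value, and evaluate would be called on the result of yacc.parse.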