Question

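We don't know how to track errors in our yacc parser. We are trying to use yylineno in our lex file and tried adding %option yylineno, but it still doesn't work, and we can't access these variables in yacc.

All we want is to print out the syntax error, using error in yacc, together with its line number.

Here is our .l file: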

%{
#include <stdio.h>
#include <stdlib.h>
#include "y.tab.h"
int yylineno=1;

%}

%option yylineno

identifier  [a-zA-Z_][a-zA-Z0-9_]*
int_constant    [0-9]+
delimiter       ;

%%

"int"       {return INT;}
{int_constant}  return INT_CONST;
{identifier}    return IDENT;
\=      {return ASOP;}
\+      {return PLUS;}
\-      {return MINUS;}
\*      {return MULT;}
\/      {return DIV;}
\,      {return COMMA;}
\(      {return OP;} /*OP CP = Opening Closing Parenthesis*/
\)      {return CP;}
\[      {return OB;} /*OB CB = Opening Closing Brace*/
\]      {return CB;}
\{      {return OCB;} /*OCB CCB = Opening Closing Curly Brace*/
\}      {return CCB;}
{delimiter} return DEL;
[ \t]
[\n]        {yylineno++;}


%%

And here is our .y file:

%{
#include <stdio.h>
#include <string.h>
#include "y.tab.h"

extern FILE *yyin;

%}

%token INT INT_CONST IDENT ASOP PLUS MINUS MULT DIV DEL COMMA CP CB CCB
%left OP OB OCB


%%

program:        program_unit;
program_unit:   program_unit component | component
component:  var_decl DEL | func_decl DEL | func_defn ;
var_decl:       dt list;
dt:     INT;
list:       list COMMA var | var 
        | error {printf("before ';' token\n"); yyerrok;}
        | error INT_CONST {printf("before numeric constant\n"); yyerrok;};
var:        IDENT
        |IDENT init;
init:       ASOP IDENT init | ASOP expr | ASOP IDENT ;
expr:       IDENT op expr | const op expr | const | OP expr CP;
const:      INT_CONST;
op:     PLUS | MINUS | MULT | DIV;
func_decl:  dt mult_func;
mult_func:  mult_func COMMA mfunc | sfunc;
mfunc:      IDENT OP CP;
sfunc:      IDENT OP CP OCB func_body CCB;
func_body:  program_unit;

func_defn:  dt IDENT OP CP OCB func_body CCB
        | IDENT OP CP OCB func_body CCB; 

%%

int yyerror(char *s){
    extern int yylineno;
    fprintf(stderr,"At line %d %s ",s,yylineno);  
}

int yywrap(){
    return 1;
}

int main(int argc, char *argv[]){
    yyin=fopen("test.c","r");
    yyparse();
    fclose(yyin);
    return 0;
}

Answer 1

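There are a number of problems with these files, but none of them would prevent the bison-generated parser from using yylineno.

Your definition of yyerror will produce a compile-time warning, though. Or possibly several warnings.

First, the correct signature is: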

void yyerror(const char *msg);

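It's OK to return int, but the value is never used; however, your definition of the function just falls off the end without returning anything, so the compiler will complain about the missing return value. Also, yyerror is usually called with a string literal argument, which is immutable; standard C allows a string literal to be passed to a function whose parameter type is non-const, but this is discouraged and the compiler might warn about it. More importantly,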

fprintf(stderr,"At line %d %s ",s,yylineno);

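applies the %d (integer) format to s (a string) and the %s (string) format to yylineno (an integer); again, this should produce a compile-time warning, and if you ignore the warning, your program will probably segfault.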
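To make the two fixes above concrete, here is a minimal sketch of a corrected yyerror. The helper name format_syntax_error and the exact message layout are illustrative assumptions, not part of the original code, and the definition of yylineno is only a stand-in so the fragment compiles on its own; in the real .y file it would remain an extern declaration.

```c
#include <stdio.h>

/* Stand-in so this sketch compiles on its own. In the real .y file, write
   `extern int yylineno;` and let the flex-generated scanner define it. */
int yylineno = 1;

/* Hypothetical helper: builds the message in a buffer so that the pairing
   of format specifiers and arguments is easy to check. */
int format_syntax_error(char *buf, size_t size, int line, const char *msg) {
    /* %d pairs with the int line number, %s with the message string. */
    return snprintf(buf, size, "At line %d: %s\n", line, msg);
}

/* Signature suggested above: void return type, const message. */
void yyerror(const char *msg) {
    char buf[256];
    format_syntax_error(buf, sizeof buf, yylineno, msg);
    fputs(buf, stderr);
}
```

With this version the arguments are passed in the same order as the format specifiers, and there is no missing return value for the compiler to complain about.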

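Finally (as regards yylineno), if you specify %option yylineno in your flex input (a good idea if you want to count lines), then the flex-generated scanner will define and initialize yylineno and do the counting for you. So your definition of yylineno in the .l file will trigger a compile-time error (redefinition of yylineno). Also, when you explicitly increment yylineno ([\n] {++yylineno;}), you end up double-counting lines; yylineno will be incremented by the scanner and then again by your action. My advice: specify %option yylineno and then let flex do everything for you. You only need to declare it as extern in your bison file (as you already do), and you can simply add \n to the list of ignored whitespace characters.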
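As a rough sketch of that advice, the top of the .l file could look something like the following. Only a few of the token rules are repeated here; the remaining rules from the question are assumed to stay as they are.

```lex
%{
#include <stdio.h>
#include <stdlib.h>
#include "y.tab.h"
/* No definition of yylineno here: %option yylineno makes flex define it. */
%}

%option yylineno

identifier  [a-zA-Z_][a-zA-Z0-9_]*
int_constant    [0-9]+

%%

"int"           { return INT; }
{int_constant}  { return INT_CONST; }
{identifier}    { return IDENT; }
[ \t\n]+        { /* skip whitespace; flex itself updates yylineno */ }
```

The separate [\n] rule with its manual increment is gone, so each newline is counted exactly once.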

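One caveat: using yylineno directly in bison means that you don't get the exact location of the syntax error, because the bison-generated parser usually reads one lookahead token, and by the time bison notices the syntax error, yylineno has already been updated to the line number at the end of that token. Sometimes this is misleading, particularly when the syntax error is the result of a missing token.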

A few other issues:

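It is (IMHO) much better style to use literal character tokens rather than defining token names in bison and coordinating them with the flex file. If you just use literal characters, the two files are easier to keep in sync with each other, the grammar is more readable, and you don't need comments like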
```
/*OP CP = Opening Closing Parenthesis*/
```
Instead, just use ')' in the grammar, and in the lexer you can do this:
```
[][=+*/,(){}-]  { return yytext[0]; }
```
Or you could even use a default rule at the end:
```
.  { return yytext[0]; }
```
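Related to the above, and the reason I usually go for the second option (the default rule): your lexer does not have rules for every possible character, so the flex-provided default rule will be used. The flex-provided default rule echoes the unmatched character to yyout. In a real compiler, that's never what you want, and the result is to hide input errors (or scanner bugs). It's much better to use the default rule I suggest above, and to protect yourself with %option nodefault, which suppresses the flex-generated default rule. With %option nodefault, flex will warn you if some input could go unmatched; please don't ignore this warning.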
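Putting the last two suggestions together, a scanner along these lines is one possibility. It is a sketch only: it assumes the grammar has been rewritten to use literal characters such as '=' and ';' in place of ASOP, DEL and the other single-character token names.

```lex
%{
#include "y.tab.h"
%}

%option yylineno nodefault

%%

"int"                   { return INT; }
[0-9]+                  { return INT_CONST; }
[a-zA-Z_][a-zA-Z0-9_]*  { return IDENT; }
[ \t\n]+                { /* skip whitespace */ }
.                       { return yytext[0]; /* any other single character */ }
```

Since . matches every character except a newline and the whitespace rule covers the newline, every input character is matched by some rule. A character that the grammar does not expect then reaches the parser as a token and is reported as a syntax error instead of being silently echoed.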

How to detect the error line number with a yacc parser
