Converting a regular expression to a PegJS grammar

Date: 2017-04-07 03:28:38

Tags: javascript parsing pegjs

I'm new to PEGjs, and I'm trying to write a PEGjs grammar that replaces the RegEx (\s*[\(])|(\s*[\)])|(\"[^\(\)]+?\")|([^\(\)\s]+).
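For reference, here is a rough sketch of what that regex does on its own when used as a tokenizer. The helper name `tokenize` and the shortened sample input are only assumptions for illustration; the regex is the one quoted above, with the `g` flag added so that `match()` returns every token.

```javascript
// The regex from the question, with the global flag so match() returns all tokens.
var tokenRegex = /(\s*[\(])|(\s*[\)])|(\"[^\(\)]+?\")|([^\(\)\s]+)/g;

// Hypothetical helper: split an input string into trimmed tokens.
function tokenize(input) {
  var matches = input.match(tokenRegex) || [];
  // The parenthesis alternatives can capture leading whitespace, so trim it off.
  return matches.map(function (token) { return token.trim(); });
}

var tokens = tokenize('(App= smtp AND "SPort" != 25)');
console.log(tokens);
```

Note that the regex only splits the input into pieces (and keeps `App=` as a single token because there is no space before the `=`); it says nothing about nesting or operator precedence, which is the part a grammar has to supply.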

Basically, what I want to do is convert the test input

(App= smtp AND "SPort" != 25) OR (App= pop3 AND "SPort" != 110) OR (App = imap AND "SPort" != 143) AND (App= imap OR "SPort" != 143)

into JSON in the following format:

{
  "eventTypes": [
    "All"
  ],
  "condition": {
    "operator": "and",
    "terms": [
      {
        "operator": "or",
        "terms": [
          {
            "operator": "or",
            "terms": [
              {
                "operator": "and",
                "terms": [
                  {
                    "name": "App",
                    "operator": "equals",
                    "value": "smtp"
                  },
                  {
                    "name": "Sport",
                    "operator": "notEquals",
                    "value": "25"
                  }
                ]
              },
              {
                "operator": "and",
                "terms": [
                  {
                    "name": "App",
                    "operator": "equals",
                    "value": "pop3"
                  },
                  {
                    "name": "Sport",
                    "operator": "notEquals",
                    "value": "110"
                  }
                ]
              }
            ]
          },
          {
            "operator": "and",
            "terms": [
              {
                "name": "App",
                "operator": "equals",
                "value": "imap"
              },
              {
                "name": "Sport",
                "operator": "notEquals",
                "value": "143"
              }
            ]
          }
        ]
      },
      {
        "operator": "or",
        "terms": [
          {
            "name": "App",
            "operator": "equals",
            "value": "imap"
          },
          {
            "name": "Sport",
            "operator": "notEquals",
            "value": "143"
          }
        ]
      }
    ]
  }
}

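I wrote some fairly involved JavaScript code that converts the sample input into the JSON format shown, but it is complicated and won't be easy to maintain in the long run, so I'd like to try a grammar-based parser instead. Since I'm new to the world of grammars, I'm looking for some help or guidance on implementing the grammar described above, so that I can extend or rewrite it as needed.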

You can see the output of the Regex here

Edit

JavaScript solution:

var str = '((Application = smtp AND "Server Port" != 25) AND (Application = smtp AND "Server Port" != 25)) OR (Application = pop3 AND "Server Port" != 110) OR (Application = imap AND "Server Port" != 143) AND (Application = imap OR "Server Port" != 143)';

var final = str.replace(/\((?!\()/g,"['")        //replace ( with [' if it's not followed by (
           .replace(/\(/g,"[")               //replace ( with [
           .replace(/\)/g,"']")              //replace ) with '] 
           .replace(/\sAND\s/g,"','AND','")  //replace AND with ','AND','
           .replace(/\sOR\s/g,"','OR','")    //replace OR with ','OR','
           .replace(/'\[/g,"[")              //replace '[ with [
           .replace(/\]'/g,"]")              //replace ]' with ]
           .replace(/"/g,"\\\"")             //escape double quotes
           .replace(/'/g,"\"");              //replace ' with "
console.log(JSON.parse("["+final+"]"))

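1 answer:

Answer 0 (score: 0)

As far as I know, you can't get exactly the result you want, because the rule it requires would cause an infinite loop. Specifically, given the following input: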

A OR B OR C

you are asking for the output:

(A OR B) OR C

To get this result, you would need a rule like this:

BOOL = left:( BOOL / Expression ) "OR" right:( Expression )

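This creates an infinite loop because BOOL can never resolve: the first thing the BOOL rule tries to match is BOOL itself (left recursion), so the parser keeps recursing without consuming any input. However, we can get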

A OR ( B OR C )

because

BOOL = left:( Expression ) "OR" right:( BOOL / Expression )

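does not create an infinite loop. This is because we can start matching something before recursing back into BOOL. I know it's a bit heady, but trust me: you have to give PegJS something to start matching before it can recurse.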
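To make that concrete, here is a minimal hand-written sketch of the right-recursive rule in plain JavaScript. It is not what PegJS generates; the function name `parseBool` and the whitespace-split token list are assumptions made purely for illustration.

```javascript
// Hypothetical hand-written equivalent of the right-recursive rule:
//   BOOL = left:Expression "OR" right:( BOOL / Expression )
// Tokens are assumed to be pre-split on whitespace.
function parseBool(tokens, pos) {
  // Consume one operand first: this is the "something" matched before recursing.
  var left = tokens[pos];
  if (left === undefined || left === 'OR') {
    throw new Error('Expected an operand at position ' + pos);
  }
  pos += 1;

  // No "OR" follows, so this operand is the whole expression.
  if (tokens[pos] !== 'OR') {
    return { node: left, pos: pos };
  }

  // The recursive call starts strictly later in the input, so it terminates.
  var right = parseBool(tokens, pos + 1);
  return {
    node: { operator: 'or', terms: [left, right.node] },
    pos: right.pos
  };
}

var rightNested = parseBool('A OR B OR C'.split(/\s+/), 0).node;
console.log(JSON.stringify(rightNested));
```

A left-recursive version would have to call `parseBool(tokens, pos)` with the same `pos` as its very first step, which is exactly the loop described above.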

If that is acceptable, then I believe this grammar will get you very close to the desired output:

// Our top-level rule is Expression
Expression
  = BOOL
  / SubExpression
  / Comparison
  / Term

// A sub expression is just an expression wrapped in parentheses
// Note that this does not cause an infinite loop because the first term is always "("
SubExpression
  = _ "(" _ innards: Expression _ ")" _ { return innards; }

Comparison
  = name:Term _ operator:("=" / "!=") _ value:Term {
      return {
        name: name,
        operator: operator === '=' ? 'equals' : 'notEquals',
        value: value,
      };
    }

BOOL = AND / OR

// AND and OR are separate rules so that one takes precedence: AND is tried first, and its operands may themselves be OR groups
AND
  = _ left:( OR / SubExpression / Comparison ) _ "AND" _ right:( AND / OR / SubExpression / Comparison ) _ {
    return {
      operator: 'and',
      terms: [ left, right ]
    }
  }

OR
  = _ left:( SubExpression / Comparison ) _ "OR" _ right:( OR / SubExpression / Comparison ) _ {
    return {
      operator: 'or',
      terms: [ left, right ]
    }
  }

Term
  = '"'? value:$( [0-9a-zA-Z]+ ) '"'? {
      return value;
    }

Integer "integer"
  = _ [0-9]+ { return parseInt(text(), 10); }

_ "whitespace"
  = [ \t\n\r]*

Given your input, we get:

{
   "operator": "and",
   "terms": [
      {
         "operator": "or",
         "terms": [
            {
               "operator": "and",
               "terms": [
                  {
                     "name": "App",
                     "operator": "equals",
                     "value": "smtp"
                  },
                  {
                     "name": "SPort",
                     "operator": "notEquals",
                     "value": "25"
                  }
               ]
            },
            {
               "operator": "or",
               "terms": [
                  {
                     "operator": "and",
                     "terms": [
                        {
                           "name": "App",
                           "operator": "equals",
                           "value": "pop3"
                        },
                        {
                           "name": "SPort",
                           "operator": "notEquals",
                           "value": "110"
                        }
                     ]
                  },
                  {
                     "operator": "and",
                     "terms": [
                        {
                           "name": "App",
                           "operator": "equals",
                           "value": "imap"
                        },
                        {
                           "name": "SPort",
                           "operator": "notEquals",
                           "value": "143"
                        }
                     ]
                  }
               ]
            }
         ]
      },
      {
         "operator": "or",
         "terms": [
            {
               "name": "App",
               "operator": "equals",
               "value": "imap"
            },
            {
               "name": "SPort",
               "operator": "notEquals",
               "value": "143"
            }
         ]
      }
   ]
}
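
If the left-nested shape from the question, (A OR B) OR C, is still needed, one common workaround is to collect the operands as a flat list and build the nesting in JavaScript rather than in the rule itself. The grammar above does not do this. The sketch below shows only the folding step; the helper name `foldLeft` and the idea of feeding it a first operand plus a list of the remaining operands are assumptions for illustration.

```javascript
// Hypothetical helper: fold a first operand and the operands that follow it
// into a left-nested tree, e.g. A, [B, C] -> (A OR B) OR C.
function foldLeft(operator, head, tail) {
  return tail.reduce(function (acc, next) {
    // Each step wraps everything built so far as the left-hand term.
    return { operator: operator, terms: [acc, next] };
  }, head);
}

var leftNested = foldLeft('or', 'A', ['B', 'C']);
console.log(JSON.stringify(leftNested));
```

Because the nesting is built by iterating over a list, nothing here recurses on the left, so the infinite-loop problem does not come up.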