Question

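I currently have includePat = r'^#\s*include\s+"([^"]+)"' to match the pattern #include "file.h".

I ran into trouble when trying to adapt it. What if the directive spans two lines, that is, #include \ at the end of one line and "file.h" on the next?

How should I match that?

Edit: Sorry, folks. To be clearer: the string inside the quotes can be anything; it is not limited to file.h.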

Answer 1

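If you don't need to strip the whitespace and only want to match it, what you have is very close.

This matches #include and any whitespace that follows, including newlines, and then the text in quotes. Note that \s does not match a backslash, so it covers a directive split by a plain newline but not one that uses a \ continuation character: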

(#include\s+)"([^"]+)"
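To make the behaviour of this pattern concrete, here is a minimal sketch that runs it against two sample inputs. The sample strings and the helper name included_name are assumptions made up for illustration; only the pattern itself comes from the answer.

```python
import re

# Pattern from this answer, compiled unchanged.
pat = re.compile(r'(#include\s+)"([^"]+)"')

# Directive split across two lines with no continuation character.
plain_split = '#include\n"file.h"\n'
# Directive split with a C backslash continuation (hypothetical sample).
continued = '#include \\\n"file.h"\n'

def included_name(text):
    """Return the quoted file name, or None when the pattern does not match."""
    m = pat.search(text)
    return m.group(2) if m else None

print(included_name(plain_split))  # the newline is consumed by \s+
print(included_name(continued))    # the backslash is not whitespace
```

The first call should find file.h; the second should find nothing, because \s+ stops at the backslash.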


Answer 2

Here is how I would do it:

import re
import sys

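# Match #include "..." directives, allowing backslash line continuations
# between the keyword and the quoted file name.
includePat = re.compile(r'''
    ^\s*         # beginning of line, optional WS
    \#\s*        # hash, optional WS
    include      # include, naturally
    (?:\s*\\\n)* # any number of line continuations
    \s*"(.*?)"   # non-greedy string match -- .*?
''', re.VERBOSE | re.MULTILINE)  # verbose, multiline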

for filename in sys.argv[1:]:
    # Read the whole file so a directive can span several lines.
    with open(filename) as fp:
        text = fp.read()
    for include in includePat.findall(text):
        print(filename, include)

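One important piece is (?:\s*\\\n)*. The \s* is optional whitespace. The \\ matches the C line-continuation character. The \n matches the newline that must immediately follow the line-continuation character.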
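As a rough illustration of what the continuation group accepts, the sketch below runs the same pattern over a made-up source string containing zero, one, and two continuations. The string source and the header names in it are hypothetical.

```python
import re

# Same pattern as in the answer, with the flags passed as arguments.
includePat = re.compile(r'''
    ^\s*\#\s*include   # directive keyword
    (?:\s*\\\n)*       # any number of line continuations
    \s*"(.*?)"         # quoted file name
''', re.VERBOSE | re.MULTILINE)

# Hypothetical source text: a plain include, one with a single
# continuation, and one with two continuations in a row.
source = (
    '#include "a.h"\n'
    '#include \\\n"b.h"\n'
    '#  include \\\n   \\\n  "c.h"\n'
)

names = includePat.findall(source)
print(names)
```

Each pass through the group consumes optional whitespace, one backslash, and the newline right after it, so all three directives should be reported.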

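The other important point is that you have to match against the whole file. With a single regular expression you cannot match each line independently in a loop; you have to read the entire file into one buffer.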
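The difference between the two approaches can be sketched as follows. The sample text is hypothetical, and the pattern is a compact, non-verbose rewrite of the one above, so treat it as an assumption rather than the answer's exact code.

```python
import re

# Compact form of the answer's pattern (no verbose mode, so '#' needs no escape).
includePat = re.compile(r'^\s*#\s*include(?:\s*\\\n)*\s*"(.*?)"', re.MULTILINE)

# Hypothetical file contents with one directive split by a continuation.
text = '#include "a.h"\n#include \\\n"b.h"\n'

# Matching the whole buffer sees both directives.
whole = includePat.findall(text)

# Matching line by line misses the split directive: no single line
# contains both the keyword and the quoted name.
per_line = []
for line in text.splitlines(keepends=True):
    per_line.extend(includePat.findall(line))

print(whole)
print(per_line)
```

Only the whole-buffer search should report b.h, which is why the loop in the answer calls fp.read() once per file instead of iterating over lines.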

Finding a pattern in text, Python

2 answers: