Why do I get the error 'Can't find Unicode property definition "\"' with this regular expression?

Asked: 2015-03-20 16:14:37

Tags: regex perl

I have a text file with lines that look like this:

Pascal 14241 Mar 28

I want to check whether each line matches the following format:

Name size date

When I run my code, I get this error:

Can't find Unicode property definition "\" at testar.pl line 13, <IN> line 2 (#1
)
(F) You may have tried to use \p which means a Unicode
property (for example \p{Lu} matches all uppercase
letters). If you did mean to use a Unicode property, see
"Properties accessible through \p{} and \P{}" in perluniprops
for a complete list of available properties. If you didn't
mean to use a Unicode property, escape the \p, either by \\p
(just the \p) or by \Q\p (the rest of the string, or
until \E).

Uncaught exception from user code:
    Can't find Unicode property definition "\" at testar.pl line 13, <IN> line 2.
at testar.pl line 13

Here is my code:

#!/usr/bin/perl

use strict;
use warnings;

use diagnostics;

open (IN, "sample1.txt") or die "cant read words from file: $!";

while (<IN>) {
    chomp;
    if ($_ =~/\p\w+\s+\d+\s\w+\s+\d+/){
        print "$_ \n";
    }
}

How can I fix this?

1 answer:

Answer 0 (score: 1)

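\p must be followed either by a single-character Unicode property name (for example \pL, any letter) or by a property name in curly braces (for example \p{Lu}, uppercase letters).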
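To make the two spellings concrete, here is a minimal sketch. The helper name first_char_class and the sample strings are made up for illustration and are not part of the question:

```perl
use strict;
use warnings;

# Hypothetical helper: classify the first character of a string.
sub first_char_class {
    my ($s) = @_;
    return 'uppercase letter' if $s =~ /^\p{Lu}/;  # braces form: property "Lu"
    return 'letter'           if $s =~ /^\pL/;     # single-character form: property "L"
    return 'not a letter';
}

# Illustrative inputs only.
print "$_: ", first_char_class($_), "\n" for ('Pascal', 'pascal', '14241');
```

With these inputs, 'Pascal' should be reported as an uppercase letter, 'pascal' as a letter, and '14241' as not a letter.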

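\p\ is invalid: in \p\w+, Perl reads the character right after \p as the property name, and \ is not a valid Unicode property. In fact, you don't need \p in your regex at all: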

/\w+\s+\d+\s\w+\s+\d+/

If you want to anchor the match at the start of the line, use ^:

/^\w+\s+\d+\s\w+\s+\d+/
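The difference the anchor makes can be seen with a small sketch. The second input line, with leading junk, is invented for illustration:

```perl
use strict;
use warnings;

my $unanchored = qr/\w+\s+\d+\s\w+\s+\d+/;
my $anchored   = qr/^\w+\s+\d+\s\w+\s+\d+/;

# The first line comes from the question; the second is a made-up example.
for my $line ('Pascal 14241 Mar 28', '## Pascal 14241 Mar 28') {
    printf "%-26s unanchored=%d anchored=%d\n",
        "'$line'",
        ($line =~ $unanchored ? 1 : 0),
        ($line =~ $anchored   ? 1 : 0);
}
```

The unanchored pattern is free to start matching at "Pascal" in the second line, so it accepts both lines, while the anchored one rejects the line that starts with "##".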

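Your statement only tests whether the input string matches the regex. It does not capture any values, because you haven't told perl which parts you are interested in.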

To capture the fields, use parentheses:

/^(\w+)\s+(\d+)\s(\w+\s+\d+)/

The captured values are available in $1 for the first capture group, $2 for the second, and so on. You can then print whatever you need:

print "$1 $2 $3\n";
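Putting the pieces together, one way the corrected loop might look is sketched below. This is an assumption-laden sketch, not the original poster's final script: parse_line is a hypothetical helper, and an in-memory string stands in for sample1.txt so the example runs on its own (the second input line is invented to show a non-match).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns (name, size, date) on a match, an empty list otherwise.
sub parse_line {
    my ($line) = @_;
    return $line =~ /^(\w+)\s+(\d+)\s(\w+\s+\d+)/;
}

# In-memory stand-in for sample1.txt; the second line is a made-up non-matching line.
my $data = "Pascal 14241 Mar 28\nnot a valid line\n";
open my $in, '<', \$data or die "can't open in-memory data: $!";

while (my $line = <$in>) {
    chomp $line;
    # List assignment is true only when the regex matched and returned captures.
    if (my ($name, $size, $date) = parse_line($line)) {
        print "$name $size $date\n";
    }
}
close $in;
```

Copying the captures into named lexical variables right away avoids relying on $1, $2 and $3, which are only updated by the next successful match.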