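Find words with a repeated character

Time: 2013-06-02 00:15:07

Tags: regex perl

I want to search a dictionary for every word that has the same character at the second and last positions, and exactly once somewhere in between.

Examples: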

statement - has the "t" at the second, fourth and last place
severe = has "e" at 2,4,last
abbxb = "b" at 2,3,last

abab = "b" only 2 times not 3
abxxxbyyybzzzzb - "b" 4 times, not 3

My grep does not work:

my @ok = grep { /^(.)(.)[^\2]+(\2)[^\2]+(\2)$/ } @wordlist;

E.g. running

perl -nle 'print if /^(.)(.)[^\2]+(\2)[^\2]+(\2)$/' < /usr/share/dict/words

prints, for example,

zarabanda

What is wrong?

What should the correct regex be?

Edit:

How can I capture the enclosed groups? E.g. for

statement - want to capture: st(a)t(emen)t - for later use

my $w1 = $1; my $w2 = $2; or something like...

4 answers:

Answer 0 (score: 13)

(?:(?!STRING).)* is to STRING as [^CHAR]* is to CHAR, so what you want is:

^.             # Ignore first char
(.)            # Capture second char
(?:(?!\1).)*   # Any number of chars that aren't the second char
\1             # Second char
(?:(?!\1).)*   # Any number of chars that aren't the second char
\1\z           # Second char at the end of the string.

So you get:

perl -ne'print if /^. (.) (?:(?!\1).)* \1 (?:(?!\1).)* \1$/x' \
   /usr/share/dict/words

To capture what is in between, add parens around both occurrences of (?:(?!\1).)*.

perl -nle'print "$2:$3" if /^. (.) ((?:(?!\1).)*) \1 ((?:(?!\1).)*) \1\z/x' \
   /usr/share/dict/words
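
As a quick check, the capturing pattern can be run against the words from the question instead of the whole dictionary. This is only a sketch: the word list is copied from the question, and the %middle hash is a hypothetical name introduced here to hold the two captured runs.

```perl
use strict;
use warnings;

# Pattern from the one-liner above, with both middle runs captured.
my $re = qr/^. (.) ((?:(?!\1).)*) \1 ((?:(?!\1).)*) \1\z/x;

# Words from the question: the first three should match, the rest should not.
my @words = qw(statement severe abbxb abab abxxxbyyybzzzzb zarabanda);

my %middle;    # hypothetical: word => [first run, second run]
for my $word (@words) {
    if ($word =~ $re) {
        $middle{$word} = [$2, $3];
        print "$word: '$2' and '$3'\n";
    }
}
```

For statement the two runs are "a" and "emen", which is the st(a)t(emen)t split asked for in the edit to the question.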

Answer 1 (score: 6)

Here is a regex that should work for you:

^.(.)(?=(?:.*?\1){2})(?!(?:.*?\1){3}).*?\1$

Live demo: http://www.rubular.com/r/bEMgutE7t5
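
The two lookaheads effectively count how often the second character occurs in the rest of the word: at least twice, but not three times. One way to convince yourself of that is to compare the regex with a plain count. The helper has_three_by_count below is a hypothetical name added for illustration and is not part of the answer.

```perl
use strict;
use warnings;

# Regex from the answer above.
my $re = qr/^.(.)(?=(?:.*?\1){2})(?!(?:.*?\1){3}).*?\1$/;

# Hypothetical cross-check: count the second character in the rest of the
# word and require exactly two occurrences, one of them at the very end.
sub has_three_by_count {
    my ($word) = @_;
    return 0 if length($word) < 4;
    my $char  = substr($word, 1, 1);
    my $rest  = substr($word, 2);
    my $count = () = $rest =~ /\Q$char\E/g;
    return ($count == 2 && substr($word, -1) eq $char) ? 1 : 0;
}

for my $word (qw(statement severe abbxb abab abxxxbyyybzzzzb zarabanda)) {
    my $by_regex = $word =~ $re ? 1 : 0;
    my $by_count = has_three_by_count($word);
    print "$word: regex=$by_regex count=$by_count\n";
}
```

On the six words from the question both columns should agree, with 1 for the first three words and 0 for the others.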

Answer 2 (score: 1)

Use a lookahead:

/^.(.)(?!(?:.*\1){3}).*\1(.*)\1$/

Meaning:

/^.(.)(?!(?:.*\1){3})  # capture the second character if it is not
                       # repeated more than twice after the 2nd position
.*\1(.*)\1$/           # match the captured char two more times, the last one at the end
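
Note that this pattern captures only the run between the last two occurrences. If both runs are wanted, as in the edit to the question, one possible variation is to put a group around the first .* as well. The sketch below does that; the extra group and the split_runs helper are additions made here, not part of the answer.

```perl
use strict;
use warnings;

# The answer's pattern with one extra group around the first run.
my $re = qr/^.(.)(?!(?:.*\1){3})(.*)\1(.*)\1$/;

# Hypothetical helper: returns the two runs, or an empty list if no match.
sub split_runs {
    my ($word) = @_;
    return $word =~ $re ? ($2, $3) : ();
}

my ($w1, $w2) = split_runs('statement');
print "$w1 / $w2\n" if defined $w1;
```

Because the negative lookahead rules out a third later occurrence, there is only one way to split a matching word, so the greedy .* cannot pick the wrong occurrence.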

Answer 3 (score: 1)

my @ok = grep { /^.(\w)/ && /^.$1[^$1]*?$1[^$1]*$1$/ } @wordlist;
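
This approach works where the question's attempt does not, because $1 is interpolated into the pattern text before the regex is compiled, so [^$1] becomes an ordinary negated class. Inside a bracketed class, \2 is not a backreference; Perl reads it as the octal escape for chr(2). The sketch below shows both effects. The matches_interpolated helper and the use of quotemeta are additions made here as a precaution against regex metacharacters, not part of the answer.

```perl
use strict;
use warnings;

# Inside [...], \2 is the octal escape for chr(2), not a backreference.
my $class_is_octal = "\x02" =~ /^[\2]$/ ? 1 : 0;

# So the question's pattern lets the "excluded" character through.
my $question_re = qr/^(.)(.)[^\2]+(\2)[^\2]+(\2)$/;
my $false_hit   = 'zarabanda' =~ $question_re ? 1 : 0;

# Hypothetical helper: interpolate the second character into the classes.
sub matches_interpolated {
    my ($word) = @_;
    return 0 unless $word =~ /^.(.)/s;
    my $c = quotemeta $1;    # escape it in case it is a metacharacter
    return $word =~ /^.${c}[^${c}]*${c}[^${c}]*${c}\z/ ? 1 : 0;
}

print "class is octal: $class_is_octal\n";
print "question regex matches zarabanda: $false_hit\n";
print "interpolated version matches zarabanda: ",
    matches_interpolated('zarabanda'), "\n";
```

Writing ${c} with braces keeps Perl from having to guess whether the following [ starts an array subscript or a character class.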