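Regex to remove specific words in any order

Time: 2013-11-29 16:51:38

Tags: regex perl words optional substitution

I want to substitute (remove) specific optional words (AAA, BBB, CCC), in any order, that appear directly before and after a specific word (ALWAYS_THERE), together with that word. The first block below is the input; the second is the desired output.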

this is important AAA BBB ALWAYS_THERE CCC
this is important BBB AAA ALWAYS_THERE CCC
this is important AAA CCC ALWAYS_THERE BBB
this is important BBB ALWAYS_THERE CCC AAA
this is important BBB this also ALWAYS_THERE CCC AAA

this is important
this is important
this is important
this is important
this is important BBB this also

How can this be done in Perl (or any other program available on UNIX)?

5 answers:

Answer 0: (score: 4)

Try this:

s/\b(AAA\s+|BBB\s+|CCC\s+)*ALWAYS_THERE(\s+AAA|\s+BBB|\s+CCC)*\b//g;

[Edited to add the leading \b, per @ikegami]
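To check the substitution against the sample lines from the question, something like the following sketch could be used. The helper name `strip_marker` is made up for illustration, and the extra trailing-whitespace cleanup is an assumption about what the asker wants, not part of the answer above.

```perl
use strict;
use warnings;

# Hypothetical helper: applies the substitution from this answer to one line.
sub strip_marker {
    my ($line) = @_;
    $line =~ s/\b(AAA\s+|BBB\s+|CCC\s+)*ALWAYS_THERE(\s+AAA|\s+BBB|\s+CCC)*\b//g;
    $line =~ s/\s+$//;    # assumption: whitespace left at the end of the line is unwanted
    return $line;
}

# Sample lines copied from the question.
my @samples = (
    'this is important AAA BBB ALWAYS_THERE CCC',
    'this is important BBB AAA ALWAYS_THERE CCC',
    'this is important AAA CCC ALWAYS_THERE BBB',
    'this is important BBB ALWAYS_THERE CCC AAA',
    'this is important BBB this also ALWAYS_THERE CCC AAA',
);

print strip_marker($_), "\n" for @samples;
```

In the last sample the BBB before "this also" is kept, because it is not directly adjacent to ALWAYS_THERE.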

Answer 1: (score: 2)

Try this pattern: (\s*(AAA|BBB|CCC))*\s*ALWAYS_THERE.*$

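Answer 2: (score: 1)

Try this regex in Perl (assuming words are separated by exactly one space). The leading group is non-greedy so that it stops before the optional words instead of swallowing them:

s/(.*?)((?:(AAA|BBB|CCC) )*ALWAYS_THERE.*)/$1/

If you don't want to cut off text after ALWAYS_THERE that is not AAA, BBB, or CCC, use this instead:

s/(.*?)(?:(?:(?:AAA|BBB|CCC) )*ALWAYS_THERE(?: (?:AAA|BBB|CCC))*)(.*)/$1$2/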
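The difference between the two variants is easiest to see on a line that has extra text after the optional words. Here is a rough sketch, with a non-greedy leading group so the optional words are not captured into $1. The helper names and the sample line are made up for illustration.

```perl
use strict;
use warnings;

# Variant 1: drops everything from the optional words before ALWAYS_THERE to the end of the line.
sub cut_to_end {
    my ($line) = @_;
    $line =~ s/(.*?)((?:(AAA|BBB|CCC) )*ALWAYS_THERE.*)/$1/;
    return $line;
}

# Variant 2: drops only the optional words and ALWAYS_THERE, and keeps any other trailing text.
sub cut_marker_only {
    my ($line) = @_;
    $line =~ s/(.*?)(?:(?:(?:AAA|BBB|CCC) )*ALWAYS_THERE(?: (?:AAA|BBB|CCC))*)(.*)/$1$2/;
    return $line;
}

# Made-up sample line with text after the trailing optional word.
my $input = 'this is important AAA ALWAYS_THERE CCC but keep this';

# The "|" marks the end of each result so leftover spaces are visible.
print cut_to_end($input),      "|\n";
print cut_marker_only($input), "|\n";
```

Both variants leave the space that preceded the removed words in place, so the second result contains a double space where the words used to be.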

Answer 3: (score: 0)

You can use this pattern:

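s/((?:\s+(?:AAA|BBB|CCC))*)\s+ALWAYS_THERE(?1)//g

However, a string like "This is important ALWAYS_THERE" will also be replaced with "This is important".

If you want to avoid this behavior, you can use this pattern instead, which only matches when at least one optional word is present on either side:

s/((?:\s+(?:AAA|BBB|CCC))+)? ALWAYS_THERE((?1))?(?(1)|(?(2)|(?!)))//g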
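To compare the two patterns side by side, here is a small sketch written with Perl's `(?1)` group-recursion syntax (available since Perl 5.10). The helper names `strip_always` and `strip_if_flanked` are hypothetical.

```perl
use strict;
use warnings;

# First pattern: removes ALWAYS_THERE even when no optional word surrounds it.
sub strip_always {
    my ($s) = @_;
    $s =~ s/((?:\s+(?:AAA|BBB|CCC))*)\s+ALWAYS_THERE(?1)//g;
    return $s;
}

# Second pattern: the nested conditionals force a failure via (?!) unless
# group 1 (words before) or group 2 (words after) took part in the match.
sub strip_if_flanked {
    my ($s) = @_;
    $s =~ s/((?:\s+(?:AAA|BBB|CCC))+)? ALWAYS_THERE((?1))?(?(1)|(?(2)|(?!)))//g;
    return $s;
}

for my $line ('This is important ALWAYS_THERE',
              'this is important AAA BBB ALWAYS_THERE CCC') {
    print strip_always($line),     "|\n";
    print strip_if_flanked($line), "|\n";
}
```

With the first input, only `strip_always` changes the string; with the second, both remove the same text.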

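Answer 4: (score: -1)

This works like the other answers but has some advantages. In particular, adding more words to ignore is less error-prone, because you only add them to the array; the regex is built automatically from the array and $always_there: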

my $always_there = 'ALWAYS_THERE';
my @ignore = (
    'AAA',
    'BBB',
    'CCC',
);

my $ig_str = '('.join('|', map { "$_\\b\\s*" } @ignore).')*';

$data =~ s/$ig_str$always_there\s*$ig_str//; # add the /g modifier if ALWAYS_THERE can appear more than once
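
For completeness, here is a self-contained sketch of the same idea: the list of words is turned into a single alternation, and `quotemeta` is applied in case a word ever contains regex metacharacters. The `strip_marker` helper, the non-capturing groups, and the added `\b` anchors are my own assumptions, not part of the snippet above.

```perl
use strict;
use warnings;

my $always_there = 'ALWAYS_THERE';
my @ignore       = ('AAA', 'BBB', 'CCC');

# Build one alternation from the list; quotemeta escapes any regex metacharacters.
my $alt    = join '|', map { quotemeta } @ignore;
my $marker = quotemeta $always_there;

# Optional ignored words before the marker (each followed by whitespace),
# the marker itself, then optional ignored words after it (each preceded by whitespace).
my $re = qr/(?:\b(?:$alt)\s+)*\b$marker\b(?:\s+(?:$alt)\b)*/;

# Hypothetical helper: removes every occurrence of the marker and its adjacent ignored words.
sub strip_marker {
    my ($data) = @_;
    $data =~ s/$re//g;
    return $data;
}

# The "|" marks the end of the result so a leftover trailing space is visible.
print strip_marker('this is important BBB this also ALWAYS_THERE CCC AAA'), "|\n";
```

Adding a new word to ignore then only requires appending it to `@ignore`.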