Removing near-duplicate strings

Time: 2015-05-19 02:19:36

Tags: arrays string perl duplicates

Hi, I have these strings, along with several others, in an array.

    revised_1.4_1.4-1.05-jan
    revised_1.5_1.8-before
    revised_1.5_1.8-after
    revised_1.5_0.7-mid
    deleted&reviewed_0.9-0.8-1.05-jan
    deleted&reviewed_1.6_1.6-before
    deleted&reviewed_0.5_1.8-after
    deleted&uploaded_0.8_1.9-midweek
    deleted&uploaded_1.0_1.3-offweek
    accessedbeforesecondquarter_0.8._1.6-jan
    accessedbeforesecondquarter_0.9_1.7-feb

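I need to keep only one string from each group of nearly identical strings in the array. How can I write code that produces this array?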

    revised_1.4_1.4-1.05-jan 
    deleted&reviewed_0.9-0.8-1.05-jan
    deleted&uploaded_0.8_1.9-midweek
    accessedbeforesecondquarter_0.8._1.6-jan

Here is my code; it does not seem to save the right strings into the array.

    my %seen;
    my @strings = grep !$seen{ substr($_,0,2) }++, @strings;

1 answer:

Answer 0: (score: 1)

Keeping to the spirit of what you tried:

    use strict;
    use warnings;

    my %seen;
    # Key each line on the text before the first "_"; keep only the first line per key.
    my @result = grep { !$seen{ (split "_", $_)[0] }++ } <DATA>;
    print @result;

    __DATA__
    revised_1.4_1.4-1.05-jan
    revised_1.5_1.8-before
    revised_1.5_1.8-after
    revised_1.5_0.7-mid
    deleted&reviewed_0.9-0.8-1.05-jan
    deleted&reviewed_1.6_1.6-before
    deleted&reviewed_0.5_1.8-after
    deleted&uploaded_0.8_1.9-midweek
    deleted&uploaded_1.0_1.3-offweek
    accessedbeforesecondquarter_0.8._1.6-jan
    accessedbeforesecondquarter_0.9_1.7-feb

Result:

    revised_1.4_1.4-1.05-jan
    deleted&reviewed_0.9-0.8-1.05-jan
    deleted&uploaded_0.8_1.9-midweek
    accessedbeforesecondquarter_0.8._1.6-jan
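
Since the question starts from strings already held in an array rather than a `__DATA__` section, the same idea can be sketched against an array. This is only an illustration under one assumption: that "nearly identical" means "same text before the first underscore". The helper name `first_per_prefix` and the variable `@unique` are made up for this example and are not part of the original answer.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the first string seen for each prefix,
# where the prefix is everything before the first underscore.
sub first_per_prefix {
    my @strings = @_;
    my %seen;
    return grep {
        my ($prefix) = split /_/, $_, 2;   # text before the first "_"
        !$seen{$prefix}++;                 # true only the first time a prefix appears
    } @strings;
}

my @strings = qw(
    revised_1.4_1.4-1.05-jan
    revised_1.5_1.8-before
    revised_1.5_1.8-after
    revised_1.5_0.7-mid
    deleted&reviewed_0.9-0.8-1.05-jan
    deleted&reviewed_1.6_1.6-before
    deleted&reviewed_0.5_1.8-after
    deleted&uploaded_0.8_1.9-midweek
    deleted&uploaded_1.0_1.3-offweek
    accessedbeforesecondquarter_0.8._1.6-jan
    accessedbeforesecondquarter_0.9_1.7-feb
);

# Store the result in a new array instead of redeclaring @strings with "my".
my @unique = first_per_prefix(@strings);
print "$_\n" for @unique;
```

With the sample data this prints the four strings listed in the question. The original attempt keyed on `substr($_,0,2)`, which gives `de` for both `deleted&reviewed...` and `deleted&uploaded...`, so those two groups collapse into one and only three strings survive. Keying on the whole prefix keeps them apart. If `@strings` was already declared with `my` in the same scope, redeclaring it also triggers a "masks earlier declaration" warning under `use warnings`, which is why the sketch assigns to a separate array.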