How can I print lines that match a pattern in Perl?

Date: 2010-04-01 01:06:47

Tags: regex perl

Suppose file.txt contains exactly one sentence per line, like this:

John Depp is a great guy.  
He is very inteligent.  
He can do anything.  
Come and meet John Depp.

The Perl code is as follows:

open ( FILE, "file.txt" ) || die "can't open file!";
@lines = <FILE>;
close (FILE);
$string = "John Depp";
foreach $line (@lines) {
    if ($line =~ $string) { print "$line"; }
}

The output will be the first and fourth lines.

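I want this to work for files with arbitrary line breaks, not just files with one English sentence per line. That is, it should also work for input like this: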

John Depp is a great guy. He is very intelligent. He can do anything. Come and meet John Depp.

The output should be the first and fourth sentences.

Any ideas?

6 answers:

Answer 0 (score: 2)

First, note that the famous actor's name is Johnny Depp.

Second, deciding what is and is not a sentence is tricky. I am going to cheat and use Lingua::Sentence:

#!/usr/bin/perl

use strict; use warnings;

use Lingua::Sentence;

my $splitter = Lingua::Sentence->new('en');

while ( my $text = <DATA> ) {
    for my $sentence ( split /\n/, $splitter->split($text) ) {
        print $sentence, "\n" if $sentence =~ /John Depp/;
    }
}

__DATA__
John Depp is a great guy.
He is very intelligent.
He can do anything.
Come and meet John Depp.
John Depp is a great guy. He is very intelligent. He can do anything. Come and meet John Depp.

Output:

John Depp is a great guy.
Come and meet John Depp.
John Depp is a great guy.
Come and meet John Depp.

Answer 1 (score: 2)

Simpler: if you assume that "sentences" are delimited by periods, you can use the period as the input record separator:

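 $/ = '.';                      # read up to each period instead of each newline
 while(<>) {
        s/^\s+|\s+$//g;         # trim whitespace left over from the line breaks
        print "$_\n" if (/John Depp/i);
 }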
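
Assigning to `$/` at the top level changes every later read in the program. A minimal sketch of the same idea with the change confined to one subroutine, assuming sentences end only with a period; `sentences_matching` and the sample text are hypothetical names and data for illustration, not part of the answer:

```perl
use strict;
use warnings;

# Hypothetical helper: reads period-terminated records from $fh and
# returns those matching $pattern, with line breaks folded into spaces.
sub sentences_matching {
    my ($fh, $pattern) = @_;
    local $/ = '.';              # reverts to its previous value on return
    my @found;
    while (my $chunk = <$fh>) {
        $chunk =~ s/\s+/ /g;     # fold line breaks and runs of spaces
        $chunk =~ s/^ | $//g;    # drop the single space left at either end
        push @found, $chunk if $chunk =~ $pattern;
    }
    return @found;
}

# Sample text with line breaks in the middle of sentences.
my $text = "John Depp is a great guy. He is\nvery intelligent. "
         . "He can do anything. Come\nand meet John Depp.\n";
open my $fh, '<', \$text or die "can't open in-memory file: $!";
my @found = sentences_matching($fh, qr/John Depp/i);
close $fh;

print "$_\n" for @found;
```

Because each record is normalized before matching, a sentence that wraps across lines is still printed on a single line.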

Answer 2 (score: 1)

Assuming your content is held in a string:

my $content = "John Depp is a great guy.  
He is very intelligent.  
He can do anything.  
Come and meet John Depp.";

my @arr = $content =~ /.*John Depp.*/mg;
foreach my $match (@arr) {    # avoid $a, which sort() treats specially
    print "$match\n";
}

Result:

John Depp is a great guy.
Come and meet John Depp.

If you only want to extract the interesting part, you can modify the regular expression, for example:

my @arr = $content =~ /is (\w+? ?\w+ \w+)\./mg;

Result:

a great guy
very intelligent
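
The `/.*John Depp.*/mg` pattern earlier in this answer returns whole lines, so it relies on each sentence sitting on its own line. A rough sketch for text with arbitrary line breaks, assuming every sentence ends with a period and contains no other periods (the sample string and the `$flat` and `@sentences` names are made up for illustration):

```perl
use strict;
use warnings;

# Assumption: sentences end with a period and contain no other periods.
my $content = "John Depp is a great guy. He is\nvery intelligent. "
            . "He can do anything. Come\nand meet John Depp.";

# Collapse line breaks and runs of whitespace into single spaces.
(my $flat = $content) =~ s/\s+/ /g;

# Grab each period-terminated run of text that contains the name.
my @sentences = $flat =~ /([^.]*John Depp[^.]*\.)/g;

# Trim the leading space left over from the previous sentence.
s/^\s+// for @sentences;

print "$_\n" for @sentences;
```

The character class `[^.]` keeps a match from running across a sentence boundary, which is what `.*` without `/s` did for line boundaries in the original pattern.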

Answer 3 (score: 0)

One way:

while(<>){
 if (/John Depp/i){
   # split the line into sentences on periods and surrounding whitespace
   foreach my $found (split /\s*\.\s*/){
     if ($found =~ /John Depp/i) {
        print $found."\n";
     }
   }
 }
}

Output:

$ cat file
John Depp is a great guy.
He is very inteligent.
He can do anything.
Come and meet John Depp.
John Depp is a great guy. He is very inteligent. He can do anything. Come and meet John Depp.

$ perl perl.pl file
John Depp is a great guy
Come and meet John Depp
John Depp is a great guy
Come and meet John Depp

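Answer 4 (score: 0)

The default variable can get clobbered if you are not careful, so it is a good idea to name everything.

This should get you started: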

#!/usr/bin/perl -w

use strict;

my $targetString = "John Depp";

while (my $line = <STDIN>) {
    chomp($line);
    my @elements = split("\\.", $line);
    foreach my $element (@elements) {
        if ($element =~ m/$targetString/is) {
            print trim($element).".\n";
        }
    }
}

sub trim {
    my $string = shift;
    $string =~ s/^\s+//;
    $string =~ s/\s+$//;
    return $string;
}

Usage:

$ depp.pl < file
John Depp is a great guy.
Come and meet John Depp.
John Depp is a great guy.
Come and meet John Depp.
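
To illustrate the remark about the default variable at the top of this answer, here is a small sketch of one way `$_` gets clobbered. The two helper subs and the sample arrays are hypothetical; they only exist to contrast a bare `while (<$fh>)` loop with one that uses a named lexical:

```perl
use strict;
use warnings;

# Hypothetical helper: counts lines with a bare while (<$fh>) loop,
# which assigns each line to the global $_ without localizing it.
sub count_lines_clobbering {
    my ($text) = @_;
    open my $fh, '<', \$text or die "can't open in-memory file: $!";
    my $count = 0;
    while (<$fh>) { $count++ }
    close $fh;
    return $count;
}

# Same job with a named lexical, which leaves $_ alone.
sub count_lines_safe {
    my ($text) = @_;
    open my $fh, '<', \$text or die "can't open in-memory file: $!";
    my $count = 0;
    while (my $line = <$fh>) { $count++ }
    close $fh;
    return $count;
}

my @safe      = ('John Depp', 'Jane Doe');
my @clobbered = ('John Depp', 'Jane Doe');

# Inside for, $_ is an alias for the current array element, so a callee
# that writes to $_ writes straight into the array.
count_lines_safe("a\nb\n")       for @safe;
count_lines_clobbering("a\nb\n") for @clobbered;

printf "safe:      %d of %d names still defined\n",
    scalar(grep { defined } @safe), scalar @safe;
printf "clobbered: %d of %d names still defined\n",
    scalar(grep { defined } @clobbered), scalar @clobbered;
```

The final read in the bare loop stores undef in `$_`, and because `$_` aliases the array element at that point, the caller's data is overwritten.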

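Answer 5 (score: 0)

This is a comment on your original code rather than a specific answer to your question. Unless you have to, reading an entire file into memory is usually a bad idea. You can process the file line by line: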

open ( FILE, "file.txt" ) || die "can't open file!";
$string = "John Depp";
while (<FILE>) {
   if (/$string/) { print }
}
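
For comparison, a sketch of the same line-by-line loop using the three-argument form of `open`, a lexical filehandle, and `$!` in the error message. The `print_matching` sub is a hypothetical wrapper added for illustration, and `\Q...\E` is my assumption that the search text should be matched literally rather than as a regex:

```perl
use strict;
use warnings;

# Hypothetical helper (not from the answer): prints the lines of $path
# that contain the literal text $string and returns how many it printed.
sub print_matching {
    my ($path, $string) = @_;
    open my $fh, '<', $path or die "can't open $path: $!";
    my $printed = 0;
    while (my $line = <$fh>) {
        # \Q...\E quotes regex metacharacters in the search text.
        if ($line =~ /\Q$string\E/) {
            print $line;
            $printed++;
        }
    }
    close $fh;
    return $printed;
}

# Only run the demo when the question's file.txt is present.
print_matching('file.txt', 'John Depp') if -e 'file.txt';
```

Only one line is held in memory at a time, and the filehandle is closed as soon as the sub finishes with it.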