Why is my Perl script printing blank lines in its output?

Date: 2016-07-05 23:06:40

Tags: perl unix printing

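The details of what the script does are not important, but I have commented the parts that matter. I only care about why blank lines appear in my output.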

When I run the command

./script.pl temp temp.txt tempF `wc -l temp | awk '{print $1}'`

where the file temp contains

1   27800000    120700000   4
1   27800000    124300000   4
1   154800000   247249719   3
3   32100000    71800000    9
3   32100000    87200000    2
3   54400000    74200000    15
4   76500000    155100000   20
4   76500000    182600000   3
4   76500000    88200000    77
4   88200000    124000000   2
5   58900000    180857866   8
5   58900000    76400000    2
5   58900000    97300000    4
5   76400000    143100000   14
5   97300000    147200000   6
6   7000000 29900000    2
6   63500000    70000000    73
6   63500000    92100000    4
6   70000000    113900000   70
6   70000000    139100000   57
6   92100000    113900000   3

I get output of the form

hs1 27800000    124300000   4


hs3 32100000    87200000    2
hs3 54400000    74200000    15

hs4 76500000    182600000   3
hs4 76500000    88200000    77
hs4 88200000    124000000   2

hs5 58900000    76400000    2
hs5 58900000    97300000    4
hs5 76400000    143100000   14
hs5 97300000    147200000   6


hs6 63500000    92100000    4

hs6 70000000    139100000   57
hs6 92100000    113900000   3

on standard output (about 8 lines are also printed to the temp.txt file, but those lines are formatted correctly).

Here is the script:

#!/usr/bin/perl

# ARGV[0] is the name of the file which data will be read from(may have overlaps)
# ARGV[1] is the name of the file which will be produced that will have no overlaps
# ARGV[2] is the name of the folder which will hold all the data  
# ARGV[3] is the number of lines that ARGV[0] will contain

use warnings;

my $file  = "./$ARGV[0]";
my @lines = do {
    open my $fh, '<', $file or die "Can't open $file -- $!";
    <$fh>;
};

my $file2 = "./$ARGV[2]/$ARGV[1]";
open( my $files, ">", "$file2" ) or die "Can't open > $file2: $!";

my $i = 0;
while ( $i < $ARGV[3] - 1 ) {

    my @ref_fields = split( '\s+', $lines[$i] );

    print $files
        "$ref_fields[0]", "\t",
        $ref_fields[1], "\t",
        $ref_fields[2], "\t",
        $ref_fields[3], "\n";

    for my $j ( $i + 1 .. $ARGV[3] - 1 ) {

        $i = $j;

        # @curr_fields is initialized here

        my @curr_fields = split /\s+/, $lines[$j];

        if ( $ref_fields[0] eq $curr_fields[0] && $ref_fields[2] > $curr_fields[1] ) {

            if ( defined( $curr_fields[0] ) && $curr_fields[0] !~ /\s+/ ) {

                chomp $curr_fields[3];

                # the line below is the one that is printing to standard output
                print
                    $curr_fields[0], "\t",
                    $curr_fields[1], "\t",
                    $curr_fields[2], "\t",
                    $curr_fields[3], "\n";
            }
        }
        else {
            last;
        }
    }

    print "\n";
}

Edit:

I found a bug when running the script from the posted answer. When I run the command

./script.pl temp1 temp10.txt folder

where temp1 contains

12  58100000    96200000    0.04348
3   74200000    87200000    0.04348
5   130600000   168500000   0.04348
6   61000000    114600000   0.04348
6   75900000    114600000   0.04348
6   88000000    114600000   0.04348
6   88000000    139000000   0.04348
6   93100000    161000000   0.04348
6   105500000   139000000   0.04348
6   130300000   139000000   0.04348
7   59900000    77500000    0.04348
7   98000000    132600000   0.04348
X   67800000    76000000    0.08696
Y   28800000    59373566    0.04348

I get

6   75900000    114600000   0.04348
6   88000000    114600000   0.04348
6   88000000    139000000   0.04348
6   93100000    161000000   0.04348
6   105500000   139000000   0.04348

and temp10.txt contains

12  58100000    96200000    0.04348
3   74200000    87200000    0.04348
5   130600000   168500000   0.04348
6   61000000    114600000   0.04348
6   130300000   139000000   0.04348
7   59900000    77500000    0.04348
7   98000000    132600000   0.04348
X   67800000    76000000    0.08696

The line

Y   28800000    59373566    0.04348

is in neither the output nor temp10.txt. It seems to have disappeared, but it should have been printed to one of them.

1 Answer:

Answer 0 (score: 2):

Clearly the blank lines are printed because you have the line

print "\n";

in your code.

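I can't help much more than that, because you say "the details of what the script does are not important", so we have no idea what it is meant to be doing.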

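However, what you have written prints lines from the input file for as long as the first column matches the first column of the reference line (the line most recently written to the output file) and the second column is less than that line's third column. Any time you get a line that doesn't fit this pattern, you print a blank line.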
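The control flow described above can be reproduced in isolation. The following is a minimal sketch, not the asker's script: `simulate` is a hypothetical helper that mirrors the loop structure of the question and collects what would go to standard output into a list (an empty string stands for a blank line). The rows are the first seven lines of the question's temp file, and the `$only_after_match` flag is an assumption added purely for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper mirroring the question's nested loops.
# Returns the lines that would be printed to standard output;
# '' represents a blank line.
sub simulate {
    my ( $only_after_match, @lines ) = @_;

    # Split every row once into an array of fields
    my @rows = map { [ split ' ', $_ ] } @lines;

    my @out;
    my $i = 0;
    while ( $i < $#rows ) {
        my $ref     = $rows[$i];
        my $matched = 0;

        for my $j ( $i + 1 .. $#rows ) {
            $i = $j;
            my $cur = $rows[$j];

            # Stop at the first row that does not overlap the reference row
            last unless $ref->[0] eq $cur->[0] && $ref->[2] > $cur->[1];

            push @out, join "\t", @$cur;
            $matched++;
        }

        # The question's script does this unconditionally (flag = 0),
        # so a reference row with no overlapping rows still adds a blank line
        push @out, '' if $matched || !$only_after_match;
    }
    return @out;
}

my @sample = (
    "1   27800000    120700000   4",
    "1   27800000    124300000   4",
    "1   154800000   247249719   3",
    "3   32100000    71800000    9",
    "3   32100000    87200000    2",
    "3   54400000    74200000    15",
    "4   76500000    155100000   20",
);

print "$_\n" for simulate( 0, @sample );
```

With the flag set to 0, the reference row `1 154800000 247249719 3` has no overlapping successor but still reaches the blank-line step, which accounts for two consecutive blank lines after the first printed row. Calling `simulate( 1, @sample )` emits a separator only after a group that actually printed something; whether that is the wanted behaviour depends on what the script is meant to do.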


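You may prefer this refactoring of your code, which behaves identically but which I think is more readable. It also has the advantage of splitting each line of the input file only once, and it doesn't need the fourth parameter, because the number of lines is just the size of the @lines array. Blank lines are removed from the file as it is read, so there is no longer any need to check whether the first field is defined.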
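The refactored code itself is not reproduced in this copy of the answer. What follows is only a sketch of a refactoring along the lines described (each line split once, blank lines dropped on read, no line-count argument), not the answerer's code. The subroutine name `remove_overlaps` and the use of filehandle arguments are assumptions made so the logic can be exercised without real files.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: reads rows from $in_fh, writes each reference row to
# $out_fh and each row overlapping it to $std_fh, keeping the blank-line
# separator and loop bounds of the question's script.
sub remove_overlaps {
    my ( $in_fh, $out_fh, $std_fh ) = @_;

    # Drop blank lines and split every remaining line exactly once
    my @lines = map { [ split ' ', $_ ] } grep { /\S/ } <$in_fh>;

    my $i = 0;
    while ( $i < $#lines ) {
        my $ref = $lines[$i];
        print {$out_fh} join( "\t", @$ref ), "\n";

        for my $j ( $i + 1 .. $#lines ) {
            $i = $j;
            my $cur = $lines[$j];

            # Same first column, and the row starts before the reference ends
            last unless $ref->[0] eq $cur->[0] && $ref->[2] > $cur->[1];

            print {$std_fh} join( "\t", @$cur ), "\n";
        }

        print {$std_fh} "\n";
    }
    return;
}

# Same arguments as the question's script, minus the line count:
#   input file, output file name, output folder
if ( @ARGV == 3 ) {
    my ( $in_name, $out_name, $dir ) = @ARGV;
    open my $in_fh,  '<', $in_name         or die "Can't open $in_name: $!";
    open my $out_fh, '>', "$dir/$out_name" or die "Can't open $dir/$out_name: $!";
    remove_overlaps( $in_fh, $out_fh, \*STDOUT );
    close $out_fh or die "Can't close $dir/$out_name: $!";
}
else {
    warn "usage: $0 input_file output_name output_folder\n";
}
```

Because the sketch keeps the original loop bounds, the outer loop ends as soon as `$i` reaches the last index. A final row that does not overlap the reference row before it is therefore never written to either output, which is consistent with the missing `Y` line reported in the question's edit.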