Shell,并行运行四个进程

时间:2013-11-02 22:31:18

标签: shell unix ubuntu

目前,我一直在努力有效地运行模拟计算。目的是并行运行4个模拟,因为它是单线程应用程序和四核系统。我必须改变shell脚本:

./sim -r 1 &
./sim -r 2 &
./sim -r 3 &
./sim -r 4 &
wait
./sim -r 5 &
./sim -r 6 &
./sim -r 7 &
./sim -r 8 &
wait
... (another 112 jobs) 

使用此代码会一次又一次地等待。我还尝试将任务分成四个脚本并运行每个脚本,结果是一个脚本完成而另一个脚本剩余约30%。我无法预测模拟需要多长时间。

有任何建议可以随时运行4次模拟吗?

3 个答案:

答案 0 :(得分:5)

在Ubuntu中安装moreutils包,然后使用parallel实用程序:

parallel -j 4 ./sim -r -- 1 2 3 4 5 6 7 8 ...

答案 1 :(得分:2)

如果您不想安装parallel实用程序(假设它按指示工作看起来很整洁),那么您可以调整此Perl脚本(基本上,更改执行的命令),并可能减少监控:

#!/usr/bin/env perl
use strict;
use warnings;
use constant MAX_KIDS => 4;

$| = 1;

my %pids;
my $kids = 0;   # Number of kids

for my $i (1..20)
{
    my $pid;
    if (($pid = fork()) == 0)
    {
        my $tm = int(rand() * (10 - 2) + 2);
        print "sleep $tm\n";
        # Using exec in a block on its own is the documented way to
        # avoid the warning:
        # Statement unlikely to be reached at filename.pl line NN.
        #   (Maybe you meant system() when you said exec()?)
        # Yes, I know the print and exit statements should never be
        # reached, but, dammit, sometimes things go wrong!
        { exec "sleep", $tm; }
        print STDERR "Oops: couldn't sleep $tm!\n";
        exit 1;
    }
    $pids{$pid} = 1;
    $kids++;
    my $time = time;
    print "PID: $pid; Kids: $kids; Time: $time\n";
    if ($kids >= MAX_KIDS)
    {
        my $kid = waitpid(-1, 0);
        print "Kid: $kid ($?)\n";
        if ($kid != -1)
        {
            delete $pids{$kid};
            $kids--;
        }
    }
}

while ((my $kid = waitpid(-1, 0)) > 0)
{
    my $time = time;
    print "Kid: $kid (Status: $?); Time: $time\n";
    delete $pids{$kid};
    $kids--;
}

# This should not do anything - and doesn't (any more!).
foreach my $pid (keys %pids)
{
    printf "Undead: $pid\n";
}

示例输出:

PID: 20152; Kids: 1; Time: 1383436882
PID: 20153; Kids: 2; Time: 1383436882
sleep 5
PID: 20154; Kids: 3; Time: 1383436882
sleep 7
sleep 9
PID: 20155; Kids: 4; Time: 1383436882
sleep 4
Kid: 20155 (0)
PID: 20156; Kids: 4; Time: 1383436886
sleep 4
Kid: 20152 (0)
PID: 20157; Kids: 4; Time: 1383436887
sleep 2
Kid: 20153 (0)
PID: 20158; Kids: 4; Time: 1383436889
sleep 9
Kid: 20157 (0)
PID: 20159; Kids: 4; Time: 1383436889
sleep 6
Kid: 20156 (0)
PID: 20160; Kids: 4; Time: 1383436890
sleep 6
Kid: 20154 (0)
PID: 20161; Kids: 4; Time: 1383436891
sleep 9
Kid: 20159 (0)
PID: 20162; Kids: 4; Time: 1383436895
sleep 7
Kid: 20160 (0)
PID: 20163; Kids: 4; Time: 1383436896
sleep 9
Kid: 20158 (0)
PID: 20164; Kids: 4; Time: 1383436898
sleep 6
Kid: 20161 (0)
PID: 20165; Kids: 4; Time: 1383436900
sleep 9
Kid: 20162 (0)
PID: 20166; Kids: 4; Time: 1383436902
sleep 9
Kid: 20164 (0)
PID: 20167; Kids: 4; Time: 1383436904
sleep 2
Kid: 20163 (0)
PID: 20168; Kids: 4; Time: 1383436905
sleep 6
Kid: 20167 (0)
PID: 20169; Kids: 4; Time: 1383436906
sleep 9
Kid: 20165 (0)
PID: 20170; Kids: 4; Time: 1383436909
sleep 4
Kid: 20168 (0)
PID: 20171; Kids: 4; Time: 1383436911
Kid: 20166 (0)
sleep 9
Kid: 20170 (Status: 0); Time: 1383436913
Kid: 20169 (Status: 0); Time: 1383436915
Kid: 20171 (Status: 0); Time: 1383436920

答案 2 :(得分:2)

NUMJOBS=30
NUMPOOLS=4

seq 1 "$NUMJOBS" | for p in $(seq 1 $NUMPOOLS); do
    while read x; do ./sim -r "$x"; done &
done

for循环创建一个后台进程池,从共享标准输入读取以启动模拟。每个后台进程在其模拟运行时“阻塞”,然后从seq命令读取下一个作业编号。

如果没有for循环,可能会更容易理解:

seq 1 "$NUMJOBS" | {
    while read x; do ./sim -r "$x"; done &
    while read x; do ./sim -r "$x"; done &
    while read x; do ./sim -r "$x"; done &
    while read x; do ./sim -r "$x"; done &
}

假设sim需要花费非常少的时间来运行,第一个while将从其标准输入读取1,第二个2,等等。sim首先完成, while循环将read 5来自标准输入;完成的下一个将读取6,依此类推。一旦最后一次模拟开始,每个read都将失败,导致循环退出。