Script to run an MPI program that takes two input files and a SIZE as command-line arguments

Date: 2018-03-20 12:51:13

Tags: bash

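My MPI program needs 3 command-line arguments: 2 input files and a SIZE, which is the same for both files of a pair. For example, I have these files in the same directory: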

abc.mtx,  abc.txt  SIZE is same for these 2 files

def.mtx,  def.txt  SIZE is same for these two files

qas.mtx,  qas.txt  SIZE is same for these two files

and so on .....

Note: the two files of each pair have the same name and differ only in their extension.

I want to run my code as:

mpirun -np 4 ./myexe file1.mtx file1.txt -SIZE 10   # ./myexe is the executable

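I want to run my program with different numbers of processes, e.g. -np 2, 4, 6, 8 and 10. I have more than a hundred files. I want to launch everything once from the command line, so that the file pairs are processed one after another, each with the specified process counts.

For example: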

abc.mtx and abc.txt should run first with 2,4,6,8,10 processes 

and then next two files def.mtx and def.txt with 2,4,6,8,10  processes and so on....

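For the serial code I tried the following command, which works by picking up all the .txt files one by one. (It only handles the txt files or the mtx files, not both.)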

find . -name "*.txt" | awk -F"/" '{system("./myexe " $2)}'

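How can I run it with two input files that have different extensions (mtx and txt)? And what is the best way to supply the third argument, SIZE? Should I create another file that contains the SIZE values and pass that in along with the input files?

EDIT: Here is a script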

#!/bin/bash

while read base size; do
   mtx="${base}.mtx"
   txt="${base}.txt"
   for np in 2 4 6 8 10; do
      echo mpirun -np $np ./myexe "$mtx" "$txt" -SIZE $size
   done
done < jobs

The jobs.txt file looks like this:

bus 490
bcs_B 10
arc 1178
tk18 99

I am running it with the following command:

./script.sh jobs.txt ./new

I also tried:

bash script.sh jobs.txt ./new

EDIT 2

jobs.txt looks like this:

494_bus 494
arc130 130
bcsstk02 66
bcsstk18 11948

The script is:

#!/bin/bash
while read base size; do
   mtx="${base}.mtx"
   txt="${base}.txt"
   for np in 2 4; do
      mpirun -np $np ./new "$txt" "$mtx" -SIZE $size
   done
done < "$1"

My code just prints the dimension of the matrix. The output is:

Dimension of the matrix is = 494 
Dimension of the matrix is = 494 
Dimension of the matrix is = 494 
Dimension of the matrix is = 494 
Dimension of the matrix is = 494 
Dimension of the matrix is = 494 

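It only picks up the first pair of files and runs them with -np 2 and -np 4; the remaining files are never run.

If I put echo in front of mpirun in the script, it prints: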

mpirun -np 2 ./new 494_bus.txt 494_bus.mtx -SIZE 494
mpirun -np 4 ./new 494_bus.txt 494_bus.mtx -SIZE 494
mpirun -np 2 ./new arc130.txt arc130.mtx -SIZE 130
mpirun -np 4 ./new arc130.txt arc130.mtx -SIZE 130
mpirun -np 2 ./new bcsstk02.txt bcsstk02.mtx -SIZE 66
mpirun -np 4 ./new bcsstk02.txt bcsstk02.mtx -SIZE 66
mpirun -np 2 ./new bcsstk18.txt bcsstk18.mtx -SIZE 11948
mpirun -np 4 ./new bcsstk18.txt bcsstk18.mtx -SIZE 11948

If I run each of these commands individually, they all work fine. For example:

mpirun -np 4 ./new arc130.txt arc130.mtx -SIZE 130
mpirun -np 2 ./new bcsstk18.txt bcsstk18.mtx -SIZE 11948

These commands work fine when run by hand, but not when run from the script. Thanks.

EDIT 3

cat jobs.txt 

494_bus 494
arc130 130
bcsstk02 66
bcsstk18 11948



cat -vet jobs.txt

494_bus 494$
arc130 130$
bcsstk02 66$
bcsstk18 11948$

cat script.sh

#!/bin/bash

while read base size; do
   mtx="${base}.mtx"
   txt="${base}.txt"
   for np in 2 4; do
      mpirun -np $np ./new "$txt" "$mtx" -SIZE $size
   done
done < "$1"

cat -vet script.sh

#!/bin/bash$
$
while read base size; do$
   mtx="${base}.mtx"$
   txt="${base}.txt"$
   for np in 2 4; do$
      mpirun -np $np ./new "$txt" "$mtx" -SIZE $size$
   done$
done < "$1"$

EDIT 4

bash -xv script2.sh jobs.txt
#!/bin/bash

while read base size; do
   mtx="${base}.mtx"
   txt="${base}.txt"
   for np in 2 4; do
      mpirun -np $np ./new "$txt" "$mtx" -SIZE $size
   done
done < "$1"
+ read base size
+ mtx=494_bus.mtx
+ txt=494_bus.txt
+ for np in 2 4
+ mpirun -np 2 ./new 494_bus.txt 494_bus.mtx -SIZE 494
Dimension of the matrix is = 494 
Dimension of the matrix is = 494 
+ for np in 2 4
+ mpirun -np 4 ./new 494_bus.txt 494_bus.mtx -SIZE 494
Dimension of the matrix is = 494 
Dimension of the matrix is = 494 
Dimension of the matrix is = 494 
Dimension of the matrix is = 494 
+ read base size

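1 answer:

Answer 0 (score: 0)

Updated answer

I suspect the MPI program is consuming some or all of stdin, so I suggest reading the entire job list into a bash array first: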

#!/bin/bash

# Read entire jobs file into array jobs[]
IFS=$'\n' jobs=($(cat "$1"))

for j in "${jobs[@]}"; do
   IFS=" " read base size <<< "$j"
   mtx="${base}.mtx"
   txt="${base}.txt"
   for np in 2 4 6 8 10; do
      echo mpirun -np $np ./myexe "$mtx" "$txt" -SIZE $size
   done
done
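To illustrate the suspicion above, here is a minimal, self-contained sketch. `greedy` is a hypothetical stand-in for a program that drains its standard input; whether mpirun really does so on your system is an assumption, not something the question confirms. The sketch also shows an alternative to the array: giving the inner command its own stdin with `< /dev/null`.

```shell
#!/bin/bash
# greedy is a hypothetical stand-in for a command that reads all of its stdin,
# as mpirun is suspected of doing here.
greedy() { cat > /dev/null; echo "ran $*"; }

# Without redirection: greedy swallows the remaining lines of the job list,
# so the loop body runs only once.
unprotected() {
   while read -r base size; do
      greedy "$base" "$size"
   done
}

# With < /dev/null: the inner command cannot touch the loop's input,
# so every line of the job list is processed.
protected() {
   while read -r base size; do
      greedy "$base" "$size" < /dev/null
   done
}

printf '494_bus 494\narc130 130\n' | unprotected
printf '494_bus 494\narc130 130\n' | protected
```

If the same thing is happening in the real script, appending `< /dev/null` to the mpirun line inside the `while read` loop should have the same effect as the array approach, provided the program does not actually need input from the terminal.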

Updated answer

Based on your comments, the simplest approach is probably to have a file called jobs that looks like this:

abc 108
def 120
qas 196

Then change the script so it looks like this, and pass the name of the jobs file as its first argument:

#!/bin/bash

while read base size; do
   mtx="${base}.mtx"
   txt="${base}.txt"
   for np in 2 4 6 8 10; do
      echo mpirun -np $np ./myexe "$mtx" "$txt" -SIZE $size
   done
done < "$1"

Sample output

mpirun -np 2 ./myexe abc.mtx abc.txt -SIZE 108
mpirun -np 4 ./myexe abc.mtx abc.txt -SIZE 108
mpirun -np 6 ./myexe abc.mtx abc.txt -SIZE 108
mpirun -np 8 ./myexe abc.mtx abc.txt -SIZE 108
mpirun -np 10 ./myexe abc.mtx abc.txt -SIZE 108
mpirun -np 2 ./myexe def.mtx def.txt -SIZE 120
mpirun -np 4 ./myexe def.mtx def.txt -SIZE 120
mpirun -np 6 ./myexe def.mtx def.txt -SIZE 120
mpirun -np 8 ./myexe def.mtx def.txt -SIZE 120
mpirun -np 10 ./myexe def.mtx def.txt -SIZE 120
mpirun -np 2 ./myexe qas.mtx qas.txt -SIZE 196
mpirun -np 4 ./myexe qas.mtx qas.txt -SIZE 196
mpirun -np 6 ./myexe qas.mtx qas.txt -SIZE 196
mpirun -np 8 ./myexe qas.mtx qas.txt -SIZE 196
mpirun -np 10 ./myexe qas.mtx qas.txt -SIZE 196
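With more than a hundred pairs, a typo in the jobs file is easy to miss. The following is a hedged sketch, not part of the script above, that prints the same commands but skips blank lines, lines starting with `#`, and any job whose .mtx or .txt file is missing. The function name `plan_jobs` and the comment convention are assumptions made for illustration.

```shell
#!/bin/bash
# Print one mpirun command per (job, process count); skip jobs with missing inputs.
# plan_jobs is a hypothetical helper name.
plan_jobs() {
   local jobs_file=$1 base size np
   while read -r base size; do
      # Skip blank lines and lines starting with '#'
      [[ -z $base || $base == \#* ]] && continue
      if [[ ! -f $base.mtx || ! -f $base.txt ]]; then
         echo "skipping $base: missing $base.mtx or $base.txt" >&2
         continue
      fi
      for np in 2 4 6 8 10; do
         echo mpirun -np "$np" ./myexe "$base.mtx" "$base.txt" -SIZE "$size"
      done
   done < "$jobs_file"
}

# Run only when a jobs file is given on the command line
if [[ $# -ge 1 ]]; then
   plan_jobs "$1"
fi
```

As in the script above, the `echo` makes this a dry run; remove it once the printed commands look right.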

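Original answer

Not sure why you tagged this as batch-file (which is a nasty Windows thing) when you are using find and awk?

Anyway, this seems to be what you are asking for: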

#!/bin/bash

for f in *.mtx; do
   # Get name of corresponding text file
   t="${f%.*}.txt"
   for np in 2 4 6 8 10; do
      echo mpirun -np $np ./myexe "$f" "$t" -SIZE 10
   done
done

Sample output

mpirun -np 2 ./myexe abc.mtx abc.txt -SIZE 10
mpirun -np 4 ./myexe abc.mtx abc.txt -SIZE 10
mpirun -np 6 ./myexe abc.mtx abc.txt -SIZE 10
mpirun -np 8 ./myexe abc.mtx abc.txt -SIZE 10
mpirun -np 10 ./myexe abc.mtx abc.txt -SIZE 10
mpirun -np 2 ./myexe def.mtx def.txt -SIZE 10
mpirun -np 4 ./myexe def.mtx def.txt -SIZE 10
mpirun -np 6 ./myexe def.mtx def.txt -SIZE 10
mpirun -np 8 ./myexe def.mtx def.txt -SIZE 10
mpirun -np 10 ./myexe def.mtx def.txt -SIZE 10
mpirun -np 2 ./myexe qas.mtx qas.txt -SIZE 10
mpirun -np 4 ./myexe qas.mtx qas.txt -SIZE 10
mpirun -np 6 ./myexe qas.mtx qas.txt -SIZE 10
mpirun -np 8 ./myexe qas.mtx qas.txt -SIZE 10
mpirun -np 10 ./myexe qas.mtx qas.txt -SIZE 10

If you have bash v4+, you can replace 2 4 6 8 10 with {2..10..2}
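As a quick check of the two bash features used in this answer, the sketch below expands the stepped sequence and derives a .txt name from an .mtx name. The variable names `counts` and `f` are only for illustration.

```shell
#!/bin/bash
# Sequence expression with a step (bash 4 or later): {start..end..increment}
counts=( {2..10..2} )
echo "${counts[@]}"

# ${f%.*} removes the shortest trailing ".something" from the value of f,
# so the matching text file name can be built from the matrix file name.
f="abc.mtx"
echo "${f%.*}.txt"
```

Note that brace expansion happens before variable expansion, so the bounds have to be literal numbers rather than variables.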

Just for fun, since I had never tried nested calls to GNU Parallel, you can also do it as a one-liner like this:

parallel -j1 "parallel -j1 -I // echo mpirun -np // ./myexe {} {.}.txt -SIZE 10 ::: 2 4 6 8 10" ::: *.mtx

mpirun -np 2 ./myexe abc.mtx abc.txt -SIZE 10
mpirun -np 4 ./myexe abc.mtx abc.txt -SIZE 10
mpirun -np 6 ./myexe abc.mtx abc.txt -SIZE 10
mpirun -np 8 ./myexe abc.mtx abc.txt -SIZE 10
mpirun -np 10 ./myexe abc.mtx abc.txt -SIZE 10
mpirun -np 2 ./myexe def.mtx def.txt -SIZE 10
mpirun -np 4 ./myexe def.mtx def.txt -SIZE 10
mpirun -np 6 ./myexe def.mtx def.txt -SIZE 10
mpirun -np 8 ./myexe def.mtx def.txt -SIZE 10
mpirun -np 10 ./myexe def.mtx def.txt -SIZE 10
mpirun -np 2 ./myexe qas.mtx qas.txt -SIZE 10
mpirun -np 4 ./myexe qas.mtx qas.txt -SIZE 10
mpirun -np 6 ./myexe qas.mtx qas.txt -SIZE 10
mpirun -np 8 ./myexe qas.mtx qas.txt -SIZE 10
mpirun -np 10 ./myexe qas.mtx qas.txt -SIZE 10

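The outermost instance of parallel iterates over the mtx files, while the inner instance iterates over the process counts. Both use -j1 so that parallel starts only one job at a time and does not introduce any extra parallelism.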