python3

Time: 2017-11-03 17:05:48

Tags: python bash awk

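I'm trying to develop a pipeline that involves running a UNIX command, which I execute through an os.system() call. The call fails when the Python code runs. The odd part is that when I copy and paste the same command and run it directly in a UNIX terminal, it works fine. I'm not sure what is wrong here. This is how I build the command for os.system():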

cmd="paste "
cmd+="<(awk '{print $1, $2, $3}' "+DATA_DIRECTORY_y2h+"/data/trimmed_reads/S"+str(library)+"_STAR_transcriptome_vector_freeAligned.sortedByCoord.out.bam.idxstats) "
cmd+="<(awk '{print $3}' "+DATA_DIRECTORY_y2h+"/data/trimmed_reads/S"+str(library)+"_STAR_transcriptome_trimmed_vector_containingAligned.sortedByCoord.out.bam.idxstats) "
cmd+="| awk '$3>0 && $4>0'|awk '{a[NR]=$0;x+=(b[NR]=$3)}END{while(++i<=NR)print a[i]\" \"100*b[i]/x}'|sort -grk5,5 > "
cmd+=DATA_DIRECTORY_y2h+"/data/trimmed_reads/S"+str(library)+"_STAR_transcript_read_coverage.txt"

When I print it out, the command is:

paste <(awk '{print $1, $2, $3}' /home/bigdata/sagnik/y2h//data/trimmed_reads/S1_STAR_transcriptome_vector_freeAligned.sortedByCoord.out.bam.idxstats) <(awk '{print $3}' /home/bigdata/sagnik/y2h//data/trimmed_reads/S1_STAR_transcriptome_trimmed_vector_containingAligned.sortedByCoord.out.bam.idxstats) | awk '$3>0 && $4>0'|awk '{a[NR]=$0;x+=(b[NR]=$3)}END{while(++i<=NR)print a[i]" "100*b[i]/x}'|sort -grk5,5 > /home/bigdata/sagnik/y2h//data/trimmed_reads/S1_STAR_transcript_read_coverage.txt

1 answer:

Answer 0 (score: 0)

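os.system() runs the command with /bin/sh. Your command needs bash, because process substitution (the <(...) syntax) is a bash feature that POSIX sh does not provide.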
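To see the difference for yourself, here is a minimal sketch that runs the same one-line snippet under sh and under bash. The snippet and variable names are made up for illustration, and it assumes bash is on your PATH. What sh does depends on the system: where /bin/sh is dash it should report a syntax error, while on systems where sh is really bash the result may differ.

```python
import subprocess

# Hypothetical one-liner that relies on process substitution.
snippet = "cat <(echo hello)"

# Run the same text under both shells and capture the results.
sh_result = subprocess.run(["sh", "-c", snippet],
                           capture_output=True, text=True)
bash_result = subprocess.run(["bash", "-c", snippet],
                             capture_output=True, text=True)

# The sh outcome is system dependent, so it is only reported here.
print("sh exit status:", sh_result.returncode)
print("sh stderr:", sh_result.stderr.strip())
print("bash exit status:", bash_result.returncode)
print("bash stdout:", bash_result.stdout.strip())
```

If the sh run fails and the bash run prints hello, you are hitting the same problem as in the question.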

So the smallest change that fixes the most immediate and obvious problem is:

import subprocess
subprocess.check_call(['bash', '-c', cmd])
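If you would rather keep passing a single command string, a roughly equivalent variant is shell=True together with the executable argument, which on POSIX replaces the default /bin/sh. This is only a sketch: demo_cmd is a made-up stand-in for the real pipeline, and the bash location is looked up with shutil.which, falling back to an assumed /bin/bash.

```python
import shutil
import subprocess

# Locate bash; the /bin/bash fallback is an assumption about the system.
bash_path = shutil.which("bash") or "/bin/bash"

# Hypothetical stand-in for the real pipeline: paste two generated columns.
demo_cmd = "paste <(printf 'a\\nb\\n') <(printf '1\\n2\\n')"

# With shell=True, `executable` selects the shell that parses the string.
result = subprocess.run(demo_cmd, shell=True, executable=bash_path,
                        check=True, capture_output=True, text=True)
print(result.stdout, end="")
```

Like os.system(), this still lets the shell parse everything in the string, so it fixes the bash-versus-sh problem only.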

That said, a better-practice implementation (more readable, and safer, because it does not let the shell parse file names as code) would look like:

import subprocess

cmd=r'''

dd_y2h=$1
library=$2

input1="${dd_y2h}/data/trimmed_reads/S${library}_STAR_transcriptome_vector_freeAligned.sortedByCoord.out.bam.idxstats"
input2="${dd_y2h}/data/trimmed_reads/S${library}_STAR_transcriptome_trimmed_vector_containingAligned.sortedByCoord.out.bam.idxstats"
output="${dd_y2h}/data/trimmed_reads/S${library}_STAR_transcript_read_coverage.txt"

paste \
  <(awk '{print $1, $2, $3}' "$input1") \
  <(awk '{print $3}' "$input2") \
  | awk '$3>0 && $4>0' \
  | awk '
      {
        a[NR]=$0;
        x+=(b[NR]=$3)
      }
      END {
        while (++i<=NR) {
          print(a[i] " " 100*b[i]/x)
        }
      }' \
  | sort -grk5,5 \
  >"$output"
'''

subprocess.check_call([
  'bash', '-c', cmd, '_',   # '_' is a placeholder for $0
  str(DATA_DIRECTORY_y2h),  # $1 in shell script
  str(library),             # $2 in shell script
])
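The safety claim can be checked with a small sketch. Arguments placed after the script in bash -c arrive as $0, $1, and so on, and are treated as data rather than being parsed as shell code. The script and the tricky value below are invented for illustration, and bash is assumed to be on your PATH.

```python
import subprocess

# Hypothetical script: print the first positional argument verbatim.
script = 'printf "%s\\n" "$1"'

# A value that would run an extra command if it were pasted into the script text.
tricky = "name with spaces; echo injected"

# '_' fills $0, so `tricky` becomes $1 and is never parsed as shell code.
out = subprocess.check_output(["bash", "-c", script, "_", tricky], text=True)
print(out, end="")
```

The value comes back unchanged and no second command runs, which is why passing the directory and library number as arguments is safer than concatenating them into the command string.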