Question

My folder structure looks like this:

$ tree
.
├── Original_folder
│   └── cat.txt
├── folder1
│   └── cat.txt
├── folder2
│   └── cat.txt
└── folder3
    └── cat.txt

Each cat.txt file has 5 lines before the line with the column headers. A sample cat.txt file looks like this:

Version LRv1.10.0
Build date 2017-12-06
MOL-calc
PRESSURE
!                       
      Time[s]     InletT[K]   InletP[Pa]   O2_GasOut     C_GasOut
       100         0.000885   1000000       0.0007       0.2111
and so on....

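I want to plot the first column against every column whose header contains the keyword "_GasOut". (The number of headers with this keyword is unknown, and I want a separate plot for each such column.) In addition, for every plot from folder1, folder2, folder3, and so on, the corresponding curve from Original_folder should be drawn in the same graph.

Each plot should be saved in its respective folder.

N.B.: The number of folders is not fixed.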
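
As a starting point for the requirement above, here is a minimal sketch of how the numbers of the wanted columns could be looked up by header name. The function name `list_gasout_columns` is hypothetical, and the sketch assumes whitespace-separated columns (as in the sample shown) and that the header is the first line containing "_GasOut"; a tab-separated file whose header names contain spaces would need `-F '\t'` instead.

```shell
#!/bin/sh
# list_gasout_columns FILE   (hypothetical helper, not from the original post)
# Prints "<column-number> <header>" for every header containing "_GasOut".
# Assumptions: columns are separated by whitespace, and the header line is
# the first line in FILE that contains "_GasOut".
list_gasout_columns() {
    awk '
        /_GasOut/ {
            for (f = 1; f <= NF; f++)
                if (index($f, "_GasOut")) print f, $f
            exit    # the header line has been handled; skip the data rows
        }
    ' "$1"
}
```

Calling `list_gasout_columns folder1/cat.txt` would print one line per matching column, and the first field of each line can then be used in a gnuplot `using 1:N` clause.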

One possible solution I found looks like this:

#!/bin/bash

## truncate tmp.plt and set line style
echo -e "set style data lines\nplot \\" > tmp.plt

cnt=0   ## flag for adding ',' line ending

## loop over each file
for i in folder*/data; do
    if ((cnt == 0)); then       ## check flag (skips first iteration)
        cnt=1                   ## set flag to write ending comma
    else
        printf ",\n" >> tmp.plt             ## write comma
    fi
    printf "\"$i\" using 1:2,\n" >> tmp.plt ## write using 1:2
    printf "\"$i\" using 1:3" >> tmp.plt    ## write using 1:3 (no ending)
done
echo "" >> tmp.plt              ## write final newline

This creates a tmp.plt file, which then has to be run with

gnuplot -p tmp.plt

But this relies on column numbers rather than on the names of the column headers. I have added one of the cat.txt files for reference: https://1drv.ms/t/s!Aoomvi55MLAQh1wMmpnPGnliFmgg

Answer 1

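Updated Answer

I know too little about gnuplot to understand the syntax described in the comments. I have added code that saves the column headers and trims all surrounding whitespace, and code that derives the folder name from the filename, but I don't know how to use them, so I just print them. See the line marked "FIXME" in the code!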

#!/bin/bash

gawk -F $'\t' '                                     # Using TABs as field separators
   /_GasOut/{                                       # On lines containing "_GasOut"
      for(f=1;f<=NF;f++){                           # ... iterate over all fields on line
         hdr=$f                                     # ... picking them up
         colhdr[f]=hdr                              # ... saving the column headers
         if(index(hdr,"_GasOut"))wanted[f]=1        # ... and noting which ones we want to print
      }
   }
   ENDFILE{                                         # As we reach end of each file
      for(f in wanted){                             # ... iterate over wanted fields
         if(length(cmds)) cmds = cmds ",\n"         # ... adding commas and newlines if needed
         hdr = colhdr[f]                            # ... grabbing column header
         gsub(/^[[:space:]]+|[[:space:]]+$/,"",hdr) # ... trim leading or trailing spaces
         folder = FILENAME
         gsub(/\/cat.txt/,"",folder)                # ... deriving foldername
         print "hdr=", hdr, ", folder=", folder     # FIXME
         cmds = cmds "\"" FILENAME "\" using 1:" f  # ... and adding the "using" statement
      }
      delete wanted                                 # Forget list of wanted fields for next file
   }
   END{                                             # At very end of last file
      print cmds                                    # ... print accumulated gnuplot cmds
   }

   ' folder*/cat.txt

Sample Output

hdr= O2_GasOut , folder= folder1
hdr= H2O_GasOut , folder= folder1
hdr= H2_GasOut , folder= folder1
hdr= N2_GasOut , folder= folder1
hdr= NO_GasOut , folder= folder1
hdr= NO2_GasOut , folder= folder1
hdr= N2O_GasOut , folder= folder1
hdr= O2_GasOut , folder= folder2
hdr= H2O_GasOut , folder= folder2
hdr= H2_GasOut , folder= folder2
hdr= N2_GasOut , folder= folder2
hdr= NO_GasOut , folder= folder2
hdr= NO2_GasOut , folder= folder2
hdr= N2O_GasOut , folder= folder2
"folder1/cat.txt" using 1:22,
"folder1/cat.txt" using 1:23,
"folder1/cat.txt" using 1:24,
"folder1/cat.txt" using 1:25,
"folder1/cat.txt" using 1:26,
"folder1/cat.txt" using 1:27,
"folder1/cat.txt" using 1:28,
"folder2/cat.txt" using 1:22,
"folder2/cat.txt" using 1:23,
"folder2/cat.txt" using 1:24,
"folder2/cat.txt" using 1:25,
"folder2/cat.txt" using 1:26,
"folder2/cat.txt" using 1:27,
"folder2/cat.txt" using 1:28

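One way the trimmed header and the folder name could be put to use is sketched below. This is not the answerer's method: the function name `emit_plot_cmds`, the PNG terminal, and the output file naming are assumptions made for illustration. The sketch also assumes tab-separated columns (as in the gawk script) and that Original_folder/cat.txt has the same column layout as the file being scanned. Because the header is not consumed by gnuplot here, the `skip` count is the header's own line number.

```shell
#!/bin/sh
# emit_plot_cmds FOLDER   (hypothetical helper, for illustration only)
# Prints gnuplot commands that draw one PNG per "_GasOut" column of
# FOLDER/cat.txt, overlaying the same column of Original_folder/cat.txt.
# Assumptions: TAB-separated columns; both files share one column layout.
emit_plot_cmds() {
    awk -F '\t' -v folder="$1" -v orig="Original_folder" '
        /_GasOut/ {
            print "set datafile separator \"\\t\""
            print "set terminal png"
            print "set style data lines"
            for (f = 1; f <= NF; f++) {
                hdr = $f
                gsub(/^[ \t\r]+|[ \t\r]+$/, "", hdr)   # trim the header
                if (!index(hdr, "_GasOut")) continue
                printf "set output \"%s/%s.png\"\n", folder, hdr
                printf "set title \"%s\" noenhanced\n", hdr
                # NR is the header line number, so "skip NR" starts at the data
                printf "plot \"%s/cat.txt\" skip %d using 1:%d title \"%s\" noenhanced, \\\n", orig, NR, f, orig
                printf "     \"%s/cat.txt\" skip %d using 1:%d title \"%s\" noenhanced\n", folder, NR, f, folder
            }
            print "unset output"
            exit
        }
    ' "$1/cat.txt"
}
```

A possible way to run it for every folder is `for d in folder*/; do emit_plot_cmds "${d%/}" | gnuplot; done`, which leaves the images inside the folder they belong to.
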
Original Answer

I can't seem to count the fields the same way you do, but this is what I have:

#!/bin/bash

gawk -F $'\t' '                                     # Using TABs as field separators
   /_GasOut/{                                       # On lines containing "_GasOut"
      for(f=1;f<=NF;f++){                           # ... iterate over all fields on line
         this=$f                                    # ... picking them up
         if(index(this,"_GasOut"))wanted[f]=1       # ... and noting which ones we want to print
      }
   }
   ENDFILE{                                         # As we reach end of each file
      for(f in wanted){                             # ... iterate over wanted fields
         if(length(cmds)) cmds = cmds ",\n"         # ... adding commas and newlines if needed
         cmds = cmds "\"" FILENAME "\" using 1:" f  # ... and adding the "using" statement
      }
      delete wanted                                 # Forget list of wanted fields for next file
   }
   END{                                             # At very end of last file
      print cmds                                    # ... print accumulated gnuplot cmds
   }

   ' folder*/cat.txt

Here is the sample output:

"folder1/cat.txt" using 1:22,
"folder1/cat.txt" using 1:23,
"folder1/cat.txt" using 1:24,
"folder1/cat.txt" using 1:25,
"folder1/cat.txt" using 1:26,
"folder1/cat.txt" using 1:27,
"folder1/cat.txt" using 1:28,
"folder2/cat.txt" using 1:22,
"folder2/cat.txt" using 1:23,
"folder2/cat.txt" using 1:24,
"folder2/cat.txt" using 1:25,
"folder2/cat.txt" using 1:26,
"folder2/cat.txt" using 1:27,
"folder2/cat.txt" using 1:28

Answer 2

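This can be done entirely in gnuplot.

I'm not sure where the script is supposed to get the names that match the desired column headers, but below is a simple gnuplot script that works for a single known column header, followed by a slightly more complex one that assumes you have collected the desired names into an array of strings.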

set term pdf
set key autotitle columnheader   # not strictly necessary but ensures the program
                                 # interprets the first line of data as headers
HEADER = "C_GasOut"              # must match one column header exactly
orig = "Original_folder/cat.txt"
do for [f=1:10] {                # 10 folders 
  in = sprintf("folder%d/cat.txt", f)
  out = sprintf("folder%d/plot.pdf", f)
  set output out                 # write the plot into the folder it belongs to
  plot orig skip 5 using 1:(column(HEADER)), \
       in skip 5 using 1:(column(HEADER))
}
unset output                     # close the last PDF

Now let's do the same thing, but with an added loop over multiple column headers.

set term pdf
set key autotitle columnheader

array HEADS = [ "C_GasOut", "SurfaceT_cell7[K]", "O2_intMassFlowOut[kg]", \
                  "H2O_Conv", "InletT[K]" ]
orig = "Original_folder/cat.txt"

do for [h = 1 : |HEADS|] {
  HEADER = HEADS[h]
  do for [f=1:10] { 
    in = sprintf("folder%d/cat.txt", f)
    out = sprintf("folder%d/plot_%d.pdf", f, h)   # one file per folder and header
    set output out
    plot orig skip 5 using 1:(column(HEADER)), \
         in skip 5 using 1:(column(HEADER))
  }
}
unset output

If the names or the number of the subfolders are not known in advance, you can replace the loop over [f=1:10] with a loop over a list of folder names. For example:

folders = system("ls -1d fold*")   # -d lists the folders themselves, not their contents
nfolders = words(folders)
do for [f = 1 : nfolders] {
    in = sprintf("%s/cat.txt", word(folders,f))
    ... etc
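
The folder list passed to gnuplot could also be produced by a small shell helper rather than a bare `ls`. The sketch below is only an illustration: the function name `list_data_folders` is hypothetical, and it assumes that folder names contain no whitespace (gnuplot's `word()` splits on whitespace) and that the reference data lives in Original_folder.

```shell
#!/bin/sh
# list_data_folders   (hypothetical helper, for illustration only)
# Prints, one per line, every subdirectory of the current directory that
# contains a cat.txt, except the reference folder Original_folder.
# Assumption: folder names contain no whitespace.
list_data_folders() {
    for d in */; do
        d=${d%/}                          # drop the trailing slash
        if [ "$d" = "Original_folder" ]; then
            continue
        fi
        if [ -f "$d/cat.txt" ]; then
            printf '%s\n' "$d"
        fi
    done
    return 0
}
```

If the function is saved in an executable script that calls it, gnuplot's `system()` could invoke that script in place of the `ls` call, so that only folders that really contain data are visited.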

Answer 3

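This example shows how to plot all columns whose header contains the substring "_GasOut". The strategy is to test each column header for the substring. If it matches, the column values are plotted; if it does not, all values are treated as NaN. The command set datafile missing NaN then causes that entire plot to be skipped.

I do not repeat here the iteration over subdirectories shown in the previous example.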

set rmargin at screen 0.85
set key reverse Left top left at screen 0.85, 0.9
set log y

data = 'cat.txt'
set datafile missing NaN

plot for [i=1:99] data skip 5 using 1:(strstrt(columnhead(i),"_GasOut") ? column(i) : NaN) title columnhead(i) noenhanced lw 2

Gnuplot: plotting by reading the column headers
