Interpolating column data

Time: 2018-09-19 01:13:38

Tags: csv unix awk interpolation

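I am trying to find a way, possibly using awk, to interpolate between two rows of data in a CSV file. Right now each row represents a data point at hour 0 and at hour 6. I would like to fill in the missing hourly data between hour 0 and hour 6.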

Current CSV

lat,lon,fhr
33.90000,-76.50000,0
34.20000,-77.00000,6

Expected interpolated output

lat,lon,fhr
33.90000,-76.50000,0
33.95000,-76.58333,1
34.00000,-76.66667,2
34.05000,-76.75000,3
34.10000,-76.83333,4
34.15000,-76.91667,5
34.20000,-77.00000,6
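
The expected values above are consistent with plain linear interpolation between the two given rows. As a sketch of the arithmetic (the symbols $t_0$, $t_1$, $y_0$, $y_1$ are introduced here for illustration and stand for the `fhr` values and the `lat` or `lon` values of the two bracketing rows):

```latex
y(t) = y_0 + (y_1 - y_0)\,\frac{t - t_0}{t_1 - t_0}, \qquad t_0 \le t \le t_1

% Worked check for lon at fhr = 1, with t_0 = 0 and t_1 = 6:
\text{lon}(1) = -76.5 + \bigl(-77.0 - (-76.5)\bigr)\cdot\frac{1 - 0}{6 - 0}
             = -76.5 - 0.08333\ldots \approx -76.58333
```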

1 answer:

Answer 0 (score: 1)

Here is an awk file that should do this:

# initialize lastTime, also used as a flag to show that the 1st data line has been read
BEGIN { lastTime=-100 }
# match data lines (the latitude may be negative)
/^-?[0-9]/{
   if (lastTime == -100) {
      # this is the first data line, print it
      print;
   } else {
      if ($3 == lastTime+1) {
         # increment of 1 hour, no need to interpolate
         print;
      } else {
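        # increment other than 1 hour, interpolate
        for (i = 1 ; i < $3 - lastTime; i = i + 1) {
            # printf keeps five decimals; plain print would shorten the values with awk's default %.6g format
            printf "%.5f,%.5f,%d\n", lastLat+($1-lastLat)*(i/($3 - lastTime)), lastLon+($2-lastLon)*(i/($3 - lastTime)), lastTime+i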
         }
         print;
      }
   }
   # save the current values for the next line
   lastTime = $3;
   lastLon = $2;
   lastLat = $1;

}
/lat/{
   # this is the header line, just print it
   print;
}

Run it as:

 awk -F, -f test.awk test.csv

I am assuming that your third column has integer values.
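
For anyone who wants to try the approach end to end, here is a minimal, self-contained sketch. It is a condensed variant and not the exact script above. The file names `sample.csv` and `interp.awk` are hypothetical. It assumes that the first line is the header, that `fhr` holds integers in increasing order, and that five decimal places are wanted in the output. It relies on the loop body simply not running when two rows are one hour apart, so no special case is needed.

```shell
#!/bin/sh
# Hypothetical sample input: the two data rows from the question.
cat > sample.csv <<'EOF'
lat,lon,fhr
33.90000,-76.50000,0
34.20000,-77.00000,6
EOF

# Hypothetical script name; a condensed sketch of the same interpolation idea.
cat > interp.awk <<'EOF'
# Header line: pass it through unchanged.
NR == 1 { print; next }

# From the second data row on, emit the missing hours before the row itself.
have_prev {
    gap = $3 - t0
    for (i = 1; i < gap; i++)    # runs zero times when gap is 1
        printf "%.5f,%.5f,%d\n", lat0 + ($1 - lat0) * i / gap, lon0 + ($2 - lon0) * i / gap, t0 + i
}

# Print the current row as-is and remember it for the next one.
{ print; lat0 = $1; lon0 = $2; t0 = $3; have_prev = 1 }
EOF

awk -F, -f interp.awk sample.csv
```

With the two sample rows this should print the header, the original rows, and the five interpolated rows for hours 1 to 5. Rows taken from the input are printed unchanged, so only the interpolated rows go through the `%.5f` formatting.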