awk: splitting a string that has no delimiter

Time: 2014-09-10 01:03:13

Tags: bash time awk split

I have the following results/stat file from a test run that I would like to analyze with awk:

$date     $time    $statname $traffic_rate $val1 $val2
20140909 132920326 stat1     30/sec        40    80
20140909 132950326 stat1     29/sec        60    20
20140909 133020326 stat1     28/sec        70    100
20140909 133050326 stat1     0/sec         0     0
20140909 133120326 stat1     0/sec         0     0
20140909 133150326 stat1     30/sec        90    50

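$time is in the format HHMMSSmmm, and the stats are generated at 30-second intervals. I need to average the $val1 and $val2 values over each run of consecutive stats whose $traffic_rate is >= '28/sec'. Stats with a traffic_rate < 28/sec should be ignored, and the process repeated for the next series of stats >= 28/sec, and so on.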

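I would like to use a bash script, and thought awk would be a good choice for parsing the columnar data. To compare consecutive timestamps with $traffic_rate >= 28/sec, I need to convert $time using mktime. However, I cannot split $time because it has no delimiter. Is there a way to split by character count, as in PHP?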

Sample output would be as follows:

test# $val   $val2
1      170/3 200/3
2      90/1  50/1

That is, each run of consecutive stats >= 28/sec is a single test result and should be calculated separately.

Also, any other suggestions for analyzing these types of patterns would be appreciated. Thanks.

2 answers:

Answer 0 (score: 2)

Using awk:

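awk -v OFS="\t" '
BEGIN { print "test#", "$val", "$val2" }
# a rate below 28/sec ends the current run: print its averages and reset
$4+0 < 28 && count {
    print ++id, val1/count, val2/count
    count = val1 = val2 = 0
}
# accumulate rows at or above 28/sec (NR>1 skips the header line)
$4+0 >= 28 && NR>1 {
    val1 += $5
    val2 += $6
    ++count
}
# print the last run, but only if the file ends inside one
END {
    if (count) print ++id, val1/count, val2/count
}' file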
test#   $val      $val2
1       56.6667  66.6667
2       90       50
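
The program above never needs to split $time, but the question's side issue (splitting a field by character count) can be handled with awk's substr(s, start, length). The following is only a minimal sketch under some assumptions: the stamp is always nine digits (HHMMSSmmm), all rows fall on the same day, and `gaps` and `to_ms` are hypothetical names introduced here for illustration. It converts each stamp to milliseconds since midnight instead of calling mktime, which is a gawk extension, and prints the gap to the previous row:

```shell
# Sketch: split HHMMSSmmm by character position with substr() and report the
# gap in milliseconds between consecutive rows read from standard input.
gaps() {
    awk '
    # hypothetical helper: HHMMSSmmm -> milliseconds since midnight
    function to_ms(t) {
        return ((substr(t, 1, 2) * 60 + substr(t, 3, 2)) * 60 + substr(t, 5, 2)) * 1000 + substr(t, 7, 3)
    }
    NR > 1 { printf "%s %d\n", $2, to_ms($2) - prev }   # gap to the previous row
    { prev = to_ms($2) }
    '
}

# Two sample rows from the question, without the header line
printf '%s\n' \
    '20140909 132920326 stat1 30/sec 40 80' \
    '20140909 132950326 stat1 29/sec 60 20' | gaps
```

A gap other than 30000 would indicate a missing sample, which could be treated as the end of a run in the same way as a rate below 28/sec.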

Answer 1 (score: 1)

With stats arriving every 30 seconds, you can use a short script that averages val1 and val2 according to traffic_rate to accomplish what you need:

#!/bin/bash

## validate data file input
[ -f "$1" ] || {
    printf "\nError: insufficient input. File '%s' not found.\n\n" "$1"
    exit 1
}

declare -i cnt=0                    # simple count variable

printf "\n    val1    val2\n\n"     # print generic header

## read each line in file
while read -r dt tm sn trf v1 v2 || [ -n "$dt" ]; do

    trf=${trf%/*}               # extract numeric traffic_rate

    if [[ $trf =~ ^[0-9]+$ ]] && (( 10#$trf >= 28 )); then    # numeric and at least 28/sec
        v1a+=( $v1 )            # add values to v1 array and v2 array
        v2a+=( $v2 )
        ((cnt++))
    else
        v1s=0                   # reset v1 sum and v2 sum
        v2s=0
        for i in ${v1a[@]}; do v1s=$((v1s+i)); done # calculate v1 sum from v1 array
        for i in ${v2a[@]}; do v2s=$((v2s+i)); done # calculate v2 sum from v2 array
        if [ $v1s -gt 0 ] && [ $v2s -gt 0 ]; then   # if both greater than 0, output
            printf "  %6s  %6s\n" \
            $( echo "scale=2; $v1s/$cnt" | bc ) $( echo "scale=2; $v2s/$cnt" | bc )
        fi
        cnt=0
        unset v1a v2a
    fi

done <"$1"

## output if array elements remain
if [ ${#v1a[@]} -gt 0 ]; then
    v1s=0
    v2s=0
    for i in ${v1a[@]}; do v1s=$((v1s+i)); done
    for i in ${v2a[@]}; do v2s=$((v2s+i)); done
    if [ $v1s -gt 0 ] && [ $v2s -gt 0 ]; then
        printf "  %6s  %6s\n" \
        $( echo "scale=2; $v1s/$cnt" | bc ) $( echo "scale=2; $v2s/$cnt" | bc )
    fi
    cnt=0
    unset v1a v2a
fi

printf "\n"

exit 0

Output:

$ bash avg30.sh dat/split.dat

    val1    val2

    56.66   66.66
    90.00   50.00
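
Note that the averages above are truncated rather than rounded, because bc with scale=2 drops the extra digits (170/3 appears as 56.66). If rounded values are preferred, one option is to let awk's printf do the formatting. This is a minimal sketch, not part of the original answer, and `avg` is a hypothetical helper name:

```shell
# avg SUM COUNT: print SUM/COUNT rounded to two decimals (hypothetical helper).
# Prints nothing when COUNT is zero, to avoid a division by zero.
avg() {
    awk -v s="$1" -v n="$2" 'BEGIN { if (n > 0) printf "%.2f\n", s / n }'
}

avg 170 3    # first run, val1
avg 200 3    # first run, val2
avg 90 1     # second run, val1
```

For the sample data this gives 56.67, 66.67 and 90.00.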