Calculating the average of values in a JSON array

Date: 2018-01-19 01:16:00

Tags: arrays json shell average jq

I am working with a JSON file in this format:

  {
  "Response" : {
    "TimeUnit" : [ 1516298400000, 1516302000000, 1516305600000, 1516309200000, 1516312800000, 1516316400000 ],
    "metaData" : {
      "errors" : [ ],
      "notices" : [ "Source:Postgres", "Limit applied: 14400", "PG Host:ruappg0ro.apigeeks.net", "Metric with Avg of total_response_time was requested. For this a global avg was also computed with name global-avg-total_response_time", "query served by:88bec25a-ef48-464e-b41d-e447e3beeb88", "Table used: edge.api.faxgroupusenondn012.agg_api" ]
    },
    "stats" : {
      "data" : [ {
        "identifier" : {
          "names" : [ "apiproxy" ],
          "values" : [ "test" ]
        },
        "metric" : [ {
          "env" : "test",
          "name" : "sum(message_count)",
          "values" : [ 28.0, 129.0, 24.0, 20.0, 71.0, 30.0 ]
        }, {
          "env" : "test",
          "name" : "avg(total_response_time)",
          "values" : [ 312.57142857142856, 344.2480620155039, 374.2083333333333, 381.1, 350.67605633802816, 363.8 ]
        }, {
          "env" : "test",
          "name" : "sum(is_error)",
          "values" : [ 0.0, 0.0, 0.0, 0.0, 0.0, 0.0 ]
        }, {
          "env" : "test",
          "name" : "global-avg-total_response_time",
          "values" : [ 349.5860927152318 ]
        } ]
      }, {
        "identifier" : {
          "names" : [ "apiproxy" ],
          "values" : [ "test2" ]
        },
        "metric" : [ {
          "env" : "test",
          "name" : "sum(message_count)",
          "values" : [ 0.0, 0.0, 0.0, 16.0, 137.0, 100.0 ]
        }, {
          "env" : "test",
          "name" : "avg(total_response_time)",
          "values" : [ 0.0, 0.0, 0.0, 237.4375, 198.02189781021897, 189.44 ]
        }, {
          "env" : "test",
          "name" : "sum(is_error)",
          "values" : [ 0.0, 0.0, 0.0, 16.0, 137.0, 100.0 ]
        }, {
          "env" : "test",
          "name" : "global-avg-total_response_time",
          "values" : [ 197.12252964426878 ]
        } ]
      }, {
        "identifier" : {
          "names" : [ "apiproxy" ],
          "values" : [ "appdyn" ]
        },
        "metric" : [ {
          "env" : "test",
          "name" : "sum(message_count)",
          "values" : [ 0.0, 0.0, 0.0, 11.0, 137.0, 98.0 ]
        }, {
          "env" : "test",
          "name" : "avg(total_response_time)",
          "values" : [ 0.0, 0.0, 0.0, 170.0, 161.57664233576642, 149.16326530612244 ]
        }, {
          "env" : "test",
          "name" : "sum(is_error)",
          "values" : [ 0.0, 0.0, 0.0, 11.0, 137.0, 98.0 ]
        }, {
          "env" : "test",
          "name" : "global-avg-total_response_time",
          "values" : [ 157.0081300813008 ]
        } ]
      }, {
        "identifier" : {
          "names" : [ "apiproxy" ],
          "values" : [ "AppDyn" ]
        },
        "metric" : [ {
          "env" : "test",
          "name" : "sum(message_count)",
          "values" : [ 0.0, 0.0, 0.0, 3.0, 0.0, 0.0 ]
        }, {
          "env" : "test",
          "name" : "avg(total_response_time)",
          "values" : [ 0.0, 0.0, 0.0, 39.333333333333336, 0.0, 0.0 ]
        }, {
          "env" : "test",
          "name" : "sum(is_error)",
          "values" : [ 0.0, 0.0, 0.0, 0.0, 0.0, 0.0 ]
        }, {
          "env" : "test",
          "name" : "global-avg-total_response_time",
          "values" : [ 39.333333333333336 ]
        } ]
      } ]
    }
  }
}

and would like to iterate over the identifiers and, for each one, compute the average of all the values under "name" : "avg(total_response_time)".

I have made a few attempts, but I don't really know how to proceed. The number of identifiers and of avg(total_response_time) entries will vary.

for identifier in $(cat response4.json | jq -r '.[].stats.data[].identifier.values' | sed 's/[][]//g' | sed 's/"//g'); do
    echo ${identifier}
    avg_response_time=$(cat response4.json | jq -r '.[].stats.data[].metric[]') #don't know how to iterate through the
done

Any help or ideas would be greatly appreciated.

2 Answers:

Answer 0 (score: 1)

First, for clarity, here is a stream-oriented helper function:

def average(s): 
  reduce s as $x (null; .sum += $x | .n += 1)
  | if . == null then null else .sum / .n end;
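
As a quick check of the helper, the sketch below calls it on a literal stream and on an empty one. The inputs 1, 2, 3 and the shell variable name result are made up for illustration; jq is run with -n, so no input file is needed.

```shell
# Minimal sketch: exercise average/1 on literal streams (illustrative inputs).
result=$(jq -n -c '
  def average(s):
    reduce s as $x (null; .sum += $x | .n += 1)
    | if . == null then null else .sum / .n end;
  # A three-item stream, then an empty stream.
  [average(1, 2, 3), average(empty)]
')
echo "$result"
```

Because the reduction starts from null, an empty stream leaves it at null and the helper returns null instead of attempting a division.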

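Next, we have a choice. We can either process each item in the .stats.data array separately, or group the items by the value of .identifier. For the sample data the results are the same (except possibly for ordering), but let's consider the two cases separately: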

Average for each item in .stats.data

.Response.stats.data[]
| {id: (.identifier.values),
   average: average(.metric[]
     | select(.name == "avg(total_response_time)")
     | .values[]) }
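
To run this as a single command, the helper definition and the filter go into the same jq program. The sketch below assumes a trimmed, made-up sample (identifiers "a" and "b", stored in a hypothetical shell variable named sample) rather than the real file.

```shell
# Minimal sketch: helper plus per-item filter in one jq program.
# The sample document is invented and keeps only the fields the filter reads.
sample='{"Response":{"stats":{"data":[
  {"identifier":{"names":["apiproxy"],"values":["a"]},
   "metric":[{"name":"sum(message_count)","values":[5,7]},
             {"name":"avg(total_response_time)","values":[10,20,30]}]},
  {"identifier":{"names":["apiproxy"],"values":["b"]},
   "metric":[{"name":"avg(total_response_time)","values":[0,4]}]}
]}}}'

per_item=$(printf '%s' "$sample" | jq -c '
  def average(s):
    reduce s as $x (null; .sum += $x | .n += 1)
    | if . == null then null else .sum / .n end;
  .Response.stats.data[]
  | {id: .identifier.values,
     average: average(.metric[]
       | select(.name == "avg(total_response_time)")
       | .values[])}
')
echo "$per_item"
```

With the real data, the printf would be replaced by reading the JSON file, for example by passing its name as the last argument to jq.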

Grouping by .identifier

.Response.stats.data
| group_by(.identifier)[]
| {id: (.[0].identifier.values),
   average: average(.[].metric[]
     | select(.name == "avg(total_response_time)")
     | .values[] ) }

Output

{"id":["test"],"average":354.43398004304896}
{"id":["test2"],"average":104.14989963503649}
{"id":["appdyn"],"average":80.1233179403148}
{"id":["AppDyn"],"average":6.555555555555556}
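
The two approaches only give different results when the same identifier occurs more than once in .stats.data. The sketch below illustrates that on made-up data in which "x" appears twice; the sample, the shell variable both and the jq helper named rt are hypothetical names introduced only for this comparison.

```shell
# Minimal sketch: per-item vs. grouped averages on invented data.
sample='{"Response":{"stats":{"data":[
  {"identifier":{"values":["x"]},
   "metric":[{"name":"avg(total_response_time)","values":[10,20]}]},
  {"identifier":{"values":["x"]},
   "metric":[{"name":"avg(total_response_time)","values":[30]}]},
  {"identifier":{"values":["y"]},
   "metric":[{"name":"avg(total_response_time)","values":[4]}]}
]}}}'

both=$(printf '%s' "$sample" | jq -c '
  def average(s):
    reduce s as $x (null; .sum += $x | .n += 1)
    | if . == null then null else .sum / .n end;
  # rt: stream of response-time values of one data item (hypothetical helper).
  def rt: .metric[] | select(.name == "avg(total_response_time)") | .values[];
  .Response.stats.data
  | {per_item: [ .[] | {id: .identifier.values, average: average(rt)} ],
     grouped:  [ group_by(.identifier)[]
                 | {id: .[0].identifier.values, average: average(.[] | rt)} ]}
')
echo "$both"
```

Here per_item reports "x" twice (averages 15 and 30), while grouped pools the three "x" values into a single average of 20.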

Answer 1 (score: 0)

jq -r '
    .[].stats.data[]
    | (.identifier.values[]) as $identifier
    | (.metric[]
       | select(.name == "avg(total_response_time)")
       | .values
      ) as $values
    | [$identifier, ($values | add) / ($values | length)]
    | @tsv
    ' <test.json

...yields:

test    354.43398004304896
test2   104.14989963503649
appdyn  80.1233179403148
AppDyn  6.555555555555556
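
Since the question started from a shell loop, the tab-separated output can be consumed line by line with read. The following is only a sketch: the sample data, the variable names and the message text are invented for illustration, and it assumes the identifiers contain no tabs.

```shell
# Minimal sketch: loop over the TSV rows in POSIX sh (invented sample data).
sample='{"Response":{"stats":{"data":[
  {"identifier":{"names":["apiproxy"],"values":["a"]},
   "metric":[{"name":"avg(total_response_time)","values":[10,20,30]}]},
  {"identifier":{"names":["apiproxy"],"values":["b"]},
   "metric":[{"name":"avg(total_response_time)","values":[0,4]}]}
]}}}'

tab=$(printf '\t')   # portable way to get a literal tab for IFS
report=$(
  printf '%s' "$sample" | jq -r '
    .[].stats.data[]
    | (.identifier.values[]) as $identifier
    | (.metric[]
       | select(.name == "avg(total_response_time)")
       | .values
      ) as $values
    | [$identifier, ($values | add) / ($values | length)]
    | @tsv
  ' | while IFS="$tab" read -r identifier avg; do
    # One iteration per identifier; both fields are plain shell strings here.
    echo "$identifier: average response time $avg"
  done
)
echo "$report"
```

This keeps all the JSON handling inside one jq call and leaves the shell to deal only with simple two-column lines.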