What is the best practice for aggregating JSON data?

Time: 2015-03-19 21:15:46

Tags: json d3.js aggregate c3

Suppose I have a dataset in the following format:

var smallTestData = [
{"YEAR": "2009", "MONTH": "1", "CUSTOMER": "Customer1", "REVENUE": "1938.49488391425"},
{"YEAR": "2009", "MONTH": "1", "CUSTOMER": "Customer2", "REVENUE": "75.9142774343491"},
{"YEAR": "2009", "MONTH": "1", "CUSTOMER": "Customer2", "REVENUE": "99.3456067931875"} ...];

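Now, if I want to draw a D3 or C3 chart that shows the cumulative revenue per customer for each year and month, I think I would need to end up with something like this: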

   [{"yearMonth":"2009 1","revenueCustomer1":158989,"revenueCustomer2":68181},
    {"yearMonth":"2009 2","revenueCustomer1":171217,"revenueCustomer2":204975},
    {"yearMonth":"2009 3","revenueCustomer1":38477,"revenueCustomer2":46605} ...];

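Admittedly this does not look elegant, but that is fine. The worst part is aggregating a measure (e.g. revenue) over multiple dimensions (e.g. year, month, customer), which is painful to do with JSON data.

I have tried writing my own aggregation functions to solve this, but apart from manually grouping the values together I could not find any satisfying solution. Can anyone point me in the right direction? How would you aggregate the data to fit the kind of chart I described? Are there any ready-made solutions?

While we are on the subject, how would you join two or more datasets on a primary key?

Thanks!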

2 answers:

Answer 0 (score: 6)

This seems like a great time to break out d3.nest(). See https://github.com/mbostock/d3/wiki/Arrays#-nest for details.

For your data and what you want to do with it, you can do the following:

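var nestedData = d3.nest()
    .key(function(d) { return d.YEAR + " " + d.MONTH; }) // group first by "year month"
    .key(function(d) { return d.CUSTOMER; })             // then by customer
    .rollup(function(leaves) {
        // sum the revenue of all rows in the group; + converts the string to a number
        return d3.sum(leaves, function(d) {
            return +d.REVENUE;
        });
    })
    .entries(smallTestData);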

This will return an object similar to:

[
    {
        key: "2009 1",
        values: [
            {
                key: "Customer1",
                values: 1938.49488391425
            },
            {
                key: "Customer2",
                values: 175.2598842275366
            }
        ]
    },
    { ... }
]

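The way d3.nest works is that the key functions define the object properties you want to aggregate on, and the rollup function is used to summarize all the data that matches the key functions. Each key function creates a new level of nesting. If you do not supply a rollup function, each key's values is simply an array of all the data items that match the value defined in the key function.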
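To get from this nested result to the flat, one-row-per-month shape sketched in the question, the entries can be pivoted in a second pass. The following is only a minimal sketch: toWideRows is a hypothetical helper (not part of d3), the nested input is hard-coded so it runs without d3, and the second group uses a made-up value purely for illustration.

```javascript
// Shape produced by d3.nest() with two keys and a rollup, via .entries().
// Hard-coded here so the sketch runs without d3.
var nestedData = [
  { key: "2009 1", values: [
    { key: "Customer1", values: 1938.49488391425 },
    { key: "Customer2", values: 175.2598842275366 }
  ]},
  { key: "2009 2", values: [
    { key: "Customer1", values: 100 } // made-up value for illustration
  ]}
];

// Hypothetical helper: one flat row per yearMonth, one column per customer.
function toWideRows(nested) {
  return nested.map(function (group) {
    var row = { yearMonth: group.key };
    group.values.forEach(function (customer) {
      row["revenue" + customer.key] = customer.values;
    });
    return row;
  });
}

var wideRows = toWideRows(nestedData);
console.log(wideRows);
```

Note that a customer with no rows in a given month simply has no column in that row, so a chart library may need those gaps filled with 0 or null first.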

Answer 1 (score: 3)

var smallTestData = [
  {"YEAR": "2009", "MONTH": "1", "CUSTOMER": "Customer1", "REVENUE": "1938.49488391425"},
  {"YEAR": "2009", "MONTH": "1", "CUSTOMER": "Customer2", "REVENUE": "75.9142774343491"},
  {"YEAR": "2009", "MONTH": "1", "CUSTOMER": "Customer2", "REVENUE": "99.3456067931875"},
  {"YEAR": "2009", "MONTH": "2", "CUSTOMER": "Customer1", "REVENUE": "1938.49488391425"},
  {"YEAR": "2009", "MONTH": "2", "CUSTOMER": "Customer2", "REVENUE": "75.9142774343491"},
  {"YEAR": "2009", "MONTH": "2", "CUSTOMER": "Customer2", "REVENUE": "99.3456067931875"},
  {"YEAR": "2008", "MONTH": "1", "CUSTOMER": "Customer1", "REVENUE": "1938.49488391425"},
  {"YEAR": "2008", "MONTH": "1", "CUSTOMER": "Customer1", "REVENUE": "75.9142774343491"},
  {"YEAR": "2008", "MONTH": "1", "CUSTOMER": "Customer2", "REVENUE": "99.3456067931875"},
  {"YEAR": "2008", "MONTH": "2", "CUSTOMER": "Customer1", "REVENUE": "1938.49488391425"},
  {"YEAR": "2008", "MONTH": "2", "CUSTOMER": "Customer1", "REVENUE": "75.9142774343491"},
  {"YEAR": "2008", "MONTH": "2", "CUSTOMER": "Customer2", "REVENUE": "99.3456067931875"},
  {"YEAR": "2007", "MONTH": "1", "CUSTOMER": "Customer1", "REVENUE": "1938.49488391425"},
  {"YEAR": "2007", "MONTH": "1", "CUSTOMER": "Customer2", "REVENUE": "75.9142774343491"},
  {"YEAR": "2007", "MONTH": "1", "CUSTOMER": "Customer2", "REVENUE": "99.3456067931875"}
];

var nested = d3.nest()
  .key(function(d) { return d.CUSTOMER; }) // nest first by customer
  .key(function(d) { return d.YEAR; }) // then-by year
  .key(function(d) { return d.MONTH; }) // then-by month
  .rollup(function(values) {
    // REVENUE is stored as a string; + makes the numeric conversion explicit
    return d3.sum(values, function(d) { return +d.REVENUE; });
  })
  .map(smallTestData);

console.log(nested);
<script src="https://cdnjs.cloudflare.com/ajax/libs/d3/3.4.11/d3.min.js"></script>

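Run this snippet, open the console window, and look at the logged result. With .map(smallTestData) the result is a set of nested objects keyed by customer, then year, then month. If you use .entries(smallTestData) instead of .map(smallTestData), you get the same thing but expressed as a series of nested arrays, which may be more convenient for binding to d3 selections (using their .data() method).

Changing the order of the .key() functions controls the order of nesting.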
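To make the effect of key order concrete without loading d3, here is a rough plain-JavaScript sketch of the same idea. nestSum is a hypothetical helper, not a d3 function; it only imitates what key functions plus a summing rollup do, using the first three rows from the question.

```javascript
// Hypothetical helper (not part of d3): nest rows by each key function in
// order, and sum valueFn(row) at the innermost level.
function nestSum(rows, keyFns, valueFn) {
  var root = {};
  rows.forEach(function (row) {
    var node = root;
    keyFns.forEach(function (keyFn, depth) {
      var k = keyFn(row);
      if (depth === keyFns.length - 1) {
        node[k] = (node[k] || 0) + valueFn(row); // innermost level: accumulate
      } else {
        node = node[k] || (node[k] = {});        // descend, creating the level if needed
      }
    });
  });
  return root;
}

var rows = [
  {"YEAR": "2009", "MONTH": "1", "CUSTOMER": "Customer1", "REVENUE": "1938.49488391425"},
  {"YEAR": "2009", "MONTH": "1", "CUSTOMER": "Customer2", "REVENUE": "75.9142774343491"},
  {"YEAR": "2009", "MONTH": "1", "CUSTOMER": "Customer2", "REVENUE": "99.3456067931875"}
];

function customer(d) { return d.CUSTOMER; }
function year(d) { return d.YEAR; }
function month(d) { return d.MONTH; }
function revenue(d) { return +d.REVENUE; } // string to number

// Same rows, two nesting orders.
var byCustomer = nestSum(rows, [customer, year, month], revenue);
var byYear = nestSum(rows, [year, customer, month], revenue);

console.log(byCustomer.Customer2["2009"]["1"]); // customer -> year -> month
console.log(byYear["2009"].Customer2["1"]);     // year -> customer -> month
```

Both lookups reach the same sum; only the path through the structure changes with the order of the key functions.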