Clojure / dataset: grouping by multiple columns hierarchically?

Time: 2014-08-19 15:05:32

Tags: clojure group-by dataset

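I would like to implement a function that groups a dataset by multiple columns hierarchically. The following tentative implementation for two columns illustrates what I am after: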

(defn group-by-two-columns-hierarchically
  [col1 col2 table]
  (let [data-by-col1 ($group-by col1 table)
        data-further-by-col2 (into {} (for [[k v] data-by-col1] [k ($group-by col2 v)]))]
    data-further-by-col2))

I am looking for help with generalizing this to an arbitrary number of columns.

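(As far as I know, Incanter supports grouping by multiple columns, but the result is not hierarchical: it is a flat map from composite multi-column keys to dataset values.)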
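
To make the difference between the two shapes concrete, here is a minimal sketch in plain Clojure. It deliberately uses clojure.core/group-by on a vector of maps rather than Incanter, and the sample rows and the names rows, flat and nested are made up for illustration only.

```clojure
;; Plain-Clojure illustration (no Incanter); `rows` is made-up sample data.
(def rows
  [{:x1 1 :x2 2 :x3 3}
   {:x1 1 :x2 2 :x3 30}
   {:x1 4 :x2 5 :x3 6}])

;; Flat grouping: a single map whose keys are composite [x1 x2] vectors.
(def flat (group-by (juxt :x1 :x2) rows))

;; Hierarchical grouping: outer map keyed by x1, inner maps keyed by x2.
(def nested
  (into {}
        (for [[k v] (group-by :x1 rows)]
          [k (group-by :x2 v)])))

(get flat [1 2])       ; rows with x1 = 1 and x2 = 2, looked up by composite key
(get-in nested [1 2])  ; the same rows, reached one level at a time
```

Both lookups return the same rows; only the nested form lets you take everything under a single x1 value with one (get nested 1).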

Thanks for your help!

Note: to make Michał's solution work with Incanter datasets, only a small modification is needed: replace "group-by" with "incanter.core/$group-by", as the following experiment shows:

(defn group-by*
      "Similar to group-by, but takes a collection of functions and returns
      a hierarchically grouped result."
      [fs coll]
      (if-let [f (first fs)]
        (into {} (map (fn [[k vs]]
                        [k (group-by* (next fs) vs)])
                   (incanter.core/$group-by f coll)))
        coll))

(def table (incanter.core/dataset ["x1" "x2" "x3"]
                                      [[1 2 3]
                                       [1 2 30]
                                       [4 5 6]
                                       [4 5 60]
                                       [7 8 9]]))


(group-by* [:x1 :x2] table)
=>
    {{:x1 1} {{:x2 2} 
        | x1 | x2 | x3 |
        |----+----+----|
        |  1 |  2 |  3 |
        |  1 |  2 | 30 |
        }, 
    {:x1 4} {{:x2 5} 
        | x1 | x2 | x3 |
        |----+----+----|
        |  4 |  5 |  6 |
        |  4 |  5 | 60 |
        }, 
    {:x1 7} {{:x2 8} 
        | x1 | x2 | x3 |
        |----+----+----|
        |  7 |  8 |  9 |
        }}

1 answer:

Answer 0 (score: 3)

(defn group-by*
  "Similar to group-by, but takes a collection of functions and returns
  a hierarchically grouped result."
  [fs coll]
  (if-let [f (first fs)]
    (into {} (map (fn [[k vs]]
                    [k (group-by* (next fs) vs)])
               (group-by f coll)))
    coll))

Example:

user> (group-by* [:foo :bar :quux]
        [{:foo 1 :bar 1 :quux 1 :asdf 1}
         {:foo 1 :bar 1 :quux 2 :asdf 2}
         {:foo 1 :bar 2 :quux 1 :asdf 3}
         {:foo 1 :bar 2 :quux 2 :asdf 4}
         {:foo 2 :bar 1 :quux 1 :asdf 5}
         {:foo 2 :bar 1 :quux 2 :asdf 6}
         {:foo 2 :bar 2 :quux 1 :asdf 7}
         {:foo 2 :bar 2 :quux 2 :asdf 8}
         {:foo 1 :bar 1 :quux 1 :asdf 9}
         {:foo 1 :bar 1 :quux 2 :asdf 10}
         {:foo 1 :bar 2 :quux 1 :asdf 11}
         {:foo 1 :bar 2 :quux 2 :asdf 12}
         {:foo 2 :bar 1 :quux 1 :asdf 13}
         {:foo 2 :bar 1 :quux 2 :asdf 14}
         {:foo 2 :bar 2 :quux 1 :asdf 15}
         {:foo 2 :bar 2 :quux 2 :asdf 16}])
{1 {1 {1 [{:asdf 1, :bar 1, :foo 1, :quux 1}
          {:asdf 9, :bar 1, :foo 1, :quux 1}],
       2 [{:asdf 2, :bar 1, :foo 1, :quux 2}
          {:asdf 10, :bar 1, :foo 1, :quux 2}]},
    2 {1 [{:asdf 3, :bar 2, :foo 1, :quux 1}
          {:asdf 11, :bar 2, :foo 1, :quux 1}],
       2 [{:asdf 4, :bar 2, :foo 1, :quux 2}
          {:asdf 12, :bar 2, :foo 1, :quux 2}]}},
 2 {1 {1 [{:asdf 5, :bar 1, :foo 2, :quux 1}
          {:asdf 13, :bar 1, :foo 2, :quux 1}],
       2 [{:asdf 6, :bar 1, :foo 2, :quux 2}
          {:asdf 14, :bar 1, :foo 2, :quux 2}]},
    2 {1 [{:asdf 7, :bar 2, :foo 2, :quux 1}
          {:asdf 15, :bar 2, :foo 2, :quux 1}],
       2 [{:asdf 8, :bar 2, :foo 2, :quux 2}
          {:asdf 16, :bar 2, :foo 2, :quux 2}]}}}