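Summing an array of hashes in Ruby

Time: 2015-12-08 18:59:46

Tags: arrays ruby

I have a Ruby array of 3 hashes. Each element has information about report_data (consumption of 2 types of energy) and monthes_data (the same in each). Please see the code below.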

arr = [{:report_data=>
  [{:type=>
      {"id"=>1, "name"=>"electricity"},
    :data=>[10, 20, 30, 40]},
    {:type=>
      {"id"=>2, "name"=>"water"},
    :data=>[20, 30, 40, 50]}],
  :monthes_data=>
    {:monthes=>
      ["jan", "feb"]}},

 {:report_data=>
  [{:type=>
      {"id"=>1, "name"=>"electricity"},
    :data=>[15, 25, 35, 45]},
    {:type=>
      {"id"=>2, "name"=>"water"},
    :data=>[25, 35, 45, 55]}],
  :monthes_data=>
    {:monthes=>
      ["jan", "feb"]}},

 {:report_data=>
  [{:type=>
      {"id"=>1, "name"=>"electricity"},
    :data=>[17, 27, 37, 47]},
    {:type=>
      {"id"=>2, "name"=>"water"},
    :data=>[27, 37, 47, 57]}],
  :monthes_data=>
    {:monthes=>
      ["jan", "feb"]}}]

I'm new to Ruby. Please help me sum all the data by energy type. In the end I want a single hash with report_data and monthes_data. I need the result to look like this:

{:report_data=>
  [{:type=>
      {"id"=>1, "name"=>"electricity"},
    :data=>[42, 72, 102, 132]},
    {:type=>
      {"id"=>2, "name"=>"water"},
    :data=>[72, 102, 132, 162]}],
  :monthes_data=>
    {:monthes=>
      ["jan", "feb"]}}

3 Answers:

Answer 0 (score: 1)

arr = [{:report_data=>
  [{:type=>
      {"id"=>1, "name"=>"electricity"},
    :data=>[10, 20, 30, 40]},
    {:type=>
      {"id"=>2, "name"=>"water"},
    :data=>[20, 30, 40, 50]}],
  :monthes_data=>
    {:monthes=>
      ["jan", "feb"]}},

 {:report_data=>
  [{:type=>
      {"id"=>1, "name"=>"electricity"},
    :data=>[15, 25, 35, 45]},
    {:type=>
      {"id"=>2, "name"=>"water"},
    :data=>[25, 35, 45, 55]}],
  :monthes_data=>
    {:monthes=>
      ["jan", "feb"]}},

 {:report_data=>
  [{:type=>
      {"id"=>1, "name"=>"electricity"},
    :data=>[17, 27, 37, 47]},
    {:type=>
      {"id"=>2, "name"=>"water"},
    :data=>[27, 37, 47, 57]}],
  :monthes_data=>
    {:monthes=>
      ["jan", "feb"]}}]


acc = {}
arr.each do |report|
  report[:report_data].each do |entry|
    # Group the sums by the numeric type id.
    type = entry[:type]['id']
    entry[:data].each_with_index do |value, idx|
      acc[type] ||= []
      # Add this value to the running total for its position.
      acc[type][idx] = (acc[type][idx] || 0) + value
    end
  end
end
p acc

Output

{1=>[42, 72, 102, 132], 2=>[72, 102, 132, 162]}

You should be able to reformat this into the record you need.
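
That reformatting step might look something like the sketch below. It is only one possible approach: `first`, `types_by_id`, and `result` are names introduced here for illustration, and the sketch assumes that every type id in `acc` also appears in the first element of `arr`.

```ruby
# Output of the loop above.
acc = { 1 => [42, 72, 102, 132], 2 => [72, 102, 132, 162] }

# First element of arr, repeated here so the sketch runs on its own.
first = {
  :report_data => [
    { :type => { "id" => 1, "name" => "electricity" }, :data => [10, 20, 30, 40] },
    { :type => { "id" => 2, "name" => "water" },       :data => [20, 30, 40, 50] }
  ],
  :monthes_data => { :monthes => ["jan", "feb"] }
}

# Map each type id back to its full :type hash
# (assumption: every id is present in the first element).
types_by_id = first[:report_data].each_with_object({}) do |entry, lookup|
  lookup[entry[:type]["id"]] = entry[:type]
end

# Rebuild the record in the shape the question asks for.
result = {
  :report_data  => acc.map { |id, sums| { :type => types_by_id[id], :data => sums } },
  :monthes_data => first[:monthes_data]
}
p result
```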

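Answer 1 (score: 0)

For clarity, I've reformatted the input array and removed the :monthes_data key, since it seems irrelevant to your question. The reformatted data is shown under "Explanation" below.

TL;DR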

def zip_sum(arr1, arr2)
  return arr2 if arr1.nil?
  arr1.zip(arr2).map {|a, b| a + b }
end

def sum_report_data(arr)
  arr.flat_map do |item|
    item[:report_data].map {|datum| datum.values_at(:type, :data) }
  end
  .reduce({}) do |sums, (type, data)|
    sums.merge(type => data) do |_, old_data, new_data|
      zip_sum(old_data, new_data)
    end
  end
  .map {|type, data| { type: type, data: data } }
end
p sum_report_data(arr)
# =>
[ { type: { "id" => 1, "name" => "electricity" }, data: [ 42, 72, 102, 132 ] },
  { type: { "id" => 2, "name" => "water" }, data: [ 72, 102, 132, 162 ] }
]

Explanation

arr = [
  { report_data: [
      { type: { "id" => 1, "name" => "electricity" },
        data: [ 10, 20, 30, 40 ]
      },
      { type: { "id" => 2, "name" => "water" },
        data: [ 20, 30, 40, 50 ]
      }
    ]
  },

  { report_data: [
      { type: { "id" => 1, "name" => "electricity" },
        data: [ 15, 25, 35, 45 ]
      },
      { type: { "id" => 2, "name" => "water" },
        data: [ 25, 35, 45, 55 ]
      }
    ]
  },

  { report_data: [
      { type: { "id" => 1, "name" => "electricity" },
        data: [ 17, 27, 37, 47 ]
      },
      { type: { "id" => 2, "name" => "water" },
        data: [ 27, 37, 47, 57 ]
      }
    ]
  }
]

Step 1

First, let's define a helper method to sum the values of two arrays:

def zip_sum(arr1, arr2)
  return arr2 if arr1.nil?
  arr1.zip(arr2).map {|a, b| a + b }
end

zip_sum([ 1, 2, 3 ], [ 10, 20, 30 ])
# => [ 11, 22, 33 ]

zip_sum(nil, [ 5, 6, 7 ])
# => [ 5, 6, 7 ]

zip_sum works by "zipping" the two arrays together with Enumerable#zip (e.g. [1, 2].zip([10, 20]) returns [ [1, 10], [2, 20] ]) and then adding each pair together.
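
One thing to keep in mind is that zip pads with nil when the second array is shorter, so zip_sum as written expects arrays of equal length. The sketch below shows the pairing and a hypothetical variant, `zip_sum_padded` (not part of the answer above), that treats missing values as 0. It is only an illustration of one way to handle uneven lengths.

```ruby
# zip_sum as defined above.
def zip_sum(arr1, arr2)
  return arr2 if arr1.nil?
  arr1.zip(arr2).map { |a, b| a + b }
end

# zip pairs elements by position.
p [1, 2].zip([10, 20])                  # prints [[1, 10], [2, 20]]

# Hypothetical variant that tolerates arrays of different lengths
# by treating a missing value as 0.
def zip_sum_padded(arr1, arr2)
  return arr2 if arr1.nil?
  length = [arr1.length, arr2.length].max
  Array.new(length) { |i| (arr1[i] || 0) + (arr2[i] || 0) }
end

p zip_sum([1, 2, 3], [10, 20, 30])      # prints [11, 22, 33]
p zip_sum_padded([1, 2, 3], [10, 20])   # prints [11, 22, 3]
p zip_sum_padded([1], [10, 20])         # prints [11, 20]
```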

Step 2

Next, let's use Enumerable#flat_map to pull out the parts of the data we care about:

result1 = arr.flat_map do |item|
  item[:report_data].map {|datum| datum.values_at(:type, :data) }
end
# result1 =>
[ [ { "id" => 1, "name" => "electricity" }, [ 10, 20, 30, 40 ] ],
  [ { "id" => 2, "name" => "water" },       [ 20, 30, 40, 50 ] ],
  [ { "id" => 1, "name" => "electricity" }, [ 15, 25, 35, 45 ] ],
  [ { "id" => 2, "name" => "water" },       [ 25, 35, 45, 55 ] ],
  [ { "id" => 1, "name" => "electricity" }, [ 17, 27, 37, 47 ] ],
  [ { "id" => 2, "name" => "water" },       [ 27, 37, 47, 57 ] ]
]

Above, we've simply grabbed the :type and :data values from each hash in the :report_data arrays.
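
To see why flat_map is used here rather than map, the following small sketch compares the two on made-up data (`items`, `nested`, and `flat` are illustrative names, and the string types stand in for the :type hashes):

```ruby
# Made-up data with the same nesting as arr.
items = [
  { :report_data => [{ :type => "a", :data => [1] }, { :type => "b", :data => [2] }] },
  { :report_data => [{ :type => "a", :data => [3] }] }
]

# map keeps one inner array per item, so the pairs stay grouped by item.
nested = items.map do |item|
  item[:report_data].map { |datum| datum.values_at(:type, :data) }
end

# flat_map concatenates the inner arrays into a single list of pairs.
flat = items.flat_map do |item|
  item[:report_data].map { |datum| datum.values_at(:type, :data) }
end

p nested   # prints [[["a", [1]], ["b", [2]]], [["a", [3]]]]
p flat     # prints [["a", [1]], ["b", [2]], ["a", [3]]]
```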

Step 3

Next, let's use Enumerable#reduce to iterate over the array of arrays and compute a running sum of the :data values using the zip_sum method we defined earlier:

result2 = result1.reduce({}) do |sums, (type, data)|
  sums.merge(type => data) do |_, old_data, new_data|
    zip_sum(old_data, new_data)
  end
end
# result2 =>
{ { "id" => 1, "name" => "electricity" } => [ 42,  72, 102, 132 ],
  { "id" => 2, "name" => "water" }       => [ 72, 102, 132, 162 ]
}

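The result may look a bit odd, since we usually use strings or symbols as hash keys, but in this hash we're using other hashes (the :type values from above) as keys. That's one of the nice things about Ruby: you can use any object as a key in a hash.

In the reduce block, sums is the hash that will ultimately be returned. It starts out as an empty hash ({}, the value we passed to reduce as an argument). type is the hash we're using as a key, and data is the array of integers. On each iteration, the next element of the result1 array is assigned to type and data, while sums takes whatever value the block returned at the end of the previous iteration.

We're using Hash#merge in a somewhat tricky way: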

sums.merge(type => data) do |_, old_data, new_data|
  zip_sum(old_data, new_data)
end

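This merges the hash { type => data } (remember that type is the :type hash and data is the array of integers) into the hash sums. If there are any key collisions, the block is called. Since we only have one key, type, the block is called only if sums[type] already exists. If it does, we call zip_sum with the previous value of sums[type] and data, effectively keeping a running sum of data.

In effect, it's basically doing this: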

sums = {}

type, data = result1[0]
sums[type] = zip_sum(sums[type], data)

type, data = result1[1]
sums[type] = zip_sum(sums[type], data)

type, data = result1[2]
# ...and so on.
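
The collision behavior of Hash#merge described above can be tried in isolation. The sketch below uses made-up string keys instead of the :type hashes, and `no_collision` and `collision` are illustrative names; the block only runs for the key that exists on both sides.

```ruby
sums = { "water" => [1, 2] }

# No key collision: the block is never called and the new pair is simply added.
no_collision = sums.merge("electricity" => [5, 6]) do |_key, old_data, new_data|
  old_data.zip(new_data).map { |a, b| a + b }
end

# Key collision on "water": the block decides the merged value.
collision = sums.merge("water" => [10, 20]) do |_key, old_data, new_data|
  old_data.zip(new_data).map { |a, b| a + b }
end

p no_collision["electricity"]   # prints [5, 6]
p collision["water"]            # prints [11, 22]

# merge returns a new hash; the receiver is left unchanged.
p sums["water"]                 # prints [1, 2]
```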

Step 4

We now have this hash in result2:

{ { "id" => 1, "name" => "electricity" } => [ 42,  72, 102, 132 ],
  { "id" => 2, "name" => "water" }       => [ 72, 102, 132, 162 ]
}

This is the data we want, so now we just need to take it out of this odd format and put it into regular hashes with :type and :data keys:

result3 = result2.map {|type, data| { type: type, data: data } }
# result3 =>
[ { type: { "id" => 1, "name" => "electricity" },
    data: [ 42, 72, 102, 132 ]
  },
  { type: { "id" => 2, "name" => "water" },
    data: [ 72, 102, 132, 162 ]
  }
]

Answer 2 (score: 0)

Code

def convert(arr)
  { :months_data=>arr.first[:months_data],
    :report_data=>arr.map { |h| h[:report_data] }.
      transpose.
      map { |d| { :type=>d.first[:type] }.
        merge(:data=>d.map { |g| g[:data] }.transpose.map { |a| a.reduce(:+) }) }
  }
end

Example

Half the battle with problems like this is visualizing the data. It's much clearer, imo, when written like this:

arr = [
  {:report_data=>[
     {:type=>{"id"=>1, "name"=>"electricity"}, :data=>[10, 20, 30, 40]},
     {:type=>{"id"=>2, "name"=>"water"},       :data=>[20, 30, 40, 50]}
     ],
   :months_data=>{:months=>["jan", "feb"]}
  },    
  {:report_data=>[
     {:type=>{"id"=>1, "name"=>"electricity"}, :data=>[15, 25, 35, 45]},
     {:type=>{"id"=>2, "name"=>"water"},       :data=>[25, 35, 45, 55]}
     ],
   :months_data=>{:months=>["jan", "feb"]}
  },    
  {:report_data=>[
     {:type=>{"id"=>1, "name"=>"electricity"}, :data=>[17, 27, 37, 47]},
     {:type=>{"id"=>2, "name"=>"water"},       :data=>[27, 37, 47, 57]}],
   :months_data=>{:months=>["jan", "feb"]}
  }
]

Let's try it:

convert(arr)
  #=> {:months_data=>{:months=>["jan", "feb"]},
  #    :report_data=>[
  #      {:type=>{"id"=>1, "name"=>"electricity"}, :data=>[42, 72, 102, 132]},
  #      {:type=>{"id"=>2, "name"=>"water"},       :data=>[72, 102, 132, 162]}
  #    ]
  #   }

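Explanation

The first thing I'd do is concentrate on computing the sums, so I convert arr to the values of :report_data. That key, and the key-value pair for the months' data (which is the same for all elements (hashes) of arr), can be added back later.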

b = arr.map { |h| h[:report_data] }
  #=> [
  #    [{:type=>{"id"=>1, "name"=>"electricity"}, :data=>[10, 20, 30, 40]},
  #     {:type=>{"id"=>2, "name"=>"water"},       :data=>[20, 30, 40, 50]}
  #    ],
  #    [{:type=>{"id"=>1, "name"=>"electricity"}, :data=>[15, 25, 35, 45]},
  #     {:type=>{"id"=>2, "name"=>"water"},       :data=>[25, 35, 45, 55]}
  #    ],
  #    [{:type=>{"id"=>1, "name"=>"electricity"}, :data=>[17, 27, 37, 47]},
  #     {:type=>{"id"=>2, "name"=>"water"},       :data=>[27, 37, 47, 57]}
  #    ]
  #   ] 

If you are not sure that the elements of each array are sorted by "id", you could write:

b = arr.map { |h| h[:report_data].sort_by { |g| g[:type]["id"] } }

c = b.transpose
  #=> [
  #    [{:type=>{"id"=>1, "name"=>"electricity"}, :data=>[10, 20, 30, 40]},
  #     {:type=>{"id"=>1, "name"=>"electricity"}, :data=>[15, 25, 35, 45]},
  #     {:type=>{"id"=>1, "name"=>"electricity"}, :data=>[17, 27, 37, 47]}
  #    ],
  #    [{:type=>{"id"=>2, "name"=>"water"},       :data=>[20, 30, 40, 50]},
  #     {:type=>{"id"=>2, "name"=>"water"},       :data=>[25, 35, 45, 55]},
  #     {:type=>{"id"=>2, "name"=>"water"},       :data=>[27, 37, 47, 57]}
  #    ]
  #   ] 

e = c.map {|d| { :type=>d.first[:type] }.
      merge(:data=>d.map { |g| g[:data] }.transpose.map { |a| a.reduce(:+) }) }
  #=> [{:type=>{"id"=>1, "name"=>"electricity"}, :data=>[42, 72, 102, 132]},
  #    {:type=>{"id"=>2, "name"=>"water"}      , :data=>[72, 102, 132, 162]}] 
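
The column-wise summing in that last step can be tried on its own with the electricity rows. This is a small sketch with illustrative names (`rows`, `columns`, `sums`); it also shows that Array#transpose raises an IndexError when the rows differ in length, so this approach assumes every :data array has the same number of values.

```ruby
# The three electricity :data arrays from the example.
rows = [
  [10, 20, 30, 40],
  [15, 25, 35, 45],
  [17, 27, 37, 47]
]

# transpose turns rows into columns, so each column holds the values
# that share a position across the three reports.
columns = rows.transpose
sums = columns.map { |column| column.reduce(:+) }
p sums   # prints [42, 72, 102, 132]

# transpose requires rows of equal length.
begin
  [[1, 2], [3]].transpose
rescue IndexError => e
  puts "transpose failed: #{e.message}"
end
```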

Lastly, we need to put the key :report_data back in and add the months' data:

{ :months_data=>arr.first[:months_data], :report_data=>e }
  #=> {:months_data=>{:months=>["jan", "feb"]},
  #    :report_data=>[
  #      {:type=>{"id"=>1, "name"=>"electricity"}, :data=>[42, 72, 102, 132]},
  #      {:type=>{"id"=>2, "name"=>"water"},       :data=>[72, 102, 132, 162]}
  #    ]
  #   }