Ruby: merge hashes in an array by one key and sum the values of another key

Time: 2017-06-15 14:36:30

Tags: ruby optimization hash amazon-dynamodb

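I have an array of hashes that comes from a DynamoDB table. I need to group the hashes by one key and sum the values of another key. My array looks similar to: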

data = [
  { 'state' => 'Florida', 'minutes_of_sun' => 10, 'timestamp' => 1497531600, 'region' => 'Southeast' },
  { 'state' => 'Florida', 'minutes_of_sun' => 7, 'timestamp' => 1497531600, 'region' => 'Southeast' },
  { 'state' => 'Florida', 'minutes_of_sun' => 2, 'timestamp' => 1497531600, 'region' => 'Southeast' },
  { 'state' => 'Georgia', 'minutes_of_sun' => 15, 'timestamp' => 1497531600, 'region' => 'Southeast' },
  { 'state' => 'Georgia', 'minutes_of_sun' => 5, 'timestamp' => 1497531600, 'region' => 'Southeast' }
]

The final result I'm looking for is:

data = [
  { 'state' => 'Florida', 'minutes_of_sun' => 19, 'region' => 'Southeast' },
  { 'state' => 'Georgia', 'minutes_of_sun' => 20, 'region' => 'Southeast' }
]

I've been able to do this with the method below, but it's slow and clunky. Is there a faster way, or one with fewer lines of code?

def combine_data(data)
  combined_data = []

  data.each do |row|
    existing_data = combined_data.find { |key| key['state'] == row['state'] }
    if existing_data # find returns nil when nothing matches; .present? would need ActiveSupport
      existing_data['minutes_of_sun'] += row['minutes_of_sun']
    else
      combined_data << row
    end
  end

  combined_data
end

2 answers:

Answer 0 (score: 1)

Try this:

data.group_by { |item| item['state'] }.values.map do |arr|
  h = arr.first.dup # copy, so the original row is not mutated
  h.delete('timestamp')
  # inject needs the initial value 0; without it the first hash becomes the accumulator
  h.merge('minutes_of_sun' => arr.inject(0) { |acc, item| acc + item['minutes_of_sun'] })
end
 => [{"state"=>"Florida", "minutes_of_sun"=>19, "region"=>"Southeast"}, {"state"=>"Georgia", "minutes_of_sun"=>20, "region"=>"Southeast"}]
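
A note on `inject` when summing one field across an array of hashes: called without an initial value, it uses the first element (here a hash) as the accumulator, so an explicit seed of 0 is needed. The sketch below illustrates this; `rows`, `seeded_total` and `unseeded_error` are illustrative names, not part of the question.

```ruby
rows = [
  { 'minutes_of_sun' => 10 },
  { 'minutes_of_sun' => 7 },
  { 'minutes_of_sun' => 2 }
]

# With a seed of 0 the accumulator is always an Integer.
seeded_total = rows.inject(0) { |acc, row| acc + row['minutes_of_sun'] }

# Without a seed the accumulator starts as rows.first (a Hash).
# Hash does not define +, so the first block call raises NoMethodError.
unseeded_error =
  begin
    rows.inject { |acc, row| acc + row['minutes_of_sun'] }
    nil
  rescue NoMethodError => e
    e.class
  end
```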

As of Ruby 2.4.0, you can use Array#sum with a block instead:

data.group_by { |item| item['state'] }.values.map do |arr|
  h = arr.first.dup # copy, so the original row is not mutated
  h.delete('timestamp')
  h.merge('minutes_of_sun' => arr.sum { |item| item['minutes_of_sun'] })
end
 => [{"state"=>"Florida", "minutes_of_sun"=>19, "region"=>"Southeast"}, {"state"=>"Georgia", "minutes_of_sun"=>20, "region"=>"Southeast"}]
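
For reuse, the same grouping idea could be wrapped in a method. This is only a sketch: the name `combine_by_state` is hypothetical, the keys are hard-coded to match the question's data, and it uses `inject(0)` rather than `sum` so that it does not depend on Ruby 2.4.

```ruby
# Hypothetical helper: groups rows by 'state' and sums 'minutes_of_sun'.
# It builds new hashes, so the input rows are left untouched.
def combine_by_state(rows)
  rows.group_by { |row| row['state'] }.map do |_state, group|
    total = group.inject(0) { |acc, row| acc + row['minutes_of_sun'] }
    # Hash#reject returns a new hash without the 'timestamp' key.
    group.first.reject { |key, _| key == 'timestamp' }
         .merge('minutes_of_sun' => total)
  end
end

data = [
  { 'state' => 'Florida', 'minutes_of_sun' => 10, 'timestamp' => 1497531600, 'region' => 'Southeast' },
  { 'state' => 'Florida', 'minutes_of_sun' => 7, 'timestamp' => 1497531600, 'region' => 'Southeast' },
  { 'state' => 'Florida', 'minutes_of_sun' => 2, 'timestamp' => 1497531600, 'region' => 'Southeast' },
  { 'state' => 'Georgia', 'minutes_of_sun' => 15, 'timestamp' => 1497531600, 'region' => 'Southeast' },
  { 'state' => 'Georgia', 'minutes_of_sun' => 5, 'timestamp' => 1497531600, 'region' => 'Southeast' }
]

result = combine_by_state(data)
```

Because `group_by` makes a single pass over the array, this avoids the repeated `find` scan over the accumulated results that the question's method performs for every row.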

Answer 1 (score: 0)

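You can use the form of Hash#update (aka merge!) that takes a block to determine the values of keys that are present in both hashes being merged. The block receives three variables: the shared key, the value from the receiver (the old value) and the value from the hash being merged in (the new value). See the doc for details.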

data = [
  { 'state'=>'Florida', 'sun_min'=>10, 'stamp'=>149, 'region'=>'SE' },
  { 'state'=>'Georgia', 'sun_min'=>15, 'stamp'=>149, 'region'=>'SE' },
  { 'state'=>'Georgia', 'sun_min'=> 5, 'stamp'=>149, 'region'=>'SE' }
]

data.each_with_object({}) do |g,h|
  h.update(g['state']=>g.reject { |k,_| k=='stamp' }) do |_,o,n|
    o.merge('sun_min'=>o['sun_min']+n['sun_min'])
  end
end.values
  #=> [{"state"=>"Florida", "sun_min"=>10, "region"=>"SE"},
  #    {"state"=>"Georgia", "sun_min"=>20, "region"=>"SE"}]

Note that without .values, the expression returns:

#=> {"Florida"=>{"state"=>"Florida", "sun_min"=>10, "region"=>"SE"},
#    "Georgia"=>{"state"=>"Georgia", "sun_min"=>20, "region"=>"SE"}}
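
To make the role of the block in Hash#update concrete, here is a minimal sketch on a flat hash. The names `totals` and `seen` are illustrative only. The block runs just for keys that exist in both hashes; any other key is simply copied in.

```ruby
totals = { 'Florida' => 10 }

# Records every call to the block so the arguments can be inspected.
seen = []

# The block receives the key, the old value (from the receiver)
# and the new value (from the argument), and returns the merged value.
totals.update('Florida' => 7, 'Georgia' => 15) do |key, old_value, new_value|
  seen << [key, old_value, new_value]
  old_value + new_value
end
```

After this runs, 'Florida' holds the sum of both values, while 'Georgia' was added without the block being called for it.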