Merging hash values in an array of hashes based on a key

Date: 2014-12-05 00:38:58

Tags: ruby arrays hash merge duplicates

I have an array of hashes similar to this:

[
  {"student": "a","scores": [{"subject": "math","quantity": 10},{"subject": "english", "quantity": 5}]},
  {"student": "b", "scores": [{"subject": "math","quantity": 1 }, {"subject": "english","quantity": 2 } ]},
  {"student": "a", "scores": [ { "subject": "math", "quantity": 2},{"subject": "science", "quantity": 5 } ] }
]

Is there a simpler way to get output like the following, other than looping through the array, finding the duplicates, and combining them?

[
  {"student": "a","scores": [{"subject": "math","quantity": 12},{"subject": "english", "quantity": 5},{"subject": "science", "quantity": 5 } ]},
  {"student": "b", "scores": [{"subject": "math","quantity": 1 }, {"subject": "english","quantity": 2 } ]}
]

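Rules for merging duplicate objects:

  • Students are merged when their "student" values match (e.g. student "a", student "b")
  • Scores for the same subject are added together (e.g. student a's math scores of 2 and 10 become 12 when merged)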

2 answers:

Answer 0 (score: 2)

  

"Is there a simpler way to get output like the following, other than looping through the array, finding the duplicates, and combining them?"

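Not that I know of. If you explain where this data comes from, the answer might differ, but given only an Array of Hash objects, I think you will need to iterate and combine.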

While it is not elegant, you could use a solution like this:

arr = [
      {"student"=> "a","scores"=> [{"subject"=> "math","quantity"=> 10},{"subject"=> "english", "quantity"=> 5}]},
      {"student"=> "b", "scores"=> [{"subject"=> "math","quantity"=> 1 }, {"subject"=> "english","quantity"=> 2 } ]},
      {"student"=> "a", "scores"=> [ { "subject"=> "math", "quantity"=> 2},{"subject"=> "science", "quantity"=> 5 } ] }
    ]
#Group the array by student
arr.group_by{|student| student["student"]}.map do |student_name,student_values|
  {"student" => student_name,
  #combine all the scores and group by subject
  "scores" => student_values.map{|student| student["scores"]}.flatten.group_by{|score| score["subject"]}.map do |subject,subject_values|
    {"subject" => subject,
    #combine all the quantities into an array and reduce using `+`
    "quantity" => subject_values.map{|h| h["quantity"]}.reduce(:+)
    }
  end
  }
end
#=> [
#     {"student"=>"a", "scores"=>[
#                         {"subject"=>"math", "quantity"=>12},
#                         {"subject"=>"english", "quantity"=>5},
#                         {"subject"=>"science", "quantity"=>5}]},
#     {"student"=>"b", "scores"=>[
#                         {"subject"=>"math", "quantity"=>1},
#                         {"subject"=>"english", "quantity"=>2}]}
#   ]
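
For readers who want to reuse the group_by approach, it can be wrapped in a method. The following is only a sketch: the name merge_students is hypothetical, and it assumes Ruby 2.4 or later, since it uses Enumerable#sum in place of map followed by reduce(:+).

```ruby
# Sketch only: merge_students is a hypothetical helper name, not part of the answer.
# Assumes Ruby 2.4+ for Enumerable#sum with a block.
def merge_students(arr)
  arr.group_by { |record| record["student"] }.map do |name, records|
    # Collect every score hash for this student, then group them by subject.
    by_subject = records.flat_map { |record| record["scores"] }
                        .group_by { |score| score["subject"] }
    scores = by_subject.map do |subject, entries|
      { "subject" => subject, "quantity" => entries.sum { |e| e["quantity"] } }
    end
    { "student" => name, "scores" => scores }
  end
end

arr = [
  { "student" => "a", "scores" => [{ "subject" => "math", "quantity" => 10 }, { "subject" => "english", "quantity" => 5 }] },
  { "student" => "b", "scores" => [{ "subject" => "math", "quantity" => 1 }, { "subject" => "english", "quantity" => 2 }] },
  { "student" => "a", "scores" => [{ "subject" => "math", "quantity" => 2 }, { "subject" => "science", "quantity" => 5 }] }
]
merged = merge_students(arr)
```

Because group_by preserves first-seen order, students and subjects come out in the order they first appear in the input.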

I know you specified an expected result, but I want to point out that making the output simpler makes the code simpler.

arr.map(&:dup).group_by { |a| a.delete("student") }.each_with_object({}) do |(student, scores), record|
  # each_with_object returns the memo object itself, so the blocks need no explicit return value
  record[student] = scores.map(&:values).flatten.map(&:values).each_with_object(Hash.new(0)) do |(subject, score), obj|
    obj[subject] += score
  end
end
#=> {"a"=>{"math"=>12, "english"=>5, "science"=>5}, "b"=>{"math"=>1, "english"=>2}}

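With this structure, getting the students is as simple as calling .keys, and the scores are just as easy to reach. I am thinking of something like: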
above_result.each do |student, scores|
  puts student
  scores.each do |subject, score|
    puts "  #{subject.capitalize}: #{score}"
  end
end

The console output would be:

a
  Math: 12
  English: 5
  Science: 5
b
  Math: 1
  English: 2
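
To make the point about .keys concrete, here is a small sketch of direct lookups on the simplified hash. It assumes above_result holds the hash shown earlier; the local variable names are only illustrative.

```ruby
# above_result is assumed to hold the simplified hash produced earlier.
above_result = {
  "a" => { "math" => 12, "english" => 5, "science" => 5 },
  "b" => { "math" => 1, "english" => 2 }
}

students  = above_result.keys                        # every student name
math_a    = above_result["a"]["math"]                # a single score
science_b = above_result["b"].fetch("science", 0)    # fall back to 0 when the subject is missing
```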

Answer 1 (score: 0)

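There are two common ways to aggregate values in situations like this. The first is to use the method Enumerable#group_by, as @engineersmnky did in his answer. The second is to build a hash using the form of Hash#update (a.k.a. merge!) that takes a block to resolve the values of keys present in both hashes being merged. My solution uses the latter approach, not because I prefer it to group_by, but just to show you a different way it can be done. (Had engineersmnky used update, I would have gone with group_by.)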
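
As a rough side-by-side of the two approaches, here is a sketch on a deliberately flat data set. The sample pairs and variable names are made up for illustration and are not the question's data.

```ruby
# Hypothetical sample data: [name, points] pairs.
pairs = [["a", 10], ["b", 1], ["a", 2]]

# Approach 1: Enumerable#group_by, then reduce each group to a total.
grouped = pairs.group_by { |name, _| name }
totals_group_by = grouped.map do |name, rows|
  [name, rows.map { |_, pts| pts }.reduce(:+)]
end.to_h

# Approach 2: Hash#update with a block that resolves duplicate keys.
totals_update = pairs.each_with_object({}) do |(name, pts), h|
  h.update(name => pts) { |_key, old, new| old + new }
end
```

Both produce the same totals; group_by builds the intermediate groups first, while update accumulates as it goes.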

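The particular data structure you are using complicates your problem. I found that the solution could be simplified and made easier to follow by first converting the data to a different structure, updating the scores, and then converting the result back to your data structure. You may want to consider changing the data structure (if that is an option for you). I address that in the "Discussion" section.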

Code

def combine_scores(arr)
  reconstruct(update_scores(simplify(arr)))
end

def simplify(arr)
  arr.map do |h|
    hash = Hash[h[:scores].map { |g| g.values }]
    hash.default = 0
    { h[:student]=> hash }
  end
end

def update_scores(arr)
  arr.each_with_object({}) do |g,h|
    h.update(g) do |_, h_scores, g_scores|
      g_scores.each { |subject,score| h_scores[subject] += score }
      h_scores
    end
  end
end

def reconstruct(h)
  # Use :quantity so the result matches the original data structure.
  h.map { |k,v| { student: k, scores: v.map { |subject, score|
    { subject: subject, quantity: score } } } }
end

Example

arr = [
  { student: "a", scores: [{ subject: "math",    quantity: 10 },
                           { subject: "english", quantity:  5 }] },
  { student: "b", scores: [{ subject: "math",    quantity:  1 },
                           { subject: "english", quantity:  2 } ] },
  { student: "a", scores: [{ subject: "math",    quantity:  2 },
                           { subject: "science", quantity:  5 } ] }]
combine_scores(arr)
  #=> [{ :student=>"a",
  #      :scores=>[{ :subject=>"math",    :quantity=>12 },
  #                { :subject=>"english", :quantity=> 5 },
  #                { :subject=>"science", :quantity=> 5 }] },
  #    { :student=>"b",
  #      :scores=>[{ :subject=>"math",    :quantity=> 1 },
  #                { :subject=>"english", :quantity=> 2 }] }]

Explanation

First consider the two intermediate calculations:

a = simplify(arr)
  #=> [{ "a"=>{ "math"=>10, "english"=>5 } },
  #    { "b"=>{ "math"=> 1, "english"=>2 } },
  #    { "a"=>{ "math"=> 2, "science"=>5 } }]

h = update_scores(a)
  #=> {"a"=>{"math"=>12, "english"=>5, "science"=>5},
  #    "b"=>{"math"=> 1, "english"=>2}}

Then

reconstruct(h)

returns the result shown above.

+ simplify

arr.map do |h|
  hash = Hash[h[:scores].map { |g| g.values }]
  hash.default = 0
  { h[:student]=> hash }
end

This maps each hash to a simpler one. For example, the first element of arr:

h = { student: "a", scores: [{ subject: "math",    quantity: 10 },
                             { subject: "english", quantity:  5 }] }

is mapped to:

{ "a"=>Hash[[{ subject: "math",    quantity: 10 },
             { subject: "english", quantity:  5 }].map { |g| g.values }] }
#=> { "a"=>Hash[[["math", 10], ["english", 5]]] }
#=> { "a"=>{"math"=>10, "english"=>5}}

Setting the default value of each hash to zero simplifies the update step that follows.

+ update_scores

For the array of hashes a returned by simplify, we compute:

a.each_with_object({}) do |g,h|
  h.update(g) do |_, h_scores, g_scores|
    g_scores.each { |subject,score| h_scores[subject] += score }
    h_scores
  end
end

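Each element of a (a hash) is merged into an initially empty hash h. Because update (the same as merge!) is used for the merge, h is modified. If both hashes share the same student key (e.g. "a"), update calls the block, which sums the values of subjects present in both (e.g. "math"); any other subject=>score pairs are simply added to that student's scores in h.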

Note that if h_scores does not have the key subject, then:

h_scores[subject] += score
  #=> h_scores[subject] = h_scores[subject] + score
  #=> h_scores[subject] = 0 + score (because the default value is zero)
  #=> h_scores[subject] = score

That is, the key-value pair from g_scores is simply added to h_scores.

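I have replaced the block variable that holds the key (here, the student) with the placeholder _, to reduce the chance of errors and to tell the reader that it is not used in the block.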
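
The behaviour of the block form of Hash#update can be seen in isolation with a tiny sketch. The hash contents and the calls array below are invented for illustration; the point is that the block receives (key, old_value, new_value) and runs only for keys present in both hashes.

```ruby
h = { "a" => 1, "b" => 2 }
calls = [] # records the keys for which the block was invoked

h.update("b" => 10, "c" => 20) do |key, old_value, new_value|
  calls << key
  old_value + new_value # the block's return value becomes the stored value
end

# h is modified in place; "c" was added without calling the block.
```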

+ reconstruct

The last step is to convert the hash returned by update_scores back to the original data structure, which is straightforward.

Discussion

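If you are able to change the data structure, and doing so meets your requirements, you might consider changing it to a form like the one update_scores produces: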

h = { "a"=>{ math: 10, english: 5 }, "b"=>{ math:  1, english: 2 } }

Then, to update the scores with the following:

g = { "a"=>{ math: 2, science: 5 }, "b"=>{ english: 3 }, "c"=>{ science: 4 } }

you would need only this:

h.merge(g) { |_,oh,nh| oh.merge(nh) { |_,ohv,nhv| ohv+nhv } }
  #=> { "a"=>{ :math=>12, :english=>5, :science=>5 },
  #     "b"=>{ :math=> 1, :english=>5 },
  #     "c"=>{ :science=>4 } }
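
If this nested merge is needed in more than one place, it could be wrapped in a small method. This is a sketch only; add_scores is a hypothetical name. Because it uses merge rather than update, neither input hash is modified.

```ruby
# Sketch only: add_scores is a hypothetical helper name.
def add_scores(current, incoming)
  # Outer merge resolves students present in both hashes.
  current.merge(incoming) do |_student, old_scores, new_scores|
    # Inner merge sums the subjects present in both score hashes.
    old_scores.merge(new_scores) { |_subject, old, new| old + new }
  end
end

h = { "a" => { math: 10, english: 5 }, "b" => { math: 1, english: 2 } }
g = { "a" => { math: 2, science: 5 }, "b" => { english: 3 }, "c" => { science: 4 } }
combined = add_scores(h, g)
```

Use update (merge!) in place of the outer merge if the running totals should be modified in place instead.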