Question: I need to extract certain keys from each hash and count the records by them. As an example, consider:
data = [{"name"=>"name1", "priority"=>"1", "owner"=>"test3"},
{"name"=>"name1", "priority"=>"1", "owner"=>"test4"},
{"name"=>"name2", "priority"=>"1", "owner"=>"test5"},
{"name"=>"name2", "priority"=>"2", "owner"=>"test5"},
{"name"=>"nae954me2", "priority"=>"2", "owner"=>"test5"}]
I want to count the number of records for each [id (extracted from the name), priority] pair, so that in the end I get something like:
#{{"priority"=>"1", "id"=>"name1"}=>2, {"priority"=>"1", "id"=>"name2"}=>1, {"priority"=>"2", "id"=>"name2"}=>1}
Here is what I am doing, but I feel I am overcomplicating it:
#!/usr/bin/env ruby
data = [{"name"=>"name1", "priority"=>"1", "owner"=>"test3"},
{"name"=>"name1", "priority"=>"1", "owner"=>"test4"},
{"name"=>"name2", "priority"=>"1", "owner"=>"test5"},
{"name"=>"name2", "priority"=>"2", "owner"=>"test5"},
{"name"=>"nae954me2", "priority"=>"2", "owner"=>"test5"}]
# (1) trash some keys, just because I don't need them
data.each do |d|
d.delete 'owner'
# in the real data I have about 4 or 5 that I'm trashing
d['id'] = d['name'].scan(/[a-z][a-z][a-z][a-z][0-9]/)[0] # only valid ids
d.delete 'name'
end
puts data
#output:
#{"priority"=>"1", "id"=>"name1"}
#{"priority"=>"1", "id"=>"name1"}
#{"priority"=>"1", "id"=>"name2"}
#{"priority"=>"2", "id"=>"name2"}
#{"priority"=>"2", "id"=>nil}
# (2) reject invalid keys
data = data.reject { |d| d['id'].nil? }
puts data
#output:
#{"priority"=>"1", "id"=>"name1"}
#{"priority"=>"1", "id"=>"name1"}
#{"priority"=>"1", "id"=>"name2"}
#{"priority"=>"2", "id"=>"name2"}
# (3) count
counts = Hash.new(0)
data.each do |d|
counts[d] += 1
end
puts counts
#{{"priority"=>"1", "id"=>"name1"}=>2, {"priority"=>"1", "id"=>"name2"}=>1, {"priority"=>"2", "id"=>"name2"}=>1}
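Any suggestions for improving my counting approach?
Answer 0 (score: 1)
There are many ways to do this. (You may have noticed that I edited my answer heavily: I had explained in detail how one approach works, only to realize there was a better way, so out came the machete.) Here are two solutions. The first is inspired by the approach you took, but I have tried to package it in a more Ruby-like way. I was not sure what constitutes a valid "name", so I put that decision in a separate method that can easily be changed.
Code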
def name_valid?(name)
name[0..3] == "name"
end
data.each_with_object(Hash.new(0)) {|h,g|
(g[{"id"=>h["name"],"priority"=>h["priority"]}]+=1) if name_valid?(h["name"])}
#=> {{"id"=>"name1", "priority"=>"1"}=>2,
# {"id"=>"name2", "priority"=>"1"}=>1,
# {"id"=>"name2", "priority"=>"2"}=>1}
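Explanation
Hash.new(0) creates an initially empty hash whose default value is zero, and Enumerable#each_with_object passes it to the block as the block variable g. g is built up by counting a key derived from each element h of data:
g[{"id"=>h["name"],"priority"=>h["priority"]}]+=1
If the hash g already has the key
{"id"=>h["name"],"priority"=>h["priority"]}
the value associated with that key is increased by 1. If g does not have this key, the default value makes
g[{"id"=>h["name"],"priority"=>h["priority"]}]
evaluate to zero before the addition in
g[{"id"=>h["name"],"priority"=>h["priority"]}]+=1
is performed, so the value becomes 1.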
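As a minimal sketch of the same counting idea applied to the id rule from the question: the helper extract_id below is a hypothetical name, not part of the original answer. It assumes, as in the question's scan pattern, that a valid id is four lowercase letters followed by one digit. Unlike the question's code, it does not modify data.

```ruby
# Sample data copied from the question.
data = [
  {"name"=>"name1", "priority"=>"1", "owner"=>"test3"},
  {"name"=>"name1", "priority"=>"1", "owner"=>"test4"},
  {"name"=>"name2", "priority"=>"1", "owner"=>"test5"},
  {"name"=>"name2", "priority"=>"2", "owner"=>"test5"},
  {"name"=>"nae954me2", "priority"=>"2", "owner"=>"test5"}
]

# Hypothetical helper: returns the first substring matching the question's
# pattern (four lowercase letters, then a digit), or nil if there is none.
def extract_id(name)
  name[/[a-z]{4}[0-9]/]
end

# Hash.new(0) makes a missing key read as 0, so += 1 works on first sight.
counts = data.each_with_object(Hash.new(0)) do |h, g|
  id = extract_id(h["name"])
  next if id.nil? # skip records without a valid id
  g[{"priority"=>h["priority"], "id"=>id}] += 1
end

puts counts
```

Because the key is built from only the fields of interest, there is no need to delete "owner" or any other unwanted key first.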
Alternative approach
Code
data.each_with_object({}) do |h,g|
hash = { { "id"=>h["name"], "priority"=>h["priority"] } => 1 }
g.update(hash) { |k, vg, _| vg + 1 } if name_valid?(h["name"])
end
#=> {{"id"=>"name1", "priority"=>"1"}=>2,
# {"id"=>"name2", "priority"=>"1"}=>1,
# {"id"=>"name2", "priority"=>"2"}=>1}
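Explanation
Here I use Hash#update (a.k.a. Hash#merge!) to merge a single-key hash built from each element of data into the initially empty hash g (provided the value of "name" is valid). update's block
{ |k, vg, _| vg + 1 }
is called if and only if the hash being merged into (g) and the hash being merged (hash) both have the key k, in which case the block returns the new value for that key. Note that the third block variable is the value of the key k in hash, which is always 1 here. Since we do not use that value, I have replaced it with the placeholder _.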
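To see when the block passed to update runs, here is a small standalone sketch. The names totals and seen are made up for illustration, and the block adds the two values instead of adding a fixed 1.

```ruby
totals = {"a"=>1}
seen = [] # records every key for which the block was called

# "a" exists in both hashes, so the block decides its new value.
# "b" exists only in the argument, so it is copied without calling the block.
totals.update({"a"=>1, "b"=>1}) do |key, old_value, new_value|
  seen << key
  old_value + new_value
end

p totals
p seen
```

Adding old_value and new_value gives the same counts as vg + 1 when every merged hash carries the value 1, and it would still be correct if the merged values were larger.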
Answer 1 (score: 1)
Depending on what you mean by "something like", this might do the trick:
data.group_by { |h| [h["name"], h["priority"]] }.map { |k, v| { k => v.size } }
=> [{["name1", "1"]=>2}, {["name2", "1"]=>1}, {["name2", "2"]=>1}, {["nae954me2", "2"]=>1}]
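Note that the result above still counts "nae954me2" and is an array of single-key hashes. If a single hash without invalid names is wanted, one possible variation is sketched below. It assumes the same validity rule as the first answer (the name starts with "name"); that rule is an assumption, not something the question states.

```ruby
# Sample data copied from the question.
data = [
  {"name"=>"name1", "priority"=>"1", "owner"=>"test3"},
  {"name"=>"name1", "priority"=>"1", "owner"=>"test4"},
  {"name"=>"name2", "priority"=>"1", "owner"=>"test5"},
  {"name"=>"name2", "priority"=>"2", "owner"=>"test5"},
  {"name"=>"nae954me2", "priority"=>"2", "owner"=>"test5"}
]

# Assumed validity rule: keep only records whose name starts with "name".
valid = data.select { |h| h["name"].start_with?("name") }

# Group by [name, priority], then turn each group into [key, count].
counts = valid.group_by { |h| [h["name"], h["priority"]] }
              .map { |key, rows| [key, rows.size] }
              .to_h

p counts
```

On Ruby 2.7 or later, Enumerable#tally can replace the group_by, map and to_h steps when applied to the list of [name, priority] pairs.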