Question

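I have an array of hashes in the format shown below, and I am trying to sort it by each hash's :book key according to a separate array. The order is not alphabetical, and for my use case it cannot be alphabetical.

I need to sort according to the following array: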

array = ['Matthew', 'Mark', 'Acts', '1John']

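Note that I have seen some solutions that use Array#index to perform a custom sort (for example, Sorting an Array of hashes based on an Array of sorted values), but that does not work for strings.

I have tried various combinations of Array#sort and Array#sort_by, but they do not seem to accept a custom order. What am I missing? Thanks in advance for your help!

Array of hashes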

[{:book=>"Matthew",
  :chapter=>"4",
  :section=>"new_testament"},
 {:book=>"Matthew",
  :chapter=>"22",
  :section=>"new_testament"},
 {:book=>"Mark",
  :chapter=>"6",
  :section=>"new_testament"},
 {:book=>"1John",
  :chapter=>"1",
  :section=>"new_testament"},
 {:book=>"1John",
  :chapter=>"1",
  :section=>"new_testament"},
 {:book=>"Acts",
  :chapter=>"9",
  :section=>"new_testament"},
 {:book=>"Acts",
  :chapter=>"17",
  :section=>"new_testament"}]

Answer 1

Here is an example:

arr = [{a: 1}, {a: 3}, {a: 2}] 

order = [2,1,3]  

arr.sort { |a,b| order.index(a[:a]) <=> order.index(b[:a]) }                                           
# => [{:a=>2}, {:a=>1}, {:a=>3}]

In your case, it would be:

order = ['Matthew', 'Mark', 'Acts', '1John']
result = list_of_hashes.sort do |a,b|
  order.index(a[:book]) <=> order.index(b[:book])
end

There are two important concepts here:

Using Array#index to find the position of an element in an array
How the "spaceship operator" <=> works with Array#sort (see What is the Ruby <=> (spaceship) operator?)
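A minimal sketch of those two pieces in isolation, reusing the order array from the question. The helper name compare_books is made up for this illustration and is not part of Ruby.

```ruby
order = ['Matthew', 'Mark', 'Acts', '1John']

# Array#index returns the position of the first matching element,
# and it works for strings just as it does for numbers.
order.index('Acts')   # => 2
order.index('Luke')   # => nil (not in the list)

# <=> returns -1, 0 or 1 depending on how the left side compares to the right.
1 <=> 2   # => -1
2 <=> 2   # => 0
3 <=> 2   # => 1

# Hypothetical helper: compares two hashes by the position of their :book in order.
def compare_books(a, b, order)
  order.index(a[:book]) <=> order.index(b[:book])
end

compare_books({ book: 'Mark' }, { book: 'Acts' }, order)   # => -1, so Mark sorts first
```

A block passed to Array#sort is expected to return exactly this kind of -1, 0 or 1 value for each pair it is given.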

You can make this a bit faster by indexing the list of elements that defines the order:

# Build a lookup hash such as {"Matthew"=>0, "Mark"=>1, ...}
order_with_index = order.each_with_index.with_object({}) do |(elem, idx), memo|
  memo[elem] = idx
end

Then use order_with_index[<name>] instead of order.index(<name>).
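Put together, the substitution might look like the sketch below. The method name sort_by_book_order and the sample data are assumptions made for illustration; each_with_index.to_h is used here as a shorter way to build the same lookup hash.

```ruby
# Hypothetical helper: builds {"Matthew"=>0, "Mark"=>1, ...} once,
# then compares positions using hash lookups instead of Array#index.
def sort_by_book_order(list_of_hashes, order)
  position = order.each_with_index.to_h
  list_of_hashes.sort { |a, b| position[a[:book]] <=> position[b[:book]] }
end

books = [{ book: 'Acts' }, { book: 'Matthew' }, { book: '1John' }, { book: 'Mark' }]
sort_by_book_order(books, ['Matthew', 'Mark', 'Acts', '1John']).map { |h| h[:book] }
# => ["Matthew", "Mark", "Acts", "1John"]
```

With only four books the difference is negligible; the lookup hash matters more when the order list is long, because Array#index scans the list on every comparison.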

Answer 2

Since you know the desired order, there is no need to sort the array. Here is one way to do it. (I have called your array of hashes bible.)

bible.group_by { |h| h[:book] }.values_at(*array).flatten
  #=> [{:book=>"Matthew", :chapter=>"4", :section=>"new_testament"},
  #    {:book=>"Matthew", :chapter=>"22", :section=>"new_testament"},
  #    {:book=>"Mark", :chapter=>"6", :section=>"new_testament"},
  #    {:book=>"Acts", :chapter=>"9", :section=>"new_testament"},
  #    {:book=>"Acts", :chapter=>"17", :section=>"new_testament"},
  #    {:book=>"1John", :chapter=>"1", :section=>"new_testament"},
  #    {:book=>"1John", :chapter=>"1", :section=>"new_testament"}]

Because Enumerable#group_by, Hash#values_at and Array#flatten each need only one pass over their input, this may be faster than sorting when bible is large.

Here are the steps.

h = bible.group_by { |h| h[:book] }
  #=> {"Matthew"=>[{:book=>"Matthew", :chapter=>"4", :section=>"new_testament"},
  #                {:book=>"Matthew", :chapter=>"22", :section=>"new_testament"}],
  #    "Mark"   =>[{:book=>"Mark", :chapter=>"6", :section=>"new_testament"}],
  #    "1John"  =>[{:book=>"1John", :chapter=>"1", :section=>"new_testament"},
  #                {:book=>"1John", :chapter=>"1", :section=>"new_testament"}],
  #    "Acts"   =>[{:book=>"Acts", :chapter=>"9", :section=>"new_testament"}, 
  #                {:book=>"Acts", :chapter=>"17", :section=>"new_testament"}]
  #   } 

a = h.values_at(*array)
  #=> h.values_at('Matthew', 'Mark', 'Acts', '1John')
  #=> [[{:book=>"Matthew", :chapter=>"4", :section=>"new_testament"},
  #     {:book=>"Matthew", :chapter=>"22", :section=>"new_testament"}],
  #    [{:book=>"Mark", :chapter=>"6", :section=>"new_testament"}],
  #    [{:book=>"Acts", :chapter=>"9", :section=>"new_testament"},
  #     {:book=>"Acts", :chapter=>"17", :section=>"new_testament"}],
  #    [{:book=>"1John", :chapter=>"1", :section=>"new_testament"},
  #     {:book=>"1John", :chapter=>"1", :section=>"new_testament"}]]

Finally, a.flatten returns the array shown earlier.

Let's run a benchmark.
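One detail worth noting: Hash#values_at returns nil for a key that is not in the hash, so a book listed in array that has no entries in bible would leave a nil in the flattened result. A hedged sketch of one way to handle that, using a made-up helper name arrange_by_book and made-up sample data:

```ruby
# Hypothetical helper: arranges hashes by book, skipping books that have no entries.
def arrange_by_book(hashes, order)
  hashes.group_by { |h| h[:book] }   # {"Acts"=>[...], "Matthew"=>[...]}
        .values_at(*order)           # nil for any book without entries
        .compact                     # drop those nils
        .flatten
end

sample = [{ book: 'Acts', chapter: '9' }, { book: 'Matthew', chapter: '4' }]
arrange_by_book(sample, ['Matthew', 'Mark', 'Acts', '1John']).map { |h| h[:book] }
# => ["Matthew", "Acts"]
```

The opposite case behaves differently: hashes whose :book does not appear in the order list are never selected by values_at, so they are left out of the result.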

require 'fruity'

@bible = [
  {:book=>"Matthew",
   :chapter=>"4",
   :section=>"new_testament"},
  {:book=>"Matthew",
   :chapter=>"22",
   :section=>"new_testament"},
  {:book=>"Mark",
   :chapter=>"6",
   :section=>"new_testament"},
  {:book=>"1John",
   :chapter=>"1",
   :section=>"new_testament"},
  {:book=>"1John",
   :chapter=>"1",
   :section=>"new_testament"},
  {:book=>"Acts",
   :chapter=>"9",
   :section=>"new_testament"},
  {:book=>"Acts",
   :chapter=>"17",
   :section=>"new_testament"}]

@order = ['Matthew', 'Mark', 'Acts', '1John']

def bench_em(n)
  arr = (@bible*((n/@bible.size.to_f).ceil))[0,n].shuffle
  puts "arr contains #{n} elements"
  compare do 
    _sort       { arr.sort { |h1,h2| @order.index(h1[:book]) <=>
                  @order.index(h2[:book]) }.size }
    _sort_by    { arr.sort_by { |h| @order.find_index(h[:book]) }.size }
    _sort_by_with_hash {ord=@order.each.with_index.to_h;
                        arr.sort_by {|b| ord[b[:book]]}.size}    
    _values_at  { arr.group_by { |h| h[:book] }.values_at(*@order).flatten.size }
  end
end

@maxpleaner, @ChaitanyaKale and @Michael Kohl contributed _sort, _sort_by and _sort_by_with_hash, respectively.

bench_em    100
arr contains 100 elements
Running each test 128 times. Test will take about 1 second.
_sort_by is similar to _sort_by_with_hash
_sort_by_with_hash is similar to _values_at
_values_at is faster than _sort by 2x ± 1.0

bench_em  1_000
arr contains 1000 elements
Running each test 16 times. Test will take about 1 second.
_sort_by_with_hash is similar to _values_at
_values_at is similar to _sort_by
_sort_by is faster than _sort by 2x ± 0.1

bench_em 10_000
arr contains 10000 elements
Running each test once. Test will take about 1 second.
_values_at is faster than _sort_by_with_hash by 10.000000000000009% ± 10.0%
_sort_by_with_hash is faster than _sort_by by 10.000000000000009% ± 10.0%
_sort_by is faster than _sort by 2x ± 0.1

bench_em 100_000
arr contains 100000 elements
Running each test once. Test will take about 3 seconds.
_values_at is similar to _sort_by_with_hash
_sort_by_with_hash is similar to _sort_by
_sort_by is faster than _sort by 2x ± 0.1

Here is a second run.

bench_em    100
arr contains 100 elements
Running each test 128 times. Test will take about 1 second.
_sort_by_with_hash is similar to _values_at
_values_at is similar to _sort_by
_sort_by is faster than _sort by 2x ± 0.1

bench_em  1_000
arr contains 1000 elements
Running each test 8 times. Test will take about 1 second.
_values_at is faster than _sort_by_with_hash by 10.000000000000009% ± 10.0%
_sort_by_with_hash is similar to _sort_by
_sort_by is faster than _sort by 2.2x ± 0.1

bench_em 10_000
arr contains 10000 elements
Running each test once. Test will take about 1 second.
_values_at is similar to _sort_by_with_hash
_sort_by_with_hash is similar to _sort_by
_sort_by is faster than _sort by 2x ± 1.0

bench_em 100_000
arr contains 100000 elements
Running each test once. Test will take about 3 seconds.
_sort_by_with_hash is similar to _values_at
_values_at is similar to _sort_by
_sort_by is faster than _sort by 2x ± 0.1
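Speed only matters if the approaches agree, so a quick sanity check can be useful before reading too much into the numbers. The sketch below is not part of the benchmark above; the constant names and the sample data are made up. It compares only the sequence of books, because Ruby's sort and sort_by are not guaranteed to be stable and may order hashes of the same book differently.

```ruby
ORDER = ['Matthew', 'Mark', 'Acts', '1John']

# Each lambda takes an array of hashes and returns it arranged by ORDER.
APPROACHES = {
  sort:              ->(arr) { arr.sort { |a, b| ORDER.index(a[:book]) <=> ORDER.index(b[:book]) } },
  sort_by:           ->(arr) { arr.sort_by { |h| ORDER.find_index(h[:book]) } },
  sort_by_with_hash: ->(arr) { ord = ORDER.each_with_index.to_h; arr.sort_by { |h| ord[h[:book]] } },
  values_at:         ->(arr) { arr.group_by { |h| h[:book] }.values_at(*ORDER).flatten }
}

# Made-up sample data: 25 hashes per book, shuffled.
sample = ORDER.flat_map { |book| Array.new(25) { { book: book } } }.shuffle

# Reduce each result to its sequence of book names and compare.
book_sequences = APPROACHES.transform_values { |f| f.call(sample).map { |h| h[:book] } }
puts book_sequences.values.uniq.size == 1   # prints true when all approaches agree
```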

Answer 3

As the documentation shows, Array#index does work with strings (the example given there even uses them), so this works:

books.sort_by { |b| array.index(b[:book]) }

However, rather than searching array repeatedly, you can determine the order once and then look it up:

order = array.each.with_index.to_h
#=> { "Matthew" => 0, "Mark" => 1, "Acts" => 2, "1John" => 3 }
books.sort_by { |b| order[b[:book]] }
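If a hash's :book is missing from array, order[b[:book]] returns nil, and sort_by raises an ArgumentError when it tries to compare nil with an Integer. One possible way to handle that, sketched under the assumption that unknown books should go last (the helper name and the 'Luke' entry are made up for illustration):

```ruby
# Hypothetical helper: books missing from the order list sort to the end.
def sort_with_unknown_last(books, array)
  order = array.each_with_index.to_h
  # Hash#fetch with a default avoids the nil that order[...] would return.
  books.sort_by { |b| order.fetch(b[:book], array.size) }
end

books = [{ book: 'Luke' }, { book: 'Acts' }, { book: 'Matthew' }]
sort_with_unknown_last(books, ['Matthew', 'Mark', 'Acts', '1John']).map { |b| b[:book] }
# => ["Matthew", "Acts", "Luke"]
```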

Answer 4

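According to its documentation, sort_by accepts a block. Unlike the block given to sort, which must return -1, 0, or +1 from comparing a and b, the block given to sort_by receives a single element and returns the key to sort on. You can use find_index on array to produce that key.

array_of_hashes.sort_by {|a| array.find_index(a[:book]) } should do the trick.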
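Because the block returns a key, it can also return an array of keys, which Ruby compares element by element. That gives a way to control the order within a book, which sort_by alone does not guarantee since it is not stable. A sketch, assuming chapters should be ordered numerically (the helper name and sample data are made up):

```ruby
# Hypothetical helper: sort by book position first, then by chapter as an integer.
def sort_by_book_then_chapter(array_of_hashes, array)
  array_of_hashes.sort_by { |a| [array.find_index(a[:book]), a[:chapter].to_i] }
end

verses = [
  { book: 'Acts',    chapter: '17' },
  { book: 'Matthew', chapter: '22' },
  { book: 'Acts',    chapter: '9'  },
  { book: 'Matthew', chapter: '4'  }
]
sort_by_book_then_chapter(verses, ['Matthew', 'Mark', 'Acts', '1John'])
  .map { |h| "#{h[:book]} #{h[:chapter]}" }
# => ["Matthew 4", "Matthew 22", "Acts 9", "Acts 17"]
```

The to_i conversion matters here: the chapters are strings, and "22" sorts before "4" when compared as text.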

Answer 5

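Your mistake is thinking that you are sorting. In fact you are not: the order already exists, and you only need to put the elements in place. I am not proposing a compact or optimal solution, just a simple one. First convert the large array into a hash indexed by the :book key (which should have been your data structure in the first place), then simply map over array: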

array = ['Matthew', 'Mark', 'Acts', '1John']
elements = [{:book=>"Matthew",
  :chapter=>"4",
  :section=>"new_testament"},
 {:book=>"Matthew",
  :chapter=>"22",
  :section=>"new_testament"},
 {:book=>"Mark",
  :chapter=>"6",
  :section=>"new_testament"},
 {:book=>"1John",
  :chapter=>"1",
  :section=>"new_testament"},
 {:book=>"1John",
  :chapter=>"1",
  :section=>"new_testament"},
 {:book=>"Acts",
  :chapter=>"9",
  :section=>"new_testament"},
 {:book=>"Acts",
  :chapter=>"17",
  :section=>"new_testament"}]
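# Index the hashes by :book, keeping every hash for a book rather than only the last one.
by_name = Hash.new { |hash, key| hash[key] = [] }
elements.each do |e|
  by_name[e[:book]] << e
end
# Walk the desired order and collect each book's hashes.
final = array.flat_map { |x| by_name[x] }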

Ruby - Sorting hashes by a (string) value based on the order of an array
