Question

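I want a method repeated that takes a string and returns an array of its repeated characters, in the order in which they repeat. Case sensitivity applies. In this example,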

repeated("abba") # => ["b", "a"]

"b" repeats before "a". Another example:

repeated("limbojackassin the garden") # => ["a", "s", "i", " ", "e", "n"]

It works as follows:

limbojackassin the garden 
      |--^                "a"
          |^              "s"
 |----------^             "i"
              |---^       " "
                    *     "a" ignored
                 |-----^  "e"
             |----------^ "n"

I have this method:

def repeated(str)
  puts str.chars.select{|i| str.count(i) > 1}.uniq
end

It does not work as expected:

repeated("limbojackassin the garden")
# => ["i", "a", "s", "n", " ", "e"]
# => expected ["a", "s", "i", " ", "e", "n"]
repeated("Pippi")
# => ["i", "p"]
# => expected ["p", "i"]
repeated("apple")
# => as expected ["p"]

How can I check the distance between the indices of array items?

Answer 1

def repeated(str)
  puts(str.each_char.with_object([{}, []]) do
    |c, (h, a)| h[c] ? a.push(c) : h.store(c, true)
  end.last.uniq)
end

Answer 2

Here is a solution based on your code:

def repeated(str)
  str.each_char.select.with_index { |c, i| str[0, i].count(c) == 1 }
end

For each character, count counts how often that character occurs in the substring preceding it. If the count is 1, the character is selected.
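To make the prefix counting concrete, here is a small trace. The helper prefix_counts is hypothetical (it is not part of the answer); it only lists, for each character, the prefix before it and the count that select would test.

```ruby
# Hypothetical helper for illustration: for every character, return
# [character, prefix before it, occurrences of the character in that prefix].
def prefix_counts(str)
  str.each_char.with_index.map do |c, i|
    prefix = str[0, i]
    [c, prefix, prefix.count(c)]
  end
end

prefix_counts("abba").each { |row| p row }
# ["a", "", 0]
# ["b", "a", 0]
# ["b", "ab", 1]    <- count is 1, so this "b" is selected
# ["a", "abb", 1]   <- count is 1, so this "a" is selected
```

A third occurrence would see a count of 2 and be skipped, which is why no uniq is needed here.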

As Cary Swoveland noted, this creates and scans an intermediate string for every character. If efficiency matters, I would use a variant of sawa's code:

def repeated(str)
  h = Hash.new(0)
  str.each_char.with_object([]) do |c, a|
    a << c if h[c] == 1
    h[c] += 1
  end
end

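This traverses the string via each_char and counts each character using the hash h. If a character is encountered for the second time (its count is 1 at that point), it is added to the array a, which is implicitly returned by with_object at the end.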

I just realized that you can use select instead of with_object:

def repeated(str)
  h = Hash.new(0)
  str.each_char.select { |c| (h[c] += 1) == 2 }
end
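As a rough illustration of why the select version works: (h[c] += 1) increments the counter and evaluates to the new value, so the comparison with 2 is true exactly once per repeated character. The flags variable below is only for demonstration; the method body is the one from the answer, checked against the inputs from the question.

```ruby
# Sketch only: show what the select block evaluates to for each character.
h = Hash.new(0) # missing keys start at 0
flags = "abba".each_char.map { |c| [c, (h[c] += 1) == 2] }
p flags
# [["a", false], ["b", false], ["b", true], ["a", true]]

def repeated(str)
  h = Hash.new(0)
  str.each_char.select { |c| (h[c] += 1) == 2 }
end

p repeated("limbojackassin the garden") # ["a", "s", "i", " ", "e", "n"]
p repeated("Pippi")                     # ["p", "i"]
p repeated("apple")                     # ["p"]
```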

Answer 3

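Another approach, based on indices obtained with Enumerable#each_with_index().

select() drops every first occurrence of a character, because index() returns the position of the first occurrence.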

def repeated(str)
  puts str.chars.each_with_index
                .select { |c,i| i != str.index(c) }
                .map(&:first)
                .uniq
end

As the opposite of select(), you could use reject { |c,i| i == str.index(c) }. Perhaps it is more expressive.
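The intermediate values for "Pippi" may help here. This is only a sketch; the variable names pairs, first_positions, via_select and via_reject are made up for the illustration.

```ruby
str = "Pippi"

# Each character paired with its own position.
pairs = str.chars.each_with_index.to_a
# [["P", 0], ["i", 1], ["p", 2], ["p", 3], ["i", 4]]

# String#index always reports the first position of the character.
first_positions = pairs.map { |c, _| str.index(c) }
# [0, 1, 2, 2, 1]

# Keep only the pairs whose position differs from the first position.
via_select = pairs.select { |c, i| i != str.index(c) }.map(&:first).uniq
via_reject = pairs.reject { |c, i| i == str.index(c) }.map(&:first).uniq

p via_select # ["p", "i"]
p via_reject # ["p", "i"]
```

Because the surviving pairs are still in string order, uniq keeps each character at the position of its second occurrence.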

Update: as in Stefan's answer, since Ruby 1.9 you can drop the map() call by using Enumerator#with_index() instead of Enumerable#each_with_index().

def repeated(str)
  p str.chars.reject
             .with_index { |c,i| i == str.index(c) }
             .uniq
end

Answer 4

str = "Jack be nimble, jack be quick, Jack jumped over the candle stick"

require 'set'

read1, read2 = Set.new, Set.new
str.each_char { |c| read2.add(c) unless read1.add?(c) }
read2.to_a
  #=> [" ", "b", "e", "a", "c", "k", "i", ",", "J", "j", "u", "m", "n", "d", "l", "t"]

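An attempt is made to add each character to read1. If the character can be added, it is being encountered for the first time. If it cannot be added to read1 (because read1 already contains it), an attempt is made to add it to read2. It does not matter whether read2 already contains the character; what matters is that read2 contains it after the attempt.

Hence, after all characters have been enumerated, read2 contains the characters that appear more than once, in the correct order.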
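The following trace sketches that behaviour for "abba". Set#add? returns nil when the element is already present, which is what the unless relies on. The first_time variable is introduced only for this illustration, and the final order assumes that Set enumerates in insertion order, as it does in current Ruby implementations.

```ruby
require 'set'

read1, read2 = Set.new, Set.new

"abba".each_char do |c|
  # add? returns nil if c was already in read1, the set itself otherwise.
  first_time = !read1.add?(c).nil?
  read2.add(c) unless first_time
  puts "#{c}: first_time=#{first_time} read2=#{read2.to_a.inspect}"
end
# a: first_time=true read2=[]
# b: first_time=true read2=[]
# b: first_time=false read2=["b"]
# a: first_time=false read2=["b", "a"]
```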

Edit: thanks to @Stefan for pointing out that I did not need read2.include?(c) || after unless.

Answer 5

Ladies and gentlemen, start your engines!

require 'fruity'
require 'set'

def sschmeck(str)
  str.chars.each_with_index.select { |c,i| i != str.index(c) }.map(&:first).uniq
end

def drenmi(str)
  str.chars.each_with_object(Hash.new { |h, k| h[k] = [] }).
      with_index { |(c, m), i| m[c] << i }.select { |_, v| v.length > 1 }.
      to_a.sort_by { |e| e[1][1] }.map(&:first)
end

def redithion(str)
    str.chars.map { |e|
        first_index = str.index(e)
        [first_index, str.index(e,first_index+1)]
    }.map{|e| e[1].nil? ? str.length + 1 : e[1]}
     .select{|e| e < str.length }.uniq.sort.map {|e| str[e]}
end

def sawa(str)
  str.each_char.with_object([{}, []]) do
    |c, (h, a)| h[c] ? a.push(c) : h.store(c, true)
  end.last.uniq
end

def stefan_1(str)
  str.each_char.select.with_index { |c, i| str[0, i].count(c) == 1 }
end

def stefan_2(str)
  h = Hash.new(0)
  str.each_char.with_object([]) do |c, a|
    a << c if h[c] == 1
    h[c] += 1
  end
end

def stefan_3(str)
  h = Hash.new(0)
  str.each_char.select { |c| (h[c] += 1) == 2 }
end

def cary(str)
  read1, read2 = Set.new, Set.new
  str.each_char { |c| read2.add(c) unless read1.add?(c) }
  read2.to_a
end

str = "Jack be nimble, jack be quick, Jack jumped over the candle stick"

compare(
  { sschmeck:   -> { sschmeck  str },
    drenmi:     -> { drenmi    str },
    redithion:  -> { redithion str },
    sawa:       -> { sawa      str },  
    stefan_1:   -> { stefan_1  str },  
    stefan_2:   -> { stefan_2  str },  
    stefan_3:   -> { stefan_3  str },  
    cary:       -> { cary      str }}
)

stefan_3 is faster than stefan_1 by 19.999999999999996% ± 10.0%
stefan_1 is similar to stefan_2
stefan_2 is faster than sawa by 10.000000000000009% ± 10.0%
sawa is similar to cary
cary is faster than sschmeck by 10.000000000000009% ± 10.0%
sschmeck is faster than drenmi by 39.99999999999999% ± 10.0%
drenmi is similar to redithion

For:

str = "It was the best of times, it was the worst of times, it was the age of wisdom, it was the age of foolishness, it was the epoch of belief, it was the epoch of incredulity, it was the season of Light, it was the season of Darkness, it was the spring of hope, it was the winter of despair, we had everything before us, we had nothing before us, we were all going direct to Heaven, we were all going direct the other way – in short, the period was so far like the present period, that some of its noisiest authorities insisted on its being received, for good or for evil, in the superlative degree of comparison only."
str.size #=>613

we obtain:

Running each test 16 times. Test will take about 1 second.
stefan_3 is faster than sawa by 10.000000000000009% ± 10.0%
sawa is similar to stefan_2
stefan_2 is similar to drenmi
drenmi is similar to cary
cary is faster than sschmeck by 60.00000000000001% ± 1.0%
sschmeck is faster than stefan_1 by 19.999999999999996% ± 1.0%
stefan_1 is faster than redithion by 39.99999999999999% ± 10.0%

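My conclusion is that the results are close enough that the choice of method should depend on other factors. I tested readability with a stopwatch, stopping the timer when I thought, "Oh, I get it." Mine won, but of course I already understood what I was reading. Next was stefan_3 at 0.3 seconds. For sawa's, I clocked 1:23:14.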

Answer 6

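I'm not sure whether there is a simple solution, but here is something that could serve as a starting point. It works by keeping track of the indices of all occurrences of each character.

Example using the string "abba":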

str = "abba"

char_map = str.chars.each_with_object(Hash.new { |h, k| h[k] = [] }).with_index { |(c, m), i| m[c] << i }
#=> {"a"=>[0, 3], "b"=>[1, 2]}

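We now have a hash with each character as a key and the indices of its occurrences as the value. We can now filter out the non-repeated characters and sort the rest by their second occurrence: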

char_map.select { |_, v| v.length > 1 }.to_a.sort_by { |e| e[1][1] }.map(&:first)
#=> ["b", "a"]
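The same pipeline can be broken into named steps, here for "Pippi". This is a sketch: the intermediate names repeats, ordered and result are invented for readability, and the hash is filled with a plain loop rather than the chained each_with_object call.

```ruby
str = "Pippi"

# Map each character to the list of indices where it occurs.
char_map = Hash.new { |h, k| h[k] = [] }
str.each_char.with_index { |c, i| char_map[c] << i }
# "P" occurs at [0], "i" at [1, 4], "p" at [2, 3]

# Drop characters that occur only once.
repeats = char_map.select { |_, v| v.length > 1 }

# Order by the index of the second occurrence (v[1]).
ordered = repeats.sort_by { |_, v| v[1] }
# [["p", [2, 3]], ["i", [1, 4]]]

result = ordered.map(&:first)
p result # ["p", "i"]
```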

Full implementation:

def repeated(str)
  str.chars.each_with_object(Hash.new { |h, k| h[k] = [] }).with_index { |(c, m), i| m[c] << i }.select { |_, v| v.length > 1 }.to_a.sort_by { |e| e[1][1] }.map(&:first)
end

repeated("abba")
#=> ["b", "a"]

repeated("Pippi")
#=> ["p", "i"]

repeated("limbojackassin the garden")
#=> ["a", "s", "i", " ", "e", "n"]

Answer 7

Code:

def repeated(str)
    return str.chars.map { |e|
        first_index = str.index(e)
        [first_index, str.index(e,first_index+1)]
    }.map{|e| e[1].nil? ? str.length + 1 : e[1]}
     .select{|e| e < str.length }.uniq.sort.map {|e| str[e]}
end

Example:

repeated('Pippi') # => ["p", "i"]

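Explanation:

It basically gets the first two indices of each character, discards the characters that do not repeat (their second index is nil), then takes the second indices, makes them unique, sorts them in ascending order, and maps these indices to the corresponding characters in the string.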
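Spelled out for 'Pippi', the steps look roughly like this. Note one substitution: the sketch drops the nil second indices with compact instead of the answer's str.length + 1 sentinel followed by select; the names pairs, seconds and result are illustrative.

```ruby
str = "Pippi"

# First and second index of every character (second is nil if it never repeats).
pairs = str.chars.map do |e|
  first = str.index(e)
  [first, str.index(e, first + 1)]
end
# [[0, nil], [1, 4], [2, 3], [2, 3], [1, 4]]

# Keep only the second indices of characters that do repeat.
seconds = pairs.map(&:last).compact
# [4, 3, 3, 4]

# Unique, ascending second indices mapped back to their characters.
result = seconds.uniq.sort.map { |i| str[i] }
p result # ["p", "i"]
```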
