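Elixir - How to split a string into a list every 3 characters

Time: 2017-03-28 06:04:11

Tags: elixir

If I have the string "UGGUGUUAUUAAUGGUUU", how can I turn it into a list split every 3 characters, i.e. ["UGG", "UGU", "UAU", "UAA", "UGG", "UUU"]?

6 answers:

Answer 0 (score: 18)

If your string contains only ASCII characters and its byte_size is a multiple of 3, there is a very elegant solution that uses a little-known Elixir feature, binary comprehensions: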

iex(1)> string = "UGGUGUUAUUAAUGGUUU"
"UGGUGUUAUUAAUGGUUU"
iex(2)> for <<x::binary-3 <- string>>, do: x
["UGG", "UGU", "UAU", "UAA", "UGG", "UUU"]

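This splits the string into 3-byte chunks. It will be much faster than splitting on codepoints or graphemes, but it will not work correctly if your string contains non-ASCII characters. (In that case I would go with @michalmuskala's answer.)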
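To make the two preconditions concrete, here is a small sketch of what happens when they do not hold. The module name `ByteChunks` is hypothetical and only wraps the comprehension shown above.

```elixir
defmodule ByteChunks do
  # Splits a binary into 3-byte chunks. Trailing bytes that do not
  # fill a whole chunk are silently dropped by the comprehension.
  def split(string) when is_binary(string) do
    for <<chunk::binary-3 <- string>>, do: chunk
  end
end

# 5 bytes: the last 2 bytes ("UG") are dropped
ByteChunks.split("UGGUG")
# => ["UGG"]

# Each Greek letter is 2 bytes in UTF-8, so 3-byte chunks cut
# through the middle of a character and are not valid strings
"αβγ" |> ByteChunks.split() |> Enum.map(&String.valid?/1)
# => [false, false]
```

So the byte-based version is best kept for input that is known to be ASCII and whose length is a multiple of 3.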

Edit: Patrick Oscity's answer reminded me that this also works for codepoints:

iex(1)> string = "αβγδεζηθικλμνξοπρςστυφχψ"
"αβγδεζηθικλμνξοπρςστυφχψ"
iex(2)> for <<a::utf8, b::utf8, c::utf8 <- string>>, do: <<a::utf8, b::utf8, c::utf8>>
["αβγ", "δεζ", "ηθι", "κλμ", "νξο", "πρς", "στυ", "φχψ"]

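Answer 1 (score: 13)

# Enum.chunk/2 is deprecated since Elixir 1.5; chunk_every/4 with :discard
# keeps its behavior of dropping a trailing partial chunk.
"UGGUGUUAUUAAUGGUUU"
|> String.codepoints()
|> Enum.chunk_every(3, 3, :discard)
|> Enum.map(&Enum.join/1)

I also wonder whether there is a more elegant version.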
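The pipeline above drops a trailing partial chunk. If it should be kept instead, a minimal sketch could use `Enum.chunk_every/2` (assuming Elixir 1.5 or later) and split on graphemes rather than codepoints. The module name `GraphemeChunks` is hypothetical.

```elixir
defmodule GraphemeChunks do
  # Splits a string into chunks of `size` graphemes.
  # The last chunk may be shorter than `size`.
  def split(string, size) when is_binary(string) and is_integer(size) and size > 0 do
    string
    |> String.graphemes()
    |> Enum.chunk_every(size)
    |> Enum.map(&Enum.join/1)
  end
end

GraphemeChunks.split("UGGUGUUA", 3)
# => ["UGG", "UGU", "UA"]
```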

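Answer 2 (score: 9)

This can be done with the Stream.unfold/2 function. In a way it is the opposite of reduce: reduce lets us fold a collection into a single value, while unfold expands a single value into a collection.

As the generator for Stream.unfold/2 we need a function that returns a tuple: the first element is the next member of the generated collection, and the second is the accumulator passed to the next iteration. That is exactly what String.split_at/2 does. Finally, we need a termination condition: String.split_at("", 3) returns {"", ""}. We are not interested in empty strings, so it is enough to consume the generated stream until we hit an empty string, which can be done with Enum.take_while/2.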

string
|> Stream.unfold(&String.split_at(&1, 3)) 
|> Enum.take_while(&(&1 != ""))
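As a sketch of how the pieces fit together, the pipeline can be wrapped in a function with a configurable chunk size. The module name `UnfoldSplit` and the `size` parameter are assumptions added for illustration.

```elixir
defmodule UnfoldSplit do
  # Repeatedly applies String.split_at/2: the first tuple element is
  # emitted, the second becomes the accumulator for the next step.
  # Once the accumulator is "", every step yields {"", ""}, so the
  # stream is cut off at the first empty string.
  def chunks(string, size) when is_binary(string) and is_integer(size) and size > 0 do
    string
    |> Stream.unfold(&String.split_at(&1, size))
    |> Enum.take_while(&(&1 != ""))
  end
end

String.split_at("UGGUGUU", 3)
# => {"UGG", "UGUU"}

UnfoldSplit.chunks("UGGUGUU", 3)
# => ["UGG", "UGU", "U"]
```

Because String.split_at/2 counts graphemes, this version also works for non-ASCII input and keeps a shorter final chunk.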

Answer 3 (score: 5)

Another possibility is to use Regex.scan/2:

iex> string = "abcdef"
iex> Regex.scan(~r/.{3}/, string)
[["abc"], ["def"]]

# In case the number of characters is not evenly divisible by 3
iex> string = "abcdefg"
iex> Regex.scan(~r/.{1,3}/, string)
[["abc"], ["def"], ["g"]]

# If you need to handle unicode characters, you can add the `u` modifier
iex> string = "αβγδεζη"
iex> Regex.scan(~r/.{1,3}/u, string)
[["αβγ"], ["δεζ"], ["η"]]

Or use a recursive function, which is a bit more verbose but IMO should be the best-performing solution with eager evaluation:

defmodule Split do
  def tripels(string), do: do_tripels(string, [])

  defp do_tripels(<<x::utf8, y::utf8, z::utf8, rest::binary>>, acc) do
    do_tripels(rest, [<<x::utf8, y::utf8, z::utf8>> | acc])
  end

  defp do_tripels(_rest, acc) do
    Enum.reverse(acc)
  end
end

#  in case you actually want the rest in the result, change the last clause to
defp do_tripels(rest, acc) do
  Enum.reverse([rest | acc])
end
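One thing to watch with the variant that keeps the rest: when the input length is an exact multiple of 3, the rest is "" and ends up as a trailing empty string in the result. A minimal sketch that avoids this with an extra clause follows. The module name `SplitWithRest` is hypothetical.

```elixir
defmodule SplitWithRest do
  def triples(string) when is_binary(string), do: do_triples(string, [])

  # Three more codepoints available: emit them as one chunk
  defp do_triples(<<x::utf8, y::utf8, z::utf8, rest::binary>>, acc) do
    do_triples(rest, [<<x::utf8, y::utf8, z::utf8>> | acc])
  end

  # Nothing left: do not append an empty string
  defp do_triples("", acc), do: Enum.reverse(acc)

  # One or two codepoints left: keep them as the final chunk
  defp do_triples(rest, acc), do: Enum.reverse([rest | acc])
end

SplitWithRest.triples("abcdef")
# => ["abc", "def"]

SplitWithRest.triples("abcdefg")
# => ["abc", "def", "g"]
```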

Answer 4 (score: 4)

Try

List.flatten(Regex.scan(~r/.../, "UGGUGUUAUUAAUGGUUU"))

You will get

["UGG", "UGU", "UAU", "UAA", "UGG", "UUU"]

Documentation: Regex.scan/2, List.flatten/1

Answer 5 (score: 0)

How about using String.split
