Question

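I'm building a script in Ruby to read and parse markdown files. The script needs to read and understand the MultiMarkdown header information at the top of the file so that it can perform other actions on the output.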

The header values look like this:

Title: My Treatise on Kumquats
Author: Joe Schmoe
Author URL: http://somedudeswebsite.me/
Host URL: http://googlesnewthing.com/
Created: 2012-01-01 09:41

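I can't figure out how to split the lines of text into a simple key-value dictionary. The built-in split doesn't seem to work here, because I only want it to split on the first colon (:) in each line. Any additional colons are part of the value string.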

In case it matters, I'm using Ruby 1.8.7 on OS X.

Answer 1

Do this:

s = <<EOS
Title: My Treatise on Kumquats
Author: Joe Schmoe
Author URL: http://somedudeswebsite.me/
Host URL: http://googlesnewthing.com/
Created: 2012-01-01 09:41
EOS

h = Hash[s.each_line.map { |l| l.chomp.split(': ', 2) }]
p h

Output:

{"Title"=>"My Treatise on Kumquats", "Author"=>"Joe Schmoe", "Author URL"=>"http://somedudeswebsite.me/", "Host URL"=>"http://googlesnewthing.com/", "Created"=>"2012-01-01 09:41"}
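
In a real file the header is followed by the document body, so the one-liner above would try to parse body lines as well. Here is a minimal sketch that stops early, assuming the metadata block ends at the first blank line and that keys and values may carry stray whitespace. `parse_header` is a hypothetical name, not part of any library:

```ruby
# Hypothetical helper: parse leading "Key: value" lines into a Hash.
# Assumption: the header block ends at the first blank line.
def parse_header(text)
  header = {}
  text.each_line do |line|
    line = line.chomp
    break if line.strip.empty?        # blank line: header is over
    key, value = line.split(':', 2)   # split only at the first colon
    next if value.nil?                # skip lines that have no colon
    header[key.strip] = value.strip
  end
  header
end

doc = "Title: My Treatise on Kumquats\n" \
      "Author URL: http://somedudeswebsite.me/\n" \
      "\n" \
      "Body text: this line is not part of the header\n"

p parse_header(doc)
```

Splitting on ':' and stripping afterwards, rather than splitting on ': ', also tolerates lines that have no space after the colon.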

Answer 2

Use split with its optional second argument (thanks to @MichaelKohl):

s = 'Author URL: http://somedudeswebsite.me/'
key, value = s.split ': ', 2
puts key
puts value

Output:

Author URL
http://somedudeswebsite.me/
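
To make the effect of the limit argument concrete, the sketch below compares a plain split with a limited one on the same line. `split_header` is a hypothetical helper name used only for illustration:

```ruby
# Hypothetical helper: split one header line into [key, value]
# at the first ": " only.
def split_header(line)
  line.chomp.split(': ', 2)
end

line = 'Author URL: http://somedudeswebsite.me/'

p line.split(':')      # no limit: the URL is cut at its own colon too
p split_header(line)   # limit of 2: everything after the first ": " stays together
```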

Answer 3

You can use a regular expression to parse the text:

str = "Title: My Treatise on Kumquats
Author: Joe Schmoe
Author URL: http://somedudeswebsite.me/
Host URL: http://googlesnewthing.com/
Created: 2012-01-01 09:41"

matches = str.scan /^(.+?): (.+?)$/m

matches.each { |m|
   key = m[0]
   value = m[1]
}

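This regex carries the /m flag (/<regex>/m) and matches each line as two groups (indexes 0 and 1). The first group contains all characters before the first occurrence of ": " (colon + space). The second group contains the rest of the characters on that line (up to the end-of-line anchor $). Note that in Ruby, ^ and $ match at every line boundary regardless of flags; the /m flag only makes . match newlines as well.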

This is how you can convert the result into a Hash:

dictionary = matches.inject({}) do |dict, m| 
  dict[m[0]] = m[1]
  dict
end

Update

Michael Kohl pointed out that this can be written as a one-liner:

hash = Hash[str.scan /^(.+?): (.+?)$/m]
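
As a variation, the same scan can be written without the /m flag, since Ruby's ^ and $ anchors already match at each line boundary. Without /m the dot cannot cross a newline, so a line that lacks ": " cannot pull the following line into its key. This is only a sketch; `headers_from` is a hypothetical name:

```ruby
# Hypothetical helper: build a Hash from "Key: value" lines with scan.
# No /m flag, so "." never matches a newline and each match stays on one line.
def headers_from(str)
  Hash[str.scan(/^(.+?): (.*)$/)]
end

str = "Title: My Treatise on Kumquats\n" \
      "Host URL: http://googlesnewthing.com/\n" \
      "Created: 2012-01-01 09:41"

p headers_from(str)
```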

Answer 4

You can do it like this:

>> s = 'Author URL: http://somedudeswebsite.me/'
>> first_idx = s.index(':')
>> key,value = s[0..first_idx-1],s[first_idx+1..s.length]
=> ["Author URL", " http://somedudeswebsite.me/"]

Or, as a key-value hash:

>> kv = Hash[s[0..first_idx-1], s[first_idx+1..s.length]]
=> {"Author URL"=>" http://somedudeswebsite.me/"}

Hope this helps.
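
The index arithmetic above can also be expressed with String#partition, which splits a string at the first occurrence of a separator. This is a swapped-in alternative rather than the approach shown above, and `key_value` is a hypothetical name. The sketch also strips the leading space that the index-based version leaves on the value:

```ruby
# Hypothetical helper: split at the first colon using String#partition.
# Returns nil when the line contains no colon.
def key_value(line)
  key, sep, value = line.partition(':')
  return nil if sep.empty?     # partition yields an empty separator when nothing matched
  [key.strip, value.strip]
end

p key_value('Author URL: http://somedudeswebsite.me/')
```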

Answer 5

Is line.split(':', 2) what you want?

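String#split accepts a second argument that limits how many parts the string is split into. It works in Ruby 1.9.3; I'm not sure about earlier versions (though I'm almost certain it also works in 1.9.2).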

If that isn't available, line.scan(%r{^([^:]*):(.*)}) should also work.
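
One thing to keep in mind with the scan fallback is that scan returns an array of matches, each of which is itself an array of captures, so the pair has to be unwrapped. A small sketch, with `scan_pair` as a hypothetical name:

```ruby
# Hypothetical helper: extract [key, value] from one line via scan.
# scan returns e.g. [["key", " value"]], so take the first match (or nil).
def scan_pair(line)
  pair = line.scan(%r{^([^:]*):(.*)}).first
  pair && pair.map { |part| part.strip }
end

p scan_pair('Author URL: http://somedudeswebsite.me/')
```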

How do I split text into key-value pairs?
