How do I get the text between two strings in Ruby?

Time: 2011-08-09 05:18:05

Tags: ruby regex string parsing

I have a text file that contains this text:

What's New in this Version
==========================
-This is the text I want to get 
-It can have 1 or many lines
-These equal signs are repeated throughout the file to separate sections

Primary Category
================

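I only want to get everything between ========================== and Primary Category, and store that block of text in a variable. I thought the following match approach would work, but it gives me NoMethodError: undefined method `match'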

    f = File.open(metadataPath, "r")
    line = f.readlines
    whatsNew = f.match(/==========================(.*)Primary Category/m).strip

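Any ideas? Thanks in advance.

4 Answers:

Answer 0: (score: 4)

f is a file descriptor (a File object), which has no match method. You want to match against the text in the file, which you read into line. Rather than reading the text into an array (which is hard to run a regex against), I prefer to read it into a single string: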

    contents = File.open(metadataPath) { |f| f.read }
    contents.match(/==========================(.*)Primary Category/m)[1].strip

The last line produces the output you want:

    "-This is the text I want to get \n-It can have 1 or many lines\n-These equal signs are repeated throughout the file to separate sections"

Answer 1: (score: 0)

    # Read the whole file into one String; readlines returns an Array,
    # which cannot be matched against a regex with =~.
    text = File.read(metadataPath)
    text =~ /==========================(.*)Primary Category/m
    whatsNew = $1

You may want to consider refining the .* though, since it is greedy and can match more than you intend.
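
To illustrate the greediness concern, here is a minimal sketch. The sample text is hypothetical: it assumes a file in which "Primary Category" occurs more than once, which the original question does not state.

```ruby
# Hypothetical sample text: the phrase "Primary Category" appears twice.
sample = <<~TEXT
  What's New in this Version
  ==========================
  -First note
  -Second note

  Primary Category
  ================
  Games

  Primary Category Notes
  ======================
  None
TEXT

# Greedy .* backtracks from the end, so it stops at the LAST "Primary Category".
greedy = sample[/^=+\n(.*)Primary Category/m, 1]

# Lazy .*? expands one character at a time, so it stops at the FIRST one.
lazy = sample[/^=+\n(.*?)Primary Category/m, 1]

puts greedy.strip
puts "---"
puts lazy.strip
```

With only one "Primary Category" in the file, both patterns capture the same text; the lazy form only matters once the end marker can repeat.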

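Answer 2: (score: 0)

Your problem is that readlines gives you an array of strings (one per line), but the regex you are using needs a single string. You can read the file as one string: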

    contents = File.read(metadataPath)
    puts contents[/^=+(.*?)Primary Category/m]
    # => ==========================
    # => -This is the text I want to get
    # => -It can have 1 or many lines
    # => -These equal signs are repeated throughout the file to separate sections
    # =>
    # => Primary Category

Or you can join the lines into a single string before applying the regex:

    lines = File.readlines(metadataPath)
    puts lines.join[/^=+(.*?)Primary Category/m]
    # => ==========================
    # => -This is the text I want to get
    # => -It can have 1 or many lines
    # => -These equal signs are repeated throughout the file to separate sections
    # =>
    # => Primary Category
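
Both forms return the whole match, including the row of equal signs and the "Primary Category" heading. If only the text in between is wanted, one option is to pass a capture index to String#[]. The sketch below uses a hypothetical in-memory copy of the file so that it runs on its own; the variable name whats_new is also an assumption.

```ruby
# Hypothetical in-memory copy of the file from the question.
contents = <<~TEXT
  What's New in this Version
  ==========================
  -This is the text I want to get
  -It can have 1 or many lines

  Primary Category
  ================
TEXT

# String#[] with a second argument returns that capture group,
# or nil when the pattern does not match, so &. guards the strip call.
whats_new = contents[/^=+\n(.*?)^Primary Category/m, 1]&.strip

puts whats_new
```

The safe-navigation operator (&.) needs Ruby 2.3 or later; on older versions an explicit nil check does the same job.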

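Answer 3: (score: 0)

The approach I would take is to read in the lines, find which line numbers consist of a run of equal signs (using Array#find_index), and then split the lines into chunks, each running from the line after a row of equal signs to the line before (or two lines before) the next row of equal signs (perhaps using Enumerable#each_cons(2) and map). That way, I would not have to change much if the section titles change.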
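
A rough sketch of that idea follows. It is one possible reading of the description, not the answerer's code: the sample lines and the names rules and sections are hypothetical, and it uses each_index.select rather than Array#find_index, because find_index returns only the first matching index.

```ruby
# Hypothetical in-memory lines; with a real file, File.readlines(path, chomp: true)
# would give the same shape (Ruby 2.4 or later).
lines = <<~TEXT.lines(chomp: true)
  What's New in this Version
  ==========================
  -This is the text I want to get
  -It can have 1 or many lines

  Primary Category
  ================
  Games
TEXT

# Indexes of every underline row (a line made only of equal signs).
rules = lines.each_index.select { |i| lines[i].match?(/\A=+\z/) }

# Pair each underline with the next one; the trailing nil marks end of file.
sections = (rules + [nil]).each_cons(2).map do |start, nxt|
  title = lines[start - 1]                  # the heading sits just above its underline
  last  = nxt ? nxt - 2 : lines.size - 1    # stop before the next heading line
  body  = lines[(start + 1)..last].reject(&:empty?)
  [title, body]
end.to_h

puts sections["What's New in this Version"]
```

Because the sections are keyed by whatever title sits above each underline, nothing in the code names "Primary Category", which is the benefit the answer describes. The sketch assumes every underline has a title line directly above it.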
