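How do I parse a block of text into rows?

Time: 2019-01-28 02:44:11

Tags: python

We are trying to parse a block of text into separate rows. The text is saved as a text document, and the goal is to put each chunk of the text on its own row.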

ggplot2 is a data visualization package for the statistical programming language R. Created by Hadley Wickham in 2005, ggplot2 is an implementation of Leland Wilkinson's Grammar of Graphics—a general scheme for data visualization which breaks up graphs into semantic components such as scales and layers. ggplot2 can serve as a replacement for the base graphics in R and contains a number of defaults for web and print display of common scales. Since 2005, ggplot2 has grown in use to become one of the most popular R packages.[1][2] It is licensed under GNU GPL v2.[3]

Source: https://en.wikipedia.org/wiki/Ggplot2

I want to create a table in which a new row starts at every occurrence of "ggplot2" and contains the text that follows it.

Row Text    Separator
1   ggplot2 is a data visualization package for the statistical programming language R. Created by Hadley Wickham in 2005,  "ggplot2"
2   ggplot2 is an implementation of Leland Wilkinson's Grammar of Graphics—a general scheme for data visualization which breaks up graphs into semantic components such as scales and layers.   "ggplot2"
3   ggplot2 can serve as a replacement for the base graphics in R and contains a number of defaults for web and print display of common scales. Since 2005,     "ggplot2"
4   ggplot2 has grown in use to become one of the most popular R packages.[1][2] It is licensed under GNU GPL v2.[3]    "ggplot2"

The formatting is off, but the Separator column is "ggplot2" for every row.

Here is what I tried:

text = open('ggplot2.txt','r+')
l=[]
for i in text.readlines():
    if i == "ggplot2":
        l.newline(i)

2 answers:

Answer 0 (score: 1)

You can split the text on "ggplot2" to get the pieces you want, then build the rows with .append():

text = "ggplot2 is a data visualization package for the statistical programming language R. Created by Hadley Wickham in 2005, ggplot2 is an implementation of Leland Wilkinson's Grammar of Graphics—a general scheme for data visualization which breaks up graphs into semantic components such as scales and layers. ggplot2 can serve as a replacement for the base graphics in R and contains a number of defaults for web and print display of common scales. Since 2005, ggplot2 has grown in use to become one of the most popular R packages.[1][2] It is licensed under GNU GPL v2.[3]"

lines = text.split("ggplot2")
rows = []

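for line in lines:
    # The text starts with "ggplot2", so split() returns an empty first piece; skip it
    if line != "":
        # Put the separator back at the start of each row
        rows.append("ggplot2" + line)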

print(rows)
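
To make the empty-string check easier to see, here is a minimal sketch on a shortened, made-up sample string (not the full paragraph). It shows that str.split returns an empty first piece when the text begins with the separator:

```python
# Shortened sample text, made up for illustration; it starts with the separator.
sample = "ggplot2 is a package. ggplot2 can replace base graphics."

pieces = sample.split("ggplot2")
print(pieces)
# ['', ' is a package. ', ' can replace base graphics.']

# Skip the empty piece and re-attach the separator: one row per occurrence.
rows = ["ggplot2" + piece for piece in pieces if piece != ""]
print(rows)
# ['ggplot2 is a package. ', 'ggplot2 can replace base graphics.']
```

Each row keeps its trailing space; call .strip() on the pieces if that matters for your table.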

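The problem with i == "ggplot2" in the question's code is that it checks whether the entire line read from the file is equal to the string "ggplot2", not whether the line contains the string "ggplot2".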
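
A small sketch of the difference, using a made-up sample line. Note also that readlines() keeps the trailing newline, so even a line consisting only of the word would not compare equal:

```python
# Made-up sample line, shaped like what readlines() returns (newline included).
line = "ggplot2 is a data visualization package.\n"

# Equality is true only when the whole line is exactly "ggplot2".
print(line == "ggplot2")        # False

# Membership is true when "ggplot2" appears anywhere in the line.
print("ggplot2" in line)        # True

# The trailing newline alone is enough to make equality fail.
print("ggplot2\n" == "ggplot2")  # False
```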

Answer 1 (score: 0)

AttributeError: 'list' object has no attribute 'newline'. Remember that to add an item to a list you need to call its append method.
Example:

table.append(item)

I think you should try this:

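table = []
# 'with' closes the file automatically; read-only mode is enough here
with open('ggplot2.txt', 'r') as text:
    for row in text:
        if "ggplot2" in row:
            # Skip empty pieces, such as the one before a leading "ggplot2"
            data = [e for e in row.split('ggplot2') if e.strip()]
            for index, e in enumerate(data, start=1):
                table.append([index, 'ggplot2' + e, 'ggplot2'])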

print(table)

Lists have no attribute named newline; perhaps you meant append.
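
If the goal is the Row / Text / Separator table from the question, one possible sketch is below. The helper names build_table and to_tsv are hypothetical, the sample string is a shortened stand-in for the paragraph, and tab-separated output is only an assumption about the desired format. Text before the first "ggplot2" is dropped, which is also an assumption.

```python
import csv
import io


def build_table(text, separator="ggplot2"):
    """Return [row_number, row_text, separator] for each piece that starts at the separator."""
    # head is whatever precedes the first separator; it is dropped here (assumption).
    head, *tail = text.split(separator)
    rows = []
    for number, piece in enumerate(tail, start=1):
        rows.append([number, (separator + piece).strip(), separator])
    return rows


def to_tsv(rows):
    """Render the rows as tab-separated text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["Row", "Text", "Separator"])
    writer.writerows(rows)
    return buffer.getvalue()


# Shortened sample text for illustration, not the full paragraph.
sample = ("ggplot2 is a package. Created in 2005, "
          "ggplot2 is an implementation of the Grammar of Graphics.")

table = build_table(sample)
print(to_tsv(table))
```

To run it on the file from the question, read the whole file into one string first (for example with open('ggplot2.txt').read()) and pass that to build_table.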