How do I read the columns of a .txt file into different strings in Python?

Time: 2018-04-09 08:19:41

Tags: python file

I have the following .txt file, named answers.txt:

0 identify
0 organizations
0 that
0 participate
0 in
0 international
0 criminal
0 activity
0 the
0 activity
0 and
0 if
0 possible
0 collaborating
0 organizations
0 and
0 countries
0 involved
1 is
1 the
1 disease
1 of
1 poliomyelitis
1 polio
1 under
1 control
1 in
1 the
1 world

The first column acts as an id: rows with the same id belong to the same sentence, like this:

answer_0 = 'identify organizations that participate in international criminal activity and if possible collaborating organizations and countries involved'

answer_1 = 'is the disease of poliomyelitis polio under control in the world'

So far I have been able to read each line of my file with the following code:

separator=' '
string=[]
for line in open("answers.txt"):
    columns = line.split(separator)
    if len(columns) >= 2:
        print (columns[1])

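But I don't want the words that belong to the same sentence to be separated; they should end up together in the strings answer_0 and answer_1. Ideally, I would like list = [answer_0, answer_1].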

5 answers:

Answer 0 (score: 2)

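If I understand correctly, I suggest reading the id at the start of each line and storing the strings in a dictionary. Something like this: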

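answer_dict = {}
with open("answers.txt") as f:
    for line in f:
        line_values = line.split()
        if len(line_values) < 2:
            continue  # skip blank or malformed lines
        key = int(line_values[0])
        if key in answer_dict:
            # append the word to the sentence already stored for this id
            answer_dict[key] += " " + line_values[1]
        else:
            answer_dict[key] = line_values[1]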

Then you can do whatever you want with the dictionary. To turn it into a list:

answer_list = []
for key in sorted(answer_dict):
    # append the whole sentence; += would add it one character at a time
    answer_list.append(answer_dict[key])
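
A variation on the dictionary approach, as a rough sketch only: `collections.defaultdict` removes the need to check whether an id has already been seen. The function name `group_words` and the sample lines are made up for illustration; the function accepts any iterable of lines, so an open file object should work as well.

```python
from collections import defaultdict


def group_words(lines):
    """Return one sentence per id, built from 'id word' lines."""
    words_by_id = defaultdict(list)
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        words_by_id[int(parts[0])].append(parts[1])
    # One joined sentence per id, in ascending id order
    return [" ".join(words_by_id[key]) for key in sorted(words_by_id)]


# Hypothetical sample standing in for the lines of answers.txt
sample = ["0 identify\n", "0 organizations\n", "1 is\n", "1 the\n"]
print(group_words(sample))
```

With a real file, the call would look like `group_words(open("answers.txt"))`, ideally inside a `with` block.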

Answer 1 (score: 1)

What you seem to be looking for is:

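def fileReader(filename):
    table_dict = {}
    with open(filename, "r") as f_obj:
        for line in f_obj:
            parts = line.split()  # splits on whitespace and drops the trailing newline
            if len(parts) < 2:
                continue  # skip blank or malformed lines
            id_, word = parts[0], parts[1]
            key = "answer_" + id_
            # join with a space only when the sentence already has words
            table_dict[key] = (table_dict[key] + " " + word) if key in table_dict else word
    return table_dict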

Answer 2 (score: 1)

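I don't think a dictionary is needed. Splitting a line produces a list of substrings: the first element is the id of the sentence and the second is the word that belongs to it. You can therefore build the sentences as you read the file, which avoids the extra space a dictionary takes and may also be slightly faster.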

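separator = ' '
answer_0 = ''
answer_1 = ''
for line in open("answers.txt"):
    columns = line.strip().split(separator)
    if len(columns) < 2:
        continue  # skip blank or malformed lines
    if columns[0] == '0':
        answer_0 += " " + columns[1]
    elif columns[0] == '1':
        answer_1 += " " + columns[1]
# drop the leading space left by the first concatenation
answer_0 = answer_0.strip()
answer_1 = answer_1.strip()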
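
The same dictionary-free idea can be extended to any number of ids. The sketch below uses `itertools.groupby` and assumes that all lines sharing an id are contiguous in the file, as they are in answers.txt; if the ids were interleaved, the same id would produce several separate sentences. The name `sentences_in_order` is hypothetical.

```python
from itertools import groupby


def sentences_in_order(lines):
    """Join the words of each run of consecutive lines that share an id."""
    rows = (line.split() for line in lines)
    rows = (row for row in rows if len(row) >= 2)  # drop blank or malformed lines
    # groupby only merges neighbours, so this relies on ids being contiguous
    return [
        " ".join(row[1] for row in group)
        for _, group in groupby(rows, key=lambda row: row[0])
    ]


# Hypothetical sample standing in for the lines of answers.txt
sample = ["0 identify\n", "0 organizations\n", "1 is\n", "1 the\n"]
print(sentences_in_order(sample))
```

Because the rows are consumed lazily, only the sentence currently being built is held in memory besides the result list.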

Answer 3 (score: 0)

You can build the sentences on the fly. For example:

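sentences = dict()
for line in open('answers.txt'):
    parts = line.split()  # also drops the trailing newline
    if len(parts) < 2:
        continue  # skip blank or malformed lines
    n, word = parts[0], parts[1]
    sentences.setdefault(n, []).append(word)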

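Each sentence then has a key in sentences whose value is a list of words. The keys are the id strings read from the file, so you can join the words of, for example, the sentence with id 1 like this:

' '.join(sentences['1'])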

For all sentences:

for n, words in sentences.items():
    print(' '.join(words))
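
To get the list the question asks for from such a dictionary of word lists, one option is to sort the keys numerically before joining. This is only a sketch: `to_answer_list` is a made-up helper, and it assumes every key is a string that `int` can parse. Sorting the keys as plain strings would place '10' before '2'.

```python
def to_answer_list(sentences):
    """Turn {'id': [words, ...]} into a list of sentences ordered by numeric id."""
    return [" ".join(sentences[key]) for key in sorted(sentences, key=int)]


# Hypothetical dictionary shaped like the one built from answers.txt
sentences = {"1": ["is", "the"], "0": ["identify", "organizations"]}
print(to_answer_list(sentences))
```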

Answer 4 (score: 0)

Try this:

separator = " "
string1 = []
string2 = []
for line in open("answers.txt"):
    columns = line.strip().split(separator)
    if len(columns) < 2:
        continue  # skip blank or malformed lines
    if columns[0] == "0":
        string1.append(columns[1])
    else:
        string2.append(columns[1])
# join with a space so the words stay separated
answer1 = " ".join(string1)
answer2 = " ".join(string2)
print(answer1)
print(answer2)